Source: stackoverflow.com, "sql server - SQL-query for grouping continuous numbers"

sql server - SQL query for grouping continuous numbers

Posted on 2020-03-27 10:34:42

Let's say I have the following table:

| ... | orderId | serialNumber | type          | ... |
|-----|---------|--------------|---------------|-----|
| ... | 1       | 01           | someType      | ... |
| ... | 1       | 02           | someType      | ... |
| ... | 1       | 03           | someOtherType | ... |
| ... | 1       | 04           | someOtherType | ... |
| ... | 1       | 05           | someType      | ... |
| ... | 1       | 06           | someType      | ... |
| ... | 2       | 07           | someType      | ... |
| ... | 2       | 08           | someType      | ... |

I would like a query that produces the following result:

| orderId | count | min | max | type          |
|---------|-------|-----|-----|---------------|
| 1       | 2     |  01 | 02  | someType      |
| 1       | 2     |  03 | 04  | someOtherType |
| 1       | 2     |  05 | 06  | someType      |
| 2       | 2     |  07 | 08  | someType      |

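I want to group by type and orderId. Whenever the run of consecutive serialNumber values for a type is interrupted, a new entry should be started.

This is my current query: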

SELECT
 orderId,
 count(*) AS count,
 min(serialNumber) AS min,
 max(serialNumber) AS max,
 type
FROM tblMyTable
group by type, orderId
order by orderId

But it produces the wrong result:

| orderId | count | min | max | type          |
|---------|-------|-----|-----|---------------|
| 1       | 4     |  01 | 06  | someType      |  <-- this should be 2 entries
| 1       | 2     |  03 | 04  | someOtherType |
| 2       | 2     |  07 | 08  | someType      |

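As you can see, it does not detect that serialNumber 03 and 04 belong to a different type, so the someType rows before and after them are merged into one group. This leads to a wrong count and a wrong max value.

I don't know how to add a condition that checks for consecutive serial numbers.

Edit: single records can also occur, such as: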

 | orderId | count | min | max | type          |
 |---------|-------|-----|-----|---------------|
 | 1337    | 1     |  10 | 10  | someNewType   | 


Asked by daly
Answered by Gordon Linoff, 2019-07-03 22:07

This is a gaps-and-islands problem. I would use the difference-of-row-numbers approach:

select orderid, count(*), min(serialNumber), max(serialNumber), type
from (select t.*,
             -- position of the row within its order
             row_number() over (partition by orderid order by serialnumber) as seqnum,
             -- position of the row within its order and type
             row_number() over (partition by orderid, type order by serialnumber) as seqnum_type
      from t
     ) t
-- the difference stays constant within each unbroken run of one type
group by orderid, type, (seqnum - seqnum_type)
order by orderid, min(serialNumber);

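serialnumber looks like a string, but it holds numeric values. If you can trust it to be sequential, that is, it converts cleanly to a number and increases by exactly 1 from one row to the next, then you don't even need two row-number values: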

select orderid, count(*), min(serialNumber), max(serialNumber), type
from (select t.*,
             row_number() over (partition by orderid, type order by serialnumber) as seqnum_type
      from t
     ) t
-- serialnumber is implicitly converted to a number for the subtraction
group by orderid, type, (serialnumber - seqnum_type)
order by orderid, min(serialNumber);
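
As a rough check of the single-row-number query above, the sketch below runs it against an in-memory SQLite database from Python. Several things here are assumptions rather than part of the answer: SQLite (3.25 or later, for window functions) stands in for SQL Server, the helper name `islands_by_serial` is made up, and the string-to-number conversion is written as an explicit `CAST` instead of relying on implicit conversion.

```python
import sqlite3


def islands_by_serial(rows):
    """Group rows into islands using serialNumber minus one row number.

    rows: iterable of (orderId, serialNumber, type) tuples.
    Hypothetical helper; SQLite is only a stand-in for SQL Server.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (orderId INTEGER, serialNumber TEXT, type TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    sql = """
        SELECT orderId, COUNT(*), MIN(serialNumber), MAX(serialNumber), type
        FROM (SELECT t.*,
                     ROW_NUMBER() OVER (PARTITION BY orderId, type
                                        ORDER BY serialNumber) AS seqnum_type
              FROM t)
        -- explicit cast (an assumption of this sketch) before subtracting
        GROUP BY orderId, type, CAST(serialNumber AS INTEGER) - seqnum_type
        ORDER BY orderId, MIN(serialNumber)
    """
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    # A lone record, as in the question's edit, forms an island of size 1.
    print(islands_by_serial([(1337, "10", "someNewType")]))
    # A missing serial number (03) splits a run even though the type is unchanged.
    print(islands_by_serial([(3, "01", "someType"),
                             (3, "02", "someType"),
                             (3, "04", "someType")]))
```

Note the behaviour in the second call: because this variant subtracts from the serial number itself, a missing serial number starts a new island. The two-row-number query would keep those three rows together, since no row of another type sits between them.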

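Why this works is a bit hard to explain, but it becomes obvious if you run the subquery. You will see how the difference between the row numbers identifies the groups you are looking for.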
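
To make that concrete, here is a small sketch that loads the question's sample rows into an in-memory SQLite database and runs both the inner query and the full grouped query with two row numbers. SQLite (3.25 or later, for window functions) is only a stand-in for SQL Server, and the helper names `load`, `inner_rows`, and `islands` are invented for this illustration; the table name `t` follows the answer.

```python
import sqlite3

# Sample rows from the question: (orderId, serialNumber, type).
ROWS = [
    (1, "01", "someType"),
    (1, "02", "someType"),
    (1, "03", "someOtherType"),
    (1, "04", "someOtherType"),
    (1, "05", "someType"),
    (1, "06", "someType"),
    (2, "07", "someType"),
    (2, "08", "someType"),
]

# The answer's subquery: one row number per order, one per order and type.
SUBQUERY = """
    SELECT orderId, serialNumber, type,
           ROW_NUMBER() OVER (PARTITION BY orderId
                              ORDER BY serialNumber) AS seqnum,
           ROW_NUMBER() OVER (PARTITION BY orderId, type
                              ORDER BY serialNumber) AS seqnum_type
    FROM t
"""


def load(rows):
    """Create an in-memory table t and fill it with the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (orderId INTEGER, serialNumber TEXT, type TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    return conn


def inner_rows(conn):
    """Return each row with both row numbers and their difference."""
    sql = f"""
        SELECT orderId, serialNumber, type, seqnum, seqnum_type,
               seqnum - seqnum_type AS diff
        FROM ({SUBQUERY})
        ORDER BY orderId, serialNumber
    """
    return conn.execute(sql).fetchall()


def islands(conn):
    """Return one row per island: orderId, count, min, max, type."""
    sql = f"""
        SELECT orderId, COUNT(*), MIN(serialNumber), MAX(serialNumber), type
        FROM ({SUBQUERY})
        GROUP BY orderId, type, seqnum - seqnum_type
        ORDER BY orderId, MIN(serialNumber)
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    conn = load(ROWS)
    for row in inner_rows(conn):
        print(row)
    for row in islands(conn):
        print(row)
```

For order 1 the difference is 0 for serial numbers 01 and 02 and 2 for 03 through 06. The rows 03 to 04 and 05 to 06 share the value 2, but they still land in separate groups because type is also part of the GROUP BY.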