Lets say i have the following table:
| ... | orderId | serialNumber | type | ... |
|-----|---------|--------------|---------------|-----|
| ... | 1 | 01 | someType | ... |
| ... | 1 | 02 | someType | ... |
| ... | 1 | 03 | someOtherType | ... |
| ... | 1 | 04 | someOtherType | ... |
| ... | 1 | 05 | someType | ... |
| ... | 1 | 06 | someType | ... |
| ... | 2 | 07 | someType | ... |
| ... | 2 | 08 | someType | ... |
I want my query to generate the following result:
| orderId | count | min | max | type |
|---------|-------|-----|-----|---------------|
| 1 | 2 | 01 | 02 | someType |
| 1 | 2 | 03 | 04 | someOtherType |
| 1 | 2 | 05 | 06 | someType |
| 2 | 4 | 07 | 08 | someType |
I want groups by type and orderId. When the serialNumber is not ongoing, a new entry should be created.
This is my current query:
SELECT
orderId,
count(*) AS count,
min(serialNumber) AS min,
max(serialNumber) AS max,
type
FROM tblMyTable
group by type, orderId
order by orderId
but the result it creates is wrong:
| orderId | count | min | max | type |
|---------|-------|-----|-----|---------------|
| 1 | 4 | 01 | 06 | someType | <-- this should be 2 entries
| 1 | 2 | 03 | 04 | someOtherType |
| 2 | 2 | 07 | 08 | someType |
As you can see it does not detect that serialNumber 03 and 04 belongs to another description. This leads to an incorrect count and max-value.
I don't know how to add a criteria that checks for a continuous serialNumber.
EDIT: There can also occur single records like:
| orderId | count | min | max | type |
|---------|-------|-----|-----|---------------|
| 1337 | 1 | 10 | 10 | someNewType |
This is a gaps and islands problem, but I would use the difference of row numbers approach:
select orderid, count(*), min(serialNumber), max(serialNumber), type
from (select t.*,
row_number() over (partition by orderid order by serialnumber) as seqnum,
row_number() over (partition by orderid, type order by serialnumber) as seqnum_type
from t
) t
group by orderid, type, (seqnum - seqnum_type)
order by orderid, min(serialNumber);
Your serialnumber
looks like a string, but it has numeric values. If you can trust the sequencing of it, you don't even need both row number values:
select orderid, count(*), min(serialNumber), max(serialNumber), type
from (select t.*,
row_number() over (partition by orderid, type order by serialnumber) as seqnum_type
from t
) t
group by orderid, type, (serialnumber - seqnum_type)
order by orderid, min(serialNumber);
The reason this works is a little hard to explain, but quite obvious if you run the subqueries. You will see how the difference between the row numbers identifies the groups that you want to identify.
Much cleaner ... I just got the Flag approach stuck in my little head +1