Warm tip: This article is reproduced from stackoverflow.com, please click
sql-server sql-server-2016 tsql

Select with multiple subselects on same table with parent-children like relations

发布于 2020-04-04 10:14:22

I have a table which contains data on machines. Among many other there are columns for the (unique) equipment number (EMEQP#), the financial cost (EMCOST), the current value (EMBOOK) and the equipment link (EMLEQP). Some machines are made up of several other machines, these are linked by putting the EMEQP# of the "parent" machine in the EMLEQP column. The EMLEQP of the parent is sometimes filled in (with its own EMEQP#) but can also be empty. Also not every parent has children, but even then the EMLEQP can be filled in or not.

Here's a simplified sample of the data:

EMEQP#    |  EMLEQP  |  EMCOST  |  EMBOOK  
ACMD001   |  ACMD001 |  10      |  12  
ACMD001A  |  ACMD001 |  8       |  1  
ACMD002   |  NULL    |  10      |  12  
SLE003    |  NULL    |  11      |  5
RUP5120   |  SLE003  |  12      |  7  

So in this case ACMD001, ACMD002 and SLE003 are parent machines (because EMLEQP is empty or the same as EMEQP#. ACMD001A is a child of ACMD001 and RUP5120 a child of SLE003.

Result of the query should be:

EMEQP#   |  EMCOST  |  EMBOOK  
ACMD001  |  18      |  13  
ACMD002  |  10      |  12  
SLE003   |  23      |  12

I need to know the total EMCOST and EMBOOK of each machine, so the sum of the parent and any children. At first I tried using group by on EMLEQP, but this doesn't work as that can be empty on the parent.
Next I tried working with a subquery. The main query selects all the parents (WHERE EMLEQP = EMEQP# OR EMLEQP = '') and the subselect calculates the sum of all machines where EMLEQP is the EMEQP# of the parent (or empty to include the parent as well). Here's what I have so far to calculate the EMCOST sum.

SELECT EMEQP#, (SELECT SUM(b.emcost) FROM uv_EQPMASFL AS b WHERE (b.emleqp = a.emeqp# AND b.emeqp# <> a.emeqp# OR b.emeqp# = a.emeqp#) 'EMCOST' FROM uv_EQPMASFL a WHERE (EMLEQP = EMEQP# or EMLEQP = '')

This query works, but is extremely slow. It takes over 15 minutes to run it and I haven't even added the second sum to it. Since I cannot use two columns in a subquery, I will need to add a second subquery to get the second sum. Finally I would also like it if I could get another column to indicate the number of machines which are part of the parent (parent included).

There must be a better, more efficient way to get this data, but I'm not seeing how.

Questioner
Nils Tiebos
Viewed
29
Nils Tiebos 2020-02-04 17:18

After the suggestion from @XAMT, I took another approach and was able to fix this.
For anyone interested, I first SELECT the data where EMLEQP is not empty (so all children and parents with their own EMEQP# in EMLEQP and then UNION that with a SELECT of the data where EMLEQP is empty.
From that selection I take another SELECT which I GROUP BY EMEQP#