Warm tip: This article is reproduced from serverfault.com, please click

How to speed up count(distinct) with Between clause in MySQL

发布于 2020-11-24 17:55:27

I have a MySQL table of 10 million rows and 3 columns, in following format:

id                                     time                               num

ca65e871-d758-437e-b76f-175234760e7b  2020-11-14 23:08:05.553770          11112222222
...

For running the first query below, I indexed the table on (num, time) and it works very fast (<5 milliseconds on 10 million rows table):

SELECT COUNT(*) 
FROM TABLE_NAME 
WHERE time >= '2020-11-14 23:08:05.553752' AND num = 11112222222

However I also need to execute count(distinct) on the same table with between clause, something like this:

SELECT COUNT(DISTINCT num) 
FROM TABLE_NAME 
WHERE time >= '2020-11-14 23:08:05.553752'
  AND num BETWEEN (11112222222 - 30)
              AND (11112222222 + 30)

This turns out to be significantly slower, around 200 milliseconds. Is there a way to speed the execution time of the second query on the same table?

Questioner
Makaroni
Viewed
0
Akina 2020-11-25 02:21:55

If your MySQl is 8+ then try:

WITH RECURSIVE
cte AS ( SELECT 11112222222 - 30 num
         UNION ALL
         SELECT num + 1 FROM cte WHERE num < 11112222222 + 30 )
SELECT COUNT(*)
FROM cte
WHERE EXISTS ( SELECT NULL
               FROM TABLE_NAME 
               WHERE TABLE_NAME.num = cte.num
                 AND time >= '2020-11-14 23:08:05.553752' )

If you'll often execute such query then I'd suggest to create service table with the numbers from -30 to 30 and use it instead of recursive CTE.