Warm tip: This article is reproduced from stackoverflow.com, please click
mysql optimization select groupwise-maximum

optimize a query in MySQL 5.7

发布于 2020-04-07 10:18:12

Here is my system:

Linux Ubuntu 18.04 LTS
MySQL 5.7

I have a query (below) that is taking too long to complete.

It is taking as long at 9 seconds to complete. That is an untenable amount of time for a user when waiting for a web page to complete loading.

Also, the dataset is large and growing. The events table has 250,000 rows. I tend to add 1200 to 1800 rows per day.

To help in optimization, I would like to add indexes and such, but am stumped about how (and if) I can do that with the derived/joined select query I have.

Here is my query. (Anyone know how to limit the width of the output so that we don't have to scroll to the right to see the full line?)

mysql> explain select OUTSIDE.ownerUID as Owner, 
  OUTSIDE.propUID as Property, 
  OUTSIDE.camname as 'Camera Name', 
  OUTSIDE.direction as Direction, 
  OUTSIDE.camtimestamp as 'Event Time', 
  convert_tz(now(),'UTC','US/Central') as Now, 
  sec_to_time(convert_tz(now(),'UTC','US/Central') - OUTSIDE.CamTimeStamp) as 'Elapsed Time' 
from events OUTSIDE, 
  (select camname,max(camtimestamp) as maxtimestamp from events group by camname) as INSIDE 
where OUTSIDE.camname = INSIDE.camname 
  AND OUTSIDE.camtimestamp = INSIDE.maxtimestamp;
+----+-------------+------------+------------+-------+----------------------+--------------+---------+---------------------+--------+----------+-------------+
| id | select_type | table      | partitions | type  | possible_keys        | key          | key_len | ref                 | rows   | filtered | Extra       |
+----+-------------+------------+------------+-------+----------------------+--------------+---------+---------------------+--------+----------+-------------+
|  1 | PRIMARY     | <derived2> | NULL       | ALL   | NULL                 | NULL         | NULL    | NULL                | 263103 |   100.00 | Using where |
|  1 | PRIMARY     | OUTSIDE    | NULL       | ref   | camtimestamp,camname | camtimestamp | 6       | INSIDE.maxtimestamp |      1 |    99.73 | Using where |
|  2 | DERIVED     | events     | NULL       | index | camname              | camname      | 257     | NULL                | 263103 |   100.00 | NULL        |
+----+-------------+------------+------------+-------+----------------------+--------------+---------+---------------------+--------+----------+-------------+
3 rows in set, 1 warning (0.03 sec)

Here are the results of the query:

mysql> select OUTSIDE.ownerUID as Owner, 
  OUTSIDE.propUID as Property, 
  OUTSIDE.camname as 'Camera Name', 
  OUTSIDE.direction as Direction, 
  OUTSIDE.camtimestamp as 'Event Time', 
  convert_tz(now(),'UTC','US/Central') as Now, 
  sec_to_time(convert_tz(now(),'UTC','US/Central') - OUTSIDE.CamTimeStamp) as 'Elapsed Time' 
from events OUTSIDE, 
  (select camname,max(camtimestamp) as maxtimestamp from events group by camname) as INSIDE 
where OUTSIDE.camname = INSIDE.camname 
  AND OUTSIDE.camtimestamp = INSIDE.maxtimestamp;
+-------+----------+-------------+-----------+---------------------+---------------------+--------------+
| Owner | Property | Camera Name | Direction | Event Time          | Now                 | Elapsed Time |
+-------+----------+-------------+-----------+---------------------+---------------------+--------------+
|     1 |        1 | wls1        | In        | 2020-01-30 12:27:31 | 2020-01-30 12:29:53 | 00:03:42     |
|     1 |        1 | wls2        | Out       | 2020-01-30 12:25:29 | 2020-01-30 12:29:53 | 00:07:04     |
+-------+----------+-------------+-----------+---------------------+---------------------+--------------+
2 rows in set (6.49 sec)

========================================================== Thank you to all who answered the question or commented. I took Uueerdo's suggestion of a composite index and came up with the following results:

mysql> create index CamNameCamTime on events (camname,camtimestamp);

mysql> select OUTSIDE.ownerUID as Owner,
           OUTSIDE.propUID as Property,
           OUTSIDE.camname as 'Camera Name',
           OUTSIDE.direction as Direction,
           OUTSIDE.camtimestamp as 'Event Time',
           convert_tz(now(),'UTC','US/Central') as Now,
           sec_to_time(convert_tz(now(),'UTC','US/Central') - OUTSIDE.CamTimeStamp) as 'Elapsed Time' 
      from events OUTSIDE,
         (select camname,max(camtimestamp) as maxtimestamp 
           from events group by camname) as INSIDE
      where OUTSIDE.camname = INSIDE.camname AND OUTSIDE.camtimestamp = INSIDE.maxtimestamp;
+-------+----------+-------------+-----------+---------------------+---------------------+--------------+
| Owner | Property | Camera Name | Direction | Event Time          | Now                 | Elapsed Time |
+-------+----------+-------------+-----------+---------------------+---------------------+--------------+
|     1 |        1 | wls1        | In        | 2020-01-30 18:43:19 | 2020-01-30 18:44:33 | 00:01:54     |
|     1 |        1 | wls2        | Out       | 2020-01-30 18:41:51 | 2020-01-30 18:44:33 | 00:04:42     |
+-------+----------+-------------+-----------+---------------------+---------------------+--------------+
2 rows in set (0.00 sec)

mysql> explain select OUTSIDE.ownerUID as Owner,
    ->            OUTSIDE.propUID as Property,
    ->            OUTSIDE.camname as 'Camera Name',
    ->            OUTSIDE.direction as Direction,
    ->            OUTSIDE.camtimestamp as 'Event Time',
    ->            convert_tz(now(),'UTC','US/Central') as Now,
    ->            sec_to_time(convert_tz(now(),'UTC','US/Central') - OUTSIDE.CamTimeStamp) as 'Elapsed Time'
    ->       from events OUTSIDE,
    ->          (select camname,max(camtimestamp) as maxtimestamp
    ->            from events group by camname) as INSIDE
    ->       where OUTSIDE.camname = INSIDE.camname AND OUTSIDE.camtimestamp = INSIDE.maxtimestamp;
+----+-------------+------------+------------+-------+-------------------------------------+----------------+---------+------------------------------------+------+----------+--------------------------+
| id | select_type | table      | partitions | type  | possible_keys                       | key            | key_len | ref                                | rows | filtered | Extra                    |
+----+-------------+------------+------------+-------+-------------------------------------+----------------+---------+------------------------------------+------+----------+--------------------------+
|  1 | PRIMARY     | <derived2> | NULL       | ALL   | NULL                                | NULL           | NULL    | NULL                               |    2 |   100.00 | Using where              |
|  1 | PRIMARY     | OUTSIDE    | NULL       | ref   | camtimestamp,camname,CamNameCamTime | CamNameCamTime | 263     | INSIDE.camname,INSIDE.maxtimestamp |    1 |   100.00 | NULL                     |
|  2 | DERIVED     | events     | NULL       | range | camname,CamNameCamTime              | CamNameCamTime | 257     | NULL                               |    2 |   100.00 | Using index for group-by |
+----+-------------+------------+------------+-------+-------------------------------------+----------------+---------+------------------------------------+------+----------+--------------------------+
3 rows in set, 1 warning (0.00 sec)




Success!

Questioner
user3055756
Viewed
65
Uueerdo 2020-02-04 01:06

MySQL cannot use more than one index on each reference to table in a query, so two simple indexes on camname and camtimestamp separately is likely to only be of limited usefulness in this scenario. Such indexes would be perfect for selecting records in a date range, or all records for a name; but when searching for max values of every name, a camname index would be of questionable usefulness (since it must examine every camtimestamp of every name anyway), and a camstimestamp would still necessitate searching every record to make sure all camnames are accounted for (the least camtimestamp could be the only one for a particular camname).

With a composite index on (camname, camtimestamp) the max values for each camname can be quickly identified in the subquery, and then the index can be used again to pick out the row(s) in the table matching the results of the subquery.