###一、背景
####1、阿里云rds Python SDK批量获取慢sql日志条目,API只能分页返回数据,每页最多100个慢sql。
####2、sql平台是基于Django开发,定时任务有失败重试的可能,为保证慢sql不重复获取,采用update_or_create方式(若有更好的方案请留言,谢谢)。


###二、问题
####1、描述:每页100条,插入需要35秒左右,获取前一天的慢sql,任务执行5个小时未完成,每天一次的慢sql统计任务统计不到全量前一天的慢sql,造成统计结果不准确。

[2019-12-26 01:00:02,065][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第1页==================================
[2019-12-26 01:00:36,217][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第2页==================================
[2019-12-26 01:01:10,204][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第3页==================================
[2019-12-26 01:01:44,486][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第4页==================================
[2019-12-26 01:02:19,824][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第5页==================================
[2019-12-26 01:02:53,839][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第6页==================================
[2019-12-26 01:03:28,044][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第7页==================================
[2019-12-26 01:04:02,224][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第8页==================================
[2019-12-26 01:04:36,516][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第9页==================================
[2019-12-26 01:05:11,014][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第10页==================================
[2019-12-26 01:05:45,755][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第11页==================================
[2019-12-26 01:06:20,328][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第12页==================================

####2、现象

mysql> show processlist;
+------+------+-------------------+---------+---------+-------+--------------+------------------------------------------------------------------------------------------------------+
| Id   | User | Host              | db      | Command | Time  | State        | Info                                                                                                 |
+------+------+-------------------+---------+---------+-------+--------------+------------------------------------------------------------------------------------------------------+
| 1549 | root | 10.xx.102.1:38200 | archery | Sleep   |    36 |              | NULL                                                                                                 |
| 1550 | root | xx.xx.xxx.1:38202 | NULL    | Sleep   |    15 |              | NULL                                                                                                 |
| 2542 | root | xx.xx.xxx.3:56684 | archery | Sleep   | 15379 |              | NULL                                                                                                 |
| 2543 | root | xx.xx.xxx.3:56694 | archery | Query   |     1 | Sending data | SELECT `ws_sql_slow_record`.`id`, `ws_sql_slow_record`.`db_instance_id`, `ws_sql_slow_recor |
| 3141 | root | xx.xx.xxx.3:58864 | archery | Sleep   | 15369 |              | NULL                                                                                                 |
| 3652 | root | xx.xx.xxx.3:60618 | archery | Sleep   |     7 |              | NULL                                                                                                 |
| 3653 | root | xx.xx.xxx.1:45304 | NULL    | Query   |     0 | starting     | show processlist                                                                                     |
+------+------+-------------------+---------+---------+-------+--------------+------------------------------------------------------------------------------------------------------+
7 rows in set (0.00 sec)

  

mysql> SELECT `ws_sql_slow_record`.`id`, `ws_sql_slow_record`.`db_instance_id`, `ws_sql_slow_record`.`db_instance_description`, `ws_sql_slow_record`.`host_address`, `ws_sql_slow_record`.`db_name`, `ws_sql_slow_record`.`sql_text`, `ws_sql_slow_record`.`query_times`, `ws_sql_slow_record`.`lock_times`, `ws_sql_slow_record`.`parse_row_counts`, `ws_sql_slow_record`.`return_row_counts`, `ws_sql_slow_record`.`execution_start_time` FROM `ws_sql_slow_record` WHERE (`ws_sql_slow_record`.`db_instance_description` = 'XXXXXXXX主库' AND `ws_sql_slow_record`.`db_instance_id` = 'rm-2ze962s1u16ytgzf6' AND `ws_sql_slow_record`.`db_name` = 'customer_service' AND `ws_sql_slow_record`.`execution_start_time` = '2019-12-25T09:54:24Z' AND `ws_sql_slow_record`.`host_address` = 'customer_service_writer[customer_service_writer] @  [172.23.20.1' AND `ws_sql_slow_record`.`lock_times` = 0 AND `ws_sql_slow_record`.`parse_row_counts` = 1177581 AND `ws_sql_slow_record`.`query_times` = 0 AND `ws_sql_slow_record`.`return_row_counts` = 598 AND `ws_sql_slow_record`.`sql_text` = 'select * from `crm_service_order` where `origin_service_type` = 2 and `status` in (10, 20, 28) and `created_at` >= \'2019-10-16 00:00:00\' order by `id` asc limit 600 offset 12000');
+--------+----------------------+-------------------------+------------------------------------------------------------------+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+------------+------------------+-------------------+----------------------+
| id     | db_instance_id       | db_instance_description | host_address                                                     | db_name          | sql_text                                                                                                                                                                          | query_times | lock_times | parse_row_counts | return_row_counts | execution_start_time |
+--------+----------------------+-------------------------+------------------------------------------------------------------+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+------------+------------------+-------------------+----------------------+
| 252239 | rm-xxxxxxxxxxxxxxxxx | XXXXXXXX主库            | customer_service_writer[customer_service_writer] @  [172.23.20.1 | customer_service | select * from `crm_service_order` where `origin_service_type` = 2 and `status` in (10, 20, 28) and `created_at` >= '2019-10-16 00:00:00' order by `id` asc limit 600 offset 12000 |           0 |          0 |          1177581 |               598 | 2019-12-25T09:54:24Z |
+--------+----------------------+-------------------------+------------------------------------------------------------------+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+------------+------------------+-------------------+----------------------+
1 row in set (0.38 sec)

 

###三、处理过程

####1、统计列基数

mysql> select count(1) from ws_sql_slow_record;
+----------+
| count(1) |
+----------+
|   274099 |
+----------+
1 row in set (0.09 sec)

mysql> select count(distinct host_address) from ws_sql_slow_record;
+------------------------------+
| count(distinct host_address) |
+------------------------------+
|                          731 |
+------------------------------+
1 row in set (0.01 sec)

mysql> select count(distinct execution_start_time) from ws_sql_slow_record;
+--------------------------------------+
| count(distinct execution_start_time) |
+--------------------------------------+
|                                78203 |
+--------------------------------------+
1 row in set (0.51 sec)

mysql> select count(distinct sql_text) from ws_sql_slow_record;
+--------------------------+
| count(distinct sql_text) |
+--------------------------+
|                    70136 |
+--------------------------+
1 row in set (2.50 sec)

mysql> select count(distinct host_address,execution_start_time) from ws_sql_slow_record;
+---------------------------------------------------+
| count(distinct host_address,execution_start_time) |
+---------------------------------------------------+
|                                            181844 |
+---------------------------------------------------+
1 row in set (0.41 sec)

####2、创建合适索引

mysql> alter table ws_sql_slow_record add index idx_host_address_execution_start_time(host_address,execution_start_time);
Query OK, 0 rows affected (4.76 sec)
Records: 0  Duplicates: 0  Warnings: 0

####3、验证索引效果,每页插入耗时降低到2秒

mysql> SELECT `ws_sql_slow_record`.`id`, `ws_sql_slow_record`.`db_instance_id`, `ws_sql_slow_record`.`db_instance_description`, `ws_sql_slow_record`.`host_address`, `ws_sql_slow_record`.`db_name`, `ws_sql_slow_record`.`sql_text`, `ws_sql_slow_record`.`query_times`, `ws_sql_slow_record`.`lock_times`, `ws_sql_slow_record`.`parse_row_counts`, `ws_sql_slow_record`.`return_row_counts`, `ws_sql_slow_record`.`execution_start_time` FROM `ws_sql_slow_record` WHERE (`ws_sql_slow_record`.`db_instance_description` = 'XXXXXXXX主库' AND `ws_sql_slow_record`.`db_instance_id` = 'rm-xxxxxxxxxxxxxxxxxxx' AND `ws_sql_slow_record`.`db_name` = 'customer_service' AND `ws_sql_slow_record`.`execution_start_time` = '2019-12-25T09:54:24Z' AND `ws_sql_slow_record`.`host_address` = 'customer_service_writer[customer_service_writer] @  [172.23.20.1' AND `ws_sql_slow_record`.`lock_times` = 0 AND `ws_sql_slow_record`.`parse_row_counts` = 1177581 AND `ws_sql_slow_record`.`query_times` = 0 AND `ws_sql_slow_record`.`return_row_counts` = 598 AND `ws_sql_slow_record`.`sql_text` = 'select * from `crm_service_order` where `origin_service_type` = 2 and `status` in (10, 20, 28) and `created_at` >= \'2019-10-16 00:00:00\' order by `id` asc limit 600 offset 12000');
+--------+----------------------+-------------------------+------------------------------------------------------------------+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+------------+------------------+-------------------+----------------------+
| id     | db_instance_id       | db_instance_description | host_address                                                     | db_name          | sql_text                                                                                                                                                                          | query_times | lock_times | parse_row_counts | return_row_counts | execution_start_time |
+--------+----------------------+-------------------------+------------------------------------------------------------------+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+------------+------------------+-------------------+----------------------+
| 252239 | rm-xxxxxxxxxxxxxxxxxxx | XXXXXXXX主库            | customer_service_writer[customer_service_writer] @  [172.23.20.1 | customer_service | select * from `crm_service_order` where `origin_service_type` = 2 and `status` in (10, 20, 28) and `created_at` >= '2019-10-16 00:00:00' order by `id` asc limit 600 offset 12000 |           0 |          0 |          1177581 |               598 | 2019-12-25T09:54:24Z |
+--------+----------------------+-------------------------+------------------------------------------------------------------+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+------------+------------------+-------------------+----------------------+
1 row in set (0.00 sec)

mysql> show processlist;
+------+------+-------------------+---------+---------+-------+----------+------------------+
| Id   | User | Host              | db      | Command | Time  | State    | Info             |
+------+------+-------------------+---------+---------+-------+----------+------------------+
| 1549 | root | xx.xx.xxx.1:38200 | archery | Sleep   |    29 |          | NULL             |
| 1550 | root | xx.xx.xxx.1:38202 | NULL    | Sleep   |    58 |          | NULL             |
| 2542 | root | xx.xx.xxx.3:56684 | archery | Sleep   | 18602 |          | NULL             |
| 2543 | root | xx.xx.xxx.3:56694 | archery | Sleep   |   672 |          | NULL             |
| 3141 | root | xx.xx.xxx.3:58864 | archery | Sleep   | 18592 |          | NULL             |
| 3668 | root | xx.xx.xxx.1:45354 | archery | Query   |     0 | starting | show processlist |
| 3762 | root | xx.xx.xxx.3:33592 | archery | Sleep   |     7 |          | NULL             |
+------+------+-------------------+---------+---------+-------+----------+------------------+
7 rows in set (0.00 sec)
[2019-12-26 10:52:13,882][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第229页==================================
[2019-12-26 10:52:15,315][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第230页==================================
[2019-12-26 10:52:17,684][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第231页==================================
[2019-12-26 10:52:20,138][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第232页==================================
[2019-12-26 10:52:22,356][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第233页==================================
[2019-12-26 10:52:24,550][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第234页==================================
[2019-12-26 10:52:26,834][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第235页==================================
[2019-12-26 10:52:29,104][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第236页==================================
[2019-12-26 10:52:31,278][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第237页==================================
[2019-12-26 10:52:33,941][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第238页==================================

  

02-11 20:02