###一、背景
####1、阿里云rds Python SDK批量获取慢sql日志条目,API只能分页返回数据,每页最多100个慢sql。
####2、sql平台是基于Django开发,定时任务有失败重试的可能,为保证慢sql不重复获取,采用update_or_create方式(若有更好的方案请留言,谢谢)。
###二、问题
####1、描述:每页100条,插入需要35秒左右,获取前一天的慢sql,任务执行5个小时未完成,每天一次的慢sql统计任务统计不到全量前一天的慢sql,造成统计结果不准确。
[2019-12-26 01:00:02,065][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第1页================================== [2019-12-26 01:00:36,217][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第2页================================== [2019-12-26 01:01:10,204][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第3页================================== [2019-12-26 01:01:44,486][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第4页================================== [2019-12-26 01:02:19,824][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第5页================================== [2019-12-26 01:02:53,839][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第6页================================== [2019-12-26 01:03:28,044][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第7页================================== [2019-12-26 01:04:02,224][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第8页================================== [2019-12-26 01:04:36,516][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第9页================================== [2019-12-26 01:05:11,014][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第10页================================== [2019-12-26 01:05:45,755][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第11页================================== [2019-12-26 01:06:20,328][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第12页==================================
####2、现象
mysql> show processlist; +------+------+-------------------+---------+---------+-------+--------------+------------------------------------------------------------------------------------------------------+ | Id | User | Host | db | Command | Time | State | Info | +------+------+-------------------+---------+---------+-------+--------------+------------------------------------------------------------------------------------------------------+ | 1549 | root | 10.xx.102.1:38200 | archery | Sleep | 36 | | NULL | | 1550 | root | xx.xx.xxx.1:38202 | NULL | Sleep | 15 | | NULL | | 2542 | root | xx.xx.xxx.3:56684 | archery | Sleep | 15379 | | NULL | | 2543 | root | xx.xx.xxx.3:56694 | archery | Query | 1 | Sending data | SELECT `ws_sql_slow_record`.`id`, `ws_sql_slow_record`.`db_instance_id`, `ws_sql_slow_recor | | 3141 | root | xx.xx.xxx.3:58864 | archery | Sleep | 15369 | | NULL | | 3652 | root | xx.xx.xxx.3:60618 | archery | Sleep | 7 | | NULL | | 3653 | root | xx.xx.xxx.1:45304 | NULL | Query | 0 | starting | show processlist | +------+------+-------------------+---------+---------+-------+--------------+------------------------------------------------------------------------------------------------------+ 7 rows in set (0.00 sec)
mysql> SELECT `ws_sql_slow_record`.`id`, `ws_sql_slow_record`.`db_instance_id`, `ws_sql_slow_record`.`db_instance_description`, `ws_sql_slow_record`.`host_address`, `ws_sql_slow_record`.`db_name`, `ws_sql_slow_record`.`sql_text`, `ws_sql_slow_record`.`query_times`, `ws_sql_slow_record`.`lock_times`, `ws_sql_slow_record`.`parse_row_counts`, `ws_sql_slow_record`.`return_row_counts`, `ws_sql_slow_record`.`execution_start_time` FROM `ws_sql_slow_record` WHERE (`ws_sql_slow_record`.`db_instance_description` = 'XXXXXXXX主库' AND `ws_sql_slow_record`.`db_instance_id` = 'rm-2ze962s1u16ytgzf6' AND `ws_sql_slow_record`.`db_name` = 'customer_service' AND `ws_sql_slow_record`.`execution_start_time` = '2019-12-25T09:54:24Z' AND `ws_sql_slow_record`.`host_address` = 'customer_service_writer[customer_service_writer] @ [172.23.20.1' AND `ws_sql_slow_record`.`lock_times` = 0 AND `ws_sql_slow_record`.`parse_row_counts` = 1177581 AND `ws_sql_slow_record`.`query_times` = 0 AND `ws_sql_slow_record`.`return_row_counts` = 598 AND `ws_sql_slow_record`.`sql_text` = 'select * from `crm_service_order` where `origin_service_type` = 2 and `status` in (10, 20, 28) and `created_at` >= \'2019-10-16 00:00:00\' order by `id` asc limit 600 offset 12000'); +--------+----------------------+-------------------------+------------------------------------------------------------------+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+------------+------------------+-------------------+----------------------+ | id | db_instance_id | db_instance_description | host_address | db_name | sql_text | query_times | lock_times | parse_row_counts | return_row_counts | execution_start_time | +--------+----------------------+-------------------------+------------------------------------------------------------------+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+------------+------------------+-------------------+----------------------+ | 252239 | rm-xxxxxxxxxxxxxxxxx | XXXXXXXX主库 | customer_service_writer[customer_service_writer] @ [172.23.20.1 | customer_service | select * from `crm_service_order` where `origin_service_type` = 2 and `status` in (10, 20, 28) and `created_at` >= '2019-10-16 00:00:00' order by `id` asc limit 600 offset 12000 | 0 | 0 | 1177581 | 598 | 2019-12-25T09:54:24Z | +--------+----------------------+-------------------------+------------------------------------------------------------------+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+------------+------------------+-------------------+----------------------+ 1 row in set (0.38 sec)
###三、处理过程
####1、统计列基数
mysql> select count(1) from ws_sql_slow_record; +----------+ | count(1) | +----------+ | 274099 | +----------+ 1 row in set (0.09 sec) mysql> select count(distinct host_address) from ws_sql_slow_record; +------------------------------+ | count(distinct host_address) | +------------------------------+ | 731 | +------------------------------+ 1 row in set (0.01 sec) mysql> select count(distinct execution_start_time) from ws_sql_slow_record; +--------------------------------------+ | count(distinct execution_start_time) | +--------------------------------------+ | 78203 | +--------------------------------------+ 1 row in set (0.51 sec) mysql> select count(distinct sql_text) from ws_sql_slow_record; +--------------------------+ | count(distinct sql_text) | +--------------------------+ | 70136 | +--------------------------+ 1 row in set (2.50 sec) mysql> select count(distinct host_address,execution_start_time) from ws_sql_slow_record; +---------------------------------------------------+ | count(distinct host_address,execution_start_time) | +---------------------------------------------------+ | 181844 | +---------------------------------------------------+ 1 row in set (0.41 sec)
####2、创建合适索引
mysql> alter table ws_sql_slow_record add index idx_host_address_execution_start_time(host_address,execution_start_time); Query OK, 0 rows affected (4.76 sec) Records: 0 Duplicates: 0 Warnings: 0
####3、验证索引效果,每页插入耗时降低到2秒
mysql> SELECT `ws_sql_slow_record`.`id`, `ws_sql_slow_record`.`db_instance_id`, `ws_sql_slow_record`.`db_instance_description`, `ws_sql_slow_record`.`host_address`, `ws_sql_slow_record`.`db_name`, `ws_sql_slow_record`.`sql_text`, `ws_sql_slow_record`.`query_times`, `ws_sql_slow_record`.`lock_times`, `ws_sql_slow_record`.`parse_row_counts`, `ws_sql_slow_record`.`return_row_counts`, `ws_sql_slow_record`.`execution_start_time` FROM `ws_sql_slow_record` WHERE (`ws_sql_slow_record`.`db_instance_description` = 'XXXXXXXX主库' AND `ws_sql_slow_record`.`db_instance_id` = 'rm-xxxxxxxxxxxxxxxxxxx' AND `ws_sql_slow_record`.`db_name` = 'customer_service' AND `ws_sql_slow_record`.`execution_start_time` = '2019-12-25T09:54:24Z' AND `ws_sql_slow_record`.`host_address` = 'customer_service_writer[customer_service_writer] @ [172.23.20.1' AND `ws_sql_slow_record`.`lock_times` = 0 AND `ws_sql_slow_record`.`parse_row_counts` = 1177581 AND `ws_sql_slow_record`.`query_times` = 0 AND `ws_sql_slow_record`.`return_row_counts` = 598 AND `ws_sql_slow_record`.`sql_text` = 'select * from `crm_service_order` where `origin_service_type` = 2 and `status` in (10, 20, 28) and `created_at` >= \'2019-10-16 00:00:00\' order by `id` asc limit 600 offset 12000'); +--------+----------------------+-------------------------+------------------------------------------------------------------+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+------------+------------------+-------------------+----------------------+ | id | db_instance_id | db_instance_description | host_address | db_name | sql_text | query_times | lock_times | parse_row_counts | return_row_counts | execution_start_time | +--------+----------------------+-------------------------+------------------------------------------------------------------+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+------------+------------------+-------------------+----------------------+ | 252239 | rm-xxxxxxxxxxxxxxxxxxx | XXXXXXXX主库 | customer_service_writer[customer_service_writer] @ [172.23.20.1 | customer_service | select * from `crm_service_order` where `origin_service_type` = 2 and `status` in (10, 20, 28) and `created_at` >= '2019-10-16 00:00:00' order by `id` asc limit 600 offset 12000 | 0 | 0 | 1177581 | 598 | 2019-12-25T09:54:24Z | +--------+----------------------+-------------------------+------------------------------------------------------------------+------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+------------+------------------+-------------------+----------------------+ 1 row in set (0.00 sec) mysql> show processlist; +------+------+-------------------+---------+---------+-------+----------+------------------+ | Id | User | Host | db | Command | Time | State | Info | +------+------+-------------------+---------+---------+-------+----------+------------------+ | 1549 | root | xx.xx.xxx.1:38200 | archery | Sleep | 29 | | NULL | | 1550 | root | xx.xx.xxx.1:38202 | NULL | Sleep | 58 | | NULL | | 2542 | root | xx.xx.xxx.3:56684 | archery | Sleep | 18602 | | NULL | | 2543 | root | xx.xx.xxx.3:56694 | archery | Sleep | 672 | | NULL | | 3141 | root | xx.xx.xxx.3:58864 | archery | Sleep | 18592 | | NULL | | 3668 | root | xx.xx.xxx.1:45354 | archery | Query | 0 | starting | show processlist | | 3762 | root | xx.xx.xxx.3:33592 | archery | Sleep | 7 | | NULL | +------+------+-------------------+---------+---------+-------+----------+------------------+ 7 rows in set (0.00 sec)
[2019-12-26 10:52:13,882][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第229页================================== [2019-12-26 10:52:15,315][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第230页================================== [2019-12-26 10:52:17,684][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第231页================================== [2019-12-26 10:52:20,138][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第232页================================== [2019-12-26 10:52:22,356][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第233页================================== [2019-12-26 10:52:24,550][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第234页================================== [2019-12-26 10:52:26,834][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第235页================================== [2019-12-26 10:52:29,104][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第236页================================== [2019-12-26 10:52:31,278][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第237页================================== [2019-12-26 10:52:33,941][ThreadPoolExecutor-0_0:139773805065984][task_id:default][aliyun_slowlog.py:78][DEBUG]- =========================第238页==================================