一、什么是Table Metadata Lock
在MySQL以前的版本中,存在这样一个bug:
MySQL官方文档链接:
http://bugs.mysql.com/bug.php?id=989
这个bug大致意思是说:当一个会话在主库执行DML操作还没提交时,另一个会话对同一个对象执行了DDL操作如drop table,而由于MySQL的binlog是基于事务提交的先后顺序进行记录的,因此在从库上应用时,就出现了先drop table,然后再向table中insert的情况,导致从库应用出错。
因此,MySQL在5.5.3版本后引入了Metadata lock,只有在事务结束后才会释放Metadata lock,因此在事务提交或回滚前,是无法进行DDL操作的。
MySQL官方文档位置:
http://dev.mysql.com/doc/refman/5.6/en/metadata-locking.html
二、遇到的Metadata Lock问题
前几天时候,公司的开发同事让对一张表添加字段,由于该表数据量很大,因此使用了pt-online-change-schema工具进行字段的添加,在添加的过程中发现进度非常慢,通过shop processlist发现以及积累了大量的Metadata lock:Waiting for table metadata lock。
这些语句很明显是被添加字段的DDL所阻塞,但是DDL又是被谁阻塞了呢?
查询当前正在进行的事务:
mysql> select * from information_schema.innodb_trx\G
*************************** 1. row ***************************
trx_id: 7202
trx_state: RUNNING
trx_started: 2016-07-20 23:02:53
trx_requested_lock_id: NULL
trx_wait_started: NULL
trx_weight: 0
trx_mysql_thread_id: 52402
trx_query: NULL
trx_operation_state: NULL
trx_tables_in_use: 0
trx_tables_locked: 0
trx_lock_structs: 0
trx_lock_memory_bytes: 360
trx_rows_locked: 0
trx_rows_modified: 0
trx_concurrency_tickets: 0
trx_isolation_level: READ COMMITTED
trx_unique_checks: 1
trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
trx_adaptive_hash_latched: 0
trx_adaptive_hash_timeout: 10000
trx_is_read_only: 0
trx_autocommit_non_locking: 0
1 row in set (0.00 sec)
发现一个正在运行的事务,从trx_started字段可以判断出,该事务已经运行了很久,一直没有结束,看来就是这个事务阻塞了添加字段的DDL语句。
根据查询到的trx_started时间以及trx_mysql_thread_id到MySQL的general log中查找,当然前提是开启了general log的功能,在general日志中对应的时间发现该thread执行了语句:
set autocommit=0;
关闭了自动提交,再往下看,oh my god……下面居然是一堆SELECT语句!
好了,终于找到原因,kill掉先:
mysql> kill 52402;
Query OK, 0 rows affected (0.00 sec)
之后便可以正常执行下去了。
三、测试验证
为了进一步确认问题的原因并验证,进行模拟测试:
会话1:显式开启事务,执行SELECT:
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from test;
+----------+
| date |
+----------+
| 20150616 |
| 20150617 |
| 20150619 |
+----------+
3 rows in set (0.00 sec)
会话2:对test表执行DDL:
mysql> alter table test add index `date`(`date`);
语句被阻塞,show processlist查看状态:
mysql> show processlist;
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+-------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+-------------------------------------------+
| 16 | system user | | NULL | Connect | 540155 | Waiting for master to send event | NULL |
| 17 | system user | | NULL | Connect | 529732 | Slave has read all relay log; waiting for the slave I/O thread to update it | NULL |
| 51673 | root | localhost | test | Sleep | 55 | | NULL |
| 51681 | root | localhost | test | Query | 0 | init | show processlist |
| 51683 | root | localhost | test | Query | 29 | Waiting for table metadata lock | alter table test add index `date`(`date`) |
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+-------------------------------------------+
5 rows in set (0.00 sec)
可以看到alter table语句的状态为Waiting for table metadata lock
会话3:对test表进行查询:
mysql> select * from test;
同样被阻塞:
mysql> show processlist;
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+-------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+-------------------------------------------+
| 16 | system user | | NULL | Connect | 540305 | Waiting for master to send event | NULL |
| 17 | system user | | NULL | Connect | 529882 | Slave has read all relay log; waiting for the slave I/O thread to update it | NULL |
| 51673 | root | localhost | test | Sleep | 205 | | NULL |
| 51681 | root | localhost | test | Query | 0 | init | show processlist |
| 51683 | root | localhost | test | Query | 179 | Waiting for table metadata lock | alter table test add index `date`(`date`) |
| 51703 | root | localhost | test | Query | 18 | Waiting for table metadata lock | select * from test |
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+-------------------------------------------+
6 rows in set (0.00 sec)
接下来我们将会话1的事务提交,效果如下:
- 会话1:
mysql> begin;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from test;
+----------+
| date |
+----------+
| 20150616 |
| 20150617 |
| 20150619 |
+----------+
3 rows in set (0.00 sec)
mysql> commit;
Query OK, 0 rows affected (0.00 sec)
- 会话2:
mysql> alter table test add index `date`(`date`);
Query OK, 0 rows affected (3 min 49.87 sec)
Records: 0 Duplicates: 0 Warnings: 0
- 会话3:
mysql> select * from test;
+----------+
| date |
+----------+
| 20150616 |
| 20150617 |
| 20150619 |
+----------+
3 rows in set (1 min 8.27 sec)
- show processlist:
mysql> show processlist;
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+------------------+
| 16 | system user | | NULL | Connect | 540411 | Waiting for master to send event | NULL |
| 17 | system user | | NULL | Connect | 529988 | Slave has read all relay log; waiting for the slave I/O thread to update it | NULL |
| 51673 | root | localhost | test | Sleep | 55 | | NULL |
| 51681 | root | localhost | test | Query | 0 | init | show processlist |
| 51683 | root | localhost | test | Sleep | 285 | | NULL |
| 51703 | root | localhost | test | Sleep | 124 | | NULL |
+-------+-------------+-----------+------+---------+--------+-----------------------------------------------------------------------------+------------------+
6 rows in set (0.00 sec)
可以看到,当会话1提交事务后,会话2和会话3的语句便可以正常执行了,由于被阻塞的原因,因此执行时间分别为 ( 3 min 49.87 sec ) 、( 1 min 8.27 sec )
四、总结
- 对于纯SELECT操作来说,完全没有必要添加事务,MySQL的innodb是基于MVCC多版本控制,加事务没有任何意义
- 需要使用到事务时,也要尽量缩小事务的运行时间,一个事务中不要包含太多的语句