Let's start with the error log:
InnoDB: Warning: a long semaphore wait:
--Thread 140593224754944 has waited at btr0cur.c line 528 for 241.00 seconds the semaphore:
X-lock on RW-latch at 0x7fd9142bfcc8 created in file dict0dict.c line 1838
a writer (thread id 140570526021376) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0cur.c line 535
Last time write locked in file /pb2/build/sb_0-10180689-1378752874.69/mysql-5.5.34/storage/innobase/btr/btr0cur.c line 528
InnoDB: Warning: a long semaphore wait:
--Thread 140570431108864 has waited at btr0cur.c line 528 for 241.00 seconds the semaphore:
X-lock on RW-latch at 0x7fd9142bfcc8 created in file dict0dict.c line 1838
a writer (thread id 140570526021376) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0cur.c line 535
Last time write locked in file /pb2/build/sb_0-10180689-1378752874.69/mysql-5.5.34/storage/innobase/btr/btr0cur.c line 528
……………………
END OF INNODB MONITOR OUTPUT
============================
InnoDB: ###### Diagnostic info printed to the standard error stream
InnoDB: Error: semaphore wait has lasted > 600 seconds
InnoDB: We intentionally crash the server, because it appears to be hung.
140101 4:32:58 InnoDB: Assertion failure in thread 140570570065664 in file srv0srv.c line 2502
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.5/...-recovery.html
InnoDB: about forcing recovery.
20:32:58 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.
key_buffer_size=16777216
read_buffer_size=131072
max_used_connections=608
max_threads=1600
thread_count=516
connection_count=515
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 444459 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x30000
/usr/local/mysql/bin/mysqld(my_print_stacktrace+0x35)[0x7a5f15]
/usr/local/mysql/bin/mysqld(handle_fatal_signal+0x403)[0x673a13]
/lib/libpthread.so.0(+0xef60)[0x7fde6901cf60]
/lib/libc.so.6(gsignal+0x35)[0x7fde68219165]
/lib/libc.so.6(abort+0x180)[0x7fde6821bf70]
/usr/local/mysql/bin/mysqld[0x7ff2ce]
/lib/libpthread.so.0(+0x68ba)[0x7fde690148ba]
/lib/libc.so.6(clone+0x6d)[0x7fde682b602d]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
131231 04:34:11 mysqld_safe Number of processes running now: 0
131231 04:34:11 mysqld_safe mysqld restarted
The MySQL process on this machine crashed in the early morning, and the error log was full of entries like this:
InnoDB: Warning: a long semaphore wait
--Thread 140570431108864 has waited at btr0cur.c line 528 for 241.00 seconds the semaphore:
X-lock on RW-latch at 0x7fd9142bfcc8 created in file dict0dict.c line 1838
The monitoring graphs (see the 25th through the 31st) show that spin waits and OS waits were quite high. The manual has this to say:
You can monitor the use of the adaptive hash index and the contention for its use in the SEMAPHORES section of the output of the SHOW ENGINE INNODB STATUS command. If you see many threads waiting on an RW-latch created in btr0sea.c, then it might be useful to disable adaptive hash indexing.
Sometimes, the read/write lock that guards access to the adaptive hash index can become a source of contention under heavy workloads, such as multiple concurrent joins.
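The check the manual describes (which source file the contended RW-latch was created in) gets tedious by eye when the log holds hundreds of wait entries. Below is a minimal Python sketch that tallies waiting threads by latch origin. The function names are my own and not part of MySQL, and the regular expression assumes the MySQL 5.5 message layout seen in the log above; other versions may word these lines differently.

```python
import os
import re
from collections import Counter

# Assumed layout (taken from the MySQL 5.5.34 log in this post):
#   --Thread <id> has waited at <file> line <n> for <secs> seconds the semaphore:
#   X-lock on RW-latch at 0x... created in file <file> line <n>
WAIT_RE = re.compile(
    r"--Thread (?P<thread>\d+) has waited at (?P<wait_file>\S+) line (?P<wait_line>\d+)"
    r" for (?P<seconds>[\d.]+) seconds the semaphore:\s+"
    r"(?P<mode>\S+) on RW-latch at (?P<addr>0x[0-9a-fA-F]+)"
    r" created in file (?P<created_file>\S+) line (?P<created_line>\d+)"
)


def count_waits_by_latch_origin(text):
    """Count waiting threads per (file, line) where the RW-latch was created."""
    counts = Counter()
    for match in WAIT_RE.finditer(text):
        # Some builds print a full build path; keep only the file name.
        created_file = os.path.basename(match.group("created_file"))
        counts[(created_file, int(match.group("created_line")))] += 1
    return counts


def waits_on_adaptive_hash_latch(counts):
    """Sum the waits on latches created in btr0sea.c, the file the manual names."""
    return sum(n for (name, _line), n in counts.items() if name.startswith("btr0sea"))


# The two wait entries quoted from the error log in this post.
SAMPLE = """\
--Thread 140593224754944 has waited at btr0cur.c line 528 for 241.00 seconds the semaphore:
X-lock on RW-latch at 0x7fd9142bfcc8 created in file dict0dict.c line 1838
--Thread 140570431108864 has waited at btr0cur.c line 528 for 241.00 seconds the semaphore:
X-lock on RW-latch at 0x7fd9142bfcc8 created in file dict0dict.c line 1838
"""

if __name__ == "__main__":
    counts = count_waits_by_latch_origin(SAMPLE)
    for (name, line), n in counts.most_common():
        print(f"{n} waiting thread(s) on latch created in {name} line {line}")
    print("waits on btr0sea latches:", waits_on_adaptive_hash_latch(counts))
```

To use it on a real server, pass in the contents of the error log or the saved output of SHOW ENGINE INNODB STATUS instead of SAMPLE. On the two entries quoted here it attributes both waits to the latch created at dict0dict.c line 1838.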
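The adaptive hash index was causing heavy latch contention, which blocked many threads and eventually led MySQL to crash and restart.
Once the cause was identified, I disabled the adaptive hash index. After a day of observation (see the performance graph for January 1), spin waits and OS waits gradually decreased.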
set global innodb_adaptive_hash_index = 0;
With the root cause found, the problem was resolved.
Reference manual: