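Analysis of a MySQL connection hang
[Symptoms]:
1. A multithreaded C program on Linux connects to a MySQL database using short-lived connections: each connection is released as soon as the operation completes. Occasionally, mysql_real_connect hangs.
2. When the hung mysql_real_connect call eventually returns, it reports the following error: Lost connection to MySQL server at 'reading initial communication packet', system error: 104. After the error is returned, the thread ID is unchanged and the thread continues running.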
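To make the "system error: 104" part of the message easier to read, here is a minimal sketch that turns a system error number into a symbolic name and message. It assumes a common Linux architecture, where errno 104 is ECONNRESET ("Connection reset by peer"); the numeric value is platform-specific, and errno_name and format_system_error are hypothetical helper names, not part of any MySQL API.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: symbolic names for a few connection-related errors. */
const char *errno_name(int err)
{
    switch (err) {
    case ECONNRESET:   return "ECONNRESET";
    case ETIMEDOUT:    return "ETIMEDOUT";
    case ECONNREFUSED: return "ECONNREFUSED";
    default:           return "(other)";
    }
}

/*
 * Hypothetical helper: write "system error <n> (<name>): <message>" into buf.
 * Returns the snprintf() result (length the full text needs).
 */
int format_system_error(int err, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "system error %d (%s): %s",
                    err, errno_name(err), strerror(err));
}
```

Calling format_system_error(104, msg, sizeof(msg)) on such a system should name ECONNRESET, which suggests the peer (or something on the path to it) reset the connection while the client was waiting for the first packet.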
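[Preliminary root-cause analysis]:
1. mysql_real_connect is called without explicitly setting any timeouts, automatic reconnection, or similar options, so the default values are used.
2. Yesterday I analyzed and tested the C mysqlclient source code and found the following default values for several options (when no options are set before calling mysql_real_connect and /etc/my.cnf on the client machine does not configure them either):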
connect_timeout = 0
read_timeout = 0
write_timeout = 0
reconnect = 0 // 1 enables automatic reconnection
So these options need to be set explicitly.
{net = {vio = 0x7b8f150, options = {connect_timeout = 0, read_timeout = 0, write_timeout = 0,}
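3. The mysqlclient code that reports the error is shown below (the connection to the server has been established, but reading and parsing the first packet fails):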
/*
  Part 1: Connection established, read and parse first packet
*/
if ((pkt_length=cli_safe_read(mysql)) == packet_error)
{
  if (mysql->net.last_errno == CR_SERVER_LOST)
    set_mysql_extended_error(mysql, CR_SERVER_LOST, unknown_sqlstate,
                             ER(CR_SERVER_LOST_EXTENDED),
                             "reading initial communication packet",
                             errno);
  fprintf(myfp,"%s after cli_safe_read packet_error \n",cur_time());
  goto error;
}
The stack captured while the thread was blocked is as follows:
#0 0x0000003ebe0c5f3b in read () from /lib64/libc.so.6
#1 0x00007f240dd91430 in vio_read () from /usr/lib/libmysql.so.16.0.0
#2 0x00007f240dcf9b05 in my_real_read () from /usr/lib/libmysql.so.16.0.0
#3 0x00007f240dcf9e38 in my_net_read () from /usr/lib/libmysql.so.16.0.0
#4 0x00007f240dcec396 in cli_safe_read () from /usr/lib/libmysql.so.16.0.0
#5 0x00007f240dceea17 in mysql_real_connect () from /usr/lib/libmysql.so.16.0.0
4. The root question seems to be why the thread blocks in read() and never returns with a timeout:
size_t vio_read(Vio * vio, uchar* buf, size_t size)
{
  size_t r;
  DBUG_ENTER("vio_read");
  DBUG_PRINT("enter", ("sd: %d buf: 0x%lx size: %u", vio->sd, (long) buf,
                       (uint) size));
  /* Ensure nobody uses vio_read_buff and vio_read simultaneously */
  DBUG_ASSERT(vio->read_end == vio->read_pos);
#ifdef __WIN__
  r = recv(vio->sd, buf, size,0);
#else
  errno=0; /* For linux */
  r = read(vio->sd, buf, size);
#endif /* __WIN__ */
#ifndef DBUG_OFF
  if (r == (size_t) -1)
  {
    DBUG_PRINT("vio_error", ("Got error %d during read",errno));
  }
#endif /* DBUG_OFF */
  DBUG_PRINT("exit", ("%ld", (long) r));
  DBUG_RETURN(r);
}
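vio_read calls read() directly and has no time limit of its own, so any bound has to come from the socket or from the caller. As an illustration only (this is not how libmysqlclient does it), the sketch below bounds a read by waiting with poll() first. read_with_timeout is a hypothetical helper name, and reporting the timeout as ETIMEDOUT is a choice made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of libmysqlclient: wait up to timeout_ms
 * for fd to become readable, then read once.
 * Returns the number of bytes read, 0 on end of file, or -1 with errno set
 * (ETIMEDOUT if nothing became readable in time).
 */
ssize_t read_with_timeout(int fd, void *buf, size_t size, int timeout_ms)
{
    struct pollfd pfd;
    int ready;

    assert(buf != NULL);
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Retry if a signal interrupts the wait (this restarts the full timeout). */
    do {
        ready = poll(&pfd, 1, timeout_ms);
    } while (ready < 0 && errno == EINTR);

    if (ready < 0)
        return -1;              /* poll() itself failed */
    if (ready == 0) {
        errno = ETIMEDOUT;      /* nothing arrived within timeout_ms */
        return -1;
    }
    return read(fd, buf, size); /* data, EOF, or an error is available now */
}
```

With a helper like this, a peer that accepts the connection but never sends anything costs the caller at most timeout_ms instead of blocking the thread indefinitely.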
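5. Tracing the socket read/write code that mysqlclient uses for the server connection shows that the MYSQL_OPT_READ_TIMEOUT and MYSQL_OPT_WRITE_TIMEOUT values set through mysql_options are exactly the read and write timeouts applied to the socket: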
setsockopt(vio->sd, SOL_SOCKET, which ? SO_SNDTIMEO : SO_RCVTIMEO,
IF_WIN(const char*, const void*)&wait_timeout,
sizeof(wait_timeout))
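This timeout defaults to a very large value (31536000 s, i.e. 365 days). If read() blocks, the thread hangs until the TCP-level timeout closes the connection (2 hours by default).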
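The effect of SO_RCVTIMEO can be observed without a MySQL server. The following sketch, assuming a POSIX system, creates a connected socket pair, never writes to it (standing in for a server that stays silent), and reads from one end with a receive timeout. set_recv_timeout and read_times_out are hypothetical helper names.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Set the receive timeout on a socket; returns 0 on success, -1 on error. */
int set_recv_timeout(int sd, long sec, long usec)
{
    struct timeval tv;
    tv.tv_sec = sec;
    tv.tv_usec = usec;
    return setsockopt(sd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}

/*
 * Read from a socket whose peer never sends anything.
 * Returns 1 if read() gave up with EAGAIN/EWOULDBLOCK after the timeout,
 * 0 if it returned for any other reason, -1 if the setup failed.
 */
int read_times_out(long timeout_ms)
{
    int sv[2];
    char byte;
    ssize_t n;
    int timed_out;

    /* A zero timeval means "no timeout", which would block forever here. */
    assert(timeout_ms > 0);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    if (set_recv_timeout(sv[0], timeout_ms / 1000,
                         (timeout_ms % 1000) * 1000) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    n = read(sv[0], &byte, 1);  /* blocks for at most timeout_ms */
    timed_out = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));

    close(sv[0]);
    close(sv[1]);
    return timed_out;
}
```

With a short timeout the read gives up quickly; with a timeout of 31536000 seconds the same read would, for practical purposes, never return on its own.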
6. The MySQL website has related notes:
http://dev.mysql.com/worklog/task/?id=1907
http://bugs.mysql.com/bug.php?id=4143
7. The database read/write timeouts can be set by calling mysql_options(); the relevant options are MYSQL_OPT_READ_TIMEOUT and MYSQL_OPT_WRITE_TIMEOUT.
MYSQL_OPT_READ_TIMEOUT (argument type: unsigned int *)
The timeout in seconds for attempts to read from the server. Each attempt uses this timeout value and there are retries if necessary, so the total effective timeout value is three times the option value. You can set the value so that a lost connection can be detected earlier than the TCP/IP Close_Wait_Timeout value of 10 minutes. Before MySQL 5.1.41, this option applies only to TCP/IP connections and, prior to MySQL 5.1.12, only for Windows.
MYSQL_OPT_WRITE_TIMEOUT (argument type: unsigned int *)
The timeout in seconds for attempts to write to the server. Each attempt uses this timeout value and there are net_retry_count retries if necessary, so the total effective timeout value is net_retry_count times the option value. Before MySQL 5.1.41, this option applies only to TCP/IP connections and, prior to MySQL 5.1.12, only for Windows.
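The retry arithmetic in the two descriptions above can be written out as a small worked example. The function names are hypothetical, and net_retry_count is passed in as a parameter because its actual value depends on the configuration in use.

```c
#include <assert.h>
#include <limits.h>

/* Worst-case read wait: three attempts, each lasting option_value seconds. */
unsigned int effective_read_timeout(unsigned int option_value)
{
    assert(option_value <= UINT_MAX / 3u);  /* guard against overflow */
    return 3u * option_value;
}

/* Worst-case write wait: net_retry_count attempts of option_value seconds. */
unsigned int effective_write_timeout(unsigned int option_value,
                                     unsigned int net_retry_count)
{
    assert(net_retry_count == 0u ||
           option_value <= UINT_MAX / net_retry_count);
    return net_retry_count * option_value;
}
```

For example, a read timeout option of 10 seconds means a caller may wait up to 30 seconds in total before the lost connection is reported, which is worth keeping in mind when choosing the option value.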
However, note that these two options are not supported in every version:
Before MySQL 5.1.41, this option applies only to TCP/IP connections and, prior to MySQL 5.1.12, only for Windows.
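On Linux, you must use an officially released client library of version 5.1.12 or later; otherwise you need to compile a thread-safe client library yourself.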