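When you send small amounts of data several times over a TCP socket, the peer may not receive some of it for a while. This can be an effect of TCP's Nagle algorithm.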

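First, a look at the Nagle algorithm.
The Nagle algorithm tries to send data in large blocks, so that the network is not filled with many small packets that lower its utilization.
Its basic rule is that at any moment there can be at most one unacknowledged small segment. A "small segment" is a data block smaller than the MSS. "Unacknowledged" means the block has been sent but the peer's ACK confirming receipt has not yet arrived.
The rules of the Nagle algorithm:
(1) If the packet length reaches the MSS, it may be sent;
(2) If the packet carries a FIN, it may be sent;
(3) If the TCP_NODELAY option is set, it may be sent;
(4) If the TCP_CORK option is not set and all previously sent small packets (shorter than the MSS) have been acknowledged, it may be sent;
(5) If none of the above holds but a timeout occurs (typically 200 ms), it is sent immediately.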
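
The five rules above can be read as a single decision function. The following is only a minimal sketch of that list, not the kernel's implementation; the struct nagle_state and the function nagle_may_send are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the sender's state. The field names are
 * illustrative and do not correspond to any real kernel structure. */
struct nagle_state {
    size_t mss;            /* maximum segment size of the connection */
    size_t unacked_small;  /* small segments sent but not yet ACKed */
    bool   nodelay;        /* TCP_NODELAY is set */
    bool   corked;         /* TCP_CORK is set */
    bool   timer_expired;  /* the timeout from rule (5) has fired */
};

/* Returns true if a segment of seg_len bytes may be sent right now,
 * following rules (1) to (5) as listed in the text. */
bool nagle_may_send(const struct nagle_state *s, size_t seg_len, bool has_fin)
{
    assert(s != NULL && s->mss > 0);

    if (seg_len >= s->mss)                      /* rule (1): full-sized segment */
        return true;
    if (has_fin)                                /* rule (2): segment carries FIN */
        return true;
    if (s->nodelay)                             /* rule (3): Nagle disabled */
        return true;
    if (!s->corked && s->unacked_small == 0)    /* rule (4): nothing small in flight */
        return true;
    return s->timer_expired;                    /* rule (5): timeout */
}
```

With an MSS of 1400 and one 512-byte segment still unacknowledged, the function returns false for another 512-byte segment, which is the situation the second write in a "write-write-read" sequence runs into.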

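In code terms: when a program does "write-write-read", the second write is delayed by one RTT, because the first small segment is still unacknowledged when the second write is issued.
Here is the code.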
server.c
  #include <arpa/inet.h>
  #include <netinet/in.h>
  #include <signal.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <sys/socket.h>
  #include <sys/types.h>
  #include <unistd.h>

  /* Assumed definition; the original macro is not shown in this listing. */
  #define ERR_EXIT(m) do { perror(m); exit(EXIT_FAILURE); } while (0)

  int main(void)
  {
      /* Ignore SIGPIPE so that send() to a closed peer returns an error
         instead of terminating the process. */
      signal(SIGPIPE, SIG_IGN);
      int listenfd;
      if ((listenfd = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP)) < 0)
          ERR_EXIT("socket error");

      struct sockaddr_in servaddr;
      memset(&servaddr, 0, sizeof(servaddr));
      servaddr.sin_family = AF_INET;
      servaddr.sin_port = htons(5188);
      servaddr.sin_addr.s_addr = htonl(INADDR_ANY);

      /* Allow the port to be reused right after a restart. */
      int on = 1;
      if (setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0)
          ERR_EXIT("setsockopt error");

      if (bind(listenfd, (struct sockaddr *)&servaddr, sizeof(servaddr)) < 0)
          ERR_EXIT("bind error");

      if (listen(listenfd, SOMAXCONN) < 0)
          ERR_EXIT("listen error");

      while (1) {
          int conn;
          struct sockaddr_in peeraddr;
          socklen_t peerlen = sizeof(peeraddr);
          if ((conn = accept(listenfd, (struct sockaddr *)&peeraddr, &peerlen)) < 0)
              ERR_EXIT("accept error");
          printf("recv connect ip=%s port=%d\n", inet_ntoa(peeraddr.sin_addr), ntohs(peeraddr.sin_port));

          /* Echo loop: send back whatever arrives until the peer closes. */
          while (1) {
              char recvbuf[1024];
              ssize_t readlen;

              readlen = recv(conn, recvbuf, sizeof(recvbuf), 0);
              if (readlen <= 0)
                  break;
              if (send(conn, recvbuf, readlen, 0) < 0)
                  break;
          }
          printf("close\n");
          close(conn);
      }
      close(listenfd);

      return 0;
  }
client.c
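  #include <arpa/inet.h>
  #include <netinet/in.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <sys/socket.h>
  #include <sys/types.h>
  #include <unistd.h>

  /* Assumed definition; the original macro is not shown in this listing. */
  #define ERR_EXIT(m) do { perror(m); exit(EXIT_FAILURE); } while (0)

  /* Defined in the author's full source, not shown here. */
  int get_args(int argc, char *argv[]);

  int main(int argc, char *argv[])
  {
      int sock;
      int size = 1024;
      char *buffer;
      int option = get_args(argc, argv);
      (void)option; /* not used in the portion shown here */

      buffer = malloc(size);
      if (buffer == NULL)
          ERR_EXIT("malloc error");
      memset(buffer, 'a', size); /* avoid sending uninitialized memory */
      if ((sock = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP)) < 0)
          ERR_EXIT("socket error");

      struct sockaddr_in servaddr;
      memset(&servaddr, 0, sizeof(servaddr));
      servaddr.sin_family = AF_INET;
      servaddr.sin_port = htons(5188);
      /* Replace with the server's address when it runs on a remote host. */
      servaddr.sin_addr.s_addr = inet_addr("127.0.0.1");

      if (connect(sock, (struct sockaddr *)&servaddr, sizeof(servaddr)) < 0)
          ERR_EXIT("connect error");

      ssize_t writelen, readlen;

      /* First small write: half of the buffer. */
      printf("write...\n");
      writelen = send(sock, buffer, size/2, 0);
      if (writelen < 0)
          ERR_EXIT("send error");
      usleep(100);
      /* Second small write: the other half. */
      printf("write...\n");
      writelen = send(sock, buffer+size/2, size/2, 0);
      if (writelen < 0)
          ERR_EXIT("send error");

      printf("read...\n");
      readlen = recv(sock, buffer, size, 0);
      if (readlen < 0)
          ERR_EXIT("recv error");

      printf("close\n");
      close(sock);
      free(buffer);

      return 0;
  }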

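As you can see, the client sends data twice, and each send is smaller than the MSS, so the Nagle rules apply. Let's observe the delay between the two sends.

To make the effect visible, the server runs on a remote machine and the client runs locally.
The packets are captured with tcpdump. For tcpdump usage, see this tutorial.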

$ sudo tcpdump -i wlan0 tcp port 5188
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on wlan0, link-type EN10MB (Ethernet), capture size 65535 bytes
10:23:31.274688 IP vv-Inspiron-5323.lan.55509 > 47.88.22.84.5188: Flags [S], seq 2552470467, win 29200, options [mss 1460,sackOK,TS val 98772 ecr 0,nop,wscale 7], length 0
10:23:31.446624 IP 47.88.22.84.5188 > vv-Inspiron-5323.lan.55509: Flags [S.], seq 2653946229, ack 2552470468, win 28960, options [mss 1400,sackOK,TS val 2278931709 ecr 98772,nop,wscale 7], length 0
10:23:31.446682 IP vv-Inspiron-5323.lan.55509 > 47.88.22.84.5188: Flags [.], ack 1, win 229, options [nop,nop,TS val 98815 ecr 2278931709], length 0
10:23:31.446779 IP vv-Inspiron-5323.lan.55509 > 47.88.22.84.5188: Flags [P.], seq 1:513, ack 1, win 229, options [nop,nop,TS val 98815 ecr 2278931709], length 512
10:23:31.618338 IP 47.88.22.84.5188 > vv-Inspiron-5323.lan.55509: Flags [.], ack 513, win 235, options [nop,nop,TS val 2278931752 ecr 98815], length 0
10:23:31.618374 IP vv-Inspiron-5323.lan.55509 > 47.88.22.84.5188: Flags [P.], seq 513:1025, ack 1, win 229, options [nop,nop,TS val 98858 ecr 2278931752], length 512
10:23:31.618675 IP 47.88.22.84.5188 > vv-Inspiron-5323.lan.55509: Flags [P.], seq 1:513, ack 513, win 235, options [nop,nop,TS val 2278931752 ecr 98815], length 512
10:23:31.618702 IP vv-Inspiron-5323.lan.55509 > 47.88.22.84.5188: Flags [.], ack 513, win 237, options [nop,nop,TS val 98858 ecr 2278931752], length 0
10:23:31.618768 IP vv-Inspiron-5323.lan.55509 > 47.88.22.84.5188: Flags [F.], seq 1025, ack 513, win 237, options [nop,nop,TS val 98858 ecr 2278931752], length 0
10:23:31.790670 IP 47.88.22.84.5188 > vv-Inspiron-5323.lan.55509: Flags [P.], seq 513:1025, ack 1026, win 243, options [nop,nop,TS val 2278931795 ecr 98858], length 512
10:23:31.790709 IP vv-Inspiron-5323.lan.55509 > 47.88.22.84.5188: Flags [R], seq 2552471493, win 0, length 0
10:23:31.790899 IP 47.88.22.84.5188 > vv-Inspiron-5323.lan.55509: Flags [F.], seq 1025, ack 1026, win 243, options [nop,nop,TS val 2278931795 ecr 98858], length 0
10:23:31.790939 IP vv-Inspiron-5323.lan.55509 > 47.88.22.84.5188: Flags [R], seq 2552471493, win 0, length 0

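In the capture, the first 512-byte segment leaves at 10:23:31.446779 and the second at 10:23:31.618374, right after the ACK for the first one arrives at 10:23:31.618338. The two sends are therefore about 170 ms apart.
Now look at the RTT between the server and client hosts: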
$ ping 47.88.22.84
PING 47.88.22.84 (47.88.22.84) 56(84) bytes of data.
64 bytes from 47.88.22.84: icmp_seq=1 ttl=52 time=174 ms
64 bytes from 47.88.22.84: icmp_seq=2 ttl=52 time=174 ms
64 bytes from 47.88.22.84: icmp_seq=3 ttl=52 time=173 ms
The RTT is also about 170 ms, which matches the delay.
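
The 170 ms figure can be checked by subtracting the timestamps of the two outgoing data segments in the capture above. The sketch below assumes tcpdump's default "HH:MM:SS.uuuuuu" timestamp format with both timestamps on the same day; ts_to_usec and gap_usec are hypothetical helper names.

```c
#include <assert.h>
#include <stdio.h>

/* Convert a tcpdump-style "HH:MM:SS.uuuuuu" timestamp into microseconds
 * since midnight. Assumes exactly six fractional digits, as tcpdump prints
 * by default. Returns -1 if the string does not match. */
long long ts_to_usec(const char *ts)
{
    int h, m, s;
    long us;

    assert(ts != NULL);
    if (sscanf(ts, "%d:%d:%d.%6ld", &h, &m, &s, &us) != 4)
        return -1;
    return ((h * 60LL + m) * 60LL + s) * 1000000LL + us;
}

/* Gap in microseconds between two valid timestamps taken on the same day. */
long long gap_usec(const char *earlier, const char *later)
{
    return ts_to_usec(later) - ts_to_usec(earlier);
}
```

For the two data segments in the capture, gap_usec("10:23:31.446779", "10:23:31.618374") gives 171595 microseconds, about 171.6 ms, in line with the ping RTT of 173 to 174 ms.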

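There are two ways to remove this send delay:
1. Change the client's "write-write-read" into "write-read"
This has to be done at the application layer, and sometimes the environment does not allow it.
2. Disable the Nagle algorithm
In C, the Nagle algorithm can be disabled with the setsockopt function, as follows: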
  int on = 1;
  setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, (void *)&on, sizeof(on));
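
The call above ignores the return value of setsockopt. A slightly more defensive sketch wraps it in a helper and reads the option back with getsockopt to confirm it took effect. The helper names disable_nagle and nagle_disabled are hypothetical and are not part of the author's code.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Disable the Nagle algorithm on a TCP socket.
 * Returns 0 on success, -1 on failure (errno is set by setsockopt). */
int disable_nagle(int sock)
{
    int on = 1;

    assert(sock >= 0);
    return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Returns 1 if TCP_NODELAY is set on the socket, 0 if it is not,
 * and -1 if getsockopt fails. */
int nagle_disabled(int sock)
{
    int val = 0;
    socklen_t len = sizeof(val);

    assert(sock >= 0);
    if (getsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

In the client, disable_nagle(sock) would be called after socket() succeeds and before the first send(), with a failure reported the same way as the other calls.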

Let's capture the packets again:
$ sudo tcpdump -i wlan0 tcp port 5188
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on wlan0, link-type EN10MB (Ethernet), capture size 65535 bytes
10:31:09.982217 IP vv-Inspiron-5323.lan.55548 > 47.88.22.84.5188: Flags [S], seq 2900548376, win 29200, options [mss 1460,sackOK,TS val 213449 ecr 0,nop,wscale 7], length 0
10:31:10.156928 IP 47.88.22.84.5188 > vv-Inspiron-5323.lan.55548: Flags [S.], seq 2411382823, ack 2900548377, win 28960, options [mss 1400,sackOK,TS val 2279046387 ecr 213449,nop,wscale 7], length 0
10:31:10.156982 IP vv-Inspiron-5323.lan.55548 > 47.88.22.84.5188: Flags [.], ack 1, win 229, options [nop,nop,TS val 213492 ecr 2279046387], length 0
10:31:10.157062 IP vv-Inspiron-5323.lan.55548 > 47.88.22.84.5188: Flags [P.], seq 1:513, ack 1, win 229, options [nop,nop,TS val 213492 ecr 2279046387], length 512
10:31:10.159812 IP vv-Inspiron-5323.lan.55548 > 47.88.22.84.5188: Flags [P.], seq 513:1025, ack 1, win 229, options [nop,nop,TS val 213493 ecr 2279046387], length 512
10:31:10.328542 IP 47.88.22.84.5188 > vv-Inspiron-5323.lan.55548: Flags [.], ack 513, win 235, options [nop,nop,TS val 2279046430 ecr 213492], length 0
10:31:10.328858 IP 47.88.22.84.5188 > vv-Inspiron-5323.lan.55548: Flags [P.], seq 1:513, ack 513, win 235, options [nop,nop,TS val 2279046430 ecr 213492], length 512
10:31:10.328935 IP vv-Inspiron-5323.lan.55548 > 47.88.22.84.5188: Flags [.], ack 513, win 237, options [nop,nop,TS val 213535 ecr 2279046430], length 0
10:31:10.329007 IP vv-Inspiron-5323.lan.55548 > 47.88.22.84.5188: Flags [F.], seq 1025, ack 513, win 237, options [nop,nop,TS val 213535 ecr 2279046430], length 0
10:31:10.371897 IP 47.88.22.84.5188 > vv-Inspiron-5323.lan.55548: Flags [.], ack 1025, win 243, options [nop,nop,TS val 2279046441 ecr 213493], length 0
10:31:10.500831 IP 47.88.22.84.5188 > vv-Inspiron-5323.lan.55548: Flags [P.], seq 513:1025, ack 1025, win 243, options [nop,nop,TS val 2279046473 ecr 213535], length 512
10:31:10.500883 IP vv-Inspiron-5323.lan.55548 > 47.88.22.84.5188: Flags [R], seq 2900549401, win 0, length 0
10:31:10.501164 IP 47.88.22.84.5188 > vv-Inspiron-5323.lan.55548: Flags [F.], seq 1025, ack 1026, win 243, options [nop,nop,TS val 2279046473 ecr 213535], length 0
10:31:10.501207 IP vv-Inspiron-5323.lan.55548 > 47.88.22.84.5188: Flags [R], seq 2900549402, win 0, length 0
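The 170 ms delay is gone: the two data segments now leave at 10:31:10.157062 and 10:31:10.159812, less than 3 ms apart, and the second no longer waits for the ACK of the first.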
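
If disabling Nagle is not desirable, the first fix (turning "write-write-read" into "write-read") can sometimes be done without copying data into one buffer. When both pieces are available at the same time, they can be handed to the kernel in a single sendmsg call. This is only a sketch of one possible approach; send_two_parts is a hypothetical helper, and a short write is still possible and must be handled by the caller.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Send two separate buffers with one system call, so the TCP stack sees a
 * single write instead of two small ones.
 * Returns the number of bytes sent, or -1 on error (errno is set). */
ssize_t send_two_parts(int sock, const void *a, size_t alen,
                       const void *b, size_t blen)
{
    struct iovec iov[2];
    struct msghdr msg;

    assert(a != NULL && b != NULL);

    /* iov_base is not const-qualified, but sendmsg only reads from it. */
    iov[0].iov_base = (void *)a;
    iov[0].iov_len  = alen;
    iov[1].iov_base = (void *)b;
    iov[1].iov_len  = blen;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov    = iov;
    msg.msg_iovlen = 2;

    return sendmsg(sock, &msg, 0);
}
```

In the client above, the two send() calls would become send_two_parts(sock, buffer, size/2, buffer+size/2, size/2). Since 1024 bytes is below the MSS of 1400 seen in the capture and nothing is in flight at that point, the data would typically go out as one segment without waiting for an ACK.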

In highly interactive network environments, the Nagle algorithm is often unnecessary. Go disables the Nagle algorithm by default.

All of the code mentioned in this article can be found here.





12-23 09:24