Question

I'm trying to understand the performance figures I'm getting and how to determine the optimal number of threads. See the bottom of this post for my results.

I wrote an experimental multi-threaded web client in Perl which downloads a page, grabs the source for each image tag and downloads the image, discarding the data. It uses a non-blocking connect with an initial per-file timeout of 10 seconds which doubles after each timeout and retry. It also caches IP addresses, so each thread only has to do a DNS lookup once.

The total amount of data downloaded is 2271122 bytes in 1316 files via a 2.5 Mbit connection from http://hubblesite.org/gallery/album/entire/npp/all/hires/true/ . The thumbnail images are hosted by a company which claims to specialize in low latency for high-bandwidth applications.

In the worst case (50 threads) less than 2 seconds of CPU time are consumed by the client.
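The non-blocking connect with a doubling timeout described above can be sketched roughly as follows. This is an illustrative Python sketch, not the original Perl client, and the retry cap is an assumption:

```python
import select
import socket

def connect_with_backoff(host, port, initial_timeout=10.0, max_retries=3):
    """Non-blocking connect; the per-attempt timeout doubles after
    each failed attempt, mirroring the client described above."""
    timeout = initial_timeout
    for attempt in range(max_retries + 1):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.setblocking(False)
        try:
            s.connect((host, port))
        except BlockingIOError:
            pass  # expected: the non-blocking connect is in progress
        # Wait until the socket becomes writable (connected) or we time out.
        _, writable, _ = select.select([], [s], [], timeout)
        if writable and s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0:
            return s  # connected successfully
        s.close()
        timeout *= 2  # double the timeout before the next retry
    raise TimeoutError(f"could not connect to {host}:{port}")
```

The same pattern extends naturally to one cached DNS lookup per thread: resolve once, then pass the numeric address to every connect attempt.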
    avg file size    1.7 kB
    avg rtt          100 ms (as measured by ping)
    avg cli cpu/img  1 ms

The fastest average download speed is 5 threads at about 15 KB/sec overall.

The server actually does seem to have pretty low latency, as it takes only 218 ms to get each image, meaning it takes only 18 ms on average for the server to process each request:

      0  cli sends SYN
     50  srv receives SYN
     50  srv sends SYN+ACK
    100  cli connection established / cli sends GET
    150  srv receives GET
    168  srv reads file, sends data, calls close
    218  cli receives HTTP headers + complete file in 2 segments, MSS == 1448

I can see that the per-file average download speed is low because of the small file sizes and the relatively high per-file cost of the connection setup.

What I don't understand is why I see virtually no improvement in performance beyond 2 threads. The server seems to be sufficiently fast, but already starts timing out connections at 5 threads. The timeouts seem to start after about 900 - 1000 successful connections whether it's 5 or 50 threads, which I assume is probably some kind of throttling threshold on the server, but I would expect 10 threads to still be significantly faster than 2. Am I missing something here?

EDIT-1

Just for comparison's sake I installed the DownThemAll Firefox extension and downloaded the images using it. I set it to 4 simultaneous connections with a 10 second timeout.
DTM took about 3 minutes to download all the files and write them to disk, and it also started experiencing timeouts after about 900 connections.

I'm going to run tcpdump to try and get a better picture of what's going on at the TCP protocol level.

I also cleared Firefox's cache and hit reload: 40 seconds to reload the page and all the images. That seemed way too fast - maybe Firefox kept them in a memory cache which wasn't cleared? So I opened Opera, and it also took about 40 seconds. I assume they're so much faster because they must be using HTTP/1.1 pipelining?

The Answer!??

So after a little more testing and writing code to reuse the sockets via pipelining I found out some interesting info.

When running at 5 threads the non-pipelined version retrieves the first 1026 images in 77 seconds but takes a further 65 seconds to retrieve the remaining 290 images. This pretty much confirms MattH's theory about my client getting hit by a SYN FLOOD event, causing the server to stop responding to my connection attempts for a short period of time. However, that is only part of the problem, since 77 seconds is still very slow for 5 threads to get 1026 images; if you remove the SYN FLOOD issue it would still take about 99 seconds to retrieve all the files. So based on a little research and some tcpdump's, it seems like the other part of the issue is latency and connection setup overhead.

Here's where we get back to the issue of finding the "Sweet Spot" or the optimal number of threads.
I modified the client to implement HTTP/1.1 pipelining and found that the optimal number of threads in this case is between 15 and 20.

Four factors affect this: latency/RTT, maximum end-to-end bandwidth, recv buffer size, and the size of the image files being downloaded. See this site for a discussion of how receive buffer size and RTT latency affect available bandwidth.

In addition to the above, average file size affects the maximum per-connection transfer rate. Every time you issue a GET request you create an empty gap in your transfer pipe which is the size of the connection RTT. For example, if your Maximum Possible Transfer Rate (recv buff size / RTT) is 2.5 Mbit and your RTT is 100 ms, then every GET request incurs a minimum 32 kB gap in your pipe. For a large average image size of 320 kB that amounts to a 10% overhead per file, effectively reducing your available bandwidth to 2.25 Mbit. However, for a small average file size of 3.2 kB the overhead jumps to 1000% and available bandwidth is reduced to 232 kbit/second - about 29 kB/sec.
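The arithmetic above can be checked with a short sketch. The exact thread-count formula was garbled in this copy, so the estimate below is a reconstruction consistent with the quoted figures: it assumes the optimal thread count is roughly the link rate divided by the per-connection effective rate.

```python
def gap_bytes(link_bps, rtt_s):
    """Bytes of pipe left empty per GET: one RTT's worth of link bandwidth."""
    return link_bps / 8 * rtt_s

def effective_bps(link_bps, rtt_s, file_bytes):
    """Per-connection throughput once the per-GET gap is accounted for."""
    gap = gap_bytes(link_bps, rtt_s)
    return link_bps * file_bytes / (file_bytes + gap)

link = 2_500_000   # 2.5 Mbit/s maximum possible transfer rate
rtt = 0.100        # 100 ms round-trip time

gap = gap_bytes(link, rtt)                    # ~31 kB of dead pipe per GET
overhead_small = gap / 3_200                  # ~10x, i.e. ~1000% for 3.2 kB files
eff_small = effective_bps(link, rtt, 3_200)   # ~232 kbit/s, about 29 kB/s
eff_large = effective_bps(link, rtt, 320_000) # ~2.25-2.28 Mbit/s for 320 kB files

# Assumed estimate: enough concurrent connections to fill the link.
threads = round(link / eff_small)             # ~11
```

Under this assumption the 3.2 kB scenario yields about 11 threads, which matches the real-world result quoted below.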
So, to find the optimal number of threads: for my above scenario this gives me an optimum thread count of 11 threads, which is extremely close to my real-world results. If the actual connection speed is slower than the theoretical MPTR then it should be used in the calculation instead.

Answer

Please correct me if this summary is incorrect:

- Your multi-threaded client will start a thread that connects to the server and issues just one HTTP GET, then that thread closes.
- When you say 1, 2, 5, 10, 50 threads, you're just referring to how many concurrent threads you allow; each thread itself only handles one request.
- Your client takes between 2 and 5 minutes to download over 1000 images.
- Firefox and Opera will download an equivalent data set in 40 seconds.

I suggest that the server rate-limits HTTP connections, either by the webserver daemon itself, a server-local firewall or, most likely, a dedicated firewall.

You are actually abusing the webservice by not re-using the HTTP connections for more than one request, and the timeouts you experience are because your SYN FLOOD is being clamped.

Firefox and Opera are probably using between 4 and 8 connections to download all of the files. If you redesign your code to re-use the connections you should achieve similar performance.
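The redesign suggested above (many GETs over one persistent connection) looks roughly like this with Python's standard library; `http.client` keeps the TCP connection open across requests when the server allows it. The host and paths here are placeholders, not the actual image URLs:

```python
from http.client import HTTPConnection

def fetch_all(host, paths):
    """Issue many GETs over a single persistent HTTP/1.1 connection,
    avoiding a fresh DNS lookup and TCP handshake per file."""
    conn = HTTPConnection(host, timeout=10)
    sizes = []
    try:
        for path in paths:
            conn.request("GET", path)
            resp = conn.getresponse()
            body = resp.read()  # must drain the body before the next request
            sizes.append(len(body))
    finally:
        conn.close()
    return sizes
```

One connection per thread, each fed a queue of paths, removes the per-file SYN/SYN-ACK round trip that caused both the setup overhead and the SYN-flood clamping described above.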