Problem description
I have simple code that sends a HEAD request for a URL and then prints the response headers. I've noticed that on some sites, this can take a long time to complete.
For example, requesting http://www.arstechnica.com takes about two minutes. I've tried the same request using another web site that does the same basic task, and it comes back immediately. So there must be something I have set incorrectly that's causing this delay.
Here is the code I have:
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 20);
curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
// Only calling the head
curl_setopt($ch, CURLOPT_HEADER, true); // header will be at output
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'HEAD'); // HTTP request is 'HEAD'
$content = curl_exec($ch);
curl_close($ch);
Here's a link to a web site that performs the same function: http://www.seoconsultants.com/tools/headers.asp
The code above, at least on my server, takes two minutes to retrieve www.arstechnica.com, but the service at the link above returns it right away.
What am I missing?
Recommended answer
Try simplifying it:
print htmlentities(file_get_contents("http://www.arstechnica.com"));
The above outputs instantly on my webserver. If it doesn't on yours, there's a good chance your web host has some kind of setting in place to throttle these kinds of requests.
EDIT:
Since the above happens instantly for you, try adding this cURL option to your original code:
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, true);
Using the tool you posted, I noticed that http://www.arstechnica.com responds with a 301 redirect to every request sent to it. It is possible that cURL receives this response and does not follow the new Location it specifies, thus causing your script to hang.
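To check whether a redirect is involved, one option is to look at the raw header text that cURL returns. The following is only a rough sketch: describe_redirect() is a hypothetical helper name, and the sample response (including the example.com Location value) is made up for illustration, not captured from the site in question.

```php
<?php
// Hypothetical helper (not from the original post): given raw response
// headers, return the status code and Location target if the response
// is a 3xx redirect, or null otherwise.
function describe_redirect(string $rawHeaders): ?array
{
    // Status line looks like "HTTP/1.1 301 Moved Permanently".
    if (!preg_match('#^HTTP/\S+\s+(\d{3})#', $rawHeaders, $status)) {
        return null;
    }
    $code = (int) $status[1];
    if ($code < 300 || $code >= 400) {
        return null; // not a redirect
    }
    // Header names are case-insensitive; trim the trailing "\r".
    if (!preg_match('/^Location:[ \t]*(.+?)\s*$/mi', $rawHeaders, $location)) {
        return null;
    }
    return ['status' => $code, 'location' => $location[1]];
}

// Made-up sample response, for illustration only.
$sample = "HTTP/1.1 301 Moved Permanently\r\n"
        . "Location: http://example.com/\r\n"
        . "Content-Length: 0\r\n\r\n";

print_r(describe_redirect($sample));
```

If this returns a non-null result for the string that curl_exec() gives you, the server is redirecting and CURLOPT_FOLLOWLOCATION is worth trying.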
SECOND EDIT:
Curiously enough, trying the same code you have above made my webserver hang too. I replaced this code:
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'HEAD'); // HTTP request is 'HEAD'
with this:
curl_setopt($ch, CURLOPT_NOBODY, true);
That is the way the manual recommends making a HEAD request, and with it the request returned instantly.
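Putting the pieces together, the whole request might look roughly like the sketch below. As far as I understand it, CURLOPT_CUSTOMREQUEST only changes the method name that is sent, so cURL may still wait for a response body that a HEAD response never delivers, whereas CURLOPT_NOBODY tells it not to expect one. The function name fetch_head(), the fallback user agent string, the redirect cap and the overall CURLOPT_TIMEOUT are my own assumptions, not part of the original code.

```php
<?php
// Sketch only: fetch_head() is a hypothetical wrapper around the options
// discussed above. Returns the raw response headers, or null on failure.
function fetch_head(string $url, int $timeoutSeconds = 20): ?string
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return output instead of printing it
    curl_setopt($ch, CURLOPT_HEADER, true);          // include headers in the output
    curl_setopt($ch, CURLOPT_NOBODY, true);          // send HEAD and expect no body
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow 301/302 redirects
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);          // assumed cap on redirect hops
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeoutSeconds);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds); // assumed cap on the whole transfer
    // Assumed fallback: HTTP_USER_AGENT is not set when running from the CLI.
    curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT'] ?? 'header-check-sketch');

    $headers = curl_exec($ch); // string on success, false on failure
    // The handle is released when $ch goes out of scope.
    return $headers === false ? null : $headers;
}
```

Calling fetch_head('http://www.arstechnica.com') should then return the header blocks of every hop in the redirect chain, since CURLOPT_HEADER is combined with CURLOPT_FOLLOWLOCATION. Setting CURLOPT_TIMEOUT as well as CURLOPT_CONNECTTIMEOUT means a stalled transfer gives up after the limit instead of hanging for minutes.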