Problem description
I tried to use file_exists(URL/robots.txt) to check whether the file exists on randomly chosen websites, but it always returns false.
How do I check whether the robots.txt file exists?
I don't want to start the download before I check.
Would fopen() do the trick? According to the documentation, it returns a file pointer resource on success, or FALSE on error.
I guess I could write something like:
$f=@fopen($url,"r");
if($f) ...
The output I get, followed by my code:
http://www1.macys.com/robots.txt maybe it's not there
http://www.intend.ro/robots.txt maybe it's not there
http://www.emag.ro/robots.txt maybe it's not there
http://www1.bloomingdales.com/robots.txt maybe it's not there
try {
    // $file holds the remote URL, e.g. http://www.emag.ro/robots.txt
    if (file_exists($file))
    {
        echo 'exists' . PHP_EOL;
        $curl_tool = new CurlTool();
        $content = $curl_tool->fetchContent($file);
        $local_file = CRAWLER_FILES . 'robots_' . $website_id . '.txt';
        // if the file exists on local disk, delete it
        if (file_exists($local_file))
            unlink($local_file);
        echo $local_file, $content . PHP_EOL;
        file_put_contents($local_file, $content);
    }
    else
    {
        echo 'maybe it\'s not there' . PHP_EOL;
    }
} catch (Exception $e) {
    echo 'EXCEPTION ' . $e . PHP_EOL;
}
Recommended answer
file_exists cannot be used on resources on other websites. It is intended for the local filesystem: it relies on stat(), which PHP's http:// stream wrapper does not support, so the call returns false for any HTTP URL. Have a look here on how to perform the check properly.
As others have mentioned in the comments, and as the link says, it is (probably) easiest to use the get_headers function to do this:
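// get_headers() returns an array of header lines, or false on failure.
// It does not throw, so no try block is needed here.
$headers = @get_headers($url);
// $headers[0] is the status line, e.g. "HTTP/1.1 404 Not Found"
if ($headers === false || strpos($headers[0], "404") !== false) {
    // ... the file is not there (or the request failed) ...
} else {
    // ... the file exists, you get the idea ...
}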
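Since the question asks to avoid downloading the file before checking, the get_headers idea can be sketched a little further. By default get_headers sends a GET request; passing a stream context with the HEAD method asks the server for headers only. The helper names finalStatusCode and remoteFileExists below are made up for illustration, the 5-second timeout is an arbitrary assumption, and the context argument of get_headers needs PHP 7.1 or later. Treat this as a rough sketch, not a definitive implementation:

```php
<?php
// Hypothetical helper: extract the last HTTP status code from the array
// returned by get_headers(). When redirects are followed, several status
// lines appear; the last one belongs to the final response.
function finalStatusCode(array $headers): ?int
{
    $code = null;
    foreach ($headers as $key => $line) {
        // Status lines sit under integer keys and look like "HTTP/1.1 200 OK".
        if (is_int($key) && is_string($line)
            && preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $code = (int) $m[1];
        }
    }
    return $code;
}

// Hypothetical helper: true when a HEAD request for $url ends in a 2xx status.
function remoteFileExists(string $url): bool
{
    // Use HEAD instead of GET so the response body is not transferred.
    $context = stream_context_create([
        'http' => ['method' => 'HEAD', 'timeout' => 5], // timeout is an assumed value
    ]);
    $headers = @get_headers($url, false, $context);
    if ($headers === false) {
        return false; // DNS failure, timeout, refused connection, ...
    }
    $code = finalStatusCode($headers);
    return $code !== null && $code >= 200 && $code < 300;
}
```

In the question's code this would replace the first file_exists call, for example if (remoteFileExists($file)) { ... }. Some servers answer HEAD requests with an error even though GET works, so a false result here may still be worth a fallback GET.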