Problem Description
I have a list of URLs that I need to check, to see whether they still work or not. I would like to write a bash script that does that for me.
I only need the returned HTTP status code, i.e. 200, 404, 500 and so forth. Nothing more.
EDIT: Note that there is an issue if the page says "404 Not Found" but returns a 200 OK message. That indicates a misconfigured web server, but you may have to consider this case.
For more on this, see Check if a URL goes to a page containing the text "404".
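A minimal sketch of how one might catch such "soft 404" pages, under the assumption that the error text appears verbatim in the response body (the URL below is a hypothetical placeholder):

#!/bin/bash
# Sketch: flag pages that return 200 OK but whose body looks like a 404 page.
url="http://example.com/some-page"   # hypothetical URL for illustration
body=$(mktemp)
status=$(curl -o "$body" --silent --write-out '%{http_code}' "$url")
if [ "$status" = "200" ] && grep -qi "404 not found" "$body"; then
    echo "$url: returned 200 but the body says not found"
fi
rm -f "$body"

Note that this fetches the full body with a GET rather than a HEAD request, since the body text is what needs inspecting.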
Recommended Answer
Curl has a specific option, --write-out, for this:
$ curl -o /dev/null --silent --head --write-out '%{http_code}\n' <url>
200
-o /dev/null: throws away the usual output
--silent: throws away the progress meter
--head: makes a HEAD HTTP request, instead of GET
--write-out '%{http_code}': prints the required status code
To wrap this up in a complete Bash script:
#!/bin/bash
while read -r LINE; do
    curl -o /dev/null --silent --head --write-out "%{http_code} $LINE\n" "$LINE"
done < url-list.txt
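For example, assuming url-list.txt holds one URL per line and the script is saved as check-urls.sh (a name chosen here for illustration), a run might look like:

$ cat url-list.txt
http://example.com/
http://example.com/no-such-page
$ bash check-urls.sh
200 http://example.com/
404 http://example.com/no-such-page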
(Eagle-eyed readers will notice that this uses one curl process per URL, which imposes fork and TCP connection penalties. It would be faster if multiple URLs were combined in a single curl invocation, but there isn't space to write out the monstrous repetition of options that curl requires to do this.)
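If the per-URL process cost matters, one common workaround (a sketch, not part of the original answer, and assuming GNU xargs) is to keep one curl per URL but run several in parallel:

$ xargs -P 4 -I{} curl -o /dev/null --silent --head --write-out '%{http_code} {}\n' '{}' < url-list.txt

Here -P 4 runs up to four curl processes at once, and -I{} substitutes each input line into the command. Newer curl releases (7.66.0 and later) also provide a built-in --parallel option.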