This article explains how to deal with the wget error "414: Request-URI Too Large"; hopefully it is a useful reference for anyone hitting the same problem.

Problem description

I use wget to access a list of links stored in a text file. An example link is:

http://localhost:8888/data/test.php?value=ABC123456789

The PHP file returns a table of information, and the response is appended to another text file. As for the error, it seems clear that it currently cannot handle this number of URLs, because the request exceeds the character limit. If I use only 2 URLs, it works perfectly fine.

The text file contains a total of 10,000 URLs. The command I am using is:

wget -i /Applications/MAMP/htdocs/data/URLs.txt -O - >> /Applications/MAMP/htdocs/data/append.txt

According to my research, a quick way to "fix" this is to change the LimitRequestLine directive, or to add it if it does not exist. Since I use MAMP (for macOS), what I did was:

open /Applications/MAMP/conf/apache/httpd.conf

and insert, below the AccessFileName .htaccess line:

LimitRequestLine 1000000000
LimitRequestFieldSize 1000000000
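
For reference, the relevant part of httpd.conf would then look roughly like this (an illustrative excerpt; the AccessFileName line already exists in the file, and the values are the ones from above):

AccessFileName .htaccess

# raise the limits on the request line and on each request header field
LimitRequestLine 1000000000
LimitRequestFieldSize 1000000000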

But I still get the same error. I don't know why this happens.

Would it be easier to use cURL? If so, what would the equivalent command be?

Recommended answer

Your 414: Request-URI Too Large error has nothing to do with the number of URLs, and no, using cURL would not help.

The problem is that some of your URLs (or perhaps just one) are simply too long for the target server, which causes the error.

You can probably find the longest URL in the file by running:

cat URLs.txt | awk '{print length, $0}' | sort -nr | head -1

(thanks to https://stackoverflow.com/a/1655488/1067003 for that command)
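
If you want to see every URL above a given length rather than just the longest one, a simple awk filter works too (the 2000-character threshold below is only an illustration; the real limit depends on the target server's configuration):

awk 'length > 2000' /Applications/MAMP/htdocs/data/URLs.txt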

Another possible cause is that the URLs in URLs.txt are not properly line-terminated, so some of them (or all of them?) get concatenated. For the record, the terminating character is "\n", i.e. hex code 0A, not the "\r\n" that most Windows editors use; I'm not sure how wget would handle such malformed line terminators (per its definition).
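
On macOS you can check for Windows-style line endings and, if necessary, strip the carriage returns with something like the following (the URLs_unix.txt output name is just an example):

file /Applications/MAMP/htdocs/data/URLs.txt
tr -d '\r' < /Applications/MAMP/htdocs/data/URLs.txt > /Applications/MAMP/htdocs/data/URLs_unix.txt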

Note that if you are downloading lots of .html files (or any other compressible files), cURL would be much faster than wget, because cURL supports compressed transfers via the --compressed argument (using gzip and deflate at the time of writing), while wget does not support compression at all, and HTML compresses very well (easily 5-6 times smaller than the uncompressed version with gzip).
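
If you do want to try cURL, one rough equivalent of the wget command above is to run it once per URL via xargs (a sketch, assuming one URL per line and no special characters in the URLs):

xargs -n 1 curl -sS --compressed < /Applications/MAMP/htdocs/data/URLs.txt >> /Applications/MAMP/htdocs/data/append.txt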

That concludes this article on the wget error "414: Request-URI Too Large". We hope the recommended answer helps, and thank you for your support!
