Problem description
I am using Scrapy, and I would like to get the URL of the response that is being ignored. All I see in the console output is this:
DEBUG: Ignoring response <999 https://www.mywebsite.com>: HTTP status code is not handled or not allowed.
Recommended answer
According to the Scrapy documentation, you can specify a list of HTTP status codes that your spider should handle even though they are not allowed by default.
In your case, you have to add the following line to your spider definition:
handle_httpstatus_list = [999]
With this setting, responses with this status code are passed to the spider's callback instead of being ignored.
Next time, before asking a question, please search StackOverflow for similar questions and read the docs. It also helps to include some code so we can see where the error occurs. Without this information, the community can seldom give an answer.