This article covers how to get the names and classes of all spiders in a Scrapy project.
Problem description
In older versions of Scrapy we could get the list of spiders (spider names) with the following code, but in the current version (1.4) I get this warning:
[py.warnings] WARNING: run-all-spiders.py:17: ScrapyDeprecationWarning: CrawlerRunner.spiders attribute is renamed to CrawlerRunner.spider_loader.
for spider_name in process.spiders.list():
The old code, which lists all the available spiders in my project, used crawler.spiders.list():
>>> for spider_name in crawler.spiders.list():
...     print(spider_name)
How can I get the list of spiders (and the corresponding class names) in Scrapy?
Solution
I'm using this in my utility script for running spiders:
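from scrapy import spiderloader
from scrapy.utils import project
# Load the settings of the current Scrapy project.
settings = project.get_project_settings()
# Build a spider loader from those settings.
spider_loader = spiderloader.SpiderLoader.from_settings(settings)
# Names of all spiders found in the project.
spiders = spider_loader.list()
# Spider classes, in the same order as the names.
classes = [spider_loader.load(name) for name in spiders]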
In your case, it should suffice to rename spiders to spider_loader, as suggested by the warning message.