This article covers how to get the names of all spider classes in Scrapy. It should serve as a useful reference for anyone facing the same problem; follow along below to learn more.

Problem Description

In the older version we could get the list of spiders (spider names) with the following code, but in the current version (1.4) I get:

[py.warnings] WARNING: run-all-spiders.py:17: ScrapyDeprecationWarning: CrawlerRunner.spiders attribute is renamed to CrawlerRunner.spider_loader.
for spider_name in process.spiders.list():
    # list all the available spiders in my project

Use crawler.spiders.list():

>>> for spider_name in crawler.spiders.list():
...     print(spider_name)

How can I get the list of spiders (and the equivalent class names) in Scrapy?

Solution

I'm using this in my utility script for running spiders:

from scrapy import spiderloader
from scrapy.utils import project

settings = project.get_project_settings()  # the project's Scrapy settings (settings.py)
spider_loader = spiderloader.SpiderLoader.from_settings(settings)
spiders = spider_loader.list()  # spider names, as strings
classes = [spider_loader.load(name) for name in spiders]  # the corresponding spider classes
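
As a usage note, the same loader can feed a CrawlerProcess to run every spider in the project, which is presumably what a script like run-all-spiders.py does. This is only a sketch of that pattern, assuming a standard project layout with SPIDER_MODULES configured; it is not code from the original answer:

from scrapy import spiderloader
from scrapy.crawler import CrawlerProcess
from scrapy.utils import project

settings = project.get_project_settings()
spider_loader = spiderloader.SpiderLoader.from_settings(settings)

process = CrawlerProcess(settings)
for spider_name in spider_loader.list():
    # spider_loader.load() returns the spider class registered under this name
    process.crawl(spider_loader.load(spider_name))

process.start()  # blocks until every scheduled crawl has finished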

In your case, it should suffice to rename spiders to spider_loader, as suggested by the warning message.
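
Concretely, the deprecated loop from the question would then read roughly like this (a sketch; the rest of run-all-spiders.py is not shown in the post, so the surrounding setup is assumed):

from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

process = CrawlerProcess(get_project_settings())

# CrawlerRunner.spiders was renamed to CrawlerRunner.spider_loader
for spider_name in process.spider_loader.list():
    print(spider_name)  # list all the available spiders in my project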

That concludes this article on getting the names of all spider classes in Scrapy. We hope the recommended answer is helpful, and thank you for your continued support!
