Problem description
I have written two spiders in a single file. When I run scrapy runspider two_spiders.py, only the first spider is executed. How can I run both of them without splitting the file into two files?
two_spiders.py:
import scrapy

class MySpider1(scrapy.Spider):
    # first spider definition
    ...

class MySpider2(scrapy.Spider):
    # second spider definition
    ...
Recommended answer
Let's read the documentation:
Running multiple spiders in the same process

By default, Scrapy runs a single spider per process when you run scrapy crawl. However, Scrapy supports running multiple spiders per process using the internal API.
Here is an example that runs multiple spiders simultaneously:
import scrapy
from scrapy.crawler import CrawlerProcess

class MySpider1(scrapy.Spider):
    # Your first spider definition
    ...

class MySpider2(scrapy.Spider):
    # Your second spider definition
    ...

process = CrawlerProcess()
process.crawl(MySpider1)
process.crawl(MySpider2)
process.start()  # the script will block here until all crawling jobs are finished
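To make the pattern above concrete, here is a minimal, self-contained sketch. The spider names, start URLs (example.com / example.org), and parse logic are illustrative assumptions, not taken from the original question:

import scrapy
from scrapy.crawler import CrawlerProcess

class MySpider1(scrapy.Spider):
    name = "spider1"  # hypothetical name, chosen for illustration
    start_urls = ["https://example.com/"]

    def parse(self, response):
        # yield the page title so each crawl produces visible output
        yield {"spider": self.name, "title": response.css("title::text").get()}

class MySpider2(scrapy.Spider):
    name = "spider2"  # hypothetical second spider
    start_urls = ["https://example.org/"]

    def parse(self, response):
        yield {"spider": self.name, "title": response.css("title::text").get()}

process = CrawlerProcess()
process.crawl(MySpider1)
process.crawl(MySpider2)
process.start()  # blocks until both crawls are finished

Note that a file like this is run with plain python two_spiders.py rather than scrapy runspider, because the script starts the crawler process itself.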
(There are a few more examples in the documentation; one of them, running spiders sequentially rather than simultaneously, is sketched below.)
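The sequential variant from the documentation chains deferreds with CrawlerRunner so that one crawl finishes before the next starts. A minimal sketch, assuming MySpider1 and MySpider2 are defined in the same file as above:

from twisted.internet import reactor, defer
from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging

configure_logging()
runner = CrawlerRunner()

@defer.inlineCallbacks
def crawl():
    # run MySpider1 to completion before starting MySpider2
    yield runner.crawl(MySpider1)
    yield runner.crawl(MySpider2)
    reactor.stop()

crawl()
reactor.run()  # the script will block here until the last crawl is finished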
From your question it is not clear how you have put the two spiders into one file. Simply concatenating the contents of two single-spider files is not enough.
Try to do what is written in the documentation, or at least show us your code; without it we can't help you.