Python Scrapy: what is the difference between the "runspider" and "crawl" commands?
Question
Can someone explain the difference between runspider and crawl commands? What are the contexts in which they should be used?
Recommended answer
In the command:
scrapy crawl [options] <spider>
<spider> is the spider's name (the value of the name attribute on the spider class), not the project name stored as BOT_NAME in settings.py. The command has to be run from inside a Scrapy project.
And in the command:
scrapy runspider [options] <spider_file>
<spider_file> is the path to the Python file that contains the spider.
Otherwise, the options are the same:
Options
=======
--help, -h              show this help message and exit
-a NAME=VALUE           set spider argument (may be repeated)
--output=FILE, -o FILE  dump scraped items into FILE (use - for stdout)
--output-format=FORMAT, -t FORMAT
                        format to use for dumping items with -o

Global Options
--------------
--logfile=FILE          log file. if omitted stderr will be used
--loglevel=LEVEL, -L LEVEL
                        log level (default: DEBUG)
--nolog                 disable logging completely
--profile=FILE          write python cProfile stats to FILE
--lsprof=FILE           write lsprof profiling stats to FILE
--pidfile=FILE          write process ID to FILE
--set=NAME=VALUE, -s NAME=VALUE
                        set/override setting (may be repeated)
--pdb                   enable pdb on failure
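To make the two command shapes concrete, here is a minimal sketch that assembles the argument list for either command. The helper name build_scrapy_command and the rule "a target ending in .py is a spider file" are assumptions made for illustration; they are not part of Scrapy. Only the -a and -o options from the listing above are used.

```python
from typing import Dict, List, Optional


def build_scrapy_command(
    target: str,
    spider_args: Optional[Dict[str, str]] = None,
    output: Optional[str] = None,
) -> List[str]:
    """Return an argv list for running a spider (hypothetical helper)."""
    # Assumption: a target ending in ".py" is a spider file (runspider);
    # anything else is treated as a spider name inside a project (crawl).
    subcommand = "runspider" if target.endswith(".py") else "crawl"
    argv = ["scrapy", subcommand]
    # -a NAME=VALUE may be repeated, once per spider argument.
    for name, value in (spider_args or {}).items():
        argv += ["-a", f"{name}={value}"]
    # -o FILE dumps the scraped items into FILE.
    if output is not None:
        argv += ["-o", output]
    # Options come first, then the spider name or spider file.
    argv.append(target)
    return argv


if __name__ == "__main__":
    print(build_scrapy_command("quotes", {"tag": "humor"}, "items.json"))
    print(build_scrapy_command("quotes_spider.py", output="items.json"))
```

The resulting list can be handed to subprocess.run. The crawl form only works when the process is started from inside a Scrapy project, while the runspider form works from any directory.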
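Since runspider doesn't depend on a project (it needs no scrapy.cfg or settings.py, and therefore no BOT_NAME), you might find runspider more flexible for self-contained, single-file scrapers, depending on the way you are customising them. Use crawl when the spider lives in a project and relies on that project's settings, pipelines, and middlewares.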
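Whether crawl is available at all depends on being inside a project. Scrapy recognises a project by a scrapy.cfg file in the current directory or one of its parents. The sketch below mimics that idea with the standard library only; find_scrapy_cfg and suggested_command are hypothetical names, and this is not Scrapy's own implementation.

```python
from pathlib import Path
from typing import Optional


def find_scrapy_cfg(start: str = ".") -> Optional[Path]:
    """Return the nearest scrapy.cfg at or above `start`, or None (hypothetical helper)."""
    current = Path(start).resolve()
    # Check the starting directory first, then walk up through its parents.
    for directory in [current, *current.parents]:
        candidate = directory / "scrapy.cfg"
        if candidate.is_file():
            return candidate
    return None


def suggested_command(start: str = ".") -> str:
    """Suggest 'crawl' inside a project and 'runspider' outside one."""
    return "crawl" if find_scrapy_cfg(start) is not None else "runspider"


if __name__ == "__main__":
    print(suggested_command("."))
```

This is only a rule of thumb: runspider can still be used from inside a project directory when the spider is a standalone file.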