Python Scrapy: what is the difference between the "runspider" and "crawl" commands?

Question

Can someone explain the difference between the runspider and crawl commands? In which contexts should each be used?

Recommended answer

In the command:

scrapy crawl [options] <spider>

<spider> is the name of the spider (the name attribute defined on the spider class), and the command has to be run from inside a Scrapy project, which is where the spider is looked up.

And in the command:

scrapy runspider [options] <spider_file>

<spider_file> is the path to the Python file that contains the spider. No project is required.

Otherwise, the options are the same:

Options
=======
--help, -h              show this help message and exit
-a NAME=VALUE           set spider argument (may be repeated)
--output=FILE, -o FILE  dump scraped items into FILE (use - for stdout)
--output-format=FORMAT, -t FORMAT
                        format to use for dumping items with -o

Global Options
--------------
--logfile=FILE          log file. if omitted stderr will be used
--loglevel=LEVEL, -L LEVEL
                        log level (default: DEBUG)
--nolog                 disable logging completely
--profile=FILE          write python cProfile stats to FILE
--lsprof=FILE           write lsprof profiling stats to FILE
--pidfile=FILE          write process ID to FILE
--set=NAME=VALUE, -s NAME=VALUE
                        set/override setting (may be repeated)
--pdb                   enable pdb on failure
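
To make the shared options concrete, here is a small sketch that assembles both command lines from the same arguments, using only the -a and -o options listed above. The helper name build_scrapy_command, the spider name "quotes", the file name quotes_spider.py and the tag argument are hypothetical and chosen only for illustration.

```python
import shlex


def build_scrapy_command(command, target, spider_args=None, output=None):
    """Return the argv list for `scrapy crawl` or `scrapy runspider`."""
    if command not in ("crawl", "runspider"):
        raise ValueError(f"unsupported command: {command!r}")
    argv = ["scrapy", command, target]
    # -a NAME=VALUE may be repeated, once per spider argument.
    for name, value in (spider_args or {}).items():
        argv += ["-a", f"{name}={value}"]
    # -o FILE dumps the scraped items into FILE.
    if output is not None:
        argv += ["-o", output]
    return argv


# Same options, different target: a spider name versus a file path.
print(shlex.join(build_scrapy_command("crawl", "quotes", {"tag": "humor"}, "items.json")))
print(shlex.join(build_scrapy_command("runspider", "quotes_spider.py", {"tag": "humor"}, "items.json")))
```

The returned list can be passed to subprocess.run if you want to launch the command from a script; only the second and third elements differ between the two commands.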

Since runspider needs only a spider file and does not depend on a Scrapy project or its settings (such as BOT_NAME), you might find runspider more flexible, depending on how you are customising your scrapers.

