Hello Stack Overflow community,
I'm running into the following problem. I have a scrapy project which I have added to my project:
-.idea
-associate
-core
-scrapyproject
-- scrapyproject_one
--- spiders
---- __init__.py
---- dmoz_spider.py
-- __init__.py
-- items.py
-- pipelines.py
-- settings.py
My dmoz_spider.py looks like this:
import scrapy
from scrapyproject.scrapyproject_one import items


class DmozSpider(scrapy.Spider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
    ]

    def parse(self, response):
        for sel in response.xpath('//ul/li'):
            item = items.ScrapyprojectItem()
            item['title'] = sel.xpath('a/text()').extract()
            item['link'] = sel.xpath('a/@href').extract()
            item['desc'] = sel.xpath('text()').extract()
            yield item
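For reference, the item = items.ScrapyprojectItem() line assumes an item class defined in items.py. A minimal sketch of what that file would contain, with the class and field names inferred from the spider code above (the post does not show the actual file):

import scrapy


class ScrapyprojectItem(scrapy.Item):
    # fields matching what the spider assigns in parse()
    title = scrapy.Field()  # link text of the <a> element
    link = scrapy.Field()   # href attribute of the <a> element
    desc = scrapy.Field()   # free text inside the <li>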
But when I navigate to the scrapyproject folder and run
scrapy crawl dmoz
I get the following error:
Traceback (most recent call last):
  File "c:\users\admin\appdata\local\programs\python\python35-32\lib\runpy.py", line 170, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\users\admin\appdata\local\programs\python\python35-32\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python35-32\Scripts\scrapy.exe\__main__.py", line 9, in <module>
  File "c:\users\admin\appdata\local\programs\python\python35-32\lib\site-packages\scrapy\cmdline.py", line 108, in execute
    settings = get_project_settings()
  File "c:\users\admin\appdata\local\programs\python\python35-32\lib\site-packages\scrapy\utils\project.py", line 60, in get_project_settings
    settings.setmodule(settings_module_path, priority='project')
  File "c:\users\admin\appdata\local\programs\python\python35-32\lib\site-packages\scrapy\settings\__init__.py", line 282, in setmodule
    module = import_module(module)
  File "c:\users\admin\appdata\local\programs\python\python35-32\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 986, in _gcd_import
  File "<frozen importlib._bootstrap>", line 969, in _find_and_load
  File "<frozen importlib._bootstrap>", line 944, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 222, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 986, in _gcd_import
  File "<frozen importlib._bootstrap>", line 969, in _find_and_load
  File "<frozen importlib._bootstrap>", line 956, in _find_and_load_unlocked
ImportError: No module named 'scrapyproject'
I'm wondering if anyone knows how I can get this working. Any advice would be much appreciated!
M
Best Answer
Okay, I figured it out.
What I needed to do was mark the "spiderproject" folder as a "Sources" folder in PyCharm.
You can do that by going to File > Settings > Project: [project name] > Project Structure.
Select the top-level folder of your scrapy project (in this case "spiderproject") and click the blue "Sources" folder icon at the top to mark it as Sources.
Then, in your spider, import the items like this:
from spiderproject.items import [whatever you named your item class you defined in items.py ]
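Applied to the layout in the question (where the project is named scrapyproject rather than spiderproject), the top of dmoz_spider.py would then look roughly like the sketch below; ScrapyprojectItem is the class name assumed from the spider code in the question, not confirmed by the post:

import scrapy
# import the item class directly instead of going through the scrapyproject_one package path
from scrapyproject.items import ScrapyprojectItem


class DmozSpider(scrapy.Spider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
    ]

    def parse(self, response):
        for sel in response.xpath('//ul/li'):
            item = ScrapyprojectItem()  # instantiate the imported class directly
            item['title'] = sel.xpath('a/text()').extract()
            item['link'] = sel.xpath('a/@href').extract()
            item['desc'] = sel.xpath('text()').extract()
            yield item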
Hope this helps.
M