Question
I'm working on a codebase that uses spaCy. I installed spaCy using:
sudo pip3 install spacy
and then:
sudo python3 -m spacy download en
At the end of this last command, I got a message:
Linking successful
/home/rayabhik/.local/lib/python3.5/site-packages/en_core_web_sm -->
/home/rayabhik/.local/lib/python3.5/site-packages/spacy/data/en
You can now load the model via spacy.load('en')
Now, when I try running my code, on the line:
from spacy.en import English
it gives me the following error:
ImportError: No module named 'spacy.en'
I've looked on Stack Exchange and the closest question is: Import error with spacy: "No module named en", which does not solve my problem.
Any help would be appreciated. Thanks.
I might have solved this by doing the following:
Python 3.5.2 (default, Sep 14 2017, 22:51:06)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import spacy
>>> spacy.load('en')
<spacy.lang.en.English object at 0x7ff414e1e0b8>
and then using:
from spacy.lang.en import English
I'm still keeping this open in case there are any other answers.
Recommended answer
Yes, I can confirm that your solution is correct. The version of spaCy you downloaded from pip is v2.0, which includes a lot of new features, but also a few changes to the API. One of them is that all language data has been moved to a submodule, spacy.lang, to keep things cleaner and better organised. So instead of using spacy.en, you now import from spacy.lang.en.
- from spacy.en import English
+ from spacy.lang.en import English
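If the same code has to run against both the old and the new layout, one possible approach is to try each module path in turn. The following is only a sketch: the helper name find_attr and the CANDIDATES tuple are made up for illustration and are not part of spaCy.

```python
import importlib

# Candidate module paths for the English class, newest layout first.
# "spacy.lang.en" is the v2.0 location described above; "spacy.en" is the old one.
CANDIDATES = ("spacy.lang.en", "spacy.en")


def find_attr(module_names, attr_name):
    """Return (module_name, attribute) for the first module that imports
    and defines attr_name, or None if no candidate works.

    Hypothetical helper for illustration, not a spaCy function.
    """
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Module path does not exist in this installation; try the next one.
            continue
        if hasattr(module, attr_name):
            return name, getattr(module, attr_name)
    return None


found = find_attr(CANDIDATES, "English")
if found is None:
    print("Could not import English; is spaCy installed?")
else:
    print("English imported from", found[0])
```

For a codebase that only targets v2.0 or later, the plain import from spacy.lang.en is simpler and easier to read.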
However, it's also worth mentioning that what you download when you run spacy download en is not the same as spacy.lang.en. The language data shipped with spaCy includes static data like tokenization rules, stop words or lemmatization tables. The en package that you can download is a shortcut for the statistical model en_core_web_sm. It includes the language data, as well as binary weights that enable spaCy to make predictions for part-of-speech tags, dependencies and named entities.
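To illustrate the difference, here is a minimal sketch that uses only the language data shipped with spaCy, without any downloaded model. The function name tokenize_without_model is made up for this example, and the import is guarded so the snippet still runs where spaCy is not installed.

```python
try:
    # Language class that ships with spaCy itself (v2.0+ layout).
    from spacy.lang.en import English
except ImportError:
    # spaCy is missing, or an older layout is installed.
    English = None


def tokenize_without_model(text):
    """Tokenize text using only the built-in English language data.

    Hypothetical helper for illustration. Returns None if the English
    class could not be imported.
    """
    if English is None:
        return None
    nlp = English()  # tokenizer rules only, no statistical model loaded
    return [token.text for token in nlp(text)]


print(tokenize_without_model("Hello world"))
```

An object created this way can split text into tokens, but it has no tagger, parser or entity recognizer, so predictions like part-of-speech tags or named entities still need a model such as en_core_web_sm.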
Instead of just downloading en, I'd actually recommend using the full model name, which makes it much more obvious what's going on:
python -m spacy download en_core_web_sm
import spacy
nlp = spacy.load("en_core_web_sm")
When you call spacy.load, spaCy does the following:
- Find the installed model named "en_core_web_sm" (a package or shortcut link).
- Read its meta.json and check which language it's using (in this case, spacy.lang.en), and how its processing pipeline should look (in this case, tagger, parser and ner).
- Initialise the language class and add the pipeline to it.
- Load in the binary weights from the model data so pipeline components (like the tagger, parser or entity recognizer) can make predictions.
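The meta.json step in the list above can be sketched with the standard library alone. This is not spaCy's actual loading code: the sample JSON, the field names "lang" and "pipeline", and the helper describe_model are assumptions made for illustration, based on the description above.

```python
import json

# Hypothetical, trimmed-down meta.json content for illustration only.
SAMPLE_META = '{"lang": "en", "name": "core_web_sm", "pipeline": ["tagger", "parser", "ner"]}'


def describe_model(meta_text):
    """Return the language module path and pipeline names found in a
    meta.json string.

    Illustrative helper, not part of spaCy. Assumes the keys "lang"
    and "pipeline".
    """
    meta = json.loads(meta_text)
    lang_module = "spacy.lang." + meta["lang"]  # e.g. "spacy.lang.en"
    pipeline = list(meta.get("pipeline", []))   # component names, in order
    return lang_module, pipeline


lang_module, pipeline = describe_model(SAMPLE_META)
print("Language class comes from:", lang_module)
print("Pipeline components:", pipeline)
```

The real spacy.load goes further than this: it creates the language class, attaches each pipeline component, and loads the binary weights from the model directory.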
See this section in the docs for more details.