Problem description
I'm trying to get NLTK and WordNet working on Heroku. I've already run:
heroku run python
nltk.download()
wordnet
pip install -r requirements.txt
But I get this error:
Resource 'corpora/wordnet' not found. Please use the NLTK
Downloader to obtain the resource: >>> nltk.download()
Searched in:
- '/app/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
Yet I've looked in /app/nltk_data and it's there, so I'm not sure what's going on.
I just had this same problem. What ended up working for me was creating an nltk_data directory inside the application's folder itself, downloading the corpus into that directory, and adding a line to my code that tells nltk to look there. A likely reason the download done through heroku run doesn't stick is that it happens in a one-off dyno, whose filesystem changes are discarded when the session ends. You can do all of this locally and then push the changes to Heroku.
So, supposing my Python application is in a directory called "myapp/":
Step 1: Create the directory
cd myapp/
mkdir nltk_data
Step 2: Download Corpus to New Directory
python -m nltk.downloader
This will pop up the nltk downloader. Set your Download Directory to whatever_the_absolute_path_to_myapp_is/nltk_data/. If you're using the GUI downloader, the download directory is set through a text field at the bottom of the UI. If you're using the command-line one, you set it in the config menu.
Once the downloader points to your newly created nltk_data directory, download your corpus.
Or in one step from Python code:
import nltk
nltk.download("wordnet", download_dir="whatever_the_absolute_path_to_myapp_is/nltk_data/")
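If you'd rather script this step than click through the downloader, something like the following might do. It is only a sketch: the helper names has_wordnet and download_wordnet are made up for illustration, and it assumes the corpus ends up under corpora/wordnet (or corpora/wordnet.zip) inside the download directory, which is what the error message's 'corpora/wordnet' suggests.

```python
from pathlib import Path


def has_wordnet(data_dir):
    """Return True if data_dir holds WordNet, unpacked or still zipped."""
    corpora = Path(data_dir) / "corpora"
    return (corpora / "wordnet").is_dir() or (corpora / "wordnet.zip").is_file()


def download_wordnet(data_dir):
    """Download WordNet into data_dir unless it is already there."""
    if has_wordnet(data_dir):
        return True
    # Imported here so the on-disk check above works even without nltk installed.
    import nltk

    Path(data_dir).mkdir(parents=True, exist_ok=True)
    nltk.download("wordnet", download_dir=str(data_dir))
    # Re-check the directory rather than trusting the call blindly.
    return has_wordnet(data_dir)
```

Run download_wordnet("nltk_data") once from inside myapp/ on your own machine, and check that it returns True before committing the directory.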
Step 3: Let nltk Know Where to Look
nltk looks for data, resources, etc. in the locations listed in the nltk.data.path variable. All you need to do is add nltk.data.path.append('./nltk_data/') to the Python file that actually uses nltk, and it will look for corpora, tokenizers, and such in there in addition to the default paths.
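A relative path like './nltk_data/' is resolved against whatever directory the process was started from, so one way to make this step less fragile is to build the path from the location of the source file instead. The helpers below are a hypothetical sketch (app_data_dir and register_data_dir are not part of nltk); they take the search list as an argument, so in real use you would pass nltk.data.path.

```python
from pathlib import Path


def app_data_dir(anchor_file, dirname="nltk_data"):
    """Absolute path of `dirname` located next to `anchor_file`."""
    return Path(anchor_file).resolve().parent / dirname


def register_data_dir(search_path, anchor_file, dirname="nltk_data"):
    """Append the app-local data directory to `search_path` once; return it."""
    data_dir = str(app_data_dir(anchor_file, dirname))
    # Avoid adding the same entry again if the module is imported twice.
    if data_dir not in search_path:
        search_path.append(data_dir)
    return data_dir
```

In the file that uses nltk, that would be: import nltk, then register_data_dir(nltk.data.path, __file__). This assumes nltk_data sits in the same directory as that file; adjust the anchor if your layout differs.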
Step 4: Send it to Heroku
git add nltk_data/
git commit -m 'super useful commit message'
git push heroku master
That should work! It did for me, anyway. One thing worth noting: a relative path such as './nltk_data/' is resolved against the process's current working directory, so the right value depends on how you've structured your application and where it is started from. Account for that when you call nltk.data.path.append('path_to_nltk_data').