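A Jupyter notebook cannot import dirichlet_likelihood.py from lda2vec.
This .py file exists in the current lda2vec GitHub repository.
I installed the module, opened the notebook, and tried to run it. I suspect the cause is something simple on my side.
The notebook is: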
https://github.com/cemoody/lda2vec/blob/master/examples/twenty_newsgroups/lda2vec/lda2vec.ipynb
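When I try the same import at the Python command line (in the same environment), it does not give the error shown below; instead it needed keras to be installed. On the command line it says that it cannot import preprocess.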
uname -a
Linux ubuntu 4.18.0-15-generic #16~18.04.1-Ubuntu SMP Thu Feb 7 14:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
sudo apt-get install python3-venv
python3.6 -m venv .env
source .env/bin/activate
pip install --upgrade pip
pip install jupyter
pip install lda2vec
from lda2vec import preprocess, Corpus
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-4-2b87256bea6b> in <module>
----> 1 from lda2vec import preprocess, Corpus
2 import matplotlib.pyplot as plt
3 import numpy as np
4 get_ipython().run_line_magic('matplotlib', 'inline')
5
~/.env/lib/python3.6/site-packages/lda2vec/__init__.py in <module>
----> 1 import lda2vec.dirichlet_likelihood as dirichlet_likelihood
2 import lda2vec.embedding_mixture as embedding_mixture
3 from lda2vec.Lda2vec import Lda2vec as model
4 import lda2vec.word_embedding as word_embedding
5 import lda2vec.nlppipe as nlppipe
AttributeError: module 'lda2vec' has no attribute 'dirichlet_likelihood'
python
from lda2vec import preprocess, Corpus
Using TensorFlow backend.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: cannot import name 'preprocess'
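Both tracebacks above are consistent with the pip-installed lda2vec package not containing the same modules as the GitHub repository (the installed `__init__.py` imports `Lda2vec` and `nlppipe`, not `preprocess` or `Corpus`). A minimal sketch for listing what is actually installed, without importing the package (which is what fails), might look like the following. The helper name `installed_submodules` is made up for illustration, and the sketch assumes Python 3.6 or later.

```python
import importlib.util
import pkgutil


def installed_submodules(package_name):
    """Return the sorted submodule names of an installed package.

    The package itself is located but not imported, so this still works
    when the package's __init__.py raises on import.
    """
    spec = importlib.util.find_spec(package_name)
    # find_spec returns None for a missing package; plain modules have
    # no submodule_search_locations.
    if spec is None or spec.submodule_search_locations is None:
        return []
    return sorted(
        info.name for info in pkgutil.iter_modules(spec.submodule_search_locations)
    )


# Prints an empty list if lda2vec is not installed in this environment.
print(installed_submodules("lda2vec"))
```

Comparing the printed names with the files in the repository's lda2vec directory shows whether `preprocess` and `corpus` were installed at all.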
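Edit
I got it working by doing the following:
- Downloaded ubuntu-18.04.2-desktop-amd64.iso
- Changed the BIOS virtualization setting for Hyper-V
- Created a VM in VMware
- Increased the memory to 3 GB
- Gave it a 40 GB disk
Then, in a terminal: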
sudo apt install python2.7
sudo apt install python-pip
pip install virtualenv
mkdir 2.7env
cd 2.7env
python2.7 -m venv .env
python2.7 -m virtualenv .env
source .env/bin/activate
pip install --upgrade pip
pip install jupyter
pip install -U spacy
python -m spacy download en
pip install wheel nltk gensim pyLDAvis lda2vec
sudo apt install git
git clone https://github.com/cemoody/lda2vec.git
cp ~/lda2vec/build/lib.linux-x86_64-2.7/lda2vec/corpus.py ~/2.7env/.env/lib/python2.7/site-packages/lda2vec/Corpus.py
cp ~/lda2vec/build/lib.linux-x86_64-2.7/lda2vec/preprocess.py ~/2.7env/.env/lib/python2.7/site-packages/lda2vec/preprocess.py
python -m pip install ipykernel
python -m ipykernel install --user
python lda2vec/examples/twenty_newsgroups/lda2vec/lda2vec_run.py
cd lda2vec/examples/twenty_newsgroups/lda2vec/
jupyter notebook
Change the kernel to Python 2
Open lda2vec.ipynb in Firefox
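One way to confirm that the notebook is really running on the Python 2.7 virtualenv kernel, rather than on some other interpreter, is to print the interpreter details in a notebook cell and compare them with the same output from the terminal. This is a minimal sketch using only the standard library; the helper name `interpreter_summary` is made up for illustration.

```python
from __future__ import print_function  # keeps print() working on Python 2.7

import sys


def interpreter_summary():
    """Collect the details that identify which Python is running."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "version": "%d.%d.%d" % tuple(sys.version_info[:3]),
    }


for key, value in sorted(interpreter_summary().items()):
    print(key, "=", value)
```

If the executable path shown in the notebook does not point inside the .env directory, the notebook is using a different kernel from the one registered with ipykernel.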
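Following on from the above, I have now been trying to get it to recreate the twenty newsgroups npz file so that I can eventually feed in my own content. In case someone understands this better: I suspect it may be a problem with running the script on a low-memory VM, but the error is reported as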
(.env) craig@ubuntu:~/whcjimmy/lda2vec/examples/twenty_newsgroups/data$ python preprocess.py
Traceback (most recent call last):
File "preprocess.py", line 31, in <module>
n_threads=4)
File "/home/craig/whcjimmy/.env/lib/python3.6/site-packages/lda2vec-0.1-py3.6.egg/lda2vec/preprocess.py", line 104, in tokenize
vocab = {v: nlp.vocab[v].lower_ for v in uniques if v != skip}
File "/home/craig/whcjimmy/.env/lib/python3.6/site-packages/lda2vec-0.1-py3.6.egg/lda2vec/preprocess.py", line 104, in <dictcomp>
vocab = {v: nlp.vocab[v].lower_ for v in uniques if v != skip}
File "vocab.pyx", line 242, in spacy.vocab.Vocab.__getitem__
File "lexeme.pyx", line 44, in spacy.lexeme.Lexeme.__init__
File "vocab.pyx", line 157, in spacy.vocab.Vocab.get_by_orth
File "strings.pyx", line 138, in spacy.strings.StringStore.__getitem__
KeyError: "[E018] Can't retrieve string for hash '9243420536193520'. This usually refers to an issue with the `Vocab` or `StringStore`."
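Best answer
OK, I have this working. The problems were:
- Choosing the right Python to run a four-year-old git project: Python 2.7.
- Checking that the installed module has the code that is in the git repo
- Working through the problems in a Python terminal
- Copying and editing the Python files to move to Python 3
One of the problems above was a changed API in a dependency: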
ImportError: No module named 'spacy.en'
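This error is what an import of the old `spacy.en` module produces on a newer spaCy, where the English language class lives under `spacy.lang.en`. When editing the lda2vec files for Python 3, one hedged way to cope with a moved module is to try the candidate locations in order. The helper below is a sketch, not part of lda2vec or spaCy, and its name `first_importable` is hypothetical.

```python
import importlib


def first_importable(module_names):
    """Return the first module in module_names that imports, else None."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Covers ModuleNotFoundError on Python 3 as well.
            continue
    return None
```

For example, `first_importable(["spacy.lang.en", "spacy.en"])` would return whichever module the installed spaCy provides, or `None` if spaCy is not installed.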
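The original problem was probably caused by something about git or Python that I am not familiar with. The git project still fails all of its self-tests, and the build fails.
But my Jupyter notebook is running and producing convincing output.
Regarding "python - missing attribute in lda2vec module in a notebook", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57882904/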