Problem description
I get this error in Python when trying to compute LDA for a smaller corpus, but the same code works fine in other cases.
The size of the corpus is 15. I tried setting the number of topics to 5, then reduced it to 2, but it still gives the same error: ValueError: cannot compute LDA over an empty collection (no terms)
The error occurs on this line: lda = models.LdaModel(corpus, num_topics=topic_number, id2word=dictionary, passes=passes)
where the corpus is corpus = [dictionary.doc2bow(text) for a, id, text, s_date, e_date, qd, qd_perc in texts]
Why is it giving no terms?
Recommended answer
Finally figured it out. The issue with small documents is that if you filter the extremes from the dictionary, you might end up with empty lists in the corpus built by corpus = [dictionary.doc2bow(text)].
So the parameter values in dictionary.filter_extremes(no_below=2, no_above=0.1) need to be chosen carefully, with the corpus size in mind, before running corpus = [dictionary.doc2bow(text)].
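To see why these particular values are a problem, here is a small sketch of the arithmetic, assuming the corpus size of 15 means 15 documents. It is a simplified pure-Python imitation of the document-frequency test that filter_extremes applies, not gensim's actual code. The function name surviving_tokens and the sample documents are made up for illustration.

```python
from collections import Counter

def surviving_tokens(docs, no_below, no_above):
    """Return tokens whose document frequency passes both thresholds.

    Simplified stand-in for Dictionary.filter_extremes (assumption: a token
    is kept when no_below <= df <= no_above * number_of_documents).
    """
    num_docs = len(docs)
    # Document frequency: in how many documents each token appears.
    df = Counter(token for doc in docs for token in set(doc))
    upper = no_above * num_docs
    return sorted(t for t, n in df.items() if no_below <= n <= upper)

# 15 tiny made-up documents: "lda" is in all of them, "topic" in 3, the rest are unique.
docs = [["lda", "word%d" % i] for i in range(15)]
for doc in docs[:3]:
    doc.append("topic")

# Upper bound is 0.1 * 15 = 1.5 documents, lower bound is 2: no token can pass both.
strict = surviving_tokens(docs, no_below=2, no_above=0.1)
# A looser upper bound (0.5 * 15 = 7.5) lets "topic" (df = 3) through.
loose = surviving_tokens(docs, no_below=2, no_above=0.5)
print(strict, loose)
```

With 15 documents, no_below=2 and no_above=0.1 contradict each other, so under this reading the dictionary ends up with no terms at all, which would match the "(no terms)" part of the error message.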
I just removed the filter_extremes call and the LDA model runs fine now, though I will change the parameter values in filter_extremes and use it again later.
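If the filter is put back later, a cheap sanity check before training can catch the problem early. The sketch below works on a plain list in the doc2bow format (lists of (token_id, count) pairs). The helper name non_empty_docs and the sample corpus are hypothetical, and nothing here calls gensim itself.

```python
def non_empty_docs(corpus):
    """Keep only bag-of-words documents that contain at least one (id, count) pair."""
    return [bow for bow in corpus if bow]

# Made-up bag-of-words corpus in doc2bow format; two documents lost all their terms.
corpus = [[(0, 1), (3, 2)], [], [(1, 1)], []]
usable = non_empty_docs(corpus)

if not usable:
    # Nothing left to train on: the filter thresholds were too aggressive.
    raise ValueError("every document is empty; relax the filter_extremes thresholds")
print(len(usable), "of", len(corpus), "documents still have terms")
```

With gensim, the same idea would probably be to check that len(dictionary) is greater than zero right after filtering and to pass only the non-empty documents to models.LdaModel.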