


I get this error in Python when trying to compute LDA on a small corpus, although the same code works fine in other cases.


The corpus size is 15. I tried setting the number of topics to 5 and then reduced it to 2, but it still gives the same error: ValueError: cannot compute LDA over an empty collection (no terms)

The error occurs on this line: lda = models.LdaModel(corpus, num_topics=topic_number, id2word=dictionary, passes=passes)

The corpus is built as corpus = [dictionary.doc2bow(text) for a, id, text, s_date, e_date, qd, qd_perc in texts]




Finally figured it out. The issue with small documents is that if you filter the extremes from the dictionary, you might end up with empty lists in the corpus built by corpus = [dictionary.doc2bow(text)].
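
A quick way to confirm this is to count how many documents come out of doc2bow empty. The snippet below is only a sketch: split_empty_documents is a hypothetical helper, and the sample corpus is made up to have the same shape doc2bow returns (a list of (token_id, count) pairs per document).

```python
def split_empty_documents(corpus):
    """Return (non_empty_docs, number_of_empty_docs).

    `corpus` is assumed to be a list of bag-of-words documents, each a
    list of (token_id, count) pairs, as produced by dictionary.doc2bow.
    """
    kept = [bow for bow in corpus if bow]  # an empty list is falsy
    n_empty = len(corpus) - len(kept)
    return kept, n_empty


# Made-up corpus standing in for the result of aggressive filtering.
corpus = [[(0, 1), (3, 2)], [], [(1, 1)], []]

kept, n_empty = split_empty_documents(corpus)
print(f"{n_empty} of {len(corpus)} documents are empty")

if not kept:
    # Nothing left to train on: relax the dictionary filter first.
    raise ValueError("every document is empty after filtering")
```

If every document turns out to be empty, the dictionary filter has removed the whole vocabulary and should be relaxed before training.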


So the parameter values in dictionary.filter_extremes(no_below=2, no_above=0.1) need to be chosen carefully, with the corpus size in mind, before building corpus = [dictionary.doc2bow(text)].
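
To see why these particular values fail on a tiny corpus, it helps to work out the arithmetic. The sketch below assumes that "corpus size 15" means 15 documents and that filter_extremes keeps a token only if it appears in at least no_below documents and in no more than a no_above fraction of all documents. document_frequency_window is a hypothetical helper, not part of gensim.

```python
import math


def document_frequency_window(num_docs, no_below, no_above):
    """Return the (lowest, highest) document frequency a token may have.

    Assumes a token survives only when
        no_below <= df <= no_above * num_docs.
    Document frequencies are whole numbers, so the upper bound is floored.
    """
    lowest = no_below
    highest = math.floor(no_above * num_docs)
    return lowest, highest


def window_is_empty(num_docs, no_below, no_above):
    """True when no document frequency can satisfy both limits."""
    lowest, highest = document_frequency_window(num_docs, no_below, no_above)
    return lowest > highest


# Values from the question: 10% of 15 documents is 1.5, so a token may
# appear in at most 1 document, yet no_below demands at least 2.
print(document_frequency_window(15, 2, 0.1))
print(window_is_empty(15, 2, 0.1))

# A looser upper limit leaves room for some tokens to survive.
print(document_frequency_window(15, 2, 0.5))
print(window_is_empty(15, 2, 0.5))
```

Under these assumptions, no_below=2 with no_above=0.1 cannot keep any token until the corpus has at least 20 documents, which would explain why the same code works on larger corpora.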


For now I just removed the filter_extremes call, and the LDA model runs fine. I will adjust the parameter values and add the filter back later.


09-18 09:31