ValueError: cannot compute LDA over an empty collection (no terms)

Problem description

I get this error in Python when trying to compute LDA for a small corpus, although the same code works fine in other cases.

The corpus contains 15 documents. I first set the number of topics to 5 and then reduced it to 2, but I still get the same error: ValueError: cannot compute LDA over an empty collection (no terms)

The error is raised at this line: lda = models.LdaModel(corpus, num_topics=topic_number, id2word=dictionary, passes=passes)

where the corpus is built as corpus = [dictionary.doc2bow(text) for a, id, text, s_date, e_date, qd, qd_perc in texts]

Why does it report that there are no terms?

Recommended answer

Finally figured it out. The issue with a small set of documents is that if you filter the extremes from the dictionary, you can end up with empty lists in the corpus built by corpus = [dictionary.doc2bow(text)].
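
One way to confirm this diagnosis is to inspect the inputs just before training. The sketch below is only an illustration: summarize_lda_inputs is a hypothetical helper, a plain dict stands in for the dictionary, and the corpus is assumed to be a list of lists of (token_id, count) pairs, which is the shape doc2bow returns. A vocabulary size of 0 would be consistent with the "no terms" wording in the error message.

```python
def summarize_lda_inputs(vocabulary, corpus):
    """Return (vocabulary size, number of documents with no terms).

    `vocabulary` is anything that supports len(); `corpus` is a list of
    bag-of-words documents, each a list of (token_id, count) pairs.
    """
    empty_documents = sum(1 for bow in corpus if len(bow) == 0)
    return len(vocabulary), empty_documents


# Hypothetical data: what the inputs might look like after an overly
# aggressive filter has removed every token.
filtered_vocabulary = {}
filtered_corpus = [[], [], []]

# Hypothetical data: a usable vocabulary with one empty document.
usable_vocabulary = {"topic": 0, "model": 1, "corpus": 2}
usable_corpus = [[(0, 1), (2, 3)], [], [(1, 2)]]

print(summarize_lda_inputs(filtered_vocabulary, filtered_corpus))  # (0, 3)
print(summarize_lda_inputs(usable_vocabulary, usable_corpus))      # (3, 1)
```

If the first number is 0, training cannot work no matter how many topics are requested, which matches the observation that lowering the topic count from 5 to 2 did not help.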

So the parameter values in dictionary.filter_extremes(no_below=2, no_above=0.1) need to be chosen carefully, to suit the size of the corpus, before building corpus = [dictionary.doc2bow(text)].
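
To see why these particular values are a problem for 15 documents, it helps to work out the arithmetic. The sketch below is not gensim code. It assumes that no_below is a minimum number of documents, that no_above is a maximum fraction of documents, and that the fraction is turned into an absolute count by truncation; the helper names are hypothetical. Under those assumptions, a token must appear in at least 2 documents but in at most 1, so nothing can survive.

```python
def surviving_df_range(num_docs, no_below, no_above):
    """Return (min_df, max_df) bounds on a token's document frequency.

    Assumption: the fractional no_above threshold is converted to an
    absolute document count by truncating no_above * num_docs.
    """
    max_df = int(no_above * num_docs)
    return no_below, max_df


def filter_removes_everything(num_docs, no_below, no_above):
    """True when no document frequency can satisfy both thresholds."""
    min_df, max_df = surviving_df_range(num_docs, no_below, no_above)
    return min_df > max_df


# The values from the question: 15 documents, no_below=2, no_above=0.1.
print(surviving_df_range(15, 2, 0.1))         # (2, 1)
print(filter_removes_everything(15, 2, 0.1))  # True

# A looser upper bound leaves room for tokens to survive.
print(surviving_df_range(15, 2, 0.5))         # (2, 7)
print(filter_removes_everything(15, 2, 0.5))  # False
```

Even without the truncation assumption, a token found in 2 of 15 documents appears in about 13% of them, which is already above the 10% limit, so the two thresholds contradict each other on a corpus this small.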

I just removed the filter_extremes call, and the LDA model runs fine now. I will adjust the parameter values in filter_extremes and use it again later.

