Question
I am a typical, everyday R user. In R, the lda package has a very helpful function, lda.collapsed.gibbs.sampler, which uses a collapsed Gibbs sampler to fit a latent Dirichlet allocation (LDA) model and returns point estimates of the latent parameters using the state at the last iteration of Gibbs sampling.
This function also has a great parameter, compute.log.likelihood, which, when set to TRUE, causes the sampler to compute the log-likelihood of the words (to within a constant factor) after each sweep over the variables. This is useful for assessing convergence and for comparing different LDA models (computed for different numbers of topics).
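To make the convergence use case concrete, here is a minimal Python sketch of a stopping check on a per-sweep log-likelihood trace. It assumes the trace has already been exported as a plain list of numbers; the function name has_converged, the window and rel_tol parameters, and the sample values are all hypothetical and are not part of the lda package or of vowpal_wabbit.

```python
from typing import Sequence


def has_converged(log_likelihoods: Sequence[float],
                  window: int = 5,
                  rel_tol: float = 1e-4) -> bool:
    """Return True when each of the last `window` sweeps changed the
    log-likelihood by less than `rel_tol` in relative terms."""
    # Not enough sweeps yet to fill the window.
    if len(log_likelihoods) < window + 1:
        return False
    recent = log_likelihoods[-(window + 1):]
    for prev, curr in zip(recent, recent[1:]):
        denom = max(abs(prev), 1e-12)  # guard against division by zero
        if abs(curr - prev) / denom >= rel_tol:
            return False
    return True


# Hypothetical trace: one log-likelihood value per sweep (made-up numbers).
trace = [-5200.0, -4100.0, -3650.0, -3510.0, -3502.0,
         -3501.9, -3501.85, -3501.84, -3501.84, -3501.84]
print(has_converged(trace))
```

The same kind of trace, recorded once per candidate number of topics, is what one would compare when choosing between models. Note that the values are only defined up to a constant factor, so differences between runs matter more than absolute numbers.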
Is there a similar option in vowpal_wabbit's LDA model?
Recommended answer
Running vw -h --lda 1 prints the help with the parameters listed below. The metrics parameter is off by default. It is used to compute the topic coherence, which is implemented here. Try enabling this functionality by passing --metrics 1.
Latent Dirichlet Allocation:
  --lda arg                            Run lda with <int> topics
  --lda_alpha arg (=0.100000001)       Prior on sparsity of per-document topic weights
  --lda_rho arg (=0.100000001)         Prior on sparsity of topic distributions
  --lda_D arg (=10000)                 Number of documents
  --lda_epsilon arg (=0.00100000005)   Loop convergence threshold
  --minibatch arg (=1)                 Minibatch size, for LDA
  --math-mode arg (=0)                 Math mode: simd, accuracy, fast-approx
  --metrics arg (=0)                   Compute metrics
Alternatively, go straight to the vw source code.
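As a rough sketch of how the flags from the help output might be combined, the Python helper below only assembles a vw command line with --metrics 1 switched on. It does not run vw. The helper name build_vw_lda_command, the file name docs.vw, and the topic and document counts are hypothetical. The -d flag for the input file is an addition not shown in the help excerpt; the remaining flags are taken from it.

```python
from typing import List


def build_vw_lda_command(data_path: str,
                         num_topics: int,
                         num_documents: int,
                         alpha: float = 0.1,
                         rho: float = 0.1,
                         minibatch: int = 1,
                         metrics: bool = True) -> List[str]:
    """Assemble (but do not execute) a vw LDA command line."""
    return [
        "vw",
        "-d", data_path,                    # input file (hypothetical path)
        "--lda", str(num_topics),           # number of topics
        "--lda_alpha", str(alpha),          # prior on per-document topic weights
        "--lda_rho", str(rho),              # prior on topic distributions
        "--lda_D", str(num_documents),      # number of documents
        "--minibatch", str(minibatch),      # minibatch size, for LDA
        "--metrics", "1" if metrics else "0",  # off by default
    ]


# Hypothetical values, for illustration only.
cmd = build_vw_lda_command("docs.vw", num_topics=20, num_documents=10000)
print(" ".join(cmd))
```

If vw is installed and on the PATH, the resulting list could be passed to subprocess.run(cmd, check=True). What --metrics actually reports is best confirmed against the vw source, since the help text only says "Compute metrics".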
A helpful presentation showcasing most of the parameters can be found here.