python - 使用`LKJCorr`优先级在PyMC3中修改的BPMF:使用`NUTS`的PositiveDefiniteError

以前在中实现了原始的AA>模型。Bayesian Probabilistic Matrix Factorization (BPMF)用于参考、数据源和问题设置。根据@twiecki对这个问题的回答，我已经实现了模型的一个变体，使用了相关矩阵的pymc3priors和标准差的均匀priors。在原始模型中，协方差矩阵是从威舍特分布中提取的，但由于目前的局限性，不能正确地对威舍特分布进行采样。See my previous question对于一个松散相关的问题，为“CC”先验的选择提供了简洁的解释。新型号如下。

import pymc3 as pm
import numpy as np
import theano.tensor as t


n, m = train.shape
dim = 10  # dimensionality
beta_0 = 1  # scaling factor for lambdas; unclear on its use
alpha = 2  # fixed precision for likelihood function
std = .05  # how much noise to use for model initialization

# We will use separate priors for sigma and correlation matrix.
# In order to convert the upper triangular correlation values to a
# complete correlation matrix, we need to construct an index matrix:
n_elem = dim * (dim - 1) / 2
tri_index = np.zeros([dim, dim], dtype=int)
tri_index[np.triu_indices(dim, k=1)] = np.arange(n_elem)
tri_index[np.triu_indices(dim, k=1)[::-1]] = np.arange(n_elem)

logging.info('building the BPMF model')
with pm.Model() as bpmf:
    # Specify user feature matrix
    sigma_u = pm.Uniform('sigma_u', shape=dim)
    corr_triangle_u = pm.LKJCorr(
        'corr_u', n=1, p=dim,
        testval=np.random.randn(n_elem) * std)

    corr_matrix_u = corr_triangle_u[tri_index]
    corr_matrix_u = t.fill_diagonal(corr_matrix_u, 1)
    cov_matrix_u = t.diag(sigma_u).dot(corr_matrix_u.dot(t.diag(sigma_u)))
    lambda_u = t.nlinalg.matrix_inverse(cov_matrix_u)

    mu_u = pm.Normal(
        'mu_u', mu=0, tau=beta_0 * lambda_u, shape=dim,
         testval=np.random.randn(dim) * std)
    U = pm.MvNormal(
        'U', mu=mu_u, tau=lambda_u,
        shape=(n, dim), testval=np.random.randn(n, dim) * std)

    # Specify item feature matrix
    sigma_v = pm.Uniform('sigma_v', shape=dim)
    corr_triangle_v = pm.LKJCorr(
        'corr_v', n=1, p=dim,
        testval=np.random.randn(n_elem) * std)

    corr_matrix_v = corr_triangle_v[tri_index]
    corr_matrix_v = t.fill_diagonal(corr_matrix_v, 1)
    cov_matrix_v = t.diag(sigma_v).dot(corr_matrix_v.dot(t.diag(sigma_v)))
    lambda_v = t.nlinalg.matrix_inverse(cov_matrix_v)

    mu_v = pm.Normal(
        'mu_v', mu=0, tau=beta_0 * lambda_v, shape=dim,
         testval=np.random.randn(dim) * std)
    V = pm.MvNormal(
        'V', mu=mu_v, tau=lambda_v,
        testval=np.random.randn(m, dim) * std)

    # Specify rating likelihood function
    R = pm.Normal(
        'R', mu=t.dot(U, V.T), tau=alpha * np.ones((n, m)),
        observed=train)

# `start` is the start dictionary obtained from running find_MAP for PMF.
# See the previous post for PMF code.
for key in bpmf.test_point:
    if key not in start:
        start[key] = bpmf.test_point[key]

with bpmf:
    step = pm.NUTS(scaling=start)

这个重新实施的目标是产生一个模型，可以使用“cc>采样器来估计。不幸的是，我仍然在最后一行得到同样的错误：

PositiveDefiniteError: Scaling is not positive definite. Simple check failed. Diagonal contains negatives. Check indexes [   0    1    2    3    ...   1030 1031 1032 1033 1034   ]

我已经在This answer中为pmf、bpmf和这个修改过的bpmf编写了所有代码，使复制错误变得简单。你所需要做的就是下载这些数据（也可以在gist中引用）。

最佳答案

看起来你正在将完整的精度矩阵传递到正态分布中：

mu_u = pm.Normal(
    'mu_u', mu=0, tau=beta_0 * lambda_u, shape=dim,
     testval=np.random.randn(dim) * std)

我假设您只想传递对角线值：

mu_u = pm.Normal(
    'mu_u', mu=0, tau=beta_0 * t.diag(lambda_u), shape=dim,
     testval=np.random.randn(dim) * std)

是否更改为mu_u并为您修复？