TensorFlow 2 with a custom loss function raises InvalidArgumentError although everything seems correct


Question

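I am currently using TensorFlow 2 to train models that provide not only point forecasts for time series but also measures of the forecast distribution (for example, mean and variance). To do this, I created a layer and modified the loss function so that the corresponding parameters are optimized. For the one-dimensional case with a single predicted time series, this approach works very well.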

For the case with two time series, I wanted to predict the correlation as well, and used the function "MultivariateNormalFullCovariance" from "tensorflow_probability". However, I get the following error:

InvalidArgumentError:  Input matrix must be square.
     [[node negative_normdist_loss_2/MultivariateNormalFullCovariance/init/Cholesky (defined at d:\20_programming\python\virtualenvs\tensorflow-gpu-2\lib\site-packages\tensorflow_probability\python\distributions\mvn_full_covariance.py:194) ]] [Op:__inference_train_function_1133]

Errors may have originated from an input operation.
Input Source operations connected to node negative_normdist_loss_2/MultivariateNormalFullCovariance/init/Cholesky:
 negative_normdist_loss_2/MultivariateNormalFullCovariance/init/covariance_matrix (defined at d:\20_programming\python\virtualenvs\tensorflow-gpu-2\lib\site-packages\tensorflow_probability\python\distributions\mvn_full_covariance.py:181)

Function call stack:
train_function

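I understand that something is wrong with the input dimensions, but unfortunately I cannot find the actual mistake. (The covariance matrix is already square, even though it contains the same parameter twice.)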

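The code itself is fairly long. I have therefore uploaded a working example (univariate) and a non-working example (multivariate), including sample data, to this folder: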

https://drive.google.com/drive/folders/1IIAtKDB8paWV0aFVFALDUAiZTCqa5fAN?usp=sharing

For a better overview, I have also copied the essential routines below:

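# Imports the excerpt relies on (not shown in the original excerpt)
import tensorflow as tf
import tensorflow_probability as tfp
from tensorflow.keras.layers import Input, Flatten, Dense, Lambda
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

def negative_normdist_layer_2(x):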
    # Get the number of dimensions of the input
    num_dims = len(x.get_shape())
    # Separate the parameters
    mu1, mu2, sigma11, sigma12, sigma22 = tf.unstack(x, num=5, axis=-1)
    # Add one dimension to make the right shape
    mu1 = tf.expand_dims(mu1, -1)
    mu2 = tf.expand_dims(mu2, -1)
    sigma11 = tf.expand_dims(sigma11, -1)
    sigma12 = tf.expand_dims(sigma12, -1)
    sigma22 = tf.expand_dims(sigma22, -1)
    # Apply a softplus to make positive
    sigma11 = tf.keras.activations.softplus(sigma11)
    sigma22 = tf.keras.activations.softplus(sigma22)
    # Join back together again
    out_tensor = tf.concat((mu1, mu2, sigma11, sigma12, sigma22), axis=num_dims-1)
    return out_tensor

def negative_normdist_loss_2(y_true, y_pred):
    # Separate the parameters
    mu1, mu2, sigma11, sigma12, sigma22 = tf.unstack(y_pred, num=5, axis=-1)
    # Add one dimension to make the right shape
    mu1 = tf.expand_dims(mu1, -1)
    mu2 = tf.expand_dims(mu2, -1)
    sigma11 = tf.expand_dims(sigma11, -1)
    sigma12 = tf.expand_dims(sigma12, -1)
    sigma22 = tf.expand_dims(sigma22, -1)
    # Calculate the negative log likelihood
    dist = tfp.distributions.MultivariateNormalFullCovariance(
        loc = [mu1, mu2], 
        covariance_matrix = [[sigma11, sigma12], [sigma12, sigma22]]
    )
    nll = tf.reduce_mean(-dist.log_prob(y_true))
    return nll

# Define inputs with predefined shape
input_shape = lookback // step, float_data.shape[-1]
inputs = Input(shape=input_shape)

# Build network with some predefined architecture
output1 = Flatten()(inputs)
output2 = Dense(32)(output1)

# Predict the parameters of a negative normdist distribution
outputs = Dense(5)(output2)
distribution_outputs = Lambda(negative_normdist_layer_2)(outputs)

# Construct model
model_norm_2 = Model(inputs=inputs, outputs=distribution_outputs)

opt = Adam()
model_norm_2.compile(loss = negative_normdist_loss_2, optimizer = opt)

history_norm_2 = model_norm_2.fit_generator(train_gen_mult,
                                            steps_per_epoch=500,
                                            epochs=20,
                                            validation_data=val_gen_mult,
                                            validation_steps=val_steps)

My operating system is Windows 10 and my Python version is 3.6. All libraries used in the sample code are up to date, including tensorflow-gpu.

I would be very grateful if someone could identify the exact cause of the error and suggest a solution.

Recommended answer

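The mean and covariance arguments have to be transposed, because according to the documentation of MultivariateNormalFullCovariance they must have shapes (batch_size, 2) and (batch_size, 2, 2) (for a 2-dimensional problem). Even though the layer ensures that the diagonal entries are positive, inverting the covariance matrix was still a problem, since positive diagonal entries alone do not make a matrix positive definite. You can use MultivariateNormalTriL instead, which takes a lower-triangular matrix and no longer has the covariance inversion problem (keep the softplus):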

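def negative_normdist_loss_2(y_true, y_pred):
    # Separate the parameters; each has shape (batch_size,)
    mu1, mu2, sigma11, sigma12, sigma22 = tf.unstack(y_pred, num=5, axis=-1)
    # Stack the means and move the batch axis first: (2, batch_size) -> (batch_size, 2)
    mu = tf.transpose([mu1, mu2], perm=[1, 0])
    # Build the lower-triangular scale factor: (2, 2, batch_size) -> (batch_size, 2, 2)
    sigma_tril = tf.transpose([[sigma11, tf.zeros_like(sigma11)], [sigma12, sigma22]], perm=[2, 0, 1])
    dist = tfp.distributions.MultivariateNormalTriL(loc=mu, scale_tril=sigma_tril)
    # Mean negative log-likelihood over the batch
    nll = tf.reduce_mean(-dist.log_prob(y_true))
    return nll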
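
The shape mismatch behind the "Input matrix must be square" message can be reproduced without TensorFlow. The sketch below uses NumPy as a stand-in, on the assumption that nested lists of arrays are packed along new leading axes in the same way as nested lists of tensors. The batch size of 4 and all variable names are hypothetical and chosen only for illustration.

```python
import numpy as np

batch = 4  # hypothetical batch size
# Stand-ins for the per-sample parameters after expand_dims: shape (batch, 1)
mu1, mu2 = np.zeros((batch, 1)), np.ones((batch, 1))
s11 = np.full((batch, 1), 1.0)
s12 = np.full((batch, 1), 0.3)
s22 = np.full((batch, 1), 2.0)

# Layout used in the question: nested lists add new leading axes
bad_loc = np.array([mu1, mu2])
bad_cov = np.array([[s11, s12], [s12, s22]])
print(bad_loc.shape)  # (2, 4, 1)
print(bad_cov.shape)  # (2, 2, 4, 1): the trailing two axes (4, 1) are not square

# Batch-first layout: drop the extra axis, then stack along the trailing axes
m1, m2, a, b, c = (t[:, 0] for t in (mu1, mu2, s11, s12, s22))
good_loc = np.stack([m1, m2], axis=-1)
good_cov = np.stack(
    [np.stack([a, b], axis=-1), np.stack([b, c], axis=-1)], axis=-2
)
print(good_loc.shape)  # (4, 2)
print(good_cov.shape)  # (4, 2, 2)
```

A Cholesky factorization works on the two innermost axes, so in the first layout it sees matrices of shape (batch, 1) rather than (2, 2). The batch-first layout matches the (batch_size, 2) and (batch_size, 2, 2) shapes named in the answer.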

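That said, I am curious about the idea behind this. It amounts to an interesting unsupervised approach: the data lets you estimate mean and covariance parameters through a somewhat unconventional cost function, but it is not clear what you can do with them afterwards.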
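
One thing to keep in mind when interpreting the estimated parameters: with MultivariateNormalTriL, the three scale outputs are entries of a lower-triangular factor L, and the covariance is L Lᵀ, so they are not variances and covariances themselves. The following sketch recovers the implied covariance matrix and correlation for a single sample. It uses NumPy, and the helper name `tril_to_cov_and_corr` and the example values are hypothetical.

```python
import numpy as np

def tril_to_cov_and_corr(l11, l21, l22):
    """Return the covariance matrix and correlation implied by a 2x2 lower-triangular scale factor."""
    scale_tril = np.array([[l11, 0.0], [l21, l22]])
    cov = scale_tril @ scale_tril.T  # Sigma = L L^T
    corr = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])
    return cov, corr

# Example values, chosen only for illustration
cov, corr = tril_to_cov_and_corr(l11=2.0, l21=1.5, l22=2.0)
print(cov)   # variances 4.0 and 6.25 on the diagonal, covariance 3.0 off the diagonal
print(corr)  # 0.6
```

Because the diagonal entries pass through a softplus and are therefore positive, L Lᵀ is always a valid (positive-definite) covariance matrix, which is why the factorization problem from the full-covariance version does not occur here.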
