This article looks at how to use Euclidean distance as the loss function for an RNN in Keras; it may be a useful reference if you are solving the same problem.

Problem Description

I want to set Euclidean distance as the loss function for an LSTM or RNN.

What output should such a function have: a float, (batch_size), or (batch_size, timesteps)?

The model input X_train is (n_samples, timesteps, data_dim). Y_train has the same dimensions.

Example code:

from keras import backend as K
from keras.models import Sequential
from keras.layers import SimpleRNN, Dense

# Euclidean distance over the last axis (data_dim)
def euc_dist_keras(x, y):
    return K.sqrt(K.sum(K.square(x - y), axis=-1, keepdims=True))

# n_units, timesteps, data_dim, n_output are defined elsewhere
model = Sequential()
model.add(SimpleRNN(n_units, activation='relu', input_shape=(timesteps, data_dim), return_sequences=True))
model.add(Dense(n_output, activation='linear'))

model.compile(loss=euc_dist_keras, optimizer='adagrad')

model.fit(X_train, Y_train, batch_size=512, epochs=10)

So, should I average the loss over the timesteps dimension and/or batch_size?

Recommended Answer

A loss function takes the predicted and true labels and outputs a scalar. In Keras:

from keras import backend as K
def euc_dist_keras(y_true, y_pred):
    return K.sqrt(K.sum(K.square(y_true - y_pred), axis=-1, keepdims=True))

Note that it will not take X_train as an input. The loss calculation follows the forward-propagation step, and its value measures how good the predicted labels are compared to the true labels.
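To make this concrete, here is a minimal sketch (not part of the original answer) that evaluates euc_dist_keras directly on two label-shaped tensors; the shapes are made up for illustration:

import numpy as np
from keras import backend as K

# Hypothetical shapes: batch_size=4, timesteps=3, data_dim=2
y_true = K.constant(np.zeros((4, 3, 2)))
y_pred = K.constant(np.ones((4, 3, 2)))

# One distance per (sample, timestep); each value here is sqrt(1 + 1)
print(K.eval(euc_dist_keras(y_true, y_pred)).shape)  # -> (4, 3, 1)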

What output should such a function have: a float, (batch_size), or (batch_size, timesteps)?

The loss function should have scalar output (Keras averages the per-sample values returned by the function above into a single scalar for you).

So, should I average the loss over the timesteps dimension and/or batch_size?

That would not be required in order to use Euclidean distance as a loss function.
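As an illustration (assuming the model above has been compiled with euc_dist_keras and that X_train and Y_train exist), Keras performs the reduction itself and reports a single float:

# Sketch: no manual averaging over timesteps or batch is needed
loss = model.evaluate(X_train, Y_train, batch_size=512, verbose=0)
print(loss)  # a single float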

Side note: in your case, I think the problem might be with the neural-network architecture, not the loss. Given (batch_size, timesteps, data_dim), the output of the SimpleRNN will be (batch_size, timesteps, n_units), and the output of the Dense layer will be (batch_size, n_output). Thus, given that your Y_train has the shape (batch_size, timesteps, data_dim), you would likely need to use the TimeDistributed wrapper to apply Dense to every temporal slice, and to adjust the number of hidden units in the fully connected layer; see the sketch below.
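A minimal sketch of that suggestion (not from the original answer; n_units, timesteps, and data_dim are the question's variables), wrapping Dense in TimeDistributed so the output shape matches Y_train:

from keras.models import Sequential
from keras.layers import SimpleRNN, Dense, TimeDistributed

# Assumes n_units, timesteps, data_dim and euc_dist_keras from above
model = Sequential()
model.add(SimpleRNN(n_units, activation='relu',
                    input_shape=(timesteps, data_dim), return_sequences=True))
# Dense is applied per timestep; data_dim output units match Y_train,
# so the model output is (batch_size, timesteps, data_dim)
model.add(TimeDistributed(Dense(data_dim, activation='linear')))

model.compile(loss=euc_dist_keras, optimizer='adagrad')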

That concludes this article on the Euclidean distance loss function for an RNN (Keras); we hope the recommended answer above is helpful.
