Problem description
I am trying to model an LSTM-VAE for time series reconstruction using Keras.
I referred to https://github.com/twairball/keras_lstm_vae/blob/master/lstm_vae/vae.py and https://machinelearningmastery.com/lstm-autoencoders/ when creating the LSTM-VAE architecture.
I am having trouble training the network; I get the following error when training in eager execution mode:
InvalidArgumentError: Incompatible shapes: [8,1] vs. [32,1] [Op:Mul]
The input shape is (7752, 30, 1): 30 time steps and 1 feature.
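For context, a shape like (7752, 30, 1) typically comes from sliding a 30-step window over a univariate series. A minimal sketch of such windowing (the sine series is a placeholder, not the asker's data):

import numpy as np

# Placeholder univariate series; 7781 points yield 7752 overlapping windows.
series = np.sin(np.linspace(0, 100, 7781)).astype('float32')

window = 30
# One sample per window start position: 7781 - 30 + 1 = 7752 samples.
X_train = np.stack([series[i:i + window] for i in range(len(series) - window + 1)])
X_train = X_train[..., np.newaxis]  # add the feature axis -> (7752, 30, 1)
print(X_train.shape)  # (7752, 30, 1)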
Encoder model:
from tensorflow import keras
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Dense, Lambda, RepeatVector, TimeDistributed

# encoder
latent_dim = 1
inter_dim = 32

# samples, timesteps, features
input_x = keras.layers.Input(shape=(X_train.shape[1], X_train.shape[2]))
# intermediate dimension
h = keras.layers.LSTM(inter_dim)(input_x)
# z layer
z_mean = keras.layers.Dense(latent_dim)(h)
z_log_sigma = keras.layers.Dense(latent_dim)(h)
z = Lambda(sampling)([z_mean, z_log_sigma])
Decoder model:
# Reconstruction decoder
decoder1 = RepeatVector(X_train.shape[1])(z)
decoder1 = keras.layers.LSTM(100, activation='relu', return_sequences=True)(decoder1)
decoder1 = keras.layers.TimeDistributed(Dense(1))(decoder1)
Sampling function:
batch_size = 32

def sampling(args):
    z_mean, z_log_sigma = args
    epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0., stddev=1.)
    return z_mean + z_log_sigma * epsilon
VAE loss function:
def vae_loss2(input_x, decoder1):
    """ Calculate loss = reconstruction loss + KL loss for each data in minibatch """
    # E[log P(X|z)]
    recon = K.sum(K.binary_crossentropy(input_x, decoder1), axis=1)
    # D_KL(Q(z|X) || P(z|X)); calculate in closed form as both dist. are Gaussian
    kl = 0.5 * K.sum(K.exp(z_log_sigma) + K.square(z_mean) - 1. - z_log_sigma, axis=1)
    return recon + kl
Any suggestions to make the model work?
Recommended answer
You need to infer the batch dimension inside the sampling function, and you need to pay attention to your loss: your loss function uses the outputs of previous layers (z_mean and z_log_sigma), so you need to account for that. I implement this below using model.add_loss(...). The original error comes from the hard-coded batch_size = 32 in sampling: 7752 samples are not divisible by 32, so the last batch contains only 8 samples, which produces the shape mismatch [8,1] vs. [32,1].
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, LSTM, Dense, Lambda, RepeatVector, TimeDistributed
from tensorflow.keras.models import Model

# encoder
latent_dim = 1
inter_dim = 32
timesteps, features = 100, 1
def sampling(args):
    z_mean, z_log_sigma = args
    batch_size = tf.shape(z_mean)[0]  # <================ infer the batch size dynamically
    epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0., stddev=1.)
    # note: under the log-variance convention used by the KL term below, the
    # textbook reparameterization would be z_mean + K.exp(0.5 * z_log_sigma) * epsilon
    return z_mean + z_log_sigma * epsilon
# timesteps, features
input_x = Input(shape= (timesteps, features))
#intermediate dimension
h = LSTM(inter_dim, activation='relu')(input_x)
#z_layer
z_mean = Dense(latent_dim)(h)
z_log_sigma = Dense(latent_dim)(h)
z = Lambda(sampling)([z_mean, z_log_sigma])
# Reconstruction decoder
decoder1 = RepeatVector(timesteps)(z)
decoder1 = LSTM(inter_dim, activation='relu', return_sequences=True)(decoder1)
decoder1 = TimeDistributed(Dense(features))(decoder1)
def vae_loss2(input_x, decoder1, z_log_sigma, z_mean):
    """ Calculate loss = reconstruction loss + KL loss for each data in minibatch """
    # E[log P(X|z)]
    recon = K.sum(K.binary_crossentropy(input_x, decoder1))
    # D_KL(Q(z|X) || P(z|X)); calculate in closed form as both dist. are Gaussian
    kl = 0.5 * K.sum(K.exp(z_log_sigma) + K.square(z_mean) - 1. - z_log_sigma)
    return recon + kl
m = Model(input_x, decoder1)
m.add_loss(vae_loss2(input_x, decoder1, z_log_sigma, z_mean)) #<===========
m.compile(loss=None, optimizer='adam')
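Because the loss is attached with model.add_loss(...) and the model is compiled with loss=None, training needs no explicit target. A minimal usage sketch, assuming X_train is shaped (samples, timesteps, features) to match the model input, and using a simple percentile cutoff to flag anomalies (both are assumptions, not part of the original answer):

import numpy as np

# Train: add_loss already supplies the objective, so no y is passed.
m.fit(X_train, epochs=20, batch_size=32, validation_split=0.1)

# Flag anomalies via per-window reconstruction error.
X_pred = m.predict(X_train)
recon_error = np.mean(np.square(X_train - X_pred), axis=(1, 2))
threshold = np.percentile(recon_error, 99)  # assumed heuristic cutoff
anomaly_idx = np.where(recon_error > threshold)[0]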