I am trying to model time-varying covariance with an RNN in Keras, where I decompose the covariance of a signal Y into a weighted sum that varies over time: C_Y^t = SUM_i^npriors (alpha_i^t * beta_i), where the beta_i are a fixed basis set and the alpha_i^t are the terms I am trying to infer.
As the cost function I am (currently) using the negative log-likelihood, where the likelihood is a zero-mean MVN with the inferred covariance C_Y^t (as above): likelihood = MVN(Y; 0, C_Y^t). Once this is implemented correctly, I will use the reparameterization trick together with a KL divergence term.
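Written out explicitly, that cost is just the standard zero-mean MVN log-density summed over time steps (with n = nchans):

-\log p(Y \mid \alpha) = -\sum_t \log \mathrm{MVN}(Y_t; 0, C_Y^t) = \tfrac{1}{2} \sum_t \left[ Y_t^\top (C_Y^t)^{-1} Y_t + \log\det C_Y^t + n \log 2\pi \right]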
I do not want to explicitly reconstruct the data as in a classic autoencoder setup - I only want to infer the alpha terms that best fit the time-varying covariance dynamics. So when calling the model, the outputs should just be alpha_mu and alpha_sigma:
alpha_model_net = tf.keras.Model(inputs=[inputs_layer],
                                 outputs=[alpha_mu, alpha_sigma],
                                 name='Alpha_MODEL')
However, I do not know these alpha terms a priori, so when calling
alpha_model_net.fit(Y_observed, [alpha_mu_predict, alpha_sigma_predict])
it is hard to know what the [alpha_mu_predict, alpha_sigma_predict] targets should be in an unsupervised setting. My question therefore comes in two parts:
First, if I do not know them, what should I use as alpha_predict? Second, should I actually be using the samples drawn from the alpha distributions inside my custom cost function, i.e. alpha_ast, as in the attempted implementation shown here?
I have had a go at implementing this myself. The key parts of my code are shown below, and a complete example with data simulation can be found on a Google Colab doc here.
The model
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp
from tensorflow.keras import layers

tfd = tfp.distributions

mini_batch_length = 10  # feature length
nchans = 5              # number of features/channels of observed data, Y
nunits = 10             # number of GRU units
npriors = 2             # i.e. how many basis functions we have

inputs_layer = layers.Input(shape=(mini_batch_length, nchans), name='Y_input')
output, state = tf.compat.v1.keras.layers.CuDNNGRU(nunits,  # number of units
                                                   return_state=True,
                                                   return_sequences=True,
                                                   name='uni_INF_GRU')(inputs_layer)
alpha_mu = tf.keras.layers.Dense(npriors, activation='linear', name='alpha_mu')(output)
alpha_sigma = tf.keras.layers.Dense(npriors, activation='linear', name='alpha_sigma')(output)
# use reparameterization trick to push the sampling out as input
alpha_ast = layers.Lambda(sampling,
                          name='alpha_ast')([alpha_mu, alpha_sigma])
# instantiate alpha MODEL network:
alpha_model_net = tf.keras.Model(inputs=[inputs_layer],
                                 outputs=[alpha_ast],
                                 name='Alpha_MODEL')
tf.keras.utils.plot_model(alpha_model_net, to_file='vae_mlp_encoder.png', show_shapes=True)
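The sampling function passed to the Lambda layer is not defined in the snippet above; here is a minimal sketch of the usual VAE reparameterization trick, assuming the alpha_sigma dense layer outputs a log-variance (that parameterization is my assumption, not something fixed by the code above):

def sampling(args):
    # Reparameterization trick: alpha = mu + exp(0.5 * log_var) * eps, with eps ~ N(0, I),
    # so gradients flow through mu and log_var while the randomness sits in eps.
    alpha_mu, alpha_log_var = args
    eps = tf.keras.backend.random_normal(shape=tf.shape(alpha_mu))
    return alpha_mu + tf.keras.backend.exp(0.5 * alpha_log_var) * eps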
The cost function
def vae_loss(Y_portioned, alpha_ast):
    """
    Our cost function is just the NLL.
    The likelihood is a multivariate normal with zero mean and time-varying
    covariance:
        P(Y|alpha^t) = MVN(Y; 0, C_Y^t)
    where
        C_Y^t = SUM_i^npriors (alpha_ast_i^t beta_i)
    Y is our observed data,
    alpha_ast_i^t are our samples from the inferred parameters (mu, sigma),
    beta_i are the basis functions (corresponding to covariance_matrix below)
    and (perhaps obviously) are not trainable.
    """
    # Alphas need to end up being of dimension (?,mini_batch_length,npriors,1,1),
    # and need to undergo softplus transformation:
    alpha_ext = tf.keras.backend.expand_dims(tf.keras.backend.expand_dims(
        tf.keras.activations.softplus(alpha_ast),
        axis=-1), axis=-1)
    # Covariance basis set
    # This needs to be of dim [npriors, sensors, sensors]:
    covariance_basis = np.tile(np.zeros((nchans, nchans)), (npriors, 1, 1)).astype('float32')
    covariance_basis[0, 0, 0] = 1
    covariance_basis[1, 1, 1] = 1
    # Covariance basis functions need to be of dimension [1, 1, npriors, sensors, sensors]
    covariance_ext = tf.reshape(covariance_basis, (1, 1, npriors, nchans, nchans))
    # Do the multiplicative sum over the npriors dimension:
    cov_arg = tf.reduce_sum(tf.multiply(alpha_ext, covariance_ext), 2)
    # Jitter must be float32 to match cov_arg's dtype:
    safety_add = (1e-6 * np.eye(nchans, nchans)).astype('float32')
    cov_arg = cov_arg + safety_add
    mvn = tfd.MultivariateNormalFullCovariance(
        loc=np.zeros((mini_batch_length, nchans)).astype('float32'),
        covariance_matrix=cov_arg,
        allow_nan_stats=False)
    # Evaluate the -log(MVN) at the current batch of data. We add a tiny constant
    # to avoid any NaN or inf troubles
    loss = tf.reduce_sum(-tf.math.log(mvn.prob(Y_portioned) + 1e-9))
    return loss
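As an aside, the same NLL can be evaluated through the distribution's log_prob method instead of log(prob + 1e-9), which avoids the additive fudge factor; a one-line sketch using the same mvn and Y_portioned as above:

# Equivalent NLL without the constant inside the log:
loss = -tf.reduce_sum(mvn.log_prob(Y_portioned))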
Fitting the model
opt = tf.keras.optimizers.Adam(lr=0.001)
alpha_model_net.compile(optimizer=opt, loss=vae_loss)
history = alpha_model_net.fit(Y_portioned,  # Observed data.
                              Y_portioned,  # ???
                              verbose=1,
                              shuffle=True,
                              epochs=100,
                              batch_size=400)
Many thanks in advance - and please let me know if I am missing any key details.
Using the TensorFlow 2.1.0 backend.
Update 1:
I simply computed the NLL from the tensors directly and attached it with the add_loss function. This now seems to work, and I no longer need to specify the troublesome y in model.fit(x, y). I will update again if this turns out to be wrong. Example model:
inputs_layer = layers.Input(shape=(mini_batch_length, nchans), name='Y_portioned_in')
output, state = tf.compat.v1.keras.layers.CuDNNGRU(nunits,  # number of units
                                                   return_state=True,
                                                   return_sequences=True,
                                                   name='uni_INF_GRU')(inputs_layer)
dense_layer_mu = tf.keras.layers.Dense(npriors, activation='linear')(output)
dense_layer_sigma = tf.keras.layers.Dense(npriors, activation='linear')(output)
alpha_ast = layers.Lambda(sampling,
                          name='alpha_ast')([dense_layer_mu, dense_layer_sigma])
model = tf.keras.Model(inputs=[inputs_layer], outputs=[dense_layer_mu])
# Construct your custom loss as a tensor
loss = my_beautiful_custom_loss(alpha_ast, inputs_layer, npriors, nchans)
# Add loss to model
model.add_loss(loss)
# Compile without specifying a loss
opt = tf.keras.optimizers.Adam(lr=0.001)
model.compile(optimizer=opt)
history = model.fit(Y_portioned,  # Input or "Y_true"
                    verbose=1,
                    shuffle=True,
                    epochs=400,
                    batch_size=200)
where
def my_beautiful_custom_loss(alpha_ast, Y_portioned, npriors, nchans):
    # <Do something with input tensors here>
    return loss
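For concreteness, here is a minimal sketch of what this function could contain, simply reusing the NLL construction from vae_loss above (the fixed diagonal basis set and the softplus transform are carried over from that code; this is an illustration, not a tested implementation):

def my_beautiful_custom_loss(alpha_ast, Y_portioned, npriors, nchans):
    # Softplus-transform the sampled alphas and append two singleton dims so they
    # broadcast against the [npriors, nchans, nchans] basis set.
    alpha_ext = tf.keras.backend.expand_dims(tf.keras.backend.expand_dims(
        tf.keras.activations.softplus(alpha_ast), axis=-1), axis=-1)
    # Fixed (non-trainable) covariance basis, same toy diagonal basis as in vae_loss.
    covariance_basis = np.zeros((npriors, nchans, nchans), dtype='float32')
    covariance_basis[0, 0, 0] = 1
    covariance_basis[1, 1, 1] = 1
    covariance_ext = tf.reshape(covariance_basis, (1, 1, npriors, nchans, nchans))
    # Weighted sum over the priors dimension gives C_Y^t, plus jitter for stability.
    cov_arg = tf.reduce_sum(tf.multiply(alpha_ext, covariance_ext), 2)
    cov_arg = cov_arg + (1e-6 * np.eye(nchans)).astype('float32')
    # Zero-mean MVN likelihood with the time-varying covariance; NLL via log_prob.
    mvn = tfd.MultivariateNormalFullCovariance(
        loc=tf.zeros((nchans,), dtype='float32'),
        covariance_matrix=cov_arg,
        allow_nan_stats=False)
    return -tf.reduce_sum(mvn.log_prob(Y_portioned))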
Best answer
Not sure this is the most sensible way to do it, but I solved the problem using the add_loss function.
I will update my original question with the complete implementation.