How to apply a Masking layer to a sequential CNN model in Keras?

Problem Description

I have a problem applying a masking layer to the CNNs in an RNN/LSTM model.

My data is not raw images; I have converted it into a shape of (16, 34, 4) (channels_first). The data is sequential, and the longest sequence is 22 steps, so for a fixed-length input I set the timestep to 22. Since a sequence may be shorter than 22 steps, I pad the remaining steps with np.zeros. However, the zero padding makes up about half of the whole dataset, so with so much useless data the training cannot reach a very good result. I therefore want to add a mask to cancel out these zero-padded steps.
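
To make the setup concrete, here is a minimal sketch of how that zero padding might be done, assuming each sequence is stored as an array of shape (length, 16, 34, 4); the helper name pad_sequence is only illustrative, not part of the original code.

import numpy as np

def pad_sequence(seq, max_steps=22):
    """Pad a (length, 16, 34, 4) sequence with zeros up to max_steps timesteps."""
    padded = np.zeros((max_steps,) + seq.shape[1:], dtype=seq.dtype)
    padded[:seq.shape[0]] = seq
    return padded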

Here is my code.

import numpy as np
from keras.models import Sequential
from keras.layers import (TimeDistributed, Masking, Conv2D, BatchNormalization,
                          Dropout, Flatten, GRU, Dense)

# Mask array matching one timestep; I also tried a scalar mask_value of 0.
mask = np.zeros((16, 34, 4), dtype=np.int8)
input_shape = (22, 16, 34, 4)

model = Sequential()
model.add(TimeDistributed(Masking(mask_value=mask), input_shape=input_shape, name='mask'))
model.add(TimeDistributed(Conv2D(100, (5, 2), data_format='channels_first', activation='relu'), name='conv1'))
model.add(TimeDistributed(BatchNormalization(), name='bn1'))
model.add(Dropout(0.5, name='drop1'))
model.add(TimeDistributed(Conv2D(100, (5, 2), data_format='channels_first', activation='relu'), name='conv2'))
model.add(TimeDistributed(BatchNormalization(), name='bn2'))
model.add(Dropout(0.5, name='drop2'))
model.add(TimeDistributed(Conv2D(100, (5, 2), data_format='channels_first', activation='relu'), name='conv3'))
model.add(TimeDistributed(BatchNormalization(), name='bn3'))
model.add(Dropout(0.5, name='drop3'))
model.add(TimeDistributed(Flatten(), name='flatten'))
model.add(GRU(256, activation='tanh', return_sequences=True, name='gru'))
model.add(Dropout(0.4, name='drop_gru'))
model.add(Dense(35, activation='softmax', name='softmax'))
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['acc'])

Here's the model structure.
model.summary():

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
mask (TimeDistributed)       (None, 22, 16, 34, 4)     0
_________________________________________________________________
conv1 (TimeDistributed)      (None, 22, 100, 30, 3)    16100
_________________________________________________________________
bn1 (TimeDistributed)        (None, 22, 100, 30, 3)    12
_________________________________________________________________
drop1 (Dropout)              (None, 22, 100, 30, 3)    0
_________________________________________________________________
conv2 (TimeDistributed)      (None, 22, 100, 26, 2)    100100
_________________________________________________________________
bn2 (TimeDistributed)        (None, 22, 100, 26, 2)    8
_________________________________________________________________
drop2 (Dropout)              (None, 22, 100, 26, 2)    0
_________________________________________________________________
conv3 (TimeDistributed)      (None, 22, 100, 22, 1)    100100
_________________________________________________________________
bn3 (TimeDistributed)        (None, 22, 100, 22, 1)    4
_________________________________________________________________
drop3 (Dropout)              (None, 22, 100, 22, 1)    0
_________________________________________________________________
flatten (TimeDistributed)    (None, 22, 2200)          0
_________________________________________________________________
gru (GRU)                    (None, 22, 256)           1886976
_________________________________________________________________
drop_gru (Dropout)           (None, 22, 256)           0
_________________________________________________________________
softmax (Dense)              (None, 22, 35)            8995
=================================================================
Total params: 2,112,295
Trainable params: 2,112,283
Non-trainable params: 12
_________________________________________________________________

For mask_value, I tried both 0 and this mask array, but neither works; the model still trains through all the data, half of which is zero padding.
Can anyone help me?

By the way, I used TimeDistributed here to connect the CNN to the RNN, and I know of another layer called ConvLSTM2D. Does anyone know the difference? ConvLSTM2D takes many more parameters for the model and trains much more slowly than the TimeDistributed approach.
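
For reference, a rough sketch of what the ConvLSTM2D alternative mentioned above might look like; the filter count and classification head are illustrative, not a tuned architecture.

from keras.models import Sequential
from keras.layers import ConvLSTM2D, TimeDistributed, Flatten, Dense

alt = Sequential()
# The convolution is applied inside the recurrent gates, which is why the
# parameter count grows much faster than with TimeDistributed(Conv2D) + GRU.
alt.add(ConvLSTM2D(100, (5, 2), data_format='channels_first',
                   return_sequences=True, input_shape=(22, 16, 34, 4)))
alt.add(TimeDistributed(Flatten()))
alt.add(Dense(35, activation='softmax'))
alt.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['acc'])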

Recommended Answer

Unfortunately, masking is not yet supported by the Keras Conv layers. Several issues have been posted about this on the Keras GitHub page; here is the one with the most substantial conversation on the topic. It appears that there were some outstanding implementation details, and the issue was never resolved.

The workaround proposed in that discussion is to use an explicit embedding for the padding character in the sequences and to do global pooling. Here is another workaround I found (not helpful for my use case, but maybe helpful for yours): keep a mask array and merge it in through multiplication, as sketched below.
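
A minimal sketch of that multiplication idea, assuming the same (22, 16, 34, 4) padded input as in the question; the layer sizes are illustrative, and the loss on padded timesteps would still need separate handling.

import keras.backend as K
from keras.models import Model
from keras.layers import (Input, Lambda, Multiply, TimeDistributed, Conv2D,
                          Flatten, GRU, Dense)

inp = Input(shape=(22, 16, 34, 4))

# Per-timestep binary mask: 1.0 for real steps, 0.0 for all-zero padded steps.
step_mask = Lambda(lambda x: K.expand_dims(
    K.cast(K.any(K.not_equal(x, 0.0), axis=[2, 3, 4]), K.floatx()),
    axis=-1))(inp)                                   # shape (batch, 22, 1)

x = TimeDistributed(Conv2D(100, (5, 2), data_format='channels_first',
                           activation='relu'))(inp)
x = TimeDistributed(Flatten())(x)
x = GRU(256, activation='tanh', return_sequences=True)(x)

# Zero out the recurrent outputs of padded timesteps before the classifier.
x = Multiply()([x, step_mask])
out = Dense(35, activation='softmax')(x)

model = Model(inp, out)
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['acc'])

To also keep the padded steps out of the loss, one could compile with sample_weight_mode='temporal' and pass the same per-timestep mask as sample_weight in fit().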

You can also check out the conversation around this question, which is similar to yours.
