Problem description
I am trying to grasp what the TimeDistributed wrapper does in Keras. I understand that TimeDistributed "applies a layer to every temporal slice of an input." But I ran a small experiment and got results I cannot make sense of: in short, applied after an LSTM layer, TimeDistributed(Dense) and a plain Dense layer give the same result.
from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed

# Model 1: Dense wrapped in TimeDistributed
model = Sequential()
model.add(LSTM(5, input_shape=(10, 20), return_sequences=True))
model.add(TimeDistributed(Dense(1)))
print(model.output_shape)

# Model 2: plain Dense
model = Sequential()
model.add(LSTM(5, input_shape=(10, 20), return_sequences=True))
model.add(Dense(1))
print(model.output_shape)
For both models, I get an output shape of (None, 10, 1).
Can anyone explain the difference between a TimeDistributed layer and a Dense layer after an RNN layer?
Answer
In Keras, while building a sequential model, the second dimension (the one after the sample dimension) is usually the time dimension. This means that if, for example, your data is 5-dim with shape (sample, time, width, length, channel), you could apply a convolutional layer (which accepts 4-dim input with shape (sample, width, length, channel)) along the time dimension using TimeDistributed, applying the same layer to each time slice, in order to obtain 5-dim output.
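To make this concrete, here is a minimal sketch of that pattern; the filter count, kernel size, and input shape are made up for illustration:

from keras.models import Sequential
from keras.layers import Conv2D, TimeDistributed

# 10 time steps of 32x32 RGB frames: (sample, time, width, length, channel)
model = Sequential()
model.add(TimeDistributed(Conv2D(16, (3, 3), padding='same'),
                          input_shape=(10, 32, 32, 3)))
# The same Conv2D is applied to each of the 10 time slices.
print(model.output_shape)  # (None, 10, 32, 32, 16)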
The case with Dense is that from version 2.0 on, Keras applies Dense only to the last dimension by default (e.g. if you apply Dense(10) to an input with shape (n, m, o, p), you get output with shape (n, m, o, 10)), so in your case Dense and TimeDistributed(Dense) are equivalent.
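A quick sketch to check this equivalence; the input shape here is arbitrary:

from keras.models import Sequential
from keras.layers import Dense, TimeDistributed

# Plain Dense on a 3-dim (plus batch) input: only the last dimension changes.
model = Sequential()
model.add(Dense(10, input_shape=(5, 7, 3)))
print(model.output_shape)  # (None, 5, 7, 10)

# Wrapping the same Dense in TimeDistributed yields the identical shape.
model = Sequential()
model.add(TimeDistributed(Dense(10), input_shape=(5, 7, 3)))
print(model.output_shape)  # (None, 5, 7, 10)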