Problem Description
This is my test code:
from keras import layers
input1 = layers.Input((2,3))
output = layers.Dense(4)(input1)
print(output)
The output is:
<tf.Tensor 'dense_2/add:0' shape=(?, 2, 4) dtype=float32>
But what happened? The documentation says:

Note: if the input to the layer has a rank greater than 2, then it is flattened prior to the initial dot product with the kernel.

So is the output reshaped?
Accepted Answer
Currently, contrary to what has been stated in the documentation, the Dense layer is applied on the last axis of the input tensor. In other words, if a Dense layer with m units is applied to an input tensor of shape (n_dim1, n_dim2, ..., n_dimk), its output will have shape (n_dim1, n_dim2, ..., m).
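You can verify this with a quick check. Below is a minimal sketch (the names inp, dense, x, and manual are ours, not from the answer) that applies the layer's kernel to the last axis by hand and confirms the result matches the layer's output:

from keras import layers
from keras.models import Model
import numpy as np

inp = layers.Input((2, 3))
dense = layers.Dense(4)
model = Model(inp, dense(inp))

# The kernel has shape (3, 4): it only ever sees the last axis of the input.
kernel, bias = dense.get_weights()
x = np.random.rand(1, 2, 3).astype('float32')
manual = x @ kernel + bias  # the same kernel is applied to each of the 2 rows
assert np.allclose(model.predict(x), manual, atol=1e-5)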
As a side note, this makes TimeDistributed(Dense(...)) and Dense(...) equivalent to each other.
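A minimal sketch (the model names m1 and m2 are ours) demonstrating this equivalence; both variants report the same output shape and the same parameter count:

from keras.models import Sequential
from keras.layers import Dense, TimeDistributed

m1 = Sequential([Dense(10, input_shape=(20, 5))])
m2 = Sequential([TimeDistributed(Dense(10), input_shape=(20, 5))])

m1.summary()  # output shape (None, 20, 10), 60 params
m2.summary()  # output shape (None, 20, 10), 60 params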
Another side note: be aware that this has the effect of shared weights. For example, consider this toy network:
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(10, input_shape=(20, 5)))
model.summary()
The model summary:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 20, 10)            60
=================================================================
Total params: 60
Trainable params: 60
Non-trainable params: 0
_________________________________________________________________
As you can see, the Dense layer has only 60 parameters. How? Each unit in the Dense layer is connected to the 5 elements of each row in the input with the same weights, therefore 10 * 5 + 10 (one bias parameter per unit) = 60.
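You can confirm the weight sharing by inspecting the layer's weights directly; a short sketch (the variable names are ours):

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(10, input_shape=(20, 5)))

kernel, bias = model.layers[0].get_weights()
print(kernel.shape)  # (5, 10): a single 5-to-10 kernel shared across all 20 rows
print(bias.shape)    # (10,): one bias per unit
# 5 * 10 + 10 = 60 trainable parameters in total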
Update: here is a visual illustration of the example above:

[figure: illustration of the Dense layer's shared weights applied to each row of the input]