Problem description
Given a tensor of shape=[batch_size, max_time, 128] (the output of an RNN), where max_time may vary, I would like to apply a fully connected layer to project the data onto a [batch_size, max_time, 10] shape.
The question is: do I need to reshape the input Tensor first, merging the first two dimensions, then apply tf.layers.dense, and then reshape back to 3D? Or can I simply use tf.layers.dense on the 3D tensor to obtain an equivalent effect?
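For reference, the reshape-based option can be sketched in plain NumPy (a minimal sketch; the shapes and variable names are illustrative, not from the original post):

```python
import numpy as np

batch_size, max_time, num_units, num_classes = 4, 7, 128, 10
x = np.random.randn(batch_size, max_time, num_units)   # RNN output
W = np.random.randn(num_units, num_classes)            # shared weight matrix
b = np.random.randn(num_classes)                       # shared bias

# Option 1: merge the first two dimensions, apply the dense projection,
# then reshape back to 3D.
flat = x.reshape(-1, num_units)                        # [batch_size * max_time, 128]
out = (flat @ W + b).reshape(batch_size, max_time, num_classes)
print(out.shape)  # (4, 7, 10)
```

Because the flattening only merges the batch and time axes, the same W and b are applied at every time step, which is exactly the weight sharing described below.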
I would like to have a single weight matrix shared across all the connections between the 128 RNN units and the 10 output classes, while at the same time allowing a variable max_time for each batch.
Recommended answer
After further investigation, it appears that the two options are equivalent.
The Dense.call() method checks the number of dimensions of the input. If it is larger than 2, it computes a tensordot (an operation corresponding to numpy.tensordot) between the input and the weights, choosing as axes the last dimension of the input and the first dimension of the weights. Otherwise it applies a standard matrix multiplication (matmul).
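The equivalence can be checked directly with numpy.tensordot, using the same axes the rank-greater-than-2 path selects (a small sketch; shapes and names are illustrative):

```python
import numpy as np

batch_size, max_time, num_units, num_classes = 4, 7, 128, 10
x = np.random.randn(batch_size, max_time, num_units)   # RNN output
W = np.random.randn(num_units, num_classes)            # shared weight matrix

# Option 2: tensordot over the last axis of x and the first axis of W,
# i.e. the axes chosen for inputs with more than 2 dimensions.
via_tensordot = np.tensordot(x, W, axes=[[2], [0]])

# Option 1: flatten the batch and time axes, matmul, reshape back.
via_reshape = (x.reshape(-1, num_units) @ W).reshape(
    batch_size, max_time, num_classes)

print(via_tensordot.shape)                   # (4, 7, 10)
print(np.allclose(via_tensordot, via_reshape))  # True
```

Both paths contract x against the same single weight matrix, so the result is identical up to floating-point rounding, and neither depends on a fixed max_time.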