Problem description
Could anyone verify my reasoning?
Let's say I have a (pre-trained) fully connected layer fc that takes bx20x20x10 as input and produces bx64 as output, where b is the batch size.
Now, I have an input of cx100x60x10. The height and width 100x60 can be subdivided into a 5x3 grid of 20x20 patches. I would like to obtain a 5x3 grid of local responses (outputs) from the fc layer, i.e. `cx5x3x64`.
Now I am thinking: this is the same as a convolution layer with the fc weights and a stride of 20 in both width and height. Is that correct, or could there be a difference?
Yes, it will be the same if appropriate reshaping of the dense layer weight matrix is performed.
Let us first look at the dense layer. You feed a 20 x 20 x 10 matrix to the dense layer. It is first flattened to produce a 4000 x 1 vector. You want the output to be a 64 x 1 vector, so the required weight matrix is 4000 x 64, plus 64 bias parameters. Then y = w^T * x + b = [4000 x 64]^T * [4000 x 1] + [64 x 1] yields a [64 x 1] vector. Therefore, y[i] = w[i][0]*x[0] + ... + w[i][3999]*x[3999] + b[i] for i = [0, 63], where b is the bias vector.
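As a minimal numpy sketch of this dense-layer computation (random values standing in for the pretrained weights; the variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

w = rng.standard_normal((4000, 64))    # weight matrix, 4000 x 64
b = rng.standard_normal(64)            # 64 bias parameters
x = rng.standard_normal((20, 20, 10))  # one input sample

x_flat = x.ravel()     # flatten 20 x 20 x 10 -> 4000-vector
y = w.T @ x_flat + b   # y = w^T * x + b
print(y.shape)         # (64,)
```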
Let us turn to convolution. To produce a 5 x 3 x 64
output from an input of size 100 x 60 x 10
, you need 64 filters, each of size (20,20)
and strides (20,20)
with no zero-padding. Each 20 x 20 filter, however, has local connectivity extending along the entire depth, i.e. a neuron is connected to all 10 channels along the depth of the input. Please read this for more information on the local connectivity of convolutional layers.
Your convolutional layer has a receptive field of 20 x 20. Each neuron in the convolutional layer is connected to a 20 x 20 x 10 block of the input, giving 4000 weights (and one bias parameter) per filter. You have 64 such filters, so the total number of learnable parameters for this layer is 4000 x 64 + 64. Convolution between one block of x and the filter bank w can be performed as follows (note that the arrays here use Theano's channel-first layout, so one block of x has shape (1, 10, 20, 20) and w has shape (64, 10, 20, 20)):
convResult = np.sum(np.sum(np.sum(x * w[:, :, ::-1, ::-1], axis=-1), axis=-1), axis=-1)
There are some fine points here. I used w[:, :, ::-1, ::-1] because Theano convolution flips the convolution kernel (well, it is not quite that simple!). If you are interested in which frameworks flip the kernel and which do not, read this.
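A tiny 1-D numpy illustration of the flipping point (hypothetical data): true convolution equals cross-correlation with a reversed kernel, which is why the reversal w[:, :, ::-1, ::-1] is needed above.

```python
import numpy as np

x = np.array([1., 2., 3.])
k = np.array([0., 1., 2.])

conv = np.convolve(x, k, mode='valid')         # true convolution (kernel flipped internally)
corr = np.correlate(x, k[::-1], mode='valid')  # cross-correlation with a reversed kernel

print(conv, corr)  # both give [4.]
```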
Finally, the dense layer and the convolution layer (in this context) essentially perform the same operation: they element-wise multiply and then sum two sets of 4000 values, and this procedure is repeated 64 times to produce a 64 x 1 vector. So it is possible to achieve exactly the same result with the dense and convolution layers by properly reshaping the dense layer weight matrix. However, you need to take care of kernel flipping to match the results.
Below I give a code snippet to compute convolution manually (using numpy) and using Theano.
import theano
from theano import tensor as T
import numpy as np
X = T.ftensor4('X')
W = T.ftensor4('W')
out = T.nnet.conv2d(X,W)
f = theano.function([X, W], out, allow_input_downcast=True)
x = np.random.random((1,10,20,20))
w = np.random.random((64,10,20,20))
# convolution using Theano
c1 = np.squeeze(f(x,w)[0])
# convolution using Numpy
c2 = np.sum(np.sum(np.sum(x*w[:,:,::-1,::-1],axis=-1),axis=-1),axis=-1)
# check that both are almost identical
print(np.amax(c2 - c1))
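To close the loop on the original question, here is a numpy-only sketch (hypothetical random weights, and the cross-correlation convention, so no kernel flip is needed) showing that applying the dense layer to each 20 x 20 x 10 patch of a 100 x 60 x 10 input gives the same 5 x 3 x 64 output as a strided filter bank built by reshaping the dense weight matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretrained dense layer: 4000 inputs -> 64 outputs (random stand-ins).
W = rng.standard_normal((4000, 64))
b = rng.standard_normal(64)

x = rng.standard_normal((100, 60, 10))  # one input sample (height x width x depth)

# 1) Dense layer applied independently to each 20x20x10 patch.
dense_out = np.empty((5, 3, 64))
for i in range(5):
    for j in range(3):
        patch = x[i*20:(i+1)*20, j*20:(j+1)*20, :].ravel()  # 4000-vector
        dense_out[i, j] = patch @ W + b

# 2) The same computation as a stride-(20,20) filter bank: reshape the
#    dense weights into 64 filters of size 20 x 20 x 10.
filters = W.reshape(20, 20, 10, 64)
conv_out = np.empty((5, 3, 64))
for i in range(5):
    for j in range(3):
        patch = x[i*20:(i+1)*20, j*20:(j+1)*20, :]
        # contract the three patch axes against the filters
        conv_out[i, j] = np.tensordot(patch, filters, axes=3) + b

print(np.max(np.abs(dense_out - conv_out)))  # ~0 up to floating-point tolerance
```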