python - How do I convert dense layers into equivalent convolutional layers in Keras?

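I want to do something similar to the Fully Convolutional Networks paper (https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf) using Keras. I have a network that ends by flattening the feature maps and running them through several dense layers. I want to load the weights from such a network into one in which the dense layers are replaced by equivalent convolutions.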

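The VGG16 network that ships with Keras can serve as an example: the 7x7x512 output of the last MaxPooling2D() is flattened and then fed into a Dense(4096) layer. In this case, the Dense(4096) would be replaced by a 7x7x4096 convolution (a 7x7 kernel with 4096 filters).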
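The equivalence described here can be checked numerically. The following is a minimal NumPy sketch, not Keras code: it assumes channels_last data, uses toy sizes in place of 7x7x512 and 4096, and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes standing in for VGG16's 7x7x512 feature map and Dense(4096).
H, W_, C, N = 3, 3, 4, 5

features = rng.standard_normal((H, W_, C))      # one channels_last feature map
dense_W = rng.standard_normal((H * W_ * C, N))  # Dense kernel: (inputs, units)
dense_b = rng.standard_normal(N)

# Dense applied to the flattened map (row-major flattening of height, width, channels).
dense_out = features.reshape(-1) @ dense_W + dense_b

# The same weights viewed as one H x W convolution kernel with N filters.
conv_kernel = dense_W.reshape(H, W_, C, N)

# A 'valid' convolution whose kernel covers the whole map has a single output
# position: for each filter, a sum over height, width and channels.
conv_out = np.einsum("hwc,hwcn->n", features, conv_kernel) + dense_b

print(np.allclose(dense_out, conv_out))  # prints True
```

Because the flattening is row-major over (height, width, channels), a plain reshape of the dense kernel lines the weights up with the right pixels and channels.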

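My actual network is slightly different: it has a GlobalAveragePooling2D() layer instead of MaxPooling2D() and Flatten(). The output of GlobalAveragePooling2D() is a 2D tensor, so there is no need to flatten it any further, and all the dense layers (including the first one) would be replaced by 1x1 convolutions.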
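The 1x1 case can be sketched the same way. This NumPy example assumes channels_last data and toy sizes; the pooled vector is kept as a 1x1 spatial map so that the dense kernel can act as a 1x1 convolution. The names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)

C, N = 6, 3                                   # toy channel and unit counts
feature_map = rng.standard_normal((4, 5, C))  # (height, width, channels)
dense_W = rng.standard_normal((C, N))         # Dense kernel: (inputs, units)
dense_b = rng.standard_normal(N)

# Global average pooling: mean over the spatial axes gives a vector of length C.
pooled = feature_map.mean(axis=(0, 1))
dense_out = pooled @ dense_W + dense_b

# Keep the spatial axes as a 1x1 map and view the Dense kernel as a
# 1x1 convolution kernel of shape (1, 1, C, N).
pooled_map = pooled.reshape(1, 1, C)
conv_kernel = dense_W.reshape(1, 1, C, N)

# A 1x1 convolution is a matrix product over the channel axis at every position.
conv_out = pooled_map @ conv_kernel[0, 0] + dense_b

print(conv_out.shape)                        # prints (1, 1, 3)
print(np.allclose(conv_out[0, 0], dense_out))  # prints True
```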

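I have seen this question: Python keras how to transform a dense layer into a convolutional layer, which looks very similar, if not identical. The problem is that I cannot get the suggested solution to work, because (a) I am using TensorFlow as the backend, so the weight rearrangement / filter "rotation" is not right, and (b) I cannot figure out how to load the weights. Loading the old weights file into the new network with model.load_weights(by_name=True) does not work, because the names do not match (and even if they did, the dimensions differ).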

What should the rearrangement be when using TensorFlow?

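How do I load the weights? Do I create one of each model, call model.load_weights() on both to load the identical weights, and then copy over the extra weights that need rearranging?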

Best answer

Building on hars's answer, I created the following function, which converts an arbitrary CNN into an FCN:

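from keras.models import Sequential
from keras.layers import Conv2D, Dense, Flatten, InputLayer
import keras

def to_fully_conv(model):
    """Rebuild `model` with every Dense layer replaced by an equivalent Conv2D.

    Assumes channels_last data, a purely sequential layer graph,
    and Dense layers that have a bias.
    """
    new_model = Sequential()

    # Leave height and width open so the network accepts images of any size.
    input_layer = InputLayer(input_shape=(None, None, 3), name="input_new")
    new_model.add(input_layer)

    flattened_ipt = False  # True right after a Flatten layer has been seen
    f_dim = None           # input shape of that Flatten layer: (batch, h, w, c)

    for layer in model.layers:

        if isinstance(layer, InputLayer):
            # new_model already has its own input layer.
            continue

        if isinstance(layer, Flatten):
            # Drop Flatten, but remember the feature-map shape it received.
            flattened_ipt = True
            f_dim = layer.input_shape
            continue

        if isinstance(layer, Dense):
            W, b = layer.get_weights()
            output_dim = b.shape[0]

            if flattened_ipt:
                # First Dense after Flatten: the kernel covers the whole feature map.
                kernel_size = (f_dim[1], f_dim[2])
                in_channels = f_dim[3]
                flattened_ipt = False
            else:
                # Any other Dense layer becomes a 1x1 convolution.
                kernel_size = (1, 1)
                in_channels = W.shape[0]

            # With channels_last data a plain reshape is enough:
            # no transposing or filter rotation is needed.
            new_W = W.reshape(kernel_size + (in_channels, output_dim))
            new_layer = Conv2D(output_dim,
                               kernel_size,
                               strides=(1, 1),
                               activation=layer.activation,
                               padding='valid',
                               weights=[new_W, b])
        else:
            # All other layers are reused unchanged, weights included.
            new_layer = layer

        new_model.add(new_layer)

    return new_model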

You can test the function like this:

model = keras.applications.vgg16.VGG16()
new_model = to_fully_conv(model)
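What such a converted layer computes on a feature map larger than the one the Dense layer was trained on can be illustrated without Keras. The sketch below is plain NumPy under assumed toy sizes and channels_last data; `dense_as_valid_conv` is a hypothetical helper written for this illustration, not part of Keras or of the function above.

```python
import numpy as np

def dense_as_valid_conv(feature_map, dense_W, dense_b, window):
    """Apply Dense weights as a stride-1 'valid' convolution (illustrative helper)."""
    kh, kw = window
    height, width, channels = feature_map.shape
    kernel = dense_W.reshape(kh, kw, channels, -1)
    out = np.empty((height - kh + 1, width - kw + 1, kernel.shape[-1]))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = feature_map[i:i + kh, j:j + kw, :]
            out[i, j] = np.einsum("hwc,hwcn->n", patch, kernel) + dense_b
    return out

rng = np.random.default_rng(2)
kh, kw, C, N = 2, 2, 3, 4                        # toy window, channels, units
dense_W = rng.standard_normal((kh * kw * C, N))  # Dense kernel: (inputs, units)
dense_b = rng.standard_normal(N)

# A feature map larger than the 2x2 window the Dense layer expects.
big_map = rng.standard_normal((4, 5, C))
scores = dense_as_valid_conv(big_map, dense_W, dense_b, (kh, kw))
print(scores.shape)  # prints (3, 4, 4)

# Each output position equals the Dense layer applied to the flattened window there.
patch = big_map[1:1 + kh, 2:2 + kw, :]
dense_on_patch = patch.reshape(-1) @ dense_W + dense_b
print(np.allclose(scores[1, 2], dense_on_patch))  # prints True
```

Instead of a single vector of scores, the layer yields a spatial map with one score vector per window position, which is why the new input layer can leave height and width unspecified.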