问题描述
考虑到预定义的Keras模型,我试图首先加载预先训练的权重,然后删除内部(非最后几个)模型中的一到三个模型,然后将其替换为另一层.
Given a predefined Keras model, I am trying to first load in pre-trained weights, then remove one to three of the models internal (non-last few) layers, and then replace it with another layer.
我似乎在 keras.io 上找不到任何有关做这种事情或从中删除图层的文档.完全没有预定义的模型.
I can't seem to find any documentation on keras.io about to do such a thing or remove layers from a predefined model at all.
我正在使用的模型是一个良好的ole VGG-16网络,该网络在一个函数中实例化,如下所示:
The model I am using is a good ole VGG-16 network which is instantiated in a function as shown below:
def model(self, output_shape):
# Prepare image for input to model
img_input = Input(shape=self._input_shape)
# Block 1
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)
# Block 2
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)
# Block 3
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)
# Block 4
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)
# Block 5
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)
# Classification block
x = Flatten(name='flatten')(x)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dropout(0.5)(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dropout(0.5)(x)
x = Dense(output_shape, activation='softmax', name='predictions')(x)
inputs = img_input
# Create model.
model = Model(inputs, x, name=self._name)
return model
因此,举例来说,在将原始权重加载到所有其他层之后,我想在第1块中使用两个Conv层,并仅用一个Conv层替换它们.
So as an example, I'd like to take the two Conv layers in Block 1 and replace them with just one Conv layer, after loading the original weights into all of the other layers.
有什么想法吗?
推荐答案
假定您有模型vgg16_model
,可以通过上面的函数或keras.applications.VGG16(weights='imagenet')
进行初始化.现在,您需要在中间插入一个新层,以节省其他层的权重.
Assuming that you have a model vgg16_model
, initialized either by your function above or by keras.applications.VGG16(weights='imagenet')
. Now, you need to insert a new layer in the middle in such a way that the weights of other layers will be saved.
这个想法是将整个网络分解为不同的层,然后再组装回去.这是专门用于您的任务的代码:
The idea is to disassemble the whole network to separate layers, then assemble it back. Here is the code specifically for your task:
vgg_model = applications.VGG16(include_top=True, weights='imagenet')
# Disassemble layers
layers = [l for l in vgg_model.layers]
# Defining new convolutional layer.
# Important: the number of filters should be the same!
# Note: the receiptive field of two 3x3 convolutions is 5x5.
new_conv = Conv2D(filters=64,
kernel_size=(5, 5),
name='new_conv',
padding='same')(layers[0].output)
# Now stack everything back
# Note: If you are going to fine tune the model, do not forget to
# mark other layers as un-trainable
x = new_conv
for i in range(3, len(layers)):
layers[i].trainable = False
x = layers[i](x)
# Final touch
result_model = Model(input=layer[0].input, output=x)
result_model.summary()
以上代码的输出为:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_50 (InputLayer) (None, 224, 224, 3) 0
_________________________________________________________________
new_conv (Conv2D) (None, 224, 224, 64) 1792
_________________________________________________________________
block1_pool (MaxPooling2D) (None, 112, 112, 64) 0
_________________________________________________________________
block2_conv1 (Conv2D) (None, 112, 112, 128) 73856
_________________________________________________________________
block2_conv2 (Conv2D) (None, 112, 112, 128) 147584
_________________________________________________________________
block2_pool (MaxPooling2D) (None, 56, 56, 128) 0
_________________________________________________________________
block3_conv1 (Conv2D) (None, 56, 56, 256) 295168
_________________________________________________________________
block3_conv2 (Conv2D) (None, 56, 56, 256) 590080
_________________________________________________________________
block3_conv3 (Conv2D) (None, 56, 56, 256) 590080
_________________________________________________________________
block3_pool (MaxPooling2D) (None, 28, 28, 256) 0
_________________________________________________________________
block4_conv1 (Conv2D) (None, 28, 28, 512) 1180160
_________________________________________________________________
block4_conv2 (Conv2D) (None, 28, 28, 512) 2359808
_________________________________________________________________
block4_conv3 (Conv2D) (None, 28, 28, 512) 2359808
_________________________________________________________________
block4_pool (MaxPooling2D) (None, 14, 14, 512) 0
_________________________________________________________________
block5_conv1 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_conv2 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_conv3 (Conv2D) (None, 14, 14, 512) 2359808
_________________________________________________________________
block5_pool (MaxPooling2D) (None, 7, 7, 512) 0
_________________________________________________________________
flatten (Flatten) (None, 25088) 0
_________________________________________________________________
fc1 (Dense) (None, 4096) 102764544
_________________________________________________________________
fc2 (Dense) (None, 4096) 16781312
_________________________________________________________________
predictions (Dense) (None, 1000) 4097000
=================================================================
Total params: 138,320,616
Trainable params: 1,792
Non-trainable params: 138,318,824
_________________________________________________________________
这篇关于在Keras模型中删除然后插入新的中间层的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!