




I am using the VGG16 architecture within Keras, which I have retrained to fit my needs in the following way:

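import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

vgg16_model = keras.applications.vgg16.VGG16()
model = Sequential()
for layer in vgg16_model.layers:
    model.add(layer)

model.layers.pop()

for layer in model.layers:
    layer.trainable = False

model.add(Dense(3, activation='softmax'))

model.compile(Adam(lr=.0001), loss='categorical_crossentropy', metrics=['accuracy'])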


Next I train the model and then save the entire model as suggested in the Keras documentation:

from keras.models import load_model

model.save('my_model_vgg16.h5')  # creates an HDF5 file


model = load_model('my_model_vgg16.h5')


Using the trained model in a Jupyter notebook works like a charm. However, when I try to load the saved model after restarting the kernel, I get the following error:

ValueError: Dimension 0 in both shapes must be equal, but are 4096 and 1000 for 'Assign_30' (op: 'Assign') with input shapes: [4096,3], [1000,3].


I can't figure out why this error occurs since I am neither changing the input nor the output size of the model / layers during saving and loading.


For testing purposes I tried a much simpler sequential model, built from scratch, in the same pipeline (i.e. the same saving and loading procedure), and it gives me no error. Hence I wonder whether I am missing something when saving and loading a pretrained model.


For reference, the entire console error-log looks like this:

The problem is with the line model.layers.pop(). When you pop a layer directly from the list model.layers, the topology of the model is not updated accordingly. Every subsequent operation is therefore built on an incorrect model definition.


Specifically, when you add a layer with model.add(layer), the list model.outputs is updated to be the output tensor of that layer. You can find the following lines in the source code of Sequential.add():

        output_tensor = layer(self.outputs[0])
        # ... skipping irrelevant lines
        self.outputs = [output_tensor]


When you call model.layers.pop(), however, model.outputs is not updated accordingly. As a result, the next added layer will be called with a wrong input tensor (because self.outputs[0] is still the output tensor of the removed layer).
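The bookkeeping described above can be imitated without Keras. The following is only a toy sketch: ToyLayer and ToySequential are hypothetical stand-ins invented for illustration, output_dim plays the role of self.outputs[0], and the input width 25088 is an assumed placeholder. It shows that popping from the layer list leaves the recorded output untouched, so the next added layer is wired to the removed layer's width.

```python
class ToyLayer:
    """Hypothetical stand-in for a Dense layer: tracks a name and an output width."""

    def __init__(self, name, units):
        self.name = name
        self.units = units
        self.kernel_shape = None  # filled in when the layer is connected


class ToySequential:
    """Imitates only the add()/outputs bookkeeping; this is not real Keras."""

    def __init__(self, input_dim):
        self.layers = []
        self.output_dim = input_dim  # plays the role of self.outputs[0]

    def add(self, layer):
        # The new layer is wired to whatever the model believes its output is.
        layer.kernel_shape = (self.output_dim, layer.units)
        self.layers.append(layer)
        self.output_dim = layer.units


model = ToySequential(input_dim=25088)  # assumed width of the flattened features
model.add(ToyLayer("fc1", 4096))
model.add(ToyLayer("fc2", 4096))
model.add(ToyLayer("predictions", 1000))

model.layers.pop()  # removes the layer from the list only; output_dim stays 1000

new_layer = ToyLayer("dense_1", 3)
model.add(new_layer)

print([layer.name for layer in model.layers])  # ['fc1', 'fc2', 'dense_1']
print(new_layer.kernel_shape)                  # (1000, 3)
```

The layer list looks as intended, yet the new layer's kernel was sized against the 1000-unit output of the layer that is no longer listed.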


This can be demonstrated by the following lines:

model = Sequential()
for layer in vgg16_model.layers:
    model.add(layer)

model.layers.pop()
model.add(Dense(3, activation='softmax'))

print(model.layers[-1].input)
# => Tensor("predictions_1/Softmax:0", shape=(?, 1000), dtype=float32)
# the new layer is called on a wrong input tensor

print(model.layers[-1].kernel)
# => <tf.Variable 'dense_1/kernel:0' shape=(1000, 3) dtype=float32_ref>
# the kernel shape is also wrong


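The incorrect kernel shape is why you're seeing an error about incompatible shapes [4096,3] versus [1000,3]. The saved weights contain the (1000, 3) kernel, while the model rebuilt on loading follows the saved layer list, which no longer includes the popped layer, so the new Dense layer is connected to the 4096-unit layer and expects a (4096, 3) kernel.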
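The shape arithmetic behind the error message can be checked in plain Python. This is a sketch under the assumption that loading rebuilds the model by chaining the stored layers in order, which is what the [4096,3] versus [1000,3] message suggests. The helper kernel_shapes and the input width 25088 are hypothetical names and values used only for illustration.

```python
def kernel_shapes(input_dim, layer_units):
    """Return the kernel shape each dense layer gets when chained in order."""
    shapes = []
    for units in layer_units:
        shapes.append((input_dim, units))
        input_dim = units
    return shapes


# Kernel the new layer received at build time, when it was wired to the
# 1000-unit output of the popped layer. This is what ends up in the file.
saved_kernel = (1000, 3)

# Assumed rebuild on load: the layer list no longer contains the popped layer,
# so the new layer follows the second 4096-unit layer directly.
rebuilt_kernel = kernel_shapes(25088, [4096, 4096, 3])[-1]

print(rebuilt_kernel)                  # (4096, 3)
print(saved_kernel == rebuilt_kernel)  # False
```

The two shapes differ in dimension 0, 4096 against 1000, which matches the numbers reported in the ValueError.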


To solve the problem, simply don't add the last layer to the Sequential model.

model = Sequential()
for layer in vgg16_model.layers[:-1]:
    model.add(layer)

