Problem description
I am using the VGG16 architecture in Keras, which I have retrained to fit my needs in the following way:
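import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

vgg16_model = keras.applications.vgg16.VGG16()
model = Sequential()
for layer in vgg16_model.layers:
    model.add(layer)
model.layers.pop()  # intended to drop the 1000-class 'predictions' layer
for layer in model.layers:
    layer.trainable = False
model.add(Dense(3, activation='softmax'))
model.compile(Adam(lr=.0001), loss='categorical_crossentropy', metrics=['accuracy'])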
Next, I train the model and then save the entire model the way the Keras documentation suggests:
from keras.models import load_model
model.save('my_model_vgg16.h5') # creates a HDF5 file
I then load the model like this:
model = load_model('my_model_vgg16.h5')
Using the trained model in a Jupyter notebook works like a charm. However, when I try to load the saved model after restarting the kernel, I get the following error:
ValueError: Dimension 0 in both shapes must be equal, but are 4096 and 1000 for 'Assign_30' (op: 'Assign') with input shapes: [4096,3], [1000,3].
I can't figure out why this error occurs, since I change neither the input nor the output size of the model or its layers between saving and loading.
For testing purposes, I tried a much simpler Sequential model that I built from scratch in the same pipeline (i.e. the same saving and loading procedures), and it gives me no error. Hence I wonder whether I am missing something when saving and loading a pretrained model.
For reference, the entire console error log looks like this:
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\common_shapes.py in _call_cpp_shape_fn_impl(op, input_tensors_needed, input_tensors_as_shapes_needed, require_shape_fn)
685 graph_def_version, node_def_str, input_shapes, input_tensors,
--> 686 input_tensors_as_shapes, status)
687 except errors.InvalidArgumentError as err:
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
472 compat.as_text(c_api.TF_Message(self.status.status)),
--> 473 c_api.TF_GetCode(self.status.status))
474 # Delete the underlying status object from memory otherwise it stays alive
InvalidArgumentError: Dimension 0 in both shapes must be equal, but are 4096 and 1000 for 'Assign_30' (op: 'Assign') with input shapes: [4096,3], [1000,3].
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
<ipython-input-5-a2d2e98db4b6> in <module>()
1 from keras.models import load_model
----> 2 loaded_model = load_model('my_model_vgg16.h5')
3 print("Loaded Model from disk")
4
5 #compile and evaluate loaded model
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\keras\models.py in load_model(filepath, custom_objects, compile)
244
245 # set weights
--> 246 topology.load_weights_from_hdf5_group(f['model_weights'], model.layers)
247
248 # Early return if compilation is not required.
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\keras\engine\topology.py in load_weights_from_hdf5_group(f, layers)
3164 ' elements.')
3165 weight_value_tuples += zip(symbolic_weights, weight_values)
-> 3166 K.batch_set_value(weight_value_tuples)
3167
3168
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\keras\backend\tensorflow_backend.py in batch_set_value(tuples)
2363 assign_placeholder = tf.placeholder(tf_dtype,
2364 shape=value.shape)
-> 2365 assign_op = x.assign(assign_placeholder)
2366 x._assign_placeholder = assign_placeholder
2367 x._assign_op = assign_op
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\ops\variables.py in assign(self, value, use_locking)
571 the assignment has completed.
572 """
--> 573 return state_ops.assign(self._variable, value, use_locking=use_locking)
574
575 def assign_add(self, delta, use_locking=False):
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\ops\state_ops.py in assign(ref, value, validate_shape, use_locking, name)
274 return gen_state_ops.assign(
275 ref, value, use_locking=use_locking, name=name,
--> 276 validate_shape=validate_shape)
277 return ref.assign(value)
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\ops\gen_state_ops.py in assign(ref, value, validate_shape, use_locking, name)
54 _, _, _op = _op_def_lib._apply_op_helper(
55 "Assign", ref=ref, value=value, validate_shape=validate_shape,
---> 56 use_locking=use_locking, name=name)
57 _result = _op.outputs[:]
58 _inputs_flat = _op.inputs
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\op_def_library.py in _apply_op_helper(self, op_type_name, name, **keywords)
785 op = g.create_op(op_type_name, inputs, output_types, name=scope,
786 input_types=input_types, attrs=attr_protos,
--> 787 op_def=op_def)
788 return output_structure, op_def.is_stateful, op
789
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\ops.py in create_op(self, op_type, inputs, dtypes, input_types, name, attrs, op_def, compute_shapes, compute_device)
2956 op_def=op_def)
2957 if compute_shapes:
-> 2958 set_shapes_for_outputs(ret)
2959 self._add_op(ret)
2960 self._record_op_seen_by_control_dependencies(ret)
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\ops.py in set_shapes_for_outputs(op)
2207 shape_func = _call_cpp_shape_fn_and_require_op
2208
-> 2209 shapes = shape_func(op)
2210 if shapes is None:
2211 raise RuntimeError(
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\ops.py in call_with_requiring(op)
2157
2158 def call_with_requiring(op):
-> 2159 return call_cpp_shape_fn(op, require_shape_fn=True)
2160
2161 _call_cpp_shape_fn_and_require_op = call_with_requiring
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\common_shapes.py in call_cpp_shape_fn(op, require_shape_fn)
625 res = _call_cpp_shape_fn_impl(op, input_tensors_needed,
626 input_tensors_as_shapes_needed,
--> 627 require_shape_fn)
628 if not isinstance(res, dict):
629 # Handles the case where _call_cpp_shape_fn_impl calls unknown_shape(op).
~\Anaconda2\envs\tensorflow-gpu\lib\site-packages\tensorflow\python\framework\common_shapes.py in _call_cpp_shape_fn_impl(op, input_tensors_needed, input_tensors_as_shapes_needed, require_shape_fn)
689 missing_shape_fn = True
690 else:
--> 691 raise ValueError(err.message)
692
693 if missing_shape_fn:
ValueError: Dimension 0 in both shapes must be equal, but are 4096 and 1000 for 'Assign_30' (op: 'Assign') with input shapes: [4096,3], [1000,3].
Recommended answer
The problem is the line model.layers.pop(). When you pop a layer directly from the list model.layers, the topology of the model is not updated accordingly. Every subsequent operation then works on a wrong model definition.
Specifically, when you add a layer with model.add(layer), the list model.outputs is updated to hold the output tensor of that layer. You can find the following lines in the source code of Sequential.add():
output_tensor = layer(self.outputs[0])
# ... skipping irrelevant lines
self.outputs = [output_tensor]
When you call model.layers.pop(), however, model.outputs is not updated. As a result, the next added layer is called with the wrong input tensor, because self.outputs[0] is still the output tensor of the removed layer.
This can be demonstrated with the following lines:
model = Sequential()
for layer in vgg16_model.layers:
    model.add(layer)
model.layers.pop()
model.add(Dense(3, activation='softmax'))
print(model.layers[-1].input)
# => Tensor("predictions_1/Softmax:0", shape=(?, 1000), dtype=float32)
# the new layer is called on a wrong input tensor
print(model.layers[-1].kernel)
# => <tf.Variable 'dense_1/kernel:0' shape=(1000, 3) dtype=float32_ref>
# the kernel shape is also wrong
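The same bookkeeping failure can be reproduced without Keras. The sketch below is a toy model, not Keras code: ToyLayer and ToySequential are hypothetical classes that only mimic the two pieces of state discussed above (the layers list and the outputs list), and each "tensor" is represented by nothing more than its width. The widths 25088, 4096 and 1000 correspond to the last dense layers of VGG16.

```python
class ToyLayer:
    """Hypothetical stand-in for a dense layer: remembers the width it was called on."""

    def __init__(self, name, units):
        self.name = name
        self.units = units
        self.input_dim = None  # set when the layer is wired into a model

    def __call__(self, input_dim):
        self.input_dim = input_dim
        return self.units  # the "output tensor" is just its width here


class ToySequential:
    """Hypothetical stand-in for Sequential: keeps `layers` and `outputs` separately."""

    def __init__(self, input_dim):
        self.layers = []
        self.outputs = [input_dim]

    def add(self, layer):
        # Same pattern as the lines quoted from Sequential.add()
        output = layer(self.outputs[0])
        self.layers.append(layer)
        self.outputs = [output]


model = ToySequential(input_dim=25088)
model.add(ToyLayer("fc1", 4096))
model.add(ToyLayer("fc2", 4096))
model.add(ToyLayer("predictions", 1000))

model.layers.pop()  # removes the list entry only; model.outputs is untouched

new_head = ToyLayer("dense_1", 3)
model.add(new_head)

print([layer.name for layer in model.layers])  # ['fc1', 'fc2', 'dense_1']
print(new_head.input_dim)                      # 1000
```

The layer list looks right after the pop, which is why the problem is easy to miss, but the new head was still wired to the 1000-wide output of the removed layer.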
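The incorrect kernel shape is why you're seeing an error about the incompatible shapes [4096,3] and [1000,3]. The kernel saved to the file has shape (1000, 3), but the model rebuilt at load time no longer contains the popped layer, so its new Dense layer follows the 4096-unit layer and expects a (4096, 3) kernel.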
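The arithmetic behind the two shapes can be sketched in a few lines. This is an illustration only: kernel_shapes is a hypothetical helper, not part of Keras, and it assumes that load_model rebuilds the network by chaining the layers stored in the file in order, which is what the error message suggests.

```python
def kernel_shapes(layer_units, input_dim):
    """Return the (input_dim, units) kernel shape of each dense layer in a plain chain."""
    shapes = []
    for units in layer_units:
        shapes.append((input_dim, units))
        input_dim = units  # the next layer consumes this layer's output
    return shapes


# Dense layers left in model.layers after pop() + add(): fc1, fc2, new 3-class head
layers_in_file = [4096, 4096, 3]
rebuilt = kernel_shapes(layers_in_file, input_dim=25088)

# Kernel of the head as it was created in the original session,
# where it was still wired to the 1000-wide 'predictions' output
saved_head_kernel = (1000, 3)

print(rebuilt[-1])                       # (4096, 3)
print(rebuilt[-1] == saved_head_kernel)  # False
```

The rebuilt head expects a (4096, 3) kernel while the file holds a (1000, 3) one, matching the two shapes in the error.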
To solve the problem, simply don't add the last layer to the Sequential model in the first place:
model = Sequential()
for layer in vgg16_model.layers[:-1]:
    model.add(layer)