This article explains how to resolve a Caffe shape-mismatch error when copying weights from a pre-trained VGG-16 model, and may be a useful reference for anyone hitting the same problem.

Problem description

I am using PyCaffe to implement a neural network inspired by the VGG 16 layer network. I want to use the pre-trained model available from their GitHub page. Generally this works by matching layer names.

For my "fc6" layer I have the following definition in my train.prototxt file:

layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  inner_product_param {
    num_output: 4096
  }
}

Here is the prototxt file for the VGG-16 deploy architecture. Note that the "fc6" in their prototxt is identical to mine (except for the learning rate, but that's irrelevant). It's also worth noting that the inputs are all the same size in my model too: 3-channel 224x224px images.

I have been following this tutorial pretty closely, and the block of code that's giving me an issue is the following:

import os.path as osp
import caffe

# model_root points at the directory holding solver.prototxt and the .caffemodel
solver = caffe.SGDSolver(osp.join(model_root, 'solver.prototxt'))
solver.net.copy_from(model_root + 'VGG_ILSVRC_16_layers.caffemodel')
solver.test_nets[0].share_with(solver.net)
solver.step(1)

The first line loads my solver prototxt and then the second line copies the weights from the pre-trained model (VGG_ILSVRC_16_layers.caffemodel). When the solver runs, I get this error:

Cannot copy param 0 weights from layer 'fc6'; shape mismatch.  Source param
shape is 1 1 4096 25088 (102760448); target param shape is 4096 32768 (134217728).
To learn this layer's parameters from scratch rather than copying from a saved
net, rename the layer.

The gist of it is that their model expects the layer to be of size 1x1x4096 while mine is just 4096, but I don't see how I can change this.

I found this answer in the Users Google group instructing me to do net surgery to reshape the pre-trained model before copying, but in order to do that I need the lmdb files from the original architecture's data layers, which I don't have (it throws an error when I try to run the net surgery script).

Recommended answer

The problem is not with 4096 but rather with 25088. You need to calculate the output feature map size of each layer of your network from its input feature map size. Note that the fc layer takes an input of fixed size, so the output of the preceding conv (or pooling) layer must match the input size the fc layer requires. Work out the size of fc6's input feature map, which is simply the output feature map of the layer before it, by propagating the input dimensions through the network with the following formula:

H_out = (H_in + 2 x Padding_Height - Kernel_Height) / Stride_Height + 1
W_out = (W_in + 2 x Padding_Width - Kernel_Width) / Stride_Width + 1
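
To make the 25088 in the error message concrete, here is a minimal sketch (plain Python; the layer list is transcribed by hand from the VGG-16 deploy prototxt, and the helper name out_size is purely illustrative) that propagates a 224x224 input through every conv and pooling layer up to pool5 using the formula above:

def out_size(h_in, kernel, pad, stride):
    # H_out = (H_in + 2 x Padding - Kernel) / Stride + 1, as above
    return (h_in + 2 * pad - kernel) // stride + 1

# (kernel, pad, stride) for every spatial layer of VGG-16 up to pool5:
# all convs are 3x3 / pad 1 / stride 1, all max-pools are 2x2 / stride 2 / pad 0
vgg16_layers = (
    [(3, 1, 1)] * 2 + [(2, 0, 2)] +   # conv1_1, conv1_2, pool1
    [(3, 1, 1)] * 2 + [(2, 0, 2)] +   # conv2_1, conv2_2, pool2
    [(3, 1, 1)] * 3 + [(2, 0, 2)] +   # conv3_1..conv3_3, pool3
    [(3, 1, 1)] * 3 + [(2, 0, 2)] +   # conv4_1..conv4_3, pool4
    [(3, 1, 1)] * 3 + [(2, 0, 2)]     # conv5_1..conv5_3, pool5
)

h = 224                               # input height; the width behaves identically
for kernel, pad, stride in vgg16_layers:
    h = out_size(h, kernel, pad, stride)

print(h, h * h * 512)                 # prints: 7 25088 (512 channels after conv5_3)

With a 224x224 input the pretrained network hands fc6 a 7x7x512 blob, i.e. 512 x 7 x 7 = 25088 values, which is exactly the source shape in the error message. The target shape of 32768 would correspond to a larger spatial output reaching fc6 (for example 8x8x512), so checking each conv and pooling layer of your prototxt against the VGG-16 definition with this formula should show where the sizes diverge.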

This concludes the article on the Caffe shape-mismatch error when using a pre-trained VGG-16 model; hopefully the recommended answer above helps.
