How to feed a network in Caffe with multiple N-D arrays? The question and answer below discuss several ways to do this.

Problem description

I want to create a custom loss layer for semantic segmentation in Caffe that requires multiple inputs. I would like this loss function to have an additional input factor in order to penalize missed detections of small objects.

To do that I have created a ground-truth (GT) image that contains a weight for each pixel. If a pixel belongs to a small object, its weight is high.

I am a newbie in Caffe and I do not know how to feed my net with three 2-D signals at the same time (the image, the GT mask, and the per-pixel weights). I also have doubts about how Caffe maintains the correspondence between the RGB data and the GT data.
I want to extend this so that I have two GTs: one for the class-label image, and another to feed this weighting factor into the loss function.

Can you give me some hints on how to achieve that?

Thanks

Accepted answer

You want Caffe to use several N-D signals for each training sample, and you are concerned that the default "Data" layer can only handle one image per training sample.
There are several solutions for this:

  1. Use several "Data" layers (as was done in the model you linked to). To keep the three "Data" layers in sync, you need to know that Caffe reads samples from the underlying LMDB sequentially. So if you prepare your three LMDBs in the same order, Caffe will read one sample at a time from each LMDB, in the order in which the samples were stored, and the three inputs will stay in sync during training/validation.
     Note that convert_imageset has a 'shuffle' flag; do NOT use it, because it would shuffle the samples differently in each of the three LMDBs and you would lose the sync. You are strongly advised to shuffle the samples yourself before preparing the LMDBs, but in a way that applies the same "shuffle" to all three inputs, leaving them in sync with each other.
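A minimal sketch of shuffling the three inputs with a single shared permutation before building the LMDBs; the file names below are placeholders, not part of the original answer:

```python
import random

# Placeholder file lists standing in for your real dataset entries.
rgb_files = ["img_%03d.png" % i for i in range(5)]
gt_files = ["gt_%03d.png" % i for i in range(5)]
weight_files = ["w_%03d.png" % i for i in range(5)]

# Draw ONE permutation and apply it to all three lists, so that after
# shuffling, entry k still refers to the same sample in every list.
order = list(range(len(rgb_files)))
random.seed(42)  # fixed seed so the run is reproducible
random.shuffle(order)

rgb_files = [rgb_files[i] for i in order]
gt_files = [gt_files[i] for i in order]
weight_files = [weight_files[i] for i in order]

# Sanity check: sample indices still match across the three lists.
for r, g, w in zip(rgb_files, gt_files, weight_files):
    assert r[4:] == g[3:] == w[2:]  # same "%03d.png" suffix
```

Each shuffled list is then fed to its own convert_imageset run (without the 'shuffle' flag), producing three LMDBs whose records line up sample-by-sample.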

  2. Use a 5-channel input. Caffe can store N-D data in LMDB, not only color/gray images. You can use Python to create an LMDB in which each "image" is a 5-channel array: the first three channels are the image's RGB, and the last two are the ground-truth labels and the per-pixel loss weights.
     In your model you only need to add a "Slice" layer on top of your "Data" layer:

layer {
  name: "slice_input"
  type: "Slice"
  bottom: "raw_input" # 5-channel "image" stored in LMDB
  top: "rgb"
  top: "gt"
  top: "weight"
  slice_param {
    axis: 1
    slice_point: 3
    slice_point: 4
  }
}
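The 5-channel array itself can be assembled with NumPy before being serialized into the LMDB; the shapes below are illustrative dummy data, not from the original answer:

```python
import numpy as np

H, W = 4, 6  # illustrative image size

# Dummy inputs standing in for a real color image, a label mask,
# and a per-pixel weight map (higher weight on small objects).
rgb = np.random.rand(3, H, W).astype(np.float32)  # channels-first, as Caffe expects
gt = np.random.randint(0, 2, (1, H, W)).astype(np.float32)
weight = np.ones((1, H, W), dtype=np.float32)

# Stack along the channel axis: channels 0-2 are RGB, channel 3 is the
# label, channel 4 is the weight -- matching slice_point 3 and 4 above.
sample = np.concatenate([rgb, gt, weight], axis=0)
assert sample.shape == (5, H, W)
```

Each such array would then be converted to a Datum (e.g. via caffe.io.array_to_datum) and written to the LMDB.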

  3. Use an "HDF5Data" layer (my personal favorite). You can store your inputs in the binary HDF5 format and have Caffe read from these files. Using "HDF5Data" is much more flexible in Caffe and allows you to shape the inputs as you like. In your case you need to prepare a binary HDF5 file with three "datasets": 'rgb', 'gt' and 'weight'. You need to make sure the samples are synced when you create the HDF5 file(s). Once you have them ready, you can add an "HDF5Data" layer with three "top"s ready to be used.
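A minimal sketch of building such an HDF5 file with h5py; the sample counts, shapes, and file names are placeholder assumptions:

```python
import h5py
import numpy as np

N, H, W = 2, 4, 6  # illustrative: 2 samples of a 4x6 image

# Dummy data standing in for real images, labels, and weight maps.
rgb = np.random.rand(N, 3, H, W).astype(np.float32)
gt = np.random.randint(0, 2, (N, 1, H, W)).astype(np.float32)
weight = np.ones((N, 1, H, W), dtype=np.float32)

# Write the three synced datasets into one HDF5 file. The dataset names
# must match the "top" names of the "HDF5Data" layer in the prototxt.
with h5py.File("train_data.h5", "w") as f:
    f.create_dataset("rgb", data=rgb)
    f.create_dataset("gt", data=gt)
    f.create_dataset("weight", data=weight)

# Caffe's "HDF5Data" layer reads a text file listing the .h5 files,
# referenced by hdf5_data_param { source: ... } in the prototxt.
with open("train_h5_list.txt", "w") as f:
    f.write("train_data.h5\n")
```

Because index i of 'rgb', 'gt' and 'weight' refers to the same sample, the three tops stay in sync automatically.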

  4. Write your own "Python" input layer. I will not go into the details here, but you can implement your own input layer in Python. See this thread for more details.
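For reference, a "Python" layer is declared in the prototxt roughly as follows; the module name, class name, and param_str contents are placeholders for your own implementation, not part of the original answer:

```
layer {
  name: "input_data"
  type: "Python"
  top: "rgb"
  top: "gt"
  top: "weight"
  python_param {
    module: "my_data_layer"       # your .py file (placeholder name)
    layer: "MultiInputDataLayer"  # class in that module (placeholder name)
    param_str: '{"source": "train_list.txt"}'  # optional args passed to setup()
  }
}
```

The class itself subclasses caffe.Layer and implements setup, reshape, forward, and backward, filling the three tops with synced data each iteration.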

