How do I train/test on my own dataset in Caffe?


Problem description


I am getting started with Caffe, and the MNIST example runs well.
My training data and labels are stored in data.mat: 300 training samples with 30 features each, and labels in {-1, +1}.

However, I don't quite understand how to use Caffe with my own dataset.

Is there a step-by-step tutorial that covers this?

Many thanks! Any advice would be appreciated.

Solution

I think the most straightforward way to transfer data from Matlab to Caffe is via an HDF5 file.

First, save your data in Matlab to an HDF5 file using hdf5write. I assume your training data is stored in a variable named X of size 300-by-30, and the labels are stored in y, a 300-by-1 vector:

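% Write the features; "..." continues a statement onto the next line in Matlab
hdf5write('my_data.h5', '/X', ...
  single( permute(reshape(X,[300, 30, 1, 1]),[4:-1:1]) ) );
% Append the labels to the same file
hdf5write('my_data.h5', '/label', ...
  single( permute(reshape(y,[300, 1, 1, 1]),[4:-1:1]) ), ...
  'WriteMode', 'append' );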

Note that the data is saved as a 4D array: the first dimension is the number of samples (300 feature vectors), the second is the feature dimension (30), and the last two are 1 (there are no spatial dimensions). Also note that the datasets in the HDF5 file are named "X" and "label"; these names must be used as the "top" blobs of the input data layer.

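Why permute? Matlab stores arrays in column-major order, while Caffe reads the HDF5 data in row-major (C) order, so the dimension order has to be reversed before writing. Please see this answer for an explanation.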
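To make the layout issue concrete, here is a small pure-Python sketch (not Matlab and not part of Caffe) of what a row-major reader sees when it opens a buffer written in column-major order. The helper names are hypothetical, and a 2-by-3 matrix stands in for the 300-by-30 data.

```python
def flatten_column_major(matrix):
    """Flatten a list-of-rows matrix in column-major order (Matlab's memory layout)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    return [matrix[r][c] for c in range(n_cols) for r in range(n_rows)]


def reshape_row_major(buffer, n_rows, n_cols):
    """Interpret a flat buffer as an n_rows-by-n_cols matrix in row-major (C) order."""
    return [buffer[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


def transpose(matrix):
    """2D analogue of permute(..., [4:-1:1]): reverse the dimension order."""
    return [list(col) for col in zip(*matrix)]


def seen_by_row_major_reader(matrix):
    """Same buffer, but with the dimension order reversed, as a C reader sees it."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    return reshape_row_major(flatten_column_major(matrix), n_cols, n_rows)


X = [[1, 2, 3],
     [4, 5, 6]]  # 2 samples, 3 features

# Written as-is, the reader gets features-by-samples instead of samples-by-features.
seen_without_permute = seen_by_row_major_reader(X)
# Reversing the dimensions first makes the reader recover the original layout.
seen_with_permute = seen_by_row_major_reader(transpose(X))

print(seen_without_permute)
print(seen_with_permute)
```

In this toy case the un-permuted matrix arrives transposed, while the permuted one arrives as samples-by-features, which is the layout the data layer needs.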

You also need to prepare a text file listing the names of all HDF5 files you are using (in your case, only my_data.h5). The file /path/to/list/file.txt should contain a single line giving the path to my_data.h5.
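The list file is plain text with one HDF5 path per line, so any editor will do. If you prefer to generate it, a minimal Python sketch could look like this; the function name and the file locations are placeholders rather than anything Caffe prescribes, and using an absolute path to the .h5 file avoids depending on the directory Caffe is launched from.

```python
from pathlib import Path


def write_hdf5_list(list_path, hdf5_paths):
    """Write one HDF5 file path per line to the list file and return its Path."""
    list_path = Path(list_path)
    list_path.write_text("".join(f"{p}\n" for p in hdf5_paths))
    return list_path


# Hypothetical locations: adjust to where my_data.h5 and the list file really live.
list_file = write_hdf5_list("hdf5_list.txt", ["my_data.h5"])
print(list_file.read_text(), end="")
```

With several HDF5 files, pass them all in the list and each one gets its own line.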

Now you can add an input data layer to your train_val.prototxt:

layer {
  type: "HDF5Data"
  name: "data"
  top: "X"     # note: same name as in HDF5
  top: "label" # note: same name as in HDF5
  hdf5_data_param {
    source: "/path/to/list/file.txt"
    batch_size: 20
  }
  include { phase: TRAIN }
}

For more information about the HDF5 input layer, see this answer.
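One detail worth checking before training, which the answer above does not cover: the labels in the question are -1/+1, whereas Caffe's classification losses such as SoftmaxWithLoss interpret each label as a class index starting at 0. Assuming that is the kind of loss you intend to use, the labels could be remapped before they are written to HDF5. The sketch below shows the mapping in Python; the function name is hypothetical.

```python
def to_class_indices(labels):
    """Map labels in {-1, +1} to class indices {0, 1}; reject anything else."""
    mapping = {-1: 0, 1: 1}
    try:
        return [mapping[int(v)] for v in labels]
    except KeyError as err:
        raise ValueError(f"unexpected label: {err.args[0]}") from None


y = [-1, 1, 1, -1]
y01 = to_class_indices(y)
print(y01)
```

In Matlab the same mapping is (y + 1) / 2, applied to y before the hdf5write call.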
