How to train/test my own dataset in Caffe?

Problem description

I am just getting started with Caffe, and the MNIST example ran well.
My training data and labels are stored in data.mat: 300 training samples with 30 features each, and labels in {-1, +1}.

However, I don't quite understand how to use Caffe with my own dataset.

Is there a step-by-step tutorial that can teach me?

Many thanks! Any advice would be appreciated!

Solution

I think the most straightforward way to transfer data from Matlab to Caffe is via an HDF5 file.

First, save your data in Matlab in an HDF5 file using hdf5write. I assume your training data is stored in a variable named X of size 300-by-30 and the labels are stored in y, a 300-by-1 vector:

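% Each call spans several lines, so Matlab needs the "..." continuation marker.
% permute reverses the dimension order before writing.
hdf5write('my_data.h5', '/X', ...
  single( permute(reshape(X,[300, 30, 1, 1]),[4:-1:1]) ) );
% 'append' adds the labels to the same file instead of overwriting it.
hdf5write('my_data.h5', '/label', ...
  single( permute(reshape(y,[300, 1, 1, 1]),[4:-1:1]) ), ...
  'WriteMode', 'append' );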

Note that the data is saved as a 4D array: the first dimension is the number of samples (feature vectors), the second is the feature dimension, and the last two are 1 (meaning there are no spatial dimensions). Also note that the names given to the data in the HDF5 file are "X" and "label" - these names should be used as the "top" blobs of the input data layer.
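hdf5write is marked as not recommended in newer Matlab releases, so here is a minimal sketch of the same step with h5create and h5write. I assume X and y are already in the workspace; the output file name my_data_h5write.h5 is hypothetical and was picked only to keep this separate from the file written above.

```matlab
% Assumes X (300-by-30) and y (300-by-1) are already in the workspace.
Xp = single(permute(reshape(X, [300, 30, 1, 1]), [4:-1:1]));  % size 1x1x30x300
yp = single(permute(reshape(y, [300, 1, 1, 1]), [4:-1:1]));   % size 1x1x1x300

% Hypothetical file name, so the file written with hdf5write is not touched.
out_file = 'my_data_h5write.h5';
if exist(out_file, 'file'), delete(out_file); end  % h5create errors if the dataset already exists

% Create each dataset with the size of the permuted array, then fill it.
h5create(out_file, '/X', size(Xp), 'Datatype', 'single');
h5write(out_file, '/X', Xp);
h5create(out_file, '/label', size(yp), 'Datatype', 'single');
h5write(out_file, '/label', yp);
```

The dataset names are still "X" and "label", so the layer definition does not change; only the file listed in the text file would.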

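Why permute? Matlab stores arrays in column-major order, whereas Caffe reads the data in row-major order, so the dimension order has to be reversed. Please see this answer for an explanation.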
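To see what the reshape/permute combination does, here is a small sketch on stand-in data. The random X and y are hypothetical placeholders with the shapes from the question, not real data.

```matlab
% Hypothetical stand-in data with the shapes described in the question.
X = rand(300, 30);                  % 300 samples, 30 features each
y = 2 * (rand(300, 1) > 0.5) - 1;   % labels in {-1, +1}

% Same transformation as in the hdf5write calls.
Xp = permute(reshape(X, [300, 30, 1, 1]), [4:-1:1]);
yp = permute(reshape(y, [300, 1, 1, 1]), [4:-1:1]);

disp(size(Xp));   % 1 1 30 300
disp(size(yp));   % 1 1 1 300

% Feature j of sample i now lives at Xp(1, 1, j, i).
assert(Xp(1, 1, 7, 42) == X(42, 7));
```

In Matlab the array is 1-by-1-by-30-by-300. As far as I understand, a row-major reader such as Caffe sees the dimensions in reverse, i.e. 300-by-30-by-1-by-1, which is the layout described above.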

You also need to prepare a text file listing the names of all HDF5 files you are using (in your case, only my_data.h5). The file /path/to/list/file.txt should have a single line containing the path to my_data.h5.
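The list file can be written by hand, or from Matlab as in the sketch below. The location /path/to/my_data.h5 is a placeholder I made up for illustration; use wherever the HDF5 file actually lives.

```matlab
% Placeholder paths; replace both with the real locations on your machine.
h5_path   = '/path/to/my_data.h5';
list_file = '/path/to/list/file.txt';

fid = fopen(list_file, 'w');
assert(fid ~= -1, 'Could not open %s for writing', list_file);
fprintf(fid, '%s\n', h5_path);   % one HDF5 file path per line
fclose(fid);
```

If you later split the data across several HDF5 files, each one gets its own line in this file.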

Now you can add an input data layer to your train_val.prototxt:

layer {
  type: "HDF5Data"
  name: "data"
  top: "X"     # note: same name as in HDF5
  top: "label" #
  hdf5_data_param {
    source: "/path/to/list/file.txt"
    batch_size: 20
  }
  include { phase: TRAIN }
}

For more information regarding the HDF5 input layer, see this answer.


07-25 11:50