Problem description
I've just started with Caffe, and the MNIST example ran well.
I have my training data and labels saved in data.mat (300 training samples with 30 features each; the labels are -1 or +1).
However, I don't quite understand how to use Caffe with my own dataset.
Is there a step-by-step tutorial that can teach me?
Many thanks! Any advice would be appreciated!
I think the most straightforward way to transfer data from Matlab to Caffe is via an HDF5 file.
First, save your data in Matlab in an HDF5 file using hdf5write. I assume your training data is stored in a variable named X of size 300-by-30 and the labels are stored in y, a 300-by-1 vector:
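% Reverse the dimension order so that Caffe sees the data as 300 x 30 x 1 x 1
hdf5write('my_data.h5', '/X', ...
    single( permute(reshape(X, [300, 30, 1, 1]), 4:-1:1) ) );
% Append the labels to the same file
hdf5write('my_data.h5', '/label', ...
    single( permute(reshape(y, [300, 1, 1, 1]), 4:-1:1) ), ...
    'WriteMode', 'append' );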
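Before moving on to Caffe, it may be worth checking what actually ended up in the file. The following is only a sketch: it assumes a Matlab release that ships the built-in h5disp and h5read functions, that my_data.h5 is in the current folder, and that X and y are still in the workspace.

```matlab
% Sanity check of the file written above (assumes my_data.h5 is in the current folder).
h5disp('my_data.h5');                 % lists the datasets with their sizes and types

% Read both datasets back; Matlab returns them in its own dimension order.
Xr = h5read('my_data.h5', '/X');
yr = h5read('my_data.h5', '/label');

% Undo the permute and compare with the original variables X and y.
assert(isequal(squeeze(Xr).', single(X)), 'X was not stored as expected');
assert(isequal(yr(:), single(y(:))), 'label was not stored as expected');
```

If both assertions pass, the file holds the same values as X and y, stored as single precision.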
Note that the data is saved as a 4D array: the first dimension is the number of samples (feature vectors), the second is the feature dimension, and the last two are 1 (representing no spatial dimensions). Also note that the names given to the data in the HDF5 file are "X" and "label"; these names should be used as the "top" blobs of the input data layer.
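Why permute? Matlab stores arrays in column-major order, whereas Caffe reads the HDF5 data in row-major (C) order, so the order of the dimensions has to be reversed. Please see this answer for an explanation.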
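The effect of reversing the dimensions can be seen on a tiny array. This is only an illustration of the memory layout, with a made-up 2-by-3 matrix A; it does not touch the HDF5 file.

```matlab
% A is 2-by-3; Matlab stores it column by column.
A = reshape(1:6, [2, 3]);     % A = [1 3 5; 2 4 6]
disp(A(:).');                 % memory order of A: 1 2 3 4 5 6

% Reversing the dimension order (here simply a transpose) ...
B = permute(A, [2 1]);        % B = [1 2; 3 4; 5 6]
disp(B(:).');                 % memory order of B: 1 3 5 2 4 6

% ... yields exactly A read row by row, which is the order a row-major reader expects.
```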
You also need to prepare a text file listing the names of all the HDF5 files you are using (in your case, only my_data.h5). The file /path/to/list/file.txt should contain a single line: the path to my_data.h5.
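The list file can also be written from Matlab. A minimal sketch, where both paths are placeholders (in particular, the location of my_data.h5 is an assumption and must be replaced by the real one):

```matlab
% Placeholder paths: replace both with the real locations on your machine.
list_file = '/path/to/list/file.txt';
h5_file   = '/path/to/my_data.h5';    % assumed location of my_data.h5

fid = fopen(list_file, 'w');
assert(fid ~= -1, 'Could not open %s for writing', list_file);
fprintf(fid, '%s\n', h5_file);        % one HDF5 file path per line
fclose(fid);
```

With more than one HDF5 file, the same fprintf call would simply be repeated once per file.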
Now you can add an input data layer to your train_val.prototxt:
layer {
  type: "HDF5Data"
  name: "data"
  top: "X"      # note: same name as in HDF5
  top: "label"  # note: same name as in HDF5
  hdf5_data_param {
    source: "/path/to/list/file.txt"
    batch_size: 20
  }
  include { phase: TRAIN }
}
For more information regarding the HDF5 input layer, see this answer.