Problem description
I am doing regression using caffe, and my test.txt and train.txt files look like this:
/home/foo/caffe/data/finetune/flickr/3860781056.jpg 2.0
/home/foo/caffe/data/finetune/flickr/4559004485.jpg 3.6
/home/foo/caffe/data/finetune/flickr/3208038920.jpg 3.2
/home/foo/caffe/data/finetune/flickr/6170430622.jpg 4.0
/home/foo/caffe/data/finetune/flickr/7508671542.jpg 2.7272
My problem is that caffe does not seem to allow float labels like 2.0. When I use float labels, for example in the 'test.txt' file, what caffe recognizes while reading the file is wrong.
But when I, for example, change 2.0 to 2 in the file and do the same for the following lines, caffe's output changes, implying that the float labels are responsible for the problem.
Can anyone help me solve this problem? I definitely need to use float labels for regression, so does anyone know of a workaround or solution? Thanks in advance.
EDIT: For anyone facing a similar issue, "use caffe to train Lenet with CSV data" might be of help. Thanks to @Shai.
Recommended answer
When using the image dataset input layer (with either lmdb or leveldb backend), caffe only supports one integer label per input image.
If you want to do regression and use floating point labels, you should try the HDF5 data layer. See for example this question.
In python you can use the h5py package to create hdf5 files.
import h5py, os
import caffe
import numpy as np

SIZE = 224  # fixed size to all images
with open('train.txt', 'r') as T:
    lines = T.readlines()
# If you do not have enough memory split data into
# multiple batches and generate multiple separate h5 files
X = np.zeros((len(lines), 3, SIZE, SIZE), dtype='f4')
y = np.zeros((len(lines), 1), dtype='f4')
for i, l in enumerate(lines):
    sp = l.strip().split(' ')  # strip the trailing newline before splitting
    img = caffe.io.load_image(sp[0])
    img = caffe.io.resize(img, (SIZE, SIZE, 3))  # resize to fixed size
    # you may apply other input transformations here...
    # Note that the transformation should take img from size-by-size-by-3 and transpose it to 3-by-size-by-size
    # for example
    transposed_img = img.transpose((2, 0, 1))[::-1, :, :]  # HxWxC -> CxHxW, then RGB->BGR
    X[i] = transposed_img
    y[i] = float(sp[1])
with h5py.File('train.h5', 'w') as H:
    H.create_dataset('X', data=X)  # note the name X given to the dataset!
    H.create_dataset('y', data=y)  # note the name y given to the dataset!
with open('train_h5_list.txt', 'w') as L:
    L.write('train.h5')  # list all h5 files you are going to use
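The comment in the script above suggests splitting the data into several h5 files when memory is tight. A minimal sketch of the bookkeeping for that, using only the standard library, might look like the following. The helper names (parse_label_file, split_into_batches, h5_file_names), the batch size and the train_0.h5 naming scheme are illustrative assumptions, not part of caffe or h5py. Each batch would then be written to its own file with the same create_dataset calls as above.

```python
def parse_label_file(lines):
    """Turn 'path label' lines into (path, float_label) pairs (hypothetical helper)."""
    samples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # split on the last space so the label is always the final field
        path, label = line.rsplit(' ', 1)
        samples.append((path, float(label)))
    return samples


def split_into_batches(samples, batch_size):
    """Cut the sample list into consecutive chunks of at most batch_size items."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]


def h5_file_names(num_batches, prefix="train"):
    """Assumed naming scheme: train_0.h5, train_1.h5, ..."""
    return ["%s_%d.h5" % (prefix, k) for k in range(num_batches)]


lines = [
    "/home/foo/caffe/data/finetune/flickr/3860781056.jpg 2.0",
    "/home/foo/caffe/data/finetune/flickr/4559004485.jpg 3.6",
    "/home/foo/caffe/data/finetune/flickr/3208038920.jpg 3.2",
]
samples = parse_label_file(lines)
batches = split_into_batches(samples, batch_size=2)
names = h5_file_names(len(batches))
# The list file holds one h5 path per line
list_file_text = "\n".join(names) + "\n"
```

With this layout, train_h5_list.txt would contain one line per generated h5 file rather than the single train.h5 entry.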
Once you have all h5 files and the corresponding text files listing them, you can add an HDF5 input layer to your train_val.prototxt:
layer {
  type: "HDF5Data"
  top: "X" # same name as given in create_dataset!
  top: "y"
  hdf5_data_param {
    source: "train_h5_list.txt" # do not give the h5 files directly, but the list.
    batch_size: 32
  }
  include { phase:TRAIN }
}
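A train_val.prototxt usually also needs a matching data layer for the TEST phase. As a rough sketch, the layer text for both phases could be generated from one template. The hdf5_layer helper, the layer name "data" and the file name test_h5_list.txt are assumptions made for illustration; only the TRAIN layer above comes from the answer itself.

```python
def hdf5_layer(source, phase, batch_size=32, tops=("X", "y")):
    """Return prototxt text for an HDF5Data layer (hypothetical helper)."""
    # one 'top' entry per dataset name stored in the h5 files
    top_lines = "\n".join('  top: "%s"' % t for t in tops)
    return (
        'layer {\n'
        '  name: "data"\n'
        '  type: "HDF5Data"\n'
        '%s\n'
        '  hdf5_data_param {\n'
        '    source: "%s"\n'
        '    batch_size: %d\n'
        '  }\n'
        '  include { phase: %s }\n'
        '}\n'
    ) % (top_lines, source, batch_size, phase)


train_layer = hdf5_layer("train_h5_list.txt", "TRAIN")
# test_h5_list.txt is an assumed name for a list file built the same way from test.txt
test_layer = hdf5_layer("test_h5_list.txt", "TEST")
print(train_layer + test_layer)
```

The top names must still match the dataset names passed to create_dataset when the h5 files were written.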
Clarification:
When I say "caffe only supports one integer label per input image" I do not mean that the leveldb/lmdb containers are limited; I mean the tools of caffe, specifically the convert_imageset tool.
On closer inspection, it seems that caffe stores data of type Datum in leveldb/lmdb, and the "label" property of this type is defined as an integer (see caffe.proto). Thus, when using the caffe interface to leveldb/lmdb, you are restricted to a single int32 label per image.