Problem Description
I am fine-tuning a network. In one case I want to use it for regression, which works. In another case, I want to use it for classification.
For both cases I have an HDF5 file with a label. For regression, this is just a 1-by-1 numpy array containing a float. I thought I could use the same label for classification after changing my EuclideanLoss layer to SoftmaxLoss. However, I then get a negative loss:
Iteration 19200, loss = -118232
Train net output #0: loss = 39.3188 (* 1 = 39.3188 loss)
Can you explain what, if anything, is going wrong? I do see that the training loss is about 40 (which is still terrible), but does the network still train? The negative loss just keeps getting more negative.
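For reference, a minimal h5py sketch of a regression-style HDF5 file like the one described (the data shape, sample count, and file name are placeholders, not taken from the question):

import h5py
import numpy

# Hypothetical data blob: 40 images of 3x64x64. Caffe's HDF5DataLayer expects
# float32 arrays whose first axis is the number of samples.
data = numpy.random.randn(40, 3, 64, 64).astype(numpy.float32)
# Regression target: one float per sample, shape (40, 1).
label = numpy.random.randn(40, 1).astype(numpy.float32)

with h5py.File('train.h5', 'w') as f:
    f['data'] = data
    f['label'] = label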
UPDATE
After reading Shai's comment and answer, I have made the following changes:
- I set the num_output of my last fully connected layer to 6, as I have 6 labels (it used to be 1).
- I now create a one-hot vector and pass that as the label in my HDF5 dataset, as follows:
f['label'] = numpy.array([1, 0, 0, 0, 0, 0])
Trying to run my network now returns:
Check failed: hdf_blobs_[i]->shape(0) == num (6 vs. 1)
After some research online, I reshaped the vector to a 1x6 vector. This led to the following error:
Check failed: outer_num_ * inner_num_ == bottom[1]->count() (40 vs. 240)
Number of labels must match number of predictions; e.g., if softmax axis == 1
and prediction shape is (N, C, H, W), label count (number of labels)
must be N*H*W, with integer values in {0, 1, ..., C-1}.
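For context, the reshape was presumably along these lines (an h5py sketch of my own; the comments explain why each check fires, assuming batch_size: 40 in train.prototxt, which matches the 40 vs. 240 above):

import h5py
import numpy

with h5py.File('train.h5', 'w') as f:
    # A flat 6-vector fails the first check: HDF5DataLayer reads axis 0 as the
    # sample count, so one one-hot label looks like 6 samples (6 vs. 1).
    # Reshaping to 1x6 passes that check but gives each sample 6 label values,
    # so a batch of 40 yields 240 labels where SoftmaxWithLoss expects 40.
    f['label'] = numpy.array([1, 0, 0, 0, 0, 0]).reshape(1, 6)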
My idea is to add one label per data sample (image), and the batches are created in my train.prototxt. Shouldn't this produce the correct batch size?
Recommended Answer
Since you moved from regression to classification, you need to output not a scalar to compare with "label", but rather a probability vector of length num-labels to compare with the discrete class "label". You need to change the num_output parameter of the layer before "SoftmaxWithLoss" from 1 to num-labels.
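As an illustration, a hedged sketch of such a net written with pycaffe's NetSpec (the layer names, HDF5 list file, and batch size are made up for the example):

import caffe
from caffe import layers as L

n = caffe.NetSpec()
# HDF5 data layer reading the 'data' and 'label' datasets;
# 'train_h5_list.txt' and batch_size=40 are hypothetical.
n.data, n.label = L.HDF5Data(source='train_h5_list.txt', batch_size=40, ntop=2)
# Last fully connected layer: num_output must equal the number of classes (6),
# not 1 as in the regression setup.
n.score = L.InnerProduct(n.data, num_output=6)
# SoftmaxWithLoss compares the 6-vector of scores with the scalar class label.
n.loss = L.SoftmaxWithLoss(n.score, n.label)

print(n.to_proto())  # emits the corresponding train.prototxt text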
I believe you are currently accessing uninitialized memory, and I would expect caffe to crash sooner or later in this case.
Update:
You made two changes: num_output 1 --> 6, and you also changed your input label from a scalar to a vector.
The first change was the only one you needed for using "SoftmaxWithLossLayer".
Do not change label from a scalar to a "hot-vector"; a minimal sketch of the scalar layout follows.
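Concretely, a hedged h5py sketch of the label layout "SoftmaxWithLoss" expects (sample count and values are illustrative):

import h5py
import numpy

# One scalar class index per sample, with values in {0, ..., 5} for 6 classes.
# E.g., 40 samples -> label shape (40, 1), stored as float32 as Caffe expects.
label = numpy.random.randint(0, 6, size=(40, 1)).astype(numpy.float32)

with h5py.File('train.h5', 'w') as f:
    f['label'] = label  # no one-hot encoding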
Why?
Because "SoftmaxWithLoss" basically looks at the 6-vector prediction you output, interprets the ground-truth label as an index, and looks at -log(p[label]): the closer p[label] is to 1 (i.e., you predicted a high probability for the expected class), the lower the loss. If you make a prediction p[label] close to zero (i.e., you incorrectly predicted a low probability for the expected class), the loss grows fast.
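A small numeric illustration of that behavior, using a plain numpy re-implementation of the per-sample loss rather than Caffe itself:

import numpy

def softmax_loss(scores, label):
    # Softmax over the raw scores, then -log of the probability at the
    # ground-truth index -- the quantity "SoftmaxWithLoss" computes per sample.
    p = numpy.exp(scores - scores.max())
    p /= p.sum()
    return -numpy.log(p[label])

scores = numpy.array([5.0, 0.0, 0.0, 0.0, 0.0, 0.0])  # confident in class 0
print(softmax_loss(scores, label=0))  # ~0.03: p[0] ~ 0.97, low loss
print(softmax_loss(scores, label=3))  # ~5.03: p[3] ~ 0.007, loss grows fast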
Using a "hot-vector" as the ground-truth input label may be appropriate for multi-category classification (which does not seem to be the task you are trying to solve here). You may find this SO thread relevant to that particular case.