Problem description
I'm working on a multiclass classification problem using Keras, with binary accuracy and categorical accuracy as metrics. When I evaluate my model I get a really high value for the binary accuracy and quite a low one for the categorical accuracy. I tried to recreate the binary accuracy metric in my own code but I am not having much luck. My understanding is that this is the process I need to recreate:
def binary_accuracy(y_true, y_pred):
return K.mean(K.equal(y_true, K.round(y_pred)), axis=-1)
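For reference, the same computation can be reproduced outside of Keras. Here is a minimal NumPy sketch, assuming y_true and y_pred are equal-shaped arrays of 0/1 labels and predicted probabilities:

import numpy as np

def binary_accuracy_np(y_true, y_pred):
    # Round each predicted probability to 0 or 1, then take the
    # elementwise mean of matches over the last axis.
    return np.mean(np.equal(y_true, np.round(y_pred)), axis=-1)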
This is my code:
from keras import backend as K
import numpy as np

preds = model.predict(X_test, batch_size=128)
print(preds)
roundpreds = np.round(preds)  # round each probability to 0 or 1
pos = 0.0
neg = 0.0
for i, val in enumerate(roundpreds):
    # counts a sample as correct only if every element matches
    if val.tolist() == list(y_test[i]):
        pos += 1.0
    else:
        neg += 1.0
print(pos / (pos + neg))
But this gives a much lower value than the one given by binary accuracy. Is binary accuracy even an appropriate metric to be using in a multi-class problem? If so, does anyone know where I am going wrong?
Recommended answer
So you need to understand what happens when you apply the binary accuracy metric to a multiclass prediction.

- Let's assume that your output from softmax is (0.1, 0.2, 0.3, 0.4) and the one-hot encoded ground truth is (1, 0, 0, 0).
- binary_accuracy rounds every output with K.round, so each value below 0.5 becomes 0 and your network's prediction is turned into the (0, 0, 0, 0) vector.
- (0, 0, 0, 0) matches the ground truth (1, 0, 0, 0) on 3 out of 4 indexes, which puts the resulting accuracy at 75% for a completely wrong answer!
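You can verify this arithmetic with a minimal NumPy sketch of the rounding behaviour described above (the array values are just the example numbers from the list):

import numpy as np

y_pred = np.array([0.1, 0.2, 0.3, 0.4])  # softmax output from the example
y_true = np.array([1, 0, 0, 0])          # one-hot ground truth

# np.round sends every value below 0.5 to 0, so the prediction becomes
# (0, 0, 0, 0), which agrees with y_true on 3 of the 4 positions.
print(np.mean(np.equal(y_true, np.round(y_pred))))  # 0.75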
To solve this you could use a single-class accuracy, e.g. like this one:
def single_class_accuracy(interesting_class_id):
    def fn(y_true, y_pred):
        class_id_true = K.argmax(y_true, axis=-1)    # one-hot labels -> class ids
        class_id_preds = K.argmax(y_pred, axis=-1)
        # Replace class_id_preds with class_id_true for recall here
        positive_mask = K.cast(K.equal(class_id_preds, interesting_class_id), 'int32')
        true_mask = K.cast(K.equal(class_id_true, interesting_class_id), 'int32')
        acc_mask = K.cast(K.equal(positive_mask, true_mask), 'float32')
        class_acc = K.mean(acc_mask)
        return class_acc
    return fn
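For instance, you could pass it to model.compile as an additional metric (a sketch; the class index 0, the optimizer, and the loss below are placeholders, not part of the original answer):

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['categorical_accuracy',
                       single_class_accuracy(0)])  # accuracy for class 0 only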