Question
I am classifying MNIST data using the Keras code below. Using confusion_matrix from sklearn.metrics I get the confusion matrix, and from TruePositive = sum(numpy.diag(cm1)) I am able to get the true positives. But I am confused about how to get the true negatives, false positives, and false negatives. I read the solution from here, but the user comments confuse me. Please help me write code to get these values.
from sklearn.metrics import confusion_matrix
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
import numpy as np
(x_train, y_train), (x_test, y_test) = mnist.load_data()
batch_size = 128
num_classes = 10
epochs = 1
img_rows, img_cols = 28, 28
y_test1=y_test
if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
#model.add(GlobalAveragePooling2D())
#model.add(GlobalMaxPooling2D())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
# 10-class softmax output with one-hot labels, so categorical (not binary) cross-entropy
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
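# predict_classes was removed in recent TensorFlow/Keras releases; argmax over the softmax output is equivalent
pre_cls = np.argmax(model.predict(x_test), axis=1)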
cm1 = confusion_matrix(y_test1,pre_cls)
print('Confusion Matrix : \n', cm1)
TruePositive= sum(np.diag(cm1))
Recommended answer
First of all, there are omissions in your code. To get it to run, I needed to add the following commands:
import keras
(x_train, y_train), (x_test, y_test) = mnist.load_data()
Having done that, and given the confusion matrix cm1:
array([[ 965, 0, 1, 0, 0, 2, 6, 1, 5, 0],
[ 0, 1113, 4, 2, 0, 0, 3, 0, 13, 0],
[ 8, 0, 963, 14, 5, 1, 7, 8, 21, 5],
[ 0, 0, 3, 978, 0, 7, 0, 6, 12, 4],
[ 1, 0, 4, 0, 922, 0, 9, 3, 3, 40],
[ 4, 1, 1, 27, 0, 824, 6, 1, 20, 8],
[ 11, 3, 1, 1, 5, 6, 925, 0, 6, 0],
[ 2, 6, 17, 8, 2, 0, 1, 961, 2, 29],
[ 5, 1, 2, 13, 4, 6, 2, 6, 929, 6],
[ 6, 5, 0, 7, 5, 6, 1, 6, 10, 963]])
Here is how you can get the requested TP, FP, FN, and TN per class.
The True Positives are simply the diagonal elements:
TruePositive = np.diag(cm1)
TruePositive
# array([ 965, 1113, 963, 978, 922, 824, 925, 961, 929, 963])
The False Positives are the sum of the respective column, minus the diagonal element:
FalsePositive = []
for i in range(num_classes):
    FalsePositive.append(sum(cm1[:,i]) - cm1[i,i])
FalsePositive
# [37, 16, 33, 72, 21, 28, 35, 31, 92, 92]
Similarly, the False Negatives are the sum of the respective row, minus the diagonal element:
FalseNegative = []
for i in range(num_classes):
    FalseNegative.append(sum(cm1[i,:]) - cm1[i,i])
FalseNegative
# [15, 22, 69, 32, 60, 68, 33, 67, 45, 46]
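The two loops above can also be written without explicit iteration. The following is a minimal sketch, assuming cm1 is the NumPy array shown earlier, with rows as true classes and columns as predicted classes (the layout that sklearn's confusion_matrix returns). The lowercase names tp, fp, and fn are only illustrative.

```python
import numpy as np

# Confusion matrix from above: rows = true class, columns = predicted class
cm1 = np.array([
    [965,    0,   1,   0,   0,   2,   6,   1,   5,   0],
    [  0, 1113,   4,   2,   0,   0,   3,   0,  13,   0],
    [  8,    0, 963,  14,   5,   1,   7,   8,  21,   5],
    [  0,    0,   3, 978,   0,   7,   0,   6,  12,   4],
    [  1,    0,   4,   0, 922,   0,   9,   3,   3,  40],
    [  4,    1,   1,  27,   0, 824,   6,   1,  20,   8],
    [ 11,    3,   1,   1,   5,   6, 925,   0,   6,   0],
    [  2,    6,  17,   8,   2,   0,   1, 961,   2,  29],
    [  5,    1,   2,  13,   4,   6,   2,   6, 929,   6],
    [  6,    5,   0,   7,   5,   6,   1,   6,  10, 963],
])

tp = np.diag(cm1)             # correct predictions per class
fp = cm1.sum(axis=0) - tp     # column totals minus the diagonal
fn = cm1.sum(axis=1) - tp     # row totals minus the diagonal

print(fp.tolist())  # [37, 16, 33, 72, 21, 28, 35, 31, 92, 92]
print(fn.tolist())  # [15, 22, 69, 32, 60, 68, 33, 67, 45, 46]
```

The axis sums give the same numbers as the loops, and the results stay NumPy arrays, which makes later element-wise arithmetic simpler.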
Now, the True Negatives are a little trickier. Let's first think about what exactly a True Negative means with respect to, say, class 0: it means all the samples that have been correctly identified as not being 0, that is, samples whose true class is not 0 and whose predicted class is not 0 either. So, essentially, what we should do is remove the corresponding row and column from the confusion matrix, and then sum up all the remaining elements:
TrueNegative = []
for i in range(num_classes):
    temp = np.delete(cm1, i, 0)   # delete ith row
    temp = np.delete(temp, i, 1)  # delete ith column
    TrueNegative.append(sum(sum(temp)))
TrueNegative
# [8983, 8849, 8935, 8918, 8997, 9080, 9007, 8941, 8934, 8899]
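As a cross-check on the row-and-column deletion, the same quantity follows from the identity TN = total - (TP + FP + FN). Here is a small sketch on a made-up 3-class matrix; the matrix cm and all variable names are illustrative assumptions, not taken from the question.

```python
import numpy as np

# Hypothetical 3-class confusion matrix (rows = true, columns = predicted)
cm = np.array([[5, 1, 0],
               [2, 6, 1],
               [0, 3, 7]])

tp = np.diag(cm)
fp = cm.sum(axis=0) - tp
fn = cm.sum(axis=1) - tp
tn = cm.sum() - (tp + fp + fn)   # everything outside row i and column i

# Same result via explicit deletion of row i and column i
tn_by_deletion = [
    int(np.delete(np.delete(cm, i, axis=0), i, axis=1).sum())
    for i in range(cm.shape[0])
]

print(tn.tolist())      # [17, 12, 14]
print(tn_by_deletion)   # [17, 12, 14]
```

Both routes agree because row i and column i together hold exactly the TP, FN, and FP cells for class i.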
Let's make a sanity check: for each class, the sum of TP, FP, FN, and TN must be equal to the size of our test set (here 10,000). Let's confirm that this is indeed the case:
l = len(y_test)
for i in range(num_classes):
    print(TruePositive[i] + FalsePositive[i] + FalseNegative[i] + TrueNegative[i] == l)
The result is
True
True
True
True
True
True
True
True
True
True
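As an alternative to the manual computation, recent scikit-learn versions (0.21 and later, as far as I know) provide multilabel_confusion_matrix in sklearn.metrics. For multiclass labels it returns one 2x2 matrix per class laid out as [[TN, FP], [FN, TP]], so all four counts come out at once. A minimal sketch on made-up labels; y_true and y_pred here are illustrative, not the MNIST predictions from the question.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, multilabel_confusion_matrix

# Hypothetical true and predicted labels for a 3-class problem
y_true = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0, 2])

# Shape (n_classes, 2, 2); each slice is [[TN, FP], [FN, TP]] for one class
mcm = multilabel_confusion_matrix(y_true, y_pred)
tn = mcm[:, 0, 0]
fp = mcm[:, 0, 1]
fn = mcm[:, 1, 0]
tp = mcm[:, 1, 1]

# The diagonal of the ordinary confusion matrix gives the same TP values
cm = confusion_matrix(y_true, y_pred)

print(tp.tolist(), fp.tolist(), fn.tolist(), tn.tolist())
# [1, 2, 2] [1, 1, 0] [1, 0, 1] [4, 4, 4]
```

In the question's setting, the call would presumably be multilabel_confusion_matrix(y_test1, pre_cls), using the same integer label arrays that were passed to confusion_matrix.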