Question
I can't figure out whether I've set up my binary classification problem correctly. I labeled the positive class 1 and the negative class 0. However, it is my understanding that by default scikit-learn uses class 0 as the positive class in its confusion matrix (the inverse of how I set it up), which confuses me. In scikit-learn's default setting, is the top row the positive or the negative class? Let's assume the confusion matrix output is:
confusion_matrix(y_test, preds)
[[30  5]
 [ 2 42]]
What would this look like as a labeled confusion matrix? Are the actual instances in the rows or in the columns in scikit-learn?
              prediction                        prediction
              0       1                         1       0
            -----   -----                     -----   -----
         0 | TN   |  FP        (OR)        1 | TP   |  FP
actual      -----   -----           actual    -----   -----
         1 | FN   |  TP                    0 | FN   |  TN
Recommended answer
scikit-learn sorts labels in ascending order, so 0 is the first row/column and 1 is the second:
>>> from sklearn.metrics import confusion_matrix as cm
>>> y_test = [1, 0, 0]
>>> y_pred = [1, 0, 0]
>>> cm(y_test, y_pred)
array([[2, 0],
       [0, 1]])
>>> y_pred = [4, 0, 0]
>>> y_test = [4, 0, 0]
>>> cm(y_test, y_pred)
array([[2, 0],
       [0, 1]])
>>> y_test = [-2, 0, 0]
>>> y_pred = [-2, 0, 0]
>>> cm(y_test, y_pred)
array([[1, 0],
       [0, 2]])
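This is stated in the documentation for the labels parameter: if no labels are given, those that appear at least once in y_true or y_pred are used in sorted order.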
You can therefore change this behavior by passing labels to the confusion_matrix call:
>>> y_test = [1, 0, 0]
>>> y_pred = [1, 0, 0]
>>> cm(y_test, y_pred)
array([[2, 0],
       [0, 1]])
>>> cm(y_test, y_pred, labels=[1, 0])
array([[1, 0],
       [0, 2]])
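To make the ordering rule concrete, here is a minimal pure-Python sketch of the same convention. The helper name confusion_counts is hypothetical and this is not scikit-learn's implementation; it only assumes that rows are indexed by the true label and columns by the predicted label, with labels sorted in ascending order unless an explicit list is given.

```python
def confusion_counts(y_true, y_pred, labels=None):
    """Hypothetical helper: count (actual, predicted) pairs into a square table."""
    if labels is None:
        # Default: every label seen in either list, in ascending order.
        labels = sorted(set(y_true) | set(y_pred))
    index = {label: i for i, label in enumerate(labels)}
    table = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        # Pairs involving a label that was not requested are skipped.
        if actual in index and predicted in index:
            table[index[actual]][index[predicted]] += 1
    return table


y_test = [1, 0, 0]
y_pred = [1, 0, 0]
default_order = confusion_counts(y_test, y_pred)
positive_first = confusion_counts(y_test, y_pred, labels=[1, 0])
print(default_order)   # [[2, 0], [0, 1]]
print(positive_first)  # [[1, 0], [0, 2]]
```

The only thing that changes between the two calls is which label maps to index 0, which is why the diagonal entries swap places.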
Actual and predicted values are ordered just as in your diagrams: predictions are in the columns and actual values are in the rows.
>>> y_test = [5, 5, 5, 0, 0, 0]
>>> y_pred = [5, 0, 0, 0, 0, 0]
>>> cm(y_test, y_pred)
array([[3, 0],
       [2, 1]])
- true: 0, predicted: 0 (value: 3, position [0, 0])
- true: 5, predicted: 0 (value: 2, position [1, 0])
- true: 0, predicted: 5 (value: 0, position [0, 1])
- true: 5, predicted: 5 (value: 1, position [1, 1])
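Applied to the numbers in the question, this means the top row is the negative class (0) by default, so the output corresponds to the first layout: TN = 30, FP = 5, FN = 2, TP = 42. The sketch below is one way to check that reading. The y_true and y_pred lists are made up purely to reproduce those counts, and unpacking with ravel() relies on the binary matrix being flattened row by row.

```python
from sklearn.metrics import confusion_matrix

# Made-up labels, chosen only to reproduce the counts from the question.
y_true = [0] * 35 + [1] * 44
y_pred = [0] * 30 + [1] * 5 + [0] * 2 + [1] * 42

matrix = confusion_matrix(y_true, y_pred)
# Row-major flattening of [[TN, FP], [FN, TP]] for labels sorted as [0, 1].
tn, fp, fn, tp = matrix.ravel()
print(matrix.tolist())   # [[30, 5], [2, 42]]
print(tn, fp, fn, tp)    # 30 5 2 42

# Putting the positive class first reorders both rows and columns.
flipped = confusion_matrix(y_true, y_pred, labels=[1, 0])
print(flipped.tolist())  # [[42, 2], [5, 30]]
```

With labels=[1, 0] the layout becomes [[TP, FN], [FP, TN]]: because rows are actual values, the cell for actual 1 and predicted 0 is a false negative, so the off-diagonal cells are the other way around from the second layout sketched in the question.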