Question
I am using this code to compare the performance of a number of models:
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X = ...  # input data
Y = ...  # binary labels

models = []
models.append(('LR', LogisticRegression()))
models.append(('LDA', LinearDiscriminantAnalysis()))
models.append(('KNN', KNeighborsClassifier()))
models.append(('CART', DecisionTreeClassifier()))

results = []
names = []
scoring = 'accuracy'
for name, model in models:
    # shuffle=True is required when passing random_state in current scikit-learn
    kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=7)
    cv_results = model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
    results.append(cv_results)
    names.append(name)
    msg = "%s: %.2f (%.2f)" % (name, cv_results.mean(), cv_results.std())
    print(msg)
I can use 'accuracy' and 'recall' as scoring, and these will give accuracy and sensitivity. How can I create a scorer that gives me 'specificity'?
Specificity = TN / (TN + FP)
where TN and FP are the true negative and false positive counts in the confusion matrix.
I have tried:
from sklearn.metrics import confusion_matrix, make_scorer

def tp(y_true, y_pred):
    cm = confusion_matrix(y_true, y_pred)
    # specificity = TN / (TN + FP)
    error = cm[0, 0] / (cm[0, 0] + cm[0, 1])
    return error

my_scorer = make_scorer(tp, greater_is_better=True)
and then
cv_results = model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=my_scorer)
but it will not work for n_splits >= 10; I get this error when my_scorer is computed:
IndexError: index 1 is out of bounds for axis 1 with size 1
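(For reference, this IndexError typically occurs when a fold contains only one class, so confusion_matrix returns a 1x1 array and index [0, 1] does not exist. Passing labels=[0, 1] forces a 2x2 matrix. A sketch of a scorer that avoids the crash; the names specificity and specificity_scorer are illustrative, not from the original post:)

```python
from sklearn.metrics import confusion_matrix, make_scorer

def specificity(y_true, y_pred):
    # labels=[0, 1] guarantees a 2x2 matrix even when a fold
    # contains only one class, which is what triggers the IndexError.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    # Guard against division by zero when no true negatives or
    # false positives are present in the fold.
    return tn / (tn + fp) if (tn + fp) > 0 else 0.0

specificity_scorer = make_scorer(specificity, greater_is_better=True)
```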
Answer
If you change the recall_score parameters for a binary classifier to pos_label=0, you get specificity (the default is sensitivity, pos_label=1):
from sklearn.metrics import accuracy_score, make_scorer, recall_score

scoring = {
    'accuracy': make_scorer(accuracy_score),
    'sensitivity': make_scorer(recall_score),
    'specificity': make_scorer(recall_score, pos_label=0)
}
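This multi-metric dict can be passed to cross_validate instead of cross_val_score. A minimal, self-contained sketch; make_classification is used here only to stand in for the question's X and Y:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, make_scorer, recall_score
from sklearn.model_selection import cross_validate

scoring = {
    'accuracy': make_scorer(accuracy_score),
    'sensitivity': make_scorer(recall_score),
    'specificity': make_scorer(recall_score, pos_label=0)
}

# Synthetic binary data standing in for the question's X and Y.
X, Y = make_classification(n_samples=200, random_state=7)

# cross_validate returns one array of fold scores per metric,
# keyed as 'test_<name>'.
scores = cross_validate(LogisticRegression(max_iter=1000), X, Y,
                        cv=10, scoring=scoring)
for key in ('test_accuracy', 'test_sensitivity', 'test_specificity'):
    print("%s: %.2f (%.2f)" % (key, scores[key].mean(), scores[key].std()))
```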