How do I plot the ROC curve and compute the AUC for a binary classifier that has no probability outputs (SVM)?

Problem description

I have an SVM classifier (LinearSVC) that outputs a final classification for every sample in the test set, something like

1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1

and so on.

The "truth" labels look similar:

1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1

I would like to run that SVM with some parameters, generate the points for the ROC curve, and calculate the AUC.

I could do this myself, but I am sure someone has already done it for cases like this.

Unfortunately, everything I can find covers classifiers that return probabilities rather than hard predictions, like here or here.
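To make the limitation concrete, here is a minimal sketch that feeds the two hard-label lists from the question straight into roc_curve and roc_auc_score. It assumes the first list holds the predictions and the second the true labels; the variable names are illustrative only. With 0/1 predictions there is a single operating point, so the "curve" is just two line segments.

```python
from sklearn.metrics import roc_auc_score, roc_curve

# Labels copied from the question (assumed order: predictions first, truth second).
y_pred = [1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1]
y_true = [1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1]

# Hard 0/1 predictions give only one usable threshold, so the curve
# runs (0, 0) -> (FPR, TPR) -> (1, 1).
fpr, tpr, thresholds = roc_curve(y_true, y_pred)
auc = roc_auc_score(y_true, y_pred)

print("FPR points:", fpr)
print("TPR points:", tpr)
print("AUC from hard labels: {:.3f}".format(auc))  # prints 0.750
```

A continuous score per sample is needed to get more than this one intermediate point.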

I thought this would work, but from sklearn.metrics import plot_roc_curve fails: the name is not found!

Is there anything online that fits my case?

Thanks

Recommended answer

You could get around the problem by using sklearn.svm.SVC and setting the probability parameter to True.

As the documentation for that parameter states:

Whether to enable probability estimates. This must be enabled prior to calling fit, will slow down that method as it internally uses 5-fold cross-validation, and predict_proba may be inconsistent with predict. Read more in the User Guide.
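As a rough illustration of the quoted behaviour, the sketch below fits SVC with probability=True and uses predict_proba as the ROC score. The synthetic data from make_classification and all variable names are assumptions standing in for the real data, which the question does not show.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic data stands in for the real train/test sets (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# probability=True has to be set before calling fit.
proba_model = SVC(kernel="linear", probability=True, random_state=0)
proba_model.fit(X_train, y_train)

# Column 1 holds the estimated probability of the positive class (label 1).
pos_proba = proba_model.predict_proba(X_test)[:, 1]
fpr_p, tpr_p, _ = roc_curve(y_test, pos_proba)
auc_p = roc_auc_score(y_test, pos_proba)
print("AUC from predict_proba: {:.3f}".format(auc_p))
```

Because the probabilities come from an internal cross-validated calibration step, fitting is noticeably slower than fitting without them.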

As an example (details omitted):

import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.metrics import roc_curve
from sklearn.metrics import roc_auc_score

# ...

model = SVC(kernel="linear", probability=True)
model.fit(X_train, y_train)

# ...

# Signed distances to the separating hyperplane; roc_curve accepts these scores directly
decision_scores = model.decision_function(X_test)
fpr, tpr, thres = roc_curve(y_test, decision_scores)
print('AUC: {:.3f}'.format(roc_auc_score(y_test, decision_scores)))

# roc curve
plt.plot(fpr, tpr, "b", label='Linear SVM')
plt.plot([0,1],[0,1], "k--", label='Random Guess')
plt.xlabel("false positive rate")
plt.ylabel("true positive rate")
plt.legend(loc="best")
plt.title("ROC curve")
plt.show()

and you should get a plot of the ROC curve for the linear SVM against the diagonal random-guess line.

NOTE that LinearSVC is MUCH FASTER than SVC(kernel="linear"), especially if the training set is very large or has plenty of features.
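Since LinearSVC also exposes decision_function, the same ROC computation can be sketched without switching to SVC at all. This is a minimal example under stated assumptions: the synthetic data, the max_iter value, and the variable names are illustrative and not taken from the original answer.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Synthetic data stands in for the real train/test sets (assumption).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# max_iter is raised only to make convergence likely on this toy data.
linear_model = LinearSVC(max_iter=10000, random_state=0)
linear_model.fit(X_train, y_train)

# Continuous scores (signed distances to the hyperplane), not hard labels.
linear_scores = linear_model.decision_function(X_test)
fpr_l, tpr_l, _ = roc_curve(y_test, linear_scores)
auc_l = roc_auc_score(y_test, linear_scores)
print("AUC from LinearSVC decision_function: {:.3f}".format(auc_l))
```

ROC analysis only needs a ranking of the samples, so uncalibrated decision scores are sufficient here; probabilities are not required.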
