Problem description
I have an SVM classifier (LinearSVC) that outputs a final classification for every sample in the test set, something like
1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1
and so on.
The "truth" labels look similar:
1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1
I would like to run that SVM with some parameters, generate points for the ROC curve, and calculate the AUC.
I could do this myself, but I am sure someone has done it before me for cases like this.
Unfortunately, everything I can find is for cases where the classifier returns probabilities rather than hard predictions, like here or here.
I thought this would work, but from sklearn.metrics import plot_roc_curve is not found!
Is there anything online that fits my case?
Thanks.
Recommended answer
You could get around the problem by using sklearn.svm.SVC and setting the probability parameter to True.
As the documentation for this parameter states:
Whether to enable probability estimates. This must be enabled prior to calling fit, will slow down that method as it internally uses 5-fold cross-validation, and predict_proba may be inconsistent with predict. Read more in the User Guide.
As an example (details omitted):
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.metrics import roc_curve
from sklearn.metrics import roc_auc_score

# ... (details omitted)

model = SVC(kernel="linear", probability=True)
model.fit(X_train, y_train)

# ... (details omitted)

# decision_function returns one continuous score per test sample
decision_scores = model.decision_function(X_test)
fpr, tpr, thres = roc_curve(y_test, decision_scores)
print('AUC: {:.3f}'.format(roc_auc_score(y_test, decision_scores)))

# roc curve
plt.plot(fpr, tpr, "b", label='Linear SVM')
plt.plot([0, 1], [0, 1], "k--", label='Random Guess')
plt.xlabel("false positive rate")
plt.ylabel("true positive rate")
plt.legend(loc="best")
plt.title("ROC curve")
plt.show()
You should get a plot of the ROC curve alongside the diagonal random-guess line.
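For comparison, here is a small sketch of what happens when the hard 0/1 predictions from the question are passed to the same functions. The two label vectors are copied from the question; the variable names are only illustrative. With just two distinct score values there is only one operating point between (0, 0) and (1, 1), which is why a continuous score is needed for a meaningful curve.

```python
from sklearn.metrics import roc_auc_score, roc_curve

# Label vectors copied from the question (hard predictions, then ground truth).
y_pred = [1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1]
y_true = [1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1]

# With only two distinct "scores" (0 and 1), the curve has the two
# end points plus a single interior point.
fpr, tpr, thresholds = roc_curve(y_true, y_pred)
hard_auc = roc_auc_score(y_true, y_pred)

print(list(zip(fpr, tpr)))
print("AUC from hard labels: {:.3f}".format(hard_auc))
```

The resulting AUC is just the average of the true positive rate and the true negative rate at that single operating point, so it says nothing about how the classifier ranks samples.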
Note that LinearSVC is much faster than SVC(kernel="linear"), especially if the training set is very large or has many features.
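Since roc_curve and roc_auc_score accept any continuous score and not only probabilities, the decision_function output of LinearSVC can be passed to them directly. The following is a minimal sketch under stated assumptions: the data comes from make_classification as a synthetic stand-in for the asker's dataset, and the split sizes, random_state, and max_iter values are arbitrary choices for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Synthetic stand-in data (an assumption; substitute your own X and y).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

model = LinearSVC(max_iter=10000)
model.fit(X_train, y_train)

# Signed score of each sample relative to the separating hyperplane:
# continuous, so no probability estimates are required.
decision_scores = model.decision_function(X_test)

fpr, tpr, thresholds = roc_curve(y_test, decision_scores)
auc = roc_auc_score(y_test, decision_scores)
print("AUC: {:.3f}".format(auc))
```

The fpr and tpr arrays can then be plotted in the same way as in the SVC example.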