Is there a built-in function in scikit-learn that finds the maximum accuracy of a binary probabilistic classifier?

For example, to get the maximum F1 score, this is what I do:

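# Maximum F1 (and area under the precision-recall curve)
import sklearn.metrics

precision, recall, thresholds = sklearn.metrics.precision_recall_curve(y_true, y_score)
auprc = sklearn.metrics.auc(recall, precision)
max_f1 = 0
max_f1_threshold = None  # stays None if every F1 is 0
# zip stops at the shorter `thresholds`, skipping the final (precision=1, recall=0) point
for r, p, t in zip(recall, precision, thresholds):
    if p + r == 0:
        continue
    f1 = (2 * p * r) / (p + r)
    if f1 > max_f1:
        max_f1 = f1
        max_f1_threshold = t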
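
The same search can probably be written without an explicit loop. The following is only a sketch, assuming `y_true` holds 0/1 labels and `y_score` holds scores; the toy arrays are made up for illustration and are not from the question.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Made-up toy data, for illustration only
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
# The last (precision=1, recall=0) pair has no matching threshold, so drop it
p, r = precision[:-1], recall[:-1]
denom = p + r
# Treat F1 as 0 wherever precision + recall == 0 to avoid dividing by zero
f1 = np.divide(2 * p * r, denom, out=np.zeros_like(denom), where=denom > 0)
best = np.argmax(f1)
max_f1, max_f1_threshold = f1[best], thresholds[best]
```

`np.argmax` returns the first maximum, so ties resolve to the lowest threshold.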

I could compute the maximum accuracy in a similar way:
import numpy as np
import sklearn.metrics

accuracies = []
thresholds = np.arange(0, 1, 0.1)  # coarse grid: 0.0, 0.1, ..., 0.9
for threshold in thresholds:
    # Predict the positive class when the score exceeds the threshold
    y_pred = np.greater(y_score, threshold).astype(int)
    accuracy = sklearn.metrics.accuracy_score(y_true, y_pred)
    accuracies.append(accuracy)

accuracies = np.array(accuracies)
max_accuracy = accuracies.max()
max_accuracy_threshold = thresholds[accuracies.argmax()]

But I would like to know whether there is a built-in function for this.

Best answer

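import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.metrics import roc_curve

# Use the thresholds from roc_curve (taken from the observed scores) as candidates
fpr, tpr, thresholds = roc_curve(y_true, probs)
probs = np.asarray(probs)
accuracy_scores = []
for thresh in thresholds:
    # roc_curve treats score >= threshold as the positive class
    y_pred = (probs >= thresh).astype(int)
    accuracy_scores.append(accuracy_score(y_true, y_pred))

accuracies = np.array(accuracy_scores)
max_accuracy = accuracies.max()
max_accuracy_threshold = thresholds[accuracies.argmax()]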
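
As a possible shortcut, the accuracy at each threshold can be derived directly from the `fpr` and `tpr` arrays that `roc_curve` returns, since true positives equal `tpr * n_pos` and true negatives equal `(1 - fpr) * n_neg`. The sketch below assumes labels in {0, 1} and predicts positive when the score is `>=` the threshold; the helper name `max_accuracy_from_roc` and the toy data are hypothetical, not part of scikit-learn.

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_curve


def max_accuracy_from_roc(y_true, probs):
    """Hypothetical helper (not a scikit-learn API): best accuracy and its threshold."""
    y_true = np.asarray(y_true)
    n_pos = np.sum(y_true == 1)
    n_neg = y_true.size - n_pos
    fpr, tpr, thresholds = roc_curve(y_true, probs)
    # correct predictions = true positives + true negatives
    accuracies = (tpr * n_pos + (1 - fpr) * n_neg) / y_true.size
    best = np.argmax(accuracies)
    return accuracies[best], thresholds[best]


# Made-up toy data, for illustration only
y_true = np.array([0, 0, 1, 1])
probs = np.array([0.1, 0.4, 0.35, 0.8])
best_accuracy, best_threshold = max_accuracy_from_roc(y_true, probs)
```

Note that the first threshold returned by `roc_curve` is a sentinel above every score (it corresponds to predicting everything as negative), so if that point wins, the returned threshold is not an actual score value.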
