Sklearn logistic regression: adjusting the cutoff point

Question

I have a logistic regression model trying to predict one of two classes: A or B.

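  • My model's accuracy when predicting A is about 85%.
  • Its accuracy when predicting B is about 50%.
  • Predicting B is not important, but predicting A is very important.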

My goal is to maximize accuracy when predicting A. Is there any way to adjust the default decision threshold used when determining the class?

import numpy as np
from sklearn.linear_model import LogisticRegression

classifier = LogisticRegression(penalty='l2', solver='saga', multi_class='ovr')
classifier.fit(np.float64(X_train), np.float64(y_train))

Thanks! RB

Recommended answer

As mentioned in the comments, the threshold is selected after training. You can find the threshold that maximizes a utility function of your choice, for example:

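import numpy as np
from sklearn import metrics

# Predicted class probabilities; column 1 holds the probability of the class labeled 1.
preds = classifier.predict_proba(test_data)
# roc_curve returns false positive rates, true positive rates, and candidate thresholds.
fpr, tpr, thresholds = metrics.roc_curve(test_y, preds[:, 1])
print(thresholds)

accuracy_ls = []
for thres in thresholds:
    # roc_curve counts scores >= threshold as positive, so use the same rule here.
    y_pred = np.where(preds[:, 1] >= thres, 1, 0)
    # Apply the desired utility function to y_pred, for example accuracy.
    accuracy_ls.append(metrics.accuracy_score(test_y, y_pred, normalize=True))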

After that, choose the threshold that maximizes the chosen utility function. In your case, choose the threshold that gives the best result for the 1s in y_pred (assuming class A is the one encoded as 1).
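To make the selection step concrete, here is a minimal sketch rather than a definitive recipe. It assumes class A is encoded as 1, uses synthetic data from `make_classification` in place of the asker's data, and takes "accuracy when predicting A" to mean precision on class 1 subject to a minimum recall. The `MIN_RECALL` value and the choice of precision as the utility are illustrative assumptions, not something stated in the question.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data; label 1 plays the role of class A (assumption).
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0
)

classifier = LogisticRegression(penalty="l2", solver="saga", max_iter=5000)
classifier.fit(X_train, y_train)

# Probability of class 1 for each held-out sample.
proba_a = classifier.predict_proba(X_val)[:, 1]

# Candidate thresholds from the ROC curve. The first entry is a sentinel above
# every score (it predicts no positives), so drop anything above the max score.
_, _, thresholds = roc_curve(y_val, proba_a)
thresholds = thresholds[thresholds <= proba_a.max()]

MIN_RECALL = 0.5  # hypothetical constraint: still find at least half of the A samples

best_threshold, best_precision, best_recall = None, -1.0, 0.0
for t in thresholds:
    y_pred = (proba_a >= t).astype(int)
    rec = recall_score(y_val, y_pred)
    if rec < MIN_RECALL:
        continue  # too few A samples recovered at this threshold
    prec = precision_score(y_val, y_pred, zero_division=0)
    if prec > best_precision:
        best_threshold, best_precision, best_recall = float(t), prec, rec

print("threshold:", best_threshold)
print("precision on class 1:", best_precision)
print("recall on class 1:", best_recall)

# Apply the chosen cutoff instead of the default 0.5 used by classifier.predict.
y_pred_final = (proba_a >= best_threshold).astype(int)
```

Without the recall constraint, maximizing precision alone tends to push the threshold so high that only a handful of samples are labeled A, so some floor on recall (or a different utility such as an F-score) is usually worth considering. It is also safer to pick the threshold on a validation split and keep a separate test set for the final estimate.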


08-20 09:13