Sklearn Logistic Regression: adjusting the cutoff point
Problem description
I have a logistic regression model that tries to predict one of two classes: A or B.
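- My model's accuracy when predicting A is about 85%.
- The model's accuracy when predicting B is about 50%.
- Predicting B is not important, but predicting A is very important.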
My goal is to maximize accuracy when predicting A. Is there any way to adjust the default decision threshold used to determine the class?
import numpy as np
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(penalty='l2', solver='saga', multi_class='ovr')
classifier.fit(np.float64(X_train), np.float64(y_train))
Thanks! RB
Recommended answer
As mentioned in the comments, the threshold is selected after training. You can find the threshold that maximizes a utility function of your choice, for example:
import numpy as np
from sklearn import metrics

# Predicted class probabilities; column 1 corresponds to classifier.classes_[1].
preds = classifier.predict_proba(test_data)
fpr, tpr, thresholds = metrics.roc_curve(test_y, preds[:, 1])
print(thresholds)

accuracy_ls = []
for thres in thresholds:
    y_pred = np.where(preds[:, 1] > thres, 1, 0)
    # Apply the desired utility function to y_pred, for example accuracy.
    accuracy_ls.append(metrics.accuracy_score(test_y, y_pred, normalize=True))
After that, choose the threshold that maximizes the chosen utility function. In your case, choose the threshold that maximizes the 1s in y_pred, that is, the predictions for the class you care about.
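The selection step can be sketched as follows. This is a minimal illustration, not the answerer's code: it assumes class A is encoded as 1, uses precision for class 1 as the utility (one possible reading of "accuracy when predicting A"), and runs on made-up toy data. The helpers `precision_at` and `best_threshold` are hypothetical names introduced here. In practice `scores` would be `preds[:, 1]` and the candidates could be the `thresholds` returned by `metrics.roc_curve`.

```python
import numpy as np


def precision_at(y_true, scores, threshold):
    """Precision for class 1 when predicting 1 wherever score > threshold."""
    y_pred = scores > threshold
    n_predicted = y_pred.sum()
    if n_predicted == 0:
        return 0.0  # no positive predictions: treat precision as 0
    return float((y_pred & (y_true == 1)).sum() / n_predicted)


def best_threshold(y_true, scores, candidates):
    """Return (threshold, precision) for the candidate with the highest precision.

    Ties are resolved in favor of the earliest candidate.
    """
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    best_t, best_p = None, -1.0
    for t in candidates:
        p = precision_at(y_true, scores, t)
        if p > best_p:
            best_t, best_p = t, p
    return best_t, best_p


# Toy data, invented for illustration: 1 = class A, 0 = class B.
y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1])
scores = np.array([0.12, 0.33, 0.41, 0.56, 0.62, 0.71, 0.83, 0.92])
candidates = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]

threshold, precision = best_threshold(y_true, scores, candidates)

# Apply the chosen threshold instead of the default 0.5 cutoff.
y_pred = np.where(scores > threshold, 1, 0)
print(threshold, precision, y_pred)
```

Choosing the threshold on the same data used to report the final metric tends to give an optimistic estimate, so a separate validation split is usually the safer place to run this search.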