scikit-learn cross-validation: mean squared error comes back negative

Question

When I use the following code with a data matrix X of shape (952, 144) and an output vector y of shape (952,), the mean_squared_error metric returns negative values, which is unexpected. Do you have any idea why?

from sklearn.svm import SVR
from sklearn import cross_validation as CV

reg = SVR(C=1., epsilon=0.1, kernel='rbf')
scores = CV.cross_val_score(reg, X, y, cv=10, scoring='mean_squared_error')

All values in scores are negative.

Recommended answer

Trying to close this out, so I am providing the answer that David and larsmans eloquently described in the comments section:

Yes, this is supposed to happen. The actual MSE is simply the positive version of the number you're getting.

The unified scoring API always maximizes the score, so a metric that should be minimized (a loss such as MSE) is negated before it is returned. The returned score is therefore negative for metrics that should be minimized and left positive for metrics that should be maximized.
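The sign convention can be checked with a small sketch. It assumes a recent scikit-learn release, where `cross_val_score` lives in `sklearn.model_selection` and the scorer is named `neg_mean_squared_error`. The synthetic data from `make_regression` is only a stand-in for the X and y in the question.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import make_scorer, mean_squared_error
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

# Synthetic stand-in for the (952, 144) data in the question (an assumption).
X, y = make_regression(n_samples=200, n_features=20, noise=10.0, random_state=0)

reg = SVR(C=1.0, epsilon=0.1, kernel="rbf")

# The built-in scorer returns -MSE so that "greater is better" holds.
neg_mse = cross_val_score(reg, X, y, cv=5, scoring="neg_mean_squared_error")

# Building the scorer by hand shows where the sign flip comes from:
# greater_is_better=False tells scikit-learn to negate the metric.
manual_scorer = make_scorer(mean_squared_error, greater_is_better=False)
neg_mse_manual = cross_val_score(reg, X, y, cv=5, scoring=manual_scorer)

# Flip the sign to recover the ordinary, non-negative MSE per fold.
mse = -neg_mse
print("per-fold MSE:", mse)
print("mean MSE:", mse.mean())
```

Negating the returned array is enough to report the usual MSE. The model selection logic itself can keep working with the negated values, since it only needs "higher is better" to hold.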

This is also described in sklearn GridSearchCV with Pipeline.
