Problem description
I'm trying to test my scikit-learn machine learning algorithm with a simple R^2 score, but for some reason it always returns zero.
import numpy
from sklearn.metrics import r2_score
prediction = numpy.array([0.1567, 4.7528, 1.1260, 0.2294]).reshape(1, -1)
training = numpy.array([0, 3, 1, 0]).reshape(1, -1)
r2 = r2_score(training, prediction, multioutput="raw_values")
print(r2)
[ 0. 0. 0. 0.]
Each array is meant to be a single four-element vector, not four separate values. How do I get a proper R^2 score?
Recommended answer
If you are trying to calculate the R^2 value between two vectors, you should just pass two one-dimensional arrays. See the r2_score documentation.
In the example you provided, reshape(1, -1) turns each array into a single row with four columns, which scikit-learn reads as one sample with four outputs. With multioutput="raw_values" it therefore computes a separate R^2 for each column: 0.1567 against 0, then 4.7528 against 3, and so on. Each of those scores rests on a single point, so the true values have no variance to explain, and the result is reported as 0 every time. It sounds like you want the R^2 between the two vectors, like this:
import numpy
from sklearn.metrics import r2_score
# 1-D arrays: four samples, one output each
prediction = numpy.array([0.1567, 4.7528, 1.1260, 0.2294])
training = numpy.array([0, 3, 1, 0])
print(r2_score(training, prediction))
0.472439485
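To see where that number comes from, here is a minimal sketch that recomputes R^2 by hand as 1 - SS_res / SS_tot and compares it with r2_score. It also shows one way to recover the one-dimensional shape from the question's (1, 4) arrays using ravel(). The helper name manual_r2 is made up for illustration and is not part of scikit-learn.

```python
import numpy as np
from sklearn.metrics import r2_score


def manual_r2(y_true, y_pred):
    """Hypothetical helper: R^2 = 1 - SS_res / SS_tot for 1-D inputs."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot


# The question's arrays, with the accidental (1, 4) shape
prediction_2d = np.array([0.1567, 4.7528, 1.1260, 0.2294]).reshape(1, -1)
training_2d = np.array([0, 3, 1, 0]).reshape(1, -1)

# ravel() flattens them to shape (4,): four samples, one output
prediction = prediction_2d.ravel()
training = training_2d.ravel()

score_sklearn = r2_score(training, prediction)
score_manual = manual_r2(training, prediction)
print(prediction.shape, score_sklearn, score_manual)
```

The two scores should agree, which confirms that the zeros in the question came from the array shape rather than from the metric itself.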
If you have multi-dimensional arrays, you can use the multioutput parameter to control what the output looks like:
# modified from the scikit-learn example
y_true = [[0.5, 1], [-1, 1], [7, -6]]
y_pred = [[0, 2], [-1, 2], [8, -5]]
print(r2_score(y_true, y_pred, multioutput='raw_values'))
[0.96543779 0.90816327]
Here each column is scored separately: the first item of every list in y_true is compared with the first item of every list in y_pred, the second item with the second, and so on.
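As a rough sketch of how the multioutput parameter changes the result, the same data can be scored three ways. "raw_values", "uniform_average" and "variance_weighted" are the options listed in the r2_score documentation; the variable names below are only for illustration.

```python
import numpy as np
from sklearn.metrics import r2_score

y_true = [[0.5, 1], [-1, 1], [7, -6]]
y_pred = [[0, 2], [-1, 2], [8, -5]]

# One score per output column
per_column = r2_score(y_true, y_pred, multioutput="raw_values")

# Plain mean of the per-column scores
uniform = r2_score(y_true, y_pred, multioutput="uniform_average")

# Mean of the per-column scores, weighted by the variance of each y_true column
weighted = r2_score(y_true, y_pred, multioutput="variance_weighted")

print(per_column, uniform, weighted)
```

Use "raw_values" when you want to inspect each target separately, and one of the averaging options when a single summary number is enough.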