I noticed that r2_score and explained_variance_score are both built-in sklearn.metrics methods for regression problems.

I always had the impression that r2_score is the percentage of variance explained by the model. How is that different from explained_variance_score?

When would you choose one over the other?

Thanks!

Best answer

OK, look at this example:

In [123]:
import numpy as np
from sklearn import metrics

# data
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
print(metrics.explained_variance_score(y_true, y_pred))
print(metrics.r2_score(y_true, y_pred))
0.957173447537
0.948608137045
In [124]:
# what explained_variance_score really is: 1 - Var(residuals) / Var(y_true)
1-np.cov(np.array(y_true)-np.array(y_pred))/np.cov(y_true)
Out[124]:
0.95717344753747324
In [125]:
# what r^2 really is: 1 - SS_res / SS_tot, where SS_tot = n * Var(y_true)
1-((np.array(y_true)-np.array(y_pred))**2).sum()/(len(y_true)*np.array(y_true).std()**2)
Out[125]:
0.94860813704496794
In [126]:
# notice that the mean residual is not 0
(np.array(y_true)-np.array(y_pred)).mean()
Out[126]:
-0.25
In [127]:
# if the predictions change so that the mean residual IS 0:
y_pred=[2.5, 0.0, 2, 7]
(np.array(y_true)-np.array(y_pred)).mean()
Out[127]:
0.0
In [128]:
# now the two scores coincide
print(metrics.explained_variance_score(y_true, y_pred))
print(metrics.r2_score(y_true, y_pred))
0.982869379015
0.982869379015

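So when the mean residual is 0, the two scores are identical. Which one to choose depends on your needs: should a nonzero mean residual (a systematic bias in the predictions) count against the model? r2_score penalizes it; explained_variance_score does not.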
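To make the gap between the two scores concrete, here is a minimal pure-Python sketch. The helper names `explained_variance` and `r2` are hypothetical, hand-rolled versions of the textbook definitions, not sklearn's own implementation. Under those definitions the difference between the two scores should equal the squared mean residual divided by the (population) variance of y_true, and shifting every prediction by a constant should leave explained variance unchanged while R^2 drops sharply.

```python
from statistics import fmean, pvariance


def explained_variance(y_true, y_pred):
    """Hypothetical helper: 1 - Var(residuals) / Var(y_true), population variances."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1 - pvariance(residuals) / pvariance(y_true)


def r2(y_true, y_pred):
    """Hypothetical helper: 1 - SS_res / SS_tot."""
    mean_true = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Same data as in the session above.
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]

ev = explained_variance(y_true, y_pred)
r2_value = r2(y_true, y_pred)

# The gap between the scores comes entirely from the nonzero mean residual.
mean_residual = fmean(t - p for t, p in zip(y_true, y_pred))
gap = mean_residual ** 2 / pvariance(y_true)

# Add a constant bias of +10 to every prediction.
y_shifted = [p + 10 for p in y_pred]
ev_shifted = explained_variance(y_true, y_shifted)
r2_shifted = r2(y_true, y_shifted)

print(ev, r2_value, ev - r2_value, gap)
print(ev_shifted, r2_shifted)
```

If the bias matters for your use case (for example, you care about the predicted values themselves rather than only their variation), R^2 is the stricter choice.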

This answer comes from a similar question on Stack Overflow, "python - Python Sci-Kit Learn (Metrics): difference between r2_score and explained_variance_score?": https://stackoverflow.com/questions/24378176/
