Problem description
As an R user, I also wanted to get up to speed on scikit.
Creating a linear regression model is fine, but I can't seem to find a reasonable way to get a standard summary of the regression output.
Code example:
# Linear Regression
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LinearRegression
# Load the diabetes datasets
dataset = datasets.load_diabetes()
# Fit a linear regression model to the data
model = LinearRegression()
model.fit(dataset.data, dataset.target)
print(model)
# Make predictions
expected = dataset.target
predicted = model.predict(dataset.data)
# Summarize the fit of the model
mse = np.mean((predicted-expected)**2)
print(model.intercept_, model.coef_, mse)
print(model.score(dataset.data, dataset.target))
Questions:
- It seems the intercept and coef are built into the model, and I just print them (second-to-last line) to see them.
- What about all the other standard regression output, such as R^2, adjusted R^2, p-values, etc.? If I read the examples correctly, it seems you have to write a function/equation for each of these and then print it.
- So, is there no standard summary output for linear regression models?
- Also, my printed array of coefficients has no variable names associated with the values; I just get the numeric array. Is there a way to print the coefficients together with the variables they go with?
My printed output:
LinearRegression(copy_X=True, fit_intercept=True, normalize=False)
152.133484163 [ -10.01219782 -239.81908937 519.83978679 324.39042769 -792.18416163
476.74583782 101.04457032 177.06417623 751.27932109 67.62538639] 2859.69039877
0.517749425413
Note: I started off with Linear, Ridge and Lasso, and I have gone through the examples. This is for basic OLS.
Recommended answer
sklearn has no R-style regression summary report. The main reason is that sklearn is designed for predictive modelling / machine learning, where the evaluation criteria are based on performance on previously unseen data (such as predictive R^2 for regression).
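To illustrate what "performance on previously unseen data" can look like in practice, here is a minimal sketch that scores the same diabetes model on a held-out split and pairs each coefficient with its column name. The 25% test size and the `random_state` value are arbitrary choices made for this illustration, not something the answer prescribes.

```python
from sklearn import datasets
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

dataset = datasets.load_diabetes()

# Hold out a quarter of the rows so the scores reflect unseen data.
# (test_size and random_state are illustrative assumptions.)
X_train, X_test, y_train, y_test = train_test_split(
    dataset.data, dataset.target, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
predicted = model.predict(X_test)

# Predictive scores computed on the held-out rows only.
r2_test = r2_score(y_test, predicted)
mse_test = mean_squared_error(y_test, predicted)

# Pair each coefficient with the name of the column it belongs to.
named_coefs = dict(zip(dataset.feature_names, model.coef_))

print(f"held-out R^2: {r2_test:.3f}")
print(f"held-out MSE: {mse_test:.1f}")
print(f"intercept: {model.intercept_:.3f}")
for name, coef in named_coefs.items():
    print(f"{name:>4}: {coef:10.3f}")
```

This also addresses the variable-name question: the loaded dataset exposes `feature_names`, so the names can be zipped with `model.coef_` by hand.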
There is a summary function for classification, sklearn.metrics.classification_report, which calculates several types of (predictive) scores for a classification model.
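As a rough sketch of how that function is called, the example below fits a classifier and prints the report for a held-out split. The iris dataset, the `LogisticRegression` model, and the split parameters are assumptions picked for the illustration; the answer does not name any of them.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Illustrative classification problem (not part of the original question).
iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0
)

# max_iter is raised so the solver has room to converge on this data.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = clf.predict(X_test)

# The report is a plain-text table with per-class precision, recall,
# F1 score and support, computed on the held-out rows.
report = classification_report(y_test, y_pred, target_names=iris.target_names)
print(report)
```

Note that this report covers predictive scores only; it does not include p-values or standard errors.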
For a more classic statistical approach, take a look at statsmodels.