This article covers how to run single-variable (univariate) regressions in Python; it should be a useful reference for anyone facing the same problem.

Problem Description


I need to run multiple single-factor (univariate) regression models in Python, between one column of a DataFrame and several other columns of the same DataFrame.

[image: sample DataFrame with a dependent column dep and predictor columns x1, x2, ...]

So, based on the image, I want to run regression models between x1 & dep, x2 & dep, and so on and so forth.

I want to output: beta, intercept, R-squared, p-value, SSE, AIC, BIC, a normality test of the residuals, etc.

Recommended Answer

There are two options you can use here. One is the popular scikit-learn library. It is used as follows:

from sklearn.linear_model import LinearRegression
reg = LinearRegression()
reg.fit(X, y)    # where X is your feature data and y is your target
reg.score(X, y)  # R^2 value
>>> 0.87
reg.coef_        # slope coefficients
>>> array([1.45, -9.2])
reg.intercept_   # intercept
>>> 6.1723...

There are not many other statistics you can get out of scikit-learn, though.
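If you do stay with scikit-learn, the per-column loop the question describes might look like the sketch below. The DataFrame here is synthetic, and the column names dep, x1 and x2 are assumptions taken from the question's image; only the slope, intercept and R-squared are readily available this way.

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# synthetic stand-in for the question's dataframe (column names are assumed)
rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.normal(size=50), "x2": rng.normal(size=50)})
df["dep"] = 2.0 * df["x1"] - 1.5 * df["x2"] + rng.normal(size=50)

y = df["dep"]
for col in ["x1", "x2"]:
    X = df[[col]]                        # one predictor at a time
    reg = LinearRegression().fit(X, y)
    print(col, reg.coef_[0], reg.intercept_, reg.score(X, y))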

Another option is statsmodels, which offers far richer detail about the statistics of the model:

import numpy as np
import statsmodels.api as sm

# generate some synthetic data
nsample = 100
x = np.linspace(0, 10, nsample)
X = np.column_stack((x, x**2))
beta = np.array([1, 0.1, 10])
e = np.random.normal(size=nsample)

X = sm.add_constant(X)
y = np.dot(X, beta) + e

# fit the model and get a summary of the statistics
model = sm.OLS(y, X)
results = model.fit()
print(results.summary())

                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       1.000
Model:                            OLS   Adj. R-squared:                  1.000
Method:                 Least Squares   F-statistic:                 4.020e+06
Date:                Mon, 08 Jul 2019   Prob (F-statistic):          2.83e-239
Time:                        02:07:22   Log-Likelihood:                -146.51
No. Observations:                 100   AIC:                             299.0
Df Residuals:                      97   BIC:                             306.8
Df Model:                           2
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          1.3423      0.313      4.292      0.000       0.722       1.963
x1            -0.0402      0.145     -0.278      0.781      -0.327       0.247
x2            10.0103      0.014    715.745      0.000       9.982      10.038
==============================================================================
Omnibus:                        2.042   Durbin-Watson:                   2.274
Prob(Omnibus):                  0.360   Jarque-Bera (JB):                1.875
Skew:                           0.234   Prob(JB):                        0.392
Kurtosis:                       2.519   Cond. No.                         144.
==============================================================================

You can see that statsmodels offers much more detail, such as the AIC, BIC, t-statistics, and so on.
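To collect everything the question asks for (beta, intercept, R-squared, p-value, SSE, AIC, BIC and a normality test of the residuals) for each predictor column, one approach is to loop over the columns and fit a single-factor OLS model each time. The following is a minimal sketch along those lines; the DataFrame is synthetic and the column names dep, x1 and x2 are assumptions based on the question's image.

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.stattools import jarque_bera

# synthetic stand-in for the question's dataframe (column names are assumed)
rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.normal(size=100), "x2": rng.normal(size=100)})
df["dep"] = 3.0 * df["x1"] + 0.5 * df["x2"] + rng.normal(size=100)

rows = []
for col in ["x1", "x2"]:
    X = sm.add_constant(df[col])                     # intercept + one predictor
    res = sm.OLS(df["dep"], X).fit()
    jb_stat, jb_pval, _, _ = jarque_bera(res.resid)  # residual normality test
    rows.append({
        "predictor": col,
        "beta": res.params[col],
        "intercept": res.params["const"],
        "r_squared": res.rsquared,
        "p_value": res.pvalues[col],
        "sse": res.ssr,                              # sum of squared residuals
        "aic": res.aic,
        "bic": res.bic,
        "jb_pvalue": jb_pval,
    })

print(pd.DataFrame(rows))

Each fitted result exposes these statistics as attributes (results.params, results.rsquared, results.pvalues, results.ssr, results.aic, results.bic), so nothing has to be parsed out of the printed summary.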

That concludes this article on single-variable regression in Python. Hopefully the recommended answer above is helpful.

