Problem description
I am trying to use Python to compute a multiple linear regression and the multiple correlation between a response array and a set of predictor arrays. I saw a very simple example of computing a multiple linear regression, which is easy. But how do I compute the multiple correlation with statsmodels, or with anything else as an alternative? I guess I could use rpy and R, but I'd prefer to stay in Python if possible.
Edit (clarification): Consider a situation like the one described here: http://sphweb.bumc.bu.edu/otlt/MPH-Modules/BS/BS704-EP713_MultivariableMethods/
In addition to the regression coefficients and the other regression parameters, I would also like to compute the multiple correlation coefficients for the predictors.
Recommended answer
You could certainly do this with statsmodels and pandas. Something like this might get you started:
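import pandas
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Small illustrative data set: one row per subject
data = pandas.DataFrame([["A", 4, 0, 1, 27],
                         ["B", 7, 1, 1, 29],
                         ["C", 6, 1, 0, 23],
                         ["D", 2, 0, 0, 20],
                         ["etc.", 3, 0, 1, 21]],
                        columns=["ID", "score", "male", "age20", "BMI"])

# Pairwise correlations between the numeric columns (ID is only a label, so drop it)
print(data.drop(columns="ID").corr())

# Multiple linear regression of BMI on the three predictors
model = ols("BMI ~ score + male + age20", data=data).fit()
print(model.params)
print(model.summary())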
Take a look at the documentation.
Edit: I'm not familiar with the term "multiple correlation coefficient", but I believe it is just the square root of the R-squared of a multiple regression model, no?
# Square root of R-squared, and of the adjusted R-squared
print(model.rsquared ** 0.5)
print(model.rsquared_adj ** 0.5)
Is that what you're after?
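To check the "square root of R-squared" reading, here is a minimal sketch that computes the multiple correlation coefficient R two ways: from R-squared, and as the plain Pearson correlation between the observed response and the fitted values. For an ordinary least squares fit with an intercept the two should agree. The helper name multiple_correlation is hypothetical, the data are the toy rows from the example above, and NumPy is used instead of statsmodels only so the snippet runs on its own.

```python
import numpy as np


def multiple_correlation(X, y):
    """Return (R, fitted) for an OLS fit of y on the columns of X plus an intercept.

    Hypothetical helper for illustration; not part of statsmodels or pandas.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Add an intercept column, as the formula "BMI ~ ..." does by default.
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((y - fitted) ** 2)    # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)  # total sum of squares
    r_squared = 1.0 - ss_res / ss_tot
    # Guard against a tiny negative value caused by floating-point rounding.
    return np.sqrt(max(r_squared, 0.0)), fitted


# Predictors: score, male, age20; response: BMI (toy rows from the example above)
X = [[4, 0, 1], [7, 1, 1], [6, 1, 0], [2, 0, 0], [3, 0, 1]]
y = [27, 29, 23, 20, 21]

R, fitted = multiple_correlation(X, y)
print("R from sqrt(R^2):      ", R)
print("corr(observed, fitted):", np.corrcoef(y, fitted)[0, 1])
```

With a fitted statsmodels model, the same comparison would be model.rsquared ** 0.5 against the correlation of the response column with model.fittedvalues. The square root of the adjusted R-squared is a different quantity, and it is not a real number when the adjusted R-squared is negative.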