问题描述
我一直在使用Python进行回归分析.得到回归结果后,我需要将所有结果汇总到一个表中,然后将它们转换为LaTex(用于发布).是否有任何软件包可以在Python中执行此操作?Stata中的 estout 这样的表格如下:
I've been using Python for regression analysis. After getting the regression results, I need to summarize all the results into one single table and convert them to LaTex (for publication). Is there any package that does this in Python? Something like estout in Stata that gives the following table:
推荐答案
好吧, statsmodels
中有 summary_col
;它没有 estout
的所有花哨功能,但是它确实具有您正在寻找的基本功能(包括导出到LaTeX):
Well, there is summary_col
in statsmodels
; it doesn't have all the bells and whistles of estout
, but it does have the basic functionality you are looking for (including export to LaTeX):
import statsmodels.api as sm
from statsmodels.iolib.summary2 import summary_col
p['const'] = 1
reg0 = sm.OLS(p['p0'],p[['const','exmkt','smb','hml']]).fit()
reg1 = sm.OLS(p['p2'],p[['const','exmkt','smb','hml']]).fit()
reg2 = sm.OLS(p['p4'],p[['const','exmkt','smb','hml']]).fit()
print summary_col([reg0,reg1,reg2],stars=True,float_format='%0.2f')
===============================
p0 p2 p4
-------------------------------
const -1.03*** -0.01 0.62***
(0.11) (0.04) (0.07)
exmkt 1.28*** 0.97*** 0.98***
(0.02) (0.01) (0.01)
smb 0.37*** 0.28*** -0.14***
(0.03) (0.01) (0.02)
hml 0.77*** 0.46*** 0.69***
(0.04) (0.01) (0.02)
===============================
Standard errors in parentheses.
* p<.1, ** p<.05, ***p<.01
或者这是我添加R平方和观察数的版本:
Or here is a version where I add R-Squared and the number of observations:
print summary_col([reg0,reg1,reg2],stars=True,float_format='%0.2f',
info_dict={'N':lambda x: "{0:d}".format(int(x.nobs)),
'R2':lambda x: "{:.2f}".format(x.rsquared)})
===============================
p0 p2 p4
-------------------------------
const -1.03*** -0.01 0.62***
(0.11) (0.04) (0.07)
exmkt 1.28*** 0.97*** 0.98***
(0.02) (0.01) (0.01)
smb 0.37*** 0.28*** -0.14***
(0.03) (0.01) (0.02)
hml 0.77*** 0.46*** 0.69***
(0.04) (0.01) (0.02)
R2 0.86 0.95 0.88
N 1044 1044 1044
===============================
Standard errors in parentheses.
* p<.1, ** p<.05, ***p<.01
另一个例子,这次展示了 model_names
选项的使用以及自变量变化的回归:
Another example, this time showing the use of the model_names
option and regressions where the independent variables vary:
reg3 = sm.OLS(p['p4'],p[['const','exmkt']]).fit()
reg4 = sm.OLS(p['p4'],p[['const','exmkt','smb','hml']]).fit()
reg5 = sm.OLS(p['p4'],p[['const','exmkt','smb','hml','umd']]).fit()
print summary_col([reg3,reg4,reg5],stars=True,float_format='%0.2f',
model_names=['p4\n(0)','p4\n(1)','p4\n(2)'],
info_dict={'N':lambda x: "{0:d}".format(int(x.nobs)),
'R2':lambda x: "{:.2f}".format(x.rsquared)})
==============================
p4 p4 p4
(0) (1) (2)
------------------------------
const 0.66*** 0.62*** 0.15***
(0.10) (0.07) (0.04)
exmkt 1.10*** 0.98*** 1.08***
(0.02) (0.01) (0.01)
hml 0.69*** 0.72***
(0.02) (0.01)
smb -0.14*** 0.07***
(0.02) (0.01)
umd 0.46***
(0.01)
R2 0.78 0.88 0.96
N 1044 1044 1044
==============================
Standard errors in
parentheses.
* p<.1, ** p<.05, ***p<.01
要导出到LaTeX,请使用 as_latex
方法.
To export to LaTeX use the as_latex
method.
我可能是错的,但我认为没有实现t统计信息的选项而不是标准错误(如您的示例).
I could be wrong but I don't think an option for t-stats instead of standard errors (like in your example) is implemented.
这篇关于任何Python库都会生成发布样式回归表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!