
Question

Say I have a data frame in R as follows:

> set.seed(1)
> X <- runif(50, 0, 1)
> Y <- runif(50, 0, 1)
> df <- data.frame(X,Y)
> head(df)

          X          Y
1 0.2655087 0.47761962
2 0.3721239 0.86120948
3 0.5728534 0.43809711
4 0.9082078 0.24479728
5 0.2016819 0.07067905
6 0.8983897 0.09946616

How do I perform a recursive regression of Y on X, starting with, say, the first 20 observations and increasing the regression window by one observation at a time until it covers the full sample?

There is a lot of information out there on how to perform a rolling regression with a fixed window length (e.g. using rollapply in the zoo package). However, my search has come up empty when it comes to a simple recursive option, where the starting point is fixed and the window size increases instead. An exception is the lm.fit.recursive function from the quantreg package. This works perfectly, except that it doesn't record any information about standard errors, which I need for constructing recursive confidence intervals.

I can of course use a loop to achieve this. However, my actual data frame is very large and also grouped by id, which causes complications. So I'm hoping to find a more efficient option. Basically, I'm looking for the R equivalent of the "rolling [...], recursive" command in Stata.

Recommended answer

Maybe this will help:

set.seed(1)
X1 <- runif(50, 0, 1)
X2 <- runif(50, 0, 10) # I included another variable just for a better demonstration
Y <- runif(50, 0, 1)
df <- data.frame(X1,X2,Y)


rolling_lms <- lapply( seq(20,nrow(df) ), function(x) lm( Y ~ X1+X2, data = df[1:x , ]) )

Using the lapply call above, you get the recursive regression you want, with full information for every fit.

For example, for the first regression, which uses 20 observations:

> summary(rolling_lms[[1]])

Call:
lm(formula = Y ~ X1 + X2, data = df[1:x, ])

Residuals:
     Min       1Q   Median       3Q      Max
-0.45975 -0.19158 -0.05259  0.13609  0.67775

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.61082    0.17803   3.431  0.00319 **
X1          -0.37834    0.23151  -1.634  0.12060
X2           0.01949    0.02541   0.767  0.45343
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2876 on 17 degrees of freedom
Multiple R-squared:  0.1527,    Adjusted R-squared:  0.05297
F-statistic: 1.531 on 2 and 17 DF,  p-value: 0.2446

It has all the information you need, including the standard errors.

> length(rolling_lms)
[1] 31

It performed 31 linear regressions, starting with 20 observations and ending with all 50. Each regression, with all of its information, is stored as an element of the rolling_lms list.
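Since the question asks for recursive confidence intervals, here is a minimal sketch of how the standard errors and intervals could be pulled out of each stored fit. It rebuilds the same rolling_lms list so that it runs on its own. The name ci_X1 and the 95% level are illustrative choices, not something from the original answer.

```r
set.seed(1)
X1 <- runif(50, 0, 1)
X2 <- runif(50, 0, 10)
Y  <- runif(50, 0, 1)
df <- data.frame(X1, X2, Y)

# Same expanding-window fits as in the answer: windows 1:20, 1:21, ..., 1:50
rolling_lms <- lapply(seq(20, nrow(df)), function(n) lm(Y ~ X1 + X2, data = df[1:n, ]))

# One row per window: sample size, X1 estimate, its standard error, and a 95% CI
ci_X1 <- t(sapply(rolling_lms, function(m) {
  est <- coef(summary(m))["X1", c("Estimate", "Std. Error")]
  ci  <- confint(m, "X1", level = 0.95)  # 1 x 2 matrix: lower and upper bound
  c(n        = nobs(m),
    estimate = est[["Estimate"]],
    se       = est[["Std. Error"]],
    lower    = ci[1, 1],
    upper    = ci[1, 2])
}))

head(ci_X1)
```

The result is a plain numeric matrix with one row per window, so the lower and upper columns can be plotted against n to show how the interval changes as the sample grows.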

Edit

As per Carl's comment below, to get a vector of the slopes from every regression (here for the X1 variable), this is a very good technique, as Carl suggested:

all_slopes <- sapply(rolling_lms, function(m) coef(m)["X1"])

Output:

> all_slopes
         X1          X1          X1          X1          X1          X1          X1          X1          X1          X1
-0.37833614 -0.23231852 -0.20465589 -0.20458938 -0.11796060 -0.14621369 -0.13861210 -0.11906724 -0.10149900 -0.14045509
         X1          X1          X1          X1          X1          X1          X1          X1          X1          X1
-0.14331323 -0.14450837 -0.16214836 -0.15715630 -0.17388457 -0.11427933 -0.10624746 -0.09767893 -0.10111773 -0.06415914
         X1          X1          X1          X1          X1          X1          X1          X1          X1          X1
-0.06432559 -0.04492075 -0.04122131 -0.06138768 -0.06287532 -0.06305953 -0.06491377 -0.01389334 -0.01703270 -0.03683358
         X1
-0.02039574
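The question also mentions data grouped by id, which the answer above does not cover. One possible approach, shown here only as a sketch, is to split the data by group and apply the same expanding-window idea within each group. The data frame df_g, its id column, the helper recursive_fit and its start argument are all hypothetical names invented for this example.

```r
set.seed(1)
# Hypothetical grouped data: 3 ids with 30 observations each (not from the original post)
df_g <- data.frame(
  id = rep(c("a", "b", "c"), each = 30),
  X  = runif(90),
  Y  = runif(90)
)

# Expanding-window fits within one group; `start` is the size of the first window.
# Assumes the group has at least `start` rows.
recursive_fit <- function(d, start = 20) {
  stopifnot(nrow(d) >= start)
  do.call(rbind, lapply(seq(start, nrow(d)), function(n) {
    s <- coef(summary(lm(Y ~ X, data = d[seq_len(n), ])))
    data.frame(n = n, slope = s["X", "Estimate"], se = s["X", "Std. Error"])
  }))
}

# Fit each group separately, then stack the results with the id attached
by_id  <- lapply(split(df_g, df_g$id), recursive_fit)
result <- do.call(rbind, Map(function(id, res) cbind(id = id, res), names(by_id), by_id))
rownames(result) <- NULL

head(result)
```

This still refits a model for every window, so it does not address the efficiency concern for very large data; it only shows how the grouping can be handled without writing explicit loops.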
