Calling lm() inside mutate()

Problem description

I wonder whether it is possible to use lm() within mutate() from the dplyr package. Currently I have a data frame with the columns "date", "company", "return" and "market.ret", which can be reproduced as follows:

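library(dplyr)

n.dates <- 60
n.stocks <- 2
# 60 consecutive daily dates and 2 random five-letter ticker symbols
date <- seq(as.Date("2011-07-01"), by = 1, length.out = n.dates)
symbol <- replicate(n.stocks, paste0(sample(LETTERS, 5), collapse = ""))
# One row per (date, company) pair with a simulated return
x <- expand.grid(date, symbol)
x$return <- rnorm(n.dates * n.stocks, 0, sd = 0.05)
names(x) <- c("date", "company", "return")
# Market return = mean return across companies on each date
# (use the bare column name; x$return would ignore the grouping)
x <- group_by(x, date)
x <- mutate(x, market.ret = mean(return, na.rm = TRUE))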

Now, for each company, I would like to regress "return" on "market.ret", extract the linear regression coefficient, and store the slope in a new column. I would like to do this with mutate(), but the code below does not work:

x <- group_by(x, company)
x <- mutate(x, beta = coef(lm(x$return~x$market.ret))[[2]])

The error reported by R is:

Error in terms.formula(formula, data = data) : 
invalid term in model formula

Thanks in advance for any suggestions!

Recommended answer

This seems to work for me:

# Fit one regression per company, then join the slopes back onto x
group_by(x, company) %>%
    do(data.frame(beta = coef(lm(return ~ market.ret, data = .))[2])) %>%
    left_join(x, ., by = "company")
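
Since the question asks specifically about mutate(), here is a minimal sketch of that route. It assumes a reasonably recent dplyr, in which bare column names inside a grouped mutate() refer to the current group only, whereas x$return always reaches back to the whole data frame. That prefix is a likely source of the trouble in the question, though the exact cause of the reported error may depend on the dplyr version. The data frame toy, the tickers "AAAAA" and "BBBBB", and the seed are hypothetical stand-ins for the question's data.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed so the sketch is repeatable

# Hypothetical toy data with the same columns as in the question
toy <- expand.grid(
  date = seq(as.Date("2011-07-01"), by = 1, length.out = 60),
  company = c("AAAAA", "BBBBB")
)
toy$return <- rnorm(nrow(toy), 0, sd = 0.05)

toy <- toy %>%
  group_by(date) %>%
  # market return: mean return across companies on each date
  mutate(market.ret = mean(return, na.rm = TRUE)) %>%
  group_by(company) %>%
  # bare column names, so lm() is fitted on each company's rows only
  mutate(beta = coef(lm(return ~ market.ret))[[2]]) %>%
  ungroup()

# One slope per company
distinct(toy, company, beta)
```

Each row of toy then carries its company's slope in beta, so no separate join step is needed.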

