Problem description
I'm struggling with what seems like it should be a simple extension of a previous question I'd asked here.
I'm trying to aggregate over (a) a range of dates and (b) a factor variable. Sample data might be:
Brand Day Rev RVP
A 1 2535.00 195.00
B 1 1785.45 43.55
C 1 1730.87 32.66
A 2 920.00 230.00
B 2 248.22 48.99
C 3 16466.00 189.00
A 1 2535.00 195.00
B 3 1785.45 43.55
C 3 1730.87 32.66
A 4 920.00 230.00
B 5 248.22 48.99
C 4 16466.00 189.00
Thanks to helpful advice, I've figured out how to find the mean revenue for brands over a period of days using data.table:
new_df<-df[,(mean(Rev)), by=list(Brand,Day)]
Now I'd like to create a new table with a column listing the coefficient estimate from an OLS regression of Rev on Day for each brand. I tried to do this as follows:
new_df2<-df[,(lm(Rev~Day)), by=list(Brand)]
That doesn't seem quite right. Thoughts? I'm sure it's something obvious I've missed.
Recommended answer
I think this is what you want:
new_df2<-df[,(lm(Rev~Day)$coefficients[["Day"]]), by=list(Brand)]
lm returns a full model object, so you need to drill down into it to get a single value from each group that can be turned into a column.
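For reference, here is a minimal self-contained sketch of that idea, using the sample data from the question. The object and column names (slopes, coefs, slope, intercept) are only illustrative and are not part of the original answer. coef() is the standard accessor for a fitted model's coefficients and returns the same values as $coefficients for an lm fit.

```r
library(data.table)

# Sample data copied from the question
df <- data.table(
  Brand = rep(c("A", "B", "C"), times = 4),
  Day   = c(1, 1, 1, 2, 2, 3, 1, 3, 3, 4, 5, 4),
  Rev   = c(2535.00, 1785.45, 1730.87, 920.00, 248.22, 16466.00,
            2535.00, 1785.45, 1730.87, 920.00, 248.22, 16466.00),
  RVP   = c(195.00, 43.55, 32.66, 230.00, 48.99, 189.00,
            195.00, 43.55, 32.66, 230.00, 48.99, 189.00)
)

# One row per brand: fit Rev ~ Day within each group and keep only
# the Day coefficient, stored in a named column (illustrative name)
slopes <- df[, .(slope = coef(lm(Rev ~ Day))[["Day"]]), by = Brand]

# Variant: fit once per group and keep both coefficients as columns
coefs <- df[, {
  fit <- lm(Rev ~ Day)
  .(intercept = coef(fit)[["(Intercept)"]], slope = coef(fit)[["Day"]])
}, by = Brand]

print(slopes)
print(coefs)
```

Naming the column inside .() avoids the default V1 column name, which can make later joins or plots easier to read. With only four observations per brand, the estimates here are of course very noisy.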