data.table: cumulative values grouped by a variable
Question
I have the data
library(data.table)
set.seed(42)
dat <- data.table(id = 1:8, group = c(1, 1, 2, 2, 2, 3, 3, 3), val = rnorm(8))
> dat
   id group         val
1:  1     1  1.37095845
2:  2     1 -0.56469817
3:  3     2  0.36312841
4:  4     2  0.63286260
5:  5     2  0.40426832
6:  6     3 -0.10612452
7:  7     3  1.51152200
8:  8     3 -0.09465904
and I would like to obtain the cumulative sum of val within each level of group:
> res
   id group         cum
1:  1     1  1.37095845
2:  2     1  0.80626028
3:  3     2  0.36312841
4:  4     2  0.995991
5:  5     2  1.400259
6:  6     3 -0.10612452
7:  7     3  1.405397
8:  8     3  1.310738
I am always astonished by the efficiency of data.table, so I'm wondering about a way to get this done in data.table, but of course any other efficient solution is just as welcome.
Recommended answer
You can use cumsum, evaluated separately for each group with by:
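# add the within-group running total as a new column, by reference
dat[, cum := cumsum(val), by = group]
# drop the original column so that only id, group and cum remain
dat[, val := NULL]
> dat
   id group        cum
1:  1     1  1.3709584
2:  2     1  0.8062603
3:  3     2  0.3631284
4:  4     2  0.9959910
5:  5     2  1.4002593
6:  6     3 -0.1061245
7:  7     3  1.4053975
8:  8     3  1.3107384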
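Note that := modifies dat in place, so after the two calls above the original val column is gone. If the input should stay untouched, one possible variant is sketched below. It is an illustration, not part of the original answer: it builds the result as a separate table named res (as in the question), and the base R ave() comparison is added here only as a cross-check, which also covers the "any other solution" part of the question.

```r
library(data.table)

set.seed(42)
dat <- data.table(id = 1:8, group = c(1, 1, 2, 2, 2, 3, 3, 3), val = rnorm(8))

# Build the result as a new table; dat itself is not modified.
# The grouping column comes first in the output, so reorder afterwards.
res <- dat[, .(id, cum = cumsum(val)), by = group]
setcolorder(res, c("id", "group", "cum"))

# Base R cross-check: ave() applies cumsum within each group,
# returning values in the original row order.
base_cum <- ave(dat$val, dat$group, FUN = cumsum)
stopifnot(isTRUE(all.equal(res$cum, base_cum)))
```

Here the rows of res come out in the same order as dat because each group occupies a contiguous block of rows. With interleaved groups the grouped result would be ordered by first appearance of each group, so the := form is the safer choice when row order matters.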