Problem description
I want to aggregate a data frame by a certain group, using a specific operation.
Data
> df <- data.frame(replicate(9, 1:4))
> df
  X1 X2 X3 X4 X5 X6 X7 X8 X9
1  1  1  1  1  1  1  1  1  1
2  2  2  2  2  2  2  2  2  2
3  3  3  3  3  3  3  3  3  3
4  4  4  4  4  4  4  4  4  4
Aggregation
> aggregate(df[,2], list(df[,1]), mean)
  Group.1 x
1       1 1
2       2 2
3       3 3
4       4 4
The above aggregation works, which is great. However, instead of mean, I need to use a combination of functions such as mean*sd/length^2. Should we be using something other than aggregate here?
Recommended answer
I modified your sample data frame so that each group has a length and a standard deviation (you can't compute a standard deviation with only one data point per group).
> df
   X1 X2 X3 X4 X5 X6 X7 X8 X9
1   1  1  1  1  1  1  1  1  1
2   2  2  2  2  2  2  2  2  2
3   3  3  3  3  3  3  3  3  3
4   4  4  4  4  4  4  4  4  4
5   1  1  1  1  1  1  1  1  1
6   2  2  2  2  2  2  2  2  2
7   3  3  3  3  3  3  3  3  3
8   4  4  4  4  4  4  4  4  4
9   1  4  4  4  4  4  4  4  4
10  2  5  5  5  5  5  5  5  5
11  3  6  6  6  6  6  6  6  6
12  4  7  7  7  7  7  7  7  7
13  1  4  4  4  4  4  4  4  4
14  2  5  5  5  5  5  5  5  5
15  3  6  6  6  6  6  6  6  6
16  4  7  7  7  7  7  7  7  7
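The answer does not show how the modified data frame was built. One possible way to reproduce the table above is sketched here; the construction and the name `vals` are assumptions for illustration, not the original author's code. The last line shows why one data point per group is not enough for `sd()`.

```r
# Sketch (assumed construction): rebuild the 16-row data frame printed above
vals <- c(1:4, 1:4, 4:7, 4:7)          # values that appear in X2..X9
df <- data.frame(replicate(9, vals))   # nine identical columns named X1..X9
df$X1 <- rep(1:4, times = 4)           # X1 is the grouping column, cycling 1..4

# With a single observation there is no spread to measure
sd(1)   # NA
```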
To aggregate with a more elaborate formula, pass an anonymous function to aggregate:
aggregate(df[,2], list(df[,1]), function(x){mean(x)*sd(x)/length(x)^2})
  Group.1         x
1       1 0.2706329
2       2 0.3788861
3       3 0.4871393
4       4 0.5953925
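As a sanity check, the first row of that result can be reproduced by hand. The sketch below takes the four X2 values belonging to group 1 in the modified data frame and applies the same formula step by step; the variable names `m`, `s`, and `n` are only illustrative.

```r
x <- c(1, 1, 4, 4)   # X2 values for the rows where X1 == 1
m <- mean(x)         # 2.5
s <- sd(x)           # sqrt(3), about 1.732051
n <- length(x)       # 4
m * s / n^2          # about 0.2706329, matching the first row above
```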
If you want to keep the same column labels, you could do:
aggregate(list(X2 = df[,2]), list(X1 = df[,1]), function(x){mean(x)*sd(x)/length(x)^2})
  X1        X2
1  1 0.2706329
2  2 0.3788861
3  3 0.4871393
4  4 0.5953925
(Or rename the columns afterwards with names().)
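A minimal sketch of renaming after the fact, assuming the same 16-row data frame as above (rebuilt here so the block runs on its own). The helper name `f` and the result names `res` and `res2` are hypothetical. The formula interface of `aggregate` is shown as one more option that also preserves the original column names; it is an alternative, not part of the original answer.

```r
# Rebuild the example data (assumed construction)
vals <- c(1:4, 1:4, 4:7, 4:7)
df <- data.frame(replicate(9, vals))
df$X1 <- rep(1:4, times = 4)

# Hypothetical helper wrapping the combined statistic
f <- function(x) mean(x) * sd(x) / length(x)^2

# Option 1: aggregate first, then rename the default Group.1 / x columns
res <- aggregate(df[, 2], list(df[, 1]), f)
names(res) <- c("X1", "X2")

# Option 2: the formula interface keeps the column names from df
res2 <- aggregate(X2 ~ X1, data = df, FUN = f)
```

Both `res` and `res2` should then carry the columns X1 and X2 with the same values as the output shown earlier.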