Problem description

I'm trying to get multiple summary statistics in R/S-PLUS, grouped by a categorical column, in one shot. I found a couple of functions, but each of them computes only one statistic per call, like aggregate().

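# Sample data: 24 measurements in four groups (A-D) of sizes 4, 6, 6, 8
data <- c(62, 60, 63, 59, 63, 67, 71, 64, 65, 66, 68, 66,
          71, 67, 68, 68, 56, 62, 60, 61, 63, 64, 63, 59)
grp <- factor(rep(LETTERS[1:4], c(4,6,6,8)))
df <- data.frame(group=grp, dt=data)
# 'by' must be a list, so the grouping column is wrapped in list();
# each call computes a single statistic per group
mg <- aggregate(df$dt, by=list(group=df$group), FUN=mean)
mg <- aggregate(df$dt, by=list(group=df$group), FUN=sum)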

What I'm looking for is a way to get multiple statistics for the same group (mean, min, max, standard deviation, etc.) in one call. Is that doable?

Recommended answer

1. tapply

I'll put in my two cents for tapply().

tapply(df$dt, df$group, summary)

You could write a custom function with the specific statistics you want or format the results:

tapply(df$dt, df$group,
  function(x) format(summary(x), scientific = TRUE))
$A
       Min.     1st Qu.      Median        Mean     3rd Qu.        Max.
"5.900e+01" "5.975e+01" "6.100e+01" "6.100e+01" "6.225e+01" "6.300e+01"

$B
       Min.     1st Qu.      Median        Mean     3rd Qu.        Max.
"6.300e+01" "6.425e+01" "6.550e+01" "6.600e+01" "6.675e+01" "7.100e+01"

$C
       Min.     1st Qu.      Median        Mean     3rd Qu.        Max.
"6.600e+01" "6.725e+01" "6.800e+01" "6.800e+01" "6.800e+01" "7.100e+01"

$D
       Min.     1st Qu.      Median        Mean     3rd Qu.        Max.
"5.600e+01" "5.975e+01" "6.150e+01" "6.100e+01" "6.300e+01" "6.400e+01"
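
As a minimal sketch of the custom-function route mentioned above, the following collects mean, min, max, and standard deviation per group in a single tapply() call and binds the per-group vectors into one matrix. The helper name group_stats is an assumption made for illustration; the data is the same as in the question.

```r
# Same data as in the question.
data <- c(62, 60, 63, 59, 63, 67, 71, 64, 65, 66, 68, 66,
          71, 67, 68, 68, 56, 62, 60, 61, 63, 64, 63, 59)
grp <- factor(rep(LETTERS[1:4], c(4, 6, 6, 8)))
df <- data.frame(group = grp, dt = data)

# Hypothetical helper: returns a named numeric vector of statistics.
group_stats <- function(x) {
  c(mean = mean(x), min = min(x), max = max(x), sd = sd(x))
}

# tapply() yields one named vector per group; rbind stacks them
# into a matrix with one row per group and one column per statistic.
stats <- do.call(rbind, tapply(df$dt, df$group, group_stats))
print(stats)
```

Returning a named vector keeps the column labels, so an individual value can be looked up as stats["A", "mean"].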

2. data.table

The data.table package offers a lot of helpful and fast tools for these types of operation:

library(data.table)
setDT(df)
df[, as.list(summary(dt)), by = group]
   group Min. 1st Qu. Median Mean 3rd Qu. Max.
1:     A   59   59.75   61.0   61   62.25   63
2:     B   63   64.25   65.5   66   66.75   71
3:     C   66   67.25   68.0   68   68.00   71
4:     D   56   59.75   61.5   61   63.00   64
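
If only a few specific statistics are wanted rather than the full summary() output, they can be listed directly in the j expression. The sketch below assumes the data.table package is installed; the table name dt_tbl and the output column names are chosen for illustration.

```r
library(data.table)

# Same data as in the question, built directly as a data.table.
data <- c(62, 60, 63, 59, 63, 67, 71, 64, 65, 66, 68, 66,
          71, 67, 68, 68, 56, 62, 60, 61, 63, 64, 63, 59)
grp <- factor(rep(LETTERS[1:4], c(4, 6, 6, 8)))
dt_tbl <- data.table(group = grp, dt = data)

# .() is data.table's alias for list(); each named element becomes
# one output column, computed separately for every value of 'group'.
res <- dt_tbl[, .(mean = mean(dt), min = min(dt),
                  max = max(dt), sd = sd(dt)), by = group]
print(res)
```

This returns one row per group, in order of first appearance, with one column per requested statistic.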
