How to aggregate data in a data.frame in R

Problem description

I have a large data.frame that contains many values.

For example:

df <- data.frame(Company = c('A', 'A', 'B', 'C', 'A', 'B', 'B', 'C', 'C'),
                 Name = c("Wayne", "Duane", "William", "Rafael", "John", "Eric", "James", "Pablo", "Tammy"),
                 Age = c(26, 27, 28, 32, 28, 24, 34, 30, 25),
                 Wages = c(50000, 70000, 70000, 60000, 50000, 70000, 65000, 50000, 50000),
                 Education.University = c(1, 1, 1, 0, 0, 1, 1, 0, 1),
                 Productivity = c(100, 120, 120, 95, 88, 115, 100, 90, 120))

How can I aggregate my data.frame? I want to analyze the values for each Company. The result should contain:

Age -> average Age of all employees in the Company

Wages -> average Wages of all employees in the Company

Education.University -> sum of the indicator values (1 or 0) for all employees in the Company

Productivity -> average Productivity of all employees in the Company

Recommended answer

Base R

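# Average every remaining column per Company; -c(2, 5) drops Name and Education.University
# Then append the per-Company sum of Education.University ([-1] drops its duplicate Company column)
cbind(aggregate(.~Company, df[,-c(2, 5)], mean),
      aggregate(Education.University~Company, df, sum)[-1])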
#  Company      Age    Wages Productivity Education.University
#1       A 27.00000 56666.67     102.6667                    2
#2       B 28.66667 68333.33     111.6667                    3
#3       C 29.00000 53333.33     101.6667                    1
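
The one-liner above drops columns by position, so it would silently average the wrong columns if the column order of df ever changed. As a minimal sketch of the same idea, the columns can be named explicitly on the left-hand side of the formula with cbind(), and the two results joined with merge() instead of relying on matching row order. The names `means`, `sums` and `result` are illustrative and not part of the original answer.

```r
# Same example data as in the question
df <- data.frame(Company = c('A', 'A', 'B', 'C', 'A', 'B', 'B', 'C', 'C'),
                 Name = c("Wayne", "Duane", "William", "Rafael", "John", "Eric", "James", "Pablo", "Tammy"),
                 Age = c(26, 27, 28, 32, 28, 24, 34, 30, 25),
                 Wages = c(50000, 70000, 70000, 60000, 50000, 70000, 65000, 50000, 50000),
                 Education.University = c(1, 1, 1, 0, 0, 1, 1, 0, 1),
                 Productivity = c(100, 120, 120, 95, 88, 115, 100, 90, 120))

# Mean of the explicitly named columns per Company
means <- aggregate(cbind(Age, Wages, Productivity) ~ Company, data = df, FUN = mean)

# Sum of the 0/1 indicator per Company
sums <- aggregate(Education.University ~ Company, data = df, FUN = sum)

# Join on the Company key rather than binding columns side by side
result <- merge(means, sums, by = "Company")
print(result)
```

Note that the formula interface of aggregate() drops rows with missing values by default (na.action = na.omit), which does not matter for this complete example but may for real data.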

Here is a longer version that may be easier to understand:

merge(x = aggregate(x = list(Age_av = df$Age,
                             Wages_av = df$Wages,
                             Productivity_av = df$Productivity),
                by = list(Company = df$Company),
                FUN = mean),
      y = aggregate(x = list(Education.University_sum = df$Education.University),
                by = list(Company = df$Company),
                FUN = sum),
      by = "Company")
#  Company   Age_av Wages_av Productivity_av Education.University_sum
#1       A 27.00000 56666.67        102.6667                        2
#2       B 28.66667 68333.33        111.6667                        3
#3       C 29.00000 53333.33        101.6667                        1
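
Both versions call aggregate() twice because the columns need different summary functions. One possible alternative, sketched here in base R and not taken from the original answer, is to split the data by Company and build one summary row per group, so that each column can get its own function in a single pass. The names `per_company` and `summary_df` are hypothetical.

```r
# Same example data as in the question
df <- data.frame(Company = c('A', 'A', 'B', 'C', 'A', 'B', 'B', 'C', 'C'),
                 Name = c("Wayne", "Duane", "William", "Rafael", "John", "Eric", "James", "Pablo", "Tammy"),
                 Age = c(26, 27, 28, 32, 28, 24, 34, 30, 25),
                 Wages = c(50000, 70000, 70000, 60000, 50000, 70000, 65000, 50000, 50000),
                 Education.University = c(1, 1, 1, 0, 0, 1, 1, 0, 1),
                 Productivity = c(100, 120, 120, 95, 88, 115, 100, 90, 120))

# Split into one data.frame per Company and summarise each group as a single row
per_company <- lapply(split(df, df$Company), function(g) {
  data.frame(Company = g$Company[1],
             Age_av = mean(g$Age),
             Wages_av = mean(g$Wages),
             Productivity_av = mean(g$Productivity),
             Education.University_sum = sum(g$Education.University))
})

# Stack the per-group rows into one data.frame and drop the group names used as row names
summary_df <- do.call(rbind, per_company)
rownames(summary_df) <- NULL
print(summary_df)
```

This keeps each column next to the function applied to it, at the cost of being more verbose than the aggregate() calls.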


08-22 21:25