This article describes how to aggregate data in a data.frame in R.
Problem description
I have a large data.frame that contains many values.
For example:
df <- data.frame(Company = c('A', 'A', 'B', 'C', 'A', 'B', 'B', 'C', 'C'),
                 Name = c("Wayne", "Duane", "William", "Rafael", "John", "Eric", "James", "Pablo", "Tammy"),
                 Age = c(26, 27, 28, 32, 28, 24, 34, 30, 25),
                 Wages = c(50000, 70000, 70000, 60000, 50000, 70000, 65000, 50000, 50000),
                 Education.University = c(1, 1, 1, 0, 0, 1, 1, 0, 1),
                 Productivity = c(100, 120, 120, 95, 88, 115, 100, 90, 120))
How can I aggregate my data.frame? I want to analyze the values for each Company. The result should look like this:
Age -> average Age of all employees in the Company
Wages -> average Wages of all employees in the Company
Education.University -> sum of the indicator values (1 or 0) over all employees in the Company
Productivity -> average Productivity of all employees in the Company
Recommended answer
Base R
cbind(aggregate(. ~ Company, df[, -c(2, 5)], mean),
      aggregate(Education.University ~ Company, df, sum)[-1])
#  Company      Age    Wages Productivity Education.University
#1       A 27.00000 56666.67     102.6667                    2
#2       B 28.66667 68333.33     111.6667                    3
#3       C 29.00000 53333.33     101.6667                    1
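The positional index `-c(2, 5)` in the call above depends on the column order of `df`. As a minimal sketch, assuming the same `df` as in the question, the columns to average can instead be named inside the formula with `cbind()`, and the two partial results joined with `merge()` so that rows are matched on `Company` rather than by row position. The names `means`, `sums` and `result` are illustrative only.

```r
# Example data from the question
df <- data.frame(Company = c('A', 'A', 'B', 'C', 'A', 'B', 'B', 'C', 'C'),
                 Name = c("Wayne", "Duane", "William", "Rafael", "John", "Eric", "James", "Pablo", "Tammy"),
                 Age = c(26, 27, 28, 32, 28, 24, 34, 30, 25),
                 Wages = c(50000, 70000, 70000, 60000, 50000, 70000, 65000, 50000, 50000),
                 Education.University = c(1, 1, 1, 0, 0, 1, 1, 0, 1),
                 Productivity = c(100, 120, 120, 95, 88, 115, 100, 90, 120))

# Average the numeric columns per company, selecting them by name
means <- aggregate(cbind(Age, Wages, Productivity) ~ Company, data = df, FUN = mean)

# Count university graduates per company (sum of the 0/1 indicator)
sums <- aggregate(Education.University ~ Company, data = df, FUN = sum)

# Join the two summaries on the grouping column
result <- merge(means, sums, by = "Company")
print(result)
```

Selecting by name keeps the call working if columns are later added or reordered, at the cost of listing the columns explicitly.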
Here is a longer version that may be easier to understand:
merge(x = aggregate(x = list(Age_av = df$Age,
                             Wages_av = df$Wages,
                             Productivity_av = df$Productivity),
                    by = list(Company = df$Company),
                    FUN = mean),
      y = aggregate(x = list(Education.University_sum = df$Education.University),
                    by = list(Company = df$Company),
                    FUN = sum),
      by = "Company")
#  Company   Age_av Wages_av Productivity_av Education.University_sum
#1       A 27.00000 56666.67        102.6667                        2
#2       B 28.66667 68333.33        111.6667                        3
#3       C 29.00000 53333.33        101.6667                        1
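Both versions above call `aggregate()` once per summary function. A possible variation, still in base R and shown only as a sketch, is to state the column-to-function mapping once and apply it per group with `split()`. The names `funs`, `groups` and `summarise_col` are hypothetical helpers introduced for this example, and the same `df` as in the question is assumed.

```r
# Example data from the question
df <- data.frame(Company = c('A', 'A', 'B', 'C', 'A', 'B', 'B', 'C', 'C'),
                 Name = c("Wayne", "Duane", "William", "Rafael", "John", "Eric", "James", "Pablo", "Tammy"),
                 Age = c(26, 27, 28, 32, 28, 24, 34, 30, 25),
                 Wages = c(50000, 70000, 70000, 60000, 50000, 70000, 65000, 50000, 50000),
                 Education.University = c(1, 1, 1, 0, 0, 1, 1, 0, 1),
                 Productivity = c(100, 120, 120, 95, 88, 115, 100, 90, 120))

# Which summary function to apply to which column (illustrative mapping)
funs <- list(Age = mean,
             Wages = mean,
             Education.University = sum,
             Productivity = mean)

# One data.frame per company, named by company
groups <- split(df, df$Company)

# Apply the function registered for `col` to that column within each group
summarise_col <- function(col) {
  vapply(groups, function(g) funs[[col]](g[[col]]), numeric(1), USE.NAMES = FALSE)
}

# Assemble one row per company; the list is named by column via sapply()
result <- data.frame(Company = names(groups),
                     sapply(names(funs), summarise_col, simplify = FALSE))
print(result)
```

With this layout, changing the summary for a column means editing a single entry in `funs`. For a very large data.frame the `aggregate()` calls above may still be the simpler choice.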