I have a data.frame that looks like this:

Geotype <- c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3)
Strategy <- c("Demand", "Strategy 1", "Strategy 2", "Strategy 3", "Strategy 4", "Strategy 5", "Strategy 6")
Year.1  <- c(1:21)
Year.2  <- c(1:21)
Year.3  <- c(1:21)
Year.4  <- c(1:21)
mydata <- data.frame(Geotype,Strategy,Year.1, Year.2, Year.3, Year.4)

I want to sum the strategies for each year.

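This means that, within each column of the data frame, I need to add up the 6 strategy rows and skip the Demand row, once per Geotype. I then want to repeat this for all columns (40 years).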

I would like the output data frame to look like this:
Geotype.output <- c(1, 2, 3)
Year.1.output  <- c(27, 69, 111)
Year.2.output  <- c(27, 69, 111)
Year.3.output  <- c(27, 69, 111)
Year.4.output  <- c(27, 69, 111)
output <- data.frame(Geotype.output,Year.1.output, Year.2.output, Year.3.output, Year.4.output)

Any suggestions on how to do this elegantly? I tried to hack a solution together from several other answers, but I did not succeed because I need to skip a row.

Best answer

Using data.table:

library(data.table)
setDT(mydata)
output = mydata[Strategy != "Demand",
                .(Year.1.output = sum(Year.1),
                  Year.2.output = sum(Year.2),
                  Year.3.output = sum(Year.3),
                  Year.4.output = sum(Year.4)),
                by = Geotype]

#    Geotype Year.1.output Year.2.output Year.3.output Year.4.output
# 1:       1            27            27            27            27
# 2:       2            69            69            69            69
# 3:       3           111           111           111           111
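
For readers who do not use data.table, the same filter-then-group logic can be sketched in base R with aggregate(). This is an alternative technique, not part of the original answer. The names mydata_df, no_demand, year_cols and output_base are hypothetical and chosen only for this illustration. The data is rebuilt here so the snippet runs on its own.

```r
# Rebuild the example data from the question as a plain data.frame
# (mydata_df is a hypothetical name, used to avoid overwriting mydata)
mydata_df <- data.frame(
  Geotype  = rep(1:3, each = 7),
  Strategy = rep(c("Demand", paste("Strategy", 1:6)), times = 3),
  Year.1 = 1:21, Year.2 = 1:21, Year.3 = 1:21, Year.4 = 1:21
)

# Step 1: drop the Demand rows
no_demand <- mydata_df[mydata_df$Strategy != "Demand", ]

# Step 2: pick every column whose name starts with "Year."
year_cols <- grep("^Year\\.", names(no_demand), value = TRUE)

# Step 3: sum those columns within each Geotype
output_base <- aggregate(no_demand[year_cols],
                         by = list(Geotype = no_demand$Geotype),
                         FUN = sum)

print(output_base)
```

Because the year columns are selected by pattern, the same call should cover all 40 year columns without listing them by hand, assuming they all follow the Year.N naming shown in the question.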

The data.table call can be simplified so that it handles many year columns more easily:
setDT(mydata)[Strategy != "Demand",
             lapply(.SD, sum),
             by=Geotype,
             .SDcols=grep("Year", names(mydata))]
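
Note that the .SD version keeps the original column names (Year.1, Year.2, ...) rather than the .output names requested in the question. A minimal sketch of one way to add the suffix afterwards, assuming data.table is installed, is shown here. The names dt, year_cols and result are hypothetical and used only for illustration.

```r
library(data.table)

# Rebuild the example data from the question as a data.table
# (dt is a hypothetical name, used to avoid overwriting mydata)
dt <- data.table(
  Geotype  = rep(1:3, each = 7),
  Strategy = rep(c("Demand", paste("Strategy", 1:6)), times = 3),
  Year.1 = 1:21, Year.2 = 1:21, Year.3 = 1:21, Year.4 = 1:21
)

# Select the year columns by name so the code scales to 40 years
year_cols <- grep("^Year\\.", names(dt), value = TRUE)

# Drop Demand rows, then sum every year column within each Geotype
result <- dt[Strategy != "Demand",
             lapply(.SD, sum),
             by = Geotype,
             .SDcols = year_cols]

# Rename Year.N to Year.N.output to match the layout asked for
setnames(result, year_cols, paste0(year_cols, ".output"))

print(result)
```

Passing column names (value = TRUE) instead of positions keeps the selection readable and makes the later setnames() call straightforward.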

The original question, "R sum every n rows across n columns", can be found on Stack Overflow: https://stackoverflow.com/questions/39956852/
