r - How to duplicate the last row by group (ID)?

I have a data.frame of surfaces touched over time. For each ActivityID, I just want to append a duplicate of its last row:

head(movsdf.rbind)
  ActivityID CareType HCWType Orientation    Surface       Date     Time       Dev.Date.Time SurfaceCategories
1         01       IV    RN01  leftFacing AlcOutside 2019-08-03 11:08:01 2019-08-03 11:08:01       HygieneArea
2         01       IV    RN01  leftFacing         In 2019-08-03 11:08:12 2019-08-03 11:08:12                In
3         01       IV    RN01  leftFacing       Door 2019-08-03 11:08:12 2019-08-03 11:08:12        FarPatient
4         02       IV    RN01  leftFacing       Door 2019-08-03 11:08:18 2019-08-03 11:08:18        FarPatient
5         02       IV    RN01  leftFacing      Other 2019-08-03 11:08:22 2019-08-03 11:08:22        FarPatient
6         03       IV    RN01  leftFacing      Table 2019-08-03 11:10:26 2019-08-03 11:10:26       NearPatient

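Example data:

# length.out = 40 recycles the three surfaces so both columns have 40 rows
movsdf.rbind <- data.frame(ActivityID = rep(1:4, each = 10), Surface = rep(c("In", "Table", "Out"), each = 10, length.out = 40))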

I can get it to work like this:

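# Take the last value of every column per ActivityID; [-1] drops the extra
# Group.1 column that aggregate() adds, so the columns match for rbind()
repeatss <- aggregate(movsdf.rbind, by = list(movsdf.rbind$ActivityID), FUN = function(x) tail(x, 1))[-1]

movsdf.rbind <- rbind(movsdf.rbind, repeatss)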

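This does the job, but it looks clumsy, and the data end up out of order (that is not very important, but I feel there may be something more elegant in dplyr or data.table). Any ideas?

Best answer

Another option, using slice: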

library(dplyr)

DF %>%
  group_by(ActivityID) %>%
  slice(c(1:n(),n()))

This keeps every row of each group and appends a copy of that group's last row.
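The index passed to slice can be seen at work in a small base R sketch. The toy vector `rows` and the group size `n` are made up for illustration and are not part of the original answer: within a group of size n, `1:n` selects every row and the trailing `n` selects the last row a second time.

```r
# Toy illustration (hypothetical values): one group with three rows
n <- 3
rows <- c("a", "b", "c")

# Every position in the group, followed by the last position once more
idx <- c(1:n, n)
idx

# Indexing with idx repeats the final element
rows[idx]
```

Inside `group_by()`, `n()` plays the role of `n` here, so the same index is built separately for each ActivityID.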

Two base R alternatives:

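# one: cumulative group sizes give the position of each group's last row
# (assumes the rows are already ordered by ActivityID)
lastrows <- cumsum(aggregate(CareType ~ ActivityID, DF, length)[[2]])
DF[sort(c(seq_len(nrow(DF)), lastrows)),]

# two: for each group, take its row indices plus the last index once more
idx <- unlist(tapply(seq_len(nrow(DF)), DF$ActivityID, FUN = function(x) c(x, tail(x, 1))))
DF[idx,]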

Both give the same result.

Two data.table alternatives:
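The index idea behind the second base R approach can be wrapped in a reusable function. This is only a sketch: the name `dup_last_row` and the toy data are hypothetical and do not come from the original answer. Because it collects row indices per group with `split()`, it does not need the rows to be sorted by group beforehand, and the result comes back ordered by group.

```r
# Hypothetical helper, not part of the original answer.
# Returns df with a copy of each group's last row appended to that group.
dup_last_row <- function(df, group_col) {
  # Row indices of df, collected per group in order of appearance
  idx_by_group <- split(seq_len(nrow(df)), df[[group_col]])
  # For each group: all of its indices, then its last index once more
  idx <- unlist(lapply(idx_by_group, function(i) c(i, i[length(i)])),
                use.names = FALSE)
  out <- df[idx, , drop = FALSE]
  rownames(out) <- NULL  # reset row names such as "2.1" created by repeats
  out
}

# Toy data (made up for illustration)
toy <- data.frame(ActivityID = c(1L, 1L, 2L, 3L, 3L),
                  Surface = c("In", "Door", "Table", "Other", "Out"))
res <- dup_last_row(toy, "ActivityID")
res
```

With three groups in `toy`, the result has three more rows than the input, one extra per ActivityID.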

library(data.table)
setDT(DF)          # convert 'DF' to a data.table

# one
DF[DF[, .I[c(1:.N,.N)], by = ActivityID]$V1]

# two
DF[, .SD[c(1:.N,.N)], by = ActivityID]
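
The two data.table variants differ in what they index. The sketch below uses a small made-up table `toy` (not the DF from the answer) to show this, and assumes the data.table package is installed: `.I` holds each row's position in the full table, so the first variant builds a vector of row numbers and subsets once, while `.SD` is the group's own sub-table and is subset group by group.

```r
library(data.table)

# Toy data (hypothetical), with the grouping column first
toy <- data.table(ActivityID = c(1L, 1L, 2L, 3L, 3L),
                  Surface = c("In", "Door", "Table", "Other", "Out"))

# Variant one: collect row numbers of the full table, repeating each
# group's last row number, then subset the table with them
idx <- toy[, .I[c(1:.N, .N)], by = ActivityID]$V1
res_i <- toy[idx]

# Variant two: subset each group's sub-table (.SD) with the same index
res_sd <- toy[, .SD[c(1:.N, .N)], by = ActivityID]

idx
res_sd
```

On large tables the `.I` form is often reported to be faster because it avoids materialising `.SD` for every group, but that is a general rule of thumb rather than something measured here.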

Data used:

DF <- structure(list(ActivityID = c(1L, 1L, 1L, 2L, 2L, 3L),
                     CareType = c("IV", "IV", "IV", "IV", "IV", "IV"),
                     HCWType = c("RN01", "RN01", "RN01", "RN01", "RN01", "RN01"),
                     Orientation = c("leftFacing", "leftFacing", "leftFacing", "leftFacing", "leftFacing", "leftFacing"),
                     Surface = c("AlcOutside", "In", "Door", "Door", "Other", "Table"),
                     Date = c("2019-08-03", "2019-08-03", "2019-08-03", "2019-08-03", "2019-08-03", "2019-08-03"),
                     Time = c("11:08:01", "11:08:12", "11:08:12", "11:08:18", "11:08:22", "11:10:26"),
                     Dev.Date.Time = c("2019-08-03 11:08:01", "2019-08-03 11:08:12", "2019-08-03 11:08:12", "2019-08-03 11:08:18", "2019-08-03 11:08:22", "2019-08-03 11:10:26"),
                     SurfaceCategories = c("HygieneArea", "In", "FarPatient", "FarPatient", "FarPatient", "NearPatient")),
                class = "data.frame", row.names = c(NA, -6L))

Regarding r - How to duplicate the last row by group (ID)?, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57572895/