


I have a data frame, and I would like to add a new column that holds a value taken from another column in the previous row, but only if the factor (Id) is the same.


data <- structure(list(Id = c("a", "b", "b", "b", "a", "a", "b", "b",
"a", "a"), duration.minutes = c(NA, 139L, 535L, 150L, NA, NA,
145L, 545L, 144L, NA), event = structure(c(1L, 4L, 3L, 4L, 2L,
1L, 4L, 3L, 4L, 2L), .Label = c("enter", "exit", "stop", "trip"
), class = "factor")), .Names = c("Id", "duration.minutes", "event"
), class = "data.frame", row.names = 265:274)


I would like to add a new column called "duration.minutes.past", like this:

data <- structure(list(Id = c("a", "b", "b", "b", "a", "a", "b", "b",
"a", "a"), duration.minutes = c(NA, 139L, 535L, 150L, NA, NA,
145L, 545L, 144L, NA), event = structure(c(1L, 4L, 3L, 4L, 2L,
1L, 4L, 3L, 4L, 2L), .Label = c("enter", "exit", "stop", "trip"
), class = "factor"), duration.minutes.past = c(NA, NA, 139,
NA, NA, NA, NA, 145, NA, NA)), .Names = c("Id", "duration.minutes",
"event", "duration.minutes.past"), row.names = 265:274, class = "data.frame")


As you can see, the new column duration.minutes.past holds the duration.minutes of the previous trip for the same Id. If the Id is different, or if the event is not a stop, then the value of duration.minutes.past is NA.




We can do this with data.table. Convert the 'data.frame' to a 'data.table' with setDT(data), group by 'Id', and create the lagged column of 'duration.minutes' using shift(). Then set the value to NA wherever 'event' is not equal to "stop".

library(data.table)
# Lag duration.minutes within each Id, then set every non-stop row to NA
setDT(data)[, duration.minutes.past := shift(duration.minutes),
             by = Id][event != "stop", duration.minutes.past := NA][]
#    Id duration.minutes event duration.minutes.past
#1:  a               NA enter                    NA
#2:  b              139  trip                    NA
#3:  b              535  stop                   139
#4:  b              150  trip                    NA
#5:  a               NA  exit                    NA
#6:  a               NA enter                    NA
#7:  b              145  trip                    NA
#8:  b              545  stop                   145
#9:  a              144  trip                    NA
#10: a               NA  exit                    NA

Alternatively, this can be done in base R with ave:

data$duration.minutes.past <- with(data, NA^(event != "stop") *
      ave(duration.minutes, Id, FUN = function(x) c(NA, x[-length(x)])))
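The `NA^(event != "stop")` part of the ave solution can be hard to read, so here is a minimal base R sketch that spells the same idea out in separate steps. It assumes the sample data from the question (rebuilt here without the original row names), and the helper name `lagged` is only illustrative. The masking step is written with `ifelse()` instead of the `NA^` trick, which should give the same values.

```r
# Sample data from the question (row names omitted for brevity)
data <- data.frame(
  Id = c("a", "b", "b", "b", "a", "a", "b", "b", "a", "a"),
  duration.minutes = c(NA, 139L, 535L, 150L, NA, NA, 145L, 545L, 144L, NA),
  event = factor(c("enter", "trip", "stop", "trip", "exit",
                   "enter", "trip", "stop", "trip", "exit")),
  stringsAsFactors = FALSE
)

# Why the trick works: NA^TRUE is NA^1 (NA), while NA^FALSE is NA^0 (1),
# so multiplying by NA^(condition) blanks out rows where the condition holds
mask_demo <- NA^c(TRUE, FALSE)

# Step 1: lag duration.minutes by one position within each Id
lagged <- ave(data$duration.minutes, data$Id,
              FUN = function(x) c(NA, x[-length(x)]))

# Step 2: keep the lagged value only on "stop" rows, NA everywhere else
data$duration.minutes.past <- ifelse(data$event == "stop", lagged, NA)
```

Splitting the lag and the mask into two steps makes it easier to inspect the intermediate `lagged` vector if the result is not what you expect.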


