Identify the first value of a column for each ID and replace based on that value

I have the following df, which has many columns:

input <- data.frame(ID = c(1,1,1,1,1,2,2,3,3,3),
Obs1 = c(1,0,1,1,0,0,1,1,0,1),
Obs2 = c(0,1,1,0,1,1,1,1,1,0),
Control1 = c(1,1,2,2,2,1,2,1,1,2),
Control2 = c(1,2,2,2,3,1,1,1,2,2))


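I want to modify the values of the "Control" columns. If the first "Obs" value for an ID is 0, I have to subtract 1 from the corresponding "Control" column for that entire ID group: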

result <- data.frame(ID = c(1,1,1,1,1,2,2,3,3,3),
Obs1 = c(1,0,1,1,0,0,1,1,0,1),
Obs2 = c(0,1,1,0,1,1,1,1,1,0),
Control1 = c(1,1,2,2,2,0,1,1,1,2),
Control2 = c(0,1,1,1,2,1,1,1,2,2))


This is how I get the first Obs value for each ID:

# Columns 2 and 3 are Obs1 and Obs2
aux <- vector("list", 2)

for (i in 2:3)
  aux[[i - 1]] <- input[!duplicated(input$ID), i]


Using this list, how can I modify the "Control" columns? (I have more than 100 of them.)

Best answer

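Using data.table, I would first convert the data to long format (using the patterns function to combine all the "Obs" columns into one column and all the "Control" columns into another), do the calculation, and convert back to wide format. This scales to any number of Obs/Control pairs.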

library(data.table)
library(magrittr)

# The patterns of the columns we will be working with
cols <- c("Obs", "Control")

res <-
  # convert to data.table and add row index so we can dcast back afterwards
  setDT(input)[, rowind := .I] %>%

  # Convert to long format and combine all Obs and Controls into two columns
  melt(., id = c("rowind", "ID"), patterns(cols), value.name = cols) %>%

  # Subtract 1 from Control when the first Obs value of the group is zero
  .[, Control := Control - first(Obs == 0), by = .(ID, variable)] %>%

  # Convert back to wide format
  dcast(., ID + rowind ~ variable, value.var = cols, sep = "") %>%

  # Remove the row index
  .[, rowind := NULL]

res
#     ID Obs1 Obs2 Control1 Control2
#  1:  1    1    0        1        0
#  2:  1    0    1        1        1
#  3:  1    1    1        2        1
#  4:  1    1    0        2        1
#  5:  1    0    1        2        2
#  6:  2    0    1        0        1
#  7:  2    1    1        1        1
#  8:  3    1    1        1        1
#  9:  3    0    1        1        2
# 10:  3    1    0        2        2
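
If you would rather avoid reshaping, here is a rough base R sketch of the same idea. It is not part of the answer above, just an alternative, and it assumes every "ObsN" column has a matching "ControlN" column with the same suffix. The names obs_cols, ctrl_cols, first_obs and out are made up for illustration.

```r
# Example data copied from the question
input <- data.frame(ID = c(1,1,1,1,1,2,2,3,3,3),
                    Obs1 = c(1,0,1,1,0,0,1,1,0,1),
                    Obs2 = c(0,1,1,0,1,1,1,1,1,0),
                    Control1 = c(1,1,2,2,2,1,2,1,1,2),
                    Control2 = c(1,2,2,2,3,1,1,1,2,2))

# Assumption: each "ObsN" column pairs with a "ControlN" column
obs_cols  <- grep("^Obs", names(input), value = TRUE)
ctrl_cols <- sub("^Obs", "Control", obs_cols)

out <- input
for (k in seq_along(obs_cols)) {
  # First Obs value of each ID, repeated across all rows of that ID
  first_obs <- ave(input[[obs_cols[k]]], input$ID, FUN = function(x) x[1])
  # Subtract 1 from the paired Control column where that first value is 0
  out[[ctrl_cols[k]]] <- input[[ctrl_cols[k]]] - (first_obs == 0)
}

out
```

On the example data this should give the same Control1 and Control2 values as res. With 100+ pairs the loop runs once per pair, which is usually fine, but the data.table approach is likely to be the better fit for large data.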
