Identify the first value of a column for each ID and replace based on that value
I have the following df, which has many columns:
input <- data.frame(ID = c(1,1,1,1,1,2,2,3,3,3),
                    Obs1 = c(1,0,1,1,0,0,1,1,0,1),
                    Obs2 = c(0,1,1,0,1,1,1,1,1,0),
                    Control1 = c(1,1,2,2,2,1,2,1,1,2),
                    Control2 = c(1,2,2,2,3,1,1,1,2,2))
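I want to modify the values of the "Control" columns. If the first "Obs" value for an ID is 0, then I have to subtract 1 from the matching "Control" column for that whole ID group: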
result <- data.frame(ID = c(1,1,1,1,1,2,2,3,3,3),
                     Obs1 = c(1,0,1,1,0,0,1,1,0,1),
                     Obs2 = c(0,1,1,0,1,1,1,1,1,0),
                     Control1 = c(1,1,2,2,2,0,1,1,1,2),
                     Control2 = c(0,1,1,1,2,1,1,1,2,2))
This is how I get the first Obs value for each ID:
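# Collect the first Obs1 and Obs2 value of each ID (columns 2 and 3 of input)
aux <- vector("list", 2)
for (i in 2:3) {
  # i - 1 maps columns 2:3 onto list slots 1:2
  aux[[i - 1]] <- input[!duplicated(input$ID), i]
}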
Using this list, how can I modify the "Control" columns? (I have more than 100 of them.)
Best answer
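Using data.table, I would first convert the data to long format (using the patterns function to stack all the "Obs" columns into one column and all the "Control" columns into another), do the calculation, and convert back to wide format. This scales to any number of Obs/Control pairs.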
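To make the long format concrete, here is a minimal sketch of the reshaping step alone, run on a cut-down table. The names `toy` and `long` are illustrative and not part of the answer; the sketch only assumes that the column names contain "Obs" and "Control".

```r
library(data.table)

# Hypothetical cut-down table: two IDs, two Obs/Control pairs
toy <- data.table(ID = c(1, 1, 2),
                  Obs1 = c(1, 0, 0), Obs2 = c(0, 1, 1),
                  Control1 = c(1, 1, 1), Control2 = c(1, 2, 1))

# One pattern per output column: columns matching "Obs" are stacked into Obs,
# columns matching "Control" are stacked into Control
long <- melt(toy, id.vars = "ID",
             measure.vars = patterns("Obs", "Control"),
             value.name = c("Obs", "Control"))

# Six rows: the three rows of pair 1 (variable == 1), then the three rows of pair 2
long
```

Because `variable` records which pair a row came from, grouping by `ID` and `variable` isolates exactly one Obs/Control pair within one ID.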
library(data.table)
library(magrittr)

# The patterns of the columns we will be working with
cols <- c("Obs", "Control")

res <-
  # Convert to data.table and add a row index so we can dcast back afterwards
  # (setDT and := work by reference, so `input` itself is converted and keeps rowind)
  setDT(input)[, rowind := .I] %>%
  # Convert to long format and combine all Obs and Control columns into two columns
  melt(., id.vars = c("rowind", "ID"), measure.vars = patterns(cols), value.name = cols) %>%
  # Subtract 1 from Control when the group's first Obs value is zero (TRUE counts as 1)
  .[, Control := Control - first(Obs == 0), by = .(ID, variable)] %>%
  # Convert back to wide format
  dcast(., ID + rowind ~ variable, value.var = cols, sep = "") %>%
  # Remove the row index
  .[, rowind := NULL]
res
#     ID Obs1 Obs2 Control1 Control2
#  1:  1    1    0        1        0
#  2:  1    0    1        1        1
#  3:  1    1    1        2        1
#  4:  1    1    0        2        1
#  5:  1    0    1        2        2
#  6:  2    0    1        0        1
#  7:  2    1    1        1        1
#  8:  3    1    1        1        1
#  9:  3    0    1        1        2
# 10:  3    1    0        2        2
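For comparison, the same rule can be sketched in base R without reshaping, using `ave()` to repeat each ID's first Obs value across that ID's rows. This is not the method from the answer above. It assumes the columns are named `Obs<k>` and `Control<k>` with matching suffixes, and the helper name `adjust_controls` is made up for illustration.

```r
# Rebuild the question's input so the sketch is self-contained
input <- data.frame(ID = c(1,1,1,1,1,2,2,3,3,3),
                    Obs1 = c(1,0,1,1,0,0,1,1,0,1),
                    Obs2 = c(0,1,1,0,1,1,1,1,1,0),
                    Control1 = c(1,1,2,2,2,1,2,1,1,2),
                    Control2 = c(1,2,2,2,3,1,1,1,2,2))

# Hypothetical helper: assumes every "Obs<k>" column has a "Control<k>" partner
adjust_controls <- function(df) {
  obs_cols <- grep("^Obs", names(df), value = TRUE)
  for (obs in obs_cols) {
    ctrl <- sub("^Obs", "Control", obs)
    # First Obs value of each ID, repeated for every row of that ID
    first_obs <- ave(df[[obs]], df$ID, FUN = function(x) x[1])
    # The logical (first_obs == 0) is coerced to 1/0 in the subtraction
    df[[ctrl]] <- df[[ctrl]] - (first_obs == 0)
  }
  df
}

out <- adjust_controls(input)
out$Control1  # 1 1 2 2 2 0 1 1 1 2
out$Control2  # 0 1 1 1 2 1 1 1 2 2
```

The loop touches one column pair at a time, so it should extend to 100+ pairs as long as the naming assumption holds; the long-format approach avoids the explicit loop.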