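I have health data from a repeated-measures cohort study in which each individual has multiple follow-up visits per year. At baseline (visit 0), some individuals have already been diagnosed with the disease of interest, while others have not. Because my analysis looks at incident cases, I need to remove from my data everyone who was diagnosed as "sick" at visit 0. How can I do this in the tidyverse? Below is an example of the kind of data structure I will be working with: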
subject_id <- c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5)
visit <- c(0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3,0,1,2,3)
diagnosis <- c("not sick", "not sick", "not sick", "sick", "sick", "sick", "sick", "sick", "not sick", "not sick", "sick", "sick", "sick", "sick", "sick", "sick", "not sick", "not sick", "not sick", "sick")
cohort <- data.frame(subject_id, visit, diagnosis)
cohort
Best answer
Edit: If you want to remove those subjects entirely, then:
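library(dplyr)

cohort %>%
  group_by(subject_id) %>%
  # flag rows that are a "sick" diagnosis at baseline (visit 0)
  mutate(Condn = ifelse(visit == 0 & diagnosis == "sick", 1, 0)) %>%
  # keep only subjects that have no flagged row at all
  filter(all(Condn == 0))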
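The same per-subject filter can probably be written without the helper column. The following is a minimal sketch, assuming dplyr is installed; the name `incident` is only illustrative, and the data frame is rebuilt here (with integer IDs via `rep()`) so the snippet runs on its own:

```r
library(dplyr)

# Example data equivalent to the data frame in the question
cohort <- data.frame(
  subject_id = rep(1:5, each = 4),
  visit      = rep(0:3, times = 5),
  diagnosis  = c("not sick", "not sick", "not sick", "sick",
                 "sick", "sick", "sick", "sick",
                 "not sick", "not sick", "sick", "sick",
                 "sick", "sick", "sick", "sick",
                 "not sick", "not sick", "not sick", "sick")
)

incident <- cohort %>%
  group_by(subject_id) %>%
  # keep a subject only if none of their rows is a "sick" baseline visit
  filter(!any(visit == 0 & diagnosis == "sick")) %>%
  ungroup()

unique(incident$subject_id)  # subjects 1, 3 and 5 remain
```

Because the condition is evaluated once per group, every row of a subject is either kept or dropped together.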
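Original
We can do the following (note that this drops only the visit 0 rows with a "sick" diagnosis; the later visits of those subjects are kept):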
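cohort %>%
  group_by(subject_id) %>%
  # flag rows that are a "sick" diagnosis at baseline (visit 0)
  mutate(Condn = ifelse(visit == 0 & diagnosis == "sick", 1, 0)) %>%
  # drop only the flagged rows
  filter(Condn == 0) %>%
  ungroup() %>%
  # remove the helper column
  select(-Condn)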
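Since the snippet above filters row by row, subjects who were sick at baseline still appear in the result through their later visits. If the goal is to exclude those subjects completely, one alternative is to list the baseline cases first and then remove them with `anti_join()`. This is a sketch under the assumption that dplyr is installed; `prevalent` and `incident` are illustrative names, not part of the original answer:

```r
library(dplyr)

# Example data equivalent to the data frame in the question
cohort <- data.frame(
  subject_id = rep(1:5, each = 4),
  visit      = rep(0:3, times = 5),
  diagnosis  = c("not sick", "not sick", "not sick", "sick",
                 "sick", "sick", "sick", "sick",
                 "not sick", "not sick", "sick", "sick",
                 "sick", "sick", "sick", "sick",
                 "not sick", "not sick", "not sick", "sick")
)

# Subjects already diagnosed as sick at baseline (prevalent cases)
prevalent <- cohort %>%
  filter(visit == 0, diagnosis == "sick") %>%
  distinct(subject_id)

# Drop every row that belongs to one of those subjects
incident <- cohort %>%
  anti_join(prevalent, by = "subject_id")

nrow(incident)  # 12 rows: subjects 1, 3 and 5
```

Keeping the list of excluded subjects in its own data frame also makes it easy to report how many people were removed at baseline.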
For a similar question about removing individuals from cohort study data based on baseline characteristics, see Stack Overflow: https://stackoverflow.com/questions/56795215/