I am using R and have the following example data frame, in which all variables are factors:

  first            second  third
 social     birth control   high
            birth control   high
medical  Anorexia Nervosa    low
medical  Anorexia Nervosa    low
               Alcoholism   high
 family        Alcoholism   high

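Basically, I need a function that fills in the blanks in the first column based on the values in the second and third columns.
For example, if the second column contains "birth control" and the third contains "high", the blank in the first column should be filled with "social". If the second and third columns contain "Alcoholism" and "high" respectively, the first column should be filled with "family".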

Best answer

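From the data shown, it is not clear whether 'first' can hold more than one distinct non-blank value for a given combination of 'second' and 'third'. If there is only one such value per combination and you need to replace '' with it, you could try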

library(data.table)
# Returns a new table (the filled column is named V1); df1$first itself is not modified
setDT(df1)[, replace(first, first == '', first[first != '']),
           by = list(second, third)]

Or, more efficiently, assign by reference with `:=`, which updates df1 in place:
setDT(df1)[, first:= first[first!=''] , list(second, third)]
#     first           second third
#1:  social    birth control  high
#2:  social    birth control  high
#3: medical Anorexia Nervosa   low
#4: medical Anorexia Nervosa   low
#5:  family       Alcoholism  high
#6:  family       Alcoholism  high
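
The `:=` call above relies on each group having exactly one distinct non-blank value. A more defensive variant could take the first non-blank value of each group and leave a group untouched when it has none. This is only a minimal sketch: `fill_blank()` is a hypothetical helper written for this illustration (it is not part of data.table), and it assumes the columns are character vectors with no `NA` values.

```r
library(data.table)

# Same example data as in this answer, rebuilt so the snippet runs on its own.
df1 <- data.frame(
  first  = c("social", "", "medical", "medical", "", "family"),
  second = c("birth control", "birth control", "Anorexia Nervosa",
             "Anorexia Nervosa", "Alcoholism", "Alcoholism"),
  third  = c("high", "high", "low", "low", "high", "high"),
  stringsAsFactors = FALSE
)

# fill_blank() is a hypothetical helper, not a data.table function.
# It replaces '' with the first non-blank value found in the vector and
# returns the vector unchanged when no non-blank value exists.
fill_blank <- function(x) {
  filled <- x[x != ""]
  if (length(filled) == 0L) return(x)
  x[x == ""] <- filled[1L]
  x
}

# Apply the helper within each (second, third) combination, updating df1 in place.
setDT(df1)[, first := fill_blank(first), by = .(second, third)]
```

Because the helper always returns a vector of the same length as its input, the grouped assignment does not depend on recycling. If a group did contain several different non-blank values, this sketch would silently pick the first one, so it may be worth checking for that case beforehand.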

Data
df1 <- structure(list(first = c("social", "", "medical", "medical",
"", "family"), second = c("birth control", "birth control",
"Anorexia Nervosa",
"Anorexia Nervosa", "Alcoholism", "Alcoholism"), third = c("high",
"high", "low", "low", "high", "high")), .Names = c("first", "second",
"third"), class = "data.frame", row.names = c(NA, -6L))

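The question states that all columns are factors, whereas the sample data above uses character vectors. One way to handle the factor case, sketched here under the assumption that a round trip through character is acceptable, is to convert the factor columns to character, fill the blanks, and convert back so that the unused '' level disappears. The name `factor_cols` is introduced only for this illustration.

```r
library(data.table)

# The example data again, this time with factor columns as described in the question.
df1 <- data.frame(
  first  = c("social", "", "medical", "medical", "", "family"),
  second = c("birth control", "birth control", "Anorexia Nervosa",
             "Anorexia Nervosa", "Alcoholism", "Alcoholism"),
  third  = c("high", "high", "low", "low", "high", "high"),
  stringsAsFactors = TRUE
)

# Names of the factor columns (illustrative variable, here all three columns).
factor_cols <- names(df1)[vapply(df1, is.factor, logical(1))]

setDT(df1)

# 1. Work on plain character vectors.
df1[, (factor_cols) := lapply(.SD, as.character), .SDcols = factor_cols]

# 2. Fill each group with its first non-blank value (NA if the group has none).
df1[, first := first[first != ""][1L], by = .(second, third)]

# 3. Convert back to factors; the '' level is dropped because no '' values remain.
df1[, (factor_cols) := lapply(.SD, factor), .SDcols = factor_cols]
```

After step 3, `levels(df1$first)` no longer contains the empty string, which avoids carrying a stale level into later tabulations or models.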