I am using R and have the following example data frame, in which all variables are factors:
first    second            third
social   birth control     high
         birth control     high
medical  Anorexia Nervosa  low
medical  Anorexia Nervosa  low
         Alcoholism        high
family   Alcoholism        high
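Basically, I need a function that fills in the blanks in the first column based on the values in the second and third columns.
For example, if the second column contains "birth control" and the third column contains "high", the blank in the first column should be filled with "social". If the second and third columns contain "Alcoholism" and "high" respectively, the first column should be filled with "family".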
Best answer
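From the data shown, it is not clear whether 'first' can hold more than one distinct non-blank value for a given combination of 'second' and 'third'. If there is only one such value per combination and you need to replace the '' entries with it, you could try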
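library(data.table)
# Returns the filled values per group as a new column (V1); df1$first itself is not updated
setDT(df1)[, replace(first, first == '', first[first != '']),
           by = list(second, third)]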
Or, more efficiently, assign by reference with :=
setDT(df1)[, first:= first[first!=''] , list(second, third)]
#     first           second third
#1:  social    birth control  high
#2:  social    birth control  high
#3: medical Anorexia Nervosa   low
#4: medical Anorexia Nervosa   low
#5:  family       Alcoholism  high
#6:  family       Alcoholism  high
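The question says the columns are factors, and a group could in principle have several non-blank rows or none at all. The following is only a sketch of one way to cover those cases, not part of the original answer. It assumes a hypothetical copy of the data named df2 with 'first' stored as a factor, converts it to character, and keeps the first non-blank value in each group.

```r
library(data.table)

# Hypothetical copy of the example data, with 'first' stored as a factor
# as described in the question (df2 is an illustrative name).
df2 <- data.frame(
  first  = factor(c("social", "", "medical", "medical", "", "family")),
  second = c("birth control", "birth control", "Anorexia Nervosa",
             "Anorexia Nervosa", "Alcoholism", "Alcoholism"),
  third  = c("high", "high", "low", "low", "high", "high")
)

setDT(df2)

# Work on character values so the comparison with "" is straightforward.
df2[, first := as.character(first)]

# Take the first non-blank value in each (second, third) group.
# A group with no non-blank value is filled with NA.
df2[, first := first[first != ""][1L], by = .(second, third)]

df2
```

Picking the first non-blank value is an assumption about what is wanted. If a combination can legitimately carry several different labels, that rule silently chooses one of them.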
Data
df1 <- structure(list(first = c("social", "", "medical", "medical",
"", "family"), second = c("birth control", "birth control",
"Anorexia Nervosa",
"Anorexia Nervosa", "Alcoholism", "Alcoholism"), third = c("high",
"high", "low", "low", "high", "high")), .Names = c("first", "second",
"third"), class = "data.frame", row.names = c(NA, -6L))
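The caveat at the start of the answer, whether each 'second'/'third' combination has exactly one non-blank 'first' value, can be checked before filling anything. This is a minimal sketch on a hypothetical copy of the data named chk. The names chk, counts and n_values are illustrative and do not come from the original answer.

```r
library(data.table)

# Hypothetical copy of the example data (chk is an illustrative name).
chk <- data.table(
  first  = c("social", "", "medical", "medical", "", "family"),
  second = c("birth control", "birth control", "Anorexia Nervosa",
             "Anorexia Nervosa", "Alcoholism", "Alcoholism"),
  third  = c("high", "high", "low", "low", "high", "high")
)

# Count the distinct non-blank 'first' values in each (second, third) group.
counts <- chk[, .(n_values = uniqueN(first[first != ""])),
              by = .(second, third)]

# Groups with more than one value are ambiguous; groups with zero values
# have nothing to fill from. An empty result means the fill is unambiguous.
counts[n_values != 1L]
```

For the example data every combination has exactly one non-blank value, so the last expression returns an empty table.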