Suppose you have the following simple data frame:

Input <- c("X0_1-2 + X1_1-2","X0_1-2 + X1_1-2","X0_1-3 + X1_1-3","X0_3-2 + X1_3-2","X0_3-1 + X1_3-1","X0_2-1 + X1_2-1","X0_2-3 + X1_2-3","X0_13-1 + X1_13-1")
State1 <- c("1-3","1-3","1-2","3-1","3-2","2-1","2-1","13-3")
State2 <- c("1-2","1-2","1-3","3-2","3-1","2-3","2-3","13-1")
DataFrame <- cbind(Input,State1,State2)
DataFrame <- as.data.frame(DataFrame)

which yields
            Input State1 State2
1 X0_1-2 + X1_1-2    1-3    1-2
2 X0_1-2 + X1_1-2    1-3    1-2
3 X0_1-3 + X1_1-3    1-2    1-3
4 X0_3-2 + X1_3-2    3-1    3-2
5 X0_3-1 + X1_3-1    3-2    3-1
6 X0_2-1 + X1_2-1    2-1    2-3
7 X0_2-3 + X1_2-3    2-1    2-3
8 X0_13-1 + X1_13-1  13-3   13-1

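I am trying to come up with a clever way to add an extra column that equals the Input column, except that the values after each "_"
are replaced by State1 or State2, whichever one differs from the corresponding substring in Input. In this case, the desired result is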
            Input State1 State2          Outcome
1 X0_1-2 + X1_1-2    1-3    1-2 X0_1-3 + X1_1-3
2 X0_1-2 + X1_1-2    1-3    1-2 X0_1-3 + X1_1-3
3 X0_1-3 + X1_1-3    1-2    1-3 X0_1-2 + X1_1-2
4 X0_3-2 + X1_3-2    3-1    3-2 X0_3-1 + X1_3-1
5 X0_3-1 + X1_3-1    3-2    3-1 X0_3-2 + X1_3-2
6 X0_2-1 + X1_2-1    2-1    2-3 X0_2-3 + X1_2-3
7 X0_2-3 + X1_2-3    2-1    2-3 X0_2-1 + X1_2-1
8 X0_13-1 + X1_13-1  13-3   13-1 X0_13-3 + X1_13-3

but so far I have had no success.

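The idea is to replace everything that follows each "_" in the Input field with the State1 or State2 value, whichever of the two differs from what is already there.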

Any ideas/input would be greatly appreciated.
Thanks!

Best answer

I would do it like this, assuming df is your data frame:

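# For each row, pick the state column whose value does NOT already appear in Input
replacement <- c("State2", "State1")[mapply(grepl, df$State2, df$Input) + 1]
# Replace every "digits-digits" token in Input with the value from the chosen column
df$output <- sapply(seq_len(nrow(df)), function(i) gsub("\\d+-\\d+", df[i, replacement[i]], df[i, "Input"]))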

Output:
> df
            Input State1 State2          output
1 X0_1-2 + X1_1-2    1-3    1-2 X0_1-3 + X1_1-3
2 X0_1-2 + X1_1-2    1-3    1-2 X0_1-3 + X1_1-3
3 X0_1-3 + X1_1-3    1-2    1-3 X0_1-2 + X1_1-2
4 X0_3-2 + X1_3-2    3-1    3-2 X0_3-1 + X1_3-1
5 X0_3-1 + X1_3-1    3-2    3-1 X0_3-2 + X1_3-2
6 X0_2-1 + X1_2-1    2-1    2-3 X0_2-3 + X1_2-3
7 X0_2-3 + X1_2-3    2-1    2-3 X0_2-1 + X1_2-1
8 X0_13-1 + X1_13-1  13-3   13-1 X0_13-3 + X1_13-3
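
As a possible variation, here is a vectorised sketch that compares the state embedded in Input against State2 exactly, rather than using a substring match with grepl. It assumes that every Input string carries the same "digits-digits" token after each "_" and that this token always equals either State1 or State2. The names current, new_state and Outcome are only illustrative, and the data frame is rebuilt with stringsAsFactors = FALSE so that the columns are plain character vectors.

```r
# Rebuild the example data with plain character columns
df <- data.frame(
  Input  = c("X0_1-2 + X1_1-2", "X0_1-2 + X1_1-2", "X0_1-3 + X1_1-3",
             "X0_3-2 + X1_3-2", "X0_3-1 + X1_3-1", "X0_2-1 + X1_2-1",
             "X0_2-3 + X1_2-3", "X0_13-1 + X1_13-1"),
  State1 = c("1-3", "1-3", "1-2", "3-1", "3-2", "2-1", "2-1", "13-3"),
  State2 = c("1-2", "1-2", "1-3", "3-2", "3-1", "2-3", "2-3", "13-1"),
  stringsAsFactors = FALSE
)

# Extract the state currently embedded in Input (the token after the first "_")
current <- sub("^[^_]*_(\\d+-\\d+).*$", "\\1", df$Input)

# Pick whichever state differs from the one already in Input
# (assumption: current always equals State1 or State2)
new_state <- ifelse(current == df$State2, df$State1, df$State2)

# Replace every "_<digits>-<digits>" token with "_" plus the chosen state
df$Outcome <- mapply(
  function(x, s) gsub("_\\d+-\\d+", paste0("_", s), x),
  df$Input, new_state,
  USE.NAMES = FALSE
)

print(df)
```

Comparing the extracted token with == avoids a false positive that a substring test could produce, for example a State2 of "3-1" being found inside an Input containing "13-1". Whether that matters depends on the values that can occur in your data.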

10-07 22:16