This post covers how to remove, from one column of an R data frame, the words that also appear in another column.
Problem description
I have a data frame in this format:
A <- c("John Smith", "Red Shirt", "Family values are better")
B <- c("John is a very highly smart guy", "We tried the tea but didn't enjoy it at all", "Family is very important as it gives you values")
df <- data.frame(A, B, stringsAsFactors = FALSE)
The result I want is:
ID  A                         B
1   John Smith                is a very highly smart guy
2   Red Shirt                 We tried the tea but didn't enjoy it at all
3   Family values are better  is very important as it gives you
I tried:
test<-df %>% filter(sapply(1:nrow(.), function(i) grepl(A[i], B[i])))
But it doesn't give me what I want.
Any suggestions or help?
Recommended answer
One solution is to use mapply along with strsplit.
The trick is to split each entry of df$A into separate words, collapse those words into a single string separated by |, and then use that string as the pattern in gsub, replacing every match with "".
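To make the three steps concrete, here is a minimal sketch that applies them to a single row. The values come from the question; the variable names a, b, words, pattern and result are only illustrative.

```r
# Walk through the steps for one row (values taken from the question)
a <- "John Smith"
b <- "John is a very highly smart guy"

# Step 1: split the entry from column A into individual words
words <- strsplit(a, split = " ")[[1]]

# Step 2: join the words with "|" so the regex matches any one of them
pattern <- paste0(words, collapse = "|")

# Step 3: replace every match in the entry from column B with ""
result <- gsub(pattern, "", b)

print(pattern)
print(result)
```

Note that the space that followed the removed word stays in place, so the result starts with a blank.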
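# Split each entry of A into its individual words
lst <- strsplit(df$A, split = " ")
# For each row, build the pattern "word1|word2|..." and remove its matches from B
df$B <- mapply(function(x, y) gsub(paste0(x, collapse = "|"), "", df$B[y]), lst, seq_along(lst))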
df
# A B
# 1 John Smith is a very highly smart guy
# 2 Red Shirt We tried the tea but didn't enjoy it at all
# 3 Family values are better is very important as it gives you
Another option is:
df$B <- mapply(function(x,y)gsub(x,"",y) ,gsub(" ", "|",df$A),df$B)
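Both versions above match substrings and leave the surrounding spaces behind. If whole-word matching and tidy whitespace are wanted, one possible variant is sketched below. It is not part of the original answer: the helper remove_words and the column B_clean are hypothetical names, and the sketch assumes the words in A contain no regex metacharacters.

```r
A <- c("John Smith", "Red Shirt", "Family values are better")
B <- c("John is a very highly smart guy",
       "We tried the tea but didn't enjoy it at all",
       "Family is very important as it gives you values")
df <- data.frame(A, B, stringsAsFactors = FALSE)

# Hypothetical helper: remove the words of `a` from `b`, whole words only.
# Assumes the words in `a` contain no regex metacharacters.
remove_words <- function(a, b) {
  words <- strsplit(a, split = " ", fixed = TRUE)[[1]]
  # \\b anchors each alternative at word boundaries
  pattern <- paste0("\\b(", paste(words, collapse = "|"), ")\\b")
  cleaned <- gsub(pattern, "", b, perl = TRUE)
  # Collapse runs of whitespace, then trim both ends
  trimws(gsub("\\s+", " ", cleaned))
}

# Store the result in a new column so the original B is kept for comparison
df$B_clean <- mapply(remove_words, df$A, df$B, USE.NAMES = FALSE)
print(df$B_clean)
```

Writing to a separate column is only a convenience for checking the result against the original text; assigning back to df$B works the same way.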
Data:
A <- c("John Smith", "Red Shirt", "Family values are better")
B <- c("John is a very highly smart guy", "We tried the tea but didn't enjoy it at all", "Family is very important as it gives you values")
df <- data.frame(A, B, stringsAsFactors = FALSE)