Counting the matches between two comma-separated factors in a data frame

Problem description

I have a data frame that looks something like this:

Row    ID1    ID2    Colors1                Colors2
1      1      2      Green, Blue, Purple    Green, Purple
2      1      3      Green, Orange          Orange, Red

I would like to add a column that gives the number of colors that Colors1 and Colors2 have in common in each row. The desired result is the following:

Row    ID1    ID2    Colors1                Colors2         Common
1      1      2      Green, Blue, Purple    Green, Purple   2     #Green, Purple
2      1      3      Green, Orange          Orange, Red     1     #Orange

Recommended answer

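You can use:

# Split each comma-separated string into a character vector;
# as.character() is needed if the columns are stored as factors
col1 <- strsplit(as.character(df$Colors1), ", ")
col2 <- strsplit(as.character(df$Colors2), ", ")
# For each row, count the values that appear in both vectors
df$Common <- sapply(seq_len(nrow(df)), function(x) length(intersect(col1[[x]], col2[[x]])))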
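
The same idea can be written without indexing by row number, since `mapply` walks both split lists in parallel. The following is only a sketch: `count_common` is a hypothetical helper name, not part of the original answer, and it assumes the separator is a comma with optional surrounding whitespace.

```r
# Hypothetical helper (not from the original answer): counts the values
# shared by two vectors of comma-separated strings, element by element.
count_common <- function(a, b, sep = ",") {
  # Split on the separator, then trim whitespace so "Blue" and " Blue" match
  split_trim <- function(x) {
    lapply(strsplit(as.character(x), sep, fixed = TRUE), trimws)
  }
  # intersect() drops duplicates, so each shared value is counted once
  unname(mapply(function(u, v) length(intersect(u, v)),
                split_trim(a), split_trim(b)))
}

# Data taken from the desired result in the question
df <- data.frame(
  Colors1 = c("Green, Blue, Purple", "Green, Orange"),
  Colors2 = c("Green, Purple", "Orange, Red"),
  stringsAsFactors = FALSE
)
df$Common <- count_common(df$Colors1, df$Colors2)
df$Common
# [1] 2 1
```

Missing values are not treated specially here: a row where both columns are NA would count as one match, so such rows may need filtering first.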

Example

df <- data.frame(Colors1 = c('Green, Blue', 'Green, Blue, Purple'), Colors2 = c('Green, Purple', 'Orange, Red'), stringsAsFactors = FALSE)
col1 <- strsplit(df$Colors1, ", ")
col2 <- strsplit(df$Colors2, ", ")
df$Common <- sapply(seq_len(nrow(df)), function(x) length(intersect(col1[[x]], col2[[x]])))
df
#               Colors1       Colors2 Common
# 1         Green, Blue Green, Purple      1
# 2 Green, Blue, Purple   Orange, Red      0
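
The desired output in the question also notes which colors are shared (for example `#Green, Purple`). If those names are wanted alongside the count, the intersections can be kept and pasted back together. This is a sketch under the same assumptions as above; `CommonColors` is a hypothetical column name chosen for illustration.

```r
# Data taken from the desired result in the question
df <- data.frame(
  Colors1 = c("Green, Blue, Purple", "Green, Orange"),
  Colors2 = c("Green, Purple", "Orange, Red"),
  stringsAsFactors = FALSE
)

col1 <- strsplit(df$Colors1, ", ")
col2 <- strsplit(df$Colors2, ", ")

# Keep the intersections themselves instead of only their lengths
shared <- Map(intersect, col1, col2)

df$Common <- lengths(shared)
# Hypothetical extra column: the shared values as one string per row
df$CommonColors <- vapply(shared, paste, character(1), collapse = ", ")

df$Common
# [1] 2 1
df$CommonColors
# [1] "Green, Purple" "Orange"
```

Rows with no shared colors get an empty string in `CommonColors`, because `paste()` with `collapse` returns `""` for an empty vector.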

