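To be honest, this is a fairly complicated task. It is essentially an extension of a question I asked earlier: Count unique values of a column by pairwise combinations of another column in R
Suppose that this time I have the following data frame in R: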
data.frame(Reg.ID = c(1,1,2,2,2,3,3), Location = c("X","X","Y","Y","Y","X","X"), Product = c("A","B","A","B","C","B","A"))
The data looks like this:
  Reg.ID Location Product
1      1        X       A
2      1        X       B
3      2        Y       A
4      2        Y       B
5      2        Y       C
6      3        X       B
7      3        X       A
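I want to count the unique values of the "Reg.ID" column for each pairwise combination of values in the "Product" column, grouped by the "Location" column. The result should look like this: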
  Location Prod.Comb Count
1        X       A,B     2
2        Y       A,B     1
3        Y       A,C     1
4        Y       B,C     1
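I tried to get this output with base R functions but had no success. I suspect there is a fairly simple solution using the data.table package in R? Any help would be greatly appreciated. Thanks!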
Best answer
Not a thoroughly tested idea, but this is the first thing that comes to mind with data.table:
library(data.table)
dt <- data.table(Reg.ID = c(1,1,2,2,2,3,3), Location = c("X","X","Y","Y","Y","X","X"), Product = c("A","B","A","B","C","B","A"))
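# Self-join on Location: every row is paired with every other row from the same location
dt.cj <- merge(dt, dt, by = "Location", all = TRUE, allow.cartesian = TRUE)
# Keep pairs from the same Reg.ID only; Product.x < Product.y keeps each unordered pair once
dt.res <- dt.cj[Reg.ID.x == Reg.ID.y & Product.x < Product.y, .(cnt = uniqueN(Reg.ID.x)), by = .(Location, Product.x, Product.y)]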
#    Location Product.x Product.y cnt
# 1:        X         A         B   2
# 2:        Y         A         B   1
# 3:        Y         A         C   1
# 4:        Y         B         C   1
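
The result above keeps the two products in separate columns, whereas the question asks for a single Prod.Comb column. A possible variant, offered only as a sketch, builds the pairs inside each Reg.ID with base R's combn() and then counts distinct Reg.IDs per location and pair. The names pairs and res are illustrative, and the code assumes that a pair should count only when a single Reg.ID contains both products.

```r
library(data.table)

dt <- data.table(
  Reg.ID   = c(1, 1, 2, 2, 2, 3, 3),
  Location = c("X", "X", "Y", "Y", "Y", "X", "X"),
  Product  = c("A", "B", "A", "B", "C", "B", "A")
)

# For each (Location, Reg.ID), list every unordered pair of distinct products.
# Groups with fewer than two products return NULL and are dropped.
pairs <- dt[,
  if (uniqueN(Product) >= 2L) {
    .(Prod.Comb = combn(sort(unique(Product)), 2L, FUN = paste, collapse = ","))
  },
  by = .(Location, Reg.ID)
]

# Count the distinct Reg.IDs behind each pair within each location.
res <- pairs[, .(Count = uniqueN(Reg.ID)), by = .(Location, Prod.Comb)]
setorder(res, Location, Prod.Comb)
res
```

On the sample data this should give the four Location / Prod.Comb / Count rows requested in the question. Sorting the products before calling combn() means "A,B" and "B,A" are never treated as different pairs.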