Problem description
This is an extension of an earlier question. Following the really efficient self-join/data.table answer provided by Uwe, I get the number of instances of co-occurrence of any two "hp_id" as follows:
df_by2 <- setDT(df)[df, on = "person", allow.cartesian = TRUE][
  hp_char < i.hp_char, .N, by = .(HP_ID1 = hp_char, HP_ID2 = i.hp_char)]
This gives me:
HP_ID1 HP_ID2 N
1: hp1 hp2 3
2: hp1 hp3 3
3: hp2 hp3 3
4: hp1 hp4 1
5: hp2 hp4 1
6: hp3 hp4 2
7: hp1 hp5 1
8: hp2 hp5 1
9: hp3 hp5 2
10: hp4 hp5 2
11: hp1 hp6 1
12: hp2 hp6 1
13: hp3 hp6 1
14: hp4 hp6 1
15: hp5 hp6 2
16: hp1 hp7 1
17: hp2 hp7 1
18: hp3 hp7 1
19: hp4 hp7 1
20: hp5 hp7 2
21: hp6 hp7 2
22: hp10 hp8 2
23: hp8 hp9 2
24: hp10 hp9 2
However, I was wondering whether this method could be extended to count the instances of co-occurrence of more than two "hp_char". In other words, I am looking for an output like this (e.g. for the number of times three events occur together):
HP_ID1 HP_ID2 HP_ID3 N
1 hp1 hp2 hp3 3
2 hp3 hp4 hp5 2
3 hp5 hp6 hp7 2
4 hp8 hp9 hp10 2
So far I have been able to find multiple solutions for the co-occurrence of two events, but they do not seem to generalize to counting instances of more than two events. Thanks for any help!
Recommended answer
You can do a double self-join; the rest is pretty much the same:
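# join df to itself twice on person, so each row holds three hp_char values
df2 <- setDT(df)[df, on = "person", allow.cartesian = TRUE][df,
  on = "person", allow.cartesian = TRUE]
# requiring strictly increasing IDs keeps each unordered triple exactly once
df2[hp_char < i.hp_char & i.hp_char < i.hp_char.1,
    .N, by = .(HP_ID1 = hp_char,
               HP_ID2 = i.hp_char,
               HP_ID3 = i.hp_char.1)][N >= 2]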
# HP_ID1 HP_ID2 HP_ID3 N
#1: hp1 hp2 hp3 3
#2: hp3 hp4 hp5 2
#3: hp5 hp6 hp7 2
#4: hp10 hp8 hp9 2
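Each extra event in the self-join approach needs one more join and one more `<` condition. For an arbitrary group size k, a different technique may be easier to generalize: enumerate the k-combinations within each person using base R's `combn()` and tabulate them. The following is only a sketch of that alternative, not the method from the answer above. The toy `df` and the helper name `count_cooccurrence` are made up for illustration, and the helper assumes at least one person has k or more distinct `hp_char` values.

```r
# Hypothetical toy data: one row per (person, hp_char) pair.
df <- data.frame(
  person  = c(1, 1, 1, 2, 2, 2, 3, 3),
  hp_char = c("hp1", "hp2", "hp3", "hp1", "hp2", "hp3", "hp1", "hp2"),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of data.table): count how many persons
# share each set of k distinct hp_char values.
# Assumes at least one person has k or more distinct values.
count_cooccurrence <- function(data, k) {
  per_person <- split(data$hp_char, data$person)
  keys <- unlist(lapply(per_person, function(x) {
    x <- sort(unique(x))
    if (length(x) < k) return(NULL)
    # combn() applies FUN to every k-combination; each becomes one "a|b|c" key
    combn(x, k, FUN = paste, collapse = "|")
  }), use.names = FALSE)
  counts <- table(keys)
  # split the keys back into one column per ID
  ids <- do.call(rbind, strsplit(names(counts), "|", fixed = TRUE))
  out <- data.frame(ids, N = as.vector(counts), stringsAsFactors = FALSE)
  names(out) <- c(paste0("HP_ID", seq_len(k)), "N")
  out
}

res3 <- count_cooccurrence(df, 3)
print(res3)
```

On this toy data, `count_cooccurrence(df, 3)` yields a single row (hp1, hp2, hp3) with N = 2, because person 3 has only two values. Note that the number of combinations grows quickly with the number of values per person, so for large data the join-based answer above may well be faster.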