Problem description
I would like to generate indices that group observations based on two columns, but I want each group to consist of observations that share at least one value in common. I can see how to form groups from observations that share both values, but not from observations that share just one of them.
Take this data frame as an example:
dt <- data.frame(id=1:10,
G1 = c("A","A","B","B","C","C","C","D","E","F"),
G2 = c("Z","X","X","Y","W","V","U","s","T","T"))
I would like to get a column with these values:
1,1,1,1,2,2,2,3,4,4
I tried group_indices from dplyr, but haven't managed to get this result.
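To make the difficulty concrete, here is a minimal base R sketch (not part of the original question; the variable name `both` is made up for illustration). It groups on the (G1, G2) pair, so two rows land in the same group only when both columns match, which is not what is wanted here:

```r
# Example data from the question
dt <- data.frame(id = 1:10,
                 G1 = c("A","A","B","B","C","C","C","D","E","F"),
                 G2 = c("Z","X","X","Y","W","V","U","s","T","T"))

# One group id per distinct (G1, G2) combination:
# rows share an id only if BOTH columns are equal
both <- as.integer(interaction(dt$G1, dt$G2, drop = TRUE))

# Every row has a different (G1, G2) pair, so no two rows are grouped together
length(unique(both))
```

Rows 1 and 2 share G1 = "A", and rows 2 and 3 share G2 = "X", yet pair-based grouping keeps all three apart. The desired grouping has to follow such chains of shared values.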
Recommended answer
Using igraph, get the connected-component membership, then map it onto the names:
library(igraph)
# convert to graph (edges G1 -> G2, id kept as an edge attribute), and get cluster membership ids
g <- graph_from_data_frame(dt[, c(2, 3, 1)])
myGroups <- components(g)$membership
myGroups
# A B C D E F Z X Y W V U s T
# 1 1 2 3 4 4 1 1 1 2 2 2 3 4
# then map the group ids onto the rows via the G1 names
dt$group <- myGroups[dt$G1]
dt
#    id G1 G2 group
# 1   1  A  Z     1
# 2   2  A  X     1
# 3   3  B  X     1
# 4   4  B  Y     1
# 5   5  C  W     2
# 6   6  C  V     2
# 7   7  C  U     2
# 8   8  D  s     3
# 9   9  E  T     4
# 10 10  F  T     4
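One caveat with the graph approach: vertices are identified by name, so a label that appeared in both G1 and G2 would be treated as a single vertex and could merge groups unintentionally. The following is a hedged, self-contained variant, not the answer author's code. It assumes the two columns are separate namespaces and prefixes their values to keep them apart. The `G1_`/`G2_` prefixes and the names `edges` and `membership` are illustrative choices.

```r
library(igraph)

# Example data from the question
dt <- data.frame(id = 1:10,
                 G1 = c("A","A","B","B","C","C","C","D","E","F"),
                 G2 = c("Z","X","X","Y","W","V","U","s","T","T"))

# Prefix each column's values so that a label occurring in both G1 and G2
# (none do in this data) is not collapsed into one vertex
edges <- data.frame(from = paste0("G1_", dt$G1),
                    to   = paste0("G2_", dt$G2))

# Each row becomes an edge; rows linked through any chain of shared
# G1 or G2 values end up in the same connected component
g <- graph_from_data_frame(edges, directed = FALSE)
membership <- components(g)$membership

# Look up each row's component through its (prefixed) G1 vertex name
dt$group <- unname(membership[edges$from])
dt$group
```

For the example data this should give the same grouping as the output above (1, 1, 1, 1, 2, 2, 2, 3, 4, 4), since no label is shared between the two columns.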