问题描述
假设我有以下嵌套列表:
L = [['John','Sayyed'], ['John', 'Simon'] ,['bush','trump'],['Sam','Suri','NewYork'],['Suri','Orlando','Canada']]
如何通过获取具有公共元素的子列表与组内至少另一个子列表的并集来对这些子列表进行分组?所以对于前面的例子,结果应该是:
[['John','Sayyed','Simon'] ,['bush','trump'],['Sam','Suri','NewYork','Orlando','Canada']]
因此,前两个子列表在共享 'John'
时连接在一起.有人可以分享他们的宝贵想法吗?
在许多情况下,将问题建模为图形可以使相当复杂的任务变得更加容易.在这种情况下,我们从图论的角度寻找的是
关于连通分量(图论)
关于连接组件的更详细说明:
在图论中,无向图的连通分量(或只是分量)是一个子图,其中任意两个顶点通过路径相互连接,并且不与超图中的其他顶点相连
本质上,这段代码创建了一个图,其中的边来自列表,其中每条边由两个值 u,v
组成,其中 u
和 v
将是由这条边连接的节点.
因此,子列表与至少一个具有公共元素的子列表的并集可以转化为图论问题,因为所有节点之间都可以通过现有路径到达.
Say I have for example the following nested list:
L = [['John','Sayyed'], ['John' , 'Simon'] ,['bush','trump'],
['Sam','Suri','NewYork'],['Suri','Orlando','Canada']]
How can I group these sublists, by getting the union of sublists which have a common element with at least another sublist within the group? So for the previous example the result should be:
[['John','Sayyed','Simon'] ,['bush','trump'],
['Sam','Suri','NewYork','Orlando','Canada']]
Thus the first two sublists are joined as they share 'John'
.Could someone please share their valuable thoughts ?
In many cases, modeling a problem as a graph, can make make fairly complicated tasks much easier. In this case, what we'd be looking for from a graph theory point of view, are the connected components of the graph.
So a simple way to go about this, is to generate a graph with NetworkX, and add your list as the graph edges using add_edges_from
. Then use connected_components
, which will precisely give you a list of sets of the connected components in the graph:
import networkx as nx
L = [['John','Sayyed'], ['John' , 'Simon'] ,['bush','trump']]
G=nx.Graph()
G.add_edges_from(L)
list(nx.connected_components(G))
[{'John', 'Sayyed', 'Simon'}, {'bush', 'trump'}]
What about sublists with multiple (>2) items?
In the case of having sublists with more than 2
elements, you can add them as paths instead of nodes using nx.add_path
, since they can connect multiple nodes:
L = [['John','Sayyed'], ['John' , 'Simon'] ,['bush','trump'],
['Sam','Suri','NewYork'],['Suri','Orlando','Canada']]
G=nx.Graph()
for l in L:
nx.add_path(G, l)
list(nx.connected_components(G))
[{'John', 'Sayyed', 'Simon'},
{'bush', 'trump'},
{'Canada', 'NewYork', 'Orlando', 'Sam', 'Suri'}]
We can also vivisualize these connected components with nx.draw
:
pos = nx.spring_layout(G, scale=20, k=2/np.sqrt(G.order()))
nx.draw(G, pos, node_color='lightgreen', node_size=1000, with_labels=True)
On connected components (graph theory)
More detailed explanation on connected components:
So essentially, this code creates a graph, with edges from the list, where each edge is composed by two values u,v
where u
and v
will be nodes connected by this edge.
And hence, the union of sublists with at least one sublist with a common element can be translated into a Graph Theory problem as all nodes that are reachable between each other through the existing paths.
这篇关于将列表与公共元素结合起来的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!