问题描述
以前,我发布了此链接.
我想知道是否有人可以帮助仅检索叶子(如果不平衡的树),而不必检索N级的节点.
I'd like to know if someone could help on retrieving only the leaves (in case of unbalanced tree) and not necessarily the nodes of level N.
例如,给定下面的树(请注意:此树与链接上的树有些不同)
For instance, given the below tree (Note: this tree is a bit different than the one on the link)
child| level| parent|
40| 1| 40|
21| 2| 40|
18| 2| 40|
27| 2| 40|
37| 3| 21|
2| 3| 18|
85| 3| 21|
19| 3| 21|
14| 4| 37|
58| 4| 2|
47| 4| 37|
34| 4| 85|
45| 4| 18|
32| 4| 2|
47| 4| 85|
88| 4| 85|
12| 4| 37|
如果我请求孩子40的所有叶子,我将获取叶子:所有4个孩子,其中19个(没有4个孩子)和27个(因为节点停止了2个孩子).
If I request all the leaves of the child 40, I retrieve as leaves : all the level 4 children with 19 (it doesn’t have level 4 children) and also 27 (because the node stopped level 2).
对于18岁的孩子,年龄分别为58岁和32岁.
For the child 18, it will be 58 and 32.
推荐答案
如果我理解正确,则需要叶子,即不是父母的孩子.您可以通过以下方式获得它们:
If I understood correctly, you want the leaves, i.e. the children that aren't parents.You can get them by doing:
set(df['child']) - set(df['parent'])
如果您愿意使用networkx
,则可以使用很多现有功能:
edit:
If you are willing to use networkx
, you can use a lot of existing functionality:
import matplotlib.pyplot as plt
import networkx as nx
# create directed graph from dataframe:
G=nx.from_pandas_edgelist(df, source='parent', target='child', create_using=nx.DiGraph())
#visualise
nx.draw_networkx(G,with_labels=True)
#nitpicking here: your tree isn't a tree: 47 has two parents
# you can find leaves with this function:
def find_leaves(G, node):
# list all descendants of the node, as well as the node
d = list(nx.descendants(G, node))+[node]
# create a subgraph with only these nodes and find the leaves.
H = G.subgraph(d)
return [a for a in H.nodes if H.out_degree(a)==0 and H.in_degree(a)==1]
find_leaves(G, 18)
输出:
[45, 32, 58]
如果您不想使用networkx
,则可以执行以下操作:
edit 2:
If you don't want to use networkx
, you can do the following:
#create edgelist from dataframe:
edges = []
for ix, row in df.iterrows():
edges.append((row['parent'], row['child']))
# recursive function that starts at start_node and returns nodes that
# have no children:
def find_children(edges, start_node):
# find edges that have the start node as the parent
starting_edges = [(p,c) for p,c in edges if p == start_node]
leaves = []
# if node has children, loop through the children
if starting_edges:
for p, c in starting_edges:
leaves += find_children(edges, c)
# if the node has no children, store in list and return.
else:
leaves.append(start_node)
return leaves
#testing:
find_children(edges, 18)
#output:
[58,32,45]
find_children(edges, 85)
#output:
[34, 47, 88]
这篇关于如何在存储在DataFrame中的树中查找叶子的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!