Constructing a NetworkX graph from a Pandas DataFrame

Problem description

I'd like to create some NetworkX graphs from a simple Pandas DataFrame:

        Loc 1   Loc 2   Loc 3   Loc 4   Loc 5   Loc 6   Loc 7
Foo     0       0       1       1       0       0       0
Bar     0       0       1       1       0       1       1
Baz     0       0       1       0       0       0       0
Bat     0       0       1       0       0       1       0
Quux    1       0       0       0       0       0       0

Where Foo… is the index, and Loc 1 to Loc 7 are the columns. But converting to Numpy matrices or recarrays doesn't seem to work for generating input for nx.Graph(). Is there a standard strategy for achieving this? I'm not averse to reformatting the data in Pandas --> dumping to CSV --> importing to NetworkX, but it seems as if I should be able to generate the edges from the index and the nodes from the values.

Recommended answer

NetworkX expects a square matrix (of nodes and edges), perhaps* you want to pass it:

In [11]: df2 = pd.concat([df, df.T]).fillna(0)

Note: it's important that the index and columns are in the same order!

In [12]: df2 = df2.reindex(df2.columns)

In [13]: df2
Out[13]:
       Bar  Bat  Baz  Foo  Loc 1  Loc 2  Loc 3  Loc 4  Loc 5  Loc 6  Loc 7  Quux
Bar      0    0    0    0      0      0      1      1      0      1      1     0
Bat      0    0    0    0      0      0      1      0      0      1      0     0
Baz      0    0    0    0      0      0      1      0      0      0      0     0
Foo      0    0    0    0      0      0      1      1      0      0      0     0
Loc 1    0    0    0    0      0      0      0      0      0      0      0     1
Loc 2    0    0    0    0      0      0      0      0      0      0      0     0
Loc 3    1    1    1    1      0      0      0      0      0      0      0     0
Loc 4    1    0    0    1      0      0      0      0      0      0      0     0
Loc 5    0    0    0    0      0      0      0      0      0      0      0     0
Loc 6    1    1    0    0      0      0      0      0      0      0      0     0
Loc 7    1    0    0    0      0      0      0      0      0      0      0     0
Quux     0    0    0    0      1      0      0      0      0      0      0     0
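To try the two steps above outside an IPython session, a minimal self-contained sketch might look like this. The DataFrame is rebuilt by hand from the table in the question, and the `astype(int)` cast is an addition for readability, not part of the original answer. Depending on the pandas version, the rows and columns may come out in a different order than in the output shown above; what matters is that the index and the columns agree with each other.

```python
import pandas as pd

# Rebuild the DataFrame from the question (values copied from its table).
df = pd.DataFrame(
    [
        [0, 0, 1, 1, 0, 0, 0],
        [0, 0, 1, 1, 0, 1, 1],
        [0, 0, 1, 0, 0, 0, 0],
        [0, 0, 1, 0, 0, 1, 0],
        [1, 0, 0, 0, 0, 0, 0],
    ],
    index=["Foo", "Bar", "Baz", "Bat", "Quux"],
    columns=[f"Loc {i}" for i in range(1, 8)],
)

# Stack the frame on top of its transpose; cells missing from either
# half become NaN and are filled with 0 (assumption: cast back to int).
df2 = pd.concat([df, df.T]).fillna(0).astype(int)

# Put the rows in the same order as the columns so the matrix is
# square and its labels line up.
df2 = df2.reindex(df2.columns)

# A symmetric adjacency matrix corresponds to an undirected graph.
is_symmetric = (df2.values == df2.values.T).all()
print(df2.shape, is_symmetric)
```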

In [14]: graph = nx.from_numpy_array(df2.values)  # was nx.from_numpy_matrix, which was removed in NetworkX 3.0

This doesn't pass the column/index names to the graph. If you want to do that, you can use relabel_nodes (you may have to be wary of duplicates, which are allowed in pandas' DataFrames):

In [15]: graph = nx.relabel_nodes(graph, dict(enumerate(df2.columns)))  # is there a nicer way than dict(enumerate(...))?
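On the "nicer way" question: recent NetworkX versions provide nx.from_pandas_adjacency, which reads a square, labelled DataFrame directly and keeps the labels as node names. This is a different call from the one used in the answer, so treat the following as a sketch of an alternative rather than the answer's own method. It assumes the index and column labels are unique and rebuilds the question's DataFrame by hand.

```python
import networkx as nx
import pandas as pd

# Rebuild the DataFrame from the question (values copied from its table).
df = pd.DataFrame(
    [
        [0, 0, 1, 1, 0, 0, 0],
        [0, 0, 1, 1, 0, 1, 1],
        [0, 0, 1, 0, 0, 0, 0],
        [0, 0, 1, 0, 0, 1, 0],
        [1, 0, 0, 0, 0, 0, 0],
    ],
    index=["Foo", "Bar", "Baz", "Bat", "Quux"],
    columns=[f"Loc {i}" for i in range(1, 8)],
)

# Same square matrix as in the answer: frame stacked on its transpose.
df2 = pd.concat([df, df.T]).fillna(0).astype(int)
df2 = df2.reindex(df2.columns)

# Build the graph straight from the labelled adjacency matrix, so no
# separate relabel_nodes step is needed.
graph = nx.from_pandas_adjacency(df2)

print(graph.number_of_nodes(), graph.number_of_edges())
print(sorted(graph.neighbors("Bar")))
```

Nodes whose row and column contain only zeros (here Loc 2 and Loc 5) are still added to the graph, just without any edges.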

*It's not clear exactly what the columns and index represent for the desired graph.
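If the intended reading is that each 1 links a row label to a column label (one possible interpretation, not confirmed by the question), the square matrix can be skipped and the edges added directly. The sketch below makes that assumption; the `bipartite` node attribute is just a conventional way to mark the two sides.

```python
import networkx as nx
import numpy as np
import pandas as pd

# Rebuild the DataFrame from the question (values copied from its table).
df = pd.DataFrame(
    [
        [0, 0, 1, 1, 0, 0, 0],
        [0, 0, 1, 1, 0, 1, 1],
        [0, 0, 1, 0, 0, 0, 0],
        [0, 0, 1, 0, 0, 1, 0],
        [1, 0, 0, 0, 0, 0, 0],
    ],
    index=["Foo", "Bar", "Baz", "Bat", "Quux"],
    columns=[f"Loc {i}" for i in range(1, 8)],
)

# Positions of the non-zero cells, translated back to (row, column) labels.
rows, cols = np.nonzero(df.to_numpy())
edges = list(zip(df.index[rows], df.columns[cols]))

graph = nx.Graph()
# Add every label as a node, tagged with the side it belongs to, so that
# labels without any 1 (e.g. Loc 2) still appear in the graph.
graph.add_nodes_from(df.index, bipartite=0)
graph.add_nodes_from(df.columns, bipartite=1)
graph.add_edges_from(edges)

print(graph.number_of_nodes(), graph.number_of_edges())
```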


08-06 18:12