Problem description
I often encounter, and produce, long-tailed degree distributions/histograms from complex networks. Because there are so many observations, the heavy end of the tail becomes very crowded.
However, many publications I read show much cleaner degree distributions: they don't have this clumpiness at the end of the distribution, and the observations are more evenly spaced.
How do you make a chart like that using NetworkX and matplotlib?
Recommended answer
Use log binning. The code below takes a Counter object representing a histogram of degree values and log-bins the distribution to produce a sparser, smoother distribution.
import numpy as np

def drop_zeros(a_list):
    # log10 is undefined for zero, so keep only positive values
    return [i for i in a_list if i > 0]

def log_binning(counter_dict, bin_count=35):
    # Convert the dict views to arrays so NumPy can work with them (Python 3)
    keys = np.array(list(counter_dict.keys()), dtype=float)
    values = np.array(list(counter_dict.values()), dtype=float)
    max_x = np.log10(keys.max())
    max_y = np.log10(values.max())
    # Upper bin edge: the larger of the two exponents
    max_base = max(max_x, max_y)
    min_x = np.log10(min(drop_zeros(keys)))
    bins = np.logspace(min_x, max_base, num=bin_count)
    # Based off of: http://stackoverflow.com/questions/6163334/binning-data-in-python-with-scipy-numpy
    # Empty bins divide 0 by 0 and come out as NaN, which matplotlib skips
    counts = np.histogram(keys, bins)[0]
    bin_means_y = np.histogram(keys, bins, weights=values)[0] / counts
    bin_means_x = np.histogram(keys, bins, weights=keys)[0] / counts
    return bin_means_x, bin_means_y
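To make the binning step concrete, here is a minimal sketch on a toy histogram. The degree sequence and the fixed bin edges (1, 10, 100, 1000) are made up for illustration and are not part of the code above; the sketch also masks out empty bins instead of letting them become NaN.

```python
from collections import Counter
import numpy as np

# Hypothetical degree sequence, for illustration only
degrees = [1] * 50 + [2] * 25 + [3] * 12 + [4] * 8 + [15] * 3 + [20] * 2 + [150]
hist = Counter(degrees)  # degree value -> frequency

xs = np.array(list(hist.keys()), dtype=float)
ys = np.array(list(hist.values()), dtype=float)

# Logarithmically spaced bin edges: 1, 10, 100, 1000
edges = np.logspace(0, 3, num=4)

# Number of distinct degree values per bin, plus per-bin sums of x and y
n_per_bin = np.histogram(xs, edges)[0]
sum_x = np.histogram(xs, edges, weights=xs)[0]
sum_y = np.histogram(xs, edges, weights=ys)[0]

# Average within each non-empty bin (the mask avoids dividing 0 by 0)
mask = n_per_bin > 0
binned_x = sum_x[mask] / n_per_bin[mask]
binned_y = sum_y[mask] / n_per_bin[mask]

print(binned_x, binned_y)
```

Each bin collapses several raw points into one averaged point, which is why the tail looks sparser after binning.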
Generate a classic scale-free network in NetworkX and then plot it:
import networkx as nx
import matplotlib.pyplot as plt
from collections import Counter

ba_g = nx.barabasi_albert_graph(10000, 2)
ba_c = nx.degree_centrality(ba_g)
# To convert normalized degrees to raw degrees
#ba_c = {k: int(round(v * (len(ba_g) - 1))) for k, v in ba_c.items()}
# Histogram of the (normalized) degree values: value -> frequency
ba_c2 = dict(Counter(ba_c.values()))
ba_x, ba_y = log_binning(ba_c2, 50)
plt.xscale('log')
plt.yscale('log')
# Binned distribution in red squares, raw distribution in blue crosses
plt.scatter(ba_x, ba_y, c='r', marker='s', s=50)
plt.scatter(list(ba_c2.keys()), list(ba_c2.values()), c='b', marker='x')
plt.xlim((1e-4, 1e-1))
plt.ylim((.9, 1e4))
plt.xlabel('Connections (normalized)')
plt.ylabel('Frequency')
plt.show()
This produces a plot showing the overlap between the "raw" distribution in blue and the "binned" distribution in red.
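If you would rather work with raw degrees than normalized degree centrality, one option is to count them directly from the graph instead of converting back. This is a sketch under my own assumptions: the smaller graph size and the `seed` value are arbitrary choices made only to keep the example quick and reproducible.

```python
from collections import Counter
import networkx as nx

# Smaller graph than above, with a fixed seed (both are arbitrary choices)
g = nx.barabasi_albert_graph(1000, 2, seed=42)

# g.degree() yields (node, degree) pairs; count how often each degree occurs
raw_degree_counts = Counter(degree for _, degree in g.degree())

print(sorted(raw_degree_counts.items())[:5])
```

The resulting Counter has integer degrees as keys, so it avoids the floating-point round trip through `degree_centrality`. It can be passed to `log_binning` in the same way, with the axis limits and the x-axis label adjusted for raw degrees.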
Thoughts on how to improve this approach, or feedback if I've missed something obvious, are welcome.
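One possible refinement, offered as a sketch rather than as part of the approach above: log-spaced bins get wider toward the tail, so a common variant divides each bin's count by its width to obtain a density rather than a per-bin average. The degree sequence and bin edges below are hypothetical.

```python
import numpy as np

# Hypothetical raw degree sequence, for illustration only
degrees = np.array([1, 1, 1, 2, 2, 3, 5, 8, 20, 150], dtype=float)

# Logarithmically spaced bin edges: 1, 10, 100, 1000
edges = np.logspace(0, 3, num=4)

counts, _ = np.histogram(degrees, bins=edges)
widths = np.diff(edges)

# Divide by bin width and total count so the result integrates to 1
density = counts / (counts.sum() * widths)

# Geometric mean of each bin's edges, a natural x-position on a log axis
centers = np.sqrt(edges[:-1] * edges[1:])

print(centers, density)
```

Passing `density=True` to `np.histogram` gives the same normalization in one call.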