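What is the fastest way in numpy to count the number of hops used by Dijkstra's algorithm? I have a 10000x10000-element connectivity matrix and use scipy.sparse.csgraph.dijkstra to compute the filled distance matrix and the predecessor matrix. My naive solution is as follows: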
import numpy as np
from numpy.random import rand
from scipy.sparse.csgraph import dijkstra

def dijkway(dijkpredmat, i, j):
    """calculate the path between two nodes in a dijkstra matrix"""
    wayarr = []
    while (i != j) & (j >= 0):
        wayarr.append(j)
        j = dijkpredmat[i,j]
    return np.array(wayarr)

def jumpvec(pmat,node):
    """calculate number of jumps from one node to all others"""
    jumps = np.zeros(len(pmat))
    jumps[node] = -999
    while 1:
        try:
            rvec = np.nonzero(jumps==0)[0]
            r = rvec.min()
            dway = dijkway(pmat, node, r)
            jumps[dway] = np.arange(len(dway),0,-1)
        except ValueError:
            break
    return jumps

# Create a matrix
mat = (rand(500,500)*20)
mat[(rand(50000)*500).astype(int), (rand(50000)*500).astype(int)] = np.nan
dmat,pmat = dijkstra(mat,return_predecessors=True)

In [25]: %timeit jumpvec(pmat,300)
10 loops, best of 3: 51.5 ms per loop
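About 50 ms per node is acceptable, but scaling the distance matrix up to 10000 nodes raises the time to about 2 s per node, and jumpvec would also have to be run 10000 times (roughly 20000 s, or about 5.5 hours, in total)...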
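For reference, here is a minimal sketch of what is being counted, assuming a tiny hand-made graph (the 4-node matrix and the helper name count_hops are illustrative and not part of the question). Each row i of the predecessor matrix stores, for every node j, the node visited just before j on the shortest path from i, so walking the row back to i and counting the steps gives the number of hops.

```python
import numpy as np
from scipy.sparse.csgraph import dijkstra

# Tiny undirected example graph (illustrative values).
# In dense input, 0 means "no edge".
graph = np.array([
    [0, 1, 5, 0],
    [1, 0, 1, 0],
    [5, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

dist, pred = dijkstra(graph, return_predecessors=True)

def count_hops(pred, i, j):
    """Edges on the shortest path from i to j, or -1 if j is unreachable."""
    hops = 0
    while j != i:
        j = pred[i, j]
        if j < 0:  # negative predecessor: no path from i
            return -1
        hops += 1
    return hops

hops_from_0 = [count_hops(pred, 0, j) for j in range(len(graph))]
print(pred[0])
print(hops_from_0)
```

The direct edge 0-2 has weight 5, so the shortest path from 0 to 2 goes through node 1 and takes two hops even though a one-hop route exists.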
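Best answer
The following code is about 4 times faster on my PC. It is faster because it:
uses ndarray.item() to get values from the array;
uses a set object to hold the unprocessed indices;
does not create numpy.arange() inside the while loop.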
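As a small illustration of the first point (the 3x3 array below is made up for the example): indexing with [i, j] returns a NumPy scalar object, while item(i, j) returns a plain Python int, which is cheaper to handle in a pure-Python loop.

```python
import numpy as np

# Made-up predecessor-style matrix, int32 like the one dijkstra returns.
pred = np.array([[-9999, 0, 1],
                 [1, -9999, 1],
                 [1, 2, -9999]], dtype=np.int32)

a = pred[0, 2]       # NumPy scalar
b = pred.item(0, 2)  # plain Python int

print(type(a).__name__, type(b).__name__)
```

The actual gain depends on the NumPy version and the machine, so it is worth timing both variants on your own data.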
Python code:
def dijkway2(dijkpredmat, i, j):
    wayarr = []
    while (i != j) & (j >= 0):
        wayarr.append(j)
        j = dijkpredmat.item(i,j)
    return wayarr

def jumpvec2(pmat,node):
    jumps = np.zeros(len(pmat))
    jumps[node] = -999
    todo = set()
    for i in range(len(pmat)):
        if i != node:
            todo.add(i)
    indexs = np.arange(len(pmat), 0, -1)
    while todo:
        r = todo.pop()
        dway = dijkway2(pmat, node, r)
        jumps[dway] = indexs[-len(dway):]
        todo -= set(dway)
    return jumps
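A different approach, sketched here as an assumption rather than as part of the answer, avoids walking each path separately: a node's hop count is its predecessor's hop count plus one, so one row of the predecessor matrix can be resolved level by level with vectorized NumPy operations. The helper name hop_counts and the 5-node test graph are hypothetical.

```python
import numpy as np
from scipy.sparse.csgraph import dijkstra

def hop_counts(pred_row, source):
    """Hops from `source` to every node, given one predecessor-matrix row.

    Unreachable nodes get -1. Hypothetical helper, not from the answer.
    """
    pred_row = np.asarray(pred_row)
    hops = np.full(len(pred_row), -1, dtype=np.int64)
    hops[source] = 0
    # Nodes with a valid predecessor are reachable and still unresolved.
    pending = np.flatnonzero(pred_row >= 0)
    while pending.size:
        parent_hops = hops[pred_row[pending]]
        ready = parent_hops >= 0      # predecessor already has a hop count
        if not ready.any():           # safety net against malformed input
            break
        hops[pending[ready]] = parent_hops[ready] + 1
        pending = pending[~ready]
    return hops

# Illustrative graph: a chain 0-1-2-3 with a costly shortcut 0-2,
# plus an isolated node 4. In dense input, 0 means "no edge".
graph = np.array([
    [0, 1, 5, 0, 0],
    [1, 0, 1, 0, 0],
    [5, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
], dtype=float)
dist, pred = dijkstra(graph, return_predecessors=True)
result = hop_counts(pred[0], 0)
print(result)
```

The loop runs once per hop level rather than once per node, so it does the most good when shortest paths are short relative to the number of nodes. Unlike jumpvec2, it marks the source with 0 and unreachable nodes with -1 instead of using -999.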
To speed it up further, you can use Cython:
import numpy as np
cimport numpy as np
import cython

@cython.wraparound(False)
@cython.boundscheck(False)
cpdef dijkway3(int[:, ::1] m, int i, int j):
    cdef list wayarr = []
    while (i != j) & (j >= 0):
        wayarr.append(j)
        j = m[i,j]
    return wayarr

@cython.wraparound(False)
@cython.boundscheck(False)
def jumpvec3(int[:, ::1] pmat, int node):
    cdef np.ndarray jumps
    cdef int[::1] jumps_buf
    cdef int i, j, r, n
    cdef list dway
    # np.intc matches the C int of the int[::1] buffer on every platform;
    # plain int is 64-bit on many systems and would be rejected by the buffer
    jumps = np.zeros(len(pmat), np.intc)
    jumps_buf = jumps
    jumps[node] = -999
    for i in range(len(jumps)):
        if jumps_buf[i] != 0:
            continue
        r = i
        dway = dijkway3(pmat, node, r)
        n = len(dway)
        for j in range(n):
            jumps_buf[<int>dway[j]] = n - j
    return jumps
Here is my test; the Cython version is about 80 times faster:
%timeit jumpvec3(pmat,1)
%timeit jumpvec2(pmat, 1)
%timeit jumpvec(pmat, 1)
Output:
1000 loops, best of 3: 138 µs per loop
100 loops, best of 3: 2.81 ms per loop
100 loops, best of 3: 10.8 ms per loop
For more on "python - Counting the number of hops in Dijkstra's algorithm?", see the similar question on Stack Overflow: https://stackoverflow.com/questions/22518000/