Problem description
I'm wondering what the best way is to iterate over the nonzero entries of sparse matrices with scipy.sparse. For example, if I do the following:
from scipy.sparse import lil_matrix
x = lil_matrix((20, 1))
x[13, 0] = 1
x[15, 0] = 2
c = 0
for i in x:
    print c, i
    c = c + 1
The output is
0
1
2
3
4
5
6
7
8
9
10
11
12
13 (0, 0) 1.0
14
15 (0, 0) 2.0
16
17
18
19
So it appears the iterator is touching every element, not just the nonzero entries. I've had a look at the API
http://docs.scipy.org/doc/scipy/reference/generation/scipy.sparse.lil_matrix.html
and searched around a bit, but I can't seem to find a solution that works.
Recommended answer
bbtrb's method (using coo_matrix) is much faster than my original suggestion, which used nonzero. Sven Marnach's suggestion to use itertools.izip also improves the speed. The current fastest is using_tocoo_izip:
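import scipy.sparse
import random
import itertools

def using_nonzero(x):
    # Get the nonzero indices, then look up each value by index.
    rows,cols = x.nonzero()
    for row,col in zip(rows,cols):
        ((row,col), x[row,col])

def using_coo(x):
    # Convert through the coo_matrix constructor, then walk its parallel arrays.
    cx = scipy.sparse.coo_matrix(x)
    for i,j,v in zip(cx.row, cx.col, cx.data):
        (i,j,v)

def using_tocoo(x):
    # Convert with tocoo() instead of the constructor.
    cx = x.tocoo()
    for i,j,v in zip(cx.row, cx.col, cx.data):
        (i,j,v)

def using_tocoo_izip(x):
    # Same as using_tocoo, but izip yields tuples lazily (Python 2 only).
    cx = x.tocoo()
    for i,j,v in itertools.izip(cx.row, cx.col, cx.data):
        (i,j,v)

# Test matrix: N x N with up to N random nonzero entries.
N=200
x = scipy.sparse.lil_matrix( (N,N) )
for _ in xrange(N):
    x[random.randint(0,N-1),random.randint(0,N-1)]=random.randint(1,100)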
yields these timeit results:
% python -mtimeit -s'import test' 'test.using_tocoo_izip(test.x)'
1000 loops, best of 3: 670 usec per loop
% python -mtimeit -s'import test' 'test.using_tocoo(test.x)'
1000 loops, best of 3: 706 usec per loop
% python -mtimeit -s'import test' 'test.using_coo(test.x)'
1000 loops, best of 3: 802 usec per loop
% python -mtimeit -s'import test' 'test.using_nonzero(test.x)'
100 loops, best of 3: 5.25 msec per loop
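The code above is written for Python 2 (print statement, xrange, itertools.izip). On Python 3, itertools.izip no longer exists and the built-in zip is already lazy, so using_tocoo is the closest equivalent of the fastest variant. Here is a minimal sketch, assuming Python 3 and a reasonably recent SciPy; iter_nonzero is a hypothetical helper name used for illustration, not part of scipy.sparse:

```python
import scipy.sparse


def iter_nonzero(x):
    """Yield (row, col, value) for each entry stored in sparse matrix x."""
    # tocoo() exposes three parallel arrays: row indices, column indices, values.
    cx = x.tocoo()
    # In Python 3, zip is lazy, so no itertools.izip is needed.
    for i, j, v in zip(cx.row, cx.col, cx.data):
        yield int(i), int(j), v


# Same column vector as in the question.
x = scipy.sparse.lil_matrix((20, 1))
x[13, 0] = 1
x[15, 0] = 2

for i, j, v in iter_nonzero(x):
    print(i, j, v)
```

Only the two stored entries are visited, rather than all 20 rows. This walks whatever the COO representation stores, so if a matrix holds explicitly stored zeros they would be yielded too.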