Problem description
Suppose I have a matrix in CSR format. What is the most efficient way to set a row (or several rows) to zero?
The following code runs quite slowly:
A = A.tolil()
A[indices, :] = 0
A = A.tocsr()
I had to convert to scipy.sparse.lil_matrix because the CSR format seems to support neither fancy indexing nor assigning values to slices.
I guess scipy just does not implement it, but the CSR format would support this quite well. Please read the Wikipedia article on "Sparse matrix" to see what indptr, etc. are:
import scipy.sparse

# A.indptr has one entry per row, plus a final entry equal to nnz;
# the stored values of row i are A.data[A.indptr[i]:A.indptr[i+1]].
def csr_row_set_nz_to_val(csr, row, value=0):
    """Set all nonzero elements (elements currently in the sparsity pattern)
    to the given value. Useful to set to 0 mostly.
    """
    if not isinstance(csr, scipy.sparse.csr_matrix):
        raise ValueError('Matrix given must be of CSR format.')
    csr.data[csr.indptr[row]:csr.indptr[row+1]] = value

# Now you can just do:
for row in indices:
    csr_row_set_nz_to_val(A, row, 0)

# And to remove zeros from the sparsity pattern:
A.eliminate_zeros()
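If the list of rows is long, the per-row loop above can probably be replaced by a single masked assignment. The following is only a sketch of that idea, not something benchmarked here; the function name csr_rows_set_nz_to_val is made up for this example, and it assumes the row indices are valid for the matrix.

```python
import numpy as np
import scipy.sparse


def csr_rows_set_nz_to_val(csr, rows, value=0):
    """Hypothetical helper: set the stored entries of several rows at once."""
    if not isinstance(csr, scipy.sparse.csr_matrix):
        raise ValueError('Matrix given must be of CSR format.')
    # Number of stored entries in each row.
    counts = np.diff(csr.indptr)
    # One boolean flag per row: True for the rows to overwrite.
    row_mask = np.zeros(csr.shape[0], dtype=bool)
    row_mask[np.asarray(rows, dtype=np.intp)] = True
    # Expand the per-row flags to one flag per stored entry.
    entry_mask = np.repeat(row_mask, counts)
    csr.data[entry_mask] = value
```

As with the loop version, the matrix is modified in place and the overwritten entries stay in the sparsity pattern until A.eliminate_zeros() is called.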
Of course, eliminate_zeros also removes from the sparsity pattern any 0s that were set elsewhere. Whether you want to do that (at this point) really depends on what you are doing: it may make sense to delay elimination until all other calculations that might add new zeros are done as well, and in some cases you may have 0 values that you want to change again later, so it would be very bad to eliminate them!
You could in principle, of course, short-circuit eliminate_zeros and prune, but that would be a lot of hassle and might be even slower (because you won't be doing it in C).
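To give an idea of what such a short-circuit might look like, here is a rough sketch that rebuilds the three CSR arrays without the entries of the chosen rows, leaving any explicit zeros stored in other rows untouched. The function name csr_drop_rows_from_pattern is invented for this example, it returns a new matrix rather than modifying A in place, and whether it is faster than eliminate_zeros has not been measured.

```python
import numpy as np
import scipy.sparse


def csr_drop_rows_from_pattern(csr, rows):
    """Hypothetical helper: return a copy with the given rows emptied."""
    if not isinstance(csr, scipy.sparse.csr_matrix):
        raise ValueError('Matrix given must be of CSR format.')
    counts = np.diff(csr.indptr)
    drop = np.zeros(csr.shape[0], dtype=bool)
    drop[np.asarray(rows, dtype=np.intp)] = True
    # Keep only the stored entries that belong to rows we are not dropping.
    keep = ~np.repeat(drop, counts)
    # Dropped rows now hold zero stored entries; rebuild indptr from the counts.
    new_counts = np.where(drop, 0, counts)
    new_indptr = np.concatenate(([0], np.cumsum(new_counts))).astype(csr.indptr.dtype)
    return scipy.sparse.csr_matrix(
        (csr.data[keep], csr.indices[keep], new_indptr), shape=csr.shape)
```

Usage would be A = csr_drop_rows_from_pattern(A, indices). Because it copies data and indices, it trades memory for not touching zeros stored elsewhere.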
Details about eliminate_zeros (and prune)
A sparse matrix generally does not save zero elements, but just stores where the nonzero elements are (roughly, and with various methods). eliminate_zeros removes all zeros in your matrix from the sparsity pattern (i.e. no value is stored for that position any more, whereas before a value was stored, but it was 0). Eliminating is bad if you want to change a 0 to a different value later on; otherwise, it saves space.
prune just shrinks the stored data arrays when they are longer than necessary. Note that while I first had A.prune() in there, A.eliminate_zeros() already includes prune.
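A small illustration of the difference between stored entries and actual nonzero values may help here. The 2x3 matrix is made-up example data; nnz counts stored entries, while count_nonzero() counts values that are really nonzero.

```python
import numpy as np
import scipy.sparse

A = scipy.sparse.csr_matrix(np.array([[1.0, 0.0, 2.0],
                                      [0.0, 3.0, 0.0]]))
stored_before = A.nnz                  # stored entries before any change

# Overwrite the stored values of row 0 with explicit zeros.
A.data[A.indptr[0]:A.indptr[1]] = 0
stored_with_zeros = A.nnz              # explicit zeros are still stored
true_nonzeros = A.count_nonzero()      # counts only values that are != 0

# Drop the explicit zeros from the sparsity pattern.
A.eliminate_zeros()
stored_after = A.nnz
```

After eliminate_zeros, the stored-entry count matches the number of true nonzeros, and the data array has been shrunk to that length.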