Multiplying elements in a sparse array with rows in a matrix

Question

Given a sparse matrix X:

>>> from scipy.sparse import csr_matrix
>>> X = csr_matrix([[0,2,0,2],[0,2,0,1]])
>>> print(type(X))
<class 'scipy.sparse.csr.csr_matrix'>
>>> print(X.todense())
[[0 2 0 2]
 [0 2 0 1]]

And a matrix Y:

>>> print(type(Y))
<class 'numpy.matrixlib.defmatrix.matrix'>
>>> print(Y)
[[8]
 [5]]

How can you multiply each element of X by the value in the corresponding row of Y? For example:

[[0*8 2*8 0*8 2*8]
 [0*5 2*5 0*5 1*5]]

or:

[[0 16 0 16]
 [0 10 0 5]]

I've tried this, but obviously it doesn't work because the dimensions don't match: Z = X.data * Y
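To see why the shapes do not line up, it may help to look at what X.data actually holds. The sketch below rebuilds X and Y from the values shown above (Y as a plain NumPy column vector rather than np.matrix, which is an assumption made for simplicity) and prints the relevant shapes:

```python
import numpy as np
from scipy.sparse import csr_matrix

X = csr_matrix([[0, 2, 0, 2], [0, 2, 0, 1]])
Y = np.array([[8], [5]])  # assumption: plain ndarray column instead of np.matrix

# X.data is a flat array holding only the stored (nonzero) values,
# so its length is the number of nonzeros, not the number of rows.
print(X.data)        # stored values, row by row
print(X.data.shape)  # one entry per nonzero
print(Y.shape)       # one entry per row of X
```

X.data has one entry per stored value (four here), while Y has one entry per row (two here), so the two cannot be multiplied element by element without first expanding Y to match.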

Recommended answer

Unfortunately, the .multiply method of a CSR matrix seems to densify the result if the other operand is dense. So this would be one way of avoiding that:

import numpy as np
import scipy.sparse

# Y must be 1-D; this flattens an np.matrix or a column vector:
Y = np.asarray(Y).ravel()

# just to make the point that this works only with CSR:
if not isinstance(X, scipy.sparse.csr_matrix):
    raise ValueError('Matrix must be CSR.')

Z = X.copy()
# simply repeat each value in Y by the number of nnz elements in each row:
Z.data *= Y.repeat(np.diff(Z.indptr))

This does create some temporaries, but at least it's fully vectorized, and it does not densify the sparse matrix.
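As a worked check of the CSR approach, here is a minimal self-contained sketch using the X and Y values from the question. Y is assumed to already be a 1-D array, and the name nnz_per_row is introduced here only for illustration:

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse

X = csr_matrix([[0, 2, 0, 2], [0, 2, 0, 1]])
Y = np.array([8, 5])  # assumed 1-D: one scale factor per row of X

# indptr[i] is where row i starts in X.data, so consecutive
# differences give the number of stored values in each row.
nnz_per_row = np.diff(X.indptr)

Z = X.copy()
# Expand Y so that every stored value gets its row's factor.
Z.data *= Y.repeat(nnz_per_row)

print(nnz_per_row)
print(Z.toarray())
print(issparse(Z))  # Z is still a sparse matrix
```

Both rows of X store two values, so Y is expanded to [8, 8, 5, 5] before the in-place multiplication, and Z.toarray() matches the result requested in the question.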

For a COO matrix the equivalent is:

Z.data *= Y[Z.row]  # np.take(Y, Z.row) may be faster than fancy indexing
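The COO variant can be sketched the same way. This example again assumes a 1-D Y and reuses the question's values; in COO format, Z.row already stores the row index of every stored value, so no repeat step is needed:

```python
import numpy as np
from scipy.sparse import coo_matrix

X = coo_matrix([[0, 2, 0, 2], [0, 2, 0, 1]])
Y = np.array([8, 5])  # assumed 1-D: one scale factor per row of X

Z = X.copy()
# Z.row[k] is the row of the k-th stored value, so looking it up
# in Y yields the factor for that value.
Z.data *= np.take(Y, Z.row)

print(Z.toarray())
```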

For a CSC matrix the equivalent would be:

Z.data *= Y[Z.indices]
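For CSC, the indices array holds row indices (the values are stored column by column), which is why it can be used to index Y directly. A minimal sketch under the same assumptions as before (1-D Y, values taken from the question):

```python
import numpy as np
from scipy.sparse import csc_matrix

X = csc_matrix([[0, 2, 0, 2], [0, 2, 0, 1]])
Y = np.array([8, 5])  # assumed 1-D: one scale factor per row of X

Z = X.copy()
# In CSC format, Z.indices[k] is the row index of the k-th stored value.
Z.data *= Y[Z.indices]

print(Z.indices)
print(Z.toarray())
```

Note that the same expression on a CSR matrix would be wrong, because there the indices array holds column indices instead.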

