
Problem description

I have a sparse matrix. I need to sort this matrix row by row and create another sparse matrix. Code may explain it better:

# The `rand` function requires a reasonably recent version of SciPy.
from scipy.sparse import rand
m = rand(6, 6, density=0.6)
d = m.getrow(0)
print(d)

Output1

(0, 5) 0.874881629788
(0, 4) 0.352559852239
(0, 2) 0.504791645463
(0, 1) 0.885898140175

I have this matrix m. I want to create a new matrix that holds a sorted version of m. The new matrix would contain a 0th row like this:

new_d = new_m.getrow(0)
print(new_d)

Output2

(0, 1) 0.885898140175
(0, 5) 0.874881629788
(0, 2) 0.504791645463
(0, 4) 0.352559852239

That way I can find out which column holds the larger value, and so on:

print(new_d.indices)

Output3

array([1, 5, 2, 4])

Of course, every row should be sorted independently, as shown above.

I have one solution for this problem, but it is not elegant.

Recommended answer

If you are willing to ignore the zero-valued elements of the matrix, the code below should work. It is also much faster than implementations that use the getrow method, which is rather slow.

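def sort_coo(m):
    # m must be in COO format so that .row, .col and .data are available.
    # On Python 3 the built-in zip is lazy, so itertools.izip is not needed.
    tuples = zip(m.row, m.col, m.data)
    # Sort by row first, then by value within each row (ascending).
    return sorted(tuples, key=lambda x: (x[0], x[2]))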

For example:

    >>> from numpy.random import rand
    >>> from scipy.sparse import coo_matrix
    >>>
    >>> d = rand(10, 20)
    >>> d[d > .05] = 0
    >>> s = coo_matrix(d)
    >>> sort_coo(s)
    [(0, 2, 0.004775589084940246),
     (3, 12, 0.029941507166614145),
     (5, 19, 0.015030386789436245),
     (7, 0, 0.0075044957259399192),
     (8, 3, 0.047994403933129481),
     (8, 5, 0.049401058471327031),
     (9, 15, 0.040011608000125043),
     (9, 8, 0.048541825332137023)]
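The same (row, value) ordering can also be computed on the arrays directly, without building Python tuples. The following is only a sketch, not part of the original answer: `sort_coo_lexsort` is a hypothetical name, and it assumes the input is a COO matrix. It uses NumPy's `lexsort`, which treats the last key passed as the primary sort key.

```python
import numpy as np
from scipy.sparse import coo_matrix

def sort_coo_lexsort(m):
    # Hypothetical helper, not from the original answer.
    # np.lexsort sorts by the LAST key first, so row is the primary key
    # and value is the secondary key, matching sort_coo's (row, value) order.
    order = np.lexsort((m.data, m.row))
    return m.row[order], m.col[order], m.data[order]

# Small illustrative matrix; the values are made up for this example.
dense = np.array([[0.0, 0.3, 0.1],
                  [0.0, 0.0, 0.0],
                  [0.5, 0.0, 0.2]])
rows, cols, vals = sort_coo_lexsort(coo_matrix(dense))
print(rows, cols, vals)
```

This returns three parallel arrays instead of a list of tuples, which may be more convenient if the result is fed back into further array operations.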

Depending on your needs, you may want to tweak the sort keys in the lambda or process the output further. If you want everything in a dictionary indexed by row, you could do:

from collections import defaultdict

sorted_rows = defaultdict(list)

for row, col, value in sort_coo(m):
    sorted_rows[row].append((col, value))
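The question asks for the columns of each row ordered from the largest value to the smallest, whereas the key above sorts values in ascending order. One way to tweak the key is to negate the value. This is a minimal sketch under that assumption: `columns_by_descending_value` is a hypothetical name, and the sample row reuses rounded values from the question's first output.

```python
from collections import defaultdict

import numpy as np
from scipy.sparse import coo_matrix

def columns_by_descending_value(m):
    # Hypothetical helper: maps each row index to its column indices,
    # ordered from the largest stored value to the smallest.
    # Negating the value in the key reverses the order within each row.
    entries = sorted(zip(m.row, m.col, m.data), key=lambda x: (x[0], -x[2]))
    result = defaultdict(list)
    for row, col, _ in entries:
        result[int(row)].append(int(col))
    return dict(result)

# One row with the (rounded) values from the question's example.
dense = np.array([[0.0, 0.885, 0.504, 0.0, 0.352, 0.874]])
print(columns_by_descending_value(coo_matrix(dense)))  # {0: [1, 5, 2, 4]}
```

Rows with no stored entries do not appear in the result, in line with the answer's premise of ignoring zero-valued elements.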

