问题描述
我有一个非对称的稀疏矩阵稀疏度在某种程度上是随机的,我不能指望所有值都与对角线保持一定距离.
I have a sparse matrix that is not symmetric I.E. the sparsity is somewhat random, and I can't count on all the values being a set distance away from the diagonal.
但是,它仍然很稀疏,我想减少矩阵上的存储需求.因此,我试图找出如何存储从第一个非零开始的每一行,直到到达最后一个非零为止.
However, it is still sparse, and I want to reduce the storage requirement on the matrix. Therefore, I am trying to figure out how to store each row starting at the first non-zero, in order, until I get to the last non-zero.
也就是说,如果行m的第一个非零出现在第2列,而最后一个非零出现在第89列,则我想存储在A [m]行2-> 89中.
That is, if the first non-zero of row m occurs at column 2, and the last non-zero is at column 89, I want to store in A[m] rows 2-> 89.
由于每行不具有相同数量的非零,所以我将使A的所有行具有相同数量的元素,并对具有较少数量的non的行填充零至行末-零元素.
Since each row does not have the same number of non-zeros, I will make all the rows of A have the same number of elements, and pad zeros to the end of the row for rows that have a smaller number of non-zero elements.
我该如何用C语言翻译?我实际上并没有原始的完整矩阵来复制其值(原始矩阵以CSR形式出现在我的眼中).如果我在fortran中执行此操作,则可以通过跟踪非零列的开始/停止值并将其存储为这样,将数组定义为二维数组,并使每一行的长度可变.
How do I do this translation in C? I do not actually have an original, full matrix to just copy the values from (the original matrix is coming to me in CSR form). If I was doing this in fortran, I could just define my array to be two dimensional and just have each row be variable length by tracking the start/stop values of non-zero columns and store it like that.
我将尝试在下面进行演示:
I will try to demonstrate below:
这是我知道的值的矩阵表示形式-对于每个值,我都知道行和列的位置
This is a matrix representation of the values I know - and for each value, I know the row and column location
[1 2 3 4 ]
[ 5 6 7 8 ]
[ 10 11 12 13 ]
m[ 14 15 16 17 18 ]
[ 19 20 21 22 ]
现在,此行m
在第一个非零和最后一个非零之间具有最大的跨度"所以我的新矩阵将是5x[span of row m]
Now for this one row m
has the largest "span" between the first non-zero and last non-zeroso my new matrix is going to be 5x[span of row m]
[1 2 3 4 ]
[5 6 7 8 ]
[10 11 12 13 ]
m[14 15 16 17 18]
[19 20 21 22 ]
如您所见,行m
不需要零填充,因为它无论如何是最长的跨度"
As you can see, row m
needs no zero padding since it was the longest "span" anyway
现在其他所有行都将第零行作为第一个非零行,并保持每个非零行之间的零列间隔.
The other rows now all have row zero as the first non-zero, and maintain the spacing of zeros columns between each non-zero.
推荐答案
我将其实现为一个参差不齐的数组,其中A [n] [0]总是返回对角线上的元素. A [n] [1]将返回对角线右边的项目,A [n] [2]将返回对角线左边的项目,依此类推.然后,您只需要一个将矩阵索引[i,j]映射到参差不齐的数组索引[r] [s]的函数.
I would implement this as a ragged array, with A[n][0] always returning the element on the diagonal. A[n][1] will return the item just to the right of the diagonal, A[n][2] will return the item to the left of the diagonal, and so. Then, you just need a function that maps matrix index [i,j] to ragged array index[r][s].
这具有稀疏性的优点,如果您的值保持在对角线附近,则数组不会很长.
This has the advantage of sparsity, and if your values stay close to the diagonal the arrays are not very long.
或者,您可以有以下定义:
Alternatively, you could have this definition:
struct Row
{
int InitialOffset;
int NumElements;
int[] Values;
}
然后您将获得一个Row [].根据矩阵索引检索值如下所示:
Then you would have a Row[]. Retrieving a value based on matrix index would look like this:
//matrix is merely an array of rows...
int GetValue(*matrix this, int i, int j)
{
Row CurrentRow = (*this)[i];
if (CurrentRow.InitialOffset > j)
return 0;
else if (CurrentRow.InitialOffset + CurrentRow.NumElements < j)
return 0;
return CurrentRow.Values[j - CurrentRow.InitialOffset]
}
我的C语法有点朦胧,但是您应该明白这一点.
My C syntax is a little hazy, but you should get the idea.
根据您的演示,我建议这样做:
Based on your demonstration, I would recommend this:
struct Matrix
{
int[,] Data
int[] StartOffset;
int[] NumberElements;
}
使用如下...
int GetValue(*Matrix this, int i, int j)
{
if (this.StartOffset[i] > j)
return 0;
else if (this.StartOffset[i] + this.NumberElements[i] < j)
return 0;
return this.Data[i, j-this.StartOffset[i]];
}
您的初始化过程看起来像这样
Your initialization procedure would look something like this
//Data is a struct that holds row index, col index, and value
Matrix* InitMatrix (*Data values, int numVals)
{
//loop through values to find longest row and number of rows
//create new matrix, malloc matrix for longrow * numRows
//malloc numrows elements for StartOffset and NumItems
//foreach row, find min() and max()-min() of col indexs and
//store in StartOffset and NumItems
}
您需要进行一些处理,但是数据压缩并不便宜.
You need to do some processing, but data compression isn't cheap.
这篇关于C中的稀疏矩阵存储的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!