Problem description
I am trying to implement a sparse matrix framework (COO format) in C for shared-memory parallel computing. Initially, I planned to store the entries as an array of structs holding the position and value of each entry:
typedef struct {
    unsigned int rowIdx;  // Row index
    unsigned int colIdx;  // Column index
    unsigned int dataVal; // Value
} entity, *spMat;
How would parallel arrays (one array per field) perform for the same data?
Recommended answer
This largely depends on how you intend to implement the solution. If you want to take advantage of the data-parallel features of the CPU or GPU, you may well be better off implementing this as a struct of arrays rather than an array of structs:
typedef struct {
    unsigned int* rowIdxs;    // Row index of each entry
    unsigned int* colIdxs;    // Column index of each entry
    unsigned int* dataValues; // Value of each entry
} entity, *spMat;
This makes it easier to write code that the CPU compiler's vectorizer or the GPU compiler can use efficiently. So in this case I would probably start with a struct of arrays and optimize for data parallelism.
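As a rough illustration of why this layout suits a vectorizer, here is a minimal sketch of a struct-of-arrays container with an element-wise loop over the values. The type name `coo_soa`, the `nnz` count field, and the helper functions are assumptions made up for this example; only the three array field names come from the struct above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical struct-of-arrays COO container. `nnz` (number of stored
   entries) and the type name are assumptions added for this sketch. */
typedef struct {
    size_t nnz;
    unsigned int *rowIdxs;
    unsigned int *colIdxs;
    unsigned int *dataValues;
} coo_soa;

/* Allocate the three parallel arrays. Returns 0 on success, -1 on failure. */
int coo_soa_alloc(coo_soa *m, size_t nnz)
{
    assert(m != NULL);
    m->nnz = nnz;
    m->rowIdxs = malloc(nnz * sizeof *m->rowIdxs);
    m->colIdxs = malloc(nnz * sizeof *m->colIdxs);
    m->dataValues = malloc(nnz * sizeof *m->dataValues);
    if (nnz > 0 && (!m->rowIdxs || !m->colIdxs || !m->dataValues)) {
        free(m->rowIdxs);
        free(m->colIdxs);
        free(m->dataValues);
        m->rowIdxs = m->colIdxs = m->dataValues = NULL;
        m->nnz = 0;
        return -1;
    }
    return 0;
}

/* Release the arrays and reset the container. */
void coo_soa_free(coo_soa *m)
{
    assert(m != NULL);
    free(m->rowIdxs);
    free(m->colIdxs);
    free(m->dataValues);
    m->rowIdxs = m->colIdxs = m->dataValues = NULL;
    m->nnz = 0;
}

/* Multiply every stored value by `factor`. The loop reads and writes one
   contiguous array with unit stride, which is the access pattern that
   auto-vectorizers generally handle well. */
void coo_soa_scale(coo_soa *m, unsigned int factor)
{
    assert(m != NULL);
    unsigned int *v = m->dataValues;
    for (size_t i = 0; i < m->nnz; ++i)
        v[i] *= factor;
}
```

With an array of structs, the same loop would touch every third `unsigned int` in memory, so each cache line fetched would also carry row and column indices the loop never uses.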
That said, performance will largely depend on how good your implementation is. It is possible to write a poorly performing implementation with either approach.
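One way to find out which layout works better for a given workload is to write the same kernel against both and time them. The sketch below shows a sparse matrix-vector product (y += A*x) for each layout. The names `coo_entry`, `coo_view`, `spmv_aos`, and `spmv_soa` are hypothetical, and the caller is assumed to zero `y` beforehand.

```c
#include <assert.h>
#include <stddef.h>

/* Array-of-structs element, mirroring the struct from the question. */
typedef struct {
    unsigned int rowIdx;
    unsigned int colIdx;
    unsigned int dataVal;
} coo_entry;

/* Struct-of-arrays view over three parallel arrays of length nnz. */
typedef struct {
    size_t nnz;
    const unsigned int *rowIdxs;
    const unsigned int *colIdxs;
    const unsigned int *dataValues;
} coo_view;

/* y += A*x with entries stored as an array of structs. */
void spmv_aos(const coo_entry *e, size_t nnz,
              const unsigned int *x, unsigned int *y)
{
    assert(e != NULL || nnz == 0);
    for (size_t i = 0; i < nnz; ++i)
        y[e[i].rowIdx] += e[i].dataVal * x[e[i].colIdx];
}

/* y += A*x with entries stored as a struct of arrays. */
void spmv_soa(const coo_view *m, const unsigned int *x, unsigned int *y)
{
    assert(m != NULL);
    for (size_t i = 0; i < m->nnz; ++i)
        y[m->rowIdxs[i]] += m->dataValues[i] * x[m->colIdxs[i]];
}
```

In both versions several entries may share a row, so splitting the loop across threads would cause concurrent updates to the same `y` element unless the work is partitioned by row or the updates are made atomic. Getting that detail wrong, or handling it inefficiently, can matter more than the choice of layout.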