An efficient way to compute the Kronecker product of matrices with GSL
Question
The bottleneck of my algorithm is my Kronecker product function, KPro:
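#include <gsl/gsl_matrix.h>

/* Returns a newly allocated (m*n) x (p*q) matrix holding the Kronecker
   product of a (m x p) and b (n x q). The caller must free the result. */
gsl_matrix *KPro(const gsl_matrix *a, const gsl_matrix *b) {
    size_t i, j, k, l;
    size_t m, p, n, q;
    m = a->size1;
    p = a->size2;
    n = b->size1;
    q = b->size2;
    gsl_matrix *c = gsl_matrix_alloc(m*n, p*q);
    double da, db;
    for (i = 0; i < m; i++) {
        for (j = 0; j < p; j++) {
            da = gsl_matrix_get(a, i, j);
            /* Fill the n x q block of c that corresponds to a(i, j). */
            for (k = 0; k < n; k++) {
                for (l = 0; l < q; l++) {
                    db = gsl_matrix_get(b, k, l);
                    gsl_matrix_set(c, n*i+k, q*j+l, da * db);
                }
            }
        }
    }
    return c;
}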
Do you know of an efficient implementation using GSL? I can't find a suitable routine.
Recommended answer
You can significantly improve performance by "blocking", that is, by processing the matrices in tiles so that cache memory is used more effectively.
Take a look at this paper. It has pseudocode that I think you will be able to turn into C code easily. It also has an algorithm for working out the optimum block size from the cache size and the matrix parameters.
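As a rough illustration of the cache-friendliness idea, here is a minimal sketch. It is not the paper's blocking algorithm and it does not pick a block size. It only reorders the loops so that each row of the result is written contiguously, and it reads the data through raw row-major pointers instead of calling gsl_matrix_get and gsl_matrix_set for every element. The function name kron_rowmajor and its parameters are hypothetical. The sketch works on plain buffers and assumes row-major storage with an explicit row stride, which is how gsl_matrix exposes its data and tda fields.

```c
#include <stddef.h>

/* Kronecker product on raw row-major buffers (hypothetical helper).
 * a is m x p with row stride lda, b is n x q with row stride ldb,
 * c must have room for (m*n) x (p*q) entries with row stride ldc >= p*q. */
void kron_rowmajor(const double *a, size_t m, size_t p, size_t lda,
                   const double *b, size_t n, size_t q, size_t ldb,
                   double *c, size_t ldc)
{
    for (size_t i = 0; i < m; i++) {
        const double *arow = a + i * lda;
        for (size_t k = 0; k < n; k++) {
            /* One row of b is reused for every column j of a. */
            const double *brow = b + k * ldb;
            /* Row (i*n + k) of c is filled left to right. */
            double *crow = c + (i * n + k) * ldc;
            for (size_t j = 0; j < p; j++) {
                const double da = arow[j];
                for (size_t l = 0; l < q; l++)
                    crow[j * q + l] = da * brow[l];
            }
        }
    }
}
```

With GSL matrices, the call would presumably look like kron_rowmajor(a->data, a->size1, a->size2, a->tda, b->data, b->size1, b->size2, b->tda, c->data, c->tda). Whether this is actually faster than KPro depends on the matrix sizes and the compiler, so it is worth measuring before adopting it.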