Problem Description
I'm making a matrix multiplication program in OpenMPI, and I got this error message:
[Mecha Liberta:12337] *** Process received signal ***
[Mecha Liberta:12337] Signal: Segmentation fault (11)
[Mecha Liberta:12337] Signal code: Address not mapped (1)
[Mecha Liberta:12337] Failing at address: 0xbfe4f000
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 12337 on node Mecha Liberta exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
That's how I define the matrices:
int **a, **b, **r;
a = (int **)calloc(l,sizeof(int));
b = (int **)calloc(l,sizeof(int));
r = (int **)calloc(l,sizeof(int));
for (i = 0; i < l; i++)
a[i] = (int *)calloc(c,sizeof(int));
for (i = 0; i < l; i++)
b[i] = (int *)calloc(c,sizeof(int));
for (i = 0; i < l; i++)
r[i] = (int *)calloc(c,sizeof(int));
And here's my Send/Recv (I'm pretty sure my problem is here):
MPI_Send(&sent, 1, MPI_INT, dest, tag, MPI_COMM_WORLD);
MPI_Send(&lines, 1, MPI_INT, dest, tag, MPI_COMM_WORLD);
MPI_Send(&(a[sent][0]), lines*NCA, MPI_INT, dest, tag, MPI_COMM_WORLD);
MPI_Send(&b, NCA*NCB, MPI_INT, dest, tag, MPI_COMM_WORLD);
and
MPI_Recv(&sent, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
MPI_Recv(&lines, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
MPI_Recv(&a, lines*NCA, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
MPI_Recv(&b, NCA*NCB, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
Can anyone see where the problem is?
Recommended Answer
This is a common problem with multidimensional arrays in C and MPI.
In this line:
MPI_Send(&b, NCA*NCB, MPI_INT, dest, tag, MPI_COMM_WORLD);
you're telling MPI to send NCAxNCB integers starting at b to rank dest in MPI_COMM_WORLD with tag tag. But b isn't a pointer to NCAxNCB integers; it's a pointer to NCA pointers to NCB integers.
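To make the layout issue concrete, here is a minimal, self-contained sketch (not from the original post) that allocates a small matrix row by row, the same way the question does, and prints the addresses involved. The rows come from separate calloc calls, so they are generally not adjacent in memory, and &m is the address of the pointer variable itself rather than of any int data, which is why a single send with a count of rows*cols reads memory that was never allocated:
#include <stdio.h>
#include <stdlib.h>
int main(void) {
    int rows = 3, cols = 4;
    /* row-by-row allocation, like the code in the question */
    int **m = (int **)calloc(rows, sizeof(int *));
    for (int i = 0; i < rows; i++)
        m[i] = (int *)calloc(cols, sizeof(int));
    /* each row is its own heap block, so the rows need not be adjacent */
    for (int i = 0; i < rows; i++)
        printf("row %d starts at %p\n", i, (void *)m[i]);
    /* &m points at the pointer variable, not at the int data */
    printf("&m = %p, but the first int lives at %p\n", (void *)&m, (void *)m[0]);
    for (int i = 0; i < rows; i++)
        free(m[i]);
    free(m);
    return 0;
}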
So what you want to do is to ensure your arrays are contiguous (probably better for performance anyway), using something like this:
int **alloc_2d_int(int rows, int cols) {
    /* one contiguous block for all the data... */
    int *data = (int *)malloc(rows*cols*sizeof(int));
    /* ...plus an array of row pointers into that block */
    int **array = (int **)malloc(rows*sizeof(int*));
    for (int i=0; i<rows; i++)
        array[i] = &(data[cols*i]);
    return array;
}
/* .... */
int **a, **b, **r;
a = alloc_2d_int(l, c);
b = alloc_2d_int(l, c);
r = alloc_2d_int(l, c);
Then
MPI_Send(&sent, 1, MPI_INT, dest, tag, MPI_COMM_WORLD);
MPI_Send(&lines, 1, MPI_INT, dest, tag, MPI_COMM_WORLD);
MPI_Send(&(a[sent][0]), lines*NCA, MPI_INT, dest, tag, MPI_COMM_WORLD);
MPI_Send(&(b[0][0]), NCA*NCB, MPI_INT, dest, tag, MPI_COMM_WORLD);
MPI_Recv(&sent, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
MPI_Recv(&lines, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
MPI_Recv(&(a[0][0]), lines*NCA, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
MPI_Recv(&(b[0][0]), NCA*NCB, MPI_INT, 0, tag, MPI_COMM_WORLD, &status);
should work more as expected.
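One small follow-up the answer doesn't spell out: because alloc_2d_int makes exactly two allocations (one block of ints plus one array of row pointers), the matching cleanup only needs to free those two blocks. free_2d_int below is a hypothetical helper, a sketch assuming the alloc_2d_int shown above:
void free_2d_int(int **array) {
    free(array[0]);  /* the contiguous rows*cols block of ints */
    free(array);     /* the array of row pointers */
}
/* ... */
free_2d_int(a);
free_2d_int(b);
free_2d_int(r);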