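I am trying to write an MPI program that sums a 1D array. However, I get a segmentation fault (core dumped).
I have tried to fix my function many times, but I cannot find where the error is or how to fix it.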
/* File: mpi_sum.c
* Compile as: mpicc -g -Wall -std=c99 -o mpi_sum mpi_sum.c -lm
* Run as: mpirun -n 40 ./mpi_sum
* Description: An MPI solution to sum a 1D array. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int myID, numProcs;           // myID: rank of this process, used to pick where it starts and stops
                                  // numProcs: number of processes sharing the calculation
    double localSum;              // partial sum computed by one process
    double parallelSum;           // total collected from every localSum
    int length = 10000000;        // how many numbers to sum
    double Fact = 1;
    int i;                        // loop index
    clock_t clockStart, clockEnd; // timer
    srand(5);                     // seed the random number generator
    MPI_Init(NULL, NULL);         // Initialize MPI
    MPI_Comm_size(MPI_COMM_WORLD, &numProcs); // Get size
    MPI_Comm_rank(MPI_COMM_WORLD, &myID);     // Get rank
    localSum = 0.0;               // each process starts from 0
    int A = (length / numProcs) * ((long)myID);     // first index of this process's range
    int B = (length / numProcs) * ((long)myID + 1); // one past the last index of this process's range
    A++;                          // shift the range by 1 to start at the next number
    B++;
    clockStart = clock();         // start the timer to see how long the sum takes
    for (i = A; i < B; i++)
    {
        Fact = (1 / myID - 1/numProcs) / (1 - 1/numProcs);
        localSum += Fact;
    }
    MPI_Reduce(&localSum, &parallelSum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    clockEnd = clock();
    if (myID == 0)
    {
        printf("Time to sum %d floats with MPI in parallel %3.5f seconds\n", length, (clockEnd - clockStart) / (float)CLOCKS_PER_SEC);
        printf("The parallel sum: %f\n", parallelSum + 1);
    }
    MPI_Finalize();
    return 0;
}
Best answer
When I run your code, my numProcs comes out as 1 and the program crashes with
*** Process received signal ***
Signal: Floating point exception (8)
Signal code: Integer divide-by-zero (1)
Failing at address: 0x400af9
[ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7f8bb13d2330]
[ 1] ./mpi_sum[0x400af9]
[ 2] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f8bb101af45]
[ 3] ./mpi_sum[0x400919]
*** End of error message ***
Floating point exception (core dumped)
on the line
Fact = (1 / myID - 1/numProcs) / (1 - 1/numProcs);
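because a denominator is zero. 1 / myID is an integer division, and myID is 0 on rank 0; with numProcs equal to 1, 1 - 1/numProcs is 0 as well.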
Since you are getting a different error, I suggest adding statements like
printf("%d\n", __LINE__); fflush(stdout);
to find out where it crashes.