I am trying to write an MPI program that sums a 1D array. However, I get a segmentation fault (core dumped).

I have tried to fix my function many times, but I cannot find where the error is or how to fix it.

/* File: mpi_sum.c
* Compile as: mpicc -g -Wall -std=c99 -o mpi_sum mpi_sum.c -lm
* Run as:  mpirun -n 40  ./mpi_sum
* Description: An MPI solution to sum a 1D array. */

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(int argc, char *argv[]) {
  int myID, numProcs;  // myID: rank of this process, used to pick its slice of the range
                       // numProcs: number of processes taking part in the calculation
  double localSum;    // partial sum computed by this process
  double parallelSum; // total collected from every localSum
  int length = 10000000; // how many numbers to sum
  double Fact = 1 ;
  int i; // loop index
  clock_t clockStart, clockEnd;   // timer
  srand(5); // seed the random number generator
  MPI_Init(NULL, NULL); // initialize MPI
  MPI_Comm_size(MPI_COMM_WORLD, &numProcs); // get the number of processes
  MPI_Comm_rank(MPI_COMM_WORLD, &myID); // get the rank of this process
  localSum = 0.0;                               // each process starts from 0
  int A = (length / numProcs)*((long)myID);     // first index of this process's slice
  int B = (length / numProcs)*((long)myID + 1); // one past the last index of this process's slice

  A ++;                                         // shift the slice by one so it starts at the next number
  B ++;

  clockStart = clock();                     // start the timer to see how much time it takes
  for (i = A; i < B; i++)
  {
          Fact = (1 / myID - 1/numProcs) / (1 - 1/numProcs);
          localSum += Fact ;
  }

  MPI_Reduce(&localSum, &parallelSum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

  clockEnd = clock();

  if (myID == 0)
  {
          printf("Time to sum %d floats with MPI in parallel %3.5f seconds\n", length, (clockEnd - clockStart) / (float)CLOCKS_PER_SEC);
          printf("The parallel sum: %f\n", parallelSum + 1);
  }

  MPI_Finalize();
  return 0;
}

Best answer

When I ran your code, my numProcs came out as 1 and the program crashed:

*** Process received signal ***
Signal: Floating point exception (8)
Signal code: Integer divide-by-zero (1)
Failing at address: 0x400af9
[ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7f8bb13d2330]
[ 1] ./mpi_sum[0x400af9]
[ 2] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f8bb101af45]
[ 3] ./mpi_sum[0x400919]
*** End of error message ***
Floating point exception (core dumped)


on the line

Fact = (1 / myID - 1/numProcs) / (1 - 1/numProcs);


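because the denominator is zero. Both 1 / myID and 1 / numProcs are integer divisions: on rank 0, myID is 0, so 1 / myID divides by zero, and when numProcs is 1, 1 - 1/numProcs is also 0.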
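To illustrate, here is a minimal sketch of the same expression evaluated in double precision with both zero cases checked up front. The helper name compute_fact is hypothetical, and treating rank 0 and a single-process run as "no value" is an assumption on my part; the question does not say what the factor should be in those cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of the original program.
 * Evaluates (1/myID - 1/numProcs) / (1 - 1/numProcs) in floating point.
 * Returns false, leaving *out untouched, when a denominator would be zero:
 *   - myID == 0      makes 1/myID undefined
 *   - numProcs <= 1  makes 1 - 1/numProcs zero (or numProcs is invalid) */
static bool compute_fact(int myID, int numProcs, double *out)
{
    assert(out != NULL);

    if (myID == 0 || numProcs <= 1) {
        return false;
    }

    /* 1.0 forces floating-point division; 1 / numProcs would truncate to 0. */
    double invProcs = 1.0 / numProcs;
    *out = (1.0 / myID - invProcs) / (1.0 - invProcs);
    return true;
}
```

Inside the loop you could call compute_fact(myID, numProcs, &Fact) and only add Fact to localSum when it returns true. Since the value does not depend on i, it could also be computed once before the loop.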

Since you are seeing a different error, I suggest you insert

printf("%d\n", __LINE__); fflush(stdout);


statements throughout the code to find out where it crashes.
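If you end up adding many of these, a small wrapper saves typing. This is only a sketch: trace_at and TRACE are hypothetical names, and including the rank in the message is my own addition so that output from different processes can be told apart.

```c
#include <stdio.h>

/* Hypothetical helper: prints which rank reached which source line, then
 * flushes so the message is not lost in a buffer if the process crashes
 * right afterwards. Returns 0 on success and -1 on an I/O error. */
static int trace_at(FILE *stream, int rank, int line)
{
    if (fprintf(stream, "[rank %d] reached line %d\n", rank, line) < 0) {
        return -1;
    }
    return fflush(stream) == 0 ? 0 : -1;
}

/* Expands __LINE__ at the call site, like the printf statement above. */
#define TRACE(rank) trace_at(stdout, (rank), __LINE__)
```

Placing TRACE(myID); before and after a suspicious statement narrows the crash down to the last line that was printed.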

10-06 00:27