Question
I have a 2D array where I'm running some computation on each process. Afterwards, I need to gather all the computed columns back to the root process. I'm currently partitioning in a first-come, first-served manner. In pseudocode, the main loop looks like:
DO i = mpi_rank + 1, num_columns, mpi_size
   array(:,i) = do work here
END DO
After this is completed, I need to gather these columns into the correct indices back in the root process. What is the best way to do this? It looks like MPI_GATHERV could do what I want if the partitioning scheme was different. However, I'm not sure what the best way to partition would be, since num_columns and mpi_size are not necessarily evenly divisible.
Answer
- Cut the 2D array into chunks of "almost equal" size, i.e. with the local number of columns close to num_columns / mpi_size.
- Gather the chunks with mpi_gatherv, which operates with chunks of different sizes.
To get "almost equal" number of columns, set local number of columns to integer value of num_columns
/ mpi_size
and increment by one only for first mod(num_columns,mpi_size)
mpi tasks.
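In the question's notation (num_columns, mpi_size and mpi_rank are assumed to hold the global column count, the communicator size and the rank of the calling process), the counting rule boils down to a minimal two-line sketch:

      nloc = num_columns / mpi_size                    ! integer division
      if (mod(num_columns, mpi_size) > mpi_rank) nloc = nloc + 1

For num_columns = 12 and mpi_size = 5 this gives nloc = 3 on ranks 0 and 1, and nloc = 2 on ranks 2 to 4, which is exactly the layout shown in the table below.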
The following table demonstrates the partitioning of a (10,12) matrix on 5 MPI processes:
01 02 03 11 12 13 21 22 31 32 41 42
01 02 03 11 12 13 21 22 31 32 41 42
01 02 03 11 12 13 21 22 31 32 41 42
01 02 03 11 12 13 21 22 31 32 41 42
01 02 03 11 12 13 21 22 31 32 41 42
01 02 03 11 12 13 21 22 31 32 41 42
01 02 03 11 12 13 21 22 31 32 41 42
01 02 03 11 12 13 21 22 31 32 41 42
01 02 03 11 12 13 21 22 31 32 41 42
01 02 03 11 12 13 21 22 31 32 41 42
Here the first digit is the id of the process and the second digit is the local column number. As you can see, processes 0 and 1 got 3 columns each, while all other processes got only 2 columns each.
Below you can find working example code that I wrote. The trickiest part is the generation of the rcounts and displs arrays for MPI_Gatherv. The discussed table is the output of the code.
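For the (10,12) matrix on 5 processes discussed above, the code ends up passing the following values to MPI_Gatherv (first counted in columns, then converted to numbers of integers by multiplying with m = 10):

rank              0    1    2    3    4
rcounts (cols)    3    3    2    2    2    -> (ints)   30   30   20   20   20
displs  (cols)    0    3    6    8   10    -> (ints)    0   30   60   80  100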
      program mpi2d
      implicit none
      include 'mpif.h'
      integer myid, nprocs, ierr
      integer,parameter:: m = 10 ! global number of rows
      integer,parameter:: n = 12 ! global number of columns
      integer nloc               ! local number of columns
      integer array(m,n)         ! global m-by-n, i.e. m rows and n columns
      integer,allocatable:: loc(:,:)   ! local piece of global 2d array
      integer,allocatable:: rcounts(:) ! nloc's of all ranks (mpi_gatherv)
      integer,allocatable:: displs(:)  ! displacements (mpi_gatherv)
      integer i,j

! Initialize
      call mpi_init(ierr)
      call mpi_comm_rank(MPI_COMM_WORLD, myid, ierr)
      call mpi_comm_size(MPI_COMM_WORLD, nprocs, ierr)

! Partition, i.e. get the local number of columns
      nloc = n / nprocs
      if (mod(n,nprocs) > myid) nloc = nloc + 1

! Compute the partitioned array: every local column holds
! 10*rank + local column number, as in the table above
      allocate(loc(m,nloc))
      do j=1,nloc
        loc(:,j) = myid*10 + j
      enddo

! Build arrays for mpi_gatherv:
!   rcounts contains all nloc's
!   displs contains displacements of the partitions, in columns
      allocate(rcounts(nprocs),displs(nprocs))
      displs(1) = 0
      do j=1,nprocs
        rcounts(j) = n / nprocs
        if (mod(n,nprocs) > (j-1)) rcounts(j) = rcounts(j) + 1
        if ((j-1) /= 0) displs(j) = displs(j-1) + rcounts(j-1)
      enddo

! Convert from numbers of columns to numbers of integers
      nloc = m * nloc
      rcounts = m * rcounts
      displs = m * displs

! Gather the array on the root
      call mpi_gatherv(loc,nloc,MPI_INTEGER,array,
     &    rcounts,displs,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)

! Print the array on the root
      if (myid == 0) then
        do i=1,m
          do j=1,n
            write(*,'(I3.2)',advance='no') array(i,j)
          enddo
          write(*,*)
        enddo
      endif

! Finish
      call mpi_finalize(ierr)
      end
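For reference, with a typical MPI installation (compiler wrapper and launcher names vary between distributions) the example can be built and run with something like mpif90 mpi2d.f -o mpi2d followed by mpirun -np 5 ./mpi2d, which prints the partitioning table shown above on rank 0.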