Problem description
I need to pass an array of integer arrays (basically a 2D array) from the root to all processes. I am using MPI in a C program. How do I declare an MPI datatype for a 2D array, and how should I send the message: with a broadcast or with a scatter?
Recommended answer
You'll need to use Broadcast (MPI_Bcast), because you want to send a copy of the same message to every process. Scatter (MPI_Scatter) breaks a message up and distributes the chunks among the processes.
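To make the difference concrete, here is a minimal plain-C simulation of what a single rank ends up holding in each case. It makes no MPI calls: the helper names after_bcast and after_scatter are hypothetical, and memcpy stands in for the actual communication.

```c
#include <string.h>

/* Simulated broadcast: whatever its rank, a process ends up with a full
 * copy of all `count` ints from the root's send buffer. */
void after_bcast(const int *send, int count, int rank, int *out)
{
    (void)rank; /* every rank receives the same data */
    memcpy(out, send, (size_t)count * sizeof(int));
}

/* Simulated scatter: the process with the given rank ends up with only
 * its own slice of `chunk` ints, starting at element rank * chunk. */
void after_scatter(const int *send, int chunk, int rank, int *out)
{
    memcpy(out, send + (size_t)rank * chunk, (size_t)chunk * sizeof(int));
}
```

With a send buffer of {10, 20, 30, 40} and two ranks, rank 1 would see all four values after the simulated broadcast, but only {30, 40} after the simulated scatter.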
As for how to send the data: the HIndexed datatype fits this case.
Suppose your 2D array is defined like this:
int N; // number of arrays (first dimension)
int sizes[N]; // number of elements in each array (second dimensions)
int* arrays[N]; // pointers to the start of each array
First, calculate the displacement of each array's starting address relative to the starting address of the datatype. For convenience, that can be the starting address of the first array:
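MPI_Aint base;
MPI_Get_address(arrays[0], &base); // replaces MPI_Address, which was removed in MPI-3.0
MPI_Aint* displacements = malloc(N * sizeof(MPI_Aint)); // C allocation; needs <stdlib.h>
for (int i=0; i<N; ++i)
{
    MPI_Get_address(arrays[i], &displacements[i]);
    displacements[i] -= base; // byte offset from the first array
}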
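To show what these byte displacements look like without involving MPI, here is a minimal plain-C sketch. The helper name byte_displacements is hypothetical, and it converts pointers through intptr_t as a stand-in for the MPI address call, so it only illustrates the arithmetic. Real MPI code should keep using the MPI address routine.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fills disp[i] with the distance in bytes from
 * arrays[0] to arrays[i]. The pointer-to-integer conversion is an
 * assumption made for illustration; it is not an MPI call. */
void byte_displacements(int *const arrays[], int n, ptrdiff_t disp[])
{
    intptr_t base = (intptr_t)arrays[0];
    for (int i = 0; i < n; ++i) {
        disp[i] = (ptrdiff_t)((intptr_t)arrays[i] - base);
    }
}
```

The result is signed on purpose: a row allocated at a lower address than the first one gets a negative displacement, which is also why MPI_Aint is a signed type.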
Then the definition of your type would be:
MPI_Datatype newType;
MPI_Type_create_hindexed(N, sizes, displacements, MPI_INT, &newType); // replaces MPI_Type_hindexed (removed in MPI-3.0); MPI_INT is the C int type
MPI_Type_commit(&newType);
This definition creates a datatype that contains all your arrays, packed one after the other. Once this is done, you just send your data as a single object of this type:
MPI_Bcast(arrays[0], 1, newType, root, comm); // the buffer is the first array, because the displacements are relative to it; 'root' and 'comm' are whatever you need
However, you're not done yet. The receiving processes need to know the sizes of the arrays you're sending. If those sizes aren't known at compile time, you'll have to send them first in a separate message (a simple array of ints). If N, sizes and arrays are defined on the receiving processes as above, with enough space allocated to hold the arrays, then all the receiving processes need to do is define the same datatype (exactly the same code as the sender) and receive the sender's message as a single instance of that type:
MPI_Bcast(arrays[0], 1, newType, root, comm); // 'root' and 'comm' must have the same values as in the sender's code
And voilà! All processes now have a copy of your array.
Of course, things get a lot easier if the second dimension of your 2D array is fixed to some value M. In that case, the easiest solution is to store it in a single int[N*M] array. C (like C++) guarantees that such an array occupies contiguous memory, so you can broadcast it without defining a custom datatype, like this:
MPI_Bcast(arrays, N*M, MPI_INT, root, comm); // MPI_INT is the C int type; MPI_INTEGER is the Fortran one
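For the fixed-width case, a minimal sketch of the flat layout might look like the following. The helper names alloc_flat and flat_at are hypothetical, and row-major order is an assumption chosen to match how C lays out built-in 2D arrays.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates an n-by-m matrix as one contiguous,
 * zero-initialised block of n*m ints. Returns NULL on failure. */
int *alloc_flat(size_t n, size_t m)
{
    return calloc(n * m, sizeof(int));
}

/* Returns a pointer to element (i, j), assuming row-major order:
 * row i starts at offset i * m. */
int *flat_at(int *a, size_t m, size_t i, size_t j)
{
    return &a[i * m + j];
}
```

Because the block is contiguous, the pointer returned by alloc_flat can be handed directly to a broadcast of N*M ints, and flat_at(a, M, i, j) plays the role of a[i][j] on every process.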
Note: you might get away with using the Indexed type instead of HIndexed. The difference is that in Indexed the displacements array is given in numbers of elements, while in HIndexed it is given in bytes (the H stands for Heterogeneous). If you were to use Indexed, the values in displacements would have to be divided by sizeof(int). However, I'm not sure whether integer arrays allocated at arbitrary positions on the heap are guaranteed to be a whole number of ints apart, and in any case the HIndexed version needs (marginally) less code and produces the same result.
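The conversion that the Indexed variant would need can be sketched as follows. The helper name bytes_to_int_units is hypothetical. It checks the alignment concern raised above instead of assuming it away: if any byte displacement is not a whole multiple of sizeof(int), it reports failure.

```c
#include <stddef.h>

/* Hypothetical helper: converts byte displacements into displacements
 * counted in ints, the unit the Indexed type expects. Returns 0 on
 * success, or -1 if some displacement is not a whole multiple of
 * sizeof(int), in which case only the byte-based form can describe
 * the layout. Assumes each quotient fits in an int. */
int bytes_to_int_units(const ptrdiff_t byte_disp[], int n, int elem_disp[])
{
    const ptrdiff_t unit = (ptrdiff_t)sizeof(int);
    for (int i = 0; i < n; ++i) {
        if (byte_disp[i] % unit != 0) {
            return -1;
        }
        elem_disp[i] = (int)(byte_disp[i] / unit);
    }
    return 0;
}
```

Element displacements are plain ints in the Indexed interface, so very distant allocations could overflow them. That is one more reason the byte-based HIndexed form is the safer default for separately allocated rows.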