Problem Description
I'm implementing a distributed image (greyscale) convolution using MPI. My existing pattern is to read the image as a 1D flattened array at the root process, scatter it to all the processes (row decomposition), do an MPI_Gather
at the root process, and then write the image out again as a 1D flattened array. Obviously, this doesn't give the expected results, since with image convolution the situation gets tricky at the boundaries.
So, to improve upon the aforementioned pattern, I want to implement the so-called ghost cell exchange
pattern, wherein the processes exchange their boundary rows into each other's ghost rows.
In pseudocode:
if (rank == 0) {
    src = MPI_PROC_NULL;
    dest = rank + 1;
} else if (rank == size - 1) {
    src = rank - 1;
    dest = MPI_PROC_NULL;
} else {
    src = rank - 1;
    dest = rank + 1;
}
MPI_Sendrecv(&sendbuf[offset], slen, MPI_FLOAT, dest, tag,
             &recvbuf[offset], rlen, MPI_FLOAT, src, tag,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
How do I allocate memory for the "ghost rows" on each process? Should I pre-allocate the memory and then scatter? I don't want to go for a "custom datatype" solution, since that would be overkill for the scope of the problem I'm considering.
Recommended Answer
Ideally, the ghost cells should be part of the same memory block as your normal cells. That way, you can keep the addressing scheme simple. In that scheme, the image is distributed in multiples of complete rows, using MPI_Scatter
and MPI_Gather
. On a non-border rank you allocate enough memory for two additional ghost rows:
// Rows per rank (assumes the image height divides evenly among the ranks).
height = total_height / ranks;

// One contiguous block: ghost row, `height` owned rows, ghost row.
std::vector<float> data(width * (height + 2));
float* image       = &data[width];                 // first owned row
float* ghost_north = &data[0];                     // halo row above
float* ghost_south = &data[width * (height + 1)];  // halo row below
float* inner_north = image;                        // first owned row (sent north)
float* inner_south = &image[width * (height - 1)]; // last owned row (sent south)

MPI_Scatter(root_image, width * height, MPI_FLOAT,
            image, width * height, MPI_FLOAT, ...);
...
iterations {
    // Exchange the outermost owned rows with the neighbouring ranks' ghost rows.
    MPI_Sendrecv(inner_north, width, MPI_FLOAT, north, tag,
                 ghost_north, width, MPI_FLOAT, north, tag, ...);
    MPI_Sendrecv(inner_south, width, MPI_FLOAT, south, tag,
                 ghost_south, width, MPI_FLOAT, south, tag, ...);
    ... compute ...
}
MPI_Gather(image, width * height, MPI_FLOAT,
           root_image, width * height, MPI_FLOAT, ...);
This pseudocode does not consider special border cases.
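One simple way to cover the border cases without extra branching, assuming the layout above, is to use MPI_PROC_NULL as the neighbour rank on the first and last rank; MPI_Sendrecv treats a send or receive involving MPI_PROC_NULL as a no-op. A minimal sketch (the tag, communicator, and neighbour names are assumptions, the buffers follow the pseudocode above):

// Neighbour ranks; MPI_PROC_NULL makes the corresponding transfer a no-op.
int north = (rank == 0)        ? MPI_PROC_NULL : rank - 1;
int south = (rank == size - 1) ? MPI_PROC_NULL : rank + 1;

// Halo exchange performed in every iteration.
MPI_Sendrecv(inner_north, width, MPI_FLOAT, north, 0,
             ghost_north, width, MPI_FLOAT, north, 0,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
MPI_Sendrecv(inner_south, width, MPI_FLOAT, south, 0,
             ghost_south, width, MPI_FLOAT, south, 0,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);

If the border ranks also allocate both ghost rows (one of them simply stays unused), the addressing stays identical on every rank and the compute kernel needs no rank-specific special cases.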
The issue with the simple one-dimensional splitting is that the communication cost and the amount of additional halo data are non-optimal, especially for smaller images and a larger number of participating ranks. For example, with an N x N image on P ranks, each interior rank in a row decomposition exchanges 2N halo cells regardless of P, whereas a 2D decomposition into roughly square tiles exchanges about 4N/sqrt(P) cells per rank, which shrinks as P grows.
Here is an excellent example by Rolf Rabenseifner regarding data decomposition and halo communication methods with MPI. He also explains how the communication methods can be improved. For a 2D decomposition, you will need derived MPI datatypes for both the initial communication and the vertical boundaries.
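For the vertical boundaries of such a 2D decomposition, a column of a row-major tile is strided in memory, so MPI_Type_vector is the natural fit. A minimal sketch, assuming each rank stores its tile, including one ghost column on each side, as a row-major array tile of local_width * local_height floats, with east as the right-hand neighbour (these names are illustrative, not part of the answer above):

// One float per local row, separated by a full row stride: describes a single column.
MPI_Datatype column_type;
MPI_Type_vector(local_height, 1, local_width, MPI_FLOAT, &column_type);
MPI_Type_commit(&column_type);

// Send the easternmost owned column, receive the neighbour's column into the east ghost column.
MPI_Sendrecv(&tile[local_width - 2], 1, column_type, east, 0,
             &tile[local_width - 1], 1, column_type, east, 0,
             MPI_COMM_WORLD, MPI_STATUS_IGNORE);

MPI_Type_free(&column_type);

The western exchange mirrors this with column indices 1 and 0; the horizontal (row) boundaries remain contiguous in memory and need no derived type.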