Problem description
I am writing an OpenMDAO problem that calls a group of external codes in a parallel group. One of these external codes is a PETSc-based fortran FEM code. I realize this is potentially problematic since OpenMDAO also utilizes PETSc. At the moment, I'm calling the external code in a component using python's subprocess.
If I run my OpenMDAO problem in serial (i.e. python2.7 omdao_problem.py), everything, including the external code, works just fine. When I try to run it in parallel, however (i.e. mpirun -np 4 python2.7 omdao_problem.py), it works up until the subprocess call, at which point I get the error:
*** Process received signal ***
Signal: Segmentation fault: 11 (11)
Signal code: Address not mapped (1)
Failing at address: 0xe3c00
[ 0] 0 libsystem_platform.dylib 0x00007fff94cb652a _sigtramp + 26
[ 1] 0 libopen-pal.20.dylib 0x00000001031360c5 opal_timer_darwin_bias + 15469
*** End of error message ***
I can't make much of this, but it seems reasonable to me that the problem would come from using an MPI-based python code to call another MPI-enabled code. I've tried using a non-MPI "hello world" executable in the external code's place and that can be called by the parallel OpenMDAO code without error. I do not need the external code to actually run in parallel, but I do need to use the PETSc solvers and such, hence the inherent reliance on MPI. (I guess I could consider having both an MPI-enabled and non-MPI-enabled build of PETSc lying around? I would prefer not to do that if possible, as I can see that becoming a mess in a hurry.)
I found this discussion which appears to present a similar issue (and further states that using subprocess in an MPI code, as I'm doing, is a no-no). In that case, it looks like using MPI_Comm_spawn may be an option, even though it isn't intended for that use. Any idea if that would work in the context of OpenMDAO? Other avenues to pursue for getting this to work? Any thoughts or suggestions are greatly appreciated.
Recommended answer
You don't need to call the external code as a sub-process. Wrap the Fortran code in Python using f2py and pass a comm object down into it. This docs example shows how to work with components that use a comm.
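For illustration, here is a minimal sketch of that idea using the ExplicitComponent API from more recent OpenMDAO versions (the 1.x Component API is analogous). The f2py-built module fem_fortran, its solve routine, and the variable names are hypothetical placeholders for whatever the real wrapper exposes:

```python
from openmdao.api import ExplicitComponent

# Hypothetical f2py-built extension module wrapping the PETSc/Fortran FEM code;
# the module name, routine name, and arguments stand in for the real wrapper.
import fem_fortran


class FEMComp(ExplicitComponent):
    """Calls the wrapped FEM code in-memory on this component's communicator."""

    def setup(self):
        self.add_input('thickness', val=1.0)
        self.add_output('max_stress', val=0.0)

    def compute(self, inputs, outputs):
        # self.comm is the sub-communicator OpenMDAO assigns to this component
        # inside the ParallelGroup; py2f() converts the mpi4py communicator to
        # the integer handle that Fortran MPI (and hence PETSc) expects.
        fcomm = self.comm.py2f()
        outputs['max_stress'] = fem_fortran.solve(fcomm, inputs['thickness'][0])
```

Because the FEM code then runs inside the same MPI processes as OpenMDAO, the Fortran library should initialize PETSc on that passed-in communicator (for example by setting PETSC_COMM_WORLD to it before calling PetscInitialize) rather than on MPI_COMM_WORLD.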
You could use an MPI spawn if you want to. This approach has been done, but it's far from ideal. You will be much more efficient if you can wrap the code in memory and let OpenMDAO pass you a comm.
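If you do go the spawn route, a rough mpi4py sketch of replacing the subprocess call is shown below. The executable path, its argument, and the assumption that the external code is adapted to call MPI_Comm_get_parent and MPI_Comm_disconnect before finalizing are all hypothetical:

```python
from mpi4py import MPI

# Hypothetical path to the external PETSc-based FEM executable.
cmd = './fem_solver'

# Launch the external code as a child MPI job instead of using subprocess;
# COMM_SELF means each calling process spawns its own single-process child.
child = MPI.COMM_SELF.Spawn(cmd, args=['input.dat'], maxprocs=1)

# Disconnect is collective over both sides: it waits until the child has
# called MPI_Comm_disconnect on its parent communicator, then frees the
# intercommunicator, so it effectively waits for the external run to finish.
child.Disconnect()
```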