I am trying to access a shared library that uses Open MPI from Python, but for some reason I get the following error message:

[Geo00433:01196] mca: base: component_find: unable to open /usr/lib/openmpi/lib/openmpi/mca_paffinity_hwloc: perhaps a missing symbol, or compiled for a different version of Open MPI? (ignored)
[Geo00433:01196] mca: base: component_find: unable to open /usr/lib/openmpi/lib/openmpi/mca_carto_auto_detect: perhaps a missing symbol, or compiled for a different version of Open MPI? (ignored)
[Geo00433:01196] mca: base: component_find: unable to open /usr/lib/openmpi/lib/openmpi/mca_carto_file: perhaps a missing symbol, or compiled for a different version of Open MPI? (ignored)
[Geo00433:01196] mca: base: component_find: unable to open /usr/lib/openmpi/lib/openmpi/mca_shmem_mmap: perhaps a missing symbol, or compiled for a different version of Open MPI? (ignored)
[Geo00433:01196] mca: base: component_find: unable to open /usr/lib/openmpi/lib/openmpi/mca_shmem_posix: perhaps a missing symbol, or compiled for a different version of Open MPI? (ignored)
[Geo00433:01196] mca: base: component_find: unable to open /usr/lib/openmpi/lib/openmpi/mca_shmem_sysv: perhaps a missing symbol, or compiled for a different version of Open MPI? (ignored)
-------------------------------------------------------------------------
It looks like opal_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here is some additional information (which may only be relevant to an
Open MPI developer):

  opal_shmem_base_select failed
    --> Returned value -1 instead of OPAL_SUCCESS
--------------------------------------------------------------------------
[Geo00433:01196] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 79
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here is some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: orte_init failed
  --> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
[Geo00433:1196] Local abort before MPI_INIT completed successfully; not able to aggregate error messages, and not able to guarantee that all other processes were killed!

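Any clue what is causing this? I have checked many web pages but still cannot find a solution to my problem.

I have Ubuntu 15.10 installed, with both MPICH and Open MPI.

Thank you very much!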

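Best answer

I ran into the same problem (or a very similar one, with slightly different error messages) on Ubuntu 16.04, even with only Open MPI installed. As far as I can tell, something is wrong with how Ubuntu's mpi4py package is built, but I'm not sure exactly what.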
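To see which mpi4py installation Python actually picks up, and how it was built, something like the following may help. It is only a sketch: the helper name `describe_mpi4py` is made up for illustration. It imports just the top-level `mpi4py` package (not `mpi4py.MPI`), so it should not trigger `MPI_Init` and the abort shown in the question.

```python
import importlib


def describe_mpi4py():
    """Return (install_path, build_config) for mpi4py, or (None, {}) if absent.

    Hypothetical helper. Only the top-level package is imported, so MPI
    itself is not initialized here.
    """
    try:
        pkg = importlib.import_module("mpi4py")
    except ImportError:
        return None, {}
    # get_config() reports build-time settings (for example the MPI compiler
    # wrapper); fall back to an empty dict if this version lacks it.
    get_config = getattr(pkg, "get_config", lambda: {})
    return pkg.__file__, dict(get_config())


path, config = describe_mpi4py()
if path is None:
    print("mpi4py is not installed for this interpreter")
else:
    print("mpi4py loaded from: {}".format(path))
    for key in sorted(config):
        print("  {} = {}".format(key, config[key]))
```

A path under the distribution's `dist-packages` directory would suggest the apt package, while a path under `/usr/local` or a user site directory would suggest a pip build.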

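To reproduce: the question does not make it entirely clear how the error message is produced (and I don't have the reputation to edit it), so here are the steps. First, install Ubuntu's mpi4py package, then start a Python session: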

$ sudo apt-get install python-mpi4py
$ python

Inside Python, try the following:
>>> from mpi4py import MPI

You should then get an error message like the OP's.

Solution: this is how I got it working. First, uninstall Ubuntu's package:
$ sudo apt-get remove python-mpi4py

Then install the Open MPI headers (the next step builds mpi4py from source) and pip:
$ sudo apt-get install libopenmpi-dev python-pip
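Because pip compiles mpi4py with whatever `mpicc` wrapper it finds on `PATH`, it may be worth checking what that wrapper resolves to before building, especially when both MPICH and Open MPI are installed as in the question. The sketch below assumes that on Ubuntu `mpicc` and `mpirun` may be symlinks managed by the alternatives system; the function names are hypothetical.

```python
import os


def find_on_path(name):
    """Return the first executable called `name` on PATH, or None."""
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def resolve_mpi_tools(names=("mpicc", "mpirun", "mpiexec")):
    """Map each tool name to the real file it points to (None if missing)."""
    resolved = {}
    for name in names:
        found = find_on_path(name)
        # realpath follows symlinks, so the implementation behind a generic
        # wrapper name becomes visible.
        resolved[name] = os.path.realpath(found) if found else None
    return resolved


for tool, target in sorted(resolve_mpi_tools().items()):
    print("{}: {}".format(tool, target))
```

If `mpicc` turns out to point at MPICH rather than Open MPI, mpi4py's documentation describes setting the `MPICC` environment variable for the pip build to select a specific wrapper.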

Finally, install mpi4py:
$ sudo pip install mpi4py

If you now try the Python command above, it should work.
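A failed `MPI_Init` aborts the whole interpreter, so one way to check the result without losing your session is to run the import in a child process. This is a minimal sketch; `try_import_mpi4py` is a hypothetical helper, and it assumes the interpreter running the script is the one mpi4py was installed for.

```python
import subprocess
import sys


def try_import_mpi4py(python=sys.executable):
    """Run 'from mpi4py import MPI' in a child process.

    Returns (ok, output): ok is True when the child exits with status 0,
    output is its combined stdout and stderr.
    """
    code = "from mpi4py import MPI; print(MPI.Get_version())"
    proc = subprocess.Popen(
        [python, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    output, _ = proc.communicate()
    return proc.returncode == 0, output


ok, output = try_import_mpi4py()
print("import succeeded" if ok else "import failed")
print(output)
```

On failure, the captured output should contain the same `opal_init` / `MPI_INIT` messages as in the question, or an `ImportError` if mpi4py is not installed at all.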
