Can't import MPI with Spectrum MPI
Hi,
I am trying to install mpi4py on a machine with Spectrum MPI. I followed the procedure from the documentation: downloaded the 2.0.0 tarball, edited mpi.cfg, and ran
python setup.py build --configure
python setup.py install
Both steps appear to work, but when I try to import MPI from mpi4py, I get an error:
Sorry! You were supposed to get help about:
mpi_init:startup:internal-failure
But I couldn't open the help file:
/__unresolved_path__________________/exports/optimized/share/spectrum_mpi/help-mpi-runtime.txt: No such file or directory. Sorry!
Please let me know if you have any ideas how to fix this.
One thing I am worried about is that I built python3.6 from scratch using gcc4, whereas Spectrum MPI's mpicc appears to be a wrapper around some other compiler.
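One way to check the compiler-mismatch worry is to ask the wrapper what it actually invokes. A small sketch, assuming an Open MPI-family wrapper (which Spectrum MPI's reportedly is) that supports the `--showme` flag; MPICH-family wrappers use `-show` instead:

```python
import shutil
import subprocess

def wrapper_command(wrapper="mpicc"):
    """Return the underlying compiler command line that an Open MPI-family
    wrapper would run (via `--showme`), or None if it is not found."""
    path = shutil.which(wrapper)
    if path is None:
        return None
    result = subprocess.run([path, "--showme"], capture_output=True, text=True)
    return result.stdout.strip()

# On the Spectrum MPI machine one would pass the full wrapper path,
# e.g. wrapper_command("/opt/ibm/spectrum_mpi/bin/mpicc"), and compare
# the reported compiler with the one used to build Python.
print(wrapper_command())
```

If the wrapper drives a different compiler family than the one that built Python, that can cause ABI surprises, though it is usually not fatal for C extensions.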
Comments (17)
-
assigned issue to
-
assigned issue to
-
PS: Sorry for the late answer. I was somehow unsubscribed from issue notifications for this repo.
-
reporter Sorry, I could not reply quickly myself either. I have gone with OpenMPI for now, but I will be back to testing Spectrum with mpi4py after Easter. I will get back in touch then.
-
From where did you get Spectrum MPI? I'm looking for it, but all I can find is Platform MPI. Are you using a developer preview?
-
Hi Lisandro,
I am getting the same issue with Spectrum MPI and mpi4py. Have you investigated it?
-
@cristianomalossi What is "Spectrum MPI"? From where do you get it? Is it the same as "Platform MPI"? What's the output of
ldd /path/to/mpi4py/MPI.so
? -
Hi Lisandro,
Spectrum MPI is an IBM MPI implementation based on Open MPI 2.0.x (the interface, syntax, etc. are equivalent). https://www-03.ibm.com/systems/it/spectrum-computing/products/mpi/ I think Platform MPI was the older product.
Unfortunately, I think it runs only on IBM Power nodes, so to get it you would actually need access to an IBM Power8 node. But maybe we can solve the issue without that.
Concerning the output of ldd: I cannot find an MPI.so file in the installation path $INSTALLATION_PATH/lib64/python3.4/site-packages/mpi4py/. However, this is what I have:
> cat mpi.cfg
[mpi]
mpicc   = /opt/ibm/spectrum_mpi/bin/mpicc
mpicxx  = /opt/ibm/spectrum_mpi/bin/mpicxx
mpifort = /opt/ibm/spectrum_mpi/bin/mpifort
mpif77  = /opt/ibm/spectrum_mpi/bin/mpif77
mpif90  = /opt/ibm/spectrum_mpi/bin/mpif90

> ldd MPI.cpython-34m.so
    linux-vdso64.so.1 (0x0000100000000000)
    libdl.so.2 => /opt/at9.0/lib64/libdl.so.2 (0x0000100000290000)
    libpython3.4m.so.1.0 => /opt/at9.0/lib64/libpython3.4m.so.1.0 (0x00001000002c0000)
    libmpi_ibm.so.2 => /opt/ibm/spectrum_mpi/lib/libmpi_ibm.so.2 (0x00001000005b0000)
    libgcc_s.so.1 => /opt/at9.0/lib64/power8/libgcc_s.so.1 (0x00001000006f0000)
    libm.so.6 => /opt/at9.0/lib64/power8/libm.so.6 (0x0000100000720000)
    libc.so.6 => /opt/at9.0/lib64/power8/libc.so.6 (0x0000100000800000)
    /opt/at9.0/lib64/ld64.so.2 (0x000000003f7a0000)
    libpthread.so.0 => /opt/at9.0/lib64/power8/libpthread.so.0 (0x00001000009f0000)
    libutil.so.1 => /opt/at9.0/lib64/power8/libutil.so.1 (0x0000100000a30000)
    libopen-rte.so.2 => /opt/ibm/spectrum_mpi/lib/libopen-rte.so.2 (0x0000100000a60000)
    libopen-pal.so.2 => /opt/ibm/spectrum_mpi/lib/libopen-pal.so.2 (0x0000100000b20000)
    librt.so.1 => /opt/at9.0/lib64/power8/librt.so.1 (0x0000100000c10000)
    libhwloc.so.5 => /opt/ibm/spectrum_mpi/lib/libhwloc.so.5 (0x0000100000c40000)
    libnuma.so.1 => /lib64/libnuma.so.1 (0x0000100000c90000)
    libevent-2.0.so.5 => /opt/ibm/spectrum_mpi/lib/libevent-2.0.so.5 (0x0000100000cc0000)
    libevent_pthreads-2.0.so.5 => /opt/ibm/spectrum_mpi/lib/libevent_pthreads-2.0.so.5 (0x0000100000d20000)

> ldd dl.cpython-34m.so
    linux-vdso64.so.1 (0x0000100000000000)
    libdl.so.2 => /opt/at9.0/lib64/libdl.so.2 (0x0000100000040000)
    libpython3.4m.so.1.0 => /opt/at9.0/lib64/libpython3.4m.so.1.0 (0x0000100000070000)
    libpthread.so.0 => /opt/at9.0/lib64/power8/libpthread.so.0 (0x0000100000360000)
    libc.so.6 => /opt/at9.0/lib64/power8/libc.so.6 (0x00001000003a0000)
    /opt/at9.0/lib64/ld64.so.2 (0x0000000048490000)
    libutil.so.1 => /opt/at9.0/lib64/power8/libutil.so.1 (0x0000100000590000)
    libm.so.6 => /opt/at9.0/lib64/power8/libm.so.6 (0x00001000005c0000)

Then under lib-pmpi:

> ldd libmpe.so
    linux-vdso64.so.1 (0x0000100000000000)
    libmpi_ibm.so.2 => /opt/ibm/spectrum_mpi/lib/libmpi_ibm.so.2 (0x0000100000040000)
    libgcc_s.so.1 => /opt/at9.0/lib64/power8/libgcc_s.so.1 (0x0000100000180000)
    libm.so.6 => /opt/at9.0/lib64/power8/libm.so.6 (0x00001000001b0000)
    libc.so.6 => /opt/at9.0/lib64/power8/libc.so.6 (0x0000100000290000)
    libopen-rte.so.2 => /opt/ibm/spectrum_mpi/lib/libopen-rte.so.2 (0x0000100000480000)
    libopen-pal.so.2 => /opt/ibm/spectrum_mpi/lib/libopen-pal.so.2 (0x0000100000540000)
    libdl.so.2 => /opt/at9.0/lib64/libdl.so.2 (0x0000100000600000)
    librt.so.1 => /opt/at9.0/lib64/power8/librt.so.1 (0x0000100000630000)
    libutil.so.1 => /opt/at9.0/lib64/power8/libutil.so.1 (0x0000100000660000)
    libhwloc.so.5 => /opt/ibm/spectrum_mpi/lib/libhwloc.so.5 (0x0000100000690000)
    libnuma.so.1 => /lib64/libnuma.so.1 (0x0000100000710000)
    libevent-2.0.so.5 => /opt/ibm/spectrum_mpi/lib/libevent-2.0.so.5 (0x0000100000740000)
    libevent_pthreads-2.0.so.5 => /opt/ibm/spectrum_mpi/lib/libevent_pthreads-2.0.so.5 (0x00001000007a0000)
    libpthread.so.0 => /opt/at9.0/lib64/power8/libpthread.so.0 (0x00001000007c0000)
    /opt/at9.0/lib64/ld64.so.2 (0x000000003ec50000)

> ldd libvt.so
    linux-vdso64.so.1 (0x0000100000000000)
    libmpi_ibm.so.2 => /opt/ibm/spectrum_mpi/lib/libmpi_ibm.so.2 (0x0000100000040000)
    libgcc_s.so.1 => /opt/at9.0/lib64/power8/libgcc_s.so.1 (0x0000100000180000)
    libm.so.6 => /opt/at9.0/lib64/power8/libm.so.6 (0x00001000001b0000)
    libc.so.6 => /opt/at9.0/lib64/power8/libc.so.6 (0x0000100000290000)
    libopen-rte.so.2 => /opt/ibm/spectrum_mpi/lib/libopen-rte.so.2 (0x0000100000480000)
    libopen-pal.so.2 => /opt/ibm/spectrum_mpi/lib/libopen-pal.so.2 (0x0000100000540000)
    libdl.so.2 => /opt/at9.0/lib64/libdl.so.2 (0x0000100000600000)
    librt.so.1 => /opt/at9.0/lib64/power8/librt.so.1 (0x0000100000630000)
    libutil.so.1 => /opt/at9.0/lib64/power8/libutil.so.1 (0x0000100000660000)
    libhwloc.so.5 => /opt/ibm/spectrum_mpi/lib/libhwloc.so.5 (0x0000100000690000)
    libnuma.so.1 => /lib64/libnuma.so.1 (0x0000100000710000)
    libevent-2.0.so.5 => /opt/ibm/spectrum_mpi/lib/libevent-2.0.so.5 (0x0000100000740000)
    libevent_pthreads-2.0.so.5 => /opt/ibm/spectrum_mpi/lib/libevent_pthreads-2.0.so.5 (0x00001000007a0000)
    libpthread.so.0 => /opt/at9.0/lib64/power8/libpthread.so.0 (0x00001000007c0000)
    /opt/at9.0/lib64/ld64.so.2 (0x000000003e100000)
There are also a couple of other .so files with similar dependencies. Everything looks correct from here. I installed mpi4py using pip.
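The "missing" MPI.so above is just a naming difference: on Python 3 the extension module's filename carries an ABI tag (MPI.cpython-34m.so here). A small sketch for listing a package's compiled extension modules, useful for finding the right file to feed to ldd:

```python
import glob
import importlib.util
import os

def ext_modules(package):
    """List compiled extension modules (*.so) inside a package directory."""
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return []
    return sorted(glob.glob(os.path.join(os.path.dirname(spec.origin), "*.so")))

# ext_modules("mpi4py") would list MPI.cpython-34m.so and friends;
# a pure-Python package such as json has none:
print(ext_modules("json"))
```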
-
Any chance you can try it again with a git checkout? I have access to a Power8 machine; I'll try to debug the issue.
-
@RobertSawko @cristianomalossi I managed to fix the issue; it was related to the usual Open MPI issues with dynamic loading of the MPI library. The fixes are in the master branch of the git repository (commits 182a78f and c04e66a). I'm not planning to make a patch release of mpi4py 2.0.0, so I recommend building from the git repository (you can even
pip install https://bitbucket.org/mpi4py/mpi4py/get/master.tar.gz
). If for any reason you need to build from the release 2.0.0 tarball, I can provide instructions to patch the source (adding a single line in an internal C header file should be enough). -
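For context, the dynamic-loading problem mentioned above is the classic Open MPI dlopen issue: the MPI library gets loaded with RTLD_LOCAL, so Open MPI's own plugin components cannot resolve MPI symbols at runtime. A user-side workaround sketch (not the actual fix in those commits; the Spectrum library name is taken from the ldd output earlier in this thread) is to preload the library with RTLD_GLOBAL before importing mpi4py:

```python
import ctypes
import ctypes.util

def preload_global(libname):
    """dlopen a shared library with RTLD_GLOBAL so that libraries loaded
    afterwards (e.g. Open MPI plugin components) can resolve its symbols."""
    path = ctypes.util.find_library(libname) or libname
    return ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL)

# On a Spectrum MPI system the sequence would be (library name assumed):
#     preload_global("libmpi_ibm.so.2")
#     from mpi4py import MPI
# Demonstrated here with the C math library (glibc Linux soname) instead:
libm = preload_global("libm.so.6")
```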
- changed status to resolved
-
Hi Lisandro,
many thanks for the very quick action. I am going to test this soon. When is the next release planned?
-
The next release should happen soon (though I've been promising it for ages). I'm still cleaning up the implementation of the new mpi4py.futures package.
-
reporter Thanks for this, @dalcinl . I have tried to run your newer version against Spectrum, but didn't manage. I compiled against a gcc-built OpenMPI, which works on the helloworld.py example, but then I swapped MPIs for Spectrum and it still complains. Is it possible to have an MPI-agnostic build, or do I need to recompile mpi4py against the target MPI implementation? Here's an example error output.
Sorry! You were supposed to get help about: opal_init:startup:internal-failure But I couldn't open the help file: /gpfs/panther/local/apps/ibm/spectrum_mpi/10.1.0/share/openmpi/help-opal-runtime.txt: No such file or directory. Sorry!
@cristianomalossi , could you give it a try on your side to eliminate cluster config as a potential issue?
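For reference, a helloworld-style smoke test along the lines of mpi4py's demo script can be sketched like this (the try/except guard is only so it degrades gracefully where mpi4py is not installed; run under the MPI launcher, e.g. mpiexec -n 2 python helloworld.py):

```python
# Minimal mpi4py smoke test: report this process's rank and the
# total number of ranks in MPI_COMM_WORLD.
try:
    from mpi4py import MPI
    msg = "Hello from rank %d of %d" % (
        MPI.COMM_WORLD.Get_rank(), MPI.COMM_WORLD.Get_size())
except ImportError:
    msg = "mpi4py not available"
print(msg)
```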
-
Did you use mpi4py from the master branch of a git checkout? Can you try to run your script with
mpiexec -n <np> -tcp python script.py
? If that doesn't work, well, I guess you have to bug the IBM folks about it. BTW, have you tried to run a simple pure C MPI code (like demo/helloworld.c in the mpi4py sources)?
Unfortunately, different MPI implementations are not ABI compatible, so it is not possible to compile once and switch implementations at runtime. In some cases it could be possible, as with Open MPI and Spectrum MPI, but right now it would not work: the IBM folks decided to rename the shared library to libmpi_ibm.so.X, while Open MPI uses libmpi.so.X. I think this is a bad decision. As a counterexample, MPICH and Intel MPI have an agreement to maintain ABI compatibility, so you can build your binaries with one of these implementations and then switch at runtime (maybe by setting LD_LIBRARY_PATH). Maybe you should raise this with the IBM folks as well; your request is quite sensible. -
Agreed that the libmpi_ibm.so.X name is a bad decision. With sudo I created a libmpi.so.X link to work around that issue. However, I have other issues with Spectrum MPI at the moment.
mpi4py compiles, but I have not tested further.
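The symlink workaround can be sketched as follows, demonstrated in a scratch directory; on a real system the files would live in something like /opt/ibm/spectrum_mpi/lib/ (a path assumed from earlier in this thread), and creating the link there needs root, hence the sudo:

```python
import os
import tempfile

# Alias Spectrum MPI's renamed library under the standard Open MPI
# soname, so binaries that look up libmpi.so.2 can still find it.
tmp = tempfile.mkdtemp()
real = os.path.join(tmp, "libmpi_ibm.so.2")  # stand-in for IBM's library
alias = os.path.join(tmp, "libmpi.so.2")     # name other binaries expect
open(real, "w").close()
os.symlink("libmpi_ibm.so.2", alias)         # relative link within the dir
print(os.path.islink(alias), os.path.realpath(alias) == os.path.realpath(real))
```

Note the link target is relative, so the alias keeps working if the installation directory is ever relocated as a whole.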
-
mpi4py master (to be released today as 3.0.0) should kind of work with Spectrum MPI out of the box.
-
This is most likely an issue with dynamic loading of the MPI libraries. I'll need to debug it myself using Spectrum MPI; I'll do it next week. I hope IBM provides an evaluation/developer subscription, otherwise I won't be able to do it myself and will need to ask for your help.