Can't import MPI with Spectrum MPI

Issue #63 resolved
Robert Manson-Sawko created an issue

Hi,

I am trying to install mpi4py on a machine with Spectrum MPI. I followed the procedure from the documentation: I downloaded the 2.0.0 tarball, edited mpi.cfg, and ran

python setup.py build --configure
python setup.py install

Both steps appear to work, but when I try to import MPI from mpi4py I get an error:

Sorry!  You were supposed to get help about:
    mpi_init:startup:internal-failure
But I couldn't open the help file:
    /__unresolved_path__________________/exports/optimized/share/spectrum_mpi/help-mpi-runtime.txt: No such file or directory.  Sorry!

Please let me know if you have any ideas how to fix this.

One thing I am afraid of is that I built python3.6 from scratch using gcc4, whereas the Spectrum MPI mpicc appears to be some sort of wrapper for another compiler.
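
A quick way to compare the two (just a sketch; --showme is the Open MPI-family wrapper flag, which Spectrum MPI presumably inherits, and the mpicc path is the default Spectrum MPI location, so adjust as needed):

    import subprocess
    import sysconfig

    # Compiler that was used to build this Python interpreter
    print(sysconfig.get_config_var("CC"))
    # Full command line the MPI compiler wrapper would invoke
    print(subprocess.check_output(
        ["/opt/ibm/spectrum_mpi/bin/mpicc", "--showme"]).decode())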

Comments (17)

  1. Lisandro Dalcin

    This is most likely an issue with dynamic loading of the MPI libraries. I'll need to debug it myself using Spectrum MPI. I'll do it next week. I hope IBM provides an evaluation/developer subscription; otherwise I'll not be able to do it myself and I'll need to ask for your help.
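
    For anyone who needs a stopgap in the meantime: the typical failure mode with Open MPI-based libraries is that Python loads the extension module without RTLD_GLOBAL, so plugins loaded later by the MPI runtime cannot resolve MPI symbols. A rough workaround (a sketch, not the eventual mpi4py fix; the soname libmpi_ibm.so.2 is taken from the ldd output further down this thread) is to preload the MPI library before importing mpi4py:

        import ctypes

        # Preload the MPI shared library with RTLD_GLOBAL so that the
        # runtime's dlopen'ed plugins can resolve MPI symbols. Spectrum
        # MPI names its library libmpi_ibm.so rather than libmpi.so;
        # this assumes its lib directory is on LD_LIBRARY_PATH.
        ctypes.CDLL("libmpi_ibm.so.2", mode=ctypes.RTLD_GLOBAL)

        from mpi4py import MPI  # the import should now initialize cleanly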

  2. Lisandro Dalcin

    PS: Sorry for the late answer. I was somehow unsubscribed from issue notifications for this repo.

  3. Robert Manson-Sawko reporter

    Sorry, I could not reply more quickly myself. I have now gone with OpenMPI, but I will be back to testing Spectrum with mpi4py after Easter. I will get back in touch then.

  4. Lisandro Dalcin

    From where did you get Spectrum MPI? I'm looking for it, but all I can find is Platform MPI. Are you using a developer preview?

  5. Cristiano Malossi

    Hi Lisandro,

    I am getting the same issue with Spectrum MPI and mpi4py. Did you investigate it?

  6. Lisandro Dalcin

    @cristianomalossi What is "Spectrum MPI"? From where do you get it? Is it the same as "Platform MPI"? What's the output of ldd /path/to/mpi4py/MPI.so?
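
    (If the path is unclear: importing the bare mpi4py package does not initialize MPI, so the following sketch works even when importing MPI itself fails. Note the extension file name varies with the Python version, e.g. MPI.cpython-34m.so rather than plain MPI.so.)

        import os
        import mpi4py

        # Locate the package directory and list the compiled extension
        # module(s) to feed to ldd.
        pkgdir = os.path.dirname(mpi4py.__file__)
        print([f for f in os.listdir(pkgdir) if f.startswith("MPI.")])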

  7. Cristiano Malossi

    Hi Lisandro,

    Spectrum MPI is an IBM MPI version, based on OpenMPI 2.0.x (interface, syntax, etc. are equivalent). https://www-03.ibm.com/systems/it/spectrum-computing/products/mpi/ I think Platform MPI was the older version.

    Unfortunately, I think it runs only on IBM Power nodes, so to get it you would actually need access to an IBM Power8 node. But maybe we can solve the issue without that.

    Concerning the output of ldd: I cannot find an MPI.so file in the installation path $INSTALLATION_PATH/lib64/python3.4/site-packages/mpi4py/

    However, this is what I have:

    > cat mpi.cfg
    [mpi]
    mpicc = /opt/ibm/spectrum_mpi/bin/mpicc
    mpicxx = /opt/ibm/spectrum_mpi/bin/mpicxx
    mpifort = /opt/ibm/spectrum_mpi/bin/mpifort
    mpif77 = /opt/ibm/spectrum_mpi/bin/mpif77
    mpif90 = /opt/ibm/spectrum_mpi/bin/mpif90
    
    > ldd MPI.cpython-34m.so
        linux-vdso64.so.1 (0x0000100000000000)
        libdl.so.2 => /opt/at9.0/lib64/libdl.so.2 (0x0000100000290000)
        libpython3.4m.so.1.0 => /opt/at9.0/lib64/libpython3.4m.so.1.0 (0x00001000002c0000)
        libmpi_ibm.so.2 => /opt/ibm/spectrum_mpi/lib/libmpi_ibm.so.2 (0x00001000005b0000)
        libgcc_s.so.1 => /opt/at9.0/lib64/power8/libgcc_s.so.1 (0x00001000006f0000)
        libm.so.6 => /opt/at9.0/lib64/power8/libm.so.6 (0x0000100000720000)
        libc.so.6 => /opt/at9.0/lib64/power8/libc.so.6 (0x0000100000800000)
        /opt/at9.0/lib64/ld64.so.2 (0x000000003f7a0000)
        libpthread.so.0 => /opt/at9.0/lib64/power8/libpthread.so.0 (0x00001000009f0000)
        libutil.so.1 => /opt/at9.0/lib64/power8/libutil.so.1 (0x0000100000a30000)
        libopen-rte.so.2 => /opt/ibm/spectrum_mpi/lib/libopen-rte.so.2 (0x0000100000a60000)
        libopen-pal.so.2 => /opt/ibm/spectrum_mpi/lib/libopen-pal.so.2 (0x0000100000b20000)
        librt.so.1 => /opt/at9.0/lib64/power8/librt.so.1 (0x0000100000c10000)
        libhwloc.so.5 => /opt/ibm/spectrum_mpi/lib/libhwloc.so.5 (0x0000100000c40000)
        libnuma.so.1 => /lib64/libnuma.so.1 (0x0000100000c90000)
        libevent-2.0.so.5 => /opt/ibm/spectrum_mpi/lib/libevent-2.0.so.5 (0x0000100000cc0000)
        libevent_pthreads-2.0.so.5 => /opt/ibm/spectrum_mpi/lib/libevent_pthreads-2.0.so.5 (0x0000100000d20000)
    
    > ldd dl.cpython-34m.so
        linux-vdso64.so.1 (0x0000100000000000)
        libdl.so.2 => /opt/at9.0/lib64/libdl.so.2 (0x0000100000040000)
        libpython3.4m.so.1.0 => /opt/at9.0/lib64/libpython3.4m.so.1.0 (0x0000100000070000)
        libpthread.so.0 => /opt/at9.0/lib64/power8/libpthread.so.0 (0x0000100000360000)
        libc.so.6 => /opt/at9.0/lib64/power8/libc.so.6 (0x00001000003a0000)
        /opt/at9.0/lib64/ld64.so.2 (0x0000000048490000)
        libutil.so.1 => /opt/at9.0/lib64/power8/libutil.so.1 (0x0000100000590000)
        libm.so.6 => /opt/at9.0/lib64/power8/libm.so.6 (0x00001000005c0000)
    
    Then, under lib-pmpi:
    > ldd libmpe.so
        linux-vdso64.so.1 (0x0000100000000000)
        libmpi_ibm.so.2 => /opt/ibm/spectrum_mpi/lib/libmpi_ibm.so.2 (0x0000100000040000)
        libgcc_s.so.1 => /opt/at9.0/lib64/power8/libgcc_s.so.1 (0x0000100000180000)
        libm.so.6 => /opt/at9.0/lib64/power8/libm.so.6 (0x00001000001b0000)
        libc.so.6 => /opt/at9.0/lib64/power8/libc.so.6 (0x0000100000290000)
        libopen-rte.so.2 => /opt/ibm/spectrum_mpi/lib/libopen-rte.so.2 (0x0000100000480000)
        libopen-pal.so.2 => /opt/ibm/spectrum_mpi/lib/libopen-pal.so.2 (0x0000100000540000)
        libdl.so.2 => /opt/at9.0/lib64/libdl.so.2 (0x0000100000600000)
        librt.so.1 => /opt/at9.0/lib64/power8/librt.so.1 (0x0000100000630000)
        libutil.so.1 => /opt/at9.0/lib64/power8/libutil.so.1 (0x0000100000660000)
        libhwloc.so.5 => /opt/ibm/spectrum_mpi/lib/libhwloc.so.5 (0x0000100000690000)
        libnuma.so.1 => /lib64/libnuma.so.1 (0x0000100000710000)
        libevent-2.0.so.5 => /opt/ibm/spectrum_mpi/lib/libevent-2.0.so.5 (0x0000100000740000)
        libevent_pthreads-2.0.so.5 => /opt/ibm/spectrum_mpi/lib/libevent_pthreads-2.0.so.5 (0x00001000007a0000)
        libpthread.so.0 => /opt/at9.0/lib64/power8/libpthread.so.0 (0x00001000007c0000)
        /opt/at9.0/lib64/ld64.so.2 (0x000000003ec50000)
    
    > ldd libvt.so
        linux-vdso64.so.1 (0x0000100000000000)
        libmpi_ibm.so.2 => /opt/ibm/spectrum_mpi/lib/libmpi_ibm.so.2 (0x0000100000040000)
        libgcc_s.so.1 => /opt/at9.0/lib64/power8/libgcc_s.so.1 (0x0000100000180000)
        libm.so.6 => /opt/at9.0/lib64/power8/libm.so.6 (0x00001000001b0000)
        libc.so.6 => /opt/at9.0/lib64/power8/libc.so.6 (0x0000100000290000)
        libopen-rte.so.2 => /opt/ibm/spectrum_mpi/lib/libopen-rte.so.2 (0x0000100000480000)
        libopen-pal.so.2 => /opt/ibm/spectrum_mpi/lib/libopen-pal.so.2 (0x0000100000540000)
        libdl.so.2 => /opt/at9.0/lib64/libdl.so.2 (0x0000100000600000)
        librt.so.1 => /opt/at9.0/lib64/power8/librt.so.1 (0x0000100000630000)
        libutil.so.1 => /opt/at9.0/lib64/power8/libutil.so.1 (0x0000100000660000)
        libhwloc.so.5 => /opt/ibm/spectrum_mpi/lib/libhwloc.so.5 (0x0000100000690000)
        libnuma.so.1 => /lib64/libnuma.so.1 (0x0000100000710000)
        libevent-2.0.so.5 => /opt/ibm/spectrum_mpi/lib/libevent-2.0.so.5 (0x0000100000740000)
        libevent_pthreads-2.0.so.5 => /opt/ibm/spectrum_mpi/lib/libevent_pthreads-2.0.so.5 (0x00001000007a0000)
        libpthread.so.0 => /opt/at9.0/lib64/power8/libpthread.so.0 (0x00001000007c0000)
        /opt/at9.0/lib64/ld64.so.2 (0x000000003e100000)
    

    There are also a couple of other .so files with similar dependencies. All looks correct from here. I have installed mpi4py using pip.

  8. Lisandro Dalcin

    Any chance you can try it again with a git checkout? I have access to a Power8 machine; I'll try to debug the issue.

  9. Lisandro Dalcin

    @RobertSawko @cristianomalossi I managed to fix the issue; it was related to the usual Open MPI problems with dynamic loading of the MPI library. The fixes are in the master branch of the git repository (commits 182a78f and c04e66a). I'm not planning to make a patch release of mpi4py 2.0.0, so I would recommend building from the git repository (you can even pip install https://bitbucket.org/mpi4py/mpi4py/get/master.tar.gz). If for any reason you need to build the release 2.0.0 tarball, I can provide instructions to patch the source (adding a single line in an internal C header file should be enough).

  10. Cristiano Malossi

    Hi Lisandro,

    many thanks for the very quick action. I am going to test this soon. When is the next release planned?

  11. Lisandro Dalcin

    The next release should happen soon (though I've been promising it for ages). I'm still cleaning up the implementation of the new mpi4py.futures package.

  12. Robert Manson-Sawko reporter

    Thanks for this, @dalcinl. I have tried to run your newer version against Spectrum, but didn't manage. I compiled it against a gcc-built OpenMPI, which works on the helloworld.py example, but when I swapped MPIs for Spectrum it still complained.

    Is it possible to have an MPI-agnostic version, or do I need to recompile mpi4py against the target MPI implementation?

    Here's an example error output.

    Sorry!  You were supposed to get help about:
        opal_init:startup:internal-failure
    But I couldn't open the help file:
        /gpfs/panther/local/apps/ibm/spectrum_mpi/10.1.0/share/openmpi/help-opal-runtime.txt: No such file or directory.  Sorry!
    

    @cristianomalossi, could you give it a try on your side to eliminate cluster config as a potential issue?

  13. Lisandro Dalcin

    Did you use mpi4py from the master branch of a git checkout? Can you try to run your script with mpiexec -n <np> -tcp python script.py? If that does not work, well, I guess you have to bug the IBM folks about it. BTW, have you tried to run a simple pure C MPI code (like demo/helloworld.c in the mpi4py sources)?
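
    For the Python side of that test, a minimal script (essentially the demo/helloworld.py shipped with mpi4py) would be:

        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        # Each rank reports itself; if this aborts already at import
        # time, the problem is in the MPI runtime, not in the script.
        print("Hello, World! I am process %d of %d on %s."
              % (comm.Get_rank(), comm.Get_size(), MPI.Get_processor_name()))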

    Unfortunately, different MPI implementations are not ABI compatible, so it is not possible to compile once and switch implementations at runtime. In some cases it could be possible, as with Open MPI and Spectrum MPI, but right now it would not work: the IBM folks decided to rename the shared library to libmpi_ibm.so.X, while Open MPI uses libmpi.so.X. I think this is a bad decision. As an alternative example, MPICH and Intel MPI have an agreement to maintain ABI compatibility, so you can build your binaries with one of these implementations and then switch at runtime (maybe by setting LD_LIBRARY_PATH). Maybe you should mention this to the IBM folks as well; your request is quite sensible.
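
    To confirm at runtime which implementation a given mpi4py build actually loaded, a sketch like this should do (Get_library_version requires an MPI-3 library; get_vendor is mpi4py's own helper):

        from mpi4py import MPI

        # mpi4py's guess of the implementation it was built against,
        # e.g. ('Open MPI', (2, 0, 0))
        print(MPI.get_vendor())
        # What the library loaded at runtime reports about itself
        print(MPI.Get_library_version())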

  14. Cristiano Malossi

    I agree that the libmpi_ibm.so.X name is a bad decision. With sudo, I created a libmpi.so.X link to overcome that issue. However, I have other issues with Spectrum MPI at the moment.

    mpi4py can be compiled, but I have not tested it further.

  15. Lisandro Dalcin

    mpi4py master (to be released today as 3.0.0) should kind of work with Spectrum MPI out of the box.
