Is a new release (2.1.0?) coming?

Issue #50 resolved

Former user created an issue 2016-09-09

Hi Lisandro,

I wondered whether a new release is coming or whether I should try the current development snapshot. 2.0.0 has finally started failing to build/test on Debian sid (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=830440; there were upgrades to openmpi, etc.), so instead of doing patch work I thought I would try the "bleeding edge" ;)

Comments (13)

Lisandro Dalcin
I'm working on a new feature related to Python 3's concurrent.futures interface. I would like to get it into the next release, but I'm still working on it.

Regarding the failure in Debian you linked, the failing test (see the full log) is related neither to MPI nor to any bug in mpi4py; it is just a bad assumption in mpi4py's test suite. The following patch should fix the test failure, so I would argue that the patch work is rather minimal:

https://bitbucket.org/mpi4py/mpi4py/diff/test/test_dl.py?diff2=74d8da24a9f4&at=maint
- 2016-09-11T08:34:33+00:00

Anatol Anatol

I tried to upgrade to openmpi 2.2 and found an mpi4py test issue:

  ompi_mpi_init: ompi_rte_init failed
  --> Returned "Unable to start a daemon on the local node" (-127) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[anatol:1013] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!

Do you plan to make a new release that is compatible with openmpi?

- 2017-06-10T06:42:02+00:00

Lisandro Dalcin
Yes, I'm planning it. Actually, everything is ready to make it. I'm waiting for the Microsoft folks to release MSMPI v8.1, and then I will make a new mpi4py release. In the meantime, I would suggest using a development snapshot.
- 2017-06-10T13:05:28+00:00

Anatol Anatol

An attempt to compile mpi4py HEAD with openmpi 2.1.1 gives the same error in the tests:

running test
--------------------------------------------------------------------------
The value of the MCA parameter "plm_rsh_agent" was set to a path
that could not be found:

  plm_rsh_agent: ssh : rsh

Please either unset the parameter, or check that the path is correct
--------------------------------------------------------------------------
[anatol:01129] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a daemon on the local node in file ess_singleton_module.c at line 582
[anatol:01129] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a daemon on the local node in file ess_singleton_module.c at line 166
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_init failed
  --> Returned value Unable to start a daemon on the local node (-127) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: ompi_rte_init failed
  --> Returned "Unable to start a daemon on the local node" (-127) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[anatol:1129] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!

- 2017-06-14T07:10:23+00:00

Lisandro Dalcin
Are you able to run any other MPI program? Open MPI 2.1.1 is being used in our Bitbucket Pipelines builds https://bitbucket.org/mpi4py/mpi4py/addon/pipelines/home#!/, and the latest Open MPI builds are green for all Python versions. I'm afraid that the problem is on your side. I bet it is a simple configuration problem. Do you have the rsh command installed? Also, check the Open MPI docs; you may set the parameter plm_rsh_agent to ssh.
- 2017-06-14T09:28:32+00:00
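
The configuration check suggested above can be sketched in Python. This is only a sketch under stated assumptions: the helper names `pick_rsh_agent` and `configure_rsh_agent` are hypothetical, and it relies on Open MPI reading MCA parameters from environment variables of the form `OMPI_MCA_<name>`. If neither ssh nor rsh is on PATH (which is what the log above seems to report), it changes nothing and returns None.

```python
import os
import shutil


def pick_rsh_agent(candidates=("ssh", "rsh"), which=shutil.which):
    """Return the first candidate launcher found on PATH, or None."""
    for name in candidates:
        if which(name) is not None:
            return name
    return None


def configure_rsh_agent(env=os.environ, which=shutil.which):
    """Point Open MPI at an available launcher (hypothetical helper).

    Only has an effect if called before MPI is initialized.
    """
    agent = pick_rsh_agent(which=which)
    if agent is not None:
        # OMPI_MCA_<name> is the environment form of an MCA parameter;
        # setdefault leaves any value the user has already chosen untouched.
        env.setdefault("OMPI_MCA_plm_rsh_agent", agent)
    return agent
```

In a test script, `configure_rsh_agent()` would have to run before `from mpi4py import MPI`, since importing that module initializes MPI by default.
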
Christopher Ostrouchov
First of all, Lisandro, I would like to thank you for this amazing feature and the great work you do on mpi4py. MPIPoolExecutor is exactly what I needed, as it fits into HPC workflows very nicely.

Is there any plan to release this version to PyPI? Currently I am just pinning a package of mine to a commit.
- 2017-08-19T14:13:38+00:00
Lisandro Dalcin
@costrouc I'm still working on some low-level things as time permits. And I still have to do some testing of mpi4py.futures on Cray systems.
- 2017-08-20T09:23:59+00:00
Christopher Ostrouchov
@dalcinl I will need to check my remaining compute hours on NERSC, but I would be happy to help with testing on Cray systems. Cori is a Cray system. Would that be useful?
- 2017-08-20T17:39:28+00:00
Lisandro Dalcin
Sure! I would really appreciate it. Using python -m mpi4py.futures should of course work. Could you also try the other way, running the script without -m mpi4py.futures? AFAIK, spawning was not supported, but maybe there was some recent upgrade I'm not aware of.
- 2017-08-20T17:42:13+00:00

Christopher Ostrouchov

Okay, so I ran with and without python -m mpi4py.futures, and it behaved exactly as you expected.

Summary of the installation: NERSC prefers Anaconda, so I had to install that way.

conda create -n mpi python=3.6
source activate mpi
git clone git@bitbucket.org:mpi4py/mpi4py.git
cd mpi4py
# edit setup.cfg so that it points to the correct compilers (cc, CC, ftn)
python setup.py build
python setup.py install

Now to the jobs that I submitted for testing (they block MPI calls on login nodes for good reason).

#!/bin/bash -l

#SBATCH -N 1
#SBATCH -t 00:15:00
#SBATCH -p debug
#SBATCH -L SCRATCH   #Job requires $SCRATCH file system
#SBATCH -C haswell

module load python/3.6-anaconda-4.4
source activate mpi

pwd
python3.6 -m pip list

srun -n 1 python3.6 script.py
# srun -n 16 python3.6 -m mpi4py.futures script.py
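
The two srun lines in the job script correspond to the two launch modes being compared. A rough sketch that makes the difference explicit; `build_srun_command` is a hypothetical helper, and the task counts are simply the ones used in the job script:

```python
def build_srun_command(script, ntasks, use_futures_module, python="python3.6"):
    """Build the srun command line for one of the two launch modes."""
    cmd = ["srun", "-n", str(ntasks), python]
    if use_futures_module:
        # All processes are started by srun up front, so the pool
        # does not need MPI process spawning.
        cmd += ["-m", "mpi4py.futures"]
    cmd.append(script)
    return cmd


# Single process; the pool then has to spawn its workers through MPI.
spawn_mode = build_srun_command("script.py", 1, use_futures_module=False)
# Sixteen processes started together under -m mpi4py.futures.
pool_mode = build_srun_command("script.py", 16, use_futures_module=True)

print(" ".join(spawn_mode))
print(" ".join(pool_mode))
```
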

script.py

import sys

from mpi4py import MPI
from mpi4py.futures import MPIPoolExecutor

def do_work(i):
    return "Did some work on %d" % i

def main():
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    size = comm.Get_size()

    print(size, rank)
    print(sys.executable)

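    # Squares of 0..99 serve as dummy work items.
    tasks = [i**2 for i in range(100)]

    # Without -m mpi4py.futures, the pool has to spawn its workers via MPI.
    with MPIPoolExecutor(max_workers=10) as executor:
        results = []
        for result in executor.map(do_work, tasks):
            print(type(result), result)
            results.append(result)
    print(results)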

if __name__ == "__main__":
    main()
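
Since login nodes block MPI calls, the task logic of script.py can be checked without MPI first. The sketch below assumes that MPIPoolExecutor can be swapped for a standard-library executor, which the way script.py uses it (a `max_workers` keyword, a `with` block, and `map`) suggests; `run_tasks` is a hypothetical helper, and ThreadPoolExecutor is only a local stand-in.

```python
from concurrent.futures import ThreadPoolExecutor


def do_work(i):
    return "Did some work on %d" % i


def run_tasks(executor_factory, tasks, max_workers=10):
    """Map do_work over tasks with any concurrent.futures-style executor."""
    with executor_factory(max_workers=max_workers) as executor:
        # Executor.map yields results in the order of the inputs.
        return list(executor.map(do_work, tasks))


tasks = [i**2 for i in range(100)]
results = run_tasks(ThreadPoolExecutor, tasks)
print(results[:3])
```

Under that assumption, passing `MPIPoolExecutor` instead of `ThreadPoolExecutor` inside a job would run the same tasks on MPI workers.
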

Running without -m mpi4py.futures led to the following error:

/global/homes/c/costrouc
DEPRECATION: The default format will switch to columns in the future. You can use --format=(legacy|columns) (or define a format=(legacy|columns) in your pip.conf under the [list] section) to disable this warning.
mpi4py (2.0.1a0)
pip (9.0.1)
setuptools (27.2.0)
wheel (0.29.0)
Sun Aug 20 13:02:42 2017: [PE_0]:PMI2_Job_Spawn:PMI2_Job_Spawn not implemented.

Let me know if I can be of any more help. I think this is an awesome addition to Python for HPC.

- 2017-08-20T20:13:50+00:00

Lisandro Dalcin
OK, many thanks! The PMI2_Job_Spawn error is indeed expected. The community just needs to put a bit of pressure on Cray to support MPI process spawning :-).

Could you please share your changes to setup.cfg or mpi.cfg here?
- 2017-08-20T20:21:47+00:00
Christopher Ostrouchov
The only changes were to setup.cfg. The beginning of the file was changed according to the instructions for compiling on Cori. Perhaps most importantly, when I did not change any of the files at first, I got an error along the lines of "#include mpi.h not found". To fix this, you just need to make mpi4py use the correct compilers. In the case of Cori, these are cc, CC, and ftn.

setup.cfg changes
```
[config]
mpicc   = cc
mpicxx  = CC
mpifort = ftn
```
- 2017-08-21T01:20:02+00:00
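
Before building, it may help to confirm that the compiler wrappers named in the [config] section are actually on PATH, since a wrong compiler is what produced the missing mpi.h error. A small sketch, assuming the snippet above is parsed as plain INI text; `read_compilers` and `missing_compilers` are hypothetical helpers, not part of mpi4py:

```python
import configparser
import shutil

# The [config] section quoted in the comment above.
SETUP_CFG_SNIPPET = """
[config]
mpicc   = cc
mpicxx  = CC
mpifort = ftn
"""


def read_compilers(cfg_text):
    """Return the compiler wrappers named in the [config] section."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return dict(parser["config"])


def missing_compilers(compilers, which=shutil.which):
    """List the settings whose command is not found on PATH."""
    return [key for key, cmd in compilers.items() if which(cmd) is None]


compilers = read_compilers(SETUP_CFG_SNIPPET)
print(compilers)
print("not on PATH:", missing_compilers(compilers))
```
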
Lisandro Dalcin
- changed status to resolved
mpi4py 3.0.0 just released!
- 2017-11-08T12:17:50+00:00

Assignee: –

Type: proposal

Priority: minor

Status: resolved

Votes: 0

Watchers: None