MPI_THREAD_MULTIPLE vs MPI_THREAD_SINGLE

Issue #80 wontfix
Andreas Gocht created an issue

Hey there,

I just had a look at src/python.c, line 33 and following:

 33 static int
 34 PyMPI_Main(int argc, char **argv)
 35 {
 36   int sts=0, flag=1, finalize=0;
 37 
 38   /* MPI initalization */
 39   (void)MPI_Initialized(&flag);
 40   if (!flag) {
 41 #if defined(MPI_VERSION) && (MPI_VERSION > 1)
 42     int required = MPI_THREAD_MULTIPLE;
 43     int provided = MPI_THREAD_SINGLE;
 44     (void)MPI_Init_thread(&argc, &argv, required, &provided);
 45 #else
 46     (void)MPI_Init(&argc, &argv);
 47 #endif
 48     finalize = 1;
 49   }

I wondered why MPI_THREAD_MULTIPLE is required?

The MPI doc says:

MPI_THREAD_MULTIPLE Multiple threads may call MPI, with no restrictions.

[MPI doc page 488: http://mpi-forum.org/docs/mpi-3.1/mpi31-report.pdf]

However, wouldn't MPI_THREAD_SINGLE be sufficient? As I understand it, multithreading with Python is pointless anyway (https://wiki.python.org/moin/GlobalInterpreterLock).

I realised this while tracing an MPI application using Score-P (https://github.com/score-p/scorep_binding_python/tree/mpi2), which warns about the MPI thread level.

Best,

Andreas

Comments (6)

  1. Lisandro Dalcin

    We cannot know in advance the level of MPI thread support required by user code. Therefore, src/python.c (and the mpi4py.MPI module) requests the maximum level to be safe.

    Multithreading in Python is not pointless, otherwise Python would not have a threading module, right? While it is true that the GIL prevents Python bytecode (or Python C-API code) from running concurrently, once a third-party pure C/C++/Fortran library is allowed to run with the GIL released, concurrent execution of threads is indeed possible, as long as the library does not call back into Python.

    mpi4py itself (in the new mpi4py.futures module) is able to make concurrent MPI calls, and that requires MPI_THREAD_MULTIPLE support. FYI, if you call, let's say, comm.Send(), mpi4py ends up releasing the GIL to perform the (blocking) MPI_Send() call. Look at the example in the demo/threads directory; that code would not work if the threads were not able to run concurrently (and that happens thanks to mpi4py releasing the GIL).
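The effect described above can be demonstrated without MPI at all. This is a minimal pure-Python sketch: time.sleep() releases the GIL while it waits, the same way mpi4py releases the GIL around a blocking MPI_Send(), so several "blocking" calls overlap instead of serializing.

```python
import threading
import time

def blocking_call():
    # time.sleep releases the GIL while waiting, just as mpi4py
    # releases it around blocking MPI calls such as MPI_Send.
    time.sleep(0.5)

start = time.monotonic()
threads = [threading.Thread(target=blocking_call) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start

# Four 0.5 s "blocking calls" overlap: total wall time stays close
# to 0.5 s rather than the 2.0 s a serialized run would take.
print(f"elapsed: {elapsed:.2f}s")
```

If the calls held the GIL for their whole duration, the threads would serialize and the total time would be roughly the sum of the individual waits.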

    About the Score-P warning: well, IMHO, you should complain to them. I've put a lot of effort into making mpi4py thread-safe. In the face of ambiguity (what mpi4py users need), I refuse the temptation to guess (and request from MPI the maximum level of thread support).

    All that being said, I acknowledge that src/python.c should have a user-controlled way (command line? environment variable?) to call MPI_Init() (or request a different level of thread support) instead. I would happily accept a patch providing such features.
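One possible shape for the environment-variable variant, sketched in Python for brevity (the variable name PYMPI_THREAD_LEVEL is purely hypothetical; neither mpi4py nor src/python.c defines it):

```python
import os

# The four MPI thread-level names, in the standard's ordering.
_LEVELS = ("single", "funneled", "serialized", "multiple")

def requested_thread_level(env=None):
    """Map a hypothetical PYMPI_THREAD_LEVEL environment variable onto
    an MPI thread-level name, defaulting to the current behavior
    (requesting "multiple")."""
    if env is None:
        env = os.environ
    value = env.get("PYMPI_THREAD_LEVEL", "multiple").lower()
    if value not in _LEVELS:
        raise ValueError(f"invalid thread level: {value!r}")
    return value
```

A real patch would do the equivalent lookup in C with getenv() before the MPI_Init_thread() call shown in the issue.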

  2. Andreas Gocht reporter

    Multithreading in Python is not pointless, otherwise Python would not have a threading module, right? While it is true that the GIL prevents Python bytecode (or Python C-API code) from running concurrently, once a third-party pure C/C++/Fortran library is allowed to run with the GIL released, concurrent execution of threads is indeed possible, as long as the library does not call back into Python.

    There is a point 😄 .

    mpi4py itself (in the new mpi4py.futures module) is able to make concurrent MPI calls, and that requires MPI_THREAD_MULTIPLE support. FYI, if you call, let's say, comm.Send(), mpi4py ends up releasing the GIL to perform the (blocking) MPI_Send() call. Look at the example in the demo/threads directory; that code would not work if the threads were not able to run concurrently (and that happens thanks to mpi4py releasing the GIL).

    Interesting, I'll have a look on this.

    All that being said, I acknowledge that src/python.c should have a user-controlled way (command line? environment variable?) to call MPI_Init() (or request a different level of thread support) instead. I would happily accept a patch providing such features.

    I'll have a look at this as soon as I have some time left. So something like mpirun python my_mpi.py --thread=single would be ok for you?

    Best,

    Andreas

  3. Lisandro Dalcin

    Yes. However, note that it should be mpiexec python --mpi-thread-level=single script.py, i.e., the flag has to be passed to the Python executable before the script, and you should intercept and remove the flag before calling Py_Main(), which might prove a bit difficult to get right (unless you restrict the flag to be exactly argv[1]).

    In Python 3, this could be handled with a -X option, i.e. something like -X mpi-thread-level=single. This could even be hacked to be supported in Python 2 (through interception and removal of the argument). As you see, lots of tiny details for a feature that may not pay off. MPI_Init_thread was added to MPI about 20 years ago. MPI implementations should just support it; we are in 2017, in a multicore world, there is no excuse!!
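The interception-and-removal step mentioned above could look roughly like this. A sketch in Python for clarity (the real work would happen in C before Py_Main(); the flag name --mpi-thread-level is the one proposed in this thread, not an existing option):

```python
def strip_mpi_flag(argv, prefix="--mpi-thread-level="):
    """Scan argv for the proposed flag, remove it, and return
    (level, remaining_argv). level is None if the flag is absent."""
    level = None
    kept = []
    for arg in argv:
        if arg.startswith(prefix):
            level = arg[len(prefix):]
        else:
            kept.append(arg)
    return level, kept
```

The tricky part the comment alludes to is doing this filtering before the interpreter ever sees argv, so the flag never reaches the script's sys.argv.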

    Years ago (2008) I asked Bill Gropp (father of MPI) about this choice in mpi4py, and he replied:

    I recommend that you initialize with MPI_Init_thread and MPI_THREAD_MULTIPLE . There is some overhead, but it is mainly an added latency and is thus most important for short messages. You can give users that want to optimize the option to select a lower level of thread support. At 5k entries, on a cluster, the added latency should not be too serious.

  4. Ehsan Moravveji

    I would like to reopen this ticket.

    I have compiled mpi4py from source on our cluster (hyperthreading is disabled on our compute nodes), using the following module list (with OpenMPI/4.0.0, which is CUDA-aware).

    Currently Loaded Modules:
      1) GCCcore/6.4.0
      2) binutils/2.28-GCCcore-6.4.0
      3) GCC/6.4.0-2.28
      4) zlib/1.2.11-GCCcore-6.4.0
      5) numactl/2.0.11-GCCcore-6.4.0
      6) XZ/5.2.3-GCCcore-6.4.0
      7) libxml2/2.9.7-GCCcore-6.4.0
      8) libpciaccess/0.14-GCCcore-6.4.0
      9) hwloc/2.0.2-GCCcore-6.4.0
     10) CUDA/10.0.130
     11) OpenMPI/4.0.0-GCC-6.4.0-2.28
     12) bzip2/1.0.6-GCCcore-6.4.0
     13) libreadline/7.0-GCCcore-6.4.0
     14) ncurses/6.0-GCCcore-6.4.0
     15) Tcl/8.6.8-GCCcore-6.4.0
     16) SQLite/3.21.0-GCCcore-6.4.0
     17) GMP/6.1.2-GCCcore-6.4.0
     18) libffi/3.2.1-GCCcore-6.4.0
     19) Python/3.6.5-GCCcore-6.4.0-bare
     20) mpi4py/3.0.1-GCC-6.4.0-2.28

    However, when I try to run the basic mpi4py test example, I get the following error:

    $ mpiexec -n 2 python -m mpi4py.bench helloworld

    [r23i13n19:35098] pml_ucx.c:228 Error: UCP worker does not support MPI_THREAD_MULTIPLE

    [r23i13n19:35099] pml_ucx.c:228 Error: UCP worker does not support MPI_THREAD_MULTIPLE

    Hello, World! I am process 0 of 2 on r23i13n19.

    Hello, World! I am process 1 of 2 on r23i13n19.

    What's your verdict on this?

    Thanks a lot.

  5. Lisandro Dalcin

    @ehsan_moravveji The MPI standard provides a mechanism whereby you call MPI_Init_thread() with the level of thread support you want/need (required), and the MPI implementation initializes the MPI runtime and answers back with the level of thread support it can actually provide (provided). The provided value can be higher, equal, or lower than required. If it is lower, it is up to the caller of MPI_Init_thread() to handle the lack of thread support: the caller could error out right away, or just proceed happily (that's what mpi4py does). So, from the point of view of the MPI standard, mpi4py's default is just the right thing to do.

    Note also that import mpi4py.rc; mpi4py.rc.threads = False at the very beginning of your script will initialize MPI without thread support. mpi4py.bench and mpi4py.run have options to disable threads if the need arises; run them with --help.
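The required/provided handshake can be sketched with plain integers. This is a mock, not a real MPI binding; it only models the standard's ordering MPI_THREAD_SINGLE < FUNNELED < SERIALIZED < MULTIPLE, and the common case where a library caps provided at what it supports:

```python
# Symbolic thread levels; only the relative ordering matters here.
(MPI_THREAD_SINGLE, MPI_THREAD_FUNNELED,
 MPI_THREAD_SERIALIZED, MPI_THREAD_MULTIPLE) = range(4)

def mock_init_thread(required, library_max):
    """Mock of the MPI_Init_thread handshake: the library answers with
    what it can actually provide, which may be lower than required."""
    return min(required, library_max)

# mpi4py requests MULTIPLE; a build capped at SERIALIZED answers lower,
# and the caller (like mpi4py) may simply proceed with what it got.
provided = mock_init_thread(MPI_THREAD_MULTIPLE, MPI_THREAD_SERIALIZED)
```

In the Open MPI/UCX case above, the pml_ucx warning is the implementation telling you that provided ended up below the MPI_THREAD_MULTIPLE that was required.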

    Now, about the Open MPI behavior:

    1. The script seems to proceed just fine; you just get a warning, despite it saying Error: ....
    2. The warning is annoying, but maybe Open MPI has a way to silence it through MCA parameters.
    3. Ultimately, this is an Open MPI issue, you should file a bug report there and ask them to take action.