add OpenMPI env vars to notebook to avoid warning messages due to vader library in containers

Issue #2287 resolved
Roland Haas created an issue

Running in a Docker container based on Ubuntu 18.04, I am now getting error messages like this:

[ekohaes8:26785] Read -1, expected 313632, errno = 1

by the hundreds. This is using Ubuntu 18.04 rather than 16.04, so it may only happen with newer versions of OpenMPI; the OpenMPI ticket referenced below mentions that this happens on at least OpenMPI 4.0 and 3.1.3. This is apparently a known problem: https://github.com/open-mpi/ompi/issues/4948, with the workaround being to set an environment variable (or the equivalent setting in the .conf file in $HOME):

export OMPI_MCA_btl_vader_single_copy_mechanism=none
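For the .conf file variant, a minimal sketch follows. It assumes the per-user MCA parameter file lives at $HOME/.openmpi/mca-params.conf, which is where OpenMPI documents it; the ticket itself only says "the .conf file in $HOME", so treat the path as an assumption and check it against the installed OpenMPI version.

```shell
# Assumed location of the per-user OpenMPI MCA parameter file.
conf_dir="$HOME/.openmpi"
conf_file="$conf_dir/mca-params.conf"
mkdir -p "$conf_dir"

# Append the setting only if no btl_vader_single_copy_mechanism line exists yet,
# so that running this snippet twice does not duplicate the entry.
if ! grep -q '^btl_vader_single_copy_mechanism' "$conf_file" 2>/dev/null; then
    echo 'btl_vader_single_copy_mechanism = none' >> "$conf_file"
fi
```

Unlike the export, this setting persists across shells and does not have to be repeated in every session.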

This also seems to depend on the Docker version used, since docker run --cap-add=SYS_PTRACE ... is offered as a host-side workaround.
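The two ways of applying the workaround from the host side can be sketched as small shell wrappers. The function names are hypothetical and exist only for illustration; the image name and any further docker options are left to the caller because the ticket does not specify them.

```shell
# Option 1 (host-side workaround from the OpenMPI ticket):
# grant the container the SYS_PTRACE capability.
run_with_ptrace() {
    docker run --cap-add=SYS_PTRACE "$@"
}

# Option 2: leave the container's capabilities unchanged and pass the
# OpenMPI setting into the container as an environment variable instead.
run_with_vader_copy_disabled() {
    docker run -e OMPI_MCA_btl_vader_single_copy_mechanism=none "$@"
}
```

Either wrapper takes the usual docker run arguments, for example run_with_ptrace --rm -it followed by the image name. Option 2 does not widen the container's capabilities, which may matter on a shared host.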

Note that the simulation itself is unaffected (other than producing very many warnings).

This happened to at least one person who reported this on the mailing list: http://lists.einsteintoolkit.org/pipermail/users/2019-September/007046.html

This was reported in https://bitbucket.org/einsteintoolkit/tickets/issues/2234/trouble-in-the-tutorial-server#comment-51221094 but reverted as not directly applicable to the Jupyter tutorial server. However, since this now seems to happen to regular users, I would like to add this setting to the tutorial notebook.

Comments (5)

  1. Roland Haas reporter

    Adding the vader env setting

    export OMPI_MCA_btl_vader_single_copy_mechanism=none
    

    also helps prevent MPI hangs in OSX VirtualBox containers (OSX being the client) when using MacPorts, as described in #2290. The same hang still occurs in OSX Catalina VMs. This may be related to an actual bug in OpenMPI: https://github.com/open-mpi/ompi/issues/6568, which affects only MPI communication with large messages (this likely explains why not every test that uses MPI hangs). The OpenMPI ticket provides what looks like a reproducer.

    I would like to add this env variable setting (with a comment about Docker and OSX virtual machines) to the tutorial notebook (I will also use it for my own VMs to run the testsuite but that is somewhat unrelated).
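A sketch of what the commented setting could look like in the notebook is shown below. It assumes the notebook runs its commands through a shell; the comment wording is illustrative, not the text that was eventually committed.

```shell
# Avoid hundreds of "Read -1, expected ..., errno = 1" warnings from OpenMPI's
# vader component in Docker containers, and MPI hangs in OSX virtual machines
# (see tickets #2287 and #2290).
export OMPI_MCA_btl_vader_single_copy_mechanism=none

# Confirm that the variable is exported, so child processes such as mpirun see it.
printenv OMPI_MCA_btl_vader_single_copy_mechanism
```

If the notebook uses separate shell cells, each cell typically starts a fresh shell, so the export has to appear in the same cell as the MPI command or be set once for the whole kernel.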
