running on wheeler fails with error messages from MPI
Trying to run on wheeler fails with lengthy error messages from MPI even when using a single MPI rank.
I attach a sample error file from the gaussian test in the testsuite.
Wheeler is a private machine at Caltech, so if this cannot be fixed I would remove it from the list of machines shown on http://einsteintoolkit.org/testsuite_results/index.php.
Erik, since you maintained wheeler's files and the machine files point to an MPI stack in your $HOME, do you want to look into this? Otherwise I can set up a machine configuration using the software stack used by SpEC (which sees much more regular use on wheeler), but that will likely not use the same set of modern compilers and other software that your setup uses. Alternatively, we can remove wheeler from the list of officially supported machines for this release.
Keyword: None
Comments (3)
reporter - removed comment
Good question. I had a look just now. The log file shows that the mpirun used was
/home/eschnett/src/spack-view/bin/mpirun
and we set MPI_DIR to /home/eschnett/src/spack-view. However, MPI_LIB_DIRS ends up being configured (by MPI's detect.pl script) as
/home/eschnett/src/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0-spack/openmpi-3.0.1-6lcorydh3sxme5eu3pculmgzo2nefolv/lib
We do not (other than on Crays) use mpicc but instead use the normal compiler and explicitly link against the MPI library. The mpirun in
/home/eschnett/src/spack-view/bin/mpirun
is identical to the one in
/home/eschnett/src/spack/opt/spack/linux-rhel7-x86_64/gcc-7.3.0-spack/openmpi-3.0.1-6lcorydh3sxme5eu3pculmgzo2nefolv/bin
so it should be compatible. So, it does not seem to be quite so simple.
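For anyone wanting to repeat the comparison above, here is a minimal Python sketch of one way to do it. It is not what was actually run on wheeler. The helper names same_executable and under_prefix are made up for illustration, and the sketch assumes the spack view is a tree of symlinks into the spack install directory.

```python
import filecmp
import os


def resolve(path):
    """Return the absolute path with all symlinks resolved."""
    return os.path.realpath(path)


def same_executable(path_a, path_b):
    """True if both paths resolve to the same file or have identical contents."""
    real_a, real_b = resolve(path_a), resolve(path_b)
    if real_a == real_b:
        return True
    # Different files on disk: fall back to a byte-for-byte comparison.
    return filecmp.cmp(real_a, real_b, shallow=False)


def under_prefix(path, prefix):
    """True if the resolved path lives inside the resolved prefix directory."""
    real_path, real_prefix = resolve(path), resolve(prefix)
    return os.path.commonpath([real_path, real_prefix]) == real_prefix
```

Calling same_executable with the two mpirun paths quoted above, and under_prefix with the view's mpirun and the parent directory of the configured MPI_LIB_DIRS, would confirm that the launcher and the linked library come from the same OpenMPI installation. It says nothing about whether that installation still matches the system libraries.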
If I had to bet money I would bet on the system infiniband library having changed and one has to recompile OpenMPI to account for this.
reporter - changed status to resolved
I updated the wheeler files and the tests now work.
So... dumb question. Did you make sure that the mpirun and mpic++ that thorn MPI was configured with were from the same MPI? I ask because, it seems, the configure script does not look in the PATH first.
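A rough way to check this from the command line is sketched below in Python. It assumes the configured MPI directory has the usual bin/ layout, and the function name path_tool_matches is hypothetical. It only tests whether the first matching tool on PATH is the same file as the one under the configured directory, and makes no claim about how the thorn's configure script searches.

```python
import os
import shutil


def path_tool_matches(mpi_dir, tool="mpirun", path=None):
    """True if the first `tool` on PATH is the same file as <mpi_dir>/bin/<tool>.

    `path` overrides the PATH environment variable when given.
    """
    found = shutil.which(tool, path=path)
    if found is None:
        return False  # tool is not on PATH at all
    candidate = os.path.join(mpi_dir, "bin", tool)
    # Resolve symlinks on both sides so that a symlinked view and the
    # underlying install directory compare equal.
    return os.path.realpath(found) == os.path.realpath(candidate)
```

For example, path_tool_matches("/home/eschnett/src/spack-view") would report whether the mpirun picked up from PATH is the one under the MPI_DIR mentioned in this ticket. The same call with tool="mpic++" covers the compiler wrapper.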