ERROR: No GASNET_SSH_NODEFILE, GASNET_SSH_SERVERS, or GASNET_NODEFILE in environment
Hi,
I managed to compile UPC++ v2018.9.0, but I get the error "ERROR: No GASNET_SSH_NODEFILE, GASNET_SSH_SERVERS, or GASNET_NODEFILE in environment" when testing the Hello World example. I didn't get this error when using UPC++ v2017.9.
Could you please advise how to fix the error?
Thanks, Phuong
Comments (6)
-
reporter - attached upcxx_2018.9_install_1stPart.log
The first part of UPCXX v2018.9 installation log file
-
reporter Hi Dan,
Thanks so much for your prompt reply! Please find the first part of my upcxx-2018.9 installation log attached. The loaded modules are: 1) python/2.7-anaconda 2) gcc/7.2.0 3) openmpi/2.0.4
Our program needs gcc/7.2.0 and therefore it doesn't work with the upcxx modules already installed.
Best regards, Phuong
-
Phuong,
Unless you need exactly gcc/7.2.0, you may be able to simply:
$ module swap PrgEnv-intel PrgEnv-gnu
$ module load upcxx/2018.9.0
That will give you a stable build which uses gcc/g++ version 7.3.0.
In case that is not sufficient for your needs, the remainder of this post attempts to address your reported problem.

From "Cray Inc." in the gcc version string in your log file, I am assuming this is a NERSC Cray. If so, you need to set

CROSS=cray-aries-slurm

in your environment before building, as described under "Installation: Cray XC" in INSTALL.md. Otherwise you will build executables appropriate to the login nodes, rather than the compute nodes. If this is for a different center's Cray system, please let me know and we can figure out the proper setting for CROSS.
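Putting that together, a cross-compiled build might look like the following. This is a sketch only: the source directory name and install prefix are placeholders, and it assumes the `install` script that shipped in the 2018.9.0 source tarball.

```shell
# Sketch: build UPC++ for the compute nodes of a NERSC Cray XC.
# Paths and module names below are placeholders; adjust for your site.
module swap PrgEnv-intel PrgEnv-gnu   # select the GNU compilers
export CROSS=cray-aries-slurm         # target the Aries network + SLURM spawner
cd upcxx-2018.9.0                     # unpacked source directory (placeholder)
./install "$HOME/upcxx-cross"         # install prefix (placeholder)
```

The key point is that CROSS must be set before the install step runs, so that the build targets the compute nodes rather than the login nodes.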
Additionally, I think I see signs of another problem. The message you report is coming from the ibv (InfiniBand) support, because when CROSS is unset UPC++ is configured to use the InfiniBand network on the login nodes. However, that network support includes integration with the SLURM batch system used at NERSC, which means you should not have seen the message you report unless you attempted to run on a login node instead of in a batch job. If that is the case, please see NERSC's documentation on Running Jobs on Cori.

Finally, I am concerned by the fact that you list only 3 loaded modules (python, gcc and openmpi).
That is very far from the default environment on NERSC systems, which should have over 20 modules loaded:

{hargrove@cori12 ~}$ module list
Currently Loaded Modulefiles:
  1) modules/3.2.10.6                           12) xpmem/2.2.15-6.0.7.1_5.8__g7549d06.ari
  2) nsg/1.2.0                                  13) job/2.2.3-6.0.7.0_44.1__g6c4e934.ari
  3) intel/18.0.1.163                           14) dvs/2.7_2.2.113-6.0.7.1_7.1__g1bbc03e
  4) craype-network-aries                       15) alps/6.6.43-6.0.7.0_26.4__ga796da3.ari
  5) craype/2.5.14                              16) rca/2.2.18-6.0.7.0_33.3__g2aa4f39.ari
  6) cray-libsci/18.03.1                        17) atp/2.1.1
  7) udreg/2.3.2-6.0.7.0_33.18__g5196236.ari    18) PrgEnv-intel/6.0.4
  8) ugni/6.0.14.0-6.0.7.0_23.1__gea11d3d.ari   19) craype-haswell
  9) pmi/5.0.13                                 20) cray-mpich/7.7.0
 10) dmapp/7.1.1-6.0.7.0_34.3__g5a674e0.ari     21) altd/2.0
 11) gni-headers/5.0.12.0-6.0.7.0_24.1__g3b1768f.ari  22) darshan/3.1.4
When compiling anything to be run on the compute nodes (including upcxx itself), one of PrgEnv-intel or PrgEnv-gnu should be loaded (and it will load most of the rest).

-Paul
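With a PrgEnv module loaded and a UPC++ install on your PATH, a minimal compile-and-run check might look like the following. This is a sketch: `hello.cpp` and the srun node/task counts are placeholders, it assumes the `upcxx-meta` flag-reporting utility that ships with UPC++ 2018.9.0, and the srun line must run inside a batch allocation, not on a login node.

```shell
# Compile hello.cpp with the flags UPC++ reports via upcxx-meta,
# then launch from inside a SLURM batch allocation (not a login node).
CXX=$(upcxx-meta CXX)
$CXX $(upcxx-meta PPFLAGS) hello.cpp $(upcxx-meta LDFLAGS) $(upcxx-meta LIBS) -o hello
srun -N 2 -n 4 ./hello
```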
-
reporter Dear Paul,
Thanks so much for your insightful advice! I have managed to compile and run UPC++ v2018.9.
Best, Phuong
-
- changed status to invalid
User reports problem is resolved.
Hi Phuong -
This indicates a problem with your job-spawning setup. Can you please provide more details of the system and configuration where you are running into problems? The status information from the first part of the install script's output should be helpful.
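For context on the error itself: on InfiniBand clusters without an integrated batch system, the ssh-based spawner expects one of those variables to point at a nodefile listing the hosts to run on. A minimal sketch (the hostnames node01/node02 are placeholders, and this is not needed when launching through SLURM):

```shell
# Create a nodefile listing one compute-node hostname per line
# (node01/node02 are placeholder names), then point GASNet at it.
cat > nodefile.txt <<'EOF'
node01
node02
EOF
export GASNET_NODEFILE="$PWD/nodefile.txt"
```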
Also note that if you are using a NERSC system (Cori or Edison), there are upcxx modules already installed with the correct configuration settings (i.e. module load upcxx/2018.9.0).