ERROR: No GASNET_SSH_NODEFILE, GASNET_SSH_SERVERS, or GASNET_NODEFILE in environment

Issue #186 invalid
Phuong Ha created an issue

Hi,

I managed to compile UPC++ v2018.9.0, but I got the error "ERROR: No GASNET_SSH_NODEFILE, GASNET_SSH_SERVERS, or GASNET_NODEFILE in environment" when testing the Hello World example. I didn't get this error when using UPC++ v2017.9.
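For reference, the test was essentially the canonical UPC++ hello world; a minimal sketch, assuming the standard upcxx API, is below (the actual test file may have differed slightly):

    #include <upcxx/upcxx.hpp>
    #include <iostream>

    int main() {
      upcxx::init();                    // start the UPC++ runtime
      std::cout << "Hello from rank " << upcxx::rank_me()
                << " of " << upcxx::rank_n() << std::endl;
      upcxx::finalize();                // shut down the runtime
      return 0;
    }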

Could you please advise how to fix the error?

Thanks, Phuong

Comments (6)

  1. Dan Bonachea

    Hi Phuong -

    This indicates a problem with your job spawning setup. Can you please provide more details of the system and configuration where you are running into problems? The status from the first page of the install script output should be helpful.

    Also note that if you are using a NERSC system (Cori or Edison), there are upcxx modules already installed with the correct configuration settings (i.e. module load upcxx/2018.9.0).

  2. Phuong Ha reporter

    Hi Dan,

    Thanks so much for your prompt reply! Please find the first part of my upcxx-2018.9 installation log attached. The loaded modules are: 1) python/2.7-anaconda 2) gcc/7.2.0 3) openmpi/2.0.4

    Our program needs gcc/7.2.0, and therefore it doesn't work with the upcxx modules already installed.

    Best regards, Phuong

  3. Paul Hargrove

    Phuong,

    Unless you need exactly gcc/7.2.0, you may be able to simply:

    $ module swap PrgEnv-intel PrgEnv-gnu
    $ module load upcxx/2018.9.0
    

    That will give you a stable build which uses gcc/g++ version 7.3.0.
    In case that is not sufficient for your needs, the remainder of this post attempts to address your reported problem.

    From "Cray Inc." in the gcc version string in your log file, I am assuming this is for a NERSC Cray. If so, you need to set CROSS=cray-aries-slurm in your environment before building, as described under "Installation: Cray XC" in the INSTALL.md. Otherwise you are going to build executables which are appropriate to the login nodes, rather than the compute nodes. If this is for a different center's Cray systems, please let me know and we can figure out the proper setting for CROSS.

    Additionally, I think I see signs of another problem. The message you report comes from the ibv (InfiniBand) support: with CROSS unset, UPC++ is configured to use the InfiniBand network on the login nodes. However, that network support includes integration with the SLURM batch system used at NERSC, which means you should not have seen the message you reported unless you attempted to run on a login node instead of in a batch job. If that is the case, please see NERSC's documentation on Running Jobs on Cori.
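    For example, such a run would look something like the following (the allocation flags are illustrative and site-specific; upcxx-run is the launcher installed in the UPC++ bin directory):

    $ salloc -N 1 -C haswell -q interactive -t 10:00   # request a compute-node allocation
    $ upcxx-run -n 4 ./hello-world                     # launch 4 ranks inside the allocation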

    Finally, I am concerned that you list only 3 loaded modules (python, gcc and openmpi).
    That is very far from the default environment on NERSC systems, which should have over 20 modules loaded:

    {hargrove@cori12 ~}$ module list
    Currently Loaded Modulefiles:
      1) modules/3.2.10.6                                 12) xpmem/2.2.15-6.0.7.1_5.8__g7549d06.ari
      2) nsg/1.2.0                                        13) job/2.2.3-6.0.7.0_44.1__g6c4e934.ari
      3) intel/18.0.1.163                                 14) dvs/2.7_2.2.113-6.0.7.1_7.1__g1bbc03e
      4) craype-network-aries                             15) alps/6.6.43-6.0.7.0_26.4__ga796da3.ari
      5) craype/2.5.14                                    16) rca/2.2.18-6.0.7.0_33.3__g2aa4f39.ari
      6) cray-libsci/18.03.1                              17) atp/2.1.1
      7) udreg/2.3.2-6.0.7.0_33.18__g5196236.ari          18) PrgEnv-intel/6.0.4
      8) ugni/6.0.14.0-6.0.7.0_23.1__gea11d3d.ari         19) craype-haswell
      9) pmi/5.0.13                                       20) cray-mpich/7.7.0
     10) dmapp/7.1.1-6.0.7.0_34.3__g5a674e0.ari           21) altd/2.0
     11) gni-headers/5.0.12.0-6.0.7.0_24.1__g3b1768f.ari  22) darshan/3.1.4
    

    When compiling anything to be run on the compute nodes (including upcxx itself), one of PrgEnv-intel or PrgEnv-gnu should be loaded (it will pull in most of the rest).
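    Once the environment is set up, compiling against a UPC++ install looks roughly like this (a sketch using the upcxx-meta utility from the install's bin directory; the prefix is illustrative):

    $ export PATH=$HOME/upcxx-2018.9.0/bin:$PATH    # illustrative install prefix
    $ $(upcxx-meta CXX) $(upcxx-meta PPFLAGS) hello-world.cpp \
          $(upcxx-meta LDFLAGS) $(upcxx-meta LIBFLAGS) -o hello-world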

    -Paul

  4. Phuong Ha reporter

    Dear Paul,

    Thanks so much for your insightful advice! I have managed to compile and run UPC++ v2018.9.

    Best, Phuong
