smp-conduit cannot use new spawner support via upcxx-run

Issue #627 resolved
Paul Hargrove created an issue

Since the merge of GASNet PR#367, GASNet's smp-conduit has supported spawning via ssh, mpi and pmi. However, the hard-coded use of (essentially) env GASNET_PSHM_NODES=n ./a.out in upcxx-run prevents the use of these new capabilities via upcxx-run.

Note that pmi- and mpi-based spawning (e.g. via srun or mpirun) will work just fine when GASNET_SMP_SPAWNER is set appropriately (via --with-smp-spawner=... or explicitly in the environment). If one wants or needs to launch using PMI or MPI for things like cpu/memory/gpu affinity, then direct use of srun or mpirun is recommended anyway.
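
A rough sketch of such a direct launch, assuming a Slurm system where srun provides PMI. The executable name ./a.out and the process count of 4 are placeholders, not values from this issue:

```shell
# Select the PMI-based spawner explicitly in the environment
# (alternative to configuring with --with-smp-spawner=pmi).
export GASNET_SMP_SPAWNER=pmi

# smp-conduit is single-node, so request one node (-N 1) with 4 processes.
# The guard lets the snippet be pasted safely where srun or ./a.out is absent.
if command -v srun >/dev/null 2>&1 && [ -x ./a.out ]; then
    srun -N 1 -n 4 ./a.out
fi
```

Affinity options would then be passed to srun itself rather than through upcxx-run.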

If one has configured UPC++ using --with-smp-spawner=ssh (or mpi or pmi), and wishes to use upcxx-run (with smp-conduit's legacy "fork-based" spawner), the work-around for this issue is to set the environment variable GASNET_SMP_SPAWNER=fork.
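
The work-around might look like the following in practice. This is only a sketch: ./a.out and the process count of 4 are placeholders.

```shell
# Override the configured default spawner so that upcxx-run's
# fork-based launch of smp-conduit works.
export GASNET_SMP_SPAWNER=fork

# Launch 4 processes via upcxx-run; guarded so the snippet is harmless
# on a machine without UPC++ installed or without the placeholder binary.
if command -v upcxx-run >/dev/null 2>&1 && [ -x ./a.out ]; then
    upcxx-run -n 4 ./a.out
fi
```

Setting the variable for a single invocation (GASNET_SMP_SPAWNER=fork upcxx-run -n 4 ./a.out) should be equivalent and avoids changing the rest of the shell session.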

Comments (2)

  1. Paul Hargrove reporter

    There is another wrinkle, not visible until application run time, that I forgot to mention. On many laptops and VMs, the result of gethostid() is 0 or some permutation of the four bytes of 127.0.0.1, which GASNet considers to be invalid values (they are certainly not unique). This can result in the following sort of (non-fatal) message when one does launch via srun, mpirun, or similar:

    *** WARNING (proc 0): Invalid return 0x00000000 from gethostid().  Please see documentation on GASNET_HOST_DETECT in README and consider setting its value to 'hostname' or reconfiguring using '--with-host-detect=hostname' to make that the default value.
    

    The work-around is simply to do either of the things suggested in the warning message.
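
    For the environment-variable option, a minimal sketch (the mpirun command line, ./a.out and the process count are placeholders, and GASNET_SMP_SPAWNER=mpi assumes an MPI-enabled build):

    ```shell
    # Tell GASNet to identify hosts by hostname instead of gethostid().
    export GASNET_HOST_DETECT=hostname
    export GASNET_SMP_SPAWNER=mpi

    # Guarded launch: runs only where mpirun and the placeholder binary exist.
    if command -v mpirun >/dev/null 2>&1 && [ -x ./a.out ]; then
        mpirun -np 4 ./a.out
    fi
    ```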

    For completeness, I want to note that this was already an issue with udp-conduit (for both single and multi-host runs).

  2. Dan Bonachea

    upcxx-run: Add gasnetrun_smp support

    New logic detects the presence of gasnetrun_smp based on GASNet version, and adjusts behavior and messages accordingly

    Resolves issue #627

    → <<cset 68866205b9ce>>
