Running upcxx programs in parallel across multiple nodes (in 2017.9.0 release)
How should we configure, set up, and run UPC++ programs in parallel across multiple nodes on SLURM-based clusters? Some general guidelines in the UPC++ guide are useful, but haven't solved this for us.
We're currently using the 2017.9.0 release (we can't immediately switch to the latest) and are debugging why a UPC++ program does not use all the allocated nodes/cores on a cluster managed by SLURM.
Whether we launch with upcxx-run or with the usual launchers (srun, sbatch, salloc, mpirun, etc.), UPC++ places all the ranks on a single node with our configuration. Here is one command we tried, for example:
cd $UPCXX_SOURCE/example/prog-guide
salloc --nodes=4 --mem-per-cpu=4000 --time=00:15:00 --account=staff \
srun $UPCXX_INSTALL/bin/upcxx-run 16 ./hello-world
Recent releases like 2018.9.0 describe how to execute UPC++ programs on SLURM-based clusters, e.g.:
For multiple nodes, specify the node count with -N <nodes>.
— UPC++ Programmer's Guide (v2018.9.0), page 4
Launching directly with mpirun has also been mentioned in issue #109.
Why do we think the UPC++ ranks are not launched in parallel across multiple nodes in our scenario?
Well, we added calls to gethostname(hostname, sizeof(hostname)); and got the following output:
Running with 8 ranks ...
[Rank 6] running on compute-14-31.local...
[Rank 0] running on compute-14-31.local...
[Rank 7] running on compute-14-31.local...
[Rank 4] running on compute-14-31.local...
[Rank 3] running on compute-14-31.local...
[Rank 1] running on compute-14-31.local...
[Rank 5] running on compute-14-31.local...
[Rank 2] running on compute-14-31.local...
Running with 8 ranks ...
[Rank 3] running on compute-14-36.local...
[Rank 0] running on compute-14-36.local...
[Rank 6] running on compute-14-36.local...
[Rank 7] running on compute-14-36.local...
[Rank 1] running on compute-14-36.local..
Comments (5)
- reporter
Are you using some other software forcing this dependence?
No, there is no third-party software dependency, just our own code, which needs to be modified for compatibility with the later versions; apparently we should get that done sooner!
Anyway, I have switched to testing with the latest upcxx-2018.9.0 to get hello-world to work on multiple nodes.
If this reports a backend of "SMP" then that is the problem.
Yes, that was the cause: $GASNetExtendedLibraryName: SMP $. The guide mentions that in general the network conduit is automatically set and shouldn't have to be changed; however, this doesn't happen in our case. By default, hello-world.cpp gets built against smp, so $UPCXX_INSTALL/bin/upcxx-run -n 8 -N 2 ./hello-world was still restricted to just the local node.
you need to rebuild your UPC++ program with UPCXX_GASNET_CONDUIT=<backend>
Yes, working on that, and for now sticking to upcxx-2018.9.0.
- UPCXX_GASNET_CONDUIT=udp compiles fine; I need to correctly set up -ssh-servers HOSTS in my sbatch scripts. But, yes, $UPCXX_INSTALL/bin/upcxx-run automatically calls amudprun -v -np 8 ./hello-world, so this would work.
- UPCXX_GASNET_CONDUIT=ibv and UPCXX_GASNET_CONDUIT=mpi compile fine with mpicc/mpicxx, so I am able to get an executable with GASNetCoreLibraryName: IBV.
I am also referring to the GASNet documentation, and the Chapel project also has some details on the different backends.
For reference, here is my environment:
module load python2/2.7.10.gnu
module load gcc/7.2.0
module load openmpi.gnu/2.1.0
----------------------------------------------------------------------
GASNet configuration:
 Portable conduits:
 -----------------
  Portable SMP-loopback conduit (smp)            ON   (auto)
  Portable UDP/IP conduit (udp)                  ON   (auto)
  Portable MPI conduit (mpi)                     ON   (auto)
 Native, high-performance conduits:
 ---------------------------------
  IBM BlueGene/Q / Power775 PAMI conduit (pami)  OFF  (not found)
  InfiniBand IB Verbs conduit (ibv)              ON   (auto)
  Cray XE/XK Gemini conduit (gemini)             OFF  (not found)
  Cray XC Aries conduit (aries)                  OFF  (not found)
 Misc Settings
 -------------
  MPI compatibility:  yes
  Pthreads support:   yes
  Segment config:     fast
  PSHM support:       posix
  FCA support:        no
  BLCR support:       no
  Atomics support:    native
----------------------------------------------------------------------
And thanks @bonachea for the insightful comments, helped a lot.
-
UPCXX_GASNET_CONDUIT=ibv and UPCXX_GASNET_CONDUIT=mpi fail to compile, so I must fix my environment configuration (errors like gasnet_bootstrap_mpi.c:(.text+0x12e): undefined reference to 'MPI_Abort').
Assuming you have Mellanox-compatible InfiniBand hardware, ibv-conduit should definitely be preferred over mpi-conduit which is a low-performance backend for portability only.
The fix to the linker issues is probably to install UPC++ with
CXX=mpicxx
Alternatively if you don't want to use MPI interop or the MPI job spawner, you can set
GASNET_CONFIGURE_ARGS=--without-mpicxx
and then use the ssh-spawner with ibv-conduit.
- reporter
Okay, got it working with both upcxx-2017.9.0 and upcxx-2018.9.0. Building upcxx with mpicc/mpicxx and compiling the programs with the ibv conduit were the missing ingredients.
cd <upcxx-source-path>
CC=mpicc CXX=mpicxx ./install <upcxx-install-path>
cd <upcxx-source-path>/example/prog-guide
CC=mpicc CXX=mpicxx UPCXX_GASNET_CONDUIT=ibv make
salloc --time=00:15:00 --account=staff \
    --nodes=8 \
    --tasks-per-node=4 \
    --mem-per-cpu=8G \
    $UPCXX_INSTALL/bin/upcxx-run -np 32 \
    <upcxx-source-path>/example/prog-guide/hello-world
One question about the recommended settings for the memory setup (for upcxx-2017.9.0). For instance, in the above configuration, with 32 GB physical memory per node, what are good maximum values for the following?
export GASNET_PHYSMEM_MAX=30G
export GASNET_MAX_SEGSIZE=30G
export UPCXX_SEGMENT_MB=4096
And if I still get memory errors, then my program is to blame, right? (When installing UPC++ I will also try GASNET_CONFIGURE_ARGS='--enable-pshm --disable-pshm-posix --enable-pshm-sysv' from issue #109.) I am testing here with just a modified version of the old SpMV example, so memory requirements aren't huge.
-
- changed status to resolved
Sounds like the main issue is resolved. Feel free to open additional issues for other problems/questions.
One question about the recommended settings for the memory setup (for upcxx-2017.9.0).
This is another area that has significantly improved in the latest UPC++ releases.
Ideally all you should need is upcxx-run -shared-heap=4GB (or whatever amount of shared memory each process wants). In the latest release you can also specify it as a fraction of physical memory (e.g. upcxx-run -shared-heap=50%). Note there is a bug in the current release (issue #100, already fixed in develop, and the fix will be officially released next month) that affects the operation of this option when using some spawners; however, if your underlying system spawner is srun then it should hopefully be unaffected.
Also note that whatever value you pass (either via upcxx-run -shared-heap or UPCXX_SEGMENT_MB) is reserved for the UPC++ shared heap by each process at startup, meaning that portion of physical memory is unavailable to service private memory allocation (malloc/new). So ideally you choose a value roughly large enough to encompass your expected shared memory utilization.
Hi @aminmkhan:
First I'd like to strongly encourage updating to the latest release - it includes a large number of fixes and improvements, notably a complete rewrite of the upcxx-run script used for job spawning. Out of curiosity, may I ask why you're using an obsolete version? Are you using some other software forcing this dependence?
Next, ensure you've built your executable for the correct distributed-memory network backend. With a recent version, the easiest way to do this is the info argument:
upcxx-run -i ./hello-world
The old version lacks this feature, so you'd instead need to use one of the following UNIX commands: ident hello-world | grep GASNetExtendedLibraryName or strings hello-world | grep GASNetExtendedLibraryName. If this reports a backend of "SMP" then that is the problem - the smp backend only supports single-node operation. To build for a network backend, you need to rebuild your UPC++ program with UPCXX_GASNET_CONDUIT=<backend>, where backend is the network appropriate for your system hardware (e.g. ibv, aries or udp).
Once you are certain you have an executable built for a distributed-memory backend, then on to the job spawn command. Your ancient version of upcxx-run lacks support for the -N option used to explicitly control job layout, another good reason to update. Without that, your most likely solution is to launch the job directly with mpirun or srun, assuming your install was built with MPI and/or PMI job spawn support, i.e.: mpirun -N 4 -n 16 hello-world or srun -N 4 -n 16 hello-world. However, if you are using udp-conduit or ibv-conduit with the ssh-spawner, you'll need to use amudprun or gasnetrun_ibv respectively.
Hope this helps.