UPC++ memory configuration on a supercomputer (InfiniBand + MPI spawning)
The problem
My program crashes at runtime due to a memory configuration issue.
The program works fine with the "smp" network conduit on any platform, including on a single node of the supercomputer I am using.
The program crashes when using more than one node: it can work on two nodes but crash on four, or vice versa.
Software setup
- UPC++ (https://bitbucket.org/upcxx/upcxx/downloads/upcxx-2017.9.0.tar.gz) compiled with MPI support
- MPI version: openmpi.gnu/1.10.2 ((Open MPI) 1.10.2)
- GCC/G++ version: gcc version 5.2.0 (GCC)
- OS:
  - LSB Version: :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
  - Distributor ID: CentOS
  - Description: CentOS release 6.9 (Final)
  - Release: 6.9
  - Codename: Final
Hardware setup
- Supercomputer: abel.uio.no
- RAM per node: 64 GB
- Sockets per node: 2
- Physical cores per node: 16 (2x8)
- HyperThreading: disabled
- CPU: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
- Network: InfiniBand
$ ibv_devices
device node GUID
------ ----------------
mlx4_0 002590ffff1720dc
$ ibv_devinfo
hca_id: mlx4_0
transport: InfiniBand (0)
fw_ver: 2.35.5100
node_guid: 0025:90ff:ff17:20dc
sys_image_guid: 0025:90ff:ff17:20df
vendor_id: 0x02c9
vendor_part_id: 4099
hw_ver: 0x1
board_id: SM_2191000001000
phys_port_cnt: 1
Device ports:
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 4096 (5)
sm_lid: 404
port_lid: 603
port_lmc: 0x00
link_layer: InfiniBand
My program
The program I am using is quite simple in terms of code:
- It reads some data from a data file (the data size never exceeds 7 GB). This step includes some duplication of the data in memory and some data distribution.
- It does some operations on the data. This step includes some communication (upcxx::rput()); see the sketch after this list.
- It outputs some information about performance.
I know from an identical version that I implemented in UPC that the problem size (data size) is NOT the problem. The data should fit in memory in most setups (1 node, 2 nodes, etc.).
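For reference, here is a minimal sketch of that communication pattern. It is an illustration only, not my actual source: the sizes and names (chunk, my_block, etc.) are placeholders, and the real program reads its data from the file passed on the command line.

#include <upcxx/upcxx.hpp>
#include <vector>

// Each rank publishes a block in its shared segment; the global_ptr is kept at
// file scope so the rpc lambda below can return it from the target rank.
static upcxx::global_ptr<double> my_block;

int main() {
  upcxx::init();
  const int me = upcxx::rank_me();
  const int n  = upcxx::rank_n();
  const std::size_t chunk = 1 << 20;   // placeholder size, ~8 MB of doubles

  // Step 1 (simplified): "read" and distribute some data.
  std::vector<double> local(chunk, double(me));

  // The landing zone lives in the shared segment, i.e. the memory whose size
  // is governed by UPCXX_SEGMENT_MB.
  my_block = upcxx::new_array<double>(chunk);
  upcxx::barrier();   // every rank must have allocated before pointers are fetched

  // Fetch the right-hand neighbour's global pointer with an rpc.
  upcxx::global_ptr<double> dst =
      upcxx::rpc((me + 1) % n, []() { return my_block; }).wait();

  // Step 2: communication phase based on upcxx::rput().
  upcxx::rput(local.data(), dst, chunk).wait();
  upcxx::barrier();

  // Step 3: timing / performance output (omitted here).
  upcxx::finalize();
  return 0;
}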
Bug replication N.1, step by step
$ qlogin --account=XXXX --ntasks-per-node=16 --mem-per-cpu=3900MB --nodes=2
-> job is created: two nodes reserved
$ source /cluster/bin/jobsetup
-> mandatory line to start using a compute node on the supercomputer
$ module load openmpi.gnu/1.10.2
-> this automatically loads: 1) gcc/5.2.0 2) openmpi.gnu/1.10.2
$ export GASNET_PHYSMEM_MAX=54G
-> using "only" 85% of the memory available per node
$ export UPCXX_SEGMENT_MB=2100
-> Note the "one hundred": 2100! This will be the cause of the crash
$ mpirun -n 32 upcxxProgram/upcxxSpmv ../dataset/D67MPI3Dheart.55 100 &> bugReport1
-> Nothing fancy here, just launching my program with mpirun. Please see the attached bugReport1 for the details.
Bug replication N.2, step by step, now with UPCXX_CODEMODE=debug
$ qlogin --account=XXXX --ntasks-per-node=16 --mem-per-cpu=3900MB --nodes=2
-> job is created: two nodes reserved
$ source /cluster/bin/jobsetup
-> mandatory line to start using a compute node on the supercomputer
$ module load openmpi.gnu/1.10.2
-> this automatically loads: 1) gcc/5.2.0 2) openmpi.gnu/1.10.2
$ export GASNET_PHYSMEM_MAX=54G
$ export UPCXX_SEGMENT_MB=2100
-> Note the "one hundred": 2100! This will be the cause of the crash
$ export UPCXX_CODEMODE=debug
$ make
g++ main.cpp tools.cpp mainComputation.cpp fileReader.cpp timeManagement.cpp -DUPCXX_BACKEND=gasnet1_seq -D_GNU_SOURCE=1 -DGASNET_SEQ -D_REENTRANT -I/cluster/home/jeremie/myRepo/pgm-jlg-upc-svn/trunk/otherThanSpmv/UPCXX/upcxxPackage/upcxx-2017.9.0/installed/gasnet.debug/include -I/cluster/home/jeremie/myRepo/pgm-jlg-upc-svn/trunk/otherThanSpmv/UPCXX/upcxxPackage/upcxx-2017.9.0/installed/gasnet.debug/include/ibv-conduit -I/cluster/home/jeremie/myRepo/pgm-jlg-upc-svn/trunk/otherThanSpmv/UPCXX/upcxxPackage/upcxx-2017.9.0/installed/upcxx.debug.gasnet1_seq.ibv/include -std=c++11 -D_GNU_SOURCE=1 -g3 -Wno-unused -Wno-unused-parameter -Wno-address -O2 -mavx -march=sandybridge -funroll-loops -fomit-frame-pointer -std=c++11 -L/cluster/home/jeremie/myRepo/pgm-jlg-upc-svn/trunk/otherThanSpmv/UPCXX/upcxxPackage/upcxx-2017.9.0/installed/upcxx.debug.gasnet1_seq.ibv/lib -lupcxx -lpthread -L/cluster/home/jeremie/myRepo/pgm-jlg-upc-svn/trunk/otherThanSpmv/UPCXX/upcxxPackage/upcxx-2017.9.0/installed/gasnet.debug/lib -lgasnet-ibv-seq -libverbs -lpthread -lrt -L/cluster/software/VERSIONS/gcc-5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -lgcc -lrt -lm -I. -Iincludes/ -I/cluster/software/VERSIONS/openmpi.gnu-1.10.2/include -L/cluster/software/VERSIONS/openmpi.gnu-1.10.2/lib -lmpi -lm -lrt -o upcxxProgram/upcxxSpmv
-> compiles with no errors, no warnings
$ mpirun -n 32 upcxxProgram/upcxxSpmv ../dataset/D67MPI3Dheart.55 100 &> bugReport2
-> Nothing fancy here, just launching my program with mpirun. Please see the attached bugReport2 for the details.
Bug replication N.3, step by step
# Reusing the same environment as the previous "bug replication", changing only this:
$ export UPCXX_SEGMENT_MB=2000
-> it was 2100, now it's 2000
See bugReport3. It contains the regular output of my program, with no errors.
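A plausible explanation for the 2000-vs-2100 threshold (my own inference, not stated explicitly in the thread): the ident output below reports GASNetDefaultMaxSegsize: ((1<<31) - 4096), i.e. just under 2048 MB. With GASNET_MAX_SEGSIZE left unset, a 2100 MB segment request exceeds that default cap, while 2000 MB fits:
2^31 - 4096 = 2147479552 bytes (just under 2048 MB)
2100 MB = 2100 * 2^20 = 2202009600 bytes -> above the default cap
2000 MB = 2000 * 2^20 = 2097152000 bytes -> below the default cap
This would also explain why explicitly setting GASNET_MAX_SEGSIZE (see Bug replication N.4 and the comments) changes the behaviour.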
Additional information about the executable
ldd
$ ldd upcxxProgram/upcxxSpmv
linux-vdso.so.1 => (0x00007ffe74bc3000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00000032ff000000)
libibverbs.so.1 => /usr/lib64/libibverbs.so.1 (0x00000032ff400000)
librt.so.1 => /lib64/librt.so.1 (0x00000032ff800000)
libmpi.so.12 => /cluster/software/VERSIONS/openmpi.gnu-1.10.2/lib/libmpi.so.12 (0x00007fd18f4f9000)
libm.so.6 => /lib64/libm.so.6 (0x00000032fe800000)
libstdc++.so.6 => /cluster/software/VERSIONS/gcc-5.2.0/lib64/libstdc++.so.6 (0x00007fd18f169000)
libgcc_s.so.1 => /cluster/software/VERSIONS/gcc-5.2.0/lib64/libgcc_s.so.1 (0x00007fd18ef53000)
libc.so.6 => /lib64/libc.so.6 (0x00000032fe400000)
/lib64/ld-linux-x86-64.so.2 (0x00000032fe000000)
libdl.so.2 => /lib64/libdl.so.2 (0x00000032fec00000)
libnl.so.1 => /lib64/libnl.so.1 (0x0000003302800000)
libosmcomp.so.3 => /usr/lib64/libosmcomp.so.3 (0x00007fd18ed42000)
libopen-rte.so.12 => /cluster/software/VERSIONS/openmpi.gnu-1.10.2/lib/libopen-rte.so.12 (0x00007fd18eac8000)
libopen-pal.so.13 => /cluster/software/VERSIONS/openmpi.gnu-1.10.2/lib/libopen-pal.so.13 (0x00007fd18e7f2000)
libnuma.so.1 => /usr/lib64/libnuma.so.1 (0x0000003300000000)
libutil.so.1 => /lib64/libutil.so.1 (0x0000003300c00000)
libibumad.so.3 => /usr/lib64/libibumad.so.3 (0x00000032ffc00000)
file
$ file upcxxProgram/upcxxSpmv
upcxxProgram/upcxxSpmv: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, not stripped
ident
$ ident upcxxProgram/upcxxSpmv
upcxxProgram/upcxxSpmv:
$GASNetCoreLibraryVersion: 1.18 $
$GASNetCoreLibraryName: IBV $
$GASNetExtendedLibraryVersion: 1.13 $
$GASNetExtendedLibraryName: IBV $
$GASNetToolsThreadModel: SEQ $
$GASNetToolsConfig: RELEASE=2017.9.0,SPEC=1.10,PTR=64bit,debug,SEQ,timers_native,membars_native,atomics_native,atomic32_native,atomic64_native $
$GASNetBuildTimestamp: Dec 12 2017 15:04:30 $
$GASNetBuildId: Tue Dec 12 15:01:33 CET 2017 jeremie $
$GASNetConfigureArgs: '--enable-debug' '--disable-psm' '--disable-mxm' '--disable-portals4' '--disable-ofi' '--disable-dev-warnings' '--disable-parsync' $
$GASNetSystemTuple: x86_64-unknown-linux-gnu $
$GASNetSystemName: login-0-1.local $
$GASNetCompilerID: |COMPILER_FAMILY:GNU|COMPILER_VERSION:5.2.0|COMPILER_FAMILYID:1|STD:__STDC__,__STDC_VERSION__=201112L|misc:5.2.0| $
$GASNetGitHash: gex-2017.9.0 $
$GASNetEXAPIVersion: 0.2 $
$GASNetAPIVersion: 1 $
$GASNetThreadModel: GASNET_SEQ $
$GASNetSegment: GASNET_SEGMENT_FAST $
$GASNetConfig: (libgasnet.a) RELEASE=2017.9.0,SPEC=0.2,CONDUIT=IBV(IBV-1.18/IBV-1.13),THREADMODEL=SEQ,SEGMENT=FAST,PTR=64bit,noalign,pshm,debug,trace,stats,debugmalloc,srclines,timers_native,membars_native,atomics_native,atomic32_native,atomic64_native $
$GASNetConduitName: IBV $
$GASNetTracingEnabled: 1 $
$GASNetStatisticsEnabled: 1 $
$GASNetDefaultMaxSegsize: ((((uint64_t)1)<<31) - 4096) $
$GASNetMPISpawner: 1 $
$GASNetSSHSpawner: 1 $
Bug replication N.4
$ source /cluster/bin/jobsetup
$ module load openmpi.gnu/1.10.2
$ export GASNET_MAX_SEGSIZE=54G
$ export UPCXX_SEGMENT_MB=2000
$ export GASNET_PHYSMEM_MAX=54G
$ mpirun -n 32 upcxxProgram/upcxxSpmv ../dataset/D67MPI3Dheart.55 100
-> This works fine
Same setup except this:
export UPCXX_SEGMENT_MB=2200
mpirun -n 32 upcxxProgram/upcxxSpmv ../dataset/D67MPI3Dheart.55 100
-> This crashes, see bugReport4 (attached in the comment section)
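For what it is worth (again my own arithmetic, not a statement from the UPC++ team): with GASNET_MAX_SEGSIZE raised to 54G, a 2200 MB per-process segment is no longer rejected up front, and the per-node total is still modest:
16 processes per node * 2200 MB = 35200 MB (~34.4 GB), well under GASNET_PHYSMEM_MAX=54G
So the failure now happens later, when the InfiniBand driver tries to register the segment memory; this matches the "Bad address (errno=14) when registering the segment" error discussed in the comments below.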
Comments (23)
-
The key output here is:
GASNet gasnetc_attach returning an error code: GASNET_ERR_BAD_ARG (Invalid function parameter passed) at /cluster/home/jeremie/myRepo/pgm-jlg-upc-svn/trunk/otherThanSpmv/UPCXX/upcxxPackage/upcxx-2017.9.0/.nobs/art/9c31690be3ce5d589562b58bbba9f5a85ea04406/GASNet-2017.9.0/ibv-conduit/gasnet_core.c:2271 reason: segsize too large
UPC++ assertion failure on rank 15 [/cluster/home/jeremie/myRepo/pgm-jlg-upc-svn/trunk/otherThanSpmv/UPCXX/upcxxPackage/upcxx-2017.9.0/src/backend/gasnet/runtime.cpp:129]
Jeremie - can you please attach the output of the following commands:
file upcxxProgram/upcxxSpmv
ldd upcxxProgram/upcxxSpmv
ident upcxxProgram/upcxxSpmv
-
reporter I edited the original message to include ldd, file and ident info
-
Please re-try your program after additionally setting:
export GASNET_MAX_SEGSIZE=54G
-
reporter I have seen your message... I am trying to get a node allocation on the supercomputer I am using.
-
reporter - attached bugReport4
-
OK - N.4 is hitting a different error message:
*** FATAL ERROR: Unexpected error Bad address (errno=14) when registering the segment
which indicates we've made progress.
The change in output also confirms the error seen in N.3 was due to the lack of GASNET_MAX_SEGSIZE and issue #96. The next release will include improvements to upcxx-run and the runtime to hopefully smooth over that problem.
The new problem corresponds to an ibv-conduit issue on some Linux systems that we're still investigating. I will let @PHHargrove provide details on the recommended resolution for that problem.
-
Jérémie,
As @bonachea indicated, you've hit a different issue now.
This appears to be an instance of a problem we've known about for a while, but have never understood the details well enough to write it up properly. The issue is that for some reason the InfiniBand driver is sometimes unable to register large blocks of POSIX shared memory. Unfortunately we don't know why or under what circumstances.
We do have a possible work-around, however.
Ideally all that is required is that you set the following in your environment when installing UPC++:
GASNET_CONFIGURE_ARGS='--enable-pshm --disable-pshm-posix --enable-pshm-sysv'
This will configure GASNet to use SystemV shared memory instead of POSIX shared memory. You and I actually exchanged email on this same subject (with respect to Berkeley UPC) on January 4 and 6, with the subject "UPC programs using ibv network cannot use all the memory!".
My understanding of your Jan 6 email is that switching to SystemV shared memory allowed you to use up to 60GB on a 64GB node of Abel. So, I suspect the extra configuration arguments above will resolve your problems with UPC++ too.
-Paul
-
reporter Oh nice! I don't know why, but I did not think that I could have the same problem with both UPC and UPC++.
Anyway, it now seems to be working. I will tell you more once my jobs have finished running on the supercomputer.
At least, on two nodes, I have been able to use:
export UPCXX_SEGMENT_MB=3300
with no issue!
-
- changed status to resolved
-
- changed version to 2017.9.0 release