UPC++ memory configuration on a supercomputer (InfiniBand + MPI spawning)

Issue #109 resolved
Jérémie Lagravière created an issue

The problem

My program crashes at runtime due to memory configuration issue(s).

The program works fine with the "smp" network conduit on any platform, including on a single node of the supercomputer I am using.

The program crashes when using more nodes: it may work on two nodes but crash on four, or vice versa.

Software setup

  • UPC++ (https://bitbucket.org/upcxx/upcxx/downloads/upcxx-2017.9.0.tar.gz) compiled with MPI support

  • MPI version: openmpi.gnu/1.10.2 ((Open MPI) 1.10.2)

  • GCC/G++ version: gcc version 5.2.0 (GCC)

  • OS:

    • LSB Version: :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
    • Distributor ID: CentOS
    • Description: CentOS release 6.9 (Final)
    • Release: 6.9
    • Codename: Final

Hardware setup

  • Supercomputer: abel.uio.no

  • RAM per node: 64 GB

  • Sockets per node: 2

  • Physical cores per node: 16 (2x8)

  • HyperThreading: disabled

  • CPU: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz

Network:

InfiniBand

$ ibv_devices 
    device                 node GUID
    ------              ----------------
    mlx4_0              002590ffff1720dc
$ ibv_devinfo 
hca_id: mlx4_0
    transport:          InfiniBand (0)
    fw_ver:             2.35.5100
    node_guid:          0025:90ff:ff17:20dc
    sys_image_guid:         0025:90ff:ff17:20df
    vendor_id:          0x02c9
    vendor_part_id:         4099
    hw_ver:             0x1
    board_id:           SM_2191000001000
    phys_port_cnt:          1
    Device ports:
        port:   1
            state:          PORT_ACTIVE (4)
            max_mtu:        4096 (5)
            active_mtu:     4096 (5)
            sm_lid:         404
            port_lid:       603
            port_lmc:       0x00
            link_layer:     InfiniBand

My program

The program I am using is quite simple in terms of code.

  1. Read data from a file (the data size never exceeds 7 GB). This step includes some duplication of the data in memory and some data distribution.

  2. Perform operations on the data; this step includes some communication (upcxx::rput()).

  3. Output some information about performance.

I know from an identical UPC version that I implemented that the problem size (data size) is NOT the issue: the data should fit in memory in most setups (1 node, 2 nodes, etc.).

Bug replication N.1, step by step

$ qlogin --account=XXXX --ntasks-per-node=16 --mem-per-cpu=3900MB --nodes=2

-> job is created: two nodes reserved

$ source /cluster/bin/jobsetup 

-> mandatory step to start using a compute node on the supercomputer

$ module load openmpi.gnu/1.10.2

-> this automatically loads: 1) gcc/5.2.0 2) openmpi.gnu/1.10.2

$ export GASNET_PHYSMEM_MAX=54G

-> using "only" 85% of the memory available per node

$ export UPCXX_SEGMENT_MB=2100

-> Note the "one hundred": 2100! This will be the cause of the crash

$ mpirun -n 32 upcxxProgram/upcxxSpmv ../dataset/D67MPI3Dheart.55 100 &> bugReport1

-> Nothing fancy here, just launching my program with mpirun. Please see the attached bugReport1 for details.

Bug replication N.2, step by step, now with UPCXX_CODEMODE=debug

$ qlogin --account=XXXX --ntasks-per-node=16 --mem-per-cpu=3900MB --nodes=2

-> job is created: two nodes reserved

$ source /cluster/bin/jobsetup 

-> mandatory step to start using a compute node on the supercomputer

$ module load openmpi.gnu/1.10.2

-> this automatically loads: 1) gcc/5.2.0 2) openmpi.gnu/1.10.2

$ export GASNET_PHYSMEM_MAX=54G
$ export UPCXX_SEGMENT_MB=2100

-> Note the "one hundred": 2100! This will be the cause of the crash

$ export UPCXX_CODEMODE=debug
$ make
g++ main.cpp tools.cpp mainComputation.cpp fileReader.cpp timeManagement.cpp -DUPCXX_BACKEND=gasnet1_seq -D_GNU_SOURCE=1 -DGASNET_SEQ -D_REENTRANT -I/cluster/home/jeremie/myRepo/pgm-jlg-upc-svn/trunk/otherThanSpmv/UPCXX/upcxxPackage/upcxx-2017.9.0/installed/gasnet.debug/include -I/cluster/home/jeremie/myRepo/pgm-jlg-upc-svn/trunk/otherThanSpmv/UPCXX/upcxxPackage/upcxx-2017.9.0/installed/gasnet.debug/include/ibv-conduit -I/cluster/home/jeremie/myRepo/pgm-jlg-upc-svn/trunk/otherThanSpmv/UPCXX/upcxxPackage/upcxx-2017.9.0/installed/upcxx.debug.gasnet1_seq.ibv/include -std=c++11 -D_GNU_SOURCE=1 -g3 -Wno-unused -Wno-unused-parameter -Wno-address -O2 -mavx -march=sandybridge -funroll-loops -fomit-frame-pointer -std=c++11 -L/cluster/home/jeremie/myRepo/pgm-jlg-upc-svn/trunk/otherThanSpmv/UPCXX/upcxxPackage/upcxx-2017.9.0/installed/upcxx.debug.gasnet1_seq.ibv/lib -lupcxx -lpthread -L/cluster/home/jeremie/myRepo/pgm-jlg-upc-svn/trunk/otherThanSpmv/UPCXX/upcxxPackage/upcxx-2017.9.0/installed/gasnet.debug/lib -lgasnet-ibv-seq -libverbs -lpthread -lrt -L/cluster/software/VERSIONS/gcc-5.2.0/lib/gcc/x86_64-unknown-linux-gnu/5.2.0 -lgcc -lrt -lm -I. -Iincludes/  -I/cluster/software/VERSIONS/openmpi.gnu-1.10.2/include -L/cluster/software/VERSIONS/openmpi.gnu-1.10.2/lib -lmpi  -lm -lrt -o upcxxProgram/upcxxSpmv

-> compiles with no errors, no warnings

$ mpirun -n 32 upcxxProgram/upcxxSpmv ../dataset/D67MPI3Dheart.55 100 &> bugReport2

-> Nothing fancy here, just launching my program with mpirun. Please see the attached bugReport2 for details.

Bug replication N.3, step by step

# Reusing the same environment as the previous "bug replication", just changing this:

$ export UPCXX_SEGMENT_MB=2000

-> it was 2100, now it's 2000

See bugReport3. It contains the regular output of my program, with no errors.
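
In hindsight (see the GASNetDefaultMaxSegsize line in the ident output below and the comments at the end of this issue), the threshold between the failing value (2100) and the working one (2000) appears to line up with the default segment-size cap compiled into this build; a quick sanity check of that assumption:

$ echo $(( (1 << 31) - 4096 ))
2147479552
# ≈ 2048 MB. Assuming UPCXX_SEGMENT_MB is interpreted in binary megabytes
# (MiB), 2100 lands above that cap (matching the "segsize too large" error
# quoted in the first comment below) while 2000 stays below it; bug
# replication N.4 below raises the cap explicitly with GASNET_MAX_SEGSIZE.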

Additional information about the executable

ldd

$ ldd upcxxProgram/upcxxSpmv 
    linux-vdso.so.1 =>  (0x00007ffe74bc3000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00000032ff000000)
    libibverbs.so.1 => /usr/lib64/libibverbs.so.1 (0x00000032ff400000)
    librt.so.1 => /lib64/librt.so.1 (0x00000032ff800000)
    libmpi.so.12 => /cluster/software/VERSIONS/openmpi.gnu-1.10.2/lib/libmpi.so.12 (0x00007fd18f4f9000)
    libm.so.6 => /lib64/libm.so.6 (0x00000032fe800000)
    libstdc++.so.6 => /cluster/software/VERSIONS/gcc-5.2.0/lib64/libstdc++.so.6 (0x00007fd18f169000)
    libgcc_s.so.1 => /cluster/software/VERSIONS/gcc-5.2.0/lib64/libgcc_s.so.1 (0x00007fd18ef53000)
    libc.so.6 => /lib64/libc.so.6 (0x00000032fe400000)
    /lib64/ld-linux-x86-64.so.2 (0x00000032fe000000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00000032fec00000)
    libnl.so.1 => /lib64/libnl.so.1 (0x0000003302800000)
    libosmcomp.so.3 => /usr/lib64/libosmcomp.so.3 (0x00007fd18ed42000)
    libopen-rte.so.12 => /cluster/software/VERSIONS/openmpi.gnu-1.10.2/lib/libopen-rte.so.12 (0x00007fd18eac8000)
    libopen-pal.so.13 => /cluster/software/VERSIONS/openmpi.gnu-1.10.2/lib/libopen-pal.so.13 (0x00007fd18e7f2000)
    libnuma.so.1 => /usr/lib64/libnuma.so.1 (0x0000003300000000)
    libutil.so.1 => /lib64/libutil.so.1 (0x0000003300c00000)
    libibumad.so.3 => /usr/lib64/libibumad.so.3 (0x00000032ffc00000)

file

$ file upcxxProgram/upcxxSpmv 
upcxxProgram/upcxxSpmv: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, not stripped

ident

$ ident upcxxProgram/upcxxSpmv 
upcxxProgram/upcxxSpmv:
     $GASNetCoreLibraryVersion: 1.18 $
     $GASNetCoreLibraryName: IBV $
     $GASNetExtendedLibraryVersion: 1.13 $
     $GASNetExtendedLibraryName: IBV $
     $GASNetToolsThreadModel: SEQ $
     $GASNetToolsConfig: RELEASE=2017.9.0,SPEC=1.10,PTR=64bit,debug,SEQ,timers_native,membars_native,atomics_native,atomic32_native,atomic64_native $
     $GASNetBuildTimestamp: Dec 12 2017 15:04:30 $
     $GASNetBuildId: Tue Dec 12 15:01:33 CET 2017 jeremie $
     $GASNetConfigureArgs: '--enable-debug' '--disable-psm' '--disable-mxm' '--disable-portals4' '--disable-ofi' '--disable-dev-warnings' '--disable-parsync' $
     $GASNetSystemTuple: x86_64-unknown-linux-gnu $
     $GASNetSystemName: login-0-1.local $
     $GASNetCompilerID: |COMPILER_FAMILY:GNU|COMPILER_VERSION:5.2.0|COMPILER_FAMILYID:1|STD:__STDC__,__STDC_VERSION__=201112L|misc:5.2.0| $
     $GASNetGitHash: gex-2017.9.0 $
     $GASNetEXAPIVersion: 0.2 $
     $GASNetAPIVersion: 1 $
     $GASNetThreadModel: GASNET_SEQ $
     $GASNetSegment: GASNET_SEGMENT_FAST $
     $GASNetConfig: (libgasnet.a) RELEASE=2017.9.0,SPEC=0.2,CONDUIT=IBV(IBV-1.18/IBV-1.13),THREADMODEL=SEQ,SEGMENT=FAST,PTR=64bit,noalign,pshm,debug,trace,stats,debugmalloc,srclines,timers_native,membars_native,atomics_native,atomic32_native,atomic64_native $
     $GASNetConduitName: IBV $
     $GASNetTracingEnabled: 1 $
     $GASNetStatisticsEnabled: 1 $
     $GASNetDefaultMaxSegsize: ((((uint64_t)1)<<31) - 4096) $
     $GASNetMPISpawner: 1 $
     $GASNetSSHSpawner: 1 $

Bug replication N.4

source /cluster/bin/jobsetup
module load openmpi.gnu/1.10.2
export GASNET_MAX_SEGSIZE=54G
export UPCXX_SEGMENT_MB=2000
export GASNET_PHYSMEM_MAX=54G
mpirun -n 32 upcxxProgram/upcxxSpmv ../dataset/D67MPI3Dheart.55 100

-> This works fine

Same setup except this:

export UPCXX_SEGMENT_MB=2200
mpirun -n 32 upcxxProgram/upcxxSpmv ../dataset/D67MPI3Dheart.55 100

-> This crashes, see bugReport4 (attached in the comment section)
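
Neither of these values comes close to the node's 64 GB of RAM or to GASNET_PHYSMEM_MAX, so this crash does not look like simple memory exhaustion; rough per-node arithmetic, assuming the 16 ranks per node requested from the scheduler:

# 16 ranks/node x 2000 MB segment = 32000 MB per node  -> runs
# 16 ranks/node x 2200 MB segment = 35200 MB per node  -> crashes, yet still well under GASNET_PHYSMEM_MAX=54G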

Comments (23)

  1. Dan Bonachea

    The key output here is:

    GASNet gasnetc_attach returning an error code: GASNET_ERR_BAD_ARG (Invalid function parameter passed)
      at /cluster/home/jeremie/myRepo/pgm-jlg-upc-svn/trunk/otherThanSpmv/UPCXX/upcxxPackage/upcxx-2017.9.0/.nobs/art/9c31690be3ce5d589562b58bbba9f5a85ea04406/GASNet-2017.9.0/ibv-conduit/gasnet_core.c:2271
      reason: segsize too large
    UPC++ assertion failure on rank 15 [/cluster/home/jeremie/myRepo/pgm-jlg-upc-svn/trunk/otherThanSpmv/UPCXX/upcxxPackage/upcxx-2017.9.0/src/backend/gasnet/runtime.cpp:129]
    

    Jeremie - can you please attach the output of the following commands:

    file upcxxProgram/upcxxSpmv
    ldd upcxxProgram/upcxxSpmv
    ident upcxxProgram/upcxxSpmv
    
  2. Jérémie Lagravière reporter

    I have seen your message... I am trying to get a node allocation on the supercomputer I am using.

  3. Dan Bonachea

    OK - N.4 is hitting a different error message:

    *** FATAL ERROR: Unexpected error Bad address (errno=14) when registering the segment
    

    which indicates we've made progress.

    The change in output also confirms the error seen in N.3 was due to the lack of GASNET_MAX_SEGSIZE and issue #96. The next release will include improvements to upcxx-run and the runtime to hopefully smooth over that problem.

    The new problem corresponds to an ibv-conduit issue on some Linux systems that we're still investigating. I will let @PHHargrove provide details on the recommended resolution for that problem.

  4. Paul Hargrove

    Jérémie,

    As @bonachea indicated, you've hit a different issue now.
    This appears to be an instance of a problem we've known about for a while, but have never understood the details well enough to write it up properly.

    The issue is that for some reason the InfiniBand driver is sometimes unable to register large blocks of POSIX shared memory. Unfortunately we don't know why or under what circumstances.
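
    One quick way to see what each flavor looks like on a compute node (standard Linux tools, nothing UPC++-specific): POSIX shared memory lives on the /dev/shm tmpfs mount, while System V segments are managed by the kernel and inspected with ipcs.

    df -h /dev/shm    # size and usage of the POSIX shared-memory mount
    ipcs -lm          # System V shared-memory limits on the node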

    We do have a possible work-around, however.
    Ideally, all that is required is that you set the following in your environment when installing UPC++:

    GASNET_CONFIGURE_ARGS='--enable-pshm --disable-pshm-posix --enable-pshm-sysv'
    This will configure GASNet to use SystemV shared memory instead of POSIX shared memory.
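
    A sketch of how that reinstall might look (the ./install invocation and the install prefix are assumptions based on the 2017.9.0 tarball and the build paths in the compile line above):

    cd upcxx-2017.9.0
    export GASNET_CONFIGURE_ARGS='--enable-pshm --disable-pshm-posix --enable-pshm-sysv'
    ./install $PWD/installed
    # ...then recompile the application against the rebuilt installation.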

    You and I actually exchanged email on this same subject (with respect to Berkeley UPC) on January 4 and 6, with the subject "UPC programs using ibv network cannot use all the memory!".
    My understanding of your Jan 6 email is that switching to SystemV shared memory allowed you to use up to 60GB on a 64GB node of Abel. So, I suspect the extra configuration arguments above will resolve your problems with UPC++ too.

    -Paul

  5. Jérémie Lagravière reporter

    Oh nice! I don't know why, but I did not think I could have the same problem with both UPC and UPC++.

    Anyway, it now seems to be working. I will tell you more once my jobs are done running on the supercomputer.

    At least, on two nodes, I have been able to use:

    export UPCXX_SEGMENT_MB=3300
    

    with no issue!
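
    For reference, a plausible end-to-end recipe combining the fixes from this thread (the System V shared-memory rebuild plus the explicit segment-size settings); the values are taken from the runs above, so treat this as a sketch rather than a verified configuration:

    source /cluster/bin/jobsetup
    module load openmpi.gnu/1.10.2
    export GASNET_MAX_SEGSIZE=54G
    export GASNET_PHYSMEM_MAX=54G
    export UPCXX_SEGMENT_MB=3300
    mpirun -n 32 upcxxProgram/upcxxSpmv ../dataset/D67MPI3Dheart.55 100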

  6. Dan Bonachea

    Based on the discussion, the end user's problem has been resolved.

    GASNet Bug 3693 and Bug 3694 remain open to address the underlying user-facing issue and the discussion of potential solutions, respectively.
