issue138.cpp breaks PGI on Linux and GCC+Clang on Cygwin

Issue #278 resolved
Dan Bonachea created an issue

As currently written, test/regression/issue138.cpp constructs a deep chain of template instantiations, which generates a combinatorial explosion of code in the object file, especially when optimizations that reduce text size are disabled.

This results in reliable compile failures in CI, where both PGI on Linux and GCC+Clang on Cygwin either time out or just fall over dead during compilation.

I have a workaround in mind.

Comments (5)

  1. Dan Bonachea reporter

    This problem is particularly pronounced on Cygwin, where the test reliably causes a fatal assembler error in debug mode.

    As documented here, there is a problem in Windows' PE/COFF object format that prevents the assembler from generating more than 32k sections in one object file (the current version of the test generates over 130k sections in debug mode). Adding the -g0 -Wa,-mbig-obj compiler options enables a workaround for the section limit on Windows, and also disables debug symbol output (which is also enormous, causing a different assembler overflow).

  2. Dan Bonachea reporter

    The issue138 test is now also seen to break NVHPC on ppc64el for all combinations of {seq,par}x{debug,opt} in nightly tests.

    Errors in debug mode look like a potentially truncated intermediate file, e.g.:

    /opt/nvidia/hpc_sdk/Linux_ppc64le/21.5/compilers/share/llvm/bin/llc: error: /opt/nvidia/hpc_sdk/Linux_ppc64le/21.5/compilers/share/llvm/bin/llc: /run/user/1003/nvc++rvD-fZpjSeiYG.ll:264419:37: error: expected '=' after name
    %struct._ZSt10_Head_baseILm0EN5upcxx
    
    /opt/nvidia/hpc_sdk/Linux_ppc64le/21.5/compilers/share/llvm/bin/llc: error: /opt/nvidia/hpc_sdk/Linux_ppc64le/21.5/compilers/share/llvm/bin/llc: /run/user/1003/nvc++LeVniVsE6TEgY.ll:264413:190: error: expected value token
            call void  @_ZNSt11_Tuple_implILm1EJN5upcxx6detail11raw_storageINSt7__cxx114listISt6vectorIdSaIdEESaIS7_EEEEEEEC1Ev (%struct._ZSt11_Tuple_implILm1EJN5upcxx6detail11raw_storageINSt7__cxx114
    

    whereas opt failures directly report running out of disk space:

    LLVM ERROR: IO failure on output stream: No space left on device
    PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
    Stack dump:
    0.  Program arguments: /opt/nvidia/hpc_sdk/Linux_ppc64le/21.5/compilers/share/llvm/bin/llc /run/user/1003/nvc++dBtahjqVW92Os.ll -mcpu=native -O1 -fast-isel=0 -non-global-value-max-name-size=4294967295 -mattr=-fre,-fres,-frsqrte,-frsqrtes -disable-tail-merge-return -disable-ppc-preinc -disable-ppc-unaligned -code-model=large --frame-pointer=none -o /run/user/1003/nvc++BBtahrrjeT5EM.s 
    /opt/nvidia/hpc_sdk/Linux_ppc64le/21.5/compilers/share/llvm/bin/llc(_ZN4llvm3sys15PrintStackTraceERNS_11raw_ostreamE+0x38)[0x11af77a8]
    /opt/nvidia/hpc_sdk/Linux_ppc64le/21.5/compilers/share/llvm/bin/llc[0x11af78d0]
    /opt/nvidia/hpc_sdk/Linux_ppc64le/21.5/compilers/share/llvm/bin/llc(_ZN4llvm3sys17RunSignalHandlersEv+0x78)[0x11af51b8]
    /opt/nvidia/hpc_sdk/Linux_ppc64le/21.5/compilers/share/llvm/bin/llc[0x11af53ac]
    linux-vdso64.so.1(__kernel_sigtramp_rt64+0x0)[0x77f461f404c8]
    /lib/powerpc64le-linux-gnu/libc.so.6(gsignal+0xd8)[0x77f461b00468]
    /lib/powerpc64le-linux-gnu/libc.so.6(abort+0x168)[0x77f461ad7cd0]
    /opt/nvidia/hpc_sdk/Linux_ppc64le/21.5/compilers/share/llvm/bin/llc(_ZN4llvm18report_fatal_errorERKNS_5TwineEb+0xd0)[0x11a52100]
    /opt/nvidia/hpc_sdk/Linux_ppc64le/21.5/compilers/share/llvm/bin/llc(_ZN4llvm18report_fatal_errorERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEb+0x38)[0x11a52288]
    /opt/nvidia/hpc_sdk/Linux_ppc64le/21.5/compilers/share/llvm/bin/llc(_ZN4llvm14raw_fd_ostreamD1Ev+0x118)[0x11acd118]
    /opt/nvidia/hpc_sdk/Linux_ppc64le/21.5/compilers/share/llvm/bin/llc[0x1033ffc8]
    /opt/nvidia/hpc_sdk/Linux_ppc64le/21.5/compilers/share/llvm/bin/llc(main+0x520)[0x102c6040]
    /lib/powerpc64le-linux-gnu/libc.so.6(+0x2814c)[0x77f461ad814c]
    /lib/powerpc64le-linux-gnu/libc.so.6(__libc_start_main+0x94)[0x77f461ad8324]
    nvc++-Fatal-/opt/nvidia/hpc_sdk/Linux_ppc64le/21.5/compilers/share/llvm/bin/llc TERMINATED by signal 6
    Arguments to /opt/nvidia/hpc_sdk/Linux_ppc64le/21.5/compilers/share/llvm/bin/llc
    /opt/nvidia/hpc_sdk/Linux_ppc64le/21.5/compilers/share/llvm/bin/llc /run/user/1003/nvc++dBtahjqVW92Os.ll -mcpu=native -O1 -fast-isel=0 -non-global-value-max-name-size=4294967295 -mattr=-fre,-fres,-frsqrte,-frsqrtes -disable-tail-merge-return -disable-ppc-preinc -disable-ppc-unaligned -code-model=large --frame-pointer=none -o /run/user/1003/nvc++BBtahrrjeT5EM.s
    
  3. Paul Hargrove

    For the case of the failures in nightly testing with NVHPC on ppc64el:

    This system is running with TMPDIR set to an 8 GiB ramdisk.
    It is possible that temporary files for this test really do exceed that size, in which case NOT setting TMPDIR (or setting it back to the default /tmp for the UPC++ test suite) might be a work-around for this in our nightly testing. However, I would still consider that a work-around for a known issue, as opposed to "not an issue".
