Multiple test failures with NVHPC 23.3+

Issue #601 resolved
Paul Hargrove created an issue

Tests of NVHPC 23.3 on all three architectures they support (x86_64, ppc64le, aarch64) have failures which did not occur with their 23.1 release.

There are two failure modes:

  1. Various assertion failures which indicate corruption of a global pointer.
    This has been seen from (at least) rput-cover, vis, alloc, global_ptr, non-contig-example, allocator-example and local_team in harness-based testing.
  2. Internal Compiler Errors.
    This has been seen with at least test/memory_kinds.cpp, both with and without --with-cuda.

Full harness output for x86_64, ppc64le and aarch64 (in that order) can be found in the following three locations:

So far, only codemode=debug exhibits the failures, and not just because assertions are disabled for codemode=opt. In particular:

  • upcxx -codemode=opt -DUPCXXI_GPTR_CHECK_ENABLED=1 ... does NOT fail.
  • upcxx -codemode=debug -O1 ... does NOT fail.

Both failure modes appear attributable to a flawed implementation of __attribute__((pure)), support for which is new in this release of NVHPC. Therefore, setting gasnet_cv_gasneti_have_cxx_attr_pure=no at configure time (in the environment or on the command line) is believed to be an effective work-around (as is use of any earlier supported release of NVHPC).
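
For readers unfamiliar with the attribute, the following is a minimal sketch of how a configure-time result can gate use of __attribute__((pure)). It is an illustration only: the macro names EXAMPLE_HAVE_CXX_ATTR_PURE and EXAMPLE_PURE and the functions count_nonzero and example_check are hypothetical and are not taken from GASNet-EX or UPC++.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a configure-time probe result (assumption).
// Compiling with -DEXAMPLE_HAVE_CXX_ATTR_PURE=0 mimics a "no" answer,
// in the same spirit as the configure-time work-around described above.
#ifndef EXAMPLE_HAVE_CXX_ATTR_PURE
  #if defined(__GNUC__) || defined(__clang__)
    #define EXAMPLE_HAVE_CXX_ATTR_PURE 1
  #else
    #define EXAMPLE_HAVE_CXX_ATTR_PURE 0
  #endif
#endif

// The attribute is emitted only when the probe said it is usable;
// otherwise the macro expands to nothing and the code is unchanged.
#if EXAMPLE_HAVE_CXX_ATTR_PURE
  #define EXAMPLE_PURE __attribute__((pure))
#else
  #define EXAMPLE_PURE
#endif

// "pure" promises that the function has no side effects and that its
// result depends only on its arguments and the memory it reads, so the
// compiler may merge or eliminate repeated calls.
EXAMPLE_PURE std::size_t count_nonzero(const int *v, std::size_t n);

std::size_t count_nonzero(const int *v, std::size_t n) {
  std::size_t count = 0;
  for (std::size_t i = 0; i < n; ++i) {
    if (v[i] != 0) ++count;
  }
  return count;
}

// Small self-check that can be called from any main().
inline void example_check() {
  const int data[] = {0, 3, 0, 7};
  assert(count_nonzero(data, 4) == 2);
}
```

With the attribute disabled the program's behavior is unchanged; only an optimization hint is lost, which is why turning it off is a low-risk work-around when a compiler mishandles it.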

Comments (6)

  1. Paul Hargrove reporter

    Correction:

    The ICE compiling test/memory_kinds.cpp is reproducible in both codemodes (and both respond to the same work-around).

  2. Paul Hargrove reporter
    • changed status to open

    GASNet-EX commit 588e47720 adjusts logic for use of the pure attribute to exclude NVHPC 23.3 and newer.

    This issue remains open pending the following:

    • Advance of the GASNet-EX stable branch to include the work-around.
    • Identification of an eventual fixed release of NVHPC at which the work-around can be disabled.
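
A version-based exclusion of the kind described in this comment could look roughly like the sketch below. This is not the code from the GASNet-EX commit: the SKETCH_* macro names and the helper functions are hypothetical. The only external assumption is that NVHPC compilers predefine __NVCOMPILER, __NVCOMPILER_MAJOR__ and __NVCOMPILER_MINOR__.

```cpp
#include <cassert>

// Hypothetical helper: true for NVHPC versions 23.3 and newer.
constexpr bool nvhpc_pure_excluded(int major, int minor) {
  return major > 23 || (major == 23 && minor >= 3);
}

// Same test expressed for the preprocessor, using NVHPC's version macros.
#if defined(__NVCOMPILER) && \
    (__NVCOMPILER_MAJOR__ > 23 || \
     (__NVCOMPILER_MAJOR__ == 23 && __NVCOMPILER_MINOR__ >= 3))
  #define SKETCH_NVHPC_PURE_BROKEN 1
#else
  #define SKETCH_NVHPC_PURE_BROKEN 0
#endif

// Use the attribute only on GNU-compatible compilers that are not excluded.
#if !SKETCH_NVHPC_PURE_BROKEN && (defined(__GNUC__) || defined(__clang__))
  #define SKETCH_PURE __attribute__((pure))
#else
  #define SKETCH_PURE
#endif

// Example declaration using the guarded attribute.
SKETCH_PURE int sketch_square(int x);
int sketch_square(int x) { return x * x; }

// Small self-check that can be called from any main().
inline void sketch_check() {
  assert(nvhpc_pure_excluded(23, 3));   // first affected release
  assert(!nvhpc_pure_excluded(23, 1));  // last release known to work
  assert(sketch_square(4) == 16);
}
```

Once a fixed NVHPC release is identified, a guard of this shape would gain an upper bound on the version range rather than being removed outright.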