Implicit ~device_allocator may make CUDA calls after CUDA is "deinitialized"

I recently tried to run sympack2D_cuda on an Ubuntu 24.04 system with the distro-provided GCC (13.2) and CUDA toolchain (12.0). The result is a crash after the return from main(), with the following message from every rank:

UPC++ CUDA call failed:
on process 0 (cgpu-1)
at
[prefix]/upcxx.assert1.optlev0.dbgsym1.gasnet_seq.ibv/include/upcxx/cuda_internal.hpp:56

cuCtxPushCurrent(ctx)
error=:

The attached backtrace shows the following sequence:

__run_exit_handlers() is called sometime after return from main()
    ~device_allocator()
        ~device_allocator_core()
            release()
                context()
                    cuCtxPushCurrent()  returns CUDA_ERROR_DEINITIALIZED
                    fatalerror()

In lieu of symPACK, the following is sufficient to reproduce on this system:

#include <upcxx/upcxx.hpp>
upcxx::device_allocator<upcxx::cuda_device> gpu_allocator;  // must be a global (static storage) to reproduce
int main(void) {
  upcxx::init();
  gpu_allocator = upcxx::make_gpu_allocator<upcxx::gpu_default_device>(32*1024*1024);
  upcxx::finalize();
  return 0;
}
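
To illustrate the ordering that the backtrace suggests, here is a minimal, library-free sketch. It assumes (this is not confirmed) that the CUDA driver registers its teardown as an exit handler during initialization, i.e. after the global allocator's destructor has already been registered. Exit handlers run in reverse order of registration, so the driver would be torn down first. `ExitHandlers`, `simulate`, and the log strings are hypothetical names used only for this model; nothing here calls UPC++ or CUDA.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of the process-wide exit-handler stack (LIFO order).
struct ExitHandlers {
  std::vector<std::function<void()>> stack;
  void register_handler(std::function<void()> f) { stack.push_back(std::move(f)); }
  void run() {
    while (!stack.empty()) {
      auto f = std::move(stack.back());
      stack.pop_back();
      f();  // most recently registered handler runs first
    }
  }
};

// Replays the assumed sequence and returns a log of what happened.
// destroy_before_exit == true models releasing the allocator explicitly
// while the driver is still alive.
std::vector<std::string> simulate(bool destroy_before_exit) {
  std::vector<std::string> log;
  bool driver_alive = false;      // stand-in for CUDA driver state (assumption)
  bool allocator_active = false;  // stand-in for the global allocator's state
  ExitHandlers handlers;

  auto release = [&] {
    if (!allocator_active) return;  // nothing to release
    log.push_back(driver_alive ? "release: ok" : "release: driver deinitialized");
    allocator_active = false;
  };

  // 1. Static initialization: the global is constructed before main(),
  //    so its destructor is registered first.
  handlers.register_handler([&] { release(); });

  // 2. Inside main(): driver initialization registers its own teardown (assumed).
  driver_alive = true;
  handlers.register_handler([&] {
    driver_alive = false;
    log.push_back("driver teardown");
  });

  // 3. Inside main(): the global is assigned an active allocator.
  allocator_active = true;
  if (destroy_before_exit) release();

  // 4. Return from main(): exit handlers run in reverse registration order.
  handlers.run();
  return log;
}
```

In this model, `simulate(false)` records the driver teardown before the failing release, matching the crash above, while `simulate(true)` records a successful release followed by the driver teardown. In UPC++ terms the second case would presumably correspond to calling `gpu_allocator.destroy()` before `upcxx::finalize()`; I have not verified that workaround on this system.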
