upcxx::copy crashes when passed a remote host source and local T* dest

Issue #221 resolved
Max Grossman created an issue

While performance may not be optimal, the spec seems to allow using upcxx::copy with the source as a remote global_ptr and the dest as a local T*. However, performing this operation causes the program to abort with either a segfault or this error:

*** FATAL ERROR (proc 2): Remote address out of range (TM0:3 ptr=0x00000000 27789430 nbytes=1060896) at _gex_event_s* _gex_RMA_PutNB(gex_TM_t, gex_Rank_t, void*, void*, size_t, _gex_event_s**, gex_Flags_t) at /gpfs/alpine/world-shared/csc296/summit/upcxx-cuda/gcc-6.4.0/2019.3.0/gasnet.debug/include/ibv-conduit/gasnet_extended.h:92

Reproducer is below (run with exactly 2 processes). Using rget works fine; using copy crashes.

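#include <upcxx/upcxx.hpp>
#include <cassert>

int main(int argc, char **argv) {
    upcxx::init();

    int numprocs = upcxx::rank_n();
    int rank = upcxx::rank_me();

    // The peer lookup below (!rank) only works for exactly two ranks.
    assert(numprocs == 2);

    // Each rank allocates 10 doubles in its shared segment.
    upcxx::global_ptr<double> arr = upcxx::new_array<double>(10);

    // Publish the local array and fetch the other rank's global pointer.
    upcxx::dist_object<upcxx::global_ptr<double>> *dobj =
        new upcxx::dist_object<upcxx::global_ptr<double>>(arr);
    upcxx::global_ptr<double> remote = dobj->fetch(!rank).wait();

    // Destination is ordinary private heap memory, not the shared segment.
    double *local = new double[10];

    // upcxx::rget(remote, local, 10).wait();  // works
    upcxx::copy(remote, local, 10).wait();     // crashes

    upcxx::finalize();

    return 0;
}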

Comments (6)

  1. Dan Bonachea

    This issue is fixed in the memory_kinds branch and in the forthcoming 2020.11.0 memory kinds prototype.
