Distributed objects instances cannot be stored in std::vector

Issue #217 resolved
Amin M. Khan created an issue

Following code compiles fine when trying to store pointers to distributed object in array or vector:

upcxx::dist_object<upcxx::global_ptr<unsigned int>>* arr[10];
std::vector< upcxx::dist_object<upcxx::global_ptr<unsigned int>>* > vec;

for (int i = 0; i < 10; i++) {
    upcxx::dist_object<upcxx::global_ptr<unsigned int>> u_g(upcxx::new_<unsigned int>(i));
    vec.push_back(&u_g);
    arr[i] = &u_g;
}

However, directly storing distributed object instances in std::vector throws errors at compile time.

    upcxx::dist_object<upcxx::global_ptr<unsigned int>> arr_dist[10];
    std::vector< upcxx::dist_object<upcxx::global_ptr<unsigned int>> > vec_dist;

    for (int i = 0; i < 10; i++) {
        upcxx::dist_object<upcxx::global_ptr<unsigned int>> u_g(upcxx::new_<unsigned int>(i));
        vec_dist.push_back(u_g);
        arr_dist[i] = u_g;
    }

The error is about use of deleted function: upcxx::dist_object<T>::dist_object(const upcxx::dist_object<T>&)

/cluster/software/VERSIONS/gcc-6.3.0/include/c++/6.3.0/ext/new_allocator.h: In instantiation of 'void __gnu_cxx::new_allocator<_Tp>::construct(_Up*, _Args&& ...) [with _Up = upcxx::dist_object<upcxx::global_ptr<unsigned int> >; _Args = {const upcxx::dist_object<upcxx::global_ptr<unsigned int> >&}; _Tp = upcxx::dist_object<upcxx::global_ptr<unsigned int> >]':
/cluster/software/VERSIONS/gcc-6.3.0/include/c++/6.3.0/bits/alloc_traits.h:455:4:   required from 'static void std::allocator_traits<std::allocator<_CharT> >::construct(std::allocator_traits<std::allocator<_CharT> >::allocator_type&, _Up*, _Args&& ...) [with _Up = upcxx::dist_object<upcxx::global_ptr<unsigned int> >; _Args = {const upcxx::dist_object<upcxx::global_ptr<unsigned int> >&}; _Tp = upcxx::dist_object<upcxx::global_ptr<unsigned int> >; std::allocator_traits<std::allocator<_CharT> >::allocator_type = std::allocator<upcxx::dist_object<upcxx::global_ptr<unsigned int> > >]'
/cluster/software/VERSIONS/gcc-6.3.0/include/c++/6.3.0/bits/stl_vector.h:918:30:   required from 'void std::vector<_Tp, _Alloc>::push_back(const value_type&) [with _Tp = upcxx::dist_object<upcxx::global_ptr<unsigned int> >; _Alloc = std::allocator<upcxx::dist_object<upcxx::global_ptr<unsigned int> > >; std::vector<_Tp, _Alloc>::value_type = upcxx::dist_object<upcxx::global_ptr<unsigned int> >]'
hello-world.cpp:128:31:   required from here
/cluster/software/VERSIONS/gcc-6.3.0/include/c++/6.3.0/ext/new_allocator.h:120:4: error: use of deleted function 'upcxx::dist_object<T>::dist_object(const upcxx::dist_object<T>&) [with T = upcxx::global_ptr<unsigned int>]'
  { ::new((void *)__p) _Up(std::forward<_Args>(__args)...); }
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /home/.local/opt/upcxx-2018.9.0/upcxx.O3.gasnet_seq.ibv/include/upcxx/upcxx.hpp:13:0,
                 from hello-world.cpp:18:
/home/.local/opt/upcxx-2018.9.0/upcxx.O3.gasnet_seq.ibv/include/upcxx/dist_object.hpp:108:5: note: declared here
     dist_object(dist_object const&) = delete;
     ^~~~~~~~~~~

With array similar problem:

// upcxx::dist_object<upcxx::global_ptr<unsigned int>> arr1[10]; // won't compile
upcxx::dist_object<upcxx::global_ptr<unsigned int>>* arr_dist;
arr_dist = new upcxx::dist_object<upcxx::global_ptr<unsigned int>>[10]; // error: use of deleted function

for (int i = 0; i < 10; i++) {
    upcxx::dist_object<upcxx::global_ptr<unsigned int>> u_g(upcxx::new_<unsigned int>(i));
    // arr_dist[i] = u_g; // won't compile
}

This may not be supported functionality and not the recommended way to do this, but wondering about the background cause.

Comments (7)

  1. Amin M. Khan reporter

    Just a note that with pointers to distributed objects, this works fine. Below is a working example:

    upcxx::dist_object<upcxx::global_ptr<unsigned int>>* arr[10];
    std::vector< upcxx::dist_object<upcxx::global_ptr<unsigned int>>* > vec;
    
    for (int i = 0; i < 10; i++) {
        auto my_ptr = upcxx::new_<unsigned int>(i);
        auto my_dist = new upcxx::dist_object<upcxx::global_ptr<unsigned int>>(my_ptr);
    
        vec.push_back(my_dist);
        arr[i] = my_dist;
    }
    
    upcxx::barrier();
    
    for (int i = 0; i < 10; i++) {
        auto fut1 = vec[i]->fetch(0);
        auto fut2 = arr[i]->fetch(0);
    
        // when testing on single node
        std::cout << *(fut1.wait().local()) << std::endl;
        std::cout << *(fut2.wait().local()) << std::endl;
    }
    
  2. Dan Bonachea

    @aminmkhan - std::vector requires the element type to satisfy the C++ concepts CopyAssignable and CopyConstructible. upcxx::dist_object only provides MoveConstructible - this is an intentional decision to correctly support the lookup methods of dist_object, which consult a central directory to locate a reference to the the local representative of a distributed object. Copy assignment is fundamentally incompatible with maintaining a unique local representative for a collectively constructed dist_object.

    The array case doesn't work because dist_object is not DefaultConstructible. I believe this decision was mostly motivated by reducing error-prone behaviors, as opposed to anything fundamental.

    I'd like to understand more about your usage case - we've usually discussed dist_object serving as the "top-level" of a distributed data structure, where it could contain global_ptr's or std::arrays of global_ptr's referencing all the representatives with affinity to a given process. I've generally assumed one dist_object per important data structure (and possibly per team) in a given module. I did not envision variable-length arrays of dist_objects being an important use case.

  3. Amir Kamil

    I think it would work if you use emplace_back(), which constructs the object directly in the vector, rather than push_back().

        std::vector< upcxx::dist_object<upcxx::global_ptr<unsigned int>> > vec_dist;
    
        for (int i = 0; i < 10; i++) {
            vec_dist.emplace_back(upcxx::new_<unsigned int>(i));
        }
    
  4. Amin M. Khan reporter

    I'd like to understand more about your usage case

    Imagine on each process, I have N separate global ptr arrays, and so each of the N distributed object is for each of those N global ptr.
    Each process produces messages for everyone else, and queues them in each of the N global ptr arrays.

    std::vector< upcxx::dist_object<upcxx::global_ptr<Message<TableKey_T, ItemKey_T, Msg_T>>>* > my_push_buffers_g;
    std::vector< Message<TableKey_T, ItemKey_T, Msg_T>* > my_push_buffers;
    
    for (int i = 0; i < upcxx::rank_n(); i++) {
        auto my_ptr = upcxx::new_array<Message<TableKey_T, ItemKey_T, Msg_T>>(BUFFER_MAX_SIZE);
        auto my_dist = new upcxx::dist_object<upcxx::global_ptr<Message<TableKey_T, ItemKey_T, Msg_T>>>(my_ptr);
    
        my_push_buffers.emplace_back(my_ptr.local());
        my_push_buffers_g.emplace_back(my_dist);
    }
    
  5. Dan Bonachea

    Imagine on each process, I have N separate global ptr arrays, and so each of the N distributed object is for each of those N global ptr.

    Since N here is the number of processes, this algorithm of course induces non-scalable memory utilization, so possibly not a great long-term design. Unless your communication pattern really uses all those worst-case buffers, you might eventually want something pull-based or otherwise adapted to the real communication pattern.

    However ignoring that for the moment, wouldn't it be simpler in this case to just have a single dist_object and a buffer of size N*BUFFER_MAX_SIZE?

  6. Amin M. Khan reporter

    Yes, using upcxx::new_array<Message<>>(N * BUFFER_MAX_SIZE) is a possible alternative.

    Another alternative could be to have global_ptr array of global_ptr, something like:

    upcxx::global_ptr<upcxx::global_ptr<int>> g_m = upcxx::new_array<upcxx::global_ptr<int>>(N);
    for (int i = 0; i < N; i++) {
        auto g_i = upcxx::new_array<int>(BUFFER_MAX_SIZE);
        *(g_m.local() + i) = g_i;
    }
    

    This scenario is about how best to elegantly store a two-dimensional data structure using global pointers to be suitable for RMA calls. In this particular case, rows can be fixed in advance, while columns can be variable length.

    • Given a process i of total N processes, which can send up to M messages
    • each process i keeps buffer[N x M]
    • where each entry in buffer[j, k] correspond to message k sent from process i to process j

    So:

    • Each process j will only upcxx::fetch distributed object corresponding to row i from process i
    • Each process j will upcxx::rget row i only from process i

    I am trying to see how above works in practice, else I will fall back to using N*BUFFER_MAX_SIZE approach.

  7. Log in to comment