Transfer text data using std::string or char* with RMA calls

Issue #275 resolved
Amin M. Khan created an issue

With the current release of UPC++, what is the recommended way to transfer text data using RMA calls such as upcxx::rget and upcxx::rput?

I see the following possibilities, and I think only the last one will work, as the rest are not trivially serializable.

  1. upcxx::global_ptr<std::string> s;
  2. upcxx::global_ptr<std::vector<std::string>> s;
  3. upcxx::global_ptr<char*> c;
  4. upcxx::global_ptr<char[50]> c = upcxx::new_array<char[50]>(100);
  5. upcxx::global_ptr<char> c = upcxx::new_array<char>(5000);

In theory, global_ptr<std::vector<std::string>> would be the most flexible.
In practice, global_ptr<char> will work, but involves more programming logic in the user application to track different text snippets, in particular when text has to be removed and inserted.
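
The bookkeeping mentioned above can be sketched in plain C++ as a flat character pool plus an offset table. This is only an illustration under stated assumptions: PackedText, pack and unpack are hypothetical names, not part of UPC++, and the sketch covers packing and lookup only, not in-place insertion or removal. Both arrays hold trivially copyable elements, so their contents could in principle be copied into shared-heap arrays and moved with upcxx::rput or upcxx::rget.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: all snippets stored back to back in one char pool.
struct PackedText {
    std::vector<char> bytes;           // concatenated snippet bytes, no terminators
    std::vector<std::size_t> offsets;  // offsets[i] = start of snippet i; last entry = total size
};

// Concatenate the snippets and record where each one starts.
PackedText pack(const std::vector<std::string>& snippets) {
    PackedText p;
    p.offsets.push_back(0);
    for (const std::string& s : snippets) {
        p.bytes.insert(p.bytes.end(), s.begin(), s.end());
        p.offsets.push_back(p.bytes.size());
    }
    return p;
}

// Rebuild snippet i from the pool using the offset table.
std::string unpack(const PackedText& p, std::size_t i) {
    assert(i + 1 < p.offsets.size());
    return std::string(p.bytes.begin() + p.offsets[i],
                       p.bytes.begin() + p.offsets[i + 1]);
}
```

With this layout, a reader needs the offset table first and can then fetch only the byte range of the snippet it wants. Inserting or removing text means rebuilding the pool or adding a free-list scheme, which is the extra application logic referred to above.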

For some context, upcxx-spec issue #136 covers the details on full-featured serialization. See also some relevant discussion related to serialization in earlier issues:

Comments (2)

  1. Dan Bonachea

    RMA will likely always require trivially serializable types. The reason is that transferring non-trivial types requires active serialization work on the remote side - that entails remote CPU involvement, eliminates the possibility of offload to network hardware and makes the transfer no longer semantically a "memory access" (the MA in RMA).
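
As a rough check of which element types from the question meet this requirement, the standard trait std::is_trivially_copyable can be used. This is a sketch under an assumption: the standard trait stands in for UPC++'s own serializability check, and byte_copyable is a hypothetical helper name.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the RMA requirement (assumption):
// a type whose object representation can be copied byte for byte.
template <typename T>
struct byte_copyable : std::is_trivially_copyable<T> {};

// Candidates 1 and 2: own heap memory, so a raw byte copy is not enough.
static_assert(!byte_copyable<std::string>::value, "std::string is not byte-copyable");
static_assert(!byte_copyable<std::vector<std::string>>::value, "vector of strings is not byte-copyable");

// Candidate 3: the pointer itself is byte-copyable, but its value is only
// meaningful inside the process that produced it.
static_assert(byte_copyable<char*>::value, "char* is byte-copyable");

// Candidates 4 and 5: plain characters and fixed-size character arrays.
static_assert(byte_copyable<char[50]>::value, "char[50] is byte-copyable");
static_assert(byte_copyable<char>::value, "char is byte-copyable");
```

Note that candidate 3 passes the trait yet is still not useful for moving text, since only the pointer value would be transferred, not the characters it points to.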

    If you need to transfer strings you have several choices:

    1. Use RPC which explicitly runs code at the remote side (see this example). RPC argument passing can serialize std::string and other STL containers (and in an upcoming release will also handle user-defined serialization). If the data has non-trivial size, you should consider additionally using view to reduce copy overheads.
    2. Allocate char arrays in the shared heap (as in your ex 4 and 5 above), and perform RMA on upcxx::global_ptr<char> or other trivially serializable types.
    3. Use this extension to place std::string data in the shared heap, call std::string::data to retrieve the char *, then use try_global_ptr to create a upcxx::global_ptr<char> that other ranks can use in RMA.

    Which option is best depends on your situation.

    Hope this helps.

  2. Amin M. Khan reporter

    Thank you @Dan Bonachea for the useful pointers.

    One addition: the solution would also have to extend to user-defined data types containing strings, for example:

    class Doc {
        std::string name;
        std::string type;
        std::string contents;
    };
    
    upcxx::global_ptr<Doc> docs = upcxx::new_array<Doc>(1000);
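
Following the char-array suggestion from the first comment, one possible workaround for a type like Doc is a fixed-capacity counterpart whose fields are plain character arrays. This is a minimal sketch, not a recommendation from the UPC++ developers: FixedDoc, set_field, get_field and the field capacities are all hypothetical, and text longer than a field is silently truncated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical fixed-capacity counterpart of Doc; capacities are arbitrary.
struct FixedDoc {
    char name[64];
    char type[16];
    char contents[1024];
};
static_assert(std::is_trivially_copyable<FixedDoc>::value,
              "FixedDoc must be copyable byte for byte");

// Copy src into a fixed-size field, truncating if needed and zero-filling
// the remainder so the field is always NUL-terminated.
template <std::size_t N>
void set_field(char (&dst)[N], const std::string& src) {
    const std::size_t n = src.size() < N - 1 ? src.size() : N - 1;
    std::memcpy(dst, src.data(), n);
    std::memset(dst + n, 0, N - n);
}

// Read a field back as std::string, stopping at the first NUL byte.
template <std::size_t N>
std::string get_field(const char (&src)[N]) {
    const void* end = std::memchr(src, 0, N);
    const std::size_t len =
        end ? static_cast<std::size_t>(static_cast<const char*>(end) - src) : N;
    return std::string(src, len);
}
```

An array of such records could then be allocated with upcxx::new_array<FixedDoc>(1000) and accessed with upcxx::rput or upcxx::rget, at the cost of wasted space for short strings and a hard limit on long ones.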
    
