Add static_assert to prohibit massive top-level types as arguments to RPC

Issue #336 resolved
Dan Bonachea created an issue

The following bug was discussed in the course of pull request #176 and discussed briefly there. In a nutshell, the current serialization code assumes that objects with a statically known size are small enough to safely store on the program stack. When this assumption is violated by types that are statically large, serialization attempts result in stack overflows (even if the serialized representation is small).

In the 2020-02-26 meeting @john bachan agreed to fix this defect before the code freeze on Wed.

The following program demonstrates the problem, which by default should generate a stack overflow on most systems on develop @c94e2ff:

#include <upcxx/upcxx.hpp>
#include <cassert>
#include <iostream>

/*
 * This example illustrates the use of UPCXX_SERIALIZED_FIELDS to serialize a 
 * reference to a statically massive type.
 *
 * This is an example of subset serialization, in which a proper subset of
 * fields is sent without mutation in order to reduce communication costs.
 * Hence, the full object state is not transmitted and the RPCs transmitting
 * these objects must be aware of this so that they do not try to access
 * invalid/uninitialized state.
 */

#ifndef N
#define N (128 * 1024 * 1024)
#endif

class massive {
    public:
        /*
         * A flag used for testing whether an instance was
         * constructed by UPC++ deserialization logic or by user code.
         */
        bool deserialized;

        int value;
        int scratch_space[N];

        // Default constructor used by UPC++ serialization
        massive() {
            std::cout << "deserialization..." << std::endl;
            deserialized = true;
            scratch_space[N-1] = 27;
        }

        // Constructor to be used from user code
        massive(int val) {
            std::cout << "construction..." << std::endl;
            value = val;
            scratch_space[N-1] = 666;
            deserialized = false;
        }

        UPCXX_SERIALIZED_FIELDS(value)
};

int main(void) {
    upcxx::init();

    int rank = upcxx::rank_me();
    int nranks = upcxx::rank_n();

    massive *r = new massive(42);
    assert(!r->deserialized);

    std::cout << "serialization..." << std::endl;
    upcxx::rpc(0, [] (const massive& r_r) {
                /*
                 * Validate that this is a deserialized instance 
                 * and that the entire object was not sent.
                 */
                assert(r_r.deserialized);
                assert(r_r.value==42);
                assert(r_r.scratch_space[N-1] == 27);
            }, std::move(*r)).wait();

    upcxx::barrier();

    if (rank == 0) std::cout << "SUCCESS" << std::endl;

    upcxx::finalize();

    return 0;
}

Comments (5)

  1. john bachan

    Bad news: this is not a trivial enhancement. The implementation of rpc looks like this:

    rpc(Fn &&fn, Arg &&...arg) {
     // backend::serialize_and_send_am is illustrative only, actual backend func has different name
     // bind creates a new callable object temporary by gluing copies (hopefully moved) of arg and fn together
     backend::serialize_and_send_am(upcxx::bind(std::forward<Fn>(fn), std::forward<Arg>(arg)...));
    }
    

    Originally my hope was to jump into backend::serialize_and_send_am and fix the buffering for massive types to not use the stack. But as we can see, upcxx::bind is going to duplicate the massive type first. So fixing it later will only reduce the stack presence of massive types by 50%.

    Is this still worth pursuing for the 2020/3 release?

    A better fix which will get us the full 100% but I consider out of scope for the release is to add trickery to upcxx::bind so that it (or a variant, and use that in rpc) only glues references to the source objects together as opposed to copies. So long as serialization isn't deferred this is sound and avoids copy constructors for the case when objects weren't passed by &&. This "shallow" bind will use asymmetric serialization to deserialize to the regular value-holding bind so no remote-side code will need to change.

    Other problems with bind: it uses std::tuple to collect the bound arguments. Deserialization of std::tuple has to deserialize the elements out of place in separate buffers, then construct a tuple in a new buffer given references to those objects. I'll have to go into tuple's serialization and add massive buffer logic there too. In fact every container will probably need massive type support, deserializing a std::vector<int[1<<20]> will die for similar reasons.

    These other problems make me think that getting upcxx to be massive type "clean" will be too daunting. Also, massive types are not idiomatic c++ friendly since they violate the mantra that "move is cheap". In fact, the following is a recommended c++ pattern:

    struct my_wrapper {
      some_type wrapped;
      // clearly some_type better not be massive.
      my_wrapper(some_type value): wrapped(std::move(value)) {}
    };
    

    My opinion is that we don't pursue massive-type cleanliness. But we still fix upcxx::bind on references to avoid copying things like containers when we could just serialize out of the provided reference, but deferred until after 2020/3.

  2. Dan Bonachea reporter

    Based on John's analysis and the timing of our release cycle, I agree there should be no change for massive types in this release.

    Post-release we should investigate further what can be done for next release.

  3. Dan Bonachea reporter

    issue #336: Add static_assert to prohibit massive types as arguments to RPC

    Add static assertions to enforce a default size limit of 512 bytes for types passed as top-level arguments to rpc, rpc_ff and remote_cx::as_rpc.

    The correct resolution is for programs to either pass their large-type objects to rpc in a Serializable container (e.g., via upcxx::make_view), otherwise arrange to move the data using RMA (for TriviallySerializable data). However in the interests of backwards compatibility for legacy codes, the limit can also be raised at application compile time via -DUPCXX_MAX_RPC_ARG_SIZE=n.

    Resolves issue #336

    → <<cset 6fd79b63e694>>

  4. Log in to comment