upcxx::view broken with asymmetric deserialization

Issue #355 resolved
Dan Bonachea created an issue

The 2020.3.0 spec changed the definition of upcxx::deserializing_iterator<T> to accommodate asymmetric deserialization of user types. However the implementation was not updated, leaving upcxx::view<T> effectively broken when T != deserialized_type_t<T>.

Example:

#include <upcxx/upcxx.hpp>
#include <type_traits>
#include <vector>

#if SYMMETRIC
#define U T // force symmetric deserialization
#else
struct U { // asymmetric deserialization
  int w,x;
};
#endif

struct T { 
  int x,y,z;
  struct upcxx_serialization {
    template<typename Writer>
    static void serialize (Writer& writer, T const & t) {
        writer.write(t.x);
    }   
    template<typename Reader>
    static U* deserialize(Reader& reader, void* storage) {
        int x = reader.template read<int>();
        U *up = new(storage) U();
        up->x = x;
        return up; 
    }   
  };  
};

int main() {
  upcxx::init();

  T t;
  t.x = 42; 
  upcxx::rpc(0, [](const U &u) { 
                  assert(u.x == 42); 
                }, t).wait();


  std::vector<T> vt; 
  for (int i=0; i < 10; i++) vt.push_back(T{42,43,44});
  for (int i=0; i < 10; i++) assert(vt[i].x == 42);
  upcxx::rpc(0, [](upcxx::view<T> myview) { 
                  static_assert(std::is_same<upcxx::deserializing_iterator<T>,
                                             decltype(myview)::iterator>::value, "oops");
                  int cnt = 0;
                  auto it = myview.begin();
                  static_assert(std::is_same<U,  decltype(it)::value_type>::value, "oops");
                  static_assert(std::is_same<U*, decltype(it)::pointer>::value, "oops");
                  static_assert(std::is_same<U,  decltype(it)::reference>::value, "oops");
                  for (; it < myview.end(); it++, cnt++) {
                    U u = *it;
                    assert(u.x == 42); 
                  }   
                  assert(cnt == 10);
                }, upcxx::make_view(vt)).wait();

  upcxx::barrier();
  if (!upcxx::rank_me()) std::cout << "SUCCESS" << std::endl;
  upcxx::finalize();
}

This program works for symmetric T (-DSYMMETRIC) but fails for the default asymmetric on 2020.3.0 and develop @ 016aecf :

asym-view.cpp: In lambda function:
asym-view.cpp:48:77: error: static assertion failed: oops
   48 |                   static_assert(std::is_same<U,  decltype(it)::value_type>::value, "oops");
      |                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~
asym-view.cpp:49:74: error: static assertion failed: oops
   49 |                   static_assert(std::is_same<U*, decltype(it)::pointer>::value, "oops");
      |                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~
asym-view.cpp:50:76: error: static assertion failed: oops
   50 |                   static_assert(std::is_same<U,  decltype(it)::reference>::value, "oops");
      |                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~
asym-view.cpp:52:27: error: conversion from 'T' to non-scalar type 'U' requested
   52 |                     U u = *it;
      |                           ^~~
asym-view.cpp: In function 'int main()':
asym-view.cpp:56:40: error: no matching function for call to 'rpc(int, main()::<lambda(upcxx::view<T>)>, upcxx::view<T, __gnu_cxx::__normal_iterator<const T*, std::vector<T> > >)'
   56 |                 }, upcxx::make_view(vt)).wait();
      |                                        ^
In file included from /usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/dist_object.hpp:7,
                 from /usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/upcxx.hpp:24,
                 from asym-view.cpp:1:
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:259:8: note: candidate: 'template<class Cxs, class Fn, class ... Arg> typename upcxx::detail::rpc_return<Fn(Arg ...), Cxs>::type upcxx::rpc(const upcxx::team&, upcxx::intrank_t, Cxs, Fn&&, Arg&& ...)'
  259 |   auto rpc(const team &tm, intrank_t recipient, Cxs cxs, Fn &&fn, Arg &&...args)
      |        ^~~
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:259:8: note:   template argument deduction/substitution failed:
asym-view.cpp:56:40: note:   candidate expects at least 4 arguments, 3 provided
   56 |                 }, upcxx::make_view(vt)).wait();
      |                                        ^
In file included from /usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/dist_object.hpp:7,
                 from /usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/upcxx.hpp:24,
                 from asym-view.cpp:1:
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:269:8: note: candidate: 'template<class Cxs, class Fn, class ... Arg> typename upcxx::detail::rpc_return<Fn(Arg ...), Cxs>::type upcxx::rpc(upcxx::intrank_t, Cxs, Fn&&, Arg&& ...)'
  269 |   auto rpc(intrank_t recipient, Cxs cxs, Fn &&fn, Arg &&...args)
      |        ^~~
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:269:8: note:   template argument deduction/substitution failed:
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp: In substitution of 'template<class Cxs, class Fn, class ... Arg> typename upcxx::detail::rpc_return<Fn(Arg ...), Cxs>::type upcxx::rpc(upcxx::intrank_t, Cxs, Fn&&, Arg&& ...) [with Cxs = main()::<lambda(upcxx::view<T>)>; Fn = upcxx::view<T, __gnu_cxx::__normal_iterator<const T*, std::vector<T> > >; Arg = {}]':
asym-view.cpp:56:40:   required from here
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:269:8: error: no type named 'type' in 'struct upcxx::detail::rpc_remote_results<upcxx::view<T, __gnu_cxx::__normal_iterator<const T*, std::vector<T> > >(), void>'
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:280:8: note: candidate: 'template<class Fn, class ... Arg> typename upcxx::detail::rpc_return<Fn(Arg ...), upcxx::completions<upcxx::future_cx<upcxx::operation_cx_event, upcxx::progress_level::user> > >::type upcxx::rpc(const upcxx::team&, upcxx::intrank_t, Fn&&, Arg&& ...)'
  280 |   auto rpc(const team &tm, intrank_t recipient, Fn &&fn, Arg &&...args)
      |        ^~~
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:280:8: note:   template argument deduction/substitution failed:
asym-view.cpp:43:14: note:   cannot convert '0' (type 'int') to type 'const upcxx::team&'
   43 |   upcxx::rpc(0, [](upcxx::view<T> myview) {
      |              ^
In file included from /usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/dist_object.hpp:7,
                 from /usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/upcxx.hpp:24,
                 from asym-view.cpp:1:
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:291:8: note: candidate: 'template<class Fn, class ... Arg> typename upcxx::detail::rpc_return<Fn(Arg ...), upcxx::completions<upcxx::future_cx<upcxx::operation_cx_event, upcxx::progress_level::user> > >::type upcxx::rpc(upcxx::intrank_t, Fn&&, Arg&& ...)'
  291 |   auto rpc(intrank_t recipient, Fn &&fn, Arg &&...args)
      |        ^~~
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:291:8: note:   template argument deduction/substitution failed:
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp: In substitution of 'template<class Fn, class ... Arg> typename upcxx::detail::rpc_return<Fn(Arg ...), upcxx::completions<upcxx::future_cx<upcxx::operation_cx_event, upcxx::progress_level::user> > >::type upcxx::rpc(upcxx::intrank_t, Fn&&, Arg&& ...) [with Fn = main()::<lambda(upcxx::view<T>)>; Arg = {upcxx::view<T, __gnu_cxx::__normal_iterator<const T*, std::vector<T, std::allocator<T> > > >}]':
asym-view.cpp:56:40:   required from here
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:291:8: error: no type named 'type' in 'struct upcxx::detail::rpc_remote_results<main()::<lambda(upcxx::view<T>)>(upcxx::view<T, __gnu_cxx::__normal_iterator<const T*, std::vector<T> > >), void>'
In file included from /usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/upcxx.hpp:36,
                 from asym-view.cpp:1:
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/view.hpp: In instantiation of 'static T* upcxx::detail::serialization_view_element<T, false>::deserialize(Reader&, void*) [with Reader = upcxx::detail::serialization_reader; T = T]':
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/view.hpp:39:57:   required from 'T upcxx::deserializing_iterator<T>::operator*() const [with T = T]'
asym-view.cpp:52:28:   required from here
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/view.hpp:314:44: error: cannot convert 'U*' to 'T*' in return
  314 |         return r.template read_into<T>(spot);
      |                                            ^
      |                                            |
      |                                            U*

At least the implementation's member type declarations in upcxx::deserializing_iterator<T> need to be updated - probably other stuff as well.

Comments (7)

  1. Dan Bonachea reporter

    The following change is necessary but NOT sufficient:

    --- a/src/view.hpp
    +++ b/src/view.hpp
    @@ -22,9 +22,9 @@ namespace upcxx {
       class deserializing_iterator {
       public:
         using difference_type = std::ptrdiff_t;
    -    using value_type = T;
    -    using pointer = T*;
    -    using reference = T;
    +    using value_type = typename serialization_traits<T>::deserialized_type;
    +    using pointer = value_type*;
    +    using reference = value_type;
         using iterator_category = std::input_iterator_tag;
    
       private:
    @@ -33,10 +33,10 @@ namespace upcxx {
       public:
         deserializing_iterator(char const *p = nullptr) noexcept: r_(p) {}
    
    -    T operator*() const noexcept {
    +    value_type operator*() const noexcept {
           detail::serialization_reader r1(r_);
    -      detail::raw_storage<T> raw;
    -      detail::serialization_view_element<T>::deserialize(r1, &raw);
    +      detail::raw_storage<value_type> raw;
    +      detail::serialization_view_element<value_type>::deserialize(r1, &raw);
           return raw.value_and_destruct();
         }
    

    This fixes the static assertions and the result type of operator*, but still yields type check errors on the rpc:

    /home/pcp1/bonachea/UPC/upcxx/test/regression/issue355.cpp:60:40: error: no matching function for call to 'rpc(int, main()::<lambda(upcxx::view<T>)>, upcxx::view<T, __gnu_cxx::__normal_iterator<const T*, std::vector<T> > >)'
       60 |                 }, upcxx::make_view(vt)).wait();
          |              
    ...
    /home/pcp1/bonachea/UPC/bxx-gptr/bld/upcxx.assert1.optlev0.dbgsym1.gasnet_seq.smp/include/upcxx/rpc.hpp:291:8: note:   template argument deduction/substitution failed:
    /home/pcp1/bonachea/UPC/bxx-gptr/bld/upcxx.assert1.optlev0.dbgsym1.gasnet_seq.smp/include/upcxx/rpc.hpp: In substitution of 'template<class Fn, class ... Arg> typename upcxx::detail::rpc_return<Fn(Arg ...), upcxx::completions<upcxx::future_cx<upcxx::operation_cx_event, upcxx::progress_level::user> > >::type upcxx::rpc(upcxx::intrank_t, Fn&&, Arg&& ...) [with Fn = main()::<lambda(upcxx::view<T>)>; Arg = {upcxx::view<T, __gnu_cxx::__normal_iterator<const T*, std::vector<T, std::allocator<T> > > >}]':
    /home/pcp1/bonachea/UPC/upcxx/test/regression/issue355.cpp:60:40:   required from here
    /home/pcp1/bonachea/UPC/bxx-gptr/bld/upcxx.assert1.optlev0.dbgsym1.gasnet_seq.smp/include/upcxx/rpc.hpp:291:8: error: no type named 'type' in 'struct upcxx::detail::rpc_remote_results<main()::<lambda(upcxx::view<T>)>(upcxx::view<T, __gnu_cxx::__normal_iterator<const T*, std::vector<T> > >), void>'
    

    Digging further I find what appears to be the root cause, which is the runtime also fails this static assertion for asymmetric T, which underlies the spec design:

      static_assert(std::is_same<upcxx::view<T>,                          
                    upcxx::deserialized_type_t<upcxx::view<T,std::vector<T>::iterator>>>::value, "oops");
    

    I suspect we may need to have a design discussion about this.

  2. Dan Bonachea reporter

    Some further elaboration, based on discussion in our 2020-03-25 meeting:

    Observations

    For the discussion below, assume A is an asymmetric user type, where deserialized_type_t<A> is B.

    • Note 1: This means an A-typed object argument to upcxx::rpc becomes a B-typed object argument to the rpc callback. More specifically, a serialized A deserializes into a B, using the serialization mechanisms defined in the A class (or an ancestor type).
    • Corollary 1: asymmetric user types can only be defined using custom serialization (spec 6.2.3). Therefore the (de)serialzation must be via functions provided in a specialization of upcxx::serialization<A> or a upcxx_serialization class member of A.
    • Corollary 2: Because of corollary 1, A is Serializable but not TriviallySerializable.
    • Note 2: The serialization properties of B are completely irrelevant - they are never invoked when an A is deserialized into a B. In fact, we don't require B to be Serializable at all. We only care that A's deserialzation can instantiate a B.
    • Corollary 3: Because A is not TriviallySerializable, the default view iterator for view<A> (view_default_iterator_t<A>) is deserializing_iterator<A>. This means the serialized representation of A objects in a view is deserialized into B objects on-the-fly and returned-by-value via view<A>::iterator::operator*(), and in particular we are NOT in the random-access view specialization for TriviallySerializable types (regardless of B's type traits).

    Specification

    I believe the spec's current treatment of this topic is consistent with all the above observations. In particular, it defines that deserializing_iterator<T>::value_type = deserialized_type_t <T>, which means that view<A>::iterator::value_type = B, providing the expected asymmetry.

    The spec interpretation implies that a view<A,iter> input to RPC always deserializes into a view<A> result, even for asymmetric A. The spec should probably be clarified to formally state something like:

    deserialized_type_t<view<A,I>> is view<A>, for any Serializable A. This remains true even when deserialized_type_t<A> is not A

    This is notably in contrast to how STL containers are handled (as defined in 6.3), whereby eg.deserialized_type_t<vector<A>> is vector<deserialized_type_t<A>>. We might even highlight these differences in a brief example.

    With a clarification to this effect, I believe the spec's treatment of this topic will be correct and consistent.

    Implementation

    The current implementation seems to assume deserialized_type_t<view<A,I>> is view<deserialized_type_t<A>> (ie view<B>). This interpretation is broken for several reasons when A != B (ie the asymmetric case):

    1. B might not be Serializable, which is a precondition for the view template, making view<B> an ill-formed type.
    2. Even if B happens to be Serializable, in general B doesn't know how to interpret the serialized representation of A. This transfer needs to be using the deserialization provided by A
    3. The default iterator properties defined by the view template should be based on A, not B

    Proposal

    So my current position is this issue should be resolved by clarifying the spec as suggested above and fixing the implementation to match.

    Alternative proposal

    It's possible we could devise a more complicated self-consistent proposal. For example, we could consider a special case behavior for serialization of view<A> when B also happens to be TriviallySerializable whereby the sender both serializes and deserializes each A -> B in the sequence, and then trivially serializes the B elements and sends them so they arrive as a view<B>. This would allow use of the random-access in-place no-op iterator at the target, but has other significant drawbacks - in particular:

    1. It's an even larger departure from how non-view types are handled (and how A would be handled outside a view),
    2. This would only apply for views over asymmetric types where the target type is TriviallySerializable, which is a special case of a special case that is likely to be rare in practice. We'd also still need to explain what happens when B is not TriviallySerializable or not Serializable (the current, more general problem).
    3. If B is a statically larger type than A we've potentially just un-done the bandwidth savings a user was expecting when they chose asymmetric deserialization.

    So I'm not in favor of changing the semantics in that direction.

  3. Amir Kamil

    We do not officially support nested views (spec issue 122), but we do have a test of them in test/view.cpp. That test assumes that view<view<T, Iter1>, Iter2> deserializes as view<view<T>>, which is not compatible with the wording in the spec and Dan’s clarification above. If we do decide to officially support nested views, we will need to tweak the wording to special-case how nested views deserialize.

    I will copy this over comment over to spec issue 122.

  4. john bachan

    Totally new approach:

    I believe we all agree that deserialized_type<view<T,Iter>> == view<deserialized_type<T>> is the aesthetic and intuitive expectation. So to preserve this, which happens to preserve the implementation "as-is" (without PR Issue355), perhaps we give view<T,?> the constraint that serialization_traits<T>::deserialize and serializatoin_traits<deserialized_type<T>>::deserialize must be functionally equivalent. I think we can even make this statically checkable if we strengthen the equivalence such that the two functions must be the same linker symbol (static-assert that their fnptr addresses match). As a user, getting the two deserialize functions to be the same is definitely trickier, but can definitely be done with careful inheritance.

    The limitation this brings is it limits T's on-wire representation to be complete enough to not need knowledge of the pre-serialized type (which is always curable by the user by putting more stuff on to the wire). If we find this too much, we could keep the deserialized type of view<T,?> to be view<T>, but then sfinae permit a conversion cast to view<deserialized_type<T>> if the fnptr's match. This conversion would allow roc's to use view<deserialized_type<T>> as the receiving function argument. But now we're back to the trickeries needed to support nested views. So I'm less in favor of this than just the hard limitation, which as I stated can be cured at some cost of wire bits.

  5. Dan Bonachea reporter

    @john bachan : I'm not sure I understand your proposal. It sounds close to simply prohibiting views over asymmetric types, which I agree side-steps the problem but is a bit like curing the disease by killing the patient. I would like to support asymmetries where the "target type" might not be Serializable at all, meaning serialization_traits<deserialized_type<T>>::deserialize might not even exist, let alone be functionally compatible with something else.

    I think this suggestion is also problematic from a modularity perspective: if deserialized_type<A> == B and B is a non-Serializable type, or has unrelated Serialization code maintained by someone else that "does" something completely different, then this approach also doesn't work. For example, consider big_hairy_type which is an object with alot of state and provides subset serialization to send just a compacted representation (say a universal object name) on-the-wire, deserializing as a std::string (which perhaps can be used to find the object in a registry). There are of course other ways this could be done, but the point is the deserialization code of the target type might not be under this module's control, and the target type might use a different on-the-wire representation which is undocumented or unknown to this module. This restriction would prevent the module from sending a view<big_hairy_type> under the proposed semantics.

    I believe we all agree that deserialized_type<view<T,Iter>> == view<deserialized_type<T>> is the aesthetic and intuitive expectation

    I agree this is the intuitive expectation at first glance, but it doesn't make sense on deeper consideration as explained in my earlier comments. If deserialized_type_t<view<A,I>> is view<deserialized_type_t<A>> (ie view<B>), that's essentially requiring that B somehow knows how to deserialize incoming objects serialized by A. This effectively means that serialize<A> must produce the same byte-wise representation expected by deserialize<B>. This seems problematic when A and B are maintained in separate modules by separate authors, and even more problematic if there are many unrelated types A_1, A_2, A_3 maintained by different authors that all deserialize as a B. This gets even worse if the B author never considered the possibility of some external author wanting to wrap a view around his type, and therefore does not document/guarantee a stable on-the-wire representation to enable adherence to the proposed restriction.

    For modularity reasons, I strongly believe deserialize<B> should not need to worry about external authors for the incoming bytes, and that our design should promote/enforce the property that they could only have been written by the matching serialize<B>. Other classes that want to asymmetrically deserialize producing a B should use their own deserialization code and invoke B's constructors to build a B. This is just reinforcing the design approach we've taken all throughout with respect to asymmetric serialization, and IMHO wrapping objects in a view should not break this property.

  6. Log in to comment