upcxx::view broken with asymmetric deserialization
The 2020.3.0 spec changed the definition of upcxx::deserializing_iterator<T>
to accommodate asymmetric deserialization of user types. However the implementation was not updated, leaving upcxx::view<T>
effectively broken when T != deserialized_type_t<T>
.
Example:
#include <upcxx/upcxx.hpp>
#include <type_traits>
#include <vector>
#if SYMMETRIC
#define U T // force symmetric deserialization
#else
struct U { // asymmetric deserialization
int w,x;
};
#endif
struct T {
int x,y,z;
struct upcxx_serialization {
template<typename Writer>
static void serialize (Writer& writer, T const & t) {
writer.write(t.x);
}
template<typename Reader>
static U* deserialize(Reader& reader, void* storage) {
int x = reader.template read<int>();
U *up = new(storage) U();
up->x = x;
return up;
}
};
};
int main() {
upcxx::init();
T t;
t.x = 42;
upcxx::rpc(0, [](const U &u) {
assert(u.x == 42);
}, t).wait();
std::vector<T> vt;
for (int i=0; i < 10; i++) vt.push_back(T{42,43,44});
for (int i=0; i < 10; i++) assert(vt[i].x == 42);
upcxx::rpc(0, [](upcxx::view<T> myview) {
static_assert(std::is_same<upcxx::deserializing_iterator<T>,
decltype(myview)::iterator>::value, "oops");
int cnt = 0;
auto it = myview.begin();
static_assert(std::is_same<U, decltype(it)::value_type>::value, "oops");
static_assert(std::is_same<U*, decltype(it)::pointer>::value, "oops");
static_assert(std::is_same<U, decltype(it)::reference>::value, "oops");
for (; it < myview.end(); it++, cnt++) {
U u = *it;
assert(u.x == 42);
}
assert(cnt == 10);
}, upcxx::make_view(vt)).wait();
upcxx::barrier();
if (!upcxx::rank_me()) std::cout << "SUCCESS" << std::endl;
upcxx::finalize();
}
This program works for symmetric T (-DSYMMETRIC
) but fails for the default asymmetric on 2020.3.0 and develop @ 016aecf :
asym-view.cpp: In lambda function:
asym-view.cpp:48:77: error: static assertion failed: oops
48 | static_assert(std::is_same<U, decltype(it)::value_type>::value, "oops");
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~
asym-view.cpp:49:74: error: static assertion failed: oops
49 | static_assert(std::is_same<U*, decltype(it)::pointer>::value, "oops");
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~
asym-view.cpp:50:76: error: static assertion failed: oops
50 | static_assert(std::is_same<U, decltype(it)::reference>::value, "oops");
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~
asym-view.cpp:52:27: error: conversion from 'T' to non-scalar type 'U' requested
52 | U u = *it;
| ^~~
asym-view.cpp: In function 'int main()':
asym-view.cpp:56:40: error: no matching function for call to 'rpc(int, main()::<lambda(upcxx::view<T>)>, upcxx::view<T, __gnu_cxx::__normal_iterator<const T*, std::vector<T> > >)'
56 | }, upcxx::make_view(vt)).wait();
| ^
In file included from /usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/dist_object.hpp:7,
from /usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/upcxx.hpp:24,
from asym-view.cpp:1:
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:259:8: note: candidate: 'template<class Cxs, class Fn, class ... Arg> typename upcxx::detail::rpc_return<Fn(Arg ...), Cxs>::type upcxx::rpc(const upcxx::team&, upcxx::intrank_t, Cxs, Fn&&, Arg&& ...)'
259 | auto rpc(const team &tm, intrank_t recipient, Cxs cxs, Fn &&fn, Arg &&...args)
| ^~~
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:259:8: note: template argument deduction/substitution failed:
asym-view.cpp:56:40: note: candidate expects at least 4 arguments, 3 provided
56 | }, upcxx::make_view(vt)).wait();
| ^
In file included from /usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/dist_object.hpp:7,
from /usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/upcxx.hpp:24,
from asym-view.cpp:1:
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:269:8: note: candidate: 'template<class Cxs, class Fn, class ... Arg> typename upcxx::detail::rpc_return<Fn(Arg ...), Cxs>::type upcxx::rpc(upcxx::intrank_t, Cxs, Fn&&, Arg&& ...)'
269 | auto rpc(intrank_t recipient, Cxs cxs, Fn &&fn, Arg &&...args)
| ^~~
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:269:8: note: template argument deduction/substitution failed:
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp: In substitution of 'template<class Cxs, class Fn, class ... Arg> typename upcxx::detail::rpc_return<Fn(Arg ...), Cxs>::type upcxx::rpc(upcxx::intrank_t, Cxs, Fn&&, Arg&& ...) [with Cxs = main()::<lambda(upcxx::view<T>)>; Fn = upcxx::view<T, __gnu_cxx::__normal_iterator<const T*, std::vector<T> > >; Arg = {}]':
asym-view.cpp:56:40: required from here
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:269:8: error: no type named 'type' in 'struct upcxx::detail::rpc_remote_results<upcxx::view<T, __gnu_cxx::__normal_iterator<const T*, std::vector<T> > >(), void>'
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:280:8: note: candidate: 'template<class Fn, class ... Arg> typename upcxx::detail::rpc_return<Fn(Arg ...), upcxx::completions<upcxx::future_cx<upcxx::operation_cx_event, upcxx::progress_level::user> > >::type upcxx::rpc(const upcxx::team&, upcxx::intrank_t, Fn&&, Arg&& ...)'
280 | auto rpc(const team &tm, intrank_t recipient, Fn &&fn, Arg &&...args)
| ^~~
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:280:8: note: template argument deduction/substitution failed:
asym-view.cpp:43:14: note: cannot convert '0' (type 'int') to type 'const upcxx::team&'
43 | upcxx::rpc(0, [](upcxx::view<T> myview) {
| ^
In file included from /usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/dist_object.hpp:7,
from /usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/upcxx.hpp:24,
from asym-view.cpp:1:
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:291:8: note: candidate: 'template<class Fn, class ... Arg> typename upcxx::detail::rpc_return<Fn(Arg ...), upcxx::completions<upcxx::future_cx<upcxx::operation_cx_event, upcxx::progress_level::user> > >::type upcxx::rpc(upcxx::intrank_t, Fn&&, Arg&& ...)'
291 | auto rpc(intrank_t recipient, Fn &&fn, Arg &&...args)
| ^~~
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:291:8: note: template argument deduction/substitution failed:
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp: In substitution of 'template<class Fn, class ... Arg> typename upcxx::detail::rpc_return<Fn(Arg ...), upcxx::completions<upcxx::future_cx<upcxx::operation_cx_event, upcxx::progress_level::user> > >::type upcxx::rpc(upcxx::intrank_t, Fn&&, Arg&& ...) [with Fn = main()::<lambda(upcxx::view<T>)>; Arg = {upcxx::view<T, __gnu_cxx::__normal_iterator<const T*, std::vector<T, std::allocator<T> > > >}]':
asym-view.cpp:56:40: required from here
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:291:8: error: no type named 'type' in 'struct upcxx::detail::rpc_remote_results<main()::<lambda(upcxx::view<T>)>(upcxx::view<T, __gnu_cxx::__normal_iterator<const T*, std::vector<T> > >), void>'
In file included from /usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/upcxx.hpp:36,
from asym-view.cpp:1:
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/view.hpp: In instantiation of 'static T* upcxx::detail::serialization_view_element<T, false>::deserialize(Reader&, void*) [with Reader = upcxx::detail::serialization_reader; T = T]':
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/view.hpp:39:57: required from 'T upcxx::deserializing_iterator<T>::operator*() const [with T = T]'
asym-view.cpp:52:28: required from here
/usr/local/pkg/upcxx-dirac/gcc-9.3.0/nightly-2020.03.23/upcxx.debug.gasnet_seq.smp/include/upcxx/view.hpp:314:44: error: cannot convert 'U*' to 'T*' in return
314 | return r.template read_into<T>(spot);
| ^
| |
| U*
At least the implementation's member type declarations in upcxx::deserializing_iterator<T>
need to be updated - probably other stuff as well.
Comments (7)
-
reporter -
reporter Some further elaboration, based on discussion in our 2020-03-25 meeting:
Observations
For the discussion below, assume A is an asymmetric user type, where
deserialized_type_t<A>
isB
.- Note 1: This means an
A
-typed object argument toupcxx::rpc
becomes aB
-typed object argument to the rpc callback. More specifically, a serializedA
deserializes into aB
, using the serialization mechanisms defined in theA
class (or an ancestor type). - Corollary 1: asymmetric user types can only be defined using custom serialization (spec 6.2.3). Therefore the (de)serialzation must be via functions provided in a specialization of
upcxx::serialization<A>
or aupcxx_serialization
class member ofA
. - Corollary 2: Because of corollary 1,
A
is Serializable but not TriviallySerializable. - Note 2: The serialization properties of
B
are completely irrelevant - they are never invoked when anA
is deserialized into aB
. In fact, we don't requireB
to be Serializable at all. We only care thatA
's deserialzation can instantiate aB
. - Corollary 3: Because
A
is not TriviallySerializable, the default view iterator forview<A>
(view_default_iterator_t<A>
) isdeserializing_iterator<A>
. This means the serialized representation ofA
objects in a view is deserialized intoB
objects on-the-fly and returned-by-value viaview<A>::iterator::operator*()
, and in particular we are NOT in the random-access view specialization for TriviallySerializable types (regardless ofB
's type traits).
Specification
I believe the spec's current treatment of this topic is consistent with all the above observations. In particular, it defines that
deserializing_iterator<T>::value_type = deserialized_type_t <T>
, which means thatview<A>::iterator::value_type = B
, providing the expected asymmetry.The spec interpretation implies that a
view<A,iter>
input to RPC always deserializes into aview<A>
result, even for asymmetricA
. The spec should probably be clarified to formally state something like:deserialized_type_t<view<A,I>>
isview<A>
, for any Serializable A. This remains true even whendeserialized_type_t<A>
is notA
This is notably in contrast to how STL containers are handled (as defined in 6.3), whereby eg.
deserialized_type_t<vector<A>>
isvector<deserialized_type_t<A>>
. We might even highlight these differences in a brief example.With a clarification to this effect, I believe the spec's treatment of this topic will be correct and consistent.
Implementation
The current implementation seems to assume
deserialized_type_t<view<A,I>>
isview<deserialized_type_t<A>>
(ieview<B>
). This interpretation is broken for several reasons whenA != B
(ie the asymmetric case):B
might not be Serializable, which is a precondition for theview
template, makingview<B>
an ill-formed type.- Even if
B
happens to be Serializable, in generalB
doesn't know how to interpret the serialized representation ofA
. This transfer needs to be using the deserialization provided byA
- The default iterator properties defined by the view template should be based on
A
, notB
Proposal
So my current position is this issue should be resolved by clarifying the spec as suggested above and fixing the implementation to match.
Alternative proposal
It's possible we could devise a more complicated self-consistent proposal. For example, we could consider a special case behavior for serialization of
view<A>
when B also happens to be TriviallySerializable whereby the sender both serializes and deserializes each A -> B in the sequence, and then trivially serializes the B elements and sends them so they arrive as aview<B>
. This would allow use of the random-access in-place no-op iterator at the target, but has other significant drawbacks - in particular:- It's an even larger departure from how non-view types are handled (and how
A
would be handled outside a view), - This would only apply for views over asymmetric types where the target type is TriviallySerializable, which is a special case of a special case that is likely to be rare in practice. We'd also still need to explain what happens when B is not TriviallySerializable or not Serializable (the current, more general problem).
- If
B
is a statically larger type thanA
we've potentially just un-done the bandwidth savings a user was expecting when they chose asymmetric deserialization.
So I'm not in favor of changing the semantics in that direction.
- Note 1: This means an
-
We do not officially support nested views (spec issue 122), but we do have a test of them in
test/view.cpp
. That test assumes thatview<view<T, Iter1>, Iter2>
deserializes asview<view<T>>
, which is not compatible with the wording in the spec and Dan’s clarification above. If we do decide to officially support nested views, we will need to tweak the wording to special-case how nested views deserialize.I will copy this over comment over to spec issue 122.
-
Totally new approach:
I believe we all agree that
deserialized_type<view<T,Iter>> == view<deserialized_type<T>>
is the aesthetic and intuitive expectation. So to preserve this, which happens to preserve the implementation "as-is" (without PR Issue355), perhaps we giveview<T,?>
the constraint thatserialization_traits<T>::deserialize
andserializatoin_traits<deserialized_type<T>>::deserialize
must be functionally equivalent. I think we can even make this statically checkable if we strengthen the equivalence such that the two functions must be the same linker symbol (static-assert that their fnptr addresses match). As a user, getting the two deserialize functions to be the same is definitely trickier, but can definitely be done with careful inheritance.The limitation this brings is it limits T's on-wire representation to be complete enough to not need knowledge of the pre-serialized type (which is always curable by the user by putting more stuff on to the wire). If we find this too much, we could keep the deserialized type of
view<T,?>
to beview<T>
, but then sfinae permit a conversion cast toview<deserialized_type<T>>
if the fnptr's match. This conversion would allow roc's to useview<deserialized_type<T>>
as the receiving function argument. But now we're back to the trickeries needed to support nested views. So I'm less in favor of this than just the hard limitation, which as I stated can be cured at some cost of wire bits. -
reporter @john bachan : I'm not sure I understand your proposal. It sounds close to simply prohibiting views over asymmetric types, which I agree side-steps the problem but is a bit like curing the disease by killing the patient. I would like to support asymmetries where the "target type" might not be Serializable at all, meaning
serialization_traits<deserialized_type<T>>::deserialize
might not even exist, let alone be functionally compatible with something else.I think this suggestion is also problematic from a modularity perspective: if
deserialized_type<A> == B
andB
is a non-Serializable type, or has unrelated Serialization code maintained by someone else that "does" something completely different, then this approach also doesn't work. For example, considerbig_hairy_type
which is an object with alot of state and provides subset serialization to send just a compacted representation (say a universal object name) on-the-wire, deserializing as astd::string
(which perhaps can be used to find the object in a registry). There are of course other ways this could be done, but the point is the deserialization code of the target type might not be under this module's control, and the target type might use a different on-the-wire representation which is undocumented or unknown to this module. This restriction would prevent the module from sending aview<big_hairy_type>
under the proposed semantics.I believe we all agree that deserialized_type<view<T,Iter>> == view<deserialized_type<T>> is the aesthetic and intuitive expectation
I agree this is the intuitive expectation at first glance, but it doesn't make sense on deeper consideration as explained in my earlier comments. If
deserialized_type_t<view<A,I>>
isview<deserialized_type_t<A>>
(ieview<B>
), that's essentially requiring thatB
somehow knows how to deserialize incoming objects serialized byA
. This effectively means thatserialize<A>
must produce the same byte-wise representation expected bydeserialize<B>
. This seems problematic whenA
andB
are maintained in separate modules by separate authors, and even more problematic if there are many unrelated typesA_1
,A_2
,A_3
maintained by different authors that all deserialize as aB
. This gets even worse if theB
author never considered the possibility of some external author wanting to wrap aview
around his type, and therefore does not document/guarantee a stable on-the-wire representation to enable adherence to the proposed restriction.For modularity reasons, I strongly believe
deserialize<B>
should not need to worry about external authors for the incoming bytes, and that our design should promote/enforce the property that they could only have been written by the matchingserialize<B>
. Other classes that want to asymmetrically deserialize producing aB
should use their own deserialization code and invokeB
's constructors to build aB
. This is just reinforcing the design approach we've taken all throughout with respect to asymmetric serialization, and IMHO wrapping objects in a view should not break this property. -
- changed status to resolved
Fix view<T> when T has asymmetric deserialization. Fixes
#355.→ <<cset eb308b5c638f>>
-
Merged in akamil/upcxx/issue355 (pull request #235)
Fix view<T> when T has asymmetric deserialization. Fixes
#355.Approved-by: Dan Bonachea dobonachea@lbl.gov
→ <<cset 77b3bf556f5e>>
- Log in to comment
The following change is necessary but NOT sufficient:
This fixes the static assertions and the result type of
operator*
, but still yields type check errors on the rpc:Digging further I find what appears to be the root cause, which is the runtime also fails this static assertion for asymmetric
T
, which underlies the spec design:I suspect we may need to have a design discussion about this.