Enh: Lifetime extension for deserialized objects, allowing RPC to return non-ready futures containing refs to args

Issue #367 resolved
Dan Bonachea created an issue

Currently RPC callbacks declared to return a reference type generate an ocean of confusing template detection type errors.

Example:

#include <upcxx/upcxx.hpp>
#include <iostream>
#include <assert.h>

using namespace upcxx;

std::vector<int> myvec{1,2,3}; // global variable
const int &getter(size_t idx) { return myvec[idx]; }
int main() {
  upcxx::init();
  int peer = (rank_me()+1)%rank_n();

  std::vector<int> v1 = rpc(peer, /*UserFn=*/[](){ return std::vector<int>{1,2,3}; }).wait();
  assert(v1[2] == 3);
  std::vector<int> v2 = rpc(peer, /*UserFn=*/[](){ return myvec; }).wait();
  assert(v2[2] == 3);
  std::vector<int> v3 = rpc(peer, /*UserFn=*/[]() -> std::vector<int> const { return myvec; }).wait();
  assert(v3[2] == 3);
  #ifndef WORKAROUND
  std::vector<int> v4 = rpc(peer, /*UserFn=*/[]() -> std::vector<int> const & { return myvec; }).wait();
  assert(v4[2] == 3);
  int v5 = rpc(0, /*UserFn=*/getter, 2).wait();
  assert(v5 == 3);
  barrier();
  std::vector<int> v6 = rpc(peer, /*UserFn=*/[]() -> std::vector<int>&& { return std::move(myvec); }).wait();
  assert(v6[2] == 3);
  #endif

  barrier();
  if (!rank_me()) std::cout << "SUCCESS" << std::endl;
  upcxx::finalize();
  return 0;
}

Example of errors when compiled without -DWORKAROUND, using 2020.3.0 release: (from one of the three problematic lines above)

In file included from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/bind.hpp:6,
                 from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/team_fwd.hpp:4,
                 from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/team.hpp:4,
                 from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/backend.hpp:7,
                 from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/allocate.hpp:8,
                 from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/upcxx.hpp:17,
                 from rpc-return2.cpp:1:
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/serialization.hpp: In instantiation of 'struct upcxx::detail::serialization_traits_deserialized_type<const std::vector<int>&, void>':
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/serialization.hpp:1205:12:   required from 'struct upcxx::detail::serialization_traits2<const std::vector<int>&, false>'
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/serialization.hpp:1267:12:   required from 'struct upcxx::detail::serialization_traits1<const std::vector<int>&>'
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/serialization.hpp:1274:10:   required from 'struct upcxx::serialization_traits<const std::vector<int>&>'
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/serialization.hpp:1375:13:   required from 'struct upcxx::detail::serialization_tuple<std::tuple<const std::vector<int, std::allocator<int> >&>, 0, 1>'
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/serialization.hpp:1453:10:   [ skipping 5 instantiation contexts, use -ftemplate-backtrace-limit=0 to disable ]
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/serialization.hpp:1274:10:   required from 'struct upcxx::serialization_traits<std::tuple<const std::vector<int, std::allocator<int> >&> >'
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/serialization.hpp:67:60:   required from 'constexpr const bool upcxx::is_serializable<std::tuple<const std::vector<int, std::allocator<int> >&> >::value'
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:164:41:   required from 'struct upcxx::detail::rpc_event_values<main()::<lambda()>()>'
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/completion.hpp:893:13:   required from 'struct upcxx::detail::completions_returner_head<upcxx::detail::event_is_here, upcxx::detail::rpc_event_values<main()::<lambda()>()>, upcxx::completions<upcxx::future_cx<upcxx::operation_cx_event, upcxx::progress_level::user> >, void>'
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/completion.hpp:937:12:   required from 'struct upcxx::detail::completions_returner<upcxx::detail::event_is_here, upcxx::detail::rpc_event_values<main()::<lambda()>()>, upcxx::completions<upcxx::future_cx<upcxx::operation_cx_event, upcxx::progress_level::user> > >'
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:189:13:   required from 'struct upcxx::detail::rpc_return<main()::<lambda()>(), upcxx::completions<upcxx::future_cx<upcxx::operation_cx_event, upcxx::progress_level::user> >, std::tuple<const std::vector<int, std::allocator<int> >&> >'
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:291:8:   required by substitution of 'template<class Fn, class ... Arg> typename upcxx::detail::rpc_return<Fn(Arg ...), upcxx::completions<upcxx::future_cx<upcxx::operation_cx_event, upcxx::progress_level::user> > >::type upcxx::rpc(upcxx::intrank_t, Fn&&, Arg&& ...) [with Fn = main()::<lambda()>; Arg = {}]'
rpc-return2.cpp:20:96:   required from here
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/serialization.hpp:1185:42: error: 'deserialize' is not a member of 'upcxx::serialization<const std::vector<int>&>'
 1185 |             serialization<T>::deserialize(std::declval<detail::serialization_reader&>(), nullptr)
      |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/serialization.hpp: In instantiation of 'struct upcxx::detail::serialization_traits_deserialized_type<std::tuple<const std::vector<int, std::allocator<int> >&>, void>':
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/serialization.hpp:1267:12:   recursively required by substitution of 'template<class T> struct upcxx::detail::serialization_traits_deserialized_value<T, typename std::conditional<true, void, typename upcxx::detail::serialization_traits2<T>::deserialized_type (*)()>::type> [with T = std::tuple<const std::vector<int, std::allocator<int> >&>]'
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/serialization.hpp:1267:12:   required from 'struct upcxx::detail::serialization_traits1<std::tuple<const std::vector<int, std::allocator<int> >&> >'
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/serialization.hpp:1274:10:   required from 'struct upcxx::serialization_traits<std::tuple<const std::vector<int, std::allocator<int> >&> >'
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/serialization.hpp:67:60:   required from 'constexpr const bool upcxx::is_serializable<std::tuple<const std::vector<int, std::allocator<int> >&> >::value'
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:164:41:   required from 'struct upcxx::detail::rpc_event_values<main()::<lambda()>()>'
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/completion.hpp:893:13:   required from 'struct upcxx::detail::completions_returner_head<upcxx::detail::event_is_here, upcxx::detail::rpc_event_values<main()::<lambda()>()>, upcxx::completions<upcxx::future_cx<upcxx::operation_cx_event, upcxx::progress_level::user> >, void>'
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/completion.hpp:937:12:   required from 'struct upcxx::detail::completions_returner<upcxx::detail::event_is_here, upcxx::detail::rpc_event_values<main()::<lambda()>()>, upcxx::completions<upcxx::future_cx<upcxx::operation_cx_event, upcxx::progress_level::user> > >'
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:189:13:   required from 'struct upcxx::detail::rpc_return<main()::<lambda()>(), upcxx::completions<upcxx::future_cx<upcxx::operation_cx_event, upcxx::progress_level::user> >, std::tuple<const std::vector<int, std::allocator<int> >&> >'
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:291:8:   required by substitution of 'template<class Fn, class ... Arg> typename upcxx::detail::rpc_return<Fn(Arg ...), upcxx::completions<upcxx::future_cx<upcxx::operation_cx_event, upcxx::progress_level::user> > >::type upcxx::rpc(upcxx::intrank_t, Fn&&, Arg&& ...) [with Fn = main()::<lambda()>; Arg = {}]'
rpc-return2.cpp:20:96:   required from here
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/serialization.hpp:1185:42: error: 'deserialize' is not a member of 'upcxx::serialization<std::tuple<const std::vector<int, std::allocator<int> >&> >'
In file included from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/dist_object.hpp:7,
                 from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/upcxx.hpp:24,
                 from rpc-return2.cpp:1:
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp: In instantiation of 'struct upcxx::detail::rpc_event_values<main()::<lambda()>()>':
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/completion.hpp:893:13:   required from 'struct upcxx::detail::completions_returner_head<upcxx::detail::event_is_here, upcxx::detail::rpc_event_values<main()::<lambda()>()>, upcxx::completions<upcxx::future_cx<upcxx::operation_cx_event, upcxx::progress_level::user> >, void>'
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/completion.hpp:937:12:   required from 'struct upcxx::detail::completions_returner<upcxx::detail::event_is_here, upcxx::detail::rpc_event_values<main()::<lambda()>()>, upcxx::completions<upcxx::future_cx<upcxx::operation_cx_event, upcxx::progress_level::user> > >'
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:189:13:   required from 'struct upcxx::detail::rpc_return<main()::<lambda()>(), upcxx::completions<upcxx::future_cx<upcxx::operation_cx_event, upcxx::progress_level::user> >, std::tuple<const std::vector<int, std::allocator<int> >&> >'
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:291:8:   required by substitution of 'template<class Fn, class ... Arg> typename upcxx::detail::rpc_return<Fn(Arg ...), upcxx::completions<upcxx::future_cx<upcxx::operation_cx_event, upcxx::progress_level::user> > >::type upcxx::rpc(upcxx::intrank_t, Fn&&, Arg&& ...) [with Fn = main()::<lambda()>; Arg = {}]'
rpc-return2.cpp:20:96:   required from here
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:164:41: error: static assertion failed: rpc return values must be Serializable.
  164 |         is_serializable<results_tuple>::value,
      |                                         ^~~~~
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:175:13: error: lookup of 'deserialized_type' in 'upcxx::serialization_traits<std::tuple<const std::vector<int, std::allocator<int> >&> >' is ambiguous
  175 |       using tuple_t = typename std::conditional<
      |             ^~~~~~~
In file included from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/bind.hpp:6,
                 from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/team_fwd.hpp:4,
                 from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/team.hpp:4,
                 from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/backend.hpp:7,
                 from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/allocate.hpp:8,
                 from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/upcxx.hpp:17,
                 from rpc-return2.cpp:1:
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/serialization.hpp:1456:11: note: candidates are: 'using deserialized_type = class std::tuple<typename upcxx::serialization_traits<T>::deserialized_type ...>'
 1456 |     using deserialized_type = std::tuple<
      |           ^~~~~~~~~~~~~~~~~
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/serialization.hpp:1183:13: note:                 'using deserialized_type = typename std::remove_pointer<decltype (upcxx::serialization<T>::deserialize(declval<upcxx::detail::serialization_reader&>(), nullptr))>::type'
 1183 |       using deserialized_type = typename std::remove_pointer<
      |             ^~~~~~~~~~~~~~~~~
rpc-return2.cpp: In function 'int main()':
rpc-return2.cpp:20:96: error: no matching function for call to 'rpc(int&, main()::<lambda()>)'
   20 |   std::vector<int> v4 = rpc(peer, /*UserFn=*/[]() -> std::vector<int> const & { return myvec; }).wait();
      |                                                                                                ^
In file included from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/dist_object.hpp:7,
                 from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/upcxx.hpp:24,
                 from rpc-return2.cpp:1:
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:259:8: note: candidate: 'template<class Cxs, class Fn, class ... Arg> typename upcxx::detail::rpc_return<Fn(Arg ...), Cxs>::type upcxx::rpc(const upcxx::team&, upcxx::intrank_t, Cxs, Fn&&, Arg&& ...)'
  259 |   auto rpc(const team &tm, intrank_t recipient, Cxs cxs, Fn &&fn, Arg &&...args)
      |        ^~~
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:259:8: note:   template argument deduction/substitution failed:
rpc-return2.cpp:20:96: note:   candidate expects at least 4 arguments, 2 provided
   20 |   std::vector<int> v4 = rpc(peer, /*UserFn=*/[]() -> std::vector<int> const & { return myvec; }).wait();
      |                                                                                                ^
In file included from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/dist_object.hpp:7,
                 from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/upcxx.hpp:24,
                 from rpc-return2.cpp:1:
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:269:8: note: candidate: 'template<class Cxs, class Fn, class ... Arg> typename upcxx::detail::rpc_return<Fn(Arg ...), Cxs>::type upcxx::rpc(upcxx::intrank_t, Cxs, Fn&&, Arg&& ...)'
  269 |   auto rpc(intrank_t recipient, Cxs cxs, Fn &&fn, Arg &&...args)
      |        ^~~
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:269:8: note:   template argument deduction/substitution failed:
rpc-return2.cpp:20:96: note:   candidate expects at least 3 arguments, 2 provided
   20 |   std::vector<int> v4 = rpc(peer, /*UserFn=*/[]() -> std::vector<int> const & { return myvec; }).wait();
      |                                                                                                ^
In file included from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/dist_object.hpp:7,
                 from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/upcxx.hpp:24,
                 from rpc-return2.cpp:1:
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:280:8: note: candidate: 'template<class Fn, class ... Arg> typename upcxx::detail::rpc_return<Fn(Arg ...), upcxx::completions<upcxx::future_cx<upcxx::operation_cx_event, upcxx::progress_level::user> > >::type upcxx::rpc(const upcxx::team&, upcxx::intrank_t, Fn&&, Arg&& ...)'
  280 |   auto rpc(const team &tm, intrank_t recipient, Fn &&fn, Arg &&...args)
      |        ^~~
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:280:8: note:   template argument deduction/substitution failed:
rpc-return2.cpp:20:96: note:   candidate expects at least 3 arguments, 2 provided
   20 |   std::vector<int> v4 = rpc(peer, /*UserFn=*/[]() -> std::vector<int> const & { return myvec; }).wait();
      |                                                                                                ^
In file included from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/dist_object.hpp:7,
                 from /usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/upcxx.hpp:24,
                 from rpc-return2.cpp:1:
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:291:8: note: candidate: 'template<class Fn, class ... Arg> typename upcxx::detail::rpc_return<Fn(Arg ...), upcxx::completions<upcxx::future_cx<upcxx::operation_cx_event, upcxx::progress_level::user> > >::type upcxx::rpc(upcxx::intrank_t, Fn&&, Arg&& ...)'
  291 |   auto rpc(intrank_t recipient, Fn &&fn, Arg &&...args)
      |        ^~~
/usr/local/pkg/upcxx-dirac/gcc-10.1.0/stable-2020.3.0/upcxx.debug.gasnet_seq.smp/include/upcxx/rpc.hpp:291:8: note:   substitution of deduced template arguments resulted in errors seen above

The problem is the current runtime fails to type check RPC callbacks returning reference types. As shown in the example this can arise with lambda callbacks declaring a trailing return type including a top-level reference, or when invoking a previously declared function with such a return type.

We need to decide whether or not to allow this pattern, and take appropriate actions.

Nothing in the current spec would lead users to understand why this code is problematic, so if the consensus decision is to prohibit such calls, then we need to specify that. In such case we'd ideally also add some static_asserts to give a more user-friendly type error explaining what the user did wrong.

If we decide to allow these, then we need to repair the implementation and add sufficient tests to convince ourselves it is correct.

Comments (7)

  1. Dan Bonachea reporter

    Response from @john bachan copied from pull request #178:

    After looking into it, permitting return types with & has thorns. Consider:

    rpc(0, [](T const &x)->T const& {
     return x;
    }, T(...));
    

    To permit this, we have to guarantee that deserialized values live at least as long as the RPCs return value. So that seems like an easy thing for the implementation to do, but futures sneak in and complicate things. Futures can appear in two places along an RPCs path, when deserializing the arguments (dist_object), or if the user returns a future from the lambda. The implementation is smart, for each of those two places it generates the best code possible depending on if futures are necessary. Consider:

    rpc(0, [](int const &x)->future<int const&> {
     return make_future<int const&>(x);
    }, 0xbeef);
    

    Ignoring the &, consider the best way to handle this. Because deserializing doesn’t involve dist_object, no futures sit between deserialization and lambda invocation. So the deserialized values can sit on the stack and are then passed into the lambda immediately. But the because the lambda returns a future, firing the reply message happens asynchronously and has to be tracked on the heap. But by that time our deserialized inputs have been reaped by stack frame exit. So that & now references garbage.

    Options that occur to me:

    1. Disable the no-future optimizations and just keep everything on the heap. 😞
    2. Disable the no-future optimizations but only in the presence of T& return types. Not bullet proof, consider how std::tuple<T&> could be missed. This type-sniffing complicates the implementation for what I see as little added value.
    3. Keep doing what we’re doing and:
      1. Spec the conditions that cause rpc args to live as long as the return value (no dist_objects and no future return). I have confirmed that the implementation does adhere to this.
      2. Spec the conservative thing, that rpc args live no longer than the lambdas invocation frame and can be outlived by the lambda’s return.

    I see no viable potential in 1 or 2. I don’t like 3.a because it involves tough spec work. So 3.b seems like the winner. Given that referencing the args is off the table, do we still permit & return types? Its definitely safer not to. But it does mean the “fetch” pattern incurs a possibly unnecessary copy:

    struct bigstate { int a[1000]; };
    bigstate my_state;
    
    void foo() {
     // returning reference permits reply serialization
     // to happen directly out of my_state global var.
     future<bigstate> peer_state = rpc(0, []()->bigstate& { return my_state; });
    }
    

    Ultimately I don’t see enough benefit in permitting returned & to outweigh the safety issues. Users can always code up their own request/reply with nested rpc_ff. Or they can move to RMA which is what we always recommend when communication volume becomes significant.

  2. Dan Bonachea reporter

    @john bachan - IIUC, the optimization you describe of keeping deserialized RPC arguments on the stack to avoid malloc overhead is exactly the design decision responsible for issue #336 where we overflow the stack given statically large RPC arguments. If this is correct, I think it means we now have two independent reasons to support a conditional "slow path" that deserializes things to the heap - to ensure correct lifetime behavior for RPC callbacks returning a future containing a reference to an argument, and to enable statically large types as RPC arguments without stack overflow crashes.

    I think type information should always be sufficient to tell us if the input or output types are statically large, or if the callback's declared return type is a future (although not if it's trivially ready). I think it's rare for RPC callbacks to deliberately return trivially ready futures, since it's always possible to write things another way (often with less typing); so I don't see it as a big problem if we adopt that to be our (more general) signal that input arguments should be heap-allocated and ensure they remain live until the return value is readied and serialized. I think this is similar to your "3a" but doesn't sound that hard to spec (esp if we leave out mention of dist_object) - it's basically the same verbiage we already apply for view-buffer lifetime extension, but instead applies to deserialized input arguments that are not moved away by the callback.

    Here is some strawman phrasing: (based on the ominously enumerated view buffer lifetime paragraph 6.6-6)

    The lifetime of the deserialized arguments to an RPC callback on the target is usually restricted to the duration of the RPC callback, expiring upon return. In this default case, any deserialized argument objects accepted by reference must be processed or moved/copied elsewhere before the RPC returns. However, if the RPC returns a future type, then the lifetime of the deserialized arguments is extended until that future is readied and the return value is serialized. This allows an RPC to accept inputs by reference and initiate an asynchronous operation to consume the objects, and as long as the resulting future is returned from the RPC, the objects will remain valid until the asynchronous operation is complete and the future readied. It also permits RPC callbacks to safely embed references accepted as arguments into the future return value for use in serializing the result returned to the initiator. Lifetime extension applies to all RPC variants, including rpc_ff and as_rpc where the return value is not made available to user code (although those cases do not serialize a result, so the lifetime ends at future readying).

    Some examples:

    upcxx::rpc(target, [](std::vector<int> const &a1, std::string&& a2) {
      future <> some_fut = do_something(a1, a2); // do something asynchronous
      // return a future that serializes the modified arguments and sends them back
      return upcxx::when_all(some_fut, upcxx::make_future<std::vector<int> const &, std::string&&>(a1, a2));
    }
    
    upcxx::rpc(target, [](std::vector<int> const &a1, std::string&& a2) {
      // hand-off processing to another thread
      // and return a future readied by LPC acknowledgment
      return lpc(peer_persona, [](std::vector<int> const &a1, std::string&& a2) { ... }, a1, a2);
    }
    

    This proposal has the benefit of symmetry with view buffer lifetimes, extending that behavior to all objects materialized by deserialization. After all, now that we have fully general Serialization, I see no strong reason why view should be unique in reaping this benefit, as many of the same arguments applied to view can also apply here; ie users may have heaviweight Serializable inputs to the RPC that they wish to hand off to other personas for processing, without unnecessary data copies (or possibly even moves, in the case of statically large types). These objects might not look anything like an array, eg including graph-like structures or other collections of non-uniform objects, making view an unsuitable container. The fact that general deserialization may have generated a "rich", non-contiguous and/or dynamically-sized object in memory IMO makes an even stronger argument, because RMA might not directly apply as alternative data transfer mechanism (even if one is willing to pay the cost of establishing and synchronizing a landing zone).

    Of course even for this proposed conditional "slow path", nothing should prevent us from deserializing all of the RPC arguments into one contiguous heap buffer, so the incremental cost of this feature could theoretically be tuned down to a single heap allocation per RPC invocation. I'd estimate the latency cost of this would be lost in the noise for any RPC crossing the I/O bus, and possibly not even discernable in the latency of RPCs following a PSHM bypass path. It's also worth noting that every rpc() injection (regardless of callback signature) defaults to incuring a heap allocation on the initiator for the future handle, so our design of RPC already assumes that the allocator is an acceptable cost in the rpc critical path.

    Thoughts?

  3. Dan Bonachea reporter

    In the 2020-05-20 meeting, John indicated verbal agreement with my proposal above to generalize the RPC view buffer lifetime extension feature to encompass all deserialized arguments to RPC in callbacks statically returning a future. He noted it may even be possible to avoid additional heap allocation by reserving sufficient space in the progress queue entry to hold the deserialized arguments, since their size is statically known.

    Tangentially relevant: The consensus in spec issue 159 appears to be that reference types should be permitted in RPC return types (and input arguments) and they should be serialized via unboxing, ie:

    deserialized_type_t<T /* cv qualifiers */ &> and deserialized_type_t<T /* cv qualifiers */ &&> are both deserialized_type_t<T>.

    Taken together, I think this means that code such as this example:

    upcxx::rpc(target, [](std::vector<int> const &a1, std::string&& a2) {
      future <> some_fut = do_something(a1, std::move(a2)); // do something asynchronous
      // return a future that serializes the modified arguments and sends them back
      return when_all(some_fut, 
          make_future<std::vector<int> const &, std::string&&>(a1, std::move(a2)));
    }
    

    will be valid (argument lifetimes are extended until after the returned future is readied and serialized) and the rpc would return a future<std::vector<int>, std::string> to the initiator (references stripped out) containing the eventual result. This seems to fulfill the principle of least surprise, in addition to hopefully enabling reduction of unnecessary data copies for RPC argument data.

  4. Dan Bonachea reporter

    In the 2020-07-29 meeting we resolved to adopt this work in several stages:

    1. Clarify the specification to specify that:
      1. For RPCs returning a non-future type or a ready future, the lifetime of input arguments passed by reference to RPC ends after serialization of the result (which IS permitted to include references to the input arguments).
      2. For RPCs returning a NON-ready future, the lifetime of input arguments passed by reference to RPC ends after return from the callback, meaning the returned non-ready future MAY NOT include references to the input arguments (unless they were moved out into other locations).
    2. Later, implement and specify a lifetime extension semantic for the second case, modeled after view buffer lifetime extension
      1. My comment above provides a good starting point for spec text
      2. John has indicated that command.hpp can be updated for the case of callbacks statically returning a future to deserialize input arguments into the same heap-allocated chunk that is used to handle callbacks taking a dist_object or view argument.
  5. Dan Bonachea reporter

    pull request #237 enables the return of references from RPC callbacks and the spec has been updated accordingly (stage 1 deployment from my previous comment)

    The spec/implementation still prohibit RPCs that return non-ready futures from including references to input arguments in that future. This issue remains open (with an updated description) as an enhancement request to deploy that stage 2 improvement.

  6. Log in to comment