Ensure completion_cx::as_promise(p) works even when p is destroyed prior to fulfillment

Consider the following program:

#include <upcxx/upcxx.hpp>
#include <iostream>
#include <algorithm>

using namespace upcxx;
using namespace std;

future<> fetch_vals(global_ptr<int> gp, int *dst, int cnt) {
#if STATIC_PROMISE
  static 
#endif
  promise<> p;

  for (int i=0;i<cnt;i++) {
    upcxx::rget(gp+i, dst+i, 1, operation_cx::as_promise(p));
  }
  return p.finalize();
}

double stack_writer(double v) {
  if (v < 3) return 1;
  else return stack_writer(v-2) * stack_writer(v-1) * v;
}

int main() {
  upcxx::init();
  int cnt = 100;
  int *vals = new int[cnt]();
  global_ptr<int> gp;

  if (!rank_me()) {
    gp = new_array<int>(cnt);
    int *src = gp.local();
    std::fill(src, src+cnt, 42);
  }
  gp = broadcast(gp, 0).wait();
  future<> f = fetch_vals(gp, vals, cnt);
  cout << stack_writer(10.1) << endl;
  f.wait();

  for (int i=0;i<cnt;i++)
    assert(vals[i] == 42);

  barrier();
  if (!rank_me()) cout << "SUCCESS" << endl;

  upcxx::finalize();
}

This proxy does not compute anything interesting, but the pattern in fetch_vals() is an idiom we'd like to encourage for functions that launch many operations and return a future - specifically, registering all the operations on a single promise costs less overhead than creating and conjoining a bunch of futures, and it's also more convenient in a case like this where the operations are being generated in a localized block of code.

Sadly, this code does not currently work as written, due to a lifetime problem. Running this code on distributed memory generates a crash, eg a SEGV with the following stack:

[1] #22 0x0000000100788f52 in upcxx::detail::persona_tls::fulfill_during_user_of_active<>(upcxx::persona&, upcxx::promise<>&, long) (this=0x60003ac88, per=..., pro=..., anon=1) at /usr/local/upcxx-2019.9.0/upcxx.debug.gasnet_seq.udp/include/upcxx/persona.hpp:659
[1] #23 0x000000010078bc3d in upcxx::backend::fulfill_during_<upcxx::promise<>&>(std::integral_constant<upcxx::progress_level, (upcxx::progress_level)1>, upcxx::promise<>&, long, upcxx::persona&) (pro=..., anon=1, active_per=...) at /usr/local/upcxx-2019.9.0/upcxx.debug.gasnet_seq.udp/include/upcxx/backend.hpp:152
[1] #24 0x000000010078bafe in upcxx::backend::fulfill_during<(upcxx::progress_level)1>(upcxx::promise<>&, long, upcxx::persona&) (pro=..., anon=1, active_per=...) at /usr/local/upcxx-2019.9.0/upcxx.debug.gasnet_seq.udp/include/upcxx/backend.hpp:172
[1] #25 0x000000010078b6db in upcxx::detail::cx_state<upcxx::promise_cx<upcxx::operation_cx_event>, std::tuple<> >::operator()() (this=0x6000d50e8) at /usr/local/upcxx-2019.9.0/upcxx.debug.gasnet_seq.udp/include/upcxx/completion.hpp:383
[1] #26 0x000000010078acfb in upcxx::detail::completions_state_head<true, upcxx::detail::rget_byref_event_values, upcxx::promise_cx<upcxx::operation_cx_event> >::operator_case<>(std::integral_constant<bool, true>) (this=0x6000d50e8) at /usr/local/upcxx-2019.9.0/upcxx.debug.gasnet_seq.udp/include/upcxx/completion.hpp:525
[1] #27 0x000000010078ad67 in upcxx::detail::completions_state_head<true, upcxx::detail::rget_byref_event_values, upcxx::promise_cx<upcxx::operation_cx_event> >::operator()<upcxx::operation_cx_event> (this=0x6000d50e8) at /usr/local/upcxx-2019.9.0/upcxx.debug.gasnet_seq.udp/include/upcxx/completion.hpp:535
[1] #28 0x0000000100789aa8 in upcxx::detail::completions_state<upcxx::detail::event_is_here, upcxx::detail::rget_byref_event_values, upcxx::completions<upcxx::promise_cx<upcxx::operation_cx_event> > >::operator()<upcxx::operation_cx_event> (this=0x6000d50e8) at /usr/local/upcxx-2019.9.0/upcxx.debug.gasnet_seq.udp/include/upcxx/completion.hpp:605
[1] #29 0x000000010078978d in upcxx::detail::rget_cb_byref<upcxx::detail::completions_state<upcxx::detail::event_is_here, upcxx::detail::rget_byref_event_values, upcxx::completions<upcxx::promise_cx<upcxx::operation_cx_event> > >, upcxx::detail::completions_state<upcxx::detail::event_is_remote, upcxx::detail::rget_byref_event_values, upcxx::completions<upcxx::promise_cx<upcxx::operation_cx_event> > > >::execute_and_delete (this=0x6000d50d0) at /usr/local/upcxx-2019.9.0/upcxx.debug.gasnet_seq.udp/include/upcxx/rget.hpp:121
[1] #30 0x000000010078cbe8 in upcxx::backend::gasnet::handle_cb_queue::burst (this=0x100870110 <upcxx::backend::gasnet::master_hcbs>, burst_n=3) at /home/bonachea/UPC/rel/upcxx-2019.9.0/src/backend/gasnet/runtime.cpp:2151
[1] #31 0x000000010041bcfe in upcxx::<lambda(upcxx::persona&)>::operator()(upcxx::persona &) const (__closure=0xffffc930, p=...) at /home/bonachea/UPC/rel/upcxx-2019.9.0/src/backend/gasnet/runtime.cpp:1665
[1] #32 0x000000010041d635 in upcxx::detail::persona_tls::foreach_active_as_top<upcxx::progress(upcxx::progress_level)::<lambda(upcxx::persona&)> >(upcxx::<lambda(upcxx::persona&)> &&) (this=0x60003ac88, fn=...) at /home/bonachea/UPC/rel/upcxx-2019.9.0/src/persona.hpp:768
[1] #33 0x000000010041be82 in upcxx::progress (level=upcxx::progress_level::user) at /home/bonachea/UPC/rel/upcxx-2019.9.0/src/backend/gasnet/runtime.cpp:1660
[1] #34 0x000000010078feaa in upcxx::detail::future_wait_upcxx_progress_user::operator() (this=0xffffcbf7) at /usr/local/upcxx-2019.9.0/upcxx.debug.gasnet_seq.udp/include/upcxx/future/future1.hpp:60
[1] #35 0x000000010078e85f in upcxx::future1<upcxx::detail::future_kind_shref<upcxx::detail::future_header_ops_general>>::wait<-1, upcxx::detail::future_wait_upcxx_progress_user>(upcxx::detail::future_wait_upcxx_progress_user&&) (this=0xffffcbc8, progress=...) at /usr/local/upcxx-2019.9.0/upcxx.debug.gasnet_seq.udp/include/upcxx/future/future1.hpp:218
[1] #36 0x00000001004013e7 in main () at promise-nuke2.cc:39

The problem is that when the promise<> p used to register the operations is a local variable (often the most convenient and elegant choice) it goes out of scope before fulfillment leading to a later crash when the RMA completes and the runtime tries to fulfill.

The simplest workaround is to extend the promise lifetime by promoting it to a static variable (e.g. -DSTATIC_PROMISE in the above code), however this is inelegant and doesn't handle multiple calls. The more localized solution that doesn't suffer the latter deficiency is to inject explicitly managed dynamic allocation to provide the required lifetime, something like this:

future<> fetch_vals(global_ptr<int> gp, int *dst, int cnt) {
  promise<> *p = new promise<>;

  for (int i=0;i<cnt;i++) {
    upcxx::rget(gp+i, dst+i, 1, operation_cx::as_promise(*p));
  }
  return p->finalize().then([p](){ delete p; });
}

However this is non-intuitive (for novices), non-RAII and inserts a dynamic allocation overhead into the critical path. This requirement to explicitly manage the promise lifetime imposes additional complexity into highly asynchronous user code.

As an aside, I don't think this lifetime requirement for operation_cx::as_promise() is adequately addressed by the current spec - it's only by assumption and implication that readers can infer a potential problem here (probably motivates a spec issue to fix that). It's also a departure from the behavior of future, which always behave as a handle to a reference-counted object and have no associated lifetime concerns. Finally, it's worth noting this lifetime problem only applies to the interval until promise fulfillment - once fulfilled, the promise can safely be destroyed and readied futures linked to it will continue to work, EVEN for non-empty promises. This means there is already a level of indirection (at least for non-empty promises) to an internal object containing the result value - which makes it particularly unfortunate that the user is forced to insert a redundant and potentially costly level of indirection.

The point of this issue is that I don't see any fundamental reason why promise needs to continue behaving as it currently does. If promise instead always behaved as a smart-pointer to a reference-counted (hidden) object, even before fulfillment, then the lifetime could be automatically managed and collection would also happen naturally after fulfillment and the destruction of the last reference (which might be held by the runtime itself). I'd like us to discuss/consider the costs and benefits of deploying such a feature upgrade. I think it would be a nice productivity improvement for highly asynchronous codes, enabling a more natural idiom for use of promises. Also, as a semantic relaxation it should be backwards-compatible with existing code.

Ensure completion_cx::as_promise(p) works even when p is destroyed prior to fulfillment

Official response

Comments (19)