RFE: Consider adding lifetime extension for LPC function objects

Issue #205 new
Dan Bonachea created an issue

UPC++ currently provides lifetime extension for RPC function objects, but not LPC function objects.

I do NOT currently have a use case in mind for this feature, and only encountered it while cleaning up the minutiae of the LPC specification. However this discrepancy seems like an odd asymmetry and the principle of least surprise would argue for providing lifetime extension here (if and only if it can be done with ZERO increase in runtime overhead for function objects that return non-future types, and near-zero for those returning ready futures).

Demonstration:

#include <upcxx/upcxx.hpp>
#include <iostream>
#include "util.hpp"

using namespace upcxx;

bool done = false;

struct MyFn {
 bool valid = true;
 MyFn() {
   say() << "Default constructor"; 
   check();
 }
 MyFn(MyFn&& that) : valid(that.valid) {
   say() << "Move constructor"; 
   that.valid = false;
   check();
 }
 ~MyFn() { 
   say() << "Destructor"; 
   valid = false; 
 }
 void check() const {
   UPCXX_ASSERT_ALWAYS(valid);
 }
 future<> operator()() {
   check();
   say() << "Running callback"; 
   MyFn *t = this;
   return current_persona().lpc([t]() {
            say() << "Nested callback"; 
            t->check();
            done = true;
          });
 }
};

int main() {
  init();
  print_test_header();

#if 1
  done = false;
  say() << "*** round-trip RPC ***";
  rpc(rank_me(),MyFn()).wait();
  UPCXX_ASSERT_ALWAYS(done);
#endif

#if 1
  done = false;
  say() << "*** one-way RPC ***";
  rpc_ff(rank_me(), MyFn());
  do { progress(); } while (!done);
#endif

#if 1
  done = false;
  say() << "*** round-trip LPC ***";
  current_persona().lpc(MyFn()).wait();
  UPCXX_ASSERT_ALWAYS(done);
#endif

#if 1
  done = false;
  say() << "*** one-way LPC ***";
  current_persona().lpc_ff(MyFn());
  do { progress(); } while (!done);
#endif

  print_test_success(); 

  finalize();
  return 0;
}

Output from 2023.3.0:

Test: lpc_lifetime.cpp
Ranks: 1
[0] *** round-trip RPC ***
[0] Default constructor
[0] Default constructor
[0] Running callback
[0] Nested callback
[0] Destructor
[0] Destructor
[0] *** one-way RPC ***
[0] Default constructor
[0] Destructor
[0] Default constructor
[0] Running callback
[0] Nested callback
[0] Destructor
[0] *** round-trip LPC ***
[0] Default constructor
[0] Move constructor
[0] Move constructor
[0] Destructor
[0] Running callback
[0] Destructor
[0] Nested callback
*** FATAL ERROR (proc 0): 
//////////////////////////////////////////////////////////////////////
UPC++ assertion failure:
 on process 0 (pcp-d-10)
 at lpc_lifetime.cpp:25
 in function: void MyFn::check() const()

Failed condition: valid

To have UPC++ freeze during these errors so you can attach a debugger,
rerun the program with GASNET_FREEZE_ON_ERROR=1 in the environment.
//////////////////////////////////////////////////////////////////////

Comments (2)

  1. Colin MacLean

    Here’s what I see in the spec regarding RPC and object lifetime:

    the lifetime of the buffer and view iterators is extended until that future is readied

    the return value affects the lifetime of the deserialized objects that the UPC++ runtime constructs and passes to the function object.

    The spec doesn’t actually say anything about the lifetime of the callback itself, only the lifetimes of the objects passed to the callback. A conforming implementation may currently invoke a temporary copy of the user’s callback whose lifetime ends immediately after being invoked, even in the RPC case.

    It should also be noted that C++17 required an explicit this and added *this to lambda captures to address this sort of thing, which might help inform how we resolve this issue.

     future<> operator()() {
       check();
       say() << "Running callback"; 
       return current_persona().lpc([*this]() {
                say() << "Nested callback"; 
                this->check();
                done = true;
              });
     }
    

    We probably want to extend the lifetimes of all callbacks returning non-ready futures for both LPCs and RPCs, but the above is the idomatic safe solution to the general case. Unless the user’s callback is huge, relying on lifetime extension in our special case and not capturing *this by value is a tiny optimization and requires a lot of knowledge on the user’s part to do safely. The lifetime issues of capturing this were a big and common enough pitfall for the committee to make the breaking change to lambdas, after all. While I support the lifetime extension, we should probably encourage users to just capture *this by value unless they have a really good reason not to.

  2. Dan Bonachea reporter

    Spec 9.3.1-16: (RPC)
    the return value affects the lifetime of the deserialized objects that the UPC++ runtime constructs and passes to the function object. ... , if the return value is a non-ready future, destruction of the deserialized objects is deferred until after the future becomes ready, allowing the function object to safely initiate further asynchronous computation that operates on those objects

    Colin said: The spec doesn’t actually say anything about the lifetime of the callback itself, only the lifetimes of the objects passed to the callback.

    We disagree about the interpretation of that spec excerpt wrt RPC. The function object is absolutely a "deserialized object that the UPC++ runtime constructs" and it's also implicitly passed to the function object invocation, despite the fact it's not a syntactically explicit argument (which is an irrelevant distinction IMO). This guarantee matters because a lambda function object will commonly include variable captures that can be just as important as the explicit function arguments when it comes to lifetime extension. In any case I think the RPC implementation is already doing this correctly for RPC now, so this aspect is really a non-issue IMO.

    we should probably encourage users to just capture *this by value unless they have a really good reason not to.

    Your proposed change doesn't actually work for my toy example as currently written, because the function object is non-copyable. I think that's a pretty good counterargument, albeit a somewhat contrived one. Also one main point of our lifetime extension feature (for any deserialized object in RPC) is to avoid needless object copying, an optimization which I think applies equally to any non-empty function object.

    In any case we seem to be in agreement that it should not be a high priority to provide lifetime extension for LPC (which is why I simply entered this issue to record the observation but did not add it to any milestone).

  3. Log in to comment