Motivation

Some programming models, such as SYCL, require periodic status checks for asynchronous events such as kernel completion. This can be accomplished by repeatedly adding LPCs to the UPC++ event loop, but the re-queue behavior must be written manually by the user and incurs many unnecessary heap allocations of lpc_base-derived objects. To provide a performant solution, I propose adding a persistent_lpc type of LPC that adds itself back into the LPC inbox rather than deleting itself in execute_and_delete().

upcxx::future<> fut = upcxx::rpc(...,[](...) {
  sycl::event ev = /* kernel launch */;
  upcxx::promise<> pro;
  upcxx::current_persona().persistent_lpc_ff([ev,pro]() {
    if (ev.get_info<sycl::info::event::command_execution_status>()
           == sycl::info::event_command_status::complete) {
      pro.finalize();
      return true;
    } else return false;
  });
  return pro.get_future();
});

Concepts

A persistent LPC has two main goals:

To execute until a "done" condition is met
To fulfill a promise when data becomes available

A basic persistent LPC Callable is simple: it takes no arguments and returns a value convertible to bool that indicates whether it is done. The definition below uses C++20 concepts for conciseness and precision of terms, but is implementable using C++11 SFINAE:

template<typename C>
concept PersistentLpcCallable = requires(C c) {
  { c() } -> std::convertible_to<bool>;
};
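
To illustrate the claim that the concept is expressible without C++20, here is a minimal C++11 sketch of an equivalent detection trait. The name is_persistent_lpc_callable and the three example callables are hypothetical and exist only for this illustration; an actual implementation may detect the signature differently.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical C++11 trait mirroring the PersistentLpcCallable concept:
// true when an lvalue of type C can be called with no arguments and the
// result is convertible to bool.
template<typename C, typename = void>
struct is_persistent_lpc_callable : std::false_type {};

template<typename C>
struct is_persistent_lpc_callable<
    C,
    typename std::enable_if<
        std::is_convertible<decltype(std::declval<C&>()()), bool>::value
    >::type
> : std::true_type {};

// Example callables used only to exercise the trait.
struct counts_to_ten {
  int count = 0;
  bool operator()() { return ++count >= 10; }
};
struct returns_void   { void operator()() {} };
struct needs_argument { bool operator()(int) { return true; } };

static_assert(is_persistent_lpc_callable<counts_to_ten>::value,
              "no arguments, bool result: accepted");
static_assert(!is_persistent_lpc_callable<returns_void>::value,
              "void is not convertible to bool: rejected");
static_assert(!is_persistent_lpc_callable<needs_argument>::value,
              "not callable without arguments: rejected");
```

A trait like this could guard persistent_lpc_ff through std::enable_if, so that an unsuitable Callable is rejected at the call site rather than deep inside the implementation.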

Getting data into a persistent LPC differs from a normal LPC or RPC. Because the Callable runs many times, passing its state as arguments on every invocation would be clunky; the state is more elegantly kept in the captures of a mutable lambda:

int count = 0;
persona.persistent_lpc_ff([count]() mutable {
  return ++count >= 10;
});

Likewise, getting data out of a persistent LPC is harder than with a one-off LPC or RPC. Returning the result values alongside the status, for example as std::tuple<bool,Ts...>, would force every invocation before the "done" state to construct and return a potentially large tuple of meaningless values. Therefore, the user must be given a promise to fulfill. There are two ways to do this: either the user creates the promise themselves and captures a copy in the lambda, or an overload passes a promise as an argument to the persistent LPC Callable:

template<typename C, typename... Ts>
concept PromisePersistentLpcCallable = requires(C c) {
  { c(std::declval<upcxx::promise<Ts...>&>()) } -> std::convertible_to<bool>;
};

Fire-and-forget Persistent LPCs

void upcxx::persona::persistent_lpc_ff(PersistentLpcCallable auto&& c);

The fire-and-forget version of persistent LPCs is straightforward because the implementation does not need to manage any promises or futures. In execute_and_delete(), a persistent LPC that is not yet in a "done" state would simply add itself back into the queue rather than deleting itself. Promises and futures can still be used through lambda captures:

int count = 0;
upcxx::promise<int> pro;
upcxx::future<int> fut = pro.get_future();
persona.persistent_lpc_ff([count,pro]() mutable {
  if (++count >= 10) {
    pro.fulfill_result(count);
    return true;
  } else return false;
});
int res = fut.wait();
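
The re-enqueue behavior described above can be modeled without UPC++. The sketch below is not the actual runtime code: lpc_base here is a simplified stand-in, and persistent_lpc_impl, the free function persistent_lpc_ff, progress, and passes_until_done are hypothetical names introduced for illustration. It assumes the inbox behaves like a FIFO queue and that each progress pass runs every queued LPC once.

```cpp
#include <cstddef>
#include <deque>
#include <utility>

// Simplified stand-in for the runtime's LPC base class (an assumption,
// not the real UPC++ definition).
struct lpc_base {
  virtual ~lpc_base() = default;
  virtual void execute_and_delete(std::deque<lpc_base*>& inbox) = 0;
};

template<typename Fn>
struct persistent_lpc_impl final : lpc_base {
  Fn fn;
  explicit persistent_lpc_impl(Fn f) : fn(std::move(f)) {}

  void execute_and_delete(std::deque<lpc_base*>& inbox) override {
    if (fn()) {
      delete this;            // "done": release the single allocation
    } else {
      inbox.push_back(this);  // not done: re-queue the same object
    }
  }
};

// Mock of persona::persistent_lpc_ff: one heap allocation for the whole
// lifetime of the persistent LPC, however many times it runs.
template<typename Fn>
void persistent_lpc_ff(std::deque<lpc_base*>& inbox, Fn fn) {
  inbox.push_back(new persistent_lpc_impl<Fn>(std::move(fn)));
}

// Mock of one progress pass: run each LPC that was queued at entry once.
void progress(std::deque<lpc_base*>& inbox) {
  for (std::size_t n = inbox.size(); n > 0; --n) {
    lpc_base* lpc = inbox.front();
    inbox.pop_front();
    lpc->execute_and_delete(inbox);
  }
}

// Demo: a counting persistent LPC that reports "done" on its limit-th call.
// Returns the number of progress passes needed to drain the inbox.
int passes_until_done(int limit) {
  std::deque<lpc_base*> inbox;
  int count = 0;
  persistent_lpc_ff(inbox, [count, limit]() mutable { return ++count >= limit; });
  int passes = 0;
  while (!inbox.empty()) {
    progress(inbox);
    ++passes;
  }
  return passes;
}
```

In this model, passes_until_done(10) takes ten passes while performing a single allocation, whereas the manual approach would allocate a fresh LPC object on every re-queue.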

Persistent LPCs with implementation-controlled promises

I experimented with two different designs for persona::persistent_lpc() in which the implementation creates the promise and future. The first returns a data-less future that the implementation marks complete when the Callable reaches a "done" state.

upcxx::future<> upcxx::persona::persistent_lpc(PersistentLpcCallable auto&& c);

Example:

int count = 0;
upcxx::future<> fut = persona.persistent_lpc([count]() mutable {
  return ++count >= 10;
});
fut.wait();

This is nice and convenient, but there isn't much of an advantage when the user needs to fulfill the promise with data themselves.

If we had

template<typename... Ts, typename C>
upcxx::future<Ts...> upcxx::persona::persistent_lpc(C&& c)
  requires PromisePersistentLpcCallable<C, Ts...>;

the usage wouldn't be much more concise than if the user created the promise themselves, even with C++14:

upcxx::future<int> fut = persona.persistent_lpc<int>([count = 0](auto& pro) mutable {
  if (++count >= 10) {
    pro.fulfill_result(count);
    return true;
  } else return false;
});
int res = fut.wait();

This only saves two lines over a user-managed promise with the fire-and-forget version. Using the data-less persona::persistent_lpc(PersistentLpcCallable) to return data would be redundant as well, since the user would have to manage a second future for the data. A non-fire-and-forget version of persistent LPCs is therefore only useful when no data needs to be returned. As such, I reason that only fire-and-forget persistent LPCs should be provided by UPC++.

Proposal

Add the following member function to upcxx::persona:

template<typename C>
void upcxx::persona::persistent_lpc_ff(C&& c);

Here C is a Callable taking no arguments and returning a type convertible to bool; it is executed repeatedly until it returns true. Keeping the _ff suffix leaves room to add the managed-promise versions if a need is ever demonstrated. Implementing fire-and-forget persistent LPCs requires minimal effort, whereas the managed-promise versions require a fair amount of SFINAE to make the overloads work without ambiguity errors.
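
To give a sense of the SFINAE involved, the following is a rough C++11 sketch of how two managed-promise overloads might be told apart by the Callable's signature. Everything in it is hypothetical: mock_promise stands in for upcxx::promise, and takes_nothing, takes_promise, which_overload, and the two example callables are illustrative names rather than proposed API.

```cpp
#include <type_traits>
#include <utility>

// Stand-in for upcxx::promise<Ts...>; only its type matters here.
template<typename... Ts> struct mock_promise {};

// Detects callables usable as bool().
template<typename C, typename = void>
struct takes_nothing : std::false_type {};
template<typename C>
struct takes_nothing<C, typename std::enable_if<
    std::is_convertible<decltype(std::declval<C&>()()), bool>::value
  >::type> : std::true_type {};

// Detects callables usable as bool(P&), where P is the promise type.
template<typename C, typename P, typename = void>
struct takes_promise : std::false_type {};
template<typename C, typename P>
struct takes_promise<C, P, typename std::enable_if<
    std::is_convertible<
      decltype(std::declval<C&>()(std::declval<P&>())), bool>::value
  >::type> : std::true_type {};

enum class overload { no_promise, with_promise };

// Selected when the callable takes no arguments.
template<typename... Ts, typename C>
typename std::enable_if<takes_nothing<C>::value, overload>::type
which_overload(C&&) { return overload::no_promise; }

// Selected when the callable accepts a mock_promise<Ts...>&.
template<typename... Ts, typename C>
typename std::enable_if<
    takes_promise<C, mock_promise<Ts...>>::value, overload>::type
which_overload(C&&) { return overload::with_promise; }

// Example callables used only to exercise the dispatch.
struct plain_callable   { bool operator()() { return true; } };
struct promise_callable { bool operator()(mock_promise<int>&) { return true; } };
```

A Callable that accepts both forms, such as a variadic generic lambda, would make both overloads viable and the call ambiguous; resolving that case is part of the additional effort the managed-promise versions would need.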

Proposal: Persistent LPCs
