Clarify behavior of future/promise for synchronously completed RMA

The UPC++ RMA interface is generalized to abstract away affinity in rput and rget. The asynchrony support is useful for overlapping communication latencies. However, when the global_ptr argument actually is_local(), the RMA operation could (and probably should) be converted to synchronous load/store accesses on hardware shared memory, meaning the data movement could technically be "done" before the initiation call returns.

The question concerns the semantics of the returned future/promise when an RMA operation is completed synchronously during initiation (most likely because the global_ptr is_local()). Specifically:

Do we allow a future-based RMA injection to return a readied future?
Do we allow a promise-based RMA injection to fulfill the promise before returning?

I don't see any normative text prohibiting either of these in a valid implementation. In fact, the first example in section 5.2 seems to imply this is possible. If we allow this behavior, then we really need to be clear about it because:

In the first case, it means any callback the user chains on an RMA future with .then() could execute immediately (i.e., outside upcxx::progress).
In the second case, it means promise fulfillment might trigger callback execution before an rget initiation returns (i.e., outside upcxx::progress). For an rput, it means the subsequent call to promise::finalize_anonymous() could trigger callbacks (with no intervening call to upcxx::progress).
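
To make the future-based case concrete, here is a minimal sketch in plain C++ of the two candidate behaviors. It is a toy model, not UPC++ code: the `toy` namespace, `toy::future`, `toy::progress`, `toy::local_rput`, and the `policy` enum are all hypothetical stand-ins invented for illustration, and the sketch assumes that chaining a callback onto an already-ready future runs it immediately. The promise-based case is not modeled.

```cpp
#include <cassert>
#include <cstring>
#include <deque>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; none of this is the UPC++ API.
namespace toy {

std::vector<std::string> event_log;          // records the order of events
std::deque<std::function<void()>> deferred;  // work drained by toy::progress()

// Minimal future: a shared ready flag plus a list of pending callbacks.
class future {
  struct state {
    bool ready = false;
    std::vector<std::function<void()>> callbacks;
  };
  std::shared_ptr<state> s_ = std::make_shared<state>();

public:
  bool ready() const { return s_->ready; }

  // Assumption: a callback chained on a ready future runs inside then().
  void then(std::function<void()> cb) {
    if (s_->ready) cb();
    else s_->callbacks.push_back(std::move(cb));
  }

  void set_ready() {
    s_->ready = true;
    for (auto &cb : s_->callbacks) cb();
    s_->callbacks.clear();
  }
};

// Stand-in for user-level progress: runs all deferred notifications.
void progress() {
  while (!deferred.empty()) {
    auto work = std::move(deferred.front());
    deferred.pop_front();
    work();
  }
}

enum class policy { eager, deferred_notify };

// Models an rput to a local destination: the copy always happens during
// initiation; the policy only decides when the future becomes ready.
future local_rput(const int *src, int *dest, std::size_t n, policy p) {
  std::memcpy(dest, src, n * sizeof(int));
  future f;
  if (p == policy::eager) f.set_ready();  // returned future is already ready
  else deferred.push_back([f]() mutable { f.set_ready(); });
  return f;
}

// Runs one scenario and returns the observed event order.
std::vector<std::string> run(policy p) {
  event_log.clear();
  deferred.clear();
  int src[2] = {1, 2};
  int dst[2] = {0, 0};
  future f = local_rput(src, dst, 2, p);
  assert(dst[0] == 1 && dst[1] == 2);  // data moved before initiation returned
  f.then([] { event_log.push_back("callback"); });
  event_log.push_back("after then");
  progress();
  event_log.push_back("after progress");
  return event_log;
}

}  // namespace toy
```

In this model, `toy::run(toy::policy::eager)` records the callback before `"after then"`, while `toy::run(toy::policy::deferred_notify)` records it only once `toy::progress()` is entered. The data movement is identical in both; only the point at which the user's callback fires differs, which is the behavior the specification would need to pin down.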

Both of these could be surprising to users if they expect RMA notifications to occur exclusively during a later call to progress. For example, if we allow this behavior, then this text from 6.3 should probably be adjusted:

An important aspect to clarify is that notification of completion only happens during user-level progress. Even if an operation completes early, the application cannot learn this fact without entering user-progress. For futures and promises, only when the initiating thread (persona actually) enters user level progress will the future or promise change its state (be readied or fulfilled).

In particular, this text clearly states that completion won't trigger callbacks asynchronously, but it doesn't really cover the case where an RMA was synchronously completed before the initiation call returns (in which case there may be no "state change" required).
