Proposal: RPC injection call that can throw an exception on memory exhaustion

Issue #176 resolved
Dan Bonachea created an issue

As discussed in the 2021-01-27 meeting with ExaBiome, the lack of backpressure in RPC currently creates a capacity problem: too many large injections at once by a single initiator (or targeting the same inattentive process) can lead to memory exhaustion crashes, as described in Impl issue 242. As John correctly notes in that issue, this problem is fundamental to our current semantics:

It is a consequence of our rpc API that no implementation could possibly guarantee both bounded memory usage and deadlock freedom.

So we'd like to consider adding new RPC/RPCFF injection calls (probably with new names) that have new semantics enabling the implementation to deploy backpressure via either stalls or failure at the initiator. This would allow the runtime to rate-limit RPC injection, allowing us to throttle the use of sender-side resources and (hopefully) avoid resource exhaustion in use cases of interest. The likely use of a new entry point name means this would be "opt-in" for applications, and applications whose communication pattern is not resource-hungry can continue to ignore the issue.

We don't yet have a concrete proposal, but here are some possible approaches discussed:

Approach 1: Stall injection with user-level progress

Upgrade RPC injection to have user-level progress (current calls have internal progress) and allow it to stall, invoking user-level progress, when resources are constrained.

This seemingly straightforward change has the nasty side-effect that any RPC injection operation needs to be prepared to run user callbacks (including event activations from asynchronous operations and incoming RPCs). This potential re-entrance creates a programmability hurdle for users, where subtle defects could be very difficult to debug as they would remain hidden until resources were constrained (timing-dependent).

It also has the very significant drawback that it doesn't solve the problem for RPC injections that are invoked from callbacks inside the restricted context, where we cannot support recursive user-level progress during a stall (because it would introduce re-entrance on the runtime and unbounded stack growth).

So this would be a partial solution at best, and is mentioned here for completeness.

Approach 2: Injection failure with an exception

In this approach, the new RPC injection calls would be permitted to throw an exception when sender-side resources are currently scarce (or perhaps permanently insufficient) to deliver the requested operation. The user would be responsible for catching this exception and taking appropriate action - either deferring the intended injection while performing other unrelated work (or possibly just running user-level progress), or alternatively choosing to abandon the injection attempt altogether and make an algorithmic adjustment.
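To make the catch-and-defer pattern concrete, here is a minimal sketch using only the C++ standard library. Nothing in it is UPC++ API: `backpressure_error`, `mock_runtime`, `inject()` and `progress()` are hypothetical stand-ins, since neither the exception type nor the new entry point names have been decided. It shows the two recovery paths described above: defer and retry after progress when resources are temporarily scarce, and give up when they are permanently insufficient.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical exception type; the proposal has not yet named one.
struct backpressure_error : std::runtime_error {
  backpressure_error()
      : std::runtime_error("RPC injection refused: sender-side resources scarce") {}
};

// Stand-in for the runtime's sender-side buffer pool (an assumption, not UPC++).
struct mock_runtime {
  std::size_t capacity;    // total bytes available for injection buffers
  std::size_t in_use = 0;  // bytes held by injections not yet retired

  explicit mock_runtime(std::size_t cap) : capacity(cap) {}

  // Hypothetical throwing injection call: fails instead of stalling or crashing.
  void inject(std::size_t payload_bytes) {
    if (in_use + payload_bytes > capacity) throw backpressure_error{};
    in_use += payload_bytes;
  }

  // Stand-in for user-level progress: retires every outstanding injection.
  void progress() { in_use = 0; }
};

// Injects every payload, deferring on backpressure.
// Returns how many times an injection had to be deferred.
std::size_t send_all(mock_runtime& rt, const std::vector<std::size_t>& payloads) {
  std::size_t deferrals = 0;
  for (std::size_t bytes : payloads) {
    for (;;) {
      try {
        rt.inject(bytes);
        break;  // injected successfully
      } catch (const backpressure_error&) {
        // Permanently insufficient: no amount of progress can help, so rethrow.
        if (bytes > rt.capacity) throw;
        ++deferrals;
        rt.progress();  // free sender-side resources, then retry
      }
    }
  }
  return deferrals;
}
```

In a real application the `catch` block could just as well perform unrelated work or switch to a different algorithm; calling progress immediately is only the simplest choice.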

The semantic details still need to be worked out, but would likely include:

  • Same argument syntax, completions, etc. offered by the existing upcxx::rpc(_ff) functions
  • Probably define a new exception type to be thrown on backpressure
  • Would deliberately leave unspecified whether the possible exception is thrown before, after or even during serialization traversal of the callable and arguments (and it may vary from call to call).
  • In types with custom serialization there's the possibility that resource exhaustion might be detected within a [Writer]::write(_sequence) call inside a user-provided serialize() method. However we'd like to avoid the need to throw an exception from those functions (partially because it would mean messy exception passing between user and runtime stack frames during recovery). Potential resolutions include:
    • Ignoring the problem and raising a fatal error if it occurs (rationale: the program in this situation is likely to get OOM killed due to private heap exhaustion, making recovery unrealistic anyhow)
    • Putting the writer into a "disabled" state that discards all written data, allowing serialization to complete without further resource growth, after which the injection is aborted and the exception thrown.
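The "disabled" writer option could look roughly like the sketch below. It is a standard-library-only mock, not the UPC++ serialization interface: `mock_writer`, `injection_aborted`, `record` and `inject()` are hypothetical names, and the member-function `serialize()` is a simplification of how custom serialization is actually declared. The point it illustrates is that `write()` never throws; the exception is raised by the runtime only after the user's serialization code has returned.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical exception thrown once serialization has finished.
struct injection_aborted : std::runtime_error {
  injection_aborted() : std::runtime_error("injection aborted: buffer exhausted") {}
};

// Mock writer with a fixed byte budget (an assumption standing in for the
// runtime's sender-side resources).
class mock_writer {
  std::vector<unsigned char> buf_;
  std::size_t limit_;
  bool disabled_ = false;

public:
  explicit mock_writer(std::size_t limit) : limit_(limit) {}

  // Never throws for exhaustion: once the budget is exceeded the writer is
  // disabled and all further data is silently discarded.
  void write(const void* data, std::size_t n) {
    if (disabled_) return;
    if (buf_.size() + n > limit_) {
      disabled_ = true;
      buf_.clear();          // release what was buffered so far
      buf_.shrink_to_fit();
      return;
    }
    const unsigned char* p = static_cast<const unsigned char*>(data);
    buf_.insert(buf_.end(), p, p + n);
  }

  bool disabled() const { return disabled_; }
  std::size_t size() const { return buf_.size(); }
};

// Example user type with a user-provided serialize() method.
struct record {
  int id;
  double values[4];
  void serialize(mock_writer& w) const {
    w.write(&id, sizeof id);
    w.write(values, sizeof values);
  }
};

// Hypothetical injection path: run user serialization to completion, then
// throw from the runtime's own frame if the writer was disabled.
template <typename T>
std::size_t inject(const T& obj, std::size_t limit) {
  mock_writer w(limit);
  obj.serialize(w);  // user code; no exception crosses this frame
  if (w.disabled()) throw injection_aborted{};
  return w.size();   // bytes that would be handed to the network
}
```

With a sufficient budget `inject()` returns the serialized size; with a budget too small for `record` the `serialize()` call still completes normally and `injection_aborted` is thrown afterwards, so no exception has to propagate through user frames.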

One major benefit of this approach is that it leverages the existing C++ exception infrastructure, and applications can utilize those features to handle recovery at whatever scope makes the most sense (including ignoring it, which leads to a process kill via an exception that at least partially explains what happened). Applications using this feature would obviously need to be built with runtime exception support (e.g., without the GNU-specific -fno-exceptions option), but this is already true of our default (throwing) shared heap allocation functions.

The main limitation of both approaches discussed above is that they only limit sender-side resources (this is where our current RPC rendezvous algorithm exhausts the shared memory resource). They do nothing to throttle based on target-side (private memory) resource exhaustion, which can occur with hot-spot targets that are inattentive to user-level progress. Impl issue 444 proposes the possibility of a tool to address the analogous resource exhaustion problem on the target side.

Comments (8)

  1. Dan Bonachea reporter

    This proposal was discussed in the 2021-02-24 meeting which included the UPC++ developers and @Rob Egan from ExaBiome.

    Current consensus seems to favor this proposal over the related one in impl issue 449.

  2. Dan Bonachea reporter

    Noting before I forget:

    It's technically possible the "Injection failure with an exception" variant of this proposal could be taught to throw a backpressure exception not only upon local memory resource exhaustion, but additionally for the case where injection memory resources are sufficient but the outgoing AM network channel is congested and unable to accept an injection without stalling (in the worst case, to await a remote CPU to make AM-level progress). This would work best with "high-fidelity" conduit NPAM support for gex_AM_PrepareRequestMedium(GEX_FLAG_IMMEDIATE), where the conduit's success/failure of Prepare would be a "soft" promise not to stall during the subsequent Commit due to network backpressure and would be used as a decision point to inject or throw.

    This seems like a good goal, but I'm not sure whether such failure-on-congestion behavior should be automatically included, or an independent user-directed opt-in.

  3. Dan Bonachea reporter

    Work towards "Approach 2" now appears in impl PR 376, which changes the existing calls to throw a upcxx::bad_shared_alloc exception where they would have previously crashed on shared heap exhaustion.
