Query length/utilization of incoming RPC queue

Issue #444 new
Dan Bonachea created an issue

Discussion copied from issue #382:

@Rob Egan said:

At the meeting we discussed the mechanism whereby rdzv buffers are received on the target side by allocating private memory, and how this could lead to an out-of-private-memory exception on the target if that process is not sufficiently attentive.

Is there any metric we can query on that target process to estimate how much (temporary) private memory upcxx / gasnet has consumed (possibly even just those buffers which will eventually be freed by user progress())?

Or maybe simply the count of pending tasks waiting on a call to user progress()?

These questions are similar, but not identical, to the ones that progress_required() answers for internal progress.

The primary point of these queries is to keep an eye on the pending workload and ensure that the memory it consumes can be loosely bounded through more attentiveness. We can hard-code how many times and how often progress() is called, but calling it excessively can hurt performance and calling it too rarely can consume all the resources. We have no basis except trial and error for that decision, and it might be better made at runtime.

Comments (7)

  1. Dan Bonachea reporter

    The current UPC++/GASNet implementation does not explicitly track its private heap usage in a production build (ignoring the GASNet-level debug-mode mallocator), and deploying such tracking explicitly and uniformly across the system would be a massive undertaking. glibc provides mallinfo(), which gives insight into the private heap utilization of the process and is probably the best way to snapshot current private heap utilization. I believe there are also other transparent "system allocator replacement" libraries that can be linked in to track similar info. However, in both cases discovering the effective "limit" might be non-trivial.
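
    As a rough illustration of the mallinfo() approach, here is a minimal sketch of a heap snapshot helper. It assumes a glibc-based system; `HeapSnapshot`, `snapshot_private_heap()` and the `HEAPSNAP_HAVE_MALLINFO2` macro are hypothetical names invented for this example and are not part of UPC++ or GASNet. It prefers mallinfo2() where glibc 2.33 or later provides it, since the older mallinfo() reports its fields as `int` and can wrap on large heaps.

```cpp
#include <malloc.h>   // glibc-specific: mallinfo() and, since glibc 2.33, mallinfo2()
#include <cstddef>

// Detect mallinfo2() availability (hypothetical macro name, local to this sketch).
#if defined(__GLIBC__)
#  if __GLIBC_PREREQ(2, 33)
#    define HEAPSNAP_HAVE_MALLINFO2 1
#  endif
#endif

// Hypothetical helper type: a point-in-time view of the process-wide malloc heap.
// This covers ALL malloc users in the process, not only UPC++/GASNet buffers.
struct HeapSnapshot {
    std::size_t in_use_bytes;  // bytes in in-use allocations (arena + mmapped chunks)
    std::size_t free_bytes;    // bytes held by the allocator in free chunks
};

inline HeapSnapshot snapshot_private_heap() {
#ifdef HEAPSNAP_HAVE_MALLINFO2
    struct mallinfo2 mi = mallinfo2();   // fields are size_t
#else
    struct mallinfo mi = mallinfo();     // fields are int; may wrap on heaps > 2 GiB
#endif
    HeapSnapshot s;
    // uordblks: bytes used by in-use allocations; hblkhd: bytes in mmapped regions.
    s.in_use_bytes = static_cast<std::size_t>(mi.uordblks)
                   + static_cast<std::size_t>(mi.hblkhd);
    s.free_bytes   = static_cast<std::size_t>(mi.fordblks);
    return s;
}
```

    An application could call `snapshot_private_heap()` before and after a burst of communication and compare `in_use_bytes` to see how much the heap grew. This gives only the process-wide total and says nothing about the effective limit mentioned above.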

    However, if we're primarily worried about monitoring memory resources consumed by the incoming RPC queue, then perhaps we can narrow our attention to finding solutions to that more restricted problem. It has separately occurred to me many times that it could be beneficial to provide additional insight into RPC queue depths, since (even ignoring resource consumption issues) attentiveness and progress frequency can become a major factor in UPC++ application performance.

    We don't currently have a way to query any information about the "incoming" RPC queue, although I think we could discuss designing something -- but it probably represents a significant effort. The runtime currently does NOT maintain such information explicitly in any way, and (depending on what we design) adding such tracking might have a non-trivial cost (especially in threaded applications). If we did add something, it might need to be a profiling-style feature that is opt-in at compile time and otherwise compiled out to avoid the added overhead.
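
    To make the "opt-in at compile time, otherwise compiled out" idea concrete, here is one possible shape such tracking could take. This is purely a design sketch: `IncomingQueueStats`, its methods and the `RPC_QUEUE_STATS` macro are hypothetical, nothing like this exists in the UPC++ runtime today, and where the runtime would actually call the enqueue/dequeue hooks is left open. The counters use relaxed atomics so they stay consistent in threaded applications, which is also where the added cost would come from.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical compile-time switch: build with -DRPC_QUEUE_STATS=0 to compile
// the tracking out entirely. Enabled by default in this sketch.
#ifndef RPC_QUEUE_STATS
#define RPC_QUEUE_STATS 1
#endif

// Hypothetical counter pair describing an incoming queue: how many items are
// pending and how many payload bytes they hold.
class IncomingQueueStats {
public:
    // Would be called when an incoming item is queued for user-level progress.
    void on_enqueue(std::size_t bytes) noexcept {
#if RPC_QUEUE_STATS
        depth_.fetch_add(1, std::memory_order_relaxed);
        bytes_.fetch_add(bytes, std::memory_order_relaxed);
#else
        (void)bytes;  // tracking disabled: no state, no atomic traffic
#endif
    }

    // Would be called once the item has been executed and its buffer released.
    void on_dequeue(std::size_t bytes) noexcept {
#if RPC_QUEUE_STATS
        depth_.fetch_sub(1, std::memory_order_relaxed);
        bytes_.fetch_sub(bytes, std::memory_order_relaxed);
#else
        (void)bytes;
#endif
    }

    // Approximate number of pending items (always 0 when tracking is disabled).
    std::size_t depth() const noexcept {
#if RPC_QUEUE_STATS
        return depth_.load(std::memory_order_relaxed);
#else
        return 0;
#endif
    }

    // Approximate payload bytes held by pending items (0 when disabled).
    std::size_t pending_bytes() const noexcept {
#if RPC_QUEUE_STATS
        return bytes_.load(std::memory_order_relaxed);
#else
        return 0;
#endif
    }

private:
#if RPC_QUEUE_STATS
    std::atomic<std::size_t> depth_{0};
    std::atomic<std::size_t> bytes_{0};
#endif
};
```

    With something along these lines, an application could poll `depth()` or `pending_bytes()` at runtime and call progress() more often once a threshold is crossed, instead of hard-coding the call frequency.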

    Note that upcxx::progress_required() only tracks outgoing operations and has nothing to do with this.
