Prevent silent use of by-value RPC return of huge types

Issue #392 new
Dan Bonachea created an issue

UPC++ makes it relatively easy to write BAD code such as the following:

#include <upcxx/upcxx.hpp>
using namespace upcxx;

struct bigarray_t {
  double value[1024];

  bigarray_t operator+=(bigarray_t const &other) { // YUK!
    for (int i=0; i<1024; i++) value[i] += other.value[i];
    return *this;
  }
};

int main() {
  init();

  global_ptr<bigarray_t> gp;
  if (!rank_me()) gp = new_<bigarray_t>();
  gp = broadcast(gp, 0).wait();

  // THIS CODE IS A BAD EXAMPLE - DO NOT DO THIS
  future<bigarray_t> badidea1 = rget(gp); // rget huge val into a future
  bigarray_t bigval = badidea1.wait();
  future<> badidea2 = rput(bigval, gp); // rput huge val passed on stack
  badidea2.wait();
  future<bigarray_t> badidea3 = broadcast(bigval, 0); // bcast huge val passed on stack
  bigarray_t bigval3 = badidea3.wait();
  future<bigarray_t> badidea4 = reduce_all(bigval, op_add); // reduce huge val passed on stack
  bigarray_t bigval4 = badidea4.wait();

  finalize();
  return 0;
}

This code compiles and works, but this pattern is a really REALLY bad idea. The problem here is the user is manipulating a statically large type by-value. This is generally frowned-upon, but is not prohibited and as long as the stack doesn't actually overflow there's no obvious sign this is happening. The problem is obvious in this toy example, but can also be very subtly buried in a larger code. I found this analogous defect in supposedly well-tuned code from an experienced UPC++ programmer (who shall remain unnamed).

UPC++ magnifies the problem with types like this one that happen to be TriviallySerializable, because not only can the programmer pass them around on the stack, but they can also pass them directly to the by-value communication APIs (eg future<T> rget(global_ptr <T>)). These APIs are designed as a convenience for scalar values where copies are effectively free, but using them with a large type like this imposes ridiculous overheads from multiple in-memory copies. Unfortunately if an inexperienced UPC++ programmer is extrapolating from a scalar example to their huge type, they could easily end up with code like this.

The correct way to handle big arrays like this is to use the "bulk" overloads provided for each of these communication interfaces (e.g. future<> rget(global_ptr <T> src, T *dest, size_t count) ), but doing so requires additional arguments and more careful marshalling. To summarize, the bad solution requires less typing and doesn't alert you to the problem, and the right solution requires more insight and more typing. This is a dangerous design property.

We'd like to make it more obvious when a program has strayed into this anti-pattern. This was discussed in the 2020-07-15 meeting, and the consensus seems to be we should add static assertions to detect the instantiation of the scalar communication API with value types over some compile-time-tunable threshold with a reasonable default. This will ensure we don't silently compile such dubious code.

Comments (10)

  1. Paul Hargrove

    I might be mistaken, but my recollection is that our 2020-07-05 meeting also gave a favorable reception to the idea of renaming the by-value interfaces, such as by appending _by_value. This reduces the "bad solution requires less typing" aspect of the problem and serves to annotate the calls requiring (one or more) stack copies. This is in addition to (not a replacement for) the static assertion Dan mentioned.

    While the static assert is a "breaking change" only for codes with oversized args, simply renaming key functions such as rput and rget would break all uses. So, we probably need to observe a deprecation (with warning?) period to allow client codes to adjust. For this reason, I think we need to make this change soon (as in our next release) if we are to make it at all.

  2. Dan Bonachea reporter

    Merge of at pull request #245 a503950 had added static assertions for all of the cases in the original report (and many more).

    However we still do not protect against by-value return of massive objects from RPC, mostly because the inability to return a view means we cannot provide a localized drop-in replacement that accomplishes the equivalent data communication without alot of expensive moves of the massive data type (and potential stack-overflow risk).

    We could easily add static_assertions on the return path, but without a good "story" for an equivalent best practice replacement, it would basically amount to telling the user his communication algorithm is flawed. This issue remains open for further ideas/discussion.

  3. Log in to comment