Prohibit reference types in global_ptr and upcxx_memberof_general

Issue #158 resolved
Dan Bonachea created an issue

Our spec for global_ptr<T> currently says:

T must not have any cv qualifiers: std::is_const<T>::value and std::is_volatile<T>::value must both be false.

However this leaves out some important information and constraints about T.

In particular, I think we need to:

  1. Prohibit T from being a reference type. This doesn't typecheck, due to several semantic problems (eg the return type of global_ptr<U&>::local() is prohibited type U&*)
  2. Clarify that T is permitted to be an incomplete type - this currently works in our implementation (although we are not testing this case and probably should be)

Comments (9)

  1. Dan Bonachea reporter

    Some clarifications, based on brief discussion in our 2020-03-25 meeting:

    • The motivation for prohibiting global_ptr<int &> is basically the same as the reason that C++ [dcl.ref] prohibits the type int & *; i.e. it doesn't make semantic sense to have a pointer directly to a reference. Any case where this seems reasonable should probably be using a global_ptr<int> instead (ie a pointer directly to the object in question).
    • We DO allow pointers to types with embedded references. For example, global_ptr<std::tuple<int&>> is well-formed and functional (as is std::tuple<int&> *).
  2. Dan Bonachea reporter

    One loose end that hasn't yet been discussed but needs to be resolved is how we handle upcxx_memberof_general() on a reference-typed field. Any type with a non-static reference field breaks standard layout, so upcxx_memberof is already prohibited on such types. However upcxx_memberof_general is not currently prohibited, and can result in some potentially "sketchy" behavior.

    Example:

    struct hasref {
      int x;
      int &xref;
      hasref() : x(0), xref(x) {}
      hasref(int& theref) : x(0), xref(theref) {}
    };
    
    int main() {
      global_ptr<hasref> po = upcxx::new_<hasref>();
      upcxx::future<global_ptr<int>> fmp = upcxx_memberof_general(po, xref);
      global_ptr<int> mp = fmp.wait();
    
      global_ptr<int> sheapvar = upcxx::new_<int>(42);
      global_ptr<hasref> po2 = upcxx::new_<hasref>(*sheapvar.local());
      upcxx::future<global_ptr<int>> fmp3 = upcxx_memberof_general(po2, xref);
      global_ptr<int> mp3 = fmp3.wait();
      assert(sheapvar == mp3);
    }
    

    The example code above is currently accepted and manufactures a global_ptr<int> pointing to the target of the reference (a data-dependent location). This happens to make sense (and mostly work*) for these two cases, because the reference is pointing to an object in the shared heap with affinity to the same rank. However it's decidedly "weird" in the second case, where upcxx_memberof_general is generating a pointer to an object that is not actually a member of the hasref type. The second example highlights that we've strayed firmly out of the intended purpose of these macros: computing a global_ptr for the target of a reference field is really just a fetch of a dynamic remote field value, a value which even with fully static typing cannot be computed without communication.

    * Note I say "mostly work" above because the implementation was not designed for this use case and probably malfunctions in the case of shared-memory bypass, or where the reference crosses an affinity boundary.

    Moreover, it could easily be the case that a reference field references an object that is not in the shared heap at all, consider:

      int stackvar = 42;
      global_ptr<hasref> po3 = upcxx::new_<hasref>(stackvar);
      upcxx::future<global_ptr<int>> fmp4 = upcxx_memberof_general(po3, xref);
      global_ptr<int> mp4 = fmp4.wait();  // Runtime error!!
    

    This example compiles but at runtime manufactures an invalid global_ptr<int> pointing to the stack variable. This invalid global_ptr is immediately rejected by the new debug-mode checking in the gptr constructor (causing an out-of-bounds crash), and in non-debug mode would likely cause failures when used. The same would happen for references pointing into the private heap or static data - we cannot construct valid gptrs to non-shared-heap objects. On the other hand, if upcxx_memberof_general was prohibited from working on fields whose type includes a top-level reference, this "sketchy" use could easily be made a compile error via static_assert.

    It's worth noting that regardless of whether we deploy this prohibition, a user who really wants the behavior described in the first two examples above can achieve it explicitly as follows:

    future<global_ptr<int>> follow_xref(global_ptr<hasref> po) {
      return upcxx::rpc(po.where(), [&po](){ return upcxx::try_global_ptr(&(po.local()->xref)); });
    }
    

    This is basically no more or less efficient than what upcxx_memberof_general must do for this case, and I'd argue the above is better code for two reasons: (1) it's more explicit about what's happening in this potentially sketchy operation, and (2) when written like this using try_global_ptr, an xref pointing out of the shared heap generates a null gptr instead of an invalid one.

    TL;DR: In my opinion we should prohibit upcxx_memberof_general on fields with top-level reference types and deploy static checking to avoid this mess.

    Thoughts?

  3. Log in to comment