dist_object access operators should preserve const-ness

Issue #202 new
Dan Bonachea created an issue

Currently we have these operators for accessing the T object inside a dist_object<T>:

template <typename T>
T* dist_object<T>::operator->() const;
template <typename T>
T& dist_object<T>::operator*() const;

These declarations permit questionable code such as the following:

void consumer(const dist_object<int> &dobj) {
    // my argument is const-qualified, but I can change its contents anyhow!
  *dobj = 42;
}

This is contrary to the behavior of element accessors for containers in the C++ standard library (e.g. vector::operator[] and optional::operator*), where a const qualifier on the container also ensures that member functions providing access to the contained element(s) return a const-qualified reference/pointer/iterator. There are of course many ways to circumvent this const-ness, but in the absence of casts the default behavior is for containers to preserve const-ness in access to contained data.

This issue proposes to replace the two current dist_object operators above with overloads that preserve const-ness as follows:

template <typename T>
T* dist_object<T>::operator->();
template <typename T>
const T* dist_object<T>::operator->() const;

template <typename T>
T& dist_object<T>::operator*();
template <typename T>
const T& dist_object<T>::operator*() const;

This would be a breaking change, because it would disallow exactly the type of questionable code demonstrated above. However there is also overwhelming precedent in the C++ standard library supporting this as the "right" design pattern, and AFAIK we don't have a good motivation for diverging from that.

Comments (4)

  1. Colin MacLean

    The current behavior is consistent with C++ Standard Library smart pointers and std::reference_wrapper<T>, so there is precedent for the current behavior. The question is if dist_object should have container-like or pointer/reference-like semantics. I think there are arguments for both.

  2. Rob Egan

    MHM2 has not ever (intentionally) used a const dist_object<T>&. 

    Technically we could, say after the building phase and during the read-only phase, but honestly we never considered making the dist_object const, and would expect no change in performance.

    I agree there are arguments on both sides. Although I tend to think of a dist_object as behaving nearly identically to a std::shared_ptr, and they share the same semantics to dereference, I would favor the argument that a const dist_object<T>& could not have its underlying data modified.

    You guys are more interested in providing the proper API, so this is worth fixing th proposed way or at the very least documenting the behavior, IMO

    Not having looked at this before nor reading any of the reasons why in C++ a shared_ptr and vector treat the const-edness of the underlying differently from the container, I can imagine that the difference comes down to the copy of the container. Take a lambda capture or function argument for example. For a shared pointer copies are cheap and expected and part of the sharing raison d’etra - reference captures of a shared_ptr are odd and problematic. For a vector copies are generally avoided as they are expensive and conversely reference captures are used extensively. Additionally a shared_ptr<const T> always protects the const-edness of the data, but a vector<const T>? I’ve never seen that, can it even be used?

    since dist_object<T> is not copyable, and requires references to be used everywhere, that, to me, says it should preserve the const-edness of the value like a vector.

  3. Colin MacLean

    const std::shared_ptr<T> describes the pointer as being const rather than the object being pointed to being const, just like const T* and T *const. It’s a matter of describing the pointer/reference/handle itself rather than the object pointed to. dist_object<T> is somewhere in between. If it were dist_reference<T> that didn’t hold ownership of the object, the decision would clearly be to make it like smart pointers and std::reference_wrapper<T>. With dist_object it’s less clear as to if we’re primarily wanting to describe the handle and interface to it or to act as a wrapping container around an object.

  4. Dan Bonachea reporter

    Colin said:

    With dist_object it’s less clear as to if we’re primarily wanting to describe the handle and interface to it or to act as a wrapping container around an object.

    To me this is pretty clear cut. dist_object<T> is in no way a handle like shared_ptr<T> or std::reference_wrapper<T>.
    dist_object<T> is an object that directly contains and owns a T in exactly the same way as a std::tuple<T>, and it's not even copyable.
    The corresponding dist_id<T> has handle semantics and so does dist_object<T> *, but dist_object<T> itself does not have handle-like semantics.

    If we ignore the UPC++-specific parallel registry features of dist_object<T> and look only at its relationship to the contained T, I think the closest analogues in the shared library are std::tuple<T> and a (non-empty) std::optional<T>, both of whose element accessors propagate const-ness from the container to the element reference, a precedent consistent with my proposal (which Rob seems to agree with).

    Tangential: I am NOT proposing to change the behavior of dist_id<T>::here(), which currently returns a non-const dist_object<T>& even from a const dist_id<T>. I think this is consistent because dist_id<T> really is a handle abstraction. This implies that given const dist_object<int> &dobj one could use dobj.id().here() to "erase" const-ness from the reference, but one could of course accomplish the same more explicitly using address-of and const_cast.

    To my mind, the main counter-argument to this proposal is it demands stricter adherence to our "rules of thumb" for best-practice RPC callback arguments.
    Consider this example from the guide:

      using dobj_map_t = upcxx::dist_object<std::unordered_map<std::string, std::string> >;
      upcxx::future<> insert(const std::string &key, const std::string &val) {
        // the RPC returns an empty upcxx::future by default
        return upcxx::rpc(get_target_rank(key),
                          // lambda to insert the key-value pair
                          [](dobj_map_t &lmap, const std::string &key, const std::string &val) {
                            // insert into the local map at the target
                            lmap->insert({key, val});
                          }, local_map, key, val);
      }
    

    This code follows our best-practice for RPC callback arguments, namely non-scalar deserialized arguments (key and val) are usually accepted by CONST l-value reference, and dist_object or team arguments (lmap) are usually accepted by NON-CONST l-value reference.

    However consider a less-skilled or less-informed author who carelessly places a const qualifier on the lmap argument reference above. Such code would still compile and work under today's semantics (due to the currently const-forgiving nature of dist_object accessors), but would become a compile error under the proposed change. To state this another way, under current semantics a naive author who doesn't really understand references could be trained to always accept RPC arguments by const reference for efficiency, and everything will mostly work as expected because dist_object accessors currently ignore const (and most RPCs don't need to modify deserialized objects). But with this change they'll need to remember to make an exception for dist_object<T> arguments containing a T they intend to modify (which can ONLY be accepted as non-const l-value reference and nothing else under the new rules).

  5. Log in to comment