Managed storage for use with `deserialize_into()`

Issue #196 resolved
Amir Kamil created an issue

This issue is forked off of the discussion on impl PR 429, which addresses the deprecation of std::aligned_storage. Specifically, this issue focuses on the recommended idiom for use with deserialize_into() on a deserializing_iterator. As mentioned in issue 194, our current recommended pattern actually results in undefined behavior. Avoiding undefined behaviors requires manually invoking the destructor on the deserialized object, which isn’t something we want to require of users. In addition, the standard alternative to std::aligned_storage isn’t so great – it requires a manually aligned char array, which must be wrapped in a class-type object if the storage is dynamically allocated.

A straw proposal in the comments for impl PR 429 suggests providing an optional-like storage type that can be used with deserialize_into(). This type would handle both the alignment details as well as manage the lifetime of the deserialized object. The following is one possible implementation:

template<typename T>
struct storage : private detail::raw_storage<T> {
  storage() : ptr_(nullptr) {}
  storage(const storage &) = delete;
  storage& operator=(const storage &) = delete;

  void destruct(std::true_type) {}
  void destruct(std::false_type) {
    if (ptr_) ptr_->~T();
  }
  ~storage() {
    destruct(std::integral_constant<
               bool, std::is_trivially_destructible<T>::value
             >{});
  }

  void* raw(detail::internal_only) {
    return detail::raw_storage<T>::raw();
  }
  T* set(detail::internal_only, T *value) {
    UPCXX_ASSERT(!ptr_ && value == detail::raw_storage<T>::raw());
    return ptr_ = value;
  }

  T& operator*() {
    UPCXX_ASSERT(ptr_);
    return *ptr_;
  }
  T* operator->() {
    UPCXX_ASSERT(ptr_);
    return ptr_;
  }

  template<typename ...Args>
  void construct(Args&& ...args) {
    UPCXX_ASSERT(!ptr_);
    ptr_ = ::new(raw(detail::internal_only{})) T{std::forward<Args>(args)...};
    std::cout << "storage constructed " << *ptr_ << std::endl;
  }

  void reset() {
    destruct(std::integral_constant<
               bool, std::is_trivially_destructible<T>::value
             >{});
    _ptr = nullptr;
  }

private:
  T *ptr_;
};

This utilizes detail::raw_storage<T>, which handles the storage and alignment but does not manage the lifetime of the T object. The implementation above uses a separate pointer to track whether or not an object lives in the storage, but this can be done with a boolean instead. (We would either have the * and -> accessors reinterpret+launder the raw pointer on the fly, or use a union of raw_storage<T> and T.)

The raw() and set() internal functions can be used to by a new overload of deserialize_into() to deserialize into the storage and activate it:

template<typename T>
deserialized_type_t<T>*
  deserializing_iterator<T>::deserialize_into(
    storage<deserialized_type_t> &where) const {
  return where.set(detail::internal_only{},
                   deserialize_into(where.raw(detail::internal_only{})));
}

Access to the raw storage and the ability to toggle the storage on are features that std::optional does not provide, necessitating our own type.

The user idiom would then be as follows:

storage<foo> s1;
storage<foo> *p2 = new storage<foo>;
some_iterator.deserialize_into(s1);
some_iterator.deserialize_into(*p2);
...
// dropping s1 automatically destructs the underlying foo
delete p2; // destructs the underlying foo

In the proposal for storage above, the public-facing API does not include access to the raw storage itself, since construction of an object into that space needs to be accompanied by setting the storage to the full state. We could provide a raw_for_placement_new() member function that sets this state (we’d likely need to use a boolean-based implementation instead) in advance. The user could then use storage with placement new:

storage<foo> s3;
foo *f = new (s3.raw_for_placement_new()) foo;

However, this is quite problematic if the constructor throws an exception – the storage would be set to full but not contain an object, which would eventually result in the destructor being invoked on a non-existing object.

Alternatively, we could overload operator new and operator delete to avoid this problem:

template<typename T>
void* operator new(std::size_t, storage<T> &where) {
  where.activate(detail::internal_only{});   // assume this sets where to the full state
  return where.raw(detail::internal_only{});
}

template<typename T>
void operator delete(void *ptr, storage<T> &where) {
  UPCXX_ASSERT_ALWAYS(ptr == where.raw(detail::internal_only{}));
  where.deactivate(detail::internal_only{}); // assume this sets where to the empty state
}

The overload of operator delete here would only be invoked implicitly if the constructor throws (see “user-defined placement deallocation functions” here) when placement new invokes this operator new. This operator new is invoked when a storage object is passed to placement new:

storage<foo> s3;
foo *f = new (s3) foo;

Thoughts on this design?

Comments (4)

  1. Amir Kamil reporter

    Here’s a union+boolean-based implementation (edited to remove the unnecessary raw_storage):

    template<typename T>
    struct storage {
      storage() : dummy_(), full_(false) {}
      storage(const storage &) = delete;
      storage& operator=(const storage &) = delete;
    
      void destruct(std::true_type) {}
      void destruct(std::false_type) {
        if (full_) value_.~T();
      }
      ~storage() {
        destruct(std::integral_constant<
                   bool, std::is_trivially_destructible<T>::value
                 >{});
      }
    
      void* raw(detail::internal_only) {
        return &value_;
      }
      void activate(detail::internal_only) {
        UPCXX_ASSERT(!full_);
        full_ = true;
      }
      void deactivate(detail::internal_only) {
        UPCXX_ASSERT(full_);
        full_ = false;
      }
    
      T& operator*() {
        UPCXX_ASSERT(full_);
        return value_;
      }
      T* operator->() {
        UPCXX_ASSERT(full_);
        return &value_;
      }
    
      template<typename ...Args>
      void construct(Args&& ...args) {
        UPCXX_ASSERT(!full_);
        ::new(&value_) T{std::forward<Args>(args)...};
        full_ = true;
      }
    
    private:
      union {
        char dummy_;
        T value_;
      };
      bool full_;
    };
    
    template<typename T>
    deserialized_type_t<T>*
      deserializing_iterator<T>::deserialize_into(
        storage<deserialized_type_t> &where) const {
      auto result = deserialize_into(where.raw(detail::internal_only{}));
      where.activate(detail::internal_only{});
      return result;
    }
    

  2. Dan Bonachea

    Let's plan to discuss this proposal today.

    I'm glad this proposal includes "fully automatic" handling for placement new.

    IIUC storage<T>::construct() could potentially be used to perform move construction, and storage<T>::operator* could potentially be used as the source of a client-initiated move operation or the destination of a move-assignment, and everything should still work correctly wrt destructors. Do you concur?

  3. Log in to comment