RFE: Allow make_future() and when_all() (or new variants) to be used prior to UPC++ initialization

Issue #483 resolved
Amir Kamil created an issue

A common use for make_future() is to seed a future-conjoining loop:

future<> fut = make_future();
for (...) {
  fut = when_all(fut, some_comm_op)
}

It’s not unreasonable to declare and initialize the root future at static scope. However, this is currently prohibited by the spec.

Discussion of the potential pitfalls of allowing make_future() (and when_all(), which we intend to supersede uses of both make_future() and to_future()) before UPC++ initialization is in PR 345. PR 345 introduces optimizations to make_future<>() that may cause problems for using it in dynamic initialization of a variable with static storage duration. However, as noted in the comments for the PR, my current reading of the C++ spec is that it should work fine in dynamic initialization.

We should either:

  1. Verify that make_future<>() is guaranteed to succeed in dynamic initialization post-merge of PR 345 and bless all specializations of make_future() and when_all() prior to UPC++ initialization, or:
  2. Introduce new variants of the two templates (e.g. make_future_before_init()) that are guaranteed to work in dynamic initialization.

Comments (8)

  1. Amir Kamil reporter

    Analysis of make_future<>() copied from PR 345:

    Upon further investigation, I believe that make_future<>() is safe at static scope. It is a non-constexpr function, so it only runs in dynamic initialization. On the other hand, initialization of detail::future_header_result<>::the_always falls under constant (i.e. static) initialization, by my reading of the C++ spec:

    future_header future_header_result<>::the_always = {
      /*ref_n_*/-1,
      /*status_*/upcxx::detail::future_header::status_ready,
      /*sucs_head_*/nullptr,
      {/*result_*/&upcxx::detail::future_header_result<>::the_always}
    };
    

    The relevant category of constant initialization:

    2) Static or [thread-local (since C++11)] object [of a PODType (until C++11)] (not necessarily of class type), that is not initialized by a constructor call, if the object is value-initialized or if every expression in its initializer is a constant expression.

    The ref_n_, status_, and sucs_head_ initializers are clearly constant expressions. The result_ initializer is also a constant expression:

    * a prvalue core constant expression whose value satisfies the following constraints:
    * if the value is of pointer type, it holds
    * address of an object with static storage duration

    Thus, the_always falls under constant initialization. Note that constant initialization does not require that a static object actually be declared const.

    The initialization rules require that all static initialization must happen before any dynamic initialization. Thus, the_always is guaranteed to be properly initialized before make_future<>() can execute. (I don’t believe dynamic libraries are an exception to this rule. Dynamic initialization can be deferred until the first odr-use of a function or variable in the same TU, but I do not see a similar allowance for static initialization. Even so, invoking make_future<>() odr-uses the_always, which would force it to be initialized.)

    Aside from a careful reading of cpp-reference.com and [basic.start.init] in the C++ spec (in N3691 to be exact, since that is what I happened to have open at the time), I also tested this on both my machine and dirac, using the same compiler versions as those mentioned above in the performance results. I inserted a dynamic initializer before the initialization of the_always in core.cpp that invoked make_future<>() and printed out the details of impl_.hdr_ of the resulting future. With the_always implemented as above, all six system/compiler combinations, in both debug and opt modes, and with both smp and udp conduits, showed that impl_.hdr_ pointed at the_always and that the latter was correctly initialized. I repeated this with a modification to the initialization of the_always so that the initializer of ref_n_ was no longer a constant expression. In all combinations, impl_.hdr_->ref_n_ showed the value 0, as expected for a dynamic initialization that appears later in the same TU. (Incidentally, the compilers differed on the other fields. GCC and PGI constant-initialized the other fields of the_always, while Clang and Intel did not. This is actually legal and is called early dynamic initialization.)

    Since make_future<>() is legal in dynamic initialization, so is when_all(). Thus, we do not need UPCXX_ASSERT_INIT() in either function.

  2. Dan Bonachea

    Thanks for all your great work on this Amir!

    I agree with your analysis, namely that we "dodge the bullet" here (only) because this object falls under constant initialization, and the fact that all compilers probably use this implementation:

    In practice:
    Constant initialization is usually applied at compile time. Pre-calculated object representations are stored as part of the program image.

    which means that in practice these locations will be loaded from the static data segment of the executable (or shared library) with their constant values already intact.

    Nevertheless, I think we should add a "defensive" assertion in make_future of the form:

    UPCXX_ASSERT(detail::future_header_result<>::the_always.ref_n_ == -1) ;
    

    to defend against possible changes to initialization behavior and periodically spot-check our invariant.

    Once that's done, I think this becomes a spec issue to document the new "blessing" for make_future and when_all. In order to be useful, we also need to bless the (currently implicitly specified) CopyConstructor for future, and might as well bless CopyAssignment as well.

  3. Colin MacLean

    Yeah, it looks like this just so happens to be safe. However, the following design would avoid the initialization order problem in all cases:

    struct future_header_result<> {
      static future_header& the_always() {
        static instance = {
          /*ref_n_*/-1,
          /*status_*/upcxx::detail::future_header::status_ready,
          /*sucs_head_*/nullptr,
          {/*result_*/&upcxx::detail::future_header_result<>::the_always}
        };
        return instance;
      }
    };
    

    I don’t believe init() or finalize() would be needed at all if this technique were used to trigger initializations when first needed.

  4. Dan Bonachea

    I agree Colin's approach is probably also safe, but I (assuming the function definition lives in future/core.cpp) this also entails an additional library call we don't want in the critical path. If this was an inline function, then I'm not confident we'd get a unique instance of the_always, which is optional but a nice property.

    IMO we should stick with Amir's current solution until/unless we encounter a problem or a need for dynamic initialization.

  5. Colin MacLean

    The standard guarantees that there would be only one copy of the static local variable. If the compiler can see the definition (like in a header), then it would be surprising to the level of a bug if the compiler didn’t inline it. This technique is commonly used to avoid the need for things like init() and finalize() since with C++ it can all be done with this lazy static initialization technique that guarantees proper ordering. I agree that it’s fine for now. This is more the type of thing we would do if we wanted to deprecate init()and finalize() entirely and allow the user to do whatever they wanted whenever they wanted, which is entirely possible with this technique.

  6. Dan Bonachea

    I think all that's needed to resolve the implementation side of this issue is test coverage. The newly added test/regression/issue478.cpp seems like the right place to validate this new feature works as intended.

    Then of course there's documentation work in the spec and ChangeLog to explain the semantic loosening.

  7. Log in to comment