Provide a universal variadic factory for future

Issue #104 resolved
Dan Bonachea created an issue

Motivation

Chains of futures are important to UPC++ programming, and make_future is emerging as a prominent base case for constructing such chains (eg see guide issue 22).

The current spec has three free functions make_future, when_all and to_future, all of which are effectively acting as a future constructor, differing only slightly in details of how the arguments are handled - details that users usually don't care about. However the only C++ constructor we currently have for future is the mostly-useless default constructor that constructs a never-ready future (only really useful as a value to be overwritten).

We could potentially simplify the interface and make it more approachable by providing a variadic constructor for future that captures the common case and acts as sugar for one or more of the free functions, giving users the behavior they usually want.

Proposal:

template <typename ...T>
future <CTypes...>::future( T ...futures_or_results );

Given a variadic list of futures and/or non-futures as arguments, constructs a future representing the readiness of all the arguments that are futures. The results tuple of this future will be the concatenation of the result tuples of each future argument and the values of each non-future argument, in the order in which each argument occurs in futures_or_results. The type parameters of the returned object (CTypes...) are the concatenation of the type parameter lists of the future types in T and the non-future types themselves in T, in the order in which each type appears in T. If none of the arguments are futures, then the resulting future object is trivially ready.

(ie the same behavior as the current fully-general to_future, but declared as a C++ constructor).
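As a rough illustration of the concatenation rule above, the following sketch computes only the resulting type. It uses a toy future stand-in rather than the real UPC++ class, and the helper names append, concat and result_of_ctor are hypothetical.

```cpp
#include <type_traits>

// Toy stand-in for upcxx::future; it carries only its type parameters.
template <typename... T>
struct future {};

// Append one argument type to an accumulated future<Acc...>.
template <typename Acc, typename Arg>
struct append;

// A non-future argument contributes its own type...
template <typename... Acc, typename Arg>
struct append<future<Acc...>, Arg> {
  using type = future<Acc..., Arg>;
};

// ...while a future argument contributes its whole type parameter list.
template <typename... Acc, typename... U>
struct append<future<Acc...>, future<U...>> {
  using type = future<Acc..., U...>;
};

// Left-to-right fold of append over the argument types.
template <typename Acc, typename... Args>
struct concat {
  using type = Acc;
};

template <typename Acc, typename First, typename... Rest>
struct concat<Acc, First, Rest...> {
  using type =
      typename concat<typename append<Acc, First>::type, Rest...>::type;
};

// Hypothetical name: the CTypes... the proposed constructor would produce.
template <typename... Args>
using result_of_ctor =
    typename concat<future<>, typename std::decay<Args>::type...>::type;

static_assert(std::is_same<result_of_ctor<future<int>, double, future<char>>,
                           future<int, double, char>>::value,
              "future and non-future arguments concatenate in order");
static_assert(std::is_same<result_of_ctor<future<int>>, future<int>>::value,
              "a single future argument collapses instead of nesting");
```

Under this rule a lone future argument yields the same type back, which is the copy-like behavior the to_future semantics imply.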

Discussion

If we believe this can be implemented efficiently, it could be used to replace all three free functions for the majority of uses (and possibly remove them entirely). I think the only usage case not captured would be calling make_future(f) where you intentionally want to construct a result of type future<future<T>>, but that should be an uncommon case, so it seems fine to have a "special" spelling for that.

If we accept this proposal, we might also want to modify the default constructor future<>() to specify a trivially-ready future, for uniformity.

Thoughts?

Comments (31)

  1. Amir Kamil

    I wasn't involved in this discussion, but here is a relevant comment that's in the spec sources:

    % I think we were wrong, we cannot use the future constructor to build
    % ready futures as it creates ambiguity with the copy constructor:
    % does constructing a future<T> with a future<T> produce a copy of that
    % future or a future<future<T>>? If you said future<T>, then there's an
    % inconsistency with future(future<A>, future<B>) which would deduce
    % future<future<A>,future<B>> instead of future<A,B>.
    
  2. Former user Account Deleted

    @bonachea's proposal is to use to_future semantics to resolve the ambiguity. But another ambiguity/inconsistency comes from future<>(), should that be never ready like future<T, ...>() or immediately-ready like to_future<>()? If forced to pick one, I would go with immediately-ready, but there's a chance someone might have been expecting the other behavior.

  3. Dan Bonachea reporter

    But another ambiguity/inconsistency comes from future<>(), should that be never ready like future<T, ...>() or immediately-ready like to_future<>()? If forced to pick one, I would go with immediately-ready,

    I mentioned this at the end of the proposal, and I'm in agreement that future<>() should represent an immediately-ready future.

    but there's a chance someone might have been expecting the other behavior.

    I understand the semantic need for a never-ready future as the default constructor for a future<T>, since we cannot create a meaningful value of T out of thin air (unless we additionally require T be default constructible, another possible solution). However I have trouble imagining any situation where an application would ever actually want to construct a never-ready future<T> value, except as a placeholder created at object allocation time that needs to be overwritten with a "real" future<T> before it is ever used. Even if some contrived example found a use for a never-ready future, I think we can agree it's unlikely to be useful to the majority of real applications. So I'm not worried about making it harder to "spell" the incantation to construct a never-ready future, especially if doing so improves the syntax that we expect to be common usage in real applications.

  4. john bachan

    I think I have some cold water to throw on this proposal. For matters of performance, future<T> isn't a real type in the implementation. It's just an alias for this: using future<T...> = future1<MostGeneralFuture, T...> where MostGeneralFuture is a type encoding compile-time information about how the future was constructed (in the default case, it encodes that we have no special information about the future and need to track everything at runtime). This trick allows optimizations like the following:

    // the type-signature of make_future is like:
    future1<ReadyFuture,T...> make_future(T...);
    
    // given that and fancy implementation internals, this...
    make_future(1).then([](int one) { cout<<one; });
    
    // has the same runtime performance as...
    ([](int one) { cout<<one; })(1);
    
    // i.e. no future metadata allocations, no virtual dispatch,
    // just a direct and potentially inlineable function call
    

    I can argue for the importance of these optimizations if anyone disputes it.

    By calling the future constructor directly (which again is future1<MostGeneralFuture,T...>), this proposal would encourage users to always be instantiating the slowest possible future. I don't think we want to make that the easy-mode.
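    The kind-tag idea can be sketched with a toy model. Everything below is an assumption made for illustration (the tags ready_kind and general_kind, the shared_state struct, and a then() that returns the callback result directly rather than a new future); it is not the UPC++ implementation.

```cpp
#include <functional>
#include <memory>
#include <utility>

// Hypothetical kind tags; the real encoding may differ.
struct ready_kind {};    // known at compile time to be ready
struct general_kind {};  // nothing known; readiness is tracked at run time

template <typename Kind, typename T>
class future1;

// Ready kind: the value lives inline and then() is a plain call.
// Simplification: this toy then() returns the callback's result directly
// instead of wrapping it in another future.
template <typename T>
class future1<ready_kind, T> {
  T value_;
public:
  constexpr explicit future1(T v) : value_(v) {}

  template <typename Fn>
  constexpr auto then(Fn fn) const -> decltype(fn(std::declval<const T&>())) {
    return fn(value_);
  }
};

// General kind: heap-allocated shared state plus a type-erased callback.
template <typename T>
struct shared_state {
  bool ready = false;
  T value{};
  std::function<void(const T&)> callback;
};

template <typename T>
class future1<general_kind, T> {
  std::shared_ptr<shared_state<T>> st_;
public:
  future1() : st_(std::make_shared<shared_state<T>>()) {}

  void then(std::function<void(const T&)> fn) {
    if (st_->ready) fn(st_->value);      // run-time readiness check
    else st_->callback = std::move(fn);  // deferred until fulfill()
  }
  void fulfill(T v) {
    st_->value = std::move(v);
    st_->ready = true;
    if (st_->callback) st_->callback(st_->value);
  }
};

// The factory returns the ready kind, so the fast path is chosen by type.
template <typename T>
constexpr future1<ready_kind, T> make_future(T v) {
  return future1<ready_kind, T>(v);
}

struct add_one {
  constexpr int operator()(int x) const { return x + 1; }
};

// The whole chain is evaluated at compile time: no allocation, no dispatch.
static_assert(make_future(1).then(add_one{}) == 2,
              "ready-kind then() collapses to a direct call");
```

    In this model a constructor spelled future1<general_kind, T> always pays for the shared state, whereas the factory can pick the cheaper kind because its return type is free to vary.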

  5. Dan Bonachea reporter

    @jdbachan : I think I'd like to hear more details about why this optimization is important. IMHO any user who directly writes something like make_future(1).then([](int one) { cout<<one; }); deserves what they get. I hope we can agree that optimizing such obviously silly cases is not a primary concern. Perhaps the idea is this obvious silly case would be separated by some user-written abstraction boundaries that make it less obvious in the source program, but in a way the optimization can still handle?

    In my limited experience, the main reason to create trivially ready futures is when constructing the base case for a chain of futures constructed at runtime, such as this snippet from the programmer's guide:

            upcxx::future<> f = upcxx::make_future();
            for (int i = 1; i < upcxx::rank_n(); i++) {
               // construct the chain of futures
               f = upcxx::when_all(f, fetch(all_hits, i).then([&](int64_t rhit) { hits += rhit; }));
            }
            // wait for the chain to complete
            f.wait();
    

    (FWIW I'd really like the base case in this particular idiom to eventually be expressed using a constructor, in this case a default constructor).

    The key point here is the "trivial" future has been placed into an application-declared upcxx::future<> variable, which is subsequently overwritten with "non-trivial" futures at runtime. Does this usage case (which I expect will be common) benefit at all from your described optimization?

    If not, do you have a more realistic usage case which you'd expect to be important in practice that does benefit from this optimization?

    If so, can you explain in more detail why your optimization requires free-function syntax and cannot be applied when using constructor syntax?

  6. john bachan

    Abstraction boundaries are exactly where this optimization becomes non-silly. As an example, consider a software module which wants to run its internal code at some point after some asynchronous client work.

    // BAD way to write API, incurs full overhead of dynamic future
    void cleanup_after(future<int> client_stuff) {
      client_stuff.then(/*cleanup using client int somehow*/);
    }
    
    // Good API, overheads adapt to calling context
    template<typename Ugly>
    void cleanup_after(future1<Ugly,int> client_stuff) {
      client_stuff.then(/*cleanup*/);
    }
    
    // client code ----------------------------------------
    
    cleanup_after(future<int>(7)); // incurs dynamic allocation and readiness check
    
    cleanup_after(make_future(7)); // incurs no future overheads, everything is inlineable and immediate
    
    // Compiler error with good API regardless of proposed constructors: cannot deduce Ugly
    // bad API works only with proposed future constructors
    cleanup_after(7);
    
  7. Dan Bonachea reporter

    John - I'm not sure I understand your example. If both sides of this API are in client code, it sounds like you are suggesting that clients would declare functions using internal UPC++ types and parametric over their template arguments? If that's the use case you are trying to support I'd argue that approach has much bigger problems (ie client code should not be making assumptions about the implementation internals of future).

  8. john bachan

    We haven't spec'd that typedef future = future1<...>, but I think we should do that someday to enable performant abstractions like this. By not providing the proposed constructor we are future-proofing our encouraged coding practice so that when/if we expose future1, old code will immediately see benefit.

    There is a slightly less problematic case when futures are returned from client code. Consider a callback that is given a buffer reference and is expected to return a future when that buffer has been consumed (we could do this for rpc_ff and serialize_view):

    like_rpc_ff(..., [](T *buf) {
      // consume buf synchronously...
    
      // lambdas have return type inference
      return future<>(); // dynamic future
      // or...
      return make_future<>(); // trivial future
    });
    

    When the implementation of like_rpc_ff chains buffer cleanup code onto the lambda, that chaining will become trivial if the right kind of future1 is inferred as the return type. This example has a workaround though where the like_rpc_ff implementor could accept lambdas returning types of either void or future<>, and use static chaining in the first case. Unless we open up other parts of our future internals, doing that type dispatch is tricky. I think it would be simpler for the like_rpc_ff implementor to just have a single code path using future1<Ugly,T...> and use our future combinators.
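    The void-or-future<> workaround amounts to a compile-time dispatch on the callback's return type. A minimal sketch of that selection step, using a toy future and hypothetical names (chain_policy, static_chain, dynamic_chain) that do not exist in UPC++:

```cpp
#include <type_traits>
#include <utility>

template <typename... T>
struct future {};  // toy stand-in for upcxx::future

struct static_chain {};   // callback returned void: cleanup may run inline
struct dynamic_chain {};  // callback returned future<>: cleanup is chained

template <typename Ret>
struct chain_policy;  // left undefined: other return types are rejected

template <>
struct chain_policy<void> { using type = static_chain; };

template <>
struct chain_policy<future<>> { using type = dynamic_chain; };

// Policy selected from what Fn returns when called with a Buf*.
template <typename Fn, typename Buf>
using chain_policy_for = typename chain_policy<
    decltype(std::declval<Fn>()(std::declval<Buf*>()))>::type;

// Two example callbacks with the two accepted return types.
struct sync_consumer {
  void operator()(int*) const {}
};
struct async_consumer {
  future<> operator()(int*) const { return future<>(); }
};

static_assert(
    std::is_same<chain_policy_for<sync_consumer, int>, static_chain>::value,
    "void-returning callbacks select static chaining");
static_assert(
    std::is_same<chain_policy_for<async_consumer, int>, dynamic_chain>::value,
    "future<>-returning callbacks select dynamic chaining");
```

    An implementor would still need two code paths keyed on the selected tag, which is the extra complexity the single future1<Ugly,T...> path avoids.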

  9. Amir Kamil

    In our 3/2/18 meeting, it was noted that we wanted to remove the multi-argument nature of to_future(). I do not see any discussion of that here. We already provide a make_future() that takes a raw value to a future. We also have when_all() that combines multiple futures. Here are options for to_future():

    • Retain the current semantics of combining a variadic set of futures and non-futures into a single future.
    • Convert a single future or non-future into a future.
    • Discard the function template entirely.

    @jdbachan @bonachea What do you prefer?

  10. Dan Bonachea reporter

    @akamil I thought the discussion in the meeting was in reference to impl issue 78 which notes that multi-arg to_future() has never been implemented, and thus the spec and impl are out of compliance. First and foremost I want our implementation and specification to match as closely as possible, because it's misleading and confusing to users otherwise, so I'd like to fix that deficiency.

    Regarding the best ultimate design, my vote is still with the proposal in this issue: we provide a multi-argument constructor for future that does what I believe 99% of users will want/expect - ie the specified but not implemented semantics of to_future, which is then obsolete. make_future would remain only for the uncommon/advanced case of intentionally constructing a future<future<T>>, which I believe will be exceedingly rare in real application use.

  11. Amir Kamil

    I understand that we want to put the spec and impl in compliance, but that doesn't answer the question of what the right semantics are for to_future() in this release. One or both of the spec and impl need to be modified, but in what way? In the spec, there is an example that relies on the current semantics of to_future(), and I don't want to muck with it until we make a decision on this.

    As for the long-term solution, I am opposed to a constructor being the mechanism for doing this. Prior to C++17, constructors do not do template argument deduction, which means that users would have to name the actual resulting future type in order to invoke the constructor.
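    The deduction point can be seen with a toy class template. The names ready_future and make_ready below are hypothetical stand-ins, not UPC++ API; the sketch only contrasts naming the type at a constructor call with letting a factory function deduce it.

```cpp
#include <tuple>
#include <type_traits>
#include <utility>

// Toy always-ready future that just owns its result values.
template <typename... T>
struct ready_future {
  std::tuple<T...> results;
  explicit ready_future(T... v) : results(std::move(v)...) {}
};

// Factory function: template arguments are deduced from the call.
template <typename... T>
ready_future<typename std::decay<T>::type...> make_ready(T&&... v) {
  return ready_future<typename std::decay<T>::type...>(std::forward<T>(v)...);
}

inline void example() {
  // Constructor syntax before C++17: every type parameter must be spelled.
  ready_future<int, double, char> a(1, 2.5, 'c');

  // Factory syntax: the same type is deduced from the arguments.
  auto b = make_ready(1, 2.5, 'c');

  static_assert(std::is_same<decltype(a), decltype(b)>::value,
                "both spellings produce the same type");
  (void)a;
  (void)b;
}
```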

  12. BrianS

    which means that users would have to name the actual resulting future type in order to invoke the constructor.

    And that would be an ugly bit of code to write. It would make sense for the variadic standalone functions to deduce type.

    auto m = make_future(f1, r2, r3, f4); // fine, recursive variadic function taking futures and ready types  
    

    but note that this currently results in future<future< >, value, value, future< >> as the m return object. Do we want to have future collapsing? future<future<T> > decays into future<T>?

    I think we would lose some of the magic typing of future1 objects, which carry knowledge of how they were built. I do not see a clear case for the need for to_future and I think futures built from null or a ready value should be immediately ready and the user should have to go through the extra effort with promises if they really want a never-ready future.

  13. Amir Kamil

    Brian, what you are arguing for is exactly what the currently specified semantics of to_future() are. It is variadic, does future collapsing, and produces a ready future if only supplied with raw values. make_future() does not do future collapsing, and it always produces a ready future.

  14. john bachan

    I would like to change the spec to meet the impl, that is to make to_future single argument. Its sole purpose would be to collapse where a single-argument call of make_future would nest. I think this separates concerns well. If the user wants the variadic case (which we have no real-world motivating examples for yet, right?) they can fairly simply combine when_all and to_future, which I think generates more readable code that, albeit verbosely, expresses the intent better than a "super call".

    I am not against the existence of the "super call" that is both to_future and when_all. Dan would like to see it become the constructor; I am against that for reasons stated in this thread, but would not object to introducing it under its own name, say when_all_as_future. Even giving when_all the super behavior in the presence of non-future arguments I see as preferable to to_future getting super behavior in the variadic case. This is because I see "super behavior" as the one correct generalization of when_all, and as "more obvious" than generalizing to_future from single-argument to variadic (consider that to_future generalizes to variadic by leveraging when_all semantics, but why not when_any or when_any_two_of, when_any_three_of etc?).

    If we make the spec of to_future single-argument we can always later generalize that to variadic.

  15. BrianS

    So I can combine futures, with collapsing, by using something like

    auto f1 = ... ;
    auto f2 = ... ;
    double val = ....;
    auto results = when_all(to_future(f1), to_future(f2), to_future(val));
    

    This gives me collapsing and concatenation with just a little more typing, but it makes the types easier to figure out. when_all takes futures, to_future either promotes a ready value to a ready future, or passes the future through unchanged. I think this all holds up.

    I don't know what the return type of when_any would be though. We would need to know the type of the returned object at compile time. I think we can work on progressively ready collections in a later version of the spec.

  16. Dan Bonachea reporter

    To summarize what appears to be the emerging consensus here, we'll have four basic ways to synchronously construct futures (that don't involve injecting communication, scheduling callbacks, or explicitly creating a promise), which differ in subtle but important ways:

    • future() constructor - Basically useless because it creates a never-ready future. Only used for default constructing future values you plan to overwrite.
    • make_future(a1,a2,...) - input is 0 or more future or non-future args, output is a future<a1, a2, ...> concatenating them without collapsing.
      • Notably the only function that can construct a ready future of any type (future<> or future<T> )
    • to_future(arg) - input is exactly 1 future<T> or non-future T, output is a future<T>
      • Notably this is the only way to "collapse" an arg that may or may not be a future into a 1-level future
    • when_all(f1,f2...) - inputs are 0 or more args that must be futures, output is one concatenated future of their values
      • Can construct a ready, empty future: future<>, but not a ready non-empty future (without help from other calls)
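    The differences in result types listed above can be modeled at the type level. This is a sketch over a toy future; the aliases make_future_t, to_future_t and when_all_t are hypothetical names for "the type the corresponding call would return".

```cpp
#include <type_traits>

template <typename... T>
struct future {};  // toy stand-in for upcxx::future

// make_future: wraps its arguments as-is, so a future argument nests.
template <typename... A>
using make_future_t = future<A...>;

// to_future: exactly one argument; a future passes through unchanged.
template <typename A>
struct to_future_impl { using type = future<A>; };
template <typename... T>
struct to_future_impl<future<T...>> { using type = future<T...>; };
template <typename A>
using to_future_t = typename to_future_impl<A>::type;

// when_all: every argument must be a future; parameter lists concatenate.
// A non-future argument matches no specialization and fails to compile.
template <typename... F>
struct when_all_impl;
template <>
struct when_all_impl<> { using type = future<>; };
template <typename... T>
struct when_all_impl<future<T...>> { using type = future<T...>; };
template <typename... T, typename... U, typename... Rest>
struct when_all_impl<future<T...>, future<U...>, Rest...> {
  using type = typename when_all_impl<future<T..., U...>, Rest...>::type;
};
template <typename... F>
using when_all_t = typename when_all_impl<F...>::type;

static_assert(std::is_same<make_future_t<future<int>>,
                           future<future<int>>>::value,
              "make_future nests a future argument");
static_assert(std::is_same<to_future_t<future<int>>, future<int>>::value,
              "to_future collapses a future argument");
static_assert(std::is_same<when_all_t<to_future_t<future<int>>,
                                      to_future_t<double>>,
                           future<int, double>>::value,
              "when_all over to_future gives collapse plus concatenation");
```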

    My personal opinion is this seems like an overly complicated design space for something that should be straightforward - ie the basic task is wrapping a new future around zero or more "somethings", and I can see potential users asking why we have four different ways to spell that with subtly different restrictions on each and no one fully-general entry point.

    Assuming we stick with this design, we'll need to educate users about these differences so they know how to choose the right tool for a given circumstance.

  17. Dan Bonachea reporter

    I think the one users will care about is when_all(). The others I see as advanced usage.

    I don't agree - I think future chaining is very much common/expected usage, and the base case of a future chain often requires constructing a trivially-ready future. The Programmer's Guide currently advises users to call make_future for this purpose (when_all() can construct a trivially-ready future<>, but not a trivially ready future<int> if your base case has a value). And of course the chaining itself requires when_all, so that's at least two in common usage.

  18. Amir Kamil

    We discussed this at length during our 3/10/18 meeting. The consensus was that make_future(), to_future(), and when_all() represent distinct concepts and should not be merged. The purpose of make_future() is to construct a future from a value. when_all() is a conjunction combinator, and we can conceivably add other combinators in the future as John mentioned above. to_future() is primarily of use to power users/template metaprogrammers for collapsing rather than nesting futures. We do not expect the average user to have a need for it.

  19. Dan Bonachea reporter
    • changed status to open

    Re-opening this issue based on our discussion in zoom on 2019-12-12.

    We'd really like to make this less awkward for examples like ex4 in the RMA DHT, which is currently solved like this:

          return rget(e.ptr, dest_ptr, e.count) // fetch data
                .then([=]{ return upcxx::make_future(dest_ptr, e.count); });
    

    I'd much rather be able to write this:

          return to_future(rget(e.ptr, dest_ptr, e.count), dest_ptr, e.count);
    
  20. Amir Kamil

    This should currently work:

          return when_all(rget(e.ptr, dest_ptr, e.count), make_future(dest_ptr, e.count));
    

    This seems clearer to me than a call to to_future.

  21. Amir Kamil

    I am not opposed to extending when_all() to do future promotion, and @john bachan previously indicated he sees that as a better option than changing to_future. I agree with that – it’s clear that when_all() is a conjunction combinator, which wouldn't be obvious with to_future.

    The implementation seems straightforward to me, though @john bachan would have to sign off on this to make sure that the references are being forwarded correctly:

    --- a/src/future/when_all.hpp
    +++ b/src/future/when_all.hpp
    @@ -18,7 +18,20 @@ namespace upcxx {
         };
    
         template<typename AnsKind, typename ...AnsT,
    -             typename ArgKind, typename ...ArgT,
    +             typename Arg, // non-future type
    +             typename ...MoreArgs>
    +    struct when_all_return_cat<
    +        /*Ans=*/future1<AnsKind, AnsT...>,
    +        /*ArgDecayed...=*/Arg, MoreArgs...
    +      > {
    +      using type = typename when_all_return_cat<
    +          future1<AnsKind, AnsT..., Arg>,
    +          MoreArgs...
    +        >::type;
    +    };
    +
    +    template<typename AnsKind, typename ...AnsT,
    +             typename ArgKind, typename ...ArgT, // future type
                  typename ...MoreArgs>
         struct when_all_return_cat<
             /*Ans=*/future1<AnsKind, AnsT...>,
    @@ -43,13 +56,13 @@ namespace upcxx {
         template<typename ...ArgFu>
         when_all_return_t<ArgFu...> when_all_fast(ArgFu &&...arg) {
           return typename when_all_return_t<ArgFu...>::impl_type(
    -        static_cast<ArgFu&&>(arg)...
    +        to_fast_future(static_cast<ArgFu&&>(arg))...
           );
         }
         // single component optimization
         template<typename ArgFu>
    -    ArgFu&& when_all_fast(ArgFu &&arg) {
    -      return static_cast<ArgFu&&>(arg);
    +    auto when_all_fast(ArgFu &&arg) -> decltype(to_fast_future(arg))&& {
    +      return to_fast_future(static_cast<ArgFu&&>(arg));
         }
       }
    
    @@ -67,13 +80,13 @@ namespace upcxx {
       template<typename ...ArgFu>
       detail::when_all_return_t<ArgFu...> when_all(ArgFu &&...arg) {
         return typename detail::when_all_return_t<ArgFu...>::impl_type(
    -      static_cast<ArgFu&&>(arg)...
    +      detail::to_fast_future(static_cast<ArgFu&&>(arg))...
         );
       }
       // single component optimization
       template<typename ArgFu>
    -  ArgFu&& when_all(ArgFu &&arg) {
    -    return static_cast<ArgFu&&>(arg);
    +  auto when_all(ArgFu &&arg) -> decltype(detail::to_fast_future(arg))&& {
    +    return detail::to_fast_future(static_cast<ArgFu&&>(arg));
       }
    
     }
    

    Thoughts on this?
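    For readers who want the intended behavior without the implementation details, here is a value-level toy of when_all() with promotion. It assumes always-ready futures that simply own a std::tuple, and the helpers promote and wrap are hypothetical; it does not model to_fast_future or reference forwarding. It relies on C++14 constexpr support in std::tuple.

```cpp
#include <tuple>
#include <type_traits>
#include <utility>

// Toy future that is always ready and owns its result tuple.
template <typename... T>
struct ready_future {
  std::tuple<T...> results;
};

// Promotion step: a non-future value becomes a one-element tuple...
template <typename V>
struct promote {
  static constexpr std::tuple<V> results(const V& v) {
    return std::tuple<V>(v);
  }
};

// ...while a future hands over its existing result tuple.
template <typename... T>
struct promote<ready_future<T...>> {
  static constexpr std::tuple<T...> results(const ready_future<T...>& f) {
    return f.results;
  }
};

// Rewrap a concatenated tuple as a toy future.
template <typename... T>
constexpr ready_future<T...> wrap(std::tuple<T...> t) {
  return ready_future<T...>{std::move(t)};
}

// Conjunction with promotion: concatenate every argument's results in order.
template <typename... Args>
constexpr auto when_all(const Args&... args)
    -> decltype(wrap(std::tuple_cat(promote<Args>::results(args)...))) {
  return wrap(std::tuple_cat(promote<Args>::results(args)...));
}

constexpr auto combined = when_all(ready_future<int>{std::tuple<int>(7)},
                                   2.5, ready_future<>{}, 'c');

static_assert(std::is_same<std::decay<decltype(combined)>::type,
                           ready_future<int, double, char>>::value,
              "futures are flattened and values are promoted");
static_assert(std::get<0>(combined.results) == 7, "first result kept");
static_assert(std::get<2>(combined.results) == 'c', "last result kept");
```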

  22. Amir Kamil

    I should also mention that as far as I can tell, when_all() does not optimize by future1 kind. (Looking at @john bachan to confirm.) So my proposed implementation would be no more efficient than the user explicitly inserting calls to make_future(). (My proposed implementation does use detail::to_fast_future() in case it does or will in the future optimize this.)
