Provide a universal variadic factory for future
Motivation
Chains of futures are important to UPC++ programming, and make_future is emerging as a prominent base case for constructing such chains (e.g., see guide issue 22).
The current spec has three free functions, make_future, when_all, and to_future, all of which are effectively acting as a future constructor, differing only slightly in details of how the arguments are handled - details that users usually don't care about. However, the only C++ constructor we currently have for future is the mostly-useless default constructor that constructs a never-ready future (only really useful as a value to be overwritten).
We could potentially simplify the interface and make it more approachable by providing a variadic constructor for future that captures the common case and acts as sugar for one or more of the free functions, giving users the behavior they usually want.
Proposal:
template <typename ...T>
future <CTypes...>::future( T ...futures_or_results );
Given a variadic list of futures and/or non-futures as arguments, constructs a future representing the readiness of all the arguments that are futures. The results tuple of this future will be the concatenation of the result tuples of each future argument and the values of each non-future argument, in the order in which each argument occurs in futures_or_results. The type parameters of the returned object (CTypes...) are the concatenation of the type parameter lists of the future types in T and the non-future types themselves in T, in the order in which each type appears in T. If none of the arguments are futures, then the resulting future object is trivially ready.
(i.e., the same behavior as the current fully-general to_future, but declared as a C++ constructor).
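The type rule in the proposal can be sketched at the type level. This is only an illustration: toy_future, cat, and result_of_ctor_t are hypothetical names standing in for the real upcxx::future machinery, and reference/cv decay of the arguments is ignored.

```cpp
#include <type_traits>

// Hypothetical stand-in for upcxx::future; only the type list matters here.
template <typename... T> struct toy_future {};

// cat<Acc, Args...> folds Args, left to right, into the accumulated type Acc.
template <typename Acc, typename... Args> struct cat;

// No arguments left: the accumulator is the result.
template <typename... R>
struct cat<toy_future<R...>> { using type = toy_future<R...>; };

// Non-future argument: append the type itself.
template <typename... R, typename V, typename... Rest>
struct cat<toy_future<R...>, V, Rest...>
    : cat<toy_future<R..., V>, Rest...> {};

// Future argument: append its whole result type list. This specialization is
// more specialized than the one above, so it is chosen for future arguments.
template <typename... R, typename... F, typename... Rest>
struct cat<toy_future<R...>, toy_future<F...>, Rest...>
    : cat<toy_future<R..., F...>, Rest...> {};

// Result type of the proposed constructor for a given argument type list.
template <typename... Args>
using result_of_ctor_t = typename cat<toy_future<>, Args...>::type;

static_assert(std::is_same<
                  result_of_ctor_t<toy_future<int>, double, toy_future<>, char>,
                  toy_future<int, double, char>>::value,
              "future arguments are flattened, values are appended in order");
static_assert(std::is_same<result_of_ctor_t<>, toy_future<>>::value,
              "no arguments yields future<>");
```

Note that under this rule a future argument never nests, which is why future<future<T>> would need a separate spelling.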
Discussion
If we believe this can be implemented efficiently, it could be used to replace all three free functions for the majority of uses (and possibly remove them entirely). I think the only usage case not captured would be calling make_future(f) where you intentionally want to construct a result of type future<future<T>>, but that should be an uncommon case, so it seems fine to have a "special" spelling for that.
If we accept this proposal, we might also want to modify the default constructor future<>() to specify a trivially-ready future, for uniformity.
Thoughts?
Comments (31)
-
-
Account Deleted @bonachea's proposal is to use to_future semantics to resolve the ambiguity. But another ambiguity/inconsistency comes from future<>(): should that be never ready like future<T, ...>() or immediately-ready like to_future<>()? If forced to pick one, I would go with immediately-ready, but there's a chance someone might have been expecting the other behavior.
-
reporter But another ambiguity/inconsistency comes from future<>(), should that be never ready like future<T, ...>() or immediately-ready like to_future<>()? If forced to pick one, I would go with immediately-ready,
I mentioned this at the end of the proposal, and I'm in agreement that future<>() should represent an immediately-ready future.
but there's a chance someone might have been expecting the other behavior.
I understand the semantic need for a never-ready future as the default constructor for a future<T>, since we cannot create a meaningful value of T out of thin air (unless we additionally require T be default constructible, another possible solution). However I have trouble imagining any situation where an application would ever actually want to construct a never-ready future<T> value, except as a placeholder created at object allocation time that needs to be overwritten with a "real" future<T> before it is ever used. Even if some contrived example found a use for a never-ready future, I think we can agree it's unlikely to be useful to the majority of real applications. So I'm not worried about making it harder to "spell" the incantation to construct a never-ready future, especially if doing so improves the syntax that we expect to be common usage in real applications.
-
I think I have some cold water to throw on this proposal. For matters of performance, future<T> isn't a real type in the implementation. It's just an alias for this:

    template<typename ...T>
    using future = future1<MostGeneralFuture, T...>;

where MostGeneralFuture is a type encoding compile-time information about how the future was constructed (in the default case, it encodes that we have no special information about the future and need to track everything at runtime). This trick allows optimizations like the following:

    // the type-signature of make_future is like:
    future1<ReadyFuture,T...> make_future(T...);

    // given that and fancy implementation internals, this...
    make_future(1).then([](int one) { cout<<one; });
    // has the same runtime performance as...
    ([](int one) { cout<<one; })(1);
    // i.e. no future metadata allocations, no virtual dispatch,
    // just a direct and potentially inlineable function call

I can argue for the importance of these optimizations if anyone disputes it.
By calling the future constructor directly (which again is future1<MostGeneralFuture,T...>), this proposal would encourage users to always be instantiating the slowest possible future. I don't think we want to make that the easy mode.
-
reporter - changed milestone to 2018.03.31 release
-
reporter @jdbachan: I think I'd like to hear more details about why this optimization is important. IMHO any user who directly writes something like

    make_future(1).then([](int one) { cout<<one; });

deserves what they get. I hope we can agree that optimizing such obviously silly cases is not a primary concern. Perhaps the idea is this obvious silly case would be separated by some user-written abstraction boundaries that make it less obvious in the source program, but in a way the optimization can still handle?
In my limited experience, the main reason to create trivially ready futures is when constructing the base case for a chain of futures constructed at runtime, such as this snippet from the programmer's guide:

    upcxx::future<> f = upcxx::make_future();
    for (int i = 1; i < upcxx::rank_n(); i++) {
      // construct the chain of futures
      f = upcxx::when_all(f, fetch(all_hits, i).then([&](int64_t rhit) { hits += rhit; }));
    }
    // wait for the chain to complete
    f.wait();

(FWIW I'd really like the base case in this particular idiom to eventually be expressed using a constructor, in this case a default constructor).
The key point here is that the "trivial" future has been placed into an application-declared upcxx::future<> variable, which is subsequently overwritten with "non-trivial" futures at runtime. Does this usage case (which I expect will be common) benefit at all from your described optimization?
If not, do you have a more realistic usage case which you'd expect to be important in practice that does benefit from this optimization?
If so, can you explain in more detail why your optimization requires free-function syntax and cannot be applied when using constructor syntax?
-
Abstraction boundaries are exactly where this optimization becomes non-silly. As an example, consider a software module which wants to run its internal code at some point after some asynchronous client work.
    // BAD way to write API, incurs full overhead of dynamic future
    void cleanup_after(future<int> client_stuff) {
      client_stuff.then(/*cleanup using client int somehow*/);
    }

    // Good API, overheads adapt to calling context
    template<typename Ugly>
    void cleanup_after(future1<Ugly,int> client_stuff) {
      client_stuff.then(/*cleanup*/);
    }

    // client code ----------------------------------------
    cleanup_after(future<int>(7)); // incurs dynamic allocation and readiness check
    cleanup_after(make_future(7)); // incurs no future overheads, everything is inlineable and immediate

    // Compiler error with good API regardless of proposed constructors: cannot deduce Ugly
    // bad API works only with proposed future constructors
    cleanup_after(7);
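To make the kind-tagging idea concrete, here is a self-contained toy model. It is not the UPC++ implementation: toy_future1, ready_kind, general_kind, and make_toy_future are hypothetical names, and the "general" variant only approximates the runtime bookkeeping a real dynamic future needs.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

struct ready_kind {};    // value known at construction time
struct general_kind {};  // nothing known statically; state lives on the heap

template <typename Kind, typename T> class toy_future1;

// Statically ready: then() is a direct, inlineable call with no allocation.
template <typename T>
class toy_future1<ready_kind, T> {
  T value_;
public:
  explicit toy_future1(T v) : value_(std::move(v)) {}
  template <typename Fn>
  void then(Fn&& fn) const { std::forward<Fn>(fn)(value_); }
};

// Fully general: shared heap state, runtime readiness check, type-erased callbacks.
template <typename T>
class toy_future1<general_kind, T> {
  struct state {
    bool ready;
    T value;
    std::vector<std::function<void(T const&)>> waiting;
  };
  std::shared_ptr<state> st_;
public:
  toy_future1() : st_(std::make_shared<state>()) {}  // not ready until fulfill()
  explicit toy_future1(T v) : st_(std::make_shared<state>()) { fulfill(std::move(v)); }
  void fulfill(T v) {
    st_->value = std::move(v);
    st_->ready = true;
    for (auto& cb : st_->waiting) cb(st_->value);  // run queued callbacks
    st_->waiting.clear();
  }
  template <typename Fn>
  void then(Fn&& fn) {
    if (st_->ready) fn(st_->value);
    else st_->waiting.emplace_back(std::forward<Fn>(fn));
  }
};

// The user-facing name is the most general kind, as in the alias described above.
template <typename T> using toy_future = toy_future1<general_kind, T>;

// The factory returns the cheapest kind it can prove.
template <typename T>
toy_future1<ready_kind, T> make_toy_future(T v) {
  return toy_future1<ready_kind, T>(std::move(v));
}

// "Good API" from the example: generic over the kind, so overhead adapts to the caller.
template <typename Kind>
void cleanup_after(toy_future1<Kind, int> client_stuff, int& log) {
  client_stuff.then([&log](int v) { log += v; });
}

inline int demo_kinds() {
  int log = 0;
  cleanup_after(make_toy_future(7), log);  // Kind = ready_kind: plain call
  cleanup_after(toy_future<int>(7), log);  // Kind = general_kind: heap state + check
  toy_future<int> pending;                 // never ready until fulfilled
  cleanup_after(pending, log);             // callback is queued
  pending.fulfill(1);                      // queued callback runs here
  return log;
}
```

In this model both ready calls add 7 and the deferred one adds 1, so demo_kinds() returns 15; only the general_kind paths touch the heap or std::function.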
-
reporter John - I'm not sure I understand your example. If both sides of this API are in client code, it sounds like you are suggesting that clients would declare functions using internal UPC++ types and parametric over their template arguments? If that's the use case you are trying to support I'd argue that approach has much bigger problems (ie client code should not be making assumptions about the implementation internals of future).
-
We haven't spec'd that typedef future = future1<...>, but I think we should do that someday to enable performant abstractions like this. By not providing the proposed constructor we are future-proofing our encouraged coding practice so that when/if we expose future1, old code will immediately see benefit.
There is a slightly less problematic case when futures are returned from client code. Consider a callback that is given a buffer reference and is expected to return a future when that buffer has been consumed (we could do this for rpc_ff and serialize_view):

    like_rpc_ff(..., [](T *buf) {
      // consume buf synchronously...
      // lambdas have return type inference
      return future<>(); // dynamic future
      // or...
      return make_future<>(); // trivial future
    });

When the implementation of like_rpc_ff chains buffer cleanup code onto the lambda, that chaining will become trivial if the right kind of future1 is inferred as the return type. This example has a workaround though where the like_rpc_ff implementor could accept lambdas returning types of either void or future<>, and use static chaining in the first case. Unless we open up other parts of our future internals, doing that type dispatch is tricky. I think it would be simpler for the like_rpc_ff implementor to just have a single code path using future1<Ugly,T...> and use our future combinators.
-
In our 3/2/18 meeting, it was noted that we wanted to remove the multi-argument nature of to_future(). I do not see any discussion of that here. We already provide a make_future() that takes a raw value to a future. We also have when_all() that combines multiple futures. Here are options for to_future():
- Retain the current semantics of combining a variadic set of futures and non-futures into a single future.
- Convert a single future or non-future into a future.
- Discard the function template entirely.
@jdbachan @bonachea What do you prefer?
-
reporter @akamil I thought the discussion in the meeting was in reference to impl issue 78, which notes that multi-arg to_future() has never been implemented, and thus the spec and impl are out of compliance. First and foremost I want our implementation and specification to match as closely as possible, because it's misleading and confusing to users otherwise, so I'd like to fix that deficiency.
Regarding the best ultimate design, my vote is still with the proposal in this issue: we provide a multi-argument constructor for future that does what I believe 99% of users will want/expect - i.e., the specified but not implemented semantics of to_future, which is then obsolete. make_future would remain only for the uncommon/advanced case of intentionally constructing a future<future<T>>, which I believe will be exceedingly rare in real application use.
-
I understand that we want to put the spec and impl in compliance, but that doesn't answer the question of what the right semantics are for to_future() in this release. One or both of the spec and impl need to be modified, but in what way? In the spec, there is an example that relies on the current semantics of to_future(), and I don't want to muck with it until we make a decision on this.
As for the long-term solution, I am opposed to a constructor being the mechanism for doing this. Prior to C++17, constructors do not do template argument deduction, which means that users would have to name the actual resulting future type in order to invoke the constructor.
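The deduction point can be seen in a small standalone example. toy_future and make_toy_future here are hypothetical illustrations, not UPC++ types: before C++17 the constructor call must spell out the full type list, while a factory function template deduces it from the arguments.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical always-ready future holding its results in a tuple.
template <typename... T>
struct toy_future {
  std::tuple<T...> results;
  explicit toy_future(T... v) : results(std::move(v)...) {}
};

// Factory function: template arguments are deduced from the call.
template <typename... T>
toy_future<typename std::decay<T>::type...> make_toy_future(T&&... v) {
  return toy_future<typename std::decay<T>::type...>(std::forward<T>(v)...);
}

inline int demo_deduction() {
  toy_future<int, double> spelled_out(1, 2.5);  // C++11/14: type list written by hand
  auto deduced = make_toy_future(1, 2.5);       // factory: type list deduced
  static_assert(std::is_same<decltype(spelled_out), decltype(deduced)>::value,
                "both spellings produce the same type");
  return std::get<0>(spelled_out.results) + std::get<0>(deduced.results);
}
```

From C++17 on, class template argument deduction would let a constructor call omit the type list as well, but that was not available to C++11/14 users of the library.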
-
which means that users would have to name the actual resulting future type in order to invoke the constructor.
And that would be an ugly bit of code to write. It would make sense for the variadic standalone functions to deduce type.
    auto m = make_future(f1, r2, r3, f4); // fine, recursive variadic function taking futures and ready types

but note that this currently results in future<future< >, value, value, future< >> as the m return object. Do we want to have future collapsing, where future<future<T> > decays into future<T>?
I think we would lose some of the magic typing of future1 objects, which carry knowledge of how they were built. I do not see a clear case for the need for to_future and I think futures built from null or a ready value should be immediately ready and the user should have to go through the extra effort with promises if they really want a never-ready future.
-
Brian, what you are arguing for is exactly what the currently specified semantics of to_future() are. It is variadic, does future collapsing, and produces a ready future if only supplied with raw values. make_future() does not do future collapsing, and it always produces a ready future.
-
Is it possible to build to_future? I think I had to_future and make_future confused.
-
I added a proposed implementation to impl issue 78.
-
I would like to change the spec to meet the impl, that is to make to_future single argument. Its sole purpose would be to collapse where a single-argument call of make_future would nest. I think this separates concerns well. If the user wants the variadic case (which we have no real-world motivating examples for yet, right?) they can fairly simply combine when_all and to_future, which I think generates more readable code that, albeit verbosely, expresses the intent better than a "super call".
I am not against the existence of the "super call" that is both to_future and when_all. Dan would like to see it become the constructor; I am against that for reasons stated in this thread, but would not object to introducing it under its own name, say when_all_as_future. Even giving when_all the super behavior in the presence of non-future arguments I see as preferable to to_future getting super behavior in the variadic case. This is because I see the argument for "super behavior" as the one correct generalization of when_all as "more obvious" than generalizing the gap of to_future from single to variadic (consider that to_future generalizes to variadic by leveraging when_all semantics, but why not when_any or when_any_two_of, when_any_three_of, etc.?).
If we make the spec of to_future single-argument we can always later generalize that to variadic.
-
So I can combine futures, with collapsing, by using something like

    auto f1 = ... ;
    auto f2 = ... ;
    double val = ....;
    auto results = when_all(to_future(f1), to_future(f2), to_future(val));

This gives me collapsing and concatenating with just a little more typing, but it makes the types easier to figure out. when_all takes futures; to_future either promotes a ready value to a ready future, or passes the future through unchanged. I think this all holds up.
I don't know what the return type of when_any would be though. We would need to know the type of the returned object at compile time. I think we can work on progressively ready collections in a later version of the spec.
-
reporter To summarize what appears to be the emerging consensus here, we'll have four basic ways to synchronously construct futures (that don't involve injecting communication, scheduling callbacks, or explicitly creating a promise), which differ in subtle but important ways:
- future() constructor - Basically useless because it creates a never-ready future. Only used for default constructing future values you plan to overwrite.
- make_future(a1,a2,...) - input is 0 or more future or non-future args, output is a future<a1, a2, ...> concatenating them without collapsing.
  - Notably the only function that can construct a ready future of any type (future<> or future<T>)
- to_future(arg) - input is exactly 1 future<T> or non-future T, output is a future<T>
  - Notably this is the only way to "collapse" an arg that may or may not be a future into a 1-level future
- when_all(f1,f2...) - inputs are 0 or more args that must be futures, output is one concatenated future of their values
  - Can construct a ready, empty future: future<>, but not a ready non-empty future (without help from other calls)
My personal opinion is this seems like overly complicated design space for something that should be straightforward - ie the basic task is wrapping a new future around zero or more "somethings", and I can see potential users asking why we have four different ways to spell that with subtly different restrictions on each and no one fully-general entry point.
Assuming we stick with this design, we'll need to educate users about these differences so they know how to choose the right tool for a given circumstance.
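The nesting versus collapsing distinction between make_future and single-argument to_future, as summarized above, can be illustrated purely with types. The names below (toy_future, make_future_result, to_future_result) are hypothetical and model only the result types, not readiness or values.

```cpp
#include <type_traits>

// Hypothetical stand-in for upcxx::future; only the type list matters here.
template <typename... T> struct toy_future {};

// make_future-style: wrap the arguments as given, so a future argument nests.
template <typename... A>
struct make_future_result { using type = toy_future<A...>; };

// to_future-style (single argument): wrap a plain value...
template <typename A>
struct to_future_result { using type = toy_future<A>; };
// ...but pass an existing future through unchanged (collapse).
template <typename... U>
struct to_future_result<toy_future<U...>> { using type = toy_future<U...>; };

// For plain values the two agree.
static_assert(std::is_same<make_future_result<int>::type, toy_future<int>>::value,
              "make_future(int) gives future<int>");
static_assert(std::is_same<to_future_result<int>::type, toy_future<int>>::value,
              "to_future(int) gives future<int>");

// For future arguments they differ.
static_assert(std::is_same<make_future_result<toy_future<int>>::type,
                           toy_future<toy_future<int>>>::value,
              "make_future(future<int>) nests");
static_assert(std::is_same<to_future_result<toy_future<int>>::type,
                           toy_future<int>>::value,
              "to_future(future<int>) collapses");
```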
-
I think the one users will care about is when_all(). The others I see as advanced usage.
-
reporter I think the one users will care about is when_all(). The others I see as advanced usage.
I don't agree - I think future chaining is very much common/expected usage, and the base case of a future chain often requires constructing a trivially-ready future. The Programmer's Guide currently advises users to call make_future for this purpose (when_all() can construct a trivially-ready future<>, but not a trivially ready future<int> if your base case has a value). And of course the chaining itself requires when_all, so that's at least two in common usage.
-
We discussed this at length during our 3/10/18 meeting. The consensus was that make_future(), to_future(), and when_all() represent distinct concepts and should not be merged. The purpose of make_future() is to construct a future from a value. when_all() is a conjunction combinator, and we can conceivably add other combinators in the future as John mentioned above. to_future() is primarily of use to power users/template metaprogrammers for collapsing rather than nesting futures. We do not expect the average user to have a need for it.
-
- changed status to resolved
Dan's summary above is in the spec as of 6dc338b.
-
reporter - changed status to open
Re-opening this issue based on our discussion in zoom on 2019-12-12.
We'd really like to make this less awkward for examples like ex4 in the RMA DHT, which is currently solved like this:
    return rget(e.ptr, dest_ptr, e.count) // fetch data
      .then([=]{ return upcxx::make_future(dest_ptr, e.count); });
I'd much rather be able to write this:
return to_future(rget(e.ptr, dest_ptr, e.count), dest_ptr, e.count);
-
This should currently work:
return when_all(rget(e.ptr, dest_ptr, e.count), make_future(dest_ptr, e.count));
This seems clearer to me than a call to to_future.
-
I am not opposed to extending when_all() to do future promotion, and @john bachan previously indicated he sees that as a better option than changing to_future. I agree with that: it's clear that when_all() is a conjunction combinator, which would not be obvious with to_future.
The implementation seems straightforward to me, though @john bachan would have to sign off on this to make sure that the references are being forwarded correctly:
    --- a/src/future/when_all.hpp
    +++ b/src/future/when_all.hpp
    @@ -18,7 +18,20 @@ namespace upcxx {
         };

         template<typename AnsKind, typename ...AnsT,
    -             typename ArgKind, typename ...ArgT,
    +             typename Arg, // non-future type
    +             typename ...MoreArgs>
    +    struct when_all_return_cat<
    +        /*Ans=*/future1<AnsKind, AnsT...>,
    +        /*ArgDecayed...=*/Arg, MoreArgs...
    +      > {
    +      using type = typename when_all_return_cat<
    +          future1<AnsKind, AnsT..., Arg>,
    +          MoreArgs...
    +        >::type;
    +    };
    +
    +    template<typename AnsKind, typename ...AnsT,
    +             typename ArgKind, typename ...ArgT, // future type
                  typename ...MoreArgs>
         struct when_all_return_cat<
             /*Ans=*/future1<AnsKind, AnsT...>,
    @@ -43,13 +56,13 @@ namespace upcxx {
         template<typename ...ArgFu>
         when_all_return_t<ArgFu...> when_all_fast(ArgFu &&...arg) {
           return typename when_all_return_t<ArgFu...>::impl_type(
    -        static_cast<ArgFu&&>(arg)...
    +        to_fast_future(static_cast<ArgFu&&>(arg))...
           );
         }
         // single component optimization
         template<typename ArgFu>
    -    ArgFu&& when_all_fast(ArgFu &&arg) {
    -      return static_cast<ArgFu&&>(arg);
    +    auto when_all_fast(ArgFu &&arg) -> decltype(to_fast_future(arg))&& {
    +      return to_fast_future(static_cast<ArgFu&&>(arg));
         }
       }

    @@ -67,13 +80,13 @@ namespace upcxx {
       template<typename ...ArgFu>
       detail::when_all_return_t<ArgFu...> when_all(ArgFu &&...arg) {
         return typename detail::when_all_return_t<ArgFu...>::impl_type(
    -      static_cast<ArgFu&&>(arg)...
    +      detail::to_fast_future(static_cast<ArgFu&&>(arg))...
         );
       }
       // single component optimization
       template<typename ArgFu>
    -  ArgFu&& when_all(ArgFu &&arg) {
    -    return static_cast<ArgFu&&>(arg);
    +  auto when_all(ArgFu &&arg) -> decltype(detail::to_fast_future(arg))&& {
    +    return detail::to_fast_future(static_cast<ArgFu&&>(arg));
       }
     }
Thoughts on this?
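For readers who want to experiment with the semantics outside of UPC++, here is a toy model of when_all with value promotion. It is only a sketch under strong assumptions: toy_future is a hypothetical always-ready future that is just a tuple of results, and as_tuple, from_tuple, and toy_when_all are illustrative names that are not part of the library or of the patch above.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical always-ready future: just a tuple of results.
template <typename... T>
struct toy_future { std::tuple<T...> results; };

// A future argument contributes its whole result tuple...
template <typename... T>
std::tuple<T...> as_tuple(toy_future<T...> const& f) { return f.results; }
// ...and any other argument is promoted to a one-element tuple.
template <typename V>
std::tuple<V> as_tuple(V const& v) { return std::tuple<V>(v); }

// Rewrap a tuple of results as a future with the matching type list.
template <typename... T>
toy_future<T...> from_tuple(std::tuple<T...> t) {
  return toy_future<T...>{std::move(t)};
}

// Conjunction with promotion: concatenate the contributions in argument order.
template <typename... Args>
auto toy_when_all(Args const&... args)
    -> decltype(from_tuple(std::tuple_cat(as_tuple(args)...))) {
  return from_tuple(std::tuple_cat(as_tuple(args)...));
}

inline int demo_when_all() {
  toy_future<int> f{std::make_tuple(1)};
  toy_future<> done{};
  auto g = toy_when_all(f, 2.5, done, 'x');  // mix of futures and plain values
  static_assert(std::is_same<decltype(g), toy_future<int, double, char>>::value,
                "futures are flattened and values are promoted");
  return std::get<0>(g.results) + (std::get<2>(g.results) == 'x' ? 1 : 0);
}
```

With promotion, the DHT example earlier in the thread corresponds to a single call of this shape, with one future argument followed by two plain values.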
-
I should also mention that as far as I can tell, when_all() does not optimize by future1 kind. (Looking at @john bachan to confirm.) So my proposed implementation would be no more efficient than the user explicitly inserting calls to make_future(). (My proposed implementation does use detail::to_fast_future() in case it does or will in the future optimize this.)
-
Upon further consultation with @john bachan, when_all() does optimize for fast futures, so it's worth providing this value promotion. We resolved to do so in our 2020-07-29 meeting. Proposed implementation is in implementation PR 241.
-
reporter - changed title to Provide a universal variadic factory for future
-
assigned issue to
updating title to reflect our resolution
-
- changed status to resolved
Future promotion in when_all. Add missing preconditions. Fixes
#104.→ <<cset bea409d9cbb4>>
-
Merged in akamil/upcxx-spec/issue104 (pull request #50)
Future promotion in when_all. Add missing preconditions. Fixes
#104.Approved-by: Dan Bonachea dobonachea@lbl.gov
→ <<cset 3c2c6f6e3680>>
I wasn't involved in this discussion, but here is a relevant comment that's in the spec sources: