UPC++ Version 1.0 versus Version 0.1
Transition from v0.1 to v1.0
In November 2016 we froze the old UPC++ repository as part of a transition phase that ended with the September 2017 release of UPC++ v1.0. The UPC++ v0.1 repository remains frozen and will receive no further maintenance.
UPC++ v1.0 adds new capabilities (some of which were experimental in v0.1), removes some v0.1 features, and modifies others. The table at the end of this document lists the UPC++ features of v0.1 (left) alongside the planned additions, deletions and changes in v1.0 (right).
What features have been added relative to v0.1?
Futures, promises and continuations. Whereas v0.1 used an event-based mechanism for expressing task dependencies, v1.0 relies on a continuation-based model instead.
Progress guarantees. UPC++ v1.0 has better-defined progress semantics than v0.1, especially in multi-threaded scenarios.
Remote atomics. These were experimental in v0.1 and did not necessarily use available hardware support. v1.0 can take advantage of that support, which yields significant performance benefits for certain combinations of hardware and application.
Distributed objects. UPC++ v1.0 distributed objects have no direct analogue in v0.1, but they subsume v0.1's distributed shared arrays.
View-based Serialization. UPC++ v1.0 introduces a mechanism for efficiently passing large and/or complicated data arguments to RPCs.
Non-contiguous RMA. UPC++ v1.0 expands and generalizes the support for non-contiguous RMA relative to v0.1.
Teams represent ordered sets of processes, and are similar to MPI_Group. Teams were experimental in v0.1, but are fully supported in v1.0.
Memory kinds. UPC++ provides uniform interfaces for transfers between memory with different properties. Beginning in the 2019.3.0 release, UPC++ provides a prototype implementation for CUDA GPUs. Future releases will refine this capability, and may expand this to include other forms of non-host memory.
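The continuation-based model named in the futures item above can be illustrated without a UPC++ installation. The following is a minimal single-threaded sketch in standard C++17. `ToyState`, `ToyPromise`, `ToyFuture` and `run_pipeline` are hypothetical names invented for this illustration and are not the UPC++ API; in UPC++ v1.0 itself, `upcxx::future` and `upcxx::promise` fill these roles, with `then` attaching a continuation. The sketch shows only the core idea: work is attached to a value before that value exists, and runs once the value is supplied.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Toy model only; none of these types belong to UPC++.
template <typename T>
struct ToyState {
    std::optional<T> value;                                // empty until fulfilled
    std::vector<std::function<void(const T&)>> callbacks;  // pending continuations
};

template <typename T> class ToyFuture;

// Producer side: supplies the value exactly once.
template <typename T>
class ToyPromise {
public:
    ToyPromise() : state_(std::make_shared<ToyState<T>>()) {}
    ToyFuture<T> get_future() const { return ToyFuture<T>(state_); }
    void fulfill(T v) const {
        state_->value = std::move(v);
        auto pending = std::move(state_->callbacks);
        state_->callbacks.clear();
        for (auto& cb : pending) cb(*state_->value);  // run attached continuations
    }
private:
    std::shared_ptr<ToyState<T>> state_;
};

// Consumer side: query readiness or attach a continuation.
template <typename T>
class ToyFuture {
public:
    explicit ToyFuture(std::shared_ptr<ToyState<T>> s) : state_(std::move(s)) {}
    bool ready() const { return state_->value.has_value(); }
    const T& result() const { return *state_->value; }  // precondition: ready()

    // Returns a future for fn's result (fn must return a non-void value).
    template <typename F>
    auto then(F fn) const {
        using R = decltype(fn(std::declval<const T&>()));
        ToyPromise<R> next;
        auto run = [fn, next](const T& v) { next.fulfill(fn(v)); };
        if (ready()) run(*state_->value);       // value already here: run now
        else state_->callbacks.push_back(run);  // otherwise defer
        return next.get_future();
    }
private:
    std::shared_ptr<ToyState<T>> state_;
};

// Hypothetical pipeline: double the incoming value, then render it as text.
inline std::string run_pipeline(int incoming) {
    ToyPromise<int> source;
    ToyFuture<std::string> done =
        source.get_future()
            .then([](const int& x) { return x * 2; })
            .then([](const int& x) { return std::to_string(x); });
    assert(!done.ready());     // nothing has run: the value is not yet available
    source.fulfill(incoming);  // supplying the value triggers the whole chain
    return done.result();
}
```

Under these assumptions, calling `run_pipeline(21)` returns the string `"42"`. The point to note is that each `then` returns a new future, so dependencies are expressed by chaining operations rather than by tracking separate completion objects.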
What has been removed from UPC++ v0.1?
In developing UPC++ v1.0 we also strove for simplicity and we have removed some obsolete features present in v0.1:
Multidimensional arrays (local only). We plan to interoperate with third-party solutions for multidimensional arrays.
Distributed shared arrays - this functionality has been subsumed by generalized distributed objects, which provide a more scalable solution.
Blocking communication (e.g. implicit global pointer dereference)
|Feature|v0.1|v1.0|
|---|---|---|
|Futures, Continuations, Promises||✔|
|Events|✔|Subsumed by futures, continuations, promises|
|Put and Get|✔|✔|
|Distributed 1D Arrays|✔|Subsumed by distributed objects|
|RPC|✔|✔ Serialization improvements|
|Global Pointer Dereference|✔ (Implicit blocking)||
|Memory Kinds (e.g. GPU)||✔|
|Shared Scalar Variables|✔ (Little use)||
|Non-Distributed MD Arrays|✔ ndarray prototype||
|Progress Guarantees|✔|✔ More rigorous|