UPC++ Version 1.0
September 26, 2018: We are proud to announce a new v2018.9.0 release of UPC++.
- UPC++ Implementation, v2018.9.0
- Contains everything you need to start using UPC++ on supported platforms
- Installation automatically downloads the GASNet-EX communication library (internet connection required)
- Note this release has not yet been tuned for performance
- Includes all of the documentation
- See README.md (includes ChangeLog) and INSTALL.md
- UPC++ Programmer's Guide v2018.9.0
- A gentle introduction to UPC++ with examples and descriptions.
- UPC++ Specification, v1.0 Draft 8
- Formal specification of the UPC++ library interface.
- Includes Pagoda group publications and citation information for the documentation.
UPC++ is a C++ library that supports Partitioned Global Address Space (PGAS) programming and is designed to interoperate smoothly and efficiently with MPI, OpenMP, CUDA and asynchronous many-task (AMT) runtimes. It leverages GASNet-EX to deliver low-overhead, fine-grained communication, including Remote Memory Access (RMA) and Remote Procedure Call (RPC).
UPC++ exposes a PGAS memory model, including one-sided communication (RMA and RPC). However, it departs from the approaches taken by some predecessors, such as UPC. These changes reflect a design philosophy that encourages the UPC++ programmer to directly express what can be implemented efficiently (i.e., without a need for parallel compiler analysis):
- Most operations are non-blocking, and the powerful synchronization mechanisms encourage applications to design for aggressive asynchrony.
- All communication is explicit; there is no implicit data motion.
- UPC++ encourages the use of scalable data structures and avoids non-scalable library features.
What Features Comprise UPC++?
RMA. UPC++ provides asynchronous one-sided communication (Remote Memory Access, a.k.a. Put and Get) for movement of data among processes.
RPC. UPC++ provides asynchronous Remote Procedure Call for running code (including C++ lambdas) on other processes.
Futures, promises and continuations. Futures are central to handling asynchronous operation of RMA and RPC. UPC++ uses a continuation-based model to express task dependencies.
Progress guarantees. Because UPC++ has no internal service threads, the library makes progress only when a core enters an active UPC++ call. However, the "persona" concept makes writing progress threads simple.
Remote atomics use an abstraction that enables efficient offload where hardware support is available.
Distributed objects. UPC++ enables construction of a scalable distributed object from any C++ object type, with one instance on each rank of a team. RPC can be used to access remote instances.
View-based Serialization. UPC++ introduces a mechanism for efficiently passing large and/or complicated data arguments to RPCs.
Non-contiguous RMA. UPC++ provides functions for non-contiguous data transfers directly on shared memory, for example to efficiently copy or transpose sections of N-dimensional dense arrays.
Teams represent ordered sets of processes and play a role in collective communication. Initially we support barrier, broadcast and reductions, including abstractions to enable offload of reductions supported in hardware.
Memory kinds. UPC++ will provide uniform interfaces for transfers between memories with different properties, such as GPU memory, HBM, NUMA domains and NVRAM. [Not yet implemented]
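To make the futures, continuations and progress items above more concrete, here is a minimal single-process sketch in standard C++. It is a toy model and not the UPC++ API: `ProgressQueue`, `Future`, `async_read` and `demo_chain` are hypothetical names invented for illustration, and no communication takes place. It only shows the shape of the idea: an asynchronous operation returns a future immediately, continuations are chained onto it with `then`, and nothing completes until the program explicitly makes progress.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-in for a runtime with no service threads: queued work runs
// only when the caller invokes progress(). Hypothetical, not UPC++.
class ProgressQueue {
  std::deque<std::function<void()>> pending_;
public:
  void defer(std::function<void()> task) { pending_.push_back(std::move(task)); }

  // Runs every queued task and returns how many were executed.
  std::size_t progress() {
    std::size_t ran = 0;
    while (!pending_.empty()) {
      std::function<void()> task = std::move(pending_.front());
      pending_.pop_front();
      task();
      ++ran;
    }
    return ran;
  }
};

// Toy future: copies share one state; then() chains a continuation.
template <typename T>
class Future {
  struct State {
    bool ready = false;
    T value{};
    std::vector<std::function<void(const T&)>> callbacks;
  };
  std::shared_ptr<State> st_;
public:
  Future() : st_(std::make_shared<State>()) {}

  bool ready() const { return st_->ready; }
  const T& result() const { return st_->value; }  // meaningful only once ready()

  // Marks the future ready and runs the continuations registered so far.
  void fulfill(T v) {
    st_->value = std::move(v);
    st_->ready = true;
    auto callbacks = std::move(st_->callbacks);
    st_->callbacks.clear();
    for (auto& cb : callbacks) cb(st_->value);
  }

  // Returns a new future for f(value); f must return a non-void value
  // and runs once this future is ready.
  template <typename F>
  auto then(F f) -> Future<decltype(f(std::declval<const T&>()))> {
    using U = decltype(f(std::declval<const T&>()));
    Future<U> next;
    auto run = [next, f](const T& v) mutable { next.fulfill(f(v)); };
    if (st_->ready) run(st_->value);
    else st_->callbacks.push_back(run);
    return next;
  }
};

// Stand-in for an asynchronous read: the value is delivered only
// during progress(). The referenced cell must outlive that call.
Future<int> async_read(ProgressQueue& q, const int& cell) {
  Future<int> f;
  q.defer([f, &cell]() mutable { f.fulfill(cell); });
  return f;
}

// Chains two continuations, then drives progress; returns 42.
int demo_chain() {
  ProgressQueue q;
  int cell = 20;
  Future<int> doubled = async_read(q, cell).then([](const int& v) { return v * 2; });
  Future<int> plus_two = doubled.then([](const int& v) { return v + 2; });
  if (plus_two.ready()) return -1;  // not expected: no progress has been made yet
  q.progress();                     // the read completes and both continuations fire
  return plus_two.result();
}
```

Calling `demo_chain()` returns 42, and the chained future is not ready until `progress()` has run. UPC++ provides the real counterparts of these pieces (`upcxx::future`, its `then` member and `upcxx::progress`), which behave differently in detail; the Programmer's Guide and Specification describe their actual semantics.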
A comparison to the feature set of UPC++ v0.1 is also available.
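Non-contiguous RMA, listed among the features above, describes a transfer by base addresses, per-dimension strides and extents. The following standard C++ sketch shows that description at work on local memory only. `copy_strided_2d` and `transpose_section` are hypothetical helpers written for this illustration, with strides counted in elements as a simplifying assumption; they are not UPC++ functions, and the real UPC++ calls (such as `upcxx::rput_strided`) operate on global pointers and complete asynchronously.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical local helper (not a UPC++ function): copies an
// extent[0] x extent[1] block, with strides given in elements.
template <typename T>
void copy_strided_2d(const T* src, std::array<std::ptrdiff_t, 2> src_stride,
                     T* dst, std::array<std::ptrdiff_t, 2> dst_stride,
                     std::array<std::ptrdiff_t, 2> extent) {
  for (std::ptrdiff_t i = 0; i < extent[0]; ++i)
    for (std::ptrdiff_t j = 0; j < extent[1]; ++j)
      dst[i * dst_stride[0] + j * dst_stride[1]] =
          src[i * src_stride[0] + j * src_stride[1]];
}

// Transposes the 2 x 3 section at rows 1..2, columns 1..3 of a 3 x 4
// row-major array into a dense 3 x 2 row-major buffer.
std::vector<int> transpose_section() {
  const std::ptrdiff_t rows = 3, cols = 4;
  std::vector<int> a(rows * cols);
  for (std::ptrdiff_t r = 0; r < rows; ++r)
    for (std::ptrdiff_t c = 0; c < cols; ++c)
      a[r * cols + c] = static_cast<int>(10 * r + c);  // element (r, c) holds 10*r + c

  std::vector<int> out(3 * 2);
  copy_strided_2d(a.data() + 1 * cols + 1, {cols, 1},  // source: step by rows, then columns
                  out.data(), {1, 2},                  // destination: strides swapped to transpose
                  {2, 3});
  return out;  // {11, 21, 12, 22, 13, 23}
}
```

Swapping the destination strides is all it takes to turn a section copy into a transpose, which is why a stride-based description suits dense multidimensional arrays. Consult the UPC++ Specification for the exact signatures and stride units of the library's own non-contiguous transfer functions.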
- UPC++ Support Forum (email) - Best place to ask questions or browse prior discussions
- UPC++ Issue Tracker - Bug reporting, feature requests, etc