UPC++ Version 1.0
Mar 31, 2022: A new UPC++ 2022.3.0 release is now available for download!
Nov 15, 2021: UPC++ Tutorial at SC21 - now available on-demand!
A UPC++ Training site is now available, including video tutorials!
Latest Stable Downloads:
- UPC++ Implementation 2022.3.0 (tar.gz)
- UPC++ Programmer's Guide (PDF)
- A gentle introduction to UPC++ with examples and descriptions.
- Also available online as a single HTML page
- UPC++ Specification (PDF)
- Formal specification of the UPC++ library interface.
- UPC++ Extras (Repo)
- Optional extensions, including a new dist_array class template for scalable distributed arrays
- Extended example codes and tutorial materials
- Learning to use the library
- Includes Pagoda group publications and citation information for the documentation.
- See what real users have to say about UPC++!
UPC++ is a C++ library that supports Partitioned Global Address Space (PGAS) programming, and is designed to interoperate smoothly and efficiently with MPI, OpenMP, CUDA, ROCm HIP and other HPC frameworks. It leverages GASNet-EX to deliver low-overhead, fine-grained communication, including Remote Memory Access (RMA) and Remote Procedure Call (RPC).
UPC++ exposes a PGAS memory model, including one-sided communication (RMA and RPC). However, it departs from the approaches taken by some predecessors such as UPC. These changes reflect a design philosophy that encourages the UPC++ programmer to directly express what can be implemented efficiently (i.e., without a need for parallel compiler analysis).
Most operations are non-blocking, and the powerful synchronization mechanisms encourage applications to design for aggressive asynchrony.
All communication is explicit - there is no implicit data motion.
UPC++ encourages the use of scalable data-structures and avoids non-scalable library features.
What Features Comprise UPC++?
RMA. UPC++ provides asynchronous one-sided communication (Remote Memory Access, a.k.a. Put and Get) for movement of data among processes.
RPC. UPC++ provides asynchronous Remote Procedure Call for running code (including C++ lambdas) on other processes.
Futures, promises and continuations. Futures are central to handling asynchronous operation of RMA and RPC. UPC++ uses a continuation-based model to express task dependencies.
Global pointers and memory kinds. UPC++ provides uniform interfaces for RMA transfers among host and device memories, including acceleration of GPU memory transfers via RDMA offload on compatible hardware. Future releases will continue to refine this capability.
Remote atomics use an abstraction that enables efficient offload where hardware support is available.
Distributed objects. UPC++ enables construction of a scalable distributed object from any C++ object type, with one instance on each rank of a team. RPC can be used to access remote instances.
Serialization. UPC++ introduces several complementary mechanisms for efficiently passing large and/or complicated data arguments to RPCs.
Non-contiguous RMA. UPC++ provides functions for non-contiguous RMA data transfers to/from arrays in shared memory, for example to efficiently copy or transpose sections of N-dimensional dense arrays.
Teams represent ordered sets of processes and play a role in collective communication. Currently we support barrier, broadcast and reductions, including abstractions to enable offload of reductions supported in hardware.
Progress guarantees. Because UPC++ has no internal service threads, the library makes progress only when a core enters an active UPC++ call. However, the "persona" concept makes writing progress threads simple.
A comparison to the feature set of UPC++ v0.1 is also available.
Notable applications/kernels/frameworks using UPC++:
- HipMer and MetaHipMer 2: Extreme-Scale De Novo Genome Assemblers, developed by the ECP ExaBiome Project
- SIMCoV: 3-D agent-based cellular-level model of COVID-19 viral progression in human lungs (press release)
- symPACK: A sparse symmetric matrix direct linear solver
- mel-upx: Implements half-approximate graph matching (BoF slides), developed by the ECP ExaGraph Co-Design Center.
- SWE-UPC++/Pond: Shallow Water Equations for tsunami simulation, using the UPC++ Actor Library
- Berkeley Container Library (BCL): A cross-platform C++ library of distributed data structures, with backends for GASNet-EX and UPC++
- ConvergentMatrix: A dense matrix abstraction for distributed-memory HPC platforms, used to implement the SEMUCB-WM1 tomographic model.
Other related software:
- upcxx-extras: UPC++ extra examples and optional extensions
- upcxx-utils: Set of utilities layered over UPC++, authored by the HipMer group
- Berkeley UPC: Supports hybrid UPC/UPC++ applications
- GASNet-EX: The portable, high-performance communication runtime used by UPC++
- MRG8: An efficient, high-period PRNG with skip-ahead, designed for exascale HPC
UPC++ is developed and maintained by the Pagoda Project in the CLaSS Group at Lawrence Berkeley National Laboratory (LBNL), and is funded primarily by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration.
Previous releases:
- 2021.9.0: [Implementation] [Guide] [Specification] [Announcement]
- 2021.3.0: [Implementation] [Specification] [Announcement]
- 2020.11.0: [Implementation (memory kinds prototype)] [Specification (draft)] [Announcement]
- 2020.10.0: [Implementation] [Guide] [Specification] [Announcement]
- 2020.3.2: [Implementation] [Guide] [Specification] [Announcement]
- 2020.3.0: [Implementation] [Guide] [Specification] [Announcement]
- 2019.9.0: [Implementation] [Guide] [Specification] [Announcement]
- 2019.3.2: [Implementation] [Guide] [Specification] [Announcement 1] [Announcement 2]
- 2018.9.0: [Implementation] [Guide] [Specification] [Announcement]
- 2018.3.2: [Implementation] [Guide] [Specification] [Announcement]
- 2018.1.0: [Implementation] [Guide] [Specification] [Announcement]
- 2017.9.0: [Implementation] [Guide] [Specification] [Announcement]