UPC++ Version 1.0
November 10, 2020: UPC++ Tutorial at SC20 - now available for on-demand streaming! (SC20 tutorial registration required)
October 30, 2020: A new UPC++ 2020.10.0 stable release is now available! We've also released a prototype using GPUDirect for native memory kinds support.
February 1, 2020: A new UPC++ Training site is now available, including video tutorials!
Lawrence Berkeley National Laboratory is hiring!
The Computer Languages and System Software Group (CLaSS) in the Computing Research Division at LBNL is recruiting for the following positions:
- CLaSS Group Lead: upcxx.lbl.gov/class-lead
- C++ Programmer/Software Engineer: upcxx.lbl.gov/2020-cxx-dev
- HPC Application Developer: upcxx.lbl.gov/2020-hpc-dev
Latest Stable Downloads:
- UPC++ Implementation 2020.10.0 (tar.gz)
- UPC++ Programmer's Guide, Revision 2020.10.0 (PDF)
- A gentle introduction to UPC++ with examples and descriptions.
- Also available online as a single HTML page
- UPC++ Specification, Revision 2020.10.0 (PDF)
- Formal specification of the UPC++ library interface.
- UPC++ Extras (Repo)
- Optional extensions, including a new dist_array class template for scalable distributed arrays
- Extended example codes and tutorial materials
Memory Kinds Prototype with native GDR support
- In addition to the current stable release, this is a prototype release of UPC++ which contains a new GPUDirect RDMA (GDR) native implementation of memory kinds for NVIDIA-branded CUDA devices with Mellanox-branded InfiniBand network adapters.
- Learning to use the library
- Includes Pagoda group publications and citation information for the documentation.
- See what real users have to say about UPC++!
UPC++ is a C++ library that supports Partitioned Global Address Space (PGAS) programming, and is designed to interoperate smoothly and efficiently with MPI, OpenMP, CUDA and AMTs. It leverages GASNet-EX to deliver low-overhead, fine-grained communication, including Remote Memory Access (RMA) and Remote Procedure Call (RPC).
UPC++ exposes a PGAS memory model, including one-sided communication (RMA and RPC). However, it departs from the approaches taken by some predecessors such as UPC. These changes reflect a design philosophy that encourages the UPC++ programmer to directly express what can be implemented efficiently (i.e., without a need for parallel compiler analysis).
Most operations are non-blocking, and the powerful synchronization mechanisms encourage applications to design for aggressive asynchrony.
All communication is explicit - there is no implicit data motion.
UPC++ encourages the use of scalable data-structures and avoids non-scalable library features.
What Features Comprise UPC++?
RMA. UPC++ provides asynchronous one-sided communication (Remote Memory Access, a.k.a. Put and Get) for movement of data among processes.
RPC. UPC++ provides asynchronous Remote Procedure Call for running code (including C++ lambdas) on other processes.
Futures, promises and continuations. Futures are central to handling asynchronous operation of RMA and RPC. UPC++ uses a continuation-based model to express task dependencies.
Global pointers and memory kinds. UPC++ provides uniform interfaces for RMA transfers among host and device memories, including a reference implementation for CUDA GPUs. The 2020.11.0 prototype implements accelerated GPU memory transfers on compatible hardware. Future releases will continue to refine this capability.
Remote atomics use an abstraction that enables efficient offload where hardware support is available.
Distributed objects. UPC++ enables construction of a scalable distributed object from any C++ object type, with one instance on each rank of a team. RPC can be used to access remote instances.
Serialization. UPC++ introduces several complementary mechanisms for efficiently passing large and/or complicated data arguments to RPCs.
Non-contiguous RMA. UPC++ provides functions for non-contiguous data transfers directly on shared memory, for example to efficiently copy or transpose sections of N-dimensional dense arrays.
Teams represent ordered sets of processes and play a role in collective communication. Initially we support barrier, broadcast and reductions, including abstractions to enable offload of reductions supported in hardware.
Progress guarantees. Because UPC++ has no internal service threads, the library makes progress only when a core enters an active UPC++ call. However, the "persona" concept makes writing progress threads simple.
A comparison to the feature set of UPC++ v0.1 is also available.
Notable applications/kernels/frameworks using UPC++:
- HipMer: An Extreme-Scale De Novo Genome Assembler, developed by the ECP ExaBiome Project
- symPACK: A sparse symmetric matrix direct linear solver
- mel-upx: Implements half-approximate graph matching (BoF slides), developed by the ECP ExaGraph Co-Design Center.
- SWE-UPC++: Shallow Water Equations for tsunami simulation, using the UPC++ Actor Library
- Berkeley Container Library (BCL): A cross-platform C++ library of distributed data structures, with backends for GASNet and UPC++
- ConvergentMatrix: A dense matrix abstraction for distributed-memory HPC platforms, used to implement the SEMUCB-WM1 tomographic model.
Other related software:
- upcxx-extras: UPC++ extra examples and optional extensions
- upcxx-utils: Set of utilities layered over UPC++, authored by the HipMer group
- Berkeley UPC: Now supports hybrid UPC/UPC++ applications!
- GASNet-EX: The portable, high-performance communication runtime used by UPC++
- MRG8: An efficient, high-period PRNG with skip-ahead, designed for exascale HPC
UPC++ is developed and maintained by the Pagoda Project at Lawrence Berkeley National Laboratory (LBNL), and is funded primarily by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration.
Previous releases:
- 2020.3.2: [Implementation] [Guide] [Specification] [Announcement]
- 2020.3.0: [Implementation] [Guide] [Specification] [Announcement]
- 2019.9.0: [Implementation] [Guide] [Specification] [Announcement]
- 2019.3.2: [Implementation] [Guide] [Specification] [Announcement 1] [Announcement 2]
- 2018.9.0: [Implementation] [Guide] [Specification] [Announcement]
- 2018.3.2: [Implementation] [Guide] [Specification] [Announcement]
- 2018.1.0: [Implementation] [Guide] [Specification] [Announcement]
- 2017.9.0: [Implementation] [Guide] [Specification] [Announcement]