UPC++ Version 1.0


Nov 10, 2020: UPC++ Tutorial at SC20 - available to registrants for on-demand streaming through May 2021!

Oct 30, 2020: A new UPC++ 2020.10.0 stable release is now available! We've also released a prototype using GPUDirect for native memory kinds support.

Aug 26, 2020: UPC++ Users Interactive Webinar - Slides from the event

Feb 1, 2020: A new UPC++ Training site is now available, including video tutorials!

Lawrence Berkeley National Laboratory is hiring!
The Computer Languages and System Software Group (CLaSS) in the Computing Research Division at LBNL is recruiting for the following positions:

Latest Stable Downloads:

Memory Kinds Prototype with native GDR support

  • In addition to the current stable release, this is a prototype release of UPC++ which contains a new GPUDirect RDMA (GDR) native implementation of memory kinds for NVIDIA-branded CUDA devices with Mellanox-branded InfiniBand network adapters.

Training Materials

  • Learning to use the library


  • Includes Pagoda group publications and citation information for the documentation.

User Testimonials

  • See what real users have to say about UPC++!


UPC++ is a C++ library that supports Partitioned Global Address Space (PGAS) programming, and is designed to interoperate smoothly and efficiently with MPI, OpenMP, CUDA and AMTs. It leverages GASNet-EX to deliver low-overhead, fine-grained communication, including Remote Memory Access (RMA) and Remote Procedure Call (RPC).

Design Philosophy

UPC++ exposes a PGAS memory model, including one-sided communication (RMA and RPC). However, it departs from the approaches taken by some predecessors such as UPC. These changes reflect a design philosophy that encourages the UPC++ programmer to directly express what can be implemented efficiently (i.e., without a need for parallel compiler analysis).

  1. Most operations are non-blocking, and the powerful synchronization mechanisms encourage applications to design for aggressive asynchrony.

  2. All communication is explicit - there is no implicit data motion.

  3. UPC++ encourages the use of scalable data-structures and avoids non-scalable library features.

What Features Comprise UPC++?

  • RMA. UPC++ provides asynchronous one-sided communication (Remote Memory Access, a.k.a. Put and Get) for movement of data among processes.

  • RPC. UPC++ provides asynchronous Remote Procedure Call for running code (including C++ lambdas) on other processes.

  • Futures, promises and continuations. Futures are central to handling asynchronous operation of RMA and RPC. UPC++ uses a continuation-based model to express task dependencies.

  • Global pointers and memory kinds. UPC++ provides uniform interfaces for RMA transfers among host and device memories, including a reference implementation for CUDA GPUs. The 2020.11.0 prototype implements accelerated GPU memory transfers on compatible hardware. Future releases will continue to refine this capability.

  • Remote atomics use an abstraction that enables efficient offload where hardware support is available.

  • Distributed objects. UPC++ enables construction of a scalable distributed object from any C++ object type, with one instance on each rank of a team. RPC can be used to access remote instances.

  • Serialization. UPC++ introduces several complementary mechanisms for efficiently passing large and/or complicated data arguments to RPCs.

  • Non-contiguous RMA. UPC++ provides functions for non-contiguous data transfers directly on shared memory, for example to efficiently copy or transpose sections of N-dimensional dense arrays.

  • Teams represent ordered sets of processes and play a role in collective communication. Initially we support barrier, broadcast and reductions, including abstractions to enable offload of reductions supported in hardware.

  • Progress guarantees. Because UPC++ has no internal service threads, the library makes progress only when a core enters an active UPC++ call. However, the "persona" concept makes writing progress threads simple.

A comparison to the feature set of UPC++ v0.1 is also available.

Notable applications/kernels/frameworks using UPC++:

Other related software:

  • upcxx-extras: UPC++ extra examples and optional extensions
  • upcxx-utils: Set of utilities layered over UPC++, authored by the HipMer group
  • Berkeley UPC: Now supports hybrid UPC/UPC++ applications!
  • GASNet-EX: The portable, high-performance communication runtime used by UPC++
  • MRG8: An efficient, high-period PRNG with skip-ahead, designed for exascale HPC


UPC++ is developed and maintained by the Pagoda Project at Lawrence Berkeley National Laboratory (LBNL), and is funded primarily by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration.

Previous Releases:

Contact Info