Program Structure

Issue #3 resolved
Dan Bonachea created an issue

Chapter 2: "main procedure of a UPC++ must call init before any user code". Might be a good idea to explicitly mention static initializers and the semantics of any code running there.

Should probably have a upcxx::is_init(), for the same composability reasons that MPI has an analogous call. Alternatively, define that we allow multiple calls to init and reference count matching calls to finalize before true tear-down.
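To make the two options concrete, here is a minimal sketch of how a status query and reference-counted init/finalize could fit together. Everything lives in a hypothetical `sketch` namespace; none of these definitions are the UPC++ implementation, and the sketch assumes single-threaded callers (no locking).

```cpp
#include <cassert>

namespace sketch {  // hypothetical namespace, NOT the UPC++ API

// Number of init() calls not yet matched by a finalize() call.
// A function-local static avoids depending on static-initialization
// order, so the counter is usable from static constructors too.
inline int& init_depth() {
    static int depth = 0;
    return depth;
}

// True while at least one init() is outstanding.
inline bool is_init() { return init_depth() > 0; }

// Only the first call would perform real start-up work;
// nested calls just bump the count.
inline void init() {
    if (init_depth()++ == 0) {
        // (assumed) real runtime start-up would go here
    }
}

// Only the call that matches the first init() would tear down.
inline void finalize() {
    assert(init_depth() > 0 && "finalize() without matching init()");
    if (--init_depth() == 0) {
        // (assumed) real runtime tear-down would go here
    }
}

}  // namespace sketch
```

With this shape, a library layered on top can call `sketch::init()` and `sketch::finalize()` in matched pairs without knowing whether the application has already initialized the runtime, which is the composability concern raised above.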

Chapter 2.2: "The current UPC++ implementation maps each rank to an OS process but it’s also possible to use OS threads or user-level Pthreads for UPC++ ranks."

Because UPC++ is a library instead of a compiler, this distinction is explicitly visible to the client via the behavior of global variables (even ignoring the hybrid application issues that expose this detail in UPC implementations). The distinction is especially relevant if a UPC++ thread calls pthread_create() or uses some other dynamic thread creation facility. Should probably make a stronger statement that each UPC++ rank is guaranteed to be a C++ process. If we realistically want to leave the door open for mapping multiple UPC++ ranks to individual threads in a process, then we'll probably need to provide a query of the execution mode and a facility for storing and retrieving rank-specific data (and discourage the use of C++ global variables). Note that existing thread-specific data facilities may be insufficient without a UPC++ wrapper that correctly maps calling pthread to library rank (especially for the case of a UPC++ client that creates its own dynamic threads within a rank).

Comments (6)

  1. Yili Zheng

    This is a very good topic for discussion. There are some design trade-offs and implementation constraints.

  2. Dan Bonachea reporter

    On the 5/6 telecon we made a few decisions relevant to this issue that seemed to mostly have consensus:

    • A UPC++ rank is a C++ process. This means regular C++ global variables and static data in externally-linked modules will automatically have one private copy per rank. It also means that upcxx::myrank() has a constant value process-wide.
    • Users may be permitted to create dynamic threads within that process (although we may require them to link a thread-safe version of the UPC++ implementation) but all such threads share the same UPC++ rank. A user who does this is explicitly assuming responsibility for any thread-safety issues that may arise within the process, especially when calling non-UPC++ modules.
    • Will add a upcxx::is_init() call to query the status of the UPC++ implementation. It returns true to indicate that UPC++ is fully initialized and all entry points are functional. A false return indicates that initialization has not yet completed and UPC++ is in a "pre-initialized" state.
    • May or may not require an explicit upcxx::init() call in main. We're fairly certain the argc/argv business will be a non-issue moving forward, but unsure about whether we want to require an explicit init or guarantee we can handle it automatically using the library's own static constructor. Either way we should probably prohibit explicit calls to upcxx::init() outside main.
    • While in the pre-init state (which in general may include all static constructors), only a very restricted set of UPC++ calls may be used. They will include:
      • upcxx::is_init()
      • Shared variable allocation calls
      • upcxx::preinit_ranks(), upcxx::preinit_myrank() - return the same values as later calls to ranks() and myrank(); they may run slower, but work correctly in the pre-init state (by possibly partially initializing the library on-the-fly)
      • upcxx::register_postinit(<function_pointer>); (Need to discuss the best function pointer mechanism to use for this) Registers a function callback that executes exactly once on this rank after initialization has occurred and UPC++ is fully functional. Note this registration can also be safely called after init in which case the callback function just runs synchronously. Should probably specify the callbacks are run in the same order they are registered - is this enough to safely allow collective calls inside these? (What are the ordering guarantees for C++ static constructors?)
      • Pre-init code is explicitly prohibited from other UPC++ operations, notably including shared variable access, collectives, etc.
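The register_postinit semantics proposed above (queue before init, run synchronously after init, execute exactly once in registration order) could be sketched roughly as follows. The class name `PostInitRegistry`, the `mark_initialized()` hook, and the use of `std::function` are assumptions for illustration only, since the callback mechanism is still undecided. Note also that C++ only orders static constructors within a single translation unit; across translation units the order is unspecified, so "registration order" from static constructors is not deterministic across modules.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

namespace sketch {  // hypothetical namespace, NOT the UPC++ API

class PostInitRegistry {
public:
    // Before initialization: queue the callback.
    // After initialization: run it synchronously, right away.
    void register_postinit(std::function<void()> fn) {
        assert(fn && "callback must be callable");
        if (initialized_) {
            fn();
        } else {
            pending_.push_back(std::move(fn));
        }
    }

    // (assumed hook) called by the runtime once it is fully functional.
    // Drains the queue in registration order; each callback runs once.
    void mark_initialized() {
        initialized_ = true;
        std::vector<std::function<void()>> ready;
        ready.swap(pending_);  // pending_ is now empty, so a repeat call is a no-op
        for (auto& fn : ready) fn();
    }

    bool is_init() const { return initialized_; }

private:
    bool initialized_ = false;
    std::vector<std::function<void()>> pending_;
};

}  // namespace sketch
```

Running the callbacks in the same order on every rank is necessary for collectives inside them to match up, but this sketch alone does not guarantee it: that would also require every rank to register the same callbacks in the same order.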

    There was also a great deal of discussion regarding C++ destructors and how they relate to library finalization requirements and behavior, normal and abnormal process exits and job teardown, but most of that remains unresolved.

  3. Yili Zheng

    bool upcxx::is_init() is added to the "develop" branch. upcxx::init() can be called multiple times. Other functions in this issue may no longer be necessary if we can implement the implicit call to init() as discussed in issue #9.
