Cleanup rank/process terminology

Issue #126 resolved
Dan Bonachea created an issue

The current UPC++ spec uses the terms "rank" and "process" interchangeably; e.g., the glossary includes:

Rank. An OS process that is a member of a UPC++ parallel job execution. UPC++ uses a SPMD execution model, and the number of ranks is fixed during a given program execution. The placement of ranks across physical processors or NUMA domains is implementation-dependent.

This usage was unambiguous when we only supported upcxx::world(), but it doesn't really work moving forward to multi-team and subset teams. It doesn't even really work for upcxx::local_team.

In particular, I think that to avoid confusion we need a way to discuss the ranks in a given team (i.e. the intrank_t values that name them) as distinct from the processes thus named. For example, the process corresponding to rank=4 in upcxx::world() may also be rank=0 in upcxx::local_team. So what "rank" is this process? The intrank_t value is only meaningful with respect to a given team: it's "half" of a coordinate naming a process. However, we also need a way to discuss objects with affinity to a process, a property which exists independently of any particular team, so it seems odd to say things like "shared object with affinity to this rank".
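To make the "half of a coordinate" point concrete, here is a minimal toy model in plain C++. It is only a sketch and does not use the UPC++ API: the names toy_team, toy_shared_object, process_id, make_world and make_local_team are hypothetical, and the 8-process layout with processes 4..7 sharing a node is an assumption chosen to match the rank=4 / rank=0 example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: these types are hypothetical and are not the UPC++ API.
using process_id = int;  // identifies an OS process, independent of any team
using rank_t = int;      // stands in for intrank_t: meaningful only within one team

// A team is modeled as an ordered list of member processes;
// a rank is simply an index into that list.
struct toy_team {
    std::vector<process_id> members;

    // Resolve the coordinate (this team, rank) to the process it names.
    process_id process_at(rank_t r) const {
        assert(r >= 0 && static_cast<std::size_t>(r) < members.size());
        return members[static_cast<std::size_t>(r)];
    }
};

// An object's affinity is recorded against a process, not against a rank.
struct toy_shared_object {
    process_id affinity;
};

// Assumed 8-process world in which the world rank equals the process id.
inline toy_team make_world() { return toy_team{{0, 1, 2, 3, 4, 5, 6, 7}}; }

// Assumed local team holding processes 4..7 (one node's worth of processes).
inline toy_team make_local_team() { return toy_team{{4, 5, 6, 7}}; }
```

In this model, make_world().process_at(4) and make_local_team().process_at(0) resolve to the same process, while a toy_shared_object stores only the process and needs no team to state its affinity.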

I think the fix here may be to follow the MPI terminology precedent and say "process" when we mean process (i.e. an OS process with private and shared segments, one or more threads, and membership in possibly many teams), and "rank" when we are referring to a rank in a team (which names a particular process). We've previously used the term "peer index" to mean the latter, but it seems willfully confusing for the type intrank_t to represent something called a "peer index".

For some examples of where this distinction matters, see pull request #14.

Fixing this terminology will probably also be important in solving issue #47.

Comments (3)

  1. Scott Baden

    I agree that the number of ranks is no longer a fixed quantity when we start building teams. Following your comment that we need a way to discuss the ranks in a given team, what we can say is that the number of ranks constituting a team is fixed for the lifetime of the team. This is true for team world() (from init to finalize, and again any time we go through subsequent build-ups and teardowns) as well as for any other team, i.e. local_team or any team derived (transitively) from team world or local_team.

    We may want to think of the ranks in world() as special. These might be "world" ranks or "base" ranks. When we need to refer to ranks in other teams, we could refer to "local_team" or "derived team" ranks (a derived team being one derived, transitively, from team world or local_team).

    Since OS processes are allocating memory, we need to anchor any notion of affinity to processes. The question is: can we say that the initial teams in world() have a 1:1 relationship to processes, and that for world_ranks (or base ranks, as I also suggested) it's OK to talk about affinity to world_ranks? Since we know how teams get derived, we can always map back to world_ranks. There is one remaining issue (but I think we are OK): what about the case when we launch a upcxx program with more processes than cores?

  2. Dan Bonachea reporter

    This change was discussed and agreed upon at the 6/27/18 meeting, and Scott (who could not attend) has indicated in email that he agrees with the proposed terminology change, so we now have consensus.

    Summary:

    • “process” now means a member of the set of SPMD entities (fixed at startup) that has associated memory and processing resources
    • “rank” now means the integer label that (along with a team) names a particular process

    I'll apply this decision by extending PR#14 to cover the necessary changes elsewhere in the spec when I have time. We may also need adjustments in the Guide and other user-facing documentation.

    Note teams upcxx::world() and upcxx::local_team() are "special" in a number of ways listed below, none of which changes with the terminology cleanup:

    1. upcxx::world() and upcxx::local_team() are implicitly created at startup by the runtime
    2. The upcxx::team objects representing these teams are owned by the runtime and may not be moved or destroyed by the user
    3. The processes named by ranks in upcxx::local_team() have the special properties regarding shared heap locality discussed in section 12.2
    4. upcxx::world() is the default for all UPC++ operations that take a team argument
    5. The sugar functions upcxx::rank_me() and upcxx::rank_n() report ranks relative to upcxx::world()
    6. We provide operators for rank conversion between upcxx::world() and any arbitrary team
    7. global_ptr<T>::where() reports locality by naming a rank in upcxx::world()
    8. The recipient argument to rpc(_ff) is a rank in upcxx::world(), although we plan to eventually provide overloads that accept (team,rank) for arbitrary team
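As an illustration of item 6, rank conversion between the world team and another team can be sketched with a small toy model in plain C++. This is not the UPC++ API: toy_subteam, to_world and from_world are hypothetical names, and returning a caller-supplied `otherwise` value for non-members is an assumption made for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: hypothetical names, not the UPC++ API.
struct toy_subteam {
    // world_ranks[i] is the world rank of the process that has rank i in this team.
    std::vector<int> world_ranks;

    // Convert a rank in this team to the world rank naming the same process.
    int to_world(int team_rank) const {
        assert(team_rank >= 0 &&
               static_cast<std::size_t>(team_rank) < world_ranks.size());
        return world_ranks[static_cast<std::size_t>(team_rank)];
    }

    // Convert a world rank to a rank in this team, or return `otherwise`
    // when the named process is not a member of this team.
    int from_world(int world_rank, int otherwise = -1) const {
        for (std::size_t i = 0; i < world_ranks.size(); ++i) {
            if (world_ranks[i] == world_rank) return static_cast<int>(i);
        }
        return otherwise;
    }
};
```

Both directions name the same process; only the team half of the coordinate changes, which is why a world rank can serve as a team-independent way to report locality (as in items 7 and 8).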

  3. Dan Bonachea reporter

    issue #126: Fixup rank/process terminology throughout the spec

    • Separate glossary entry for process/rank
    • Replace most instances of "rank" with "process" spec-wide, except when specifically referring to ranks in a team.
    • Also replace most instances of "current rank" with "calling process", as the latter is less ambiguous when specifying the behavior of communication functions, since they involve more than one process.

    Resolves issue #126

    → <<cset 0b09f91dd903>>
