- changed milestone to 2021.3.0 release
Mass roll-over of open issues to next release milestone
Global Arrays offers the GA_Cluster_nnodes API for getting the number of shared memory nodes in a job:
https://hpc.pnl.gov/globalarrays/api/c_op_api.html#CLUSTER_NNODES
Having a similar capability in UPC++ would be helpful in general and for porting GA programs.
On the 2021.09.13 call we identified "how many distinct local_team's in the current job" as the desired semantic for this call. In general this could be >= the number of "shared memory nodes", but it is a property that the UPC++ runtime already maintains and that is not easily inferred from other sources such as the hostname.
This simple enhancement request has been sitting around for over a year, and I'd really like to see it addressed soon. The proposed query provides useful information the runtime has readily available and that is difficult for applications to efficiently construct given current queries.
The only "hard part" is deciding upon an interface.
Strawman Proposal
std::pair<intrank_t, intrank_t> upcxx::local_team_position()
Semantics
Queries information about the disjoint local teams comprising world().
Returns a std::pair value, such that given returned pair info:
- info.second provides the number of disjoint local teams in the set comprising world(). During a given execution, this value is equal for all callers.
- info.first provides an integral index in [0,info.second) that identifies the local team of the calling process within that set. During a given execution, the value returned to two processes is equal if and only if they share a local team.
The values returned to any given calling process remain stable across subsequent calls.
Advice to Users: This function returns information about the number and identity of local teams, which delineate the boundaries of shared heap locality within the job (and may correspond to physical node boundaries). Information about a caller's position within its local team is available via local_team().rank_me() and local_team().rank_n().
Progress Level: none
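The semantics above can be modelled without UPC++ at all. The following is a plain C++ sketch, not the proposed implementation: it assumes a hypothetical input vector `team_key` holding one opaque local-team identifier per world rank (in a real job the runtime would already know this grouping), and the helper name `model_local_team_position` is invented for illustration. Indices are handed out in order of first appearance by world rank, which is only one possible choice, since the proposal merely requires a stable index in [0,info.second).

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for upcxx::intrank_t so the model needs no UPC++ headers.
using intrank_t = int;

// Models the proposed query for every world rank at once.
// team_key[r] is an opaque identifier of the local team containing world
// rank r (an assumed input for this sketch, not a UPC++ API).
// Returns one (index, count) pair per world rank.
std::vector<std::pair<intrank_t, intrank_t>>
model_local_team_position(const std::vector<int>& team_key) {
  std::map<int, intrank_t> index_of;  // team identifier -> dense index
  for (int key : team_key) {
    // The first time a team is seen it receives the next unused index;
    // emplace leaves existing entries untouched.
    index_of.emplace(key, static_cast<intrank_t>(index_of.size()));
  }
  const intrank_t count = static_cast<intrank_t>(index_of.size());

  std::vector<std::pair<intrank_t, intrank_t>> result;
  result.reserve(team_key.size());
  for (int key : team_key) {
    // first: index of this rank's local team; second: number of local teams.
    result.emplace_back(index_of.at(key), count);
  }
  return result;
}
```

For example, with keys {7, 7, 3, 3, 7, 9}, ranks 0, 1 and 4 receive index 0, ranks 2 and 3 receive index 1, rank 5 receives index 2, and every rank sees a count of 3. Two ranks get equal indices exactly when their keys match, mirroring the "if and only if they share a local team" requirement.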
Discussion
The proposed name is based on the existing utility function upcxx::local_team_contains(). The proposed description is written in terms of the existing local team semantics, and as with existing sections deliberately avoids guaranteeing local team equivalence to physical node boundaries (although that is the common/default case).
Implementation is a trivial wrapper around a subset of the information supplied by gex_System_QueryMyPosition().
Please provide feedback.
The proposed interface sounds good to me
The proposed interface sounds fine, but I wonder if this is something we can/should provide as a general query on all teams – what is the team’s position with respect to the other teams created by the associated split() or create() call. It’s true that the creator of the team can compute this information (though potentially requiring another collective to do so), but I can imagine a scenario where a team is passed to a library, which wouldn’t necessarily be able to compute that information.
"something we can/should provide as a general query on all teams – what is the team’s position with respect to the other teams created by the associated split() or create() call."
I also initially wondered whether this was worth generalizing somehow. However, the information you suggest is NOT something that we currently compute or track (for split() or create()). Computing such information after a team::create() in particular would require entirely new collective communication across the parent team, where the primary motivation for using team::create() is exactly to avoid the overhead of such communication. For this reason I'm opposed to this idea.
In contrast, the information exposed by my proposal is readily and efficiently available from GASNet's node-mapping metadata.
The name and description here are slightly misleading: my proposal is named/described in terms of the "local team", because that happens to be the UPC++-level container whose boundaries correspond to the GASNet "neighborhood" abstraction. However, the similarity to upcxx::team ends there. This is really a node topology query and has nothing to do with the local_team() object or any dynamic team.
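To illustrate why the generalized query discussed above needs extra communication, here is a small plain C++ model (no UPC++ calls; the function name `model_split_position` and its inputs are hypothetical). It assumes every member of the parent team passes an ordinary color to split(), and it orders sibling teams by color value, which is an arbitrary choice made for this sketch. The point is the input: a team's position among its siblings depends on the colors chosen by all parent-team members, which a real program would first have to gather.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// colors[r] is the color chosen by parent-team rank r (assumed input; in a
// real program, collecting it is the collective step that must be paid for).
// Returns (index of my_rank's sub-team among its siblings, number of siblings).
std::pair<int, int> model_split_position(const std::vector<int>& colors,
                                         int my_rank) {
  // Build the sorted list of distinct colors, one entry per sibling team.
  std::vector<int> distinct(colors);
  std::sort(distinct.begin(), distinct.end());
  distinct.erase(std::unique(distinct.begin(), distinct.end()), distinct.end());

  // The sibling index is the position of this rank's color in that list.
  auto it = std::lower_bound(distinct.begin(), distinct.end(), colors[my_rank]);
  return {static_cast<int>(it - distinct.begin()),
          static_cast<int>(distinct.size())};
}
```

With colors {5, 2, 5, 9}, parent rank 0 lands in sibling team 1 of 3, rank 1 in team 0 of 3, and rank 3 in team 2 of 3. None of these answers can be produced from a single rank's own color alone, whereas the local-team query in the proposal is described as drawing on metadata the runtime already holds.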
Proposed resolution in spec PR 85 and impl PR 409