Clarify collective requirement for team destruction

Issue #71 resolved
Dan Bonachea created an issue

Currently no destructor is defined for upcxx::team, even though the type is specified as Destructible.

Teams are heavyweight objects, but EX will eventually allow them to be destroyed, probably requiring the destruction to be collective over that team.

Currently we permit the trivial pseudo-destructor, which shouldn't crash but will definitely leak associated resources inside the communication stack. Assuming teams require collective destruction, implicit RAII semantics for them may be a dangerous idea.

Comments (7)

  1. BrianS

    A programming style where a user employs divide-and-conquer and builds sub teams at finer scope might be difficult with heavy-weight team creation/destruction.

    Currently the class is only MoveConstructible, meaning users need to call a factory function like split to create one. A similar "clean up" function could be provided, but that makes STL containers of teams awkward. So: master persona factory functions, and a destructor that is master persona and collective.

    We might want to maintain a cache in UPC++ for teams that show up repeatedly. For caching it would be nice if teams had a hash value. If we know that GASNet is going to leak the resources, then UPC++ should cache and recycle teams. We will want to be able to diagnose resource exhaustion if teams leak. As a general rule I like to track things that I know we intend to leak.

  2. Former user Account Deleted

    Yes, we should explicitly list the team destructor. It's a shame if it would have to be collective, but that isn't a show stopper. RAII guarantees destruction order is the reverse of construction order, and if that doesn't work for the user they can always heap allocate new team and explicitly delete my_team in the aligned way.
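
The ordering point above can be illustrated with plain C++, without any UPC++ calls. This is only a sketch: `tracked` is a hypothetical stand-in for a team object that records its name when destructed, and `scoped_order` / `heap_order` are illustrative helper names. Automatic objects are destructed in reverse order of construction, while heap-allocated objects are destructed in whatever order the caller issues `delete`.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a team: logs its name when destructed.
struct tracked {
    std::string name;
    std::vector<std::string>* log;
    tracked(std::string n, std::vector<std::string>* l)
        : name(std::move(n)), log(l) {}
    ~tracked() { log->push_back(name); }
};

// Automatic storage: destruction runs in reverse order of construction.
inline std::vector<std::string> scoped_order() {
    std::vector<std::string> log;
    {
        tracked a("a", &log);
        tracked b("b", &log);
    }  // b is destructed first, then a
    return log;
}

// Heap storage: the caller picks the destruction order explicitly.
inline std::vector<std::string> heap_order() {
    std::vector<std::string> log;
    tracked* a = new tracked("a", &log);
    tracked* b = new tracked("b", &log);
    delete a;  // destructed first because we chose to
    delete b;
    return log;
}
```

With automatic storage the log reads b then a; with explicit `delete` it reads a then b. If destruction were collective, every rank would have to make the same choices in the same order.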

  3. Dan Bonachea reporter

    This issue was triaged at the 2018-06-13 Pagoda meeting and assigned a new milestone/priority.

    We noted it's crucial for the spec to define the correctness conditions for team destruction, even if the implementation temporarily leaks resources.

    One idea discussed after the meeting is that we could avoid some of the productivity/transparency dangers associated with a collective requirement on implicit destruction with a design that introduces an explicitly collective team::destroy() call:

    destroy() would have a precondition requiring local quiescence on the team, execute an entry team barrier (to ensure global team quiescence) and then release all team-related parallel resources and move the local team object into a "dead" state. This dead state would be specified as a precondition for the team object destructor, which is permitted to happen non-collectively any time after destroy.

    This design frees the user from ensuring that statically-scoped team objects go out of scope collectively, or reasoning about implicit collective communication (quiescence barriers) happening at a } where objects go out of scope. It also gives us easily-understood places to hang checks that this destruction protocol is followed correctly and report errors otherwise.
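
A minimal single-process sketch of the protocol described above may help make the states concrete. Everything here is an assumption for illustration: `mock_team`, `begin_op`, `end_op`, `violations` and the helper functions are hypothetical names, not part of the UPC++ API, and the barrier and resource release are reduced to comments. Treating a moved-from object as "dead" is also an assumption of this sketch, not something stated in the proposal.

```cpp
#include <cassert>
#include <cstdio>
#include <stdexcept>

// Hypothetical stand-in for a team; models only the local state machine.
class mock_team {
public:
    mock_team() = default;
    mock_team(const mock_team&) = delete;
    mock_team& operator=(const mock_team&) = delete;

    // Assumption of this sketch: the moved-from object becomes dead.
    mock_team(mock_team&& other) noexcept
        : dead_(other.dead_), pending_ops_(other.pending_ops_) {
        other.dead_ = true;
        other.pending_ops_ = 0;
    }

    // Stand-ins for communication that is in flight on this team.
    void begin_op() { ++pending_ops_; }
    void end_op() { --pending_ops_; }

    // Collective in the proposed design; only local steps are modelled here.
    void destroy() {
        if (dead_) throw std::logic_error("destroy() on a dead team");
        if (pending_ops_ != 0)
            throw std::logic_error("destroy() requires local quiescence");
        // An entry barrier over the team would go here.
        // Release of team-related parallel resources would go here.
        dead_ = true;
    }

    // Non-collective: the only precondition is the dead state.
    ~mock_team() {
        if (!dead_) {
            ++violations();
            std::fprintf(stderr, "team destructed without destroy()\n");
        }
    }

    bool dead() const { return dead_; }

    // Counts destructor calls that violated the precondition.
    static int& violations() {
        static int n = 0;
        return n;
    }

private:
    bool dead_ = false;
    int pending_ops_ = 0;
};

// Follows the protocol: quiesce, destroy(), then leave scope.
inline int run_correct_protocol() {
    int before = mock_team::violations();
    {
        mock_team t;
        t.begin_op();
        t.end_op();
        t.destroy();
    }
    return mock_team::violations() - before;
}

// Skips destroy(), so the destructor check fires once.
inline int run_missing_destroy() {
    int before = mock_team::violations();
    {
        mock_team t;
    }
    return mock_team::violations() - before;
}
```

The precondition checks in destroy() and in the destructor are the "easily-understood places to hang checks" from the comment above. A real implementation would presumably choose its own way to report errors; throwing and counting are used here only so the sketch can be exercised.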
