- changed milestone to 2020.9.0 release
-
assigned issue to
Provide documentation on parallel/distributed debugging
We have noticed a lack of information on parallel/distributed debugging in debugging.md. All we say now is:
Note in particular that runs of multi-rank jobs on many systems include non-trivial spawning activities (e.g., required spawning scripts and/or fork calls) that serial debuggers generally won't correctly follow and handle. Hence the general recommendation to debug multi-rank jobs by attaching your favorite debugger to already-running rank processes.
We should be advising users to try the same tool(s) they apply to distributed MPI applications. However, that requires launching via mpi-spawner (which we need to figure out how to "spell" for the user).
Perhaps most importantly, we should test what we recommend on at least Cori or Summit (hopefully both).
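As a starting point for the documentation, the attach-to-running-rank recommendation quoted above is often paired with a "wait for attach" pause in the program itself. The following is only a minimal sketch of that general idiom, not something taken from debugging.md: the helper name `wait_for_debugger_attach` and the environment variable `DEBUG_WAIT` are hypothetical, and it assumes a POSIX system. It has not been tested on Cori or Summit.

```cpp
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <thread>
#include <unistd.h>   // getpid, gethostname (POSIX)

// Hypothetical helper, not part of any library.
// If the (hypothetical) environment variable DEBUG_WAIT is unset, it returns
// false immediately. Otherwise it prints this process's PID and host, then
// spins until a debugger changes `resume`, and returns true.
bool wait_for_debugger_attach() {
    if (std::getenv("DEBUG_WAIT") == nullptr) return false;

    char host[256] = "unknown";
    gethostname(host, sizeof(host) - 1);  // last byte stays '\0'
    std::fprintf(stderr, "PID %ld on %s waiting for debugger attach\n",
                 static_cast<long>(getpid()), host);
    std::fflush(stderr);

    // volatile so the loop re-reads the value after the debugger writes it.
    volatile int resume = 0;
    while (!resume) {
        std::this_thread::sleep_for(std::chrono::seconds(1));
    }
    return true;
}
```

A program would call this near the top of `main`. With `DEBUG_WAIT` set in the job's environment, each rank prints its PID and host; one can then attach on that node (for example `gdb -p <pid>`), select the helper's frame, set `resume` to 1, and continue.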
Comments (9)
- changed milestone to 2021.3.0 release
Mass roll-over of open issues to next release milestone
- changed milestone to 2021.9.0 release
Mass roll-over of open issues to next release milestone
- changed milestone to 2022.3.0 release
Mass roll-over of open issues to next release milestone
- changed milestone to 2022.9.0 release
Mass roll-over of open issues to next release milestone
- changed milestone to 2023.3.0 release
Mass roll-over of open issues to next release milestone
- removed responsible_account_id
- changed milestone to 2023.9.0 release
Mass roll-over of open issues to next release milestone
- removed milestone
Clear past Milestone for open issues
This was discussed in the 2020-02-12 meeting and deferred to the next release milestone.