hwloc could warn the user if a process spans more than one NUMA node

Issue #1722 open
Ian Hinder created an issue

On a NUMA architecture, performance can be very poor if a process allocates memory on one NUMA node but also has threads running on cores of another NUMA node: those threads access the memory on the first node remotely, which is slow. In that situation it is probably better to run one process per NUMA node, so that each process allocates memory it has fast access to.

hwloc could detect whether the threads in a process are all on the same NUMA node or not, and output a warning (or error?) if there are threads running on more than one NUMA node in the same process.

See also #1446 and #1528.
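
A minimal sketch of the proposed check, for illustration only. It does not use the hwloc API; it assumes a Linux system, reads the node-to-CPU mapping from sysfs, and compares it with the affinity mask returned by `os.sched_getaffinity`. All function names are hypothetical. It also only looks at where the caller is *allowed* to run, not where its threads are currently running.

```python
import glob
import os
import re


def parse_cpulist(text):
    """Parse a Linux cpulist string such as "0-3,8-11" into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty list, e.g. a memory-only node
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def read_node_cpus(sysfs_root="/sys/devices/system/node"):
    """Map NUMA node id -> set of CPU ids (assumes the Linux sysfs layout)."""
    node_cpus = {}
    for path in glob.glob(os.path.join(sysfs_root, "node[0-9]*", "cpulist")):
        node_dir = os.path.basename(os.path.dirname(path))
        node_id = int(re.search(r"node(\d+)$", node_dir).group(1))
        with open(path) as f:
            node_cpus[node_id] = parse_cpulist(f.read())
    return node_cpus


def nodes_spanned(allowed_cpus, node_cpus):
    """Return the sorted NUMA node ids whose CPUs overlap allowed_cpus."""
    return sorted(n for n, cpus in node_cpus.items() if cpus & set(allowed_cpus))


def numa_warning(allowed_cpus, node_cpus):
    """Return a warning string if the CPU set touches more than one node."""
    nodes = nodes_spanned(allowed_cpus, node_cpus)
    if len(nodes) > 1:
        return "WARNING: process may run on several NUMA nodes: %s" % nodes
    return None


if __name__ == "__main__":
    # sched_getaffinity(0) reports the affinity mask of the caller (Linux only).
    message = numa_warning(os.sched_getaffinity(0), read_node_cpus())
    if message:
        print(message)
```

An unbound process on a multi-socket machine would trigger the warning, since its mask covers every node; whether that should be a warning or an error is the open question raised above.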


Comments (2)

  1. Erik Schnetter
    • removed comment

    In the same spirit, hwloc should detect whether the current MPI process started up with a suboptimal memory affinity. Other libraries, in particular MPI, may be initialized before hwloc runs, and these would see and use the suboptimal memory affinity. This is actually causing slowdowns on some systems.

    Another remedy for this could be to run hwloc very early, before MPI is initialized. This would require passing in the current MPI rank, which may not be possible before initializing MPI.
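
One possible way to obtain a rank before MPI is initialized is to read it from the environment set by the job launcher. The sketch below is an assumption, not something proposed in this thread: the function name is hypothetical, the variable list is not exhaustive, and which variable is set (if any) depends on the MPI implementation and launcher in use.

```python
import os

# Environment variables that common launchers set for each process.
# Which one is present depends on the MPI implementation and launcher.
RANK_ENV_VARS = (
    "OMPI_COMM_WORLD_RANK",  # Open MPI
    "PMI_RANK",              # MPICH-style process managers
    "PMIX_RANK",             # PMIx-based launchers
    "SLURM_PROCID",          # Slurm srun
)


def guess_rank(environ=None):
    """Return the rank found in the environment, or None if none is set."""
    environ = os.environ if environ is None else environ
    for name in RANK_ENV_VARS:
        value = environ.get(name)
        if value is not None and value.isdigit():
            return int(value)
    return None


if __name__ == "__main__":
    print("rank from environment:", guess_rank())
```

If `guess_rank` returns `None`, the rank is not available this way and the early-initialization approach would need another source for it.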
