comet files in simfactory use one MPI rank per node

Issue #2187 resolved
Roland Haas created an issue

The current machine description file for Comet uses 1 MPI rank per node:

max-num-threads = 24
num-threads     = 24

This is usually not the best way to set things up; I would, e.g., have expected the default choice to be something like 1 MPI rank per NUMA domain. Unless limited by communication overhead, we seem to obtain the fastest per-node performance when using only MPI and no OpenMP (about a 50% speedup on my 12-core workstation with 2 NUMA domains). Given that, if anyone is using Comet for production work and wants to contribute their machine description file, that would be great.
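
To make the "1 MPI rank per NUMA domain" suggestion concrete, here is a minimal sketch of the arithmetic, assuming cores are split evenly across NUMA domains. The function name `ranks_and_threads` is made up for illustration and is not part of simfactory.

```python
def ranks_and_threads(cores_per_node, numa_domains):
    """Return (MPI ranks per node, threads per rank) for one rank per NUMA domain.

    Hypothetical helper for illustration only; assumes the cores of a node
    are divided evenly among its NUMA domains.
    """
    if numa_domains < 1 or cores_per_node % numa_domains != 0:
        raise ValueError("cores must divide evenly among NUMA domains")
    # One rank per domain, and each rank gets all cores of its domain as threads.
    return numa_domains, cores_per_node // numa_domains


# The 12-core workstation with 2 NUMA domains mentioned above.
print(ranks_and_threads(12, 2))  # (2, 6)
```

Treating a whole 24-core node as a single domain, `ranks_and_threads(24, 1)` reproduces the current setting of 1 rank with 24 threads.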

Comments (6)

  1. Roland Haas reporter

    Private conversations with users on Comet who are using it for production runs (in 2018, so I am a bit tardy reporting this) indicate that the best performance was achieved when using 4 threads per MPI rank and 4 MPI ranks per node, i.e. leaving 8 cores per node empty (Comet has 24 cores per node).

  2. Roland Haas reporter

    Changed to use 6 threads per rank in git hash 5ea0f7b "comet.ini: use 6 threads by default" of simfactory2 as a stopgap measure. This still needs to be measured properly with a couple of test runs to find a good setting for runs typical of Comet.
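
For comparing the layouts mentioned in this thread, a small sketch of the core accounting may help. The helper `core_usage` is hypothetical, and the assumption that the 6-thread default fills the node with 4 ranks is mine; the comments above only state the thread count.

```python
CORES_PER_NODE = 24  # Comet, as stated in comment 1


def core_usage(ranks_per_node, threads_per_rank, cores_per_node=CORES_PER_NODE):
    """Return (used cores, idle cores) for a layout; hypothetical helper."""
    used = ranks_per_node * threads_per_rank
    if used > cores_per_node:
        raise ValueError("layout oversubscribes the node")
    return used, cores_per_node - used


layouts = [
    ("original default", 1, 24),
    ("reported best (2018)", 4, 4),
    ("stopgap, assuming 4 ranks per node", 4, 6),
]
for label, ranks, threads in layouts:
    used, idle = core_usage(ranks, threads)
    print(f"{label}: {ranks} ranks x {threads} threads, {used} used, {idle} idle")
```

This only counts cores. Which layout is actually fastest still has to be determined by the test runs mentioned above.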

