According to simlib.py:
NUM_THREADS = threads per mpi proc (thread/mpi_proc)
PPN is supposed to be the number of processors, or cores requested from the scheduler per node. (core/node)
PPN_USED is supposed to be the number of cores actually used per node. (core/node)
NUM_SMT is supposed to be threads per core, and has a value of either 1 or 2 on all machines. (thread/core)
NODE_PROCS := PPN_USED * NUM_SMT / NUM_THREADS
This follows since: NODE_PROCS = (cores/node)*(threads/core)/(threads/mpi proc) = mpi procs/node
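To make the unit bookkeeping concrete, here is a minimal sketch of the NODE_PROCS formula as I read it. The function name, the divisibility check, and the example numbers are my own assumptions for illustration and are not copied from simlib.py.

```python
def node_procs(ppn_used: int, num_smt: int, num_threads: int) -> int:
    """MPI processes per node.

    Units: (cores/node) * (threads/core) / (threads/mpi proc) = mpi procs/node
    """
    hw_threads_per_node = ppn_used * num_smt  # threads/node
    # Assumption: reject layouts that do not divide evenly. simlib.py may
    # handle this case differently.
    if hw_threads_per_node % num_threads != 0:
        raise ValueError("threads per node is not a multiple of NUM_THREADS")
    return hw_threads_per_node // num_threads


# Hypothetical node: 32 cores used, 2 hardware threads per core,
# 8 threads per MPI process.
example = node_procs(ppn_used=32, num_smt=2, num_threads=8)
print(example)
```

With these hypothetical values the node offers 64 hardware threads, so 8 MPI processes of 8 threads each fit on it.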
Now here’s the problem.
NUM_PROCS = PROCS / NUM_THREADS
Both --procs and --cores are options for the same thing in simfactory. Thus “procs” means “processors” while “num_procs” means “number of processes.” That’s confusing, but it’s not the problem this ticket is about.
NUM_PROCS is supposed to be the number of MPI processes. However, since --procs and --cores are the same thing:
NUM_PROCS = CORES / NUM_THREADS
= cores / (threads / mpi proc)
This is inconsistent: the units come out as cores*(mpi procs)/threads rather than mpi procs. One would expect:
NUM_PROCS = NUM_SMT*CORES/NUM_THREADS
= (threads/core)*cores/(threads/mpi proc).
What if we define NUM_THREADS as cores/mpi proc? Well, apart from being confusing, that makes the NODE_PROCS calculation wrong.
So, unless I’m missing something, these parameters are not consistent, regardless of how you define them. They only work if NUM_SMT is one and cores and threads are interchangeable.
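A quick way to see the mismatch without committing to either definition of NUM_THREADS is to compare the two formulas directly. The sketch below assumes a job that uses every requested core, so that CORES = nodes * PPN_USED. The function names and the numbers are hypothetical, and the formulas are the ones quoted in this ticket, not code lifted from simlib.py.

```python
def simlib_node_procs(ppn_used: int, num_smt: int, num_threads: int) -> int:
    # NODE_PROCS as quoted in this ticket.
    return ppn_used * num_smt // num_threads


def simlib_num_procs(cores: int, num_threads: int) -> int:
    # NUM_PROCS as quoted in this ticket (PROCS and CORES are the same option).
    return cores // num_threads


# Hypothetical job: 4 nodes, 32 cores used per node, NUM_THREADS = 8.
nodes, ppn_used, num_threads = 4, 32, 8
cores = nodes * ppn_used  # assumption: all requested cores are used


def is_consistent(num_smt: int) -> bool:
    # The total number of MPI processes should equal nodes * (procs per node).
    total_from_nodes = nodes * simlib_node_procs(ppn_used, num_smt, num_threads)
    return simlib_num_procs(cores, num_threads) == total_from_nodes


for num_smt in (1, 2):
    print(num_smt, is_consistent(num_smt))
```

With NUM_SMT = 1 both sides give 16 processes. With NUM_SMT = 2 the per-node count implies 32 processes while NUM_PROCS still says 16.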
Is that always true?
The machines with max-num-smt = 2 are bethe, cori, philip, and supermucng. Looking at simlib.py, this parameter is never accessed! Instead, simlib.py only attempts to read ‘num-smt’, a parameter that no ini file ever sets. Thus NUM_SMT is, essentially, always 1.
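The effect of the key mismatch can be sketched with Python's standard configparser. This is only a stand-in: simfactory has its own ini handling, the machine section name and the ppn value are made up, and the fallback of 1 is my assumption based on the observed behaviour that NUM_SMT ends up as 1.

```python
import configparser

# Hypothetical machine entry that only defines max-num-smt.
INI_TEXT = """
[examplemachine]
ppn = 32
max-num-smt = 2
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)
section = parser["examplemachine"]

# Looking up 'num-smt' finds nothing, so the assumed default of 1 is used.
num_smt = section.getint("num-smt", fallback=1)

# The value the machine file actually sets is never consulted.
max_num_smt = section.getint("max-num-smt", fallback=1)

print(num_smt, max_num_smt)
```

In this sketch the lookup that is actually performed yields 1 even though the file advertises 2 hardware threads per core.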
What to do?
My suggestion is that the definition of NUM_PROCS be amended to be
NUM_PROCS = CORES * NUM_SMT / NUM_THREADS
so that the units are cores*(threads/core)/(threads/mpi_proc) = mpi procs.
I then suggest trying the feature out on one of the four machines above by changing max-num-smt to num-smt in its ini file (note, however, that philip no longer exists).
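As a sanity check on the suggested definition, the sketch below confirms that it agrees with nodes * NODE_PROCS for both SMT settings. It again assumes CORES = nodes * PPN_USED. The function names and job sizes are hypothetical, and this is not a patch against simlib.py.

```python
def node_procs(ppn_used: int, num_smt: int, num_threads: int) -> int:
    # NODE_PROCS as quoted in this ticket.
    return ppn_used * num_smt // num_threads


def num_procs_proposed(cores: int, num_smt: int, num_threads: int) -> int:
    # Suggested definition: cores * (threads/core) / (threads/mpi proc).
    return cores * num_smt // num_threads


# Hypothetical job: 4 nodes, 32 cores used per node, NUM_THREADS = 8.
nodes, ppn_used, num_threads = 4, 32, 8
cores = nodes * ppn_used  # assumption: all requested cores are used

# Map each NUM_SMT value to (proposed NUM_PROCS, nodes * NODE_PROCS).
results = {
    num_smt: (
        num_procs_proposed(cores, num_smt, num_threads),
        nodes * node_procs(ppn_used, num_smt, num_threads),
    )
    for num_smt in (1, 2)
}
print(results)
```

Both columns match for NUM_SMT = 1 and for NUM_SMT = 2, which is the consistency the current definition lacks.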