Submit script template for Cori needs updating

Issue #2252 resolved
Former user created an issue

Hello, I use the NERSC machine Cori to run ET simulations. I am wondering how to change the number of nodes. I use a command like:

simfactory/bin/sim submit mysimulation --parfile par/file.par --procs 64 --walltime 00:30:00

When I check the status of my job with "sqs", it shows that only 1 node is used, not 2. Is there some way to change the number of nodes, or should the submit script template for Cori be updated?

Thanks. Best regards, Chia-Hui

Comments (14)

  1. Roland Haas
    • changed status to open

    This has been confirmed. @eschnett and @rhaas80 have access to Cori. My own guess is that this is fallout from NERSC switching to SLURM (https://www.nersc.gov/assets/Uploads/SLURM-NUG-Nov2015.pdf), which counts each hyperthread as a CPU, so --procs 64, translated into --mppwidth 64, ends up as the wrong number of nodes after the PBS -> SLURM translation at NERSC. See https://docs.nersc.gov/jobs/examples/#hybrid-mpiopenmp-jobs .
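
    For reference, a sketch of what the intended layout looks like in native SLURM terms, assuming Cori Haswell nodes (32 physical cores, 64 hyperthreads each) and 4 MPI ranks with 16 OpenMP threads each, i.e. the --procs 64 case; the executable and parameter file names are placeholders and the actual simfactory-generated script differs in detail:

    #!/bin/bash
    #SBATCH --constraint=haswell
    #SBATCH --nodes=2             # 64 "procs" / 32 physical cores per node
    #SBATCH --ntasks-per-node=2   # 2 MPI ranks per node
    #SBATCH --cpus-per-task=32    # 16 OpenMP threads x 2 hyperthreads
    #SBATCH --time=00:30:00

    export OMP_NUM_THREADS=16
    srun -n 4 -c 32 --cpu-bind=cores ./cactus_sim par/file.par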

  2. Roland Haas

    Hello, I use the NERSC machine Cori to run ET simulations. I am wondering how to change the number of nodes. I use a command like:

    simfactory/bin/sim submit mysimulation --parfile par/file.par --procs 64 --walltime 00:30:00

    When I check the status of my job with "sqs", it shows that only 1 node is used, not 2. Is there some way to change the number of nodes, or should the submit script template for Cori be updated?

    Thanks. Best regards, Chia-Hui

  3. Roland Haas

    Thank you. I have pushed the changes you provided into the master branch of simfactory. Do you know if one also needs an option --threads-per-core @NUM_SMT@ in cori.run? This only matters if core affinity is chosen badly by srun (though chances are that Carpet, using hwloc, fixes this at runtime).

    I will push the same fix to the release branch as well (ET_2019_03) after that.
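
    For concreteness, a sketch of where that option would go; @NUM_SMT@ is the simfactory variable mentioned above, while the other @...@ placeholders only stand in for whatever cori.run actually uses, and -c 16 matches the test runs further down, so this is not a verbatim copy of the template:

    # hypothetical sketch of the srun line in cori.run with the SMT option added
    srun -n @NUM_PROCS@ -c 16 --threads-per-core @NUM_SMT@ @EXECUTABLE@ -L 3 @PARFILE@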

  4. Chia-Hui Lin

    Sorry, I am not quite sure, but I did not add that option to cori.run and it works fine.
    Also, thanks for your help.

    Best regards,

    Chia-Hui

  5. Roland Haas

    OK. I will try out what happens and merge the current changes from master into the release branch for now. Just before the release, Erik's and my Cori accounts lapsed, and we are still in the process of getting them back (still waiting for Erik's), so I am only testing this on Cori now.

  6. Roland Haas

    I am not quite sure yet whether this is working as expected; the SMT thread assignment may be off. I did two test runs, one using srun -n 4 --threads-per-core 2 -c 16 and one using srun -n 4 --threads-per-core 1 -c 16, both for a 2-node submission. In both cases I get

    This process runs on 16 cores: 0-7, 32-39
    Thread 0 runs on 16 cores: 0-7, 32-39
    

    which, given the usual mapping of logical CPUs to hardware threads, is (I think) 16 hardware threads on 8 physical cores. That is not quite what was intended (it should have been 16 physical cores) and leaves some cores empty. In particular, the srun -n 4 --threads-per-core 1 -c 16 version should have used only 1 thread per core.

    These runs did not use SystemTopology. I will try with it next, which may give some more insight.
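
    Independent of that, one way to cross-check the binding srun itself applies, separate from Carpet's report (a minimal sketch for the same 2-node allocation; the rank count and -c value match the test above):

    srun -n 4 -c 16 --threads-per-core 1 --cpu-bind=verbose \
        bash -c 'grep Cpus_allowed_list /proc/self/status'

    Here --cpu-bind=verbose makes srun print the mask it binds each task to, and Cpus_allowed_list shows the mask each task actually sees.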

  7. Roland Haas

    SystemTopology helps and produces sane layouts. This is for --procs 128 --num-smt 2, which used 2 nodes (as it should); with srun -n 8 --threads-per-core 2 -c 8 it gives:

    INFO (Carpet): This process contains 16 threads, this is thread 0
    INFO (Carpet): There are 128 threads in total
    INFO (Carpet): There are 16 threads per process
    INFO (Carpet): This process runs on host nid00671, pid=48257
    INFO (Carpet): This process runs on 16 cores: 16-23, 48-55
    INFO (Carpet): Thread 0 runs on 1 core: 16
    INFO (Carpet): Thread 1 runs on 1 core: 48
    INFO (Carpet): Thread 2 runs on 1 core: 17
    INFO (Carpet): Thread 3 runs on 1 core: 49
    INFO (Carpet): Thread 4 runs on 1 core: 18
    INFO (Carpet): Thread 5 runs on 1 core: 50
    INFO (Carpet): Thread 6 runs on 1 core: 19
    INFO (Carpet): Thread 7 runs on 1 core: 51
    INFO (Carpet): Thread 8 runs on 1 core: 20
    INFO (Carpet): Thread 9 runs on 1 core: 52
    INFO (Carpet): Thread 10 runs on 1 core: 21
    INFO (Carpet): Thread 11 runs on 1 core: 53
    INFO (Carpet): Thread 12 runs on 1 core: 22
    INFO (Carpet): Thread 13 runs on 1 core: 54
    INFO (Carpet): Thread 14 runs on 1 core: 23
    INFO (Carpet): Thread 15 runs on 1 core: 55
    

    while submitting with just --procs 64 results in srun -n 4 --threads-per-core 2 -c 16 (the threads per core being hardwired in the runscript) and gives:

    INFO (Carpet): This process contains 16 threads, this is thread 0
    INFO (Carpet): There are 64 threads in total
    INFO (Carpet): There are 16 threads per process
    INFO (Carpet): This process runs on host nid00995, pid=8369
    INFO (Carpet): This process runs on 16 cores: 0-15
    INFO (Carpet): Thread 0 runs on 1 core: 0
    INFO (Carpet): Thread 1 runs on 1 core: 1
    INFO (Carpet): Thread 2 runs on 1 core: 2
    INFO (Carpet): Thread 3 runs on 1 core: 3
    INFO (Carpet): Thread 4 runs on 1 core: 4
    INFO (Carpet): Thread 5 runs on 1 core: 5
    INFO (Carpet): Thread 6 runs on 1 core: 6
    INFO (Carpet): Thread 7 runs on 1 core: 7
    INFO (Carpet): Thread 8 runs on 1 core: 8
    INFO (Carpet): Thread 9 runs on 1 core: 9
    INFO (Carpet): Thread 10 runs on 1 core: 10
    INFO (Carpet): Thread 11 runs on 1 core: 11
    INFO (Carpet): Thread 12 runs on 1 core: 12
    INFO (Carpet): Thread 13 runs on 1 core: 13
    INFO (Carpet): Thread 14 runs on 1 core: 14
    INFO (Carpet): Thread 15 runs on 1 core: 15
    

    which makes sense when using 1 SMT thread per core.

    So it seems as if srun's --threads-per-core argument is not doing anything.
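
    An untested alternative, in case --threads-per-core really is a no-op here, would be SLURM's --hint=nomultithread together with explicit OpenMP pinning. This is only a hedged sketch of that idea, not something verified on Cori in this thread; the executable and parameter file names are placeholders, and whether -c then effectively counts physical instead of logical cores is exactly the kind of thing that would need testing:

    export OMP_NUM_THREADS=16
    export OMP_PLACES=cores      # one OpenMP place per physical core
    export OMP_PROC_BIND=close
    srun -n 4 -c 16 --hint=nomultithread --cpu-bind=cores ./cactus_sim par/file.par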

  8. Chia-Hui Lin

    Thanks for your efforts so that cori.run can be updated and made more complete.

    Best regards,
    Chia-Hui

  9. Roland Haas

    This turned out to be mostly an issue with how the -c option was used. -c is the number of logical cores per MPI rank, but it was set to @PPN@ / @NODE_PROCS@. However, PPN in simfactory's mdb files is set to the number of physical cores per node (incorrectly if one believes the documentation string, but correctly when one checks how it is used). Somewhat sanely (in case one were to use hyperthreading), requesting e.g. 6 logical cores uses 3 physical cores along with their 3 hyperthreading partners.

    Corrected in git hash 8278a78 "cori: allocate thread to logical cores rather than cores used" of simfactory2 along with binding OMP threads to physical cores.
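
    In other words, the change amounts to counting logical rather than physical cores in -c and pinning the OpenMP threads. A rough sketch using the simfactory variables named above, with shell arithmetic for readability and angle-bracket placeholders for everything else; the actual template syntax in 8278a78 may differ:

    # before: -c counted physical cores per MPI rank
    #   srun -n <nranks> -c $(( @PPN@ / @NODE_PROCS@ )) <executable> <parfile>
    # after: -c counts logical cores per rank (physical cores x SMT threads),
    #        with OpenMP threads bound to physical cores
    export OMP_PLACES=cores
    export OMP_PROC_BIND=true
    srun -n <nranks> -c $(( @PPN@ * @NUM_SMT@ / @NODE_PROCS@ )) --cpu-bind=cores <executable> <parfile>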
