Automatic undersubscribing

Issue #921 new
Ian Hinder created an issue

At the moment, the documentation says that if you ask for 6 threads for a job (--procs 6), simfactory will assume that you want to use the whole node and will instead round this up to the number of cores per node, which on LoneStar is 12. If you want to use only 6 threads, you need to use --ppn-used 6. I find this confusing, and would prefer to get 6 threads if I ask for 6 threads, even if underneath I have to claim 12 cores from the queuing system. This is the way it seems to work on Datura: if I ask for --procs 6, I get 6 threads (i.e. one MPI process). This seems to be in conflict with the documentation, but I prefer this behaviour.

Proposal: if the number of threads requested (--procs) is less than the number of cores per node (PPN), then don't round up the number of threads to the nearest larger multiple of PPN. Thoughts?
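
A minimal sketch of the proposed rule, for discussion only. The function name and signature are hypothetical and are not taken from simfactory's code; it assumes that requests of at least one full node are still rounded up to a multiple of PPN.

```python
def threads_under_proposal(procs, ppn):
    """Hypothetical helper: thread count under the proposed rule."""
    if procs < 1 or ppn < 1:
        raise ValueError("procs and ppn must be positive")
    if procs < ppn:
        # Less than one node requested: keep the request as-is.
        return procs
    # Otherwise round up to the next multiple of ppn (ceiling division).
    return -(-procs // ppn) * ppn


print(threads_under_proposal(6, 12))   # request below one LoneStar node
print(threads_under_proposal(18, 12))  # request spanning more than one node
```

With PPN = 12, a request for 6 threads stays at 6, while a request for 18 is still rounded up to two full nodes.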


Comments (4)

  1. Erik Schnetter

    The current behaviour does not assume that you are running on only a single node. It works as follows:

      - user requests N cores (via --procs)
      - system granularity is determined (e.g. can allocate only an integer number of nodes); this depends on --ppn-used
      - if necessary, N is rounded up

    What you suggest is to modify the value specified via --ppn-used instead of the value specified via --procs. Or are you suggesting to do this only if at most a single node is allocated? Or do you suggest to not round up at all, and leave some cores unused?

    The intention behind rounding up is that you will have to pay for these cores anyway, so you may as well use them. Whether this is correct very likely doesn't depend on the PPN of the system you use. In a production run, you probably want to use the additional cores (why not?), in a benchmark you don't. But then, in a benchmark you probably need to be aware of the node configuration anyway, so you may as well specify --ppn-used.
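
The rounding described in this comment can be sketched roughly as follows. This is an illustration under stated assumptions, not simfactory's implementation: the function name is hypothetical, the granularity is assumed to be exactly the --ppn-used value, and --ppn-used is assumed to default to the node's core count.

```python
def round_up_procs(procs, ppn, ppn_used=None):
    """Hypothetical helper: return (nodes, rounded thread count).

    Assumes the allocation granularity equals ppn_used, and that
    ppn_used defaults to ppn (all cores of each node are used).
    """
    if ppn_used is None:
        ppn_used = ppn
    if procs < 1 or ppn_used < 1 or ppn_used > ppn:
        raise ValueError("need procs >= 1 and 1 <= ppn_used <= ppn")
    # Ceiling division: the number of nodes needed for the request.
    nodes = -(-procs // ppn_used)
    return nodes, nodes * ppn_used


print(round_up_procs(6, 12))              # whole node is used
print(round_up_procs(6, 12, ppn_used=6))  # undersubscribed node
```

Under these assumptions, --procs 6 on a 12-core node becomes 12 threads, whereas adding --ppn-used 6 keeps it at 6 on the same single node.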

  2. Ian Hinder reporter

    (I just realised that in my test job submission on LoneStar I set the wrong --procs anyway, so my statement about what it actually does might be wrong)

    I'm not sure exactly how it should be implemented. The basic user interface is quite simple: the user requests a number of threads with --procs, and simfactory chooses everything so that at least this many threads get executed, rounded up to the granularity of the system. Maybe it is best to keep this model, as it is easy to explain (though a warning that the number of threads being executed has been rounded might be useful). I think my suggestion is that if less than a node has been requested, the rounding shouldn't happen.

    Actually, in my case, what I'm interested in is setting the number of MPI processes. Maybe what I want is to be able to set this directly. I think that at the moment I have to set --procs 6 --ppn-used 6 to get one MPI process. This means I have to know that there are 6 threads per process. I would like to be able to say --processes 1 instead; that would solve my problem.

  3. Erik Schnetter

    So you would not mind running with more OpenMP threads than expected? These may then span several NUMA nodes and actually run more slowly.

  4. Ian Hinder reporter

    I didn't mean that; I wanted to be able to specify the number of MPI processes and have simfactory calculate the number of threads based on its knowledge of the machine. So if I said "--processes 1", on LoneStar it would give me one process with 6 threads bound to one processor, and if I said "--processes 2", it would give me two processes, each with 6 threads bound to its respective processor. That is, I think my request is now for an additional option (without changing any existing interface) to directly specify the number of MPI processes instead of the total number of threads. My obvious use case is test suites, which care about the number of processes.
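
A rough sketch of how the requested --processes option might derive the thread count. Everything here is hypothetical: the option does not exist yet, and the machine dictionary and function name are invented for illustration. The LoneStar figures (12 cores per node, 6 threads per process) are the ones quoted in this thread.

```python
# Hypothetical machine description; the values are those quoted in this thread.
LONESTAR = {"ppn": 12, "threads_per_process": 6}


def layout_for_processes(processes, machine):
    """Hypothetical helper: derive a job layout from an MPI process count."""
    if processes < 1:
        raise ValueError("processes must be positive")
    threads = machine["threads_per_process"]
    total_threads = processes * threads
    # Ceiling division: nodes needed to hold all threads.
    nodes = -(-total_threads // machine["ppn"])
    return {
        "processes": processes,
        "threads_per_process": threads,
        "total_threads": total_threads,  # the value passed via --procs today
        "nodes": nodes,
    }


print(layout_for_processes(1, LONESTAR))
print(layout_for_processes(2, LONESTAR))
```

The point of the design is that the user supplies only the process count, and the threads-per-process value comes from the machine description rather than from the command line.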
