CT_MultiLevel tests abort when run using more than one thread via simfactory

Issue #2289 new
Roland Haas created an issue

In commit f813545 "CT_MultiLevel: request single-thread execution of all tests." of ctthorns, the CT_MultiLevel tests were changed to force a single OpenMP thread. This is a good thing, as the Gauss-Seidel sweep that the code uses has a (known, harmless) race condition when more than a single thread is used, which renders results non-deterministic.

Unfortunately the change interferes badly with a sanity check in Carpet's SetupGH routine, which verifies that the number of threads requested in the RunScript agrees with the number of threads that Carpet sees in use:

if (cactus_num_threads != mynthreads) {
  CCTK_VWarn(CCTK_WARN_ABORT, __LINE__, __FILE__, CCTK_THORNSTRING,
             "The environment variable CACTUS_NUM_THREADS is set to %d, "
             "but there are %d threads on this process. This may "
             "indicate a severe problem with the OpenMP startup "
             "mechanism.",
             cactus_num_threads, mynthreads);
}

which uses the env variable CACTUS_NUM_THREADS.

This currently causes the tests to fail on any cluster when using simfactory’s --testsuite option.

keyword: CT_MultiLevel

Comments (5)

  1. Roland Haas reporter

    @Eloisa Bentivegna do you have a preferred way to try and fix this? One option would be to add a parameter single_threaded to CT_MultiLevel and give every #pragma omp parallel an if clause: #pragma omp parallel if(!single_threaded).

  2. Eloisa Bentivegna

    This seems to me like a Carpet issue rather than a CT_MultiLevel issue, as any other thorn using the parameter Carpet::num_threads would run into the same problem. There are basically two conflicting ways to specify the number of threads, which need to be reconciled in a way that retains the ability to specify the number of threads at the parfile level (so it can be used in the tests). I am not sure what to suggest in this direction.

    On the same note, CT_MultiLevel does not use raw OpenMP directives, but adopts the LoopControl headers (except for two minor utility functions), so the solution you suggest can’t be applied.

  3. Ian Hinder

    I think the best way to fix this would be to add the option to specify that the test requires a specific number of threads in test.ccl, as is done for the number of processes. So we would have nthreads as well as nprocs. The test system would then set CACTUS_NUM_THREADS and OMP_NUM_THREADS, and Carpet would find a consistent setup.

    This solution is probably also the most work to implement.

  4. Roland Haas reporter

    @Eloisa Bentivegna LoopControl’s LC_LOOP macros themselves do not enable any parallel threading. They need to be surrounded by a #pragma omp parallel section (see e.g. their use in Llama). Without it they are just single-threaded. Having had a look at CT_MultiLevel’s code, it seems that there is no multi-threading going on. And indeed, if I run the poisson test with 4 threads I only see 1 core (per MPI rank) in use, even when removing the num_threads setting.

    That is to say: CT_MultiLevel always (with the exception of a reduction and the loops in CT_Analytic and CT_Dust), even before introducing the num_threads setting, used only a single thread.

    There should have never been a chance for a race condition (since there was only one runner) and any claim that I have made about there being one in the Gauss-Seidel iteration was false, sorry.

    It certainly leaves me worried why your data produced with gcc (4.8.2 admittedly) and -O2 would differ from what is produced on the tutorial server (also gcc) or my workstation. The affected test seems to be the boostedpuncture test, which now of course produces bit-identical results for me whether run with OMP_NUM_THREADS=1 or OMP_NUM_THREADS=64 on my workstation (gcc9, -O2, 24 cores, so I am oversubscribing it).

    My understanding is that Carpet::num_threads is meant for the case where one somehow cannot pass the OMP_NUM_THREADS variable to the executable (though I do not know of an MPI implementation that would not give me some way of passing ENV variables). In that case one would set CACTUS_NUM_THREADS and the num_threads parameter to the same value, which will not cause Carpet to abort the run.

    For the upcoming release I would try reverting f813545 "CT_MultiLevel: request single-thread execution of all tests." of ctthorns and regenerating the data with a “recent” gcc compiler (7 or better I’d say) using OMP_NUM_THREADS=1 (just to be sure).

    I am wondering if the issue is just the old gcc compiler used (and everything being roundoff) but will have to see if I can somehow get a gcc-4.8 compiler to run on my workstation.
