Code crash after checkpointing when group finding is enabled

Issue #15 resolved
Former user created an issue

see title

Comments (6)

  1. Douglas Potter repo owner

    What are the symptoms (e.g., what assert is tripped)? Did you mean that the code crashes when lightcone healpix output is enabled, but group finding is not?

  2. Former user Account Deleted reporter

    The problematic assert is:

    pkdgrav3_mpi: mdl2/mpi/mdl.c:655: mdlCacheReceive: Assertion `c->iType != 0' failed.

    I had both lightcone healpix and group finding enabled.

  3. Joachim Stadel

    Mischa, can you upload the text file containing the slurm output (the text output from pkdgrav3) just prior to the assert, so that we can see clearly which phase of the code it reached? It would be good to include a couple of hundred lines before that point.

  4. Douglas Potter repo owner

    So the phases were:

    • Last substep ends
    • Domain Decomposition & Tree build
    • FoF group finding
    • Gravity on main step
    • Output of group statistics
    • P(k) measurement
    • Checkpoint written
    • Gravity on main step (repeated) -> crashes

    The root cause is that the cell cache (CID 1) is not open during the second gravity calculation. When no checkpoint is written, the last two steps are omitted and the code doesn't crash.
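
The failure mode described above can be illustrated with a minimal C sketch. It is a toy model, not the actual mdl.c code: every name prefixed with `sketch` or `SKETCH_` is hypothetical, and the only details taken from this thread are that the assert checks `c->iType != 0` and that the cell cache uses CID 1. How the cell cache came to be closed before the repeated gravity step is not stated in the thread, so the sketch only shows that a receive on a closed cache slot fails the check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical constants: 0 marks a closed slot, matching the assert's test. */
enum { SKETCH_CACHE_CLOSED = 0, SKETCH_CACHE_OPEN = 1 };
/* CID 1 is the cell cache per the thread; the other values are assumptions. */
enum { SKETCH_CID_PARTICLE = 0, SKETCH_CID_CELL = 1, SKETCH_MAX_CID = 4 };

typedef struct {
    int iType; /* 0 while the cache is not open */
} SketchCache;

typedef struct {
    SketchCache cache[SKETCH_MAX_CID];
} SketchMdl;

/* Start with every cache slot closed. */
static void sketchInit(SketchMdl *mdl) {
    for (int cid = 0; cid < SKETCH_MAX_CID; ++cid) {
        mdl->cache[cid].iType = SKETCH_CACHE_CLOSED;
    }
}

static void sketchCacheOpen(SketchMdl *mdl, int cid) {
    mdl->cache[cid].iType = SKETCH_CACHE_OPEN;
}

static void sketchCacheClose(SketchMdl *mdl, int cid) {
    mdl->cache[cid].iType = SKETCH_CACHE_CLOSED;
}

/* Stand-in for the receive path. The real code asserts c->iType != 0;
 * this version returns false instead of aborting so the failing case
 * can be observed. */
static bool sketchCacheReceive(const SketchMdl *mdl, int cid) {
    const SketchCache *c = &mdl->cache[cid];
    return c->iType != SKETCH_CACHE_CLOSED;
}
```

With this model, calling `sketchCacheReceive(&mdl, SKETCH_CID_CELL)` returns true only between `sketchCacheOpen` and `sketchCacheClose` on the same CID. A gravity pass that requests remote cells while the slot is closed corresponds to the tripped assertion reported above.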
