anonymous created an issue

A simulation restart with SimFactory failed with the following error message:

DEBUG: checkpoint file: /work/eschnett/philip/simulations/d3.0-mclachlan-i0031/output-0000/d3.0-mclachlan/checkpoint.chkpt.it_0.h5
file=/work/eschnett/philip/simulations/d3.0-mclachlan-i0031/output-0000/d3.0-mclachlan/checkpoint.chkpt.it_0.h5
dfile=/work/eschnett/philip/simulations/d3.0-mclachlan-i0031/output-0001/d3.0-mclachlan/checkpoint.chkpt.it_0.h5
DEBUG: linking /work/eschnett/philip/simulations/d3.0-mclachlan-i0031/output-0000/d3.0-mclachlan/checkpoint.chkpt.it_0.h5 to /work/eschnett/philip/simulations/d3.0-mclachlan-i0031/output-0001/d3.0-mclachlan/checkpoint.chkpt.it_0.h5
before link
Error: Could not link checkpoint file /work/eschnett/philip/simulations/d3.0-mclachlan-i0031/output-0000/d3.0-mclachlan/checkpoint.chkpt.it_0.h5 to /work/eschnett/philip/simulations/d3.0-mclachlan-i0031/output-0001/d3.0-mclachlan/checkpoint.chkpt.it_0.h5


  anonymous reporter
    I've committed a patch to include the operating system error when the link fails. Looking at the code and the debug output, I can't see any obvious reason why this would fail. I've removed the sys.exit(1) and replaced it with a return False, which will disable checkpointing. Hopefully, when this happens again, the operating system error will help pinpoint the cause.
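
    For illustration only, the behaviour described above might look roughly like the sketch below. This is not the committed patch: the function name link_checkpoint is hypothetical, and the use of a hard link via os.link is an assumption (the thread only says "link").

```python
import os
import sys


def link_checkpoint(src, dst):
    """Link checkpoint file src to dst (hypothetical helper, not SimFactory code).

    Returns True on success. On failure, prints the operating system's
    reason and returns False instead of terminating the process.
    """
    try:
        # Assumption: a hard link; a symlink (os.symlink) would be handled the same way.
        os.link(src, dst)
    except OSError as err:
        # err.strerror is the OS-level explanation of why the call failed.
        sys.stderr.write(
            "Error: Could not link checkpoint file %s to %s: %s\n"
            % (src, dst, err.strerror)
        )
        return False
    return True
```

    A caller would then check the return value and skip checkpoint recovery when it is False, rather than aborting the whole restart.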

  anonymous reporter
    On second thought, it might be because I wasn't checking to make sure the restore_dir, in this case /work/eschnett/philip/simulations/d3.0-mclachlan-i0031/output-0001/, existed before attempting to link the file. I've added code to create the restore_dir if it doesn't exist.
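
    A minimal sketch of that fix, assuming Python 3 (for exist_ok) and a hard link; the name link_into_restore_dir is hypothetical and this is not the actual SimFactory change. Note that the checkpoint sits in a subdirectory of the restore_dir (d3.0-mclachlan/ in the log above), so the sketch creates the full parent directory of the destination file.

```python
import os


def link_into_restore_dir(src, dst):
    """Create dst's parent directories if needed, then link src to dst.

    Hypothetical helper for illustration; not SimFactory code.
    """
    parent = os.path.dirname(dst)
    # Creates every missing path component; does nothing if the directory exists.
    os.makedirs(parent, exist_ok=True)
    # Assumption: a hard link, which requires src and dst on the same filesystem.
    os.link(src, dst)
```

    Without the makedirs call, the link attempt fails with "No such file or directory" whenever the new output directory has not been created yet, which would match the failure reported here.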

