Is there an easy way to remove incomplete files when restarting?

Issue #485 closed
Christopher Gates created an issue

To my knowledge, when restarting a previously interrupted job that left behind incomplete files, Snakemake can:

a) --rerun-incomplete: i.e. treat the incomplete files as invalid, and rerun the saved command from the earlier run

b) --ignore-incomplete: i.e. treat the files marked as incomplete as complete and keep going

If our user cancels the job to adjust a subordinate config file, neither of the options above is ideal. It would be nice for us to have:

c) --remove-incomplete: i.e. remove the incomplete files from the filesystem and associated snakemake metadata, build a new command (with the new config), and run.

Is there an easy way to do this? (FYI we are using 3.7.1 and are open to upgrading.)

Snakemake rocks! Thanks!


Comments (6)

  1. Johannes Köster

    Actually, --rerun-incomplete will recreate incomplete files, but any changes to the Snakefile or config will be reflected. I.e., no saved commands here. Hence, I think --rerun-incomplete is what you want.
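    For the scenario in the issue, that suggests a restart along these lines (a minimal sketch; the config path is a placeholder for whatever subordinate config was edited):

    ```sh
    # Job was interrupted; adjust the subordinate config, then restart.
    # --rerun-incomplete treats incomplete outputs as invalid and regenerates
    # them with the *current* Snakefile and config (no saved commands).
    snakemake --rerun-incomplete --configfile config/sub.yaml
    ```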

  2. Colin Brislawn

    Hello @johanneskoester,

    I think I have a use case for --remove-incomplete. It's true that --rerun-incomplete will faithfully recreate all incomplete files, but in the meantime, those incomplete files can be used as inputs to other rules, leading to inaccurate results. We are working through this issue over here:

    Do you have any recommendations? I think implementing --remove-incomplete would be a great solution, but that's up to you.



  3. Rasmus Ågren

    Snakemake shouldn't run downstream rules if a job aborted. If that is happening, it should be considered a bug, so implementing a new option for removing incomplete files isn't the way to deal with it.

    I saw in the other issue that you ran it on an interactive node. Could it be that the .snakemake directory (where stuff such as incomplete output is logged) existed on that node and that you got another node when you tried to rerun? I.e., was the workdir local to the node?

  4. Colin Brislawn

    Ah, ok! I understand why this new flag is unneeded.

    My .snakemake directory is in the same folder as my data, which is within my home directory. Unless the node is silently doing something strange, I think all the logging should be preserved. I was also able to recreate this issue by stopping the script using ctrl+c and restarting it, so I'm not sure the node is the issue.

    Thank you for your help!

  5. Rasmus Ågren

    If you cancel the job manually Snakemake should say something like "Removing output files of failed job run_blast since they might be corrupted", so you shouldn't have blast-hits.txt at all after that. It's difficult to help without more info, but some possible reasons could be:

    • The input and output files are named differently without you realising it, for example an em dash (—) instead of a hyphen (-).
    • You are doing a dry run; the order in which rules are printed doesn't mean anything. You can generate the DAG/rule graphs to see whether the jobs/rules are connected as you expect them to be.
    • Maybe there is some permission problem that either causes the logging to malfunction or that makes it impossible for Snakemake to remove the failed file. You should see warnings if that is the case though.
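    For the dry-run and graph checks above, the usual invocations look like this (a sketch; `dot` is the Graphviz renderer and must be installed separately):

    ```sh
    # Dry run: show what would be executed, without running anything
    snakemake -n

    # Render the job DAG and the rule graph to inspect the connections
    snakemake --dag | dot -Tsvg > dag.svg
    snakemake --rulegraph | dot -Tsvg > rulegraph.svg
    ```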

    I suggest you first try updating both Snakemake and Hundo, and if it still doesn't work, send the full output and your example to the Hundo issue tracker. My gut feeling is that this isn't a Snakemake issue, but I could of course be wrong. Does Hundo use the group feature in Snakemake? It's a new feature and there seem to be some issues with it still. It should only affect cluster jobs though.
