rules.output.output_name syntax fails with remote provider

Issue #1258 resolved
Kyle Beauchamp created an issue

I typically prefer the rules.rule_name.output.output_name syntax for my rule inputs, so that I don’t have to maintain the filenames in multiple places. However, this syntax appears to fail when a default remote provider is used. For example, the following Snakefile works with local storage but fails on S3 remote storage (snakemake --default-remote-provider S3 --default-remote-prefix my_bucket_name -- B). If I change rule B to use the full file path instead, the Snakefile does work with a default remote provider. I suspect a bug in the file path templating is preventing the paths from resolving in this case. (Tested on 5.5.4 from Bioconda.)

rule A:
    output:
        csv = "out/A.csv"
    shell:
        "echo 'hi' > {output.csv}"
rule B:
    input:
        csv = rules.A.output.csv,  # This fails
        #csv = "out/A.csv",  # This works
    output:
        csv = "out/B.csv",
    shell:
        "cat {input.csv} > {output.csv}"

Comments (6)

  1. Derek Matthew Croote

    I have encountered this as well. The failure looks to be caused by the bucket name being prepended multiple times. Using your example (and a bucket name of snakemaketesting9942) the missing input file for rule B is: snakemaketesting9942/snakemaketesting9942/out/A.csv.

    This appears to be the result of the apply function within apply_default_remote being executed more than once on the same value. Print statements added around line 338 of rules.py show:

    Old value: out/A.csv
    New value: snakemaketesting9942/out/A.csv
    
    Old value: snakemaketesting9942/out/A.csv
    New value: snakemaketesting9942/snakemaketesting9942/out/A.csv
    
    Old value: out/B.csv
    New value: snakemaketesting9942/out/B.csv
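
The trace above can be reproduced outside Snakemake with a small sketch. The function names `apply_prefix` and `apply_prefix_once` are hypothetical stand-ins and are not Snakemake's actual code. The sketch only assumes that the prefix is prepended once when rule A's output is processed and again when rule B reuses the same value as its input. The guarded variant is one possible way to make the operation idempotent, not necessarily how the bug was fixed.

```python
# Hypothetical stand-ins for illustration; not Snakemake's real functions.
def apply_prefix(path: str, prefix: str) -> str:
    """Naive version: always prepends the prefix."""
    return f"{prefix}/{path}"


def apply_prefix_once(path: str, prefix: str) -> str:
    """Guarded version: leaves paths that already carry the prefix alone."""
    if path.startswith(prefix + "/"):
        return path
    return f"{prefix}/{path}"


prefix = "snakemaketesting9942"

# Rule A's output is prefixed once when rule A is processed.
a_output = apply_prefix("out/A.csv", prefix)

# Rule B reuses that already-prefixed value as its input.
b_input_naive = apply_prefix(a_output, prefix)         # prefixed a second time
b_input_guarded = apply_prefix_once(a_output, prefix)  # left unchanged

print(b_input_naive)
print(b_input_guarded)
```

A plain `startswith` check would mistreat a local path whose first directory happens to share the bucket's name, so tracking whether a value has already been converted is likely a more robust approach than inspecting the string.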
    