Allow expand to take 'unused' wildcards

Issue #1128 resolved
Maarten van der Sande created an issue

I am relatively new to Snakemake, so the problem could very well be my workflow. What I would like to do is build a single wildcards dictionary and pass it to every expand call as keyword arguments, so that I do not have to type out all the keyword arguments for each call.

wildcards = {'sample': ['a', 'b'],
             'subgroup': ['train', 'test'],
             'folder': '/home/user/...'}

rule example:
    input:
        expand("{folder}/{sample}.fa", **wildcards)

However, this causes problems when the dictionary contains 'unused' iterable wildcards (here the subgroup key/value pair). expand still builds every combination, including ones the pattern does not need, so the same path is produced more than once. The result is the error

Duplicate output file pattern in rule complement_bed_file. First two duplicate for entries 0 and 1
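
The duplication can be reproduced without Snakemake. The sketch below is a simplified stand-in, not Snakemake's actual expand: the name naive_expand and the folder value "data" are made up for illustration. It takes the product over every key it is given, so the unused subgroup values multiply the result.

```python
from itertools import product


def naive_expand(pattern, **wildcards):
    """Hypothetical stand-in: combine ALL given wildcards, used or not."""
    names = list(wildcards)
    # Wrap bare strings so they count as one value, not a sequence of characters.
    values = [[v] if isinstance(v, str) else list(v) for v in wildcards.values()]
    # str.format silently ignores keyword arguments the pattern does not use.
    return [pattern.format(**dict(zip(names, combo))) for combo in product(*values)]


wildcards = {"sample": ["a", "b"], "subgroup": ["train", "test"], "folder": "data"}
paths = naive_expand("{folder}/{sample}.fa", **wildcards)
print(paths)  # each path appears once per subgroup value
```

Two samples times two subgroups give four entries, but only two distinct paths.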

This can, however, be overcome with small changes to the expand function:

  • Filter duplicates out of the returned list (convert to a set and back to a list). The drawback is that duplicates someone specified on purpose are filtered out as well, and the set conversion does not preserve the original order.
        return list(set(filepattern.format(**comb)
                        for comb in map(dict, combinator(*flatten(wildcards)))
                        for filepattern in filepatterns))
  • Or use only the wildcards that the filepattern actually needs:
        wildcards = {k: v for k, v in wildcards.items()
                     if k in re.findall('{([^}]+)}', ''.join(filepatterns))}
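
A minimal, self-contained sketch of the second option follows. It is not Snakemake's implementation: expand_used_only and used_fields are hypothetical names, the folder value "data" is a placeholder, and the field names are read with string.Formatter instead of the regular expression above. It assumes plain {name} wildcards without constraints.

```python
from itertools import product
from string import Formatter


def used_fields(pattern):
    """Return the wildcard names that actually appear in the pattern."""
    return {name for _, name, _, _ in Formatter().parse(pattern) if name}


def expand_used_only(pattern, **wildcards):
    """Hypothetical stand-in for expand() that drops unused wildcards first."""
    needed = used_fields(pattern)
    names = [k for k in wildcards if k in needed]
    # Wrap bare strings so they count as one value, not a sequence of characters.
    values = [[wildcards[k]] if isinstance(wildcards[k], str) else list(wildcards[k])
              for k in names]
    return [pattern.format(**dict(zip(names, combo))) for combo in product(*values)]


wildcards = {"sample": ["a", "b"], "subgroup": ["train", "test"], "folder": "data"}
result = expand_used_only("{folder}/{sample}.fa", **wildcards)
print(result)
```

Because subgroup never enters the product, each path is generated exactly once, and duplicates that a user lists on purpose in a used wildcard are still kept.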

What are your thoughts on this? Is this desired behaviour for Snakemake? If so, I can make a PR.
