tests
Hi Johannes -
I really like this idea of a repo of wrappers that can be individually documented and maintained. Before using other people's wrappers in production though, it would be nice to know that they have been tested in some way. Do you have any ideas for testing wrappers on e.g. travis-ci?
I think the biggest challenge is example data. For example, testing a wrapper for an aligner would require a prepared index for an aligner. Would this be created on-the-fly by a test (downloading the FASTA from e.g. Ensembl), or be downloaded, already prepared, directly from some central example data location? If the latter, where would it be stored? More generally, what is the minimal set of example data that can be re-used by the maximum number of wrappers?
Once that's solved, tests can be set up to run only on those wrappers that have been updated to avoid exceeding CI resources.
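As a rough sketch of how "only test what changed" could work: given the list of files changed in a commit (for example from `git diff --name-only`) and the list of known wrapper directories, map each changed file to the wrapper that contains it. The function name `changed_wrappers` and the example paths are assumptions made for illustration, not part of the repository.

```python
from pathlib import PurePosixPath


def changed_wrappers(changed_files, wrapper_dirs):
    """Return the wrapper directories touched by a list of changed file paths.

    changed_files: repo-relative paths, e.g. the output of `git diff --name-only`.
    wrapper_dirs: repo-relative wrapper directories, e.g. "bio/samtools/index".
    """
    wrappers = {PurePosixPath(d) for d in wrapper_dirs}
    touched = set()
    for name in changed_files:
        path = PurePosixPath(name)
        # A file belongs to a wrapper if one of its parent directories is one.
        for parent in path.parents:
            if parent in wrappers:
                touched.add(str(parent))
                break
    return sorted(touched)


if __name__ == "__main__":
    # Hypothetical example paths.
    files = ["bio/samtools/index/wrapper.py", "README.md", "bio/bwa/mem/test/Snakefile"]
    dirs = ["bio/samtools/index", "bio/samtools/sort", "bio/bwa/mem"]
    print(changed_wrappers(files, dirs))
```

The CI job would then run tests only for the returned directories. Changes to shared test data would need a separate rule, such as rerunning everything.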
Anyway, I just wanted to get the conversation started about this to kick around ideas, and of course I would be happy to contribute.
Comments (6)
-
-
Hi everyone,
I've only recently started digging deeper into snakemake, but I just want to say this is something I'm interested in having too.
I'm wondering though, since these are tests for wrappers and not tests for the actual tools themselves, is it maybe enough in some cases to just test for the actual shell strings? In other words, given a rule that uses the wrapper in a Snakefile, what is the resulting shell string? It would avoid maintaining extra datasets just for testing.
Of course, testing actual runs of the tools being wrapped would be better. However, there may be cases where setting up such a test dataset means too much overhead. Some tools may also require a certain size of input in order to work properly. In those cases, the minimum should then be to have shell string tests.
What do you think?
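A minimal sketch of what such a shell string test could look like, assuming the wrapper builds its command from a format template. The template, the `render_command` helper and the test name are hypothetical. This does not use Snakemake's own formatting machinery, it only illustrates the idea of asserting on the command instead of running the tool.

```python
import shlex


def render_command(template, **fields):
    """Fill a shell command template the way a wrapper might before running it."""
    return template.format(**fields)


# Hypothetical template for a samtools index wrapper.
TEMPLATE = "samtools index {extra} {input} {output}"


def test_samtools_index_command():
    cmd = render_command(
        TEMPLATE, extra="", input="sample.bam", output="sample.bam.bai"
    )
    # Compare token lists so that incidental whitespace does not fail the test.
    assert shlex.split(cmd) == ["samtools", "index", "sample.bam", "sample.bam.bai"]


if __name__ == "__main__":
    test_samtools_index_command()
    print("ok")
```

No input data is needed for a test like this, which is the appeal. The cost is that it says nothing about whether the command actually produces the expected output.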
-
reporter I agree we definitely don't want to replicate unit tests for each tool, but I think it is important to test the output. Shell string tests will work for a subset of wrappers, but not all wrappers will even use a shell call, and some might use several. Some wrappers may do complex things that need to be tested.
Tools that require a certain size input will be a problem . . . maybe we need to rely on external data from SRA for those?
As sort of a sketch . . . I'm picturing some sort of "driver" Snakefile that sets up minimal data for each test case and then include:s a wrapper's test Snakefile supplied from the commandline via --config=wrapper=path/to/wrapper:
    # Include the wrapper's test Snakefile. The test Snakefile is required to
    # have a dictionary of results, something like:
    # {'sample.bam.bai': <md5sum here>}
    import os
    include: os.path.join(config['wrapper'], 'Snakefile')

    setup = ['R1.fastq.gz', 'R2.fastq.gz', 'sample.bam', 'genome.fa']

    rule all:
        # list() is needed: a list cannot be concatenated with a dict view
        input: setup + list(results.keys())

    # data downloaded from htslib or maybe the wrappers repo
    rule fastqs:
        output: 'R1.fastq.gz', 'R2.fastq.gz'

    rule bams:
        output: 'sample.bam'

    rule fasta:
        output: "genome.fa"

    onsuccess:
        # check against configured md5sums
        for k, v in results.items():
            assert md5sum(open(k)) == v
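The driver sketch calls an `md5sum` helper that is not defined anywhere in the thread. One possible implementation is shown below, using only the standard library. It is an assumption about what the helper would do, and it takes a file path rather than an open file object, so the call in `onsuccess` would become `md5sum(k)`.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use flat even if a test output turns out to be large.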
As long as the data available for testing is documented, wrappers can expect that data to exist and therefore can have minimal boilerplate. A samtools test snakefile in bio/samtools/index could then look like:

    results = {'sample.bam.bai': <md5sum here>}

    rule samtools_index:
        input: "sample.bam"  # set up by the driver snakefile
        output: "sample.bam.bai"
        wrapper: 'file://path/to/wrapper'
-
We are evaluating a few different tools for our new pipeline and I have a few questions regarding validation possibilities with snakemake-wrappers.
Has there been any progress on implementing tests that validate the output with md5sum, or are there any plans to implement this?
I also have a question about the existing tests. Right now, it seems to me that all tests are run against master (for bwa mem, "master/bio/bwa/mem"). Wouldn't that be a problem if I want to use a specific version, for example the latest tagged version, and also test that version rather than master?
Could the test for bwa mem be changed from

    rule bwa_mem:
        input:
            ["reads/{sample}.1.fastq", "reads/{sample}.2.fastq"]
        output:
            "mapped/{sample}.bam"
        log:
            "logs/bwa_mem/{sample}.log"
        params:
            index="genome",
            extra=r"-R '@RG\tID:{sample}\tSM:{sample}'",
            sort="none",             # Can be 'none', 'samtools' or 'picard'.
            sort_order="queryname",  # Can be 'queryname' or 'coordinate'.
            sort_extra=""            # Extra args for samtools/picard.
        threads: 8
        wrapper:
            "master/bio/bwa/mem"

to

    rule bwa_mem:
        input:
            ["reads/{sample}.1.fastq", "reads/{sample}.2.fastq"]
        output:
            "mapped/{sample}.bam"
        log:
            "logs/bwa_mem/{sample}.log"
        params:
            index="genome",
            extra=r"-R '@RG\tID:{sample}\tSM:{sample}'",
            sort="none",             # Can be 'none', 'samtools' or 'picard'.
            sort_order="queryname",  # Can be 'queryname' or 'coordinate'.
            sort_extra=""            # Extra args for samtools/picard.
        threads: 8
        wrapper:
            "file:.."
-
So far, there are no md5sum checks, but they can easily be added by modifying the test.py. For example, the meta.yaml can contain a list of filenames and checksums, and this can be checked in the run function in test.py.
Regarding the master, it is just a placeholder. For the docs, it is replaced by the version to build; for tests, a local prefix is added such that nothing is pulled from git, but rather the version we have currently checked out is tested.
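A sketch of how such a check could look, assuming the filenames and checksums from meta.yaml have already been loaded into a plain dictionary. The function name `find_mismatches`, the dictionary layout and how it would be called from test.py are all assumptions, since the thread does not specify them.

```python
import hashlib
from pathlib import Path


def find_mismatches(expected, workdir):
    """Compare files under `workdir` against a {filename: md5 hex} mapping.

    Returns {filename: problem description} for every file that is missing
    or whose checksum differs. An empty dict means everything matched.
    """
    problems = {}
    for name, want in expected.items():
        path = Path(workdir) / name
        if not path.is_file():
            problems[name] = "missing"
            continue
        # Test outputs are expected to be small, so read them in one go.
        got = hashlib.md5(path.read_bytes()).hexdigest()
        if got != want:
            problems[name] = f"expected {want}, got {got}"
    return problems
```

Collecting all problems before failing, rather than asserting on the first one, gives a more useful report when a wrapper produces several outputs.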
- edited description
- changed status to resolved
Yes, we definitely need some kind of testing framework. I think it would be best to start with a tiny snakefile in each wrapper directory, along with really small test data (e.g. the fastq and sam files from the htslib repository). Then, a simple nosetest script like the one we have in the Snakemake workflow repository can run all these snakefiles automatically.
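A minimal sketch of such a runner, assuming each wrapper directory holds a test file named `Snakefile` somewhere below the repository root. The helper names and the directory layout are hypothetical. The flags `--snakefile`, `--directory` and `--cores` are standard Snakemake command-line options.

```python
import subprocess
from pathlib import Path


def find_test_snakefiles(root):
    """Return every file named 'Snakefile' below `root`, in sorted order."""
    return sorted(Path(root).rglob("Snakefile"))


def snakemake_command(snakefile, cores=1):
    """Build the command line that runs one test Snakefile in its own directory."""
    snakefile = Path(snakefile)
    return [
        "snakemake",
        "--snakefile", str(snakefile),
        "--directory", str(snakefile.parent),
        "--cores", str(cores),
    ]


def run_all(root):
    """Run every discovered test Snakefile and stop at the first failure."""
    for snakefile in find_test_snakefiles(root):
        subprocess.check_call(snakemake_command(snakefile))
```

Calling `run_all(".")` from the repository root would run each test in turn. A test framework could instead generate one test case per Snakefile, so that each failure is reported separately.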