shotgun-pipeline
The shotgun pipeline processes shotgun metagenomics data and generates an analysis report.
The Pipeline
The pipeline performs the following steps:
- Subsampling (optional): Each fastq file is reduced to a specified number of reads in order to reduce processing time
- FastQC (optional): FastQC is run on each fastq file to generate sequence quality plots
- Quality Trimming: Quality trimming is performed using Shi7.
- Alignment: Reads are aligned to the Web of Life (WoL) reference metagenomic database using BWA. A BIOM table is generated using the WoL gOTU_from_maps.py script.
- Beta Diversity: Beta diversity is estimated using QIIME 2.
- Alpha Diversity: Alpha diversity is estimated using QIIME 2.
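To make the optional subsampling step concrete, here is a minimal sketch of reducing a fastq file to a fixed number of reads. It assumes the simplest possible approach, keeping the first N reads (four lines per read). The pipeline's actual subsampling method is not described on this page and may differ, and `subsample_fastq` is a hypothetical helper name.

```shell
# Sketch only: keep the first N reads of a fastq file (4 lines per read).
# Assumption: taking the head of the file; the pipeline itself may sample
# reads differently. subsample_fastq is a hypothetical helper.
subsample_fastq() (
  infile=$1
  nreads=$2

  # Stream the file, decompressing when the name ends in .gz.
  read_fastq() {
    case $infile in
      *.gz) gzip -dc "$infile" ;;
      *)    cat "$infile" ;;
    esac
  }

  if [ "$nreads" -eq 0 ]; then
    # 0 = no subsampling, mirroring the --subsample default.
    read_fastq
  else
    read_fastq | head -n "$(( nreads * 4 ))"
  fi
)
```

For example, `subsample_fastq sample_R1.fastq.gz 10000 > subset_R1.fastq` would write the first 10000 reads of a hypothetical input file to an uncompressed output file.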
Options
Advanced options
--refdb database | Path to shogun database |
--flag samplelist | A comma-delimited list of sample names to flag in the report (not tested) |
Standard gopher-pipeline options
--fastqfolder folder | A folder containing fastq files to process |
--subsample integer | Subsample the specified number of reads from each sample. 0 = no subsampling (default = 0) |
--samplesheet file | A samplesheet |
--runname string | Name of the sequencing run |
--projectname string | Name of the experiment (UMGC Project name) |
--illuminasamplesheet file | An illumina samplesheet, from which extra sample information can be obtained |
--nofastqc | Don't run FastQC |
--samplespernode integer | Number of samples to process simultaneously on each node (default = 1) |
--threadspersample integer | Number of threads used by each sample |
--scratchfolder folder | A temporary/scratch folder |
--outputfolder folder | A folder to deposit final results |
--extraoptionsfile file | File with extra options for trimmomatic, tophat, cuffquant, or featurecounts |
--resume | Continue where a failed/interrupted run left off |
--verbose | Print more information while running |
--help | Print usage instructions and exit |
Fastq file support: Paired-end and single-end reads are supported. Gz-compressed or uncompressed fastq files are supported.
Running the pipeline
Before submitting a job to process your entire dataset, run the pipeline interactively with the --subsample option to confirm that it works correctly on a small sample of your data. This lets you identify and solve problems quickly.
    module load umgc
    module load gopher-pipelines
    shotgun-pipeline --fastqfolder /path/to/fastqs
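As an illustration of the recommended small test run, the sketch below assembles a command that adds --subsample and --verbose to the basic invocation. Both flags come from the Options section; the helper name `test_run_cmd` and the default of 10000 reads are assumptions chosen for the example, not recommendations from the pipeline authors. The helper only prints the command so that it can be inspected first.

```shell
# Sketch: build the command line for a small interactive test run.
# test_run_cmd is a hypothetical helper and 10000 reads is an arbitrary
# example value. Paths containing spaces are not handled.
test_run_cmd() (
  fastqfolder=$1
  nreads=${2:-10000}
  printf 'shotgun-pipeline --fastqfolder %s --subsample %s --verbose\n' \
    "$fastqfolder" "$nreads"
)

# Print the command for inspection. To run it, load the modules first:
#   module load umgc
#   module load gopher-pipelines
#   eval "$(test_run_cmd /path/to/fastqs 10000)"
test_run_cmd /path/to/fastqs 10000
```

Once the subsampled run finishes cleanly, the same command without --subsample (or with --subsample 0) processes all reads.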
The results of the analysis are located in /panfs/roc/scratch/USERNAME-pipelines/shotgun-RUNNAME/output. Download the entire output folder to your local computer and open the HTML file to see a summary of the analysis results.
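The output location follows a fixed pattern, so it can be built from your username and run name. The sketch below shows one way to do that and, in a comment, one way to download the folder with rsync. `output_folder` is a hypothetical helper, `myrun` is a placeholder run name, and `LOGIN_HOST` stands in for a login hostname that this page does not specify.

```shell
# Sketch: build the results path from the pattern given above.
# output_folder is a hypothetical helper.
output_folder() (
  username=$1
  runname=$2
  printf '/panfs/roc/scratch/%s-pipelines/shotgun-%s/output\n' \
    "$username" "$runname"
)

# Print the folder for a run named "myrun" (placeholder run name).
output_folder "${USER:-USERNAME}" myrun

# From your local computer, the whole folder could then be copied with
# rsync (LOGIN_HOST is a placeholder, not a real hostname):
#   rsync -av "USERNAME@LOGIN_HOST:/panfs/roc/scratch/USERNAME-pipelines/shotgun-myrun/output" ./
```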
Support
Issues with the pipeline can be reported in the Bitbucket issue tracker. You may also contact John Garbe directly at jgarbe@umn.edu; response times may vary.