shotgun-pipeline

The shotgun pipeline processes shotgun metagenomics data and generates an analysis report.

The Pipeline

The pipeline performs the following steps:

  1. Subsampling (optional): Each fastq file is reduced to a specified number of reads to shorten processing time.
  2. FastQC (optional): FastQC is run on each fastq file to generate sequence quality plots.
  3. Quality Trimming: Quality trimming is performed using Shi7.
  4. Alignment: Reads are aligned to the Web of Life (WoL) reference metagenomic database using BWA. A biom table is generated using the WoL gOTU_from_maps.py script.
  5. Beta Diversity: Beta diversity is estimated using Qiime2.
  6. Alpha Diversity: Alpha diversity is estimated using Qiime2.
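
As a rough illustration of step 1, the sketch below keeps the first N reads of an uncompressed fastq file. It assumes the standard four lines per fastq record. The function name subsample_fastq is hypothetical, and the pipeline's own subsampling method is not documented here and may differ (for example, it may sample reads at random).

```shell
# Sketch only: keep the first N reads of an uncompressed fastq file.
# Assumes 4 lines per record; not necessarily how the pipeline subsamples.
subsample_fastq() {
    in="$1"    # input fastq file
    n="$2"     # number of reads to keep; 0 = keep everything
    out="$3"   # output fastq file
    if [ "$n" -eq 0 ]; then
        cp "$in" "$out"                      # mirrors --subsample 0 (no subsampling)
    else
        head -n "$((n * 4))" "$in" > "$out"  # 4 lines per read
    fi
}
```

Taking the first N reads is the simplest choice; it is fast but can be biased if read quality varies along the file.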

Options

Advanced options

--refdb database
 Path to shogun database
--flag samplelist
 A comma-delimited list of sample names to flag in the report (not tested)

Standard gopher-pipeline options

--fastqfolder folder
 A folder containing fastq files to process
--subsample integer
 Subsample the specified number of reads from each sample. 0 = no subsampling (default = 0)
--samplesheet file
 A samplesheet
--runname string
 Name of the sequencing run
--projectname string
 Name of the experiment (UMGC Project name)
--illuminasamplesheet file
 An illumina samplesheet, from which extra sample information can be obtained
--nofastqc
 Don't run FastQC
--samplespernode integer
 Number of samples to process simultaneously on each node (default = 1)
--threadspersample integer
 Number of threads used by each sample
--scratchfolder folder
 A temporary/scratch folder
--outputfolder folder
 A folder to deposit final results
--extraoptionsfile file
 File with extra options for trimmomatic, tophat, cuffquant, or featurecounts
--resume
 Continue where a failed/interrupted run left off
--verbose
 Print more information while running
--help
 Print usage instructions and exit

Fastq file support: Both paired-end and single-end reads are supported. Fastq files may be gzip-compressed (.gz) or uncompressed.
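
When choosing a value for --subsample, it can help to know how many reads each file contains. The helper below is a minimal sketch that handles both gzip-compressed and uncompressed files; it assumes four lines per fastq record and that compressed files end in .gz. The function name count_reads is hypothetical and is not part of the pipeline.

```shell
# Sketch only: count reads in a fastq file, gzip-compressed or plain.
# Assumes 4 lines per record and a .gz suffix on compressed files.
count_reads() {
    case "$1" in
        *.gz) lines=$(gzip -dc "$1" | wc -l) ;;  # decompress to stdout, count lines
        *)    lines=$(wc -l < "$1") ;;           # plain text, count lines directly
    esac
    echo "$((lines / 4))"
}
```

For example, count_reads /path/to/fastqs/sample_R1.fastq.gz (a hypothetical file name) prints the number of reads in that file.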

Running the pipeline

Before submitting a job to process your entire dataset, run the pipeline interactively with the --subsample option to confirm that it works correctly on a small sample of your data. This lets you identify and solve problems quickly.

module load umgc
module load gopher-pipelines
shotgun-pipeline --fastqfolder /path/to/fastqs
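
A small test run combines --fastqfolder with the --subsample option listed under Options. The sketch below only prints such a command so it can be reviewed before running. The function name test_run_command and the default of 10000 reads are assumptions chosen for illustration, not recommended values, and the folder path is assumed to contain no spaces.

```shell
# Sketch only: print (do not run) a small test invocation of the pipeline.
test_run_command() {
    fastq_dir="$1"        # folder containing fastq files
    reads="${2:-10000}"   # assumed default; pick a value suited to your data
    printf 'shotgun-pipeline --fastqfolder %s --subsample %s\n' "$fastq_dir" "$reads"
}

test_run_command /path/to/fastqs
```

After loading the umgc and gopher-pipelines modules, the printed command can be pasted into an interactive session.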

The results of the analysis are written to /panfs/roc/scratch/USERNAME-pipelines/shotgun-RUNNAME/output. Download the entire output folder to your local computer and open the HTML file to see a summary of the analysis results.
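
The output location follows a fixed pattern, so it can be built from two values. The sketch below assumes USERNAME is your login name and RUNNAME is the value passed to --runname; both readings are assumptions, and the function name output_dir is hypothetical.

```shell
# Sketch only: build the output folder path from a user name and run name.
# Assumes USERNAME = login name and RUNNAME = the --runname value.
output_dir() {
    printf '/panfs/roc/scratch/%s-pipelines/shotgun-%s/output\n' "$1" "$2"
}
```

For example, ls "$(output_dir "$USER" myrun)" would list the results of a run named myrun, where myrun is a placeholder.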

Support

Issues with the pipeline can be reported in the Bitbucket issue tracker. You may also contact John Garbe directly at jgarbe@umn.edu; response times may vary.
