A workflow for metagenomic projects

This is a Snakemake workflow that processes paired-end and/or single-end metagenomic samples.

Potential analyses include:

  • read-based taxonomic classification
  • assembly
  • functional and taxonomic annotation of coding sequences
  • genome binning of assembled contigs


Clone the repository

Clone the latest version of this repository into your current directory:

git clone

Install the required software

All the software needed to run this workflow is included as a conda environment file. To create the sm-meta environment, use the supplied environment.yaml file found in the lts_workflows_sm_metagenomics/envs/ folder:

conda env create -f lts_workflows_sm_metagenomics/envs/environment.yaml

Optional: To install the software environment inside the workflow directory (instead of in your home directory) you can run:

mkdir envs/sm-meta
conda env create -p envs/sm-meta -f envs/environment.yaml

This creates the sm-meta environment inside the envs/ directory and installs the software there.

Next, add this directory to the envs_dirs setting in your conda config. This lets you activate the environment by name and keeps the full installation path out of your bash prompt:

conda config --add envs_dirs <full_path_to_repository>/envs/
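
As an alternative to typing the full path by hand, the following is a minimal sketch that builds it from the current directory. It assumes you run it from the top level of the cloned repository; the variable names repo_dir and envs_path are illustrative and not part of the workflow.

```shell
# Assumption: the current directory is the top level of the cloned repository.
repo_dir="$(pwd -P)"
envs_path="${repo_dir}/envs/"
echo "envs_dirs entry: ${envs_path}"

# Only touch the conda configuration if conda is available on this machine.
if command -v conda >/dev/null 2>&1; then
    conda config --add envs_dirs "${envs_path}"
    # Print the resulting list so you can confirm the entry was added.
    conda config --show envs_dirs
fi
```

If the path appears in the printed envs_dirs list, conda should be able to find the environment by name.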

Activate the environment using:

conda activate sm-meta
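
To confirm that the activation worked, a quick check along these lines may help. It is only a sketch: it relies on the CONDA_DEFAULT_ENV variable that conda sets on activation, and it assumes snakemake is among the packages installed by environment.yaml. The variable names active_env and status are illustrative.

```shell
# CONDA_DEFAULT_ENV is set by `conda activate`; it is unset when no environment is active.
active_env="${CONDA_DEFAULT_ENV:-}"
if [ "${active_env}" = "sm-meta" ]; then
    status="active"
else
    status="inactive"
fi
echo "sm-meta environment: ${status}"

# Assumption: snakemake is provided by the environment. Print its version if it is on PATH.
if command -v snakemake >/dev/null 2>&1; then
    snakemake --version
fi
```

If the environment was installed with -p and its parent directory is not listed in envs_dirs, CONDA_DEFAULT_ENV holds the full path rather than the name, so this check would report "inactive" even though the environment is in use.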

(Optional) Install mapdamage

If you plan to use mapdamage to identify 'ancient' sequences in your data, you should also install the mapdamage conda environment:

conda env create -f lts_workflows_sm_metagenomics/envs/mapdamage.yaml


See the documentation for instructions on how to run the pipeline.