metaBIT
An integrative and automated metagenomic pipeline for analysing microbial profiles from high-throughput sequencing shotgun data
Overview
The metaBIT pipeline provides tools for visualising microbial profiles (barplots, heatmaps) and performing a range of statistical analyses (diversity indices, hierarchical clustering and principal coordinates analysis). It takes as input fastq files containing trimmed reads from high-throughput shotgun sequencing (see the flowchart of metaBIT).
Citation
Louvel, G., Der Sarkissian, C., Hanghøj, K. and Orlando, L. (2016), metaBIT, an integrative and automated metagenomic pipeline for analysing microbial profiles from high-throughput sequencing shotgun data. Molecular Ecology Resources. doi: 10.1111/1755-0998.12546
What It Does
metaBIT is a metagenomic computational pipeline which identifies microbial taxa and their relative abundances from shotgun high-throughput DNA sequencing data using the program MetaPhlAn (Segata et al 2012).
With metaBIT, the user can visualise the resulting profiles through heatmaps and barplots, and compute summary statistics characteristic of each profile (e.g., diversity indices). The metaBIT pipeline supports comparison between several microbial profiles by computing inter-profile distances, and performing, e.g., hierarchical clustering, Principal Coordinates Analysis, and biomarker identification using LEfSe (Linear Discriminant Analysis Effect Size; Segata et al 2011).
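To make the terms "diversity index" and "inter-profile distance" concrete, here is a minimal sketch in Python. It is not metaBIT's own implementation. The function names, the example taxa and their abundances are hypothetical, and the choice of the Shannon index and the Bray-Curtis dissimilarity is an assumption made for illustration only.

```python
import math


def shannon_index(abundances):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over taxa with non-zero abundance."""
    total = float(sum(abundances))
    if total <= 0:
        raise ValueError("the profile contains no non-zero abundance")
    return -sum((a / total) * math.log(a / total) for a in abundances if a > 0)


def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity between two {taxon: abundance} profiles."""
    taxa = set(profile_a) | set(profile_b)
    # Taxa absent from one profile count as abundance 0 in that profile.
    diff = sum(abs(profile_a.get(t, 0.0) - profile_b.get(t, 0.0)) for t in taxa)
    total = float(sum(profile_a.values()) + sum(profile_b.values()))
    if total <= 0:
        raise ValueError("both profiles are empty")
    return diff / total


# Hypothetical relative abundances (in percent), not real data.
sample_1 = {"Taxon_A": 50.0, "Taxon_B": 50.0}
sample_2 = {"Taxon_A": 50.0, "Taxon_C": 50.0}

print(shannon_index(sample_1.values()))
print(bray_curtis(sample_1, sample_2))  # 0.5
```

A matrix of such pairwise distances is the kind of input that hierarchical clustering and Principal Coordinates Analysis operate on.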
What it does not do
The metaBIT pipeline does not remove adapters from shotgun sequencing reads. DNA reads must be provided as fastq files after adapter trimming, using for example AdapterRemoval (Lindgreen 2012), as implemented in PALEOMIX (Schubert et al. 2014).
Installation
metaBIT requirements
- Python 2.7.3+
- R v3+
- Bowtie 2 v2.1.0+ (Langmead and Salzberg 2012)
- Picard Tools
- SAMTools v1.1.0+ (Li et al 2009)
- MetaPhlAn 1 or 2:
  - MetaPhlAn (Segata et al 2012) v1.7.8
  - MetaPhlAn2 v2.0
- Python modules:
  - pysam v0.8+
  - numpy
  - rpy2 v2.2.4+
  - matplotlib v1.0+
- R packages:
  - optparse
  - ggplot2
  - reshape2
  - ape
  - vegan
  - survival
  - mvtnorm
  - modeltools
  - coin
  - MASS
Installing R and python dependencies for metaBIT
R script (REF) to install the required R dependencies that are not already installed:
#!R
# Install any package in pkg that is not installed yet, then load them all.
ipak <- function(pkg){
    new.pkg <- pkg[!(pkg %in% installed.packages()[, "Package"])]
    if (length(new.pkg))
        install.packages(new.pkg)
    # Returns TRUE/FALSE for each package, depending on whether it loaded.
    sapply(pkg, require, character.only = TRUE)
}
packages <- c('optparse', 'ggplot2', 'reshape2', 'ape', 'vegan', 'survival', 'mvtnorm', 'modeltools', 'coin', 'MASS')
ipak(packages)
Bash command to install the required Python (2.7) dependencies for metaBIT:
#!bash
$ pip install pysam numpy matplotlib rpy2 --user
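After running pip, it can be useful to confirm that the four modules are importable by the interpreter that will run metaBIT. The sketch below is one possible way to do this and is not part of metaBIT. The helper name `missing_modules` is hypothetical, and the check only tests that each module imports, not that it meets the minimum version listed above.

```python
import importlib

# Module names taken from the requirements list above.
REQUIRED_MODULES = ("pysam", "numpy", "rpy2", "matplotlib")


def missing_modules(names=REQUIRED_MODULES):
    """Return the names of the modules that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing Python modules: " + ", ".join(missing))
    else:
        print("All required Python modules were found.")
```

Run the script with the same Python executable that will run metaBIT, since modules installed with `--user` are specific to one interpreter.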
metaBIT pipeline installation
- Install all required dependencies as listed above.
- Download metaBIT. You can use the command line below:
  $ git clone https://bitbucket.org/Glouvel/metabit.git
- You can optionally create a symlink to add metaBIT to your path (check echo $PATH). For example, if ~/bin is in your path:
  $ cd ~/bin
  $ ln -s -T /path/to/metabit/metaBIT.py metaBIT
Configuring and testing the pipeline
In order to use MetaPhlAn, Picard Tools and LEfSe, their paths need to be provided as an option to the metaBIT command line:
$ metaBIT --metaphlan-path /path/to/metaphlan --lefse-path /path/to/lefse --jar-root /path/to/picard makefile.yaml
Alternatively, they can be saved once and for all in a configuration file written to ~/.pypeline/metabit.ini using the --write-config option:
$ metaBIT --metaphlan-path /path/to/metaphlan --lefse-path /path/to/lefse --jar-root /path/to/picard --write-config
In this latter case, the configuration file will be automatically parsed when metaBIT is executed, as shown below.
$ metaBIT makefile.yaml
Please note that the configuration file can also store other useful metaBIT options (see help menu for the option list).
In particular, if you are using a personal computer with little RAM, you should set the option --jre-option=-Xmx2g (choose an appropriate value for -Xmx) to reduce the amount of memory used by Picard Tools MarkDuplicates.
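When metaBIT is launched from a script, the options described above can be assembled programmatically. The following is a minimal sketch, not part of metaBIT: the function `metabit_command` and its parameter names are hypothetical, and only the command-line flags themselves (`--metaphlan-path`, `--lefse-path`, `--jar-root`, `--jre-option`, `--write-config`) come from this page.

```python
def metabit_command(makefile=None, metaphlan_path=None, lefse_path=None,
                    jar_root=None, jre_max_heap=None, write_config=False,
                    executable="metaBIT"):
    """Build the argument list for a metaBIT call (hypothetical helper)."""
    cmd = [executable]
    if metaphlan_path:
        cmd += ["--metaphlan-path", metaphlan_path]
    if lefse_path:
        cmd += ["--lefse-path", lefse_path]
    if jar_root:
        cmd += ["--jar-root", jar_root]
    if jre_max_heap:
        # For example "2g" gives --jre-option=-Xmx2g
        cmd.append("--jre-option=-Xmx" + jre_max_heap)
    if write_config:
        cmd.append("--write-config")
    if makefile:
        cmd.append(makefile)
    return cmd


# Save the tool paths and a 2 GB Java heap limit to the configuration file.
print(" ".join(metabit_command(metaphlan_path="/path/to/metaphlan",
                               lefse_path="/path/to/lefse",
                               jar_root="/path/to/picard",
                               jre_max_heap="2g",
                               write_config=True)))
```

The returned list can be passed directly to `subprocess.call`, which avoids shell quoting problems when paths contain spaces.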
The paths to the programs Bowtie2, samtools and "ktImportText" (from Krona-tools) should be added to the user's PATH (e.g. export PATH=$PATH:/path/to/KronaTools-2.5/bin).
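A quick way to verify the PATH requirement is to search the PATH directories for each executable. The sketch below is an illustration only and is not part of metaBIT. The helper names are hypothetical, and the executable names `bowtie2` and `samtools` are assumed to be the standard command names of these tools.

```python
import os

# ktImportText is named on this page; bowtie2 and samtools are the
# assumed executable names for Bowtie2 and SAMTools.
REQUIRED_TOOLS = ("bowtie2", "samtools", "ktImportText")


def find_on_path(program, path=None):
    """Return the full path of an executable found in PATH, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_tools(tools=REQUIRED_TOOLS, path=None):
    """Return the tools for which no executable was found in PATH."""
    return [tool for tool in tools if find_on_path(tool, path) is None]


if __name__ == "__main__":
    for tool in REQUIRED_TOOLS:
        print("{0}: {1}".format(tool, find_on_path(tool) or "NOT FOUND"))
```

The search is written by hand rather than with `shutil.which` so that it also runs under Python 2.7, the version required by metaBIT.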
Get help
$ metaBIT -h
Makefile in YAML-format
metaBIT requires one positional argument: a makefile in YAML format. See the makefile documentation for a detailed description of the makefile parsed by metaBIT.
Test metaBIT with companion fast example
Your installation can be tested by running the example with the data provided in the folder example, assuming all required paths are saved in the config file as shown above:
$ cd example/
$ metaBIT fastexample.yaml
This should report no errors. Results will be saved in the current working directory, in a folder named "out_yourmakefile", unless another --destination has been specified.
Go to Tutorial for a thorough walk-through of metaBIT.
Results generated by metaBIT
Assuming the results have been saved in the directory "out_makefile", you will find the following folders:
- one folder for each sample, containing one folder per library.
- each library folder contains intermediate files from the pipeline processing (example for single-end):
  - reads.singles.bowtie2out.sam, reads.singles.bowtie2out.stats: output of Bowtie2.
  - reads.singles.bowtie2out.sorted.bam: the sorted output of Bowtie2.
  - reads.singles.bowtie2out.sorted.no_dup.bam: the sorted output of Bowtie2, devoid of PCR duplicates.
  - taxa.tsv: output from MetaPhlAn.
- in the main folder, a file named all_taxa.tsv is the merger of all the MetaPhlAn outputs (i.e. the individual taxa.tsv files provided in each library folder).
- a folder named krona containing every Krona input file and its corresponding results in an html file.
- a folder named lefse containing results from LEfSe and all intermediate files.
- a folder named statax containing all statistical outputs (clustering, PCoA...) and plots (barplot, heatmap).
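The layout described above can be checked with a short script once a run has finished. This is a minimal sketch based only on the folder and file names listed on this page, and it is not part of metaBIT. The helper names `library_profiles` and `missing_top_level` are hypothetical.

```python
import os

# Top-level folders named on this page; any other folder is treated as a sample.
TOP_LEVEL_DIRS = ("krona", "lefse", "statax")


def library_profiles(out_dir):
    """Return (sample, library, path) for every <sample>/<library>/taxa.tsv."""
    found = []
    for sample in sorted(os.listdir(out_dir)):
        sample_dir = os.path.join(out_dir, sample)
        if sample in TOP_LEVEL_DIRS or not os.path.isdir(sample_dir):
            continue
        for library in sorted(os.listdir(sample_dir)):
            profile = os.path.join(sample_dir, library, "taxa.tsv")
            if os.path.isfile(profile):
                found.append((sample, library, profile))
    return found


def missing_top_level(out_dir):
    """Return the expected top-level folders and files that are absent."""
    missing = [name for name in TOP_LEVEL_DIRS
               if not os.path.isdir(os.path.join(out_dir, name))]
    if not os.path.isfile(os.path.join(out_dir, "all_taxa.tsv")):
        missing.append("all_taxa.tsv")
    return missing


if __name__ == "__main__":
    out_dir = "out_makefile"  # adjust to your own results directory
    if os.path.isdir(out_dir):
        for sample, library, profile in library_profiles(out_dir):
            print("{0}/{1}: {2}".format(sample, library, profile))
        print("Missing: {0}".format(missing_top_level(out_dir) or "nothing"))
```

A library folder without a taxa.tsv file is simply omitted from the listing, which makes incomplete runs easy to spot.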
Virtual machine image
Virtual machine image (.ova.zip) for the metaBIT pipeline, including all required dependencies. Link to the most recent version (2. May 2016).
To run the virtual machine image file, VirtualBox and the VirtualBox Extension Pack must be installed (https://www.virtualbox.org/wiki/Downloads).