microbiome - functional annotation of sequencing reads
A super-fast (< 20 min per 10 GB of reads) and accurate (> 90% precision) method for annotation of molecular functionality encoded in sequencing read data without the need for assembly or gene finding.
Web Service: http://services.bromberglab.org/mifaser/
mi-faser runs on LINUX, MacOSX and WINDOWS systems.
- Python >= 3.6
- DIAMOND >= 0.8.8 (included; sources: https://github.com/bbuchfink/diamond)
- WINDOWS: Visual C++ Redistributable *
Note: mi-faser was developed and optimized using DIAMOND v0.8.8, which is included in all releases up to v1.11.4. This is also the version used in the accompanying publication. All newer releases of mi-faser use the latest stable release of DIAMOND. Running the first release (v1.2) with an updated version of DIAMOND (v0.9.13) changed mi-faser results by less than 0.1% (based on results for the artificial metagenome supplied as the example dataset). According to the DIAMOND authors, more recent versions offer substantial improvements in speed and memory usage, as well as bug fixes. We therefore strongly recommend always using the latest version of DIAMOND (see Section: DIAMOND upgrade). This might alter mi-faser results slightly; however, we expect such changes to add new correct annotations rather than introduce mis-annotations.
<!-- To process fastq(fq) input files the SeqIO module of the biopython package (http://biopython.org) has to be available in your python environment. -->
We recommend downloading and compiling DIAMOND locally (https://github.com/bbuchfink/diamond), since a locally compiled binary can use CPU-specific instructions and may perform significantly better. This repository nevertheless includes a pre-compiled version of DIAMOND that can be used as-is.
Standalone VS Web Service
The Standalone version of mi-faser partitions the user input into subsets in the same way as the Web Service (http://services.bromberglab.org/mifaser/). However, these partitions are processed sequentially rather than in parallel as in the Web Service. The Standalone version is therefore recommended only for smaller jobs and is mainly intended to make the mi-faser code base available.
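The partition-then-process-sequentially idea can be illustrated with a minimal sketch. This is not mi-faser's own implementation; the function names `read_fasta` and `partition` are hypothetical, the input is assumed to be plain FASTA, and a chunk size of 0 is treated as "no split" to mirror the `--split` option.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    parts: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(parts)
            header, parts = line[1:], []
        elif header is not None:
            parts.append(line)
    if header is not None:
        yield header, "".join(parts)


def partition(records: Iterable[Record], size: int) -> Iterator[List[Record]]:
    """Group records into batches of at most `size`; size <= 0 means no split."""
    iterator = iter(records)
    if size <= 0:
        batch = list(iterator)
        if batch:
            yield batch
        return
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Tiny in-memory example (hypothetical reads, not the demo dataset).
demo = """>r1
ACGT
>r2
GGCC
TTAA
>r3
ATAT
""".splitlines()

# Batches are handled one after another, as in the Standalone version.
for index, batch in enumerate(partition(read_fasta(demo), 2)):
    print(index, [name for name, _ in batch])
```

Because both functions are generators, only one batch is held in memory at a time, which is the main reason to split large read files before annotating them.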
Open a terminal and check out the mi-faser repository:
git clone https://email@example.com/bromberglab/mifaser.git
or download the zipped version:
curl --remote-name https://bitbucket.org/bromberglab/mifaser/get/master.zip
unzip master.zip
Navigate to the mi-faser base directory and run mi-faser (Single or 2-Lane mode):
Single: input-file containing DNA reads:
$ python mifaser.py -f/--inputfile <INPUT_FILE>
2-Lane: R1/R2 files in the 2-Lane mode:
$ python mifaser.py -l/--lanes <R1_FILE> <R2_FILE>
usage: mi-faser, microbiome - functional annotation of sequencing reads
       [-h] [-f INPUTFILE] [-l R1 R2] [-o OUTPUTFOLDER] [-d DATABASEFOLDER]
       [-i DIAMONDFOLDER] [-s SPLIT] [-t THREADS] [-c CPU] [-p] [-q]

optional arguments:
  -h, --help            show this help message and exit
  -f INPUTFILE, --inputfile INPUTFILE
                        input DNA reads file
  -l R1 R2, --lanes R1 R2
                        2-Lane format (R1/R2)
  -o OUTPUTFOLDER, --outputfolder OUTPUTFOLDER
                        path to base output folder; default: INPUTFILE_out
  -d DATABASEFOLDER, --databasefolder DATABASEFOLDER
                        path to folder containing database files
  -i DIAMONDFOLDER, --diamondfolder DIAMONDFOLDER
                        path to folder containing diamond binary
  -s SPLIT, --split SPLIT
                        split by X sequences; default: 100k; 0 forces no split
  -t THREADS, --threads THREADS
                        number of threads; default: 1
  -c CPU, --cpu CPU     max cpus per thread; default: all available
  -p, --preserve        if flag is set intermediate results are kept
  -q, --quiet           if flag is set console output is logged to file
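When mi-faser is called from another script, it can help to assemble the command line programmatically. The following is a minimal sketch using only flags listed in the usage text above; the helper `build_mifaser_command` and its parameter names are hypothetical, and it assumes that `-f` and `-l` are alternatives (Single vs. 2-Lane mode).

```python
import shlex
from typing import List, Optional, Sequence


def build_mifaser_command(
    inputfile: Optional[str] = None,
    lanes: Optional[Sequence[str]] = None,
    outputfolder: Optional[str] = None,
    split: Optional[int] = None,
    threads: Optional[int] = None,
    preserve: bool = False,
    quiet: bool = False,
    python: str = "python",
    script: str = "mifaser.py",
) -> List[str]:
    """Return an argument list for mifaser.py (hypothetical helper)."""
    # Assumption: exactly one of Single mode (-f) or 2-Lane mode (-l) is used.
    if (inputfile is None) == (lanes is None):
        raise ValueError("provide exactly one of inputfile or lanes")
    command = [python, script]
    if inputfile is not None:
        command += ["-f", inputfile]
    else:
        if len(lanes) != 2:
            raise ValueError("lanes must contain exactly two files (R1, R2)")
        command += ["-l", lanes[0], lanes[1]]
    if outputfolder is not None:
        command += ["-o", outputfolder]
    if split is not None:
        command += ["-s", str(split)]
    if threads is not None:
        command += ["-t", str(threads)]
    if preserve:
        command.append("-p")
    if quiet:
        command.append("-q")
    return command


# Show the command for the bundled demo dataset without running it.
demo_command = build_mifaser_command(
    inputfile="files/test/artificial_mg.fasta",
    outputfolder="files/test/out",
)
print(" ".join(shlex.quote(part) for part in demo_command))
```

The returned list can be passed to `subprocess.run(command, check=True)` from the mi-faser base directory; building a list rather than a single string avoids shell quoting problems with file names that contain spaces.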
A demo dataset containing 10k reads is provided to verify a local mi-faser installation. Navigate to the mifaser base directory and run mi-faser with the following arguments:
$ python mifaser.py -f files/test/artificial_mg.fasta -o files/test/out
The resulting analysis will be located relative to the mifaser base directory at: files/test/out/.
As DIAMOND (https://github.com/bbuchfink/diamond) is still under development, we provide an easy way to upgrade (or downgrade) to another version. If a specific version of DIAMOND is given as a parameter, that version is downloaded and installed automatically. If no version is supplied, the latest release is used.
$ mifaser/diamond/update.sh [<DIAMOND_VERSION>]
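After an upgrade or downgrade, it may be useful to confirm that the installed DIAMOND still satisfies the documented minimum (>= 0.8.8). The sketch below is an illustration only: the helper names are hypothetical, the path to the binary must be supplied by the caller, and it assumes that the binary reports its version when invoked as `diamond version`.

```python
import re
import subprocess
from typing import Tuple

# Minimum DIAMOND version listed in the requirements above.
MINIMUM_VERSION = (0, 8, 8)


def parse_diamond_version(text: str) -> Tuple[int, ...]:
    """Extract the first dotted version number from DIAMOND's version output."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError("no version number found in: %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(version: Tuple[int, ...],
                  minimum: Tuple[int, ...] = MINIMUM_VERSION) -> bool:
    """Compare version tuples element by element."""
    return version >= minimum


def installed_diamond_version(binary: str) -> Tuple[int, ...]:
    """Run `<binary> version` and parse the result.

    Assumption: the binary prints its version for the `version` command.
    """
    result = subprocess.run(
        [binary, "version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        check=True,
    )
    return parse_diamond_version(result.stdout)


# Parsing can be checked without a DIAMOND binary present.
print(meets_minimum(parse_diamond_version("diamond version 0.9.13")))
```

To check a real installation, call `installed_diamond_version` with the path to the DIAMOND binary in your setup and pass the result to `meets_minimum`.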
This project is licensed under NPOSL-3.0.
If you use mi-faser in published research, please cite:
Zhu, C., Miller, M., Marpaka, S., Vaysberg, P., Rühlemann, M. C., Wu, G. H. F.-A., . . . Bromberg, Y. (2017). Functional sequencing read annotation for high precision microbiome analysis. Nucleic Acids Res. doi:10.1093/nar/gkx1209
mi-faser is developed by Chengsheng Zhu and Maximilian Miller. Feel free to contact us for support (firstname.lastname@example.org).