microbiome - functional annotation of sequencing reads
A super-fast ( < 20min/10GB of reads ) and accurate ( > 90% precision ) method for annotation of molecular functionality encoded in sequencing read data without the need for assembly or gene finding.
Web Service: http://services.bromberglab.org/mifaser/
mi-faser runs on LINUX, MacOSX and WINDOWS systems.
- Python 3.x
- DIAMOND >= 0.8.8 (included; sources: https://github.com/bbuchfink/diamond)
- WINDOWS: Visual C++ Redistributable *
Note: mi-faser was developed and optimized using DIAMOND v0.8.8, which is included in the current release. According to the DIAMOND authors, more recent versions offer substantial improvements in speed and memory usage as well as bug fixes. We therefore strongly recommend always using the latest version of DIAMOND (see Section: DIAMOND upgrade). This might alter mi-faser results slightly; however, we expect any changes to add new correct annotations rather than introduce mis-annotations.
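To check whether an installed DIAMOND satisfies the minimum version listed above, a small comparison helper can be used. This is a minimal sketch and not part of mi-faser: the function names are hypothetical, and the input is assumed to be a version string such as the one your DIAMOND binary reports.

```python
import re

MIN_DIAMOND = (0, 8, 8)  # minimum version listed in the requirements above


def parse_version(text):
    """Extract the first dotted version number (e.g. '0.8.8') as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(text, minimum=MIN_DIAMOND):
    """Return True if the version found in `text` is at least `minimum`."""
    return parse_version(text) >= minimum


# Assumed example strings; the exact wording printed by DIAMOND may differ.
print(meets_minimum("diamond version 0.8.8"))
print(meets_minimum("diamond version 0.7.12"))
```

Versions are compared as integer tuples rather than as strings, so that 0.10.x correctly ranks above 0.8.x.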
To process FASTQ (.fastq/.fq) input files, the SeqIO module of the Biopython package (http://biopython.org) must be available in your Python environment.
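Before submitting FASTQ input, it can help to confirm that Biopython is importable in the interpreter you plan to use. The check below is a sketch using only the standard library; the helper name is hypothetical and not part of mi-faser.

```python
import importlib.util


def biopython_available():
    """Return True if the top-level 'Bio' package (Biopython) can be located."""
    return importlib.util.find_spec("Bio") is not None


if biopython_available():
    print("Biopython found: FASTQ input can be processed.")
else:
    print("Biopython not found: install it or provide FASTA input instead.")
```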
We recommend downloading and compiling DIAMOND locally (https://github.com/bbuchfink/diamond), because a local build can use CPU-specific instructions and this might have a significant impact on performance. This repository nevertheless includes a pre-compiled version of DIAMOND that can be used as is.
Standalone VS Web Service
The Standalone version of mi-faser partitions the user input into subsets in the same way as the Web Service (http://services.bromberglab.org/mifaser/). However, these partitions are processed sequentially rather than in parallel as in the Web Service. The Standalone version is therefore recommended only for smaller jobs and is mainly intended to make the mi-faser code base available.
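The partitioning idea can be illustrated with a short sketch. This is not the mi-faser implementation; it assumes FASTA input, uses hypothetical function names, and borrows the documented --split semantics (100k sequences by default, 0 disables splitting).

```python
import io
from itertools import islice


def read_fasta(handle):
    """Yield (header, sequence) tuples from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def partition(records, split=100_000):
    """Group records into lists of at most `split` items; split=0 yields one batch."""
    records = iter(records)
    if split <= 0:
        batch = list(records)
        if batch:
            yield batch
        return
    while True:
        batch = list(islice(records, split))
        if not batch:
            return
        yield batch


demo = io.StringIO(">r1\nACGT\n>r2\nGG\nCC\n>r3\nTTAA\n")
for index, batch in enumerate(partition(read_fasta(demo), split=2)):
    # Sequential processing: each batch is handled before the next one is read.
    print(index, [name for name, _ in batch])
```

Because both functions are generators, only one partition is held in memory at a time, which matches the sequential processing described above.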
Open a terminal and clone the mi-faser repository:
git clone https://bitbucket.org/bromberglab/mifaser.git
or download the zipped version:
curl --remote-name https://bitbucket.org/bromberglab/mifaser/get/master.zip
unzip master.zip
Navigate to the mi-faser base directory and run mi-faser (the only required parameter is a valid input file):
$ python mifaser.py -f/--inputfile <INPUT_FILE>
usage: mi-faser, microbiome - functional annotation of sequencing reads
       [-h] [-f INPUTFILE] [-o OUTPUTFOLDER] [-d DATABASEFOLDER]
       [-i DIAMONDFOLDER] [-s SPLIT] [-c CPU]

optional arguments:
  -h, --help            show this help message and exit
  -f INPUTFILE, --inputfile INPUTFILE
                        input DNA reads file (absolute path)
  -o OUTPUTFOLDER, --outputfolder OUTPUTFOLDER
                        path to base output folder; default: INPUTFILE_out
  -d DATABASEFOLDER, --databasefolder DATABASEFOLDER
                        path to folder containing database files
  -i DIAMONDFOLDER, --diamondfolder DIAMONDFOLDER
                        path to folder containing diamond binary
  -s SPLIT, --split SPLIT
                        split by X sequences; default: 100k; 0 forces no split
  -t THREADS, --threads THREADS
                        number of threads; default: 1
  -c CPU, --cpu CPU     max cpus per thread; default: all available
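When mi-faser is launched from another Python script, the documented flags can be assembled programmatically. The helper below is a hypothetical sketch (it is not shipped with mi-faser) and only builds the argument list; the `script` and `python` defaults assume you are in the mi-faser base directory.

```python
def build_command(inputfile, outputfolder=None, split=None, threads=None, cpu=None,
                  script="mifaser.py", python="python"):
    """Return the argument list for a mi-faser run using the flags from the usage text."""
    command = [python, script, "-f", str(inputfile)]
    optional = (("-o", outputfolder), ("-s", split), ("-t", threads), ("-c", cpu))
    for flag, value in optional:
        if value is not None:  # omitted options fall back to mi-faser's defaults
            command += [flag, str(value)]
    return command


print(build_command("files/test/artificial_mg.fasta", outputfolder="files/test/out"))
```

The returned list can be passed to `subprocess.run(command, check=True)`; building a list instead of a single string avoids shell quoting problems with paths that contain spaces.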
A demo dataset containing 10k reads is provided to verify a local mi-faser installation. Navigate to the mifaser base directory and run mi-faser with the following arguments:
$ python mifaser.py -f files/test/artificial_mg.fasta -o files/test/out
The resulting analysis will be located relative to the mifaser base directory at: files/test/out/.
As DIAMOND (https://github.com/bbuchfink/diamond) is still under active development, we provide an easy way to upgrade (or downgrade) to another version. If a specific DIAMOND version is given as a parameter, that version is downloaded and installed automatically. If no version is supplied, the latest release is used.
$ mifaser/diamond/update.sh [<DIAMOND_VERSION>]
This project is licensed under NPOSL-3.0.
If you use mi-faser in published research, please cite Zhu et al., "Functional sequencing read annotation for high precision microbiome analysis", submitted (2017)
mi-faser is developed by Chengsheng Zhu and Maximilian Miller. Feel free to contact us for support (email@example.com).