SVEngine: Allele Specific and Haplotype Aware Structural Variants Simulator
SVEngine (Structural Variants Engine)
- SVEngine is a multi-purpose and self-contained simulator for whole genome scale spike-in of thousands of SV events of various types in both single-sample and matched sample scenarios.
- SVEngine takes as input reference contigs in FASTA files, variant meta distribution as specified in META files (see Manual) or specific variant information as specified in VAR files (see Manual) and NEWICK files for specifying clonal phylogenetic trees in cancer.
- SVEngine outputs altered contigs in FASTA files, spiked-in variants in VAR files (see Manual), simulated short reads in FASTQ files and aligned short reads in BAM files.
- SVEngine's modeling and pipeline are illustrated in Figures 1 and 2.
Figure 1. The principle and workflow of SVEngine.
Figure 2. The bioinformatics pipeline of SVEngine.
Currently the package works on Linux (tested with Ubuntu) and Mac (tested with Homebrew). It might also work on Windows with Cygwin (not tested). SVEngine documentation is available on the wiki: http://bitbucket.org/charade/svengine/wiki
DOCKER (Platform Independent)
Because of the multiple R and Python dependencies involved, the easiest way to use SVEngine is through the provided Docker image build file. A Dockerfile is provided to build an SVEngine-enabled image from a standard Ubuntu Docker image. If you are not familiar with Docker, it is a container platform widely used in industry and academia. Here is the link to the Docker community:
If you have a Docker server running, you just need to download the Dockerfile from:
into $your_swan_container and run:
    docker build --no-cache $your_swan_container
SVEngine is available on bitbucket: bitbucket.org:charade/svengine.git
Python2 (>=2.7, http://www.python.org/) with packages: pybedtools, pysam, pandas, tables, scipy, dendropy, cython
Samtools>=1.2, Bedtools>=2.0, BWA>=0.7
A script to confirm all dependencies are met is here:
An example dependency install script for Mac with Homebrew is here:
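The dependency list above can also be checked programmatically. The following is a minimal sketch, not the project's own dependency script: the helper names (`find_on_path`, `missing_commands`, `missing_modules`) are hypothetical, and the executable names `samtools`, `bedtools`, `bwa` as well as the import names `Bio` (for biopython) and `Cython` (for cython) are assumptions based on how those packages are usually installed. It only tests for presence and does not compare version numbers.

```python
import importlib
import os

# Assumed executable names for Samtools, Bedtools and BWA.
COMMANDS = ["samtools", "bedtools", "bwa"]
# Assumed import names for the Python packages used during installation.
MODULES = ["pybedtools", "pysam", "pandas", "tables", "scipy",
           "dendropy", "Bio", "Cython"]


def find_on_path(command, path=None):
    """Return the full path of an executable found on PATH, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def missing_commands(commands=COMMANDS, path=None):
    """List the commands that cannot be found on PATH."""
    return [c for c in commands if find_on_path(c, path) is None]


def missing_modules(modules=MODULES):
    """List the modules that the current interpreter cannot import."""
    missing = []
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Missing commands: %s" % (", ".join(missing_commands()) or "none"))
    print("Missing modules: %s" % (", ".join(missing_modules()) or "none"))
```

Run it with the same Python interpreter that will run SVEngine, since module availability differs between the site Python and a virtualenv.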
Installation free! SVEngine can be used without installation as long as all required commands are in $PATH. See Examples and Manuals for more information.
You can install SVEngine as a standard package into your site Python without virtualenv, in particular when all Python dependencies are already met. Before taking this approach, please consider the virtualenv-based installation first. The following guide was written for Ubuntu; Mac Homebrew users can adapt it by removing sudo.
The other option is a virtualenv-based non-root installation, which separates SVEngine from the site Python. The virtualenv command is often already available alongside Python 2.7 or later; see: http://python-guide-pt-br.readthedocs.io/en/latest/dev/virtualenvs/. If it is not present, please see https://virtualenv.pypa.io for details on installing virtualenv for your Python:
    sudo easy_install pip
    sudo pip install virtualenv
Ask your IT manager to help install it for you if you have permission difficulties. When your system Python has virtualenv, make sure your $PYTHONPATH is set to empty. Follow the steps below:
    sudo pip install --upgrade pip
    virtualenv svengine_vpy
Now you can activate this virtual Python for installing the SVEngine Python dependencies:
    # in svengine_vpy>
    source svengine_vpy/bin/activate
    pip install -U scipy
    pip install -U tables
    pip install -U pandas
    pip install -U pysam
    pip install -U pybedtools
    pip install -U dendropy
    pip install -U biopython
    pip install -U cython
Then install SVEngine as a standard Python package:
    # in svengine_vpy>
    git clone https://bitbucket.org/charade/svengine.git
    cd svengine
    git submodule update --init --recursive
    test/test_dep.sh
    python setup.py install
Now the SVEngine executables will be available from "$PWD/svengine_vpy/bin".
Because you installed SVEngine via virtualenv, remember to activate the virtualenv every time you use SVEngine. Also export the environment variable SVENGINE_BIN=$PWD/svengine_vpy/bin and add $SVENGINE_BIN to your $PATH.
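Forgetting to activate the virtualenv or to extend $PATH is an easy mistake, so a quick self-check can help. The following is a minimal sketch under stated assumptions: the helper names `in_virtualenv` and `bin_dir_on_path` are hypothetical and not part of SVEngine, and the virtualenv detection relies on the `sys.real_prefix` and `sys.base_prefix` attributes that virtualenv and venv set on the interpreter.

```python
import os
import sys


def in_virtualenv():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # releases make sys.base_prefix differ from sys.prefix instead.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


def bin_dir_on_path(bin_dir, path=None):
    """Return True if bin_dir is one of the directories listed in PATH."""
    if path is None:
        path = os.environ.get("PATH", "")
    target = os.path.realpath(bin_dir)
    return any(os.path.realpath(d) == target
               for d in path.split(os.pathsep) if d)


if __name__ == "__main__":
    print("virtualenv active: %s" % in_virtualenv())
    svengine_bin = os.environ.get("SVENGINE_BIN")
    if svengine_bin is None:
        print("SVENGINE_BIN is not set")
    else:
        print("SVENGINE_BIN on PATH: %s" % bin_dir_on_path(svengine_bin))
```

Comparing resolved paths with `os.path.realpath` means a symlinked or relative entry in $PATH still counts as a match.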
xwgsim -- an enhanced version of Heng Li's wgsim
mutforge -- spiking in structural variants without FASTA output
tree2var -- updating variants with phylogenetic tree structured frequencies
By default all above executables will be available from $SVENGINE_BIN/ . Use '-h' to read script-wise usage. See Wiki pages for more details.
SVEngine's wiki page (https://bitbucket.org/charade/svengine/wiki) is a growing resource for manuals, FAQs and other information. It is a MUST read before you actually use the SVEngine tool. The documentation is also openly editable in reStructuredText format. You are more than welcome to contribute to this ongoing documentation.
Questions and comments should be addressed to lixia at stanford dot edu.
The SVEngine manuscript is under review. The preprint is available at BioRxiv: https://www.biorxiv.org/content/early/2018/01/12/247536