#About

multiSNV is a tool for calling somatic single-nucleotide variants (SNVs) using NGS data from a normal and multiple tumour samples of the same patient. Instead of performing multiple pairwise analyses of a single tumour sample and its matched normal, multiSNV jointly considers all available samples under a Bayesian framework to increase sensitivity of calling shared SNVs. multiSNV accepts BAM files (one BAM file for each sample) and produces a single VCF file with variant predictions for all samples.

Dependencies

1) Git

2) cmake

3) Boost: download it and compile the libraries (http://www.boost.org/users/download/). Installation instructions may be found at http://www.boost.org/doc/libs/1_57_0/more/getting_started/unix-variants.html (follow Steps 1 and 5.1).

Installation

1) Before you start, set the environment variables. On Linux use LD_LIBRARY_PATH; on Mac OS X use DYLD_LIBRARY_PATH instead. The lines below are for Mac OS X:

export BOOST_ROOT=/PATH/TO/boost
export DYLD_LIBRARY_PATH=$DYLD_LIBRARY_PATH:${BOOST_ROOT}/lib
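
For Linux, the same setup can be sketched with LD_LIBRARY_PATH in place of DYLD_LIBRARY_PATH. As above, /PATH/TO/boost is a placeholder for wherever Boost was installed on your system.

```shell
# Linux variant of the exports above.
# /PATH/TO/boost is a placeholder; replace it with your Boost install prefix.
export BOOST_ROOT=/PATH/TO/boost
# Append the Boost library directory to the dynamic linker search path.
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${BOOST_ROOT}/lib
```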

2) Clone repository to get the multiSNV source code:

git clone --recurse-submodules https://bitbucket.org/joseph07/multisnv.git

3) To install multiSNV, go to the root of the cloned multisnv directory and run the installation script:

./install.sh

This will also install and build bamtools.

4) To confirm installation has been successful:

./multiSNV --help

This should produce a help message!

Running multiSNV

To test multiSNV you can use the sample BAM files and accompanying fasta file.

./multiSNV -N2 --bam dataset/n-small.bam dataset/t-small.bam --fasta dataset/small.fa -f output.vcf 

NOTE: The normal BAM file should always be listed first.
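
With more than one tumour sample, the command might be assembled along the following lines. This is only a sketch: it assumes that -N is the total number of samples (normal plus tumours), as the -N2 in the test command suggests, so check ./multiSNV --help before relying on it. The helper build_multisnv_cmd and all file names are hypothetical, and the helper only prints the command rather than running it.

```shell
# Print (do not run) a multiSNV command for one normal and several tumour BAMs.
# Assumption: -N is the total sample count (normal + tumours), inferred from -N2.
# Usage: build_multisnv_cmd <fasta> <output.vcf> <normal.bam> <tumour.bam>...
build_multisnv_cmd() {
    fasta=$1; out=$2; normal=$3
    shift 3                      # remaining arguments are the tumour BAMs
    n=$(( $# + 1 ))              # tumours plus the normal
    # The normal BAM is placed first, as the note above requires.
    echo "./multiSNV -N$n --bam $normal $* --fasta $fasta -f $out"
}

# Hypothetical file names, for illustration only.
build_multisnv_cmd ref.fa output.vcf normal.bam tumour1.bam tumour2.bam
```

Printing the command first makes it easy to confirm that the normal sample comes first and that the sample count matches the number of BAM files.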

Running ./multiSNV --help will produce a list of available options.

To get a high-confidence set of calls, we suggest retaining sites that are flagged as "PASS" or "LOW_QUAL". "LOW_QUAL" indicates uncertainty about the somatic status in at least one sample (perhaps due to low depth and/or low mutation frequency); it does not mean there is evidence that the variation is artifactual. In our work, we tend to keep both "LOW_QUAL" and "PASS" sites.
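
One way to apply this filter is a short awk pass over the output VCF. The sketch below relies on the standard VCF layout, in which FILTER is the seventh tab-separated column, and assumes the flag appears there as a single value. The function name filter_calls and the file name filtered.vcf are illustrative, not part of multiSNV.

```shell
# Keep VCF header lines plus records whose FILTER column (7th, tab-separated)
# is exactly PASS or LOW_QUAL. Reads from stdin, writes to stdout.
# Assumption: FILTER holds a single value, not a semicolon-separated list.
filter_calls() {
    awk -F'\t' '/^#/ || $7 == "PASS" || $7 == "LOW_QUAL"'
}

# Usage: filter_calls < output.vcf > filtered.vcf
```

If a record can carry several semicolon-separated flags, the exact-match test would need to be relaxed accordingly.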

For further details on the statistical model of multiSNV refer to the manuscript and its Supplementary Material.

License

Copyright (C) 2015 Malvina Josephidou

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

If you use this software in your work, please cite the accompanying publication:

Josephidou M, Lynch AG & Tavaré S. multiSNV: a probabilistic approach for improving detection of somatic point mutations from multiple related tumour samples. Nucleic Acids Research (2015) doi:10.1093/nar/gkv135

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.

#News

15/01/16: multiSNV v2.x has been released! multiSNV 2 achieves higher sensitivity at lower false-positive rates. In particular, you'll observe more robustness to sequencing noise and normal contamination, fewer false-positive LOH events and higher sensitivity to shared SNVs.

This new release comes with changes to some default settings, including the default mutation rate (1e-06), an improved print out format (--print 1), and some changes to model hyperparameters and approximations.

We hope you like using multiSNV 2. Please direct comments and suggestions to mj343 at cam dot ac dot uk.

Note: You'll need to rerun cmake before compiling the new version.

git pull
cmake CMakeLists.txt
make

Technical details of the new statistical model will be released soon. Here is how the two versions compare:

multiSNV_vs_multiSNV2_labeled.png

Fig 1. Performance comparison of multiSNV and multiSNV 2 on the medulloblastoma tumour-normal whole-genome sequencing gold-standard datasets (Alioto TS, et al., 2015) for a range of mutation rates (--mu parameter).

15/09/2015: The new multiSNV version has been released, with BAM file compatibility, more built-in false positive filters and lots of bug fixes!
