ShortFuse
Marcus Kinsella

Dependencies:

ShortFuse requires the NetworkX package for Python 2, available at

http://networkx.lanl.gov/download.html

It has been tested with Python 2.6 and 2.7 and with GCC 4.5.2.

Usage:

First, build by running make in the ShortFuse directory.

Then, obtain the reference files from

https://bitbucket.org/mckinsel/shortfuse/downloads/ShortFuse_reference_files_v0_2.tar

Extracting this archive should produce a ShortFuse_ref/ directory.
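
The download and extraction can also be scripted. The following is a minimal
sketch using only the Python standard library. The function names
fetch_reference and extract_reference and the default destination directory
are illustrative and not part of ShortFuse, and the script is written for
Python 3 rather than the Python 2 that ShortFuse itself targets.

```python
import os
import tarfile
import urllib.request

REFERENCE_URL = (
    "https://bitbucket.org/mckinsel/shortfuse/downloads/"
    "ShortFuse_reference_files_v0_2.tar"
)


def fetch_reference(url=REFERENCE_URL, dest_dir="."):
    """Download the reference tarball into dest_dir and return its path."""
    tar_path = os.path.join(dest_dir, os.path.basename(url))
    urllib.request.urlretrieve(url, tar_path)
    return tar_path


def extract_reference(tar_path, dest_dir="."):
    """Extract the tarball and return the path of the ShortFuse_ref directory."""
    with tarfile.open(tar_path) as archive:
        archive.extractall(dest_dir)
    ref_dir = os.path.join(dest_dir, "ShortFuse_ref")
    # Fail early if the archive did not contain the expected directory.
    if not os.path.isdir(ref_dir):
        raise RuntimeError("expected %s after extraction" % ref_dir)
    return ref_dir
```

Calling extract_reference(fetch_reference()) from the directory where the
reference files should live performs both steps.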


Then, build the Bowtie index files in the ShortFuse_ref/ directory:

$ bowtie-build RefSeqTranscripts_50up_polyA.fasta refseq_transcripts

Then, you need reference files for the genome_plus_transcriptome
filtering step. You can obtain the hg19 genome from

http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/chromFa.tar.gz

You can obtain the transcriptome from

https://bitbucket.org/mckinsel/shortfuse/downloads/UCSC_genes.fa.gz

Then decompress both downloads and run bowtie-build again on the genome and
transcript sequences together, creating a Bowtie index named
transcripts_plus_genome in the ShortFuse_ref/ directory.
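
As a rough sketch of this step: bowtie-build accepts a comma-separated list of
FASTA files as its first argument, so the chromosome files unpacked from
chromFa.tar.gz and the decompressed UCSC_genes.fa can be indexed together. The
helper names below and the assumed layout (chromosome files matching *.fa in a
single genome_dir) are illustrative and not part of ShortFuse.

```python
import glob
import os
import subprocess


def combined_index_command(genome_dir, transcripts_fasta,
                           index_base="transcripts_plus_genome"):
    """Return the bowtie-build argument list for the combined index."""
    # Assumption: every chromosome FASTA sits directly in genome_dir.
    genome_fastas = sorted(glob.glob(os.path.join(genome_dir, "*.fa")))
    if not genome_fastas:
        raise ValueError("no .fa files found in %s" % genome_dir)
    references = ",".join(genome_fastas + [transcripts_fasta])
    return ["bowtie-build", references, index_base]


def build_combined_index(genome_dir, transcripts_fasta,
                         ref_dir="ShortFuse_ref"):
    """Run bowtie-build inside ref_dir so the index files are written there."""
    # Absolute paths keep the inputs valid after changing directory.
    command = combined_index_command(os.path.abspath(genome_dir),
                                     os.path.abspath(transcripts_fasta))
    subprocess.check_call(command, cwd=ref_dir)
```

Separating command construction from execution makes it easy to print and
check the command before starting what is a long-running build.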

Next, edit the shortfuse.conf file in the ShortFuse directory. Several
options must be set there, such as the locations of the ShortFuse and Bowtie
installations and the directory where ShortFuse should write its output
files.

Finally, you can run ShortFuse with

$ python run_pipeline.py --conf <shortfuse.conf> <fastq.1> <fastq.2>

The final results will be in a file called fusion_counts.bedpe in the
working_dir directory specified in your shortfuse.conf file. This file
contains the RefSeq IDs and names of the two genes involved in the fusion.
The score field is the number of read pairs mapped to the fusion junction.
This corresponds to C_i from the paper, and will likely be fractional as it
represents a probabilistic assignment of reads.
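
To inspect the results, something like the following could rank fusions by
score. It assumes that fusion_counts.bedpe follows the standard ten-column,
tab-separated BEDPE layout, with the two sequence names in columns 1 and 4 and
the score in column 8. The exact columns ShortFuse writes are not described
here, so check the file before relying on this. read_fusions is a hypothetical
helper, not part of ShortFuse.

```python
def read_fusions(lines):
    """Parse BEDPE lines into (name1, name2, score) tuples, highest score first.

    Assumes standard BEDPE columns: the two sequence names are in columns
    1 and 4 and the score is in column 8 (1-based).
    """
    fusions = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Scores are parsed as floats because they may be fractional.
        fusions.append((fields[0], fields[3], float(fields[7])))
    return sorted(fusions, key=lambda fusion: fusion[2], reverse=True)
```

The function only iterates over lines, so an open file handle for
fusion_counts.bedpe can be passed to it directly.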