Allogenomics Scoring Tool

Download Annotations

The allogenomics scoring tool uses gene structure annotations to identify variations in coding sequences. Protein coding annotations are provided in the data directory, but if you need to update this information, you can follow these instructions:

1a. Fetch a complete GTF file from Ensembl:

wget ftp://ftp.ensembl.org/pub/release-75/gtf/homo_sapiens/Homo_sapiens.GRCh37.75.gtf.gz

1b. Extract the coding sequence annotations:

gzip -c -d Homo_sapiens.GRCh37.75.gtf.gz | awk '{if ($2=="protein_coding" && $3=="CDS") print $0}' > protein_coding_75.
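If awk is not available, the filter in step 1b can be approximated in Python. The following is a minimal sketch, not part of the tool: the function names `protein_coding_cds` and `filter_gtf` are hypothetical, and it assumes (as the awk command does) that column 2 of the GTF holds the value protein_coding and column 3 holds the feature type. Unlike awk, it splits on tabs only and skips comment lines explicitly.

```python
import gzip
from typing import Iterable, Iterator


def protein_coding_cds(lines: Iterable[str]) -> Iterator[str]:
    """Yield GTF records whose column 2 is protein_coding and column 3 is CDS."""
    for line in lines:
        if line.startswith("#"):
            continue  # header/comment lines carry no annotation record
        fields = line.rstrip("\n").split("\t")
        # GTF columns: 1 seqname, 2 source, 3 feature, ...
        if len(fields) >= 3 and fields[1] == "protein_coding" and fields[2] == "CDS":
            yield line


def filter_gtf(gz_path: str, out_path: str) -> int:
    """Write matching records from a gzipped GTF to out_path; return how many were kept."""
    kept = 0
    with gzip.open(gz_path, "rt") as src, open(out_path, "w") as dst:
        for record in protein_coding_cds(src):
            dst.write(record)
            kept += 1
    return kept
```

Calling `filter_gtf` with the downloaded archive and an output path of your choice should yield the same records as the awk pipeline, provided the first two columns contain no spaces.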

Compiling

You may use the binary program located in bin/allogenomics-scoring-tool.jar, or recompile from sources. To recompile, you will need Maven installed on your computer. Once you are set up, run:

mvn package

When the build is successful, it produces the scoring tool jar file in the target directory. For instance:

target/allogenomics-1.1.7-SNAPSHOT-scoring-tool.jar

Other jar files are also produced; the scoring-tool.jar file is the one that is easier to execute.
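Because the version string in the jar name changes between builds, a script that wraps the tool may prefer to locate the jar rather than hard-code its name. This is a small sketch under the assumption that the jar keeps the allogenomics-VERSION-scoring-tool.jar naming shown above; the helper name `find_scoring_tool_jar` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_scoring_tool_jar(target_dir: str = "target") -> Optional[Path]:
    """Return the most recently modified *-scoring-tool.jar in target_dir, or None."""
    # Assumed naming pattern, taken from the example file name in this README.
    jars = sorted(
        Path(target_dir).glob("allogenomics-*-scoring-tool.jar"),
        key=lambda p: p.stat().st_mtime,
    )
    return jars[-1] if jars else None
```

If the function returns None, the build has probably not been run yet or did not succeed.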

Read the Documentation

java -jar target/allogenomics-1.1.7-SNAPSHOT-scoring-tool.jar --help

This command prints the documentation. Read it to learn how each option affects the output.

Run

The following command line illustrates the arguments of the allogenomics scoring tool:

java -jar target/allogenomics-1.1.7-SNAPSHOT-scoring-tool.jar \
    -i GPOVFMG-stats.vcf.gz \
    --no-dash -a data/protein_coding_75.gtf \
    --phenotype data/combined-d+v.pairs \
    --output-format TSV --only-non-synonymous-coding \
    --clinical --vep \
    -o tm-autosomes-alloscore.tsv \
    --process-max-sites 10000000 \
    --consider-indels \
    -t data/TM_autosomes.tsv

The above command processes the input VCF file, uses annotations from data/protein_coding_75.gtf (matching Ensembl 75), reads pair information from data/combined-d+v.pairs, restricts the output to the pairs that were actually transplanted (as opposed to all possible pairs that can be scored from the input), and filters sites to keep only those on autosomes and in proteins with at least one transmembrane segment.
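When the tool is run repeatedly on different inputs, it can help to assemble the command programmatically. The sketch below simply reproduces the example invocation as an argument list. The function name `build_scoring_command` and its parameters are hypothetical; the flags and data paths are copied from the example above and are not validated against the tool.

```python
from typing import List


def build_scoring_command(jar: str, vcf: str, output: str) -> List[str]:
    """Assemble the example invocation as an argument list (nothing is executed)."""
    return [
        "java", "-jar", jar,
        "-i", vcf,                                 # input VCF file
        "--no-dash",
        "-a", "data/protein_coding_75.gtf",        # annotations matching Ensembl 75
        "--phenotype", "data/combined-d+v.pairs",  # pair information
        "--output-format", "TSV",
        "--only-non-synonymous-coding",
        "--clinical",
        "--vep",
        "-o", output,
        "--process-max-sites", "10000000",
        "--consider-indels",
        "-t", "data/TM_autosomes.tsv",
    ]
```

Passing the returned list to `subprocess.run(..., check=True)` would launch the tool without shell quoting concerns, assuming Java and the jar are available.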

You may download a more recent version of this software from http://allogenomics.campagnelab.org