Tcr Receptor Utilities for Solid Tissue (TRUST) is a computational tool for analyzing TCR sequences from unselected RNA sequencing data profiled from solid tissues, including tumors. TRUST performs de novo assembly of the hypervariable complementarity-determining region 3 (CDR3) and reports contigs containing the CDR3 DNA and amino acid sequences. TRUST then realigns the contigs to IMGT reference gene sequences to report the corresponding variable (V) or joining (J) genes. TRUST supports both single-end and paired-end sequencing data of any read length. Questions or suggestions should be addressed to Bo Li. TRUST is developed by Bo Li and Jian Zhang in the Shirley Liu lab, with all rights reserved.

## Discussion

Please join our Google group (trusttcr) to raise any questions or concerns.

## Installation

Download the latest version of TRUST from “

Unzip the source code and go into the directory using the following commands:

    tar xvzf trust-2.4.1.tar.gz
    cd trust-2.4.1

Invoke the setup script:

    python setup.py install

If you don't have write permission for the standard directory, or you don't want to install TRUST as a standard part of your local Python installation, use one of the following commands instead:

    python setup.py install --user

or:

    python setup.py install --home=<dir>

The dependencies for TRUST are listed in trust-2.4.1/requirements.txt and are installed automatically by the install command above.

## Input files

TRUST takes BAM files as input. Please make sure that each BAM file is paired with its index file, ending with .bam.bai. BAM files can be aligned to either the hg19 or the hg38 human reference genome.
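
As a quick pre-flight check, the pairing requirement above can be sketched in Python. This is a minimal sketch, not part of TRUST: the function name `missing_bam_indexes` is hypothetical, and it assumes the index sits next to the BAM file with `.bai` appended to the full file name.

```python
from pathlib import Path


def missing_bam_indexes(directory):
    """Return names of BAM files in `directory` that lack a `.bam.bai` index.

    Hypothetical helper, not part of TRUST. Assumes the index file lives
    next to the BAM file and is named <name>.bam.bai.
    """
    missing = []
    for bam in sorted(Path(directory).glob("*.bam")):
        index = bam.with_name(bam.name + ".bai")  # sample.bam -> sample.bam.bai
        if not index.is_file():
            missing.append(bam.name)
    return missing
```

An empty list means every BAM file in the directory has an index next to it.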

## Input modes

TRUST supports 3 input modes, to accommodate multiple file inputs.

- `-d` option processes all the BAM files in a given directory
- `-F` option processes all the files listed in a given file list (in a txt file)
- `-f` option processes a single BAM file
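
For the `-F` mode, a file list can be generated rather than typed by hand. The sketch below is illustrative only: the helper name `write_bam_list` is hypothetical, and it assumes the txt file holds one BAM path per line, which this README does not spell out.

```python
from pathlib import Path


def write_bam_list(directory, list_path):
    """Write the path of every BAM file in `directory` to `list_path`.

    Hypothetical helper, not part of TRUST. Assumes the file list expected
    by -F is plain text with one BAM path per line.
    """
    bams = sorted(str(p.resolve()) for p in Path(directory).glob("*.bam"))
    text = "\n".join(bams) + ("\n" if bams else "")
    Path(list_path).write_text(text)
    return bams
```

The resulting txt file could then be passed to `trust` with the `-F` option.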

## General usage

    trust -f YOUR_BAM_FILE.bam -a
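
When TRUST is driven from a script, the command line can be assembled programmatically. The following is a rough sketch: `build_trust_command` is a hypothetical helper, it uses only the flags that appear in this README (`-f`, `-F`, `-d`, `-a`, and `-g` from the Version 2.4.1 note), and it assumes `-a` and `-g` combine with every input mode in the same way as with `-f`.

```python
import shlex


def build_trust_command(mode, target, genome=None):
    """Build a `trust` argument list for one input mode.

    Hypothetical helper, not part of TRUST. `mode` is 'f' (single BAM),
    'F' (file list) or 'd' (directory); `target` is the matching path.
    """
    if mode not in ("f", "F", "d"):
        raise ValueError("mode must be 'f', 'F' or 'd'")
    cmd = ["trust", "-" + mode, target, "-a"]
    if genome is not None:
        cmd += ["-g", genome]  # e.g. "hg38", as in the Version 2.4.1 note
    return cmd


cmd = build_trust_command("f", "YOUR_BAM_FILE.bam", genome="hg38")
print(shlex.join(cmd))  # trust -f YOUR_BAM_FILE.bam -a -g hg38
```

The returned list can be handed to `subprocess.run(cmd, check=True)` once TRUST is installed and on the `PATH`.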

## Result fields of fasta info line

- File name
- 10-digit random ID
- TCR genes and locations (based on mapped reads)
- Estimated clonal frequency: the number of reads used to assemble the contig divided by contig length
- Contig length
- Number of reads in the TCR regions
- TCR gene (based on alignment to IMGT reference genes)
- CDR3 amino acid sequence
- Minus log e-value: E-value for IMGT reference alignment
- CDR3 DNA sequence
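
The field list above can be turned into a small parser for downstream analysis. This is a sketch under stated assumptions, not TRUST's own code: it assumes the fields appear in the order listed and are separated by a single character (`+` is used as the default here, which is an assumption this README does not confirm), and the names `FIELDS`, `parse_info_line` and `clonal_frequency` are hypothetical.

```python
# Field names follow the list above; the snake_case keys are hypothetical.
FIELDS = [
    "file_name",
    "random_id",
    "tcr_genes_and_locations",
    "est_clonal_frequency",
    "contig_length",
    "n_reads_in_tcr_regions",
    "tcr_gene",
    "cdr3_aa",
    "minus_log_evalue",
    "cdr3_dna",
]


def parse_info_line(line, sep="+"):
    """Split one fasta info line into a dict keyed by FIELDS.

    Assumes the fields are in the listed order and separated by `sep`;
    the default separator is an assumption, so pass the real one if it differs.
    """
    values = line.strip().lstrip(">").split(sep)
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def clonal_frequency(n_reads, contig_length):
    """Reads used to assemble the contig divided by the contig length."""
    return n_reads / contig_length
```

For instance, a contig of length 120 assembled from 30 reads gives `clonal_frequency(30, 120)`, that is, 0.25.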

## Version history

Feb 24, 2017 Version 2.4.1 Allow input BAM files aligned to the hg38 human reference genome:

    trust -f YOUR_BAM_FILE.bam -a -g hg38

Feb 20, 2017 Version 2.4.0 Add installation method, make it easy to install and use. Complete a structured package by reorganizing the data and program.

Feb 02, 2017 Version Change FindDisjointCommunities to non-recursive to avoid Python recursion depth limit.

Jan 27, 2017 Version Fix a bug in the read comparison function CompareSuffixByBit. Add a C++ extension for CompareSuffixByBit and CompareSuffixByBitSeq, speeding them up by 20x. Add multithreading in GetReadsOverlapByGene and GetReadsOverlapByGene_SE.

Nov 21, 2016 Version Change GetSeqOverlap and GetSeqOverlap_SE into linear scale

Nov 08, 2016 Add multiple variable gene assignment, which forces TRUST to screen for all V gene assignments (slow).

May 31, 2016 Add single-end component.

Sep 30, 2015