<!-- README.md is generated from README.Rmd. Please edit that file -->
ctlref <img src="man/figures/ctlverse-sticker-01.png" align="right" width="200"/>
The goal of ctlref is to facilitate creating a simpler, tidier version of essential genome annotation references straight from reliable sources, primarily Ensembl and Bioconductor. This package distills annotations to the bare essentials for maximal portability, accessibility, and translation of results.
Detailed instructions are included in the package, accessible via
but briefly, this package parses GTF, sequence report (chromosome
sizes), and Bioconductor transcript databases to create tidy tables of
coordinates and mapped annotations from source references (primarily
Ensembl). The primary outputs that are created by this package, in the
recommended order, are:
- chromosome sizes - from a sequence assembly report, creates a chromosome size table
- transcripts - creates a table of transcript coordinates (chrom, start, end) along with mapped identifiers (Ensembl transcript- and gene-level IDs, Entrez IDs, and symbols)
- genes - from the transcripts table, creates a gene-level combined reference
- tss - from the transcripts table, creates a table of coordinates centered on each transcription start site (TSS)
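
How the genes and tss tables relate to the transcripts table can be illustrated with a small base R sketch. The toy data and the column names (`chrom`, `start`, `end`, `strand`, `transcript_id`, `gene_id`) are assumptions made for illustration, not ctlref's actual schema, and no ctlref functions are used. It only shows the general idea of deriving gene spans and TSS positions from transcript coordinates.

```r
# Toy transcripts table; values and column names are illustrative assumptions.
transcripts <- data.frame(
  chrom         = c("1", "1", "2"),
  start         = c(100L, 150L, 500L),
  end           = c(900L, 1200L, 800L),
  strand        = c("+", "+", "-"),
  transcript_id = c("tx1", "tx2", "tx3"),
  gene_id       = c("geneA", "geneA", "geneB"),
  stringsAsFactors = FALSE
)

# Gene-level table: span from the smallest start to the largest end per gene.
gene_start <- aggregate(start ~ gene_id + chrom + strand, data = transcripts, FUN = min)
gene_end   <- aggregate(end ~ gene_id + chrom + strand, data = transcripts, FUN = max)
genes <- merge(gene_start, gene_end, by = c("gene_id", "chrom", "strand"))

# TSS table: the TSS is the start on the + strand and the end on the - strand.
tss <- transcripts[, c("chrom", "strand", "transcript_id", "gene_id")]
tss$tss <- ifelse(transcripts$strand == "+", transcripts$start, transcripts$end)
```

The strand check matters because a transcript on the minus strand begins at its largest coordinate, so taking `start` for every row would misplace those sites.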
Included Reference Annotations
Within the package are the two most common reference annotations used in
the analysis of biological data from CD8+ T cells. These annotations are
accessible as data objects once the package is loaded as
reference annotations and chromosome sizes for human and mouse,
respectively. The included R functions were used to generate these
references from Ensembl source references.
Creating New Reference Annotations
The package includes functions to create new references, namely
parse_gtf. See the function documentation for
You can install the development version of ctlcon from Bitbucket with:
Downloaded files from the Ensembl website will be required.
Additionally, organism transcript annotations will be required from
Bioconductor. See the
?ctlref package documentation for more details.
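
As a rough illustration of what reading a GTF file involves, the base R sketch below parses two made-up records. It relies only on the general GTF layout (nine tab-separated fields, with `key "value";` pairs in the ninth) and is not the implementation of `parse_gtf`. The helper `get_attr` and the column names are hypothetical.

```r
# Two made-up GTF records (nine tab-separated fields each).
gtf_lines <- c(
  "1\tensembl\ttranscript\t100\t900\t.\t+\t.\tgene_id \"geneA\"; transcript_id \"tx1\";",
  "2\tensembl\ttranscript\t500\t800\t.\t-\t.\tgene_id \"geneB\"; transcript_id \"tx3\";"
)

# Read the fixed columns; quoting is disabled so the attribute field stays intact.
gtf <- read.delim(
  text = gtf_lines, header = FALSE, comment.char = "#", quote = "",
  stringsAsFactors = FALSE,
  colClasses = c("character", "character", "character", "integer", "integer",
                 "character", "character", "character", "character"),
  col.names = c("chrom", "source", "feature", "start", "end",
                "score", "strand", "frame", "attribute")
)

# Hypothetical helper: pull one attribute value out of the ninth column,
# returning NA when the key is absent.
get_attr <- function(attribute, key) {
  pattern <- paste0('.*\\b', key, ' "([^"]*)".*')
  ifelse(grepl(pattern, attribute, perl = TRUE),
         sub(pattern, "\\1", attribute, perl = TRUE),
         NA_character_)
}

gtf$gene_id       <- get_attr(gtf$attribute, "gene_id")
gtf$transcript_id <- get_attr(gtf$attribute, "transcript_id")
```

Reading `chrom` as character keeps names such as "1" and "X" in a single consistent type.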
All source code is viewable on Bitbucket at https://bitbucket.com/robert_amezquita/ctlcon. Please feel free to submit issues to provide feedback, make requests, or start a conversation about development, and to follow up by contributing pull requests to the repo.
For questions and feedback, please email me at firstname.lastname@example.org or submit issues to the Bitbucket repo.