Welcome to BioLite’s documentation!

BioLite is a Python/C++ framework for implementing bioinformatics pipelines for Next-Generation Sequencing (NGS) data, in particular paired-end Illumina data.

BioLite is designed around three priorities:

  • automating the collection and reporting of diagnostics;
  • tracking provenance of analyses;
  • and providing lightweight tools for building out customized analysis pipelines.

Where possible, we have wrapped existing bioinformatics tools, especially for assembly, alignment and annotation. For analyses where no tool exists, or where existing tools are not optimized for the high computational and storage requirements of NGS data, we have developed custom tools in C++ that follow the standard UNIX “pipe and filter” design pattern.
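
As an illustration only of the “pipe and filter” pattern mentioned above, and not of BioLite's actual C++ tools, here is a minimal Python sketch of a filter that reads FASTQ records from one stream and writes the ones that pass to another. The function names (read_fastq, filter_min_length, write_fastq, main) are hypothetical, and the sketch assumes plain four-line FASTQ records.

```python
import sys
from itertools import islice


def read_fastq(stream):
    """Yield (header, sequence, separator, quality) tuples from a FASTQ stream.

    Assumes each record occupies exactly four lines.
    """
    while True:
        record = [line.rstrip("\n") for line in islice(stream, 4)]
        if len(record) < 4:
            return  # end of input, or a truncated trailing record
        yield tuple(record)


def filter_min_length(records, min_len):
    """Pass through only the records whose sequence has at least min_len bases."""
    for record in records:
        if len(record[1]) >= min_len:
            yield record


def write_fastq(records, out):
    """Write records back out in four-line FASTQ form."""
    for record in records:
        out.write("\n".join(record) + "\n")


def main(argv=None, stdin=sys.stdin, stdout=sys.stdout):
    """Hypothetical entry point: the first argument is the minimum read length."""
    args = sys.argv[1:] if argv is None else argv
    min_len = int(args[0]) if args else 1
    write_fastq(filter_min_length(read_fastq(stdin), min_len), stdout)
```

Because each stage is a generator, records stream through one at a time rather than being loaded into memory, which is the property that makes this pattern attractive for large NGS files. Calling main() from an `if __name__ == "__main__":` guard would let a script like this sit between other tools in a shell pipeline.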

Our primary motivation for developing BioLite is to implement Agalma, a de novo transcriptome assembly and annotation pipeline for Illumina data.

Citing

BioLite is still under development, and is an experimental tool that should be used with care. Please cite:

Howison, M., Sinnott-Armstrong, N. A., & Dunn, C. W. (2012, to appear). BioLite, a lightweight bioinformatics framework with automated tracking of diagnostics and provenance. In Proceedings of the 4th USENIX Workshop on the Theory and Practice of Provenance (TaPP '12).

We have not yet published a paper describing Agalma or the novel methods introduced with this project, but we plan to. This project builds directly on what we learned while completing the analyses for the following paper (although those analyses were run with earlier prototype tools):

Smith, SA, NG Wilson, F Goetz, C Feehery, SCS Andrade, GW Rouse, G Giribet, CW Dunn (2011). Resolving the evolutionary relationships of molluscs with phylogenomic tools. Nature. doi:10.1038/nature10526

If you use Agalma or BioLite in a published study, please contact us for an up-to-date citation, or check the publications page at the Dunn lab (http://dunnlab.org).

Agalma and BioLite make use of many other programs that do much of the heavy lifting in the analyses. Please be sure to credit these essential components as well. The biolite.cfg file contains web links to these programs, where you can find more information on how to cite them.

Funding

This software has been developed with support from the following US National Science Foundation grants:

PSCIC Full Proposal: The iPlant Collaborative: A Cyberinfrastructure-Centered Community for a New Plant Biology (Award Number 0735191)

Collaborative Research: Resolving old questions in Mollusc phylogenetics with new EST data and developing general phylogenomic tools (Award Number 0844596)

Infrastructure to Advance Life Sciences in the Ocean State (Award Number 1004057)

The Brown University Center for Computation and Visualization has been instrumental to the development of BioLite.
