The sequence diversity diagram (SeDD) is designed for comparative analysis of multiple sequence alignments. Amino acid positions are laid out horizontally along the x-axis, and amino acid residues and their groupings along the y-axis. Residues are grouped by structure and general chemical characteristics, and horizontal gaps visually separate the groups. On this grid, a flow diagram represents the residue frequencies in the alignment, with each sample color-coded. The weight of a line represents the relative frequency of a pair of residues at two consecutive positions, with respect to the total number of sequences. Linking adjacent positions in this way shows both the sequential frequency and the co-occurrence of residues at those positions. Overlaying two samples allows both conserved and diverse positions to be studied in a single figure.
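To make the line-weight definition concrete, here is a minimal Python sketch of how the relative frequency of residue pairs at adjacent positions could be computed. It is not taken from the SeDD source code; the function name `adjacent_pair_frequencies` and the toy alignment are assumptions made for illustration.

```python
from collections import Counter


def adjacent_pair_frequencies(alignment):
    """For each pair of adjacent columns, map (residue, next_residue) to the
    fraction of sequences that contain that pair.

    Hypothetical helper for illustration; not part of SeDD itself.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    total = len(alignment)  # frequencies are relative to the number of sequences
    links = []
    for pos in range(length - 1):
        counts = Counter((seq[pos], seq[pos + 1]) for seq in alignment)
        links.append({pair: n / total for pair, n in counts.items()})
    return links


# Toy alignment of four sequences, three columns each (made-up data).
sample = ["MKV", "MKI", "MRV", "MKV"]
links = adjacent_pair_frequencies(sample)
for pos, pairs in enumerate(links):
    print(pos, sorted(pairs.items()))
```

Each dictionary in `links` corresponds to one gap between two columns. In a drawing, each value would set the weight of the line connecting the two residues.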
An open-source Java application developed in Processing is available for Linux, Mac OS X, and Windows. Please see the Downloads page.
It loads FASTA files of multiple sequence alignments and a configuration file with visualization parameters, such as grouping and coloring schemes. By modifying the configuration file, aligned nucleotide sequences can be visualized as well as protein sequences.
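Before loading a file into the application, it can help to confirm that it really is an alignment, that is, that every sequence has the same length. The following Python sketch reads FASTA records and performs that check. The helper names `read_fasta` and `check_aligned` are hypothetical and are not part of SeDD.

```python
import io


def read_fasta(handle):
    """Return a list of (header, sequence) tuples from a FASTA text stream."""
    records = []
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def check_aligned(records):
    """True when all sequences share one length (gap characters included)."""
    return len({len(seq) for _, seq in records}) <= 1


# In-memory example; a real file would be opened with open(path).
example = io.StringIO(">seq1\nMK-V\n>seq2\nMKIV\n")
records = read_fasta(example)
print(records, check_aligned(records))
```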
The configuration file is a tab-delimited file. An example can be found on the Downloads page. Configuration tags include:
- category: all the unique letters in the sequence
- meta-category: the category assigned to each letter. The number of entries here must match the number of letters in category.
- meta-category-name: name for each category
- sample-name: names of your samples. The current version supports up to 4 samples.
- sample-file: absolute paths to the sequence alignment data. The full path may differ depending on your operating system.
- sample-color: the color to be used for each sample. For color choices, see the qualitative schemes at http://colorbrewer2.org.
- #: comment.
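As a rough illustration of how these tags fit together, the Python sketch below parses a tab-delimited configuration and checks the two constraints stated above. Only the tag names come from the list. The exact layout (tag in the first column, values in the following columns), the numeric meta-category values, the hex color notation, and the file paths are all assumptions, so consult the example on the Downloads page for the real format.

```python
MAX_SAMPLES = 4  # limit stated for the current version


def parse_config(text):
    """Parse 'tag<TAB>value<TAB>value...' lines into a dict of lists.

    Assumed layout for illustration only; lines starting with '#' are comments.
    """
    config = {}
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        tag, *values = line.split("\t")
        config[tag.strip()] = [v.strip() for v in values if v.strip()]
    return config


def validate(config):
    """Return a list of human-readable problems (empty when none are found)."""
    problems = []
    if len(config.get("category", [])) != len(config.get("meta-category", [])):
        problems.append("meta-category must have one entry per category letter")
    if len(config.get("sample-name", [])) > MAX_SAMPLES:
        problems.append("at most %d samples are supported" % MAX_SAMPLES)
    return problems


# Hypothetical nucleotide configuration; values and paths are made up.
example = "\n".join([
    "# comment line",
    "category\tA\tC\tG\tT\t-",
    "meta-category\t0\t1\t0\t1\t2",
    "meta-category-name\tpurine\tpyrimidine\tgap",
    "sample-name\tsampleA\tsampleB",
    "sample-file\t/path/to/sampleA.fasta\t/path/to/sampleB.fasta",
    "sample-color\t#1b9e77\t#d95f02",
])
config = parse_config(example)
print(validate(config))
```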
The application also lets you:
- export .png or .pdf images
- adjust the width of each dimension and the horizontal gap
- swap the positions of individual residues or of groups
- scroll over a segment to highlight the selected sequences
- filter lines by setting a frequency threshold
The MIT License (MIT)