Amplikyzer2 - Purpose and Overview
Amplikyzer2 is analysis software for next-generation amplicon sequencing projects, especially methylation studies by bisulfite sequencing, using technologies that generate FASTQ files (MiSeq) or SFF files (454, Ion Torrent). It helps with the analysis and visualization of such sequence data.
Its features include:
- Methylation analysis supports CpG dinucleotides from bisulfite-treated sequences as well as GpC dinucleotides from NOMe-Seq samples.
- Many customization options for selecting and sorting reads and samples are provided. For example, only certain alleles may be analyzed and plotted, or a comparative plot of certain alleles between selected samples may be sorted by methylation degree.
- amplikyzer2 generates individual and comparative methylation plots in different formats (png, pdf, svg, text).
- It is a command-line application that has several subcommands to perform different steps of the analysis pipeline. A simple but effective GUI is provided as well.
MiSeq-FASTQ specific features:
- Samples can be de-multiplexed from barcodes in filenames.
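As an illustration of the idea of filename-based de-multiplexing, here is a minimal Python sketch. It is not amplikyzer2's actual implementation: the filename pattern (run, barcode, mate number), the function name group_by_sample and the barcode-to-sample table are all hypothetical and chosen only for this example.

```python
import re
from collections import defaultdict

# Hypothetical naming scheme (an assumption for this sketch only):
#   <run>_<barcode>_R<1|2>.fastq  or  <run>_<barcode>_R<1|2>.fastq.gz
FILENAME_PATTERN = re.compile(
    r"^(?P<run>[^_]+)_(?P<barcode>[ACGT]+)_R(?P<mate>[12])\.fastq(\.gz)?$"
)


def group_by_sample(filenames, barcode_to_sample):
    """Group FASTQ filenames by the sample that their barcode maps to.

    Files that do not match the assumed pattern are skipped; files with
    an unlisted barcode are collected under the label "unknown".
    """
    groups = defaultdict(list)
    for name in filenames:
        match = FILENAME_PATTERN.match(name)
        if match is None:
            continue  # not a file following the assumed naming scheme
        sample = barcode_to_sample.get(match.group("barcode"), "unknown")
        groups[sample].append(name)
    return dict(groups)


# Hypothetical input data
files = [
    "run1_ACGTAC_R1.fastq",
    "run1_ACGTAC_R2.fastq",
    "run1_TTGCAA_R1.fastq.gz",
    "notes.txt",
]
samples = {"ACGTAC": "patient_A", "TTGCAA": "control"}
print(group_by_sample(files, samples))
```

In a real project, the mapping from barcodes to sample labels would come from the configuration files rather than from a dictionary in the code.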
SFF specific features:
- In contrast to many similar tools, amplikyzer2 works directly on flow information, such as that produced by 454 or Ion Torrent technologies (SFF files). It does not work with FASTA or BAM files, where flow intensities have already been rounded or otherwise processed and information has been lost.
- Therefore, amplikyzer2 provides higher tolerance against the homopolymer problem, which is especially relevant after bisulfite treatment, where there are many long T homopolymers (or A homopolymers) in the resulting reads.
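To make the information loss concrete, the following Python sketch shows what happens when flow intensities are simply rounded to homopolymer lengths. It is a toy model, not amplikyzer2's algorithm: the cyclic flow order "TACG", the function name call_bases and the intensity values are assumptions made for this illustration.

```python
FLOW_ORDER = "TACG"  # assumed cyclic nucleotide flow order for this sketch


def call_bases(flows, flow_order=FLOW_ORDER):
    """Naive base calling: round each flow intensity to a run length."""
    bases = []
    for i, intensity in enumerate(flows):
        nucleotide = flow_order[i % len(flow_order)]
        run_length = int(intensity + 0.5)  # round half up
        bases.append(nucleotide * run_length)
    return "".join(bases)


# Two hypothetical reads of the same template; they differ only by a
# small amount of noise in the first (T) flow.
read_a = [4.45, 0.05, 1.02, 0.98]
read_b = [4.55, 0.03, 0.97, 1.04]

print(call_bases(read_a))  # TTTTCG
print(call_bases(read_b))  # TTTTTCG
```

After rounding, the two reads appear to differ by a whole base in the T homopolymer, although the underlying intensities (4.45 versus 4.55) are almost identical. A tool that sees only the called bases cannot tell how close the call was, whereas a tool that reads the flow values can.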
Installation and Requirements
Amplikyzer2 is a pure Python application. Time-critical steps are just-in-time compiled using the numba package, which requires no user setup beyond installing the required Python version and packages. We recommend using the Miniconda Python distribution with Python 3.5 or higher. Required Python packages are numpy, numba and matplotlib.
Please see the installation instructions.
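Before installing, it can be useful to check whether the current Python environment already meets the requirements listed above. The following is a small sketch using only the standard library; the helper names missing_packages and python_ok are made up for this example and are not part of amplikyzer2.

```python
import importlib.util
import sys

REQUIRED_PACKAGES = ("numpy", "numba", "matplotlib")
MIN_PYTHON = (3, 5)


def missing_packages(names=REQUIRED_PACKAGES):
    """Return the names of packages that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_ok(version=None, minimum=MIN_PYTHON):
    """Check that a (major, minor, ...) version meets the minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= minimum


print("Python version OK:", python_ok())
print("Missing packages:", missing_packages() or "none")
```

If packages are reported as missing, install them with the package manager of your Python distribution, as described in the installation instructions.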
- The first step is to analyze FASTQ / SFF files (subcommands such as analyzesff), recognizing the sequencer key (for SFF), the sample ID (multiplex ID, MID), the forward or reverse tag, followed by the primer, region of interest (ROI) and the reverse primer. The user specifies the possible sequences for each of these elements within configuration files, allowing subsequent automated (scripted) processing without human intervention.
- The statistics subcommand displays statistics about recognized keys, MIDs, tags and ROIs.
- The main feature of amplikyzer2, the methylation subcommand, produces methylation plots of individual loci, MIDs (with meaningful labels) and specific alleles. An individual plot visualizes each read on each CpG. A comparative plot compares methylation rates across different MIDs or alleles. Many different sorting options exist.
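The principle behind a per-CpG methylation rate can be sketched in a few lines of Python. This is a simplified illustration, not amplikyzer2's implementation: it assumes that reads are already aligned to the reference without gaps, have the same length and orientation as the reference, and that the function names and example sequences are invented for this purpose. After bisulfite treatment, a methylated cytosine is still read as C, while an unmethylated cytosine is read as T.

```python
def cpg_positions(reference):
    """Return the index of the C in every CG dinucleotide of the reference."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def methylation_rates(reference, reads):
    """Compute the fraction of methylated reads at each CpG position.

    A read showing C at a CpG counts as methylated, T as unmethylated;
    any other base is treated as uninformative and ignored. Positions
    without informative reads yield None.
    """
    rates = []
    for pos in cpg_positions(reference):
        methylated = sum(1 for read in reads if read[pos] == "C")
        unmethylated = sum(1 for read in reads if read[pos] == "T")
        informative = methylated + unmethylated
        rates.append(methylated / informative if informative else None)
    return rates


# Hypothetical reference region with CpGs at positions 1 and 5,
# and four hypothetical bisulfite-converted reads.
reference = "ACGTTCGA"
reads = ["ACGTTTGA", "ATGTTTGA", "ACGTTCGA", "ACGTTTGA"]

print(cpg_positions(reference))             # [1, 5]
print(methylation_rates(reference, reads))  # [0.75, 0.25]
```

An individual plot corresponds to the per-read, per-CpG calls (C or T), while a comparative plot corresponds to rates like those returned here, computed separately for each MID or allele.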
- Installation instructions
- Example run -- be sure to walk through this example with the provided dataset before attempting your own analysis
- Users' guide -- explains relevant options and contains advice on best practices
- Configuration -- detailed information on how to write the configuration files which specify tags, MIDs, loci and the input reads' structure
- FAQ -- answers to problems frequently raised by users