purge_haplotigs / Examples

EXAMPLE 1: Improved haploid assembly of Arabidopsis thaliana (Col-0 x Cvi-0)

purge_haplotigs_eg1vsml.png

Circos plots of draft and curated FALCON-Unzip assemblies for A. thaliana (Col-0 x Cvi-0). LEFT: Draft haploid assembly reported in Chin et al. (2016). RIGHT: The same assembly processed with Purge Haplotigs. Tracks shown are A) contigs ordered by size, B) read-depth histogram, and C) heterozygous SNP density histogram.

For both examples, the diploid assembly was processed with Purge Haplotigs using 'aggressive' purging parameters (-a 50 -m 500). Illumina PE data were obtained from SRA (SRR3703081, SRR3703082, SRR3703105; BioProject PRJNA314706). PE reads were mapped to the draft and curated haploid assemblies with bwa; the alignments were then sorted, filtered, and deduplicated, and discordant reads were removed. The read-depth and SNP density histograms were produced using the workflow shown here. Mapping rates differed very little between the two assemblies, but approximately 20% more filtered heterozygous SNPs were called for the curated assembly.
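As a rough illustration of what a SNP density histogram track represents, the sketch below counts SNP positions in fixed-size windows along a single contig. It is not the workflow used to produce the figure: the function name `snp_density`, the 10 kb default window, and the input format (a plain list of 1-based positions, for example taken from a filtered VCF) are all assumptions made for this example.

```python
def snp_density(positions, contig_length, window=10_000):
    """Count SNPs per fixed-size window along one contig.

    positions      -- 1-based SNP coordinates on the contig (hypothetical
                      input, e.g. extracted from a filtered VCF)
    contig_length  -- length of the contig in bases
    window         -- window size in bases (assumed value, not from the wiki)

    Returns a list with one count per window; the last window may be
    shorter than `window`.
    """
    if window <= 0:
        raise ValueError("window must be positive")

    # Ceiling division so a partial final window still gets a bin.
    n_windows = max(1, -(-contig_length // window))
    counts = [0] * n_windows

    for pos in positions:
        if not 1 <= pos <= contig_length:
            raise ValueError(f"position {pos} is outside the contig")
        # Convert the 1-based coordinate to a 0-based window index.
        counts[(pos - 1) // window] += 1

    return counts
```

Running the same binning on SNPs called against the draft and the curated assemblies would give per-window counts that can be compared directly, which is the kind of comparison track C in the figure summarises.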

EXAMPLE 2: Improved diploid assembly of Arabidopsis thaliana (Col-0 x Cvi-0)

purge_haplotigs_eg2.png

Dotplots of primary contigs against haplotigs for the draft and curated FALCON-Unzip assemblies for A. thaliana (Col-0 x Cvi-0). LEFT: Draft assembly reported in Chin et al. (2016). RIGHT: The same assembly curated with Purge Haplotigs.

Primary contigs were aligned to haplotigs using nucmer (MUMmer), the alignments were filtered using delta-filter, and the dotplots were produced with mummerplot. In the draft assembly, many primary contigs do not align to any haplotig. After curation with Purge Haplotigs, only a few primary contigs remain 'unpaired'.
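The notion of an 'unpaired' primary contig can be made concrete with a small sketch. It assumes the filtered alignments have already been reduced to (primary contig, haplotig) ID pairs by some earlier step that is not shown here; the function name `unpaired_primaries` and the contig IDs in the usage example are invented for illustration.

```python
def unpaired_primaries(primary_ids, alignments):
    """Return primary contigs that have no alignment to any haplotig.

    primary_ids -- iterable of all primary contig IDs in the assembly
    alignments  -- iterable of (primary_id, haplotig_id) pairs, assumed to
                   be derived from the filtered alignments
    """
    # Every primary contig that appears in at least one alignment pair.
    aligned = {primary for primary, _haplotig in alignments}
    # Preserve the input order of the primary contigs in the result.
    return [p for p in primary_ids if p not in aligned]


# Usage with made-up contig IDs:
primaries = ["000000F", "000001F", "000002F"]
pairs = [("000000F", "000000F_001"), ("000002F", "000002F_001")]
print(unpaired_primaries(primaries, pairs))
```

Comparing the length of this list for the draft and curated assemblies gives a simple numeric counterpart to what the dotplots show visually.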
