purge_haplotigs / Updates
04NOV2021 | v1.1.2
- Misc minor bugfixes
- Version is printed to terminal
- Updated README
20FEB2020 | v1.1.1
- Mostly misc bugfixes and performance tweaks
- Pipeline now uses samtools depth for coverage calculation, which is faster and hopefully more reliable at higher thread counts
- Benchmarks have been updated given the improved runtime of step 1
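
To illustrate the kind of data the coverage step works with, here is a minimal sketch (not the pipeline's own code) that turns `samtools depth` output into a read-depth histogram. It relies only on the three tab-separated columns that `samtools depth` prints (reference name, position, depth). The function name `depth_histogram` and the `max_depth` cap are hypothetical and chosen for illustration.

```python
from collections import Counter


def depth_histogram(lines, max_depth=200):
    """Count how many positions fall at each read depth.

    `lines` is any iterable of `samtools depth` output lines
    (reference name, 1-based position, depth; tab-separated).
    Depths above `max_depth` (a hypothetical cap) go into the last bin.
    """
    hist = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        _contig, _pos, depth = line.split("\t")
        hist[min(int(depth), max_depth)] += 1
    return hist


# Small made-up example; real input would be the output of samtools depth.
sample = ["ctg1\t1\t10", "ctg1\t2\t10", "ctg1\t3\t250"]
hist = depth_histogram(sample)
```

Note that `samtools depth` omits zero-coverage positions unless it is run with `-a`, so a histogram built this way has no zero-depth bin by default.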
08JUN2019 (ish) | v1.0.0
- Various bugfixes, refactoring and optimisations
- New features
  - Shorter subcommands (but backwards compatible): purge_haplotigs readhist is now purge_haplotigs hist, contigcov -> cov, ncbiplace -> place
  - Settable max coverage for read-depth histogram
  - Nicer looking histograms
  - Rename contigs using FALCON Unzip naming convention during purge and place scripts
  - New experimental script clip will find and trim contigs with overlapping edges (for use after purge)
03DEC2018 | v1.0.4
- Minor update, various bugfixes and tweaked settings
29OCT2018 | v1.0.3
- Hotfix for dotplots: reverse alignments were not being displayed
25OCT2018 | v1.0.2
- Significant improvements to pipeline performance
- Reduced RAM usage and better thread usage potential for minimap2
- Large reduction in IO operations
- Overpurge-checking: all reassigned contigs are now checked after convergence to ensure they still meet the requirements for reassignment as haplotigs
- Bugfix for readhist stage and hotfix for repeat annotations
25SEP2018 | v1.0.1
- Added to Bioconda
17SEP2018
- Major update: the pipeline now uses Minimap2 in place of blast + lastz. This is orders of magnitude faster and performs similarly well.
- Installation available via Anaconda.
- The readhist stage is now multi-threaded.
- Dotplots are now optional. Skipping dotplot generation is significantly faster.
- More fixes for sporadic crashing during high thread turnover; I believe it is properly fixed now but will continue to monitor and test.
12JUN2018
- Fix for sporadic crashing with high thread turnover
- Added experimental features to branch 'dev'
  - -repeats: provide repeat annotations (in BED format) to use during analysis. purge.pl will ignore alignments over these regions when pairing contigs. This was included to address possible over-purging of highly repetitive contigs and appears to work well with RepeatModeler/RepeatMasker annotations (but not WindowMasker repeats).
  - -nucmer: use nucmer, delta-filter, show-coords instead of lastz. Slower, but has an enriched dotplot to show repetitive alignments in red and a 1-1 chained alignment in black.
  - -wind_min, -wind_nmax: replace -wind_len and -wind_step. purge.pl will scale the size of BED windows to suit the length of the contigs, to a minimum size of -wind_min and a maximum number of windows per contig of -wind_nmax. It will also convert the coverages to log2(read-depth/average read-depth).
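
The window scaling and log2 conversion described for -wind_min and -wind_nmax can be sketched roughly as follows. This is one plausible reading of the description, not the actual logic in purge.pl: the function names, the default values, and the exact rounding rule are all assumptions made for illustration.

```python
import math


def window_size(contig_len, wind_min=5000, wind_nmax=200):
    """Pick a window size for a contig (hypothetical rule).

    Windows grow with contig length so that no contig gets more than
    `wind_nmax` windows, but never shrink below `wind_min` bases.
    The default values here are placeholders, not the tool's defaults.
    """
    return max(wind_min, math.ceil(contig_len / wind_nmax))


def log2_relative_depth(depth, mean_depth):
    """Return log2(read-depth / average read-depth).

    Zero depth has no finite log2, so it is mapped to -inf here
    (an assumption; the tool may handle this case differently).
    """
    if mean_depth <= 0:
        raise ValueError("mean_depth must be positive")
    if depth <= 0:
        return float("-inf")
    return math.log2(depth / mean_depth)


small = window_size(100_000)      # short contig: bounded by wind_min
large = window_size(10_000_000)   # long contig: bounded by wind_nmax
half = log2_relative_depth(30, 60)
```

Under these assumptions, a window at half the average depth scores -1 and one at double the average scores +1, which puts haplotig-like and collapsed regions on a symmetric scale around 0.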
25MAR2018
- Added a new -windowmasker flag to purge_haplotigs purge and ncbiplace. This follows the guidelines HERE for creating blast databases with repetitive sequences masked. This results in much faster blastn hit searches in the initial stages of purge and ncbiplace with minimal impact on the final result.
- Updated tests and test dataset
- A number of other small tweaks and fixes (check commit comments)