gabi / NEWS

* Fixed bug in calling LAP with incorrect threshold. Threshold is now
  correctly calculated based on estimated depth of coverage.
* Extended assembly graph to handle non-circular genomes and restructured `graph`
  to carry sequences on nodes instead of edges. This reduces the amount of
  redundant sequence that was sometimes carried by parallel edges in the previous
  multi-digraph structure. Frequencies are now calculated for both nodes and edges.
* The `compare` script now searches for seed kmers from the assembly graph in the
  reference sequence, and also outputs a list of all edges found in the reference.
* Corrected the erroneously named 'thinning' parameter to 'perturbations', which
  controls the number of path perturbations performed at each sample. Removing
  the unneeded likelihood calculations and odds ratio comparisons after each of
  these perturbations greatly improves performance.
* Bowtie2 is now run in GABI instead of in the LAP wrapper. The /tmp
  directory is used for alignments, since it is a RAM disk on Oscar compute
  nodes; this slightly improves the sampling time by reducing unnecessary I/O
  to permanent storage.
* Cleaned up and encapsulated SQL queries in the `trace` module. Restructured the
  `samples` table to store a text list of active nodes/edges at each sample,
  eliminating the separate `edges` table and JOINs, and simplifying post-processing.
* The `debruijn` script now has a '--circular' flag for circular genomes (removing
  all tips), and the '--tips' flag specifies how many levels of tips to remove.
* The `sampler` module is more robust to `nan` and `-inf` values in the likelihood
  and prior probability calculations.
* Restyled the trace animation to use color to indicate cumulative frequency, and
  line thickness to indicate the assembly path at each sample.
* Fixed a bug in the color mapping for the posteriors plot.
* Updated prereq for BioLite (0.4.0).
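
The `nan` and `-inf` handling mentioned above can be illustrated with a minimal sketch. This is not the actual `sampler` code: it assumes a Metropolis-style acceptance test on log-probabilities, and the function names `log_posterior` and `accept` are hypothetical.

```python
import math
import random

def log_posterior(log_likelihood, log_prior):
    """Sum the log terms; treat an undefined (nan) result as impossible (-inf)."""
    total = log_likelihood + log_prior
    if math.isnan(total):
        return -math.inf
    return total

def accept(current, proposed, rng=random):
    """Metropolis-style acceptance test on log-posterior values (assumed scheme)."""
    if proposed == -math.inf:
        return False          # never move to an impossible state
    if current == -math.inf:
        return True           # always leave an impossible state
    log_ratio = proposed - current
    if log_ratio >= 0:
        return True
    # log_ratio is negative here, so exp() cannot overflow
    return rng.random() < math.exp(log_ratio)
```

Checking for `-inf` before subtracting avoids the `inf - inf` case, which would otherwise produce `nan` and make the comparison silently fail.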

* New option to specify the random seed in the sample pipeline, to enable
  deterministic/recomputable chains; updated phix-test scripts with explicit
  random seeds.
* Updated prereq for BioLite (0.3.5).
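
A minimal sketch of why an explicit seed makes a chain recomputable. The function `run_chain` and its arguments are hypothetical and stand in for the real sample pipeline; here each "sample" is just the index of an edge picked at random.

```python
import random

def run_chain(n_samples, n_edges, seed=None):
    """Toy chain: pick one edge index per sample from a privately seeded generator."""
    rng = random.Random(seed)  # private generator; global random state is untouched
    return [rng.randrange(n_edges) for _ in range(n_samples)]
```

Calling `run_chain(10, 5, seed=42)` twice returns the same list, whereas leaving `seed=None` seeds the generator from the operating system and gives a different chain on each run.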

* Switched from CGAL to LAP for the likelihood calculation.
* Updated prereqs for Bowtie2 (2.1.0) and BioLite (0.3.4).
* New test of different priors in phix-test, with related new options in the
  sample pipeline.
* Added a FASTA file with the majority rule consensus to the report (#1).
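
One plausible reading of "majority rule consensus" is sketched below, under the assumption that it keeps every graph element that is active in more than half of the samples. The function name `majority_rule`, the input layout (one list of active edge identifiers per sample), and the 0.5 threshold are illustrative assumptions, not taken from the GABI source.

```python
from collections import Counter

def majority_rule(samples, threshold=0.5):
    """Return the elements present in more than `threshold` of the samples."""
    counts = Counter()
    for active in samples:
        counts.update(set(active))  # count each element at most once per sample
    n = len(samples)
    return sorted(e for e, c in counts.items() if c / n > threshold)
```

For example, with samples `[["a", "b"], ["a", "c"], ["a", "b"], ["b"]]`, elements `a` and `b` each appear in three of four samples and are kept, while `c` is dropped.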

* Revised the graph perturbation algorithm so that turning on any edge in an
  alternate path turns on the entire alternate path. Previously, this happened
  only when the toggled edge was either the first or last edge in the alternate
  path. Added an updated sample report that shows a much better acceptance
  rate as a result of this change.
* Restored the adjusted likelihood calculation in sampler.py -- empirical
  justification coming soon.
* Added option to change the shape parameter in the gamma distribution used
  for the prior probabilities on genome size and # of contigs.
* Added option to retain tips when reducing the de Bruijn graph.
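
To show what the shape parameter of a gamma prior controls, here is a minimal sketch of a gamma log-density. The shape/scale parameterization and the function name `gamma_log_prior` are assumptions for illustration; the actual parameterization used in the sample pipeline may differ.

```python
import math

def gamma_log_prior(x, shape, scale):
    """Log-density of a Gamma(shape, scale) distribution at x; -inf outside the support."""
    if x <= 0:
        return -math.inf
    return ((shape - 1.0) * math.log(x)
            - x / scale
            - math.lgamma(shape)
            - shape * math.log(scale))
```

With `shape=1` the density reduces to an exponential distribution, which puts the most weight on small values; a larger shape moves the mode of the prior to `(shape - 1) * scale`, away from zero.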

* Added basic install instructions to README, including patch for CGAL.
* Fixed algorithm for reducing the de Bruijn graph to completely collapse all
  unambiguous paths.
* Added test scripts and data to phix-test, and updated the sample report.
* No longer normalizing likelihood score by number of reads in sampler.py.
* Fixed calculation of the average stdev of the split frequencies in