Lazy-load the pre-computed barcode possibilities

Issue #7 new
Robert Leach created an issue

USE CASE: WHAT DO YOU WANT TO DO?

Allow the user to specify the desired number of mismatches with realistic limitations, and to see output immediately.

STEPS TO REPRODUCE AN ISSUE (OR TRIGGER A NEW FEATURE)

  1. Run as normal

CURRENT BEHAVIOR

Pre-computes. Precomputing the barcode dictionary for the allowed number of mismatches takes a lot of time and memory, and most of the precomputed entries will never match a read anyway. Three or more allowed mismatches become prohibitive in both memory and compute time, so a maximum of 2 is currently set (on Galaxy).
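To give a rough sense of why the precomputed dictionary grows so quickly, here is a small sketch that counts how many distinct sequences lie within a given number of mismatches of a single barcode. The function name `variants_within`, the barcode length of 8, and the 4-letter alphabet are assumptions for illustration only; they are not taken from the tool.

```python
from math import comb


def variants_within(length: int, max_mismatches: int, alphabet_size: int = 4) -> int:
    """Count sequences within `max_mismatches` substitutions of one barcode.

    For each number of mismatches i, choose which i positions differ and
    which of the (alphabet_size - 1) other letters appears at each of them.
    """
    return sum(
        comb(length, i) * (alphabet_size - 1) ** i
        for i in range(max_mismatches + 1)
    )


if __name__ == "__main__":
    # Hypothetical barcode length of 8 over the alphabet A, C, G, T.
    for k in range(4):
        print(k, variants_within(8, k))
```

The count is per barcode, so the total number of precomputed keys scales with the number of barcodes as well. Counting an ambiguous base such as N as a fifth letter (`alphabet_size=5`) makes the growth steeper.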

EXPECTED BEHAVIOR

Computes the dictionary on the fly. Each sequence that has been looked up is stored as a hash key and reused.

Build the dictionary on the fly, keeping track of what has been computed and what has not, so that lookups of previously computed sequences take constant time and only sequences not yet computed take a little longer.

DEVELOPERS ONLY SECTION

SUGGESTED CHANGE (Pseudocode optional)

If a read doesn't exist in the dictionary, look for matches given the allowed number of mismatches. Whether or not it matches, add the read to the dictionary with its result. An unmatched read would map to the string 'UNMATCHED'. A read that matches more than one barcode would map to the string 'MULTIMATCHED'.
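The suggested change could be sketched roughly as follows. This is only an illustration under stated assumptions, not the tool's actual code: the class name `LazyBarcodeLookup`, the helper `hamming_within`, and the choice to count mismatches as substitutions between equal-length sequences (Hamming distance) are all hypothetical. Only the 'UNMATCHED' and 'MULTIMATCHED' strings come from the description above.

```python
from typing import Dict, Iterable, List

UNMATCHED = "UNMATCHED"
MULTIMATCHED = "MULTIMATCHED"


def hamming_within(a: str, b: str, limit: int) -> bool:
    """Return True if a and b have equal length and differ at <= limit positions."""
    if len(a) != len(b):
        return False
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > limit:
                return False  # stop early once the limit is exceeded
    return True


class LazyBarcodeLookup:
    """Hypothetical lazy dictionary: results are computed on first sight, then cached."""

    def __init__(self, barcodes: Iterable[str], max_mismatches: int) -> None:
        self.barcodes: List[str] = list(barcodes)
        self.max_mismatches = max_mismatches
        self.cache: Dict[str, str] = {}  # observed sequence -> barcode or marker

    def lookup(self, observed: str) -> str:
        # Previously seen sequences are answered by a constant-time hash lookup.
        if observed in self.cache:
            return self.cache[observed]

        # First sighting: compare against every known barcode.
        hits = [
            bc for bc in self.barcodes
            if hamming_within(observed, bc, self.max_mismatches)
        ]
        if len(hits) == 1:
            result = hits[0]
        elif hits:
            result = MULTIMATCHED
        else:
            result = UNMATCHED

        # Store the outcome either way so the work is never repeated.
        self.cache[observed] = result
        return result


if __name__ == "__main__":
    lookup = LazyBarcodeLookup(["AAAA", "AATT", "CCCC"], max_mismatches=1)
    for seq in ["AAAA", "CCCG", "AAAT", "GGGG", "AAAT"]:
        print(seq, lookup.lookup(seq))
```

With this approach memory grows with the number of distinct sequences actually observed in the reads rather than with every possible mismatch variant, and a first-time lookup costs one pass over the barcode list. Note that this sketch treats an exact match like any other hit, so an exact match that is also within the mismatch limit of a second barcode would be reported as 'MULTIMATCHED'; whether exact matches should take precedence is a design decision left open here.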

LEVEL OF EFFORT - developers only

major

COMMENTS
