MaskPrimers-align edge cases are not masking correctly

Issue #42 new
Jason Vander Heiden created an issue

When the alignment has a gap in the input sequence at the very end of the alignment, the masking is off by one. For example:

       ID> SRR765688.1679
SEQORIENT> RC
   PRIMER> LR11
 PRORIENT> F
  PRSTART> 33
    INSEQ> TCACCTGCGCTGTCTCTGGTGGCTCCATCAGCAGTAGTAACTGGTGGAGT-TGGGTCCGCAGCCC
    ALIGN> ---------------------------------GGTGCAGCTGGTGGAGTC
   OUTSEQ>                                  NNNNNNNNNNNNNNNNNTTGGGTCCGCAGCCC
   FIXSEQ>                                  NNNNNNNNNNNNNNNNNNTGGGTCCGCAGCCC

    ERROR> 0.2777777777777778

In this case, OUTSEQ has one extra T. Trivial attempts to fix the problem solve it in the right-hand gap case but introduce it in the left-hand gap case:

       ID> SRR765688.1837
SEQORIENT> RC
   PRIMER> LR3
 PRORIENT> F
  PRSTART> 0
    INSEQ> -GCAATCTGGGTCTGAGTTGAAGACGGCCTGGGGCCTCAGTGAAGATTTCCTGCAAGAC
    ALIGN> TGCAATCTGGGTCTGAGTTG-------------------------------
   OUTSEQ> NNNNNNNNNNNNNNNNNNNNAAGACGGCCTGGGGCCTCAGTGAAGATTTCCTGCAAGAC
   FIXSEQ> NNNNNNNNNNNNNNNNNNNNNAGACGGCCTGGGGCCTCAGTGAAGATTTCCTGCAAGAC
    ERROR> 0.050000000000000044

Fixing both edge cases will require more detailed parsing of the local alignments.
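As a rough illustration of what that parsing could look like, here is a minimal sketch that derives the mask boundary by counting non-gap input bases column by column, rather than from the primer length and start offset alone. The function name `mask_primer` and its two-string interface are hypothetical and are not taken from the MaskPrimers code. It also assumes that the expected result is one N per alignment column spanned by the primer, followed by the input bases after the last primer column, which is how FIXSEQ in the first example and OUTSEQ in the second read.

```python
GAP = "-"

def mask_primer(in_aligned: str, primer_aligned: str) -> str:
    """Mask the primer region of one gapped pairwise alignment (hypothetical helper).

    in_aligned     -- input sequence row of the alignment, gaps as '-'
    primer_aligned -- primer row of the alignment, gaps as '-'; trailing
                      padding may be omitted
    """
    ungapped = in_aligned.replace(GAP, "")
    # Alignment columns in which the primer contributes a base
    cols = [i for i, c in enumerate(primer_aligned) if c != GAP]
    if not cols:
        return ungapped  # no primer aligned, nothing to mask
    first, last = cols[0], cols[-1]
    # Input coordinate just past the primer: count real input bases up to
    # and including the last primer column, so gaps in the input at either
    # edge of the alignment do not shift the boundary
    end = sum(1 for c in in_aligned[:last + 1] if c != GAP)
    # Assumption: one N per alignment column spanned by the primer
    return "N" * (last - first + 1) + ungapped[end:]

# Right-hand gap case (SRR765688.1679): primer starts at column 33
right = mask_primer(
    "TCACCTGCGCTGTCTCTGGTGGCTCCATCAGCAGTAGTAACTGGTGGAGT-TGGGTCCGCAGCCC",
    "-" * 33 + "GGTGCAGCTGGTGGAGTC",
)

# Left-hand gap case (SRR765688.1837): primer starts at column 0
left = mask_primer(
    "-GCAATCTGGGTCTGAGTTGAAGACGGCCTGGGGCCTCAGTGAAGATTTCCTGCAAGAC",
    "TGCAATCTGGGTCTGAGTTG",
)
```

Under these assumptions, `right` is 18 N followed by TGGGTCCGCAGCCC and `left` is 20 N followed by AAGACGG..., so neither edge case gains or loses a base. Whether the gapped rows are readily available from the local alignment function in use is a separate question.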

Comments (9)

  1. Jason Vander Heiden reporter

    I think the reason this issue has sat for so long is that it’s probably a better use of time to swap out the function used for the local alignment, given its performance problems, than to hack a solution around the existing local alignment function.
