MaskPrimers-align edge cases are not masking correctly
When the alignment has a gap in the input sequence at the very end of the alignment, the masking is off by one. For example:
ID> SRR765688.1679
SEQORIENT> RC
PRIMER> LR11
PRORIENT> F
PRSTART> 33
INSEQ> TCACCTGCGCTGTCTCTGGTGGCTCCATCAGCAGTAGTAACTGGTGGAGT-TGGGTCCGCAGCCC
ALIGN> ---------------------------------GGTGCAGCTGGTGGAGTC
OUTSEQ> NNNNNNNNNNNNNNNNNTTGGGTCCGCAGCCC
FIXSEQ> NNNNNNNNNNNNNNNNNNTGGGTCCGCAGCCC
ERROR> 0.2777777777777778
In this case, OUTSEQ has one extra T. Trivial attempts to fix the problem solve it in the right-hand gap case, but introduce it in the left-hand gap case:
ID> SRR765688.1837
SEQORIENT> RC
PRIMER> LR3
PRORIENT> F
PRSTART> 0
INSEQ> -GCAATCTGGGTCTGAGTTGAAGACGGCCTGGGGCCTCAGTGAAGATTTCCTGCAAGAC
ALIGN> TGCAATCTGGGTCTGAGTTG-------------------------------
OUTSEQ> NNNNNNNNNNNNNNNNNNNNAAGACGGCCTGGGGCCTCAGTGAAGATTTCCTGCAAGAC
FIXSEQ> NNNNNNNNNNNNNNNNNNNNNAGACGGCCTGGGGCCTCAGTGAAGATTTCCTGCAAGAC
ERROR> 0.050000000000000044
Fixing both edge cases will require more detailed parsing of the local alignments.
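One way to handle both cases might be to derive the cut position from the gapped rows directly, rather than adjusting an offset by one. The sketch below is not the MaskPrimers code: `mask_primer` is a hypothetical helper, and it assumes the mask should be as long as the primer and that the cut falls after the last input residue at or before the primer's final alignment column. The two alignments are the ones shown above.

```python
GAP = "-"

def mask_primer(inseq_aln, primer_aln):
    """Mask the primer region given the gapped INSEQ and ALIGN rows.

    Hypothetical helper for illustration; not part of pRESTO.
    """
    # Alignment columns that hold a primer residue (padding/gaps excluded).
    primer_cols = [i for i, c in enumerate(primer_aln) if c != GAP]
    last_col = primer_cols[-1]
    # Ungapped end coordinate: count input residues in columns 0..last_col.
    # A gap in the input at either edge simply contributes nothing here.
    end = sum(1 for c in inseq_aln[:last_col + 1] if c != GAP)
    ungapped = inseq_aln.replace(GAP, "")
    # Assumption: the mask is one N per primer residue.
    return "N" * len(primer_cols) + ungapped[end:]

# Right-hand gap case (SRR765688.1679): input gap under the last primer base.
right_in = "TCACCTGCGCTGTCTCTGGTGGCTCCATCAGCAGTAGTAACTGGTGGAGT-TGGGTCCGCAGCCC"
right_pr = "-" * 33 + "GGTGCAGCTGGTGGAGTC"

# Left-hand gap case (SRR765688.1837): input gap under the first primer base.
left_in = "-GCAATCTGGGTCTGAGTTGAAGACGGCCTGGGGCCTCAGTGAAGATTTCCTGCAAGAC"
left_pr = "TGCAATCTGGGTCTGAGTTG" + "-" * 31

print(mask_primer(right_in, right_pr))
print(mask_primer(left_in, left_pr))
```

Under these assumptions the right-hand case keeps `TGGGTCCGCAGCCC` after the mask and the left-hand case keeps `AAGACGG...`, so neither example gains or loses a base.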
Comments (9)
-
Any chance that you already have some toy data to reproduce this and fix this bug?
-
reporter Not aside from the example I posted in the issue.
-
reporter I think the reason this issue has sat for so long is that it’s probably a better use of time to swap out the function used for the local alignment, due to performance problems, instead of trying to hack a solution around the existing local alignment function.
-
reporter Look into the striped Smith-Waterman algorithm. The problem is finding a C implementation of it with a working Python wrapper that is trivial to install (via pip). See the following for a starting point:
- https://github.com/mengyao/complete-striped-smith-waterman-library
- http://scikit-bio.org/docs/0.5.6/generated/skbio.alignment.StripedSmithWaterman.html#skbio.alignment.StripedSmithWaterman
-
I’ll definitely look into those, thanks Jason!
-
reporter Cool. There are a few more implementations out there. The developers of the first one are less than responsive:
https://github.com/mengyao/Complete-Striped-Smith-Waterman-Library/issues/55
But there have been multiple versions since I last tried to install it: