Manage special characters in the header annotations

Issue #84 resolved
ssnn created an issue

Primer names that contain characters used as delimiters between annotation blocks are problematic: they interfere with the split strategy currently used to parse annotations in the presto headers. One of our users ran into this when using CollapseSeq.py with primers that have names like >IGVH1|AB. One option to handle this situation is to modify presto.IO.readPrimerFile so that it replaces special characters (comma, |, =, etc.) when it reads in the file.
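A minimal sketch of the replacement idea, not the actual presto.IO.readPrimerFile code. The helper names `sanitize_primer_name` and `read_primers`, the delimiter set `|=,`, and the `_` replacement character are all assumptions made for illustration.

```python
import re

# Assumed delimiter set (annotation, field/value, and value-list separators).
DELIMITERS = "|=,"


def sanitize_primer_name(name, delimiters=DELIMITERS, replacement="_"):
    """Replace every delimiter character in a primer name (hypothetical helper)."""
    pattern = "[" + re.escape(delimiters) + "]"
    return re.sub(pattern, replacement, name)


def read_primers(lines):
    """Parse FASTA-style primer lines into {sanitized_name: sequence}.

    Stand-in for a primer file reader; it only shows where the
    sanitizing step would sit.
    """
    primers = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Clean the name once, at read time, before it reaches any header.
            name = sanitize_primer_name(line[1:])
            primers[name] = ""
        elif name is not None:
            primers[name] += line.upper()
    return primers


primers = read_primers([">IGVH1|AB", "ACGT", ">VH3=2", "ggcc"])
print(primers)  # {'IGVH1_AB': 'ACGT', 'VH3_2': 'GGCC'}
```

One caveat with this approach: two distinct names can collapse to the same sanitized name (for example VH3|2 and VH3=2 both become VH3_2), so a real implementation would probably need to detect and report such collisions.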

Comments (7)

  1. ssnn reporter

    I think “|” doesn’t work. I used this test data:

    @M00001:373:000000000-JFY4P:1:1101:8899:1007|SEQORIENT=RC|VPRIMER=VH3 a-space
    NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTCCCTGGCCCCAGTGGTCAAAGTATCCATCTGTTGGCGTACCAACCACTACCACTGCAGTATTTTCCGCCATTTTCGCACAGTAATACGTGGCGGTGTCCGAGGCCTTCAGACTGTTCCACTGCAGGTATGCGGTGTTGGTGGACTTGTCGACTG
    +
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGG
    @M00002:373:000000000-JFY4P:1:1101:8899:1007|SEQORIENT=RC|VPRIMER=VH3/2
    NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTCCCTGGCCCCAGTGGTCAAAGTATCCATCTGTTGGCGTACCAACCACTACCACTGCAGTATTTTCCGCCATTTTCGCACAGTAATACGTGGCGGTGTCCGAGGCCTTCAGACTGTTCCACTGCAGGTATGCGGTGTTGGTGGACTTGTCGACTG
    +
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGG
    @M00003:373:000000000-JFY4P:1:1101:8899:1007|SEQORIENT=RC|VPRIMER=VH3=2
    NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTCCCTGGCCCCAGTGGTCAAAGTATCCATCTGTTGGCGTACCAACCACTACCACTGCAGTATTTTCCGCCATTTTCGCACAGTAATACGTGGCGGTGTCCGAGGCCTTCAGACTGTTCCACTGCAGGTATGCGGTGTTGGTGGACTTGTCGACTG
    +
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGG
    @M00004:373:000000000-JFY4P:1:1101:8899:1007|SEQORIENT=RC|VPRIMER=VH3|2
    NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTCCCTGGCCCCAGTGGTCAAAGTATCCATCTGTTGGCGTACCAACCACTACCACTGCAGTATTTTCCGCCATTTTCGCACAGTAATACGTGGCGGTGTCCGAGGCCTTCAGACTGTTCCACTGCAGGTATGCGGTGTTGGTGGACTTGTCGACTG
    +
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGG
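For illustration only, the effect of a pipe inside a primer name can be reproduced with a naive header split. The function `parse_header` below is a hypothetical stand-in and not presto's actual parser; it assumes "|" separates annotation blocks and "=" separates a field name from its value.

```python
def parse_header(header, delimiter="|", separator="="):
    """Naively split a header into its ID and a dict of annotations."""
    fields = header.lstrip("@>").split(delimiter)
    annotations = {}
    for field in fields[1:]:
        # partition() splits on the first separator only.
        key, _, value = field.partition(separator)
        annotations[key] = value
    return fields[0], annotations


prefix = "@M00001:373:000000000-JFY4P:1:1101:8899:1007|SEQORIENT=RC|VPRIMER="
for primer in ["VH3 a-space", "VH3/2", "VH3=2", "VH3|2"]:
    _, annotations = parse_header(prefix + primer)
    print(repr(primer), "->", annotations)
```

Under these assumptions, the name VH3|2 is cut at the pipe: VPRIMER ends up as VH3 and a stray field named 2 appears with an empty value. The other three names survive this particular split, although a parser that splits on every "=" would also mishandle VH3=2.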
    

  2. Jason Vander Heiden

    Is this the CollapseSeq step? I think we need to test at the MaskPrimers step. Once the headers have misplaced pipes in them, then it’s too late.
