Long FASTA headers break IgBLAST parser?

Issue #28 resolved
Former user created an issue

Works: MakeDb.py igblast -o test/IG.igblast.fmt7 -s test/IG.fasta --outdir=test

Does not work (index error): MakeDb.py igblast -o test/IG_headerlen-gt65.igblast.fmt7 -s test/IG_headerlen-gt65.fasta --outdir=test

Actually, I'm not sure what the deal is here. If you diff the two fmt7 files, it doesn't look like the long headers should cause a problem. We use fmt3 as well, and there the issue is that IgBLAST truncates or splits long (>~65 characters) query headers, which is why I tried to break your code this way.
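
To check which records in a FASTA file might be affected by the fmt3 truncation described above, a small helper can list the long headers. This is only a sketch: the function name `long_headers` is made up for illustration, and the default limit of 65 characters is the approximate figure from this report, not a documented IgBLAST constant.

```python
def long_headers(lines, limit=65):
    """Return FASTA headers (without the leading '>') longer than `limit`.

    `lines` is any iterable of text lines, e.g. an open file handle.
    The limit of 65 is an assumption taken from the approximate value
    mentioned in this issue.
    """
    flagged = []
    for line in lines:
        if line.startswith(">"):
            header = line[1:].rstrip("\r\n")
            if len(header) > limit:
                flagged.append(header)
    return flagged


if __name__ == "__main__":
    import sys

    # Usage: python long_headers.py test/IG_headerlen-gt65.fasta
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as handle:
            for header in long_headers(handle):
                print(len(header), header)
```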

Comments (3)

  1. Namita Gupta

    The problem with the long header is actually that it isn't in proper pRESTO header format. After the last pipe, the header-parsing code looks for FIELDNAME=FIELD and doesn't find the equals sign, hence the index error. If you change the header to fit pRESTO formatting, it should work just fine... except now I see that even though I am parsing the header, the extra columns are not in the output!
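
To make the header rule concrete, here is a minimal sketch of a parser for pRESTO-style headers of the form ID|FIELD1=VALUE1|FIELD2=VALUE2. It is not the actual MakeDb.py code: the function name `parse_presto_header` and its error handling are assumptions for illustration. It shows why a pipe-delimited chunk without an equals sign cannot be split into a name and a value.

```python
def parse_presto_header(header, delimiter="|", separator="="):
    """Split 'ID|FIELD1=VALUE1|...' into a dict of annotations.

    Hypothetical helper for illustration only; not the parser that
    MakeDb.py uses internally.
    """
    text = header.strip()
    if text.startswith(">"):
        text = text[1:]
    parts = text.split(delimiter)
    fields = {"ID": parts[0]}
    for part in parts[1:]:
        if separator not in part:
            # A chunk such as 'extra_text' has no FIELDNAME=FIELD form,
            # so there is nothing to use as the value.
            raise ValueError(
                "annotation %r is not in FIELDNAME=FIELD form" % part
            )
        name, value = part.split(separator, 1)
        fields[name] = value
    return fields


if __name__ == "__main__":
    print(parse_presto_header(">seq1|PRIMER=IGHC|COUNT=3"))
    try:
        parse_presto_header(">seq1|PRIMER=IGHC|some_long_free_text")
    except ValueError as err:
        print("rejected:", err)
```

Running a check like this over a FASTA file before calling MakeDb.py would flag headers that need to be renamed.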
