IgBLAST indels

Issue #58 resolved
Namita Gupta created an issue

Currently, the parser discards records that contain indels because IgBLAST does not report where the indels occur. We could possibly infer the positions by getting IgBLAST to return the btop (BLAST traceback operations) string and parsing it for more information. Not sure how frequently indels occur.
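
To illustrate the idea of parsing the btop string, here is a minimal sketch, not the parser's actual implementation. It assumes the usual BLAST btop layout: a number is a run of matches, and a two-character token is a query/subject pair in which `-` marks a gap. The function name `btop_indels` and the labels "insertion" and "deletion" (both relative to the subject, i.e. germline, sequence) are hypothetical choices made for this example.

```python
import re

# A btop token is either a run of digits (matches) or a query/subject pair.
_BTOP_TOKEN = re.compile(r"\d+|[A-Za-z*-]{2}")


def btop_indels(btop, q_start=1):
    """List indels encoded in a btop string.

    Returns (kind, query_pos, base) tuples. query_pos is the 1-based query
    coordinate of the inserted base, or of the next query base after a
    deletion. q_start is the query start of the alignment.
    """
    events = []
    pos = q_start  # query coordinate of the next query base to consume
    for token in _BTOP_TOKEN.findall(btop):
        if token.isdigit():
            pos += int(token)  # run of matching bases
        elif token[0] == "-":
            # Gap in the query: the subject has a base the query lacks.
            events.append(("deletion", pos, token[1]))
        elif token[1] == "-":
            # Gap in the subject: the query has an extra base.
            events.append(("insertion", pos, token[0]))
            pos += 1
        else:
            pos += 1  # plain mismatch, consumes one query base
    return events


print(btop_indels("5-A3AG2T-4"))
```

For the made-up btop string above, the sketch reports one deletion before query position 6 and one insertion at query position 12.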

Comments (4)

  1. Former user Account Deleted

    I suspect the frequency varies with sample handling, quality, and prep (# PCR cycles required?)

    One anecdote: a recent clinical sample had a high rate of unproductive sequences (25-50%). Most "unproductive" calls were due to indels/out-of-frameness. Samples with low RIN scores looked worse on average. CONSCOUNT=1 mRNAs were more likely to be called unproductive.

    So that's at least one situation in which indels might be frequent... Now whether anyone would want to use those mRNAs, I'm not sure...

    And then sadly, in some datasets, the most interesting antibodies contain real indels (HIV neutralizing antibodies, for example).

  2. Jason Vander Heiden

    I imagine it'd be much more common in 454 data as well.

    @sonia_t are y'all deleting these sequences with indels from analyses or attempting to repair them?

  3. Former user Account Deleted

    This was a recent realization; we haven't really explored how to deal with it yet.

  4. Jason Vander Heiden

    Implemented in 08b57a2. Still needs a little more testing, but seems to be working well so far. Needs additional fields output from IgBLAST: -outfmt '7 std qseq sseq btop'.
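
As a rough illustration of why the extra `qseq` and `sseq` fields help, the sketch below locates gap runs by walking the two aligned strings column by column. This is not the code from the commit above. It assumes the aligned query and subject strings have already been extracted from the IgBLAST hit table and that gaps are written as `-`, as in standard BLAST output. The function name `gap_runs` and the "insertion"/"deletion" labels (relative to the subject) are hypothetical.

```python
def gap_runs(qseq, sseq):
    """Return gap runs as (kind, start_column, length) tuples.

    qseq and sseq are the aligned query and subject strings of equal length.
    Columns are 0-based alignment columns, not sequence coordinates.
    """
    if len(qseq) != len(sseq):
        raise ValueError("aligned sequences must have the same length")

    runs = []
    current = None  # [kind, start_column, length] of the run in progress
    for col, (q, s) in enumerate(zip(qseq, sseq)):
        if q == "-":
            kind = "deletion"   # query lacks a base present in the subject
        elif s == "-":
            kind = "insertion"  # query has a base absent from the subject
        else:
            kind = None

        if kind is not None and current is not None and current[0] == kind:
            current[2] += 1  # extend the run in progress
        else:
            if current is not None:
                runs.append(tuple(current))
            current = [kind, col, 1] if kind is not None else None

    if current is not None:
        runs.append(tuple(current))
    return runs


print(gap_runs("ACG-TACGT", "ACGTTA--T"))
```

Reporting runs rather than single columns keeps a multi-base indel as one event, which makes it easier to check whether the gap length is a multiple of three and therefore preserves the reading frame.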