IMGT sequence/germline build in ConvertDb doesn't seem right

Issue #131 new
Jason Vander Heiden created an issue

Look into the behavior of the -r argument to ConvertDb-airr and ConvertDb-changeo. It doesn't seem correct, at a glance.

Comments (5)

  1. Jason Vander Heiden reporter

    Schedule for v0.4.3.

    And maybe move it out of ConvertDb-changeo and ConvertDb-airr into a separate subcommand. It's kind of clunky trying to combine the gapping with flexible (airr -> airr and airr -> changeo) I/O.

  2. Jason Vander Heiden reporter

    Maybe it would be better to make this a MakeDb-airr command, given the decision to switch the core format to the AIRR standard. The basic functionality would be to take in an AIRR Rearrangement file and do a little standardization within that scope. For now, I see this only entailing:

    1. Adding IMGT gaps to sequence_alignment and germline_alignment and modifying the appropriate germline start/end fields.
    2. Redefining productive and junction_aa based on the junction frame.
    3. Filling in positional fields required by CreateGermlines from the _cigar fields.
    4. Adding the locus field if needed.

    All as options, because the input AIRR file could come from any source, not just IgBLAST.
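For illustration, item 3 above (filling in positional fields from the _cigar fields) could look roughly like the sketch below. This is not code from changeo; the function name `cigar_to_positions` and the returned key names are hypothetical. It assumes 1-based inclusive coordinates, that a leading `S` run counts query bases before the alignment, and that a leading `N` run counts germline bases before the alignment, which may not hold for every tool that writes AIRR files.

```python
import re

# One CIGAR operation: a run length followed by an operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_to_positions(cigar):
    """Derive 1-based inclusive start/end positions from a CIGAR string.

    Hypothetical helper, not part of changeo. Assumes leading S runs are
    unaligned query bases and leading N/D runs are skipped germline bases.
    """
    q_pos = 0  # query bases consumed so far
    r_pos = 0  # germline (reference) bases consumed so far
    q_start = r_start = q_end = r_end = None

    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "M=X":
            # Aligned run: consumes both query and germline.
            if q_start is None:
                q_start, r_start = q_pos + 1, r_pos + 1
            q_pos += length
            r_pos += length
            q_end, r_end = q_pos, r_pos
        elif op in "IS":
            # Insertion or soft clip: consumes query only.
            q_pos += length
        elif op in "DN":
            # Deletion or skipped region: consumes germline only.
            r_pos += length
        # H and P consume neither sequence.

    return {
        "sequence_start": q_start,
        "sequence_end": q_end,
        "germline_start": r_start,
        "germline_end": r_end,
    }
```

For a V call, the returned values would presumably map onto fields like v_sequence_start, v_sequence_end, v_germline_start and v_germline_end, but that mapping is an assumption and would need checking against each tool's output.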

  3. Scott Christley

    Is this mainly dealing with different field names, or does new functionality need to be written? I would also vote for the IMGT gapped alignments to be put in new fields instead of overwriting the existing field content.

  4. Jason Vander Heiden reporter

    Some new functionality is needed, which is why it’s taking me so long to get around to it. The bottleneck is testing the existing IMGT-gapping methods on different outputs. The field renaming stuff and conversion from _length to _end is all already in ConvertDb.
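As a rough illustration of the _length to _end conversion mentioned above, the arithmetic is just an off-by-one adjustment. This is a sketch, not the ConvertDb implementation; the helper name `length_to_end` is hypothetical, and it assumes the start position is 1-based and the end position is inclusive.

```python
def length_to_end(start, length):
    """Return the 1-based inclusive end position for a start and length.

    Hypothetical helper, not taken from ConvertDb. Missing values are
    passed through as None rather than guessed.
    """
    if start is None or length is None:
        return None
    # A region starting at position 1 with length 10 ends at position 10.
    return int(start) + int(length) - 1
```

The reverse conversion under the same assumptions would be `end - start + 1`.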
