C region in the igblast output breaks MakeDb.py

Issue #188 resolved
Tartu Immunology created an issue

Hi all,

I need to get the C genes for my sequences. To do that, I ran igblastn with the -c_region_db param. This adds a Top C gene match column to the summary, which results in the following error when calling MakeDb.py:

KeyError: 'Top C gene match'

I want to propagate this C gene call all the way through to downstream analysis with other Immcantation tools. What are my options here? I figure it’d involve modifying IgBLASTReader. Are there other classes or functions you can think of?
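
For illustration, here is a minimal sketch of the kind of tolerant parsing that would avoid the KeyError. It is not the actual IgBLASTReader code: `SUMMARY_FIELDS` and `parse_summary` are hypothetical names, and the header layout (a comma-separated field list in the last pair of parentheses, followed by one tab-delimited line of values) is an assumption about the igblast summary section. Fields are looked up by name, so an extra or missing Top C gene match column does not break anything.

```python
import re

# Hypothetical mapping from igblast summary field names to output keys.
# "Top C gene match" is assumed to appear only when -c_region_db is used.
SUMMARY_FIELDS = {
    "Top V gene match": "v_call",
    "Top D gene match": "d_call",
    "Top J gene match": "j_call",
    "Top C gene match": "c_call",
}

def parse_summary(header, values):
    """Map a summary header line and its value line to a dict of calls."""
    # Take the last parenthesized group on the line, which skips the
    # "(D)" inside "V-(D)-J" and captures the field list.
    match = re.search(r"\(([^()]*)\)\.?\s*$", header)
    if match is None:
        raise ValueError("no field list found in summary header")
    names = [name.strip() for name in match.group(1).split(",")]
    fields = values.rstrip("\n").split("\t")
    row = dict(zip(names, fields))
    # .get() returns None for absent columns instead of raising KeyError.
    return {key: row.get(name) for name, key in SUMMARY_FIELDS.items()}
```

With a header that lists Top C gene match, `c_call` is filled in; with output from a run without -c_region_db, it is simply `None`.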

Best

Comments (7)

  1. Tartu Immunology reporter

    I’ve been able to circumvent the issue by adding a couple of lines of code on my branch. But I’m not all that familiar with the codebase. Do you have any suggestions for making a robust fix?

  2. Tartu Immunology reporter
    • marked as bug

    I’m inclined to call this a bug 😄 The documentation says you can make a custom igblast call, but there’s a crash if you do so. Unless I missed a way to run it crash-free?

  3. Jason Vander Heiden

    Let me take a look and get back to you. The C-region alignment is a new igblast feature and I haven’t taken a look at it yet. But, you’re looking in the right spot - the primary work will be in IgBLASTReader. However, there’s some very confusing code that passes data through the Receptor object to the MakeDb output to deal with supporting both the AIRR and legacy Change-O formats. That needs to be cleaned up, but that means a lot of testing work and hasn’t been done yet. The issue may be hiding somewhere in that tangle.

    Do you have some example input you could share that we could use to test?

  4. Tartu Immunology reporter

    Thanks, Jason! I only have unpublished patient data now. But I think it fails whenever there’s a C hit in the igblast output. Let me know if you could use help with the feature, I’d be glad to get involved.

  5. Jason Vander Heiden

    I pushed some changes to both the immcantation repo and the changeo repo (in the makedb-airr branch):

    1. Updated igblast version in the suite:devel docker container to v1.18.0 (which has the new C region alignment feature).
    2. Added build of C region database to imgt2igblast.sh.
    3. Added support for the C region database to AssignGenes-igblast.
    4. Added c_call support to IgBLASTReader (basically your fork with a couple of extra things: cd79aa6).
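
As a rough illustration of item 4, the sketch below shows one way an optional C call could be carried into either output format. It is not the code in cd79aa6: `COLUMNS` and `write_records` are hypothetical, records are plain dicts rather than Receptor objects, and the column names (`c_call` for AIRR, `C_CALL` for legacy Change-O) are assumptions. The C column is written only when at least one record has a C call, so output from runs without a C region database keeps its existing layout.

```python
import csv
import io

# Hypothetical per-format column names; the c_call / C_CALL names are assumed.
COLUMNS = {
    "airr": {"sequence_id": "sequence_id", "v_call": "v_call",
             "j_call": "j_call", "c_call": "c_call"},
    "changeo": {"sequence_id": "SEQUENCE_ID", "v_call": "V_CALL",
                "j_call": "J_CALL", "c_call": "C_CALL"},
}

def write_records(records, fmt="airr"):
    """Return records as tab-delimited text in the requested format."""
    columns = COLUMNS[fmt]
    # Include the optional C column only if some record carries a C call.
    has_c = any(rec.get("c_call") for rec in records)
    keys = [key for key in columns if key != "c_call" or has_c]
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow([columns[key] for key in keys])
    for rec in records:
        # Missing or empty values are written as empty strings.
        writer.writerow([rec.get(key) or "" for key in keys])
    return buffer.getvalue()
```

For example, `write_records([{"sequence_id": "s1", "v_call": "IGHV1-2*02", "j_call": "IGHJ4*02", "c_call": "IGHM*01"}], fmt="changeo")` would produce a header line that ends with `C_CALL`.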

    Seems to work, but I’ve got a few things to do still and I barely tested it at all. Still need to add parsing of the position/scoring fields for the C region alignment and make sure it broke neither igblast <= v1.17 support nor cellranger support.

    I could use some help testing it, if you’re game. I’ll try to wrap it up early next week.
