IgBlast incorrect output for Light Chain

Issue #165 resolved
Tommaso Andreani created an issue

Hello,

I have Sanger sequences for the heavy chain and light chain of a sample from our lab. I tried to reconstruct the BCR by installing Changeo and following all of the installation steps. For the heavy chain I can reconstruct the BCR and parse the output with MakeDb.py. As a control I also ran IgBlast from the website, and the results match perfectly.

However, for the light chain I am failing: the IgBlast output from Changeo differs from that of the IgBlastn web server, and MakeDb.py fails to parse the output. I have also explained the problem in this post, but I haven't received any feedback.

What could be the reason behind this failure? The FASTA sequences are as follows:

Heavy Chain
gaagtgcagttgttgcaatctgggggaggtttgatacagcctggggggtccctgagactctcctgtgcagcctctggattcacctttagcgactatgccatgagctgggtccgccaggctcctggaaaggggctggagtgcgtctccggtatcagtggttatggtgataccacctactacgcagactccgtgaggggccggttcaccatctccagagacaattccaagaacacattgtttctgcaaatgaacagtctgagagcAcgaggacacggccgtatattactgtgcgaaagattttgaccaatcgtgggagttactgcggggagatgcttttcatatctggggccaagggacattggtcaccgtctcttcag

Light Chain
caagttgtactgactcaatcgccctctgcctctgcctccccgggagcctcggtcaaactcacctgcagtctgagcagtgggcacagcacctacgccatcgcgtggcatcagcagtcgccagagatggcccctcgatttttgatgaaggttaacagtgatggcagccacaacaggggggacggggtccctcctcgcttctcaggctccagttctggggctgagcgctacctcaccatttccagcctccagtctgaggatgaggctgactattattgtcagacctggggcactggcattcctgtcttcggaactgggaccaaggtcaccgtcctaa

The IMGT FASTA sequences are the ones provided in the example files.

Any ideas or help? Please let me know if more information is needed.

Comments (9)

  1. Jason Vander Heiden

    I ran those two sequences through a test on my end and both passed in my environment, using the latest versions:

    docker run -it -v ~/sandbox:/data:z immcantation/suite:4.0.0 bash
    AssignGenes.py igblast -s test.fasta -b /usr/local/share/igblast --loci ig --organism human
    MakeDb.py igblast -i test_igblast.fmt7 -s test.fasta -r /usr/local/share/germlines/imgt/human/vdj --log db.log
    

    I’ve attached the input (test.fasta) and output (test_igblast_db-pass.tsv).

    Can you add the --log argument to the MakeDb step and see what kind of error you get for the failing sequence? It may be something simple such as missing the light chain reference sequences (which aren’t in that heavy chain only example bundle) or using an older version (I see v0.3 linked in the Biostars comment, and we are on v1.0 now).
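    One way to test the "missing light chain references" possibility is to count which loci appear in the reference FASTA headers. The following is only a minimal sketch: it assumes the headers contain IMGT-style gene names such as IGHV, IGKV or IGLV, and the function name `count_loci` and the example records are hypothetical placeholders, not real reference data.

```python
import re
from collections import Counter

# Assumption: reference headers carry IMGT-style gene names (IGHV..., IGKJ..., IGLV...).
GENE_PATTERN = re.compile(r"IG[HKL][VDJ]")


def count_loci(fasta_lines):
    """Count FASTA header lines per locus (IGH, IGK, IGL)."""
    counts = Counter()
    for line in fasta_lines:
        if line.startswith(">"):
            match = GENE_PATTERN.search(line)
            if match:
                # Keep only the locus part, e.g. "IGH" from "IGHV".
                counts[match.group()[:3]] += 1
    return counts


# Hypothetical heavy-chain-only reference set, for illustration only.
example = [
    ">example1|IGHV1-1*01",
    "acgtacgt",
    ">example2|IGHJ1*01",
    "acgtacgt",
]

counts = count_loci(example)
missing = [locus for locus in ("IGH", "IGK", "IGL") if counts[locus] == 0]
print("missing loci:", missing)
```

    For real data, the lines would come from the reference FASTA files passed to MakeDb.py with -r. A non-empty list of missing loci that includes IGK or IGL would be consistent with a heavy-chain-only reference set.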

  2. Jason Vander Heiden

    It looks like there are a couple errors in this output:

    1. The -germline_db_J argument to igblastn is incorrect. Based on the # Database: header in the igblast output, it looks like it’s pointing to the V gene references rather than the J gene references. This causes the parsing to fail, because V alleles are assigned to the J segment calls.
    2. I think the other problem is that your igblast reference database does not include the light chain reference data. I can’t say for sure, but that’s the impression I get because your light chain sequences are aligning against heavy chain references.
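    The first point can be checked by reading the # Database: header from the igblast output and looking for a database that is listed more than once. This is a rough sketch under stated assumptions: it assumes the header lists the database paths separated by whitespace (so paths containing spaces would break it), and the function names and the example header lines are hypothetical, not taken from the reporter's output.

```python
from collections import Counter

HEADER_PREFIX = "# Database:"


def database_paths(fmt7_lines):
    """Return the database paths from the first '# Database:' header line."""
    for line in fmt7_lines:
        if line.startswith(HEADER_PREFIX):
            # Assumption: paths are whitespace-separated and contain no spaces.
            return line[len(HEADER_PREFIX):].split()
    return []


def repeated_databases(paths):
    """Return paths listed more than once, e.g. a V database also passed as J."""
    counts = Counter(paths)
    return [path for path, n in counts.items() if n > 1]


# Hypothetical header in which the V database was also given as the J database.
example = [
    "# IGBLASTN",
    "# Query: light_chain",
    "# Database: db/example_ig_v db/example_ig_d db/example_ig_v",
]

paths = database_paths(example)
print("databases:", paths)
print("repeated:", repeated_databases(paths))
```

    A repeated entry only flags the specific mistake described above. An empty result does not prove that the databases are correct.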

    The fix should be to rebuild your igblast database, correct the J database argument, and pass the full heavy/light VDJ reference set to MakeDb. Instructions for rebuilding the igblast database can be found here:

    https://changeo.readthedocs.io/en/stable/examples/igblast.html#configuring-igblast

    Let us know if that fixes the issue for you.

    PS: It also looks like you’re using an older version of changeo. I don’t think you’ll need to upgrade to fix this problem, but you might want to upgrade at some point, unless you need to stick to this older version for consistency with other analyses.

  3. Tommaso Andreani reporter

    Thanks for the reply. I installed the latest version of the immcantation docker image, and it works perfectly now.
