ValueError: Missing one or more columns in heavy chain file: clone_id

Issue #195 resolved
Abdullah Khasawneh created an issue

Hello.

Thank you very much for all your efforts.

I have been trying to filter out cells associated with more than one heavy chain (because Shazam can’t run otherwise), but I keep getting this error from light_pars.py:

ValueError: Missing one or more columns in heavy chain file: clone_id

Is there something I can adjust in the previous steps to add the clone_id field/column to the tables?

Thank you.

Abdullah

Comments (4)

  1. ssnn

    Hi. I think you mean light_cluster.py, which is meant to be run after DefineClones.py or SCOPer has identified clonally related sequences and assigned them a clone_id. You can see an example here. Your pipeline may be missing the DefineClones.py or SCOPer step. If your pipeline includes the clone assignment step and you still get the message, please send example data that reproduces the error to immcantation@googlegroups.com and we will help you.
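A quick way to confirm whether the clone assignment step has run is to look at the header of the heavy chain table. The snippet below is only a minimal sketch, not part of light_cluster.py: the helper name `missing_columns` and the example table are hypothetical, and it assumes a tab-separated file with a single header row.

```python
import csv
import io

def missing_columns(tsv_text, required=("clone_id",)):
    """Return the required column names that are absent from the TSV header."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader, [])  # empty input yields an empty header
    return [name for name in required if name not in header]

# Hypothetical table that has not been through clone assignment yet.
example = "sequence_id\tcell_id\tlocus\nseq1\tcellA\tIGH\n"
print(missing_columns(example))
```

If the returned list contains clone_id, the table has most likely not been through DefineClones.py or SCOPer yet.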

  2. Abdullah Khasawneh reporter

    Hello!

    Thank you very much.

    I do mean light_cluster.py. I apologize for the misunderstanding.

    To run DefineClones.py, I need a value to input into the --dist parameter. If I’m not mistaken, I need shazam to get this value for each sample. However, shazam gives me this error:

    3 cell(s) with multiple heavy chains found. One heavy chain per cell is expected.
    

    I think I need light_cluster.py to remedy this, but it, in turn, cannot run before DefineClones.py.

    What should I do?

    Thank you.

    Abdullah

  3. ssnn

    Hi! light_cluster.py uses the light chain information to further split the clones previously identified using only the heavy chain information; it does not remedy the doublets. We are updating the single cell pipeline, and the update includes removing these cells. We are currently running tests, and the updated version will be available soon.

    You can try this script: https://bitbucket.org/kleinstein/immcantation/src/master/pipelines/singlecell-filter.R. It removes cells with multiple heavy chains and cells with only light chains. The script basically does what we show in the “Remove cells with multiple heavy chains” and “Remove cells without heavy chains” sections of the “10x Genomics V(D)J Sequence Analysis with Immcantation Tutorial”.
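The filtering idea described above can be sketched in a few lines of plain Python. This is an illustration of the logic only, not the code in singlecell-filter.R, which may differ in its details. It assumes each row carries the AIRR-style fields `cell_id` and `locus`, with heavy chains labelled `IGH`; the function name `filter_cells` and the example rows are hypothetical.

```python
from collections import Counter

def filter_cells(rows, heavy_locus="IGH"):
    """Keep only rows from cells that have exactly one heavy chain.

    Cells with several heavy chains and cells with only light chains
    are dropped together with all of their rows.
    """
    # Count heavy chain rows per cell; light-chain-only cells never appear here.
    heavy_counts = Counter(
        row["cell_id"] for row in rows if row["locus"] == heavy_locus
    )
    keep = {cell for cell, n in heavy_counts.items() if n == 1}
    return [row for row in rows if row["cell_id"] in keep]

# Hypothetical input: cell A is fine, B has two heavy chains, C has no heavy chain.
rows = [
    {"cell_id": "A", "locus": "IGH"},
    {"cell_id": "A", "locus": "IGK"},
    {"cell_id": "B", "locus": "IGH"},
    {"cell_id": "B", "locus": "IGH"},
    {"cell_id": "C", "locus": "IGL"},
]
filtered = filter_cells(rows)
print([(row["cell_id"], row["locus"]) for row in filtered])
```

Only the two rows from cell A survive, so every remaining cell has one heavy chain, which is the condition shazam expects.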
