Support for 10X VDJ

Issue #145 resolved
Roy Jiang created an issue

Users are looking for ways to use Immcantation, specifically changeo, with 10X VDJ data (see the exchange with Aranda Clemente on immcantation@groups.google.com).

To create a pipeline, the following two steps need to be incorporated:

1. Merging filtered _annotations.csv with .fasta files by cell barcode during analysis
2. Performing light chain "cloning" in conjunction with heavy chain cloning offered by DefineClones.py

To address 1, I recommend first putting up a simple vignette on changeo.readthedocs.io showing a pandas one-liner that performs the required join, which should be sufficient for 90% of users. Later, I will incorporate a formal 10X option directly into MakeDb.py.
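A minimal sketch of what such a join might look like, using small in-memory tables in place of real files. The column names (SEQUENCE_ID, V_CALL, contig_id, barcode, umis) and the choice of the contig ID as the join key are assumptions for illustration, not the contents of the eventual vignette; check them against your own Change-O table and 10X annotation file.

```python
import pandas as pd

# Toy stand-in for a Change-O table built from the 10X contig FASTA.
# Column names are assumptions for illustration.
db = pd.DataFrame({
    "SEQUENCE_ID": ["AAAC-1_contig_1", "AAAC-1_contig_2", "TTTG-1_contig_1"],
    "V_CALL": ["IGHV3-23*01", "IGKV1-39*01", "IGHV1-69*01"],
})

# Toy stand-in for the 10X filtered annotations CSV (assumed columns).
# The last row has no matching sequence in db.
annotations = pd.DataFrame({
    "contig_id": ["AAAC-1_contig_1", "AAAC-1_contig_2", "TTTG-1_contig_1", "GGGA-1_contig_1"],
    "barcode": ["AAAC-1", "AAAC-1", "TTTG-1", "GGGA-1"],
    "umis": [12, 30, 7, 4],
})

# The join itself: keep every sequence in db and attach its 10X annotations.
merged = db.merge(annotations, how="left", left_on="SEQUENCE_ID", right_on="contig_id").drop(columns="contig_id")

print(merged)
```

A left join keeps every row of the sequence table even when an annotation is missing, so unmatched sequences show up with empty annotation fields instead of silently disappearing.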

To address 2, I will implement the methods used by Julian and myself in DefineClones.py, so that DefineClones.py can take a light chain file, paired with the heavy chain file by cell barcode, to do an initial heavy chain grouping.
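The methods themselves are not spelled out in this issue. As a rough illustration only, one way to use paired light chains is to subdivide each heavy chain clone by the light chain V and J gene found in the same cell. The sketch below assumes that approach; the column names (CELL, CLONE, V_GENE, J_GENE, CLONE_PAIRED) are hypothetical, and it is not the DefineClones.py implementation.

```python
import pandas as pd

# Toy heavy chain table with clone assignments (hypothetical columns).
heavy = pd.DataFrame({
    "CELL": ["c1", "c2", "c3", "c4"],
    "CLONE": ["1", "1", "1", "2"],
})

# Toy light chain table, one light chain per cell (hypothetical columns).
light = pd.DataFrame({
    "CELL": ["c1", "c2", "c3", "c4"],
    "V_GENE": ["IGKV1-39", "IGKV1-39", "IGLV2-14", "IGKV3-20"],
    "J_GENE": ["IGKJ1", "IGKJ1", "IGLJ2", "IGKJ4"],
})

# Pair heavy and light chains by cell barcode; the inner join keeps
# only cells that have both chains.
paired = heavy.merge(light, on="CELL", how="inner")

# Assumed rule: cells in the same heavy chain clone stay together only
# if they also share the light chain V and J gene.
paired["LIGHT_KEY"] = paired["V_GENE"] + "," + paired["J_GENE"]
subgroup = paired.groupby("CLONE")["LIGHT_KEY"].transform(
    lambda s: pd.factorize(s)[0] + 1
)
paired["CLONE_PAIRED"] = paired["CLONE"] + "_" + subgroup.astype(int).astype(str)

print(paired[["CELL", "CLONE", "CLONE_PAIRED"]])
```

Here cells c1 and c2 remain in one clone, while c3 is split off because its light chain differs.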

Comments (7)

  1. Roy Jiang reporter

    For 1: updated readthedocs to include new 10X documentation, which should be available to users by the next release.

  2. Roy Jiang reporter

    For 1: started a new branch (MakeDb 10X annotations) that updates MakeDb.py to accept -af and -ai arguments: -af takes the 10X annotation csv file, and -ai names the column of that file that corresponds to the record.sequence_ids.

  3. Roy Jiang reporter

    Added with the latest push. It is also running smoothly, so I am going to update the readthedocs again and merge the branch (and close it out in the next few hours unless there are issues). When are we doing the next release?

  4. Jason Vander Heiden

    Awesome. As for the release, I dunno.... It's kind of an art. Do we have sufficient feature additions and bug fixes to warrant it? Paging @kbhoehn. He's been doing a lot of stuff on BuildTrees lately, so see where he is on that. If he's ready, then it's probably a good time to do one with the 10X stuff, before the reviewers get to his manuscript.

    Might be worth double-checking argument names for consistency (i.e., what's used in BuildConsensus, ParseDb, ConvertDb, etc). Off the top of my head, I think -d file.tsv, --if ID_FIELD and --cf COPY_FIELD_LIST might be the "correct" argument definitions. I think. You'll have to check. I had a spreadsheet on the Dropbox share with the names of UI terms, but I'm sure it's outdated by now.

  5. Roy Jiang reporter

    For 1: updated MakeDb.py with a --10x option, so a single flag with the annotation file is sufficient to direct MakeDb to generate a Change-O/AIRR-compliant file with the added 10X annotations.

    For 2: updated the vignette and immcantation repo to include informal light chain correction scripts. These implement Julian's light chain cloning approach and also my own. The vignette and scripts should be sufficient to guide most users through the cloning protocol.
