getSegment automatically strips D from names

Issue #50 invalid
Roy Jiang created an issue

In Gene.R, lines 183-187, it looks like getSegment automatically strips the D from any gene name, i.e. IGKV3D-11 is treated as IGKV3-11. I think these are distinct V genes, and the stripping is the default behavior (getSegment is then used in getGene, and getGene is the default in distToNearest). In changeo, the D is not stripped, so this is one way in which the cloning in DefineClones.py differs from that in shazam.
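
To make the behavior concrete, here is a minimal Python sketch of what this kind of D stripping does to a gene name. It is not the code from Gene.R or changeo: the function name `strip_duplicate_d` and the regular expression are assumptions chosen for illustration only.

```python
import re

# Hypothetical pattern (an assumption, not the regex used in Gene.R):
# match a "D" that directly follows a digit and is followed by "-", "*",
# a comma, whitespace, or the end of the string. The "D" in the locus
# prefix of IGHD genes follows a letter, so it is left alone.
DUPLICATE_D = re.compile(r"(?<=\d)D(?=[-*,\s]|$)")


def strip_duplicate_d(gene_call: str) -> str:
    """Return gene_call with the duplicated-gene "D" marker removed."""
    return DUPLICATE_D.sub("", gene_call)


if __name__ == "__main__":
    for name in ["IGKV3D-11", "IGKV3-11", "IGHD3-10*01"]:
        print(name, "->", strip_duplicate_d(name))
```

With stripping applied, IGKV3D-11 and IGKV3-11 map to the same name and so fall into the same gene group; without it, they are kept apart.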

Comments (6)

  1. Jason Vander Heiden

    We should probably add the D stripping to changeo. In almost all cases, the D and non-D genes are identical at the sequence level (they indicate different genomic positions), so this should be captured by --act set in DefineClones.

    However, it is confusing. The D is, in general, an annoying thing about the IMGT annotation.

  2. Roy Jiang reporter

    Shouldn't we also have an option to turn off D stripping in shazam's distToNearest, though? I don't think there's a way to do that right now.

  3. Jason Vander Heiden

    I don't think there would be any benefit in distToNearest to removing the D stripping. I think it'd just make the data more sparse without improving the inference at all. We could test though.
