getSegment automatically strips D from names
In Gene.R, lines 183-187, it looks like getSegment automatically strips the D from any gene name, i.e. IGKV3D-11 is treated as IGKV3-11. I think these are distinct V genes, and stripping is the default behavior (getSegment is used by getGene, and getGene is the default in distToNearest). In changeo the D is not stripped, which is one way the cloning in DefineClones.py differs from that in shazam.
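To make the reported behavior concrete, here is a minimal Python sketch of what D stripping does to a gene call. It is not the alakazam code: the function name get_gene, its strip_d argument, and the regular expression are assumptions chosen for illustration only.

```python
import re


def get_gene(call, strip_d=True):
    """Return the gene-level name of the first call in an annotation string.

    Hypothetical helper: mimics the idea of gene extraction with optional
    D stripping, not the actual alakazam::getSegment implementation.
    """
    # Keep only the first call when several are listed (e.g. "A*01,B*01").
    first = call.split(",")[0].strip()
    # Drop the allele suffix ("*01") to get the gene name.
    gene = first.split("*")[0]
    if strip_d:
        # Assumed rule: remove a "D" that directly follows a digit, so the
        # duplicated-gene marker is dropped but the "D" of IGHD is kept.
        gene = re.sub(r"(?<=\d)D", "", gene)
    return gene


print(get_gene("IGKV3D-11*01"))                 # stripped name
print(get_gene("IGKV3D-11*01", strip_d=False))  # name kept as annotated
```

With stripping on, IGKV3D-11 and IGKV3-11 collapse to the same gene name, which is the behavior described above; with stripping off they stay distinct, which matches what the report says changeo does.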
Comments (6)
reporter We should also have an option to disable D stripping in shazam's distToNearest, though? I don't think there's a way to do that right now.
-
I don't think there would be any benefit in distToNearest to removing the D stripping. I think it'd just make the data more sparse without improving the inference at all. We could test though.
-
reporter -
assigned issue to
Added branch with fix. Decide later if merging is warranted.
-
assigned issue to
-
- marked as proposal
-
- changed status to invalid
Maybe a shazam or changeo issue, but alakazam::getSegment already has a strip_d flag.
We should probably add the D stripping to changeo. In almost all cases, the D and non-D genes are identical at the sequence level (they indicate different genomic positions), so this should be captured by --act set in DefineClones. However, it is confusing. The D is, in general, an annoying thing about the IMGT annotation.
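The point about --act set can be sketched as follows. This is not changeo's implementation: it assumes that set mode treats each ambiguous call as a set of genes and groups sequences that share any gene, and the helper names gene_set and group_by_shared_gene are hypothetical.

```python
def gene_set(call):
    """Return the set of gene names in a comma-separated allele call."""
    return {c.strip().split("*")[0] for c in call.split(",")}


def group_by_shared_gene(calls):
    """Group call indices whose gene sets overlap, directly or transitively.

    Illustrative only: an assumed reading of set-style grouping.
    """
    groups = []  # list of (genes, member indices)
    for i, call in enumerate(calls):
        genes, members = gene_set(call), [i]
        untouched = []
        for g_genes, g_members in groups:
            if g_genes & genes:
                # Overlap: merge the existing group into the current one.
                genes |= g_genes
                members = g_members + members
            else:
                untouched.append((g_genes, g_members))
        groups = untouched + [(genes, members)]
    return sorted(sorted(m) for _, m in groups)


calls = [
    "IGKV3-11*01,IGKV3D-11*01",  # tie between the D and non-D gene
    "IGKV3D-11*01",
    "IGKV3-11*01",
    "IGKV1-5*01",
]
print(group_by_shared_gene(calls))
```

Under these assumptions, the D and non-D calls end up in one group only because some sequence carries both names as a tie. If no annotation lists both, the two genes stay in separate groups, which is where explicit D stripping would still make a difference.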