Does CreateGermlines work with IgBLAST results?
We need to test whether the IgBLAST output works with CreateGermlines, and fix it if it doesn't (it probably doesn't).
Comments (7)
-
reporter -
reporter - marked as enhancement
-
reporter - changed status to resolved
I made several changes to CreateGermlines and MakeDb. It seems to be working now with the SEQUENCE_VDJ column, but it needs more testing. I discovered another fantastic "feature" of IgBLAST in the process... It allows the end/start positions for V/D, D/J and V/J to overlap, with reasonably high frequency. That was the main problem, though there were a few smaller problems as well.
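To illustrate the overlap problem, here is a minimal sketch of one way to reconcile two adjacent segments whose reported positions overlap. It is not the actual MakeDb code: the function name, the 1-based inclusive coordinates, and the choice to leave the shared bases with the upstream segment are all assumptions made for the example.

```python
def resolve_overlap(upstream_end, downstream_start):
    """Return (overlap, adjusted_start) for two adjacent segments.

    Coordinates are assumed to be 1-based and inclusive. Any bases claimed
    by both segments are left with the upstream segment (an arbitrary
    choice for this sketch), so the downstream start is pushed forward.
    """
    overlap = max(0, upstream_end - downstream_start + 1)
    return overlap, downstream_start + overlap


# Hypothetical coordinates: V ends at 290 but D is reported to start at 288,
# so positions 288-290 are claimed by both segments.
overlap, d_start = resolve_overlap(290, 288)
print(overlap, d_start)
```

Whatever rule is used, the point is that segment lengths computed from the raw start/end positions will double-count the shared bases unless the overlap is removed first.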
Let me know if it isn't working correctly... Reopen the issue and I'll try to fix it more better.
-
reporter - changed status to open
Doesn't work if there are gaps in the alignment. I'm guessing this will require the following to fix:
- Require addition of the btop column to the IgBLAST output.
- Autodetect columns in the hit table. Maybe via the comment string?
- Parse BTOP string and adjust SEQUENCE_VDJ to include gaps/deletions during MakeDb step.
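The BTOP step in the list above could be sketched roughly as follows. This assumes the usual BLAST traceback encoding (a number means that many matching positions, and a two-character pair gives the query character followed by the subject character, with '-' marking a gap) and that the input is the ungapped aligned portion of the query. The function name is hypothetical and this is not the MakeDb implementation.

```python
import re


def gap_query_from_btop(query, btop):
    """Rebuild the gapped query from its ungapped aligned region and a BTOP string."""
    gapped = []
    pos = 0
    # Each token is either a run of digits (matches) or a query/subject pair.
    for count, pair in re.findall(r"(\d+)|([A-Za-z*-]{2})", btop):
        if count:
            n = int(count)
            gapped.append(query[pos:pos + n])
            pos += n
        elif pair[0] == "-":
            # Gap in the query: emit a gap character, consume nothing.
            gapped.append("-")
        else:
            # Mismatch, or gap in the subject: the query base is kept.
            gapped.append(query[pos])
            pos += 1
    if pos != len(query):
        raise ValueError("BTOP string does not span the query sequence")
    return "".join(gapped)


# Hypothetical example: 3 matches, a query gap, 2 matches, a mismatch, 2 matches.
print(gap_query_from_btop("ACGTTGCA", "3-A2GC2"))
```

Only gaps in the query change the output; positions where the subject is gapped keep the query base, so the result has the same length as the alignment minus the subject-gap columns.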
-
- changed status to resolved
This needs a minor change to how IgBLAST is run: use -outfmt '7 std qseq'. MakeDb igblast will then take the gapped query sequence for SEQUENCE_VDJ, and CreateGermlines seems to work.
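For anyone reading the hit table by hand, here is a rough sketch of picking up the column names from the "# Fields:" comment line that format 7 writes, so the query sequence column can be found by name rather than by position. It is not how MakeDb does it. The function name and sample rows are hypothetical, the field list is abbreviated, and the handling of an extra leading segment-type column is an assumption about the IgBLAST layout.

```python
def parse_hit_table(lines):
    """Map each hit row to a dict keyed by the names on the '# Fields:' line."""
    fields = None
    hits = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("# Fields:"):
            fields = [f.strip() for f in line[len("# Fields:"):].split(",")]
        elif line and not line.startswith("#") and fields:
            values = line.split("\t")
            names = fields
            # Assumption: IgBLAST rows may carry one extra leading column
            # (V/D/J) that is not named on the Fields line.
            if len(values) == len(fields) + 1:
                names = ["segment"] + fields
            hits.append(dict(zip(names, values)))
    return hits


# Hypothetical, abbreviated sample; real output has the full 'std' column set.
sample = [
    "# Fields: query id, subject id, % identity, query seq",
    "V\tseq1\tIGHV1-2*02\t98.5\tACG-TT",
]
for hit in parse_hit_table(sample):
    print(hit["segment"], hit["query seq"])
```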
-
Thanks for figuring this out. We'll update our igblast command and use CreateGermlines with getSeqDistance or calcObservedMutations as you suggested and report back.
-
reporter You could also use presto.Sequence.scoreSeqPair() for a Python option (by picking the right ignore_chars and score_dict arguments). shm::calcDBObservedMutations() has the most options though.
This also means you should be able to build lineage trees from the IgBLAST output, which was previously not possible due to the germline requirement.
Unsurprisingly, it doesn't work. From @sonia_t :
"in case it's useful, I DID try CreateGermlines with MakeDb-parsed igblast, using this command:
CreateGermlines.py -d IGHV_igblast_db-pass.tab --sf SEQUENCE_VDJ --failed --log CG_try3.log --outdir try3 -g vonly -r ./germlines
all sequences fail with this error (regardless of -g option)
ERROR> Germline sequence is 131 nucleotides longer than input sequence
Where the number of nucleotides (131 in this case) seems to be V_SEQ_START - 1"