CreateGermlines.py fails
Issue #187
resolved
Hi,
This might be the same as a previously reported issue, or maybe I just don’t understand the workflow.
I’ve processed my data according to the RACE + UMI vignette. I wanted to run spectral clustering using the vj method. For that, I need the germline_alignment_d_mask column. I ran the following command:
CreateGermlines.py -d my_sample_ph_parse-select.tsv \
-r ~/share/germlines/imgt/human/vdj/*IGH[DJ].fasta \
-g dmask --vf v_call \
--format airr --outname my_sample_ph
The IMGT data was downloaded according to the IgBLAST section of the tutorial. I get the following warning:
WARNING> Germline reference sequences do not appear to contain IMGT-numbering spacers.
Results may be incorrect.
None of the sequences get annotated. Any idea what could be wrong?
Best regards
Comments (3)
-
reporter Thanks! I just copied it from the intro lab and didn’t think twice. Should have paid more attention, sorry for bothering!
-
reporter - changed status to resolved
That old issue seems unrelated. It was an issue specifically with the format of novel germline sequences output by tigger.
I think you just have a typo in your command. The second line should be:
-r ~/share/germlines/imgt/human/vdj/*IGH[VDJ].fasta \
I.e., [DJ] → [VDJ]. You’re only passing the D and J germlines to the tool, and because those don’t contain IMGT numbering spacers (as intended; the spacers are a V property), CreateGermlines isn’t seeing any sequences with spacers.
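To see why the V germlines were dropped, it can help to check what each glob expands to. The sketch below uses a scratch directory with empty stand-in files; the file names are assumed for illustration and may differ from what is actually in your germline directory.

```shell
# Scratch directory with empty stand-in files (names are hypothetical).
tmpdir=$(mktemp -d)
touch "$tmpdir/imgt_human_IGHV.fasta" \
      "$tmpdir/imgt_human_IGHD.fasta" \
      "$tmpdir/imgt_human_IGHJ.fasta"

# [DJ] matches exactly one character, D or J, so the V file is skipped.
dj_matches=$(cd "$tmpdir" && echo *IGH[DJ].fasta)

# [VDJ] matches V, D, or J, so all three files are picked up.
vdj_matches=$(cd "$tmpdir" && echo *IGH[VDJ].fasta)

echo "[DJ]  expands to: $dj_matches"
echo "[VDJ] expands to: $vdj_matches"

# Clean up the scratch directory.
rm -r "$tmpdir"
```

Running the same `echo` with your real path in place of the scratch directory shows exactly which files the tool receives.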
You can also pass it the directory and it’ll load all the files in it, assuming you only have FASTA files in that directory.
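A minimal sketch of the directory form, reusing the path and options from the original command. The germline_dir variable and the guard around the call are additions for illustration only; the guard just keeps the snippet harmless on a machine where the tool or the directory is missing.

```shell
# Germline directory from the original command; adjust to your setup.
germline_dir="$HOME/share/germlines/imgt/human/vdj"

# Pass the directory itself to -r instead of a glob.
# The guard is illustrative: it skips the call if the tool or directory is absent.
if command -v CreateGermlines.py >/dev/null 2>&1 && [ -d "$germline_dir" ]; then
    CreateGermlines.py -d my_sample_ph_parse-select.tsv \
        -r "$germline_dir" \
        -g dmask --vf v_call \
        --format airr --outname my_sample_ph
else
    echo "CreateGermlines.py or $germline_dir not found; nothing was run"
fi
```

This avoids the glob entirely, so a mistyped character class cannot silently drop the V segment file.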