Lowercase and uppercase letters in consensus fsa output?
Issue #61
closed
Hi there
I have noticed that in the fsa output file (which has the consensus of all reads mapped to the reference), the nucleotides are mostly in uppercase, but some are displayed in lowercase letters.
Example:
NW_020191326.1
ttggtcccaggtttggagcccaatgcccggaacgcgggagagaagaggctggctgccggc
gggtgcttgcaagcctgtactgggaaggaccgagtgagcttccacacCGGCGGAGCTCCA
GCTcTGCGCCCAGGCGCGATAGCaCAGAGCCCCGGCCGAGAGGCTGCTCTGTGCTGGGCG
AGCCtccccaagccctgcccagcttctagcgttcgcgcccgggaaggagcaggctgcggg
Does the lowercase mark positions with polymorphisms, low coverage, or something else? What does it mean exactly?
Thanks
P
Comments (3)
reporter Great! Thanks
-
reporter - changed status to closed
Hi Panos
A lowercase base signifies a low-confidence call. This can be caused by low depth or by high variance at the position (for example, uncertainty between two different bases).
This notation is also known as soft masking, and it is commonly used for repeats and complex regions where confidence is low.
Best,
Philip
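
For anyone who wants to list the low-confidence positions rather than spot them by eye, here is a minimal Python sketch. It is not part of the tool: the helper names `read_fasta` and `lowercase_runs` are made up for illustration, and it assumes the fsa file is ordinary FASTA with `>` header lines. It only reports where the lowercase runs are; it cannot tell whether a given run was caused by low depth or by disagreement between reads.

```python
import re


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines.

    Assumes standard FASTA: a '>' header line followed by sequence lines.
    """
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the record name.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def lowercase_runs(seq):
    """Return 0-based, half-open (start, end) intervals of lowercase bases."""
    return [(m.start(), m.end()) for m in re.finditer(r"[a-z]+", seq)]


# The start of the third sequence line from the example in the question.
print(lowercase_runs("GCTcTGCGCCCAGGCGCGATAGCaCAGAG"))  # [(3, 4), (23, 24)]
```

To scan a whole file, something like `for name, seq in read_fasta(open("consensus.fsa")): print(name, lowercase_runs(seq))` should work, where `consensus.fsa` is a placeholder for your own output file.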