Lowercase and uppercase letters in consensus fsa output?

Issue #61 closed
Panos Sapou created an issue

Hi there

I have noticed that in the fsa output file (which has the consensus of all reads mapped to the reference), the nucleotides are mostly in uppercase but sometimes appear in lowercase letters.

Example:

NW_020191326.1
ttggtcccaggtttggagcccaatgcccggaacgcgggagagaagaggctggctgccggc
gggtgcttgcaagcctgtactgggaaggaccgagtgagcttccacacCGGCGGAGCTCCA
GCTcTGCGCCCAGGCGCGATAGCaCAGAGCCCCGGCCGAGAGGCTGCTCTGTGCTGGGCG
AGCCtccccaagccctgcccagcttctagcgttcgcgcccgggaaggagcaggctgcggg

Does that mark positions where there are polymorphisms, low coverage, or something else? What does it mean exactly?

Thanks
P

Comments (3)

  1. ptlcc

    Hi Panos

    Lowercase bases signify low confidence at that position. This could be due to low depth or high variance at the position (for example, uncertainty between two different bases).

    This notation is also known as soft masking and is commonly used in repeat and complex regions with low confidence.

    Best,
    Philip
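
For anyone who wants to locate these low-confidence stretches programmatically, here is a minimal Python sketch. It assumes the fsa file is ordinary FASTA (header lines starting with ">", sequence possibly wrapped over several lines). The function names `parse_fasta` and `lowercase_runs` are made up for this illustration and are not part of the tool itself.

```python
import re
from typing import Iterable, Iterator, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines.

    Assumes headers start with '>'; the name is the first
    whitespace-delimited token of the header.
    """
    name = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def lowercase_runs(seq: str) -> List[Tuple[int, int]]:
    """Return 0-based, half-open (start, end) intervals of lowercase bases."""
    return [(m.start(), m.end()) for m in re.finditer(r"[a-z]+", seq)]


if __name__ == "__main__":
    # Toy record for illustration only, not real output from the tool.
    demo = [">seq1 example", "ttgAC", "gtA"]
    for record_name, sequence in parse_fasta(demo):
        print(record_name, lowercase_runs(sequence))
        # prints: seq1 [(0, 3), (5, 7)]
```

To run it on a real file, pass an open file handle instead of the demo list, e.g. `parse_fasta(open("consensus.fsa"))`, where the file name is a placeholder. The intervals only tell you where the lowercase bases are; whether a given stretch is caused by low depth or by disagreement between reads still has to be checked against the alignment.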
