ATLAS / Auxiliary BAM Tools: BAMUpdateQualities

Overview

After the base quality scores have been recalibrated with either BQSR or recal, this task writes a BAM file containing the newly estimated quality scores. By default, a recalibrated quality higher than 93 or lower than 0 is replaced by the respective limit, because 0 to 93 is the range of qualities that can be encoded as printable ASCII characters (Phred+33). However, some tools, such as GATK, do not accept quality scores higher than 42 (the maximum quality that Illumina will output). If you plan to use the recalibrated BAM file with GATK, use the option maxOutQuality=42.
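The default limits follow from how qualities are stored as text: each Phred score is written as the ASCII character with code score + 33, and printable characters end at code 126. The following is a minimal Python sketch of that encoding and of the effect of an upper cap. The function name `encode_qualities` and its parameters are hypothetical and only illustrate the idea; this is not ATLAS code.

```python
PHRED_OFFSET = 33  # Phred+33 encoding: quality 0 maps to '!', quality 93 maps to '~'


def encode_qualities(quals, min_out=0, max_out=93):
    """Clamp each quality to [min_out, max_out] and encode it as Phred+33 text.

    Hypothetical helper for illustration; min_out and max_out mirror the
    roles of minOutQuality and maxOutQuality described in this page.
    """
    clamped = [max(min_out, min(max_out, q)) for q in quals]
    return "".join(chr(q + PHRED_OFFSET) for q in clamped)


recalibrated = [-2, 0, 30, 42, 60, 93, 120]

# Default limits: values below 0 or above 93 are replaced by the limit.
print(encode_qualities(recalibrated))              # !!?K]~~

# Cap at 42, as suggested for GATK-compatible output.
print(encode_qualities(recalibrated, max_out=42))  # !!?KKKK
```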

By default, ATLAS filters out reads that are unmapped, flagged as having failed quality control, secondary alignments, supplementary alignments, or duplicates.
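These five categories correspond to bits of the FLAG field defined in the SAM specification. As a rough sketch of what such a filter amounts to, the Python snippet below tests a read's flag against those bits. The bit values come from the SAM specification; the names `FILTER_MASK` and `is_filtered` are hypothetical, and whether ATLAS implements its filter this way is an assumption.

```python
# FLAG bits as defined in the SAM specification.
UNMAPPED = 0x4
SECONDARY = 0x100
QC_FAIL = 0x200
DUPLICATE = 0x400
SUPPLEMENTARY = 0x800

# Hypothetical mask combining the categories listed above.
FILTER_MASK = UNMAPPED | SECONDARY | QC_FAIL | DUPLICATE | SUPPLEMENTARY


def is_filtered(flag):
    """Return True if any of the filtered FLAG bits is set."""
    return (flag & FILTER_MASK) != 0


print(is_filtered(99))    # False: paired, properly paired, mate reverse, first in pair
print(is_filtered(1024))  # True: duplicate
print(is_filtered(2048))  # True: supplementary alignment
```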

Input

  • A BAM file
  • All the recalibration outputs from BQSR or recal that should be taken into account
  • The post-mortem damage (PMD) pattern file produced by estimatePMD

Output

  • The recalibrated BAM file

Usage Example

./atlas task=BAMUpdateQualities bam=example.bam recal=example_recalibrationEM.txt pmdFile=example_PMD_input_Empiric.txt fasta=example.fasta withPMD maxOutQuality=42 verbose

or with BQSR recalibration files:

./atlas task=BAMUpdateQualities bam=example.bam BQSRQuality=example_BQSR_ReadGroup_Quality_Table.txt BQSRPosition=example_BQSR_ReadGroup_Position_Table.txt BQSRPositionReverse=example_BQSR_ReadGroup_Position_Reverse_Table.txt BQSRContext=example_BQSR_ReadGroup_Context_Table.txt pmdFile=example_PMD_input_Empiric.txt fasta=example.fasta withPMD verbose

Specific Arguments

  • withPMD : pass this argument if post-mortem damage (PMD) should be reflected in the new quality scores. If the called base is a T and the reference is a C, or the called base is an A and the reference is a G, the recalibrated error rate will be calculated as:
\begin{equation*} (1 - D)\epsilon + (1-\epsilon)D, \end{equation*}

where \(D\) is the respective probability of post-mortem damage and \(\epsilon\) is the recalibrated probability of sequencing error. We neglect the fact that differences between read and reference may be real variants.
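As a worked illustration of this formula, the Python sketch below combines a sequencing error rate with a damage probability and converts the result back to the Phred scale using the standard relation \(Q = -10 \log_{10} \epsilon\). The function names and the example values (Q30, \(D = 0.05\)) are assumptions chosen for illustration; how ATLAS rounds the resulting quality is not covered here.

```python
import math


def phred_to_error(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def error_to_phred(eps):
    """Convert an error probability to a (non-rounded) Phred quality score."""
    return -10 * math.log10(eps)


def error_with_pmd(eps, damage):
    """Combine sequencing error eps and damage probability D:
    (1 - D) * eps + (1 - eps) * D."""
    return (1 - damage) * eps + (1 - eps) * damage


eps = phred_to_error(30)               # recalibrated error rate of 0.001
combined = error_with_pmd(eps, 0.05)   # 0.95 * 0.001 + 0.999 * 0.05

print(round(combined, 4))                  # 0.0509
print(round(error_to_phred(combined), 1))  # 12.9
```

In this example a base with recalibrated quality 30 drops to a quality of roughly 13 once a 5% damage probability is taken into account, which shows how strongly PMD can dominate the error rate at affected positions.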

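  • minQual : minimum quality a base must have in the original BAM file to be recalibrated. If the original quality score is lower than this, the base is not recalibrated and its quality is written out unchanged.
  • maxQual : maximum quality a base may have in the original BAM file to be recalibrated. If the original quality score is higher than this, the base is not recalibrated and its quality is written out unchanged.
  • minOutQuality : lower limit for output qualities. If a recalibrated quality score is lower than this threshold, the quality written to the recalibrated BAM file is set to the threshold.
  • maxOutQuality : upper limit for output qualities. If a recalibrated quality score is higher than this threshold, the quality written to the recalibrated BAM file is set to the threshold.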
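The interplay of the four thresholds can be sketched in a few lines of Python: minQual and maxQual decide whether a base is recalibrated at all, while minOutQuality and maxOutQuality limit the recalibrated value. The function `update_quality`, the placeholder `recalibrate` callback, and the default values for `min_qual` and `max_qual` are hypothetical; this is a sketch of the behaviour described above, not the ATLAS implementation.

```python
def update_quality(original_q, recalibrate,
                   min_qual=0, max_qual=93, min_out=0, max_out=93):
    """Return the quality to write for one base.

    recalibrate is a placeholder callable standing in for the BQSR or
    recal model; the min_qual and max_qual defaults are assumptions.
    """
    # Bases outside [min_qual, max_qual] are not recalibrated and are
    # written out with their original quality.
    if original_q < min_qual or original_q > max_qual:
        return original_q
    new_q = recalibrate(original_q)
    # Recalibrated qualities are limited to [min_out, max_out].
    return max(min_out, min(max_out, new_q))


def shift_up(q):
    """Toy recalibration model: raise every quality by 15."""
    return q + 15


print(update_quality(30, shift_up))               # 45
print(update_quality(30, shift_up, max_out=42))   # 42
print(update_quality(5, shift_up, min_qual=10))   # 5
```

Note that in this sketch a base skipped because of minQual or maxQual is returned as is, without being clamped to the output limits, which follows the wording "output as is" but remains an assumption about the actual behaviour.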

Engine Parameters

Engine parameters that are common to all tasks can be found here.
