
Overview

Some data analysis tools require a certain number of reads per read group in order to work properly. BQSR, for example, requires a minimum of around 6 million reads per read group. If some read groups are too small, you should use mergeReadGroups to merge them. Read groups should be merged according to library, sequencing run and sequencing lane, in that order. The more the read groups differ from one another, the less advisable it is to merge them, because programs like BQSR estimate parameters that are very specific to each read group. Read groups that have different post-mortem damage patterns should not be merged. For example, read groups that were treated for post-mortem damage (e.g. with uracil-DNA glycosylase) should not be merged with read groups that were not.
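
The selection logic described above can be sketched in Python. This is only an illustration and not part of ATLAS: the function name propose_merges and the dictionary keys id, reads, library and udg_treated are hypothetical, and the sketch assumes you have already collected read counts and metadata per read group yourself. It only covers the first level (library) and does not check whether the merged group actually reaches the threshold.

```python
from collections import defaultdict

# Approximate BQSR minimum quoted in the text above.
MIN_READS = 6_000_000


def propose_merges(read_groups, min_reads=MIN_READS):
    """Group undersized read groups by (library, udg_treated).

    read_groups: list of dicts with the hypothetical keys
    'id', 'reads', 'library' and 'udg_treated'.
    Returns {new_name: [read group ids]} for buckets with two or more members.
    """
    buckets = defaultdict(list)
    for rg in read_groups:
        if rg["reads"] < min_reads:
            # The treatment flag is part of the key, so treated and
            # untreated read groups never end up in the same bucket.
            buckets[(rg["library"], rg["udg_treated"])].append(rg["id"])

    merges = {}
    for (library, udg), ids in sorted(buckets.items()):
        if len(ids) >= 2:
            suffix = "UDG" if udg else "noUDG"
            merges[f"{library}_{suffix}_merged"] = sorted(ids)
    return merges
```

Read groups that are already large enough are left alone, and a small read group with no compatible partner is not merged with anything.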

Input

A BAM file
A .txt file, e.g. mergeTheseRGs.txt:

Each line of this file gives the name of the new, combined read group, followed by the names of all read groups that shall be renamed to this common read group (tab-separated).

Example: RG1to3 readGroup1 readGroup2 readGroup3
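
Because the fields must be separated by tabs, it can be safer to generate the file with a small script than to type it by hand. The following is a minimal sketch; write_merge_file is a hypothetical helper, not part of ATLAS.

```python
def write_merge_file(path, merges):
    """Write one tab-separated line per new read group.

    merges maps the new read group name to the list of read groups
    that should be merged into it.
    """
    with open(path, "w", encoding="utf-8") as handle:
        for new_name, members in merges.items():
            handle.write("\t".join([new_name, *members]) + "\n")


if __name__ == "__main__":
    # Reproduces the example line above.
    write_merge_file(
        "mergeTheseRGs.txt",
        {"RG1to3": ["readGroup1", "readGroup2", "readGroup3"]},
    )
```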

Output

A new BAM file with the suffix _mergedRGs.bam

Usage Example

./atlas task=mergeRG bam=example.bam readGroups=mergeTheseRGs.txt

Specific Commands

readGroups : Specify a .txt file in which each line contains the name of the new read group, followed by the names of the read groups that shall be merged to form this new read group (tab-separated).
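
Before running the task, the file can be checked with a short script. The sketch below is not part of ATLAS; parse_merge_file is a hypothetical helper. Its rules are assumptions: every line needs a new name plus at least one existing read group, a new name may not occur twice, and a read group may not be assigned to two different new groups. How ATLAS itself handles such cases is not stated here.

```python
def parse_merge_file(path):
    """Parse a tab-separated merge file into {new_name: [members]}.

    Raises ValueError on lines that break the assumed rules.
    """
    merges = {}
    assigned = {}  # existing read group -> new read group it was assigned to
    with open(path, encoding="utf-8") as handle:
        for line_no, raw in enumerate(handle, start=1):
            line = raw.rstrip("\r\n")
            if not line.strip():
                continue  # skip blank lines
            fields = line.split("\t")
            if len(fields) < 2 or any(not field for field in fields):
                raise ValueError(f"line {line_no}: expected tab-separated names")
            new_name, members = fields[0], fields[1:]
            if new_name in merges:
                raise ValueError(f"line {line_no}: duplicate new name {new_name!r}")
            for member in members:
                if member in assigned:
                    raise ValueError(
                        f"line {line_no}: {member!r} already assigned to "
                        f"{assigned[member]!r}"
                    )
                assigned[member] = new_name
            merges[new_name] = members
    return merges
```

A file that separates names with spaces instead of tabs is rejected by this check, since such a line contains only a single field.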

Engine Parameters

Engine parameters that are common to all tasks can be found here.