ATLAS / Engine Parameters
General parameters
- bam : input BAM file
- fasta : input FASTA reference file. This needs to be the reference used to create the BAM file.
- out : prefix for output files. Default = BAM prefix
- logFile : write status report to a file, the name of which is specified via this argument. May be used in conjunction with verbose and suppressWarnings.
- silent : do not print status report on screen.
- suppressWarnings : do not print warning messages.
- fixedSeed : set the seed of the random generator.
- addToSeed : useful if you launch several jobs at the same time on a computer cluster and do not want them to use the same seed. By default, the random generator derives its seed from the time of day. With addToSeed you can add a value, such as the job ID, to this time-based seed.
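The idea behind addToSeed can be sketched as follows. This is a minimal illustration, not ATLAS code: the function `derive_seed` and its `now` argument are hypothetical, and how ATLAS actually combines the time and the offset is an assumption (simple addition).

```python
import time

def derive_seed(add_to_seed=0, now=None):
    """Return a time-based seed shifted by a per-job offset (hypothetical helper)."""
    # Assumption: the base seed is the current time in whole seconds.
    base = int(time.time()) if now is None else int(now)
    return base + add_to_seed

# Two jobs started in the same second still receive different seeds
# when each passes its own job ID as the offset.
seed_a = derive_seed(add_to_seed=101, now=1_700_000_000)
seed_b = derive_seed(add_to_seed=102, now=1_700_000_000)
```

Without the offset, both jobs in this example would draw the same random numbers.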
Input Filters
Default behavior | Parameter to change it |
keep reads from all read groups | readGroup=readGroupToKeep1,readGroupToKeep2 |
ignore improper pairs | keepImproperPairs |
ignore unmapped reads | keepUnmappedReads |
ignore reads that failed quality control (QC) | keepFailedQC |
ignore secondary alignments | keepSecondary |
ignore supplementary alignments | keepSupplementary |
ignore duplicates | keepDuplicates |
keep alignments with soft-clipped bases | filterSoftClips |
keep forward and reverse alignments | keepOnlyFwd / keepOnlyRev |
keep first and second mates | keepOnlyFirst / keepOnlySecond |
do not filter based on fragment length | set minFragmentLength and maxFragmentLength |
do not filter based on mapping quality (MQ) | set minMQ and maxMQ |
ignore reads longer than the fragment (insert size) | keepReadsLongerThanFragment |
The parameter keepAllReads will set the filters such that all reads are kept.
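Several of the default filters correspond to flag bits defined in the SAM specification. The sketch below shows one way such a check could look. The function `passes_default_filters` is hypothetical, and the mapping from each ATLAS switch to a SAM flag bit is an assumption, not something this page confirms.

```python
# SAM flag bits as defined in the SAM specification
PAIRED, PROPER_PAIR, UNMAPPED = 0x1, 0x2, 0x4
SECONDARY, QC_FAIL, DUPLICATE, SUPPLEMENTARY = 0x100, 0x200, 0x400, 0x800

def passes_default_filters(flag, keep=frozenset()):
    """Return True if a read with this SAM flag survives the flag-based filters.

    `keep` holds the names of switches that disable a filter (hypothetical API).
    """
    # Assumption: an improper pair is a paired read without the proper-pair bit.
    improper = bool(flag & PAIRED) and not (flag & PROPER_PAIR)
    checks = {
        "keepImproperPairs": improper,
        "keepUnmappedReads": bool(flag & UNMAPPED),
        "keepFailedQC": bool(flag & QC_FAIL),
        "keepSecondary": bool(flag & SECONDARY),
        "keepSupplementary": bool(flag & SUPPLEMENTARY),
        "keepDuplicates": bool(flag & DUPLICATE),
    }
    # A read is dropped if any filter hits and its switch is not set.
    return not any(hit and switch not in keep for switch, hit in checks.items())
```

For example, `passes_default_filters(1123)` rejects a duplicate read, while `passes_default_filters(1123, keep={"keepDuplicates"})` accepts it.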
Output filters
Quality scores
- minOutQual: minimum quality score printed. Any base that has a smaller quality will be set to 'N'. Default = 1
- maxOutQual: maximum quality score printed. Any base that has a larger quality will be set to 'N'. Default = 93
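The effect of these two bounds can be sketched as below. The function `mask_by_quality` is a hypothetical illustration that assumes both bounds are inclusive, which this page does not state explicitly.

```python
def mask_by_quality(bases, quals, min_out_qual=1, max_out_qual=93):
    """Replace bases whose quality lies outside [min_out_qual, max_out_qual] with 'N'."""
    return "".join(
        base if min_out_qual <= qual <= max_out_qual else "N"
        for base, qual in zip(bases, quals)
    )

# Qualities 0 and 94 fall outside the default range and are masked.
masked = mask_by_quality("ACGT", [0, 20, 94, 93])
```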
Parameters available for tasks that parse reads and organize them by genomic windows
Many functionalities require the sequencing data to be organized into non-overlapping windows, which are made up of sites. Each site knows which bases are covering it. These are the parameters that can be set for these functionalities:
- chr : vector of chromosomes to be read. Example: 1,2,3
- limitChr : read all chromosomes until the specified chromosome (this parameter is ignored if chr is used)
- limitWindows : limit the reading of the BAM file to the first N windows on each chromosome
- skipWindows: skip first N windows on each chromosome. Default = 1000000000
- window : with this parameter you can either (1) specify the window size in base pairs if you want to go through the whole genome (default window size = 1'000'000), or (2) provide a BED file containing the coordinates of custom windows to be taken into account.
- regions : specify positions/regions to be considered with a 0-based BED file (inverse of masking)
- mask : input BED file listing the sites that should be masked. This can be provided as a compressed file, in which case the filename should contain ".gz", or uncompressed.
- maxMissing : specify the maximum fraction of sites with no sequencing depth in a window for the window to still be considered. Default = 1.0
- maxRefN : specify the maximum fraction of sites with ref='N' in a window for the window to still be considered. Default = 1.0
- minDepth : sites with a lower coverage than minDepth will not be considered. Default = 0
- maxDepth : sites with a higher coverage than maxDepth will not be considered. Default = 1000000
- minQual : called bases with a quality that is lower will not be taken into account. Default = 1
- maxQual : called bases with a quality that is higher will not be taken into account. Default = 93
- trim5 : bases within this distance of the 5' end of the read will be ignored. Default = 0.
- trim3 : bases within this distance of the 3' end of the read will be ignored. Default = 0.
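The window logic described in this section can be sketched as follows. Both functions are hypothetical illustrations under stated assumptions: windows are 0-based and half-open as in BED, limitWindows counts windows from the start of the chromosome, and maxMissing is read as a fraction between 0 and 1 (suggested by its default of 1.0).

```python
def windows(chr_length, window_size=1_000_000, skip_windows=0, limit_windows=None):
    """Yield (start, end) for non-overlapping windows along one chromosome."""
    for index, start in enumerate(range(0, chr_length, window_size)):
        if index < skip_windows:
            continue  # skip the first N windows
        if limit_windows is not None and index >= limit_windows:
            break  # stop after the first N windows
        # The last window is shorter when the length is not a multiple of the size.
        yield start, min(start + window_size, chr_length)

def keep_window(depths, max_missing=1.0):
    """Keep a window unless too large a fraction of its sites has zero depth."""
    missing = sum(1 for depth in depths if depth == 0) / len(depths)
    return missing <= max_missing

all_windows = list(windows(2_500_000))
```

With the default of 1.0, `keep_window` accepts every window, including one with no coverage at all.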
Specify post-mortem damage (PMD) parameters
ATLAS supports the following ways to specify PMD patterns:
- none: no PMD at all, specified as none.
- Empiric: This is simply a list of PMD rates as a function of position in the read and is specified as Empiric[0.2,0.3,...]. Positions beyond the length of the supplied vector are assumed to have the same PMD rate as the last entry.
- Skoglund: This implements the exponential function proposed by Skoglund et al. 2014, specified as Skoglund[lambda,c] and corresponding to [equation not recoverable].
- Exponential: This implements a generalized exponential decay function, specified as Exponential[a,b,c] and corresponding to [equation not recoverable].
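The Empiric rule described in the list above (positions past the end of the vector reuse the last entry) can be sketched in a few lines. The function `empiric_pmd` is hypothetical, and 0-based positions counted from the relevant read end are an assumption.

```python
def empiric_pmd(rates, position):
    """PMD rate at a 0-based position; positions past the end reuse the last rate."""
    return rates[min(position, len(rates) - 1)]

rates = [0.3, 0.2, 0.1, 0.05]
first = empiric_pmd(rates, 0)    # first entry of the vector
beyond = empiric_pmd(rates, 50)  # falls back to the last entry
```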
These PMD rates can be specified in three different ways:
- pmd : Using this argument implies a single PMD definition for both C→T and G→A transitions from their respective ends (decay functions are the same from both 3'- and 5'-ends). Example: pmd=Empiric[0.3,0.2,0.1,0.05]
- pmdCT and pmdGA : specify the PMD patterns independently for C→T and G→A transitions. Example: pmdCT=Exponential[0.1,0.1,0.05]
- pmdFile : specify the post-mortem damage with an input file. This allows PMD patterns to be specified individually for different read groups. The file must contain three columns: the name of the read group, followed by the C→T and G→A patterns. Example:
ReadGroup1 Exponential[0.177221,0.37078,0.0999026] none
ReadGroup2 Empiric[0.4,0.2,0.1,0.05] Exponential[0.196117,0.357616,0.101829]
Please see estimatePMD for more information on how to estimate PMD patterns with ATLAS.
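To make the three-column pmdFile layout concrete, here is a minimal sketch of a reader for it. The helpers `parse_pattern` and `parse_pmd_file` are hypothetical and not part of ATLAS. The sketch assumes whitespace-separated columns, one read group per line, and no header line.

```python
import re

# Accepts 'none' or a named pattern followed by a bracketed parameter list.
PATTERN = re.compile(r"^(none|(Empiric|Skoglund|Exponential)\[([^\]]*)\])$")

def parse_pattern(text):
    """Split e.g. 'Empiric[0.4,0.2]' into ('Empiric', [0.4, 0.2])."""
    match = PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognized PMD pattern: {text!r}")
    if text == "none":
        return "none", []
    return match.group(2), [float(x) for x in match.group(3).split(",")]

def parse_pmd_file(lines):
    """Map read group name -> (C->T pattern, G->A pattern)."""
    table = {}
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        read_group, ct, ga = line.split()
        table[read_group] = (parse_pattern(ct), parse_pattern(ga))
    return table

example = [
    "ReadGroup1 Exponential[0.177221,0.37078,0.0999026] none",
    "ReadGroup2 Empiric[0.4,0.2,0.1,0.05] Exponential[0.196117,0.357616,0.101829]",
]
pmd = parse_pmd_file(example)
```

In this example, `pmd["ReadGroup1"]` holds an Exponential pattern for C→T and none for G→A.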
Specify recalibration parameters
- BQSRQuality : specify readgroup x quality recalibration table
- BQSRPosition : specify readgroup x position recalibration table
- BQSRPositionReverse : specify readgroup x reverse position recalibration table
- BQSRContext : specify readgroup x context recalibration table
- recal : specify recalibration parameters based on X-recalibration either with a filename or with [qQuality,qQualitySquared,...,qContext-T]. If you specify the parameters directly you can repeat the same value by using curly brackets {} to specify how many times it should be repeated. Example: recal=[1,0{24}]
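The curly-bracket shorthand can be illustrated with a small expander. The function `expand_recal` is hypothetical and assumes that `value{n}` stands for the value written n times in total. ATLAS's own parser may differ in details such as whitespace handling.

```python
import re

def expand_recal(spec):
    """Expand a bracketed list such as '[1,0{24}]' into a flat list of floats."""
    inner = spec.strip()
    if not (inner.startswith("[") and inner.endswith("]")):
        raise ValueError("expected a bracketed list")
    values = []
    for item in inner[1:-1].split(","):
        # A value, optionally followed by a repeat count in curly brackets.
        match = re.fullmatch(r"\s*([^{}\s]+)\s*(?:\{(\d+)\})?\s*", item)
        if match is None:
            raise ValueError(f"cannot parse {item!r}")
        count = int(match.group(2)) if match.group(2) else 1
        values.extend([float(match.group(1))] * count)
    return values

params = expand_recal("[1,0{24}]")
```

Under this reading, the example from the text expands to 25 values: a single 1 followed by twenty-four 0s.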