ATLAS / Population Genetic Parameters: PSMC

Overview

This task creates the input file for PSMC (pairwise sequentially Markovian coalescent). PSMC takes an input file in FASTA format in which the possible letters are T = homozygous, K = heterozygous, and N = unknown. Each letter represents a 100 bp window of the genome: a window with no heterozygous sites is a "T", and a window with at least one heterozygous site is a "K". For ATLAS to produce such a file, two further things need to be defined: a prior on theta, which is the heterozygosity you expect to see a priori, and a confidence threshold. For each window, ATLAS calculates the posterior probability of it being a "K" or a "T". If neither of these probabilities is higher than the confidence threshold, the window is defined as an "N".
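
The thresholding rule described above can be sketched in a few lines of Python. This is only an illustration of the decision step, not ATLAS's implementation: it assumes the per-window posterior probability of "at least one heterozygous site" has already been computed somehow, and the names `call_window`, `call_windows`, and `p_het` are made up for this example.

```python
def call_window(p_het: float, confidence: float = 0.99) -> str:
    """Return 'K', 'T', or 'N' for one 100 bp window.

    p_het is an assumed, precomputed posterior probability that the window
    contains at least one heterozygous site; 1 - p_het is then the posterior
    probability that it contains none.
    """
    if not 0.0 <= p_het <= 1.0:
        raise ValueError("p_het must lie in [0, 1]")
    if p_het > confidence:
        return "K"  # confidently heterozygous
    if 1.0 - p_het > confidence:
        return "T"  # confidently homozygous
    return "N"      # neither posterior exceeds the threshold


def call_windows(posteriors, confidence=0.99):
    """Concatenate the per-window calls into one sequence string."""
    return "".join(call_window(p, confidence) for p in posteriors)


# Hypothetical posteriors for five consecutive windows.
example = call_windows([0.999, 0.0001, 0.5, 0.995, 0.02])
print(example)  # KTNKN
```

In this sketch the last window has a 0.98 posterior of being homozygous, which is below the default threshold of 0.99, so it becomes an "N". Lowering the threshold to 0.95 would turn it into a "T", which shows the trade-off between the number of called windows and the certainty of each call.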

Input

  • BAM file
  • PMD pattern (see estimatePMD for how to create such a pattern)
  • recalibration file (see recal or BQSR for how to create such a file)

Output

  • PSMC input file
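
A quick way to sanity-check the output is to count how many windows were called "T", "K", or "N". The sketch below assumes only what is stated in the overview, namely a FASTA-style file whose sequence lines consist of these three letters; the function name `summarize_psmc_input` and the demo record `chr1` are invented for illustration.

```python
from collections import Counter


def summarize_psmc_input(lines):
    """Count window calls in FASTA-style lines made of T, K and N."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines and FASTA header lines.
        if not line or line.startswith(">"):
            continue
        counts.update(line)
    total = sum(counts.values())
    called = counts["T"] + counts["K"]
    return {
        "windows": total,
        "T": counts["T"],
        "K": counts["K"],
        "N": counts["N"],
        "fraction_called": called / total if total else 0.0,
    }


# Made-up demo content; any iterable of lines (such as an open file) works.
demo = [">chr1", "TTTKTTNNTT", "TKTT"]
summary = summarize_psmc_input(demo)
print(summary)
```

A large share of "N" windows may indicate that the confidence threshold is too strict for the available sequencing depth.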

Usage Example

./atlas task=PSMC bam=example.bam fasta=example.fasta pmdFile=example_pmd_input.txt recal=example_recalibrationEM.txt verbose

Specific Arguments

  • theta : Prior for heterozygosity. Default = 0.001
  • confidence : Confidence threshold for assigning a "T" or a "K" to a window. Default = 0.99

Engine Parameters

Engine parameters that are common to all tasks can be found here.
