ATLAS / Population Genetic Tools: thetaRatio

Overview

This task uses an MCMC to estimate \(\phi = \dfrac{\theta_1}{\theta_2}\), the ratio of the expected heterozygosity estimated from sites in two different regions.

Input

  • two 0-based BED files that define the regions used for \(\theta_1\) and \(\theta_2\)
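Because BED coordinates are 0-based with an exclusive end, regions that were noted in 1-based, inclusive coordinates need their start shifted by one. A minimal sketch of that conversion follows; the helper name `to_bed_line` and the example region are hypothetical and not part of ATLAS.

```python
def to_bed_line(chrom, start_1based, end_1based):
    """Convert a 1-based, inclusive interval to a 0-based, half-open BED line."""
    if start_1based < 1 or end_1based < start_1based:
        raise ValueError("expected 1 <= start <= end")
    # BED start is 0-based, so subtract one. BED end is exclusive,
    # which makes it numerically equal to the 1-based inclusive end.
    return f"{chrom}\t{start_1based - 1}\t{end_1based}"


# Hypothetical region: the first 1000 bases of a sequence named chr1.
print(to_bed_line("chr1", 1, 1000))
```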

Output

  • a file with the MCMC iterations for three quantities: (1) \(\log(\theta_1)\), (2) \(\log(\theta_2)\), and (3) \(\log(\theta_1) - \log(\theta_2) = \log(\phi)\)
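Since the third quantity is \(\log(\theta_1) - \log(\theta_2) = \log(\phi)\), posterior summaries of \(\phi\) can be obtained by exponentiating the sampled values. The sketch below assumes the samples have already been read into a Python list, that the logarithm is the natural one, and that an initial burn-in is discarded; the function name `summarize_phi` and the `burnin` parameter are hypothetical, and the layout of the output file itself is not assumed here.

```python
import math


def summarize_phi(log_ratio_samples, burnin=0):
    """Return (median, 2.5% quantile, 97.5% quantile) of phi.

    log_ratio_samples: sampled values of log(theta1) - log(theta2),
    assumed to be natural logarithms.
    """
    kept = sorted(log_ratio_samples[burnin:])
    if not kept:
        raise ValueError("no samples left after burn-in")

    def quantile(q):
        # Linear interpolation between neighbouring order statistics.
        pos = q * (len(kept) - 1)
        lo, hi = math.floor(pos), math.ceil(pos)
        return kept[lo] + (kept[hi] - kept[lo]) * (pos - lo)

    # exp is monotone, so quantiles of phi are exp of quantiles of log(phi).
    return tuple(math.exp(quantile(q)) for q in (0.5, 0.025, 0.975))
```

A 95% interval for \(\phi\) that excludes 1 would suggest that the two sets of regions differ in \(\theta\).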

Usage Example

./atlas task=thetaRatio bam=example.bam regions1=data_for_nominator_theta.bed regions2=data_for_denominator_theta.bed verbose

Specific Arguments

  • regions1: a 0-based BED file defining the regions used for \(\theta_1\), the \(\theta\) in the numerator
  • regions2: a 0-based BED file defining the regions used for \(\theta_2\), the \(\theta\) in the denominator

Engine Parameters

Engine parameters that are common to all tasks can be found here.

Method

The likelihood function is:

\begin{equation*} \mathbb{P}(\boldsymbol{d}_1,\boldsymbol{d}_2|\theta_1,\theta_2, \boldsymbol{\pi}_1, \boldsymbol{\pi}_2) = \left[\prod\limits_{i=1}^I \sum\limits_g \prod\limits_{j=1}^{n_i} \mathbb{P}(d_{1_{ij}}|g_i=g)\mathbb{P}(g_i=g|\theta_1,\boldsymbol{\pi}_1)\right] \cdot \left[\prod\limits_{i=1}^I \sum\limits_{g} \prod\limits_{j=1}^{n_i} \mathbb{P}(d_{2_{ij}}|g_i=g)\mathbb{P}(g_i=g|\theta_2,\boldsymbol{\pi}_2)\right] \end{equation*}
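The term for a single region, a product over sites of a sum over genotypes, can be sketched in log space as follows. This is an illustration only: the function name and both inputs are hypothetical. In particular, the genotype probabilities \(\mathbb{P}(g_i=g|\theta,\boldsymbol{\pi})\) are passed in as a precomputed list, because their functional form is not given on this page, and each per-site entry is assumed to already hold the product over reads of \(\mathbb{P}(d_{ij}|g_i=g)\).

```python
import math


def region_log_likelihood(site_genotype_likelihoods, genotype_prior):
    """Log of prod_i sum_g L[i][g] * prior[g] for one region.

    site_genotype_likelihoods: one list per site; entry g is the product
        over reads of P(d_ij | g_i = g) (assumed precomputed).
    genotype_prior: entry g is P(g_i = g | theta, pi) (assumed precomputed).
    """
    total = 0.0
    for per_genotype in site_genotype_likelihoods:
        if len(per_genotype) != len(genotype_prior):
            raise ValueError("genotype count mismatch")
        # Marginalise over the unknown genotype at this site.
        site_prob = sum(l * p for l, p in zip(per_genotype, genotype_prior))
        # Sites are treated as independent, so log-probabilities add.
        total += math.log(site_prob)
    return total
```

Working with logarithms avoids numerical underflow when the product runs over many sites.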

We use MCMC with the Metropolis-Hastings algorithm to infer the posterior distribution of all parameters. The parameters updated are all \(\boldsymbol{\pi}\), \(\log(\theta_1)\), and \(\log(\theta_2)\).
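For readers unfamiliar with Metropolis-Hastings, a generic single-parameter update can be sketched as below. This is not the ATLAS implementation: the Gaussian random-walk proposal, the `step` size, and the toy target are assumptions chosen for illustration, and the proposal kernels ATLAS actually uses are not described on this page.

```python
import math
import random


def metropolis_hastings(log_posterior, start, n_iter, step=0.5, seed=0):
    """Random-walk Metropolis-Hastings for one real-valued parameter.

    log_posterior: function returning the log posterior density
        (up to an additive constant) at a given value.
    """
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current)
    chain = []
    for _ in range(n_iter):
        # Symmetric Gaussian proposal, so the Hastings ratio is 1.
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio), in log space.
        # 1 - random() lies in (0, 1], which keeps log() defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


# Toy usage: sample a standard normal, standing in for the posterior
# of a parameter such as log(theta).
samples = metropolis_hastings(lambda x: -0.5 * x * x, 0.0, 5000, step=1.0)
print(sum(samples) / len(samples))
```

Proposing on the log scale, as the task does for \(\theta_1\) and \(\theta_2\), keeps the corresponding \(\theta\) values positive without requiring a boundary check.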

For \(\boldsymbol{\pi}\) and \(\log(\theta_1)\) we use a \(U[0,1]\) prior, and for \(\phi\) we use a normal prior.
