ATLAS / Population Genetic Tools: thetaRatio
Overview
This task uses MCMC to estimate \(\phi = \dfrac{\theta_1}{\theta_2}\), the ratio of the expected heterozygosities \(\theta_1\) and \(\theta_2\) estimated from sites in two different sets of regions.
Input
- two 0-based BED files that define the regions used for the two thetas
Output
- a file with the values sampled at each MCMC iteration for 1. log(theta1), 2. log(theta2), 3. log(theta1) - log(theta2), i.e. log(phi)
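Since the output holds samples of log(theta1) - log(theta2), a posterior summary of \(\phi\) can be obtained by exponentiating those samples. The following is a minimal sketch, not part of ATLAS: the function name `summarize_phi` and the `burnin` parameter are hypothetical, and the sketch assumes the third quantity has already been read from the output file into a Python list, because the exact file layout is not described here.

```python
import math
import statistics

def summarize_phi(log_ratio_samples, burnin=0):
    """Summarize phi = theta1 / theta2 from samples of log(theta1) - log(theta2).

    log_ratio_samples: list of floats, one per MCMC iteration (assumed input).
    burnin: number of initial iterations to discard (hypothetical parameter).
    """
    # Back-transform from the log scale and sort for the quantiles.
    kept = sorted(math.exp(x) for x in log_ratio_samples[burnin:])
    if not kept:
        raise ValueError("no samples left after burn-in")
    n = len(kept)
    # Simple empirical 2.5% and 97.5% quantiles (no interpolation).
    lower = kept[int(0.025 * (n - 1))]
    upper = kept[int(0.975 * (n - 1))]
    return {"median": statistics.median(kept), "lower": lower, "upper": upper}
```

A posterior interval that excludes 1 would suggest that the two sets of regions differ in heterozygosity; how many iterations to discard as burn-in depends on the run.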
Usage Example
./atlas task=thetaRatio bam=example.bam regions1=data_for_nominator_theta.bed regions2=data_for_denominator_theta.bed verbose
Specific Arguments
- regions1: provide a 0-based BED file defining the regions used for \(\theta_1\), the \(\theta\) in the numerator
- regions2: provide a 0-based BED file defining the regions used for \(\theta_2\), the \(\theta\) in the denominator
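BED coordinates are 0-based with an exclusive end, so a region given as 1-based inclusive positions needs its start shifted by one before it is written to either file. The helper below is a small illustration; the name `to_bed_line` is hypothetical and not part of ATLAS.

```python
def to_bed_line(chrom, start_1based, end_1based):
    """Convert a 1-based inclusive interval to a tab-separated BED line.

    BED uses a 0-based start and an exclusive end, so only the start changes.
    """
    if start_1based < 1 or end_1based < start_1based:
        raise ValueError("expected 1 <= start <= end")
    return f"{chrom}\t{start_1based - 1}\t{end_1based}"
```

For example, positions 1 to 100 of `chr1` become the BED interval with start 0 and end 100.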
Engine Parameters
Engine parameters that are common to all tasks can be found here.
Method
The likelihood function is:
\begin{equation*}
\mathbb{P}(\boldsymbol{d}_1,\boldsymbol{d}_2|\theta_1,\theta_2, \boldsymbol{\pi}_1, \boldsymbol{\pi}_2) = \dfrac{\prod\limits_{i=1}^I \sum\limits_g \prod\limits_{j=1}^{n_i} \mathbb{P}(d_{1_{ij}}|g_i=g)\mathbb{P}(g_i=g|\theta_1,\boldsymbol{\pi}_1)}{\prod\limits_{i=1}^I \sum\limits_{g} \prod\limits_{j=1}^{n_i} \mathbb{P}(d_{2_{ij}}|g_i=g)\mathbb{P}(g_i=g|\theta_2,\boldsymbol{\pi}_2)}
\end{equation*}
We use MCMC (the Metropolis-Hastings algorithm) to infer the posterior distribution of all parameters. In each iteration we update every \(\boldsymbol{\pi}\) as well as \(\log(\theta_1)\) and \(\log(\theta_2)\).
For \(\boldsymbol{\pi}\) and \(\log(\theta_1)\) we use a \(U[0,1]\) prior, and for \(\phi\) we use a normal prior.
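To illustrate the general idea of a Metropolis-Hastings update, here is a generic random-walk sampler for a single parameter. This is a sketch under stated assumptions and not the ATLAS implementation: the function `metropolis_hastings`, its arguments, and the Gaussian random-walk proposal are all choices made for this example, and the target in the usage note is a toy density rather than the likelihood above.

```python
import math
import random

def metropolis_hastings(log_target, x0, n_iter, proposal_sd, rng):
    """Random-walk Metropolis-Hastings for one real-valued parameter.

    log_target: function returning the log of the unnormalized posterior.
    x0: starting value; log_target(x0) must be finite.
    proposal_sd: standard deviation of the Gaussian proposal (assumed here).
    rng: a random.Random instance, passed in for reproducibility.
    """
    x = x0
    log_p = log_target(x)
    trace = []
    for _ in range(n_iter):
        proposal = x + rng.gauss(0.0, proposal_sd)
        log_p_new = log_target(proposal)
        # The proposal is symmetric, so the Hastings correction cancels and
        # the acceptance probability is min(1, target(new) / target(old)).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        # The current state is recorded whether or not the move was accepted.
        trace.append(x)
    return trace
```

Called as `metropolis_hastings(lambda x: -0.5 * x * x, 0.0, 20000, 1.0, random.Random(1))`, the sampler draws from a standard normal toy target. In a multi-parameter model such as the one above, an update of this kind would be applied to each parameter in turn.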