ATLAS / alleleCounts

Overview

This task estimates the most likely allele count at each site from a multi-sample VCF file containing bi-allelic sites. Such a VCF file can be created with the ATLAS task major/minor. The method is based on Nielsen et al. (2012) PLoS One.
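As an illustration of what "most likely allele count" means, the following is a minimal Python sketch of a dynamic-programming recursion over samples, in the spirit of the approach described by Nielsen et al. (2012). It is not the ATLAS implementation: the function names, the input layout (one tuple of three genotype likelihoods per diploid sample) and the absence of any prior are assumptions made for this example.

```python
from math import comb


def allele_count_likelihoods(genotype_likelihoods):
    """Return the likelihood of each allele count k = 0..2n.

    genotype_likelihoods: one (L0, L1, L2) tuple per diploid sample, where Lg
    is P(data | sample carries g copies of the counted allele).
    (Hypothetical input layout chosen for this sketch.)
    """
    n = len(genotype_likelihoods)
    h = [1.0]  # h[k]: partial likelihood of k copies among the samples seen so far
    for gl in genotype_likelihoods:
        new_h = [0.0] * (len(h) + 2)  # each sample adds 0, 1 or 2 copies
        for k, value in enumerate(h):
            for g in range(3):
                # comb(2, g) counts the ordered haplotype pairs giving genotype g
                new_h[k + g] += comb(2, g) * gl[g] * value
        h = new_h
    # Normalise by the number of ways to place k copies on 2n chromosomes
    return [h[k] / comb(2 * n, k) for k in range(2 * n + 1)]


def mle_allele_count(genotype_likelihoods):
    """Return the allele count with the highest likelihood."""
    likelihoods = allele_count_likelihoods(genotype_likelihoods)
    return max(range(len(likelihoods)), key=likelihoods.__getitem__)


if __name__ == "__main__":
    # Three samples whose data strongly favour 0, 1 and 2 copies respectively
    site = [(0.98, 0.02, 0.0), (0.01, 0.98, 0.01), (0.0, 0.02, 0.98)]
    print(mle_allele_count(site))
```

When a population file is supplied, a recursion of this kind would be run once per population, using only the samples assigned to that population.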

Input

  • VCF file: created by e.g. ATLAS task major/minor
  • txt file (optional): e.g. samplesPopulations.txt

This is a user-created .txt file listing the samples to be used, one per line, each followed by its population affiliation. Allele counts are estimated separately for each population.

Example:

sample1 1
sample2 1
sample5 2
sample8 2
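To check such a file before running the task, a short Python sketch like the one below can group the samples by population. It assumes two whitespace-separated columns (sample name, then population label), as in the example above; the function name is hypothetical and the parser is not part of ATLAS.

```python
from collections import defaultdict


def read_sample_populations(lines):
    """Group sample names by population label.

    lines: an iterable of text lines, e.g. an open file handle.
    Assumes two whitespace-separated columns per non-empty line.
    """
    populations = defaultdict(list)
    for line_number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # tolerate blank lines
        if len(fields) != 2:
            raise ValueError(
                f"line {line_number}: expected 'sample population', got {line!r}"
            )
        sample, population = fields
        populations[population].append(sample)
    return dict(populations)


if __name__ == "__main__":
    example = ["sample1 1", "sample2 1", "sample5 2", "sample8 2"]
    for population, samples in read_sample_populations(example).items():
        print(population, samples)
```

With a real file, pass an open handle instead of the list, for example `read_sample_populations(open("samplesPopulations.txt"))`.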

Output

  • gzipped text file with the suffix "_alleleCounts.txt.gz". Contains the MLE allele counts for all positions and populations.

Usage Example

./atlas task=alleleCounts vcf=example_majorMinor.vcf.gz samples=samplesPopulations.txt

Specific Arguments

  • samples: specify samples to be used and their population affiliation
  • limitLines: maximum number of lines to be read from the VCF file
  • minDepth: only store sites with at least this depth
  • minSamplesWithData: only store sites where at least this many samples have data. Default = 1
  • minMAF: only store sites where the initial estimate of the allele frequency is greater than or equal to minMAF. Default = 0.0
  • minVariantQuality: only store sites with at least this variant quality
  • reportFreq: number of lines after which the reading progress is printed to the terminal. Default = 10000
  • epsF: epsilon for EM algorithm to estimate allele frequencies. Default = 0.0001
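To make the role of an EM epsilon such as epsF concrete, here is a minimal Python sketch of an EM estimate of the allele frequency at one site from genotype likelihoods. It assumes Hardy-Weinberg proportions as the genotype prior and treats the epsilon as a threshold on the change in frequency between iterations. Both are assumptions for illustration; the page above does not state how ATLAS uses epsF internally, and the function name and `max_iter` parameter are hypothetical.

```python
def em_allele_frequency(genotype_likelihoods, eps=0.0001, max_iter=1000):
    """Estimate the allele frequency at one site by expectation-maximisation.

    genotype_likelihoods: one (L0, L1, L2) tuple per diploid sample with data;
    each tuple must contain at least one non-zero value.
    eps: stop once the frequency changes by less than this between iterations
    (assumed meaning of an EM epsilon; the default mirrors epsF's default).
    """
    n = len(genotype_likelihoods)
    f = 0.5  # starting value
    for _ in range(max_iter):
        expected_copies = 0.0
        for l0, l1, l2 in genotype_likelihoods:
            # E-step: posterior genotype weights under Hardy-Weinberg proportions
            w0 = l0 * (1.0 - f) ** 2
            w1 = l1 * 2.0 * f * (1.0 - f)
            w2 = l2 * f * f
            expected_copies += (w1 + 2.0 * w2) / (w0 + w1 + w2)
        # M-step: expected allele copies divided by the number of chromosomes
        new_f = expected_copies / (2 * n)
        if abs(new_f - f) < eps:
            return new_f
        f = new_f
    return f


if __name__ == "__main__":
    site = [(0.98, 0.02, 0.0), (0.01, 0.98, 0.01), (0.0, 0.02, 0.98)]
    print(em_allele_frequency(site))
```

A smaller epsilon makes the loop run longer before it stops, trading run time for a more tightly converged estimate.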
