ATLAS / Population Genetic Tools: calculateF2

Overview

This task estimates F2 from a multi-sample VCF file. For each pair of samples, it counts the sites at which the two samples differ and the total number of sites compared. Such a VCF file can be created with the ATLAS task major/minor or the task call.

Input

  • VCF file: created by e.g. ATLAS task major/minor
  • txt file (optional): e.g. samplesPopulations.txt

The txt file is created by the user and lists the samples to be used together with their population affiliation. Separate values are estimated for the different populations. If no populations are provided, all samples are treated as belonging to a single population.

Example:

sample1 1
sample2 1
sample5 2
sample8 2
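To make the layout of this file concrete, here is a minimal Python sketch that groups the samples by population. It is not part of ATLAS. It assumes whitespace-separated columns as in the example above, and the function name `read_sample_populations` and the fallback label `"all"` for files without a population column are hypothetical.

```python
def read_sample_populations(lines, default_population="all"):
    """Group sample names by population label, keeping file order.

    Assumes each non-empty line is "<sample> <population>", separated by
    whitespace. A line with only a sample name is assigned to
    `default_population` (a hypothetical label, not an ATLAS convention).
    """
    populations = {}
    for line in lines:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        sample = fields[0]
        population = fields[1] if len(fields) > 1 else default_population
        populations.setdefault(population, []).append(sample)
    return populations


example = ["sample1 1", "sample2 1", "sample5 2", "sample8 2"]
groups = read_sample_populations(example)
```

With the example file, `groups` maps population `"1"` to `sample1` and `sample2`, and population `"2"` to `sample5` and `sample8`.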

Output

  • txt file with the suffix "_counts.txt": an n*n matrix containing, for every pair of samples, the number of differing sites in the upper triangle and the total number of compared sites in the lower triangle.
  • txt file with the suffix "_sampleF2.txt": an n*n matrix containing the pairwise sample F2 (#different sites / #compared sites) for every pair of samples. This file can be used as a distance matrix for MDS.
  • txt file with the suffix "_popF2.txt": a p*p matrix containing the average F2 within and between populations for every pair of populations. This file can be used as a distance matrix for MDS.
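The relationship between the three output files can be sketched in Python as follows. This is an illustration under stated assumptions, not the ATLAS implementation: the matrix is taken to be already parsed into a list of lists, the diagonal is set to 0, within-population averages are assumed to exclude self-comparisons, and the function names `sample_f2` and `population_f2` and the toy counts are hypothetical.

```python
from itertools import combinations, product


def sample_f2(counts):
    """Pairwise F2 from a counts matrix.

    For i < j, counts[i][j] (upper triangle) holds the number of differing
    sites and counts[j][i] (lower triangle) the number of compared sites.
    """
    n = len(counts)
    f2 = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        different, compared = counts[i][j], counts[j][i]
        value = different / compared if compared > 0 else float("nan")
        f2[i][j] = f2[j][i] = value
    return f2


def population_f2(f2, pop_of):
    """Average F2 within and between populations.

    pop_of[i] is the population label of sample i. Within a population,
    the average is taken over distinct pairs only (an assumption).
    """
    pops = sorted(set(pop_of))
    members = {p: [i for i, q in enumerate(pop_of) if q == p] for p in pops}
    result = {}
    for a in pops:
        for b in pops:
            if a == b:
                pairs = list(combinations(members[a], 2))
            else:
                pairs = list(product(members[a], members[b]))
            values = [f2[i][j] for i, j in pairs]
            result[(a, b)] = sum(values) / len(values) if values else float("nan")
    return result


# Toy counts for four samples (made-up numbers, for illustration only).
counts = [
    [0, 10, 30, 40],
    [100, 0, 20, 50],
    [100, 100, 0, 5],
    [200, 100, 50, 0],
]
f2 = sample_f2(counts)
pop_f2 = population_f2(f2, ["1", "1", "2", "2"])
```

Here the pair of the first two samples has 10 differing sites out of 100 compared sites, so its sample F2 is 0.1. The between-population value averages the four cross-population pairs.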

Usage Example

./atlas task=calculateF2 vcf=example_majorMinor.vcf.gz samples=samplesPopulations.txt

Specific Arguments

  • samples: specify samples to be used and their population affiliation
  • limitLines: number of lines to be read from the VCF file
