ATLAS / VCF Tools: VCFToBeagle

Overview

Converts a VCF file to a Beagle file. Various filters (MAF, depth, variant quality, missingness, specific samples, genomic regions, chromosomes, etc.) can be applied.

Input

  • VCF file: the file to be converted
  • txt file (optional): a list of samples, e.g. samplesPopulations.txt

This is a user-created .txt file that lists the samples to be used, one sample name per line.

Example:

sample1
sample2
sample5
sample8
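
A sample file like the one above can be derived from the VCF header rather than typed by hand. The following is a minimal sketch, assuming a standard VCF where the sample names follow the nine fixed columns of the `#CHROM` line; the function names `vcf_sample_names` and `write_samples_file` are hypothetical helpers and not part of ATLAS.

```python
import gzip


def vcf_sample_names(vcf_path):
    """Return the sample names listed in the #CHROM header line of a VCF."""
    # Plain-text and gzip-compressed VCF files are both supported.
    opener = gzip.open if vcf_path.endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                # The first nine columns are fixed VCF fields; samples follow.
                return line.rstrip("\n").split("\t")[9:]
    raise ValueError("no #CHROM header line found")


def write_samples_file(names, out_path):
    """Write one sample name per line."""
    with open(out_path, "w") as out:
        for name in names:
            out.write(name + "\n")
```

For example, `write_samples_file(vcf_sample_names("example.vcf.gz")[:4], "samplesPopulations.txt")` would keep the first four samples of the VCF.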

Output

  • gzipped Beagle file with the suffix ".beagle.gz", containing the genotype likelihoods in Beagle format.
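
To inspect the output, the file can be read back with a few lines of Python. This is a sketch under stated assumptions: it assumes the usual Beagle genotype likelihood layout (one header line, three leading columns for marker and the two alleles, then three likelihood columns per sample). The function name `read_beagle` is hypothetical, and the exact header written by ATLAS should be checked against a real output file.

```python
import gzip


def read_beagle(path):
    """Yield (marker, allele1, allele2, likelihoods) for each site.

    likelihoods holds one 3-tuple per sample, in the column order of the file.
    Assumes three leading columns followed by three columns per sample.
    """
    with gzip.open(path, "rt") as handle:
        header = handle.readline().split()
        n_samples = (len(header) - 3) // 3
        for line in handle:
            fields = line.split()
            if not fields:
                continue  # skip empty lines
            values = [float(x) for x in fields[3:]]
            likelihoods = [
                tuple(values[3 * i:3 * i + 3]) for i in range(n_samples)
            ]
            yield fields[0], fields[1], fields[2], likelihoods
```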

Usage Example

./atlas task=VCFToBeagle vcf=example.vcf.gz samples=samplesPopulations.txt
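
When the conversion is part of a scripted pipeline, the same `key=value` call can be assembled programmatically. The sketch below only builds the argument list from options documented on this page; the helper `build_atlas_command` is hypothetical, and the threshold values are purely illustrative.

```python
import subprocess


def build_atlas_command(vcf, atlas="./atlas", **options):
    """Assemble an ATLAS VCFToBeagle call as a list of key=value arguments."""
    command = [atlas, "task=VCFToBeagle", f"vcf={vcf}"]
    for key, value in options.items():
        command.append(f"{key}={value}")
    return command


command = build_atlas_command(
    "example.vcf.gz",
    samples="samplesPopulations.txt",
    minDepth=2,    # illustrative value, not a recommendation
    minMAF=0.05,   # illustrative value, not a recommendation
)
print(" ".join(command))

# To actually run it (requires the atlas binary in the working directory):
# subprocess.run(command, check=True)
```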

Specific Arguments

  • samples: specify samples to be used
  • limitLines: number of lines to be read from the VCF file
  • minDepth: only store sites with a depth of at least minDepth. Default = 1
  • minSamplesWithData: only store sites where at least this number of samples have data. Default = 1
  • minMAF: only store sites where the initial estimate of the allele frequency is greater than or equal to minMAF. Default = 0.0
  • minVariantQuality: only store sites with a variant quality of at least minVariantQuality. Default = 0
  • keepChromosomes: only loci on these chromosomes are kept. The argument can be a filename (which needs to end with .txt) or a comma-separated list of chromosome names
  • window: a BED file with three columns that correspond to the chromosome, start (0-based) and end position of the windows that should be kept. If both keepChromosomes and window are defined, only the overlap of the two is kept
  • reportFreq: after how many lines the reading progress is printed to the terminal. Default = 10000
  • epsF: epsilon for EM algorithm to estimate allele frequencies. Default = 0.0001
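
To illustrate how minMAF and epsF interact, here is a sketch of a generic EM estimate of the allele frequency from genotype likelihoods. It assumes Hardy-Weinberg genotype proportions, a starting value of 0.5, and that epsF is the change in frequency below which iteration stops. These are assumptions made for illustration, not a description of the ATLAS implementation, and the function names are hypothetical.

```python
def estimate_allele_frequency(likelihoods, eps=0.0001, max_iter=1000):
    """EM estimate of an allele frequency from per-sample genotype likelihoods.

    likelihoods: list of (l0, l1, l2), the likelihoods of carrying 0, 1 or 2
    copies of the allele. Returns None if no sample is informative.
    """
    f = 0.5  # assumed starting value
    for _ in range(max_iter):
        expected_copies = 0.0
        n_used = 0
        for l0, l1, l2 in likelihoods:
            # Weight each genotype by its Hardy-Weinberg prior.
            w0 = (1 - f) ** 2 * l0
            w1 = 2 * f * (1 - f) * l1
            w2 = f ** 2 * l2
            total = w0 + w1 + w2
            if total > 0:
                # Posterior expected number of allele copies for this sample.
                expected_copies += (w1 + 2 * w2) / total
                n_used += 1
        if n_used == 0:
            return None
        f_new = expected_copies / (2 * n_used)
        if abs(f_new - f) < eps:
            return f_new
        f = f_new
    return f


def passes_min_maf(likelihoods, min_maf=0.0, eps=0.0001):
    """Keep a site if its minor allele frequency estimate is >= min_maf."""
    f = estimate_allele_frequency(likelihoods, eps=eps)
    if f is None:
        return False
    return min(f, 1 - f) >= min_maf
```

A smaller epsilon gives a more precise frequency estimate at the cost of more iterations per site.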
