# Tiger / simulate

## Overview

Simulate a VCF file according to one of the models described on the infer task page.

**Common parameters**

- **sites**: specify how many sites in the genome. Default = 10000
- **samples**: specify how many samples to simulate. Default = 50
- **model**: specify the inference model; can be indRep, hardyWeinberg, or truthSet

## indRep

**Parameters**

- **replicates**: specify how many replicates per sample to simulate. Default = 3
- **het**: specify the fraction of heterozygous sites. Default = 0.1

**Files created**

- **simulations.vcf.gz**: genotype calls for all samples
- **simulations_sampleGroups.txt**: the association of the samples to a replication group and error rate class. Error rates are estimated separately for each class. Samples in the same replication group are assumed to share the same genotypes.
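The idea that replicates in one group share the same true genotypes and differ only through genotyping errors can be sketched in a few lines of Python. This is an illustration, not Tiger's implementation: the function name, the even split of homozygous sites between reference and alternative, and the rule that a miscalled genotype is replaced by one of the other two genotypes with equal probability are all assumptions made for the example.

```python
import random

GENOTYPES = (0, 1, 2)  # number of alternative alleles


def simulate_replicate_group(sites=30, replicates=3, het=0.1, error=0.1, rng=None):
    """Simulate one replication group (hypothetical helper, not part of Tiger).

    Returns (truth, observed): the shared true genotypes and one list of
    observed calls per replicate.
    """
    rng = rng or random.Random()
    # A site is heterozygous with probability `het`.
    # Assumption: homozygous sites are split evenly between 0 and 2.
    truth = [1 if rng.random() < het else rng.choice((0, 2)) for _ in range(sites)]
    observed = []
    for _ in range(replicates):
        calls = []
        for genotype in truth:
            # Assumption: with probability `error` the call is replaced by
            # one of the two other genotypes, chosen uniformly.
            if rng.random() < error:
                genotype = rng.choice([g for g in GENOTYPES if g != genotype])
            calls.append(genotype)
        observed.append(calls)
    return truth, observed
```

With `error=0` every replicate reproduces the true genotypes exactly, which is a quick sanity check on the shared-genotype assumption.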

**Usage examples:**

Different error rates for depth=1 and depth=2, only one individual that has 10 replicates:

```
./tiger task=simulate model=indReps replicates=10 het=0.01 samples=1 sites=30 error=0.1,0.2 outname=simple
```

Different error rates for depth=1 and depth=2, three replicate groups:

```
./tiger task=simulate model=indReps replicates=10 het=0.01 samples=3 sites=30 error=0.1,0.2 outname=multipleRepGroups
```

Different error rates for depth=1 and depth=2, three replicate groups and two error sets:

```
./tiger task=simulate model=indReps replicates=10 het=0.01 samples=3 sites=30 error=[0.1,0.2],[0.004,0.003] outname=multipleSets
```

For more examples see our Individual Replicates Tutorial

## hardyWeinberg

**Parameters**

- **sites**: specify how many sites in the genome. Default = 10000
- **samples**: specify how many samples to simulate. Default = 50
- **populations**: the number of simulated populations. Each population will have size = **numSamples**. Default = 3
- **alpha** and **beta**: the simulated allele frequencies for all sites are sampled from a beta distribution defined by alpha and beta. Default = 0.5
- **numSitesPolymorphic**: number of polymorphic sites
- **minMAF**: minimum minor allele frequency at which a site is considered polymorphic
- **error**: string of error rates. Default = 0.5. You can simulate different error rate classes for sites of different depth
- **errorHet**: string of error rates; needs to be the same length as error. Default = 0.5
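To make the role of alpha and beta concrete, the following Python sketch draws one allele frequency from a Beta(alpha, beta) distribution and then samples genotypes in Hardy-Weinberg proportions. It only illustrates the model; the function names are hypothetical and nothing here is taken from Tiger's source code.

```python
import random


def hardy_weinberg_proportions(freq):
    """Expected frequencies of hom-ref, het, and hom-alt genotypes."""
    return ((1 - freq) ** 2, 2 * freq * (1 - freq), freq ** 2)


def simulate_hardy_weinberg_site(alpha=0.5, beta=0.5, num_samples=50, rng=None):
    """Simulate one site (hypothetical helper, not part of Tiger).

    Returns (freq, genotypes), where each genotype is the number of
    alternative alleles (0, 1, or 2).
    """
    rng = rng or random.Random()
    freq = rng.betavariate(alpha, beta)  # alternative-allele frequency
    # Two independent allele draws per diploid sample yield the
    # Hardy-Weinberg proportions (1-f)^2, 2f(1-f), f^2.
    genotypes = [
        int(rng.random() < freq) + int(rng.random() < freq)
        for _ in range(num_samples)
    ]
    return freq, genotypes
```

With alpha = beta = 0.5, the beta distribution is U-shaped, so most simulated sites have allele frequencies close to 0 or 1 and only a minority are at intermediate frequency.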

**Files created**

- **simulations.vcf.gz**: genotype calls for all samples
- **simulations_sampleGroups.txt**: the association of the samples to a population and error rate class. Error rates are estimated separately for each class. Population parameters (alpha, beta, allele frequencies) are estimated separately for each population.
- **simulations_R_input.txt**: the simulated observed genotype calls (as in the VCF) for all samples and loci, encoded as 1 for homozygous reference, 2 for heterozygous, and 3 for homozygous alternative allele
- **simulations_trueAlleleFrequencies.txt**: the simulated true allele frequencies for all loci
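The 1/2/3 encoding used in simulations_R_input.txt can be reproduced from VCF genotype strings with a small helper like the one below. This is a sketch for biallelic calls only; the function name is hypothetical, and returning `None` for missing or multiallelic calls is an assumption, since this page does not say how such calls are written.

```python
import re


def encode_genotype(gt):
    """Map a biallelic VCF GT string to 1 (hom ref), 2 (het), or 3 (hom alt)."""
    alleles = re.split(r"[/|]", gt)  # accept unphased (/) and phased (|) calls
    if len(alleles) != 2 or not all(a in ("0", "1") for a in alleles):
        # Assumption: missing or non-biallelic calls are left unencoded.
        return None
    return alleles.count("1") + 1
```

For example, `encode_genotype("0/1")` and `encode_genotype("1|0")` both give 2, the code for a heterozygous call.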

**Usage examples**

Different error rates for depth=1 and depth=2, only one error class and one population:

```
./tiger task=simulate model=hardyWeinberg populations=1 samples=50 sites=30 alpha=0.5 beta=0.5 outname=test error=0.1,0.2
```

Different error rates for depth=1 and depth=2, two populations:

```
./tiger task=simulate model=hardyWeinberg populations=2 samples=50 sites=30 alpha=0.5 beta=0.5 outname=test
```

Different error rates for depth=1 and depth=2, two error classes and two populations:

```
./tiger task=simulate model=hardyWeinberg populations=1 samples=50 sites=30 alpha=0.5 beta=0.5 outname=test error=[0.1,0.2],[0.004,0.003]
```

For more examples see our Hardy-Weinberg Tutorial

## truthSet

One pair of observed and true samples is simulated for each of the numSamples samples.

**Parameters**:

- **sites**: specify how many sites in the genome. Default = 10000
- **samples**: specify how many samples to simulate. Default = 50
- **het**: specify the fraction of heterozygous sites. Default = 0.1
- **meanDepthOfTrue**: mean depth of the "true" sample
- **seqError**: sequencing error. Default = 0.01, which corresponds to quality score 20
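The correspondence between seqError = 0.01 and quality score 20 follows from the standard Phred scale, Q = -10 · log10(error). The helpers below are a small illustration of that conversion; the function names are hypothetical and not part of Tiger.

```python
import math


def error_to_phred(error_rate):
    """Convert an error probability to a Phred-scaled quality score."""
    return -10 * math.log10(error_rate)


def phred_to_error(quality):
    """Convert a Phred-scaled quality score back to an error probability."""
    return 10 ** (-quality / 10)
```

The same formula gives an error rate of 0.001 for quality score 30, which can help when choosing a seqError value to match a target base quality.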

**Files created**:

- **simulations_samplePairs.txt**: the names of the corresponding samples

**Usage examples**

```
./tiger task=simulate model=truthSet samples=10 sites=1000 error=0.1,0.2
```

For more examples see our Truth Set Tutorial
