Simulation (task simulate)
To simulate allele frequencies, use the task “simulate” followed by the parameter values: the number of populations in each group (pops, one value per group, separated by commas), the number of sites (numSites), the distance between two adjacent sites (distBetweenSites) and the number of haplotypes in a sample (N).
To simulate data for a single group, you only have to set the parameters that determine the selection coefficient and the drift parameter of a group, that is alpha_max, beta, lnkappa, pops, numSites, distBetweenSites and N.
./Flink task=simulate beta=-1.0 alpha_max=1.0 lnkappa=-2.0 pops=3 numSites=1000 distBetweenSites=1 N=10000
You will get an output file called Flink_simulations.txt containing the simulated allele frequencies, one output file for each group (S”NumberOfTheGroup”_simulated.txt) and an output file with the ancestral allele frequencies (freq_p.txt).
With the option “data”, you can also supply an input file that fixes the names of the groups and of the populations, the number of loci and the distances between them.
Example of launching the program with an input file:
./Flink task=simulate data=inputfile beta=-1.0 alpha_max=1.0 lnkappa=-2.0
For the format of the input file, see the paragraph “Input file” in the chapter Launching Flink.
To simulate allele frequencies for more than one group, you also have to set the parameters of the higher hierarchy, that is B, A_max and lnK.
./Flink task=simulate B=-1.0 A_max=1.0 lnK=-2.0 beta=-1.0 alpha_max=1.0 lnkappa=-2.0 pops=3,3,3 numSites=1000 distBetweenSites=1 N=10000
In addition to the files generated in the one-group simulation, you will get an extra output file containing the simulated S for the world hierarchy (S_simulated.txt).
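When many simulations have to be launched, the command line can be assembled by a small script. The following Python sketch is only an illustration: the helper names format_argument and build_simulate_command are hypothetical and not part of Flink, and the sketch uses only the arguments shown in the multi-group example above.

```python
def format_argument(key, value):
    """Format one key=value argument; lists are joined by commas (as for pops)."""
    if isinstance(value, (list, tuple)):
        value = ",".join(str(v) for v in value)
    return f"{key}={value}"


def build_simulate_command(params, executable="./Flink"):
    """Return the argument list of a simulate run (hypothetical helper)."""
    return [executable, "task=simulate"] + [
        format_argument(key, value) for key, value in params.items()
    ]


# Same values as the multi-group example above.
cmd = build_simulate_command({
    "B": -1.0, "A_max": 1.0, "lnK": -2.0,
    "beta": -1.0, "alpha_max": 1.0, "lnkappa": -2.0,
    "pops": [3, 3, 3],
    "numSites": 1000, "distBetweenSites": 1, "N": 10000,
})
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the Python standard library to start the simulation, assuming the Flink executable is in the current directory.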
The simulation generates several output files. The file “Flink_simulations.txt” has the same format as the example input file (see Launching Flink). The file “freq_p.txt” contains the ancestral allele frequencies generated during the simulation: the first column gives the name of each frequency, where the first index refers to the population and the second to the site, and the second column gives the value of the frequency. The S values used in the simulation are written to "S"groupnumber"_simulated.txt" for each group and to "S_simulated.txt" for the higher hierarchy. You can use these files as input for the parameter estimation with the tasks Sg_fromfile and S_fromfile.
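The two-column file of ancestral allele frequencies can be read with a few lines of Python. This is a minimal sketch under assumptions: it assumes that the columns are separated by whitespace, it skips any line whose second column is not a number (for example a possible header), and the frequency names in the sample are made up and may differ from the labels Flink actually writes. The function name read_frequencies is hypothetical.

```python
def read_frequencies(lines):
    """Parse 'name value' lines into a dict mapping name -> float."""
    freqs = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip empty or incomplete lines
        try:
            freqs[fields[0]] = float(fields[1])
        except ValueError:
            continue  # skip lines without a numeric value, e.g. a header
    return freqs


# Made-up sample lines; the names are placeholders, not Flink's real labels.
sample = ["p_1_1 0.25", "p_1_2 0.75", ""]
freqs = read_frequencies(sample)
mean = sum(freqs.values()) / len(freqs)
print(len(freqs), mean)
```

To read a real file, pass an open file object instead of the sample list, for example: with open("freq_p.txt") as fh: freqs = read_frequencies(fh).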
You can also specify the following optional arguments.
Argument          Default value  Explanation
s_max             2              Maximum state of the Markov model.
lnMu              -2.0           Probability involved in the generating matrix to go to a different state for the higher hierarchy.
lnNu              -1.0           Probability involved in the generating matrix to go to a state of selection from the neutral state for the higher hierarchy.
lnMu_g            -2.0           Probability involved in the generating matrix to go to a different state.
lnNu_g            -1.0           Probability involved in the generating matrix to go to a state of selection from the neutral state.
log_a             0.0            Describes the shape of allele frequencies in the ancestral population, assuming a beta distribution (the peak around 1.0).
log_b             0.0            Describes the shape of allele frequencies in the ancestral population, assuming a beta distribution (the peak around 0.0).
maxDist           1000000        Maximum distance to group the sites of a chromosome for the linkage.
numChr            1              Number of chromosomes to simulate.
numSites          100            Number of sites to simulate.
distBetweenSites  1000           Distance between sites.
N                 100            Number of haplotypes in a sample.
lnkappa           -2.0           Logarithm of the group positive scaling parameter.
lnK               -2.0           Logarithm of the world positive scaling parameter.
P                 -              Fix all the world ancestral allele frequencies to a value in (0, 1).
p                 -              Fix all the ancestral allele frequencies to a value in (0, 1).
S                 -              Fix all the ancestral states S to the minimum value (min), to the maximum value (max), to the neutral state (neutral) or to an integer value in [-s_max, s_max]. If this argument is set, the parameter lnK is not used.
Sg                -              Fix all the group states Sg to the minimum value (min), to the maximum value (max), to the neutral state (neutral) or to an integer value in [-s_max, s_max]. If this argument is set, the parameter lnkappa is not used.
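The value ranges stated in the argument list above for S, Sg, P and p can be checked before a run is started. The following Python sketch only restates those ranges; the function names is_valid_state and is_valid_frequency are hypothetical, and how Flink itself reacts to invalid values is not covered here.

```python
def is_valid_state(value, s_max=2):
    """Check a value for S or Sg: min, max, neutral, or an integer in [-s_max, s_max]."""
    text = str(value).strip()
    if text in ("min", "max", "neutral"):
        return True
    try:
        state = int(text)
    except ValueError:
        return False  # not an integer, e.g. "1.5" or "high"
    return -s_max <= state <= s_max


def is_valid_frequency(value):
    """Check a value for P or p: a number strictly between 0 and 1."""
    try:
        freq = float(value)
    except (TypeError, ValueError):
        return False
    return 0.0 < freq < 1.0


print(is_valid_state("neutral"), is_valid_state("3"), is_valid_frequency("0.5"))
```

With the default s_max of 2, the value 3 is rejected for S and Sg, while it is accepted when the check is called with s_max=3.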