Authors: JK Pickrell and JK Pritchard
TreeMix is a method for inferring the patterns of population splits and mixtures in the history of a set of populations. In the underlying model, the modern-day populations in a species are related to a common ancestor via a graph of ancestral populations. We use the allele frequencies in the modern populations to infer the structure of this graph.
The details of the TreeMix model are presented in: Pickrell JK and Pritchard JK, Inference of population splits and mixtures from genome-wide allele frequency data.
Some extensions are presented in: Pickrell JK, Patterson N, Barbieri C, Berthold F, Gerlach L, Güldemann T, Kure B, Mpoloka SW, Nakagawa H, Naumann C, Lipson M, Loh PR, Lachance J, Mountain J, Bustamante CD, Berger B, Tishkoff SA, Henn BM, Stoneking M, Reich D, Pakendorf B. The genetic prehistory of southern Africa.
We describe an application of this model to detecting natural selection in humans and dogs at Genomes Unzipped.
TreeMix 1.13 released in the download section.
- Merges a bugfix branch from Mikkel Schubert that fixes a compilation error many users were coming across. Many thanks to Mikkel for the fix!
- Transfer of codebase to Bitbucket
- Minor changes to error output: errors now start with 'ERROR' and warnings start with 'WARNING'.
TreeMix 1.12 released.
- Fixes a bug that caused the reported relative likelihoods to be incomparable between trees and graphs. Many thanks to Mait Metspalu and Mike DeGiorgio for working through this.
- Also adds a -seed option for setting the random seed from the command line
The TreeMix paper has been published in PLoS Genetics:
Pickrell JK and Pritchard JK, Inference of population splits and mixtures from genome-wide allele frequency data.
Release of version 1.11.
- Fixes a bug that sometimes caused crashes when using microsatellite data
Release of version 1.1.
- Allows input of microsatellite data. For a description of the microsatellite model, see the downloads section on Bitbucket (pdf)
- Allows incorporation of known migration events
- Other small bug fixes
Preprint: "The genetic prehistory of southern Africa" is available on arXiv. The new features in TreeMix described in this preprint will be available in the next release (estimated Sept. 2012).
Release of version 1.04.
- Forces migration edges to have weight less than 0.5
- Includes three- and four-population tests for treeness from Reich et al. 2009 (programs are called threepop and fourpop, respectively)
threepop and fourpop take standard TreeMix input. For example, run:
threepop -i input.gz -k 500
This will print f3 statistics for all triples of populations to stdout, and calculate standard errors in blocks of 500 SNPs. For example, running this on the test input files will give output like:
Estimating f_3 in 59 blocks of size 500
total_nsnp 29999 nsnp 29999
Dai;Han,Sardinian 0.00112445 0.000276542 4.06609
Han;Sardinian,Dai 0.000536062 0.000211323 2.53669
Sardinian;Han,Dai 0.0289054 0.000867602 33.3165
The line Sardinian;Han,Dai 0.0289054 0.000867602 33.3165 tells you that f3(Sardinian;Han,Dai) is ~0.03, with a standard error of 0.0009, which corresponds to a z-score of 33. For information on how to interpret these tests, see Reich et al. (2009).
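As a quick sanity check on output like this, the z-score in the last column is simply the f3 estimate divided by its standard error. The Python sketch below parses threepop-style lines and recomputes it. The function name parse_threepop is hypothetical, and the assumed line layout (a population triple followed by f3, standard error and z-score) is inferred from the sample output above rather than taken from the TreeMix source.

```python
def parse_threepop(text):
    """Parse threepop-style result lines into a list of dicts.

    Assumes each result line looks like 'C;A,B f3 se z' (inferred from the
    sample output); any other line is skipped.
    """
    results = []
    for line in text.splitlines():
        fields = line.split()
        # Result lines have four fields and a 'C;A,B' label in the first one.
        if len(fields) != 4 or ";" not in fields[0]:
            continue
        target, sources = fields[0].split(";")
        f3, se, z = (float(x) for x in fields[1:])
        results.append({
            "target": target,
            "sources": tuple(sources.split(",")),
            "f3": f3,
            "se": se,
            "z": z,
            "z_recomputed": f3 / se,  # z-score = estimate / standard error
        })
    return results


sample = """\
Estimating f_3 in 59 blocks of size 500
total_nsnp 29999 nsnp 29999
Dai;Han,Sardinian 0.00112445 0.000276542 4.06609
Han;Sardinian,Dai 0.000536062 0.000211323 2.53669
Sardinian;Han,Dai 0.0289054 0.000867602 33.3165
"""

rows = parse_threepop(sample)
for row in rows:
    print(row["target"], row["sources"], round(row["z_recomputed"], 2))
```

In Reich et al. (2009), a significantly negative f3(C;A,B) is taken as evidence that population C is admixed; all three values in this sample are positive, so this test gives no such signal here.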
Added a small script to convert stratified allele frequencies output from plink into TreeMix format. This will be incorporated into the next release, but for the moment must be downloaded separately. To run this, let's say you have data in plink format (e.g., data.bed, data.bim, data.fam) and a plink cluster file matching each individual to a population (data.clust).
Now you run:
plink --bfile data --freq --missing --within data.clust
gzip plink.frq.strat
plink2treemix.py plink.frq.strat.gz treemix.frq.gz
The file treemix.frq.gz can now be used as input for TreeMix.
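To make the conversion concrete, here is a minimal Python sketch of the kind of reshaping involved. It is not the plink2treemix.py script itself. It assumes that plink's stratified frequency report has the columns CHR, SNP, CLST, A1, A2, MAF, MAC and NCHROBS, and that TreeMix input consists of a header line of population names followed by one line per SNP giving comma-separated counts of the two alleles in each population. The function name strat_to_treemix and the toy rows are made up for illustration.

```python
def strat_to_treemix(lines):
    """Reshape stratified-frequency rows into TreeMix-style count lines.

    Assumed columns: CHR SNP CLST A1 A2 MAF MAC NCHROBS.
    """
    counts = {}  # SNP id -> {population: "A1 count,A2 count"}
    pops = []    # populations in order of first appearance
    for line in lines:
        fields = line.split()
        if not fields or fields[0] == "CHR":
            continue  # skip blank lines and the header row
        snp, pop = fields[1], fields[2]
        mac, nchrobs = int(fields[6]), int(fields[7])
        if pop not in pops:
            pops.append(pop)
        # A1 count is MAC; A2 count is the remaining observed chromosomes.
        counts.setdefault(snp, {})[pop] = f"{mac},{nchrobs - mac}"
    out = [" ".join(pops)]
    for per_pop in counts.values():
        # Assumption: a population with no row for a SNP is written as 0,0.
        out.append(" ".join(per_pop.get(p, "0,0") for p in pops))
    return out


example = [
    "CHR SNP CLST A1 A2 MAF MAC NCHROBS",
    "1 rs1 Han C T 0.25 5 20",
    "1 rs1 Dai C T 0.5 10 20",
    "1 rs2 Han A G 0.1 2 20",
    "1 rs2 Dai A G 0 0 18",
]
for row in strat_to_treemix(example):
    print(row)
```

For real data the input and output would be gzipped files (for example, opened with gzip.open in text mode) rather than in-memory lists.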
- small bug fixes
- removed an unnecessary header that sometimes caused compilation problems
- small bug fixes
- this is the first major release