enterobase-web / EnteroBase Backend Pipeline: matrix_phylogeny

matrix_phylogeny

Overview

The matrix_phylogeny pipeline runs phylogeny_workflow.py with only the phylogeny option, which in turn runs RAxML to compute a maximum likelihood phylogenetic tree.
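As a rough illustration of the RAxML step, the sketch below builds a command line for a RAxML 8 run. It is not taken from phylogeny_workflow.py: the function names, the binary name raxmlHPC, the GTRGAMMA model and the seed value are all assumptions, and the options the pipeline actually passes are not documented here. The flags -s, -n, -m and -p are standard RAxML 8 options.

```python
def raxml_command(phylip_path, run_name, model="GTRGAMMA", seed=12345,
                  binary="raxmlHPC"):
    """Build an argument list for a basic RAxML 8 maximum likelihood search.

    Assumptions (not confirmed by the pipeline documentation): the binary
    name, the substitution model and the random seed.
    """
    return [
        binary,
        "-s", phylip_path,   # input alignment in PHYLIP format
        "-n", run_name,      # suffix used for RAxML output file names
        "-m", model,         # substitution model (assumed here)
        "-p", str(seed),     # seed for the parsimony starting tree
    ]


def best_tree_filename(run_name):
    """Name of the file in which RAxML 8 writes the best-scoring ML tree."""
    return "RAxML_bestTree." + run_name
```

The returned list could be passed to subprocess.run(..., check=True) on a machine where RAxML is installed; the best tree would then be read from the file named by best_tree_filename.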

The matrix_phylogeny pipeline is normally run as the final step of a workflow that computes a SNP tree.

The matrix_phylogeny pipeline is currently at version 1.0.

Summary

The matrix_phylogeny pipeline runs phylogeny_workflow.py with the phylogeny option. The input is a SNP matrix file produced by a previous run of the refMapper_matrix pipeline. (The SNP matrix file documents the mutations that refMapper found in each genome assembly relative to a reference genome assembly.)

The SNP matrix is read and parsed, and a PHYLIP file covering the genome assemblies is written as input for RAxML. PHYLIP is an alignment file format that originated with the PHYLIP phylogeny inference package. Here the PHYLIP file represents the concatenated sequence, taken from the contigs of the genome assemblies, of the positions where at least one assembly shows variation.

RAxML (version 8.2.4) is then run to compute a maximum likelihood tree for all of the genome assemblies represented in the PHYLIP file. The tree produced by RAxML is rooted using the ETE 3 toolkit so that the root node splits the tree into two branches that are balanced in terms of node distances. The final tree is output in Newick format and can be downloaded by the EnteroBase user who requested the SNP tree.
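The step from SNP calls to a PHYLIP file can be sketched as follows. This is a minimal illustration, not the pipeline's code: the in-memory layout (a dict mapping each assembly name to a string of base calls over the same positions) and both function names are hypothetical, and the real SNP matrix file format is not reproduced here. The output uses the relaxed PHYLIP layout (a header with the number of sequences and sites, then one "name sequence" line per assembly), which RAxML accepts.

```python
def variable_sites(calls):
    """Keep only the positions where at least two assemblies differ.

    `calls` is a hypothetical representation: assembly name -> string of
    base calls, all strings covering the same positions in the same order.
    """
    names = list(calls)
    if not names:
        raise ValueError("no assemblies given")
    n_sites = len(calls[names[0]])
    if any(len(calls[name]) != n_sites for name in names):
        raise ValueError("all assemblies must cover the same positions")
    keep = [i for i in range(n_sites)
            if len({calls[name][i] for name in names}) > 1]
    return {name: "".join(calls[name][i] for i in keep) for name in names}


def to_relaxed_phylip(sequences):
    """Return relaxed PHYLIP text for a dict of name -> aligned sequence."""
    if not sequences:
        raise ValueError("no sequences given")
    lengths = {len(seq) for seq in sequences.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    lines = ["{} {}".format(len(sequences), lengths.pop())]
    for name, seq in sequences.items():
        if any(ch.isspace() for ch in name):
            raise ValueError("names must not contain whitespace: %r" % name)
        lines.append("{} {}".format(name, seq))
    return "\n".join(lines) + "\n"


# Toy example with three assemblies and four positions.
example_calls = {"ref": "ACGT", "asm1": "ACGA", "asm2": "TCGT"}
example_phylip = to_relaxed_phylip(variable_sites(example_calls))
```

In the toy example only the first and last positions vary, so each assembly contributes a two-character sequence to the PHYLIP text.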
