This repository contains the scripts used to conduct the analyses presented in:

CW Dunn, M Howison, and F Zapata (2013) Agalma: an automated phylogenomics workflow. BMC Bioinformatics 14:330. doi:10.1186/1471-2105-14-330.

Using the scripts


These scripts require Agalma and its dependencies. Agalma version 0.3.4 was used for the published analyses.

On May 9, 2014, this repository was updated with scripts that run the analysis with Agalma version 0.4.0. The resulting report differs from the one published with the paper because of numerous improvements made to Agalma between the 0.3.4 and 0.4.0 releases. In particular, supermatrix occupancy is much higher in the 0.4.0 analysis.

This summary table provides exact git commits for each analysis:

Analysis     Agalma version    Scripts commit
Published    0.3.4 (dc549d2)   e930b2e
2014-05-09   0.4.0 (1fe8064)   4a1dfc4


The analysis is split into a series of scripts. To reproduce the published analysis, execute the scripts in sequence:

sh 00-catalog.sh
sh 01-assemble-Abylopsis.sh
sh 02-assemble-Agalma.sh
sh 03-assemble-Nanomia.sh
sh 04-assemble-Physalia.sh
sh 05-assemble-Prayidae.sh
sh 06-load-Nematostella.sh
sh 07-load-Hydra.sh
sh 08a-phylogeny.sh
sh 08b-phylogeny.sh
sh 09-report.sh

Scripts 01 to 07 can be run concurrently. The scripts include, as comments, commands for executing the analyses via the SLURM job scheduler installed on the OSCAR cluster at Brown University. Because these SLURM commands are comments, they are ignored when the scripts are run without a job scheduler. If you are using a job scheduler, you will need to edit these commands to match the configuration of your own system.
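The ordering described above can be sketched as a small driver function for a machine without a job scheduler. This is only an illustration: run_pipeline is a hypothetical helper that is not part of this repository, and it assumes that all of the scripts sit in a single directory and that the machine has enough resources to run scripts 01 to 07 at the same time.

```sh
#!/bin/sh
# Hypothetical driver, not part of the repository.
# Assumes every script lives in the directory given as the first
# argument (default: the current directory).
run_pipeline() (
    set -e                  # stop at the first failing step
    cd "${1:-.}"

    # Step 00 must finish before anything else starts.
    sh 00-catalog.sh

    # Scripts 01 to 07 can run concurrently: start each one in the
    # background and remember its process ID.
    pids=""
    for script in \
        01-assemble-Abylopsis.sh 02-assemble-Agalma.sh \
        03-assemble-Nanomia.sh 04-assemble-Physalia.sh \
        05-assemble-Prayidae.sh 06-load-Nematostella.sh \
        07-load-Hydra.sh
    do
        sh "$script" &
        pids="$pids $!"
    done

    # Wait for each background job; a non-zero exit status from any
    # of them aborts the pipeline because of set -e.
    for pid in $pids; do
        wait "$pid"
    done

    # The remaining steps run in sequence.
    sh 08a-phylogeny.sh
    sh 08b-phylogeny.sh
    sh 09-report.sh
)
```

After defining the function, calling run_pipeline with no argument runs the scripts in the current directory. The function body runs in a subshell, so the directory change and set -e do not affect the calling shell.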