MaAsLin Tutorial

MaAsLin is a multivariate statistical framework that finds associations between clinical metadata and potentially high-dimensional experimental data. MaAsLin fits boosted additive general linear models that relate one group of data (the metadata, i.e. the predictors) to another group (in our case relative taxonomic abundances, i.e. the response).

MaAsLin is available as a Galaxy module, a Homebrew formula, and a Docker image, and is included in bioBakery (VM and cloud). For additional information, refer to the manuscript: Morgan XC, Tickle TL, Sokol H, Gevers D, Devaney KL, Ward DV, Reyes JA, Shah SA, LeLeiko N, Snapper SB, Bousvaros A, Korzenik J, Sands BE, Xavier RJ, Huttenhower C. Dysfunction of the intestinal microbiome in inflammatory bowel disease and treatment. Genome Biol. 2012 Apr 16;13(9):R79.

We provide support for MaAsLin users. Please join our Google group for MaAsLin users. You can ask questions by posting to the group directly or by emailing maaslin-users@googlegroups.com.



Overview

The following figure shows the workflow for MaAsLin.

MaAsLin.png

1. MaAsLin (Galaxy module)

MaAsLin requires a microbial abundance table with metadata attached (e.g., maaslin_demo2.pcl).

  • Go to the Huttenhower Galaxy Server: http://huttenhower.sph.harvard.edu/galaxy/.
  • Click on the Get Data -> Upload File link on the left pane and upload the demo file maaslin_demo2.pcl: click on the Browse button, select the demo file, set the format to tabular, and press the Start button.
https://bitbucket.org/repo/49y6o9/images/3266810747-Screenshot%20from%202017-09-01%2018-41-15.png
  • Click on the MaAsLin link on the left pane.
    • Select the input data from the Input file drop-down menu.
    • Specify the last metadata row in the file, after which the microbial species are listed (Weight in our sample dataset).
    • Click on Execute.
https://bitbucket.org/repo/49y6o9/images/2497207550-Screenshot%20from%202017-09-05%2017-58-38.png

The results will appear on the right pane. You can view them (by clicking on the Eye symbol) or download them to your computer (by clicking on the Save symbol).

2. MaAsLin (Homebrew/Docker/VM)

MaAsLin can be installed with Homebrew or run from a Docker image. Please note, if you are using bioBakery (Vagrant VM or cloud) you do not need to install MaAsLin because the tool and its dependencies are already installed.

Install with Homebrew: $ brew install biobakery/biobakery/maaslin

Run with Docker: $ docker run -it biobakery/maaslin bash

If you would like to install from source, refer to the MaAsLin user manual for the dependencies and installation instructions.

2.1 Merging Metadata with Microbial abundance tables (optional)

Once you have obtained the microbial abundance tables through MetaPhlAn2 (see the MetaPhlAn2 tutorial for details) or other tools, MaAsLin allows you to determine associations between the microbial abundances and the metadata.

This requires you to attach the metadata of the samples to the microbial abundance tables resulting in a table with the format of the following example file: maaslin_demo2.pcl.

Follow the instructions below to attach metadata to your microbial abundance tables. If you attach the metadata with a different method, you may skip to the next section. The only requirement is that the metadata is listed first in the file (before the microbial abundance table), in the format shown by the sample input data.

Please ensure that the format of your input files follows that of the demo files above.

  • Run the following command to create the merged file:
$ merge_metadata.py maaslin_demo_metadata.metadata < maaslin_demo_measurements.pcl > maaslin_demo2.pcl
  • The resulting file is maaslin_demo2.pcl. You may now use this file as an input for MaAsLin.
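To make the merged layout concrete, here is a minimal Python sketch of the idea: metadata rows are stacked above the abundance rows, with values aligned by sample ID. It is not the implementation of merge_metadata.py. The function name merge_metadata_rows and the toy data are hypothetical, and the sketch assumes both inputs are tab-delimited with sample IDs in the first row. The real metadata file may be laid out differently (for example, with samples in rows), so compare against the demo files before relying on it.

```python
import csv
import io


def merge_metadata_rows(metadata_text, abundance_text):
    """Stack metadata rows above abundance rows, aligned by sample ID.

    Assumption of this sketch (not taken from merge_metadata.py): both inputs
    are tab-delimited, the first row holds sample IDs, and every following
    row holds one metadata field or one taxon.
    """
    meta = [r for r in csv.reader(io.StringIO(metadata_text), delimiter="\t") if r]
    abun = [r for r in csv.reader(io.StringIO(abundance_text), delimiter="\t") if r]

    samples = abun[0][1:]  # sample order is taken from the abundance table
    # Map each sample ID to its column in the metadata table (skip the label cell).
    meta_col = {s: i for i, s in enumerate(meta[0]) if i > 0}

    merged = [abun[0]]
    for row in meta[1:]:
        # Reorder metadata values to match the abundance table; use "NA"
        # for samples that have no metadata column.
        values = [row[meta_col[s]] if s in meta_col else "NA" for s in samples]
        merged.append([row[0]] + values)
    merged.extend(abun[1:])  # abundance rows follow the metadata rows
    return "\n".join("\t".join(r) for r in merged) + "\n"


# Toy inputs (hypothetical); note the metadata lists the samples in a different order.
metadata = "sample\tS2\tS1\nAge\t40\t25\nWeight\t80\t61\n"
abundance = "sample\tS1\tS2\nBacteria\t1.0\t1.0\nBacteria|Firmicutes\t0.6\t0.3\n"
merged = merge_metadata_rows(metadata, abundance)
print(merged, end="")
```

In the printed table the Age and Weight rows come first, reordered to the S1, S2 column order of the abundance table, followed by the Bacteria rows. This is the "metadata first, abundances after" layout that MaAsLin expects.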

2.2 Creating a .read.config file

The .read.config file determines which rows/columns to process without modifying the input metadata-merged-microbial abundance table. A sample .read.config file (maaslin_demo2.read.config) is shown below:

Matrix: Metadata
Read_PCL_Rows: -Weight

Matrix: Abundance
Read_PCL_Rows: Bacteria-

The text above dictates that the Metadata matrix ends when the row starts with Weight, while the Abundance matrix starts when the word Bacteria appears in the row. For more examples, please refer to the MaAsLin documentation.
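The row ranges described above can be illustrated with a short Python sketch. It mimics only the behaviour described in this section ("-Weight" selects rows up to and including Weight, "Bacteria-" selects rows from the first Bacteria row onward) and is not MaAsLin's own config parser. The function name split_pcl_rows is hypothetical, and matching on the start of the row name is an assumption of the sketch.

```python
def split_pcl_rows(row_names, metadata_end="Weight", abundance_start="Bacteria"):
    """Split PCL row names into (metadata_rows, abundance_rows).

    row_names: names of the data rows, excluding the header row of sample IDs.
    Mirrors the demo config: "Read_PCL_Rows: -Weight" and
    "Read_PCL_Rows: Bacteria-". Prefix matching is an assumption of this sketch.
    """
    # Index of the last metadata row (first row whose name starts with metadata_end).
    end = next(i for i, name in enumerate(row_names) if name.startswith(metadata_end))
    # Index of the first abundance row (first row whose name starts with abundance_start).
    start = next(i for i, name in enumerate(row_names) if name.startswith(abundance_start))
    return row_names[:end + 1], row_names[start:]


# Hypothetical row names in the order they appear in a merged PCL file.
rows = ["Age", "Sex", "Weight", "Bacteria", "Bacteria|Firmicutes"]
metadata_rows, abundance_rows = split_pcl_rows(rows)
print(metadata_rows)   # ['Age', 'Sex', 'Weight']
print(abundance_rows)  # ['Bacteria', 'Bacteria|Firmicutes']
```

Because the two ranges are defined independently, the input table itself never needs to be edited. Changing which rows are treated as metadata only requires changing the boundary names in the .read.config file.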

2.3 Running MaAsLin

Once you have the metadata-merged-microbial abundance table, and the .read.config file (see the samples to ensure the format), you are ready to run MaAsLin.

  • Place the .read.config file (e.g. maaslin_demo2.read.config), and the metadata-merged-microbial abundance table (e.g. maaslin_demo2.pcl) in your current working directory.
  • Run the following command: $ Maaslin.R maaslin_demo2.pcl output -i maaslin_demo2.read.config

The above command will create a directory: output, which will contain the results. An example biplot figure (generated from a larger data set than the demo) is shown below:

maaslin_demo2-biplot (1).jpg
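If you want to script the run step from section 2.3, for example to process several tables in a loop, a thin Python wrapper is one option. The sketch below only assembles and launches the same Maaslin.R command used in this tutorial. The helper names build_maaslin_command and run_maaslin are hypothetical, and the sketch assumes Maaslin.R is on your PATH.

```python
import subprocess
from pathlib import Path


def build_maaslin_command(pcl, output_dir, read_config):
    """Assemble the argument list for the Maaslin.R call used in section 2.3."""
    return ["Maaslin.R", str(pcl), str(output_dir), "-i", str(read_config)]


def run_maaslin(pcl, output_dir, read_config):
    """Check that both input files exist, then run MaAsLin.

    Assumes Maaslin.R is installed and on PATH; raises CalledProcessError
    if the command exits with a non-zero status.
    """
    for path in (pcl, read_config):
        if not Path(path).is_file():
            raise FileNotFoundError(f"missing input file: {path}")
    subprocess.run(build_maaslin_command(pcl, output_dir, read_config), check=True)
    return Path(output_dir)


if __name__ == "__main__":
    # Print the command for the demo files without executing it.
    cmd = build_maaslin_command("maaslin_demo2.pcl", "output", "maaslin_demo2.read.config")
    print(" ".join(cmd))
```

Checking for the input files first gives a clearer error than a failed R invocation when the .pcl or .read.config file is not in the current working directory.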

Notes

For more information please refer to the MaAsLin documentation.
