HUMAnN Tutorial

HUMAnN (HMP Unified Metabolic Analysis Network) is a pipeline for efficiently and accurately determining the presence/absence and abundance of microbial pathways in a community from metagenomic data.

HUMAnN is available as a Bitbucket repository. For additional information, please refer to the HUMAnN paper.

We provide support for HUMAnN users through a Google group dedicated to HUMAnN. Feel free to post questions to the group directly or by emailing humann-users@googlegroups.com.




1. HUMAnN (bitbucket)

Please refer to the HUMAnN documentation for prerequisites, dependencies, and installation instructions.


1.1 Input

HUMAnN accepts the following input formats:

  • Tabular translated BLAST results (using blastx or USEARCH)
  • Mapping output in BAM format (from bowtie, bwa, etc.)
  • Tab-delimited text with one or more pre-quantified gene abundances (.tsv, .csv, .pcl)

1.2 Running HUMAnN

  • Place the input files in the /humann/input folder.
  • Run the following command from the /humann/ directory:

$ scons

This command runs HUMAnN on the files in the input directory and saves the output files under /humann/output/. For details on what each output file contains, please refer to the HUMAnN documentation.
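The two steps above can also be scripted. The following is a minimal sketch, assuming that `scons` is on the PATH and that `humann_dir` points at the HUMAnN directory; the function names `stage_inputs` and `run_humann` are hypothetical helpers and not part of HUMAnN.

```python
import shutil
import subprocess
from pathlib import Path


def stage_inputs(files, humann_dir):
    """Copy input files into <humann_dir>/input and return the copied paths."""
    input_dir = Path(humann_dir) / "input"
    input_dir.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy(f, input_dir)) for f in files]


def run_humann(humann_dir):
    """Run `scons` from the HUMAnN directory and list what is in output/.

    Raises subprocess.CalledProcessError if scons exits with an error.
    """
    subprocess.run(["scons"], cwd=humann_dir, check=True)
    return sorted((Path(humann_dir) / "output").glob("*"))


# Example (paths are placeholders):
#   stage_inputs(["my_sample.txt"], "/humann")
#   outputs = run_humann("/humann")
```

Copying rather than moving the inputs leaves the original files untouched, which makes it easier to rerun the pipeline from a clean input folder.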


2. Visualization using GraPhlAn

The following HUMAnN output files may be used as inputs for visualization with GraPhlAn:

  • 04b-*-(mpm or mpt)-*-graphlan_rings.txt (Annotation file)
  • 04b-*-(mpm or mpt)-*-graphlan_tree.txt (Tree)

2.1 Using GraPhlAn package (bitbucket)

Run the following commands to generate a cladogram using GraPhlAn:

$ graphlan_annotate.py --annot 04b-*-graphlan_rings.txt 04b-*-mpt-*-graphlan_tree.txt output_filename.xml

$ graphlan.py --dpi 200 output_filename.xml output_images/output_filename.png
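Because the commands above use wildcards, it can help to resolve each pattern to exactly one file before calling GraPhlAn. The sketch below only assembles the two command lines shown above; `single_match` and `graphlan_commands` are hypothetical helper names, and the file names in the usage comment are placeholders.

```python
import glob
import subprocess


def single_match(pattern):
    """Return the one file matching a wildcard pattern, or raise ValueError."""
    matches = glob.glob(pattern)
    if len(matches) != 1:
        raise ValueError(f"expected exactly one match for {pattern!r}, got {len(matches)}")
    return matches[0]


def graphlan_commands(rings, tree, xml_out, png_out, dpi=200):
    """Build the annotate and plot command lines as argument lists."""
    annotate = ["graphlan_annotate.py", "--annot", rings, tree, xml_out]
    plot = ["graphlan.py", "--dpi", str(dpi), xml_out, png_out]
    return annotate, plot


# Example (run from the HUMAnN output directory):
#   rings = single_match("04b-*-graphlan_rings.txt")
#   tree = single_match("04b-*-mpt-*-graphlan_tree.txt")
#   for cmd in graphlan_commands(rings, tree, "output_filename.xml",
#                                "output_images/output_filename.png"):
#       subprocess.run(cmd, check=True)
```

Failing early when a pattern matches zero or several files avoids passing an ambiguous argument list to `graphlan_annotate.py`.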

2.2 Using GraPhlAn module (Galaxy)

  • Go to the Huttenhower Lab Galaxy server, and click on the GraPhlAn link in the left pane.
  • Click the link Load input tree in the panel on the left, and select the output file from HUMAnN by clicking on the Choose file button. Press the Execute button to upload the file.
https://bitbucket.org/repo/49y6o9/images/686783234-load_graphlan.png
  • After the data has been uploaded, click on the Annotate tree link to add all the graphical features. Then, select the input file from the Input File drop-down menu. Specify the data fields according to the desired output, and press the Execute button when done. For example, the fields specified for a figure with leaf node names would be as follows:
    • Select the clades of interest from the list Select clade(s).
    • Enter * for the field Annotation Label.
    • Specify Clade leaf nodes from the drop-down menu Annotation Label Clade Selector.
annotate_graphlan.png
  • Click on the Add rings to the tree link, and select the annotated data from the previous step (instead of the raw input from the first step) from the Input Tree drop-down menu. Upload the 04b-*-graphlan_rings.txt file (found under /humann/output/) through the Get Data link (located in the LOAD DATA MODULE in the panel on the left). Select the 04b-*-graphlan_rings.txt file from the Ring input File drop-down menu, and press Execute.
https://bitbucket.org/repo/49y6o9/images/3206422826-rings_graphlan.png
  • To plot the final tree, click on the Plot tree link in the panel on the left, and select the output from the step above. Press Execute. To visualize the results, click on the Eye symbol next to the output file generated in the panel on the right.
plot_graphlan.png

For instructions on installing and using the GraPhlAn package or GraPhlAn Galaxy module, please refer to the GraPhlAn tutorial.


3. Further analysis

3.1 Using MaAsLin

  • Modifications
    • Select a file from the HUMAnN output folder (named 04b-*-mpt-*.txt or 04b-*-mpm-*.txt)
    • Open the file in Microsoft Excel or a text editor.
    • Remove the first column.
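As an alternative to editing the file by hand, the first column can be removed with a short script. This is a minimal sketch that assumes the HUMAnN output file is plain tab-delimited text; `drop_first_column` is a hypothetical helper name.

```python
def drop_first_column(src, dst):
    """Write a copy of a tab-delimited file without its first column."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            fields = line.rstrip("\r\n").split("\t")
            fout.write("\t".join(fields[1:]) + "\n")


# Example (file names are placeholders):
#   drop_first_column("humann_output.txt", "humann_output_for_maaslin.txt")
```

Writing to a new file keeps the original HUMAnN output intact in case the modification needs to be repeated.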

For instructions on how to use MaAsLin or the MaAsLin Galaxy module, please refer to the MaAsLin tutorial.

3.2 Using LEfSe

  • Modifications
    • Select a file from the HUMAnN output folder (named 04b-*-mpt-*.txt or 04b-*-mpm-*.txt)
    • Open the file in Microsoft Excel or a text editor.
    • Remove the first column.
    • Remove every metadata row (the InverseSimpson row and every row above it) except the top ID/NAME row and the class row (plus an optional subclass row).
    • Ensure that only one or two metadata rows remain below the ID/NAME row at the top.
  • Load data with LEfSe by clicking on the Choose file button. Select the modified output file, and click on the Execute button.
load_lefse.png
  • Once the data has been uploaded (the file will appear on the right-hand-side panel), proceed with formatting the data for LEfSe by clicking the link Format Data for LEfSe in the panel on the left, and selecting the data from the drop-down menu.
https://bitbucket.org/repo/49y6o9/images/3993626238-format_lefse.png
  • Follow the instructions to select the correct fields in the drop-down menus, and then click on the Execute button.
execute_lefse.png
  • Once the data is formatted, click on the LDA Effect Size link in the panel on the left. Select the formatted data from the Select Data drop-down menu and press Execute.
https://bitbucket.org/repo/49y6o9/images/1538644473-lda_lefse.png
  • The output generated from the step above can be used as input for the following plots:
    • Plot the LEfSe results using the Plot LEfSe Results link in the panel on the left.
    • Plot a Cladogram using the Plot Cladogram link in the panel on the left.
    • Plot Differential Features using the Plot Differential Features link in the panel on the left.
plot_lefse.png
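The file modifications listed at the start of this section can be sketched in code. The example below is an illustration under stated assumptions: the table has already been split into rows of tab-separated fields, each metadata row carries its label in one of the first two columns, and `InverseSimpson` marks the last row to remove. The function name `prepare_for_lefse` and the labels in the usage comment are hypothetical.

```python
def prepare_for_lefse(rows, keep_labels, last_metadata="InverseSimpson"):
    """Filter metadata rows and drop the first column.

    rows        -- list of rows, each a list of tab-separated fields
    keep_labels -- labels of the class (and optional subclass) rows to keep
    """
    header, body = rows[0], rows[1:]

    # Locate the last row to remove (assumed to be labelled InverseSimpson).
    cut = next((i for i, r in enumerate(body) if last_metadata in r[:2]), None)
    if cut is None:
        raise ValueError(f"no {last_metadata!r} row found")

    # Keep only the class/subclass rows among the rows up to that point.
    keep = set(keep_labels)
    kept_meta = [r for r in body[: cut + 1] if keep & set(r[:2])]
    if not 1 <= len(kept_meta) <= 2:
        raise ValueError("expected one or two metadata rows to remain")

    # Reassemble the table and remove the first column from every row.
    result = [header] + kept_meta + body[cut + 1 :]
    return [r[1:] for r in result]


# Example (labels are placeholders):
#   with open("humann_output.txt") as fh:
#       rows = [line.rstrip("\r\n").split("\t") for line in fh]
#   cleaned = prepare_for_lefse(rows, {"my_class_label"})
```

Matching labels before dropping the first column means the sketch works whether the label sits in the ID column, the NAME column, or both.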

For further instructions on how to use the LEfSe Bitbucket repository or the LEfSe Galaxy module, please refer to the LEfSe tutorial.


Notes

For more information, please refer to the HUMAnN documentation.
