Miscellaneous tools and techniques

1. Manipulating Human Microbiome Project (HMP) data

This tutorial illustrates how to combine the publicly available metadata with the existing Human Microbiome Project (HMP) data tables.

1.2 Obtaining the metadata

Obtain the metadata from the dbGaP project's website. You will need to register at dbGaP to obtain the metadata XML file.

For information about what the file contains, please refer to the tutorial. Further instructions about download and registration are available online.

Alternatively, you may access the publicly available metadata (a subset of the dbGaP metadata mentioned above) from the HMPDACC Data Browser here. The link features three files, each of which maps every sample to its sex, body site, and sub body site, along with other information.

1.3 Combining metadata with data tables

  • Once you have the data table and the metadata XML file from dbGaP, please download and extract this folder. The scripts require Python and make to be installed. This process should work as described on Linux, or on Windows using Cygwin.
  • The README file in the folder contains information about using the scripts to generate tab-delimited metadata tables from the dbGaP input XML files, which can then be mapped to the HMP data tables mentioned above.
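The mapping step itself can be pictured as a join on the sample identifier. The following is only a minimal sketch and not the scripts shipped in the folder: the column name `sample_id`, the helper names `read_tsv` and `attach_metadata`, and the tiny tables are all made up for illustration. It also assumes one sample per row; a table that stores samples as columns would need to be transposed first.

```python
import csv
import io


def read_tsv(handle):
    """Read a tab-delimited table into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle, delimiter="\t"))


def attach_metadata(data_rows, metadata_rows, key="sample_id"):
    """Append metadata columns to each data row that shares the same key.

    Data rows without a matching metadata row are kept, with the
    metadata fields left empty.
    """
    by_sample = {row[key]: row for row in metadata_rows}
    # Metadata columns to add, excluding the join key itself.
    extra = [c for c in (metadata_rows[0] if metadata_rows else {}) if c != key]
    merged = []
    for row in data_rows:
        meta = by_sample.get(row[key], {})
        combined = dict(row)
        for col in extra:
            combined[col] = meta.get(col, "")
        merged.append(combined)
    return merged


# Hypothetical stand-ins for a metadata table and a data table.
metadata_tsv = (
    "sample_id\tsex\tbody_site\n"
    "S1\tfemale\tStool\n"
    "S2\tmale\tTongue_dorsum\n"
)
data_tsv = (
    "sample_id\tBacteroides\tStreptococcus\n"
    "S1\t0.61\t0.02\n"
    "S2\t0.03\t0.44\n"
    "S3\t0.10\t0.10\n"
)

metadata = read_tsv(io.StringIO(metadata_tsv))
data = read_tsv(io.StringIO(data_tsv))
merged = attach_metadata(data, metadata)

for row in merged:
    print("\t".join(row.values()))
```

With real files, the `io.StringIO` objects would be replaced by files opened with `open(path, newline="")`. If a metadata column has the same name as a data column, the metadata value overwrites the data value in this sketch, so such columns would need renaming beforehand.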

If you have any questions, please feel free to contact us.