Formatting your data for diXa using ISAcreator
You will need the following:
a. Visit the ISA tools page, scroll down and click on your operating system icon under ISAcreator
b. Download and unzip/install the application
The diXa configurations
a. Download the latest
b. Unzip this archive (it contains xml files with the configurations for each assay type)
c. Move the unzipped folder into the Configurations folder of your ISAcreator unzip/install directory (from step 1b). You should now have a path like
Note: never add any other files to the configuration folder; otherwise ISAcreator will not recognize it as a valid configuration.
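A quick way to check the configuration folder for stray files is sketched below. It assumes that every legitimate configuration file ends in .xml (the archive is described above as containing xml files); the function name and the folder path you pass in are hypothetical.

```python
from pathlib import Path


def unexpected_config_files(config_dir):
    """Return the names of entries in config_dir that are not .xml files.

    Assumption: a valid diXa configuration folder holds only .xml files.
    """
    folder = Path(config_dir)
    return sorted(
        entry.name
        for entry in folder.iterdir()
        if not (entry.is_file() and entry.suffix.lower() == ".xml")
    )
```

An empty result means nothing unexpected was found. Hidden files created by your operating system would also be reported, which is usually what you want here.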
Mapping your existing annotation to ISA-tab
You need to use the "normal" version of ISAcreator (middle option) for this exercise. After you select a configuration (use the diXa-specific configuration; see requirements above) and log in, select "create new experiment description" and then "map from existing file".
Select your spreadsheet file. Supported formats are
xls. In case you have another tabular extension, just rename it to
Make sure your file does not contain blank columns or special characters (such as μ).
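If you would like to check this before starting the mapping, the sketch below scans a table for blank columns and non-ASCII characters. It assumes the spreadsheet has already been read into a list of rows (for example from a text export), and it treats any non-ASCII character as "special", which may be stricter than ISAcreator requires. The function name is hypothetical.

```python
def spreadsheet_problems(rows):
    """Report blank columns and non-ASCII cells in a table.

    rows is a list of lists; the first row is the header.
    Assumption: "special characters" means anything outside ASCII.
    """
    problems = []
    width = max((len(row) for row in rows), default=0)

    # A column is blank when every cell in it, header included, is empty.
    for col in range(width):
        cells = [
            str(row[col]).strip() if col < len(row) and row[col] is not None else ""
            for row in rows
        ]
        if not any(cells):
            problems.append(f"column {col + 1} is blank")

    # Flag every cell that contains a non-ASCII character such as μ.
    for row_no, row in enumerate(rows, start=1):
        for col_no, cell in enumerate(row, start=1):
            text = "" if cell is None else str(cell)
            if not text.isascii():
                problems.append(
                    f"row {row_no}, column {col_no}: non-ASCII text {text!r}"
                )
    return problems
```

For example, a table with an empty second column and a cell containing "5 μg" would produce two entries, one for each problem.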
Select the kind of assay performed on your data and click "+add assay". Here we have selected "transcription profiling using DNA microarray on Affymetrix".
Now you need to map columns in your spreadsheet to their counterparts in the ISA-tab configuration you have selected, starting with the sample annotations. You can map to a column, a literal (a fixed string), or a combination of these. Required columns are shown in red. In the example below we have mapped the ISA-tab Subject ID column to a concatenation of GROUP_ID, a dash (-), and the INDIVIDUAL_ID from our spreadsheet.
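The effect of this mapping (column, literal, column) can be illustrated outside ISAcreator as well. The sketch below is only an illustration of the concatenation, not ISAcreator's own code; the row values are made up.

```python
def subject_id(row):
    """Build a Subject ID as GROUP_ID + "-" + INDIVIDUAL_ID."""
    return f"{row['GROUP_ID']}-{row['INDIVIDUAL_ID']}"


# Hypothetical spreadsheet rows, one dict per row.
rows = [
    {"GROUP_ID": "G1", "INDIVIDUAL_ID": "007"},
    {"GROUP_ID": "G2", "INDIVIDUAL_ID": "012"},
]
subject_ids = [subject_id(row) for row in rows]
```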
TIP: You can use the magnifying glass icon to peek into your spreadsheet.
Do the same for assay annotations.
Once you are finished mapping, fill out any other useful information about the Investigation you have just mapped. By default, the Study Identifier field will read Mapped Study. You may want to replace that with something more descriptive.
Once you are done, you can save your ISA-tab investigation as a zip archive. Your data files need to be located in the study folder created by ISAcreator (by default in ISAcreator/isatab files/"investigation name"). The filenames must match the names displayed in the Array Data File field. Also, make sure you use the same Sample Names in ISAcreator as the ones in your data files! Otherwise there is no way to link your annotations to the actual data.
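To confirm that every file named in the Array Data File field is actually present in the study folder, a small check like the following can help. It is a sketch: the function name is hypothetical, and you would supply the study folder path and the list of names yourself.

```python
from pathlib import Path


def missing_data_files(study_dir, array_data_files):
    """Return the listed file names that are not present in study_dir.

    The comparison is exact, so differences in case or extension count
    as a mismatch.
    """
    present = {p.name for p in Path(study_dir).iterdir() if p.is_file()}
    return sorted(set(array_data_files) - present)
```

An empty list means every listed file was found directly inside the study folder.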
For a list of supported data formats, refer to the end of the document.
Go to File -> Create ISArchive. Your investigation will be validated against your configuration; once it passes, it will be saved as a zip archive in the location you have selected. This archive contains your annotations in ISA-tab format, as well as any data files you have linked to your study (such as tables or CEL files).
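If you want to look inside the resulting archive before submitting it, the sketch below groups its contents by file type. It assumes the usual ISA-tab naming convention (i_*.txt for the investigation file, s_*.txt for study files, a_*.txt for assay files); anything else, such as data files, is listed under "other". The function name is hypothetical.

```python
import zipfile
from pathlib import PurePosixPath


def summarize_isarchive(zip_path):
    """Group the files in a zip archive by assumed ISA-tab role."""
    summary = {"investigation": [], "study": [], "assay": [], "other": []}
    prefixes = {"i_": "investigation", "s_": "study", "a_": "assay"}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            base = PurePosixPath(name).name
            if base.endswith(".txt"):
                kind = prefixes.get(base[:2], "other")
            else:
                kind = "other"
            summary[kind].append(name)
    return summary
```

An archive with no investigation file, or with fewer data files than you expect, is worth checking again in ISAcreator.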
Starting from scratch
Start the "normal" version of ISAcreator (middle option), select a configuration (use the diXa-specific configuration; see requirements section above) and log in. Select "create new experiment description", then "create manually".
First, create your first study and name it. Later, you will be able to create more studies and add them to your investigation.
Next you need to register the main information about the study: an ID, title, description, and submission and public release dates. Then create as many array types as were used on your study’s samples. In the screen below, we have set up a study with only one assay: microarray transcription profiling. You can add or delete an array at any time.
You will then have to register all the required information about the study itself (containing information describing the samples used) and the arrays (containing the metadata describing all the different factors applied to the study samples). Whenever possible, use the recommended ontology terms to describe the factors of your experiments, as described in the following example for tagging Homo sapiens:
For specifying factor levels you get a screen where you can enter all levels used for this factor and their units (here mapped to OBI:microgram). If no units are required, uncheck the box "use unit?" at the top.
To fill out the different fields, you can either enter data manually (in the case of small studies) or copy/paste from a spreadsheet program (like Excel).
Make sure you use the same Sample Names in ISAcreator as the ones in your data files! Otherwise there is no way to link your annotations to the actual data.
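One way to catch mismatches early is to compare the two sets of names directly. The sketch below assumes you have already collected the Sample Names from your annotations and from your data files into two lists; how you extract them depends on your data format, and the function name is hypothetical.

```python
def compare_sample_names(annotation_names, data_names):
    """Report sample names that appear on only one side.

    The comparison is exact: case and whitespace differences count
    as mismatches.
    """
    annotated = set(annotation_names)
    in_data = set(data_names)
    return {
        "only_in_annotation": sorted(annotated - in_data),
        "only_in_data": sorted(in_data - annotated),
    }
```

Both lists in the result should be empty before you create the archive.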
Now you can create an ISArchive as before (File -> Create ISArchive).
Using the Excel templates
You will need the ISA-config-diXa2.x-template.xlsx file downloadable here.
This is an export of the diXa ISA configuration in Excel. It follows the same principles for data collection (i.e. describing your Investigation, Study and Assay parameters) as the methods above. If you are having major difficulties with ISAcreator, pick this method.
You’ll notice that headers contain a red triangle in the top right corner. Hover over them, and you’ll see an explanation of what you need to enter in that column.
Make sure you fill out the studySample tabs AND at least one assay tab (for example, transcription_micro for transcriptomics microarray experiments) for each experiment you wish to submit.
When you’re done, please place your Excel sheet in a folder with your data and zip everything up. Name the zip archive the same as your experiment ID. This is the archive you must submit.
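The zipping step can also be scripted. The sketch below uses Python's standard shutil.make_archive to zip the contents of a folder into an archive named after the experiment ID; the function name, folder path and experiment ID are placeholders you would replace with your own.

```python
import shutil
from pathlib import Path


def zip_submission(folder, experiment_id, output_dir="."):
    """Zip the contents of folder into <output_dir>/<experiment_id>.zip.

    Returns the path of the archive that was written. Choose an
    output_dir outside folder so the archive does not include itself.
    """
    base_name = Path(output_dir) / experiment_id
    return shutil.make_archive(str(base_name), "zip", root_dir=folder)
```

The folder should already contain both the filled-in Excel sheet and your data files when you run this.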
Make sure you use the same Sample Names in your Excel sheet as the ones in your data files! Otherwise there is no way to link your annotations to the actual data.
Supported raw data formats
- Illumina QSEQ
- AB Sciex Wiff
- AB Sciex t2d
- Agilent QTOF
- Bruker MALDI
- Bruker YEP
- Ciphergen XML
- Spectrometry Binary Format (.sbf)
- Spectrometry Text Format (.txt, .stf)
- Two Column Files (.two)
- MSP files
- CEL Files
Supported processed data formats
(i.e. derived/condensed data with or without auxiliary data, e.g., p-values)
- Affymetrix CHP
- Geo SOFT
- ABS (Genedata format)
- GDA (Genedata Format)