Main development repository for Galaxy. Active development happens here, and this repository is thus intended for those working on Galaxy development. See http://bitbucket.org/galaxy/galaxy-dist/ for a more stable repository intended for end-users.
$ hg clone http://bitbucket.org/galaxy/galaxy-central/
NOTE: The ENCODE dataset tool is deprecated. Its datasets come from the ENCODE pilot project and are now outdated. The tool will be replaced with a new "Data Library" containing ENCODE data.
Adding Additional ENCODE Datasets
To add datasets to the ENCODE import tool, edit the file /cache/encode_datasets/encode_datasets.loc, which is located on g2.bx.psu.edu.
Currently, only files adhering to the Browser Extensible Data (BED) format are allowed.
Once you have added your datasets, the Galaxy server must be reset so that it can be made aware of the changes.
Format of encode_datasets.loc
- Tab-delimited file
- There are 5 required fields
- Lines beginning with # are ignored
Description of Fields
First Field
- Abbreviation of the ENCODE group to which the data belongs
- Valid abbreviations are as follows:
  - CC = Chromatin and Chromosomes
  - GT = Genes and Transcripts
  - MSA = Multi-species Sequence Analysis
  - TR = Transcription Regulation
Second Field
- Database build for which the data is valid
- Examples:
  - hg17
  - hg16
Third Field
- Description of the dataset
- This is displayed on the tool's selection page and in the history
Fourth Field
- A unique ID for the dataset
- Any combination of letters and/or numbers is acceptable
  - The exception is the keyword None: do not use it, or your data won't be accessible
- Make sure that the ID you select is different from every other ID in the file
  - Otherwise, one of the datasets sharing that ID will be unknown to the tool
Fifth Field
- The full path, including the file name, of the dataset you are adding
- This file must be accessible to the Galaxy Server
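The five fields described above can be read with a few lines of Python. This is a minimal sketch, not Galaxy's own loader: the names EncodeEntry, parse_loc_line and VALID_GROUPS are hypothetical, and it assumes that blank lines are skipped, that any fields beyond the fifth are ignored, and that an unknown group abbreviation should be treated as an error.

```python
from typing import NamedTuple, Optional

# Group abbreviations listed under "First Field".
VALID_GROUPS = {"CC", "GT", "MSA", "TR"}


class EncodeEntry(NamedTuple):
    """One data line of encode_datasets.loc (hypothetical helper type)."""
    group: str
    build: str
    description: str
    dataset_id: str
    path: str


def parse_loc_line(line: str) -> Optional[EncodeEntry]:
    """Parse one line; return None for comment lines and blank lines."""
    stripped = line.rstrip("\r\n")
    # Lines beginning with # are ignored; skipping blank lines is an assumption.
    if not stripped.strip() or stripped.startswith("#"):
        return None
    fields = stripped.split("\t")
    if len(fields) < 5:
        raise ValueError(f"expected 5 tab-separated fields, got {len(fields)}")
    # Assumption: anything after the fifth field is ignored.
    entry = EncodeEntry(*fields[:5])
    if entry.group not in VALID_GROUPS:
        raise ValueError(f"unknown group abbreviation: {entry.group!r}")
    if entry.dataset_id == "None":
        raise ValueError("the keyword None must not be used as a dataset ID")
    return entry
```

Splitting only on tab characters matters because the description field may contain spaces.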
An Example Entry
You want to add a dataset with the following characteristics:
- Belongs in the Chromatin and Chromosomes group
- Is based on the hg17 build
- Has the description "Some really cool data"
- Is located at /cache/encode_datasets/encodeData1.bed, a path accessible to the Galaxy server
- You checked, and double-checked, that the ID you want, encodeCCReallyCoolData, hasn't been taken yet
The entry would look like this (fields are separated by tab characters):
CC	hg17	Some really cool data	encodeCCReallyCoolData	/cache/encode_datasets/encodeData1.bed
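Because tabs and spaces are easy to confuse in an editor, it can be safer to generate the entry line than to type it. The sketch below builds the example entry in Python; format_loc_entry is a hypothetical helper, not part of Galaxy, and rejecting tabs or newlines inside a field is an assumption made to keep the line at exactly five fields.

```python
def format_loc_entry(group, build, description, dataset_id, path):
    """Join the five fields with single tab characters (hypothetical helper)."""
    fields = [group, build, description, dataset_id, path]
    for value in fields:
        # Assumption: a tab or newline inside a field would corrupt the entry.
        if "\t" in value or "\n" in value:
            raise ValueError(f"field must not contain tabs or newlines: {value!r}")
    return "\t".join(fields)


line = format_loc_entry(
    "CC",
    "hg17",
    "Some really cool data",
    "encodeCCReallyCoolData",
    "/cache/encode_datasets/encodeData1.bed",
)
print(line.count("\t"))  # 4 tabs separate the 5 fields
```

The resulting line can then be appended to encode_datasets.loc.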
Some Questions/Answers
Why doesn't my dataset appear?
- You didn't reset the server
  - The server must be reset before the tool becomes aware of the new dataset
- You did not include all the required fields
  - Fields are delimited by tabs
- The file you specified isn't accessible to the Galaxy server
  - Check permissions
- The file you specified doesn't exist
  - Check your spelling
- You used an ID (field 4) which matches another dataset
  - Or someone reused your ID
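Most of the causes listed above can be checked mechanically before resetting the server. The following is a rough sketch under stated assumptions: find_problems is a hypothetical function, not a Galaxy utility, and the readability check reflects the permissions of the user running the script, which may differ from those of the account the Galaxy server runs under.

```python
import os
from collections import defaultdict


def find_problems(loc_path):
    """Return a list of human-readable problems found in a .loc file."""
    problems = []
    seen = defaultdict(list)  # dataset ID -> line numbers where it is used
    with open(loc_path) as handle:
        for number, raw in enumerate(handle, start=1):
            line = raw.rstrip("\r\n")
            # Comment lines are ignored; skipping blank lines is an assumption.
            if not line.strip() or line.startswith("#"):
                continue
            fields = line.split("\t")
            if len(fields) < 5:
                problems.append(
                    f"line {number}: {len(fields)} tab-separated field(s), 5 required"
                )
                continue
            dataset_id, data_path = fields[3], fields[4]
            seen[dataset_id].append(number)
            if dataset_id == "None":
                problems.append(f"line {number}: the keyword None is used as the ID")
            if not os.path.isfile(data_path):
                problems.append(f"line {number}: file does not exist: {data_path}")
            elif not os.access(data_path, os.R_OK):
                # Checked for the current user only, not the Galaxy server account.
                problems.append(f"line {number}: file is not readable: {data_path}")
    for dataset_id, numbers in seen.items():
        if len(numbers) > 1:
            problems.append(f"ID {dataset_id!r} is used on lines {numbers}")
    return problems
```

An empty list means none of these checks failed; it does not cover the first cause, a server that has not been reset.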
This revision is from 2010-03-18 21:20
