This repository supports the eADAGE paper. It contains all the code, data, and metadata needed to repeat the analyses in the paper.
What is eADAGE?
eADAGE (ensemble ADAGE) is an enhanced version of ADAGE. It consolidates 100 individual ADAGE models into one ensemble model. eADAGE automatically extracts biologically meaningful features from large-scale transcriptomic data.
How do I repeat analyses in the eADAGE paper?
The repository is divided into six sections, to be run in this order: data_collection, netsize_evaluation, ensemble_construction, PCA_ICA, node_interpretation, and medium_analysis. Each section provides a shell script that guides the analysis workflow.
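To see what there is to run before starting, the section order above can be turned into a small lookup. This is a minimal sketch, not part of the repository: it assumes each section is a directory of the same name at the repository root and that its workflow script ends in `.sh`. The helper name `find_section_scripts` is hypothetical.

```python
import glob
import os

# Section order as listed in this README.
SECTIONS = [
    "data_collection",
    "netsize_evaluation",
    "ensemble_construction",
    "PCA_ICA",
    "node_interpretation",
    "medium_analysis",
]


def find_section_scripts(repo_root):
    """Return (section, [shell scripts]) pairs in the README's section order.

    Assumption: every section is a directory directly under repo_root and
    its workflow script has a .sh extension.
    """
    found = []
    for section in SECTIONS:
        pattern = os.path.join(repo_root, section, "*.sh")
        found.append((section, sorted(glob.glob(pattern))))
    return found


if __name__ == "__main__":
    # Run from the repository root to list each section's script(s).
    for section, scripts in find_section_scripts("."):
        print("%s: %s" % (section, ", ".join(scripts) or "no .sh file found"))
```

The script only lists files; it does not run anything, so each shell script can still be read and launched by hand in the order shown.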
- Python: tested on versions 2.7.6 and 2.7.9
- Python packages are installed automatically by the shell script
- required packages: theano, docopt, requests, statsmodels, numpy
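If the automatic installation fails, it can help to check which of the required packages the current interpreter can actually import. The following is a minimal sketch under that assumption; the helper name `missing_packages` is hypothetical and not part of the repository. It uses only `importlib.import_module`, so it runs under Python 2.7 as well as Python 3.

```python
import importlib

# Package names as listed in this README.
REQUIRED_PACKAGES = ["theano", "docopt", "requests", "statsmodels", "numpy"]


def missing_packages(names):
    """Return the names that cannot be imported in the current interpreter."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Package is not installed (or not visible to this interpreter).
            missing.append(name)
    return missing
```

Calling `missing_packages(REQUIRED_PACKAGES)` returns an empty list when everything is importable. Note that the check only covers the interpreter it runs in, so it should be run with the same Python that the shell scripts use.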
- R: tested on versions 3.2.1 and 3.2.3
- R libraries are managed by the pacman library; please install it first.
- pacman installs and loads the following libraries when needed: affy, affyio, TDM, doParallel, readr, ggplot2, gplots, sprint, ff, cluster, plyr, dendextend, gdata, limma
How should I build a new eADAGE model?
By modifying the platform and organism parameters in data_collection/data_collection.sh, you can build an expression compendium for a new organism (this works if the organism's expression data come from one major array platform).
Then follow the instructions in eADAGE_construction.sh to build an eADAGE model for it.
Jie Tan (firstname.lastname@example.org)