coexpp -- Large-scale Weighted Gene Coexpression Network Analysis
coexpp provides a focused coexpression network analysis workflow optimized
for very large numbers of genes. Particular attention has been paid to
minimizing overall memory footprint.
coexpp has an O(n^2) memory footprint
with a constant factor very close to 1, and as such typically consumes one
third of the memory of other WGCNA implementations.
coexpp wraps around the WGCNA
package, replacing key memory and performance intensive operations with C++
implementations (using RcppEigen and
coexpp maintains large matrices (those larger than R's maximum
matrix size of ~46,000 x 46,000) entirely on the C++ side where they are not
subject to R's size limits and copy-by-value semantics.
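The arithmetic behind these figures can be sketched in a few lines of R. This is a rough illustration only: it assumes 8-byte double-precision elements and assumes that the ~46,000 x 46,000 limit corresponds to a cap of 2^31 - 1 elements per matrix. Neither assumption is stated above, and `denseMatrixGiB` is a hypothetical helper, not part of coexpp.

```r
# Memory needed for one dense n x n matrix, in GiB.
# Assumption: each element is an 8-byte double.
denseMatrixGiB <- function(n, bytesPerElement = 8) {
  n^2 * bytesPerElement / 2^30
}

# Largest square matrix that fits within 2^31 - 1 elements
# (assumption: this cap is what produces the ~46,000 figure).
maxSquareDim <- floor(sqrt(.Machine$integer.max))
print(maxSquareDim)

# One matrix at that size needs roughly 16 GiB; doubling the number of
# genes quadruples the requirement, which is what O(n^2) means in practice.
print(round(denseMatrixGiB(maxSquareDim), 1))
print(round(denseMatrixGiB(2 * maxSquareDim), 1))
```

With a constant factor near 1, the total footprint stays close to the size of a single such matrix, whereas an implementation that holds several copies at once needs a multiple of it.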
coexpp is not a complete re-implementation of WGCNA. Instead, it is an
optimization of the specific workflow in use at the sponsoring research
coexpp was seeded from the SageBionetworksCoex package, and
uses code developed at Sage Bionetworks by Bruce Hoff and others.
coexpp is already installed on Minerva. Simply
module load R and then
library(coexpp) within R.
Mac OS X
The following procedure is designed for R version ≥3.2.2, Mac OS ≥10.10.5.
You need to install WGCNA-R and its dependencies first.
To enable multithreading, you need a compiler that supports OpenMP, like gcc 4.9 without multilib:
brew reinstall gcc --without-multilib; take a coffee break.
Put the following in
CC = gcc-4.9
CXX = g++-4.9
PKG_CXXFLAGS += -fopenmp
PKG_LIBS += -fopenmp
SHLIB_OPENMP_CXXFLAGS = -fopenmp
Install Rcpp and RcppEigen from CRAN source packages with
install.packages(c("Rcpp", "RcppEigen"), type = "source"). You need to install from source because the binaries on CRAN don't have OpenMP enabled.
flashClust from CRAN (the binary package is OK) with
Rclusterpp 0.2.3 (as of 2015-11-11), but you need ≥0.2.4, otherwise
coexpp will have linking problems. Therefore, you need to install it straight from GitHub, for instance:
> install.packages("devtools")
> devtools::install_github("nolanlab/Rclusterpp")
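Before building coexpp, it can help to confirm that the installed Rclusterpp meets the ≥0.2.4 requirement. The following is a minimal sketch of such a check. `needsUpgrade` is a hypothetical helper introduced here for illustration, and the messages are only suggestions.

```r
# TRUE when the installed version is older than the required one.
# package_version() compares version strings component by component.
needsUpgrade <- function(installed, required = "0.2.4") {
  package_version(installed) < package_version(required)
}

if (requireNamespace("Rclusterpp", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("Rclusterpp"))
  if (needsUpgrade(installed)) {
    message("Rclusterpp ", installed, " is too old; install >= 0.2.4 from GitHub")
  } else {
    message("Rclusterpp ", installed, " is recent enough")
  }
} else {
  message("Rclusterpp is not installed")
}
```

Comparing through `package_version()` rather than as plain strings avoids mistakes such as treating "0.2.10" as older than "0.2.4".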
Clone this repository, compress it as a
tar zcfv, and then within R:
In order for this package to make sense, you need to read the WGCNA tutorials first, as this package optimizes functionality from that package. The following is the most typical way to get started:
coexpp on top of vanilla WGCNA-R:
coexppSetThreads(NULL) will enable multithreading, with as many threads as available cores.
Get gene expression data into a matrix (let's call it
geneExpr) with samples as rows and probes as columns.
Run results <- coexpressionAnalysis(geneExpr) to kick off the standard workflow.
results will be a list with a
?CoexppClusters to read about its contents), the gene module color assignments in
$geneModules, and a clustering of the modules in
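The steps above can be sketched as follows. This is only an illustration of the expected matrix orientation and call order. The random matrix and the `runCoexpp` wrapper are hypothetical; only `coexppSetThreads`, `coexpressionAnalysis`, and `$geneModules` come from the description above. Random noise is not expected to produce meaningful modules, so the final call is left commented out for use with real data.

```r
# Hypothetical input: expression values with probes in rows and samples in
# columns, which is how many expression tables are distributed.
set.seed(1)
exprByProbe <- matrix(rnorm(200 * 12), nrow = 200, ncol = 12,
                      dimnames = list(paste0("probe", 1:200),
                                      paste0("sample", 1:12)))

# The workflow expects samples as rows and probes as columns,
# so transpose before handing the matrix over.
geneExpr <- t(exprByProbe)

# Hypothetical convenience wrapper around the calls described above.
runCoexpp <- function(geneExpr) {
  stopifnot(is.matrix(geneExpr), is.numeric(geneExpr))
  library(coexpp)
  coexppSetThreads(NULL)           # as many threads as available cores
  coexpressionAnalysis(geneExpr)   # standard workflow
}

# With coexpp installed and real data in geneExpr:
#   results <- runCoexpp(geneExpr)
#   table(results$geneModules)     # number of genes per module color
```

Checking `dim(geneExpr)` before the call is a cheap way to catch a transposed matrix, since the number of rows should equal the number of samples.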
Extending/modifying/contributing to coexpp
A few notes if you are going to be modifying
coexpp uses roxygen2 to generate documentation. Do not modify the
*.Rd files directly. Instead update the documentation at the function implementation and regenerate the documentation.
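As a rough illustration of that workflow, the documentation lives in `#'` comments directly above the function. The function below is a hypothetical example and not part of coexpp. It only shows where the roxygen2 tags go.

```r
#' Count genes per module (hypothetical helper, not part of coexpp)
#'
#' @param moduleColors Character vector with one module color per gene.
#' @return A named integer vector of module sizes, largest first.
#' @export
moduleSizes <- function(moduleColors) {
  sizes <- sort(table(moduleColors), decreasing = TRUE)
  setNames(as.integer(sizes), names(sizes))
}

# After editing comments like the ones above, regenerate the *.Rd files
# from the package root, for example with:
#   roxygen2::roxygenise()
# or, if devtools is installed:
#   devtools::document()
```

Either call rewrites the generated `*.Rd` files from the comments, which is why manual edits to those files would be lost.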