We show that a systems biology approach, which utilizes gene expression, transcription factor binding, genomic, epigenetic and gene ontology data, can be improved by accounting for the sign of co-expression relationships. We also show that signed WGCNA has advantages over standard differential expression methods. Specifically, signed WGCNA yields more consistent gene rankings between data sets (see Additional File 8), is better able to identify functionally enriched groups of genes (Figure ), and, through its focus on module eigengenes, circumvents the multiple testing problems that plague standard gene-based expression analysis. Below, we highlight several novel stem cell related genes that would not have been found using a standard differential expression analysis.
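The role of the sign can be illustrated with a minimal sketch. It assumes the commonly used WGCNA adjacency definitions: the unsigned adjacency is the absolute correlation raised to a power beta, and the signed adjacency is (1 + correlation)/2 raised to beta. The toy expression matrix, the gene ordering, and the powers 6 and 12 are illustrative assumptions, not values taken from this study.

```python
import numpy as np

def adjacency(expr, beta, signed):
    """Co-expression adjacency for a samples x genes matrix.

    Unsigned: |cor|^beta, so strong negative correlations count as connections.
    Signed:   ((1 + cor) / 2)^beta, so negative correlations map close to zero.
    """
    cor = np.corrcoef(expr, rowvar=False)
    if signed:
        return ((1.0 + cor) / 2.0) ** beta
    return np.abs(cor) ** beta

# Hypothetical toy data: gene 0 is the reference, gene 1 is anti-correlated
# with it, and gene 2 is positively correlated with it.
base = np.linspace(-1.0, 1.0, 10)
wobble = 0.05 * np.cos(3.0 * np.arange(10))
expr = np.column_stack([base, -base + wobble, base - wobble])

unsigned = adjacency(expr, beta=6, signed=False)
signed = adjacency(expr, beta=12, signed=True)

# Genes 0 and 1 are strongly connected in the unsigned network but
# essentially unconnected in the signed network.
print(unsigned[0, 1], signed[0, 1], signed[0, 2])
```

In the unsigned network the anti-correlated pair would tend to fall into the same module, whereas the signed network keeps genes with opposite expression patterns apart.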
Signed WGCNA provides novel insight into murine ES cell biology, which unsigned WGCNA is unable to provide. Applying these signed methods to previously published data, we identified pluripotency and differentiation gene modules not found in unsigned networks or differential analysis. The results of signed WGCNA are robust: it identifies similar modules in independently published data sets. We show that module eigengene based connectivity kME is valuable for annotating genes with regard to module membership and for identifying genes related to pluripotency and differentiation. As a resource, we provide a module membership annotation for each gene with regard to the signed modules (Additional Files 11
Many current studies focus on the role transcriptional regulators play in ES cell maintenance. As expected, the pluripotency module is enriched with genes active in transcriptional regulation, e.g. Oct4, Sox2, Klf2, Nanog, Jarid1b, Jarid2, Nodal, Tgif1, and Esrrb, and contains other genes expected to play a role in ES cell function, such as Dppa4 and Dppa5. The module also contains genes that have recently been shown to be necessary for maintaining the pluripotent state, Nup133 and Utf1 [45
Interestingly, the pluripotency module contains genes with roles in two other pathways, DNA repair and mitochondrial function, which are not found by standard differential analysis. The enrichment for genes that respond to DNA damage is not surprising given that ES cells spend a larger portion of their cell cycle in S phase and have a shorter G1 phase than differentiated cells [73]. An emphasis on accurate DNA replication is expected since it helps ES cells maintain a stable genome and prevents errors from being inherited by differentiated cells. Mitochondria in ES cells may assist in the prevention of DNA damage [71]. During aerobic production of adenosine triphosphate (ATP), mitochondria leak superoxides, leading to the creation of reactive oxygen species (ROS), which damage DNA. ES cells, however, produce ATP anaerobically and thus minimize the amount of DNA-damaging ROS [69]. ES cells also have fewer mitochondria than differentiated cells, and their mitochondria are smaller, have fewer cristae, lack dense matrices, and are perinuclearly located [69]. Our use of signed WGCNA reveals that, in addition to genes involved in transcriptional regulation, genes that prevent or repair DNA damage are key to maintaining pluripotency and self-renewal.
Figure reports significant relationships between module membership, chromatin structure and epigenetic modifications (histone modifications and DNA methylation), which are known to play a role in controlling gene expression during ES cell self-renewal and differentiation. While the relationships are highly significant, we find that epigenetic variables and binding data explain only 8.3% of the variation in module membership and 4.3% of the variation of (Table ). In Additional File 5, we provide gene annotations with regard to module membership, transcription factor binding, histone trimethylation status, CpG DNA methylation etc.
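For readers unfamiliar with the "proportion of variation explained" statistic, the sketch below computes it as the R² of an ordinary least squares fit of a module membership measure on a set of gene-level predictors. The simulated predictors, effect size, and noise level are hypothetical and do not reproduce the data or the exact regression model of this study.

```python
import numpy as np

def variance_explained(predictors, response):
    """R^2 of an ordinary least squares fit with an intercept."""
    design = np.column_stack([np.ones(len(response)), predictors])
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    residual = response - design @ coef
    total = response - response.mean()
    return 1.0 - (residual @ residual) / (total @ total)

# Hypothetical data: three binary gene-level marks (for example, presence of
# a histone modification or a bound transcription factor) and a simulated
# module membership value that depends only weakly on the first mark.
rng = np.random.default_rng(0)
n_genes = 500
marks = rng.integers(0, 2, size=(n_genes, 3)).astype(float)
kme = 0.1 * marks[:, 0] + rng.normal(scale=0.3, size=n_genes)

r2 = variance_explained(marks, kme)
print(f"proportion of variation explained: {r2:.3f}")
```

With many genes, a predictor can be highly significant while still explaining only a small fraction of the variation, which is the situation described above.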
Using module eigengene based connectivity kME, we find that many known differentiation related genes are highly connected in the differentiation (blue) module, including Cited2, Gata4, and Gata6, along with Ctsl, which has recently been shown to be active in differentiation [65]. We also find that Uqcrh, a gene involved in the electron transport chain, is highly connected in this module, lending support to the argument that ES cell mitochondria differ from those in differentiated cells. Module eigengene based connectivity enabled us to identify novel candidate genes in the differentiation module, like Uqcrh, that warrant experimental validation (Figure ). For the pluripotency module, interesting candidate genes are Msh6, Ppif, Sh3gl2, Rbpj, Elk1, Nrf1, Nup133, Mrpl15, and Zfp39 (Figures and ). These genes lack significant fold change but are highly connected and thus would not be found using standard differential analysis. Using sequence data with motif analysis, we computationally confirm the importance of two genes, Nr5a2 and Elk1.
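The selection logic described above (high module eigengene based connectivity despite a small fold change) can be sketched as follows. The sketch assumes the usual definitions: the module eigengene is the first principal component of the standardized module expression profiles, and kME is the Pearson correlation of a gene's profile with that eigengene. The gene labels, expression values, and both cutoffs (kME above 0.8, absolute log2 fold change below 1) are hypothetical.

```python
import numpy as np

def module_eigengene(expr_module):
    """First principal component (across samples) of standardized module genes."""
    z = (expr_module - expr_module.mean(axis=0)) / expr_module.std(axis=0, ddof=1)
    u, _, _ = np.linalg.svd(z, full_matrices=False)
    eigengene = u[:, 0]
    # Orient the eigengene so that it correlates positively with mean expression.
    if np.corrcoef(eigengene, z.mean(axis=1))[0, 1] < 0:
        eigengene = -eigengene
    return eigengene

def kme(expr, eigengene):
    """Pearson correlation of every gene (column) with the module eigengene."""
    centered = expr - expr.mean(axis=0)
    me = eigengene - eigengene.mean()
    return (centered.T @ me) / (np.linalg.norm(centered, axis=0) * np.linalg.norm(me))

# Hypothetical log2 expression: 4 samples of one cell state, then 4 of another.
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])
signal = group + np.array([0.1, -0.1, 0.2, -0.2, 0.1, -0.1, 0.2, -0.2])
names = ["strong_1", "strong_2", "subtle", "unrelated"]
expr = np.column_stack([
    10 + 3.0 * signal,
    9 + 2.5 * signal + 0.05 * np.array([1, -1, -1, 1, 1, -1, -1, 1]),
    8 + 0.3 * signal + 0.01 * np.array([1, -1, 1, -1, -1, 1, -1, 1]),
    9 + 0.5 * np.array([1, -1, -1, 1, 1, -1, -1, 1]),
])

# The first three genes are treated as the module; kME is computed for all genes.
eigengene = module_eigengene(expr[:, :3])
connectivity = kme(expr, eigengene)
log2_fc = expr[group == 1].mean(axis=0) - expr[group == 0].mean(axis=0)

# Candidates: tightly co-expressed with the module but with a small fold change.
candidates = [n for n, k, fc in zip(names, connectivity, log2_fc)
              if k > 0.8 and abs(fc) < 1.0]
print(candidates)
```

A gene such as the hypothetical "subtle" one would be missed by a fold change cutoff yet is retained here because its profile follows the module eigengene closely.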
We use gene ontology information and literature results to provide strong statistical evidence that these candidate genes are very promising and justify further biological study. Our article provides a resource in the form of module based gene annotation tables that could form the starting point of future biological validation studies. Depending on their function, these candidate genes can be tested by RNAi knockdown, by viral infection in order to increase the efficiency of reprogramming, or, if they bind DNA, by analyzing their binding sites. Our article demonstrates that signed WGCNA not only identifies many well-known ES cell regulators but also yields novel insights regarding ES cell function.