In this study, we have described a general approach for generating gene expression signatures that can be used to infer patterns of miRNA expression. We validated this approach in both a training set and an independent test set of gastric cancer samples. The member genes in a miRNA gene expression signature, while comprising genes transcriptionally altered as a consequence of miRNA activity, are not necessarily direct miRNA target genes. This feature distinguishes our study from previous studies that used sequence-predicted miRNA target genes to annotate miRNA functions. To our knowledge, our study is the first to demonstrate the ability of gene expression signatures to act as surrogates of miRNA activity. While the current analysis is limited to 276 miRNA signatures passing various quality and significance thresholds, applying this strategy to larger and more generalized training sets is likely to identify more miRNA signatures. The current work should thus be regarded as a proof-of-concept of the feasibility of using gene expression signatures as surrogates of miRNA expression.
Using gene expression signatures to predict miRNA activity may address two major limitations currently facing miRNA–pathway discovery efforts: cost and scalability. Currently, most available experimental platforms (e.g., microarrays, deep sequencing) require separate analytical assays to generate miRNA and mRNA information for a single sample (e.g., different microarrays, or different RNA isolation techniques), increasing cost, time, and effort. Using gene expression signatures, it may be possible to analyze both miRNA and pathway activity patterns on a single common platform of gene expression. Moreover, because only gene expression information is required once the miRNA signature is known, any sample cohort for which gene expression (mRNA) data are available can be analyzed, without the requirement for companion miRNA data. This strategy thus opens up the thousands of publicly available microarray data sets for the discovery of new miRNA–pathway connections. Notably, we found that many of the miRNA signatures could recapitulate patterns of actual miRNA expression in a variety of different tumor types. This is perhaps not surprising, as it is conceptually similar to studies where gene expression signatures linked to pathways or drugs have been shown to exhibit broad applicability even in tissues distinct from those where the original signatures were derived (e.g.,
[14],
[37]). However, we emphasize that our study does not rule out the possibility that miRNAs may exert tissue-specific effects.
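To make the idea of applying a known signature to an mRNA-only cohort concrete, the following is a minimal sketch. It is not the scoring procedure used in this study, which is not restated in this section. It assumes a hypothetical signature made of genes that rise ("up") and genes that fall ("down") with miRNA activity, and scores each sample as the mean standardized expression of the up genes minus that of the down genes. All gene names, values, and function names are illustrative.

```python
from statistics import mean, pstdev

def zscore_rows(expr):
    """Standardize each gene across samples; constant genes become all zeros."""
    out = {}
    for gene, values in expr.items():
        mu, sd = mean(values), pstdev(values)
        out[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return out

def signature_scores(expr, up_genes, down_genes):
    """Per-sample score: mean z of 'up' genes minus mean z of 'down' genes.

    Assumed scoring rule for illustration only. Signature genes that are
    missing from the expression table are ignored.
    """
    z = zscore_rows(expr)
    n_samples = len(next(iter(expr.values())))
    up = [z[g] for g in up_genes if g in z]
    down = [z[g] for g in down_genes if g in z]
    scores = []
    for j in range(n_samples):
        up_mean = mean(row[j] for row in up) if up else 0.0
        down_mean = mean(row[j] for row in down) if down else 0.0
        scores.append(up_mean - down_mean)
    return scores

# Toy mRNA table: gene -> expression across four samples (made-up values).
expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 2.5, 3.5, 5.0],
    "GENE_C": [9.0, 7.0, 4.0, 1.0],
    "GENE_D": [5.0, 5.0, 5.0, 5.0],  # not in the signature, so it is ignored
}
scores = signature_scores(
    expr,
    up_genes=["GENE_A", "GENE_B"],
    down_genes=["GENE_C", "GENE_MISSING"],
)
for sample_index, score in enumerate(scores):
    print(sample_index, round(score, 3))
```

In this toy table the up genes increase and the down gene decreases from the first sample to the last, so the inferred activity score rises across samples. No companion miRNA measurement is needed to compute it, which is the practical point made above.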
One immediately useful application of miRNA gene expression signatures lies in identifying novel miRNAs linked to canonical signalling pathways. Using the miRNA–pathway network constructed in this study, we confirmed a host of previously reported miRNA–pathway interactions, and identified four miRNAs as new candidate Wnt modulators (
hsa-miR-205,
hsa-miR-221,
hsa-miR-517c,
hsa-miR-519a). Experimental evidence that these miRNAs are indeed Wnt regulators was also provided by cell line transfections, reporter assays, and gene expression profiling. Our study thus provides a large resource of potential pathway-modulating miRNAs, spanning a variety of pathways, that can be further tested by researchers. The information provided by this study is unlikely to be duplicated by other studies attempting to relate specific miRNAs to pathways and processes, as these previous studies have primarily relied on sequence-based miRNA target predictions, which have high false positive rates
[10] and generally lack tissue context: sequence matches between a miRNA and a collection of mRNAs do not guarantee that the miRNA is indeed coexpressed with those mRNAs in the same cell type or tissue.
Besides miRNA interactions with individual pathways, our work reveals that co-expressed miRNAs are likely to exhibit a high degree of functional redundancy in targeting similar sets of downstream genes, and that signalling pathways may frequently cotarget multiple independent miRNAs with similar downstream effects. This observation extends previous studies
[7] reporting the widespread existence of multiple pairs of miRNAs that target common gene sets. Our observation that pathways frequently cotarget multiple miRNAs provides further evidence that miRNAs rarely act singly and almost always act in combination to modulate cellular behaviour. The role of miRNAs as broad modulators may also explain the selection pressure for functional redundancies
[30].
The functional role of miRNAs as broad modulators of cellular activities, rather than activators or repressors of specific genes, also explains the high modularity and non-scale-free attributes of the miRNA–pathway network, revealed by global topological analysis
[1],
[2]. The non-scale-free nature of the network is also likely explained by the membership of the network itself. In contrast to typical scale-free genetic networks such as gene regulatory networks (GRNs), which comprise a few master regulators and downstream effectors, the miRNA–pathway network is composed entirely of regulators (miRNAs and pathways). An implication of the small-world but non-scale-free architecture is greater resilience to targeted “hits” than is seen in scale-free networks. Scale-free networks are resilient to randomly placed damage or failure, but susceptible to targeted attacks on the hubs (the few highly connected nodes), since such hits would remove a disproportionate number of the links in the network
[38]. The oncogenic miRNA–pathway network, with its small-world but non-scale-free architecture, does not rely on a few highly connected hubs, but spreads its “risk” across the many interconnected nodes (especially miRNAs). The implications of this finding for attempts to perturb cell function using miRNAs deserve further study.
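The contrast between hub-dependent and hub-free architectures can be illustrated with a small simulation. The sketch below is only a toy under stated assumptions and does not use the miRNA–pathway network from this study. It grows a hub-forming graph by preferential attachment (a standard stand-in for scale-free topology) and compares it with a ring lattice in which every node has the same degree (a stand-in for a network without hubs, not a model of small-world structure). It then measures the fraction of links lost when the ten most connected nodes are removed, and when ten randomly chosen nodes are removed. Graph sizes, seeds, and function names are arbitrary choices.

```python
import random

def preferential_attachment_graph(n, m, seed=0):
    """Each new node links to m existing nodes chosen proportionally to degree."""
    rng = random.Random(seed)
    edges = set()
    for i in range(m + 1):              # small fully connected starting core
        for j in range(i + 1, m + 1):
            edges.add((i, j))
    stubs = [v for e in edges for v in e]   # node listed once per incident edge
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.add((t, new))
            stubs.extend((t, new))
    return edges

def ring_lattice(n, k):
    """Each node links to its k nearest neighbours on each side (degree 2k)."""
    edges = set()
    for i in range(n):
        for d in range(1, k + 1):
            j = (i + d) % n
            edges.add((min(i, j), max(i, j)))
    return edges

def degrees(edges):
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

def top_degree_nodes(edges, count):
    """Targeted hit list: the most connected nodes, ties broken by node id."""
    deg = degrees(edges)
    return sorted(deg, key=lambda v: (-deg[v], v))[:count]

def random_nodes(edges, count, seed=1):
    """Random failure list drawn uniformly from all nodes."""
    return random.Random(seed).sample(sorted(degrees(edges)), count)

def fraction_edges_lost(edges, removed):
    removed = set(removed)
    lost = sum(1 for u, v in edges if u in removed or v in removed)
    return lost / len(edges)

n, hits = 200, 10
graphs = {
    "hub_forming": preferential_attachment_graph(n, m=2),
    "uniform_degree": ring_lattice(n, k=2),
}
results = {}
for name, edges in graphs.items():
    results[name] = {
        "targeted": fraction_edges_lost(edges, top_degree_nodes(edges, hits)),
        "random": fraction_edges_lost(edges, random_nodes(edges, hits)),
    }
    print(name, {k: round(v, 3) for k, v in results[name].items()})
```

In the hub-forming graph, removing the most connected nodes deletes a much larger share of links than removing random nodes. In the uniform-degree graph there are no hubs to target, so a targeted hit list removes no more than a small, bounded share of links. This mirrors the qualitative argument above and makes no claim about the magnitude of the effect in the actual network.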
In conclusion, our finding that gene expression signatures can capture miRNA activity is in general agreement with proposals that many cellular perturbations (e.g., responses to extracellular ligands, disease states, gene mutations) are likely to cause transcriptomic changes, and that these perturbations can be captured using gene expression signatures. Because functionally significant perturbations are certainly not limited to miRNAs and pathways alone, but can also include other genetic factors (SNPs, copy number variations, and mutation status) and epigenetic factors (e.g., DNA methylation and histone modification), there is in principle no reason why similar strategies could not be used to represent these other factors as well. Using gene expression signatures as a “common currency”, it may thus be possible to integrate multiple types of cellular perturbations into a common network, as we have done for miRNAs and pathways in this study. This may prove a powerful approach to identify functionally relevant relationships across a host of molecular levels that ultimately constitute the disease regulatory landscape.