|Home | About | Journals | Submit | Contact Us | Français|
By analyzing gene expression data in gliobastoma in combination with matched microRNA profiles, we have uncovered a post-transcriptional regulation layer of surprising magnitude, comprising over 248,000 microRNA (miR)-mediated interactions. These include ~7,000 genes whose transcripts act as miR ‘sponges’ and 148 genes that act through alternative, non-sponge interactions. Biochemical analyses in cell lines confirmed that this network regulates established drivers of tumor initiation and subtype, including PTEN, PDGFRA, RB1, VEGFA, STAT3, and RUNX1, suggesting that these interactions mediate crosstalk between canonical oncogenic pathways. RNA silencing of 13 microRNA-mediated PTEN regulators, whose locus deletions are predictive of PTEN expression variability, was sufficient to downregulate PTEN in a 3′ UTR-dependent manner and to increase tumor-cell growth rates. Thus, this miR-mediated network provides a mechanistic, experimentally validated rationale for the loss of PTEN expression in a large number of glioma samples with an intact PTEN locus.
Dysregulation of physiologic microRNA (miR) activity has been shown to play an important role in tumor initiation and progression, including gliomagenesis (Gabriely et al., 2011; Godlewski et al., 2008; Kim et al., 2010a; Kim et al., 2011; Kwak et al., 2011). Therefore, molecular species that can regulate miR activity on their target RNAs, without affecting the expression of relevant mature miRs, may play equally relevant roles in cancer. Yet, few such modulators of miR-activity have been characterized (Krol et al., 2010; Poliseno et al., 2010), and both the extent and relevance of their role in controlling normal cell physiology and pathogenesis are poorly understood.
By analyzing a large set of sample-matched gene and miR expression profiles from The Cancer Genome Atlas (TCGA), we show here that the regulation of target genes by modulators of miR activity is surprisingly extensive in human glioma and that it affects genes with an established role in gliomagenesis and tumor subtype implementation. Specifically, we study two types of miR activity modulators with distinct molecular mechanisms (Figures 1A and 1B). Sponge modulators include both messenger RNAs (mRNAs) and noncoding RNAs, which share miR-binding sites with other RNAs targeted by the miR. Thus, these modulators act as miR sponges or competitive endogenous RNA (ceRNA) via an established titration mechanism (Arvey et al., 2010; Ebert et al., 2007; Poliseno et al., 2010). Depending on their expression levels and on the total number of functional miR binding sites they share with a target, sponge modulators can decrease the number of free miR molecules available to repress other functional targets. Non-sponge modulators, on the other hand, are implemented by proteins and RNAs acting via a variety of alternative mechanisms, including activation or suppression of miRISC-mediated regulation of target RNAs (Krol et al., 2010), protection from miR degradation (Chatterjee et al., 2011), or prevention of miRs from binding their targets (Eiring et al., 2010). As a result, they do not necessarily share miR-binding sites with their modulated targets. Established sponge-modulators include VCAN (Lee et al., 2010), PTENP1 (Poliseno et al., 2010) and CD44 (Jeyapalan et al., 2011), while non-sponge modulators include miRISC core components, such as the members of the AGO and TNRC6 families (Krol et al., 2010). Notably, genetic alterations at the PTENP1, AGO2 and TNRC6A loci have all been implicated in tumorigenesis (Kim et al., 2010b; Poliseno et al., 2010; Zhou et al., 2010).
To evaluate both the range and potential tumorigenic role of this class of miR-mediated interactions, we present a new multivariate analysis method, called Hermes. Hermes systematically infers candidate modulators of miR activity from large collections of genome-wide expression profiles of both genes and miRs from the same tumor samples. Hermes extends the functionality of MINDy (modulator inference by network dynamics) algorithm, which uses measurements from information theory to identify genes that modulate transcription factor activity via posttranslational modifications. MINDy has been used to infer post-translational modulators of the MYC transcription factor in human B cells (Wang et al., 2009b), to infer signaling modulators of all transcription factors in human B cells (Wang et al., 2009a), and to identify the ubiquitin conjugating ligase HUWE1 as a modulator of N-MYC turnover in neural stem cells (Zhao et al., 2009).
In essence, MINDy and Hermes make inferences by estimating two quantities from information theory: the Mutual Information (MI) and Conditional Mutual Information (CMI). The MI quantifies how much one variable informs about another variable (i.e., high MI between two variables implies that knowledge about the first variable is predictive of state of the second variable). The CMI calculates the expected value of MI of two variables given the third variable. Specifically, given a modulator (M), a regulator (R), and a regulated target (T), the algorithms dissect the regulatory dependency of these three components by studying the difference between the Conditional Mutual Information (CMI) of the regulator’s expression level and the target’s expression level, (conditional on the expression level of the modulator) and the Mutual Information (MI) of the regulator and target expressions, ΔI = I[R; T|M] − I[R; T] (Wang et al., 2006). These quantities and their associated statistical significance can be computed from large collections of gene expression profiles (>250 samples), using a variety of estimators for MI and CMI (Wang et al., 2006), i.e. computational tools that can quantitatively estimate their values.
Hermes expands the MINDy information theoretic framework to identify candidate genes that modulate miR activity (i.e., modulators), whose availability M affects the relationship between the expression of miRs targeting a gene T and its expression profile, T. We use the term miR program to indicate a set of miRs targeting a gene and the term common miR program to indicate the intersection between the miR programs of two distinct genes. Analysis of Hermes-inferred sponge and non-sponge interactions in TCGA glioblastoma data revealed a regulatory network of previously unsuspected size. Experimental validation of 29 such interactions (26 sponge and 3 non-sponge), of which only 3 failed to validate, suggested that Hermes has a low false positive rate and showed that mPR interactions participate collectively in regulation of key drivers of gliomagenesis and tumor subtype, that these interactions mediate cross-talk between independent pathways, and that they affect cell pathophysiology.
While MINDy considers one candidate modulator/regulator/target triplet at a time, Hermes integrates the analysis across all miRs in the common miR program of two genes (sponge interactions) or in the miR program of a target gene (non-sponge interactions), using Fisher’s method (Fisher, 1925). Specific technical details of the analysis are provided in Experimental Procedures. The cartoon example of Figure 1C illustrates the type of interaction that Hermes can help dissect. Here, the increase in expression of the modulator gene is associated with a corresponding increase in mutual information between the expression of several miRs and the expression of their common target.
In principle, one could evaluate all possible modulator/miR/target triplets and then select statistically significant ones that share the same modulator and target via different miRs. While this would avoid having to select relevant miR programs a priori, it would also entail evaluating a huge number of triplets (~4.0E+11), which is computationally prohibitive and will effectively prevent the discovery of many interactions due to excessive multiple hypothesis testing correction. Similar to MINDy, which addresses this problem by testing only triplets for experimentally validated or computationally inferred transcription factor-target interactions, we use a new miR-target discovery algorithm, Cupid, that is specifically tailored to the identification of miR programs to reduce the number of statistical tests performed by Hermes, see Experimental Procedures. Specifically, for sponge modulators, Hermes considers only modulator-target pairs, (M, T) sharing a statistically significant number of miRs in their Cupid-inferred common miR programs. In addition, Hermes assumes that sponge interactions are symmetric and thus jointly evaluates the statistical significance of both M as a miR-program mediated modulator of T and T of as a miR-program mediated regulator of M, by combining p-values using Fisher’s method (Fisher, 1925). Indeed, even though miR-binding and regulatory kinetics may differ in the two targets, most sponge-mediated interactions should still exhibit symmetric behavior (Figure 2A). This is because, when averaged over the multiple miRs in their common miR program, the differences in the number of individual miR binding sites and their regulatory kinetics should average out. As shown in Experimental Procedures, symmetry analysis confirmed that only a very small fraction of candidate sponge interactions with strictly asymmetric supporting evidence is missed by Hermes (< 0.02%).
Conversely, non-sponge regulatory interactions are asymmetric by definition, as the modulator acts via alternative protein-protein or protein-RNA interaction mechanisms. Thus they can be identified by selecting candidate (M, T) pairs with an empty or non-statistically significant common miR program. Furthermore, since non-sponge interactions may be mediated by a single miR, rather than by a substantial miR program, use of a miR prediction algorithm may introduce too many false positive interactions. To reduce false positive predictions for non-sponge interactions, we restricted Hermes analysis to (M, miR, T) triplets with an experimentally validated miR-target regulatory interaction.
The statistical significance of ΔI = I[miR; T|M] − I[miR; T]can be effectively estimated from a large number of samples (>250), using a variety of CMI estimators (Wang et al., 2009b), provided that matched miR and gene expression profiles are available for the same samples. The Cancer Genome Atlas (TCGA) datasets are thus ideally suited for this analysis as they are among only a handful satisfying these requirements. For this analysis, we used a publicly available set of 262 matched gene (both mRNA and non-coding RNA) and miR expression profiles from Glioblastoma biopsies (TCGA-Consortium, 2008). When used in genome-wide fashion on this dataset, Hermes identified nearly 7,000 sponge modulators participating in ~248,000 pairwise miR-program-mediated (RNA-RNA) interactions at a highly conservative False Discovery Rate (FDR < 1e-04). These interactions are summarized in Figure 2A and Data S1. In addition, Hermes identified 148 non-sponge modulators participating in 169 mPR interactions with over 100 genes (Data S1), including many established oncogenes and tumor suppressors, such as PTEN, RUNX1, TP63, VEGFA, EGFR, MYC and NOTCH1, among others. For these, the sponge mechanism is excluded because modulator and target have no common miR regulators. Thus, their mechanism is likely mediated by protein-protein and protein-RNA interactions. Together, sponge and non-sponge mediated interactions constitute a large and previously uncharacterized miR-Program mediated Regulatory (mPR) Network.
Globally, the sponge-mediated component of the mPR network presents roughly the same size and scale-free structure of typical transcriptional regulatory networks (Maslov and Sneppen, 2002). For instance, ARACNe-based reverse engineering of transcriptional interactions in Glioblastoma dissected ~150,000 distinct TF-target interactions (Carro et al., 2010), compared to ~248,000 mPR interactions inferred by Hermes. We modeled the network graphically, with RNAs represented as nodes and their sponge-mediated mPR interactions as undirected edges (Figure 2A). Since inferred sponge-interactions are symmetric, RNAs in this network both regulate and are regulated by their neighbor RNAs. However, mPR interactions sharing a common RNA do not necessarily interact through the same miR program. Common miR programs supporting mPR network interactions include 18 miRs on average and up to a maximum of 153 miRs (Data S1).. This suggests that, on average, sponge modulation effects associated with each individual miR in a common program may be negligible compared to the global effect of the entire program.
The mPR network contains many highly interconnected (i.e., dense) structures, i.e., N-gene sub-graphs, with a number of internal edges approximating the theoretical maximum Nmax = N(N − 1)/2. Indeed, the largest dense Glioma mPR structure is a 564-node, 111-core sub-graph (Barrat et al., 2008), i.e., a structure where each RNA is directly linked to at least 111 of the other 563 RNAs (Data S1). RNAs in these dense sub-graphs are strongly co-expressed, since each RNA tracks the average expression of the other sub-graph members it is connected to. Densest sub-graph RNAs and their interactions are shown in red, near the center of Figure 2A. Conversely, sub-graphs near the edge of the figure, which are shown in purple, are sparser and their members are less co-expressed. Nodes in the network are clustered according to their sub-graph connectivity and each node is depicted with a color representing the size of the subgraph that contains it. The mPR network is scale free, see Supplementary Experimental Procedures. Thus, the number of same color nodes in a band increases exponentially with their distance from the center (Figure 2A).
The overall regulatory effect on a node depends on many variables, including the number of its mPR neighbors, the size of the miR programs that mediate its interactions, and the individual kinetics of the individual miR-target interactions it shares with its neighbors. In general, however, nodes in larger highly connected sub-graphs will have more neighbors and will thus be more strongly regulated by their mPR interactions. Indeed, co-expression of RNAs in a sub-graph increases linearly with the sub-graph size, as shown in Figure 2B.
Analysis of the mPR network shows that mPR interactions participate in distal regulation between genes within and across chromosomes, see Figure 2C by Circos (Krzywinski et al., 2009). In addition, analysis of KEGG pathway genes (Kanehisa and Goto, 2000) targeted by mPR interactions shows that this interaction layer mediates cross-talk between numerous pathways, see Supplementary Figure S1 and Data S1.
PTEN downregulation is a hallmark of gliomagenesis and its locus has been identified as one of the most frequently altered in Glioblastoma (Verhaak et al., 2010). While homozygous deletions at the PTEN locus are rare, appearing in less than 2% of Glioblastoma samples, PTEN is haploinsufficient and even moderate PTEN downregulation at the protein level, such as that resulting from loss of a single allele, may be tumorigenic. Surprisingly, however, the range of PTEN expression in heterozygously deleted samples is comparable to its range of expression in samples where its locus is intact (Figure 3A,B), suggesting that its expression may be tightly regulated and that a variety of additional mechanisms may contribute to its downregulation in tumors. PTEN regulation by miRs is well established. In Glioblastoma, for instance, amplifications at the miR-26a locus have been implicated with downregulation of PTEN (Kim et al., 2010a). Interestingly, PTEN is one of the genes in the densest 111-core sub-graph, with a total of 534 interactions in the mPR network (Data S1), suggesting that its expression is strongly regulated by sponge effect. Indeed, not only do >80% of the tumors with an intact PTEN locus have deletions in at least one of the PTEN mPR regulator loci (44/53 as of March, 2011), but the total number of such deletions in each sample is highly predictive of PTEN expression (p < 1e-04 by permutation testing), see Figure 3B. Interestingly, these deletions are more predictive of PTEN expression in PTEN wild type tumors than in tumors with PTEN heterozygous deletions, suggesting that mPR regulators may account for missing PTEN genetic variability in Glioblastoma.
We focused on a subset of 13 PTEN mPR regulators that are expressed in the glioblastoma cell line SNB19 and whose loci are enriched for deletions in tumors with an intact PTEN-locus (25/53); interestingly, these 25 tumors include all of the 8/53 PTEN-intact tumors with amplification at the miR-26a. The total number of deletions at these 13 gene loci, as well as their total mRNA expression, were found to be highly predictive of PTEN expression not only in PTEN intact samples but across all 262 tumors tested in our analysis (Figure 3A,B, Data S1; pDel < 2e-10 and pE < 5e-23 by Pearson correlation analysis, respectively). Correlation between the genetics and genomics of PTEN mPR regulators and PTEN’s mRNA expression suggests that deletions at PTEN mPR loci may collectively represent a key contribution to loss of PTEN expression in Glioblastoma. In addition, PTEN is not the only gene whose expression may be regulated by deletions at the loci of its mPR regulators. In total, glioblastoma expression profiles of 292 genes, including many known drivers of tumorigenesis and tumor subtype such as RUNX1, PTPRN, FGFR3, TGFBR2, and DICER1, were significantly correlated (p < 0.001) with deletions at the loci of their mPR regulators (Data S1). Strikingly, the expression profiles of more than half of these genes had stronger correlation with deletions at the loci of their mPR regulators than with deletions at their own loci.
To confirm functional mPR-based PTEN regulation by these 13 mPR regulators, we performed siRNA-mediated silencing of each gene in SNB19 cells (see Supplementary Figure S2 for silencing efficiency estimates), and measured the effect on PTEN using a PTEN 3′ UTR luciferase reporter assays. Silencing of 11 of the 13 modulators in SNB19 lead to a significant (p < 0.01) reduction in PTEN luciferase activity, compared to negative controls (Figure 3C). To further validate that this regulatory mechanism is symmetric in nature, thus allowing PTEN expression to modulate the same 13 genes, we transfected SNB19 cells with PTEN 3′ UTR and measured the effects on modulator expression (Figure 3D). This also addressed the potential issue that siRNA-mediated silencing may perturb endogenous miRs, possibly affecting the results displayed in Figue 3C. Upregulation of 10 of the 13 modulators was significant (p < 0.01). Overall, 13/13 tested interactions were positive either in siRNA silencing or in 3′ UTR expression assays. To ensure that the effects are not cell-line specific, we repeated the two experiments in the glioblastoma cell line SF188, using the subset of genes that were expressed in this cell line (9/13) (Figure 3E,F). Results in SF188 confirmed the SNB19 results: indeed, silencing 9 of the 9 modulators lead to a significant reduction in PTEN luciferase activity, and transfection with PTEN 3′ UTR upregulated 7 of the 9 modulators. Taken in aggregate, the cumulative Fisher’s p-value across all the experiments is effectively below machine precision (p < 1e-221 based on an analytical estimate).
To show that these effects are specific to mPR regulators, six negative control genes were selected randomly among those that (a) are not PTEN neighbors in the mPR network, (b) have variable length UTRs, and (c) show a variety of correlation patterns with PTEN’s mRNA expression (positively correlated, negatively-correlated, and uncorrelated). Randomly selected genes include TMEM149 (30-base 3′ UTR), POFUT1 (4003-base 3′ UTR), DDX24 (269-base 3′ UTR), SLC46A3 (1416-base 3′ UTR), EXTL3 (2819-base 3′ UTR), PIK3R2 (1239-base 3′ UTR), and EHMT2 (324-base 3′ UTR). Of these, expression profiles of DDX24 and SLC46A3 are significantly positively correlated with that of PTEN, while POFUT1 expression is significantly anti-correlated with PTEN expression. All of these genes, except for POFUT1, were highly expressed and could be silenced in SNB19 cells, while only DDX24, EHMT2, EXTL3, and POFUT1 were expressed and could be silenced in SF188 cells. As predicted, PTEN 3′ UTR luciferase activity was unchanged after silencing these genes in both cell lines, see Figure 3C,E. Furthermore, their mRNA expression was not significantly affected following transfection with PTEN 3′ UTR, see Figure 3D,F. As an additional negative control, transfection of both SNB19 and SF188 cells with PIK3R2 3′ UTR, which is not inferred as a PTEN or an RB1 regulator, failed to affect PTEN, RB1, and the vast majority of their selected mPR regulators, see Supplementary Figure S3A. Finally, transfection with PTEN cDNA, which lacks a 3′ UTR region, failed to alter the expression of tested PTEN mPR regulators, see Supplementary Figure S3B; note that the expression of these genes was affected by transfection with PTEN 3′ UTR, see Figures 3D,F. Taken together, these results confirm cell-line independent, miR-mediated interactions between predicted mPR RNA pairs, including PTEN and its predicted mPR regulators, but not between these genes and other randomly selected genes (negative controls), regardless of their correlation with PTEN expression or the lengths of their UTRs.
To test whether PTEN mPR regulators may affect tumor cell growth, as previously shown for PTEN’s post-transcriptional regulator PTENP1 (Poliseno et al., 2010), we measured SNB19 and SF188 cell growth rates in response to transfection of PTEN cDNA (missing the 3′ UTR) and 3′ UTR, as well as to siRNA-mediated silencing of PTEN and of its Hermes-inferred mPR regulators (Figure 4). Transfection of PTEN 3′ UTR upregulated the expression of its mPR neighbors, increased PTEN protein concentration, and reduced tumor cell growth rates. Conversely, siRNA-mediated silencing of 10/13 and 9/9 mPR-regulators reduced PTEN 3′ UTR-luciferase expression and significantly accelerated SNB19 and SF188 cell growth, respectively. The effect of silencing these regulators was comparable to that of siRNA-mediated PTEN silencing, and the aggregate p-value for the significance of the increase in tumor cell growth computed by Fisher’s method is below machine precision (i.e., p ≈ 0).
To demonstrate that mPR regulation is not limited to the PTEN and its mPR neighbors but is rather a general property of the mPR network, we further tested the ability of RUNX1, a master regulator of the Glioma mesenchymal subtype (Carro et al., 2010), to regulate its Hermes-predicted neighbors. Indeed, transfection with RUNX1 3′ UTR in SNB19 cells was equally effective in upregulating 4 of its 5 its mPR neighbors, see Supplementary Figure S4.
The mPR network may explain significant crosstalk among different regulatory compartments of the cell that observed in perturbation experiments (Chow et al., 2011). Indeed, further investigation of the mPR network revealed that known drivers of Glioma tumorigenesis and glioblastoma subtypes RB1, PTEN, RUNX1, PDGFRA, STAT3 and VEGFA (Carro et al., 2010; Parsons et al., 2008; Verhaak et al., 2010) are part of a dense sub-graph of mutually mPR-interacting genes, see Figure 5A. Ectopic expression of PTEN 3′ UTR was effective in upregulating expression of the other genes in this sub-graph, while siRNA-mediated silencing of DICER and DROSHA (necessary for miR processing) was sufficient to abrogate the effect, suggesting that these interactions are miR-mediated, see Figure 5B. To further confirm symmetric post-transcriptional regulation across all genes in the sub-graph, we measured their response to transfection with the 3′ UTRs of all other genes in the sub-network in SNB19 cells by qRT-PCR, except for VEGFA, whose 3′ UTR cloning was not successful. Results confirmed that ectopic expression of the 3′ UTRs of genes in this sub-network upregulated expression of the other genes (Figure 5B–C). In particular, ectopic expression of PTEN and RB1 3′ UTRs led to a >50% up-regulation of both genes, suggesting significant miR-mediated crosstalk between the PTEN and RB1 pathways, both implicated in gliomagenesis. Moreover, co-ectopic expression of 3′ UTR pairs, at 50% concentration for each UTR, intensified the regulatory response (Figure 5D), suggesting that the effect of multiple mPR modulation is more than additive, as suggested by results shown in Figure 2B.
Cooperativity between some of the regulators in the sub-network has been implicated in high-grade gliomagenesis (Chow et al., 2011), despite their distinct functions, lack of common transcriptional regulation, and large genomic distances. In particular, the loci of tumor suppressors PTEN and RB1 are frequently deleted in high-grade Glioma (PTEN: 80%; RB1: 33%; PTEN+RB1: 85%). Analysis of 3′ UTR luciferase activity of five of the six genes (Figure 5A), following siRNA-mediated silencing of PTEN and RB1, confirmed the presence of the predicted regulatory interactions (Figure 6). Thus, our results suggest that PTEN and RB1 regulate one another post-transcriptionally through 32 miRs in a common program (Data S1) and that their availability significantly affects expression of other genes in the same sub-graph with an established role in gliomagenesis. Overall, 6 of 8 predicted interactions were confirmed by these assays. Additionally, of 11 experimentally validated interactions between these genes, 6 were predicted, suggesting a false negative rate of ~45%. Full analysis of mPR interactions between genes in common KEGG pathways (Kanehisa and Goto, 2000) is presented in Supplementary Figure S1 and Table S4.
As discussed, to identify non-sponge miR-activity modulators, we restricted our analysis to candidate modulators of literature validated miR-target interactions. Since prediction of individual miR targets is still inaccurate, we could not rely on the robust statistics afforded by large miR programs, as used when predicting sponge-modulators. Even with this substantial limitation, Hermes could identify 148 miR-activity modulators, suggesting that this number may increase substantially once a more comprehensive, high-accuracy miR-target network is available. To experimentally confirm non-sponge modulator candidates inferred by our analysis, we selected three interactions, two affecting PTEN and one affecting RUNX1. These include: WIPF2, as a miR-mediated regulator of RUNX1, and PALB2 and WNT7A as miR-mediated regulators of PTEN (Figure 7A). Both PTEN and RUNX1 are known drivers of gliomagenesis (Carro et al., 2010; Verhaak et al., 2010), and genes that regulate their expression may play a role in this disease.
Up regulation of WNT7A by transfection of its cDNA sequence, which lacks both 5′ and 3′ UTRs, led to 1.5 fold up-regulation of PTEN or PTEN 3′ UTR luciferase activity (Figure 7B,E), suggesting that, as predicted, WNT7A regulation of PTEN is miR-dependent but not sponge mediated. Silencing PALB2 lead to a 1.5-fold up-regulation of PTEN or PTEN 3′ UTR luciferase activity (Figure 7C,E), and silencing WIPF2 lead to a 3-fold up-regulation of RUNX1 expression and 2-fold up-regulation of RUNX1 3′ UTR luciferase activity (Figure 7D,F), both consistent with computational predictions of miR-mediated down regulation. Consistent with miR-dependent regulation, DROSHA and DICER silencing (Figure 7G) abrogated PTEN and RUNX1 regulation by WNT7a/PALB2 and WIPF2, respectively (Figure 7E-F). Expression of validated miRs targeting these genes, such as mir-21 (WNT7A), mir-106a (PALB2), and mir-17-5p (WIPF2), were relatively unchanged (Supplementary Figure S5).
Genome-wide Hermes analysis supports the existence of a miR-mediated, post-transcriptional regulation layer of unsuspected magnitude, the mPR network, effected both by sponge and non-sponge interactions. While the specific mechanism of sponge modulation and the potential for miR-gene interactions were previously reported (Arvey et al., 2010; Carro et al., 2010; Krol et al., 2010; Poliseno et al., 2010; Su et al., 2011), both the extent and the functional relevance of this regulatory layer was unknown. In terms of size, the mPR layer rivals transcriptional regulation, supporting regulation of thousands of RNA species and modulating crosstalk between distinct regulatory pathways. Changes in two or more mPR regulators of a target gene may have effects comparable to transcriptional regulation (i.e. > 2-fold changes), as suggested by Figure 2C and shown in Figure 5C and Figure 5D.
The mPR network discovered here is implemented by both sponge-mediated interactions that are generally symmetric in nature, and in sponge-independent fashion by multiple gene-products (e.g., proteins and RNAs) that affect miR-target regulation asymmetrically. Given the genome-wide nature of the algorithm used to infer non-sponge mPR regulation, elucidation of the variety of mechanisms that may support them cannot be accomplished at this stage. However, the experimentally validated existence of this class of interactions suggests a variety of mechanisms for future testing, some of which have been previously elucidated, see Figure 1B. A key and potentially confusing point is that our analysis suggests that mPR sponge interactions are mediated by relatively large miR programs, including on average 18 and up to 153 miRs (Supplementary Table S1). As a result, the effect of individual miRs is relatively negligible and mPR regulation is unlikely to be significantly affected by modulation of individual miRs or miR binding sites in isolation. In addition, two other articles in this issue of Cell (Tay et. al. 2011, Karreth et al. 2011) also report the identification of a regulatory network of miR ‘sponges’ (or competitive endogenous RNAs) in cancer, while a third article in this issue (Cesana et al. 2011) reports the discovery of a long-noncoding RNA that regulates muscle differentiation via a ‘sponge-like’ mechanism, as well.
Importantly, while we have validated a substantial set of miR-mediated PTEN modulators in multiple cell lines, this by no means constitutes a thorough validation of the entire network. Yet, out of 29 experimentally validated interactions, including 28 sponge-mediated ones (Figures 3C–F, ,5,5, ,6,6, and Supplemental Figure S4) and 3 non-sponge mediated ones (Figure 7), all but 6 were confirmed (all but 3 if one considers both 3′ UTR expression and siRNA mediated silencing assays in Figure 3C,D). This suggests that false positive rates should be low (~10%–20%), comparing favorably with false positive rates in typical high-throughput experimental procedures. Thus, if globally validated using the experimental assays proposed in this manuscript, which is not currently feasible even using high-throughput approaches, a substantial number of the predicted interactions should be confirmed. Furthermore, we validated all of the pairwise interactions in the dense sub-graph that includes PTEN, STAT3, VEGFA, PDGFRA, RUNX1, and RB1. Of the 11 that were experimentally validated, 5 were not predicted, suggesting a false negative rate of ~45%, which is also competitive with experimental false negative rates.
In addition, limitation of non-sponge modulator analysis strictly to literature validated miR-target interactions suggests that, when extrapolated to the full set of miR-target interactions in a cellular context, non-sponge interactions may be far more numerous. Finally, while we focused on regulation of mRNAs through miRs that target 3′ UTRs, miR-mediated regulation is not restricted to 3′ UTR targeting (Chi et al., 2009; Hafner et al., 2010) and miRs are known to target non-coding RNAs (Poliseno et al., 2010), which suggests that the scope of this network needs to be further expanded.
It is important to note that while individual miR-mediated interactions may be weak, their regulatory effect in combination is substantial, see Figures 2B and and5D.5D. Furthermore, their ability to affect cellular phenotype is also significant and comparable to what was previously described for PTENP1 (Poliseno et al., 2010), whose deletion was shown to be tumorigenic in vivo. This suggest that miR-mediated interactions between genes may play an important role in disease initiation and progression when dysregulated. Indeed, analysis of large Glioblastoma datasets revealed that miR-mediated PTEN regulators are highly predictive of PTEN downregulation even when the PTEN locus is intact and may account for a significant proportion of the missing genetic variability of the PTEN locus.
In this manuscript, we focused on PTEN as a key driver of gliomagenesis whose locus is often altered in glioblastoma samples (Verhaak et al., 2010). However, regulation by miR-activity modulators is not limited to PTEN or to glioma. In this study, we showed that a variety of well-established drivers of tumorigenesis and tumor subtype in Glioblastoma are regulated by miR-activity modulators, and our computational predictions suggest that other established oncogenes and tumor suppressors are similarly regulated. Since these effects are miR mediated and miR expression is strongly cell-context dependent, mPR networks are likely to be context-specific and their structure and contribution to disease initiation and progression will need to be studied independently in different contexts.
Hermes, the algorithm used for the identification of miR-activity modulators, presents two key advantages. First, by definition, non-sponge modulators cannot be inferred by miR target analysis. Thus, Hermes provides the only systematic computational approach to determine this kind of interactions. Second, while it may be possible to infer sponge modulators by miR-target analysis alone, for instance by identifying genes whose transcripts share common miR binding sites, identification of functional miR targets is still largely inaccurate, with different methods predicting widely different interactions. Hermes circumvents this problem by first integrating evidence from multiple miRs in a common program and then by requiring direct, multivariate expression-based evidence for the predicted interaction, by conditional mutual information analysis. Thus, false negative predictions by miR-target prediction algorithms are much less critical than false positive predictions, as the latter dramatically reduce the statistical power of the method by increasing the number of hypotheses tested by the algorithm. On the other hand, even if miR program size is reduced by false negatives, conditional mutual information analysis can still filter false positive interactions. As a result, rather than relying on existing algorithms for miR target prediction, which still have substantial false positive rates, we implemented Cupid specifically to reduce false positive predictions even if at the expense of some false negative predictions, see Supplementary Experimental Procedures. Indeed, Cupid predicts fewer miR-target interactions than the intersection of three established algorithm TARGETSCAN (Lewis et al., 2005), PITA (Kertesz et al., 2007) and MIRANDA (Enright et al., 2003). However, when we replaced Cupid predictions by the intersection of the three algorithms, 25 out of 26 experimentally validated mPR interactions in this manuscript were missed. As a result, while our analysis does not suggest that Cupid may outperform other algorithms in terms of miR target identification, its specific design, aimed at minimizing false positives at the expense of false negatives, is uniquely tailored to inferring miR programs for further Hermes analysis.
Periodically, we are faced with the emergence of new regulatory layers, the post-transcriptional and histone modification ones being the latest additions. Every time this happens, we discover that these layers account for a significant amount of missing genetic and epigenetic variability in the etiology of disease. As a result, as suggested by our data, it is reasonable to expect that this novel and extensive miR-mediated interaction layer, which allows gene regulation without direct transcriptional or even post-transcriptional interactions, will also provide a number of clues to the dysregulation of key mechanisms of pathogenesis as well as to the regulation of normal cell physiology.
We used a miR-activity modulator screening algorithm, Hermes, to identify candidate miR-activity modulators by finding genes whose expression is correlated with deviations in co-expression between miR programs and their targets using conditional mutual information. We used an integrative miR-target prediction algorithm, Cupid, to predict miR-target interactions and to assemble miR-regulatory programs for 3′ UTRs. We identified genomic alterations using snapCGH (Marioni et al., 2006). Level 3 Agilent gene and miR expression data for glioma tumors were obtained from TCGA (TCGA-Consortium, 2008). The glioblastoma cell lines SNB19 and SF188 were cultured under standard conditions. Transient transfections of expression vectors were used to over-express genes and 3′ UTRs; siRNAs were used for mRNA silencing; real-time PCR, luciferase activity, western blots, and proliferation assays were performed according to standard protocols. Our methods and experimental procedures are described in detail in Supplementary Experimental Procedures.
We thank James Chi-ping Chen and Mariano J. Alvarez for critical discussion, and Katia Basso, Francesco Niola, Antonio Iavarone, and Anna Lasorella for support and suggestions. We would also like to acknowledge the generous funding provided by the NIH under the following grant awards: (1) Roadmap grant for a Center for the Multiscale Analysis of Genetic Networks (MAGNet) (U54CA121852), (2) Genetic Network Inference with Combinational Phenotypes (R01CA109755), and (3) In Silico Research Centre of Excellence NCI-caBIG 29XS192. P.S. and A.C. conceived and supervised the project and participated in its computational and experimental design. P.S., H.S.C, W.J.C., M.B. and P.G. designed and implemented the computational methods; P.S., X.Y., H.S.C and W.J.C. analyzed data; P.S., X.Y., J.S. and A.C. designed the experimental assays. X.Y., A.I., D.L.N and P.R. performed the experiments; P.S., X.Y., H.S.C and A.C. wrote the paper.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.