Most metazoan microRNA target sites have perfect pairing to the seed region, located near the miRNA 5’ end. Although pairing to the 3’ region sometimes supplements seed matches or compensates for mismatches, pairing to the central region has been known to function only at rare sites that impart Argonaute-catalyzed mRNA cleavage. Here we present “centered sites,” a class of microRNA target sites that lacks both perfect seed pairing and 3’-compensatory pairing and instead has 11–12 contiguous Watson–Crick pairs to the center of the microRNA. Although centered sites can impart mRNA cleavage in vitro (in elevated Mg2+), in cells they repress protein output without consequential Argonaute-catalyzed cleavage. Our study also identified extensively paired sites that are cleavage substrates in cultured cells and human brain. This expanded repertoire of cleavage targets and the identification of the centered site type help explain why central regions of many microRNAs are evolutionarily conserved.
MicroRNAs (miRNAs) are a class of ~23 nucleotide (nt) RNAs that direct the post-transcriptional repression of protein-coding genes (Bartel, 2004). After processing from hairpin precursors, miRNAs are loaded into Argonaute-containing silencing complexes, which down-regulate mRNA targets through two distinct modes, either Argonaute-catalyzed cleavage or a second mode that involves mRNA destabilization and translational repression, at least in part through poly(A) shortening (Filipowicz et al., 2008).
Argonaute-catalyzed cleavage of the target strand occurs in the context of extensive base pairing, at the linkage joining the mRNA nucleotides that pair to miRNA positions 10 and 11 (Elbashir et al., 2001b; Hutvagner and Zamore, 2002; Yekta et al., 2004). In mammals, this slicing activity is catalyzed by Argonaute2 (AGO2), which leaves a 3’ hydroxyl on the 5’ cleavage fragment and a 5’ monophosphate on the 3’ cleavage fragment (Liu et al., 2004; Meister et al., 2004; Schwarz et al., 2004). In contrast to the situation in plants, very few miRNA-dependent cleavage targets have been reported in mammals (Yekta et al., 2004; Davis et al., 2005; Jones-Rhoades et al., 2006). Nonetheless, artificially designed small interfering RNAs (siRNAs) that silence target genes through this mechanism are widely used reagents, illustrating that in principle the cleavage mode of repression can function in many contexts and for many targets (Elbashir et al., 2001a).
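The position of the scissile linkage can be made concrete with a short sketch. The Python below is an illustration only, not code from the study: it assumes a fully complementary, ungapped site, and the function names and the 22-nt guide sequence are arbitrary placeholders introduced here. It splits an mRNA at the bond joining the nucleotides that pair to miRNA positions 10 and 11.

```python
def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    complement = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(rna))


def split_at_slicing_site(mrna, mirna):
    """Split mrna between the nucleotides paired to miRNA positions 10 and 11.

    Simplifying assumption: the site is perfectly complementary to the whole
    miRNA, with no bulges. Returns (five_prime_fragment, three_prime_fragment),
    or None when no such site is present.
    """
    site = reverse_complement(mirna)
    start = mrna.find(site)
    if start == -1:
        return None
    # miRNA position p pairs with mrna[start + len(mirna) - p], so the
    # nucleotide paired to position 10 is the first nucleotide of the
    # 3' fragment (the one that carries the 5' monophosphate).
    cut = start + len(mirna) - 10
    return mrna[:cut], mrna[cut:]


# Arbitrary illustrative guide and target (not sequences from the study).
guide = "UACGGAUCCAGUUGCAUCGAGU"
target = "GGGAAA" + reverse_complement(guide) + "CCCUUU"
five_prime, three_prime = split_at_slicing_site(target, guide)
```

The 5’ end of the 3’ fragment returned here is the coordinate that a 5’-RACE clone or a degradome tag would be expected to report for a site of this kind.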
Sites that confer slicing-independent destabilization and translational repression typically pair to the 5’ region of the miRNA, centering on miRNA nucleotides 2–7, known as the miRNA seed (Lewis et al., 2005; Bartel, 2009). Introducing an siRNA/miRNA or deleting an endogenous miRNA leads to modest yet detectable changes in the output of hundreds of genes containing seed sites in their 3’UTRs (Krutzfeldt et al., 2005; Lim et al., 2005; Giraldez et al., 2006; Grimson et al., 2007; Rodriguez et al., 2007; Baek et al., 2008; Selbach et al., 2008). Moreover, most mammalian protein-coding genes are under selection to maintain pairing to the seed of one or more miRNAs, and thousands of genes have also evolved to specifically avoid pairing to the seeds of preferentially co-expressed miRNAs (Farh et al., 2005; Lewis et al., 2005; Stark et al., 2005; Friedman et al., 2009). These observations illustrate both the broad scope of seed-type regulation and the widespread influence of this targeting mode on mRNA evolution.
Pairing to the 3’ region of the miRNA can supplement seed pairing to enhance target recognition, or it can even compensate for a mismatch to the seed; such sites are known as “3’-supplementary sites” and “3’-compensatory sites”, respectively (Bartel, 2009). However, pairing to the 3’ region appears to be consequential for relatively few (<10%) sites (Bartel, 2009). In principle, pairing to the central region of the miRNA could also supplement pairing to the other regions of the miRNA, but a role for such pairing has been demonstrated only for sites that mediate Ago-catalyzed cleavage and not for sites that mediate destabilization and translational repression.
Here we describe “centered sites”, a unique class of microRNA target sites that lacks both perfect seed pairing and 3’-compensatory pairing and instead has 11–12 contiguous Watson–Crick pairs to miRNA nucleotides 4–15. In the process of characterizing these sites, we found that Mg2+ concentration profoundly influences both the specificity and efficacy of miRNA-directed cleavage, and we performed whole-transcriptome analyses that substantially add to the number of known instances in which metazoan miRNAs direct mRNA cleavage.
The most highly conserved region of metazoan miRNAs is the 5’ region containing the seed (Lewis et al., 2003; Lim et al., 2003), which is the region most important for recognizing most targets. The next most highly conserved region spans nucleotides 13–16, which is the region most important for 3’-supplementary and 3’-compensatory pairing (Grimson et al., 2007). Although the central region of vertebrate miRNAs is less conserved than other miRNA regions, we noted that it is significantly more conserved than is the opposite arm of the pre-miRNA hairpin (Figure 1A and S1A). Because both arms participate equally in the pairing required to form the pre-miRNA hairpin, preferential conservation of the miRNA observed in this region suggested that these central nucleotides play a role beyond that of miRNA biogenesis. One such role would be to aid in target recognition, but among the previously characterized targeting modes, the central region is known to function only for cleavage sites, which seemed too rare to provide the additional selective pressure for conserving nucleotide identity at the miRNA central regions. Therefore, we searched for another type of site that might explain this preferential conservation.
Examination of array data investigating the response of mRNAs after transfecting 11 miRNAs into HeLa cells (Lim et al., 2005; Grimson et al., 2007) revealed a unique type of site that was associated with mRNA down-regulation (Figure S1B). This site type, which we call the “centered site”, was characterized by at least 11 nucleotides of contiguous Watson–Crick base pairing to the center region of the miRNA at either nucleotides 4–14 or 5–15, without substantial pairing to either the 5’ or the 3’ ends of the miRNA. Because of the location and extent of their base pairing, these sites occupy a unique position intermediate between seed sites and the extensively complementary cleavage sites (Figure 1B).
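The site definition above can be expressed as a simple sequence search. The following Python is a minimal sketch under stated assumptions, not the pipeline used for the array analyses: it looks for perfect 11-nt Watson–Crick matches to miRNA nucleotides 4–14 or 5–15 and discards hits that also pair contiguously to seed nucleotides 2–7. It does not screen for pairing to the miRNA 3’ end, and the function names and the guide sequence are placeholders introduced for illustration.

```python
WC = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(WC[base] for base in reversed(rna))


def is_paired(utr, mirna, utr_index, position):
    """True if utr[utr_index] forms a Watson-Crick pair with the miRNA
    nucleotide at the given 1-based position."""
    if not 0 <= utr_index < len(utr):
        return False
    return WC.get(utr[utr_index]) == mirna[position - 1]


def find_centered_sites(utr, mirna):
    """Return sorted (utr_start, first_pos, last_pos) tuples for 11-nt
    matches to miRNA positions 4-14 or 5-15 that lack a paired seed (2-7)."""
    hits = []
    for first in (4, 5):
        last = first + 10
        motif = reverse_complement(mirna[first - 1:last])
        start = utr.find(motif)
        while start != -1:
            # miRNA position p pairs with utr[start + (last - p)].
            seed_paired = all(
                is_paired(utr, mirna, start + (last - p), p)
                for p in range(2, 8)
            )
            if not seed_paired:
                hits.append((start, first, last))
            start = utr.find(motif, start + 1)
    return sorted(hits)


# Arbitrary illustrative guide (not a sequence from the study).
guide = "UACGGAUCCAGUUGCAUCGAGU"
centered_utr = "AAAA" + reverse_complement(guide[3:14]) + "AAAA"
seed_extended_utr = "AAAA" + reverse_complement(guide[1:14]) + "AAAA"
twelve_pair_utr = "AAAA" + reverse_complement(guide[3:15]) + "AAAA"
```

A site with 12 contiguous pairs to nucleotides 4–15 is reported twice by this sketch, once for each 11-nt window; a fuller implementation would merge overlapping hits.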
Because these sites are relatively rare, pooling of data from multiple miRNA transfections was initially required to achieve statistical significance for determining their efficacy. In order to systematically analyze these sites, we therefore compiled additional array data from HeLa experiments with similarly transfected miRNAs and siRNAs (Birmingham et al., 2006; Jackson et al., 2006a; Jackson et al., 2006b; Schwarz et al., 2006; Anderson et al., 2008). To ensure that the transfected mi/siRNAs were loaded and active within the silencing complex, the pooled datasets were restricted to those of the 78 HeLa experiments for which the canonical 8mer 3’UTR site to the transfected mi/siRNA was associated with downregulated mRNAs with high statistical significance (P < 0.0001, K–S test, Table S1A). Testing matches that did not include the canonical seed match to miRNA positions 2–7 showed that perfect 11-mer matches starting at miRNA positions 3, 4, and 5 were each significantly associated with repression (Figure 1C), whereas perfect 10-mer matches, perfect 9-mer matches, and near-perfect 11-mer matches (those with single mismatches or wobbles) were not significantly associated with repression (Figure S1C and D).
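The K–S comparison used to filter the datasets can be sketched as follows. This is an illustrative standard-library Python version, not the authors' implementation: it computes the two-sample Kolmogorov–Smirnov statistic for two sets of log-fold changes (for example, mRNAs with and without a given site) and an asymptotic two-sided P value using a common small-sample correction. Whether the original analysis was one- or two-sided is not stated in the text, so the two-sided form is an assumption, as are all names below.

```python
import math
from bisect import bisect_right


def ks_two_sample(sample_a, sample_b):
    """Return (D, p) for a two-sample Kolmogorov-Smirnov test.

    D is the largest vertical distance between the two empirical cumulative
    distributions; p is the asymptotic two-sided P value.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    d = 0.0
    # Both empirical CDFs are step functions, so the largest gap occurs
    # at one of the observed values.
    for x in a + b:
        gap = abs(bisect_right(a, x) / n - bisect_right(b, x) / m)
        d = max(d, gap)
    effective_n = math.sqrt(n * m / (n + m))
    lam = (effective_n + 0.12 + 0.11 / effective_n) * d
    if lam < 0.3:
        # The series below converges slowly here, and p is essentially 1.
        return d, 1.0
    series = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
        for k in range(1, 101)
    )
    return d, min(1.0, max(0.0, 2.0 * series))


# Made-up log2 fold changes, for illustration only.
with_site = [-1.0 - 0.01 * i for i in range(50)]
without_site = [0.01 * i for i in range(50)]
d_stat, p_value = ks_two_sample(with_site, without_site)
```

In practice a library routine such as a statistics package's two-sample K–S test would normally be preferred; the explicit version is shown only to make the comparison of cumulative distributions transparent.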
The efficacy of centered sites matching ectopically introduced miRNAs and siRNAs raised the possibility that such sites might also mediate endogenous miRNA targeting. Array results examining the effects of miRNA loss in zebrafish embryos lacking Dicer provided data on a sufficient number of messages with centered sites to enable a systematic analysis of targeting interactions in vivo. MicroRNAs present at 24 hours post fertilization (hpf), the developmental stage used for mRNA analysis, were identified by high-throughput sequencing (Table S1B). Sites were considered for 21 of these miRNAs for which the canonical 8mer 3’UTR sites were significantly associated with mRNAs derepressed in the dicer mutant (P < 0.01, Table S1B). As observed for the ectopic interactions, perfect 11-mer matches starting at miRNA positions 3, 4, and 5 were each associated with significant repression, although efficacy of sites starting at position 3 was mostly attributable to overlap with the “shifted 6mer” seed match (Figure 1D), which comprises pairing to nucleotides 3–8 (Friedman et al., 2009). Perfect 10-mer matches, perfect 9-mer matches, and near-perfect 11-mer matches generally were not associated with significant repression, although for a few of the numerous possibilities examined, marginal significance was observed (Figure S1E and F).
When considering both ectopic and endogenous interactions, contiguous Watson–Crick 3’UTR pairing to the central region of the miRNA, at either nucleotides 4–14 or 5–15, was unique among the tested possibilities in that it was both consistently associated with mRNA repression and not attributable to overlap with previously described site types. A previous array study had reported a handful of siRNA off-targets with similarly long stretches of contiguous Watson–Crick base pairing, but these sites were offset further towards the 3’ end of the miRNA, at nucleotides 6–16 or 7–17 (Jackson et al., 2003), a region not significantly associated with targeting when examined using our larger datasets. Effective miRNA-target prediction algorithms rely heavily on perfect pairing to the seed region, and thus miss this additional class of targets (Bartel, 2009).
The transfected mi/siRNAs had an average of 11 and a median of eight centered sites in 3’UTRs of human mRNAs. About a quarter of the mRNAs with a centered site lacked conventional seed sites to the transfected RNA and were sufficiently expressed in HeLa such that changes could be accurately measured on the arrays. Analysis of cumulative distributions of log-fold changes indicated that >20% of these mRNAs responded to the transfected mi/siRNAs in a manner attributable to the site, with a lower bound for site efficacy resembling that of canonical 7-mer sites (Figure 1E). Likewise, >30% of the endogenous centered sites analyzed appeared to mediate repression in zebrafish embryos (Figure 1F).
To examine whether centered sites also function in other animals, we analyzed mRNA array datasets monitoring the impact of knocking down proteins required for Drosophila miRNA biogenesis (Kadener et al., 2009). Following either Drosha or Dicer1 knockdown in Drosophila S2 cells, messages with 3’UTR centered sites matching the endogenous miRNAs had a significant propensity to be derepressed (Figure S1G and H, P = 0.00045 and 0.027, respectively, for Drosha and Dicer1 knockdown datasets, Table S1C and D).
To confirm that centered sites can be directly targeted by miRNAs, luciferase reporter constructs and their mutant counterparts with disrupted pairing were prepared and tested in both HeLa cells and S2 cells (Figure 1G and S1I). For three of the four UTR fragments tested, the sites reduced protein output in a manner that depended on the presence of both the wild-type site and the cognate miRNA. Taken together, the reporter and microarray results suggested that the centered site is a miRNA target site capable of downregulation comparable with that observed for single 7-nt seed sites. Although they are much less abundant than both seed-matched sites and sites with 3’-supplementary pairing, centered sites are present in similar numbers as 3’-compensatory sites and could help explain the preferential conservation observed in the central region of most miRNAs.
Because of pairing to the central region of the miRNA, centered sites might be subject to AGO2-dependent cleavage similar to that occurring for known cleavage sites of plants and animals, which are more extensively paired (Yekta et al., 2004; Davis et al., 2005; Jones-Rhoades et al., 2006). To test this possibility, we employed an in vitro cleavage assay using S100 extract prepared from HeLa cells (Martinez and Tuschl, 2004; Shin, 2008), focusing on mRNA fragments containing centered sites for miR-21 or let-7g miRNA, which are abundant in HeLa cells (Figure 2A, Table S4A). Cleavage was observed at the position expected for AGO2-catalyzed cleavage of the centered sites (Figure 2B).
To examine whether cleavage was also occurring in the cells, we tested for miR-21-directed cleavage of GSTM3 mRNA (moderately expressed in HeLa cells), using RNA ligase–mediated rapid amplification of cDNA 5’ ends (5’-RACE). By directly cloning and sequencing the 5’ end of the 3’ cleavage product, this assay can be used to validate miRNA-directed cleavage (Llave et al., 2002; Yekta et al., 2004). To increase the sensitivity of the assay, XRN1, the 5’→3’ exonuclease responsible for degrading the 3’ cleavage product (Souret et al., 2004; Orban and Izaurralde, 2005), was knocked down (Aleman et al., 2007). 5’-RACE fragments within ~50 bp of the expected cleavage site were cloned for sequencing. The 5’ ends for seven of nine sequenced clones precisely matched that expected for cleavage at the centered site in the cell (Figure 2C). These results indicated that for an endogenous mRNA targeted at a centered site by an endogenous miRNA, at least some transcripts underwent AGO2-catalyzed cleavage in the cell.
To understand the specificity of cleavage at centered sites, miR-21 recognition of the K89 mRNA fragment (Figure 2) was examined further. The K89 RNA sequence, which was perfectly complementary to positions 5–16 of miR-21, was systematically mutated at each nucleotide corresponding to miR-21 positions 1–16, substituting an A:C mismatch or a G:U wobble for each Watson–Crick match, and substituting a Watson–Crick match for each of the two mismatches (Figure 3A). When using 5.8 mM Mg2+, as in Figure 2B, or 2.2 mM Mg2+, both of which were within the ~2–6 mM range used previously to study in vitro cleavage (Martinez and Tuschl, 2004; Gregory et al., 2005; Maniataki and Mourelatos, 2005; Miyoshi et al., 2005; Rand et al., 2005; Ameres et al., 2007; Wang et al., 2009a; Wang et al., 2009b), cleavage was retained after changing positions outside of the centered site and was reduced after changing most positions within the centered site, although wobble pairs were tolerated at positions 6 and 8 (Figure 3B, top two panels).
Mg2+ is essential for the in vitro cleavage reaction (Schwarz et al., 2004) but also has a striking effect on the relative stabilities of matched and mismatched RNA duplexes (Serra et al., 2002). Indeed, lowering the Mg2+ concentration increases the fidelity of RNA 2’-O-methylation, another reaction specified by Watson-Crick pairing between small guide RNAs and their targets (Appel and Maxwell, 2007). We found that lowering Mg2+ gave maximal target RNA cleavage specificity and efficacy for substrates that were extensively paired to miR-21, whereas higher Mg2+ was optimal for more weakly pairing substrates (Figure 3B). For example, the cleavage of K89-21as RNA, which is fully paired to the miRNA, was the most efficient at 0.3 mM Mg2+, whereas cleavage of the wild-type K89 substrate containing the centered site with only 12 contiguous pairs was undetectable at 0.3 mM Mg2+ and most efficient at 5.8 mM Mg2+, and K89-m4GC, which had an intermediate number of contiguous pairs, had an intermediate Mg2+ optimum (Figure 3B and C).
The free-Mg2+ level in the cytoplasm of various cells and tissues is less than 1 mM (Gunther, 2006), a concentration at which we found that efficient cleavage required pairing more extensive than that of typical centered sites (Figure 3B). Nonetheless, some cleavage at the centered site was detected at physiological Mg2+ concentrations (Figure 3B, 0.75 mM Mg2+), which explained why the 5’-RACE assay yielded fragments diagnostic of miR-21-directed cleavage in the cell (Figure 2C).
The poor efficacy of cleavage at the centered site at physiological Mg2+ concentration called into question whether miRNA-directed cleavage plays a consequential role during repression mediated by centered sites and suggested that most repression at centered sites might resemble the destabilization and translational repression observed for most seed-matched targets. To better characterize the scope of miRNA and siRNA-directed cleavage in mammals, and to examine the extent to which cleavage at centered sites is relevant to target gene regulation in vivo, we applied degradome sequencing to mammalian cells. Degradome sequencing generates short sequence tags representing the 5’ ends of uncapped mRNA fragments found in the cell (Addo-Quaye et al., 2008; German et al., 2008). Although these fragments are predominantly 5’→3’ exonuclease degradation intermediates, they also include 3’ fragments of Argonaute-catalyzed mRNA cleavage in sufficient numbers to enable empirical detection of endogenous cleavage targets of plant miRNAs and siRNAs (Addo-Quaye et al., 2008; German et al., 2008). Inspired by this success in plants and the ability to detect miR-21-directed cleavage by 5’-RACE, we applied the method to HeLa cells following XRN1 knockdown by RNAi (Figure S3A).
Sequencing yielded 14,323,668 tags mapping to the human genome, with a diversity of 2,069,190 unique tag sequences. Of the total tags, 61.2% came from protein-coding genes and represented 36,806 out of 46,319 ENSEMBL mRNAs (Figure 4A). The tags showed a relatively uniform distribution across the mRNAs, with a very strong peak at the 5’ terminus (Figure 4B). About 30% of tags were not classified because they did not map to mature annotated RNAs (Figure 4A). Many of these were from introns and processing fragments from pri-miRNAs, mitochondrial tRNAs, ribosomal RNAs, and snRNAs, illustrating how unstable 3’ products of endonucleases can be detected in mammalian cells by using degradome sequencing (Table S2A and B).
To determine if miRNA centered sites were associated with cleavage at the expected position within the mRNA 3’UTR, we searched for centered matches to 50 distinct, conserved miRNAs most highly expressed in HeLa cells and tabulated the frequency of degradome tags corresponding to mRNA cleavage at the tenth position of these sites. Tags corresponding to cleavage at the expected position were found much more frequently for authentic miRNA:site pairs than for negative-control pairs (Figure 4C). However, when we excluded miR-196a, miR-151, and miR-28, which target several extensively paired sites, the signal above background was greatly reduced, suggesting that most centered sites lacked the complementarity required for robust miRNA-directed cleavage (Figure 4C). The abundance of degradome tags mapping to the expected cleavage sites of the siRNAs targeting XRN1 illustrated that the method can identify tags diagnostic of AGO2-catalyzed cleavage in human cells (Figure S3). These results supported those from the in vitro cleavage assays (Figure 3B) in suggesting that under physiological Mg2+ conditions the mRNA downregulation mediated by centered sites is usually accompanied by very little AGO2-catalyzed cleavage.
Our observation of significant cleavage at the small subset of centered sites with unusually extensive complementarity to the miRNA indicated that miRNA-directed cleavage at extensively paired sites was more frequent in animals than had been appreciated. This insight prompted a systematic examination of mammalian sites with extensive miRNA complementarity of the type that would mediate cleavage in plants, but might not have fulfilled our criteria for classification as centered sites because they either had perfect seed pairing or lacked 11 contiguous pairs within positions 4–15.
To search for potential cleavage sites in mammals, we used a scoring rubric similar to those that successfully identify miRNA target sites in plants (Figure 5A) (Jones-Rhoades and Bartel, 2004; Allen et al., 2005). The search yielded 106 predicted miRNA:site duplexes scoring ≤2.0 (Figure 5B), including 47 in annotated ORFs, 16 in 5’UTRs, and 43 in 3’UTRs (Table S3A). At the mid-to-higher penalty scores, sites were no more abundant than expected by chance, but at scores ≤3.0, sites were at least 1.5-fold enriched compared to the control sets of chimeric miRNAs constructed so as to preserve the seeds as well as the overall dinucleotide and trinucleotide compositions of authentic miRNAs (Figure 5C). Repeating the analyses with annotated murine miRNAs yielded analogous results (Figure S4C–E, Table S3B).
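The general idea of a position-weighted alignment penalty can be illustrated with a short sketch. The rubric actually used is the one defined in Figure 5A, which is not reproduced in the text; the weights below (1 per mismatch, 0.5 per G:U wobble, doubled within a core region) are placeholder assumptions loosely modeled on plant target-prediction rubrics, and the sketch handles only ungapped duplexes. All names and the guide sequence are hypothetical.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLES = {("G", "U"), ("U", "G")}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    complement = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(rna))


def alignment_penalty(mirna, site, mismatch=1.0, wobble=0.5,
                      core=range(2, 14), core_factor=2.0):
    """Score an ungapped miRNA:site duplex; lower scores mean better pairing.

    mirna and site are both written 5'->3' and must have equal length.
    The default weights and core region are illustrative assumptions,
    not the values of the study's rubric.
    """
    if len(mirna) != len(site):
        raise ValueError("this sketch handles ungapped duplexes only")
    score = 0.0
    # miRNA position 1 pairs with the 3'-most nucleotide of the site.
    for position, pair in enumerate(zip(mirna, reversed(site)), start=1):
        if pair in PAIRS:
            continue
        penalty = wobble if pair in WOBBLES else mismatch
        score += penalty * core_factor if position in core else penalty
    return score


# Arbitrary illustrative guide (not a sequence from the study).
guide = "UACGGAUCCAGUUGCAUCGAGU"
perfect_site = reverse_complement(guide)
```

Candidate sites would then be retained when their score falls at or below a chosen threshold, in the manner of the score cutoffs quoted in the text.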
The higher abundance of extensive matches to miRNAs compared to that of controls might indicate biological function. However, eukaryotic genomes, complex tapestries containing remnants of innumerable duplications and repetitive elements, are far from random, and thus this abundance might simply be a consequence of the miRNAs and sites sharing common ancestry. To distinguish between these possibilities, we examined the conservation of orthologous sites in five mammalian species, as assessed using a conservation-alignment (CA) score (Figure 5D). When applied to sites for distinct miRNAs conserved throughout mammals, 17 miRNA:site duplexes had CA scores ≤3.0 (Figure 5E), most of which were unlikely to be conserved by chance (Figure 5F). Four of the 17 top-scoring sites were miR-151-5p targets (Table S3C).
Having found evidence that the most extensively paired sites were more abundant and more conserved than expected by chance, we returned to the degradome sequencing data to search for evidence that these sites were cleaved in the cell. Because the degradome sequencing data included intermediates of normal mRNA decay, steps were taken to distinguish AGO2 cleavage products from other decay intermediates. To do this, we considered the tag possession ratio (TPR), which represented the proportion of predicted miRNA:site duplexes that were represented by tags at the expected cleavage site (Figure 6A). When focusing on the miRNAs and mRNAs expressed in HeLa, miRNA:site duplexes with alignment penalty scores ≤2.5 possessed significantly more cleavage tags at the expected cleavage site than did control duplexes (Figure 6B and Table S4B, Fisher’s exact test, P = 1.1 × 10⁻⁴). Even after excluding tags mapping to multiple loci, this TPR difference remained both substantial and significant (Figure 6C and Table S4B, P = 2.6 × 10⁻⁴). MicroRNA-directed cleavage in Arabidopsis sometimes occurs at ±1 nt from the expected cleavage site (Addo-Quaye et al., 2008). When applying a window of ±1 nt, there was no improvement in the TPR of expressed miRNA:site pairs (Figure S5A and Table S4B). As an added control, we repeated the analysis for miRNAs that were not expressed in HeLa cells and found that these miRNAs performed similarly to the chimeric miRNA controls (P = 1.0) and significantly worse than the miRNAs expressed in HeLa cells (P = 5.3 × 10⁻⁵). These results strongly indicated that for miRNA:site pairs with favorable alignment scores (≤2.5), most tags at the expected cleavage site did not arise from background 5’→3’ degradation but instead were the consequence of miRNA-directed mRNA cleavage.
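The TPR comparison can be sketched in a few lines. The Python below is an illustration under stated assumptions rather than the analysis code: expected cleavage sites and degradome tag 5’ ends are represented as hypothetical (transcript, position) pairs, and significance is assessed with a one-sided Fisher's exact test computed from the hypergeometric distribution. The text does not state whether the reported test was one- or two-sided, so the one-sided form is an assumption.

```python
from math import comb


def tag_possession(expected_sites, tag_positions):
    """Return (sites_with_tag, total_sites).

    expected_sites: iterable of (transcript_id, position) pairs giving the
    expected cleavage position of each predicted miRNA:site duplex.
    tag_positions: set of (transcript_id, position) pairs giving the 5' ends
    of degradome tags.
    """
    sites = list(expected_sites)
    with_tag = sum(1 for site in sites if site in tag_positions)
    return with_tag, len(sites)


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test on the 2x2 table [[a, b], [c, d]].

    Rows are authentic and control duplexes; columns are duplexes with and
    without a tag. Returns the probability of seeing at least `a` authentic
    duplexes with tags when the margins are held fixed.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    tail = sum(
        comb(row1, x) * comb(row2, col1 - x)
        for x in range(a, min(row1, col1) + 1)
    )
    return tail / total


# Made-up example data, for illustration only.
tags = {("mRNA1", 100), ("mRNA3", 8)}
authentic = [("mRNA1", 100), ("mRNA2", 50), ("mRNA3", 7)]
hits, total = tag_possession(authentic, tags)
tpr = hits / total
```

The TPR is the ratio hits / total; comparing the authentic and control counts in a 2×2 table then gives the kind of P value quoted above.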
Using an alignment penalty score of 2.5, a threshold at which the cumulative TPR difference between signal and background was most significant in HeLa data (Table S4B), we found eight miRNA-directed cleavage targets with tags precisely at the expected cleavage site (Table 1 and Figure S5B). All eight cleavage sites were in 3’UTRs, and half were conserved in other mammals (Table 1 and Figure S5B). Four of the pairs involved miR-151-5p (Figure 6E–G and Table 1). miR-196a and its cleavage target HOXB8 are both known to be moderately expressed in HeLa cells (Lim et al., 2005), and as expected HOXB8 was among the eight (Figure 6H).
To extend our results beyond cells in culture, we performed degradome sequencing using poly(A)-selected RNA from whole human brain. Sequencing yielded 9,240,114 reads mapping to the human genome, with a diversity of 2,360,502 unique tag sequences. MicroRNAs expressed in human brain tissues were found by small-RNA sequencing (Table S4C). As in HeLa cells, we found a statistical association between the miRNA:site pairs and cleavage tags for miRNAs and mRNAs expressed in brain (Figures 6D and S5D). For pairs with alignment score ≤3.0, the TPR was significantly higher than that of the controls (Table S4B, P = 0.008 and P = 0.030, nonexpressed and chimeric controls, respectively). Statistical significance was retained when also including tags mapping 1 nt downstream of the expected cleavage site as diagnostic of cleavage (Table S4B, P = 0.011 and P = 0.013, nonexpressed and chimeric controls, respectively), perhaps because some 5’→3’ trimming occurred in the animal, where we could not knock down XRN1 activity. Eleven sites with scores ≤3.0 had tags suggestive of miRNA-directed cleavage (Table S4D) at the expected position (Table 1) and two had tags suggestive of cleavage at position −1 (Figure S5E). Three of the 13 matched miR-151-5p and included N4BP1, which was also identified in HeLa cells. FRS2, a proposed target of miR-182, was also identified in HeLa cells. Four of the miRNA:site pairs newly identified in brain appeared conserved in other mammals (CA ≤3.0; Table 1 and Figure S5E).
We present centered sites as a type of miRNA target site. Centered sites contain at least 11 contiguous nucleotides that pair to a miRNA at positions 4–14 or 5–15, a pairing pattern distinct from that of most 3’-compensatory sites and seed sites. However, because a centered site might include additional nucleotide pairing on either side and a 3’-compensatory site might have additional pairing extending into the miRNA central region, there is potential overlap between a few extended centered sites and a few 3’-compensatory sites. Similarly, a seed site might include 3’-supplementary pairing extending into the miRNA central region, which creates potential overlap between a few extended centered sites and a few 3’-supplementary sites. However, such overlap with previously known site types is very rare. For example, a search of annotated human 3’UTRs revealed that for most human miRNAs, no seed-matched sites extend into centered sites; i.e., most human miRNAs have no 3’UTR match with contiguous Watson–Crick pairing to nucleotides 2–14. Furthermore, conservation analysis and array data show that seed-type targets prefer to acquire supplemental pairing at positions 13–16 rather than extending pairing through nucleotides 9–12 (Grimson et al., 2007).
The reason that centered sites had not been described previously can be explained by their relatively low abundance, which resembles that of 3’-compensatory sites and is far lower than that of seed-matched sites. Although no more effective than 7-nt seed-matched sites, centered sites are 4 nt longer, leading to an informational complexity ~250-fold (~4⁴-fold) greater than that of 7-nt sites and a correspondingly increased difficulty for their emergence and retention during evolution. The rarity of centered sites hampers statistical assessment of whether they are subject to evolutionary conservation. Nonetheless, the conserved miRNAs of mammals each match an average of 13 centered sites in human 3’UTRs (Figure S1J), and based on our zebrafish analyses we estimate that on average about two sites per miRNA both reside in messages coexpressed with the miRNA and mediate repression. The presence of even a few beneficial interactions (species-specific or more broadly conserved) for a subset of the miRNAs could impart at least intermittent pressure to preserve the miRNA sequence, thereby explaining the preferential conservation observed in the central region of vertebrate miRNAs (Figure 1A). Moreover, centered sites resemble 3’-compensatory sites in providing a mechanism by which different members of the same miRNA seed family can repress distinct targets (Bartel, 2009).
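The fold difference in informational complexity can be worked out explicitly. As an idealization introduced here (not a calculation from the study), assume the four nucleotides are equally likely and independent at each position, and let P_k denote the probability that a random k-nt stretch matches a given k-nt site:

```latex
P_k = 4^{-k}, \qquad
\frac{P_{7}}{P_{11}} = \frac{4^{-7}}{4^{-11}} = 4^{4} = 256 \approx 250 .
```

Under this assumption, a chance match to an 11-nt centered site is expected about 256 times less often than a chance match to a 7-nt seed site in the same amount of sequence.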
Why would centered sites require so much more contiguous pairing than that required by seed sites? When bound by the Argonaute protein within the silencing complex, the seed is thought to be pre-organized to favor Watson-Crick pairing to the mRNA (Bartel, 2004). In the current version of this seed-nucleation model, pairing cannot propagate to the center of a miRNA without a substantial conformational change in which the original contacts between Argonaute and the miRNA central and 3’ regions are disrupted (Bartel, 2009). Disrupting these contacts offsets some binding energy gained in forming the central pairs, causing contiguous pairing adjacent to the seed to contribute less affinity than might have otherwise been expected. This lower contribution of pairing to the central region, combined with the higher contribution achieved by the pre-organized seed, would explain why so much more pairing is needed for centered sites to achieve the same outcome as 7-nt seed sites.
Our results shed new light on the biochemistry of RNAi. We suggest that at 37°C in the low Mg2+ concentrations present in the cell, only the extensively paired sites can be bound with the stability and conformation that favors mRNA cleavage, and that after cleavage the products are not so tightly bound as to slow multiple turnover. In higher Mg2+, however, less extensively paired sites achieve the stability and conformation needed for cleavage, and product release is more apt to slow turnover. This model explains the reduction of both specificity and efficiency at extensively paired sites observed in high Mg2+ concentrations. Under these conditions, less extensively paired sites are more readily cleaved; hence the reduced specificity. The more extensively paired sites, on the other hand, undergo slower product release and gain little benefit from this more permissive binding-and-cleavage regime. Indeed, any benefit gained is more than offset by the tighter binding of the miRNA to less extensively paired sites, which causes the total cellular RNA present in extracts used for cleavage reactions to more effectively inhibit utilization of the labeled substrates; hence the reduced efficiency. The free cytoplasmic Mg2+ concentration in most cells and tissues is < 1 mM (Gunther, 2006), suggesting that cleavage specificity is very high in vivo.
Our results explain previous observations regarding the effects of adding phosphate-containing compounds to in vitro cleavage reactions. Many diverse phosphate compounds, including inorganic monophosphate, stimulate the multiple-turnover cleavage by the mammalian silencing complex (Gregory et al., 2005). We suggest that these phosphate compounds titrate the free Mg2+, which in turn increases product turnover through decreased RNA duplex stability.
We find that miRNA-directed cleavage of mammalian mRNAs, although even more rare than repression at centered sites, occurs more frequently than previously appreciated. Two endogenous cleavage targets had been reported in mammals, HOXB8 and RTL1 (Yekta et al., 2004; Davis et al., 2005). We substantially add to this list, with evidence for cleavage of seven additional targets in HeLa cells and cleavage of thirteen in human brain, two of which overlapped with HeLa targets. This small overlap, largely attributed to differential expression of the miRNAs or mRNAs in the two samples (Tables 1, S4A, S4C and S4D), suggests that as more tissues are examined, more cleavage targets will be found.
The fraction of degradome sequencing tags that provided evidence of miRNA-directed cleavage was generally higher in the HeLa analysis than in the brain analysis (Table 1; Figure S5B and S5E). In the brain, this fraction of cleavage tags was sufficiently low as to suggest that some might represent degradation intermediates not indicative of miRNA-directed cleavage. Whether a smaller fraction of brain messages are cleaved, however, is unclear. The brain analysis lacked the benefit of the XRN1-exonuclease knockdown, designed to stabilize the transient 3’ cleavage product so that it could be more readily detected over the background of metastable mRNA-decay intermediates. Moreover, whole brain has many cell types, with the possibility that differential expression of a miRNA and its cleavage targets might decrease the signal relative to background. Nonetheless, for most cleavage targets in HeLa and for some in brain, degradome profiles resembled those of plant targets with validated biological relevance (Figures 6F–H, S5Bi and S5Ei) (Addo-Quaye et al., 2008; German et al., 2008), strongly supporting the hypothesis that the miRNA-directed cleavage pathway is an important degradation pathway for those mRNAs.
In both brain and HeLa cells, several cleavage targets identified were targets of miR-151-5p. This miRNA derives from a hairpin that has homology to the L2 subclass of repeat elements known as long interspersed nuclear elements (LINEs). L2 LINE elements are remnants of a non-LTR retrotransposon activity present in the common ancestor of mammals. They make up over 3% of the human genome (Kamal et al., 2006). Indeed, the miR-151 hairpin is derived from a tail-to-tail arrangement of two L2 fragments (Figure 6E). Hence, miR-151-5p derived from L2(+) is strongly complementary to several target sites derived from L2(−) repeats. Analogous tail-to-tail arrangements of short (S)INE fragments produce transcripts with longer hairpins that are processed in mouse ES cells into endogenous siRNAs (Babiarz et al., 2008). However, miR-151-5p and miR-151-3p are typical miRNAs, in that 1) their accumulation depends on both Drosha/DGCR8 and Dicer endonucleases (Babiarz et al., 2008), 2) they pair to each other with 2-nt 3’ overhangs, 3) they are the two dominant products accumulating from the hairpin (Figure S5C), and 4) their hairpin has a conservation pattern typical of other conserved miRNAs (Figure 6E).
Two other miRNAs that direct cleavage in HeLa, miR-28-5p and miR-545*, are also L2 repeat–derived miRNAs. The notion that these miRNAs and their targets ultimately derived from the same ancestral elements is reminiscent of the origin of some plant miRNAs, which derive from duplicated fragments of their cleavage targets (Allen et al., 2004; Rajagopalan et al., 2006). In mammals, however, the miRNAs and target sites evolved in parallel from the common ancestor, rather than one from the other. Moreover, in mammals, common ancestry between the miRNAs and their targets can be detected for older, conserved miRNAs, such as miR-151 and miR-28, whereas in plants common ancestry has been detected only for younger, nonconserved miRNAs.
The observation that many of the cleaved mRNAs were the targets of repeat-derived miRNAs can be explained by the fact that repeat-derived miRNAs are more likely to encounter extensively complementary matches, since repeat-element remnants are found within many mRNAs. Over the course of evolution, repeat-derived miRNAs presumably had access to a wide variety of cleavage targets, providing the opportunity for some favorable regulatory interactions to emerge and be retained as conserved cleavage interactions. Thus, the repeat-derived miRNAs and their cleavage targets provide yet another avenue for repetitive elements to shape the regulation of cellular genes.
The discovery of centered sites raises the question of how many additional site types remain to be found. On the one hand, transcriptome/proteome changes observed after introducing or deleting a miRNA can all be explained by direct interactions between the miRNA and messages with the five known site types (seed sites, 3’ supplementary seed sites, 3’ compensatory sites, centered sites, and cleavage sites), combined with indirect effects as changes in the primary targets influence expression of secondary targets. On the other hand, detailed experimental follow-up on mRNAs that respond to the miRNA despite lacking any of these established site types seems to indicate that some of them should not be dismissed as secondary targets but might instead be direct targets (Lal et al., 2009). However, the pairing schemes proposed thus far for these unusual interactions have not been defined sufficiently to provide predictive utility. That is, in contrast to centered sites and the previously known site types, these pairing schemes lack the specificity required to distinguish other responsive messages with similar pairing from background. Hence, experiments like that shown in Figure 1C–F cannot distinguish responsive messages that satisfy these unusual pairing schemes from nonresponsive messages that do not. Perhaps unknown factors binding to neighboring UTR elements help achieve interaction specificity differently for each individual mRNA in a manner too idiosyncratic to be generalized into site types. Alternatively, future insights into miRNA targeting might identify commonalities in these unusual interactions, which could form the basis of novel site types with predictive value.
A detailed description of all materials and methods used can be found in the Supplemental Text.
Array analyses were as described (Grimson et al., 2007). Luciferase reporter constructs were prepared as described (Grimson et al., 2007), and assays were performed as described (Farh et al., 2005). In vitro cleavage reactions were essentially as described (Haley and Zamore, 2004; Shin, 2008). Uncapped 5’ ends of GSTM3 mRNA degradation products were identified using the 5’-RACE kit (Invitrogen), as described (Jones-Rhoades and Bartel, 2004), starting with cells in which XRN1 mRNA was knocked down by more than 90%, as confirmed by RT-PCR (Aleman et al., 2007). Degradome libraries were constructed essentially as described (Addo-Quaye et al., 2008). Small-RNA libraries were prepared for Illumina sequencing as described (Grimson et al., 2008).
Of the 223 miRNA genomic loci producing 197 mouse miRNAs conserved in other mammals (Friedman et al., 2009), the 203 loci producing miRNAs with 5’ ends validated by large-scale profiling of mouse miRNAs (Chiang et al., 2010) were used in the analysis of Figure 1A.
After removing linker sequences and tags shorter than 20 nt, degradome tags were mapped to RNAs annotated in Ensembl (http://www.ensembl.org/), requiring a perfect match. To find “multiple loci tags” and tags that did not map to annotated RNAs, filtered tags were mapped to the human genome (hg18, http://genome.ucsc.edu/). When determining TPRs, filtered tags were mapped to a curated set of distinct mRNAs (Baek et al., 2008). Expressed mRNAs were those represented by at least one degradome tag.
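The length filter and perfect-match mapping described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names and the naive substring search are hypothetical, linker removal is assumed to have been done already, and transcripts are assumed to be held in a plain dictionary of name to sequence.

```python
from collections import defaultdict

MIN_TAG_LENGTH = 20  # tags shorter than 20 nt are discarded, as in the text


def filter_tags(tags, min_length=MIN_TAG_LENGTH):
    """Keep only tags of at least min_length nt (linkers assumed already removed)."""
    return [tag for tag in tags if len(tag) >= min_length]


def map_perfect(tags, transcripts):
    """Map each tag to every (transcript name, 0-based offset) with an exact match."""
    hits = defaultdict(list)
    for tag in tags:
        for name, seq in transcripts.items():
            start = seq.find(tag)
            while start != -1:
                hits[tag].append((name, start))
                start = seq.find(tag, start + 1)
    return dict(hits)


def expressed_mrnas(hits):
    """Names of transcripts represented by at least one mapped tag."""
    return {name for positions in hits.values() for name, _ in positions}
```

A tag that maps to more than one entry in `hits` would correspond to a candidate for the multiple-locus category; a real analysis would use an indexed aligner rather than a linear scan.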
When searching for miRNA:site duplexes, distinct mRNAs and miRNAs were selected to avoid over-counting predicted duplexes involving miRNA families or mRNA isoforms. To select distinct miRNAs, all human miRNAs and miRNA* sequences (miRBase 11.0) were aligned and classified into groups whose members differed from each other at ≤5 positions. The miRNA with the lowest miRBase annotation number was selected as the representative from each group. For distinct mRNAs, the mRNA isoform with the longest 3’UTR (or, if all 3’UTRs were of the same length, a randomly chosen isoform) was selected from a previously filtered set of RefFlat and H-INV annotations (Baek et al., 2008).
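One way to picture the selection of distinct miRNAs is the sketch below. It rests on several assumptions that the text does not specify: sequences are compared without gaps from the 5’ end, any length difference is counted as mismatches, and grouping is greedy in order of annotation number. The record layout `(annotation_number, sequence)` and all function names are hypothetical.

```python
def count_differences(a, b):
    """Number of positions at which two sequences differ.

    Assumption: ungapped comparison from the 5' end, with any length
    difference counted as additional mismatches.
    """
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches + abs(len(a) - len(b))


def group_mirnas(mirnas, max_diff=5):
    """Greedily group (annotation_number, sequence) records.

    A record joins the first group in which it differs from every
    current member at <= max_diff positions; otherwise it starts a new group.
    """
    groups = []
    for record in sorted(mirnas):  # lowest annotation number first
        for group in groups:
            if all(count_differences(record[1], member[1]) <= max_diff
                   for member in group):
                group.append(record)
                break
        else:
            groups.append([record])
    return groups


def representatives(groups):
    """Select the member with the lowest annotation number from each group."""
    return [min(group) for group in groups]
```

Requiring agreement with every member of a group, rather than with any one member, keeps each group consistent with the stated ≤5-position criterion, although the resulting groups can depend on processing order.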
To search for orthologous sites, we used 165 distinct miRNAs conserved among mammals and a 6-way genome alignment (human, mouse, rat, dog, horse, and pig) from the UCSC genome browser (hg18, http://genome.ucsc.edu/). Alignment penalty scores were determined, and the second-worst score, rather than the worst score, was selected as the CA score to accommodate some genome-alignment errors, incomplete genome sequences, and species-specific losses.
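The choice of the second-worst score can be illustrated with a short sketch. It assumes one penalty score per aligned species and that a higher penalty means a worse alignment; neither detail is stated in the text, and the function name `ca_score` is hypothetical.

```python
def ca_score(penalties_by_species):
    """Return the second-worst alignment penalty across species.

    Assumption: higher penalty = worse alignment, so the second-worst
    score is the second-highest value.
    """
    scores = sorted(penalties_by_species.values(), reverse=True)
    if len(scores) < 2:
        raise ValueError("need penalty scores from at least two species")
    return scores[1]
```

With this rule, a single outlying species (for example, one with a gap in its genome assembly) does not determine the score on its own.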
To generate controls with the same seed composition and same trinucleotide composition as authentic miRNAs, chimeric miRNA sequences were created by reciprocal recombination, using the link between nucleotides 10 and 11 as the crossover breakpoint. The two miRNAs recombined were randomly chosen (without replacement) from miRNA pairs with the same dinucleotide at positions 10 and 11, considering only our set of distinct miRNAs. Ten chimeric miRNA cohorts were generated to estimate the signal-to-background ratios.
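The recombination scheme can be sketched as below. This is an illustration only: the function names, the seeded random generator, and the handling of a leftover miRNA in a bin of odd size (it is skipped) are assumptions not taken from the text.

```python
import random
from collections import defaultdict

BREAKPOINT = 10  # crossover falls between nucleotides 10 and 11 (1-based)


def make_chimeras(mirnas, rng):
    """Build one cohort of chimeras by reciprocal recombination.

    miRNAs are binned by the dinucleotide at positions 10-11, shuffled
    within each bin, and paired off without replacement. A leftover miRNA
    in an odd-sized bin is skipped (assumption).
    """
    bins = defaultdict(list)
    for seq in mirnas:
        bins[seq[BREAKPOINT - 1:BREAKPOINT + 1]].append(seq)

    chimeras = []
    for key in sorted(bins):  # fixed bin order, reproducible with a seeded rng
        members = bins[key]
        rng.shuffle(members)
        for a, b in zip(members[0::2], members[1::2]):
            chimeras.append(a[:BREAKPOINT] + b[BREAKPOINT:])
            chimeras.append(b[:BREAKPOINT] + a[BREAKPOINT:])
    return chimeras


def make_cohorts(mirnas, n_cohorts=10, seed=0):
    """Generate n_cohorts independent chimeric cohorts (ten in the text)."""
    rng = random.Random(seed)
    return [make_chimeras(mirnas, rng) for _ in range(n_cohorts)]
```

Because both partners share nucleotides 10 and 11, every trinucleotide spanning the breakpoint in a chimera also occurs at the same position in one of its two parents, and each chimera keeps the intact 5’ region, including the seed, of one parent.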
We thank Andrew Grimson, Daehyun Baek, and Alexander Subtelny for helpful discussions, Shujun Luo and Gary Schroth for Illumina sequencing of the small-RNA library from brain, and the Whitehead Genome Technology Core for the remaining Illumina sequencing. Supported by a Damon Runyon postdoctoral fellowship (C.S.) and a grant from the N.I.H. D.B. is a Howard Hughes Medical Institute Investigator.