Some miRNA* species are relatively abundant in total RNA
Initial analysis of ~4,000 D. melanogaster small RNA sequences yielded clones from 62 miRNA loci (ref. 26). miRNA* species were cloned for nine loci; however, only one of these was cloned more than twice (miR-2a-2*, four clones). These and other early cloning efforts contributed to the prevailing view that miRNA* species are, by and large, rare RNAs. More recent analysis of >1 million small RNA sequences that aligned to the D. melanogaster genome (GEO dataset GSE7448) not only identified new miRNA genes, but also yielded a nearly comprehensive set of cloned miRNA* species (ref. 25). These data permitted detailed analyses of miRNA* biology.
Inspection of 316,927 miRNA and 28,465 miRNA* clones revealed that many miRNA* species are actually relatively abundant. Whereas 60 out of 134 miRNA genes showed ≥20:1 strand bias, 29 out of 134 miRNA genes showed ≤5:1 strand bias. Some of these ratios were uncertain owing to low numbers of reads; still, 16 genes in the ‘low’ strand bias set were confidently sampled by >100 reads. The entire set of miRNA:miRNA* read counts and ratios, organized by gene and library of origin, is presented in Supplementary Table 1.
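The strand-bias categories above amount to simple arithmetic on read counts. A minimal Python sketch follows, assuming per-gene miRNA and miRNA* read counts are available as plain integers. The function names and the 'intermediate' label are illustrative and are not taken from the original analysis; only the ≥20:1 and ≤5:1 thresholds and the pooled clone totals come from the text.

```python
def strand_ratio(mirna_reads: int, star_reads: int) -> float:
    """miRNA:miRNA* read ratio; infinite when no miRNA* reads were cloned."""
    if star_reads == 0:
        return float("inf")
    return mirna_reads / star_reads


def classify_bias(ratio: float, high: float = 20.0, low: float = 5.0) -> str:
    """Bin a ratio using the thresholds quoted in the text (>=20:1, <=5:1)."""
    if ratio >= high:
        return "high"
    if ratio <= low:
        return "low"
    return "intermediate"  # label assumed here; the text does not name this bin


# Pooled clone totals reported in the text.
pooled = strand_ratio(316_927, 28_465)
print(f"pooled miRNA:miRNA* ratio = {pooled:.1f}:1")
print(classify_bias(pooled))
```

The pooled ratio falls between the two thresholds, which illustrates why per-gene ratios, rather than the aggregate, are needed to reveal loci with low strand bias.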
The absolute number of miRNA* clones recovered from abundantly expressed loci was greater than the miRNA clone counts from many lowly expressed loci. The interpretation of this is ambiguous, as seemingly ‘rare’ miRNAs cloned from a whole animal might be highly expressed in a specific cell type. However, analysis of S2 cells showed that many miRNA* species were more abundant than many miRNA species in this single cell type. For example, we recovered >50 clones for seven miRNA* species in S2 cells (miR-276a*, bantam*, miR-34*, miR-2a-2*, miR-282*, miR-996* and miR-306*), whereas 40 of the S2-expressed miRNAs had <50 clones.
We validated the steady-state accumulation of miRNA and miRNA* species using northern analysis of total D. melanogaster RNAs ( and Supplementary Fig. 1 online). We easily detected miRNA* species from members of miRNA families (mir-10, mir-276a) and from unique miRNA genes (mir-306, mir-184). Together with the cloning data, this suggested that many miRNA* species are present at levels that are conceivably biologically relevant.
Figure 1 Both miRNA and miRNA* species can be detected in total RNA. RNA was analyzed from D. melanogaster cells at different stages: E, 0–24 h embryos; LP, 3rd instar larvae and mixed pupae; A, adult males and females; S, S2 cells. Each blot was sequentially ...
Preferential stability of highly conserved miRNA* species
The mere existence of miRNA* species in total RNA does not establish their function as regulatory RNAs. A trivial alternative interpretation is that certain discarded miRNA* strands are degraded more slowly than others. We addressed this by examining the ratio of miRNA:miRNA* reads across successive time points in D. melanogaster embryonic development. Indeed, the miRNA:miRNA* ratio of many loci became increasingly skewed as development proceeded. For example, the ratio of miR-286 to miR-286* increased from 1.6 to 4.8 to 51.6 across three consecutive stages of embryo development (Supplementary Table 2 online). This trend is consistent with the preferred stability of miRNA species and concomitant turnover of miRNA* species.
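To make the trend concrete, the miR-286 ratios quoted above can be compared stage by stage. The sketch below is illustrative only: the helper name is hypothetical, and the three ratios are the only values taken from the text.

```python
def fold_changes(ratios):
    """Fold change in the miRNA:miRNA* ratio between consecutive stages."""
    return [later / earlier for earlier, later in zip(ratios, ratios[1:])]


# miR-286 : miR-286* ratios at three consecutive embryonic stages (from the text).
mir286_ratios = [1.6, 4.8, 51.6]

steps = fold_changes(mir286_ratios)
overall = mir286_ratios[-1] / mir286_ratios[0]
increasingly_skewed = all(step > 1 for step in steps)

print([round(step, 2) for step in steps])
print(round(overall, 2))
print(increasingly_skewed)
```

A ratio that rises at every step is what would be expected if the miRNA strand is retained while its miRNA* partner is progressively turned over.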
miR-286 derives from a cluster of eight miRNAs: mir-309, mir-3, mir-286, mir-4, mir-5, mir-6-1, mir-6-2 and mir-6-3 (). Note that although the miRNA products of the mir-6 genes are identical, their respective miRNA* species are distinct. The isolation of 297 miR-6-1* clones, 373 miR-6-2* clones and 858 miR-6-3* clones provided evidence for fairly comparable processing of each of the three mir-6 genes (Supplementary Table 1). As was done in a previous analysis (ref. 25), we divided the total miR-6 clone counts in each library by three to estimate the output of each individual mir-6 gene.
Figure 2 Highly conserved miRNA* species accumulate to higher relative levels at steady state. (a) Evolution of the mir-309→mir-6 cluster. Both miRNA (green) and miRNA* (yellow) sequences of mir-4 and mir-5 are perfectly conserved across 12 Drosophilids. ...
All eight miRNA* species from this operon were abundant at the earliest time point (0–1 h after egg laying), with six miRNA:miRNA* pairs cloned at a ratio of 4:1 or less. However, six loci showed miRNA:miRNA* ratios that rose rapidly with age (in 2–6-h, 6–10-h and 12–24-h embryos), usually exceeding 50:1 (). The exceptions were mir-4 and mir-5, whose ratios rose to only 8:1 (mir-4) or actually decreased to a terminal ratio of 2.2:1 (mir-5). Notably, these were the same genes of the cluster whose miRNA* sequences were most highly conserved. In fact, both miRNA and miRNA* of mir-4 and mir-5 are perfectly conserved among 12 Drosophilids ().
We asked whether these trends applied among miRNA genes more generally. For this purpose, we selected the 26 miRNA loci that produced at least 50 clones in each of four successive embryo time points (Supplementary Table 2), values that ensured that their miRNA:miRNA* ratios were quantitatively meaningful. These genes collectively showed miRNA:miRNA* ratios that rose sharply, were stable or even decreased with embryo age (). Notably, the nine genes whose ratios remained lowest (bantam, mir-5, mir-92a, mir-2a-2, mir-4, mir-8, mir-996, mir-7 and mir-2b-2) were all genes whose miRNA* species were perfectly conserved among 11 or 12 Drosophilid species (, ‘blue’ genes, and Supplementary Fig. 2a,b online). These observations generalize the correlation between the degree of nucleotide conservation of miRNA* species and their tendency to accumulate to higher levels at steady state.
miRNA* species populate AGO1 complexes
We proposed that the correlation between the evolutionary constraint of miRNA* strands and their expression level might reflect their usage as endogenous regulatory RNAs and sought evidence for this by asking whether any miRNA* species were physically associated with effector complexes. To do so, we immunoprecipitated endogenous Argonaute-1 (AGO1) and probed this fraction for endogenous miRNA and miRNA* species. These experiments detected miR-34:miR-34*, miR-184:miR-184* and miR-276:miR-276* in association with AGO1 in S2 cells, and miR-5:miR-5* and miR-10:miR-10* from 0–10-h embryos (). The fraction of miRNA* species that associated with AGO1, relative to their total cellular content, was in many cases comparable to that of their partner miRNA species. A notable exception was mir-184, for which much less of the miRNA* detected in total RNA was associated with AGO1 relative to miRNA. This provided compelling evidence that the immunoprecipitation assay reports on a small RNA population that is distinct from total RNAs, and probably reflects the active sorting of miRNAs and miRNA* species into regulatory complexes (refs. 27,28). We take these data as evidence for active miRNA* sorting in both cultured cells and in the animal. At the same time, these data suggest that caution should be applied in ascribing function to small RNAs detected in total RNA because they contain species that are rejected from sorting complexes and/or await their degradation.
Figure 3 miRNA and miRNA* species can be co-immunoprecipitated with AGO1. Each blot contains input total RNA, the immunoprecipitate (IP) of mouse anti-T7 (as a control) and mouse anti-AGO1, and the supernatant (sup) of anti-T7 and anti-AGO1 incubations. IPs were ...
Validation of the regulatory activity of miRNA* species
We next tested the regulatory potential of miRNA* species using assays previously used to validate miRNA targets. Active sorting of miRNA processing intermediates was reported to influence the type and/or level of target regulation in heterologous tests (ref. 27). Nevertheless, we find that perfect target sites are usually more sensitive than imperfect target sites, even when the bulk of the small RNA partitions into AGO1 complexes. This is probably due to the much stronger cleavage activity of AGO2 relative to AGO1 (ref. 27).
We therefore designed artificial targets containing four tandem sequences antisense to miR-iab-4-5p or miR-iab-4-3p (the left- and right-arm products of mir-iab-4) downstream of the Renilla luciferase coding region in psiCHECK2, a vector that also contains a control firefly luciferase gene. We then examined their response to expression constructs for mir-iab-4 or mir-315. We earlier showed that this mir-315 construct is biologically active and strongly represses miR-315 target genes (ref. 29) and, thus, represents an appropriate noncognate control. The miRNA and miRNA* sensors were strongly repressed (20-fold to 40-fold) by mir-iab-4, whereas these sensors showed no response to mir-315 (). We and others have shown that the antisense strand of mir-iab-4 encodes a functionally distinct miRNA hairpin termed mir-iab-8. We tested left-arm and right-arm sensors for this pre-miRNA and again observed strong and specific repression of both miR-iab-8-5p and miR-iab-8-3p sensors (). Together, these results clearly demonstrate that miRNA* species can have regulatory capability.
Figure 4 Sensor assays in cultured cells and transgenic animals validate the regulatory activity of miRNA* species. (a) Repression by miRNA* species (colored bars) from canonical precursors. Luciferase sensors bearing complementary sites to miR-iab-4-5p/miR-iab-3p ...
Short intronic hairpins termed ‘mirtrons’ provide a secondary source of miRNA precursors that are independent of canonical nuclear miRNA processing (refs. 4–6). Drosophila melanogaster mirtrons are strongly biased to yield right-arm products, possibly because their left-arm products begin with a G residue, which is rare among mature miRNAs. Still, we asked whether the left-arm products of a mirtron might also be functional. We assayed the response of sensors for miR-1010 and miR-1010* to ectopic mirtron expression constructs for mir-1010 or mir-1003. We observed strong repression of both sensors by mir-1010 but not by mir-1003 (), indicating that miRNA* functionality extends to mirtron precursors as well.
Finally, we tested whether miRNA* species could regulate target genes in the animal using a repression assay in the D. melanogaster wing imaginal disc (ref. 34). We recently used this assay to show that mir-iab-4 and mir-iab-8 could selectively repress tub-GFP sensor transgenes carrying perfect target sites for their respective left-arm products, miR-iab-4-5p and miR-iab-8-5p (ref. 33). We now prepared transgenic animals carrying sensors for the right-arm, ‘3p’ species and tested these in parallel with their partner, the left-arm, ‘5p’ sensors. We observed that ectopic mir-iab-4 could repress both its 5p sensor (, above) and its 3p sensor (, below), although the regulation of the 3p sensor was weaker. We also found that ectopic mir-iab-8 strongly inhibited both its 5p (, above) and 3p (, below) sensors. Overall, these data provide convincing evidence that miRNA* species are capable of repressing targets in both cultured cells and in the animal.
Patterns of miRNA* evolution are consistent with their regulatory potential
Our tests show that miRNA* species can populate regulatory complexes to guide target repression. Nevertheless, this could, in principle, be fortuitous. For example, a certain degree of imprecision in miRNA strand selection might be of neutral consequence and thus tolerated in vivo. However, this view is not consistent with the well-documented and adverse consequences of small interfering RNA (siRNA) off-targeting (ref. 35). Rather, to avoid undesirable regulation of cellular transcripts, we proposed that many miRNA* species may have infiltrated endogenous regulatory networks during evolution. To obtain evidence for this model, we examined the patterns of miRNA gene conservation across 12 Drosophilid species. The conservation of miRNA sequences, with particular constraint on their seed regions, was previously taken to reflect their sequence-based, trans
. We reasoned that the same logic might apply to miRNA* sequences and seeds.
In fact, the possibility of trans-acting activity for miRNA* species was hinted at by earlier computational efforts for miRNA gene finding (ref. 36). We observed that miRNA* species diverge much more slowly than miRNA terminal loops, a property that strongly aids the identification of functional animal miRNA hairpins as ‘saddle’ structures (refs. 36,37). The extent to which miRNA* strands are constrained in their primary nucleotide sequence is not adequately explained solely by pressures to maintain particular secondary structures, which would predict a higher frequency of compensatory mutations than is observed during evolution.
We systematically examined a set of 131 D. melanogaster canonical miRNA genes, almost all of which had cloned miRNA* species (ref. 25). Of these, 31 miRNA* sequences were completely conserved among all 12 sequenced Drosophilids (Supplementary Fig. 2a), and another 23 miRNA* sequences were nearly perfectly conserved (Supplementary Fig. 2b), with up to 4 aggregate mismatches among all orthologs (that is, only 4 out of ~260 bases). The fact that so many (~40%) miRNA* sequences resist nucleotide divergence across a broad species range is inconsistent with the idea that miRNA* species are merely carrier strands whose only constraint is to maintain hairpin pairing to their miRNA partners. We also classified 11 additional genes as ‘highly conserved’, in that no more than two miRNA* nucleotide positions had diverged among 12 orthologs (Supplementary Fig. 2b). In total, 65 genes satisfied highly conserved (HC) criteria, or nearly half of all D. melanogaster miRNA loci. We divided the remaining miRNA gene alignments on the basis of their presence in non-Sophophoran Drosophilids (Drosophila virilis, Drosophila mojavensis and Drosophila grimshawi), the most distantly related sequenced species relative to D. melanogaster. There were 46 genes with non-Sophophoran orthologs (Supplementary Fig. 2c) and 20 genes that were restricted to the Sophophora (Supplementary Fig. 2d; we refer to these as the poorly conserved (PC) gene set).
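The gene counts in this paragraph can be cross-checked with a few lines of arithmetic. The Python sketch below is offered only as a consistency check; the variable names are illustrative, and the 22-nt miRNA* length used to approximate the "~260 bases" figure is an assumption, not a value stated in the text.

```python
# Counts quoted in the text.
perfectly_conserved = 31
nearly_perfect = 23
additional_highly_conserved = 11
non_sophophoran_orthologs = 46
sophophora_restricted = 20  # the poorly conserved (PC) set

hc_total = perfectly_conserved + nearly_perfect + additional_highly_conserved
all_genes = hc_total + non_sophophoran_orthologs + sophophora_restricted

frac_perfect_or_near = (perfectly_conserved + nearly_perfect) / all_genes
frac_hc = hc_total / all_genes

# Assumption for illustration: a ~22-nt miRNA* aligned across 12 species.
aligned_bases = 22 * 12

print(hc_total, all_genes)
print(f"{frac_perfect_or_near:.0%} perfectly or nearly perfectly conserved")
print(f"{frac_hc:.0%} highly conserved")
print(aligned_bases)
```

Under these assumptions the three classes sum to the 131 genes examined, and the fractions agree with the "~40%" and "nearly half" figures given above.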
We calculated the relative conservation of each 7-nt window along the orthologs of all miRNA strands using a scheme that was weighted according to evolutionary branch length (Methods). A previous survey of paralogous miRNA families revealed that positions 2–8 showed the highest constraint of all such 7-nt windows (ref. 16). Our analysis differed in that we considered all orthologous miRNAs, which allowed us to evaluate many more gene comparisons and also to consider more recent evolutionary trends (as gene orthologs are much more recently diverged than gene paralogs).
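A sliding 7-nt window score of this kind might be sketched as follows. This is not the scoring scheme from Methods, which is not reproduced here: the function name, the exact-match criterion and the optional `weights` argument (a stand-in for branch-length weighting) are all assumptions, and the sequences are toy examples rather than real miRNAs.

```python
def window_conservation(reference, orthologs, weights=None, width=7):
    """Score each window by the weighted fraction of orthologs that match
    the reference exactly across that window."""
    if any(len(seq) != len(reference) for seq in orthologs):
        raise ValueError("all sequences must be aligned to the same length")
    if weights is None:
        weights = [1.0] * len(orthologs)  # unweighted fallback
    total = sum(weights)
    scores = []
    for start in range(len(reference) - width + 1):
        ref_window = reference[start:start + width]
        conserved = sum(
            w for seq, w in zip(orthologs, weights)
            if seq[start:start + width] == ref_window
        )
        scores.append(conserved / total)
    return scores


# Toy alignment: the second ortholog differs from the reference at position 8.
reference = "ACGUACGUACGU"
orthologs = ["ACGUACGUACGU", "ACGUACGAACGU"]

scores = window_conservation(reference, orthologs)
for start, score in enumerate(scores, start=1):
    print(f"positions {start}-{start + 6}: {score:.2f}")
```

In this toy case only the window spanning positions 1–7 avoids the substitution, so it alone scores as fully conserved. Passing larger weights for more distant species would make their mismatches count more heavily.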
Analysis of 131 miRNAs revealed two discernible evolutionary patterns. First, the 5′ and 3′ ends of miRNAs were more conserved than their central regions (, dark green). This is consistent with the idea that there is general pressure to maintain the immediate sequence of Drosha and Dicer processing sites. Second, the 5′ ends of miRNAs were slightly more conserved than their 3′ ends. In particular, the miRNA seed window at positions 2–8 was most conserved (, below, indicated by an asterisk on the dark green bar). These trends paralleled the results of paralog analysis (ref. 16) and reflect the experimental demonstration that the sequence at the 5′ end of the miRNA is most crucial for target identification (refs. 14,15,36). The 65 HC miRNA genes were nearly universally conserved along their miRNA strands and thus generated little in the way of evolutionary signal. Nevertheless, it was evident that the central region of even highly conserved miRNAs showed some measure of divergence, resulting in a slight dip in their aggregate conservation scores (, below, light green; see also the closer view above).
Figure 5 Bioinformatic evidence for the endogenous usage of both miRNAs and miRNA* species as regulatory RNAs. (a) miRNA-miRNA* sequence evolution. Above is a schematic of a typical miRNA hairpin, showing that the miRNA seed pairs to the 3′ end of the ...
If the primary purpose of a miRNA* species is simply to promote accurate processing of its miRNA partner, then we might expect that miRNA* species should be more tightly constrained at their 3′ ends, which pair with the miRNA seed (). On the contrary, systematic analysis of the 65 HC miRNA* species produced a profile that was notably analogous to miRNA strands. In particular, the 5′ and 3′ termini of miRNA* arms were more conserved than their central regions, but miRNA* 5′ ends were slightly more conserved than their 3′ ends (, yellow). miRNA* conservation dropped off noticeably between the positions 2–8 and 3–9 windows, a feature that was suggestive of preferred seed constraint for miRNA* species. Therefore, although miRNA* species are less well-conserved than miRNA species, they show patterns of nucleotide divergence that are consistent with their selection for regulatory activity.
Selective conservation of miRNA* seed matches in target 3′ UTRs
The evolutionary rigidity of ~50% of Drosophila miRNA* species was suggestive of their functional constraint. We sought to corroborate this by comparing the evolutionary behavior of miRNA and miRNA* seed matches. Watson-Crick complements to miRNA seeds, namely positions 2–8 from their 5′ ends, identify significantly more conserved matches in 3′ UTRs than do matched cohorts of shuffled seeds (refs. 16,18,34). We asked whether this applied to miRNA* seeds as well. Because sequence randomization necessarily yields some motifs that are not representative of a true genome, we took care to create control heptamers that had the same nucleotide composition and the same hit frequency (±~10%) in D. melanogaster 3′ UTRs as genuine miRNA heptamers (see Methods).
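The seed-match and control-heptamer logic described above might be sketched as follows. This is a simplified illustration and not the procedure from Methods: the function names are hypothetical, the small RNA and 3′ UTR are illustrative sequences that are not taken from the study, and controls are drawn here simply from all rearrangements of the genuine heptamer, filtered to a hit frequency within ±10%.

```python
from itertools import permutations

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_match_site(small_rna, first=2, last=8):
    """Watson-Crick complement (read 5'->3') of positions first..last (1-based)."""
    seed = small_rna[first - 1:last]
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def count_sites(utr, site):
    """Count occurrences of site in utr, allowing overlaps."""
    return sum(
        1 for i in range(len(utr) - len(site) + 1)
        if utr[i:i + len(site)] == site
    )


def matched_controls(site, utrs, tolerance=0.10):
    """Rearrangements of site (same nucleotide composition) whose total hit
    count across utrs lies within +/- tolerance of the genuine site's count."""
    target = sum(count_sites(utr, site) for utr in utrs)
    controls = []
    for candidate in sorted({"".join(p) for p in permutations(site)}):
        if candidate == site:
            continue
        hits = sum(count_sites(utr, candidate) for utr in utrs)
        if abs(hits - target) <= tolerance * target:
            controls.append(candidate)
    return controls


# Illustrative inputs only.
small_rna = "UGAGGUAGUAGGUUGUAUAGUU"
utrs = ["AAACUACCUCAAGGCUACCUCUU"]

site = seed_match_site(small_rna)
print(site, count_sites(utrs[0], site))
print(matched_controls(site, utrs))
```

Matching controls on both composition and hit frequency keeps the comparison from being driven by heptamers that are unusually rare or common in 3′ UTRs, which is the concern raised in the text about naive randomization.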
We first asked whether miRNA:miRNA* seed matches were more conserved than matches to all other heptamers along these small RNAs (ref. 16). We used the pairwise conservation score (PCS) method to rank the relative conservation of D. melanogaster 3′ UTR sequences with that of divergent Drosophilids (ref. 38). This score represents the log rank ratio between the number of seed matches in D. melanogaster and the species of comparison, for which positive values imply functional constraint. As expected, seed matches to 7-nt windows at the 5′ ends of D. melanogaster miRNAs were preferentially conserved in the highly diverged species D. mojavensis and D. virilis, with other 7-nt windows evolving neutrally (, above, green). The highest-scoring windows were positions 2–8 and 1–7, consistent with their known role in determining miRNA target specificity. Analysis of the 65 highly conserved miRNA* species yielded a similar picture. Although the trends were more modest, heptamer matches to the 5′ ends of miRNA* species clearly showed preferential conservation (, above, yellow).
For comparison, we analyzed the set of 20 D. melanogaster miRNAs (Supplementary Fig. 4d online) that lack orthologs or homologs outside of the Sophophoran subgenus. We designated these as PC miRNAs, although their conservation status is heterogeneous: some PC genes have orthologs in nine species. Neither PC miRNAs nor PC miRNA* species showed preferred conservation of 3′ UTR matches across any 7-nt window (, below, light blue and pink), negative data that provided reassurance that our control sets were selected appropriately.
We next examined the numbers of seed matches to the 2–8 window that were conserved between species of increasing evolutionary distance from D. melanogaster. The fraction of conserved hits to functional miRNA seeds relative to controls increases with evolutionary distance, resulting in a rising signal-to-noise profile across speciation (refs. 16,18). For the 65 HC miRNA genes, average miRNAs showed a ~3:1 ratio in the most diverged species (, green diamonds). On the other hand, the 20 PC miRNA seed matches showed no enrichment across evolution, so their values stayed flat at a ratio of ~1.0 (, blue triangles). The same was true for the PC miRNA* seeds (, black circles).
In light of the PC miRNA:miRNA* data, the behavior of the 65 HC miRNA* species was noteworthy. Their values increased steadily to a terminal signal-to-noise of 1.48 to 1 in D. mojavensis and D. virilis (, yellow squares). A caveat to this value is that some miRNA* species have the same seed as some miRNAs; notably, miR-5* shares the K box seed of miR-2/6/11/13/308, whereas miR-9a* shares the Brd box seed of miR-79. We consider it appropriate to include their contributions to the miRNA* target network because, at least in the case of mir-5, we presented biochemical evidence that its precursor actively loads appreciable amounts of miRNA* into AGO1 (). Nevertheless, even when discounting these miRNA* species to afford a more conservative interpretation, the remaining HC miRNA* still reached a signal-to-noise ratio of 1.38:1.
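The signal-to-noise ratios discussed here compare conserved matches to a genuine seed against conserved matches to its control heptamers. A minimal sketch of that comparison follows; the function names are hypothetical and the counts are invented solely so that the example reproduces a ratio of 1.48, so they should not be read as data from the study.

```python
from statistics import mean


def signal_to_noise(conserved_real, conserved_controls):
    """Conserved matches to the genuine seed divided by the control average."""
    background = mean(conserved_controls)
    if background == 0:
        raise ValueError("control heptamers have no conserved matches")
    return conserved_real / background


def excess_over_noise(conserved_real, conserved_controls):
    """Conserved matches beyond what the controls predict."""
    return conserved_real - mean(conserved_controls)


# Hypothetical counts, chosen only for illustration.
real_hits = 148
control_hits = [95, 100, 105]

ratio = signal_to_noise(real_hits, control_hits)
excess = excess_over_noise(real_hits, control_hits)
print(f"signal-to-noise = {ratio:.2f}:1, excess conserved matches = {excess}")
```

A ratio near 1.0, as reported for the PC sets, means the genuine seed is conserved no more often than its composition-matched controls.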
Another way to demonstrate that the HC miRNA* data were not dominated by a few genes was to consider the percentage of genes that select for targets. Of 65 HC miRNAs, 62 showed a positive PCS score in 3′ UTRs, and 54 out of 65 (83%) reached a P-value significance of 95% relative to their controls (see Supplementary Table 3 online for individual gene values). On the other hand, 40 out of 65 HC miRNA* species showed a positive PCS score in their 3′ UTRs, with 23 out of 65 (35%) with P > 95%. Therefore, the biological signal of targeting by HC miRNA* species was not dominated by a few genes. Only three PC miRNAs and two PC miRNA* achieved a P > 95% difference with their random controls (see Supplementary Table 4 online for individual gene values).
According to these measures based solely on conserved seed matches, highly conserved Drosophila miRNAs have at least 30 conserved targets above noise, whereas their corresponding miRNA* species have ~10 targets above noise (). Although these miRNA* target networks are smaller than those of average miRNAs (), their significance can be measured in light of the fact that at least one-fifth of all miRNA* species can be confidently described as showing some endogenous 3′ UTR targets that are conserved among the Drosophilids, whereas one-sixth of D. melanogaster miRNAs (that is, the PC gene set) lack such conserved targets.
Experimental evidence for endogenous miRNA* activity
Our bioinformatic studies strongly support the idea that a significant fraction of miRNA* species contribute to 3′ UTR-mediated regulatory networks. In our final tests, we sought experimental evidence for the regulatory activity of an endogenous miRNA* and/or regulation of an endogenous miRNA* target.
We focused on mir-276a for tests of the former, as we recovered similar numbers of small RNAs from both its hairpin left arm (5p) and right arm (3p) from various samples. For example, there were 408 miR-276a-5p and 479 miR-276a-3p clones from S2 cells, and we could corroborate the steady-state accumulation of both strands using northern analysis (). We note that the mir-276a and mir-276b loci encode identical left-arm products, which might obscure their assignment to a particular locus. However, their right-arm (3p) products have unique sequences. As no miR-276b-3p clones were recovered from S2 cells, despite >1,000 clones of this RNA in other libraries (Supplementary Table 1), we inferred that mir-276a is uniquely expressed by S2 cells. We verified this by performing quantitative reverse-transcription PCR (qPCR) for pri-mir-276a and pri-mir-276b, which provided evidence for a >15:1 discrepancy in the level of their primary transcripts in S2 cells (Supplementary Fig. 3).
Using four-tandem site sensors for miR-276a or miR-276a* cloned into psiCHECK2, we first assayed their response to ub-Gal4-driven expression of mir-276a or mir-276b. We observed that either mir-276a or mir-276b induced 4-fold to 8-fold repression of both 5p and 3p sensors. Evidence for specificity of repression came with their insensitivity to a noncognate expression vector for mir-315. Thus, both mir-276 genes (which are both perfectly conserved on their left and right arms across Drosophilid evolution, Supplementary Fig. 4) produce functional small RNAs (), consistent with our previous tests with mir-iab-4, mir-iab-8
Figure 6 Endogenous relevance of miRNA*-mediated repression. (a) Ectopic mir-276a and mir-276b, but not DsRed or mir-315, specifically repress both miR-276a-5p and miR-276a-3p sensors. The experimental design is the same as for ; mean values and s.d. ...
We then analyzed the effect of depleting endogenous miR-276a-5p and miR-276a-3p using 2′O-methylated antisense oligonucleotides (ASO) (refs. 39,40). As a control, we used a similarly sized ASO to miR-288. We observed that the miR-276a-5p sensor was specifically derepressed by its cognate ASO but not by miR-276a-3p or miR-288 ASO (). Conversely, the miR-276a-3p sensor was specifically derepressed by its cognate ASO and was unaffected by noncognate ASO. These data provided evidence for the endogenous regulatory activity of both small RNAs derived from a single pre-miRNA hairpin.
To obtain experimental evidence for an endogenous 3′ UTR target of a miRNA* species, we returned to the Hox miRNA mir-iab-4. We showed that its miRNA* species miR-iab-4-3p is an active repressor in cultured cells and in the animal (), and its strict conservation suggested that it might have endogenous targets. Searches for miR-iab-4-3p seed matches revealed abrupt as a top candidate (ref. 18). The abrupt 3′ UTR contains three seed matches that are Watson-Crick complements to positions 2–8 of miR-iab-4-3p (). Two of these sites are perfectly conserved across 12 sequenced Drosophilid genomes (Supplementary Fig. 4) and are located close enough to each other to mediate synergism (refs. 41,42). One of these sites also has a t1A feature, which increases site efficacy (ref. 43). Finally, all three sites are located near the stop codon (), an optimal location for miRNA target-site function (refs. 41,44,45). For these reasons, abrupt is a compelling miRNA* target. We note that miR-iab-4-5p shows plausible matching to these sites; however, all of the sites are mispaired with position two of miR-iab-4-5p (Supplementary Fig. 4b), a disruption that has explicitly been shown to nearly eliminate target regulation in D. melanogaster (ref. 14). We therefore consider miR-iab-4-5p unlikely to be functionally relevant to abrupt
We tested the ability of ectopic mir-iab-4 to repress Abrupt in the wing imaginal disc. Endogenous Abrupt protein is present at the highest level in the L5 wing primordium (ref. 46) (). Expression of ectopic mir-iab-4 did not suppress L5 expression of Abrupt (data not shown), possibly because of the high target level, compensatory regulation and/or occluding factors bound to abrupt transcripts in this domain. However, when we examined discs that ectopically expressed mir-iab-4 in a domain that specifically overlaps a region of lower Abrupt expression in the L3 wing primordium, we detected mild downregulation of endogenous Abrupt (). To obtain clearer evidence for this regulatory relationship, we analyzed a tub-GFP-abrupt 3′ UTR transgenic sensor. The heterologous promoter excludes the possibility of compensation at the transcriptional level. These assays clearly revealed repression of the abrupt sensor in mir-iab-4-expressing cells (), confirming it as a genuine miRNA* target. In summary, these experimental tests provide functional evidence for the regulatory activity of endogenous miRNA* species and targets.