microRNAs (miRNAs) are extensively involved in developmental programming. Some miRNAs are highly conserved, while others are lineage specific. All miRNAs mature through a series of processing steps. Here we review recent progress in the study of the early steps of miRNA biogenesis, focusing on animal systems. The miRNA maturation pathways are surprisingly diverse, involving transcription by RNA polymerase II or III, cleavage by the Drosha nuclease or the spliceosome, and sometimes modification by the adenosine deaminase ADAR. We also discuss the relationship between the diversity in miRNA biogenesis and the apparently rapid evolution of miRNA genes and functions.
miRNAs are a class of non-protein-coding RNAs that are ~22 nt in length. They are involved in important biological functions, such as development and cell physiology. Over 700 human miRNAs have been cloned. They regulate the expression of ~30% of protein-coding genes by targeting specific messenger RNAs for cleavage or translational regulation, in a combinatorial fashion. In animals, miRNAs often lead to translational repression, and to some extent mRNA decay, through partially complementary pairing to the 3′ untranslated regions (UTRs) of their targets using the seed region (positions 2 to 7 or 8 from the 5′ end). In plants, miRNAs show perfect or near-perfect complementarity to their targets and lead to mRNA degradation. A recent article showed that translational inhibition is also widespread in gene silencing by plant miRNAs and siRNAs, revealing further similarities between animal and plant miRNAs. Some miRNAs have been shown to be oncogenic or to suppress tumor growth. Several miRNAs are involved in metastasis [6, 7]. Artificial miRNAs that mimic natural pri- or pre-miRNAs (termed small hairpin RNAs, or shRNAs) are being used as tools for genomic research and for therapeutic purposes [8–10]. The mechanisms of posttranscriptional regulation by miRNAs have been reviewed elsewhere. In this review, we focus on recent studies of the early steps in miRNA biogenesis pathways in animals and discuss their implications for the evolution of miRNA genes and functions.
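The seed-pairing rule described above can be illustrated with a minimal Python sketch. It assumes perfect Watson-Crick pairing between the seed and the target site, which is a simplification of real target recognition; the function names and the toy UTR sequence are hypothetical and serve only as illustration.

```python
# Sketch: locate candidate seed-match sites in a 3' UTR.
# Assumption: a site is a perfect Watson-Crick match to the seed
# (positions 2 to 7 or 2 to 8 from the miRNA 5' end, 1-based).

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed(mirna: str, last_pos: int = 8) -> str:
    """Return the seed: positions 2..last_pos (1-based) from the 5' end."""
    return mirna[1:last_pos]


def find_seed_sites(mirna: str, utr: str, last_pos: int = 8) -> list[int]:
    """Return 0-based start indices of seed-complementary sites in the UTR."""
    site = reverse_complement(seed(mirna, last_pos))
    hits = []
    start = utr.find(site)
    while start != -1:
        hits.append(start)
        start = utr.find(site, start + 1)  # allow overlapping matches
    return hits


# Toy inputs (illustrative only, not taken from the text).
toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
toy_utr = "AAACUACCUCAAA"
print(find_seed_sites(toy_mirna, toy_utr, last_pos=8))
print(find_seed_sites(toy_mirna, toy_utr, last_pos=7))
```

Choosing `last_pos=7` or `last_pos=8` corresponds to the two seed definitions mentioned above; the shorter seed matches at least as many sites as the longer one.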
miRNAs contain a 5′-phosphate and a 3′-OH; thus they can be cloned using a specific procedure involving size selection and adaptor ligations [11–14]. Most miRNAs are named miR-#, where # represents a number. miRNA sequences are found on either the 5′ or the 3′ strand of the hairpin secondary structure of their precursors. If the mature miRNA is located on the 5′ strand, it is called miR-#-5p; if it is located on the 3′ strand, it is called miR-#-3p. These common features of miRNAs are defined by miRNA processing factors, as will be discussed later. miRNAs are similar in length and chemical structure to another class of small RNAs, called small interfering RNAs (siRNAs). siRNAs are generated from long double-stranded RNAs, which could be replication intermediates of viruses or transcripts of transposable elements.
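The -5p/-3p naming rule can be written as a small Python sketch. The coordinate convention (0-based, end-exclusive positions on the hairpin precursor), the function names, and the miRNA number 999 are all hypothetical choices made for illustration.

```python
# Sketch: assign the -5p / -3p suffix from the position of the mature
# miRNA relative to the hairpin loop.
# Assumption: coordinates are 0-based and end-exclusive on the precursor.


def arm_suffix(mature_start: int, mature_end: int,
               loop_start: int, loop_end: int) -> str:
    """Return '5p' if the mature miRNA lies 5' of the loop, '3p' if 3' of it."""
    if mature_end <= loop_start:
        return "5p"
    if mature_start >= loop_end:
        return "3p"
    raise ValueError("mature sequence overlaps the loop")


def mirna_name(number: int, mature_start: int, mature_end: int,
               loop_start: int, loop_end: int) -> str:
    """Build a name of the form miR-#-5p or miR-#-3p."""
    suffix = arm_suffix(mature_start, mature_end, loop_start, loop_end)
    return f"miR-{number}-{suffix}"


# Hypothetical hairpin with a loop spanning positions 30..40.
print(mirna_name(999, 5, 27, 30, 40))
print(mirna_name(999, 43, 65, 30, 40))
```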
miRNAs are synthesized in cells as long primary transcripts (pri-miRNAs) that often contain thousands of nucleotides. Pri-miRNAs are cleaved by a series of cellular processing factors. The major miRNA processing pathway in animals is illustrated in Fig. 1. A long pri-miRNA is first recognized and cleaved by a ribonuclease III (RNase III) called Drosha, together with an RNA-binding protein called DGCR8 or Pasha (Partner of Drosha) [17–21]. Jointly, Drosha and DGCR8 form the Microprocessor complex [18, 19]. The product of the first cleavage event, a pre-miRNA of ~70 nt, is then transported across the nuclear membrane by Exportin-5 and Ran-GTP [22, 23].
In the cytoplasm, the pre-miRNA is further cleaved at a pair of sites close to the hairpin loop by another RNase III enzyme, Dicer, to give a miRNA duplex [24–29]. The miRNA duplex is then incorporated into the effector complex RISC (RNA-induced silencing complex) [30–32], likely through the activity of the RISC loading complex, which contains Dicer, the central RISC component Argonaute, and RNA-binding proteins (TRBP and PACT [33–37] in humans, and Loquacious [38–40] in flies). The miRNA duplex is unwound into the mature single-stranded form (called the guide strand), and its complementary strand (the passenger strand, or miRNA*) is degraded by RISC [33, 42–44]. Mature miRNAs often originate predominantly from one strand of the duplex. Asymmetric incorporation of siRNA strands has been observed [45, 46]. The Drosophila protein R2D2, the dsRNA-binding partner of Dicer-2, binds the end of the duplex that has the more stable base pairing, providing the biochemical basis for the preferential use of one siRNA strand. A similar mechanism might be responsible for miRNA strand selection. The stem loop in pre-miRNAs, presumably when bound to the RISC loading complex, likely contributes to strand selection. The miRNA* can also be incorporated into RISC and actively used in gene regulation.
The extensive processing of miRNA transcripts ensures that only RNA molecules with the proper structural features can be used in gene regulation. These structural features are recognized by the miRNA processing factors. For example, it has been shown that Dicer cleaves double-stranded RNA into 21-nt duplexes preferentially from the end of the helix [51, 52]. In addition, the stem and 3′ overhang of pre-miRNAs are important for association with Exportin-5 and Ran-GTP. The recognition of pri-miRNAs by Drosha and DGCR8 is the first step in miRNA maturation and probably imposes the most important constraints on miRNA processing. The RNA structures within an 80-nt hairpin region and the unstructured regions flanking the hairpin have been shown to be important for processing by Microprocessor [17, 54–56]. The Drosha cleavage sites are about 10 bp from the junction between the miRNA hairpin and the flanking regions. It is generally agreed that DGCR8 plays a major role in the recognition of pri-miRNAs [57–59]. However, it is far from clear how DGCR8 and Drosha distinguish pri-miRNAs from the large number of other hairpin-containing RNA molecules. In addition, the DEAD-box RNA helicases p68 and p72, which are found in the Microprocessor complex, are required for processing of a subset of miRNAs. Their helicase activity is required for the maturation of these miRNAs and the subsequent gene silencing [60, 61]. These helicases may bind pri-miRNAs and modulate the activity of the Microprocessor complex through structural reorganization, or facilitate the loading of miRNA duplexes into RISC. siRNAs are also generated through cleavage by Dicer. Mammals have only one Dicer gene; knockout or knockdown of Dicer affects both miRNAs and siRNAs [62, 63], whereas knockdown of Drosha or DGCR8 affects only miRNA maturation [64–67].
Analysis of the genomic locations of miRNA genes indicated that many miRNAs are located in the introns of mRNAs, with some even in the coding regions. Co-transcription of a miRNA with its host gene presumably allows them to be regulated together. Interestingly, a recent report showed that miR-21 is located in an intron of a coding gene, TMEM49, but its transcription is likely driven by its own promoter. Thus, location of a miRNA in an intron does not guarantee co-regulation with its host gene. Other miRNAs are in intergenic regions and are likely transcribed independently. Stems of miRNA hairpins tend to be conserved, loop sequences are more variable, and conservation drops sharply in the sequences immediately flanking the hairpin. This type of phylogenetic shadowing profile was successfully used to identify novel miRNAs in primates. miRNAs tend to cluster together [11, 12, 31]. Clustering of miRNA hairpins presumably simplifies their transcriptional regulation. Several recent advances in the study of miRNA biogenesis further underscore the diverse origins of miRNAs and are discussed below.
Most pri-miRNAs are likely products of RNA polymerase II [71, 72]. miRNAs regulate the expression of genes through pairing interactions with mRNAs, especially through their 3′ untranslated regions. Therefore, the expression of miRNAs must itself be regulated. Since transcription by RNA polymerase II is subject to the highest degree of regulation among the three RNA polymerase families, it is not surprising that most pri-miRNAs are transcribed by RNA polymerase II. Nevertheless, there is no reason to exclude the use of other RNA polymerases. In fact, RNA polymerase III promoters such as U6 and H1 are often used to transcribe shRNAs. In 2006, it was reported that ~50 human miRNAs are transcribed by RNA polymerase III. In a genomic analysis, the miRNAs in the human chromosome 19 miRNA cluster (C19MC) were found to be dispersed among Alu repeats. The Alu sequence is the most abundant transposable element in the human genome. It is derived from the 7SL RNA gene, which encodes the RNA component of the signal recognition particle, a complex that functions in co-translational protein targeting. The Alu sequence contains the 7SL promoter, an RNA polymerase III promoter. Subsequent chromatin immunoprecipitation and cell-free transcription assays confirmed that the miRNAs in C19MC are indeed products of RNA polymerase III.
Many miRNAs are located within introns. They are in general cleaved by Drosha, in parallel with the splicing of their host mRNAs. Deep sequencing of D. melanogaster and C. elegans RNAs allowed the identification of a novel class of intronic miRNAs that do not contain the 10-bp helix at the base of the miRNA hairpin normally required for Drosha cleavage [75, 76]. These pre-miRNAs (mirtrons) turned out to be processed directly by the spliceosome instead of Drosha (Fig. 1). Like other introns, mirtrons are flanked by 5′ and 3′ splice sites and contain branch point sequences. Unlike the pre-miRNAs generated by Drosha, the lariat mirtrons generated by the spliceosome need to be linearized by the debranching enzyme and to fold into hairpins prior to export to the cytoplasm by Exportin-5. Both Drosha-processed pre-miRNAs and mirtrons are processed by Dicer. It should be noted that mirtrons are a subset of the miRNAs transcribed by RNA polymerase II.
Pre-miRNAs are typically 60–100 nt in length. This size range coincides with the size of small introns in flies and nematodes. Mammalian introns are usually much longer than pre-miRNAs; thus fewer mirtrons were predicted to exist in mammalian genomes. However, a recent report demonstrated that mirtrons appear to be present in human and other mammalian genomes. Although several mirtrons are highly conserved within Drosophilids, nematodes, or mammals, no mirtron is shared among all of these animals, suggesting that mirtrons were acquired independently during the evolution of different animal clades.
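The size argument above suggests a simple first-pass screen for mirtron candidates, sketched below in Python. The sketch is only an illustration under stated assumptions: it checks that an intron falls in the 60–100 nt pre-miRNA size range and begins and ends with the canonical GU and AG dinucleotides, and it does not test hairpin folding or branch point sequences. The function names and toy sequences are hypothetical.

```python
# Sketch: first-pass screen for mirtron-sized introns.
# Assumptions: introns are given as RNA strings (5' to 3'), the size
# window is the 60-100 nt pre-miRNA range, and splice sites are the
# canonical GU...AG dinucleotides. Hairpin folding is NOT checked here.


def is_mirtron_sized(intron: str, min_len: int = 60, max_len: int = 100) -> bool:
    """True if the intron length falls within the pre-miRNA size range."""
    return min_len <= len(intron) <= max_len


def has_canonical_splice_ends(intron: str) -> bool:
    """True if the intron starts with GU and ends with AG."""
    return intron.startswith("GU") and intron.endswith("AG")


def mirtron_candidates(introns: dict[str, str]) -> list[str]:
    """Return names of introns passing both the size and splice-end checks."""
    return [
        name
        for name, sequence in introns.items()
        if is_mirtron_sized(sequence) and has_canonical_splice_ends(sequence)
    ]


# Toy introns (illustrative only): one short, one of typical mammalian length.
toy_introns = {
    "short_intron": "GU" + "A" * 66 + "AG",    # 70 nt
    "long_intron": "GU" + "A" * 1500 + "AG",   # 1504 nt
}
print(mirtron_candidates(toy_introns))
```

A real screen would follow this filter with secondary-structure prediction, since only introns that fold into pre-miRNA-like hairpins can enter the pathway.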
Cleavages by Drosha and Dicer are not the only RNA processing events that miRNAs can undergo during maturation. ADAR (adenosine deaminase that acts on RNA) can convert some adenosines (A) to inosines (I) in double-stranded RNAs (Fig. 1). Inosine preferentially base-pairs with C. The A→I modification disrupts a stable A:U base pair and creates a less stable I:U mismatch. ADAR is the most common type of RNA editing enzyme in metazoans. Editing of pri-miR-142 suppresses its processing by Drosha. Pri-miR-151 and possibly pre-miR-151 are modified by ADARs at two sites close to the Dicer cleavage sites, and these modifications block cleavage by Dicer. Thus, A-to-I editing can regulate miRNA biogenesis. Interestingly, an A-to-I editing site of miR-376 is located in the “seed” region critical for the recognition of miRNA targets. This modification redirects miR-376 to silence a different set of genes. Therefore, target redirection through ADAR activity can increase the functional diversity of miRNAs. In addition to adenosine deamination, recent deep sequencing of miRNAs extracted from humans and rodents revealed the addition of a single nucleotide at the 3′ end of miRNAs [82, 83] and cytosine deamination. The functional importance of these modifications remains to be demonstrated.
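How a single A-to-I edit in the seed can redirect a miRNA is easy to see in a short Python sketch. It assumes that inosine pairs with C, as stated above, and that the target site is the perfect complement of seed positions 2 to 8. The sequence is a made-up toy, not the real miR-376 sequence, and the function names are hypothetical.

```python
# Sketch: effect of an A-to-I edit in the seed on the complementary site.
# Assumptions: inosine (I) pairs with C; the target site is the perfect
# complement of seed positions 2..8. The miRNA below is a toy sequence.

PAIRING = {"A": "U", "U": "A", "G": "C", "C": "G", "I": "C"}


def edit_a_to_i(mirna: str, position: int) -> str:
    """Replace the adenosine at a 1-based position with inosine."""
    if mirna[position - 1] != "A":
        raise ValueError(f"position {position} is not an adenosine")
    return mirna[: position - 1] + "I" + mirna[position:]


def seed_site(mirna: str, last_pos: int = 8) -> str:
    """Return the target site (5' to 3') complementary to the seed."""
    seed = mirna[1:last_pos]
    return "".join(PAIRING[base] for base in reversed(seed))


toy_mirna = "UGACAUAGCUUAGGCAUCGAUU"
edited = edit_a_to_i(toy_mirna, 5)

print("unedited site:", seed_site(toy_mirna))
print("edited site:  ", seed_site(edited))
```

In this toy case the two sites differ at a single nucleotide (U versus C), so the edited and unedited forms would be expected to recognize different sets of seed matches.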
miRNAs can be generated in animals through transcription by either RNA polymerase II or III; as independent transcripts or together with other genes; from introns or exons. Their precursors can be processed by either Drosha or the spliceosome in the nucleus and can be modified by RNA editing enzymes. The diversity of pathways that generate functional small non-coding RNAs is further highlighted by miRNAs in plants. Plant genomes do not encode Drosha and DGCR8 homologues; instead, miRNAs are processed by Dicer-like proteins. Unlike their animal counterparts, plant miRNAs are methylated at their 3′ ends through the activity of HEN1. Therefore, the consensus for miRNA biogenesis is that there is more than one way to skin a cat. As long as the mature miRNAs have the chemical structure and the length required for interaction with RISC and for subsequent gene regulation, it is probably advantageous to have multiple ways to generate them.
Obviously, the distinct pathways allow miRNAs to be controlled through different mechanisms. For example, the Lin-28 RNA-binding protein specifically associates with the let-7 family pri-miRNAs and blocks their processing by Drosha and Dicer. The SMAD proteins, which are signal transducers of TGF-β and BMP, promote the processing of pri-miR-21 by Drosha. hnRNP A1, a nucleocytoplasmic shuttling protein, binds specifically to human pri-miR-18a and facilitates its Drosha-mediated processing. We found that the DGCR8 protein binds heme, and this interaction may be part of a molecular mechanism that regulates miRNA maturation. ADAR editing of pri- and/or pre-miRNAs allows tissue-specific regulation of their processing and target specificity [79–81]. The presence of multiple miRNA maturation pathways predicts that altering the expression or activity of a miRNA processing factor changes the abundance of only a subset of miRNAs.
It is possible that the distinct pathways used to generate miRNAs allow these small gene regulators to evolve more readily, thus conferring a fitness advantage on their hosts. It was suggested that the mirtron pathway may have helped the emergence of miRNAs before the advent of Drosha. When deep sequencing of miRNAs from three species of Drosophila was performed, a large class of evolutionarily young miRNAs was observed. The estimated birth rate of new miRNA genes is 12 per million years. Among the new genes, 96% disappeared quickly in the course of evolution, 4% were retained, and only 2.5% became modestly or highly expressed. Furthermore, sequencing and comparative analysis of the C19MC in nine diverse primate species suggested an Alu-mediated rapid expansion of this miRNA family. Single nucleotide polymorphisms (SNPs), the most abundant form of DNA variation in the human genome, are found in miRNA genes and may provide a snapshot of miRNAs during evolution. Some miRNA SNPs have been shown to alter miRNA processing and target specificity [92, 93]. Editing by ADAR allows miRNA isoforms with different target specificities to be generated from a single miRNA gene. The emergence of new miRNAs correlates with the introduction of developmental complexity during evolution, suggesting important functional contributions from the expansion of miRNA genes, as well as of other non-protein-coding RNAs [94, 95]. The large numbers of miRNA-like hairpin structures [96, 97] and mirtron-sized introns [75–77] found in animal genomes could provide potential novel miRNA candidates to facilitate the fast evolution of miRNAs.
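The birth and retention figures quoted above can be turned into approximate rates with a few lines of Python. The sketch assumes that all three percentages are fractions of the newly born genes, which is how the sentence reads but is an interpretation; the variable and function names are hypothetical.

```python
# Sketch: convert the quoted percentages into genes per million years.
# Assumption: each percentage is a fraction of all newly born miRNA genes.

BIRTHS_PER_MILLION_YEARS = 12
PERCENT_LOST = 96
PERCENT_RETAINED = 4
PERCENT_EXPRESSED = 2.5  # modestly or highly expressed


def genes_per_million_years(percent: float,
                            births: float = BIRTHS_PER_MILLION_YEARS) -> float:
    """Number of new genes per million years falling into a given class."""
    return births * percent / 100


lost = genes_per_million_years(PERCENT_LOST)
retained = genes_per_million_years(PERCENT_RETAINED)
expressed = genes_per_million_years(PERCENT_EXPRESSED)

print(f"lost:      {lost:.2f} genes per million years")
print(f"retained:  {retained:.2f} genes per million years")
print(f"expressed: {expressed:.2f} genes per million years")
```

Under this reading, roughly one new miRNA gene is retained every two million years (0.48 per million years), and about one every three million years (0.3 per million years) reaches modest or high expression.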
This work is supported by NIH grant GM080563-01A1 to F.G.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.