RNA modifications have long been known to be central in the proper function of tRNA and rRNA. While chemical modifications in mRNA were discovered decades ago, their function has remained largely mysterious until recently. Using enrichment strategies coupled to next generation sequencing, multiple modifications have now been mapped on a transcriptome-wide scale in a variety of contexts. We now know that RNA modifications influence cell biology by many different mechanisms — by influencing RNA structure, by tuning interactions within the ribosome, and by recruiting specific binding proteins that intersect with other signaling pathways. They are also dynamic, changing in distribution or level in response to stresses such as heat shock and nutrient deprivation. Here, we provide an overview of recent themes that have emerged from the substantial progress that has been made in our understanding of chemical modifications across many major RNA classes in eukaryotes.
It has been known for decades that RNA from all kingdoms of life can be post-transcriptionally modified with over 100 chemical moieties. Initial studies were largely limited to more abundant modifications: N1-methyladenosine (m1A), pseudouridine (Ψ), and 2′-O-methylation (2′OMe) in tRNA and rRNA, and N6-methyladenosine (m6A), 5-methylcytidine (m5C), and 2′O methylation (2′OMe) in mRNA and viral RNA.1-3 This early research not only began the work of exhaustively cataloging these modifications, but also began to uncover the mechanisms by which these small chemical additions influence RNA folding and function. This field has experienced a recent resurgence, particularly with respect to mRNA modifications, driven in part by technological advances in high throughput sequencing and mass spectrometry that have allowed deeper analysis of less prevalent modifications on less abundant RNA species. With these tools in hand, the mechanisms by which RNA modifications influence cell biology are beginning to be elucidated.
The discovery and characterization of FTO as a mammalian mRNA m6A demethylase highlighted the idea that chemical modifications on RNA could represent a reversible and dynamic mode of post-transcriptional regulation.4 Using an antibody-based enrichment strategy followed by high throughput sequencing, the distribution of m6A sites was mapped throughout the mammalian transcriptome, revealing the characteristic topology in the 3′UTR that suggested its function.5,6 As we learn more about m6A and other mRNA modifications, their distributions in the transcriptome, and their functions in biology, it is becoming clear that different modifications in different sequence contexts provide an additional layer of complexity to the post-transcriptional regulation of gene expression. Ongoing work has also revealed new facets of the structural and functional roles that chemical modifications play on tRNA and rRNA. As we learn more about the networks that control their installation, interplay between mRNA, tRNA, and rRNA modification machinery has also been uncovered, adding a complex layer of regulation to protein translation. Rather than providing an exhaustive overview of our current understanding of RNA modifications, here we will focus primarily on recent studies that highlight emerging themes in the biology of chemical modifications in eukaryotic RNA.
The presence of chemical modifications on mRNA was uncovered in the 1970s, when poly(A) tail-based purification techniques first allowed for sufficiently pure mRNA preparations to rule out contamination with other RNA species.1-3 These studies revealed the presence of internal m6A, m5C, and ribose 2′OMe (Fig. 1). The functional significance of these modifications in mRNA remained mysterious for decades, but recent studies have begun to shed light on how they influence cellular processes. The discovery of the first mRNA m6A demethylase FTO,4 quickly followed by a second demethylase ALKBH5,7 and the METTL3/METTL14 methyltransferase complex,8 brought mRNA modifications back to the forefront. The major implication of these discoveries was that mRNA modifications are likely reversible and regulated, renewing interest in uncovering their biological functions. Mapping of m6A sites using an antibody-based enrichment strategy followed by high throughput sequencing uncovered the transcriptome-wide topology of this modification, revealing enrichment around the stop codon and in the 3′UTR of thousands of transcripts.5,6 The exact numbers vary between studies, particularly between individual experiments, but the numbers of reproducible peaks seem to converge on approximately 13,000 sites in 5000–7000 genes, highlighting the need for consistent analysis of replicates. YTHDF2, the first m6A-binding protein to be characterized in detail, was shown to mediate an m6A-dependent reduction in half-life of its mRNA targets.5,9 Recent work has now revealed a molecular mechanism for this effect: YTHDF2 binds m6A and recruits the CCR4-NOT deadenylase complex to mRNA, resulting in CAF1- and CCR4-dependent deadenylation and subsequent degradation of m6A-containing transcripts.10
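The point about consistent analysis of replicates can be made concrete with a small sketch. The Python below keeps only those peaks from one replicate that overlap a peak on the same transcript in a second replicate. It is a minimal illustration, not the pipeline used in the cited studies: the function names, the tuple layout, and the peak coordinates are all hypothetical.

```python
from collections import defaultdict


def overlaps(a, b):
    """Return True if half-open intervals a=(start, end) and b=(start, end) share a base."""
    return a[0] < b[1] and b[0] < a[1]


def reproducible_peaks(rep1, rep2):
    """Return peaks from rep1 that overlap at least one rep2 peak on the same transcript."""
    by_transcript = defaultdict(list)
    for tx, start, end in rep2:
        by_transcript[tx].append((start, end))
    return [
        (tx, start, end)
        for tx, start, end in rep1
        if any(overlaps((start, end), other) for other in by_transcript[tx])
    ]


# Hypothetical peak calls as (transcript id, start, end); these are not real data.
rep1 = [("txA", 100, 200), ("txA", 900, 1000), ("txB", 50, 150)]
rep2 = [("txA", 150, 250), ("txB", 400, 500)]

shared = reproducible_peaks(rep1, rep2)
print(shared)
```

Only the first peak on txA survives here, because it is the only one supported by both replicates. Real analyses would typically add score thresholds and more than two replicates.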
Though initial reports highlighted its 3′ localization, m6A is also found in the 5′UTR, a region of mRNA closely tied to the regulation of translation. In fact, m6A in the 5′UTR mediates cap-independent translation through eIF3,11,12 a strikingly different effect from the decay pathway described above. Interestingly, this process also involves YTHDF2, so while m6A may mediate very different mRNA processing events depending on its context, specific components may be shared between pathways. In this second example, YTHDF2 is a protective factor against demethylation of m6A sites in the 5′UTR by FTO, rather than a direct link to processing machinery.12 This protection of 5′ m6A sites provides a route to selective mRNA translation during heat shock stress. Dynamic modification of RNAs in all major classes during stress responses is an emerging theme, which we will highlight throughout this review. It will also be interesting to see whether other YTH domain-containing proteins show the same versatility: YTHDF1 binds m6A in mRNA and promotes translation initiation,13 and recently YTHDC1 was shown to affect splicing as a nuclear m6A reader protein.14,15 It remains to be seen what other functions these proteins may have in cells.
At this stage, m6A is the most extensively characterized mRNA modification — we know the machinery that installs and removes it, m6A-specific binding proteins have been identified, and multiple biological systems in which it plays a role have been described.16 However, numerous other modifications have been added to the mRNA repertoire. m1A has been described in tRNA and rRNA,17,18 but we now know that it is also present in mRNA, albeit at levels approximately 10-fold lower than m6A.19,20 In the studies that first described its distribution in mRNA, one identified 7,154 peaks in 4,151 coding genes,19 while the other identified 887 peaks in 600 genes.20 The discrepancy is primarily due to methodological differences that inherently yield different levels of stringency, and that minimize spontaneous isomerization of m1A to m6A to different degrees. Li et al. employed an approach based on antibody enrichment followed by demethylation to identify demethylation-sensitive sites. The study by Dominissini et al., on the other hand, relied on antibody enrichment of freshly purified mRNA to call peaks. While this study did utilize the Dimroth rearrangement, which converts m1A to m6A, to reverse the stops and mutations introduced by reverse transcriptase upon encountering an m1A site, the rearrangement signature was not a requirement for calling an m1A peak. While its function in mRNA remains unclear, it is dynamic in response to nutrient starvation and heat shock, it has a characteristic localization on the 5′ end of transcripts, and it positively correlates with protein level. These features are all suggestive of a role in translation that has yet to be uncovered. 
While m1A tends to appear in relatively more structured and higher GC-content regions, NMR studies recently demonstrated its propensity to locally disrupt duplex formation, suggesting that structural changes in m1A-containing 5′UTRs may influence translation.21 This is consistent with the fact that m1A carries a positive charge at physiological pH, and that m1A sites have no clearly identifiable sequence motif.19,20
The most abundant RNA modification in cells overall, pseudouridine (Ψ), is generated through enzyme-mediated isomerization of uridine. On the order of 250 to 300 sites have now been mapped in yeast mRNA at single base resolution.22-24 In human, the reported number of mRNA Ψ sites varies widely from 96 in one study,23 to 353 in a second,22 and up to 2084 in a third study.25 This variation is likely the result of technical differences in experimental design and data analysis. While all three studies relied on the chemical reactivity of Ψ with N-cyclohexyl-N′-β-(4-methylmorpholinium) ethylcarbodiimide (CMC), Li et al. synthesized a biotinylated CMC derivative for an enrichment step, likely allowing them to identify low abundance sites that the other methods might miss. Though many of these sites have been attributed to specific enzyme activities, the biological function of Ψ has yet to be elucidated. In keeping with an emerging theme among mRNA modifications, however, Ψ is dynamic in heat shock, nutrient deprivation, and other stresses.22,23,25
While 5-methylcytidine (m5C) is best known for its central role as an epigenetic mark in DNA, it was also described in early studies of RNA methylation.3 Mapping of m5C in RNA was initially attempted by adapting the bisulfite sequencing technology used for m5C in DNA, leading to the identification of approximately 10,000 m5C sites in mRNA.26,27 Subsequently, new methods were developed that leverage m5C methyltransferases to enrich m5C-containing RNA, either by incorporating 5-azacytidine into RNA,28 or by mutating the methyltransferase29 to enable modified RNA capture. These methods identified far fewer m5C sites in mRNA, on the order of a few hundred to one thousand, respectively.30 It remains unclear whether this discrepancy is the result of a biological difference (since the latter two studies looked at only a single m5C methyltransferase, NSUN2), or if it is the result of technical differences. While its biological function remains unclear, some of the machinery that regulates m5C in different RNA classes has been identified: Trm4p (in yeast),31 and DNMT2 and NSUN2 (in human) have been identified as m5C methyltransferases.27,32 While the roles and specificities of these enzymes are controversial, their functional characterization will likely aid in uncovering the roles of m5C in the future.33-35 For instance, a recent study suggested functional roles for m5C oxidation to 5-hydroxymethylcytidine in D. melanogaster mRNA.33
Finally, while the majority of RNA modifications occur on the base moiety, methylation on the ribose 2′ hydroxyl to form 2′OMe has been found on all four ribonucleosides in many RNA classes. In eukaryotic rRNA, its installation is mediated by small nucleolar RNAs (snoRNAs) through a guide-RNA mediated process.36 Though its presence in mRNA was first described at the same time as m6A in 1974 (Fig. 1),1,2 its distribution in mRNA has not yet been reported. Recently, however, new methods have been developed to map 2′OMe utilizing either chemical strategies,37 or a methylation-sensitive reverse transcriptase.38 Thus far these approaches have identified 2′OMe sites only in abundant RNA species, but future technical advances may eventually reveal the 2′OMe distribution in mRNA.
N6-methylation of mRNA is thought to be a cotranscriptional process, leaving open the possibility that other RNA Polymerase II products also contain m6A. Indeed, a search for over-represented sequence motifs in miRNA-containing regions revealed that the primary consensus sequence for the m6A methyltransferase METTL3 is prevalent in these regions.39 METTL3 depletion and overexpression resulted in diminished and increased mature miRNA levels, respectively. This effect was recapitulated in in vitro processing reactions; methylated pre-let-7e was processed more efficiently than its unmethylated counterpart, indicating that methylation status influences processing.
The lncRNA XIST is also m6A-methylated, carrying a number of m6A residues that appear to be critical for its gene silencing activity.40 XIST is m6A-methylated by METTL3, which is recruited to XIST through the proteins RBM15 and RBM15B. In an inducible XIST expression system, loss of METTL3 allows for induction of XIST, but its gene silencing function is impaired. However, loss of m6A on XIST can be rescued by tethering the m6A-binding protein YTHDC1 directly to XIST, suggesting that this is another case in which the recruitment of a specific m6A-binding protein mediates the effect of m6A on a central cellular process.
While these are only two recent examples, they highlight how RNA modification status can influence both RNA processing and function through common machinery. In the context of mRNA, the m6A-binding protein YTHDC1 has been implicated in splicing,15 so it could represent another case in which an m6A-binding protein coordinates different cellular processes in different contexts.
The RNA species discussed in the previous sections are primarily products of RNA Polymerase II. Upon viral infection, host cell machinery is often hijacked to enable viral infection and replication, at the expense of normal cellular function. Late in their life cycle, for instance, retroviruses use RNA Polymerase II to transcribe viral DNA.41 Some RNA modifications, such as m6A, are thought to be installed cotranscriptionally, perhaps as one of the many processes coordinated by the RNA Polymerase II C-terminal repeat domain. The presence of m6A in viral RNA has, indeed, been known for decades, having been observed in B77 avian sarcoma virus, Rous sarcoma virus, simian virus 40, influenza, and adenoviruses in the 1970s.42-46 However, akin to the case with mRNA, the functions and regulation of RNA modifications in host-virus interactions have only recently begun to be unveiled. HIV-1 is an illustrative example, as it exploits multiple RNA modifications to its advantage. In order to replicate, HIV-1 reverse transcriptase (RT) must synthesize a DNA intermediate using tRNALys3 as a primer for minus-strand strong-stop synthesis. During plus-strand strong-stop synthesis, however, N1-methylation at adenosine 58 (A58) in tRNALys3 is required to properly terminate DNA synthesis.47,48 Replacement of A58 with U, which is not methylated to produce the block, allows HIV-1 RT to read past this point and inhibits HIV-1 replication.
A trio of recent studies has now revealed that HIV-1 viral RNA itself is also m6A-methylated, and that proteins involved in m6A regulation influence viral infection.49-51 Though each study took a different approach and, in some cases, reached different conclusions, they do converge in many respects. They find that HIV-1 RNA carries multiple m6A peaks that appear to be important in infection, and that the cellular host responds with infection-specific methylation of multiple transcripts potentially involved in the response to viral infection. Knockdown of the METTL3/METTL14 methyltransferase complex inhibits replication and viral release. Lichinchi et al. demonstrate that methylation of a specific site in the Rev response element (RRE) of the env gene increases its affinity for Rev, facilitating Rev's export function. Reducing methylation at this site by knocking down METTL3/METTL14 or by mutating the site substantially inhibits viral RNA export into the cytoplasm, an essential step in the HIV-1 replication cycle.50 Interestingly, Tirumuru et al. report a binding site for the m6A-binding protein YTHDF1 in this region. The proteins YTHDF1, YTHDF2 and YTHDF3, all established m6A-binding proteins, appear to negatively regulate HIV-1 viral reverse transcription and mRNA transcription in this study.51 It remains to be seen whether the YTH proteins play a role in the Rev-mediated export model described by Lichinchi et al. or whether they represent an independent effect. In the third study, however, the opposite effect was observed with respect to the YTHDF proteins, HIV-1 mRNA expression, and viral replication.49 The specific m6A sites reported in the three studies also differ, perhaps reflecting different cell lines and viral clones used in each study. Ongoing work will clarify the effect of RNA modifications on HIV-1 and explore their effects on other viruses.
Of all major RNA classes, tRNAs carry the largest complement of chemical modifications. Each molecule carries, on average, 14 modifications that contribute to its function.18,52 While a detailed description of all the known modifications on tRNA, their respective functions, and their regulatory machineries is beyond the scope of this overview, decades of work have revealed some interesting paradigms by which modifications influence tRNA function.
In some cases, tRNA modifications directly influence folding, and thereby, function. A classic example is m1A58, a highly conserved modification in tRNAs from all three domains of life. Detailed studies in S. cerevisiae reveal that it is present in 23 out of 34 tRNAs and is installed by the Trm6p/Trm61p methyltransferase complex.53 Disruption of Trm6p/Trm61p enzymatic activity results in a slow growth phenotype, driven largely by the degradation of unmethylated tRNAiMet. The situation in human is strikingly similar, and the human homologs TRMT6 and TRMT61A show 20% and 30% similarity to the yeast enzymes at the amino acid level, respectively.17,54 Supporting this conserved role, ALKBH1 was recently reported to demethylate m1A58 in human tRNAiMet as well as several other tRNA species.55 Loss of ALKBH1 increases the cellular tRNAiMet level, corroborating the idea that m1A58 methylation is essential for tRNAiMet stability. ALKBH1-mediated tRNA demethylation also attenuates the rate of protein synthesis. While m1A in tRNAiMet represents one of the few cases where a single tRNA modification is absolutely essential, there are many other cases where loss of one or more modifications results in misfolding and subsequent degradation.56,57
Since protein production relies on the availability of tRNA pools, tRNA maturation and processing are key factors in translation regulation. tRNA maturation is a complex process that involves multiple changes in subcellular localization, throughout which modifications are added and can act as markers of maturation. In the anti-codon loop, modified nucleosides can affect protein translation directly, both as a fundamental quality control mechanism and in response to cellular signaling or stress. Chemical modifications in the anti-codon stem loop region influence translation speed and fidelity by altering codon-anti-codon interactions and by controlling reading frame.58-61 In S. cerevisiae, Trm9p-mediated methylation to form 5-methylcarboxymethyluridine at the wobble positions of tRNAARG(UCU) and tRNAGLU(UUC) enhances translation of transcripts with higher levels of these codons upon DNA damage.62 In this case, the higher levels of these codons in ribonucleotide reductase 1 and 3 (Rnr1p and Rnr3p) allow for enhanced production of these key players in the DNA damage response. Thus the ability of modifications to influence translation directly can be implemented to bias the cell toward efficiently translating proteins required for a robust cellular response.
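The codon bias argument above can be illustrated with a short calculation. The sketch below counts how often a coding sequence uses the two codons read by the wobble-modified tRNAs named in the text: anticodon UCU pairs with the arginine codon AGA, and anticodon UUC pairs with the glutamate codon GAA. The function names and the toy sequence are hypothetical and serve only to show the arithmetic; this is not the analysis performed in the cited study.

```python
from collections import Counter

# Codons decoded by tRNA-Arg(UCU) and tRNA-Glu(UUC), written in the DNA alphabet.
TARGET_CODONS = frozenset({"AGA", "GAA"})


def codon_counts(cds):
    """Split an in-frame coding sequence into codons and count each one."""
    cds = cds.upper().replace("U", "T")  # accept RNA or DNA input
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return Counter(cds[i:i + 3] for i in range(0, len(cds), 3))


def target_codon_fraction(cds, targets=TARGET_CODONS):
    """Return the fraction of codons in cds that belong to the target set."""
    counts = codon_counts(cds)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(counts[codon] for codon in targets) / total


# Toy sequence (not a real gene): ATG AGA GAA AGA TTT TAA
toy_cds = "ATGAGAGAAAGATTTTAA"
fraction = target_codon_fraction(toy_cds)
print(f"{fraction:.2f}")
```

Comparing this fraction between a transcript of interest and the rest of the transcriptome is one simple way to ask whether that transcript is enriched in codons whose decoding depends on a given tRNA modification.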
In addition to regulating the translation of specific mRNAs during stress responses, tRNAs can also be cleaved into second messenger signaling molecules. This process is also influenced by modifications: yeast Trm9p, and mammalian NSUN2 and DNMT2 have all been implicated in regulating cleavage through their methyltransferase activity.32,63-65 The resulting fragments have been demonstrated to bind and inhibit translation machinery,66 as well as bind and destabilize mRNAs directly.67 In mouse epidermal stem cells, the expression level of the RNA m5C methyltransferase NSUN2 is low but increases over the course of differentiation. This occurs with a concomitant increase in protein translation that is, counterintuitively, not correlated with increased proliferation. Deletion of NSUN2 results in tRNA hypomethylation, accumulation of tRNA fragments, and reduced translation, favoring the stem cell state.67 By this mechanism, loss of NSUN2 enhances the self-renewal of stem cells in a tumor setting as well, but it renders them sensitive to stress because they are unable to activate stress response pathways that require tRNA methylation. How tRNA fragments influence cell biology more broadly remains to be worked out. However, the fact that their accumulation has already been connected to neuronal loss,65 cancer progression,67 and stem cell differentiation68 suggests that this may represent a general mechanism by which tRNA modifications can influence cell signaling processes in both normal and disease states.
As more modification machinery is characterized and as sequencing technology continues to improve, the contributions of individual modifications and of groups of modifications will continue to be uncovered.69,70 Here we have laid out three mechanisms by which modifications influence tRNA biology (through effects on folding, by regulating mRNA decoding, and by influencing tRNA cleavage into potential signaling messengers), but there are likely many more. Given the evolutionary conservation of these modifications and the 1–2% of the genome that generates the proteins devoted to this task,71 we have likely only scratched the surface of tRNA biology and how it is influenced by the broad diversity of chemical modifications.
In contrast to tRNA, rRNA carries a more limited repertoire of modifications. Given the ribosome's central role in all domains of life, some have speculated that perhaps these specific modifications were strongly selected for throughout evolution.71 Indeed, they tend to occur on highly conserved residues in areas of the ribosome central to its function. The majority of these modifications are Ψ and ribose 2′OMe, but m1A, N6,N6-dimethyladenosine (m6,6A), m5C, 3-methyluridine (m3U), N7-methylguanosine (m7G), and N4-acetylcytidine (ac4C), among others, have also been found in eukaryotic rRNAs.18 Despite the smaller variety, rRNA is heavily modified, with human rRNAs estimated to carry 94 2′OMe and 95 Ψ.
Mapping rRNA modifications onto the structure of the yeast ribosome has yielded hints as to their function in this complex machinery. In the small subunit, two m6,6A and an ac4C surround the decoding site where tRNAs read out mRNA sequence, while in the large subunit, an m1A, m5C, and m3U sit near the catalytic peptidyl transferase center.36 This structural information also suggests that intersubunit contacts involving rRNA modifications and ribosomal proteins may be involved in the complex motion required for successful mRNA translation.
While many RNA modifications occur in highly conserved areas, there is heterogeneity at some sites that alters ribosome composition. With the development of RiboMeth-seq, we can now estimate the stoichiometry of 2′OMe in rRNA in S. cerevisiae.37 While the majority of sites are fully methylated, a small proportion (8 out of 54) is methylated at less than 90%. Knockdown of specific guide RNAs unexpectedly changes the level of not only the modification specified by that guide, but of others as well, implying interdependent and potentially co-regulated processes. Hypermethylation of at least one 28S rRNA site in an aggressive breast cancer cell line is correlated with lower IRES-dependent translation and dysregulated codon-anti-codon recognition.73 Similarly, loss of the pseudouridine synthase Cbf5p in yeast also results in decreased tRNA binding, loss of translational fidelity, and reduction in IRES-mediated translation, phenotypes that are also evident in mouse and human cells upon loss of DKC1.74,75
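As a rough illustration of how per-site stoichiometry estimates separate fully from partially methylated positions, the sketch below applies the 90% cutoff mentioned above to a set of per-site methylation fractions. The site names and values are invented for illustration, and the function is a hypothetical helper; it does not reproduce the RiboMeth-seq scoring scheme itself, only the thresholding step applied to its output.

```python
def partially_modified(site_fractions, threshold=0.9):
    """Return the sites whose estimated methylation fraction falls below the threshold."""
    return {site: frac for site, frac in site_fractions.items() if frac < threshold}


# Hypothetical per-site methylation fractions; these are not measured values.
fractions = {"site_1": 0.98, "site_2": 0.72, "site_3": 0.95, "site_4": 0.41}

partial = partially_modified(fractions)
share = len(partial) / len(fractions)

# Share of partially methylated sites implied by the figures quoted in the text (8 of 54).
reported_share = 8 / 54

print(sorted(partial))
print(f"{share:.2f} of toy sites, {reported_share:.3f} of reported sites")
```

By the numbers quoted in the text, roughly 15% of mapped yeast rRNA 2′OMe sites fall into the partially methylated group.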
While there is mounting evidence that chemical modifications in rRNA are crucial to ribosome biogenesis and to maintenance of accurate and efficient protein synthesis, not the least of which is their strong conservation throughout evolution, many questions remain as to their mechanisms of action. As more details regarding rRNA modifications and the cellular machinery that controls them are revealed, however, we must take care in attributing function to the modifications themselves versus the enzymes that install them. It has now been well established both in yeast and human systems that, in some cases, enzymes bind to rRNA early in biogenesis, but do not actually catalyze a reaction until a later phase of maturation.76,77 Similarly, in some cases enzymes modify RNAs in multiple classes,78 so loss of function phenotypes must be analyzed with care. Thus it is crucial that, as researchers delve deeper into the functions of modifications in all RNA classes, the roles of the enzymes and the modifications themselves are disentangled.79
RNA modifications have been a fascinating mystery since they were discovered decades ago, and despite extensive progress in characterizing their distributions and functional roles, we are only scratching the surface of how they influence cell biology. In this overview, we have tried to highlight recent discoveries that illustrate themes in the emerging biology of RNA modifications. While modifications have long been studied in the context of tRNA and rRNA, over the last 5–6 years there has been a renewed focus on mRNA modifications. The most well characterized, m6A, has now been shown to play a role in mRNA decay,9 translation,11,13 splicing,5,15 alternative polyadenylation,80 the circadian clock,81 stem cell differentiation,82 and heat shock response,12 among others. The transcriptome-wide distributions of other mRNA modifications such as Ψ, m1A, and m5C have recently been mapped, and we will likely see more in the near future. While the biological functions of these more recently characterized modifications remain unclear, a unifying theme among them all is that their levels and/or distribution change in response to stresses and nutrient deprivation. Moreover, tRNA and rRNA modifications change in response to stress as well, suggesting that RNA modifications may represent a rapid way to regulate protein translation in response to cellular signals. There are now reported examples where the protein machinery involved in regulating RNA modification levels affects multiple classes of RNA, so it is possible that this crosstalk could help to coordinate mRNA, tRNA and rRNA function simultaneously.78 It has also become clear that depending on the context of the modification and its localization, for instance in the 5′UTR or 3′UTR of a transcript, the same protein can coordinate with different cellular machinery and result in very different outcomes.10,11
While substantial progress has been made in mapping RNA modification sites with high throughput sequencing-based approaches, some major challenges remain. First, in the majority of cases, even if single base resolution can be achieved, we cannot determine the absolute stoichiometry of modification at a given site. Second, while there is overlap in transcripts reported with multiple different modifications, we cannot dissect whether different pools of the same transcript are differentially modified, let alone parse out how combinations of modifications influence function. Third, current approaches all require a relatively large amount of input material, preventing studies of rare cell populations and clinical samples. The future of the field will likely involve progress on two fronts: on the one hand, developing techniques that will allow us to map modifications at single base resolution with stoichiometric information, and on the other, uncovering the mechanisms by which modifications regulate additional biological processes.
The authors declare no competing interests.
The authors apologize to colleagues whose work was not cited owing to space limitations. This work was supported by National Institutes of Health HG008688 and GM071440 (C.H.). C.H. is an investigator of the Howard Hughes Medical Institute (HHMI). S.N. is an HHMI fellow of the Damon Runyon Cancer Research Foundation (DRG-2215–15).