Ribonucleic acid (RNA) editing is a mechanism that generates RNA and protein diversity that is not directly encoded in the genome. The most common type of RNA editing in the higher eukaryotes is the conversion of adenosine to inosine in double-stranded RNA. This editing is carried out by the family of adenosine deaminase acting on RNA (ADAR) proteins. The most-studied substrates of ADAR proteins undergo editing which is very consistent, highly conserved, and functionally important. However, editing causes changes in protein-coding regions only at a small proportion of all editing sites. The vast majority of editing sites are in noncoding sequences. This includes microRNAs, as well as the introns and 3′ untranslated regions of messenger RNAs, which play important roles in the RNA-mediated regulation of gene expression.
After transcription of a eukaryotic ribonucleic acid (RNA) molecule from deoxyribonucleic acid (DNA), the newly formed transcript undergoes a number of modifications that impact the final RNA or protein product. RNA splicing, similar to the cut-and-paste operation of a word processor, creates large-scale rearrangements of the original RNA message. Single-nucleotide editing processes, similar to a search-and-replace operation, selectively insert and delete single nucleotides, or convert one nucleotide to another. These processes are mechanisms for generating a diverse set of RNA and protein products from a limited number of DNA genes. With the realization that the human genome contains not more than 25,000 genes,1 it is clear that the complexity of the higher eukaryotes must come from such diversification phenomena, and not from differences in gene number. The first RNA editing process discovered in mammals was the deamination of cytidine (C) by APOBEC proteins to form uracil (U). However, the most prevalent type of RNA editing in the higher eukaryotes is the deamination of adenosine (A) to form inosine (I) (Figure 1(a)) in double-stranded (ds) RNA. This reaction is catalyzed by the family of adenosine deaminases acting on RNA (ADARs).2–4
The effects of this modification are twofold. A-to-I editing changes the informational content of the RNA molecule, as inosine preferentially base pairs with cytidine (Figure 1(b)), and is therefore interpreted as guanosine (G) by the translational and splicing machinery. The three-dimensional structure of the dsRNA, which determines its interactions with RNA-binding proteins, is also altered by the addition or removal of bulges formed by mismatched base pairs. Editing efficiency (the extent of conversion from A to I) varies depending on substrate, developmental timing, and location, allowing mixed populations of products to exist, and for these populations to change in response to changing conditions. Editing can affect anywhere from 0% to 100% of an RNA population, whereas a variation in the genome fixes the heterogeneity at exactly 50%, or none at all. It was initially thought that the only function of ADARs was to modify protein-coding genes. Recent discoveries have overthrown this view, by revealing a large number of editing sites in noncoding sequences5–7 and numerous regulatory interactions with the RNA interference (RNAi) and microRNA (miRNA) biogenesis pathways.8
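Editing efficiency at a single site can be expressed as the fraction of transcripts in which the adenosine has been converted. The Python sketch below assumes that the measurement comes from cDNA sequencing, where inosine is read as G, so that efficiency is estimated as G reads over A plus G reads. The function name and the read counts are hypothetical and are not taken from any study cited here.

```python
def editing_level(a_reads: int, g_reads: int) -> float:
    """Estimate editing efficiency at one site as G / (A + G).

    Assumes reads showing G at a genomically encoded A come from
    edited (inosine-containing) transcripts.
    """
    total = a_reads + g_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return g_reads / total


# Hypothetical read counts for a single editing site.
level = editing_level(a_reads=30, g_reads=70)
print(f"{level:.0%} of transcripts edited")
```

Because the result is a continuous value between 0 and 1, it captures the mixed populations of edited and unedited transcripts described above, which a fixed genomic variant cannot.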
The A-to-I editing reaction is catalyzed by the family of ADAR enzymes. These proteins, which are conserved across many eukaryotes, contain a C-terminal catalytic domain, as well as several dsRNA-binding domains (Figure 2). There are three vertebrate ADAR genes, which give rise to several ADAR proteins. The ADAR1 protein has long (p150) and short (p110) isoforms, which result from the use of alternative promoters and start codons. ADAR2 and ADAR1p110 are mainly present in the nucleus, whereas ADAR1p150, driven by an interferon-inducible promoter, is present in both nucleus and cytoplasm, and is upregulated upon cellular stress or viral infection.9 ADAR1 and ADAR2 are expressed in most tissues, whereas ADAR3 is only found in the central nervous system. While ADAR1 and ADAR2 must form homodimers for activity, ADAR3 remains in monomeric form.10 All known editing sites have been attributed to ADAR1 or ADAR2 activity, and ADAR3 shows no deaminase activity in vitro, possibly due to its inability to dimerize, leaving its function unknown.11
ADARs interact with their dsRNA substrates through dsRNA-binding domains (dsRBDs). The structural features of the RNA helix bury sequence-specific functional groups deep inside the molecule, making it highly unlikely that ADAR proteins bind to their substrates in a sequence-specific manner.14 Instead, editing activity appears to be determined by the stability of the substrate dsRNA. This is supported by the highly efficient and random editing of adenosines in perfectly complementary long dsRNA molecules15,16 (Figure 4(f)). The editing process tends to destabilize the substrate RNA,5 and likely proceeds until the enzyme can no longer recognize the substrate as dsRNA. DsRNA structures that contain bulges and loops are edited at specific adenosines17 because it takes only a few deaminations to reduce their stability below the dsRNA threshold recognized by ADARs. Substrate specificity differs between the functional forms of ADAR,18–20 possibly due to the different number and spacing of dsRBDs, which may allow discrimination between different dsRNA structures and stabilities (see Figure 2).
The results of in vitro studies, and the sequences of known in vivo substrates have demonstrated preferences for certain flanking nucleotides at editing sites.15,21 As mentioned earlier, such sequence-specific interactions are unlikely to be mediated by the dsRNA-binding domains. These preferences might be explained by the proposed catalytic model for the deaminase domain, in which the target adenosine is flipped from the inside of the helix into the enzyme's active site,15 perhaps allowing for sequence-specific interactions with neighboring nucleotides. This suggests that the deaminase domain confers some additional specificity to editing site selection based on the immediate sequence context of the site.
Before splicing of eukaryotic transcripts, stem-loop structures are often formed between intronic and exonic regions of the pre-mRNA (messenger RNA). These structures can be edited by ADARs and the resulting inosine is interpreted as guanosine by the ribosome, because of its Watson–Crick base pairing with cytidine (see Figure 1). Editing in coding regions may result in amino acid substitutions (see Figure 3(a)). The canonical examples of this phenomenon are the mRNAs for glutamate receptor (GluR) and serotonin receptor subtype 2C (5-HT2CR), which contain several specific editing sites, resulting in many protein isoforms with varying functionality.22,23 The importance of editing at these sites is underscored by the phenotype of ADAR2 knockout mice. These mice have reduced editing levels at the Q/R site of GluR-B, which is normally 100% edited in vivo. The mice are seizure-prone, and die within 3 weeks of birth.24 In humans, defects in editing have been linked to a skin pigmentation disorder (dyschromatosis symmetrica hereditaria) and a number of neurological diseases including epilepsy, schizophrenia, and amyotrophic lateral sclerosis.25
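The recoding at the GluR-B Q/R site can be illustrated with a short Python sketch. It assumes the commonly described codon change at this site, in which the glutamine codon CAG is edited to CIG and read as the arginine codon CGG. The codon table is deliberately partial, and the function names are hypothetical.

```python
# Deliberately partial codon table: only the two codons needed here.
CODON_TABLE = {"CAG": "Gln (Q)", "CGG": "Arg (R)"}


def deaminate(rna: str, index: int) -> str:
    """Convert the adenosine at `index` (0-based) to inosine, written as I."""
    if rna[index] != "A":
        raise ValueError("ADARs deaminate adenosine only")
    return rna[:index] + "I" + rna[index + 1:]


def as_read_by_ribosome(rna: str) -> str:
    """Inosine pairs with cytidine, so it is decoded as guanosine."""
    return rna.replace("I", "G")


codon = "CAG"
edited = deaminate(codon, 1)
decoded = as_read_by_ribosome(edited)
print(codon, "->", edited, "read as", decoded)
print(CODON_TABLE[codon], "->", CODON_TABLE[decoded])
```

A single deamination in the middle position of the codon is enough to change the encoded amino acid, which is why one editing site can alter receptor function.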
Intronic regions of coding sequence are also subject to editing, which can have profound effects on splicing. Editing has the potential to modify 5′ and 3′ splice sites, or to eliminate splicing altogether via modification of splice site donor or acceptor sequences3,26 (see Figure 3(b)). The most clear-cut case is the self-regulation of ADAR2, which edits its own mRNA, leading to generation of an alternative splice acceptor site. Translation of this alternatively spliced mRNA results in production of truncated protein lacking dsRBDs and the deaminase domain, due to a frameshift.26
Editing may also have a subtle effect on splicing kinetics, by destabilizing intronic RNA duplexes and thereby providing the splicing machinery with better access to the RNA. This model is supported by the preferential splicing of edited transcripts in the brains of ADAR2 knockout mice.24 In accordance with this, it has been proposed that the Z-DNA-binding domains of ADAR1p150 (see Figure 2) serve to localize it to sites of active transcription, where the Z-conformation of DNA is stabilized by the supercoiling behind the active polymerase.27 Thus ADAR1 has a chance to act on the pre-mRNA before splicing occurs.12
Unlike a mutation in the DNA, A-to-I editing usually leaves some proportion of the original transcript, allowing for the creation of new functions without destroying older functions that may be necessary for survival. After a genomic G-to-A mutation, the A nucleotide may become an editing site, thereby lessening the effect of the mutation.2 The reverse could also occur, where selective pressure increases the editing frequency at a site over time, easing the transition to a genomically encoded G.28 Editing of the GluR-B Q/R site supports both scenarios. This is seen in the case of ADAR2 knockout mice, whose phenotype can be rescued by creating a genomic A-to-G mutation at the GluR-B Q/R site.24 This site is completely edited in normal mice, and may have evolved as a mutation corrected by editing. This shows that, at least in some cases, the function of RNA editing can be performed equally well by a change in the genomic sequence of the substrate. A-to-I and C-to-U editing are the most common forms of RNA editing in animals, which may explain the prevalence of A/G and C/T variants in related genomes.28
Despite their important functions, the ~30 known protein-coding targets of A-to-I editing do not account for the large amount of inosine detected in mammalian RNAs.29 A combination of biochemical and bioinformatics studies has identified more than 12,000 new editing sites, most of which are located in noncoding regions such as introns, 5′ and 3′ untranslated regions (UTRs), and repetitive sequences such as human Alu elements.5–7 The wide variety of functions performed by noncoding RNAs, many of which are double-stranded, suggests that ADARs may play regulatory roles in many processes.
RNAi is the process by which ~22 bp dsRNAs called small interfering RNAs (siRNAs) direct the degradation of homologous mRNA transcripts.30,31 In addition to being an important viral defense and gene regulatory mechanism in the cell, RNAi has become an immensely powerful and widely used experimental tool. RNAi can be induced by long dsRNA, which is cleaved by the ribonuclease Dicer into siRNAs (see Figure 4(f) and (g)), which direct destruction of the original transcript and any other identical sequences.32 Long dsRNA is also a preferred substrate for ADARs, which can nonspecifically edit more than 50% of adenosines in such molecules15,16 (see Figure 4(f)). Such extensive editing has been shown in vitro to suppress dicing of these RNAs, and thus to suppress RNAi induced by them.33 In addition to competing with Dicer for long dsRNA substrates, ADAR1p150 has been shown to bind tightly to siRNAs, decreasing their effective concentration in the cytoplasm, thereby reducing the efficiency of RNAi13 (see Figure 4(g)). Strains of Caenorhabditis elegans in which active ADAR genes have been deleted show defects in chemotaxis (the ability to seek out or avoid certain substances). These defects are eliminated in strains which also have defects in RNAi machinery.34 This suggests that the chemotaxis phenotype is a result of hyperactivity in an RNAi pathway which is normally inhibited by ADARs.
MiRNAs are small dsRNAs encoded by eukaryotic genomes, which regulate gene expression via an RNAi-like pathway. They need not be perfectly complementary to their targets, so a single miRNA can affect a large number of mRNAs. The miRNAs are transcribed into long primary transcripts called primary miRNAs (pri-miRNAs), which form a stem-loop secondary structure. The pri-miRNAs are cleaved by the nuclear Drosha-DGCR8 complex to yield ~60–70 nt miRNA precursors (pre-miRNAs), which are transported into the cytoplasm by Exportin5-Ran-GTP. The pre-miRNA is then cleaved by the cytoplasmic Dicer-TRBP complex into mature miRNAs: dsRNAs ~22 bp in length, with 2-nucleotide 3′ overhangs. The miRNA is then loaded into the RNA-induced silencing complex (RISC) which selects the active miRNA strand. The miRNA guides RISC to its target site, usually in the 3′ UTR of an mRNA, resulting in translational repression or mRNA degradation.30,40
The dsRNA regions of pri- and pre-miRNAs allow ADARs to interact with the miRNA biogenesis pathway. A-to-I editing has been detected in numerous endogenous pri-miRNAs,41,42 and there is in vitro evidence for editing at the pre-miRNA stage.36 These editing events can affect miRNA processing by inhibiting the Drosha35 or Dicer36 cleavage steps (see Figure 4(a) and (b)), thereby reducing levels of the mature miRNA. Since the transcription of miRNAs is often controlled by the promoters of other genes,40 modulation of processing is thought to be the main method of regulating miRNA levels. If processing is not affected by these editing events, then mature miRNAs with A-to-I substitutions are expressed37,43 (see Figure 4(c)). These miRNAs can silence a set of genes different from that of their unedited counterpart (see Figure 4(d)). In the case of miRNA 376a, the edited form of the mature miRNA gains the ability to downregulate the mRNA of the PRPS1 gene.37 PRPS1 is involved in purine metabolism, whose end product is uric acid. Its overexpression in humans is associated with buildup of purines and uric acid, resulting in gout and in some cases neurodevelopmental impairment.44 Mice lacking ADAR2, the responsible editing enzyme, have increased levels of PRPS1 protein and uric acid in their brains compared to wild-type mice. The effect of uric acid on the brain is unclear, but some studies have shown reduced levels in patients with multiple sclerosis, suggesting that it may have a neuroprotective effect.45
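How a single A-to-I change can redirect a miRNA to a different set of targets can be sketched in Python. The example assumes the widely used seed model, in which nucleotides 2 to 8 of the miRNA pair with a complementary site in the 3′ UTR; this model is an assumption brought in for illustration and is not described in the text above. All sequences are invented toy sequences, not miR-376a or PRPS1.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna: str) -> str:
    """Return the mRNA site complementary to miRNA positions 2-8.

    Inosine (I) pairs with cytidine, so it is treated as G.
    """
    seed = mirna[1:8].replace("I", "G")
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def targets(mirna: str, utr: str) -> bool:
    """True if the 3' UTR contains a site matching the miRNA seed."""
    return seed_site(mirna) in utr


# Toy sequences (hypothetical).
unedited = "UACGAUUCAGGCAUUCGGAUCA"
edited = "UACGIUUCAGGCAUUCGGAUCA"   # A-to-I at position 5, inside the seed
utr_a = "CCAUGAAUCGUAAGC"           # matches the unedited seed
utr_b = "CCAUGAACCGUAAGC"           # matches the edited seed

for name, mirna in [("unedited", unedited), ("edited", edited)]:
    print(name, "targets utr_a:", targets(mirna, utr_a),
          "| targets utr_b:", targets(mirna, utr_b))
```

In this toy case the edited miRNA loses its original target and gains a new one, which mirrors the kind of retargeting reported for the edited form of miRNA 376a.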
In most cases, only one strand of the miRNA is thought to be active, while the other is degraded. Selection of the active miRNA strand is based on the thermodynamic stability of the 5′ ends of the miRNA,40 thus the structural changes caused by editing could change the effective strand,8 which is likely to drastically change the genes silenced by the miRNA (see Figure 4(e)). The 3′ UTRs of mRNAs are known to be frequently edited5–7 and genomic A-to-G changes in 3′ UTRs have been shown to create new miRNA target sites,46 so it is possible that editing of 3′ UTRs might enhance or suppress silencing of specific mRNAs.47 Similarly, editing may destabilize secondary structure in 3′ UTRs, allowing RISC to access previously inaccessible target sites, and possibly preventing Drosha and Dicer from cleaving dsRNA regions of mRNAs. Through all of these mechanisms, A-to-I RNA editing has the potential to rapidly change gene expression levels in response to stimuli.
Many viruses contain RNA genomes, or replicate their genomes through an RNA intermediate, which is often double-stranded in form. Such long dsRNAs are rarely produced endogenously by the cell,48 and are quickly recognized by many cellular systems. ADARs can carry out hyperediting of such transcripts, potentially triggering their degradation by the inosine-specific nuclease Tudor-SN (tudor staphylococcal nuclease)38,39,49 (see Figure 4(f)). Viral RNAs extracted from the brains of measles patients show a large number of U-to-C and A-to-G conversions,50 consistent with hyperediting of both the sense and antisense viral genomes.51,52 Since the virus is cytoplasmic, this editing is likely to be carried out by interferon-inducible ADAR1p150, which may be expressed in response to viral infection. An interesting exception is the hepatitis delta virus, which takes advantage of the editing machinery to edit a stop codon (UAG) to a tryptophan codon (UGG), resulting in synthesis of a longer viral antigen. In other words, this virus hijacks the host's editing machinery to carry out an essential step in its life cycle.53 The recent discovery of interferon-inducible miRNAs that directly target viral RNAs54 raises the possibility of a cooperative interaction between miRNAs and RNA editing in the immune response. On the other hand, a liver miRNA (miR122) has been discovered that facilitates replication of hepatitis C virus.55 Although that miRNA is not known to be edited, an increase in cytoplasmic ADAR1 could result in editing of other such pre-miRNAs, thereby slowing viral replication.
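The reason that hyperediting of both strands produces A-to-G and U-to-C conversions can be shown with a small Python sketch. It assumes a toy model in which every adenosine is edited and every inosine is read as G, whereas real hyperediting affects only a fraction of adenosines. The sequence and function names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def revcomp(rna: str) -> str:
    """Reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def hyperedit(rna: str) -> str:
    """Toy hyperediting: every A becomes inosine, which is read as G."""
    return rna.replace("A", "G")


def mismatches(reference: str, observed: str) -> list[str]:
    """List the distinct substitution types between two aligned sequences."""
    return sorted({f"{r}-to-{o}" for r, o in zip(reference, observed) if r != o})


sense = "AUGACUUAGCA"  # hypothetical stretch of a viral sense strand

# Editing of the sense strand itself.
print(mismatches(sense, hyperedit(sense)))

# Editing of the antisense strand, viewed after copying back to sense.
via_antisense = revcomp(hyperedit(revcomp(sense)))
print(mismatches(sense, via_antisense))
```

An adenosine edited on the antisense strand templates a C instead of a U when the strand is copied, so the same enzymatic reaction appears as a U-to-C change when sequences are compared in the sense orientation.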
Even though editing sites have been discovered in a large number of RNA transcripts, the effects of A-to-I RNA editing remain undetermined in most cases. Editing sites in some coding targets show clear phenotypes when defective in model organisms, but they make up only a tiny fraction of all editing sites. ADAR1 knockout mice die as embryos because of widespread apoptosis.56,57 The cause of this phenotype is unknown. The newly discovered interactions between ADAR1 and noncoding RNA might be responsible, or the real cause may lie in an undiscovered function of ADAR1 or RNA editing in general.
Because of the high background of edited repetitive sequence in the transcriptome, the extent of editing in substrates such as coding regions and miRNAs is unclear, and editing may affect other classes of RNAs. It is possible that some of these interactions are accidental, due to the nonspecific nature of the dsRNA-binding domains of ADARs. In the case of the widespread editing of repetitive sequences, A-to-I editing may protect the genome against transposable elements7 by destabilizing RNA with repetitive sequences. However, it is quite possible that such editing events are a side effect of the huge amounts of repetitive sequence in the human genome. In the past decade, our knowledge of the crucial regulatory roles and potential clinical applications of RNA-dependent pathways has exploded. Determining the extent and physiological significance of RNA editing in such pathways is therefore of critical importance to those who study or use these technologies.