In plants, RNA editing is a process that converts specific cytidines to uridines and uridines to cytidines in transcripts from virtually all mitochondrial protein-coding genes. There are thousands of plant mitochondrial genes in the sequence databases, but sites of RNA editing have not been determined for most. Accurate methods of RNA editing site prediction will be important in filling in this information gap and could reduce or even eliminate the need for experimental determination of editing sites for many sequences. Because RNA editing tends to increase protein conservation across species by "correcting" codons that specify unconserved amino acids, this principle can be used to predict editing sites by identifying positions where an RNA editing event would increase the conservation of a protein to homologues from other plants. PREP-Mt takes this approach to predict editing sites for any protein-coding gene in plant mitochondria.
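The conservation principle described above can be illustrated with a minimal sketch. It is not the PREP-Mt implementation: it assumes the standard genetic code, an uppercase in-frame coding sequence, and a single ungapped reference protein standing in for the alignment of homologues. The names `predict_edit_sites` and `reference_protein` are hypothetical.

```python
# Toy illustration of conservation-based C-to-U site prediction.
# Assumptions (not from the source): standard genetic code, in-frame
# uppercase DNA (A/C/G/T only), one ungapped reference protein.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def predict_edit_sites(cds, reference_protein):
    """Return 0-based positions of Cs whose C-to-U change would make the
    encoded residue match the reference where it currently does not."""
    sites = []
    for pos, base in enumerate(cds):
        if base != "C":
            continue
        codon_index = pos // 3
        start = codon_index * 3
        codon = cds[start:start + 3]
        # Skip incomplete trailing codons and codons beyond the reference.
        if len(codon) < 3 or codon_index >= len(reference_protein):
            continue
        offset = pos - start
        edited = codon[:offset] + "T" + codon[offset + 1:]
        target = reference_protein[codon_index]
        if CODON_TABLE[codon] != target and CODON_TABLE[edited] == target:
            sites.append(pos)
    return sites


# Hypothetical 3-codon gene (Met-Pro-Ser) compared with a homologue "MLL".
print(predict_edit_sites("ATGCCATCA", "MLL"))
```

In this example the second-position C of CCA and the second-position C of TCA are flagged, because editing either one yields a leucine codon that matches the reference. A real predictor would score candidates against many homologues and apply a cutoff rather than require an exact match.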
To test the general applicability of the PREP-Mt methodology, RNA editing sites were predicted for 370 full-length or nearly full-length DNA sequences and then compared to the known sites of RNA editing for these sequences. Of 60,263 cytidines in this test set, PREP-Mt correctly classified 58,994 as either an edited or unedited site (accuracy = 97.9%). PREP-Mt properly identified 3,038 of the 3,698 known sites of RNA editing (sensitivity = 82.2%) and 55,956 of the 56,565 known unedited sites (specificity = 98.9%). Accuracy and sensitivity increased to 98.7% and 94.7%, respectively, after excluding the 489 silent editing sites (which have no effect on protein sequence or function) from the test set.
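The reported accuracy, sensitivity, and specificity follow directly from the counts given above. The short sketch below recomputes them; the function name `classification_metrics` is an illustrative choice, and the false-negative and false-positive counts are derived by subtraction from the stated totals.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, and specificity from confusion counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Counts taken from the text: 3,038 of 3,698 edited sites identified,
# 55,956 of 56,565 unedited sites identified.
tp, known_edited = 3038, 3698
tn, known_unedited = 55956, 56565
metrics = classification_metrics(
    tp=tp, fn=known_edited - tp, tn=tn, fp=known_unedited - tn
)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```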
These results indicate that PREP-Mt is effective at identifying C to U RNA editing sites in plant mitochondrial protein-coding genes. Thus, PREP-Mt should be useful in predicting protein sequences for use in molecular, biochemical, and phylogenetic analyses. In addition, PREP-Mt could be used to determine functionality of a mitochondrial gene or to identify particular sequences with unusual editing properties. The PREP-Mt methodology should be applicable to any system where RNA editing increases protein conservation across species.
In flowering plants, mitochondrial and chloroplast mRNAs are edited by C-to-U base modification. In plant organelles, RNA editing appears to be generally a correcting mechanism that restores the proper function of the encoded product. Members of the Arabidopsis RNA editing-Interacting Protein (RIP) family have been recently shown to be essential components of the plant editing machinery. We report the use of a strand- and transcript-specific RNA-seq method (STS-PCRseq) to explore the effect of mutation or silencing of every RIP gene on plant organelle editing. We confirm RIP1 to be a major editing factor that controls the editing extent of 75% of the mitochondrial sites and 20% of the plastid C targets of editing. The quantitative nature of RNA sequencing allows the precise determination of overlapping effects of RIP factors on RNA editing. Over 85% of the sites under the influence of RIP3 and RIP8, two moderately important mitochondrial factors, are also controlled by RIP1. Previously uncharacterized RIP family members were found to have only a slight effect on RNA editing. The preferential location of editing sites controlled by RIP7 on some transcripts suggests an RNA metabolism function for this factor other than editing. In addition to a complete characterization of the RIP factors for their effect on RNA editing, our study highlights the potential of RNA-seq for studying plant organelle editing. Unlike previous attempts to use RNA-seq to analyze RNA editing extent, our methodology focuses on sequencing of organelle cDNAs corresponding to known transcripts. As a result, the depth of coverage of each editing site reaches unprecedented values, assuring a reliable measurement of editing extent and the detection of numerous new sites. This strategy can be applied to the study of RNA editing in any organism.
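As a rough illustration of how editing extent can be quantified from sequencing reads, the sketch below assumes that extent is measured as the fraction of reads showing T (the cDNA equivalent of U) at a genomically encoded C. This definition, the function name `editing_extent`, and all read counts are assumptions made for illustration, not details taken from the study.

```python
def editing_extent(c_reads, t_reads):
    """Fraction of reads carrying the edited base (T) at a C-to-U site."""
    depth = c_reads + t_reads
    if depth == 0:
        raise ValueError("no reads cover this site")
    return t_reads / depth


# Hypothetical read counts at one site in wild type and in a mutant.
wild_type = editing_extent(c_reads=200, t_reads=1800)
mutant = editing_extent(c_reads=1500, t_reads=500)
print(f"wild type: {wild_type:.2f}, mutant: {mutant:.2f}, "
      f"reduction: {wild_type - mutant:.2f}")
```

Deep coverage matters here because the uncertainty of such a ratio shrinks as the number of reads at the site grows.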
RNA editing is a co- or post-transcriptional RNA processing reaction that changes the nucleotide sequence of the RNA substrate. In flowering plants, mRNA editing is confined to organelle transcripts, altering cytidine to uridine. Recently, some members of a small Arabidopsis gene family were found to be important for editing of chloroplast and mitochondrial transcripts. Several methods have been developed to measure the amount of edited transcripts at specific Cs, but most of these methods either lack sensitivity or are unable to determine the number and location of edited Cs in a particular transcript. While sensitive assays have been previously developed, they are costly and labor-intensive, precluding their use on a large scale. In order to characterize the role of an entire gene family in RNA editing, we have successfully adapted RNA sequencing technology to characterize the effect of mutation and silencing of family members on organelle RNA editing. Our method to measure editing extent is sensitive, reliable, and cost-effective. As well as detecting additional family members that play a role in RNA editing, we have detected numerous new editing sites. Our strategy should benefit the investigation of RNA editing in any organism.
RNA editing is a posttranscriptional modification process that alters the RNA sequence so that it deviates from the genomic DNA sequence. RNA editing mainly occurs in chloroplast and mitochondrial genomes, and the number of editing sites varies in terrestrial plants. Why and how RNA editing systems evolved remains a mystery. Ginkgo biloba is one of the oldest seed plants and has an important evolutionary position. Determining the patterns and distribution of RNA editing in this ancient plant provides insights into the evolutionary trend of RNA editing and helps us to further understand its biological significance.
In this paper, we investigated 82 protein-coding genes in the chloroplast genome of G. biloba and identified 255 editing sites, which is the highest number of RNA editing events reported in a gymnosperm. All of the editing sites were C-to-U conversions, which mainly occurred in the second codon position, were biased towards the U_A context, and caused an increase in hydrophobic amino acids. RNA editing could change the secondary structures of 82 proteins, and create or eliminate a transmembrane region in five proteins as determined in silico. Finally, the evolutionary tendencies of RNA editing in different gene groups were estimated using the nonsynonymous-synonymous substitution rate selection mode.
The G. biloba chloroplast genome possesses the highest number of RNA editing events reported so far in a seed plant. Most of the RNA editing sites can restore amino acid conservation, increase hydrophobicity, and even influence protein structures. Similar purifying selections constitute the dominant evolutionary force at the editing sites of essential genes, such as the psa, some psb and pet groups, and a positive selection occurred in the editing sites of nonessential genes, such as most ndh and a few psb genes.
RNA editing; Posttranscriptional modification; Ginkgo biloba; Chloroplast genome; Protein structure
In plant organelles, specific messenger RNAs (mRNAs) are subjected to conversion editing, a process that often converts the first or second nucleotide of a codon and hence the encoded amino acid. No systematic patterns in converted sites were found on mRNAs, and the converted sites rarely encoded residues located at the active sites of proteins. The role and origin of RNA editing in plant organelles remain to be elucidated.
Here we study the relationship between amino acid residues encoded by edited codons and the structural characteristics of these residues within proteins, e.g., in protein-protein interfaces, elements of secondary structure, or protein structural cores. We find that the residues encoded by edited codons are significantly biased toward involvement in helices and protein structural cores. RNA editing can convert codons for hydrophilic to hydrophobic amino acids. Hence, only the edited form of an mRNA can be translated into a polypeptide with helix-preferring and core-forming residues at the appropriate positions, which is often required for a protein to form a functional three-dimensional (3D) structure.
We have performed a novel analysis of the location of residues affected by RNA editing in proteins in plant organelles. This study documents that RNA editing sites are often found in positions important for 3D structure formation. Without RNA editing, protein folding will not occur properly, thus affecting gene expression. We suggest that RNA editing may have conferred an evolutionary advantage by acting as a mechanism to reduce susceptibility to DNA damage, allowing an increase in GC content in DNA while maintaining the RNA codons essential to encode residues required for protein folding and activity.
RNA editing in land plant organelles is a process primarily involving the conversion of cytidine to uridine in pre-mRNAs. The process is required for gene expression in plant organelles, because this conversion alters the encoded amino acid residues and improves the sequence identity to homologous proteins. A recent study uncovered that proteins encoded in the nuclear genome are essential for editing site recognition in chloroplasts; the mechanisms by which this recognition occurs remain unclear. To understand these mechanisms, we determined the genomic and cDNA sequences of moss Takakia lepidozioides chloroplast genes, then computationally analyzed the sequences within −30 to +10 nucleotides of RNA editing sites (neighbor sequences) likely to be recognized by trans-factors. As the T. lepidozioides chloroplast has many RNA editing sites, the analysis of these sequences provides a unique opportunity to perform statistical analyses of chloroplast RNA editing sites. We divided the 302 obtained neighbor sequences into eight groups based on sequence similarity to identify group-specific patterns. The patterns were then applied to predict novel RNA editing sites in T. lepidozioides transcripts; ∼60% of these predicted sites are true editing sites. The success of this prediction algorithm suggests that the obtained patterns are indicative of key sites recognized by trans-factors around editing sites of T. lepidozioides chloroplast genes.
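Extracting a neighbor sequence of the kind described above can be sketched as follows. The sketch assumes 0-based indexing, a DNA-alphabet transcript string, and a window that includes the edited C itself (41 nucleotides in total); the function name `neighbor_sequence` is hypothetical, and the grouping and pattern-finding steps are not shown.

```python
def neighbor_sequence(transcript, site, upstream=30, downstream=10):
    """Return the window from -upstream to +downstream around an edited C.

    `site` is the 0-based index of the edited C. Returns None when the
    window would extend past either end of the transcript.
    """
    if transcript[site] != "C":
        raise ValueError("site does not point at a C")
    start = site - upstream
    end = site + downstream + 1  # +1 so the window includes position +downstream
    if start < 0 or end > len(transcript):
        return None
    return transcript[start:end]


# Hypothetical transcript: 30 nt upstream, the edited C, 10 nt downstream.
example = "A" * 30 + "C" + "G" * 10
window = neighbor_sequence(example, site=30)
print(len(window), window)
```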
bioinformatics; chloroplast; computational biology; plant organelle; singlet and doublet propensities; Takakia lepidozioides
Three nonsense codons and an unusual initiation codon were located within the putative coding region of the atpB gene of chloroplast DNA of the hornwort Anthoceros formosae. Nucleotide sequencing of cDNA prepared from transcripts revealed extensive RNA editing. The unusual initiation codon ACG was changed to AUG and three nonsense codons were converted into sense codons. In total 15 C residues of the genomic DNA were replaced by U residues in the mRNA sequences, while 14 U residues were replaced by C residues. This is the highest number of editing events for a chloroplast mRNA reported so far. Partial editing was also shown in a cDNA clone where 23 sites were edited but six sites remained unedited, representing the existence of premature mRNA. The expected two-dimensional structure of the mRNA shows the existence of a sequence complementary to every editing site, which can produce continuous base pairing longer than 5 bp, suggesting that mispairing in the double strand is the site determinant for RNA editing in Anthoceros chloroplasts. Comparison of the cDNA sequence with other chloroplast genes suggests that the mechanism arose in the first land plants and has been reduced during evolution.
In plant mitochondria, the post-transcriptional RNA editing process converts C to U at a number of specific sites of the mRNA sequence and usually restores phylogenetically conserved codons and the encoded amino acid residues. Sites undergoing RNA editing evolve at a higher rate than sites not modified by the process. As a result, editing sites strongly affect the evolution of plant mitochondrial genomes, representing an important source of sequence variability and potentially informative characters.
To date, no clear and convincing evidence has established whether or not editing sites really affect the topology of reconstructed phylogenetic trees. For this reason, we investigated the effect of RNA editing on the tree-building process using twenty different plant mitochondrial gene sequences and computer simulations.
Based on our simulation study, we suggest that the editing ‘noise’ in tree topology inference is mainly manifested at the cDNA level. In particular, editing sites tend to confuse tree topologies when the artificial genomic and cDNA sequences generated are shorter than 500 bp and have an editing percentage higher than 5.0%. Similar results have also been obtained with genuine plant mitochondrial genes. In this latter instance, the topology incongruence increases when the editing percentage goes up from about 3.0 to 14.0%. However, when the average gene length is higher than 1,000 bp (rps3, matR and atp1), no differences could be detected between the inferred genomic and cDNA topologies.
Our findings from the in silico and in vivo computer simulation system reported here strongly suggest that editing sites contribute to the generation of misleading phylogenetic trees if the analyzed mitochondrial gene sequence is highly edited (higher than 3.0%) and reduced in length (shorter than 500 bp).
Given the current lack of direct experimental evidence, the results presented here encourage the use of genomic mitochondrial rather than cDNA sequences for reconstructing phylogenetic events in land plants.
RNA editing is a transcript-based layer of gene regulation. To date, no systematic study on RNA editing of plant nuclear genes has been reported. Here, a transcriptome-wide search for editing sites in nuclear transcripts of Arabidopsis (Arabidopsis thaliana) was performed.
MPSS (massively parallel signature sequencing) and PARE (parallel analysis of RNA ends) data retrieved from public databases were utilized, focusing on one-base-conversion editing. Besides cytidine (C)-to-uridine (U) editing in mitochondrial transcripts, many nuclear transcripts were found to be diversely edited. Interestingly, a sizable portion of these nuclear genes are involved in chloroplast- or mitochondrion-related functions, and many editing events are tissue-specific. Some editing sites, such as adenosine (A)-to-U editing loci, were found to be surrounded by peculiar elements. The editing events of some nuclear transcripts are highly enriched surrounding the borders between coding sequences (CDSs) and 3′ untranslated regions (UTRs), suggesting site-specific editing. Furthermore, RNA editing is potentially implicated in new start or stop codon generation, and may affect alternative splicing of certain protein-coding transcripts. RNA editing in the precursor microRNAs (pre-miRNAs) of ath-miR854 family, resulting in secondary structure transformation, implies its potential role in microRNA (miRNA) maturation.
To our knowledge, the results provide the first global view of RNA editing in plant nuclear transcripts.
Adenosine-to-inosine (A-to-I) RNA editing is recognized as a cellular mechanism for generating both RNA and protein diversity. Inosine base pairs with cytidine during reverse transcription and therefore appears as guanosine during sequencing of cDNA. Current approaches of RNA editing identification largely depend on the comparison between transcriptomes and genomic DNA (gDNA) sequencing datasets from the same individuals, and it has been challenging to identify editing candidates from transcriptomes in the absence of gDNA information.
We have developed a new strategy to accurately predict constitutive RNA editing sites from publicly available human RNA-seq datasets in the absence of relevant genomic sequences. Our approach establishes new parameters to increase the ability to map mismatches and to minimize sequencing/mapping errors and unreported genome variations. We identified 695 novel constitutive A-to-I editing sites that appear in clusters (named “editing boxes”) in multiple samples and which exhibit spatial and dynamic regulation across human tissues. Some of these editing boxes are enriched in non-repetitive regions lacking inverted repeat structures and contain an extremely high conversion frequency of As to Is. We validated a number of editing boxes in multiple human cell lines and confirmed that ADAR1 is responsible for the observed promiscuous editing events in non-repetitive regions, further expanding our knowledge of the catalytic substrate of A-to-I RNA editing by ADAR enzymes.
The approach we present here provides a novel way of identifying A-to-I RNA editing events by analyzing only RNA-seq datasets. This method has allowed us to gain new insights into RNA editing and should also aid in the identification of more constitutive A-to-I editing sites from additional transcriptomes.
RNA-seq; RNA editing; Potential SNP score; Constitutive editing; Editing box
RNA editing is a post-transcriptional process that, in seed plants, involves a cytosine to uracil change in messenger RNA, causing the translated protein to differ from that predicted by the DNA sequence. RNA editing occurs extensively in plant mitochondria, but large differences in editing frequencies are found in some groups. The underlying processes responsible for the distribution of edited sites are largely unknown, but gene function, substitution rate, and gene conversion have been proposed to influence editing frequencies.
We studied five mitochondrial genes in the monocot order Alismatales, all showing marked differences in editing frequencies among taxa. A general tendency to lose edited sites was observed in all taxa, but this tendency was particularly strong in two clades, with most of the edited sites lost in parallel in two different areas of the phylogeny. This pattern is observed in at least four of the five genes analyzed. Except in the groups that show an unusually low editing frequency, the rate of C-to-T changes in edited sites was not significantly higher than in non-edited 3rd codon positions. This may indicate that selection is not actively removing edited sites in nine of the 12 families of the core Alismatales. In all genes but ccmB, a significant correlation was found between frequency of change in edited sites and synonymous substitution rate. In general, taxa with higher substitution rates tend to have fewer edited sites, as indicated by the phylogenetically independent correlation analyses. The elimination of edited sites in groups that lack or have reduced levels of editing could be a result of gene conversion involving a cDNA copy (retroprocessing). If so, this phenomenon could be relatively common in the Alismatales, and may have affected some groups recurrently. Indirect evidence of retroprocessing without a necessary correlation with substitution rate was found mostly in families Alismataceae and Hydrocharitaceae (e.g., groups that suffered a rapid elimination of all their edited sites, without a change in substitution rate).
The effects of substitution rate, selection, and/or gene conversion on the dynamics of edited sites in plant mitochondria remain poorly understood. Although we found an inverse correlation between substitution rate and editing frequency, this correlation is partially obscured by gene retroprocessing in lineages that have lost most of their edited sites. The presence of processed paralogs in plant mitochondria deserves further study, since most evidence of their occurrence is circumstantial.
The C->U editing of RNA is widely found in plant and animal species. In mammals it is a discrete process confined to the editing of apolipoprotein B (apoB) mRNA in eutherians and the editing of the mitochondrial tRNA for glycine in marsupials. Here we have identified and characterised apoB mRNA editing in the American opossum Monodelphis domestica. The apoB mRNA editing site is highly conserved in the opossum and undergoes complete editing in the small intestine, but not in the liver or other tissues. Opossum APOBEC-1 cDNA was cloned, sequenced and expressed. The encoded protein is similar to APOBEC-1 of eutherians. Motifs previously identified as involved in zinc binding, RNA binding and catalysis, nuclear localisation and a C-terminal leucine-rich domain are all conserved. Opossum APOBEC-1 contains a seven amino acid C-terminal extension also found in humans and rabbits, but not present in rodents. The opossum APOBEC-1 gene has the same intron/exon organisation in the coding sequence as the eutherian gene. Northern blot and RT-PCR analyses and an editing assay indicate that no APOBEC-1 was expressed in the liver. Thus the far upstream promoter responsible for hepatic expression in rodents does not operate in the opossum. An APOBEC-1-like enzyme such as might be involved in C->U RNA editing of tRNA in marsupial mitochondria was not demonstrated. The activity of opossum APOBEC-1 in the presence of both chicken and rodent auxiliary editing proteins was comparable to that of other mammals. These studies extend the origins of APOBEC-1 back 170 000 000 years to marsupials and help bridge the gap in the origins of this RNA editing process between birds and eutherian mammals.
Adenosine-to-inosine modification of RNA molecules (A-to-I RNA editing) is an important mechanism that increases transcriptome diversity. It occurs when a genomically encoded adenosine (A) is converted to an inosine (I) by ADAR proteins. Sequencing reactions read inosine as guanosine (G); therefore, current methods to detect A-to-I editing sites align RNA sequences to their corresponding DNA regions and identify A-to-G mismatches. However, such methods perform poorly on RNAs that underwent extensive editing (“ultra”-editing), as the large number of mismatches obscures the genomic origin of these RNAs. Therefore, only a few anecdotal ultra-edited RNAs have been discovered so far. Here we introduce and apply a novel computational method to identify ultra-edited RNAs. We detected 760 ESTs containing 15,646 editing sites (more than 20 sites per EST, on average), of which 13,668 are novel. Ultra-edited RNAs exhibit the known sequence motif of ADARs and tend to localize in sense strand Alu elements. Compared to sites of mild editing, ultra-editing occurs primarily in Alu-rich regions, where potential base pairing with neighboring, inverted Alus creates particularly long double-stranded RNA structures. Ultra-editing sites are underrepresented in old Alu subfamilies, tend to be non-conserved, and avoid exons, suggesting that ultra-editing is usually deleterious. A possible biological function of ultra-editing could be mediated by non-canonical splicing and cleavage of the RNA near the editing sites.
The traditional view of mRNA as a pure intermediate between DNA and protein has changed in the last decades since the discovery of numerous RNA processing pathways. A frequent RNA modification is A-to-I editing, or the conversion of adenosine (A) to inosine (I). Since inosine is read as a guanosine (G), A-to-I editing leads to changes in the RNA sequence that can alter the function of its encoded protein. In recent years, tens of thousands of human A-to-I editing sites were discovered by computationally comparing RNA sequences to the human genome and searching for A-to-G mismatches. However, previous screens usually ignored RNA sequences that were edited to extreme, because the large number of A-to-G mismatches carried by these RNAs obscured their genomic origin. We developed a new computational framework to detect extreme A-to-I editing, or ultra-editing, based on masking potential editing sites before the alignment to the genome. Our method detected about 14,000 editing sites, with each edited molecule affected, on average, in more than 20 nucleotides. We demonstrated that the likely reason for the ultra-editing of those sequences is their potential to fold back into a particularly long double-stranded structure, which is the preferred target of the editing enzymes.
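The idea of masking potential editing sites before alignment can be sketched in a few lines. This is a simplified illustration, not the published pipeline: it replaces every A with G in both the read and the reference, uses exact substring search in place of a real aligner, considers one strand only, and the function name `find_ultra_edited` is hypothetical.

```python
def find_ultra_edited(read, reference):
    """Locate a heavily A-to-G edited read in a reference sequence.

    Both sequences are masked (A -> G) so that A-to-G mismatches cannot
    prevent a match. Returns (offset, sites), where sites are 0-based
    reference positions with A in the reference and G in the read, or
    None if the masked read is not found.
    """
    masked_reference = reference.replace("A", "G")
    masked_read = read.replace("A", "G")
    offset = masked_reference.find(masked_read)
    if offset < 0:
        return None
    sites = [
        offset + i
        for i, base in enumerate(read)
        if base == "G" and reference[offset + i] == "A"
    ]
    return offset, sites


# Hypothetical reference and a read in which every A has been edited.
reference = "CCTAACAGATTC"
read = "TGGCGGGT"
print(find_ultra_edited(read, reference))
```

An unmasked exact search for this read would fail because four of its eight bases differ from the reference. After masking, the read matches, and the A-to-G positions are recovered by comparing the unmasked sequences.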
RNA editing is the process whereby an RNA sequence is modified from the sequence of the corresponding DNA template. In the mitochondria of land plants, some cytidines are converted to uridines before translation. Despite substantial study, the molecular biological mechanism by which C-to-U RNA editing proceeds remains relatively obscure, although several experimental studies have implicated a role for cis-recognition. A highly non-random distribution of nucleotides is observed in the immediate vicinity of edited sites (within 20 nucleotides 5' and 3'), but no precise consensus motif has been identified.
Data for analysis were derived from the complete mitochondrial genomes of Arabidopsis thaliana, Brassica napus, and Oryza sativa; additionally, a combined data set of observations across all three genomes was generated. We selected datasets based on the 20 nucleotides 5' and the 20 nucleotides 3' of edited sites and an equivalently sized and appropriately constructed null-set of non-edited sites. We used tree-based statistical methods and random forests to generate models of C-to-U RNA editing based on the nucleotides surrounding the edited/non-edited sites and on the estimated folding energies of those regions. Tree-based statistical methods based on primary sequence data surrounding edited/non-edited sites and estimates of free energy of folding yield models with optimistic re-substitution-based estimates of ~0.71 accuracy, ~0.64 sensitivity, and ~0.88 specificity. Random forest analysis yielded better models and more exact performance estimates with ~0.74 accuracy, ~0.72 sensitivity, and ~0.81 specificity for the combined observations.
Simple models do moderately well in predicting which cytidines will be edited to uridines, and provide the first quantitative predictive models for RNA edited sites in plant mitochondria. Our analysis shows that the identity of the nucleotide -1 to the edited C and the estimated free energy of folding for a 41 nt region surrounding the edited C are the most important variables that distinguish most edited from non-edited sites. However, the results suggest that primary sequence data and simple free energy of folding calculations alone are insufficient to make highly accurate predictions.
RNA editing describes the process in which individual or short stretches of nucleotides in a messenger or structural RNA are inserted, deleted, or substituted. A high level of RNA editing has been observed in the mitochondrial genome of Physarum polycephalum. The most frequent editing type in Physarum is the insertion of individual Cs. RNA editing is extremely accurate in Physarum; however, little is known about its mechanism. Here, we demonstrate how analyzing two organisms from the Myxomycetes, namely Physarum polycephalum and Didymium iridis, allows us to test hypotheses about the editing mechanism that can not be tested from a single organism alone. First, we show that using the recently determined full transcriptome information of Physarum dramatically improves the accuracy of computational editing site prediction in Didymium. We use this approach to predict genes in the mitochondrial genome of Didymium and identify six new edited genes as well as one new gene that appears unedited. Next we investigate sequence conservation in the vicinity of editing sites between the two organisms in order to identify sites that harbor the information for the location of editing sites based on increased conservation. Our results imply that the information contained within only nine or ten nucleotides on either side of the editing site (a distance previously suggested through experiments) is not enough to locate the editing sites. Finally, we show that the codon position bias in C insertional RNA editing of these two organisms is correlated with the selection pressure on the respective genes thereby directly testing an evolutionary theory on the origin of this codon bias. Beyond revealing interesting properties of insertional RNA editing in Myxomycetes, our work suggests possible approaches to be used when finding sequence motifs for any biological process fails.
RNA is an important biomolecule that is deeply involved in all aspects of molecular biology, such as protein production, gene regulation, and viral replication. However, many significant aspects such as the mechanism of RNA editing are not well understood. RNA editing is the process in which an organism's RNA is modified through the insertion, deletion, or substitution of single or short stretches of nucleotides. The slime mold Physarum polycephalum is a model organism for the study of RNA editing; however, hardly anything is known about its editing machinery. We show that the combination of two organisms (Physarum polycephalum and Didymium iridis) can provide a better understanding of insertional RNA editing than one organism alone. We predict several new edited genes in Didymium. By comparing the sequences of the two organisms in the vicinity of the editing sites we establish minimal requirements for the location of the information by which these editing sites are recognized. Lastly, we directly verify a theory for one of the most striking features of the editing sites, namely their codon bias.
Ebolavirus (EBOV), the causative agent of a severe hemorrhagic fever and a biosafety level 4 pathogen, increases its genome coding capacity by producing multiple transcripts encoding for structural and nonstructural glycoproteins from a single gene. This is achieved through RNA editing, during which non-template adenosine residues are incorporated into the EBOV mRNAs at an editing site encoding for 7 adenosine residues. However, the mechanism of EBOV RNA editing is currently not understood. In this study, we report for the first time that minigenomes containing the glycoprotein gene editing site can undergo RNA editing, thereby eliminating the requirement for a biosafety level 4 laboratory to study EBOV RNA editing. Using a newly developed dual-reporter minigenome, we have characterized the mechanism of EBOV RNA editing, and have identified cis-acting sequences that are required for editing, located between 9 nt upstream and 9 nt downstream of the editing site. Moreover, we show that a secondary structure in the upstream cis-acting sequence plays an important role in RNA editing. EBOV RNA editing is glycoprotein gene-specific, as a stretch encoding for 7 adenosine residues located in the viral polymerase gene did not serve as an editing site, most likely due to an absence of the necessary cis-acting sequences. Finally, the EBOV protein VP30 was identified as a trans-acting factor for RNA editing, constituting a novel function for this protein. Overall, our results provide novel insights into the RNA editing mechanism of EBOV, further understanding of which might result in novel intervention strategies against this viral pathogen.
Ebola virus (EBOV) causes severe hemorrhagic fever with case fatality rates of up to 90% and no therapy or vaccine currently available. A better understanding of the EBOV life cycle is important to develop new countermeasures against this virus; however, research with live EBOV is restricted to high containment laboratories. One unique feature of the EBOV life cycle is that its surface glycoprotein is expressed only after editing of the glycoprotein mRNA by the viral polymerase, leading to an insertion of a non-templated nucleotide into the mRNA. While this phenomenon has been long known, the mechanism of mRNA editing for EBOV is not understood. We have developed a unique minigenome system that allows the study of EBOV mRNA editing outside of a high containment laboratory. Using this system we have characterized EBOV mRNA editing and defined the sequence requirements for this process. Interestingly, we could show that signals both up- and downstream of the editing site are important, and that a secondary structure in the RNA upstream of the editing site as well as the viral protein VP30 contribute to editing. These findings provide new detailed molecular information about an essential process in the EBOV life cycle, which might be a potential novel target for antivirals.
In bean, potato, and Oenothera plants, the C encoded at position 4 (C4) in the mitochondrial tRNA Phe GAA gene is converted into a U in the mature tRNA. This nucleotide change corrects a mismatched C4-A69 base pair which appears when the gene sequence is folded into the cloverleaf structure. C-to-U conversions constitute the most common editing events occurring in plant mitochondrial mRNAs. While most of these conversions introduce changes in the amino acids specified by the mRNA and appear to be essential for the synthesis of functional proteins in plant mitochondria, the putative role of mitochondrial tRNA editing has not yet been defined. Since the edited form of the tRNA has the correct secondary and tertiary structures compared with the nonedited form, the two main processes which might be affected by a nucleotide conversion are aminoacylation and maturation. To test these possibilities, we determined the aminoacylation properties of unedited and edited potato mitochondrial tRNAPhe in vitro transcripts, as well as the processing efficiency of in vitro-synthesized potato mitochondrial tRNAPhe precursors. Reverse transcription-PCR amplification of natural precursors followed by cDNA sequencing was also used to investigate the influence of editing on processing. Our results show that C-to-U conversion at position 4 in the potato mitochondrial tRNA Phe GAA is not required for aminoacylation with phenylalanine but is likely to be essential for efficient processing of this tRNA.
RNA editing by cytidine-to-uridine conversions is an essential step of RNA maturation in plant organelles. Some 30–50 sites of C-to-U RNA editing exist in chloroplasts of flowering plant models like Arabidopsis, rice or tobacco. We predicted significantly more RNA editing in chloroplasts of early-branching angiosperm genera like Amborella, Calycanthus, Ceratophyllum, Chloranthus, Illicium, Liriodendron, Magnolia, Nuphar and Zingiber. Nuclear-encoded RNA-binding pentatricopeptide repeat (PPR) proteins are key editing factors expected to coevolve with their cognate RNA editing sites in the organelles.
With an extensive chloroplast transcriptome study we identified 138 sites of RNA editing in Amborella trichopoda, approximately three- to four-fold the number of chloroplast (cp) editing sites in Arabidopsis thaliana or Oryza sativa. Selected cDNA studies in the other early-branching flowering plant taxa furthermore reveal a high diversity of early angiosperm RNA editomes. Many of the newly identified editing sites in Amborella have orthologues in ferns, lycophytes or hornworts. We investigated the evolution of CRR28 and RARE1, two known Arabidopsis RNA editing factors responsible for cp editing events ndhBeU467PL, ndhDeU878SL and accDeU794SL, respectively, all of which we now found conserved in Amborella. In a phylogenetically wide sampling of 65 angiosperm genomes we find evidence for only a single loss of CRR28 in chickpea but several independent losses of RARE1, perfectly congruent with the presence of their cognate editing sites in the respective cpDNAs.
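Editing-site labels such as ndhBeU467PL can be unpacked programmatically. The sketch below assumes the label is read as gene name, "e" for editing, the resulting nucleotide, the nucleotide position counted from the first base of the start codon, and the amino acid before and after editing; the regular expression and the `EditingSite` type are illustrative, not part of any published tool.

```python
import re
from typing import NamedTuple


class EditingSite(NamedTuple):
    gene: str
    edited_base: str     # nucleotide produced by editing (U or C)
    nt_position: int     # 1-based position in the coding sequence
    codon_number: int    # 1-based codon index
    codon_position: int  # 1, 2 or 3 within the codon
    aa_before: str
    aa_after: str


# gene name, 'e', resulting base, position, amino acid before, amino acid after
_LABEL = re.compile(r"^([A-Za-z0-9]+?)e([UC])(\d+)([A-Z*])([A-Z*])$")


def parse_site(label):
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognised editing-site label: {label!r}")
    gene, base, pos, before, after = match.groups()
    pos = int(pos)
    return EditingSite(gene, base, pos, (pos - 1) // 3 + 1, (pos - 1) % 3 + 1,
                       before, after)


for label in ("ndhBeU467PL", "ndhDeU878SL", "accDeU794SL"):
    print(parse_site(label))
```

Under this reading, all three sites named above fall on the second codon position, consistent with proline-to-leucine and serine-to-leucine changes.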
Chloroplast RNA editing is much more abundant in early-branching than in widely investigated model flowering plants. RNA editing specificity factors can be traced back for more than 120 million years of angiosperm evolution and show highly divergent patterns of evolutionary losses, matching the presence of their target editing events.
Electronic supplementary material
The online version of this article (doi:10.1186/s12862-016-0589-0) contains supplementary material, which is available to authorized users.
Amborella; Pentatricopeptide repeat (PPR) proteins; Molecular phylogenetics; Pyrimidine exchange RNA editing; Mitochondria; Chloroplasts; RNA-binding proteins; Molecular coevolution
Apolipoprotein B (apoB) RNA editing involves a cytidine to uridine transition at nucleotide 6666 (C6666) 5' of an essential cis-acting 11 nucleotide motif known as the mooring sequence. APOBEC-1 (apoB editing catalytic sub-unit 1) serves as the site-specific cytidine deaminase in the context of a multiprotein assembly, the editosome. Experimental over-expression of APOBEC-1 resulted in an increased proportion of apoB mRNAs edited at C6666, as well as editing of sites that would otherwise not be recognized (promiscuous editing). In the rat hepatoma McArdle cell line, these sites occurred predominantly 5' of the mooring sequence on either rat or human apoB mRNA expressed from transfected cDNA. In comparison, over-expression of APOBEC-1 in HepG2 (HepG2-APOBEC) human hepatoma cells induced promiscuous editing primarily 5' of the mooring sequence, but sites 3' of C6666 were also used more efficiently. The capacity for promiscuous editing was common to rat, rabbit and human sources of APOBEC-1. The data suggested that differences in the distribution of promiscuous editing sites and in the efficiency of their utilization may reflect cell-type-specific differences in auxiliary proteins. Deletion of the mooring sequence abolished editing at the wild type site and markedly reduced, but did not eliminate, promiscuous editing. In contrast, deletion of a pair of tandem UGAU motifs 3' of the mooring sequence in human apoB mRNA selectively reduced promiscuous editing, leaving the efficiency of editing at the wild type site essentially unaffected. ApoB RNA constructs and naturally occurring mRNAs such as NAT-1 (novel APOBEC-1 target-1) that lack this downstream element were not promiscuously edited in McArdle or HepG2 cells. These findings underscore the importance of RNA sequences and the cellular context of auxiliary factors in regulating editing site utilization.
The C↔U substitution types of RNA editing have been observed frequently in organellar genomes of land plants. Although various attempts have been made to explain why such a seemingly inefficient genetic mechanism would have evolved, no satisfactory explanation exists in our view. In this study, we examined editing patterns in chloroplast genomes of the hornwort Anthoceros formosae and the fern Adiantum capillus-veneris and in mitochondrial genomes of the angiosperms Arabidopsis thaliana, Beta vulgaris and Oryza sativa, to gain an understanding of the question of how RNA editing originated.
We found that 1) most editing sites were distributed at the 2nd and 1st codon positions, 2) editing affected codons that resulted in larger hydrophobicity and molecular size changes much more frequently than those with little change involved, 3) editing uniformly increased protein hydrophobicity, 4) editing occurred more frequently in ancestrally T-rich sequences, which were more abundant in genes encoding membrane-bound proteins with many hydrophobic amino acids than in genes encoding soluble proteins, and 5) editing occurred most often in genes found to be under strong selective constraint.
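The link between codon position and the size of the hydrophobicity change can be illustrated with a minimal sketch. It assumes the standard genetic code and the Kyte-Doolittle hydropathy scale, neither of which is specified in the text above, and it weights all codons equally rather than by observed editing frequencies, so it shows only the direction of the effect, not the reported results.

```python
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code (assumption), built in UCAG order for all three positions.
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for (i, a), (j, b), (k, c) in product(enumerate(BASES), repeat=3)}

# Kyte-Doolittle hydropathy values (assumed scale; higher is more hydrophobic).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
      "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
      "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2}


def edit_effect(codon, position):
    """Return (aa_before, aa_after, hydropathy change) for a C-to-U edit.

    The change is None when the edit creates a stop codon.
    """
    if codon[position - 1] != "C":
        raise ValueError("no C at the requested codon position")
    edited = codon[:position - 1] + "U" + codon[position:]
    before, after = CODON_TABLE[codon], CODON_TABLE[edited]
    delta = None if "*" in (before, after) else KD[after] - KD[before]
    return before, after, delta


def mean_shift_by_position():
    """Unweighted mean hydropathy change of C-to-U edits at each codon position."""
    means = {}
    for pos in (1, 2, 3):
        deltas = []
        for codon in CODON_TABLE:
            if codon[pos - 1] == "C":
                delta = edit_effect(codon, pos)[2]
                if delta is not None:
                    deltas.append(delta)
        means[pos] = sum(deltas) / len(deltas)
    return means


print(edit_effect("CCA", 2))   # proline codon edited at the second position
print(mean_shift_by_position())
```

In this simplified model, edits at the third codon position are always silent, while edits at the second position always raise hydropathy, which is in line with the observation that editing concentrates at the first two codon positions.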
These analyses show that editing mostly affects functionally important and evolutionarily conserved codon positions, codons and genes encoding membrane-bound proteins. In particular, abundance of RNA editing in plant organellar genomes may be associated with disproportionately large percentages of genes in these two genomes that encode membrane-bound proteins, which are rich in hydrophobic amino acids and selectively constrained. These data support a hypothesis that natural selection imposed by protein functional constraints has contributed to selective fixation of certain editing sites and maintenance of the editing activity in plant organelles over a period of more than four hundred million years. The retention of genes encoding RNA editing activity may be driven by forces that shape nucleotide composition equilibrium in two organellar genomes of these plants. Nevertheless, the causes of lineage-specific occurrence of a large portion of RNA editing sites remain to be determined.
This article was reviewed by Michael Gray (nominated by Laurence Hurst), Kirsten Krause (nominated by Martin Lercher), and Jeffery Mower (nominated by David Ardell).
RNAs transcribed from the mitochondrial genome of Physarum polycephalum are heavily edited. The most prevalent editing event is the insertion of single Cs, with Us and dinucleotides also added at specific sites. The existence of insertional editing makes gene identification difficult and localization of editing sites has relied upon characterization of individual cDNAs. We have now determined the complete mitochondrial transcriptome of Physarum using Illumina deep sequencing of purified mitochondrial RNA. We report the first instances of A and G insertions and sites of partial and extragenic editing in Physarum mitochondrial RNAs, as well as an additional 772 C, U and dinucleotide insertions. The notable lack of antisense RNAs in our non-size selected, directional library argues strongly against an RNA-guided editing mechanism. Also of interest are our findings that sites of C to U changes are unedited at a significantly higher frequency than insertional editing sites and that substitutional editing of neighboring sites appears to be coupled. Finally, in addition to the characterization of RNAs from 17 predicted genes, our data identified nine new mitochondrial genes, four of which encode proteins that do not resemble other proteins in the database. Curiously, one of the latter mRNAs contains no editing sites.
The extent, regulation and enzymatic basis of RNA editing by cytidine deamination are incompletely understood. Here we show that transcripts of hundreds of genes undergo site-specific C>U RNA editing in macrophages during M1 polarization and in monocytes in response to hypoxia and interferons. This editing alters the amino acid sequences for scores of proteins, including many that are involved in pathogenesis of viral diseases. APOBEC3A, which is known to deaminate cytidines of single-stranded DNA and to inhibit viruses and retrotransposons, mediates this RNA editing. Amino acid residues of APOBEC3A that are known to be required for its DNA deamination and anti-retrotransposition activities were also found to affect its RNA deamination activity. Our study demonstrates the cellular RNA editing activity of a member of the APOBEC3 family of innate restriction factors and expands the understanding of C>U RNA editing in mammals.
Aberrant RNA editing is linked to a range of neuropsychiatric and chronic diseases. Here Sharma et al. show that APOBEC3A can function as an RNA editing protein in response to physiological stimuli, significantly expanding our understanding of RNA editing and the role this may play in diseases.
Plant mitochondrial mRNAs have recently been shown to undergo editing, involving cytidine-to-uridine changes relative to the DNA sequence. We have examined the temporal relationship of editing and intron removal in coxII mRNAs in Petunia mitochondria. By using differential hybridization to probes specific for edited and unedited RNA and by sequencing of individual unspliced coxII pre-mRNA cDNAs, we found that RNA editing at any editing site can precede the splicing event. Similar results were obtained from examinations of pre-mRNA cDNAs of nad1, a gene composed of multiple exons that are both cis and trans spliced. Thus, intron removal is not required before editing can occur. The existence of editing intermediates indicates that the editing process is not strictly coincident with transcription.
Pentatricopeptide repeat (PPR) proteins with an E domain have been identified as specific factors for C to U RNA editing in plant organelles. These PPR proteins bind to a unique sequence motif 5′ of their target editing sites. Recently, involvement of a combinatorial amino acid code in the P (normal length) and S type (short) PPR domains in sequence specific RNA binding was reported. PPR proteins involved in RNA editing, however, contain not only P and S motifs but also their long variants L (long) and L2 (long2) and the S2 (short2) motifs. We now find that inclusion of these motifs improves the prediction of RNA editing target sites. Previously overlooked RNA editing target sites are suggested from the PPR motif structures of known E-class PPR proteins and are experimentally verified. RNA editing target sites are assigned for the novel PPR protein MEF32 (mitochondrial editing factor 32) and are confirmed in the cDNA.
RNA editing by C-to-U conversions is nearly omnipresent in land plant chloroplasts and mitochondria, where it mainly serves to reconstitute conserved codon identities in the organelle mRNAs. Reverse U-to-C RNA editing in contrast appears to be restricted to hornworts, some lycophytes, and ferns (monilophytes). A well-resolved monilophyte phylogeny has recently emerged and now allows tracing the side-by-side evolution of both types of pyrimidine exchange editing in the two endosymbiotic organelles.
Our study of RNA editing in four selected mitochondrial genes shows a wide spectrum of divergent RNA editing frequencies, including a dominance of U-to-C over the canonical C-to-U editing in some taxa like the order Schizaeales. We find that silent RNA editing, which leaves encoded amino acids unchanged, is highly biased, with silent C-to-U edits outnumbering silent U-to-C edits more than ten-fold. In full contrast to flowering plants, RNA editing frequencies are low in early-branching monilophyte lineages but increase in later emerging clades. Moreover, while editing rates in the two organelles are usually correlated, we observe uncoupled evolution of editing frequencies in fern mitochondria and chloroplasts. Most mitochondrial RNA editing sites are shared between the recently emerging fern orders whereas chloroplast editing sites are mostly clade-specific. Finally, we observe that chloroplast RNA editing appears to be completely absent in horsetails (Equisetales), the sister clade of all other monilophytes.
C-to-U and U-to-C RNA editing in fern chloroplasts and mitochondria follow distinct evolutionary pathways that are surprisingly different from what has previously been found in flowering plants. The results call for careful differentiation of the two types of RNA editing in the two endosymbiotic organelles in comparative evolutionary studies.
Electronic supplementary material
The online version of this article (doi:10.1186/s12862-016-0707-z) contains supplementary material, which is available to authorized users.
RNA editing; Ferns; Equisetum; PPR proteins; Reverse editing; Monilophytes; Mitochondria; Chloroplasts; Editing loss
RNA editing in plant mitochondria and plastids alters specific nucleotides from cytidine (C) to uridine (U) mostly in mRNAs. A number of PLS-class PPR proteins have been characterized as RNA recognition factors for specific RNA editing sites, all containing a C-terminal extension, the E domain, and some an additional DYW domain, named after the characteristic C-terminal amino acid triplet of this domain. Presently the recognition factors for more than 300 mitochondrial editing sites are still unidentified. In order to characterize these missing factors, the recently proposed computational prediction tool could be of use to assign target RNA editing sites to PPR proteins of yet unknown function. Using this target prediction approach we identified the nuclear gene MEF35 (Mitochondrial Editing Factor 35) to be required for RNA editing at three sites in mitochondria of Arabidopsis thaliana. The MEF35 protein contains eleven PPR repeats and E and DYW extensions at the C-terminus. Two T-DNA insertion mutants, one inserted just upstream and the other inside the reading frame encoding the DYW domain, show loss of editing at a site in each of the mRNAs for protein 16 in the large ribosomal subunit (site rpl16-209), for cytochrome b (cob-286) and for subunit 4 of complex I (nad4-1373), respectively. Editing is restored upon introduction of the wild type MEF35 gene in the reading frame mutant. The MEF35 protein interacts in Y2H assays with the mitochondrial MORF1 and MORF8 proteins; mutation of the latter also influences editing at two of the three MEF35 target sites. Homozygous mutant plants develop indistinguishably from wild type plants, although the RPL16 and COB/CYTB proteins are essential and the amino acids encoded after the editing events are conserved in most plant species. These results demonstrate the feasibility of the computational target prediction to screen for target RNA editing sites of E domain containing PLS-class PPR proteins.