Translational efficiency is controlled by tRNAs and other genome-encoded mechanisms. In organelles, translational processes are dramatically altered because of genome shrinkage and horizontal acquisition of gene products. The influence of genome reduction on translation in endosymbionts is largely unknown. Here, we investigate whether divergent lineages of Buchnera aphidicola, the reduced-genome bacterial endosymbiont of aphids, possess altered translational features compared with their free-living relative, Escherichia coli. Our RNAseq data support the hypothesis that translation is less optimal in Buchnera than in E. coli. We observed a specific, convergent pattern of tRNA loss in Buchnera and other endosymbionts that have undergone genome shrinkage. Furthermore, many modified nucleoside pathways that are important for E. coli translation are lost in Buchnera. Additionally, Buchnera’s A+T compositional bias has resulted in reduced tRNA thermostability, and may have altered aminoacyl-tRNA synthetase recognition sites. Buchnera tRNA genes are shorter than those of E. coli, as most no longer have a genome-encoded 3′ CCA; however, all the expressed, shortened tRNAs undergo 3′ CCA maturation. Moreover, expression of tRNA isoacceptors was not correlated with the usage of corresponding codons. Overall, our data suggest that endosymbiont genome evolution alters tRNA characteristics that are known to influence translational efficiency in their free-living relative.
In the final step of protein synthesis, mRNA sequences must be accurately and efficiently translated into amino acid proteins. Reliable and efficient translation depends critically on tRNA, which must exhibit specificity in aminoacylation, and correct pairing of the anticodon with its codon on the mRNA. The robust nature of the genetic code and numerous genome-encoded mechanisms promote translational accuracy (1,2), thus preventing deleterious events such as the reassignment of codons that can alter the function of thousands of genes. Nevertheless, tRNAs and the genetic code sometimes do change, especially in genomes undergoing size reduction as exemplified by mitochondria and plastids (1,3,4). These organelle genomes, which are derived from genomes of symbiotic bacteria (5,6), exhibit the most extreme cases of architectural alterations such as an increase in molecular evolutionary rate, inability to recombine, and massive gene loss that sometimes leads to tRNA loss and changes in the genetic code (7). Organelles encode a limited set of proteins and rely on other co-occurring genomes for enzymes and tRNAs (3,8,9).
The reduced genomes of some bacterial endosymbionts exhibit similar but less extreme alterations in genome sequence compared with organelles (10). However, unlike organelles, most endosymbionts are still autonomous in the sense that they possess their own core genetic machinery (11–13), including the conventional bacterial structure of tRNAs (14,15). Most endosymbionts retain the universal genetic code, but exceptions do exist among the tiniest genomes, in which UGA is sometimes recoded from Stop to Trp (16,17). In contrast, organelle tRNAs and their translational machinery are highly divergent from those of most bacteria (1,3,8,18). The question still remains as to how endosymbiont tRNAs and translational mechanisms differ from those of ancestral free-living genomes that are not reduced. Overall, we hypothesize that the process of genome shrinkage in endosymbionts results in a reduction of translational efficiency and integrity resembling a transitional stage between free-living ancestors and organelles.
Present day genomic features of bacterial endosymbionts result from their ancient transition from a free-living lifestyle to an obligate intracellular association (10). Many bacteria that replicate strictly in host intracellular environments possess reduced genomes with sequences that are A+T biased relative to those of their free-living ancestors (10,19,20). One such bacterium demonstrating these genomic shifts is Buchnera aphidicola, an obligate unculturable endosymbiont of aphids (21). Buchnera has coevolved with its aphid hosts for 200–250 million years (21), during which its genome has shrunk to only 416–652 kbp depending on the lineage (22–27). Based on previous gene expression and genomic studies in Buchnera, genome reduction and accelerated sequence evolution have resulted in changes that are hypothesized to lower the efficiency and accuracy of transcription and translation (28–31) compared with free-living relatives. We predict that Buchnera will also exhibit less optimal tRNA features. Presently, transcribed tRNAs and associated transcriptional mechanisms, which are key components of efficient and accurate translation, have not been extensively examined in Buchnera or any other bacterial endosymbiont.
Comprehensive characterization of transcribed endosymbiont tRNAs has previously been difficult largely because of the inability to isolate unculturable symbiont tRNAs free of host contamination. However, analysis of tRNAs beyond the level of DNA-encoded genes can reveal the nature of tRNA maturation, including the diversity of posttranscriptional processing that may occur. Taking advantage of new methodologies in high-throughput RNA sequencing (directional RNAseq), and the availability of several divergent Buchnera genomes (23,25,27), we investigated how genome reduction and A+T richness affect tRNA evolution in this model endosymbiont. This comparative framework provides us with an understanding of the conservation of tRNA sequences that influence specificity in aminoacylation and secondary structure as well as conservation of nucleoside modification pathways that influence anticodon–codon base pairing (1,2,32). From these data, we were able to address how Buchnera tRNAs and associated transcriptional fidelity mechanisms are altered relative to those of free-living relatives, exemplified by Escherichia coli. Additionally, because numerous reduced endosymbiont genomes have recently been sequenced (10), we investigated whether a pattern of tRNA loss was present among reduced endosymbiont genomes.
Four aphid species, Acyrthosiphon pisum (strains LSR1 and 5A), Acyrthosiphon kondoi (strain Ak), Schizaphis graminum (strain Sg) and Uroleucon ambrosiae (strain UA002, referred to as Ua), were reared in the same growth chamber at 20°C. A. pisum was reared on seedlings of Vicia faba, A. kondoi on Medicago sativa, U. ambrosiae on Tithonia mexicana and S. graminum on Hordeum vulgare.
For each aphid strain, B. aphidicola cells were filtered from 3g of mixed age aphids. Filtration was done according to the study by Moran et al. (33), with the following modifications. First, modified buffer A (34) was used instead of PBS. Second, after the 1000rpm centrifugation step, the pellet, rather than the supernatant, was resuspended and used for subsequent filtration steps. After the last centrifugation step, the supernatant and the protein layer were discarded and the pellet was immediately immersed in Ambion TRI Reagent Solution. For RNA extraction, a protocol similar to that of Hansen and Moran (34) was used except that, after step 5, the protocol in appendix A of Qiagen’s miRNeasy Mini Handbook was used to enrich for the small RNA fraction (i.e. RNA <200bp). RNA was DNase treated, and quality and quantity were checked as in Hansen and Moran (34). All filtration and extraction materials were treated with RNase AWAY (Molecular BioProducts, Inc, CA, USA), and all solutions were RNase free.
The Yale Keck sequencing center carried out library preparation and sequencing of Buchnera tRNA for all five aphid strains. Briefly, for tRNA library preparation, the Illumina mRNA directional sequencing protocol was followed starting at the phosphatase treatment step. RNA <200bp was directionally sequenced, one lane per sample, with Illumina 35bp reads. The CLC Genomics Workbench (Aarhus, Denmark) was used for read processing and mapping. For all reads, small RNA adapters were trimmed, as were reads containing ambiguous nucleotides. Trimmed reads were then mapped to the corresponding Buchnera genomes (Table 1; 23,25,27) with CLC Genomics Workbench short read local alignment mapping using the default settings for short reads. All Buchnera taxa used in this study possess similar genome sizes (Ap-5A=642122bp; Ak=641794bp; Ua=615380bp; Sg=641454bp). tRNA reads that mapped sense and anti-sense relative to the tRNA gene were converted into Reads Per Kilobase of exon model per million mapped reads (RPKM). Coverage per base pair was calculated using custom Perl scripts and Microsoft Excel and was viewed in Artemis 13.0 (35) to visualize sense and anti-sense tRNA coverage. For each Buchnera strain, tRNA genes were annotated using genome annotations in NCBI, tRNAscan-SE 1.21 (14,15) and Artemis 13.0 (35) to verify whether 3′ CCA was encoded in the genome. tRNA CCA 3′ maturation occurs in all organisms and is essential for charging tRNAs with amino acids. To identify CCA 3′ maturation, the last 20bp at the 3′ end of each annotated tRNA were retrieved from all high quality raw reads. Reads that perfectly matched the last 20bp were binned into the following three categories: (i) reads that match the 3′ tRNA end with no additional nucleotides, (ii) reads that match the 3′ tRNA end followed by additional non-CCA nucleotides and (iii) reads that match the 3′ tRNA end followed by a CCA added by maturation. To analyse A+T richness in Buchnera and E. coli CDS and tRNA genes, the program EMBOSS (36) was used.
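The three-way binning of reads by their 3′ ends can be illustrated with a minimal Python sketch. This is not the pipeline used in the study; the function names, category labels and the example tail sequence are hypothetical, and the sketch assumes that each read is a plain string on the sense strand and that a read ending partway through a CCA is counted as a non-CCA extension.

```python
from collections import Counter
from typing import Iterable, Optional


def classify_three_prime_end(read: str, gene_tail: str) -> Optional[str]:
    """Bin one read by what follows the genome-encoded 3' end of a tRNA gene.

    gene_tail is the last 20 nt of the annotated tRNA gene (hypothetical input).
    Returns None when the read has no perfect match to gene_tail.
    """
    pos = read.find(gene_tail)
    if pos == -1:
        return None
    extension = read[pos + len(gene_tail):]
    if extension == "":
        return "no_extension"        # category (i)
    if extension.startswith("CCA"):
        return "cca_added"           # category (iii)
    return "non_cca_extension"       # category (ii)


def tally_three_prime_ends(reads: Iterable[str], gene_tail: str) -> Counter:
    """Count reads per category, ignoring reads that do not match the tail."""
    counts = Counter()
    for read in reads:
        category = classify_three_prime_end(read, gene_tail)
        if category is not None:
            counts[category] += 1
    return counts
```

For a tRNA gene that already encodes CCA, the tail itself ends in CCA, so a read in the "cca_added" bin would carry a second CCA after the encoded one.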
To calculate codon usage of 50 highly expressed Buchnera genes (37), E-cai (38) was used.
After consensus RNAseq reads corresponding to tRNA genes were mapped and assembled, tRNA species were identified with tRNAscan-SE 1.21, with E. coli homology Blast searches (39), and by verifying the presence of signature identity elements relative to E. coli (32).
The last comprehensive survey of tRNA genes from bacteria was conducted in 2002 and only included the endosymbiont genome of Buchnera strain APS (40). Because several smaller endosymbiont genomes have been sequenced since 2002, we surveyed several more genomes that varied drastically in genome size and phylogenetic placement. The tRNAscan-SE Genomic tRNA database (41) was used to characterize the presence of tRNA gene isoacceptors (i.e. a tRNA species that binds to one or more codons for a particular amino acid residue) in 16 genomes.
During library preparation, some modified bases cause the reverse transcriptase to fall off at the modified position and/or to incorporate a ‘mismatch’ relative to the reference genome sequence (42,43). To detect modified bases and potential posttranscriptional processing, we screened for mismatches in tRNA reads relative to the reference tRNA gene, following an approach similar to those of Iida et al. (42) and Findeiß et al. (43). After mapping only the sense tRNA reads in CLC (using the same mapping parameters as described earlier), we ran CLC single-nucleotide polymorphism (SNP) analyses to detect mismatches. Threshold criteria for counting a mismatch were established by identifying conserved mismatches in both Ap-5A and Ap-LSR1 (two different strains from the same aphid species). These two strains shared 38 mismatches for which the mismatch rate was more than 1% per base (i.e. above Illumina’s expected error rate per base) and the alternative variant count was at least eight reads. These criteria were then used to detect mismatches in the other strains, for a total of four divergent Buchnera taxa (Ap, Ak, Ua and Sg).
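The mismatch criterion (a mismatch rate above 1% per base and at least eight reads supporting the alternative base) can be sketched as follows. The data layout is an assumption made for illustration: each strain is represented as a dictionary keyed by a hypothetical (tRNA name, position) pair, with reference and alternative read counts as values. The study itself used the CLC SNP analysis, not this code.

```python
from typing import Dict, Set, Tuple

Site = Tuple[str, int]        # (tRNA name, position); hypothetical key layout
Counts = Tuple[int, int]      # (reference read count, alternative read count)


def passes_mismatch_threshold(ref_count: int, alt_count: int,
                              min_rate: float = 0.01,
                              min_alt_reads: int = 8) -> bool:
    """True when the mismatch rate exceeds min_rate and enough reads support it."""
    total = ref_count + alt_count
    if total == 0:
        return False
    return (alt_count / total) > min_rate and alt_count >= min_alt_reads


def shared_mismatches(strain_a: Dict[Site, Counts],
                      strain_b: Dict[Site, Counts]) -> Set[Site]:
    """Sites that pass the threshold in both strains (e.g. Ap-5A and Ap-LSR1)."""
    return {
        site
        for site in strain_a.keys() & strain_b.keys()
        if passes_mismatch_threshold(*strain_a[site])
        and passes_mismatch_threshold(*strain_b[site])
    }
```

Requiring both conditions guards against two different failure modes: the rate filter removes sites explained by sequencing error at high coverage, and the read-count filter removes sites where a high rate rests on only a few reads.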
Predicted tRNA-modified bases and their pathways for each Buchnera tRNA were obtained from E. coli homologs using both http://modomics.genesilico.pl/pathways/ (44) and http://www.ecocyc.org/ (45). Divergent Buchnera genomes (23,25,27) were searched for modification pathway enzymes using E. coli homologs using Blastp (39).
Infernal (46) was used to generate tRNA sequence and secondary structure alignments among Buchnera strains and E. coli. The covariance model, RF00005 cm, was used, which accounts for tRNA secondary structure constraints. Using Infernal output, 4sale (47) was used to compute pairwise compensatory substitution tables from stems for all tRNAs among Buchnera strains and E. coli. Stability of tRNA secondary structure was measured as Delta G (ΔG), the change in Gibbs Free Energy (in units of kcal/mole). Thus, the more negative ΔG is, the more thermodynamically stable the tRNA secondary structure. ΔG was computed for tRNAs of each strain individually using RNAalifold (48,49) with constraints on tRNA constraint folding generated by tRNAscan-SE 1.21 (14,15).
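To make the notion of a compensatory substitution concrete, the following minimal Python sketch counts stem positions at which both partners of a base pair differ between two aligned tRNAs while pairing is preserved. It is a simplified illustration, not the algorithm implemented in 4sale; the function names are hypothetical, and the sketch assumes ungapped, equal-length RNA sequences sharing one dot-bracket secondary structure.

```python
from typing import List, Tuple

# Canonical Watson-Crick pairs plus G:U wobble pairs.
CANONICAL_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                   ("C", "G"), ("G", "U"), ("U", "G")}


def paired_positions(structure: str) -> List[Tuple[int, int]]:
    """Return (i, j) index pairs for matching brackets in a dot-bracket string."""
    stack, pairs = [], []
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            pairs.append((stack.pop(), i))
    return pairs


def compensatory_substitutions(seq_a: str, seq_b: str, structure: str):
    """List stem pairs where both bases changed and pairing is preserved."""
    hits = []
    for i, j in paired_positions(structure):
        pair_a = (seq_a[i], seq_a[j])
        pair_b = (seq_b[i], seq_b[j])
        both_pair = pair_a in CANONICAL_PAIRS and pair_b in CANONICAL_PAIRS
        both_changed = pair_a[0] != pair_b[0] and pair_a[1] != pair_b[1]
        if both_pair and both_changed:
            hits.append((i, j, pair_a, pair_b))
    return hits
```

For example, with the structure `((..))`, the sequences `GCAAGC` and `GUAAAC` differ by a single C:G to U:A replacement in the inner pair, which this sketch reports as one compensatory substitution.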
All raw sense and anti-sense tRNA data were submitted to NCBI GenBank under SRA submission SRA049863.3 and BioProject numbers (i) PRJNA82811, (ii) PRJNA82809, (iii) PRJNA82797, (iv) PRJNA82793 and (v) PRJNA82789. All paired-sample t-tests [percent guanine-cytosine (%GC), tRNA length], correlations (pairwise RPKM comparisons) and regressions (codon usage and tRNA expression) were carried out using IBM SPSS Statistics for Mac, standard version 19.0 (2010; New York, USA).
For all Buchnera genomes, tRNA genes occur in the same genomic positions (Figure 1). Based on tRNAscan-SE and blastn detection of homology with E. coli, the same 32 tRNA genes and 29 anticodon types are conserved across Buchnera taxa (Figure 1).
As expected, directional RNAseq reads map primarily in the sense direction of tRNA genes, with antisense reads averaging less than 1% of the sense reads (Table 1, Figure 1). All Buchnera tRNA genes are expressed in the sense direction, but some lack antisense expression, depending on strain, and sense expression is always higher than antisense expression (except for Phe GAA in Ak and Sg) (Figure 1). tRNA sense expression is positively correlated across divergent Buchnera taxa (Table 2). The level of antisense expression is highly correlated across all Buchnera taxa, but the correlation is less for Buchnera-Sg, the most divergent taxon (Table 2). Transcriptional start sites and coverage curves for antisense RNAs varied widely across Buchnera taxa. Nevertheless, conserved 5′ transcriptional start sites and coverage curves were identified for several antisense RNAs that occurred on or near tRNA genes for all five Buchnera taxa (Supplementary Table S1).
Recognition of tRNAs by tRNA synthetases is essential to the fidelity of translation. Aminoacyl-tRNA synthetases (aaRS) must recognize multiple tRNA isoacceptors (i.e. different tRNA species that bind to alternative codons for the same amino acid residue) but discriminate against others. This recognition is dependent on tRNA identity elements, consisting of evolutionarily conserved bases at specific positions of tRNAs (Giegé et al. 1998). Based on RNAseq data from all taxa, unmodified identity elements for each tRNA are identical to those in E. coli except for base substitutions in CysGCA (G15 to U15; A13 to G13), SerGGA (G73 to A73), SerGCT (variable loop 1bp shorter, except in Sg) and AlaGGC (G20 to U20, 5A and Ua only; G20 to C20, Sg and Ak only). Based on blastp analyses, all 20 cognate aaRS are encoded within each Buchnera genome.
In contrast to E. coli and most other organisms with nonreduced genomes, Buchnera does not encode multiple tRNA genes with matching anticodons, except for three tRNA genes encoding the anticodon CAU. Two of these genes encode either an initiation or elongation Met tRNA based on tRNA identity elements and homology (Table 3). The other tRNA gene encoding a CAU anticodon possesses homology and identity elements corresponding to the IleLAU anticodon (Table 3).
Numerous tRNA isoacceptors are present in E. coli but missing from all Buchnera strains. Many Buchnera tRNA isoacceptors that belong to 4-codon family boxes and to two-codon families (5′-NNR codon type) have been lost from Buchnera genomes (Table 3). 5′-CNN anticodons were preferentially lost in family boxes corresponding to Leu, Gly, Ser, Thr and Pro. Only one family box, corresponding to Pro, lost both 5′-CNN and 5′-GNN anticodons. For two-codon families, a 5′ CNN anticodon was lost from Gln (and Leu and Arg for 6-codon families), relative to E. coli (Table 3). Based on Watson and Crick base-pairing and revised wobble rules (50,51), all tRNA isoacceptors encoded and expressed in Buchnera can base pair with the 61 possible codons (Table 3), which are all still encoded in Buchnera’s protein-coding genes at variable frequencies.
The pattern of tRNA gene isoacceptor loss was examined in 16 bacterial taxa representing a wide range of genome sizes and phylogenetic associations, including some with extremely reduced genomes (Figure 2). Reduced genomes show common patterns of retention of particular anticodons. For family box codons, 5′-CNN anticodons followed by 5′-GNN anticodons are consistently eliminated from the small genomes. For 5′-NNR two-box codons, 5′-CNN anticodons are eliminated. In the most reduced genomes, only 5′-UNN anticodons remain for both family box and two-box codons. Unmodified 5′-U anticodons can wobble and pair with all four base combinations for family box codons (50). Therefore, for 5′-NNR two-box codons, the 5′-U of anticodons must be modified to prevent mistranslation of neighboring two-box (NNY) codons (1, 2) (e.g. an unmodified 5′ U in a Gln 5′-UUG anticodon can mispair with His codons 5′-CAU and CAC [Table 3]).
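The decoding consequences of the wobble base can be illustrated with a short Python sketch. The wobble table below is a deliberately simplified assumption based on the pairings described in the text (unmodified U reading all four third-position bases, inosine reading U, C and A, lysidine reading A); the label "U*" for a modified U restricted to NNR codons is hypothetical, and real decoding depends on the specific modification and sequence context.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Simplified wobble table (assumption for illustration): codon third bases
# read by each state of the anticodon 5' base (position N34).
WOBBLE = {
    "U": "AGUC",   # unmodified U: four-way wobble
    "U*": "AG",    # modified U restricted to NNR codons (hypothetical label)
    "G": "CU",
    "C": "G",
    "I": "UCA",    # inosine
    "L": "A",      # lysidine (k2C)
}


def codons_read(anticodon: str, n34: str = "") -> set:
    """Codons (5'->3') read by an anticodon written 5'->3'.

    n34 optionally overrides the state of the wobble base, e.g. "I" or "U*".
    """
    wobble_state = n34 or anticodon[0]
    first = COMPLEMENT[anticodon[2]]    # N36 pairs with codon position 1
    second = COMPLEMENT[anticodon[1]]   # N35 pairs with codon position 2
    return {first + second + third for third in WOBBLE[wobble_state]}
```

Under these assumptions, `codons_read("UUG")` returns the Gln codons CAA and CAG together with the His codons CAU and CAC, whereas `codons_read("UUG", n34="U*")` returns only CAA and CAG, which is the mistranslation risk that N34 modification is thought to prevent in two-box families.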
Based on E. coli tRNA homologs, 26 different types of nucleoside modifications are predicted to occur in Buchnera tRNAs (Table 4, Supplementary Table S2, Supplementary Dataset 1). Nine of these modifications are important for the efficiency and fidelity of protein synthesis and occur in N34 tRNA positions (wobble) of E. coli (Table 4). We expect five of these N34 modifications (mnm5U, mnm5s2U, cmnm5Um, I and k2C) to be retained so that all cognate codons are read and mistranslation of other amino acids is prevented. An inosine (I) modification is important in E. coli because 5′-A from anticodon ArgACG is modified into I, which can wobble and pair with Arg codons CGA, CGU and CGC (55). Lysidine (k2C) is an important modification in E. coli because 5′-C from anticodon IleCAU is modified into k2C (L), which pairs with Ile codon AUA (instead of the Met codon AUG) (59). The other expected N34 modifications (mnm5U, mnm5s2U and cmnm5Um) are important for modifying anticodon 5′ U for NNR two-codon boxes, thus preventing mistranslation (1,2). Based on Buchnera genome annotations, entire pathways are present only for the expected wobble bases I, k2C and cmnm5Um (Table 4); however, some pathways are missing only the last enzyme (e.g. mnm5U and mnm5s2U), and/or are still unknown in E. coli.
High throughput mismatch evidence (see ‘Materials and Methods’ section) shared by multiple taxa supports the presence of a modified nucleoside at 5′-A from anticodon ArgACG in all Buchnera strains. These data support the presence of an inosine modification in all taxa. For example, we found a high frequency of anticodon 5′-ACG transcribed as 5′-GCG, where the frequency of 5′-G/A at this wobble base position was as follows: Ap-5A=61/39%; Ap-LSR1=70/30%; Ak=69/31%; Ua=27/73%; and Sg=72/28%. Presence of transcripts containing a 5′-G for the ArgACG anticodon is strong indirect evidence for an inosine modification. Specifically, during the reverse transcription process, the modified nucleoside inosine base pairs with C residues, and therefore ‘G’ is found in the consensus cDNA sequence instead of ‘A’ (60). Conserved high throughput mismatch evidence for Ap-5A, Ap-LSR1 and Ak also supports the presence of a modified base at N34 for LysTTT, suggesting that mnm5s2U is present in these strains even though the E. coli version of the pathway appears incomplete in Buchnera. Mismatch evidence was not detected for other expected modified wobble positions relative to E. coli, even though full pathways are retained in the genome (Table 4).
Other tRNA modifications that are very important for the fidelity of protein synthesis are N37 modifications. N37 modifications are known to stabilize weak A:U and U:A base pairing between N36 of the anticodon and N1 of the codon (1,2,51). Based on in vitro experiments, N37 modifications are known to increase the interaction of the codon with the anticodon, preventing miscoding of amino acids and frameshifts (52,53,61–63). Therefore, to maintain efficient translation, we expect these modifications to be retained. Based on modifications for the homologous tRNAs in E. coli, seven important N37 modified nucleosides are predicted in Buchnera. Among Buchnera genomes, four N37 nucleoside pathways are retained, two are missing, and one has an unknown pathway in E. coli (Table 4). High throughput mismatch evidence supports the presence of a modified base at N37 for PheGAA, ProTGG, LeuGAG and LeuTAG and thus suggests that ms2i6A, m1G, xG and xG, respectively, are present in all taxa. However, no mismatch was detected in Sg for LeuGAG. The tRNA modifications at positions other than N34 and N37 that are supported by mismatch evidence are shown in Supplementary Table S2 and Supplementary Dataset 1. Mismatch evidence was also found at positions at which E. coli does not process modified nucleosides, suggesting the presence of new modified nucleoside sites and/or RNA editing of mature tRNAs (Supplementary Dataset 1). Collectively, all mismatch frequencies (with the exception of ArgACG) were dominated by the reference sequence base at a frequency of ~90–99% relative to mismatches for all taxa. Most mismatches were not changes to a single alternative base but comprised all three bases other than the reference base.
In many species, tRNA abundances are positively correlated with codon usage for highly expressed genes (64,65). Anticodons of highly expressed tRNAs correspond to codons that are used frequently in these genes, thus improving the efficiency of translation (64,65). Based on Watson and Crick and revised wobble base-pairing rules (50,51), each Buchnera isoacceptor was paired with its corresponding codon pair. Met CAU, the only duplicate anticodon coding for the same codon, was excluded from analysis. The relationship between percent average codon usage of highly expressed genes and corresponding tRNA isoacceptor expression was examined for each Buchnera strain. No significant relationship was found between average codon usage of 50 highly expressed genes in Ap-5A (on leading and lagging strands) and cognate tRNA isoacceptor sense expression (Figure 3). Likewise, no significant relationship was found between codon usage of highly expressed Buchnera genes (four chaperones and 54 ribosomal proteins) and cognate tRNA isoacceptor sense expression for all taxa (Supplementary Figure S1A and S1B). Examination of codon usage and tRNA expression scatterplots reveals that most tRNA isoacceptors, regardless of codon usage, are expressed at similar levels (e.g. for Ap-5A in RPKM, the 75th percentile=843 950, median=309 742 and max=4 407 138; Figure 3). TrpCCA is the most highly expressed isoacceptor in all taxa (except Ua), even though the corresponding codon occurs at low frequency (Figure 3 and Supplementary Figure S1).
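The comparison of codon usage with isoacceptor expression amounts to testing for a linear association between two paired lists of values. The statistics in this study were computed in SPSS; the following Python sketch only illustrates the underlying calculation with a Pearson correlation coefficient, and the function name and input layout (one codon usage value and one RPKM value per isoacceptor, in matching order) are assumptions.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between paired values, e.g. percent codon usage
    (xs) and tRNA isoacceptor RPKM (ys), one pair per isoacceptor."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)
```

A coefficient near zero across isoacceptors would be consistent with the pattern described above, in which most isoacceptors are expressed at similar levels regardless of codon usage; a significance test would still be needed before drawing any conclusion.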
As expected, Buchnera CDS are significantly more A+T rich relative to CDS of E. coli [Figure 4 (c)]. Within each Buchnera genome, tRNA genes are 2.2-fold more G+C rich relative to CDS, indicating that selection conserves higher %G+C in tRNA genes. Nevertheless, Buchnera tRNA genes are significantly more A+T rich than homologs in E. coli [Figure 4 (c)].
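The %G+C comparison between tRNA genes and CDS rests on a simple per-sequence calculation. Compositional analyses in this study were run in EMBOSS; the Python sketch below is only an illustration of the quantity being compared, with hypothetical function names, and it assumes that characters other than A, C, G, T and U (e.g. N or gaps) are ignored.

```python
def gc_percent(sequence: str) -> float:
    """Percent G+C among the unambiguous bases of a DNA or RNA sequence."""
    bases = [b for b in sequence.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def gc_fold_difference(trna_gene: str, cds: str) -> float:
    """Ratio of %G+C in a tRNA gene to %G+C in a CDS (hypothetical helper)."""
    return gc_percent(trna_gene) / gc_percent(cds)
```

In practice the ratio would be computed from genome-wide averages over all tRNA genes and all CDS rather than from a single pair of sequences.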
Stability of tRNA secondary structure can decrease with a reduction in %GC, especially in stem structures. Because Buchnera tRNAs are more A+T rich than those of E. coli [Figure 4 (c)], we measured the stability of Buchnera tRNA secondary structure. ΔG was significantly more negative in E. coli tRNAs relative to homologs in Buchnera for all strains, indicating that Buchnera tRNAs have reduced stability in vitro [Figure 4 (b)]. Whether they have reduced stability in vivo, where stabilizing proteins may play a role, remains to be tested. Two tRNAs with the weakest secondary structure in all Buchnera relative to E. coli were ValGAG and TrpCCA; both tRNAs possess numerous compensatory and single base substitutions in the stem regions [Figure 4 (a)].
Buchnera tRNAs are more A+T biased and display weaker secondary structure than those of E. coli (Figure 4). However, a high frequency of compensatory base substitutions is expected in the stem regions as a mechanism for maintaining functionality of these essential molecules. Relative to E. coli, a total of 37–42 compensatory base substitutions were found in Buchnera tRNA stem regions (Table 5). Many of these compensatory substitutions were C/G to T/A directional changes (Table 5).
Genome reduction primarily reflects loss of coding genes, as reduction in gene length is minor (<1%, 37), and gene packing is similar for bacterial genomes of different sizes (66). However, Buchnera tRNA genes are often shorter in length than their homologs in E. coli [Figure 5 (a)]. The difference in length is typically 3bp and mostly reflects the loss of encoded 3′ CCA in the Buchnera tRNA genes. At the 3′ end of tRNAs, CCA is required for amino acid activation, and must either be encoded in the tRNA gene or added during tRNA maturation by the CCA-adding enzyme. Although E. coli and other close relatives of Buchnera such as Vibrio and Pseudomonas spp. all encode 3′ CCA in all tRNA genes except that for selenocysteine, only half of Buchnera tRNA genes encode 3′ CCA [14-17 depending on strain, Figure 5 (b)]. The remaining Buchnera tRNA genes have lost the 3′ encoded CCA. Our analysis of directional RNAseq reads indicates that the mature transcript of these genes possesses a CCA at the 3′ end [Figure 5 (b)], implying CCA-addition.
Some Buchnera tRNA genes with 3′ CCA encoded also displayed CCA 3′ maturation [Figure 5 (b)], resulting in double or triple CCA at the 3′ end of tRNAs. Recently, it was shown that tRNAs with dual 3′ CCA are targeted for degradation (67). More specifically, if a tRNA has 5′ Gs on bp 1 and 2, and its acceptor stem is structurally unstable, then the CCA-adding enzyme marks the unstable tRNA by adding dual 3′ CCAs, targeting it for degradation by RNase R (67). Such degradation also seems possible in Buchnera strains, which encode both the CCA-adding enzyme and RNase R. Thus, we examined all Buchnera tRNAs with dual and triple 3′ CCA maturation. First, we noted that all E. coli tRNAs with a 5′ G at the 1st and 2nd base position encode dual or triple CCA on the 3′ end of the tRNA gene [Figure 5 (c)]. Based on tRNAscan-SE 1.21, the penultimate CCA is always incorporated into the 3′ acceptor stem, exposing a single 3′ CCA for activation. Most Buchnera tRNAs that display dual or triple 3′ CCA maturation still retain a 5′ G at the 1st and 2nd bases and are homologs of E. coli tRNAs that encode dual or triple 3′ CCA [Figure 5 (c)]. Three strain-specific tRNAs with dual 3′ CCA maturation do not have E. coli homologs with dual CCAs encoded. These Buchnera tRNAs also do not encode 5′ Gs at the 1st and 2nd base. All Buchnera tRNAs with dual or triple 3′ CCA maturation incorporate the 2nd to last CCA into the 3′ acceptor stem as in E. coli, except for one case, tRNA LeuTAA in Ak [Figure 5 (c)].
The efficiency and fidelity of translation are reinforced by many mechanisms encoded in genomes. In reduced genomes, mutation rates are typically high, and selection becomes less effective in maintaining translational mechanisms. In this study, we found that bacterial endosymbiont lineages (Buchnera) that experience relaxed selection display less optimal tRNA characteristics relative to those of their free-living relative E. coli. Gene loss and A+T mutational bias in Buchnera have led to the loss of tRNA isoacceptors and loss of modified base pathways, the reduction of tRNA gene length, and the accumulation of base substitutions and indels (insertions/deletions) in tRNA sequences that weaken tRNA secondary structure and possibly aminoacyl-tRNA synthetase recognition. These tRNA characteristics are conserved across four Buchnera lineages spanning 70 million years of divergence and may result in reduced translational efficiency and fidelity relative to their ancestors. However, we did detect compensatory base substitutions in Buchnera tRNAs, which are expected to maintain secondary structure of tRNA stem regions. Additionally, RNAseq reads reveal novel 3′ maturation processes that compensate for tRNA gene length reduction.
Divergent Buchnera taxa in this study encode and express the same 32 tRNA genes composed of 32 different isoacceptor types (Figure 1). In turn, no duplication of tRNA gene isoacceptors was found. Based on a survey of 50 eukaryotic, eubacterial, and archaeal genomes, low tRNA gene redundancy (i.e. only one or two gene copies of a particular isoacceptor) was only found in all archaeans and several bacterial genomes, and was approximately correlated with genome size (40). In Buchnera, because of modified wobble rules (50,51), all mature tRNAs expressed can theoretically base pair with the 61 possible codons (Table 3, Figure 1), which are all still encoded in Buchnera CDS. One special Buchnera isoacceptor that has been identified previously in Buchnera-Ap (taxa type strain APS) is tRNA IleCAU (40), where 5′-C is modified into lysidine by the enzyme TilS in E. coli (55), which all Buchnera strains still encode. This special IleCAU isoacceptor codes for Ile instead of Met due to a wobble modification, and is ubiquitous in Eubacteria and Archaea (40).
During genome reduction, Buchnera has preferentially lost 5′-CNN, and to a lesser extent, 5′-GNN anticodons in family boxes and 5′ CNN anticodons from two-codon NNR families (Table 3). This pattern of tRNA isoacceptor loss is common for many bacteria with reduced genomes (Figure 2), and is most likely related to gene deletion processes. Selective loss of these specific isoacceptors in family boxes and NNR two-codon families in Eubacteria was observed in previous studies (1,40,68,69) but was related to A+T sequence bias not deletion processes (1,70). We hypothesize that genome reduction, which is correlated with A+T bias, is the most likely explanation for this pattern of tRNA isoacceptor loss. First, the potential for wobble in codon–anticodon basepairing implies that some tRNA isoacceptors are not essential for pairing with corresponding codons (e.g. 5′-CNN, 5′-GNN anticodons) and can be eliminated through mutation and deletion. Second, due to wobble rules, 5′-GNN anticodons followed by 5′-UNN anticodons are the most promiscuous isoacceptors when pairing with cognate codons; thus, it is not surprising that 5′UNN is always retained in family box and two-box NNR codons in the most reduced genomes. In turn, 5′-UNN anticodons are probably retained because of their ability to recognize alternative codons rather than because of the high frequency of cognate codons in A+T rich CDS. Typically in bacteria and eukaryotes 5′-CNN and 5′-GNN anticodons of family boxes and 5′-CNN anticodons from two-codon families along with 5′ U anticodon modifications extending wobble are maintained by selection, because they increase the efficiency of translation (1,71). We predict that the loss of tRNA isoacceptors in Buchnera as well as other endosymbionts potentially results in less efficient translation.
Numerous unmodified nucleotides at specific nucleotide positions on tRNA isoacceptors are conserved phylogenetically and are known to play crucial roles in defining tRNA specificity for aminoacylation (32,72). These conserved nucleotides are called identity elements and are required for proper recognition by the cognate aaRS in addition to playing roles as deterrents to false recognition (32). Our results reveal that most Buchnera tRNAs have maintained identity elements homologous to those in E. coli, with the exceptions of CysGCA, SerGGA, SerGCT and AlaGGC. In E. coli tRNACys, the identity elements G15·G48 form an unusual tertiary base pair called a Levitt pair (73). Additionally, the E. coli identity elements A13·A22 are important in determining the structure of G15·G48 (74). Collectively, these E. coli identity elements are required for CysRS recognition due to their role in RNA tertiary structure (73). In all Buchnera taxa, tRNACys G15·G48 has mutated to U15·G48 and A13·A22 has mutated to G13·A22. Hou et al. (73) found that when G15·G48 is mutated to U15·G48, its backbone configuration is similar to that of the wild type tRNACys; however, only partial aminoacylation (46.2%) occurs relative to the wild type. How both types of changes in identity elements together affect tertiary structure is unknown.
In Buchnera tRNA AlaGGC, the identity element G20 is mutated to U20 in strains 5A and Ua and to C20 in strains Ak and Sg. In E. coli tRNA AlaVGC, these same base changes were shown to result in 6× and 50× reductions in alanine charging activity, respectively, relative to native tRNA AlaVGC (75). Buchnera AlaUGC does not possess this mutation. If this mutation is deleterious to AlaGGC recognition, AlaUGC can potentially wobble to all four codons of the alanine family box. Interestingly, the smallest sequenced Buchnera genome, from the host Cinara cedri, retains the same tRNA isoacceptors and aaRSs as the other Buchnera taxa examined in this study, except that the AlaGGC tRNA gene has been lost, resulting in a total of only 31 tRNA genes.
In Buchnera tRNA SerGGA, the identity element G73 (the discriminator base) has mutated to A73. A mutation in the discriminator base generally results in the loss of recognition by the cognate aminoacyl-tRNA synthetase; however, Shimizu et al. (76) demonstrated that substituting any of the four bases at the discriminator position of E. coli Ser tRNA resulted in the same level of aminoacylation. Nevertheless, G73 in Ser tRNA is phylogenetically conserved (72) and has been shown to play minor roles in SerRS discrimination (77,78). Additionally, in E. coli Ser tRNA, the variable region plays a very important role as an identity element (77,79). In all Buchnera taxa except Sg, the variable region of the SerGCT isoacceptor is 1 bp shorter than that of the E. coli SerGCT isoacceptor. In summary, it is unknown how all these mutated identity elements affect Buchnera translation, but the same mutations in E. coli are known to significantly reduce the efficiency of aminoacylation.
In addition to requiring specificity in aminoacylation, reliable and efficient translation requires the anticodon to pair correctly with its codon. Modified nucleosides of tRNAs are essential for reinforcing translational fidelity and efficiency, especially at the wobble position (N34) and at the position immediately 3′ of the anticodon (N37) (1,2,51). Based on E. coli tRNA homologs, we expect 16 different types of modified bases to be present at the N34 and N37 positions of the remaining 32 Buchnera tRNAs. In E. coli, the pathways for 13 of these modified bases are known, and Buchnera encodes complete pathways for six of them (Table 4). All Buchnera taxa have lost the enzymes responsible for synthesizing the N37 modified bases m2A and m6A, which are important in stabilizing 5′-NNC/G anticodons (2) (Table 4). Enzymes that synthesize the N37 modification m6t6A are conserved in only half of the Buchnera taxa; this modification is known to slightly increase the efficiency of base pairing of the anticodon ThrGGU with the codon ACC in E. coli (54). All N37 modified base pathways that are important for preventing frameshifts and for stabilizing A:U and U:A pairs between the wobble position of the anticodon and the first position of the codon were retained in all Buchnera taxa (Table 4). These mechanisms may be essential for the fidelity of translation, especially in A+T rich genomes.
Modified nucleosides at the wobble base position (N34) of the anticodon are important for encoding the right amino acid, extending or restricting wobble, increasing the efficiency of base pairing and preventing frameshifts (2,53,55,56,58,59). All Buchnera taxa encode the enzyme TilS, which is essential for the synthesis of the modified base lysidine and is important for encoding the amino acid Ile instead of Met (59). All Buchnera taxa also encode the core enzymes MnmE and MnmG, which are important for synthesizing the modified bases mnm5U, mnm5s2U and cmnm5Um; these restrict 5′-U wobble in NNR two-codon boxes, including those of Arg and Leu (Table 4). All of these pathways are complete except for the absence of MnmC, which is involved in the last step for both mnm5U and mnm5s2U. However, RNAseq mismatch evidence supports the presence of a modified base at the expected position of mnm5s2U (Table 4). Interestingly, the genes encoding MnmA, MnmE, MnmG, and IscS or SufS, but not MnmC, are retained in several tiny endosymbiont genomes (10). Conservation of these enzymes in reduced genomes indicates that these enzymes or their derivatives are important for the production of the modified bases mnm5U, cmnm5Um and especially mnm5s2U, which is essential for preventing frameshifts and restricting wobble in NNR two-codon boxes (Glu, Lys and Gln), thereby preventing the miscoding of amino acids. For the incomplete pathways producing mnm5U and mnm5s2U, a derivative may be synthesized and/or MnmC may be supplied by the insect host. For example, the pea aphid, A. pisum, expresses its mnmC homolog (XP_003245837) both in its body and in the specialized aphid cells (bacteriocytes) that contain Buchnera cells (34).
Another key enzyme retained in Buchnera is TadA, which is responsible for synthesizing inosine in E. coli (55). This wobble modification is present on ArgACG in many bacteria and allows wobble to three alternative Arg codons (2,59). RNAseq mismatch evidence strongly supports this modification: inosine is read as G during reverse transcription (60), and we were therefore able to measure a high frequency of modified ArgACG transcripts from all Buchnera taxa. Unfortunately, other modified bases do not appear to be read as a specific base; instead, any of the four bases is incorporated at varying frequencies during reverse transcription of modified transcripts (42,43). Collectively, RNAseq evidence supported the presence of five modified bases: four for which the pathways are known and present (or nearly complete, in the case of mnm5s2U) and one for which the pathway is unknown (Table 4). If Buchnera tRNAs can be isolated without host contamination, the presence and identity of modified bases can be confirmed.
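The inosine readout described above amounts to counting A-to-G mismatches at the wobble position of aligned reads. The following is a minimal sketch of that calculation, assuming reads that are already aligned and trimmed to a common coordinate system; the function names and the example reads are hypothetical and do not represent the pipeline used in this study.

```python
from collections import Counter

def base_frequencies(reads, position):
    """Fraction of each base observed at a 0-based position across aligned reads."""
    counts = Counter(read[position] for read in reads if len(read) > position)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {base: n / total for base, n in counts.items()}

def inosine_signal(reads, position):
    """Fraction of reads showing G at a position where the genome encodes A.

    Inosine is read as G during reverse transcription, so a high value
    is consistent with (but does not prove) A-to-I editing.
    """
    return base_frequencies(reads, position).get("G", 0.0)

# Hypothetical cDNA reads covering a genomic ACG anticodon (wobble base first)
reads = ["GCG", "GCG", "ACG", "GCG"]
print(inosine_signal(reads, 0))  # 0.75
```

For modified bases other than inosine, the per-base frequencies returned by `base_frequencies` would be more informative than a single mismatch fraction, because any of the four bases may be incorporated.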
In many bacterial species, tRNA abundances are positively correlated with codon usage for highly expressed genes, thus increasing translational efficiency (64,65). In addition to analysing specific tRNA characteristics that influence the accuracy and efficiency of translation, we examined whether codon usage correlates with tRNA expression. We found that tRNA sense expression is highly correlated across Buchnera taxa (Table 2), and many tRNA isoacceptors are expressed at similar levels within taxa (Figure 3, Supplementary Figure S1). A previous microarray study suggested that tRNA expression and codon usage of 50 highly expressed genes in Buchnera-Ap were positively correlated (37), but the relationship was weak and expression of sense and antisense tRNAs was not distinguished, possibly confounding the results. Our directional RNAseq data show no relationship between tRNA expression and codon usage for the same set of highly expressed genes in Buchnera-Ap under similar conditions (Figure 3). Furthermore, no relationship was detectable in three other Buchnera taxa (Supplementary Figure S1). Collectively, these results suggest that selection is not maintaining codon bias for highly expressed proteins. Interestingly, TrpCCA is the most highly expressed isoacceptor in all Buchnera taxa (except Ua) and has very low codon usage. In all Buchnera taxa examined, isoacceptor TrpCCA displays one of the lowest secondary structure stabilities relative to its E. coli homolog; potentially, TrpCCA is highly expressed to compensate for low aminoacylation efficiency related to numerous base substitutions that weaken its secondary structure [Figure 4(a)].
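A test of the relationship between isoacceptor expression and codon usage can be sketched as a rank correlation. The choice of Spearman's coefficient here is an assumption made for illustration, not the statistic reported in this study, and the example values are hypothetical.

```python
def ranks(values):
    """Return 1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman rank correlation computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical per-isoacceptor values: expression (e.g. normalized read counts)
# and summed usage of the codons each isoacceptor reads
expression = [120.0, 95.0, 300.0, 80.0, 110.0]
codon_usage = [0.031, 0.052, 0.004, 0.047, 0.018]
print(round(spearman(expression, codon_usage), 3))
```

A coefficient near zero would be in line with the absence of a relationship described above, although a formal test would also require a significance estimate.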
In this study, we found that Buchnera tRNAs have maintained high %GC relative to Buchnera CDS; however, they are more A+T rich and less stable than their homologs in E. coli (Figure 4). These results are consistent with previous findings (28) showing that 16S rRNAs of Buchnera and other endosymbiont species are more A+T rich and less stable than those of free-living relatives. Similarly, mitochondrial tRNAs from animals are more A+T rich and less stable than nuclear tRNAs (80). Collectively, these results suggest that the accumulation of deleterious mutations can lead to less stable secondary structures in essential RNAs involved in translation. Some selection for stabilization is also evident, as numerous compensatory base substitutions have been fixed in the stem regions of both rRNAs (28) and tRNAs (Figure 5). Alternatively, E. coli tRNAs may possess higher %GC because the optimal growth temperature of E. coli is higher than that of Buchnera (81), thus favoring higher %GC for increased thermal stability.
During genome reduction, 72–78% of Buchnera tRNA genes across all taxa have been shortened by 3 bp through the loss of the encoded 3′ CCA [Figure 5(a)]. Nevertheless, we found that all mature Buchnera tRNAs process 3′ CCA, and therefore they all have potential for amino acid activation [Figure 5(b)]. In all Buchnera taxa, six to eight mature tRNAs process dual or triple 3′ CCA [Figure 5(b)]. These characteristics, in addition to 5′ G at the first and second positions and instability of the acceptor stem, result in tRNA degradation (67). Interestingly, these tRNAs in Buchnera and E. coli transcribe 5′ G at the first and second base positions and process dual or triple 3′ CCA [Figure 5(c)]. In these mature tRNAs in both E. coli and Buchnera, the second to last 3′ CCA is always incorporated into the 3′ acceptor stem. Potentially, the retention of encoded 5′ G at N1 and N2 and the conservation of dual and triple 3′ CCA maturation in these tRNAs [Figure 5(c)] are essential to maintain the correct secondary structure and to police unstable tRNAs via the tRNA degradation pathway.
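The two sequence features discussed above, a genome-encoded 3′ CCA and tandem 3′ CCA repeats on mature transcripts, can be checked with simple string tests. This is a minimal sketch with hypothetical function names and made-up sequences; it assumes sequences are given 5′ to 3′ and already trimmed to the tRNA boundaries.

```python
def normalize(seq):
    """Uppercase a sequence and treat RNA (U) and DNA (T) spellings alike."""
    return seq.upper().replace("U", "T")

def has_encoded_cca(gene_seq):
    """True if a tRNA gene sequence ends with a 3' CCA."""
    return normalize(gene_seq).endswith("CCA")

def count_terminal_cca(mature_seq):
    """Count tandem CCA repeats at the 3' end of a mature tRNA sequence."""
    seq = normalize(mature_seq)
    repeats = 0
    while seq.endswith("CCA"):
        repeats += 1
        seq = seq[:-3]
    return repeats

def fraction_lacking_cca(gene_seqs):
    """Fraction of tRNA genes without a genome-encoded 3' CCA."""
    if not gene_seqs:
        return 0.0
    lacking = sum(1 for seq in gene_seqs if not has_encoded_cca(seq))
    return lacking / len(gene_seqs)

# Hypothetical 3' ends, not real Buchnera sequences
print(count_terminal_cca("GGGACCACCA"))                     # 2
print(fraction_lacking_cca(["GGGACCA", "GGGA", "GGGU", "GGGC"]))  # 0.75
```

Comparing `has_encoded_cca` on a gene with `count_terminal_cca` on its mature transcript distinguishes an encoded CCA from one added during maturation.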
In conclusion, our observations of altered tRNA characteristics are consistent with the hypothesis that translational fidelity is lower in Buchnera than in free-living relatives, as represented by E. coli. First, Buchnera genome reduction has resulted in the loss of specific tRNA isoacceptors and modified nucleoside pathways, which may reduce translational efficiency and fidelity. Second, Buchnera’s A+T mutational bias and reduced selection have resulted in the reduction of tRNA stability in vitro and in specific tRNA base substitutions that may alter the efficiency of aaRS recognition. Moreover, reduced translational efficiency was supported by the lack of a relationship between codon usage of highly expressed genes and cognate tRNA isoacceptor expression. Nevertheless, purifying selection appears to be strong enough in Buchnera genomes to maintain high %GC of tRNA genes relative to CDS. In addition, 3′ CCA maturation of shortened tRNAs and numerous compensatory base substitutions in tRNA stems help maintain tRNA secondary structure and function. Consequently, we predict that the translational efficiency and fidelity evident in Buchnera are in an intermediate state between those of free-living bacteria and organelles.
All raw sense and antisense tRNA data were submitted to NCBI GenBank under SRA Submission SRA049863.3, under BioProject numbers (i) PRJNA82811, (ii) PRJNA82809, (iii) PRJNA82797, (iv) PRJNA82793 and (v) PRJNA82789.
Supplementary Data are available at NAR Online: Supplementary Tables 1 and 2, Supplementary Figure 1 and Supplementary Dataset 1.
Funding for open access charge: US Department of Agriculture [2011-67012-30707 to A.H.].
Conflict of interest statement. None declared.
The authors thank Kim Hammond for rearing aphids and Dieter Söll, Jiqiang Ling, Patrick O'Donoghue and Markus Englert for helpful discussions and feedback on tRNA data. They also thank Yogeshwar Kelkar, Rahul Raghavan and Patrick Degnan for helpful comments on the manuscript, and four anonymous reviewers for their helpful comments and suggestions.