SSU Decoding site RNA
There is no single self-folding segment in the 16S RNA that encompasses the majority of the decoding site rRNA. Two segments of the 16S RNA of significant length, 1402–1498 and 588–754, are clearly capable of self-folding to their native structure in isolation [8]. The coordinates are from 1GIX; equivalent coordinates are in 1FJG. The first of these forms a long single hairpin RNA helix whose open end points directly at the decoding site opposite the ribosomal proteins S9 and S13. As noted by Ogle and Ramakrishnan [12], this RNA helix contains at its open end the two bases, A-1492 and A-1493, that play a key role in stabilizing the tRNA anticodon-mRNA helix. This stabilization is coordinated with the base G-530, which is nearly a thousand nucleotides distant (Fig. ). It is noteworthy that this large self-folding helix is truncated in minimum SSU [13]. The second RNA segment with reasonable self-folding potential, 588–754, however, is far from the decoding center. At least three other disjoint short segments are considered part of the decoding site, 954–957, 1051–1057 and 1193–1199, largely forming one side of the tRNA pocket. None of these short segments is contained within longer segments with any self-folding potential. The lack of any continuous RNA segment of sufficient length to be capable of self-folding to the native structure of the decoding site is in strong contrast to what is observed in the LSU.
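The containment argument in this paragraph can be checked with simple interval arithmetic. The following is a minimal sketch, not part of the original analysis: the residue ranges are those quoted in the text, while the dictionary keys and function names are hypothetical labels introduced only for illustration.

```python
# Inclusive 16S residue ranges quoted in the text (numbering as in 1GIX).
# The labels "long_helix" and "distal_segment" are illustrative names only.
SELF_FOLDING = {"long_helix": (1402, 1498), "distal_segment": (588, 754)}
SHORT_DECODING_SEGMENTS = [(954, 957), (1051, 1057), (1193, 1199)]
KEY_BASES = {"G530": 530, "A1492": 1492, "A1493": 1493}


def contains(outer, inner):
    """True if the inclusive range `inner` lies entirely inside `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def host_segment(inner):
    """Name of the self-folding segment containing `inner`, or None."""
    for name, outer in SELF_FOLDING.items():
        if contains(outer, inner):
            return name
    return None


# Short decoding-site segments that sit in no self-folding segment.
orphans = [seg for seg in SHORT_DECODING_SEGMENTS if host_segment(seg) is None]

# Which self-folding segment, if any, hosts each key decoding base.
base_hosts = {name: host_segment((pos, pos)) for name, pos in KEY_BASES.items()}

# Sequence separation between A-1492 and G-530.
separation = KEY_BASES["A1492"] - KEY_BASES["G530"]
```

Under these ranges all three short segments fall outside both self-folding segments, A-1492 and A-1493 lie in the long helix, and G-530 lies in neither, consistent with the description above.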
Figure 1 SSU Decoding site RNAs. Shown are the A- and P-site tRNAs as green and red strand ribbons, the mRNA fragment as space filled, and two segments of the 16S RNA in backbone (the self-folding long helix 1402–1498 (1GIX.pdb) containing the conserved ...
SSU Universal Proteins
There are 15 universal ribosomal proteins associated with the SSU. These proteins can be characterized by their 3D structure, their RNA interaction and their sequence block structure [3]. These sequence blocks are either universal, with recognizable homologs within all cellular domains, or unique to particular cellular domains. We restrict our study to the cellular domains of Archaea and Bacteria, due to the available 3D structures of their ribosomes and bound ribosomal proteins [9].
Five of the 15 universal SSU proteins (S2, S3, S4, S14, S15) are globular; a second group of six (S7, S9, S11, S12, S13, S19) have a globular portion plus a long unstructured peptide extension. Extensions here refer to segments of these proteins that extend away from the more compact or globular part of the protein for a significant distance. There are three ribosomal proteins (S5, S8, S10) with hairpin extensions and one (S17) with both helical and hairpin extensions. These SSU universal ribosomal proteins and their sequence blocks have functions involved with: the folding of the SSU RNA; stabilization of the folded ribosomal SSU RNA; constraining or stabilizing the tRNAs; structural interactions with other ribosome-associated proteins, e.g., initiation, elongation and termination factors; and modulating the binding to the large subunit of the ribosome [9].
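The structural grouping above can be held in a small lookup table, which also makes it easy to confirm that the four classes account for all 15 proteins. This is a sketch only: the class labels and the helper `class_of` are hypothetical names, and the membership lists are copied from the paragraph above.

```python
# Structural classes of the universal SSU proteins, as listed in the text.
# The class labels are illustrative names, not terms defined by the authors.
SSU_STRUCTURE_CLASS = {
    "globular": ["S2", "S3", "S4", "S14", "S15"],
    "globular_plus_long_extension": ["S7", "S9", "S11", "S12", "S13", "S19"],
    "hairpin_extension": ["S5", "S8", "S10"],
    "helical_and_hairpin_extension": ["S17"],
}


def class_of(protein):
    """Return the structural class label for a universal SSU protein."""
    for label, members in SSU_STRUCTURE_CLASS.items():
        if protein in members:
            return label
    raise KeyError(f"{protein} is not in the universal SSU set")


# Number of proteins per class and the flattened list of all members.
counts = {label: len(members) for label, members in SSU_STRUCTURE_CLASS.items()}
all_proteins = [p for members in SSU_STRUCTURE_CLASS.values() for p in members]
```

The decoding-site proteins discussed later (S7, S9, S11, S12, S13) all fall in the second class of this table.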
Hairpin extensions within the universal blocks of S10 and S5 reach toward the underside of the decoding site yet do not directly make contact. These extensions reach deep into the RNA structure, probably providing RNA stabilization. Both S3 and S4 are found largely on the surface at the end of the mRNA groove, with the hairpin extension of S3 approaching the bottom of the groove but not contacting that site. The other universal SSU proteins are found largely on the SSU surface and do not appear to interact with the decoding function. S19 in the bacterial case has a short N-terminal extension that reaches toward the two tRNAs, but does not actually contact either of the tRNAs while contacting S13, which does directly interact with the decoding function. However, in the archaeal S13 case there is an additional noncellular domain-specific N-terminal block [2] that may reach farther, but this is not available in the currently determined structures.
Some distance away from the decoding site are a number of other SSU universal proteins that have a complex of cellular domain-specific blocks in both Bacteria and Archaea. These include S15, S4, S3 and probably S8. All are largely on the SSU outer surface. The primary 16S RNA contacts by these are made by the universal blocks. They also make no significant contact with the LSU. On the other hand, there is evidence that at least three, S4, S8 and S15, along with S7, are critical in the early 16S protein binding and/or folding pathway [14].
Decoding Site Proteins
The ribosomal proteins in the second group, with their long extensions, have the major contacts with the decoding site. Five proteins, S7, S9, S11, S12 and S13, have active site contacts to the tRNA binding site and/or the messenger RNA. Three proteins, S9, S12 and S13, contact the A- or P-site tRNAs (Fig. ). The bacterial protein S9 contains only universal blocks, which are blocks or segments alignable across all Bacteria and Archaea. The contact by the S9 C-terminal extension (107–128) is primarily with the P-site tRNA.
S13 has an irregular elongated globular domain with a long C-terminal extension that makes contact between the two tRNAs at the A- and P-sites. The archaeal S13 contains significant additions or block segments not found in Bacteria. However, these archaeal blocks are not involved in the S13 contacts to the tRNAs at the A- and P-sites.
The globular body of S12 on the SSU upper surface is involved with the SSU-LSU interface. A small beta hairpin (38–54) within a universal block makes contact with the decoding site and probably the mRNA. There is a very short N-terminal extension of the bacterial S12 that is unalignable with the larger N-terminal archaeal-specific block [2]. While the archaeal S12 long extensions probably play a role in the 16S RNA structure, they point away from the decoding site.
S7 and S11 also contact the mRNA, but at the far end of the messenger path and are largely on the surface of the SSU. S7 contacts the mRNA with the hairpin (75–87) within a universal block. S11 contains a single universal block except for a seven-amino acid deletion in the middle of the bacterial protein. The S11 mRNA contacts are made by an irregular loop within this universal block (45–58). There is an S11 long C-terminal extension that extends deep into the rRNA folded structure but away from the decoding site.
Conserved Amino acid-Base contacts
There are a limited number of conserved amino acids contacting conserved bases among the universal proteins of Bacteria and Archaea. There are a few highly conserved amino acid positions involving Glycine, Lysine, Arginine and Aspartic acid that contact 16S conserved base positions. Among the universal proteins S9, S12 and S13 contacting the decoding site, there are very few coordinate conserved amino acid-conserved base pairs (see Table ). Interestingly, the majority involve polar side chains, suggesting that most of the 16S associated base conservation may be incidental, in that the base conservation is due to RNA structure. This is also supported by the large number of amino acid side chains that contact conserved bases but are not themselves highly conserved. In addition, much of the remaining amino acid conservation appears to involve protein-protein and internal protein structural constraints. This supports the idea that structural contacts are more highly conserved than sequence for this ancient molecular machine. One SSU counterexample appears to be S12, with 17 highly conserved amino acid-base contacting pairs. Interestingly, S12 also makes significant conserved amino acid-base contacts with the LSU.
The conserved amino acid and conserved base contacts between the three SSU proteins (S3, S9, S12) and S13 and the 16S RNA.
As in the SSU, there are a limited number of LSU conserved amino acids contacting conserved bases among the universal proteins (see Table ). Most of these conserved base contacts do not directly contact the PTC. In addition the majority are again Arginines and Glycines that, however, are themselves not well-conserved. A few highly conserved amino acids (L03 Gly -208 -210 -213) appear to allow close protein RNA packing within the determined structures. The RNA conserved bases contacting these proteins are structurally conserved by base pairing, not protein side chain interactions. What is clear in the cases of the functional analog protein pairs, L10e/L16, L15e/L31 and L44e/L33, appears true in general. Again it is structure that is conserved rather than specific atomic interactions with most bases.
The list of PTC-contacting proteins.
The Large Subunit
A major finding from the crystal structure of the LSU demonstrated that the PTC is a ribozyme [15], as there are no proteins directly involved in the formation of the peptide bond. The 23S RNA segment, nucleotides 2472 through 2650 (LSU RNA coordinates are from 1S72), forms the key structure of the active site, which includes the universally conserved Adenine at position 2486 adjacent to the carboxyl terminus of the tRNA-charged amino acid. We calculated the free energy of this segment of rRNA in isolation and showed that the free energy was lowest when its folded secondary structure was in its native fold. This RNA segment contains two parallel helices forming the base of the A- and P-tRNA aminoacyl stem binding sites. It includes the so-called A-loop (2584–2598), which forms one side of the tRNA A-site, and a short near vertical helix (2618–2645) forming the back of the PTC. The extant PTC as defined by Bayfield et al. and Polacek & Mankin 2005 [6] includes additional bases within the segment 2091–2282 (Fig. ). This segment forms a potentially stabilizing interaction with the segment 2623–2652 that includes the active site vertical helix. Interestingly, if either of the open helical ends (Fig. ) were closed by a short loop, the entire structure would form a continuous rRNA segment of about 230 bases. There is one additional 23S RNA segment whose predicted minimum free energy secondary structure is also in its native structure. This is the segment from 2670 to 2830, forming a cruciform-like structure in the extant LSU. This cruciform structure effectively forms part of the LSU-SSU contacting surface rather than playing any role in the peptidyl transfer center, contacting the second PTC base RNA at the end opposite the site of peptidyl transfer.
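One way to make the statement "the minimum free energy structure is the native fold" quantitative is to score how many native base pairs a predicted structure recovers. The sketch below assumes that both structures are available as dot-bracket strings. That representation, the function names and the toy strings are assumptions for illustration, since the text does not state how the comparison was carried out. The free energy calculation itself is outside the scope of this sketch.

```python
def base_pairs(dot_bracket):
    """Return the set of (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at position {i}")
            pairs.add((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return pairs


def native_pair_recovery(predicted, native):
    """Fraction of native base pairs that also occur in the prediction."""
    if len(predicted) != len(native):
        raise ValueError("structures must describe the same sequence length")
    native_pairs = base_pairs(native)
    if not native_pairs:
        return 1.0
    return len(base_pairs(predicted) & native_pairs) / len(native_pairs)


# Toy structures only; these are not rRNA data.
toy_native = "((((....))))..((...))"
toy_predicted = "((((....))))..(.....)"
recovery = native_pair_recovery(toy_predicted, toy_native)
```

A recovery of 1.0 for a segment folded in isolation would correspond to the behavior reported here for 2472–2650 and 2670–2830.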
Figure 2 The secondary structure of the PTC as defined by Polacek and Mankin (2005). The equivalent base numbering is given by identifying A2451 with A2486 in reference structure 1S72.pdb. The arrows point to the helices forming the "base" of the PTC seen in Fig. ...
The L10e/L16 structural homologous pairs (Archaea and Bacteria respectively) each mimic a third RNA helix in size and shape. Their elongated globular structure is parallel with the two RNA helices forming the base of the PTC structure. The resulting three parallel "helical" bases of the PTC form two grooves between them into which the two tRNA aminoacyl stem helices can fit [20]. This "five helical bundle" formed by the PTC plus L10e and the two tRNAs places the tRNA charged ends within a couple of angstroms of each other. This may suggest that there was an earlier third RNA helix, as part of an early self-folding PTC RNA, later replaced by the proteins L10e/L16. Curiously, there is a well-defined RNA helix from 2427 to 2462, such that if one calculated the minimum free energy of the longer segment 2427 through 2650 there would be a third PTC base helix available to replace the L10e/L16 helical mimic. However, in the extant 23S 3D structures, the RNA helix is on the side of the LSU away from the PTC (Fig. ).
Figure 3 The proposed minimal PTC. The upper view is from above the two 23S helices that form the "base" of the PTC. The RNA segment from 2472–2650 (1S72.pdb) is shown as cyan ribbon strands representing the self-folding minimal PTC segment. The red is ...
The LSU Proteins
There are 19 universal LSU ribosomal proteins with orthologs in the bacterial and archaeal cellular domains. Seventeen can be located on the determined 3D LSU structure [9]. Like the universal ribosomal proteins found on the SSU, these 17 proteins can be classified by their structures, their positions and interactions on the subunit and their cellular domain sequence block structures [3].
Their structures vary from those that are basically globular to those with long extensions. There are small globular proteins, L11, L23p and L29p, and those with larger globular domains, L6, L7/L12, L10e/L16 and L30. The L10e/L16 structural homolog pair has its elongated globular structure approximating the size of an RNA hairpin helix. The proteins having a globular domain with long N-terminal extensions include L2, L15, L18 and L24. L2 has both an N- and a C-terminal extension.
Proteins L3, L4, L5, L13, L14 and L22 have extensions that are hairpin loops rather than N- or C-terminal extensions. The extensions of L3 and L13 are more complex, containing short alpha helices as well. L3's extension loop reaches deep into the rRNA structure, as does that of L4.
With the exception of L10e/L16, the LSU universal proteins have the majority of their globular domains on the subunit's surface. In particular, L5, L11, L18, L24, L29 and L30 are nearly pure LSU surface proteins [21].
The Proteins at the PTC
In this study we focus on the LSU proteins making significant contacts with the PTC's RNA (2472–2645). With the exception of L10e/L16, the PTC-contacting proteins, L2, L3, L4, L6, L14 and L15e/L31, interact with the PTC through their extensions.
Of particular interest is the sequence block structure [3] of these contacting extensions. Cellular domain-specific blocks, unlike universal blocks, are defined as significant sequence segments that can be aligned as homologous only among either all Bacteria or all Archaea but not both. This distinguishes them from the universal blocks alignable across both of these cellular domains [2]. Many of these domain-specific sequence blocks are associated with deleted positions in the other cellular domain, while others appear to occupy similar sequence positions but have distinct amino acid composition and often differ in structural details.
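The block definitions above amount to a simple decision rule on where a sequence segment can be aligned. The following sketch encodes that rule. The record type, its field names and the two example records are hypothetical and serve only to illustrate the definitions; they are not the block-finding procedure of refs. [2, 3].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SequenceBlock:
    """Illustrative record; all field names are assumptions of this sketch."""
    label: str
    alignable_among_bacteria: bool  # aligns as homologous among all Bacteria
    alignable_among_archaea: bool   # aligns as homologous among all Archaea
    alignable_across_domains: bool  # aligns between Bacteria and Archaea


def classify(block):
    """Apply the universal versus domain-specific definition from the text."""
    if block.alignable_across_domains:
        return "universal"
    if block.alignable_among_bacteria and not block.alignable_among_archaea:
        return "bacteria-specific"
    if block.alignable_among_archaea and not block.alignable_among_bacteria:
        return "archaea-specific"
    return "unclassified"


# Example records with flags set by hand to mirror statements in the text.
s9_block = SequenceBlock("S9 universal block", True, True, True)
l4_tip_archaea = SequenceBlock("L4 loop 59-83 (Archaea)", False, True, False)

labels = {b.label: classify(b) for b in (s9_block, l4_tip_archaea)}
```

A pair such as the L4 loop tip, which occupies a similar sequence position in both domains but cannot be aligned between them, would be represented as two records, one per domain.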
The extreme case of such cellular domain specificity is found in the pairs L10e/L16, L44e/L33, L21e/L27, L15e/L31, L31e/L17, L37e/L34 and L24e/L19, which, while binding nearly identical RNA substructures in Bacteria and Archaea, have little or no sequence similarities and in most cases have different protein folds [21]. The pair L10e/L16, alone among these functional analog pairs, makes extensive contacts along nearly the entire length of one of the PTC base helices (2486–2533). In addition, the L10e loop (97–113), which reaches the farthest along the side of the PTC (see Fig. ), is the least similar to the bacterial L16 in structural detail. The pair L44e/L33, while not contacting the extant PTC, does make extensive contacts with the RNA helix (2427 to 2462), which may have been replaced by the L10e/L16 helical mimic. The L44e/L33 protein functional analog pair also contacts the L15e/L31 functional analog pair, which also contacts that RNA helix. Does this suggest that all three of these proteins, L10e/L16, L15e/L31 and L44e/L33, were added late or early as equivalent alternatives among many such [2]?
Both the L2 N- and C-termini contain cellular domain-specific blocks, each of which reaches in toward the top of the PTC making minimal contact. The L2 C-terminal domain-specific block is connected to the adjacent universal block by a universally conserved three-Glycine run. This would provide a unique flexible connection for this C-terminal extension. That in turn might support the idea that the two sequence blocks were once independent.
L3 contains a large complex loop (213–254) that makes contact with the PTC's second base helix. These contacts are in a short cellular domain-specific block. It also contains a unique archaeal N-terminal extension within a universal block making additional contact with the same RNA PTC helix.
L4 has a very long hairpin loop (43–100), the end of which contacts the very back of the PTC close to the universally conserved Adenine at position 2486. The end section of this L4 loop (59–83) is a cellular domain-specific block [2]. This L4 domain-specific block is in a nearly homologous sequence position in both Bacteria and Archaea, yet they have very different sequences and different structural details [3]. The two loops make very similar contacts to the back of the PTC. In Archaea they are Ser-Gly-Arg, while in Bacteria they are Lys-Gly-Thr. Note the reversal of amino acid properties. Much of this same L4 loop contacts the L15 long N-terminal extension, which approaches the back of the PTC near the L4 PTC contact. The L15 N-terminal extension also makes extensive contacts with the RNA helix, 2427 to 2462, which was potentially displaced by L10e/L16 as noted above (Fig. ). The majority of the L15 N-terminal extension is within a large cellular domain-specific block [2].
L6 has two elongated globular domains. The contacts of L6 to the end of the first PTC base RNA helix are made by a loop connecting the last helix of one globular domain to the C-terminal strand of the other globular domain. This short loop (154–162 in Archaea) is part of an unalignable C-terminal cellular domain-specific block, unique sequence segments in both Bacteria and Archaea [2].
L14 has a short hairpin extension (35–45) that contacts the exterior of the PTC at the so-called A-loop RNA. This loop contains three positive amino acids and is embedded in a universal sequence block. Interestingly, all bacterial sequences to date have the actual contacting end of this loop set off by a pair of short sequence alignment deletions relative to all known Archaea. This has made it problematic to identify clearly these contacting segments as part of a universal sequence block alignable across both Bacteria and Archaea [3].
Of the above PTC-contacting proteins, L3, L4, L6 and L15 appear to extend out and down from a plane formed by the two PTC base RNA helices and the L10e/L16 helix mimic (Fig. ). Only two proteins contacting the PTC rRNA segment are out of this plane: L2 and L14. The L14 contacts are on the exterior of the A-loop, while L2 contacts only the very top of the PTC's back-forming helix. Thus the significance of these two proteins' contacts for a minimal PTC function is unclear. It should be noted that the globular domains of L3 and L6 also make extensive contact with the second large 23S RNA self-folding cruciform segment (2670 to 2830). This has implications for the evolution of the ribosome; see Discussion.
Peptide Exit Tunnel
The peptide exit tunnel, while not directly part of the PTC, is clearly an important component of the LSU's function [22], in a manner similar to the SSU mRNA groove noted above. One protein important to the tunnel is L22, whose extension (115–143) reaches deep into the LSU. This extension is within a universal block and approaches the back of the PTC similarly to that of L4, although, unlike L4, it does not make direct contact. The L22 extension, along with much of L4's, forms the major protein components of the exit tunnel. The extensions of L22, like those of L4 and L15, point down and out of the plane formed by the PTC base helices (Fig. ). The globular domains of L4 and L22, along with L24 and L29, form a surface at the exit of the tunnel. The two proteins L24 and L29 make contact with the Signal Recognition Particle (SRP) complex involved with the export of new proteins into and/or through cellular membranes [23].