Despite the emergence of a large number of X-ray crystallographic models of the bacterial 70S ribosome over the past decade, an accurate atomic model of the eukaryotic 80S ribosome is still not available. Eukaryotic ribosomes possess more ribosomal proteins and ribosomal RNA than bacterial ribosomes, and these additional components are implicated in extra-ribosomal functions in eukaryotic cells. By combining cryo-EM with RNA and protein homology modeling, we obtained an atomic model of the yeast 80S ribosome complete with all ribosomal RNA expansion segments and all ribosomal proteins for which a structural homolog can be identified. Mutation or deletion of 80S ribosomal proteins can abrogate maturation of the ribosome, leading to several human diseases. We have localized one such protein unique to eukaryotes, rpS19e, whose mutations are associated with Diamond-Blackfan anemia in humans. Additionally, we characterize crucial and novel interactions between the dynamic stalk base of the ribosome and eukaryotic elongation factor 2.
The ribosome is a massive ribonucleoprotein particle composed of a small and a large subunit (40S and 60S in eukaryotes; 30S and 50S in eubacteria). The eukaryotic ribosome is a complex assembly of four ribosomal RNAs (rRNA) with about 80 ribosomal proteins (rps), and its assembly requires more than 150 nonribosomal factors (Ferreira-Cerca et al., 2005). While a preeminent role of the ribosome is to translate messenger RNA into polypeptides, it additionally acts as a platform for several nonribosomal proteins involved in fundamental biological processes including docking of the ribosome to cellular organelles and recruiting kinases involved in various signaling pathways. In rapidly growing yeast cells the majority of transcription is devoted to the production of ribosomal RNA and about 50% of RNA polymerase II transcription is committed to production of ribosomal proteins (Warner, 1999), highlighting the overall importance of ribosome function within the cell.
Due to the vast amount of available genetic information and the simplicity of the system compared to other eukaryotic cells, the yeast, Saccharomyces cerevisiae, has been studied extensively as a model system for eukaryotes. Most of our knowledge about eukaryotic ribosome biogenesis derives from genetic studies in yeast. All the ribosomal proteins in yeast have homologs in ribosomes from higher-order organisms, making the yeast ribosome an excellent model for characterizing eukaryotic ribosome structure and function, as well as protein synthesis and ribosome biogenesis. The small ribosomal subunit in yeast is composed of an 18S rRNA and 32 rps, whereas the 60S subunit is composed of three rRNAs and 46 rps. The rRNA “core” of the 80S ribosome is homologous to that of the bacterial 70S ribosome, with a few modifications. The 80S ribosome contains a novel 5.8S rRNA of 158 nucleotides; it is homologous to the 5’ end of 23S rRNA in the 70S ribosome. The 25S rRNA (3392 nts) in the 80S is homologous with the remaining 3’ sequence of the 23S bacterial rRNA. While homologous, the 5.8S/25S rRNA contains approximately 20% more bases than 23S rRNA in 70S ribosomes. The last piece of rRNA in the 60S subunit is a 5S rRNA (121 nts in yeast), which is conserved among ribosomes from all kingdoms. The eukaryotic 40S subunit consists of a single piece of 18S rRNA (1798 nts) that is about 15% longer than, but homologous to, 16S rRNA in bacteria.
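The size comparison above is simple arithmetic and can be checked with a short sketch. The yeast lengths are the ones quoted in the text; the bacterial lengths are not given here, so the E. coli values below (2904 nts for 23S rRNA, 1542 nts for 16S rRNA) are assumed reference lengths, and the result will shift slightly with the bacterial species chosen.

```python
# Yeast rRNA lengths (nucleotides) as quoted in the text.
YEAST_NTS = {"5.8S": 158, "25S": 3392, "5S": 121, "18S": 1798}

# ASSUMPTION: E. coli reference lengths; these are not stated in the text.
ECOLI_NTS_ASSUMED = {"23S": 2904, "16S": 1542}


def percent_longer(eukaryotic_nts: int, bacterial_nts: int) -> float:
    """Return how much longer the eukaryotic rRNA is, as a percentage."""
    return 100.0 * (eukaryotic_nts - bacterial_nts) / bacterial_nts


# 5.8S + 25S together are homologous to the bacterial 23S rRNA.
large_subunit_excess = percent_longer(
    YEAST_NTS["5.8S"] + YEAST_NTS["25S"], ECOLI_NTS_ASSUMED["23S"]
)
# 18S is homologous to the bacterial 16S rRNA.
small_subunit_excess = percent_longer(YEAST_NTS["18S"], ECOLI_NTS_ASSUMED["16S"])

print(f"5.8S/25S vs 23S: {large_subunit_excess:.1f}% longer")
print(f"18S vs 16S: {small_subunit_excess:.1f}% longer")
```

With these assumed bacterial lengths, both values come out within a few percentage points of the approximate figures quoted above.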
The core of the ribosome, where aminoacylated-tRNA is decoded, where peptide bonds are formed, and where tRNAs are translocated from one defined binding site of the ribosome to the next, is highly conserved in the ribosomes of all species of life, be it plant, animal, or bacteria (Gutell et al., 1985). Although this so-called catalytic core is conserved across species, eukaryotic ribosomal RNA contains more than 50 additional nucleotide sequences called expansion segments inserted at specific positions of the conserved rRNA core (Gerbi, 1996). The 18S rRNA contains 12 expansion segments (ES) relative to its bacterial counterpart, 16S rRNA, and there are 41 ES located throughout the rRNA of the large subunit (Gerbi, 1996). Complete removal of ES27 from Tetrahymena thermophila rRNA in the large subunit deleteriously affects processing and stability of the rRNA (Sweeney et al., 1994). Similarly, inserting short sequences into ES3 of 18S rRNA in yeast interferes with ribosome assembly (Musters et al., 1990). For the most part, however, the role of expansion segments has not been well characterized.
The ribosomal proteins appear to play a supporting role in protein synthesis, but are additionally required for accurate cleavage and processing of rRNA during ribosome assembly, nuclear export, and maturation of the ribosome (Wool, 1996). Most of the 80S rps are absolutely required for viability. In fact, it has been reported that at least 28 of the 40S rps are necessary for cell growth (Ferreira-Cerca et al., 2005). It can be assumed that a similar percentage of rps in the 60S subunit is required for viability. After being translated in the cytoplasm, ribosomal proteins are imported into the nucleus and then the nucleolus, where they assist in rRNA folding and maturation, as well as subunit biogenesis and nuclear export, whereupon the pre-ribosome is further processed in the cytoplasm (Leger-Silvestre et al., 2004). All this suggests that the rps in eukaryotes play a critical role in extra-ribosomal functions, ribosome biosynthesis and eukaryotic trafficking.
The eubacterial 70S ribosome and the individual small subunit, as well as the archaeal large subunit, have been solved to atomic resolution (see (Blaha et al., 2009) and references therein). While our understanding of ribosome structure and function has been greatly enhanced by the three-dimensional structures, detailed knowledge is still largely limited to bacterial ribosomes. Several new and expanded molecules reside in the eukaryotic ribosome making it significantly more complex. In the small subunit, only 15 of the 32 ribosomal proteins, based on orthologs of the available crystal structures from eubacteria, have been described in the three-dimensional context of the eukaryotic ribosome from yeast (Spahn et al., 2001; Spahn et al., 2004). The large subunit consists of 45 and 33 proteins in yeast and E. coli, respectively, with 18 of those sharing significant sequence homology in the two systems. The sequence homology of these proteins is extended to include those in the mammalian ribosome as well (Chandramouli et al., 2008). Modeling of eukaryotic ribosomal RNA (rRNA) has also been limited to docking of available crystal structures from eubacterial and archaeal sources, omitting rRNA expansion segments unique to the 80S ribosome (Spahn et al., 2001; Spahn et al., 2004). Recent progress has been made to dock rRNA sequences into the cryo-EM map of a dog ribosome, but the model is still missing approximately 50% of the rRNA sequence attributed to expansion segments (Chandramouli et al., 2008).
We present here the most complete structure of the eukaryotic ribosome to date. Our study utilizes the 8.9Å cryo-EM reconstruction of the 80S ribosome from the thermophilic fungus, Thermomyces lanuginosus (Taylor et al., 2007), as a guide for building homology models of the ribosomal proteins and rRNA sequences that fit into the cryo-EM map. Since the T. lanuginosus ribosome shares over 85% sequence identity with that of S. cerevisiae (Nilsson et al., 2007), we have used the sequence of rRNA and ribosomal proteins from S. cerevisiae for modeling and docking.
The intrinsic dynamic motions of ribosomal substructures such as the L1 stalk and the stalk-base of the ribosome have made these regions particularly difficult to characterize by using X-ray crystallographic methods. The structures of atomic models docked into our cryo-EM map allow interpretation of these regions that have not been well characterized to date. Such motions include those of the L1 stalk, the stalk base, and the ratcheting motion of the small subunit relative to the large subunit. Finally, the 80S ribosome presented here is complexed with the eukaryotic elongation factor 2 (eEF2), the homolog of EF-G in bacteria, which catalyzes translocation of tRNAs through the ribosome. In addition to allowing interpretation of a post-translocational state, eEF2 binding to the ribosome stabilizes the rpP0-P1-P2 (L7/L12 in bacteria) stalk. The dynamic motion of this region is likely important for ribosomal function and we characterize the interactions of the stalk base with eEF2. Similarly, the L1 stalk is disordered in the crystal structures of 70S ribosomes and its movement from an open to a closed state has been implicated in the function of protein synthesis using cryo-EM maps of various states (Valle et al., 2003). Our cryo-EM reconstruction contains a P/E hybrid state tRNA, which interacts with the L1 stalk and stabilizes it in the closed position.
In constructing an atomic model for the 80S ribosome from the cryo-EM map, we take advantage of X-ray structures solved for eubacterial ribosomes and their extensive sequence similarity with the T. lanuginosus ribosome. Since the resolution of the reconstructed density map is not sufficient to define the positions of side-chains, the molecular structure we present should be seen as representative of an ensemble of atomic structures consistent with the experimental density, existing structural knowledge, and rules of stereochemistry.
To facilitate model building and docking, the density attributed to rRNA versus that ascribed to rps was computationally segmented by density thresholding as previously described (Spahn et al., 2000) (Supplemental Figure 1). For each ribosomal subunit, the segmented RNA and protein maps were used to generate starting models of either all rRNA or all rps. Once completed, the rRNA and protein models were combined and energy minimized for the 40S and 60S subunits individually and then jointly within the entire 80S cryo-EM density. In all, fifteen ribosomal proteins in the small subunit and 29 ribosomal proteins in the large subunit were modeled into the 80S ribosome based on homology modeling (Table 1; Experimental Procedures). Additionally, ribosomal proteins rpP0, RACK1, and rpL30e, unique to eukaryotes, were modeled and docked into the cryo-EM map, based on the atomic structures of apo-proteins. Finally, we include the localization of rpS19e, another protein unique to 80S ribosomes, which has been linked to a congenital disease in humans called Diamond-Blackfan anemia (DBA).
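The idea behind separating rRNA from protein density by thresholding can be illustrated with a minimal sketch. It assumes that RNA appears at higher density values than protein in the map and that two thresholds are enough to separate them; the function name, the threshold values, and the toy map are hypothetical, and the published procedure (Spahn et al., 2000) involves more than this.

```python
import numpy as np


def segment_by_threshold(density, protein_level, rna_level):
    """Split a 3D density map into RNA and protein masks.

    ASSUMPTION: voxels at or above `rna_level` are treated as RNA, and
    voxels between `protein_level` and `rna_level` as protein. Both
    levels are illustrative parameters, not values from the paper.
    """
    if rna_level <= protein_level:
        raise ValueError("rna_level must be greater than protein_level")
    rna_mask = density >= rna_level
    protein_mask = (density >= protein_level) & ~rna_mask
    return rna_mask, protein_mask


# Toy map: one "protein-like" voxel and one "RNA-like" voxel in empty space.
toy_density = np.zeros((4, 4, 4))
toy_density[0, 0, 0] = 1.0  # moderate density, labeled protein
toy_density[1, 1, 1] = 2.5  # high density, labeled RNA

rna_mask, protein_mask = segment_by_threshold(
    toy_density, protein_level=0.5, rna_level=2.0
)
print("RNA voxels:", int(rna_mask.sum()))
print("Protein voxels:", int(protein_mask.sum()))
```

The two masks are mutually exclusive by construction, so each voxel above the lower threshold is assigned to exactly one component before the separate rRNA and protein starting models are built.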
ES6 is the largest ES, averaging 250 nucleotides in eukaryotes (Neefs and De Wachter, 1990). Residing in the central domain of 18S, ES6 is composed of two halves: a 3’ half exhibiting conserved sequence identity among eukaryotes and a 5’ half with sequence variability that is absent in eubacterial and archaeal 16S rRNA (Cannone et al., 2002). Despite its presence in all eukaryotic 18S rRNA, the secondary structure interactions of ES6 have been difficult to predict, and ES6 is depicted as a large, disordered region in most secondary structure predictions. Recent progress based on phylogenetic comparative analysis, however, has been able to better predict the secondary structure of this large ES (Alkemar and Nygard, 2006; Wuyts et al., 2002). Comparative analysis of over 6000 species concluded that the predicted structure of ES6 is relatively conserved in most eukaryotes and that it likely consists of five hairpins and one internal helix (Alkemar and Nygard, 2006) (Figure 1a). Further support for the double- and single-stranded regions of this predicted structure was provided by chemical and enzymatic digestions of ES6 in wheat, yeast, and mouse ribosomes (Alkemar and Nygard, 2006). The density assigned to ES6 in our cryo-EM map further supports such secondary structure predictions, allowing us to assemble the first three-dimensional model of this complex segment of rRNA.
The entire region of ES6 is surrounded by density assigned to unidentified ribosomal proteins (Figure 2a), which likely assist in proper folding and assembly of the ES. The density of our map supports the notion that ES6 forms five hairpins in yeast 18S rRNA. The two 5’ helices, A and B, of ES6 (using the nomenclature of Alkemar and Nygard, 2006) display clear density that fits the two hairpins. Our data demonstrate that hairpin A of ES6 in yeast is conserved structurally with the entire sequence of ES6 in E. coli 16S rRNA. Previous structural analyses of a 15Å cryo-EM map determined two masses of density that were attributed to ES6 (Spahn et al., 2001). We show that these two masses account for only two of the hairpins in ES6: hairpin E (see below) is accommodated by the mass of density that runs parallel to the length of the 40S subunit, while hairpin B is the mass running perpendicularly to the body of the 40S subunit and projecting into solution on the solvent side. Except at its base, hairpin B of ES6 does not interact with any other component of the ribosome. Hairpin C is a short hairpin that is largely disordered in our cryo-EM map. It appears, however, that this region of ES6 interacts intimately with a large region of unidentified ribosomal protein mass near the base of hairpin B (Figure 1b and Figure 2a). Like hairpin C, hairpin D forms a short hairpin structure; however, the density attributed to it is readily resolved as duplex RNA. This helix extends away from the 40S subunit between hairpins B and D in the 18S rRNA, interacting weakly with the mass of unidentified ribosomal protein near hairpin A.
Hairpin E of ES6 forms a long, bulbous structure on the 40S subunit, which is comparable in length to hairpin B. At the base of hairpin E, there is an interaction with a long C-terminal helix of rpL19e, which projects away from the 60S subunit and toward the 40S subunit (Figure 2b). Density, particularly near the stem loop of hairpin E, suggests that bases in this region of ES6 are interacting with other regions of the 40S subunit. Density near the end of this hairpin indicates an interaction with a long α-helix, which belongs to a 40S ribosomal protein that is unidentified in our homology model. A second, intramolecular interaction is seen about two-thirds down the length of hairpin E, where density reveals a strong interaction between hairpin E of ES6 and ES3 (Figure 2b). This interaction is particularly interesting because it has been suggested, but not yet shown, that nucleotides from ES3 form a tertiary interaction with those from ES6 in eukaryotes (Alkemar and Nygard, 2006). Our structure demonstrates that such a tertiary interaction likely exists.
While the tertiary interaction with ES6 involves the stem of ES3, the apical stem loop of ES3 forms a strong contact with the tip of helix 9 of 18S rRNA. Density from ES3 has been localized to the left foot in a 15Å cryo-EM reconstruction of the 80S ribosome from yeast (Spahn et al., 2001). Helix 9 in 18S rRNA is about twelve nucleotides longer in yeast than in E. coli, supporting the extra density in this region of our cryo-EM reconstruction. Based on the improved resolution of the current reconstruction, we can now identify both apical stem loops of ES3 and helix 9 in the 18S rRNA that contribute to formation of the left foot of the 40S subunit (Figure 2).
ES7 is an extension of helix 26 in 16S rRNA, residing on the back of the shoulder of the 40S subunit. This extended helix is anchored to helix 23 of 18S rRNA in the body of the 40S subunit by a large mass of density, which belongs to unidentified ribosomal proteins, located between ES7 and rpS14 (rpS11p). Zero-length UV crosslinking of mRNA in 80S initiation complexes suggests that eukaryotic initiation factor 3 may bind near ES7 on the 40S subunit (Pisarev et al., 2008). Since initiation of protein synthesis is considerably more complex in eukaryotes than in bacteria (more than 13 initiation factors involved versus three in bacteria), ES7 may have a role in recruiting or stabilizing some of the eukaryote-specific initiation factors.
ES9 is located at the top of the head of the 40S subunit where it interacts with the protein, rpS19e, which is discussed below. Finally, ES12 is an extension of helix 44 in 18S rRNA and forms the right foot of the 40S subunit, as previously pointed out (Spahn et al., 2001).
In addition to comprising part of the 80S ribosome, eukaryotic ribosomal proteins have been shown to be involved in endonucleolytic cleavage events required for maturation of pre-40S and pre-60S particles in the nucleus and the cytoplasm of eukaryotic cells (Choesmel et al., 2007; Leger-Silvestre et al., 2004). Disruption of these events results in various human diseases, such as DBA. Mutations to ribosomal proteins leading to DBA were first identified in the gene coding rpS19e, but have since been associated with several other ribosomal proteins (Robledo et al., 2008). Although the mechanism of pathogenesis of DBA remains unclear, recent evidence points toward defects in ribosome biogenesis as being responsible (Gregory et al., 2007; Idol et al., 2007; Leger-Silvestre et al., 2004). Mutations in several other ribosomal proteins from both subunits, such as rpS17e, rpS24e, rpL5 (rpL18p), and rpL11 (rpL5p) (see (Robledo et al., 2008) and references therein), have been implicated in DBA, most likely through disruptions of ribosome biogenesis.
Ribosome biogenesis follows a conserved pathway that has been extensively studied in yeast. In this pathway, a pre-ribosomal RNA transcript is processed to form the 18S ribosomal RNA (rRNA) of the 40S subunit, as well as the 25S/28S and 5.8S rRNA components of the 60S subunit. In rpS19e-depleted cells, defective cleavage leads to accumulation of novel 21S and 20S nuclear pre-rRNA molecules (Leger-Silvestre et al., 2004). This depletion of rpS19e, as well as mutations to rpS19e that cause DBA in humans, severely affects the production of 40S ribosomal subunits in yeast and human cells (Idol et al., 2007; Leger-Silvestre et al., 2004). Despite its central role in ribosome biogenesis and architecture, the location of rpS19e in the 80S ribosome has remained obscure.
The crystal structure of rpS19e from the archaeon, Pyrococcus abyssi, has recently been solved to atomic resolution (Gregory et al., 2007). We have located rpS19e in the density of our cryo-EM map using a motif search program (Rath et al., 2003) and constructed a homology model of yeast rpS19e based on the P. abyssi crystal structure. Protein rpS19e is located at the top of the “head” of the 40S subunit intercalated between helix 41 and ES9 of 18S rRNA (Figure 3). The location where rpS19e was found in the cryo-EM density map produces an excellent fit (CCC=0.79), which is corroborated by biochemical data. For example, rpS19e forms chemical crosslinks with ribosomal proteins rpS5 (rpS7p), rpS16 (rpS9p), and rpS18 (rpS13p) in the 40S ribosomal subunit (Terao et al., 1980; Tolan and Traut, 1981; Uchiumi et al., 1981; Yeh et al., 1986). All three of these protein orthologs have been modeled into our atomic model of the 80S ribosome and are located in the head of the 40S subunit, proximal to the location of rpS19e presented here (Figures 1b and Supplemental Figure 2). Additionally, immunolabeling experiments have identified the general location of rpS19e to be in the head region of the 40S subunit (Lutsch et al., 1990). Finally, genetic evidence suggests that the nuclear protein Nep1p binds a specific sequence near helix 41 of 18S rRNA and supports the association of rpS19e to pre-ribosomal particles in this vicinity (Buchhaupt et al., 2006). Our localization of rpS19e indicates that it does in fact interact closely with helix 41 of the 18S rRNA in the 40S subunit (Figure 3 and Supplemental Figure 2a). Interestingly, loops in our atomic model that are closest to helix 41 of the rRNA are disordered in the crystal structure of P. abyssi rpS19e, supporting a role in rRNA binding. The C-terminal helix of rpS19e contacts ES9 of the 18S rRNA in our model.
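The quality of fit above is reported as a cross-correlation coefficient (CCC). The text does not give the exact formula used, so the sketch below assumes the common mean-subtracted, normalized form computed over all voxels of two maps sampled on the same grid; the function name and the toy data are hypothetical.

```python
import numpy as np


def cross_correlation_coefficient(map_a, map_b):
    """Normalized cross-correlation between two density maps.

    ASSUMPTION: both maps are on the same grid and the coefficient is
    mean-subtracted (Pearson-like); some fitting programs use a variant
    without mean subtraction or restricted to a mask.
    """
    a = np.asarray(map_a, dtype=float).ravel()
    b = np.asarray(map_b, dtype=float).ravel()
    if a.shape != b.shape:
        raise ValueError("maps must contain the same number of voxels")
    a = a - a.mean()
    b = b - b.mean()
    denominator = np.sqrt((a * a).sum() * (b * b).sum())
    if denominator == 0.0:
        raise ValueError("at least one map has no density variation")
    return float((a * b).sum() / denominator)


# Toy data standing in for experimental density and model-derived density.
rng = np.random.default_rng(0)
experimental = rng.random((8, 8, 8))
model_density = 2.0 * experimental + 3.0  # same shape, different scale/offset

ccc = cross_correlation_coefficient(experimental, model_density)
print(f"CCC = {ccc:.3f}")
```

Because the coefficient is insensitive to overall scaling and offset of the density, a value near 1 reflects agreement in shape between the docked model and the map rather than agreement in absolute density values.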
The location of rpS19e indicates that it may assist in the folding of rRNA in the head region of the pre-40S particle. This late processing step in the nucleus may provide a contact for non-ribosomal factors required for subsequent maturation and nuclear export of the pre-40S particles. Indeed, rpS19e depletion has been shown to abrogate the binding of nonribosomal proteins such as Tsr1, Rio2, and Enp1, that are required for subsequent cleavage events of pre-18S rRNA (Leger-Silvestre et al., 2004). Further support for this hypothesis comes from the fact that other ribosomal proteins, such as rpS5 (rpS7p) and rpS18 (rpS13p), which also reside in the head region of 40S subunits near rpS19e, display similar malfunctions in ribosome assembly when depleted in yeast cells (Ferreira-Cerca et al., 2005). Further studies will be required to fully understand the role of rpS19e in eukaryotic ribosome biogenesis; however, the location of rpS19e in the mature 40S particle identified here provides a basis for its interactions with rRNA and neighboring proteins in the eukaryotic ribosome.
Interestingly, it has recently been shown that mutations or disruptions of genes that encode ribosomal proteins are also implicated in other pediatric bone marrow failure syndromes in humans that include Shwachman-Diamond syndrome, Dyskeratosis congenita, cartilage-hair hypoplasia, and 5q- syndrome (see (Ebert et al., 2008) and references therein). Most notably, mutations to the genes that encode the ribosomal proteins rpS17e (Cmejla et al., 2007) and rpS24e (Choesmel et al., 2008; Gazda et al., 2006) have also been associated with DBA. These findings support the hypothesis that a loss of ribosome function is the cause of such diseases. The location of point mutations to rpS19e that result in DBA suggests that such genetic alterations either inhibit proper folding of rpS19e, or reside in areas that are adjacent to rRNA, where their mutation likely abrogates a chaperone-like folding function of rpS19e on the ribosomal subunit (Supplemental Data).
In the small subunit, the general locations of rpS17e (Lutsch et al., 1979) and rpS24e (Lutsch et al., 1990) have been crudely mapped using immuno-EM of negatively stained ribosomes from rat liver. Antibodies against rpS17e were seen to bind in the head region of the 40S subunit at a location that remained accessible in the 80S ribosome. Antibodies against rpS24e bound to a single region on the interface surface of the 40S subunit. Of seven proteins investigated, rpS24e was the only 40S protein inaccessible to immunolabeling in 80S ribosome preparations, further supporting its existence in the intersubunit space (Lutsch et al., 1990). There are only two masses of density on the 60S interface of the 40S subunit in our cryo-EM map that could account for the location of rpS24e. These densities are seen on either side of helix 44, near ES12, of 18S rRNA. The structures of rpS24e from Thermoplasma acidophilum (Jeon et al., 2006) and P. abyssi (Choesmel et al., 2008) have been solved by NMR spectroscopy and X-ray crystallography, respectively. However, we were unable to confidently dock either structure into our cryo-EM map. We attribute this failure to a combination of the relatively small size of the apo protein (<100 residues) and the limited resolution of our cryo-EM map.
The receptor for activated C-kinase 1 (RACK1) is a WD40 repeat scaffold protein that is highly conserved in eukaryotic ribosomes and functions in a wide range of physiological processes including ribosome assembly and activation, as well as gene transcription and translation (Nilsson et al., 2004). RACK1 has been localized on the back of the head of the 40S subunit by biochemical and cryo-EM techniques (Sengupta et al., 2004). Homology models of RACK1 have been constructed based on the seven-bladed β-propeller architecture of the β subunit of heterotrimeric G proteins, a protein that shares ~25% sequence identity with RACK1 (Chandramouli et al., 2008). Recently, however, the X-ray structures of RACK1 (RACK1A) from Arabidopsis thaliana (Ullah et al., 2008) and from S. cerevisiae (Coyle et al., 2009) were solved, revealing differences between the β subunit of G proteins and RACK1.
Interactions between RACK1 and the 40S subunit are orchestrated primarily through blade 1 in RACK1 (Figure 4a). There is a weak interaction between the base of helix 40 in 18S rRNA and the first loop in blade 1 of RACK1 and a strong interaction between the base of helix 39 in 18S rRNA and the bottom of loops in blades 1 and 2 of RACK1. Genetic mutations revealed patches of basic residues within blade 1 that play a large role in RACK1’s affinity for rRNA in the ribosome (Coyle et al., 2009). A rather strong interaction between rpS16 (rpS9p) of the 40S subunit and the middle of the β-sheet in blade 1 of RACK1 is also apparent from the cryo-EM density. There is density attributed to an unidentified ribosomal protein in our model that is immediately adjacent to the bottom of the propeller in RACK1, which may be important for recruitment and binding of RACK1 to the 40S subunit. This interaction is orchestrated through the knob-like structure protruding between blades 6 and 7 of RACK1, which also comes into proximity of helix 40 in the 18S rRNA (Figure 4a). The knob-like protrusion does not appear to be involved in ribosomal binding, however, as its deletion has no adverse effects on ribosomal binding in vivo (Coyle et al., 2009).
RACK1 density is subdivided into two distinct halves in our cryo-EM map, which correspond to blades 1–3 and 4–7 (Figure 4b). Since RACK1 is known to interact with more than 80 different proteins (Sklan et al., 2006), it is conceivable that the two halves behave as docking stations for separate proteins to bind simultaneously, thus functioning as a chaperone responsible for bringing various proteins into proximity of one another. This notion is supported by the location of conserved residues throughout eukaryotic species that are within each half of the RACK1 propeller (Ullah et al., 2008).
Radiolabeled globin mRNA in the 80S initiation complex from rabbit reticulocytes has been shown to cross-link to rpS7e, rpS10e, rpS25e, rpS29 (rpS14p), and rpL5 (rpL18p) (Takahashi et al., 2005). Of these proteins, rpS5 (rpS7p) and rpL5 (rpL18p) have homologous proteins in eubacterial ribosomes, and are indeed located along the mRNA channel. RpS25e does not have a bacterial homolog, but cross-links directly to rpS5 (rpS7p) (Takahashi et al., 2005). Based on these data, there are two possible locations for rpS25e in our cryo-EM density. One potential density resides between protein rpS5 (rpS7p) and rpS18 (rpS13p) on top of helix between helices 41 and 42 in 18S rRNA. A second possibility is near the exit tunnel, on the neck of the 40S subunit: interacting with the opposite side of rpS5 (rpS7p) and with helix 43 of 18S rRNA (Supplemental Figure 3).
Site-directed zero-length UV crosslinking of mRNA identified four ribosomal proteins on the 40S subunit near the E-site of the 80S initiation complex. Two of these proteins, rpS5 (rpS7p) and rpS14 (rpS11p), have bacterial homologs and reside in the head and platform on the body of the 40S subunit, respectively. The two proteins that are eukaryote-specific and also cross-link to mRNA near the E-site of the ribosome are rpS26e and rpS28e. Based on our cryo-EM reconstruction, we hypothesize that both these proteins reside in the mass of unassigned density that resides on the platform of the 40S subunit surrounding rpS14 (rpS11p) and near the E-site of the ribosome.
Expansion segments are found in all six domains of 5.8/25S rRNA of the large subunit (Figure 5). 5.8S rRNA contains one entire ES as well as part of a second ES. ES4 is a hairpin rRNA formed by the 3’ segment of 5.8S rRNA and the 5’ terminus of 25S rRNA (Figure 5, Supplemental Figure 4a). This ES displays weak interactions with rpL8 (rpL7Ae) and an unassigned α-helix. ES3 is formed entirely of 5.8S rRNA and resides between ES19 of 25S rRNA and rpL25 (rpL23p). The tip of ES3 makes a weak contact with helix 54 of 25S rRNA (Supplemental Figure 4b).
The remaining expansion segments of the large subunit reside within 25S rRNA. A detailed description of the structure and architecture of these expansion segments is presented in Supplemental Data (Supplemental Figure 4). However, some novel findings regarding expansion segments residing in the 60S subunit are presented here.
Because of the proximity of their points of origin and the limited resolution of previous cryo-EM reconstructions of fungal 80S ribosomes, ES7 and ES39 have been jointly assigned to the same general density on the back of the 60S subunit (Nilsson et al., 2007; Spahn et al., 2001). We are now able to distinguish these two expansion segments in our cryo-EM map as separate entities. Part of the difficulty in resolving these two rRNA regions at lower resolution must be attributed to their tight association with ribosomal proteins that have no homologs from the available X-ray atomic model. ES7 folds into two primary stem loops separated by a helix and has a third, very short stem loop at its 3’ end (Figure 5). The 5’ stem loop originates near the base of ES39, but protrudes away from the 60S subunit in the opposite direction. This duplex has slightly different conformations in the cryo-EM reconstructions of 80S ribosomes from S. cerevisiae and T. lanuginosus. In S. cerevisiae, this ES is seen jutting out into solution giving an “open” appearance, while it appears closer to the core of the ribosome in T. lanuginosus, which is best described as being “closed” (Nilsson et al., 2007). In our reconstruction, we can see that the “closed” conformation arises from the presence of an unidentified ribosomal protein anchoring this duplex of rRNA to the body of the 60S subunit (Figure 6a). Comparison to the density of an 11.7Å reconstruction of the 80S ribosome from S. cerevisiae (Spahn et al., 2004) reveals the absence of this particular protein mass, which allows ES7 to adopt the “open” conformation (Figure 6b). It is possible that the protein is present in 80S ribosomes from T. lanuginosus and not those from S. cerevisiae, but the high sequence identity between the two species mentioned earlier makes that explanation unlikely. The genome of T. lanuginosus, including those genes that encode ribosomal proteins, has not yet been sequenced completely enough to address this issue.
Nonetheless, two-dimensional gel electrophoresis of T. lanuginosus 80S ribosomes revealed the same number, and similar size, of ribosomal proteins in both subunits as those in S. cerevisiae (Wu et al., 1995). Therefore, we believe the protein density relates to a ribosomal protein conserved throughout fungi and may become dissociated from the S. cerevisiae ribosome during purification while remaining bound to the T. lanuginosus ribosome. The tighter binding of this protein in T. lanuginosus might be required for the organism’s increased thermostability.
The longest stem loop of ES7 resides about 70Å away from ES39. Some unidentified ribosomal proteins surround this stem loop. Near the base of ES7 is a protein with an extraordinarily long α-helix that stretches across the back of the 60S subunit. This α-helix is approximately 80Å in length and makes extensive contacts with the long stem loop, as well as the short 3’ stem loop structure, in ES7. Based on secondary structure predictions of unidentified, 60S ribosomal proteins, we predict the long α-helix is actually composed of two α-helices from separate ribosomal proteins, but which are in close proximity giving the appearance of a single, very long α-helix. The longer region of this density can be attributed to the N-terminus of rpL7 (rpL30p), which is predicted to possess a C-terminal helix comprised of about 55 residues in yeast and runs along the longer hairpin of ES7 (Supplemental Data, Figure 5). The shorter portion of α-helical density likely belongs to the unidentified ribosomal protein, which anchors the 5’ hairpin of ES7 to 60S, as discussed previously.
The stalk base of the large ribosomal subunit comprises the universal GTPase-associated center (GAC) of the ribosome. This region is composed of ribosomal RNA and proteins and is essential for recruitment, binding, and GTPase activity of ribosomal GTPases involved in initiation, decoding, translocation, and peptide release during protein synthesis (Frank et al., 2007). Ribosomal RNA from the large subunit, including the sarcin-ricin loop, and protein rpP0 in eukaryotes (L10 in eubacteria) contribute to the architecture of the stalk base. An rpP0 homolog representing residues 3–120 (of 312 total) was modeled into our cryo-EM density, based on the crystal structure of L10 from Thermotoga maritima (Diaconu et al., 2005) (Figure 7). The N-terminal portion of rpP0 is responsible for anchoring the stalk to a highly conserved region of 25S rRNA in the 60S subunit, while the CTD of rpP0 functions in anchoring the acidic P (P1 and P2) proteins to the stalk (Gudkov et al., 1980).
The eubacterial stalk base is composed of two or three dimers of the L12 protein (also referred to as L7/L12) that bind the CTD of L10 and form interactions with ribosomal factors, such as EF-G, that are important for catalyzing protein synthesis (Datta et al., 2005; Diaconu et al., 2005). Two dimers of the acidic P1 and P2 proteins form a pentamer with the P0 protein, which constitutes the “L7/L12” stalk in eukaryotes. In yeast, slight alterations in sequence have yielded additional proteins, named P1A, P1B, P2A, and P2B, with dimers being formed by P1A-P2B and P1B-P2A polypeptides. The carboxyl domain of rpP0 (about 100 residues) is absent in the L10 orthologs and, in addition to sharing sequence homology with the P1 and P2 proteins, the CTD of P0 may also mimic the P1/P2 function. Support for this hypothesis comes from the fact that elimination of all four P proteins (P1A, P2A, P1B, and P2B) has no lethal effect on cells (Remacha et al., 1995). In contrast, rpP0 is absolutely required, as disruption of its gene in yeast is lethal (Santos and Ballesta, 1994). It has been established in yeast that residues 199–230 and 231–258 of the CTD of rpP0 are responsible for binding the P1A-P2B and P1B-P2A dimers, respectively (Krokowski et al., 2006). The carboxyl end of rpP0 is structurally similar to the P1/P2 proteins and can function as the P1/P2 proteins in their absence (Santos and Ballesta, 1994). This highly conserved CTD interacts with eEF2 (Lalioti et al., 2002; Vard et al., 1997). As with the P1 and P2 acidic proteins, the CTD of rpP0 is predicted to comprise several helices. There is clear density that would account for the CTD of rpP0 contacting domain I of eEF2 in the structure presented here (Figure 7).
The N-terminal domains of three L7/L12 dimers from T. maritima have been solved in complex with L10 to atomic resolution (Diaconu et al., 2005). Diaconu et al. demonstrated that the X-ray crystal structure could be docked into an 18Å cryo-EM map of the 70S ribosome from E. coli with high fidelity. In general, both the P and L12 proteins consist of an N- and C-terminal domain connected via an alanine-rich hinge. However, it has been argued that the L12 and P proteins are not structurally related, and there is no considerable similarity in sequence between the eukaryal/archaeal P proteins and the eubacterial L12 proteins (Grela et al., 2008). For these reasons, we have refrained from including the P1/P2 proteins in our model.
A series of α-helices belonging to various ribosomal proteins of the 60S subunit, yet residing in regions with no corresponding structural homologs, were modeled into the cryo-EM density where the secondary structure was obvious and where the α-helix was supported by secondary structure prediction using the Phyre server (Kelley and Sternberg, 2009). These helices were modeled for proteins rpL7 (rpL30p), rpL16 (rpL13p), rpL19e, rpL21e, and rpL35 (rpL29p). In addition to their architectural role in the 80S ribosome, these proteins possess other biological functions (Supplemental Data and Supplemental Figure 5).
The 8.9Å cryo-EM map of the 80S ribosome from T. lanuginosus (accession number EMD-1345 in the 3D-EM database, EMBL-EBI (Taylor et al., 2007)) was used for docking the 80S rRNA and rps. The electron-rich density attributed to rRNA was segmented from the density ascribed to ribosomal proteins taking account of the known volume ratio in the cryo-EM volume (Spahn et al., 2000).
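The idea of separating rRNA from protein density by a known volume ratio can be illustrated with a minimal sketch. It assumes that the densest voxels inside the ribosome envelope belong to rRNA and that a target rRNA volume fraction is supplied by the user; the function names and the example fraction are hypothetical and are not taken from Spahn et al. (2000).

```python
def rna_density_threshold(densities, rna_fraction):
    """Return the density cutoff at or above which voxels are assigned to rRNA.

    densities    : density values for voxels inside the ribosome envelope
    rna_fraction : assumed fraction of the envelope volume occupied by rRNA
    """
    if not 0.0 < rna_fraction < 1.0:
        raise ValueError("rna_fraction must lie strictly between 0 and 1")
    ranked = sorted(densities, reverse=True)        # densest voxels first
    n_rna = max(1, round(rna_fraction * len(ranked)))
    return ranked[n_rna - 1]


def segment(densities, rna_fraction):
    """Label each voxel 'rRNA' or 'protein' using the volume-ratio cutoff."""
    cutoff = rna_density_threshold(densities, rna_fraction)
    return ["rRNA" if d >= cutoff else "protein" for d in densities]


# Hypothetical toy data: ten voxels, 60% of the volume assumed to be rRNA.
voxels = [0.9, 0.2, 0.8, 0.3, 0.7, 0.1, 0.95, 0.4, 0.85, 0.75]
labels = segment(voxels, 0.6)
```

In a real map, the voxels would first be restricted to the molecular envelope, and ties at the cutoff would need a rule of their own.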
We based most of the model for the yeast ribosomal RNA on the crystal structure of the Escherichia coli 70S ribosome (Berk et al., 2006; Schuwirth et al., 2005), making sequence changes, insertions and deletions as required. The ribosome of Thermomyces lanuginosus, which is the source of the cryo-EM map, shares over 85% sequence identity with that of Saccharomyces cerevisiae. Since the latter is much more widely studied than the former, we have used S. cerevisiae sequences for the model.
To determine homologous nucleotide positions between the E. coli and S. cerevisiae ribosomal RNA sequences, we used the ribosomal RNA alignments available on the Comparative RNA Website (CRW) (Cannone et al., 2002). This alignment results from a covariation analysis of a multiple sequence alignment using the ribosomal RNA sequences of hundreds of organisms. The CRW alignments are superior to those that might be inferred from traditional sequence alignment programs such as ClustalW and BLAST.
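Covariation between two alignment columns is often quantified by their mutual information; the sketch below illustrates that general idea only and is not a description of the analysis behind the CRW alignments. The function name and the toy columns are hypothetical.

```python
import math
from collections import Counter


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    col_i, col_j : equal-length sequences giving the nucleotide found in
                   each organism at the two alignment positions.
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_i)
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((freq_i[a] / n) * (freq_j[b] / n)))
    return mi


# Two columns that always change together (G-C, C-G, A-U, U-A pairs)
# score highly; fully conserved or independent columns score zero.
paired = mutual_information("GGCCAAUU", "CCGGUUAA")
conserved = mutual_information("GGGG", "CCCC")
independent = mutual_information("GGCC", "CGCG")
```

High mutual information between two positions is consistent with a base pair that is maintained by compensating changes, which is the kind of signal a covariation analysis exploits.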
The secondary structure model of the S. cerevisiae rRNA from the CRW identifies expansion segments in the RNAs of both the large and small subunits. The secondary structures of some expansion segments were not available from the CRW. For these cases, we used MFOLD to predict the secondary structure (Zuker, 2003). For some large expansion segments (> 200 nucleotides), like ES6 in the small subunit, a secondary structure has been proposed on the basis of experimental methods (Alkemar and Nygard, 2006). Some small nucleotide segments within the ES of the large subunit (ES28, ES39, etc.) lacked both experimental data and usable MFOLD predictions. We modeled these as single strands or loops.
Given the sequence alignment (which specifies the correspondence between nucleotides in the yeast and E. coli rRNAs), single nucleotide changes were easily made in double-helical regions, and in those single-stranded and loop regions where the yeast and E. coli sequences are of equal length. Nucleotide substitutions in these regions were made using the Biopolymer module of the Insight-II software package (Molecular Simulations, Inc., San Diego, CA). In those cases where insertions and/or deletions were required, we used loops with the desired number of bases from crystal structures, manually replacing the existing loop with the model loop, with manual adjustments to get reasonable fits prior to minimization. In most cases some nucleotides needed to be substituted to fit the yeast sequence. Variable regions such as helix 33 (the beak region of SSU) contained many changes that needed extensive manual intervention.
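The bookkeeping implied by this step, deciding from the alignment which positions need a simple substitution and which need an insertion or deletion, can be sketched as follows. This is an illustration under stated assumptions, not the procedure used with Insight-II: the function name, the gap character, and the short example sequences are all hypothetical.

```python
def classify_alignment(template, target, gap="-"):
    """Classify aligned positions as substitutions, insertions, or deletions.

    template : aligned template sequence (e.g. E. coli rRNA), gaps as '-'
    target   : aligned target sequence (e.g. yeast rRNA), gaps as '-'
    Positions are 1-based and count template nucleotides only.
    """
    if len(template) != len(target):
        raise ValueError("aligned sequences must have equal length")
    substitutions = []   # (template position, template nt, target nt)
    insertions = []      # (template position preceding the insert, target nt)
    deletions = []       # template positions absent from the target
    pos = 0
    for t_nt, q_nt in zip(template, target):
        if t_nt != gap:
            pos += 1
        if t_nt == gap and q_nt == gap:
            continue                      # column empty in both sequences
        if t_nt == gap:
            insertions.append((pos, q_nt))
        elif q_nt == gap:
            deletions.append(pos)
        elif t_nt != q_nt:
            substitutions.append((pos, t_nt, q_nt))
    return substitutions, insertions, deletions


# Hypothetical eight-column alignment.
subs, ins, dels = classify_alignment("GGCA-UCC", "GACAAU-C")
```

Substitutions can be applied directly to the template coordinates, whereas the insertion and deletion sites mark the loops that require replacement and manual adjustment.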
For those ES with known secondary structures, the MCSYM program was used to generate tertiary structures (Major et al., 1991). When generating tertiary structures, MCSYM takes into account structural constraints such as pairing partners and the type of interaction between them, connection information between consecutive residues, and so on, as specified by the user. The program then searches through a structural database and gathers all the pairs that satisfy the constraints. Using the residues gathered, it generates a three-dimensional model. The structures generated by MCSYM are not always energetically favorable, because the models generated satisfy only the constraints provided and do not use energetic constraints, and MCSYM does not refine these initial models. Thus the models obtained from MCSYM need to be analyzed and minimized using a molecular mechanics program. For this purpose we used the Insight-II Discover module.
Chimera (Pettersen et al., 2004) was used to view the density map and perform manual fitting, roughly placing the components of the model inside their appropriate regions of density. The fit-in-map module of Chimera was used to fit the rigid ribosomal RNA model into the density map. For those expansion segments consisting of multiple helical regions, the helices were separated at the loop regions, each helix was independently fitted into the density, and the helices were then reconnected. Helical regions are well ordered in the density maps, allowing us to fit those regions of our model with confidence.
Some expansion segments share the same cryo-EM density, especially in the large subunit, so careful modeling of the RNA was essential to avoid overlap. A combination of manual and automated methods was used in these efforts, and sometimes multiple iterations were required to generate satisfactory models.
The initial fit to the cryo-EM density of T. lanuginosus was refined with a flexible fitting algorithm using the Emmental sub-package in the YUP.scx module of YUP (http://rumour.biology.gatech.edu/YammpWeb) (Tan et al., 2006). A Gaussian Network Model (GNM) represents the all-atom structure. The energy function contains terms for scoring the quality of the fit of the model to the density map, plus restraint energies for the GNM and volume exclusion terms. The optimization protocol uses simulated annealing with molecular dynamics. This optimization produced some steric clashes. These included a handful of ring penetrations, which we corrected manually, before minimization with NAMD (Nelson et al., 1996), using the AMBER force field (Cornell et al., 1995).
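The restraint part of such an energy function can be illustrated with a simple harmonic network. The sketch below is an assumption-laden toy, not the YUP.scx implementation: it connects every pair of atoms lying within a cutoff in the starting structure by a spring whose rest length is the starting distance. The cutoff, the force constant `k`, and the function names are hypothetical, and the fit-to-density and volume-exclusion terms are omitted.

```python
import math


def build_network(coords, cutoff):
    """List (i, j, d0) for every atom pair within `cutoff` in the start model."""
    pairs = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d0 = math.dist(coords[i], coords[j])
            if d0 <= cutoff:
                pairs.append((i, j, d0))
    return pairs


def restraint_energy(coords, network, k=1.0):
    """Sum of harmonic penalties 0.5 * k * (d - d0)^2 over the network."""
    energy = 0.0
    for i, j, d0 in network:
        d = math.dist(coords[i], coords[j])
        energy += 0.5 * k * (d - d0) ** 2
    return energy


# Hypothetical three-atom example (coordinates in angstroms).
start = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
network = build_network(start, cutoff=5.0)
moved = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
e_start = restraint_energy(start, network)
e_moved = restraint_energy(moved, network)
```

A restraint of this kind penalizes local distortion while leaving distant parts of the structure free to move relative to one another, which is what allows a model to flex into the density.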
There are 78 known proteins in the yeast ribosome, of which 44 have homologs available in archaeal and bacterial ribosome structures. We calculated models for these protein sequences using MODPIPE, an automated software pipeline for large-scale protein structure modeling (Eswar et al., 2003). MODPIPE relies on MODELLER (Sali and Blundell, 1993) for its functionality and can calculate comparative models for a large number of sequences using different template structures and sequence-structure alignments. Sequence-structure matches were established using a variety of fold-assignment methods including sequence-sequence (Smith and Waterman, 1981), profile-sequence (Altschul et al., 1997; Eswar et al., 2005a) and profile-profile alignments (Eswar et al., 2005b; Marti-Renom et al., 2004). Increased sensitivity of the search for known template structures was achieved by using an E-value threshold of 1.0. Ten models were calculated for each of the sequence-structure matches, resulting in a reasonable degree of conformational sampling. The best scoring model for each alignment was then chosen using the DOPE score, a distance-dependent atomic statistical potential (Shen and Sali, 2006). Finally, all models generated for a given input sequence were evaluated for the correctness of the fold using a composite model quality criterion that included the coverage of the model, sequence identity of the sequence-structure alignment, the fraction of gaps in the alignment, the compactness of the model, and statistical potential Z-scores (Eramian et al., 2008; Melo et al., 2002; Shen and Sali, 2006). Only models that were assessed to have the correct fold were considered for docking into the EM density map.
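The selection logic described here, an E-value cutoff on template matches followed by choosing the lowest DOPE score among the models built for each alignment, can be sketched in a few lines. This is not MODPIPE code; the record layout, field names, and scores below are hypothetical.

```python
def select_best_models(models, evalue_threshold=1.0):
    """Keep, for each alignment, the model with the lowest DOPE score.

    models : iterable of dicts with keys 'alignment', 'evalue', 'dope'
             (a hypothetical record layout chosen for this sketch).
    Matches whose E-value exceeds the threshold are discarded first.
    """
    best = {}
    for model in models:
        if model["evalue"] > evalue_threshold:
            continue                       # template match too weak
        key = model["alignment"]
        # DOPE is energy-like, so a lower value indicates a better model.
        if key not in best or model["dope"] < best[key]["dope"]:
            best[key] = model
    return best


# Hypothetical candidates for two alignments plus one rejected match.
candidates = [
    {"alignment": "aln1", "evalue": 1e-20, "dope": -15200.0},
    {"alignment": "aln1", "evalue": 1e-20, "dope": -15850.0},
    {"alignment": "aln2", "evalue": 0.5, "dope": -9100.0},
    {"alignment": "aln3", "evalue": 4.0, "dope": -20000.0},
]
chosen = select_best_models(candidates)
```

The composite fold-assessment step that follows in the text would then be applied to the surviving models.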
Each of the models calculated with the procedure described above was rigidly fitted into the T. lanuginosus density map using Mod-EM (Topf et al., 2005). The initial position of each protein model was assigned by superposing its coordinates onto the corresponding coordinates of comparative models previously fitted into the 11.7 Å cryo-EM map of the yeast ribosome (PDB ID: 1S1H: 40S subunit, 1S1I: 60S subunit (Spahn et al., 2004)). The search for the best fit was then achieved by a local exhaustive exploration of Euler angles, guided by a cross-correlation coefficient (CCC) between the model and the density map (Topf et al., 2005). To further validate the best fits, they were compared to the best local fits achieved using the Chimera fit-in-map module. For each protein, the best model was selected based on a combination of the highest CCC and the lowest DOPE score (Eramian et al., 2008; Shen and Sali, 2006; Topf et al., 2005; Topf et al., 2006).
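A cross-correlation coefficient of this kind can be illustrated with a short sketch. It assumes one common definition, the normalized dot product of the two density grids without mean subtraction; the exact form used by Mod-EM may differ. The flattened toy grids, the orientation labels, and the function names are hypothetical.

```python
import math


def cross_correlation(model_density, map_density):
    """Normalized cross-correlation between two density grids.

    Both arguments are flat sequences of voxel values on the same grid.
    """
    if len(model_density) != len(map_density):
        raise ValueError("grids must contain the same number of voxels")
    numerator = sum(p * q for p, q in zip(model_density, map_density))
    norm = math.sqrt(sum(p * p for p in model_density)
                     * sum(q * q for q in map_density))
    if norm == 0.0:
        raise ValueError("at least one grid has no density")
    return numerator / norm


def best_fit(candidates, map_density):
    """Return the label of the candidate placement with the highest CCC."""
    return max(candidates,
               key=lambda label: cross_correlation(candidates[label],
                                                   map_density))


# Hypothetical one-dimensional "map" and two simulated model densities,
# standing in for two sampled orientations of a protein model.
em_map = [0.0, 1.0, 2.0, 1.0, 0.0]
orientations = {
    "orientation_a": [0.0, 1.0, 2.0, 1.0, 0.0],
    "orientation_b": [2.0, 1.0, 0.0, 0.0, 0.0],
}
winner = best_fit(orientations, em_map)
```

In an exhaustive local search, each sampled set of Euler angles would generate one such simulated density, and the orientation with the highest coefficient would be retained.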
The next step was a visual inspection of all fitted models using Chimera (Goddard et al., 2005) in the context of the fitted rRNA chains. The inspection revealed some atom clashes (between two protein models or between a protein model and an rRNA chain) and some rigid parts (such as individual domains or groups of α-helices) that were shifted or rotated relative to the density. For these cases, we performed flexible fitting to refine the models, using Flex-EM (Topf et al., 2008). Parts of the models without clear density were omitted from the initial models.
The ribosomal proteins were then added to the rRNA model using Insight-II and VMD. Because the proteins were modeled separately, we inspected the interface between each protein and RNA for any steric clashes. In most cases, only minor manual adjustments were needed, and each protein was then energy minimized along with the neighboring rRNA using Insight-II and NAMD for 1000 steps. To achieve the correct stereochemistry and eliminate steric clashes, each subunit of the final combined protein and nucleic acid model was minimized for 200,000 steps using the steepest descent algorithm with NAMD, using the CHARMM force field. The final steric acceptability of the model was verified through extensive visual inspection of the final model, followed by a quantitative evaluation using the RCSB Protein Data Bank’s Automatic Deposition and Inspection Tool (ADIT). The final CCC value of the entire model was 0.72.
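The principle of steepest descent minimization can be shown on a toy energy surface. The sketch below is illustrative only: it uses a fixed step size and a hypothetical two-variable quadratic energy, whereas a molecular mechanics package evaluates a full force field and manages the step size itself.

```python
def steepest_descent(gradient, start, step=0.1, n_steps=1000):
    """Minimize a function by repeatedly moving against its gradient.

    gradient : function returning the gradient as a list at a given point
    start    : initial coordinates
    step     : fixed step size (an assumption of this sketch)
    """
    x = list(start)
    for _ in range(n_steps):
        g = gradient(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def toy_gradient(x):
    """Gradient of the hypothetical energy E = (x0 - 1)^2 + 2 * (x1 + 2)^2."""
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 2.0)]


minimum = steepest_descent(toy_gradient, start=[5.0, 5.0])
```

Each step lowers the energy along the local downhill direction, which relieves close contacts efficiently but converges slowly near the minimum; this is one reason many steps are typically needed for a large model.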
We acknowledge support by Howard Hughes Medical Institute and NIH R37 GM29169 and GM55440 (to J.F.), NIH R01 GM54762, U54 GM074945, PN2 EY016525, R01 GM083960, and NSF IIS-0705196 (to A.S.), and an MRC Career Development Award G0600084 (to M.T.). SCH received support from the Georgia Research Alliance and NIH GM53827, and from NIH RR12255 (C.L. Brooks III, PI). We thank Bimal Rath for assistance with the motif search and Lila Rubenstein for assistance with the illustrations. The atomic models of the 60S rRNA and rps have been deposited separately into the Protein Data Bank with accession codes 3JYX and 3JYW, respectively. Additionally, the atomic model of the entire 40S subunit, including both rRNA and rps along with P/E-tRNA, has been deposited into the Protein Data Bank with accession code 3JYV.