Assembly of heptameric Sm protein rings on snRNAs (Sm cores), essential for snRNP function, is mediated by the SMN complex. Specific Sm core assembly depends on Sm proteins and snRNA recognition by SMN/Gemin2- and Gemin5-containing subunits, respectively. The mechanism by which the Sm proteins are gathered and illicit Sm core assembly is prevented is unknown. Here, we describe the 2.5 Å crystal structure of Gemin2 bound to SmD1/D2/F/E/G pentamer and SMN’s Gemin2-binding domain, a key assembly intermediate. Remarkably, through its extended conformation, Gemin2 wraps around the crescent-shaped pentamer, interacting with all five Sm proteins and gripping its bottom- and top-sides and outer perimeter. It further reaches into its RNA-binding pocket, preventing it from binding RNA. Interestingly, SMN-Gemin2 interaction is abrogated by an SMA (spinal muscular atrophy)-causing mutation in an SMN helix that mediates Gemin2 binding. These findings provide mechanistic insights into SMN complex function, linking snRNP biogenesis and SMA pathogenesis.
Small nuclear ribonucleoprotein particles (snRNPs) are a major class of non-coding RNA-protein complexes that play key roles in post-transcriptional gene expression, including pre-mRNA splicing and suppression of premature termination (Kaida et al., 2010; Staley and Guthrie, 1998; Wahl et al., 2009). Each snRNP consists of one ~100–200 nucleotide small nuclear RNA (snRNA), called U1, U2, U4, U5, U11, U12, and U4atac, and a heptameric ring of Sm proteins (B/B′, D1, D2, D3, E, F, and G) that surrounds the snRNA’s Sm site (Sm core), as well as several proteins specific to each U snRNA (Kambach et al., 1999; Newman and Nagai, 2010; Patel and Steitz, 2003; Pomeranz Krummel et al., 2009; Weber et al., 2010; Will and Luhrmann, 2001). Sm cores are essential for the function, stability and nuclear localization of snRNPs, and their assembly is a key step in snRNP biogenesis (Mattaj et al., 1993; Will and Luhrmann, 2001). In vitro, purified Sm proteins can spontaneously form Sm cores on any RNA or oligoribonucleotide that contains a sequence resembling an Sm site, AUUUUUG or AUUUGUG (Raker et al., 1999). This assembly occurs stepwise from three Sm heteromeric subcore complexes, SmD1/D2, SmF/E/G and SmB/D3. SmD1/D2 and SmF/E/G first associate, forming a pentameric subcore that avidly binds RNA and this subsequently recruits SmB/D3, completing the formation of a highly stable Sm core (Raker et al., 1996). However, as sequences on which Sm proteins have the propensity to assemble do not uniquely define snRNAs (Pellizzoni et al., 2002b), in cells potentially deleterious illicit Sm core assembly is prevented by the SMN complex, a molecular assembly machine that confers the necessary stringent specificity, ensuring Sm core assembly only on snRNAs (Fischer et al., 1997; Liu et al., 1997; Meister et al., 2001a; Pellizzoni et al., 2002b).
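The point that Sm-site-like sequences do not uniquely define snRNAs can be illustrated with a simple sequence scan. The following is a minimal sketch, assuming only the two heptamers quoted above (AUUUUUG and AUUUGUG); the function name and both toy sequences are hypothetical and are not taken from any real transcript.

```python
import re

# The two Sm-site-like heptamers quoted in the text (Raker et al., 1999).
# A lookahead is used so that overlapping matches would also be reported.
SM_SITE = re.compile(r"(?=(AUUUUUG|AUUUGUG))")


def find_sm_like_sites(rna):
    """Return (0-based position, heptamer) for every Sm-site-like match."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start(), m.group(1)) for m in SM_SITE.finditer(rna)]


# Hypothetical sequences, invented for illustration only.
toy_snrna_like = "GGCAAUUUUUGACGG"
toy_unrelated = "CCAUUUGUGCCAUUUUUGAA"

print(find_sm_like_sites(toy_snrna_like))
print(find_sm_like_sites(toy_unrelated))
```

Both toy sequences contain a match, which is the sense in which a short heptamer alone cannot discriminate snRNAs from other RNAs.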
The SMN complex is comprised of SMN, Gemins 2–8 and Unrip (Baccon et al., 2002; Carissimi et al., 2005; Carissimi et al., 2006; Charroux et al., 1999; Charroux et al., 2000; Grimmler et al., 2005; Gubitz et al., 2002; Liu et al., 1997; Pellizzoni et al., 2002a). Recent findings showed that the active SMN complex, comprised of all of its known components, is made up of distinct subunits (Battle et al., 2007; Carissimi et al., 2005; Carissimi et al., 2006; Chari et al., 2008; Yong et al., 2010). The Sm proteins are recognized by a subunit that includes SMN and Gemin2 (Chari et al., 2008; Yong et al., 2010). The specificity for snRNAs is determined separately by Gemin5 (Battle et al., 2006; Lau et al., 2009), which recognizes a large ~50–60 nucleotide structure, called the snRNP code, that includes the Sm site and an adjacent 3′-terminal stem-loop structure found in all the pre-snRNAs and distinguishing them from other classes of RNAs (Golembe et al., 2005; Yong et al., 2004a; Yong et al., 2010). Additional subunits, one containing Gemins 6/7/8 and Unrip that can interact with SMN/Gemin2, and another comprised of the putative RNA helicase Gemin3 that can interact with Gemin4 and Gemin5, are also required for Sm core assembly in complex eukaryotes but their specific functions are not yet known (Battle et al., 2007; Carissimi et al., 2006; Yong et al., 2010). Despite significant advances, essential details on the process by which specific Sm core assembly is achieved in cells remain to be defined and a lack of atomic resolution structure of SMN complex components has limited progress on its mechanism and function. To date, only the structures of one domain in SMN, the Tudor domain (Selenko et al., 2001; Sprangers et al., 2003), and of a Gemins 6/7 heterodimer (Ma et al., 2005) have been described.
Here, we identified Gemin2 as the protein that binds a pentamer of Sm proteins comprised of SmD1/D2 and SmF/E/G. We determined the crystal structure of this complex bound to SMN’s Gemin2 binding domain to 2.5 Å, providing important mechanistic insights for SMN complex function and on Sm core assembly. An additional dimension of interest in the SMN complex and the snRNP assembly pathway comes from the fact that reduced levels of functional SMN, due to protein deficiency (>97% of the cases) or loss of function mutations, cause spinal muscular atrophy (SMA), a common motor neuron degenerative disease and a leading hereditary cause of infant mortality (Lefebvre et al., 1995; Talbot and Davies, 2001; Wirth et al., 2006). Information from the structure we determined explains the molecular basis of an SMA-causing patient mutation in SMN, linking a defect in Gemin2-mediated Sm pentamer recruitment to SMA.
Previous studies identified subunits of the SMN complex, including a key intermediate containing SMN, Gemin2 and a subset of Sm proteins in human cells (Battle et al., 2007; Carissimi et al., 2005; Carissimi et al., 2006; Chari et al., 2008; Yong et al., 2010). SMN and Gemin2, also known as SMN interacting protein 1 (SIP1) (Liu et al., 1997), exist as a heterodimer that further oligomerizes via SMN’s C-terminal YG domain (Pellizzoni et al., 1999). To define interactions of SMN and Gemin2 with Sm proteins, we performed in vitro binding experiments using recombinant SMN, Gemin2 with all the Sm proteins, co-expressed according to their known heteromeric interactions, SmD1/D2, SmF/E/G, and SmB/D3 (Kambach et al., 1999; Raker et al., 1996). Binding assays with the purified proteins showed that SMN/Gemin2 bound to SmD1/D2 and SmF/E/G together, but little or no binding was detected to either of these alone (Figure 1), suggesting that SMN/Gemin2 binds to a pentamer that these di- and tri-heteromeric Sm protein complexes can form (Raker et al., 1996). No binding to SmB/D3 was detected under these conditions. SmB/D3 was previously shown to interact with SMN, dependent on post-translational arginine methylation of their RG-rich domains (Brahms et al., 2001; Friesen and Dreyfuss, 2000; Friesen et al., 2001a), which for efficiency of expression were not included in our constructs. Surprisingly, deletions of most of SMN, leaving only SMN’s Gemin2 binding domain (SMNGe2BD; SMN residues 26–62) (Liu et al., 1997; Wang and Dreyfuss, 2001), showed similar Sm pentamer binding, indicating that most of SMN is not necessary for this interaction (Figure 1 and data not shown). Indeed, Gemin2 alone could bind the Sm pentamer with similar efficiency to that of SMN/Gemin2 (Figure 1).
The structure of an SMN complex intermediate that contains the majority of the Sm proteins is critical for understanding the Sm core assembly process, and we therefore set up crystallization trials of SMN/Gemin2 with SmD1/D2 and SmF/E/G. Attempts to obtain crystals containing full-length SMN were not successful. However, a complex containing Gemin2, SMNGe2BD and the Sm pentamer migrated as a single peak in gel filtration and yielded well-diffracting crystals. The structure of human SmD1/D2 dimer was previously determined to 2.5 Å (Kambach et al., 1999). In addition, the structures of human SmB/D3 as well as bacterial and archaeal Sm-like proteins were determined to high resolution (Collins et al., 2003; Kambach et al., 1999; Sauter et al., 2003; Toro et al., 2001), and we used them to build structure models of SmF, E, and G. Using SmD1/D2 and SmF/E/G models for molecular replacement in combination with single-wavelength anomalous diffraction (SAD) phasing from selenomethionine (Se-Met) labeling in Gemin2 and SmF, E, G, we solved the structure of this seven-component complex to 2.5 Å resolution (Table 1).
The final refined model of Gemin2-SMNGe2BD-Sm pentamer (PDB ID code 3S6N) is shown in Figure 2. Overall, the five Sm proteins in the complex are arranged as a crescent-shaped 5/7th of a ring. The arrangement of the Sm proteins in the pentamer, as SmD1, D2, F, E, and G, clockwise from a top view (Figure 2A), is the same as in Sm cores visualized in structures of U1 snRNP (Pomeranz Krummel et al., 2009; Weber et al., 2010) (Figure S1A). Consistent with the binding experiments (Figure 1), Gemin2 contacts the Sm pentamer. Gemin2 has an extended conformation, wrapping around the Sm pentamer and interacting with both SmD1/D2 and SmF/E/G via two distinct structural domains. Gemin2’s C-terminal domain (residues 100–280) interacts with both SmD1/D2 and with SMNGe2BD on opposite distal surfaces, while its N-terminal domain (residues 1–69) interacts with SmF/E/G. SMNGe2BD comprises a single helix (residues 37–51) that does not contact the Sm pentamer (Figure 2).
Each of the Sm proteins in the pentamer has a canonical Sm fold, consisting of an N-terminal α-helix followed by a strongly bent five-stranded anti-parallel β-sheet with loops connecting each of these segments (Kambach et al., 1999) (Figure S1A). In addition to the 2.5 Å SmD1/D2 structure (Kambach et al., 1999), two recently published structures of assembled human Sm cores, a 5.5 Å structure of U1 snRNP reconstituted from recombinant proteins (Pomeranz Krummel et al., 2009) and a 4.4 Å structure of lightly protease-digested native U1 snRNP purified from HeLa cells (Weber et al., 2010), provided useful references against which the structure of the Sm proteins in our complex could be compared. The SmD1/D2 structure in our complex is very similar to the 2.5 Å dimer structure (rmsd of 0.39 Å for all the equivalent main chain atoms), and the Cα trace for SmF/E/G is very similar to that of the corresponding proteins in the U1 snRNP at 4.4 Å (rmsd 1.16 Å), demonstrating an overall similarity of Sm proteins’ folds and arrangement in all contexts (Figure S1B).
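The rmsd values quoted above summarize how far equivalent atoms lie from each other once two structures have been superimposed. As a minimal sketch of that final step only, the function below computes the rmsd of two coordinate lists that are assumed to be already superimposed and listed in matching atom order; the superposition itself, and the software used for it in this study, are not covered here. The coordinates are hypothetical placeholders.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-superimposed atom lists.

    Each argument is a sequence of (x, y, z) tuples in Angstroms, with
    equivalent atoms at the same index in both lists.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for xa, xb in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical coordinates: the second set is the first shifted by 1 Å along z.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(set_a, set_b))
```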
Our structure revealed several differences and details on Sm protein organizations that were not observed in previous structures. In our complex, more residues in SmD2 are visible and certain loop regions and C-terminal tail of SmD2 and SmD1 adopt slightly different conformations (Figure S1B), likely due to their contact with SmF/E/G as well as with Gemin2. Our structure also provides a view of the intermolecular interactions between neighboring Sm proteins in the pentamer, including the interactions within the SmF/E/G subcore complex as well as at the SmD2-SmF interface that connects the two Sm subcore complexes (see details in Figure S1).
Notably, the Gemin2-bound Sm pentamer has a narrower conformation compared to that in the assembled, snRNA-bound Sm core of U1 snRNP (Weber et al., 2010). While individual Sm proteins in the two structures show little deviations (rmsd of 0.67, 0.89, 0.71, 0.79, and 0.90 Å for SmD1, D2, F, E, and G, respectively), comparing the entire Sm pentamer from the two structures shows significant differences. As shown in Figure S2A, when the Cα of SmD1/D2 in the two structures are superimposed (0.78 Å rmsd), SmF/E/G comparison gives a 5.81 Å rmsd on average (3.13, 5.90, and 8.07 Å rmsd for SmF, E and G, respectively). Similarly, when SmF/E/G coordinates are aligned (1.16 Å rmsd), SmD1/D2 deviates by 4.70 Å (5.85 and 3.05 Å rmsd for SmD1 and D2, respectively) (Figure S2B). As a result, the width of the opening between SmD1 and SmG in Gemin2-bound Sm pentamer is smaller than in the assembled Sm core. For example the distance between the most conserved residues of the pentamer, SmD1 Asn37 to SmG Asn39 is 27.4 Å in our structure vs. 31.2 Å in the U1 snRNP structure (Figure S2C). There is also a smaller angle between SmD1 Asn37, SmF Asn41 and SmG Asn39 (64.9° vs. 76.2°) (Figure S2C). Consequently, the space between SmD1 and SmG is not sufficient for SmB/D3 to fit in and the pentamer’s RNA-binding pocket could not accommodate the snRNA’s Sm site (Figure S2D and see below in Figure 4D), at least not in the conformation it has based on the U1 snRNP model (Weber et al., 2010). These findings suggest that while the interfaces within each of the Sm subcore complexes, SmD1/D2 and SmF/E/G, are rigid, there is considerable angular flexibility at the SmD2-SmF interface that may be utilized during Sm core assembly.
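The opening width and angle comparisons above reduce to elementary vector geometry on three atomic positions. The sketch below shows one way to compute an inter-atom distance and the angle at a vertex atom; it assumes one representative atom per residue has been chosen, which the text does not specify, and the coordinates are hypothetical placeholders rather than values from 3S6N or the U1 snRNP model.

```python
import math


def distance(p, q):
    """Euclidean distance between two (x, y, z) points, in Angstroms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def angle_deg(a, vertex, c):
    """Angle a-vertex-c in degrees."""
    v1 = [x - y for x, y in zip(a, vertex)]
    v2 = [x - y for x, y in zip(c, vertex)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(x * x for x in v2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


# Hypothetical positions standing in for the three conserved Asn residues.
smd1_asn = (10.0, 0.0, 0.0)
smf_asn = (0.0, 0.0, 0.0)   # vertex of the angle
smg_asn = (0.0, 10.0, 0.0)

print(distance(smd1_asn, smg_asn))
print(angle_deg(smd1_asn, smf_asn, smg_asn))
```

With real coordinates, the first value would correspond to the SmD1 to SmG opening and the second to the angle measured at SmF.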
The contact between Gemin2 and the Sm pentamer is remarkably extensive, encompassing a large combined buried area (total 5130 Å²). Gemin2’s C- (residues 100–280) and N-terminal (residues 1–69) domains interact with SmD1/D2 and SmF/E, respectively. A relatively unstructured loop 1 (residues 70–99) connects these two domains and surrounds the perimeter of SmD2/F. Additionally, Gemin2’s N-terminal tail (residues 1–46) extends into the inner RNA-binding pocket of the Sm pentamer (Figure 2). These extensive contacts between Gemin2 and the Sm pentamer, supported by high quality electron density maps (Figure S3), reveal how Gemin2 could stabilize the pentamer, which by itself is relatively unstable (Raker et al., 1996), and are consistent with Gemin2’s strong binding to the pentamer compared to either SmD1/D2 or SmF/E/G alone (Figure 1). Multiple sequence alignment of Gemin2 orthologs from divergent eukaryotic organisms shows a high degree of amino acid sequence as well as secondary structure conservation (Figure 3). In accord with Gemin2’s structure, the highest conservation is observed for α1 and β1 in the N-terminal domain and the helices of the C-terminal domain.
Gemin2’s C-terminal domain consists of seven α-helices (Figure 2). As shown in a model in Figure 4A that is supported by the electron density map in Figure S3A, helices α5–8 (residues 187–271) form an anti-parallel four-helix bundle, and α2 (residues 100–122) is a long orthogonal helix packed against α6 and α8. Between the two well-structured segments, there are two short helices (α3-α4) and two invisible loops: one between α2 and α3 and the other between α3 and α4. The four-helix bundle and α2 account for the entire binding surface to SmD1/D2 as well as most of the binding surface to SMNGe2BD. The interactions between Gemin2 and SmD1/D2 mainly involve polar networks between loops 4 and 6 of Gemin2 and β3-β4 of SmD1, as well as between α7 of Gemin2 and β3-β4 of SmD2. The main chain of Gln188 (loop 4) and Pro225 (loop 6) in Gemin2 form hydrogen-bonds with Glu51 and Thr46 in SmD1, respectively. In addition, the side chain of Glu223 and carbonyl group of Lys224 from loop 6 of Gemin2 form salt bridges with Lys44 of SmD1. Ser232, Arg235 and Arg239, which are highly conserved among Gemin2 orthologs, form an extensive hydrophilic network with a water molecule and Asp93 from SmD2. Moreover, His231 of Gemin2 forms hydrogen-bonds with Arg94 of SmD2 (Figure 4A).
Connecting the N- and C-terminal domains of Gemin2 is loop 1 (residues 70–99), whose second half (residues 83–99) interacts with SmD2. Although its first half (residues 70–82) is invisible in our structure, its trajectory and length suggest that this segment loosely wraps around the perimeter of SmD2/F (Figure 2). The amino acid sequence of this loop varies among Gemin2 orthologs of divergent organisms; however, its length in all exceeds 22 residues (Figure 3), which is sufficient to cover the distance between the N- and C-terminal domains of Gemin2.
Gemin2’s N-terminal domain contains a single α-helix (α1), which mainly contacts SmF/E, followed by a short sequence (VVVA), which forms a β-strand that pairs with the β-sheet of SmF (Figures 4B, 4C, S3B and S3C). α1 of Gemin2 contains a highly conserved YLxxVxxE motif (Figure 3), of which the conserved residues Tyr52, Leu53 and Val56 form a hydrophobic patch and contact the hydrophobic surface formed by Ile18, Phe22, Leu25 and Phe50 of SmE (Figure 4B). In addition, the interactions between Gemin2’s α1 and SmF/E are substantiated through an extensive hydrogen-bonding network involving the side chains of Tyr52 and Glu59 in Gemin2, side chain of Asn6 in SmF and main chain amide group of Phe50 in SmE. This network is further extended to the main chain carbonyl group of Val56 in Gemin2’s α1 via a well-defined water molecule (Figure 4B). Consistently, Gemin2’s interacting residues in SmE (Ile18, Phe22, Leu25, and Phe50) and in SmF (Asn6) are also highly conserved (Weber et al., 2010). These findings indicate a functional significance of these interactions, which is further supported by mutagenesis studies on one of the critical residues, Tyr52 (see below in Figure 5). The short β-strand following α1 of Gemin2 pairs with the second half of β2 of SmF through anti-parallel β sheet interactions (Figure 4C). These interactions are further enhanced by additional hydrogen bonds: between the main chain of Val67 of Gemin2 and the side chain of Ser35 of SmF, between carbonyl oxygen of Asp65 in Gemin2 and the side chain of Lys8 in SmF, and between the side chain of Asp65 in Gemin2 and main chain of Gly38 in SmF mediated by a water molecule (Figure 4C).
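The YLxxVxxE motif described above, with Tyr52, Leu53, Val56 and Glu59 at offsets 0, 1, 4 and 7, can be written as a regular expression and located in an amino acid sequence. This is a minimal sketch; the function name is hypothetical and the toy sequence is invented for illustration, not a real Gemin2 ortholog.

```python
import re

# YLxxVxxE: fixed Tyr, Leu, Val and Glu, with "x" standing for any residue.
MOTIF = re.compile(r"YL..V..E")


def find_motif(protein_seq):
    """Return (1-based start residue, matched segment) for each motif hit."""
    return [(m.start() + 1, m.group()) for m in MOTIF.finditer(protein_seq.upper())]


# Hypothetical sequence, for illustration only.
toy_sequence = "MKTAYLQAVDSEGG"
print(find_motif(toy_sequence))
```

Run over an alignment of orthologs, a scan of this kind would report where the motif is retained, which is the type of conservation summarized in Figure 3.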
The RNA-interacting residues of Sm proteins are highly conserved among archaeal and eukaryotic Sm and Sm-like proteins. The crystal structures of two archaeal Sm protein homo-heptamers complexed with oligo(U) RNA reveal how each Sm protein contacts each uracil (Thore et al., 2003; Toro et al., 2001). Based on this insight as well as biochemical studies, Weber and colleagues provided a similar interaction model, in which each Sm protein provides one aromatic residue on the top and one positively charged residue at the bottom to sandwich each base while simultaneously using the most conserved residue among Sm proteins, Asn, to form hydrogen-bonds with the base from the side (Weber et al., 2010). Strikingly, however, in our structure the N-terminal tail of Gemin2 occludes the RNA binding surface of the Sm pentamer and extensively overlaps with the positions of the U1 snRNA’s Sm-site in the pocket. Gemin2’s N-terminal residues 22–31 extend into the center of the Sm pentamer’s RNA-binding pocket, contacting most of the base-binding sites. Specifically, the electron density map shows that Gemin2’s Leu24, Met25 and Leu28 are right at the base-binding pocket (Figure 4D and S3D). This result suggests that Gemin2 would interfere with the Sm pentamer’s binding to snRNA.
To test the effect of Gemin2 on the Sm pentamer’s snRNA binding as predicted from the structure, we measured the binding of the Sm pentamer to [32P]-α-UTP labeled U4 or U4ΔSm snRNA in the absence or presence of Gemin2 and its various mutants by electrophoretic mobility shift (Raker et al., 1999). The Sm pentamer bound efficiently to U4 but not to U4ΔSm, indicating that it forms an RNP dependent on the nucleotides present in an Sm site (Figure 5A). Importantly, Gemin2 inhibited the pentamer’s binding to U4 in a dose-dependent manner (Figure 5A and 5B). This inhibition is dependent on direct binding of Gemin2 to the pentamer, as the mutation of a highly conserved Gemin2 residue Tyr52 (Y52D) (Figures 3 and 4B) impairs Gemin2-pentamer interaction (Figure 5C) and also fails to inhibit the pentamer’s U4 binding (Figure 5B). In contrast, another Gemin2 mutation, R213D, which does not affect Gemin2’s binding to the pentamer (Figure 5C), maintains the ability to inhibit the pentamer’s RNA binding similarly to wild type Gemin2 (Figure 5B). The R213D mutation impairs SMN-Gemin2 interaction (see below in Figure 6), suggesting that SMN is not required for Gemin2 to inhibit the pentamer’s RNA binding. However, deletion of Gemin2’s N-terminal tail up to α1, Gemin2ΔN39, while maintaining the ability to bind the Sm pentamer (Figure 5C), had little or no effect on the pentamer’s binding to the RNA (Figure 5A). Thus, Gemin2’s N-terminal tail, which is inserted into the pentamer’s RNA-binding pocket, plays an important role in preventing the Sm pentamer from binding RNAs.
As shown by two different angles of the view of the structure and with electron density map (Figure 6A and 6B), the interaction of SMNGe2BD with Gemin2 is mediated by a single helix (SMN residues 37–51) adjacent and anti-parallel to the α2 in Gemin2’s C-terminal domain, on the opposite side of this domain’s SmD1/D2 binding surface (Figure 2). The SMN helix contacts α2, α6, and α8 of Gemin2 via hydrophobic (Figure S4) and polar interactions (Figure 6A and 6B). For example, the main chain carbonyl oxygen of Ala46 and the side chain of Ser49 in SMN form hydrogen-bonds with the side chains of Gln105 and Gln106 in Gemin2 (Figure 6A). In addition, the side chain of Arg213 (R213) in Gemin2 forms salt bridge interactions with that of SMN’s Asp44 (D44) (Figure 6A). This interaction is of particular interest, as D44V mutation in SMN is an SMA-causing patient mutation (Sun et al., 2005). To test the importance of this interaction, we determined the effect of mutations in either R213 or D44 on full-length SMN and Gemin2 binding. As shown in Figure 6C, D44V as well as D44A abrogated Gemin2 binding, consistent with previous observations that also demonstrated that SMN D44V decreases the protein’s snRNP assembly activity (Ogawa et al., 2007). In contrast, mutation of a nearby residue that according to the structure is not involved in SMN-Gemin2 interaction, K41A, had no effect. Reciprocally, and as expected from the structure, mutating R213 in Gemin2, R213D, abrogated its binding to SMN (Figure 6D). On the other hand, Gemin2’s Sm pentamer binding-defective mutant, Y52D (Figure 5C), showed no difference in binding to SMN (Figure 6D). Together, these findings highlight the importance of SMN’s D44 and Gemin2’s R213 in bridging SMN-Gemin2 interaction, providing a structural basis of the D44V patient mutation at an atomic resolution.
The structure and biochemical experiments provide mechanistic insights into the process by which the SMN complex assembles Sm cores and a structural basis for understanding the effect of an SMA-causing SMN mutation. Sm core assembly is a remarkable architectural feat requiring the seven Sm proteins to be brought together and form a ring around the pre-snRNAs’ Sm site, a short nucleotide sequence present also in numerous other RNAs (Wahl et al., 2009; Will and Luhrmann, 2001). To accomplish this, the SMN complex must gather the Sm proteins and inhibit their propensity for illicit Sm core assembly on unintended RNAs until a pre-snRNA joins (Neuenkirchen et al., 2008; Yong et al., 2004b). Our findings demonstrate that Gemin2 serves as the arm of the SMN complex that gathers five out of the seven Sm proteins, holding them as a pentamer poised for Sm core assembly and at the same time preventing them from binding RNAs. The structure explains how this is accomplished, revealing Gemin2 to be a key factor in snRNP biogenesis. Gemin2, through its extended conformation and remarkably extensive interactions with all five Sm proteins, grips the pentamer from its bottom and top sides, and from its outer perimeter and inner pocket. Though its specific function was not previously known, Gemin2 has been shown to have a role in Sm core assembly (Feng et al., 2005; Ogawa et al., 2007; Shpargel and Matera, 2005). Consistent with this, the ubiquitously expressed Gemin2 is essential for viability of all eukaryotic organisms (Jablonka et al., 2002; Owen et al., 2000; Paushkin et al., 2000). Notably, Gemin2 gene deletion in the mouse causes embryonic lethality, at an even earlier stage than SMN gene deletion (Jablonka et al., 2002; Schrank et al., 1997). Furthermore, Gemin2’s sequence and domain structure are more phylogenetically conserved than that of all other SMN complex components, including SMN (Cauchi, 2010).
Our findings indicate that the N-terminal tail of Gemin2, particularly residues 22–31, plays a role in inhibiting the pentamer from binding RNA as it occupies the pentamer’s RNA-binding pocket. Furthermore, several residues in this part of Gemin2, including Met25 and Leu28 interact with the residues in the Sm proteins that are involved in binding Sm site nucleotides and are positioned in a way that would hinder RNA binding (Figure 4D). Interestingly, these residues are conserved in Gemin2 orthologs from divergent organisms or are substituted by residues that are compatible with having the same activity, suggesting that this is a conserved function of Gemin2. The pentamer’s narrower conformation in the Gemin2-bound state compared to that in the assembled Sm core would be expected to also restrict access of RNAs to the binding pocket. However, as the structure of an Sm pentamer alone is unknown, it is not possible to determine if Gemin2 binding plays a role in inducing or stabilizing the narrower conformation. Recent studies have shown that pICln, a protein that can bind Sm proteins and inhibit their interaction with snRNA (Friesen et al., 2001b; Pesiridis et al., 2009; Pu et al., 1999), can bind at the SmD1-G opening, forming a closed hetero-hexameric ring that cannot bind snRNA (Chari et al., 2008). A complex suggested to represent a downstream intermediate, comprised of Drosophila C-terminal deleted SMN, Gemin2 and the Sm pentamer, which by electron microscopy shows a similar overall morphology to that of our structure, has also been described (Chari et al., 2008). Our data demonstrate that Gemin2 can bind the Sm pentamer and prevent it from binding snRNAs independent of pICln. Thus, there are at least two mechanisms of pentamer inhibition that are not incompatible and could occur sequentially, first by pICln, and subsequently by Gemin2. 
However, as pICln is not obligatory for Gemin2-pentamer association, it is also possible that the pentamer binds directly to Gemin2, which links the pentamer to SMN.
For the subsequent steps of Sm core assembly to occur, after pre-snRNA is brought in by Gemin5, Gemin2’s N-terminus, possibly up to α1, would need to be displaced from the Sm pentamer’s RNA-binding pocket to allow the pre-snRNA to bind. The observation that Gemin2ΔN39 can bind the pentamer, suggests that such a displacement would not have the undesirable effect of dissociating the pentamer from the SMN complex. The SmD1-G opening and the Sm site-binding pocket would also need to be widened, utilizing the SmD2-F interface as a hinge. How these structural transitions are effected remains to be determined. Completion of Sm core assembly requires several additional steps and ATP hydrolysis, involving additional proteins about which little structural information is available. In complex eukaryotes, access of RNA to the inhibited intermediate, comprised minimally of SMN/Gemin2-Sm pentamer, is likely to be limited to only bona fide RNA substrates, pre-snRNAs, delivered by Gemin5 (Yong et al., 2010). While it is clear that SMN is oligomeric in cells (Wan et al., 2008), the number of SMN subunits in a complex is unknown, bringing the possibility that it serves as a scaffold for more than one Gemin2-Sm pentamer forming on the same complex simultaneously. SMN determines the capacity of Sm core assembly (Wan et al., 2005) and its oligomerization is particularly important for this function as it serves to recruit essential components for this process. 
SmB/D3 association with the SMN complex is mediated at least in part by their direct interaction with SMN, which depends on SMN’s oligomerization via its C-terminal YG-rich domains (residues 268–279) (Pellizzoni et al., 1999) and in which the Tudor domain (residues 91–142) plays a role by binding to RG tails of SmB/D3 (Brahms et al., 2001; Sprangers et al., 2003), an interaction that is strongly enhanced by arginine methylation that is carried out by the methylosome/PRMT5 (Brahms et al., 2001; Friesen et al., 2001a; Friesen et al., 2001b; Meister et al., 2001b). There is evidence that an additional subunit that includes Gemins 6/7/8 and Unrip can also associate with SMN/Gemin2 and Sm proteins (Carissimi et al., 2006; Yong et al., 2010). Interestingly, Gemins 6 and 7 form a heterodimer and both have Sm folds and it has therefore been suggested that they might bind the pentamer in the same position where SmB/D3 bind, potentially forming a closed heptameric ring intermediate (Ma et al., 2005). This could further help maintain the pentamer’s association with SMN/Gemin2, together with Gemin8 and Unrip. The function of Gemins 3 and 4, which exist as a dimer and associate with Gemin5, is not known, but the presence of a DEAD box domain in Gemin3 suggests that it may function as an RNA helicase and may be the source of the ATPase activity on which the assembly reaction depends. With the available structure of the key intermediate we describe here, several aspects of the mechanism and regulation of the SMN-Gemins complex as a molecular assembly machine for snRNP biogenesis can now be readily addressed.
The structure further explains why D44V of SMN is an SMA-causing mutation. In the vast majority of SMA patients, the disease results from reduced levels of the SMN protein rather than from nonsense mutations (Wirth et al., 2006). We suggest that D44V is a loss of function mutation because it decreases the ability of SMN bearing this mutation to bind Gemin2 and thus impairs the SMN complex’s capacity to recruit the Sm pentamer for snRNP assembly. These findings thus further link SMN’s function in snRNP biogenesis to SMA. Further atomic level structural information could suggest approaches to enhance SMN-Gemin2 interaction as a potential therapy for SMA.
All of the plasmids used in the studies contain human cDNAs. Full-length SmD1 and SmD2 were constructed in a single pCDFDuet vector (Novagen) with an N-terminal His(6)-tag followed by a Tobacco Etch Virus (TEV) cleavage site (His6-Tev) fused to SmD2. Full-length SmF and SmE were also constructed in a single pCDFDuet vector, with N-terminal His6-Tev fused to SmE. Full-length SmG was made in pET28 vector (Novagen) with His6-Tev at the N-terminus. Full-length Gemin2 was made in pCDF vector (Novagen) with His6-Tev at its N-terminus. SMNGe2BD, containing SMN residues 26–62, was fused with an N-terminal GST tag in pET42 vector (Novagen). Mutants of Gemin2 or SMN were created from the plasmids containing wild type Gemin2 or SMN cDNAs using the QuikChange site-directed mutagenesis kit (Stratagene). SmD1/D2 was purified by Ni-column first, followed by TEV protease cleavage, a second pass over the Ni-column, cation exchange, and gel filtration chromatography. SmF/E and SmG were co-expressed and purified by a similar procedure except that anion exchange was used instead. Gemin2 and SMNGe2BD were co-expressed and purified by glutathione affinity chromatography, followed by TEV protease cleavage, Ni-column, and anion exchange chromatography. To make the heptamer of the Gemin2-SMNGe2BD-Sm pentamer complex, equimolar amounts of the SmD1/D2, SmF/E/G, and Gemin2/SMNGe2BD complexes were mixed in gel filtration buffer (20 mM Tris-HCl, pH 8.0, 150 mM NaCl, 1 mM EDTA, and 1 mM TCEP [tris(2-carboxyethyl)phosphine]) supplemented with 0.5 M NaCl, and subjected to HiLoad superdex200 gel filtration chromatography. The fractions containing all seven components were checked by SDS-PAGE, pooled and concentrated to 7–11 mg/ml, and used for crystallization studies.
Selenomethionine (Se-Met) incorporation was performed on the subcore complexes SmF/E/G and Gemin2/SMNGe2BD by adapting the methionine pathway inhibition method (Van Duyne et al., 1993). Cell culture and protein purifications were performed as described above, except that cells were cultured in M9 minimal medium containing amino-acid supplement (Lys, Phe, Thr to final concentration of 100 mg/L, Ile, Leu, and Val to 50 mg/L, and Se-Met to 80 mg/L) for 15 min before protein induction with 1 mM IPTG. The Se-Met-labeled SmF/E/G and Gemin2/SMNGe2BD were mixed with native SmD1/D2 in equal molar stoichiometry for gel filtration as described above. The Se-Met-labeled heptamer was concentrated to about 8 mg/ml for crystallization studies.
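The amino-acid supplement above is specified as final concentrations, so the mass of each component scales linearly with culture volume. The following is a minimal sketch of that arithmetic using only the concentrations stated in the text; the function name and the 2 L example volume are assumptions for illustration.

```python
# Final concentrations stated in the text, in mg per litre of M9 culture.
SUPPLEMENT_MG_PER_L = {
    "Lys": 100.0,
    "Phe": 100.0,
    "Thr": 100.0,
    "Ile": 50.0,
    "Leu": 50.0,
    "Val": 50.0,
    "Se-Met": 80.0,
}


def supplement_mass_mg(culture_volume_l):
    """Mass (mg) of each supplement needed for the given culture volume (L)."""
    if culture_volume_l <= 0:
        raise ValueError("culture volume must be positive")
    return {name: conc * culture_volume_l for name, conc in SUPPLEMENT_MG_PER_L.items()}


# Example volume chosen for illustration; the text does not state a culture size.
for name, mass in supplement_mass_mg(2.0).items():
    print(f"{name}: {mass:.0f} mg")
```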
For expression and purification of GST-Gemin2 and its mutants, GST fusion proteins were produced in BL21(DE3) E. coli cells carrying the expression plasmids in the pGEX-4T-1 vector, according to the manufacturer’s suggestions (GE Healthcare) with a few modifications (Extended Experimental Procedures).
Various Myc-tagged SMN (wild type, K41A, D44A and D44V mutants) and Gemin2 (wild type, Y52D and R213D mutants) constructs were in vitro transcribed and translated with [35S]-Met labeling using the TNT Quick Coupled Transcription/Translation System according to the manufacturer’s instructions (Promega).
Protein binding assays were performed according to the manufacturer’s instructions (GE Healthcare) with a few modifications (Extended Experimental Procedures). Briefly, 1 μg of GST fusion protein was immobilized on 25 μl of glutathione-Sepharose beads and incubated with 10 μl of the in vitro transcribed and translated [35S]-Met-labeled proteins in binding buffer (50 mM Tris, 200 mM NaCl, 0.2 mM EDTA, 0.05% NP-40, 2 mM DTT, and protease inhibitors). Bound proteins were resolved by SDS-PAGE and detected by autoradiography. Alternatively, immobilized GST fusion proteins were incubated with 3 μg of purified recombinant Sm proteins in buffer containing 20 mM Tris-HCl, pH 8.0, 250 mM NaCl, 2.5 mM MgCl2, and 0.02% Triton X-100. Bound proteins were resolved by SDS-PAGE and visualized by SimplyBlue staining (Invitrogen).
Crystals of the human Gemin2-SMNGe2BD-Sm pentamer complex were grown in 1% PEG8000, 100 mM Tris-HCl, pH 7.6–8.2, by the hanging-drop vapor diffusion method at 20°C and appeared within a couple of days. The crystals belong to space group P2₁2₁2₁, with a = 82.8 Å, b = 84.6 Å, and c = 104.7 Å. Each asymmetric unit contains one Gemin2-SMNGe2BD-Sm pentamer complex. The crystals were cryo-protected by gradual transfer into reservoir solution containing increasing concentrations of PEG400 (10% to 40%) and frozen in liquid nitrogen. Se-Met-labeled complex crystals were grown under similar conditions. The X-ray diffraction datasets of native and Se-Met derivative complex crystals were collected at the Advanced Light Source (ALS, Berkeley, CA) beamlines 8.2.1 and 8.2.2. Data were processed with HKL2000 (Otwinowski and Minor, 1997). Initially, the dataset obtained from the native crystals could be processed only to a maximal resolution of about 3.2 Å. The SmD1/D2 subcore complex within the heptamer complex was readily located by molecular replacement using the 2.5 Å crystal structure (PDB ID code 1B34) as the search model. SmF, SmE, and SmG were located by iterative homology model building and molecular replacement searches in combination with SmD1/D2. Multi-wavelength anomalous dispersion (MAD) datasets were collected at different wavelengths on several Se-Met derivative crystals, because individual crystals suffered severe X-ray-induced decay. Nevertheless, the best Se-Met derivative dataset, collected at 0.9796 Å, was of high quality and could be used for single-wavelength anomalous dispersion (SAD) phasing in combination with molecular replacement using the SmD1/D2/F/E/G model, with phase improvement by PHASER (McCoy et al., 2007) from the CCP4 suite (Potterton et al., 2003). In this way, most of the helices of Gemin2 and SMNGe2BD were located. Since the diffraction of the crystals was severely anisotropic, the native dataset was reprocessed and truncated ellipsoidally, followed by anisotropic scaling (Strong et al., 2006).
This extended the resolution to 2.4 Å and 2.6 Å in the directions of a* and c*, respectively, while the resolution in the direction of b* remained at 3.2 Å. After this processing, the electron density for many side chains was improved. The models were gradually improved by cycles of manual rebuilding in Coot (Emsley and Cowtan, 2004) using combinations of methods, including density modification, CNS simulated annealing (Brunger et al., 1998), REFMAC refinement (Winn et al., 2001), and PHASER molecular replacement plus SAD (McCoy et al., 2007). Eleven Se-Met sites were found in the SAD Se-Met dataset. Eight of them were from SmF/E/G, and all matched the positions of the Met sulfur atoms in the model. Three of them were from Gemin2, which provided guidance for the assignment of the Gemin2 sequence. In the final stage of model refinement, TLS followed by restrained refinement, as implemented in REFMAC, was used. The final model (PDB ID code 3S6N) contains SmD1 (residues 1–81), SmD2 (residues 23–76 and 90–116), SmF (residues 3–76), SmE (residues 14–90), SmG (residues 14–51 and 56–72), Gemin2 (residues 22–31, 47–69, 83–123, 134–149, and 179–276), and SMN (residues 37–51). The higher-than-average R-free of the structure model (Table 1) is due to severely anisotropic diffraction (Strong et al., 2006) and to a few regions of relatively low-quality electron density that cannot be confidently assigned in the model. Ramachandran analysis with MolProbity (Davis et al., 2007) shows 92.3% of the dihedral angles in favored regions, 6.4% in additional allowed regions, and 1.3% (7 out of 545) in disallowed regions (Table 1). All seven outliers are located in loop regions with relatively poor electron density. Only the regions supported by high-quality electron density maps are presented and discussed in detail (Figures S3 and 6B).
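The geometric idea behind ellipsoidal truncation can be sketched for this orthorhombic cell. The cell edges and the three directional resolution limits are taken from the text; the code itself is an illustration under the assumption that the cutoff surface is an ellipsoid whose principal axes lie along a*, b*, and c*. It is not the processing pipeline the authors used (Strong et al., 2006), and the function names are hypothetical.

```python
import math

CELL = (82.8, 84.6, 104.7)   # a, b, c in angstroms (orthorhombic, from the text)
LIMITS = (2.4, 3.2, 2.6)     # resolution limits along a*, b*, c* in angstroms


def d_spacing(h, k, l, cell=CELL):
    """Resolution (angstroms) of reflection hkl in an orthorhombic cell."""
    a, b, c = cell
    s2 = (h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2
    if s2 == 0:
        raise ValueError("d-spacing is undefined for the 000 reflection")
    return 1.0 / math.sqrt(s2)


def inside_ellipsoid(h, k, l, cell=CELL, limits=LIMITS):
    """True if hkl lies within the ellipsoidal resolution cutoff.

    The reciprocal-space components h/a, k/b, l/c are each scaled by the
    resolution limit along that axis; the reflection is kept when the
    scaled vector has length at most 1.
    """
    a, b, c = cell
    da, db, dc = limits
    return (h * da / a) ** 2 + (k * db / b) ** 2 + (l * dc / c) ** 2 <= 1.0


if __name__ == "__main__":
    for hkl in [(34, 0, 0), (0, 34, 0), (0, 26, 0), (0, 0, 40)]:
        kept = inside_ellipsoid(*hkl)
        print(hkl, f"d = {d_spacing(*hkl):.2f} A", "kept" if kept else "rejected")
```

In this sketch two reflections at nearly the same resolution, (34, 0, 0) and (0, 34, 0), are treated differently: the first lies along the well-diffracting a* direction and is kept, while the second lies along the weaker b* direction and is rejected.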
[32P]-α-UTP-labeled U4 or U4ΔSm snRNA was produced by in vitro transcription as previously described (Pellizzoni et al., 2002b). Binding of Sm pentamers to the radiolabeled snRNAs was performed in buffer containing 20 mM HEPES, pH 8.0, 70 mM KCl, 2.5 mM MgCl2, 0.5 mM EDTA, and 0.01% Triton X-100. Various amounts of Sm proteins (10–50 ng of each) were used for binding to snRNAs. Where indicated, 1 μg of recombinant Gemin2 or its deletion mutant (Gemin2ΔN39) was pre-incubated with the Sm pentamer for 10 min before RNA was added for binding. After 30 min of binding at 25°C, urea was added to the reaction mixture to a final concentration of 2 M, and the RNPs were analyzed by 6% native gel electrophoresis. For the RNA binding experiments using Gemin2 or its mutants (Y52D and R213D), 50 ng of each Sm protein was pre-incubated with various amounts of Gemin2 or the mutant proteins (1–9 μg).
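The urea addition to a 2 M final concentration is a simple dilution calculation, sketched below. Only the 2 M final concentration comes from the text; the 20 μl reaction volume, the 8 M stock, and the function name `stock_volume_to_add` are hypothetical values chosen for illustration.

```python
def stock_volume_to_add(reaction_volume, final_conc, stock_conc):
    """Volume of stock to add so the mixture reaches final_conc.

    Solves stock_conc * v = final_conc * (reaction_volume + v) for v,
    accounting for the volume increase caused by the addition itself.
    The result is in the same unit as reaction_volume.
    """
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the target")
    if reaction_volume <= 0 or final_conc <= 0:
        raise ValueError("volume and concentration must be positive")
    return final_conc * reaction_volume / (stock_conc - final_conc)


if __name__ == "__main__":
    # Hypothetical: 20 ul binding reaction, 8 M urea stock; 2 M final is from the text.
    v = stock_volume_to_add(20.0, 2.0, 8.0)
    print(f"add {v:.2f} ul of 8 M urea to a 20 ul reaction")
```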
We thank Corie Ralston and Christy Bertoldo for their assistance with X-ray data collection at the Advanced Light Source (ALS, Berkeley, CA) synchrotron beamlines 8.2.1 and 8.2.2. We thank Emily Anna Bridges for help with bacterial cell culture. We are grateful to the members of our laboratory for helpful discussions and comments on this manuscript. This work was supported by the Association Française Contre les Myopathies (AFM). G.D. is an Investigator of the Howard Hughes Medical Institute.