snRNP biogenesis is critical for gene expression in eukaryotes. The SMN complex plays an essential role in this process and a considerable amount of information is available about its components, interactions and activity in snRNP assembly
in vitro (
Chari et al., 2008;
Fischer et al., 1997;
Meister et al., 2001a;
Narayanan et al., 2004;
Pellizzoni et al., 2002b). In contrast, little is known about the activity and regulation of the SMN complex in cells. Here, we used two classes of SMN complex inhibitors discovered by high throughput screening, protein synthesis inhibitors and ROS, to dissect this pathway in cells. By formaldehyde-mediated protein-protein and protein-RNA crosslinking, mass spectrometry and high throughput sequencing, we defined the points of inhibition for each of these inhibitors, providing important insights into snRNP biogenesis. One of the surprising observations we describe is that protein synthesis inhibition, by a variety of mechanisms, results in accumulation of separate subunits of the SMN complex. One of these subunits, which includes SMN, then translocates from the cytoplasm to the nucleus. This is an unexpected biological effect of these widely used reagents, whose effects are frequently interpreted as resulting from inhibition of protein synthesis alone. The mechanism by which protein synthesis inhibition causes these effects is not presently known. It could be the result of a decrease in the level of one or more rapidly turning over proteins, including the supply of newly synthesized Sm proteins. The components of the SMN complex themselves are stable under these conditions. It could also, however, be the result of a stress signal generated as a result of attenuation of the translation machinery. Nevertheless, protein synthesis inhibitors provided invaluable information on the subunits and intermediates of the SMN complex.
A summary of our major findings and view of the pathway is presented as a model in . Nascent pre-snRNA transcripts produced by RNA polymerase II become associated with the cap binding complex (CBC) proteins, CBC20 and CBC80, that recruit an adaptor protein, Phax, which then recruits the nuclear export receptor exportin 1 (CRM1) (
Ohno et al., 2000). The fate of the nascent pre-snRNAs immediately after their export to the cytoplasm was not previously known, and our data demonstrate that they associate with Gemin5. Because Gemin5 is almost entirely cytoplasmic, this association likely occurs after the export of the pre-snRNAs from the nucleus, but the possibility that Gemin5 can shuttle in and out of the nucleus and associate with pre-snRNAs in the nucleus prior to export cannot be ruled out. This suggests that Gemin5 directs pre-snRNAs to the SMN complex or subunit. We further identified a subunit containing SMN together with Gemins 2/6/7/8 and unrip. The association of this subunit with the Gemin5-pre-snRNA subunit is dependent on protein synthesis. Smaller, as well as additional, subunits may exist. For example, SMN/Gemin2, Gemin6/7/8/unrip and Gemin3/4/5 have been described (
Battle et al., 2007;
Carissimi et al., 2005;
Carissimi et al., 2006). Because we have not detected a Gemin3/4/5 complex, yet Gemin3 and Gemin4 are in the SMN complex (), we depict Gemin3/4 as a subunit that joins the SMN complex separately. Proteomic analyses showed that the SMN subunit also contains the Sm proteins; however, it has not been possible to quantitate them due to the small size, paucity of trypsin cleavage sites and low concentration of these proteins in this subunit. Previous studies have shown that the Sm proteins are found in complexes with pICln (6S complex) and methylosome/PRMT5 (
Friesen et al., 2001;
Friesen et al., 2002;
Meister et al., 2001b;
Pu et al., 1999); the methylosome introduces symmetric dimethyl-arginine (sDMA) modifications on some of the Sm proteins and thereby enhances their association with SMN.
In addition to protein synthesis inhibitors, formaldehyde crosslinking of complexes in living cells, which we optimized to capture the protein-protein and protein-RNA interactions that occur
in vivo, was key to discovering intermediates. This ribo-proteomic strategy provides a powerful tool and therefore should be widely applicable for studies of all aspects of RNA metabolism. Formaldehyde crosslinking of protein-RNA complexes in cells is very efficient compared to UV crosslinking (
Choi and Dreyfuss, 1984a,
b;
Dreyfuss et al., 1984;
Niranjanakumari et al., 2002). Unlike UV crosslinks, formaldehyde crosslinks in protein-RNA complexes can be readily reversed, and we show that the recovered RNAs are amenable to further analysis. In addition, the protein composition of the same complexes can be determined using mass spectrometry, whereas UV does not generate protein-protein crosslinks. These advantages of formaldehyde crosslinking, together with the ability of protein synthesis inhibitors to effect a large accumulation of pre-snRNAs on Gemin5, allowed us to trap the transient Gemin5-pre-snRNA intermediate and determine the RNA sequences.
The wealth of sequence information led to the discovery of precursors for all the major and minor snRNAs. Until now, only limited information on precursor sequences had been available for snRNAs (
Hernandez, 1985;
Hernandez and Weiner, 1986;
Kleinschmidt and Pederson, 1987;
Neuman de Vegvar and Dahlberg, 1989); however, our studies revealed multiple precursor species for each snRNA. Since there are multiple gene copies for most of the major snRNAs with only slight sequence variation among them, distinguishing transcriptionally active snRNA genes from pseudogenes has been difficult. Our data provide information on the loci that are transcribed and identify those with previously uncharacterized transcriptional activity. Importantly, the majority of the sequence reads in most of the snRNAs did not map to the precursor sequences but rather to the regions that corresponded to the snRNP code existing in the mature snRNAs (). These
in vivo mapping data from high throughput sequencing are consistent with previous mapping results performed
in vitro (
Yong et al., 2004a). RNase T1 digestion readily removed the Sm site, suggesting that it is exposed in its Gemin5-bound state.
One of the most striking findings relates to the precursor sequence of U4atac. In contrast to the other snRNAs, most of the Gemin5-bound fragments included the 3′-end of pre-U4atac, a sequence predicted to form a stem-loop that is later removed as the mature U4atac forms (). Notably, this pre-U4atac sequence, in contrast to precursor sequences of the other snRNAs, is evolutionarily conserved. It was difficult to understand how an Sm core could assemble on mature U4atac because, unlike other snRNAs, it does not have a canonical snRNP code: it lacks a stem-loop 3′-terminal to the Sm site. These data strongly suggest that the snRNP code for U4atac is contained in its precursor sequence. Thus, Gemin5 binds specifically to pre-snRNAs and these, but not mature snRNAs, are the substrates for the SMN complex. Importantly, the extra 3′ sequences found in the pre-snRNAs are not merely neutral appendages that are later trimmed, but rather have one or more important functions in Sm core assembly. Our measurements for pre-U1 and pre-U4atac snRNAs demonstrate that these 3′ sequences significantly enhance a critical step of snRNP biogenesis. We suggest that these precursor sequences are removed by 3′-end processing shortly after the assembly step that they facilitate, because they are no longer needed and may interfere with the function of the mature snRNA.
ROS inhibit the activity of the SMN complex but do not dissociate it. In fact, the ROS-inhibited complex contains all the known protein components of the SMN complex, as well as pre-snRNAs and Sm proteins (data not shown). The mechanism by which ROS inactivate the SMN complex and inhibit snRNP assembly is not known. However, ROS cause SMN-SMN intermolecular disulfide bridging and it is possible that this is, at least in part, the basis of inhibition (
Wan et al., 2008). The SMN-SMN oxidative disulfide crosslinking also provides evidence that SMN in cells is oligomeric. We depict SMN as a dimer () for simplicity and because the number of subunits of SMN in oligomers is not known. The hyper-stoichiometric amount of Gemin5 and the presence of pre-snRNA in the SMN complex support the idea that Gemin5 delivers pre-snRNAs to this complex and that this function is not inhibited by ROS. The RNAi experiments suggest that competition among the various pre-snRNA-Gemin5 complexes for a limiting amount of SMN subunit could contribute to the altered stoichiometry of snRNPs and the splicing abnormalities that likely result from it, as seen in the SMN-deficient SMA mouse model (
Zhang et al., 2008). The presence of pre-snRNAs and low levels of mature snRNAs in the ROS-stalled SMN complex suggests that 3′-end processing occurs on the SMN complex (). Previous studies have shown that conversion of the 5′ monomethyl-guanosine (m7G) cap to a trimethyl-guanosine (TMG) cap by Tgs1 also occurs on the SMN complex (
Mouaikel et al., 2003). We suggest that the Gemin5-pre-snRNA and the SMN subunits associate when the SMN complex is loaded with Sm proteins. The structure of Gemin6 and Gemin7, both of which contain Sm-like folds and interact with each other in a manner similar to Sm-Sm interactions (
Ma et al., 2005), together with the role of Gemin6/7/8/unrip in assembly (
Carissimi et al., 2006) suggests a role for these proteins in forming the Sm core.
These studies using ribo-proteomic analysis of in vivo captured RNPs provide a new view of the biology of the SMN complex. They reveal previously unknown subunits of the SMN complex, identify activities required for their interactions, suggest a specific order of steps in snRNP biogenesis, and led to the discovery of precursors for all the snRNAs. The fate of nascent pre-snRNAs immediately after their export to the cytoplasm was not known. Our data show that Gemin5 is the link between newly transcribed and exported pre-snRNAs and their subsequent assembly on the SMN complex. We further show that the 3′ sequences of the snRNA precursors function to enhance snRNP biogenesis. The ROS-stalled SMN complex represents the active transient intermediate poised for Sm core assembly and pre-snRNA processing. We suggest that the pre-snRNAs are the substrates for the SMN complex. Almost without exception, previous studies of the SMN complex and Sm core assembly used snRNAs because the pre-snRNAs were not considered to be the substrates and their sequences were not known. Previous experiments can now be revisited and future studies designed using pre-snRNAs as substrates, from which additional aspects of the complexity and regulation of snRNP biogenesis will likely emerge.