Previous experiments indicated that the process of snRNP assembly, particularly the formation of the highly stable Sm core, is not a self-assembly process as had been widely believed, but, rather, is an active process mediated by the SMN complex (3). Perhaps the key reason for cells to employ the SMN complex for snRNP assembly is to prevent the potentially promiscuous Sm proteins from forming Sm cores indiscriminately and to ensure that Sm cores only assemble on the correct RNAs (56). Subsequent experiments delineated the general domains in the major snRNAs (U1, U2, U4, and U5) that contain the specific binding sites for the SMN complex (67). However, other than the short Sm site, there is no extensive sequence similarity among the SMN complex-binding domains of these snRNAs; therefore, how the SMN complex might distinguish them from other cellular RNAs remained unclear. The HSURs, the snRNAs encoded by HVS (46), provided an attractive system to address this question because they use the SMN complex to assemble Sm cores, bind the SMN complex with high affinity, have considerable sequence conservation among them, and bear a striking resemblance to the overall structure of several of the major snRNAs (16). By systematic mutagenesis of the HSURs and using several experimental approaches, we determined the critical RNA sequence features that confer binding to the SMN complex and assembly of an Sm core. These experiments revealed a surprisingly simple structural configuration, consisting of an Sm site (AUUUUUG) and a 3′-terminal stem-loop, that is critical for Sm core assembly in vitro and in vivo. These structural features, illustrated in Fig. , are remarkably independent of RNA sequence. This motif constitutes an snRNP code that is recognized by direct binding to the SMN complex and triggers formation of the Sm core on the Sm site.
Model for the critical RNA sequence features that confer binding to the SMN complex and assembly of an Sm core. The details of these features are described in the text.
Several lines of evidence indicate that it is the SMN complex itself, and not the Sm proteins on their own, that is responsible for deciphering this snRNP code. First, the purified SMN complex (high-salt washed) used in the RNA-binding experiments contained no detectable Sm proteins. This was assessed both by silver staining and immunoblotting (Fig. and ) and by RNP gels that showed no detectable Sm core formation from these complexes (data not shown). Second, the profile of interactions of the SMN complex with the RNAs, assayed by phosphorothioate interference mapping, is significantly different from that of Sm proteins alone. The SMN complex contacts the backbone phosphates of the first and third uridines of the Sm site (Fig. ), while TPs alone contact only the backbone of the 3′ stem-loop but not Sm site uridines (Fig. ). These interference data are consistent with the previous report that TPs interact with the backbone of the 3′ stem-loop of U1 snRNA but not at nucleotide positions within the Sm site (38). Therefore, it is unlikely that the data we present here represent Sm protein, rather than SMN complex-dependent, interactions with snRNAs. UV cross-linking experiments have shown that Sm proteins, particularly SmG and SmB/B′, also contact the first and third uridines in vitro (65). Although we cannot entirely rule out the possibility that the high-salt-purified SMN complex contains trace amounts of Sm proteins and that, therefore, they might play a role in the recognition of the snRNAs, it is clear that if the Sm proteins do play a role in the binding to snRNAs, their properties must be changed and controlled by the SMN complex. Furthermore, the assembly activity of purified TPs is much more promiscuous than that of purified SMN complex (Fig. and ), despite comparable concentrations of Sm proteins in each preparation (Fig. ). In vivo, Sm proteins are not free but are associated with a number of protein complexes, including the 6S pICln-containing complex and the 20S methylosome (14). Because only Sm proteins that are carried by the SMN complex are competent for Sm core assembly (56), it makes sense that the SMN complex might recognize the very same site upon which Sm proteins assemble. This stringency would explain how the SMN complex ensures that Sm cores assemble only on the correct RNA targets (56). At this time, it is not known which protein component(s) of the SMN complex are responsible for binding RNA.
The strong resemblance of the minimal SMN complex-binding domains of the HSURs to the domains in the major Sm site-containing U snRNAs, U2, U4, and U5 (previously shown to contain the SMN-binding sites [67]), argues that the features we define here are of general significance for snRNAs. The clear exception is U1 snRNA, which contains a distinctly different motif for binding to the SMN complex. In U1, the high-affinity binding site for the SMN complex is in stem-loop 1 and does not require the Sm site (67). Furthermore, the Sm site sequence of U1, AAUUU(C/G)UGG, is different from and not interchangeable with that of U5 snRNA (23). Consequently, U1 snRNA binds to the SMN complex through a sequence at its 5′, rather than 3′, end and at an independent binding site on the SMN complex, while the other U snRNAs and the HSURs share a second binding site (16). Indeed, competition experiments suggest that there are at least two snRNA binding sites on the SMN complex, one for U1 snRNA and another for all the other snRNAs and the HSURs.
Unlike Sm proteins, which promiscuously assemble an Sm core on single-stranded uridine-rich sequences in vitro (59), SMN complex recognition of an snRNA requires more. Consistent with previous assembly experiments that suggested that the Sm site might cooperate specifically with other elements of snRNAs for snRNP assembly (23), the data presented here suggest that the SMN complex recognizes the Sm site of an snRNA when it is presented within the context of a 3′ stem-loop. Upon destabilization or removal of the stem-loop, SMN complex binding and SMN-mediated Sm core assembly are dramatically reduced (Fig. and ). However, in contrast to the critical uridines of the Sm site, the sequence of the stem-loop seems to be relatively unimportant (Fig. ). This makes sense considering that the stem-loop sequences of the U snRNAs and HSURs do not appear to have extensive sequence conservation either within or between the two groups. Furthermore, five variants of U5 snRNA exist within cells that have various changes throughout their stem-loop sequences that do not affect their assembly into snRNPs (62). Via the assembly and selection of Sm cores on randomized RNAs in Xenopus oocytes, Grimm et al. (17) identified the motif AAUUUUUGG, located near the 3′ stem of the carrier RNA, as the predominant sequence that confers Sm core assembly. As we now know that it is the SMN complex that both selects RNAs and mediates the assembly of Sm cores on them, these findings support the conclusion that sequences and structures unrelated to those defined here cannot function efficiently in SMN binding. Figure summarizes the ranges of lengths of the 3′ end stem and loop that exist among all U snRNAs and HSURs. Less is known about the requirement for sequences located 5′ of the Sm site. Mapping data from both U snRNAs (67) and HSURs (this study) suggest that at least ~15 nt are required 5′ of the Sm site. Although there is no sequence or length conservation among the minimal 5′ end regions, we cannot rule out that a specific sequence or individual nucleotides specific to each RNA may be important for SMN complex binding. Although the SMN complex appears to contact the phosphate backbone of a cytosine in the stem 5′ of the Sm site and a uridine in the 3′ loop of HSUR4-35 (Fig. ), these nucleotides are not conserved in HSUR5-60 and likely do not play the same role in the recognition by the SMN complex as do Sm site uridines. It is also possible that the 5′ end may be required to maintain the overall secondary structure of the snRNA, to present a single-stranded Sm site, or to stabilize the Sm protein ring. In addition, there appears to be an upper limit on the length of the sequence just 3′ of the SMN complex-binding element. Masuyama et al. (34) have recently shown that the length of the RNA determines the export pathway it utilizes. The addition of ~300 nt to U1 snRNA shunted it to the mRNA export pathway, possibly influencing its activity and fate in the cytoplasm.
Binding to the SMN complex, while necessary, is not sufficient for Sm core assembly. The SMN complex binds to an RNA that has the right configuration of an Sm site sequence and an adjacent 3′ stem-loop. This conclusion is further supported by experiments with a U1 snRNA construct (U1 Swap) that is identical to wild-type U1, except that the positions of stem-loops 1 and 4 were swapped, so that stem-loop 1 is placed 3′ to the Sm site (67). While U1 Swap contains all the sequence elements required for SMN complex binding and Sm core assembly, including a high-affinity SMN complex binding site and an Sm site, the SMN complex discerns that the position of the Sm site relative to the 3′ end of the RNA has been altered and, consequently, binds to the RNA but does not assemble an Sm core on it (67). It is currently not known if the distance between the Sm site and the 3′ stem-loop is an important prerequisite in the assembly of an Sm core. Among all of the U snRNAs and HSURs, however, this length is maintained somewhere between 1 and 5 nt. In addition, if the RNA contains more than about 15 single-stranded nucleotides at the 3′ end, the SMN complex will not assemble an Sm core on it, in marked contrast to the behavior of TPs (Fig. and ). Thus, binding and assembly are two distinct steps within the snRNP biogenesis pathway, and the SMN complex has a built-in surveillance capacity that aborts the assembly reaction if the RNA does not have all the correct features. It is not clear whether the aborted RNAs accumulate in the cytoplasm, are degraded, or are shortened via an exonucleolytic mechanism to become substrates for Sm core assembly. At least after a 1-h incubation, the elongated RNAs appear to remain full-length (Fig. ). Other studies have reported that U1 snRNA molecules with heterogeneous 3′ ends do not efficiently assemble Sm cores and become poor substrates for normal 3′-end processing in Xenopus. In vivo, U snRNAs are transcribed as precursors that have been reported to contain as many as 6 to 8 nt more than the mature molecule (33), and Sm core assembly seems to be required for the 3′ end processing (71).
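The distinction between binding and assembly drawn in this paragraph can be written down as a small decision rule. The sketch below is only an illustration of the reasoning, not a model taken from the experiments: the class, field, and function names are hypothetical, the 15-nt cutoff is the approximate value quoted in the text, and treating a 3′-extended RNA as still bound (but not assembled) is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    NOT_BOUND = "not bound by the SMN complex"
    BOUND_ONLY = "bound, but assembly aborted"
    ASSEMBLED = "bound and Sm core assembled"


@dataclass
class RnaFeatures:
    """Hypothetical summary of the features discussed in the text."""
    has_sm_site: bool
    sm_site_in_3p_context: bool     # Sm site followed by an adjacent 3' stem-loop
    has_u1_type_site: bool = False  # U1 stem-loop 1-like high-affinity site
    tail_nt: int = 0                # single-stranded nt at the 3' end


MAX_TAIL_NT = 15  # approximate limit quoted in the text


def smn_outcome(f: RnaFeatures) -> Outcome:
    # Step 1: binding, through either of the two proposed binding sites.
    code_present = f.has_sm_site and f.sm_site_in_3p_context
    if not (f.has_u1_type_site or code_present):
        return Outcome.NOT_BOUND
    # Step 2: assembly, only if every feature of the code is in place.
    if code_present and f.tail_nt <= MAX_TAIL_NT:
        return Outcome.ASSEMBLED
    return Outcome.BOUND_ONLY


# Assumed feature assignments for three RNAs mentioned in the text.
hsur_like = RnaFeatures(has_sm_site=True, sm_site_in_3p_context=True)
u1_swap = RnaFeatures(has_sm_site=True, sm_site_in_3p_context=False,
                      has_u1_type_site=True)
extended = RnaFeatures(has_sm_site=True, sm_site_in_3p_context=True, tail_nt=20)

for name, rna in [("HSUR-like", hsur_like), ("U1 Swap", u1_swap),
                  ("3'-extended", extended)]:
    print(name, "->", smn_outcome(rna).value)
```

Under these assumed feature assignments, U1 Swap and the 3′-extended RNA both end in the "bound, but assembly aborted" state, which is the surveillance behavior described above.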
The discovery that the SMN complex decodes the critical sequence features of snRNAs through direct binding offers a mechanism to explain how Sm cores are only assembled on the correct RNAs. These findings suggest a structural algorithm that can be used to predict additional snRNAs, if such exist, and possibly motor neuron-specific RNA substrates for the SMN complex. Viewed in the more general context of how cells distinguish among the different classes of RNAs, our studies reveal a clear signature in the major snRNAs and demonstrate that the SMN complex performs the task of identifying that signature as well as carrying out the assembly of these RNAs into the corresponding RNPs.
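As a rough illustration of what such a structural algorithm might look like, the following Python sketch scans an RNA sequence for the features described in the text: an AUUUUUG Sm site with at least ~15 nt on its 5′ side, a stem-loop beginning 1 to 5 nt downstream, and no more than about 15 single-stranded nucleotides at the 3′ end. It is a minimal sketch under stated assumptions, not the procedure used in this study. The function names, the minimum stem length of 4 bp, the minimum loop of 3 nt, and the very simple hairpin test (Watson-Crick and G-U pairs stacked inward, with no bulges or thermodynamics) are all hypothetical choices; a real screen would use a proper RNA folding tool.

```python
import re

# Sm site as given in the text for the HSURs.
SM_SITE = re.compile(r"AUUUUUG")

# Watson-Crick pairs plus G-U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def stem_length(seq, i, j, min_loop=3):
    """Count base pairs stacked inward from (i, j), leaving >= min_loop unpaired nt."""
    pairs = 0
    while j - i - 1 >= min_loop and (seq[i], seq[j]) in PAIRS:
        pairs += 1
        i += 1
        j -= 1
    return pairs


def has_snrnp_code(rna, min_5p=15, max_gap=5, max_tail=15, min_stem=4):
    """Return True if rna shows the Sm site plus 3' stem-loop arrangement.

    min_5p, max_gap and max_tail follow the approximate values in the text;
    min_stem is an assumed threshold.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input
    last = len(rna) - 1
    for m in SM_SITE.finditer(rna):
        if m.start() < min_5p:           # too little sequence 5' of the Sm site
            continue
        for gap in range(1, max_gap + 1):         # nt between Sm site and stem
            i = m.end() + gap
            for tail in range(0, max_tail + 1):   # unpaired nt at the 3' end
                j = last - tail
                if j > i and stem_length(rna, i, j) >= min_stem:
                    return True
    return False


# Synthetic test RNA (not a real snRNA): 15-nt 5' region, Sm site,
# 2-nt spacer, and a 5-bp hairpin with a GAAA loop.
five_prime = "GCGCAGACGCAGACG"
hairpin = "GCGGC" + "GAAA" + "GCCGC"
candidate = five_prime + "AUUUUUG" + "AA" + hairpin

print(has_snrnp_code(candidate))             # True
print(has_snrnp_code(candidate + "A" * 20))  # False: 3' tail longer than 15 nt
```

The stem-loop sequence itself is deliberately unconstrained here, in keeping with the observation that its sequence appears to be relatively unimportant as long as the structure is present.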