Messenger RNAs undergo 5' capping, splicing, 3'-end processing, and export before translation in the cytoplasm. It has become clear that these mRNA processing events are tightly coupled and have a profound effect on the fate of the resulting transcript. This processing is represented by modifications of the pre-mRNA and loading of various protein factors. The sum of protein factors that stay with the mRNA as a result of processing is modified over the life of the transcript, conferring significant regulation to its expression.
Messenger RNA (mRNA) transcripts are extensively processed before export. 5′ capping, splicing, and 3′-end processing are nuclear processes that largely determine the fate of a transcript. Because mRNA processing events involve different cellular machinery (Fig. 1) and RNA sequences, and have different consequences for target mRNAs, these processes were long seen as independent of one another. It has become clear over the last decade, however, that these events are integrated and coordinated in space and time (Schroeder et al. 2000; reviewed in Bentley 2002; reviewed in Moore and Proudfoot 2009). Nuclear processing steps require a large set of proteins, many of which are loaded onto the transcript as a result of processing, adding a layer of regulatory information that can affect export, localization, translation, and stability of the transcript. Examples of such proteins include the exon-junction complex (EJC), left behind from splicing, and the THO/TREX complex, loaded during elongation. Indeed, the highly integrated nature of nuclear mRNA processing adds a new level of complexity to our picture of gene regulation. The availability of genetic fluorescent tags and sophisticated microscopy technology adds a dynamic component to this picture, providing spatial and temporal information and highlighting how nuclear structure might regulate gene expression (reviewed in Gorski et al. 2006; reviewed in Moore and Proudfoot 2009).
Transcription, the major contributor to RNA biogenesis, takes place under constraints of an anisotropic nuclear landscape that is highly structured (chromatin, distinct nuclear bodies, etc.) and dynamic (gene mobility, diffusive factors, genomic reorganization during cell cycle progression, etc.) (Yao et al. 2008). The availability of increasingly sensitive equipment and fluorescent markers has made it possible to intensively interrogate transcriptional dynamics (Fig. 2) (Becker et al. 2002; Janicki et al. 2004; reviewed in Shav-Tal et al. 2004b; Darzacq et al. 2007; Yao et al. 2007). Such approaches provide a new perspective on mRNA metabolism that has been mostly based on biochemical data. For instance, whereas preformed transcription complexes are sufficient for in vitro transcription, live cell imaging experiments show that different transcription factors show a wide range of dwell times at the promoter and suggest a link between transcription complex assembly dynamics and transcriptional output, consistent with a subunit assembly model for transcription complex recruitment (Gorski et al. 2008; reviewed in Darzacq et al. 2009). Modulation of transcriptional speed and processivity is suggested to be a way of regulating gene expression (Darzacq et al. 2007; reviewed in Core and Lis 2008; Core et al. 2008). These techniques have revealed highly inefficient transcription by RNA polymerase II (Pol II) and stochastic assembly of transcription complexes (Darzacq et al. 2007; Gorski et al. 2008), and both assembly and processivity can be regarded as rate limiting steps of in vivo transcription (reviewed in Core and Lis 2008). A novel way to look at gene expression has been recently presented by analyzing transcript amounts in individual cells on the single molecule level. Here the mean expression level and its variation are accessible, leading to a detailed understanding of variability in gene expression within a population (Zenklusen et al. 2008). 
Time lapse experiments add further information concerning the expression mode of individual genes, providing insights into differences between constitutive expression and bursts in different species (reviewed in Larson et al. 2009). These experimental approaches allow for in depth characterization of transcriptional dynamics (reviewed in Darzacq et al. 2009; reviewed in Larson et al. 2009), and are likely to provide greater insight into downstream processes.
Transcription, particularly the carboxy-terminal domain (CTD) of RNA Pol II, contributes significantly to the integration of nuclear mRNA processing (Fig. 3). The CTD is an essential domain of the largest RNA Pol II subunit, composed of conserved YSPTSPS heptad repeats that are subject to reversible phosphorylation. It is well established that the CTD functions in transcription, and it has an equally important role in mRNA processing. The CTD interacts with a large number of protein factors, and the protein domains that bind the CTD preferentially include CTD-interacting domains (CIDs), WW domains, and FF domains (Verdecia et al. 2000; Smith et al. 2004; Noble et al. 2005). Serine 5 phosphorylation of the CTD occurs when RNA Pol II is at the 5′ end of the gene and is mediated by the TFIIH-associated kinase CDK7 (Kin28 in yeast) as transcription initiates. Serine 2 phosphorylation is mediated by P-TEFb (CTDK1 in yeast) as the processive RNA Pol II elongates through the body of the gene (Komarnitsky et al. 2000; Peterlin and Price 2006). These phosphorylation marks are critical to the proper progression of transcription and are required for coordinating processing events (reviewed in Hirose and Manley 2000). Indeed, the CTD has been shown to bind over 100 different yeast proteins in its phosphorylated state (Phatnani et al. 2004). It can adopt different conformations depending on phosphorylation patterns, protein interactions, and proline isomerization via peptidyl-prolyl cis/trans isomerases (PPIases) (reviewed in Hirose and Ohkuma 2007). In this way, the CTD functions as a recruitment scaffold for different processing factors throughout transcription, thereby integrating processing events in time. 
Furthermore, changes in chromatin structure on transcriptional activation are likely to contribute to mRNA processing through gene positioning (relocation at or near nuclear pores) and gene looping (formation of DNA loops in which 5′ and 3′ ends make contact) (reviewed in Moore and Proudfoot 2009). Recent evidence from studies using galactose genes in yeast showed that gene positioning can be a further regulatory step, and an interaction has been shown between transcription-dependent complexes (e.g. TREX/TREX-2), chromatin remodeling complexes (e.g. SAGA), and nuclear pores (reviewed in Blobel 1985; Cabal et al. 2006; Klockner et al. 2009; reviewed in Moore and Proudfoot 2009). Export, however, is likely to depend on successful maturation of the mRNA, and whereas gene gating to nuclear pores is suggested in yeast, mRNAs of other genes reach nuclear pores in a diffusive manner. Studies in mammalian cell systems have not supported gene gating, as fluorescence correlation spectroscopy (FCS), FRAP and single particle tracking consistently suggest diffusive behavior (Politz et al. 1998; Shav-Tal et al. 2004a; Grunwald et al. 2006; Politz et al. 2006; Braga et al. 2007; Siebrasse et al. 2008). In addition, high resolution studies of the nuclear periphery led to the conclusion that DNA is absent at the nuclear basket of nuclear pores (Schermelleh et al. 2008). Thus, transcription and its associated machinery serve to localize processing factors to the appropriate place on the nascent mRNA, stimulate and coordinate processing, and possibly establish functionally significant chromatin conformations.
The first processing event an mRNA undergoes is 5′-end capping, which requires three enzymatic activities: RNA triphosphatase, guanylyltransferase and guanine-7-methyltransferase (reviewed in Shuman 2001). Capping occurs early in transcription, after RNA Pol II has transcribed the first 25–30 nucleotides. The RNA triphosphatase first acts on the terminal nucleotide to remove the γ-phosphate. The guanylyltransferase then transfers GMP from GTP to form GpppN, which is subsequently methylated. In mammals, a bifunctional capping enzyme includes both amino-terminal RNA triphosphatase activity and carboxy-terminal guanylyltransferase activity. In yeast, the RNA triphosphatase (Cet1) and separate guanylyltransferase (Ceg1) form a heterodimeric capping enzyme. In both cases, a separate methyltransferase (Abd1 in yeast) is required to methylate the guanine at the N7 position.
Mammalian capping enzyme binds directly to the elongating RNA Pol II with a phosphorylated CTD (termed RNA Pol II0) through its guanylyltransferase domain (Yue et al. 1997), thereby coupling capping to the early stage of transcription. Yeast capping enzyme subunits also bind directly and independently to RNA Pol II0, and this interaction is dependent on Kin28, the subunit of TFIIH responsible for serine 5 phosphorylation (Rodriguez et al. 2000). The loss of serine 5 phosphorylation during transcription correlates with the release of the capping enzyme, which is believed to occur before the nascent transcript is 500 nucleotides long (reviewed in Zorio and Bentley 2004). Ceg1 guanylylation activity is inhibited by the phosphorylated CTD but is restored and enhanced by Cet1, and this allosteric regulation may represent a means to temporally coordinate guanylylation and triphosphatase activities (Cho et al. 1998; Ho et al. 1998). In mammals, CTD binding to the guanylyltransferase has an allosteric effect, causing a twofold increase in the affinity of guanylyltransferase for GTP (Ho and Shuman 1999). CTD phosphorylation that accompanies the transition from initiation to elongation has a clear impact on capping and allows communication between the transcriptional machinery and capping enzymes. It seems that capping also has a direct impact on transcription. Recent evidence indicates that capping enzymes can relieve transcriptional repression, suggesting an additional role in promoter clearance (Mandal et al. 2004). Temperature-sensitive ceg1 yeast mutants are sensitive to drugs that disrupt elongation and show decreased transcription through promoter-proximal pause sites (Kim et al. 2004a). Furthermore, yeast capping enzyme subunits influence RNA Pol II occupancy at the 5′ end and may regulate transcription reinitiation (Myers et al. 2002; Schroeder et al. 2004). 
Taken together, it seems that transcription complexes are held at the promoter until capping occurs, after which the polymerase switches into an elongating mode.
Capping of transcripts confers stability. In yeast as well as mammals, capping helps protect the transcript from 5′→3′ exonucleases present in both the nucleus and cytoplasm (Hsu and Stevens 1993; Walther et al. 1998). The 5′→3′ degradation pathway involves deadenylation followed by rapid decapping by Dcp1/Dcp2, and degradation by the processive 5′→3′ exonuclease, Xrn1 (Hsu and Stevens 1993; Muhlrad et al. 1995). The cap is also important in mediating mRNA recruitment to ribosomes. The protein complex eIF4F recognizes the cap before translation, and facilitates circularization of mRNAs via an interaction with polyA-binding protein (PAB1), thereby aiding in translation reinitiation and enhancing protein synthesis (Tarun and Sachs 1996; Wakiyama et al. 2000). Transcripts engaged in translation are protected from degradation suggesting competition between translation and degradation (reviewed in Jacobson and Peltz 1996). Finally, depletion of CBC (cap-binding complex) from HeLa cell extracts inhibits the endonucleolytic cleavage step of 3′-end formation, reduces the stability of poly(A) cleavage complexes, and disrupts communication between 5′ and 3′ ends (Flaherty et al. 1997).
The precise removal of noncoding intervening sequences, or introns, from many pre-mRNAs is required for proper protein expression. In both yeast and mammals, this reaction is catalyzed by the spliceosome, consisting of the U1, U2, U4, U5 and U6 small nuclear RNPs (snRNPs) in conjunction with a large number of additional proteins (reviewed in Stark and Luhrmann 2006). Within the spliceosome, a series of RNA–RNA, RNA–protein, and protein–protein interactions is needed to identify and remove intronic regions and join exons, producing a mature transcript (reviewed in Collins and Guthrie 2000). The mature spliceosome carries out splicing through two transesterification reactions. First, the 2′-OH of a branch point nucleotide performs a nucleophilic attack on the first nucleotide of the intron, forming a lariat intermediate. Second, the 3′-OH from the free exon performs a nucleophilic attack on the last nucleotide of the intron, thereby joining exons and releasing the lariat intron. Intron identification relies on certain sequences, including the 5′ splice site, branch point (and downstream polypyrimidine tract) and 3′ splice site. In yeast, splice sites are easily identified, and although only 3% of genes contain single short introns, they account for more than 25% of cellular mRNAs (Ares et al. 1999; Lopez and Seraphin 1999; reviewed in Barrass and Beggs 2003). In mammals, however, splice sites are less clear, and many genes contain multiple introns that vary from a few hundred to hundreds of thousands of nucleotides (Lander et al. 2001). The presence of putative splice sites in higher eukaryotes does not necessarily lead to selection of these sites by the spliceosome. Flanking pre-mRNA regulatory elements, including intronic and exonic splicing enhancers or silencers, bind trans-acting regulatory factors that enhance or repress snRNP recruitment to splice sites. 
Generally, exonic splicing enhancers are bound by serine/arginine-rich (SR) proteins, whereas exonic splicing silencers are bound by heterogeneous nuclear ribonucleoprotein (hnRNP) proteins (reviewed in Cartegni et al. 2002; reviewed in Singh and Valcarcel 2005). Therefore, in higher eukaryotes, it is the cumulative effect of multiple factors that modulates splice site selection. As a result, 92%–94% of human transcripts are subject to alternative splicing, representing an important source of diversity in gene expression with serious implications for health and disease (reviewed in Nissim-Rafinia and Kerem 2005; Wang et al. 2008). Conversely, in yeast, SR proteins do not appear to have a significant role in splicing, consistent with the absence of exonic splicing enhancers (reviewed in Wahl et al. 2009).
The spliceosome is rich in proteins, containing approximately 125 different proteins (more than two-thirds of its mass), and spliceosome assembly is characterized by a remarkable exchange of components from one step to the next (reviewed in Wahl et al. 2009). Although RNA-RNA base pairing interactions are critical to the precise recognition of splice sites, they are generally weak and require additional proteins for enhanced stabilization. DExD/H-type RNA-dependent ATPases/helicases have long been implicated in rearrangements within the spliceosome, and many are conserved between yeast and humans. These proteins act at discrete stages of splicing including single-strand RNA translocation, strand annealing, and protein displacement (reviewed in Pyle 2008). Human spliceosomes also contain several PPIases that are absent in yeast, though the role of these proteins in splicing is not well understood. Similar to DExD/H-type RNA-dependent ATPases/helicases, they are recruited at discrete stages of splicing and are thought to be involved in at least one protein conformational switching event (reviewed in Wahl et al. 2009). Furthermore, post-translational modification of splicing factors and spliceosomal proteins may act as switches to allow fine tuning of the spliceosome (Bellare et al. 2008; reviewed in Wahl et al. 2009). The nature of interactions during such a tightly regulated protein-rich process is not very well documented and may be best studied using in vivo imaging techniques. For example, one recent study (Fig. 4) employed FRET-FLIM and revealed for the first time that different complexes of splicing factors show differential distributions in live cell nuclei (Ellis et al. 2008).
Mature mRNAs are occupied by a number of different proteins that determine their fate in many ways, and several of these associations are splice-dependent. As mentioned, various proteins associate cotranscriptionally and accompany the packaged mRNA into the cytoplasm where they can direct localization, translation and decay. Shuttling SR proteins specifically serve as mRNP binding sites for export factors, and the phosphorylation state of these proteins confers export competence (reviewed in Huang and Steitz 2005; reviewed in Moore and Proudfoot 2009). The THO/TREX complex, which functions in transcription and export, associates with spliced mRNAs at the 5′-most exon (Cheng et al. 2006). THO/TREX recruitment is enhanced by splicing, and promotes rapid export (Valencia et al. 2008). In mammals, REF/Aly and UAP56 (homologs of yeast Yra1 and Sub2), are recruited as a consequence of splicing and have a role in aiding export. Perhaps the most studied splice-dependent mark is the exon junction complex, or EJC. EJCs are stably deposited 20–24 nucleotides upstream of exon-exon junctions late in splicing (Le Hir et al. 2000). Interestingly, spliced mRNAs appear to have greater translational efficiency than their cDNA counterparts (reviewed in Le Hir et al. 2003). Aside from their role in nonsense mediated decay, EJCs appear to directly enhance translation initiation. Although there are several proposed mechanisms by which EJCs do this, they ultimately serve to promote the pioneer round of translation (reviewed in Moore and Proudfoot 2009). Finally, a number of DEAD-box proteins have recently been identified as associating with mRNAs in a splice-dependent manner, and these are believed to influence many aspects of mRNA metabolism (Merz et al. 2007; reviewed in Rosner and Rinkevich 2007). Taken together, it is clear that spliced mRNAs carry with them numerous protein marks that indicate their splicing history and have important downstream effects.
Initial work focusing on the link between transcription and splicing suggested that splicing occurs cotranscriptionally and factors involved in splicing colocalize with transcription sites (Beyer and Osheim 1988; Zhang et al. 1994). Similar to capping, the CTD of RNA Pol II has an important role in splicing. CTD truncation causes inefficient splicing in mammalian cells and inhibition of colocalization of splicing factors with transcription sites (McCracken et al. 1997; Misteli and Spector 1999). Furthermore, RNA Pol II0 has been shown to physically associate with splicing factors that do not bind RNA Pol IIA (RNA Pol II with an unphosphorylated CTD), suggesting this interaction depends on the phosphorylation state of the CTD. Both anti-CTD antibodies and CTD peptides can inhibit splicing in vitro, and expression of phosphorylated CTD peptides has a similar effect on mammalian cells in vivo (Yuryev et al. 1996; Du and Warren 1997). RNA Pol II0 enhances splicing in vitro, whereas RNA Pol IIA has an inhibitory effect (Hirose et al. 1999). This is believed to result from RNA Pol II0-dependent stimulation of the early steps of spliceosome assembly, possibly by facilitating the binding of snRNPs to the nascent transcript. Therefore, it seems that CTD phosphorylation can act to recruit splicing factors to the nascent transcript to ensure rapid and accurate splicing. Transcription is also linked to splicing in ways independent of the CTD. Promoter identity and expression levels of certain SR proteins are known to affect alternative splicing (Cramer et al. 1997). Furthermore, in both yeast and mammals, disruption of RNA Pol II elongation markedly shifts the balance of alternatively spliced isoforms (de la Mata et al. 2003; Howe et al. 2003). This is consistent with a “first come first served” model in which elongation rate regulates splice site selection, as 5′ splice sites are more likely to pair with newly transcribed 3′ splice sites. 
Additionally, the transcriptional coactivator, p52, is known to interact with SF2/ASF and stimulates splicing (Ge et al. 1998), suggesting that transcriptional machinery can modulate splicing factor recruitment.
Finally, splicing can also have an impact on transcription. The presence of introns can confer increased transcriptional efficiency, possibly through increased initiation rates (Brinster et al. 1988). Recruitment of snRNPs by TAT-SF1, an elongation factor that associates with P-TEFb, enhances elongation, and this effect is dependent on the presence of functional splice signals (Fong and Zhou 2001; Kameoka et al. 2004). Reminiscent of the effect of capping on transcription, certain splicing factors have also been shown to promote elongation (Lin et al. 2008).
Capping also has a role in splicing. The 5′ cap structure increases splicing efficiency in mammalian cell extracts and in vivo (Konarska et al. 1984; Inoue et al. 1989). Depletion of CBC in cell extracts prevents spliceosome assembly at an early step in complex formation (Izaurralde et al. 1994). In yeast, CBC interacts with components of the earliest identified splicing complexes (Colot et al. 1996). Similarly, in mammals, successful capping and CBC recruitment are implicated in U1 snRNP recognition of the 5′ splice site, but this effect is specific to mRNAs with only one intron (Lewis et al. 1996).
Splicing defects can lead to potentially harmful protein variants. mRNAs are subject to quality control in the nucleus, which prevents the export of splicing-defective transcripts. To date, several mutations in cis (conserved splice sites) and in trans (spliceosome components required for first- or second-step catalysis) have been found to yield drastic reductions in mature mRNA without a corresponding increase in unspliced pre-mRNAs (reviewed in Staley and Guthrie 1998; Bousquet-Antonelli et al. 2000). Mature mRNA levels are largely restored in these same mutants when degradation is inhibited, suggesting that spliceosomes are able to act successfully on these substrates if they are not quickly destroyed. Thus, quality control acts on a number of different splice-defective pre-mRNAs, and degradation is in direct competition with productive splicing. This sort of kinetic competition is exemplified by the finding that decreased ATP hydrolysis by the DExD/H-type RNA-dependent ATPase Prp16 leads to productive splicing of pre-mRNAs harboring mutant branch points that would normally be discarded (Burgess and Guthrie 1993). This same principle has been extended to other splice site mutations and other members of the DExD/H-type RNA-dependent ATPase/helicase family (Mayas et al. 2006). By coupling spliceosomal rearrangements with irreversible ATP hydrolysis, these proteins ensure that splice-aberrant mRNAs that are unable to complete splicing within a time frame dictated by ATP hydrolysis rates are discarded (reviewed in Villa et al. 2008). Quality control takes place in the nucleus, can act at numerous stages in the splicing process and serves to commit unspliced mRNAs to degradation pathways. One such pathway involves both 3′ → 5′ degradation by the exosome and 5′ → 3′ degradation by the nuclear exonuclease Rat1 (Bousquet-Antonelli et al. 2000). 
Other degradation pathways include Dbr1-mediated debranching of aberrant lariat-intermediates followed by export and cytoplasmic degradation, and Rnt1-mediated endonucleolytic cleavage of unspliced pre-mRNAs followed by nuclear degradation (Danin-Kreiselman et al. 2003; Hilleren and Parker 2003). An additional quality control mechanism involves spliceosome-dependent nuclear retention of unspliced transcripts. A number of different proteins seem to be involved in anchoring these transcripts at the nuclear side of the nuclear pore complex, and retention may be regulated by desumoylation (reviewed in Dziembowski et al. 2004; Galy et al. 2004; Palancade et al. 2005; Lewis et al. 2007). Furthermore, splice-defective mRNAs are also known to be retained at the site of transcription (Custodio et al. 1999; reviewed in Custodio and Carmo-Fonseca 2001). Thus, the cell has evolved various ways to ensure that unspliced transcripts do not leak into the cytoplasm. Finally, nonsense mediated decay (NMD) represents a highly specific form of quality control that extends to higher eukaryotes. The presence of an EJC downstream of a stop codon triggers degradation of translating mRNAs (reviewed in Stalder and Muhlemann 2008). Splice-dependent EJC deposition can increase the translational efficiency of normal mRNAs while ensuring rapid degradation of aberrant mRNAs.
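The EJC-based rule for NMD described above can be written as a simple positional check. What follows is a minimal sketch under stated assumptions, not an established tool: the function names, the 0-based mRNA coordinates, and the single fixed offset of 24 nucleotides (the upper end of the 20–24 nucleotide range reported for EJC deposition) are all choices made for illustration. The sketch ignores EJC removal by the translating ribosome and any other determinants of NMD.

```python
def ejc_positions(exon_lengths, upstream_offset=24):
    """Return assumed EJC positions (0-based mRNA coordinates).

    One EJC is placed `upstream_offset` nucleotides upstream of each
    exon-exon junction. The last exon has no downstream junction,
    so it contributes no EJC.
    """
    positions = []
    junction = 0
    for length in exon_lengths[:-1]:
        junction += length  # coordinate of the next exon-exon junction
        positions.append(max(junction - upstream_offset, 0))
    return positions


def is_nmd_candidate(exon_lengths, stop_codon_start, upstream_offset=24):
    """Flag a spliced mRNA if any assumed EJC lies downstream of the stop codon."""
    stop_codon_end = stop_codon_start + 3  # first position after the stop codon
    return any(
        position >= stop_codon_end
        for position in ejc_positions(exon_lengths, upstream_offset)
    )
```

For a hypothetical transcript with three 100-nucleotide exons, `ejc_positions([100, 100, 100])` places EJCs at positions 76 and 176. A stop codon starting at position 250 (in the last exon) leaves no EJC downstream, whereas a premature stop codon starting at position 120 leaves the second EJC downstream and is flagged.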
The final step of transcription is endonucleolytic cleavage which occurs 10–30 nucleotides downstream of a signal sequence (conserved AAUAAA sequence in mammals or an AU-rich sequence in yeast), followed by poly(A) addition at the 3′ end (reviewed in Proudfoot 2004). The poly(A) tail is similar to the cap in that it is important for the stability and translational efficiency of the mRNA (Drummond et al. 1985). Cleavage requires multiple proteins, including cleavage/polyadenylation specificity factor (CPSF), cleavage stimulation factor (CstF), and two cleavage factors (CFIm and CFIIm) in mammals, or cleavage-polyadenylation factor (CPF) and two cleavage factors (CF1A and CF1B) in yeast. Poly(A) polymerase (PAP) then adds the poly(A) tail to the 3′-OH that is exposed on cleavage (reviewed in Proudfoot 2004).
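The positional relationship between the mammalian poly(A) signal and the cleavage site can be illustrated with a short sequence scan. This is a hedged sketch only: the function name, the 0-based half-open coordinates, and the choice to count the 10–30 nucleotide distance from the 3′ end of the AAUAAA hexamer are assumptions for illustration. Real cleavage site selection also depends on downstream U/GU-rich elements and auxiliary sequences, which are not modeled here.

```python
import re


def find_cleavage_windows(rna, signal="AAUAAA", min_offset=10, max_offset=30):
    """Return (signal_start, window_start, window_end) for each signal found.

    Coordinates are 0-based and the window is half-open. Offsets are counted
    from the position immediately after the signal (an assumption of this
    sketch). Windows are clipped to the end of the sequence, and signals
    whose window would start beyond the sequence end are skipped.
    """
    windows = []
    # A lookahead pattern lets overlapping signal occurrences be reported.
    for match in re.finditer(f"(?={re.escape(signal)})", rna):
        signal_end = match.start() + len(signal)
        window_start = signal_end + min_offset
        window_end = min(signal_end + max_offset, len(rna))
        if window_start < len(rna):
            windows.append((match.start(), window_start, window_end))
    return windows
```

For a hypothetical sequence such as `"GG" + "AAUAAA" + "C" * 40`, the signal starts at position 2 and the candidate cleavage window spans positions 18 to 38.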
CPSF and CstF are highly conserved between yeast and humans and are required for both cleavage and polyadenylation (reviewed in Shatkin and Manley 2000). CPSF recognizes RNA and facilitates PAP recruitment. The endonuclease responsible for cleavage is CPSF-73 in mammals and Ydh1 in yeast (Ryan et al. 2004; Mandel et al. 2006). CstF recognizes U/GU-rich elements found in the mRNA and is directly involved in polyadenylation. The conserved AAUAAA sequence and downstream U/GU site comprise the core poly(A) element, although additional auxiliary elements can influence polyadenylation efficiency (Gil and Proudfoot 1984; Sadofsky and Alwine 1984; Russnak 1991; Bagga et al. 1995). Cleavage is closely coupled to poly(A) tail synthesis, which also requires PAB1. A number of other factors also participate in 3′-end processing, and many do not have homologs in other systems (reviewed in Shatkin and Manley 2000).
As with splicing, transcripts can be alternatively polyadenylated, thereby altering stability, localization or transport. It is estimated that more than half of the genes in the human genome are subject to such alternative 3′-end processing, generating isoforms that differ in 3′ UTR length or encoding different proteins altogether (Tian et al. 2005). Alternative polyadenylation can be tissue specific, may be coupled to alternative splicing and can have implications for health and disease (Peterson and Perry 1989; Beaudoing and Gautheret 2001; Tian et al. 2005; Lu et al. 2007; reviewed in Danckwardt et al. 2008). In fact, many human genes contain multiple potential 3′-end cleavage sites, and appropriate site selection is achieved by alternate mechanisms, representing an additional layer of complexity in the regulation of gene expression (reviewed in Wilusz and Spector 2010). It is important to note that the mechanism and machinery responsible for alternative polyadenylation remain unclear.
Many factors involved in 3′-end processing have been shown to interact with the CTD, including CstF subunits. Purified CTD can stimulate the cleavage step and is needed for processing in reconstituted reactions (Hirose and Manley 1998). As with 5′ capping, the specific CTD phosphorylation pattern is important in 3′-end processing. In this case, loss of serine 2 phosphorylation in Ctk1 (yeast) or Cdk9 (Drosophila) mutants leads to a defect in 3′-end processing, likely resulting from improper or inefficient recruitment of processing factors (reviewed in Hirose and Ohkuma 2007). Furthermore, several yeast proteins involved in 3′-end processing preferentially bind the CTD phosphorylated at serine 2, which may ensure processing occurs as the polymerase reaches the end of a gene. In mammalian cells, unlike yeast, the CTD is also required for cleavage (Licatalosi et al. 2002). Once again, the CTD appears to mediate coupling between transcription and 3′-end processing.
Although transcription elongation continues for quite some distance after the poly(A) signal, transcription termination and 3′-end processing are intimately coupled. This was supported by the finding that termination requires functional poly(A) signals (reviewed in Hirose and Manley 2000). There are multiple ways in which 3′-end processing may be coupled to transcription termination:
Transcriptional pause sites positioned downstream of the poly(A) signal also seem to be important for 3′-end processing and termination of a number of mammalian genes, reestablishing the theme of kinetic coupling between transcription and processing (Yonaha and Proudfoot 2000).
Although 3′-end processing machinery is enriched at the 3′ end of genes, certain factors can be found at or near promoters toward the 5′ end. For example, CPSF can be recruited to promoters through an association with TFIID (Dantonel et al. 1997). Ssu72, a component of yeast CPF, seems to function at many stages, including transcription initiation, 3′-end processing and termination of certain mRNAs and snoRNAs (reviewed in Proudfoot et al. 2002). As described, CTD phosphorylation patterns are linked to processing events, and Ssu72, in conjunction with Pta1 (another component of yeast CPF), also appears to have phosphatase activity specific for serine 5 (Krishnamurthy et al. 2004). Additional genetic and physical interactions have been described between the transcriptional machinery (TFIIB and Sub1) and the 3′-end processing factors Ssu72 and Rna15 (Sun and Hampsey 1996; Wu et al. 1999; Calvo and Manley 2005). Interactions between factors located at opposite ends of genes are likely facilitated by the formation of gene loops, and TFIIB, Ssu72 and Pta1 all appear to have a role in this (Singh and Hampsey 2007). Overall, it seems that the transcriptional machinery, DNA template, nascent mRNA and 3′-end processing machinery are in constant communication during processing.
Evidence in yeast shows that transcripts undergoing aberrant 3′-end processing are disposed of. Defects associated with pap1-1 mutants are suppressed by deletion of the exosomal subunit Rrp6, which is known to interact with both PAP and the export factor Npl3 (Burkard and Butler 2000). Hypoadenylated mRNAs are retained in the nucleus at the site of transcription, and these transcripts are stabilized and exported in the absence of Rrp6 (Hilleren et al. 2001). In rna14 and rna15 mutant strains, defects in termination lead to readthrough transcripts and aberrant polyadenylation. Deletion of Rrp6 stabilizes the aberrantly polyadenylated population whereas depletion of Rrp41 stabilizes the population of long readthrough transcripts (Libri et al. 2002; Torchet et al. 2002). Therefore, different components of the exosome may have evolved specialized roles in mRNA surveillance, ensuring rapid degradation of transcripts that possess aberrant 3′ ends. Additionally, THO complex or sub2 mutants show defects in 3′-end formation, reduced mRNA levels and retention at the site of transcription. Codeletion of Rrp6 and TRAMP components restores mRNA levels, although the retention effect only requires Rrp6 (Libri et al. 2002; Rougemaille et al. 2007). These observations may suggest kinetic competition between 3′ end formation and degradation. Transcripts that do not quickly and efficiently undergo 3′-end processing are exposed to the exosome, and mutation of exosome components may allow more time for defective 3′-end processing machinery to function. Finally, recent evidence suggests that nuclear mRNP assembly factors are involved in releasing the 3′-end processing machinery from the transcript after polyadenylation (Qu et al. 2009). This may be a way to temporally coordinate 3′-end formation, mRNP maturation and export.
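The idea of kinetic competition between 3′-end formation and exosome-mediated degradation can be made concrete with a toy calculation. The sketch below assumes, purely for illustration, that processing and degradation are two competing first-order reactions acting on the same nascent transcript. Under that assumption the fraction of transcripts that complete processing is k_process / (k_process + k_decay). The model, the function name and the rate values are hypothetical and are not drawn from the studies cited above.

```python
def fraction_processed(k_process, k_decay):
    """Fraction of transcripts processed before they are degraded.

    Assumes two competing first-order reactions with rate constants
    `k_process` and `k_decay` (same units, e.g. per minute).
    """
    if k_process < 0 or k_decay < 0:
        raise ValueError("rate constants must be non-negative")
    total = k_process + k_decay
    if total == 0:
        raise ValueError("at least one rate constant must be positive")
    return k_process / total


# Hypothetical rates, chosen only to show the direction of the effect.
slow_processing = 0.1   # defective 3'-end processing machinery
normal_decay = 1.0      # wild-type exosome activity
reduced_decay = 0.1     # exosome mutant

with_exosome = fraction_processed(slow_processing, normal_decay)
without_exosome = fraction_processed(slow_processing, reduced_decay)
```

With these illustrative numbers, slowing degradation tenfold raises the processed fraction from about 9% to 50%, which is the qualitative behavior expected if exosome mutations give defective processing machinery more time to act.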
Transport through the nuclear pore complex (NPC) represents the link between the nucleus and cytoplasm. Several studies have investigated mRNA mobility in the nucleoplasm and have revealed probabilistic movement of mRNAs with diffusion coefficients between 0.03 µm²/s and 4 µm²/s (Politz et al. 1998; Shav-Tal et al. 2004a; Braga et al. 2007; Siebrasse et al. 2008). In pulse-chase experiments, mRNA was found in the cytoplasm within ~20 min after transcription (Lewin 1980). However, single-molecule tracking experiments suggest that transit through the NPC is significantly faster, on the order of fractions of a second. Different forms of RNA have been observed in close proximity to the nuclear envelope in electron micrographs (reviewed in Franke and Scheer 1974). Detailed resolution of individual mRNPs and how they move through nuclear pores is mainly derived from work using mRNPs of Balbiani Ring (BR) genes in salivary gland cells of Chironomus (Mehlin et al. 1992; Kiseleva et al. 1998). These large mRNPs, ~50 nm in diameter, have been visualized interacting with nuclear pores by electron microscopy. Because of their size, these mRNPs unfold at the NPC and show directional translocation through the NPC, beginning with the 5′ end (reviewed in Daneholt 2001). Hypothetical sequences for distinct steps in the export process have been assembled from EM series (Kiseleva et al. 1998). mRNA export likely involves distinct docking, translocation and release steps at the NPC, analogous to a ratchet model that includes a specific function for the mRNA-associated DEAD-box helicase DBP5 (reviewed in Stewart 2007). Direct interaction of DBP5 with Nup214, a cytoplasmic component of the NPC, has been shown (Napetschnig et al. 2009; von Moeller et al. 2009), and DBP5 is thought to promote export factor release and eventual reorganization of the mRNA (reviewed in Iglesias and Stutz 2008).
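To give a sense of scale for the nucleoplasmic diffusion coefficients quoted above, the following sketch estimates the time at which the mean squared displacement of a freely diffusing particle reaches a chosen distance, using the free-diffusion relation ⟨r²⟩ = 2·n·D·t in n dimensions. The 5 µm distance is an assumed, illustrative value for a nuclear length scale, not a figure from the cited studies, and the estimate ignores obstruction, binding and the docking step at the pore.

```python
def time_to_msd(distance_um: float, diff_coeff_um2_s: float, dims: int = 3) -> float:
    """Time (s) at which free diffusion gives <r^2> = distance_um ** 2.

    Uses <r^2> = 2 * dims * D * t, which holds for unobstructed
    Brownian motion only (an order-of-magnitude estimate).
    """
    if distance_um < 0:
        raise ValueError("distance must be non-negative")
    if diff_coeff_um2_s <= 0:
        raise ValueError("diffusion coefficient must be positive")
    if dims not in (1, 2, 3):
        raise ValueError("dims must be 1, 2 or 3")
    return distance_um ** 2 / (2 * dims * diff_coeff_um2_s)


ASSUMED_DISTANCE_UM = 5.0  # hypothetical nuclear length scale

# Lower and upper bounds of the reported range of diffusion coefficients.
for d in (0.03, 4.0):
    t = time_to_msd(ASSUMED_DISTANCE_UM, d)
    print(f"D = {d} µm²/s: about {t:.1f} s to reach {ASSUMED_DISTANCE_UM} µm (rms)")
```

Under these assumptions the two ends of the reported range correspond to roughly one second and a few minutes, respectively, which shows how strongly the hundredfold spread in measured diffusion coefficients affects the expected time a transcript spends in the nucleoplasm.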
DBP5 localization, the timing and location of loading onto the transcript, and how many export factors are actually attached to any individual transcript remain unclear (Zhao et al. 2002; Estruch and Cole 2003; Lund and Guthrie 2005; von Moeller et al. 2009). Discrepancies in the number of BR-mRNPs observed on the nuclear and cytoplasmic surfaces of nuclear pores in EM studies have been interpreted as uncoupled, asynchronous functions of the export process, which should result in a waiting step during transport. At the same time, it was concluded that translocation through the central channel of nuclear pores is probably fast compared to the docking step on the nuclear surface of the pore (Kiseleva et al. 1998). Biochemical studies indicate that mRNA export competence is directly linked to transcription (reviewed in Kohler and Hurt 2007; reviewed in Hurt and Silver 2008; reviewed in Iglesias and Stutz 2008; reviewed in Carmody and Wente 2009; reviewed in Moore and Proudfoot 2009). Recently, a link between actin and transcription has been suggested. Actin (1) can be detected as a component of pre-mRNP complexes, (2) binds transcription factors, (3) is involved in chromatin remodeling, and (4) associates directly with RNA polymerases (reviewed in Miralles and Visa 2006). Interestingly, actin may also have a role in export, as it has been observed to associate cotranscriptionally with BR-mRNPs and to remain associated throughout export (Percipalle et al. 2001). Furthermore, the nuclear export receptor exportin-6 shows specificity for profilin-actin, suggesting an additional role as an adaptor for export of certain mRNAs (Stuven et al. 2003; reviewed in Miralles and Visa 2006).
Export is mediated by protein factors associated with the mRNA, and mRNAs that do not carry the necessary adaptor and export factors are retained in the nucleus (reviewed in Iglesias and Stutz 2008). Most mRNAs seem to be exported via a TAP (Mex67 in yeast)-dependent pathway. TAP is not a member of the karyopherin family and does not rely on the GTPase Ran, which mediates nuclear import (Segref et al. 1997; reviewed in Macara 2001; reviewed in Dreyfuss et al. 2002). Export factors do not recognize the mRNA directly, but rather through adaptors such as Aly/REF (Yra1 in yeast), which is necessary for export (Stutz et al. 2000; Zhou et al. 2000; Gatfield et al. 2001; Le Hir et al. 2001). However, CBP80, a component of the CBC, has been implicated in mediating contacts between transport receptors and mRNA (Hamm and Mattaj 1990; Cheng et al. 2006). Aly/REF has been found to be loaded in a splicing-dependent manner as part of the EJC (Le Hir et al. 2001). Complex regulation of Aly/REF links mRNA export to cell cycle progression (Zhou et al. 2000; Okada et al. 2008; reviewed in Okada and Ye 2009), and Yra1 has been linked to S-phase entry (Swaminathan et al. 2007). Conversely, in Drosophila and Caenorhabditis elegans, mRNA export is Aly/REF independent (Gatfield and Izaurralde 2002; Longman et al. 2003).
The transcription-export complex (TREX) exemplifies the tight coupling between transcription and export. Recruitment of the TREX complex is coupled to the transcription machinery in yeast but associated with the splicing machinery in metazoans (Masuda et al. 2005; Cheng et al. 2006). In metazoans, the TREX complex was initially thought to be part of the EJC (Gatfield et al. 2001; Le Hir et al. 2001) but is now known to be recruited to the 5′ end independently, in a splicing- and 5′ cap-dependent manner (Masuda et al. 2005; Cheng et al. 2006). In yeast, molecular machinery serves to dock transcribing genes to nuclear pores, resulting in “gene gating.” SAGA, involved in histone modification and DNA remodeling, has also been shown to interact with transcription factors (Grant et al. 1997; Larschan and Winston 2001; reviewed in Daniel and Grant 2007). Components of SAGA and TREX2 complexes are required for gene gating of the GAL locus following activation (Cabal et al. 2006; Wilmes and Guthrie 2009). Sus1, one such component, promotes NPC docking and export (reviewed in Blobel 1985; Jani et al. 2009; Klockner et al. 2009; Wilmes and Guthrie 2009). Sus1 is also involved in transcription elongation and may prevent harmful DNA:RNA hybrids during transcription (Klockner et al. 2009). Interactions have been shown for both TREX and TREX2 complexes with Swt1, an endonuclease that interacts with nuclear pores and is involved in mRNA quality control (Skruzny et al. 2009). Although major pathways for mRNA export have been identified, additional pathways are likely to exist, given the interspecies variations and the number of transport factors and cofactors involved. For example, export of certain viral and other mRNAs depends on CRM1 as an export factor (Ohno et al. 2000; reviewed in Dreyfuss et al. 2002; reviewed in Iglesias and Stutz 2008).
The extent to which processing and export might be regulated in a species- and differentiation-dependent manner currently remains unclear. Dynamic regulation of mRNA processing factors is a relatively new question, and it is still unclear to what degree mRNA export is regulated at the level of individual nuclear pores. A dynamic view of how mRNA transitions through the nuclear pore is lacking, but recent developments in mRNA labeling and imaging technology may provide the opportunity to fill this gap in the near future.
As discussed, a wealth of ensemble biochemical studies has provided great molecular and mechanistic insight into mRNA processing, leading to further questions that will require in vivo imaging approaches. Several recent studies highlight the importance of such techniques in gaining quantitative information on fundamental aspects of mRNA processing events, with spatial and temporal resolution. For example, the nucleus is the site of numerous essential cellular processes that are likely coordinated through highly organized and compartmentalized nuclear bodies with distinct functions, as observed using cellular imaging (reviewed in Misteli 2007). Nuclear bodies lack membranes and are highly dynamic yet steady-state structures. This represents a more advanced view than the original idea of nuclear factories for transcription and replication, in which core components localized to static discrete foci (reviewed in Iborra et al. 1996). This environment facilitates gene expression at multiple levels, including chromatin accessibility, transcriptional control, integration of processing events, stringent quality control, export of mature mRNPs to the cytoplasm and translation. Recent work suggests that Cajal body formation does not require specific gene loci and can initiate from any Cajal body protein, supporting a self-assembling model for nuclear bodies (Kaiser et al. 2008; reviewed in Misteli 2008). Additionally, the long-standing question of whether differentially spliced transcripts recruit distinct sets of basal pre-mRNA splicing factors has recently been addressed. Quantitative single-cell imaging has provided the first in vivo evidence of differential association of pre-mRNA splicing factors with alternatively spliced transcripts, supporting a stochastic model of alternative splicing, which would predict that combinatorial sets of splicing factors contribute to splicing outcome (Mabon and Misteli 2005).
FRET and FLIM techniques have been applied to investigate interactions between SR proteins and splicing components. Unlike biochemical methods, FRET can be used to study interactions in living cells, with minimal perturbation to the highly structured and dynamic nuclear environment (reviewed in Wouters et al. 2001; Ellis et al. 2008). Such an approach has revealed individual interactions that occur in the presence of RNA Pol II inhibitors, suggesting they are not exclusively cotranscriptional. FRAP analysis suggests that processing factors are highly dynamic and are exchanged between nuclear bodies and other nuclear locations in a matter of seconds (Phair and Misteli 2000; reviewed in Lamond and Spector 2003). Both FRET and FRAP have been used to study the localization and association of SF1 and U2AF. The mobility of these proteins is correlated with their ability to interact with each other, and they are believed to interact in what are described as extraspliceosomal complexes that form before and persist after spliceosome assembly (Rino et al. 2008). The development of real-time, single-molecule imaging techniques provides an especially exciting and promising opportunity to probe in vivo realities, reconciling molecular and mechanistic details within a kinetic and spatial context. One such example involves single-particle tracking of U1 snRNP within the nucleus (see Fig. 5), revealing both mobile and transiently immobile states (Grunwald et al. 2006). Single-molecule imaging makes the range of mobility states over a population of molecules immediately apparent, and changes in the behavior of any individual molecule during observation can be assessed. A major lesson from these studies is that “the mobility” of a given molecule more likely reflects a mixed population of different states than a single value.
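The idea that a single trajectory can contain both mobile and transiently immobile states can be sketched as follows. This is a minimal illustration, not the analysis used in the cited tracking studies: it splits a two-dimensional track into non-overlapping windows, estimates an apparent diffusion coefficient in each window from the mean squared single-frame step, ⟨Δr²⟩ / (4·Δt), and labels the window by comparing that value to a threshold. The track coordinates, frame interval, window length and threshold are all hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) position in µm


def apparent_d(track: Sequence[Point], dt: float) -> float:
    """Apparent 2D diffusion coefficient (µm²/s) from single-frame steps."""
    if len(track) < 2:
        raise ValueError("need at least two positions")
    if dt <= 0:
        raise ValueError("frame interval must be positive")
    squared_steps = [
        (x1 - x0) ** 2 + (y1 - y0) ** 2
        for (x0, y0), (x1, y1) in zip(track, track[1:])
    ]
    return sum(squared_steps) / len(squared_steps) / (4.0 * dt)


def label_windows(track: Sequence[Point], dt: float,
                  window: int, threshold: float) -> List[str]:
    """Label consecutive windows of `window` steps as mobile or immobile."""
    if window < 1:
        raise ValueError("window must contain at least one step")
    labels = []
    # A window starting at index `start` uses points start .. start + window.
    for start in range(0, len(track) - window, window):
        segment = track[start:start + window + 1]
        state = "mobile" if apparent_d(segment, dt) >= threshold else "immobile"
        labels.append(state)
    return labels


# Hypothetical track: four small jitter steps, then four larger steps.
example_track: List[Point] = [
    (0.00, 0.00), (0.01, 0.00), (0.00, 0.00), (0.01, 0.00), (0.00, 0.00),
    (0.20, 0.00), (0.20, 0.20), (0.00, 0.20), (0.00, 0.40),
]

states = label_windows(example_track, dt=0.02, window=4, threshold=0.05)
print(states)
```

Averaging the apparent diffusion coefficient over the whole track would give one intermediate number that describes neither segment, which is why a per-window or per-state analysis is the more informative choice in this sketch.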
This work was supported by National Institutes of Health grant EB2060 to R.H.S. and a DFG fellowship (DG 3388) to D.G. The authors would like to thank S. J. Orenstein, Drs. A. Joseph and V. de Turris for critical reading of the manuscript.
Editors: Tom Misteli and David L. Spector
Additional Perspectives on The Nucleus available at www.cshperspectives.org