Genome-wide analyses of metazoan transcriptomes have revealed an unexpected level of mRNA diversity that is generated by alternative splicing. Recently, regulatory networks have been identified through which splicing drives dynamic remodeling of the transcriptome to promote physiological changes, which involve robust and coordinated alternative splicing transitions. The regulation of splicing in yeast, worms, flies and vertebrates affects a variety of biological processes. The functional classes of genes that are regulated by alternative splicing include both those with widespread homeostatic activities and genes with cell-type-specific functions. Alternative splicing can drive determinative physiological change or can have a permissive role by providing mRNA variability that is utilized by other regulatory mechanisms.
One of several recent advances toward understanding regulated gene expression is the discovery of the high level of mRNA complexity that is generated by alternative splicing within metazoan transcriptomes. Recent estimates based on RNA-Seq are that 90%, 60%, and 25% of genes in humans, Drosophila melanogaster, and C. elegans, respectively, undergo alternative splicing1-5. Alternative splicing is the most prominent of several mechanisms generating mRNA structural complexity that also include alternative transcription initiation, editing, and alternative polyA site selection6-7. The predicted outcomes of this complexity are: extensive proteome diversity; introduction of premature termination codons (PTCs), which causes mRNA down-regulation due to nonsense mediated decay (NMD); and variability in mRNA untranslated regions (UTRs), which affects cis-acting elements that mediate regulation of mRNA translation efficiency, stability and localization6,8-9.
The prevalence of alternative splicing raises questions about its biological significance. What fraction of the multiple mRNA isoforms expressed from each of ~19,000 alternatively spliced human genes has a functional impact? Much of the mRNA diversity that is observed includes low abundance transcripts that arise from alternative splicing events that are not conserved, suggesting a high level of stochastic noise10-11. While splicing events that undergo transitions during a physiological change are suggestive of functional consequences, most splicing transitions do not completely switch mRNA isoforms but rather produce a change in the ratios of the isoforms expressed. Often, the changes are rather small and it is difficult to discern whether there are functional consequences. Even for robust transitions, detailed experimental analysis of individual isoforms is required to ascertain whether the transition is a determinative event or a “fine tuning” event. On the other hand, it is clear that alternative splicing produces determinative biological effects exemplified by longstanding examples such as sex determination in D. melanogaster12-13, production of functionally distinct peptide hormones in mammals14, and the meiotic developmental program in budding yeast15. The fraction of alternative splicing that has a biological impact is currently difficult to estimate. It is likely that a large fraction of mRNA diversity has no detectable function within an organism, although some of it provides fodder for the derivation of functional splice variants on an evolutionary time scale16-17. However, the new awareness of alternative splicing prevalence has resulted in increased investigation and identification of a rapidly growing number of physiologically important splicing events.
Here we review networks of regulated alternative splicing that are operative during development, differentiation, or in response to cell stress in a variety of organisms. The mechanisms of splicing regulation have been covered in several excellent reviews6,8,18-22. Our focus is on the physiological outcomes of alternative splicing transitions. We also address broad questions relating to alternative splicing regulation: what are the critical splicing transitions that are relevant to a developmental program or physiological response? Which RNA binding proteins are determinative for splicing transitions? How are the activities of the regulators modulated to mediate the transition (e.g., change in protein abundance, intrinsic activity, or intracellular localization)? What are the “upstream” signaling pathways that control the activities of these splicing regulators? How are the networks integrated with parallel transcriptional and post-transcriptional regulatory programs? And what are the most critical “downstream” functions performed by the regulated splicing transitions? We discuss specific examples that illustrate the range of scenarios in which splicing transitions have a potential impact and the principles of the regulatory systems that control these transitions.
Alternative splicing is primarily regulated by RNA binding proteins that bind pre-mRNAs near variably used splice sites and modulate the efficiency of their recognition by the basal splicing machinery (spliceosome). Large scale quantification of alternative splicing combined with genome-wide identification of in vivo binding sites of splicing regulators provides an unprecedented global view of splicing regulatory networks (Box 1)23. The large-scale analysis of alternative splicing has recently progressed from comparisons of relatively static cell populations to transitions in mRNA complexity that are associated with physiological change. The results reveal that mRNA structural complexity is not only extensive but is also highly dynamic. For example, recently generated transcriptome datasets for 27 stages of D. melanogaster development1 and 17 growth stages and conditions in C. elegans4-5 revealed that a large fraction of alternative splicing events (>60% and 30%, respectively) undergo developmental changes, often in coordinated sets that are suggestive of co-regulated networks. Studies using transgenes expressing fluorescent proteins to indicate different splicing outcomes in C. elegans have revealed developmental transitions in real time and provide a genetic approach to identify the regulators24-25. Large-scale analyses of alternative splicing in mammals have identified coordinated splicing within genes enriched for specific functions in different tissues (Table 1).
A splicing regulatory network can be defined as the set of alternative splicing events that are directly regulated by an individual RNA binding protein. RNA binding proteins recognize preferred 5-8 nucleotide sequence motifs located either in the regulated exon or in the flanking introns, commonly within 300 nucleotides of the regulated exon or the upstream or downstream constitutive exons147 (panel a). Approximately fifty mammalian RNA binding proteins that act as auxiliary splicing regulators, separate from the basal splicing machinery, have been characterized and shown to directly regulate splicing by binding to pre-mRNAs8,148. Splicing events that are sensitive to the loss- or gain-of-function of the splicing regulator (via RNAi, genetic knockout, or overexpression) are identified using large scale analyses such as RNA-Seq (represented in the diagram) or splicing sensitive microarrays149. Hundreds of splicing events can be sensitive to changes in the level of a single splicing regulator. A portion of the responsive events are directly regulated by protein-RNA binding and others change due to secondary effects. To identify direct targets, bound protein is covalently linked to the RNA in vivo by UV crosslinking, followed by immunoprecipitation (CLIP) and identification of specific binding sites by high throughput sequencing150. While CLIP produces both false negatives and false positives151, it is a highly effective screen for target identification. The massive data sets from these assays are managed and extensively analyzed computationally for genome-wide identification of splicing events that are both sensitive to the splicing regulator levels and associated with local in vivo binding sites152.
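The final computational step described above, keeping only events that both respond to the regulator and carry a nearby in vivo binding site, can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the cited studies: the `ExonEvent` and `ClipSite` records, the `direct_targets` function, and the 0.1 inclusion-change cutoff are all hypothetical, and the sketch ignores strand and the flanking constitutive exons. Only the 300-nucleotide window comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExonEvent:
    """A regulated exon and its response to perturbing the regulator (hypothetical record)."""
    gene: str
    chrom: str
    start: int          # exon start, 0-based inclusive
    end: int            # exon end, exclusive
    delta_psi: float    # change in fractional exon inclusion after knockdown/overexpression


@dataclass(frozen=True)
class ClipSite:
    """A single crosslink site recovered by CLIP (hypothetical record)."""
    chrom: str
    pos: int


def direct_targets(events, clip_sites, min_abs_delta_psi=0.1, window=300):
    """Return events that are regulator-sensitive AND have a CLIP site in or near the exon.

    min_abs_delta_psi is an assumed cutoff; window=300 follows the distance quoted in the text.
    """
    sites_by_chrom = {}
    for site in clip_sites:
        sites_by_chrom.setdefault(site.chrom, []).append(site.pos)

    hits = []
    for ev in events:
        if abs(ev.delta_psi) < min_abs_delta_psi:
            continue  # not sensitive to the regulator
        lo, hi = ev.start - window, ev.end + window
        if any(lo <= pos < hi for pos in sites_by_chrom.get(ev.chrom, [])):
            hits.append(ev)  # sensitive and locally bound: candidate direct target
    return hits


# Toy data: geneA is sensitive and bound, geneB is sensitive but unbound,
# geneC is bound but insensitive.
events = [
    ExonEvent("geneA", "chr1", 1000, 1100, -0.35),
    ExonEvent("geneB", "chr1", 9000, 9120, 0.40),
    ExonEvent("geneC", "chr2", 500, 560, 0.02),
]
sites = [ClipSite("chr1", 1250), ClipSite("chr2", 510)]
print([ev.gene for ev in direct_targets(events, sites)])
```

Events that pass the sensitivity cutoff but lack a nearby site (geneB in the toy data) correspond to the secondary effects mentioned above, although a CLIP false negative would look the same.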
High throughput splicing analyses applied to normal physiological transitions have demonstrated roles for alternative splicing networks in a variety of cellular responses. In mammals, the combined results from RNA-Seq/splicing microarrays, CLIP, genetic knockouts, and computational analysis have identified networks of tissue specific as well as developmental alternative splicing programs regulated by specific RNA binding proteins (for example, 26,109,153). Despite the large number of splicing events dependent on individual RNA binding proteins, careful analysis can reveal specific features of a complex knockout phenotype that are due to loss of individual alternative splicing events107-108 (panel b).
Individual splicing events are regulated by cooperative as well as antagonistic effects of multiple RNA binding proteins. The multiple inputs integrate the effects from diverse external cues to promote an appropriate splicing response (panel c). Combining experimental data from multiple splicing regulators with computational analyses of large numbers of features has been used to define splicing codes with predictive capabilities for either cell-specific splicing or responsiveness to a specific splicing regulator102,147.
There are several emerging themes regarding large-scale alternative splicing transitions during periods of physiological change. First, any given physiological change associated with a transcriptional transition is likely to have a co-integrated post-transcriptional response, including coordinated alternative splicing transitions. Second, as noted above, subsets of splicing transitions undergo distinct temporally coordinated transitions that are suggestive of co-regulation by different sets of splicing factors. Third, a large fraction of the splicing transitions associated with a physiological change are conserved. For example, more than 40% of splicing transitions observed during mouse heart development or skeletal muscle differentiation were conserved in birds, not only in terms of the alternatively spliced region but also with regard to splicing pattern, the direction of the changes, and timing of the transitions26-27. For most of these events the coding potential is also conserved such that the homologous protein isoforms undergo transitions with the same timing relative to birth or hatching, strongly suggesting functional significance. These results are in stark contrast to genome-wide comparisons in which fewer than 20% of alternative splicing events were conserved between human and mouse28-29, emphasizing the role for splicing in physiological transitions. Fourth, genome-wide studies in which transitions in both alternative splicing and mRNA expression levels were analyzed simultaneously identified two separate gene sets26-27,30-31. One gene set showed changes in mRNA levels without a difference in the mRNA splice variants expressed. The second set showed a change in the mRNA splice variants without a change in total expression. While splicing and transcriptional regulation are linked18, the results indicate that different genes can be regulated primarily either at the level of splicing or at the level of expression.
This has expanded the view of regulated gene expression to include transitions in the complexity of the mRNA and protein isoforms that are expressed from individual genes, as well as changes in overall gene output.
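The distinction between genes that change in total output and genes that change in isoform ratio can be made concrete with a small sketch. It assumes a "percent spliced in" style measure (the fraction of transcripts that include an alternative region) and two illustrative cutoffs, a two-fold change in expression and a 0.2 change in inclusion. The function names and both cutoffs are assumptions for illustration and are not taken from the cited studies.

```python
import math


def psi(inclusion_reads, skipping_reads):
    """Fraction of reads supporting inclusion of the alternative region."""
    total = inclusion_reads + skipping_reads
    if total == 0:
        raise ValueError("no reads cover this splicing event")
    return inclusion_reads / total


def classify_transition(expr_before, expr_after, psi_before, psi_after,
                        min_abs_log2_fc=1.0, min_abs_delta_psi=0.2):
    """Label a gene by whether its total expression, its isoform ratio, or both changed.

    Both thresholds are assumed values chosen only for illustration.
    Expression values must be positive (e.g. normalized read counts).
    """
    log2_fc = math.log2(expr_after / expr_before)
    expression_changed = abs(log2_fc) >= min_abs_log2_fc
    splicing_changed = abs(psi_after - psi_before) >= min_abs_delta_psi
    if expression_changed and splicing_changed:
        return "both"
    if expression_changed:
        return "expression"
    if splicing_changed:
        return "splicing"
    return "neither"


# A gene whose output rises four-fold with a stable isoform ratio,
# and a gene whose isoform ratio switches at near-constant output.
print(classify_transition(10.0, 40.0, psi(50, 50), psi(55, 45)))
print(classify_transition(10.0, 11.0, psi(20, 80), psi(80, 20)))
```

Under these assumptions the first gene falls in the expression-regulated set and the second in the splicing-regulated set, mirroring the two gene sets described above.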
Meiosis in the budding yeast Saccharomyces cerevisiae is driven by a well-characterized transcriptional program recently found to be intricately linked with multiple splicing-regulatory programs. Only 5% of genes in budding yeast contain introns (290 of 6000 genes). However, introns are enriched in highly expressed genes and more than one quarter of the ~38,000 RNAs transcribed per hour during vegetative growth are spliced32. With a few exceptions33, alternative splicing in yeast is limited to ‘splice versus don’t splice’ decisions that relate to intron retention in single intron genes. Large-scale analyses using tiling and splicing sensitive microarrays have determined that 45 intron-containing genes are inefficiently spliced during vegetative growth34-35. These include 13 of the 20 intron-containing meiotic genes that undergo efficient splicing specifically during the meiotic cycle34-35. Meiosis-specific alternative splicing in budding yeast has been known for two decades15, but recent investigations have revealed at least three separate but overlapping meiotic splicing regulatory programs35-36. The best-characterized is regulated by MER1, which encodes an RNA binding protein that is transcriptionally induced during the initiation of the meiotic cycle37. The primary function of Mer1p is to activate splicing of four single-intron genes and induce their expression during meiosis35 (Figure 1). The Mer1p target genes perform diverse functions including chromosome pairing, recombination, and cell cycle control. Inefficient splicing of each gene during vegetative growth is due to suboptimal splice sites at the intron-exon boundaries. Mer1p binds to an enhancer element within the intron of each of the four genes and promotes spliceosome assembly by direct interactions with Nam8p, a spliceosomal component38-39.
The MER1 splicing network serves a critical role in the transition from early meiotic genes, transcriptionally regulated by UME6, to the middle meiotic genes, which are regulated by NDT80 (Figure 1)35.
Analysis of NAM8 null strains identified intron retention of the four MER1-dependent genes, consistent with the mechanism of Mer1p activation, as well as two genes not dependent on MER1 for meiosis-specific splicing35,40. Another study found two genes for which meiosis-specific splicing requires TGS1 which synthesizes the specialized 2,2,7-trimethylguanosine cap at the 5′ end of small nuclear RNAs (snRNAs), essential components of the spliceosome41. Splicing for one TGS1-dependent gene is also NAM8-dependent, indicative of multiple overlapping networks (Figure 1). Unlike MER1, NAM8 and TGS1 expression levels do not substantially change during meiosis, suggesting that while required, these genes are not the primary determinants of meiosis-specific splicing. It is also unclear what regulates meiosis-specific splicing of the six remaining genes that are not dependent on MER1, NAM8, or TGS1. Even the relatively straightforward MER1 splicing regulatory network contains multiple splicing sub-networks and is interlinked to the meiotic transcriptional program. As such, it is an instructive paradigm for metazoan splicing regulatory programs.
A high fraction of mammalian apoptotic regulators, including death receptors, adapters, caspases and caspase targets, are alternatively spliced to produce dramatically different biological outcomes42. In particular, splicing of the Bcl2 family of apoptosis regulators (Bcl2, Bcl-x, and Mcl1) yields long (L) and short (S) isoforms to provide anti-apoptotic versus pro-apoptotic functions, respectively42. A recent genome-wide RNAi screen using Bcl-x and Mcl1 splicing reporters established coordinated alternative splicing as an integral component of cell cycle control. The study found that knockdown of 52 genes induced pro-apoptotic splicing of both reporters. The list of genes included a network of factors linked to aurora kinase A, a central regulator of mitosis43. Loss of aurora kinase A promoted posttranslational degradation of serine-arginine splicing factor 1 (SRSF1), a member of the SR protein family of splicing regulators44. Results from CLIP analysis (Box 1) showed that SRSF1 directly binds Bcl-x and Mcl1 RNAs, and revealed additional endogenous apoptotic splicing events that shifted toward pro-apoptotic splicing upon cell cycle inhibition. Cell cycle inhibitors were previously known to induce pro-apoptotic splicing; however, a critical new finding from this study was that the splicing response preceded mitotic arrest, indicating that the splicing change is induced in parallel with, rather than as a secondary response to, mitotic arrest. The indications are that SRSF1, which promotes anti-apoptotic splicing patterns, is integrated in decisions of whether to continue through the cell cycle or undergo apoptosis. SR proteins have an established relevance to cell cycle control and oncogenesis45-47, including the recent demonstration that an SR-related protein (SON) is required for efficient splicing of cell cycle regulated genes48.
Furthermore, the expression of all SR protein genes is maintained under strict homeostatic control by an ancient mechanism involving alternative splicing coupled with NMD (AS-NMD) (Box 2).
Alternative splicing has a particularly broad physiological impact by maintaining homeostatic levels of splicing regulators as well as spliceosome components through coupling of alternative splicing with nonsense mediated decay (AS-NMD)154-157. NMD is a conserved and multifunctional surveillance mechanism to degrade mRNAs containing premature termination codons (PTCs)9. A general rule for mammalian cells is that spliced mRNAs containing a termination codon located >50 nucleotides upstream of the position of the last intron are degraded by NMD158. In AS-NMD, alternative splicing controls an NMD signal by insertion or removal of an mRNA segment that either contains a PTC or introduces a downstream PTC due to a frameshift. The majority of splicing regulators affected by AS-NMD autoregulate homeostatic levels by promoting a splicing pattern that results in NMD and down-regulation. Splicing regulators also use AS-NMD for cross regulation such as the repression of neuron-specific nPTB in non-neuronal cells by its paralogue, PTB68.
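The general rule stated above lends itself to a short worked example. The sketch below treats a spliced mRNA as a list of exon lengths, locates the last exon-exon junction (the former position of the last intron), and flags the transcript when the termination codon ends more than 50 nucleotides upstream of it. The function names and the toy transcripts are hypothetical, and real NMD susceptibility depends on additional factors that this rule of thumb does not capture.

```python
def last_junction_position(exon_lengths):
    """mRNA coordinate of the last exon-exon junction, or None for an intronless transcript."""
    if len(exon_lengths) < 2:
        return None
    return sum(exon_lengths[:-1])


def predicted_nmd_target(exon_lengths, stop_codon_end, min_distance=50):
    """Apply the >50-nucleotide rule of thumb.

    stop_codon_end is the mRNA coordinate (0-based, exclusive) of the last
    nucleotide of the termination codon. Returns True when the stop codon lies
    more than min_distance nucleotides upstream of the last junction.
    """
    junction = last_junction_position(exon_lengths)
    if junction is None:
        return False  # no spliced junction, so the rule does not apply
    return junction - stop_codon_end > min_distance


# Three-exon mRNA with the normal stop codon in the last exon.
normal = [200, 150, 300]
print(predicted_nmd_target(normal, stop_codon_end=500))

# Same gene with an 80-nucleotide alternative exon inserted that carries a PTC.
with_poison_exon = [200, 80, 150, 300]
print(predicted_nmd_target(with_poison_exon, stop_codon_end=260))
```

In the second transcript the inserted exon places a termination codon 170 nucleotides upstream of the last junction, which is the situation AS-NMD exploits to switch a transcript from stable to degraded.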
The genes encoding the SR and hnRNP protein families present a striking example of homeostatic control by AS-NMD with pervasive biological and evolutionary implications156-157. SR and hnRNP proteins are a long-standing paradigm for antagonistic splicing regulators affecting multiple and diverse alternative splicing events159. All SR protein genes and many hnRNP protein genes utilize AS-NMD for negative autoregulatory feedback. Strikingly, the regions of the genes that are critical for the AS-NMD “on-off” switch are within ultraconserved elements. Furthermore, autoregulation of the genes within each family promotes negative feedback that is “polar” with regard to the effects of each family on splicing pattern157 (Figure): in the genes for SR proteins, which predominantly (though not exclusively) activate splicing, activation of an alternative splicing event results in NMD-mediated down-regulation while in genes of hnRNP proteins, which are most often splicing repressors, repression of a splicing event results in NMD-mediated down-regulation. The full significance of this ancient regulatory network remains to be elucidated; however, the implication is that maintaining the appropriate levels of both protein families is a critical component of cellular homeostasis for a broad spectrum of cell types and across metazoan phylogeny. Consistent with this, alterations of each protein family have been associated with disease-causing aberrant splicing45,160-161. The figure is reproduced from reference157.
Embryonic stem cells (ESCs) are pluripotent cells that can proliferate indefinitely while retaining the capacity to differentiate into the three germ layers. A number of genome-wide studies have found specific transcriptome changes during differentiation of ESCs into distinct lineages, including the contributions of alternative splicing to cell-fate decisions and pluripotency49-54. Splice isoform diversity is high in undifferentiated ESCs, decreases upon their differentiation49-50 and is enriched in cell-cycle, pluripotency, signaling and general metabolic pathways49,51-52,55-56 (Table 1).
Alternative splicing has a central impact in stem cell biology by affecting the core pluripotency factors, which produce functionally diverse splice isoforms that have determinative roles on cell state. The pluripotent state of ESCs is maintained by a core set of transcription factors (Oct4, Nanog, Sox2, and Tcf3) that induce and cross-regulate a network of target genes involved in self-renewal and pluripotency57. The Oct4 gene encodes a POU domain transcription factor and produces functionally distinct protein isoforms at specific stages of ESC differentiation58,59. Three Oct4 isoforms have been identified58: Oct4A, Oct4B and Oct4B1. Oct4A contains N- and C-terminal transcription transactivation domains and a POU domain; Oct4B has a different N-terminal transactivation domain from that of Oct4A and a shortened POU domain. While Oct4A target genes are responsible for stemness57,59, Oct4B cannot sustain ESC self-renewal and instead targets genes responsive to cell stress58,60. The expression and function of Oct4B1 remain to be determined58. The Oct4 paralogue, Oct2, is also alternatively spliced to generate isoforms that can either activate or repress neuronal differentiation. The Oct2.4 isoform lacks the C-terminal transcription transactivation domain and suppresses neuronal differentiation61. During differentiation, multiple splice variants containing the C-terminal transactivation domain are expressed, and one of these isoforms in particular, Oct2.2, induces neuronal differentiation61.
Alternative splicing of several other genes in addition to the core pluripotency factors is linked to stem cell self-renewal and lineage specification62-66, yet little is known about the regulatory factors that dictate these splicing outcomes. A group of recent studies provides important insights into how splicing factors may influence neural differentiation from ESCs or neural progenitors. Exons that are alternatively spliced in human ESCs52 are associated with conserved binding sites of the RNA binding protein Fox1 homologue (Rbfox) family of splicing regulators. This result is consistent with findings that Rbfox2 is highly expressed, regulates a large splicing network, and is essential for viability of human ESCs but not differentiated stem cells or transformed cell lines64. Another splicing factor, polypyrimidine tract binding protein (PTB), prevents differentiation of a proliferative neural cell line by repressing expression of its neuronal homologue nPTB67-68. It does this by directly promoting skipping of neuronal PTB (nPTB) exon 10 in non-neuronal cells, which introduces a PTC resulting in nPTB mRNA degradation by NMD. Neural-specific expression of nPTB is due to silencing of PTB by a neuronal-enriched miRNA69, miR-124, establishing a first example of a regulatory hierarchy from miRNAs to splicing networks. A neuron-specific SR protein, nSR100, participates in this network by promoting the inclusion of nPTB exon 10 to increase nPTB expression, as well as functioning as a co-regulator with nPTB, in a complex regulatory relationship21,70-71.
The identification of lineage-specific splice variants, their cis-acting elements and their trans-acting regulators will refine our understanding of how alternative splicing integrates with the transcriptional and post-transcriptional networks of ESCs. It will be particularly interesting to determine whether in addition to transcription72, alternative splicing is also re-set when somatic cells are reprogrammed to pluripotency.
Phenotypic conversions of cells between epithelial and mesenchymal states, known as epithelial-mesenchymal (EMT) and mesenchymal-epithelial (MET) transitions, are fundamental to organ morphogenesis and tissue remodeling during embryonic development73. These trans-differentiation events are required for physiological responses to injury and wound healing in adult tissues and play roles in pathological responses such as fibrosis and metastasis74. At a cellular level, EMT is characterized by a loss of epithelial features, including cell adhesion and polarity, and acquisition of mesenchymal features, such as motility and invasiveness75.
Alternative splicing plays a determinative role in EMT through regulation of multiple splicing events and utilizing several regulatory proteins76-79. In a high fraction of breast cancer cell lines, SRSF1 up-regulation triggers EMT through alternative splicing of the Ron tyrosine kinase receptor proto-oncogene to produce a constitutively active, pro-invasive isoform, RonΔ76. Induction of EMT via ERK1/2 activation80 proceeds in part through phosphorylation of its substrate SAM68, which then upregulates SRSF1 by inhibiting AS-NMD-mediated downregulation (Figure 2a).
Downregulation of the RNA binding proteins, Epithelial Splicing Regulatory Proteins 1 and 2 (ESRP1 and ESRP2), is also determinative for physiological EMT splicing changes. Splicing-sensitive microarrays were used to identify nearly 100 splicing events that displayed reciprocal changes in epithelial cells depleted of ESRP1/ESRP2 or mesenchymal cells expressing ectopic ESRP177. A large fraction of these contained an ESRP-binding motif that defined a position-dependent effect on splicing. The ESRP target genes were enriched in functions that support EMT, such as actin cytoskeleton, cell adhesion, cell migration, and cell polarity (Table 1). Importantly, sustained ESRP1/ESRP2 knockdown resulted in global epithelial to mesenchymal splicing transitions and EMT-like phenotypic changes. Among ESRP targets, the splice isoforms of the CD44 cell adhesion molecule, in particular, participate in multiple EMT-relevant functions including proliferation, adhesion, and migration (Figure 2b)81.
The fibroblast growth factor plasma membrane receptor 2 (FGFR2) contains two mutually exclusive alternative exons, IIIb and IIIc, which produce distinct ligand binding specificities in epithelial and mesenchymal cells82 to ensure appropriate signaling for different mesenchymal-epithelial interactions during organogenesis83. The studies of FGFR2 in EMT identified critical interplay between the histone code, adapter proteins, and alternative splicing regulators to generate cell-type specific splice isoforms (Figure 2c)82-84.
The vertebrate heart is the first organ to both form and function in the embryo. Cardiac morphogenesis is complete by embryonic day 14.5 (E14.5) in mice, after which growth occurs primarily via hypertrophic (cell growth) rather than hyperplastic (cell proliferation) pathways (Figure 3a)85-86. The first four weeks after birth are characterized by extensive remodeling of the heart in response to the physiological demands of rapid growth and increased activity of the animal.
A large scale transcriptome analysis using splicing sensitive microarrays identified alternative splicing transitions during late embryonic and postnatal mouse heart development26. A timecourse revealed three sets of temporally coordinated splicing transitions26 (Figure 3b). Remarkably, greater than 40% of the splicing transitions are conserved between mammalian and avian species in terms of the pattern, direction, timing, and coding potential of splicing changes, indicative of functionally important embryonic to adult protein isoform transitions. The genes exhibiting conserved splicing transitions were enriched for functions consistent with heart remodeling (Table 1). Interestingly, different gene sets were found to be regulated by two distinct mechanisms: changes in mRNA expression levels, or isoform switching without changes in mRNA expression levels26.
Computational analysis identified significantly enriched and conserved pentamer motifs near the developmentally regulated exons. A subset of motifs matched binding sites for known families of splicing regulators, such as CELF, MBNL, Rbfox and PTB. Consistent with roles in regulating postnatal splicing transitions, levels of the two CELF paralogues expressed in heart, CELF1 and CELF2, decrease more than 10-fold during postnatal development while MBNL increases 4-fold26,87-88. Importantly, CELF gain-of-function and MBNL loss-of-function in adult mouse heart caused over half of the postnatal splicing transitions to revert to the embryonic pattern, consistent with a determinative role for these protein families in driving the postnatal splicing transitions. CLIP analysis will be required to identify the direct targets of these splicing regulators.
MiRNA-mediated regulation of multiple splicing regulators was revealed by deletion of Dicer in adult cardiomyocytes in mice, which resulted in rapid induction of a subset of splicing regulators (CELF, PTB and Rbfox) and re-expression of large numbers of fetal mRNA splice variants89. These findings identify a regulatory hierarchy in which miRNAs and alternative splicing act to coordinate the switch from fetal-to-adult gene expression programs (Figure 3c).
Transgenic overexpression (CELF1), genetic knockout (SRSF1, SRSF2, SRSF10) or overexpression of a dominant negative mutant (CELFΔ) of splicing regulators in heart has a severe impact on heart physiology26,90-94. Interestingly, heart-specific deletion of either SRSF1 or SRSF2 beginning around E8.5, had a delayed effect on mouse postnatal heart development with most animals exhibiting splicing abnormalities by two weeks of age and developing dilated cardiomyopathy within eight weeks (Figure 3)92-93. SRSF1 is essential for cell viability in culture95 and a constitutive SRSF1 or SRSF2 knockout is embryonic lethal92-93. The finding that an early embryonic knockout in cardiomyocytes has no obvious phenotype until the postnatal period presents a paradox with regard to temporal and cell-specific requirements. This result illustrates the functional versatility of individual SR proteins as well as the dynamic nature of the postnatal period. Constitutive knockout of a third SR protein, SRSF10, resulted in lethality from mid-gestation until birth due to several cardiac defects implicating the protein in multiple critical splicing events (Figure 3)94. Loss of each of the three individual SR proteins resulted in misregulation of only a few transcripts that were unique to each protein, demonstrating an unexpected level of target specificity.
In addition to directly controlling biological outcomes, alternative splicing can also provide “biological options” for a determinative biological response by other mechanisms. Two examples, one from D. melanogaster and the other from mammals, illustrate how the production of multiple protein isoforms can be used by subsequent mechanisms to produce a biological outcome.
The neuronal circuitry of the brain is bewilderingly complex. Each neuron connects with thousands of other neurons to establish a functioning network. Two features of neuronal architecture that ensure broad coverage of a receptive field are widely separated neurites (axons and dendrites), known as arborization, and non-overlapping arrangements of adjacent neurons. The D. melanogaster Down syndrome cell adhesion molecule 1 (Dscam1) gene encodes a cell adhesion molecule that plays a critical role in both features, using splicing to generate unique cell surface codes. The Dscam1 gene is an immunoglobulin (Ig) superfamily member that produces >19,000 extracellular domain variants through alternative splicing, unlike the vertebrate gene which is not alternatively spliced. Initially the Dscam1 code was thought to direct circuitry assembly, providing a specific identity to individual neurons to establish specific connections. In fact, the role of Dscam1 is quite different. Individual neurons express 14-50 Dscam1 splice variants that are generated stochastically rather than by an invariant signature for specific neurons96-97. Homophilic binding of identical Dscam1 isoforms is highly specific and produces intracellular signaling that results in neurite repulsion. This response prevents intraneural connections, promotes broad arborization and produces extensive non-overlapping receptive fields. Loss of Dscam1 results in bundling and overlapping of neurites, due in part to repulsion failure. Elegant studies using homologous recombination replaced the wild type Dscam1 allele with modified genes able to produce from 1 to 4752 randomly generated Dscam1 isoforms.
These sets of experiments provided several conclusions including that: (i) a single Dscam1 isoform is sufficient for homophilic repulsion, (ii) the specific isoform expressed is not important since multiple isoforms tested had the same effect, and (iii) more than 1000 isoforms are required within a neuronal population to produce normal neuronal patterning98-99. In contrast to a model in which neuron-specific expression of individual Dscam1 isoforms directs neuronal connections, the critical feature is that a population of neurons expresses sufficient Dscam1 diversity to prevent inappropriate interactions.
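The dependence on population-level diversity can be illustrated with a back-of-the-envelope calculation. The sketch below is not taken from the cited studies. It assumes that each neuron draws a fixed number of distinct isoforms uniformly at random from the available pool, and it asks how often two neurons would share at least one isoform purely by chance. The function name `share_probability`, the value of 30 isoforms per neuron (chosen from within the 14-50 range reported above), and the pool sizes are assumptions made for illustration only.

```python
from math import comb


def share_probability(pool_size: int, per_neuron: int) -> float:
    """Chance that two neurons share at least one isoform.

    Assumption (hypothetical model): each neuron independently expresses
    `per_neuron` distinct isoforms drawn uniformly from `pool_size` isoforms.
    """
    if per_neuron < 0 or per_neuron > pool_size:
        raise ValueError("per_neuron must be between 0 and pool_size")
    # P(no shared isoform) = C(N - k, k) / C(N, k); comb returns 0 when
    # the second neuron cannot avoid the first neuron's isoforms.
    no_overlap = comb(pool_size - per_neuron, per_neuron) / comb(pool_size, per_neuron)
    return 1.0 - no_overlap


if __name__ == "__main__":
    isoforms_per_neuron = 30  # assumed value within the reported 14-50 range
    for pool in (100, 1000, 4752, 19000):  # illustrative pool sizes
        p = share_probability(pool, isoforms_per_neuron)
        print(f"pool={pool:>6}  P(share at least one isoform)={p:.3f}")
```

Under these assumptions, the chance of overlap falls steadily as the isoform pool grows. The model ignores unequal isoform usage and the degree of overlap actually needed to trigger repulsion, so it indicates a trend rather than a threshold.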
Another example in which expression of different isoforms is not determinative but is critical for the biological outcome is the mouse Roundabout 3 (Robo3) gene100. Two Robo3 protein isoforms generated by alternative splicing act as a binary switch to control targeting of growing commissural neurons to ensure that they cross the mid-line of the spinal cord only once. While the two isoforms are restricted to different sides of the mid-line, the mRNAs are not. Therefore, splicing produces the two mRNA isoforms bilaterally, but a different regulatory mechanism, presumably translation or protein stability, produces spatially restricted expression of the two protein isoforms.
One of the best characterized alternative splicing regulatory networks is controlled by the two RNA binding protein paralogues, Nova1 and Nova2. Expression of both genes is neuron-specific, although Nova1 and Nova2 are reciprocally expressed in different subregions of the brain101. Network analysis that integrated multiple types of data identified approximately 700 splicing events regulated by Nova in mouse brain102. Interestingly, Nova utilizes an activity code that is conserved with its D. melanogaster orthologue, Pasilla, with regard to the binding motif (YCAY) and its positive or negative effect on splicing based on the location of the binding site relative to the exon103. The biological function, however, has diverged over evolutionary time, acquiring different tissue-specific expression in different metazoan lineages. For example, while Nova’s regulatory function is neuron-specific in chordates, its function is gut-specific in starfish and sea urchins104. These analyses suggest that once established, an RNA binding code is extremely stable and can be applied to different functional outputs through evolutionary time103.
Results from Nova null mice indicate that Nova proteins control multiple aspects of brain development, and it is likely that each paralogue controls multiple splicing sub-networks in different neuron subtypes. Another factor likely to affect the range of Nova targets is expression of co-regulators. For example, 15% of Nova targets also contain binding sites for the Rbfox family102, and functional connections that have recently been found to exist between Rbfox and the SR protein family105 could also influence Nova activity. Analysis of the genes containing Nova-regulated splicing events defined functions related to synapse development and activity106. Additional roles for Nova have been teased out through detailed analysis of the complex phenotype of Nova null mice. For example, abnormal cellular layering within the neocortex of Nova2 knockout mice revealed a defect in postmitotic neural migration due to a failed developmental splicing transition of Disabled1 (Dab1), a component of the reelin pathway107. Nova is also required in developing motor neurons for neuron-specific splicing of agrin, which assembles the proper postsynaptic architecture on the skeletal muscle membrane108. These studies demonstrate that splicing regulatory networks of individual RNA binding proteins regulate diverse functions and that, despite the prediction of large numbers of targets (~700 in the case of Nova), detailed analyses can link specific splicing events with individual components of a complex knockout phenotype.
A splicing regulatory network has been recently associated with neuronal electrical homeostasis in the brain by a knockout of Rbfox1109. Rbfox1 mutations have been linked with epilepsy, mental retardation, and autism110-111 and Rbfox1 expression was found to be altered in brain samples from individuals with autistic spectrum disorder (ASD)112. Mis-regulated splicing of Rbfox targets, many of which are important to synaptic function, was observed in ASD brain samples, consistent with altered expression of Rbfox1112.
Alternative splicing is dynamically regulated in response to naturally occurring external stimuli, such as immune cell activation (reviewed in113) and neuronal depolarization. Depolarization of excitable cells in culture by exposure to elevated potassium chloride causes multiple plasma membrane proteins to undergo rapid changes in splicing31,114-115. Because blocking L-type calcium channel activity reverses these splicing changes, a direct role of calcium signaling has been proposed. For example, inclusion of the stress-axis-regulated (STREX) exon in the SLO transcript, which encodes a subunit of calcium and voltage-gated potassium channels, is repressed after depolarization116. The presence of the STREX exon confers higher calcium sensitivity, slowing the channel deactivation. Repression of the STREX exon upon depolarization is mediated by calcium/calmodulin-dependent protein kinase (CaMK)IV through intronic CaMKIV-responsive RNA elements (CaRREs)116. Two types of CaRRE motifs, CaRRE1 and CaRRE2, have been identified close to the SLO and many other depolarization-responsive exons117-118. HnRNP L can regulate splicing via CaRRE1 elements119, while the proteins that bind to CaRRE2 elements are unknown. HnRNP A1 also mediates depolarization-dependent splicing repression by binding to a different motif located close to regulated exons120. Remarkably, depolarization inhibits inclusion of a cassette exon in Rbfox1 to produce an isoform with enhanced nuclear localization, which in turn leads to enhanced splicing activity that promotes the inclusion of Rbfox1 target exons that were repressed due to depolarization121. Thus, separate mechanisms may execute splicing changes as an adaptive feedback response to hyperstimulation.
In addition to the relatively slow transitions in alternative splicing described above, splicing is also utilized as a component of acute responses to stresses such as DNA damage and hypoxia, as well as oxidative, osmotic, thermal or nutrient stress. In yeast, for example, splicing of ribosomal protein-encoding genes is inhibited within minutes of amino acid starvation122. The best characterized examples occur during heat shock or genotoxic stress due to ultraviolet irradiation.
Heat shock in mammalian cells results in splicing inhibition, which can be recapitulated in cell-free splicing reactions using nuclear extracts prepared from heat-shocked cells123. The majority of heat shock protein (HSP) genes lack introns; HSP genes that contain introns escape the splicing inhibition of thermal shock by an unknown mechanism124. For HSP47, alternative splicing is activated by heat shock to include an additional 169 nucleotides within the untranslated region, producing an mRNA isoform that is more efficiently translated125.
There are two proposed mechanisms for splicing inhibition in response to thermal stress. First, SRSF10, a splicing factor that regulates both constitutive and alternative splicing, is rapidly dephosphorylated by heat shock. Dephosphorylation increases SRSF10 interaction with U1 snRNP, which prevents association with other SR proteins126, producing splicing inhibition. SRSF10 is re-phosphorylated within an hour of recovery, which parallels splicing restoration of a model pre-mRNA substrate (β-globin). Whether SRSF10 dephosphorylation affects splicing of select substrates or produces a general splicing defect is presently unclear. However, poor heat shock recovery of SRSF10-deficient cells suggests that it affects critical genes126. SR protein-specific kinase, SRPK1, dynamically interacts with heat shock proteins Hsp70 and Hsp90 in mammalian cells127. Stress signals such as osmotic shock disrupt these interactions and promote cytoplasmic-to-nuclear translocation of SRPK1, which results in differential phosphorylation of SR proteins and splicing alterations127.
In a second mechanism, heat shock as well as chemical and osmotic stress lead to formation of nuclear stress bodies that sequester a subset of alternative splicing factors, affecting their splicing functions (reviewed in113,128). The kinetics of these two mechanisms are quite different. While SRSF10 dephosphorylation occurs rapidly and is completely reversed within an hour of recovery126, recruitment of splicing factors to nuclear stress bodies peaks at three hours and is reversed over ten to twelve hours128. How the different mechanisms are integrated on different time scales to promote recovery remains to be determined. Large-scale studies could be used to identify genes whose splicing is inhibited versus spared in response to thermal stress, determine whether spared genes are enriched for recovery functions, and define commonalities among inhibited versus spared genes to investigate the different regulatory mechanisms.
Rapid, reversible and coordinated skipping of multiple exons from the MDM2 and MDM4 transcripts following ultraviolet (UV) irradiation provided the first indication that alternative splicing synchronizes a rapid response to genotoxic stress129. MDM2 is an E3 ligase responsible for targeting the tumor suppressor protein p53 for ubiquitin-dependent degradation130. Skipping of MDM2 exons deletes the p53 binding region in MDM2 protein129, which allows p53 activation during stress and its rapid shut-off upon stress removal. Furthermore, MDM2 is a transcriptional target of p53, which creates a negative feedback loop to control p53 activity in response to stress.
In addition to UV irradiation, DNA damage induced by common anti-cancer agents such as inhibitors of topoisomerase I and cyclin dependent kinase can alter splicing patterns of a large number of genes131-132. Intriguingly, these splicing changes are not dependent on general DNA damage response signals such as p53 or signaling kinases ATM and ATR. Two different cotranscriptional mechanisms provoke splicing changes in response to genotoxicity. One set of exons undergoes increased skipping after DNA damage, largely due to impaired communication between the transcription and splicing machineries that is normally mediated by EWS, an RNA polymerase II (RNAPII)-associated factor, and YB-1, a spliceosome-associated factor131. Normally, EWS binds co-transcriptionally to its target RNAs, but UV irradiation reduces this association due to a transient relocation of EWS to the nucleoli133. Reduced association of EWS with its target RNA affects splicing among genes preferentially involved in DNA repair and genotoxic stress signaling133, linking EWS-mediated splicing regulation and the DNA damage response. Conversely, another set of exons exhibits increased inclusion in response to genotoxicity due to slowing of the RNAPII elongation rate132. In particular, UV irradiation increases the phosphorylation of the carboxy-terminal repeat domain of RNAPII, which slows RNAPII elongation. Alternative exons are typically flanked by inefficiently recognized splice sites, and slowing of RNAPII allows time for alternative exons to commit to splicing18. Future work will provide a broad understanding of the functional consequences associated with both observations.
RNA binding proteins are at the center of regulatory networks in which hundreds of splicing events are associated with binding sites. It is possible that most binding sites serve to dampen the effects of the RNA binding protein and only a minority of binding sites are associated with physiological splicing events (Box 3). However, as highlighted in this review, loss-of-function analyses have been successfully used in yeast, flies, and mice to identify individual splicing targets of RNA binding proteins and specific functions of individual splicing events. To understand the full biological context of the splicing network, a next step is to determine whether a natural change in the activity of the RNA binding protein is utilized to modulate the network in response to physiological need.
Computational and biochemical genome-wide analyses for individual RNA binding proteins often identify hundreds of putative target exons associated with multiple binding sites. Genetic loss-of-function analyses often identify a much smaller number of splicing events exhibiting a robust response. There are several potential explanations for such discrepancies. First, there may be redundancy with related splicing regulators, as the multifactorial nature of splicing regulation means that individual factors contribute to large numbers of events but are determinative for a limited number. Second, the specific cell type exhibiting the robust response might not have been assayed. Third, under certain circumstances splicing transitions do not have to be large to have physiological consequences. Finally, it might be the case that a large fraction of binding sites do not direct a splicing response that has functional consequences, but rather are utilized as “sinks” that sequester the RNA binding protein and dampen its physiological impact.
The hypothesis of natural genome-wide sinks to regulate biological function has been proposed to explain a similar set of paradoxes for miRNAs, for which there are large numbers of predicted targets, but only a small fraction of these targets are conserved and an even smaller fraction have a measurable physiological impact162. This regulatory mechanism was recently proposed for RNA binding proteins that regulate alternative splicing (H. Seitz, personal communication). In this hypothesis, most binding sites serve to negatively regulate the activity of the protein by sequestration from the active pool such that a large number of “molecular targets” titrate the levels of the regulator against a few “physiological targets”. Criteria to distinguish physiological from molecular targets could include: conserved binding sites associated with the regulated exon; a robust change in response to genetic loss- and gain-of-function; and regulation during periods of physiological change in which the RNA binding protein undergoes a change in activity conserved in different species.
In addition to auxiliary RNA binding proteins, which are the focus of this review, the basal splicing machinery makes critical contributions to cell type-specific splicing (for example, see 134-135). Splicing regulation can be independent from the activity of auxiliary splicing factors136. Knockdown of spliceosomal components results in gene-specific splicing effects in yeast, flies, and mammals indicating a large potential for regulation through modulation of spliceosomal components135,137-139. Furthermore, specific cell types, such as neurons, are exquisitely sensitive to hypomorphic mutations within core spliceosomal components140 revealing dramatic cell specificity of what was once considered a ubiquitous basal machinery constant among cell types. It is likely that cell type-specific differences in the basal splicing machinery not only produce cell-specific splicing patterns but also impact the function of auxiliary splicing regulators.
Identification of splicing transitions that function during a physiological change requires knowledge of the cell population sampled. As one example, fewer than 20% of the cells in heart are cardiomyocytes, the majority being cardiac fibroblasts, vascular smooth muscle cells and endothelial cells141. Therefore, it is often unclear which cell type undergoes the splicing transitions that are measured in whole tissue. Another consideration is whether the detected splicing transitions represent regulation within a constant cell population or a change of cell populations. This is relevant to developmental changes that involve cell migrations as well as pathological samples, in which there can be substantial loss and replacement of parenchymal cells by fibroblasts or cell gain by inflammatory infiltration. In the not-too-distant future, the complication of cell heterogeneity will be circumvented by transcriptome analysis in single cells142, which will reveal the variability among cells of the same type and allow analysis of different cell types within a population.
The remarkable complexity of gene regulation becomes increasingly apparent in proportion to the improving resolution of the available assays. Splicing is one component of an interacting continuum of epigenetic, transcriptional and posttranscriptional control18,143. Regulation of individual alternative splicing events has multiple inputs into the decision of whether or not to use splice site(s), with different factors acting antagonistically or as negative or positive co-regulators. Splicing factors autoregulate and cross-regulate each other, thus generating network-wide influences on splicing. Rapidly developing high throughput approaches are leading to the delineation of regulatory networks for large numbers of RNA binding proteins. The combined datasets will aid in identifying how splicing regulatory networks are integrated in different cell types and, ultimately, how they produce diverse physiological responses.
T.A.C. is supported by the National Institutes of Health (AR045653, AR060733, HL045565) and the Muscular Dystrophy Association (156780). A.K. is supported by a Scientist Development Grant from the American Heart Association (11SDG4980011).
Tom Cooper is professor in the Departments of Pathology and Immunology and of Molecular and Cellular Biology at Baylor College of Medicine, Houston, Texas. He received an MD in 1982 and did postdoctoral work at the University of California, San Francisco. Current interests in his lab are the mechanisms of splicing regulation by specific RNA binding proteins, characterization of regulatory networks that coordinate alternative splicing during heart and skeletal muscle development, and pathogenic mechanisms in myotonic dystrophy, a disease caused by disruption of a developmental splicing program.
Auinash Kalsotra is an Instructor in the Department of Pathology and Immunology at Baylor College of Medicine, Houston, USA. He received his undergraduate degree in pharmacy from BITS, Pilani, India, and his Ph.D. in biochemistry and molecular biology from University of Texas-MD Anderson Cancer Center, Houston, USA, where he studied the role of cytochromes P450 in progression and resolution of inflammation. During his postdoctoral work with Tom Cooper he identified a conserved program of alternative splicing regulation important for vertebrate heart development. His research interests include investigating the mechanisms and role of RNA processing in heart development and disease.
Competing interests statement
The authors declare no competing financial interests.
Tom Cooper lab homepage http://www.bcm.edu/pathimmuno/cooper/?PMID=14453
Saccharomyces genome database http://www.yeastgenome.org/
Entrez gene http://www.ncbi.nlm.nih.gov/gene