Ninety-four percent of human genes are discontinuous such that segments expressed as mRNA are contained within exons and separated by intervening segments, called introns. Following transcription, genes are expressed as precursor mRNAs (pre-mRNAs) which are spliced co-transcriptionally and the flanking exons are joined together to form a continuous mRNA. One advantage of this architecture is that it allows alternative splicing by differential use of exons to generate multiple mRNAs from individual genes. Regulatory elements located within introns and exons guide the splicing complex, the spliceosome, and auxiliary RNA binding proteins to the correct sites for intron removal and exon joining. Misregulation of splicing and alternative splicing can result from mutations in cis regulatory elements within the affected gene or from mutations that affect the activities of trans-acting factors that are components of the splicing machinery. Mutations that affect splicing can cause disease directly or contribute to the susceptibility or severity of disease. An understanding of the role of splicing in disease expands potential opportunities for therapeutic intervention by either directly addressing the cause or by providing novel approaches to circumvent disease processes.
The flow of genetic information has traditionally been viewed as DNA transcribed into RNA and translated into protein. Additional layers of regulation continue to be discovered, greatly expanding this simplistic framework and revealing the complex network that controls gene expression. With the greater understanding of regulated gene expression, it has become increasingly clear that RNA is much more than a passive intermediate. The control of RNA processing is now recognized as a crucial component of gene regulation. In addition, a variety of long and short non-coding RNAs have recently been discovered to mediate regulation of gene expression at multiple levels. Gene expression is modulated through multiple RNA-based mechanisms including miRNA-mediated silencing, regulation by long non-coding RNAs, nonsense-mediated decay, polyadenylation site selection, RNA editing, alternative splicing, and regulation of mRNA translation efficiency, stability, and localization. Given the integrated roles of these events in normal gene expression, it is not unexpected that these RNA-based processes are heavily involved as either causative entities, modulating influences, or compensatory responses to disease.
The focus of this review is the diverse roles of splicing and alternative splicing in human disease. It is estimated that 94% of human genes are alternatively spliced and as many as 50% of disease-causing mutations affect splicing [2–4]. Alternative splicing produces variation within mRNAs from individual genes, greatly increasing the diversity of transcripts expressed from these genes (Figure 1). The majority of variation is within the open reading frame, resulting in the expression of different protein isoforms, which often have different functional properties. Splicing and the regulation of alternative splicing are disrupted both by mutations within cis-acting elements required for correct pre-mRNA processing and by mutations that affect trans-acting components that are necessary for splicing regulation. Effects on splicing can be direct causative agents of disease or more subtle contributions to the determinants of disease susceptibility or modulators of disease severity. Recently, a number of RNA binding proteins with roles in alternative splicing regulation, as well as other RNA processing events, have been identified as disease-associated genes, particularly in neurodegenerative disorders and cancer. Here we will first review the molecular machinery required for normal constitutive and alternative splicing and then cover the mechanisms by which alterations in splicing and its regulation create and modulate disease.
For each mRNA, the site of transcription initiation produces the first nucleotide and the site of a directed endonuclease cleavage event becomes the 3’ end to which a ~200 nucleotide polyA tail is added. Ninety-four percent of human genes contain introns and exons which are spliced together post- or co-transcriptionally. The molecular machinery required for intron removal and exon joining must perform a demanding task. Splicing requires extreme precision because even a single nucleotide addition or deletion at the site of exon joining will shift the reading frame with adverse consequences to the protein-coding potential. To achieve this accuracy, the splicing machinery must efficiently recognize the intron-exon boundaries in the pre-mRNA. Splicing is performed by a multi-component machine, the spliceosome, composed of five small nuclear RNAs (snRNAs) pre-assembled with proteins into small ribonucleoproteins (snRNPs) and hundreds of additional proteins. The precision of the reaction is accomplished through a coordinated series of RNA-RNA, RNA-protein and protein-protein interactions.
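The sensitivity of the reading frame to a one-nucleotide error at the exon junction can be illustrated with a minimal sketch. The two exon sequences below are invented for illustration and do not come from any real gene, and the helper name `codons` is likewise an assumption.

```python
def codons(seq):
    """Split a sequence into consecutive triplets, dropping any incomplete tail."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

# Hypothetical exons (not taken from a real gene).
exon_a = "ATGGCC"
exon_b = "GAATTCTAA"

# Precise joining keeps the downstream exon in frame.
correct = codons(exon_a + exon_b)

# Joining one nucleotide too far into the downstream exon deletes a single base,
# so every codon after the junction is read in a different frame.
shifted = codons(exon_a + exon_b[1:])

print(correct)
print(shifted)
```

In this toy case the codons upstream of the junction are unchanged, while every codon downstream of it differs and the original stop codon is no longer read in frame.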
A critical challenge to the spliceosome is to correctly identify exons within the pre-mRNA. Exons make up only one tenth of the typical pre-mRNA and therefore must be identified within a sea of introns. The mechanism of exon recognition involves identification of a complex code of cis-acting elements within genes. It is also becoming more apparent that splicing of most intron-containing genes is likely to be co-transcriptional, which reduces the complexity of intron-exon junction recognition. During the past decade, a large research effort involving many labs has made substantial progress toward deciphering the splicing code. Every intron contains three core sequence elements that are essential for splicing: the 5’ and 3’ splice sites at the 5’ and 3’ ends of the intron, respectively, and the branch point sequence (Figure 2). These sequences are recognized multiple times by components of the spliceosome during the processes of intron removal and exon joining. The 5’ splice site is bound first by the U1 snRNP and later by the U6 snRNP. The RNA binding protein SF1 binds the branch point sequence but is later displaced by the U2 snRNP. In general, splice sites with greater affinity for these recognition complexes achieve a higher splicing efficiency. The multiple and independent recognition of these sequences by different spliceosome components contributes to the fidelity of splicing.
These three core sequences are necessary but not sufficient for defining intron-exon junctions. There are additional sequences located both within introns and exons that recruit trans-acting splicing factors to ensure inclusion of constitutive exons or modulate the efficiency of splice site recognition, promoting alternative splicing. It is now clear that most exons, whether constitutively or alternatively spliced, contain cis-acting elements that affect splicing efficiency. These sequences are referred to as intronic or exonic splicing enhancers (ISE or ESE) or silencers (ISS or ESS). Although some of these elements are identifiable by combined use of different algorithms, it is not yet possible to consistently predict the effects of nucleotide substitutions on splicing. As the sequences that define ESSs and ESEs are better established, exonic mutations that disrupt splicing will be more readily recognized based on sequence alone. Enhancer elements are bound by positive regulators that increase exon inclusion, such as the SR protein family, whose members contain at least one RNA binding domain and a distinctive serine/arginine-rich domain. In contrast, silencer elements are bound by negative regulators that decrease exon inclusion, such as the abundant hnRNP proteins, which were originally identified by their association with nascently transcribed pre-mRNA. The critical balance of these antagonistic regulators is necessary for controlling the level of exon inclusion in the mRNA transcript. Mutations within splice site sequences at the intron-exon junction account for approximately 10% of disease-causing mutations. As described in sections below, it is likely that a larger number of disease-causing mutations that affect splicing are located within enhancer and silencer sequences. These mutations reduce the efficiency with which a constitutive exon is spliced or alter a critical ratio of alternative splicing patterns.
Alternative splicing is controlled both spatially and temporally, resulting in the expression of different splice variants in different tissues, in different cells within the same tissue, or in the same tissue at different stages of development or in response to pathological processes. It is now known that most genes are alternatively spliced; however, it is also important to determine which splice variants are expressed, their relative abundance, and their biological function in order to fully understand not only the function of a gene but also how the function is altered in disease. One example of the complications introduced by alternative splicing with regard to understanding the pathogenic mechanisms of a disease-causing mutation is the CACNA1H T-type calcium channel. CACNA1H generates a large number of functionally diverse protein isoforms which have been extensively characterized. It is possible that disease-associated mutations in this gene may differentially affect the activity of the diverse splice variants. Missense mutations in this gene cause familial forms of epilepsy, and a number of studies aimed at determining the electrophysiological effects of different disease-causing mutations were performed using one available full-length splice variant. Given the large number of functionally diverse isoforms expressed, one can imagine that the isoform studied is not the predominant splice variant in the cells that are relevant to epilepsy. This example illustrates the importance of knowing the functional differences between each splice variant and identifying specific splice variants expressed in the disease-relevant cells to understand the full impact of disease-causing mutations.
To study alternative splicing in a global context, several approaches have been developed including specialized expression profiling arrays and high-throughput sequencing analysis. Traditional microarrays contain oligonucleotide probes targeted primarily to the 3’ end of an mRNA to measure total transcript levels, but are unable to distinguish between different splice variants. More specialized exon tiling and splice junction microarrays contain probes to individual exons and exon-exon junctions. The higher coverage allows not only the identification of transcripts that are up-regulated or down-regulated, but also detects differential use of specific exons. Splice junction profiling was used to identify subsets of transcripts regulated by the neuron-specific splicing factor Nova2 by comparing wild-type and Nova2 knockout mouse brain tissue and to identify splicing transitions that occur during mouse heart development. These and other studies strongly suggest the presence of an elaborate regulatory network to coordinate splicing transitions for specific physiological functions and during development. Most recently, high throughput transcriptome sequencing has provided a highly quantitative analysis of mRNA abundance and differential utilization of exons. High-throughput sequencing technologies are developing at a rapid rate, such that transcriptome analysis is likely to be widely available to compare diseased and normal tissue for investigative and even diagnostic purposes.
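As a rough illustration of how differential use of a single exon might be quantified from junction-level data, the sketch below estimates the fraction of transcripts that include a cassette exon. The function name, the read counts, and the choice to average the two inclusion junctions are all assumptions made for illustration; they are not a method described in this review.

```python
def inclusion_fraction(upstream_junction, downstream_junction, skipping_junction):
    """Estimate the fraction of transcripts that include an exon from junction read counts.

    An included exon yields two junctions (upstream exon to cassette exon, and
    cassette exon to downstream exon), so their counts are averaged before being
    compared with the single junction produced by skipping.
    """
    inclusion = (upstream_junction + downstream_junction) / 2
    total = inclusion + skipping_junction
    if total == 0:
        return None  # no informative reads for this exon
    return inclusion / total

# Hypothetical read counts for one exon in two samples.
normal = inclusion_fraction(90, 110, 25)
diseased = inclusion_fraction(20, 30, 75)
print(normal, diseased)
```

Comparing such a fraction between diseased and normal tissue gives a per-exon measure of a splicing change that is independent of the overall abundance of the transcript.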
Most of the splice site mutations that lead to human disease involve the invariant GT and AG dinucleotides in the 5’ and 3’ splice sites [10, 16]. These dinucleotides are essential for exon definition and appropriate splicing. However, mutations occurring at other positions of the 5’ or 3’ splice site can also lead to missplicing and typically result in exon skipping, activation of a cryptic splice site, or intron retention. A number of diseases have been reported that result from intronic mutations including Familial Dysautonomia, Neurofibromatosis type 1, Frasier syndrome, and atypical cystic fibrosis and have been previously reviewed [17–19].
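A minimal sketch of the dinucleotide rule described above is given here. The intron sequences are invented for illustration, and the function name `has_canonical_dinucleotides` is an assumption; the check covers only the invariant GT and AG positions and says nothing about the other splice site positions or the branch point.

```python
def has_canonical_dinucleotides(intron):
    """Return True if a DNA intron sequence starts with GT and ends with AG."""
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")

# Hypothetical intron sequences (not taken from a real gene).
wild_type = "GTAAGTCTTTCAG"
donor_mutant = "ATAAGTCTTTCAG"      # G to A at the first intron position
acceptor_mutant = "GTAAGTCTTTCAC"   # G to C at the last intron position

for name, seq in [("wild type", wild_type),
                  ("donor mutant", donor_mutant),
                  ("acceptor mutant", acceptor_mutant)]:
    print(name, has_canonical_dinucleotides(seq))
```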
Familial Dysautonomia (FD) is a recessive disease of the sensory and autonomic nervous system caused by mutations in the IKBKAP gene and is a particularly instructive example of the effects of splice site mutations outside of the invariant first and last nucleotides. The IKBKAP gene encodes IKAP, a component of the Elongator complex thought to function as a transcriptional regulator [21, 22]. IKAP depletion results in reduced transcription of a number of cell motility genes, opening the possibility that impaired cell migration in the nervous system may underlie the neurological dysfunction in FD patients. Ninety-eight percent of the disease-causing alleles contain a silent T to C transition in the sixth base of the intron, the last nucleotide of the 5’ splice site of intron 20 [24, 25]. This mutation leads to exon 20 skipping, causing a shift in the reading frame and introduction of a premature termination codon. This is likely to result in nonsense-mediated decay of the mRNA and decreased expression of functional IKAP [25, 26]. In silico and in vitro splicing analysis determined that the upstream 3’ splice site of intron 20 and the splicing regulatory sequences within exon 20 are not strong enough to define the exon in the presence of the mutated 5’ splice site. Increasing the strength of the base pairing between the mutated 5’ splice site and U1 snRNA, an early step in spliceosome assembly, restores exon 20 inclusion. These studies demonstrate that single base pair changes in the intron can dramatically affect exon inclusion by disrupting spliceosomal recognition of splice sites.
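The effect of a substitution at the sixth intron position on pairing with U1 snRNA can be sketched by counting Watson-Crick pairs. The sketch assumes the consensus 5’ splice site CAG|GUAAGU rather than the actual IKBKAP sequence, represents the pseudouridines near the 5’ end of U1 snRNA as U, and ignores G-U wobble pairs; the names `U1_5P_END` and `u1_pairs` are hypothetical.

```python
# Assumption: nucleotides 3-11 of U1 snRNA written 5'->3', pseudouridines shown as U.
U1_5P_END = "ACUUACCUG"
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

def u1_pairs(splice_site):
    """Count Watson-Crick pairs between a 9-nt 5' splice site (positions -3 to +6) and U1."""
    if len(splice_site) != len(U1_5P_END):
        raise ValueError("expected a 9-nucleotide splice site")
    # Pairing is antiparallel: read U1 3'->5' against the splice site 5'->3'.
    return sum((a, b) in WATSON_CRICK
               for a, b in zip(splice_site.upper(), reversed(U1_5P_END)))

consensus = "CAGGUAAGU"      # consensus 5' splice site, exon positions -3 to -1 then intron +1 to +6
plus6_mutant = "CAGGUAAGC"   # hypothetical U to C change at intron position +6

print(u1_pairs(consensus), u1_pairs(plus6_mutant))
```

A single lost pair is a small change in this count, which is consistent with the observation above that the mutation matters because the neighboring 3’ splice site and exonic elements are too weak to compensate.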
Disease-causing missense mutations are commonly assumed to disrupt protein function. Surprisingly, a large and growing number of examples demonstrate that mutations within protein coding exons can have primary disease-causing effects by disrupting splicing. This occurs by mutations that alter ESS or ESE motifs. Spinal muscular atrophy (SMA) is one of the best characterized examples of a disease caused by an exonic mutation that disrupts splicing. Survival of motor neurons (SMN) protein is encoded by the SMN1 gene and a nearly identical SMN2 gene, created from a gene duplication event. SMN is necessary for snRNP assembly and metabolism, and its loss via deletions in SMN1 results in the most common genetic cause of infant mortality and a disease characterized by motor neuron degeneration and progressive paralysis. SMN2 is unable to compensate for loss of SMN1 due to a silent C to T substitution in the sixth nucleotide of exon 7 which promotes exon skipping and production of a truncated, inactive protein [29–31]. Two non-exclusive models explain how the mutation promotes exon skipping either through gain of an ESS or loss of an ESE. In the ESS-gain model, the mutation creates an ESS binding site for hnRNPA1 which functions as a splicing repressor [32, 33] whereas in the ESE-loss model, the mutation disrupts an ESE that is bound by the SF2/ASF splicing activator, an SR protein [34, 35]. Interestingly, the mutated SMN2 exon 7 shares high sequence similarity with a disease-associated region of the medium-chain acyl-CoA dehydrogenase (MCAD) gene. MCAD performs one of the initial catalytic steps in the mitochondrial fatty-acid oxidation pathway and deficiency of this enzyme results in the most common fatty-acid disorder. A silent C to T mutation has been identified in MCAD exon 5 of several MCAD deficiency patients at the same exonic position as the SMN2 base pair substitution, the sixth nucleotide.
This mutation disrupts a putative ESE binding site for SF2/ASF, resulting in exon skipping and nonsense-mediated decay of the aberrant transcript. Overexpression of SF2/ASF, but not other SR proteins, strongly increased exon 5 inclusion, suggesting the disease-associated mutation weakens but does not totally abolish ESE recognition. When the putative ESE was replaced by the corresponding ESE sequence from SMN1 exon 7, MCAD exon 5 inclusion was restored. In contrast, when the disrupted ESE from SMN2 was used for replacement, MCAD exon 5 remained predominantly skipped. The proper splicing of MCAD exon 5 and SMN2 exon 7 requires similar exonic regulatory sequences, and disruption of these sequence elements alters the splicing pattern and serves as the basis of disease.
Phenotypic variability results from changes in gene expression, including individual differences in splicing. Polymorphisms or mutations occurring near splice sites or within splicing regulatory sequences can alter splicing patterns [4, 38]. These changes, in turn, can result in the expression of different mRNA variants that affect disease susceptibility and severity. For example, splicing is a genetic modifier of two X-linked disorders of copper metabolism, Menkes disease (MD) and occipital horn syndrome (OHS). Both diseases are caused by mutations in the ATP7A gene, encoding a copper-transporting ATPase [40–42]. ATP7A uses energy from ATP hydrolysis to transport copper across polarized enterocytes (intestinal absorptive cells), increasing its bioavailability for numerous cellular processes. In the absence of functional ATP7A, copper deficiency results from poor distribution of copper to cells in the body. Individuals with MD are affected by neurological degeneration, connective-tissue defects, distinctive kinky and brittle hair, and early-childhood death. OHS is an allelic form of MD with milder symptoms, the most predominant feature being connective-tissue abnormalities [43, 44].
Several hundred disease-causing mutations have been identified that result in disruption of ATP7A function and splice site mutations are highly represented [45, 46]. Predominant among MD patients are mutations occurring in the invariant dinucleotides at the 5’ and 3’ splice sites that abolish exon recognition by the spliceosome and eliminate or profoundly reduce the expression of the correct splice variant. In contrast, the milder OHS phenotype results from mutations that occur in less conserved regions of the same splice sites. This weakens but does not prevent exon recognition so both aberrantly and correctly spliced transcripts are produced. The low level of functional transporter protein is sufficient to prevent severe disease [47–49]. This example demonstrates how splicing efficiency determines disease severity through different splicing mutations in the same gene.
For some genes, the ratio of two or more splice variants must be properly balanced in response to changing cellular conditions. Antagonistic regulation between positive- and negative-acting factors contributes to this delicate control and its disruption can lead to disease. One particularly well-characterized example is frontotemporal dementia and Parkinsonism linked to chromosome 17 (FTDP-17). FTDP-17 is caused by mutations in the MAPT gene that encodes tau, a protein involved in microtubule assembly and stability. A number of disease-causing mutations disrupt alternative splicing of exon 10, ultimately altering the ratio of two tau isoforms containing either three (3R) or four (4R) repeat microtubule binding sequences. The finding that silent mutations in exon 10 caused disease was an initial clue that the primary pathogenic mechanism involved an effect on splicing rather than altered protein function [50, 51]. For these mutations, a change in the 4R:3R ratio due to increased exon 10 inclusion is the causative event in tau aggregation and onset of disease.
While MAPT exon 10 provides an example of cis-acting mutations that disrupt the balance of isoforms expressed, the Bcl-2 protein family illustrates how differences in the trans-acting splicing environment can affect a critical balance between splice variants. Most members of the Bcl-2 family undergo alternative splicing to produce isoforms that either promote or prevent apoptosis. The relative levels of splice products are critical for regulating apoptosis and an imbalance of these products is an important influence in the initiation and progression of cancer. One member of this family, Bcl-X, produces two protein isoforms via splicing of an alternative 5’ splice site in the first coding exon. The use of the downstream splice site creates a longer isoform Bcl-XL with anti-apoptotic function, whereas use of the upstream splice site produces a shorter splice isoform Bcl-XS with pro-apoptotic function. The mechanistic details of how each isoform promotes its apoptotic response are not known; however, it is clear that the alternative splicing event creates two splice variants with opposing activities, indicating that this process of RNA regulation is capable of deciding cell fate. An imbalance of these two isoforms has been implicated in several cancers; the anti-apoptotic Bcl-XL is up-regulated in multiple myeloma, small cell lung carcinoma, and in prostate and breast cancer where it is associated with an increased risk of metastasis [56–59]. The pro-apoptotic Bcl-XS is down-regulated in transformed cells, but its forced overexpression sensitizes breast cancer cells to chemotherapeutic agents. These observations indicate a change in the nuclear regulatory environment associated with cancer. Some of the factors that can affect splicing in cancer have been identified and are described below.
While disease-causing mutations that act in cis affect splicing of a single gene, mutations that affect components of the splicing machinery create the potential for multiple genes to be misspliced. The relative scarcity of examples for severe loss-of-function mutations in trans-acting factors may be an indication that mutations with widespread consequences are lethal during embryonic development or in individual cells. However, there are several examples of diseases resulting from mutations in genes important for splicing, including SMA, amyotrophic lateral sclerosis (ALS), and retinitis pigmentosa. In the case of SMA, while SMN is ubiquitously required for snRNP assembly, SMN deficiency in SMA predominantly affects motor neurons, resulting in motor neuron degeneration. Interestingly, the snRNP repertoire was found to be altered in an SMN-deficient SMA mouse model and was associated with widespread splicing defects in multiple tissues, consistent with the role of SMN in snRNP maturation. The heightened sensitivity of motor neurons to SMN deficiency could reflect requirements of neuron-specific pre-mRNAs or specialized functions or partners of SMN in motor neurons.
Disruption of other trans-acting splicing regulators is also thought to have significant effects on splicing. TDP-43, a member of the hnRNP family, has been implicated in the mechanism of cystic fibrosis where it binds to a repetitive (UG)n element in intron 8 of the cystic fibrosis transmembrane conductance regulator (CFTR) gene, promoting exon 9 skipping and decreased expression of functional protein. Moreover, a role for TDP-43 has been found in several neurodegenerative diseases including frontotemporal dementia and ALS, where the protein is abnormally included in ubiquitinated protein aggregates in the cytoplasm of neurons and glial cells [63, 64]. Partial depletion from the nucleus and thus, a decrease in nuclear function, may be a contributing factor in these diseases. Mutations in TDP-43 as well as another RNA-binding protein, fused in sarcoma (FUS), have recently been identified in both familial and sporadic forms of ALS [65, 66]. The identification of disease-causing mutations in TDP-43 and FUS strongly suggests that splicing abnormalities and possibly other RNA processing events contribute to the neurodegeneration phenotype. A crucial next step is to identify the RNA targets of these proteins and their role in motor neuron function and survival.
Short (≤10 nucleotide) repetitive sequences are located throughout the genome and their expansion beyond a pathogenic threshold is responsible for a class of disorders called microsatellite expansion disorders. When repeats are located in an open reading frame, expansions result in both loss and gain of protein function, such as in Huntington’s disease and several types of spinocerebellar ataxias. However, when the expanded repeats occur in non-coding regions of the gene, disease can result not only from protein loss-of-function but also from a gain-of-function of the repeat-containing RNA transcribed from the expanded allele. The best characterized example of an RNA gain-of-function resulting from repeat expansion is myotonic dystrophy (DM). This adult-onset neuromuscular disorder is characterized by multisystemic clinical features including cardiac arrhythmias, skeletal muscle wasting, cataracts, myotonia, insulin resistance, and neuropsychiatric dysfunction.
There are two types of DM that differ in the sequence and location of the expansion, but share a common pathogenic mechanism. DM type 1 (DM1) is caused by a CTG trinucleotide expansion in the 3’ untranslated region of the DMPK gene [70, 71], whereas a generally milder DM type 2 (DM2) is caused by a CCTG tetranucleotide expansion located within the first intron of the ZNF9 gene. A molecular hallmark of DM is the accumulation of CUG- or CCUG-repeat RNA in nuclear foci [72, 73]. The alternative splicing regulator MBNL1 has a high affinity for expanded CUG- and CCUG-repeat RNA and co-localizes with the nuclear foci, depleting MBNL1 from the nucleoplasm. The role of MBNL1 depletion in DM was demonstrated by an Mbnl1 knockout mouse line that recapitulated many clinical features of the disease including myotonia, cataracts, and skeletal muscle abnormalities. A second alternative splicing regulator, CUGBP1, is aberrantly up-regulated in DM1 heart and skeletal muscle tissue [76, 77]. CUGBP1 up-regulation is due to protein kinase C-mediated hyperphosphorylation and stabilization of CUGBP1 induced by CUG-repeat RNA. Functional loss of MBNL1 and gain of CUGBP1 are thought to be primarily responsible for the widespread disruption of developmentally regulated alternative splicing events in DM tissues. Several disease features can be directly attributed to misregulation of individual splicing events, including myotonia and insulin resistance. Myotonia is due to aberrant inclusion of muscle-specific chloride channel (CLCN1) exon 7a in adults, resulting in nonsense-mediated decay of CLCN1 mRNA and reduced chloride conductance in muscle [79, 80]. Correction of the Clcn1 splicing defect reversed the myotonia phenotype in DM mouse models. Insulin resistance in DM directly correlates with decreased inclusion of insulin receptor (IR) exon 11 and predominant expression of a lower-signaling IR isoform with decreased insulin sensitivity.
This multi-systemic disease demonstrates the widespread consequences that result from a trans-dominant mutation that affects alternative splicing regulation.
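The CTG and CCTG tracts described above for DM1 and DM2 are tandem repeats, and the length of such a tract in a DNA sequence can be measured with a short sketch. The function name `longest_repeat_run` and the example sequences are assumptions for illustration; the repeat counts used here are arbitrary and are not pathogenic thresholds.

```python
import re

def longest_repeat_run(seq, unit):
    """Return the largest number of uninterrupted tandem copies of `unit` in `seq`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(match.group(0)) // len(unit) for match in pattern.finditer(seq.upper())]
    return max(runs, default=0)

# Hypothetical sequences: flanking bases around a repeat tract.
short_allele = "GG" + "CTG" * 5 + "AC" + "CTG" * 2
longer_allele = "GG" + "CTG" * 40 + "AC"

print(longest_repeat_run(short_allele, "CTG"))
print(longest_repeat_run(longer_allele, "CTG"))
print(longest_repeat_run("TT" + "CCTG" * 3 + "A", "CCTG"))
```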
Different splice variants are commonly found to be enriched in cancer tissue compared to the normal surrounding tissue. The splicing change can result from mutations within intronic or exonic splicing elements within the genes relevant to cancer, such as oncogenes or tumor suppressors. In many cases, however, the aberrantly spliced genes are not mutated, indicating that the defects involve a change in the nuclear environment that regulates splice site choice. Recently, high-throughput transcriptome sequencing of cancer cells has led to the identification of transcript chimeras between neighboring genes, called read-throughs, potentially due to trans-splicing or to co-transcription and intergenic splicing. Future work should determine the mechanism responsible for read-through formation and their causal roles, if any, in cancer. The relevance of splicing to cancer raises several questions including: i) do the splicing changes initiate and/or promote cancer progression, ii) do the splicing changes increase the oncogenic potential of the proteins expressed from these variants, and iii) can an alternative splicing signature be used to identify cancer subtypes, predict clinical outcome, or aid in identification of the most effective treatments. The roles of splicing in cancer progression, diagnosis, and treatment have been addressed in several excellent reviews [84, 85].
The best documented examples of the role of splicing in cancer involve alterations in known tumor suppressors and oncogenes. KLF6, a Kruppel-like zinc finger transcription factor, is a tumor suppressor that inhibits cell growth via trans-activation of the cyclin-dependent kinase inhibitor p21 and through p21-independent mechanisms [86–89]. A splice variant of KLF6, KLF6-SV1, is generated by an alternative 5’ splice site in exon 2, producing a protein isoform that lacks the zinc finger DNA binding domains but retains most of the activation domain. Therefore, KLF6-SV1 antagonizes wild-type KLF6 function in a dominant-negative manner, promoting cell proliferation and migration. A prostate cancer-associated single nucleotide polymorphism near the intron-exon boundary creates a binding site for the SR protein SRp40 which leads to increased expression of the KLF6-SV1 isoform. The association of this polymorphism and prostate cancer suggests that altered splicing and KLF6-SV1 overexpression lead to increased cancer risk. KLF6-SV1 overexpression in vivo accelerates prostate cancer progression and metastasis, whereas its knockdown by RNAi induces apoptosis and decreases tumor growth. Cumulatively, these studies demonstrate the potential role of the KLF6-SV1 oncogenic splice variant in cancer progression. Another interesting example connecting splicing of tumor suppressors to cancer is the CDKN2A gene locus which encodes two tumor suppressors, p14ARF and p16INK4a, with alternative reading frames. CDKN2A is a high-risk locus for melanoma formation through the loss of p14ARF and p16INK4a. In a family with melanomas, neurofibromas, and multiple dysplastic nevi, a novel splice site mutation has been detected in the splice acceptor site of intron 1 which promotes exon 2 skipping in both the p14ARF and p16INK4a transcripts. Inactivation of both tumor suppressors may have combinatorial effects in the development of these diseases.
The receptor tyrosine kinase KIT provides an example of a proto-oncogene activated by aberrant splicing. KIT gain-of-function mutations have been documented in gastrointestinal stromal tumors (GISTs) and are sufficient for GIST development. Small deletions encompassing the 3’ splice site of intron 10 have been identified in multiple GISTs [96, 97]. Surprisingly, these deletions do not result in exon 11 skipping, but rather create a novel intra-exonic 3’ splice site within exon 11. The deleted region is critical for inhibition of KIT kinase activity in the absence of ligand, and structural analysis of the novel KIT splice variant indicates that the kinase adopts a constitutively active conformation. These data suggest that aberrant pre-mRNA splicing of the proto-oncogene KIT could play a key role in the development of GISTs.
SR proteins control alternative splicing events in proto-oncogenes and tumor suppressor genes, frequently modifying their cellular activity. Many SR proteins are up-regulated in tumors as exemplified by the high levels of SRp20, SF2/ASF, and SC35 in ovarian cancer and SF2/ASF in tumors of the colon, thyroid, small intestine, kidney and lung. A recent study showed that SF2/ASF overexpression can transform immortalized cells and SF2/ASF-expressing cells are tumorigenic in nude mice. The transformed phenotype of these cells and a lung cancer cell line exhibiting SF2/ASF up-regulation could be reversed by shRNA-mediated knockdown of SF2/ASF, indicating splicing regulators can indeed act as oncogenes. SF2/ASF alters splicing of the BIN1 tumor suppressor, which binds and suppresses the c-Myc oncogene. BIN1 exon 12A, the inclusion of which is increased by SF2/ASF, disrupts Myc binding and abolishes its tumor suppressor function. SF2/ASF also induces oncogenic isoforms of the MNK2 and S6K1 kinases and the RON tyrosine kinase receptor [99, 101]. Importantly, the SF2/ASF-induced isoform of S6K1 is sufficient for cell transformation, demonstrating the potency of trans-acting alterations in cancer. Changes in the expression of additional splicing factors are documented in cancer development and progression, but the splicing defects they induce and their relation to cellular transformation remain to be elucidated.
The activities of factors that directly regulate alternative splicing are modulated in response to a variety of cellular signaling pathways, commonly through post-translational modifications. Small molecules that either promote or inhibit these signaling events can significantly alter the splicing patterns of a subset of transcripts. The best documented post-translational modification affecting splicing is the phosphorylation of SR proteins. The phosphorylation state of an SR protein controls multiple features of its function, ultimately affecting its ability to enhance exon recognition. GSK3 is one of several kinases that phosphorylate SR proteins, and inhibition of GSK3 therefore has the potential to alter SR-regulated splicing events. SR proteins regulate inclusion of tau exon 10, mis-splicing of which can cause FTDP-17. Small molecule kinase inhibitors against GSK3 restore tau exon 10 inclusion, illustrating the ability of small molecules to alter splice site selection and potentially treat disease. A drawback of this approach is the potential for numerous off-target effects. Multiple SR proteins and their respective targets will be affected by GSK3 inhibition. In addition, GSK3 has many substrates, including metabolic, signaling, and structural proteins. The future development of small molecule modulators of splicing will depend upon how well the specificity of their effects can be optimized.
Short antisense oligonucleotides (AONs) complementary to specific regions within target pre-mRNAs are being used in multiple approaches to reverse or circumvent disease processes: (i) promote exon skipping by blocking 5’ and/or 3’ splice sites, (ii) promote exon skipping or inclusion, respectively, by blocking binding of cognate factors to splicing enhancers or suppressors, (iii) promote degradation of mutant transcripts, and (iv) prevent harmful interactions between proteins and pathogenic RNA (Figure 3). An unmodified DNA AON hybridized to RNA induces degradation of the base-paired RNA by RNase H, which is ubiquitously expressed. The functionality of AONs can be changed by chemical modifications that prevent activation of RNase H. Modified AONs targeted to particular regions within a pre-mRNA are used to block access of factors that recognize specific cis elements. For example, AONs that bind to splice sites induce exon skipping, and this approach is being successfully applied to restore the reading frame of mutated genes.
One example of an application in clinical practice is in Duchenne muscular dystrophy (DMD), a neuromuscular disorder in which two-thirds of the mutations are deletions within the massive (2 Mb) dystrophin gene. These deletions produce mRNAs that are out of frame, resulting in severe loss of function. Two observations suggested that retention of the reading frame, despite internal deletions, would produce partially functional protein. First was the observation of rare dystrophin-expressing skeletal muscle fibers in DMD muscle tissue that are thought to arise from aberrant splicing, resulting in exon skipping and expression of in-frame protein containing internal deletions [107, 108]. Second was the realization that the milder Becker muscular dystrophy (BMD) is caused by dystrophin deletions that retain an open reading frame. These observations supported the idea that in-frame deletions producing partially functional protein could reduce the severity of DMD. AONs directed at splice sites that restore the dystrophin reading frame by induced skipping of selected exons have been successfully applied to DMD patients [109, 110]. In these initial studies, AONs were delivered by direct injection into skeletal muscle tissue, and a clinical trial for systemic delivery is ongoing (http://clinicaltrials.gov/ct2/show/NCT00844597).
A different application for AONs has been developed as a therapeutic strategy for treating DM1. The current model for DM1 pathogenesis is that expanded CUG repeats result in a toxic RNA gain-of-function. Two recent studies used modified CAG-repeat AONs to hybridize to the CUG repeats and disrupt the RNA-protein complexes that assemble on CUG repeat-containing RNA in cultured cells from individuals with DM1 and in DM1 mouse models [111, 112]. The CAG AONs neutralized the toxic RNA by decreasing the level of CUG repeat-containing RNA and by disrupting nuclear foci, releasing sequestered MBNL and allowing transport of the mutant RNA to the cytoplasm. A major advantage of AONs is the ability to target specific sequences, overcoming the obstacle of target specificity encountered with the use of small molecules to modulate splicing. The issues for AONs as a therapeutic approach include efficient delivery to the appropriate tissues and the duration of the effect. Despite these challenges, AONs remain a promising application for research studies and clinical therapeutics.
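The complementarity between a CAG-repeat AON and a CUG-repeat target can be verified directly, because the antisense strand is the reverse complement of the target read 5’ to 3’. The sketch below models sequence only. The function name and the repeat count of five are illustrative assumptions, and the chemical modifications used on the AONs in the cited studies are not represented.

```python
# Watson-Crick pairing for RNA bases.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(target: str) -> str:
    """Return the reverse complement of an RNA sequence, written 5' to 3'."""
    return target.upper().translate(RNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    cug_repeat = "CUG" * 5  # illustrative repeat count, far shorter than a DM1 expansion
    aon = antisense_rna(cug_repeat)
    print(aon)
    print(aon == "CAG" * 5)
```

Because the repeat unit is only three nucleotides long, the same AON sequence can pair at many positions along an expanded repeat tract.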
Alternative splicing is a highly coordinated process that relies on a combination of positive- and negative-acting factors, intronic and exonic sequence elements, and temporal and spatial signaling pathways for proper control. Mutations that disrupt any of these critical features, either in cis or in trans, may alter the splicing patterns of one or multiple transcripts, disrupting the production or functions of the encoded proteins. Large numbers of splicing-related diseases have been documented; however, this number is likely to be substantially underestimated because the effects of mutations on splicing are often not pursued as a primary cause of disease. A complete understanding of the splicing code, as well as the factors that decipher the code, will allow accurate predictions of which nucleotide changes affect splicing and identification of diseases for which altered splicing is the primary defect. Modulation of splicing provides a potent therapeutic approach to either address the main cause of disease or circumvent a disease process toward molecular normalcy.
No conflicts of interest exist