Cellular functions depend on numerous protein-coding and non-coding RNAs and the RNA-binding proteins associated with them, which form ribonucleoprotein complexes (RNPs). Mutations that disrupt either the RNA or protein components of RNPs or the factors required for their assembly can be deleterious. Alternative splicing provides cells with an exquisite capacity to fine-tune their transcriptome and proteome in response to cues. Splicing depends on a complex code, numerous RNA-binding proteins and an enormously intricate network of interactions among them, increasing the opportunity for exposure to mutations and mis-regulation that cause disease. The discovery of disease-causing mutations in RNAs is yielding a wealth of new therapeutic targets, and the growing understanding of RNA biology and chemistry is providing new RNA-based tools for developing therapeutics.
The normal function of cells depends on accurate expression of a large number of protein coding RNAs (messenger RNAs; mRNAs) and non-coding RNAs. These RNAs participate in transcription (e.g., 7SK RNA), RNA processing (e.g., small nuclear RNAs; snRNAs, small nucleolar RNAs; snoRNAs) and translation (e.g., ribosomal RNAs; rRNAs, transfer RNAs; tRNAs, 7SL/SRP RNA, microRNAs; miRNAs). There are many other RNAs that are involved in other processes (e.g., telomerase, MRP, RNase P RNAs), and those whose functions are less well characterized or entirely unknown (e.g., vault RNAs, Y RNAs, Piwi-interacting RNAs; piRNAs). RNAs exist in cells as ribonucleoprotein complexes (RNPs) that are composed of one or more RNAs and typically numerous RNA-binding proteins (Dreyfuss et al., 1993; Glisovic et al., 2008). The RNPs are the functional forms of the corresponding RNAs and their normal activity depends on both the specific composition and the precise arrangement of their protein constituents. As there are numerous RNAs and a very large number of RNA-binding proteins, the biogenesis of RNPs must be orchestrated with great fidelity. Mutations that disrupt any of the components of RNPs, either RNAs or proteins, or the factors required for their assembly can be deleterious to cells and cause disease (Lukong et al., 2008; Wang and Cooper, 2007). Examples of such cases are now too numerous to afford a meaningful discussion of all but a few within the space constraints of this article. We focus here on defects in pre-mRNA splicing as these have emerged as a common disease-causing mechanism underlying many human genetic ailments. For a more general perspective, a summary of representative examples of disease-causing defects associated with other types of RNAs or RNPs is listed in Table 1. 
Here, we discuss several diseases that result from mutations that affect the splicing of the transcript produced from the same gene (in cis) as well as mutations in the splicing machinery and the regulatory proteins that affect splicing of other transcripts (in trans). The discovery of new disease-causing mutations in RNAs provides a range of therapeutic targets, and the continuing elucidation of RNA biochemistry is also yielding new tools that can be exploited in the development of RNA-based therapeutics.
In humans and other complex metazoans, the vast majority of protein-coding genes contain many segments (introns) that are part of the primary transcript (pre-mRNA) but are not included in the mRNA. The removal of introns and the joining together of the sequences included in the mRNA (exons), which contain the protein-coding open reading frame and the 5′ and 3′ untranslated regions (UTRs), is accomplished by pre-mRNA splicing. Splicing must be both rapid and precise, a demanding feat considering that introns are typically much larger than exons and are strewn with translation termination codons, which would result in truncated proteins if not accurately removed. The definition of exons and the execution of the splicing reaction are mediated by the spliceosome, a large and modular molecular machine composed of snRNAs and splicing factors, and by numerous RNA-binding proteins (see Review by M. C. Wahl, C. L. Will and R. Luhrmann on page XXX of this issue). The biosynthetic and metabolic energy cost of splicing and of the surveillance necessary to eliminate imprecisely spliced mRNAs is very high, and so is the potential harm from defects in these processes. It was therefore not self-evident at the outset why splicing arose, persisted and expanded in the number and size of introns with the evolution of more complex organisms. The real payoff of splicing, which complex organisms have evolved to capitalize on (or perhaps capitalized on for their evolution), seems to come from the capacity for alternative splicing. Alternative splicing allows individual genes to express multiple mRNAs and encode numerous proteins that differ by either large or small peptides or have different UTRs. Alternative splicing provides tremendous opportunities for enrichment of the transcriptome and proteome without the need for expansion of the genome.
Importantly, alternative splicing can also be regulated differently in cell type-, developmental stage-, or signal-dependent patterns. This elaborate scheme shifts the balance of control of gene expression from largely transcriptional, the production of a co-linear transcript of DNA, to a largely post-transcriptional non-linear complex and intricate processing of the pre-mRNA. Recent studies using high throughput deep sequencing indicate that the extent of alternative splicing and thus spliced isoforms is much greater in humans than was previously estimated (Pan et al., 2008; Wang et al., 2008). In some cases, a large number of splicing events are required to produce an mRNA and the number of splicing options can be vast. Examples of proteins encoded by human genes containing a large number of exons include Titin (Ttn; 316 exons), the ryanodine receptor 1 (Ryr1; 106 exons) and dystrophin (DMD; 79 exons). However, with this enormous increase in complexity comes an increased susceptibility to malfunction. Indeed, a large number of human diseases result from mutations or mis-regulation of the splicing process.
The major splicing apparatus (spliceosome) is composed of five small nuclear ribonucleoprotein complexes (snRNPs), each containing one or two snRNAs (U1, U2, U4/U6, and U5), and numerous (likely >100) protein factors. These protein factors include RNA-binding proteins (e.g., U2AF, SF1 and SRFs) and enzymes (helicases/RNPases, kinases and phosphatases, etc.). These modulate the structure and the orderly stepwise associations, dissociations and conformational transitions of the pre-mRNA, snRNAs and protein complexes to facilitate the splicing reaction, proofreading and substrate release (Nilsen, 2003; Staley and Guthrie, 1998; Will and Luhrmann, 2001). A minor splicing pathway, the unique components of which are about 100-fold less abundant than those of the major splicing pathway, operates in parallel using a similar scheme, but it recognizes different canonical splice site sequences and uses mostly different snRNAs (U11, U12, U4atac/U6atac and U5) (Patel and Steitz, 2003). Each snRNP, except U6 and U6atac, has a stable seven-membered ring of Sm proteins (the Sm core), which forms around a conserved sequence called the Sm site, as well as several snRNA-specific proteins. The biogenesis of snRNPs is a highly complex process involving export of the nascent pre-U snRNAs to the cytoplasm, where assembly of Sm cores takes place, and, following processing of the snRNAs, re-import of the snRNPs to the nucleus to function in splicing. Sm core assembly is not a spontaneous process; it is performed by the SMN (survival of motor neurons) complex, an ATP-dependent assemblyosome composed of SMN, Gemins 2-8 and Unrip, which specifically identifies snRNAs and brings them together with Sm proteins, a recognition aided by a specific arginine modification of Sm proteins by the methylosome/PRMT5 (Neuenkirchen et al., 2008; Yong et al., 2004). Considerable insight into this pathway has come from studies of a devastating disease in which its namesake protein, SMN, is deficient.
Pre-mRNAs contain a “splicing code” made up of loosely defined consensus sequences that define the splice junctions and a bewildering number and diversity of relatively short cis-acting elements within exons as well as introns (Wang and Burge, 2008). These elements are bound by RNA-binding proteins, which have either positive (primarily SR proteins) or negative (primarily hnRNP proteins) effects on spliceosome assembly in their vicinity to maintain appropriate constitutive splicing or to regulate alternative splicing. Cells contain a large assortment of these RNA-binding proteins and their relative stoichiometries play important roles in determining splicing outcomes for many genes (Smith and Valcarcel, 2000). The repertoire of these is unique to each cell type and thus the regulation of expression and activity of these proteins is critical for normal alternative splicing. Thus, in addition to diseases caused by mutations in exons, including missense, nonsense, and frame-shift mutations in the open reading frame, which give rise to defective proteins, and mutations in UTRs that affect translation efficiency or mRNA stability, a large number of diseases are now known to result from intronic or exonic mutations that disrupt normal splicing patterns.
Co-transcriptionally, pre-mRNAs become associated with numerous RNA-binding proteins (hnRNP proteins) that influence their structure and interactions and play key roles in their splicing and polyadenylation. Upon splicing, additional proteins are acquired at the splice junctions (the exon junction complex; EJC) and numerous proteins are removed, and the metamorphosis of the mRNA-protein complexes (mRNPs) continues as they transit to the cytoplasm (Figure 1) (Dreyfuss et al., 2002; Maquat, 2004; Tange et al., 2004). Many of these proteins shuttle between the nucleus and the cytoplasm (Pinol-Roma and Dreyfuss, 1992), including hnRNP proteins and SR (serine-arginine-rich domain-containing) proteins. The functions they perform in the nucleus are not their last, as they also have roles in mRNA translation, stability and localization in the cytoplasm. RNA editing provides yet another mode to increase the complexity of the transcriptome (Schaub and Keller, 2002), and a further layer of regulation has now been revealed with the discovery of microRNAs (miRNAs), an abundant class of non-coding RNAs that regulate mRNA translation and stability (Valencia-Sanchez et al., 2006). The large number of proteins and regulatory RNAs in post-transcriptional RNA processing and the enormously intricate network of interactions among them provide cells with exquisite capacity to fine-tune their transcriptome and rapidly adjust their proteome in response to stimuli, but it also increases exposure to mutations and extends their vulnerability to mis-regulation that causes numerous diseases, particularly neuromuscular and neurodegenerative diseases and cancer.
Three consensus sequences are found at all exon-intron boundaries: the 5′ and 3′ splice sites at the 5′ and 3′ ends of the intron, respectively, and the branch point sequence (BPS) which is typically located within 30-50 nucleotides upstream of the 3′ splice site. These sequences contain a considerable amount of the information required for precise exon joining [estimated to be about half in some cases (Lim and Burge, 2001)] and are recognized through RNA-RNA and RNA-protein interactions with components of the spliceosome. The 5′ splice site binds to the U1 snRNP early during spliceosome assembly, then to U6 snRNP; the BPS binds SF1 and then U2 snRNP, and U2AF binds near the 3′ splice site. Mutations within the sequences that disrupt these interactions are responsible for 9-10% of the genetic diseases that are caused by point mutations (Cartegni et al., 2002; Wang and Cooper, 2007). The remaining information required for splicing is contained within relatively short (~6 nucleotide) sequences located within both exons and introns that either enhance or suppress splicing (Wang and Burge, 2008). Exonic splicing enhancers (ESEs) and exonic splicing silencers (ESSs) activate or repress splicing, respectively, from within exons while intronic splicing enhancers (ISEs) and silencers (ISSs) function from within introns. In contrast to the consensus splice sites, which are relatively well defined in sequence and position, the intricate code formed by these auxiliary splicing elements is only partially understood. While a significant portion of the code has been deciphered, it is not yet possible to determine from genomic sequence whether a disease-associated mutation will disrupt splicing. This is the case in part because these elements are composed of diverse classes of sequences, the full complement of which remains to be identified. 
There is also an incomplete understanding of the subtleties of the code including extensive interdependency between and within elements (Roca et al., 2008). These elements function through interactions with RNA-binding proteins and while the preferred binding sites for more than 40 proteins are known, primarily for SR and hnRNP proteins (Gabut et al., 2008), cognate binding proteins have been identified for only a minority of the growing list of putative splicing elements.
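As a rough illustration of how the positional information described above might be checked computationally, the following Python sketch tests a single intron for boundary dinucleotides and extracts the region in which a branch point would be sought. The GT...AG boundary rule for major-class introns is a textbook fact supplied here rather than taken from the text; the function name, the 50-nucleotide search window and the toy intron are assumptions made for illustration only.

```python
def describe_intron(intron: str, bps_window: int = 50) -> dict:
    """Report simple boundary features of one intron (DNA or RNA alphabet).

    Assumptions (not from the article): major-class introns begin with GT
    and end with AG, and the branch point is sought in the last
    `bps_window` nucleotides of the intron.
    """
    seq = intron.upper().replace("U", "T")  # accept RNA or DNA input
    return {
        "five_prime_ok": seq.startswith("GT"),   # 5' splice site dinucleotide
        "three_prime_ok": seq.endswith("AG"),    # 3' splice site dinucleotide
        "bps_search_region": seq[-bps_window:],  # where a BPS would be sought
    }


# Toy intron, invented for this example (76 nucleotides long).
TOY_INTRON = "GTAAGT" + "C" * 40 + "TACTAAC" + "T" * 20 + "CAG"
report = describe_intron(TOY_INTRON)
```

A check of this kind captures only the consensus boundaries; it says nothing about the auxiliary enhancer and silencer elements, which is precisely the part of the code that remains incompletely understood.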
Essentially all exons, whether alternatively or constitutively spliced, contain splicing elements (Wang and Burge, 2008). As a result, exons are a virtual minefield with regard to the effects of nucleotide substitutions on splicing. Up to 25% of synonymous (in terms of amino acid coding) substitutions can disrupt normal splicing, as can nonsynonymous substitutions and those that create termination codons (Pagani et al., 2005), emphasizing the importance of considering silent mutations as mediators of pathogenic effects. For some genes, up to 50% of point mutations within exons affect splicing, and it has been hypothesized that more than half of known disease-causing mutations disrupt splicing (Lopez-Bigas et al., 2005). A large number of exonic mutations that result in aberrant splicing have been documented (Cartegni et al., 2002; Wang and Cooper, 2007). Particularly instructive are disease-causing exonic mutations that have been studied extensively and have become focal points for various RNA-based therapeutic approaches.
A striking example of the detrimental effect that mutations in exonic splicing signals can have is a single nucleotide substitution in the survival of motor neuron 2 (SMN2) gene and its critical role in spinal muscular atrophy (SMA) (Lefebvre et al., 1995). SMN is a ubiquitously expressed protein that plays a critical role in snRNP biogenesis and is essential for viability of all cells in divergent eukaryotes (Neuenkirchen et al., 2008; Yong et al., 2004). SMA is a common, often fatal, motor neurodegenerative disease and a leading genetic cause of infant mortality (Talbot and Davies, 2001; Wirth et al., 2006). SMA severity corresponds to the degree of functional SMN protein deficiency. In humans, there are two SMN genes, SMN1 and SMN2, both of which encode the same open reading frame. The vast majority of SMA patients have deletions of the SMN1 gene but retain SMN2, exposing the functional deficiency caused by a single nucleotide substitution, a C→T change at position 6 in exon 7 of SMN2. This mutation, though it does not change the amino acid coding, significantly alters the splicing pattern of the SMN2 pre-mRNA, causing frequent skipping of exon 7 that produces an inactive and unstable protein lacking the last 16 amino acids (Figure 2A). Thus, SMN2 is much less effective than SMN1 in producing SMN protein (Wirth et al., 2006). Two models have been proposed to explain exon 7 skipping in SMN2. One is that the substitution disrupts an ESE to which the splicing activator ASF/SF2 binds (Cartegni and Krainer, 2002), and the other is that it creates an ESS to which the splicing suppressor hnRNP A1 binds (Kashima and Manley, 2003; Kashima et al., 2007). Though they appear contradictory, these models are in fact compatible (Cartegni et al., 2006) and underscore the precariousness of splicing signals in that a single nucleotide change can convert an ESE into an ESS.
It is intriguing that critical RNA binding sites for ubiquitous, abundant but functionally opposing splicing factors differ by a single nucleotide, and it can be anticipated that there are many more cases where splicing is altered because of the ease with which the balance of positively and negatively acting splicing factors (ASF/SF2-hnRNP A1, and many others) can be shifted. Because of SMN's critical function in the biogenesis of snRNPs and splicing regulation, this SMN2 substitution has significant ramifications for splicing (described below).
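The idea that a substitution can leave the encoded amino acids untouched while changing every short sequence element that overlaps it can be sketched in a few lines of Python. The 18-nucleotide sequence below is illustrative only: it is not taken from the text and has not been verified against a reference database, and the helper names are hypothetical. The codon table is the standard genetic code.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate complete codons in frame 0; trailing bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def substitute(dna: str, position: int, new_base: str) -> str:
    """Return dna with the base at a 1-based position replaced."""
    i = position - 1
    return dna[:i] + new_base + dna[i + 1:]


def hexamers_overlapping(dna: str, position: int, k: int = 6) -> list:
    """All k-mers that include the base at a 1-based position."""
    i = position - 1
    first = max(0, i - k + 1)
    last = min(i, len(dna) - k)
    return [dna[s:s + k] for s in range(first, last + 1)]


# Illustrative exon fragment (an assumption, not a verified sequence).
REFERENCE = "GGTTTCAGACAAAATCAA"
MUTANT = substitute(REFERENCE, 6, "T")  # C-to-T change at position 6

same_protein = translate(REFERENCE) == translate(MUTANT)
changed = [
    (ref, mut)
    for ref, mut in zip(hexamers_overlapping(REFERENCE, 6),
                        hexamers_overlapping(MUTANT, 6))
    if ref != mut
]
```

In this toy case the translation is identical before and after the change, yet all six hexamers that span position 6 differ, so any enhancer or silencer that is read as a hexamer at that location could be created or destroyed. Whether a given change actually affects splicing cannot be decided from sequence comparison alone.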
The massive (2.4 Mb) dystrophin gene, most of which is in its 78 introns, is a splicing accident waiting to happen, and it does with an incidence of 1:3,000 of male births in Duchenne muscular dystrophy (DMD). DMD is caused by loss-of-function mutations and while >65% of DMD mutations are genomic deletions, a large number of exonic and intronic point mutations cause disease through aberrant splicing. Dystrophin is positioned at the cytoplasmic side of the skeletal muscle sarcolemma where it communicates signals between the extracellular matrix and the cellular contractile apparatus and it stabilizes the cell membrane, which must withstand repeated contraction-induced distortions. The protein contains four domains: an actin-binding domain at the N-terminus, a linker region of twenty-four spectrin-like repeats, a cysteine rich domain and a C-terminal domain (Nowak and Davies, 2004). A particularly revealing T->A substitution in exon 31 not only creates a premature termination codon (PTC), but also introduces an ESS that binds to hnRNP A1, resulting in partial exon skipping (Disset et al., 2006) (Figure 2B). Interestingly, mRNAs lacking this exon lose coding for one spectrin-like repeat but retain the correct reading frame to produce a partially functional protein, explaining why an individual harboring this mutation has a milder form of the disease than would be expected given the presence of a PTC. In this example, the specific mechanism of pathogenesis would have been missed if only the PTC mutation detected in genomic DNA had been considered, demonstrating the need to determine the effects of mutations on splicing directly by using RNA from the affected tissues. Examples such as this one also suggested the potential therapeutic benefits of inducing exon skipping to restore reading frame (see below).
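The reading-frame argument in the dystrophin example comes down to simple arithmetic: removing an internal coding exon keeps the downstream frame intact only when the exon length is a multiple of three. A minimal Python sketch follows; the exon lengths are invented for illustration and are not dystrophin exon lengths, and the function names are hypothetical.

```python
def frameshift_after_skipping(exon_length: int) -> int:
    """Number of bases by which skipping this exon shifts the frame (0, 1 or 2)."""
    return exon_length % 3


def frame_preserving_skips(exon_lengths: list) -> list:
    """1-based indices of exons whose removal leaves the reading frame intact.

    Assumes every exon in the list is an internal, fully coding exon.
    """
    return [
        number
        for number, length in enumerate(exon_lengths, start=1)
        if frameshift_after_skipping(length) == 0
    ]


# Hypothetical internal coding exon lengths, in nucleotides.
TOY_EXON_LENGTHS = [120, 88, 111, 150, 62]
in_frame = frame_preserving_skips(TOY_EXON_LENGTHS)
```

Here exons 1, 3 and 4 could be skipped without disturbing the frame, whereas skipping exon 2 or 5 would shift it. A frame-preserving skip still removes the amino acids encoded by that exon, so whether the shortened protein retains function has to be established experimentally.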
As another example, mutations within and downstream from the alternatively spliced exon 10 of the microtubule-associated protein tau (MAPT) gene, which encodes the tau protein, disrupt the required 1:1 ratio of mRNAs that include or exclude this exon. Exon 10 encodes the fourth of four microtubule-binding domains (R) and disruption of the balance between 4R-tau and 3R-tau isoforms (Figure 2C) results in hyperphosphorylation and aggregation of tau proteins into neurofibrillary tangles that are hallmarks of several neurodegenerative diseases such as Alzheimer’s disease (AD). This exon and surrounding introns are dense with splicing regulatory elements, which is indicative of the degree to which splicing of this exon is controlled. Numerous mutations within and around MAPT exon 10 disrupt exonic and intronic splicing elements and cause the inherited neuropathological disorder frontotemporal dementia with Parkinsonism linked to chromosome 17 (FTDP-17), demonstrating a direct relationship between aberrant tau expression due to alternative splicing disruption and neuropathology (Liu and Gong, 2008).
Also high on the menu of disease-causing splicing mutations is the cystic fibrosis transmembrane conductance regulator (CFTR) gene, which encodes a transmembrane chloride channel required for proper function of secretory epithelium in multiple tissues including lung, small intestine, and testes. A phenylalanine deletion (ΔF508) causes a severe loss of CFTR protein function that is responsible for more than 50% of the cases of cystic fibrosis (CF) in the United States. Scores of mutations elsewhere in the gene produce milder forms of the disease (atypical CF); however, individuals with the same mutation were found to exhibit significant differences in disease severity. One mechanism for these differences was found to involve polymorphic (UG)m and (U)n tracts within the 3′ splice site of CFTR exon 9, which exhibits slight exon skipping even from the normal allele in unaffected individuals. Individuals with a longer (UG)m tract exhibit more exon skipping, in part through increased binding of TDP-43 (Figure 2D) (Buratti et al., 2004). The severity of the effects of mutations elsewhere in the CFTR gene is modulated by the level of exon 9 inclusion and ultimately the amount of full-length functional protein that is expressed.
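Measuring the lengths of adjacent (UG)m and (U)n tracts in a sequence is a simple pattern-matching problem, sketched below in Python with the standard `re` module. The sequences and repeat counts are made up for the example and are not real CFTR alleles; the function name is hypothetical, and the choice to report the match with the longest (UG) run is an assumption.

```python
import re

# One or more TG repeats immediately followed by a run of T.
TRACT = re.compile(r"((?:TG)+)(T+)")


def measure_tracts(sequence: str):
    """Return (m, n) for the longest (TG)m run that is followed by a (T)n run.

    Accepts RNA or DNA letters; returns None if no such tract is present.
    """
    seq = sequence.upper().replace("U", "T")
    best = None
    for match in TRACT.finditer(seq):
        m = len(match.group(1)) // 2  # number of TG dinucleotide repeats
        n = len(match.group(2))       # number of consecutive T residues
        if best is None or m > best[0]:
            best = (m, n)
    return best


# Invented intron ends with different tract lengths.
allele_a = "AA" + "TG" * 11 + "T" * 7 + "AACAG"
allele_b = "AA" + "UG" * 13 + "U" * 5 + "AACAG"
tracts_a = measure_tracts(allele_a)
tracts_b = measure_tracts(allele_b)
```

Comparing the returned (m, n) pairs between individuals gives the kind of genotype information that would then be correlated with exon 9 skipping; the sketch makes no prediction about splicing itself.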
Genetic variation is the predominant determinant of individual phenotypic differences, including those that are relevant to disease: disease severity (modifiers), disease susceptibility, development of the multigenic common diseases, and differences in positive and negative responses to therapeutic compounds. Recent results indicate an unexpected level of variability in the expression of different splice variants between individuals, strongly suggesting a significant role for splicing in phenotypic differences that are relevant to disease (Graveley, 2008; Wang and Cooper, 2007). An early estimate of the effects of genetic variation on splicing found that up to ~21% of alternatively spliced genes could be influenced by local polymorphisms (Nembaware et al., 2004). Combined analyses of mRNA expression (including splicing differences) using exon tiling microarrays linked with polymorphism determination using the CEU HapMap revealed that genetic variants affected differences in mRNA structure due to alternative splicing and in 5′ and 3′ end selection more than differences in whole gene mRNA levels (Kwan et al., 2008). Particularly telling is a recent analysis using deep sequencing for global quantification of alternative splice variants in cerebellar tissue from six individuals which revealed that up to 30% of the splicing events exhibited individual-specific differences reflective of striking variability in cis as well as trans-acting splicing environments (Wang et al., 2008). These studies strongly support the contention that polymorphisms located within introns and exons have a significant impact on individual variation of gene expression at the level of RNA processing and strongly contribute to individual differences in disease severity, susceptibility, and therapeutic responses. 
The future goal of providing personalized medicine will require a complete understanding of the effects of genetic variation on the splicing code as well as on the nuclear splicing machinery that interprets the code.
Mutations in the constituents of the splicing machinery or any of the trans-acting splicing components and their regulators have the potential to affect expression of a large number of genes, directly or through secondary effects. Important themes that have emerged over the past several years are that mutations in ubiquitous spliceosome components and splicing factors can cause tissue-specific pathogenesis, and that changes in the relative stoichiometry of general splicing factors can cause profound changes in splicing patterns and cause disease. Splicing abnormalities have been found to be particularly prevalent in neurodegenerative disease and in cancer. Though specific causality in many cases is difficult to prove, this observation brings to the forefront the enormous vulnerability of splicing, its potential pathogenicity and the potential utility of splicing patterns as signatures (biomarkers) of specific diseases and as targets for therapeutic intervention.
Although the spliceosome is composed of a large number of proteins and five snRNPs, very few mutations in core spliceosome components have been found, suggesting that such mutations are incompatible with life either at the cellular level or in early development. Interestingly, mutations in four proteins (PRPF31, PRPF8, PRPF3, RP9) are associated with disruption of basal spliceosome function and all cause autosomal dominant forms of retinitis pigmentosa (RP) (Mordes et al., 2006). All four proteins are components of the U4/U5/U6 tri-snRNP complex, which joins the assembling spliceosome to form the catalytic center. Dominant inheritance suggests that a single functional allele does not provide a sufficient level of spliceosome function (haploinsufficiency) in photoreceptor neurons, the affected cells. For example, although a general defect in splicing was not detected in lymphoblast cell lines derived from RP patients containing null alleles for PRPF31 (Rivolta et al., 2006), specific pre-mRNAs were found to be sensitive to PRPF31 mutations in transient expression assays (Mordes et al., 2007). The intriguing sensitivity of photoreceptor neurons could reflect an inability to produce sufficient levels of highly expressed mRNAs such as that from the rhodopsin gene (mutations in this gene also cause dominant forms of the disease). However, alternative mechanisms are possible, such as photoreceptor-specific sensitivity to aggregates formed from accumulation of mutant proteins (Mordes et al., 2007).
The best-characterized function of the SMN complex is its essential role in the biogenesis of snRNPs (Neuenkirchen et al., 2008; Yong et al., 2004). SMN deficiency causes a corresponding deficit in snRNP assembly capacity (Wan et al., 2005). Motor neuron developmental abnormalities could be rescued by injection of purified snRNPs in SMN-deficient zebrafish embryos (Winkler et al., 2005) and reduced snRNP assembly has been demonstrated in an SMN-deficient SMA mouse model (Gabanella et al., 2007; Zhang et al., 2008b). Recent studies in these mice also revealed profound perturbations in RNA metabolism, including changes in snRNAs and numerous splicing abnormalities (Zhang et al., 2008b). Surprisingly, instead of a uniform decrease in the steady-state levels of all the snRNAs, each snRNA is affected differently in every tissue of SMN-deficient mice, resulting in different snRNA stoichiometries and an altered snRNP repertoire in every tissue (Gabanella et al., 2007; Zhang et al., 2008b). In addition, the SMN deficiency causes widespread and cell-type-specific splicing abnormalities affecting numerous mRNAs of functionally diverse genes, including shifts in the ratios of known alternatively spliced isoforms as well as many splicing defects of constitutive exons (Zhang et al., 2008b). Knockdown of spliceosomal proteins in yeast and Drosophila has shown that depletion of general and constitutive components of the splicing machinery can have differential effects on splicing of specific pre-mRNAs and on alternative splicing (Clark et al., 2002; Park et al., 2004; Pleiss et al., 2007). Although the mechanism behind the splicing defects in SMN deficiency is unknown, and SMN is not itself a splicing factor, it is reasonable to consider the possibility that changes in the snRNP repertoire affect the efficiency, rate and fidelity of spliceosome assembly on different introns. Alternatively, or in addition, SMN may have direct roles in splicing.
These observations illustrate how a deficiency in a ubiquitously-expressed housekeeping protein may cause tissue-specific outcomes and invite a new perspective on the pathophysiology of the disease spinal muscular atrophy (SMA). A conundrum in SMA has been to reconcile the known housekeeping function of SMN with the apparent selectivity of SMA pathogenesis to motor units (anterior horn α-motor neurons and the muscles they innervate). It is likely that tissue-specific factors functioning in concert with SMN determine a cell’s snRNP repertoire. Because each cell type has a unique assortment of splicing factors and SMN deficiency causes its snRNP repertoire to change in a unique way, the resulting perturbations are distinct and give rise to cell-type specific effects on splicing. These findings also reveal a key role for the SMN complex in RNA metabolism and splicing regulation, and indicate that SMA involves splicing abnormalities throughout the organism, and not only in motor units. Although recent studies suggest that the extent of alternative splicing is much greater than was previously believed, it is nevertheless clear that each tissue produces a consistent pattern of alternatively spliced mRNA isoforms (Castle et al., 2008; Pan et al., 2008; Wang et al., 2008). It is thus not apparent why other tissues in the SMA mice do not show overt pathology despite numerous splicing abnormalities. The extent to which abnormal mRNAs are translated will need to be determined to address this issue. In addition to SMN’s role in snRNP biogenesis, it likely plays a role in the biogenesis or function of other RNPs possibly including complexes containing the Sm-like proteins (Lsm proteins) with which it interacts, as well as snoRNPs, hnRNPs and mRNPs (Neuenkirchen et al., 2008; Rossoll et al., 2003; Yong et al., 2004; Zhang et al., 2003). It is therefore possible that SMN deficiency has other consequences on RNA metabolism in addition to its effects on snRNPs and splicing. 
Significant questions remain, including whether the change in the snRNP repertoire is the direct cause of the splicing abnormalities, if SMN or the SMN complex has other roles in splicing, and whether the demise of motor neurons is caused by mis-splicing of one or several specific mRNAs or by the cumulative effect of many splicing abnormalities.
Like the major spliceosomal snRNPs, many of the hnRNP and SR proteins that regulate splicing are present in vast excess over their high-affinity binding sites on pre-mRNAs (Dreyfuss et al., 2002; Dreyfuss et al., 1993). Yet even moderate changes in expression of one of them, and hence their relative stoichiometry, can have significant effects on alternative splicing (Caceres and Kornblihtt, 2002; David and Manley, 2008; Martinez-Contreras et al., 2007; Smith and Valcarcel, 2000). Furthermore, many of the hnRNP and SR proteins shuttle continuously between the nucleus and the cytoplasm (Caceres et al., 1998; Pinol-Roma and Dreyfuss, 1992) and their sub-cellular distribution can shift in response to stress signals. The resulting change in stoichiometry of splicing factors in the nucleus could also modulate alternative splicing patterns. A diverse set of diseases is associated with changes in expression of RNA-binding proteins involved in splicing and splicing regulation (Gabut et al., 2008; Lukong et al., 2008; Ule, 2008).
Recent studies suggest a role for several RNA-binding proteins in neurological and psychiatric diseases. TDP-43, initially identified as binding to the HIV trans-acting response element (TAR), and shown to be a transcriptional repressor of HIV, is an hnRNP protein similar in general structure to hnRNP A1 and A2. It has been found to have several different associations with disease (Buratti and Baralle, 2008). In one, TDP-43 binds to the polymorphic (UG)m repeats within intron 8 of the pre-mRNA of CFTR (the gene mutated in cystic fibrosis) such that increased binding to longer (UG)m repeats causes increased exon 9 skipping. Recent studies have shown that TDP-43, which is normally nuclear, is found in cytoplasmic inclusions in affected regions of the brains of individuals with frontotemporal dementia (FTD), Alzheimer’s disease and amyotrophic lateral sclerosis (ALS) (Ule, 2008). In these patients, TDP-43 is depleted from nuclei and the protein that accumulates in the cytoplasmic inclusions is ubiquitinated, cleaved or abnormally phosphorylated. These findings represent a significant advance and provide a useful histological marker that correlates well with FTD. Mutations in the TARDBP gene, which encodes TDP-43, are also associated with both sporadic and familial forms of ALS, demonstrating a role for TDP-43 in the pathogenesis of neurological diseases (Daoud et al., 2008; Kabashi et al., 2008; Rutherford et al., 2008; Sreedharan et al., 2008). It remains to be determined whether the pathogenic mechanism relates to toxicity of the cytoplasmic aggregates, loss of TDP-43’s nuclear function, or a combination of both.
Another RNA binding protein that has recently been implicated in neurological disorders is quaking (QKI). QKI is the human homologue of the gene mutated in the spontaneous mouse dysmyelination mutant, quaking viable (qkv). The protein is a member of the STAR family, which contains domains that link RNA binding with signal transduction. QKI expresses nuclear and cytoplasmic isoforms via alternative splicing and it regulates both splicing and mRNA stability; putative pre-mRNA and mRNA targets have been identified (Chenard and Richard, 2008). QKI expression in oligodendrocytes is required for normal myelination in mice and it plays a fundamental role in differentiation of oligodendrocyte precursor cells in culture for which RNA binding activity is required (Chen et al., 2007). QKI is a strong candidate gene for schizophrenia susceptibility based on genetic linkage data as well as mRNA expression studies. These findings demonstrate reduced expression of QKI mRNA as well as putative QKI target mRNAs in disease-relevant regions from post-mortem patient brain tissue samples (Lauriat et al., 2008). Interestingly, a large-scale analysis of interactions among proteins associated with inherited ataxias identified a number of RNA binding protein families including QKI as well as Nova and Fox (Lim et al., 2006). Nova is one of the best-characterized regulators of an alternative splicing network, coordinating expression of proteins modulating synaptic function (Ule et al., 2005) and Fox is an important splicing regulator in several tissues including the brain (Underwood et al., 2005). An analysis of the Fox splicing regulatory network identified thousands of putative targets with enrichment for neuromuscular functions and genes associated with diseases affecting neurons and striated muscle where Fox expression is high (Zhang et al., 2008a). 
In addition, mutations in the A2BP1/FOX-1 gene are associated with mental retardation, epilepsy, and autism (Bhalla et al., 2004; Sebat et al., 2007), strongly suggesting a direct role in disease. Both Nova and Fox proteins are likely to have other RNA processing roles such as regulating alternative polyadenylation site selection (Licatalosi et al., 2008). These results are highly suggestive of further roles of RNA binding proteins in causing and modifying the severity of neurological diseases.
Disruption of splicing has long been thought of as a hallmark of a variety of cancers; however, recent computational analyses using cancer-derived expressed sequence tags (ESTs) found that the prevalence of alternatively spliced genes was slightly lower in tumors compared to normal tissue. What differed between normal and cancerous tissues were the sets of genes rather than the number of alternatively spliced genes (Kim et al., 2008). The major difficulty has been to determine if the splicing changes detected in cancer are pathogenic; however, the cause-effect relationships between the production of cancer-associated splice variants and cancer formation and progression are becoming clearer. Cis-acting mutations cause aberrant splicing and inappropriate expression of cancer-associated genes. Several genes use a balance between hnRNP and SR proteins to regulate alternative splicing, producing appropriate ratios of pro- and anti-apoptotic isoforms (Srebrow and Kornblihtt, 2006). However, it has also become clear that the mis-regulation of alternative splicing in cancer cells is extremely complex in that the effects of a splicing regulator on the same splicing event can differ between cells. A test of the effects of knockdown of 14 hnRNP proteins on 56 alternative splicing events of apoptotic genes in three cancer/immortalized cell lines revealed a surprisingly high level of cell line-specific effects (Venables et al., 2008). The results indicate that cell-specific differences in trans-acting regulatory environments make it difficult to predict the effects of changes in individual splicing factors.
Despite these complexities, strong links have been established between altered expression of specific splicing factors, aberrant splicing of their pre-mRNA targets, and induction of signaling pathways that are relevant to transformed or malignant cellular phenotypes. A particularly coherent picture has emerged for the SR protein SF2/ASF, which is elevated in several cancers and induces transformation when overexpressed at low levels. SF2/ASF overexpression induces splicing to produce an oncogenic isoform of the ribosomal protein S6 kinase-β1 (S6K1), a regulator of translation, and knockdown experiments demonstrated that expression of this isoform is required for transformation by SF2/ASF (Karni et al., 2007). Elevated SF2/ASF also promotes splicing of the RON kinase to a constitutively active form that has been associated with invasive behavior in several cancers. The effect of SF2/ASF on RON splicing is direct via binding to an exonic splicing enhancer (ESE) within the downstream exon. The invasive phenotype of human cancer cells is reversed by knockdown of endogenous SF2/ASF or the constitutively active RON splice variant (Ghigna et al., 2005). Expression of a number of other splicing factors is also altered in cancer (Grosso et al., 2008), however, the relevance to cancer formation or progression remains to be established. It is also important to consider that most if not all proteins that regulate splicing are multifunctional. For example, recent results suggest that transformation induced by SF2/ASF could occur by a mechanism that is independent of splicing (Karni et al., 2008).
Importantly, cancer-related changes in splicing and splicing factors are nevertheless potentially useful biomarkers for cancer diagnosis and classification. Tumor classification is critical for accurate diagnosis, prognosis, selection of the most effective therapeutic approach, and follow-up treatment. Analysis of global alternative splicing patterns by high throughput RT-PCR, splicing-sensitive microarrays and deep sequencing provides the opportunity for an analysis of unprecedented specificity to link molecular signatures to predictions of therapeutic efficacy.
In an unexpected twist for a molecule thought of as a passive substrate, RNA expressed from a mutated allele can be pathogenic. This is a not so subtle reminder that RNA is an intrinsically potent molecule. This mechanism of disease is observed in a class of so-called microsatellite expansion disorders. Microsatellites are short (1 to 10 nucleotide) repeats that vary in number between individuals and cause disease when located in a gene and expand beyond a normal threshold number of repeats. The expansions can cause disease by three mechanisms which are not mutually exclusive: loss of protein function, gain of aberrant protein function due to expansion of triplet repeats within the open reading frame, or gain-of-function of the RNA containing the expansion (Orr and Zoghbi, 2007). There is clear evidence for an RNA gain-of-function in four human diseases: myotonic dystrophy types 1 (DM1) and 2 (DM2), fragile X-associated tremor ataxia syndrome (FXTAS), and spinocerebellar ataxia 8 (SCA8). An RNA gain-of-function mechanism is also likely in SCA10, SCA12, and Huntington’s disease-like 2 (HDL2) (Figure 3) (O’Rourke and Swanson, 2008).
The pathogenic mechanism of an RNA gain-of-function is best characterized for DM1 in which a CUG repeat located within the 3′ UTR of the DMPK mRNA expands beyond its normal range of 5-38 repeats to a pathogenic range of 50 to >2,500 repeats. This mutation results in the second most common cause of muscular dystrophy as well as a variety of multisystemic features that include cardiac arrhythmias and central nervous system dysfunction. An early hypothesis, proven to be correct, was that the RNA repeats disrupt the functions of specific RNA binding proteins in trans (Timchenko et al., 1996; Wang et al., 1995). Proteins identified based on their affinity for CUG RNA repeats include CUG binding protein 1 (CUGBP1) and muscleblind-like 1 (MBNL1) (O’Rourke and Swanson, 2008). Both proteins, or their paralogues in the CELF and MBNL gene families, are regulators of several aspects of nuclear and cytoplasmic RNA processing including alternative splicing, mRNA stability, translation, mRNA localization, and editing (Barreau et al., 2006; Pascual et al., 2006). The best-characterized pathogenic effect of expanded CUG RNA is its disruption of a program of postnatal splicing transitions in striated muscle. At the heart of this mechanism is the disrupted balance of antagonistic regulation by MBNL1 and CUGBP1 (Figure 3B). The result is the inappropriate expression of embryonic rather than adult splice variants in adult tissues and consequent manifestations of disease.
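The repeat ranges quoted above for DM1 can be made concrete with a small computational sketch. The Python below is purely illustrative and is not a diagnostic method: the function names `longest_repeat_run` and `classify_dm1_repeat` are hypothetical, the thresholds are simply the 5-38 and 50-or-more figures stated in the text, and repeat counts falling between or below those ranges are left unclassified because the text does not address them.

```python
import re

# Thresholds taken from the text: normal 5-38 repeats, pathogenic 50 to >2,500.
NORMAL_RANGE = (5, 38)
PATHOGENIC_MIN = 50


def longest_repeat_run(seq: str, unit: str = "CUG") -> int:
    """Count the units in the longest uninterrupted tandem run of `unit`.

    DNA input (CTG) is accepted and converted to the RNA alphabet first.
    """
    rna = seq.upper().replace("T", "U")
    unit = unit.upper().replace("T", "U")
    runs = re.findall(f"(?:{re.escape(unit)})+", rna)
    return max((len(run) // len(unit) for run in runs), default=0)


def classify_dm1_repeat(count: int) -> str:
    """Place a CUG repeat count relative to the ranges stated in the text."""
    if count >= PATHOGENIC_MIN:
        return "pathogenic range"
    if NORMAL_RANGE[0] <= count <= NORMAL_RANGE[1]:
        return "normal range"
    # Counts below 5 or between 39 and 49 are not classified in the text.
    return "outside the ranges given in the text"


if __name__ == "__main__":
    toy_utr = "AA" + "CTG" * 12 + "GG"  # toy sequence, not the real DMPK 3' UTR
    n = longest_repeat_run(toy_utr)
    print(n, classify_dm1_repeat(n))
```

In practice, expansions of this size are difficult to measure from short sequence reads, so a direct string scan such as this one applies only to sequences that already span the whole repeat tract.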
A dramatic cellular hallmark of DM1 is the accumulation of expanded CUG RNA in nuclear foci. In one of two pathogenic effects of the RNA, MBNL1 colocalizes with the RNA foci and is depleted from the nucleoplasm more than two-fold resulting in aberrant regulation of MBNL1-sensitive splicing events (Lin et al., 2006). Interestingly, MBNL1 binds to the expanded CUG repeats because the RNA forms an extended stable hairpin structure that resembles its natural intronic binding site found near targeted alternative exons (Warf and Berglund, 2007; Yuan et al., 2007). Mouse models that produce Mbnl1 depletion by either an Mbnl1 gene knockout (Mbnl1ΔE3/ΔE3) or expression of 250 CTG repeats within a human skeletal α actin transgene (HSALR) reproduce muscle pathology and expression of embryonic splicing patterns as observed in DM1 skeletal muscle (Kanadia et al., 2003; Lin et al., 2006). In the mouse models and individuals with DM1, the embryonic splicing pattern for the muscle specific chloride channel (Clcn1) introduces a premature termination codon resulting in loss of Clcn1 function and the myotonia that is characteristic of the disease (Lueck et al., 2007). Delivery of MBNL1 via adeno-associated virus to HSALR skeletal muscle reverses the Clcn1 and other splicing changes as well as the myotonia (Kanadia et al., 2006) demonstrating the link between MBNL1 depletion and a clinical feature of the disease. A role for MBNL1 is also supported by results from Drosophila DM1 models in which human MBNL1 homologues suppress the severe phenotype induced by CUG repeat RNA in both eye and muscle (de Haro et al., 2006). Disruption of MBNL1 function could play a pathogenic role in other microsatellite disorders such as FXTAS in which male carriers of the fragile X pre-mutation in the FMR1 gene develop a late onset neurological syndrome, whereas female carriers develop premature ovarian insufficiency (POI) (Figure 3A) (Hagerman and Hagerman, 2004). 
Nuclear inclusions found in affected regions of FXTAS brain samples contain MBNL1 as well as hnRNPA2 (Iwahashi et al., 2006). MBNL1 also colocalizes with nuclear RNA foci found in neurons from individuals affected with HDL2 in which the mutated junctophilin-3 (JPH3) gene contains expanded CTG repeats (Rudnicki et al., 2007).
In addition to depleting RNA binding proteins, CUG repeat RNA has the provocative effect of inducing a signaling event resulting in activation of protein kinase C (PKC) (Figure 3C). One consequence of PKC activation is hyperphosphorylation and stabilization of CUGBP1 protein, which explains the observations that CUGBP1 protein is upregulated 2-4 fold in DM1 heart and skeletal muscle without an increase in mRNA levels (Nezu et al., 2007). PKC activation and CUGBP1 hyperphosphorylation and upregulation are direct and rapid effects as all three events were observed six hours after induction of a CUG repeat-containing transgene mRNA in heart tissue from an inducible DM1 mouse model (Kuyumcu-Martinez et al., 2007; Wang et al., 2007). The mechanism by which expanded CUG RNA activates PKC is unknown. The possibility that PKC activation is secondary to depletion of MBNL1 is unlikely since neither the Mbnl1ΔE3/ΔE3 nor HSALR mouse models exhibit elevated CUGBP1 (Kanadia et al., 2003; Lin et al., 2006). Recent results demonstrate that a large subset of splicing transitions that occur during the first two weeks of postnatal heart development are controlled by MBNL1 upregulation and CUGBP1 downregulation. Interestingly, CUGBP1 mRNA levels do not change and the ten-fold decrease in protein expression appears to involve protein dephosphorylation and destabilization (Kalsotra et al., 2008; Kuyumcu-Martinez et al., 2007). These results suggest the possibility that expanded CUG repeats are aberrantly stimulating a natural signaling event that maintains CUGBP1 protein stability in embryonic striated muscle.
DM2 is caused by expanded CCTG repeats in intron 1 of the ZNF9 gene and is clinically similar to DM1 but is generally less severe and lacks a congenital form (Liquori et al., 2001). Although MBNL1 is sequestered on expanded CCUG repeat RNA, which forms a hairpin structure and accumulates in nuclear foci like CUG repeat RNA, CUGBP1 is not induced (Lin et al., 2006), suggesting that DM2 is primarily a disease of MBNL depletion. The lack of CUGBP1 up-regulation suggests that PKC is not activated by expanded CCUG RNA and could provide insights into the cis requirements for PKC activation by RNA, such as whether CCUG versus CUG repeats or a gene-specific context is required.
RNA generated from microsatellite expansions also has the potential to produce pathogenicity through RNA interference pathways. Expanded repeats containing CUG, CAG, and to a lesser extent CCG or CGG are processed into ~21 nucleotide RNAs after folding into single imperfect hairpins that are cleaved by Dicer (Krol et al., 2007). In addition, antisense transcripts have been detected for the expanded repeats of the DMPK, SCA8, and FMR1 genes (Cho et al., 2005; Ladd et al., 2007; Nemes et al., 2000) providing a second potential source of small RNAs. Small CUG or CAG RNAs have been detected in cells from DM1, SCA1, and Huntington’s disease patients (Cho et al., 2005; Krol et al., 2007). Evidence for downstream effects of CUG siRNAs comes from knockdown of Dicer in DM1 cells that resulted in elevated levels of normal endogenous CAG repeat-containing mRNAs (Krol et al., 2007). Although these results provide a proof of principle, targets relevant to DM1 pathogenesis remain to be identified.
With the exception of SCA12, disease-causing CAG expansions are within open reading frames and result in the production of proteins containing expanded polyglutamine tracts (polyQ) (Orr and Zoghbi, 2007). Disease-causing CAG repeat expansions can result in protein loss- or gain-of-function or both, depending on the gene affected, however, whether expanded CAG RNA can contribute to pathogenicity depends on the experimental system. The phenotype observed in a transgenic mouse model for SCA1 that expresses a pathogenic human ataxin-1 mRNA containing 82 CAG repeats [ataxin-1-(CAG)82] is not observed in mice expressing equivalent levels of mRNA containing a point mutation that inactivates a protein nuclear localization signal. These results convincingly demonstrate that the ataxin-1-(CAG)82 mRNA alone is not pathogenic in mouse Purkinje cells (Klement et al., 1998). Strong support for a CAG RNA gain-of-function mechanism comes from a Drosophila model in which expression of a truncated pathogenic human ataxin-3-(CAG)78 mRNA causes neurotoxicity (Li et al., 2008). The role for RNA toxicity came to light from modifier screens that identified mbl, the Drosophila MBNL homologue. In addition, expanded CAG repeats located in the 3′ UTR of a heterologous gene (DsRED), demonstrated not to be expressed as polyQ protein or CUG antisense RNA, were also pathogenic, directly demonstrating a pathogenic role for CAG repeat RNA. Expression of polyQ protein from nonpathogenic interrupted CAA/CAG repeat RNA induced some pathogenicity indicating that both polyQ protein and CAG RNA contribute to the full phenotype observed in the fly model. These results open the door for consideration of RNA gain-of-function effects for expanded CAG repeats in human disease. 
Interestingly, whereas MBNL1 suppressed CUG RNA toxicity in the Drosophila DM1 model, fly and human MBNL1 homologues enhanced the pathogenic effects of CAG RNA suggesting different pathogenic mechanisms for CUG and CAG RNA toxicity (Li et al., 2008).
A different set of RNA-binding proteins is associated with the expanded CGG repeats that cause FXTAS. HnRNP A2 and Pur-α were identified in purified nuclear inclusions from brains of a FXTAS mouse model (Jin et al., 2007). Both proteins directly interact with CGG repeat RNA and expression of each protein suppressed the neurodegeneration phenotype in a Drosophila FXTAS model (Jin et al., 2007). CUGBP1 expression also suppressed the phenotype in flies and was shown to interact with repeat RNA indirectly by binding to hnRNP A2 (Sofola et al., 2007). How these proteins fall into a mechanism of pathogenesis is not yet clear. Pur-α knockout mice exhibit an ataxia phenotype, consistent with a depletion model in FXTAS; however, additional effects of the RNA are possible as demonstrated in DM1. It is also important to consider that suppression of an RNA gain-of-function phenotype in Drosophila models, particularly when using human proteins, could result not only from restoring the function of a depleted Drosophila protein but also from rendering the RNA nontoxic by a different mechanism such as preventing RNA aggregation or interactions that lead to toxicity. Understanding the specific mechanism of suppression would provide information regarding the pathogenic mechanism of disease as well as potential therapeutic approaches.
Many approaches have been explored and many more can be envisioned to modify the splicing pattern of a mutant pre-mRNA or eliminate an mRNA that bears a disease-causing mutation to achieve therapy. Increasing knowledge of RNA biology and chemistry is stimulating renewed efforts to target the RNA itself or the splicing and translational machinery as entry points for therapeutic intervention. Here, we discuss several strategies including antisense oligonucleotides, antisense snRNAs, RNA interference and small molecules (see Analysis by L. Bonetta on page XXX of this issue). Other methods including trans-splicing and ribozymes will likely emerge as additional options in the future.
Antisense oligonucleotides (AOs) have been used extensively to redirect splicing of a mutation-bearing pre-mRNA to prevent it from generating a disease-causing mRNA and forcing it to splice into a disease-rescuing mRNA (Sazani and Kole, 2003). Generally, AOs have been designed to hybridize and block one or more sequences in the target pre-mRNA that are critical for the particular splicing event that one wishes to inhibit and cause the splicing machinery to select an alternative pattern whose outcome is more favorable. To be effective, the AO’s target RNA sequence needs to be accessible in the native RNP, which not all sequences are, and it must be specific. However, it is difficult to predict if these requirements will be met, as well as whether the desired splicing switch will be achieved. Thus, many different AOs need to be tested and this may require tiling of many AOs of various lengths across a considerable stretch of the target pre-mRNA. Examples of disease-causing genes targeted by AOs in vitro and in cell systems include point mutations in the β-globin gene in β-thalassemia, the CFTR gene in cystic fibrosis, and lamin A in the premature aging disease Hutchinson-Gilford progeria syndrome (HGPS). Mutations in these genes activate aberrant cryptic splice sites and prevent correct pre-mRNA splicing (Dominski and Kole, 1993; Friedman et al., 1999; Scaffidi and Misteli, 2005). An AO targeting mutations in the gene encoding the muscle-specific chloride channel (ClC-1) reverses the defect of ClC-1 alternative splicing in mouse models of myotonic dystrophy (Wheeler et al., 2007). AOs have also been extensively studied as potential treatment strategies for SMA and DMD. AOs that mask the ISS or the 3′ splice site of exon 8 of the SMN2 gene have been used to switch the splice site and promote exon 7 inclusion and thus the production of a full length functional SMN protein (Figure 4A) (Singh, 2007). 
An interesting mechanism-based approach has used bifunctional oligonucleotides that link an antisense sequence to exon 7 with an ESE and thereby recruit splicing-enhancing SR proteins or directly couple antisense sequence to an SR peptide to promote exon 7 inclusion (Figure 4A) (Cartegni and Krainer, 2003; Skordis et al., 2003). Similar AO approaches have also been used for DMD (Alter et al., 2006). There are also several notable examples of the use of AOs to downregulate specific mRNAs by mechanisms that do not necessarily involve modulation of splicing. An AO targeted to a superoxide dismutase 1 (SOD1) mutation that causes the neurodegenerative disease amyotrophic lateral sclerosis (ALS) administered to mice with the disease reduces mutant SOD1 mRNA and protein levels and slows disease progression (Smith et al., 2006).
Although proof-of-concept experiments for efficacy of AOs have been demonstrated for many model systems and diseases, only a handful have reached the level of practical therapies in humans (see Analysis by L. Bonetta on page XXX of this issue). One successful clinical application of AOs is the development of the first FDA-approved antisense drug that targets cytomegalovirus (CMV) mRNAs to treat retinitis caused by CMV (Holmlund, 2003). The major bottlenecks in implementing AOs as therapeutics relate to tissue-specific delivery of sufficient amounts of the AOs to achieve a re-administration regimen that will sustain therapeutic levels and stability long-term. AOs with various backbone chemistries, including morpholino, peptide nucleic acid (PNA), locked nucleic acid (LNA), 2′-O-methyl, thiophosphate, and numerous other oligonucleotides in place of the naturally occurring phosphodiester ribo- or deoxy-nucleotides, have been developed to improve affinity, boost stability in the circulation and in target cells, and enhance cell penetration and nuclear accumulation (Karkare and Bhatnagar, 2006). Improved tissue-specific delivery has also been achieved by conjugating AOs to peptides that are selectively taken up by the target tissues (Henke et al., 2008). AOs remain a promising avenue for the therapy of many genetic diseases. However, tests for specificity and undesirable off-target effects must be rigorously and carefully performed.
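The tiling of candidate AOs of various lengths across a stretch of target pre-mRNA, mentioned above as a practical necessity, is straightforward to enumerate computationally. The Python below is a minimal sketch under stated assumptions: the function names, the oligo lengths (18, 20 and 25 nucleotides) and the 5-nucleotide step are hypothetical choices for illustration and are not taken from any of the cited studies. It lists candidates only and says nothing about target accessibility in the native RNP or about specificity, which the text notes must be tested experimentally.

```python
# Watson-Crick complement in the RNA alphabet.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the antisense (reverse-complement) sequence of an RNA string."""
    return rna.translate(COMPLEMENT)[::-1]


def tile_antisense_oligos(target: str, lengths=(18, 20, 25), step: int = 5):
    """List candidate antisense oligos tiled across a target region.

    Returns (start, length, oligo) tuples, where `start` is the 0-based
    position of the window in the target and `oligo` is written 5' to 3'.
    The default lengths and step are illustrative assumptions.
    """
    target = target.upper().replace("T", "U")
    candidates = []
    for length in lengths:
        for start in range(0, len(target) - length + 1, step):
            window = target[start:start + length]
            candidates.append((start, length, reverse_complement(window)))
    return candidates


if __name__ == "__main__":
    toy_region = "ACGU" * 10  # toy 40-nt target, not a real pre-mRNA sequence
    for start, length, oligo in tile_antisense_oligos(toy_region, lengths=(20,)):
        print(start, length, oligo)
```

A real design pipeline would add filters that this sketch omits, for example screening each candidate against the transcriptome for unintended complementarity.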
RNAs can also be used as a tool to target other RNAs, including pre-mRNAs by antisense hybridization. U1 snRNA- and U7 snRNA-based vectors consisting of the native snRNAs modified to have an antisense sequence to a target pre-mRNA in place of the native 5′-end of the RNA have been developed. These modified snRNAs are transcribed from the corresponding DNA constructs in the target cells and have unique advantages as antisense vehicles. Like endogenous snRNAs, they acquire Sm cores that contribute to their stability, nuclear localization and efficient hybridization to complementary target RNAs. Anti-SMN U7 snRNAs containing sequences complementary to the 3′ splice site of exon 8 or to exon 7 linked with an ESE induced an almost complete and durable correction of SMN2 splicing in cultured cells (Madocsai et al., 2005) and prolonged survival of mice with severe SMA (Meyer et al., 2008). Unlike synthetic AOs they are RNA, not chemically modified entities, but their delivery into cells presents unique obstacles that AOs are not subject to. Viral vectors including recombinant adeno-associated virus (AAV) and lentiviruses are used as delivery vehicles to transduce antisense RNAs to the target cells (Danos, 2008). Systemic administration of AAV U7 or U1 antisense RNA into a DMD mouse model leads to the skipping of the exon containing a mutation causing frameshift and PTC, and to a sustained production of partially functional dystrophin protein and an improved phenotype (Figure 4B) (Denti et al., 2006; Goyenvalle et al., 2004). These results provide the foundation for clinical phase studies that are currently underway for the treatment of DMD (Muntoni and Wells, 2007). Another U7 antisense RNA application is inhibition of HIV-1 production. Lentiviral U7 induces a partial skipping of internal exons encoding the HIV-1 regulatory proteins Tat and Rev (Asparuhova et al., 2007). 
However, serious issues remain including tissue-specific targeting and immune response to the viral vectors, as is the case for conventional gene replacement therapy.
The capacity to selectively eliminate an mRNA of a disease-causing allele or to prevent translation of a deleterious protein by RNAi presents a wide range of targets for therapeutic modulation. RNAi relies on the base pairing interaction of 21-23 nucleotide RNAs, a size sufficient to uniquely target an mRNA or even a specific splice variant, and provides a versatile and potent tool (Grimm and Kay, 2007). RNAi-based strategies are applicable to all diseases in which decreasing expression of an RNA, whether from a mutant allele or an aberrantly expressed mRNA in cancer cells, would have therapeutic effects. Great progress has been made toward translating the expertise of RNAi from an extensively used experimental tool to an effective and safe treatment. The main challenges again are optimal delivery to the appropriate tissues and cells, avoiding the cellular anti-viral response to double-stranded RNA, and achieving the optimal balance of high potency without off-target effects. The three most common therapeutic approaches to RNAi-based delivery are: (i) synthetic double-stranded small interfering RNAs (siRNAs) applied directly to tissues using a permeabilizing vehicle in which exogenous siRNAs become incorporated into the endogenous RNAi machinery (RISC complex); (ii) short hairpin RNAs (shRNAs) typically delivered by expression from viral vectors, are expressed in the nucleus, exported to the cytoplasm by Exportin 5 and processed by Dicer into siRNAs; (iii) artificial miRNAs, in which the targeting hairpins are expressed in a pri-miRNA context using viral vectors, processed in the nucleus by the Drosha-DGCR8 complex, exported to the cytoplasm and then processed into functional siRNAs by Dicer (Rossi, 2008). ShRNAs are typically expressed at high levels to produce highly effective gene silencing, but they can also elicit toxicity due to saturation of the endogenous RNAi machinery (Rossi, 2008). 
Investigations using AAV-delivered artificial miRNAs to knock down the expanded ataxin-1 mRNA in a SCA1 mouse model indicate that this approach produced efficient knockdown of this mRNA in cerebellar Purkinje cells without toxicity (Figure 4C) (Boudreau et al., 2009). These results provide a next-generation platform for virus-mediated delivery and expression. RNAi approaches are showing promise in several other mouse models of disease including Huntington’s disease and ALS, and in HIV-infected “humanized” mice. Direct siRNA delivery is being used in preclinical and clinical trials to treat several diseases such as macular degeneration, respiratory syncytial virus (RSV) infection, and liver cancer (Aigner, 2007; Pappas et al., 2008).
With a similar technology to that used to deliver siRNAs, it is possible to target miRNAs using complementary nucleotide sequences (anti-miRs). Anti-miRs can be used to block endogenous or exogenous (viral) miRNAs to neutralize their effects. Delivery into most tissues is difficult but delivery to the liver works well. For example, the highly expressed and liver-specific miR-122 was found to have at least two features relevant to disease: it is required for hepatitis C virus replication in cultured hepatocytes and is involved in regulating cholesterol metabolism. Delivery of LNA-modified oligonucleotides to block miR-122 demonstrated both the loss of miR-122 and a functional consequence of lowered plasma cholesterol in mice and monkeys (Elmen et al., 2008). The potential benefits of this approach for treating hypercholesterolemia as well as acute and chronic hepatitis C infections are quite promising.
There are cautionary tales as well, however. In pre-clinical testing in mice, naked siRNAs targeting VEGF or the VEGF receptor and delivered directly to the large chamber of the eye prevented the over-proliferation of blood vessels in the retina that cause some forms of macular degeneration (Reich et al., 2003; Shen et al., 2006). This approach has progressed to phase I clinical trials in humans (Whelan, 2005). However, tests in mice designed to determine how “naked” siRNAs are taken up into cells discovered that siRNAs do not actually enter cells but rather reduce vascularization through binding to the extracellular Toll-like receptor 3 and stimulating the NF-κB pathway (Kleinman et al., 2008). The results highlight the potential for adverse effects from systemic delivery and the importance of delivery systems that facilitate entry into the intended cells.
Alternative splicing is an attractive target for pharmacological modulation with small molecules. As splicing of most introns is strongly dependent on serine-arginine-rich (SR) proteins and hnRNP proteins, small molecules that affect their activities or their relative amounts in the nucleus can profoundly modify splicing. SR proteins bind to ESEs via their RNA-binding domains and promote exon definition as well as prevent the action of adjacent silencers by recruiting the basal splicing machinery through their SR domains. This activity of SR proteins depends on the state of serine phosphorylation in the SR domain, which is determined by the SR kinases SRPK1, SRPK2 and the Clks1-4, and by the phosphatase PP1 that dephosphorylates them. Small molecules are ideal reagents for modulating the activities of such enzymes and have been used to target these enzymes to modulate splicing patterns. Using high throughput screening, small molecule inhibitors of SR proteins (Soret et al., 2005), SRPKs (Fukuhara et al., 2006) and Clks (Muraki et al., 2004) were identified and shown to modulate alternative splicing. These compounds so far have been analyzed for only a few splicing events, for example, HIV-1 splicing which is strongly dependent on SR proteins, but these studies suggest that this approach holds considerable promise. This justifies undertaking large-scale screens on these and other enzymes that regulate the activity of the splicing machinery in search of more potent and specific modulators of alternative splicing as a therapeutic approach for treating many diseases. Other potential approaches for regulating alternative splicing with small molecules can be envisioned based on modulation of the relative stoichiometries of splicing factors. Thus, small molecules that alter the expression or modulate the nuclear localization of splicing factors would likely provide a means to modify alternative splicing of specific disease-relevant pre-mRNAs.
The historic success of small molecule pharmacological reagents, including the fact that they circumvent the major issues of delivery into most tissues encountered by nucleic acids (oligonucleotides, siRNAs and DNA constructs), makes small molecules highly attractive as a therapeutic modality. Robust, quantitative and specific cell-based alternative splicing assays will be required to realize this potential. New screens for this purpose have been recently described and deployed, so far on a modest scale, to identify modulators of alternative splicing (O’Brien et al., 2008; Stoilov et al., 2008).
Aberrant splicing frequently generates mRNAs containing PTCs. Approximately 30% of mutations causing inherited disease introduce PTCs, whose presence leads to degradation of the mRNA by nonsense-mediated decay and loss of function of the mutant allele (Frischmeyer and Dietz, 1999). In most cases, nonsense-mediated decay is protective because it prevents expression of truncated proteins that could have dominant negative activity or form toxic aggregates. However, it is estimated that 5% to 15% of individuals with one of ~2300 inherited diseases carry a PTC mutation and would benefit from readthrough of the stop codon to produce a protein that is at least partially functional. For example, approximately 7% of DMD and 10% of CF cases are due to a PTC within an otherwise in-frame mRNA. Small molecules that allow readthrough of premature termination codons have been identified, and their potential utility for specific diseases is being tested. In particular, aminoglycoside antibiotics bind to the decoding site of the small ribosomal subunit and induce translational readthrough of termination codons (Hainrichson et al., 2008; Zingman et al., 2007). Clinical trials using aminoglycosides for treating several inherited diseases, including the use of gentamicin to treat DMD and CF, have provided encouraging results. However, the high levels of drug required can produce significant side effects and can also increase the levels of PTC-containing mRNAs by inhibiting nonsense-mediated decay (Zingman et al., 2007). The design of aminoglycosides with improved activity and specificity holds considerable promise (Hainrichson et al., 2008). A structurally unrelated compound, PTC124, was identified by its suppression of a PTC-containing luciferase reporter in a high-throughput screen and has been shown to increase production of the dystrophin protein from PTC-containing mRNAs in cultured cells and in a DMD mouse model (Welch et al., 2007). PTC124 is orally bioavailable and is currently in clinical trials for DMD (Kerem et al., 2008). Although nonsense suppression appears to be a potentially viable therapeutic approach, the possibility that a drug affects translation termination in mRNAs other than the intended target needs to be carefully evaluated. The potential for side effects is particularly important in diseases in which life-long administration of the drug will be necessary.
Disrupted functions of RNAs and RNPs are the cause of numerous maladies, and they provide a wealth of new opportunities for therapies as well as tools for the treatment of human diseases. We have argued here that the intricate process of alternative splicing, in particular, while providing tremendous advantages in affording an almost explosive capacity for generating transcriptome and proteome diversity, is also fraught with the risk of malfunction. In many cases, RNAs and RNPs may provide a more readily accessible target for therapy than the defective or deficient proteins they encode. Reversal of defective protein or RNA toxicity, or restoration of normal activity levels, could potentially be achieved more specifically and efficaciously by eliminating the pre-mRNA or redirecting its splicing. Much, in fact most, of what has been learned has come from basic research on the biology and chemistry of RNAs and the proteins they interact with in the cellular environment. It is in the public’s interest that such fundamental research be vigorously supported.
T.A.C. is supported by the NIH and the Muscular Dystrophy Association. G.D. is supported by the Association Française Contre les Myopathies (AFM). G.D. is an Investigator of the Howard Hughes Medical Institute.