The expansion of unstable microsatellites is the cause of a number of inherited neuromuscular and neurological disorders. While these expanded repeats can be located in either the coding or non-coding regions of genes, toxic RNA transcripts have been primarily implicated in the pathogenesis of non-coding expansion diseases. In this review, we briefly summarize studies which support this RNA-mediated toxicity model for several neurologic disorders and highlight how pathogenic RNAs might negatively impact nervous system functions. However, it is important to note that the distinction between coding versus non-coding regions has become muddled by recent observations that the transcribed portion of the genome, or transcriptome, is considerably larger than previously appreciated. Thus, we also explore the possibility that a combination of protein and RNA gain-of-function events underlie some microsatellite expansion diseases.
An important class of inherited neurological disorders is distinguished by genetic anticipation, an intergenerational increase in disease severity combined with a decreased age-of-onset. The discovery two decades ago that anticipation resulted from the aberrant expansion of microsatellites led to a flurry of studies designed to clarify the molecular etiology for each disease and to the realization that the location of the repeat within an affected gene suggested a pathogenic mechanism. For protein coding region expansions, such as the CAG repeat expansion or (CAG)n in the Huntington disease gene HTT, the production of mutant proteins containing polyglutamine (polyQ) results in toxicity, possibly because mutant HTT protein, or polyQ fragments generated by proteolysis, interfere with a variety of cellular pathways. In contrast, microsatellite expansions in the non-coding regions of genes, including introns and untranslated regions (5’- and 3’-UTRs), are associated with three types of spinocerebellar ataxias (SCA8, 10, 12), fragile X-associated tremor/ataxia syndrome (FXTAS), Huntington disease-like 2 (HDL2) and two types of myotonic dystrophy (DM1, DM2) (Fig. 1). The finding that some expansions are pathogenic at the RNA level was first established for DM1 [4, 40, 42, 45]. DM1 is a multi-systemic disorder caused by the expansion of a CTG repeat in the 3’ untranslated region of the DMPK gene, while DM2 results from the expansion of a related non-coding microsatellite, CCTG, in the first intron of ZNF9. Surprisingly, these expansions result in a similar adult-onset neuromuscular disorder with such characteristic features as muscle hyperexcitability (myotonia), cardiac conduction defects and insulin resistance. While the DM brain is also profoundly affected, we know very little about the underlying molecular events which lead to dysfunction in the nervous system.
These CNS effects can be subdivided into developmental abnormalities, such as mental retardation (DM1 only), and adult-onset degenerative changes, which include executive function deficits, hypersomnia, progressive memory problems and cerebral atrophy (DM1 and DM2) [34–35, 37].
Several lines of evidence argue that DM is an RNA-mediated disease [4, 40, 42]. First, the DM inheritance pattern is autosomal dominant but the expansion mutation is located in a non-coding region of the DMPK gene, so a gain-of-function at the protein level is unlikely. Second, DMPK and ZNF9 mutant transcripts accumulate in nuclear RNA, or ribonuclear, foci that have been proposed to interfere with normal nuclear functions [6, 50]. Although full-length DMPK mRNA accumulates in ribonuclear foci and this may compromise DMPK protein levels, splicing separates the CCUG repeat from the mRNA, so ZNF9 protein levels are unaffected by the DM2 expansion mutation, which eliminates ZNF9 haploinsufficiency as a cause of this disease. Third, common manifestations of DM disease are observed in several transgenic poly(CUG) mice which have been generated by placing expanded repeats in the non-coding regions of heterologous genes. For example, an early poly(CUG) model was created by inserting either a normal length (CTG)5 or a long mutant (CTG)250 repeat in the 3’ UTR of the human skeletal actin gene (HSA). Only transgenic mice with the long repeat (HSALR) develop myotonia and myopathy, and disease severity correlates with transgene expression, indicating that repeats are pathogenic at the RNA level. A limitation of this transgenic model is that the CTG repeat expansion is only expressed in muscle. Restriction of spatial expression patterns is not a problem in DM300-328/DMXL/SXXL transgenic lines, which carry a human DMPK genomic 45 kb fragment with 300 to ~1,200 CTG repeats, although the larger repeats are expressed at low levels. While homozygous mice develop several DM1-relevant disease features, they also show reduced body size, which is not a phenotype associated with this disease. Recently, bitransgenic mouse lines have been generated which can be used to inducibly express a non-coding CTG960 repeat (EpA960) in other tissues [41, 53].
Following tamoxifen administration, heart-specific EpA960 mice develop cardiac arrhythmias while muscle-specific lines present with myotonia and severe muscle wasting. Although this is an important new model because it allows for inducible postnatal expression of a toxic RNA in specific tissues, including the brain, induction of the EpA960 transgene results in a burst of high levels of an untranslated RNA with an interrupted repeat, [(CTG)20CTCGA]48.
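As a quick check on the construct arithmetic, the interrupted repeat can be written out explicitly. The sketch below is illustrative only: it assembles [(CTG)20CTCGA]48 as a plain DNA string (the function names are hypothetical and no flanking transgene sequence is modeled) and confirms that the tract carries 960 CTG units in total, although no uninterrupted run is longer than 20 units.

```python
# Illustrative sketch: write out the interrupted repeat [(CTG)20CTCGA]48
# named in the text. Function names are hypothetical; flanking transgene
# sequence is not modeled.

UNIT = "CTG"
INTERRUPTION = "CTCGA"
UNITS_PER_BLOCK = 20
BLOCKS = 48


def build_interrupted_repeat() -> str:
    """Return (CTG)20 followed by CTCGA, repeated 48 times."""
    block = UNIT * UNITS_PER_BLOCK + INTERRUPTION
    return block * BLOCKS


def longest_pure_run(seq: str, unit: str) -> int:
    """Longest uninterrupted tandem run of `unit`, measured in units."""
    best = 0
    for start in range(len(seq)):
        run = 0
        pos = start
        # Count consecutive copies of the unit beginning at this position.
        while seq.startswith(unit, pos):
            run += 1
            pos += len(unit)
        best = max(best, run)
    return best


if __name__ == "__main__":
    seq = build_interrupted_repeat()
    print("length (nt):", len(seq))
    print("total CTG units:", seq.count(UNIT))
    print("longest pure CTG run:", longest_pure_run(seq, UNIT))
```

The total of 20 x 48 = 960 units matches the CTG960 designation, while the longest pure tract of 20 units makes the interrupted nature of this repeat explicit.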
How do CTG-rich microsatellite expansions in non-coding regions of two unrelated genes result in a similar adult-onset disease? Current evidence argues that transcription of mutant DMPK and ZNF9 alleles produces CUG and CCUG expansion (CUGexp, CCUGexp) RNAs which fold into stable double-stranded (ds) hairpin structures that are toxic to cells because they alter the activities of at least two families of alternative splicing factors [4, 40]. While these CUGexp and CCUGexp RNAs recruit, and subsequently sequester, the muscleblind-like (MBNL) proteins [17, 30, 36], they also trigger the hyperphosphorylation and overexpression of CUGBP1 and ETR3-like factors (CELF). These two events appear to be tightly coupled in vivo since tamoxifen-inducible expression of the polyCUG transgene EpA960 in mouse heart results in the simultaneous appearance in cardiomyocytes of CUGexp ribonuclear foci, sequestration of Mbnl1 in these foci and increased expression of Cugbp1. Because the MBNL and CELF proteins regulate the splicing of a specific set of fetal and adult alternative exons during postnatal development, the corresponding fetal protein isoforms persist in the DM adult, which is incompatible with normal adult tissue function [10, 44, 47]. For example, expression of the major skeletal muscle chloride channel ClC-1, encoded by the CLCN1 gene, is regulated at the level of alternative splicing [3, 29]. In newborn muscle, CLCN1 pre-mRNA splicing results in the inclusion of exon 7a, which leads to a premature stop codon in exon 7 due to frameshifting, so the resulting mRNA is rapidly degraded via the nonsense-mediated decay (NMD) pathway. CLCN1 exon 7a splicing is promoted by the relatively high levels of CUGBP1 in neonatal myonuclei. During postnatal development, CUGBP1 levels decline while MBNL1 protein translocates from the cytoplasm to the nucleus, resulting in exon 7a skipping and synthesis of the full-length ClC-1 protein [21, 26].
Evidence that loss of ClC-1 due to mis-splicing in DM adult muscle results in a channelopathy and myotonia comes from studies using morpholino antisense oligonucleotides (AONs). Treatment of skeletal muscle from either HSALR or Mbnl1 knockout mice with AONs that target the Clcn1 exon 7a 5’ splice site blocks splicing of this exon and rescues the myotonia observed in these DM mouse models. Although the expression of CUGexp and CCUGexp mutant RNAs also alters alternative splicing in the central nervous system (CNS), only a few target exons have been identified (MAPT exons 2, 6, 10, APP exon 7, GRIN1/NMDAR1 exon 5) and, unlike the role of CLCN1 in myotonia, the relationship between mis-splicing and specific disease manifestations in the nervous system is currently unclear [7, 15, 22–23]. The MBNL and CELF protein families are also involved in a number of other regulatory pathways, including mRNA turnover and localization, so it is likely that CUGexp and CCUGexp mutant RNAs also impact gene expression at several levels [1, 39, 56].
RNA gain-of-function mutations have been implicated in other neurological diseases such as FXTAS (Fig. 1). Fragile X syndrome is a common form of inherited mental retardation and is generally caused by lengthy (>200) CGG expansions in the 5’-UTR of the FMR1 gene. While these elongated repeat tracts repress FMR1 transcription, the shorter FXTAS-associated (CGG)60–200 expansions, originally thought to be premutation alleles, cause a late adult-onset neurodegenerative disease characterized by gait ataxia, progressive intention tremor and parkinsonism. Rather than transcriptional repression, these intermediate-size repeats cause upregulation of FMR1 expression and accumulation of FMR1 RNA transcripts in intranuclear inclusions. Evidence that rCGGexp RNAs are toxic in mammals independent of gene context has been provided recently. Transgenic mice, which selectively express a pathogenic (CGG)90 expansion in the 5’ UTR of either Fmr1 or EGFP in Purkinje cells, develop behavioral deficits, ubiquitin-positive intranuclear inclusions by 8 weeks of age and show enhancement of Purkinje cell axonal swelling and dropout. The structure of rCGG repeats has been analyzed by NMR and, similar to CUGexp, these repeats form stable RNA hairpins which are recognized by several RNA binding proteins, including Purα, hnRNP A2/B1 and CUGBP1 [16, 48, 57]. The RNA gain-of-function model for FXTAS is also supported by observations that Pura knockout mice develop neurological deficits similar to FXTAS and overexpression of Purα in a Drosophila (CGG)90-EGFP transgenic model suppresses disease-associated neurodegeneration [16, 18].
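The repeat-length ranges quoted above can be restated as a simple lookup. This is only a sketch of the two thresholds given in this review (more than 200 CGG repeats, and 60–200 CGG repeats); the function name and category labels are hypothetical, nothing is asserted about alleles below 60 repeats, and it is not intended as a diagnostic tool.

```python
# Illustrative sketch of the FMR1 CGG repeat ranges stated in the text.
# The function name and labels are hypothetical; thresholds come only from
# this review (>200 and 60-200 repeats).


def classify_fmr1_cgg(repeat_count: int) -> str:
    """Map a CGG repeat count onto the ranges discussed in the text."""
    if repeat_count < 0:
        raise ValueError("repeat count cannot be negative")
    if repeat_count > 200:
        # Long tracts: FMR1 transcription is repressed (fragile X syndrome).
        return "fragile X range (>200)"
    if repeat_count >= 60:
        # Intermediate tracts: FMR1 RNA is upregulated (FXTAS-associated).
        return "FXTAS-associated range (60-200)"
    return "below the ranges discussed here"


if __name__ == "__main__":
    for n in (30, 90, 200, 201, 800):
        print(n, "->", classify_fmr1_cgg(n))
```

The (CGG)90 transgene described in this section falls in the intermediate range, where toxicity is attributed to the RNA rather than to loss of FMR1 expression.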
Toxic RNAs may also be involved in the pathogenesis of some autosomal dominant (AD) spinocerebellar ataxias (SCAs), which are late-onset and progressive neurodegenerative diseases. Of the 28 characterized AD SCAs, eight are associated with CAG/CTG repeats (SCA1, 2, 3, 6, 7, 8, 12, 17) and one with ATTCT/AGAAT repeats (SCA10), and the expansions in SCA8 and 12 may occur in either non-coding or coding regions depending on transcription initiation site selection, transcriptional orientation and RNA splicing (Fig. 1). SCA12, a rare disorder except in India where it is the second most common dominant ataxia, is caused by a CAGexp mutation in PPP2R2B, which encodes the Bβ brain-specific regulatory subunit of the ubiquitous heterotrimeric phosphatase PP2A [11, 49]. Action tremor and mild gait ataxia are prominent clinical signs, and the CAG repeat is located in several different regions of PPP2R2B transcripts or in the promoter depending on alternative splicing and which transcriptional start sites are utilized. Current evidence indicates that the CAGexp alters transcriptional and/or splicing regulation of PPP2R2B. One disease model proposes that the CAG repeat is in the promoter of the Bβ1 variant and that CAGexp causes transcriptional upregulation and dysregulation of phosphatase activity. In addition, use of upstream promoters and alternative splice site selection position the CAGexp in either an alternative 5’-UTR or in intron 6. When positioned in the intron, the alternative splicing pattern of PPP2R2B may be modified, resulting in an increase in the Bβ2 isoform that induces neuronal mitochondrial fragmentation and apoptosis. Although these models suggest that SCA12 is not an RNA-mediated disease, a recent study of another SCA, SCA3, highlights the possibility that these relatively short intronic (CAG)51–78 RNAs present in SCA12 brain could contribute to disease pathogenesis when present in PPP2R2B pre-mRNA.
SCA3 is a polyQ-induced disease that is caused by a similarly sized (CAG)61–84 coding region expansion in the ATXN3 gene, which encodes a deubiquitinating enzyme [2, 42]. Surprisingly, a Drosophila modifier screen for ATXN3 polyQ-induced neurodegeneration revealed that upregulation of Mbl, the fly homologue of human MBNL, enhances toxicity, suggesting a link between polyQ- and RNA-mediated pathogenesis. In this fly model, Mbl increases the level of both polyQ RNA and protein, and substitution of the pure CAG repeat with a CAA/G interrupted repeat, which still encodes a polyQ protein but does not provide an optimal RNA-binding site for Mbl/MBNL proteins, significantly decreased the eye degeneration phenotype. Moreover, expression of a non-coding CAGexp induced neuronal degeneration, which suggests that untranslated CAGexp RNAs may be important modifiers of polyQ disease.
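The logic of the interrupted-repeat experiment rests on codon degeneracy: CAA and CAG both encode glutamine in the standard genetic code, so the protein is unchanged while the RNA loses its pure CAG tract. The sketch below illustrates this under stated assumptions: the interrupted repeat is modeled as strictly alternating CAA and CAG codons, and the length of 78 codons is an arbitrary value within the (CAG)61–84 range; neither detail is specified in the text, and the helper names are hypothetical.

```python
# Illustrative sketch: a pure CAG repeat and a CAA/CAG interrupted repeat
# encode the same polyglutamine tract but differ at the RNA level.
# Assumptions (not from the text): strictly alternating CAA-CAG pattern,
# 78 codons in total.

GLUTAMINE_CODONS = {"CAA", "CAG"}  # both encode Gln (Q) in the standard code

PURE = "CAG" * 78
INTERRUPTED = "CAACAG" * 39  # same number of codons, alternating CAA and CAG


def split_codons(seq: str) -> list:
    """Split a sequence into in-frame codons, dropping any trailing bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def encodes_polyq(seq: str) -> bool:
    """True if every in-frame codon is a glutamine codon."""
    codons = split_codons(seq)
    return bool(codons) and all(c in GLUTAMINE_CODONS for c in codons)


def longest_in_frame_run(seq: str, codon: str) -> int:
    """Longest run of consecutive in-frame copies of `codon`."""
    best = run = 0
    for c in split_codons(seq):
        run = run + 1 if c == codon else 0
        best = max(best, run)
    return best


if __name__ == "__main__":
    for name, seq in (("pure", PURE), ("interrupted", INTERRUPTED)):
        print(name, "| polyQ:", encodes_polyq(seq),
              "| longest CAG run:", longest_in_frame_run(seq, "CAG"))
```

Both sequences pass the polyQ check, but the longest pure CAG run drops from 78 codons to 1, which is the property the text links to reduced Mbl/MBNL binding.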
While CAGexp RNAs might contribute to polyQ-induced toxicity, the complexity of the transcriptome in the human brain is vast and bidirectional transcription of expanded microsatellites has the potential to generate toxic proteins and RNAs from different strands. SCA8 is a slowly progressive ataxia with reduced penetrance caused by a CTG•CAG expansion on chromosome 13q21 [13, 45]. An early study reported that an SCA8 repeat, (CTA)11CTGCTA(CTG)80, was exclusively transcribed from the ATXN8OS gene in the CTG orientation and produced an alternatively spliced and polyadenylated non-coding CUGexp RNA, which led to the conclusion that this ataxia is caused by an RNA gain-of-function mechanism. Subsequent analysis demonstrated that the opposite strand was also transcribed and that the CAG expansion in this ATXN8 gene produces a nearly pure polyQ protein which is detectable with the anti-polyglutamine monoclonal antibody 1C2. The SCA8 repeat may be interrupted with one or multiple CCG, CTA, CTC, CCA or CTT triplets, but the influence of these sequence interruptions on pathogenesis has not been addressed. Although there has been some controversy centered on the reduced penetrance associated with SCA8, and the observation that large ATXN8/ATXN8OS CTG•CAG expansions exist in unaffected control individuals, a mouse BAC-transgenic model expressing a human SCA8 (CTG)116 expansion recapitulates several features of this disease, including a progressive neurological phenotype with deficits in cerebellar cortical inhibition as well as 1C2 positive polyQ intranuclear inclusions in cerebellar Purkinje and pontine neurons. Further studies demonstrated that polyQ intranuclear inclusions are present in human SCA8 Purkinje, medullary and dentate neurons but absent in autopsy controls.
Although this study is the first to report potentially deleterious gain-of-function effects at both RNA and protein levels in a neurological disease, the relative toxicity of polyQ versus CUGexp RNA in SCA8 requires further investigation.
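The strand relationship behind bidirectional transcription can be made concrete by taking the reverse complement of the SCA8 repeat quoted above. This is a minimal sketch that treats (CTA)11CTGCTA(CTG)80 as a bare DNA string; the helper names are hypothetical, and the flanking ATXN8 sequence, start codon and actual reading frame are not modeled.

```python
# Illustrative sketch: the strand opposite a CTG repeat reads as a CAG repeat.
# Only the repeat tract quoted in the text is modeled; helper names are
# hypothetical and no flanking gene sequence is included.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def leading_units(seq: str, unit: str) -> int:
    """Count consecutive copies of `unit` starting at the 5' end of `seq`."""
    count = 0
    while seq.startswith(unit, count * len(unit)):
        count += 1
    return count


# SCA8 repeat in the CTG (ATXN8OS) orientation, as written in the text.
ctg_strand = "CTA" * 11 + "CTG" + "CTA" + "CTG" * 80
opposite_strand = reverse_complement(ctg_strand)

if __name__ == "__main__":
    n = leading_units(opposite_strand, "CAG")
    print("CAG units at the 5' end of the opposite strand:", n)
    print("triplet that follows the CAG tract:", opposite_strand[3 * n:3 * n + 3])
```

On the opposite strand, the (CTG)80 tract becomes (CAG)80 and the CTA triplets become TAG, a stop codon in the standard genetic code when read in the CAG frame. Whether that frame is the one used in vivo cannot be concluded from this sketch.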
Another issue is the possibility that toxic proteins are produced from non-trinucleotide repeat expansions, such as the DM2 CCTGexp. No open reading frames exist in the ZNF9 antisense orientation, but in the sense direction an initiation codon is located upstream of the expansion region. Alternative splicing, and/or additional transcription start sites proximal to this repeat, could create an open reading frame containing a tetrapeptide repeat. Although proteins containing complex repeat motifs exist in both prokaryotes and eukaryotes, the possibility that these types of proteins are toxic to cells has not been tested.
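Why a tetranucleotide repeat would yield a tetrapeptide repeat follows from the arithmetic of the triplet code: four bases and three-base codons come back into register every 12 nucleotides, that is, every four codons. The sketch below translates a short CCTG tract in all three forward frames using the standard genetic code. It is purely illustrative: the tract length and function name are hypothetical, and no ZNF9 flanking sequence or start codon is modeled.

```python
# Illustrative sketch: translating a CCTG tetranucleotide repeat in the three
# forward reading frames with the standard genetic code. The repeat length
# (6 units) and the helper name are hypothetical choices for the example.

from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate `dna` from offset `frame`, ignoring trailing partial codons."""
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3)
    )


cctg_repeat = "CCTG" * 6

if __name__ == "__main__":
    for frame in range(3):
        print("frame", frame, ":", translate(cctg_repeat, frame))
```

Each frame gives a rotation of the same Pro-Ala-Cys-Leu unit (PACL, LPAC or CLPA in one-letter code) with no stop codon, so any open reading frame that runs through the tract would, in principle, encode the same repeating tetrapeptide.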
While bidirectional transcription has important implications for microsatellite diseases, it is also possible that toxic RNAs and proteins can be produced from the same transcriptional unit. In addition to SCA3, another example is HDL2, which is an adult-onset progressive neurodegenerative disorder where CTGexp mutations are located in either coding (to produce polyalanine or polyleucine) or noncoding (3’ UTR) regions of the JPH3 gene depending on the splicing pattern (Fig. 1, for simplicity only the 3’-UTR position is shown). Intranuclear ubiquitin and 1C2 positive protein inclusions as well as CUGexp ribonuclear foci are present in HDL2 brain. Together with SCA8, these observations call into question pathogenic mechanisms based solely on the location of unstable microsatellites in coding versus non-coding regions and suggest that disease phenotype and penetrance may be influenced by an interplay between toxic RNAs and proteins.
Other types of repeats that do not form stable dsRNA structures may also be toxic to cells. This is exemplified by the ATTCT pentanucleotide expansion in intron 9 of the ATXN10 gene, which causes SCA10, an AD neurodegenerative disorder characterized by cerebellar ataxia, seizures and anticipation. Several results argue that SCA10 is not caused by ATXN10 loss-of-function. Instead, SCA10 is another candidate for an RNA-mediated disease since ribonuclear foci can be detected by RNA fluorescence in situ hybridization following transfection of ATTCT repeats into cultured cells.
Currently there are no effective treatments for these neurological disorders. However, understanding the pathogenic mechanisms underlying these diseases should accelerate the development of effective therapeutic approaches. For example, pathogenic RNAs can be targeted using morpholino antisense oligonucleotides that bind to specific repeat expansions and block access by the RNA-binding proteins that are sequestered in disease. Alternatively, screening for small molecules that promote overexpression of the affected RNA-binding proteins can also be considered. Indeed, the possibility that some of these microsatellite expansion diseases result from bidirectional transcription and the production of both toxic proteins and RNAs suggests that multifaceted treatment strategies should be considered for this group of disorders.
Our research is supported by grants from the NIH (AR046799, NS058901, NS048843) and the MDA. The authors thank L. Ranum for comments on the manuscript.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.