RNAs adopt diverse folded structures that are essential for function and thus play critical roles in cellular biology. A striking example of this is the ribosome, a complex, three-dimensionally folded macromolecular machine that orchestrates protein synthesis. Advances in RNA biochemistry, structural and molecular biology, and bioinformatics have revealed other non-coding RNAs whose functions are dictated by their structure. It is not surprising that aberrantly folded RNA structures contribute to disease. In this review, we provide a brief introduction to RNA structural biology and then describe how RNA structures function in cells and cause or contribute to neurological disease. Finally, we highlight successful applications of rational design principles to provide chemical probes and lead compounds targeting structured RNAs. Based on several examples of well-characterized RNA-driven neurological disorders, we demonstrate how designed small molecules can facilitate the study of RNA dysfunction, elucidate previously unknown roles for RNA in disease, and provide lead therapeutics.
The genomic revolution of the 1990s and 2000s brought about the discovery of a wide variety of non-coding (nc)RNAs (Dunham et al., 2012), leading to increased attention on understanding their physiological functions. As with other biomolecules, the function of RNA is closely linked to its three-dimensional structure, hence the rising interest in RNA structural biology (Figure 1). Advances in the field culminated with remarkable results such as the structures of bacterial (Ban et al., 2000; Schluenzen et al., 2000; Schuwirth et al., 2005; Wimberly et al., 2000; Yusupova et al., 2001) and eukaryotic (Ben-Shem et al., 2011) ribosomes.
As opposed to DNA’s double-stranded helix, RNA is most often single stranded and thus folds onto itself to minimize its free energy. RNA forms fully paired regions and non-canonically paired regions such as hairpins, internal loops, bulges, multibranch loops, and pseudoknots that dictate higher order folding patterns (i.e., tertiary structure, Figure 2). Both secondary and tertiary structural motifs serve as important recognition elements guiding RNA-RNA and RNA-protein interactions. The importance of proper RNA folding for executing its function was recognized early on from studies of rRNA and tRNA, which have highly conserved structural organizations (Korostelev et al., 2006). These structure-function relationships are not restricted to non-coding RNAs. The majority of eukaryotic protein-encoding RNAs (messenger RNA, mRNA) undergo splicing from pre-mRNA prior to translation (Li et al., 2007; Wang et al., 2008a), and secondary structural elements in proximity of intron-exon junctions determine the exact location of intron excision (Black and Grabowski, 2003; Buratti and Baralle, 2004; Buratti et al., 2004; De Conti et al., 2013; Hiller et al., 2007; Shepard and Hertel, 2008; Warf and Berglund, 2010). Moreover, the choice of splicing sites based on exonic and intronic splicing enhancers and silencers often depends on structural context (Hiller et al., 2007).
An important feature of RNA secondary structures is their dynamic nature such that multiple structures with similar free energies can be adopted, allowing for conformational switching. Classic examples of large scale structural rearrangements include the ribosome’s structural changes that occur during translation (Korostelev et al., 2006) and the shape-shifting functions of riboswitches (Tucker and Breaker, 2005). Conformational changes upon interaction with other biomolecules are essential for other functional RNAs as well (Baumstark et al., 1997; Bugaut et al., 2012; Schultes and Bartel, 2000).
Because proper folding of RNA is crucial to its normal function, misfolding naturally leads to dysregulation of cellular processes. In general, this pathology can arise either from loss-of-function or from gain-of-function (Mirkin, 2007; La Spada and Taylor, 2010). The former class of disease mechanisms usually involves a mutation in sites that are crucial for proper folding and recognition of RNA by regulatory proteins; the altered folding equilibrium then disrupts the processes that depend on that recognition. One example is the set of mutations in microtubule-associated protein tau (MAPT, also known as tau) pre-mRNA that destabilize the hairpin structure at the exon 10 – intron 10 junction (Figure 7), which alters its interaction with U1 snRNP and causes deregulation of alternative pre-mRNA splicing (Clark et al., 1998; Dumanchin et al., 1998; Hutton et al., 1998; Jiang et al., 2000; Spillantini et al., 1998a; Varani et al., 1999).
The gain-of-function class of disease mechanisms is triggered by emergence of aberrantly folded RNA structural motifs in locations where they are not normally present (Mirkin, 2007; Reddy and Housman, 1997; La Spada and Taylor, 2010). The most common cause of such pathologies is genetic mutations that lead to inclusions of pathogenic RNA fragments into gene transcripts, such as those observed in nucleotide repeat expansion disorders (La Spada et al., 1991; Verkerk et al., 1991). Depending on the location of a repeat expansion, there are a variety of downstream pathological mechanisms including misregulation of the RNA’s splicing in which the mutation is found (cis-regulation), production of non-functional or toxic proteins (Faustino and Cooper, 2003; Feng and Xie, 2013; Licatalosi and Darnell, 2006), and sequestration of essential proteins that corrupts their normal functions, which is typically processing of cellular RNAs (trans-regulation) (Krzyzosiak et al., 2012; Ranum and Cooper, 2006; Todd and Paulson, 2010).
Here, we provide a brief introduction to RNA structural biology, followed by a description of various RNA structures that perform biological functions and/or trigger pathological cascades in neurological diseases. Using several examples, we will illustrate how cellular and animal models of diseases help to understand pathogenic mechanisms. Finally, we will discuss the development and use of chemical tools (small molecules and oligonucleotides) that normalize deregulated RNA function, paving the road for precise and selective RNA-targeting therapeutic interventions in neurological diseases.
In contrast to DNA, RNA adopts a variety of secondary and tertiary structures. In base-paired regions, RNA adopts an A-type helical conformation, which is characterized by more compact folding than B-form DNA (11 base pairs per helical turn vs. 10.5 bp, respectively), a deeper major groove, and a shallower but wider minor groove. RNA’s 2′ hydroxyl group dictates a different sugar pucker, hydration state, and thermodynamic stability than DNA (Fohrer et al., 2006; Gyi et al., 1998). Hydrogen bonding between nucleobases plays a crucial role in the formation of RNA secondary and tertiary structures (SantaLucia et al., 1992; Turner et al., 1987); however, only 60–70% of bases in structured RNA form classic Watson-Crick contacts. Non-canonical Hoogsteen (Hoogsteen, 1963) and wobble pairs (Crick, 1966; Varani and McClain, 2000) are common in RNA and contribute to the diversity of folding and function (Schroeder et al., 2004).
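The helical repeat values quoted above can be converted into an average twist per base pair, which makes the geometric difference between the two helix types easier to compare. The short sketch below does only that arithmetic; the function name is illustrative and not taken from any cited work.

```python
def twist_per_base_pair(bp_per_turn: float) -> float:
    """Average helical twist, in degrees, contributed by one base pair."""
    return 360.0 / bp_per_turn


# Helical repeats as quoted in the text.
a_form_rna_twist = twist_per_base_pair(11.0)   # A-form RNA, 11 bp per turn
b_form_dna_twist = twist_per_base_pair(10.5)   # B-form DNA, 10.5 bp per turn

print(f"A-form RNA: {a_form_rna_twist:.1f} degrees per bp")
print(f"B-form DNA: {b_form_dna_twist:.1f} degrees per bp")
```

With these inputs, the A-form helix rotates by roughly 32.7° per base pair and the B-form helix by roughly 34.3°.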
RNA folding is hierarchical in that primary sequence defines secondary structure elements through nearest neighbor nucleotide effects (Brion and Westhof, 1997; Mathews et al., 1999, 2004). The most common secondary structures formed by RNA strands are base-paired regions, stem-loops (hairpins), internal loops, bulges (Hermann and Patel, 2000; Mathews et al., 1999, 2004), pseudoknots, kink-turns (Klein et al., 2001), complex multibranch loops, and G-quadruplexes (Gellert et al., 1962; Kim et al., 1991; Sundquist and Heaphy, 1993) (Figure 2). These secondary structural motifs in turn can stabilize each other through folding into more complex 3D patterns (Brion and Westhof, 1997). It is noteworthy that both secondary and tertiary structural elements can be dynamic and can interconvert depending on the presence of proteins, electrolytes, and small molecules (Schroeder et al., 2004), altering RNA function (Baumstark et al., 1997; Bugaut et al., 2012; Schultes and Bartel, 2000; Serganov et al., 2004).
The critical role of proper RNA folding for cellular functioning is best exemplified by the central place of RNA in controlling protein synthesis – from gene expression to splicing and translation of mRNA. Ribosomes and spliceosomes are comprised of highly structured RNA modules in complex with proteins (Figure 2B). At the same time, pre-mRNAs and often mRNAs themselves contain motifs that control efficiency of translation and splicing. For example, G-quadruplex structures in 5′ untranslated regions (UTRs) regulate translation of mRNA in context dependent manner: either by inhibition or by upregulation of cap-independent translation (Bugaut and Balasubramanian, 2012).
RNA’s function is often dependent on its interactions with other biomolecules and/or small molecules; those intermolecular interactions are dependent upon acquisition of the proper fold. These types of interactions, however, are far less studied than secondary structure. For RNA–RNA interactions, the most studied examples include “kissing” loop–loop and loop–stem contacts that were found to regulate viral (Chang and Tinoco, 1994, 1997; Nicholson and White, 2014), prokaryotic (Brunel et al., 2002; Marino et al., 1995), and eukaryotic RNAs (Sudarsan et al., 2003; Wachter, 2010). Stabilization of a kink-turn by long-distance contacts was found to be essential for the folding and cellular function of prokaryotic RNA (Lilley, 2012; Schroeder et al., 2011), particularly for ribosomal RNA (McPhee et al., 2014). Proteins can recognize their cognate RNA-binding partners structure-specifically, sequence-specifically, or via multiple modes of interaction (Serganov and Patel, 2008). The binding of proteins to RNA hairpins most often occurs at the apical loop rather than the helical stem region and resembles sequence-specific recognition of unstructured ssRNA. Therefore, formation of such loops is a recurrent pattern of RNA folding that regulates interaction with regulatory proteins. The presence of similar structural motifs in two or more RNA molecules permits promiscuous protein binding, which can serve, for example, as a feedback loop for translational regulation by controlling ribosome loading (Serganov and Patel, 2008). Depending on context, the binding of RNA structural motifs to other biomolecules or structural domains can stabilize or destabilize its tertiary structure. Modulation of structural stability by small molecules underlies the biological function of folded RNA motifs, such as riboswitches (Thore et al., 2006; Zhang et al., 2010), and generally represents a potential strategy for targeting structured RNAs therapeutically (see, e.g., Childs et al., 2002; Meisner et al., 2004).
As of 2015, sequence complementarity is the most common way of targeting RNA. Mechanistically, an oligonucleotide hybridizes with a target strand (forming an antiparallel base-paired duplex), thereby affecting its natural folding and interaction with cognate partners or recruiting intracellular machinery that cleaves the RNA (Bennett and Swayze, 2010). Such interaction, however, depends on the thermodynamic and kinetic barrier of unfolding the native conformation of both the target RNA and the antisense oligonucleotide (Freier et al., 1986; Mir and Southern, 1999; Walton et al., 1999), which can be prohibitively high (Li et al., 2008). Although there are RNA-binding proteins that facilitate RNA unfolding, the antisense-based strategy is still mostly applicable to non-structured or weakly structured RNAs. Hence, small molecules, which bind to folded RNA structures such as loop regions, represent a means of controlling RNA function that is complementary to antisense oligonucleotides, because small molecules are better suited to targeting folded RNAs.
Various methods have been developed to deduce RNA structure, because knowledge of folding is the foundation for generating structure-function hypotheses (Figure 4). Much of the activity in this area at present is deducing secondary structure, or a map of paired and unpaired regions. For example, much of the secondary structure of the ribosome was deduced by using phylogenetic comparison and the conservation of rRNA secondary structure (Figure 4A). Importantly, these studies have been used to assign the kingdoms of life and to elucidate new ones (Gutell et al., 1985; Noller and Woese, 1981). When the crystal structures of the ribosome became available, it was found that these phylogenetic secondary structures were highly accurate (Ban et al., 2000). When it is not possible to complete phylogenetic comparison because of a limited number of available RNA sequences, secondary structure prediction via free energy minimization is often used (Hofacker, 2003; Mathews et al., 1999, 2010; Zuker, 2003). The calculations provide the predicted lowest free energy structure and a series of suboptimal structures (Figure 4B). Alternatively, decomposition of RNA sequence into basic elements and reconstruction of the 3D folding based on known patterns for these elements proved to be helpful in spotting non-canonical base-pair interactions (Parisien and Major, 2008). These approaches can be accurate for smaller RNAs, but accuracy diminishes as the size of an RNA increases. For larger RNAs, secondary structures can be deduced by using a combination of prediction and experimental constraints generated by structure probing (Mathews et al., 1999, 2010).
RNA structure has been probed both in vitro and in cellulo with dimethyl sulfate (DMS) (Lempereur et al., 1985; Mathews et al., 2004; Tijerina et al., 2007; Wells et al., 2000), selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) (Kwok et al., 2013; Merino et al., 2005) (Figure 4C), and in a variety of nuclease protection assays (Ehresmann et al., 1987). This synergistic approach has been quite accurate in decoding RNA structures, such as rRNA, viral RNAs, and others (Ding et al., 2014; Mathews et al., 2004; Merino et al., 2005; Weeks, 2010; Wilkinson et al., 2006). Likewise, methods have been developed to combine free energy minimization and sequence alignment (Bernhart et al., 2008; Mathews and Turner, 2002). Although secondary structure can provide important frameworks to develop hypotheses about the role of RNA structure, information about tertiary structure would also be helpful. Predictive approaches, in addition to the well-established experimental approaches such as X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy, are being developed to quickly and accurately decipher three-dimensional structure. It is likely that a combination of prediction and experiment, such as has been shown for secondary structure prediction, will allow for accurate models of RNA three-dimensional structure to be generated quickly (Sripakdeevong et al., 2014).
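To make the idea of computational secondary structure prediction concrete, the sketch below uses a deliberately simplified stand-in: a Nussinov-style dynamic program that maximizes the number of base pairs rather than minimizing a nearest-neighbor free energy. It is not the algorithm used by the free energy minimization programs cited above; the function name, the pairing set (Watson-Crick plus G·U wobble), and the minimum hairpin loop size of three nucleotides are assumptions made for illustration.

```python
# Allowed pairs for this toy model: Watson-Crick plus G-U wobble (an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested base pairs in seq (Nussinov-style recursion).

    min_loop is the smallest number of unpaired nucleotides allowed in a
    hairpin loop. This counts pairs only; it does not compute free energies.
    """
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]                      # case: j is unpaired
            for k in range(i, j - min_loop):         # case: j pairs with k
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


# A small hairpin: three stacked pairs closing a three-nucleotide loop.
print(max_base_pairs("GGGAAAUCC"))
```

Real predictors replace the pair count with experimentally derived nearest-neighbor energies and can incorporate probing data as constraints, which is why they scale better to the larger RNAs discussed above than this toy model would.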
A great variety of functional, non-coding RNAs regulate neuronal development and physiological function (Qureshi and Mehler, 2012). Little is currently known, however, about the role of tertiary folding in most of these RNAs. Large non-coding RNAs (lncRNAs) are attracting attention due to the regulatory role they play in gene expression (Bond et al., 2009; Faghihi et al., 2008; Khalil et al., 2009; Ng et al., 2013; Tsai et al., 2010; Wahlestedt, 2013). Multiple enhancer RNA structures facilitate recruitment of CREB binding protein (CBP) to cognate genes and thereby regulate activity-dependent gene transcription in neurons, which underlies neuronal development and synaptic plasticity (Kim et al., 2010). The lncRNA mechanisms of action known so far are mostly antisense base-pairing with target genes, formation of DNA-RNA triplexes (Geisler and Coller, 2013), and allosteric recruitment of gene-repressing peptides to their target genes (Wang et al., 2008b) (Figure 5A).
It is estimated that >90% of human pre-mRNAs are alternatively spliced, and this process is highly tissue-specific (Wang et al., 2008a). Many splicing regulators (proteins and ribonucleoproteins (RNPs)) apparently control their own syntheses (Li et al., 2007), providing important feedback loops. Some examples of splicing regulation by RNA-binding proteins and secondary structural elements are provided below. For a more rigorous review of splicing regulation by pre-mRNA secondary structure, please see De Conti et al. (2013) and Warf and Berglund (2010).
The neuro-oncological ventral antigen (NOVA) family of proteins regulates alternative splicing in neurons (Ule et al., 2006). Particularly, they control inhibitory neurotransmission in synapses by processing pre-mRNAs of glycine receptor subunit α2 (GlyRα2), GABAA receptor subunit γ2 (GABAARγ2), gephyrin, and Jnk2, amongst at least 50 other neuronal genes (Figure 3C) (Musunuru and Darnell, 2004; Ule et al., 2006). The sequence specificity of the RNA-binding K-homology (KH) domains of NOVA-1 protein was established by various methods such as systematic evolution of ligands by exponential enrichment (SELEX) (Musunuru and Darnell, 2004), UV cross-linking and immunoprecipitation (CLIP) (Ule et al., 2003), and individual-nucleotide resolution (i)CLIP (Sugimoto et al., 2012). NOVA-1 binds multiple repeats of UCAU intronic sequence in the GlyRα2 and GABAARγ2 pre-mRNAs. The binding of NOVA-1 KH-1–3 domains to RNA was studied extensively (Teplova et al., 2011) (Figure 3B), suggesting that multivalent binding of RNA-targeting motifs reorganizes the architecture of the RNA’s fold (Nicastro et al., 2015). Blockade of KH domains by antibodies disrupts NOVA–RNA interactions and causes neurodegeneration in patients with paraneoplastic opsoclonus-myoclonus-ataxia (POMA) (Luque et al., 1991).
Proteins from the muscleblind-like (MBNL) family, regulators of alternative pre-mRNA splicing, bind RNA with four zinc finger (ZnF) domains (Konieczny et al., 2014), which determine which RNAs are MBNL substrates. MBNL1, the most studied member of the family, regulates alternative splicing of the muscle-specific chloride ion channel (CLCN1) (Mankodi et al., 2002), insulin receptor (INSR) (Savkur et al., 2001), bridging integrator 1 (BIN1) (Fugier et al., 2011), cardiac troponin T (cTNT) (Philips et al., 1998), and several hundred other splicing events (Wang et al., 2012) (Figure 3C). By binding to a hairpin formed by the canonical YGCY recognition sequence in cTNT pre-mRNA, MBNL1 triggers unwinding of the structure and subsequent splicing (Konieczny et al., 2014) (Figure 3A). Transcriptome-wide analysis revealed additional extranuclear roles of MBNL proteins, such as targeting mRNAs to the rough endoplasmic reticulum and cellular membrane and regulating local translation (Wang et al., 2012).
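The sequence preferences named in this section, UCAU for NOVA-1 and YGCY (Y = C or U) for MBNL1, can be located in a transcript with a simple pattern scan. The sketch below is only a first-pass illustration: the function and dictionary names are hypothetical, the example sequences are invented, and a sequence match says nothing about the structural context that the text describes as important for binding.

```python
import re

# Motifs as described in the text; Y (pyrimidine) is written as the class [CU].
MOTIFS = {
    "NOVA1_UCAU": "UCAU",
    "MBNL1_YGCY": "[CU]GC[CU]",
}


def find_motif_sites(rna: str, pattern: str) -> list[int]:
    """Return 0-based start positions of all matches, including overlapping ones."""
    rna = rna.upper().replace("T", "U")   # accept DNA-alphabet input as well
    # A zero-width lookahead lets overlapping occurrences be reported.
    return [m.start() for m in re.finditer(f"(?=(?:{pattern}))", rna)]


# Invented example sequences, for illustration only.
print(find_motif_sites("AAUCAUCAUGG", MOTIFS["NOVA1_UCAU"]))
print(find_motif_sites("CUGCUGCUG", MOTIFS["MBNL1_YGCY"]))
```

The second example shows why r(CUG) repeats present many candidate MBNL1 sites: every repeat junction recreates a YGCY match.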
RNA structure can also regulate translation by controlling ribosomal loading. Over-expression of ribosomal proteins can cause binding to mRNA outside the context of a ribosome, thereby preventing translation. One example is L1 protein, the largest protein in the large ribosomal subunit, which binds to its own mRNA (Tishchenko et al., 2008) by recognizing a domain that resembles the rRNA fold (Nevskaya et al., 2005).
As mentioned above, a defined RNA structure also functions in the cellular delivery of proteins. Delivery of the translation complex to the endoplasmic reticulum membrane is critical for maturation and targeting of secretory and membrane-bound proteins. This process is regulated by the signal recognition particle (SRP), a GTP-dependent ribonucleoprotein complex (Doudna and Batey, 2004). A U-turn-forming Alu domain of SRP RNA plays the central role in assembly of the complex and binding to the ribosome (Halic et al., 2004) and to the SRP receptor in the endoplasmic membrane (Ataide et al., 2011) (Figure 5B). Mechanistic aspects of SRP function and the role of SRP RNA have been reviewed recently (Akopian et al., 2013).
Secondary structure elements in introns of mRNAs and in 5′ and 3′ UTRs have been identified as important markers for cellular splicing machinery (Buratti and Baralle, 2004), mRNA trafficking (e.g., axonal and dendritic transport (Gomes et al., 2014; Jung et al., 2012)), and translation initiation (Hughes, 2006). Stem-loops that contain canonical nucleotide sequences in their apical loop are markers for NOVA-1 recruiting. Notably, relative positioning of these structures determines NOVA-1-dependent skipping or inclusion of a particular exon (Figure 3) (Ule et al., 2006), similarly to MBNL-regulated mRNAs (Wang et al., 2012).
A stem-loop (with the canonical recognition pattern) in the 5′ UTR of ferritin-H and ferritin-L mRNA regulates expression of iron storage proteins (Thomson et al., 1999). The apical loop of the hairpin serves as a sensor for intracellular level of iron, hence the motif is called iron-responsive element (IRE). Interestingly, similar motifs were found in other mRNAs (Piccinelli and Samuelsson, 2007), including amyloid precursor protein (APP) (Rogers et al., 2002, 2011) and α-synuclein (Friedlich et al., 2007; Olivares et al., 2009). α-Synuclein is the prime toxic protein in Parkinson’s disease (PD) and other α-synucleinopathies (Lee and Trojanowski, 2006; Singleton et al., 2003). It forms fibrils that propagate across neurons in the brain and accumulate in Lewy bodies and Lewy neurites (Spillantini et al., 1997, 1998b). The expression level of α-synuclein is an important determinant in the rate of its fibrillization and neurotoxicity (El-Agnaf et al., 2006). Thus, down-regulating its expression is expected to be beneficial. The translation of some α-synuclein isoforms is regulated by a hairpin structure that is similar to the IRE in the 5′ UTR and was found to be iron-dependent (Febbraro et al., 2012).
Other common functional elements of structured mRNAs include three-way multibranch loops (junctions) (de la Peña et al., 2009), kink-turns (McPhee et al., 2014), hairpins, and G-quadruplexes (Bugaut and Balasubramanian, 2012). G-quadruplexes are often found in non-coding regions of mRNAs and provide structural domains recognized by functional proteins, e.g., fragile X mental retardation protein (FMRP) (Melko and Bardoni, 2010). The structure selectivity of FMRP is different from other RNA-binding proteins containing KH domains (Siomi et al., 1993), as the RGG box binds G-quadruplex-containing mRNAs (including FMR1 mRNA, which encodes FMRP) (Darnell et al., 2001). At the same time, another domain of FMRP binds between two ribosomal subunits and directly competes with eukaryotic translation elongation factors 1 alpha and 2 (eEF1A and eEF2) and aminoacyl-tRNA (Chen et al., 2014). Thereby, FMRP temporarily suspends the translation of the target mRNA and transports it from the nucleus to synapses. Phosphorylation of FMRP, due to external stimuli, decreases affinity of FMRP to the ribosome and de-represses translation of its target mRNAs. Taken together, these findings indicate that FMRP regulates activity-dependent local mRNA translation and influences cognitive processes at the cellular level (Antar et al., 2006).
An additional example of RNA-binding proteins essential for RNA transport is the Staufen family (St Johnston et al., 1991). These proteins regulate delivery of mRNA and thereby play a central role in cellular differentiation and dendritic transport (Tang et al., 2001). Staufen proteins bind and stabilize base-paired sites in RNA secondary structures, which are abundant in the 3′ UTRs of mRNAs (Sugimoto et al., 2015). The structural basis of double-stranded (ds)RNA stem recognition by Staufen and other RNA-binding proteins has been recently reviewed (Gleghorn and Maquat, 2014).
Conversion of adenosine to inosine (i.e., RNA editing) is an important process in regulating splicing patterns of membrane receptors and ion channels, which has been proposed to be targeted therapeutically (Gott and Emeson, 2000; Morabito and Emeson, 2009). In general, it is the tertiary fold of the mRNA that determines the editing pattern (Bhalla et al., 2004; Ensterö et al., 2009; Rieder et al., 2013; Tian et al., 2011). Alternative editing of the serotonin receptor HTR2C pre-mRNA plays a major role in Prader-Willi syndrome pathology, a genetic disease associated with expression of serotonin receptor isoform with reduced constitutive activity and decreased efficiency of coupling to G-protein (Morabito et al., 2010).
Many of the functional secondary structural elements described above can spontaneously emerge as pathogenic agents as a result of point mutations, sequence deletions, and expansions. These newly formed, mutated RNA motifs interfere with normal interactions and initiate pathologic processes in cells. Examples of such gain-of-function include sequestration of RNA-binding proteins, activation of cryptic splicing sites, dysregulation of site-specific RNA editing by adenosine deaminase acting on RNA (ADAR), and formation of pseudo internal ribosome entry sites (IRES) and subsequent cap-independent translation (Mirkin, 2007), i.e., repeat associated non-ATG (RAN) translation (Zu et al., 2011). Interestingly, broad scale proteomics and transcriptome studies have shown that many peptides are produced without canonical AUG start codons (Lee et al., 2012; Slavoff et al., 2013; Stern-Ginossar et al., 2012). Thus, the biological implications of understanding RAN translation are not restricted to pathologic RNA structures.
Several causes of pathogenic RNA folding are known. In fact, most often it is the development of pathology that reveals the functional role of particular RNA structural motif. Single nucleotide polymorphisms (SNP) are a common cause of abnormal RNA folding. Mutations of a single nucleobase can change conformational stability of a secondary structure element, potentially disrupting a delicate equilibrium of RNA–protein interaction networks and causing downstream pathology. One of the most studied examples is MAPT (tau) mRNA, vide infra. For more detailed discussion on the role of mutated pre-mRNA secondary structures in splicing deregulation see (Warf and Berglund, 2010).
Retrotransposon insertions into intronic regions of pre-mRNAs contribute to protein isoform diversity by activation of cryptic splicing sites. This mechanism plays important roles in evolution, brain development, and cellular differentiation but often also contributes to genetic diseases (Baillie et al., 2011; Deininger and Batzer, 1999). For example, the structured retrotransposon Alu element of 7SL RNA plays a crucial role in the assembly of SRPs (Figure 5B). The Alu motif is the second most abundant retrotransposon in the human genome; its more than one million copies comprise roughly 11% of the entire genome (Lander et al., 2001). Insertions of Alu elements into intronic regions can activate cryptic exons, which leads to formation of unnatural protein isoforms (Pagani and Baralle, 2004; Vervoort et al., 1998). Abnormal recombination driven by Alu elements can also result in deletion of large gene fragments (Iida et al., 2012; Nakayama et al., 2010), and co-migration of other pathologic RNA fragments causes propagation of associated pathology (Clark et al., 2004; Kurosaki et al., 2009, 2012).
In the case of microsatellite repeat expansion disorders, particular DNA oligonucleotide fragments (repeated sequences) fold into stable hairpins, thereby causing strand ‘slipping’ during replication, repair, and recombination. This causes formation and elongation of such repetitive fragments (Gacy et al., 1995; López Castel et al., 2010). Transcription yields the corresponding single-stranded RNA that contains the expanded repeated sequence, which is aberrantly folded due to the presence of additional secondary structural elements (Figure 6A, B). Once stably incorporated into DNA sequence, expanded repeats are even more likely to fold into abnormal structures, which leads to gradual augmentation of pathology with age and in subsequent generations, a phenomenon known as repeat instability (Kovtun and McMurray, 2008; Liu et al., 2010; López Castel et al., 2010).
As described above, RNA-binding proteins have sequence and structural binding preferences (Serganov and Patel, 2008). Some expanded repeats mimic these recognition elements and sequester proteins or RNA-processing cellular machinery away from their normal RNA targets (Echeverria and Cooper, 2012). Most often, GC-rich oligonucleotide sequences form expanded repeats: r(CAG)exp (where “exp” denotes an expanded repeating sequence) (Mangiarini et al., 1996), r(CUG)exp (Brook et al., 1992), r(CCUG)exp (Liquori et al., 2001), r(CGG)exp (Verkerk et al., 1991), and r(G4C2)exp (DeJesus-Hernandez et al., 2011) (Figure 6A). Downstream pathogenic processes in microsatellite repeat expansion disorders include deregulation of alternative pre-mRNA splicing (Philips et al., 1998; Ranum and Cooper, 2006), formation of insoluble nucleoprotein inclusions (foci) (Taneja et al., 1995), and RAN translation (Zu et al., 2011) into toxic peptides (reviewed in Krzyzosiak et al., 2012; Shin et al., 2009; Todd and Paulson, 2010). If a repeat expansion is located in a coding region, toxic proteins with a polypeptide chain encoded by the corresponding triplet codon are produced (La Spada and Taylor, 2010).
Two secondary structure elements are formed by expanded repeats: hairpins containing periodically repeating internal loops (Figure 6B) and G-quadruplexes. The latter is formed only by extremely G-rich sequences, e.g. r(G4C2)exp (Reddy et al., 2013). Some non-natural trinucleotide repeats (r(AGG) and r(UGG)) also fold into G-quadruplexes when expanded sufficiently (Sobczak et al., 2010). The central role of aberrant RNA folding in its gain-of-function has been extensively reviewed (Mirkin, 2007; Pearson, 2011; La Spada and Taylor, 2010). It is worth keeping in mind, however, that it is nearly impossible to delineate confounding role of sheer size of expanded repeats (and associated increase in stoichiometry of protein binding) from the role of their folding in RNA pathology.
Formation of hairpins by r(CUG)exp and r(CCUG)exp, with multiple 1×1 and 2×2 internal loops in the stem, leads to the development of myotonic dystrophy (DM) types 1 and 2, respectively. DM1 is an autosomal dominant disease that is the most common form of adult muscular dystrophy with an incidence of 1:8000 (Brook et al., 1992). DM2 is also autosomal dominant but with less severe symptoms than DM1 (Liquori et al., 2001). While r(CUG)exp (DM1) resides in the 3′ UTR of dystrophia myotonica protein kinase (DMPK) mRNA, r(CCUG)exp (DM2) is located in the first intron of CCHC-type zinc finger nucleic acid binding protein (CNBP aka ZNF9) mRNA. There is overwhelming experimental evidence for protein sequestration as the major source of pathology in both diseases (Fardaei et al., 2002; Kanadia et al., 2006; Lu et al., 1999; Miller et al., 2000). MBNL proteins are sequestered in nuclear foci via dynamic interaction with r(CUG)exp or r(CCUG)exp (Fardaei et al., 2002; Mankodi et al., 2001). In DM1, sequestration leads to decreased nucleocytoplasmic transport and downregulation of DMPK. More importantly, protein sequestration deregulates alternative splicing of MBNL1- and MBNL2-dependent mRNAs (Charizanis et al., 2012; Ho et al., 2004; Philips et al., 1998). Inclusion of pathologic repeats in the native DMPK locus or a heterologous gene in transgenic mice is sufficient for development of disease phenotype, underscoring the central role of expanded repeat structure in DM pathology (Mankodi et al., 2000; Monckton et al., 1997). In addition to DM1, r(CUG)exp repeats are also characteristic of spinocerebellar ataxia type 8 (SCA8) (Daughters et al., 2009), and Huntington’s disease-like HDL2 (Rudnicki et al., 2007), where they also sequester MBNL proteins in nuclear foci and deregulate MBNL-dependent pre-mRNA splicing.
A great deal of effort has been expended to determine how MBNL proteins recognize r(CUG)exp and r(CCUG)exp, hence various studies into their secondary structures. Formation of r(CUG)exp and r(CCUG)exp hairpins has been confirmed in vitro (Gacy et al., 1995), and the stability of the RNA generally increases with repeat length (Napierała and Krzyzosiak, 1997; Tian et al., 2000). Importantly, these studies indicate that the hairpin stem contains non-canonically paired loops. A number of structures of r(CUG) repeat-containing constructs have been deposited in the Protein Data Bank (PDB) (Berman et al., 2000), which confirms formation of an imperfectly paired stem. The structures resemble an A-form helix but with deeper and narrower major grooves, reminiscent of A-form DNA. Further, some structures indicate loop dynamics (Coonrod et al., 2012; DeLorimier et al., 2014; Kiliszek et al., 2009; Kumar et al., 2011a; Mooers et al., 2005; Tamjar et al., 2012). The X-ray structure of r(CCUG)3 was recently published, providing insight into the structures of 5′UC3′/3′CU5′ present in the hairpin stem (Childs-Disney et al., 2014). In contrast to A-form RNA, the helical axis of r(CCUG)3 was bent by 18.5° at the central internal loop. Major and minor groove widening and narrowing, respectively, was observed in the internal loops. Molecular dynamics simulations suggested that 5′CU/3′UC internal loops exist in a dynamic equilibrium between two conformations (Childs-Disney et al., 2014).
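The periodic internal loops described for these repeats follow directly from pairing a repeat strand against itself in antiparallel orientation. The sketch below illustrates this with an idealized register: two equal-length copies of the repeat are aligned end to end, the hairpin loop and all thermodynamics are ignored, and only Watson-Crick pairs are scored as paired. The alignment register and the function name are assumptions chosen for illustration, not a structural model from the cited studies.

```python
WATSON_CRICK = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}


def stem_pattern(repeat_unit: str, copies: int) -> str:
    """Pairing pattern of an idealized repeat stem.

    The 5'->3' strand is aligned against an identical strand read 3'->5'.
    '|' marks a Watson-Crick pair and '.' marks a non-canonical opposition.
    """
    strand = repeat_unit * copies
    partner = strand[::-1]   # the opposing strand, read 3'->5'
    return "".join(
        "|" if (a, b) in WATSON_CRICK else "."
        for a, b in zip(strand, partner)
    )


print(stem_pattern("CUG", 2))    # single mismatches: 1x1 UxU internal loops
print(stem_pattern("CCUG", 2))   # double mismatches: 2x2 CU/UC internal loops
```

In this idealized picture, r(CUG) repeats give two G-C pairs separated by one U×U opposition per repeat, whereas r(CCUG) repeats give two G-C pairs separated by a 2×2 loop, matching the 1×1 and 2×2 internal loops described for DM1 and DM2.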
A complex of tetrameric MBNL1 bound to a model RNA recognition site has been characterized by X-ray crystallography (Teplova and Patel, 2008) (Figure 3A); the structure of two MBNL1 zinc fingers bound to two single-stranded RNAs has also been refined from X-ray crystallographic data (Teplova and Patel, 2008). The preference of MBNL1 for imperfect stem-loops (with multiple U×U or CU×UC internal loops) over fully base-paired stem-loops was demonstrated experimentally (Kino et al., 2004; Warf and Berglund, 2007). Taken together, these structural studies suggest that once bound to the DM1 and DM2 hairpins, MBNL1 unwinds their helical structure, which facilitates multivalent binding (Fu et al., 2012; Konieczny et al., 2014).
In addition to splicing deregulation, r(CUG)exp also recruits ribosomes and initiates RAN translation (Zu et al., 2011). This phenomenon was initially characterized in cells and tissue derived from SCA8 and DM1 patients. Repeat length was found to be a critical determinant of RAN translation (Zu et al., 2011). These studies suggested that hairpins formed by r(CUG)exp serve as internal ribosome entry sites (IRES) and that CUG binding protein (CUGBP1) acts as an IRES translation-associated factor. After the initial discovery of the stable hairpin structure formed by r(CUG)exp, experiments on other expanded trinucleotide repeats showed that hairpins are formed by the vast majority of trinucleotide sequences (Sobczak et al., 2003), although not by all of them (Sobczak et al., 2010).
Expanded r(CGG) repeats form stable hairpins in the 5′ UTR of fragile X mental retardation mRNA (FMR1), which encodes fragile X mental retardation protein (FMRP), a key regulator of protein expression and trafficking in the central nervous system. The repeats are major molecular pathogens in fragile X-associated tremor ataxia syndrome (FXTAS) (Hagerman et al., 2001; Jacquemont et al., 2003), fragile X syndrome (FXS) (Pieretti et al., 1991; Verkerk et al., 1991), and fragile XE syndrome (FRAXE) (Gecz et al., 1996; Knight et al., 1993). In FXTAS, a premutation allele of r(CGG)exp (55–200 repeats (Jacquemont et al., 2003)) initiates RAN translation, leading to the premature start of FMR1 translation and production of non-functional FMRP with an N-terminal polyglycine chain (Todd et al., 2013). The repeats also bind and sequester various RNA-binding proteins in nuclear foci: heterogeneous ribonucleoprotein particle (hnRNP), MBNL1 (Iwahashi et al., 2006), DiGeorge syndrome critical region gene 8 protein (DGCR8) (Sellier et al., 2013), and Src-associated in mitosis, 68 kDa protein (Sam68) (Greco et al., 2006; Sellier et al., 2010), causing deregulation of alternative pre-mRNA splicing and microRNA processing (Arocena et al., 2005). Interestingly, stabilizing interruptions within the premutation allele of r(CGG)exp, such as AGG inserts, lead to branching of the hairpin structure. It was suggested that this branching precludes toxicity (Napierała et al., 2005).
FXS is caused by r(CGG) expansions of more than 200 repeats, termed the full mutation allele (Pieretti et al., 1991; Verkerk et al., 1991). Although it was known that the expansion leads to silencing of the FMR1 gene and hence loss of FMRP, the exact mechanism of silencing was only recently elucidated. Specifically, an RNA-DNA hybrid forms between r(CGG)exp and the FMR1 gene, silencing transcription via induction of chromatin remodeling (Colak et al., 2014). Stabilization of the r(CGG)exp hairpin by a small molecule prevents RNA unfolding and subsequent binding to the FMR1 promoter region, thus inhibiting gene silencing, vide infra.
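The repeat-length ranges quoted above (55–200 repeats for the premutation allele, more than 200 for the full mutation) can be illustrated with a minimal Python sketch. The function names and category labels are hypothetical and chosen only for illustration. The sketch counts the longest uninterrupted run of CGG units, so an AGG interruption ends a run; this is a simplifying assumption, not a clinical sizing method.

```python
import re


def longest_cgg_run(seq: str) -> int:
    """Return the largest number of consecutive CGG units in a sequence.

    Hypothetical helper: interruptions such as AGG terminate a run.
    """
    runs = re.findall(r"(?:CGG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_cgg_allele(repeat_count: int) -> str:
    """Classify an allele using the ranges quoted in the text.

    55-200 repeats: premutation (FXTAS); more than 200: full mutation (FXS).
    The label strings are illustrative only.
    """
    if repeat_count > 200:
        return "full mutation"
    if repeat_count >= 55:
        return "premutation"
    return "below premutation range"


if __name__ == "__main__":
    # Toy 5' UTR fragment: 60 CGG units, one AGG interruption, 10 more CGG units.
    toy_utr = "AUG" + "CGG" * 60 + "AGG" + "CGG" * 10
    n = longest_cgg_run(toy_utr)
    print(n, classify_cgg_allele(n))
```

Counting only uninterrupted runs is a design choice made for brevity; a sizing scheme that tolerates AGG interruptions would give a different count for the same toy sequence.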
Other hairpin-forming repeats include r(CAG)exp, which is most often located in coding regions and produces toxic polyglutamine-containing proteins, as observed in Huntington’s disease (HD) (MacDonald et al., 1993). It was also suggested, however, that r(CAG)exp contributes to HD toxicity directly. Like r(CUG)exp, r(CAG)exp sequesters MBNL1 and binds it with similar affinity (Mykowska et al., 2011). In addition, r(CAG)exp appears to sequester nucleolin to deregulate nuclear transport, recruit Dicer, and initiate RAN translation (Nalavade et al., 2013). In HD, the splicing factor SRSF6 binds to r(CAG)exp, which correlates with the observed deregulation of HTT (huntingtin) splicing in which exon 1 is aberrantly included. Exon 1 is followed by a premature stop codon, hence producing truncated huntingtin protein, which is the major component of nuclear inclusions in HD (Sathasivam et al., 2013). The N-terminally truncated huntingtin has been shown to be highly pathogenic in a mouse model (Mangiarini et al., 1996). Together, these findings suggest that aberrant splicing might be the major pathogenic cause of HD.
G-rich strands of RNA and DNA can form G-quadruplex structures (Sundquist and Klug, 1989). The pathogenic expanded hexanucleotide r(G4C2)exp in C9ORF72, associated with amyotrophic lateral sclerosis and frontotemporal dementia (or c9ALS/FTD) (Akimoto et al., 2014; DeJesus-Hernandez et al., 2011), folds into two structures that are in equilibrium: a hairpin displaying periodically repeating G-rich internal loops and a G-quadruplex (Haeusler et al., 2014; Reddy et al., 2013; Su et al., 2014). r(G4C2)exp is the most frequent cause of familial ALS/FTD (Akimoto et al., 2014; DeJesus-Hernandez et al., 2011). Akin to other repeat expansions, r(G4C2)exp forms ribonuclear foci that sequester RNA-binding proteins, deregulates gene expression (Donnelly et al., 2013; Haeusler et al., 2014; Xu et al., 2013), and initiates RAN translation (Ash et al., 2013; Donnelly et al., 2013). Formation of RNA-DNA hybrids (R-loops) underlies dysregulation of gene expression by r(G4C2)exp via different mechanisms (Haeusler et al., 2014; Wang et al., 2015). The RAN translation products of C9ORF72 are dipeptide repeat (DPR) proteins: poly(GlyPro), poly(GlyArg), poly(GlyAla), poly(ProAla), and poly(ProArg). Among them, the arginine-rich products appear to be the most toxic (Mizielinska et al., 2014). DPR proteins penetrate cellular membranes, accumulate in nucleoli, and impair pre-mRNA splicing and biogenesis of rRNA (Kwon et al., 2014; Mizielinska et al., 2014; Xu et al., 2013). Replacement of r(G4C2)exp by synonymous sequences encoding poly(GlyArg) and poly(ProArg) proteins causes neurodegeneration in a Drosophila model of ALS/FTD, which suggests a prevalent role for RAN translation in r(G4C2)exp-mediated pathology (Mizielinska et al., 2014).
Interestingly, C9ORF72 is bidirectionally transcribed; the antisense strand contains r(G2C4)exp, which also forms nuclear foci and toxic RAN peptides (Gendron et al., 2013; Zu et al., 2013). Moreover, recent data suggest that the antisense transcript is important in c9ALS/FTD pathogenesis, which should be taken into account in therapeutic strategies (Lagier-Tourenne et al., 2013). The contribution of both sense and antisense strands to toxicity is not unique to c9ALS/FTD, as shown by the Ranum laboratory (Daughters et al., 2009).
The stability of other RNA hairpins, not associated with repeat expansions, regulates normal alternative pre-mRNA splicing. For example, a stable stem-loop hairpin structure between exon 10 and intron 10 of MAPT dictates exon 10 inclusion or exclusion (Hutton et al., 1998; Jiang et al., 2000) (Figure 7). Mutations that destabilize the hairpin lead to more frequent inclusion of the exon and upregulation of a longer isoform of tau protein that contains four microtubule-binding domains (Liu and Gong, 2008). Overproduction of this mutant tau protein leads to frontotemporal dementia and Parkinsonism associated with chromosome 17 (FTDP-17) (Goedert et al., 1998). In contrast, stabilization of the hairpin leads to exon skipping (Donahue et al., 2006). NMR structural data confirmed that mutation of +3 G residue to A severely affects the conformation of the stem-loop (Varani et al., 1999), which regulates access of U1 snRNP to the splice site and exon inclusion (Jiang et al., 2000).
The most challenging aspect of targeting RNA structural motifs with chemical probes is achieving the required level of specificity in the presence of high concentrations of bystander RNAs. rRNA and tRNA comprise ~80% and ~15% of total cellular RNA, respectively, whereas an individual mRNA accounts for well under 1% (Johnson et al., 1975, 1977). Thus, from an abundance standpoint, a given structured RNA motif is most likely to be found in rRNA or tRNA. A modular approach in which multiple motifs are targeted simultaneously is one way to overcome this problem (Figure 6C). That is, selectivity is much improved with a multivalent compound because, although there may be a large number of single targetable RNA motifs in the transcriptome, there are far fewer RNAs that have two targetable sites separated by a specific distance (Kumar et al., 2011a).
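The statistical argument behind multivalent targeting can be made concrete with a toy calculation. The Python sketch below assumes, purely for illustration, that a targetable motif occurs independently with a fixed probability at each nucleotide position; real transcriptomes do not behave this way, and the transcriptome size and motif probability used are hypothetical numbers rather than measured values.

```python
def expected_sites(transcriptome_nt: float, motif_prob: float, valency: int = 1) -> float:
    """Expected number of positions carrying `valency` motif copies at a fixed spacing.

    Toy model (an assumption, not from the source): each copy of the motif
    occurs independently with probability `motif_prob` per position, so the
    joint probability of `valency` copies at prescribed offsets is
    motif_prob ** valency.
    """
    return transcriptome_nt * motif_prob ** valency


if __name__ == "__main__":
    total_nt = 1e8   # hypothetical number of transcribed nucleotides
    p_motif = 1e-4   # hypothetical per-position probability of one motif

    single = expected_sites(total_nt, p_motif, valency=1)
    paired = expected_sites(total_nt, p_motif, valency=2)
    print(f"single-motif sites: {single:.0f}")
    print(f"two motifs at a fixed distance: {paired:.1f}")
```

Under these assumed numbers, requiring a second motif at a defined distance lowers the expected number of off-target sites by a factor of 1/p_motif, which is the qualitative point made in the text.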
Pathogenic RNAs are traditionally targeted with antisense oligonucleotide probes. RNAs that are highly structured, however, are difficult to target with antisense oligonucleotides because the target RNA’s structure must first be unfolded. In contrast, small molecules are better suited to structured RNAs because they bind discrete motifs. Development of RNA-targeting small molecules has long been hampered by the absence of rational principles of ligand design. Structure-based approaches have been complicated by RNA’s high flexibility and low barrier for dynamic rearrangement of secondary structure elements. High-throughput screening libraries are optimized for protein targets, leading to low hit rates in RNA-targeting screening campaigns. Despite these challenges, significant advances have been made in the development of small molecules that target RNA (Disney, 2013; Disney et al., 2014; Gallego and Varani, 2001; Guan and Disney, 2012; Shortridge and Varani, 2015; Thomas and Hergenrother, 2008; Velagapudi et al., 2014). Below, we summarize the present state of the development of chemical tools to study structured RNAs and to target them therapeutically.
Several key advances have pushed the RNA-targeting field forward, including those in RNA structural biology, structure-based approaches including modeling of dynamic ensembles, and identification of RNA-binding modules (Batey et al., 2004; Childs-Disney et al., 2014; Davidson et al., 2009; Disney, 2013; Disney et al., 2014; Gallego and Varani, 2001; Jahromi et al., 2013a; Lee et al., 2010; Montange and Batey, 2006; Ofori et al., 2012; Palde et al., 2010; Parkesh et al., 2011; Shortridge and Varani, 2015; Stelzer et al., 2011; Trausch et al., 2011; Yildirim et al., 2013). High-resolution structures of ribosomes and other RNA-protein complexes combined with modeling of RNA dynamics have enabled structure-based approaches to develop new antibiotics and antivirals. Such studies have also enabled a fragment-based approach to drug design (Garavís et al., 2014). We recently reported a computational approach, named Inforna, to design small molecules that bind RNA (Velagapudi et al., 2014). Inforna uses a database of privileged RNA motif-small molecule interactions derived from a small molecule library-vs.-RNA motif library screen (2-Dimensional Combinatorial Screening; 2DCS (Childs-Disney et al., 2007; Disney et al., 2008)). This approach opens a new opportunity to overcome the selectivity problem of RNA-targeting small molecule ligands. While the number of structural building blocks is limited for RNA and targetable motifs can be present in many cellular RNAs, the probability of co-localization of two structural motifs in close proximity is much lower (see discussion in Kumar et al., 2011a). We demonstrated that a modular assembly approach using multivalent compounds that bind two motifs in close proximity increases selectivity and potency (Lee et al., 2009a; Pushechnikov et al., 2009; Tran et al., 2014).
Modular assembly is particularly attractive for targeting expanded repeats because of their own modular organization; by combining several identical RNA-targeting fragments in one molecular entity, a significant gain in affinity and selectivity can be achieved (Childs-Disney et al., 2012a) (Figure 6C). Unfortunately, as the valency of modularly assembled compounds increases, so does molecular weight, which hampers cellular permeability. To solve this problem, we demonstrated that a pathogenic RNA could template the assembly of RNA-targeting oligomers inside cells, using a disease-causing RNA as a catalyst for inhibitor synthesis at the required site of action (Rzuczek et al., 2014) (Figure 6C).
Another challenge in designing chemical probes for RNA is validation of a target in vivo. Techniques developed for resolving RNA-protein interactions (CLIP, iCLIP, etc.) are not directly applicable to small molecules. To solve this problem, we developed a technique named Chemical Cross-Linking and Isolation by Pull-down (Chem-CLIP) (Guan and Disney, 2013). It is based on selective covalent modification of a target RNA, achieved by attaching a reactive module and biotin to an RNA-binding scaffold. Reacted RNAs are subsequently captured with streptavidin-coated beads and analyzed by qRT-PCR. An extension of this technique, termed Chem-CLIP-Map, allows one to locate the binding site of the small molecule (Yang et al., 2015). After reaction in cells, total RNA is treated with an antisense oligonucleotide and RNase H. Cleaved RNA fragments that reacted with the small molecule are then isolated with streptavidin beads, thereby establishing the region of the mRNA to which the small molecule binds (Yang et al., 2015).
Several approaches have been used to improve downstream defects caused by r(CUG)exp (DM1). To inhibit inclusion of exon 7a in CLCN1 mRNA, 25-mer morpholino antisense oligonucleotides (ASOs) targeting the CLCN1 3′ and 5′ splice sites were administered to transgenic mice expressing r(CUG)250 in the 3′ UTR of the human skeletal actin gene (HSALR mice (Mankodi et al., 2002)). The oligonucleotides improved splicing of CLCN1 and eliminated the associated channelopathy (Wheeler et al., 2007). Similar effects, together with correction of other DM1-associated pathologies, were achieved with a 25-mer morpholino ASO targeting r(CUG)exp in the same mouse model of DM1 (Wheeler et al., 2009, 2012). The ASO corrected splicing of MBNL1-dependent mRNAs but also downregulated the Taxilin beta (Txlnb) gene, which contains r(CUG)9. Finally, phase I and II clinical trials of a gapmer targeting mutant DMPK mRNA have recently been initiated (Isis Pharmaceuticals, 2014). A number of small molecule probes have also been developed for targeting r(CUG)exp that displace MBNL1 and improve downstream defects (Arambula et al., 2009; Childs-Disney et al., 2012a, 2012b, 2013; Hoskins et al., 2014; Jahromi et al., 2013a, 2013b; Parkesh et al., 2012). These compounds were either identified from screening, designed from the structure of r(CUG) repeats, or designed from privileged RNA motif-small molecule interactions, including modularly assembled compounds thereof.
Several examples of small molecules that target r(CCUG)exp and improve DM2-associated defects have been reported (Childs-Disney et al., 2014; Lee et al., 2009b; Nguyen et al., 2014; Rzuczek et al., 2014). Our group reported that the aminoglycoside kanamycin A selectively binds 2×2 5′CU3′/3′UC5′ internal loops. From the X-ray structure and molecular dynamics simulations we identified that dynamic equilibrium between two conformations facilitates recognition of the small molecule. Once bound, a kanamycin derivative stabilizes one conformational state of the loop, thereby stabilizing the whole structure (Childs-Disney et al., 2014). When assembled into dimers and higher order oligomers, kanamycin exhibited high affinity and selectivity for r(CCUG) repeats (Lee et al., 2009a) and improved DM2-associated defects in a cellular model (Childs-Disney et al., 2014). During the course of our in cellulo studies, the structure of r(CCUG) repeats was refined and the binding of a dimeric kanamycin ligand modeled (Childs-Disney et al., 2014). This enabled the design of a kanamycin derivative that oligomerized in cellulo upon binding to r(CCUG)exp (Rzuczek et al., 2014). The derivative contained both azide and alkyne groups that were precisely positioned within the aminoglycoside as determined by modeling studies. Upon binding adjacent 5′CU3′/3′UC5′ in r(CCUG)exp, the alkyne group of one kanamycin is brought into close proximity to the azide group of an adjacent kanamycin. The otherwise unreactive groups react to form a stable triazole via a Huisgen dipolar cycloaddition reaction, a variant of click chemistry (Kolb et al., 2001). The in cellulo assembled oligomers exhibited potency (nM) far greater than pre-assembled oligomers and assembly only occurs in DM2-affected cells, not in healthy cells (Rzuczek et al., 2014). 
These studies also advance a new strategy in drug design, in which a drug is synthesized on-site by using a disease-affected cell as a reactor and a disease-causing biomolecule as a catalyst for drug synthesis (Figure 6C).
Following the same rational approach to the design of ligands targeting r(CUG)exp and r(CCUG)exp, our group has developed selective compounds that target r(CGG)exp in both FXTAS and FXS (Colak et al., 2014; Disney et al., 2012; Tran et al., 2014). The small molecule 1a was identified to bind 5′CGG3′/3′GGC5′ using chemical similarity searching of a known RNA binder (Disney et al., 2012). A high-throughput FRET-based screen was used to identify compounds that disrupt the binding of r(CGG) repeats to an RNA-binding protein (DGCR8Δ) (Disney et al., 2012). Small molecule 1a inhibits formation of nuclear foci and improves FXTAS-associated alternative pre-mRNA splicing defects (Disney et al., 2012). Importantly, 1a was essential in elucidating the mechanism of FMRP silencing in FXS (Colak et al., 2014). The molecule stabilizes r(CGG)exp and prevents its unfolding – a crucial step in formation of the RNA-DNA hybrid that leads to inhibition of gene expression (Colak et al., 2014).
Modular assembly of a benzimidazole derivative that binds 5′CGG3′/3′GGC5′ also improves FXTAS-associated defects (Tran et al., 2014). Notably, binding of the compound to r(CGG)exp in cellulo does not affect translation of a downstream open reading frame (ORF). This is significant as the FMR1 ORF encodes FMRP, the loss of which causes FXS. An antisense oligonucleotide also improves FXTAS-associated defects but inhibits translation of the downstream ORF (Tran et al., 2014). An analogous molecule that binds r(CGG)exp irreversibly, via the installation of a nucleic acid-reactive group, inhibits RAN translation and normalizes alternative pre-mRNA splicing patterns (Yang et al., 2015). In contrast to an antisense oligonucleotide targeting r(CGG)exp, the molecule did not affect loading of ribosomes onto mRNA or inhibit translation of the downstream ORF. By using a reactive compound, the binding sites for the small molecules were mapped to r(CGG)exp in cells using Chem-CLIP-Map (Yang et al., 2015).
Interestingly, the structures of r(CGG)exp and r(G4C2)exp have a shared motif: 1×1 nucleotide GG internal loops. We therefore completed a chemical similarity search of 1a to construct a library of small molecules that might bind r(G4C2)exp and alleviate ALS-associated symptoms. We identified three small molecules that bind r(G4C2)exp in cellulo (via Chem-CLIP), two of which inhibit r(G4C2)exp foci formation and RAN translation in a c9ALS cellular model (Su et al., 2014). One was further tested in ALS patient-derived cells (iNeurons) and shown to be bioactive (Su et al., 2014). In another study, a porphyrin-derived G-quadruplex binder, TMPyP4, was shown in vitro to disrupt G-quadruplexes formed by r(G4C2)exp and to inhibit its interaction with RNA-binding proteins (Zamiri et al., 2014).
As mentioned above, MAPT mutations can cause destabilization of RNA secondary structure, leading to deregulation of alternative splicing via altered interaction with U1 snRNP (Jiang et al., 2000). Stabilization of the RNA’s structure restores normal splicing patterns (Donahue et al., 2006) (Figure 7). A high-throughput, FRET-based screen identified that the intercalator mitoxantrone (MTX, LDN-13978) stabilizes the mutated MAPT hairpin (Donahue et al., 2007). Importantly, mere intercalating ability is not sufficient for stabilization (Donahue et al., 2007). A structural study of MTX complexed with the MAPT hairpin by NMR spectroscopy revealed that MTX binds the bulge region of the stem-loop (Zheng et al., 2009). A number of attempts were made to optimize MTX by means of classic medicinal chemistry (Yang et al., 2009; Zheng et al., 2009) or by conjugating it to aminoglycosides (Artigas and Marchán, 2015; Artigas et al., 2015). In addition, alternative chemotypes were actively sought via dynamic combinatorial chemistry (López-Senín et al., 2011; Ofori et al., 2012), and ‘Janus’-type compounds were designed to recognize the GU wobble base pair created by the mutations (Artigas and Marchán, 2013).
By using Inforna, our laboratory identified a compound that binds to the A-bulge in the MAPT mutant RNA hairpin (Luo and Disney, 2014). The molecule binds the same A-bulge that is targeted by MTX and increases the thermal stability of the mutant MAPT hairpin but not of the wild type (Figure 7). Mutation of the A-bulge to an AU pair ablates small molecule binding, thus indicating a secondary structure-specific mode of action (Luo and Disney, 2014). The compound also affects exon 10 inclusion in cellular models of disease.
The free concentration of intracellular iron has a profound effect on conformation of IREs in several mRNAs, including APP (Alzheimer’s) and α-synuclein (Parkinson’s) (Friedlich et al., 2007; Rogers et al., 2011). The IRE-like hairpins regulate the expression level of the corresponding proteins (Febbraro et al., 2012; Rogers et al., 2002). High-throughput screening yielded a number of active compounds that regulate iron-dependent APP (Bandyopadhyay et al., 2006) and α-synuclein (Rogers et al., 2011) expression levels. It is plausible that an iron-dependent RNA conformational switch may serve as a mechanistic link connecting iron levels and protein-induced pathologies (Fine et al., 2015). Targeting the IRE-like hairpin directly is thus an attractive alternative therapeutic option in AD and PD.
The expanding functions of ncRNAs open up new opportunities for drug discovery. Structural biology and biophysical studies provide rational design principles for RNA-targeting ligands. There is no doubt that the number of functionally active RNA structural motifs will grow due to development of bioinformatic tools, our expanding knowledge of RNA structural biology, and the sheer size and diversity of the human genome. At the same time, the complexity of newly discovered RNA regulatory networks poses a significant challenge to the validation of bioinformatic transcriptome-wide analyses, and hence to the design of selective chemical tools and therapeutics. Variable tissue-specific patterns of pre-mRNA splicing, the emerging functional role of RNA localization, and complex tangles of feedback loops are just a few examples of the ‘known unknowns’ of RNA cellular biology. In addition, despite substantial progress in understanding RNA biochemistry, biophysics and structure, the relevance of these discoveries for in vivo biology remains disputable and requires rigorous validation. Nevertheless, in our opinion, it is an unaffordable luxury to disregard RNA as a potential drug target. Various therapeutically relevant structured ncRNA targets were outlined in this review, and strategies for targeting them provide a broad range of opportunities in drug development for currently incurable diseases.