The regulation of gene expression by interactions between nucleic acid-binding proteins and G-quadruplexes is an area of intense interest. Bioinformatic analyses have predicted that there are potentially over 350,000 G-quadruplex-forming sequences in the human genome [1,2], and recent studies of RNA G-quadruplexes in the transcriptome predict that they may be even more extensive than previously appreciated [3,4]. Functionally, G-quadruplex structures in RNA have been implicated in almost all aspects of pre-mRNA and mRNA metabolism, including mRNA stability, IRES-dependent translation initiation [5], translational repression [3,6,7], alternative splicing [8,9] and alternative polyadenylation/3′ end formation [10,11], suggesting that the G-quadruplex may be an important regulatory motif in many aspects of gene expression.
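Genome-wide counts of this kind are typically obtained by scanning sequence for a simple folding rule. The sketch below is only an illustration of that idea, not the method of the cited analyses: it assumes the commonly used rule of four runs of at least three guanines separated by loops of 1 to 7 nucleotides, and the names `PQS_PATTERN` and `find_pqs` are hypothetical.

```python
import re

# Assumed folding rule (an illustration, not taken from the cited studies):
# four G-tracts of >= 3 guanines separated by three loops of 1-7 nucleotides.
# Both DNA (T) and RNA (U) alphabets are accepted.
PQS_PATTERN = re.compile(r"(?:G{3,}[ACGTU]{1,7}){3}G{3,}")


def find_pqs(sequence):
    """Return (start, end, motif) for each non-overlapping putative
    quadruplex-forming sequence; coordinates are 0-based, end-exclusive."""
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in PQS_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Toy RNA with four GGG tracts and single-nucleotide loops.
    for start, end, motif in find_pqs("AAGGGAGGGCGGGUGGGAA"):
        print(start, end, motif)
```

A pattern match only indicates that a sequence could fold into a G-quadruplex; whether it does so in vivo has to be established experimentally. Only one strand is scanned here, so a genomic survey would also need to search the reverse complement.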
The Fragile X mental retardation protein (FMRP) is a regulatory RNA binding protein that binds with high affinity to guanine-rich RNAs capable of G-quadruplex formation [12,13]. Loss of FMRP function leads to Fragile X Syndrome, which is the most common form of inherited mental retardation, afflicting 1 in 2500 males, and the leading single-gene cause of autism [14]. Fragile X Syndrome almost always results from a triplet repeat expansion in the 5′-UTR of the FMR1 gene, leading to abnormal methylation of the gene, repression of transcription, and hence complete loss of FMRP expression. However, one severely affected patient harbors a missense mutation in one of the RNA binding domains [15], and mice harboring this mutation are indistinguishable from FMRP null mice in most behavioral and electrophysiological assays and in macroorchidism [16], suggesting that loss of RNA binding activity may underlie the synaptic dysfunction observed in the disease.
Recent evidence suggests that FMRP functions to repress mRNA translation in neurons, leading to the widely held view that Fragile X Syndrome is a disease of “runaway translation” resulting in inappropriate gene expression with dire consequences for synaptic plasticity. This occurs despite the presence of two autosomal paralogs, FXR1P and FXR2P, which are also expressed in the brain. While all three proteins share a great deal of homology in their N-termini, KH domains and nuclear export signal, the C-termini containing the RGG box have diverged considerably. Consequently, the KH domains share RNA binding specificity, but binding to G-rich sequences capable of forming G-quadruplexes is specific to FMRP [17]. These observations, together with interest in identifying the in vivo RNA ligands of FMRP, suggest that understanding the RGG box - G-rich RNA interaction will be important in understanding the human disease.
Binding of FMRP to G-rich RNAs requires only the RGG box RNA binding domain of FMRP, which is rich in arginines and glycines. In vitro selection identified guanine-rich RNA motifs that can bind tightly to FMRP, such as the 36-nt r(GCUGCGG...UUGCGCAGCG) sequence named sc1 [12]. Binding of FMRP to sc1 RNA was shown to depend on G-quadruplex formation based on the following observations: (i) the binding affinity was significantly increased in the presence of K+ as compared to Li+, and (ii) mutations of guanines within G-tracts abolished binding.
To understand how G-rich RNAs could be recognized by the RGG domain of FMRP, we used NMR to characterize the 1:1 complex between the 36-nt sc1 RNA and a 28-aa peptide from the RGG domain of FMRP (RGG peptide) (Fig. 1). We prepared complexes with various RNA and peptide constructs, including unlabeled, uniformly 13C,15N-labeled, as well as residue-type-specific and site-specific 13C,15N-labeled sc1 RNA and RGG peptide molecules. The structure of the complex revealed a number of new motifs and recognition principles between the FMRP RGG peptide and the sc1 RNA duplex-quadruplex junction. The peptide-RNA intermolecular contacts and the contributions of shape complementarity to molecular recognition were subsequently validated by filter binding assays on structure-guided RNA and peptide mutants.
Figure 1. Sequence and NMR spectra of sc1 RNA and RGG peptide. (a) Sequence of 36-mer sc1 stem-loop RNA and 28-mer FMRP RGG peptide. (b,c) 1H,15N HSQC spectra of RGG peptide in the (b) free and (c) sc1 RNA-bound states. Backbone amide resonance assignments are ...