Thermodynamic stabilities of 2 × 2 nucleotide tandem AG internal loops in RNA range from −1.3 to +3.4 kcal/mol at 37 °C and are not predicted well with a hydrogen-bonding model. To provide structural information to facilitate development of more sophisticated models for the sequence dependence of stability, we report the NMR solution structures of five RNA duplexes: (rGACGAGCGUCA)2, (rGACUAGAGUCA)2, (rGACAAGUGUCA)2, (rGGUAGGCCA)2, and (rGACGAGUGUCA)2. The structures of these duplexes are compared to that of the previously solved (rGGCAGGCC)2 (Wu, M., SantaLucia, J., Jr., and Turner, D. H. (1997) Biochemistry 36, 4449−4460). For loops bounded by Watson−Crick pairs, the AG and Watson−Crick pairs are all head-to-head imino-paired (cis Watson−Crick/Watson−Crick). The structures suggest that the sequence-dependent stability may reflect non-hydrogen-bonding interactions. Of the two loops bounded by G-U pairs, only the 5′UAGG/3′GGAU loop adopts canonical UG wobble pairing (cis Watson−Crick/Watson−Crick), with AG pairs that are only weakly imino-paired. Strikingly, the 5′GAGU/3′UGAG loop has two distinct duplex conformations, the major of which has both guanosine residues (G4 and G6 in (rGACGAGUGUCA)2) in a syn glycosidic bond conformation and forming a sheared GG pair (G4-G6*, GG trans Watson−Crick/Hoogsteen), both uracils (U7 and U7*) flipped out of the helix, and an AA pair (A5-A5*) in a dynamic or stacked conformation. These structures provide benchmarks for computational investigations into interactions responsible for the unexpected differences in loop free energies and structure.
Understanding RNA’s functions in a plethora of cellular processes, including catalysis (1−3), and in selective alteration of those processes (4−7) begs further investigation into the connections between sequence, structure, and function. Elucidation of secondary and three-dimensional structure, however, cannot keep up with the rapid acquisition of primary sequence. With the assistance of experimental data, such as thermodynamics, NMR spectra, and chemical and enzymatic mapping, computational predictions can greatly increase the speed for discovery of structure (8−15). Still, the combination of these methods falls short of absolute accuracy due to an incomplete grasp of the various interactions involved in RNA secondary and three-dimensional structure formation. Thus, further investigations of folding interactions are needed.
The symmetric 2 × 2 nucleotide internal loop is a motif with a wide range of thermodynamic stabilities that are not explained by simple models (16−24). For example, 2 × 2 internal loops with the motif 5′AG/3′GA have free energy increments that range from −1.3 to +3.4 kcal/mol at 37 °C depending on the adjacent canonical base pair (Table 1). This spread corresponds to an ~2000-fold range in equilibrium constant for folding. Structural models will provide a good starting point for theoreticians to develop explanations for the range of stabilities, but such structural information for these loops is lacking. Therefore, NMR structures were determined for five duplexes with the 5′AG/3′GA motif (see Table 1 for sequences and abbreviations).
At 37 °C, the order of thermodynamic stabilities of symmetric AG internal loops closed by canonical base pairs is 5′GAGC/3′CGAG > 5′CAGG/3′GGAC > 5′UAGA/3′AGAU > 5′AAGU/3′UGAA ≥ 5′GAGU/3′UGAG > 5′UAGG/3′GGAU (Table 1). Only the loops bounded by GC or CG base pairs are stabilizing, i.e., have a negative ΔG°loop, and the range of ΔΔG°loop exhibited is not that expected from a simple hydrogen-bonding model. For example, the 5′UAGA/3′AGAU loop is 2.5 kcal/mol more stable than the 5′UAGG/3′GGAU loop at 37 °C, amounting to an ~50-fold increase in binding constant. Watson−Crick pairs provide more stable interactions, but if one presumes the expected base pairing and number of hydrogen bonds, then the difference in stabilities is not explained.
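The fold changes quoted above follow from the standard relation K1/K2 = exp(ΔΔG°/RT). The following is a minimal Python sketch of that arithmetic, assuming R = 1.987 × 10⁻³ kcal/(mol K) and T = 310.15 K (37 °C). The function name `fold_change` is illustrative and does not come from the original work.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)
T_37C = 310.15        # 37 °C expressed in kelvin


def fold_change(ddg_kcal, temperature=T_37C):
    """Ratio of equilibrium constants for a free energy difference in kcal/mol."""
    return math.exp(abs(ddg_kcal) / (R_KCAL * temperature))


# Full spread of loop free energy increments: -1.3 to +3.4 kcal/mol
print(round(fold_change(3.4 - (-1.3))))  # on the order of 2000

# 5'UAGA/3'AGAU versus 5'UAGG/3'GGAU: 2.5 kcal/mol
print(round(fold_change(2.5)))           # between 50 and 60
```

Because the relation is exponential, each additional 1.4 kcal/mol at 37 °C corresponds to roughly a further factor of 10 in the equilibrium constant.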
More subtle differences are also observed. For example, 5′GAGC/3′CGAG is ~0.6 kcal/mol more stable than 5′CAGG/3′GGAC, but NMR data indicate that both loops have cis Watson−Crick/Watson−Crick (imino) AG pairs adjacent to Watson−Crick GC pairs (Table 1). Similarly, the 5′UAGA/3′AGAU loop is ~0.8 kcal/mol more stable than the 5′AAGU/3′UGAA loop despite both apparently having identical pairing motifs and thus the same number of hydrogen bonds (17).
Comparisons of the thermodynamics for 5′AG/3′GA internal loops with those of 5′GA/3′AG internal loops with identical closing base pairs are also interesting. For instance, the 5′GGAC/3′CAGG loop is 1.3 kcal/mol more stable than the 5′GAGC/3′CGAG loop, even though both have imino-paired GA pairs on the basis of 1D NMR spectra (17). Similarly, the 5′UGAG/3′GAGU loop is 3.3 kcal/mol more stable at 37 °C than 5′UAGG/3′GGAU, although in this case the GA pairing changes from trans Hoogsteen/sugar edge (sheared) to cis Watson−Crick/Watson−Crick (imino) (Table 1). In contrast, the stabilities of 5′CAGG/3′GGAC and 5′CGAG/3′GAGC are identical at 37 °C, which is also essentially true for 5′UAGA/3′AGAU and 5′UGAA/3′AAGU at 37 °C. There is also a structural change in these cases; the 5′GA/3′AG loops have sheared GA pairs, while the 5′AG/3′GA loops have imino AG pairs (Table 1). Consideration of possible hydrogen-bonding patterns and backbone contortions indicates that while 5′GA/3′AG internal loops can adopt either sheared or imino-pairing conformations, 5′AG/3′GA internal loops flanked by Watson−Crick pairs are constrained to be only imino-paired (Table 1) as predicted by Gautheret et al. (25).
Here, we report the solution structures of 5′AG/3′GA internal loops closed by GC, UA, AU, GU, and UG pairs and compare them to the solution structure for the 5′CAGG/3′GGAC loop (23). The structures of the AG pairs bound by Watson−Crick closing pairs are all head-to-head imino-paired (cis Watson−Crick/Watson−Crick) (Table 1). As such, the number and types of AG hydrogen bonds cannot account for the stability differences for loops closed by Watson−Crick pairs.
Of AG loops bound by GU or UG pairs, only 5′UAGG/3′GGAU adopts wobble UG closing pairs and apparently weak imino AG pairs in the loop. For 5′GAGU/3′UGAG, there are two duplex conformations. These were resolved by bromination of C-8 on G6 of (GACGAGUGUCA)2, which precludes the anti conformation of G6 (26−28). The 5′GAGU/3′UGAG loop’s major conformation is novel, with both of the loop guanines, G4 and G6, having a syn glycosidic bond and U7 flipped out of the helix. The loop guanines form a GG pair (G4-G6*, G·G trans Watson−Crick/Hoogsteen) that has been reported in the ribosome (29,30) and in an ATP/AMP-binding RNA aptamer (31,32), but never with both guanines in the syn conformation. These solution structures provide benchmarks for theoreticians exploring the underlying causes of the sequence dependence of loop stability and structure.
Oligonucleotides, rGACGAGCGUCA, rGACGAGUGUCA, rGACAAGUGUCA, rGACUAGAGUCA, and rGGUAGGCCA, were purchased from Dharmacon RNA Technologies or IDT. Oligonucleotides with a modified G, rGACGABrGUGUCA, rGACBrGAGUGUCA, and rGACGAMeGUGUCA, were synthesized as previously described (33,34). “NMR buffer” is 10 mM sodium phosphate with 0.5 mM Na2EDTA and 80 mM NaCl at pH 6.1, which has been filter-sterilized with Corning 0.22 μm PES filters. Samples for NMR spectra were dissolved in RNase-free water, dialyzed against NMR buffer for 48 h at 4 °C, dried down, and resuspended with RNase-free 90% water/10% D2O, in a volume equal to that removed from dialysis. For spectra in D2O, oligonucleotides were lyophilized and resuspended in 99.96% D2O three times and then lyophilized and resuspended in 99.996% D2O from Cambridge Isotopes. The single strand concentrations for rGACGAGCGUCA, rGACGAGUGUCA, rGACAAGUGUCA, rGACUAGAGUCA, and rGGUAGGCCA were each ~2.0 mM, unless otherwise noted. Samples in NMR tubes were incubated in a water bath at 80 °C for 5 min and then allowed to slowly cool to 4 °C over a course of ~30 min for annealing of the duplex.
NMR spectra were taken on Varian Inova spectrometers at 500 or 600 MHz. The 1D imino proton spectra were taken using an S pulse for excitation (35). Except for (rGGUAGGCCA)2, NOESY spectra in 90% H2O/10% D2O were acquired at 0 or 1 °C with mixing times of 75 and 150 ms. For (rGGUAGGCCA)2, 125 and 150 ms spectra were acquired at 1 °C, and an additional 150 ms spectrum was taken at −3 °C. Conditions for NOESY spectra recorded in D2O are given in Supporting Information.
Measurements of scalar couplings were derived from TOCSY, DQ-COSY, and 31P−1H HETCOR spectra. The 2D clean-TOCSY spectra (36) were acquired at short and medium mixing times of approximately 13 and 36 ms, respectively, with wrapping in the f1 dimension for high resolution in the sugar proton region. DQ-COSY spectra with the same resolution as the TOCSY spectra were acquired for 5′GAGC/3′CGAG and 5′UAGG/3′GGAU. Measurement of proton−phosphorus scalar couplings and assignment of H3′ resonances were aided by 31P−1H HETCOR spectra. One-dimensional 31P spectra were also acquired. Verification of peak assignments was provided by natural abundance 13C−1H heteronuclear single-quantum coherence (HSQC) spectra acquired for all duplexes. NMRPipe (37) was used for data processing, and Sparky (38) was used for peak assignments and integration. Assignments are listed in Tables S1−S5 in Supporting Information.
Distance restraints were generated from cross-peaks in 75 ms mixing time NOESY spectra using (1/r)⁶ scaling. Cross-peaks from H5-H6 (2.45 Å) and H1′-H2′ (2.75 Å) in the Watson−Crick stems were used for reference volumes. Watson−Crick hydrogen bond restraints were applied between bases not adjacent to AG pairs as indicated by imino proton cross-peaks in NOESY spectra. Dihedral angle restraints were determined based on sugar proton and phosphorus scalar couplings taken from TOCSY, DQ-COSY, NOESY, and 31P−1H HETCOR spectra. Strong H3′-H4′ peaks and the absence of H1′-H2′ peaks in the TOCSY spectra indicated C3′-endo sugar puckers (δ ~ 81°). H4′-H5′/H5′′ J-couplings less than 2 Hz indicated γ was not in the trans or g− conformation, and so the γ dihedral angle was restrained to g+ (γ ~ 60°). In cases where J(H4′-H5′/H5′′) > 7 Hz, γ was restrained to be trans (γ ~ 180°). 31P(n + 1)-H3′(n) J-couplings greater than 14 Hz indicated ε ~ −115° (excluding g+). Weak 31P-H5′/H5′′ cross-peaks in 1H−31P HETCOR spectra (J-coupling <6 Hz) indicated β in the trans conformation (~165°). Couplings within the stem residues were within typical A-form ranges for all duplexes. Consequently, backbone dihedrals in the stems were restrained to A-form values: α (−65 ± 90°), β (165 ± 75°), γ (60 ± 60°), ε (−115 ± 125°), and ζ (−70 ± 90°) as defined previously (39). Near the loops, angles β, γ, δ, and ε were restrained to A-form values if indicated by the data. For all duplexes except 5′GAGU/3′UGAG and 5′UAGG/3′GGAU, J(H4′-H5′/H5′′) for the loop G was greater than 7 Hz, so γ was taken to be trans (γ ~ 180°), although assignments for H5′ and H5′′ were not stereospecific. No cross-peak for G6P or for G5H4′ could be identified for 5′UAGG/3′GGAU, so γ and ε were not restrained. For all duplexes, α and ζ were not restrained although no phosphorus shifts were observed outside of a 1 ppm range, except for A5P of 5′GAGU/3′UGAG.
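The (1/r)⁶ scaling used for the distance restraints can be illustrated with a short Python sketch. It assumes the isolated spin-pair approximation, in which cross-peak volume is proportional to r⁻⁶, so that an unknown distance follows from a reference pair of known separation. The function name and the volumes are hypothetical, and the sketch does not reproduce how upper and lower restraint bounds were assigned in the actual calculations.

```python
def noe_distance(volume, ref_volume, ref_distance):
    """Distance (in Å) from a NOESY cross-peak volume, assuming V ∝ r**-6."""
    if volume <= 0 or ref_volume <= 0:
        raise ValueError("cross-peak volumes must be positive")
    return ref_distance * (ref_volume / volume) ** (1.0 / 6.0)


H5_H6_REF = 2.45  # Å, pyrimidine H5-H6 reference distance quoted in the text

# Hypothetical volumes in arbitrary integration units
ref_volume = 1000.0
print(noe_distance(ref_volume, ref_volume, H5_H6_REF))       # same volume as reference
print(noe_distance(ref_volume / 64, ref_volume, H5_H6_REF))  # 64-fold weaker peak
```

The sixth-root dependence means that a 64-fold weaker cross-peak corresponds to only a doubling of the distance, which is why NOE-derived distances are usually applied as ranges rather than exact values.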
Glycosidic bonds were set to be anti (χ = 255 ± 85°) for all residues not exhibiting large H8/H6-H1′ NOE cross-peaks. Only G4 and G6 of 5′GAGU/3′UGAG had large H8-H1′ cross-peaks. Dihedral angles in the loop and adjacent residues of 5′GAGU/3′UGAG are unusual and are discussed in Results.
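The dihedral restraints above are stated as a center value plus or minus a half-width, and because angles are periodic the comparison has to wrap around 360°. The Python sketch below shows one way to test whether a measured angle satisfies such a restraint. The dictionary simply restates the A-form and anti ranges given in the text; the helper names are illustrative assumptions and are not part of CNS or AMBER.

```python
def angular_difference(angle, center):
    """Signed difference (angle - center) in degrees, wrapped to [-180, 180)."""
    return (angle - center + 180.0) % 360.0 - 180.0


def within_restraint(angle, center, half_width):
    """True if angle lies within center ± half_width, allowing for periodicity."""
    return abs(angular_difference(angle, center)) <= half_width


# (center, half-width) in degrees, as listed in the text
A_FORM = {
    "alpha": (-65.0, 90.0),
    "beta": (165.0, 75.0),
    "gamma": (60.0, 60.0),
    "epsilon": (-115.0, 125.0),
    "zeta": (-70.0, 90.0),
}
ANTI_CHI = (255.0, 85.0)

# A trans gamma (~180°) falls outside the A-form gamma range
print(within_restraint(180.0, *A_FORM["gamma"]))  # False
# chi = -105° is the same angle as 255°, so it counts as anti
print(within_restraint(-105.0, *ANTI_CHI))        # True
```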
Simulated annealing and molecular dynamics calculations were carried out using distance and dihedral angle restraints generated from NMR data. The structures were calculated with the following protocol using implicit solvent: (1) high temperature dynamics at 5000 K in torsion angle space for 4 ps with NOE and dihedral scale factors of 150 kcal/mol Å² and 25 kcal/mol rad², respectively; (2) simulated annealing in torsion angle space for 40 ps with slow cooling from 2000 to 0 K (40000 steps) with NOE and dihedral scale factors of 75 kcal/mol Å² and 100 kcal/mol rad², respectively; (3) simulated annealing for 40 ps in Cartesian space with slow cooling from 1000 to 0 K (40000 steps) with the NOE and dihedral angle scale factors constant at 75 kcal/mol Å² and 100 kcal/mol rad², respectively; the van der Waals factor was linearly increased from 1 to 4; and (4) Powell energy minimization was applied with full van der Waals and electrostatic terms. A total of 40 structures were calculated in this way from randomized initial atom velocities using the program CNS version 1.2 (40). Ten structures with the lowest total energies and without distance violations (>0.2 Å) were subjected to an additional 100 ps of restrained MD using the program AMBER (version 9, ff99 force field). In this protocol, the system was heated to 600 K followed by slow cooling to 0 K over 100000 steps using NOE and dihedral scale factors of 20 kcal/mol Å² and 30 kcal/mol rad², respectively, and a generalized-Born implicit solvent model (41). Distance and dihedral restraints used in the CNS calculation were also used in the AMBER calculation. The structures of 5′AAGU/3′UGAA, 5′GAGC/3′CGAG, 5′UAGA/3′AGAU, and 5′UAGG/3′GGAU are deposited with the RCSB Protein Data Bank with ID codes 2KXZ, 2KY0, 2KY1, and 2KY2, respectively.
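The Cartesian annealing stage described in step 3 can be pictured with a small Python sketch. The step count, duration, temperature end points, and the linear increase of the van der Waals factor come from the text; the assumption that temperature also falls linearly is ours, since the actual CNS schedule may lower the temperature in discrete decrements. All names are illustrative and none of this is CNS input syntax.

```python
def linear_ramp(start, end, n_steps):
    """Yield n_steps + 1 evenly spaced values from start to end, inclusive."""
    for i in range(n_steps + 1):
        yield start + (end - start) * i / n_steps


N_STEPS = 40000   # steps in the Cartesian annealing stage
TOTAL_PS = 40.0   # duration of that stage in picoseconds

# Implied integration time step: 40 ps over 40000 steps
timestep_fs = TOTAL_PS * 1000.0 / N_STEPS

# Assumed linear cooling from 1000 K to 0 K
temperatures = list(linear_ramp(1000.0, 0.0, N_STEPS))
# van der Waals scale factor raised linearly from 1 to 4 (stated in the text)
vdw_scale = list(linear_ramp(1.0, 4.0, N_STEPS))

midpoint = N_STEPS // 2
print(timestep_fs, temperatures[midpoint], vdw_scale[midpoint])
```

Ramping the van der Waals term up while the temperature falls lets atoms pass through one another early in the search and enforces realistic packing only as the structure settles.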
The 1D NMR spectra of the imino proton region for (rGACAAGUGUCA)2 (denoted 5′AAGU/3′UGAA), (rGACUAGAGUCA)2 (denoted 5′UAGA/3′AGAU), (rGACGAGCGUCA)2 (denoted 5′GAGC/3′CGAG) (Figure 1), and (rGGUAGGCCA)2 (denoted 5′UAGG/3′GGAU) (Figure 2) indicate that these duplexes have essentially a single conformation. For each of these duplexes, five peaks are observed between 10 and 14.5 ppm at ~0 °C, corresponding to G imino protons in the loops and the expected Watson−Crick pairs, though the G5 and U3 imino peaks of 5′UAGG/3′GGAU are overlapped (Figure 2).
The 2D SNOESY spectra for 5′AAGU/3′UGAA, 5′UAGA/3′AGAU, 5′GAGC/3′CGAG, and 5′UAGG/3′GGAU confirm the presence of the expected Watson−Crick and imino AG (cis Watson−Crick/Watson−Crick) pairs for each of these duplexes. The imino proton region of the 2D NOESY spectrum for 5′GAGC/3′CGAG is shown in Figure 3 and is representative of the 2D NOESY spectra for 5′AAGU/3′UGAA, 5′UAGA/3′AGAU, and 5′UAGG/3′GGAU (see Supporting Information). The imino peaks for Watson−Crick pairs were confirmed by cross-strand NOEs to cytosine amino protons for GC pairs and adenine H2s for AU pairs. The H2 protons of the loop adenines (A5H2 and A5*H2, where * represents the second strand) have strong cross-peaks to the imino and amino protons of the cross-strand loop guanosines (G6* and G6) (Figure 3). This eliminates the possibility of a sheared AG pair, because neither of these cross-peaks would be seen if the pair was sheared. The imino protons in the loops (G6H1 in Figure 1, G5H1 in 5′UAGG/3′GGAU in Figure 2) have downfield chemical shifts, which are indicative of relatively stable, hydrogen-bonded G imino protons.
In 5′AAGU/3′UGAA and 5′UAGA/3′AGAU, U7H3 and U4H3, respectively, are shifted upfield (12.6 and 12.7 ppm) relative to other Watson−Crick AU pairs in the duplexes studied, which average 14.4 ppm (Figure 1). The 15N bound to U7H3 and U4H3 have their resonances at ~161 ppm, which is consistent with expectations for a UA pair (Supporting Information). The U7H3 proton is exchanging rapidly with water, which suggests weak hydrogen bonding in this closing base pair. Interestingly, the 5′AAGU/3′UGAA loop is the least thermodynamically stable of all loops closed by Watson−Crick base pairs.
The chemical shifts of G6H1 and U3H3 of 5′UAGG/3′GGAU are fairly typical of GU pairs (Figure 2), and a strong NOE cross-peak between these protons in 2D spectra is consistent with the U3 and G6 closing pair being in a GU wobble (cis Watson−Crick/Watson−Crick) conformation. The AG hydrogen bonding is apparently not always formed, as suggested by a relatively weak G5H1-A4H2 cross-peak compared to the duplexes with Watson−Crick closing pairs and by rapid exchange with water, as indicated by an exchange cross-peak (Figure S2 in Supporting Information). The chemical shift of the G5 imino resonance in the AG pair is nearly identical to that observed for the imino AG pair in 5′CAGG/3′GGAC (23). Diffuse intensity observed at ~11.0 ppm in the 1D spectrum of 5′UAGG/3′GGAU is probably due to a small population of hairpin loops. This alternate conformation becomes more evident at higher temperatures (Figure 2).
The 1D spectrum for (rGACGAGUGUCA)2 (denoted 5′GAGU/3′UGAG) has 11 peaks between 9 and 15 ppm, indicating multiple conformations (Figure 2). Spectra measured after decreasing strand concentration and heating the sample to 70 °C followed by rapid cooling (to favor hairpin) or after increasing strand concentration and reannealing slowly (to favor duplex) demonstrated that there was a minor concentration of hairpin. In addition, there is at least one minor conformation of the duplex (Figure 4). The minor duplex conformation is upward of 30% of the total duplex population, however, and gives rise to a significant number of major-to-minor conformation exchange cross-peaks in the 2D spectra of 5′GAGU/3′UGAG. Replacement of the G4 or G6 H8 protons with bromine resulted in a marked reduction in minor conformation peaks, with bromination of G6H8 yielding five sharp imino resonances between 10 and 15 ppm at the same chemical shifts as the peaks of the major conformation (Figure 2). Bromination at the H8 position forces glycosidic bonds to adopt a syn conformation (26−28,33). Thus, the spectra in Figure 2 suggest that the major conformation of the 5′GAGU/3′UGAG loop has G6 in the syn conformation. Most of the analysis of this motif used spectra obtained from the construct with 8-bromoguanine for G6, (rGACGABrGUGUCA)2 (denoted 5′GABrGU/3′UBrGAG). Assignment of the imino protons was confirmed by a 15N−1H HSQC spectrum of 5′GAGU/3′UGAG (see Supporting Information). The lack of a clear U7 imino resonance and the absence of any NOE cross-peaks to it suggest that the expected U7-G4 closing wobble pair is not formed. The downfield shift of the G6 imino indicates involvement in a hydrogen bond, but no G6H1-A5H2 NOE cross-peak is observed. Thus, formation of a GA imino pair in the loop is not indicated.
A strong G4*H8-G6H1 cross-peak (Figure 5), however, suggests a G6/G4* trans Watson−Crick/Hoogsteen pair, where the asterisk denotes a cross-strand interaction. This is supported by the relatively upfield shift and rapid water exchange of G4H1, suggesting lack of hydrogen bonding for this proton. A G6H1-G8H1 cross-peak suggests U7 is not between G6 and G8.
A similar construct with a methyl group instead of a bromine in the H8 position of G6 was synthesized (denoted 5′GAMeGU/3′UMeGAG). Features in the exchangeable proton spectra were essentially the same as for 5′GABrGU/3′UBrGAG (Figure 2).
The 2D NOESY spectra of nonexchangeable protons for 5′AAGU/3′UGAA, 5′UAGA/3′AGAU, 5′GAGC/3′CGAG, and 5′UAGG/3′GGAU exhibit NOE patterns mostly typical of A-form RNA, even in the loop regions (Supporting Information). For instance, loop adenine H2 protons show two weak or medium H1′ cross-peaks rather than the strong cross-peaks typically observed for sheared GA pairs. TOCSY spectra indicate weak H1′-H2′ and strong H3′-H4′ scalar coupling typical of C3′-endo sugar puckers, and 31P−1H correlation spectra indicate strong H3′-P and weak H5′/H5′′-P scalar coupling in both Watson−Crick stems and AG loops, indicative of A-form backbone dihedrals throughout. However, J(H4′-H5′) > 7 Hz for the G in the AG imino pairs indicates that the γ dihedral angle is not in the typical g+ conformation, and chemical shifts of the phosphorus, H3′, H4′, H5′, and H5′′ nuclei of this G exhibit a common pattern which is distinct from equivalent nuclei in Watson−Crick stems. For 5′UAGG/3′GGAU, broad resonances for G5H1′, H2′, H3′, and H8, as well as for G6H8, and the absence of cross-peaks for G6P (possibly due to a broad G5H3′) provide further evidence of the dynamic nature of this loop.
The 2D NOESY spectrum for 5′GAGU/3′UGAG, which represents two conformations, exhibits large G4H8-H1′ and G6H8-H1′ cross-peaks, indicating that both G4 and G6 have a syn glycosidic bond in the major conformation (Figure 5). Furthermore, 2D NOESY spectra of the G6 8-bromo duplex, 5′GABrGU/3′UBrGAG, retain the large G4H8-H1′ cross-peak seen in spectra of the unbrominated duplex, indicating that the glycosidic bond of G4 retains the syn conformation. The 2D NOESY spectrum of the duplex, 5′GAMeGU/3′UMeGAG, shows large G4H8-H1′ and G6H8Me-H1′ cross-peaks (Supporting Information). Thus, both G4 and G6 of 5′GAMeGU/3′UMeGAG have syn glycosidic bonds. Other features in spectra of the 5′GABrGU/3′UBrGAG duplex suggest U7 is out of the helix. These include lack of a U7H6-G6H2′ cross-peak, medium cross-peaks for G8H8-G6H1′ and G8H8-G6H2′, a medium cross-peak for G8H8-G6H8Me (observed in 5′GAMeGU/3′UMeGAG; see Supporting Information), and an extremely weak or absent G8H8-U7H6 cross-peak. The ribose moieties for residues 4−7 all exhibit J(H1′-H2′) ≥ 6 Hz and J(H3′-H4′) < 2 Hz, indicating that these residues have a C2′-endo sugar pucker. Also of note is the chemical shift of A5P which is more than 1 ppm downfield from those in canonical pairs.
Information about the structure of the minor conformation of 5′GAGU/3′UGAG can be extracted from 2D NOESY spectra. In the minor conformation, G4 and G6 no longer exhibit large H1′-H8 cross-peaks, suggesting these bases are not in syn conformations. Also, the minor conformation chemical shifts of U7H5 and H6 are substantially upfield relative to their shifts in the major conformation, suggesting that this base is now stacked with other bases rather than being in an extrahelical orientation (Supporting Information Figure S6).
Structures were modeled by restrained molecular dynamics and simulated annealing as described in Materials and Methods. Low energy models of 5′GAGC/3′CGAG, 5′AAGU/3′UGAA, 5′UAGA/3′AGAU, and 5′UAGG/3′GGAU without violations from NMR data are overlaid in Figure 6. The stems and closing base pairs of all four loops form cis Watson−Crick/Watson−Crick pairs, and the loops are imino-paired (AG cis Watson−Crick/Watson−Crick). The purine−purine pairing widens the duplex from an average cross-strand C1′−C1′ distance of 10.0 ± 0.1 Å for the Watson−Crick base pairs to an average of 12.5 ± 0.1 Å for the imino AG pairs. In addition, the γ dihedral angle between the two loop residues is forced into the trans conformation instead of the A-form g+ conformation. In the models, this γ switch is accompanied by a change in the α angle from −71° to +140° (crankshaft conformation), although there is no NMR observable to verify this. Interestingly, if the NMR restraints were removed and this γ dihedral angle was forced to be g+ at the beginning of a 25 ns simulation with the AMBER99 force field, then it rapidly switched to trans and returned to g+ only rarely (see Supporting Information). Thus, AMBER99 predicts this structural feature well. The structural flexibility suggested by models for 5′UAGG/3′GGAU (Figure 6) is consistent with NMR spectra as described above.
A schematic of pairing in 5′GAGU/3′UGAG as inferred from NMR data is presented at the bottom of Figure 7, showing the G4-G6* base pair, an A5-A5* interaction, and U7 flipped out of the helix. The G4-G6* pair with both guanines in the syn conformation and forming a GG trans Watson−Crick/Hoogsteen pair is illustrated at the top of Figure 7. The A5-A5* interaction is represented in the schematic, though the exact conformation is unclear from the spectra. In particular, the NMR spectra are consistent with either a sheared A5-A5* pair in rapid exchange or stacking of A5 on A5*.
Some unusual chemical shifts can be compared to predictions by the program NUCHEMICS (42). The average shift of an adenine H2 in an AU pair is 7.67 ppm for the structures presented here, which is also the average shift of all adenine H2s reported to the Biological Magnetic Resonance Bank (43). In contrast, A4H2 in 5′AAGU/3′UGAA and A7H2 in 5′UAGA/3′AGAU are shifted upfield to 6.68 and 6.63 ppm, respectively. NUCHEMICS overpredicts the upfield shift of these AH2 protons by about 0.4 ppm, at 6.24 ± 0.09 and 6.27 ± 0.14 ppm, respectively. Only two adenine H2s in AU pairs and surrounded by canonical pairs in the BMRB database have larger upfield shifts (6.4 and 6.49 ppm) than these two protons. The upfield shifts in 5′AAGU/3′UGAA and 5′UAGA/3′AGAU support the positioning of these adenine H2s in our models, where they are stacked between the pyrimidine rings of the 5′ cross-strand and 3′ intrastrand purines (Figure 8).
As shown in Table 2, the four AG imino duplexes studied here have an ~0.4 ppm upfield shift of the H2′ in the residue that is 5′ of the AG loop when compared to the chemical shifts of the H2′ protons of Watson−Crick pairs flanked by Watson−Crick pairs. Examination of spectra for three other loops with imino cis Watson−Crick/Watson−Crick tandem AG or GA pairing revealed similar chemical shifts (Table 2). This shift might be explained by the necessary widening of the duplex to accommodate the purine−purine pair. NUCHEMICS (42) underpredicts these chemical shifts by an average of 0.4 ppm.
Much remains to be discovered about the interactions determining RNA structure and energetics (15). For example, differences in free energy increments of 2 × 2 nucleotide RNA internal loops with tandem AG base pairs are not fully explained by counting hydrogen bonds. To provide structural benchmarks for theoretical calculations of interactions, we report the NMR solution structures of five tandem AG internal loops and compare them to the previously solved 5′CAGG/3′GGAC loop structure (23). These structures will help to provide insight into the subtle connections between sequence and thermodynamics. The structures will also contribute to prediction of 3D structures by homology modeling. A search of RNA FRABASE (52) revealed no 3D structures of a natural RNA with a 5′AG/3′GA internal loop. The 5′AAGU/3′UGAA loop, however, occurs in the secondary structure of pre-miR-890 RNA (44), and a similar loop, 5′GAGG/3′CGAC, occurs near the 3′ end of the HIV genome (45). Interestingly, 5′AG/3′GA represents less than 0.5% of 2 × 2 nt internal loops in a database of 1899 secondary structures whereas 5′GA/3′AG loops comprise 20% of the database (53).
Solution structures of the loop regions of 5′GAGC/3′CGAG, 5′AAGU/3′UGAA, 5′CAGG/3′GGAC, and 5′UAGA/3′AGAU revealed that each of the loops is wholly cis Watson−Crick/Watson−Crick and imino GA paired (Table 1 and Figure 6). This suggests that a subtle interaction between stacked base pairs contributes to the difference in stabilities of comparable loops. Because the number of hydrogen bonds is identical, the average 0.7 kcal/mol greater stability of 5′GAGC/3′CGAG relative to 5′CAGG/3′GGAC and of 5′UAGA/3′AGAU relative to 5′AAGU/3′UGAA (Table 1) must be due to other interactions in or near the loop. The structures shown in Figure 9 suggest a possible source of the sequence dependence of stability when the tandem AG loops are closed by Watson−Crick pairs. The spatial arrangement of the amino and carbonyl partial charges in the major groove appears to be more electrostatically favorable for 5′GAGC/3′CGAG and 5′UAGA/3′AGAU than for 5′CAGG/3′GGAC and 5′AAGU/3′UGAA. As shown in the yellow boxed regions in Figure 9, in the 5′GAGC/3′CGAG and 5′UAGA/3′AGAU motifs, each major groove amino group is stacked on a carbonyl group, whereas in the 5′CAGG/3′GGAC and 5′AAGU/3′UGAA motifs, each amino group is stacked on an amino group and each carbonyl group is stacked on a carbonyl group. The same structural difference may also contribute to the 2.9 kcal/mol difference in stability between the 5′GAGC/3′CGAG and 5′AAGU/3′UGAA motifs. A transition from 5′GAGC/3′CGAG to 5′AAGU/3′UGAA involves a loss of two hydrogen bonds, but on average 2 × 2 nucleotide symmetric internal loops without AG or GA pairs that are closed by GC are only 1.7 kcal/mol more stable than those closed by AU (16).
In Table 1, the 5′GA/3′AG motif is thermodynamically either more stable than or as stable as the 5′AG/3′GA motif. There are probably several reasons for this. One is that the 5′GA/3′AG motif can form either sheared or imino pairs in order to provide a backbone conformation able to accommodate a Watson−Crick pair 5′ of the A (25). Thus the 5′GA/3′AG motif has two options for maximizing base stacking interactions (46). The sheared bases are also able to make hydrogen bonds to the opposite backbone (21,24,50). Furthermore, it has been suggested for the 5′GA/3′AG motif that interactions of a non-hydrogen-bonded nonplanar guanosine amino group with a carbonyl group stacked on it contribute to stability (46). In the tandem 5′AG/3′GA motif, the loop guanosine’s amino hydrogens are not close to a group with partial negative charge. The closest approach is ~3.8 Å, to the oxygen of a carbonyl group on the same strand.
The structures of the two 2 × 2 nucleotide AG symmetric loops expected to be closed by GU pairs provide a remarkable contrast. The structure of 5′UAGG/3′GGAU has the expected UG wobble and AG imino pairs. Rapid exchange of the AG imino proton with water suggests its hydrogen bond is weak or not formed at all times. Several resonances in the loop show broadening indicative of dynamic sampling of alternate conformations, as is also evident from the overlap of structures in Figure 6. The high degree of flexibility in this loop correlates with the lowest stability. In contrast, the structure of 5′GAGU/3′UGAG is quite different. First, as demonstrated in the imino proton spectra in Figures 2 and 4, the duplex exists in two conformations. In the major conformation, G4 and G6 have syn glycosidic bonds (Figure 7). The presence of the minor conformation at ~25−30% prevented an accurate determination of the major structure of this duplex. Bromination or methylation of carbon-8 in G6 of (rGACGAGUGUCA)2 eliminated the minor conformation (Figure 2).
NMR spectra of 5′GABrGU/3′UBrGAG reveal that the major conformation is novel. G4 and G6 pair with the cross-strand G6* and G4* residues, respectively, in G-G N7-imino pairs with all four G residues’ glycosidic bonds in the syn conformation (Figure 7). This pairing forces U7 and U7* to flip out of the loop. The adenines, A5 and A5*, apparently interact in a dynamic way, though it cannot be determined whether they form a sheared pair or stack. The structural features of (GACGAGUGUCA)2 are quite different from those of a shorter duplex with the 5′GAGU/3′UGAG motif, (GCGAGUGC)2 (18). A 1D spectrum of (GCGAGUGC)2 has five major peaks with chemical shifts consistent with wobble GU and imino AG pairs. This is a dramatic non-nearest neighbor structural effect. The results suggest that the shorter duplex may be more flexible, which allows interactions in the 5′GAGU/3′UGAG motif to dominate, whereas the Watson−Crick and 3′ dangling end interactions dominate in the longer duplex and provide local restraints that make the structure with wobble GU and imino AG pairs less favorable than the observed structure. In fact, NOE and chemical shifts of the minor conformation observed for (GACGAGUGUCA)2 (Supporting Information Figure S6) are also consistent with a structure with wobble GU and imino AG pairs, further highlighting the thermodynamic similarity of these two structures. The results suggest that homology modeling of RNA three-dimensional structure may have to consider interactions beyond nearest neighbor.
The thermodynamic increments for symmetric 2 × 2 nucleotide 5′AG/3′GA internal loops have been fully mapped (9,10,17,18,47,48), but contributions leading to the observed order of stabilities are not understood. To determine the extent to which hydrogen bonding, electrostatics, and base-stacking influence the local structure, theoreticians will need to develop more accurate models for these systems. The structures reported here provide necessary benchmarks to facilitate expansion of knowledge regarding the roles of various interactions in determining local RNA structure and stability. The major structure of (GACGAGUGUCA)2 is novel and suggests that homology modeling and energetic calculations of RNA may need to consider interactions beyond nearest neighbors.
We thank Prof. Susan Schroeder for comments on the manuscript.
†This work was supported by NIH Grant GM22939 (to D.H.T.).
Tables of chemical shift assignments, distance and dihedral angle restraints, structure calculation statistics and 1H−31P HETCOR, natural abundance 1H−15N HSQC, TOCSY, SNOESY, and NOESY spectra, and results of an unrestrained molecular dynamics simulation of 5′GAGC/3′CGAG. This material is available free of charge via the Internet at http://pubs.acs.org.