|Home | About | Journals | Submit | Contact Us | Français|
Locked Nucleic Acids (LNAs) are RNA analogues with an O2′-C4′ methylene bridge which locks the sugar into a C3′-endo conformation. This enhances hybridization to DNA and RNA, making LNAs useful in microarrays and potential therapeutics. Here, the LNA, L(CAAU), provides a simplified benchmark for testing the ability of molecular dynamics (MD) to approximate nucleic acid properties. LNA χ torsions and partial charges were parametrized to create AMBER parm99_LNA. The revisions were tested by comparing MD predictions with AMBER parm99 and parm99_LNA against a 200 ms NOESY NMR spectrum of L(CAAU). NMR indicates an A-Form equilibrium ensemble. In 3000 ns simulations starting with an A-form structure, parm99_LNA and parm99 provide 66% and 35% agreement, respectively, with NMR NOE volumes and 3J-couplings. In simulations of L(CAAU) starting with all χ torsions in a syn conformation, only parm99_LNA is able to repair the structure. This implies methods for parametrizing force fields for nucleic acid mimics can reasonably approximate key interactions and that parm99_LNA will improve reliability of MD studies for systems with LNA. A method for approximating χ population distribution on the basis of base to sugar NOEs is also introduced.
Locked Nucleic Acids,1−9 or LNAs, are synthetic nucleic acid analogues with a methylene bridge between O2′ and C4′, i.e., ribose replaced with 2′-O-4′ C-methylene-β-d-ribofuranose (Figure (Figure1).1). The methylene bridge ensures a C3′-endo, or North (N)10 conformation, restricting the sugar to a pseudorotation phase angle between approximately 15° and 25°, similar to A-form RNA. The restricted mobility of LNA results in more favorable thermodynamic stability3,6,11−13 of duplexes containing LNA, presumably because there is less loss in conformational entropy upon duplex formation compared to RNA, 2′-O-methyl RNA, or DNA. LNAs are effective in intracellular inhibition of gene expression by a variety of mechanisms, including siRNA,14−22 and are used in array assays.23,24 Here, the LNA, L(CAAU), is used to provide a simplified benchmark for testing the ability of molecular dynamics (MD) to approximate properties of nucleic acids. In particular, MD simulations are compared to ensemble averages of 1H–1H distances and of torsion angles detected by NMR.
Molecular dynamics simulations of LNA duplexes have been investigated25−28 with CHARMM2729,30 and AMBER.31−34 Both force fields were developed for DNA and RNA but were also used for LNA. Recent MD studies of RNA, however, have shown that reparametrization of a single torsion angle, χ, greatly improves agreement with a variety of experimental benchmarks.35−41 Here, we report parametrization of χ torsions and RESP charges for LNA residues to create AMBER force field parameters, parm99_LNA. A separate parametrization of the LNA glycosidic torsion is necessary because LNAs and RNAs have different quantum mechanical energy profiles, due to the flexibility of the ribose in contrast to rigidity of the LNA sugar, as χ and sugar pucker are highly, though not perfectly, correlated.42−45 The LNA sugar is covalently locked, and so any inaccuracies in sugar stretching/bending, torsions, etc., will be greatly reduced and the predicted structures and dynamics should be more accurate than for RNA. Thus, LNA minimizes the negative effects of inaccurate sugar torsions and makes a simpler system for testing other components of nucleic acid force fields.
NMR46−48 probes the ensemble average of states rapidly exchanging in solution. In particular, mean distances and torsional angles can be determined from NOE volumes and J-couplings, respectively. Thus, results from NMR spectra can be compared to predictions from an ensemble generated by MD simulation.
Unpaired L(CAAU) provides a good test system for aspects of force field development for nucleic acids because only a limited number of parameters are important. The sugar conformation is fixed and there is no Watson–Crick49 hydrogen bonding. Both terminal and internal stacking of bases are present, however. Moreover, the single strand allows relatively unrestricted fast movement, so that simulations can sample a range of structures and also test simulations with a different starting structure than implied by NMR.41
Here, the force field is parametrized on the basis of quantum mechanical (QM) calculations which can be applied to any of the many modifications being used in practical applications of oligonucleotides. This contrasts with methods that parametrize on the basis of crystal structures of natural RNAs to give “knowledge-based” force fields suitable for RNA.50−53 The QM approach is more general, as illustrated by this modified force field, parm99_LNA.
L(CAAU) was synthesized by methods previously described.54,55 The sample was dissolved in 80 mM NaCl, 20 mM phosphate, 0.1 mM Na2EDTA, pH 7 in H2O, and twice redissolved in 99.9% D2O (Cambridge Isotope Laboratories) after vacuum centrifugation overnight, and finally dissolved in 99.99% D2O (Aldrich). The concentration of L(CAAU) was 1.1 mM. 1H and 31P spectra were taken, respectively, with Varian Inova 500 and 600 MHz (1H resonance frequency) spectrometers and indirectly referenced to the HDO signal. 1H–31P scalar coupling56 was determined by comparing 31P coupled and decoupled 800 ms NOESY spectra. NMR processing made use of NMRPipe,57 and assignments were made from 200, 600, and 1000 ms mixing time NOESY experiments and a 40 ms TOCSY. All experiments were run at 2 °C, which maximized NOE volumes. NOE volumes were obtained from the 200 ms NOESY spectrum with the box method in Sparky 3.113.58
It is assumed that the dynamics of the tetramer are dominated by global motions which are faster than local motions, i.e., that all NOEs59,60 within the molecule will scale similarly in relation to distance. Such global motion is represented by the rotational relaxation time τr. This is predicted to be about 1 ns for a sphere of radius 8 Å at 275 K according to eq 1(61)
where η is the solvent viscosity (1.79 × 10–3 Pa·s),62V is molecular volume, kB is Boltzmann’s constant, and T is temperature in kelvin. Stacking/unstacking motions should be less important as they occur on a time scale of approximately 100 ns.63−65
The locked distances within the LNA bicyclic ribose facilitate NMR analysis because they provide fixed distances (Table S1) to solve for the NOE scaling factor. The scaling factor, c, relating a NOESY volume between hydrogens i and j, Vij, to distance r is defined by eq 2
A set of fixed sugar distances (Table S1) was used to solve for the mean scaling factor, c, and its standard deviation, cSD. Although the H5–H6 distances are also fixed, they could not be used because of zero quantum coherence effects.66 The fixed distances were determined from a ωB97X-D/6-311G(2d,p)67,68in vacuo minimization of the LNA nucleoside cytidine. The ωB97X-D functional was selected as it was shown to have good results in test sets of dispersion-corrected DFT functionals.69 The distances were checked against an LNA X-ray structure (PDB ID 2X2Q),70−72 which confirmed that the heavy-atom distances were accurate within 0.055 Å (Table S2).
Equation 2 was expanded to account for measurement errors,41 which come from three principal effects: baseline noise, uncertainty in scale factor, and measurement error. The standard deviation of 20 blank peak-sized regions’ “volumes,” Verr, was measured to compensate for baseline noise error. The systematic error in measurement of Vij was accounted for by parameter mv, which reflects that variations in choice of dimensions of the box integrals can produce changes in Vij. The mv was conservatively chosen to be 4/3 (Table S1). The higher uncertainty limit in each NOE distance, r, is provided by eq 3a which has the largest possible numerator and the smallest possible denominator. The lower uncertainty limit is given by eq 3b, which has the smallest possible numerator and the largest possible denominator.
For each of the fixed distances, the experimental NOESY volume and mean scaling factor, c, were used to calculate the distance. All the fixed distances fell within the range predicted by eqs 3a and 3b.
The χ torsion populations were inferred from sugar (H1′, H2′, H3′) to base proton (H6 for C and U, or H8 for A) distances. Model systems for C, A, and U nucleosides (Figure (Figure2)2) were rotated at 5° intervals around χ.
With the χ torsional angles fixed, Gaussian0968 was used to optimize and calculate QM energies for each nucleoside with the hybrid DFT functional and basis ωB97X-D/6-311G(2d,p) to yield energy vs torsional angle curves. For each nucleoside, there were three minima (Figure S1), labeled G+, T, and G– as each falls within one of those regions following IUPAC conventions (Table 1).74−76
IUPAC regions C, A+, and A- were ignored as they are rare in RNA crystal structures77 and in the MD simulations. If any one of the C, A+, or A- torsions were highly populated, then it would not be possible to distinguish between more than three conformations using only three NMR distances. For example, the first minimum (Figure S1) is near 35°, and is referred to as G+; the second minimum occurs around 190°, and is referred to as T; and the third minimum is near 290° and is referred to as G-. Each minimum yields a distance between H8 or H6 and each of the H1′, H2′, and H3′ atoms, or 3 × 3 = nine distances for each nucleoside. For each of the nine QM predicted distances, the larger and smaller predicted NOE volumes were calculated from eqs 4a and 4b to allow comparisons to measured NOE volumes.
The AMBER force field calculates potential energy in a classical manner, as a function of bond stretching (1–2 interactions), angle bending (1–3 interactions), torsion rotation (1–4 interactions), van der Waals interactions,78 and electrostatic potential.31,79 To modify AMBER parm99 for LNA, the atomic partial charges and torsion potentials for χ were parametrized as described below.
Charges were derived for LNA nucleosides, A, C, G, and U, following the RESP protocol (Tables S3–S6),79−81 as previously described for RNA.81 These RESP charges were used in all LNA simulations. A library of 16 residues was created: LXN, LX, LX5, and LX3. Here, X = A, C, G, and U, and N represents the nucleoside. LX is an internal nucleotide; LX5 and LX3 are the 5′ and 3′ terminal residues, respectively.
LNA nucleosides, A, C, G, and U, were created with the LEaP module of AMBER9. The molecules were optimized and the electrostatic potentials at a set of grid points were calculated with HF/6-31G(d)82−84 QM level of theory using Gaussian03.85 Charges for these nucleosides were then calculated with the RESP module. The sugar atoms were made equivalent,81 except for C1′ and H1′.
For each LNA residue, four different initial geometries were chosen. For β = H5T-O5′-C5′-C4′, γ = O5′-C5′-C4′-C3′, and ε = C4′–C3′–O3′-H3T, the combinations were (173°, 172°, 208°), (180°, 55°, 208°), (70°, 63°, 170°), and (70°, 63°, 208°), respectively. Note that H5T and H3T refer to hydrogen atoms at 5′ and 3′ ends, respectively. These torsion angles were chosen as they are common in X-ray databases and result in smooth QM energy profiles.
For each conformation, a potential energy surface (PES) scan was done around the glycosidic torsion angle, χ, with increments of 10°, where χs for pyrimidines and purines are defined as O4′-C1′–N1-C2 and O4′-C1′–N9-C4, respectively (Figure (Figure1).1). For each conformation in the PES scan, the structures were first optimized with HF/6-31G(d) level of theory and QM energies were calculated with MP2/6-31G(d) level of theory.86 Separate χ parameters for purines and pyrimidines were calculated following a published procedure.35 A total of 2 × 4 × 36 = 288 QM data points were used in the fitting for purines (A, G) and for pyrimidines (C, U). These were fit to eq 5.35
In eq 5, ϕ1 and ϕ2 are the O4′-C1′–N1-C6 and C2′-C1′–N1-C6 torsion angles, respectively, for pyrimidines, and O4′-C1′–N9-C8, and C2′-C1′–N9-C8 for purines (Figure (Figure1)1) where ϕ1 – ϕ2 ≈ 120°. Vn1 and Vn2 are the new torsional energy barriers calculated after fitting data by linear least-squares to the Fourier series shown in eq 5 (see Table S6A for frcmod87 file).
Multiple starting structures provide rigorous benchmarks of a force field by testing whether a highly unstable structure can be restored to one consistent with experimental data. Multiple starting structures also assist with convergence, which can be difficult even in 2 μs replica exchange simulations of the tetramer r(GACC).88 Two different starting structures were used here: (I) an A-form-like structure generated with AMBER’s nucgen program and modified by removing H4′ atoms and adding C6′, H6′, and H6′′ atoms (Figure (Figure1),1), and (II) a syn structure generated from simulated annealing as described below and outlined in Table S7.
The syn starting structure was generated with 2000 simulated annealing runs of 5 ns each with torsional restraints on χ. Each succeeding simulation started from the previous one. The implicit solvent generalized Born method was used during simulated annealing. Minimization, Particle Mesh Ewald, and periodic boundaries were disengaged, with a 10 Å cutoff for long-range nonbonding interactions. Chiral restraints were used to prevent chiral inversions. Velocity limit was set to 10 AMBER units, or 0.49 Å/ps. The weak temperature coupling algorithm was employed. Salt concentration was set at 1.0 M. The seed for the random number generator was set at 398. Both starting structures were solvated in a truncated octahedral 8.65 Å box with TIP3P water and neutralized with three Na+ ions using AMBER9’s LEaP program.
L(CAAU) was held fixed using positional restraints of 500 kcal/(mol Å2) while the surrounding water was minimized with the steepest descent method for 500 steps and then conjugate gradient method for 500 steps. Constant volume dynamics with nonbonded cutoff of 10 Å were used. The whole system was then minimized with the steepest descent method for 1000 steps, and then the conjugate gradient method for 1500 steps with a 10 Å cutoff for nonbonded interactions.
The final minimized structure was equilibrated for 200 ps with positional restraints on L(CAAU). Temperature was gradually increased at constant volume to the NMR temperature of 275 K using Langevin dynamics with a collision frequency of 1/ps. This step also used a 10 Å cutoff of nonbonded interactions. During the equilibration, bonds involving hydrogen atoms were constrained with SHAKE.89,90
The system was again equilibrated for 100 ps starting at 300 K and cooling to 275 K using production parameters described below. During these preliminary minimizations and equilibrations, the A-form starting structure remained relatively constant but the all-syn structure changed, in particular, each nucleotide left the syn conformation.
Production runs were done with a 2 fs time step at 275 K to match the NMR temperature, and with 10 Å nonbonded cutoff, 1 atm constant pressure, isotropic position scaling, 2 ps pressure relaxation time, with no position restraints. SHAKE was used to provide bond length restraints for H atoms. MD simulations were run for 3000 ns to allow stacking and unstacking of bases. Previous studies have shown that such interactions in RNA are on the order of approximately 100 ns.63−65 The trajectory file was written every 0.1 ns. Production runs were executed on the University of Rochester’s IBM BlueHive cluster, where the simulations progressed at approximately 25–30 ns/day on 8 Xeon processors requesting 499 MB RAM.
MD simulations with the parm99 and parm99_LNA force fields were run with A-form and all-syn starting structures. Each simulation was scored by taking MD structures generated every 0.1 ns and averaging them over 20 ns stepped every 0.1 ns and over the entire 3000 ns according to how well the predictions matched NMR spectra. There were two components to the scoring program: (I) 28 observed distances and (II) five torsional angles based on scalar couplings, H3′-P3′ for C1, A2, and A3 related to three ε torsions, and a combination of H5′-P5′ and H5″-P5′ for A3 and U4 related to two β torsions. The first predicted NMR ensemble average is reported at 20 ns, and the 20 ns window is shifted in 0.1 ns steps to the end of the simulation at 3000 ns. Each point reports the percentage of the predicted structural properties from the MD simulation that are consistent with NMR spectra within error limits (vide infra). Thus, the score as a function of time provides an indication of how similar the ensemble of structures in each 20 ns interval is to the ensemble average reported by NMR. The average over the entire 3000 ns provides a comparison between the ensemble average of the MD simulation and that reported by NMR. Obtaining a 100% score should only be possible if the simulation is accurate and long enough to sample the entire ensemble. Reservoir replica exchange MD (R-REMD)91 on the RNA, GACC, indicate that 3000 ns MD is not long enough to sample the entire ensemble, but can generate the species most populated when R-REMD is used to reach convergence.88
For each of the 28 measured NMR distances, rNOE trajectory time points were read into an array as ri–6, for a total of 30000 ri. Then, the mean for each 20 ns interval, according to eq 6, was written to an array of means with 29800 points.
Here, θ = β – 120° for H5′-P5′, θ = β + 120° for H5″-P5′, and θ = ε + 120° for H3′-P3′. Because eqs 7a and 7b give nearly identical predicted 3J(1H–31P), only eq 7a was used for comparisons between MD and NMR.
Karplus functions, such as eqs 7a and 7b, are trigonometric. Therefore, multiple different angles can give the same 3J scalar coupling. Furthermore, a given error in Hz cannot be easily equated to an error in degrees. Thus, one cannot say that the simulation has a correct β or ε torsion angle, only that the torsion is consistent or inconsistent with the measured scalar coupling.
For scoring of predictions from MD simulations, the MD-generated β torsions were stored as predicted H5′-P5′ and H5″-P5′ scalar couplings. If the MD-predicted scalar couplings were within 1 Hz of the NMR value, then that time interval was scored +1.
The ε torsions were interpreted directly instead of as scalar couplings. NMR 3J scalar coupling gives four possible ranges for torsional angles, but only the range near 195° is energetically favorable for a C3′-endo sugar.93 The width of the range is also dependent on the 3J value. For example, with a value of 8.6 ± 0.5 Hz for 3J1H–31P, eq 7a gives possible ranges for ε1 of 358–3° (i.e., −2 to 3°), 117–122°, 209–221°, and 259–271° while eq 7b gives ranges of 358–3°, 117–122°, 210–223°, and 257–270°. This implies a range of 209–223° for ε1.
Each trajectory point was also scored by resemblance to A-form. Each of 19 backbone torsional angles in L(CAAU), excluding δ, was given ranges based on RNA X-ray data (Table 2). Each trajectory point was then scored according to how many of the torsions were within the specified limits.
The energy profiles of LNA nucleosides with respect to χ torsions are shown in Figure Figure3.3. Table 3 contains the revised χ torsional parameters derived from fitting the QM energies. The QM profiles are different from those of RNA parm99, i.e., the minima and barrier heights are different. As was the case for RNA, the altered shapes will affect thermodynamic equilibria and increased barriers to rotation should slow predicted kinetics determined by the χ torsion. The LNA parameters show lower barrier heights than RNA parm99 χ_Yil,35 however, suggesting LNA χ torsions have faster dynamics than RNA, especially for A and G.
Figures S2 and S3 show chemical shift temperature dependence for L(CAAU). No cooperative transitions were observed from 0 to 81 °C, indicating that L(CAAU) is single-stranded. Figure Figure44 shows a NOESY walk for L(CAAU), and resonance assignments are presented in Table S8. All H1′ resonances are singlets, consistent with the expected C3′-endo conformation. Table 4 compares the NMR distances with those from MD simulations. Figure Figure55 compares average A-form distances94−97 to NMR distances. The results imply that L(CAAU) is approximately A-form, but NOEs for nH2′ to (n + 1)H6/8 are weaker than expected for A-form (Table S9A), indicating a slight deviation from A-form. However, these deviations are predicted by the force-fields (Table 4).
QM calculations on models of C, A, and U defined three energy minima for the χ torsion (Figures (Figures22 and S1), corresponding to G+ (syn), T (anti), and G- (high-anti) conformations. Equations 2, 4a, and 4b were used to predict the NOE volumes and their experimental limits for intranucleotide cross peaks between the H6 (C and U) or H8 (A) and their H1′, H2′, and H3′ sugar protons (Table 5). The results indicate that the NOEs to H1′ and H3′ can identify a G+ conformation and those to H2′ can identify a G- conformation. As shown in Table 5, only the T conformation is consistent with NOEs to H1′, H2′, and H3′. Evidently, all nucleotides are majority T in solution. Attempts to determine χ on the basis of 1H–13C scalar coupling constants and 13C chemical shifts were unsuccesful due to the low concentration of the natural abundance sample synthesized.
The lack of an LNA H4′ atom means that the γ torsion is not directly observable by 1H–1H scalar coupling. Measured β and ε scalar couplings are presented, respectively, in the caption and the third column of Table 6. The 3JH5′-P5′ and 3JH5″-P5′ couplings for the β3 and β4 torsions were 5.0 and <1 Hz, respectively, which implies from eqs 7a and 7b two possible ranges for β: 162–166° or 194–198°. Alternatively, the β torsion could also rapidly change between the two regions and still show the same scalar coupling values. The β torsion range for an A-form structure is approximately 165–195°.
Table S10 contains means of backbone and glycosidic torsion angles, corrected for circular discontinuity,98 from the nucgen A-form-like and the all-syn starting structures and the average values from MD simulations. Backbone and glycosidic torsions are defined following standard conventions.92 Unminimized starting PDB files are printed in Tables S11A and S11B.
L(CAAU) maintained a relatively stable A-form like structure and agreement with NMR values until approximately 190 ns (Figure S6A). The pseudorotation phase angle for each of the nucleotides varied from −4° to 44° with a mean of 20° and a standard deviation of approximately 5°, which is consistent with a C3′-endo sugar. The time before collapse from the A-form was longer for L(CAAU) than the ~10 ns and ~50 ns observed for A-form r(GACC)37 and r(CCCC),41 respectively, presumably due to the more restricted LNA sugar. The collapse of L(CAAU) was started by an intercalation of C1 between A2 and A3, with sharp changes in χ2, ε2, ζ2, α3, β3, and γ3 (Figures S6 and S6A at 200 ns). The simulation only briefly recovered an A-form like structure. At ~800 ns, the molecule adopted C1-A3 and A2-U4 stacks, which are not consistent with NOESY spectra as there are neither C1-A3 nor A2-U4 cross-peaks. This transition is correlated with sharp changes in χ and ζ. From 1640 to 1850 ns, the structure primarily sampled random-coil structures, although an A-form structure with an inverted C1 was present from ~1751 to 1805 ns. This structure may have a small population in solution, although no NOEs indicate this.
Measured and predicted mean 3J scalar couplings and β and ε angles are presented in Table 6 and its caption. Measured and predicted χ torsions, respectively, are in Table 5 and Figure Figure6.6. With parm99, all of the χ torsions are predicted to have substantial populations in the G- (high-anti) conformation (Figure (Figure6),6), which does not agree with NMR (Table 5). The proclivity of AMBER parm99 to prefer high-anti (G-) over anti (T) orientations has also been seen in other RNA simulations.37,38
Distributions of MD-predicted torsions for parm99 simulations are plotted in Figure S5. The β torsions clustered at a T orientation, making occasional transitions to G+ and G-. For A3 and U4, the MD simulation predicted relatively stable β torsions of ~180°, which is between the regions around 164° and 196° consistent with the H5′/5″-P5′ scalar couplings according to eqs 7. The MD-predicted ε torsions all clustered at T positions, i.e., between 150° and 210°.
The α, γ, and ζ torsions cannot be directly determined by NMR, so these torsions are compared to the RNA crystal database. LNA backbone torsions in crystal structure 2X2Q(70) exhibit similar clustering behavior as RNA torsions, even though structures are not identical. With this caveat, the LNA α torsions are expected to cluster into one major G- (270–330°) conformation and two minor G+ (30–90°)/T (150–210°) conformations, which are similar to the distributions predicted from MD simulations (Figure S5). The γ torsion, which is closely coupled with α, was mainly G+ with a T minor form; RNA X-ray structures have a major form at 60° (G+) with two minor forms at 180° (T) and 300° (G-). Thus, this parm99 simulation predicts expected populations of α/γ. The ζ torsion in the parm99 simulation prefers a G- conformation with a minor conformation at G+. In the ribosome crystal structure,77 ζ is mostly found in the G- geometry. However, it also shows a greater variability in the crystal than in the LNA simulation. This suggests the ζ profile may be prejudiced toward a certain orientation by inaccurate force field parameters, by a difference between LNA and RNA, by the relatively small sample size for the simulation, or because ζ torsions in rRNA are affected by tertiary and quaternary interactions.
The parm99 simulation starting with the syn structure is shown in Figure S7A, and comparison to NMR data is presented in Tables 4 and 6. After initial minimization and equilibration, the starting structure had no base–base stacking (Figure S7A) and an NMR score of 18%. The average pseudorotation phase angles generated during the simulations were nearly identical to those in the A-form simulation, i.e., as expected for the constrained sugar, that parameter is not affected by starting structure. The maximum NMR score during the simulation was 42%. At 200 ns, the simulation reached an NMR score of 36%, and oscillated around that for the remainder of the simulation. At 255 ns, the molecule adopted an A-form like arrangement that endured until 850 ns, and reappeared from 1220 to 1780 ns. Here, the A-form is defined as a four-way stack of all nucleotides. This structure never entered a truly A-form arrangement or substantial agreement with NMR, as the χ torsions were never all in a T conformation. At 860 ns, U4 left the helix to form a 3′ terminal unstacked conformation. The terminal unstacked form does not show a substantial change within a 1-D RMSD plot, but is easily seen in the 2-D RMS plot (Figure S7A). After 1780 ns, U4 again briefly became 3′ terminal unstacked, before stacking on top of C1. There is no NMR evidence for this structure as there are no C1–U4 NOEs. None of the non-A-form structures in this simulation were seen in the parm99 simulation starting with A-form. The backbone torsion population distribution, however, was very similar to that with the A-form starting structure (Figure S5, red curves). Thus, the backbone torsion population distribution was not very dependent on starting structure. If a force field were perfect, then an infinitely long simulation would drive any starting structure to something in agreement with experimental data. In the 3000 ns simulation, parm99 did not drive the all-syn structure to a reasonable equilibrium that agreed with NMR data, although it moved in that direction.
The syn starting structure was consistent with only 3% of NMR distances, but the parm99 parameters were able to improve the structure during the minimization and equilibration steps. The NMR scores ranged from 9% to 42% with a mean of 33% and a standard deviation of 5%. If the first 200 ns are discounted, the score increases by <1%. Thus, the mean NMR scores for the parm99 simulations with A-form and syn starting structure are within one standard deviation of each other.
The parm99_LNA simulation for L(CAAU) starting with A-form is shown in Figure Figure7,7, and comparisons to NMR data are presented in Tables 4 and 6. The average pseudorotation phase angles were not affected by the new parameters. The tetramer preferred A-form, but cycled through approximately 22 non-A-form structures throughout the 3000 ns simulation (see structures for 580, 1620, and 1800 ns in Figure Figure7).7). For example, at approximately 580 ns, C1 left the helix to create a 5′ terminal unstacked LNA which persisted for ~10 ns. A transient 3′ terminal unstacked LNA was seen at 990 ns. As with the parm99 results, it is possible that such structures have a small population in solution. Only two structures seen would be considered “random-coil”, defined as an entirely unstacked structure, which is unlikely given stabilizing base stacking interactions.99−101 The totally unstacked structures are transient, persisting for approximately 40 and 150 ns, respectively. Unlike parm99 simulations, there were no intercalations during this simulation. Relative to the parm99 simulation, the parm99_LNA simulation that started with an A-form structure substantially improved agreement with NMR data for L(CAAU) (Tables 4, 6, and Figure Figure6).6). In contrast to parm99 simulations (Figure S6A and Figure S7A) which preferred a G- (high-anti) χ, parm99_LNA favors a T (anti) orientation (Figures (Figures6,6, S6A, and S10). Similar results were seen for RNA χ revisions.35,37,38 This parm99_LNA simulation with LNA-specific χ and RESP parameters showed improvement over both parm99 MD simulations. NMR scores ranged from 39% to 76% with a mean of 66% and a standard deviation of 8%.
The L(CAAU) parm99_LNA simulation starting in an all syn conformation is plotted in Figure Figure8;8; Tables 4 and 6 compare predictions against NMR data. The mean pseudorotation phase angles were nearly identical to the other three simulations. NMR agreement was very poor until approximately 2440 ns. From 2440 to 2450 ns, the molecule showed a substantial increase in agreement with NMR data (Figure (Figure8),8), so parm99_LNA is able to repair an unrealistic structure. This improvement was due to C1 stacking on A2 and on U4 entering the helix. This A-form like structure was largely maintained until the end of the simulation at 3000 ns. The simulation even experienced a brief dip in NMR agreement at 2930 ns due to C1 flipping outside of the helix, but was able to repair itself within 20 ns. From 2443 to 3000 ns, the parm99_LNA simulation scored a mean of 61% of the NMR observables correctly. This speaks well of the new parameters. The backbone torsion populations did not differ a great deal between the two parm99_LNA simulations (Figure S10).
LNA-specific χ torsional and RESP parameters were created that improve agreement between computational predictions and NMR spectra for L(CAAU) as illustrated in Figure Figure9.9. To evaluate computational predictions against NMR data, new methods for estimating χ torsion ranges and for scoring the NMR agreement of the simulation are introduced. Comparisons to NMR spectra indicate that parm99_LNA parameters will improve predictions for LNA nucleotides, particularly if starting structures are not far removed from the true structures or if simulations are run for very long times.
It has recently been found that reducing van der Waals interactions between bases and also changing base–water interactions improves structural predictions for three RNA tetraloops.102 These revisions may also reduce intercalated species of the type seen in MD simulations of r(GACC)37 and r(CCCC),41 as well as in L(CAAU) with parm99. The lack of intercalation in L(CAAU) with parm99_LNA, however, suggests that improved parametrization of ribose may also improve RNA simulations. The results support assumptions that fitting energy functions derived from QM calculations can provide reasonable approximations for force fields required for artificial nucleic acids. Moreover, improved modeling of ribose is likely to substantially improve RNA simulations, but more modifications are also necessary to provide agreement with NMR data.
We are very appreciative for the help and resource time of the University’s Computing Research Center. Grateful appreciation is extended to Dr. Harry Stern and users at Ubuntu Forums who helped with programming questions, and also the AMBER Mailing List, which has been invaluable and patient with questions. This work was supported by National Institutes of Health (NIH) grant R01 GM22939. The content is the sole responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
National Institutes of Health, United States
Tables showing fixed LNA distances used to calculate NMR scaling factor, comparison of QM to X-ray distances, basis sets used during QM minimization, RESP charges, simulated annealing protocol, chemical shift assignments, starting structure and simulation predicted torsion values, starting structure coordinates, and figures of torsion population histograms and backbone torsion performance. This material is available free of charge via the Internet at http://pubs.acs.org/.
The authors declare no competing financial interest.