|Home | About | Journals | Submit | Contact Us | Français|
Conformational changes are important in RNA for binding and catalysis and understanding these changes is important for understanding how RNA functions. Computational techniques using all-atom molecular models can be used to characterize conformational changes in RNA. These techniques are applied to an RNA conformational change involving a single base pair within a nine base pair RNA duplex. The Adenine-Adenine (AA) non-canonical pair in the sequence 5'GGUGAAGGCU3' paired with 3'PCCGAAGCCG5', where P is Purine, undergoes conformational exchange between two conformations on the timescale of tens of microseconds, as demonstrated in a previous NMR solution structure [Chen, G., et al., Biochemistry, 2006. 45: 6889–903]. The more populated, major, conformation was estimated to be 0.5 to 1.3 kcal/mol more stable at 30 °C than the less populated, minor, conformation. Both conformations are trans-Hoogsteen/sugar edge pairs, where the interacting edges on the adenines change with the conformational change. Targeted Molecular Dynamics (TMD) and Nudged Elastic Band (NEB) were used to model the pathway between the major and minor conformations using the AMBER software package. The adenines were predicted to change conformation via intermediates in which they are stacked as opposed to hydrogen-bonded. The predicted pathways can be described by an improper dihedral angle reaction coordinate. Umbrella sampling along the reaction coordinate was performed to model the free energy profile for the conformational change using a total of 1800 ns of sampling. Although the barrier height between the major and minor conformations was reasonable, the free energy difference between the major and minor conformations was the opposite of that expected based on the NMR experiments. Variations in the force field applied did not improve the misrepresentation of the free energies of the major and minor conformations. As an alternative, the Molecular Mechanics Poisson-Boltzmann Surface Area (MMPBSA) approximation was applied to predict free energy differences between the two conformations using a total of 800 ns of sampling. MM-PBSA also incorrectly predicted the major conformation to be higher in free energy than the minor conformation.
There are many known non-coding RNAs involved in diverse biological processes, such as binding and catalysis, where conformational changes are crucial for function.1–3 Important biological roles of RNA depend upon single base pairs and bulged bases.4–7 Diverse conformational changes in structured non-coding RNAs, such as ribosomal RNA and other ribozymes are known to occur.8–9 Conformational changes can alter binding surfaces to change their specificity, facilitate movement such as translocation of the ribosome,10–12 or change activity as in riboswitches.13–15 Thus it is important to develop methods to model conformational changes to improve understanding of RNA function.
Molecular dynamics and umbrella sampling have been used previously to study conformational changes in both DNA16–18 and RNA.19–21 Many of these studies examine base pair opening17, 19 and base flipping,16, 18, 22 where a base becomes unpaired, breaks its stacking interactions, and leaves the helix. This is described as a looped out or extra-helical state. A variety of reaction coordinates are used in these studies, including a pseudo-dihedral angle known as the Center of Mass (COM or CPD) dihedral,18 a projection of the glycosidic bond into a plane which is normal to a local helical axis vector and also includes the C'-C' vector,17 and an improper dihedral angle defined by four atoms.19 These studies demonstrate the use of free energy methods to give free energy for conformational changes involving individual bases or base pairs. RNA hairpin stability was also investigated using the end-to-end distance for two hairpin sequences known to be of different stability.21
The AA non-canonical pair system studied here is an RNA duplex of nine base pairs. The center base pair is a non-canonical pair consisting of adenine 5 (A5) and adenine 15 (A15) (Figure 1) that undergoes a conformational change from a major, i.e. more populated, A15–A5 to a minor, i.e. less populated, A5–A15 trans-Hoogsteen/sugar edge non-canonical base pairing interaction.23 The NMR data obtained by Chen et al. for the duplex contained NOEs that were inconsistent with a single structure.24 The NMR data was divided into two sets of NOE and dihedral angle restraints, based on a prior well-determined structure25 that were separately used to model structures for the major (Protein Data Bank, PDB, accession #: 2DD2) and minor (pdb: 2DD3) conformations.24 Estimates of chemical shift differences of the H2 of adenine-5 between major and minor forms suggest an exchange rate of at least 435 s−1 while line widths suggest higher rates ranging from 20,000 to 65,000 s−1.24 The observed chemical shifts of A5 and A15 H2 protons suggested that the major conformation is 0.5 to 1.3 kcal/mol more stable than the minor conformation at 30 °C.
This study focused on modeling the conformational change pathway to provide insight about the pathway and to test the accuracy of modern molecular mechanics methods. For example, two types of pathways can be imagined for the conformational change. One pathway would have stacked intermediates where the adenines are stacked on each other, and the other would have intermediates where the adenines are hydrogen bonded in a common plane.
In this study, a number of computational methods, Targeted Molecular Dynamics (TMD),26–27 Nudged Elastic Band (NEB),20, 28–30 umbrella sampling free energy calculations,31–34 and MM-PBSA35–43 were used to model the conformational change pathway and equilibrium for a trans Hoogsteen/sugar edge AA non-canonical pair.23 TMD applies a biasing potential to drive a starting structure towards a target structure in a molecular dynamics simulation. This can therefore generate plausible pathways for conformational changes. In contrast, NEB uses a string of states attached by virtual springs between fixed end states to generate a low potential energy pathway as determined by the potential energy landscape. These low potential energy conformational change pathways are close to likely pathways, but are approximate because they neglect entropic effects.28–30 True pathways involve thermal fluctuation and thus undergo random motions that do not follow exact minimum energy pathways; though the conformations visited by molecules undergoing conformational transitions are frequently along minimum potential energy pathways. NEB requires that the first and last images of the string of conformations, i.e. the reactant and product structures, be fixed in conformation. Thus the major and minor conformations are unchanged by the calculation. The low potential energy pathways determined with NEB are independent of the directionality of the pathway and therefore the choice of reactant and product structures is arbitrary. TMD and NEB provide complementary information.
TMD and NEB predicted pathways where the adenines stack in the intermediates. These calculations suggested a reaction coordinate16 described by an improper dihedral angle. Umbrella sampling and the Weighted Histogram Analysis Method (WHAM) were applied to predict relative free energies along the reaction coordinate.31–34 The reaction coordinate observed in both TMD- and NEB-predicted conformational change pathways was used to facilitate umbrella sampling to determine the potential of mean force for the pathway.
The free energy profiles from umbrella sampling calculations with a molecular mechanics force field contradicted the experimental results. The relative free energy change between the major and minor conformations was overestimated and opposite that given the relative populations of these forms in solution. Free energy calculations were repeated using the parmbsc044 force field parameters, variations in salt concentration, or the TIP4PEW45–46 water model. Each of these alternatives also was unable to correctly model the free energy difference between the major and minor conformations.
An alternative method, MM-PBSA/GBSA, for estimating free energies was also applied. This method uses a force field to estimate the molecular mechanics potential energy of the RNA in gas phase, and then applies either the Poisson-Boltzmann35–37, 39–42 or Generalized Born38, 43 surface area (PBSA or GBSA) methods of implicit solvation to estimate the free energy in solution. In addition, normal mode analysis is used to predict the conformational entropy. MM-PBSA/GBSA also predicted a higher free energy for the major conformation than for the minor conformation, and thus also did not qualitatively agree with experimental results. These results suggest that the AMBER99 force field is unable to adequately represent the conformational free energy change of this RNA molecule.
The models of the AA non-canonical pair system were those by Chen et al. for the major (pdb: 2DD2) and minor (pdb: 2DD3) RNA structures.24 The dangling ends (unpaired U and purine), added to stabilize the structures for NMR, were removed for these calculations. Both structures have the same covalent structure and consist of 587 atoms. The lowest potential energy structure as evaluated by AMBER9947–48 force field from the reported set of NMR-guided models was selected as the representative structure for each conformation. This was structure 23 and structure 9 for 2DD2 and 2DD3, respectively.
The AMBER48–50 molecular dynamics package (version 9 or 10) was used for all calculations. Unless stated otherwise, the Cornell et al. ff99 (AMBER99) force field was used.47–48 Trajectories were analyzed using ptraj51 included with AMBER and the LOOS52 software package.
For implicit solvent molecular dynamics, Generalized Born (GB) implicit solvation53–56 was used, and the pairwise interactions were modeled with the HCT method.56 A salt concentration of 0.1 M was simulated using a Debye-Hückel limiting law for interaction screening.55 An effective Generalized Born radius of 25 Å was used with a non-bonded cutoff of 100 Å. Energy minimization was first performed with steepest descent for 1,000 steps, and then for 10,000 steps via the conjugate gradient method. A 2 fs time step was used for dynamics. To start dynamics, the system was heated from 0 K to 300 K over 60 ps followed by 100 ps of equilibrating MD. SHAKE57–58 was applied for bonds to hydrogen. A Langevin thermostat, with a collision frequency of 1 ps−1, was used.59–60
Explicit solvent NPT molecular dynamics utilized the TIP3P water model61–62 with periodic boundary conditions. The TIP4PEW water model45–46 was also investigated, where only the water model and ion parameters were changed and all other parameters were the same. Electrostatics were modeled with the Particle Mesh Ewald (PME) method63–65 with a direct space sum cutoff of 10 Å. Neutralizing Na+ ions were added to counter backbone phosphates. 1 M NaCl was then added where the number of Na+ and Cl– ions added was calculated by dividing the total number of water molecules added to the simulation box by the 55.5 M concentration of pure water.
The structure models from the PDB were subjected to two rounds of minimization, followed by heat up and equilibration. In the first minimization, the RNA was held fixed by a harmonic potential at 500 kcal/(mol×Å2) for 500 steps of steepest descent minimization followed by 500 steps of conjugate gradient minimization and with the system at constant volume. In the second round, the RNA was freed of restraint and 1000 steps of steepest descent followed by 1500 steps of conjugant gradient minimization at constant volume were performed. The subsequent MD was run with a pressure relaxation time of 2 ps. A warm-up simulation was performed with the RNA fixed in space with a harmonic potential of 10 kcal/(mol×Å2) for 20 ps of simulation. Dynamics were run with a 2 fs time step with SHAKE57–58 constraining hydrogens and a Langevin thermostat collision frequency59–60, 66 of 1.0 ps−1. Equilibration simulations were run at constant pressure of 1 atm at a temperature of 300 K for 480 ps. Four separate trials of 100 ns of MD were performed for both the major and minor conformations by varying the random number seed used by the Langevin thermostat.
TMD26–27 calculations were run from an energy-minimized structure to an energy-minimized target structure where the RMSD bias was applied to the entire molecule with the target RMSD set to 0 Å. GB implicit solvation53–56 was used. A non-bonded and Generalized Born radii cutoff of 100 Å was used. A salt concentration of 0.2 M was simulated by Debye-Hückel screening.55–56 The temperature was set to 300 K and Langevin dynamics66 was used as a thermostat with a collision frequency59–60 of 1.0 ps−1. The time step was 2 fs and SHAKE57–58 was used.
NEB20, 28–30 was done using AMBER with a 12 step simulated annealing67 protocol that was concluded by quenched dynamics. The implementation uses revised tangents to prevent kinks in the pathway.30 Trials were varied by changing the random number seed. The simulated annealing protocol applied here was the same as in previous work.68 GB implicit solvation53–56 was used in NEB calculations with a non-bonded and GB radii cutoff of 15 Å and no Debye-Hückel screening.55 Langevin temperature scaling was used with a collision frequency of 1000 ps−1.59–60, 66 SHAKE57–58 was used. 15 trials were performed with 30 images, where the first 15 images were started as the major conformation and the last 15 images were started as the minor conformation. The first image is fixed as the experimental model of the minor conformation and the last image as the major conformation. An additional trial with 60 images using 30 major and 30 minor images was performed to test resolution.
Umbrella sampling31 calculations were performed along the improper dihedral reaction coordinate from −210° to 30° with the equilibrium position of the harmonic potential in windows separated by 10°. Initial atomic structures for each window were selected manually from available structures from NEB-determined pathways. Structures were selected that were close to the equilibrium dihedral values for the windows. The improper dihedral restraint was applied with parabolic sides extending +/−40° from the window center with a maximum 100 kcal/(mol×rad2) restraining potential outside of the well. The sampling of improper dihedral angle was verified to remain within the parabolic well for all umbrella sampling windows. The force constant value allowed for adequate overlap between distributions of reaction coordinate values between neighboring windows. NEB images selected for windows were solvated with TIP3P water61–62 with an isometric box with nearest contact to the RNA at a radius of 10 Å. Neutralizing Na+ was added and then an additional 1 M of NaCl ions was added. The number of Na+ and Cl− ions added was determined by dividing the number of water molecules in the box by the 55.5 M concentration of pure water. Umbrella sampling was repeated with neutralizing Na+ but without additional NaCl ions to the simulation box as well to test if the presence of salt had an effect on the free energy surface.
Energy minimization was then performed as for explicit MD for the end states above followed by heat up with the RNA solute fixed. Equilibration was run the same as for the explicit MD above with the addition of the improper dihedral restraint for the umbrella sampling window. The restraint was ramped up for the first 20 ps of simulation time from 0% to 100% strength and then maintained for the duration of equilibration and sampling. Different trials were run by changing the random number seed for all simulations. Following umbrella sampling, the Weighted Histogram Analysis Method (WHAM)33–34 was used to obtain the free energy profile, i.e. the potential of mean force (PMF) along the reaction coordinate. The AMBER99 forcefield47–49 with TIP3P water61–62 was used unless otherwise noted. This procedure was applied similarly for umbrella sampling done with the TIP4PEW45–46 water model and with the parmbsc044 force field.
When plotting multiple free energy profiles (free energy as a function of the reaction coordinate) on the same plot or when averaging free energy profiles, a reference profile was chosen and all other profiles were adjusted in the free energy axis to minimize the sum of squares difference with the reference. This adjustment was calculated by averaging the difference of each point within the free energy profile from the corresponding point in the reference free energy profile. Then the average of these differences was subtracted from the non-reference profile to obtain the new profile, adjusted to the minimum square difference relative to the reference profile. To average multiple profiles, the reference free energy profile and the profiles with the minimized sum of squares difference to the reference profile were averaged. The RMSD between free energy profiles was calculated using one profile as a reference and the other compared profile adjusted to minimize the sum of squares difference.
MM-PBSA35–43 calculations were performed on the eight total 100 ns trajectories of the major and minor conformation. The AMBER 10 mmpbsa.pl script was used to automate the procedure.69 Snapshots were taken at 1 ps intervals from the simulations. The calculation was run on a single structure with normal mode analysis, 100,000 maximum cycles of minimization of the images from the MD trajectories used in the analysis to ensure convergence and a relaxed convergence criterion for the energy gradient of 0.001 kcal/mol. For GB implicit solvent calculations, Debye-Hückel screening for 1 M NaCl was used. The free energy for configurations sampled by a trajectory was the sum of the gas phase molecular mechanics energy, the estimated hydrophobic and reaction field energy contributions from either the PBSA or GBSA methods, the hydrophobic contribution to the solvation free energy for either the PB calculation or the GB calculation, and the product of conformational entropy estimated by normal mode analysis and the temperature of 300 K.
The models for the major and minor conformations of the AA non-canonical pair system were subjected to molecular dynamics as an initial test of their stability. The dangling unpaired uridine and purine present in the experimental model structures were not included in the calculations performed in this work. Given the estimated rate of exchange between the major and minor conformations from the NMR experiment between 20,000 to 65,000 s−1, it was expected that a spontaneous change in conformation was unlikely to be observed on molecular dynamics simulation timescales. 30 ns of Generalized Born implicit solvent molecular dynamics53–56 was applied to both the major and minor structures. The structures have an RMSD to the starting experimental structure of less than 5.9 Å throughout the trajectory (Figure 2) with means of 2.93 Å and 2.98 Å for the major and minor structure, respectively. In the major structure, A15 moves out of the helix away from A5 at 8.1 ns and makes temporary van der Waals and hydrogen bonded interactions with the edges of paired bases in the stem of the helix between the A5–A15 non-canonical pair and the G2-C17 base pair. In the minor structure, A15 leaves the helix and moves away from A5 at 5 ns and forms a hydrogen bond with the 2' hydroxyl of G8 for the remainder of the simulation. Thus the unrestrained AA non-canonical pair was not stable during implicit solvent molecular dynamics for both the major and minor conformations.
Four 100 ns explicit solvent simulations were performed on each the major and minor conformations. All simulations had consistently lower RMSDs than the implicit solvent simulations (Figure 3, Table 1). The AA non-canonical pair retrained its native (trans-Hoogsteen/sugar edge) pairing throughout each of the simulations. Over the four simulations, the mean RMSD was 2.17 and 2.13 Å, for the major and minor conformations, respectively.
Sugar puckers were also determined for all the nucleotides in the major and minor conformations of the AA non-canonical pair structure from the TOCSY and NOESY spectra.24 The Altona & Sunderlingam70 convention was used to measure the sugar pucker along the MD trajectories of the major and minor conformations. The NMR results indicate the nucleotides have C3' endo sugar puckers with angles ranging from 0 to 36°. Exceptions to this are for A5, which goes from C3' endo in the minor conformation to C2' endo with an angle range of 144 to 180° in the major conformation, and for A15 which goes from C2' endo in the minor conformation to C3' endo in the major conformation. The sugar puckers for A5 and A15 are plotted in Figure S1 for the minor conformation and Figure S2 for the major conformation. The time courses for sugar pucker indicate that A15 in the minor conformation usually shifts from the experimental C2' endo conformation fluctuating near 180° to the C3' endo conformation at 0° before 50 ns of simulation time. Otherwise, the sugar puckers of A5 in the minor conformation and A5 and A15 in the major conformation reasonably agree with the NMR experiment.
The conformational change pathway translates A5 and A15 as shown in Figure 1 from the major configuration on the left to the minor configuration on the right. This can be imagined to occur in three possible ways. One possibility is that the bases move around the edges of each other in an edge-on-edge conformational change pathway, involving hydrogen-bonded intermediates. The other possibility is that the adenine bases of the non-canonical pair may slide one over the other through a stacked intermediate. The sliding pathway can occur in two alternative directions where the faces of the bases in the stacked intermediate are different. These two stacked pathways can be defined by which faces of the adenines were towards each other (Figure 4), thus giving a 5'-facing and 3'-facing stacked pathway.
TMD calculations were performed using a number of different force constants to give an initial estimate of conformational change pathways. The ideal range for the biasing potential force constant was determined to be 0.14 – 0.19 kcal/mol. Force constants below 0.14 kcal/mol resulted in trajectories where the RMSD with the target structure as reference remained at 1.4 Å for the complete 1.5 ns of TMD simulation, indicating the change to the target structure was not achieved (Figure S3). Force constants greater than 0.19 kcal/mol were too high because the conformational change occurred within the first 30 ps of the trajectory.
Figure S4 shows RMSD as a function of time for four force constants. For each simulation, the RMSD started at the RMSD between the starting and target structure (2.2 Å) and then dropped to an RMSD of approximately 1.4 Å within the first ps of TMD. The plot of RMSD to the target structure for the 0.14 kcal/mol force constant TMD calculation (Figure S4b) indicated a conformation in which a 1.4 Å RMSD was maintained for 0.6 ns of simulation before the conformation changed to the target structure. Adenine-5 and adenine-15 were stacked on each other within the helix during this time period, with the 5' faces of the adenines towards each other. Thus, TMD results showed a stacked intermediate structure along the pathway between minor and major configurations. Thirteen TMD trajectories with the biasing force constants from 0.01 to 0.13 kcal/mol each moved to a 5'-facing stacked configuration, but did not continue to approach the target conformation. Seven TMD trajectories with biasing force constants from 0.14 to 0.2 kcal/mol underwent a 5'-facing pathway. A separate set of eleven TMD calculations with the major conformation as the initial state and the minor conformation as the target gave similar results.
To address whether either face of the adenines can stack during the conformational change, TMD calculations were used to generate the alternative 3'-facing pathway within the same force constant range by using a modified starting structure. The modified structure was generated by first applying three distance restraints, found empirically, to an MD simulation of the minor conformation to move adenine-5 to the 3' side of adenine-15. The distance restraints were all flat bottom harmonic potential wells with force constants of 30 kcal/mol. Outside of the well the potential was at this constant. The three restraints were from the C8 of A5 to C2 of A15 with the well centered from 4.425 Å to 5.925 Å with outside bounds of 4.175 Å and 6.175 Å, N9 of G8 to C2 of A15 with the well centered from 9.278 Å to 10.278 Å with outside bounds of 8.278 Å and 11.278 Å, and C8 of A5 to N9 of G17 with the well centered from 4.53 Å to 5.53 Å with outside bounds of 3.53 Å and 6.53 Å. The starting minor experimental structure measured 3.88 Å for C8 of A5 to C2 of A15, 11.70 Å for N9 of G8 to C2 of A15, and 7.30 Å for C8 of A5 to N9 of G17. A structure was selected from the MD trajectory with the restraints applied as a starting configuration for TMD. With the altered starting structure, TMD followed the 3'-side pathway rather than the 5'-side pathway.
Fifteen NEB calculations with different random number seeds were performed. The end points are held fixed for NEB, thus reducing error associated with their instability observed in implicit solvent calculations. Like the TMD simulations, the initial NEB pathways predicted stacked intermediates in the pathways with the adenines sliding by the 5' face (Figure 4). Although each NEB pathway slides by the 5' face, potential energy profiles of different trials revealed variability in the numbers of transition states and intermediates (Figure 5).
For the fifteen different NEB calculations, the images along the pathways showed varying degrees of progress along the conformational change from the major to minor forms. These NEB pathways provide a model for the conformational change. The initial images of the NEB pathways involve breaking of the hydrogen bonding interaction between the non-canonical pair adenines and the stacking interactions with the neighboring sheared GA pairs of the minor conformation as shown in Figure 4. The stacking interactions broken include those between A5 and both A6 and A16, as well as those between A15 and both G14 and G4. The non-canonical pair adenines move to form the stacking interactions of the major configuration, as shown in Figure 6, by moving through configurations where the two adenines are stacked with each other through the 5'-facing pathway described in Figure 4. In the major form, the stacking interactions include those between A5 and both G4 and G14 as well as A15 with both A6 and A16.
It is possible that the bases could also traverse a pathway where the 3' faces are toward each other during sliding (Figure 4). The 3'-facing pathway was explored by NEB. Snapshots from the 3'-stacking TMD trajectories were used as starting images for five NEB calculations. The same NEB protocol was used for these calculations, but with 32 images total. These NEB pathways each converged on 3'-stacked intermediates. The potential energies of the intermediate structures in the 3'-side-biased NEB trials were similar to those of the 5'-side NEB pathways described above. The average, maximum, minimum, and standard deviation of the potential energies for the images are plotted in Figures S5 and S6 in the supplemental information. The NEB pathways appear to depend on the starting configurations, which has been noted previously for systems with periodic reaction coordinates.20 Starting with the experimental major and minor conformations poises the NEB method to produce the 5'-side pathway. Moving the adenines as described in the methods section using restraints biases both TMD and NEB to produce the 3'-side pathway.
Given the pathways observed by TMD and NEB, an improper dihedral reaction coordinate defined by C8, C4, and N1 on adenine-5 and C5 on adenine-15 described the conformational change (Figure 6). Other candidate coordinates were tested, such as the glycosidic sugar angle (χ) and sugar pucker, but were found to be unsuitable reaction coordinates (results not shown). The glycosidic angle of A15 did not differ for the major and minor conformations. The glycosidic angle for A5 did have distinct values for the major and minor conformation; however, the value fluctuated up and down along the pathways and thus did not follow reaction progress. The sugar pucker was found to change suddenly, within 2 or 3 images at different points along the pathway, for both A5 and A15, and thus did not adequately follow reaction progress.
The A5 base plane was defined by choosing three atoms, namely C8, N4, and N1. Defining the final atom as C5 of base A15 resulted in an improper dihedral angle that brought one mismatch based over the face of the other. The 5'- and 3'-facing pathways were distinguishable by whether the angle changed negatively from −8.7° to −173.5° (5'-facing) or positively from −8.7° to +186.5° (3'-facing) (Figure 6). The improper dihedral was plotted together for the images from the 15 NEB pathways (Figure 7a). The improper dihedral angle values were also plotted for the 5 NEB trails undergoing the 3'-facing pathway (Figure 7b). The improper dihedral value went from a definitive reactant value to product value in a smooth continuous manner for each of the trials, showing this behaved well as a reaction coordinate.
The improper dihedral angle was measured in the 100 ns molecular dynamics trajectories of the major and minor conformations (Figure S7). The improper dihedral angle maintained expected values for all simulations, with the minor conformation simulations averaging −4.2° and the major conformation remaining near −170.6°. The average standard deviation was ±7.9° for the minor conformation and ±14.0° for the major conformation. This demonstrates that the improper dihedral angle fluctuates more during unstrained molecular dynamics of the major conformation than for the minor conformation.
Umbrella sampling31 and WHAM32–34 were applied to predict the conformational free energy change along the reaction coordinate. Six independent calculations were run by varying the random number seed used for Langevin dynamics. Twenty-five windows for equilibrium sampling were used for each calculation. Twelve ns of sampling was done for each window in each trial, yielding a total of 1800 ns of sampling. The average of the free energy profiles for each trial is plotted in Figure 8a. The standard deviations were calculated using the six trials, and these ranged from 0.19 to 1.04 kcal/mol along the profile. The individual free energy curves were also plotted separately (Figure 8b). Overall, the location of minima and maxima and shape of free energy barriers remained consistent between the trials, although some differences in barrier heights and the relative free energy between maxima and minima occur. The greatest free energy difference of 2.42 kcal/mol from the reference profile occurred at the bin centered at −209.5° for one of the trials. This profile also had the highest RMSD difference from the reference profile of 1.16 kcal/mol. The greatest free energy difference of 2.81 kcal/mol between two trials also occurred at the bin centered at −209.5°.
Separate umbrella sampling calculations were attempted for the 3'-facing pathway. Windows with the improper dihedral angle value centered at 90° to 140° consistently changed to an alternative conformation that was not supported by the NMR spectra. For these calculations, A5 moves parallel to the direction of the helix, no longer stacking or hydrogen bonding with A15. A5 also lost its stacking interactions with the flanking bases. A5 remained outside the helix in a configuration where the A5 edges were at a right angle to the edges of bases G4 and G14, possibly partly stabilized by transient electrostatic interactions between heteroatoms. Therefore, these calculations were stopped. This also implied that the 5'-facing pathway is a more viable model for the conformational change.
To test convergence, free energy profiles were generated combining the data of all six random number seeds for 2, 4, 8, and 12 ns of sampling at each equilibrium position (Figure 9). The greatest free energy difference of 0.81 kcal/mol between the 2 ns and 12 ns of sampling occurred at the bin centered at −186.5° degrees. The RMSD difference between the free energy profiles for 2 ns and 12 ns of sampling was 0.36 kcal/mol. Between 6 and 12 ns of sampling, this reduced to 0.22 kcal/mol. Free energy differences between the major and minor conformations decreased as sampling time was increased. Although the free energy difference changed, the locations of energy maxima and minima remained the same as sampling time was increased. Therefore, 2 ns of sampling for each window was enough to establish the overall features of the free energy profile. The RMS difference of 0.13 kcal/mol between the free energy profile from 8 ns and 12 ns of sampling suggests that 12 ns sampling was adequately converged.
The relative free energy change between the major and minor conformations was estimated from the umbrella sampling and the minor conformation was predicted to be more stable. From NMR data, the population of major to minor conformations was estimated from 70:30 % to 90:10 %, giving a favorable free energy for the major conformation of −0.51 to −1.31 kcal/mol.24 The calculated free energy surface, using AMBER99, gave a free energy difference of 7.04 kcal/mol in favor of the minor conformation. This indicated the force field misrepresented the stability of the major and minor conformations.
Although evidence suggests the umbrella sampling simulation time was adequate, one additional concern is that the adjacent windows are not sampling adjacent conformations in degrees of freedom aside from the chosen reaction coordinate. To test this, both the stacking interactions of the bases and the backbone dihedral angles were examined for all windows. There was no evidence that the sampling was structurally disconnected in adjacent windows.
The three base pairs of both stem regions were consistently stacked in helical form on both sides of the three base pair GAA internal loop in all umbrella sampling windows. Stacking interactions of the AA non-canonical pair and the flanking GA pairs were observed to change smoothly from one window to the next, where the minor conformation's stacking interactions were gradually lost for intermediate windows at and around −90° and progressively reformed in windows closer to the major conformation's improper dihedral angle value of −170°.
Backbone dihedral angles were measured for all of umbrella sampling windows and all residues. Histograms of representative residue backbone dihedrals are shown in Figure S8. These dihedrals were chosen because they show different distributions in the major and minor conformations, so that intermediate umbrella sampling windows can be checked for the degree with which they transition from one distribution to the other. The δ dihedral angle for A5 was known from experiment to be 160±30° in the major conformation and 122±67.5° in the minor conformation.24 The distribution of A5 δ angles in the umbrella sampling windows for the major and minor conformation agree with this result, showing major A5 δ to have an average value of 141.9° and minor A5 δ to have an average value of 83.8° (Figure S8b). Observation of the distributions of all umbrella sampling windows between the major and minor conformation shows a smooth change as the histograms of neighboring umbrella sampling windows have significant overlap. Three other backbone dihedral angles also exhibit a similar pattern with different values for the major and minor conformation and an overlapping distribution of neighboring windows along the conformational change pathway between the two conformations, including A6 α (Figure S8 c, d), G14 δ (Figures S8 e, f), and G4 ζ (Figure S8 g, h). The average values of these four backbone dihedrals in the major and minor conformation umbrella sampling windows agree with the backbone dihedral angles in the experimental model of the major and minor conformation. These plots support that sampling orthogonal to the improper dihedral angle, such as these backbone dihedrals, were related between umbrella window simulations along the conformational change pathway.
Variation in potentials, salt concentration, water model, and periodic box size were tested to check if the predicted relative free energies of the major and minor conformations would be closer to the expected experimental result. An update to the force field parameters that improved upon the α and γ dihedrals known as the Barcelona parameters or parmbsc044 was tested and resulted in little change in the free energy profile as plotted in Figure 10a. The RMSD between the free energy profile produced with the original AMBER99 force field and the parmbsc0 force field was 0.54 kcal/mol with a maximum difference of 1.23 kcal/mol occurring at the bin centered at −165.5°.
The TIP4PEW water model45–46 and ion parameters were also tested with 1 M NaCl and the resulting free energy profile is plotted in Figure 10b. The free energy profile has barriers in the same location as the TIP3P free energy profile, but the free energy difference between the major and minor conformations increased to 9.45 kcal/mol. The RMSD between the TIP3P and TIP4PEW free energy profiles was 1.04 kcal/mol with a maximum difference of 1.72 kcal/mol occurring at the bin centered at 24.5°. Thus using the TIP4PEW water model instead of TIP3P did not improve the results produced by the AMBER99 forcefield.
Running the free energy calculations at 1 M NaCl was intended to mimic the electrolyte strength at physiological conditions. The NMR experiments24 were performed in a solution containing 80 mM NaCl, 10 mM sodium phosphate, and 0.5 mM disodium EDTA. To investigate the effect of salt, the free energy calculations were repeated with only neutralizing Na+ ions and no additional NaCl with 12 ns of sampling per window (Figure 10c). The free energy profiles with and without 1 M NaCl have barriers at roughly the same locations. The profiles have an RMSD of 0.39 kcal/mol and a maximum difference of 1.17 kcal/mol at the bin centered at 29.5°. Therefore, the 1 M NaCl simulations were considered adequate because running with minimal salt did not change the free energy profile significantly.
The size of the isometric solvent box for umbrella sampling calculations was determined using a distance of nearest contact to the solute. This was set to 10 Å as a compromise between having a box large enough to be physically reasonable, but not so large as to become computationally prohibitive. To ensure that the box size does not affect the free energy result, a larger solvent box using a radius of nearest contact to the RNA of 20 Å was used in a separate trial of simulations. The TIP3P water model and neutralizing Na+ ions plus 1 M NaCl ions were added as for the 10 Å box. Two ns of sampling for each of the 25 windows was performed and the resulting free energy profile generated with WHAM and plotted in Figure S9. The free energy of the major conformation was still estimated to be 8.85 kcal/mol greater than the minor conformation. Thus, increasing the explicit solvent periodic box size did not affect the resulting free energy.
As an alternative to umbrella sampling, the MM-PBSA and MM-GBSA methods were used to predict the free energy changes between the major and minor conformations. Calculations of the free energy for the major and minor conformations were used to predict the free energy change for the conformational transition by subtracting the free energy of the minor conformation from the major conformation (Table 2). Eight total calculations were performed using the four independent 100 ns trajectories that were run for each conformation. This facilitated an estimation of the standard deviations of means. Both PB and GB solvation methods incorrectly predicted that the minor conformation was more stable than the major. MM-PBSA gave a larger positive free energy change, 16.84 ± 19.78 kcal/mol, than MM-GBSA, 7.88 ± 9.66 kcal/mol. MM-PBSA/GBSA gave an incorrect positive free energy change similar to the umbrella sampling result.
The MM-PBSA/GBSA results demonstrate a potential problem with the method. The first 100 ns major structure molecular dynamics simulation MM-PBSA free energy deviates from the other three simulations by about 40 kcal/mol. The major contributor to this energy difference results from a much lower gas phase electrostatic potential energy from the force field. The RMSD and structure of the simulation appears similar to the other simulations, indicating that the configurations visited by this simulation happen to produce a much lower electrostatic energy. Because electrostatic interactions can be strong and highly sensitive to atomic coordinates, it is plausible that a small change in conformation could yield a large change in electrostatic energy. For MM-PBSA/GBSA the free energy difference between the major and minor conformations is calculated by taking the difference between two large numbers, and is thus inherently prone to error.
This study reports the modeling of the conformational change of an AA non-canonical pair. It highlights the capability of modern methods for modeling conformational change and predicting conformational free energy changes. This study also reveals limitations in the AMBER forcefield, a commonly used forcefield for modeling RNA dynamics.
TMD and NEB provided complementary information for the modeling of the conformational change. Each method predicted a pathway that involved stacked intermediates and, interestingly, the RNA structure showed enough flexibility for the AA to change conformation without the adjacent base pairs being broken. This was also previously observed in modeling the conformational change of a GG non-canonical pair.20
The TMD and NEB pathways suggested a dihedral reaction coordinate that could be used to follow the conformational change. This was then used to predict the free energy change along the pathway using umbrella sampling and WHAM. The rate of convergence and the magnitude of uncertainty for umbrella sampling were explored by repeating the umbrella sampling calculations with six independent trials. A total of 1800 ns of sampling was performed. The maxima, minima, and barriers of the free energy profile appear in the first two ns of sampling, but the independent trials have profiles differing with RMSDs as high as 0.56 kcal/mol even out to 12 ns. It can thus be concluded that the true rate of convergence is slower than the 12 ns of sampling performed per window. The need for extensive sampling is consistent with prior observations in peptide systems.71 Over six trials, the error in free energies can be estimated as the standard deviations of the separate trials. These errors are a result of incomplete sampling of conformational space available to the molecule and are estimated to be less than 1.04 kcal/mol. Experimental errors are typically of the same order.
Both TMD and NEB predicted only a pathway in which the 5' sides of the bases face each other in the intermediate structure. To model a pathway with the 3' bases facing each other, the starting structures needed to be altered so that the bases were poised to follow that pathway. The reaction coordinate that describes the conformational change is periodic, where the dihedral angle is defined by atoms C8, C4, and N1 on adenine 5 and C5 on adenine 15. It has been noted previously that sampling pathways with NEB can be incomplete for a periodic system. Here it is shown that TMD can also miss pathways around a periodic coordinate. In this case, the 3'-facing pathway was ruled out subsequently as a viable pathway because the intermediate dihedrals were unstable in that direction during umbrella sampling.
Free energy changes along a conformational change reaction coordinate provide both an estimate of the relative stability of the ends and the barrier(s) between states. For this AA system, the AMBER ff99 forcefield incorrectly predicted the minor conformation as the more stable conformation by 7.04 kcal/mol. All six independent calculations provided the same conclusion. Furthermore, variations in the forcefield, including the Barcelona dihedral parameters, using TIP4PEW water, altering the concentration of additional salt in the box, and using a larger water box did not change this incorrect prediction. Salt concentration and the ions are important for determining RNA structure.72 The simulations appear relatively insensitive to salt concentration, as the calculated RMSD between the free energy profile from 1 M NaCl and from only neutralizing Na+ with TIP3P water was only 0.39 kcal/mol. Umbrella sampling with TIP4PEW and the latest ion parameters for the water model did not improve upon the relative free energies of the major and minor conformations, and in fact increased the free energy of the major conformation to 9.45 kcal/mol. The parmbsc0 modification of the AMBER99 force field gave a free energy profile with an RMSD of only 0.54 kcal/mol from the original AMBER99 profile and thus did not improve upon the result.
To test the possibilities that the umbrella sampling was insufficient or that the forcefield inaccuracies are for structures in the transition state, the MM/PBSA and MM/GBSA methods were applied to predict free energy changes between the endpoints. These methods also incorrectly favored the minor conformation above the major. This supports the hypothesis that the forcefield is also inaccurate for native structures, not just for structures along conformational change pathways.
In this study, the conformational change pathway of an AA non-canonical pair in RNA was modeled using both TMD and NEB. Free energy calculations were then performed along a reaction coordinate. This simple system provides an excellent test case of available computational methods because of the availability of NMR data.24 Direct observation of the actual conformational change pathways and the configurations visited during the pathways, however, are not possible with current technology. Several conclusions can be drawn. Both NEB and TMD predict a conformational change pathway where the adenines slide one over the other within the helix with the 5'-side facing the opposing adenine. The pathways involve breaking of the hydrogen bond between the adenines of the non-canonical pair as well as the stacking interactions between the adenines and the flanking GA pairs as defined by the minor conformation as shown in Figure 11 with subsequent movement through intermediates with the adenines stacked next to each other. The pathways move through this stacked intermediate and reform the hydrogen bond and stacking that defines the major structure as shown in Figure 11.
Predicted free energy changes using both umbrella sampling with 1800 ns of total sampling and MM-PBSA/GBSA incorrectly favor the minor conformation suggesting that the AMBER99 forcefield can be improved. This system, because of its size and the available solution structures, provides a useful benchmark for testing forcefields.
The authors thank Alan Grossfield, Tod Romo, Matthew Seetin, and Harry Stern for many helpful suggestions. This study was funded by a National Institutes of Health grants R01HG004002, to DHM, and R01GM22939 to DHT.
Briefs: Conformational change pathways for an AA non-canonical pair were modeled using molecular mechanics.
Supporting Information Available: Figures of the sugar pucker dihedrals of the A5 and A15 backbone sugars from the MD trajectories used in the MMPBSA calculations; plots of the final RMSD to target for TMD trajectories as a function of force constant; RMSD to target for selected TMD trajectories as a function of time; maximum, minimum, and average potential energies for the NEB calculations of the 5'- and 3'-facing pathways; improper dihedral angle plots for the four 100 ns simulations of the major and minor conformation; histogram plots of selected backbone dihedrals for all umbrella sampling windows and the major and minor conformation; and the resulting free energy profile from sampling with a larger solvent box.