|Home | About | Journals | Submit | Contact Us | Français|
Reliable studies of enzymatic reactions by combined quantum mechanics /molecular mechanics (QM/MM) approaches, with an ab initio description of the quantum region, presents a major challenge to computational chemists. The main problem is the need for a very large computer time for the evaluation of the QM energy, which in turn makes it extremely challenging to perform proper configurational sampling. A seemingly reasonable alternative is to perform energy minimization studies of the type used in gas phase ab initio studies. However, it is hard to see why such an approach should give reliable results in protein active sites. In order to examine the problems with energy minimization QM/MM approaches we chose the hypothetical reaction of a metaphosphate ion with water in the Ras•GAP complex. This hypothetical reaction served as a simple benchmark reaction. The possible problems with the QM/MM minimization were explored by generating several protein configurations from long MD simulations and using energy minimization and scanning of the reaction coordinates to evaluate the corresponding potential energy surfaces of the reaction for each of these different protein configurations. Comparing these potential energy surfaces, we found major variations of the minima of the different total potential energy surfaces. Furthermore, the reaction energies and activation energies also varied significantly even for similar protein configurations. The specific coordination of a magnesium ion, present in the active center of the protein complex, turned out to influence the energetics of the reaction in a major way and a direct coordination to the reactant leads to an increase of the activation energy by 17 kcal/mol. This study demonstrates that energy minimizations starting from a single protein structure could lead to major errors in calculations of activation free energies and binding free energies. Thus we believe that extensive samplings of the configurational space of the protein are essential for meaningful determination of the energetics of enzymatic reactions. The possible relevance of our conclusion with regard to a recent study of the RasGAP reaction is discussed.
QM/MM approaches have provided a general scheme for studies of chemical processes in proteins1–11. A significant progress has been made with calibrated semiempirical QM/MM approaches2,7,10,11. However, one would like to move to an ab initio representation with a QM/MM treatment since such QM representations have been shown to provide “chemical accuracy” in studies of gas phase reactions of small molecules12. In principle, when one uses a reliable large ab initio QM region we can expect to obtain a reliable description of the potential surface of the reaction region. Unfortunately, this does not mean that the actual free energy barriers are estimated correctly, and in fact we may have many serious problems. First, it is essential to provide a proper long-range treatment and effective boundary conditions for the protein+solvent environments13. Second, it is important to use a polarizable force field for the MM region and finally it is important to perform sufficient sampling to obtain the actual free energy2,14.
The present work will not focus on the sampling problem, which was discussed recently2,15,16 in terms of possible simulation strategies. Here we focus, however, on the issue of using energy minimization for a few protein configurations17,45. Obviously this provides an advance over gas phase or single point calculations. However, this approximated strategy was not subjected to a systematic scrutiny.
When one deals with enzyme active sites the challenge of finding the transition state region and evaluating the activation barrier is quite different than in gas phase energy search, since the protein landscape is very complex and different configurations cannot be found by standard energy minimization approaches. Furthermore, results obtained from different minima can be very different, as already observed17.
In order to illustrate the above problems we take a relatively simple case, a segment of a possible path in the reaction of Ras, where an H2O attacks a metaphospate which can in principal be formed by a dissociative mechanism. This type of reaction has been considered in a recent QM/MM energy minimization study where it was concluded that the barrier for the dissociative bond breaking is very low and the rate determining step is a proton transfer to the metaphosphate through a concerted path that involves both Gln61 and the catalytic water. It has also been concluded that a direct proton transfer from the catalytic water to the metaphosphate involves a very high barrier45. The present paper does not attempt to explore the possible involvement of such a reaction path in the reaction of the Ras•GAP system, since this should involve a very careful evaluation of the free energy surface for the associative and dissociative mechanisms, a careful calibration and a systematic comparison to relevant experimental results. Instead we merely chose the hypothetical water-metaphosphate system as a simple benchmark for QM/MM calculations, and we will only make some general comments about the new proposal of ref45.
Section II describes our benchmark and outlines the computational methods. Section III describes the results of the calculations and demonstrates their sensitivity to the protein conformational states and the position of the Mg2+ ion. Finally, we discuss in section IV the implications of our findings and emphasize the importance of a proper average over the protein configurations.
This work considers some aspects of the rather complex mechanism of the Ras•GAP system, whose active site and a model of the bound GTP is depicted in Fig 1.
However, instead of focusing on the actual reaction and its biological implications we focus on a relatively simple model reaction, namely the attack of H2O on a metaphosphate formed in the active site of the Ras•GAP system, following an hypothetical dissociative cleavage of the GTP substrate (Fig 1). The metaphosphate water system was used as a benchmark in exploring the validity of QM/MM transition state searches in condensed phases in general and in proteins in particular.
The potential energy surface (PES) of our reaction was explored by considering the two reaction coordinates described in Fig 2, the distance of the water oxygen to the phosphorous, R, and the distance of the transferred proton to proton acceptor oxygen of the metaphosphate, r.
We also examined the possibility that a possible third reaction coordinate (i.e. the distance of the transferred proton to the water oxygen) can be neglected and its inclusion does not change our results quantitatively.
In order to evaluate the potential surface for the reaction in solution we started by scanning the surface in the gas phase as a function of R and r, where at each point we constrained the system to the specific R and r and minimized the energy with regards to all other coordinates. Next we considered the effect of the solvent by using the conductor reaction field approach COSMO20 and performed a single point COSMO calculation at each of the scanning points. The resulting PES served as an approximation for the energetics of the reaction in solution. In doing so we assumed implicitly that minimization with respect to the coordinates orthogonal to R and r will give similar results in the gas phase and in solution. This assumption was verified in some specific cases21. It seems to us that the above scanning procedure is at present more effective than alternative options of performing a transition state search on the solution surface using analytical derivatives. That is, in our experience there are problems with the analytical derivatives of most current ab initio solvation models, at least with respect to the dependence of the cavity surface and the reaction field on the solute coordinate. Here the use of a systematic scanning procedure is the simplest way of obtaining a reliable mapping of the transition state region.
In addition to the COSMO calculations we also considered the previously reported results obtained by the Langevin dipole model22. The gas phase optimizations were performed using the Hartree Fock method and a 6–31 G(d) basis set, whereas the single point calculations that considered the solvent effect used the DFT method with the hybrid functional B3LYP and an extensive 6–311++G(d,p) basis set. The results of this minimization/scanning approach will be described in the next section.
The next step of our study involved a QM/MM evaluation of the potential surface of our system in the protein active site. Before describing the actual calculations we will provide below a brief description of our QM/MM approach emphasizing the specific technical points of the current implementation. The starting point in QM/MM models is the separation of the system into QM and MM regions1. In systems where the quantum region is composed of a molecule or molecules that are not bound to the classical region, as is often the case in solution-phase reactions, separation of the two regions is rather trivial. The separation into quantum and classical regions is not as straightforward in enzyme reactions, because the quantum region is frequently bonded to the classical region, and it is not clear how to define the boundary conditions for the electronic structure calculations of the quantum region, nor how to incorporate the electrostatic and van der Waals effects of the classical region into the quantum region energy expression. An effective way of connecting the two regions can be provided using hybrid orbital1 and related techniques14,23. Unfortunately, these approaches are somewhat difficult to implement, and because the current QM(ai)/MM method aims to be an efficient method of linking standard programs (in this case GAUSSIAN24 and MOLARIS25), we chose the simpler method of using link atoms. A link atom (LA)26 is an atom inserted along the bond between the quantum and classical regions. Such an approach was introduced1, in addition to the introduction of the use of the hybrid orbital approach.. Using the nomenclature of ref27, the quantum mechanical atom to which the link atom is bonded is referred to as the link atom bond partner (LABP), and the classical atom replaced by the link atom is referred to as the link atom host (LAH). Warshel and Levitt1 introduced the link atom treatment in their AMI/MM model of the catalytic reaction of lysozyme, while using the more reliable hybrid orbital treatment1 in their QCFF/all study of lysozyme. Although the name LA was not introduced the method was incorporated into the ERFN program29 and thus constitutes the first LA QM/MM treatment. The LA treatment was also introduced in QM/MM studies of π-electron systems.46 Warshel and Levitt1 and Ostlund30 developed methods that adjust the ionization potential of the LAs to reproduce properties of the given molecule in the presence of the actual LAHs. In our and previous studies30,27 , the LA is a hydrogen atom. Figure 3 illustrates the process of defining the quantum region and inserting LAs.
This figure shows the reacting system separated into the quantum region and a small part of the MM region, which is covalently linked to the QM region. The structure below the quantum fragment shows this fragment capped with LAs to produce the quantum region. The LAs are inserted along the LAH-LABP bond at a distance determined by gas-phase ab initio energy minimization. Despite the intense current interest in proper treatment of link atoms and the availability of more effective hybrid orbital and related localized bond approaches1,23, we feel that this problem is not so crucial in studies of enzymatic reactions. That is, errors introduced by using a small fragment are generally similar in solution phase and protein simulations. Thus, such errors are largely canceled when one is concerned with enzyme catalysis, which reflects the difference between the activation energy in the enzyme and in the reference solution reaction. In the present case we treated the metaphosphate and the water quantum mechanically using the Hartree-Fock approach with a 6-31G(d) basis set, as was done for the reaction in solution. We also estimated a correction for the free reaction energies and free activation energies by calculating the energies at the reactant, product and transition states using the B3LYP functionals with a 6-311++G(d,p) basis set.
Before applying the above QM/MM approach we generated starting protein configurations. This was done by constructing an EVB potential for the QM system and used this potential to propagate trajectories in the reactant state. The trajectories started from a model structure of the Ras•GAP complex derived from the crystal structure of Scheffzek et al.19. Because we have been only interested here in the hypothetical reaction of metaphosphate (that is already dissociated from the GDP moiety) with a water molecule, we separated the metaphosphate from GDP by applying a soft harmonic potential with a minimum at 3Å distance between the two compounds. This model system was first equilibrated by an MD run of 500 ps and then the simulation continued for another 500 ps, generating five representative protein configurations. These configurations were taken as the starting points for the QM/MM studies described in the next section.
Although it is convenient to assume that ab initio QM/MM approaches should give reliable results, this is far from being certain. Thus before performing QM/MM calculations of chemical processes in enzymes it is important to validate the method used by calculating the energetics of the given reaction in solution. That is, since QM/MM calculations in enzymes do not yet give chemical accuracy it is essential to calibrate (or validate) the specific method used by examining the agreement between the calculated and observed energetics in solution reactions (see discussion in II). Such calculations are also needed in order to assess the catalytic effect of the enzyme. At any rate, we established the validity of the quantum mechanical model by evaluating the free energy surface for the solution reaction using the COSMO solvent and the mapping procedure described in the method section. The corresponding results are summarized in Table 1 (together with the previously reported results with LD solvent model) and Fig 4.
Fig. 4 depicts the equipotential lines of the PES, the transition state and the corresponding reaction path. The figure shows that the PES in the reactant region is shallow and that the water molecule can reach a very close distance (R=2.1 Å) to the metaphosphate before the proton is transferred. The transition state is found at R=1.9 Å and r=1.3 Å. The reaction free energy, ΔG0 , is −23.2 kcal/mol and the corresponding activation free energy is ΔG#=13.3 kcal/mol. These energy differences are in a very good agreement with the ΔG0 =−24.0 and ΔG#=11.0 kcal/mol values obtained by Florian and Warshel21 from ab initio/LD calculations and from analysis of relevant experimental data.
We also examined the influence of electron correlation and the basis set on the PES. That is, in addition to our chosen B3LYP/6-311++G(d,p) calculations we also evaluated the activation barrier and reaction energy with the HF/LanL2DZ and B3LYP/LanL2DZ methods. The results of this study are also summarized in Table 1.
From this comparison we conclude that the neglect of Coulomb electron correlation in the Hartree Fock approach leads to rather inaccurate estimates for both the reaction and the activation free energy, as expected. Both energy differences deviate around 50% from our original results. The deviations of the smaller basis set, LanL2DZ, compared to the triple zeta basis set are smaller but using this basis set somewhat underestimates the reaction free energy.
After establishing the reliability of the potential surfaces for the solution reaction we explored the reaction in the active site of the Ras•GAP complex. In this case we performed the transition state search with several initial protein configurations. The potential surface for each protein configuration was evaluated by the same mapping procedure described in the discussion of the solvent surface and the method section, but this time using the QM/MM description of the protein. It is important to notice that even in the case of a single protein configuration there is a risk of obtaining an incorrect transition state. This issue is considered in Fig. 5, where we depict an example of the PES for a specific protein configuration. As seen from the figure, starting from different initial reactant coordinates and using an incomplete configurational search can lead us to different transition states. This problem, that can also occur in gas phase studies, is rather well known31,32 and is not the topic of this work.
The actual serious problem addressed by this paper starts to become apparent when we consider the effect of the protein configurations. This issue is demonstrated in Fig. 6 and Table 2, where we examine the dependence of the lowest energy profile on the protein configuration used.
As seen from the figure the absolute energies of the ground state can easily vary by 30 kcal/mol between different protein configurations. This finding makes it rather clear that calculations of binding free energies, which reflect the energy of the reactant state, cannot be performed by QM/MM approaches without a very extensive averaging. The situation seems to be less catastrophic when one considers the activation barriers relative to the given ground state energy. Here the variation is about 6 kcal/mol, which is large but can be used in some qualitative considerations. Unfortunately this is only the tip of the iceberg. That is, while the error in ΔU# obtained in closely related configurations may be tolerable, some configurational changes might lead to much larger effects.
A detailed analysis of the effect of the protein fluctuations on the QM/MM surfaces is usually rather complicated. However, one can focus on some key elements. Here we explored the effect of changes in the position of the Mg2+ ion, considering two different coordinations that were both found to be stable (assuming that the Mg2+ ion is fixed at the single position found in the transition state analog is apparently unjustified). The results of this study are summarized in Fig 7 where we compared the reaction profile associated with the two initial configurations depicted on the right hand side of the figure. As seen there, increasing the distance between the Mg2+ ion and the metaphosphate oxygen reduces the barrier by 17 kca/mol.
The results presented in Fig. 7 clarify that the situation in QM/MM calculations in proteins are quite different than those obtained in gas phase calculations. That is, while in gas phase calculations it is reasonable to assume that a convergent minimization approach will lead automatically to a displacement of the magnesium ion, it is unlikely that this will happen in an energy minimization in the protein active site. These results underscore the importance of a proper conformational search in QM/MM evaluations of activation energies. Here one of the best options is to use EVB as a reference potential for QM/MM free energy calculations33.
The determination of the relative role of the two magnesium ion configurations of Fig. 7 is far from being simple. In principle one can start by evaluating the electrostatic free energy of the Mg2+ ion in the two configurations by a FEP approach. Then by assuming that the energy of moving the Mg2+ ion between the two sites is small, it would be possible to estimate the relative energy of the corresponding configurations. This treatment can be improved by the linear response approximation (LRA) treatment, which is quite effective in capturing the protein reorganization energy34. However, such a study is out of the scope of the present work, since it requires one to resolve the issue of the optimal representation of the Mg2+ ion. That is, an MM representation of the magnesium ion considers charge transfer to the substrate by the empirical van der Waals parameters. On the other hand, the seemingly superior treatment that includes the Mg2+ ion in the QM region may be problematic if the ligands of this ion are treated classically (in this case one neglects the charge transfer from the ligands to the metal and the inclusion of the ligands in the QM region is likely to be too expensive35,36. At present we are focusing on the assumption that both configurations are accessible and thus can be used as an instructive example of the risk in QM/MM energy minimization approaches. In view of the problems with simple QM/MM minimizations or scanning procedures, it seems obvious that one should attempt to average such calculations over the relevant protein configurations. The most rigorous way of obtaining such averages is by free energy calculations but such calculations can be too expensive when one uses a high level ab initio treatment for the QM region and regular free energy perturbation treatments. This problem can be drastically reduced by using the EVB or other simple potentials as a reference for the ab initio QM/MM free energy calculations37,38.
The evaluation of the energetics of chemical processes in proteins by classical force fields requires usually very extensive averaging over the configurational space of the protein. Thus it is quite likely that the same requirement will hold for QM/MM calculations. This means that simple minimization approaches of the type used in gas phase QM calculations might not be effective in evaluations of activation energies of chemical reactions in proteins. The present work examines this issue and demonstrates that QM/MM energy minimization approaches can lead to significant errors in evaluation of activation barriers of enzymatic reactions. The difficulties in using minimization approaches can be particularly serious when the enzyme substrate complex involves a rugged landscape with many local minima. An instructive example of the coupling between the protein coordinate and the solute activation barrier is provided here where the position of the Mg2+ ion determines the transition state energy.
While the afore mentioned problems with regards to the evaluation of activation barriers might not occur in some cases, the related problems associated with calculations of binding free energies are much more serious. That is, as demonstrated in Fig 6, calculations of absolute ground state energies and thus also binding energies by QM/MM minimization are extremely problematic39. Here one expects to obtain very different results in different local minima. Obviously the only way to obtain reasonable binding energies is to perform very extensive averaging of the type performed in QM/MM redox calculations40.
The present test case involves the attack of a water molecule on a hypothetical metaphosphate intermediate in the Ras•GAP complex. This intermediate has been proposed recently in the QM/MM energy minimization study of Grigorenko et al.45. As stated in the introduction we did not examine yet the proposal of a very low barrier for the dissociative formation of the metaphosphate, but we find it to be rather unlikely. Furthermore, the idea that a concerted H2O-Gln61 attack on the metaphosphate as the rate limiting step is problematic. For example, this concerted reaction cannot occur in the elongation factor Tu (EF-Tu) which has histidine instead of Gln61, but this system hydrolyses GTP quite effectively.47 Thus we believe that despite the intriguing insight provided by the studies of Grigorenko et al. it is crucial to reexamine the corresponding conclusions by a well calibrated approach (with a careful attention to a more complete sampling). It would also be crucial to examine the corresponding result in related systems such as Ras alone as was done in EVB studies.43
Finally, perhaps the simplest message that emerges from this study is the idea that QM/MM minimization and scanning approaches should involve averaging (and validation) over several protein configurations generated by long MD simulations. This idea has been found to be very useful for theoretical IR spectroscopy of the ground state of GTP in Ras41,42 and in statistical mechanically more rigorous treatments such as FEP calculations of electrostatic energies13 and in EVB calculations of enzymatic reactions43,44.
This work was supported by NSF grant MCB-0342276 and NCI Grant 1 U19 CA105010-01. The computational work was supported by University of Southern California High Performance Computing and Communication Center (HPCC).