The rational design of enzymes with native activity requires the ability to predict the proper TS stabilization, and this involves the challenge of capturing the overall preorganization effect. Attempts to estimate the catalytic effect by using gas phase models, or even by looking at the electrostatic interaction between different residues and the TS, are unlikely to reproduce the correct catalytic effect since it is impossible to assess the preorganization effect without including the protein and simulating its reorganization during the reaction.
The challenge of evaluating the catalytic power of a given mutant is no different from that addressed in our early 1986 study of computer-aided mutations (6). At this stage it seems to us that the potential of the EVB has been demonstrated in well-defined cases (e.g., (3)), where it was found to reproduce the large effects of mutations that destroy the catalytic effect of evolved enzymes. Thus our main current challenge is to use this approach in improving inefficient enzymes.
It is also important to clarify that we appreciate the advances made in designing artificial enzymes (and clearly those done with catalytic antibodies), in terms of generating active sites that bind and fit the given reacting system. However, we do not believe that the current steps are sufficient for generating effective catalysts and a CAED must involve the ability to predict the catalysis in the given active site.
At this point it is useful to clarify the difference between our EVB approaches and current alternative approaches. The essential requirement of a proper screening approach is the ability to reproduce the observed catalytic effects. Obviously this major requirement cannot be met by gas phase models (including even gas phase models with the substrate and very few residues) that were used for the initial screening in some cases (e.g., (16)). Instructive MM and related simulations (19) can tell us about the optimal donor/acceptor geometries and help in generating proper scaffolds for the reacting systems, but are unlikely to be able to predict the catalytic trends in properly oriented systems. More relevant and instructive would be a comparison of the EVB to current MO-QM/MM studies of enzyme design. Here it is useful to consider several recent studies of the Kemp eliminase and related systems. The semiempirical MO-QM/MM study of Jorgensen and coworkers has provided reliable results for the water reference reactions (40), but the predicted trend in the protein (17) is not encouraging. More specifically, the MO-QM/MM approach performs nicely in exploring the effect of changing the distance between the donor and acceptor (i.e., the Glu to Asp mutation (41)). However, the real challenge is to reproduce the effect of changing the environment (which occurs in directed evolution experiments and is usually responsible for the catalytic activity), and this challenge has not yet been met by the current MO-QM/MM studies of Kemp eliminases (which drastically underestimated the barrier in the enzyme). Interestingly, Ref. (18) argued that it could improve the MO-QM/MM results. However, the reported results (and the agreement with the corresponding experimental results) seem to overlook the energetics of forming the protonated water molecule that is assumed to be the proton donor. Perhaps the current difficulties with the MO-QM/MM are due to the fact that the reported studies kept the main chain fixed. Alternatively, Houk and coworkers (19) have attempted to use an ONIOM truncated protein model but obtained relatively poor agreement (with a spread of about 12 kcal/mol for experimental deviations of 2 kcal/mol (see Fig S3)). This work also presented views that might lead to some confusion. First, there are problems with the argument that calculations with an error of 1.5 kcal/mol may not be useful since the observed mutational effects are around 2 kcal/mol. In fact, predictability within a 2 kcal/mol error range would be a fantastic tool in attempts to generate enzymes with large catalytic effects. Second, and potentially more problematic, is the idea that predictive power requires a very high level QM method. This suggestion is risky (in terms of its possible impact on the experimental community) and unjustified. That is, predictive approaches like the EVB are not concerned with predicting the absolute QM energy of the substrate, since what counts is the change in this energy upon moving from water to the protein active site. Thus the effort in developing predictive methods must be spent on achieving good convergence and a proper long-range treatment and not on getting the best basis set.
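To put the error ranges discussed above in rate terms, the following minimal sketch converts a change in activation free energy into a fold change in the rate constant. It assumes simple transition state theory with an unchanged prefactor and a temperature of 298.15 K; the function name `rate_factor` is hypothetical and is not part of any EVB package.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def rate_factor(ddg_kcal, temperature=298.15):
    """Fold change in the rate constant for a barrier change of ddg_kcal.

    Assumes transition state theory with an unchanged prefactor, so that
    k1/k2 = exp(ddG / RT). The default temperature is an assumption.
    """
    return math.exp(ddg_kcal / (R_KCAL * temperature))


if __name__ == "__main__":
    # Error ranges quoted in the text: 1.5 and 2 kcal/mol.
    for err in (1.5, 2.0):
        print(f"{err:.1f} kcal/mol corresponds to a rate factor of about "
              f"{rate_factor(err):.0f}")
```

Under these assumptions, an uncertainty of 2 kcal/mol corresponds to somewhat more than one order of magnitude in rate, which is modest when the target is a large catalytic effect.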
Overall, we have no doubt that the MO-QM/MM approach with proper sampling (e.g., with our approach of using a reference potential (42)) will be able to provide a proper screening tool (in particular when used with an EVB as a reference potential). However, at present the EVB seems to provide the most effective way of obtaining reliable screening results, perhaps because of its ability to allow for sufficiently extensive sampling.
Some workers may see a great potential for using desolvation effects in enzymes and causing catalysis by ground state destabilization (GSD) (see discussion in (4)), but native enzymes do not catalyze reactions by exploiting desolvation effects (4). Furthermore, even Kemp eliminases have not been able to exploit this effect significantly. One of the problems is that even if we could create a strong RS desolvation for the base, it would lead to a very large pKa, and this will not help at physiological pH. That is, if we try to destabilize the RS by destabilizing the ionized base (e.g., the ionized Asp), the base will be protonated by a bulk proton. Here the best option is to use polar TS stabilization, but unfortunately this is very hard to obtain for the Kemp TS (see (13)). Perhaps a part of the reason why enzymes do not use desolvation effects is the available pH range and the available amino acids (see discussion of ODCase in (43)). Note that the GSD issue has been established with reliable linear response approximation (LRA) calculations of the ground state solvation free energy (13) and by detailed comparison to the related case in dehalogenase, where a similar situation is handled by a naturally evolved enzyme in a completely different way, with ground state stabilization and with very large transition state stabilization (see ref. (13)). Our conclusion that the Kemp eliminases use RSD is not related to the exact structure of the TS, namely concerted or stepwise, but to the charge distribution of the TS (which has been treated here in a rigorous way by our special procedure).
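The pKa argument above can be illustrated with a simple thermodynamic-cycle sketch. It assumes that destabilizing the ionized base by a given free energy raises its pKa by that amount divided by 2.303RT, and that the base equilibrates with bulk protons at the given pH. The reference pKa of 4.0 for Asp, the 7 kcal/mol destabilization, pH 7 and 298.15 K are illustrative assumptions, and all function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def pka_shift(destab_kcal, temperature=298.15):
    """pKa increase caused by destabilizing the ionized base by destab_kcal."""
    return destab_kcal / (math.log(10) * R_KCAL * temperature)


def ionized_fraction(pka, ph):
    """Fraction of the base in its ionized (deprotonated) form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def ionization_cost(pka, ph, temperature=298.15):
    """Free energy (kcal/mol) needed to ionize the base; zero if pKa <= pH."""
    return max(0.0, math.log(10) * R_KCAL * temperature * (pka - ph))


def net_destabilization(destab_kcal, pka_ref, ph, temperature=298.15):
    """Destabilization left after paying the extra cost of deprotonation."""
    pka_site = pka_ref + pka_shift(destab_kcal, temperature)
    extra_cost = (ionization_cost(pka_site, ph, temperature)
                  - ionization_cost(pka_ref, ph, temperature))
    return destab_kcal - extra_cost


if __name__ == "__main__":
    pka_ref, ph, destab = 4.0, 7.0, 7.0  # illustrative values only
    pka_site = pka_ref + pka_shift(destab)
    print(f"shifted pKa: {pka_site:.2f}")
    print(f"ionized fraction at pH {ph}: {ionized_fraction(pka_site, ph):.4f}")
    print(f"net destabilization: {net_destabilization(destab, pka_ref, ph):.2f} kcal/mol")
```

In this simple model the usable destabilization cannot exceed 2.303RT(pH - pKa_ref) once the shifted pKa rises above the pH, since any further destabilization of the ionized base is offset by the cost of removing the proton.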
Since the present work invested major computational power in validating the EVB results, one may ask why we have not provided some predictions. The answer involves two points. First, we do provide several clear predictions with regard to the effects of mutating distant residues. This reflects our finding (see (13)) that it is extremely hard to get large catalysis in Kemp eliminases by simple mutations of the active site residues (this is why we now turn our attention to other systems, where the changes in the substrate charges upon going to the TS are larger). Second, we already demonstrated our ability to make reasonable predictions for the Asn155 to Ala mutation in subtilisin in ref (7). Thus the present paper is more about what it takes to get a reliable prediction than about actual predictions. More specifically, in our previous works (e.g., (3)) we examined the trade-off between speed and reliability in different approaches for enzyme design. At present we feel that our fast strategies (like using group contributions and reorganization energies) are not predictive enough, and that it is important to use the extensive averaging considered here. However, with the current advances in computer power, this is not such bad news. That is, as seen from , we can screen 14 mutants (twenty runs for each mutant) in 24 hours on 200 nodes, and 70 mutants can be screened using 1000 nodes. We also used a more parallel approach, where the mapping is distributed over several processors, but this did not lead to a more efficient screening.
Estimating the efficiency of the EVB screening calculations(1).
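The throughput figures quoted above (14 mutants, twenty runs each, in 24 hours on 200 nodes) can be turned into a cost per run and extrapolated to other node counts. The sketch below assumes that the runs are independent and that throughput scales linearly with the number of nodes, which is consistent with the quoted figure of 70 mutants on 1000 nodes; the function names are hypothetical.

```python
from fractions import Fraction
import math


def node_hours_per_run(mutants, runs_per_mutant, nodes, hours):
    """Average node-hours consumed by a single EVB run (exact fraction)."""
    return Fraction(nodes * hours, mutants * runs_per_mutant)


def mutants_screened(nodes, hours, runs_per_mutant, cost_per_run):
    """Whole number of mutants that fit in the budget, assuming linear scaling."""
    return math.floor(Fraction(nodes * hours) / (cost_per_run * runs_per_mutant))


if __name__ == "__main__":
    # Figures quoted in the text: 14 mutants x 20 runs, 24 h, 200 nodes.
    cost = node_hours_per_run(mutants=14, runs_per_mutant=20, nodes=200, hours=24)
    print(f"node-hours per run: {float(cost):.1f}")
    # Extrapolation to 1000 nodes under the linear-scaling assumption.
    print("mutants on 1000 nodes in 24 h:",
          mutants_screened(nodes=1000, hours=24, runs_per_mutant=20, cost_per_run=cost))
```

Exact fractions are used so that the extrapolated count is not rounded down by floating-point error.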
At present there are still many who believe in dynamical and other esoteric effects that are presumed to contribute to catalysis (for review see (21)). In many cases it is clearly suggested that improving such effects will be crucial for optimal enzyme design (e.g., (44)). However, it seems to us that by far the main factor that actually contributes to catalysis is the preorganization effect, and thus we feel that there is no rational way of improving dynamics and related effects, as these factors do not contribute to catalysis (21). Furthermore, we would like to clarify that TS stabilization by delocalization effects (16) is unlikely to provide a significant catalytic factor since the same effect exists in the reference solution reaction. Thus, the possible effect of π-stacking (which was considered in (16)) is not expected to lead to a significant rate enhancement above the simple effect of having nearby induced dipoles, which is much less effective than having a preorganized polar environment. In fact, as realized by Hilvert and coworkers (15), the corresponding dispersion or, more precisely, inductive effect is small.
Our previous work (13) attempted to refine the electrostatic environment near O1. This effort can be considered by some as an extension of the idea of placing an acid near O1 (e.g., (2)). However, the idea that such a base is needed is reminiscent of physical organic chemistry concepts that capture some of the electrostatic effect, but end up looking at factors that do not play a major role in enzyme catalysis. In our view, the issue is not a charge transfer or a covalent bond to the substrate as might be concluded from gas phase calculations, since we are not dealing here with a bifunctional reaction with two steps (unless we have a new chemistry), but with a pure electrostatic stabilization. It is true that the attempts to focus on the base lead in some cases to what we consider the correct direction, like placing Lys or His near O1 (2), but this has little to do with the pKa of Lys, as one would assume from the traditional picture. It actually reflects the electrostatic free energy of interactions.
Obviously our strategy can be and will be improved in the future, but the main point is the ability to consider enzyme design by using energy-based concepts in a rational way. In this respect it is useful to point out a special nontrivial advantage of our CAED approach. That is, in the case of attempts to improve a specific enzyme, there is an acquired advantage in the accumulated experience of modeling different mutants (as was the case here in the study of the evolved mutants). Even significant errors in predicting the first set of mutants can lead to improvement of the model (e.g., in selecting an enzyme-specific dielectric to compensate for convergence difficulties in the treatment of ionized residues) and to a better understanding of the specific enzymes, and thus to a better predictive ability in further design rounds. Such an advantage is not expected from computational design approaches that are not based on capturing the physics of the catalytic process.
Finally, we would like to reiterate that we have demonstrated the ability to reproduce quantitatively the absolute catalytic effects and mutational effects in naturally evolved enzymes (4) and in designer enzymes (this work). This clearly indicates that the catalytic power of enzymes is not due to elusive effects (e.g., conformational dynamics), but to what is by now well understood, namely electrostatic preorganization. Thus our difficulties in improving designer enzymes are not due to overlooking misunderstood factors, but to the difficulties in optimizing well understood factors. In other words, a method that reproduces the catalytic rate in known systems should be able to do so for any unknown sequence, and the challenge is to find the unknown optimal sequence. At any rate, it seems to us that the present study provided a useful analysis of the reasons for the less than perfect performance of current designer enzymes.