Home | About | Journals | Submit | Contact Us | Français |

**|**HHS Author Manuscripts**|**PMC2822134

Formats

Article sections

Authors

Related links

Proteins. Author manuscript; available in PMC 2011 April 1.

Published in final edited form as:

PMCID: PMC2822134

NIHMSID: NIHMS159971

Benjamin M. Messer,^{1} Maite Roca,^{1,}^{2} Zhen T. Chu,^{1} Spyridon Vicatos,^{1} Alexandra Vardi Kilshtain,^{1} and Arieh Warshel^{1,}^{*}

Evaluating the free energy landscape of proteins and the corresponding functional aspects presents a major challenge for computer simulation approaches. This challenge is due to the complexity of the landscape and the enormous computer time needed for converging simulations. The use of simplified coarse grained (CG) folding models offers an effective way of sampling the landscape but such a treatment, however, may not give the correct description of the effect of the actual protein residues. A general way around this problem that has been put forward in our early work (Fan et al, Theor Chem Acc (1999) 103:77-80) uses the CG model as a reference potential for free energy calculations of different properties of the explicit model. This method is refined and extended here, focusing on improving the electrostatic treatment and on demonstrating key applications. This application includes: evaluation of changes of folding energy upon mutations, calculations of transition states binding free energies (which are crucial for rational enzyme design), evaluation of catalytic landscape and simulation of the time dependent responses to pH changes. Furthermore, the general potential of our approach in overcoming major challenges in studies of structure function correlation in proteins is discussed.

The elucidation of the landscape of proteins is an important element in attempts to explore the energetics and kinetics of the folding process. Furthermore, the ability to explore and sample large regions in the accessible conformational space can help in improving the description of functional properties and in exploring possible relationships between landscape and functions. Unfortunately, detailed sampling of protein landscapes requires enormous computational resources that are not generally available. Thus it is important to develop multiscale approaches that will allow one to effectively generate the free energy surface for folding and related processes. Simplified protein models have been in use since our early work in 1975^{1} and are being used extensively in folding studies^{2}^{-}^{8}. Furthermore, recent efforts have extended our early 1975 idea into membranes (see MARTINI force field ref^{9}). However such coarse grained (CG) models are, by their nature, approximated and it is important to be able to move from the coarse-grained description to more realistic explicit models. A general way to resolve this problem has been introduced by us some time ago^{10}, where we proposed the use of a simplified model as a reference potential for explicit calculations of folding free energies. This idea, which can be now classified under the umbrella name of multiscale modeling, became even more important in recent years in view of the interest in the relationship between landscape and function^{7}^{,}^{8}^{,}^{11}^{-}^{13}.

The idea of using reference potential in multiscale modeling has been exploited by us in a wide range of problems including accelerating QM/MM calculations (ref^{14}^{-}^{16}) and path integral calculations of nuclear quantum mechanical effects (ref^{17}^{,}^{18}). The idea of using CG model as reference potential in folding studies^{10} and related processes has been also explored recently by other workers (e.g. refs^{8}^{,}^{19}^{,}^{20}). Here we describe the details of the current version of our multiscale landscape modeling and provide instructive illustrations of its use in calculating the change in protein stability upon mutation. We also outline other promising applications of this powerful model.

The simplified protein model used in this work is an extension of the model originally developed by Levitt and Warshel^{1} and modified in ref.^{10} In the current model we have made additional refinements of the original model, focusing on the introduction of the specialized solvation terms, designed to replicate the effects of the missing solvent molecules and on other factors that are not included explicitly.

Our simplified protein model, which is depicted schematically in Fig. 1, is created by replacing the side chain of each residue by an effective “atom” named (X) and an additional dummy atom named (D). The atom X is usually placed at the geometrical center of the heavy atoms of the corresponding side chains (with a residue dependent charge and van der Waals equilibrium radius), where, the distance between the newly placed X atom and the C_{α} atom, *r _{Ca}*

A schematic illustration of the conversion of an explicit side chain to its simplified equivalent. The position of the explicit C_{β} is preserved, using a dummy atom, named D, and the geometrical center of the side chain heavy atoms is represented **...**

The potential energy surface of the simplified (sp) model is written as

$$\begin{array}{c}\begin{array}{c}{U}_{\mathit{sp}}={U}_{\mathit{mm}}^{0}+{U}_{\mathit{ss}}^{0}+{U}_{\mathit{ms}}^{0}+{U}_{\mathit{solv}}^{0}=\hfill \\ {U}_{\mathit{mm}}+{U}_{\mathit{ss}}^{\mathit{ef}}+{U}_{\mathit{ss}}^{\mathit{QQ}}+{U}_{s}^{\mathit{self}}+{U}_{\mathit{ms}}^{\mathit{ef}}+{U}_{\mathit{ms}}^{\mathit{Qq}}\hfill \\ +\mathrm{\Delta}{U}_{\mathit{mm}}^{\mathit{HB}}+\mathrm{\Delta}{U}_{\mathit{mm}}^{\mathit{phi}-\mathit{psi}}+\mathrm{\Delta}{U}_{\mathit{mm}}^{\mathit{qq}}\hfill \end{array}\end{array}$$

(1)

Where *m* and *s* designate, respectively, “main” and “side”, *U*^{0} is the gas phase potential of the indicated term,
${U}_{\mathit{solv}}^{0}$ is the missing effect of the solvent and the protein. The more detailed expression on the right hand side involves the effective steric potential (*U ^{ef}*), the screened interaction between different charges (

The interaction between the simplified side chains is given by,

$${U}_{\mathit{ss}}={U}_{\mathit{ss}}^{\mathit{ef}}+{U}_{\mathit{ss}}^{\mathit{QQ}}$$

(2)

where ${U}_{\mathit{ss}}^{\mathit{ef}}$ is described by an “8-6” potential of the form,

$${U}_{\mathit{ss}}^{\mathit{ef}}=\text{\u2211}_{i<j}\frac{{\epsilon}_{\mathit{ij}}^{0}}{{C}_{\mathit{ij}}^{\mathit{scale}}}\left[3{\left({r}_{\mathit{ij}}^{0}/{r}_{\mathit{ij}}\right)}^{8}-4{\left({r}_{\mathit{ij}}^{0}/{r}_{\mathit{ij}}\right)}^{6}\right]$$

(3)

where
${\epsilon}_{\mathit{ij}}^{0}=\sqrt{{\epsilon}_{i}^{0}{\epsilon}_{j}^{0}}$ and
${r}_{\mathit{ij}}^{0}=\sqrt{{r}_{i}^{0}{r}_{j}^{0}}$. The parameters
${\epsilon}_{\mathit{ij}}^{0}$ and
${r}_{\mathit{ij}}^{0}$ define, respectively, the well depth and equilibrium distance. The unitless parameter *C ^{scale}* has a value of 10 when
${U}_{\mathit{ss}}^{\mathit{ef}}$ is calculated for polar - non polar and ionized – non polar residues only. For side chain interactions between polar – polar and non polar – non polar residues,

$${U}_{\mathit{ss}}^{\mathit{QQ}}=\text{\u2211}_{i\ne j}332\phantom{\rule{0.2em}{0ex}}{Q}_{i}{Q}_{j}/{\epsilon}_{\mathit{eff}}{r}_{\mathit{ij}}$$

(4)

where *Q _{i}* designates the charge of the

The effective electrostatic interactions between the side chains and the main chain and between the main chain groups and themselves are given by

$${U}_{\mathit{ms}}^{\mathit{Qq}}+{U}_{\mathit{mm}}^{\mathit{qq}}=\text{\u2211}_{i}\text{\u2211}_{k}332({Q}_{i}{q}_{k}/{\epsilon}_{\mathit{eff}}^{\prime}{r}_{\mathit{ik}})+\text{\u2211}_{k}\text{\u2211}_{k\prime \ne k}332({q}_{k}{q}_{k\prime}/{\epsilon}_{\mathit{eff}}^{\u2033}{r}_{\mathit{kk}\prime})$$

(5)

Here *Q _{i}* and

The self energy term ${U}_{s}^{\mathit{self}}$ represents the interaction of the side chain with its environment and accounts for the crucial free energy of transferring the side chain to the protein when all other groups are not ionized (note that the effect of interaction with other ionized groups is accounted separately by ${U}_{\mathit{ss}}^{\mathit{QQ}}$). The overall ${U}_{s}^{\mathit{self}}$ term is given by,

$${U}_{s}^{\mathit{self}}=\text{\u2211}_{i}\left[{U}_{\mathit{np}}\left({N}_{\mathit{np}}^{i}\right)+{U}_{\mathit{polar}}\left({N}_{\mathit{polar}}^{i}\right)\right]$$

(6)

where *i* runs over all ionized residues and
${N}_{\mathit{np}}^{i}$ and
${N}_{\mathit{polar}}^{i}$ are the number of nonpolar and polar residues in the neighborhood of the *i*^{th} residue. The functions *U _{np}* and

$${U}_{\mathit{np}}=\{\begin{array}{cc}4\phantom{\rule{0.2em}{0ex}}\text{exp}\phantom{\rule{0.2em}{0ex}}\left[-0.2{\left({N}_{\mathit{np}}-6\right)}^{2}\right]\hfill & {N}_{\mathit{np}}\le 6\\ 4\hfill & {N}_{\mathit{np}}>6\end{array}$$

(7)

and

$${U}_{\mathit{polar}}=\{\begin{array}{cc}-2\phantom{\rule{0.2em}{0ex}}\text{exp}\phantom{\rule{0.2em}{0ex}}\left[-0.2{\left({N}_{\mathit{polar}}-4\right)}^{2}\right]\hfill & {N}_{\mathit{polar}}\le 4\\ -2\hfill & {N}_{\mathit{polar}}>4\end{array}$$

(8)

The number of nonpolar residues neighboring the *i ^{th}* ionized residue is expressed by the analytical function

$${N}_{\mathit{np}}^{i}=\text{\u2211}_{j\left(\mathit{np}\right)}G\left({r}_{\mathit{ij}}\right)$$

(9)

With

$$G\left({r}_{\mathit{ij}}\right)=\{\begin{array}{cc}1& {r}_{\mathit{ij}}\le {r}_{\mathit{np}}\\ \text{exp}\phantom{\rule{0.2em}{0ex}}\left[-6{\left({r}_{\mathit{ij}}-{r}_{\mathit{np}}\right)}^{2}\right]& {r}_{\mathit{ij}}>{r}_{\mathit{np}}\end{array}$$

(10)

where *r _{np}* is the nonpolar radius defining the cutoff range of nonpolar neighboring residues (typically

It should be noted that the calculations of N_{p}, and N_{np} are done while treating the ionizable residues as non polar residues. This is done since the effect of charging these groups is taken into account separately by the
${U}_{\mathit{ss}}^{\mathit{QQ}}$ term.

The parameters in Eqs. (6-10) were determined (see section 3.1.3) by comparing the results obtained with these equations to the results of the actual solvation energy evaluated by the semimacroscopic version of the Protein Dipole Langevin Dipole (PDLD/S) in its linear response approximation (LRA) version (PDLD/S-LRA).

In principle it would be useful to include in ${U}_{s}^{\mathit{self}}$ the hydrophobic contribution of non polar residues and the effect of the solvent on the self energy of polar residues. However, these contributions are included implicitly in ${U}_{\mathit{ss}}^{\mathit{ef}}$ and thus we only consider explicitly the ${U}_{s}^{\mathit{self}}$ of ionized residues. This is done by using different ${\epsilon}_{\mathit{ij}}^{0}$ in Eq. (3) for hydrophobic and polar residues.

The secondary structure of proteins depends strongly on the solvation of the main chains. Thus we added the correction potential ${U}_{\mathit{mm}}^{\mathit{phi}-\mathit{psi}}$ that is used to modify the gas phase potential. This solvation potential is given by,

$$\mathrm{\Delta}{U}_{\mathit{mm}}^{\mathit{phi}-\mathit{psi}}=\text{\u2211}_{i=1}^{4}{A}_{i}\phantom{\rule{0.2em}{0ex}}g\left(\mathit{\varphi}-{\mathit{\varphi}}_{0}^{i},{\omega}_{0}^{i}\right)g\left(\psi -{\psi}_{0}^{i},{\omega}_{o}^{i}\right)$$

(11)

Where

$$g\left(x,\omega \right)=\text{exp}\phantom{\rule{0.2em}{0ex}}\left(-0.\mathit{693}\phantom{\rule{0.2em}{0ex}}\left(\mathit{1}-\mathit{\text{cos}}\phantom{\rule{0.2em}{0ex}}\left(x\right)\right)/\mathit{\text{sin}}\phantom{\rule{0.2em}{0ex}}\left(\omega /2\right)\right)$$

(12)

The values of
${\mathit{\phi}}_{0}^{i}$ and
${\psi}_{0}^{i}$ are chosen to represent the minima of the α-helix and β-sheet regions of the Ramachandran plot, while *A _{i}* and
${\omega}_{0}^{i}$ have been selected to tune the simple model α-helix and β-sheet regions to match those of the explicit model (see section 3.1.1). The specific values of these parameters are listed in Table II.

Solvation effects play a crucial role in determining hydrogen bonding. A part of this effect is taken into account by the ${\epsilon}_{\mathit{eff}}^{\u2033}$ of Eq. (5). However this uniform screening does not yield the correct physics of hydrogen bonding as established by our simulated studies where the RMS between the calculated and observed structure drift upward over time. Furthermore, even the introduction of $\mathrm{\Delta}{U}_{\mathit{mm}}^{\mathit{phi}-\mathit{psi}}$ have not resolved this problem.

In order to obtain better hydrogen bonding representation we introduced a potential $\mathrm{\Delta}{U}_{\mathit{mm}}^{\mathit{HB}}$, where a small Gaussian barrier was added to the potential well used to describe hydrogen bonds. At room temperature, this barrier is easily crossed during hydrogen bond formation while preventing the already formed bonds from breaking. This treatment gives $\mathrm{\Delta}{U}_{\mathit{mm}}^{\mathit{HB}}$ the functional form

$$\mathrm{\Delta}{U}_{\mathit{mm}}^{\mathit{HB}}=\{\begin{array}{cc}-9& r\le 2.0\\ -9\phantom{\rule{0.2em}{0ex}}\text{exp}\phantom{\rule{0.2em}{0ex}}\left(-15{\left(r-2.3\right)}^{2}\right)& r>2.0\end{array}$$

(13)

The inclusion of both ${U}_{\mathit{mm}}^{\mathit{phi}-\mathit{psi}}$ and $\mathrm{\Delta}{U}_{\mathit{mm}}^{\mathit{HB}}$ greatly increases the long term stability of the simple model, with a maximum RMS deviation of less than 3.0 Å over the course of a 100 ps trajectory.

The greatest difficulty to multiscale simulations lies in accurately and efficiently recreating the original all atom system from the simplified model system. Our strategy is based on the approach introduced by Fan *et. al.*^{10} In this approach we considered the partition functions *Q _{sp}* and

$$\text{exp}\left(-\mathrm{\Delta}{G}_{\mathit{ep}\to \mathit{sp}}\beta \right)={Q}_{\mathit{sp}}/{Q}_{\mathit{ep}}$$

(14)

where *β* = 1/*k _{B}T, k_{B}* being the Boltzmann constant and

$${U}_{\mathit{sp}}={U}_{m}\left(\stackrel{\sim}{R}\right)+{U}_{s}^{\mathit{sp}}\left(\stackrel{\sim}{R}\right)$$

(15)

and

$${U}_{\mathit{ep}}={U}_{m}\left(\stackrel{\sim}{R}\right)+{U}_{s}^{\mathit{ep}}\left(\stackrel{\sim}{r},\stackrel{\sim}{R}\right)$$

(16)

where are the coordinates of the simplified model and are the coordinates of the explicit side chain atoms, relative to the centers of the corresponding side chains in the simplified model. Now our task is to evaluate the free energy of moving from the simplified to the explicit model, Δ*G _{sp→ep}*, or −Δ

$$\begin{array}{c}\text{exp}\phantom{\rule{0.2em}{0ex}}\left(-\mathrm{\Delta}{G}_{\mathit{ep}\to \mathit{sp}}\beta \right)={Q}_{\mathit{sp}}/{Q}_{\mathit{ep}}=\hfill \\ =\int d\stackrel{\sim}{R}d\stackrel{\sim}{r}\phantom{\rule{0.2em}{0ex}}\text{exp}\phantom{\rule{0.2em}{0ex}}\left\{-{U}_{\mathit{sp}}\left(\stackrel{\sim}{R}\right)\beta \right\}/\int d\stackrel{\sim}{R}d\stackrel{\sim}{r}\phantom{\rule{0.2em}{0ex}}\text{exp}\phantom{\rule{0.2em}{0ex}}\left\{-{U}_{\mathit{ep}}\left(\stackrel{\sim}{r},\stackrel{\sim}{R}\right)\beta \right\}\hfill \\ =\frac{\int d\stackrel{\sim}{R}d\stackrel{\sim}{r}\phantom{\rule{0.2em}{0ex}}\text{exp}\phantom{\rule{0.2em}{0ex}}\left\{-\left({U}_{\mathit{sp}}\left(\stackrel{\sim}{R}\right)-{U}_{\mathit{ep}}\left(\stackrel{\sim}{r},\stackrel{\sim}{R}\right)\right)\beta \right\}\text{exp}\left\{-{U}_{\mathit{ep}}\left(\stackrel{\sim}{r},\stackrel{\sim}{R}\right)\beta \right\}}{\int d\stackrel{\sim}{R}d\stackrel{\sim}{r}\phantom{\rule{0.2em}{0ex}}\text{exp}\phantom{\rule{0.2em}{0ex}}\left\{-{U}_{\mathit{ep}}\left(\stackrel{\sim}{r},\stackrel{\sim}{R}\right)\beta \right\}}\hfill \end{array}$$

(17)

Thus we have

$$\text{exp}\left(-\Delta {G}_{\mathit{ep}\to \mathit{sp}}\beta \right)={\langle \text{exp}\left\{-\left({U}_{\mathit{sp}}\left(\stackrel{\sim}{R}\right)-{U}_{\mathit{ep}}\left(\stackrel{\sim}{r},\stackrel{\sim}{R}\right)\right)\beta \right\}\rangle}_{{V}_{\mathit{ep}}}$$

(18)

where the average is done formally over the combined *+* coordinate set. Eq. (18) can be evaluated by using a free energy perturbation approach with a mapping potential:

$${U}_{m}={U}_{\mathit{ep}}\left(\stackrel{\sim}{r},\stackrel{\sim}{R}\right)\left(\mathit{1}-{\lambda}_{m}\right)+{U}_{\mathit{sp}}\left(\stackrel{\sim}{R}\right){\lambda}_{m}$$

(19)

so that we have,

$$\text{exp}\phantom{\rule{0.2em}{0ex}}\left(-\delta \mathrm{\Delta}{G}_{m\to m+1}\beta \right)={\langle \text{exp}\left\{-\left({U}_{m+1}-{U}_{m}\right)\beta \right\}\rangle}_{{V}_{m}}$$

(20)

with

$$\mathrm{\Delta}{G}_{m\to m+1}=\text{\u2211}_{m=1}^{n+1}\delta \mathrm{\Delta}{G}_{m\to m+1}$$

(21)

In many applications we are not interested in the total partition function and the total free energy, but in the potential of mean force (PMF) of the simplified and explicit models. In such cases we define the PMF by considering a partial partition function

$$Q=\int q\left(X\right)dX$$

(22)

and

$$q\left(X\right)=\text{exp}\left\{-\mathrm{\Delta}g\left(X\right)\beta \right\},$$

(23)

where X is the parameter that defines the PMF. With Δ*g _{sp}*(

The thermodynamic cycle used to calculate the change in free energy Δ*g*_{ep} for a generic process. Having calculated the free energy change of the simple model, Δ*g*_{sp}, umbrella sampling can be used to calculate the free energy change ΔΔ **...**

$$\text{exp}\phantom{\rule{0.2em}{0ex}}\left\{-\mathrm{\Delta}{g}_{\mathit{ep}\to \mathit{sp}}\left(X\right)\beta \right\}={q}_{\mathit{sp}}\left(X\right)/{q}_{\mathit{ep}}\left(X\right)$$

(24)

Here we follow the same strategy as in Eq. (18) and obtain,

$$\text{exp}\left\{-\mathrm{\Delta}{g}_{\mathit{ep}\to \mathit{sp}}\left(X\right)\beta \right\}={\langle \delta \left(X-X\prime \right)\text{exp}\left\{-\left({U}_{\mathit{sp}}\left(\stackrel{\sim}{R},X\prime \right)-{U}_{\mathit{ep}}\left(\stackrel{\sim}{r},\stackrel{\sim}{R},X\prime \right)\right)\right\}\rangle}_{{V}_{\mathit{ep}}}$$

(25)

If we use the mapping potential of Eq. (19) we will have,

$$\mathrm{\Delta}{g}_{\mathit{ep}\to \mathit{sp}}\left(X\right)=\text{\u2211}_{m=1}^{n+1}\delta \phantom{\rule{0.2em}{0ex}}\mathrm{\Delta}{g}_{m\to m+1}\left(X\right)$$

(26)

with,

$$\text{exp}\left\{-\delta \mathrm{\Delta}{g}_{m\to m+1}\left(X\right)\right\}={\langle \delta \left(X-X\prime \right)\text{exp}\left\{-\left({U}_{m+1}-{U}_{m}\right)\beta \right\}\rangle}_{{V}_{m}}$$

(27)

In evaluating the free energy terms of Eqs. (18) or (27), we start typically from the simplified model in a given configuration and construct the explicit model by replacing the dummy atom with the explicit C_{β}. Then, using a rotamer library^{26} (with random search followed by brief steepest decent torsional minimization) we select the side chain configurations that minimize the potential energy of the system, while also minimizing the displacement between the center of the explicit side chain and the position of the simplified atom. After this treatment that considers all the rotational degrees of freedom within the side chains, we apply a steepest descent minimization over the Cartesian degrees of freedom within each side chain to obtain the relaxed side chains structure. Once the explicit structure has been minimized we can use Molecular Dynamics (MD) or Monte Carlo (MC) for the evaluation of the free energy of moving from the simplified to the explicit model.

A crucial element of the introduction of any analytic potential surface is the parameterization and validation process. This section considers aspects of the refinements that were not considered in the previous sections.

The optimization of the parameters that determine the solvation contribution from
${U}_{\mathit{mm}}^{\mathit{phi}-\mathit{psi}}$ and
$\mathrm{\Delta}{U}_{\mathit{mm}}^{\mathit{HB}}$ involved the use of the alanine dipeptide as a model system. The explicit free energy surface for this system was generated both for a gas phase model and for a solution model (using the surface constrained all atom solvent (SCAAS) model^{27}^{,}^{28}). In order to recreate the hydrated landscape by the simplified model, it was necessary to add to *U ^{phi}*

The addition of the
$\mathrm{\Delta}{U}_{\mathit{mm}}^{\mathit{phi}-\mathit{psi}}$ term (and in some respects the
$\mathrm{\Delta}{U}_{\mathit{mm}}^{\mathit{HB}}$) to the simplified model created a surface with minima at the typical regions obtained in other works (e.g. Lovell et al.^{29}). Our RC diagram is presented in Fig. 3 for both the explicit and simplified models. Apparently the simplified surface is not perfect, but allows us to capture the main features of the explicit model. The notations for the different regions are taken from the ref^{29}.

The term
$\mathrm{\Delta}{U}_{\mathit{mm}}^{\mathit{HB}}$ is strongly coupled to
$\mathrm{\Delta}{U}_{\mathit{mm}}^{\mathit{phi}-\mathit{psi}}$ and it is important to examine whether in addition to reproducing reasonable phi-psi map, we can retain a reasonable description of stable helices and other structures that are determined by hydrogen bonding. With the
$\mathrm{\Delta}{U}_{\mathit{mm}}^{\mathit{HB}}$ we find that a small barrier to hydrogen bond formation (~1 kcal mol^{-1}) is easily crossed at room temperature, but forms a significant barrier to the breaking of hydrogen bonds once they have been formed. By including Δ*U ^{HB}* in Eq. (1), we find that even full scale protein systems like ubiquitin or chorismate mutase

The introduction of
${\text{U}}_{\text{s}}^{\text{self}}$ is perhaps the most important new element in our approach. This potential clearly improves the physical basis of our model but the reliability of the model still depends crucially on the parametization of the self energy potential. The refinement procedure was done by selecting different ionizable groups in several protein sites, evaluating their self energy using the PDLD/S-LRA approach^{28} and adjusting the parameters in
${\text{U}}_{\text{s}}^{\text{self}}$ to reproduce the best agreement between the PDLD/S results and those of the simplified model.

In the first step of this refinement procedure we started by considering ${\text{U}}_{\text{s}}^{\text{self}}$ in a fully hydrophobic, non polar, protein environment. For this purpose we created the relevant environment by using in-silico methods. That is, we mutated, in-silico, residues 21-56 (out of the total of 62 residues) of the protein SSO7d (PDB ID 1SSO) to all-hydrophobic residues, except Ile29 which we mutated into Glu. This created an all-hydrophobic environment surrounding the ionizable residue Glu29 (see Fig. 4), where Glu29 is buried, by residues 22 to 33 and 43 to 56. This model which is referred to as “the fully buried Glu 29” was then modified creating two additional environments for Glu29. In the second, which is referred to as a “semi-buried” environment, we deleted residues 49 to 56, making Glu29 more exposable to solvent. Finally for the third, which is refereed to here as “exposed” environment, we mutated Val22, Met24 and Phe47 into Glycine, making Glu29 fully exposed to solvent.

The system used as a benchmark for the behavior of the self energy term in non polar environment. The backbone conformation of protein 1SSO residues 21-56 was taken, and the fully hydrophobic sequence after the mutations is reported in the Figure. The **...**

Next we calculated the solvation free energy of the ionized Glu29, in all 3 hydrophobic environments described above. We used the PDLD/S-LRA as well as simplified model's Eqs. (6 – 10), and adjusted the parameters in these equations to get the best agreement between the two values. The results for simplified and PDLD/S-LRA salvation energy for the artificial protein environments are given in the first three entries of Table III.

After refining the simplified model to obtain the best description of the effect of nonpolar residues on the self-energy, we turned to the refinement of the parameters that reflect the polar contributions. This was done by the same procedure described above but now for a number of real proteins sites (where we have significant number of polar residues). The protein sites were taken from three diverge proteins: Apoflavodoxin (PDB ID 1FTG) residue site Glu106, Tm DHFR (PDB ID 1CZ3) residue sites Arg28 and Asp117, and Interleukin (PDB ID 1IOB) residue sites Glu111 and Glu113. Following the same steps as with the artificial hydrophobic environments, we calculated ${\text{U}}_{\text{s}}^{\text{self}}$ by using both the PDLD/S-LRA and the simplified model's approach and then refined the parameters in Eqs. (6-10). The self energies obtained by both approaches are given in Table III. The final set of refined parameters for ${\text{U}}_{\text{s}}^{\text{self}}$ are given in Table IV. As seen from the tables, the agreement between simplified model ${\text{U}}_{\text{s}}^{\text{self}}$, and PDLD/S-LRA ${\text{U}}_{\text{s}}^{\text{self}}$ is quite encouraging.

In considering the transition from simplified CG models to explicit models, we start by noting that Eq. (14) is not conceptually tied to a given type of reaction or process. Rather, it represents a general means of converting between two model representations of the same system. The thermodynamic cycle of this generic transformation is given in Fig. 2. Previously^{10}, we have shown that this process can be exploited in calculating the PMF for unfolding processes, using the radius of gyration as a means of denaturing the system from the folded to the unfolded state. We have also shown that it is possible to use simplified model structures to obtain explicit structures of both folded and unfolded proteins^{12}. In this work we explore other applications of our general strategy. This is done in the following subsection.

One possible application of our model is the evaluation of the effect of mutation on protein stability. Although this can be done by evaluating the folding PMF for both systems, it is much simpler to do so by using the thermodynamic cycle of Fig. 5. More specifically, ultimately we are interested in evaluating:

The thermodynamic cycles used to calculate the change in free energy of unfolding upon mutation (see text for details)

$$\mathrm{\Delta}\mathrm{\Delta}{G}_{N\to M}^{\mathit{uf}\to f}=\mathrm{\Delta}{G}_{{M}_{\mathit{ep}}}^{\mathit{fold}}-\mathrm{\Delta}{G}_{{N}_{\mathit{ep}}}^{\mathit{fold}}$$

(29)

where $\mathrm{\Delta}{G}_{{M}_{\mathit{ep}}}^{\mathit{fold}}$ and $\mathrm{\Delta}{G}_{{N}_{\mathit{ep}}}^{\mathit{fold}}$ are the free energies of the mutant and native forms of the protein, respectively, evaluated by the explicit model. $\mathrm{\Delta}{G}_{{M}_{\mathit{ep}}}^{\mathit{fold}}$ can be evaluated using the thermodynamic cycle of Fig. 5, yielding

$$\mathrm{\Delta}{G}_{{M}_{\mathit{ep}}}^{\mathit{fold}}=\mathrm{\Delta}{G}_{{M}_{\mathit{sp}}}^{\mathit{fold}}+\left(\mathrm{\Delta}\mathrm{\Delta}{G}_{{M}_{\mathit{sp}\to \mathit{ep}}}^{f}-\mathrm{\Delta}\mathrm{\Delta}{G}_{{M}_{\mathit{sp}\to \mathit{ep}}}^{\mathit{uf}}\right)$$

(30)

where $\mathrm{\Delta}{G}_{{M}_{\mathit{ep}}}^{\mathit{fold}}$ is the folding free energy of the mutant in the simplified model, while $\mathrm{\Delta}\mathrm{\Delta}{G}_{{M}_{\mathit{sp}\to \mathit{ep}}}^{f}$, $\mathrm{\Delta}\mathrm{\Delta}{G}_{{M}_{\mathit{sp}\to \mathit{ep}}}^{uf}$ are the changes in free energy of the transformation from the simplified model to the explicit model for the folded and unfolded mutant, respectively. The folding energy of the native system is produced through the same process as the mutant. However, rather than calculating the folding energy directly, we can use the thermodynamic cycle inside the grey box in Fig. 5 and obtain

$$\mathrm{\Delta}{G}_{{M}_{\mathit{sp}}}^{\mathit{fold}}-\mathrm{\Delta}{G}_{{N}_{\mathit{sp}}}^{\mathit{fold}}=\mathrm{\Delta}{G}_{{N}_{\mathit{sp}}\to {M}_{\mathit{sp}}}^{f}-\mathrm{\Delta}{G}_{{N}_{\mathit{sp}}\to {M}_{\mathit{sp}}}^{\mathit{uf}}$$

(31)

It is therefore possible to calculate the change in folding free energy in the simple model by comparing the free energy of mutating the protein in the folded and unfolded systems. Ultimately, combining Eqs. (29 – 31), and the thermodynamic cycle of Fig. 5 gives,

$$\begin{array}{r}\begin{array}{c}\mathrm{\Delta}\mathrm{\Delta}{G}_{N\to M}^{\mathit{uf}\to f}=\left(\mathrm{\Delta}{G}_{{N}_{\mathit{sp}}\to {M}_{\mathit{sp}}}^{f}-\mathrm{\Delta}{G}_{{N}_{\mathit{sp}}\to {M}_{\mathit{sp}}}^{\mathit{uf}}\right)\\ \hfill +\left(\mathrm{\Delta}\mathrm{\Delta}{G}_{{M}_{\mathit{sp}\to \mathit{ep}}}^{f}-\mathrm{\Delta}\mathrm{\Delta}{G}_{{M}_{\mathit{sp}\to \mathit{ep}}}^{\mathit{uf}}\right)\\ \hfill -\left(\mathrm{\Delta}\mathrm{\Delta}{G}_{{N}_{\mathit{sp}\to \mathit{ep}}}^{f}-\mathrm{\Delta}\mathrm{\Delta}{G}_{{N}_{\mathit{sp}\to \mathit{ep}}}^{\mathit{uf}}\right)\end{array}\end{array}$$

(32)

While Eq. (32) appears more complicated than evaluating Eq. (29) directly, it has definite advantages. First, calculating ΔΔ*G _{uf}*

As a test case for the performance of Eq. (32), we have chosen to examine the pseudo wild-type ubiquitin and the Asp21Asn mutant discussed in our previous work^{21} (see also ref^{30}). The three dimensional representation of this system is shown in Fig. 6. We have chosen this system because it has been well studied by other means^{21}, and at 76 residues is large enough and has a compact enough secondary structure to represent an actual system of interest. Our FEP calculations were performed by both MD and Monte Carlo simulations. The calculations started with 10,000 MD relaxation steps at 300 K, using 1 fs for each step. The main chain and residues far from the mutated residues were fixed during this relaxation runs. The resulting explicit structure was then used to generate the simplified structure. Four such simplified structures were generated. Starting from the given simplified structure we simulated the motions of the explicit model by either MD or MC simulations at 300 K with the mapping potential of Eq. (19) and evaluated the free energy of moving from the simplified to the explicit model by a FEP approach. The key aspect of our procedure is demonstrated in Fig. 7, where we describe the fluctuations of the gap between the explicit and simplified potential for trajectories on the given mapping potential of the native protein (in this case the sixth window with *λ* = 0.5). The same procedure was repeated for 11 frames and the overall free energy of moving from the simplified to the explicit potential was calculated (
$\mathrm{\Delta}{G}_{{N}_{\mathit{sp}\to \mathit{ep}}}^{f}$). The same procedure was used to evaluate
$\mathrm{\Delta}{G}_{{M}_{\mathit{sp}\to \mathit{ep}}}^{f}$ for the Asp 21Asn mutant. The calculations for the unfolded protein (
$\mathrm{\Delta}{G}_{{N}_{\mathit{sp}\to \mathit{ep}}}^{\mathit{uf}}$ and
$\mathrm{\Delta}{G}_{{M}_{\mathit{sp}\to \mathit{ep}}}^{\mathit{uf}}$) are shown in Table V.

A three dimensional representation of pseudo wild type ubiquitin. Asp21, which is involved in the mutational study, is represented explicitly.

The fluctuations of the energy difference between the explicit and simplified potentials, obtained during (A) FEP/MD, (B) Monte Carlo, simulations

Another important use of our approach is in the field of enzyme design where it can be used to evaluate the binding free energy of rate determining transition states. This can be done by focusing on the electrostatic free energy contribution Δ*G _{bind′}*, while using the cycle of Fig. 8, which leads to the following expression:

$$\begin{array}{l}\begin{array}{c}\mathrm{\Delta}\mathrm{\Delta}{G}_{\mathit{bin}{d}^{\prime},\mathit{ep}}^{(N\to M)}\left(\mathit{TS}\right)=-\mathrm{\Delta}{G}_{\mathit{sp}\to \mathit{ep}}^{N}(\mathit{TS})+\mathrm{\Delta}{G}_{\mathit{sp}\to \mathit{ep}}^{M}(\mathit{TS})+\mathrm{\Delta}{G}_{\mathit{sp}}^{N\to M}(\mathit{TS})\\ \hfill +\mathrm{\Delta}{G}_{\mathit{sp}\to \mathit{ep}}^{N}(T{S}^{\prime})-\mathrm{\Delta}{G}_{\mathit{sp}\to \mathit{ep}}^{M}(\mathit{TS}\prime )-\mathrm{\Delta}{G}_{\mathit{sp}}^{N\to M}(\mathit{TS}\prime )\end{array}\end{array}$$

(33)

Where
$\mathrm{\Delta}{G}_{\mathit{sp}\to \mathit{ep}}^{N}(\mathit{TS})$ and
$\mathrm{\Delta}{G}_{\mathit{sp}\to \mathit{ep}}^{M}(\mathit{TS})$ are the changes in the free energy of transforming the simplified model to the explicit model, for the fully charged TS in the native and mutant, respectively.
$\mathrm{\Delta}{G}_{\mathit{sp}}^{N\to M}(\mathit{TS})$ is the difference in free energy between native and mutant simplified structures for the fully charged TS.
$\mathrm{\Delta}{G}_{\mathit{sp}\to \mathit{ep}}^{N}(\mathit{TS}\prime )$ and
$\mathrm{\Delta}{G}_{\mathit{sp}\to \mathit{ep}}^{M}(\mathit{TS}\prime )$ are the changes in the free energy of transforming the simplified model to the explicit model for the zero charged TS, in the native and mutant, respectively.
$\mathrm{\Delta}{G}_{\mathit{sp}}^{N\to M}(\mathit{TS}\prime )$ is the difference in free energy between native and mutant simplified structures for the zero charged *TS*.

The calculations were performed on monomeric chorismate mutase (mMjCM) (whose active site is depicted in Fig. 9). The coordinates mMjCM (NMR) structure were taken from PDB access code 2GTV. The native protein preceded to 100ps MD simulations in order to obtain a relaxed structure. The resulting native explicit structure was then used to generate the simplified structure. Both the simplified and explicit models included an explicit empirical valence bond (EVB) description of the reacting system (see ref^{31} for details). Three such simplified structures were generated and used to simulate the fluctuations of the explicit model by MC moves of the torsional coordinates. The simulations were done at 300 K with the mapping potential of Eq. (19) and evaluated the free energy of moving from the simplified to the explicit model with full EVB transition state charges and zero EVB transition state charges. We have used a simulation length of 1 ps for each of the 11 mapping steps and the overall free energy of moving from the simplified to the explicit potential was calculated for the charged (
$\mathrm{\Delta}{G}_{\mathit{sp}\to \mathit{ep}}^{N}(\mathit{TS})$) and zero charged
$\mathrm{\Delta}{G}_{\mathit{sp}\to \mathit{ep}}^{N}(\mathit{TS}\prime )$ transition state. We choose as test cases the mutations of Gln88 and Arg51 since both (particularly Gln88) were found to present a significant challenge in our recent exploration of the potential of the EVB in enzyme design^{31}. Here we mutated Gln88 to Asn88 and repeated the calculation described above. We also mutated Arg51 to Gln51 and repeated the calculations. The results are shown in Table VI. Finally, we calculated
$\mathrm{\Delta}\mathrm{\Delta}{G}_{\mathit{bind}}^{\mathit{ep}(N\to M)}\left(\mathit{TS}\right)$ by taking the corresponding change in average energy (see results in Table VI).

A schematic description of the mMjCM active sites, depicting key residues that are involved in the binding of the transition-state analogue (bold).

The use of the CG model as a reference potential can facilitate studies of other long-standing problems and here we list several new key applications.

The proposal that slow conformational motions on the millisecond time scale play a major role in enzyme catalysis has been featured in many recent high profile works (see discussion in ref^{32}). Although there is no direct experimental support, and there are very clear conceptual and logical reasons why this idea is invalid (see ref^{32} and reference in that work), it is crucial to explore the dynamical proposal by simulation studies that can explore the relevant milliseconds time range. Furthermore, there are other fundamental functional problems whose analysis can be greatly helped from the ability to explore slow protein motions.

Recently we introduced a renormalization method were we move from the full model to the simplified model and then to an even simpler 2-D model in representing the landscape and dynamics in the space defined by the conformational and chemical coordinate in enzymes. In this treatment we force the CG model to have the same dynamical properties as the full model by a strategy similar to the one used in our modeling of ion channels^{33}. The main element of our approach is the refinement of the friction in the CG model so that the response of this model to the application of a strong constraint will be similar to that of the full model. The use of this model and in particular its transfer to the 2-D model allows us to explore the time dependence of processes that occur on a long time scale (e.g. ref^{34} and ref^{32}). The study of ref provided the first direct proof that the dynamical proposal is invalid and further progress in the use of the renormalization model is clearly expected.

There is a significant current interest in MD simulations that take into account changes in ionizations states during the simulated process (ref^{35} and ref^{36}^{,}^{37}). However, the current model does not consider the time dependence of the proton transport (PTR) process and thus cannot reproduce the proper time dependence of the response of a protein to changes in pH. To advance on this challenging field we combined our approach of time dependent Monte Carlo (MC) of PTR processes (ref^{38}) and the simplified protein model described here, in studies of pH dependent MD

Our model represents the energetics of the system by a simplified version of the EVB using the modified Marcus equation (see ref^{39}) for the energetic of any possible PTR step, where the free energies for the different protonation states is obtained by the electrostatic energy of the CG model. The MC moves are based on the electrostatic energies of the CG model and then scaled by the characteristic PT time to correspond to the rate constant predicted by transition state theory. More specifically (ref^{40} and ref^{41}), the free energy of each protonation state is expressed by

$$\mathrm{\Delta}{G}^{\left(m\right)}=\text{\u2211}_{k}\left\{-2.3\mathrm{RT}{\mathrm{q}}_{k}^{\left(m\right)}\left[\text{p}{K}_{\text{int},k}^{p}-\mathrm{pH}\right]+1/2\text{\u2211}_{k\ne 1}{\mathrm{W}}_{\mathit{kl}}{\mathrm{q}}_{k}^{\left(m\right)}{\mathrm{q}}_{l}^{\left(m\right)}\right\}$$

(34)

where m designates the vector of the charge states of the given configuration (i.e. m = q_{1}^{(m)}, q_{2}^{(m)},… q_{n}^{(m)}). Here q_{k}^{(}^{m}^{)} is the actual charge of the *k*^{th} group at the *m*^{th} configuration. The W* _{kl}* q

The barrier for the PT moves is given by^{39}

$$\mathrm{\Delta}{g}_{i\to j}^{\ne}={(\mathrm{\Delta}{G}_{i\to j}^{0}+\lambda )}^{2}/4\lambda -{\overline{H}}_{\mathit{ij}}({x}^{\ne})+{\overline{H}}_{\mathit{ij}}^{2}({x}_{0}^{(i)})/(\mathrm{\Delta}{G}_{i\to j}^{0}+\lambda )$$

(35)

where
$\mathrm{\Delta}{G}_{i\to j}^{0}$ is the free energy of the reaction, and H_{ij} is the EVB off-diagonal term that mixes the two relevant states whose average values at the transition state *x*^{≠} and at the reactant state
${x}_{0}^{(i)}$, are designated by the corresponding . The first term is the expression of the regular Marcus equation (see ref^{39}), which corresponds to the intersection of Δ*g*_{1} and Δ*g*_{2} at *x*^{≠}. The second and third terms represent, respectively, the effect of H_{12} at *x*^{≠} and
${x}_{0}^{(i)}$.

Using Eq. (35) with simplified modifications^{38}, we consider random jumps of the proton to any possible site, but accept only jumps to sites i+1 and i-1 (or in the more general case to sites that are less than 4.0 from site *i* (usually we use 3.5 Å cutoff but here we use a simplified model for the protein side chains). Furthermore, these jumps are accepted only if they satisfy the standard Metropolis criterion with regard to the free energy change (Δ*G _{n}*

$${\tau}_{i\to j}={S}_{\mathit{PT}}\cdot {N}_{i\to j}$$

(36)

Where *τ _{i}*

$$S=0.165\cdot \text{exp}\left\{\frac{\delta}{{k}_{B}T}\right\}$$

(37)

Where 0.165 is the average time, in picoseconds, it takes for a productive trajectory to reach the *TS* at room temperature (see supplementary information of ref^{38}). The factor *δ* represents the solute and solvent reorganization barrier for a transfer between states of similar energy, i.e.
$\delta =\mathrm{\Delta}{g}_{n\to n+1}^{\u2021}\phantom{\rule{0.2em}{0ex}}-\mathrm{\Delta}{G}_{n\to n+1}$. The MC approach guarantees that the rate of the PT steps will follow the relevant Boltzmann probability in TST (see ref^{38}).

The PT to and from the bulk is determined by the same considerations but now using the free energy of Eq. (34) for the given pH. The MC move for Δ*G _{N}*

$${\tau}_{\mathit{Balk}}=0.2\mathit{ps}$$

(38)

Since some ionized groups are not in direct contact with the bulk solvent, we scale the MC move for PT to and from the bulk by a factor, *F(R)*, that reflects the distance, *R _{i,bulk}*, between the given residue and the closest bulk water molecule.

$$\begin{array}{ll}F\left({R}_{i,\mathit{bulk}}\right)={R}_{i,\mathit{bulk}}-a& \begin{array}{c}{R}_{i,\mathit{bulk}}\ge 4\hfill \\ \mathit{site}\phantom{\rule{0.2em}{0ex}}i\phantom{\rule{0.2em}{0ex}}\mathit{is}\phantom{\rule{0.2em}{0ex}}\mathit{protonated}\hfill \end{array}\\ F\left({R}_{i,\mathit{bulk}}\right)=a-{R}_{i,\mathit{bulk}}& \begin{array}{c}{R}_{i,\mathit{bulk}}\ge 4\hfill \\ \mathit{site}\phantom{\rule{0.2em}{0ex}}i\phantom{\rule{0.2em}{0ex}}\mathit{is}\phantom{\rule{0.2em}{0ex}}\mathit{not}\phantom{\rule{0.2em}{0ex}}\mathit{protonated}\hfill \end{array}\\ F\left({R}_{i,\mathit{bulk}}\right)=0& \hfill {R}_{i,\mathit{bulk}}<4\end{array}$$

(39)

Where *a* is a parameter of positive value. In this preliminary work, *a* is set to *a* = 10 Å. The distances *R*, are determined by using a 4 Å grid and excluding grid points that are within 4 Å from the protein (this large exclusion radius is due to the fact that we use a simplified protein model). The function *F(R)* was chosen to represent the chance of a bulk molecule to approach the protein residues. The grid is then updated once in every 1000 MD steps. Since the exponential factor of *F(R)* is included in the criteria of the MC move that determines the acceptance of the PT moves we can write:

$${S}_{\mathit{bulk}}=0.2\mathit{ps}=0.16\cdot \left(\frac{0.2}{0.16}\right)\phantom{\rule{0.2em}{0ex}}\mathit{ps}$$

(40)

Thus we can write for the PT from and to the bulk

$${\tau}_{\mathit{bulk}}={S}_{\mathit{bulk}}\cdot {N}_{i\to j}$$

(41)

The scaling by 0.16 allows us to use the same scaling for the MC for the PT from bulk and the MC for the PT between the protein groups and between the protein groups and the bulk.

The next issue is the coupling between the PT steps and the protein fluctuations. One option is to run Langevin dynamics (LD) of the CG model. This option requires a significant computational time, since the typical LD steps are about 10fs, where the MC steps correspond to 200fs. Another option is to also simulate the protein motion by an MC treatment, and then estimate the characteristic time for the MC moves. Finally, considering the fact that protein's structural changes are of interest mainly when they appear after the changes in its ionization state, it is possible to simply turn-on the protein relaxation, only after reaching new stable ionization states, then allow the protein to relax (with fixed ionization state) and finally turn on again the MC for the PT processes. While such approach may not be fully consistent, it will converge to the same final state as the more consistent and much more expensive fully coupled LD - MC approach. All the above options will be considered in the future. Regardless of the method used for simulating the protein fluctuations, we expect the overall method to be very instrumental in modeling pH induced conformational changes.

The preliminary implementation of the model demonstrated that calculated ionization state at the equilibrated system appeared to be quite similar to those obtained with the PDLD/S-LRA treatment. We also performed preliminary simulations of the time dependent PTR in protein SSO7d (PDB ID: 1SSO) following pH change from 0 to 7. The residue sites used for these simulations are: Lys4, Lys6, Lys8, Glu10, Glu11, Lys12, Asp15, Lys18, Lys20, Lys21, Lys24, Lys27, Asp34 and Glu35. The corresponding results are given in Fig. 10. The approach is promising but more validation is needed.

This work presented the current version of our multiscale approach that uses a simplified folding model as a reference potential for all-atom simulations. The simplified model used here includes some innovations in terms of the electrostatic treatment and in particular in terms of the representation of the self-energy. However, the main point in our treatment is that the simplified potential does not have to be perfect since it is only used as a reference for the explicit potential. Of course, the closer are the simplified and the explicit potential, the faster will be the convergence of the calculations of the free energy for the transfer between these potentials.

The power of our treatment was illustrated in calculations of mutational effects, in the calculations of transition state binding energies and in calculations of time dependent response to pH changes. Another application that was demonstrated in our previous work involves the evaluation of the potential of mean force (PMF). We also outlined other applications of our approach, including its implementation in studies of long time scale dynamical effects and the proposal that dynamical effects contribute to catalysis^{32}. More systematic options of using the reference potential for exploring the dynamics of the full model has not been fully formalized but developments in this crucial direction are clearly expected.

The exact time saving of the CG model is an issue that requires careful analysis, which is left to subsequent studies. That is, some of the key questions like obtaining a PMF for protein unfolding are hard to explore with explicit models since performing the corresponding calculations, where the proper sampling is extremely challenging. Similarly the effect of proper sampling calculations of binding energies is not yet fully clear. Thus we are not trying to give here time estimates, and hope to address the issue in subsequent works. However we can give here a rather trivial example, noting that a structural relaxation of the protein 1SSO took 100ps took 7 hours on a 64bit Dual Intel Xeon node, with 20Gb memory, while the equivalent relaxation of the simplified model on the same node took less than 2 minutes.

Overall it is clear that multiscale modeling of proteins has advanced significantly from its early days in 1975 and that a more rigorous treatments should focus on minimizing the difference between the average of the difference between the simplified and explicit potential <(U_{sp}-U_{exp})>. In fact, minimizing this functional with respect to the parameters of the simplified model is probably the most promising ways of refining the CG model.

This work was supported by NIH grants GM 24492 and GM40283. M.R. thanks the Generalitat Valenciana from Spain for the postdoctoral fellowship. We gratefully acknowledge the University of Southern California's High Performance Computing and Communications Center for computer time.

1. Levitt M, Warshel A. Computer-Simulation of Protein Folding. Nature. 1975;253(5494):694–698. [PubMed]

2. Bryngelson JD, Wolynes PG. Spin glasses and the statistical mechanics of protein folding. Proc Natl Acad Sci U S A. 1987;84(21):7524–7528. [PubMed]

3. Dill KA. Dominant Forces in Protein Folding. Biochemistry. 1990;29(31):7133–7155. [PubMed]

4. Hinds DA, Levitt M. A lattice model for protein structure prediction at low resolution. Proc Natl Acad Sci U S A. 1992;89(7):2536–2540. [PubMed]

5. Olszewski KA, Kolinski A, Skolnick J. Folding simulations and computer redesign of protein A three-helix bundle motifs. Proteins. 1996;25(3):286–299. [PubMed]

6. Shakhnovich E, Abkevich V, Ptitsyn O. Conserved residues and the mechanism of protein folding. Nature. 1996;379(6560):96–98. [PubMed]

7. Cheung MS, Chavez LL, Onuchic JN. The energy landscape for protein folding and possible connections to function. Polymer. 2004;45(2):547–555.

8. Heath AP, Kavraki LE, Clementi C. From coarse-grain to all-atom: toward multiscale analysis of protein landscapes. Proteins. 2007;68(3):646–661. [PubMed]

9. Marrink SJ, Risselada HJ, Yefimov S, Tieleman DP, de Vries AH. The MARTINI force field: coarse grained model for biomolecular simulations. J Phys Chem B. 2007;111(27):7812–7824. [PubMed]

10. Fan ZZ, Hwang JK, Warshel A. Using simplified protein representation as a reference potential for all-atom calculations of folding free energy. Theor Chem Acc. 1999;103(1):77–80.

11. Benkovic SJ, Hammes GG, Hammes-Schiffer S. Free-energy landscape of enzyme catalysis. Biochemistry. 2008;47(11):3317–3321. [PubMed]

12. Roca M, Messer BM, Hilvert D, Warshel A. On the relationship between folding and chemical landscapes in enzyme catalysis. Proc Natl Acad Sci U S A. 2008;105:13877–13882. [PubMed]

13. Xiang Y, Goodman MF, Beard WA, Wilson SH, Warshel A. Exploring the role of large conformational changes in the fidelity of DNA polymerase beta. Proteins. 2008;70(1):231–247. [PMC free article] [PubMed]

14. Luzhkov V, Warshel A. Microscopic Models for Quantum Mechanical Calculations of Chemical Processes in Solutions: LD/AMPAC and SCAAS/AMPAC Calculations of Solvation Energies. J Comp Chem. 1992;13:199–213.

15. Rosta E, Klahn M, Warshel A. Towards Accurate Ab Initio QM/MM Calculations of Free-Energy Profiles of Enzymatic Reactions. J Phys Chem B. 2006;110(6):2934–2941. [PubMed]

16. Wesolowski TA, Warshel A. Frozen Density Functional Approach for *Ab Initio* Calculations of Solvated Molecules. J Phys Chem. 1993;97:8050–8053.

17. Hwang JK, Warshel A. A quantized classical path approach for calculations of quantum mechanical rate constants. J Phys Chem. 1993;97(39):10053–10058.

18. Hwang JK, Warshel A. How important are quantum mechanical nuclear motions in enzyme catalysis? J Am Chem Soc. 1996;118:11745–11751.

19. Brandt A. Principles of systematic upscaling. In: Fish J, editor. Bridging the scales in Science and Engineering. Oxford University Press; 2008.

20. Matysiak S, Clementi C. Minimalist protein model as a diagnostic tool for misfolding and aggregation. J Mol Biol. 2006;363(1):297–308. [PubMed]

21. Roca M, Messer B, Warshel A. Electrostatic contributions to protein stability and folding energy. FEBS Lett. 2007;581(10):2065–2071. [PubMed]

22. Warshel A, Sharma PK, Kato M, Parson WW. Modeling electrostatic effects in proteins. Biochim Biophys Acta. 2006;1764(11):1647–1676. [PubMed]

23. Vicatos S, Roca M, Warshel A. Effective Approach for Calculations of Absolute Stability of Proteins Using Focused dielectric Constants. Proteins. 2009 In Print. [PMC free article] [PubMed]

24. Warshel A, Russell ST. Calculations of electrostatic interactions in biological systems and in solutions. Q Rev Biophys. 1984;17(3):283–422. [PubMed]

25. Warshel A, Russell ST, Churg AK. Macroscopic models for studies of electrostatic interactions in proteins: limitations and applicability. Proc Natl Acad Sci U S A. 1984;81(15):4785–4789. [PubMed]

26. Lovell SC, Word JM, Richardson JS, Richardson DC. The penultimate rotamer library. Proteins. 2000;40(3):389–408. [PubMed]

27. King G, Warshel A. A surface constrained all-atom solvent model for effective simulations of polar solutions. J Chem Phys. 1989;91(6):3647–3661.

28. Lee FS, Chu ZT, Warshel A. Microscopic and semimicroscopic calculations of electrostatic energies in proteins by the POLARIS and ENZYMIX programs. J Comp Chem. 1993;14:161–185.

29. Lovell SC, Davis IW, Arendall WB, 3rd, de Bakker PI, Word JM, Prisant MG, Richardson JS, Richardson DC. Structure validation by Calpha geometry: phi,psi and Cbeta deviation. Proteins. 2003;50(3):437–450. [PubMed]

30. Went HM, Jackson SE. Ubiquitin folds through a highly polarized transition state. Protein Eng Des Sel. 2005;18(5):229–237. [PubMed]

31. Roca M, Vardi-Kilshtain A, Warshel A. Toward Accurate Screening in Computer-Aided Enzyme Design. Biochemistry. 2009;48(14):3046–3056. [PMC free article] [PubMed]

32. Pisliakov AV, Cao J, Kamerlin SCL, Warshel A. Enzyme millisecond conformational dynamics do not catalyze the chemical step. Proc Natl Acad Sci U S A. 2009 Early edition. [PubMed]

33. Burykin A, Kato M, Warshel A. Exploring the origin of the ion selectivity of the KcsA potassium channel. Proteins. 2003;52(3):412–426. [PubMed]

34. Liu H, Shi Y, Chen XS, Warshel A. Simulating the electrostatic guidance of the vectorial translocations in hexameric helicases and translocases. Proc Natl Acad Sci U S A. 2009;106(18):7449–7454. [PubMed]

35. Khandogin J, Brooks CL., 3rd Constant pH molecular dynamics with proton tautomerism. Biophys J. 2005;89(1):141–157. [PubMed]

36. Baptista AM, Martel PJ, Petersen SB. Simulation of protein conformational freedom as a function of pH: constant-pH molecular dynamics using implicit titration. Proteins. 1997;27(4):523–544. [PubMed]

37. Baptista AM, Martel PJ, Soares CM. Simulation of electron-proton coupling with a Monte Carlo method: application to cytochrome c3 using continuum electrostatics. Biophys J. 1999;76(6):2978–2998. [PubMed]

38. Olsson MH, Warshel A. Monte Carlo simulations of proton pumps: on the working principles of the biological valve that controls proton pumping in cytochrome c oxidase. Proc Natl Acad Sci U S A. 2006;103(17):6500–6505. [PubMed]

39. Braun-Sand S, Burykin A, Chu ZT, Warshel A. Realistic simulations of proton transport along the gramicidin channel: demonstrating the importance of solvation effects. J Phys Chem B. 2005;109(1):583–592. [PubMed]

40. Warshel A. Conversion of light energy to electrostatic energy in the proton pump of halobacterium halobium. Photochem Photobiol. 1979;30:285–290. [PubMed]

41. Sham YY, Muegge I, Warshel A. Simulating proton translocations in proteins: probing proton transfer pathways in the Rhodobacter sphaeroides reaction center. Proteins. 1999;36(4):484–500. [PubMed]

42. Woycechowsky KJ, Choutko A, Vamvaca K, Hilvert D. Relative Tolerance of an Enzymatic Molten Globule and Its Thermostable Counterpart to Point Mutation. Biochemistry. 2008;47(51):13489–13496. [PubMed]

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |