Home | About | Journals | Submit | Contact Us | Français |

**|**HHS Author Manuscripts**|**PMC2978075

Formats

Article sections

Authors

Related links

J Phys Chem B. Author manuscript; available in PMC 2010 November 10.

Published in final edited form as:

Published online 2008 February 6. doi: 10.1021/jp076377+

PMCID: PMC2978075

NIHMSID: NIHMS247218

The publisher's final edited version of this article is available at J Phys Chem B

See other articles in PMC that cite the published article.

The efficiency of temperature replica exchange (RE) simulations hinge on their ability to enhance conformational sampling at physiological temperatures by taking advantage of more rapid conformational interconversions at higher temperatures. While temperature RE is a parallel simulation technique that is relatively straightforward to implement, kinetics in the RE ensemble is complicated and there is much to learn about how best to employ RE simulations in computational biophysics. Protein folding rates often slow down above a certain temperature due to entropic bottlenecks. This “anti-Arrhenius” behavior represents a challenge for RE. However, it is far from straightforward to systematically explore the impact of this on RE by brute force molecular simulations, since RE simulations of protein folding are very difficult to converge. To understand some of the basic mechanisms that determine the efficiency of RE it is useful to study simplified low dimensionality systems that share some of the key characteristics of molecular systems. Results are presented concerning the efficiency of temperature RE on a continuous two-dimensional potential that contains an entropic bottleneck. Optimal efficiency was obtained when the temperatures of the replicas did not exceed the temperature at which the harmonic mean of the folding and unfolding rates is maximized. This confirms a result we previously obtained using a discrete network model of RE. Comparison of the efficiencies obtained using the continuous and discrete models makes it possible to identify non-Markovian effects which slow down equilibration of the RE ensemble on the more complex continuous potential. In particular, the rate of temperature diffusion and also the efficiency of RE is limited by the timescale of conformational rearrangements within free energy basins.

One of the key challenges in the computer simulation of proteins at the atomic level is the sampling of conformational space. The efficiency of many common sampling protocols, such as Monte Carlo (MC) and Molecular Dynamics (MD), is limited by the lack of apparent ergodicity caused by high free energy barriers between conformational states and rugged energy landscapes. Replica exchange (RE) methods^{1}^{–}^{5} are widely employed to enhance the conformational sampling efficiency of biomolecular simulations for the study of protein biophysics, including peptide and protein folding^{6}^{, }^{7} and aggregation,^{8}^{–}^{10} and protein-ligand interactions.^{11}^{, }^{12} To accomplish barrier crossings, RE methods simulate a series of replicas over a range of potential parameters^{13}^{–}^{17} or temperatures.^{5} In the latter, replicas exchange temperatures following a Metropolis criterion designed to preserve canonical distributions. This scheme allows conformations at physiological temperatures, where conformational interconversions are rare, to switch to higher temperatures where transitions to other conformations are more likely. In a sense, therefore, the enhancement of conformational sampling at low temperatures is achieved by “borrowing” the faster kinetics at higher temperatures.

The popularity of RE methods is due to their ease of implementation and their ability to enhance conformational sampling while preserving canonical distributions at the thermodynamic conditions of each replica. The properties of the RE algorithm and how it can be utilized most effectively for the study of protein folding and binding has received attention recently.^{18}^{–}^{20} The determination of the temperature assignment and number of replicas to achieve optimal temperature mixing has been the subject of a variety of studies.^{3}^{, }^{21}^{–}^{27} Recent work has also recognized the importance of conformational relaxation as a key limiting factor which can affect the efficiency of the RE algorithm.^{18}^{, }^{19}^{, }^{26}^{, }^{28} While temperature RE is relatively straightforward to implement, kinetics in the RE ensemble is complicated and does not correspond in any simple way to the molecular kinetics (necessitating additional methods for the reconstruction of molecular kinetics from RE samples^{29}^{–}^{32}). Molecular kinetics, however, can have a strong effect on RE, especially when the kinetics has complex temperature dependence. The anti-Arrhenius behavior typical of protein folding kinetics, where the folding rate above a critical threshold temperature decreases with increasing temperature,^{33}^{–}^{36} is understood to occur when the transition state is energetically favored but entropically disfavored with respect to the reactants. Anti-Arrhenius behavior represents a challenge for temperature RE because when folding exhibits anti-Arrhenius behavior there exists a temperature (generally unknown) at which the folding and unfolding rates are optimal. If even higher temperatures beyond the optimal are included in the RE ensemble, this may degrade performance.^{20}

Although some comparative studies aimed at determining the benefits of RE over conventional MD for peptide folding have been conducted,^{19}^{, }^{37}^{, }^{38} it is far from straightforward to systematically explore the convergence properties of RE by brute force molecular simulations, since RE simulations of protein folding are very difficult to converge. To understand some of the basic mechanisms that determine the efficiency of RE it is useful to study simplified low dimensionality systems that share some of the key characteristics of molecular systems. We recently investigated a discrete two-state network model for replica exchange (NRE), containing two conformational states (Folded and Unfolded) at each of several temperatures.^{20} We found that the efficiency of RE for this system varies non-monotonically with respect to the temperature distribution of the replicas when the folding rate displays anti-Arrhenius behavior. The model showed that the rate of folding/unfolding events in RE is maximal when high temperature replicas are placed near the temperature at which the harmonic mean of the folding and unfolding rates for the uncoupled system (*k _{f}* and

In this paper we extend this analysis by studying a continuous two-dimensional system designed to reproduce the anti-Arrhenius kinetics of a conformational equilibrium, such as a protein folding equilibrium, mediated by an entropic bottleneck. The two-dimensional system studied here is an extension of the potential model we originally used to study the convergence of the weighted histogram analysis method,^{39} and is very similar in spirit to to the funnel-like golf course model for protein folding studied by Szabo and co-workers.^{40} This two-dimensional system is sufficiently simple to be amenable to accurate analytical and numerical solution, while including some characteristics of molecular systems that were absent from the discrete NRE model. The present model is self-contained in that the kinetic rates are determined by the potential and the move set rather than being imposed, as in the NRE model of reference ^{20}. Furthermore, and most importantly, the unfolded and folded macrostates have, like real molecular systems, microscopic internal structure. The new model makes it possible to follow the joint microscopic evolution of the system in conformational and temperature space. It incorporates the same discrete temperature exchange scheme commonly adopted in replica exchange molecular simulations, and it allows us to study the effects of non-Markovian processes likely present in replica exchange simulations of molecular systems.

In the next section we present the potential model and the kinetic scheme we have employed. We review the replica exchange method and the network model for replica exchange we previously developed. We then summarize the thermodynamic and kinetic properties of the two-dimensional system and present results showing how these determine the efficiency of the replica exchange method. The paper is then concluded by discussing the implications of these findings for replica exchange simulations of molecular systems.

A two-dimensional potential was constructed to mimic the anti-Arrhenius temperature dependence of the folding rate seen in proteins. We designed this potential to have an energetic barrier when going from the “folded” to the “unfolded” region, and an entropic barrier in the reverse direction. The entropic barrier is achieved by imposing a hard wall constraint that limits the space accessible to the folded region. Specifically, the particle can only move in the region −1 ≤ *x* ≤ 1, 0 ≤ *y* ≤ *B*(*x*), where the boundary function *B*(*x*) is a small constant for *x* ≤ 0 and an increasing function of *x* for *x* > 0 (Figure 1):

$$B(x)=\{\begin{array}{ll}\delta ,\hfill & -1\le x\le 0\hfill \\ {bx}^{{n}_{1}}+\delta ,\hfill & 0<x\le 1\hfill \end{array}.$$

(1)

A schematic representation of the two-dimensional potential function used in this work. The colored area corresponds to the accessible region of the (*x*, *y*) plane, with the colors representing the magnitude of the potential energy at that (*x*, *y*) point **...**

The use of a boundary of this form is based on a two-dimensional potential first used in our laboratory to study the convergence of the weighted histogram analysis method,^{39} and is very similar in spirit to simplified models for protein folding studied by Bicout and Szabo^{40} and the model of an entropic barrier by Zhou and Zwanzig.^{41} The specific parameters *δ*, *b*, and *n*_{1} were chosen together with the parameters of the potential function discussed below by trial and error to achieve a sufficiently strong temperature dependence to illustrate some of the possible consequences of anti-Arrhenius behavior on RE simulations. It is natural to choose the *x* axis to be the reaction coordinate, with −1 ≤ *x* ≤ 0 corresponding to the folded macrostate and 0 < *x* ≤ 1 to the unfolded macrostate. The move set was chosen to be compatible with this reaction coordinate (see below). In order for folding and unfolding to be activated processes, however, it is necessary to add a potential energy function which has an energetic well as a function of *x* in the folded region, and increases with *x* in the unfolded region. Specifically, we use the potential function

$$U(x,y)=\{\begin{array}{lll}{a}_{1}{(x+{x}_{0})}^{2},\hfill & -1\le x<-{x}_{1}\hfill & 0\le y\le B(x)\hfill \\ -{a}_{2}{x}^{2}+{c}_{0},\hfill & -{x}_{1}\le x\le 0,\hfill & 0\le y\le B(x)\hfill \\ {a}_{3}{x}^{{n}_{2}}+{c}_{0},\hfill & 0<x<{x}_{2},\hfill & 0\le y\le B(x)\hfill \\ {a}_{4}{x}^{{n}_{3}}+{c}_{1}\hfill & {x}_{2}\le x\le 1,\hfill & 0\le y\le B(x)\hfill \\ \infty \hfill & \text{otherwise}\hfill & \phantom{\rule{0.16667em}{0ex}}\hfill \end{array},$$

(2)

where *a*_{1} = 23.53 kcal/mol, *a*_{2} = 235.3 kcal/mol, *a*_{3} = 376.5 kcal/mol, *a*_{4} = 11.29 kcal/mol, *c*_{0} = 7.059 kcal/mol, *b* = 5, *n*_{1} = 4.55, *n*_{2} = 2, *n*_{3} = 0.5, and *δ* = 2 × 10^{−7}. The constants
${x}_{0}=\sqrt{{c}_{0}({a}_{1}+{a}_{2})/{a}_{1}{a}_{2}}$, *x*_{1} = *a*_{1}*x*_{0}/(*a*_{1} + *a*_{2}),
${x}_{2}={({a}_{4}{n}_{3}/{a}_{3}{n}_{2})}^{{\scriptstyle \frac{1}{{n}_{2}-{n}_{3}}}},{c}_{1}={c}_{0}-({a}_{4}{x}_{2}^{{n}_{3}}-{a}_{3}{x}_{2}^{{n}_{2}})$ were chosen so that the first derivative of *U*(*x*, *y*) is continuous. A graphical representation of the two-dimensional system studied here is shown in Figure 1.

We use Metropolis Monte Carlo (MC) sampling to simulate the movement of a particle in this two-dimensional potential. Kinetic MC has a long history in the study of protein folding using simplified models.^{42}^{–}^{44} To ensure rapid equilibration along the *y* coordinate consistent with the choice of *x* as the reaction coordinate and because of the large size difference of the accessible region in the *y* direction between the folded and unfolded regions, we adopted an asymmetric MC proposal scheme,^{39}^{, }^{45} in which the step size in the *y* direction is proportional to *b*(*x*), i.e. a proposed move (*x*′, *y*′) is generated uniformly in the region *x* − Δ < *x*′ < *x* + Δ, *y* − *b*(*x*)Δ < *y*′ < *y* + *b*(*x*) Δ. The displacement parameter Δ was chosen such that the barrier crossing is slow but not prohibitively expensive and follows a linear regime (i.e. doubling Δ causes approximately a doubling in the number of barrier crossings). To correct for the asymmetric MC proposal distribution, the factor *θ*(|*y*′ − *y*|/*b*(*x*′)Δ) was included to satisfy detailed balance, where *θ*(*z*) equals 1 if *z* < 1 and 0 otherwise.

Rate constants in units of MC steps were obtained via MC simulation by calculating the mean first passage times between the two macrostates. The same displacement parameter Δ = 0.05 was used for all temperatures. A “buffer region” −0.1 < *x* < 0.0437 was defined as not belonging to either the folded or unfolded state to reduce artefactual rapid re-crossings of the barrier.^{46}^{, }^{47} For comparison, the temperature dependence of the folding and unfolding rate constants were also estimated from the PMF using the Arrhenius equation *k* = *A* exp(−Δ*G*^{†}/*k _{B}T*), where Δ

Replica exchange simulations were performed by running *N* MC simulations at *N* inverse temperatures *β _{i}* = (

The efficiency of RE conformational sampling was monitored by measuring *N _{TE}*(

We review here the discrete kinetic network model which we devised in our recent study of replica exchange efficiency.^{20} In this model (unlike the continuous potential model above), the macrostates *F* and *U* (for “folded” and “unfolded”) do not possess any internal structure. Instead, it is assumed that the system evolves in time as a Poisson process, in which instantaneous transitions between *F* and *U* occur after waiting periods given by exponentially distributed random variables with means equal to the reciprocals of the folding or unfolding rates. The result (for a single replica) is an example of a “random telegraph” Markov process.^{50}

If the transition events are Markovian, then the simultaneous behavior of two uncoupled non-interacting replicas can be represented by the four composite states {*F*_{1}*F*_{2}, *F*_{1}*U*_{2}, *U*_{1}*F*_{2}, *U*_{1}*U*_{2}}. In each symbol, the first letter represents the configuration of replica 1, the second letter the configuration of replica 2, and the subscripts denote the temperature of each replica. Only transitions corresponding to a single conformational change (e.g. *F*_{1}*F*_{2} → *U*_{1}*F*_{2}) are allowed, assuming that the probability of two simultaneous changes (e.g. *F*_{1}*U*_{2} → *U*_{1}*F*_{2}) in an infinitesimal interval *dt* can be neglected.^{50} The four-state composite system for two non-interacting replicas can be extended to create a network model of replica exchange by introducing temperature exchanges between replicas, i.e. by allowing transitions such as *F*_{1}*U*_{2} → *F*_{2}*U*_{1}. This leads to a system with 8 states arranged in a cubic network with “horizontal” folding and unfolding transitions and “vertical” temperature exchange transitions (Figure 2). For canonical equilibrium probabilities to be preserved under temperature exchanges, it is sufficient that detailed balance is satisfied, e.g. the transition probabilities *w*(*F*_{1}*U*_{2} → *F*_{2}*U*_{1}) and *w*(*F*_{2}*U*_{1} → *F*_{1}*U*_{2}) satisfy *P _{eq}*(

The kinetic network model for the discrete NRE model used by Zheng, et al.^{20} The state labels represent the conformation (letter) and temperature (subscript) for each replica. For example, *F*_{2}*U*_{1} represents the state in which replica 1 is folded and at **...**

The effect of the rate of temperature exchanges in included by introducing the rate parameter *α*, which controls the overall scaling of the temperature exchange rate relative to the folding and unfolding rates. The forward and reverse rates of the *F*_{1}*F*_{2} *F*_{2}*F*_{1} and *U*_{1}*U*_{2} *U*_{2}*U*_{1} transitions are set equal to *α*, while the other rates are set to *α* or *wα* as required by detailed balance, where in this case *w* = *P _{eq}*(

$${k}_{ex}=\frac{{k}_{f1}{k}_{f2}+2{k}_{u1}{k}_{f2}+{k}_{u1}{k}_{u2}}{({k}_{f1}+{k}_{u1})({k}_{f2}+{k}_{u2})}\alpha .$$

(3)

The NRE model was simulated using a standard method for continuous time Markov processes with discrete states,^{50} also known as the “Gillespie algorithm”. Given a current state *X*_{0}, we identify its *m* neighboring states *X*_{1}, *X*_{2}, …, *X _{m}* and the transition rates

In Figure 3 we show the potentials of mean force (PMF) corresponding to the two-dimensional potential along the *x* coordinate at several temperatures. PMFs calculated by MC sampling and numerical integration of the canonical distribution function agree to within statistical accuracy. The PMFs show two free energy minima corresponding to the folded (F, *x* ≤ 0) and unfolded (U, *x* > 0) conformational states, separated by a free energy barrier near *x* = 0. The free energy minimum of the unfolded state and the free energy barrier have no counterparts in the potential, which is monotonically varying in both of these regions (Figure 1). These features of the PMF originate from the interplay between opposing entropic and enthalpic driving forces. The free energy minimum of the unfolded state corresponds to the optimal balance between entropy, which drives the system towards large values of *x* (where the accessible space along the *y* coordinate is greatest), and enthalpy, which drives the system towards small values of *x* (where the potential energy is smallest). The free energy barrier that separates the unfolded and folded state is entropic in origin. For *x* near 0, the entropy is significantly reduced compared to the unfolded state, and assumes a value similar to that of the folded state (compare in Figure 1 the size of the accessible space along *y* at *x* = 0 and for *x* > 0 and *x* < 0). In contrast, the potential energy at *x* = 0, although smaller than in the unfolded state, is still substantially larger than in the folded state. This imbalance between entropy and potential energy causes the free energy maximum at *x* = 0.

The potential of mean force (PMF) at three different temperatures: 296 K (solid), 474 K (dashed) and 789 K (dotted). The PMF was calculated using numerical integration. To more clearly illustrate the change in the barrier height as a function of temperature, **...**

From the point of view of folding, the free energy maximum constitutes an entropic bottleneck. In order to make a transition to the folded state, the system needs to cross the free energy barrier region at *x* = 0, where the system has lost all of the entropy required for folding without having gained all of the folding enthalpy. Similar transition bottlenecks have been described in simplified models for protein folding.^{34}^{, }^{40}^{, }^{51} After crossing this barrier the system enters the folded state by going downhill in potential energy without further reduction in conformational entropy, since the accessible space along the *y* direction is the same for all points *x* in the folded space. Because the conformational entropy is constant for *x* < 0, the potential of mean force in this region coincides with the potential energy. From the point of view of unfolding, the free energy maximum at *x* = 0 constitutes an enthalpic barrier. Relative to the folded state, points in the region near *x* = 0 have similar conformational entropy but larger potential energy. To reach the barrier region from the folded state therefore the system needs to gain potential energy (enthalpy) without the help of a concomitant increase in conformational entropy. Beyond the barrier region there is a free energy gain for moving towards the unfolded state since the gain in conformational entropy outweighs the increase in potential energy.

As shown below, the barrier region close to *x* = 0 constitutes the transition state for the folding/unfolding equilibrium. The free energy difference between the unfolded and folded states and the transition state corresponds to the free energies of activation, which determine the rate of folding and unfolding respectively. Due to their different thermodynamic origins (entropic vs enthalpic), the free energies of activation for folding and unfolding display the opposite dependence on temperature. As Figure 3 shows, the free energy of activation for folding increases with increasing temperature relative to thermal energy (*kT*), where the free energy of activation for unfolding decreases with increasing temperature. This anti-Arrhenius behavior is the signature of an entropically activated process. The conformational entropy difference between the unfolded state and the transition state increases as the temperature is increased, leading to an increase in the height of the free energy barrier for folding with increasing temperature.

Figure 4 shows the temperature dependence of the population, *P _{F}*(

The temperature dependence of the fractional population folded (solid line) calculated by numerical integration of the potential of mean force. The temperature dependence of the fraction folded corresponding to a system with a smaller average potential **...**

$${\left(\frac{{dP}_{F}}{dT}\right)}_{T={T}_{M}}=\frac{1}{4}\frac{{\overline{U}}_{F}-{\overline{U}}_{U}}{k{T}^{2}},$$

which is proportional to the difference of the average potential energies, *U _{F}* and

With the MC move set described in the Methods Section above, the kinetics of folding/unfolding is two-state as measured by the distribution of first passage times, which is exponential (Figure 5). The Arrhenius plots of the folding and unfolding reaction rates are shown in Figure 6. The temperature dependence of the reaction rates using the Arrhenius equation with activation free energies extracted from the PMFs (Figure 3) agree well with the simulation results, and is a further indication that the kinetics is two-state and that the reaction coordinate is well represented by the *x* coordinate. This is a consequence of choosing a move set for which equilibration along the *y* coordinate is faster than along the *x* coordinate. The alternative potential parametrization in the Appendix, which is characterized by a smaller average potential energy of the unfolded state relative to the folded and the transition states, leads to a weaker temperature dependence of the folding rate (Figure 6, dashed lines). Since the slope of the Arrhenius curve is proportional to the activation energy, this difference of the rates is consistent with the smaller energy of activation obtained with the alternative parametrization.

The distributions of first passage times for folding (black) and unfolding (red) observed during a 2.7×10^{10}-step kinetic MC at 475 K. Approximately 4700 folding and unfolding events were observed. A folding first passage time is defined as the **...**

The temperature dependence of the folding and unfolding rate constants (solid lines and symbols). Folding and unfolding rates are indicated by red and green color, respectively. The folding and unfolding rates corresponding to a system with a smaller **...**

The folding rates decrease with increasing temperature, a phenomenon which has been observed in the kinetics of protein folding.^{33}^{–}^{36}^{, }^{42} Processes displaying anti-Arrhenius behavior are said to be characterized by a negative effective activation energy, whereby the enthalpy of the unfolded state is larger than that of the transition state. The free energy of activation of these processes, however, remains positive due to the activation entropy favoring the unfolded state. The negative activation entropy is associated with the smaller number of accessible conformations at the transition state relative to the unfolded state; that is, the transition state constitutes an entropic “bottleneck” that needs to be traversed for the transition to the folded state to occur. These elements clearly exist in the simplified two-dimensional system under investigation. Since the potential energy decreases monotonically from the unfolded state to the folded state, the average potential energy at the transition state (*x* = 0) is smaller than the average potential energy of the unfolded state, leading to the observed anti-Arrhenius behavior of the rate of folding. Despite the enthalpic driving force favoring the transition state, the free energy of activation for folding remains positive at all temperatures examined (as the calculated PMF along the *x* coordinate shows). This is because the entropy of the transition state is smaller than the entropy of the unfolded state due to the larger accessible configuration space along the *y* coordinate (Figure 1). The entropic destabilization of the transition state, which (as in protein folding) can be described as acting as a “bottleneck”, more than offsets the enthalpic stabilization, leading to the observed positive activation free energy for folding.

Often the observed folding rates of proteins show non-monotonic behavior with respect to the temperature; the folding rate increases with temperature at low temperatures as in normal Arrhenius behavior, switching to anti-Arrhenius behavior at higher temperatures, when the folding rate decreases with increasing temperature. This phenomenon is rationalized in terms of a negative activation heat capacity. The activation heat capacity is defined as the temperature derivative of the activation energy, and a negative value of the activation heat capacity indicates that the unfolded state has a larger heat capacity than the transition state. The observed negative heat capacity of activation of protein folding has been variously interpreted as being due to the hydrophobic effect^{33}^{, }^{42} or to the difference of the distribution of energies of the molecular conformations experienced as a function of temperature.^{34}^{, }^{52} The curvature of the Arrhenius plot is related to the activation heat capacity. The present simplified two-dimensional system does not have a large enough heat capacity of activation to reproduce this turnover from Arrhenius to anti-Arrhenius behavior within the temperature range we have investigated. Thus, the results extracted from this model are applicable only to the anti-Arrhenius temperature regime of the protein folding process.

Figure 7 shows the number of direct round trip transition events *N*_{direct} observed during MC simulations of *N _{MC}* = 5 × 10

In a recent paper,^{20} we analyzed the convergence and efficiency of replica exchange using a discrete model for folding and unfolding. We found that when the physical kinetics shows anti-Arrhenius temperature dependence, there exists an optimal maximal temperature beyond which the efficiency of the replica exchange method is degraded. Similar behavior is expected from RE simulations using the continuous two-dimensional potential, with possible differences arising from the more complex nature of the present model, where the folded and unfolded states have internal structure. We performed replica exchange simulations on the continuous two-dimensional potential with MC as the dynamic propagator, and replica exchange proposals made periodically between adjacent temperatures every *N _{X}* MC steps. The efficiency of conformational sampling was monitored by counting the number of temperature-conditional transition events

In order to directly compare with the results obtained previously, we first performed replica exchange using two replicas. Although such a simulation would not be realistic in general for a protein system due to poor energy overlap and very inefficient temperature exchange, it is feasible in the two-dimensional potential. The result for a 2×10^{9}-step simulation where the lower temperature is held fixed at 296 K and the upper temperature varies from 296 to 789 K is shown in Figure 8 (green, red and blue dots). We see behavior similar to that seen for the discrete model studied previously: the number of temperature-conditional transitions *N*_{TE} has non-monotonic behavior and exhibits a maximum at an optimal high temperature given by the maximal harmonic mean of the folding and unfolding rates (474 K). This maximum point is approximately independent of the rate at which attempted temperature exchanges occur. While the location of the maximum is in agreement with our previous results,^{20} the magnitude of the number of transition events is not. We have shown that for NRE simulations employing a two-state model (folded and unfolded states), the number of transition events is given by the average over all temperatures of the harmonic means of the folding and unfolding rates, provided that the rate of temperature exchanges is sufficiently fast.^{20} In the continuous model, we find that the number of transitions is significantly lower than that predicted from the average of the harmonic means of the rates (Figure 8, black dashed line). This may be due to the finite rate of temperature exchanges, deviations from the pure Markovian kinetics of the two-state discrete model, or a combination of these effects.

The dependence of the number of temperature-conditional transition events *N*_{TE} (section 2.3) on the temperature of the high-temperature replica for a two-replica simulation on the continuous potential (circles), and comparison with predicted transition **...**

To test whether this reduced number of transitions is due to insufficiently fast temperature exchange attempts, we performed several simulations in which we varied *N _{X}* (the number of MC steps between attempted temperature exchanges). We see in Table 1 that

To explore causes for the observed transition deficit, we performed simulations using the discrete NRE model (Figure 2) using kinetic parameters derived from the two-dimensional continuous potential (Figure 6). To map the rates determined using the continuous potential to the discrete model, we used the folding and unfolding rates directly, expressed in units of 10^{−6} per MC step. Different values of *α* were used for the *F*_{1}*F*_{2} *F*_{2}*F*_{1} and *U*_{1}*U*_{2} *U*_{2}*U*_{1}, and were set to 10^{6}/*N _{X}* multiplied by the empirical acceptance rate when both replicas are in the folded or unfolded state (0.853 and 0.395, respectively).

If we compare the observed number of transitions seen in the continuous model with the number predicted by the NRE model with the same rate parameters (Table 1) we see that there is good agreement when the attempted exchange rate is small, but substantial disagreement when it becomes larger. In particular, while the number of transitions using the continuous model reaches a plateau value at *N _{X}* ≈ 1 000, the predicted number of transitions in the NRE model continues to increase, asymptotically approaching the value predicted by the average of harmonic means. Similarly, comparison of the predicted and observed number of transitions as a function of temperature (Figure 8) show a significant overestimation of the transition rate by the NRE model, and that this overestimation is much more severe when the rate of attempted temperature exchanges is fast. For example, while the

One possible origin of this speed limit is that the average effective rates are different in the coupled and uncoupled systems. To test this, we analyzed the kinetics of the continuous RE simulation by using the NRE model to “reverse-engineer” the apparent rates by estimating the mean residence times and branching ratios for various RE macrostates. If the system is Markovian, then the rate *k _{tot}* given by the inverse of the mean residence time is the sum of the rates exiting that state. The rate corresponding to a given edge can then be estimated by multiplying

In order to further investigate the origin of the observed speed limit, we calculated the mean first passage times (MFPTs) for temperature conditional folding and unfolding, i.e. the average time for a replica unfolded at low temperature to become folded at low temperature (regardless of path), or *vice versa*. The resulting MFPTs for the continuous potential are shown in Table 1. We see there that the *N _{TE}* speed limit arises exclusively from a limitation in the fastest achievable unfolding rate, since the folding process is independent of

On the other hand, the unfolding process (if it occurs via an indirect route, which is likely given the very small value of *k _{u}*

One obvious way in which the continuous and NRE models differ is that the macrostates in the continuous potential have spatial extent, unlike the NRE states which lack internal structure. This means that a finite time is required for the particle to transit the non-equivalent microstates that make up the two wells. In fact, we observe that the correlation time for diffusion in the *x* direction *in the unfolded well* at 474 K is approximately 1 400 MC steps. This timescale is of the same magnitude as the *N _{X}* value at which the speed limit effect of Table 1 begins to occur, suggesting that there may in fact be a connection between the observed

Formally, a process is Markovian if and only if the observed propagators (Green’s functions) do not depend on the history of the trajectory prior to the current state, i.e.

$$P({x}_{3},{t}_{3}\mid {x}_{1},{t}_{1};{x}_{2},{t}_{2})=P({x}_{3},{t}_{3}\mid {x}_{2},{t}_{2})$$

(4)

for all states *x*_{1}, *x*_{2}, *x*_{3} and all times *t*_{1} < *t*_{2} < *t*_{3}. Although equation 4 could be used to directly detect deviations from Markovian behavior, previous work has typically used other analysis methods to detect such deviations.^{29}^{, }^{53}^{, }^{54} For example, in a Markovian process the rate matrix **K** determines the propagators via the master equation

$$\stackrel{.}{\mathbf{p}}(t)=\mathbf{Kp}(t),$$

(5)

where **p**(*t*) is the vector of propagators at time *t*. The formal solution of Equation 5 is given by **p**(*t*) = *e*^{K}^{t}**p**(0), and therefore *e*^{K}* ^{τ}* can be thought of as a transition matrix

We have chosen to detect deviations from Markovian kinetics by examining the observed residence time distributions and branching ratios, which provides insights into the physical origin and the mechanism by which the non-Markovian effects enter into the stochastic process. In our simulations on the continuous potential, we have found that the residence time distributions in the macrostates are exponential to within statistical uncertainty (data not shown), and thus by themselves are consistent with Markovian kinetics. The branching probabilities, however, are significantly dependent on the preceding macrostate. We focused on transitions entering and leaving the thermodynamically favored *U*_{2}*F*_{1} macrostate (or its symmetry-related state *F*_{1}*U*_{2}). We ran a several trajectories using different rates of attempted temperature exchange and tallied the number of times each macrostate sequence (*X*, *U*_{2}*F*_{1}, *Y*) was observed in each (where *X*, *Y* {*F*_{2}*F*_{1}, *U*_{2}*U*_{1}, *U*_{1}*F*_{2}}). These counts were transformed into normalized branching probabilities, where *P*(*X*|*Y*) denotes the history-independent branching probability of next visiting macrostate *X* given that the system is currently in macrostate *Y*, and *P*(*X*|*Z*, *Y*) denotes the history-dependent branching probability of next visiting macrostate *X* given that the system is currently in macrostate *Y* and had been in macrostate *Z* immediately prior (Table 3).

If the kinetics is Markovian, then the history-dependent and the corresponding history-independent branching probabilities will be equal:

$$P(X\mid Z,Y)=P(X\mid Y),$$

from which it follows that history-dependent branching probabilities that differ only in the history condition will also be equal:

$$P(X\mid Z,Y)=P(X\mid W,Y).$$

This is clearly not the case for the data in Table 3. For example, we see that the history-dependent branching probabilities *P*(*U*_{1}*F*_{2}|*F*_{2}*F*_{1}, *U*_{2}*F*_{1}), and *P*(*F*_{2}*F*_{1}|*F*_{2}*F*_{1}, *U*_{2}*F*_{1}) differ significantly from their corresponding history-independent branching probabilities *P*(*U*_{1}*F*_{2}|*U*_{2}*F*_{1}) and *P* (*F*_{2}*F*_{1}|*U*_{2}*F*_{1}), and the branching probability *P*(*U*_{1}*F*_{2}|*F*_{2}*F*_{1}, *U*_{2}*F*_{1}) is significantly smaller than *P*(*U*_{1}*F*_{2}|*U*_{1}*F*_{2}, *U*_{2}*F*_{1}). This is most pronounced when the rate of attempted temperature exchanges is fast.

Examination of the kinetic scheme of Figure 2 indicates that the deviations from Markovian behavior seen in Table 3 are consistent with a reduction in the number of temperature-conditional round-trip conformational transition events. If the unfolding rate at low temperature is negligible, then a low-temperature folded conformation unfolds predominantly via indirect paths of the form *F*_{1}*F*_{2} → *F*_{2}*F*_{1} → *U*_{2}*F*_{1} → *U*_{1}*F*_{2} or *F*_{1}*U*_{2} → *F*_{2}*U*_{1} → *U*_{2}*U*_{1} → *U*_{1}*U*_{2}. In the former case, the *F*_{2}*F*_{1} → *U*_{2}*F*_{1} step is more likely to be reversed when the temperature exchange rate is rapid (Table 3), as is the *F*_{1}*U*_{2} → *F*_{2}*U*_{1} step in the latter case (which follows by symmetry from the *U*_{2}*F*_{1} → *U*_{1}*F*_{2} results of Table 3). Thus, increasing the rate of attempted temperature exchanges increases the probability of counterproductive backtracking relative to the Markovian case, resulting in a decrease in the rate of temperature-conditional unfolding events, and therefore at corresponding decrease in *N _{TE}* (since temperature-conditional unfolding was shown above to be rate-limiting).

Although the results presented here do not identify the physical origin of the non-Markovian kinetics, we hypothesize that it is due to the finite time required for diffusion of the particle within the macrostates. This effect does not arise in the NRE model, since in there the macrostates have no internal structure, and the probability of making a transition to a given macrostate at any instant *dt* is the same, regardless of which macrostate the system was in previously or how long it has been in the current macrostate. The behavior of the continuous system within the wells is not Markovian, since the system has memory that is mediated by conformational diffusion within the macrostate. This correlation in time of the particle’s position (and energy) implies that there is a maximal effective value of the rate of statistically independent temperature exchanges, which is limited by the time required for conformational relaxation *within* the folded and unfolded macrostates.

The above results were obtained with two replicas, which is not typical for replica exchange simulations that would be carried out for peptides and proteins. To investigate the effect of adding additional replicas, we performed a series of simulations of 2 × 10^{9} MC steps with 2 to 15 replicas distributed uniformly in *T*^{−1} from 296 to 789 K. The results are shown in Figure 9. One important issue that arises when considering such a set of results is the appropriate measure of conformational sampling efficiency of RE. If we consider the total number of transition events *N*_{TE} (direct and indirect) in all replicas, then we would see for the most part a monotonic increase of efficiency as a function of the number of replicas *N* simply because the number of indirect “channels” for transitions is linearly increasing. This measure of efficiency, however, implicitly assumes that computer power is inexpensive and that the convergence rate of the simulation is the important limiting factor. If both computer resources and the convergence rate are limiting factors, a more appropriate measure is the computational efficiency calculated as the number of transition events per replica (*N*_{TE}/*N* ). According to this measure, a replica exchange simulation with *N* + 1 replicas is considered more efficient than one with *N* replicas only if the introduction of the additional replica provides more than a proportional increase in the number of transition events at the temperature of interest.

Number of transition events *N*_{TE} (section 2.3) normalized by the number of replicas in 2 × 10^{9} MC steps for 2 to 15 replicas exponentially distributed in temperature from 296 to 789 K. Temperature exchanges were attempted every 10 000 (solid), **...**

We find that the efficiency increases strongly as a function of *N* when *N* is small, reaches a maximum, and decreases with *N* for larger *N* (Figure 9). This pattern is unchanged as a function of the rate of attempted temperature exchanges, showing a scaling approximately consistent with the results in Table 1. The trends seen here are qualitatively similar to that seen previously in the NRE two-state discrete model^{20} with finite *a*. In that work, we attributed the decrease with increasing number of replicas beyond an optimum value in part to a combinatoric effect that decreases the relative size of the “target” space of configurations in which a replica is at the temperature of interest relative to the total temperature/configuration space. It is reasonable to assume that a similar effect is occurring here as well. We will address this in a future communication.

The results in Figure 9 were obtained with a relatively uniform distribution of temperatures. It is of interest to consider the effect on efficiency of changing that temperature distribution. In our previous work,^{20} we concluded that in the context of the discrete network model in the “large *α*” limit, the optimal temperature distribution is one replica at the temperature of interest, and the rest at the temperature which maximizes the harmonic mean of the folding and unfolding rates. That model, however, was limited in its realism in that it did not have explicit energy distribution functions. Furthermore, it is clear from the results presented in the previous section that very large effective values of *α* may not be achievable in real systems. The continuous two-dimensional potential studied here provides a better test system for studying these questions.

In Figure 10 we show the relative number of temperature-conditional transition events in 2×10^{9} MC steps for three different temperature distributions of 11 replicas: (A) uniformly distributed in *T*^{−1} from 296 to 789 K, (B) 6 replicas uniformly distributed in *T*^{−1} from 296 to 474 K (the optimal temperature) and the remaining 5 “bunched up” at the optimal temperature, and (C) 5 replicas bunched up at the optimal temperature with the remaining distributed in the 296 to 474 K range but strongly skewed toward the optimal temperature. Temperature distribution B provides more than a 50% increase in efficiency relative to the uniform distribution over the large temperature range. This is consistent with our discrete model results, and indicates that it is possible to include temperatures that are “too high” when the system exhibits anti-Arrhenius kinetics. However, we can increase the efficiency even further (to more than a factor of 2.5 over the baseline result) by skewing the temperature distribution to increase the number of replicas in the vicinity of the transition temperature (distribution C). Previous work by Hansmann et al. has suggested that such concentration of the temperatures near a bottleneck can improve temperature mixing.^{26} However, the improved efficiency may simply be due to the increased number of replicas near the optimal temperature. The clarification of the relative contributions from these two effects will also be addressed in a future communication.

One of the challenges of studying the computational efficiency of replica exchange has been the difficulty in running molecular simulations sufficiently long to obtain full convergence and meaningful statistics. This is particularly daunting if such simulations must be run multiple times to assess the effect of differences in simulation protocols and parameters. The use of simplified model systems allows for thorough theoretical, conceptual, and computational analysis of the problem that can provide insights into the factors that limit the efficiency of replica exchange in more realistic molecular systems.

Our previous work made use of a highly simplified discrete model for protein folding with two conformational states at several temperatures.^{20} While this system did provide useful insights, it was limited in a number of ways, and in particular was fully Markovian. Here we have described a two-dimensional continuous potential function and an associated move set that allows us to perform MC and replica exchange MC simulations in a system that is small enough to quickly converge but yet is rich in complexity that is reminiscent of molecular systems. While many of the results are consistent with those observed previously, novel effects are also seen. In particular, we have confirmed that the efficiency of replica exchange in more complex systems is fundamentally limited by the timescale of conformational diffusion within basins, as we had anticipated.^{20} We expect that such behavior will also be present (perhaps even more strongly) in molecular systems.

There are many unresolved questions raised by this work. One question for which our two-dimensional system would be a good model is for studying the relationship between conformational and thermal diffusion. Optimization of the diffusion of replicas in temperature space has been a major focus of recent theoretical and computational study of the replica exchange method.^{3}^{, }^{21}^{–}^{27} However, the convergence of thermodynamic quantities is not limited by thermal diffusion *per se*, but by the exploration of the conformational space of the system. While very poor thermal diffusion obviously defeats the purpose of replica exchange by effectively reducing it to a set of parallel uncoupled simulations, it is not clear that further optimization of thermal diffusion that is already “reasonably good” will automatically improve convergence. Some recent work has begun to address the role of basin-to-basin transitions.^{18}^{, }^{28} Similarly, some work on the optimization of thermal diffusion has emphasized the role of temperature bottlenecks,^{26} which may turn out to be fundamentally conformational in nature. The exact relationship between thermal and conformational diffusion remains to be fully clarified, and we look forward to studying this and other questions using simplified continuous and discrete models of replica exchange.

We thank Attila Szabo for discussions concerning this work. RML wishes to express his pleasure at being fortunate enough to have known Attila for thirty years and his admiration for Attila’s intellectual honesty and style of doing science. We are very glad to participate in this special issue of the Journal of Physical Chemistry which honors Attila Szabo on his sixtieth birthday. This work has been supported by a grant from the National Institutes of Health (GM30580).

The alternative potential with decreased average potential energy differences between folded, un-folded and transition states is of the same general form as the primary potential described in the Methods section and Figure 1, but with the boundary function parameters *δ* = 10^{−5}, *b* = 1, and *n*_{1} = 3.5 and potential energy

$$U(x,y)=\{\begin{array}{lll}{a}_{1}{(x+{x}_{0})}^{2}+{b}_{1}{y}^{2},\hfill & -1\le x<-{x}_{1}\hfill & 0\le y\le B(x)\hfill \\ -{a}_{2}{x}^{2}+{b}_{1}{y}^{2}+{c}_{0},\hfill & -{x}_{1}\le x\le 0,\hfill & 0\le y\le B(x)\hfill \\ {a}_{3}{x}^{2}+{b}_{1}{y}^{2}+{c}_{0},\hfill & 0<x\le 1,\hfill & 0\le y\le B(x)\hfill \\ \infty \hfill & \text{otherwise}\hfill & \phantom{\rule{0.16667em}{0ex}}\hfill \end{array},$$

with *a*_{1} = 25 kcal/mol, *a*_{2} = 250 kcal/mol, *a*_{3} = 10 kcal/mol, *b*_{1} = 1000 kcal/mol, *c*_{0} = 6 kcal/mol. The constants *x*_{0} and *x*_{1} were the same as for the primary potential.

1. Swendsen RH, Wang JS. Phys Rev Lett. 1986;57:2607–2609. [PubMed]

2. Geyer CJ, Thompson EA. J Am Stat Assoc. 1995;90:909–920.

3. Hukushima K, Nemoto K. J Phys Soc Jpn. 1996;65:1604–1608.

4. Hansmann UHE. Chem Phys Lett. 1997;281:140–150.

5. Sugita Y, Okamoto Y. Chem Phys Lett. 1999;314:141–151.

6. Rhee YM, Pande VS. Biophys J. 2003;84:775–786. [PubMed]

7. Nymeyer H, Gnanakaran S, García AE. Meth Enzymol. 2004;383:119–149. [PubMed]

8. Cecchini M, Rao F, Seeber M, Caflisch A. J Chem Phys. 2004;121:10748. [PubMed]

9. Tsai HHG, Reches M, Tsai CJ, Gunasekaran K, Gazit E, Nussinov R. Proc Natl Acad Sci USA. 2005;102:8174–8179. [PubMed]

10. Baumketner A, Shea JE. Biophys J. 2005;89:1493–1503. [PubMed]

11. Verkhivker GM, Rejto PA, Bouzida D, Arthurs S, Colson AB, Freer ST, Gehlhaar DK, Larson V, Luty BA, Marrone T, Rose PW. Chem Phys Lett. 2001;337:181–189.

12. Ravindranathan KP, Gallicchio E, Friesner RA, McDermott AE, Levy RM. J Am Chem Soc. 2006;128:5786–5791. [PMC free article] [PubMed]

13. Sugita Y, Kitao A, Okamoto Y. J Chem Phys. 2000;113:6042–6051.

14. Fukunishi H, Watanabe O, Takada S. J Chem Phys. 2002;116:9058–9067.

15. Kwak W, Hansmann UHE. Phys Rev Lett. 2005;95:138102. [PMC free article] [PubMed]

16. Liu P, Huang X, Zhou R, Berne BJ. J Phys Chem B. 2006;110:19018–19022. [PMC free article] [PubMed]

17. Min D, Li H, Li G, Bitetti-Putzer R, Yang W. 2006 arXiv:physics/0605005.

18. Zuckerman DM, Lyman E. J Chem Theory Comput. 2006;2:1200–1202.

19. Beck DAC, White GWN, Daggett V. J Struct Biol. 2007;157:514–523. [PMC free article] [PubMed]

20. Zheng W, Andrec M, Gallicchio E, Levy RM. Proc Natl Acad Sci USA. 2007;104:15340–15345. [PubMed]

21. Kofke DA. J Chem Phys. 2002;117:6911–6914.

22. Kone A, Kofke DA. J Chem Phys. 2005;122:206101. [PubMed]

23. Predescu C, Predescu M, Ciobanu CV. J Chem Phys. 2004;120:4119–4128. [PubMed]

24. Predescu C, Predescu M, Ciobanu CV. J Phys Chem B. 2005;109:4189–4196. [PubMed]

25. Rathore N, Chopra M, de Pablo JJ. J Chem Phys. 2005;122:024111. [PubMed]

26. Trebst S, Troyer M, Hansmann UHE. J Chem Phys. 2006;124:174903. [PubMed]

27. Nadler W, Hansmann UHE. 2007 arXiv:0709.3289v1.

28. Zuckerman DM. J Chem Theory Comput. 2006;2:1693. [PMC free article] [PubMed]

29. Swope WC, Pitera JW, Suits F. J Phys Chem B. 2004;108:6571–6581.

30. Swope WC, Pitera JW, Suits F, Pitman M, Eleftheriou M, Fitch BG, Germain RS, Rayshubski A, Ward TJC, Zhestkov Y, Zhou R. J Phys Chem B. 2004;108:6582–6594.

31. Andrec M, Felts AK, Gallicchio E, Levy RM. Proc Natl Acad Sci USA. 2005;102:6801–6806. [PubMed]

32. van der Spoel D, Seibert MM. Phys Rev Lett. 2006;96:238102. [PubMed]

33. Oliveberg M, Tan YJ, Fersht AR. Proc Natl Acad Sci USA. 1995;92:8926–8929. [PubMed]

34. Karplus M. J Phys Chem B. 2000;104:11–27.

35. Ferrara P, Apostolakis J, Caflisch A. J Phys Chem B. 2000;104:5000–5010.

36. Yang WY, Gruebele M. Biochemistry. 2004;43:13018–13025. [PubMed]

37. Zhang W, Wu C, Duan Y. J Chem Phys. 2005;123:154105. [PubMed]

38. Periole X, Mark AE. J Chem Phys. 2007;126:014903. [PubMed]

39. Gallicchio E, Andrec M, Felts AK, Levy RM. J Phys Chem B. 2005;109:6722–6731. [PubMed]

40. Bicout D, Szabo A. Protein Science. 2000;9:452–465. [PubMed]

41. Zhou HX, Zwanzig R. J Chem Phys. 1991;94:6147–6151.

42. Chan HS, Dill KA. Proteins. 1998;30:2–33. [PubMed]

43. Kolinski A, Skolnick J. Polymer. 2004;45:511–524.

44. Tiana G, Sutto L, Broglia RA. Physica A. 2007;380:241–249.

45. Allen MP, Tildesley DJ. Computer Simulation of Liquids. Oxford University Press; Oxford: 1987.

46. Chandler D. J Chem Phys. 1978;68:2959–2970.

47. Levy RM, Karplus M, McCammon JA. Chem Phys Lett. 1979;65:4–11.

48. Rao F, Caflisch A. J Chem Phys. 2003;119:4035–4042.

49. Seibert MM, Patriksson A, Hess B, van der Spoel D. J Mol Biol. 2005;354:173–183. [PubMed]

50. Gillespie DT. Markov Processes: An Introduction for Physical Scientists. Academic Press; Boston: 1992.

51. Borreguero J, Dokholyan N, Buldyrev S, Shakhnovich E, Stanley H. J Mol Biol. 2002;318:863–876. [PubMed]

52. Bryngelson JD, Wolynes PG. J Phys Chem. 1989;93:6902–6915.

53. Park S, Pande VS. J Chem Phys. 2006;124:054118. [PubMed]

54. Noé F, Horenko I, Schütte C, Smith JC. J Chem Phys. 2007;126:155102. [PubMed]

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |