Home | About | Journals | Submit | Contact Us | Français |

**|**HHS Author Manuscripts**|**PMC2783641

Formats

Article sections

- Abstract
- I. INTRODUCTION
- II. METHODS
- III. RESULTS
- IV. DISCUSSION
- V. SUMMARY AND CONCLUSIONS
- Supplementary Material
- References

Authors

Related links

J Comput Chem. Author manuscript; available in PMC 2010 August 1.

Published in final edited form as:

PMCID: PMC2783641

NIHMSID: NIHMS136550

Xin Zhang, Department of Physics & Astronomy, University of Pittsburgh, Pittsburgh, Pennsylvania 15260;

The publisher's final edited version of this article is available at J Comput Chem

See other articles in PMC that cite the published article.

The absolute free energy — or partition function, equivalently — of a molecule can be estimated computationally using a suitable reference system. Here, we demonstrate a practical method for staging such calculations by growing a molecule based on a series of fragments. Significant computer time is saved by pre-calculating fragment configurations and interactions for re-use in a variety of molecules. We employ such fragment libraries and interaction tables for amino acids and capping groups to estimate free energies for small peptides. Equilibrium ensembles for the molecules are generated at no additional computational cost, and are used to check our results by comparison to standard dynamics simulation. We explain how our work can be extended to estimate relative binding affinities.

The use of a reference system for free energy calculations has a long history in physics and chemistry.^{1} The basic idea is to employ a reference system (“ref”) for which the absolute free energy is available, and which is as similar as possible to the physical system of interest (“phys”):

$${F}^{\text{phys}}={F}^{\text{ref}}+\mathrm{\Delta}{F}_{\text{ref}\to \text{phys}},$$

(1)

where *F*^{x} is the absolute free energy of model “x,” and Δ*F*_{ref→phys} is the free energy difference between the systems. In essence, this paper is about practical choices for the both the reference system and strategies for calculating Δ*F*_{ref→phys} when the physical system is a large molecule.

Historically, Stoessel and Nowak applied the reference-system strategy to a molecular system for the first time, using a solid harmonic reference system in Cartesian coordinates.^{2} Zuckerman and Ytreberg extended that work in two ways designed to improve overlap between the reference and physical systems:^{3} (i) by using internal coordinates; and (ii) by using a more flexible, numerically exact reference system based on histograms from a short dynamics simulation, rather than an artificial analytically tractable reference state. Huang and Makarov also employed the reference-system approach embodied in (1), but in a different way.^{4}

Other efforts to calculate absolute free energies for molecular systems have been ongoing for years in the groups of Meirovitch^{5}^{–}^{8} and Gilson^{9}^{–}^{12} and more approximately using harmonic and quasi-harmonic methods.^{13} The work of Meirovitch builds on long-standing polymer-growth methodologies for estimating partition functions which date to the work of Rosenbluth and Rosenbluth.^{14} The original Rosenbluth work was generalized for higher efficiency and more realistic models by many workers.^{15}^{–}^{27} Ideas from these polymer-growth sampling methods also inform the present work, in the sense that each of our fragments is a kind of “monomer”.

The present paper significantly extends the previous work by Ytreberg and Zuckerman^{3} by estimating absolute free energies for molecules built up gradually from molecular fragments. Larger molecules can be treated, compared to our previous paper^{3}, because a series of staged intermediate systems are adopted. In essence the free energy difference of Eq. (1) is sub-divided (staged) into a sum of easy-to-calculate terms. Staging increments are highly tunable, based on the choice of fragment sizes and even by selection of subsets of interactions as detailed below. The use of fragments in other types of molecular mechanics calculations has a long history.^{28}^{–}^{30}

A key contribution of this work is a practical strategy of pre-calculation which minimizes the number of energy terms which need to be computed at each stage. Specifically, for each fragment, a statistical library — i.e., an ensemble of configurations and their energies — is stored; we have also used such libraries in Monte Carlo sampling^{31}. Additionally, for each covalently bonded fragment pair, we store the full interaction energy (based on all atoms) for every possible pair of configurations. Such storage is quite practical on typical modern computers with > 1 GB of RAM. During production simulations it is only necessary to compute interactions between fragments separated by one or more other fragments. Needless to say, the stored libraries and interaction tables can be re-used in future simulations of the same or different molecules. The pre-calculation strategy, which has early conceptual roots^{32}^{–}^{34}, appears to represent a significant practical advance over earlier polymer-growth calculations. The use of non-statistical libraries has been popularized in the Rosetta protein folding program.^{35}

There is substantial flexibility in the division of a molecule into fragments. We have used single amino acids as fragments in this study, but larger segments and even different interaction subsets as detailed below – may also be practical. The fragment-based approach could also be used to study protein-ligand binding, by growing small molecules into receptor binding pockets and estimating the free energies. This can be seen as a statistical mechanical generalization of fragment-based ideas developed earlier.^{29}^{,}^{30}

Our results, which employ single amino-acid fragments, are extremely encouraging. The data indicate that absolute free energies for small peptides can be calculated rapidly and reliably. Specifically, high-precision free energy estimates, with fluctuations of ~ 0.3 kcal/mole, are obtained for 52-atom tetra-alanine in less than an hour of single-processor computing time, with a simple dielectric "solvent". We check our data by comparing the equilibrium ensembles (obtained simultaneously with the free energy estimates) with independent Langevin simulations. As a further check, in one case, the free energy results are verified by an independent calculation using different fragments.

The remaining sections of the paper describe the methods, results, and our conclusions. Our methods section provides full details for performing the calculations, including the generation of fragment libraries and interaction tables. We also correct a technical error in our earlier study.^{3} Our results describe both the free energy values and the analysis of the equilibrium ensembles. Our discussion section describes possible improvements to the method and extension to the estimation of relative binding affinities using absolute free energies.

Our basic approach is to calculate the free energy of the physical system of interest based on the difference from a known reference system, as in Eq. (1), and also to stage the calculation using molecular fragments. The fragments not only permit the gradual staging of the calculation but also a tremendous savings of computer time based on the storage of (i) fragment configurations, (ii) energies internal to each fragment configuration, and (iii) interaction energies between covalently bonded fragments. The low cost and high precision of the resulting estimates suggests we are far from the practical limit of the approach in the present implementation. However, a number of improvements to the implementation appear to be within easy reach, as described in our Discussion. All fragment libraries used in the present calculations are available at our website (www.ccbb.pitt.edu/Zuckerman).

All calculations employ a standard atomistic forcefield, OPLS-AA^{36} at *T* = 298*K*. In the present report, our fragments are individual amino acids and capping groups. For simplicity in this initial investigation, we model the solvent by a simple uniform dielectric constant ε = 60. We compute free energy estimates for alanine dipeptide (Ace-Ala-Nme), di-alanine (Ace-(Ala)_{2}-Nme), and tetra-alanine (Ace-(Ala)_{4}-Nme). Following standard conventions, Ace is Acetyl (CH_{3}-CO), Ala is Alanine (NH-CH(CH_{3})-CO), and Nme is N-methylamide (NH-CH_{3}).

Consider the calculation of the configurational free energy of alanine dipeptide based on a division into three fragments (Ace, Ala, Nme) which can be denoted (A, B, C) respectively (see Fig. 1). We will discuss this example qualitatively, to give the reader a sense of what to expect. Full details and all equations will be presented in succeeding subsections.

In advance, we calculate statistical libraries of configurations for each fragment, which are constant-temperature OPLS-AA ensembles based only on the atoms within the given fragment. The libraries additionally include the six degrees of freedom necessary for joining the fragments, based on the use of “dummy atoms” as described below. During the library generation process, the absolute free energy for each fragment is also calculated using a reference system based on independent internal coordinates, as described previously.^{3} A typical library will contain 10,000 configurations. We also pre-calculate every possible interaction energy between covalently bound fragments — i.e., a table of 10^{8} interaction energies for the A-B and B-C fragment pairs.

The calculation proceeds as schematized in Fig. 1, where the presence of a line connecting two fragments indicates that all interactions between the fragments is included. The reference system (not shown) consists of fully independent coordinates, so that the fragments are not yet constructed. The first intermediate consists of the three non-interacting fragments, which include, however, all interactions *within* each fragment. Thus the fragment free energies, which are calculated and stored in advance, properly include the interactions among all degrees of freedom *internal* to each fragment. Other interactions are added in three stages: A-B interactions first, followed by B-C, and completed by A-C couplings.

In the first intermediate stage, the absolute free energies for the individual fragments are retrieved from disk. (They are initially calculated following reference^{3} as detailed below.) Next, A-B interactions are added by a standard free energy difference calculation. Specifically, an ensemble of non-interacting A-B configurations is generated by random combination of fragments from the A and B libraries, and the resulting energy change is exponentially averaged in the usual way — via Eq. (6) below. The energy changes due to the combination do not need to be calculated in our scheme, however, because they have been tabulated in advance. Additionally, the now interacting A-B fragments are “resampled”^{37} to correspond to the full potential for all degrees of freedom in *both* fragments. The details of resampling are given below — see Eq. (8) — but the bottom line is that one obtains 10,000-configuration ensemble of the partially grown molecule consisting of the A-B fragments.

The calculations then proceeds as if there are two fragments, A-B and C. The two libraries are joined combinatorially *but only accounting for the B-C interactions at this stage*. The A-C interactions will be handled at a later stage. Once again, the free energy change is calculated and the ensemble is resampled to reflect B-C interactions. The resulting ensemble contains 10,000 configurations of the full molecule reflecting all interactions except those between fragments A and C.

In the final stage sketched in Fig. 1, the A-C interactions are added in a standard free energy difference calculation based on the the ensemble of the previous stage. However, a standard energy call (for the whole molecule) is not required to save CPU time. Rather, our code only computes energy terms specific to the A and C fragments — i.e., electrostatic and van der Waals interactions between atoms of A and those of C. Once the energy changes have been obtained, the full free energy is rapidly estimated. Resampling into the fully interacting ensemble can also be performed rapidly without additional energy calculations.

It is not difficult to imagine generalizing this example to systems with more fragments. It is also worth noting that, strictly speaking, the final stage was not necessary. That is, we could have added the A-C interactions simultaneously with the B-C combination since the full molecular configurations were constructed at that point. These choices illustrate the flexibility intrinsic to staging the calculation with fragments, as we detail further in our Discussion. Additional staging flexibility results, of course, from the initial choice of the fragments — i.e., smaller fragments lead to staging in finer increments.

Our calculation of the absolute free energy *F*^{phys} for a molecule divided into fragments is based on standard, straightforward equations. The only novel aspect of the formalism is our particular choice of stages based on the addition of fragments and/or inter-fragment interactions. Although our heavy reliance on pre-calculated information has very significant practical implications, it does not affect the formalism.

The fundamental object of interest is the absolute classical free energy *F*^{phys} for an implicitly solvated molecule. The molecule is taken to consist of *N* atoms, and its *internal*-coordinate configurations are denoted by x = (*x*_{1},*x*_{2},…,*x _{3N−6}*). The potential energy function will be a standard forcefield (here, OPLS-AA

$${F}^{\text{phys}}[{U}^{\text{phys}}]=-{k}_{B}T\phantom{\rule{thinmathspace}{0ex}}\text{ln}\left\{\frac{1}{{(1\phantom{\rule{thinmathspace}{0ex}}\AA )}^{3N-3}}{\displaystyle \int d\mathbf{x}}\phantom{\rule{thinmathspace}{0ex}}J(\mathbf{x})\phantom{\rule{thinmathspace}{0ex}}{e}^{-\beta {U}^{\text{phys}}}(\mathbf{x})\right\},$$

(2)

where the Jacobian is given by

$$J(\mathbf{x})={\displaystyle \prod _{i=1}^{3N-6}{J}_{i}({x}_{i})},$$

(3)

with ${J}_{i}={x}_{i}^{2}$ for bond lengths, *J _{i}* = sin(

Our single-molecule formulation, as noted, allows for “implicit solvation” using an effective solvent term in *U*^{phys} that is solely a function of the internal molecular coordinates **x**. The present calculations employ a simple uniform dielectric constant (ϵ = 60). In our Discussion, we address the minor technical issues involved with using a more realistic implicit solvent model.

One issue of dimensionality is worth emphasizing. Although there are 3*N* − 6 internal coordinates for a molecule consisting of *N* atoms, the integral of Eq. (2) has dimensionality of length to the power 3*N* − 3. This is because *N* − 1 bond lengths remain in the full set of internal coordinates **x**, each of which contributes three powers of length. Put another way, of the six excluded rigid-body/center-of-mass coordinates, the three orientation angles are dimensionless; more specifically, the angles integrate to the factor of 8π^{2} included in the definition of *C*°

As illustrated in the example of Sec. IIB, we will calculate the free energy in a series of stages. These can be understood most easily by adding and subtracting the free energies corresponding to *k* intermediate models,

$${F}^{\text{phys}}=({F}^{\text{phys}}-{F}_{k})+({F}_{k}-{F}_{k-1})+\cdots +({F}_{1}-{F}^{\text{ref}})+{F}^{\text{ref}}$$

(4)

$$=\mathrm{\Delta}{F}_{k\to \text{phys}}+\mathrm{\Delta}{F}_{k}+\cdots +\mathrm{\Delta}{F}_{1}+{F}^{\text{ref}}$$

(5)

where Δ*F _{j}* =

All free energy difference calculations will be performed here using the “perturbation” formulation.^{38} Explicitly, for two arbitrary potential energy functions *U _{a}* and

$$\mathrm{\Delta}{F}_{a\to b}={F}_{b}[{U}_{b}]-{F}_{a}[{U}_{a}]=-{k}_{B}T\phantom{\rule{thinmathspace}{0ex}}\text{ln}\phantom{\rule{thinmathspace}{0ex}}{\langle \text{exp}\phantom{\rule{thinmathspace}{0ex}}\left[-\beta ({U}_{b}-{U}_{a})\right]\rangle}_{a}$$

(6)

$$\simeq -{k}_{B}T\phantom{\rule{thinmathspace}{0ex}}\text{ln}\left\{\frac{1}{{N}_{a}}{\displaystyle \sum _{i=1}^{{N}_{a}}\text{exp}\left[-\beta \phantom{\rule{thinmathspace}{0ex}}({U}_{b}({\mathbf{x}}_{i})-{U}_{a}({\mathbf{x}}_{i}))\right]}\right\}$$

(7)

where the subscript *a* denotes an average performed over configurations distributed according to the *U _{a}* ensemble and

We emphasize that succeeding intermediates are constructed to have progressively narrower distributions as more interactions are added, as in the alanine dipeptide example. In other words, we ensure good overlap and reliable Δ*F* estimates by proceeding in the generalized “insertion” direction.^{39}^{,}^{40}

As our calculation proceeds through the various stages, we will require the correspoinding equilibrium ensembles for each stage, primarily for use in Eq. (7). These are obtained by “resampling,” the process of converting an ensemble for one distribution into another by eliminating, duplicating, and/or adjusting the weights of configurations in the original distribution.^{37} In our case, we primarily use elimination of configurations from a larger ensemble (e.g., all combinations of fragments A and B) to create a smaller one (e.g., the interacting A-B ensemble); we do not adjust weights. More specificially, to resample an ensemble of configurations **x*** _{a}* generated according to

$${e}^{-\beta \left[{U}_{b}({\mathbf{x}}_{a})-{U}_{a}({\mathbf{x}}_{a})\right].}$$

(8)

Operationally, we select configurations by forming a cumulative distribution function (cdf) based on the normalized set of ratios (8), and then choosing from this cdf as many times as desired.

As already noted, the set of intermediate models {*U _{j}*} can be chosen in a variety of ways. In the present study, we employ

The present study uses a uniform staging strategy for all molecules examined, as exemplified in Fig. 1 and Fig. 2. The reference system consists of *all* coordinates fully independent — both within and between fragments — as in our previous work.^{3} The first intermediate stage adds interactions *within fragments*, so that one has true molecular fragments but no interactions between fragments. We then add interactions between neighboring, covalently bound fragments — i.e., among all the atoms in the neighboring fragment pair — one fragment pair at a time. The final stage of this simple scheme involves the addition of all remaining interactions, which occur solely between non-adjacent fragments. The result is a molecule with atoms interacting fully according to a standard forcefield and possibly continuum solvent model.

Because interactions among previously non-interacting components are added at every stage, it is expected that the configuration space will become increasingly narrow. Such a progressive narrowing justifies the use of the perturbation expression (7).

To explicitly illustrate the staging scheme employed here, consider the case of a molecule divided into the three fragments A, B, and C, as in Fig. 1. We denote by ${u}_{i}^{\text{ref}}$ the reference potential for internal coordinate *x _{i}*, where the full set is

$$\begin{array}{l}\hfill {U}^{\text{ref}}={\displaystyle \sum _{i=1}^{3N-6}{u}_{i}^{\text{ref}}({x}_{i})}\hfill \\ \hfill {U}_{1}={U}_{A}+{U}_{B}+{U}_{C}\hfill \\ \hfill {U}_{2}={U}_{1}+{U}_{AB}\hfill \\ \hfill {U}_{3}={U}_{2}+{U}_{BC}\hfill \\ \hfill {U}^{\text{phys}}={U}_{3}+{U}_{AC}\hfill \end{array}$$

(9)

The choice of the reference potentials {${u}_{i}^{\text{ref}}$} is guided by the forcefield, as detailed below in Sec. IIG. Building on our earlier work,^{3} we will use independent internal coordinates with simulation-guided distributions.

A four-fragment molecule, such as di-alanine (Ace-(Ala)_{2}-Nme) schematized in Fig. 2, would be staged according to:

$$\begin{array}{l}\hfill {U}^{\text{ref}}={\displaystyle \sum _{i=1}^{3N-6}{u}^{\text{ref}}({x}_{i})}\hfill \\ \hfill {U}_{1}={U}_{A}+{U}_{B}+{U}_{C}+{U}_{D}\hfill \\ \hfill {U}_{2}={U}_{1}+{U}_{AB}\hfill \\ \hfill {U}_{3}={U}_{2}+{U}_{BC}\hfill \\ \hfill {U}_{4}={U}_{3}+{U}_{CD}\hfill \\ \hfill {U}^{\text{phys}}={U}_{4}+{U}_{AC}+{U}_{BD}+{U}_{AD}\hfill \end{array}$$

(10)

As described in the Discussion, it is also possible to stage the final (“non-bonded”) pairwise interactions separately.

We anticipate that significant optimization can be obtained by adjusting fragmentation and staging schemes. While our Discussion, below, describes more gradual staging strategies, the present initial report is limited to the single staging strategy given above.

The computation of the (absolute) reference free energy *F*^{ref} is perhaps the most technically involved step of the calculation. The remaining free energy differences in the decomposition of *F*^{phys} in Eq. (5) are estimated using a simple, standard method. For the reference free energy, however, great care must be taken with the normalization and Jacobian factors of the chosen probability distributions. Indeed, our previous report^{3} includes an error in this regard, as explained at the end of this subsection.

As described in our discussion of staging (Sec. IID), the reference system for all molecules studied here consists of the set of non-interacting internal coordinates. The reference potential energy function will be constructed, following our previous work^{3}, so that the reference partition function is normalized to one. That is, we will *construct* our reference model *U*^{ref} so that

$${Z}^{\text{ref}}[{U}^{\text{ref}}]={e}^{-\beta {F}^{\text{ref}}}=\frac{1}{{(1\phantom{\rule{thinmathspace}{0ex}}\AA )}^{3N-3}}{\displaystyle \mathit{\int}d\mathbf{x}}\phantom{\rule{thinmathspace}{0ex}}J(\mathbf{x}){e}^{-\beta {U}^{\text{ref}(\mathbf{x})}}\equiv 1,$$

(11)

where the same standard concentration as in Eq. (2) has been used implicitly. (From this point forward, we will omit writing the length units, but they should be implicitly associated with every bond-length integration.) The motivation for the unit normalization of *Z*^{ref} is that application of a logarithm leads to the simplifying value,

$${F}^{\text{ref}}\equiv 0,$$

(12)

for every system.

While there are many ways to construct *U*^{ref} to satisfy the required normalization of Eq. (11), we primarily use the strategy of employing independent internal coordinates as in our earlier work.^{3} As usual, the full set of 3*N* − 6 internal coordinates, **x** = (*x*_{1}, *x*_{2},…,*x*_{3N−6}) consists of *N* − 1 bond lengths, *N* − 2 bond angles, and *N* − 3 dihedrals. So long as the distribution of each individual coordinate is normalized when integrated with the appropriate Jacobian factor *J*, the full distribution will be normalized.

Because total reference energy is given by a simple sum of independent terms,

$${U}^{\text{ref}}(\mathbf{x})={\displaystyle \sum _{i=1}^{3N-6}{u}_{i}^{\text{ref}}({x}_{i})}$$

(13)

the desired normalization (11) is ensured by enforcing

$$\mathit{\int}d{x}_{i}\phantom{\rule{thinmathspace}{0ex}}{J}_{i}({x}_{i})}{e}^{-\beta {u}_{i}^{\text{ref}}({x}_{i})}=1\frac{}{},$$

(14)

where the inclusion of inverse length units is understood for bond-angle integrals. In words, then, each individual potential ${u}_{i}^{\text{ref}}$ must include suitable normalization as described in Sec. IIG.

(As detailed in Sec. IIG, peptide ϕ and ψ angles were, in fact, sampled together from a single distribution based on a pairwise energy function ${u}_{\varphi \psi}^{\text{ref}}$. These angles are independent from all other coordinates, however. We emphasize that this exception does not alter the basic formalism, which has been simplified very slightly for clarity.)

It is very useful to observe that normalization of the coordinate distributions via Eq. (14) can be achieved either using standard analytic forms — e.g., Gaussians — or via numerical histogramming procedures.^{3} Thus, there is great flexibility in the choice reference distributions embodied in the reference potentials. In addition to forcefield terms, prior knowledge, such as from a simulation, can be used in constructing the set {${u}_{i}^{\text{ref}}$}. The reference potentials chosen for the present study are described below in Sec. IIG on library construction.

One word of warning is appropriate here. Although it is possible to describe the internal configuration of a molecule using additional bond angles to substitute for dihedrals in some cases, the Jacobian for such a description appears *not* to be well-defined. Therefore, it is necessary to use the standard description with *N* − 2 bond angles and *N* − 3 dihedrals. Unfortunately, we were unaware of this point during our original study,^{3} and therefore an erratum will be prepared correcting the resulting numerical errors.

The first intermediate stage adds only localized interactions to the non-interacting reference model, as illustrated in Fig. 1 and Fig. 2. Specifically, once a molecule is divided into fragments (A, B, C, …), the first intermediate includes only interactions occuring *within fragments*. The fragments exactly divide all coordinates so that we can write **x** = (**x*** _{A}*,

$${U}_{1}(\mathbf{x})={U}_{A}({\mathbf{x}}_{A})+{U}_{B}({\mathbf{x}}_{B})+\cdots ,$$

(15)

where *U _{y}* includes all interactions from the

The free energy for this stage — i.e., *F*_{1}[*U*_{1}] for use in the key equation (5), recalling *F*^{ref} = 0 — can be calculated by using the standard perturbation relation (7). For such a computation, one would use *U _{a}* =

In practice, *once the libraries are generated, no calculation of energies needs to be done*. As detailed below the libraries are generated (just once, for repeated use in many systems) based on the *U*^{ref} distribution. Thus, during the library generation process, it is a trivial matter to calculate the absolute free energy for each fragment using Eq. (7). Thus, individual fragment free energies *F _{y}* are calculated in advance that exactly sum to the first-stage free energy:

$${F}_{1}={F}_{A}+{F}_{B}+\cdots .$$

(16)

Further, the independent-coordinate distributions are subsequently resampled based on Eq. (8) to generate the library distributions — i.e., ensembles for the *U _{y}* Boltzmann factors — for use in subsequent stages.

As just described, fragment libraries are critical to the calculation of the free energy of the first intermediate stage, *F*_{1}. The libraries also greatly facilitate computations for the rest of the intermediates.

In general terms, fragment configurations are generated by sampling internal coordinates according to the independent probability distributions which constitute the reference system. The generated configurations are then used to calculate fragment free energies, *F _{y}* for

We found that for fragments the size of alanine residues, rather simple probability functions were sufficient for generating tens of thousands of (statistically independent) configurations in weeks of single-CPU time. This is a negligible cost because once a library is generated, it can be used in multiple simulations.

Different coordinate types are best sampled with different distributions, as is suggested by the forcefield terms. In our initial tests we employed the same functions and parameters as in the OPLS-AA forcefield — i.e. bonds and angles were described by harmonic functions and dihedrals by a linear combination of several cosine functions. However, this choice of energy functions and parameters did not work well because of strong correlations between coordinates. Better results were obtained when we extracted parameters from a short Langevin simulation of alanine dipeptide, using harmonic functions for bonds, angles and “stiff” dihedrals such as in the relatively planar peptide groups. So for bonds, angles and “stiff” dihedrals the reference potential used in Eq. (14) is

$${u}_{i}^{\text{ref}}({x}_{i})={\widehat{u}}_{i}({x}_{i})+{k}_{B}T\phantom{\rule{thinmathspace}{0ex}}\text{ln}\phantom{\rule{thinmathspace}{0ex}}{\widehat{z}}_{i,}$$

(17)

where *û _{i}* is simply a harmonic function and

$${\widehat{u}}_{i}({x}_{i})=\frac{{\left({x}_{i}-{\overline{x}}_{i}\right)}^{2}}{2{\sigma}_{i}^{2}}{k}_{B}T,$$

(18)

where * _{i}* is the equilibrium position and ${\sigma}_{i}^{2}$ is the variance of a corresponding coordinate. The normalization constant is a single-coordinate partition function

$${\widehat{Z}}_{i}={\displaystyle \mathit{\int}d{x}_{i}\phantom{\rule{thinmathspace}{0ex}}{J}_{i}({x}_{i}){e}^{-\beta {\widehat{u}}_{i}({x}_{i})}}$$

(19)

where inverse length units are implied for all bond-angle integrals.

For “soft”, rotatable dihedrals — such as ϕ, ψ and χ angles in amino acids — we extracted histograms from a Langevin simulation of alanine dipeptide, as described in Ref.^{3} Thus, a two-dimensional (correlated) probability function was used for the (ϕ, ψ) dihedral pair. In particular, the joint histogram for the (ϕ, ψ) dihedrals consisted of 360 × 360=129,600 bins with a width of one degree. Histograms for “soft” dihedrals employed 6,000 bins with a width of 0.06 degree.

Coordinates were sampled according to their probability distributions by constructing a numerical cumulative distribution function (cdf) for each:

$$\text{cdf}({x}_{i})={\displaystyle {\int}_{-\infty}^{{x}_{i}}d{x}_{i}\phantom{\rule{thinmathspace}{0ex}}{J}_{i}({x}_{i}){e}^{-\beta {u}_{i}^{\text{ref}}({x}_{i})}.}$$

(20)

For the histogrammed variables (ϕ, ψ and χ), the cdf will exactly reproduce the distribution in the histogram. For variables with harmonic energy functions, we used a finely discretized numerical cdf to generate a close approximation to the desired distribution. Specifically, bond-length and bond-angle cdf’s employed 3,000 bins with a bin-width of 0.001 Å and 0.06 degree respectively. Histograms for dihedrals used 6,000 bins with a width of 0.06 degree. To sample from a given cdf, a uniform random number (0,1] is generated and equated to the cdf to find the corresponding *x _{i}* value. Within each bin, the

Based on the distributions just described, internal coordinates were sampled independently (except for pairwise sampling of ϕ and ψ dihedrals) using an in-house program written in C. Generated configurations were converted to Cartesian coordinates and the potential energy was evaluated using the “analyze” module of the Tinker software package.^{41} Based on these values and the known reference energies, the individual fragment free energies were calculated using Eq. (7). A simple resampling procedure^{37} was used to generate a fragment ensemble distributed according to the forcefield; see Eq. (8). Only a small fraction ( 10^{−4}) of reference-ensemble configurations remain after resampling, requiring extensive sampling of the reference ensemble and weeks of CPU cost, as mentioned earlier.

For this study, we generated libraries consisting of 10,000 configurations. All fragment libraries were sampled according to OPLS-AA forcefield at *T* = 298*K*, with a simple dielectric constant (ε = 60) modeling solvent. The choice of dielectric constant was motivated by the reasonable behavior observed in separate Langevin simulations of poly-alanine systems (data not shown).

As noted earlier, for all possible 10^{8} *covalent* (neighboring) pairings of fragments, we also tabulated the interaction energies from the forcefield, accounting for all atoms in the fragment pair. Suitable corrections for dummy atoms (see below) were made. In other words, for a simple two-fragment system, all interactions are stored.

Because fragments are sampled independently from each other, the six degrees of freedom that specify the relative orientation of neighboring fragments are included with the fragments. For this purpose “dummy” atoms are used to provide the extra coordinates. When fragments are joined, the dummy atoms are simply removed because they are replaced with the “real” fragment atoms. Although fragment combination is simplest when dummy atoms do not interact with real atoms, we chose to have dummy atoms interact with the fragment atoms for better overlap with subsequent ensembles. Interacting dummy atoms can mimic the neighboring fragment atoms and improve the efficiency of the method. When fragments are joined, however, the interaction energy of the dummy atoms must be substracted from the full fragment energy because they are replaced with the neighboring fragment atoms. When interacting dummy atoms are employed care must be taken to avoid intoducing extra degrees of freedom (e.g., certain bond lengths and angles) because they can affect the ensemble distributions.

The dummy atoms used at the N-terminus of a fragment are carbonyl C, carbonyl O and terminal alpha-C with valence set to one. The dummy atoms used at the C-terminus are amide N, amide H, and terminal alpha-C with valance also set to one. The dummy atoms were assigned the same forcefield parameters as used in the corresponding fragment atoms.

Returning again to the scheme embodied in Eq. (5), as well as in Fig. 1 and Fig. 2, the next intermediates add interactions between neighboring fragments. These can be considered the “bonded” interactions in the space of fragments, but non-bonded interactions among all atoms in the neighboring pair are included. Explicitly, the models for the remaining intermediates are described by

$${U}_{2}(\mathbf{x})={U}_{A}({\mathbf{x}}_{A})+{U}_{B}({\mathbf{x}}_{B})+\cdots +{U}_{AB}({\mathbf{x}}_{A},{\mathbf{x}}_{B})$$

(21)

$${U}_{3}(\mathbf{x})={U}_{2}(\mathbf{x})+{U}_{BC}({\mathbf{x}}_{B},{\mathbf{x}}_{C})+\cdots ,$$

(22)

where *U _{yz}* is the full interaction energy — based on the forcefield and solvent model — between fragments

Formally, it is clear what needs to be done. The ensemble of the previous stage *j* − 1 should be used to calculate Δ*F _{j}* using the perturbation relation (7) with

Again, however, possession of the libraries and interaction tables leads to dramatic practical implications. For instance, by construction, the energy *U*_{2} − *U*_{1} is simply the pre-stored energy *U _{AB}*; similarly

Caution is required when the molecule of interest contains repeated fragment pairs. While the same libraries can be used for the repeats, say at intermediate stages *j* and *m*, the *corresponding values of* Δ*F _{j} and* Δ

As described in the master scheme of Eq. (5) and illustrated in Fig. 1 and Fig. 2, the final calculation needed to obtain *F*^{phys} entails the inclusion of all remaining interactions in the forcefield and solvent model. These interactions, excluded until now, occur between atoms in non-neighboring pairs. As described in Sec. II D, for a molecule of *k* fragments, the full physical potential energy function (i.e., the forcefield) can be written as the difference from the final (*k*th) intermediate:

$${U}^{\text{phys}}(\mathbf{x})={U}_{k}(\mathbf{x})+{\displaystyle \sum _{y\mathrm{..}z}{U}_{yz}({\mathbf{x}}_{y},{\mathbf{x}}_{z})},$$

(23)

where the sum is over non-neighboring pairs of fragments — i.e., AC, AD, BD, ….

In this case, the necessary energy terms for use in the calculation of Δ*F*_{k→phys} via Eq. (7) must be calculated. They cannot readily be stored in advance, due to the combinatorial explosion of possible configurations. For instance, with libraries of 10^{4} configurations, there are 10^{12} possible configurations for three fragments, which is beyond the range of current commerical machines.

The physical ensemble, distributed according to the Boltzmann factor of the forcefield, can be generated by resampling the *U _{k}* ensemble — the last intermediate — using Eq. (8). In this case, the “a” ensemble corresponds to

Although the formalism governing the present study is mostly straightforward, our in-house computer program not only needs to reproduce standard forcefield results, but also requires complicated “dissections” of various subsets of forcefield terms. We therefore performed three types of checks on our code. (i) We checked that the forcefield energy for full molecular configurations exactly reproduces the results reported in Tinker (data not shown). This verifies that we have correctly accounted for our dummy-atom energy terms, (ii) Using our previously developed “structural histograms” for analyzing configuration-space distributions,^{42}^{,}^{43} we checked that the equilibrium ensembles produced during our free energy calculations agree with independent Langevin simulations. This data is shown in the Results section, and generated as explained below, (iii) Finally, we performed a check to ensure that our final free energy values are independent of the choice of fragments. This data, for two- and three-fragment decompositions of alanine dipeptides is also shown in the Results section.

Statistical uncertainties were calculated by running 20 independent computations for every free energy value reported. Twice the standard deviation among these 20 values is reported, which quantifies the scale of expected statistical error for a *single* simulation. The repeated simulations were run using 20 independent sets of libraries for the various fragments — i.e., the calculation was started all the way at the beginning in each repeat. However, because the overlap between various stages is the limiting factor in the quality of the free energy results, rather than our fairly large libraries, we anticipate similar error estimates would be obtained for one set of libraries.

In two previous studies,^{42}^{,}^{43} we have developed methods for comparing equilibrium distributions for molecular systems of arbitrary complexity. The central idea is to employ a “structural histogram” which simply classifies (divides) configuration space into a number of “bins” (regions). Two correct simulations should yield the same results for the fractional populations of the bins, within statistical error, regardless of whether the bins correspond to physical states/free energy basins. (Furthermore, the statistical uncertainty in the population estimates can be used to quantify the “effective sample size”.)^{43} In the present work, we compare equilibrium distributions from fragment combination and from standard Langevin simulations based on structural histograms. The particular histograms employed in the present study have five bins derived from a Voronoi construction;^{44} the reference structures for the Voronoi procedure are derived from the equi-probability scheme described in reference.^{43} Although the resulting bins are not exactly equally probable, each is guaranteed to represent a contiguous region in configuration space due to the Voronoi construction.

The absolute configurational free energy *F*^{phys} was calculated for the monomer, dimer, and tetramer alanine peptides: alanine dipeptide (Ace-Ala-Nme), di-alanine (Ace-(Ala)_{2}-Nme), and tetra-alanine (Ace-(Ala)_{4}-Nme). For alanine dipeptide, the free energy was estimated based on two different fragment sets as a check on our code. Twenty independent calculations for every *F*^{phys} estimate were performed to quantify uncertainty, as described above. Additionally, every free energy calculation also yields an equilibrium ensemble, which is compared to independent Langevin simulations.

The results are very encouraging in every regard, and rather rapid as reported at the end of the Results section. The amount of memory used, which is a key to the present calculations, is also reported.

The results reflect the uniform protocol adopted here. First, absolute free energies for non-interacting fragments are calculated. Then free energy changes resulting from interactions among covalently bound fragments are added (“bonded” terms, in the space of fragments), one at a time in sequence. Finally, all remaining interactions are added, which account to (“non-bonded”) interactions among all non-sequential fragment atoms. The final free energy values reflect the full OPLS-AA forcefield^{36} as implemented in Tinker.^{41}

Because of the complexity of the fragmentation procedure and the lack of reference standards for absolute free energy values, we wanted to ensure our code was introducing no artifacts. We were particularly concerned about the interacting dummy atoms which introduce “temporary” energy terms, that must be corrected for properly at every combination stage. We find good agreement between free energy estimates based on two- and three-fragment decompositions.

In our standard decomposition for the present study, we separate peptide and amino acid groups. For alanine dipeptide (AD), then, the three standard fragments are Ace, Ala, and Nme, and the corresponding stages for the free energy calculation are given in Eq. (9). Recalling our convention that *F*^{ref} 0, the free energy terms from Eq. (5) can be written as

$$\begin{array}{l}\hfill {F}^{\text{ref}}\equiv 0\hfill \\ \hfill \mathrm{\Delta}{F}_{1}={F}_{\text{Ace}}+{F}_{\text{Ala}}+{F}_{\text{Nme}}\hfill \\ \hfill \mathrm{\Delta}{F}_{2}=\mathrm{\Delta}{F}_{\text{Ace}\to \text{Ala}}\hfill \\ \hfill \mathrm{\Delta}{F}_{3}=\mathrm{\Delta}{F}_{\text{Ala}\to \text{Nme}}\hfill \\ \hfill \mathrm{\Delta}{F}_{3\to \text{phys}}=\mathrm{\Delta}{F}_{\text{nonbonded}}\hfill \end{array}$$

(24)

where *F _{y}* is the absolute free energy (including dummy atoms) for fragment

As an alternative decomposition, we used Ace-Ala as one fragment and Nme as the other. Importantly, the Ace-Ala library and absolute free energy were *not* generated from a combination of the two smaller libraries, but instead from a ground-up calculation based on independent coordinates as described in the Methods section.

In this case, then, the free energy terms from Eq. (5) become

$$\begin{array}{l}\hfill {F}^{\text{ref}}\equiv 0\hfill \\ \hfill \mathrm{\Delta}{F}_{1}={F}_{\text{Ace}-\text{Ala}}+{F}_{\text{Nme}}\hfill \\ \hfill \mathrm{\Delta}{F}_{1\to \text{phys}}=\mathrm{\Delta}{F}_{\text{Ace}-\text{Ala}\to \text{Nme}},\hfill \end{array}$$

(25)

where it is notable that in the two-fragment case, *all* interactions are included in the libraries and interaction tables. In other words, no energy calls at all are needed.

There is essentially perfect agreement between free energies estimate via the two independent decompositions, which provides a reassuring check on our computer program. The full results are given in Table I. Notably, the two-fragment decomposition has a higher variance, which probably results from a decreased “precision” in the pre-generated Ace-Ala ensemble. In the composite pre-generated Ace-Ala ensemble, the whole configuration space is represented by 10^{4} configurations, whereas when Ace and Ala from separate 10^{4}-member libraries are combined, there is a much denser coverage of configuration space.

Our free energy computation produces an equilibrium ensemble through repeated resampling procedures at each stage, as explained in the Methods section (Sec. IIJ). As a further check on our data, we compare the equilibrium ensembles generated from our fragment combination procedure to those produced by long Langevin simulations performed in Tinker.^{41} The results, shown in Fig. 3(a), indicate that our computation is indeed producing correct equilibrium ensembles. The graph shows the populations of different regions of configuration space, which was divided up using a Voronoi procedure explained above (Sec. IIK). The alanine dipeptide equilibrium distribution was generated from the three-fragment protocol, and the 1 µsec. Langevin simulation (20×50 nsec) was performed in Tinker using a friction constant of 10.0 ps^{−1} at *T* = 298*K*.

Using the same libraries as for the alanine monomer above, we now calculate the absolute configurational free energy for the di-alanine peptide (Ace-(Ala)_{2}-Nme). The staging used is described in Eq. (10), which corresponds to the following free energy terms for use in Eq. (5):

$$\begin{array}{l}\hfill {F}^{\text{ref}}\equiv 0\hfill \\ \hfill \mathrm{\Delta}{F}_{1}={F}_{\text{Ace}}+2\cdot {F}_{\text{Ala}}+{F}_{\text{Nme}}\hfill \\ \hfill \mathrm{\Delta}{F}_{2}=\mathrm{\Delta}{F}_{\text{Ace}\to \text{Ala}}\hfill \\ \hfill \mathrm{\Delta}{F}_{3}=\mathrm{\Delta}{F}_{\text{Ala}\to \text{Ala}}\hfill \\ \hfill \mathrm{\Delta}{F}_{4}=\mathrm{\Delta}{F}_{\text{Ala}\to \text{Nme}}\hfill \\ \hfill \mathrm{\Delta}{F}_{4\to \text{phys}}=\mathrm{\Delta}{F}_{\text{nonbonded}}.\hfill \end{array}$$

(26)

It is important to note that Δ*F*_{Ala→Ala} for di-alanine differs in principle from the term with the same name in Eq. (24) for alanine-dipeptide because the prior ensemble is different. This point was explained in Sec. IIH. However, in practice, the difference among the three systems is not statistically significant.

The free energy values are once again calculated with high precision: fluctuations are a fraction of one kcal/mole. The data for all free energy terms is given Table II, where we see a significant change in the Δ*F*_{nonbonded} term, reflecting the increased number of attractive interactions in this larger molecule (compared to alanine dipeptide).

Free energy terms used in calculating the absolute free energy for di-alanine and tetra-alanine. The table gives free energy values in kcal/mole, as well as two standard deviations (in parentheses) based on 20 independent calculations.

Similarly, the agreement among bin populations for di-alanine in Fig. 3(b) is good, which provides an independent reason for having confidence in the free energy results. The Langevin simulation for di-alanine was performed with exactly the same parameters as for alanine-dipeptide.

Our results are of high precision (~ 0.1 kcal/mole standard deviation) even for tetra-alanine. The staging follows our standard procedure, with the only subtlety in the present case is that the addition of every Ala residue is different, because the “growing” ensemble is different in every case. Thus we consider the first alanine (Alal) separate from the second (Ala2), and so on.

$$\begin{array}{l}\hfill {F}^{\text{ref}}\equiv 0\hfill \\ \hfill \mathrm{\Delta}{F}_{1}={F}_{\text{Ace}}+4\cdot {F}_{\text{Ala}}+{F}_{\text{Nme}}\hfill \\ \hfill \mathrm{\Delta}{F}_{2}=\mathrm{\Delta}{F}_{\text{Ace}\to \text{Ala}1}\hfill \\ \hfill \mathrm{\Delta}{F}_{3}=\mathrm{\Delta}{F}_{\text{Ala}1\to \text{Ala}2}\hfill \\ \hfill \mathrm{\Delta}{F}_{4}=\mathrm{\Delta}{F}_{\text{Ala}2\to \text{Ala}3}\hfill \\ \hfill \mathrm{\Delta}{F}_{5}=\mathrm{\Delta}{F}_{\text{Ala}3\to \text{Ala}4}\hfill \\ \hfill \mathrm{\Delta}{F}_{6}=\mathrm{\Delta}{F}_{\text{Ala}4\to \text{Nme}}\hfill \\ \hfill \mathrm{\Delta}{F}_{6\to \text{phys}}=\mathrm{\Delta}{F}_{\text{nonbonded}}.\hfill \end{array}$$

(27)

Our data for each of these terms is given in Table II. Although the different alanine additions are based on different ensembles, the results show they are statistically indistin-guishable in this case. However, the “non-bonded” term Δ*F*_{nonbonded} again is significantly different from the previous molecules, as expected.

In comparing the equilibrium distributions from fragment combination and Langevin simulation, once again there is good statistical agreement. For Langevin simulation of tetra-alanine, all parameters were set as before, except for a friction constant of 5.0 ps^{−1}, which does not alter the equilibrium distribution. The contrast between the large fluctuations in the bin populations *p _{i}* and the high precision of

The calculations were reasonably inexpensive, taking 20 minutes for alanine dipeptide, 30 minutes for di-alanine, and 50 minutes for tetra-alanine using one processor of an Intel Xeon 3.20 GHz machine. Concerning memory, a single library containing 10,000 configurations requires 11 MB for Ala, 12 MB for Ace-Ala complex and 5.7 MB for Ace and Nme. An interaction table containing 10^{8} pair-wise interactions uses 1.3 GB.

Overall, the precision of our free energy estimates was very high, which we attribute to two related factors. First, our ensembles in the reference and intermediate stages were of good statistical quality — i.e., characterized by a large effective sample size (data not shown).^{43} Second, there was good overlap between the stages, which indeed contributed to maintaining the sample size throughout the stages. The overlap is present by design, as interactions were always *added* between stages. The addition of interactions or, equivalently, correlations among degrees of freedom is guaranteed to reduce the entropy.^{45} This progressive narrowing of configuration space is consistent with Kofke’s proposal to calculate free energy differences in the “insertion” direction.^{39}^{,}^{40} For larger systems, however, one expects limitations to maintaining the sample size using the present protocol, as explained below.

We note that our present approach differs from our previous work^{3} primarily because the earlier study did not employ staging. Effectively, it was a single-fragment calculation from the perspective of the present study.

Because the fragment combination procedure can be applied to fragments of small molecules, and not just to peptides as in the present report, the approach can be applied to calculate approximate relative affinities. That is, one can grow a ligand into the binding pocket of a protein receptor and calculate its free energy. A number of different approximations can be imagined. Most simply, the receptor can be held rigid and the ligand grown in the fields (van der Waals and electrostatic) of the receptor. In a better approximation, the binding-site side-chains can be grown along with the ligand. One can expect affinities based on such free energy calculations to be superior to their docking counterparts because entropy is included. To produce a relative affinity estimate between two ligands, the respective solvation terms would need to be included as usual.^{12}

As we have already noted, the fragment combination protocol we have described produces equilibrium ensembles simultaneously with free energy estimates. It is natural to wonder whether such ensembles are produced more efficiently than by standard dynamics simulation especially given that small peptides have been shown to have multi-nanosecond relaxation times.^{43} In fact, as we will document in a separate publication, fragment combination can lead to sampling that is faster by several orders of magnitude. However, somewhat more sophisticated resampling schemes and different fragment sizes are useful in reaching the highest levels of efficiency, as will be reported.

It is interesting and important to consider the additional costs which would be entailed by using a standard implicit solvent model, such as GBSA.^{46} First, both the libraries and the interaction tables would need to be regenerated using the implicit solvent model. Although this could take several weeks of single-CPU time, it needs to be done only a single time for a given model. The second cost is for additional solvent calculations not included in the libraries and tables. We hope to report on the staging and computational expense in a forthcoming publication.

The protocol described in this report will encounter challenges with larger systems, as our results hint. In the present study, the equilibrium ensemble generated at one stage, say *j*, is used to calculate the incremental free energy difference to the next stage, *j* + 1. To continue the process, the ensemble at stage *j* + 1 is produced by resampling ensemble *j* as described in the Methods section. However, it is possible the resampled ensemble will contain a small number of distinct configurations in an important part of configuration space. Such configurations will have high weight prior to resampling and thus the problem can be diagnosed by noting whether any configurations are resampled multiple times. Clearly, duplicated configurations will not be statistically independent and lead to increased statistical error in free energy estimates.

One solution to this problem would be to relax duplicated configurations — i.e. to perform short equilibrium simulations to create distinct configurations. The statistical justification of such an approach is somewhat technical^{47} and will be described in future work as required.

Additional incremental stages can be added by considering only subsets of interactions. For instance, in the case of di-alanine (Ace-(Ala)_{2}-Nme) which is composed of four fragments (A, B, C, D), there are three sets of non-bonded interactions: AC, AD, and BD. Our present implementation adds all three in a single stage, but they could be added one at a time. Undoubtedly, in larger systems, such finer staging will be necessary and probably should be required by relaxation of duplicated configurations as just described.

We have extended our earlier work on computation of absolute free energies for molecular systems ^{3} in two ways: (i) by staging the calculation using molecular fragments; and (ii) by pre-calculating “statistical libraries” of fragment configurations, intra-fragment energies, and inter-fragment interactions. For a series of test systems – the alanine monomer, dimer, and tetramer — we were able to compute extremely precise free energies, with fluctuations 1 kcal/mole. The calculations were quite fast, furthermore, with the slowest requiring less than an hour of single-processor computer time. The speed results from employing (infinitely re-usable) pre-calculated library configurations and interactions, which pre-empt expensive energy calculations. Our statistical libraries of amino-acid and capping-group fragments are available on our website (www.ccbb.pitt.edu/Zuckerman).

A future application of potential importance in computational biochemistry is the estimation of binding affinities of small molecules to proteins. We hope to develop fragment libraries suitable for small molecules — following ideas developed long ago^{29}^{,}^{30} — to be used in computing free energies within a potentially flexible protein binding site. Protein flexibility could be included using the side-chains of the binding site as fragments in the calculations.

Another application closely related to the present report is the use of library-based fragment combination for equilibrium sampling. While the calculations reported here already yield equilibrium ensembles, we are actively studying more efficient schemes based on advanced resampling approaches^{37}^{,}^{48} and a range of fragment sizes. This work will be reported in a separate publication.

We greatly appreciate the advice of Divesh Bhatt, Ying Ding, Marty Ytreberg, and Bin Zhang, as well as the financial support of the NIH (Grants GM076569 and GM070987) and NSF (Grant MCB-0643456).

Xin Zhang, Department of Physics & Astronomy, University of Pittsburgh, Pittsburgh, Pennsylvania 15260.

Artem B. Mamonov, Department of Computational Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania 15213.

Daniel M. Zuckerman, Department of Computational Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania 15213.

1. Frenkel D, Smit B. Understanding Molecular Simulation. San Diego: Academic Press; 1996.

2. Stoessel JP, Nowak P. Macromol. 1990;23:1961.

3. Ytreberg FM, Zuckerman DM. J. Chem. Phys. 2006;124:104105. [PubMed]

4. Huang L, Makarov DE. J. Chem. Phys. 2006;124:64108. [PubMed]

5. Meirovitch H. J. Chem. Phys. 1992;97:5803.

6. Meirovitch H. J. Chem. Phys. 1999;111:7215.

7. Cheluvaraja S, Meirovitch H. Proc. Natl. Acad. Sci. U. S. A. 2004;101:9241. [PubMed]

8. White RP, Meirovitch H. Proc. Natl. Acad. Sci. U. S. A. 2004;101:9235. [PubMed]

9. Head MS, Given JA, Gilson MK. J. Phys. Chem. B. 1997;101:1609.

10. Chang C-E, Potter MJ, Gilson MK. J. Phys. Chem. B. 2003;107:1048.

11. Chang C, Gilson M. J. Am. Chem. Soc. 2004;126:13156. [PubMed]

12. Chen CC, W, Gilson M. Biophys. J. 2004;87:3035. [PubMed]

13. Go N, Scheraga H. J. Chem. Phys. 1969;51:4751.

14. Rosenbluth MN, Rosenbluth AW. J. Chem. Phys. 1955;23:356.

15. Wall FT, Erpenbeck JJ. J. Chem. Phys. 1959;30:634.

16. Meirovich H. J. Phys. A: Math. Gen. 1982;15:L735.

17. Meirovich H. Phys. Rev. A. 1985;32:3699. [PubMed]

18. Garel T, Orland H. J. Phys. A: Math. Gen. 1990;23:L621.

19. Garel T, Niel JC, Orland H, Smith J, Velikson B. J. Chem. Phys. 1991;88:2479.

20. Velikson B, Garel T, Niel JC, Orland H, Smith JC. J. Comput. Chem. 1992;13:1216.

21. Bascle J, Garel T, Orland H, Velikson B. Biopolymers. 1993;33:1843. [PubMed]

22. Grassberger P. J. Phys. A: Math. Gen. 1993;26:2769.

23. Grassberger P, Hegger R. J. Phys.- Condens. Mat. 1995;7:3089.

24. Grassberger P. Phys. Rev. E. 1997;56:3682.

25. Bastolla U, Frauenkron H, Gerstner E, Grassberger P, Nadler W. Proteins. 1998;32:52. [PubMed]

26. Zhang JL, Liu JS. J. Chem. Phys. 2002;117:3492.

27. Zhang J, Lin M, Chen R, Liang J, Liu JS. Proteins. 2007;66:61. [PubMed]

28. Gibson KD, Scheraga HA. J. Comput. Chem. 1987;8:826.

29. Miranker A, Karplus M. Proteins. 1991;11:29. [PubMed]

30. Leach AR. Molecular Modelling: Principles and Applications. Pearson Education Limited; 2001.

31. Mamonov AB, Bhatt D, Cashman DJ, Zuckerman DM. submitted. [PMC free article] [PubMed]

32. Wall FT, Rubin RJ, Isaacson LM. J. Chem. Phys. 1957;27:186.

33. Alexandrowicz Z. J. Chem. Phys. 1969;51:561.

34. Macedonia MD, Maginn EJ. Mol. Phys. 1999;96:1375.

35. Rohl CA, Strauss CEM, Misura KMS, Baker D. Method Enzymol. 2004;383:66. [PubMed]

36. Jorgensen WL, Maxwell DS, Tirado-Rives J. J. Am. Chem. Soc. 1996;118:11225.

37. Liu JS. Monte Carlo strategies in scientific computing. Springer; 2004.

38. Zwanzig RW. J. Chem. Phys. 1954;22:1420.

39. Peter DAK, Cummings T. Mol. Phys. 1997;92:973.

40. Kofke DA, Cummings PT. Fluid Phase Equilibr. 1998;150–151:41.

41. Ponder JW, Richard FM. J. Comput. Chem. 1987;8:1016.

42. Lyman E, Zuckerman DM. Biophys. J. 2006;91:164. [PubMed]

43. Lyman E, Zuckerman DM. J. Phys. Chem. B. 2007;111:12876. [PMC free article] [PubMed]

44. Voronoi G. Journal für die Reine und Angewandte Mathematik. 1907;133:97.

45. Cover TM, Thomas JA. Elements of Information Theory. 2nd Edition. Wiley; 2006.

46. Qiu D, Shenkin PS, Hollinger FP, Still WC. J Phys. Chem. A. 1997;101:3005.

47. Neal RM. Stat. comput. 2001;11:125.

48. Fearnhead P, Clifford P. J. R. Stat. Soc. B. 2003;65:887.

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |