Home | About | Journals | Submit | Contact Us | Français |

**|**HHS Author Manuscripts**|**PMC2832335

Formats

Article sections

- Abstract
- I. Introduction
- II. Methods
- III. Results and Discussion
- IV. Conclusions
- Supplementary Material
- References

Authors

Related links

J Chem Theory Comput. Author manuscript; available in PMC 2010 October 5.

Published in final edited form as:

J Chem Theory Comput. 2009 October 5; 5(11): 2935–2943.

doi: 10.1021/ct900409pPMCID: PMC2832335

NIHMSID: NIHMS148492

Department of Chemistry and Biochemistry, Worcester Polytechnic Institute, Worcester, MA 01609

George A. Kaminski: ude.ipw@iksnimakg

See other articles in PMC that cite the published article.

We are presenting POSSIM (POlarizable Simulations with Second order Interaction Model) – a software package and a set of parameters designed for molecular simulations. The key feature of POSSIM is that the electrostatic polarization is taken into account using a previously introduced fast formalism. This permits cutting computational cost of using the explicit polarization by about an order of magnitude. In this article, parameters for water, methane, ethane, propane, butane, methanol and NMA are introduced. These molecules are viewed as model systems for protein simulations. We have achieved our goal of ca. 0.5 kcal/mol accuracy for gas-phase dimerization energies and no more than 2% deviations in liquid state heats of vaporization and densities. Moreover, free energies of hydration of the polarizable methane, ethane and methanol have been calculated using the statistical perturbation theory. These calculations are a model for calculating protein pKa shifts and ligand binding affinities. The free energies of hydration were found to be 2.12 kcal/mol, 1.80 kcal/mol and −4.95 kcal/mol for methane, ethane and methanol, respectively. The experimentally determined literature values are 1.91 kcal/mol, 1.83 kcal/mol and −5.11 kcal/mol. The POSSIM average error in these absolute free energies of hydration is only about 0.13 kcal/mol. Using the statistical perturbation theory with polarizable force fields is not widespread, and we believe that this work opens road to further development of the POSSIM force field and its applications for obtaining accurate energies in protein-related computer modeling.

Computer simulations have become an integral part of chemical and biochemical research, delivering answers to a variety of questions ranging from assessing reaction kinetic data to providing microscopic insight into systems involving proteins and DNA. The key issue in any such modeling is the way the energy of the system is evaluated. Quantum mechanics provides a robust tool for this task, but there are two general problems. First, quantum mechanical calculations are resource demanding, therefore, the size of a system which can be treated this way is limited. Second, there is no one single recipe for obtaining uniformly accurate quantum mechanical data, and the level of theory required for different problems varies.

Empirical force fields offer an alternative which is much cheaper computationally. However, it has been shown that, in many cases, accurate assessment of energy with empirical force fields necessitates explicit treatment of many-body interactions, mainly the electrostatic polarization.^{1} Dimerization energies and acidity constants of small molecules, energies of protein-ligand interactions, protein pKa values, or even the very thermodynamic stability of complexes in solutions often depend on including the electrostatic polarization into the empirical Hamiltonian. For example, it has been demonstrated in earlier works, that pKa values for acidic and basic residues of the OMTKY3 can be reproduced within 0.6 and 0.7 pH units of the experimental data if explicit treatment of polarization is included. The errors with the non-polarizable OPLS are 3.3 pH units for the acidic residues and 2.2 for the basic ones.^{2} Another interesting example was a study of Sialyl LewisX in complex with SelectinE. This sugar-protein complex is known to exist experimentally, but dissociated in molecular dynamics simulations with fixed charges.^{3} Overall, many-body interactions play a crucial role in many applications, although they are sometimes included in surrogate forms, for example as conformation-specific protein charges.^{4}

There are two major issues pertaining to polarizable calculations which are still not uniformly resolved. On one hand, the optimal source of fitting data seems to vary from application to application. While it is attractive to rely almost solely on high-level quantum mechanical results,^{5} experimental data often presents more robustness. Moreover, while quantum mechanical calculations permit a great level of microscopic insight, the very values of quantum mechanical energies are often uncertain as they may significantly change depending on the level of theory. We adopt a middle-path approach in this work, in which we rely on experimental data whenever possible and make heavy use of quantum mechanical calculations, while trying not to treat them as panacea.

The second issue with polarizable force fields is the functional form of polarization itself. There are several viable techniques present, including fluctuating charges or inducible point dipoles. It is possible to implement these quite efficiently for uniform systems known in advance (such as pure water^{6}). But software for simulating arbitrary polarizable systems (including proteins and protein-ligand complexes) is usually significantly slower than its fixed-charges counterparts. Calculating induced dipoles or fluctuating charges requires solving a system of self-consistent equations. This is usually done iteratively, and the process may lead to unlimited growth of the induced dipoles (the so called polarization catastrophe). In order to reduce the computational cost of polarizable calculations, we introduced an approximation that we termed second-order polarization. It permitted to reduce the time needed for assessing the polarization energy by about an order of magnitude, removed any possibility of polarization catastrophe, and it was done without any sacrifice of the computational accuracy.^{7}

This article reports the next stage of the systematic development of the second-order polarization technique. A software suite POSSIM (POlarizable Simulations with Second order Interaction Model) has been created for the second order polarizable calculations. Parameters for several model systems have been developed. The list of these systems includes the NMA, which will be further used as the main building block in protein backbones. Moreover, free energy perturbations were performed with the POSSIM force field and software to obtain relative and absolute Gibbs free energies of hydration for methane, ethane and methanol. While being widely used with fixed charges force fields, such calculations are still rather rare with polarizable techniques. The toolset presented in this paper will be further utilized in developing POSSIM parameters for proteins and other systems and studying biophysical and organic processes, including protein-ligand binding.

In this work, we benchmark the POSSIM results (marked PFF, for Polarizable Force Field) against the fixed-charges OPLS-AA^{8} as well as the previous version of the polarizable force field (non-second order, thus a slower one) termed PFF0.^{9}

The rest of the paper is organized as follows: Given in Section II is description of the methodology involved. Section III contains results and discussion. Finally, conclusions are presented in Section IV.

The total energy *E _{tot}* is calculated as follows:

$${E}_{\mathit{\text{tot}}}={E}_{\mathit{\text{electrostatic}}}+{E}_{\mathit{\text{vdW}}}+{E}_{\mathit{\text{stretch}}}+{E}_{\mathit{\text{bend}}}+{E}_{\mathit{\text{torsion}}}$$

(1)

*E _{electrostatic}* stands for the electrostatic interactions, including the dipole-dipole, dipole-charge, and charge-charge contributions.

The electrostatic polarization energy with inducible point dipoles is:

$${E}_{\mathit{\text{pol}}}=-\frac{1}{2}{\displaystyle \sum _{i}{\mathrm{\mu}}_{i}{\mathbf{E}}_{i}^{0}},$$

(2)

**E**^{0} denotes the electrostatic field in the absence of the dipoles, and μ represents the induced dipole moments which are calculated as follows:

$${\mathrm{\mu}}_{i}={\mathrm{\alpha}}_{i}{\mathbf{E}}_{i}^{\mathit{\text{tot}}},$$

(3)

where α are scalar polarizabilities and **E**_{i}^{tot} is the total field, including the dipole-dipole component,

$${\mathbf{E}}_{i}^{\mathit{\text{tot}}}={\mathbf{E}}_{i}^{0}+{\displaystyle \sum _{j\ne i}{\mathbf{T}}_{\mathit{\text{ij}}}{\mathrm{\mu}}_{j}}$$

(4)

$${\mathbf{T}}_{\mathit{\text{ij}}}=\frac{1}{{\mathrm{R}}_{\mathit{\text{ij}}}^{3}}(\frac{3{\mathbf{R}}_{\mathit{\text{ij}}}{\mathbf{R}}_{\mathit{\text{ij}}}}{{\mathrm{R}}_{\mathit{\text{ij}}}^{2}}-\mathbf{I})$$

(5)

with R_{ij} being distances between atomic sites *i* and *j*. **I** is the unit tensor. Then

$${\mathrm{\mu}}_{i}={\mathrm{\alpha}}_{i}{\mathbf{E}}_{i}^{0}+{\mathrm{\alpha}}_{i}{\displaystyle \sum _{j\ne i}{\mathbf{T}}_{\mathit{\text{ij}}}{\mathrm{\mu}}_{j}}$$

(6a)

or

$$\mathbf{A}\mathrm{\mu}={\mathbf{E}}^{0}$$

(6b)

The self-consistent Equation 6 is usually solved iteratively. If we explicitly write down the first three iterations, we can produce "first-order", "second-order" and "third-order" approximations for the induced dipoles:

$${\mathrm{\mu}}_{i}^{0}={\mathrm{\alpha}}_{i}{\mathbf{E}}_{i}^{0}$$

(7a)

$${\mathrm{\mu}}_{i}^{I}={\mathrm{\alpha}}_{i}{\mathbf{E}}_{i}^{0}+{\mathrm{\alpha}}_{i}{\displaystyle \sum _{j\ne i}{\mathbf{T}}_{\mathit{\text{ij}}}{\mathrm{\mu}}_{j}^{0}=}{\mathrm{\alpha}}_{i}{\mathbf{E}}_{i}^{0}+{\mathrm{\alpha}}_{i}{\displaystyle \sum _{j\ne i}{\mathbf{T}}_{\mathit{\text{ij}}}{\mathrm{\alpha}}_{i}{\mathbf{E}}_{j}^{0}}$$

(7b)

$${\mathrm{\mu}}_{i}^{\mathit{\text{II}}}={\mathrm{\alpha}}_{i}{\mathbf{E}}_{i}^{0}+{\mathrm{\alpha}}_{i}{\displaystyle \sum _{j\ne i}{\mathbf{T}}_{\mathit{\text{ij}}}{\mathrm{\mu}}_{j}^{I}={\mathrm{\alpha}}_{i}}{\mathbf{E}}_{i}^{0}+{\mathrm{\alpha}}_{i}{\displaystyle \sum _{j\ne i}{\mathbf{T}}_{\mathit{\text{ij}}}{\mathrm{\alpha}}_{j}}{\mathbf{E}}_{j}^{0}+{\mathrm{\alpha}}_{i}{\displaystyle \sum _{j\ne i}{\mathbf{T}}_{\mathit{\text{ij}}}{\mathrm{\alpha}}_{j}}{\displaystyle \sum _{k\ne j}{\mathbf{T}}_{\mathit{\text{jk}}}{\mathrm{\alpha}}_{k}}{\mathbf{E}}_{k}^{0}$$

(7c)

The first-order expression requires considerably less resources than the exact expression in Equation 6, as the matrix **A** does not come into play, but dipole-dipole interactions are totally ignored. It has been shown that, although more accurate than the fixed-charges description, this technique is not accurate enough in describing molecular systems.^{10} We are utilizing the second-order expression in Equation (7b) instead. It has been previously shown to yield a ca. order of magnitude increase of the computational speed with no loss of accuracy.^{7}

The overall electrostatic energy

$${E}_{\mathit{\text{electrostatic}}}={E}_{\mathit{\text{pol}}}+{E}_{\mathit{\text{additive}}}$$

(8)

Where

$${E}_{\mathit{\text{additive}}}={\displaystyle \sum _{i\ne j}\frac{{q}_{i}{q}_{j}}{{R}_{\mathit{\text{ij}}}}}{f}_{\mathit{\text{ij}}},$$

(9)

represents the pairwise-additive Coulomb charge-charge interaction energies between charges on atoms *i* and *j*. *f _{ij}* equals to zero for 1,2- and 1,3-pairs (atoms which belong to the same valence bond or angle), 0.5 for 1,4-interactions (atoms in the same dihedral angle) and 1.0 for all the other pairs.

To avoid unphysical increase of the electrostatic interactions at close distances, each atom is assigned a cutoff parameter *R _{cut}*. When the overall distance

$${R}_{\mathit{\text{ij}}}^{\mathit{\text{eff}}}=(1-{\left(\frac{{R}_{\mathit{\text{ij}}}}{{R}_{\mathit{\text{min}}}^{\mathit{\text{ij}}}}\right)}^{2}+{\left(\frac{{R}_{\mathit{\text{ij}}}}{{R}_{\mathit{\text{min}}}^{\mathit{\text{ij}}}}\right)}^{3})\xb7{R}_{\mathit{\text{min}}}^{\mathit{\text{ij}}}$$

(10)

The van-der-Waals energy is evaluated with the standard Lennard-Jones formalism:

$${E}_{\mathit{\text{vdW}}}={\displaystyle \sum _{i\ne j}4{\epsilon}_{\mathit{\text{ij}}}\phantom{\rule{thinmathspace}{0ex}}\left[{(\frac{{\mathrm{\sigma}}_{\mathit{\text{ij}}}}{{R}_{\mathit{\text{ij}}}})}^{12}-{(\frac{{\mathrm{\sigma}}_{\mathit{\text{ij}}}}{{R}_{\mathit{\text{ij}}}})}^{6}\right]}{f}_{\mathit{\text{ij}}}$$

(11)

Geometric combining rules are used:

$${\epsilon}_{\mathit{\text{ij}}}={({\epsilon}_{i}\xb7{\epsilon}_{j})}^{1/2},{\mathrm{\sigma}}_{\mathit{\text{ij}}}={({\mathrm{\sigma}}_{i}\xb7{\mathrm{\sigma}}_{j})}^{1/2}$$

(12)

and the coefficient *f _{ij}* is calculated the same way as for the electrostatic.

The bond stretching and angle bending energies were obtained in accordance with Equation 13 and Equation 14.

$${E}_{\mathit{\text{stretch}}}={\displaystyle \sum _{\mathit{\text{bonds}}}{K}_{r}{(r-{r}_{\mathit{\text{eq}}})}^{2}}$$

(13)

$${E}_{\mathit{\text{bend}}}={\displaystyle \sum _{\mathit{\text{angles}}}{K}_{\mathrm{\Theta}}{(\mathrm{\Theta}-{\mathrm{\Theta}}_{\mathit{\text{eq}}})}^{2}}$$

(14)

Here the subscripts *eq* are used to denote the equilibrium values of the bond length *r* and angle Θ.

Finally, the torsional term was computed as follows:

$${E}_{\mathit{\text{torsion}}}={\displaystyle \sum _{i}\frac{{V}_{1}^{i}}{2}[1+\text{cos}({\varphi}_{i})]}+\frac{{V}_{2}^{i}}{2}[1-\text{cos}(2{\varphi}_{i})]+\frac{{V}_{3}^{i}}{2}[1+\text{cos}(3{\varphi}_{i})],$$

(15)

with the summation performed over all the dihedral angles *i*.

When comparison is done with results obtained with the non-polarizable OPLS-AA force field,^{8} the latter lacks the first (polarizable) term in the electrostatic energy expression in Equation 8. The previous generation polarizable force field^{9} (denoted as PFF0) employs the full iterative solvation of Equation 6, has permanent dipole moments in addition to the inducible ones and uses exp-6 van-der-Waals energy term instead of the Lennard-Jones 12-6 formalism of this work.

The procedure for determining values of the potential energy parameters consisted of the following stages: (i) fitting the electrostatic polarizabilities; (ii) fitting the permanent electrostatic charges; (iii) determining Lennard-Jones parameters; (iv) obtaining values of the torsional parameters and (v) fine-tuning of the force field. All the calculations for the newly developed PFF (both for the parameterization and free energy perturbations) were performed with the POSSIM software suite.

A series of electrostatic perturbations were applied to the target molecules, in the form of dipolar probes consisting of two opposite charges of magnitude 0.78 *e*, 0.58 Å apart (for a dipole moment of 2.17D – similar to that of non-polarizable models for liquid water such as SPC/E^{11}), placed at locations where hydrogen-bonds to the molecule were formed. The perturbations were the same as used for the previous generation of the polarizable force field (PFF0).^{5}^{,}^{12} For each perturbation, the change in the electrostatic potential (ESP) at a set of gridpoints outside the van der Waals surface of the molecule, as well as the energy of the perturbed system, were computed using density-functional theory (DFT) with the B3LYP method^{13} and cc-pVTZ(-f) basis set. All calculations were performed with the Jaguar electronic structure code.^{14} The polarizablities α_{i} are assumed to be isotropic and are chosen to minimize the deviation of the three-body energy obtained with the PFF and the DFT calculations. The three body energies were calculated in accordance with Equation 16 and Figure 1

$${E}_{3\mathit{\text{body}}}=E(1+2+3)-E(1+2)-E(1+3)-E(2+3)+E(1)+E(2)+E(3)$$

(16)

Here the molecule for which parameters are produced is denoted as 1, and the two dipolar probes are marked as 2 and 3. This three body energy is automatically equal to zero in a nonpolarized fixed-charges force field, in which no many-body interactions are explicitly present. It has been shown that the three-body energy is independent of the permanent charges and depends only on the values of the polarizabilities.^{7} Therefore, it is justified that the polarizabilities in our force field were parameterized first.

The choice of electronic structure method (DFT/B3LYP functional, cc-pVTZ (-f) basis set) yields quite accurate permanent charge distributions, but underestimates the gas phase polarizability as compared to experiment. Closer agreement with gas phase experiments could be obtained by including diffuse functions in the DFT calculations. However, our previous computational experiments with liquid state simulations strongly suggest that these diffuse function contributions are considerably damped in the condensed phase, and that ignoring them is in fact a much better approximation than fully including them.^{5} Briefly, in the condensed phase, Pauli repulsion from neighboring molecules raises the energies of diffuse functions and so diminishes their contribution to the polarization. Empirically, when diffuse functions are used to develop polarization responses for small molecules, liquid state simulations of these molecules manifest overpolarization of the solvent.

We have used the same quantum mechanical systems as above, except that two-body energies were employed as the fitting target:

$${E}_{2\mathit{\text{body}}}=E(1+2)-E(1)-E(2)$$

(17)

The polarizabilities were not changed from their values obtained at the previous step. The charges were adjusted to minimize the 2-body RMS deviations.

In all the cases (except for the dipolar probes), the *R _{cut}* for both dipoles and charges on the molecule involved were set to 0.8 Å.

Fitting the Lennard-Jones part of the force field was done with high accuracy *ab initio* results for intermolecular hydrogen bonding interaction energies and distances as a target. Gas-phase energy minimizations were carried out. The quantum mechanical data was obtained as described in Reference ^{15}, and many actual target energies were adopted from the References ^{9} and ^{12} (as noted in the tables in the Results and Discussion section below). The methods employed to calculate binding energies is based on an MP2 extrapolation procedure that was previously developed using the pseudospectral local MP2 (LMP2) approach.

The goal was to reproduce the gas phase intermolecular binding affinities and geometries as accurately as possible. We normally set a target of 0.25 – 0.5 kcal/mole or better for the precision of the binding affinity. For hydrogen bonds, this can be achieved via MP2 calculations extrapolated to the basis set limit, where the contribution of higher level excitations (e.g. CCSD (T)) has been shown to be negligible (although in some cases, such as pi stacking of aromatic rings, the MP2 level is not adequate to achieve the target accuracy).

Briefly, dimer geometries were obtained by LMP2 optimizations with a cc-pVTZ(-f) basis set.^{16} The empirical dimer binding energy consists of the LMP2 binding energy for a smaller cc-pVTZ(-f) basis set (*E*_{ccpvtz}) and the LMP2 binding energy with a larger cc-pVQZ(−g) basis set (*E*_{ccpvq}). The model binding energy *E*_{bind} takes the simple form:^{15}

$${E}_{\text{bind}}={C}_{1}\xb7{E}_{\text{ccpvtz}}+{C}_{2}\xb7{E}_{\text{ccpvtz}}$$

(18a)

$${C}_{1}={\mathrm{a}}_{1}/({\mathrm{a}}_{1}-{\mathrm{a}}_{2});\phantom{\rule{thinmathspace}{0ex}}{C}_{2}=-{\mathrm{a}}_{2}/({\mathrm{a}}_{1}-{\mathrm{a}}_{2})$$

(18b)

$${\mathrm{a}}_{1}=\text{exp}(-2.7);\phantom{\rule{thinmathspace}{0ex}}{\mathrm{a}}_{2}=\text{exp}(-1.8)$$

(18c)

In calculating binding energies the Hartree-Fock (HF) energies are corrected for BSSE using the counterpoise method.

The target hydrogen-bonded distances were taken directly from the LMP2/cc-pVTZ(-f) energy minimizations.

The bond stretching and angle bending energies were obtained according to Equation 13 and Equation 14. The values of the parameters were taken directly from the OPLS-AA.^{8} The reason is that both OPLS-AA and our PFF disregard 1,2- (covalent bond) and 1,3- (covalent angle) interactions, therefore, the energetics related to these arrangements is the same with the both techniques.

Finally, the torsional term was computed as shown in Equation 15. We performed constrained geometry optimizations with dihedral angles fixed at their key positions and optimized the parameters *V* to minimize deviations of the relative energies from the LMP2/cc-pVTZ(-f) quantum mechanical results. For instance, for the ethane molecule, constraining the H-C-C-H dihedral to 60° and to 0° yielded relative quantum mechanical energies of 0 and 2.7762 kcal/mol, respectively. We have optimized the *V _{3}* term of this torsion to achieve a close agreement with these numbers (0 and 2.7751 kcal/mol). The

The final tuning of the force field parameters was achieved by reproducing experimental values of heats of vaporization and molecular volumes of the pure molecular liquids involved. Each simulation was run with the POSSIM software and included 216 molecules in a cubic cell with periodic boundary conditions. The NPT ensemble (constant temperature, pressure and the number of molecules) was employed. Methanol and water were simulated at 25 °C. The NMA liquid had a temperature of 100°C. The hydrocarbons were modeled at their boiling temperatures: −161.49 °C for methane, −88.63 °C for ethane, −42.1 °C for propane and −0.5 °C for butane. The calculations were carried out with the Monte Carlo technique, and the heats of vaporization were calculated according to Equation 19:

$$\mathrm{\Delta}{H}_{\mathit{\text{vap}}}=E(\mathit{\text{gas}})-E(\mathit{\text{liq}})+\mathit{\text{RT}}$$

(19)

The difference between the energy for one molecule in the gas-phase and in the condensed state was augmented by the *RT* term to account for the Δ*(PV)* part of the enthalpy, in the assumption that the vapor obeys the ideal gas law, and the molecular volume of the liquid can be neglected compared to that of the gas. In all the calculations, at least 1 ×10^{6} Monte Carlo configurations of averaging were followed by no less than 5 × 10^{6} configurations of averaging for the thermodynamic properties. Elements of the dipole-dipole interaction tensor in Equation 5 were set to zero for distances beyond 7.0 Å. The other intermolecular interactions were cut off at 8.0 Å for water, 10.0 Å for methane and methanol and 11.0 Å for all ethane, propane, butane and NMA. The charge-charge interactions were switched off smoothly over the last 0.5 Å. The standard correction for the neglected Lennard-Jones energies beyond the cutoff distances was applied.

Calculating the free energies of hydration were not in any way a part of the parameter fitting procedure, but rather a test of the parameters produced as discussed above. Therefore, we believe that the high quality of the results reflects the genuinely adequate underlying physical model.

The thermodynamic cycle used to calculate relative hydration energies between species A and B is shown in Figure 2.

From this cycle, the relative free energy of hydration:

$$\mathrm{\Delta}\mathrm{\Delta}{G}_{\mathit{\text{hyd}}}=\mathrm{\Delta}{G}_{\mathit{\text{hyd}}}(\mathrm{B})-\mathrm{\Delta}{G}_{\mathit{\text{hyd}}}(\mathrm{A})=\mathrm{\Delta}G(\mathrm{A}\to \mathrm{B},\text{liq})-\mathrm{\Delta}G(\mathrm{A}\to \mathrm{B},\text{gas})$$

(20)

The statistical perturbation theory was used to calculate the differences of free energies between solvated and gas-phase species A and B (Δ*G*(A→B, liq) and Δ*G*(A→B, gas), respectively). When B was set to nothing, the absolute free energy of hydration of the species A was obtained. This was done to methane to anchor the other hydration free energies and to obtain their absolute values.

To calculate the Δ*G*(A→B, liq) and Δ*G*(A→B, gas), the standard statistical perturbation theory procedure was used. Differences between atoms of molecules A and B were switched on according to a parameter 0 < λ < 1, with λ = 0 corresponding to A, and λ = 1 to B. Then the interval from 0 to 1 was divided into a number of sub-intervals, and for each point between two intervals a corresponding mixture of molecules A and B was created. The difference in free energies between systems corresponding to such points *i* and *j* was calculated according to Equation 21:^{17}

$$\mathrm{\Delta}G(i\to j)=-\mathit{\text{RT}}\phantom{\rule{thinmathspace}{0ex}}\text{ln}(<\text{exp}[(-{E}_{j}-{E}_{i})/\mathit{\text{RT}}]{>}_{i})$$

(21)

Here the brackets <…>_{i} signify averaging of the value inside the brackets over the configurational space of the mixed system at point *i*, and *E _{i}* and

The averaging was performed with Monte Carlo calculations for a single solute molecule in a water box, thus, corresponding to infinitely dilute solutions. The simulations proceeded as described in the previous subsection, except that the number of water molecules equal to the number of non-hydrogen atoms in the solute were removed (for example, for the methanol to ethane perturbation, the number of water molecules was equal to 214 instead of the pure water box of 216).

As described above, fitting polarizabilities to the three-body energies and charges to the interaction energies with dipolar probes were the first two steps of the POSSIM force field production. While the further fine tuning did lead to some adjustments, we still view reproducing the quantum mechanical three- and two-body energies as an important part of the force field validation. Listed in Table 1 and Table 2 are three-body and two-body energies resulting from the final parameter sets. The following conclusions can be drawn from these data. The RMS deviations of the three-body energies are within ca. 0.37 kcal/mol, with the largest deviation corresponding to the case of N-methylacetamide. This result is not surprising, given that parameter fitting for amides is by no means a straightforward business. For example, we had to build three separate parameter sets for the formamide, acetamide and N-methylacetamide while producing the previous version of the polarizable force field (PFF0)^{9}^{,}^{12}. The acetamide parameters were ultimately used for the protein backbone.^{9} Likewise, the two-body energies display a similar trend with the maximum RMS deviation (for NMA) being ca. 1.73 kcal/mol. While the absolute value of this error seems to be large, it constitutes only about 11.5% of the two-body energy itself. It is very likely that this error is, in fact, not much greater than that resulting from the imperfections of the quantum mechanical calculations as such.

As an added test, we have compared the POSSIM dipole moments with their experimental values in Table 3. It can be seen that the results are rather close and do not suffer from the overpolarization which is characteristic for the non-polarizable fixed-charges force field. For example, the TIP4P monomer dipole moment is 2.18 Debye. This overpolarization is necessary to account for the increased dipole moment in polar media (such as bulk water), but it creates an unphysical situation in the gas-phase. This problem is resolved by applying the polarization formalism.

It can be noticed that results for methane, ethane, propane and butane are not present in the Table 1 – Table 3. The non-bonded parameters (including the electrostatics) for these systems were produced in a different fashion. Following the approach employed in creating the previous generation of the polarizable force field,^{12} we set the electrostatic charges on alkane atoms to be the same as in the OPLS-AA. That is, each aliphatic hydrogen had a charge of 0.06 electrons, and the sp^{3} carbons were charged by −0.24, −0.18 or −0.12 electron units, depending on the number of hydrogens attached to the carbon. Since the -CH_{n}- groups are essentially spherically symmetric, the magnitudes of the charges do not make a significant difference. This has been shown previously by simulating pure liquid and hydrated saturated hydrocarbons.^{19} Furthermore, the sigma and epsilon Lennard-Jones parameters were also set to their OPLS-AA values, 3.5 Å and 0.066 kcal/mol for carbons, 2.5 Å and 0.030 kcal/mol for hydrogens. Finally, the isotropic polarizabilities for the carbon atoms in the POSSIM PFF are the same as had been previously found to perform well in the PFF0.^{12} For these atoms, the inverse polarizability α^{−1} = 0.5069 Å^{−3}. No polarizabilities were assigned to the aliphatic hydrogen atoms.

Therefore, the non-bonded parameters for the alkanes were not refitted. However, the torsional coefficients for ethane, propane and butane H-C-C-H, H-C-C-C and C-C-C-C torsions were refitted to reproduce quantum mechanical energy profiles.

Torsional energy coefficients have been fitted to reproduce LMP2/cc-pVTZ(-f) energy profiles as described in the Methods section. The results of performing this task are shown in Table 4. Our original goal was to achieve a ca. 0.1 kcal/mol agreement with the quantum mechanical results, but the final accuracy was much better than that. The reason is that the molecules which we considered contained no coupled torsions (such as, for example, protein backbone and ψ). This was true even for the NMA molecule. Therefore, we were dealing with one-dimensional torsional profiles, and adjusting the three Fourier coefficient for each torsion was enough for achieving the high level of accuracy.

The next step in our force field development, and, perhaps the first truly significant test of the new force field, was reproducing gas-phase binding energies and geometries of the complexes. Given in Table 5 and Table 6 are the results of comparing our polarizable force field application with quantum mechanical and experimental data. As described in the previous section, the quantum mechanical results, unless otherwise noted, are taken from applying our previously developed extrapolation technique utilizing calculations with LMP2/cc-pVTZ(-f) and LMP2/cc-pVQZ basis sets.

Table 5 contains results for the aliphatic hydrocarbons (as represented by methane), water and methanol. Unlike in generating the previous version of the polarizable force field (PFF0^{12}), we have calculated energies for both homodimer of methanol and its heterodimer with water. Moreover, both methanol-dimers with water in which the latter serves as an electron donor and acceptor have been considered. This was also our routine when dealing with the NMA. We plan to follow this pattern in the future development of the POSSIM force field. Methane-water dimers were not considered, as no true hydrogen bond is formed in this case.

As demonstrated by the results in Table 5, the dimerization energies for all the systems were within 0.5 kcal/mol of their quantum mechanical counterparts. For the methanol-methanol dimer, the agreement is much better than that. Water-water interaction energy lower than −5 kcal/mol has been suggested previously, therefore, the relatively high error in this case is probably not a reason for concern. The distances between the heavy atoms are within ca. 0.1 Å of the quantum mechanical results. Overall, the developed PFF performs well in reproducing these gas-phase dimerization properties.

It can also be seen that the POSSIM PFF is showing a slight improvement over the PFF0. And the improvement with comparison to the fixed-charges OPLS-AA is quite dramatic. For example, as shown in Table 5, the water-water dimerization energy is overestimated by almost 2 kcal/mol if the OPLS-AA model is used. This is natural, as fixed-charges models need to overpolarize individual molecules in order to capture the overall polarization increase in the bulk polar media (such as, for example, bulk water). Therefore, our fast second-order polarization technique does not require any sacrifice of computational accuracy as compared to the full-scale polarizable model (PFF0) and outperforms the fixed-charges one.

Given in Table 6 are results of simulating NMA dimmers in the gas-phase. Fitting parameters for the NMA molecule is generally a more complex task. One reason for that is that quantum mechanical data for such dimerization energies are often less reliable. Shown in the table are just two values of the trans-NMA dimerization energy. And the full range of this piece of data which can be found in the literature is actually considerably wider. Perhaps this is why some authors choose to simply ignore the NMA gas-phase dimerization energy when producing or reparameterizing a force field.^{25} We took a middle road approach in regard to this matter. While we did use the quantum mechanical value for the NMA(t) – NMA(t) dimer, we did not pose as strong restrictions on the PFF results as we did for the other systems. We considered an error of 0.64 kcal/mol to be acceptable, given the uncertainty of the quantum mechanical results themselves. At the same time, we adhered to the stronger criteria when reproducing geometries of the complexes, with the error being within ca. 0.1 – 0.15 Å range.

The new POSSIM PFF performs better in reproducing the NMA(t) and NMA- water dimmers than the both force fields used for comparison, the fixed-charges OPLS-AA and the previously generated PFF0. While the OPLS-AA NMA(t) dimer has an energy which is slightly closer to the QM value than the POSSIM PFF, this is achieved at the cost of underestimating the N…O distance by about 0.16 Å versus the 0.05 Å error in the new PFF. As far as the PFF0 is concerned, it provides much better values for both the energy and the distance for the NMA(c) dimer, but significantly underestimates both the distance and the magnitude of the energy for the dimer of the NMA in its trans form. The reason is that we considered only the cis-NMA dimer when the PFF0 was developed, since its dimerization energy is stronger than for the NMA(t). At the same time, the trans-NMA is much less abundant than the cis-form, and it is the NMA(t) which forms the building block of protein and peptide backbones. This is why we choose to use the NMA(t) dimer as the fitting target in this work, and our results definitely show an improvement over the PFF0 in this respect.

Calculated heats of vaporization and molecular density are shown in Table 7. It follows from the values of the average errors that the overall performance of the POSSIM PFF is superior to both the OPLS-AA and PFF0. The average PFF deviations of the heats of vaporization is 0.083 kcal/mol versus the 0.0.129 kcal/mol for the OPLS-AA and 0.240 kcal/mol for the PFF0. The average error in the molecular volume is 1.485 Å^{3}, which is a noticeable improvement, as compared to the PFF0, and is slightly better than for the OPLS-AA. Moreover, the worst (although still acceptable) results for the new PFF are obtained for methane and ethane, which are probably not the most relevant pure liquids considered, given their very low boiling temperatures of −161.49 °C and −88.63 °C, respectively. At the same time, the density of the NMA was improved very significantly as compared to the PFF0, and the NMA molecule is extremely important as the building block for protein backbones. Overall, all three models perform well in simulating the pure liquids involved. Therefore, we have clearly shown that our second-order approximation for calculating induced electrostatic dipoles is completely adequate, as is it capable of reproducing both gas-phase and liquid properties at a good level of accuracy.

Presented in this subsection are results of calculating relative and absolute free energies of hydration for methane, ethane and methanol. In contrast with the work described above, this part of the project was carried out as a test and not parameterization, thus, we did not refit any parameters to obtain good results. Therefore, these calculations can serve as a test of physical robustness of the model. The successful results presented here have a double value of validating both the force field parameters and the overall methodological and POSSIM software robustness in calculating free energy changes with the statistical perturbation theory.

The calculated values of the free energies are shown in Table 8. We carried out three sets of these calculations: transforming methanol into ethane (by converting the -OH group into a methyl), shrinking the methyl group into an aliphatic hydrogen to mutate ethane into methane and, finally, disappearing the methane molecule altogether by converting all the atoms into dummy ones. This last step permits us to anchor all the results into the energy of solvation of a dummy atom (*i. e.* zero), thus, we can obtain the absolute hydration energies for all the species involved.

According to the data in Table 8, the agreement of our results with experiment is excellent. The hydration energy of methane is correct within 0.115 kcal/mol. The non-obvious trend of the slight hydration energy drop between methane and ethane is reproduced correctly, even though it is exaggerated by about 0.15 kcal/mol (which actually leads to the absolute hydration energy of ethane being in a better agreement with the experiment). The difference in hydration energies between methanol and ethane are correct within ca. 0.21 kcal/mol. Overall, the absolute values of these energies are within 0.13 kcal/mol of their experimental counterparts. It is important that this agreement was achieved without any specific fitting. These results represent a test of the parameters produced as described above, therefore, they attest to the robustness of the physical model and the POSSIM software.

We have utilized a previously suggested fast polarization formalism to create a software package named POSSIM and produced polarizable force field parameters for several molecular systems relevant in organic and biological simulations. The agreement with gas-phase and liquid experimental data has been found to be very good. Moreover, we tested POSSIM by calculating absolute free energies of hydration of methane, ethane and methanol and determined them to be within 0.13 kcal/mol of their experimental counterparts. We believe that this verifies the robustness of the parameters and software. We view this work as the first step in creating of an extensive set of force field parameters to be used in organic and biophysical simulations, including modeling of proteins and protein-ligand complexes. Furthermore, we have been able to successfully use a procedure for utilizing both quantum mechanical and experimental data as fitting targets. We hope our results will further prove the importance of treating the electrostatic polarization explicitly, when building empirical force fields, and will contribute to making polarizable calculations more affordable computationally.

The project described was supported by Grant Number R01GM074624 from the National Institute of General Medical Sciences. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of General Medical Sciences or the national Institutes of Health. The authors express gratitude to Schrödinger, LLC for the Jaguar and PFF software.

**Supporting Information Available**

POSSIM force filed parameters. This material is available free of charge via the Internet at http://pubs.acs.org.

1.
See, for example
Caldwell JW, Kollman PA. J. Am. Chem. Soc. 1995;117:4177–4178.

Jiao D, Zhang JJ, Duke RE, Li GH, Schneiders MJ, Ren PY. J. Comp. Chem. 2009;30:1701–1711. [PMC free article] [PubMed]

Jiao D, Zhang JJ, Duke RE, Li GH, Schneiders MJ, Ren PY. J. Comp. Chem. 2009;30:1701–1711. [PMC free article] [PubMed]

2. **(a) **MacDermaid CM, Kaminski GA. J. Phys. Chem. B. 2007;111:9036–9044. [PubMed]** (b) **Click TH, Kaminski GA. J. Phys. Chem. B. 2009;113:7844–7850. [PubMed]

3. Veluraja K, Margulis CJ. J. Biomol. Struct. & Dynamics. 2005;23:101–111. [PubMed]

4. Ji C, Mei Y, Zhang JZH. Biophys. J. 2008;95:1080–1088. [PubMed]

5. Kaminski GA, Stern HA, Berne BJ, Friesner RA, Cao YX, Murphy RB, Zhou R, Halgren T. J. Comput. Chem. 2002;23:1515–1531. [PMC free article] [PubMed]

6. Rick SW, Stuart SJ, Berne BJ. J. Chem. Phys. 1994;101:6141–6156.

7. Kaminski GA, Zhou R, Friesner RA. J. Comp. Chem. 2003;24:267–276. [PubMed]

8. Jorgensen WL, Maxwell DS, Tirado-Rives J. J. Am. Chem. Soc. 1996;118:11225–11236.

9. Maple JR, Cao YX, Damm W, Halgren TA, Kaminski GA, Zhang LY, Friesner RA. J. Chem. Theory Comput. 2005;1:694–715. [PubMed]

10. **(a) **Straatsma TP, McCammon JA. Chem. Phys. Lett. 1990;167:252–254.** (b) **Straatsma TP, McCammon JA. Chem. Phys. Lett. 1991;177:433–440.** (c) **Roux B. Chem. Phys. Lett. 1993;212:231–240.

11. Berendsen HJC, Grigera JR, Straatsma TP. J. Phys. Chem. 1987;91:6269–6271.

12. Kaminski GA, Stern HA, Berne BJ, Friesner RA. J. Phys. Chem. A. 2004;108:621–627.

13. **(a) **Becke AD. Phys. Rev. A. 1988;38:3098–3100. [PubMed]** (b) **Lee C, Yang W, Parr RG. Phys. Rev. B. 1988;37:785–789. [PubMed]

14. **(a) **Jaguar v3.5. Portland, OR: Schrödinger, Inc.; 1998. ** (b) **Jaguar v4.2. Portland, OR: Schrödinger, Inc.; 2000.

15. Kaminski GA, Maple JR, Murphy RB, Braden D, Friesner RA. J. Chem. Theory Comput. 2005;1:248–254. [PubMed]

16. Dunning TH. J. Chem. Phys. 1989;90:1007–1023.

17. Zwanzig RW. J. Chem. Phys. 1954;22:1420–1426.

18. Caldwell JW, Kollman PA. J. Phys. Chem. 1995;99:6208–6219.

19. Kaminski GA, Duffy EM, Matsui T, Jergensen WL. J. Phys. Chem. 1994;98:13077–13082.

20. Mattsson AE, Mattsson TR. J. Chem Theory Comput. 2009;5:887–894. [PubMed]

21. Kaminski GA. J. Phys. Chem. B. 2005;109:5884–5890. [PubMed]

22. Tsuzuki S, Uchimaru T, Matsumura K, Mikami M, Tanabe K. J. Chem. Phys. 1999;110:11906–11910.

23. Takatani T, Sherrill CD. Phys. Chem. Chem. Phys. 2007;9:6106–6114. [PubMed]

24. Dixon DA, Dobbs KD, Valentini JJ. J. Phys. Chem. 1994;98:13435–13439.

25. Xie W, Pu J, MacKerell AD, Gao J. J. Chem. Theory Comput. 2007;3:1878–1889. [PMC free article] [PubMed]

26. Gallicchio E, Zhang LY, Levy RM. J. Comp. Chem. 2002;23:517–529. [PubMed]

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |