|Home | About | Journals | Submit | Contact Us | Français|
A method is presented to explicitly represent the first solvation shell in continuum solvation calculations. Initial solvation shell geometries were generated with classical molecular dynamics simulations. Clusters consisting of solute and 5 solvent molecules were fully relaxed in quantum mechanical calculations. The free energy of solvation of the solute was calculated from the free energy of formation of the cluster and the solvation free energy of the cluster calculated with continuum solvation models. The method has been implemented with two continuum solvation models, a Poisson-Boltzmann model and the IEF-PCM model. Calculations were carried out for a set of 60 ionic species. Implemented with the Poisson-Boltzmann model the method gave an unsigned average error of 2.1 kcal/mol and a RMSD of 2.6 kcal/mol for anions, for cations the unsigned average error was 2.8 kcal/mol and the RMSD 3.9 kcal/mol. Similar results were obtained with the IEF-PCM model.
Solvation phenomena play an important part in many areas of chemistry and there has for some time been a significant interest among chemists in predicting solvation effects. One property of particular importance is the free energy of solvation. Accurate calculation of this property is essential in predicting chemical reactions in solution.
In the last decades theoretical and computational chemists have put considerable efforts into the development of methods to calculate the free energy of solvation. Much of this work is covered in reviews by Cramer and Truhlar,1 Orozco and Javier Luque 2 and Tomasi, Menucci and Cammi.3 Present solvation models tend to perform well in predicting the solvation energy of small neutral species, but are less successful for ionic species.
Continuum solvation models represent an appealing approach to calculating solvation free energies. Compared to free energy calculations with explicit solvent models, continuum model calculations are both easy to set up and computationally inexpensive. The main shortcoming of continuum models is probably their failure to account for chemical interactions between a solute and solvent molecules in the first solvation shell. Such interactions are expected to be of particular importance for ionic solutes.4-7 Continuum models are also not able to capture effects of charge transfer between solute and solvent, an effect that may play a significant role in solvation.8,9
It has repeatedly been proposed that this shortcoming can be overcome by adding explicit solvent molecules in the continuum model calculations.1,4,5,9-19 One approach is quasi-chemical theory, where the closest solute molecules are regarded as ligands binding to the solute.11 Other approaches involve including a limited number of solvent molecules in continuum solvation calculation, with varying thermodynamic cycles.4-7 Such models have been referred to by varying names such as supermolecule method or Cluster-Continuum method.4
In the present work a new method is proposed for adding explicit solvent molecules in continuum model calculations. The focus of the present work is to produce a robust method that may be applied to a wide variety of ionic species. The method is fully scripted, only requiring solute geometry and charge as input.
In the thermodynamic cycle in Scheme 1 we show that the free energy of solvation of a given solute can be calculated from the solvation free energies of solute-solvent clusters and the free energy of solute-solvent cluster formation. If the solvent and solute-solvent clusters have the structure of bulk solvent, the free energy of cluster formation in solution is 0. The reaction in solution can in this case be regarded as exchange of implicit and explicit solvent molecules.
It follows that the free energy of solvation of the solute can be calculated from the free energy of cluster formation in the gas phase and the free energy of solvation of the clusters:
Experimental and calculated solvation energies are usually reported with a standard state of 1 mol/Liter (indicated with a * in the present work) both for gas and liquid state. Adopting this standard state the following equation is obtained:
The last term in equation 2 is a correction term for the free-energy change of 1 mol of (H2O n gas from 55.34/n M liquid to 1 mol/liter. The correction from ΔGclust at a standard state of 1 atmosphere to ΔG*clust is -RT ln(R'T), where R' is 0.082 053 K-1.4 A discussion of the standard state corrections is given by Bryantsev et al.7 Equation 2 is the same as was applied by Bryantsev et al7 in their recent work. This equation also gives the same relative solvation energies as the method Pliego and Riveros4 proposed in their cluster-continuum model. We have however chosen to calculate the energies relative to pure solvent clusters, while Pliego and Riveros calculated energies relative to the experimentally determined vaporization free energy of the solvent.
It should be noted that the accuracy of calculations based on equation 1 will depend on how close the utilized cluster geometries are to the real structure of solvation shells in bulk water.
In this work we extract the geometry of the solvation shell from molecular simulations of the solute in bulk solvent and fully optimize this solvation shell geometry with quantum mechanical calculations. One could also do single-point calculations on the simulation geometries, but our initial work suggested that the present molecular dynamics geometries (from classical simulations) are not good enough for such an approach to yield accurate solvation free energies.
The clusters consisted of the solute and the closest 5 water molecules. In selecting the number of solvent molecules which to include in the solvation shell, there are several considerations. We wish to include enough solvent molecules to describe all strong solute-solvent interactions. On the other hand adding more solvent molecules adds to the cost of calculations. Since we are fully relaxing the solvation shell in vacuum there is also a risk that the structure will deviate more from the structure of bulk solvent as more solvent molecules are added. The present method may have systematic errors that depend on the number of solvent molecules in the calculation (an issue to which we will return in the discussion), we therefore believe that the most reliable results are obtained by using a constant number of solvent molecules for all solutes. For the small solutes studied in the present work, we believe 5 solvent molecules to be enough to represent strong solute-solvent interactions.
We wished to include the solvent molecules that are bound most strongly to the solute in the cluster. The solvent molecules were therefore selected based on their distance to hydrophile atoms in the solute. Hydrophile atoms are in the present work defined as any solute atoms that are not carbon atoms or hydrogen atom bound to a carbon atom.
Full optimization of solute and the solute-solvent clusters were carried out in vacuum at the HF/6-31+G(d) level. Pure solvent clusters were optimized at the same level of theory. The entropy was determined from vibrational frequency calculations on the optimized clusters at the same level of theory at a temperature of 298K.
The interactions between a solute and solvent molecules are in general not well represented by a single geometry. In the present work we have therefore chosen to carry out calculations with an ensemble of clusters. 100 different cluster geometries were extracted from molecular dynamics simulations for each solute. The free energies we report are linear averages from calculations on the full set of clusters.
We have chosen to implement the method with two continuum solvation models, this is in part done to test to what extent the method performance depends on the continuum model. The continuum models employed for calculations of cluster solvation energies were the Poisson-Boltzmann based model in the DIVCON code20 and the IEF-PCM 21 model in Gaussian 03.22 The continuum solvation calculations were carried out as single-point calculations on the optimized HF/6-31+G(d) clusters. The Poisson-Boltzmann model calculations were carried out at AM1 level (the model is not implemented at HF level), while the IEF-PCM calculations were carried out at HF/6-31+G(d) level. The IEF-PCM calculations were carried out with default settings in Gaussian 03.22
A summary of the Poisson-Boltzmann model is given in the supporting information. All quantum mechanical calculations were carried out in Gaussian 0322 and all simulations were carried out using Sander from the AMBER 9 suite.23 Details of the molecular dynamics simulations are given in the supporting information.
In Figure 1 and Figure 2 are shown examples of optimized clusters produced with the present method. Figure 1 shows a case of the solvent molecules interacting with all parts of the solute, creating a complete solvation shell. In other cases (such as shown in Figure 2) the solvent molecules will cluster around hydrophilic groups in the solute, leaving hydrophobic groups exposed.
Table 1 shows the calculated solvation energies, cluster formation energies, entropies and cluster solvation energies for a set of 30 anionic compounds. In Table 2 are shown data for cations presented in the same form. The set of ions covers many important functional groups in chemistry such as alcohols, amines and carboxylic acids. This set also includes all ions studied by Pliego and Riveros.4 The cluster solvation energies in the Tables are calculated with the Poisson-Boltzmann continuum solvation model, results with the IEF-PCM solvation model are given in the supporting information.
All calculated solvation energies were shifted to remove the average error relative to experimental data. For anionic species the shift is -2.41 kcal/mol, while for cationic species the shift is -5.60 kcal/mol. This means that the model is only used to predict relative free energies of solvation, and not their absolute values.
The unsigned average error in the calculated solvation free energies for anions is 2.1 kcal/mol, while the root mean square deviation (RMSD) is 2.6 kcal/mol. For cations the unsigned average error in the calculated solvation free energies is 2.8 kcal/mol and the RMSD is 3.9 kcal/mol. It can be seen from the data in Table 1 and and22 that the energy of cluster formation is the most important term in determining differences in solvation energies. The clusters solvation energies do also vary considerably.
Calculations with the IEF-PCM continuum model gave similar results. For anions the unsigned average error is 1.6 kcal/mol and the RMSD 2.1 kcal/mol. For cations the unsigned average error is 3.2 kcal/mol and the RMSD 4.5 kcal/mol. The average shift of calculated relative to experimental data for anions is -12.0 kcal/mol. For cations this average shift is -5.5 kcal/mol. The two continuum models predict similar absolute solvation energies for anions, for cations the IEF-PCM solvation energies are however on average 7.7 kcal/mol lower than the solvation energies from the Poisson-Boltzmann model. The two solvation models also give substantially different solvation energies for the pure solvent clusters. The Poisson-Boltzmann model gives an average value of -15.7 kcal/mol, while the IEF-PCM model gives an average value of -23.6 kcal/mol. Similar differences between continuum models in solvent cluster calculations were reported by Bryantsev et. al.7
For NH2- proton exchange occurred between the solute and solvent molecules. This problem was overcome by constraining all solvent O-H bond lengths. A similar issue with proton exchange was found for protonated acetone and dimethylether. For clusters of these solutes the O-H bond in the solute was constrained. These three solutes were not included in calculations of systematic errors between calculated and experimental data.
In the presented results thermal corrections to the energy and zero-point energy calculations were not included. We found that including these effects would result in somewhat worse results. Inclusion of the thermal correction and the zero-point energy would have formally been the more correct choice. The omission of these corrections can be regarded as an implicit form of parameter fitting in the present method. Results with these terms included are shown in the supporting information.
The statistical uncertainty of the method was estimated by carrying out calculations at the semiempirical AM1 level of theory.27 10 sets of calculations were performed for the cluster formation energies (based on a total of 1000 different cluster geometries). This gave a standard deviation in calculated cluster formation energies of around 1 kcal/mol.
Some of the cluster geometry optimizations failed, resulting in the breakdown of the geometry of the solute or solvent molecules. In some cases the vibration frequency calculations also failed. These clusters were omitted in the calculations of solvation energies. For most solutes there were 10-20 failed calculations. A likely cause of such failed calculations is poor initial geometries obtained from the molecular dynamics simulations.
The present results are to our knowledge among the best reported relative solvation energies for ions. The error bars are not too far from the error bars in the experimental data suggested by Kelly, Cramer and Truhlar (+/- 2 to 3 kcal/mol).25 The size of the study set also gives confidence that the method is robust. Much of the work with models combining continuum models with explicit solvent representation has been focused on solutes with no hydrophobic groups.7,13,14,19 For such solutes optimized clusters will form complete solvation shells and such systems are likely to be ideally suited for models with explicit representations of the solvation shell in continuum. The present work does however show that even for systems with hydrophobic groups where a complete solvation shell is not formed, good results can be obtained.
It is also noteworthy that methods combining continuum models with explicit solvent representation, unlike a conventional continuum model or classical simulation, to some extent takes account of all contributions to the solvation energy. Long-range electrostatics, solute polarization and cavitation energy contributions are accounted for in the continuum model, while solute-solvent hydrogen bonding and entropy contributions are accounted for in the calculation of cluster formation energy.
The quality of the results obtained is likely to depend strongly on the quality of the quantum mechanical method employed. The continuum solvation model is also likely to have a significant effect on the quality of the results. The entropy term contributes less to relative solvation energies, but may also have a significant effect on the results. We do make the approximation that the vibration entropy can be calculated as if the cluster was a single molecule. Since the clusters are completely relaxed in quantum mechanical calculations the initial molecular dynamics geometry probably does not affect the final results. As noted in the Theory section the results also depend on how closely the optimized cluster resembles the solvation shell geometry in bulk liquid. This may especially be an issue for solutes with hydrophobic groups that are not forming complete solvation shells.
For NH2-, acetone and dimethylether calculations were carried out with constrained bonds. The errors in terms of relative solvation free energies for these species are all somewhat large. While this approach seems to work reasonably well for solutes that might otherwise undergo some form of proton transfer, it does appear to lead to somewhat less accurate results. If the three solutes with constrained bonds were to be removed from the dataset the unsigned average error would fall to 2.0 kcal/mol for anions and 2.2 kcal/mol for cations (with the Poisson-Boltzmann continuum model). With the IEF-PCM continuum model the unsigned average errors improve to 1.5 kcal/mol (anions) and 2.6 kcal/mol (cations).
The calculated solvation free energies for purely hydrophilic solutes also have relatively large errors. For these solutes the calculated solvation free energies are more negative than the experimental values. These solutes such as H+ and NH4+ form ideal solvation shells. The relative differences in calculated solvation free energies between entirely hydrophilic solutes and solutes with hydrophobic groups, suggests that the model performance to some extent depends on the structure of the solvation shell formed.
In the present model hydrophobic groups will not interact with the explicit solvation shell. It is the continuum model that accounts for the hydrophobic group's contribution of the free energy of solvation. Relatively large errors for species with large hydrophobic groups such as benzyl alcohol are probably due to the performance of the continuum model.
The present work is focused on obtaining relative solvation energies. Predictions of absolute solvation energies can be made by correcting with the correction factors reported in the present work.
It is however also of interest to look at how close the absolute solvation energies produced by the model are to experimental data. The thermodynamic cycle we employ should in principle yield absolute values of the solvation energies. Models similar in nature to the present work have also been used to predict absolute solvation energies.7,13,14,17-19
In looking at the absolute values predicted by the present model, we should at the formally correct calculations including the zero-point energy and thermal corrections to the energy. If this term is added the average shift for anions is -5.7 kcal/mol and for cations it is -7.3 kcal/mol with the Poisson-Boltzmann continuum model. If we had looked at the average shift for hydrophilic species only, the shift would be -2.6 kcal/mol for anions and -3.3 kcal/mol for cations. With the IEF-PCM continuum solvation model the average shift for anions is -15.3 kcal/mol and for cations -7.2 kcal/mol. For hydrophilic species only, the shift was in this case -12.6 kcal/mol for anions and -2.6 kcal/mol for cations.
These results suggest that part of the difference between absolute values of experimental and calculated solvation free energies are systematic errors in the calculation of hydrophobic species. It may be that the continuum solvation models underestimate the contribution to the free energy of solvation of hydrophobic groups.
The absolute solvation energies do depend to a large extent on the continuum solvation model, and the two continuum models we utilize produce quite different results. Asthagiri et al13 have also reported that the absolute values of long-range electrostatic terms depend significantly on the calculation method. Large differences between continuum solvation in calculations of solvation free energies of solute-solvent clusters have also been observed by Bryantsev et al.7 The large uncertainty in the absolute energy values produced by continuum models, suggest that contributions from these calculations may be the main cause of the systematic difference observed between calculated and experimental values. Hartree-Fock level calculations can also not be expected to yield accurate absolute values of binding energies. The full relaxation of the solvation-shell in vacuum may also lead to bonding that deviates systematically from that of bulk solvent structure. There may also be systematic errors from the approach taken to calculate the entropy terms. It should furthermore be noted that there is significant uncertainty in the absolute value of the experimental solvation free energies, this is due to the uncertainty in the solvation energy of the proton itself. Kelly, Cramer and Truhlar suggest an uncertainty of no less than 2 kcal/mol for the free energy of solvation of the proton.16
As noted in the Methods section we believe any such systematic error in the present method is likely to depend on the number of solvent molecules included in the clusters. Since we are using a constant number of solvent molecules any such systematic error should not affect the calculated relative solvation free energies.
A key issue in a method to explicitly represent solute-solvent interactions is the number of solvent molecules which to include. The number we employ in the present work is higher than in some other studies, it can for example be noted that Kelly, Cramer and Truhlar28 suggested adding a single solvent molecule. We believe solvent molecules should be added to represent all strong solute-solvent interactions. Some ions such as Cl- will interact strongly with at least 5 water molecules. In calculations with a similar method Bryantsev et al7 reported convergence for H+ with 14 water molecules. We did however find that not all solutes would bind directly to 5 solvent molecules, suggesting that the optimal number of solvent molecules will vary from ion to ion. The risk with adding more solvent molecules than binds to the solute is that when the cluster is optimized in vacuum the geometry may come to deviate more from that of a solvation shell in bulk liquid. The quality of the present results are however very encouraging, suggesting that 5 solvent molecules is a reasonable choice for small ionic species. While there is some uncertainty as to the optimal number of solvent molecules to include in the calculations, it would also seem clear that for ions including solvent molecules produces significantly better results than a continuum model by itself.
Schemes to automatically select the number of solvent molecules to include in the cluster could be based on binding energies or geometry. A series of calculations could for example be made to determine how many solvent molecules bind strongly to the solute. Geometry criteria could on the other hand be based on the distance between solute and solvent molecules. Such schemes would result in a more elaborate method, but the calculations could probably be carried out at a low level of theory and need not be time-consuming.
Bryantsev et al7 have in recent work looked at adding solvent molecules until the calculated solvation energies converge, i. e. until results are not affected by the addition of further solvent molecules. This would appear to be a rigorous approach, but it should be noted that the method involves rather large calculations.
The present work was greatly influenced by the work of Pliego and Riveros4 and the present method is similar to their Cluster-Continuum method. The overall quality of their results appears to be similar to that of the present method. Using the same procedure as we employ in Table 1 and Table 2 and comparing against the same experimental data unsigned average errors of 1.8 kcal/mol (anions) and 2.4 kcal/mol (cations) are obtained. The Pliego and Riveros dataset is however small (14 ions), and the statistics is therefore more uncertain than in the present work.
The Cluster-Continuum method and the present method differ in the number of clusters in the calculation, number of solvent molecules included and the level of quantum mechanical calculations. The work also differs in the continuum solvation models employed. Pliego and Riveros4 propose a kind of variational principle to determine the number of solvent molecules included in their clusters. As a result of this rule Pliego and Riveros employ different number of solvent molecules for different solutes. We suspect that this may introduce errors in the calculations. In the same way as the present method, the Cluster-Continuum method may have systematic errors depending on the number of solvent molecules in the clusters. The quality of Pliego and Riveros results, does however suggest that any such systematic error is likely to be small. We are also not confident that their proposed variational principle will yield the best results. We believe that when enough solvent molecules are added to saturate the solvation shell, the calculated solvation energies should converge. In other words; once enough solvent molecules have been added to represent all strong solute-solvent interactions addition of further solvent molecules should not significantly affect the results.
The level of theory and basis set employed in the work of Pliego and Riveros4 are better than in the present work. The present method can also be implemented at higher levels of theory, and we believe results are likely to improve with the quality of the quantum mechanical method.
The present method is based on ensemble averages, while Pliego and Riveros4 did calculations on a single cluster. We have done some initial calculations suggesting that comparable results are obtained with ensemble averages and calculations on the single most stable cluster in the ensemble. This would imply that similar results can be obtained from methods based on ensemble average and procedures to find the global energy minimum.
The recently developed SM6 continuum model has reported unsigned average errors for ions of 4.19 kcal/mol for a dataset of 112 ions.25 While our dataset is smaller, it does cover all important classes of ions and we believe the reported performances are comparable. The performance of the SM6 model is achieved by direct fitting of the model to experimental free energy of solvation data. SM6 is also the latest generation of a series of solvation models that have been developed for over 15 years. That the present model performs better than SM6 is very encouraging. It is also noteworthy that this is achieved with a very limited degree of parameterization.
Kelly, Cramer and Truhlar24 also report performance data for a number of other continuum solvation models such as SM5 and IEF-PCM. The present method is substantially better than any of these in predicting relative free energies of solvation for ions. The IEF-PCM method at HF/6-31G(d) level of theory is for example reported to have unsigned average errors of 7.2 kcal/mol.
The present solvation model provides relative solvation energies in good agreement with experimental data with very limited parameter fitting. The results suggest that methods based on adding explicit solvent molecules in a continuum model, performs significantly better than a continuum model by itself in calculations of free energy of solvation for ionic species.
The authors would like to thank Martin Peters and Ning Liao for help with implementing the method and providing the solvation model. This work has been supported by the Norwegian Research Council through Post. Doc. funding for Eirik F. da Silva. KMM thanks the NIH (GM066859) for financial support of this research. The authors also thank the Norwegian Metacenter for Computational Science for a grant of computer time.
Supporting Information Available: Details of molecular dynamics simulations, results with IEFPCM model and description of Poisson-Boltzmann continuum-solvation model.