# Related Articles

The crystal structure of (Z)-N-(5-ethyl-2,3-dihydro-1,3,4-thiadiazol-2-ylidene)-4-methylbenzenesulfonamide contains an imine tautomer, rather than the previously reported amine tautomer. The tautomers can be distinguished using dispersion-corrected density functional theory calculations and by comparison of calculated and measured 13C solid-state NMR spectra.

The crystal structure of the title compound, C11H13N3O2S2, has been determined previously on the basis of refinement against laboratory powder X-ray diffraction (PXRD) data, supported by comparison of measured and calculated 13C solid-state NMR spectra [Hangan et al. (2010). Acta Cryst. B66, 615–621]. The molecule is tautomeric, and was reported as an amine tautomer [systematic name: N-(5-ethyl-1,3,4-thiadiazol-2-yl)-p-toluenesulfonamide], rather than the correct imine tautomer. The protonation site on the molecule’s 1,3,4-thiadiazole ring is indicated by the intermolecular contacts in the crystal structure: N—H⋯O hydrogen bonds are established at the correct site, while the alternative protonation site does not establish any notable intermolecular interactions. The two tautomers provide essentially identical Rietveld fits to laboratory PXRD data, and therefore they cannot be directly distinguished in this way. However, the correct tautomer can be distinguished from the incorrect one by previously reported quantitative criteria based on the extent of structural distortion on optimization of the crystal structure using dispersion-corrected density functional theory (DFT-D) calculations. Calculation of the 13C SS-NMR spectrum based on the correct imine tautomer also provides considerably better agreement with the measured 13C SS-NMR spectrum.

doi:10.1107/S2053229614015356

PMCID: PMC4174016
PMID: 25093360

crystal structure; powder diffraction; NMR analysis; amine–imine tautomerism; dispersion-corrected DFT

We present density functional theory calculations of the geometry, adsorption energy and electronic structure of thiophene adsorbed on Cu(111), Cu(110) and Cu(100) surfaces. Our calculations employ dispersion corrections and self-consistent van der Waals density functionals (vdW-DFs). In terms of speed and accuracy, we find that the dispersion-energy-corrected Revised Perdew-Burke-Ernzerhof (RPBE) functional is the “best balanced” method for predicting structural and energetic properties, while vdW-DF is also highly accurate if a proper exchange functional is used. Discrepancies between theory and experiment in molecular geometry can be resolved by considering x-ray generated core-holes. However, the discrepancy concerning the adsorption site for thiophene/Cu(100) remains unresolved and requires both further experiments and deeper theoretical analysis. For all the interfaces, the PBE functional reveals a covalent bonding picture which the inclusion of dispersive contributions does not change to a vdW one. Our results provide a comprehensive understanding of the role of dispersive forces in modelling molecule-metal interfaces.

doi:10.1038/srep05036

PMCID: PMC4030267
PMID: 24849493

The accuracy of a dispersion-corrected density functional theory method is validated against 241 experimental organic crystal structures from Acta Cryst. Section E.

This paper describes the validation of a dispersion-corrected density functional theory (d-DFT) method for the purpose of assessing the correctness of experimental organic crystal structures and enhancing the information content of purely experimental data. 241 experimental organic crystal structures from the August 2008 issue of Acta Cryst. Section E were energy-minimized in full, including unit-cell parameters. The differences between the experimental and the minimized crystal structures were subjected to statistical analysis. The r.m.s. Cartesian displacement excluding H atoms upon energy minimization with flexible unit-cell parameters is selected as a pertinent indicator of the correctness of a crystal structure. All 241 experimental crystal structures are reproduced very well: the average r.m.s. Cartesian displacement for the 241 crystal structures, including 16 disordered structures, is only 0.095 Å (0.084 Å for the 225 ordered structures). R.m.s. Cartesian displacements above 0.25 Å either indicate incorrect experimental crystal structures or reveal interesting structural features such as exceptionally large temperature effects, incorrectly modelled disorder or symmetry breaking H atoms. After validation, the method is applied to nine examples that are known to be ambiguous or subtly incorrect.
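The indicator described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: it assumes that the experimental and energy-minimized structures are already expressed in the same Cartesian frame with identical atom ordering, and the function name, the constant name and the three-atom example coordinates are hypothetical. Only the 0.25 Å threshold comes from the abstract.

```python
import math

# Threshold quoted in the abstract: larger values suggest an incorrect
# structure or an unusual structural feature.
SUSPECT_THRESHOLD_ANGSTROM = 0.25


def rms_cartesian_displacement(elements, exp_xyz, min_xyz):
    """R.m.s. displacement (in Å) between matching atoms, excluding H atoms.

    Assumes both coordinate lists use the same atom order and the same
    Cartesian frame (an assumption of this sketch, not stated in the source).
    """
    squared = [
        sum((a - b) ** 2 for a, b in zip(p, q))
        for el, p, q in zip(elements, exp_xyz, min_xyz)
        if el != "H"
    ]
    if not squared:
        raise ValueError("no non-hydrogen atoms to compare")
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical three-atom example (coordinates in Å).
elements = ["C", "O", "H"]
exp_xyz = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0), (-0.6, 0.9, 0.0)]
min_xyz = [(0.1, 0.0, 0.0), (1.2, 0.1, 0.0), (-0.9, 1.3, 0.0)]

rmsd = rms_cartesian_displacement(elements, exp_xyz, min_xyz)
is_suspect = rmsd > SUSPECT_THRESHOLD_ANGSTROM
```

In this example the H atom moves the most, but it is ignored, so the result reflects only the C and O displacements.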

doi:10.1107/S0108768110031873

PMCID: PMC2940256
PMID: 20841921

dispersion-corrected density functional theory; organic structures

The relatively complex structure of a triclinic disolvate was solved from low-resolution laboratory powder diffraction data through the intermediate use of dummy atoms and the combination with quantum-mechanical calculations.

With only a 2.6 Å resolution laboratory powder diffraction pattern of the θ phase of Pigment Yellow 181 (P.Y. 181) available, crystal-structure solution and Rietveld refinement proved challenging; especially when the crystal structure was shown to be a triclinic dimethylsulfoxide N-methyl-2-pyrrolidone (1:1:1) solvate. The crystal structure, which in principle has 28 possible degrees of freedom, was determined in three stages by a combination of simulated annealing, partial Rietveld refinement with dummy atoms replacing the solvent molecules and further simulated annealing. The θ phase not being of commercial interest, additional experiments were not economically feasible and additional dispersion-corrected density functional theory (DFT-D) calculations were employed to confirm the correctness of the crystal structure. After the correctness of the structure had been ascertained, the bond lengths and valence angles from the DFT-D minimized crystal structure were fed back into the Rietveld refinement as geometrical restraints (‘polymorph-dependent restraints’) to further improve the details of the crystal structure; the positions of the H atoms were also taken from the DFT-D calculations. The final crystal structure is a layered structure with an elaborate network of hydrogen bonds.

doi:10.1107/S2052520615000724

PMCID: PMC4316649
PMID: 25643720

Pigment Yellow 181; X-ray powder diffraction; dispersion-corrected density functional theory

IUCrJ
2014;1(Pt 5):328-337.
Relationships between the crystal structures of two polymorphs of sodium naproxen dihydrate and its monohydrate and anhydrate phases provide a basis to rationalize the observed transformation pathways in the sodium (S)-naproxen anhydrate–hydrate system.

Crystal structures are presented for two dihydrate polymorphs (DH-I and DH-II) of the non-steroidal anti-inflammatory drug sodium (S)-naproxen. The structure of DH-I is determined from twinned single crystals obtained by solution crystallization. DH-II is obtained by solid-state routes, and its structure is derived using powder X-ray diffraction, solid-state 13C and 23Na MAS NMR, and molecular modelling. The validity of both structures is supported by dispersion-corrected density functional theory (DFT-D) calculations. The structures of DH-I and DH-II, and in particular their relationships to the monohydrate (MH) and anhydrate (AH) structures, provide a basis to rationalize the observed transformation pathways in the sodium (S)-naproxen anhydrate–hydrate system. All structures contain Na+/carboxylate/H2O sections, alternating with sections containing the naproxen molecules. The structure of DH-I is essentially identical to MH in the naproxen region, containing face-to-face arrangements of the naphthalene rings, whereas the structure of DH-II is comparable to AH in the naproxen region, containing edge-to-face arrangements of the naphthalene rings. This structural similarity permits topotactic transformation between AH and DH-II, and between MH and DH-I, but requires re-organization of the naproxen molecules for transformation between any other pair of structures. The topotactic pathways dominate at room temperature or below, while the non-topotactic pathways become active at higher temperatures. Thermochemical data for the dehydration processes are rationalized in the light of this new structural information.

doi:10.1107/S2052252514015450

PMCID: PMC4174875
PMID: 25295174

pharmaceutical; hydrate; X-ray diffraction; solid-state NMR; DFT-D

The performances of Møller-Plesset second-order perturbation theory (MP2) and density functional theory (DFT) have been assessed for the purposes of investigating the interaction between stannylenes and aromatic molecules. The complexes between SnX2 (where X = H, F, Cl, Br, and I) and benzene or pyridine are considered. Structural and energetic properties of such complexes are calculated using six MP2-type and 14 DFT methods. The assessment of the above-mentioned methods is based on the comparison of the structures and interaction energies predicted by these methods with reference computational data. A very detailed analysis of the performances of the MP2-type and DFT methods is carried out for two complexes, namely SnH2-benzene and SnH2-pyridine. Of the MP2-type methods, the reference structure of SnH2-benzene is reproduced best by SOS-MP2, whereas SCS-MP2 is capable of mimicking the reference structure of SnH2-pyridine with the greatest accuracy. The latter method performs best in predicting the interaction energy between SnH2 and benzene or pyridine. Among the DFT methods, ωB97X provides the structures and interaction energies of the SnH2-benzene and SnH2-pyridine complexes with good accuracy. However, this density functional is not as effective in reproducing the reference data for the two complexes as the best performing MP2-type methods. Next, the DFT methods are evaluated using the full test set of SnX2-benzene and SnX2-pyridine complexes. It is found that the range-separated hybrid or dispersion-corrected density functionals should be used for describing the interaction in such complexes with reasonable accuracy.

doi:10.1007/s00894-015-2589-1

PMCID: PMC4326664
PMID: 25677452

Benchmarking; MP2; DFT; Benzene; Pyridine; Stannylene

The design and assembly of mechanically interlocked molecules, such as catenanes and rotaxanes, are dictated by various types of noncovalent interactions. In particular, [C-H⋯O] hydrogen-bonding and π-π stacking interactions in these supramolecular complexes have been identified as important noncovalent interactions. With this in mind, we examined the [3]catenane 2·4PF6 using molecular mechanics (MM3), ab initio methods (HF, MP2), several versions of density functional theory (DFT) (B3LYP, M0X), and the dispersion-corrected method DFT-D3. Symmetry adapted perturbation theory (DFT-SAPT) provides the highest level of theory considered, and we use the DFT-SAPT results to calibrate both the other electronic structure methods and the empirical MM3 force field that is often used to describe larger catenane and rotaxane structures where [C-H⋯O] hydrogen-bonding and π-π stacking interactions play a role. Our results indicate that the MM3 calculated complexation energies agree qualitatively with the energetic ordering from DFT-SAPT calculations with an aug-cc-pVTZ basis, both for structures dominated by [C-H⋯O] hydrogen-bonding and for those dominated by π-π stacking interactions. When the DFT-SAPT energies are decomposed into components, we find that electrostatic interactions dominate the [C-H⋯O] hydrogen-bonding interactions while dispersion makes a significant contribution to π-π stacking. Another important conclusion is that DFT-D3 based on M06 or M06-2X provides interaction energies that are in near-quantitative agreement with DFT-SAPT. DFT results without the D3 correction have important differences compared to DFT-SAPT, while HF and even MP2 results are in poor agreement with DFT-SAPT.

doi:10.1021/jp400051b

PMCID: PMC3840798
PMID: 23941280

catenanes; dispersion; MM3 force field; supramolecular complexes; DFT-SAPT

NMR spectroscopy is the most popular technique used for structure elucidation of small organic molecules in solution, but incorrect structures are regularly reported. One-bond proton-carbon J-couplings provide additional information about chemical structure because they are determined by different features of molecular structure than are proton and carbon chemical shifts. However, these couplings are not routinely used to validate proposed structures because few software tools exist to predict them. This study assesses the accuracy of Density Functional Theory for predicting them using 396 published experimental observations from a diverse range of small organic molecules. With the B3LYP functional and the TZVP basis set, Density Functional Theory calculations using the open-source software package NWChem can predict one-bond CH J-couplings with good accuracy for most classes of small organic molecule. The root-mean-square deviation after correction is 1.5 Hz for most sp3 CH pairs and 1.9 Hz for sp2 pairs; larger errors are observed for sp3 pairs with multiple electronegative substituents and for sp pairs. These results suggest that prediction of one-bond CH J-couplings by Density Functional Theory is sufficiently accurate for structure validation. This will be of particular use in strained ring systems and heterocycles which have characteristic couplings and which pose challenges for structure elucidation.
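The abstract reports a root-mean-square deviation "after correction" but does not say what form the correction takes. The sketch below assumes, purely for illustration, a linear least-squares correction of calculated couplings against experimental ones, followed by the RMSD of the corrected values. The function names and the four data points are hypothetical and are not taken from the study.

```python
import math


def fit_linear_correction(calc, expt):
    """Least-squares slope and intercept mapping calculated to experimental values."""
    n = len(calc)
    mean_c = sum(calc) / n
    mean_e = sum(expt) / n
    sxx = sum((c - mean_c) ** 2 for c in calc)
    sxy = sum((c - mean_c) * (e - mean_e) for c, e in zip(calc, expt))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_c
    return slope, intercept


def rmsd(pred, expt):
    """Root-mean-square deviation between two equal-length sequences."""
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(pred, expt)) / len(expt))


# Synthetic one-bond CH couplings in Hz (illustrative values only).
calc_j = [120.0, 130.0, 140.0, 150.0]
expt_j = [134.0, 145.0, 156.0, 167.0]

slope, intercept = fit_linear_correction(calc_j, expt_j)
corrected = [slope * c + intercept for c in calc_j]
raw_rmsd = rmsd(calc_j, expt_j)
corrected_rmsd = rmsd(corrected, expt_j)
```

Because the synthetic data are exactly linear, the corrected RMSD here is essentially zero; with real data the residual would be what a figure such as 1.5 Hz summarizes.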

doi:10.1371/journal.pone.0111576

PMCID: PMC4218771
PMID: 25365289

Reliable thermochemical measurements and theoretical predictions for reactions involving large transition metal complexes in which long-range intramolecular London dispersion interactions contribute significantly to their stabilization are still a challenge, particularly for reactions in solution. As an illustrative and chemically important example, two reactions are investigated where a large dipalladium complex is quenched by bulky phosphane ligands (triphenylphosphane and tricyclohexylphosphane). Reaction enthalpies and Gibbs free energies were measured by isothermal titration calorimetry (ITC) and theoretically ‘back-corrected’ to yield 0 K gas-phase reaction energies (ΔE). It is shown that the Gibbs free solvation energy calculated with continuum models represents the largest source of error in theoretical thermochemistry protocols. The (‘back-corrected’) experimental reaction energies were used to benchmark (dispersion-corrected) density functional and wave function theory methods. In particular, we investigated whether the atom-pairwise D3 dispersion correction is also accurate for transition metal chemistry, and how accurately recently developed local coupled-cluster methods describe the important long-range electron correlation contributions. Both modern dispersion-corrected density functionals (e.g., PW6B95-D3(BJ) or B3LYP-NL) and the now possible DLPNO-CCSD(T) calculations are within the ‘experimental’ gas phase reference value. The remaining uncertainties of 2–3 kcal mol−1 can be essentially attributed to the solvation models. Hence, the future for accurate theoretical thermochemistry of large transition metal reactions in solution is very promising.

doi:10.1002/open.201402017

PMCID: PMC4234214
PMID: 25478313

density functional theory; isothermal titration calorimetry; local coupled cluster; London dispersion interactions; transition metal reactions

Bardwell, David A. | Adjiman, Claire S. | Arnautova, Yelena A. | Bartashevich, Ekaterina | Boerrigter, Stephan X. M. | Braun, Doris E. | Cruz-Cabeza, Aurora J. | Day, Graeme M. | Della Valle, Raffaele G. | Desiraju, Gautam R. | van Eijck, Bouke P. | Facelli, Julio C. | Ferraro, Marta B. | Grillo, Damian | Habgood, Matthew | Hofmann, Detlef W. M. | Hofmann, Fridolin | Jose, K. V. Jovan | Karamertzanis, Panagiotis G. | Kazantsev, Andrei V. | Kendrick, John | Kuleshova, Liudmila N. | Leusen, Frank J. J. | Maleev, Andrey V. | Misquitta, Alston J. | Mohamed, Sharmarke | Needs, Richard J. | Neumann, Marcus A. | Nikylov, Denis | Orendt, Anita M. | Pal, Rumpa | Pantelides, Constantinos C. | Pickard, Chris J. | Price, Louise S. | Price, Sarah L. | Scheraga, Harold A. | van de Streek, Jacco | Thakur, Tejender S. | Tiwari, Siddharth | Venuti, Elisabetta | Zhitkov, Ilia K.
The results of the fifth blind test of crystal structure prediction, which show important success with more challenging large and flexible molecules, are presented and discussed.

Following on from the success of the previous crystal structure prediction blind tests (CSP1999, CSP2001, CSP2004 and CSP2007), a fifth such collaborative project (CSP2010) was organized at the Cambridge Crystallographic Data Centre. A range of methodologies was used by the participating groups in order to evaluate the ability of the current computational methods to predict the crystal structures of the six organic molecules chosen as targets for this blind test. The first four targets, two rigid molecules, one semi-flexible molecule and a 1:1 salt, matched the criteria for the targets from CSP2007, while the last two targets belonged to two new challenging categories – a larger, much more flexible molecule and a hydrate with more than one polymorph. Each group submitted three predictions for each target it attempted. There was at least one successful prediction for each target, and two groups were able to successfully predict the structure of the large flexible molecule as their first place submission. The results show that while not as many groups successfully predicted the structures of the three smallest molecules as in CSP2007, there is now evidence that methodologies such as dispersion-corrected density functional theory (DFT-D) are able to reliably do so. The results also highlight the many challenges posed by more complex systems and show that there are still issues to be overcome.

doi:10.1107/S0108768111042868

PMCID: PMC3222142
PMID: 22101543

prediction; blind test; polymorph; crystal structure prediction

Highlights

► Alkane adsorption in chabazite is modelled using electronic structure theory. ► Finite temperature effects are estimated by molecular dynamics simulations. ► An extrapolation mechanism to finite temperature is proposed. ► Results are critically compared to experimental data.

The adsorption of alkanes in a protonated zeolite has been investigated at different levels of theory. At the lowest level we use density-functional theory (DFT) based on semi-local (gradient-corrected) functionals which account only for the interaction of the molecule with the acid site. To describe the van der Waals (vdW) interactions between the saturated molecule and the inner wall of the zeolite we use (i) semi-empirical pair interactions, (ii) calculations using a non-local correlation functional designed to include vdW interactions, and (iii) an approach based on calculations of the dynamical response function within the random-phase approximation (RPA). The effect of finite temperature on the adsorption properties has been studied by performing molecular dynamics (MD) simulations based on forces derived from DFT plus semi-empirical vdW corrections. The simulations demonstrate that even at room temperature the binding of the molecule to the acid site is frequently broken such that only the vdW interaction between the alkane and the zeolite remains. The finite temperature adsorption energy is calculated as the ensemble average over a sufficiently long molecular dynamics run; it is significantly reduced compared to the T = 0 K limit. At a higher level of theory, where MD simulations would be prohibitively expensive, we propose a simple scheme based on averaging over the adsorption energies in the acid and in the purely siliceous zeolite to account for temperature effects. With these corrections we find excellent agreement between the RPA predictions and experiment.
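The ensemble-average step can be illustrated with a short Python sketch. It assumes, as one common convention rather than anything stated in the abstract, that the adsorption energy is the difference between the time-averaged potential energy of the loaded zeolite and those of the empty zeolite and the isolated molecule, each taken from its own MD run after discarding equilibration frames. All function names and energies are hypothetical, and thermal or kinetic corrections are ignored.

```python
def ensemble_average(frame_energies, n_equilibration=0):
    """Mean energy over the production part of an MD trajectory."""
    production = frame_energies[n_equilibration:]
    if not production:
        raise ValueError("no production frames left after equilibration")
    return sum(production) / len(production)


def adsorption_energy(complex_run, zeolite_run, molecule_run, n_equilibration=0):
    """<E(zeolite + molecule)> - <E(zeolite)> - <E(molecule)>.

    The three-run decomposition is an assumption of this sketch.
    """
    return (
        ensemble_average(complex_run, n_equilibration)
        - ensemble_average(zeolite_run, n_equilibration)
        - ensemble_average(molecule_run, n_equilibration)
    )


# Synthetic per-frame potential energies in eV; the first frame of each run
# is treated as equilibration and dropped.
complex_run = [-105.0, -100.6, -100.4, -100.5, -100.5]
zeolite_run = [-99.0, -95.1, -94.9, -95.0, -95.0]
molecule_run = [-6.0, -5.0, -5.0, -5.0, -5.0]

e_ads = adsorption_energy(complex_run, zeolite_run, molecule_run, n_equilibration=1)
```

A negative value indicates net binding; in a real calculation the run length would need to be long enough for the average to converge.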

doi:10.1016/j.micromeso.2012.04.052

PMCID: PMC4268788
PMID: 25540604

Alkane adsorption; Molecular dynamics; Ab-initio; van der Waals interactions; Zeolites

In this study we investigate π-stacking interactions of a variety of aromatic heterocycles with benzene using dispersion-corrected density functional theory. We calculate extensive potential energy surfaces for parallel-displaced interaction geometries. We find that dispersion contributes significantly to the interaction energy and is complemented by a varying degree of electrostatic interactions. We identify geometric preferences and minimum interaction energies for a set of 13 5- and 6-membered aromatic heterocycles frequently encountered in small drug-like molecules. We demonstrate that the electrostatic properties of these systems are a key determinant for their orientational preferences. The results of this study can be applied in lead optimization for the improvement of stacking interactions, as it provides detailed energy landscapes for a wide range of coplanar heteroaromatic geometries. These energy landscapes can serve as a guide for ring replacement in structure-based drug design.

doi:10.1021/ci500183u

PMCID: PMC4037317
PMID: 24773380

It is becoming increasingly clear that careful treatment of water molecules in ligand–protein interactions is required in many cases if the correct binding pose is to be identified in molecular docking. Water can form complex bridging networks and can play a critical role in dictating the binding mode of ligands. A particularly striking example of this can be found in the ionotropic glutamate receptors. Despite possessing similar chemical moieties, crystal structures of glutamate and α-amino-3-hydroxy-5-methyl-4-isoxazole-propionic acid (AMPA) in complex with the ligand-binding core of the GluA2 ionotropic glutamate receptor revealed, contrary to all expectation, two distinct modes of binding. The difference appears to be related to the position of water molecules within the binding pocket. However, it is unclear exactly what governs the preference for water molecules to occupy a particular site in any one binding mode. In this work we use density functional theory (DFT) calculations to investigate the interaction energies and polarization effects of the various components of the binding pocket. Our results show (i) the energetics of a key water molecule are more favorable for the site found in the glutamate-bound mode compared to the alternative site observed in the AMPA-bound mode, (ii) polarization effects are important for glutamate but less so for AMPA, (iii) ligand–system interaction energies alone can predict the correct binding mode for glutamate, but for AMPA alternative modes of binding have similar interaction energies, and (iv) the internal energy is a significant factor for AMPA but not for glutamate. We discuss the results within the broader context of rational drug-design.

doi:10.1021/jp200776t

PMCID: PMC3102440
PMID: 21545106

The major objective of this paper is to address a controversial binding sequence between nucleic acid bases (NABs) and C60 by investigating adsorptions of NABs and their cations on C60 fullerene with a variety of density functional theories, including two novel hybrid meta-GGA functionals, M05-2x and M06-2x, as well as a dispersion-corrected density functional, PBE-D. M05-2x/6-311++G** provides the same binding sequence as previously reported, guanine (G) > cytosine (C) > adenine (A) > thymine (T); however, M06-2x switches the binding strengths of A and C, and PBE-D eventually results in the sequence G > A > T > C, which is the same as the widely accepted hierarchy for the stacking of NABs on other carbon nanomaterials such as single-walled carbon nanotubes and graphite. The results indicate that the questionable relative binding strength is due to insufficient electron correlation treatment with the M05-2x or even the M06-2x method. The binding energy of G@C60 obtained with M06-2x/6-311++G(d,p) and PBE-D/cc-pVDZ is −7.10 and −8.07 kcal/mol, respectively, and the latter is only slightly weaker than that predicted by MP2/6-31G(d,p) (−8.10 kcal/mol). Thus, PBE-D performs better than M06-2x for the observed NAB@C60 π-stacked complexes. To discuss whether C60 could protect NABs from radiation-induced damage, ionization potentials of NABs and C60, and frontier molecular orbitals of the complexes NABs@C60 and (NABs@C60)+, are also extensively investigated. These results reveal that when an electron escapes from the complexes, a hole is preferentially created in C60 for the T and C complexes, while for G and A the hole delocalizes over the entire complex rather than localizing on the C60 moiety. This interesting finding might open a new strategy for protecting DNA from radiation-induced damage and offer a new idea for designing C60-based antiradiation drugs.

doi:10.1021/jp108812z

PMCID: PMC3101642
PMID: 21625361

radiation-induced damage; NAB; C60; dispersion-corrected DFT; binding sequence

Background

Chalcones are ubiquitous natural compounds with a wide variety of reported biological activities, including antitumoral, antiviral and antimicrobial effects. Furthermore, chalcones are being studied for their potential use in organic electroluminescent devices; therefore the description of their spectroscopic properties is important to elucidate the structure of these molecules. One of the main techniques available for structure elucidation is Nuclear Magnetic Resonance (NMR) spectroscopy. Accordingly, the prediction of the NMR spectra of this kind of molecule is necessary to gather information about the influence of substituents on their spectra.

Results

A novel substituted chalcone has been synthesized. In order to identify the functional groups present in the newly synthesized compound and confirm its chemical structure, experimental and theoretical 1H-NMR and 13C-NMR spectra were analyzed. The theoretical molecular structure and NMR spectra were calculated at both the Hartree-Fock and Density Functional (meta: TPSS; hybrid: B3LYP and PBE1PBE; hybrid meta GGA: M05-2X and M06-2X) levels of theory in combination with a 6-311++G(d,p) basis set. The structural parameters showed that the best method for geometry optimization was DFT:M06-2X/6-311++G(d,p), for which the calculated bond angles and bond distances match experimental values of similar chalcone derivatives. The NMR calculations were carried out using the Gauge-Independent Atomic Orbital (GIAO) formalism on a DFT:M06-2X/6-311++G(d,p) optimized geometry.

Conclusion

Considering all HF and DFT methods with GIAO calculations, TPSS and PBE1PBE were the most accurate methods for calculating 1H-NMR and 13C-NMR chemical shifts, with accuracy almost the same as that of the B3LYP functional, followed in order by the HF, M05-2X and M06-2X methods. All calculations were done using the Gaussian 09 software package. Theoretical calculations can be used to predict and confirm the structure of substituted chalcones with good correlation with the experimental data.

doi:10.1186/1752-153X-7-17

PMCID: PMC3573982
PMID: 23351546

NMR; Molecular structure; Quantum-chemistry methods; Flavonoid

Background

Rapid access to intrinsic physicochemical properties of molecules is highly desired for large scale chemical data mining explorations such as mass spectrum prediction in metabolomics, toxicity risk assessment and drug discovery. Large volumes of data are being produced by quantum chemistry calculations, which provide increasingly accurate estimations of several properties, e.g. by Density Functional Theory (DFT), but are still too computationally expensive for those large scale uses. This work explores the possibility of using large amounts of data generated by DFT methods for thousands of molecular structures, extracting relevant molecular properties and applying machine learning (ML) algorithms to learn from the data. Once trained, these ML models can be applied to new structures to produce ultra-fast predictions. An approach is presented for homolytic bond dissociation energy (BDE).

Results

Machine learning models were trained with a data set of >12,000 BDEs calculated by B3LYP/6-311++G(d,p)//DFTB. Descriptors were designed to encode atom types and connectivity in the 2D topological environment of the bonds. The best model, an Associative Neural Network (ASNN) based on 85 bond descriptors, was able to predict the BDE of 887 bonds in an independent test set (covering a range of 17.67–202.30 kcal/mol) with RMSD of 5.29 kcal/mol, mean absolute deviation of 3.35 kcal/mol, and R2 = 0.953. The predictions were compared with semi-empirical PM6 calculations, and were found to be superior for all types of bonds in the data set, except for O-H, N-H, and N-N bonds. The B3LYP/6-311++G(d,p)//DFTB calculations can approach the higher-level calculations B3LYP/6-311++G(3df,2p)//B3LYP/6-31G(d,p) with an RMSD of 3.04 kcal/mol, which is less than the RMSD of ASNN (against both DFT methods). An experimental web service for on-line prediction of BDEs is available at http://joao.airesdesousa.com/bde.
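The three summary statistics quoted above can be computed as in the following Python sketch. It assumes that "mean absolute deviation" means the mean absolute error of the predictions and that R2 is the squared Pearson correlation between predicted and reference values; the abstract does not define either, so both are assumptions. The function name and the four data points are hypothetical.

```python
import math


def regression_metrics(pred, ref):
    """Return (RMSD, mean absolute deviation, squared Pearson correlation)."""
    n = len(ref)
    errors = [p - r for p, r in zip(pred, ref)]
    rmsd = math.sqrt(sum(e * e for e in errors) / n)
    mad = sum(abs(e) for e in errors) / n

    mean_p = sum(pred) / n
    mean_r = sum(ref) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(pred, ref))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_r = sum((r - mean_r) ** 2 for r in ref)
    r_squared = cov * cov / (var_p * var_r)
    return rmsd, mad, r_squared


# Synthetic BDEs in kcal/mol (illustrative values only).
reference_bde = [80.0, 90.0, 100.0, 110.0]
predicted_bde = [82.0, 89.0, 103.0, 108.0]

rmsd, mad, r_squared = regression_metrics(predicted_bde, reference_bde)
```

RMSD weights large errors more heavily than the mean absolute deviation, which is why the abstract's RMSD (5.29 kcal/mol) exceeds its mean absolute deviation (3.35 kcal/mol).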

Conclusion

Knowledge could be automatically extracted by machine learning techniques from a data set of calculated BDEs, providing ultra-fast access to accurate estimations of DFT-calculated BDEs. This demonstrates how to extract value from the large volumes of data currently being produced by quantum chemistry calculations at an increasing speed, mostly without human intervention. In this way, high-level theoretical quantum calculations can be used in large-scale applications that could not otherwise afford their intrinsic computational cost.

doi:10.1186/1758-2946-5-34

PMCID: PMC3720218
PMID: 23849655

BDE; Bond dissociation energy; Neural network; Random forest; Machine learning; Chemoinformatics; DFT; DFTB; Big data

Diabatic models are widely employed for studying chemical reactivity in condensed phases and enzymes, but there has been little discussion of the pros and cons of various diabatic representations for this purpose. Here we discuss and contrast six different schemes for computing diabatic potentials for a charge rearrangement reaction. They include (i) the variational diabatic configurations (VDC) constructed by variationally optimizing individual valence bond structures and (ii) the consistent diabatic configurations (CDC) obtained by variationally optimizing the ground-state adiabatic energy, both in the nonorthogonal molecular orbital valence bond (MOVB) method, along with the orthogonalized (iii) VDC-MOVB and (iv) CDC-MOVB models. In addition, we consider (v) the fourfold way (based on diabatic molecular orbitals and configuration uniformity), and (vi) empirical valence bond (EVB) theory. To make the considerations concrete, we calculate diabatic electronic states and diabatic potential energies along the reaction path that connects the reactant and the product ion-molecule complexes of the gas-phase bimolecular nucleophilic substitution (SN2) reaction of 1,2-dichloroethane (DCE) with acetate ion, which is a model reaction corresponding to the reaction catalyzed by haloalkane dehalogenase. We utilize ab initio block-localized molecular orbital theory to construct the MOVB diabatic states and ab initio multi-configuration quasidegenerate perturbation theory to construct the fourfold-way diabatic states; the latter are calculated at reaction path geometries obtained with the M06-2X density functional. The EVB diabatic states are computed with parameters taken from the literature. The MOVB and fourfold-way adiabatic and diabatic potential energy profiles along the reaction path are in qualitative but not quantitative agreement with each other.
In order to validate that these wave-function-based diabatic states are qualitatively correct, we show that the reaction energy and barrier for the adiabatic ground state, obtained with these methods, agree reasonably well with the results of high-level calculations using the composite G3SX and G3SX(MP3) methods and the BMC-CCSD multi-coefficient correlation method. However, a comparison of the EVB gas-phase adiabatic ground-state reaction path with those obtained from MOVB and with the fourfold way reveals that the EVB reaction path geometries show a systematic shift towards the products region, and that the EVB lowest-energy path has a much lower barrier. The free energies of solvation and activation energy in water reported from dynamical calculations based on EVB also imply a low activation barrier in the gas phase. In addition, calculations of the free energy of solvation using the recently proposed SM8 continuum solvation model with CM4M partial atomic charges lead to an activation barrier in reasonable agreement with experiment only when the geometries and the gas-phase barrier are those obtained from electronic structure calculations, i.e., methods i–v. These comparisons show the danger of basing the diabatic states on molecular mechanics without the explicit calculation of electronic wave functions. Furthermore, comparison of schemes i–v with one another shows that significantly different quantitative results can be obtained by using different methods for extracting diabatic states from wave function calculations, and it is important for each user to justify the choice of diabatization method in the context of its intended use.

doi:10.1021/ct800318h

PMCID: PMC2658610
PMID: 20047005

Machine learning has been used for estimation of potential energy surfaces to speed up molecular dynamics simulations of small systems. We demonstrate that this approach is feasible for significantly larger, structurally complex molecules, taking the natural product Archazolid A, a potent inhibitor of vacuolar-type ATPase, from the myxobacterium Archangium gephyra as an example. Our model estimates energies of new conformations by exploiting information from previous calculations via Gaussian process regression. Predictive variance is used to assess whether a conformation is in the interpolation region, allowing a controlled trade-off between prediction accuracy and computational speed-up. For energies of relaxed conformations at the density functional level of theory (implicit solvent, DFT/BLYP-disp3/def2-TZVP), mean absolute errors of less than 1 kcal/mol were achieved. The study demonstrates that predictive machine learning models can be developed for structurally complex, pharmaceutically relevant compounds, potentially enabling considerable speed-ups in simulations of larger molecular structures.
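The idea of using the predictive variance to decide whether a conformation lies in the interpolation region can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the squared-exponential kernel, the zero prior mean, the feature vectors describing a conformation, the variance threshold of 0.05 and all function names (`rbf`, `solve`, `gp_predict`, `energy_with_fallback`) are hypothetical choices made for the example, and `qm_energy` stands in for an expensive quantum mechanical calculation.

```python
import math


def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two feature vectors (assumed kernel)."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * d2 / length ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def gp_predict(train_x, train_y, query, noise=1e-6, length=1.0):
    """Return the GP posterior mean and variance at `query` (zero prior mean)."""
    n = len(train_x)
    gram = [[rbf(train_x[i], train_x[j], length) + (noise if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    k_star = [rbf(x, query, length) for x in train_x]
    alpha = solve(gram, train_y)      # K^-1 y
    weights = solve(gram, k_star)     # K^-1 k*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(query, query, length) - sum(k * w for k, w in zip(k_star, weights))
    return mean, max(var, 0.0)


def energy_with_fallback(conformation, train_x, train_y, qm_energy,
                         var_threshold=0.05):
    """Use the GP estimate when its variance is low; otherwise run the
    expensive calculation and add the result to the training data."""
    mean, var = gp_predict(train_x, train_y, conformation)
    if var <= var_threshold:
        return mean, False            # interpolation region: trust the model
    energy = qm_energy(conformation)  # extrapolation: compute and refine
    train_x.append(conformation)
    train_y.append(energy)
    return energy, True
```

Raising `var_threshold` accepts more model predictions and saves more reference calculations at the cost of accuracy, which is the controlled trade-off the abstract describes.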

Author Summary

Molecular dynamics simulations provide insight into the dynamic behavior of molecules, e.g., into the spatial arrangements adopted by their atoms over time. Methods differ in the approximations they employ, resulting in a trade-off between accuracy and speed that ranges from highly accurate but expensive quantum mechanical calculations to fast but less accurate molecular mechanics force fields. Machine learning, a sub-discipline of artificial intelligence, provides algorithms that learn from data, that is, make predictions based on previously seen examples. By starting with a few expensive quantum mechanical calculations, training a machine learning algorithm on them, and then using the resulting model to carry out the molecular dynamics simulation, one can improve the accuracy/speed trade-off. We have developed and applied such a hybrid quantum mechanics/machine learning approach to Archazolid A, a natural product from the myxobacterium Archangium gephyra and a potent inhibitor of vacuolar-type ATPase. By dynamically refining our model over the course of the simulation, we achieve errors of less than 1 kcal/mol while saving over 40% of the quantum mechanical calculations. Our study demonstrates the feasibility of predictive machine learning models for the dynamics of structurally complex, pharmaceutically relevant compounds, potentially enabling considerable speed-ups in simulations of even larger biomolecular structures.

doi:10.1371/journal.pcbi.1003400

PMCID: PMC3894151
PMID: 24453952

This work deduces, from a series of well-defined copper-doped amino acid crystals, relationships between structural features of the copper complexes and ligand-bound proton hyperfine parameters. These were established by combining results from electron paramagnetic resonance (EPR)/electron-nuclear double resonance (ENDOR) studies and crystallography, and were further assessed by quantum mechanical (QM) calculations. A detailed evaluation of previous studies on Cu2+ doped into α-glycine, triglycine sulfate, α-glycylglycine and l-alanine crystals reveals correlations between geometric features of the copper sites and proton hyperfine couplings from amino-bound and carbon-bound hydrogens. Experimental variations in proton isotropic hyperfine coupling values (aiso) could be fit to cosine-square dependences on dihedral angles, namely, for Cα-bound hydrogens, aiso = −1.09 + 8.21 cos²θ MHz, and for amino hydrogens, aiso = −6.16 + 4.15 cos²φ MHz. For the Cα hydrogens, this dependency suggests a hyperconjugative-like mechanism for transfer of spin density into the hydrogen 1s-orbital. In the course of this work, it was also necessary to reanalyze the ENDOR measurements from Cu2+-doped α-glycine since the initial study determined the 14N coupling parameters without holding its nuclear quadrupole tensor traceless. This new treatment of the data was needed to correctly align the 14N hyperfine tensor principal directions in the molecular complex. In order to provide a theoretical basis for the coupling variations, QM calculations performed at the Density Functional Theory (DFT) level were used to compute the proton hyperfine tensors in the four crystal complexes as well as in a geometry-optimized Cu2+(glycine)2 model. These theoretical calculations confirmed systematic changes in couplings with dihedral angles, but greatly overestimated the experimental geometric sensitivity to the amino hydrogen isotropic coupling.
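The two fitted cosine-square relationships can be evaluated directly, as in the sketch below. The function names are hypothetical, angles are assumed to be supplied in degrees, and the precise definition of the dihedral angles θ and φ is the one used in the paper, which the abstract does not spell out.

```python
import math


def aiso_c_alpha(theta_deg):
    """Isotropic coupling (MHz) of a C-alpha-bound hydrogen from the fitted
    relation aiso = -1.09 + 8.21 * cos^2(theta)."""
    return -1.09 + 8.21 * math.cos(math.radians(theta_deg)) ** 2


def aiso_amino(phi_deg):
    """Isotropic coupling (MHz) of an amino hydrogen from the fitted
    relation aiso = -6.16 + 4.15 * cos^2(phi)."""
    return -6.16 + 4.15 * math.cos(math.radians(phi_deg)) ** 2


# Tabulate both relations over a quarter turn of the dihedral angle.
for angle in (0, 30, 60, 90):
    print(angle, round(aiso_c_alpha(angle), 2), round(aiso_amino(angle), 2))
```

Under these fits the Cα-hydrogen coupling spans 8.21 MHz between 0° and 90°, roughly twice the 4.15 MHz span for the amino hydrogens.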

doi:10.1021/jp811249s

PMCID: PMC2896622
PMID: 19378965

proton hyperfine couplings; hyperconjugation; spin density; glycine; triglycine sulfate; alanine; glycylglycine

The adsorption of Ag, Au, and Pd atoms on benzene, coronene, and graphene has been studied using post Hartree–Fock wave function theory (CCSD(T), MP2) and density functional theory (M06-2X, DFT-D3, PBE, vdW-DF) methods. The CCSD(T) benchmark binding energies for benzene–M (M = Pd, Au, Ag) complexes are 19.7, 4.2, and 2.3 kcal/mol, respectively. We found that the nature of binding of the three metals is different: While silver binds predominantly through dispersion interactions, the binding of palladium has a covalent character, and the binding of gold involves a subtle combination of charge transfer and dispersion interactions as well as relativistic effects. We demonstrate that the CCSD(T) benchmark binding energies for benzene–M complexes can be reproduced in plane-wave density functional theory calculations by including a fraction of the exact exchange and a nonempirical van der Waals correction (EE+vdW). Applying the EE+vdW method, we obtained binding energies for the graphene–M (M = Pd, Au, Ag) complexes of 17.4, 5.6, and 4.3 kcal/mol, respectively. The trends in binding energies found for the benzene–M complexes correspond to those in coronene and graphene complexes. DFT methods that use empirical corrections to account for the effects of vdW interactions significantly overestimate binding energies in some of the studied systems.

doi:10.1021/ct200625h

PMCID: PMC3210524
PMID: 22076121

Bahena, Daniel | Bhattarai, Nabraj | Santiago, Ulises | Tlahuice, Alfredo | Ponce, Arturo | Bach, Stephan B. H. | Yoon, Bokwon | Whetten, Robert L. | Landman, Uzi | Jose-Yacaman, Miguel
Determination of the total structure of molecular nanocrystals is an outstanding experimental challenge that has been met, in only a few cases, by single-crystal X-ray diffraction. Described here is an alternative approach that is of more general applicability and does not require the fabrication of a single crystal. The method is based on rapid, time-resolved nanobeam electron diffraction (NBD) combined with high-angle annular dark field scanning/transmission electron microscopy (HAADF-STEM) images in a probe-corrected STEM microscope, operated at reduced voltages. The results are compared with theoretical simulations of images and diffraction patterns obtained from atomistic structural models derived through first-principles density functional theory (DFT) calculations. The method is demonstrated by application to determination of the structure of the Au144(SCH2CH2Ph)60 cluster.

doi:10.1021/jz400111d

PMCID: PMC3655783
PMID: 23687562

aberration-corrected microscopy; metal nanoparticles; low voltages; first-principles density functional theory; structure determination

We have tested a variety of approximate methods for modeling 30 systems containing mixtures of nitrogen heterocycles and exocyclic amines, each of which is studied with up to 31 methods in one or two phases (gaseous and aqueous). Fifteen of the systems are protonated, and 15 are not. We consider a data set consisting of geometric parameters, partial atomic charges, and water binding energies for the methotrexate fragments 2-(aminomethyl)pyrazine and 2,4-diaminopyrimidine, as well as their cationic forms 1H-2-(aminomethyl)pyrazine and 1H-2,4-diaminopyrimidine. We first evaluated the suitability of several density functionals with the 6-31+G(d,p) basis set to serve as a benchmark by comparing calculated molecular geometries to results obtained from coupled-cluster [CCSD/6-31+G(d,p)] wave function theory (WFT). We found that the M05-2X density functional can be used to obtain reliable geometries for our data set. To accurately model partial charges in our molecules, we elected to utilize the well-validated Charge Model 4 (CM4). In the process of establishing benchmark values, we consider gas-phase coupled cluster and density functional theory (DFT) calculations followed by aqueous-phase DFT calculations, where the effect of solvent is treated by the SM6 quantum mechanical implicit solvation model. The resulting benchmarks were used to test several widely available and economical semiempirical molecular orbital (SE-MO) methods and molecular mechanical (MM) force fields for their ability to accurately predict the partial charges, binding energies to a water molecule, and molecular geometries of representative fragments of methotrexate in the gaseous and aqueous phases, where effects of water were simulated by the SM5.4 and SM5.42 quantum mechanical implicit solvation models for SE-MO and explicit solvation used for MM. 
In addition, we substituted CM4 charges into the MM force fields tested to observe the effect of improved charge assignment on geometric and energetic modeling. The most accurate MM force fields (with or without CM4 charges substituted) were validated against gas-phase and aqueous-phase geometries and charge distributions of a larger set of 16 drug-like ligands, both neutral and cationic. This process showed that the Merck Molecular Force Field (MMFF94) with or without CM4 charges substituted, is, on average, the most accurate force field for geometries of molecules containing nitrogen heterocycles and exocyclic amino groups, both protonated and unprotonated. This force field was then applied to the complete methotrexate molecule, in an effort to systematically explore its accuracy for trends in geometries and charge distributions. The most accurate force fields for the binding energies of nitrogen heterocycles to a water molecule are OPLS2005 and AMBER.

doi:10.1021/ct8000766

PMCID: PMC3658833
PMID: 23700392

Photochromic molecules have the potential to find utility in a wide variety of applications including photoswitchable binding and optical memory. This work explores the relationship between photochromism and structural parameters such as particular bond lengths for this class of compounds, for which very few crystal structures have been published. Photochemical kinetics, Density Functional Theory (DFT) and X-ray crystallography were used to study the benzothiazolinic spiropyran 3-methyl-6-nitro-3′-methylspiro-[2H-1-benzopyran-2,2′-benzothiazoline]. A second benzothiazolinic spiropyran 3-methyl-8-methoxy-6-nitro-3′-methylspiro-[2H-1-benzopyran-2,2′-benzothiazoline] was synthesized and subjected to photochemical and computational studies. Selected structural and photochemical data for these, related benzothiazolinic spirooxazines and spiropyrans, and related thiazolidinic spiropyrans are compared. Both benzothiazolinic spiropyrans exhibit photochromic properties that are influenced by substituents, solvent, and temperature. The crystallographic Cspiro-O bond distance of 3-methyl-6-nitro-3′-methylspiro-[2H-1-benzopyran-2,2′-benzothiazoline] that has been shown to correlate with photochromic properties is 1.458 Å. The crystallographic Cspiro-O bond distance matches that of the structure generated by DFT calculations exactly. The effect of substituents on calculated bond lengths and photochemical parameters was determined. For this class of compounds, both X-ray geometry and DFT optimized geometry may be used to predict photochromism, but not degree of photocolorability.

doi:10.1016/j.molstruc.2010.01.012

PMCID: PMC2850124
PMID: 20383273

photochromism; X-ray; benzothiazolinic spiropyran

In this work we analyze the effect of including an empirical dispersion term in standard DFT (DFT-D) on the prediction of the conformational energy of the alanine dipeptide (Ala2) and on the assessment of the relative stabilities of short polyalanine peptides in helical conformations, i.e., α and 3₁₀ helices, from Ala4 to Ala16. The Ala2 conformational energies obtained with the dispersion-corrected GGA functional B97-D are compared to previously published high level MP2 data. Meanwhile, the B97-D performance on larger polyalanine peptides is compared to MP2, B3LYP and RHF calculations obtained at a lower level of theory. Our results show that electron correlation affects the conformational energies of short peptides with a weight that increases with the peptide length. Indeed, while the contribution of vdW forces is significant for larger peptides, in the case of Ala2 it is negligible when compared to solvent effects. Even for short peptides, the inclusion of an empirical dispersion term greatly improves the accuracy of DFT methods, providing results that correlate very well with the MP2 reference at no additional computational cost.

doi:10.1139/cjc-2012-0542

PMCID: PMC4239032
PMID: 25418993

alanine dipeptide; short polyalanine peptides; ab initio and DFT calculations; empirical dispersion-corrected DFT; peptides structure and stability; Ramachandran plot