Molecular modeling of proteins, including homology modeling, structure determination, and knowledge-based protein design, requires tools to evaluate and refine three-dimensional protein structures. Steric clashes are among the artifacts prevalent in low-resolution structures and homology models. They arise from the unnatural overlap of any two non-bonding atoms in a protein structure. Removing severe steric clashes is often challenging because many existing refinement programs do not accept structures that contain them. Here, we present a quantitative approach to identifying steric clashes in proteins by defining clashes based on the van der Waals repulsion energy of the clashing atoms. We also define a metric for quantitative estimation of the severity of clashes in proteins by performing statistical analysis of clashes in high-resolution protein structures. We describe a rapid, automated and robust protocol, Chiron, which efficiently resolves severe clashes in low-resolution structures and homology models with minimal perturbation of the protein backbone. Benchmark studies highlight the efficiency and robustness of Chiron compared to other widely used methods. We provide Chiron as an automated web server to evaluate and resolve clashes in protein structures that can then be used for more accurate protein design.
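The energy-based clash definition described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Chiron's implementation: it uses a single 12-6 Lennard-Jones term with hypothetical parameters (SIGMA, EPSILON), an assumed cutoff (CLASH_CUTOFF), and a per-atom normalization as one possible severity metric. None of these values comes from the abstract.

```python
import math
from itertools import combinations

# Hypothetical Lennard-Jones parameters (assumptions, not from the source):
# SIGMA is the distance (in Å) at which the pair energy crosses zero,
# EPSILON is the well depth (in kcal/mol).
SIGMA = 3.4
EPSILON = 0.1
CLASH_CUTOFF = 0.3  # kcal/mol; assumed threshold for calling a pair a clash


def lj_energy(r, sigma=SIGMA, epsilon=EPSILON):
    """12-6 Lennard-Jones pair energy; positive values are repulsive."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)


def find_clashes(coords, bonded=frozenset(), cutoff=CLASH_CUTOFF):
    """Return (i, j, energy) for non-bonded pairs whose energy exceeds cutoff.

    `bonded` holds index pairs (i, j) with i < j that are skipped.
    """
    clashes = []
    for i, j in combinations(range(len(coords)), 2):
        if (i, j) in bonded:
            continue
        energy = lj_energy(math.dist(coords[i], coords[j]))
        if energy > cutoff:
            clashes.append((i, j, energy))
    return clashes


def clash_score(coords, bonded=frozenset()):
    """Total clash energy per atom (an assumed normalization)."""
    total = sum(energy for _, _, energy in find_clashes(coords, bonded))
    return total / len(coords)
```

Calling find_clashes on a list of (x, y, z) tuples returns the offending pairs with their repulsion energies, so a refinement loop could target only those atoms.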
Homology modeling; refinement; Chiron; Discrete Molecular Dynamics; Protein Design
Non-covalent interactions hold the key to understanding many chemical, biological, and technological problems. Describing these non-covalent interactions accurately, including their positions in real space, constitutes a first step in the process of decoupling the complex balance of forces that define non-covalent interactions. Because of the size of macromolecules, the most common approach has been to assign van der Waals interactions (vdW), steric clashes (SC), and hydrogen bonds (HBs) based on pairwise distances between atoms according to their van der Waals radii. We recently developed an alternative perspective, derived from the electronic density: the Non-Covalent Interactions (NCI) index [J. Am. Chem. Soc. 2010, 132, 6498]. This index has the dual advantages of being generally transferable to diverse chemical applications and being very fast to compute, since it can be calculated from promolecular densities. Thus, NCI analysis is applicable to large systems, including proteins and DNA, where analysis of non-covalent interactions is of great potential value. Here, we describe the NCI computational algorithms and their implementation for the analysis and visualization of weak interactions, using both self-consistent fully quantum-mechanical, as well as promolecular, densities. A wide range of options for tuning the range of interactions to be plotted is also presented. To demonstrate the capabilities of our approach, several examples are given from organic, inorganic, solid state, and macromolecular chemistry, including cases where NCI analysis gives insight into unconventional chemical bonding. The NCI code and its manual are available for download at http://www.chem.duke.edu/~yang/software.htm
Molecular structure does not easily identify the intricate non-covalent interactions that govern many areas of biology and chemistry, including design of new materials and drugs. We develop an approach to detect non-covalent interactions in real space, based on the electron density and its derivatives. Our approach reveals underlying chemistry that complements the covalent structure. It provides a rich representation of van der Waals interactions, hydrogen bonds, and steric repulsion in small molecules, molecular complexes, and solids. Most importantly, the method, requiring only knowledge of the atomic coordinates, is efficient and applicable to large systems, such as proteins or DNA. Across these applications, a view of non-bonded interactions emerges as continuous surfaces rather than close contacts between atom pairs, offering rich insight into the design of new and improved ligands.
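To make "the electron density and its derivatives" concrete, the sketch below evaluates the reduced density gradient, s = |grad rho| / (2 (3 pi^2)^(1/3) rho^(4/3)), on a toy promolecular density. It is an illustration only: the atomic densities are modeled as single exponentials A * exp(-zeta * r) with hypothetical A and zeta values, the gradient is taken by central differences, and the names ATOMS, density, gradient and reduced_gradient are invented for this example rather than taken from the NCI code.

```python
import math

# Hypothetical promolecular model: each atom contributes A * exp(-zeta * r).
# Entries are (center, A, zeta); the numbers are illustrative, not fitted.
ATOMS = [((-1.0, 0.0, 0.0), 1.0, 2.0), ((1.0, 0.0, 0.0), 1.0, 2.0)]


def density(point, atoms=ATOMS):
    """Sum of spherical atomic densities at `point`."""
    return sum(a * math.exp(-zeta * math.dist(point, center))
               for center, a, zeta in atoms)


def gradient(point, atoms=ATOMS, h=1e-5):
    """Central-difference gradient of the promolecular density."""
    grad = []
    for k in range(3):
        plus, minus = list(point), list(point)
        plus[k] += h
        minus[k] -= h
        grad.append((density(plus, atoms) - density(minus, atoms)) / (2.0 * h))
    return grad


def reduced_gradient(point, atoms=ATOMS):
    """Reduced density gradient s at `point`."""
    rho = density(point, atoms)
    norm = math.sqrt(sum(g * g for g in gradient(point, atoms)))
    prefactor = 2.0 * (3.0 * math.pi ** 2) ** (1.0 / 3.0)
    return norm / (prefactor * rho ** (4.0 / 3.0))
```

In this toy model s drops to zero at the midpoint between the two atoms, where the density is low and its gradient vanishes; low-density, low-s regions of this kind are what a grid scan would pick out as candidate non-covalent contacts.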
In the title compound, C15H11NO2S, a new thio-benzoxazole derivative, the dihedral angle between the benzoxazole ring and the phenyl ring is 9.91 (9)°. An interesting feature of the crystal structure is the short C⋯S [3.4858 (17) Å] contact, which is shorter than the sum of the van der Waals radii of these atoms. In the crystal structure, molecules are linked together by zigzag intermolecular C—H⋯N interactions into a column along the a axis. The crystal structure is further stabilized by intermolecular π–π interactions [centroid–centroid = 3.8048 (10) Å].
Most current crystallographic structure refinements augment the diffraction data with a priori information consisting of bond, angle, dihedral, planarity restraints and atomic repulsion based on the Pauli exclusion principle. Yet, electrostatics and van der Waals attraction are physical forces that provide additional a priori information. Here we assess the inclusion of electrostatics for the force field used for all-atom (including hydrogen) joint neutron/X-ray refinement. Two DNA and a protein crystal structure were refined against joint neutron/X-ray diffraction data sets using force fields without electrostatics or with electrostatics. Hydrogen bond orientation/geometry favors the inclusion of electrostatics. Refinement of Z-DNA with electrostatics leads to a hypothesis for the entropic stabilization of Z-DNA that may partly explain the thermodynamics of converting the B form of DNA to its Z form. Thus, inclusion of electrostatics assists joint neutron/X-ray refinements, especially for placing and orienting hydrogen atoms.
The title compound, C8H8F2, lies across a crystallographic inversion centre. The structure features short C⋯F [2.8515 (18) Å] and F⋯F [2.490 (4) Å] contacts, which are significantly shorter than the sum of the van der Waals radii of these atoms. The F atom and methylene H atoms are disordered over two positions with a site-occupancy ratio of 0.633 (3):0.367 (3). In the crystal structure, intermolecular C—H⋯F interactions link neighboring molecules into infinite chains along the b axis. In addition, C—H⋯π interactions link these molecules along , forming a two-dimensional network parallel to (101).
In the molecule of the title compound, C8H9N3, a new imidazoline derivative, the six- and five-membered rings are slightly twisted away from each other, forming a dihedral angle of 7.96 (15)°. In the crystal structure, neighbouring molecules are linked together by intermolecular N—H⋯N hydrogen bonds into extended one-dimensional chains along the a axis. The pyridine N atom is in close proximity to a carbon-bound H atom of the imidazoline ring, with an H⋯N distance of 2.70 Å, which is slightly shorter than the sum of the van der Waals radii of these atoms (2.75 Å). The crystal structure is further stabilized by intermolecular C—H⋯π and π–π interactions (centroid-to-centroid distance 3.853 Å).
Atomic radii are not precisely defined but are nevertheless widely used parameters in modeling and understanding molecular structure and interactions. The van der Waals radii determined by Bondi from molecular crystals and noble gas crystals are the most widely used values, but Bondi recommended radius values for only 28 of the 44 main-group elements in the periodic table. In the present article we present atomic radii for the other 16; these new radii were determined in a way designed to be compatible with Bondi’s scale. The method chosen is a set of two-parameter correlations of Bondi’s radii with repulsive-wall distances calculated by relativistic coupled-cluster electronic structure calculations. The newly determined radii (in Å) are Be, 1.53; B, 1.92; Al, 1.84; Ca, 2.31; Ge, 2.11; Rb, 3.03; Sr, 2.50; Sb, 2.06; Cs, 3.43; Ba, 2.68; Bi, 2.07; Po, 1.97; At, 2.02; Rn, 2.20; Fr, 3.48; and Ra, 2.83.
A method to accelerate the computation of structure factors from an electron density described by anisotropic and aspherical atomic form factors via fast Fourier transformation is described for the first time.
Recent advances in computational chemistry have produced force fields based on a polarizable atomic multipole description of biomolecular electrostatics. In this work, the Atomic Multipole Optimized Energetics for Biomolecular Applications (AMOEBA) force field is applied to restrained refinement of molecular models against X-ray diffraction data from peptide crystals. A new formalism is also developed to compute anisotropic and aspherical structure factors using fast Fourier transformation (FFT) of Cartesian Gaussian multipoles. Relative to direct summation, the FFT approach can give a speedup of more than an order of magnitude for aspherical refinement of ultrahigh-resolution data sets. Use of a sublattice formalism makes the method highly parallelizable. Application of the Cartesian Gaussian multipole scattering model to a series of four peptide crystals using multipole coefficients from the AMOEBA force field demonstrates that AMOEBA systematically underestimates electron density at bond centers. For the trigonal and tetrahedral bonding geometries common in organic chemistry, an atomic multipole expansion through hexadecapole order is required to explain bond electron density. Alternatively, the addition of interatomic scattering (IAS) sites to the AMOEBA-based density captured bonding effects with fewer parameters. For a series of four peptide crystals, the AMOEBA–IAS model lowered R_free by 20–40% relative to the original spherically symmetric scattering model.
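For readers unfamiliar with the statistic, the sketch below shows how a crystallographic R-factor and its cross-validated counterpart R_free are conventionally computed, and how a relative reduction like the 20–40% quoted above would be expressed. It assumes the standard definition R = sum(| |Fobs| - k |Fcalc| |) / sum(|Fobs|); the function names and the free_flags argument are hypothetical and not tied to any refinement package.

```python
def r_factor(f_obs, f_calc, scale=1.0):
    """Conventional R-factor over paired structure-factor amplitudes."""
    numerator = sum(abs(abs(o) - scale * abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def r_work_and_free(f_obs, f_calc, free_flags, scale=1.0):
    """Split reflections by free_flags and return (R_work, R_free).

    Reflections flagged True are the test set kept out of refinement.
    """
    work_obs = [o for o, flag in zip(f_obs, free_flags) if not flag]
    work_calc = [c for c, flag in zip(f_calc, free_flags) if not flag]
    free_obs = [o for o, flag in zip(f_obs, free_flags) if flag]
    free_calc = [c for c, flag in zip(f_calc, free_flags) if flag]
    return (r_factor(work_obs, work_calc, scale),
            r_factor(free_obs, free_calc, scale))


def relative_reduction(r_before, r_after):
    """Fractional decrease of an R value, e.g. 0.2 for a 20% reduction."""
    return (r_before - r_after) / r_before
```

Because R_free is evaluated only on reflections excluded from refinement, a drop in R_free is less likely to reflect overfitting than an equal drop in R_work.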
scattering factors; aspherical; anisotropic; force fields; multipole; polarization; AMOEBA; bond density; direct summation; FFT; SGFFT; Ewald; PME
In the title compound, [ReBr(C16H12Cl2F2N2)(CO)3], the Re atom is in a slightly distorted octahedral coordination environment with the three carbonyl ligands having a fac configuration. The diimine ligand is equatorial and is bonded to the Re centre in an N,N′-bidentate chelating fashion, with a bite angle of 77.7 (2)°. The dihedral angle between the two benzene rings is 88.7 (6)°. In the crystal structure, there are F⋯O [2.856 (9) Å], Cl⋯C [3.150 (8) Å] and O⋯C [2.984 (10) Å] contacts which are shorter than the sum of the van der Waals radii for these atoms. In addition, symmetry-related molecules are linked via intermolecular C—H⋯O, C—H⋯Br and the F⋯O interactions into one-dimensional chains extending along the a axis. The crystal structure is further stabilized by intermolecular π–π interactions [centroid–centroid distance = 3.571 (5) Å].
The asymmetric unit of the title compound, C16H14O4, consists of one half-molecule of an essentially planar biphenyldicarboxylic acid ester, with the complete molecule generated by an inversion centre. The maximum deviation from a least-squares plane through all non-H atoms occurs for the peripheric methyl groups and amounts to 0.124 (2) Å. The solid represents a typical molecular crystal without classical hydrogen bonds. The shortest intermolecular contacts do not differ significantly from the sum of the van der Waals radii of the atoms involved.
The title compound, C24H25Br, packs efficiently in the crystal structure with no solvent-accessible voids and several intermolecular H⋯H contacts approximating the sum of the van der Waals radii. The molecule is quite crowded, with intramolecular Br⋯H and C⋯H contacts ca 0.38 and 0.30 Å, respectively, less than the sum of the corresponding van der Waals radii. All cyclohexyl rings adopt chair conformations with the ‘seat’ of the chair inclined at approximately 57–81° to the mean plane of the benzene ring, while those ortho to bromine have their centroids displaced in opposite directions from this plane.
The crystal structure of the complex between the dodecamer d(CGCGAATTCGCG) and a synthetic dye molecule Hoechst 33258 was solved by X-ray diffraction analysis and refined to an R-factor of 15.7% at 2.25 Å resolution. The crescent-shaped Hoechst compound is found to bind to the central four AATT base pairs in the narrow minor groove of the B-DNA double helix. The piperazine ring of the drug has its flat face almost parallel to the aromatic bisbenzimidazole ring and lies sideways in the minor groove. No evidence of disordered structure of the drug is seen in the complex. The binding of Hoechst to DNA is stabilized by a combination of hydrogen bonding, van der Waals interaction and electrostatic interactions. The binding preference for AT base pairs by the drug is the result of the close contact between the Hoechst molecule and the C2 hydrogen atoms of adenine. The nature of these contacts precludes the binding of the drug to G-C base pairs due to the presence of N2 amino groups of guanines. The present crystal structural information agrees well with the data obtained from chemical footprinting experiments.
In the title compound, C11H11N3O2S2, the dihedral angle between the benzene ring and the five-membered ring is 6.85 (9)°. An intramolecular O—H⋯N hydrogen bond makes an S(6) ring motif. In the crystal, molecules are linked through bifurcated N—H⋯(O,O) hydrogen bonds with R
2(5) ring motifs, forming chains along the b axis. A short C⋯S contact [3.3189 (19) Å], which is shorter than the sum of the van der Waals radii of these atoms (3.50 Å), occurs in the structure. The crystal structure is further stabilized by C—H⋯N hydrogen bonding and π–π interactions [centroid–centroid distance = 3.7649 (12) Å].
In the title hydrazide Schiff base compound, C14H12ClN3O2, the conformation around the C=N double bond is E. The dihedral angle between the benzene rings is 41.57 (14)°. An intramolecular O—H⋯N hydrogen bond makes an S(6) ring motif. In the crystal, molecules are linked by N—H⋯O (bifurcated acceptor) and N—H⋯N hydrogen bonds, forming chains along the a axis. The interesting feature of the crystal structure is the short intermolecular C⋯O [3.216 (3), 3.170 (3), and 2.992 (3) Å] contacts, one of which is significantly shorter than the sum of the van der Waals radii of these atoms [3.22 Å].
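The comparison of a contact distance with the sum of van der Waals radii can be written out as a short calculation. The sketch below uses Bondi radii for a few elements, chosen because C + O reproduces the 3.22 Å sum quoted above; the helper names vdw_sum and shortfall are hypothetical.

```python
# Bondi van der Waals radii in Å for a few common elements.
BONDI_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}


def vdw_sum(element_a, element_b):
    """Sum of the van der Waals radii of two elements, in Å."""
    return BONDI_RADII[element_a] + BONDI_RADII[element_b]


def shortfall(distance, element_a, element_b):
    """How much shorter a contact is than the radii sum (negative if longer)."""
    return vdw_sum(element_a, element_b) - distance


# The three C...O contacts reported in the abstract above, in Å.
c_o_contacts = [3.216, 3.170, 2.992]
shortfalls = [round(shortfall(d, "C", "O"), 3) for d in c_o_contacts]
```

For these three distances the shortfalls are 0.004, 0.050 and 0.228 Å, which makes explicit why only the 2.992 Å contact is described as significantly shorter than the radii sum.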
A large number of viral capsids, as well as other macromolecular assemblies, have icosahedral structure or structures with other rotational symmetries. This symmetry can be exploited during molecular dynamics (MD) to model in effect the full viral capsid using only a subset of primary atoms plus copies of image atoms generated from rotational symmetry boundary conditions (RSBC). A pure rotational symmetry operation results in both primary and image atoms at short range, and within nonbonded interaction distance of each other, so that nonbonded interactions cannot be specified by the minimum image convention and explicit treatment of image atoms is required. As such, an unavoidable consequence of RSBC is that the enumeration of nonbonded interactions in regions surrounding certain rotational axes must include both a primary atom and its copied image atom, thereby imposing microscopic symmetry for some forces. We examined the possibility of artifacts arising from this imposed microscopic symmetry of RSBC using two simulation systems: a water shell and human rhinovirus 14 (HRV14) capsid with explicit water. The primary unit was a pentamer of the icosahedron, which has the advantage of direct comparison of icosahedrally equivalent spatial regions, for example regions near a 2-fold symmetry axis with imposed symmetry and a 2-fold axis without imposed symmetry. Analysis of structural and dynamic properties of water molecules and protein atoms found similar behavior near symmetry axes with imposed symmetry and where the minimum image convention fails compared with that in other regions in the simulation system, even though an excluded volume effect was detected for water molecules near the axes with imposed symmetry. These results validate the use of RSBC for icosahedral viral capsids or other rotationally symmetric systems.
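The generation of image atoms by a rotational symmetry operation, and the reason a primary atom near an axis ends up within interaction range of its own image, can be illustrated with a short sketch. It assumes an n-fold axis through the origin given as a unit vector and applies Rodrigues' rotation formula; the function names are hypothetical and the code is not taken from any MD package.

```python
import math


def rotate(point, axis, angle):
    """Rotate `point` about a unit `axis` through the origin (Rodrigues' formula)."""
    ux, uy, uz = axis
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    dot = ux * x + uy * y + uz * z
    cross = (uy * z - uz * y, uz * x - ux * z, ux * y - uy * x)
    return (
        x * c + cross[0] * s + ux * dot * (1.0 - c),
        y * c + cross[1] * s + uy * dot * (1.0 - c),
        z * c + cross[2] * s + uz * dot * (1.0 - c),
    )


def image_atoms(point, axis, n_fold):
    """The n_fold - 1 symmetry images of a primary atom about an n-fold axis."""
    step = 2.0 * math.pi / n_fold
    return [rotate(point, axis, k * step) for k in range(1, n_fold)]


def nearest_image_distance(point, axis, n_fold):
    """Distance from a primary atom to its closest symmetry image."""
    return min(math.dist(point, image) for image in image_atoms(point, axis, n_fold))
```

An atom 0.5 Å from a 2-fold axis sits only 1.0 Å from its own image, so the primary atom and its copy must both appear in the nonbonded list, which is the imposed microscopic symmetry the abstract examines.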
Specific binding between proteins plays a crucial role in molecular functions and biological processes. In previous studies, protein binding interfaces and their atomic contacts have typically been defined by simple criteria, such as a threshold on spatial distance. These definitions neglect the nearby atomic organization of contact atoms, and thus detect many contacts that are interrupted by other atoms. It is questionable whether such interrupted contacts are as important as other contacts in protein binding. To tackle this challenge, we propose a new definition called beta (β) atomic contacts. Our definition, founded on the β-skeletons in computational geometry, requires that there is no other atom in the contact spheres defined by two contact atoms; this sphere is similar to the van der Waals spheres of atoms. The statistical analysis on a large dataset shows that β contacts are only a small fraction of conventional distance-based contacts. To empirically quantify the importance of β contacts, we design βACV, an SVM classifier with β contacts as input, to classify homodimers from crystal packing. We found that our βACV achieves state-of-the-art classification performance, superior to SVM classifiers with distance-based contacts as input. Our βACV also outperforms several existing methods when evaluated on several datasets from previous works. The promising empirical performance suggests that β contacts can truly identify critical specific contacts in protein binding interfaces. β contacts thus provide a new model for more precise description of atomic organization in protein quaternary structures than distance-based contacts.
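The difference between a distance-based contact and an uninterrupted contact can be shown with a small sketch. It assumes the simplest member of the β-skeleton family (β = 1, the Gabriel-graph criterion), in which the contact sphere is the sphere whose diameter is the segment joining the two atoms; the β value and sphere construction used by the authors may differ, and all function names here are hypothetical.

```python
import math
from itertools import combinations


def distance_contacts(coords, cutoff):
    """All atom pairs (i, j), i < j, separated by at most `cutoff` Å."""
    return [(i, j) for i, j in combinations(range(len(coords)), 2)
            if math.dist(coords[i], coords[j]) <= cutoff]


def is_interrupted(coords, i, j):
    """True if a third atom lies strictly inside the sphere with diameter i-j."""
    center = tuple((a + b) / 2.0 for a, b in zip(coords[i], coords[j]))
    radius = math.dist(coords[i], coords[j]) / 2.0
    return any(k not in (i, j) and math.dist(coords[k], center) < radius
               for k in range(len(coords)))


def beta_contacts(coords, cutoff):
    """Distance contacts that are not interrupted by any other atom."""
    return [(i, j) for i, j in distance_contacts(coords, cutoff)
            if not is_interrupted(coords, i, j)]
```

For three collinear atoms spaced 2 Å apart and a 5 Å cutoff, all three pairs are distance contacts, but the pair of end atoms is rejected because the middle atom sits inside their contact sphere.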
In the title complex, [Re2(C6H5Te)2(C11H9N)(CO)7], two Re atoms are coordinated in slightly distorted octahedral coordination environments and are bridged by two Te atoms, which are coordinated in trigonal-pyramidal environments. The torsion angle for the Te—Re—Te—Re sequence of atoms is 17.06 (3)°. The crystal structure is stabilized by weak C—H⋯O and C—H⋯π interactions. In addition, there are Te⋯Te distances [4.0392 (12) Å] and O⋯O distances [2.902 (19) Å] which are shorter than the sum of the van der Waals radii for these atoms. A short intermolecular lone pair⋯π distance [C O⋯Cg = 3.31 (2) Å] is also observed.
In the title molecule, C15H10BrNO3S2, the dihedral angle between the benzothiazole ring system and the benzene ring is 67.57 (12)°. The crystal structure is stabilized by weak intermolecular C—H⋯O interactions. In addition, there is an intermolecular Br⋯C [3.379 (3) Å] contact which is shorter than the sum of the van der Waals radii of these atoms.
Restriction endonuclease Bse634I recognizes and cleaves the degenerate DNA sequence 5′-R/CCGGY-3′ (R stands for A or G; Y for T or C, ‘/’ indicates a cleavage position). Here, we report the crystal structures of the Bse634I R226A mutant complexed with cognate oligoduplexes containing ACCGGT and GCCGGC sites, respectively. In the crystal, all potential H-bond donor and acceptor atoms on the base edges of the conserved CCGG core are engaged in the interactions with Bse634I amino acid residues located on the α6 helix. In contrast, direct contacts between the protein and outer base pairs are limited to van der Waals contact between the purine nucleobase and Pro203 residue in the major groove and a single H-bond between the O2 atom of the outer pyrimidine and the side chain of the Asn73 residue in the minor groove. Structural data coupled with biochemical experiments suggest that both van der Waals interactions and indirect readout contribute to the discrimination of the degenerate base pair by Bse634I. Structure comparison between related enzymes Bse634I (R/CCGGY), NgoMIV (G/CCGGC) and SgrAI (CR/CCGGYG) reveals how different specificities are achieved within a conserved structural core.
Orientational disorder of the distal nitrosyl (NO) ligand in iron porphyrinates is a common phenomenon. We present an analysis of multi-temperature crystallographic data for the order/disorder phenomenon. The observed temperature-dependent order/disorder and variable rotational orientations of nitrosyl ligands for six different six-coordinate iron porphyrinates have been examined in terms of the nonbonded contacts found in the solid state. Favorable orientations for NO can be identified either by the calculation of the close nonbonded contacts or by evaluation of the geometry-dependent potential energy using semi-empirical nonbonded potential functions. The nonbonded contacts display temperature-dependent differences consistent with observed structural differences. The motion of NO appears to be controlled by intermolecular interactions that allow a limited set of orientations and under some conditions only a single NO orientation is allowed. In some cases, the equilibria involving the orientations of NO can be analyzed using the van’t Hoff relationship and the free energy and the enthalpy of the solid-state transitions evaluated. The intrinsic barriers to rotation of the NO were examined using a fine-meshed series of DFT calculations. The calculations also showed the detailed effects of the variation of the NO orientation on the equatorial bond distances.
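A van't Hoff analysis of the kind mentioned above amounts to a straight-line fit of ln K against 1/T, using ln K = -ΔH/(RT) + ΔS/R. The sketch below is a generic least-squares version under the assumption that K is the ratio of populations of two NO orientations at each temperature; the function names are hypothetical, and no data from the study are used.

```python
import math

R_GAS = 8.314462618  # molar gas constant in J/(mol*K)


def vant_hoff_fit(temperatures, equilibrium_constants):
    """Least-squares fit of ln K = -dH/(R*T) + dS/R.

    Returns (dH in J/mol, dS in J/(mol*K)).
    """
    xs = [1.0 / t for t in temperatures]
    ys = [math.log(k) for k in equilibrium_constants]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope * R_GAS, intercept * R_GAS


def free_energy(d_h, d_s, temperature):
    """Gibbs free energy change dG = dH - T*dS at a given temperature."""
    return d_h - temperature * d_s
```

In practice K at each temperature would come from refined site occupancies of the two orientations, and the scatter about the fitted line indicates how well a simple two-state equilibrium describes the disorder.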
The ability to predict protein-protein binding sites has a wide range of applications, including signal transduction studies, de novo drug design, structure identification and comparison of functional sites. The interface in a complex involves two structurally matched protein subunits, and the binding sites can be predicted by identifying structural matches at protein surfaces.
We propose a method which enumerates “all” the configurations (or poses) between two proteins (3D coordinates of the two subunits in a complex) and evaluates each configuration by the interaction between its components using the Atomic Contact Energy function. The enumeration is achieved efficiently by exploring a set of rigid transformations. Our approach incorporates a surface identification technique and a method for avoiding clashes of two subunits when computing rigid transformations. When the optimal transformations according to the Atomic Contact Energy function are identified, the corresponding binding sites are given as predictions. Our results show that this approach consistently performs better than other methods in binding site identification.
Our method achieved a higher success rate than other methods, with prediction quality improved in terms of both accuracy and coverage. Moreover, our method is able to predict the configurations of two binding proteins, whereas most other methods predict only the binding sites. The software package is available at http://sites.google.com/site/guofeics/dobi for non-commercial use.
The prediction of ligand binding or protein structure requires very accurate force field potentials – even small errors in force field potentials can make a 'wrong' structure (from the billions possible) more stable than the single, 'correct' one. However, despite huge efforts to optimize them, currently-used all-atom force fields are still not able, in a vast majority of cases, even to keep a protein molecule in its native conformation in the course of molecular dynamics simulations or to bring an approximate, homology-based model of protein structure closer to its native conformation.
A strict analysis shows that a specific coupling of multi-atom Van der Waals interactions with covalent bonding can, in extreme cases, increase (or decrease) the interaction energy by about 20–40% at certain angles between the direction of interaction and the covalent bond. It is also shown that on average multi-body effects decrease the total Van der Waals energy in proportion to the square root of the electronic component of dielectric permittivity corresponding to dipole-dipole interactions at small distances, where Van der Waals interactions take place.
The study shows that currently-ignored multi-atom Van der Waals interactions can, in certain instances, lead to significant energy effects, comparable to those caused by the replacement of atoms (for instance, C by N) in conventional pairwise Van der Waals interactions.
We present an implicit solvent coarse-grained (CG) model for quantitative simulations of 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC) bilayers. The absence of explicit solvent enables membrane simulations on large length and time scales at moderate computational expense. Despite improved computational efficiency, the model preserves chemical specificity and quantitative accuracy. The bonded and nonbonded interactions together with the effective cohesion mimicking the hydrophobic effect were systematically tuned by matching structural and mechanical properties from experiments and all-atom bilayer simulations, such as saturated area per lipid, radial distribution functions, density and pressure profiles across the bilayer, P2 order, etc. The CG lipid model is shown to self-assemble into a bilayer starting from a random dispersion. Its line tension and elastic properties, such as bending and stretching modulus, are semiquantitatively consistent with experiments. The effects of (i) reduced molecular friction and (ii) more efficient integration combine to an overall speed-up of 3−4 orders of magnitude compared to all-atom bilayer simulations. Our CG lipid model is especially useful for studies of large-scale phenomena in membranes that nevertheless require a fair description of chemical specificity, e.g., membrane patches interacting with movable and transformable membrane proteins and peptides.
Achieving atomic-level accuracy in comparative protein models is limited by our ability to refine the initial, homolog-derived model closer to the native state. Despite considerable effort, progress in developing a generalized refinement method has been limited. In contrast, methods have been described that can accurately reconstruct loop conformations in native protein structures. We hypothesize that loop refinement in homology models is much more difficult than loop reconstruction in crystal structures, in part, because side-chain, backbone, and other structural inaccuracies surrounding the loop create a challenging sampling problem; the loop cannot be refined without simultaneously refining adjacent portions. In this work, we single out one sampling issue in an artificial but useful test set and examine how loop refinement accuracy is affected by errors in surrounding side-chains. In 80 high-resolution crystal structures, we first perturbed 6–12 residue loops away from the crystal conformation, and placed all protein side chains in non-native but low energy conformations. Even these relatively small perturbations in the surroundings made the loop prediction problem much more challenging. Using a previously published loop prediction method, median backbone (N-Cα-CO) RMSDs for groups of 6, 8, 10, and 12 residue loops are 0.3/0.6/0.4/0.6 Å, respectively, on native structures and increase to 1.1/2.2/1.5/2.3 Å on the perturbed cases. We then augmented our previous loop prediction method to simultaneously optimize the rotamer states of side chains surrounding the loop. Our results show that this augmented loop prediction method can recover the native state in many perturbed structures where the previous method failed; the median RMSDs for the 6, 8, 10, and 12 residue perturbed loops improve to 0.4/0.8/1.1/1.2 Å.
Finally, we highlight three comparative models from blind tests, in which our new method predicted loops closer to the native conformation than first modeled using the homolog template, a task generally understood to be difficult. Although many challenges remain in refining full comparative models to high accuracy, this work offers a methodical step toward that goal.
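The backbone RMSD values quoted above can be reproduced in form by a short calculation. The sketch below assumes the predicted and reference loops are already expressed in the same coordinate frame and list the same backbone atoms in the same order; it performs no superposition, and the function names are hypothetical.

```python
import math
from statistics import median


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def median_rmsd(pairs):
    """Median RMSD over (predicted, reference) coordinate pairs for a loop set."""
    return median(rmsd(predicted, reference) for predicted, reference in pairs)
```

Reporting the median over a test set, as the abstract does, keeps a few badly mispredicted loops from dominating the summary statistic.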
comparative; homology; modeling; refinement; loop prediction; molecular mechanics; force field