The new automated iterative Hirshfeld atom refinement method is explained and validated through comparison of structural models of Gly–l-Ala obtained from synchrotron X-ray and neutron diffraction data at 12, 50, 150 and 295 K. Structural parameters involving hydrogen atoms are determined with comparable precision from both experiments and agree mostly to within two combined standard uncertainties.
Hirshfeld atom refinement (HAR) is a method which determines structural parameters from single-crystal X-ray diffraction data by using an aspherical atom partitioning of tailor-made ab initio quantum mechanical molecular electron densities without any further approximation. Here the original HAR method is extended by implementing an iterative procedure of successive cycles of electron density calculations, Hirshfeld atom scattering factor calculations and structural least-squares refinements, repeated until convergence. The importance of this iterative procedure is illustrated via the example of crystalline ammonia. The new HAR method is then applied to X-ray diffraction data of the dipeptide Gly–l-Ala measured at 12, 50, 100, 150, 220 and 295 K, using Hartree–Fock and BLYP density functional theory electron densities and three different basis sets. All positions and anisotropic displacement parameters (ADPs) are freely refined without constraints or restraints – even those for hydrogen atoms. The results are systematically compared with those from neutron diffraction experiments at the temperatures 12, 50, 150 and 295 K. Although non-hydrogen-atom ADPs differ by up to three combined standard uncertainties (csu’s), all other structural parameters agree within less than 2 csu’s. Using our best calculations (BLYP/cc-pVTZ, recommended for organic molecules), the accuracy of determining bond lengths involving hydrogen atoms from HAR is better than 0.009 Å for temperatures of 150 K or below; for hydrogen-atom ADPs it is better than 0.006 Å² as judged from the mean absolute X-ray minus neutron differences. These results are among the best ever obtained. Remarkably, the precision of determining bond lengths and ADPs for the hydrogen atoms from the HAR procedure is comparable with that from the neutron measurements – an outcome which is obtained with a routinely achievable resolution of the X-ray data of 0.65 Å.
aspherical atom partitioning; quantum mechanical molecular electron densities; X-ray structure refinement; hydrogen atom modelling; anisotropic displacement parameters
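The agreement criteria quoted in the abstract above (mean absolute X-ray minus neutron difference, and differences expressed in combined standard uncertainties) can be illustrated with a minimal sketch. It assumes that the csu of two independent estimates is the root sum of squares of their standard uncertainties; the function names and the numerical values are hypothetical and are not taken from the study.

```python
import math

def csu(sigma_x, sigma_n):
    """Combined standard uncertainty of two independent estimates
    (assumed here to be the root sum of squares)."""
    return math.sqrt(sigma_x ** 2 + sigma_n ** 2)

def compare(xray, neutron):
    """Compare matching parameters from X-ray and neutron refinements.

    xray, neutron: lists of (value, standard_uncertainty) in the same order.
    Returns (mean absolute difference, list of |difference| / csu).
    """
    diffs = [x - n for (x, _), (n, _) in zip(xray, neutron)]
    ratios = [abs(d) / csu(sx, sn)
              for d, (_, sx), (_, sn) in zip(diffs, xray, neutron)]
    mad = sum(abs(d) for d in diffs) / len(diffs)
    return mad, ratios

# Made-up X-H bond lengths in Å with standard uncertainties (illustration only).
xray = [(1.085, 0.006), (1.030, 0.008)]
neutron = [(1.092, 0.004), (1.035, 0.005)]
mad, ratios = compare(xray, neutron)
within_two_csu = all(r < 2.0 for r in ratios)
```

A parameter is counted as agreeing "within 2 csu" when its ratio is below 2.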
A unified superspace model, based on average triplite structure, for the description of different modulation periodicities of wagnerite and related phases
Reinvestigation of more than 40 samples of minerals belonging to the wagnerite group (Mg,Fe,Mn)2(PO4)(F,OH) from diverse geological environments worldwide, using single-crystal X-ray diffraction analysis, showed that most crystals have incommensurate structures and, as such, are not adequately described with known polytype models (2b), (3b), (5b), (7b) and (9b). Therefore, we present here a unified superspace model for the structural description of periodically and aperiodically modulated wagnerite with the (3+1)-dimensional superspace group C2/c(0β0)s0 based on the average triplite structure with cell parameters a ≃ 12.8, b ≃ 6.4, c ≃ 9.6 Å, β ≃ 117° and the modulation vectors q = βb*. The superspace approach provides a simple way to model the positional and occupational modulation of Mg/Fe and F/OH in wagnerite. This allows direct comparison of crystal properties.
wagnerite; modulated structure; superspace; unified model; triplite
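The distinction between periodically and aperiodically modulated crystals described above can be sketched as a test of whether the component β of q = βb* is close to a simple rational number. This is only an illustration: it assumes that the polytype labels (2b), (3b), ... denote n-fold superstructures along b, and the function name, tolerance and β values are hypothetical.

```python
def commensurate_multiple(beta, max_n=12, tol=1e-3):
    """Return the smallest n <= max_n for which beta lies within tol of m/n
    for some integer m, or None if no such n exists (treated here as
    incommensurate)."""
    for n in range(1, max_n + 1):
        m = round(n * beta)
        if abs(beta - m / n) <= tol:
            return n
    return None

# Illustrative values of beta (not refined values from the study).
print(commensurate_multiple(0.5))     # n-fold supercell with n = 2
print(commensurate_multiple(0.4))     # n = 5
print(commensurate_multiple(0.4587))  # no simple rational within tolerance
```

In practice the decision depends on the experimental uncertainty of β, so the tolerance should reflect the precision of the measured satellite positions.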
Multi-temperature single-crystal and powder diffraction experiments on 1-(2′-aminophenyl)-2-methyl-4-nitroimidazole show that this crystal undergoes an isomorphic phase transition with the coexistence of two phase domains over a wide temperature range. The anharmonic approach was the only way to model the resulting disorder.
The harmonic model of atomic nuclear motions is usually enough for multipole modelling of high-resolution X-ray diffraction data; however, in some molecular crystals, such as 1-(2′-aminophenyl)-2-methyl-4-nitro-1H-imidazole [Paul, Kubicki, Jelsch et al. (2011). Acta Cryst. B67, 365–378], it may not be sufficient for a correct description of the charge-density distribution. Multipole refinement using harmonic atom vibrations does not lead to the best electron density model in this case and the so-called ‘shashlik-like’ pattern of positive and negative residual electron density peaks is observed in the vicinity of some atoms. This slight disorder, which cannot be modelled by split atoms, was modelled using third-order anharmonic nuclear motion (ANM) parameters. Multipole refinement of the experimental high-resolution X-ray diffraction data of 1-(2′-aminophenyl)-2-methyl-4-nitro-1H-imidazole at three different temperatures (10, 35 and 70 K) and a series of powder diffraction experiments (20 ≤ T ≤ 300 K) were performed to relate this anharmonicity observed for several light atoms (N atoms of amino and nitro groups, and O atoms of nitro groups) to an isomorphic phase transition reflected by a change in the b cell parameter around 65 K. The observed disorder may result from the coexistence of domains of two phases over a large temperature range, as shown by low-temperature powder diffraction.
anharmonicity; isomorphic phase transition; experimental charge density; X-ray closed-circuit helium cryostat; Hansen–Coppens model; multiple-temperature powder diffraction
High-resolution X-ray diffraction data on forms I–IV of sulfathiazole and neutron diffraction data on forms II–IV have been collected at 100 K and analyzed using the Atoms in Molecules topological approach. The molecular thermal motion as judged by the anisotropic displacement parameters (ADPs) is very similar in all four forms. The ADP of the thiazole sulfur atom had the greatest amplitude perpendicular to the five-membered ring, and analysis of the temperature dependence of the ADPs indicates that this is due to genuine thermal motion rather than a concealed disorder. A minor disorder (∼1–2%) is evident for forms I and II, but a statistical analysis reveals no deleterious effect on the derived multipole populations. The topological analysis reveals an intramolecular S–O···S interaction, which is consistently present in all experimental topologies. Analysis of the gas-phase conformation of the molecule indicates two low-energy theoretical conformers, one of which possesses the same intramolecular S–O···S interaction observed in the experimental studies and the other an S–O···H–N intermolecular interaction. These two interactions appear responsible for “locking” the molecular conformation. The lattice energies of the various polymorphs computed from the experimental multipole populations are highly dependent on the exact refinement model. They are similar in magnitude to theoretically derived lattice energies, but the relatively high estimated errors mean that this method is insufficiently accurate to allow a definitive stability order for the sulfathiazole polymorphs at 0 K to be determined.
High-resolution X-ray diffraction data on sulfathiazole (forms I–IV) and neutron diffraction data have been used to analyze the polymorphic electron density using Quantum Theory of Atoms in Molecules. Two low-energy theoretical conformers are found in the gas phase, one of which possesses an S–O···S interaction (a) and the other an S–O···H–N (b) intermolecular interaction. These interactions appear responsible for “locking” the molecular conformation.
Chemical bonding at the active site of lysozyme is analyzed on the basis of a multipole model employing transferable multipole parameters from a database. Large B factors at low temperatures reflect frozen-in disorder and therefore prevent a meaningful free refinement of multipole parameters.
Chemical bonding at the active site of hen egg-white lysozyme (HEWL) is analyzed on the basis of Bader’s quantum theory of atoms in molecules [QTAIM; Bader (1994), Atoms in Molecules: A Quantum Theory. Oxford University Press] applied to electron-density maps derived from a multipole model. The observation is made that the atomic displacement parameters (ADPs) of HEWL at a temperature of 100 K are larger than ADPs in crystals of small biological molecules at 298 K. This feature shows that the ADPs in the cold crystals of HEWL reflect frozen-in disorder rather than thermal vibrations of the atoms. Directly generalizing the results of multipole studies on small-molecule crystals, the important consequence for electron-density analysis of protein crystals is that multipole parameters cannot be independently varied in a meaningful way in structure refinements. Instead, a multipole model for HEWL has been developed by refinement of atomic coordinates and ADPs against the X-ray diffraction data of Wang and coworkers [Wang et al. (2007), Acta Cryst. D63, 1254–1268], while multipole parameters were fixed to the values for transferable multipole parameters from the ELMAM2 database [Domagala et al. (2012), Acta Cryst. A68, 337–351]. Static and dynamic electron densities based on this multipole model are presented. Analysis of their topological properties according to the QTAIM shows that the covalent bonds possess similar properties to the covalent bonds of small molecules. Hydrogen bonds of intermediate strength are identified for the Glu35 and Asp52 residues, which are considered to be essential parts of the active site of HEWL. Furthermore, a series of weak C—H⋯O hydrogen bonds are identified by means of the existence of bond critical points (BCPs) in the multipole electron density. It is proposed that these weak interactions might be important for defining the tertiary structure and activity of HEWL. The deprotonated state of Glu35 prevents a distinction between the Phillips and Koshland mechanisms.
hen egg-white lysozyme; multipole model; multipole parameters
Formamide harmonic and anharmonic frequencies of fundamental vibrations in the gas phase and in several solvents were successfully estimated in the B3LYP Kohn-Sham complete basis set limit (KS CBS). CBS results were obtained by extrapolating a power function (two-parameter formula) to the results calculated with polarization-consistent basis sets. Anharmonic corrections using the second-order perturbation treatment (PT2) and hybrid B3LYP functional combined with polarization-consistent pc-n (n = 0, 1, 2, 3, 4) and several Pople’s basis sets were analyzed for all fundamental formamide vibrational modes in the gas phase and solution. Solvent effects were modeled within a PCM method. The anharmonic frequency of diagnostic amide vibration C=O in the gas phase and the CCl4 solution calculated with the VPT2 method was significantly closer to experimental data than the corresponding harmonic frequency. Both harmonic and anharmonic frequencies of C=O stretching mode decreased linearly with solvent polarity, expressed by relative environment permittivity (ε) ratio (ε − 1)/(2ε + 1). However, an unphysical behavior of solvent dependence of some low-frequency anharmonic amide modes of formamide (e.g., CN stretch, NH2 scissoring, and NH2 in-plane bend) was observed, probably due to the presence of severe anharmonicity and Fermi resonance.
Electronic supplementary material
The online version of this article (doi:10.1007/s00894-010-0944-9) contains supplementary material, which is available to authorized users.
Harmonic vibration; Anharmonic vibration; Complete basis set limit; Formamide; Solvent effect
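The complete-basis-set extrapolation mentioned in the abstract above can be sketched as a least-squares fit of a two-parameter power function. The abstract does not state the functional form, so the sketch assumes Y(X) = Y_CBS + A/X³, a commonly used choice; the function name, the exponent and the synthetic input values are assumptions made for illustration.

```python
def cbs_extrapolate(cardinals, values, power=3):
    """Least-squares fit of Y(X) = Y_cbs + A / X**power.

    cardinals: basis-set cardinal numbers chosen by the caller.
    values:    the property (e.g. a frequency in cm^-1) for each basis set.
    Returns (Y_cbs, A). The exponent 3 is an assumed default.
    """
    if len(cardinals) != len(values) or len(cardinals) < 2:
        raise ValueError("need at least two (cardinal, value) pairs")
    # The model is linear in t = X**(-power), so ordinary linear
    # regression of the values on t gives slope A and intercept Y_cbs.
    t = [float(x) ** -power for x in cardinals]
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(values) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, values))
    a = sxy / sxx
    return y_mean - a * t_mean, a

# Synthetic data generated from the assumed model (not computed frequencies).
cards = [2, 3, 4]
freqs = [1750.0 + 80.0 / x ** 3 for x in cards]
y_cbs, a = cbs_extrapolate(cards, freqs)
```

With exactly two points the fit reduces to the usual two-point extrapolation; with three or more it averages over basis-set noise.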
The harmonic and anharmonic frequencies of fundamental vibrations in formaldehyde and water were successfully estimated using the B3LYP Kohn-Sham limit. The results obtained with polarization- and correlation-consistent basis sets were fitted with a two-parameter formula. Anharmonic corrections were obtained by a second order perturbation treatment (PT2). We compared the performance of the PT2 scheme on the two title molecules using SCF, MP2 and DFT (BLYP, B3LYP, PBE and B3PW91 functionals) methods combined with polarization consistent pc-n (n = 0, 1, 2, 3, 4) basis sets, Dunning’s basis sets (aug)-cc-pVXZ where X = D, T, Q, 5, 6 and Pople’s basis sets up to 6-311++G(3df,2pd). The influence of SCF convergence level and density grid size on the root mean square of harmonic and anharmonic frequency deviations from experimental values was tested. The wavenumber of formaldehyde CH2 anharmonic asymmetric stretching mode is very sensitive to grid size for large basis sets; this effect is not observed for harmonic modes. BLYP-calculated anharmonic frequencies consistently underestimate observed wavenumbers. On the basis of formaldehyde anharmonic frequencies, we show that increasing the Pople basis set size does not always lead to improved agreement between anharmonic frequencies and experimental values.
Figure: Sensitivity of water B3LYP-calculated harmonic and anharmonic νs(OH) frequencies to the size of selected Pople and polarization-consistent basis sets. The results for pc-n basis sets were fitted with the two-parameter formula and the CBS(2,3,4) estimated
Electronic supplementary material
The online version of this article (doi:10.1007/s00894-010-0913-3) contains supplementary material, which is available to authorized users.
Harmonic; Anharmonic; Complete basis set limit; IR and Raman theoretical spectra
The maximum-entropy charge densities of six amino acids and peptides reveal systematic dependencies of the properties at bond critical points on bond lengths. MEM densities demonstrate that low-order multipoles (l_max = 1) and isotropic atomic displacement parameters for H atoms in the multipole model are insufficient for capturing all the features of charge densities in hydrogen bonds.
Charge densities have been determined by the Maximum Entropy Method (MEM) from the high-resolution, low-temperature (T ≃ 20 K) X-ray diffraction data of six different crystals of amino acids and peptides. A comparison of dynamic deformation densities of the MEM with static and dynamic deformation densities of multipole models shows that the MEM may lead to a better description of the electron density in hydrogen bonds in cases where the multipole model has been restricted to isotropic displacement parameters and low-order multipoles (l_max = 1) for the H atoms. Topological properties at bond critical points (BCPs) are found to depend systematically on the bond length, but with different functions for covalent C—C, C—N and C—O bonds, and for hydrogen bonds together with covalent C—H and N—H bonds. Similar dependencies are known for AIM properties derived from static multipole densities. The ratio of potential and kinetic energy densities |V(BCP)|/G(BCP) is successfully used for a classification of hydrogen bonds according to their distance d(H⋯O) between the H atom and the acceptor atom. The classification based on MEM densities coincides with the usual classification of hydrogen bonds as strong, intermediate and weak [Jeffrey (1997). An Introduction to Hydrogen Bonding. Oxford University Press]. MEM and procrystal densities lead to similar values of the densities at the BCPs of hydrogen bonds, but differences are shown to prevail, such that it is found that only the true charge density, represented by MEM densities, the multipole model or some other method can lead to the correct characterization of chemical bonding. Our results do not confirm suggestions in the literature that the promolecule density might be sufficient for a characterization of hydrogen bonds.
topological properties; hydrogen bonding; maximum entropy method; charge densities; peptides; amino acids
The standard settings of (3 + d)-dimensional superspace groups are determined for a series of modulated compounds, especially concentrating on d = 2 and 3. The coordinate transformation in superspace is discussed in view of its implications in physical space.
An algorithm is presented which determines the equivalence of two settings of a (3 + d)-dimensional superspace group (d = 1, 2, 3). The algorithm has been implemented as a web tool on , providing the transformation of any user-given superspace group to the standard setting of this superspace group in . It is shown how the standard setting of a superspace group can be directly obtained by an appropriate transformation of the external-space lattice vectors (the basic structure unit cell) and a transformation of the internal-space lattice vectors (new modulation wavevectors are linear combinations of old modulation wavevectors plus a three-dimensional reciprocal-lattice vector). The need for non-standard settings in some cases and the desirability of employing standard settings of superspace groups in other cases are illustrated by an analysis of the symmetries of a series of compounds, comparing published and standard settings and the transformations between them. A compilation is provided of standard settings of compounds with two- and three-dimensional modulations. The problem of settings of superspace groups is discussed for incommensurate composite crystals and for chiral superspace groups.
symmetry; superspace groups; two-dimensionally modulated crystals; three-dimensionally modulated crystals
A computer simulation was created for a modulated protein structure along with structure factors in a periodic supercell and in superspace for the purpose of developing and validating software modifications that will be used to solve and refine modulated protein crystals.
The toolbox for computational protein crystallography is full of easy-to-use applications for the routine solution and refinement of periodic diffraction data sets and protein structures. There is a gap in the available software when it comes to aperiodic crystallographic data. Current protein crystallography software cannot handle modulated data, and small-molecule software for aperiodic crystallography cannot work with protein structures. To adapt software for modulated protein data requires training data to test and debug the changed software. Thus, a comprehensive training data set consisting of atomic positions with associated modulation functions and the modulated structure factors packaged as both a three-dimensional supercell and as a modulated structure in (3+1)D superspace has been created. The (3+1)D data were imported into Jana2006; this is the first time that this has been performed for protein data.
protein; modulated structures; satellite reflections; q vectors; average structure; disorder; supercells; superspace
The structure of human carbonic anhydrase II has been solved with a sulfonamide inhibitor at 0.9 Å resolution. Structural variation and flexibility are seen on the surface of the protein and are consistent with the anisotropic ADPs obtained from refinement. Comparison with 13 other atomic resolution carbonic anhydrase structures shows that surface variation exists even in these highly ordered isomorphous crystals.
Carbonic anhydrase has been well studied structurally and functionally owing to its importance in respiration. A large number of X-ray crystallographic structures of carbonic anhydrase and its inhibitor complexes have been determined, some at atomic resolution. Structure determination of a sulfonamide-containing inhibitor complex has been carried out and the structure was refined at 0.9 Å resolution with anisotropic atomic displacement parameters to an R value of 0.141. The structure is similar to those of other carbonic anhydrase complexes, with the inhibitor providing a fourth nonprotein ligand to the active-site zinc. Comparison of this structure with 13 other atomic resolution (higher than 1.25 Å) isomorphous carbonic anhydrase structures provides a view of the structural similarity and variability in a series of crystal structures. At the center of the protein the structures superpose very well. The metal complexes superpose (with only two exceptions) with standard deviations of 0.01 Å in some zinc–protein and zinc–ligand bond lengths. In contrast, regions of structural variability are found on the protein surface, possibly owing to flexibility and disorder in the individual structures, differences in the chemical and crystalline environments or the different approaches used by different investigators to model weak or complicated electron-density maps. These findings suggest that care must be taken in interpreting structural details on protein surfaces on the basis of individual X-ray structures, even if atomic resolution data are available.
carbonic anhydrase; structure comparison; metalloproteins; atomic resolution
The general principles behind the macromolecular crystal structure refinement program REFMAC5 are described.
This paper describes various components of the macromolecular crystallographic refinement program REFMAC5, which is distributed as part of the CCP4 suite. REFMAC5 utilizes different likelihood functions depending on the diffraction data employed (amplitudes or intensities), the presence of twinning and the availability of SAD/SIRAS experimental diffraction data. To ensure chemical and structural integrity of the refined model, REFMAC5 offers several classes of restraints and choices of model parameterization. Reliable models at resolutions at least as low as 4 Å can be achieved thanks to low-resolution refinement tools such as secondary-structure restraints, restraints to known homologous structures, automatic global and local NCS restraints, ‘jelly-body’ restraints and the use of novel long-range restraints on atomic displacement parameters (ADPs) based on the Kullback–Leibler divergence. REFMAC5 additionally offers TLS parameterization and, when high-resolution data are available, fast refinement of anisotropic ADPs. Refinement in the presence of twinning is performed in a fully automated fashion. REFMAC5 is a flexible and highly optimized refinement package that is ideally suited for refinement across the entire resolution spectrum encountered in macromolecular crystallography.
The functional properties of materials can arise from local structural features that are not well determined or described by crystallographic methods based on long-range average structural models. The room temperature (RT) structure of the Bi perovskite Bi2Mn4/3Ni2/3O6 has previously been modeled as a locally polar structure where polarization is suppressed by a long-range incommensurate antiferroelectric modulation. In this study we investigate the short-range local structure of Bi2Mn4/3Ni2/3O6, determined through reverse Monte Carlo (RMC) modeling of neutron total scattering data, and compare the results with the long-range incommensurate structure description. While the incommensurate structure has equivalent B site environments for Mn and Ni, the local structure displays a significantly Jahn–Teller distorted environment for Mn3+. The local structure displays the rock-salt-type Mn/Ni ordering of the related Bi2MnNiO6 high pressure phase, as opposed to Mn/Ni clustering observed in the long-range average incommensurate model. RMC modeling reveals short-range ferroelectric correlations between Bi3+ cations, giving rise to polar regions that are quantified for the first time as existing within a distance of approximately 12 Å. These local correlations persist in the commensurate high temperature (HT) phase, where the long-range average structure is nonpolar. The local structure thus provides information about cation ordering and B site structural flexibility that may stabilize Bi3+ on the A site of the perovskite structure and reveals the extent of the local polar regions created by this cation.
Kinesin motor proteins drive intracellular transport by coupling ATP hydrolysis to conformational changes that mediate directed movement along microtubules. Characterizing these distinct conformations and their interconversion mechanism is essential to determining an atomic-level model of kinesin action. Here we report a comprehensive principal component analysis of 114 experimental structures along with the results of conventional and accelerated molecular dynamics simulations that together map the structural dynamics of the kinesin motor domain. All experimental structures were found to reside in one of three distinct conformational clusters (ATP-like, ADP-like and Eg5 inhibitor-bound). These groups differ in the orientation of key functional elements, most notably the microtubule binding α4–α5, loop8 subdomain and α2b-β4-β6-β7 motor domain tip. Group membership was found not to correlate with the nature of the bound nucleotide in a given structure. However, groupings were coincident with distinct neck-linker orientations. Accelerated molecular dynamics simulations of ATP, ADP and nucleotide free Eg5 indicate that all three nucleotide states could sample the major crystallographically observed conformations. Differences in the dynamic coupling of distal sites were also evident. In multiple ATP bound simulations, the neck-linker, loop8 and the α4–α5 subdomain display correlated motions that are absent in ADP bound simulations. Further dissection of these couplings provides evidence for a network of dynamic communication between the active site, microtubule-binding interface and neck-linker via loop7 and loop13. Additional simulations indicate that the mutations G325A and G326A in loop13 reduce the flexibility of these regions and disrupt their couplings. 
Our combined results indicate that the reported ATP and ADP-like conformations of kinesin are intrinsically accessible regardless of nucleotide state and support a model where neck-linker docking leads to a tighter coupling of the microtubule and nucleotide binding regions. Furthermore, simulations highlight sites critical for large-scale conformational changes and the allosteric coupling between distal functional sites.
Kinesin motor proteins transport cargo along microtubule tracks to support essential cellular functions including cell growth, axonal signaling and the separation of chromosomes during cell division. All kinesins contain one or more conserved motor domains that modulate binding and movement along microtubules via cycles of ATP hydrolysis. Important conformational transitions occurring during this cycle have been characterized with extensive crystallographic studies. However, the link between the observed conformations and the mechanisms involved in conformational change and microtubule interaction modulation remains unclear. Here we describe a comprehensive principal component analysis of available motor domain crystallographic structures supplemented with extensive unbiased conventional and accelerated molecular dynamics simulations that together characterize the response of kinesin motor domains to ATP binding and hydrolysis. Our studies reveal atomic details of conformational transitions, as well as novel nucleotide-dependent dynamical couplings, of distal regions and residues potentially important for the allosteric link between nucleotide and microtubule binding sites.
Data processing of an incommensurately modulated profilin–actin crystal is described.
Recent challenges in biological X-ray crystallography include the processing of modulated diffraction data. A modulated crystal has lost its three-dimensional translational symmetry but retains long-range order that can be restored by refining a periodic modulation function. The presence of a crystal modulation is indicated by an X-ray diffraction pattern with periodic main reflections flanked by off-lattice satellite reflections. While the periodic main reflections can easily be indexed using three reciprocal-lattice vectors a*, b*, c*, the satellite reflections have a non-integral relationship to the main lattice and require a q vector for indexing. While methods for the processing of diffraction intensities from modulated small-molecule crystals are well developed, they have not been applied in protein crystallography. A recipe is presented here for processing incommensurately modulated data from a macromolecular crystal using the Eval program suite. The diffraction data are from an incommensurately modulated crystal of profilin–actin with single-order satellites parallel to b*. The steps taken in this report can be used as a guide for protein crystallographers when encountering crystal modulations. To our knowledge, this is the first report of the processing of data from an incommensurately modulated macromolecular crystal.
modulation; incommensurate; Eval15; profilin–actin
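The four-index description of main and satellite reflections outlined above can be sketched as follows. A reflection (h k l m) sits at h a* + k b* + l c* + m q, and an observed peak is indexed by searching for the satellite order m that leaves integer h, k, l. The sketch is not the Eval procedure; the function names, the tolerance and the q vector (0, 0.27, 0), chosen parallel to b*, are hypothetical.

```python
def reflection_position(h, k, l, m, q):
    """Reciprocal-space coordinates (in units of a*, b*, c*) of reflection
    (h k l m) for a modulation vector q given in the same units."""
    return (h + m * q[0], k + m * q[1], l + m * q[2])

def index_peak(pos, q, max_order=1, tol=0.02):
    """Assign (h, k, l, m) to an observed peak position, or return None.

    Satellite orders are tried as 0, -1, +1, ... up to max_order; the first
    order that leaves all three residual coordinates within tol of integers
    is accepted.
    """
    for m in sorted(range(-max_order, max_order + 1), key=abs):
        residual = [p - m * qi for p, qi in zip(pos, q)]
        hkl = [round(r) for r in residual]
        if all(abs(r - i) <= tol for r, i in zip(residual, hkl)):
            return (hkl[0], hkl[1], hkl[2], m)
    return None

Q = (0.0, 0.27, 0.0)  # hypothetical q vector parallel to b*
main = index_peak((1.0, 2.0, 3.0), Q)
plus = index_peak(reflection_position(1, 2, 3, 1, Q), Q)
minus = index_peak((1.0, 1.73, 3.0), Q)
```

Setting `max_order=1` mirrors the single-order satellites described in the abstract; higher orders would need a tighter tolerance to avoid ambiguous assignments.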
When refining the fit of component atomic structures into electron microscopic reconstructions, use of a resolution-dependent atomic density function makes it possible to jointly optimize the atomic model and imaging parameters of the microscope. Atomic density is calculated by one-dimensional Fourier transform of atomic form factors convoluted with a microscope envelope correction and a low-pass filter, allowing refinement of imaging parameters such as resolution, by optimizing the agreement of calculated and experimental maps. A similar approach allows refinement of atomic displacement parameters, providing indications of molecular flexibility even at low resolution. A modest improvement in atomic coordinates is possible following optimization of these additional parameters. Methods have been implemented in a Python program that can be used in stand-alone mode for rigid-group refinement, or embedded in other optimizers for flexible refinement with stereochemical restraints. The approach is demonstrated with refinements of virus and chaperonin structures at resolutions of 9 through 4.5 Å, representing regimes where rigid-group and fully flexible parameterizations are appropriate. Through comparisons to known crystal structures, flexible fitting by RSRef is shown to be an improvement relative to other methods and to generate models with all-atom rms accuracies of 1.5–2.5 Å at resolutions of 4.5–6 Å.
Fitting; Optimization; Structure; Resolution; Restraint; B-factor; Flexibility
A simple rule of thumb based on resolution is not adequate to identify the best treatment of atomic displacements in macromolecular structural models. The choice to use isotropic B factors, anisotropic B factors, TLS models or some combination of the three should be validated through statistical analysis of the model refinement.
In choosing and refining any crystallographic structural model, there is tension between the desire to extract the most detailed information possible and the necessity to describe no more than what is justified by the observed data. A more complex model is not necessarily a better model. Thus, it is important to validate the choice of parameters as well as validating their refined values. One recurring task is to choose the best model for describing the displacement of each atom about its mean position. At atomic resolution one has the option of devoting six model parameters (a ‘thermal ellipsoid’) to describe the displacement of each atom. At medium resolution one typically devotes at most one model parameter per atom to describe the same thing (a ‘B factor’). At very low resolution one cannot justify the use of even one parameter per atom. Furthermore, this aspect of the structure may be described better by an explicit model of bulk displacements, the most common of which is the translation/libration/screw (TLS) formalism, rather than by assigning some number of parameters to each atom individually. One can sidestep this choice between atomic displacement parameters and TLS descriptions by including both treatments in the same model, but this is not always statistically justifiable. The choice of which treatment is best for a particular structure refinement at a particular resolution can be guided by general considerations of the ratio of model parameters to the number of observations and by specific statistics such as the Hamilton R-factor ratio test.
atomic displacements; B factors; TLS models; model parameters
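The parameter-counting consideration mentioned in the abstract above can be made concrete with a short sketch. It assumes the usual counts of three positional parameters per atom, one extra for an isotropic B factor, six extra for an anisotropic ellipsoid, and 20 parameters per TLS group; the function names and the example numbers are hypothetical. It covers only the observation-to-parameter ratio, not the Hamilton R-factor ratio test itself.

```python
# Parameters per atom: x, y, z plus the displacement description.
PER_ATOM = {"none": 3, "isotropic": 4, "anisotropic": 9}
TLS_PARAMS_PER_GROUP = 20  # 6 (T) + 6 (L) + 8 (S)


def n_parameters(n_atoms, model, n_tls_groups=0):
    """Total number of refined parameters for a given displacement model."""
    return n_atoms * PER_ATOM[model] + TLS_PARAMS_PER_GROUP * n_tls_groups


def obs_to_param_ratio(n_reflections, n_atoms, model, n_tls_groups=0):
    """Ratio of observations (unique reflections) to model parameters."""
    return n_reflections / n_parameters(n_atoms, model, n_tls_groups)


# Hypothetical example: 2000 atoms, 30000 unique reflections.
for model, groups in [("isotropic", 0), ("isotropic", 4), ("anisotropic", 0)]:
    ratio = obs_to_param_ratio(30000, 2000, model, groups)
    print(f"{model:12s} TLS groups={groups}: {ratio:.2f} observations per parameter")
```

A low ratio only signals that a model may be over-parameterized; as the abstract argues, the choice should still be validated statistically during refinement.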
The mechanism of intra-protein communication and allosteric coupling is key to understanding the structure-property relationship of protein function. For subtilisin Carlsberg, the Ca2+-binding loop is distal to substrate-binding and active sites, yet the serine protease function depends on Ca2+ binding. The atomic molecular dynamics (MD) simulations of apo and Ca2+-bound subtilisin show similar structures and there is no direct evidence that subtilisin has alternative conformations. To model the intra-protein communication due to Ca2+ binding, we transform the sequential segments of an atomic MD trajectory into separate elastic network models to represent anharmonicity and nonlinearity effectively as the temporal and spatial variation of the mechanical coupling network. In analogy to the spectrogram of sound waves, this transformation is termed the “fluctuogram” of protein dynamics. We illustrate that the Ca2+-bound and apo states of subtilisin have different fluctuograms and that intra-protein communication proceeds intermittently both in space and in time. We found that residues with large mechanical coupling variation due to Ca2+ binding correlate with the reported mutation sites selected by directed evolution for improving the stability of subtilisin and its activity in a non-aqueous environment. Furthermore, we utilize the fluctuograms calculated from MD to capture the highly correlated residues in a multiple sequence alignment. We show that in addition to the magnitude, the variance of coupling strength is also an indicative property for the sequence correlation observed in a statistical coupling analysis. The results of this work illustrate that the mechanical coupling networks calculated from atomic details can be used to correlate with functionally important mutation sites and co-evolution.
A hallmark of protein molecules is their machine-like behaviors while carrying out biological functions. At the molecular level, molecular signals such as binding a metal ion at an action site can cause long-range effects and alter protein function. Such phenomena are often referred to as intra-protein communication or allosteric coupling. Elucidating the underlying mechanisms could lead to novel discovery of molecular modulators to regulate protein function in a more specific and effective manner. A long-standing puzzle is the role of anharmonicity and nonlinearity in protein dynamics. To incorporate these characteristics in modeling intra-protein communication, we devise a “fluctuogram” analysis to record the choreography of allosteric coupling in an atomic molecular dynamics simulation. We show that fluctuogram analysis can bridge the results of physics-based simulation and sequence alignment in bioinformatics by capturing the residues that exhibit high correlation in a multiple sequence alignment. We also show that the fluctuograms calculated from atomic details have the potential to be applied as a tool to select mutation sites for modulating protein function.
ATP-dependent nucleosome-remodeling enzymes and covalent modifiers of chromatin set the functional state of chromatin. However, how these enzymatic activities are coordinated in the nucleus is largely unknown. We found that the evolutionarily conserved nucleosome-remodeling ATPase ISWI and the poly-ADP-ribose polymerase PARP genetically interact. We present evidence showing that ISWI is a target of poly-ADP-ribosylation. Poly-ADP-ribosylation counteracts ISWI function in vitro and in vivo. Our work suggests that ISWI is a physiological target of PARP and that poly-ADP-ribosylation can be a new, important post-translational modification regulating the activity of ATP-dependent nucleosome remodelers.
The ISWI protein is a highly conserved nucleosome remodeler that plays essential roles in regulating chromosome structure, DNA replication, and gene expression. The variety of functions associated with ISWI activity is probably connected to the ability of other cellular factors to regulate its ATP-dependent nucleosome-remodeling activity. We identified one factor, the poly-ADP-ribose polymerase PARP, that can counteract ISWI function. PARP is an abundant nuclear protein that catalyzes the transfer of ADP-ribose units to specific proteins involved in DNA repair, transcription, and chromatin structure. Our work suggests that the activity of an ATP-dependent remodeler can be modulated by poly-ADP-ribosylation in order to regulate chromatin function in vivo.
Enzymes that mediate nucleosome remodeling and poly-ADP-ribosylation play essential roles in the eukaryotic cell. A new study suggests a mechanism to explain how two nuclear enzymes can coordinate their activities to regulate chromatin structure and function.
KOSMOS is the first online morph server to be able to address the structural dynamics of DNA/RNA, proteins and even their complexes, such as ribosomes. The key functions of KOSMOS are the harmonic and anharmonic analyses of macromolecules. In the harmonic analysis, normal mode analysis (NMA) based on an elastic network model (ENM) is performed, yielding vibrational modes and B-factor calculations, which provide insight into the potential biological functions of macromolecules based on their structural features. Anharmonic analysis involving elastic network interpolation (ENI) is used to generate plausible transition pathways between two given conformations by optimizing a topology-oriented cost function that guarantees a smooth transition without steric clashes. The quality of the computed pathways is evaluated based on their various facets, including topology, energy cost and compatibility with the NMA results. There are also two unique features of KOSMOS that distinguish it from other morph servers: (i) the versatility in the coarse-graining methods and (ii) the various connection rules in the ENM. The models enable us to analyze macromolecular dynamics with the maximum degrees of freedom by combining a variety of ENMs from full-atom to coarse-grained, backbone and hybrid models with one connection rule, such as distance-cutoff, number-cutoff or chemical-cutoff. KOSMOS is available at http://bioengineering.skku.ac.kr/kosmos.
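As an illustration of the distance-cutoff connection rule mentioned above, the following minimal sketch builds the connectivity (Kirchhoff) matrix of a simple elastic network from a list of node coordinates. The function name, the 5 Å cutoff and the toy coordinates are assumptions made for illustration; this is not the KOSMOS code, and the normal-mode and B-factor steps, which need an eigen-decomposition of such a matrix, are not shown.

```python
import math


def kirchhoff_matrix(coords, cutoff):
    """Connectivity matrix for a distance-cutoff elastic network.

    Off-diagonal entry (i, j) is -1 if nodes i and j lie within `cutoff`
    (same length unit as the coordinates), else 0. Each diagonal entry is
    the number of contacts of that node, so every row sums to zero.
    """
    n = len(coords)
    gamma = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1
                gamma[i][i] += 1
                gamma[j][j] += 1
    return gamma


# Toy example: three nodes on a line, 4 Angstrom apart, with a 5 Angstrom cutoff.
toy_coords = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0)]
toy_gamma = kirchhoff_matrix(toy_coords, cutoff=5.0)
```

Swapping the distance test for a number-cutoff or chemistry-based rule would change only the condition inside the inner loop, which is why the choice of connection rule can be kept independent of the coarse-graining level.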
Purpose: Recent biochemical and physiological data point to the existence of one or more Ca++-mediated feedback mechanisms modulating gain at stages early in the vertebrate phototransduction cascade, i.e., prior to activation of cGMP-phosphodiesterase (PDE). The present study is a computational analysis that combines quantitative optimization to key data with a qualitative evaluation of each candidate model's ability to capture “signature” features of representative rod responses obtained under a broad range of dark-(DA) and light-adapted (LA) conditions. The primary data motivating the analyses were the two-flash data of Murnick & Lamb. These data exhibited strikingly nonlinear behavior: the period of complete photocurrent saturation (Tsat) in response to a Test flash was reduced substantially when preceded by a less-intense saturating Pre-flash. Depending on the delay between Pre- and Test flashes, the change in Tsat (ΔTsat) could exceed the magnitude of the delay, and could be reduced by as much as ∼50%, corresponding to a large reduction in gain by a factor of 10-15. The overall goal of the study was to evaluate what model structure(s) were commensurate with both the Murnick & Lamb data and the salient qualitative features of rod responses obtained under a broad range of DA and LA conditions.
Methods: Three candidate models were quantitatively optimized to the Murnick & Lamb saturated toad rod flash responses and, simultaneously, to a set of sub-saturated flash responses. Using the parameters from these optimizations, each candidate model was then used to simulate a suite of DA and LA responses.
Results: The analyses showed that: (1) Within the context of a model with Ca++ feedback onto rhodopsin (R*) lifetime (τR), the salient features of the Murnick & Lamb data can only be accounted for if the rate-limiting step is not the Ca++-sensitive step in the early cascade reactions, i.e., if PDE* lifetime, and not τR, is rate-limiting. (2) With τR rate-limiting, the model cannot account for ΔTsat exceeding the delay. (3) The Ca++-dependent reduction in τR required to effect the large gain is incommensurate with the empirical dynamics of dim-flash responses. (4) Regardless of which reaction is rate-limiting, a model using solely modulation of R* lifetime puts strong constraints on the domain of biochemical parameters commensurate with the large gain changes Murnick & Lamb observed. (5) The analyses show that, in principle, the Murnick & Lamb data can be accounted for when τR is both rate-limiting and Ca++-sensitive if, in addition to the feedback onto τR, there is an earlier, stronger Ca++ feedback that does not affect R* inactivation kinetics (e.g., gain at R* activation or transducin (T*) activation). (6) Ca++-modulation of R* activation or T* activation as the sole early gain mechanism can also account for the Murnick & Lamb data, but fails to predict the data of Matthews, and can thus be rejected along with any model of comparable form.
Conclusions: The results imply that the Murnick & Lamb data per se are insufficient to rule out rate-limitation by (Ca++-sensitive) R* lifetime; evaluation of a broader set of responses is required. The analyses illustrate the importance of evaluating candidate models in relation to sets of data obtained under the broadest possible range of DA and LA conditions. The analyses are aided by the presence of reproducible signature, qualitative features in the data since these tend to constrain the domain of acceptable model structures and/or parameter sets. Some implications for vertebrate photoreceptor light-adaptation are discussed.
Automatic modeling methods using cryo-electron microscopy (cryoEM) density maps as constraints are promising approaches to building atomic models of individual proteins or protein domains. However, their application to large macromolecular assemblies has not been possible, largely due to computational limitations inherent to such unsupervised methods. Here we describe a new method, EM-IMO, for building, modifying and refining local structures of protein models using cryoEM maps as a constraint. As a supervised refinement method, EM-IMO allows users to specify parameters derived from inspections, so as to guide and, as a consequence, significantly speed up the refinement. An EM-IMO-based refinement protocol is first benchmarked on a data set of 50 homology models using simulated density maps. A multi-scale refinement strategy that combines EM-IMO-based and molecular dynamics (MD)-based refinement is then applied to build backbone models for the seven conformers of the five capsid proteins in our near-atomic resolution cryoEM map of the grass carp reovirus (GCRV) virion, a member of the aquareovirus genus of the Reoviridae family. The refined models allow us to reconstruct a backbone model of the entire GCRV capsid and provide valuable functional insights that are described in the accompanying publication. Our study demonstrates that the integrated use of homology modeling and a multi-scale refinement protocol that combines supervised and automated structure refinement offers a practical strategy for building atomic models based on medium- to high-resolution cryoEM density maps.
cryo-electron microscopy; density fitting; homology modeling; structure refinement; protein structure prediction
Phosphomevalonate kinase (PMK) catalyzes an essential step in the mevalonate pathway, which is the only pathway for synthesis of isoprenoids and steroids in humans. PMK catalyzes transfer of the γ-phosphate of ATP to mevalonate 5-phosphate (M5P) to form mevalonate 5-diphosphate. Bringing these phosphate groups in proximity to react is especially challenging, given the high negative charge density on the four phosphate groups in the active site. As such, conformational and dynamics changes needed to form the Michaelis complex are of mechanistic interest. Herein, we report the characterization of substrate-induced changes (Mg-ADP, M5P, and the ternary complex) in PMK, using NMR-based dynamics and chemical shift perturbation measurements. Mg-ADP and M5P Kd's were 6-60 μM in all complexes, consistent with there being little binding synergy. Binding of M5P causes the PMK structure to compress (τc = 13.5 nsec), while subsequent binding of Mg-ADP opens the structure up (τc = 17.6 nsec). The overall complex seems to stay very rigid on the psec-nsec timescale with an average NMR order parameter of S2∼0.88. Data are consistent with addition of M5P causing movement around a hinge region to permit domain closure, which would bring the M5P domain close to ATP to permit catalysis. Dynamics data identify potential hinge residues as H55 and R93, based on their low order parameters and their location in extended regions that connect the M5P and ATP domains in the PMK homology model. Likewise, D163 may be a hinge residue for the lid region that is homologous to the adenylate kinase lid, covering the “Walker-A” catalytic loop. Binding of ATP or ADP appears to cause similar conformational changes, but these observations do not indicate an obvious role for γ-phosphate binding interactions.
Indeed, the role of γ-phosphate interactions may be more subtle than suggested by ATP/ADP comparisons, since the conservative O to NH substitution in the β-γ bridge of ATP causes a dramatic decrease in affinity and induces few chemical shift perturbations. In terms of positioning of catalytic residues, binding of M5P induces a rigidification of Gly21 (adjacent to the catalytically important Lys22), although exchange broadening in the ternary complex suggests some motion on a slower timescale does still occur. Finally, the first 9 residues of the N-terminus are highly disordered, suggesting they may be part of a cleavable signal or regulatory peptide sequence.
Phosphomevalonate kinase; chemical shift perturbation; relaxation dynamics; mevalonate; NMR; Modelfree
Poly(ADP-ribose) (pADPr) is a polymer assembled from the enzymatic polymerization of the ADP-ribosyl moiety of NAD by poly(ADP-ribose) polymerases (PARPs). The dynamic turnover of pADPr within the cell is essential for a number of cellular processes including progression through the cell cycle, DNA repair and the maintenance of genomic integrity, and apoptosis. In spite of the considerable advances in the knowledge of the physiological conditions modulated by poly(ADP-ribosyl)ation reactions, and notwithstanding the fact that pADPr can play a role of mediator in a wide spectrum of biological processes, few pADPr binding proteins have been identified so far. In this study, refined in silico prediction of pADPr binding proteins and large-scale mass spectrometry-based proteome analysis of pADPr binding proteins were used to establish a comprehensive repertoire of pADPr-associated proteins. Visualization and modeling of these pADPr-associated proteins in networks not only reflect the widespread involvement of poly(ADP-ribosyl)ation in several pathways but also identify protein targets that could shed new light on the regulatory functions of pADPr in normal physiological conditions as well as after exposure to genotoxic stimuli.
A method to accelerate the computation of structure factors from an electron density described by anisotropic and aspherical atomic form factors via fast Fourier transformation is described for the first time.
Recent advances in computational chemistry have produced force fields based on a polarizable atomic multipole description of biomolecular electrostatics. In this work, the Atomic Multipole Optimized Energetics for Biomolecular Applications (AMOEBA) force field is applied to restrained refinement of molecular models against X-ray diffraction data from peptide crystals. A new formalism is also developed to compute anisotropic and aspherical structure factors using fast Fourier transformation (FFT) of Cartesian Gaussian multipoles. Relative to direct summation, the FFT approach can give a speedup of more than an order of magnitude for aspherical refinement of ultrahigh-resolution data sets. Use of a sublattice formalism makes the method highly parallelizable. Application of the Cartesian Gaussian multipole scattering model to a series of four peptide crystals using multipole coefficients from the AMOEBA force field demonstrates that AMOEBA systematically underestimates electron density at bond centers. For the trigonal and tetrahedral bonding geometries common in organic chemistry, an atomic multipole expansion through hexadecapole order is required to explain bond electron density. Alternatively, the addition of interatomic scattering (IAS) sites to the AMOEBA-based density captured bonding effects with fewer parameters. For a series of four peptide crystals, the AMOEBA–IAS model lowered Rfree by 20–40% relative to the original spherically symmetric scattering model.
scattering factors; aspherical; anisotropic; force fields; multipole; polarization; AMOEBA; bond density; direct summation; FFT; SGFFT; Ewald; PME
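For orientation, the direct-summation baseline against which the FFT approach above is compared can be sketched as follows. This is a deliberately simplified version: it assumes an orthogonal unit cell, a constant (resolution-independent) scattering factor per atom and an isotropic B factor, whereas the work described above treats anisotropic and aspherical atoms. The function name and the toy atoms are hypothetical.

```python
import cmath
import math


def structure_factor(hkl, atoms, cell_lengths):
    """Direct-summation structure factor F(hkl) for an orthogonal cell.

    atoms: iterable of (f, (x, y, z), b_iso) with fractional coordinates,
           a constant scattering factor f and an isotropic B factor (A^2).
    cell_lengths: (a, b, c) in Angstrom.
    """
    h, k, l = hkl
    a, b, c = cell_lengths
    s_sq = (h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2  # 1/d^2 for an orthogonal cell
    total = 0j
    for f, (x, y, z), b_iso in atoms:
        phase = 2.0 * math.pi * (h * x + k * y + l * z)
        total += f * math.exp(-b_iso * s_sq / 4.0) * cmath.exp(1j * phase)
    return total


# Toy example: two carbon-like atoms related by inversion through the origin.
toy_atoms = [
    (6.0, (0.1, 0.2, 0.3), 10.0),
    (6.0, (-0.1, -0.2, -0.3), 10.0),
]
toy_F = structure_factor((1, 2, 3), toy_atoms, cell_lengths=(10.0, 12.0, 15.0))
```

The cost of this loop grows with the number of atoms times the number of reflections, which is why evaluating the density on a grid and transforming it by FFT becomes attractive for large, high-resolution data sets.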