A description is given of new tools to facilitate model building and refinement into electron cryo-microscopy reconstructions.
The recent rapid development of single-particle electron cryo-microscopy (cryo-EM) now allows structures to be solved by this method at resolutions close to 3 Å. Here, a number of tools to facilitate the interpretation of EM reconstructions with stereochemically reasonable all-atom models are described. The BALBES database has been repurposed as a tool for identifying protein folds from density maps. Modifications to Coot, including new Jiggle Fit and morphing tools and improved handling of nucleic acids, enhance its functionality for interpreting EM maps. REFMAC has been modified for optimal fitting of atomic models into EM maps. As external structural information can enhance the reliability of the derived atomic models, stabilize refinement and reduce overfitting, ProSMART has been extended to generate interatomic distance restraints from nucleic acid reference structures, and a new tool, LIBG, has been developed to generate nucleic acid base-pair and parallel-plane restraints. Furthermore, restraint generation has been integrated with visualization and editing in Coot, and these restraints have been applied to both real-space refinement in Coot and reciprocal-space refinement in REFMAC.
model building; refinement; electron cryo-microscopy reconstructions; LIBG
The general principles behind the macromolecular crystal structure refinement program REFMAC5 are described.
This paper describes various components of the macromolecular crystallographic refinement program REFMAC5, which is distributed as part of the CCP4 suite. REFMAC5 utilizes different likelihood functions depending on the diffraction data employed (amplitudes or intensities), the presence of twinning and the availability of SAD/SIRAS experimental diffraction data. To ensure chemical and structural integrity of the refined model, REFMAC5 offers several classes of restraints and choices of model parameterization. Reliable models at resolutions at least as low as 4 Å can be achieved thanks to low-resolution refinement tools such as secondary-structure restraints, restraints to known homologous structures, automatic global and local NCS restraints, ‘jelly-body’ restraints and the use of novel long-range restraints on atomic displacement parameters (ADPs) based on the Kullback–Leibler divergence. REFMAC5 additionally offers TLS parameterization and, when high-resolution data are available, fast refinement of anisotropic ADPs. Refinement in the presence of twinning is performed in a fully automated fashion. REFMAC5 is a flexible and highly optimized refinement package that is ideally suited for refinement across the entire resolution spectrum encountered in macromolecular crystallography.
The automated building of a protein model into an electron density map remains a challenging problem. In the ARP/wARP approach, model building is facilitated by initially interpreting a density map with free atoms of unknown chemical identity; all structural information for such chemically unassigned atoms is discarded. Here, this is remedied by applying restraints between free atoms, and between free atoms and a partial protein model. These are based on geometric considerations of protein structure and tentative (conditional) assignments for the free atoms. Restraints are applied in the REFMAC5 refinement program and are generated on an ad hoc basis, allowing them to fluctuate from step to step. A large set of experimentally phased and molecular replacement structures showcases individual structures where automated building is improved drastically by the conditional restraints. The concept and implementation we present can also find application in restraining geometries, such as hydrogen bonds, in low-resolution refinement.
The Procrustes Structural Matching Alignment and Restraints Tool (ProSMART) has been developed to allow local comparative structural analyses independent of the global conformations and sequence homology of the compared macromolecules. This allows quick and intuitive visualization of the conservation of backbone and side-chain conformations, providing complementary information to existing methods.
The identification and exploration of (dis)similarities between macromolecular structures can help to gain biological insight, for instance when visualizing or quantifying the response of a protein to ligand binding. Obtaining a residue alignment between compared structures is often a prerequisite for such comparative analysis. If the conformational change of the protein is dramatic, conventional alignment methods may struggle to provide an intuitive solution for straightforward analysis. To make such analyses more accessible, the Procrustes Structural Matching Alignment and Restraints Tool (ProSMART) has been developed, which achieves a conformation-independent structural alignment, as well as providing such additional functionalities as the generation of restraints for use in the refinement of macromolecular models. Sensible comparison of protein (or DNA/RNA) structures in the presence of conformational changes is achieved by enforcing neither chain nor domain rigidity. The visualization of results is facilitated by popular molecular-graphics software such as CCP4mg and PyMOL, providing intuitive feedback regarding structural conservation and subtle dissimilarities between close homologues that can otherwise be hard to identify. Automatically generated colour schemes corresponding to various residue-based scores are provided, which allow the assessment of the conservation of backbone and side-chain conformations relative to the local coordinate frame. Structural comparison tools such as ProSMART can help to break the complexity that accompanies the constantly growing pool of structural data into a more readily accessible form, potentially offering biological insight or influencing subsequent experiments.
ProSMART; Procrustes; structural comparison; alignment; external restraints; refinement
The CCP4 template-restraint library defines restraints for biopolymers, their modifications and ligands that are used in macromolecular structure refinement. JLigand is a graphical editor for generating descriptions of new ligands and covalent linkages.
Biological macromolecules are polymers and therefore the restraints for macromolecular refinement can be subdivided into two sets: restraints that are applied to atoms that all belong to the same monomer and restraints that are associated with the covalent bonds between monomers. The CCP4 template-restraint library contains three types of data entries defining template restraints: descriptions of monomers and their modifications, both used for intramonomer restraints, and descriptions of links for intermonomer restraints. The library provides generic descriptions of modifications and links for protein, DNA and RNA chains, and for some post-translational modifications including glycosylation. Structure-specific template restraints can be defined in a user’s additional restraint library. Here, JLigand, a new CCP4 graphical interface to LibCheck and REFMAC that has been developed to manage the user’s library and generate new monomer entries is described, as well as new entries for links and associated modifications.
macromolecular refinement; restraint library; molecular graphics
The structural model of the unliganded and fully glycosylated simian immunodeficiency virus gp120 core determined to 4.0 Å resolution was substantially improved using a recently developed normal-mode-based anisotropic B-factor refinement method.
The envelope protein gp120/gp41 of simian and human immunodeficiency viruses plays a critical role in viral entry into host cells. However, the extraordinarily high structural flexibility and heavy glycosylation of the protein have presented enormous difficulties in the pursuit of high-resolution structural investigation of some of its conformational states. An unliganded and fully glycosylated gp120 core structure was recently determined to 4.0 Å resolution. The rather low data-to-parameter ratio limited refinement efforts in the original structure determination. In this work, refinement of this gp120 core structure was carried out using a normal-mode-based refinement method that has been shown in previous studies to be effective in improving models of a supramolecular complex at 3.42 Å resolution and of a membrane protein at 3.2 Å resolution. By using only the first four nonzero lowest-frequency normal modes to construct the anisotropic thermal parameters, combined with manual adjustments and standard positional refinement using REFMAC5, the structural model of the gp120 core was significantly improved in many aspects, including substantial decreases in R factors, better fitting of several flexible regions in electron-density maps, the addition of five new sugar rings at four glycan chains and an excellent correlation of the B-factor distribution with known structural flexibility. These results further underscore the effectiveness of this normal-mode-based method in improving models of protein and nonprotein components in low-resolution X-ray structures.
conformational flexibility; normal-mode analysis; anisotropic thermal parameters; glycoproteins
Paramagnetic NMR data (pseudocontact shifts and self-orientation residual dipolar couplings) and diamagnetic residual dipolar couplings can now be used in the program REFMAC5 from CCP4 as structural restraints together with X-ray crystallographic data. These NMR restraints can reveal differences between solid state and solution conformations of molecules or, in their absence, can be used together with X-ray crystallographic data for structural refinement.
The program REFMAC5 from CCP4 was modified to allow the simultaneous use of X-ray crystallographic data and paramagnetic NMR data (pseudocontact shifts and self-orientation residual dipolar couplings) and/or diamagnetic residual dipolar couplings. Incorporation of these long-range NMR restraints in REFMAC5 can reveal differences between solid-state and solution conformations of molecules or, in their absence, can be used together with X-ray crystallographic data for structural refinement. Since NMR and X-ray data are complementary, when a single structure is consistent with both sets of data and still maintains reasonably ‘ideal’ geometries, the reliability of the derived atomic model is expected to increase. The program was tested on five different proteins: the catalytic domain of matrix metalloproteinase 1, GB3, ubiquitin, free calmodulin and calmodulin complexed with a peptide. In some cases the joint refinement produced a single model consistent with both sets of observations, while in other cases it indicated, outside the experimental uncertainty, the presence of different protein conformations in solution and in the solid state.
structure refinement; PCS; RDC; X-ray; REFMAC
The side-chain torsion angles of isoleucines in X-ray protein structures are a function of resolution, secondary structure and refinement software. Detailing the standard torsion angles used in refinement software can improve protein structure refinement.
A study of isoleucines in protein structures solved using X-ray crystallography revealed a series of systematic trends for the two side-chain torsion angles χ1 and χ2 dependent on the resolution, secondary structure and refinement software used. The average torsion angles for the nine rotamers were similar in high-resolution structures solved using either the REFMAC, CNS or PHENIX software. However, at low resolution these programs often refine towards somewhat different χ1 and χ2 values. Small systematic differences can be observed between refinement software that uses molecular dynamics-type energy terms (for example CNS) and software that does not use these terms (for example REFMAC). Detailing the standard torsion angles used in refinement software can improve the refinement of protein structures. The target values in the molecular dynamics-type energy functions can also be improved.
The PDB_REDO pipeline aims to improve macromolecular structures by optimizing the crystallographic refinement parameters and performing partial model building. Here, algorithms are presented that allowed a web-server implementation of PDB_REDO, and the first user results are discussed.
The refinement and validation of a crystallographic structure model is the last step before the coordinates and the associated data are submitted to the Protein Data Bank (PDB). The success of the refinement procedure is typically assessed by validating the models against geometrical criteria and the diffraction data, and is an important step in ensuring the quality of the PDB public archive [Read et al. (2011), Structure, 19, 1395–1412]. The PDB_REDO procedure aims for ‘constructive validation’, aspiring to consistent and optimal refinement parameterization and pro-active model rebuilding, not only correcting errors but striving for optimal interpretation of the electron density. A web server for PDB_REDO has been implemented, allowing thorough, consistent and fully automated optimization of the refinement procedure in REFMAC and partial model rebuilding. The goal of the web server is to help practicing crystallographers to improve their models prior to submission to the PDB. For this, additional steps were implemented in the PDB_REDO pipeline, both in the refinement procedure, e.g. testing of resolution limits and k-fold cross-validation for small test sets, and as new validation criteria, e.g. the density-fit metrics implemented in EDSTATS and ligand validation as implemented in YASARA. Innovative ways to present the refinement and validation results to the user are also described, which together with auto-generated Coot scripts can guide users to subsequent model inspection and improvement. It is demonstrated that using the server can lead to substantial improvement of structure models before they are submitted to the PDB.
PDB_REDO; validation; model optimization
Recent developments in PHENIX are reported that allow the use of reference-model torsion restraints, secondary-structure hydrogen-bond restraints and Ramachandran restraints for improved macromolecular refinement in phenix.refine at low resolution.
Traditional methods for macromolecular refinement often have limited success at low resolution (3.0–3.5 Å or worse), producing models that score poorly on crystallographic and geometric validation criteria. To improve low-resolution refinement, knowledge from macromolecular chemistry and homology was used to add three new coordinate-restraint functions to the refinement program phenix.refine. Firstly, a ‘reference-model’ method uses an identical or homologous higher resolution model to add restraints on torsion angles to the geometric target function. Secondly, automatic restraints for common secondary-structure elements in proteins and nucleic acids were implemented that can help to preserve the secondary-structure geometry, which is often distorted at low resolution. Lastly, we have implemented Ramachandran-based restraints on the backbone torsion angles. In this method, a ϕ,ψ term is added to the geometric target function to minimize a modified Ramachandran landscape that smoothly combines favorable peaks identified from nonredundant high-quality data with unfavorable peaks calculated using a clash-based pseudo-energy function. All three methods show improved MolProbity validation statistics, typically complemented by a lowered Rfree and a decreased gap between Rwork and Rfree.
macromolecular crystallography; low resolution; refinement; automation
DEN refinement and automated model building with AutoBuild were used to determine the structure of a putative succinyl-diaminopimelate desuccinylase from C. glutamicum. This difficult case of molecular-replacement phasing shows that the synergism between DEN refinement and AutoBuild outperforms standard refinement protocols.
Phasing by molecular replacement remains difficult for targets that are far from the search model or in situations where the crystal diffracts only weakly or to low resolution. Here, the process of determining and refining the structure of Cgl1109, a putative succinyl-diaminopimelate desuccinylase from Corynebacterium glutamicum, at ∼3 Å resolution is described using a combination of homology modeling with MODELLER, molecular-replacement phasing with Phaser, deformable elastic network (DEN) refinement and automated model building using AutoBuild in a semi-automated fashion, followed by final refinement cycles with phenix.refine and Coot. This difficult molecular-replacement case illustrates the power of including DEN restraints derived from a starting model to guide the movements of the model during refinement. The resulting improved model phases provide better starting points for automated model building and produce more significant difference peaks in anomalous difference Fourier maps to locate anomalous scatterers than does standard refinement. This example also illustrates a current limitation of automated procedures that require manual adjustment of local sequence misalignments between the homology model and the target sequence.
reciprocal-space refinement; DEN refinement; real-space refinement; automated model building; succinyl-diaminopimelate desuccinylase
This paper describes an approach for making use of the components of the experimentally determined rotational diffusion tensor derived from NMR relaxation measurements in macromolecular structure determination. The parameters of the rotational diffusion tensor describe the shape and size of the macromolecule or macromolecular complex and are therefore complementary to traditional NMR restraints. The structural information contained in the rotational diffusion tensor is not dissimilar to that present in the small angle region of the solution X-ray scattering profiles. We demonstrate the utility of rotational diffusion tensor restraints for protein structure refinement using the N-terminal domain of enzyme I (EIN) as an example and validate the results by solution small angle X-ray scattering. We also show how rotational diffusion tensor restraints can be used for docking complexes using the dimeric HIV-1 protease and the EIN-HPr complexes as examples. In the former case, the rotational diffusion tensor restraints are sufficient in their own right to determine the position of one subunit relative to another. In the latter case, rotational diffusion tensor restraints complemented by highly ambiguous distance restraints derived from chemical shift perturbation mapping and a hydrophobic contact potential are sufficient to correctly dock EIN to HPr. In each case, the cluster containing the lowest energy structure corresponds to the correct solution.
Local structural similarity restraints (LSSR) provide a novel method for exploiting NCS or structural similarity to an external target structure. Two examples are given where BUSTER re-refinement of PDB entries with LSSR produces marked improvements, enabling further structural features to be modelled.
Maximum-likelihood X-ray macromolecular structure refinement in BUSTER has been extended with restraints facilitating the exploitation of structural similarity. The similarity can be between two or more chains within the structure being refined, thus favouring NCS, or to a distinct ‘target’ structure that remains fixed during refinement. The local structural similarity restraints (LSSR) approach considers all distances less than 5.5 Å between pairs of atoms in the chain to be restrained. For each, the difference from the distance between the corresponding atoms in the related chain is found. LSSR applies a restraint penalty on each difference. A functional form that reaches a plateau for large differences is used to avoid the restraints distorting parts of the structure that are not similar. Because LSSR are local, there is no need to separate out domains. Some restraint pruning is still necessary, but this has been automated. LSSR have been available to academic users of BUSTER since 2009 with the easy-to-use -autoncs and -target target.pdb options. The use of LSSR is illustrated in the re-refinement of PDB entries 5rnt, where -target enables the correct ligand-binding structure to be found, and 1osg, where -autoncs contributes to the location of an additional copy of the cyclic peptide ligand.
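The LSSR idea described above can be illustrated with a minimal Python sketch. It collects atom pairs closer than 5.5 Å in the restrained chain, compares each distance with the corresponding distance in the related chain, and sums a penalty that levels off for large differences. The specific penalty function, the `sigma` parameter and all function names are assumptions made for illustration; the abstract does not state the functional form used in BUSTER, and restraint pruning is not modelled here.

```python
import math
from itertools import combinations


def pair_distances(coords, cutoff=5.5):
    """Return {(i, j): distance} for all atom pairs closer than cutoff (in Å)."""
    out = {}
    for i, j in combinations(range(len(coords)), 2):
        d = math.dist(coords[i], coords[j])
        if d < cutoff:
            out[(i, j)] = d
    return out


def plateau_penalty(delta, sigma=0.5):
    """Assumed robust form: about delta**2 for small delta, tending to sigma**2
    for large delta, so dissimilar regions are not pulled together."""
    return delta ** 2 / (1.0 + (delta / sigma) ** 2)


def lssr_score(chain, related, cutoff=5.5, sigma=0.5):
    """Sum the plateau penalty over all short distances in `chain`,
    each compared with the same atom pair in `related`."""
    total = 0.0
    for (i, j), d in pair_distances(chain, cutoff).items():
        d_related = math.dist(related[i], related[j])
        total += plateau_penalty(d - d_related, sigma)
    return total
```

Because only interatomic distances enter the score, it is unchanged by any rigid-body motion of either chain, which is consistent with the statement that domains need not be separated out.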
BUSTER; NCS restraints; target-structure restraints; local structural similarity restraints
Flexible-fitting computational algorithms are often useful to interpret low resolution maps of many macromolecular complexes generated by electron microscopy (EM) imaging. One such atomistic simulation technique is molecular dynamics flexible fitting (MDFF), which has been widely applied to generate structural models of large ribonucleoprotein assemblies such as the ribosome. We have previously shown that MDFF simulations of globular proteins are sensitive to the resolution of the target EM map, and the strength of restraints used to preserve the secondary structure elements during fitting (Vashisth et al. Structure 2012, 20, 1453–1462). In this work, we aim to systematically examine the quality of structural models of various nucleic acids obtained via MDFF by varying the map resolution and the strength of structural restraints. We also demonstrate how an enhanced conformational sampling technique for proteins, temperature-accelerated molecular dynamics (TAMD), can be combined with MDFF for the structural refinement of nucleic acids in EM maps. Finally, we also demonstrate application of TAMD-assisted MDFF (TAMDFF) on an RNA/protein complex and suggest that TAMDFF is a viable strategy for enhanced conformational fitting in target maps of ribonucleoprotein complexes.
electron microscopy; ribonucleic acid; molecular dynamics flexible fitting; temperature-accelerated molecular dynamics; enhanced sampling
When refining the fit of component atomic structures into electron microscopic reconstructions, use of a resolution-dependent atomic density function makes it possible to jointly optimize the atomic model and imaging parameters of the microscope. Atomic density is calculated by one-dimensional Fourier transform of atomic form factors convoluted with a microscope envelope correction and a low-pass filter, allowing refinement of imaging parameters such as resolution, by optimizing the agreement of calculated and experimental maps. A similar approach allows refinement of atomic displacement parameters, providing indications of molecular flexibility even at low resolution. A modest improvement in atomic coordinates is possible following optimization of these additional parameters. Methods have been implemented in a Python program that can be used in stand-alone mode for rigid-group refinement, or embedded in other optimizers for flexible refinement with stereochemical restraints. The approach is demonstrated with refinements of virus and chaperonin structures at resolutions of 9 through 4.5 Å, representing regimes where rigid-group and fully flexible parameterizations are appropriate. Through comparisons to known crystal structures, flexible fitting by RSRef is shown to be an improvement relative to other methods and to generate models with all-atom rms accuracies of 1.5–2.5 Å at resolutions of 4.5–6 Å.
Fitting; Optimization; Structure; Resolution; Restraint; B-factor; Flexibility
X-ray diffraction plays a pivotal role in the understanding of biological systems by revealing atomic structures of proteins, nucleic acids, and their complexes, with much recent interest in very large assemblies like the ribosome. Since crystals of such large assemblies often diffract weakly (resolution worse than 4 Å), we need methods that work at such low resolution. In macromolecular assemblies, some of the components may be known at high resolution, while others are unknown: current refinement methods fail as they require a high-resolution starting structure for the entire complex [1]. Determining such complexes, which are often of key biological importance, should be possible in principle as the number of independent diffraction intensities at a resolution below 5 Å generally exceeds the number of degrees of freedom. Here we introduce a new method that adds specific information from known homologous structures but allows global and local deformations of these homology models. Our approach uses the observation that local protein structure tends to be conserved as sequence and function evolve. Cross-validation with Rfree determines the optimum deformation and influence of the homology model. For test cases at 3.5–5 Å resolution with known structures at high resolution, our method gives significant improvements over conventional refinement in the model coordinate accuracy, the definition of secondary structure, and the quality of electron density maps. For re-refinements of a representative set of 19 low-resolution crystal structures from the PDB, we find similar improvements. Thus, a structure derived from low-resolution diffraction data can have quality similar to a high-resolution structure. Our method is applicable to studying weakly diffracting crystals using X-ray micro-diffraction [2] as well as data from new X-ray light sources [3].
Use of homology information is not restricted to X-ray crystallography and cryo-electron microscopy: as optical imaging advances to sub-nanometer resolution [4,5], it can use similar tools.
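The role of Rfree cross-validation in choosing the deformation and the influence of the homology model can be sketched as a simple selection over trial refinements. The example below assumes that each trial has already produced calculated amplitudes for the free (test) reflections; the parameter names `gamma` and `weight`, the dictionary layout and both function names are hypothetical, and scaling between observed and calculated amplitudes is omitted for brevity.

```python
def r_factor(f_obs, f_calc):
    """Standard crystallographic R factor: sum | |Fobs| - |Fcalc| | / sum |Fobs|.
    Amplitudes are assumed to be non-negative and already on a common scale."""
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


def select_by_rfree(trials):
    """Pick the parameter pair with the lowest Rfree.

    trials maps a hypothetical (gamma, weight) pair to a tuple
    (f_obs_free, f_calc_free) computed on the free reflections only.
    """
    return min(trials, key=lambda params: r_factor(*trials[params]))


# Usage with made-up amplitudes for two trial parameter settings.
trials = {
    (0.2, 10.0): ([10.0, 20.0], [9.0, 22.0]),
    (0.8, 100.0): ([10.0, 20.0], [10.0, 21.0]),
}
best = select_by_rfree(trials)
```

Only the free reflections are used for the selection, so the chosen parameters are not biased by the data that drove the refinement itself.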
X-ray crystallography; homology modeling; cross-validation; Rfree value; refinement
Protein fragments suitable for use in molecular replacement can be generated by normal-mode perturbation, analysis of the difference distance matrix of the original versus normal-mode perturbed structures, and SCEDS, a score that measures the sphericity, continuity, equality and density of the resulting fragments.
A method is described for generating protein fragments suitable for use as molecular-replacement (MR) template models. The template model for a protein suspected to undergo a conformational change is perturbed along combinations of low-frequency normal modes of the elastic network model. The unperturbed structure is then compared with each perturbed structure in turn and the structurally invariant regions are identified by analysing the difference distance matrix. These fragments are scored with SCEDS, which is a combined measure of the sphericity of the fragments, the continuity of the fragments with respect to the polypeptide chain, the equality in number of atoms in the fragments and the density of Cα atoms in the triaxial ellipsoid of the fragment extents. The fragment divisions with the highest SCEDS are then used as separate template models for MR. Test cases show that where the protein contains fragments that undergo a change in juxtaposition between template model and target, SCEDS can identify fragments that lead to a lower R factor after ten cycles of all-atom refinement with REFMAC5 than the original template structure. The method has been implemented in the software Phaser.
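The difference distance matrix step described above can be illustrated with a short Python sketch. It compares all pairwise distances in an unperturbed and a perturbed coordinate set and groups points whose mutual distances are essentially unchanged. The greedy grouping and the `tol` threshold are assumptions made for illustration; the SCEDS score itself (sphericity, continuity, equality and density) and the normal-mode perturbation are not implemented here.

```python
import math


def distance_matrix(coords):
    """All-against-all Euclidean distances for a list of 3D points."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def difference_distance_matrix(original, perturbed):
    """Absolute change in every pairwise distance between two conformations."""
    da, db = distance_matrix(original), distance_matrix(perturbed)
    n = len(original)
    return [[abs(da[i][j] - db[i][j]) for j in range(n)] for i in range(n)]


def invariant_groups(ddm, tol=0.5):
    """Greedy grouping (an assumed heuristic): a point joins the first group in
    which its distance to every member changed by less than tol (in Å)."""
    groups = []
    for i in range(len(ddm)):
        for group in groups:
            if all(ddm[i][j] < tol for j in group):
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


# Two rigid pairs of points; the second pair shifts relative to the first.
original = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0), (11.0, 0.0, 0.0)]
perturbed = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 5.0, 0.0), (11.0, 5.0, 0.0)]
groups = invariant_groups(difference_distance_matrix(original, perturbed))
```

In this toy case the two pairs are recovered as separate groups, mirroring how structurally invariant regions would be proposed as separate template models.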
difference distance matrix; normal-mode analysis
The structures of large macromolecular complexes in different functional states can be determined by cryo-electron microscopy, which yields electron density maps of low to intermediate resolutions. The maps can be combined with high-resolution atomic structures of components of the complex, to produce a model for the complex that is more accurate than the formal resolution of the map. To this end, methods have been developed to dock atomic models into density maps rigidly or flexibly, and to refine a docked model so as to optimize the fit of the atomic model into the map. We have developed a new refinement method called YUP.SCX. The electron density map is converted into a component of the potential energy function to which terms for stereochemical restraints and volume exclusion are added. The potential energy function is then minimized (using simulated annealing) to yield a stereochemically-restrained atomic structure that fits into the electron density map optimally. We used this procedure to construct an atomic model of the 70S ribosome in the pre-accommodation state. Although some atoms are displaced by as much as 33 Å, they divide themselves into nearly rigid fragments along natural boundaries with smooth transitions between the fragments.
Electron microscopy; simulated annealing; structural refinement
Prediction of protein structures from their sequences is still one of the open grand challenges of computational biology. Some approaches to protein structure prediction, especially ab initio ones, rely to some extent on the prediction of residue contact maps. Residue contact map predictions have been assessed at the CASP competition for several years now. Although it has been shown that exact contact maps generally yield correct three-dimensional structures, this is true only at a relatively low resolution (3–4 Å from the native structure). Another known weakness of contact maps is that they are generally predicted ab initio, that is not exploiting information about potential homologues of known structure.
We introduce a new class of distance restraints for protein structures: multi-class distance maps. We show that Cα trace reconstructions based on 4-class native maps are significantly better than those from residue contact maps. We then build two predictors of 4-class maps based on recursive neural networks: one ab initio, or relying on the sequence and on evolutionary information; one template-based, or in which homology information to known structures is provided as a further input. We show that virtually any level of sequence similarity to structural templates (down to less than 10%) yields more accurate 4-class maps than the ab initio predictor. We show that template-based predictions by recursive neural networks are consistently better than the best template and than a number of combinations of the best available templates. We also extract binary residue contact maps at an 8 Å threshold (as per CASP assessment) from the 4-class predictors and show that the template-based version is also more accurate than the best template and consistently better than the ab initio one, down to very low levels of sequence identity to structural templates. Furthermore, we test both ab-initio and template-based 8 Å predictions on the CASP7 targets using a pre-CASP7 PDB, and find that both predictors are state-of-the-art, with the template-based one far outperforming the best CASP7 systems if templates with sequence identity to the query of 10% or better are available. Although this is not the main focus of this paper we also report on reconstructions of Cα traces based on both ab initio and template-based 4-class map predictions, showing that the latter are generally more accurate even when homology is dubious.
Accurate predictions of multi-class maps may provide valuable constraints for improved ab initio and template-based prediction of protein structures, naturally incorporate multiple templates, and yield state-of-the-art binary maps. Predictions of protein structures and 8 Å contact maps based on the multi-class distance map predictors described in this paper are freely available to academic users at the url .
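A multi-class distance map of the kind introduced above can be sketched in a few lines of Python. Each Cα pair is assigned to one of four distance classes, and a binary contact map at the 8 Å threshold is then read off from the closest class. Only the 8 Å boundary is taken from the text; the remaining class boundaries in `THRESHOLDS` and the function names are assumptions made for illustration.

```python
import math
from bisect import bisect_right

# Class boundaries in Å. Only 8 Å is stated in the text; 13 and 19 are assumed.
THRESHOLDS = (8.0, 13.0, 19.0)


def multiclass_map(ca_coords, thresholds=THRESHOLDS):
    """Return an n x n matrix of class labels (0 = closest class) for Cα pairs."""
    n = len(ca_coords)
    return [
        [bisect_right(thresholds, math.dist(ca_coords[i], ca_coords[j])) for j in range(n)]
        for i in range(n)
    ]


def contact_map(class_map):
    """Binary contacts: 1 where the pair falls in class 0 (distance below 8 Å)."""
    return [[1 if c == 0 else 0 for c in row] for row in class_map]


# Four Cα positions along a line, for illustration only.
ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (10.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
classes = multiclass_map(ca)
contacts = contact_map(classes)
```

The multi-class map keeps coarse information about pairs that are not in contact, which the binary map discards.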
The title compound, C23H26F2N2O4, is a dipeptidic inhibitor of γ-secretase, one of the enzymes involved in Alzheimer’s disease. The molecule adopts a compact conformation, without intramolecular hydrogen bonds. In the crystal structure, one of the amide N atoms forms the only intermolecular N—H⋯O hydrogen bond; the second amide N atom does not form hydrogen bonds. High-resolution synchrotron diffraction data permitted the unequivocal location and refinement without restraints of all H atoms, and the identification of the characteristic shift of the amide H atom engaged in the hydrogen bond from its ideal position, resulting in a more linear hydrogen bond. Significant residual densities for bonding electrons were revealed after the usual SHELXL refinement, and modeling of these features as additional interatomic scatterers (IAS) using the program PHENIX led to a significant decrease in the R factor from 0.0411 to 0.0325 and diminished the r.m.s. deviation level of noise in the final difference Fourier map from 0.063 to 0.037 e Å−3.
SAD data can be used in Phaser to solve novel structures, supplement molecular-replacement phase information or identify anomalous scatterers from a final refined model.
Phaser is a program that implements likelihood-based methods to solve macromolecular crystal structures, currently by molecular replacement or single-wavelength anomalous diffraction (SAD). SAD phasing is based on a likelihood target derived from the joint probability distribution of observed and calculated pairs of Friedel-related structure factors. This target combines information from the total structure factor (primarily non-anomalous scattering) and the difference between the Friedel mates (anomalous scattering). Phasing starts from a substructure, which is usually but not necessarily a set of anomalous scatterers. The substructure can also be a protein model, such as one obtained by molecular replacement. Additional atoms are found using a log-likelihood gradient map, which shows the sites where the addition of scattering from a particular atom type would improve the likelihood score. An automated completion algorithm adds new sites, choosing optionally among different atom types, adds anisotropic B-factor parameters if appropriate and deletes atoms that refine to low occupancy. Log-likelihood gradient maps can also identify which atoms in a refined protein structure are anomalous scatterers, such as metal or halide ions. These maps are more sensitive than conventional model-phased anomalous difference Fouriers and the iterative completion algorithm is able to find a significantly larger number of convincing sites.
SAD phasing; likelihood; molecular replacement
Structural studies of large proteins and protein assemblies are a difficult and pressing challenge in molecular biology. Experiments often yield only low-resolution or sparse data that are not sufficient to fully determine atomistic structures. We have developed a general geometry-based algorithm that efficiently samples conformational space under constraints imposed by low-resolution density maps obtained from electron microscopy or X-ray crystallography experiments. A deformable elastic network (DEN) is used to restrain the sampling to prior knowledge of an approximate structure. The DEN restraints dramatically reduce over-fitting, especially at low resolution. Cross-validation is used to optimally weight the structural information and experimental data. Our algorithm is robust even for noise-added density maps and has a large radius of convergence for our test case. The DEN restraints can also be used to enhance reciprocal space simulated annealing refinement.
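The idea of a deformable elastic network can be sketched in a few lines of Python: a harmonic penalty ties selected atom pairs to equilibrium distances, and those equilibrium distances are themselves allowed to drift. The abstract does not state the functional form, so the update rule below, the parameter names `gamma`, `kappa` and `weight`, and the function names are all assumptions chosen for illustration rather than the authors' implementation.

```python
import math


def den_energy(coords, pairs, d0, weight=1.0):
    """Harmonic network energy: weight * sum of (d_ij - d0_ij)^2 over pairs."""
    return weight * sum(
        (math.dist(coords[i], coords[j]) - d0[(i, j)]) ** 2 for (i, j) in pairs
    )


def update_equilibrium(coords, ref_coords, pairs, d0, gamma=0.5, kappa=0.1):
    """Move each equilibrium distance toward a mix of current and reference.

    Assumed rule: gamma = 1 lets the network follow the current model only,
    gamma = 0 pulls it back toward the reference structure, and kappa sets
    how far the equilibrium distance moves in one update.
    """
    updated = {}
    for (i, j) in pairs:
        d_cur = math.dist(coords[i], coords[j])
        d_ref = math.dist(ref_coords[i], ref_coords[j])
        target = gamma * d_cur + (1.0 - gamma) * d_ref
        updated[(i, j)] = d0[(i, j)] + kappa * (target - d0[(i, j)])
    return updated
```

In a refinement loop of this kind, the network energy would be added to the data-fitting target, with the relative weight and the deformability chosen by cross-validation as described above.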
The deformable elastic network (DEN) method for reciprocal-space crystallographic refinement improves crystal structures, especially at resolutions lower than 3.5 Å. The DEN web service presented here is intended to provide structural biologists with access to resources for running computationally intensive DEN refinements.
Deformable elastic network (DEN) restraints have proved to be a powerful tool for refining structures from low-resolution X-ray crystallographic data sets. Unfortunately, optimal refinement using DEN restraints requires extensive calculations and is often hindered by a lack of access to sufficient computational resources. The DEN web service presented here is intended to provide structural biologists with access to resources for running computationally intensive DEN refinements in parallel on the Open Science Grid, the US cyberinfrastructure. Access to the grid is provided through a simple and intuitive web interface integrated into the SBGrid Science Portal. Using this portal, refinements combined with full parameter optimization that would take many thousands of hours on standard computational resources can now be completed in several hours. An example of the successful application of DEN restraints to the human Notch1 transcriptional complex using the grid resource, and summaries of all submitted refinements, are presented as justification.
deformable elastic network restraints; low-resolution refinement; DEN refinement
A simple rule of thumb based on resolution is not adequate to identify the best treatment of atomic displacements in macromolecular structural models. The choice to use isotropic B factors, anisotropic B factors, TLS models or some combination of the three should be validated through statistical analysis of the model refinement.
In choosing and refining any crystallographic structural model, there is tension between the desire to extract the most detailed information possible and the necessity to describe no more than what is justified by the observed data. A more complex model is not necessarily a better model. Thus, it is important to validate the choice of parameters as well as their refined values. One recurring task is to choose the best model for describing the displacement of each atom about its mean position. At atomic resolution one has the option of devoting six model parameters (a ‘thermal ellipsoid’) to describe the displacement of each atom. At medium resolution one typically devotes at most one model parameter per atom to describe the same thing (a ‘B factor’). At very low resolution one cannot justify the use of even one parameter per atom. Furthermore, this aspect of the structure may be described better by an explicit model of bulk displacements, the most common of which is the translation/libration/screw (TLS) formalism, rather than by assigning some number of parameters to each atom individually. One can sidestep this choice between atomic displacement parameters and TLS descriptions by including both treatments in the same model, but this is not always statistically justifiable. The choice of which treatment is best for a particular structure refinement at a particular resolution can be guided by general considerations of the ratio of model parameters to the number of observations and by specific statistics such as the Hamilton R-factor ratio test.
atomic displacements; B factors; TLS models; model parameters
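The ratio of observations to model parameters mentioned above is simple arithmetic, sketched below in Python. The per-atom counts (six for a thermal ellipsoid, one for a B factor) come from the abstract, and the count of 20 parameters per TLS group is the standard size of the T, L and S tensors. The function names, the three positional parameters per atom, and the omission of restraints (which act as additional observations in practice) are simplifying assumptions. The sketch does not implement the Hamilton R-factor ratio test.

```python
# Displacement parameters per atom for each treatment named in the text.
PER_ATOM = {"none": 0, "isotropic": 1, "anisotropic": 6}

# T (6) + L (6) + S (8) independent components per TLS group.
PARAMS_PER_TLS_GROUP = 20


def displacement_parameters(n_atoms, model, n_tls_groups=0):
    """Count displacement parameters for a per-atom model plus TLS groups."""
    if model not in PER_ATOM:
        raise ValueError(f"unknown displacement model: {model!r}")
    return n_atoms * PER_ATOM[model] + PARAMS_PER_TLS_GROUP * n_tls_groups


def observations_per_parameter(n_reflections, n_atoms, model, n_tls_groups=0):
    """Ratio of unique reflections to refined parameters (x, y, z included).

    Restraints are ignored here, so the true effective ratio is higher.
    """
    n_params = 3 * n_atoms + displacement_parameters(n_atoms, model, n_tls_groups)
    return n_reflections / n_params
```

For a hypothetical 1000-atom model with 20000 reflections, the isotropic treatment gives 5 observations per parameter, whereas the anisotropic treatment gives about 2.2, which illustrates why the more detailed model needs separate statistical justification.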
A brief summary of the types of restraint defined in refinement dictionaries.
At the resolution available from most macromolecular crystals, the X-ray data alone are insufficient to lead to a chemically reasonable structure, so stereochemical restraints are essential. These usually restrain bond lengths, bond angles, planes and chiral volumes. The definition of these restraints and where the values come from are described. A dictionary entry contains information about the atom types, their connectivity and all the appropriate restraints. Torsion angles are not usually restrained, but they do have optimum values. In the special case of flexible five- and six-membered rings, including pentose and hexose sugars, the ring pucker is defined by combinations of torsion angles and the pucker affects the position of substituents.
stereochemistry; restraints; bond lengths; bond angles; protein structure; crystallographic refinement
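Two of the geometric quantities named above, the chiral volume and the torsion angle, can be computed directly from atomic coordinates. The Python sketch below uses the signed scalar triple product for the chiral volume and the usual atan2 formulation for the torsion angle. The function names are hypothetical, and the sketch does not reproduce any particular dictionary format or refinement program.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def chiral_volume(center, a1, a2, a3):
    """Signed volume (a1 - c) . ((a2 - c) x (a3 - c)) around a chiral centre.

    Swapping any two substituents flips the sign, which is what a chiral
    restraint detects.
    """
    v1, v2, v3 = _sub(a1, center), _sub(a2, center), _sub(a3, center)
    return _dot(v1, _cross(v2, v3))


def torsion_angle(p1, p2, p3, p4):
    """Torsion angle in degrees about the p2-p3 bond, in (-180, 180]."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n2 = _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(_cross(b1, b2), n2)
    return math.degrees(math.atan2(y, x))
```

A ring pucker of the kind described for pentose and hexose sugars would then be characterized by evaluating `torsion_angle` for each consecutive set of four ring atoms and combining the results.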