The highly automated PHENIX AutoBuild wizard is described. The procedure can be applied equally well to phases derived from isomorphous/anomalous and molecular-replacement methods.
The PHENIX AutoBuild wizard is a highly automated tool for iterative model building, structure refinement and density modification using RESOLVE model building, RESOLVE statistical density modification and phenix.refine structure refinement. Recent advances in the AutoBuild wizard and phenix.refine include automated detection and application of NCS from models as they are built, extensive model-completion algorithms and automated solvent-molecule picking. Model-completion algorithms in the AutoBuild wizard include loop building, crossovers between chains in different models of a structure and side-chain optimization. The AutoBuild wizard has been applied to a set of 48 structures at resolutions ranging from 1.1 to 3.2 Å, resulting in a mean R factor of 0.24 and a mean free R factor of 0.29. The R factor of the final model is dependent on the quality of the starting electron density and is relatively independent of resolution.
model building; model completion; macromolecular models; Protein Data Bank; structure refinement; PHENIX
A semi-automated computational procedure to assist in the identification of bound ligands from unknown electron density has been developed. The atomic surface surrounding the density blob is compared to a library of three-dimensional ligand binding surfaces extracted from the Protein Data Bank (PDB). Ligands corresponding to surfaces which share physicochemical texture and geometric shape similarities are considered for assignment. The method is benchmarked against a set of well represented ligands from the PDB, in which we show that we can identify the correct ligand based on the corresponding binding surface. Finally, we apply the method during model building and refinement stages from structural genomics targets in which unknown density blobs were discovered. A semi-automated computational method is described which aims to assist crystallographers with assigning the identity of a ligand corresponding to unknown electron density. Using shape and physicochemical similarity assessments between the protein surface surrounding the density and a database of known ligand binding surfaces, a plausible list of candidate ligands is identified for consideration. The method is validated against highly observed ligands from the Protein Data Bank and results are shown from its use in a high-throughput structural genomics pipeline.
Electron density assignment; Function annotation; Ligand identification; Ligand assignment; Protein surfaces
RCrane is a new tool for the partially automated building of RNA crystallographic models into electron-density maps of low or intermediate resolution. This tool helps crystallographers to place phosphates and bases into electron density and then automatically predicts and builds the detailed all-atom structure of the traced nucleotides.
RNA crystals typically diffract to much lower resolutions than protein crystals. This low-resolution diffraction results in unclear density maps, which cause considerable difficulties during the model-building process. These difficulties are exacerbated by the lack of computational tools for RNA modeling. Here, RCrane, a tool for the partially automated building of RNA into electron-density maps of low or intermediate resolution, is presented. This tool works within Coot, a common program for macromolecular model building. RCrane helps crystallographers to place phosphates and bases into electron density and then automatically predicts and builds the detailed all-atom structure of the traced nucleotides. RCrane then allows the crystallographer to review the newly built structure and select alternative backbone conformations where desired. This tool can also be used to automatically correct the backbone structure of previously built nucleotides. These automated corrections can fix incorrect sugar puckers, steric clashes and other structural problems.
RCrane; RNA model building
A method for automated macromolecular side-chain model building and for aligning the sequence to the map is described.
An algorithm is described for automated building of side chains in an electron-density map once a main-chain model is built and for alignment of the protein sequence to the map. The procedure is based on a comparison of electron density at the expected side-chain positions with electron-density templates. The templates are constructed from average amino-acid side-chain densities in 574 refined protein structures. For each contiguous segment of main chain, a matrix with entries corresponding to an estimate of the probability that each of the 20 amino acids is located at each position of the main-chain model is obtained. The probability that this segment corresponds to each possible alignment with the sequence of the protein is estimated using a Bayesian approach and high-confidence matches are kept. Once side-chain identities are determined, the most probable rotamer for each side chain is built into the model. The automated procedure has been implemented in the RESOLVE software. Combined with automated main-chain model building, the procedure produces a preliminary model suitable for refinement and extension by an experienced crystallographer.
model building; template matching
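The sequence-alignment step described above can be illustrated with a minimal sketch. It is not the RESOLVE implementation: it assumes ungapped alignments, a uniform prior over offsets, and a per-position probability table supplied as plain dictionaries. The names `alignment_posteriors` and `favoured`, the probability floor and the toy values are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def alignment_posteriors(prob_matrix, sequence, floor=1e-6):
    """Posterior probability of each ungapped offset of a segment on a sequence.

    prob_matrix[i][aa] is the estimated probability that position i of the
    main-chain segment holds amino acid aa. A uniform prior over offsets is
    assumed (an illustrative simplification).
    """
    n = len(prob_matrix)
    if n == 0 or n > len(sequence):
        raise ValueError("segment must be non-empty and no longer than the sequence")
    log_likelihoods = []
    for offset in range(len(sequence) - n + 1):
        ll = sum(
            math.log(max(prob_matrix[i].get(sequence[offset + i], 0.0), floor))
            for i in range(n)
        )
        log_likelihoods.append(ll)
    # Normalise in log space to avoid underflow for long segments.
    peak = max(log_likelihoods)
    weights = [math.exp(ll - peak) for ll in log_likelihoods]
    total = sum(weights)
    return [w / total for w in weights]


def favoured(aa, p=0.62):
    """Toy probability row: probability p for one residue, the rest shared equally."""
    rest = (1.0 - p) / (len(AMINO_ACIDS) - 1)
    return {a: (p if a == aa else rest) for a in AMINO_ACIDS}


# Toy segment whose density most resembles Leu-Ala-Ala (made-up values).
matrix = [favoured(a) for a in "LAA"]
posteriors = alignment_posteriors(matrix, "MKVLAAGIVK")
best_offset = posteriors.index(max(posteriors))
```

A high-confidence match would then be kept only if the best posterior exceeds a chosen cutoff; the cutoff value is left to the user in this sketch.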
A method for automated macromolecular main-chain model building is described.
An algorithm for the automated macromolecular model building of polypeptide backbones is described. The procedure is hierarchical. In the initial stages, many overlapping polypeptide fragments are built. In subsequent stages, the fragments are extended and then connected. Identification of the locations of helical and β-strand regions is carried out by FFT-based template matching. Fragment libraries of helices and β-strands from refined protein structures are then positioned at the potential locations of helices and strands and the longest segments that fit the electron-density map are chosen. The helices and strands are then extended using fragment libraries consisting of sequences three amino acids long derived from refined protein structures. The resulting segments of polypeptide chain are then connected by choosing those which overlap at two or more Cα positions. The fully automated procedure has been implemented in RESOLVE and is capable of model building at resolutions as low as 3.5 Å. The algorithm is useful for building a preliminary main-chain model that can serve as a basis for refinement and side-chain addition.
model building; template matching; fragment extension
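The final connection step, joining segments that overlap at two or more Cα positions, can be sketched as follows. This is an illustration under stated assumptions rather than the RESOLVE code: fragments are lists of Cα coordinates in Å, only end-to-start overlaps are considered, and the 1.0 Å coincidence tolerance and the names `overlap_length` and `connect` are hypothetical.

```python
import math


def overlap_length(frag_a, frag_b, tol=1.0):
    """Largest k such that the last k CA atoms of frag_a coincide, within
    tol angstroms, with the first k CA atoms of frag_b."""
    for k in range(min(len(frag_a), len(frag_b)), 0, -1):
        if all(math.dist(p, q) <= tol for p, q in zip(frag_a[-k:], frag_b[:k])):
            return k
    return 0


def connect(frag_a, frag_b, min_overlap=2, tol=1.0):
    """Join two fragments if they share at least min_overlap CA positions.

    Returns the merged CA trace, or None when the overlap is too short.
    """
    k = overlap_length(frag_a, frag_b, tol)
    if k < min_overlap:
        return None
    return frag_a + frag_b[k:]


# Toy straight-line traces with 3.8 angstrom CA spacing (made-up coordinates).
a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
b = [(7.5, 0.1, 0.0), (11.5, 0.0, 0.1), (15.2, 0.0, 0.0)]
merged = connect(a, b)
```

Requiring two coincident positions rather than one constrains both the position and the direction of the chain at the junction, which is presumably why a single shared Cα is not treated as sufficient.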
A novel method that uses the conformational distribution of Cα atoms in known structures is used to build short missing regions (‘loops’) in protein models. An initial tree of possible loop paths is pruned according to structural and electron-density criteria and the most likely loop conformation(s) are selected and built.
One of the most cumbersome and time-demanding tasks in completing a protein model is building short missing regions or ‘loops’. A method is presented that uses structural and electron-density information to build the most likely conformations of such loops. Using the distribution of angles and dihedral angles in pentapeptides as the driving parameters, a set of possible conformations for the Cα backbone of loops was generated. The most likely candidate is then selected in a hierarchical manner: new and stronger restraints are added while the loop is built. The weight of the electron-density correlation relative to geometrical considerations is gradually increased until the most likely loop is selected on map correlation alone. To conclude, the loop is refined against the electron density in real space. This is started by using structural information to trace a set of models for the Cα backbone of the loop. Only in later steps of the algorithm is the electron-density correlation used as a criterion to select the loop(s). Thus, this method is more robust in low-density regions than an approach using density as a primary criterion. The algorithm is implemented in a loop-building program, Loopy, which can be used either alone or as part of an automatic building cycle. Loopy can build loops of up to 14 residues in length within a couple of minutes. The average root-mean-square deviation of the Cα atoms in the loops built during validation was less than 0.4 Å. When implemented in the context of automated model building in ARP/wARP, Loopy can increase the completeness of the built models.
model building; loop modelling; Loopy
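The hierarchical selection described above, in which the weight of the map correlation is raised step by step until it is the only criterion, can be sketched as below. This is not the Loopy implementation: the linear weight schedule, the fraction of candidates kept per round, the score ranges and the name `select_loop` are all assumptions made for illustration.

```python
def select_loop(candidates, n_rounds=4, keep_fraction=0.5):
    """Prune loop candidates while shifting weight from geometry to density.

    candidates: list of (name, geometry_score, map_correlation) tuples, with
    both scores assumed to lie in [0, 1] and higher meaning better.
    The density weight w rises linearly from 0 in the first round to 1 in the
    last, so the final choice rests on map correlation alone.
    """
    if n_rounds < 2:
        raise ValueError("n_rounds must be at least 2")
    pool = list(candidates)
    for r in range(n_rounds):
        w = r / (n_rounds - 1)
        pool.sort(key=lambda c: (1.0 - w) * c[1] + w * c[2], reverse=True)
        keep = max(1, int(len(pool) * keep_fraction))
        pool = pool[:keep]
    return pool[0]


# Made-up candidates: C fits the map best but has poor geometry.
toy = [("A", 0.9, 0.5), ("B", 0.8, 0.8), ("C", 0.3, 0.95), ("D", 0.7, 0.6)]
best = select_loop(toy)
```

In this toy run candidate C is discarded in the first, geometry-only round despite its high map correlation, which mirrors the stated rationale for deferring the density criterion in weak-density regions.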
The automated building of a protein model into an electron density map remains a challenging problem. In the ARP/wARP approach, model building is facilitated by initially interpreting a density map with free atoms of unknown chemical identity; all structural information for such chemically unassigned atoms is discarded. Here, this is remedied by applying restraints between free atoms, and between free atoms and a partial protein model. These are based on geometric considerations of protein structure and tentative (conditional) assignments for the free atoms. Restraints are applied in the REFMAC5 refinement program and are generated on an ad hoc basis, allowing them to fluctuate from step to step. A large set of experimentally phased and molecular replacement structures showcases individual structures where automated building is improved drastically by the conditional restraints. The concept and implementation we present can also find application in restraining geometries, such as hydrogen bonds, in low-resolution refinement.
The challenges that arise in nucleic acid model building as a consequence of their simpler and more symmetric super-secondary structures are addressed.
The process of building and refining crystal structures of nucleic acids, although similar to that for proteins, has some peculiarities that give rise to both various complications and various benefits. Although conventional isomorphous replacement phasing techniques are typically used to generate an experimental electron-density map for the purposes of determining novel nucleic acid structures, it is also possible to couple the phasing and model-building steps to permit the solution of complex and novel RNA three-dimensional structures without the need for conventional heavy-atom phasing approaches.
nucleic acids; model building; refinement
High level ab initio studies demonstrate substantial conformational flexibility of amino groups of nucleic acid bases. This flexibility is important for biological functions of DNA. Existing force field models of molecular mechanics do not describe this phenomenon due to a lack of quantitative experimental data necessary for an adjustment of empirical parameters. We have performed extensive calculations of nucleic acid bases at the MP2/6-31G(d,p) level of ab initio theory for a broad set of amino group configurations. Two-dimensional maps of energy and geometrical characteristics as functions of two amino hydrogen torsions have been constructed. We approximate the maps by polynomial expressions, which can be used in molecular mechanics calculations. Detailed considerations of these maps enable us to propose a method for determination of numerical coefficients in the developed formulae using restricted sets of points obtained via higher-level calculations.
Molecular mechanics; Correlated ab initio calculations; Nucleic acids; Amino group flexibility
Single-particle cryo electron microscopy (cryoEM) is a technique for determining three-dimensional (3D) structures from projection images of molecular complexes preserved in their “native,” noncrystalline state. Recently, atomic or near-atomic resolution structures of several viruses and protein assemblies have been determined by single-particle cryoEM, allowing ab initio atomic model building by following the amino acid side chains or nucleic acid bases identifiable in their cryoEM density maps. In particular, these cryoEM structures have revealed extended arms contributing to molecular interactions that are otherwise not resolved by the conventional structural method of X-ray crystallography at similar resolutions. High-resolution cryoEM requires careful consideration of a number of factors, including proper sample preparation to ensure structural homogeneity, optimal configuration of electron imaging conditions to record high-resolution cryoEM images, accurate determination of image parameters to correct image distortions, efficient refinement and computation to reconstruct a 3D density map, and finally appropriate choice of modeling tools to construct atomic models for functional interpretation. This progress illustrates the power of cryoEM and ushers it into the arsenal of structural biology, alongside conventional techniques of X-ray crystallography and NMR, as a major tool (and sometimes the preferred one) for the studies of molecular interactions in supramolecular assemblies or machines.
Biological polymers such as nucleic acids and proteins are ubiquitous in living systems, but their ability to address problems beyond those found in nature is constrained by factors such as chemical or biological instability, limited building-block functionality, bioavailability, and immunogenicity. In principle, sequence-defined synthetic polymers based on nonbiological monomers and backbones might overcome these constraints; however, identifying the sequence of a synthetic polymer that possesses a specific desired functional property remains a major challenge. Molecular evolution can rapidly generate functional polymers but requires a means of translating amplifiable templates such as nucleic acids into the polymer being evolved. This review covers recent advances in the enzymatic and nonenzymatic templated polymerization of nonnatural polymers and their potential applications in the directed evolution of sequence-defined synthetic polymers.
A cross-validation-based method for bias reduction in ‘classical’ iterative density modification of experimental X-ray crystallography maps provides significantly more accurate phase-quality estimates and leads to improved automated model building.
Density modification often suffers from an overestimation of phase quality, as seen by escalated figures of merit. A new cross-validation-based method to address this estimation bias by applying a bias-correction parameter ‘β’ to maximum-likelihood phase-combination functions is proposed. In tests on over 100 single-wavelength anomalous diffraction data sets, the method is shown to produce much more reliable figures of merit and improved electron-density maps. Furthermore, significantly better results are obtained in automated model building iterated with phased refinement using the more accurate phase probability parameters from density modification.
reliable figure-of-merit estimates; density modification; maximum likelihood; bias reduction
The identification and modelling of ligands into macromolecular models is important for understanding a molecule's function and for designing inhibitors to modulate its activities. We describe new algorithms for the automated building of ligands into electron density maps in crystal structure determination. Location of the ligand-binding site is achieved by matching numerical shape features describing the ligand to those of density clusters using a “fragmentation-tree” density representation. The ligand molecule is built using two distinct algorithms exploiting free atoms with inter-atomic connectivity and Metropolis-based optimisation of the conformational state of the ligand, producing an ensemble of structures from which the final model is derived. The method was validated on several thousand entries from the Protein Data Bank. In the majority of cases, the ligand-binding site could be correctly located and the ligand model built with a coordinate accuracy of better than 1 Å. We anticipate that the method will be of routine use to anyone modelling ligands, lead compounds or even compound fragments as part of protein functional analyses or drug design efforts.
electron density map; small-molecule binders; shape; hybrid approach; drug design
Recent developments in PHENIX are reported that allow the use of reference-model torsion restraints, secondary-structure hydrogen-bond restraints and Ramachandran restraints for improved macromolecular refinement in phenix.refine at low resolution.
Traditional methods for macromolecular refinement often have limited success at low resolution (3.0–3.5 Å or worse), producing models that score poorly on crystallographic and geometric validation criteria. To improve low-resolution refinement, knowledge from macromolecular chemistry and homology was used to add three new coordinate-restraint functions to the refinement program phenix.refine. Firstly, a ‘reference-model’ method uses an identical or homologous higher resolution model to add restraints on torsion angles to the geometric target function. Secondly, automatic restraints for common secondary-structure elements in proteins and nucleic acids were implemented that can help to preserve the secondary-structure geometry, which is often distorted at low resolution. Lastly, we have implemented Ramachandran-based restraints on the backbone torsion angles. In this method, a ϕ,ψ term is added to the geometric target function to minimize a modified Ramachandran landscape that smoothly combines favorable peaks identified from nonredundant high-quality data with unfavorable peaks calculated using a clash-based pseudo-energy function. All three methods show improved MolProbity validation statistics, typically complemented by a lowered Rfree and a decreased gap between Rwork and Rfree.
macromolecular crystallography; low resolution; refinement; automation
Torsion-angle sampling, as implemented in the Protein Local Optimization Program (PLOP), is used to generate multiple structurally variable single-conformer models which are in good agreement with X-ray data. An ensemble-refinement approach to differentiate between positional uncertainty and conformational heterogeneity is proposed.
Modeling structural variability is critical for understanding protein function and for modeling reliable targets for in silico docking experiments. Because of the time-intensive nature of manual X-ray crystallographic refinement, automated refinement methods that thoroughly explore conformational space are essential for the systematic construction of structurally variable models. Using five proteins spanning resolutions of 1.0–2.8 Å, it is demonstrated how torsion-angle sampling of backbone and side-chain libraries with filtering against both the chemical energy, using a modern effective potential, and the electron density, coupled with minimization of a reciprocal-space X-ray target function, can generate multiple structurally variable models which fit the X-ray data well. Torsion-angle sampling as implemented in the Protein Local Optimization Program (PLOP) has been used in this work. Models with the lowest Rfree values are obtained when electrostatic and implicit solvation terms are included in the effective potential. HIV-1 protease, calmodulin and SUMO-conjugating enzyme illustrate how variability in the ensemble of structures captures structural variability that is observed across multiple crystal structures and is linked to functional flexibility at hinge regions and binding interfaces. An ensemble-refinement procedure is proposed to differentiate between variability that is a consequence of physical conformational heterogeneity and that which reflects uncertainty in the atomic coordinates.
automated refinement; multiple models; conformational heterogeneity; torsion-angle sampling
A procedure for model building is described that combines morphing a model to match a density map, trimming the morphed model and aligning the model to a sequence.
A procedure termed ‘morphing’ for improving a model after it has been placed in the crystallographic cell by molecular replacement has recently been developed. Morphing consists of applying a smooth deformation to a model to make it match an electron-density map more closely. Morphing does not change the identities of the residues in the chain, only their coordinates. Consequently, if the true structure differs from the working model by containing different residues, these differences cannot be corrected by morphing. Here, a procedure that helps to address this limitation is described. The goal of the procedure is to obtain a relatively complete model that has accurate main-chain atomic positions and residues that are correctly assigned to the sequence. Residues in a morphed model that do not match the electron-density map are removed. Each segment of the resulting trimmed morphed model is then assigned to the sequence of the molecule using information about the connectivity of the chains from the working model and from connections that can be identified from the electron-density map. The procedure was tested by application to a recently determined structure at a resolution of 3.2 Å and was found to increase the number of correctly identified residues in this structure from the 88 obtained using phenix.resolve sequence assignment alone (Terwilliger, 2003) to 247 of a possible 359. Additionally, the procedure was tested by application to a series of templates with sequence identities to a target structure ranging between 7 and 36%. The mean fraction of correctly identified residues in these cases was increased from 33% using phenix.resolve sequence assignment to 47% using the current procedure. The procedure is simple to apply and is available in the Phenix software package.
morphing; model building; sequence assignment; model–map correlation; loop-building
Grouping the 20 residues is a classic strategy to discover ordered patterns and insights about the fundamental nature of proteins, their structure, and how they fold. Usually, this categorization is based on the biophysical and/or structural properties of a residue’s side-chain group. We extend this approach to understand the effects that side-chains have upon backbone conformation and perform a knowledge-based classification of amino acids by comparing their backbone φ,ψ distributions in different types of secondary structure. At this finer, more specific resolution, the torsion angle data is often sparse and discontinuous (especially for the non-helical classes) even though a comprehensive set of protein structures is used. To ensure the precision of the Ramachandran plot comparisons, we applied a rigorous Bayesian density estimation method that produces continuous estimates of the backbone φ,ψ distributions. Based on this statistical modeling, a robust, hierarchical clustering was performed using a divergence score to measure the similarity between plots. There were 7 general groups based on the clusters from the complete Ramachandran data: nonpolar/β-branched (Ile & Val), AsX (Asn & Asp), long (Met, Gln, Arg, Glu, Lys, & Leu), aromatic (Phe, Tyr, His, & Cys), small (Ala & Ser), bulky (Thr & Trp), and lastly the singletons of Gly and Pro. At the level of 4 types of secondary structure (helix, sheet, turn, and coil), these groups remain somewhat consistent, although there are a few significant variations. Besides the expected uniqueness of the Gly and Pro distributions, the nonpolar/β-branched and AsX clusters were very consistent across all types of secondary structure. Effectively, this consistency across the secondary structure classes implies that side-chain steric effects strongly influence a residue’s backbone torsion angle conformation.
These results help to explain the plasticity of amino acid substitutions on protein structure, and should help in protein design and structure evaluation.
Ramachandran Plot; Torsion Angles; Bayesian Density Estimation; Clustering; Residue Backbone Similarity
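The clustering of residues by the similarity of their φ,ψ distributions can be illustrated with a small sketch. The abstract above does not say which divergence score was used, so the Jensen-Shannon divergence here is an assumption, as are the average-linkage merging rule, the cutoff, and the names `js_divergence` and `agglomerate`. The four-bin distributions are made-up toy values, not data from the study.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, range 0 to 1) between two discrete
    distributions given as equal-length lists that each sum to 1."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def agglomerate(distributions, cutoff):
    """Average-linkage agglomerative clustering of named distributions.

    distributions: dict mapping a residue name to its flattened phi,psi grid.
    Clusters are merged until the closest pair is farther apart than cutoff.
    """
    clusters = [[name] for name in distributions]

    def linkage(a, b):
        pairs = [(x, y) for x in a for y in b]
        total = sum(js_divergence(distributions[x], distributions[y]) for x, y in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > cutoff:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy four-bin "Ramachandran grids" (hypothetical values for illustration only).
toy = {
    "Ile": [0.70, 0.20, 0.05, 0.05],
    "Val": [0.68, 0.22, 0.05, 0.05],
    "Gly": [0.25, 0.25, 0.25, 0.25],
    "Pro": [0.05, 0.05, 0.10, 0.80],
}
groups = agglomerate(toy, cutoff=0.05)
```

A real comparison would use finely binned, smoothed density estimates rather than four bins; the sketch only shows how a divergence score can drive the grouping.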
Statistical density modification can make use of local patterns of density found in protein structures to improve crystallographic phases.
A method for improving crystallographic phases is presented that is based on the preferential occurrence of certain local patterns of electron density in macromolecular electron-density maps. The method focuses on the relationship between the value of electron density at a point in the map and the pattern of density surrounding this point. Patterns of density that can be superimposed by rotation about the central point are considered equivalent. Standard templates are created from experimental or model electron-density maps by clustering and averaging local patterns of electron density. The clustering is based on correlation coefficients after rotation to maximize the correlation. Experimental or model maps are also used to create histograms relating the value of electron density at the central point to the correlation coefficient of the density surrounding this point with each member of the set of standard patterns. These histograms are then used to estimate the electron density at each point in a new experimental electron-density map using the pattern of electron density at points surrounding that point and the correlation coefficient of this density to each of the set of standard templates, again after rotation to maximize the correlation. The method is strengthened by excluding any information from the point in question from both the templates and the local pattern of density in the calculation. A function based on the origin of the Patterson function is used to remove information about the electron density at the point in question from nearby electron density. This allows an estimation of the electron density at each point in a map, using only information from other points in the process. The resulting estimates of electron density are shown to have errors that are nearly independent of the errors in the original map using model data and templates calculated at a resolution of 2.6 Å. 
Owing to this independence of errors, information from the new map can be combined in a simple fashion with information from the original map to create an improved map. An iterative phase-improvement process using this approach and other applications of the image-reconstruction method are described and applied to experimental data at resolutions ranging from 2.4 to 2.8 Å.
density modification; pattern matching
We present a variational approach to smooth molecular (proteins, nucleic acids) surface constructions, starting from atomic coordinates, as available from the protein and nucleic-acid data banks. Molecular dynamics (MD) simulations, traditionally used in understanding protein and nucleic-acid folding processes, are based on molecular force fields and require smooth models of these molecular surfaces. To accelerate MD simulations, a popular methodology is to employ coarse-grained molecular models, which represent clusters of atoms with similar physical properties by pseudo-atoms, resulting in coarser resolution molecular surfaces. We consider generation of these mixed-resolution or adaptive molecular surfaces. Our approach starts from deriving a general-form second-order geometric partial differential equation in the level-set formulation, by minimizing a first-order energy functional which additionally includes a regularization term to minimize the occurrence of chemically infeasible molecular surface pockets or tunnel-like artifacts. To achieve even higher computational efficiency, a fast cubic B-spline C2 interpolation algorithm is also utilized. A narrow band, tri-cubic B-spline level-set method is then used to provide C2 smooth and resolution adaptive molecular surfaces.
Variational methods; High-order level-set; Molecular surface; Geometric partial differential equation
A genetic algorithm has been developed to optimize the phases of the strongest reflections in SIR/SAD data. This is shown to facilitate density modification and model building in several test cases.
Experimental phasing of diffraction data from macromolecular crystals involves deriving phase probability distributions. These distributions are often bimodal, making their weighted average, the centroid phase, improbable, so that electron-density maps computed using centroid phases are often non-interpretable. Density modification brings in information about the characteristics of electron density in protein crystals. In successful cases, this allows a choice between the modes in the phase probability distributions, and the maps can cross the borderline between non-interpretable and interpretable. Based on the suggestions by Vekhter [Vekhter (2005), Acta Cryst. D61, 899–902], the impact of identifying optimized phases for a small number of strong reflections prior to the density-modification process was investigated while using the centroid phase as a starting point for the remaining reflections. A genetic algorithm was developed that optimizes the quality of such phases using the skewness of the density map as a target function. Phases optimized in this way are then used in density modification. In most of the tests, the resulting maps were of higher quality than maps generated from the original centroid phases. In one of the test cases, the new method sufficiently improved a marginal set of experimental SAD phases to enable successful map interpretation. A computer program, SISA, has been developed to apply this method for phase improvement in macromolecular crystallography.
experimental phasing; density modification; genetic algorithms
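Two quantities from the abstract above lend themselves to a short worked example: the centroid phase of a bimodal phase distribution and the skewness used as the target function. The sketch below is not taken from SISA. It assumes a phase distribution sampled at discrete angles and a map supplied as a flat list of density values; the function names and numbers are hypothetical.

```python
import cmath
import math


def centroid_phase(phases_deg, probs):
    """Centroid phase (degrees, 0 to 360) and figure of merit of a discrete
    phase probability distribution.

    The centroid is the probability-weighted mean of the unit vectors
    exp(i * phase); its length is the figure of merit.
    """
    total = sum(probs)
    z = sum(p * cmath.exp(1j * math.radians(a)) for a, p in zip(phases_deg, probs)) / total
    return math.degrees(cmath.phase(z)) % 360.0, abs(z)


def skewness(values):
    """Third standardised moment of a list of density values."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# A bimodal distribution with equally likely modes at 40 and 160 degrees:
# the centroid lies midway between the modes, where the probability is low.
phase, fom = centroid_phase([40.0, 160.0], [0.5, 0.5])

# A map with a few strong peaks on a flat background is positively skewed.
peaky = skewness([0.0, 0.0, 0.0, 1.0])
```

For the two-mode example the centroid falls at a phase that neither mode supports, with a figure of merit well below 1, which is the situation the abstract describes as producing non-interpretable maps. A search procedure such as a genetic algorithm could then score trial phase choices by the skewness of the resulting map.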
The DNA-templated polymerization of synthetic building blocks provides a potential route to the laboratory evolution of sequence-defined polymers with structures and properties not necessarily limited to those of natural biopolymers. We previously reported the efficient and sequence-specific DNA-templated polymerization of peptide nucleic acid (PNA) aldehydes. Here, we report the enzyme-free, DNA-templated polymerization of side-chain-functionalized PNA tetramer and pentamer aldehydes. We observed that the polymerization of tetramer and pentamer PNA building blocks with a single lysine-based side chain at various positions in the building block could proceed efficiently and sequence-specifically. In addition, DNA-templated polymerization also proceeded efficiently and in a sequence-specific manner with pentamer PNA aldehydes containing two or three lysine side chains in a single building block to generate more densely functionalized polymers.
To further our understanding of side-chain compatibility and expand the capabilities of this system, we also examined the polymerization efficiencies of 20 pentamer building blocks each containing one of five different side-chain groups and four different side-chain regio- and stereochemistries. Polymerization reactions were efficient for all five different side-chain groups and for three of the four combinations of side-chain regio- and stereochemistries. Differences in the efficiency and initial rate of polymerization correlate with the apparent melting temperature of each building block, which is dependent on side-chain regio- and stereochemistry, but relatively insensitive to side-chain structure among the substrates tested. Our findings represent a significant step towards the evolution of sequence-defined synthetic polymers and also demonstrate that enzyme-free nucleic acid-templated polymerization can occur efficiently using substrates with a wide range of side-chain structures, functionalization positions within each building block, and functionalization densities.
MAIN is interactive software designed to perform the complex tasks of macromolecular crystal structure determination and validation. The features of MAIN and its tools for electron-density map calculations, model building, refinement in real and reciprocal space, and validation exploiting noncrystallographic symmetry in single and multiple crystal forms are presented.
MAIN is software that has been designed to interactively perform the complex tasks of macromolecular crystal structure determination and validation. Using MAIN, it is possible to perform density modification, manual and semi-automated or automated model building and rebuilding, real- and reciprocal-space structure optimization and refinement, map calculations and various types of molecular structure validation. The prompt availability of various analytical tools and the immediate visualization of molecular and map objects allow a user to efficiently progress towards the completed refined structure. The extraordinary depth perception of molecular objects in three dimensions that is provided by MAIN is achieved by the clarity and contrast of colours and the smooth rotation of the displayed objects. MAIN allows simultaneous work on several molecular models and various crystal forms. The strength of MAIN lies in its manipulation of averaged density maps and molecular models when noncrystallographic symmetry (NCS) is present. Using MAIN, it is possible to optimize NCS parameters and envelopes and to refine the structure in single or multiple crystal forms.
molecular modelling; molecular graphics; macromolecular crystal structure determination; map calculation; computer programs
Noncrystallographic symmetry is automatically detected and used to achieve higher completeness and greater accuracy of automatically built protein structures at resolutions of 2.3 Å or poorer.
A novel method is presented for the automatic detection of noncrystallographic symmetry (NCS) in macromolecular crystal structure determination which does not require the derivation of molecular masks or the segmentation of density. It was found that throughout structure determination the NCS-related parts may be differently pronounced in the electron density. This often results in the modelling of molecular fragments of variable length and accuracy, especially during automated model-building procedures. These fragments were used to identify NCS relations in order to aid automated model building and refinement. In a number of test cases higher completeness and greater accuracy of the obtained structures were achieved, specifically at a crystallographic resolution of 2.3 Å or poorer. In the best case, the method allowed the building of up to 15% more residues automatically and a tripling of the average length of the built fragments.
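As an illustration of how built fragments might be screened for a possible NCS relation, the sketch below compares the internal Cα–Cα distances of two equal-length fragments. This is not the method of the paper, which is not detailed here; it is a simple stand-in. The function name `distance_rms`, the equal-length requirement and the use of Cα coordinates are all assumptions made for the example. Internal distances are unchanged by rotation and translation, so a small value suggests that the two fragments could be superimposed by a rigid-body operator. The measure cannot distinguish a fragment from its mirror image, so a real implementation would need an explicit superposition step.

```python
import math
from itertools import combinations

def distance_rms(frag_a, frag_b):
    """RMS difference between the internal distance sets of two fragments.

    frag_a, frag_b: sequences of (x, y, z) coordinates, e.g. C-alpha atoms,
    of equal length (hypothetical input format for this sketch).
    """
    if len(frag_a) != len(frag_b) or len(frag_a) < 2:
        raise ValueError("fragments must have equal length of at least 2")
    total = 0.0
    pairs = 0
    # Compare every pairwise distance in fragment A with its counterpart in B.
    for i, j in combinations(range(len(frag_a)), 2):
        d_a = math.dist(frag_a[i], frag_a[j])
        d_b = math.dist(frag_b[i], frag_b[j])
        total += (d_a - d_b) ** 2
        pairs += 1
    return math.sqrt(total / pairs)
```

Pairs of fragments scoring below a chosen threshold (the threshold itself would have to be tuned) could then be passed to a least-squares superposition to derive a candidate NCS operator.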
noncrystallographic symmetry; automated model building
A density-based procedure is described for improving a homology model that is locally accurate but differs globally. The model is deformed to match the map and refined, yielding an improved starting point for density modification and further model-building.
An approach is presented for addressing the challenge of model rebuilding after molecular replacement in cases where the placed template is very different from the structure to be determined. The approach takes advantage of the observation that a template and target structure may have local structures that can be superimposed much more closely than can their complete structures. A density-guided procedure for deformation of a properly placed template is introduced. A shift in the coordinates of each residue in the structure is calculated based on optimizing the match of model density within a 6 Å radius of the center of that residue with a prime-and-switch electron-density map. The shifts are smoothed and applied to the atoms in each residue, leading to local deformation of the template that improves the match of map and model. The model is then refined to improve the geometry and the fit of model to the structure-factor data. A new map is then calculated and the process is repeated until convergence. The procedure can extend the routine applicability of automated molecular replacement, model building and refinement to search models with over 2 Å r.m.s.d. representing 65–100% of the structure.
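The smoothing and application of per-residue shifts described above can be sketched as follows. The abstract does not state how the shifts are smoothed, so the running average over neighbouring residues used here is an assumption, as are the names `smooth_shifts`, `apply_shifts` and the `half_window` parameter. The density-based calculation of the raw shifts is outside the scope of this sketch; the shifts are taken as given.

```python
from typing import List, Tuple

Vec = Tuple[float, float, float]

def smooth_shifts(shifts: List[Vec], half_window: int = 2) -> List[Vec]:
    """Average each residue's shift with those of its chain neighbours.

    Assumed smoothing scheme: an unweighted window of up to
    2 * half_window + 1 residues, truncated at the chain ends.
    """
    n = len(shifts)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = shifts[lo:hi]
        smoothed.append(tuple(sum(s[k] for s in window) / len(window)
                              for k in range(3)))
    return smoothed

def apply_shifts(residues: List[List[Vec]], shifts: List[Vec]) -> List[List[Vec]]:
    """Move every atom of each residue by that residue's shift vector."""
    return [[(x + dx, y + dy, z + dz) for (x, y, z) in atoms]
            for atoms, (dx, dy, dz) in zip(residues, shifts)]
```

In the procedure described, a step of this kind would be followed by refinement and map recalculation, and the cycle repeated until convergence.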
molecular replacement; automation; macromolecular crystallography; structure similarity; modeling; Phenix; morphing
Bacteriophage Psp231a infects Pseudomonas phaseolicola, strain HB10Y, which is the host cell for the enveloped bacteriophage phi 6. This paper describes the biophysical characteristics of Psp231a and the physical properties of its nucleic acid. In electron micrographs the virion appears as an icosahedral structure, approximately 55 nm in diameter, with a short tail. The virion density is 1.48 g/cm³ in CsCl, and the sedimentation coefficient is approximately 407S. Sodium dodecyl sulfate-polyacrylamide gel electrophoresis revealed the presence of 12 polypeptides ranging in molecular weight from 5,000 to 117,000. The nucleic acid of Psp231a is linear, double-stranded DNA of molecular weight 28 × 10⁶. Its density in CsCl is 1.716 g/cm³, and its sedimentation coefficient in 3 M CsCl is 20.0S, corresponding to an S°20,w of 34S.