The highly automated PHENIX AutoBuild wizard is described. The procedure can be applied equally well to phases derived from isomorphous/anomalous and molecular-replacement methods.
The PHENIX AutoBuild wizard is a highly automated tool for iterative model building, structure refinement and density modification using RESOLVE model building, RESOLVE statistical density modification and phenix.refine structure refinement. Recent advances in the AutoBuild wizard and phenix.refine include automated detection and application of NCS from models as they are built, extensive model-completion algorithms and automated solvent-molecule picking. Model-completion algorithms in the AutoBuild wizard include loop building, crossovers between chains in different models of a structure and side-chain optimization. The AutoBuild wizard has been applied to a set of 48 structures at resolutions ranging from 1.1 to 3.2 Å, resulting in a mean R factor of 0.24 and a mean free R factor of 0.29. The R factor of the final model is dependent on the quality of the starting electron density and is relatively independent of resolution.
model building; model completion; macromolecular models; Protein Data Bank; structure refinement; PHENIX
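As a rough illustration of the R factor and free R factor quoted above, the sketch below uses the conventional definition R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|, with the free R computed over a held-out test set of reflections. The function name, the amplitude values and the test-set flags are invented for illustration, and the observed and calculated amplitudes are assumed to be on a common scale already. It is not a description of how phenix.refine computes these statistics.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R factor for paired amplitude lists."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Hypothetical amplitudes (arbitrary units), assumed already scaled together.
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0]
f_calc = [90.0, 85.0, 50.0, 45.0, 20.0]
# Hypothetical flags: True marks reflections held out of refinement.
is_free = [False, False, True, False, True]

# Split the reflections into the working set and the free (test) set.
work = [(o, c) for o, c, free in zip(f_obs, f_calc, is_free) if not free]
test = [(o, c) for o, c, free in zip(f_obs, f_calc, is_free) if free]

r_work = r_factor([o for o, _ in work], [c for _, c in work])
r_free = r_factor([o for o, _ in test], [c for _, c in test])
print(f"R(work) = {r_work:.3f}, R(free) = {r_free:.3f}")
```

Because the free set is excluded from refinement, its R factor is normally the higher of the two, as in the mean values of 0.24 and 0.29 reported above.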
The combination of molecular replacement and single-wavelength anomalous diffraction improves the performance of automated structure determination with Auto-Rickshaw.
A combination of molecular replacement and single-wavelength anomalous diffraction phasing has been incorporated into the automated structure-determination platform Auto-Rickshaw. The complete MRSAD procedure includes molecular replacement, model refinement, experimental phasing, phase improvement and automated model building. The improvement over the standard SAD or MR approaches is illustrated by ten test cases taken from the JCSG diffraction data-set database. Poor MR or SAD phases with phase errors larger than 70° can be improved using the described procedure and a large fraction of the model can be determined in a purely automatic manner from X-ray data extending to better than 2.6 Å resolution.
automated structure determination; molecular replacement; single-wavelength anomalous diffraction
In X-ray crystallography, molecular replacement and subsequent refinement is challenging at low resolution. We compared refinement methods using synchrotron diffraction data of photosystem I at 7.4 Å resolution, starting from different initial models with increasing deviations from the known high-resolution structure. Standard refinement spoiled the initial models, moving them further away from the true structure and leading to high Rfree-values. In contrast, DEN-refinement improved even the most distant starting model as judged by Rfree, atomic root-mean-square differences to the true structure, significance of features not included in the initial model, and connectivity of electron density. The best protocol was DEN-refinement with initial segmented rigid-body refinement. For the most distant initial model, the fraction of atoms within 2 Å of the true structure improved from 24% to 60%. We also found a significant correlation between Rfree-values and the accuracy of the model, suggesting that Rfree is useful even at low resolution.
DEN refinement; membrane protein; low-resolution refinement; simulated annealing; free R value
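Two of the accuracy measures named above, the atomic root-mean-square difference to the true structure and the fraction of atoms within 2 Å of it, can be sketched as follows. This is a minimal illustration that assumes the two models are already superimposed and list the same atoms in the same order. The function names and coordinates are hypothetical and are not taken from the study.

```python
import math


def per_atom_distances(model, reference):
    """Distance in angstroms between each pair of matched atoms."""
    if len(model) != len(reference) or not model:
        raise ValueError("need two non-empty coordinate lists of equal length")
    return [math.dist(a, b) for a, b in zip(model, reference)]


def rms_difference(distances):
    """Root-mean-square of the per-atom distances."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def fraction_within(distances, cutoff=2.0):
    """Fraction of atoms lying no further than `cutoff` from their reference."""
    return sum(1 for d in distances if d <= cutoff) / len(distances)


# Hypothetical coordinates (angstroms) for four matched atoms.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
model = [(1.0, 0.0, 0.0), (1.0, 0.0, 3.0), (0.0, 1.0, 0.0), (3.0, 4.0, 1.0)]

distances = per_atom_distances(model, reference)
print(f"RMS difference: {rms_difference(distances):.2f} A")
print(f"fraction within 2 A: {fraction_within(distances):.2f}")
```

The two measures are complementary: a few badly misplaced atoms inflate the RMS difference, whereas the fraction within a cutoff only counts how many atoms are close to correct.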
The application of a multivariate likelihood function to a single isomorphous replacement with anomalous scattering experiment improves phasing and automated model building with iterative refinement in the test cases shown.
A likelihood function based on the multivariate probability distribution of all observed structure-factor amplitudes from a single isomorphous replacement with anomalous scattering experiment has been derived and implemented for use in substructure refinement and phasing as well as macromolecular model refinement. Efficient calculation of a multidimensional integration required for function evaluation has been achieved by approximations based on the function’s properties. The use of the function in both phasing and protein model building with iterative refinement was essential for successful automated model building in the test cases presented.
multivariate normal probability distribution; single isomorphous replacement with anomalous scattering; experimental phasing; direct incorporation of prior phase information
The automated building of a protein model into an electron density map remains a challenging problem. In the ARP/wARP approach, model building is facilitated by initially interpreting a density map with free atoms of unknown chemical identity; all structural information for such chemically unassigned atoms is discarded. Here, this is remedied by applying restraints between free atoms, and between free atoms and a partial protein model. These are based on geometric considerations of protein structure and tentative (conditional) assignments for the free atoms. Restraints are applied in the REFMAC5 refinement program and are generated on an ad hoc basis, allowing them to fluctuate from step to step. A large set of experimentally phased and molecular replacement structures showcases individual structures where automated building is improved drastically by the conditional restraints. The concept and implementation we present can also find application in restraining geometries, such as hydrogen bonds, in low-resolution refinement.
Helicobacter pylori colonizes the human gastric mucosa and causes gastritis, ulceration, or gastric cancer. A previously uncharacterized region of the H. pylori genome was identified and sequenced. This region includes a putative operon containing three open reading frames termed gidA (1,866 bp), dapE (1,167 bp), and orf2 (753 bp); the gidA and dapE products are highly homologous to other bacterial proteins. In E. coli, dapE encodes N-succinyl-L-diaminopimelic acid desuccinylase, which catalyzes the hydrolysis of N-succinyl-L-diaminopimelic acid to L-diaminopimelic acid (L-DAP) and succinate. When wild-type H. pylori strains were transformed to select for dapE mutagenesis, mutants were present when plates were supplemented with DAP but not with lysine; orf2 mutants were selected without DAP supplementation. Consistent with the finding that GidA is essential in Escherichia coli, we were unable to obtain a gidA mutant in H. pylori despite evidence that insertional mutagenesis had occurred. The positions of gidA, dapE, and orf2 suggest that they form an operon, which was supported by slot blot RNA hybridization and reverse transcriptase PCR studies. The data imply that the H. pylori dapE mutant may be useful as a conditionally lethal vaccine.
The emergence of bacterial strains that are resistant to virtually all currently available antibiotics underscores the importance of developing new antimicrobial compounds. N-succinyl-l,l-diaminopimelic acid desuccinylase (DapE) is a metallohydrolase involved in the meso-diaminopimelate (mDAP)/lysine biosynthetic pathway necessary for lysine biosynthesis and for building the peptidoglycan cell wall. Because DapE is essential for Gram-negative and some Gram-positive bacteria, DapE has been proposed as a good target for antibiotic development. Recently, l-captopril has been suggested as a lead compound for inhibition of DapE, although its selectivity for this enzyme target in bacteria remains unclear (Gillner et al. (2009)). Here, we tested the selectivity of l-captopril against DapE in bacteria. Since DapE knockout strains of Gram-negative bacteria are viable upon chemical supplementation with mDAP, we reasoned that the antimicrobial activity of compounds targeting DapE should be abolished in mDAP-containing media. Although l-captopril had modest antimicrobial activity in Escherichia coli and in Salmonella enterica, to our surprise, inhibition of bacterial growth was independent both of mDAP supplementation and DapE over-expression. We conclude that DapE is not the main target of l-captopril inhibition in these bacteria. The methods implemented here will be useful for screening DapE-selective antimicrobial compounds directly in bacterial cultures.
An OMIT procedure is presented that has the benefits of iterative model building, density modification and refinement yet is essentially unbiased by the atomic model that is built.
A procedure for carrying out iterative model building, density modification and refinement is presented in which the density in an OMIT region is essentially unbiased by an atomic model. Density from a set of overlapping OMIT regions can be combined to create a composite ‘iterative-build’ OMIT map that is everywhere unbiased by an atomic model but also everywhere benefiting from the model-based information present elsewhere in the unit cell. The procedure may have applications in the validation of specific features in atomic models as well as in overall model validation. The procedure is demonstrated with a molecular-replacement structure and with an experimentally phased structure and a variation on the method is demonstrated by removing model bias from a structure from the Protein Data Bank.
model building; model validation; macromolecular models; Protein Data Bank; refinement; OMIT maps; bias; structure refinement; PHENIX
A description is given of Phaser-2.1: software for phasing macromolecular crystal structures by molecular replacement and single-wavelength anomalous dispersion phasing.
Phaser is a program for phasing macromolecular crystal structures by both molecular replacement and experimental phasing methods. The novel phasing algorithms implemented in Phaser have been developed using maximum likelihood and multivariate statistics. For molecular replacement, the new algorithms have proved to be significantly better than traditional methods in discriminating correct solutions from noise, and for single-wavelength anomalous dispersion experimental phasing, the new algorithms, which account for correlations between F+ and F−, give better phases (lower mean phase error with respect to the phases given by the refined structure) than those that use mean F and anomalous differences ΔF. One of the design concepts of Phaser was that it be capable of a high degree of automation. To this end, Phaser (written in C++) can be called directly from Python, although it can also be called using traditional CCP4 keyword-style input. Phaser is a platform for future development of improved phasing methods and their release, including source code, to the crystallographic community.
computer programs; molecular replacement; SAD phasing; likelihood; structural genomics
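For readers unfamiliar with the quantities contrasted above, the sketch below shows the conventional reduction of a Friedel pair of amplitudes to a mean F and an anomalous difference ΔF, taking mean F = (F+ + F−)/2 and ΔF = F+ − F−. This is the simplified representation that the abstract compares against; it is not Phaser's likelihood target and does not use Phaser's Python interface. The function name and the amplitudes are hypothetical.

```python
def mean_and_anomalous_difference(f_plus, f_minus):
    """Reduce a Friedel pair of amplitudes to (mean F, anomalous difference)."""
    mean_f = 0.5 * (f_plus + f_minus)
    delta_f = f_plus - f_minus
    return mean_f, delta_f


# Hypothetical Friedel pairs (F+, F-) in arbitrary units.
friedel_pairs = [(105.0, 95.0), (50.0, 52.0)]

reduced = [mean_and_anomalous_difference(fp, fm) for fp, fm in friedel_pairs]
for (fp, fm), (mean_f, delta_f) in zip(friedel_pairs, reduced):
    print(f"F+ = {fp}, F- = {fm} -> mean F = {mean_f}, delta F = {delta_f}")
```

The anomalous difference is typically a small difference between two large, correlated measurements, which is one way to see why a treatment that models F+ and F− jointly can differ from one that works on the reduced quantities.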
SAD data can be used in Phaser to solve novel structures, supplement molecular-replacement phase information or identify anomalous scatterers from a final refined model.
Phaser is a program that implements likelihood-based methods to solve macromolecular crystal structures, currently by molecular replacement or single-wavelength anomalous diffraction (SAD). SAD phasing is based on a likelihood target derived from the joint probability distribution of observed and calculated pairs of Friedel-related structure factors. This target combines information from the total structure factor (primarily non-anomalous scattering) and the difference between the Friedel mates (anomalous scattering). Phasing starts from a substructure, which is usually but not necessarily a set of anomalous scatterers. The substructure can also be a protein model, such as one obtained by molecular replacement. Additional atoms are found using a log-likelihood gradient map, which shows the sites where the addition of scattering from a particular atom type would improve the likelihood score. An automated completion algorithm adds new sites, choosing optionally among different atom types, adds anisotropic B-factor parameters if appropriate and deletes atoms that refine to low occupancy. Log-likelihood gradient maps can also identify which atoms in a refined protein structure are anomalous scatterers, such as metal or halide ions. These maps are more sensitive than conventional model-phased anomalous difference Fouriers and the iterative completion algorithm is able to find a significantly larger number of convincing sites.
SAD phasing; likelihood; molecular replacement
The deformable elastic network (DEN) method for reciprocal-space crystallographic refinement improves crystal structures, especially at resolutions lower than 3.5 Å. The DEN web service presented here intends to provide structural biologists with access to resources for running computationally intensive DEN refinements.
Deformable elastic network (DEN) restraints have proved to be a powerful tool for refining structures from low-resolution X-ray crystallographic data sets. Unfortunately, optimal refinement using DEN restraints requires extensive calculations and is often hindered by a lack of access to sufficient computational resources. The DEN web service presented here intends to provide structural biologists with access to resources for running computationally intensive DEN refinements in parallel on the Open Science Grid, the US cyberinfrastructure. Access to the grid is provided through a simple and intuitive web interface integrated into the SBGrid Science Portal. Using this portal, refinements combined with full parameter optimization that would take many thousands of hours on standard computational resources can now be completed in several hours. An example of the successful application of DEN restraints to the human Notch1 transcriptional complex using the grid resource, and summaries of all submitted refinements, are presented as justification.
deformable elastic network restraints; low-resolution refinement; DEN refinement
In eubacteria, there are three slightly different pathways for the synthesis of m-diaminopimelate (m-DAP), which is one of the key linking units of peptidoglycan. Surprisingly, for unknown reasons, some bacteria use two of these pathways together. An example is Corynebacterium glutamicum, which uses both the succinylase and dehydrogenase pathways for m-DAP synthesis. In this study, we clone dapD and prove by enzyme experiments that this gene encodes the succinylase (Mr = 24082), initiating the succinylase pathway of m-DAP synthesis. By using gene-directed mutation, dapD, as well as dapE encoding the desuccinylase, was inactivated, thereby forcing C. glutamicum to use only the dehydrogenase pathway of m-DAP synthesis. The mutants are unable to grow on organic nitrogen sources. When supplied with low ammonium concentrations but excess carbon, their morphology is radically altered and they are less resistant to mechanical stress than the wild type. Since the succinylase has a high affinity toward its substrate and uses glutamate as the nitrogen donor, while the dehydrogenase has a low affinity and incorporates ammonium directly, the m-DAP synthesis is another example of twin activities present in bacteria for access to important metabolites such as the well-known twin activities for the synthesis of glutamate or for the uptake of potassium.
The dapE gene of Escherichia coli encodes N-succinyl-L-diaminopimelic acid desuccinylase, an enzyme that catalyzes the synthesis of LL-diaminopimelic acid, one of the last steps in the diaminopimelic acid-lysine pathway. The dapE gene region was previously purified from a lambda bacteriophage transducing the neighboring purC gene (J. Parker, J. Bacteriol. 157:712-717, 1984). Various subcloning steps led to the identification of a 2.3-kb fragment that complemented several dapE mutants and allowed more than 400-fold overexpression of N-succinyl-L-diaminopimelic acid desuccinylase. Sequencing of this fragment revealed the presence of two closely linked open reading frames. The second one encodes a 375-residue, 41,129-M(r) polypeptide that was identified as N-succinyl-L-diaminopimelic acid desuccinylase. The first one encodes a 118-residue polypeptide that is not required for diaminopimelic acid biosynthesis, as judged by the wild-type phenotype of a strain in which this gene was disrupted. Expression of the dapE gene was studied by monitoring amylomaltase activity in strains in which the malPQ operon was under the control of various fragments located upstream of the dapE gene. The major promoter governing dapE transcription was found to be located in the adjacent orf118 gene, while a minor promoter allowed the transcription of both orf118 and dapE. Neither of these two promoters is regulated by the lysine concentration in the growth medium.
The functional complementation of two Escherichia coli strains defective in the succinylase pathway of meso-diaminopimelate (meso-DAP) biosynthesis with a Bordetella pertussis gene library resulted in the isolation of a putative dap operon containing three open reading frames (ORFs). In line with the successful complementation of the E. coli dapD and dapE mutants, the deduced amino acid sequences of two ORFs revealed significant sequence similarities with the DapD and DapE proteins of E. coli and many other bacteria which exhibit tetrahydrodipicolinate succinylase and N-succinyl-l,l-DAP desuccinylase activity, respectively. The first ORF within the operon showed significant sequence similarities with transaminases and contains the characteristic pyridoxal-5′-phosphate binding motif. Enzymatic studies revealed that this ORF encodes a protein with N-succinyl-l,l-DAP aminotransferase activity converting N-succinyl-2-amino-6-ketopimelate, the product of the succinylase DapD, to N-succinyl-l,l-DAP, the substrate of the desuccinylase DapE. Therefore, this gene appears to encode the DapC protein of B. pertussis. Apart from the pyridoxal-5′-phosphate binding motif, the DapC protein does not show further amino acid sequence similarities with the only other known enzyme with N-succinyl-l,l-DAP aminotransferase activity, ArgD of E. coli.
In a first systematic exploration of phasing with Rosetta de novo models, it is shown that all-atom refinement of coarse-grained models significantly improves both the model quality and performance in molecular replacement with the Phaser software.
The prospect of phasing diffraction data sets ‘de novo’ for proteins with previously unseen folds is appealing but largely untested. In a first systematic exploration of phasing with Rosetta de novo models, it is shown that all-atom refinement of coarse-grained models significantly improves both the model quality and performance in molecular replacement with the Phaser software. 15 new cases of diffraction data sets that are unambiguously phased with de novo models are presented. These diffraction data sets represent nine space groups and span a large range of solvent contents (33–79%) and asymmetric unit copy numbers (1–4). No correlation is observed between the ease of phasing and the solvent content or asymmetric unit copy number. Instead, a weak correlation is found with the length of the modeled protein: larger proteins required somewhat less accurate models to give successful molecular replacement. Overall, the results of this survey suggest that de novo models can phase diffraction data for approximately one sixth of proteins with sizes of 100 residues or less. However, for many of these cases, ‘de novo phasing with de novo models’ requires significant investment of computational power, much greater than 10³ CPU days per target. Improvements in conformational search methods will be necessary if molecular replacement with de novo models is to become a practical tool for targets without homology to previously solved protein structures.
structure prediction; molecular replacement; de novo phasing
Three different pathways of D,L-diaminopimelate and L-lysine synthesis are known in procaryotes. Determinations of the corresponding enzyme activities in Escherichia coli, Bacillus subtilis, and Bacillus sphaericus verified the fact that in each of these bacteria only one of the possible pathways operates. However, in Corynebacterium glutamicum activities are present which allow in principle the use of the dehydrogenase variant and succinylase variant of lysine synthesis together. Applying gene-directed mutagenesis, various C. glutamicum strains were constructed with an interrupted ddh gene. These mutants have an inactive dehydrogenase pathway but are still prototrophic, which is proof that the succinylase pathway of D,L-diaminopimelate synthesis can be utilized. In strains with an increased flow of precursors to D,L-diaminopimelate, however, the inactivation of the dehydrogenase pathway resulted in a reduced formation of lysine, with concomitant accumulation of N-succinyl-diaminopimelate in the cytosol up to a concentration of 25 mM. These data show (i) that both pathways can operate in C. glutamicum for D,L-diaminopimelate and L-lysine synthesis, (ii) that the dehydrogenase pathway is not essential, and (iii) that the dehydrogenase pathway is a prerequisite for handling an increased flow of metabolites to D,L-diaminopimelate.
MAIN is software designed to interactively perform the complex tasks of macromolecular crystal structure determination and validation. The features of MAIN and its tools for electron-density map calculations, model building, refinement in real and reciprocal space, and validation exploiting noncrystallographic symmetry in single and multiple crystal forms are presented.
MAIN is software that has been designed to interactively perform the complex tasks of macromolecular crystal structure determination and validation. Using MAIN, it is possible to perform density modification, manual and semi-automated or automated model building and rebuilding, real- and reciprocal-space structure optimization and refinement, map calculations and various types of molecular structure validation. The prompt availability of various analytical tools and the immediate visualization of molecular and map objects allow a user to efficiently progress towards the completed refined structure. The extraordinary depth perception of molecular objects in three dimensions that is provided by MAIN is achieved by the clarity and contrast of colours and the smooth rotation of the displayed objects. MAIN allows simultaneous work on several molecular models and various crystal forms. The strength of MAIN lies in its manipulation of averaged density maps and molecular models when noncrystallographic symmetry (NCS) is present. Using MAIN, it is possible to optimize NCS parameters and envelopes and to refine the structure in single or multiple crystal forms.
molecular modelling; molecular graphics; macromolecular crystal structure determination; map calculation; computer programs
Noncrystallographic symmetry is automatically detected and used to achieve higher completeness and greater accuracy of automatically built protein structures at resolutions of 2.3 Å or poorer.
A novel method is presented for the automatic detection of noncrystallographic symmetry (NCS) in macromolecular crystal structure determination which does not require the derivation of molecular masks or the segmentation of density. It was found that throughout structure determination the NCS-related parts may be differently pronounced in the electron density. This often results in the modelling of molecular fragments of variable length and accuracy, especially during automated model-building procedures. These fragments were used to identify NCS relations in order to aid automated model building and refinement. In a number of test cases higher completeness and greater accuracy of the obtained structures were achieved, specifically at a crystallographic resolution of 2.3 Å or poorer. In the best case, the method allowed the building of up to 15% more residues automatically and a tripling of the average length of the built fragments.
noncrystallographic symmetry; automated model building
Four case studies in using maximum-likelihood molecular replacement, as implemented in the program Phaser, to solve structures of protein complexes are described.
Molecular replacement (MR) generally becomes more difficult as the number of components in the asymmetric unit requiring separate MR models (i.e. the dimensionality of the search) increases. When the proportion of the total scattering contributed by each search component is small, the signal in the search for each component in isolation is weak or non-existent. Maximum-likelihood MR functions enable complex asymmetric units to be built up from individual components with a ‘tree search with pruning’ approach. This method, as implemented in the automated search procedure of the program Phaser, has been very successful in solving many previously intractable MR problems. However, there are a number of cases in which the automated search procedure of Phaser is suboptimal or encounters difficulties. These include cases where there are a large number of copies of the same component in the asymmetric unit or where the components of the asymmetric unit have greatly varying B factors. Two case studies are presented to illustrate how Phaser can be used to best advantage in the standard ‘automated MR’ mode and two case studies are used to show how to modify the automated search strategy for problematic cases.
macromolecular crystallography; molecular replacement; maximum likelihood
The pitfalls of experimental phasing are described.
Developments in protein crystal structure determination by experimental phasing are reviewed, emphasizing the theoretical continuum between experimental phasing, density modification, model building and refinement. Traditional notions of the composition of the substructure and the best coefficients for map generation are discussed. Pitfalls such as determining the enantiomorph, identifying centrosymmetry (or pseudo-symmetry) in the substructure and crystal twinning are discussed in detail. An appendix introduces combined real–imaginary log-likelihood gradient map coefficients for SAD phasing and their use for substructure completion as implemented in the software Phaser. Supplementary material includes animated probabilistic Harker diagrams showing how maximum-likelihood-based phasing methods can be used to refine parameters in the case of SIR and MIR; it is hoped that these will be useful for those teaching best practice in experimental phasing methods.
enantiomers; handedness; absolute configuration; chirality; twinning; experimental phasing
Biosynthesis of lysine and meso-diaminopimelic acid in bacteria provides essential components for protein synthesis and construction of the bacterial peptidoglycan cell wall. The dapE operon enzymes synthesize both meso-diaminopimelic acid and lysine and, therefore, represent potential targets for novel antibacterials. The dapE-encoded N-succinyl-L,L-diaminopimelic acid desuccinylase functions in a late step of the pathway and converts N-succinyl-L,L-diaminopimelic acid (L,L-SDAP) to L,L-diaminopimelic acid and succinate. Deletion of the dapE gene is lethal to Helicobacter pylori and Mycobacterium smegmatis, indicating that DapEs are essential for cell growth and proliferation. Since there are no similar pathways in humans, inhibitors that target DapE may have selective toxicity against only bacteria. A major limitation in developing antimicrobial agents that target DapE has been the lack of structural information. Herein we report the high-resolution X-ray crystal structures of the DapE from Haemophilus influenzae with one and two zinc ions bound in the active site, respectively. These two forms show different activity. Based on these newly determined structures we propose a revised catalytic mechanism of peptide bond cleavage by DapE enzymes. These structures provide important insight into the catalytic mechanism of DapE enzymes as well as a structural foundation that is critical for the rational design of DapE inhibitors.
DtsR1, a carboxyltransferase subunit of acetyl-CoA carboxylase from C. glutamicum, was crystallized and phases were obtained by molecular replacement.
DtsR1, a carboxyltransferase subunit of acetyl-CoA carboxylase derived from Corynebacterium glutamicum, was crystallized by the sitting-drop vapour-diffusion method using polyethylene glycol 6000 as a precipitant. The crystal belongs to the trigonal system with space group R32 and contains three subunits in the asymmetric unit. A molecular-replacement solution was found using the structure of transcarboxylase 12S from Propionibacterium shermanii as a search model.
acetyl-CoA carboxylase; carboxyltransferases; metabolic engineering
The catalytic and structural properties of the H67A and H349A altered dapE-encoded N-succinyl-l,l-diaminopimelic acid desuccinylase (DapE) from H. influenzae were investigated. Based on sequence alignment with CPG2 both H67 and H349 were predicted to be Zn(II) ligands. Catalytic activity was observed for the H67A altered DapE enzyme which exhibited kcat = 1.5 ± 0.5 sec−1 and Km = 1.4 ± 0.3 mM. No catalytic activity was observed for H349A under the experimental conditions used. The EPR and electronic absorption data indicate that the Co(II) ion bound to H349A-DapE is analogous to WT DapE after the addition of a single Co(II) ion. The addition of one equivalent of Co(II) to H67A altered DapE provides spectra that are very different from the first Co(II) binding site of the WT enzyme, but similar to the second binding site. The EPR and electronic absorption data, in conjunction with the kinetic data, are consistent with the assignment of H67 and H349 as active site metal ligands for the DapE from H. influenzae. Furthermore, the data suggest that H67 is a ligand in the first metal binding site while H349 resides in the second metal binding site. A three-dimensional homology structure of the DapE from H. influenzae was generated using the X-ray crystal structure of the DapE from N. meningitidis as a template and superimposed on the structure of AAP. This homology structure confirms the assignment of H67 and H349 as active site ligands. The superimposition of the homology model of DapE with the dizinc(II) structure of AAP indicates that within 4.0 Å of the Zn(II) binding sites of AAP, all of the amino acid residues of DapE are nearly identical.
biomedicine; biosynthesis; electron paramagnetic resonance; enzyme kinetics; homology model; site-directed mutagenesis; structure-function relationship
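To put the reported kinetic constants for the H67A enzyme in context, the sketch below evaluates the turnover rate per enzyme molecule, v/[E] = kcat·[S]/(Km + [S]), using the central values kcat = 1.5 s−1 and Km = 1.4 mM quoted above. It assumes simple Michaelis–Menten behaviour and ignores the reported uncertainties; the function name and the substrate concentrations are hypothetical.

```python
def turnover_rate(substrate_mM, kcat_per_s=1.5, km_mM=1.4):
    """Michaelis-Menten turnover per enzyme molecule, in s^-1."""
    if substrate_mM < 0:
        raise ValueError("substrate concentration cannot be negative")
    return kcat_per_s * substrate_mM / (km_mM + substrate_mM)


# Hypothetical substrate concentrations in mM.
for s in (0.0, 0.7, 1.4, 14.0):
    print(f"[S] = {s:5.1f} mM -> v/[E] = {turnover_rate(s):.3f} s^-1")
```

At a substrate concentration equal to Km the rate is half of kcat, and it approaches kcat only when the substrate is in large excess.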
An evaluation of validation and real-space intervention possibilities for improving existing automated (re-)refinement methods.
The deposition of X-ray data along with the customary structural models defining PDB entries makes it possible to apply large-scale re-refinement protocols to these entries, thus giving users the benefit of improvements in X-ray methods that have occurred since the structure was deposited. Automated gradient refinement is an effective method to achieve this goal, but real-space intervention is most often required in order to adequately address problems detected by structure-validation software. In order to improve the existing protocol, automated re-refinement was combined with structure validation and difference-density peak analysis to produce a catalogue of problems in PDB entries that are amenable to automatic correction. It is shown that re-refinement can be effective in producing improvements, which are often associated with the systematic use of the TLS parameterization of B factors, even for relatively new and high-resolution PDB entries, while the accompanying manual or semi-manual map analysis and fitting steps show good prospects for eventual automation. It is proposed that the potential for simultaneous improvements in methods and in re-refinement results be further encouraged by broadening the scope of depositions to include refinement metadata and ultimately primary rather than reduced X-ray data.