A procedure for model building is described that combines morphing a model to match a density map, trimming the morphed model and aligning the model to a sequence.
A procedure termed ‘morphing’ for improving a model after it has been placed in the crystallographic cell by molecular replacement has recently been developed. Morphing consists of applying a smooth deformation to a model to make it match an electron-density map more closely. Morphing does not change the identities of the residues in the chain, only their coordinates. Consequently, if the true structure differs from the working model by containing different residues, these differences cannot be corrected by morphing. Here, a procedure that helps to address this limitation is described. The goal of the procedure is to obtain a relatively complete model that has accurate main-chain atomic positions and residues that are correctly assigned to the sequence. Residues in a morphed model that do not match the electron-density map are removed. Each segment of the resulting trimmed morphed model is then assigned to the sequence of the molecule using information about the connectivity of the chains from the working model and from connections that can be identified from the electron-density map. The procedure was tested by application to a recently determined structure at a resolution of 3.2 Å and was found to increase the number of correctly identified residues in this structure from the 88 obtained using phenix.resolve sequence assignment alone (Terwilliger, 2003) to 247 of a possible 359. Additionally, the procedure was tested by application to a series of templates with sequence identities to a target structure ranging between 7 and 36%. The mean fraction of correctly identified residues in these cases was increased from 33% using phenix.resolve sequence assignment to 47% using the current procedure. The procedure is simple to apply and is available in the Phenix software package.
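The trimming step described above (removing residues that do not match the map and keeping the remaining segments) can be illustrated with a minimal sketch. It is not the Phenix implementation: the function name, the per-residue correlation input and both thresholds (min_cc, min_segment) are assumptions chosen only for illustration.

```python
def trim_by_map_correlation(residue_cc, min_cc=0.5, min_segment=3):
    """Return runs of consecutive residue indices that survive trimming.

    residue_cc  : per-residue model-map correlation values, in chain order.
    min_cc      : hypothetical cutoff below which a residue is removed.
    min_segment : hypothetical minimum length for a segment to be kept.
    """
    segments, current = [], []
    for index, cc in enumerate(residue_cc):
        if cc >= min_cc:
            # Residue fits the map: extend the current segment.
            current.append(index)
        else:
            # Residue is trimmed: close the current segment if long enough.
            if len(current) >= min_segment:
                segments.append(current)
            current = []
    if len(current) >= min_segment:
        segments.append(current)
    return segments


if __name__ == "__main__":
    cc = [0.8, 0.7, 0.9, 0.2, 0.6, 0.1, 0.7, 0.8, 0.75]
    print(trim_by_map_correlation(cc))
```

Each returned segment would then be a candidate for assignment to the sequence; that assignment step is not sketched here.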
morphing; model building; sequence assignment; model–map correlation; loop-building
A density-based procedure is described for improving a homology model that is locally accurate but differs globally. The model is deformed to match the map and refined, yielding an improved starting point for density modification and further model-building.
An approach is presented for addressing the challenge of model rebuilding after molecular replacement in cases where the placed template is very different from the structure to be determined. The approach takes advantage of the observation that a template and target structure may have local structures that can be superimposed much more closely than can their complete structures. A density-guided procedure for deformation of a properly placed template is introduced. A shift in the coordinates of each residue in the structure is calculated based on optimizing the match of model density within a 6 Å radius of the center of that residue with a prime-and-switch electron-density map. The shifts are smoothed and applied to the atoms in each residue, leading to local deformation of the template that improves the match of map and model. The model is then refined to improve the geometry and the fit of model to the structure-factor data. A new map is then calculated and the process is repeated until convergence. The procedure can extend the routine applicability of automated molecular replacement, model building and refinement to search models with over 2 Å r.m.s.d. representing 65–100% of the structure.
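The smoothing and application of per-residue shifts can be sketched as follows. This is a simplified illustration only: it assumes the shifts have already been obtained from the density-matching step, and it uses a plain moving average along the chain as a stand-in for whatever smoothing the actual procedure applies. The function names and the half_window parameter are hypothetical.

```python
def smooth_shifts(shifts, half_window=2):
    """Average each residue's (dx, dy, dz) shift with its chain neighbours.

    A moving average is an assumed smoothing scheme, used for illustration.
    """
    n = len(shifts)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = shifts[lo:hi]
        smoothed.append(
            tuple(sum(s[k] for s in window) / len(window) for k in range(3))
        )
    return smoothed


def apply_shifts(residues, shifts):
    """Translate every atom of residue i by shifts[i].

    residues: list of residues, each a list of (x, y, z) atom coordinates.
    """
    return [
        [(x + dx, y + dy, z + dz) for (x, y, z) in atoms]
        for atoms, (dx, dy, dz) in zip(residues, shifts)
    ]


if __name__ == "__main__":
    raw = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    model = [[(0.0, 0.0, 0.0)], [(1.0, 1.0, 1.0), (2.0, 2.0, 2.0)], [(0.0, 0.0, 0.0)]]
    print(apply_shifts(model, smooth_shifts(raw, half_window=1)))
```

Because all atoms of a residue receive the same smoothed translation, local geometry within a residue is preserved while neighbouring residues move coherently.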
molecular replacement; automation; macromolecular crystallography; structure similarity; modeling; Phenix; morphing
A strategy using a new split green fluorescent protein (GFP) as a modular binding partner to form stable protein complexes with a target protein is presented. The modular split GFP may open the way to rapidly creating crystallization variants.
A modular strategy for protein crystallization using split green fluorescent protein (GFP) as a crystallization partner is demonstrated. Insertion of a hairpin containing GFP β-strands 10 and 11 into a surface loop of a target protein provides two chain crossings between the target and the reconstituted GFP compared with the single connection afforded by terminal GFP fusions. This strategy was tested by inserting this hairpin into a loop of another fluorescent protein, sfCherry. The crystal structure of the sfCherry-GFP(10–11) hairpin in complex with GFP(1–9) was determined at a resolution of 2.6 Å. Analysis of the complex shows that the reconstituted GFP is attached to the target protein (sfCherry) in a structurally ordered way. This work opens the way to rapidly creating crystallization variants by reconstituting a target protein bearing the GFP(10–11) hairpin with a variety of GFP(1–9) mutants engineered for favorable crystallization.
protein crystallization; synthetic symmetrization; protein tagging; split GFP; split protein; green fluorescent protein; protein expression; protein-fragment complementation; crystallization reagents
A comparative analysis of sulfur phasing of death receptor 6 (DR6) using data collected at wavelengths of 2.0 and 2.7 Å is presented. SAXS analysis of unliganded DR6 defines a dimer as the minimum physical unit in solution.
A subset of tumour necrosis factor receptor (TNFR) superfamily members contain death domains in their cytoplasmic tails. Death receptor 6 (DR6) is one such member and can trigger apoptosis upon the binding of a ligand by its cysteine-rich domains (CRDs). The crystal structure of the ectodomain (amino acids 1–348) of human death receptor 6 (DR6) encompassing the CRD region was phased using the anomalous signal from S atoms. In order to explore the feasibility of S-SAD phasing at longer wavelengths (beyond 2.5 Å), a comparative study was performed on data collected at wavelengths of 2.0 and 2.7 Å. In spite of sub-optimal experimental conditions, the 2.7 Å wavelength used for data collection showed potential for S-SAD phasing. The results showed that the Rp.i.m. ratio is a good indicator for monitoring the anomalous data quality when the anomalous signal is relatively strong, while d′′/sig(d′′) calculated by SHELXC is a more sensitive and stable indicator applicable for grading a wider range of anomalous data qualities. The use of the ‘parameter-space screening method’ for S-SAD phasing resulted in solutions for data sets that failed during manual attempts. SAXS measurements on the ectodomain suggested that a dimer defines the minimal physical unit of an unliganded DR6 molecule in solution.
sulfur phasing; SAXS analysis; long-wavelength X-rays; death receptor 6
Here, the crystal structure of TM0439, a GntR regulator with an FCD domain found in the Thermotoga maritima genome, is described.
The GntR superfamily of dimeric transcription factors, with more than 6200 members encoded in bacterial genomes, is characterized by N-terminal winged-helix DNA-binding domains and diverse C-terminal regulatory domains which provide a basis for the classification of the constituent families. The largest of these families, FadR, contains nearly 3000 proteins with all-α-helical regulatory domains classified into two related Pfam families: FadR_C and FCD. Only two crystal structures of FadR-family members, those of Escherichia coli FadR protein and LldR from Corynebacterium glutamicum, have been described to date in the literature. Here, the crystal structure of TM0439, a GntR regulator with an FCD domain found in the Thermotoga maritima genome, is described. The FCD domain is similar to that of the LldR regulator and contains a buried metal-binding site. Using atomic absorption spectroscopy and Trp fluorescence, it is shown that the recombinant protein contains bound Ni²⁺ ions but that it is able to bind Zn²⁺ with Kd < 70 nM. It is concluded that Zn²⁺ is the likely physiological metal and that it may perform either structural or regulatory roles or both. Finally, the TM0439 structure is compared with two other FadR-family structures recently deposited by structural genomics consortia. The results call for a revision in the classification of the FadR family of transcription factors.
transcription regulation; GntR family; structural genomics; surface-entropy reduction
The PHENIX software for macromolecular structure determination is described.
Macromolecular X-ray crystallography is routinely applied to understand biological processes at a molecular level. However, significant time and effort are still required to solve and complete many of these structures because of the need for manual interpretation of complex numerical data using many software packages and the repeated use of interactive three-dimensional graphics. PHENIX has been developed to provide a comprehensive system for macromolecular crystallographic structure solution with an emphasis on the automation of all procedures. This has relied on the development of algorithms that minimize or eliminate subjective input, the development of algorithms that automate procedures that are traditionally performed by hand and, finally, the development of a framework that allows a tight integration between the algorithms.
PHENIX; Python; macromolecular crystallography; algorithms
Ten measures of experimental electron-density-map quality are examined and the skewness of electron density is found to be the best indicator of actual map quality. A Bayesian approach to estimating map quality is developed and used in the PHENIX AutoSol wizard to make decisions during automated structure solution.
Estimates of the quality of experimental maps are important in many stages of structure determination of macromolecules. Map quality is defined here as the correlation between a map and the corresponding map obtained using phases from the final refined model. Here, ten different measures of experimental map quality were examined using a set of 1359 maps calculated by re-analysis of 246 solved MAD, SAD and MIR data sets. A simple Bayesian approach to estimation of map quality from one or more measures is presented. It was found that a Bayesian estimator based on the skewness of the density values in an electron-density map is the most accurate of the ten individual Bayesian estimators of map quality examined, with a correlation between estimated and actual map quality of 0.90. A combination of the skewness of electron density with the local correlation of r.m.s. density gives a further improvement in estimating map quality, with an overall correlation coefficient of 0.92. The PHENIX AutoSol wizard carries out automated structure solution based on any combination of SAD, MAD, SIR or MIR data sets. The wizard is based on tools from the PHENIX package and uses the Bayesian estimates of map quality described here to choose the highest quality solutions after experimental phasing.
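The two quantities central to this analysis, the skewness of the density values and map quality defined as a correlation between two maps, can be sketched in a few lines. This is an illustration under simple assumptions: maps are represented as flat lists of grid values, skewness is the standard third standardized moment, and the correlation is a plain Pearson coefficient. The Bayesian estimator itself is not reproduced, and the function names are hypothetical.

```python
import math


def skewness(values):
    """Third standardized moment of a list of density values."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0  # a perfectly flat map has no skew
    return m3 / m2 ** 1.5


def map_correlation(map_a, map_b):
    """Pearson correlation between two maps sampled on the same grid."""
    n = len(map_a)
    mean_a, mean_b = sum(map_a) / n, sum(map_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(map_a, map_b))
    var_a = sum((a - mean_a) ** 2 for a in map_a)
    var_b = sum((b - mean_b) ** 2 for b in map_b)
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    # Mostly flat solvent with a few strong peaks gives positive skewness.
    toy_map = [0, 0, 0, 0, 0, 0, 0, 0, 5, 5]
    print(skewness(toy_map))
```

A map with well-defined atomic peaks over a flat solvent background tends to have a positively skewed density distribution, which is the intuition behind using skewness as a quality indicator.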
structure solution; scoring; Protein Data Bank; phasing; decision-making; PHENIX; experimental electron-density maps
An OMIT procedure is presented that has the benefits of iterative model building, density modification and refinement yet is essentially unbiased by the atomic model that is built.
A procedure for carrying out iterative model building, density modification and refinement is presented in which the density in an OMIT region is essentially unbiased by an atomic model. Density from a set of overlapping OMIT regions can be combined to create a composite ‘iterative-build’ OMIT map that is everywhere unbiased by an atomic model but also everywhere benefiting from the model-based information present elsewhere in the unit cell. The procedure may have applications in the validation of specific features in atomic models as well as in overall model validation. The procedure is demonstrated with a molecular-replacement structure and with an experimentally phased structure and a variation on the method is demonstrated by removing model bias from a structure from the Protein Data Bank.
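The assembly of a composite map from overlapping OMIT regions can be sketched as below. This is a toy illustration on a one-dimensional grid, not the actual procedure: it assumes each OMIT run supplies a full-length list of density values together with the grid indices that were omitted in that run, and it assumes that overlapping regions are combined by simple averaging, which the text does not specify. All names are hypothetical.

```python
def composite_omit_map(n_points, omit_runs):
    """Combine density from OMIT runs into one composite map.

    n_points  : number of grid points in the (toy, 1-D) map.
    omit_runs : list of (omitted_indices, density) pairs, where density is a
                full-length list computed with the model excluded from
                omitted_indices.
    Each grid point takes its value only from runs in which it was omitted;
    overlaps are averaged (an assumption made for this sketch).
    """
    totals = [0.0] * n_points
    counts = [0] * n_points
    for omitted, density in omit_runs:
        for i in omitted:
            totals[i] += density[i]
            counts[i] += 1
    # Points never covered by any OMIT region are left undefined (None).
    return [t / c if c else None for t, c in zip(totals, counts)]


if __name__ == "__main__":
    runs = [
        ({0, 1, 2}, [1.0, 2.0, 3.0, 9.0]),
        ({2, 3}, [9.0, 9.0, 5.0, 4.0]),
    ]
    print(composite_omit_map(4, runs))
```

Taking each point only from runs in which it lay inside the omitted region is what keeps the composite free of local model bias while the rest of the cell still contributes model-based information.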
model building; model validation; macromolecular models; Protein Data Bank; refinement; OMIT maps; bias; structure refinement; PHENIX
The highly automated PHENIX AutoBuild wizard is described. The procedure can be applied equally well to phases derived from isomorphous/anomalous and molecular-replacement methods.
The PHENIX AutoBuild wizard is a highly automated tool for iterative model building, structure refinement and density modification using RESOLVE model building, RESOLVE statistical density modification and phenix.refine structure refinement. Recent advances in the AutoBuild wizard and phenix.refine include automated detection and application of NCS from models as they are built, extensive model-completion algorithms and automated solvent-molecule picking. Model-completion algorithms in the AutoBuild wizard include loop building, crossovers between chains in different models of a structure and side-chain optimization. The AutoBuild wizard has been applied to a set of 48 structures at resolutions ranging from 1.1 to 3.2 Å, resulting in a mean R factor of 0.24 and a mean free R factor of 0.29. The R factor of the final model is dependent on the quality of the starting electron density and is relatively independent of resolution.
model building; model completion; macromolecular models; Protein Data Bank; structure refinement; PHENIX
Heterogeneity in ensembles generated by independent model rebuilding principally reflects the limitations of the data and of the model-building process rather than the diversity of structures in the crystal.
Automation of iterative model building, density modification and refinement in macromolecular crystallography has made it feasible to carry out this entire process multiple times. By using different random seeds in the process, a number of different models compatible with experimental data can be created. Sets of models were generated in this way using real data for ten protein structures from the Protein Data Bank and using synthetic data generated at various resolutions. Most of the heterogeneity among models produced in this way is in the side chains and loops on the protein surface. Possible interpretations of the variation among models created by repetitive rebuilding were investigated. Synthetic data were created in which a crystal structure was modelled as the average of a set of ‘perfect’ structures and the range of models obtained by rebuilding a single starting model was examined. The standard deviations of coordinates in models obtained by repetitive rebuilding at high resolution are small, while those obtained for the same synthetic crystal structure at low resolution are large, so that the diversity within a group of models cannot generally be a quantitative reflection of the actual structures in a crystal. Instead, the group of structures obtained by repetitive rebuilding reflects the precision of the models, and the standard deviation of coordinates of these structures is a lower bound estimate of the uncertainty in coordinates of the individual models.
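The per-atom spread among repeatedly rebuilt models can be computed as in the following sketch. It assumes that all models contain the same atoms in the same order and that the spread is taken as the r.m.s. distance of each atom from its mean position over the ensemble (a population standard deviation); both the function name and that choice of normalization are assumptions for illustration.

```python
import math


def coordinate_spread(models):
    """Per-atom r.m.s. deviation from the mean position across models.

    models: list of models, each a list of (x, y, z) tuples for the same
            atoms in the same order.
    """
    n_models = len(models)
    spreads = []
    # zip(*models) groups the positions of one atom across all models.
    for positions in zip(*models):
        mean = tuple(sum(p[k] for p in positions) / n_models for k in range(3))
        msd = sum(
            sum((p[k] - mean[k]) ** 2 for k in range(3)) for p in positions
        ) / n_models
        spreads.append(math.sqrt(msd))
    return spreads


if __name__ == "__main__":
    ensemble = [
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
    ]
    print(coordinate_spread(ensemble))
```

Consistent with the conclusion above, such a spread is best read as a lower bound on coordinate uncertainty rather than as a measure of structural diversity in the crystal.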
model building; model completion; coordinate errors; models; Protein Data Bank; convergence; reproducibility; heterogeneity; precision; accuracy