A strategy using a new split green fluorescent protein (GFP) as a modular binding partner to form stable protein complexes with a target protein is presented. The modular split GFP may open the way to rapidly creating crystallization variants.
A modular strategy for protein crystallization using split green fluorescent protein (GFP) as a crystallization partner is demonstrated. Insertion of a hairpin containing GFP β-strands 10 and 11 into a surface loop of a target protein provides two chain crossings between the target and the reconstituted GFP compared with the single connection afforded by terminal GFP fusions. This strategy was tested by inserting this hairpin into a loop of another fluorescent protein, sfCherry. The crystal structure of the sfCherry-GFP(10–11) hairpin in complex with GFP(1–9) was determined at a resolution of 2.6 Å. Analysis of the complex shows that the reconstituted GFP is attached to the target protein (sfCherry) in a structurally ordered way. This work opens the way to rapidly creating crystallization variants by reconstituting a target protein bearing the GFP(10–11) hairpin with a variety of GFP(1–9) mutants engineered for favorable crystallization.
protein crystallization; synthetic symmetrization; protein tagging; split GFP; split protein; green fluorescent protein; protein expression; protein-fragment complementation; crystallization reagents
A procedure for model building is described that combines morphing a model to match a density map, trimming the morphed model and aligning the model to a sequence.
A procedure termed ‘morphing’ for improving a model after it has been placed in the crystallographic cell by molecular replacement has recently been developed. Morphing consists of applying a smooth deformation to a model to make it match an electron-density map more closely. Morphing does not change the identities of the residues in the chain, only their coordinates. Consequently, if the true structure differs from the working model by containing different residues, these differences cannot be corrected by morphing. Here, a procedure that helps to address this limitation is described. The goal of the procedure is to obtain a relatively complete model that has accurate main-chain atomic positions and residues that are correctly assigned to the sequence. Residues in a morphed model that do not match the electron-density map are removed. Each segment of the resulting trimmed morphed model is then assigned to the sequence of the molecule using information about the connectivity of the chains from the working model and from connections that can be identified from the electron-density map. The procedure was tested by application to a recently determined structure at a resolution of 3.2 Å and was found to increase the number of correctly identified residues in this structure from the 88 obtained using phenix.resolve sequence assignment alone (Terwilliger, 2003) to 247 of a possible 359. Additionally, the procedure was tested by application to a series of templates with sequence identities to a target structure ranging between 7 and 36%. The mean fraction of correctly identified residues in these cases was increased from 33% using phenix.resolve sequence assignment to 47% using the current procedure. The procedure is simple to apply and is available in the Phenix software package.
morphing; model building; sequence assignment; model–map correlation; loop-building
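The trimming step in the morphing abstract above (removing residues that do not match the map, leaving segments to be assigned to the sequence) can be illustrated with a minimal sketch. It assumes each residue already carries a model-map correlation value; the function name `trim_residues`, the 0.5 cutoff and the data layout are hypothetical and are not taken from Phenix.

```python
from typing import List, Tuple


def trim_residues(residues: List[Tuple[int, float]],
                  cutoff: float = 0.5) -> List[List[int]]:
    """Drop residues whose map correlation is below `cutoff` and return the
    surviving residue numbers grouped into contiguous segments.

    `residues` is a list of (residue_number, map_correlation) pairs in chain
    order. The cutoff value is an illustrative assumption.
    """
    segments: List[List[int]] = []
    for number, cc in residues:
        if cc < cutoff:
            continue  # residue does not match the map: remove it
        if segments and number == segments[-1][-1] + 1:
            segments[-1].append(number)  # extends the current segment
        else:
            segments.append([number])  # gap before this residue: new segment
    return segments


if __name__ == "__main__":
    chain = [(1, 0.8), (2, 0.7), (3, 0.2), (4, 0.9), (5, 0.6)]
    print(trim_residues(chain))
```

Each returned segment would then be a candidate for sequence assignment; that step is not sketched here.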
O-Acetylhomoserine sulfhydrylase from M. tuberculosis H37Rv has been crystallized and preliminary X-ray crystallographic analysis has been performed.
The gene product of the open reading frame Rv3340 from Mycobacterium tuberculosis is annotated as encoding a probable O-acetylhomoserine (OAH) sulfhydrylase (MetC), an enzyme that catalyzes the last step in the biosynthesis of methionine, which is an essential amino acid in bacteria and plants. Following overexpression in Escherichia coli, the M. tuberculosis MetC enzyme was purified and crystallized using the hanging-drop vapor-diffusion method. Native diffraction data were collected from crystals belonging to space group P21 and were processed to a resolution of 2.1 Å.
Mycobacterium tuberculosis H37Rv; Rv3340; O-acetylhomoserine sulfhydrylase; methionine biosynthesis
Osteoporotic hip fractures, with significant morbidity and excess mortality among the elderly, have imposed huge health and economic burdens on societies worldwide. In this age- and sex-matched case-control study, we examined the risk factors for hip fracture and assessed fracture risk using conditional logistic regression (CLR) and an ensemble artificial neural network (ANN). The performances of these two classifiers were compared.
The study population consisted of 217 pairs (149 women and 68 men) of fracture cases and controls older than 60 years. All participants were interviewed with the same standardized questionnaire, which included questions on 66 risk factors in 12 categories. Univariate CLR analysis was initially conducted to examine the unadjusted odds ratios of all potential risk factors. The significant risk factors were then tested by multivariate analyses. For fracture risk assessment, the participants were randomly divided into modeling and testing datasets for 10-fold cross-validation analyses. The predictive models built by CLR and ANN on the modeling datasets were applied to the testing datasets to study generalization. The performances, including discrimination and calibration, were compared with non-parametric Wilcoxon tests.
In univariate CLR analyses, 16 variables reached statistical significance, and six of them remained significant in multivariate analyses: low T score, low BMI, low MMSE score, milk intake, walking difficulty, and significant fall at home. For discrimination, ANN outperformed CLR in both 16- and 6-variable analyses in modeling and testing datasets (p ≤ 0.005). For calibration, ANN outperformed CLR only in 16-variable analyses in modeling and testing datasets (p = 0.013 and 0.047, respectively).
The risk factors of hip fracture are more personal than environmental. With adequate model construction, ANN may outperform CLR in both discrimination and calibration. ANN seems not to have been developed to its full potential, and efforts should be made to improve its performance.
Hip fracture; Artificial neural network; Conditional logistic regression; Discrimination; Calibration
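The random division into modeling and testing datasets for 10-fold cross-validation, as described in the hip-fracture study above, might look like the following sketch. Keeping each matched case-control pair together in one fold is an assumption made here to respect the matched design; the abstract does not state how the split was done, and the function name `ten_fold_pairs` is hypothetical.

```python
import random
from typing import Iterator, List, Tuple


def ten_fold_pairs(n_pairs: int, k: int = 10,
                   seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (modeling, testing) lists of pair indices for k-fold
    cross-validation. Each index stands for one matched case-control pair,
    so a pair is never split across the two datasets (an assumption)."""
    ids = list(range(n_pairs))
    random.Random(seed).shuffle(ids)  # reproducible random assignment
    folds = [ids[i::k] for i in range(k)]  # k nearly equal folds
    for i in range(k):
        testing = folds[i]
        modeling = [p for j, fold in enumerate(folds) if j != i for p in fold]
        yield modeling, testing


if __name__ == "__main__":
    for modeling, testing in ten_fold_pairs(217):
        print(len(modeling), len(testing))
```

With 217 pairs, each testing fold holds 21 or 22 pairs and every pair is tested exactly once.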
AcrB is an inner membrane resistance-nodulation-cell division efflux pump and is part of the AcrAB–TolC tripartite efflux system. We have determined the crystal structure of AcrB with bound Linezolid at a resolution of 3.5 Å. The structure shows that Linezolid binds to the A385/F386 loops of the symmetric trimer of AcrB. A conformational change of a loop in the bottom of the periplasmic cleft is also observed.
Multidrug resistance; AcrB; RND efflux pumps; Linezolid; Membrane protein; Protein–drug complex; X-ray crystal structure
A comparative analysis of sulfur phasing of death receptor 6 (DR6) using data collected at wavelengths of 2.0 and 2.7 Å is presented. SAXS analysis of unliganded DR6 defines a dimer as the minimum physical unit in solution.
A subset of tumour necrosis factor receptor (TNFR) superfamily members contain death domains in their cytoplasmic tails. Death receptor 6 (DR6) is one such member and can trigger apoptosis upon the binding of a ligand by its cysteine-rich domains (CRDs). The crystal structure of the ectodomain (amino acids 1–348) of human death receptor 6 (DR6) encompassing the CRD region was phased using the anomalous signal from S atoms. In order to explore the feasibility of S-SAD phasing at longer wavelengths (beyond 2.5 Å), a comparative study was performed on data collected at wavelengths of 2.0 and 2.7 Å. In spite of sub-optimal experimental conditions, the 2.7 Å wavelength used for data collection showed potential for S-SAD phasing. The results showed that the Rp.i.m. ratio is a good indicator for monitoring the anomalous data quality when the anomalous signal is relatively strong, while d′′/sig(d′′) calculated by SHELXC is a more sensitive and stable indicator applicable for grading a wider range of anomalous data qualities. The use of the ‘parameter-space screening method’ for S-SAD phasing resulted in solutions for data sets that failed during manual attempts. SAXS measurements on the ectodomain suggested that a dimer defines the minimal physical unit of an unliganded DR6 molecule in solution.
sulfur phasing; SAXS analysis; long-wavelength X-rays; death receptor 6
X-ray crystallography is a critical tool in the study of biological systems. It is able to provide information that has been a prerequisite to understanding the fundamentals of life. It is also a method that is central to the development of new therapeutics for human disease. Significant time and effort are required to determine and optimize many macromolecular structures because of the need for manual interpretation of complex numerical data, often using many different software packages, and the repeated use of interactive three-dimensional graphics. The Phenix software package has been developed to provide a comprehensive system for macromolecular crystallographic structure solution with an emphasis on automation. This has required the development of new algorithms that minimize or eliminate subjective input in favour of built-in expert-systems knowledge, the automation of procedures that are traditionally performed by hand, and the development of a computational framework that allows a tight integration between the algorithms. The application of automated methods is particularly appropriate in the field of structural proteomics, where high throughput is desired. Features in Phenix for the automation of experimental phasing with subsequent model building, molecular replacement, structure refinement and validation are described and examples given of running Phenix from both the command line and graphical user interface.
Macromolecular Crystallography; Automation; Phenix; X-ray; Diffraction; Python
A density-based procedure is described for improving a homology model that is locally accurate but differs globally. The model is deformed to match the map and refined, yielding an improved starting point for density modification and further model-building.
An approach is presented for addressing the challenge of model rebuilding after molecular replacement in cases where the placed template is very different from the structure to be determined. The approach takes advantage of the observation that a template and target structure may have local structures that can be superimposed much more closely than can their complete structures. A density-guided procedure for deformation of a properly placed template is introduced. A shift in the coordinates of each residue in the structure is calculated based on optimizing the match of model density within a 6 Å radius of the center of that residue with a prime-and-switch electron-density map. The shifts are smoothed and applied to the atoms in each residue, leading to local deformation of the template that improves the match of map and model. The model is then refined to improve the geometry and the fit of model to the structure-factor data. A new map is then calculated and the process is repeated until convergence. The procedure can extend the routine applicability of automated molecular replacement, model building and refinement to search models with over 2 Å r.m.s.d. representing 65–100% of the structure.
molecular replacement; automation; macromolecular crystallography; structure similarity; modeling; Phenix; morphing
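The smoothing and application of per-residue shifts described in the morphing abstract above can be sketched as follows. The abstract does not specify the smoothing function, so a simple moving average along the chain is assumed here; the names `smooth_shifts` and `apply_shift` and the window size are hypothetical, and the density-based calculation of the raw shifts is not shown.

```python
from typing import List, Tuple

Vec = Tuple[float, float, float]


def smooth_shifts(shifts: List[Vec], half_window: int = 2) -> List[Vec]:
    """Average each residue's shift with those of its neighbours along the
    chain (moving average; the window is truncated at the chain ends)."""
    n = len(shifts)
    smoothed: List[Vec] = []
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        window = shifts[lo:hi]
        smoothed.append(tuple(sum(s[axis] for s in window) / len(window)
                              for axis in range(3)))
    return smoothed


def apply_shift(atoms: List[Vec], shift: Vec) -> List[Vec]:
    """Translate every atom of one residue by that residue's smoothed shift."""
    return [(x + shift[0], y + shift[1], z + shift[2]) for x, y, z in atoms]


if __name__ == "__main__":
    raw = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    print(smooth_shifts(raw, half_window=1))
```

Smoothing keeps neighbouring residues moving together, so the deformation stays local and continuous rather than tearing the chain apart.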
The TB Structural Genomics Consortium is a worldwide organization of collaborators whose mission is the comprehensive structural determination and analyses of Mycobacterium tuberculosis proteins to ultimately aid in tuberculosis diagnosis and treatment. Consistent with this overall vision, Consortium members have additionally established an integrated facilities core to streamline M. tuberculosis structural biology and developed bioinformatics resources for data mining. This review aims to share the latest Consortium developments with the TB community, including recent structures of proteins that play significant roles within M. tuberculosis. Atomic resolution details may unravel mechanistic insights and reveal unique and novel protein features, as well as important protein-protein and protein-ligand interactions, which ultimately lead to a better understanding of M. tuberculosis biology and may be exploited for rational, structure-based therapeutics design.
Mycobacterium tuberculosis; Protein structure; X-ray crystallography; Structural genomics; Drug discovery
Ligands interacting with Mycobacterium tuberculosis recombinant proteins were identified through use of the ability of Cibacron Blue F3GA dye to interact with nucleoside/nucleotide binding proteins, and the effects of these ligands on crystallization were examined. Co-crystallization with ligands enhanced crystallization and enabled X-ray diffraction data to be collected to a resolution of at least 2.7 Å for 5 of 10 proteins tested. Additionally, clues about individual proteins’ functions were obtained from their interactions with each of a panel of ligands.
Electronic supplementary material
The online version of this article (doi:10.1007/s10969-012-9124-8) contains supplementary material, which is available to authorized users.
Characterization of proteins based on ligands; Dye-ligand affinity chromatography; Enhancement of crystallization; Ligand aided crystallization; Ligand analysis; Nucleotide ligand
Crystal and solution structures of Rv1848 protein and their implications in the biological assembly of Mtb urease are presented.
The crystal structure of the urease γ subunit (UreA) from Mycobacterium tuberculosis, Rv1848, has been determined at 1.8 Å resolution. The asymmetric unit contains three copies of Rv1848 arranged into a homotrimer that is similar to the UreA trimer in the structure of urease from Klebsiella aerogenes. Small-angle X-ray scattering experiments indicate that the Rv1848 protein also forms trimers in solution. The observed homotrimer and the organization of urease genes within the M. tuberculosis genome suggest that M. tuberculosis urease has the (αβγ)3 composition observed for other bacterial ureases. The γ subunit may be of primary importance for the formation of the urease quaternary structure.
Mycobacterium tuberculosis; urease; structural genomics
Disulfide bond forming (Dsb) proteins ensure correct folding and disulfide bond formation of secreted proteins. Previously, we showed that Mycobacterium tuberculosis DsbE (Mtb DsbE, Rv2878c) aids in vitro oxidative folding of proteins. Here we present structural, biochemical and gene expression analyses of another putative Mtb secreted disulfide bond isomerase protein homologous to Mtb DsbE, Mtb DsbF (Rv1677). The X-ray crystal structure of Mtb DsbF reveals a conserved thioredoxin fold although the active-site cysteines may be modeled in both oxidized and reduced forms, in contrast to the solely reduced form in Mtb DsbE. Furthermore, the shorter loop region in Mtb DsbF results in a more solvent-exposed active site. Biochemical analyses show that, similar to Mtb DsbE, Mtb DsbF can oxidatively refold reduced, unfolded hirudin and has a comparable pKa for the active-site solvent-exposed cysteine. However, contrary to Mtb DsbE, the Mtb DsbF redox potential is more oxidizing and its reduced state is more stable. From computational genomics analysis of the M. tuberculosis genome, we identified a potential Mtb DsbF interaction partner, Rv1676, a predicted peroxiredoxin. Complex formation is supported by protein co-expression studies and inferred by gene expression profiles, whereby Mtb DsbF and Rv1676 are upregulated under similar environments. Additionally, comparison of Mtb DsbF and Mtb DsbE gene expression data indicates anticorrelated gene expression patterns, suggesting that these two proteins and their functionally linked partners constitute analogous pathways that may function under different conditions.
Mycobacterium tuberculosis; disulfide bond forming protein; structure-function; gene expression data; X-ray crystallography
Here, the crystal structure of TM0439, a GntR regulator with an FCD domain found in the Thermotoga maritima genome, is described.
The GntR superfamily of dimeric transcription factors, with more than 6200 members encoded in bacterial genomes, is characterized by N-terminal winged-helix DNA-binding domains and diverse C-terminal regulatory domains which provide a basis for the classification of the constituent families. The largest of these families, FadR, contains nearly 3000 proteins with all-α-helical regulatory domains classified into two related Pfam families: FadR_C and FCD. Only two crystal structures of FadR-family members, those of Escherichia coli FadR protein and LldR from Corynebacterium glutamicum, have been described to date in the literature. Here, the crystal structure of TM0439, a GntR regulator with an FCD domain found in the Thermotoga maritima genome, is described. The FCD domain is similar to that of the LldR regulator and contains a buried metal-binding site. Using atomic absorption spectroscopy and Trp fluorescence, it is shown that the recombinant protein contains bound Ni2+ ions but that it is able to bind Zn2+ with Kd < 70 nM. It is concluded that Zn2+ is the likely physiological metal and that it may perform either structural or regulatory roles or both. Finally, the TM0439 structure is compared with two other FadR-family structures recently deposited by structural genomics consortia. The results call for a revision in the classification of the FadR family of transcription factors.
transcription regulation; GntR family; structural genomics; surface-entropy reduction
The PHENIX software for macromolecular structure determination is described.
Macromolecular X-ray crystallography is routinely applied to understand biological processes at a molecular level. However, significant time and effort are still required to solve and complete many of these structures because of the need for manual interpretation of complex numerical data using many software packages and the repeated use of interactive three-dimensional graphics. PHENIX has been developed to provide a comprehensive system for macromolecular crystallographic structure solution with an emphasis on the automation of all procedures. This has relied on the development of algorithms that minimize or eliminate subjective input, the development of algorithms that automate procedures that are traditionally performed by hand and, finally, the development of a framework that allows a tight integration between the algorithms.
PHENIX; Python; macromolecular crystallography; algorithms
We show that Cibacron Blue F3GA dye resin chromatography can be used to identify ligands that specifically interact with proteins from Mycobacterium tuberculosis, and that the identification of these ligands can facilitate structure determination by enhancing the quality of crystals. Four native Mtb proteins of the aldehyde dehydrogenase (ALDH) family were previously shown to be specifically eluted from a Cibacron Blue F3GA dye resin with nucleosides. In this study we characterized the nucleoside-binding specificity of one of these ALDH isozymes (recombinant Mtb Rv0223c) and compared these biochemical results with co-crystallization experiments with different Rv0223c-nucleoside pairings. We found that the strongly interacting ligands (NAD and NADH) aided formation of high-quality crystals, permitting solution of the first Mtb ALDH (Rv0223c) structure. Other nucleoside ligands (AMP, FAD, adenosine, GTP and NADP) exhibited weaker binding to Rv0223c, and produced co-crystals diffracting to lower resolution. Difference electron density maps based on crystals of Rv0223c with various nucleoside ligands show that most share the binding site where the natural ligand NAD binds. From the high degree of similarity of sequence and structure compared to human mitochondrial ALDH-2 (BLAST Z-score = 53.5 and RMSD = 1.5 Å), Rv0223c appears to belong to the ALDH-2 class. An altered oligomerization domain in the Rv0223c structure seems to keep this protein as a monomer, whereas native human ALDH-2 is a multimer.
Electronic supplementary material
The online version of this article (doi:10.1007/s10969-009-9073-z) contains supplementary material, which is available to authorized users.
Functional analysis; High efficiency in structural genomics; Improvement of crystal quality; Nucleoside binding proteins; Prioritization of targeting; Specificity of ligand binding
The DUF1094 family contains over 100 bacterial proteins, all containing a conserved CXC motif, with unknown function. We solved the crystal structure of the Bacillus subtilis representative, the product of the yphP gene. The protein shows remarkable structural similarity to thioredoxins, with a canonical αβαβαββα topology, despite low amino acid sequence identity to thioredoxin. The CXC motif is found in the loop immediately downstream of the first β-strand, in a location equivalent to the CXXC motif of thioredoxins, with the first Cys occupying a position equivalent to the first Cys in canonical thioredoxin. The experimentally determined reduction potential of YphP is E°′ = −130 mV, significantly higher than that of thioredoxin and consistent with disulfide isomerase activity. Functional assays confirmed that the protein displays a level of isomerase activity that might be biologically significant. We propose a mechanism by which the members of this family catalyze isomerization using the CXC catalytic site.
Ten measures of experimental electron-density-map quality are examined and the skewness of electron density is found to be the best indicator of actual map quality. A Bayesian approach to estimating map quality is developed and used in the PHENIX AutoSol wizard to make decisions during automated structure solution.
Estimates of the quality of experimental maps are important in many stages of structure determination of macromolecules. Map quality is defined here as the correlation between a map and the corresponding map obtained using phases from the final refined model. Here, ten different measures of experimental map quality were examined using a set of 1359 maps calculated by re-analysis of 246 solved MAD, SAD and MIR data sets. A simple Bayesian approach to estimation of map quality from one or more measures is presented. It was found that a Bayesian estimator based on the skewness of the density values in an electron-density map is the most accurate of the ten individual Bayesian estimators of map quality examined, with a correlation between estimated and actual map quality of 0.90. A combination of the skewness of electron density with the local correlation of r.m.s. density gives a further improvement in estimating map quality, with an overall correlation coefficient of 0.92. The PHENIX AutoSol wizard carries out automated structure solution based on any combination of SAD, MAD, SIR or MIR data sets. The wizard is based on tools from the PHENIX package and uses the Bayesian estimates of map quality described here to choose the highest quality solutions after experimental phasing.
structure solution; scoring; Protein Data Bank; phasing; decision-making; PHENIX; experimental electron-density maps
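The skewness measure highlighted in the map-quality abstract above can be computed as the standardized third moment of the density values. The sketch below shows only that calculation; the Bayesian mapping from skewness to estimated map quality used in AutoSol is not reproduced, and the function name `skewness` and the toy input are illustrative assumptions.

```python
from typing import Sequence


def skewness(values: Sequence[float]) -> float:
    """Return the skewness (third central moment divided by the 1.5 power
    of the second central moment) of a set of density values."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # a perfectly flat map has no skew
    return m3 / m2 ** 1.5


if __name__ == "__main__":
    # Mostly flat "solvent" with one strong "atomic" peak: positive skew.
    print(skewness([0.0, 0.0, 0.0, 0.0, 10.0]))
```

The intuition, offered here as a plausible reading rather than a statement from the abstract, is that a good map has a few strong positive peaks over a flat background, which gives a positively skewed distribution, whereas a noisy map is closer to symmetric.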
The structure of MtrA, an essential gene product for the human pathogen Mycobacterium tuberculosis, has been solved to a resolution of 2.1 Å. MtrA is a member of the OmpR/PhoB family of response regulators and represents the fourth family member for which a structure of the protein in its inactive state has been determined. As is true for all OmpR/PhoB family members, MtrA possesses an N-terminal regulatory domain and a C-terminal winged helix-turn-helix DNA-binding domain, with phosphorylation of the regulatory domain modulating the activity of the protein. In the inactive form of MtrA these two domains form an extensive interface that is composed of the α4-β5-α5 face of the regulatory domain and the C-terminal end of the positioning helix, the trans-activation loop, and the recognition helix of the DNA-binding domain. This domain orientation suggests a mechanism of mutual inhibition by the two domains. Activation of MtrA would require a disruption of this interface to allow the α4-β5-α5 face of the regulatory domain to form the inter-molecule interactions that are associated with the active state and to allow the recognition helix to interact with DNA. Furthermore, the interface appears to stabilize the inactive conformation of MtrA, potentially reducing the rate of phosphorylation of the N-terminal domain. This combination of effects may form a switch regulating the activity of MtrA. The domain orientation exhibited by MtrA also provides a rationale for the variation in linker length that is observed within the OmpR/PhoB family of response regulators.
RuvA, a protein from M. tuberculosis H37Rv involved in recombination, has been cloned, expressed, purified and analysed by X-ray crystallography.
The process of recombinational repair is crucial for maintaining genomic integrity and generating biological diversity. In association with RuvB and RuvC, RuvA plays a central role in processing and resolving Holliday junctions, which are a critical intermediate in homologous recombination. Here, the cloning, purification and structure determination of the RuvA protein from Mycobacterium tuberculosis (MtRuvA) are reported. Analysis of the structure and comparison with other known RuvA proteins reveal an octameric state with conserved subunit–subunit interaction surfaces, indicating the requirement of octamer formation for biological activity. A detailed analysis of plasticity in the RuvA molecules has led to insights into the invariant and variable regions, thus providing a framework for understanding regional flexibility in various aspects of RuvA function.
RuvA; Mycobacterium tuberculosis; recombinational repair
An OMIT procedure is presented that has the benefits of iterative model building, density modification and refinement yet is essentially unbiased by the atomic model that is built.
A procedure for carrying out iterative model building, density modification and refinement is presented in which the density in an OMIT region is essentially unbiased by an atomic model. Density from a set of overlapping OMIT regions can be combined to create a composite ‘iterative-build’ OMIT map that is everywhere unbiased by an atomic model but also everywhere benefiting from the model-based information present elsewhere in the unit cell. The procedure may have applications in the validation of specific features in atomic models as well as in overall model validation. The procedure is demonstrated with a molecular-replacement structure and with an experimentally phased structure and a variation on the method is demonstrated by removing model bias from a structure from the Protein Data Bank.
model building; model validation; macromolecular models; Protein Data Bank; refinement; OMIT maps; bias; structure refinement; PHENIX
The highly automated PHENIX AutoBuild wizard is described. The procedure can be applied equally well to phases derived from isomorphous/anomalous and molecular-replacement methods.
The PHENIX AutoBuild wizard is a highly automated tool for iterative model building, structure refinement and density modification using RESOLVE model building, RESOLVE statistical density modification and phenix.refine structure refinement. Recent advances in the AutoBuild wizard and phenix.refine include automated detection and application of NCS from models as they are built, extensive model-completion algorithms and automated solvent-molecule picking. Model-completion algorithms in the AutoBuild wizard include loop building, crossovers between chains in different models of a structure and side-chain optimization. The AutoBuild wizard has been applied to a set of 48 structures at resolutions ranging from 1.1 to 3.2 Å, resulting in a mean R factor of 0.24 and a mean free R factor of 0.29. The R factor of the final model is dependent on the quality of the starting electron density and is relatively independent of resolution.
model building; model completion; macromolecular models; Protein Data Bank; structure refinement; PHENIX
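For readers unfamiliar with the R factors quoted in the AutoBuild abstract above, the conventional definition is the summed absolute difference between observed and calculated structure-factor amplitudes divided by the summed observed amplitudes. The sketch below assumes the two lists are already on a common scale; the function name `r_factor` is hypothetical and this is not the phenix.refine implementation.

```python
from typing import Sequence


def r_factor(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """Conventional crystallographic R factor:
    sum(|Fobs - Fcalc|) / sum(Fobs), with amplitudes assumed pre-scaled."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


if __name__ == "__main__":
    print(r_factor([100.0, 50.0, 50.0], [90.0, 60.0, 50.0]))
```

The free R factor uses the same formula, evaluated only over the reflections set aside from refinement.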
Heterogeneity in ensembles generated by independent model rebuilding principally reflects the limitations of the data and of the model-building process rather than the diversity of structures in the crystal.
Automation of iterative model building, density modification and refinement in macromolecular crystallography has made it feasible to carry out this entire process multiple times. By using different random seeds in the process, a number of different models compatible with experimental data can be created. Sets of models were generated in this way using real data for ten protein structures from the Protein Data Bank and using synthetic data generated at various resolutions. Most of the heterogeneity among models produced in this way is in the side chains and loops on the protein surface. Possible interpretations of the variation among models created by repetitive rebuilding were investigated. Synthetic data were created in which a crystal structure was modelled as the average of a set of ‘perfect’ structures and the range of models obtained by rebuilding a single starting model was examined. The standard deviations of coordinates in models obtained by repetitive rebuilding at high resolution are small, while those obtained for the same synthetic crystal structure at low resolution are large, so that the diversity within a group of models cannot generally be a quantitative reflection of the actual structures in a crystal. Instead, the group of structures obtained by repetitive rebuilding reflects the precision of the models, and the standard deviation of coordinates of these structures is a lower bound estimate of the uncertainty in coordinates of the individual models.
model building; model completion; coordinate errors; models; Protein Data Bank; convergence; reproducibility; heterogeneity; precision; accuracy
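The standard deviation of coordinates across a set of independently rebuilt models, discussed in the ensemble abstract above, can be sketched as a per-atom r.m.s. deviation from the mean position. The sketch assumes all models list the same atoms in the same order and are already in a common frame; the function name `coordinate_spread` is hypothetical.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float, float]


def coordinate_spread(models: List[List[Vec]]) -> List[float]:
    """For each atom, return the r.m.s. distance of its positions in the
    individual models from its mean position over all models."""
    n_models = len(models)
    spreads: List[float] = []
    for positions in zip(*models):  # one tuple of positions per atom
        mean = [sum(p[axis] for p in positions) / n_models for axis in range(3)]
        msd = sum(sum((p[axis] - mean[axis]) ** 2 for axis in range(3))
                  for p in positions) / n_models
        spreads.append(math.sqrt(msd))
    return spreads


if __name__ == "__main__":
    ensemble = [[(0.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)]]
    print(coordinate_spread(ensemble))
```

Following the abstract, such a spread is best read as a lower-bound estimate of coordinate uncertainty, not as a measure of the structural diversity in the crystal.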