Dephosphorylation of eukaryotic translation initiation factor 2α (eIF2α) restores protein synthesis at the waning of stress responses and requires a PP1 catalytic subunit and a regulatory subunit, PPP1R15A/GADD34 or PPP1R15B/CReP. Surprisingly, PPP1R15-PP1 binary complexes reconstituted in vitro lacked substrate selectivity. However, selectivity was restored by crude cell lysate or purified G-actin, which joined PPP1R15-PP1 to form a stable ternary complex. In crystal structures of the non-selective PPP1R15B-PP1G complex, the functional core of PPP1R15 made multiple surface contacts with PP1G, but at a distance from the active site, whereas in the substrate-selective ternary complex, actin contributes to one face of a platform encompassing the active site. Computational docking of the N-terminal lobe of eIF2α at this platform placed phosphorylated serine 51 near the active site. Mutagenesis of predicted surface-contacting residues enfeebled dephosphorylation, suggesting that avidity for the substrate plays an important role in imparting specificity on the PPP1R15B-PP1G-actin ternary complex.
For a cell to build a protein, it must first copy the instructions contained within a gene. A complex molecular machine called a ribosome then reads these instructions and translates them into a protein. This translation process involves a number of steps. Proteins called eukaryotic translation initiation factors (or eIFs for short) coordinate the first step in the process, which is known as ‘initiation’.
The eIFs also provide the cell with ways to control how quickly it makes proteins. For example, when a cell is stressed, either by starvation or toxins, it adds a phosphate group onto part of an eIF protein, called eIF2α. This modification makes this eIF protein less able to initiate translation, and so the cell builds fewer proteins and conserves more of its resources during times of stress.
Once the stressful conditions are over, the phosphate group is removed from eIF2α by an enzyme called a phosphatase. This phosphatase contains two subunits: one that recognizes eIF2α and another that removes the phosphate group. However, experiments that attempted to recreate this phosphatase activity using just these two subunits in a test tube failed to generate a working enzyme that specifically targeted the phosphate group of eIF2α. This suggests that in cells this enzyme contains an additional unknown subunit. Now, Chen et al. (and Chambers, Dalton et al.) report the identity of a ‘missing’ third subunit as a protein known as globular-actin or G-actin.
First, Chen et al. looked at the three-dimensional structure of a two-subunit complex formed from the previously known subunits of the phosphatase enzyme, and confirmed that it could remove phosphate groups from a range of proteins and not just eIF2α. However, when a mixture of other proteins taken from mouse cells was added to this two-subunit complex, the complex could specifically remove the phosphate group on the eIF2α protein.
Further experiments revealed that G-actin was the protein in the mixture that, when added to the two-subunit complex, made it specifically target the eIF2α protein. Chen et al. then used a combination of biochemical and structural biology techniques to investigate the phosphatase activity of the three-subunit complex. These findings suggest a plausible molecular mechanism by which the three-subunit complex becomes selective for its target, but further refinements to the structural work will be needed to critically test these suggestions.
rabbit; cell lines; mouse; E. coli; human
Subversion of the host immune system by viruses is often mediated by molecular decoys that sequester host proteins pivotal to mounting effective immune responses. The widespread mammalian pathogen parapox Orf virus deploys GIF, a member of the poxvirus immune evasion superfamily, to antagonize GM-CSF (granulocyte macrophage colony-stimulating factor) and IL-2 (interleukin-2), two pleiotropic cytokines of the mammalian immune system. However, structural and mechanistic insights into the unprecedented functional duality of GIF have remained elusive. Here we reveal that GIF employs a dimeric binding platform that sequesters two copies of its target cytokines with high affinity and slow dissociation kinetics to yield distinct complexes featuring mutually exclusive interaction footprints. We illustrate how GIF serves as a competitive decoy receptor by leveraging binding hotspots underlying the cognate receptor interactions of GM-CSF and IL-2, without sharing any structural similarity with the cytokine receptors. Our findings contribute to the tracing of novel molecular mimicry mechanisms employed by pathogenic viruses.
Viruses often subvert the host immune system using molecular decoys to prevent an effective immune response. Here, the authors examine the structural details of the viral decoy receptor GIF and its antagonism of GM-CSF and IL-2.
The DAN-family, including Gremlin-1 and Gremlin-2 (Grem1 and Grem2), represents a large family of secreted BMP antagonists. However, how DAN proteins specifically inhibit BMP signaling has remained elusive. Here, we report the structure of Grem2 bound to GDF5 at 2.9 Å resolution. The structure reveals two Grem2 dimers binding perpendicularly to each GDF5 monomer, resembling an H-like structure. Comparison to the unbound Grem2 structure reveals a dynamic N-terminus that undergoes significant transition upon complex formation, leading to simultaneous interaction with the type I/type II receptor motifs on GDF5. Binding studies show that DAN-family members can interact with BMP-type I receptor complexes, whereas Noggin outcompetes the type I receptor for ligand binding. Interestingly, Grem2-GDF5 forms a stable aggregate-like structure, in vitro, not clearly observed for other antagonists, including Noggin and Follistatin. These findings exemplify the structural and functional diversity across the various BMP antagonist families.
Structures of biomolecular systems are increasingly computed by integrative modeling that relies on varied types of experimental data and theoretical information. We describe here the proceedings and conclusions from the first wwPDB Hybrid/Integrative Methods Task Force Workshop held at the European Bioinformatics Institute in Hinxton, UK, October 6 and 7, 2014. At the workshop, experts in various experimental fields of structural biology, experts in integrative modeling and visualization, and experts in data archiving addressed a series of questions central to the future of structural biology. How should integrative models be represented? How should the data and integrative models be validated? What data should be archived? How should the data and models be archived? What information should accompany the publication of integrative models?
integrative modeling; hybrid modeling; integrative structural biology; Protein Data Bank
The Z mutation (E342K) of α1-antitrypsin (α1-AT), carried by 4% of Northern Europeans, predisposes to early onset of emphysema due to decreased functional α1-AT in the lung and to liver cirrhosis due to accumulation of polymers in hepatocytes. However, it remains unclear why the Z mutation causes intracellular polymerization of nascent Z α1-AT and why 15% of the expressed Z α1-AT is secreted into circulation as functional, but polymerogenic, monomers. Here, we solve the crystal structure of the Z-monomer and have engineered replacements to assess the conformational role of residue Glu-342 in α1-AT. The results reveal that Z α1-AT has a labile strand 5 of the central β-sheet A (s5A) with a consequent equilibrium between a native inhibitory conformation, as in its crystal structure here, and an aberrant conformation with s5A only partially incorporated into the central β-sheet. This aberrant conformation, induced by the loss of interactions from the Glu-342 side chain, explains why Z α1-AT is prone to polymerization and readily binds to a 6-mer peptide, and it supports that annealing of s5A into the central β-sheet is a crucial step in the serpins' metastable conformational formation. The demonstration that the aberrant conformation can be rectified through stabilization of the labile s5A by binding of a small molecule opens a potential therapeutic approach for Z α1-AT deficiency.
conformational change; crystal structure; protein folding; serpin; small molecule; PBA; Polymerization; Z α1-antitrypsin
Structures of multi-subunit macromolecular machines are primarily determined by either electron microscopy (EM) or X-ray crystallography. In many cases, a structure for a complex can be obtained at low resolution (at a coarse level of detail) with EM and at higher resolution (with finer detail) by X-ray crystallography. The integration of these two structural techniques is becoming increasingly important for generating atomic models of macromolecular complexes. A low-resolution EM image can be a powerful tool for obtaining the "phase" information that is missing from an X-ray crystallography experiment; however, integration of EM and X-ray diffraction data has been technically challenging. Here we present a step-by-step protocol that explains how low-resolution EM maps can be placed in the crystallographic unit cell by molecular replacement, and how initial phases computed from the placed EM density are extended to high resolution by averaging maps over non-crystallographic symmetry. As the resolution gap between EM and X-ray crystallography continues to narrow, the use of EM maps to help with X-ray crystal structure determination, as described in this protocol, will become increasingly effective.
A new Rice-function approximation for the effect of intensity-measurement errors improves the treatment of weak intensity data in calculating log-likelihood-gain scores in crystallographic applications including experimental phasing, molecular replacement and refinement.
The crystallographic diffraction experiment measures Bragg intensities; crystallographic electron-density maps and other crystallographic calculations in phasing require structure-factor amplitudes. If data were measured with no errors, the structure-factor amplitudes would be trivially proportional to the square roots of the intensities. When the experimental errors are large, and especially when random errors yield negative net intensities, the conversion of intensities and their error estimates into amplitudes and associated error estimates becomes nontrivial. Although this problem has been addressed intermittently in the history of crystallographic phasing, current approaches to accounting for experimental errors in macromolecular crystallography have numerous significant defects. These have been addressed with the formulation of LLGI, a log-likelihood-gain function in terms of the Bragg intensities and their associated experimental error estimates. LLGI has the correct asymptotic behaviour for data with large experimental error, appropriately downweighting these reflections without introducing bias. LLGI abrogates the need for the conversion of intensity data to amplitudes, which is usually performed with the French and Wilson method [French & Wilson (1978), Acta Cryst. A35, 517–525], wherever likelihood target functions are required. It has general applicability for a wide variety of algorithms in macromolecular crystallography, including scaling, characterizing anisotropy and translational noncrystallographic symmetry, detecting outliers, experimental phasing, molecular replacement and refinement. Because it is impossible to reliably recover the original intensity data from amplitudes, it is suggested that crystallographers should always deposit the intensity data in the Protein Data Bank.
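The difficulty with weak and negative intensities can be illustrated with a minimal Python sketch of the naive conversion, in which the amplitude is the square root of the intensity and its error follows from first-order error propagation. This is not LLGI and not the French and Wilson method; the function name `naive_amplitude` and the choice to return `None` for non-positive intensities are assumptions made purely for illustration.

```python
import math


def naive_amplitude(intensity, sigma_i):
    """Naive intensity-to-amplitude conversion (illustrative only).

    F = sqrt(I), with sigma_F = sigma_I / (2 * F) from first-order
    error propagation. Returns None when I <= 0, because the square
    root is undefined (I < 0) or the propagated error diverges (I = 0).
    """
    if intensity <= 0.0:
        return None
    amplitude = math.sqrt(intensity)
    sigma_f = sigma_i / (2.0 * amplitude)
    return amplitude, sigma_f


# A strong reflection converts without trouble.
strong = naive_amplitude(100.0, 10.0)

# A weak reflection whose net intensity came out negative cannot be
# converted at all, so its information would simply be discarded.
weak = naive_amplitude(-3.0, 5.0)
```

Under this naive scheme, every reflection with a non-positive net intensity is lost, which is the kind of defect that working directly with intensities and their error estimates is meant to avoid.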
intensity-measurement errors; likelihood
Recent studies of corticosteroid-binding globulin (CBG) indicate that it does not merely transport cortisol passively but also actively regulates its release in the circulation. We show how CBG binding affinity can vary to give changes in free cortisol concentration in a physiologically relevant range.
The objective was to determine how the binding affinity of plasma CBG is affected by glycosylation, changes in body temperature, and the conformational change induced by proteases at sites of inflammation.
Binding assays were performed over a range of temperatures with plasma and recombinant CBG to determine the contribution of glycosylation. The role of conformational change was assessed by measuring binding affinities of plasma CBG before and after reactive loop cleavage by neutrophil elastase.
Main Outcome Measures:
Determination of binding constants allows calculation of clinically relevant changes in CBG saturation and free cortisol concentrations.
On reactive loop cleavage at inflammation sites, CBG can continue to act as a buffered source of cortisol, although with a much reduced affinity, to give a potential quadrupling of free cortisol. Predicted increases in systemic free cortisol resulting from elevated body temperatures, previously reported based on affinity measurements using nonglycosylated recombinant CBG, were shown here to be considerably increased using glycosylated plasma CBG, with a doubling for every 2°C rise in body temperature.
The ability of CBG to modulate free cortisol levels in blood must be considered in the understanding and management of disease processes, as illustrated here with predictable changes in inflammation and fever.
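How a binding constant translates into saturation and free hormone concentration can be sketched with a simple one-site equilibrium. The Python below is a minimal illustration that assumes a single binding protein with one site per molecule and ignores other carriers such as albumin; the function names and the concentrations used are hypothetical and are not values from the study.

```python
import math


def bound_complex(total_protein, total_ligand, kd):
    """Concentration of protein-ligand complex for one-site binding.

    Solves B**2 - (P + L + Kd) * B + P * L = 0 for the physically
    meaningful (smaller) root. All inputs share the same units.
    """
    s = total_protein + total_ligand + kd
    return (s - math.sqrt(s * s - 4.0 * total_protein * total_ligand)) / 2.0


def free_ligand(total_protein, total_ligand, kd):
    """Unbound ligand concentration at equilibrium."""
    return total_ligand - bound_complex(total_protein, total_ligand, kd)


def saturation(total_protein, total_ligand, kd):
    """Fraction of protein binding sites that are occupied."""
    return bound_complex(total_protein, total_ligand, kd) / total_protein


# Hypothetical concentrations in arbitrary but consistent units:
# weakening the affinity (larger Kd) raises the free ligand level.
free_tight = free_ligand(1.0, 1.0, 1.0)
free_weak = free_ligand(1.0, 1.0, 4.0)
```

In this sketch a drop in affinity, whether from a temperature rise or from reactive loop cleavage, enters only through a larger `kd`, and the free concentration rises accordingly.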
A likelihood-based method for determining the sub-structure of anomalously-scattering atoms in macromolecular crystals can allow successful structure determination by single-wavelength anomalous diffraction (SAD) X-ray analysis with weak anomalous signal. Along with use of partial models and electron density maps in searches for anomalously-scattering atoms, testing of alternative values of parameters, and parallelized automated model-building, this method has the potential for extending the applicability of the SAD method in challenging cases.
A 1.8 Å resolution structure of the sphingolipid activator protein saposin A has been determined at pH 4.8, the physiologically relevant lysosomal pH for hydrolase enzyme activation and lipid-transfer activity.
The saposins are essential cofactors for the normal lysosomal degradation of complex glycosphingolipids by acid hydrolase enzymes; defects in either saposin or hydrolase function lead to severe metabolic diseases. Saposin A (SapA) activates the enzyme β-galactocerebrosidase (GALC), which catalyzes the breakdown of β-d-galactocerebroside, the principal lipid component of myelin. SapA is known to bind lipids and detergents in a pH-dependent manner; this is accompanied by a striking transition from a ‘closed’ to an ‘open’ conformation. However, previous structures were determined at non-lysosomal pH. This work describes a 1.8 Å resolution X-ray crystal structure determined at the physiologically relevant lysosomal pH 4.8. In the absence of lipid or detergent at pH 4.8, SapA is observed to adopt a conformation closely resembling the previously determined ‘closed’ conformation, showing that pH alone is not sufficient for the transition to the ‘open’ conformation. Structural alignments reveal small conformational changes, highlighting regions of flexibility.
saposin A; lipid-transfer protein; sphingolipid activator protein; GALC
Modified azasugar molecules have been synthesized and characterized as excellent pharmacological chaperone candidates to treat the neurodegenerative disorder Krabbe disease.
Krabbe disease is a devastating neurodegenerative disorder characterized by rapid demyelination of nerve fibers. This disease is caused by defects in the lysosomal enzyme β-galactocerebrosidase (GALC), which hydrolyzes the terminal galactose from glycosphingolipids. These lipids are essential components of eukaryotic cell membranes: substrates of GALC include galactocerebroside, the primary lipid component of myelin, and psychosine, a cytotoxic metabolite. Mutations of GALC that cause misfolding of the protein may be responsive to pharmacological chaperone therapy (PCT), whereby small molecules are used to stabilize these mutant proteins, thus correcting trafficking defects and increasing residual catabolic activity in cells. Here we describe a new approach for the synthesis of galacto-configured azasugars and the characterization of their interaction with GALC using biophysical, biochemical and crystallographic methods. We identify that the global stabilization of GALC conferred by azasugar derivatives, measured by fluorescence-based thermal shift assays, is directly related to their binding affinity, measured by enzyme inhibition. X-ray crystal structures of these molecules bound in the GALC active site reveal which residues participate in stabilizing interactions, show how potency is achieved and illustrate the penalties of aza/iminosugar ring distortion. The structure–activity relationships described here identify the key physical properties required of pharmacological chaperones for Krabbe disease and highlight the potential of azasugars as stabilizing agents for future enzyme replacement therapies. This work lays the foundation for new drug-based treatments of Krabbe disease.
Blood cells derive from hematopoietic stem cells through stepwise fating events. To characterize gene expression programs driving lineage choice we sequenced RNA from eight primary human hematopoietic progenitor populations representing the major myeloid commitment stages and the main lymphoid stage. We identify extensive cell-type specific expression changes: 6,711 genes and 10,724 transcripts, enriched in non-protein coding elements at early stages of differentiation. In addition, we discovered 7,881 novel splice junctions and 2,301 differentially used alternative splicing events, enriched in genes involved in regulatory processes. We demonstrate experimentally cell specific isoform usage, identifying NFIB as a regulator of megakaryocyte maturation – the platelet precursor. Our data highlight the complexity of fating events in closely related progenitor populations, the understanding of which is essential for the advancement of transplantation and regenerative medicine.
Hyp-1, a pathogenesis-related class 10 (PR-10) protein from H. perforatum, was crystallized in complex with the fluorescent probe 8-anilino-1-naphthalene sulfonate (ANS). The asymmetric unit of the tetartohedrally twinned crystal contains 28 copies of the protein arranged in columns with noncrystallographic sevenfold translational symmetry and with additional pseudotetragonal rotational NCS.
Hyp-1, a pathogenesis-related class 10 (PR-10) protein from St John’s wort (Hypericum perforatum), was crystallized in complex with the fluorescent probe 8-anilino-1-naphthalene sulfonate (ANS). The highly pseudosymmetric crystal has 28 unique protein molecules arranged in columns with sevenfold translational noncrystallographic symmetry (tNCS) along c and modulated X-ray diffraction with intensity crests at l = 7n and l = 7n ± 3. The translational NCS is combined with pseudotetragonal rotational NCS. The crystal was a perfect tetartohedral twin, although detection of twinning was severely hindered by the pseudosymmetry. The structure determined at 2.4 Å resolution reveals that the Hyp-1 molecules (packed as β-sheet dimers) have three novel ligand-binding sites (two internal and one in a surface pocket), which was confirmed by solution studies. In addition to 60 Hyp-1-docked ligands, there are 29 interstitial ANS molecules distributed in a pattern that violates the arrangement of the protein molecules and is likely to be the generator of the structural modulation. In particular, whenever the stacked Hyp-1 molecules are found closer together there is an ANS molecule bridging them.
pathogenesis-related class 10 protein; St John’s wort; Hypericum perforatum; 8-anilino-1-naphthalene sulfonate
Clustered regularly interspaced short palindromic repeats (CRISPRs) are essential components of RNA-guided adaptive immune systems that protect bacteria and archaea from viruses and plasmids. In Escherichia coli, short CRISPR-derived RNAs (crRNAs) assemble into a 405 kDa multi-subunit surveillance complex called Cascade (CRISPR-associated complex for antiviral defense). Here we present the 3.24 Å resolution x-ray crystal structure of Cascade. Eleven proteins and a 61-nucleotide crRNA assemble into a sea-horse-shaped architecture that binds double-stranded DNA targets complementary to the crRNA-guide sequence. Conserved sequences on the 3′- and 5′-ends of the crRNA are anchored by proteins at opposite ends of the complex, while the guide sequence is displayed along a helical assembly of six interwoven subunits that present 5-nucleotide segments of the crRNA in pseudo A-form configuration. The structure of Cascade suggests a mechanism for assembly and provides insights into the mechanisms of target recognition.
Predicted structures submitted for CASP10 have been evaluated as molecular replacement models against the corresponding sets of structure factor amplitudes. It has been found that the log-likelihood gain score computed for each prediction correlates well with common structure quality indicators but is more sensitive when the accuracy of the models is high. In addition, it was observed that using coordinate error estimates submitted by predictors to weight the model can improve its utility in molecular replacement dramatically, and several groups have been identified who reliably provide accurate error estimates that could be used to extend the application of molecular replacement for low-homology cases.
• Error estimates increase the value of homology models for molecular replacement
• Poorer models with good error estimates trump better models without errors
• A simple protocol creates coordinate error estimates for individual models
• Local coordinate error estimates enable molecular replacement for more targets
Although estimating the size of local errors in homology models is not common practice, Bunkóczi et al. show that the provision of error estimates can have a substantial practical impact in the utility of these models when used to solve crystal structures by molecular replacement.
The treatment of many diseases such as cancer requires the use of drugs that can cause severe side effects. Off-target toxicity can often be reduced simply by directing the drugs specifically to sites of diseases. Amidst increasingly sophisticated methods of targeted drug delivery, we observed that Nature has already evolved elegant means of sending biological molecules to where they are needed. One such example is corticosteroid binding globulin (CBG), the major carrier of the anti-inflammatory hormone, cortisol. Targeted release of cortisol is triggered by cleavage of CBG's reactive centre loop by elastase, a protease released by neutrophils in inflamed tissues. This work aimed to establish the feasibility of exploiting this mechanism to carry therapeutic agents to defined locations. The reactive centre loop of CBG was altered with site-directed mutagenesis to favour cleavage by other proteases, to alter the sites at which it would release its cargo. Mutagenesis succeeded in making CBG a substrate for either prostate specific antigen (PSA), a prostate-specific serine protease, or thrombin, a key protease in the blood coagulation cascade. PSA is conspicuously overproduced in prostatic hyperplasia and is, therefore, a good way of targeting hyperplastic prostate tissues. Thrombin is released during clotting and consequently is ideal for conferring specificity to thrombotic sites. Using fluorescence-based titration assays, we also showed that CBG can be engineered to bind a new compound, thyroxine-6-carboxyfluorescein, instead of its physiological ligand, cortisol, thereby demonstrating that it is possible to tailor the hormone binding site to deliver a therapeutic drug. In addition, we proved that the efficiency with which CBG releases bound ligand can be increased by introducing some well-placed mutations. 
This proof-of-concept study has raised the prospect of a novel means of targeted drug delivery, using the serpin conformational change to combat the problem of off-target effects in the treatment of diseases.
The hormone thyroxine that regulates mammalian metabolism is carried and stored in the blood by thyroxine-binding globulin (TBG). We demonstrate here that the release of thyroxine from TBG occurs by a temperature-sensitive mechanism and show how this will provide a homoeostatic adjustment of the concentration of thyroxine to match metabolic needs, as with the hypothermia and torpor of small animals. In humans, a rise in temperature, as in infections, will trigger an accelerated release of thyroxine, resulting in a predictable 23% increase in the concentration of free thyroxine at 39°C. The in vivo relevance of this fever-response is affirmed in an environmental adaptation in aboriginal Australians. We show how two mutations incorporated in their TBG interact in a way that will halve the surge in thyroxine release, and hence the boost in metabolic rate that would otherwise occur as body temperatures exceed 37°C. The overall findings open insights into physiological changes that accompany variations in body temperature, as notably in fevers.
thyroxine; thyroxine-binding globulin; aboriginal Australian; febrile convulsions; hypothermia; hibernation
The solvent-picking procedure in phenix.refine has been extended and combined with Phaser anomalous substructure completion and analysis of coordination geometry to identify and place elemental ions.
Many macromolecular model-building and refinement programs can automatically place solvent atoms in electron density at moderate-to-high resolution. This process frequently builds water molecules in place of elemental ions, the identification of which must be performed manually. The solvent-picking algorithms in phenix.refine have been extended to build common ions based on an analysis of the chemical environment as well as physical properties such as occupancy, B factor and anomalous scattering. The method is most effective for heavier elements such as calcium and zinc, for which a majority of sites can be placed with few false positives in a diverse test set of structures. At atomic resolution, it is observed that it can also be possible to identify tightly bound sodium and magnesium ions. A number of challenges that contribute to the difficulty of completely automating the process of structure completion are discussed.
refinement; ions; PHENIX
With the implementation of a molecular-replacement likelihood target that accounts for translational noncrystallographic symmetry, it became possible to solve the crystal structure of a protein with seven tetrameric assemblies arrayed translationally along the c axis. The new algorithm found 56 protein molecules in reduced symmetry (P1), which was used to resolve space-group ambiguity caused by severe twinning.
Translational noncrystallographic symmetry (tNCS) is a pathology of protein crystals in which multiple copies of a molecule or assembly are found in similar orientations. Structure solution is problematic because this breaks the assumptions used in current likelihood-based methods. To cope with such cases, new likelihood approaches have been developed and implemented in Phaser to account for the statistical effects of tNCS in molecular replacement. Using these new approaches, it was possible to solve the crystal structure of a protein exhibiting an extreme form of this pathology with seven tetrameric assemblies arrayed along the c axis. To resolve space-group ambiguities caused by tetartohedral twinning, the structure was initially solved by placing 56 copies of the monomer in space group P1 and using the symmetry of the solution to define the true space group, C2. The resulting structure of Hyp-1, a pathogenesis-related class 10 (PR-10) protein from the medicinal herb St John’s wort, reveals the binding modes of the fluorescent probe 8-anilino-1-naphthalene sulfonate (ANS), providing insight into the function of the protein in binding or storing hydrophobic ligands.
maximum likelihood; translational noncrystallographic symmetry; molecular replacement; commensurate modulation; pseudo-symmetry
A software system for automated protein–ligand crystallography has been implemented in the Phenix suite. This significantly reduces the manual effort required in high-throughput crystallographic studies.
High-throughput drug-discovery and mechanistic studies often require the determination of multiple related crystal structures that only differ in the bound ligands, point mutations in the protein sequence and minor conformational changes. If performed manually, solution and refinement requires extensive repetition of the same tasks for each structure. To accelerate this process and minimize manual effort, a pipeline encompassing all stages of ligand building and refinement, starting from integrated and scaled diffraction intensities, has been implemented in Phenix. The resulting system is able to successfully solve and refine large collections of structures in parallel without extensive user intervention prior to the final stages of model completion and validation.
protein–ligand complexes; automation; crystallographic structure solution and refinement
A procedure for model building is described that combines morphing a model to match a density map, trimming the morphed model and aligning the model to a sequence.
A procedure termed ‘morphing’ for improving a model after it has been placed in the crystallographic cell by molecular replacement has recently been developed. Morphing consists of applying a smooth deformation to a model to make it match an electron-density map more closely. Morphing does not change the identities of the residues in the chain, only their coordinates. Consequently, if the true structure differs from the working model by containing different residues, these differences cannot be corrected by morphing. Here, a procedure that helps to address this limitation is described. The goal of the procedure is to obtain a relatively complete model that has accurate main-chain atomic positions and residues that are correctly assigned to the sequence. Residues in a morphed model that do not match the electron-density map are removed. Each segment of the resulting trimmed morphed model is then assigned to the sequence of the molecule using information about the connectivity of the chains from the working model and from connections that can be identified from the electron-density map. The procedure was tested by application to a recently determined structure at a resolution of 3.2 Å and was found to increase the number of correctly identified residues in this structure from the 88 obtained using phenix.resolve sequence assignment alone (Terwilliger, 2003) to 247 of a possible 359. Additionally, the procedure was tested by application to a series of templates with sequence identities to a target structure ranging between 7 and 36%. The mean fraction of correctly identified residues in these cases was increased from 33% using phenix.resolve sequence assignment to 47% using the current procedure. The procedure is simple to apply and is available in the Phenix software package.
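The trimming step, in which poorly fitting residues are removed and the survivors fall into separate segments, can be sketched as follows. This is a hypothetical illustration and not the Phenix implementation: the per-residue map correlation input, the function name `trim_into_segments` and the cutoff of 0.5 are all assumptions.

```python
def trim_into_segments(residue_cc, cutoff=0.5):
    """Drop poorly fitting residues and group the rest into segments.

    residue_cc maps residue number -> model-map correlation (assumed
    input). Residues below the cutoff are removed; the remaining
    residues are returned as (first, last) ranges of consecutive
    residue numbers.
    """
    segments = []
    current = []
    for resnum in sorted(residue_cc):
        keep = residue_cc[resnum] >= cutoff
        contiguous = bool(current) and resnum == current[-1] + 1
        if keep and (contiguous or not current):
            current.append(resnum)
        else:
            # Close the open segment at a removed residue or a gap.
            if current:
                segments.append((current[0], current[-1]))
            current = [resnum] if keep else []
    if current:
        segments.append((current[0], current[-1]))
    return segments


# Hypothetical correlations: residue 3 fits poorly, residue 6 is absent.
example = {1: 0.8, 2: 0.7, 3: 0.2, 4: 0.9, 5: 0.6, 7: 0.9}
segments = trim_into_segments(example)
```

Each resulting segment would then be a candidate for independent assignment to the sequence, which is the stage where connectivity information from the working model and the map comes into play.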
morphing; model building; sequence assignment; model–map correlation; loop-building
The functionality of the molecular-replacement pipeline phaser.MRage is introduced and illustrated with examples.
Phaser.MRage is a molecular-replacement automation framework that implements a full model-generation workflow and provides several layers of model exploration to the user. It is designed to handle a large number of models and can distribute calculations efficiently onto parallel hardware. In addition, phaser.MRage can identify correct solutions and use this information to accelerate the search. Firstly, it can quickly score all alternative models of a component once a correct solution has been found. Secondly, it can perform extensive analysis of identified solutions to find protein assemblies and can employ assembled models for subsequent searches. Thirdly, it is able to use a priori assembly information (derived from, for example, homologues) to speculatively place and score molecules, thereby customizing the search procedure to a certain class of protein molecule (for example, antibodies) and incorporating additional biological information into molecular replacement.
molecular replacement; pipeline; automation; phaser.MRage
A function for estimating the effective root-mean-square deviation in coordinates between two proteins has been developed that depends on both the sequence identity and the size of the protein and is optimized for use with molecular replacement in Phaser. A top peak translation-function Z-score of over 8 is found to be a reliable metric of when molecular replacement has succeeded.
The estimate of the root-mean-square deviation (r.m.s.d.) in coordinates between the model and the target is an essential parameter for calibrating likelihood functions for molecular replacement (MR). Good estimates of the r.m.s.d. lead to good estimates of the variance term in the likelihood functions, which increases signal to noise and hence success rates in the MR search. Phaser has hitherto used an estimate of the r.m.s.d. that only depends on the sequence identity between the model and target and which was not optimized for the MR likelihood functions. Variance-refinement functionality was added to Phaser to enable determination of the effective r.m.s.d. that optimized the log-likelihood gain (LLG) for a correct MR solution. Variance refinement was subsequently performed on a database of over 21 000 MR problems that sampled a range of sequence identities, protein sizes and protein fold classes. Success was monitored using the translation-function Z-score (TFZ), where a TFZ of 8 or over for the top peak was found to be a reliable indicator that MR had succeeded for these cases with one molecule in the asymmetric unit. Good estimates of the r.m.s.d. are correlated with the sequence identity and the protein size. A new estimate of the r.m.s.d. that uses these two parameters in a function optimized to fit the mean of the refined variance is implemented in Phaser and improves MR outcomes. Perturbing the initial estimate of the r.m.s.d. from the mean of the distribution in steps of standard deviations of the distribution further increases MR success rates.
Phaser; maximum likelihood; molecular replacement
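Two ideas from this summary lend themselves to a short sketch: perturbing the initial r.m.s.d. estimate in steps of standard deviations, and accepting a solution when the top translation-function Z-score is 8 or over. The sketch below does not use Phaser and does not reproduce its r.m.s.d. function, whose form is not given here; the function names, the number of steps and the example mean and standard deviation are assumptions for illustration only.

```python
from typing import List


def rmsd_trials(mean_rmsd: float, sigma: float, n_steps: int = 2) -> List[float]:
    """List trial r.m.s.d. values: the mean estimate first, then values
    offset by +/- k standard deviations (k = 1..n_steps).
    Non-positive values are skipped because an r.m.s.d. must be > 0."""
    trials = [mean_rmsd]
    for k in range(1, n_steps + 1):
        for candidate in (mean_rmsd - k * sigma, mean_rmsd + k * sigma):
            if candidate > 0.0:
                trials.append(candidate)
    return trials


def mr_succeeded(top_tfz: float, threshold: float = 8.0) -> bool:
    """Apply the success criterion quoted in the text: a top-peak TFZ of
    8 or over (reported for cases with one molecule in the asymmetric unit)."""
    return top_tfz >= threshold


# Hypothetical numbers: mean estimate 1.0 A with a spread of 0.25 A.
trial_values = rmsd_trials(1.0, 0.25)
```

In such a scheme each trial value would seed a separate search, and the search would stop at the first trial for which `mr_succeeded` returns true.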
A genetic algorithm has been developed to optimize the phases of the strongest reflections in SIR/SAD data. This is shown to facilitate density modification and model building in several test cases.
Experimental phasing of diffraction data from macromolecular crystals involves deriving phase probability distributions. These distributions are often bimodal, making their weighted average, the centroid phase, improbable, so that electron-density maps computed using centroid phases are often non-interpretable. Density modification brings in information about the characteristics of electron density in protein crystals. In successful cases, this allows a choice between the modes in the phase probability distributions, and the maps can cross the borderline between non-interpretable and interpretable. Based on the suggestions by Vekhter [Vekhter (2005), Acta Cryst. D61, 899–902], the impact of identifying optimized phases for a small number of strong reflections prior to the density-modification process was investigated while using the centroid phase as a starting point for the remaining reflections. A genetic algorithm was developed that optimizes the quality of such phases using the skewness of the density map as a target function. Phases optimized in this way are then used in density modification. In most of the tests, the resulting maps were of higher quality than maps generated from the original centroid phases. In one of the test cases, the new method sufficiently improved a marginal set of experimental SAD phases to enable successful map interpretation. A computer program, SISA, has been developed to apply this method for phase improvement in macromolecular crystallography.
experimental phasing; density modification; genetic algorithms
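The idea of optimizing the phases of a few strong reflections against map skewness can be sketched on a one-dimensional toy problem. This is not the SISA program: the 1D density synthesis, the population size, the mutation and crossover operators and all function names below are assumptions chosen to keep the example small. Only the phases of the strongest reflections are varied; the remaining phases stay at their starting values, as in the scheme described above.

```python
import math
import random


def skewness(values):
    """Third standardized moment of a list of density values."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


def density_1d(amplitudes, phases, n_grid=64):
    """Toy 1D map: rho(x) = sum_h |F_h| cos(2*pi*h*x - phi_h), h = 1..N."""
    return [
        sum(a * math.cos(2 * math.pi * (h + 1) * x / n_grid - p)
            for h, (a, p) in enumerate(zip(amplitudes, phases)))
        for x in range(n_grid)
    ]


def optimize_strong_phases(amplitudes, start_phases, n_strong=3,
                           pop_size=20, generations=30, seed=0):
    """Genetic search over the phases of the n_strong largest amplitudes,
    maximizing map skewness. Returns (best_phases, best_skewness)."""
    rng = random.Random(seed)
    strong = sorted(range(len(amplitudes)),
                    key=lambda i: -amplitudes[i])[:n_strong]

    def score(phases):
        return skewness(density_1d(amplitudes, phases))

    def mutate(phases):
        child = list(phases)
        child[rng.choice(strong)] = rng.uniform(0.0, 2.0 * math.pi)
        return child

    def crossover(a, b):
        child = list(a)
        for i in strong:
            if rng.random() < 0.5:
                child[i] = b[i]
        return child

    # The starting phases are kept in the population, so the result
    # can never score worse than the starting point.
    population = [list(start_phases)]
    population += [mutate(start_phases) for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[:pop_size // 2]      # elitist selection
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    best = max(population, key=score)
    return best, score(best)


# Hypothetical amplitudes and starting (centroid-like) phases in radians.
amps = [10.0, 8.0, 1.0, 0.5, 6.0]
start = [0.3, 2.0, 1.0, 4.0, 5.0]
best_phases, best_skew = optimize_strong_phases(amps, start)
```

In a real application the optimized phases would be passed on to density modification; here the result is only a set of phases whose toy map is at least as skewed as the starting map.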
BTZ043, a tuberculosis drug candidate with nanomolar whole-cell activity, targets the DprE1 subunit of the essential decaprenylphosphoryl-β-D-ribofuranose-2′-epimerase, thus blocking biosynthesis of arabinans, vital cell-wall components of mycobacteria. Crystal structures of DprE1, in its native form and in complex with BTZ043, unambiguously reveal formation of a semimercaptal adduct between the drug and an active-site cysteine, as well as contacts to a neighbouring catalytic lysine residue. Kinetic studies confirm BTZ043 as a mechanism-based, covalent inhibitor. This explains the exquisite potency of BTZ043, which, when fluorescently labelled, localizes DprE1 at the poles of growing bacteria. Menaquinone can reoxidize the FAD cofactor in DprE1 and may be the natural electron acceptor for this reaction in the cell. Our structural and kinetic analysis provides both insight into a critical epimerization reaction and a platform for structure-based design of improved inhibitors. Surprisingly, given the colossal tuberculosis burden globally, BTZ043 is the only new drug candidate to have been co-crystallized with its target.