Algorithms for evaluating and optimizing the useful anomalous correlation and the anomalous signal in a SAD data set are described.
A key challenge in the SAD phasing method is solving a structure when the anomalous signal-to-noise ratio is low. Here, algorithms and tools for evaluating and optimizing the useful anomalous correlation and the anomalous signal in a SAD experiment are described. A simple theoretical framework [Terwilliger et al. (2016), Acta Cryst. D72, 346–358] is used to develop methods for planning a SAD experiment, scaling SAD data sets and estimating the useful anomalous correlation and anomalous signal in a SAD data set. The phenix.plan_sad_experiment tool uses a database of solved and unsolved SAD data sets and the expected characteristics of a SAD data set to estimate the probability that the anomalous substructure will be found in the SAD experiment and the expected map quality that would be obtained if the substructure were found. The phenix.scale_and_merge tool scales unmerged SAD data from one or more crystals using local scaling and optimizes the anomalous signal by identifying the systematic differences among data sets. The phenix.anomalous_signal tool estimates the useful anomalous correlation and anomalous signal after collecting SAD data and estimates the probability that the data set can be solved and the likely figure of merit of phasing.
SAD phasing; anomalous signal; experimental design
The useful anomalous correlation and the anomalous signal in a SAD experiment are metrics describing the accuracy of the data and the total information content in a SAD data set and are shown to be related to the probability of solving the anomalous substructure and the quality of the initial phases.
A key challenge in the SAD phasing method is solving a structure when the anomalous signal-to-noise ratio is low. A simple theoretical framework for describing measurements of anomalous differences and the resulting useful anomalous correlation and anomalous signal in a SAD experiment is presented. Here, the useful anomalous correlation is defined as the correlation of anomalous differences with ideal anomalous differences from the anomalous substructure. The useful anomalous correlation reflects the accuracy of the data and the absence of minor sites. The useful anomalous correlation also reflects the information available for estimating crystallographic phases once the substructure has been determined. In contrast, the anomalous signal (the peak height in a model-phased anomalous difference Fourier at the coordinates of atoms in the anomalous substructure) reflects the information available about each site in the substructure and is related to the ability to find the substructure. A theoretical analysis shows that the expected value of the anomalous signal is the product of the useful anomalous correlation, the square root of the ratio of the number of unique reflections in the data set to the number of sites in the substructure, and a function that decreases with increasing values of the atomic displacement factor for the atoms in the substructure. This means that the ability to find the substructure in a SAD experiment is increased by high data quality and by a high ratio of reflections to sites in the substructure, and is decreased by high atomic displacement factors for the substructure.
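The product relationship described above can be illustrated with a minimal sketch. The function name and the `b_term` argument are assumptions made for illustration: the abstract states only that the third factor decreases with increasing atomic displacement factor, so its functional form is left to the caller here and is not taken from the paper.

```python
import math


def expected_anomalous_signal(useful_cc_ano, n_reflections, n_sites, b_term=1.0):
    """Estimate the expected anomalous signal as a product of three factors.

    useful_cc_ano : useful anomalous correlation of the data set
    n_reflections : number of unique reflections in the data set
    n_sites       : number of sites in the anomalous substructure
    b_term        : hypothetical stand-in (0 < b_term <= 1) for the function
                    that decreases with the atomic displacement factor
    """
    if n_reflections <= 0 or n_sites <= 0:
        raise ValueError("n_reflections and n_sites must be positive")
    if not 0.0 < b_term <= 1.0:
        raise ValueError("b_term is assumed to lie in (0, 1]")
    return useful_cc_ano * math.sqrt(n_reflections / n_sites) * b_term


# Same data quality, but four times as many sites: the estimate is halved.
few_sites = expected_anomalous_signal(0.3, 10000, 4)
many_sites = expected_anomalous_signal(0.3, 10000, 16)
```

With these illustrative inputs the estimate falls by half when the number of sites quadruples, which is the square-root dependence on the ratio of reflections to sites.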
SAD phasing; anomalous signal; anomalous phasing; solving structures
Advances in high resolution electron cryomicroscopy (cryo-EM) have been accompanied by the development of validation metrics to independently assess map quality and model geometry. EMRinger assesses the precise fitting of an atomic model into the map during refinement and shows how radiation damage alters scattering from negatively charged amino acids. EMRinger will be useful for monitoring progress in resolving and modeling high-resolution features in cryo-EM.
SERK proteins play a central role in immune and developmental signaling pathways in plants. Structural studies have been performed in order to better understand the role of the OsSERK2 coreceptor in signaling with its partner receptors. Here, crystal structures of the LRR domains of OsSERK2 and a D128N OsSERK2 mutant, expressed as hagfish variable lymphocyte receptor (VLR) fusions, are reported.
Somatic embryogenesis receptor kinases (SERKs) are leucine-rich repeat (LRR)-containing integral membrane receptors that are involved in the regulation of development and immune responses in plants. It has recently been shown that rice SERK2 (OsSERK2) is essential for XA21-mediated resistance to the pathogen Xanthomonas oryzae pv. oryzae. OsSERK2 is also required for the BRI1-mediated, FLS2-mediated and EFR-mediated responses to brassinosteroids, flagellin and elongation factor Tu (EF-Tu), respectively. Here, crystal structures of the LRR domains of OsSERK2 and a D128N OsSERK2 mutant, expressed as hagfish variable lymphocyte receptor (VLR) fusions, are reported. These structures suggest that the aspartate mutation does not generate any significant conformational change in the protein, but instead leads to an altered interaction with partner receptors.
SERKs; leucine-rich repeats; OsSERK2
Lignin poses a major challenge in the processing of plant biomass for agro-industrial applications. For bioengineering purposes, there is a pressing interest in identifying and characterizing the enzymes responsible for the biosynthesis of lignin. Hydroxycinnamoyl-CoA:shikimate hydroxycinnamoyl transferase (HCT; EC 2.3.1.133) is a key metabolic entry point for the synthesis of the most important lignin monomers: coniferyl and sinapyl alcohols. In this study, we investigated the substrate promiscuity of HCT from a bryophyte (Physcomitrella) and from five representatives of vascular plants (Arabidopsis, poplar, switchgrass, pine and Selaginella) using a yeast expression system. We demonstrate for these HCTs a conserved capacity to acylate with p-coumaroyl-CoA several phenolic compounds in addition to the canonical acceptor shikimate normally used during lignin biosynthesis. Using either recombinant HCT from switchgrass (PvHCT2a) or an Arabidopsis stem protein extract, we show evidence of the inhibitory effect of these phenolics on the synthesis of p-coumaroyl shikimate in vitro, which presumably occurs via a mechanism of competitive inhibition. A structural study of PvHCT2a confirmed the binding of a non-canonical acceptor in a similar manner to shikimate in the active site of the enzyme. Finally, we exploited in Arabidopsis the substrate flexibility of HCT to reduce lignin content and improve biomass saccharification by engineering transgenic lines that overproduce one of the HCT non-canonical acceptors. Our results demonstrate conservation of HCT substrate promiscuity and provide support for a new strategy for lignin reduction in the effort to improve the quality of plant biomass for forage and cellulosic biofuels.
Arabidopsis; Bioenergy; Cell wall; HCT; Lignin; Saccharification
The default geometry restraints used in Phenix for the protein backbone have been upgraded to account for the known conformation-dependencies of bond angles and lengths.
Chemical restraints are a fundamental part of crystallographic protein structure refinement. In response to mounting evidence that conventional restraints have shortcomings, it has previously been documented that using backbone restraints that depend on the protein backbone conformation helps to address these shortcomings and improves the performance of refinements [Moriarty et al. (2014), FEBS J. 281, 4061–4071]. It is important that these improvements be made available to all in the protein crystallography community. Toward this end, a change in the default geometry library used by Phenix is described here. Tests are presented showing that this change will not generate increased numbers of outliers during validation, or deposition in the Protein Data Bank, during the transition period in which some validation tools still use the conventional restraint libraries.
covalent geometry restraints; crystallographic refinement; protein structure; validation; Phenix
Carbohydrate binding modules (CBMs) bind polysaccharides and help target glycoside hydrolase catalytic domains to their appropriate carbohydrate substrates. To better understand how CBMs can improve cellulolytic enzyme reactivity, representatives from each of the 18 families of CBM found in Ruminoclostridium thermocellum were fused to the multifunctional GH5 catalytic domain of CelE (Cthe_0797, CelEcc), which can hydrolyze numerous types of polysaccharides including cellulose, mannan, and xylan. Since CelE is a cellulosomal enzyme, none of these fusions to a CBM previously existed.
CelEcc_CBM fusions were assayed for their ability to hydrolyze cellulose, lichenan, xylan, and mannan. Several CelEcc_CBM fusions showed enhanced hydrolytic activity with different substrates relative to the fusion to CBM3a from the cellulosome scaffoldin, which has high affinity for binding to crystalline cellulose. Additional binding studies and quantitative catalysis studies using nanostructure-initiator mass spectrometry (NIMS) were carried out with the CBM3a, CBM6, CBM30, and CBM44 fusion enzymes. In general, and consistent with observations of others, enhanced enzyme reactivity was correlated with moderate binding affinity of the CBM. Numerical analysis of reaction time courses showed that CelEcc_CBM44, a combination of a multifunctional enzyme domain with a CBM having broad binding specificity, gave the fastest rates for hydrolysis of both the hexose and pentose fractions of ionic-liquid pretreated switchgrass.
We have shown that fusions of different CBMs to a single multifunctional GH5 catalytic domain can increase its rate of reaction with different pure polysaccharides and with pretreated biomass. This fusion approach, incorporating domains with broad specificity for binding and catalysis, provides a new avenue to improve reactivity of simple combinations of enzymes within the complexity of plant biomass.
Electronic supplementary material
The online version of this article (doi:10.1186/s13068-015-0402-0) contains supplementary material, which is available to authorized users.
Cellulase; Xylanase; Hemicellulase; Mannanase; Carbohydrate binding module; Ruminoclostridium thermocellum; Enzyme engineering; Biofuels; Mass spectrometry; Kinetic analysis
Abnormal expression or mutations in Ras proteins have been found in up to 30% of cancer cell types, making them excellent protein models to probe structure-function relationships of cell-signaling processes that mediate cell transformation. Yet, there has been very little development of therapies to help tackle Ras-related diseased states. The development of small molecules to target Ras proteins to potentially inhibit abnormal Ras-stimulated cell signaling has been conceptualized and some progress has been made over the last 16 or so years. Here, we briefly review studies characterizing Ras protein-small molecule interactions to show the importance and potential that these small molecules may have for Ras-related drug discovery. We summarize recent results, highlighting small molecules that can be directly targeted to Ras using Structure-Based Drug Design (SBDD) and Fragment-Based Lead Discovery (FBLD) methods. The inactivation of Ras oncogenic signaling in vitro by small molecules is currently an attractive hurdle to try to leap over in order to attack the oncogenic state. In this regard, important features of previously characterized properties of small molecule Ras targets, as well as a current understanding of conformational and dynamics changes seen for Ras-related mutants, relative to wild type, must be taken into account as newer small molecule design strategies towards Ras are developed.
Ras [Rat Sarcoma]; Small Molecule Target; Structure-Based Drug Design; Fragment-Based Drug Design; GTP Hydrolysis; Guanine Nucleotide Exchange Factors [GEF]
The actin filament-binding and filament-severing activities of the aplyronine, kabiramide and reidispongiolide families of marine macrolides are located within the hydrophobic tail region of the molecule. Two synthetic tail analogs of aplyronine C (SF-01 and GC-04) are shown to bind to G-actin with Kd values of 285 ± 33 nM and 132 ± 13 nM, respectively. The crystal structures of actin complexes with GC-04, SF-01 and kabiramide C reveal a conserved mode of tail binding within the cleft that forms between sub-domains (SD) 1 and 3. Our studies support the view that filament severing is brought about by specific binding of the tail region to the SD1/3 cleft on the upper protomer, which displaces loop-D from the lower protomer on the same half-filament. With previous studies showing that the GC-04 analog can sever actin filaments, it is argued that the shorter complex lifetime of tail analogs with F-actin would make them more effective at severing filaments compared with plasma gelsolin. Structure-based analyses are used to suggest more reactive or targetable forms of GC-04 and SF-01, which may serve to boost the capacity of the serum actin scavenging system, to generate antibody conjugates against tumor cell antigens, and to reduce sputum viscosity in children with cystic fibrosis.
Actin; macrolide analogs; structural biology; chemical synthesis
Ideal values of bond angles and lengths used as external restraints are crucial for the successful refinement of protein crystal structures at all but the highest of resolutions. The restraints in common usage today have been designed based on the assumption that each type of bond or angle has a single ideal value independent of context. However, recent work has shown that the ideal values are, in fact, sensitive to local conformation, and as a first step toward using such information to build more accurate models, ultra-high resolution protein crystal structures have been used to derive a conformation-dependent library (CDL) of restraints for the protein backbone (Berkholz et al. 2009. Structure. 17, 1316). Here, we report the introduction of this CDL into the Phenix package and the results of test refinements of thousands of structures across a wide range of resolutions. These tests show that use of the conformation-dependent library yields models that have substantially better agreement with ideal main-chain bond angles and lengths and, on average, a slightly enhanced fit to the X-ray data. No disadvantages of using the backbone CDL are apparent. In Phenix, use of the CDL can be selected by simply specifying the cdl=True option. This successful implementation paves the way for further aspects of the context-dependence of ideal geometry to be characterized and applied to improve experimental and predictive modelling accuracy.
Protein structure; crystallographic refinement; geometry restraints; ideal geometry; structural genomics
A likelihood-based method for determining the sub-structure of anomalously-scattering atoms in macromolecular crystals can allow successful structure determination by single-wavelength anomalous diffraction (SAD) X-ray analysis with weak anomalous signal. Along with use of partial models and electron density maps in searches for anomalously-scattering atoms, testing of alternative values of parameters, and parallelized automated model-building, this method has the potential for extending the applicability of the SAD method in challenging cases.
A method of simulating X-ray diffuse scattering from multi-model PDB files is presented. Despite similar agreement with Bragg data, different translation–libration–screw refinement strategies produce unique diffuse intensity patterns.
Identifying the intramolecular motions of proteins and nucleic acids is a major challenge in macromolecular X-ray crystallography. Because Bragg diffraction describes the average positional distribution of crystalline atoms with imperfect precision, the resulting electron density can be compatible with multiple models of motion. Diffuse X-ray scattering can reduce this degeneracy by reporting on correlated atomic displacements. Although recent technological advances are increasing the potential to accurately measure diffuse scattering, computational modeling and validation tools are still needed to quantify the agreement between experimental data and different parameterizations of crystalline disorder. A new tool, phenix.diffuse, addresses this need by employing Guinier’s equation to calculate diffuse scattering from Protein Data Bank (PDB)-formatted structural ensembles. As an example case, phenix.diffuse is applied to translation–libration–screw (TLS) refinement, which models rigid-body displacement for segments of the macromolecule. To enable the calculation of diffuse scattering from TLS-refined structures, phenix.tls_as_xyz builds multi-model PDB files that sample the underlying T, L and S tensors. In the glycerophosphodiesterase GpdQ, alternative TLS-group partitioning and different motional correlations between groups yield markedly dissimilar diffuse scattering maps with distinct implications for molecular mechanism and allostery. These methods demonstrate how, in principle, X-ray diffuse scattering could extend macromolecular structural refinement, validation and analysis.
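As a rough illustration of the Guinier-type calculation mentioned above, the sketch below evaluates the diffuse intensity at one reciprocal-lattice point as the ensemble variance of the structure factor, ⟨|F|²⟩ − |⟨F⟩|². It is not the phenix.diffuse implementation: the function names are hypothetical, atoms are treated as point scatterers with unit scattering factors, and coordinates are assumed to be fractional.

```python
import cmath
import math


def structure_factor(coords, hkl):
    """Structure factor of point atoms (unit scattering factors, fractional coordinates)."""
    h, k, l = hkl
    return sum(cmath.exp(2j * math.pi * (h * x + k * y + l * z)) for x, y, z in coords)


def diffuse_intensity(ensemble, hkl):
    """Ensemble variance of the structure factor: <|F|^2> - |<F>|^2."""
    factors = [structure_factor(model, hkl) for model in ensemble]
    n_models = len(factors)
    mean_of_squares = sum(abs(f) ** 2 for f in factors) / n_models
    square_of_mean = abs(sum(factors) / n_models) ** 2
    return mean_of_squares - square_of_mean


# Two-member ensemble in which the second atom shifts slightly along x.
model_a = [(0.0, 0.0, 0.0), (0.25, 0.0, 0.0)]
model_b = [(0.0, 0.0, 0.0), (0.30, 0.0, 0.0)]
varying = diffuse_intensity([model_a, model_b], (1, 0, 0))
static = diffuse_intensity([model_a, model_a], (1, 0, 0))
```

An ensemble of identical models gives no diffuse intensity, while any displacement between members gives a positive value. This is why ensembles that agree equally well with the Bragg data can still differ in their predicted diffuse patterns.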
diffuse scattering; TLS; correlated motion; structural ensemble; structure refinement
Details are described of the calculation of new parallelity restraints recently introduced in cctbx and PHENIX.
Improvements in structural biology methods, in particular crystallography and cryo-electron microscopy, have created an increased demand for the refinement of atomic models against low-resolution experimental data. One way to compensate for the lack of high-resolution experimental data is to use a priori information about model geometry that can be utilized in refinement in the form of stereochemical restraints or constraints. Here, the definition and calculation of the restraints that can be imposed on planar atomic groups, in particular the angle between such groups, are described. Detailed derivations of the restraint targets and their gradients are provided so that they can be readily implemented in other contexts. Practical implementations of the restraints, and of associated data structures, in the Computational Crystallography Toolbox (cctbx) are presented.
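The quantity being restrained, the angle between two planar atomic groups, can be sketched as follows. This is an illustrative calculation only, not the cctbx implementation or the restraint target derived in the paper. The function names are hypothetical, and each plane normal is estimated with Newell's method, which assumes the atoms of a group are listed in order around a ring.

```python
import math


def plane_normal(points):
    """Unit normal of an ordered, roughly planar ring of points (Newell's method)."""
    nx = ny = nz = 0.0
    count = len(points)
    for i in range(count):
        x1, y1, z1 = points[i]
        x2, y2, z2 = points[(i + 1) % count]
        nx += (y1 - y2) * (z1 + z2)
        ny += (z1 - z2) * (x1 + x2)
        nz += (x1 - x2) * (y1 + y2)
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        raise ValueError("points are degenerate; no plane normal can be defined")
    return (nx / length, ny / length, nz / length)


def interplanar_angle(group_a, group_b):
    """Angle in degrees (0 to 90) between the planes of two atom groups."""
    na = plane_normal(group_a)
    nb = plane_normal(group_b)
    # The absolute value makes the result independent of the sign of each normal.
    cosine = abs(sum(a * b for a, b in zip(na, nb)))
    return math.degrees(math.acos(min(1.0, cosine)))


ring_xy = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
ring_xy_stacked = [(0, 0, 3.4), (1, 0, 3.4), (1, 1, 3.4), (0, 1, 3.4)]
ring_xz = [(0, 0, 0), (1, 0, 0), (1, 0, 1), (0, 0, 1)]
stacked_angle = interplanar_angle(ring_xy, ring_xy_stacked)
perpendicular_angle = interplanar_angle(ring_xy, ring_xz)
```

A restraint would then penalize the deviation of this angle from a target value. The specific target function and its gradients are those given in the paper and are not reproduced here.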
restraints; atomic model refinement; parallel planes; cctbx; PHENIX; gradient calculation
Harnessing the biotechnological potential of the large number of proteins available in sequence databases requires scalable methods for functional characterization. Here we propose a workflow to address this challenge by combining phylogenomic-guided DNA synthesis with high-throughput mass spectrometry and apply it to the systematic characterization of GH1 β-glucosidases, a family of enzymes necessary for biomass hydrolysis, an important step in the conversion of lignocellulosic feedstocks to fuels and chemicals. We synthesized and expressed 175 GH1s, selected from over 2000 candidate sequences to cover maximum sequence diversity. These enzymes were functionally characterized over a range of temperatures and pHs using nanostructure-initiator mass spectrometry (NIMS), generating over 10,000 data points. When combined with HPLC-based sugar profiling, we observed GH1 enzymes active over a broad temperature range and toward many different β-linked disaccharides. For some GH1s we also observed activity toward laminarin, a more complex oligosaccharide present as a major component of macroalgae. An area of particular interest was the identification of GH1 enzymes compatible with the ionic liquid 1-ethyl-3-methylimidazolium acetate ([C2mim][OAc]), a next-generation biomass pretreatment technology. We thus searched for GH1 enzymes active at 70 °C and 20% (v/v) [C2mim][OAc] over the course of a 24-h saccharification reaction. Using our unbiased approach, we identified multiple enzymes of different phylogenetic origin with such activities. Our approach of characterizing sequence diversity through targeted gene synthesis coupled to high-throughput screening technologies is a broadly applicable paradigm for a wide range of biological problems.
Branched five carbon (C5) alcohols are attractive targets for microbial production due to their desirable fuel properties and importance as platform chemicals. In this study, we engineered a heterologous isoprenoid pathway in E. coli for the high-yield production of 3-methyl-3-buten-1-ol, 3-methyl-2-buten-1-ol, and 3-methyl-1-butanol, three C5 alcohols that serve as potential biofuels. We first constructed a pathway for 3-methyl-3-buten-1-ol, where metabolite profiling identified NudB, a promiscuous phosphatase, as a likely pathway bottleneck. We achieved a 60% increase in the yield of 3-methyl-3-buten-1-ol by engineering the Shine-Dalgarno sequence of nudB, which increased protein levels by 9-fold and reduced isopentenyl diphosphate (IPP) accumulation by 4-fold. To further optimize the pathway, we adjusted mevalonate kinase (MK) expression and investigated MK enzymes from alternative microbes such as Methanosarcina mazei. Next, we expressed a fusion protein of IPP isomerase and the phosphatase (Idi1~NudB) along with a reductase (NemA) to diversify production to 3-methyl-2-buten-1-ol and 3-methyl-1-butanol. Finally, we used an oleyl alcohol overlay to improve alcohol recovery, achieving final titers of 2.23 g/L of 3-methyl-3-buten-1-ol (~70% of pathway-dependent theoretical yield), 150 mg/L of 3-methyl-2-buten-1-ol, and 300 mg/L of 3-methyl-1-butanol.
Biological sensors can be engineered to measure a wide range of environmental conditions. Here we show that statistical analysis of DNA from natural microbial communities can be used to accurately identify environmental contaminants, including uranium and nitrate at a nuclear waste site. In addition to contamination, sequence data from the 16S rRNA gene alone can quantitatively predict a rich catalogue of 26 geochemical features collected from 93 wells with highly differing geochemistry characteristics. We extend this approach to identify sites contaminated with hydrocarbons from the Deepwater Horizon oil spill, finding that altered bacterial communities encode a memory of prior contamination, even after the contaminants themselves have been fully degraded. We show that the bacterial strains that are most useful for detecting oil and uranium are known to interact with these substrates, indicating that this statistical approach uncovers ecologically meaningful interactions consistent with previous experimental observations. Future efforts should focus on evaluating the geographical generalizability of these associations. Taken as a whole, these results indicate that ubiquitous, natural bacterial communities can be used as in situ environmental sensors that respond to and capture perturbations caused by human impacts. These in situ biosensors rely on environmental selection rather than directed engineering, and so this approach could be rapidly deployed and scaled as sequencing technology continues to become faster, simpler, and less expensive.
Here we show that DNA from natural bacterial communities can be used as a quantitative biosensor to accurately distinguish unpolluted sites from those contaminated with uranium, nitrate, or oil. These results indicate that bacterial communities can be used as environmental sensors that respond to and capture perturbations caused by human impacts.
A method to automatically identify possible elemental ions in X-ray crystal structures has been extended to use support vector machine (SVM) classifiers trained on selected structures in the PDB, with significantly improved sensitivity over manually encoded heuristics.
In the process of macromolecular model building, crystallographers must examine electron density for isolated atoms and differentiate sites containing structured solvent molecules from those containing elemental ions. This task requires specific knowledge of metal-binding chemistry and scattering properties and is prone to error. A method has previously been described to identify ions based on manually chosen criteria for a number of elements. Here, the use of support vector machines (SVMs) to automatically classify isolated atoms as either solvent or one of various ions is described. Two data sets of protein crystal structures, one containing manually curated structures deposited with anomalous diffraction data and another with automatically filtered, high-resolution structures, were constructed. On the manually curated data set, an SVM classifier was able to distinguish calcium from manganese, zinc, iron and nickel, as well as all five of these ions from water molecules, with a high degree of accuracy. Additionally, SVMs trained on the automatically curated set of high-resolution structures were able to successfully classify most common elemental ions in an independent validation test set. This method is readily extensible to other elemental ions and can also be used in conjunction with previous methods based on a priori expectations of the chemical environment and X-ray scattering.
elemental ion identification; support vector machines; model building
The non-iterative feature-enhancing approach improves crystallographic maps’ interpretability by reducing model bias and noise and strengthening the existing signal.
A method is presented that modifies a $2mF_{\mathrm{obs}} - DF_{\mathrm{model}}$ $\sigma_A$-weighted map such that the resulting map can strengthen a weak signal, if present, and can reduce model bias and noise. The method consists of first randomizing the starting map and filling in missing reflections using multiple methods. This is followed by restricting the map to regions with convincing density and the application of sharpening. The final map is then created by combining a series of histogram-equalized intermediate maps. In the test cases shown, the maps produced in this way are found to have increased interpretability and decreased model bias compared with the starting $2mF_{\mathrm{obs}} - DF_{\mathrm{model}}$ $\sigma_A$-weighted map.
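The final combination step can be pictured with a small sketch. The abstract does not say how the histogram-equalized maps are combined, so the rank-based equalization and the plain averaging below are assumptions for illustration. The function names are hypothetical, and each map is reduced to a flat list of grid values.

```python
def histogram_equalize(values):
    """Replace each value by its rank fraction in (0, 1], giving a flat histogram."""
    order = sorted(range(len(values)), key=lambda index: values[index])
    count = len(values)
    equalized = [0.0] * count
    for rank, index in enumerate(order):
        equalized[index] = (rank + 1) / count
    return equalized


def combine_maps(maps):
    """Average several equalized maps grid point by grid point (assumed combination rule)."""
    equalized_maps = [histogram_equalize(grid) for grid in maps]
    return [sum(column) / len(column) for column in zip(*equalized_maps)]


# Two intermediate maps on very different scales contribute equally after equalization.
map_one = [0.5, -1.0, 3.0]
map_two = [50.0, -20.0, 10.0]
combined = combine_maps([map_one, map_two])
```

Because equalization keeps only the rank order of the density values, a weak but consistently present feature is not swamped by a few very strong peaks in any single intermediate map.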
Fourier map; map sharpening; map kurtosis; model bias; map improvement; density modification; PHENIX; cctbx; FEM; feature-enhanced map; OMIT
Flexible torsion angle-based NCS restraints have been implemented in phenix.refine, allowing improved model refinement at all resolutions. Rotamer correction and rotamer consistency checks between NCS-related amino-acid side chains further improve the final model quality.
One of the great challenges in refining macromolecular crystal structures is a low data-to-parameter ratio. Historically, knowledge from chemistry has been used to help to improve this ratio. When a macromolecule crystallizes with more than one copy in the asymmetric unit, the noncrystallographic symmetry relationships can be exploited to provide additional restraints when refining the working model. However, although globally similar, NCS-related chains often have local differences. To allow for local differences between NCS-related molecules, flexible torsion-based NCS restraints have been introduced, coupled with intelligent rotamer handling for protein chains, and are available in phenix.refine for refinement of models at all resolutions.
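The idea of restraining torsion angles between NCS-related copies while tolerating genuine local differences can be sketched as below. This is not the phenix.refine implementation: the function names, the weight and the 15° cutoff beyond which a restraint is simply switched off are all hypothetical choices made for illustration.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n2 = _cross(b2, b3)
    x = _dot(_cross(b1, b2), n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


def wrapped_difference(angle_a, angle_b):
    """Signed difference of two angles in degrees, wrapped into [-180, 180)."""
    return (angle_a - angle_b + 180.0) % 360.0 - 180.0


def torsion_ncs_residual(angle_a, angle_b, weight=1.0, cutoff=15.0):
    """Harmonic penalty on matching torsions; zero when the copies truly differ."""
    delta = wrapped_difference(angle_a, angle_b)
    if abs(delta) > cutoff:
        return 0.0  # assumed rule: treat large differences as real and release the restraint
    return weight * delta * delta


cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
```

Working in torsion space means that two chains related by a hinge motion still agree in most of their torsion angles, so only the residues that genuinely differ escape the restraint.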
macromolecular crystallography; noncrystallographic symmetry; NCS; refinement; automation
Special methods are required to interpret sparse diffraction patterns collected from peptide crystals at X-ray free-electron lasers. Bragg spots can be indexed from composite-image powder rings, with crystal orientations then deduced from a very limited number of spot positions.
Still diffraction patterns from peptide nanocrystals with small unit cells are challenging to index using conventional methods owing to the limited number of spots and the lack of crystal orientation information for individual images. New indexing algorithms have been developed as part of the Computational Crystallography Toolbox (cctbx) to overcome these challenges. Accurate unit-cell information derived from an aggregate data set from thousands of diffraction patterns can be used to determine a crystal orientation matrix for individual images with as few as five reflections. These algorithms are potentially applicable not only to amyloid peptides but also to any set of diffraction patterns with sparse properties, such as low-resolution virus structures or high-throughput screening of still images captured by raster-scanning at synchrotron sources. As a proof of concept for this technique, successful integration of X-ray free-electron laser (XFEL) data to 2.5 Å resolution for the amyloid segment GNNQQNY from the Sup35 yeast prion is presented.
XFEL; Sup35 yeast prion; indexing methods; crystallography
Single-structure models derived from X-ray data do not adequately account for the inherent, functionally important dynamics of protein molecules. We generated ensembles of structures by time-averaged refinement, where local molecular vibrations were sampled by molecular-dynamics (MD) simulation whilst global disorder was partitioned into an underlying overall translation–libration–screw (TLS) model. Modeling of 20 protein datasets at 1.1–3.1 Å resolution reduced cross-validated Rfree values by 0.3–4.9%, indicating that ensemble models fit the X-ray data better than single structures. The ensembles revealed that, while most proteins display a well-ordered core, some proteins exhibit a ‘molten core’ likely supporting functionally important dynamics in ligand binding, enzyme activity and protomer assembly. Order–disorder changes in HIV protease indicate a mechanism of entropy compensation for ordering the catalytic residues upon ligand binding by disordering specific core residues. Thus, ensemble refinement extracts dynamical details from the X-ray data that allow a more comprehensive understanding of structure–dynamics–function relationships.
It has been clear since the early days of structural biology in the late 1950s that proteins and other biomolecules are continually changing shape, and that these changes have an important influence on both the structure and function of the molecules. X-ray diffraction can provide detailed information about the structure of a protein, but only limited information about how its structure fluctuates over time. Detailed information about the dynamic behaviour of proteins is essential for a proper understanding of a variety of processes, including catalysis, ligand binding and protein–protein interactions, and could also prove useful in drug design.
Currently most of the X-ray crystal structures in the Protein Data Bank are ‘snap-shots’ with limited or no information about protein dynamics. However, X-ray diffraction patterns are affected by the dynamics of the protein, and also by distortions of the crystal lattice, so three-dimensional (3D) models of proteins ought to take these phenomena into account. Molecular-dynamics (MD) computer simulations transform 3D structures into 4D ‘molecular movies’ by predicting the movement of individual atoms.
Combining MD simulations with crystallographic data has the potential to produce more realistic ensemble models of proteins in which the atomic fluctuations are represented by multiple structures within the ensemble. Moreover, in addition to improved structural information, this process, which is called ensemble refinement, can provide dynamical information about the protein. Earlier attempts to do this ran into problems because the number of model parameters needed was greater than the number of observed data points. Burnley et al. now overcome this problem by modelling local molecular vibrations with MD simulations and, at the same time, using a coarse-grain model to describe global disorder at longer length scales.
Ensemble refinement of high-resolution X-ray diffraction datasets for 20 different proteins from the Protein Data Bank produced a better fit to the data than single structures for all 20 proteins. Ensemble refinement also revealed that 3 of the 20 proteins had a ‘molten core’, rather than the well-ordered core found in most proteins: this is likely to be important in various biological functions including ligand binding, filament formation and enzymatic function. Burnley et al. also showed that an HIV enzyme underwent an order–disorder transition that is likely to influence how this enzyme works, and that similar transitions might influence the interactions between the small-molecule drug Imatinib (also known as Gleevec) and the enzymes it targets. Ensemble refinement could be applied to the majority of crystallography data currently being collected, or collected in the past, so further insights into the properties and interactions of a variety of proteins and other biomolecules can be expected.
protein; crystallography; structure; function; dynamics
The dioxygen we breathe is formed from water by its light-induced oxidation in photosystem II. O2 formation takes place at a catalytic manganese cluster within milliseconds after the photosystem II reaction center is excited by three single-turnover flashes. Here we present combined X-ray emission spectra and diffraction data of 2 flash (2F) and 3 flash (3F) photosystem II samples, and of a transient 3F′ state (250 μs after the third flash), collected under functional conditions using an X-ray free electron laser. The spectra show that the initial O-O bond formation, coupled to Mn-reduction, does not yet occur within 250 μs after the third flash. Diffraction data of all states studied exhibit an anomalous scattering signal from Mn but show no significant structural changes at the present resolution of 4.5 Å. This study represents the initial frames in a molecular movie of the structural changes during the catalytic reaction in photosystem II.
Over the past 10 years, the bioenergy field has realized significant achievements that have encouraged many follow-on efforts centered on biosynthetic production of fuel-like compounds. Key to the success of these efforts has been transformational developments in feedstock characterization and metabolic engineering of biofuel-producing microbes. Lagging far behind these advancements are analytical methods to characterize and quantify systems of interest to the bioenergy field. In particular, the utilization of proteomics, while valuable for identifying novel enzymes and diagnosing problems associated with biofuel-producing microbes, is limited by a lack of robustness and limited throughput. Nano-flow liquid chromatography coupled to high-mass-accuracy, high-resolution mass spectrometers has become the dominant approach for the analysis of complex proteomic samples, yet such assays still require dedicated experts for data acquisition, analysis, and instrument upkeep. The recent adoption of standard flow chromatography (ca. 0.5 mL/min) for targeted proteomics has highlighted the robust nature and increased throughput of this approach for sample analysis. Consequently, we assessed the applicability of standard flow liquid chromatography for shotgun proteomics using samples from Escherichia coli and Arabidopsis thaliana, organisms commonly used as model systems for lignocellulosic biofuels research. Employing 120 min gradients with standard flow chromatography, we were able to routinely identify nearly 800 proteins from E. coli samples, while for samples from Arabidopsis, over 1,000 proteins could be reliably identified. An examination of identified peptides indicated that the method was suitable for reproducible applications in shotgun proteomics. Standard flow liquid chromatography for shotgun proteomics provides a robust approach for the analysis of complex samples.
To the best of our knowledge, this study represents the first attempt to validate the standard flow approach for shotgun proteomics.
proteomics; standard flow chromatography; biofuels; mass spectrometry
Realizing the promise of metabolic engineering has been slowed by challenges related to moving beyond proof-of-concept examples to robust and economically viable systems. Key to advancing metabolic engineering beyond trial-and-error research is access to parts with well-defined performance metrics that can be readily applied in vastly different contexts with predictable effects. As the field now stands, research depends greatly on analytical tools that assay target molecules, transcripts, proteins, and metabolites across different hosts and pathways. Screening technologies yield specific information for many thousands of strain variants, while deep omics analysis provides a systems-level view of the cell factory. Efforts focused on a combination of these analyses yield quantitative information on the dynamic processes between parts and the host chassis that drive the next engineering steps. Overall, the data generated from these types of assays support better decision-making at the design and strain construction stages and so speed progress in metabolic engineering research.
metabolic engineering; RNA-seq; proteomics; metabolomics; high-throughput screening; microfluidics