The general principles behind the macromolecular crystal structure refinement program REFMAC5 are described.
This paper describes various components of the macromolecular crystallographic refinement program REFMAC5, which is distributed as part of the CCP4 suite. REFMAC5 utilizes different likelihood functions depending on the diffraction data employed (amplitudes or intensities), the presence of twinning and the availability of SAD/SIRAS experimental diffraction data. To ensure chemical and structural integrity of the refined model, REFMAC5 offers several classes of restraints and choices of model parameterization. Reliable models at resolutions at least as low as 4 Å can be achieved thanks to low-resolution refinement tools such as secondary-structure restraints, restraints to known homologous structures, automatic global and local NCS restraints, ‘jelly-body’ restraints and the use of novel long-range restraints on atomic displacement parameters (ADPs) based on the Kullback–Leibler divergence. REFMAC5 additionally offers TLS parameterization and, when high-resolution data are available, fast refinement of anisotropic ADPs. Refinement in the presence of twinning is performed in a fully automated fashion. REFMAC5 is a flexible and highly optimized refinement package that is ideally suited for refinement across the entire resolution spectrum encountered in macromolecular crystallography.
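The idea of restraining ADPs through the Kullback–Leibler divergence can be illustrated with a minimal sketch. It assumes two atoms with isotropic displacement parameters and a shared centre, so that each atom is treated as a three-dimensional isotropic Gaussian whose variance is proportional to its B value. The function names, the symmetrized form and the absence of any weighting are assumptions made for illustration and are not taken from REFMAC5.

```python
import math


def kl_isotropic(b1, b2):
    """KL divergence between two 3D isotropic Gaussians with a common centre.

    b1 and b2 are isotropic B values (or any quantity proportional to the
    variance); only their ratio matters.
    """
    ratio = b1 / b2
    return 1.5 * (ratio - 1.0 - math.log(ratio))


def symmetric_kl(b1, b2):
    """Symmetrized divergence: zero for equal B values, growing as they diverge."""
    return kl_isotropic(b1, b2) + kl_isotropic(b2, b1)
```

A penalty of this kind depends on the ratio of the two B values rather than on their absolute difference, which is one reason it may behave sensibly for both well-ordered and mobile regions.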
A description is given of new tools to facilitate model building and refinement into electron cryo-microscopy reconstructions.
The recent rapid development of single-particle electron cryo-microscopy (cryo-EM) now allows structures to be solved by this method at resolutions close to 3 Å. Here, a number of tools to facilitate the interpretation of EM reconstructions with stereochemically reasonable all-atom models are described. The BALBES database has been repurposed as a tool for identifying protein folds from density maps. Modifications to Coot, including new Jiggle Fit and morphing tools and improved handling of nucleic acids, enhance its functionality for interpreting EM maps. REFMAC has been modified for optimal fitting of atomic models into EM maps. As external structural information can enhance the reliability of the derived atomic models, stabilize refinement and reduce overfitting, ProSMART has been extended to generate interatomic distance restraints from nucleic acid reference structures, and a new tool, LIBG, has been developed to generate nucleic acid base-pair and parallel-plane restraints. Furthermore, restraint generation has been integrated with visualization and editing in Coot, and these restraints have been applied to both real-space refinement in Coot and reciprocal-space refinement in REFMAC.
model building; refinement; electron cryo-microscopy reconstructions; LIBG
The Procrustes Structural Matching Alignment and Restraints Tool (ProSMART) has been developed to allow local comparative structural analyses independent of the global conformations and sequence homology of the compared macromolecules. This allows quick and intuitive visualization of the conservation of backbone and side-chain conformations, providing complementary information to existing methods.
The identification and exploration of (dis)similarities between macromolecular structures can help to gain biological insight, for instance when visualizing or quantifying the response of a protein to ligand binding. Obtaining a residue alignment between compared structures is often a prerequisite for such comparative analysis. If the conformational change of the protein is dramatic, conventional alignment methods may struggle to provide an intuitive solution for straightforward analysis. To make such analyses more accessible, the Procrustes Structural Matching Alignment and Restraints Tool (ProSMART) has been developed, which achieves a conformation-independent structural alignment, as well as providing such additional functionalities as the generation of restraints for use in the refinement of macromolecular models. Sensible comparison of protein (or DNA/RNA) structures in the presence of conformational changes is achieved by enforcing neither chain nor domain rigidity. The visualization of results is facilitated by popular molecular-graphics software such as CCP4mg and PyMOL, providing intuitive feedback regarding structural conservation and subtle dissimilarities between close homologues that can otherwise be hard to identify. Automatically generated colour schemes corresponding to various residue-based scores are provided, which allow the assessment of the conservation of backbone and side-chain conformations relative to the local coordinate frame. Structural comparison tools such as ProSMART can help to break the complexity that accompanies the constantly growing pool of structural data into a more readily accessible form, potentially offering biological insight or influencing subsequent experiments.
ProSMART; Procrustes; structural comparison; alignment; external restraints; refinement
Local structural similarity restraints (LSSR) provide a novel method for exploiting NCS or structural similarity to an external target structure. Two examples are given where BUSTER re-refinement of PDB entries with LSSR produces marked improvements, enabling further structural features to be modelled.
Maximum-likelihood X-ray macromolecular structure refinement in BUSTER has been extended with restraints facilitating the exploitation of structural similarity. The similarity can be between two or more chains within the structure being refined, thus favouring NCS, or to a distinct ‘target’ structure that remains fixed during refinement. The local structural similarity restraints (LSSR) approach considers all distances less than 5.5 Å between pairs of atoms in the chain to be restrained. For each, the difference from the distance between the corresponding atoms in the related chain is found. LSSR applies a restraint penalty on each difference. A functional form that reaches a plateau for large differences is used to avoid the restraints distorting parts of the structure that are not similar. Because LSSR are local, there is no need to separate out domains. Some restraint pruning is still necessary, but this has been automated. LSSR have been available to academic users of BUSTER since 2009 with the easy-to-use -autoncs and -target target.pdb options. The use of LSSR is illustrated in the re-refinement of PDB entries 5rnt, where -target enables the correct ligand-binding structure to be found, and 1osg, where -autoncs contributes to the location of an additional copy of the cyclic peptide ligand.
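The LSSR procedure described above can be sketched as follows. This is a minimal illustration, not BUSTER's implementation: the 5.5 Å cutoff comes from the text, but the plateau function (a Geman-McClure-like form), the `sigma` parameter and all function names are assumptions, and atoms are assumed to correspond by list index.

```python
import math

CUTOFF = 5.5  # Å; pairs closer than this in the restrained chain are considered


def plateau_penalty(delta, sigma=0.5):
    """Assumed plateau form: ~(delta/sigma)^2 for small delta, tends to 1 for large delta."""
    d2 = delta * delta
    return d2 / (d2 + sigma * sigma)


def lssr_score(chain, related, sigma=0.5):
    """Sum of penalties over all atom pairs of `chain` closer than CUTOFF.

    `chain` and `related` are equal-length lists of (x, y, z) tuples;
    atom i of `chain` corresponds to atom i of `related`.
    """
    total = 0.0
    n = len(chain)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(chain[i], chain[j])
            if d < CUTOFF:
                # Difference from the corresponding distance in the related chain
                delta = d - math.dist(related[i], related[j])
                total += plateau_penalty(delta, sigma)
    return total
```

Because the penalty saturates, a pair of atoms whose separation differs greatly between the two chains contributes at most a constant, so genuinely dissimilar regions are not pulled towards the target.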
BUSTER; NCS restraints; target-structure restraints; local structural similarity restraints
Macromolecular structures are modeled by conformational optimization within experimental and knowledge-based restraints. Discrete restraint-based sampling generates high-quality structures within these restraints and facilitates further refinement in a continuous all-atom energy landscape. This approach has been used successfully for protein loop modeling, comparative modeling and electron density fitting in X-ray crystallography.
Here we present a software toolkit (Rappertk) which generalizes discrete restraint-based sampling for use in structural biology. Its modular design and multi-layered architecture enable Rappertk to sample conformations of any macromolecule at many levels of detail and within a variety of experimental restraints. Performance against a Cα-tracing benchmark shows that efficiency has not suffered despite the overhead required by this flexibility. We demonstrate the toolkit's capabilities by building high-quality β-sheets and by introducing restraint-driven sampling. RNA sampling is demonstrated by rebuilding a protein-RNA interface. The ability to construct arbitrary ligands is used in sampling protein-ligand interfaces within electron density. Finally, secondary structure and shape information derived from EM are combined to generate multiple conformations of a protein consistent with the observed density.
Through its modular design and ease of use, Rappertk enables exploration of a wide variety of interesting avenues in structural biology. This toolkit, with illustrative examples, is freely available to academic users from .
A script was created to allow SHELXL to use the new CDL v.1.2 stereochemical library which defines the target values for main-chain bond lengths and angles as a function of the residue’s ϕ/ψ angles. Test refinements using this script show that the refinement behavior of structures at resolutions even better than 1 Å is substantially enhanced by the use of the new conformation-dependent ideal geometry paradigm.
To utilize a new conformation-dependent backbone-geometry library (CDL) in protein refinements at atomic resolution, a script was written that creates a restraint file for the SHELXL refinement program. It was found that the use of this library allows models to be created that have a substantially better fit to main-chain bond angles and lengths without degrading their fit to the X-ray data even at resolutions near 1 Å. For models at much higher resolution (∼0.7 Å), the refined model for parts adopting single well occupied positions is largely independent of the restraints used, but these structures still showed much smaller r.m.s.d. residuals when assessed with the CDL. Examination of the refinement tests across a wide resolution range from 2.4 to 0.65 Å revealed consistent behavior supporting the use of the CDL as a next-generation restraint library to improve refinement. CDL restraints can be generated using the service at http://pgd.science.oregonstate.edu/cdl_shelxl/.
stereochemical libraries; refinement; conformation-dependent library
Although comparative modelling is routinely used to produce three-dimensional models of proteins, very few automated approaches are formulated in a way that allows inclusion of restraints derived from experimental data as well as those from the structures of homologues. Furthermore, proteins are usually described as a single conformer, rather than an ensemble that represents the heterogeneity and inaccuracy of experimentally determined protein structures. Here we address these issues by exploring the application of the restraint-based conformational space search engine, RAPPER, which has previously been developed for rebuilding experimentally defined protein structures and for fitting models to electron density derived from X-ray diffraction analyses.
A new application of RAPPER for comparative modelling uses positional restraints and knowledge-based sampling to generate models with accuracies comparable to other leading modelling tools. Knowledge-based predictions are based on geometrical features of the homologous templates and rules concerning main-chain and side-chain conformations. By directly changing the restraints derived from available templates we estimate the accuracy limits of the method in comparative modelling.
The application of RAPPER to comparative modelling provides an effective means of exploring the conformational space available to a target sequence. Enhanced methods for generating positional restraints can greatly improve structure prediction. Generation of an ensemble of solutions that are consistent with both target sequence and knowledge derived from the template structures provides a more appropriate representation of a structural prediction than a single model. By formulating homologous structural information as sets of restraints we can begin to consider how comparative models might be used to inform conformer generation from sparse experimental data.
Paramagnetic NMR data (pseudocontact shifts and self-orientation residual dipolar couplings) and diamagnetic residual dipolar couplings can now be used in the program REFMAC5 from CCP4 as structural restraints together with X-ray crystallographic data. These NMR restraints can reveal differences between solid state and solution conformations of molecules or, in their absence, can be used together with X-ray crystallographic data for structural refinement.
The program REFMAC5 from CCP4 was modified to allow the simultaneous use of X-ray crystallographic data and paramagnetic NMR data (pseudocontact shifts and self-orientation residual dipolar couplings) and/or diamagnetic residual dipolar couplings. Incorporation of these long-range NMR restraints in REFMAC5 can reveal differences between solid-state and solution conformations of molecules or, in their absence, can be used together with X-ray crystallographic data for structural refinement. Since NMR and X-ray data are complementary, when a single structure is consistent with both sets of data and still maintains reasonably ‘ideal’ geometries, the reliability of the derived atomic model is expected to increase. The program was tested on five different proteins: the catalytic domain of matrix metalloproteinase 1, GB3, ubiquitin, free calmodulin and calmodulin complexed with a peptide. In some cases the joint refinement produced a single model consistent with both sets of observations, while in other cases it indicated, outside the experimental uncertainty, the presence of different protein conformations in solution and in the solid state.
structure refinement; PCS; RDC; X-ray; REFMAC
Low-resolution refinement tools implemented in REFMAC5 are described, including the use of external structural restraints, helical restraints and regularized anisotropic map sharpening.
Two aspects of low-resolution macromolecular crystal structure analysis are considered: (i) the use of reference structures and structural units for provision of structural prior information and (ii) map sharpening in the presence of noise and the effects of Fourier series termination. The generation of interatomic distance restraints by ProSMART and their subsequent application in REFMAC5 is described. It is shown that the use of such external structural information can enhance the reliability of derived atomic models and stabilize refinement. The problem of map sharpening is considered as an inverse deblurring problem and is solved using Tikhonov regularizers. It is demonstrated that this type of map sharpening can automatically produce a map with more structural features whilst maintaining connectivity. Tests show that both of these directions are promising, although more work needs to be performed in order to further exploit structural information and to address the problem of reliable electron-density calculation.
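To make the deblurring view of map sharpening concrete, the following sketch applies the simplest (zeroth-order) Tikhonov filter to a single Fourier amplitude. It assumes that the observed amplitude equals the true amplitude multiplied by an isotropic blurring factor exp(-B/(4d²)); the regularizers actually used in REFMAC5 may differ, and the function names and the parameter `alpha` are hypothetical.

```python
import math


def blur_factor(b_factor, d_spacing):
    """Isotropic blurring factor exp(-B s^2 / 4) with s = 1/d."""
    s2 = 1.0 / (d_spacing * d_spacing)
    return math.exp(-b_factor * s2 / 4.0)


def tikhonov_sharpen(amplitude, b_factor, d_spacing, alpha):
    """Minimize |k*F - F_obs|^2 + alpha*|F|^2 for one coefficient.

    The closed-form solution is F = k * F_obs / (k^2 + alpha).
    With alpha = 0 this reduces to plain division by k (unregularized sharpening).
    """
    k = blur_factor(b_factor, d_spacing)
    return k * amplitude / (k * k + alpha)
```

With `alpha = 0` high-resolution terms are amplified without limit, which also amplifies noise. For `alpha > 0` the gain never exceeds 1/(2·sqrt(alpha)), which is how the regularizer keeps the sharpened map from being dominated by noisy high-resolution coefficients.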
low-resolution refinement; REFMAC5
Recent developments in PHENIX are reported that allow the use of reference-model torsion restraints, secondary-structure hydrogen-bond restraints and Ramachandran restraints for improved macromolecular refinement in phenix.refine at low resolution.
Traditional methods for macromolecular refinement often have limited success at low resolution (3.0–3.5 Å or worse), producing models that score poorly on crystallographic and geometric validation criteria. To improve low-resolution refinement, knowledge from macromolecular chemistry and homology was used to add three new coordinate-restraint functions to the refinement program phenix.refine. Firstly, a ‘reference-model’ method uses an identical or homologous higher resolution model to add restraints on torsion angles to the geometric target function. Secondly, automatic restraints for common secondary-structure elements in proteins and nucleic acids were implemented that can help to preserve the secondary-structure geometry, which is often distorted at low resolution. Lastly, we have implemented Ramachandran-based restraints on the backbone torsion angles. In this method, a ϕ,ψ term is added to the geometric target function to minimize a modified Ramachandran landscape that smoothly combines favorable peaks identified from nonredundant high-quality data with unfavorable peaks calculated using a clash-based pseudo-energy function. All three methods show improved MolProbity validation statistics, typically complemented by a lowered Rfree and a decreased gap between Rwork and Rfree.
macromolecular crystallography; low resolution; refinement; automation
This report describes the working of the program CcpNmr Analysis for both NMR chemical shift assignment and structure determination of biological macromolecules.
CcpNmr Analysis provides a streamlined pipeline for both NMR chemical shift assignment and structure determination of biological macromolecules. In addition, it encompasses tools to analyse the many additional experiments that make NMR such a pivotal technique for research into complex biological questions. This report describes how CcpNmr Analysis can seamlessly link together all of the tasks in the NMR structure-determination process. It details each of the stages from generating NMR restraints [distance, dihedral, hydrogen bonds and residual dipolar couplings (RDCs)], exporting these to and subsequently re-importing them from structure-calculation software (such as the programs CYANA or ARIA) and analysing and validating the results obtained from the structure calculation to, ultimately, the streamlined deposition of the completed assignments and the refined ensemble of structures into the PDBe repository. Until recently, such solution-structure determination by NMR has been quite a laborious task, requiring multiple stages and programs. However, with the new enhancements to CcpNmr Analysis described here, this process is now much more intuitive and efficient and less error-prone.
NMR; processing; structure calculation; analysis; CcpNmr; talin
The heterogeneous array of software tools used in the process of protein NMR structure determination presents organizational challenges in the structure determination and validation processes, and creates a learning curve that limits the broader use of protein NMR in biology. These challenges, including accurate use of data in different data formats required by software carrying out similar tasks, continue to confound the efforts of novices and experts alike. These important issues need to be addressed robustly in order to standardize protein NMR structure determination and validation. PDBStat is a C/C++ computer program originally developed as a universal coordinate and protein NMR restraint converter. Its primary function is to provide a user-friendly tool for interconverting between protein coordinate and protein NMR restraint data formats. It also provides an integrated set of computational methods for protein NMR restraint analysis and structure quality assessment, relabeling of prochiral atoms with correct IUPAC names, as well as multiple methods for analysis of the consistency of atomic positions indicated by their convergence across a protein NMR ensemble. In this paper we provide a detailed description of the PDBStat software, and highlight some of its valuable computational capabilities. As an example, we demonstrate the use of the PDBStat restraint converter for restrained CS-Rosetta structure generation calculations, and compare the resulting protein NMR structure models with those generated from the same NMR restraint data using more traditional structure determination methods. These results demonstrate the value of a universal restraint converter in allowing the use of multiple structure generation methods with the same restraint data for consensus analysis of protein NMR structures and the underlying restraint data.
Protein NMR Structure Validation; BioMagResDatabase; XPLOR; CNS; CYANA; CS-Rosetta
A brief summary of the types of restraint defined in refinement dictionaries.
At the resolution available from most macromolecular crystals, the X-ray data alone are insufficient to lead to a chemically reasonable structure, so stereochemical restraints are essential. These usually restrain bond lengths, bond angles, planes and chiral volumes. The definition of these restraints and where the values come from are described. A dictionary entry contains information about the atom types, their connectivity and all the appropriate restraints. Torsion angles are not usually restrained, but they do have optimum values. In the special case of flexible five- and six-membered rings, including pentose and hexose sugars, the ring pucker is defined by combinations of torsion angles and the pucker affects the position of substituents.
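As an illustration of one of the restraint types listed above, the chiral volume of a centre with three substituents can be computed as a scalar triple product. The sketch below uses hypothetical function names and an illustrative `sigma`; it is not taken from any particular refinement dictionary or program.

```python
def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def chiral_volume(centre, a, b, c):
    """Signed volume (a-centre) . ((b-centre) x (c-centre)).

    The sign flips when any two substituents are exchanged, so it
    distinguishes the two hands of a chiral centre.
    """
    return _dot(_sub(a, centre), _cross(_sub(b, centre), _sub(c, centre)))


def chiral_residual(volume, target, sigma=0.2):
    """Least-squares style restraint term; sigma is an illustrative value."""
    return ((volume - target) / sigma) ** 2
```

A model with the wrong hand at a chiral centre gives a volume of roughly the right magnitude but the opposite sign, so the residual is large even when all bond lengths and angles look acceptable.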
stereochemistry; restraints; bond lengths; bond angles; protein structure; crystallographic refinement
Summary: Automatic methods for macromolecular structure prediction (fold recognition, de novo folding and docking programs) produce large sets of alternative models. These large model sets often include many native-like structures, which are often scored as false positives. Such native-like models can be more easily identified based on data from experimental analyses used as structural restraints (e.g. identification of nearby residues by cross-linking, chemical modification, site-directed mutagenesis, deuterium exchange coupled with mass spectrometry, etc.). We present a simple server for scoring and ranking of models according to their agreement with user-defined restraints.
Availability: FILTREST3D is freely available for users as a web server and standalone software at: http://filtrest3d.genesilico.pl/
Supplementary information: Supplementary data are available at Bioinformatics online.
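The scoring and ranking of models against user-defined restraints can be sketched as follows. This is a minimal illustration under stated assumptions and does not reproduce FILTREST3D's scoring functions or input format: each model is a hypothetical mapping from residue number to a representative coordinate, each restraint is a hypothetical `(residue_i, residue_j, max_distance)` tuple, and the score is simply the summed distance violation.

```python
import math


def violation(model, restraint):
    """Amount (in Å) by which a model exceeds an upper-bound distance restraint."""
    res_i, res_j, max_dist = restraint
    d = math.dist(model[res_i], model[res_j])
    return max(0.0, d - max_dist)


def rank_models(models, restraints):
    """Return (score, name) pairs sorted from best (lowest violation) to worst."""
    scored = [
        (sum(violation(coords, r) for r in restraints), name)
        for name, coords in models.items()
    ]
    return sorted(scored)
```

For example, a cross-link between residues 10 and 50 could be encoded as `(10, 50, 12.0)`, where the 12 Å bound is purely illustrative; models in which the two residues are further apart accumulate a penalty and drop down the ranking.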
We describe comparative patch analysis for modeling the structures of multidomain proteins and protein complexes, and apply it to the PSD-95 protein. Comparative patch analysis is a hybrid of comparative modeling based on a template complex and protein docking, with a greater applicability than comparative modeling and a higher accuracy than docking. It relies on structurally defined interactions of each of the complex components, or their homologs, with any other protein, irrespective of its fold. For each component, its known binding modes with other proteins of any fold are collected and expanded by the known binding modes of its homologs. These modes are then used to restrain conventional molecular docking, resulting in a set of binary domain complexes that are subsequently ranked by geometric complementarity and a statistical potential. The method is evaluated by predicting 20 binary complexes of known structure. It is able to correctly identify the binding mode in 70% of the benchmark complexes compared with 30% for protein docking. We applied comparative patch analysis to model the complex of the third PSD-95, DLG, and ZO-1 (PDZ) domain and the SH3-GK domains in the PSD-95 protein, whose structure is unknown. In the first predicted configuration of the domains, PDZ interacts with SH3, leaving both the GMP-binding site of guanylate kinase (GK) and the C-terminus binding cleft of PDZ accessible, while in the second configuration PDZ interacts with GK, burying both binding sites. We suggest that the two alternate configurations correspond to the different functional forms of PSD-95 and provide a possible structural description for the experimentally observed cooperative folding transitions in PSD-95 and its homologs. More generally, we expect that comparative patch analysis will provide useful spatial restraints for the structural characterization of an increasing number of binary and higher-order protein complexes.
Protein–protein interactions play a crucial role in many cellular processes. An important step towards a mechanistic description of these processes is a structural characterization of the proteins and their complexes. The authors developed a new approach to modeling the structure of protein complexes and multidomain proteins. The approach, called comparative patch analysis, complements the two currently existing approaches for structural modeling of protein complexes, comparative modeling, and protein docking. It limits the configurations refined by molecular docking to the structurally defined interactions of each of the complex components, or their homologs, with any other protein, irrespective of its fold; the final prediction corresponds to the best-scoring refined configuration. The authors applied comparative patch analysis to predict the structure of the core fragment of PSD-95, a five-domain protein that plays a major role in the postsynaptic density at neuronal synapses. The study suggests two alternate configurations of the core fragment that potentially correspond to the different functional forms of PSD-95. This finding provides a possible structural explanation for the experimentally observed cooperative folding transitions in PSD-95 and its homologs.
We present a suite of software for the complete and easy deposition of NMR data to the PDB and BMRB. This suite uses the CCPN framework and introduces a freely downloadable, graphical desktop application called CcpNmr Entry Completion Interface (ECI) for the secure editing of experimental information and associated datasets through the lifetime of an NMR project. CCPN projects can be created within the CcpNmr Analysis software or by importing existing NMR data files using the CcpNmr FormatConverter. After further data entry and checking with the ECI, the project can then be rapidly deposited to the PDBe using AutoDep, or exported as a complete deposition NMR-STAR file. In full CCPN projects created with ECI, it is straightforward to select chemical shift lists, restraint data sets, structural ensembles and all relevant associated experimental collection details, all of which are or will become mandatory when depositing to the PDB. Instructions and download information for the ECI are available from the PDBe web site at http://www.ebi.ac.uk/pdbe/nmr/deposition/eci.html.
Database deposition; CCPN; wwPDB; Structure calculation; Structure validation; NMR-STAR
An overview of the CCP4 software suite for macromolecular crystallography is given.
The CCP4 (Collaborative Computational Project, Number 4) software suite is a collection of programs and associated data and software libraries which can be used for macromolecular structure determination by X-ray crystallography. The suite is designed to be flexible, allowing users a number of methods of achieving their aims. The programs are from a wide variety of sources but are connected by a common infrastructure provided by standard file formats, data objects and graphical interfaces. Structure solution by macromolecular crystallography is becoming increasingly automated and the CCP4 suite includes several automation pipelines. After giving a brief description of the evolution of CCP4 over the last 30 years, an overview of the current suite is given. While detailed descriptions are given in the accompanying articles, here it is shown how the individual programs contribute to a complete software package.
CCP4; macromolecular crystallography; software; collaboration; automation; macromolecular structure determination
The 155-kDa plasma glycoprotein factor H (FH), which consists of 20 complement control protein (CCP) modules, protects self-tissue but not foreign organisms from damage by the complement cascade. Protection is achieved by selective engagement of FH, via CCPs 1–4, CCPs 6–8 and CCPs 19–20, with polyanion-rich host surfaces that bear covalently attached, activation-specific, fragments of complement component C3. The role of intervening CCPs 9–18 in this process is obscured by lack of structural knowledge. We have concatenated new high-resolution solution structures of overlapping recombinant CCP pairs, 10–11 and 11–12, to form a three-dimensional structure of CCPs 10–12 and validated it by small-angle X-ray scattering of the recombinant triple‐module fragment. Superimposing CCP 12 of this 10–12 structure with CCP 12 from the previously solved CCP 12–13 structure yielded an S-shaped structure for CCPs 10–13 in which modules are tilted by 80–110° with respect to immediate neighbors, but the bend between CCPs 10 and 11 is counter to the arc traced by CCPs 11–13. Including this four-CCP structure in interpretation of scattering data for the longer recombinant segments, CCPs 10–15 and 8–15, implied flexible attachment of CCPs 8 and 9 to CCP 10 but compact and intimate arrangements of CCP 14 with CCPs 12, 13 and 15. Taken together with difficulties in recombinant production of module pairs 13–14 and 14–15, the aberrant structure of CCP 13 and the variability of 13–14 linker sequences among orthologues, a structural dependency of CCP 14 on its neighbors is suggested; this has implications for the FH mechanism.
► The 20-CCP‐module human protein FH prevents complement-mediated tissue damage.
► NMR structures of CCPs 10–11 and 11–12 suggest that this region enhances flexional strength of FH.
► Concatenating bi-modules helps interpret small‐angle X‐ray scattering data, revealing highly compacted arrangement of CCPs 13, 14 and 15.
► Apparent structural dependency of CCP 14 on neighbors could provide a switch between ordered and flexible FH architectures.
CCP, complement control protein; CR1, complement receptor type 1; DAF, decay accelerating factor; FH, factor H; EOM, ensemble optimization method; HSQC, heteronuclear single quantum coherence; MCP, membrane cofactor protein; NOE, nuclear Overhauser enhancement; SAXS, small-angle X-ray scattering; TOCSY, total correlated spectroscopy; protein NMR; protein domains; complement system; small-angle X-ray scattering; regulators of complement activation
Child safety restraints are effective in protecting children from injury while traveling in a car. However, the rate of child restraint use is extremely low in Chinese cities. Parent drivers could play an important role in promoting child safety restraint use, but not all of them take active responsibility.
This study used a qualitative approach and included 14 in-depth interviews with parents of a child under the age of 6 living in Shantou City (7 child safety restraint users and 7 non-users). Purposive sampling was used to recruit eligible parent drivers who had participated in a previous observation study. Interview data were collected from March to April 2013. The audiotaped and transcribed data were coded and analyzed to identify key themes.
Four key themes on child safety restraint emerged from the in-depth interviews with parents: 1) having a child safety restraint installed in the rear seat with an adult sitting next to the restrained child is ideal, and a child safety restraint is seen as an alternative when adult accompaniment is not available; 2) effective parental education strategies could help make a difference in child safety restraint use; 3) inadequate promotion and parents’ poor safety awareness contribute to the low rate of child safety restraint use in China; 4) mandatory legislation on child safety restraint use could be an effective approach.
Inadequate promotion and parents’ low awareness of safe traveling were closely linked to low child safety seat (CSS) usage in the absence of mandatory legislation. Future intervention efforts need to focus on increasing parents’ safe-travel awareness, combined with CSS product promotion, before the laws are enacted.
Child safety seat; Interview; Qualitative research
We report here new computational tools and strategies to efficiently generate three-dimensional models for oligomeric biomolecular complexes in cases where there is limited experimental restraint data to guide the docking calculations. Our computational tools are designed to rapidly and exhaustively enumerate all geometrically possible docking poses for an oligomeric complex, rather than generate detailed, atomic-resolution models. Experimental data, such as inter-atomic distance measurements, are then used to select and refine docking poses that are consistent with the experimental restraints. Our computational toolkit is designed for use with sparse datasets to generate intermediate-resolution docking models, and utilizes distance difference matrix analysis to identify further restraint measurements that will provide maximum additional structural refinement. Thus, these tools can be used to help plan optimal residue positions for probe incorporation in labor-intensive biophysical experiments such as chemical crosslinking, EPR, or FRET spectroscopy studies. We present benchmark results for docking the collection of all 176 heterodimer protein complexes from the ZDOCK database, as well as a protein homodimer with recently collected experimental distance restraints, to illustrate the toolkit’s capabilities and performance, and to demonstrate how distance difference matrix analysis can automatically identify and prioritize additional restraint measurements that allow us to rapidly optimize docking poses.
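The use of a distance difference matrix to prioritize new measurements can be sketched as follows. This is a simplified illustration rather than the toolkit's actual algorithm: it assumes just two candidate docking poses, represented as hypothetical lists of probe-site coordinates in the same order, and picks the site pair whose distance differs most between them.

```python
import math
from itertools import combinations


def distance_difference(pose_a, pose_b):
    """Map each site pair (i, j) to |d_A(i, j) - d_B(i, j)|."""
    return {
        (i, j): abs(math.dist(pose_a[i], pose_a[j]) - math.dist(pose_b[i], pose_b[j]))
        for i, j in combinations(range(len(pose_a)), 2)
    }


def most_informative_pair(pose_a, pose_b):
    """Site pair whose measured distance would best discriminate the two poses."""
    ddm = distance_difference(pose_a, pose_b)
    return max(ddm, key=ddm.get)
```

A pair with a near-zero entry is predicted to give the same distance in both poses, so measuring it would not help to choose between them; a pair with a large entry is a more useful target for a cross-linking, EPR or FRET probe.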
Prediction of protein structures from their sequences is still one of the open grand challenges of computational biology. Some approaches to protein structure prediction, especially ab initio ones, rely to some extent on the prediction of residue contact maps. Residue contact map predictions have been assessed at the CASP competition for several years now. Although it has been shown that exact contact maps generally yield correct three-dimensional structures, this is true only at a relatively low resolution (3–4 Å from the native structure). Another known weakness of contact maps is that they are generally predicted ab initio, that is, without exploiting information about potential homologues of known structure.
We introduce a new class of distance restraints for protein structures: multi-class distance maps. We show that Cα trace reconstructions based on 4-class native maps are significantly better than those from residue contact maps. We then build two predictors of 4-class maps based on recursive neural networks: one ab initio, relying on the sequence and on evolutionary information; and one template-based, in which homology information to known structures is provided as a further input. We show that virtually any level of sequence similarity to structural templates (down to less than 10%) yields more accurate 4-class maps than the ab initio predictor. We show that template-based predictions by recursive neural networks are consistently better than the best template and than a number of combinations of the best available templates. We also extract binary residue contact maps at an 8 Å threshold (as per CASP assessment) from the 4-class predictors and show that the template-based version is also more accurate than the best template and consistently better than the ab initio one, down to very low levels of sequence identity to structural templates. Furthermore, we test both ab initio and template-based 8 Å predictions on the CASP7 targets using a pre-CASP7 PDB, and find that both predictors are state-of-the-art, with the template-based one far outperforming the best CASP7 systems if templates with sequence identity to the query of 10% or better are available. Although this is not the main focus of this paper, we also report on reconstructions of Cα traces based on both ab initio and template-based 4-class map predictions, showing that the latter are generally more accurate even when homology is dubious.
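To make the distinction between a binary contact map and a multi-class distance map concrete, the following sketch derives both from a set of Cα coordinates. Only the 8 Å contact threshold comes from the text above. The three class boundaries in `ASSUMED_BOUNDARIES` and the function names are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

# Hypothetical class boundaries in Å; the text does not state them.
# Only the first value (8 Å) matches the contact threshold quoted above.
ASSUMED_BOUNDARIES = (8.0, 13.0, 19.0)

def pairwise_distances(coords):
    """Pairwise Euclidean distances between rows of an (N, 3) array."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def multiclass_map(coords, boundaries=ASSUMED_BOUNDARIES):
    """Assign each residue pair a class index 0..len(boundaries).

    np.digitize returns k when boundaries[k-1] <= d < boundaries[k],
    so class 0 holds the closest pairs and the last class the farthest.
    """
    return np.digitize(pairwise_distances(coords), boundaries)

def binary_contact_map(coords, threshold=8.0):
    """Boolean map that is True where two residues are closer than threshold."""
    return pairwise_distances(coords) < threshold

# Toy example: five residues placed along the x axis (coordinates in Å).
ca = np.array([[0.0, 0, 0], [5.0, 0, 0], [10.0, 0, 0],
               [15.0, 0, 0], [30.0, 0, 0]])
classes = multiclass_map(ca)
contacts = binary_contact_map(ca)
```

With these assumed boundaries, the binary map is recovered from the multi-class map as `classes == 0`. This mirrors how 8 Å contact maps are extracted from the 4-class predictors.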
Accurate predictions of multi-class maps may provide valuable constraints for improved ab initio and template-based prediction of protein structures, naturally incorporate multiple templates, and yield state-of-the-art binary maps. Predictions of protein structures and 8 Å contact maps based on the multi-class distance map predictors described in this paper are freely available to academic users at the url .
The crystal structure of a Z-DNA hexamer duplex d(CGCGCG)2, determined at an ultra-high resolution of 0.55 Å and refined without restraints, displays a high degree of regularity and rigidity in its stereochemistry, in contrast to the more flexible B-DNA duplexes. The estimated standard uncertainties of all individually refined parameters, obtained by full-matrix least-squares optimization, are comparable with values that are typical for small-molecule crystallography. The Z-DNA model generated with ultra-high-resolution diffraction data can be used to revise the stereochemical restraints applied in lower-resolution refinements. Detailed comparisons of the stereochemical library values with the present accurate Z-DNA parameters show generally good agreement, but also reveal significant discrepancies in the description of guanine-sugar valence angles and in the geometry of the phosphate groups.
Food insecurity is linked to higher weight gain in pregnancy, as is dietary restraint. We hypothesized that pregnant women exposed to marginal food insecurity, and who reported dietary restraint before pregnancy, will paradoxically show the greatest weight gain. Weight outcomes were defined as total kilograms, observed-to-recommended weight gain ratio, and categorized as adequate, inadequate or excessive weight gain based on 2009 Institute of Medicine guidelines. A likelihood ratio test assessed the interaction between marginal food insecurity and dietary restraint and found it to be significant. Adjusted multivariate regression and multinomial logistic models were used to estimate weight gain outcomes. In adjusted models stratified by dietary restraint, marginal insecurity with low restraint was significantly associated with lower weight gain and weight gain ratio compared to food security with low restraint. Conversely, marginal insecurity with high restraint was significantly associated with higher weight gain and weight gain ratio compared to food security with high restraint. Marginal insecurity with high restraint was also significantly associated with excessive weight gain. Models were consistent when restricted to low-income women and full-term deliveries. In the presence of marginal food insecurity, women who struggle with weight and dieting issues may be at risk for excessive weight gain.
food insecurity; dietary restraint; weight gain; dieting; disordered eating; pregnancy
Restrained molecular dynamics simulations are a robust, though perhaps underused, tool for the end-stage refinement of biomolecular structures. We demonstrate their utility—using modern simulation protocols, optimized force fields, and inclusion of explicit solvent and mobile counterions—by re-investigating the solution structures of two RNA hairpins that had previously been refined using conventional techniques. The structures, both domain 5 group II intron ribozymes from yeast ai5γ and Pylaiella littoralis, share a nearly identical primary sequence yet the published 3D structures appear quite different. Relatively long restrained MD simulations using the original NMR restraint data identified the presence of a small set of violated distance restraints in one structure and a possibly incorrect trapped bulge nucleotide conformation in the other structure. The removal of problematic distance restraints and the addition of a heating step yielded representative ensembles with very similar 3D structures and much lower pairwise RMSD values. Analysis of ion density during the restrained simulations helped to explain chemical shift perturbation data published previously. These results suggest that restrained MD simulations, with proper caution, can be used to “update” older structures or aid in the refinement of new structures that lack sufficient experimental data to produce a high quality result. Notable cautions include the need for sufficient sampling, awareness of potential force field bias (such as small angle deviations with the current AMBER force fields), and a proper balance between the various restraint weights.
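Identifying violated distance restraints in a refined structure can be sketched as follows. This is a minimal illustration assuming restraints are given as lower and upper distance bounds between atom indices; the function name `violated_restraints` and the tuple layout are hypothetical and do not reproduce any particular simulation package.

```python
import numpy as np

def violated_restraints(coords, restraints, tolerance=0.0):
    """List restraints whose distance falls outside [lower, upper].

    coords     : (N, 3) array of atomic positions in Å
    restraints : iterable of (i, j, lower, upper) with bounds in Å
    tolerance  : violations no larger than this (in Å) are ignored
    Returns (i, j, distance, excess) for each violated restraint.
    """
    violations = []
    for i, j, lower, upper in restraints:
        d = float(np.linalg.norm(coords[i] - coords[j]))
        # Excess is how far the distance lies beyond the nearer bound.
        excess = max(lower - d, d - upper, 0.0)
        if excess > tolerance:
            violations.append((i, j, d, excess))
    return violations

# Toy example: three atoms and three upper-bounded distance restraints.
coords = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [0.0, 4.0, 0.0]])
restraints = [(0, 1, 1.8, 5.0), (0, 2, 1.8, 3.5), (1, 2, 1.8, 6.0)]
flagged = violated_restraints(coords, restraints)
```

In practice such a check would be applied across the frames of a trajectory, since a restraint that is violated persistently is a more plausible candidate for removal than one violated only transiently.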
Electronic supplementary material
The online version of this article (doi:10.1007/s10858-012-9642-5) contains supplementary material, which is available to authorized users.
RNA structure; Molecular dynamics; Residual dipolar coupling restraints; Bulge structure; Force fields; Ion binding
The automated building of a protein model into an electron density map remains a challenging problem. In the ARP/wARP approach, model building is facilitated by initially interpreting a density map with free atoms of unknown chemical identity; all structural information for such chemically unassigned atoms is discarded. Here, this is remedied by applying restraints between free atoms, and between free atoms and a partial protein model. These are based on geometric considerations of protein structure and tentative (conditional) assignments for the free atoms. Restraints are applied in the REFMAC5 refinement program and are generated on an ad hoc basis, allowing them to fluctuate from step to step. A large set of experimentally phased and molecular replacement structures showcases individual structures where automated building is improved drastically by the conditional restraints. The concept and implementation we present can also find application in restraining geometries, such as hydrogen bonds, in low-resolution refinement.