The study aimed to explore the views of general practitioners (GPs), nurses and physiotherapists towards extending the role of sickness certification beyond the medical profession in primary care.
Fifteen GPs, seven nurses and six physiotherapists were selected to achieve varied respondent characteristics including sex, geographical location, service duration and post-graduate specialist training. Constant-comparative qualitative analysis of data from 28 semi-structured telephone interviews was undertaken.
The majority of respondents supported the extended role concept; however, members of each professional group also rejected the notion. Respondents employed four different legitimacy claims to justify their views and define their occupational boundaries in relation to sickness certification practice. Condition-specific legitimacy, the ability to adopt a holistic approach to sickness certification, system efficiency and control-related arguments were used to differing degrees by each occupation. Practical suggestions for the extension of the sickness certification role beyond the medical profession are underpinned by the sociological theory of professional identity.
Extending the authority to certify sickness absence beyond the medical profession is not simply a matter of addressing practical and organisational obstacles. There is also a need to consider the impact on, and preferences of, the specific occupations and their respective boundary claims. This paper explores the implications of extending the sickness certification role beyond general practice. We conclude that the main policy challenge of such a move is (a) to persuade GPs to relinquish this role (or to share it with other professions) and (b) to understand the ‘boundary work’ involved.
Professional boundaries; Sick certification; Qualitative methods; Sociology of professions; Primary care
The solvent-picking procedure in phenix.refine has been extended and combined with Phaser anomalous substructure completion and analysis of coordination geometry to identify and place elemental ions.
Many macromolecular model-building and refinement programs can automatically place solvent atoms in electron density at moderate-to-high resolution. This process frequently builds water molecules in place of elemental ions, the identification of which must be performed manually. The solvent-picking algorithms in phenix.refine have been extended to build common ions based on an analysis of the chemical environment as well as physical properties such as occupancy, B factor and anomalous scattering. The method is most effective for heavier elements such as calcium and zinc, for which a majority of sites can be placed with few false positives in a diverse test set of structures. At atomic resolution, tightly bound sodium and magnesium ions can often also be identified. A number of challenges that contribute to the difficulty of completely automating the process of structure completion are discussed.
refinement; ions; PHENIX
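The environment-and-physics analysis described above can be caricatured as a small scoring routine. The following is a minimal sketch with invented feature names and illustrative thresholds; it is not phenix.refine's actual decision logic:

```python
# Hypothetical sketch: classify a refined solvent site as water or a candidate
# elemental ion from simple physical heuristics (occupancy, B factor,
# anomalous signal, coordination). All thresholds are illustrative only.

def classify_solvent_site(occupancy, b_factor, mean_b_env, anomalous_peak,
                          n_coordinating, mean_coord_dist):
    """Return 'water', 'possible_ion', or 'reject' for a solvent peak."""
    if occupancy < 0.3:
        return "reject"                  # too weak to interpret
    # Waters rarely show a strong anomalous peak or a tight coordination shell
    ion_like = (
        anomalous_peak > 4.0             # sigma units, e.g. Zn or Ca
        or (n_coordinating >= 4 and mean_coord_dist < 2.4)  # short M-O bonds
    )
    # An ion modelled as water often refines to an unusually low B factor
    if ion_like and b_factor < 0.8 * mean_b_env:
        return "possible_ion"
    return "water"
```

In practice each heuristic would be calibrated against known structures; the point here is only that several weak signals are combined before promoting a water to an ion.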
Macromolecular crystal structures are among the best of scientific data, providing detailed insight into these complex and biologically important molecules with a relatively low level of error and subjectivity. However, there are two notable problems with getting the most information from them. The first is that the models are not perfect: there is still opportunity for improving them, and users need to evaluate whether the local reliability in a structure is up to answering their question of interest. The second is that protein and nucleic acid molecules are highly complex and individual, inherently handed and 3-dimensional, and the cooperative and subtle interactions that govern their detailed structure and function are not intuitively evident. Thus there is a real need for graphical representations and descriptive classifications that enable molecular 3D literacy. We have spent our career working to understand these elegant molecules ourselves, and building tools to help us and others determine and understand them better. The Protein Data Bank (PDB) has of course been vital and central to this undertaking. Here we combine some history of our involvement as depositors, illustrators, evaluators, and end-users of PDB structures with commentary on how best to study and draw scientific inferences from them.
Objectives. To identify the instruments that have been used to measure health-related quality of life (HRQOL) in gout and assess their clinimetric properties, determine the distribution of HRQOL in gout and identify factors associated with poor HRQOL.
Methods. Medline, CINAHL, EMBASE and PsycINFO were searched from inception to October 2012. Search terms pertained to gout, health or functional status, clinimetric properties and HRQOL. Study data extraction and quality assessment were performed by two independent reviewers.
Results. From 474 identified studies, 22 met the inclusion criteria. Health Assessment Questionnaire Disability Index (HAQ-DI) and Short Form 36 (SF-36) were most frequently used and highest rated due to robust construct and concurrent validity, despite high floor and ceiling effects. The Gout Impact Scale had good content validity. Gout had a greater impact on physical HRQOL compared to other domains. Both gout-specific features (attack frequency and intensity, intercritical pain and number of joints involved) and comorbid disease were associated with poor HRQOL. Evidence for objective features such as tophi and serum uric acid was less robust. Limitations of existing studies include cross-sectional design, recruitment from specialist clinic settings and frequent use of generic instruments.
Conclusion. Most studies have used the generic HAQ-DI and SF-36. Gout-specific characteristics and comorbidities contribute to poor HRQOL. There is a need for a cohort study in primary care (where most patients with gout are treated) to determine which factors predict changes in HRQOL over time. This will enable those at risk of deterioration to be identified and better targeted for treatment.
gout; health-related quality of life; clinimetrics
Accurate energy functions are critical to macromolecular modeling and design. We describe new tools for identifying inaccuracies in energy functions and guiding their improvement, and illustrate the application of these tools to improvement of the Rosetta energy function. The feature analysis tool identifies discrepancies between structures deposited in the PDB and low energy structures generated by Rosetta; these likely arise from inaccuracies in the energy function. The optE tool optimizes the weights on the different components of the energy function by maximizing the recapitulation of a wide range of experimental observations. We use the tools to examine three proposed modifications to the Rosetta energy function: improving the unfolded state energy model (reference energies), using bicubic spline interpolation to generate knowledge-based torsional potentials, and incorporating the recently developed Dunbrack 2010 rotamer library (Shapovalov and Dunbrack, 2011).
Rosetta; energy function; scientific benchmarking; parameter estimation; decoy discrimination
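The weight-fitting idea behind optE can be illustrated with a toy version: choose weights on energy terms so that the experimentally observed conformation is ranked lowest-energy in as many cases as possible. This sketch uses two fabricated feature terms and a coarse grid search, not optE's actual procedure:

```python
import itertools

# Illustrative sketch (not the optE implementation): pick weights for two
# energy terms that maximize "native recovery" over toy cases. Each case is a
# list of conformations with (term1, term2) feature values; index 0 is the
# observed (native) conformation. All numbers are made up.
cases = [
    [(1.0, 2.0), (3.0, 1.0), (2.5, 2.5)],
    [(0.5, 1.0), (0.4, 3.0), (2.0, 0.2)],
    [(2.0, 0.5), (1.0, 2.5), (0.8, 2.0)],
]

def recovery(w1, w2):
    """Fraction of cases whose native conformation has the lowest energy."""
    hits = 0
    for confs in cases:
        energies = [w1 * a + w2 * b for a, b in confs]
        if min(range(len(confs)), key=energies.__getitem__) == 0:
            hits += 1
    return hits / len(cases)

# Coarse grid search over weight combinations
best = max(itertools.product([0.25, 0.5, 1.0, 2.0], repeat=2),
           key=lambda w: recovery(*w))
```

The real tool optimizes over many observation types simultaneously (rotamer recovery, sequence recovery, stability data), but the objective has this same "recapitulate the experiment" shape.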
A macromolecular structure, as measured data or as a list of coordinates or even on-screen as a full atomic model, is an extremely complex and confusing object. The underlying rules of how it folds, moves, and interacts as a biological entity are even less evident or intuitive to the human mind. To do science on such molecules, or to relate them usefully to higher levels of biology, we need to start with a natural history that names their features in meaningful ways and with multiple representations (visual or algebraic) that show some aspect of their organizing principles. The two of us have jointly enjoyed a highly varied and engrossing career in biophysical research over nearly 50 years. Our frequent changes of emphasis are tied together by two threads: first, by finding the right names, visualizations, and methods to help both ourselves and others to better understand the 3D structures of protein and RNA molecules, and second, by redefining the boundary between signal and noise for complex data, in both directions—sometimes identifying and promoting real signal up out of what seemed just noise, and sometimes demoting apparent signal into noise or systematic error. Here we relate parts of our scientific and personal lives, including ups and downs, influences, anecdotes, and guiding principles such as the title theme.
scientific biography; structural biology; molecular graphics; ribbon drawings; structure validation; all-atom contacts
We have developed a suite of protein redesign algorithms that improves realistic in silico modeling of proteins. These algorithms are based on three characteristics that make them unique: (1) improved flexibility of the protein backbone, protein side chains, and ligand to accurately capture the conformational changes that are induced by mutations to the protein sequence; (2) modeling of proteins and ligands as ensembles of low-energy structures to better approximate binding affinity; and (3) a globally-optimal protein design search, guaranteeing that the computational predictions are optimal with respect to the input model. Here, we illustrate the importance of these three characteristics. We then describe OSPREY, a protein redesign suite that implements our protein design algorithms. OSPREY has been used prospectively, with experimental validation, in several biomedically-relevant settings. We show in detail how OSPREY has been used to predict resistance mutations and explain why improved flexibility, ensembles, and provability are essential for this application.
protein design; OSPREY; Dead-end elimination; protein ensembles; protein flexibility; K*; minDEE
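The "provably optimal" search mentioned above rests on pruning criteria such as classic dead-end elimination (DEE). A minimal sketch of the criterion follows, with invented toy energies and a single competing design position; this is an illustration of the idea, not OSPREY's code:

```python
# Minimal sketch of the dead-end elimination (DEE) pruning criterion used by
# provable design methods such as those in OSPREY. Rotamer r at a position is
# provably not in the global minimum-energy conformation if some rotamer t at
# the same position satisfies:
#     E(r) - E(t) + sum over positions j of min_s [E(r,s) - E(t,s)] > 0

def can_prune(r, t, self_E, pair_E, other_positions):
    """True if rotamer r is provably worse than rotamer t."""
    total = self_E[r] - self_E[t]
    for rotamers_j in other_positions:
        total += min(pair_E[(r, s)] - pair_E[(t, s)] for s in rotamers_j)
    return total > 0

self_E = {"r1": 5.0, "r2": 1.0}                      # one-body energies
pair_E = {("r1", "sA"): 2.0, ("r1", "sB"): 3.0,      # two-body energies with
          ("r2", "sA"): 0.5, ("r2", "sB"): 1.0}      # rotamers at position j
other_positions = [["sA", "sB"]]

pruned = can_prune("r1", "r2", self_E, pair_E, other_positions)
```

Because the bound holds for every full conformation containing r, pruning r never discards the global optimum, which is what makes the search provable rather than heuristic.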
Gout is the commonest inflammatory arthritis, affecting around 1.4% of adults in Europe. It is predominantly managed in primary care and classically affects the joints of the foot, particularly the first metatarsophalangeal joint. Gout-related factors (including disease characteristics and treatment) as well as comorbid chronic disease are associated with poor health-related quality of life (HRQOL), yet to date there is limited evidence concerning gout in a community setting. Existing epidemiological studies are limited by their cross-sectional design, selection of secondary care patients with atypical disease and the use of generic tools to measure HRQOL. This 3-year primary care-based prospective observational cohort study will describe the spectrum of HRQOL in community-dwelling patients with gout, associated factors, predictors of poor outcome, and the prevalence and incidence of foot problems in gout patients.
Adults aged ≥ 18 years diagnosed with gout or prescribed colchicine or allopurinol in the preceding 2 years will be identified through Read codes and mailed a series of self-completion postal questionnaires over a 3-year period. Consenting participants will have their general practice medical records reviewed.
This is the first prospective cohort study of HRQOL in patients with gout in primary care in the UK. The combination of survey data and medical record review will allow an in-depth understanding of factors that are associated with and lead to poor HRQOL and foot problems in gout. Identification of these factors will improve the management of this prevalent, yet under-treated, condition in primary care.
Gout; HRQOL; Foot; Patient experience; Prospective cohort; Primary care
X-ray crystallography is a critical tool in the study of biological systems. It is able to provide information that has been a prerequisite to understanding the fundamentals of life. It is also a method that is central to the development of new therapeutics for human disease. Significant time and effort are required to determine and optimize many macromolecular structures because of the need for manual interpretation of complex numerical data, often using many different software packages, and the repeated use of interactive three-dimensional graphics. The Phenix software package has been developed to provide a comprehensive system for macromolecular crystallographic structure solution with an emphasis on automation. This has required the development of new algorithms that minimize or eliminate subjective input in favour of built-in expert-systems knowledge, the automation of procedures that are traditionally performed by hand, and the development of a computational framework that allows a tight integration between the algorithms. The application of automated methods is particularly appropriate in the field of structural proteomics, where high throughput is desired. Features in Phenix for the automation of experimental phasing with subsequent model building, molecular replacement, structure refinement and validation are described and examples given of running Phenix from both the command line and graphical user interface.
Macromolecular Crystallography; Automation; Phenix; X-ray; Diffraction; Python
Amino acid substitutions in protein structures often require subtle backbone adjustments that are difficult to model in atomic detail. An improved ability to predict realistic backbone changes in response to engineered mutations would be of great utility for the blossoming field of rational protein design. One model that has recently grown in acceptance is the backrub motion, a low-energy dipeptide rotation with single-peptide counter-rotations, that is coupled to dynamic two-state sidechain rotamer jumps, as evidenced by alternate conformations in very high-resolution crystal structures. It has been speculated that backrubs may accommodate sequence changes as readily as rotamer changes. However, backrub-induced shifts and experimental uncertainty are of similar magnitude for backbone atoms in even high-resolution structures, so comparison of wildtype-vs.-mutant crystal structure pairs is not sufficient to directly link backrubs to mutations. In this study, we use two alternative approaches that bypass this limitation. First, we use a quality-filtered structure database to aggregate many examples for precisely defined motifs with single amino acid differences, and find that the effectively amplified backbone differences closely resemble backrubs. Second, we directly apply a provably-accurate, backrub-enabled protein design algorithm to idealized versions of these motifs, and discover that the lowest-energy computed models match the average-coordinate experimental structures. These results support the hypothesis that backrubs participate in natural protein evolution and validate their continued use for design of synthetic proteins.
Protein design has the potential to generate useful molecules for medicine and chemistry, including sensors, drugs, and catalysts for arbitrary reactions. When protein design is carried out starting from an experimentally determined structure, as is often the case, one important aspect to consider is backbone flexibility, because in response to a mutation the backbone often must shift slightly to reconcile the new sidechain with its environment. In principle, one may model the backbone in many ways, but not all are physically realistic or experimentally validated. Here we study the "backrub" motion, which has been previously documented in atomic detail, but only for sidechain movements within single structures. By a two-pronged approach involving both structural bioinformatics and computation with a principled design algorithm, we demonstrate that backrubs are sufficient to explain the backbone differences between mutation-related sets of very precisely defined motifs from the protein structure database. Our findings illustrate that backrubs are useful for describing evolutionary sequence change and, by extension, suggest that they are also appropriate for rational protein design calculations.
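Geometrically, a backrub is a rigid rotation of the central dipeptide atoms about the axis through the two flanking Cα atoms. The sketch below implements that rotation with Rodrigues' formula on fabricated coordinates; it is an illustration of the motion, not code from the study:

```python
import math

# Sketch of a backrub move: rotate an atom of the central dipeptide rigidly
# about the axis through the flanking Calpha atoms, using Rodrigues' rotation
# formula. Coordinates below are illustrative, not from a real structure.

def rotate_about_axis(p, a, b, theta):
    """Rotate point p by theta radians about the axis from a to b."""
    ax = [b[i] - a[i] for i in range(3)]
    n = math.sqrt(sum(c * c for c in ax))
    k = [c / n for c in ax]                       # unit axis vector
    v = [p[i] - a[i] for i in range(3)]           # p relative to axis origin
    kxv = [k[1]*v[2] - k[2]*v[1],                 # cross product k x v
           k[2]*v[0] - k[0]*v[2],
           k[0]*v[1] - k[1]*v[0]]
    kdv = sum(k[i]*v[i] for i in range(3))        # dot product k . v
    c, s = math.cos(theta), math.sin(theta)
    rot = [v[i]*c + kxv[i]*s + k[i]*kdv*(1 - c) for i in range(3)]
    return [rot[i] + a[i] for i in range(3)]

# A ~10 degree backrub about the Calpha(i-1) -> Calpha(i+1) axis
ca_prev, ca_next = [0.0, 0.0, 0.0], [6.0, 0.0, 0.0]
o_atom = [3.0, 2.0, 0.0]                          # e.g. carbonyl O of residue i
moved = rotate_about_axis(o_atom, ca_prev, ca_next, math.radians(10))
```

Because the move is a pure rotation about the Cα-Cα axis, each rotated atom keeps its distance from that axis, which is why backrubs perturb the backbone so cheaply.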
The foundations and current features of a widely used graphical user interface for macromolecular crystallography are described.
A new Python-based graphical user interface for the PHENIX suite of crystallography software is described. This interface unifies the command-line programs and their graphical displays, simplifying the development of new interfaces and avoiding duplication of function. With careful design, graphical interfaces can be displayed automatically, instead of being manually constructed. The resulting package is easily maintained and extended as new programs are added or modified.
macromolecular crystallography; graphical user interfaces; PHENIX
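The "displayed automatically, instead of being manually constructed" idea can be sketched as deriving widget descriptions from a declarative parameter definition. The parameter format below is invented for illustration (PHENIX uses its own PHIL syntax), and the output is a toolkit-neutral description rather than real widgets:

```python
# Sketch of auto-generated interfaces: derive widget descriptions from a
# declarative parameter list instead of hand-coding each dialog. The parameter
# schema here is invented; PHENIX's real definitions use the PHIL syntax.

PARAMS = [
    {"name": "resolution", "type": "float", "default": 2.0},
    {"name": "use_ncs", "type": "bool", "default": False},
    {"name": "strategy", "type": "choice",
     "choices": ["individual_sites", "rigid_body"], "default": "rigid_body"},
]

WIDGET_FOR_TYPE = {"float": "text_entry", "bool": "checkbox",
                   "choice": "dropdown"}

def build_widgets(params):
    """Map each parameter to a widget description a GUI toolkit could render."""
    widgets = []
    for p in params:
        w = {"label": p["name"].replace("_", " ").title(),
             "widget": WIDGET_FOR_TYPE[p["type"]],
             "value": p["default"]}
        if p["type"] == "choice":
            w["options"] = p["choices"]
        widgets.append(w)
    return widgets

widgets = build_widgets(PARAMS)
```

With this design, adding a parameter to a command-line program automatically yields a control in the interface, which is how duplication of function between the two front ends is avoided.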
In response to long waiting lists and problems with access to primary care physiotherapy, several Primary Care Trusts (PCTs) have developed physiotherapy-led telephone assessment and treatment services. The MRC funded PhysioDirect trial is a randomised trial in four PCTs with a total of 2252 patients comparing this approach with usual physiotherapy care, where patients join a waiting list for face-to-face physiotherapy.
This nested qualitative study aimed to explore and understand the key issues that determine acceptability of PhysioDirect services from the perspectives of patients, physiotherapists and their managers, GPs and commissioners.
Semi-structured interviews were conducted with 57 purposively sampled patients with musculoskeletal problems participating in the randomised trial. Sixteen physiotherapists, four physiotherapy managers, eight GPs and four PCT commissioners were also interviewed. The framework method was used to analyse the qualitative data.
All stakeholder groups perceived the PhysioDirect service as helpful in improving access to physiotherapy care by reducing physiotherapy waiting times. The physiotherapists and their managers perceived that physiotherapists could safely diagnose patients with musculoskeletal problems over the telephone. The GPs and commissioners raised concerns about the accuracy of diagnoses reached over the telephone and perceived it as a triage service which precedes face-to-face contact. Both patients and physiotherapists felt that the lack of visual information impaired their ability to effectively communicate their health problems over the telephone and both perceived that the PhysioDirect assessment was less personal than face-to-face contact. Patients expressed their concerns about trusting the expertise and knowledge of the physiotherapist without knowing them personally, with both patients and physiotherapists seeing the PhysioDirect service as impairing continuity of care. However, both patients and physiotherapists found that the PhysioDirect service worked particularly well as a medium to provide early self-management advice. Physiotherapy managers found the unpredictable nature of the timing and volume of patient calls to the PhysioDirect service difficult to manage. Physiotherapy managers, GPs and commissioners had divergent views about the information needed to support future implementation of a PhysioDirect service. Service commissioners also appeared to have wide-ranging and unrealistic expectations of the type of data that they wanted from physiotherapy managers in order to support decisions about commissioning PhysioDirect services.
The PhysioDirect service was perceived by the patients, physiotherapists and their managers, as well as GPs and commissioners, as broadly acceptable. All stakeholder groups felt that the PhysioDirect service improved access to physiotherapy services. Both patients and physiotherapists had some concerns that the PhysioDirect service impaired the development of a good therapeutic relationship. The key challenges to future implementation of PhysioDirect services were managers’ ability to accurately allocate physiotherapy time to the service, along with providing the range of data that commissioners expected from a new service. Despite these reservations, all stakeholders could foresee PhysioDirect as one option of access for future physiotherapy services.
telecare; PhysioDirect; physiotherapy; qualitative; musculoskeletal
Recent developments in PHENIX are reported that allow the use of reference-model torsion restraints, secondary-structure hydrogen-bond restraints and Ramachandran restraints for improved macromolecular refinement in phenix.refine at low resolution.
Traditional methods for macromolecular refinement often have limited success at low resolution (3.0–3.5 Å or worse), producing models that score poorly on crystallographic and geometric validation criteria. To improve low-resolution refinement, knowledge from macromolecular chemistry and homology was used to add three new coordinate-restraint functions to the refinement program phenix.refine. Firstly, a ‘reference-model’ method uses an identical or homologous higher resolution model to add restraints on torsion angles to the geometric target function. Secondly, automatic restraints for common secondary-structure elements in proteins and nucleic acids were implemented that can help to preserve the secondary-structure geometry, which is often distorted at low resolution. Lastly, we have implemented Ramachandran-based restraints on the backbone torsion angles. In this method, a ϕ,ψ term is added to the geometric target function to minimize a modified Ramachandran landscape that smoothly combines favorable peaks identified from nonredundant high-quality data with unfavorable peaks calculated using a clash-based pseudo-energy function. All three methods show improved MolProbity validation statistics, typically complemented by a lowered Rfree and a decreased gap between Rwork and Rfree.
macromolecular crystallography; low resolution; refinement; automation
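The ϕ,ψ restraint idea can be sketched as a pseudo-energy added to the geometric target. The "landscape" below is a single Gaussian well centred on an ideal α-helical conformation, a deliberately crude stand-in for the smoothed distributions the method actually uses:

```python
import math

# Hedged sketch of a Ramachandran-style restraint: add a phi,psi pseudo-energy
# term to a geometric target so refinement is pulled toward favoured backbone
# conformations. The single-Gaussian landscape and all constants are toy
# illustrations, not the real modified Ramachandran surface.

PHI0, PSI0, WIDTH = -63.0, -42.0, 20.0   # illustrative helix centre, degrees

def angle_diff(a, b):
    """Smallest signed difference between two angles in degrees."""
    return (a - b + 180.0) % 360.0 - 180.0

def rama_energy(phi, psi, weight=1.0):
    """Zero at the favoured peak, rising smoothly toward 'weight' away from it."""
    d2 = angle_diff(phi, PHI0) ** 2 + angle_diff(psi, PSI0) ** 2
    return weight * (1.0 - math.exp(-d2 / (2.0 * WIDTH ** 2)))

def total_target(geometry_term, phi, psi, rama_weight=1.0):
    """Geometric target plus the Ramachandran pseudo-energy term."""
    return geometry_term + rama_energy(phi, psi, rama_weight)
```

A minimizer applied to `total_target` would trade a small amount of geometric strain for backbone torsions near the favoured region, which is the intended effect at low resolution where the data alone cannot hold ϕ,ψ in place.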
During protein synthesis, the ribosome controls the movement of transfer RNA (tRNA) and messenger RNA (mRNA) by means of large-scale structural rearrangements. We describe structures of the intact bacterial ribosome from Escherichia coli that reveal how the ribosome binds tRNA in two functionally distinct states, determined to a resolution of ~3.2 Å by x-ray crystallography. One state positions tRNA in the peptidyl-tRNA binding site. The second, a fully rotated state, is stabilized by ribosome recycling factor (RRF) and binds tRNA in a highly bent conformation in a hybrid peptidyl/exit (P/E) site. The structures help to explain how the ratchet-like motion of the two ribosomal subunits contributes to the mechanisms of translocation, termination, and ribosome recycling.
This report presents the conclusions of the X-ray Validation Task Force of the worldwide Protein Data Bank (PDB). The PDB has expanded massively since current criteria for validation of deposited structures were adopted, allowing a much more sophisticated understanding of all the components of macromolecular crystals. The size of the PDB creates new opportunities to validate structures by comparison with the existing database, and the now-mandatory deposition of structure factors creates new opportunities to validate the underlying diffraction data. These developments highlighted the need for a new assessment of validation criteria. The Task Force recommends that a small set of validation data be presented in an easily understood format, relative to both the full PDB and the applicable resolution class, with greater detail available to interested users. Most importantly, we recommend that referees and editors judging the quality of structural experiments have access to a concise summary of well-established quality indicators.
Highlights: validation criteria used by the PDB for X-ray crystal structures have been reassessed; key scores should be presented prominently in an easily understood format; a concise validation report should be available to referees of papers on crystal structures.
Older people often view osteoarthritis as a part of normal ageing and see themselves as healthy despite painful joints. Professionals have mixed views about this. One concern is that seeing osteoarthritis as a result of ‘wear and tear’ leads to restricting exercise in order to avoid further wear.
To explore lay perceptions of wellness and joint pain, and their implications for consulting healthcare professionals and taking exercise.
Design of study: Qualitative, longitudinal study.
Setting: General practice in the North Midlands.
Semi-structured interviews with 27 older people who reported a joint problem but rated themselves as healthy. Diary sheets were sent for 11 consecutive months to record changes in health and circumstances. Thematic data analysis was facilitated by NVivo 8.
A key element of wellness was being able to continue with everyday roles and activities. ‘Wear and tear’ was used to categorise arthritis that is a normal part of old age. New joint symptoms that came on suddenly and severely were not necessarily attributed to ‘wear and tear’ arthritis, and were likely to lead to a professional consultation. Physical activity was not restricted to prevent further wear of affected joint(s). Keeping joints mobile was important in order to maintain independence.
Professionals should explore patients' ideas and concerns about their joint problem, in order to individually tailor explanations and advice. Patients are likely to be receptive to recommendations that promote independence, but advice needs to be set into patients' existing ways of living and coping with joint pain.
Health; joint pain; elderly; osteoarthritis; primary care
Central to crystallographic structure solution is obtaining accurate phases in order to build a molecular model, ultimately followed by refinement of that model to optimize its fit to the experimental diffraction data and prior chemical knowledge. Recent advances in phasing and model refinement and validation algorithms make it possible to arrive at better electron density maps and more accurate models.
For template-based modeling in the CASP8 Critical Assessment of Techniques for Protein Structure Prediction, this work develops and applies six new full-model metrics. They are designed to complement and add value to the traditional template-based assessment by GDT (Global Distance Test) and related scores (based on multiple superpositions of Cα atoms between target structure and predictions labeled “model 1”). The new metrics evaluate each predictor group on each target, using all atoms of their best model with above-average GDT. Two metrics evaluate how “protein-like” the predicted model is: the MolProbity score used for validating experimental structures, and a mainchain reality score using all-atom steric clashes, bond length and angle outliers, and backbone dihedrals. Four other new metrics evaluate match of model to target for mainchain and sidechain hydrogen bonds, sidechain end positioning, and sidechain rotamers. Group-average Z-score across the six full-model measures is averaged with group-average GDT Z-score to produce the overall ranking for full-model, high-accuracy performance.
Separate assessments are reported for specific aspects of predictor-group performance, such as robustness of approximately correct template or fold identification, and self-scoring ability at identifying the best of their models. Fold identification is distinct from but correlated with group-average GDT Z-score if target difficulty is taken into account, while self-scoring is done best by servers and is uncorrelated with GDT performance. Outstanding individual models on specific targets are identified and discussed. Predictor groups excelled at different aspects, highlighting the diversity of current methodologies. However, good full-model scores correlate robustly with high Cα accuracy.
homology modeling; protein structure prediction; all-atom contacts; full-model assessment
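The combined ranking scheme described above (average Z-score over the full-model measures, then averaged with the GDT Z-score) reduces to a few lines of arithmetic. The sketch below shows the mechanics with fabricated scores and only three of the six full-model measures:

```python
# Sketch of the CASP8 high-accuracy ranking arithmetic: Z-score each measure
# across groups, average per group, then average with the GDT Z-score. All
# scores below are fabricated, and only three measures are shown for brevity.

def z_scores(values):
    """Z-score of each value against the list's mean and standard deviation."""
    mean = sum(values) / len(values)
    sd = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / sd for v in values]

# Rows = predictor groups; columns = full-model measures
full_model = [[0.8, 0.7, 0.9],
              [0.5, 0.4, 0.6],
              [0.2, 0.3, 0.1]]
gdt = [70.0, 60.0, 40.0]                 # group-average GDT scores

# Z-score each measure across groups, average per group, combine with GDT Z
per_measure_z = list(zip(*[z_scores(col) for col in zip(*full_model)]))
full_model_z = [sum(zs) / len(zs) for zs in per_measure_z]
gdt_z = z_scores(gdt)
combined = [(f + g) / 2.0 for f, g in zip(full_model_z, gdt_z)]
```

Z-scoring before averaging keeps any one measure with a wide raw range from dominating the combined ranking.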
Application of phenix.model_vs_data to the contents of the Protein Data Bank shows that the vast majority of deposited structures can be automatically analyzed to reproduce the reported quality statistics. However, the small fraction of structures that elude automated re-analysis highlights areas where new software developments can help retain valuable information for future analysis.
phenix.model_vs_data is a high-level command-line tool for the computation of crystallographic model and data statistics, and the evaluation of the fit of the model to data. Analysis of all Protein Data Bank structures that have experimental data available shows that in most cases the reported statistics, in particular R factors, can be reproduced within a few percentage points. However, there are a number of outliers where the recomputed R values are significantly different from those originally reported. The reasons for these discrepancies are discussed.
PHENIX; Protein Data Bank; data quality; model quality; structure validation; R factors
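The central statistic being recomputed here is the crystallographic R factor, R = Σ|Fobs − Fcalc| / Σ Fobs. A toy illustration with invented structure-factor amplitudes:

```python
# Toy illustration of the crystallographic R factor that tools such as
# phenix.model_vs_data recompute: R = sum |Fobs - Fcalc| / sum Fobs.
# The amplitudes below are invented for demonstration.

def r_factor(f_obs, f_calc):
    """Crystallographic R factor from observed and calculated amplitudes."""
    num = sum(abs(o - c) for o, c in zip(f_obs, f_calc))
    den = sum(f_obs)
    return num / den

f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 55.0, 42.0]
r = r_factor(f_obs, f_calc)        # (10 + 5 + 5 + 2) / 280
```

In real re-analysis the calculated amplitudes come from the deposited model (including bulk solvent and scaling), which is exactly where discrepancies with the originally reported values tend to arise.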
Distributions of the backbone dihedral angles of proteins have been studied for over 40 years. While many statistical analyses have been presented, only a handful of probability densities are publicly available for use in structure validation and structure prediction methods. The available distributions differ in a number of important ways, which determine their usefulness for various purposes. These include: 1) input data size and criteria for structure inclusion (resolution, R-factor, etc.); 2) filtering of suspect conformations and outliers using B-factors or other features; 3) secondary structure of input data (e.g., whether helix and sheet are included; whether beta turns are included); 4) the method used for determining probability densities ranging from simple histograms to modern nonparametric density estimation; and 5) whether they include nearest neighbor effects on the distribution of conformations in different regions of the Ramachandran map. In this work, Ramachandran probability distributions are presented for residues in protein loops from a high-resolution data set with filtering based on calculated electron densities. Distributions for all 20 amino acids (with cis and trans proline treated separately) have been determined, as well as 420 left-neighbor and 420 right-neighbor dependent distributions. The neighbor-independent and neighbor-dependent probability densities have been accurately estimated using Bayesian nonparametric statistical analysis based on the Dirichlet process. In particular, we used hierarchical Dirichlet process priors, which allow sharing of information between densities for a particular residue type and different neighbor residue types. The resulting distributions are tested in a loop modeling benchmark with the program Rosetta, and are shown to improve protein loop conformation prediction significantly. The distributions are available at http://dunbrack.fccc.edu/hdp.
The three-dimensional structure of a protein enables it to perform its specific function, which may be catalysis, DNA binding, cell signaling, maintaining cell shape and structure, or one of many other functions. Predicting the structures of proteins is an important goal of computational biology. One way of doing this is to figure out the rules that determine protein structure from protein sequences by determining how local protein sequence is associated with local protein structure. That is, many (but not all) of the interactions that determine protein structure occur between amino acids that are a short distance away from each other in the sequence. This is particularly true in the irregular parts of protein structure, often called loops. In this work, we have performed a statistical analysis of the structure of the protein backbone in loops as a function of the protein sequence. We have determined how an amino acid bends the local backbone due to its amino acid type and the amino acid types of its neighbors. We used a recently developed statistical method that is particularly suited to this problem. The analysis shows that backbone conformation prediction can be improved using the information in the statistical distributions we have developed.
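The neighbour-dependent statistics described above can be caricatured with simple binned counts: condition the ϕ,ψ distribution on the residue type and one neighbour, then normalize. The real work replaces these raw histograms with hierarchical Dirichlet process density estimates; the observations below are fabricated:

```python
from collections import defaultdict

# Simplified sketch of neighbour-dependent Ramachandran statistics: bin
# phi,psi observations by (residue, right-neighbour) context and turn counts
# into probabilities. A toy stand-in for the hierarchical Dirichlet process
# densities; all observations here are invented.

BIN = 20.0  # degrees per phi/psi bin

def bin_of(phi, psi):
    return (int((phi + 180.0) // BIN), int((psi + 180.0) // BIN))

counts = defaultdict(lambda: defaultdict(int))

def observe(res, neighbour, phi, psi):
    counts[(res, neighbour)][bin_of(phi, psi)] += 1

def probability(res, neighbour, phi, psi):
    """Empirical probability of the phi,psi bin in this sequence context."""
    ctx = counts[(res, neighbour)]
    total = sum(ctx.values())
    return ctx[bin_of(phi, psi)] / total if total else 0.0

# Toy observations: alanine before proline favours a different region
observe("ALA", "PRO", -70.0, 150.0)
observe("ALA", "PRO", -75.0, 145.0)
observe("ALA", "GLY", -60.0, -45.0)
```

A raw histogram like this needs enormous amounts of data per context; the hierarchical prior in the actual method lets sparse neighbour-specific contexts borrow strength from the residue's overall distribution.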
The PHENIX software for macromolecular structure determination is described.
Macromolecular X-ray crystallography is routinely applied to understand biological processes at a molecular level. However, significant time and effort are still required to solve and complete many of these structures because of the need for manual interpretation of complex numerical data using many software packages and the repeated use of interactive three-dimensional graphics. PHENIX has been developed to provide a comprehensive system for macromolecular crystallographic structure solution with an emphasis on the automation of all procedures. This has relied on the development of algorithms that minimize or eliminate subjective input, the development of algorithms that automate procedures that are traditionally performed by hand and, finally, the development of a framework that allows a tight integration between the algorithms.
PHENIX; Python; macromolecular crystallography; algorithms
In order to be successful, CASP experiments require experimentally determined protein structures; these structures form the basis of the experiment. Structural genomics groups have provided the vast majority of these structures in recent editions of CASP. Before the structure prediction assessment can begin, these target structures must be divided into structural domains for assessment purposes, and each assessment unit must be assigned to one or more tertiary structure prediction categories. In CASP8, target domain boundaries were based on visual inspection of targets and their experimental data, and on superpositions of the target structures with related template structures. As in CASP7, target domains were broadly classified into two categories: “template-based modeling” and “free modeling”. Assessment categories were determined by the structural similarity between the target domain and the nearest structural templates in the PDB and by whether or not related structural templates were used to build the models. The vast majority of the 164 assessment units in CASP8 were classified as template-based modeling; just 10 target domains were defined as free modeling. In addition, three targets were assessed in both the free modeling and template-based modeling categories, and a subset of 50 template-based models was evaluated as part of the “high accuracy” subset. The targets submitted for CASP8 confirmed a trend that has been apparent since CASP5: targets submitted to the CASP experiments are becoming easier to predict.
Protein Structure; Domains; Assessment Units; Structure Prediction; Structure Classification
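The two-way split described above, template similarity plus template usage determining the assessment category, can be sketched as a simple rule. The TM-score cutoff of 0.5 is a common fold-similarity heuristic used here as an assumption, not CASP8's official criterion (which combined structural similarity with expert judgment).

```python
def classify_assessment_unit(template_tm_score, templates_used_in_models,
                             tm_cutoff=0.5):
    """Illustrative (not CASP's official) classification rule: a target
    domain with a clearly related structural template in the PDB that
    predictors actually exploited is 'template-based modeling' (TBM);
    otherwise it is 'free modeling' (FM). The 0.5 TM-score cutoff is a
    common fold-similarity heuristic, assumed here for illustration."""
    if template_tm_score >= tm_cutoff and templates_used_in_models:
        return "TBM"
    return "FM"
```

For example, a domain with a close template that predictors used would classify as TBM, while a novel fold with no usable template would fall into FM.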
MolProbity structure validation will diagnose most local errors in macromolecular crystal structures and help to guide their correction.
MolProbity is a structure-validation web service that provides broad-spectrum solidly based evaluation of model quality at both the global and local levels for both proteins and nucleic acids. It relies heavily on the power and sensitivity provided by optimized hydrogen placement and all-atom contact analysis, complemented by updated versions of covalent-geometry and torsion-angle criteria. Some of the local corrections can be performed automatically in MolProbity and all of the diagnostics are presented in chart and graphical forms that help guide manual rebuilding. X-ray crystallography provides a wealth of biologically important molecular data in the form of atomic three-dimensional structures of proteins, nucleic acids and increasingly large complexes in multiple forms and states. Advances in automation, in everything from crystallization to data collection to phasing to model building to refinement, have made solving a structure using crystallography easier than ever. However, despite these improvements, local errors that can affect biological interpretation are widespread at low resolution and even high-resolution structures nearly all contain at least a few local errors such as Ramachandran outliers, flipped branched protein side chains and incorrect sugar puckers. It is critical both for the crystallographer and for the end user that there are easy and reliable methods to diagnose and correct these sorts of errors in structures. MolProbity is the authors’ contribution to helping solve this problem and this article reviews its general capabilities, reports on recent enhancements and usage, and presents evidence that the resulting improvements are now beneficially affecting the global database.
all-atom contacts; clashscore; automated correction; KiNG; ribose pucker; Ramachandran plots; side-chain rotamers; model quality; systematic errors; database improvement
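MolProbity's headline metric, the clashscore, counts serious steric clashes (van der Waals overlap of at least 0.4 Å) per 1000 atoms. A brute-force pairwise sketch of that idea is shown below; real MolProbity uses all-atom contact dots with explicitly placed hydrogens and excludes bonded pairs, so this toy version is only an illustration of the counting.

```python
import math

def clashscore(atoms, vdw, clash_cutoff=0.4):
    """Sketch of the clashscore idea: count pairs of atoms whose van der
    Waals shells interpenetrate by at least clash_cutoff (0.4 Angstrom
    in MolProbity), reported per 1000 atoms. `atoms` is a list of
    (element, x, y, z); `vdw` maps element -> radius in Angstrom.
    Illustration only: bonded-pair exclusion and hydrogen placement,
    which MolProbity handles, are omitted here."""
    clashes = 0
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            ei, xi, yi, zi = atoms[i]
            ej, xj, yj, zj = atoms[j]
            d = math.dist((xi, yi, zi), (xj, yj, zj))
            overlap = vdw[ei] + vdw[ej] - d
            if overlap >= clash_cutoff:
                clashes += 1
    return 1000.0 * clashes / len(atoms)

# Two carbons 2.5 A apart overlap by 1.7 + 1.7 - 2.5 = 0.9 A (a clash);
# the third atom is far away and clashes with nothing.
atoms = [("C", 0.0, 0.0, 0.0), ("C", 2.5, 0.0, 0.0),
         ("C", 10.0, 0.0, 0.0)]
score = clashscore(atoms, {"C": 1.7})
```

One clash among three atoms gives a (deliberately extreme) score of about 333 per 1000 atoms; well-refined high-resolution structures score in the low single digits.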
Misfit sidechains in protein crystal structures are a stumbling block in using those structures to direct further scientific inference. Problems due to surface disorder and poor electron density are very difficult to address, but a large class of systematic errors are quite common even in well-ordered regions, resulting in sidechains fit backwards into local density in predictable ways. The MolProbity web site is effective at diagnosing such errors, and can perform reliable automated correction of a few special cases such as 180° flips of Asn or Gln sidechain amides, using all-atom contacts and H-bond networks. However, most at-risk residues involve tetrahedral geometry, and their valid correction requires rigorous evaluation of sidechain movement and sometimes backbone shift. The current work extends the benefits of robust automated correction to more sidechain types. The Autofix method identifies candidate systematic, flipped-over errors in Leu, Thr, Val, and Arg using MolProbity quality statistics, proposes a corrected position using real-space refinement with rotamer selection in Coot, and accepts or rejects the correction based on improvement in MolProbity criteria and on χ angle change. Criteria are chosen conservatively, after examining many individual results, to ensure valid correction. To test this method, Autofix was run and analyzed for 945 representative PDB files and on the 50S ribosomal subunit of file 1YHQ. Over 40% of Leu, Val, and Thr outliers and 15% of Arg outliers were successfully corrected, resulting in a total of 3,679 corrected sidechains, or 4 per structure on average.
Summary: A common class of misfit sidechains in protein crystal structures is due to systematic errors that place the sidechain backwards into the local electron density. A fully automated method called “Autofix” identifies such errors for Leu, Val, Thr, and Arg and corrects over one third of them, using MolProbity validation criteria and Coot real-space refinement of rotamers.
Electronic supplementary material
The online version of this article (doi:10.1007/s10969-008-9045-8) contains supplementary material, which is available to authorized users.
Automation; Structure improvement; Crystallography; Sidechain rotamers; Protein/RNA interactions
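Autofix's conservative accept/reject step, keep a refit sidechain only when validation improves and the χ-angle change is consistent with a flipped-over fit, can be sketched as follows. The specific metrics and thresholds below are illustrative assumptions, not the paper's exact criteria.

```python
def accept_correction(before, after, chi_change_deg):
    """Sketch of a conservative accept/reject rule in the spirit of
    Autofix (thresholds here are illustrative, not the paper's):
    keep the refit sidechain only if no MolProbity-style metric gets
    worse, at least one improves, and the chi-angle change is
    consistent with a flipped-over fit (i.e., closer to 180 degrees
    than to 0). `before`/`after` map metric name -> outlier count."""
    no_worse = all(after[k] <= before[k] for k in before)
    improved = any(after[k] < before[k] for k in before)
    # Wrap the chi change into (-180, 180] and require a large rotation.
    wrapped = abs(((chi_change_deg + 180.0) % 360.0) - 180.0)
    flip_like = wrapped > 120.0
    return no_worse and improved and flip_like

# Hypothetical per-residue metrics before and after real-space refit:
before = {"clashes": 3, "rotamer_outliers": 1}
after = {"clashes": 0, "rotamer_outliers": 0}
```

A near-180° rotation with improved metrics is accepted; a small rotation, or any metric that worsens, causes the proposed correction to be rejected.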
The crystal structure of the hypothetical protein PF0899 from P. furiosus has been determined to 1.85 Å resolution.
The hypothetical protein PF0899 is a 95-residue peptide from the hyperthermophilic archaeon Pyrococcus furiosus that represents a gene family with six members. P. furiosus ORF PF0899 has been cloned, expressed and crystallized and its structure has been determined by the Southeast Collaboratory for Structural Genomics (http://www.secsg.org). The structure was solved using the SCA2Structure pipeline from multiple data sets and has been refined to 1.85 Å against the highest resolution data set collected (a presumed gold derivative), with a crystallographic R factor of 21.0% and an Rfree of 24.0%. The refined structure shows some structural similarity to a wedge-shaped domain observed in the structure of the major capsid protein from bacteriophage HK97, suggesting that PF0899 may be a structural protein.
structural genomics; SECSG; Pfu-871755; PF0899; high-throughput structure