Under favourable circumstances, density modification and polyalanine tracing with SHELXE can be used to improve and validate potential solutions from molecular replacement.
Although the program SHELXE was originally intended for the experimental phasing of macromolecules, it can also prove useful for expanding a small protein fragment to an almost complete polyalanine trace of the structure, given a favourable combination of native data resolution (better than about 2.1 Å) and solvent content. A correlation coefficient (CC) of more than 25% between the native structure factors and those calculated from the polyalanine trace appears to be a reliable indicator of success and has already been exploited in a number of pipelines. Here, a more detailed account of this usage of SHELXE for molecular-replacement solutions is given.
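As a rough illustration of the success criterion described above, the following minimal sketch computes a correlation coefficient between native and trace-derived structure-factor amplitudes and compares it with the 25% threshold. It assumes the plain Pearson form over all reflections; the weighting and resolution handling actually used by SHELXE may differ, and the function names `amplitude_cc` and `trace_looks_solved` are hypothetical.

```python
import numpy as np

def amplitude_cc(f_obs, f_calc):
    """Pearson correlation coefficient, in percent, between two amplitude arrays.

    Assumption: an unweighted CC over all supplied reflections.
    """
    f_obs = np.asarray(f_obs, dtype=float)
    f_calc = np.asarray(f_calc, dtype=float)
    if f_obs.shape != f_calc.shape or f_obs.size < 2:
        raise ValueError("need two equally sized arrays with at least 2 reflections")
    d_obs = f_obs - f_obs.mean()
    d_calc = f_calc - f_calc.mean()
    denom = np.sqrt((d_obs ** 2).sum() * (d_calc ** 2).sum())
    if denom == 0.0:
        return 0.0  # constant amplitudes carry no correlation signal
    return 100.0 * float((d_obs * d_calc).sum() / denom)

def trace_looks_solved(cc_percent, threshold=25.0):
    """Apply the 'CC of more than 25%' rule of thumb quoted in the abstract."""
    return cc_percent > threshold
```

In use, `f_calc` would hold amplitudes calculated from the polyalanine trace for the same reflections as the native `f_obs`.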
molecular replacement; density modification; autotracing; SHELX
Experimental phasing with SHELXC/D/E has been enhanced by the incorporation of main-chain tracing into the iterative density modification; this also provides a simple and effective way of exploiting noncrystallographic symmetry.
The programs SHELXC, SHELXD and SHELXE are designed to provide simple, robust and efficient experimental phasing of macromolecules by the SAD, MAD, SIR, SIRAS and RIP methods and are particularly suitable for use in automated structure-solution pipelines. This paper gives a general account of experimental phasing using these programs and describes the extension of iterative density modification in SHELXE by the inclusion of automated protein main-chain tracing. This gives a good indication as to whether the structure has been solved and enables interpretable maps to be obtained from poorer starting phases. The autotracing algorithm starts with the location of possible seven-residue α-helices and common tripeptides. After extension of these fragments in both directions, various criteria are used to decide whether to accept or reject the resulting poly-Ala traces. Noncrystallographic symmetry (NCS) is applied to the traced fragments, not to the density. Further features are the use of a ‘no-go’ map to prevent the traces from passing through heavy atoms or symmetry elements and a splicing technique to combine the best parts of traces (including those generated by NCS) that partly overlap.
experimental phasing of macromolecules; density modification; main-chain tracing; noncrystallographic symmetry; SHELX
Phase information from both MIRAS and MR was used to produce an interpretable electron-density map of the novel type II restriction endonuclease SgrAI bound to DNA. The MR solution corrected an instructive error in the initially chosen averaging transformation.
Uninterpretable electron-density maps were obtained using either MIRAS phases or MR phases in attempts to determine the structure of the type II restriction endonuclease SgrAI bound to DNA. While neither solution strategy was particularly promising (map correlation coefficients of 0.29 and 0.22 with the final model, respectively, for the MIRAS and MR phases and Phaser Z scores of 4.0 and 4.3 for the rotation and translation searches), phase combination followed by density modification gave a readily interpretable map. MR with a distantly related model located a dimer in the asymmetric unit and provided the correct transformation to use in averaging electron density between SgrAI subunits. MIRAS data sets with low substitution and MR solutions from only distantly related models should not be ignored, as poor-quality starting phases can be significantly improved. The bootstrapping strategy employed to improve the initial MIRAS phases is described.
SgrAI; MIRAS; phase combination; molecular replacement; density averaging; restriction enzymes
Four case studies in using maximum-likelihood molecular replacement, as implemented in the program Phaser, to solve structures of protein complexes are described.
Molecular replacement (MR) generally becomes more difficult as the number of components in the asymmetric unit requiring separate MR models (i.e. the dimensionality of the search) increases. When the proportion of the total scattering contributed by each search component is small, the signal in the search for each component in isolation is weak or non-existent. Maximum-likelihood MR functions enable complex asymmetric units to be built up from individual components with a ‘tree search with pruning’ approach. This method, as implemented in the automated search procedure of the program Phaser, has been very successful in solving many previously intractable MR problems. However, there are a number of cases in which the automated search procedure of Phaser is suboptimal or encounters difficulties. These include cases where there are a large number of copies of the same component in the asymmetric unit or where the components of the asymmetric unit have greatly varying B factors. Two case studies are presented to illustrate how Phaser can be used to best advantage in the standard ‘automated MR’ mode and two case studies are used to show how to modify the automated search strategy for problematic cases.
macromolecular crystallography; molecular replacement; maximum likelihood
How a visual stimulus is initially categorized as a face in a network of human brain areas remains largely unclear. Hierarchical neuro-computational models of face perception assume that the visual stimulus is first decomposed into local parts in lower order visual areas. These parts would then be combined into a global representation in higher order face-sensitive areas of the occipito-temporal cortex. Here we tested this view in fMRI with visual stimuli that are categorized as faces based on their global configuration rather than their local parts (two-tone Mooney figures and Arcimboldo's facelike paintings). Compared to the same inverted visual stimuli that are not categorized as faces, these stimuli activated the right middle fusiform gyrus (“Fusiform face area”) and superior temporal sulcus (pSTS), with no significant activation in the posteriorly located inferior occipital gyrus (i.e., no “occipital face area”). This observation is strengthened by behavioral and neural evidence for normal face categorization of these stimuli in a brain-damaged prosopagnosic patient whose intact right middle fusiform gyrus and superior temporal sulcus are devoid of any potential face-sensitive inputs from the lesioned right inferior occipital cortex. Together, these observations indicate that face-preferential activation may emerge in higher order visual areas of the right hemisphere without any face-preferential inputs from lower order visual areas, supporting a non-hierarchical view of face perception in the visual cortex.
face perception; visual cortex; Mooney; fusiform gyrus; prosopagnosia; FFA
The pitfalls of experimental phasing are described.
Developments in protein crystal structure determination by experimental phasing are reviewed, emphasizing the theoretical continuum between experimental phasing, density modification, model building and refinement. Traditional notions of the composition of the substructure and the best coefficients for map generation are discussed. Pitfalls such as determining the enantiomorph, identifying centrosymmetry (or pseudo-symmetry) in the substructure and crystal twinning are discussed in detail. An appendix introduces combined real–imaginary log-likelihood gradient map coefficients for SAD phasing and their use for substructure completion as implemented in the software Phaser. Supplementary material includes animated probabilistic Harker diagrams showing how maximum-likelihood-based phasing methods can be used to refine parameters in the case of SIR and MIR; it is hoped that these will be useful for those teaching best practice in experimental phasing methods.
enantiomers; handedness; absolute configuration; chirality; twinning; experimental phasing
A description is given of Phaser-2.1: software for phasing macromolecular crystal structures by molecular replacement and single-wavelength anomalous dispersion phasing.
Phaser is a program for phasing macromolecular crystal structures by both molecular replacement and experimental phasing methods. The novel phasing algorithms implemented in Phaser have been developed using maximum likelihood and multivariate statistics. For molecular replacement, the new algorithms have proved to be significantly better than traditional methods in discriminating correct solutions from noise, and for single-wavelength anomalous dispersion experimental phasing, the new algorithms, which account for correlations between F+ and F−, give better phases (lower mean phase error with respect to the phases given by the refined structure) than those that use mean F and anomalous differences ΔF. One of the design concepts of Phaser was that it be capable of a high degree of automation. To this end, Phaser (written in C++) can be called directly from Python, although it can also be called using traditional CCP4 keyword-style input. Phaser is a platform for future development of improved phasing methods and their release, including source code, to the crystallographic community.
computer programs; molecular replacement; SAD phasing; likelihood; structural genomics
SAD data can be used in Phaser to solve novel structures, supplement molecular-replacement phase information or identify anomalous scatterers from a final refined model.
Phaser is a program that implements likelihood-based methods to solve macromolecular crystal structures, currently by molecular replacement or single-wavelength anomalous diffraction (SAD). SAD phasing is based on a likelihood target derived from the joint probability distribution of observed and calculated pairs of Friedel-related structure factors. This target combines information from the total structure factor (primarily non-anomalous scattering) and the difference between the Friedel mates (anomalous scattering). Phasing starts from a substructure, which is usually but not necessarily a set of anomalous scatterers. The substructure can also be a protein model, such as one obtained by molecular replacement. Additional atoms are found using a log-likelihood gradient map, which shows the sites where the addition of scattering from a particular atom type would improve the likelihood score. An automated completion algorithm adds new sites, choosing optionally among different atom types, adds anisotropic B-factor parameters if appropriate and deletes atoms that refine to low occupancy. Log-likelihood gradient maps can also identify which atoms in a refined protein structure are anomalous scatterers, such as metal or halide ions. These maps are more sensitive than conventional model-phased anomalous difference Fouriers and the iterative completion algorithm is able to find a significantly larger number of convincing sites.
SAD phasing; likelihood; molecular replacement
Test studies have been conducted on five crystal structures of large molecular assemblies, in which EM maps are used as models for structure solution by molecular replacement using various standard MR packages such as AMoRe, MOLREP and Phaser.
Multi-component molecular complexes are increasingly being tackled by structural biology, bringing X-ray crystallography into the purview of electron-microscopy (EM) studies. X-ray crystallography can utilize a low-resolution EM map for structure determination followed by phase extension to high resolution. Test studies have been conducted on five crystal structures of large molecular assemblies, in which EM maps are used as models for structure solution by molecular replacement (MR) using various standard MR packages such as AMoRe, MOLREP and Phaser. The results demonstrate that EM maps are viable models for molecular replacement. Possible difficulties in data analysis, such as the effects of the EM magnification error, and the effect of MR positional/rotational errors on phase extension are discussed.
electron microscopy; molecular replacement
A method for automated macromolecular main-chain model building is described.
An algorithm for the automated macromolecular model building of polypeptide backbones is described. The procedure is hierarchical. In the initial stages, many overlapping polypeptide fragments are built. In subsequent stages, the fragments are extended and then connected. Identification of the locations of helical and β-strand regions is carried out by FFT-based template matching. Fragment libraries of helices and β-strands from refined protein structures are then positioned at the potential locations of helices and strands and the longest segments that fit the electron-density map are chosen. The helices and strands are then extended using fragment libraries consisting of sequences three amino acids long derived from refined protein structures. The resulting segments of polypeptide chain are then connected by choosing those which overlap at two or more Cα positions. The fully automated procedure has been implemented in RESOLVE and is capable of model building at resolutions as low as 3.5 Å. The algorithm is useful for building a preliminary main-chain model that can serve as a basis for refinement and side-chain addition.
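The connection step described above (joining segments that overlap at two or more Cα positions) can be sketched as follows. This is only an illustration under stated assumptions, not the RESOLVE implementation: segments are taken to be ordered arrays of Cα coordinates, an overlap is taken to mean that the tail of one segment coincides with the head of the next within a hypothetical tolerance `tol` (in Å), and the function names are invented for this example.

```python
import numpy as np

def overlap_length(seg_a, seg_b, tol=1.0):
    """Largest k such that the last k CA atoms of seg_a lie within tol (Å)
    of the first k CA atoms of seg_b; 0 if there is no such overlap."""
    a = np.asarray(seg_a, dtype=float)
    b = np.asarray(seg_b, dtype=float)
    for k in range(min(len(a), len(b)), 0, -1):
        dist = np.linalg.norm(a[-k:] - b[:k], axis=1)
        if np.all(dist <= tol):
            return k
    return 0

def join_if_overlapping(seg_a, seg_b, min_overlap=2, tol=1.0):
    """Return the merged chain if the segments overlap at >= min_overlap CA
    positions, otherwise None. The overlapping atoms are taken from seg_a."""
    k = overlap_length(seg_a, seg_b, tol)
    if k < min_overlap:
        return None
    a = np.asarray(seg_a, dtype=float)
    b = np.asarray(seg_b, dtype=float)
    return np.vstack([a, b[k:]])
```

Requiring at least two shared positions, rather than one, fixes the direction of the chain across the join as well as its location.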
model building; template matching; fragment extension
The functionality of the molecular-replacement pipeline phaser.MRage is introduced and illustrated with examples.
Phaser.MRage is a molecular-replacement automation framework that implements a full model-generation workflow and provides several layers of model exploration to the user. It is designed to handle a large number of models and can distribute calculations efficiently onto parallel hardware. In addition, phaser.MRage can identify correct solutions and use this information to accelerate the search. Firstly, it can quickly score all alternative models of a component once a correct solution has been found. Secondly, it can perform extensive analysis of identified solutions to find protein assemblies and can employ assembled models for subsequent searches. Thirdly, it is able to use a priori assembly information (derived from, for example, homologues) to speculatively place and score molecules, thereby customizing the search procedure to a certain class of protein molecule (for example, antibodies) and incorporating additional biological information into molecular replacement.
molecular replacement; pipeline; automation; phaser.MRage
We present density functional theory (DFT) calculations at the X3LYP/D95(d,p) level on the solvation of polyalanine α-helices in water. The study includes the effects of discrete water molecules and the CPCM and AMSOL SM5.2 solvent continuum models, both separately and in combination. We find that individual water molecules cooperatively hydrogen-bond to both the C- and N-termini of the helix, which results in increases in the dipole moment of the helix/water complex to more than the vector sum of their individual dipole moments. These waters are found to be more stable than in bulk solvent. On the other hand, individual waters that interact with the backbone lower the dipole moment of the helix/water complex to below that of the helix itself. Small clusters of waters at the termini increase the dipole moments of the helix/water aggregates, but the effect diminishes as more waters are added. We discuss the somewhat complex behavior of the helix with the discrete waters in the continuum models.
The side-chains of the residues of glutamine (Q) and asparagine (N) contain amide groups. These can H-bond to each other in patterns similar to those of the backbone amides in α-helices. We show that mutating multiple Q's for alanines (A's) in a polyalanine helix stabilizes the helical structure, while similar mutations with multiple N's do not. We suggest that modification of peptides by incorporating Q's in such positions can make more robust helices that can be used to test the effects of secondary structures in biochemical experiments linked to proteins with variable structures such as tau and α-synuclein.
DEN refinement and automated model building with AutoBuild were used to determine the structure of a putative succinyl-diaminopimelate desuccinylase from C. glutamicum. This difficult case of molecular-replacement phasing shows that the synergism between DEN refinement and AutoBuild outperforms standard refinement protocols.
Phasing by molecular replacement remains difficult for targets that are far from the search model or in situations where the crystal diffracts only weakly or to low resolution. Here, the process of determining and refining the structure of Cgl1109, a putative succinyl-diaminopimelate desuccinylase from Corynebacterium glutamicum, at ∼3 Å resolution is described using a combination of homology modeling with MODELLER, molecular-replacement phasing with Phaser, deformable elastic network (DEN) refinement and automated model building using AutoBuild in a semi-automated fashion, followed by final refinement cycles with phenix.refine and Coot. This difficult molecular-replacement case illustrates the power of including DEN restraints derived from a starting model to guide the movements of the model during refinement. The resulting improved model phases provide better starting points for automated model building and produce more significant difference peaks in anomalous difference Fourier maps to locate anomalous scatterers than does standard refinement. This example also illustrates a current limitation of automated procedures that require manual adjustment of local sequence misalignments between the homology model and the target sequence.
reciprocal-space refinement; DEN refinement; real-space refinement; automated model building; succinyl-diaminopimelate desuccinylase
A number of techniques for the location of small and medium-sized model fragments in experimentally phased electron-density maps are explored. The application of one of these techniques to automated model building is discussed.
Molecular replacement is a powerful tool for the location of large models using structure-factor magnitudes alone. When phase information is available, it becomes possible to locate smaller fragments of the structure ranging in size from a few atoms to a single domain. The calculation is demanding, requiring a six-dimensional rotation and translation search. A number of approaches have been developed to this problem and a selection of these are reviewed in this paper. The application of one of these techniques to the problem of automated model building is explored in more detail, with particular reference to the problem of sequencing a protein main-chain trace.
model fragments; electron-density maps; model building
We have studied the stability of polyalanine alpha-helices with lysine residues added at the C- and N-termini in the gas phase and in aqueous solution. Monte Carlo simulations with the fixed-charge OPLS-AA and our polarizable POSSIM force fields were carried out. The results of the simulations confirm the previously observed phenomenon of the helix being stable with the LYS residue at the C-terminus and losing its helical structure if the charged LYS residue is located at the N-terminus of the polypeptide in the gas phase. The OPLS-AA and POSSIM force fields performed essentially similarly, confirming the validity of both for reproducing and predicting structures of such polypeptides. We have also studied the effect of replacing the normal N- and C-termini with methyl capping (this approach is often used in computational studies). Our results have demonstrated that the structure and stability of the polypeptides do not depend significantly on such a substitution, although details of the resulting structure may change. The liquid-state simulations produced stable alpha-helices regardless of the position of the protonated lysine residue. Overall, we have validated our polarizable POSSIM force field and the techniques used in the simulations, since the change of the helix structure as a function of the position of the LYS residue depends on a fine balance of energy contributions, and our methodology reproduced this balance well.
force fields; electrostatic polarization; alpha-helix; gas-phase peptide conformation; polyalanine
We extend PRIME, an intermediate-resolution protein model previously used in simulations of the aggregation of polyalanine and polyglutamine, to the description of the geometry and energetics of peptides containing all twenty amino acid residues. The 20 amino acid side chains are classified into 14 groups according to their hydrophobicity, polarity, size, charge and potential for side-chain hydrogen bonding. The parameters for extended PRIME, called PRIME 20, include hydrogen-bonding energies, side-chain interaction range and energy, and excluded volume. The parameters are obtained by applying a perceptron-learning algorithm and a modified stochastic learning algorithm that optimizes the energy gap between 711 known native states from the PDB and decoy structures generated by gapless threading. The number of independent pair-interaction parameters is chosen to be small enough to be physically meaningful yet large enough to give reasonably accurate results in discriminating decoys from native structures. The most physically meaningful results are obtained with 19 energy parameters.
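The idea of learning energy parameters that separate native states from decoys can be illustrated with a plain perceptron. The sketch below is an assumption-laden toy, not the PRIME 20 procedure: it supposes that the energy is linear in a feature vector (for example, counts of each interaction type), pairs each native structure with a single decoy, and omits the modified stochastic learning step mentioned in the abstract. All names are hypothetical.

```python
import numpy as np

def train_energy_perceptron(native_feats, decoy_feats, epochs=100, lr=0.1):
    """Learn weights w so that E(decoy) - E(native) = w . (f_decoy - f_native) > 0.

    native_feats, decoy_feats: arrays of shape (n_pairs, n_params); row i of
    each array describes the native state and one decoy of the same protein.
    """
    native = np.asarray(native_feats, dtype=float)
    decoys = np.asarray(decoy_feats, dtype=float)
    diffs = decoys - native          # feature difference for each pair
    w = np.zeros(diffs.shape[1])     # energy parameters, initially zero
    for _ in range(epochs):
        mistakes = 0
        for d in diffs:
            if w @ d <= 0.0:         # decoy not higher in energy than native
                w += lr * d          # classic perceptron update
                mistakes += 1
        if mistakes == 0:            # every decoy is now above its native state
            break
    return w
```

If the native/decoy pairs are linearly separable, the loop stops once every decoy has a higher energy than its native state; otherwise it simply runs for the given number of epochs.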
The structure of Ca2+-bound EF-hand protein S100A2 was determined by calcium and sulfur SAD at a wavelength of 0.90 Å.
Human S100A2 is an EF-hand protein and acts as a major tumour suppressor, binding and activating p53 in a Ca2+-dependent manner. Ca2+-bound S100A2 was crystallized and its structure was determined based on the anomalous scattering provided by six S atoms from methionine residues and four calcium ions present in the asymmetric unit. Although the diffraction data were recorded at a wavelength of 0.90 Å, which is usually not assumed to be suitable for calcium/sulfur SAD, the anomalous signal was satisfactory. A nine-atom substructure was determined at 1.8 Å resolution using SHELXD, and SHELXE was used for density modification and phase extension to 1.3 Å resolution. The electron-density map obtained was well interpretable and could be used for automated model building by ARP/wARP.
S100A2; EF-hands; calcium; sulfur SAD
Major histocompatibility proteins share a common overall structure or peptide binding groove. Two binding groove domains, on the same chain for major histocompatibility class I or on two different chains for major histocompatibility class II, contribute to that structure that consists of two α-helices (“wall”) and a sheet of eight anti-parallel beta strands (“floor”). Apart from the peptide presented in the groove, the major histocompatibility α-helices play a central role for the interaction with the T cell receptor. This study presents a generalized mathematical approach for the characterization of these helices. We employed polynomials of degree 1 to 7 and splines with 1 to 2 nodes based on polynomials of degree 1 to 7 on the α-helices projected on their principal components. We evaluated all models with a corrected Akaike Information Criterion to determine which model represents the α-helices in the best way without overfitting the data. This method is applicable for both the stationary and the dynamic characterization of α-helices. By deriving differential geometric parameters from these models one obtains a reliable method to characterize and compare α-helices for a broad range of applications.
Program title: MH2c (MH helix curves)
Catalogue identifier: AELX_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AELX_v1_0.html
Program obtainable from: CPC Program Library, Queenʼs University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 327 565
No. of bytes in distributed program, including test data, etc.: 17 433 656
Distribution format: tar.gz
Programming language: Matlab
Computer: Personal computer architectures
Operating system: Windows, Linux, Mac (all systems on which Matlab can be installed)
RAM: Depends on the trajectory size, min. 1 GB (Matlab)
Classification: 2.1, 4.9, 4.14
External routines: Curve Fitting Toolbox and Statistic Toolbox of Matlab
Nature of problem: Major histocompatibility (MH) proteins share a similar overall structure. However, identical MH alleles which present different peptides differ by subtle conformational alterations. One hypothesis is that such conformational differences could be another level of T cell regulation. By this software package we present a reliable and systematic way to compare different MH structures to each other.
Solution method: We tested several fitting approaches on all available experimental crystal structures of MH to obtain an overall picture of how to describe MH helices. For this purpose we transformed all complexes into the same space and applied splines and polynomials of several degrees to them. To draw a general conclusion which method fits them best we employed the “corrected Akaike Information Criterion”. The software is applicable for all kinds of helices of biomolecules.
Running time: Depends on the data, for a single stationary structure the runtime should not exceed a few seconds.
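The model-selection step summarized above (comparing polynomial fits of different degree with a corrected Akaike Information Criterion) might be sketched as follows. This is written in Python rather than the Matlab of the distributed package and is not taken from MH2c: it assumes the common least-squares form AIC = n ln(RSS/n) + 2k with the small-sample correction 2k(k+1)/(n−k−1), counts the noise variance as one of the k parameters, and covers plain polynomials only (no splines). Function names are hypothetical.

```python
import numpy as np

def aicc_for_polynomial(x, y, degree):
    """Corrected AIC of a least-squares polynomial fit of the given degree.

    Assumption: Gaussian residuals, so AIC = n*ln(RSS/n) + 2k, where
    k = (degree + 1) coefficients + 1 noise variance.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    k = degree + 2
    if n - k - 1 <= 0:
        raise ValueError("too few points for the AICc correction at this degree")
    coeffs = np.polyfit(x, y, degree)
    rss = float(np.sum((np.polyval(coeffs, x) - y) ** 2))
    rss = max(rss, np.finfo(float).tiny)   # guard against log(0) for exact fits
    aic = n * np.log(rss / n) + 2 * k
    return aic + 2.0 * k * (k + 1) / (n - k - 1)

def best_degree(x, y, degrees=range(1, 8)):
    """Degree (1 to 7 by default) with the lowest AICc."""
    return min(degrees, key=lambda d: aicc_for_polynomial(x, y, d))
```

Here `x` would be the position along the first principal component of a helix and `y` its coordinate along another component; the correction term penalizes high degrees more strongly when the helix contributes few points.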
MH2c, MH helix curves (name of software); TR, T cell receptor; p, peptide; MH, major histocompatibility; MH1, major histocompatibility class I; MH2, major histocompatibility class II; G, binding groove; CDR, complementarity determining region; MD, Molecular Dynamics; PDB, Protein Data Bank; VMD, Visual Molecular Dynamics; PCA, Principal Component Analysis; PC, principal component; AIC, Akaike Information Criterion; cAIC, corrected Akaike Information Criterion; IMGT®, the international ImMunoGeneTics information system®; MH; MHC; Helix; Akaike Information Criterion; Minimization and fitting; Utility; Structure and properties; Molecular dynamics simulation; Proteins; Secondary structure; Theory, modeling, and computer simulation; Conformational changes
This work considers the physics of a brush formed by polymers capable of undergoing a helix-coil transition. A self-consistent field approximation for strongly stretched polymer chains is used in combination with a lattice model of the interaction energy in helix-coil mixtures. Crowding-induced chain stretching stabilizes helix formation at moderate tethering densities while high tethering density causes sufficiently strong stretching to unravel segments of the helix, resulting in distinct layers of monomer density and helical content. Compared to a random-coil brush at low-to-moderate tethering density, a helicogenic brush is less resistant to compression in the direction perpendicular to stretching due to easy alignment of helices and fewer unfavorable interactions between helical segments. At higher tethering density, the abovementioned stretch-induced decrease in helical content resists further compression. The proposed model is useful for understanding an emerging class of biomaterials that utilize helix-forming polymer brushes to induce shape changes or to stabilize biofunctional helical peptide sequences.
Protein fragments suitable for use in molecular replacement can be generated by normal-mode perturbation, analysis of the difference distance matrix of the original versus normal-mode perturbed structures, and SCEDS, a score that measures the sphericity, continuity, equality and density of the resulting fragments.
A method is described for generating protein fragments suitable for use as molecular-replacement (MR) template models. The template model for a protein suspected to undergo a conformational change is perturbed along combinations of low-frequency normal modes of the elastic network model. The unperturbed structure is then compared with each perturbed structure in turn and the structurally invariant regions are identified by analysing the difference distance matrix. These fragments are scored with SCEDS, which is a combined measure of the sphericity of the fragments, the continuity of the fragments with respect to the polypeptide chain, the equality in number of atoms in the fragments and the density of Cα atoms in the triaxial ellipsoid of the fragment extents. The fragment divisions with the highest SCEDS are then used as separate template models for MR. Test cases show that where the protein contains fragments that undergo a change in juxtaposition between template model and target, SCEDS can identify fragments that lead to a lower R factor after ten cycles of all-atom refinement with REFMAC5 than the original template structure. The method has been implemented in the software Phaser.
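The comparison of unperturbed and perturbed structures via a difference distance matrix can be illustrated with the short sketch below. It covers only that step, under simple assumptions: both structures are given as matching arrays of Cα coordinates, and atom pairs are called structurally invariant when their distance changes by no more than a hypothetical tolerance `tol` (in Å). The normal-mode perturbation and the SCEDS score itself are not reproduced, and the function names are invented for this example.

```python
import numpy as np

def distance_matrix(coords):
    """All-against-all Euclidean distances for an (n_atoms, 3) array."""
    c = np.asarray(coords, dtype=float)
    return np.linalg.norm(c[:, None, :] - c[None, :, :], axis=-1)

def difference_distance_matrix(coords_ref, coords_perturbed):
    """Change in every interatomic distance between the two structures.

    Independent of any superposition, since only internal distances are used.
    """
    return distance_matrix(coords_perturbed) - distance_matrix(coords_ref)

def invariant_pairs(ddm, tol=0.5):
    """Boolean mask of atom pairs whose distance changed by at most tol (Å)."""
    return np.abs(ddm) <= tol
```

Blocks of mutually invariant atoms in the resulting mask correspond to candidate rigid fragments, which the method described above would then score and rank.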
difference distance matrix; normal-mode analysis
The explicit polarization (X-Pol) method is a fragment-based quantum mechanical model, in which a macromolecular system in solution is partitioned into monomer fragments. The present study extends the original X-Pol method, where all fragments are treated using the same electronic structure theory, to a multilevel representation, called multilevel X-Pol, in which different electronic structure methods are used to describe different fragments. The multilevel X-Pol method has been implemented in Gaussian 09. A key ingredient that is used to couple interfragment electrostatic interactions at different levels of theory is the use of the response density for the post-self-consistent-field energy (the response density is also called the generalized density). The method is useful for treating fragments in a small region of the system, such as the solute molecules or the substrate and amino acids in the active site of an enzyme, with a high-level theory, and the fragments in the rest of the system with a lower-level and computationally more efficient method. The method is illustrated here by applications to hydrogen-bonding complexes in which one fragment is treated with the hybrid M06 density functional, Møller-Plesset perturbation theory, or coupled cluster theory, and the other fragments are treated by Hartree-Fock theory or the B3LYP or M06 hybrid density functionals.
Mixed fragment method; explicit polarization theory; fragment-based molecular orbital; block-localized density functional theory
SSEP is a comprehensive resource for accessing information related to the secondary structural elements present in the 25 and 90% non-redundant protein chains. The database contains 1771 protein chains from 1670 protein structures and 6182 protein chains from 5425 protein structures in the 25 and 90% non-redundant protein chains, respectively. The current version provides information about the α-helical segments and β-strand fragments of varying lengths. In addition, it contains information about 3₁₀-helices, β- and ν-turns and hairpin loops. The free graphics program RASMOL has been interfaced with the search engine to visualize the three-dimensional structures of the user-queried secondary structural fragment. The database is updated regularly and is available through the Bioinformatics web server at http://cluster.physics.iisc.ernet.in/ssep/ or http://18.104.22.168/ssep/.
Rotamer libraries are a valuable tool for protein structure determination, modeling and design. Site-directed tryptophan fluorescence (SDTF) was used in combination with the rotamer model for the fluorescence intensity decays to solve α-helical conformations of proteins in solution. Single Trp mutations located in an α-helical segment of human tear lipocalin were explored for structure assignment. Along with fluorescence λmax values, the rotamer model assignment of fluorescence lifetimes fits the backbone conformation. Typically Trp fluorescence in proteins shows three lifetimes. However, for the α-helix, two lifetimes assigned to t and g− rotamers were satisfactory to describe Trp fluorescence intensity decays. The g+ rotamer is not feasible in the α-helix due to steric restriction. Trp rotamer distributions obtained by fluorescence were compared with the rotamer library derived from X-ray crystallography data of proteins. The Trp rotamer distributions vary for solvent exposed and buried (tertiary interaction) sites. A new strategy using the rotamer distribution with SDTF (RD-SDTF) removes the limitation of regular SDTF and other labeling techniques, in which site specific differences, e.g. accessibility, are presumed. The RD-SDTF technique does not rely on environmental differences of side chains and is able to detect α-helical structure where all side chains are exposed to solvent. Potentially this technique is applicable to various proteins including membrane proteins, which are rich in α-helix motif.
LCN1; fluorescence lifetime; side chain conformations; protein structure
The structure of human protein HSPC034 has been determined by both solution NMR spectroscopy and X-ray crystallography. Refinement of the NMR structure ensemble, using a Rosetta protocol in the absence of NMR restraints, resulted in significant improvements not only in structure quality, but also in molecular replacement (MR) performance with the raw X-ray diffraction data using MOLREP and Phaser. This method has recently been shown to be generally applicable with improved MR performance demonstrated for eight NMR structures refined using Rosetta.1 Additionally, NMR structures of HSPC034 calculated by standard methods that include NMR restraints have improvements in the RMSD to the crystal structure and MR performance in the order DYANA, CYANA, XPLOR-NIH, and CNS with explicit water refinement (CNSw). Further Rosetta refinement of the CNSw structures, perhaps due to more thorough conformational sampling and/or a superior force field, was capable of finding alternative low-energy protein conformations that were equally consistent with the NMR data according to the RPF scores. Upon further examination, the additional MR-performance shortfall for NMR refined structures as compared to the X-ray structure MR performance was attributed, in part, to crystal-packing effects, real structural differences, and inferior hydrogen bonding in the NMR structures. A good correlation between a decrease in the number of buried unsatisfied hydrogen-bond donors and improved MR performance demonstrates the importance of hydrogen-bond terms in the force field for improving NMR structures. The superior hydrogen-bond network in Rosetta-refined structures demonstrates that correct identification of hydrogen bonds should be a critical goal of NMR structure refinement. Inclusion of non-bivalent hydrogen bonds identified from Rosetta structures as additional restraints in the structure calculation results in NMR structures with improved MR performance.
NMR; X-ray; HSPC034; PP25; C1orf41; Northeast Structural Genomics Consortium; structural genomics; comparison of NMR and X-ray structures; Rosetta; NMR force field refinement; molecular replacement; hydrogen bonding; X-ray crystallography; refinement methods