Nuclear magnetic resonance (NMR) spectroscopy is a primary tool to perform structural studies of proteins in physiologically-relevant solution conditions. Restraints on distances between pairs of nuclei in the protein, derived from the nuclear Overhauser effect (NOE), provide information about the structure of the protein in its folded state. NMR studies of symmetric protein homo-oligomers present a unique challenge. Using X-filtered NOESY experiments, it is possible to determine whether an NOE restrains a pair of protons across different subunits or within a single subunit, but current experimental techniques are unable to determine in which subunits the restrained protons lie. Consequently, it is difficult to assign NOEs to particular pairs of subunits with certainty, thus hindering the structural analysis of the oligomeric state. Computational approaches are needed to address this subunit ambiguity, but traditional solutions often rely on stochastic search coupled with simulated annealing and simplified molecular dynamics simulations, which have many tunable parameters that must be chosen carefully and can also fail to report structures consistent with the experimental restraints. In addition, these traditional approaches rarely provide guarantees on running time or solution quality. We reduce the structure determination of homo-oligomers with cyclic symmetry to computing geometric arrangements of unions of annuli in a plane. Our algorithm, DISCO, runs in expected O(n^2) time, where n is the number of distance restraints, potentially assigned ambiguously. DISCO is guaranteed to report the exact set of oligomer structures consistent with the distance restraints and also with orientational restraints from residual dipolar couplings (RDCs). We demonstrate our method using two symmetric protein complexes: the trimeric E. coli diacylglycerol kinase (DAGK) and a dimeric mutant of the immunoglobulin-binding domain B1 of streptococcal protein G (GB1). In both cases, DISCO computes oligomer structures with high precision and also finds distance restraints that are either mutually inconsistent or inconsistent with the RDCs. The entire DISCO protocol has been completely automated in a software package that is freely available and open-source at www.cs.duke.edu/donaldlab/software.php.
algorithms; computational molecular biology; protein structure
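The geometric reduction in the abstract above can be pictured with a small membership test. The sketch below is only an illustration, not the DISCO algorithm, which computes the exact arrangement of the annuli instead of testing individual points. It assumes that each point in the plane is a candidate position for the symmetry axis (an assumption not spelled out in the abstract), that each distance restraint contributes one annulus per possible subunit assignment, and that a position is consistent when it lies in every restraint's union. The names `Annulus`, `in_union`, and `is_consistent` are hypothetical.

```python
from dataclasses import dataclass
from math import hypot
from typing import List


@dataclass(frozen=True)
class Annulus:
    """A planar ring: all points whose distance to (cx, cy) is in [r_inner, r_outer]."""
    cx: float
    cy: float
    r_inner: float
    r_outer: float

    def contains(self, x: float, y: float) -> bool:
        d = hypot(x - self.cx, y - self.cy)
        return self.r_inner <= d <= self.r_outer


def in_union(annuli: List[Annulus], x: float, y: float) -> bool:
    """An ambiguous restraint is satisfied if ANY of its candidate annuli holds."""
    return any(a.contains(x, y) for a in annuli)


def is_consistent(restraints: List[List[Annulus]], x: float, y: float) -> bool:
    """A position is consistent only if EVERY restraint (union of annuli) holds."""
    return all(in_union(union, x, y) for union in restraints)


# Hypothetical data: one ambiguous restraint (two annuli), one unambiguous.
restraints = [
    [Annulus(0.0, 0.0, 1.0, 2.0), Annulus(5.0, 0.0, 1.0, 2.0)],
    [Annulus(1.5, 0.0, 0.0, 1.0)],
]
```

Testing points one at a time gives no completeness guarantee; the exact-arrangement computation described in the abstract is what allows the full consistent set to be reported.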
We explore, using the Crh protein dimer as a model, how information from solution NMR, solid-state NMR and X-ray crystallography can be combined using structural bioinformatics methods, in order to get insights into the transition from solution to crystal. Using solid-state NMR chemical shifts, we filtered intra-monomer NMR distance restraints in order to keep only the restraints valid in the solid state. These filtered restraints were added to solid-state NMR restraints recorded on the dimer state to sample the conformational landscape explored during the oligomerization process. The use of non-crystallographic symmetries then permitted the extraction of converged conformers subsets. Ensembles of NMR and crystallographic conformers calculated independently display similar variability in monomer orientation, which supports a funnel shape for the conformational space explored during the solution-crystal transition. Insights into alternative conformations possibly sampled during oligomerization were obtained by analyzing the relative orientation of the two monomers, according to the restraint precision. Molecular dynamics simulations of Crh confirmed the tendencies observed in NMR conformers, as a paradoxical increase of the distance between the two β1a strands, when the structure gets closer to the crystallographic structure, and the role of water bridges in this context.
structural bioinformatics; NMR structure calculation; ARIA; non-crystallographic symmetry; crystallographic ensemble refinement; molecular dynamics simulation
The two principal components of biological membranes, the lipid bilayer and the proteins integrated within it, have coevolved for specific functions that mediate the interactions of cells with their environment. Molecular structures can provide very significant insights about protein function. In the case of membrane proteins, the physical and chemical properties of lipids and proteins are highly interdependent; therefore structure determination should include the membrane environment. Considering the membrane alongside the protein eliminates the possibility that crystal contacts or detergent molecules could distort protein structure, dynamics, and function and enables ligand binding studies to be performed in a natural setting.
Solid-state NMR spectroscopy is compatible with three-dimensional structure determination of membrane proteins in phospholipid bilayer membranes under physiological conditions and has played an important role in elucidating the physical and chemical properties of biological membranes, providing key information about the structure and dynamics of the phospholipid components. Recently, developments in the recombinant expression of membrane proteins, sample preparation, pulse sequences for high-resolution spectroscopy, radio frequency probes, high-field magnets, and computational methods have enabled a number of membrane protein structures to be determined in lipid bilayer membranes.
In this Account, we illustrate solid-state NMR methods with examples from two bacterial outer membrane proteins (OmpX and Ail) that form integral membrane β-barrels. The ability to measure orientation-dependent frequencies in the solid-state NMR spectra of membrane-embedded proteins provides the foundation for a powerful approach to structure determination based primarily on orientation restraints. Orientation restraints are particularly useful for NMR structural studies of membrane proteins because they provide information about both three-dimensional structure and the orientation of the protein within the membrane. When combined with dihedral angle restraints derived from analysis of isotropic chemical shifts, molecular fragment replacement, and de novo structure prediction, orientation restraints can yield high-quality three-dimensional structures with few or no distance restraints. Using complementary solid-state NMR methods based on oriented sample (OS) and magic angle spinning (MAS) approaches, one can resolve and assign multiple peaks through the use of 15N/13C labeled samples and measure precise restraints to determine structures.
We combine molecular dynamics simulations and new high-field NMR experiments to describe the solution structure of the Aβ21–30 peptide fragment that may be relevant for understanding structural mechanisms related to Alzheimer’s disease. By using two different empirical force-field combinations, we provide predictions of the three-bond scalar coupling constants (³J(HN,Hα)), chemical-shift values, 13C relaxation parameters, and rotating-frame nuclear Overhauser effect spectroscopy (ROESY) crosspeaks that can then be compared directly to the same observables measured in the corresponding NMR experiment of Aβ21–30. We find robust prediction of the 13C relaxation parameters and medium-range ROESY crosspeaks by using new generation TIP4P-Ew water and Amber ff99SB protein force fields, in which the NMR validates that the simulation yields both a structurally and dynamically correct ensemble over the entire Aβ21–30 peptide. Analysis of the simulated ensemble shows that not all medium-range ROE restraints are satisfied simultaneously and demonstrates the structural diversity of the Aβ21–30 conformations more completely than when determined from the experimental medium-range ROE restraints alone. We find that the structural ensemble of the Aβ21–30 peptide involves a majority population (~60%) of unstructured conformers, lacking any secondary structure or persistent hydrogen-bonding networks. However, the remaining minority population contains a substantial percentage of conformers with a β-turn centered at Val24 and Gly25, as well as evidence of the Asp23 to Lys28 salt bridge important to the fibril structure. This study sets the stage for robust theoretical work on Aβ1–40 and Aβ1–42, for which collection of detailed NMR data on the monomer will be more challenging because of aggregation and fibril formation on experimental timescales at physiological conditions. In addition, we believe that the interplay of modern molecular simulation and high-quality NMR experiments has reached a fruitful stage for characterizing structural ensembles of disordered peptides and proteins in general.
Several all-helical single-domain proteins have been shown to fold rapidly (microsecond time scale) to a compact intermediate state and subsequently rearrange more slowly to the native conformation. An understanding of this process has been hindered by difficulties in experimental studies of intermediates in cases where they are both low-populated and only transiently formed. One such example is provided by the on-pathway folding intermediate of the small four-helix bundle FF domain from HYPA/FBP11 that is populated at several percent with a millisecond lifetime at room temperature. Here we have studied the L24A mutant that has been shown previously to form nonnative interactions in the folding transition state. A suite of Carr–Purcell–Meiboom–Gill relaxation dispersion NMR experiments has been used to measure backbone chemical shifts and amide bond vector orientations of the invisible folding intermediate, which form the input restraints in calculations of atomic resolution models of its structure. Despite the fact that the intermediate structure has many features that are similar to that of the native state, a set of nonnative contacts is observed that is even more extensive than noted previously for the wild-type (WT) folding intermediate. Such nonnative interactions, which must be broken prior to adoption of the native conformation, explain why the transition from the intermediate state to the native conformer (millisecond time scale) is significantly slower than from the unfolded ensemble to the intermediate and why the L24A mutant folds more slowly than the WT.
AssignFit is a computer program developed within the XPLOR-NIH package for the assignment of dipolar coupling (DC) and chemical shift anisotropy (CSA) restraints derived from the solid-state NMR spectra of protein samples with uniaxial order. The method is based on minimizing the difference between experimentally observed solid-state NMR spectra and the frequencies back calculated from a structural model. Starting with a structural model and a set of DC and CSA restraints grouped only by amino acid type, as would be obtained by selective isotopic labeling, AssignFit generates all of the possible assignment permutations and calculates the corresponding atomic coordinates oriented in the alignment frame, together with the associated set of NMR frequencies, which are then compared with the experimental data for best fit. Incorporation of AssignFit in a simulated annealing refinement cycle provides an approach for simultaneous assignment and structure refinement (SASR) of proteins from solid-state NMR orientation restraints. The methods are demonstrated with data from two integral membrane proteins, one α-helical and one β-barrel, embedded in phospholipid bilayer membranes.
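The permutation search described above can be sketched in a few lines. This is a minimal illustration under simplifying assumptions, not the AssignFit implementation or any XPLOR-NIH API: each residue is given a single back-calculated frequency (the actual method compares dipolar coupling and chemical shift anisotropy frequencies together), the fit is scored by RMSD, and every permutation within one amino acid type is enumerated exhaustively. The function name `best_assignment`, the residue labels, and the numbers are hypothetical.

```python
from itertools import permutations
from math import sqrt
from typing import Dict, Sequence, Tuple


def best_assignment(
    predicted: Dict[str, float], observed: Sequence[float]
) -> Tuple[Dict[str, float], float]:
    """Assign observed frequencies to residues of one amino acid type.

    predicted: back-calculated frequency per residue from the structural model.
    observed:  unassigned experimental frequencies for the same residue type.
    Returns the assignment with the lowest RMSD and that RMSD.
    """
    residues = list(predicted)
    if len(residues) != len(observed):
        raise ValueError("need exactly one observed frequency per residue")

    best: Dict[str, float] = {}
    best_rmsd = float("inf")
    # Exhaustive search: n! candidate assignments for n residues of this type.
    for perm in permutations(observed):
        sq_err = sum((predicted[r] - f) ** 2 for r, f in zip(residues, perm))
        rmsd = sqrt(sq_err / len(residues))
        if rmsd < best_rmsd:
            best_rmsd = rmsd
            best = dict(zip(residues, perm))
    return best, best_rmsd


# Hypothetical leucine frequencies (arbitrary units).
predicted = {"L12": 3.1, "L25": -1.9, "L40": 7.8}
observed = [8.0, 3.0, -2.0]
assignment, rmsd = best_assignment(predicted, observed)
```

Because the number of permutations grows factorially, grouping the restraints by amino acid type, as selective isotopic labeling would allow, is what keeps an exhaustive search of this kind tractable.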
Restrained molecular dynamics simulations are a robust, though perhaps underused, tool for the end-stage refinement of biomolecular structures. We demonstrate their utility—using modern simulation protocols, optimized force fields, and inclusion of explicit solvent and mobile counterions—by re-investigating the solution structures of two RNA hairpins that had previously been refined using conventional techniques. The structures, both domain 5 group II intron ribozymes from yeast ai5γ and Pylaiella littoralis, share a nearly identical primary sequence yet the published 3D structures appear quite different. Relatively long restrained MD simulations using the original NMR restraint data identified the presence of a small set of violated distance restraints in one structure and a possibly incorrect trapped bulge nucleotide conformation in the other structure. The removal of problematic distance restraints and the addition of a heating step yielded representative ensembles with very similar 3D structures and much lower pairwise RMSD values. Analysis of ion density during the restrained simulations helped to explain chemical shift perturbation data published previously. These results suggest that restrained MD simulations, with proper caution, can be used to “update” older structures or aid in the refinement of new structures that lack sufficient experimental data to produce a high quality result. Notable cautions include the need for sufficient sampling, awareness of potential force field bias (such as small angle deviations with the current AMBER force fields), and a proper balance between the various restraint weights.
Electronic supplementary material
The online version of this article (doi:10.1007/s10858-012-9642-5) contains supplementary material, which is available to authorized users.
RNA structure; Molecular dynamics; Residual dipolar coupling restraints; Bulge structure; Force fields; Ion binding
GpW is a 68-residue protein from bacteriophage λ that participates in virus head morphogenesis. Previous NMR studies revealed a novel α+β fold for this protein. Recent experiments have shown that gpW folds in microseconds by crossing a marginal free energy barrier (i.e., downhill folding). These features make gpW a highly desirable target for further experimental and computational folding studies. As a step in that direction, we have re-determined the high-resolution structure of gpW by multidimensional NMR on a construct that eliminates the purification tags and unstructured C-terminal tail present in the prior study. In contrast to the previous work, we have obtained a full manual assignment and calculated the structure using only unambiguous distance restraints. This new structure confirms the α+β topology, but reveals important differences in tertiary packing. Namely, the two α-helices are rotated along their main axis to form a leucine zipper. The β-hairpin is orthogonal to the helical interface rather than parallel, displaying most tertiary contacts through strand 1. There also are differences in secondary structure: longer and less curved helices and a hairpin that now shows the typical right-hand twist. Molecular dynamics simulations starting from both gpW structures, and calculations with CS-Rosetta, all converge to our gpW structure. This confirms that the original structure has strange tertiary packing and strained secondary structure. A comparison of NMR datasets suggests that the problems were mainly caused by incomplete chemical shift assignments, mistakes in NOE assignment and the inclusion of ambiguous distance restraints during the automated procedure used in the original study. The new gpW corrects these problems, providing the appropriate structural reference for future work. Furthermore, our results are a cautionary tale against the inclusion of ambiguous experimental information in the determination of protein structures.
Residual dipolar couplings (RDCs) give orientation-dependent NMR restraints that improve the resolution of NMR conformational ensembles and define the relative orientation of multidomain proteins and protein complexes. The interpretation of RDCs is complicated by protein dynamics and by the intrinsic degeneracy of solutions, which leads to ill-defined orientations of the structural domains (ghost orientations). Here, we illustrate how paramagnetic-based restraints can remove the orientational ambiguity of multidomain membrane proteins solubilized in detergent micelles. We tested this approach for the monomeric form of phospholamban (PLN), a 52-residue membrane protein, which is composed of two helical domains connected by a relatively flexible loop. We show that the combination of classical solution NMR restraints (NOEs and dihedral angles) with RDCs and paramagnetic relaxation enhancements (PREs) resolves topological ambiguities, improving the convergence of the PLN structural ensemble and giving the depth of insertion of the protein within the micelle. This combined approach will be necessary for membrane proteins, whose three-dimensional structure is strongly influenced by interactions with the membrane-mimicking environment rather than compact tertiary folds common in soluble proteins.
Structure Determination; NMR; Membrane Protein Topology; Paramagnetic Relaxation Enhancement; Residual Dipolar Couplings; Detergent Micelles; Phospholamban
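The degeneracy mentioned in the abstract above can be read off the standard expression for an RDC in the principal frame of the alignment tensor. The form below is the commonly used textbook expression, not one quoted from this abstract. The symbols are assumptions of this note: D_a is the axial component of the tensor, R is its rhombicity, and (θ, φ) are the polar angles of the internuclear vector.

```latex
D(\theta,\phi) = D_a \left[ \left( 3\cos^2\theta - 1 \right)
               + \tfrac{3}{2}\, R \,\sin^2\theta \,\cos 2\phi \right]
% Invariant under: \theta \to \pi - \theta, \quad \phi \to -\phi, \quad \phi \to \phi + \pi
```

Since the expression is unchanged by 180° rotations about any principal axis of the tensor, RDCs measured in a single alignment medium fit several symmetry-related domain orientations equally well. An independent restraint type, such as the distance-dependent PREs used here, is then needed to tell them apart.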
The CS-RDC-NOE Rosetta program was used to generate the solution structure of a 27 kDa fragment of the E. coli BamC protein from a limited set of NMR data. The BamC protein is a component of the essential five-protein β-barrel assembly machinery in E. coli. The first 100 residues in BamC were disordered in solution. The Rosetta calculations showed that BamC101-344 forms two well-defined domains connected by an ~18 residue linker, where the relative orientation of the domains was not defined. Both domains adopt a helix-grip fold, previously observed in the Bet v I superfamily. 15N relaxation data indicated a high degree of conformational flexibility for the linker connecting the N- and C-terminal domains in BamC. The results here show that CS-RDC-NOE Rosetta is robust and has a high tolerance for misassigned NOE restraints, which greatly simplifies NMR structure determinations.
β-barrel assembly machinery; helix-grip motif; NMR structure; Rosetta; outer membrane protein
All-atom simulations are carried out on ErbB1/B2 and EphA1 transmembrane helix dimers in lipid bilayers starting from their solution/DMPC bicelle NMR structures. Over the course of microsecond trajectories, the structures remain in close proximity to the initial configuration and satisfy the great majority of experimental tertiary contact restraints. These results further validate CHARMM protein/lipid force fields and simulation protocols on Anton. Separately, dimer conformations are generated using replica exchange in conjunction with an implicit solvent and lipid representation. The implicit model requires further improvement, and this study investigates whether lengthy all-atom molecular dynamics simulations can alleviate the shortcomings of the initial conditions. The simulations correct many of the deficiencies. For example excessive helix twisting is eliminated over a period of hundreds of nanoseconds. The helix tilt, crossing angles and dimer contacts approximate those of the NMR derived structure, although the detailed contact surface remains off-set for one of two helices in both systems. Hence, even microsecond simulations are not long enough for extensive helix rotations. The alternate structures can be rationalized with reference to interaction motifs and may represent still sought after receptor states that are important in ErbB1/B2 and EphA1 signaling.
structure prediction; implicit solvent and lipid; Generalized Born model; replica exchange; receptor tyrosine kinases; solution NMR
15N-R2/R1 relaxation data contain information on molecular shape and size as well as on bond vector orientations relative to the diffusion tensor. Since the diffusion tensor can be directly calculated from the molecular coordinates, direct inclusion of 15N-R2/R1 restraints in NMR structure calculations without any a priori assumptions is possible. Here we show that 15N-R2/R1 restraints are particularly valuable when only sparse distance restraints are available. Using three examples of proteins of varying size, namely GB3 (56 residues), ubiquitin (76 residues) and the N-terminal domain of enzyme I (EIN, 249 residues), we show that incorporation of 15N-R2/R1 restraints results in large and significant increases in coordinate accuracy that can make the difference between being able or not being able to determine an approximate global fold. For GB3 and ubiquitin, good coordinate accuracy is obtained using only backbone hydrogen bond restraints supplemented by 15N-R2/R1 relaxation restraints. For EIN, the global fold can be determined using sparse NOE distance restraints, involving only NH and methyl groups, in conjunction with 15N-R2/R1 restraints. These results are of practical significance in the study of larger and more complex systems where increasing spectral complexity and chemical shift degeneracies reduce the number of unambiguous NOE assignments that can be readily obtained, resulting in progressively reduced NOE coverage as the size of the protein increases.
Small globular proteins and peptides commonly exhibit two-state folding kinetics in which the rate limiting step of folding is the surmounting of a single free energy barrier at the transition state (TS) separating the folded and the unfolded states. An intriguing question is whether the polypeptide chain reaches, and leaves, the TS by completely random fluctuations, or whether there is a directed, stepwise process. Here, the folding TS of a 15-residue β-hairpin peptide, Peptide 1, is characterized using independent 2.5 μs-long unbiased atomistic molecular dynamics (MD) simulations (a total of 15 μs). The trajectories were started from fully unfolded structures. Multiple (spontaneous) folding events to the NMR-derived conformation are observed, allowing both structural and dynamical characterization of the folding TS. A common loop-like topology is observed in all the TS structures with native end-to-end and turn contacts, while the central segments of the strands are not in contact. Non-native sidechain contacts are present in the TS between the only tryptophan (W11) and the turn region (P7-G9). Prior to the TS the turn is found to be already locked by the W11 sidechain, while the ends are apart. Once the ends have also come into contact, the TS is reached. Finally, along the reactive folding paths the cooperative loss of the W11 non-native contacts and the formation of the central inter-strand native contacts lead to the peptide rapidly proceeding from the TS to the native state. The present results indicate a directed stepwise process to folding the peptide.
The folding dynamics of many small proteins and peptides investigated recently are described in terms of a simple two-state model in which only two populations exist (folded and unfolded), separated by a single free energy barrier with only one kinetically important transition state (TS). However, dynamical characterization of the folding TS is challenging. We have used independent unbiased atomistic molecular dynamics simulations with clear folding-unfolding transitions to characterize structural and dynamical features of the transition state ensemble of Peptide 1. A common loop-like topology is observed in all TS structures extracted from multiple simulations. The trajectories were used to examine the mechanism by which the TS is reached and subsequent events in folding pathways. The folding TS is reached and crossed in a directed stagewise process rather than through random fluctuations. Specific structures are formed before, during, and after the transition state, indicating a clear structured folding pathway.
Partially folded proteins, characterized as exhibiting secondary structure elements with loose or absent tertiary contacts, represent important intermediates in both physiological protein folding and pathological protein misfolding. To aid in the characterization of the structural state(s) of such proteins, a novel structure calculation scheme is presented that combines structural restraints derived from pulsed EPR and NMR spectroscopy. The methodology is established for the protein α-synuclein (αS), which exhibits characteristics of a partially folded protein when bound to a micelle of the detergent sodium lauroyl sarcosinate (SLAS). By combining 18 EPR-derived interelectron spin label distance distributions with NMR-based secondary structure definitions and bond vector restraints, interelectron distances were correlated and a set of theoretical ensemble basis populations was calculated. A minimal set of basis structures, representing the partially folded state of SLAS-bound αS, was subsequently derived by back-calculating correlated distance distributions. A surprising variety of well-defined protein-micelle interactions was thus revealed in which the micelle is engulfed by two differently arranged anti-parallel αS helices. The methodology further provided the population ratios between dominant ensemble structural states, whereas limitation in obtainable structural resolution arose from spin label flexibility and residual uncertainties in secondary structure definitions. To advance the understanding of protein-micelle interactions, the present study concludes by showing that, in marked contrast to secondary structure stability, helix dynamics of SLAS-bound αS correlate with the degree of protein-induced departures from free micelle dimensions.
A new method to develop low-energy folding routes for proteins is presented. The novel aspect of the proposed approach is the synergistic use of optimal control theory with Molecular Dynamics (MD). In the first step of the method, optimal control theory is employed to compute the force field and the optimal folding trajectory for the atoms of a Coarse-Grained (CG) protein model. The solution of this CG optimization provides a harmonic approximation of the true potential energy surface around the native state. In the next step, CG optimization guides the MD simulation by specifying the optimal target positions for the atoms. In turn, MD simulation provides an all-atom conformation whose positions match closely the reference target positions determined by CG optimization. This is accomplished by Targeted Molecular Dynamics (TMD), which uses a bias potential or harmonic restraint in addition to the usual MD potential. Folding is a dynamical process and as such residues make different contacts during the course of folding. Therefore, CG optimization has to be reinitialized and repeated over time to accommodate these important changes. At each sampled folding time, the active contacts among the residues are recalculated based on the all-atom conformation obtained from MD. Using the new set of contacts, the CG potential is updated and the CG optimal trajectory for the atoms is recomputed. This is followed by MD. Implementation of this repetitive CG optimization - MD simulation cycle generates the folding trajectory. Simulations on a model protein Villin demonstrate the utility of the method. Since the method is founded on the general tools of optimal control theory and MD without any restrictions, it is widely applicable to other systems. It can be easily implemented with available MD software packages.
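The restraint step in the abstract above can be illustrated with a simple per-atom harmonic bias toward target positions. This is a minimal sketch under stated assumptions, not the authors' protocol: production TMD implementations typically restrain an RMSD to the target, not each atom independently, and the force constant, units, and the name `harmonic_bias` are hypothetical. In the cycle described above, the targets would be the positions supplied by the CG optimization.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def harmonic_bias(
    positions: Sequence[Vec3], targets: Sequence[Vec3], k: float
) -> Tuple[float, List[Vec3]]:
    """Harmonic bias E = (k/2) * sum_i |x_i - x_i_target|^2.

    Returns the bias energy and the per-atom bias forces F_i = -k (x_i - x_i_target),
    which would be added to the forces from the usual MD potential.
    """
    if len(positions) != len(targets):
        raise ValueError("positions and targets must have the same length")

    energy = 0.0
    forces: List[Vec3] = []
    for p, t in zip(positions, targets):
        dx, dy, dz = p[0] - t[0], p[1] - t[1], p[2] - t[2]
        energy += 0.5 * k * (dx * dx + dy * dy + dz * dz)
        forces.append((-k * dx, -k * dy, -k * dz))
    return energy, forces


# Hypothetical two-atom example with targets at the origin.
positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
targets = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
energy, forces = harmonic_bias(positions, targets, k=2.0)
```

The bias forces always point from the current position toward the target, so updating the targets at each sampled folding time is what moves the all-atom conformation along the CG trajectory.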
The structure of the 1,N2-εdG adduct, arising from the reaction of vinyl chloride with dG, was determined in the oligonucleotide duplex 5′-d(CGCATXGAATCC)-3′•5′-d(GGATTCCATGCG)-3′ (X=1,N2-εdG) at pH 8.6 using high resolution NMR spectroscopy. The exocyclic lesion prevented Watson-Crick base-pairing capability at the adduct site and resulted in an ~17 °C decrease in the Tm of the oligodeoxynucleotide duplex. At neutral pH, conformational exchange resulted in spectral line broadening near the adducted site, and it was not possible to determine the structure. However, at pH 8.6, it was possible to obtain well-resolved 1H NMR spectra. This enabled a total of 385 NOE based distance restraints to be obtained, consisting of 245 intra- and 140 inter-nucleotide distances. The 31P NMR spectra exhibited two downfield-shifted resonances, suggesting a localized perturbation of the DNA backbone. The two downfield 31P resonances were assigned to G7 and C19. The solution structure was refined by molecular dynamics calculations restrained by NMR derived distance and dihedral angle restraints, using a simulated annealing protocol. The generalized Born approximation was used to simulate solvent. The emergent structures indicated that the 1,N2-εdG-induced structural perturbation was localized at the X6•C19 base pair, and its 5′-neighbor T5•A20. Both 1,N2-εdG and the complementary dC adopted the anti conformation about the glycosyl bonds. The 1,N2-εdG adduct was inserted into the duplex but was shifted towards the minor groove as compared to dG in a normal Watson-Crick C•G base pair. The complementary cytosine was displaced toward the major groove. The 5′-neighbor T5•A20 base pair was destabilized with respect to Watson-Crick base pairing. The refined structure predicted a bend in the helical axis associated with the adduct site.
This paper describes an approach for making use of the components of the experimentally determined rotational diffusion tensor derived from NMR relaxation measurements in macromolecular structure determination. The parameters of the rotational diffusion tensor describe the shape and size of the macromolecule or macromolecular complex and are therefore complementary to traditional NMR restraints. The structural information contained in the rotational diffusion tensor is not dissimilar to that present in the small angle region of the solution X-ray scattering profiles. We demonstrate the utility of rotational diffusion tensor restraints for protein structure refinement using the N-terminal domain of enzyme I (EIN) as an example and validate the results by solution small angle X-ray scattering. We also show how rotational diffusion tensor restraints can be used for docking complexes using the dimeric HIV-1 protease and the EIN-HPr complexes as examples. In the former case, the rotational diffusion tensor restraints are sufficient in their own right to determine the position of one subunit relative to another. In the latter case, rotational diffusion tensor restraints complemented by highly ambiguous distance restraints derived from chemical shift perturbation mapping and a hydrophobic contact potential are sufficient to correctly dock EIN to HPr. In each case, the cluster containing the lowest energy structure corresponds to the correct solution.
Recent structural studies of uniformly 15N, 13C-labeled proteins by solid state nuclear magnetic resonance (NMR) rely principally on two sources of structural restraints: (i) restraints on backbone conformation from isotropic 15N and 13C chemical shifts, based on empirical correlations between chemical shifts and backbone torsion angles; (ii) restraints on inter-residue proximities from qualitative measurements of internuclear dipole–dipole couplings, detected as the presence or absence of inter-residue crosspeaks in multidimensional spectra. We show that site-specific dipole–dipole couplings among 15N-labeled backbone amide sites and among 13C-labeled backbone carbonyl sites can be measured quantitatively in uniformly-labeled proteins, using dipolar recoupling techniques that we call 15N-BARE and 13C-BARE (BAckbone REcoupling), and that the resulting data represent a new source of restraints on backbone conformation. 15N-BARE and 13C-BARE data can be incorporated into structural modeling calculations as potential energy surfaces, which are derived from comparisons between experimental 15N and 13C signal decay curves, extracted from crosspeak intensities in series of two-dimensional spectra, with numerical simulations of the 15N-BARE and 13C-BARE measurements. We demonstrate this approach through experiments on microcrystalline, uniformly 15N, 13C-labeled protein GB1. Results for GB1 show that 15N-BARE and 13C-BARE restraints are complementary to restraints from chemical shifts and inter-residue crosspeaks, improving both the precision and the accuracy of calculated structures.
Magic-angle spinning; Protein structure; Pulse sequences
Chimeric hybrids derived from the rubredoxins of Pyrococcus furiosus (Pf) and Clostridium pasteurianum (Cp) provide a robust system for the characterization of protein conformational stability and dynamics in a differential mode. Interchange of the seven nonconserved residues of the metal binding site between the Pf and Cp rubredoxins yields a complementary pair of hybrids, for which the sum of the thermodynamic stabilities is equal to the sum for the parental proteins. Furthermore, the increase in amide hydrogen exchange rates for the hyperthermophile-derived metal binding site hybrid is faithfully mirrored by a corresponding decrease for the complementary hybrid that is derived from the less thermostable rubredoxin, indicating a degree of additivity in the conformational fluctuations that underlie these exchange reactions.
Initial NMR studies indicated that the structures of the two complementary hybrids closely resemble "cut-and-paste" models derived from the parental Pf and Cp rubredoxins. This protein system offers a robust opportunity to characterize differences in solution structure, permitting the quantitative NMR chemical shift and NOE peak intensity data to be analyzed without recourse to the conventional conversion of experimental NOE peak intensities into distance restraints. The intensities for 1573 of the 1652 well-resolved NOE crosspeaks from the hybrid rubredoxins were statistically indistinguishable from the intensities of the corresponding parental crosspeaks, to within the baseplane noise level of these high sensitivity data sets. The differences in intensity for the remaining 79 NOE crosspeaks were directly ascribable to localized dynamical processes. Subsequent X-ray analysis of the metal binding site-swapped hybrids, to resolution limits of 0.79 Å and 1.04 Å, demonstrated that the backbone and sidechain heavy atoms in the NMR-derived structures lie within the range of structural variability exhibited among the individual molecules in the crystallographic asymmetric unit (~0.3 Å), indicating consistency with the "cut-and-paste" structuring of the hybrid rubredoxins in both crystal and solution.
Each of the significant energetic interactions in the metal binding site-swapped hybrids appears to exhibit a 1-to-1 correspondence with the interactions present in the corresponding parental rubredoxin structure, thus providing a structural basis for the observed additivity in conformational stability and dynamics. The congruence of these X-ray and NMR experimental data offers additional support for the interpretation that the conventional treatment of NOE distance restraints contributes substantially to the systematic differences that are commonly reported between NMR- and X-ray-derived protein structures.
The kinase-inducible domain (KID), a transcriptional activator, can stimulate target gene expression in signal transduction by associating with the KID-interacting domain (KIX). NMR spectra suggest that apo-KID is an unstructured protein. After post-translational modification by phosphorylation, KID undergoes a transition from a disordered to a well-folded protein upon binding to KIX. However, the mechanism of folding coupled to binding is poorly understood.
To gain insight into the mechanism, we performed ten explicit-solvent molecular dynamics (MD) trajectories for both bound and apo phosphorylated KID (pKID). Ten MD simulations are sufficient to capture the average properties of protein folding and unfolding.
Room-temperature MD simulations suggest that pKID becomes more rigid and stable upon KIX binding. Kinetic analysis of high-temperature MD simulations shows that bound pKID and apo-pKID unfold via a three-state and a two-state process, respectively. Both kinetics and free energy landscape analyses indicate that bound pKID folds in the order of KIX access, initiation of pKID tertiary folding, folding of helix αB, folding of helix αA, completion of pKID tertiary folding, and finalization of pKID-KIX binding. Our data show that the folding pathway of apo-pKID differs from that of the bound state: the order in which helices αA and αB fold is swapped. We also show that Asn139, Asp140 and Leu141, which have large Φ-values, are key residues in the folding of bound pKID. Our results are in good agreement with NMR experimental observations and provide significant insight into the general mechanisms of binding-induced protein folding and other conformational adjustments in post-translational modification.
Trp-cage is a designed 20-residue polypeptide that, in spite of its size, shares several features with larger globular proteins. Although the system has been intensively investigated experimentally and theoretically, its folding mechanism is not yet fully understood. Indeed, some experiments suggest a two-state behavior, while others point to the presence of intermediates. In this work we show that the results of a bias-exchange metadynamics simulation can be used for constructing a detailed thermodynamic and kinetic model of the system. The model, although constructed from a biased simulation, has a quality similar to those extracted from the analysis of long unbiased molecular dynamics trajectories. This is demonstrated by a careful benchmark of the approach on a smaller system, the solvated Ace-Ala3-Nme peptide. For Trp-cage folding, the model predicts that the relaxation time of 3100 ns observed experimentally is due to the presence of a compact molten globule-like conformation. This state has an occupancy of only 3% at 300 K, but acts as a kinetic trap. In contrast, non-compact structures relax to the folded state on the sub-microsecond timescale. The model also predicts the presence of a state at an RMSD of 4.4 Å from the NMR structure in which the Trp strongly interacts with Pro12. This state can explain the abnormal temperature dependence of certain chemical shifts. The structures of the two most stable misfolded intermediates are in agreement with NMR experiments on the unfolded protein. Our work shows that, using biased molecular dynamics trajectories, it is possible to construct a model describing in detail the Trp-cage folding kinetics and thermodynamics in agreement with experimental data.
Understanding the mechanism by which proteins find their folded state is a holy grail of computational biology. Accurate all-atom simulations have the potential to describe such a process in great detail, but, unfortunately, folding of most proteins takes place on a time scale that is still not accessible to routine computer simulations. We introduce here an approach that allows for constructing an accurate kinetic and thermodynamic model of folding (or other complex biological processes) using trajectories in which the process under investigation is forced to happen in a short simulation time by an appropriate external bias. An important strength of this approach is the possibility of identifying and characterizing misfolded conformations that, in some proteins, are related to important diseases. We use this method to study the folding of Trp-cage, predicting the structure of the folded state and the presence of several intermediates. We find that, surprisingly, fully unstructured “unfolded” states relax towards the folded conformation rather quickly. The slowest relaxation time of the system is instead related to the equilibration between the folded state and another compact structure that acts as a kinetic trap. Thus, the experimental folding time would be determined primarily by this process.
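To make the kinetic-trap picture concrete, the sketch below treats the slow process as a simple two-state exchange between the folded state and the compact trap. This is a deliberate simplification introduced here for illustration, not the multi-state model built from the bias-exchange metadynamics data. The function name `two_state_rates` is hypothetical. In a two-state exchange the observed relaxation rate is the sum of the forward and backward rates, and the equilibrium population of the trap fixes their ratio.

```python
def two_state_rates(relaxation_time_ns, trap_population):
    """Split an observed relaxation rate into two microscopic rates.

    Assumes a two-state exchange folded <-> trap (an illustrative
    simplification), for which
        1 / tau = k_into_trap + k_out_of_trap
        p_trap  = k_into_trap / (k_into_trap + k_out_of_trap)
    Returns (k_into_trap, k_out_of_trap) in ns^-1.
    """
    if relaxation_time_ns <= 0 or not 0.0 < trap_population < 1.0:
        raise ValueError("need tau > 0 and 0 < population < 1")
    k_total = 1.0 / relaxation_time_ns
    k_into_trap = trap_population * k_total
    k_out_of_trap = (1.0 - trap_population) * k_total
    return k_into_trap, k_out_of_trap


if __name__ == "__main__":
    # Values quoted in the abstract: 3100 ns relaxation time, 3% occupancy.
    k_in, k_out = two_state_rates(3100.0, 0.03)
    print("mean escape time from trap (ns):", 1.0 / k_out)
```

Under this assumption a sparsely populated state can still dominate the slowest relaxation, because the observed time is set mainly by the slow escape from the trap rather than by how often the trap is visited.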
We present a high-resolution nuclear magnetic resonance (NMR) solution structure of a 14-mer RNA hairpin capped by a cUUCGg tetraloop. This short and very stable RNA is an important model system for the study of RNA structure and dynamics using NMR spectroscopy, molecular dynamics (MD) simulations and RNA force-field development. The extraordinarily high precision of the structure (root mean square deviation of 0.3 Å) was achieved by measuring and incorporating all currently accessible NMR parameters, including distances derived from nuclear Overhauser effect (NOE) intensities, torsion-angle-dependent homonuclear and heteronuclear scalar coupling constants, projection-angle-dependent cross-correlated relaxation rates and residual dipolar couplings. The structure calculations were performed with the program CNS using the ARIA setup and protocols. The structure quality was further improved by a final refinement in explicit water using OPLS force field parameters for non-bonded interactions and charges. In addition, the 2′-hydroxyl groups have been assigned and their conformation has been analyzed based on NOE contacts. The structure currently defines a benchmark for the precision and accuracy attainable in RNA structure determination by NMR spectroscopy. Here, we discuss the impact of various NMR restraints on structure quality and discuss in detail the dynamics of this system as previously determined.
We have developed the program PERMOL for semi-automated homology modeling of proteins. It is based on restrained molecular dynamics using a simulated annealing protocol in torsion angle space. The main restraints defining the optimal local geometry of the structure are weighted mean dihedral angles and their standard deviations, which are calculated with an algorithm described earlier by Döker et al. (1999, BBRC, 257, 348–350). The overall long-range contacts are established via a small number of distance restraints between atoms involved in hydrogen bonds and backbone atoms of conserved residues. Using the restraints generated by PERMOL, three-dimensional structures are obtained with standard molecular dynamics programs such as DYANA or CNS.
To test this modeling approach, we used it to predict the structure of the histidine-containing phosphocarrier protein HPr from E. coli and the structure of the human peroxisome proliferator activated receptor γ (Ppar γ). The divergence between the modeled HPr and the previously determined X-ray structure was comparable to the divergence between the X-ray structure and the published NMR structure. The modeled structure of Ppar γ was also very close to the previously solved X-ray structure, with an RMSD of 0.262 nm for the backbone atoms.
In summary, we present a new method for homology modeling capable of producing high-quality structure models. An advantage of the method is that it can be used in combination with incomplete NMR data to obtain reasonable structure models in accordance with the experimental data.
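The backbone RMSD quoted for the Ppar γ model is given in nanometres, whereas most structural comparisons in this field are reported in ångströms. The sketch below shows how such a value is computed from two matched coordinate sets and converted between units. It assumes the two structures have already been superposed and performs no fitting. The function names `rmsd` and `nm_to_angstrom` are illustrative and are not part of PERMOL.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two matched coordinate lists.

    Each list holds (x, y, z) tuples for the same atoms in the same order.
    No superposition is performed; the inputs are assumed to be pre-aligned.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


def nm_to_angstrom(value_nm):
    """Convert a length from nanometres to angstroms (1 nm = 10 A)."""
    return value_nm * 10.0


if __name__ == "__main__":
    # The value reported in the abstract, expressed in angstroms.
    print(nm_to_angstrom(0.262))
```

In practice an optimal superposition step precedes the RMSD calculation, so the value obtained from unaligned coordinates would be an upper bound on the fitted RMSD.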
Determination of protein-DNA complex structures with both NMR and X-ray crystallography remains challenging in many cases. High Ambiguity-Driven DOCKing (HADDOCK) is an information-driven docking program that has been used to successfully model many protein-DNA complexes. However, a protein-DNA complex model whereby the protein wraps around DNA has not been reported. Defining the ambiguous interaction restraints for the classical three-Cys2His2 zinc-finger proteins that wrap around DNA is critical because of the complicated binding geometry. In this study, we generated a Zif268-DNA complex model using three different sets of ambiguous interaction restraints (AIRs) to study the effect of the geometric distribution on the docking and used this approach to generate a newly reported Sp1-DNA complex model.
The complex models we generated on the basis of two AIRs with a good geometric distribution in each domain are reasonable in terms of the number of models with wrap-around conformation, interface root mean square deviation, AIR energy and fraction of native contacts. We derived the modeling approach for generating a three-Cys2His2 zinc-finger-DNA complex model from the results of docking studies using the Zif268-DNA complex and three other crystal complex structures. Furthermore, the Sp1-DNA complex model was calculated with this approach, and the interactions between Sp1 and DNA are in good agreement with those previously reported.
Our docking data demonstrate that two AIRs with a reasonable geometric distribution in each of the three-Cys2His2 zinc-finger domains are sufficient to generate an accurate complex model with protein wrapping around DNA. This approach is efficient for generating a zinc-finger protein-DNA complex model for unknown complex structures in which the protein wraps around DNA. We provide a flowchart showing the detailed procedures of this approach.
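One of the quality measures named above, the fraction of native contacts, can be sketched as follows. This is an illustrative implementation rather than the HADDOCK analysis code. The 5 Å cutoff, the function names, and the residue labels in the example are assumptions made here, and contacts are counted per residue pair from atom-atom distances.

```python
import math


def interface_contacts(protein_atoms, dna_atoms, cutoff=5.0):
    """Return the set of (protein_residue, dna_residue) pairs in contact.

    Each atom is a (residue_label, (x, y, z)) tuple. A residue pair is in
    contact if any of its atom pairs lies within `cutoff` angstroms
    (the 5 A default is an assumed, commonly used value).
    """
    contacts = set()
    for res_p, xyz_p in protein_atoms:
        for res_d, xyz_d in dna_atoms:
            if math.dist(xyz_p, xyz_d) <= cutoff:
                contacts.add((res_p, res_d))
    return contacts


def fraction_native_contacts(native_contacts, model_contacts):
    """Fraction of the reference (native) contacts reproduced by the model."""
    if not native_contacts:
        raise ValueError("native contact set is empty")
    return len(native_contacts & model_contacts) / len(native_contacts)


if __name__ == "__main__":
    # Hypothetical toy coordinates, one atom per residue.
    protein = [("R18", (0.0, 0.0, 0.0)), ("E21", (10.0, 0.0, 0.0))]
    native_dna = [("G7", (3.0, 0.0, 0.0)), ("C8", (12.0, 0.0, 0.0))]
    model_dna = [("G7", (3.0, 0.0, 0.0)), ("C8", (20.0, 0.0, 0.0))]
    native = interface_contacts(protein, native_dna)
    model = interface_contacts(protein, model_dna)
    print(fraction_native_contacts(native, model))
```

A model in which the protein fails to wrap around the DNA would lose the contacts made by the displaced fingers, so this fraction drops even when the remaining interface is well reproduced.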
In a combined NMR/MD study, the temperature-dependent changes in the conformation of two members of the RNA YNMG-tetraloop motif (cUUCGg and uCACGg) have been investigated at temperatures of 298, 317 and 325 K. The two members have considerably different thermal stabilities and biological functions, and the combined NMR/MD study was performed to address these differences. The large temperature range represents a challenge for both NMR relaxation analysis (consistent choice of effective bond length and CSA parameter) and all-atom MD simulation with explicit solvent (necessity to rescale the temperature). Convincing agreement between experiment and theory is found. Using a principal component analysis of the MD trajectories, the conformational distribution of both hairpins at various temperatures is investigated. The ground state conformation and dynamics of the two tetraloops are indeed found to be very similar. Furthermore, both systems are initially destabilized by a loss of the stacking interactions between the first and the third nucleobase in the loop region. While the global fold is still preserved, this initiation of unfolding is already observed at 317 K for the uCACGg hairpin but at a significantly higher temperature for the cUUCGg hairpin.