It is of interest to know whether local fluctuations in a polypeptide chain play any role in the mechanism by which the chain folds to the native structure of a protein. This question is addressed by analyzing folding and non-folding trajectories of a protein; as an example, the analysis is applied to the 37-residue triple β-strand WW domain from the Formin binding protein 28 (FBP28) (PDB ID: 1E0L). Molecular dynamics (MD) trajectories were generated with the coarse-grained united-residue force field, and one- and two-dimensional free-energy landscapes (FELs) along the backbone virtual-bond angle θ and backbone virtual-bond-dihedral angle γ of each residue, and principal components, respectively, were analyzed. The key residues involved in the folding of the FBP28 WW domain are elucidated by this analysis. The correlations between local and global motions are found. It is shown that most of the residues in the folding trajectories of the system studied here move in a concerted fashion, following the dynamics of the whole system. This demonstrates how the choice of a pathway has to involve concerted movements in order for this protein to fold. This finding also sheds light on the effectiveness of principal component analysis (PCA) for the description of the folding dynamics of the system studied. It is demonstrated that the FEL along the PCs, computed by considering only several critically-placed residues, can correctly describe the folding dynamics.
Coarse-grained force fields for protein simulations are usually designed and parameterized to treat proteins composed of natural L-amino-acid residues. However, D-amino-acid residues occur in bacterial, fungal (e.g., gramicidins), as well as human-designed proteins. For this reason, we have extended the UNRES coarse-grained force field developed in our laboratory to treat systems with D-amino-acid residues. We developed the respective virtual-bond-torsional and double-torsional potentials for rotation about the Cα · · · Cα virtual-bond axis and two consecutive Cα · · · Cα virtual-bond axes, respectively, as functions of virtual-bond-dihedral angles γ. In turn, these were calculated as potentials of mean force (PMFs) from the diabatic energy surfaces of terminally-blocked model compounds for glycine, alanine, and proline. The potential-energy surfaces were calculated by using the ab initio method of molecular quantum mechanics at the Møller-Plesset (MP2) level of theory and the 6-31G(d,p) basis set, with the rotation angles of the peptide groups about
Ciα⋯Ci+1α(λ(2)) used as variables, and the energy was minimized with respect to the remaining degrees of freedom. The PMFs were calculated by numerical integration for all pairs and triplets with all possible combinations of types (glycine, alanine, and proline) and chirality (D or L); however, symmetry relations reduce the number of non-equivalent torsional potentials to 13 and the number of double-torsional potentials to 63 for a given C-terminal blocking group. Subsequently, one- (for torsional) and two-dimensional (for double-torsional potentials) Fourier series were fitted to the PMFs to obtain analytical expressions. It was found that the torsional potentials of the x-Y and X-y types, where X and Y are Ala or Pro, respectively, and a lowercase letter denotes D-chirality, have global minima for small absolute values of γ, accounting for the double-helical structure of gramicidin A, which is a dimer of two chains, each possessing an alternating D-Tyr-L-Tyr sequence, and similar peptides. The side-chain and correlation potentials for D-amino-acid residues were obtained by applying the reflection about the
Ci-1α⋯Ciα⋯Ci+1α plane to the respective potentials for the L-amino-acid residues.
Since chemical shifts provide important and relatively accessible information about protein structure in solution, a Web server, CheShift-2, was developed for structure interrogation, based on a quantum mechanics database of 13Cα chemical shifts. CheShift-2 to a local inconsistency between two X-ray crystal structures (PDB IDs 1IKN and INFI) of the complex between the p65/p50 heterodimer of NFκB and its inhibitor IκBα. The availability of NMR resonance assignments that included the region of the inconsistency provided an opportunity for independent validation of the CheShift-2 server. Application of the server showed that the 13Cψ chemical shift measured for the Gly270-Pro281 sequence close to the C-terminus of IκBα were unequivocally consistent with the backbone structure modeled in the 1IKN structure, and were inconsistent with the 1NFI structure. Previous NOE measurements had demonstrated that the position of a tryptophan ring in the region immediately N-terminal in this region was not consistent with either structure. Subsequent recalculation of the local structure in this region, based on the electron density of the deposited structure factors for 1IKN, confirmed that the local backbone structure was best modeled by 1IKN, but that the rotamer of Trp258 is consistent with the 1NFI structure, including the presence of a hydrogen bond between the ring NεH of Trp258 and the backbone carbonyl group of Gln278. The consensus between all of these measures suggests that the CheShift-2 server operates well under circumstances in which backbone chemical shifts are available but where local plasticity may render X-ray structural data ambiguous.
By examining the molecular dynamics (MD) of protein folding trajectories, generated with the coarse-grained UNRES force field, for the B-domain of staphylococcal protein A and the triple β-strand WW domain from the Formin binding protein 28 (FBP), by principal component analysis (PCA), it is demonstrated how different free energy landscapes (FELs) and folding pathways of trajectories can be, even though they appear to be very similar by visual inspection of the time-dependence of the root-mean-square deviation (rmsd). Approaches to determine the minimal dimensionality of FELs for a correct description of protein folding dynamics are discussed. The correlation between the amplitude of the fluctuations of proteins and the dimensionality of the FELs is shown. The advantage of internal coordinate PCA over Cartesian PCA for small proteins is also illustrated.
A proposed coarse-grained model of nucleic acids demonstrates that average interactions between base dipoles, together with chain connectivity and excluded-volume interactions, are sufficient to form double-helical structures of DNA and RNA molecules. Additionally, local interactions determine helix handedness and direction of strand packing. This result, and earlier research on reduced protein models, suggest that mean-field multipole-multipole interactions are the principal factors responsible for the formation of regular structure of biomolecules.
A network analysis is used to uncover hidden folding pathways in free-energy landscapes usually defined in terms of such arbitrary order parameters as root-mean-square deviation from the native structure, radius of gyration, etc. The analysis has been applied to molecular dynamics (MD) trajectories of the B-domain of staphylococcal protein A, generated with the coarse-grained united-residue (UNRES) force field in a broad range of temperatures (270K ≤ T ≤ 325K). Thousands of folding pathways have been identified at each temperature. Out of these many folding pathways, several most probable ones were selected for investigation of the conformational transitions during protein folding. Unlike other conformational space network (CSN) methods, a node in the CSN variant implemented in this work is defined according to the nativelikeness class of the structure, which defines the similarity of segments of the compared structures in terms of secondary-structure, contact-pattern, and local geometry, as well as the overall geometric similarity of the conformation under consideration to that of the reference (experimental) structure. Our previous findings, regarding the folding model and conformations found at the folding-transition temperature for protein A (Maisuradze et al., J. Am. Chem. Soc. 132, 9444, 2010), were confirmed by the conformational space network analysis. In the methodology and in the analysis of the results, the shortest path identified by using the shortest-path algorithm corresponds to the most probable folding pathway in the conformational space network.
The potentials of mean force (PMF’s) for the deformation of the Cα ⋯ Cα virtual bonds in polypeptide chains were determined from the diabatic energy surfaces of N-methylacetamide (modeling regular peptide groups) and N-acetylpyrrolidine (modeling the peptide groups preceding proline), calculated at the Møller-Plesset (MP2) ab initio level of theory with the 6-31G(d,p) basis set. The energy surfaces were expressed in the Cα ⋯ Cα virtual-bond length (d) and the H-N-Cα ⋯ C′ improper dihedral angle (α) that describes the pyramidicity of the amide nitrogen, or in the Cα-C′(O)-N-Cα dihedral angle (ω) and the angle α. For each grid point, the potential energy was minimized with respect to all remaining degrees of freedom. The PMF’s obtained from the (d, α) energy surfaces produced realistic free-energy barriers to the trans-cis transition (10 kcal/mol and 13 kcal/mol for the regular and proline peptide groups, respectively, compared to 12.6 – 13.9 kcal/mol and 17.3 – 19.6 kcal/mol determined experimentally for glycylglycine and N-acylprolines, respectively), while those obtained from the (ω, α) energy maps produced either low-quality PMF curves when direct Boltzmann summation was implemented to compute the PMF’s or too-flat curves with too-low free-energy barriers to the trans-cis transition if harmonic extrapolation was used to estimate the contributions to the partition function. An analytical bimodal logarithmic-Gaussian expression was fitted to the PMF’s, and the potentials were implemented in the UNRES force field. Test Langevin-dynamics simulations were carried out for the Gly-Gly and Gly-Pro dipeptides, which showed a 106-fold increase of the simulated rate of the trans-cis isomerization with respect to that measured experimentally; effectively the same result was obtained with the analytical Kramers theory of reaction rate applied to the UNRES representation of the peptide groups. Application of Kramers’ theory to compute the rate constants from the all-atom ab initio energy surfaces of the model compounds studied resulted in isomerization rates close to the experimental values, which demonstrates that the increase of the isomerization rate in UNRES simulations results solely from averaging out the secondary degrees of freedom.
protein folding; trans-cis isomerization of peptide bonds; UNRES force field; potential of mean force; ab initio molecular quantum mechanics
The purpose of this work is to show how mutation, truncation and change of temperature can influence the folding kinetics of a protein. This is accomplished by principal component analysis (PCA) of molecular dynamics (MD)-generated folding trajectories of the triple β-strand WW domain from the Formin binding protein 28 (FBP) [PDB: 1E0L] and its full-size, and singly- and doubly-truncated mutants at temperatures below and very close to the melting point. The reasons for biphasic folding kinetics [i.e., coexistence of slow (three-state) and fast (two-state) phases], including the involvement of a solvent-exposed hydrophobic cluster and another delocalized hydrophobic core in the folding kinetics, are discussed. New folding pathways are identified in free-energy landscapes determined in terms of principal components for full-size mutants. Three-state folding is found to be a main mechanism for folding FBP28 WW domain and most of the full-size and truncated mutants. The results from the theoretical analysis are compared to those from experiment. Agreements and discrepancies between the theoretical and experimental results are discussed. Because of its importance in understanding protein kinetics and function, the diffusive mechanism by which FBP28 WW domain and its full-size and truncated mutants explore their conformational space is examined in terms of the mean-square displacement, (MSD), and PCA eigenvalue spectrum analyses. Subdiffusive behavior is observed for all studied systems.
In the oxidative folding of onconase, the stabilization of intermediates early in the folding process gives rise to efficient formation of its biologically active form. To identify the residues responsible for initial formation of structured intermediates, the transition from an ensemble of unstructured three-disulfide species, 3SU, to a single structured three-disulfide intermediate species, des-[30–75] or 3SF, at pH 8.0, 25 °C was examined. This transition was first monitored by far-UV CD spectroscopy at pH 8.0, 25 °C, showing that it occurs with formation of secondary structure, presumably due to native interactions. The time-dependence of formation of native-like structure was then followed by NMR spectroscopy after arresting the transition at different times by lowering the pH to 3 and then acquiring 1H,15N – HSQC spectra at pH 3, 16 °C to identify amide hydrogens that become part of native-like structure. H/D exchange was utilized to reduce the intensity of resonances from backbone amide hydrogens not involved in structure, without allowing exchange of backbone amide hydrogens involved in initial structure. Six hydrogen-bonding residues, namely, Tyr38, Lys49, Ser82, Cys90, Glu91, and Ala94 were identified as involved in the earliest detectable native-like structure before complete formation of des-[30–75], and are further stabilized later in the formation of this intermediate through S-S/SH interchange. By observing the stabilization of the structures of these residues by their neighboring residues, the initial, native-like structural elements formed in this transition have been identified, providing details of the initial events in the oxidative folding of onconase.
We have examined the effect of like-charged residues on the conformation of an oligoalanine sequence. This was facilitated by CD and NMR spectroscopic and differential scanning calorimetric (DSC) measurements, and molecular dynamics calculations, of the following three alanine-based peptides: Ac-K-(A)5-K-NH2 (KAK5), Ac-K-(A)4-K-NH2 (KAK4), Ac-K-(A)3-K-NH2 (KAK3), where A and K denote alanine and lysine residues, respectively. Our earlier studies suggested that the presence of like-charged residues at the end of a short polypeptide chain composed of nonpolar residues can induce a chain reversal. For all three peptides, canonical MD simulations with NMR-derived restraints demonstrate the presence of ensembles of structures with a tendency to form a chain reversal. The KAK3 peptide exhibits a bent shape with its ends close to each other, while KAK4 and KAK5 are more extended. In the KAK5 peptide, the lysine residues do not have any influence on each other and are very mobile. Nevertheless, the tendency to form a more or less pronounced chain reversal is observed and it seems to be stable in all three peptides. This chain reversal seems to be caused by screening of the nonpolar core from the solvent by the hydrated charged residues
alanine-based peptides; NMR spectroscopy; molecular dynamics; conformational studies; bent-like structure
The β amyloid (Aβ) peptide aggregates to form β-rich structures that are known to trigger Alzheimer’s disease. Experiments suggest that an α-helical intermediate precedes the formation of these aggregates. However, a description at the molecular level of the α-to-β transition has not been obtained. Because it has been proposed that the transition might be initiated in the amino-terminal region of Aβ, we studied the aggregation of the 28-residue amino-terminal fragment of Aβ (Aβ1–28) using molecular dynamics and a coarse-grained force field. Simulations starting from extended and helical conformations showed that oligomerization is initiated by formation of intermolecular β -sheets between the residues in the N-terminal regions. In simulations starting from the α-helical conformation, forcing residues 17–21 to remain in the initial (helical) conformation prevents aggregation but allows for the formation of dimers, indicating that oligomerization, initiated along the non-helical N-terminal regions, cannot progress without the α-to-β transition propagating along the chains.
Amyloids; Aβ peptide; UNRES force field; molecular dynamics; Alzheimer’s disease
We review the coarse-grained UNited RESidue (UNRES) force field for the simulations of protein structure and dynamics, which is being developed in our laboratory over the last several years. UNRES is a physics-based force field, the prototype of which is defined as a potential of mean force of polypeptide chains in water, where all the degrees of freedom except the coordinates of α-carbon atoms and side-chain centers have been integrated out. We describe the initial implementation of UNRES to protein-structure prediction formulated as a search for the global minimum of the potential-energy function and its subsequent molecular dynamics and extensions of molecular-dynamics implementation, which enabled us to study protein-folding pathways and thermodynamics, as well as to reformulate the protein-structure prediction problem as a search for the conformational ensemble with the lowest free energy at temperatures below the folding-transition temperature. Applications of UNRES to study biological problems are also described.
By using the potentiometric titration method, we have determined the pKa values of the two terminal lysine groups in six alanine-based peptides differing in the length of the alanine chain: Ac–Lys–Lys–NH2 (KK), Ac–Lys–Ala–Lys–NH2 (KAK), Ac–Lys–Ala–Ala–Lys–NH2 (KAK2), Ac–Lys–Ala–Ala–Ala–Lys–NH2 (KAK3), Ac–Lys–Ala–Ala–Ala–Ala–Lys–NH2 (KAK4), and Ac–Lys–Ala–Ala–Ala–Ala–Ala–Lys–NH2 (KAK5) in aqueous solution. For each compound, the model of two stepwise acid–base equilibria was fitted to the potentiometric-titration data. As expected, the pKa values of the lysine groups increase with increasing length of the alanine spacer, which means that the influence of the electrostatic field between one charged group on the other decreases with increasing length of the alanine spacer. However, for KAK3, the pKa1 value (8.20) is unusually small and pKa2 (11.41) is remarkably greater than pKa1, suggesting that the two groups are close to each other and, in turn, that a chain-reversal conformation is present for this peptide. Starting with KAK3, the differences between pKa1 and pKa2 decrease; however, for the longest peptide (KAK5), the values of pKa1 and pKa2 still differ by about 1 unit, i.e., by more than the value of log10 (4) = 0.60 that is a limiting value for the pKa difference of dicarboxylic acids with increasing methylene-spacer length. Consequently, some interactions between the two charged groups are present and, in turn, a bent shape occurs even for the longest of the peptides studied.
Alanine-based peptides; Potentiometric titration; Acid–base equilibria
Two major techniques have been used to determine the three-dimensional
structures of proteins: x-ray diffraction and NMR spectroscopy. In particular,
the validation of NMR-derived protein structures is one of the most challenging
problems in NMR spectroscopy. Therefore, researchers have proposed a plethora of
methods to determine the accuracy and reliability of protein structures. Despite
these proposals, there is a growing need for more sophisticated, physics-based
structure validation methods. This approach will enable us to (a) characterize
the “quality” of the NMR-derived ensemble as a whole by a single
parameter, (b) unambiguously identify flaws in the sequence at a residue level,
and (c) provide precise information, such as sets of backbone and side-chain
torsional angles, that we can use to detect local flaws.
Rather than reviewing all of the existing validation methods, this
Account describes the contributions of our research group toward a solution of
the long-standing problem of both global and local structure validation of
NMR-derived protein structures. We emphasize a recently introduced physics-based
methodology that makes use of observed and computed
13Cα chemical shifts (at the DFT level of
theory) for an accurate validation of protein structures in solution and in
crystals. By assessing the ability of computed 13Cα
chemical shifts to reproduce observed 13Cα chemical
shifts of a single or ensemble of structures in solution and in crystals, we
accomplish a global validation by using the
ca-rmsd, as a scoring function. In addition, the method
enables us to provide local validation by identifying a set of individual amino
acid conformations for which the computed and observed
13Cα chemical shifts do not agree within a
certain error range and may represent a non-reliable fold of the protein
Although it is computationally intensive, our validation method has
several advantages, which we illustrate through a series of applications. This
method makes use of the 13Cα chemical shifts, not
shielding, that are ubiquitous to proteins and can be computed precisely from
the φ, ψ, and χ torsional angles. There is no need for
a priori knowledge of the oligomeric state of the protein,
and no knowledge-based information or additional NMR data are required. The
primary limitation at this point is the computational cost of such calculations.
However, we anticipate that enhancements both in the speed of calculating these
chemical shifts coupled with ever increasing computational power should soon
make this a standard method accessible to the general NMR community.
The two-site coarse-grained model for the interactions of charged side chains, to be used with our coarse-grained UNRES force field for protein simulations proposed in the accompanying paper, has been extended to pairs of oppositely-charged side chains. The potentials of mean force of four pairs of molecules modeling charged amino-acid side chains, i.e., propionate – n-pentylamine cation (for aspartic acid – lysine), butyrate…n-pentylamine cation (for glutamic acid – lysine), propionate –1-butylguanidine (for aspartic acid – arginine), and butyrate – 1-butylguanidine (for glutamic acid – arginine) pairs were determined by umbrella-sampling molecular dynamics simulations in explicit water as functions of distance and orientation, and the analytical expression was fitted to the potentials of mean force. Compared to pairs of like-charged side chains discussed in the accompanying paper, an average quadrupole-quadrupole interaction term had to be introduced to reproduce the Coulombic interactions, and a multi-state model of charge distribution had to be introduced to fit the potentials of mean force of all oppositely-charged pairs well. The model reproduces all salt-bridge minima and, consequently, is likely to improve the performance of the UNRES force field.
charged side chains; new model of side-chain – side-chain interactions; potential of mean force; molecular dynamics; umbrella sampling; multipole expansion
A new model of side-chain – side-chain interactions for charged side-chains of amino acids, to be used in the UNRES force-field, has been developed, in which a side chain consists of a nonpolar and a charged site. The interaction energy between the nonpolar sites is composed of a Gay-Berne and a cavity term; the interaction energy between the charged sites consists of a Lennard-Jones term, a Coulombic term, a Generalized-Born term, and a cavity term, while the interaction energy between the nonpolar and charged sites is composed of a Gay-Berne and a polarization term. We parameterized the energy function for the models of all six pairs of natural like-charged amino-acid side chains, namely propionate-propionate (for the aspartic acid-aspartic acid pair), butyrate-butyrate (for the glutamic acid-glutamic acid pair), propionate-butyrate (for the aspartic acid-glutamic acid pair), pentylamine cation-pentylamine cation (for the lysine-lysine pair), 1-butylguanidine cation-1-butylguanidine cation (for the arginine-arginine pair), and pentylamine cation-1-butylguanidine cation (for the lysine-arginine pair). By using umbrella-sampling molecular dynamics simulations in explicit TIP3P water, we determined the potentials of mean force of the above-mentioned pairs as functions of distance and orientation and fitted analytical expressions to them. The positions and depths of the contact minima and the positions and heights of the desolvation maxima, including their dependence on the orientation of the molecules were well represented by analytical expressions for all systems. The values of the parameters of all the energy components are physically reasonable, which justifies use of such potentials in coarse-grain protein-folding simulations.
charged side chains; new model of side-chain – side-chain interactions; potential of mean force; molecular dynamics; umbrella sampling
Summary: The differences between observed and predicted 13Cα chemical shifts can be used as a sensitive probe with which to detect possible local flaws in protein structures. For this reason, we previously introduced CheShift, a Web server for protein structure validation. Now, we present CheShift-2 in which a graphical user interface is implemented to render such local flaws easily visible. A series of applications to 15 ensembles of conformations illustrate the ability of CheShift-2 to locate the main structural flaws rapidly and accurately on a per-residue basis. Since accuracy plays a central role in CheShift predictions, the treatment of histidine (His) is investigated here by exploring which form of His should be used in CheShift-2.
CheShift-2 is free of charge for academic use and can be accessed from www.cheshift.com
Supplementary data are available at the Bioinformatics online.
A key regulator of AMPA (α-amino-3-hydroxy-5-methylisoxazole-4-propionic acid) receptor traffic, PICK1 is also known to interact with over 40 other proteins, including receptors, transporters, and ionic channels, and to be active mostly as a homodimer. The current lack of a complete PICK1 structure determined at atomic resolution hinders the elucidation of its functional mechanisms. Here, we identify interactions between the component PDZ and BAR domains of PICK1 by calculating possible binding sites for the PDZ domain of PICK1, PICK1-PDZ, to the homology-modeled crescent-shaped dimer of the PICK1-BAR domain using multiplexed replica-exchange molecular dynamics (MREMD) and canonical molecular dynamics (MD) simulations with the coarse-grained UNRES force field. The MREMD results show that the preferred binding site for the single PDZ domain is the concave cavity of the BAR dimer. A second possible binding site is near the N-terminus of the BAR domain that is linked directly to the PDZ domain. Subsequent short MD simulations, used to determine how the PICK1-PDZ domain moves to the preferred binding site on the BAR domain of PICK1, revealed that initial hydrophobic interactions drive the progress of the simulated binding. Thus, the concave face of the BAR dimer accommodates the PDZ domain first by weak hydrophobic interactions, and then the PDZ domain slides to the center of the concave face, where more favorable hydrophobic interactions take over.
PICK1; binding; hydrophobic interactions; UNRES force field; molecular dynamics
We report the modification and parameterization of the united-residue (UNRES) force field for energy-based protein-structure prediction and protein-folding simulations. We tested the approach on three training proteins separately: 1E0L (β), 1GAB (α), and 1E0G (α + β). Heretofore, the UNRES force field had been designed and parameterized to locate native-like structures of proteins as global minima of their effective potential-energy surfaces, which largely neglected the conformational entropy because decoys composed of only lowest-energy conformations were used to optimize the force field. Recently, we developed a mesoscopic dynamics procedure for UNRES, and applied it with success to simulate protein folding pathways. How ever, the force field turned out to be largely biased towards α-helical structures in canonical simulations because the conformational entropy had been neglected in the parameterization. We applied the hierarchical optimization method developed in our earlier work to optimize the force field, in which the conformational space of a training protein is divided into levels each corresponding to a certain degree of native-likeness. The levels are ordered according to increasing native-likeness; level 0 corresponds to structures with no native-like elements and the highest level corresponds to the fully native-like structures. The aim of optimization is to achieve the order of the free energies of levels, decreasing as their native-likeness increases. The procedure is iterative, and decoys of the training protein(s) generated with the energy-function parameters of the preceding iteration are used to optimize the force field in a current iteration. We applied the multiplexing replica exchange molecular dynamics (MREMD) method, recently implemented in UNRES, to generate decoys; with this modification, conformational entropy is taken into account. Moreover, we optimized the free-energy gaps between levels at temperatures corresponding to a predominance of folded or unfolded structures, as well as to structures at the putative folding-transition temperature, changing the sign of the gaps at the transition temperature. This enabled us to obtain force fields characterized by a single peak in the heat capacity at the transition temperature. Furthermore, we introduced temperature dependence to the UNRES force field; this is consistent with the fact that it is a free-energy and not a potential-energy function.
ab initio protein folding; folding transition; thermodynamic hypothesis; potential-function optimization; hierarchical energy landscapes; foldability
The mechanism of growth of fibrils of the β-amyloid peptide (Aβ) was studied by means of a physics-based coarse-grained united-residue (UNRES) model and molecular dynamics (MD) simulations. To identify the mechanism of monomer addition to an Aβ1–40 fibril, an unstructured monomer was placed at a 20 Å distance from a fibril template, and allowed to interact freely with it. The monomer was not biased towards the fibril conformation, by either the force field or the MD algorithm. By using a coarse-grained model with replica exchange MD, a longer time scale was accessible making it possible to observe how the monomers probe different binding modes during their search towards the fibril conformation. Although different assembly pathways were seen, they all follow a dock-lock mechanism, with two distinct locking stages, which is consistent with data from experiments on fibril elongation. Whereas these experiments have not been able to characterize the conformations populating the different stages, we have been able to describe these different stages explicitly by following free monomers as they dock onto a fibril template and adopt the fibril conformation; i.e., we describe fibril elongation step by step, at the molecular level. During the first stage of the assembly, “docking”, the monomer tries different conformations. After docking, the monomer is locked into the fibril through two different locking stages. In the first stage the monomer forms hydrogen bonds with the fibril template along one of the strands in a two-stranded β hairpin; in the second stage, hydrogen bonds are formed along the second strand, locking the monomer into the fibril structure. The data reveal a free-energy barrier separating the two locking stages. The importance of hydrophobic interactions and hydrogen bonds in the stability of the Aβ fibril structure was examined by carrying out additional canonical MD simulations of oligomers with different numbers of chains (4 to 16 chains) with the fibril structure as the initial conformation. The data confirm that the structures are stabilized largely by hydrophobic interactions and show that the intermolecular hydrogen bonds are highly stable and contribute to the stability of the oligomers as well.
amyloids; Aβ peptide; Alzheimer’s disease; hydrophobic interactions; UNRES force field; molecular dynamics
Formation of β-hairpins is considered the initial step of folding of many proteins and, consequently, peptides constituting the β-hairpin sequence of proteins (the β-hairpin-forming peptides) are considered as models of early stages of protein folding. In this article, we discuss the results of experimental studies (circular-dichroism, infrared and nuclear magnetic resonance spectroscopy, and differential scanning calorimetry) of the structure of β-hairpin-forming peptides excised from the B1 domain of protein G, which are known to fold on their own. We demonstrate that local interactions at the turn sequence and hydrophobic interactions between nonpolar residues are the dominant structure-determining factors, while there is no convincing evidence that stable backbone hydrogen bonds are formed in these peptides in aqueous solution. Consequently, the most plausible mechanism for folding of the β-hairpin sequence appears to be the broken-zipper mechanism consisting of the following three steps: (i) bending the chain at the turn sequence owing to favorable local interactions, (ii) formation of loose hydrophobic contacts between nonpolar residues, which occur close to the contacts in the native structure of the protein but not exactly in the same position and, finally, (iii) formation of backbone hydrogen bonds and locking the hydrophobic contacts in the native positions as a hydrophobic core develops, sufficient to dehydrate the backbone peptide groups. This mechanism provides sufficient uniqueness (contacts form between residues that become close together because the chain is bent at the turn position) and robustness (contacts need not occur at once in the native positions) for folding a β-hairpin sequence.
β-hairpin-forming peptides; protein folding; CD spectroscopy; IR spectroscopy; NMR spectroscopy; differential scanning calorimetry
To evaluate sequential nearest-neighbor effects on quantum-chemical calculations of 13Cα chemical shifts, we selected the structure of the nucleic acid binding (NAB) protein from the SARS coronavirus determined by NMR in solution (PDB id 2K87). NAB is a 116-residue α/β protein, which contains 9 prolines and has 50% of its residues located in loops and turns. Overall, the results presented here show that sizeable nearest-neighbor effects are seen only for residues preceding proline, where Pro introduces an overestimation, on average, of 1.73 ppm in the computed 13Cα chemical shifts. A new ensemble of 20 conformers representing the NMR structure of the NAB, which was calculated with an input containing backbone torsion angle constraints derived from the theoretical 13Cα chemical shifts as supplementary data to the NOE distance constraints, exhibits very similar topology and comparable agreement with the NOE constraints as the published NMR structure. However, the two structures differ in the patterns of differences between observed and computed 13Cα chemical shifts, Δca,i, for the individual residues along the sequence. This indicates that the Δca,i -values for the NAB protein are primarily a consequence of the limited sampling by the bundles of 20 conformers used, as in common practice, to represent the two NMR structures, rather than of local flaws in the structures.
Quantum-chemical calculation of 13Cα - chemical shifts; NMR structures of proteins; Sampling of conformation space
The folding of the B-domain of staphylococcal protein A has been studied by coarse-grained canonical and multiplexed replica-exchange molecular dynamics simulations with the UNRES force field in a broad range of temperatures (270K ≤ T ≤ 350K). In canonical simulations, the folding was found to occur either directly to the native state or through kinetic traps, mainly the topological mirror image of the native three-helix bundle. The latter folding scenario was observed more frequently at low temperatures. With increase of temperature, the frequency of the transitions between the folded and misfolded/unfolded states increased and the folded state became more diffuse with conformations exhibiting increased root-mean-square deviations from the experimental structure (from about 4 Å at T = 300K to 8.7 Å at T = 325K). An analysis of the equilibrium conformational ensemble determined from multiplexed replica exchange simulations at the folding-transition temperature (Tf = 325K) showed that the conformational ensemble at this temperature is a collection of conformations with residual secondary structures, which possess native or near-native clusters of nonpolar residues in place, and not a 50%-50% mixture of fully-folded and fully-unfolded conformations. These findings contradict the quasi-chemical picture of two- or multi-state protein folding, which assumes an equilibrium between the folded, unfolded, and intermediate states, with equilibrium shifting with temperature but with the native conformations remaining essentially unchanged. Our results also suggest that long-range hydrophobic contacts are the essential factor to keep the structure of a protein thermally stable.
protein folding; folding/unfolding transition; coarse-grained dynamics; conformational ensemble
A 34-residue α/β peptide, [IG(28-61)], derived from the C-terminal part of the B3 domain of the immunoglobulin binding protein G from Streptoccocus was studied using CD and NMR spectroscopy at various temperatures, and by differential scanning calorimetry. It was found that the C-terminal part (a 16-residue-long fragment) of this peptide, which corresponds to the sequence of the β-hairpin in the native structure, forms structure similar to the β-hairpin only at T = 313 K, and the structure is stabilized by non-native long-range hydrophobic interactions (Val47 – Val59). On the other hand, the N-terminal part of IG(28-61), which corresponds to the middle α-helix in the native structure, is unstructured at low temperature (283 K), and forms an α-helix-like structure at 305 K and only one helical turn is observed at 313 K. At all temperatures at which NMR experiments were performed (283, 305 and 313 K), we do not observe any long-range connectivities which would have supported packing between the C-terminal (β-hairpin) and the N-terminal (α-helix) parts of the sequence. Such interactions are absent, in contrast to the folding pathway of the B domain of protein G, proposed recently by Kmiecik and Koliński [Kmiecik, S.; Kolinski, A. Biophys J 2008, 94, 726-736], based on Monte Carlo dynamics studies. Alternative folding mechanisms are proposed and discussed.
peptide structure; β-hairpin; α-helix; B3 domain of protein G; NMR; CD
In this and the accompanying paper, we report the development of new physics-based side-chain-rotamer and virtual-bond-deformation potentials which now replace the respective statistical potentials used so far in our physics-based united-reside UNRES force field for large-scale simulations of protein structure and dynamics. In this paper, we describe the methodology for determining the corresponding potentials of mean force (PMF’s) from the energy surfaces of terminally-blocked amino-acid residues calculated with the AM1 quantum-mechanical semiempirical method. The approach is based on minimization of the AM1 energy for fixed values of the angles λ for rotation of the peptide groups about the Cα ⋯ Cα virtual bonds, and for fixed values of the side-chain dihedral angles χ, which formed a multi-dimensional grid. A harmonic-approximation approach was developed to extrapolate from the energy at a given grid point to other points of the conformational space in order to compute the respective contributions to the PMF. To test the applicability of the harmonic approximation, the rotamer PMF’s of alanine and valine obtained with this approach have been compared with those obtained by using a Metropolis Monte Carlo method. The PMF surfaces computed with the harmonic approximation are more rugged and have more pronounced minima than the MC-calculated surfaces but the harmonic-approximation- and MC-calculated PMF values are linearly correlated. The potentials derived with the harmonic approximation are, therefore, appropriate for UNRES for which the weights (scaling factors) of the energy terms are determined by force-field optimization for foldability.
protein folding; UNRES force field; potentials of mean force; quantum mechanics; harmonic approximation