|Home | About | Journals | Submit | Contact Us | Français|
Chemical denaturants are the most commonly used agent for unfolding proteins, and are thought to act by better solvating the unfolded state. Improved solvation is expected to lead to an expansion of unfolded chains with increasing denaturant concentration, providing a sensitive probe of the denaturant action. However, experiments have so far yielded qualitatively different results concerning the effects of chemical denaturation, with studies using Förster resonance energy transfer (FRET) and other methods finding an increase in radius of gyration with denaturant concentration, but with small-angle X-ray scattering (SAXS) studies finding mostly no change. This discrepancy therefore challenges our understanding of denaturation mechanism, and more generally the accuracy of these experiments as applied to unfolded or disordered proteins. Here, we use all-atom molecular simulations to investigate the effect of urea and guanidinium chloride on the structure of the intrinsically disordered protein ACTR, which can be studied by experiment over a wide range of denaturant concentration. Using unbiased molecular simulations with a carefully calibrated denaturant model, we find that the protein chain indeed swells with increasing denaturant concentration. This is due to the favourable association of urea or guanidinium chloride with the backbone of all residues and with the side-chains of almost all residues, with denaturant-water transfer free energies inferred from this association in reasonable accord with experimental estimates. Interactions of the denaturants with the backbone are dominated by hydrogen bonding, while interactions with side-chains include other contributions. By computing FRET transfer efficiencies and SAXS intensities at each denaturant concentration, we show that the simulation trajectories are in accord with both experiments on this protein, demonstrating that there is no fundamental inconsistency between the two types of experiment. Agreement with experiment also supports the picture of chemical denaturation described in our simulations, driven by weak association of denaturant with the protein. Our simulations support some assumptions needed for each experiment to accurately reflect changes in protein size, namely that the commonly used FRET chromophores do not qualitatively alter the results, and that possible effects such as preferential solvent partitioning into the interior of the chain do not interfere with the determination of radius of gyration from the SAXS experiments.
Addition of chemical denaturants represents one of the most straightforward and widely used methods of perturbing the stability of proteins, since they are simple to use and rarely involve protein aggregation artifacts which may occur, for example, in thermal denaturation. To a good approximation, their action is the result of differential effects on the folded and unfolded states, for which several models have been proposed1–3. In the most accepted model, the denaturant binds weakly to the protein, favouring unfolding due to the greater surface available for binding in the unfolded state1, 4–10. At a coarser level, the consequence of such models is that solutions containing higher denaturant concentrations are better able to solvate proteins. Polymer theory predicts that a concomitant effect of this improved solvation should be a further swelling of the unfolded chain with increasing denaturant concentration11–12, providing a sensitive measure of the denaturant action. Indeed, studies using a large number of experimental techniques have reached this conclusion13, including ensemble and single-molecule Förster resonance energy transfer (FRET)14–22, dynamic light-scattering (DLS)10, 23, nuclear magnetic resonance (NMR)24–26, and small-angle X-ray scattering (SAXS)25, 27.
However, a significant number of studies, using SAXS on unfolded proteins28–32, have reached a qualitatively different conclusion, namely that the radius of gyration (Rg) of the unfolded state of a two-state protein does not vary with denaturant concentration. Also for intrinsically disordered proteins33, which are easier to study experimentally as the folded and unfolded states do not need to be separated, the results differ according to the method used: FRET experiments suggest an expansion with denaturant concentration for a number of IDPs34–35, while SAXS experiments on the N protein, an IDP from bacteriophage lambda, were inconclusive regarding the change in Rg with added urea, after considering the errors36. The reasons for the differences between the conclusions drawn from these different experiments are unclear, but important to resolve. Firstly, the implication is that at least some of the experimental results, as currently analyzed, are incorrect. This would have wide-ranging implications because both SAXS and FRET experiments are frequently used to characterize IDPs. Secondly, if proteins do not expand with increasing denaturant concentration, it means we must fundamentally re-evaluate our understanding of the denaturation mechanism. To the best of our knowledge, there is still no accepted solution to this controversy, and it has been identified by two recent reviews as one of the key outstanding problems in protein folding37–38.
Leaving aside finer details of the experimental analysis, which are not the focus of the present work, it is possible that the apparent collapse or lack thereof is related to the experimental probes. While FRET can detect dimensional changes with unparalleled sensitivity even in heterogeneous samples and at very low sample concentration, it has been suggested that the hydrophobicity of the chromophores used to label the protein may stabilize a collapsed unfolded state at low denaturant concentration, making it difficult to determine the true collapse in the absence of chromophores31–32. SAXS requires highly homogeneous samples and relatively high protein concentration, but should provide a robust readout of the protein dimension, because the data at very low scattering angles are a direct measure of the Rg via the Guinier approximation. Assuming perfect experimental data, the only necessary assumption is a lack of systematic variation of solvent structure within the volume of the protein chain, which cannot easily be corrected for by the solvent subtraction procedure routinely used in SAXS data analysis. It is always expected that the composition of the surface layer around the protein will differ from the bulk solvent, but imperfect subtraction of this surface layer should not influence the long-range structural features probed at low scattering angles, since this layer would be highly correlated with the chain locus. The only way in which the inferred Rg could conceivably be altered is a preferential partitioning of denaturant molecules, for example, towards the center of the coil rather than at the periphery, or vice versa. This type of effect could arise, e.g., due to some denaturant molecules binding to multiple sites on the protein, leading to cooperativity and preferred binding where the chain is most dense. Differences in background contrast between denaturant and water molecules could then alter the apparent Rg of the solute. Whether such an effect could be strong enough to alter measurably the Rg, and approximately cancel any expansion with denaturant concentration, is unclear a priori.
The interpretation of most experimental data requires a simplified model of some type, for example a polymer model to obtain an average distance or Rg from a FRET experiment39–40, or a continuum solvent model to interpret SAXS experiments41, in order to solve the inverse problem of reconstructing molecular properties from a small number of observables (or sub-ensemble averages for FRET). Hence, the results may be sensitive to the specific model chosen. Alternatively, one can use atomistic molecular simulations with an accurate force field as a predictive tool, and employ the experimental data to validate the quality of simulations (since they are themselves based on empirically derived force fields)42. While the simulation model is much more complex, it has the advantage of not being fitted to the experiments it is meant to explain, i.e. the parameters are transferable to different proteins. If the simulations are quantitatively comparable to the experiments, they can then be used to provide molecular-scale insights into the observed phenomenon. Simulations have demonstrated their value in interpreting scattering data on unfolded proteins at large scattering angles, where analytical models may be insufficient43. Molecular simulations have already been used in a large number of studies to determine the effect of chemical denaturants on protein stability5, 44–45, on the unfolded state44, 46–47 and on the mechanism by which they denature proteins4–5, 8, 46, 48–53. However, the rapid denaturation observed in some of these studies suggested that many of the force fields may not quantitatively capture the effect of denaturants on protein stability. Indeed, a recent study by Netz and co-workers found that, while the best combination of protein and urea force fields they tested reproduced well the variation in denaturant affinity from one amino acid to another, the affinity of urea for each residue type was ~0.5 kcal/mol too favourable, per residue6. Other studies have also found too strong an association of urea with peptide models using a variety of force fields, and different methods5, 54. Clearly, for any study aiming to capture quantitatively the effect of denaturant on unfolded proteins, the simulation model must, at a minimum, reproduce the affinity of the denaturant for the chain.
To address this problem, we have recently parameterized models for urea and guanidinium chloride (GdmCl) in order to achieve a good balance between protein-protein, protein-water and protein-denaturant interactions. Water-protein interactions were first tuned by matching data for a short unfolded peptide, and cross-validated against multiple other experiments55. Protein-denaturant interactions were adjusted by scaling these interactions to match experimental solubility data for a tetraglycine peptide. The resulting force field was shown to reproduce denaturant-dependent FRET efficiencies of a fragment of the protein CspTm35 in which the chromophores were explicitly represented, as well as m-values for denaturation of the Trp cage miniprotein44. Here, we apply this force field to study the denaturation of the intrinsically disordered protein ACTR (activator for thyroid hormone and retinoid receptors)56–57 in urea and GdmCl over a range of denaturant concentrations. The advantage of studying an IDP is that in experiments there is no signal from the folded state which needs to be separated, and so, even at equilibrium, data can be recorded down to the lowest denaturant concentrations where the largest change in FRET efficiencies is typically observed. We show first that we are able to capture the experimental FRET efficiencies and X-ray scattering intensities from unbiased simulation trajectories, within the estimated statistical uncertainty of the simulation. The simulations show an expansion of the chain with increasing denaturant concentration, demonstrating that such an expansion can be consistent with both SAXS and FRET experimental results. We have also compared our results with experimental measurements of transfer free energies, and analyzed in detail the contributions made by different groups in the protein to these free energies. We find that urea associates favorably with almost every residue in the protein, explaining the improved solvation implied by chain expansion.
Having shown that we can reproduce the experimental data adequately, the simulation results allow us to test potential molecular scale artefacts which may confound the interpretation of FRET and SAXS data. For SAXS, we investigate the potential effect of solvent structure on the measured SAXS intensity by comparing the signal computed from an all-atom simulation to that calculated using a continuum model for the solvent, as well as by analyzing the first solvation layer in more detail. For FRET, we compare results from simulations with and without explicit representation of the chromophores, to test their influence on the degree of collapse, and some of the approximations which need to be made in order to estimate intramolecular distances from FRET efficiencies.
All-atom simulations were run using the Gromacs 4.6.758 simulation code at a constant temperature of 298 K (maintained by a Langevin thermostat) and pressure of 1 bar (with a Parrinello-Rahman barostat59). The time step was 2 fs, electrostatic energies and forces were computed with particle-mesh Ewald60 using a 0.12 nm grid spacing and real-space cut-off of 0.9 nm. Lennard-Jones interactions were calculated using a twin-range scheme with inner and outer cut-offs of 0.9 and 1.4 nm. The Amber ff03ws force field55 was used for the protein together with the TIP4P/2005 water model61 and KBFFs model for urea and GdmCl44 (i.e. the Kirkwood-Buff force field (KBFF) model62–63 including scaled denaturant-protein interactions in order to balance protein-denaturant interactions). The dye force field was described in a previous work64. Details of system size, composition and run length are summarized in Table S1, and the sequences of the peptides simulated in Table S2.
All-atom SAXS calculations were performed using the algorithm described by Köfinger and Hummer65, and is briefly described here. The SAXS intensity I(q) is calculated by
in which ΔIij represents the partial intensity difference between protein with solution (foreground) and pure solution (background), and fi(q) the form factor of species i. The term νIij(q) adds back the bulk solvent contribution oversubtracted in Iij. This correction is not used in the current work, since it has been shown in the original literature to have limited influence on the SAXS intensity for the range of q we are interested in (q < 0.5 Å−1). For each denaturant condition with SAXS calculation, two sets of MD simulations, with and without a protein molecule, were set up with the same number of denaturant molecules and ions. The protein was replaced with additional water molecules in the pure solvent simulation to make the volume of the background simulation the same as the foreground. ΔIij can then be calculated from all-atom MD data by
in which ΔNi is the difference of the average number of particles of species i between the foreground and background simulations, and ΔHij are the difference distribution functions of interparticle pair distances between the foreground and background simulations, and R is the radius of the sphere in which foreground observations are made.
SAXS calculations with an implicit solvent model were performed with the programs CRYSOL41 and FOXS66. Here, we briefly describe CRYSOL and refer the readers to the original literature for more details. The scattering intensity is calculated by three terms:
the first scattering amplitude is that of the protein in vacuo, the second is from the volume excluded to the solvent, and the third the one from the surface hydration layer with a higher density than the bulk solution. ρb and ρw represent the scattering density of surface layer and pure solvent, respectively, and Ω an average over a uniform distribution of macromolecular orientations relative to the incident beam. For a conformational ensemble additional averaging needs to be performed over the I(q) profiles calculated for each ensemble member. Since the hydration layer is empirically estimated, just protein coordinates are required to calculate the SAXS intensity using CRYSOL. Here we use the protein coordinates only from the same all-atom MD simulation data.
In most cases, FRET efficiencies E were calculated based on the donor-acceptor distance R, assuming that the orientational dynamics of donor and acceptor chromophores was fast compared to the fluorescence lifetime,67 so that the orientational factor κ2 = 2/3, and that the distance dynamics within the chain are slow relative to fluorescence lifetime of the donor67–68, i.e.
where the averaging is over all frames of the trajectory, and R0 is the spectroscopically determined Förster radius69 for the donor and acceptor dyes used here, AlexaFluor 488 and AlexaFluor 59414. R0 is defined from the refractive index, n, and the R0 in the absence of denaturant, R0(0) = 5.4 nm, as69:
In this expression, n(0) is the refractive index at 0 M and
in which [Urea] is the urea concentration (this curve is determined for a solution of 50 mM sodium phosphate, 140 mM β-mercaptoethanol and 0.01% Tween 20)70. It is assumed that changes in the donor quantum yield and spectral overlap integral do not significantly change with denaturant concentration. Since the chromophores were not present in most of the simulations, in these cases R was calculated between the Cα of residues X and Y labeled in the experiment. The distance is then rescaled by a factor of
in which N is the number of bonds between the FRET dyes in the experiment, ν is the Flory scaling exponent determined from the scaling of internal chain distances with sequence separation in the simulation (Figure 1),71 and Nlinker is a free parameter representing the linker length. It has previously been estimated empirically to be ~917, 35.
For the case in which the chromophores were explicitly simulated, the FRET efficiency can be calculated in three different ways. The first is to use the equation and correction factor described above, in which distances between Cα are used. The second is to use the distances between the “C1” atoms of each chromophore as described previously64 without a correction factor. A more sophisticated approach, which assumes only that Förster theory is sufficiently accurate, can also be applied to the simulations including explicit donor and acceptor chromophores64, 72–74. In this case, the transfer rate kET(x) for configuration x in the simulation trajectory is given by
where kD is the donor fluorescence decay rate in the absence of an acceptor, and the orientational factor κ is given by
where D and A are unit vectors in the direction of the donor and acceptor transition dipoles, respectively, and is a unit vector pointing between donor and acceptor. We assume that the donor and acceptor transition dipole moments are approximately aligned with the long axis of each chromophore system (defined by the vectors between atoms C11 and C12 within each chromophore), and the distance between the chromophores is taken to be that between the C1 atoms of each chromophore64. The decay in donor fluorescence intensity is evaluated by calculating the survival probability of the excited state with a fluctuating transfer rate, averaged over all possible time origins, t0, along a simulation trajectory:
The average FRET efficiency was obtained by integration of the intensity decay (or lifetime distribution)
where the maximum integration time tmax was chosen as 20 ns, by which time the fluorescence had essentially decayed to zero for kD = 0.238 ns−1.
To sample the configurations of the intrinsically disordered protein ACTR, we ran multiple unbiased, 2 µs-long, equilibrium MD simulations in explicit water and different denaturant concentrations (Table S1). Such extensive trajectories, while still posing a challenge for the large systems considered, are the minimum necessary to obtain a representative sampling, given that the experimental reconfiguration times of unfolded and disordered proteins are typically of the order of 0.05–0.1 µs17, 75–77. We use force field models for protein, urea and water which we have recently parameterized to reproduce the balance of interactions between the protein, water, and denaturant components of the system44, 55. We note that using such a force field is essential, because recent work has shown that most existing force fields result in too collapsed conformations of proteins even in the absence of denaturant78–79, with several suggested corrections proposed55, 80–81. This would confound any attempt at quantitative comparison with experiment55. Although we consider the effects of both urea and GdmCl, in the interest of brevity, we describe only the results for urea in the main text (see Supporting Information for GdmCl).
In Figure 1a, we show the fluctuations in Rg (computed directly from the protein coordinates) over the course of representative simulations at each denaturant concentration. The relatively long time scale of fluctuations necessitates sampling on the microsecond time scale. In Figure 1b, we show the autocorrelation function for the radius of gyration, which yield correlation times ranging from around 40 to 140 ns, comparable to those measured in earlier experiments on other proteins76–77. Even though the distributions of Rg are very broad, there is nonetheless a clear increase in its average value as a function of denaturant concentration, illustrated in Figure 1c, as well as in the average distance between the residues labeled with chromophores (Figure 1d). The swelling of the chain is also reflected in an increase in the scaling exponent with denaturant concentration. We have characterized this by means of a power law fit of the dependence of the root-mean-square (RMS) inter-residue distance between pairs of residues on the sequence separation of those residues, Figure 1d71. In addition, to obtain better averaging, we have also computed the RMS Rg of the chain segment included between pairs of residues on their sequence separation, Figure 1c. The fits, summarized in Table 1, show an increase in scaling exponent with denaturant concentration, from approximately 0.55 in the absence of denaturant, to a value slightly larger than 0.6 in high denaturant. These exponents are comparable, respectively, to the trend obtained from single-molecule FRET experiments in low and high denaturant, which show a transition from near θ-solvent conditions in water to close to the excluded volume limit in denaturant35. The finding of near theta-solvent conditions in water is also consistent with the fractal dimension from SAXS experiments on reduced RNase A in water82, while the exponent at high denaturant is in accord with the scaling inferred from a comprehensive small-angle X-ray scattering study of a wide range of sequence lengths in high GdmCl concentration83. Therefore, our results are consistent, qualitatively, with the expectations of polymer theory for the changes which occur when the solvent quality is improved12. In Figure S1, we show corresponding results for ACTR in GdmCl solutions. Note that although the chain collapses as denaturant is diluted, its most collapsed state, in water, still approximates a θ-state35. Thus, in analogy with protein folding84–85, ACTR does not form a fully collapsed state prior to binding its cognate partner, NCBD. For an intrinsically disordered protein, maintaining an expanded state may have advantages for recognition of binding partners and for binding kinetics86–87.
The clear expansion of the chain in urea implies an improvement in solvent quality with increasing denaturant concentration (an alternative explanation for increased Rg might be an increase in chain stiffness, but that would not explain the increase of scaling exponent). The improved solvent quality could be thought of in terms of urea molecules “binding” to the protein, as previously inferred from experimental studies using NMR and X-ray scattering88; however, for such weak binding occurring at high denaturant concentration, it is critical to remove the contribution from denaturant molecules which happen to be near the protein but are not necessarily interacting. Therefore, in order to characterize in more detail the weak interactions between the protein chain and denaturant molecules, we use the formalism of preferential interaction coefficients. The preferential interaction coefficient ΓUP is defined experimentally as ΓUP = (mU/mP)μU, where mU and mP are the molalities of urea and protein, respectively89–90. That is, ΓUP measures how much urea must be added to keep the bulk urea chemical potential μU constant when a protein is added to the solution and is expected to be positive if urea interacts favourably with the protein, and vice versa. In simulations, the coefficient can be estimated very simply from the heuristic relation:89–92
In this equation, and are the number of urea and water molecules in a defined volume close to the protein, while and are the corresponding numbers in the bulk solution away from the protein, i.e. ΓUP is the average number of urea molecules in the volume near the protein, in excess of what would be expected based on the bulk solution composition. We define the volume near to the protein by using a simple cut-off of 0.7 nm between protein heavy atoms and the water oxygen or urea carbon, however the results are fairly insensitive to the choice of cutoff, as long as it is large enough (Figure S2). We can in addition write the total ΓUP as a sum over group contributions, by assigning the water and urea molecules in the protein domain to the group on the protein to which they are closest, corresponding to a Voronoi tessellation of the domain surrounding the protein.93–94 The groups we have chosen are the backbone and side-chain heavy atoms of each residue.
In Figure 2, we show the decomposition of the preferential interaction for backbone and side-chains for each residue type. We see that urea interacts favourably with the backbone of all residue types, although there is some residue to residue variation. While we have presented the average ΓUP for each residue type, we note that its value is relatively independent of the sequence context, with similar results being obtained for all residues of a given type (Figure S3). This supports one of the assumptions of the commonly used additive schemes for decomposing protein-denaturant interactions,1, 9, 95 for example the decomposition of protein folding m-values as a sum over independent contributions from different functional groups in the polypeptide chain.9 On the other hand, the association with glycine, which is often used as a model for the protein backbone in decomposition schemes, is notably higher than the average value (compare dashed and dotted lines in Fig. 2), which may lead such schemes to underestimate the contributions from side-chains. This may reflect some of the known limitations of assuming additivity in calculations of protein-solvent interactions,96 although we must also concede that our decomposition of space using a Voronoi scheme is certainly not unique. The side-chain contributions show that urea also interacts favorably with almost all side-chains, the only exceptions being the anionic aspartate and glutamate residues, consistent with an earlier study8. Our results thus suggest that both backbone and side-chains contribute comparable amounts to the favorable solvation of the unfolded state by urea solutions, in agreement with the results of other recent computational studies7, 92. Note that this does not mean they contribute equally to folding m-values, which measure how the difference between the folded and unfolded Δμtr changes with denaturant concentration. A calculation including the folded state (or at least, a fully collapsed state97) would be needed to evaluate the relative contribution of the backbone to folding m-values.98
Since experimental preferential interaction coefficients are not available for all residue types, we have compared our results with per-residue transfer free energies from water to 1 M urea95. We estimate transfer free energies Δμtr from preferential interaction coefficients using the approximate relation Δμtr ≈ −RTΓUP. This expression is valid at low denaturant concentrations and for ideal denaturant solutions90. A concentration of 1M is sufficiently low for the first assumption to be valid and urea solutions are known to be very close to ideal99 (also reflected in properties of the KBFF force field62). This expression also ignores any systematic changes in protein dimensions with denaturant concentration, which we have just shown to occur. However, given that the percentage increase in protein size is modest, we feel this is also a reasonable first approximation. The Pearson correlation coefficient of the transfer free energies between the simulation and experiment is 0.63 with a p-value of 0.01, suggesting the calculated values capture very well the overall magnitude and sign of the protein-urea interactions, and to a good extent the variation from residue to residue. We note that a direct comparison cannot be made with the charged residues, because their Δμtr refers to transfer free energies for their Na+ or Cl− salts, and so may include significant contributions from the transfer free energies of these ions. The total preferential interaction or transfer free energy depends both on the size of the residue, as well as on its chemical identity. We can approximately normalize for the contribution from size by dividing by the average Connolly solvent-accessible surface area, shown in Figure 2. After this correction for size, the interaction coefficients from simulation are quite similar for most of the residues, the remaining outliers being the charged residues, and glutamine.
We have analyzed further the mechanism of action of urea with both backbone and side-chains, starting with hydrogen bonding, which is the easiest type of interaction to single out (Figure 3). We find that for most residue types, the average number of hydrogen bonds between the backbone and urea at 1 M denaturant is very close to the number of excess urea molecules, relative to bulk (given by the preferential interaction coefficient). This strongly suggests that hydrogen bonding is the main mode of interaction with the backbone, with many of the side-chains also making hydrogen-bonded interactions with urea. In addition, however, it is clear that the hydrophobic side-chains interact favorably without forming any hydrogen bonds. An apparently more puzzling result is the negative preferential interaction of some of the charged side-chains with urea, despite the number of hydrogen bonds to water being similar for analogous charged and uncharged side-chains (e.g. Asp and Asn). The most likely explanation is an enhanced local water density in the vicinity of the ionic side-chain, such that the average number of urea molecules per water molecule is still lower than in bulk. A high local density of water dipoles helps to solvate the charged side-chains, and indeed we observe a very large first peak in the water g(r) around Asp, relative to Asn (Figure S4): This higher water density is a manifestation of the well-known electrostriction effect of ions. Since there are no residues with aromatic side chains in ACTR, yet these usually have the most favorable water-urea transfer free energies in experiment95, we have calculated the preferential interaction coefficients of the unfolded Trp cage mini protein using published simulations with the same force field in 3M urea44 (Figure S5). We find qualitatively that Trp and Tyr have much larger preferential interaction coefficients and therefore more favorable transfer free energy in Trp cage, consistent with experiment. Based on this, one would expect denaturant to lead to a larger expansion in denaturant for sequences containing also aromatic residues.
Although in the main text we focus on urea, in Figure S6, we show the corresponding results for 1 M GdmCl. Similar to urea, we find that both backbone and side-chains make comparable contributions to transfer free energies: while only the Gdm+ ions have a significant preferential interaction with the backbone, both Gdm+ and Cl− ions associate favorably with side-chains in simulation (Figure S7). The major difference is the stronger interaction of the cationic guanidinium ion with the anionic residues. Quantitatively, the magnitude of the interaction coefficients and transfer free energies for GdmCl are about twice the values for urea, consistent with the stronger effect of this denaturant, as well as with experimental transfer free energies reported by Nozaki and Tanford100. The denaturant-dependence of protein stability (m-value) is also usually a factor of ~2 larger in GdmCl than in urea, consistent with the dominant role of the unfolded state in determining m97, although the native state must also contribute98. The backbone preferential coefficient is still comparable to the number of hydrogen bonds formed per residue, similar to the urea case (Figure S8), indicating that, according to our simulations, both Gdm+ and urea interact by hydrogen bonding with the backbone. However, the type of hydrogen bonds formed is different, with Gdm+ hydrogen bonding exclusively to the CO group of the amide bond (as may be expected from its lack of hydrogen bond acceptors), and urea to both the NH and CO groups (Figure S9). We can compare these results with earlier hydrogen exchange experiments in the presence of urea or GdmCl101. Base-catalyzed exchange was found to be blocked by urea and unaffected by GdmCl, which is expected as the base would attack the NH, which can only be blocked by urea hydrogen bonding. The results for acid-catalyzed exchange indicate that urea accelerates exchange while GdmCl has little effect, which was interpreted to mean that Gdm+ does not hydrogen bond to the CO group either101, in contrast to what we find. However, a definitive conclusion based on experiment would require a quantitative model for the expected effect of the denaturant on the rate of acid catalysis, which has a more complex mechanism than base catalysis102. Overall, our analysis suggests that the effect of both urea and GdmCl can be explained in terms of preferential solvent partitioning, which essentially describes a weak binding of the denaturant to the protein1, the model favored by most recent studies4–8, 49–50, 52, 103, and consistent with our results.
The stronger interactions of the protein with the solvent imply relatively weaker protein-protein interactions, which should disrupt any local structure (native or non-native) formed at low denaturant concentration. ACTR is known to be quite unstructured in water, however it does have some residual helical structure which is lost at high urea concentration, as probed by ultraviolet circular dichroism (CD), as well as nuclear magnetic resonance spectroscopy56, 104. We have computed average helix fraction as a function of denaturation concentration, and while the data exhibit considerable noise, there does appear to be a modest decrease in helix fraction with increasing denaturant concentration, in good agreement with the helix fraction inferred from CD, considering the statistical error in the simulation (Figure S10), and in accord with the finding that proteins populate more extended structure in denaturant than in water.105 This loss of helix when the protein expands at higher denaturant concentration is in contrast with the situation when the temperature is raised, which causes ACTR to collapse (due to strengthened hydrophobic effect106), but also an apparent reduction of helix content104, as has also been observed in all-atom simulations of unfolded proteins107. Thus there is not a simple connection between the collapse and the formation of helical structure, and collapse can be driven by the different types of interaction, depending on the conditions.
To validate the results of our simulations, we compare the raw experimental data70 with that calculated from the all-atom simulations. In Figure 4, we show the mean FRET efficiency computed from the simulations using the Förster equation (Equation 3) as a function of urea concentration, together with the experimental results, for three different pairs of residues labeled with FRET donor and acceptor chromophores. There is naturally a considerable statistical uncertainty in our estimates, given the quantity of data available. Especially at high denaturant concentration, there is a deviation of FRET efficiency between the simulation and experiment. This is probably due to the limited box size affecting end-end distance of more expanded configuration and significantly lower viscosity of the solution at high denaturant concentration. We note that the simulations with a larger solvent box do agree better with experiment at high urea concentration. This may reflect an absence of interactions with the periodic image, but with the caveat that the simulations with the larger boxes are only 0.6 µs versus 2 µs for the small box simulations. Even with the deviation at high urea concentration, the Pearson correlation coefficient of the FRET efficiencies between the simulation and experiment is 0.91 with a p-value in the order of 10−7, suggesting the agreement between experiment and simulation is overall quite good. In Table 2, an all-vs-all comparison of simulation and experimental efficiencies at different concentrations shows that the best agreement is obtained when the concentrations in simulation and experiment are the same, or nearly the same, implying that the expansion we observed in simulation is also present in experiment. We note that, since the chromophores were not explicitly present in the initial set of simulations, we have accounted for the effects of the protein-chromophore linkers by scaling the separation between the Cα of the labeled residues (Equation 5). In addition, we assume that the efficiency is determined only by the donor-acceptor distance and that the FRET orientational factor κ2 = 2/3108. We will revisit and justify both of these assumptions in the next section.
We have also computed SAXS scattering profiles I(q) using the all-atom coordinates via an established procedure65. To do this, we compute the atom-atom pair distance distribution functions (PDDFs) within a spherical volume around the protein centre of mass, whose summed Fourier transforms yield the scattering intensities. The background is computed from a large simulation box of denaturant solution with a similar concentration, as in the experiment. Contrast matching is performed by comparing the average electron density in a shell outside of the primary sphere in the protein simulation with the electron density in the reference (background) simulation. This calculation, therefore, exactly mimics the experiment, and includes any possible contributions due to cooperative solvent structuring around the protein. We found that the essential parameters in this calculation are the radius of the primary sphere, and the thickness of the surrounding solvent shell used for contrast matching. As we discuss in the supporting text, and show in Figure S11, choosing a radius for the primary sphere which does not completely contain the vast majority (i.e. ~99%) of the disordered protein configurations, distorts the results, for example giving an underestimation of the simulated Rg based on a Guinier approximation. A second requirement is that the solvent shell for contrast matching must be thick enough to be representative of the solvent background. These requirements led us to adjust the system size for the different denaturant concentrations according to the protein Rg, as shown in Table S1. Note that our observed variation in Rg is not an artefact of confinement due to the smaller system sizes used at low denaturant concentration. Within statistical error, we obtain the same radii of gyration when using the same system size for all systems (see Figure 1), the larger system size at high denaturant concentrations only being required for the explicit SAXS calculation. The computed scattering profiles, I(q), are shown in Figure 4 for different denaturant concentrations together with the experimental data. Although the curves at different denaturant concentrations all appear superficially very similar, we find that the simulations capture the subtle differences between them. An all against all comparison of the simulated curves at different denaturant concentrations with the experimental curves at different concentrations shows that, in most cases, the best agreement of the experimental data with simulation (assessed by the reduced χ2 parameter) occurs when the denaturant concentrations in experiment and simulation are the same (Table 2), so that again the expansion of the chain seen in simulation is consistent with experiment.
For FRET to yield an accurate estimate of molecular size it is important that the chromophores do not substantially affect the radius of gyration, or its denaturant dependence – it has been implied that the chromophore labels may somehow influence the denaturant dependent collapse31–32. In the results described so far, we have used the same simulations for both FRET and SAXS calculations in order that the results be as comparable as possible. We have also tested the assumption that chromophores should not noticeably perturb the protein by performing simulations of ACTR in urea with explicit chromophores at two different denaturant concentrations, 1 M and 5 M. The force field for the chromophores has been found to reproduce fairly well a battery of experimental data on the interaction of chromophores with zwitterionic tryptophan and on chromophores attached to proteins and peptides64. An initial comparison of the radius of gyration shows that at both 1 M and 5 M urea, the Rg is slightly smaller in the simulations with labeled protein than with unlabeled, although at 5 M the difference is well within the statistical error bars (Figure 5(a)). It is clear, however, that Rg increases with denaturant concentration, both for the labeled and unlabeled systems. In a previous study109, the protein Rg was shown to be insensitive to whether the protein was labeled or not. However that study used a force field (Amber ff03w110) in which the unfolded structure was already somewhat collapsed. Here we obtain the same conclusion, although using an improved force field which reproduces the correct dimensions of the unfolded configurations, and we still find little effect of the labels, strengthening the earlier conclusion.
The results of a simple average FRET calculation using the distance between the chromophores directly rather than an approximate distance based on the separation of Cα atoms are included in Figure 5(b) (details in Methods section), showing very similar results. A second assumption in interpreting the FRET data is that the chromophores reorient rapidly on the time scale of the donor lifetime, so that only an average effect of the relative chromophore orientation factor, κ2, needs to be considered, i.e. κ2 = 2/3. Since, in the simulations with explicit chromophores, we have a complete, unbiased, trajectory of the chromophore positions, we can calculate directly the time-dependent rate coefficient for resonance energy transfer, the decay of the donor fluorescence intensity and consequently the FRET efficiency. Thus, the only remaining assumptions we make are those included in Förster’s original theory (e.g. that the transition densities can be approximated as point dipoles). The donor fluorescence decay is shown in Figure S12, and the transfer efficiencies in Figure 5(b). The consistency of the different calculations provides strong support both for the simple distance-based FRET estimate as well as for the assumption of κ2 = 2/3. Indeed, the equilibrium average κ2 computed from the simulations is very close to the expected value of 2/3 for an isotropic distribution of chromophore orientations (Figure 5(d)), as seen in an earlier study109. The reason for the validity of this assumption is that at least one of the chromophores in each case is reorienting rapidly on the scale of the donor lifetime, with rotational correlation times of ~1 ns, with a similar correlation time for κ2 itself (Figure S12), compared with donor lifetimes of ~2 ns for molecules labeled with both donor and acceptor under the denaturant conditions used. Even though the correlation time for Alexa 488 reorientation is substantially longer than this in the 1 M urea simulation due to formation of stable contacts with the protein, the free rotation of the Alexa 594 ensures a short correlation time for the overall κ2. The differences between Alexa 488 and Alexa 594 may relate to differences between the chromophores themselves, to the labeling position (N or C terminal), and to the limited sampling in the simulation – these effects would have to be investigated in future work. Although the average κ2 at 1 M is slightly less than 2/3, which would tend to increase the apparent efficiency, the consistency of the full calculation including the relative orientation of the dyes with that based only on distance (Figure 5(b)) indicates that most of the variation in efficiency with denaturant concentration comes from changes in the distance distribution.
A second issue in interpreting FRET experiments is that the distance probed by FRET is that between the chromophores, which are usually attached to the protein by long flexible linkers in order to allow the chromophores to reorient freely. Thus a transformation needs to be made to convert the mean square distance between the chromophores to a distance between protein residues, which is the quantity of interest. One procedure for doing this is to rescale the observed distance Robs by assuming that the linkers effectively add a certain number of extra residues to the length of the chain, so that the distance between protein residues is R = (N/(N + Nlinker))νRobs where the number of extra residues Nlinker has been chosen to be around 9 from the literature35 and ν is the polymer scaling exponent (from Table 1). Since in the simulations with attached chromophores, we can measure both distances, we determine Nlinker by minimizing the difference between the average FRET efficiency computed using the distance between chromophores, and that computed using the distance between residues with the Nlinker-dependent correction. The χ2 between these two estimates is shown in Figure 5(c), yielding a minimum at Nlinker ≈ 10 residues, very close to the value of 9 estimated from experiment.
In the above analysis, we have computed SAXS scattering intensities using an all-atom representation, including all solvent molecules65, 111. This is the gold standard, and could be important if there were significant solvent structure around the protein which could even affect the measured radius of gyration if, e.g., the solvent specifically partitioned toward the center of the coil rather than being uniformly distributed along its length. Whether such solvent structuring is significant can be elucidated via a straightforward test: comparison of the scattering curves from the all-atom calculations with those from an implicit uniform model for the surface solvent. Using the same protein configurations as for the atomistic SAXS calculation, we have computed scattering profiles using the programs CRYSOL41 and FOXS66.
Note that CRYSOL includes parameters describing the average thickness and background contrast of the solvation layer around the protein which are optimized for folded proteins in water to 0.3 nm and ~10% of the bulk density, respectively, while FOXS is also optimized for aqueous solvent. For non-compact unfolded conformations, errors arising from this assumption are only expected to affect I(q) at larger scattering angles, provided that the solvation layer is strongly correlated with the chain locus. In Figure 6 we show the comparison between the explicit solvent calculation of I(q) and a CRYSOL calculation, in which the background electron density is taken from the all-atom simulations at each urea concentration, and the default hydration shell parameter is used. As is evident, the continuum approximation is very good for q < 0.3 Å−1, and excellent for q < 0.04 Å−1, which includes the Guinier region used to determine the Rg in experiments. The quality of this agreement is not very sensitive to the solvent model used: we have tried alternative procedures of not adjusting the background electron density (Table S5, Figure S13), and of leaving out the hydration shell altogether (Figure S14). Although the latter of course leads to larger deviations for q > 0.08 Å−1, the difference of SAXS intensity from the explicit solvent calculation is still within 3% in the Guinier region. We have also computed scattering intensities using a different program, FOXS66. in which the surface solvent is modeled by adjusting the atomic form factor of the solvent-exposed atoms. With default parameters originally optimized for folded proteins in water, we again obtain a good agreement with the SAXS intensity computed from explicit solvent calculations in Guinier region (Figure S15, Table S5). All of these results suggest that a precise description of the hydration shell is not necessary to estimate the Rg of unfolded proteins and that there is no cooperative solvent structuring around the protein, beyond the first solvation layer. As a final verification of this point, we show in Figure 1 the Rg estimated from Guinier fits to the I(q) in Figure 6, demonstrating that the result is almost identical to that obtained using explicit solvent coordinates. Guinier fits to the other scattering calculations with implicit solvent also yield the same results, after considering the statistical error due to finite sampling in MD. Therefore, the Rg inferred from the Guinier fit accurately reflects the expansion of the chain as urea concentration increases (Figure 1(c)). To provide some intuitive understanding of this observation, we have calculated the average number of urea molecules within 4.5 Å of each residue, showing that the distribution of urea within an approximate first solvation shell is uniform along the sequence (Figure S16). These results effectively rule out the possibility that effects such as preferential partitioning of the solvent towards the centre of the coil could distort the Rg inferred from SAXS.
We have used unbiased microsecond atomistic simulations with a force field carefully calibrated against small-molecule solubility data to investigate the effect of denaturants on an intrinsically disordered protein. We find that increasing only the denaturant concentration causes an increase of radius of gyration, end-to-end distance, and polymer scaling exponent. We further show that the molecular origin of the expansion is preferential association of denaturant molecules with the chain. Careful analysis of the interactions between the protein and urea yields transfer free energies from water into denaturant solution in good accord with experiment. With the new force field we achieve a good match with experimental transfer free energies, as well as with the SAXS and FRET data for ACTR, which is essential for a quantitative understanding of the underlying mechanism, We find that almost all residues have a favorable transfer free energy from water to 1 M urea, the only exceptions being the small anionic residues Asp and Glu, for which water is a better solvent. A more detailed breakdown indicates that the backbone and side-chains make similar contributions to the overall transfer free energy. Interactions with the backbone appear to be dominated by hydrogen bonding, whilst other types of interaction (e.g. hydrophobic interactions) are clearly also important for side-chains. The small amount of helical secondary structure present is progressively lost with increasing denaturant concentration.
The observed chain expansion is validated by comparison with experimental FRET efficiencies and SAXS scattering intensities, where quantitative agreement is obtained – emerging only from the basic intermolecular interactions captured by the force field. Thus, at least for the intrinsically disordered protein ACTR, which we study here, all of the experimental data is consistent with a scenario in which the protein expands, and with the current understanding of denaturation mechanism, mediated by protein-denaturant binding. We have investigated potential molecular-scale artefacts which have been suggested to explain the discrepancies between experiments. First, we verify the accuracy of assuming that solvent distribution has little impact on the radius of gyration of the protein obtained from SAXS. Second, for FRET we show that the chain collapse is not induced by the FRET labels. Since there is no fundamental inconsistency between the two experiments, and in the absence of the above artefacts, the experimental discrepancy most likely relates to the challenging inverse problem of determining properties of IDPs from limited experimental data.70
Overall, our results highlight the potential of unbiased atomistic simulations for providing a molecular interpretation for complex experimental data. The good agreement between our simulation results and the properties of ACTR in both water in and denaturant suggests that the force fields used are reaching the point of being a useful tool for the investigation of intrinsically disordered proteins.
We would like to thank Jürgen Köfinger and Gerhard Hummer for providing their software package to perform the all-atom SAXS intensity calculation. R.B. and W.Z. were supported by the intramural research program of the National Institute of Diabetes and Digestive and Kidney Diseases of the National Institutes of Health. This work utilized the computational resources of the NIH HPC Biowulf cluster. (http://hpc.nih.gov)