Aggregation of expanded polyglutamine tracts is associated with nine different neurodegenerative diseases, including Huntington’s disease. Experiments and computer simulations have demonstrated that monomeric forms of polyglutamine molecules sample heterogeneous sets of collapsed structures in water. The current work focuses on a mechanistic characterization of polyglutamine homodimerization as a function of chain length and temperature. These studies were carried out using molecular simulations based on a recently developed continuum solvation model that was designed for studying conformational and binding equilibria of intrinsically disordered molecules such as polyglutamine systems. The main results are as follows: Polyglutamine molecules form disordered, collapsed globules in aqueous solution. These molecules spontaneously associate at conditions approaching those of typical in vitro experiments for chains of length N ≥ 15. The spontaneity of these homotypic associations increases with increasing chain length and decreases with increasing temperature. Similar and generic driving forces govern both collapse and spontaneous homodimerization of polyglutamine in aqueous milieus. Collapse and dimerization maximize self-interactions and reduce the interface between polyglutamine molecules and the surrounding solvent. Other than these generic considerations, there do not appear to be any specific structural requirements for either chain collapse or chain dimerization, i.e., both collapse and dimerization are non-specific in that disordered globules form disordered dimers. In fact, it is shown that the driving force for intermolecular associations is governed by spontaneous conformational fluctuations within monomeric polyglutamine. These results suggest that polyglutamine aggregation is unlikely to follow a homogeneous nucleation mechanism with the monomer as the critical nucleus. 
Instead, the results support the formation of disordered, non beta-sheet-like soluble molten oligomers as early intermediates – a proposal that is congruent with recent experimental data.
Expanded polyglutamine tracts in unrelated proteins are responsible for nine different neurodegenerative diseases.1 These include Huntington’s disease (HD),2 spinal and bulbar muscular atrophy, dentatorubral pallidoluysian atrophy,3 and the spinocerebellar ataxias of types 1, 2, 3, 6, 7, and 17.4 Each disease is associated with the deposition of neuronal intranuclear inclusions rich in aggregated polyglutamine.5 Ages of onset of these diseases are inversely correlated with the lengths of polyglutamine repeats.5 In HD, symptoms appear if the lengths of polyglutamine tracts exceed a threshold of 35–40 residues.6; 7; 8; 9
Proteins with expanded polyglutamine repeats have increased susceptibility to proteolysis.10; 11; 12; 13; 14; 15; 16; 17 Products of proteolysis are rich in polyglutamine and these fragments can recruit glutamine-rich domains from other proteins and sequester them into growing aggregates.18; 19; 20; 21; 22; 23 Recruitment and sequestration can be toxic because it deprives cells of crucial proteins.24 It can also be a source of chronic stress,25 which in turn leads to an acceleration of the aging program by compromising protein homeostasis.26; 27; 28; 29; 30; 31; 32; 33 The length dependence of intrinsic rates and spontaneity of polyglutamine aggregation are of direct relevance to the recruitment and sequestration hypothesis for toxicity.34; 35; 36; 37
In vitro biophysical studies have yielded important insights regarding conformational equilibria and aggregation of various polyglutamine constructs.13; 23; 38; 39; 40; 41; 42; 43; 44; 45 The main results are as follows: Monomeric polyglutamine-rich constructs are intrinsically disordered and this holds true irrespective of polyglutamine length.23; 46; 47; 48 The spontaneity and overall rate of aggregation increase with polyglutamine length.23; 41; 45 Synthetic peptides rich in polyglutamine form large aggregates whose morphological and dye-binding characteristics mark them as amyloid-like.13; 40; 49
To obtain a definitive biophysical explanation for the observed length dependence of polyglutamine aggregation we need answers to two specific questions: First, why should polyglutamine molecules, which are polar, aggregate and/or be insoluble in aqueous milieus, i.e., what are the driving forces for polyglutamine aggregation?50; 51; 52; 53; 54; 55 Second, what are the mechanisms by which polyglutamine molecules self-associate to form aggregates?41 Recent work from our lab has focused on answering the first question.52; 53; 54; 55 Here, we turn our attention to seek answers to the second question.
We proposed a connection between the phenomenology of polyglutamine aggregation and the well-established field of conformational and phase equilibria of synthetic homopolymers.53; 56 The relevant concepts have been reviewed recently53 and originate in the seminal works of Flory,57; 58 Huggins,59; 60 and others.61; 62; 63; 64; 65; 66; 67; 68 For homopolymers there is a clear driving force for aggregation if polymers are in milieus that are so-called poor solvents. In such environments, polymers form homogeneously mixed solutions of isolated globules under dilute solution conditions. As concentration increases, we enter the two-phase regime where there is a clear driving force for phase separation/aggregation. In the single molecule limit, intra-chain interactions are preferred to chain-solvent interactions for chains in a poor solvent, and chain sizes measured using radii of gyration (Rg) or hydrodynamic radii (Rh) scale as N^(1/3) with chain length N.69; 70; 71 As a result, the conformational ensemble in dilute solutions comprises compact, roughly spherical conformations, which minimize the interface between chain molecules and the surrounding solvent. In poor solvents, the stabilities of collapsed structures and the spontaneities of homotypic intermolecular associations will increase with chain length.53; 61; 62; 66; 67; 68; 72; 73; 74; 75
Recent computational and fluorescence correlation spectroscopy studies helped establish that aqueous milieus at ca. 25°C are poor solvents for polyglutamine.52; 54; 55 Crick et al.52 quantified the scaling of Rh for monomeric polyglutamine in aqueous solutions and showed that chain size scales with chain length as N^(1/3). Therefore, in aqueous milieus, individual polyglutamine molecules prefer collapsed structures that minimize interactions with the surrounding solvent. The demonstration that water is a poor solvent for polyglutamine identifies a generic driving force for the aggregation of this system of molecules.68
With knowledge of the driving force for aggregation, we turn to the question of whether polyglutamine aggregation follows homogeneous nucleation i.e., is the formation of a specific, high energy, conformational species or a specific cluster of molecules a pre-requisite for aggregation?41; 76; 77; 78 The number of chain molecules within an aggregate can vary and the smallest aggregate is a dimer. If the mechanism of aggregation strictly follows homogeneous nucleation and dimerization is spontaneous, then the monomeric form of polyglutamine is the critical nucleus.41 In this scenario, the nucleus would be a specific, thermodynamically unfavorable, conformational species of the monomeric form. Alternatively, the nucleus size could be greater than or equal to two.76 In this case dimerization along the productive aggregation pathway would not be spontaneous. Interpretations of experimental data lead to diametrically opposite views for the mechanism of polyglutamine aggregation.41; 42 Furthermore, theoretical work suggests that polymer aggregation in poor solvents does not follow homogeneous nucleation.68 To adjudicate between conflicting suggestions, we need to interrogate intermolecular associations and conformational preferences realized at low concentrations and low copy numbers. This limit is inaccessible to most conventional assays for aggregation and hence we turn to computer simulations.
We present results from atomistic simulations of conformational equilibria and the monomer-dimer equilibrium for polyglutamine as a function of chain length and temperature. All of our simulations were carried out using the ABSINTH continuum solvation model, which was developed with a focus on simulating conformational equilibria and reversible associations of intrinsically disordered systems.79 Details of the ABSINTH model and results from extensive validation of the Metropolis Monte Carlo simulation engine that uses the solvation model have been published recently.79 Only the details relevant to the simulations of the polyglutamine system are presented in the methods section. Within the ABSINTH framework, conformational equilibria of various polypeptide systems change as a function of temperature, although the parameters of the model do not have explicit temperature dependencies. The ability to alter conformational equilibria using temperature as a control parameter allows us to view temperature as a modulator of solvent quality.
The remainder of the text is organized as follows: We present results from quantitative studies of the length dependence of coil-to-globule transitions for monomeric polyglutamine molecules. This is followed by quantification of the length and temperature dependence of monomer-dimer equilibria. We then focus our analysis on the correlation between collapse and dimerization as well as the driving forces and conformational requirements for both processes. In the discussion section we summarize our results and place our findings in the context of the existing body of experimental data and proposals for mechanisms of polyglutamine aggregation.
We performed simulations for monomeric polyglutamine as a function of temperature and chain length. The peptide constructs in all of our simulations were of the form Ac-(Gln)N-Nme. Here, Ac refers to the acetyl group and Nme denotes N-methylamide. For brevity, the sequences with N glutamine residues will hereafter be referred to as QN. Flexible polymers undergo coil-to-globule transitions that are akin to second order phase transitions and characterized by the existence of a “tri-critical” θ-point.69; 70; 71; 80; 81 At the θ-temperature (T = Tθ), Rg is proportional to N^0.5. Therefore, plots of the ratio ζ = Rg/N^0.5 as a function of temperature for different chain lengths should intersect at T = Tθ. Theory also predicts that for T > Tθ, the ratio ζ increases with increasing N, whereas for T < Tθ this ratio decreases as N increases. These predictions are consistent with the fact that for T > Tθ, chain-solvent interactions are preferred in a so-called good solvent, the coil state is favored, and Rg scales as N^0.59. Conversely, for T < Tθ, the chain collapses to minimize contacts with the poor solvent and Rg scales as N^0.33. Finally, as chain lengths increase, the sharpness of coil-to-globule transitions should increase, and the width of the transition region should decrease.
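The scaling predictions above can be illustrated with a short sketch. It takes ζ to be Rg/N^0.5, which is an assumption consistent with the θ-point scaling stated in the text, and it uses synthetic radii with an arbitrary prefactor of 2.0 rather than any simulation data. The function names are hypothetical.

```python
import math

def fit_scaling_exponent(chain_lengths, radii):
    """Least-squares slope of log(Rg) versus log(N), i.e. nu in Rg ~ N**nu."""
    xs = [math.log(n) for n in chain_lengths]
    ys = [math.log(r) for r in radii]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

def zeta(rg, n):
    """ASSUMED size ratio Rg / N**0.5; it is independent of N only at the theta point."""
    return rg / math.sqrt(n)

# Synthetic radii (arbitrary prefactor), NOT data from the simulations.
lengths = [15, 30, 45]
globule = [2.0 * n ** (1.0 / 3.0) for n in lengths]  # poor solvent, T < T_theta
theta = [2.0 * n ** 0.5 for n in lengths]            # theta point
coil = [2.0 * n ** 0.59 for n in lengths]            # good solvent, T > T_theta

zeta_globule = [zeta(r, n) for r, n in zip(globule, lengths)]
zeta_theta = [zeta(r, n) for r, n in zip(theta, lengths)]
zeta_coil = [zeta(r, n) for r, n in zip(coil, lengths)]
```

With these inputs, ζ falls with N for the globule, is constant at the θ-point, and rises with N for the coil, which is why curves for different chain lengths are expected to cross at Tθ.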
In Figure 1 we plot the variation of ζ as a function of simulation temperature for Q5, Q15, Q30, and Q45. We find that the coil-to-globule transition is ill-defined for the Q5 peptide and this is consistent with the concept of “blobs”. Within a blob, the balance of chain-chain, chain-solvent, and relevant solvent-solvent interactions is smaller than kBT. Here, T is temperature, and kB is Boltzmann’s constant. If there are n residues in a blob, then the radius of gyration of the blob scales as n^(1/2) and this scaling holds irrespective of solvent quality or temperature. From Figure 1 it is clear that Q5 is essentially a blob-sized peptide because ζ does not change significantly with temperature.
From Figure 1 we see that as chain length increases, the sharpness of the coil-to-globule transition increases and the width of the transition region decreases. The curves intersect at a common temperature of T ≈ 410K. For temperatures that lie outside the transition region, 360K ≤ T ≤ 430K, ζ decreases with increasing N in the globule limit (T < 360K) and it increases with increasing N in the coil limit (T > 420K). All of these observations are consistent with the expectations listed above from the physics of generic, linear, flexible homopolymers. Such systems collapse in poor solvents in order to sequester themselves from unfavorable interactions with the surrounding milieu. In our calculations, T < 360K corresponds to poor solvent conditions.
The results shown in Figure 1 suggest that T ≈ 410K is a reasonable estimate for Tθ. We test this proposal in Figure 2 where we plot the scaling of ensemble-averaged internal distances as a function of separation in linear sequence. At the θ-point (T = Tθ), ensemble averages of inter-residue distances Rij scale as |j−i|^0.5. Conversely, for T < Tθ, especially if T is outside the transition region, Rij for a range of sequence separations should plateau to a constant value proportional to the density of globules adopted in poor solvents.80; 81 Ensemble averaged internal distances Rij were calculated as shown below:
$$R_{ij} = \left\langle \frac{1}{n_{ij}} \sum_{k \in i} \sum_{l \in j} \left| \mathbf{r}_k - \mathbf{r}_l \right| \right\rangle \qquad (1)$$
Here, r_k and r_l denote the position vectors of atoms k and l, which are part of residues i and j, respectively; nij denotes the number of unique pairwise distances between residues i and j; and the angular brackets denote an average over all of our simulation data for the system in question at temperature T.
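The averaging described above can be sketched in a few lines. This is a minimal illustration and not the analysis code used for the simulations: the data layout (a list of snapshots, each a list of residues, each a list of atom coordinates) and the function names are assumptions, as is the grouping of residue pairs by sequence separation |j−i| to produce a profile like those plotted in Figure 2.

```python
import math
from collections import defaultdict

def residue_pair_distance(atoms_i, atoms_j):
    """Mean of all inter-atomic distances between two residues in one snapshot."""
    total = 0.0
    for a in atoms_i:
        for b in atoms_j:
            total += math.dist(a, b)
    # n_ij is the number of unique atom pairs between the two residues.
    return total / (len(atoms_i) * len(atoms_j))

def internal_distance_profile(snapshots):
    """Average R_ij over snapshots, grouped by sequence separation |j - i|.

    snapshots: list of conformations; each conformation is a list of residues;
    each residue is a list of (x, y, z) atom positions (hypothetical layout).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for residues in snapshots:
        for i in range(len(residues)):
            for j in range(i + 1, len(residues)):
                sums[j - i] += residue_pair_distance(residues[i], residues[j])
                counts[j - i] += 1
    return {sep: sums[sep] / counts[sep] for sep in sorted(sums)}

# Toy input: a straight rod of four single-atom "residues" spaced 3.8 apart.
rod = [[(3.8 * i, 0.0, 0.0)] for i in range(4)]
profile = internal_distance_profile([rod, rod])
```

For a rigid rod the profile grows linearly with separation; a globule would instead show the plateau described in the text, and a chain at the θ-point would follow |j−i|^0.5.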
Figure 2 shows the variation of Rij with sequence spacing |j−i| for different chain lengths and temperatures. In panel (a) we see that Rij increases systematically with sequence separation for Q5 and this holds true irrespective of the simulation temperature. For longer chains and T < 360K, Rij plateaus to fixed values for a range of sequence separations. This temperature regime mimics poor solvent conditions where collapsed states are preferred for monomeric polyglutamine. For T > 360K, the data in panels (b)–(d) show that Rij increases systematically with sequence separation and, as T approaches 410K, Rij scales as |j−i|^0.5. This is demonstrated by favorable comparison of data at T = 410K for Q15, Q30, and Q45 to dashed curves in panels (b)–(d) that plot Rij as Ro|j−i|^0.5, where Ro = 5.7Å is the value of Rij for |j−i| = 1, for all chain lengths and temperatures.
Therefore, for the force field used in this work, T = 410K is a reasonable estimate for the θ-temperature for polyglutamine in aqueous solutions. At this temperature, polyglutamine molecules, specifically the longer chains (N ≥ 15), are indifferent with regard to their preference for chain-chain versus chain-solvent interactions. The driving forces for intermolecular associations should be prominent below the θ-temperature. For T > Tθ, or even for temperatures that are in the immediate vicinity of Tθ, there is no a priori reason to expect favorable intermolecular associations because chain-solvent interactions are favored over chain-chain interactions. We designate the θ-temperature as Tθ, the collapse temperature T = 360K as Tc, and the temperature T = 420K as Ts because swollen, random coil states are preferred for temperatures greater than Ts.
We used replica exchange Metropolis Monte Carlo sampling to simulate homotypic associations of polyglutamine as a function of chain length and temperature. In addition to the monomer degrees of freedom, i.e., backbone and sidechain torsion angles, rigid body degrees of freedom for each molecule were sampled. Details are presented in the methods section.
Figure 3 shows temperature dependent cumulative distribution functions F(R) of intermolecular distances for pairs of Q5, Q15, Q30, and Q45 molecules. For a given pair of molecules, F(R) is an estimate of the probability that the average intermolecular separation is less than or equal to R. For Q5 the cumulative distribution functions are essentially independent of temperature. The likelihood of realizing a specific value of R increases with distance, suggesting that these molecules prefer to diffuse freely about each other. The conclusion is that both conformational equilibria and intermolecular associations for short glutamine-rich peptides are consistent with the behavior of short polar amides in water.
From the analysis of coil-to-globule transitions shown in Figure 1 we know that longer chains form more stable globules for T < Tc=360K. This chain length dependent drive for intramolecular phase separation has consequences for the spontaneity of intermolecular associations as shown in three of the four panels of Figure 3. For temperatures in the range T < 360K, the probability of spontaneous homodimerization increases with increasing chain length. For a given value of T, this is quantified in terms of higher probabilities associated with longer chains realizing close intermolecular separations. Conversely, for a given chain length, the probability of spontaneous dimerization decreases with increasing temperature. To quantify these observations, we computed excess interaction coefficients B22(T) using the cumulative distribution functions shown in Figure 3. These coefficients are defined as follows:
In equation (2), F_T(R) is the cumulative distribution function at temperature T, F_Tθ(R) is the cumulative distribution function at Tθ, and Rdroplet = 200Å is the radius of the droplet used in the simulations (see methods section). The integrals were calculated using an extended trapezoidal rule. The excess interaction coefficients are in the spirit of normalized second virial coefficients that are routinely used in statistical thermodynamics to assess the magnitude of intermolecular associations in solutions of small molecules as well as flexible polymers.82; 83 If B22(T) is less than zero, spontaneous homodimerization is thermodynamically favored vis-à-vis the θ-point and the degree of favorability is assessed by the magnitude of B22(T). If B22(T) is positive, then the chains avoid each other, more so than at the θ-point, indicating a clear preference for dissociated states. If the preference for associated and dissociated states is akin to that of an ideal chain, then B22(T) will be zero.
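To make the sign convention concrete, the following sketch integrates two cumulative distribution functions with a composite trapezoidal rule and forms a normalized difference. The functional form used here, B22 = −(I_T − I_θ)/I_θ with I the integral of F(R) from 0 to Rdroplet, is an assumption chosen only to reproduce the stated behavior (negative for association, zero at the θ-point reference, positive for avoidance); it is not necessarily the expression in equation (2). The distributions are synthetic and the function names are hypothetical.

```python
def trapezoid(xs, ys):
    """Extended (composite) trapezoidal rule on a possibly non-uniform grid."""
    return sum(0.5 * (ys[k] + ys[k + 1]) * (xs[k + 1] - xs[k])
               for k in range(len(xs) - 1))

def excess_interaction_coefficient(r_grid, cdf_t, cdf_theta):
    """ASSUMED form: B22 = -(I_T - I_theta) / I_theta, with I = integral of F(R) dR."""
    i_t = trapezoid(r_grid, cdf_t)
    i_theta = trapezoid(r_grid, cdf_theta)
    return -(i_t - i_theta) / i_theta

# Synthetic cumulative distributions on a grid from 0 to R_droplet = 200 (Angstrom).
r_grid = [0.0, 50.0, 100.0, 150.0, 200.0]
cdf_theta = [0.0, 0.015625, 0.125, 0.421875, 1.0]   # reference distribution
cdf_associated = [0.0, 0.4, 0.7, 0.9, 1.0]          # weight shifted to small R
cdf_dissociated = [0.0, 0.005, 0.05, 0.3, 1.0]      # weight shifted to large R

b22_reference = excess_interaction_coefficient(r_grid, cdf_theta, cdf_theta)
b22_associated = excess_interaction_coefficient(r_grid, cdf_associated, cdf_theta)
b22_dissociated = excess_interaction_coefficient(r_grid, cdf_dissociated, cdf_theta)
```

A distribution with more probability at short separations has a larger integral than the reference, so the coefficient comes out negative under this assumed definition.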
Figure 4 shows two sets of plots. Panel (a) plots the variation of B22(T) as a function of temperature for T ≤ 410K. Separate curves are shown for each of Q5, Q15, Q30, and Q45, respectively. For Q5, B22(T) is negligibly small across the entire temperature range; for the longer chains, B22(T) is negative over different temperature ranges and the magnitude of B22(T) decreases with increasing temperature. Specifically, B22(T) is negative in the temperature range T ≤ 315K for Q15, and negative in the range T ≤ 360K for both Q30 and Q45. Additionally, at temperatures where B22(T) is negative, its magnitude is greater for longer chains. This is summarized in panel (b) which plots the variation of B22(T) as a function of chain length and the different curves denote different temperatures.
The results from Figures 1, 3, and 4 may be summarized as follows: The sharpness of coil-to-globule transitions of monomeric polyglutamine increases with chain length. For T ≤ 360K, Q30 and Q45 form stable globules, and in this temperature range these peptides form stable homodimers, whose stability decreases steadily with increasing temperature. For a given temperature in the range T ≤ 360K, homodimers of Q45 are more stable than homodimers of Q30. In contrast, homodimers of Q15 are only marginally stable and are accessible over a narrower temperature range, T ≤ 315K; this weak dimerization is consistent with the shallow coil-to-globule transition observed for this system. The observation of length-dependent dimerization shows that the driving force for polyglutamine aggregation increases with chain length. We now analyze the physical basis for the length and temperature dependence of spontaneous homodimerization in polyglutamine.
Panels (a) and (b) of Figure 5 show correlations between the temperature dependencies of specific conformational characteristics of monomeric polyglutamine chains and the temperature dependence of B22(T) in the range T ≤ 360K. Panel (a) shows a correlation between the density of monomeric globular polyglutamine and B22(T). The magnitude of the latter decreases as density decreases. Similarly, panel (b) shows that the magnitude of B22(T) decreases as the ensemble averaged value of Rg increases for monomeric globular polyglutamine. In both cases, the magnitudes of Pearson correlation coefficients that assess the degree of linear correlation are ca. 0.95.
For a given temperature, the driving forces for chain collapse may be decomposed into two components.84 Specifically, the mean-field internal energy per residue may be written as:
$$\frac{U}{N} = C_1(T) + C_2(T)\,N^{-1/3} \qquad (3)$$
Here, N denotes chain length, U is the average potential energy at temperature T, C1(T) measures the bulk energy density, and C2(T) measures the surface energy density. C1(T) provides an estimate of the effective strength of self-interactions and C2(T) estimates the energy associated with making interfaces between polyglutamine and the surrounding solvent. Self-interactions are favorable if C1(T) is negative and the strengths of favorable interactions are measured by the magnitude of C1(T). If C2(T) is positive, its magnitude measures the energy penalty associated with increasing the size of the unfavorable chain-solvent interface. Conversely, if C2(T) is negative, then mixing of the chain and solvent is preferred.
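One way to extract the two coefficients at a fixed temperature is a straight-line fit of the energy per residue against N^(−1/3). The sketch below assumes the bulk-plus-surface form U/N = C1 + C2·N^(−1/3), which is the usual decomposition for a compact globule whose surface-to-volume ratio scales as N^(−1/3); the fitting procedure, the function name, and the synthetic energies are illustrative assumptions and not the authors' protocol.

```python
def fit_bulk_and_surface_terms(chain_lengths, energies):
    """Fit U/N = C1 + C2 * N**(-1/3) by ordinary least squares.

    chain_lengths: chain lengths N; energies: average potential energy U for
    each N at one temperature. Returns (C1, C2). The model form is ASSUMED.
    """
    xs = [n ** (-1.0 / 3.0) for n in chain_lengths]
    ys = [u / n for u, n in zip(energies, chain_lengths)]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    c2 = sxy / sxx               # slope: surface energy density
    c1 = y_mean - c2 * x_mean    # intercept: bulk energy density
    return c1, c2

# Synthetic energies built from C1 = -10 and C2 = +6 (arbitrary units),
# NOT values from the simulations.
lengths = [15, 30, 45]
energies = [n * (-10.0 + 6.0 * n ** (-1.0 / 3.0)) for n in lengths]
c1, c2 = fit_bulk_and_surface_terms(lengths, energies)
```

A negative intercept together with a positive slope corresponds to the poor-solvent case described in the text: favorable self-interactions and a penalty for exposing surface to solvent.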
Panels (a) and (b) in Figure S1 of the supplementary material plot the temperature dependence of C1(T) and C2(T) in the range T ≤ 360K. The stabilities of collapsed states increase as C1(T) becomes more negative and C2(T) becomes more positive, and this happens as temperature decreases in the range T ≤ 360K. Comparisons of data in Figures 4 and 5 to those shown in Figure S1 indicate that B22(T) becomes increasingly negative as the driving force (magnitudes of C1(T) and C2(T), respectively) for forming compact (small ζ), dense (large ρ) globules increases. Upon collapse, self-interactions are maximized. Dimerization leads to a diminution of the unfavorable solute-solvent interface and increased self-interactions through intermolecular association.
Since collapse and dimerization result from the combined drive to minimize unfavorable solute-solvent interfaces and maximize self-interactions, the surface-to-volume ratio (RSV) of a single chain in a poor solvent provides a generic measure of the relevant driving forces.85 For globules, RSV decreases with increasing chain length because it scales as N−1/3. For small N, such as Q15 the contribution from the surface term to RSV is significant, which means that unfavorable surface energies are not readily offset by favorable self-interactions. As a result, Q15 shows weak tendencies toward collapse and stable dimerization. RSV decreases rapidly for larger N. For Q30 and increasingly so for Q45, RSV is small enough to allow for the unfavorable surface energies to be offset by favorable self-interactions as evidenced by the increased preference for collapsed and dimerized states in the regime T < 360K. Q30 and Q45 encompass the threshold length range for polyglutamine disease phenotypes. The preceding discussion identifies RSV as a rather simple measure for the observed phenotype and is anchored in generic concepts drawn from polymer physics. However, if polymers were rigid spheres or colloidal particles that interacted primarily through surface contacts,86 then the value of RSV for a single chain would be less meaningful because C1(T) would be negligible. In this case, the relevant surface-to-volume ratio would be that of clusters of molecules and not that of a single molecule. The description of polyglutamine aggregation would then follow classical models for homogeneous nucleation, where aggregation is favorable only if the cluster size is greater than some critical number.
The foregoing analysis focused on generic polymer physics parameters and the correlations between these quantities and the spontaneity of intermolecular associations. We also assessed the presence of specific, ensemble-averaged conformational propensities that can be implicated in promoting both collapse and intermolecular associations. Specifically, we asked whether there is a discernible increase in β-sheet propensity associated with collapse, spontaneous associations, or both. To answer this question, we computed the fractional α-helical and β-sheet contents using our simulation data.
There are several ways to assess secondary structure content in proteins and polypeptides. We have developed a strategy that is based on analysis of distributions of backbone φ and ψ angles. The resultant measure, shown below, provides a reasonable estimate of secondary structure content as compared to popular measures such as DSSP87 and STRIDE88 (data not shown). The fractional α and β contents, fα and fβ, respectively, are defined as:
In equation (4), fX denotes the fractional content of secondary structure type X, where X is either α or β. The mod 2π terms correct for periodicity effects when calculating distances in angular space, and N is the number of residues in the sequence, excluding capping groups. The coordinates (φX, ψX) define the reference φ, ψ values to be adopted by an individual residue for the secondary structure motif of type X. If a residue i adopts φ and ψ angles that lie within a circle of radius rX centered on (φX, ψX), then the contribution of that residue to fX is set to unity; otherwise the contribution assumes a value between 0 and 1, and the precise value is determined by two parameters, viz., the distance dX(i) and τX. The latter sets the width of the Gaussian function used to determine the value assigned to the contribution. For X = α, (φα, ψα) = (−60°, −50°), rα = 30°, and τα = 0.002 deg^−2. Conversely, for X = β, (φβ, ψβ) = (−125°, 125°), rβ = 40°, and τβ = 0.002 deg^−2.
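A minimal sketch of such a measure is given below. The reference angles, radii, and τ values are those quoted in the text; everything else is an assumption made for illustration: the Euclidean distance in (φ, ψ) space with periodic wrapping, the Gaussian decay exp(−τX (dX − rX)^2) outside the circle, the simple average over residues, and the function names.

```python
import math

# (phi_X, psi_X, r_X) in degrees and tau_X in deg**-2, as quoted in the text.
REFERENCE = {
    "alpha": (-60.0, -50.0, 30.0, 0.002),
    "beta": (-125.0, 125.0, 40.0, 0.002),
}

def angular_difference(a, b):
    """Smallest signed difference between two angles in degrees (periodic)."""
    return (a - b + 180.0) % 360.0 - 180.0

def residue_weight(phi, psi, kind):
    """Contribution of one residue: 1 inside the circle, Gaussian decay outside (ASSUMED form)."""
    phi_x, psi_x, radius, tau = REFERENCE[kind]
    d = math.hypot(angular_difference(phi, phi_x), angular_difference(psi, psi_x))
    if d <= radius:
        return 1.0
    return math.exp(-tau * (d - radius) ** 2)

def fractional_content(dihedrals, kind):
    """Average residue weight over a list of (phi, psi) pairs in degrees."""
    return sum(residue_weight(phi, psi, kind) for phi, psi in dihedrals) / len(dihedrals)

# Toy input: two residues at the alpha reference and two at the beta reference.
toy = [(-60.0, -50.0), (-60.0, -50.0), (-125.0, 125.0), (-125.0, 125.0)]
f_alpha = fractional_content(toy, "alpha")
f_beta = fractional_content(toy, "beta")
```

Because the two reference basins are far apart relative to the Gaussian width, a residue sitting in one basin contributes essentially nothing to the other, so both fractions come out very close to one half for this toy input.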
Figure 6 shows four panels that summarize the temperature dependencies of fractional α-helical (fα) and β-sheet (fβ) contents in Q5, Q15, Q30, and Q45. Data are shown from simulations of monomeric polyglutamine and those with two chains (“dimer”). From the data shown in Figure 6 we conclude the following: 1) There is a clear, statistically significant diminution in fα with increasing chain length. Inasmuch as intermolecular associations become favorable with increasing chain length, the decreased α-helical propensities with increasing N suggest a weak correlation between decreased helical propensity and chain associations. 2) While collapse and intermolecular associations show clear temperature dependencies, conformational propensities measured in terms of fα and fβ show very weak temperature dependencies. Therefore, local conformational propensities appear to be only weakly coupled to the driving forces for collapse and intermolecular associations. 3) For a given solvent quality (defined by the value of the simulation temperature T), the fractional β content is greater than or equal to the fractional α content, and this is true irrespective of chain length. 4) Of the three chains Q15, Q30, and Q45, the two longer chains show a diminution in α-helical content and a weak enhancement in β content, seen as a weak increase in fβ with increasing temperature.
Local secondary structure content changes weakly as a function of temperature and chain length. Despite this, the driving forces for collapse and the spontaneities of homodimerization show clear temperature and length dependencies. Therefore, we conclude that disordered globules associate to form disordered dimers. This observation is congruent with the findings of Krishnan and Lindquist89 for the aggregation of the NM regions of the yeast prion protein Sup35. They found that “molten oligomers”, which form as precursors to NM fiber formation, are dominated by contacts between the globular forms of the glutamine and asparagine rich N-domain, 90 which also forms collapsed structures in its monomeric form. 91
Experimental data and computational studies have documented the lack of conformational specificity in monomeric polyglutamine. In our simulation data, this preference for intrinsic disorder prevails despite the preference for collapsed states for temperatures in the range T ≤ 360K. In previous work, we proposed that intrinsic disorder is a direct consequence of the homopolymeric nature of polyglutamine. The lack of sequence specificity implies that a variety of compact species, irrespective of chain conformation, have equivalent stabilities, and that the conformational ensemble therefore is a heterogeneous collection of compact conformations. While this type of disorder is distinct from the disorder associated with denatured proteins, collapse does not imply folding.92 Our analysis of local conformational propensities makes this point. Additionally, we can analyze the variations in contact patterns between individual members of the conformational ensemble to assess the degree of disorder as a function of temperature. To accomplish this, we quantify disorder by computing a single figure of merit, namely, the normalized variance in the number of intramolecular contacts as a function of temperature. This quantity, computed for monomeric polyglutamine, is defined as follows:
The variance is taken over the probability distribution of realizing k intramolecular contacts in a chain of length N at simulation temperature T, about the average number of intramolecular contacts in a chain of length N at that temperature; a contact is defined by any two non-bonded atoms from residues i and j having a distance ≤ 3Å; and the variance is normalized using nmax(N), the maximal number of realizable intramolecular contacts in a chain of length N.
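The figure of merit can be sketched as follows, starting from the number of intramolecular contacts counted in each sampled conformation. The specific normalization used here, the variance of k divided by nmax(N), is an assumption made for illustration, as are the function name and the toy inputs; the contact counts themselves would come from applying the 3Å criterion to each snapshot.

```python
from collections import Counter

def normalized_contact_variance(contact_counts, n_max):
    """Variance of the number of contacts k, divided by n_max (ASSUMED normalization).

    contact_counts: number of intramolecular contacts in each sampled conformation.
    n_max: maximal number of realizable contacts for this chain length.
    """
    total = len(contact_counts)
    # Empirical probability of realizing k contacts.
    prob = {k: c / total for k, c in Counter(contact_counts).items()}
    mean_k = sum(k * p for k, p in prob.items())
    variance = sum(p * (k - mean_k) ** 2 for k, p in prob.items())
    return variance / n_max

# Toy ensembles (not simulation data).
rigid = normalized_contact_variance([5, 5, 5, 5], n_max=10)      # one state only
two_state = normalized_contact_variance([0, 10, 0, 10], n_max=10)  # coil/globule mix
```

An ensemble that visits a single contact pattern gives zero, whereas an ensemble that alternates between very different numbers of contacts gives a large value, which is the behavior expected in a transition region where coil and globule states are both sampled.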
Results for the variation of the normalized contact variance as a function of temperature for Q15, Q30, and Q45 are shown in panel (a) of Figure 7. In the high temperature limit (T > 450K), past the theta point, chains sample canonical denatured state ensembles where the dominant contacts are local and the likelihood of realizing distal contacts is very small; this is true for a majority of conformations in the ensemble. Consequently, the average number of contacts is small and so is the variance. Just below the theta point, the chains are in the transition region and sample conformations from two distinct ensembles, viz., the coil and globule states. In this regime, conformational fluctuations are large, and values of the normalized contact variance are high because vastly different types of conformations are sampled. In complete congruence with the analysis shown in Figure 1, the sharpness of the coil-to-globule transition increases with N. This is manifest in the fact that as N increases, the width of the transition region decreases and the peak height increases in panel (a) of Figure 7. In the collapse regime, the normalized contact variance decreases with decreasing temperature and does not plateau to a well-defined value. This systematic decrease with decreasing temperature is a characteristic signature of dynamical disorder and is consistent with glassy behavior quantified in previous studies of monomeric polyglutamine.54 The assignment of dynamical disorder to collapsed polyglutamine is made clear by comparing the variance profiles shown in panel (a) of Figure 7 to the variance profiles obtained from simulation data for thermal unfolding of two well-folded proteins (shown in panel (b) of Figure 7), namely, the B1 domain of protein G (GB1) and the engrailed homeodomain (ENH). For these proteins, there are well-defined baselines in the values of the normalized contact variance on either side of the transition region.
Additionally, the unfolded baseline (high temperature) is higher in value than the folded baseline (low temperature), which is consistent with the adoption of a roughly rigid structure with small-scale fluctuations at low temperature and a heterogeneous ensemble characterized by dominant local contacts at high temperature. In contrast, for polyglutamine, there are temperatures well into the collapsed regime (T < 360K) for which the normalized contact variance is actually higher than the asymptotic value achieved in the high temperature regime (T > 450K). These data are consistent with the proposal that monomeric polyglutamine fluctuates between disparate collections of conformations of roughly equivalent compactness. Intrinsic disorder results because collapse is only weakly coupled to folding in these simple systems that lack the requisite sequence specificity to prefer a specific compact conformation.
We have established that monomeric polyglutamine, which is intrinsically disordered, associates to form disordered homodimers. The latter point is underscored in the analysis where we showed that local conformational propensities are essentially unchanged between the isolated monomer and associated dimer (Figure 6). Using a simple approach, we interrogated the role of intrinsic disorder (spontaneous fluctuations) of polyglutamine in promoting intermolecular associations. This was done in a series of simulations where we quantified the likelihood of realizing spontaneous associations of rigid globules. These simulations were carried out as follows: Random globular conformations were chosen from the conformational ensemble of monomeric Q30 at T=298K. The internal coordinates were then frozen and only rigid body Monte Carlo moves were allowed for subsequent sampling. Statistics were recorded to construct the requisite histograms for intermolecular separations sampled in simulations with rigid globules. The process was repeated approximately one thousand times, and the resultant average cumulative distribution F(R) was compared to that obtained for the association of “fully flexible” chains. These comparisons are shown in Figure 8. The dashed curve, which corresponds to the cumulative distribution function for rigid globules, reveals the importance of conformational fluctuations in promoting intermolecular associations.
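The averaging step of the rigid-globule protocol can be illustrated with a short sketch. It is not the analysis code used in this work: the function names, the distance grid, and the toy separations are hypothetical, and F(R) is assumed here to be the fraction of sampled configurations whose intermolecular separation does not exceed R.

```python
from bisect import bisect_right

def empirical_cdf(separations, grid):
    """Fraction of sampled separations that are <= R, for each R in grid."""
    ordered = sorted(separations)
    n = len(ordered)
    return [bisect_right(ordered, r) / n for r in grid]

def average_cdf(runs, grid):
    """Average the per-run cumulative distributions over independent runs."""
    cdfs = [empirical_cdf(run, grid) for run in runs]
    return [sum(column) / len(cdfs) for column in zip(*cdfs)]

# Toy data (assumption): separations in Å from three hypothetical
# rigid-globule runs, each started from a different frozen conformation.
runs = [
    [10.0, 20.0, 150.0, 300.0],
    [15.0, 120.0, 250.0, 350.0],
    [200.0, 220.0, 300.0, 380.0],
]
grid = [50.0, 200.0, 400.0]
F = average_cdf(runs, grid)
print(F)
```

In such a sketch, a curve for rigid globules that lies below the curve for flexible chains at small R would indicate fewer close approaches, which is the comparison drawn in Figure 8.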
The suppression of conformational fluctuations at T=298K leads to a diminution of intermolecular associativity for Q30. The lack of rigid structural preferences or a stable fold upon collapse is clearly responsible for promoting intermolecular associations between disordered globules. This result is consistent with the observation that many aggregation-prone sequences are also intrinsically disordered. However, some caution is required in interpreting the results of Figure 8. For instance, Figure 7 shows that the degree of disorder, measured by the variance in the number of contacts, increases with temperature for T < 360K and yet B22(T) decreases with increasing temperature. The physical basis for the latter observation comes from the analysis in Figures 5 and S1, which demonstrates that the poorness of solvent decreases with increasing temperature. Therefore, we conclude that both poorness of solvent and spontaneous conformational fluctuations work together to promote spontaneous homodimerization. The preference for homodimerization requires an appropriate combination of the poorness of solvent (see Figures 1–5) and magnitude of fluctuations. This point is reinforced by the following observation: In the temperature regime Tc ≤ T ≤ Tθ (360K < T < 410K) the magnitudes of conformational fluctuations go through a maximum for all chain lengths. In this regime, the surface energy penalty, measured by C2(T), is still positive and approaches zero only as T → Tθ (data not shown). This implies homodimerization is thermodynamically unfavorable despite the fact that the solute-solvent interface is also unfavorable. Under these conditions, homodimerization might require the formation of an appropriate conformational nucleus, to which only appropriate conformations would be able to dock and minimize the unfavorable interface with the surrounding solvent.
Alternatively, some other, higher order, oligomeric species might be the thermodynamically favored entity because such a species might minimize the unfavorable solute-solvent interface more efficiently than dimers in the regime Tc ≤ T ≤ Tθ. The precise correlation between poorness of solvent and the magnitude of conformational fluctuations merits further scrutiny and is reserved for a separate study.
Polyglutamine molecules are polyamides built by a repetition of backbone secondary amides and sidechain primary amides. To analyze the types of inter-atomic contacts that lead to collapse and dimerization, we computed site-site pair correlation functions. Results are shown in Figures S2 and S3 of the supplementary material. The former shows the intramolecular site-site correlations for Q45 whereas the latter shows the intermolecular correlation functions. Both sets of correlation functions were constructed using simulation data with two Q45 molecules. The relevant pair correlation functions are calculated with respect to Tθ, as described below.
The site-site correlation functions of interest to us are between backbone donors (N) and backbone acceptors (O), sidechain donors (N) and sidechain acceptors (O), backbone donors (N) and sidechain acceptors (O), and sidechain donors (N) and backbone acceptors (O). If we denote donor atoms as D and acceptor atoms as A, then the relevant donor-acceptor site-site correlation function gDA(r) at temperature T is computed as:
gDA(r) = hDA(r; T) / hDA(r; Tθ)
Here, hDA(r; T) is the histogram of relevant donor-acceptor distances at temperature T and hDA(r; Tθ) is the corresponding histogram of distances at the theta point. If gDA(r) > 1, then there is an enhancement of the relevant donor-acceptor contacts in the ensemble at temperature T vis-à-vis the theta point; if gDA(r) = 1, then the distribution of donor-acceptor contacts at separation r is equivalent to that of the theta point; finally, if gDA(r) < 1, then there is a depletion of donor-acceptor contacts at separation r vis-à-vis the theta point.
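A minimal sketch of how a theta-point-referenced correlation function of this kind could be evaluated is given below. It assumes that gDA(r) is the bin-wise ratio of two normalized distance histograms; the normalization, the bin width, the function names, and the toy distances are all assumptions made for illustration and are not taken from the analysis code used here.

```python
def normalized_histogram(distances, r_max, n_bins):
    """Bin distances in [0, r_max) and normalize the counts to sum to one."""
    width = r_max / n_bins
    counts = [0] * n_bins
    for d in distances:
        if 0.0 <= d < r_max:
            # Guard against floating-point rounding at the upper edge.
            counts[min(int(d / width), n_bins - 1)] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no distances fall inside [0, r_max)")
    return [c / total for c in counts]

def g_da(dist_T, dist_theta, r_max, n_bins):
    """Ratio of the histogram at temperature T to that at the theta point.

    Bins that are empty in the reference ensemble are reported as None
    because the ratio is undefined there.
    """
    h_T = normalized_histogram(dist_T, r_max, n_bins)
    h_ref = normalized_histogram(dist_theta, r_max, n_bins)
    return [t / ref if ref > 0 else None for t, ref in zip(h_T, h_ref)]

# Toy donor-acceptor distances in Å (assumption, not simulation data).
dist_T = [3.5, 4.0, 4.5, 12.0]        # ensemble at temperature T
dist_theta = [3.5, 12.0, 14.0, 18.0]  # reference ensemble at the theta point
g = g_da(dist_T, dist_theta, r_max=20.0, n_bins=4)
print(g)  # [3.0, None, 0.5, 0.0]
```

In this toy case the first bin (0 to 5 Å) shows enhancement (ratio above one), whereas the more distal bins show depletion, which mirrors the qualitative reading of gDA(r) described above.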
Figure S2 (supplementary material) shows a panel of intramolecular donor-acceptor site-site correlation functions for monomeric Q45. As T approaches the theta point, all pair correlation functions converge upon values of unity for all distances. For lower temperatures, specifically T ≤ 360K, there is significant enhancement vis-à-vis the theta point of short-range (3Å ≤ r < 5Å) sidechain donor – sidechain acceptor and sidechain donor – backbone acceptor contacts. All four sets of site-site correlation functions in Figure S2 show systematic enhancements of medium-range contacts (5Å ≤ r ≤ 20Å) and depletion of distal contacts. This feature is consistent with the preference for collapsed states at lower temperatures. The pair correlation functions shown in Figure S2 suggest that the collapsed states are characterized by prominent sidechain-backbone interactions, again with respect to the theta point, indicating that the sidechain amides solvate the backbone, thereby minimizing the interface between backbone secondary amides and the aqueous milieu. The approach used here to calculate pair correlations differs from the one used in previous work.54 Here we used the theta point as our reference state whereas in previous work we used an ideal chain model as the reference and therefore the two sets of correlation functions for collapsed, monomeric polyglutamine are different.
Figure S3 (supplementary material) shows intermolecular donor-acceptor site-site correlation functions calculated using simulation data for a pair of Q45 molecules in the simulation volume. These pair correlation functions are shown on a log scale to facilitate the visualization of all the data. For temperatures that are in the collapse regime, there is significant enhancement of all flavors of donor-acceptor contacts. This is because of significant intermolecular donor-acceptor contacts that are absent at the theta point. These observations suggest that spontaneous dimerization of polyglutamine is the result of the drive to minimize the interface with the surrounding aqueous environment, a poor solvent for polyglutamine, and to replace this interface with favorable intra- and intermolecular contacts between all combinations of backbone donors, backbone acceptors, sidechain donors, and sidechain acceptors.
Molecular simulations have played an important role in generating insights and testable hypotheses for various self-assembly phenomena involving folded proteins and intrinsically disordered proteins.93; 94; 95; 96; 97; 98; 99; 100; 101 The latter are challenging systems for simulation and experiment alike because their free energy landscapes are both degenerate and rugged.54 Energy functions for simulations have multiple flavors. These range from structure-based energy functions to the molecular mechanics paradigm. In the former, the focus is on the role of geometry in guiding folding and assembly. The latter rests on the premise of transferability whereby energy functions are designed by transferring parameters that accurately describe the conformational and phase equilibria of small model compounds.
The sequence simplicity of polyglutamine systems is attractive for testing predictions from a large body of theoretical work on synthetic homopolymers. Accordingly, several groups have carried out simulations of varying complexity to test specific ideas and hypotheses regarding polyglutamine conformational equilibria, changes associated with increasing chain length, and assembly mechanisms.102; 103; 104; 105; 106; 107 Structure-based Go models and other heuristic coarse graining approaches have been used to construct phase diagrams for polyglutamine as a function of chain length and numbers of molecules. The works of the Hall102; 103; 105 and Dokholyan108; 109 groups are particularly noteworthy. In their work, the chain length dependent likelihood of β-sheet formation is an important consideration in developing their energy functions and a mechanistic explanation for polyglutamine aggregation. Our ABSINTH continuum solvation model rests on the molecular mechanics paradigm, and in this work we have presented results from atomistic simulations on the length and temperature dependence of conformational equilibria and spontaneous dimerization of polyglutamine molecules. Our main findings are as follows:
Chen et al.41 described the formation of large ordered polyglutamine aggregates as a nucleation-dependent reaction. They used the thermodynamic nucleus model of Ferrone, a variant of homogeneous nucleation, to analyze kinetic data for the formation of large aggregates. In the schematic that emerged, monomeric polyglutamine (irrespective of chain length) is in rapid pre-equilibrium with an ordered nucleus (presumably an ordered β-sheet conformation). The pre-equilibrium constant increases with increasing chain length, but remains immeasurably small for all chain lengths. According to this model, the lag phase in polyglutamine aggregation arises because β-sheet formation is thermodynamically unfavorable. However, this unfavorable folding is a conformational pre-requisite for the formation of aggregates of all sizes, including dimers.
Chen et al.41 used four probes to monitor the kinetics of aggregation. These were circular dichroism (CD), dynamic light scattering, Thioflavin T (ThT) binding, and reverse phase HPLC. If β-sheet contents do not vary with oligomerization, then CD signals would not change as oligomers formed. Similarly, dynamic light scattering cannot resolve the presence of smaller species; ThT binding most likely reports only on the formation of large ordered aggregates; and if the oligomers are part of the soluble species, then reverse phase HPLC would not detect these oligomers. Hence, close scrutiny of the data presented by Chen et al. suggests that one cannot rule out the presence of soluble oligomers in the reacting mixture.
Interestingly, other lines of experimental evidence support the presence of oligomers as identifiable intermediates.27; 111; 112; 113 Lee et al.42 recently measured the aggregation kinetics of Q23 using peptide constructs that were similar to those used by Chen et al.40 Using both static and dynamic light scattering, Lee et al. found evidence for the formation of soluble, linear aggregates during the lag phase. They also found the early aggregates to be lacking in regular secondary structure. Inasmuch as we can connect dimer formation with formation of larger aggregates, we propose that our results, which show a lack of local conformational specificity in chain collapse and intermolecular interactions, are consistent with the observations of Lee et al.
In comparison to other computational results, the studies by Marchut and Hall102; 103; 105 are the most relevant. They employed a conceptually different, structurally guided, coarse-grained model to describe peptide and solvent. Despite this difference from our approach, a brief comparison of the results is in order. In their most recent study the concentration used is 2.5 mM for a system composed of 24 molecules with chain lengths ranging from 16–48 residues at various reduced temperatures. Marchut and Hall find that at temperatures close to the effective Tθ of their model, large-scale aggregates with relatively large fractions of β-sheet hydrogen bonds and with distinctive ring-like topologies are populated. The authors note that experimental evidence for these structural motifs is lacking. At lower reduced temperatures, however, they describe “amorphous aggregates”, i.e., aggregates lacking in structure. Our results are congruent with this unstructured “phase”. In their parlance, disordered globules with unstructured interfaces are termed amorphous aggregates. As for the observed ordered phase (“sheets”), we argue that the concentration regime in their work as well as the employed model predispose the results toward this order. This is suggested by the fact that β-rich structures appear even at the monomer level, which is incongruent with existing experimental data for monomeric polyglutamine. It is therefore likely that the appearance of an ordered phase for small oligomers is an artifact of the way their models were built. These differences notwithstanding, the results of both studies appear to share some overlap in predicting amorphous aggregates under certain conditions. Whether more of the results will be reconciled if the simulation conditions between the two studies are fully matched remains to be seen and is a topic for future investigation.
Recently, Cecchini et al.114 used a surface area based implicit solvent model and replica exchange molecular dynamics to study the aggregation propensities of four short sequences including a 7-residue polyglutamine peptide. They found that the polyglutamine peptide and a peptide excised from the sequence of Sup35 showed clear aggregation propensities as discerned by analysis of the temperature dependence of a nematic order parameter. In future work we intend to reanalyze the current data using order parameters developed by the Caflisch group since these seem to provide an indicator of sequence regions that will have greater propensity for aggregation.
The presence of disordered low and high molecular weight aggregates implies that simple homogeneous nucleation models cannot describe polyglutamine aggregation. Questions persist regarding the degree of complexity needed in mechanistic models for polyglutamine aggregation. One possibility is that the formation of disordered linear or spherical aggregates, unlike the formation of ordered aggregates, occurs off pathway and does not follow the tenets of homogeneous nucleation theory. Alternatively, as suggested by Lee et al.42 and others89; 113; 115; 116, disordered aggregates that are sufficiently large might convert to ordered forms. The different scenarios are summarized in Figure 9. The schematic in Figure 9 is congruent with the tenets of the generalized Lumry-Eyring model put forward recently by Andrews and Roberts.117 At this juncture we cannot adjudicate between the different possibilities shown in Figure 9. This will require simulations that explicitly probe the extent of coupling between specific conformational restraints and intermolecular associations. It will also require advances in computational methods that allow us to study conformational equilibria and aggregation of multiple molecules in atomic detail. From an experimental standpoint, we will need to probe the distribution of oligomers formed at low concentrations and low copy numbers because these will provide a direct way to compare results from simulations to experimental data. These investigations will require novel experiments or the adaptation of experiments used on other aggregation-prone systems to study polyglutamine aggregation.
Capped polypeptides composed exclusively of glutamine residues (Acetyl-(Gln)N-N-Methylamide) were built with fixed bond lengths and angles according to the Engh-Huber high-resolution, crystallographic geometries. Peptides with N = 5, 15, 30, and 45 were simulated in the nNVT ensemble in a spherical droplet of radius Rdroplet=200Å, where n denotes the number of individual molecules, N the number of glutamine residues, V the volume of the simulation droplet, and T the simulation temperature. In all cases, polar interactions were truncated at 14Å, and short-range interactions were truncated at 10Å. This is justified by the fact that the simulation system does not include any naked charges and the charge set used, which is based on the OPLS-AA/L parameters, models primary (sidechain) and secondary (backbone) amides as dipoles, albeit with the partial charge model.
The simulation system consisted of either a single chain (n = 1) for the monomer simulations or of two chains (n = 2) for the dimer case. The chain molecules were confined to the simulation volume by applying a stiff harmonic boundary potential restraining the molecules to the simulation droplet. It is important to note that for a given simulation, fluctuations in n are quenched in the nNVT ensemble. Therefore, the monomer simulations mimic the infinite dilution limit despite the fact that the simulation volume is finite. For the dimer case, the effective concentration was ca. 100 μM, which is in the concentration range of most in vitro experiments. Given our simulation setup, there is no possibility of studying the formation of oligomers larger than a dimer. This does not mean that we predict the absence of such larger species in the concentration range of 100 μM. Rather, our simulations focus on quantifying the spontaneities of chain-length and temperature dependent intermolecular associations at low copy numbers and high effective concentrations. This scenario might be reminiscent of in vivo settings, although this claim is purely speculative in nature. Extensions to study the formation of larger oligomers will require improvements in sampling methodologies and these are currently being pursued.
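The effective concentration quoted for the dimer case follows directly from the droplet geometry. The short calculation below reproduces it for n = 2 molecules in a sphere of radius 200 Å; the function name is hypothetical and the snippet is only an arithmetic check, not part of the simulation software.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole

def effective_concentration_uM(n_molecules, radius_angstrom):
    """Concentration (in µM) of n molecules confined to a spherical droplet."""
    radius_dm = radius_angstrom * 1e-9                 # 1 Å = 1e-10 m = 1e-9 dm
    volume_L = (4.0 / 3.0) * math.pi * radius_dm ** 3  # 1 dm^3 = 1 L
    molar = n_molecules / AVOGADRO / volume_L
    return molar * 1e6

c = effective_concentration_uM(2, 200.0)
print(round(c, 1))  # roughly 99 µM, i.e. ca. 100 µM
```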
All the simulations presented in this work were performed using Metropolis Monte Carlo (MMC) sampling of the relevant degrees of freedom, which for polyglutamine are the φ, ψ, and ω angles of the polypeptide backbone as well as the three sidechain dihedral angles (χ1, χ2, χ3) of the glutamine residue. Additionally, for n=2, we include the sampling of rigid body degrees of freedom, namely, translations of centers-of-mass and rotational reorientations of molecules. Details of the move sets employed are summarized in Table S1 of the supplementary material and all of our choices are explained in detail in the caption that accompanies this table. It is important to remind the reader that appropriate design of MMC move sets allows us to simultaneously probe multiple, disparate length scales rather efficiently, taking advantage of the low overall density. This situation is unlike molecular dynamics sampling, which is quite inefficient for sampling large-scale conformational changes as well as intermolecular associations/dissociations. The latter is hindered by slow diffusion and will require adaptive approaches and indeed MMC sampling may be viewed as a variant of such an adaptive approach. All simulations were carried out using our ABSINTH software package, which is being readied for public release (http://lima.wustl.edu/absinth.php), and are summarized in Table 1.
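For readers unfamiliar with Metropolis sampling of torsional degrees of freedom, the acceptance rule can be sketched on a toy problem. The example below perturbs a single dihedral angle with a symmetric random step and accepts or rejects the trial using the standard Metropolis criterion. The threefold torsional energy, the step size, and all function names are assumptions chosen for illustration; they do not represent the ABSINTH energy function or the move sets listed in Table S1.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def metropolis_accept(delta_energy, temperature, rng=random):
    """Standard Metropolis rule: accept with probability min(1, exp(-dE/kT))."""
    if delta_energy <= 0.0:
        return True
    return rng.random() < math.exp(-delta_energy / (K_B * temperature))

def toy_energy(angle_deg):
    """Hypothetical threefold torsional potential in kcal/mol."""
    return 1.0 * (1.0 + math.cos(3.0 * math.radians(angle_deg)))

def sample_dihedral(energy, n_steps, temperature, max_step=30.0, seed=0):
    """Sample one dihedral angle (degrees, wrapped into [-180, 180])."""
    rng = random.Random(seed)
    angle = 0.0
    current = energy(angle)
    trajectory = []
    for _ in range(n_steps):
        # Symmetric proposal: uniform perturbation with periodic wrapping.
        trial = (angle + rng.uniform(-max_step, max_step) + 180.0) % 360.0 - 180.0
        trial_energy = energy(trial)
        if metropolis_accept(trial_energy - current, temperature, rng):
            angle, current = trial, trial_energy
        trajectory.append(angle)  # rejected moves repeat the old state
    return trajectory

traj = sample_dihedral(toy_energy, n_steps=1000, temperature=298.0)
print(len(traj), min(traj), max(traj))
```

Rigid body translations and rotations for n = 2 would enter the same loop as additional move types with their own proposal distributions, each subject to the same acceptance rule.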
As was shown in previous work, intrinsically disordered polyglutamine systems pose a serious challenge for conformational sampling. To improve the quality of our simulation data, we used thermal replica exchange118; 119; 120, which adds an additional Markov chain to the MMC sampling. Details of all the parameters for replica exchange are summarized in Table 1. While there is no formal requirement to restrict swaps to adjacent replicas, this restriction is implemented in practice because the acceptance rate of proposed swaps between nonadjacent replicas is rather low. This restriction increases the overall efficiency of replica exchange given the finite number of swaps that are feasible during a simulation.118; 119; 120 In our simulations, we allowed swaps between all unique pairs of replicas121 in the range 298K ≤ T ≤ 410K because the acceptance rate of proposed swaps between nonadjacent replicas remained finite and we found that this improved the overall quality of sampling, especially for the lower temperatures.
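The dependence of swap acceptance on the temperature gap can be made concrete with the standard thermal replica exchange criterion, in which a swap between replicas i and j is accepted with probability min(1, exp[(βi − βj)(Ei − Ej)]). The sketch below uses this textbook form; the function names, the temperature ladder, and the energies are illustrative assumptions and are not the schedule summarized in Table 1.

```python
import math
from itertools import combinations

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Acceptance probability for exchanging configurations of two replicas."""
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    exponent = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if exponent >= 0.0 else math.exp(exponent)

# Hypothetical temperature ladder (K); every unique pair is a swap candidate.
temperatures = [298.0, 310.0, 335.0, 360.0, 385.0, 410.0]
pairs = list(combinations(range(len(temperatures)), 2))

# Toy energies in kcal/mol: the colder replica currently holds the lower energy.
p_adjacent = swap_probability(-120.0, 298.0, -100.0, 310.0)
p_distant = swap_probability(-120.0, 298.0, -100.0, 410.0)
print(len(pairs), p_adjacent, p_distant)
```

For the same energy gap, the acceptance probability falls off quickly as the temperature gap widens, which is why nonadjacent swaps are usually excluded and why they remain useful only when their acceptance stays appreciably above zero.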
All of the data presented here were generated using the ABSINTH continuum solvation model, the details of which were published recently. In this model, the effects of solvent water are approximated by two contributions: 1) A direct mean-field interaction; and 2) A dielectric screening effect. The ABSINTH model79, which is based on the EEF1 paradigm of Lazaridis and Karplus122, has been shown to reproduce the polymeric behavior of polyglutamine when compared to both simulations in explicit solvent as well as to experimental data. The results presented in the current work were obtained using two minor modifications to the previously published force field. First, for reasons of computational efficiency, partial charges on net-neutral methyl and methylene groups were omitted. All other partial charges are identical to those reported previously and are based on the OPLS-AA/L force field. Second, we employed slightly modified Lennard-Jones (LJ) parameters compared to the parameters published recently. The LJ parameters used in this work are shown in Table S2 (supplementary material). These modifications were not necessary; instead they are chronological artifacts because the simulations presented here were carried out using a previous, unpublished version of the ABSINTH force field and reflect the fact that the simulations presented here were completed prior to the fine-tuning of the published ABSINTH model. In light of the computational expense involved in carrying out the current simulations, we decided against repeating the calculations with the refined parameters. Brief tests (data not shown) indicate that all of the conclusions regarding the length and temperature dependencies of polyglutamine conformational equilibria and the spontaneities of homodimerization remain qualitatively robust. However, Tθ shifts to a value that is lower than 410K when we use the parameters that were published recently.
Most analysis quantities were computed once every 10³ to 10⁴ steps depending on the total extent of the simulation (see Table 1). For each monomer/dimer simulation, we carried out multiple, independent simulations with replica exchange. Therefore, we used a modified block averaging technique to estimate error bars. In this approach, for each temperature point, the data obtained from a single replica exchange simulation run is treated as a single block. With this approach, we are not confronted with the problem of having to chop our simulation data into ad hoc blocks. Independence of blocks for averaging is guaranteed because the starting conformations for all simulations are completely randomized. However, we do not have the resources to carry out hundreds of independent replica exchange runs. Instead, we typically have data from four independent replica exchange runs. Hence, the error bars that result from “block averaging”, which in reality is averaging over completely independent trajectories, are not rigorous estimates of statistical and bias errors in sampling; rather, they act as qualitative indicators of the reproducibility of our results between independent runs with randomized starting conformations.
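One simple way to turn per-run averages into an error bar of this kind is sketched below, treating each independent replica exchange run as one block. The text does not specify the exact estimator, so the use of the standard error of the mean, the function name, and the numerical values are assumptions made for illustration.

```python
import math

def mean_and_standard_error(run_averages):
    """Mean over independent runs and the standard error of that mean."""
    n = len(run_averages)
    if n < 2:
        raise ValueError("at least two independent runs are required")
    mean = sum(run_averages) / n
    # Unbiased sample variance across runs (each run is one block).
    variance = sum((x - mean) ** 2 for x in run_averages) / (n - 1)
    return mean, math.sqrt(variance / n)

# Toy values (assumption): an observable averaged over each of four
# independent runs at a single temperature.
run_averages = [10.0, 12.0, 11.0, 13.0]
mean, sem = mean_and_standard_error(run_averages)
print(mean, sem)
```

With only four blocks the resulting error bar is itself poorly determined, which is consistent with treating it as a qualitative indicator of reproducibility rather than a rigorous error estimate.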
This work was supported by grant 5R01NS056114-01 from the National Institutes of Health. We thank Scott Crick, Carl Frieden, Nicholas Lyle, and Timothy Williamson for helpful discussions and suggestions. RVP is grateful to Ron Wetzel for insights and guidance during the initial stages of this work.