|Home | About | Journals | Submit | Contact Us | Français|
Allostery describes altered protein function at one site due to a perturbation at another site. One mechanism of allostery involves correlated motions, which can occur even in the absence of substantial conformational change. We present a novel method, “MutInf”, to identify statistically significant correlated motions from equilibrium molecular dynamics simulations. Our approach analyzes both backbone and sidechain motions using internal coordinates to account for the gear-like twists that can take place even in the absence of the large conformational changes typical of traditional allosteric proteins. We quantify correlated motions using a mutual information metric, which we extend to incorporate data from multiple short simulations and to filter out correlations that are not statistically significant. Applying our approach to uncover mechanisms of cooperative small molecule binding in human interleukin-2, we identify clusters of correlated residues from 50 ns of molecular dynamics simulations. Interestingly, two of the clusters with the strongest correlations highlight known cooperative small-molecule binding sites and show substantial correlations between these sites. These cooperative binding sites on interleukin-2 are correlated not only through the hydrophobic core of the protein but also through a dynamic polar network of hydrogen bonding and electrostatic interactions. Since this approach identifies correlated conformations in an unbiased, statistically robust manner, it should be a useful tool for finding novel or “orphan” allosteric sites in proteins of biological and therapeutic importance.
Originally, allosteric proteins were those where multiple subunits achieved cooperative binding through ligand-mediated shifts in conformational equilibria. Nowadays, allostery is broadly defined as any case in which an event at one site on a protein or complex impacts function, dynamics, or distribution of conformations of another site (for recent reviews see 1, 2). This broader definition includes single-domain proteins as well as proteins or complexes where cooperativity occurs without substantial conformational change. Given this broader definition, it has been suggested that allostery is a property of many proteins3-5, but is only relevant when a localized event precipitates a change in function. Recently, there has been renewed interest in uncovering allosteric mechanisms of protein regulation and in discovering new allosteric sites, which are of significant interest in biological mechanisms of protein regulation and as novel sites for drug discovery6-8.
Typically, sites are identified as allosteric after mutational, structural, and thermodynamic characterization with allosteric protein, peptide, or small-molecule modulators, which are frequently found serendipitously9. As such, there has been much interest in computational approaches to identify novel allosteric sites. One of the most extensively used approaches has been the Statistical Coupling Analysis approach pioneered by Ranganathan and co-workers10-13, where pairs of residues that tend to be mutated together in multiple sequence alignments suggest coupling between protein sites. This approach has recently been used to engineer a novel allosteric network by combining predicted allosteric pathways from a light sensor and an enzyme14. However, this approach requires large multiple sequence alignments and the predicted couplings may or may not be relevant to particular proteins in the alignment15. Alternative methods to identify allosteric networks using sequence comparisons have also been described16.
Other computational methods to study allosteric mechanisms and identify potential sites for allosteric regulation focus on a protein's structure and dynamics. Cooper and Dryden showed that the free energy of cooperativity could be separated into two terms: one that accounts for changes in the protein's conformational distribution (i.e. by population shifts), and one that accounts for changes in the amplitudes and/or frequencies of protein vibrational motions. One approach to studying allostery is to focus on protein vibrations17-20 around a static structure, often by a coarse-grained normal mode analysis21-24, in which case allosteric effects of perturbations can be calculated analytically. However, these approaches are unable to capture the anharmonicity and multi-well nature of flexible degrees of freedom in proteins. Another approach is to infer groups of residues important for a given allosteric process by analyzing structures trapped in different conformations25-27.
Dynamical approaches to studying allostery generate an ensemble of structures and then analyze the ensemble using cross-correlations28, contact correlations29, principal components29, or local unfolding correlations30. One widely adopted approach uses a quasi-harmonic metric for correlations that assumes an “average” structure28, 29, 31-33. This approximation may be appropriate for small backbone fluctuations but may not aptly describe conformational changes that involve basin-hopping, such as loop or side-chain motions. To overcome this quasi-harmonic limitation, Lange and Grubmuller34, 35 used a mutual information method to account for both quasi-harmonic and anharmonic correlations in atoms' motion in Cartesian space. Still other methods introduce mechanical perturbations and monitor the subsequent motions of residues36, 37. The latter approach can detect substantial population shifts or structural changes following the induced local perturbations, as the added energy facilitates barrier crossing.
Our MutInf approach for identifying allosteric networks quantifies correlations between the conformations of residues in different sites. We use an entropy-based approach to analyze ensembles of protein conformers, such as those from molecular dynamics or Monte Carlo simulations. The method is applicable even in cases where conformational changes are subtle, e.g., when the coupling is mostly entropic in nature38, 39. Unlike the approaches described above, our approach uses internal coordinates and focuses on dihedral angles, which are responsible for most low-frequency motions, in order to capture correlated changes in side chain rotamers, a highly anharmonic type of correlated motion. The most closely related previously published method is a study that examined side-chain correlations using a mutual information metric and Monte Carlo simulations of side-chains40 on a set of fixed protein backbones.
Our MutInf method builds upon and extends previous work by 1) directly connecting correlated conformations to the molecular configurational entropy, 2) incorporating more robust entropy estimators, 3) correcting for undersampling using data from multiple simulations, 4) testing statistical significance to filter out correlated motions that are not significant, and 5) analyzing both backbone and sidechain torsions, which are frequently coupled41, 42. The theoretical underpinnings of our approach are described in detail in Methods. Briefly, we use second-order terms from the configurational entropy expansion, the mutual information43, to identify pairs of residues with correlated conformations in an equilibrium ensemble. In calculating mutual information, it does not matter whether two residues move at the same time or whether one moves, and then the other; what counts is whether these residues' conformational distributions are correlated. In this work we use the terms correlated motions and correlated conformations interchangeably.
Because we look for correlated conformations in an unbiased, statistically robust manner, we believe that MutInf will be a useful tool in the discovery of novel, “orphan” allosteric sites, where endogenous protein or small molecule allosteric modulators have yet to be discovered. As a proof-of-principle, we used our approach to identify correlations between the conformations of protein residues lining two small-molecule binding sites in human interleukin-2 (IL-2). This single-domain protein exhibits cooperative ligand binding without substantial conformational change, and to date no follow-up work has been done to uncover the mechanism for this cooperativity. We discuss the rationale behind our approach and compare its strengths and weaknesses to those of other methods and then discuss the mathematical details of our method and our novel results on IL-2.
When an equilibrium ensemble of states is altered by small perturbations, the fluctuation–dissipation theorem relates equilibrium fluctuations to the system's response, which will be proportional to equilibrium pair-correlations of the degrees of freedom and linear in the applied perturbations. This linear response theory suggests that external forces, such as those due to ligand binding, cause the largest indirect changes in the degrees of freedom that are most correlated (at equilibrium) with those directly perturbed by the external forces. As has been previously noted29, this also means that the response to small perturbations involves the same fluctuation pathways activated by random solvent collisions at equilibrium. Elastic network models have identified a correspondence between low-frequency normal modes and pathways used in several protein conformational changes22, 44, suggesting that correlations observed in equilibrium simulations may propagate perturbations in structure and/or dynamics due to ligand or protein binding.
A perturbation at one site can couple to another site directly, through electrostatic or steric interactions, or indirectly, through solvent reordering, or through a network of residues with correlated conformations. When the conformation at one site depends on the conformation at another site, the sites' conformations are correlated. When the conformations are correlated, perturbations at one site can cause population shifts in conformations at other sites. Correlated conformations are then signals that can be used to identify allosteric mechanisms and predict new sites for allosteric inhibition by proteins or small molecules.
Our MutInf approach uses equilibrium molecular dynamics simulations to identify correlations in residues' conformations, from which functional coupling between sites is inferred. Approaches such as ours that infer allostery from equilibrium simulations assume that the allosteric phenomena of interest (i.e. ligand binding, protein binding, protonation state changes, etc.) make perturbations to the energy landscape that are relatively small, i.e. at most a few kT. For example, proteins and ligand binders with fast on-rates will satisfy this assumption, while proteins and ligand binders with slow on-rates may not. Furthermore, equilibrium approaches that infer allostery assume that there are no large barriers to conformational changes required for allostery. If such barriers existed, they would prevent pairs of residues from sampling relevant correlated shifts in conformation when perturbations of interest are applied. Along these lines, these equilibrium approaches also assume that there is sufficient sampling along the degrees of freedom relevant to the allosteric phenomena, so that productive or “on-pathway” correlated motions can be observed.
To quantify correlations between residues' conformations from equilibrium simulations, we take advantage of a connection between information theory and thermodynamics. Inspired by the use of mutual information by Killian, Kravitz, and Gilson in calculating configurational entropies from conformational ensembles using internal coordinates43, we use second-order terms from the configurational entropy expansion, i.e. the mutual information, to identify pairs of residues with correlated conformations. This approach directly and quantitatively connects correlations in conformation to configurational entropy. Using internal coordinates to calculate the mutual information has the twofold advantage of 1) capturing the rotameric, flipping, and gear-like nature of correlated side-chains; and 2) removing potentially spurious correlations that can arise due to structural alignment. The latter effect occurs because minimization of the r.m.s. error in aligning structures in Cartesian space can yield correlated displacements in many atoms' positions as some atoms are fit better than others. An overview of our approach is presented in Scheme 1.
In applying entropy and mutual information to studying allostery, we sought to obtain a measure of the statistical significance of our results and filter out noisy and artifactual correlations. To accomplish this, we extended established methods for calculating entropies and examining correlations via mutual information to handle finite sample sizes, to incorporate data from multiple simulations, to account for the variability between simulations, and to correct for the fact that multiple simulations do not, in practice, represent independent samples of the macromolecular ensemble.
The configurational space of a molecule can be described in a standard Cartesian coordinate system or in an internal coordinate system of bond lengths, bond angles, and torsion angles (BAT)43. For proteins, key torsion angles include the , ψ, and ω torsion angles of the protein backbone and the χ torsion angles of the amino acid side-chains. In the present work, we consider only the , ψ, and heavy-atom χ torsion angles (only the first χ angle for proline) and neglect changes in bond lengths, bond angles and omega backbone torsion angles, as we believe that the dynamics of the first three are the most relevant to describing motions of biological importance43.
Small sample sizes are notoriously challenging for entropy and mutual information-based approaches, so we use robust estimators and correct for bias using simulated data.
We wish to quantify correlations between residues' torsions. Following the works of Matsuda45 and of Killian, Kravitz, and Gilson43, we connect correlated torsions to thermodynamics using an expansion of the molecular configurational entropy into terms over single torsions, pairs of torsions, etc. The total torsional entropy is given by:
where indices i and j are residues' torsions, and n is the number of torsions (, ψ, and χ torsion angles in the present work). The second-order term here represents a sum of the mutual information of each pair of torsions. The mutual information describes correlations between degrees of freedom, and gives a measure of how much information about one degree of freedom is gained by knowledge about another46. Because the mutual information values are terms in the entropy, which is related to free energy, the mutual information in Eq. 1 is in units of kT. The mutual information has been a popular, distribution-free analysis method, and more recently has been used in the context of molecular conformational ensembles40, 43.
As an example, consider the distributions of the χ1 torsion angles for two side chains. For concreteness, we use an example from interleukin-2, which is described in greater detail below and involves two aromatic residues in close proximity (Fig. 1). The expected joint distribution of these torsion angles under the null hypothesis (Fig. 1A) of independence is merely the outer product of the marginal distributions. However, the joint distribution from the observed simulations (Fig. 1B) show that these torsion angles are correlated (I = 0.203 kT), because a cross-peak (indicated by a gray box) appears in the simulations that would not be expected if these torsions were independent.
In practice, we compute the mutual information, I, between two degrees of freedom as the difference between the self-entropies and the joint entropy, using the relation, I = S(1)+S(2)—S(1,2), and a corrected histogram entropy estimate47 over adaptive partitions46:
where r and s are the number of marginal bins, ni, nj, and nij are the histogram counts, N is the number of data points, and Ψ is the digamma function. Adaptive partitions make efficient use of discrete bins, preserve correlations between variables, and normalize each joint distribution to a reference distribution in which marginal counts are as uniform as discretization allows46. In this work we used 24 bins per dimension. Furthermore, adaptive partitions enable accurate mutual information values to be calculated whether torsional motions are large or small. Note also that we account for the two-fold symmetry in the χ2 angle of Asp, Phe, and Tyr and in the χ3 angle of Glu.
The histogram entropy estimator above assumes that histograms are populated by a Poisson process (nij N) and so is especially appropriate for sparse joint histograms. It also implicitly includes finite bin and data size corrections used in other discrete entropy estimators48. As a statistic for examining correlations between variables, the mutual information (with corrections discussed below) is far more robust against small sample sizes than the χ2 statistic, which assumes nij≥5. While the nearest-neighbor approach49 could have been used instead to compute these integrals, it can require N≈50,000 data points50 or more to yield a converged estimate of the mutual information for pairs of torsions. Nearest-neighbor approaches are accurate for very large datasets but have biases under finite sample sizes that depend on the topology of conformational space sampled in simulations50. As our goal was to extend mutual information calculations to handle smaller sample sizes, we chose instead to use adaptive partitioning in combination with the corrected histogram mutual information estimate above (Eq.2), so that each pair of degrees of freedom could be compared against the same empirically-generated reference distribution and evaluated for significance.
In a number of applications using mutual information, it has been found that samples of two variables that are independent can yield nonzero mutual information in calculations46, 51-53. We empirically observe the same in simulated data (data not shown), and this is not surprising because errors in estimates of the true mutual information are a consequence of finite samples. To correct for this, one approach is to create P permutations of the original data, so that the marginal probability distributions remain the same while correlations between the data are scrambled. One can use these permutations to establish a test of significance of the observed mutual information with a null hypothesis of independence versus a one-sided alternative. The approximate p-value is then the percentage of mutual information values from different permutations that are greater than the observed mutual information from the original data46, 53. Also, the average mutual information for the permuted data, the “independent information”, can be subtracted from the observed mutual information to yield the “excess mutual information”, a more reliable estimate of the true mutual information51, 52.
When adaptive partitioning is not used (and hence the marginal densities are not normalized), permutation approaches are inefficient in sampling the distribution of the mutual information under the null hypothesis, because permuted values are likely to fall into bins overrepresented in the marginal densities; adaptive partitioning fixes this inefficiency by normalizing marginal densities without altering correlations between variables. One can apply the permutation approach above to nearest-neighbor estimates as well, as these also will have bias due to finite sample sizes. For example, a combined K-fold resampling/permutation test was found to be useful in conjunction with nearest-neighbor mutual information estimates to discriminate between relevant and independent features in feature selection53. A major drawback to the permutation approach is that it is computationally demanding in processing time and in memory. Moreover, permutations introduce random error because not all N! permutations can be made53.
Instead, since adaptive partitioning is used in this work, we noted that the same distribution of the “independent information” is appropriate for all pairs of degrees of freedom. The distribution of the mutual information for independent variables for given data size N and number of marginal bins r and s has not yet been analytically solved, though in some cases can be empirically fit52. However, because an analytical, parametric approach is not available, we perform Monte Carlo sampling to obtain the reference distribution of the “independent information” for all pairs of torsions. With adaptive partitioning, the marginal counts are nearly uniform and in any case are equivalent for different pairs of torsions. Thus, all pairs of torsions will have the same distribution in histogram bin space under the null hypothesis of independence. To construct the reference distribution for a pair of independent torsions, we first make a copy of the marginal distributions for a given pair of torsions (it doesn't matter which pair we choose). Then, we choose ordered pairs of bin indices at random from these marginal distributions and place them into a 2-D histogram without replacement. The mutual information is calculated according to Eq. 2 above, and this procedure is repeated 1000 times to create a distribution of the mutual information under the null hypothesis of independence for the given number of datapoints N and number of bins r.
We use this distribution of “independent information” for a significance test of observed mutual information values, and we subtract the average “independent information” from the observed mutual information to yield the “excess” mutual information; this filters out insignificant mutual information values and corrects for finite sample size bias. Because this analysis empirically generates a distribution under the null hypothesis, the false positive rate for keeping a nonzero mutual information value for torsions that are truly independent is α, the significance level (in our case, 0.01). This false positive rate will be further reduced by consideration of the alternative hypothesis.
Most approaches that filter mutual information values using tests of statistical significance do so according to whether the null hypothesis of independence can be rejected using descriptive statistics. One disadvantage of these approaches is that they do not consider the distribution of the mutual information under the alternative hypothesis. In Bayesian statistics, the mutual information is a random variable with a distribution, and the probability that the mutual information is greater than a a given value can be calculated. Approximations to the distribution of the mutual information have been described that account for uncertainties in the estimates of the probability density functions54. The first two central moments of the distribution, the expectation E[I] and variance Var[I] of the true information given the data and prior, are given as follows:
where nij=nij(observed)+nij(virtual), and the virtual counts come from a noninformative Dirichlet prior (nij=1 for the uniform prior, which was used in this work). Approximations for the leading order terms for the third and fourth central moments have been reported54, and could be used in an Edgeworth expansion to approximate the distribution, but for robustness we chose instead to simply use a Gaussian with the above mean and variance, which fit reasonably well to simulated data in a model system54. We then use this approximate distribution of the mutual information to calculate P(I<E[Iind]), the probability that the true mutual information is below that expected for independent torsions (calculated using Eq. 3 averaged over 1000 simulated independent datasets). Pairs of torsions with P(I<E[Iind]) > α are not significant and are discarded.
To obtain accurate entropies and mutual information values up to second order, simulations must be run many times longer than the slowest autocorrelation and pair correlation times, and data points should represent independent observations. Due to limited computing power, this is rarely the case, and molecules in simulations carry with them some memory of their initial states. For example, consider a salt bridge. Salt bridges can form strong electrostatic interactions, and hence it can take a long time to sample their full conformational space (long autocorrelation time) and even longer to sample all populated pairwise conformations (long pair correlation times). Thus, a salt bridge may retain some memory of its initial conformation, which will fade away on the timescale of the pair decorrelation time (approximately). In practice, we decided to use data from multiple simulations to penalize this kind of undersampling in a novel way.
First, we first aggregate the counts for two degrees of freedom from a set of simulations (sample ensembles) of size nsims, and calculate the mutual information for all the simulations taken together. Intuitively, two torsions in different simulations should not be correlated, as they should sample their probability distributions independently. Any non-zero (excess) mutual information between these torsions is a measure of conformational undersampling bias that we can subtract from the mutual information between the torsions for the set of simulations. To correct the mutual information for artifactual correlations due to incomplete sampling, we calculate the excess mutual information and then subtract the average excess mutual information between two degrees of freedom in different pairs of simulations (when it is positive):
Here i and j correspond to the different torsions, l and k are the indices of the pairs of different simulations, and Iind is calculated for a pair of simulations just as the independent information is calculated for a set of simulations using the Monte Carlo recipe above, except that values of <Iind> lower than the standard deviation of Iind are zeroed to reduce noise from this term. For the mutual information between torsions in different simulations, we use half as many bins (r′ = r/2), because the number of datapoints N′ for the histograms is smaller than the total number of datapoints from all simulations, N (N′=N/nsims). Significant mutual information values are those that have passed the significance test vs. the null hypothesis, the Bayesian filtering using the alternative hypothesis, and whose corrected excess mutual information (eq. 5) is greater than zero.
When we consider the mutual information between pairs of residues, we take the sum of the mutual information between pairs of residues' torsions:
This may overestimate the total mutual information between two residues, as we neglect the higher-order terms in Eq. 1. Inclusion of statistically significant higher order terms (which would require more data points) would further increase the accuracy of the calculated mutual information between pairs of residues. Nonetheless, our results below show that our robust use of second-order terms is a powerful means to identify residues with correlated conformations.
Molecular dynamics simulations on interleukin-2 (IL-2), alone and in complex with three ligands, were performed using GROMACS 3.355, 56 and the AMBER-99 forcefield57. Loops missing atomic coordinates, such as residues 1-5, 75-76, and 99-102 in apo IL-2, were closed using loop prediction via the Protein Local Optimization Program (PLOP58). Protonation states of histidine side-chains at pH 7 were given by MCCE59, 60: we modelled His16 as positively charged (residue name “HIP”) and His55 and His79 as ε-protonated. Two ligand-bound forms of IL-2 were prepared, with either Ro26-4550 (amino(3-(2-(1-methoxy-1-oxo-3-(4-(phenylethynyl)phenyl)propan-2-ylamino)-2-oxoethyl)piperidin-1-yl)methaniminium)61, 62 bound to the competitive IL-2Rα site (PDB:1M48), or compound 7c (1-(3,4-dihydro-1H-pyrido[3,4-b]indol-2(9H)-yl)-2-methoxyethanone) bound to the allosteric site63. Compound 7c was built from PDB:1NBP by modifying the covalent compound 7t in the crystal structure to the noncovalent compound 7c in Maestro (Schrodinger, 2007), then using PLOP loop prediction to optimize the loop from residue 29 to 33, and to fill in missing residues between residues 73 to 78, in each case simultaneously optimizing side-chains within 12 Å of the given loop region. These ligands were parametrized for MD by GAFF64 and assigned AM1-BCC charges65, 66. Each apo protein or complex was energy minimized in explicit solvent using GROMACS, and five copies of each of the three constructs (apo, competitive site bound, and allosteric site bound) were equilibrated at 300K (with different random seeds) using constant volume for 10 ps and using a constant pressure of 1 atm for 100 ps using the Berendsen barostat67, with hydrogens constrained using the Lincs algorithm68. Equilibration of each simulation was followed by a 10 ns production run, with snapshots of the atomic coordinates recorded every 1 ps. Actual lengths of individual simulations ranged between 9.6ns and 11ns for technical reasons. RMSDs of the two compound binding sites over the simulations showed that all of the five simulations per apo or small-molecule bound construct were stable (Supplemental Figure 1).
We clustered MD snapshots from the IL-2 simulations with allosteric compound bound according to the coordinates of residues in the competitive site using QT clustering69 as provided in GROMACS (“g_cluster -method gromos”). Then, to each cluster representative we docked the Roche competitive site ligand Ro26-455061, which binds cooperatively with the allosteric ligand. This ensemble docking was performed using the XGlide cross-docking script (Schrodinger, 2007, Script Center XGlide v. 126.96.36.199, mmshare v. 16109, using inner and outer grid box lengths of 10Å and 18Å, respectively)70.
We applied our MutInf approach to elucidate the mechanism of small molecule binding cooperativity in human interleukin-2. Little is known about how binding of ligand at the IL-2Rα binding site enables binding of a small molecule fragment to a cryptic allosteric site. These ligands bind at least 6.9 Å apart at their closest approach in the predicted ternary complex. Crystal structures of complexes of interleukin-2 with small molecules bound to different sites did not show substantial structural changes at the other sites (maximum Cα RMSD 0.88 Å at the allosteric site for apo PDB:1M47 and competitive-bound PDB:1M48). We therefore hypothesized that allostery and cooperativity in IL-2 arises largely from changes in dynamics and subtle population shifts rather than a major change in the preferred backbone conformation28, 38, 39.
We used molecular dynamics to study correlated motions at the atomic level on a picosecond-nanosecond timescale, and used our MutInf approach to analyze sets of 10 nanosecond trajectories of human interleukin-2, alone and in complex with different small molecule binding partners. Our goal was to show that MutInf can identify significant correlated conformations for functionally-important residues in simulations whose lengths and recording frequencies are typical of those in the current literature.
We first analyzed whether an unbiased, whole-protein analysis of correlations between residues in interleukin-2 would be able to identify cooperative sites and the correlations between them from the apo simulations alone. For each pair of residues, we calculate the mutual information as per (Eq. 6) between all pairs of , ψ, and χ torsion angles for our apo simulations of interleukin-2. The mutual information is reported in units of kT, because of the relationship between mutual information and entropy (Eq. 1).
When we plotted the statistically significant mutual information between pairs of residues' torsions in IL-2, we found that only a small subset of residue pairs are highly correlated, while many are only marginally correlated (Fig. 2A). A substantial part of the present work involved incorporating more robust entropy estimators for calculating the mutual information and filtering out insignificant correlations with the help of empirical or approximated distributions under the null and alternative hypotheses. So, as a control, we plotted the unfiltered mutual information between residues' torsions in Fig. 2B. Protocols with and without statistical filtering showed correlation between residues in the loops after helix 1 and between helices 2 and 3 (Fig. 2A and 2B, red boxes on the diagonal and off the diagonal, respectively, and Fig. 2C, red residues). Our statistical filtering, however, highlights these and removes background noise; only a subset of the pairwise correlations between residues in these regions make statistically significant contributions to the conformational entropy. Moreover, our MutInf approach identified significant correlations between residues in the loop region between helices 3 and 4 and residues in other regions, in particular residues in the floppy N-terminal tail (Ser6, Thr7) and the beginning of the N-terminal helix (11-15), residues in the loop between helices 2 and 3, and residues in the C-terminal helix (Fig. 2A, blue boxes, and Fig. 2C, blue residues). The loop between helices 2 and 3 displays significant variability in the different crystal structures of IL-2, and is at least partially disordered in most structures, indicating both that it is flexible and that it can adopt at least several conformations. Residues showing significant correlations near the C-terminus include two residues in the loop before the C-helix (Glu100 and Thr101), and residues along the C-helix (Arg120, Ile128, and Leu132, proximal to IL-2's negatively-charged C-terminus).
We compared our MutInf method to previously-reported methods for identifying correlated motions, in particular the Gaussian Network Model (GNM) approach of Bahar and colleagues71 and the Cartesian mutual information method of Lange and Grubmüller34. Both our method and the GNM method suggested correlation between residues in the loop after helix 1 and residues in the loop between helices 2 and 3 (red boxes on the diagonal and off the diagonal, respectively, Fig. 2A, ,3A).3A). We found that our approach highlighted strong correlations and gave low background noise, while the Cα cross-correlation matrix using a GNM (with 10 Å Cα-Cα cutoff) gave a noisier pattern of correlations, as did Lange and Grubmüller's mutual information method applied to the Cartesian coordinates of Cα atoms (Fig. 3B).
Our identification of significant long-range correlations led us to investigate the distribution of correlations between residue pairs as a function of distance between the residues' Cα atoms (Fig. 4). As would be expected, the number of weak correlations decreases as the distance between residues increases. However, residues separated by substantial distances have correlations of 0.4 kT or more just as often as residues separated by short distances.
Looking more closely, we see that there are strong couplings between pairs of distant residues (Fig. 5). Here, we highlight correlations of magnitude greater than kT between distant residues, namely those with alpha carbon separations of more than 5 Å. Again, we observe strong couplings between the N-terminus of helix A, Tyr31 at the C-terminal end of helix A, and the adaptive loop region (74, 76-78), as well as between Gln74 and Glu100, and between Glu100 and Ile128 near IL-2's C-terminus. It is not surprising that many of these residues are polar, as molecular dynamics simulations include terms for long-ranged electrostatic and ion-dipole interactions. Gaussian Network Models, on the other hand, do not model such sequence-dependent, long-range interactions, and so it is not surprising that the correlations between distant residues are typically weak. Electrostatic interactions can be both directly and indirectly responsible for long-range correlations in residues' conformations; directly, through Coulomb's law, and indirectly, through a dynamic network of charged and hydrogen-bonding polar residues, and through altering the first-shell water structure around the protein (in simulations with explicit solvent). Unlike charge-dipole or dipole-dipole interactions, where the effective range decreases through averaging over orientations72, charge-charge interactions retain their long-range nature even when averaged over orientations.
Other factors that can give rise to correlated conformations include hydrophobic packing and rigid-body motions of semi-rigid secondary structure elements, such as α-helices. We note that our approach does not typically show strong correlations within semi-rigid elements such as α-helices or the central hydrophobic core of the four-helix bundle. Two possible reasons for this are because mutual information values are not normalized quantities and because higher-order correlations are not captured. As the maximum mutual information between two residues is the minimum of their self-entropies (flexibilities), residues that have higher self-entropies (more flexible) can exhibit a greater magnitude of coupling with other residues. This behavior is thermodynamically appropriate because the un-normalized mutual information is related to the configurational entropy (Eq. 1), and not a normalized quantity. Furthermore, this behavior is consistent with thermodynamic considerations in intrinsically disordered proteins, where disorder in one or both domains serves to optimize allosteric coupling between the sites73. Such allosteric coupling does not require a network of interactions linking the sites. In the present study on interleukin-2, flexible resides at either end of a helix showed couplings > kT in some cases while intervening helical residues did not (Fig. 5), because mutual information values, unlike correlation coefficients, are not normalized for residues' self-fluctuations. We note that even normalized pairwise correlations (i.e. Fig. 3A) do not include the higher-order correlations that would be expected within semi-rigid elements, and while the Gaussian Normal Mode results (with default cutoffs) show weak couplings between the N-terminus of helix A and Tyr31 at the C-terminal end of helix A, and between the adaptive loop region (74, 76-78) and Glu100, these are not visibly distinct features (Fig. 3A). In any case, as the goal of our approach is to identify correlations between the conformations of functional sites on a protein, it is not necessary to identify all of the residues that indirectly mediate such correlations, though this is an area for future work. We will later focus on coupling between two small molecule binding sites in interleukin-2, which are physically connected by flexible loops and sidechains, rather than semi-rigid secondary structure elements.
For a site to be suitable for allosteric inhibitor design, it must be both (1) allosteric, causing shape or flexibility changes at other sites1, and (2) druggable, having the right shape and hydrophobicity for drug-like small molecules74. Here, our goal was to predict which sites were most likely to alter structure or dynamics at known functional sites upon perturbation (by ligand binding, for example). To search for such sites, we wanted to identify groups of residues responsible for correlations between functional sites.
In biological networks such as protein-protein interaction networks, functional connections between various proteins are preferentially mediated by “hubs” that interact with a greater than average number of partners75. Similarly, functional connections between various protein sites are thought to be mediated by “hub” residues or clusters of residues25, 27. We hypothesized that clusters of residues correlated to many other residues, i.e. “dynamical hotspots”, could be potential sites or mediators for allosteric modulation of other sites. To find such “dynamical hotspots”, we performed a hierarchical clustering of the matrix of mutual information values between residues, in analogy to the analysis of microarray data, using the “heatmap” function in R (http://www.r-project.org/). We used a Euclidean distance metric so that residues showing similar patterns of correlations with other residues are clustered together. Interestingly, one cluster of residues emerged with the strongest correlations within cluster members and the strongest correlations to other residues, and was previously found to be an adaptive region that could bind a number of small-molecule fragments as measured by Tethering experiments62. Residues in this cluster are colored red in Fig. 6 and mostly reside in the flexible loop between helices 2 and 3, with two in the N-terminal floppy tail and one in the flexible C-terminal loop. Because the mutual information between two torsions is less than either of their self-entropies, it is not surprising that flexible residues often have high mutual information with other residues. This red cluster constitutes a “dynamic hotspot”, as it is highly correlated to other clusters of residues. Furthermore, as this red cluster is correlated to the blue cluster containing the IL-2Rα inhibitor binding site, our method predicts the red cluster to be a candidate region for allosteric modulation of the IL-2Rα site. Two similar clusters can also be seen when mutual information values from subsets of the five simulations are block-averaged (Supplemental Figure 2). However, our approach does not yet predict whether such a site would be druggable by small-molecule allosteric modulators or contain “hotspots” of affinity for protein-protein interactions.
While direct experimental methods to identify correlated motions by NMR are limited, one can use chemical shift perturbations and changes in side-chain order parameters of residues outside a binding pocket to identify population shifts in residues within or proximal to allosteric sites that accompany ligand binding. A study by scientists at Roche identified IL-2 residues showing backbone and side-chain amide chemical shift perturbations upon binding IL-2Rα receptor or IL-2Rα-competitive small molecule61. For example, Tyr31, Gln74 and Ser75 (in the strong cluster colored “red” in Fig. 6) show strong 15N/1H chemical shift perturbations following competitive ligand binding, though these residues are not in the competitive binding site. While ring current effects from the ligand's biphenyl group could contribute to these shift perturbations, they are qualitatively consistent with our prediction that these residues' conformations are correlated to the conformation of the IL-2Rα binding site. Furthermore, several residues or regions distal from the IL-2/IL-2Rα interface identified by our approach as highly correlated to the “blue” cluster (encompassing many residues in the IL-2/IL-2Rα interface, PDB: 1Z9276) showed substantial chemical shift perturbations upon IL-2Rα binding (Fig. 7). Unfortunately, resonance overlap restricted the analysis of chemical shift perturbations, notably in most of the flexible loop in the red cluster, so we cannot test our prediction that one would see many perturbations in the flexible loop. It is important to note that none of the nine residues showing insignificant chemical shift perturbations (Asn26, Thr37, Met104, Cys105, Tyr107, Thr113, Ile122)61 appeared in the red or blue highly correlated clusters.
In the previous sections we used a global description of pairwise couplings to identify putative allosteric sites from correlations between clusters of residues. Presently, we apply our method to study the mechanism of coupling between two given allosteric sites. From our matrix of pairwise correlations between residues in apo-IL2 and representative structures from the conformational ensemble, we could infer a structural mechanism by which the two small molecule binding sites might be coupled. In Fig. 8 we show matrix elements corresponding to residues in the IL-2Rα-competitive site and residues in the allosteric site, along with representative conformations of these residues. These representative conformers were picked by clustering the MD snapshots according to the RMSD of residues in the “red” cluster that belonged to the highly flexible loop or were proximal to the N-terminus.
Thermodynamically, the two sites are coupled directly by the off-diagonal gray matrix elements in Fig. 8 in the box denoting “Correlations Between Sites”. These two sites may also be coupled by higher-order terms involving other residues, which our pairwise analysis does not address (save for the hierarchical clustering which uses patterns of correlation rather than the correlations themselves). From the representative conformations for these residues in Fig. 8, the two sites appear to be coupled via a polar network on the protein surface and a greasy core. Two residues are common to both binding sites, namely Lys35 and Met39. The side-chain of Met39, for example, can directly interact with both of the cooperative small molecules, and will be discussed in more detail later. A number of polar side-chains pointing toward the solvent form a network of hydrogen-bonding and electrostatic interactions. In particular, Gln74 (dark green lines) samples a wide swath of conformations, sometimes hydrogen bonding with basic residues in the competitive site (Lys35, dark blue, and Arg38, light blue), and other times hydrogen bonding with basic Arg81 (gray) near the allosteric site. Also, correlations between residues in the greasy core connect hydrophobic surfaces of both compound binding sites; when bound, the allosteric or competitive small molecule would be contiguous with this hydrophobic network. Notably, the matrix elements showing “Correlations Between Sites” indicate that the conformations of residues in the polar network and greasy core are coupled. Thus, a more accurate mechanism for the coupling would be that the two sites are coupled via a polar network, a greasy core, and crosstalk between these elements.
The preceding analysis suggests that the two sites are connected via correlated motions, and that this could explain the observed allostery. To directly test our hypothesis that the experimentally observed cooperativity between the sites involves subtle population shifts in residues exhibiting correlated motions, we performed additional simulations with a competitive or an allosteric inhibitor bound, and asked whether simulations with the allosteric inhibitor bound would cause population shifts in the competitive site similar to those observed in simulations with the competitive inhibitor. Comparing the crystal structures of apo IL-2 (PDB:1M47) to competitive-site-inhibited IL-2 (PDB:1M48), we note that the motion of two side chains, Met39 and Phe42, opened up a binding groove for the ligand that was closed in the apo structure.
We then asked whether population shifts caused by allosteric ligand binding would help open up the competitive site for competitive ligand binding. To address this question, we examined side-chain torsion angle distributions for Phe42 and Met39 in simulations with compound at either site and compared these to distributions from the apo simulations (using histogram bins of 6 degrees and 10 ps time intervals). The populations of side chain dihedrals angles in Phe42 and Met39 show substantial differences between the apo protein and the protein bound with competitive and allosteric inhibitors (Fig. 9, Fig. 10). Interestingly, the populations observed for the allosteric compound-bound protein are intermediate between apo IL-2 and competitive-site-bound IL-2. Phe42, a hot-spot residue critical for ligand binding and protein binding, adopts a different χ1 rotamer for ligand binding than it does for protein binding, which is more similar to the apo rotamer (Fig. 9). The χ1 rotamer selected by ligand in competitive-site-bound simulations (100% population) was more populated (89%) in allosteric-bound simulations than in apo simulations (39%), showing a population shift caused by allosteric compound binding.
We predict that Met39 is an important mediator of binding cooperativity because its conformation is correlated to that of Phe42 and because it shows χ1 and χ2 population shifts upon allosteric compound binding in the same direction as population shifts from the apo to competitive inhibitor-bound distributions. In crystal structures, Met39 adopts similar conformations in complexes with competitive inhibitor or allosteric inhibitor that both differ from the conformation in the apo structure. Met39 is in an “up” conformation in the apo structure, packed against hot-spot residue Phe42. In competitive inhibitor-bound structures, the side-chain of this Met moves down to slightly enlarge the pocket for a ligand aromatic ring, while in the allosteric-bound structure, the Met sidechain moves down to interact weakly with the ligand and fill in part of the hydrophobic pocket opened to accommodate the ligand (Fig. 10). Interestingly, this Met39 is not critical for a high-affinity competitive ligand to bind at the competitive site78, presumably because mutating it to alanine would simply make that hydrophobic pocket a little larger. However, it is currently unknown whether this residue is required for allosteric ligand binding or for the binding cooperativity, as we predict. Our calculations indicate that cross-talk contributing to cooperativity involves not only the greasy core (of which Met39 and Phe42 are a part), but also the loop between helices 3 and 4 and a polar network involving a number of basic residues on the protein surface. This hypotheses could be further tested by conservative mutations of residues such as Gln74, which is not part of either ligand's binding site; Lys35, whose alkyl tail but not polar head contacts compound 1 in the crystal structure; or Met39, whose alanine mutation only shows a slight effect on competitive ligand binding79
The preceding analysis suggests that binding at the allosteric site can positively modulate binding at the competitive site through changing the distribution of side chain rotamers. To more directly assess the relationship between these relatively subtle changes and ligand binding, we have performed small-molecule docking against snapshots from the MD simulations (Fig. 11). Although the scores from docking to MD snapshots cannot be interpreted as accurate binding affinities, we use them as way of qualitatively assessing whether the conformation of the site is appropriate for binding the competitive inhibitor. Because this site consists of many flexible side-chains, RMSD of the competitive binding site residues to the complexed crystal structure was not an appropriate measure of whether the competitive site's conformation was favorable for competitive ligand binding. We found that the best-scoring docked pose roughly superimposes with the crystal ligand (2.8 Å RMSD without fitting, 1.4 Å RMSD with rotational/translational fitting, Fig. 11C). The cluster of MD snapshots with the best-scoring docked ligand represented 0.033% of total snapshots. Thus, our relatively short molecular dynamics simulations sampled conformers suitable for binding competitive site ligand at 300K in the presence of allosteric ligand, enabling us to create a model of the ternary complex (Fig. 11C).
We have reported novel improvements to mutual information calculations that make them robust enough for relatively short molecular dynamics simulations and have applied our MutInf method to interrogate the mechanism of small molecule binding cooperativity in human interleukin-2. We found better separation of signal from noise in our matrix of correlations between pairs of torsions than in similar matrices that examined backbone Cα correlations. We identified not only local correlations in sequence and in distance space but also long-range correlations. Clustering the matrix of mutual information between residues, we identified a few clusters whose residues showed strong patterns of correlations. Two of these clusters highlighted key functional sites, namely the IL-2Rα-competitive protein interface/inhibitor binding site and a highly flexible loop that has to move to reveal a cryptic binding pocket for the allosteric ligand. Furthermore, we found that the conformations of a number of pairs of residues in these two functional sites were strongly correlated.
As MutInf identified known cooperative binding or functional sites within the top clusters and correlations between them, we believe that this approach will be useful in identifying novel allosteric sites for proteins or for small molecules. For example, we predict potential allostery between the flexible loop surrounding the allosteric compound binding site and the N-terminus of helix 1, the loop region around Glu100, and the C-terminus. Our prediction is further supported by the observation that all of these regions showed significant backbone NMR chemical shift perturbations upon binding of the IL-2Rα receptor61. However, the biological roles of these regions are not clear. The C-terminus of IL-2 interacts weakly with the γc receptor75, 80, 81 (KD ~0.7 mM) and independently of IL-2Rα binding. NMR chemical shift perturbations and isoelectric point changes upon addition of methionine to IL-2's N-terminus suggested a potential interaction between the N- and C-termini of apo IL-2 in solution82. Intriguingly, Thr3 is a site on human IL-2 that is variably glycosylated83; in mice, the N-terminus is longer and displays substantial sequence and glycosylation pattern variability, which in turn impacts IL-2's function in Type I diabetes in mouse models84, 85. An important caveat is that correlated motions are necessary but not sufficient for biologically-relevant allostery.
Though our statistical filtering enables us to find significant correlations in relatively short simulations, in applications where accurate total conformational entropy is desired, multiple longer simulations may be required to obtain absolute total entropy values that all converge to the same value, and higher-order terms might be needed. Nonetheless, our approach is useful in determining which residues or groups of residues show correlated conformations and which residues may mediate crosstalk between functional sites. One caveat is that MutInf focuses on coupled residue conformations rather than vibrations, and so we may not efficiently capture the role of semi-rigid elements in mediating correlations between more flexible sites. Though we did not look at motions faster than 1 ps, these are likely not as critical for ligand binding cooperativity in interleukin-2, where the residues linking the sites are primarily in flexible loops and have flexible side-chains. In general, however, such faster-timescale motions can help mediate cooperativity between more flexible sites, and so future work is needed to properly account for these in our approach.
Our calculations suggest that small molecule binding cooperativity in human interleukin-2 involves subtle population shifts and correlated conformations of two binding pockets coupled through a greasy core and a solvent-exposed polar network. New biophysical techniques to directly measure correlated motions by NMR would be useful in testing our predictions about correlated motions that couple allosteric sites.
The authors acknowledge J. Wells, M. Gilson, J. Gross, M. Arkin, H. Li, I. Kuntz, M. Kelly, J. Gross, and R. Abel for helpful discussions. The authors also acknowledge M. Hutter for assistance with his Bayesian approach. GF thanks T. Kortemme (UCSF) for support, and DLM and HA thank Ken Dill (UCSF) for support. CLM thanks J. Nilmeier for inspiration and gratefully acknowledges funding for this work from the UCSF integrated Program in Quantitative Biology fellowship and the UCSF Cancer Research Coordinating Committee. MPJ is a consultant to Schrodinger LLC.
Christopher L. McClendon, University of California San Francisco, Graduate Group in Biophysics and Department of Pharmaceutical Chemistry.
Gregory Friedland, California Institute for Quantitative Biosciences, University of California San Francisco, Graduate Group in Biophysics and Dept. of Bioengineering and Therapeutic Sciences; Sandia National Laboratories, Joint BioEnergy Institute; University of California San Francisco, Department of Pharmaceutical Chemistry.
David L. Mobley, University of California San Francisco, Department of Pharmaceutical Chemistry; University of New Orleans, Department of Chemistry; University of New Orleans, Department of Chemistry.
Homeira Amirkhani, Sharif University of Technology, Department of Physics.
Matthew P. Jacobson, University of California San Francisco, Pharmaceutical Chemistry.