|Home | About | Journals | Submit | Contact Us | Français|
We describe a method for determining diffusion-controlled rate constants for protein-protein association that explicitly includes the solution pH. The method combines the transient-complex theory for computing electrostatically enhanced association rates with an approach based on a rigorous thermodynamic cycle and partition functions for energy levels characterizing protonation states of associating proteins and their complexes. To test our method we determine the pH-dependent kinetics of association of the HyHEL-5 antibody with its antigen hen egg white lysozyme. It was shown experimentally that their association rate constant depends on pH, increasing linearly in the pH range 6–8 and saturating or even exhibiting a flat maximum in the pH range 8–10. The presented methodology leads to a qualitative agreement with the experimental data. Our approach allows one to study diffusion controlled protein-protein association under different pH conditions by taking into account the ensembles of protonation states rather than just the most probable protonation state of each protein.
All proteins contain ionizable groups that exchange protons with their environment. In solution, proteins fluctuate between available protonation states whose distribution is specific to a given pH. Protein ionizable groups are involved in intra-protein, protein-solvent, protein-protein and protein-ligand interactions influencing protein solubility, folding and catalytic activity.1 A proton transfer between ionizable groups and the surrounding solvent is a process with a pH-dependent equilibrium. To quantitatively describe such protonation equilibria in proteins, pKas of ionizable groups are often used. The pKa of a specific ionizable group in a single protein or in a protein-protein complex can be significantly shifted relative to the pKa of the corresponding isolated group in solution.2,3 Also, shifts in pKas between native and denatured states of proteins or between unbound and complexed forms are a source of the pH dependence of macromolecular stability or binding.4 As the exchange of protons modifies protein charge density, protein electrostatic properties are strongly related to the pH. Therefore, taking into account the protonation equilibria in proteins is essential for modeling their pH-dependent properties and functioning.
Association of proteins with each other and with other biomolecules is an important step in many biological processes. pH-dependent protonation equilibria determine both local charge distributions and protein net charges influencing protein-protein interactions and the kinetics of binding. Uptake or release of protons by protein ionizable groups can accompany their association and cause the association rate constant to be dependent on pH.
Effects of pH on protein association have not been widely studied. Standard theoretical techniques used to estimate rates of protein-protein association (such as Brownian dynamics) only partially take into account the pH effects;5,6 associating proteins are assigned single protonation states which are the most probable ones at a given pH either for isolated molecules or for their complexes. The facts that each protein exists as an ensemble of protonation states (with a number of different protonation states readily accessible by thermal agitations if their energies are sufficiently close to the energy of the most probable state) and that the protonation preferences of each of the associating proteins can be changed upon binding are neglected. Additionally, the pH at which the association occurs is not explicitly included during computations.
where ka and kao are, respectively, the association rates in the presence and absence of electrostatic interactions, < Eel >* is the average electrostatic interaction energy in a transient-complex ensemble that separates the bound and unbound states of proteins and is computed with the Poisson-Boltzmann (PB) model,10 and kBT is the thermal energy. This approach was successfully applied to study the influence of ionic strength and mutations on protein-protein and protein-RNA association kinetics.7–9 In this study we extend this model by explicitly introducing the protonation degrees of freedom and the solution pH. To evaluate Eel we simultaneously use the partition functions for the protonation states of proteins (both isolated and complexed) and the thermodynamic cycle, providing a common reference level for the computed energies.
Even though protonation equilibria are crucial for many biological processes, the amount of available experimental data of protein-protein association kinetics under varying pH is limited. We chose to test our approach on a pair of proteins: the anti-hen egg lysozyme monoclonal antibody (HyHEL-5) and its antigen, hen egg lysozyme (HEL) (Figure 1). The large number of ionizable groups in both proteins and extensive titrations of mutually influencing residues makes the chosen system an interesting and challenging test case. The kinetics of association of these proteins under various conditions was previously studied by stopped-flow fluorescence polarization.12 It was found that the association rate of HyHEL-5/HEL complex varies with pH in the range 6.0–10.0. We determined the association rate constant in the HyHEL-5/HEL system using both the new method and the standard one. We show that including the protonation degrees of freedom and pH while evaluating the electrostatic interaction energies leads to significantly better agreement with experiment than the standard methodology.
The stereospecific binding of two proteins can be described as a consecutive two step process. First, proteins (A and B) approach each other via translational and rotational diffusion and reach a region in configurational space where their separations and orientations are close to those observed in the final complex. Such configurations of proteins constitute the transient-complex ensemble (C*). Second, the conformational rearrangement of proteins leads to the stereospecific final complex (C):7
The overall association rate is:
If the process is diffusion-controlled, kb>>k−a and konka approximation holds. In the transient-complex theory, the transient-complex ensemble represents configurations that are as close to the native complex as possible without the need to consider short range interactions and conformational rearrangements. If the two proteins in the transient-complex are severely constrained in relative translation and rotation, the diffusion controlled rate constant to reach the transient complex, ka is given with eq 1.
The transient-complex ensemble is located at the outer boundary of the native-complex interaction energy well. The procedure to generate and identify the transient-complex ensemble follows that described previously.8,9 Proteins, in a configuration as in the native complex, are treated as rigid bodies with only six degrees of freedom: three for relative translation and three for relative rotation. The larger of the two proteins, HyHEL-5 is fixed in the laboratory frame. HEL translational and rotational coordinates are defined relative to the native-complex configuration built from the X-ray structure. Translation is represented in spherical coordinates by a relative displacement vector (ρ, θ, ϕ). Its norm, ρ, measures the separation between the surfaces of the two proteins. Rotation is represented by three Euler angles (α, β, γ) of the HEL-fixed frame, relative to the laboratory frame. The α angle describes the rotation around a HEL-fixed unit vector, β and γ are polar and azimuthal angles of the HEL-fixed vector in the laboratory frame. Coordinates are sampled from ρ [ρo, ρo + 6] (where ρo denotes the distance measured in the X-ray structure), θ [0, 180°], ϕ [0, 360°], α [−180°, 180°], β [−180°, 180°] and γ [0, 180°].
We generate random configurations of the two proteins by sampling the global coordinates (ρ,θ,ϕ,α,β,γ) from uniform distributions, so that all configurations are equally probable. Only the non-overlapping configurations are accepted and stored for further analysis. The overlaps are detected by testing all atoms of HyHEL-5 against all atoms of HEL. The distances between three classes of atoms are considered: hydrogens, polar (oxygens and nitrogens) and non-polar (carbons and other atoms). We assume that an overlap between atoms within the same or different classes occurs when their distance in the generated configuration is smaller than the minimum distance of such a pair in the X-ray native structure.
Non-overlapping configurations and contacts (native and non-native) are used to locate the boundary of the native-complex interaction energy well, as previously described.8,9 We choose a number of distinct atom pairs belonging to the interface between proteins and these native pairs form a list used to define contacts. Each atom is assigned a contact radius computed as half of the distance from its contact partner in the native structure of the complex. Native contacts occur if for a given non-overlapping configuration the distance between a native pair of atoms is smaller than the sum of its native distance plus 3.5Å. Non-native contacts are defined when any of the two atoms not forming a native pair are within the distance smaller than the sum of their contact radii plus 2.5Å. All steps of the procedure leading to the transient-complex ensemble were performed with a home-made software.
As stated above, the transient-complex ensemble separates the bound and unbound states of proteins. Its location can be identified by measuring the translational and rotational freedom of proteins in non-overlapping configurations. This can be done using σα(N), the standard deviation of the α angle among all non-overlapping configurations for a contact level N,8,9 or Ξ, the difference between σα(N) and its average at all contact levels lower than N. The maximum of Ξ coincides with the transient complex. The level of contacts characteristic for the transient-complex ensemble can be therefore identified by examining Ξ as a function of N:8,9
In order to compute Eel that includes protonation degrees of freedom, we use the previously described thermodynamic cycle.13 The electrostatic interaction energy for each configuration of proteins is:
where and are electrostatic free energies of isolated proteins and of their configuration taken from the transient-complex ensemble. These terms explicitly include the solution pH as described further on. The last term, , computed for molecules with all ionizable groups of proteins in their neutral states, is pH-independent
is evaluated using the traditional PB approach where the solute is represented as a low-dielectric region with NA, NB and NA+NB partial atomic charges located at positions of atomic nuclei. The solute is surrounded by a high-dielectric continuum solvent containing dissolved electrolyte. The distribution of mobile charges (ions) in solvent is described according to the Boltzmann law. Dielectric polarizability is included implicitly by dielectric constants of the media. In eq 6, ϕi is the electrostatic potential at the location of the charge qi computed as a sum of Coulombic potentials due to all charges except qi and the total reaction field at the location of the charge qi resulting from the solute-solvent dielectric boundary.
The electrostatic free energy of a molecule with M ionizable groups can be computed as:
where Gm is the free energy level associated with the mth protonation vector (where is either 0 or 1) and R is the gas constant. At a given solution pH and temperature, the free energy of the molecular state characterized by the mth protonation vector is:14
where is the negative decimal logarithm of the acidic dissociation constant of the isolated ith ionizable group (known from experiments or quantum mechanical computations for small model compounds), ΔΔGii is the free energy change for ionizing protein’s ith site when all other ionizable groups are electrically neutral relative to the analogous change in the model compound, is the interaction energy between ionizable groups i and j in their ionized (charged) state. Zero energy state is defined with all ionizable groups neutral. γi is equal to −1 for acidic and +1 for basic residues. Terms ΔΔGii and can be evaluated using the PB method as described elsewhere.15,16
It was shown before13 that it is possible to accurately compute the electrostatic free energy of a molecule with M ionizable groups without including all 2M energy levels. Moreover, to determine the electrostatic free energy with accuracy better than only the energy levels from the lowest one (i. e., the most probable) G1 up to RT·ln(2)·M (L states) above has to be known:
Computing of at a given pH, separately for HEL, HyHEL-5 and their transient-complex configurations, together with allows one to determine pH-dependent electrostatic interaction energy defined by eq 5.
For comparison we also considered a "traditional" or standard approach where one calculates Eel for each configuration of the transient-complex ensemble assigning protonation states consistent with the lowest energy state determined for selected pH and for the X-ray structure of the complex:
where indices 1 and X denote that only the lowest protonation state of the X-ray native complex is considered.
The coordinates of the HyHEL-5/HEL complex (Figure 1) determined at 1.7Å resolution were taken from the Protein Data Bank (accession code 1YQV17). To obtain fully protonated molecules hydrogen atoms were added and their positions were optimized using the GROMACS package.18 Partial charges and radii were assigned to atoms according to the OPLS-AA force field.19
Total number of 3.8·107 random configurations of HyHEL-5 and HEL was generated. All these configurations were tested for clashes and 417894 non-overlapping ones were accumulated. By analyzing contacts in these non-overlapping configurations the transient-complex ensemble was characterized as described above.
Terms ΔΔGii and (eq 8) for HEL, HyHEL-5 and each of their transient-complex configurations were evaluated using MEAD (Macroscopic Electrostatics with Atomic Detail).15 For each configuration the lowest energy protonation states of proteins at a given pH and temperature were determined using the modified Monte Carlo software of Antosiewicz.20 To evaluate up to 106 lowest energy states were registered. The net charges of molecules as a function of pH were calculated with the HYBRID21 software. Calculations were performed for isolated HEL and HyHEL-5 and for their complex in a single configuration as in the X-ray native structure. MEAD-calculated terms ΔΔGii and were used as an input for HYBRID. The net charges were computed as averages over ensembles of protonation states evaluated at given pH.
and for each of the transient-complex configurations were computed with APBS22 by solving the PB equation on the 225Å×225Å×289Å grid with final spacing of 0.48Å. All electrostatic computations were performed at 150mM ionic strength and temperature of 298.15K. Dielectric constants were set to 4 for proteins and 78.54 for the solvent (water). Solute-solvent dielectric boundary was defined as the van der Waals surface. It was previously shown that for oppositely charged molecules the PB calculations with the van der Waals surface definition of a dielectric boundary give negative electrostatic interaction energies but with the solvent accessible surface definition the electrostatic interaction energies can be positive.23
The transient complex is located at the outer boundary of the bound state. As described in the Methods section a unique definition of the transient-complex ensemble is possible using σα (the standard deviation of the rotation angle α) that measures the degree of rotational freedom in the set of generated non-overlapping configurations. Figure 2 plots the number of contacts (N) registered in 417894 configurations of HyHEL-5/HEL that passed the clash test versus σα computed for a given contact level. The Ξ function defined in eq 4, also presented in Figure 2, shows a maximum for 39 contacts.
The number of configurations from the non-overlapping set with the contact level of 39 was 351; they were classified as the transient-complex ensemble. Representative examples of the transient-complex configurations are shown in Figure 3.
Scatter plots with the number of contacts as a function of relative translation ρ and rotation angle α prepared based on all random HyHEL-5/HEL configurations that passed the clash test are presented in Figure 4. For configurations characterized by 39 contacts, the relative translation ρ ranges from 3.7 to 6.6Å with an average of 4.6Å. The standard deviation of ρ observed in the transient-complex configurations is 0.5Å. The rotation angle α characterizing the transient-complex ensemble falls between −45° and 45°, however, we also observe weakly populated configurations with α ~ −180° and α ~ 180°. The N(α) plot is asymmetric mostly around α ~ −90° and α ~ 90°. Additionally, configurations with α around ±90° are characterized by smaller number of contacts. The average α computed over the transient-complex configurations is −1.2° (with a standard deviation of 9.6°) indicating that positive and negative α values are nearly equally sampled.
The total number of ionizable groups is 32 in HEL and 101 in HyHEL-5. Both our and previous calculations of residual pKas in HEL, HyHEL-5, and the HyHEL-5/HEL complex24 show that pKas of several residues located at the binding interface, including those involved in the Arg-Glu salt links, change significantly upon association. In agreement with24 we find that most acidic residues titrate in the pH range below 5.0 while most basic residues in the pH range above 9.0.
For pH between 5.0 and 10.0 we computed the net charges of molecules as an average over the ensemble of protonation states. The net charge of isolated HEL varies between +10e and +3e, that of isolated HyHEL-5 between +10e and −15e, and of HyHEL-5/HEL complex (as in the X-ray structure) between +18e and −13e (Figure 5). The resulting isoelectric points, pI, (pH of zero net charge) are ~11 for HEL (11.2 determined experimentally25), ~7 for HyHEL-5 (consistent with pI of mouse immunoglobulines usually between 6.0 and 7.526,27) and ~9 for their complex.
At pH above 7.0, HEL and HyHEL-5 are oppositely charged. Experiments show that the maximal binding rate is observed for pH~8.0.24 At pH below 7 both molecules are positively charged but binding is still observed because local interactions between positively and negatively charged regions of both molecules and screening of interactions due to the non-zero ionic strength overcome the overall repulsion resulting from their positive net charges.
According to eq 9, the electrostatic free energy of a protein with multiple ionizable groups can be described as a sum of its free energy in the lowest protonation state and contributions from all other protonation states accessible at a given temperature. The latter contributions decrease the electrostatic free energy below its value in the lowest state G1 (see eq 9). If the number of easily accessible states is significant, the drop in the free energy below the lowest protonation state becomes substantial.
Figure 6 presents electrostatic free energy differences between the lowest protonation state G1 and of eq 9 computed for HEL and HyHEL-5 in the pH range 5–10. In general, the free energy differences observed for HEL are smaller than for HyHEL-5 what can be explained by smaller number of ionizable groups in HEL (32) than in HyHEL-5 (101). Large electrostatic free energy differences are found at low ( at pH 5) and high pH ( at pH 10) for the HyHEL-5 system and at high pH for HEL ( at pH 10). Acidic ionizable groups titrate below pH 5 and basic groups start releasing protons at pH above 9; as a consequence the number of thermally accessible protonation states is raised at low and high pH. The number of accessible protonation states in the pH between 6 and 8 is lower because protonation/deprotonation of acidic/basic groups is less probable. This accounts for smaller decrease in the electrostatic free energy. For example, the number of states (L in eq 9) needed to evaluate the electrostatic free energy of HEL with accuracy better than is 258, 142 and 1157 for pH 5, 7, and 10, respectively. For HyHEL-5 the same accuracy is obtained with 68846, 21110 and 185897 states, respectively. Upon association the number of states accessible for both proteins and their distributions are likely to change. The number of states included in calculations of electrostatic free energies of transient-complex configurations varies (depending on a particular configuration and pH) from about 28000 (at moderate pH) to almost 500000 (at pH 10).
For transient-complex configurations, we present the distributions of electrostatic interaction energies, Eel (eq 5) and (eq 10) at different pH (see Figure 7). In general, the Eel distributions are wider and their centers are more shifted upon changing pH than the distributions of . The Eel and averages computed over the transient-complex ensemble and their corresponding Boltzmann factors differ while varying the pH (Figure 8). As shown in Figure 8 the averages are lower than Eel. Binding is less favorable when the protonation degrees of freedom and pH are explicitly considered for evaluating the interaction energies. Largest differences between average Eel and are obtained at extreme pH (5 and 10). At moderate pH 8 acidic and basic residues are not likely to change their protonation states. Probabilities of proton uptake or release by acids or bases are low, both in isolated proteins and in their complexes, therefore, we observe small differences between average and Eel. It should be noted that in the pH range 5.0–10.0 both methods show favorable binding. However, while in the traditional approach binding becomes more favorable upon increasing pH, our method gives the most negative Eel for pH 9.0. Additionally, when pH is raised to 10.5 Eel becomes positive, interactions in transient-complex ensemble become repulsive and the association is retarded.
According to experiments12 the association rate constant for the interaction of HyHEL-5 with HEL varies with pH in the range 6.0–10.0 (see Figure 9). The experimentally measured ka linearly increases with pH with a threefold rise in magnitude in the pH between 6.0 and 8.0. Above pH 8.0 ka decreases but experiments were performed only at pH 10 so it is possible that the measured association rate exhibits a maximum between pH 8 and 10.
To compare our calculations with experimental data, the basal rate for association, kao (eq 1), i.e., the association rate in the absence of electrostatic interactions, needs to be determined. Although it is possible to estimate the basal rate theoretically by simulating the transiational and rotational movement of proteins in the absence of electrostatic interactions (as described elsewhere7–9), we treated kao as an adjustable parameter. We determined its value by finding a multiplicative factor that gives the best fit of the computed Boltzmann factors e−β<Eel>*) and (Figure 8) to the experimental curve in the pH range 6.0–8.0. The resulting basal rates are (based on fitting of ) and (based on fitting of e−β<Eel>*). Both values fall in the range that is appropriate for stereospecific protein-protein association.28,29
As shown in Figure 9, the traditional approach fails to reproduce the experimentally observed trend in the association kinetics. The slope of the computed curve in the pH region 6.0–8.0 is too small compared to the slope of the experimental curve. Moreover, the association rate constant computed with the traditional approach increases almost linearly in the whole studied pH range.
A much better result is obtained when protonation degrees of freedom and solution pH are explicitly considered. However, while in the pH range 6.0–9.0 the fit to experimental data is almost perfect, the drop in the association rate observed above pH 9.0 is significantly larger than experimental one. This may be due to a number of reasons. The presented approach neglects intra-protein conformational degrees of freedom. While protonation equilibria are treated explicitly and rigorously, the fact that proteins are frozen in their conformations found in the X-ray structure of the complex affects the results. Protein conformational freedom can be important particularly at low and high pH because titration of acidic or basic residues might be accompanied by local changes in the protein structure. Moreover, even some local changes in protein conformation (e.g. the side-chain reorientation) affect the computed energies of different protonation states.
Other limitations arise from the PB method. The dielectric model and the choice of solute dielectric constant may influence the computed shifts in the pKas of groups in the protein relative to their pKas in the corresponding model compound.24 Nevertheless, different dielectric models and dielectric constants are usually justified as a way to indirectly consider the internal flexibility of molecules.
We presented a computational approach to study kinetics of protein-protein association that explicitly takes into account proteins protonation equilibria and solution pH. We combined the transient-complex theory for protein association8,9,29 with explicit evaluation of electrostatic free energies of molecules with multiple protonation equilibria.13 The method was applied to determine the association kinetics of the HyHEL-5/HEL complex as a function of pH. Our results corroborate reasonably well with experimental data.12 The standard approach that we also tested fails to reproduce the pH-dependent trend in the association kinetics. While the standard approach gives almost monotonically increasing dependence of the association rate in the studied pH range, a maximum at pH~8 is observed in experiment.
The presented methodology is not free of limitations. Associating proteins are fixed in their crystal structure configurations while even local changes in protein conformation can affect the computed interaction energies. However, other methods widely used to study protein-protein association such as Brownian dynamics or PARE (Predicting Association Rate Enhancement)30 also suffer from this limitation. Moreover, these methods employ a less rigorous treatment of electrostatic interactions (as compared to the PB method), i.e., test charges or effective charge models and the Debye-Hückel approximation30,31 what may lead to additional uncertainties in predictions of association rate constants.
As proteins are represented with single structures, the coupling between protonation and conformational degrees of freedom is neglected. It is possible to include conformational flexibility during calculation of pH-dependent protein properties using either continuum electrostatic models (for example the MCCE method of Alexov and Gunner32) or empirical approaches (the PROPKA method developed in the Jensen group33). However, to incorporate conformational degrees of freedom in the presented methodology one would need to redefine the formalism used to compute the pH-dependent energy of interaction. Such modifications are underway.
We use the PB model to evaluate the electrostatic energies of protonation states. This model often overestimates the shifts in the pKas of ionizable groups in the protein relative to their pKas of the model compound. The results of the PB model may also depend on the force field used to assign atomic partial charges. While different methods exist to predict pKas and some of them outperform the PB approach,34 this is the only method that allows for rigorous evaluation of standard electrostatic free energies of molecules with multiple ionization sites.
The method used for calculation of free energy levels associated with different protonation states is based on the linear form of the PB equation. The electrostatic potential in the linear PB equation is a superposition of electrostatic potentials of each partial charge of the molecule. This property allows one to precalculate the interaction matrix,16 (of the order equal to the number of ionizable groups in the molecule) that is later used to evaluate free energy levels according to eq 8. While the fact that the linear PB equation is used might prevent applications of our method to highly charged systems, the usage of non-linear PB equation requires performing PB computations on-the-fly with Monte Carlo generation of energy levels; such a task is computationally prohibitive or even impossible.
Notwithstanding its limitations, our approach can predict the association kinetics, its sensitivity to solution pH and help to analyze, understand and rationalize experimental observations. By comparing the presented method with the traditional one we find that the latter approach is reliable in the pH region where protonation/deprotonation events have low probability (neutral pH in case of HyHEL-5 and HEL). However, in pH regions, where protonation/deprotonation is more likely to occur (acidic and basic pH ranges in the studied case) our approach provides a more physically sound description and leads to a quantitatively better agreement with the experiment. The next step would be to check predictive capabilities of our approach.
M.D. and J.T. acknowledge support from Foundation for Polish Science, University of Warsaw (115/30/E-343/BST1345/ICM/2008 and G31-4), Polish Ministry of Science and Higher Education (N N301 245236), Fogarty International Center (National Institutes of Health Research Grant no R03 TW07318). J.A. is supported by University of Warsaw (BST). M.D. would like to thank Huang-Xiang Zhou and Sanbo Qin for helpful discussions.