Conceived and designed the experiments: TT YS. Analyzed the data: TT YS. Wrote the paper: TT YS.
To perform recognition, molecules must locate and specifically bind their targets within a noisy biochemical environment with many look-alikes. Molecular recognition processes, especially the induced-fit mechanism, are known to involve conformational changes. This raises a basic question: Does molecular recognition gain any advantage by such conformational changes? By introducing a simple statistical-mechanics approach, we study the effect of conformation and flexibility on the quality of recognition processes. Our model relates specificity to the conformation of the participant molecules and thus suggests a possible answer: Optimal specificity is achieved when the ligand is slightly off target; that is, a conformational mismatch between the ligand and its main target improves the selectivity of the process. This indicates that deformations upon binding serve as a conformational proofreading mechanism, which may be selected for via evolution.
Practically all biological systems rely on the ability of bio-molecules to specifically recognize each other. Examples are antibodies targeting antigens, regulatory proteins binding DNA and enzymes catalyzing their substrates. These and other molecular recognizers must locate and preferentially interact with their specific targets among a vast variety of molecules that are often structurally similar. This task is further complicated by the inherent noise in the biochemical environment, whose magnitude is comparable with that of the non-covalent binding interactions.
It was realized early that recognizing molecules should be complementary in shape, akin to a matching lock and key (figure 1A). Later, however, it was found that the native forms of many recognizers do not match exactly the shape of their targets. There is a growing body of evidence for conformational changes upon binding between the native and the bound states of many biomolecules, for example in enzyme-substrate, antibody-antigen and other protein-protein complexes. Binding of protein to DNA is also associated with conformational changes, which may affect the fidelity of DNA polymerase, and similar effects were observed in the binding of RNA by proteins. The induced deformation typically involves displacements of binding sites in the range of tens of angstroms. To account for these conformational changes upon binding, the induced fit scheme was suggested. In this scheme, the participating molecules deform to fit each other before they bind into a complex (figure 1B). Another model, the pre-equilibrium hypothesis, assumes that the target native state interconverts within an ensemble of conformations and the ligand selectively binds to one of them (figure 1C).
The abundance of conformational changes raises the question of whether they occur due to biochemical constraints or whether they are perhaps the outcome of an evolutionary optimization of recognition processes. In the present work, we discuss the latter possibility by evaluating the effects of conformation and flexibility on recognition. To estimate the quality of recognition we use the common measure of specificity, that is, the ability to discriminate between competing targets. Whether conformational changes and especially the induced-fit mechanism can provide or enhance specificity has been a matter of debate. Various detailed kinetic schemes have been suggested and their potential effects on specificity have been discussed, however without direct relation to concrete conformational mechanisms. Here we examine these underlying effects of flexibility and conformational changes that may govern the rate constants and thus determine specificity. Our approach tries to elucidate some of these basic effects by introducing a simple statistical-mechanics model and applying it to a generalized kinetic scheme of recognition in the presence of noise. As an outcome, the flexibility of the ligand and its relative mismatch with respect to the target which optimize specificity can be evaluated.
In the binding schemes described above (figure 1), the ligand is a “switch” that interconverts between a native, inactive form and an active form that fits the target. However, in a noisy biochemical environment, one may expect both the ligand and the targets to interconvert within an ensemble of many possible conformations. Such an ensemble may be the outcome, for example, of thermally induced distortions. Consider for example a scenario in which an elastic ligand is interacting with two rigid competing targets (figure 2). All the conformations of the ligand may interact with the targets and as a result a variety of complexes, differing by the structures of the bound ligand, is formed (figure 2). Among the complexes formed, some are composed of perfectly matched ligand and target. In those complexes, specific binding energy due to the alignment of binding sites is gained. However, a complex may be formed even if the ligand does not perfectly match the target, due to non-specific binding energy. For example, the lac repressor can bind non-specifically to DNA regardless of its sequence. All the complexes, the matched and the mismatched, may retain some functionality. The efficiency of the recognition process depends on the elasticity of the ligand and on the structural mismatch between the ligand native state and the main target.
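The quality of recognition is measured by its specificity, which is defined as follows. Consider a ligand a interacting with a correct target A and an incorrect competitor B,

$$a + A \;\overset{K_A}{\rightleftharpoons}\; aA \;\xrightarrow{\nu_A}\; \text{correct product}, \qquad a + B \;\overset{K_B}{\rightleftharpoons}\; aB \;\xrightarrow{\nu_B}\; \text{incorrect product},$$

where $K_A$ and $K_B$ are the dissociation constants, and $\nu_A$ and $\nu_B$ are the turnover numbers. Specificity is naturally defined as the ratio of the correct production rate, $R_A=[aA]\cdot\nu_A$, and the incorrect production rate, $R_B=[aB]\cdot\nu_B$, where [ ] denotes concentration. Typically, the chemical step is the rate-limiting one and the complex formation reaction is therefore in quasi-equilibrium, $[aA]=[a][A]/K_A$, $[aB]=[a][B]/K_B$. Thus the specificity ξ takes the form:

$$\xi = \frac{R_A}{R_B} = \frac{[A]\,\nu_A/K_A}{[B]\,\nu_B/K_B}. \qquad (3)$$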
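As a minimal numerical sketch of this definition, the quasi-equilibrium specificity can be computed directly from concentrations, dissociation constants and turnover numbers. The function name and the example values are illustrative assumptions, not quantities taken from the paper.

```python
def specificity(conc_A, K_A, nu_A, conc_B, K_B, nu_B):
    """Ratio of correct to incorrect production rates at quasi-equilibrium.

    R_A = [aA] * nu_A with [aA] = [a][A] / K_A (and likewise for B); the
    free-ligand concentration [a] is common to both rates and cancels.
    """
    rate_A = conc_A * nu_A / K_A  # correct rate per unit free ligand
    rate_B = conc_B * nu_B / K_B  # incorrect rate per unit free ligand
    return rate_A / rate_B


# Hypothetical example: equal concentrations and turnover numbers,
# with a competitor that binds 100 times more weakly than the target.
xi = specificity(conc_A=1.0, K_A=1e-9, nu_A=1.0,
                 conc_B=1.0, K_B=1e-7, nu_B=1.0)
```

With equal concentrations and turnover numbers, the specificity reduces to the ratio of dissociation constants, $K_B/K_A$.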
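When the ligand and the targets each interconvert within an ensemble of conformations, every pair of conformations contributes to the production rates, and the specificity generalizes to

$$\xi = \frac{\sum_{i,j} [a_i][A_j]\,\nu_{ij}^{A}/K_{ij}^{A}}{\sum_{i,j} [a_i][B_j]\,\nu_{ij}^{B}/K_{ij}^{B}}, \qquad (4)$$

where i and j denote the conformations of the ligand and the target, respectively. $K_{ij}$ is the dissociation constant of the complex formed from the i-th ligand conformation and j-th target conformation and $\nu_{ij}$ is the turnover number of this complex; the superscript marks the target, A or B.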
Equation 3 and its generalization, equation 4, reflect the dependence of the specificity on both the concentrations of complexes, determined by the dissociation constants K, and on their functionality, determined by the turnover numbers ν. These parameters, K and ν, depend on the flexibility and structure of the participant molecules. Evaluating this dependence allows us to estimate the optimal flexibility and structural similarity between the ligand and the main target.
In essence, molecular recognition is governed by the interplay between the interaction energy gained from the alignment of the binding sites and the elastic energy required to deform the molecules to align. Motivated by deformation spectra measurements, we treat this interplay within a simple model that takes into account only the lowest elastic mode. This is a vast simplification of the many degrees of freedom that are required to describe the details of a conformational change. However, as we suggest below, this simplified model still captures the essence of the energy tradeoff. Modeling proteins as elastic networks was previously applied to study large amplitude and thermal fluctuations of proteins, and to predict deformations and domain motion upon binding. These models are fitted with typical spring constants of a few $k_BT/\text{Å}^2$.
We consider first an elastic ligand interacting with a rigid target. Later, we discuss the case of deformable targets. The binding domain of the ligand is regarded as an elastic string on which N binding sites are equally spaced (figure 4). The elastic deformation energy is described by harmonic springs that connect adjacent binding sites. In the native state of the ligand, the length of the binding domain is l0. The ligand interacts with a rigid target on which N complementary binding sites are equally spaced along a binding domain of length s. Binding is specific, that is a binding site of the ligand can gain binding energy ε only by fastening to its complementary binding site on the target. The ligand-target interactions are relatively short-range and therefore binding energy is gained only if the complementary binding sites are at the same position.
The presence of the target may induce a deformation of the ligand that, in order to gain binding energy, shifts the binding sites to new positions. However, such deformation of the ligand costs elastic energy. The conformation of the ligand is determined by N degrees of freedom, the N positions of the ligand binding sites. We assume for simplicity that all the springs that connect adjacent binding sites on the ligand have the same spring constant. We consider here only the deformation mode of lowest elastic energy, in which the binding domain of the ligand is stretched or shrunk uniformly. Thus, we reduce the number of degrees of freedom from N to two, the length of the deformed binding domain l, and the position of its edge (figure 4).
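The reduction to a single effective spring can be illustrated with a short sketch. It assumes, beyond what the text states explicitly, that the N binding sites are joined by N−1 identical springs with a hypothetical per-spring constant `kappa`; the function names are illustrative. In the uniform mode every spring takes an equal share of the length change, so the chain behaves as springs in series with an effective constant kappa/(N−1).

```python
def uniform_stretch_energy(kappa, n_sites, l, l0):
    """Elastic energy of the uniform (lowest) mode.

    A chain of n_sites binding sites joined by n_sites - 1 identical
    springs of constant kappa changes its total length from l0 to l,
    with every spring extended by the same amount.
    """
    n_springs = n_sites - 1
    per_spring = (l - l0) / n_springs
    return n_springs * 0.5 * kappa * per_spring ** 2


def effective_spring_constant(kappa, n_sites):
    """Identical springs in series: k = kappa / (number of springs)."""
    return kappa / (n_sites - 1)


# Hypothetical numbers, in arbitrary units.
kappa, n_sites, l0, l = 3.0, 5, 10.0, 11.0
k_eff = effective_spring_constant(kappa, n_sites)
energy = uniform_stretch_energy(kappa, n_sites, l, l0)
```

The energy computed spring by spring equals `0.5 * k_eff * (l - l0) ** 2`, which is the single-spring form used for the ligand in the rest of the model.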
To evaluate the effect of conformational changes and flexibility on specificity one needs to estimate the concentrations and reaction constants in (4). Since all the reactions besides the product formation are assumed to be in equilibrium, we can regard each conformation of the ligand, specified by its length $l_i$, as a separate chemical species $a_i$. Thus we may apply the law of mass-action to each of the binding reactions $a_i+A \leftrightarrow a_iA$, and obtain the equilibrium constant $K_{iA}=[a_i][A]/[a_iA] \sim Z_i Z_A/Z_{iA}$, where $Z_i$, $Z_A$ and $Z_{iA}$ are the single-particle partition functions of the i-th ligand conformation, target and complex, respectively. The equilibrium constant is (see Methods):

$$K_{iA} \sim \left[\delta(l_i - s)\,e^{N\varepsilon} + e^{f}\right]^{-1}, \qquad (5)$$
where f is the non-specific free energy. The binding energy ε and the non-specific free energy f are in units of $k_BT$. The concentration of free ligand of length $l_i$ is proportional to the Boltzmann exponent of the distortion energy, $[a_i] \sim [a]\cdot\exp\!\left(-\tfrac{k}{2}(l_i-l_0)^2\right)$, where the effective spring constant k is in units of $k_BT/\text{length}^2$ and [a] is the total concentration of the free ligand. Although some preferred conformations may be catalyzed much faster than the others, the interconversion is assumed to be fast enough to still maintain this equilibrium distribution.
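The Boltzmann distribution of ligand lengths can be sketched numerically as follows. The grid, the parameter values and the function name are assumptions chosen for illustration; k is taken in units of kBT/length², as in the text.

```python
import math


def conformation_weights(lengths, l0, k):
    """Normalized Boltzmann weights exp(-k/2 (l - l0)^2) on a grid of lengths."""
    raw = [math.exp(-0.5 * k * (l - l0) ** 2) for l in lengths]
    total = sum(raw)
    return [w / total for w in raw]


# Hypothetical ligand: native length 10, spring constant 4 (thermal width 0.5).
l0, k = 10.0, 4.0
lengths = [l0 - 5.0 + 0.01 * i for i in range(1001)]  # grid spans l0 +/- 5
weights = conformation_weights(lengths, l0, k)

mean = sum(w * l for w, l in zip(weights, lengths))
variance = sum(w * (l - mean) ** 2 for w, l in zip(weights, lengths))
```

The mean length stays at the native length and the variance approaches 1/k, so a stiffer ligand samples a narrower range of conformations.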
With the knowledge of how the rate constants depend on the conformation and the flexibility of the ligand, we analyze below the specificity to suggest a simple answer to the question raised above: What are the optimal geometry and flexibility that yield maximal specificity? The quality of a recognition process depends on two main properties of the participant molecules, their chemical affinity and the conformational match between them. To discuss the conformational effect, we consider a main or “correct” target A and an “incorrect” competitor B that differ in structure; their binding domains are of different lengths, sA and sB. Chemical affinity is taken into account by assuming that the competing target B has only N−m interacting binding sites while the main one has N. We test the specificity of a ligand specified by a native state length l0 and a flexibility k. We define the mismatch d as the difference between the ligand's native state length and the correct target's length, d=l0−sA. We first examine the competition between two rigid “noiseless” targets and then discuss the noisy case. The generalization to more than two competing targets is straightforward.
Consider a ligand interconverting within an ensemble of conformations, each one with a different binding domain length li. This ligand interacts with two competing rigid targets that differ by their length Δ=sA−sB. The ratio of the production rates due to unmatched and matched complexes is denoted by r=νum/νm where the production rates of the correct and incorrect products are assumed to be equal, νA=νB. For the sake of simplicity, we assume that the target is in excess with respect to the ligand, [A]~[Atotal] and [B]~[Btotal], and that the concentrations of the competing targets are equal, [A]=[B]. Substitution of the equilibrium constant (5) into (4) yields the specificity (see Methods)
where f is the non-specific free energy. The dimensionless parameter $\alpha \sim 1/(k^{1/2} g)$ is the ratio between the typical length scales, $k^{-1/2}$ of the elasticity and g of the binding potential. Thus, we obtained in (6) the specificity ξ as a function of the structural and energetic parameters: the difference between the target and the competitor Δ, the mismatch between the ligand and the target $d=l_0-s_A$, the effective spring constant k and the specific and non-specific binding energies. Below we examine this dependence to find the optimal ligand, specified by its mismatch d (or its native state length $l_0$).
The specificity (6) is simply the ratio of the formation rates of correct and incorrect products, RA and RB, respectively (figure 5). The correct production rate RA, as a function of the mismatch d, is the sum of a Gaussian centered on d=0, which accounts for the specific binding, and a uniform non-specific contribution. RA is therefore maximal at a zero mismatch. The incorrect production rate RB has the same uniform non-specific contribution and its specific contribution is now a Gaussian centered around d=Δ, where it exhibits its maximum. The crossover where the specific and non-specific contributions become comparable defines a “window of recognition”. When the windows of recognition of the correct and incorrect targets overlap, the resulting specificity exhibits a maximum at a finite nonzero mismatch (figure 5A). This optimal mismatch d0 is approximately
As the ligand becomes more rigid, the specificity increases while the optimal mismatch $d_0$ tends to zero (figure 6A). The optimal mismatch is bounded by $d_{max}=(N\varepsilon/k)^{1/2}$, the length-scale that reflects the interplay between the elastic and specific binding energies.
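The qualitative behavior described above can be explored with a small numerical sketch. It assumes the form stated in the text for each production rate: a Gaussian specific term centered on the target's own length plus a uniform non-specific term, written here as a single hypothetical constant `c` standing for α·r·exp(f). The function names, the grid search and all parameter values are illustrative assumptions, and the lengths of the two targets are passed explicitly to avoid relying on a sign convention for Δ.

```python
import math


def specificity(l0, s_A, s_B, k, n_sites, m, eps, c):
    """Ratio of correct to incorrect rates for a ligand of native length l0.

    Each rate is a Gaussian specific term plus the uniform non-specific
    term c; the competitor B has only n_sites - m interacting sites.
    """
    rate_A = math.exp(n_sites * eps - 0.5 * k * (l0 - s_A) ** 2) + c
    rate_B = math.exp((n_sites - m) * eps - 0.5 * k * (l0 - s_B) ** 2) + c
    return rate_A / rate_B


def optimal_mismatch(s_A, s_B, k, n_sites, m, eps, c, span=8.0, step=0.01):
    """Grid search for the mismatch d = l0 - s_A that maximizes specificity."""
    n = int(round(2 * span / step)) + 1
    grid = [-span + i * step for i in range(n)]
    return max(grid, key=lambda d: specificity(s_A + d, s_A, s_B, k,
                                               n_sites, m, eps, c))


# Hypothetical competitor that is one length unit shorter than the target.
params = dict(s_A=0.0, s_B=-1.0, n_sites=10, m=2, eps=1.0, c=0.01)
d_soft = optimal_mismatch(k=1.0, **params)   # flexible ligand
d_stiff = optimal_mismatch(k=4.0, **params)  # more rigid ligand
```

In this example the best ligand is longer than its target, that is, displaced away from the shorter competitor, and the optimal mismatch shrinks as the spring constant grows.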
Competing targets of similar structure, Δ≈0, have both correct and incorrect recognition windows centered on zero mismatch and the resulting specificity is akin to a rectangular window (figure 5B). The width of this window is the mismatch where specificity is half of its maximum, $d_{1/2} \approx k^{-1/2}\left((N-m)\varepsilon - f - \log(\alpha r)\right)$. As the ligand becomes more flexible the width of this rectangular window increases (figure 6B). Targets that differ much are evidently not competitors. Indeed, if the difference Δ is much larger than the window of recognition, the optimal mismatch vanishes (figure 5C). Thus, (7) provides a criterion for relevant competitors: these must lie within the window of recognition of the correct target.
An interesting special case is when only the perfectly matched complexes are functional. This situation may occur if the non-specific binding energy is small and only matched complexes are formed, or if mismatched complexes are not functional, r=0. The specificity in this case increases exponentially with the mismatch, ξ~exp(k·Δ·d) (figure 6C). Generalization of these results to more than one dimension and to multiple competitors is straightforward. In figure 7, the specificity of a ligand as a function of mismatch in the presence of a few competitors is shown. The optimal mismatch depends on the structure of the various competitors. When the competitors have a structure similar to that of the main target, the mismatch is non-zero.
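The exponential dependence in the special case r=0 can be checked with a short worked step. It is a sketch that assumes the specific contribution of each target is a Gaussian in the distance between the ligand's native length and that target's length, and it uses the definitions d=l0−sA and Δ=sA−sB, so that l0−sB=d+Δ:

```latex
\xi\big|_{r=0}
  = \frac{\exp\!\left[N\varepsilon - \tfrac{k}{2}\,d^{2}\right]}
         {\exp\!\left[(N-m)\varepsilon - \tfrac{k}{2}\,(d+\Delta)^{2}\right]}
  = \exp\!\left[m\varepsilon + \tfrac{k}{2}\,\Delta^{2}\right]\,
    \exp\!\left(k\,\Delta\,d\right)
```

The first factor does not depend on the mismatch, which leaves ξ~exp(k·Δ·d).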
These results are reminiscent of kinetic proofreading, in which the specificity of a biochemical reaction increases exponentially with the temporal delay or the number of additional intermediate states. In kinetic proofreading, the delay reduces the production rates of both correct and incorrect products, but the reduction of the incorrect product is larger and thus specificity improves. In the present case, the equivalent of the temporal delay is the spatial mismatch. It is evident from equation (6) that mismatch reduces both the correct and incorrect rates, but as the effect on the incorrect rate is more significant the overall specificity increases. Of course, a major difference is that kinetic proofreading is an energy-consuming non-equilibrium scheme whereas the conformational proofreading suggested here is at quasi-equilibrium.
In this case, the ligand still interconverts between an ensemble of conformations, but now the target is prone to error. We describe this noise as Gaussian fluctuations of the target's length s with a standard deviation σ. These fluctuations may originate from various sources such as thermal noise, where σ is related to the target's flexibility as $\sigma_{A,B} \sim k_{A,B}^{-1/2}$. The noise introduces additional matched complexes and thus widens the windows of recognition of both the correct and incorrect targets. Similar to (6) (see Methods), the resulting specificity is
If $\sigma_A=\sigma_B=\sigma$, the results are the same as for two rigid targets competing for a ligand with an effective spring constant $k'=k/(1+k\sigma^2)$. For any values of $\sigma_A$ and $\sigma_B$, when the targets differ in structure, Δ≠0, the specificity is optimal at a nonzero mismatch as in the noiseless case (figure 8A–B). But unlike the noiseless scenario, even the specificity of an infinitely rigid ligand may be optimal at a nonzero mismatch. For Δ=0 the specificity has an extremum at a zero mismatch. If the incorrect target is noisier, $\sigma_A<\sigma_B$, identical ligand and target achieve maximal specificity (figure 8C–D). However, if the correct target is noisier, $\sigma_A>\sigma_B$, a mismatched ligand is optimal even for structurally similar targets.
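The effective spring constant can be checked numerically with a minimal sketch. It assumes that σ is the standard deviation of Gaussian fluctuations in the target length and that a matched complex requires the ligand length to equal the instantaneous target length. The function names, the integration grid and the parameter values are illustrative assumptions.

```python
import math


def noisy_effective_k(k, sigma):
    """Effective spring constant for equal target noise: k' = k / (1 + k sigma^2)."""
    return k / (1.0 + k * sigma ** 2)


def matched_weight(d, k, sigma, half_width=10.0, step=0.001):
    """Weight of matched complexes for a ligand with mismatch d.

    Riemann sum over target length fluctuations u (Gaussian, standard
    deviation sigma) of the ligand Boltzmann weight at the matching
    length, whose distance from the native ligand length is u - d.
    """
    n = int(round(2 * half_width / step)) + 1
    total = 0.0
    for i in range(n):
        u = -half_width + i * step
        ligand = math.exp(-0.5 * k * (u - d) ** 2)
        target = math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
        total += ligand * target * step
    return total


# Hypothetical values: ligand spring constant 2, target noise 0.5.
k, sigma = 2.0, 0.5
k_eff = noisy_effective_k(k, sigma)
ratio = matched_weight(1.0, k, sigma) / matched_weight(0.0, k, sigma)
expected = math.exp(-0.5 * k_eff * 1.0 ** 2)
```

The ratio of matched weights at mismatch 1 and mismatch 0 agrees with the Gaussian of a rigid-target problem in which k is replaced by k′, which is the sense in which target noise softens the ligand.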
The conformational proofreading model makes several predictions that may be put to an experimental test. To begin with, the structure of the target, the ligand and the competing molecule should fulfill a number of relations. First, we expect a mismatch to occur only if a competitor is within the ligand's window of recognition, since this is the situation where competition may threaten the quality of recognition. This can be verified by comparing the structure of the native ligand, its main target and the competitor. Second, we predict that a compromise must be struck between the need for the native ligand to be as far as possible from the competitor and as close as possible to the target, so that the mismatch will place ligand and competitor at opposite sides of some structural axis. For example, in figure 7 the mismatch which maximizes specificity is determined by the location of the competitors' recognition windows. Experimentally, we expect that there will be a need for resolved 3D structures of ligands both in their native state and bound to their target, as well as those of the competitors. The rapidly increasing structural information that is available from studies of molecular recognition systems suggests that data that can validate or falsify our conformational proofreading hypothesis may already be available, or readily obtained.
Besides observing competition and specificity in known biological systems, an experiment that in principle allows control over the nature of competition and the functional results of this competition may be carried out. A particularly appealing system that can be experimentally accessed and manipulated is that of transcription factors. While a transcription factor has one or several specific binding sites, there may be many competing sites on the DNA that would bind it. One can therefore experimentally alter the specific binding site or its competing sites, as well as the transcription factor, by point mutations and then observe the effect on specificity, e.g. by measuring the expression of upstream genes. The next step in this direction would be, instead of artificially manipulating the structures, tracing the coevolution of the transcription factor and all of the binding sites, looking at the in-vivo evolutionary optimization of recognition.
The ability to perform efficient information processing in the presence of noise is crucial for almost any biological system. Enhancing the specificity of recognition, in the sense of discrimination between competing targets, is therefore expected to increase the fitness. By introducing a model that captures the essence of the tradeoff between the specific binding energy and the structural deformation energy, it appears possible to estimate the optimal flexibility and geometry of the fittest molecules. Our model suggests that to optimally discriminate between competing targets of different structures, the ligand should have a finite mismatch relative to the main target. This spatial mismatch is similar to the temporal delay that underlies kinetic proofreading. Our analysis suggests that conformational changes upon binding may arise as the outcome of an evolutionary selection for enhancing recognition specificity in a noisy environment. This may also suggest that the structure and flexibility of binding molecules are governed by evolutionary pressure to optimize not only specificity but other cost functions such as robustness to noise.
Within the lowest mode model assumptions, when the ligand and the target are perfectly aligned, x0=y0 and l=s, all the binding sites interact and contribute a total binding energy Nε. Otherwise, the binding energy is only due to a single interacting site. If there are many binding sites, we can neglect the single site contribution and approximate the interaction energy by:
where k is the effective spring constant of the ligand binding domain. The interaction energy (9) describes an idealized scenario in which only perfectly aligned ligand and target gain specific binding energy. Of course, in reality there could be other conformations with partial alignment, but they would require the excitation of higher elastic modes.
In order to calculate the partition function of the complex, $Z_{iA}=\mathrm{Tr}\left(\exp[-H(l_i)]\right)$, all the possible binding configurations of this complex should be specified. As mentioned above, the specific binding energy Nε is gained only in the perfectly aligned configuration. However, there may be other configurations in which the ligand and the target are bound non-specifically. We roughly estimate the non-specific contribution to the partition function as the product of the volume in which the non-specific binding occurs and the exponent of the non-specific binding energy. This non-specific contribution is exp(f), where f is defined to be the non-specific free energy. The total complex partition function is the product of the elastic contribution and the contribution due to binding, specific and non-specific, $Z_{iA}=\exp\!\left(-\tfrac{k}{2}(l_i-l_0)^2\right)\cdot\left(\delta(l_i-s)\cdot e^{N\varepsilon}+e^{f}\right)$. The elastic contributions to (5) cancel out since they are equal for both the ligand and ligand-target partition functions. The irrelevant kinetic contributions were also omitted.
The ligand may interconvert within a continuous ensemble of conformations specified by their binding site length l. Since the complex formation reaction is in quasi-equilibrium, the concentration of free ligand of length l is proportional to the Boltzmann exponent of the distortion energy, $[a(l)] \sim [a]\cdot\exp\!\left(-\tfrac{k}{2}(l-l_0)^2\right)$, where $l_0$ is the native state length. The effective spring constant k is in units of $k_BT/\text{length}^2$ and [a] is the total concentration of the free ligand. The dissociation constant in its continuous form is
Only matched complexes in which l=sA,B gain specific binding energy and the turnover of these complexes, νm, may be different from the turnover number of the unmatched complexes, νum. Therefore, the continuous form of the turnover number is
If the competing targets are rigid, the contribution to specificity from all possible complexes (4) becomes an integral over all ligand conformations l,
For the sake of simplicity we assume that (i) $\nu_{A,m}=\nu_{B,m}$ and $\nu_{A,um}=\nu_{B,um}$, (ii) the target is in excess with respect to the ligand, $[A] \sim [A_{total}]$ and $[B] \sim [B_{total}]$, and (iii) the concentrations of the competing targets are equal, [A]=[B]. Performing the integration yields
where $d=l_0-s_A$ and $\Delta=s_A-s_B$. The normalization factor g reflects the assumption of a continuous ensemble of ligand conformations l. g is the phase space cell volume (actually, the translational factor of this cell volume). This cell volume appears as a proportionality constant of the partition functions. g is proportional to the typical length scale of thermal fluctuations in the system, which is affected by the elastic and binding forces. The k-dependence of $\alpha=(kg^2/2\pi)^{-1/2}$ is at most $\alpha \sim k^{-1/2}$ and therefore contributes only a logarithmic correction in equations (6–8). Under the reasonable assumptions that $\nu_m \gg \nu_{um}$ and $k^{1/2}l_0 \gg 1$ the specificity becomes (6). The above assumptions are made for simplicity and clarity; they do not change the qualitative nature of the results.
If the targets are subject to noise in their structure, (12) should also be integrated over all possible target conformations. If the fluctuations of the target binding site are around native state lengths $s_A$ and $s_B$ with standard deviations $\sigma_A$ and $\sigma_B$, the specificity is
If, again, we assume that $\nu_m \gg \nu_{um}$, $k^{1/2}l_0 \gg 1$ and $s_{A,B}/\sigma_{A,B} \gg 1$, performing the integral (14) yields (8).
Competing Interests: The authors have declared that no competing interests exist.
Funding: This work was supported by the Minerva Foundation and the Center for Complexity Science.