Chemical biology has long sought to build protein switches for use in molecular diagnostics, imaging, and synthetic biology. The overarching challenge for any type of engineered protein switch is the ability to respond in a selective and predictable manner that caters to the specific environments and timescales needed for the application at hand. We previously described a general method to design switchable proteins, called “chemical rescue of structure”, that builds de novo allosteric control sites directly into a protein’s functional domain. This approach entails first carving out a buried cavity in a protein via mutation, such that the protein’s structure is disrupted and activity is lost. An exogenous ligand is subsequently added to substitute for the atoms that were removed by mutation, restoring the protein’s structure and thus its activity. Here, we begin to ask what principles dictate such switches’ response to different activating ligands. Using a redesigned β-glycosidase enzyme as our model system, we find that the designed effector site is quite malleable and can accommodate both larger and smaller ligands, but that optimal rescue comes only from a ligand that perfectly replaces the deleted atoms. Guided by these principles, we then altered the shape of this cavity by using different cavity-forming mutations, and predicted different ligands that would better complement these new cavities. These findings demonstrate how the protein switch’s response can be tuned via small changes to the ligand with respect to the binding cavity, and ultimately enabled us to design an improved switch. We anticipate that these insights will help enable design of future systems that tune other aspects of protein activity, whereby, like evolved protein receptors, remolding the effector site can also adjust additional outputs such as substrate selectivity and activation of downstream signaling pathways.
Chemical biology has been used to engineer small molecule-dependent activity into a variety of select proteins: this has allowed external activation of pathways controlling phosphorylation 1, glycosylation 2, and proteolysis 3, 4. However, each of these advances resulted from implementing pharmacological approaches uniquely tailored to the particular problem at hand. Recently, efforts have expanded to building synthetic switchable proteins in a more general way 5.
The potential utility of generalized protein switches is vast, and already they have successfully been used in certain cases to decipher cellular mechanisms and to construct novel devices such as biosensors 5–7. Some of these generalized examples include switches that activate protein function by using a small molecule to reverse constitutive degradation of the target protein 3, 4 or induce removal of a self-splicing protein element (an “intein”) 8–10. These systems have typically been designed through an engineering-centric “bottom-up” approach 5: functional modules from different biological systems are mixed and matched to obtain the desired function. In another example, maltose-binding protein (input domain) was fused to β-lactamase (output domain) such that sugar binding regulated β-lactamase activity 11. However, engineering these switches is still far from rational; it entails extensive trial-and-error and/or directed evolution, particularly with regard to the linkers used for tethering together the functional modules 12. Individual point mutants can also tune the behavior of such protein switches, through effects on the input domain (effector binding) and/or the output domain (catalysis) 13. While combinatorial libraries of randomized gene insertions and their point mutants can be tested to find combinations with the desired effect, it will nonetheless be advantageous to instead design these types of switches in a more rational way: this will enable the switch’s output(s) not only to be predicted, but also to be tuned for use in different environments and timescales.
As an alternative to this modular engineering strategy, we recently described a method called “chemical rescue of structure” that designs a new small molecule binding site directly into the protein’s output domain 14, 15. We start by identifying a “buttressing” side chain in the protein core (e.g. a tryptophan) that is required to maintain the architecture of the protein functional site; removal of this structural feature by mutation to glycine (e.g. W➝G) disrupts this architecture and leads to loss of function. We then restore the buttress by adding the cognate ligand (e.g. indole), which in turn restores the original protein conformation and rescues its activity.
Our initial studies used β-glycosidase, a sugar-metabolizing enzyme from Sulfolobus solfataricus, as a model system. A high-resolution crystal structure revealed that when we introduced the W33G mutation, catalytic residue Trp433 shifted back into the newly-formed cavity (Figure 1a): this explained the observed loss of enzyme activity, because Trp433 engages in key hydrogen bonding interactions with the substrate. Strikingly, by solving a holo crystal structure we found that indole binding shifts Trp433 back to its former position in the wild type conformation (Figure 1b): this explained the observation that addition of indole restores enzyme activity. Using the Michaelis-Menten model to interpret the enzyme kinetics, we found that introduction of the W33G mutation led to a 730-fold decrease in kcat/Km relative to wild type; this was completely restored upon addition of 10 mM indole 14.
Whether from “chemical rescue of structure” or from some other strategy, a current limitation in designing new molecular switches is the ability to selectively tune the responses of the switches in a predictable manner, as compared to the many other fields of engineering in which very precise control can be achieved 16. Biological systems are inherently noisy and complex, making it difficult to predict in a rational way how a given system will react to a particular input signal 17. In natural systems, such as G protein-coupled receptors (GPCRs), receptors have evolved to interact with a wide range of regulatory molecules 18 and can interact with different ligands that induce opposing responses 19. In estrogen receptor-α, ligand binding dynamics can dictate the fate of unique signaling pathways 20. In extreme cases, such as certain transcription factors, a single ligand can induce opposing responses on a given protein (agonist versus antagonist), depending on the context 21.
With regard to “chemical rescue of structure,” this comparison to natural signaling proteins prompts analogous questions. First, how should a cavity-forming mutation be designed and then complemented in order to achieve optimal rescue? Second, how malleable is the resulting effector-binding site? In the case of the β-glycosidase example, the indole concentration can be varied in order to tune Vmax. But natural systems are rarely regulated by just one ligand – to what extent will this synthetic system respond to other ligands?
We envision two potential underlying structural mechanisms for rescue. In the first model the effector site behaves as a discrete switch, such that ligand binding exactly restores the geometry of the active site; such behavior is precedented by a recent study showing the discrete response of T4 lysozyme to a series of congeneric ligands 22. Given such a model, the degree to which a given ligand activates the protein would be related solely to the ligand’s binding affinity at the effector site. In the second model, the effector site responds in a continuous fashion: more diverse ligands can be accommodated, and each may influence protein structure – and thus activity – slightly differently. In this model, activation by a ligand is driven not just by its binding affinity, but also by the extent to which the ligand precisely restores the active site geometry for catalysis; analogous behavior is observed in crystal structures of GPCRs bound to full and partial agonists, in that the partial agonists make only a subset of the specific interactions of the full agonists 23.
In this study we seek to distinguish between these two models, and to further define the ideal chemical and geometric characteristics of a maximally rescuing ligand. We anticipate that the governing properties of this system will shed light on the tunability of molecular switches built from “chemical rescue of structure” for future applications.
To probe the malleability of the effector site, we began by exploring the structure-activity relationship of rescue using a focused library of fifteen indole analogs with the β-glycosidase W33G enzyme switch described above. To probe the effector site cavity (the void left behind by the Trp to Gly mutation), we compiled this set by adding or removing functional groups around the indole parent scaffold. In order to explore in-plane substituents, methyl groups were individually added to each of the seven outer positions. To probe the effect of removing the hydrogen bond donor and allowing ring pucker, we tested indene and indan. We also included naphthalene and quinoline to explore the effect of increasing the ring size slightly, and we included benzene, toluene, and N-methylaniline to explore the effect of smaller compounds.
We tested each of these compounds in a β-glycosidase enzyme assay, to determine the extent to which each compound rescued activity (Figure 2). We used a high substrate concentration relative to the rescued enzymes’ Km values (Figure S1), such that our assay was designed to provide a readout on the effector’s influence on kcat. Because some effector compounds inhibited the wild type enzyme slightly at this high concentration, we normalized all results to the wild type enzyme under the same conditions. Each compound was tested at a concentration of 2 mM, because this was the maximum concentration at which all 15 compounds were soluble in our assay conditions.
Relative to the wild type enzyme, W33G retains 0.5% basal activity in the absence of any ligand. In our previous study we found that indole fully restored wild type activity at a concentration of 10 mM 14; here, indole only restores 25% of the wild type activity because a lower concentration is used (2 mM). Of the 15 compounds tested, we found that 13 of them rescued activity; however, the extent of rescue was quite diverse. Among the seven methylindoles, for example, we observe the most rescue from 1- and 3-methylindoles; in contrast, 2-methylindole yields no increase in activity over the apo enzyme. Collectively these results show that structural details of the effector ligands can produce differing levels of catalytic activity – either through differences in binding affinity, or through conformational differences in the holo enzyme. In order to distinguish these possibilities, we next sought to directly detect binding of these ligands to the enzyme.
We previously estimated the binding affinity of indole to β-gly W33G to be 0.75 mM in the presence of saturating substrate, and 15 mM in the absence of substrate 14. The weak binding affinity of this interaction – and presumably those of the indole analogs – places it outside the sensitivity limits of many common approaches for detecting protein-ligand interactions. To overcome this hurdle we therefore turned to 19F NMR 24, 25: the large chemical shift anisotropy of 19F nuclei results in line width differences of the fluorine signal between its free and protein-bound states, which is highly amenable for detection of weak interactions 26. In other words, differences between the transverse relaxation times of the free and protein-bound states are manifest through broadening of the 19F signal 26, 27.
We began by identifying a fluorine-containing reporter molecule: we selected 6-fluoroindole, having found that this compound rescues enzyme activity at a very similar level as indole itself (Figure S2). Free in solution, the fluorine in this compound exhibits a sharp peak at −122.07 ppm. Upon addition of WT β-gly there is no change to this peak; however, upon addition of β-gly W33G we observe considerable peak broadening indicating that the reporter binds to the cavity-containing mutant (Figure 3a, Table S1). Importantly, for this experiment we included 2,4-dinitrophenyl 2-deoxy-2-fluoro-β-D-glucopyranoside, a covalently-attached substrate analog of this enzyme: given the allosteric linkage between the active site and effector site, we reasoned that inclusion of a substrate analog would order the active site residues for catalysis, and in doing so would also pre-order the effector site residues for ligand binding.
Next, we evaluated the ability of several ligands to compete with 6-fluoroindole for binding to β-gly W33G (presumably at the allosteric effector site). In addition to indole, we selected 3-methylindole (the next best rescuing compound), 2-methylindole (no rescue), and N-methylaniline (a smaller compound with intermediate rescue).
Upon addition of indole to our sample, we find that the 19F signal sharpens (Figure 3a, Table S1): this implies that indole displaces some of the 6-fluoroindole from the effector site. We also observe analogous peak sharpening for both 3-methylindole and N-methylaniline. In contrast, addition of 2-methylindole produces very little, if any, peak sharpening; this suggests that 2-methylindole binds to β-gly W33G much less strongly than do 6-fluoroindole and the other analogs included in this experiment.
This in turn provides an explanation for why 2-methylindole rescued activity less than other analogs (Figure 2). It is not the case that 2-methylindole binds to the allosteric effector site in a manner unproductive for catalysis; rather, 2-methylindole fails to rescue activity because it simply does not bind to this site. This explanation is further consistent with the protein structure (Figure 3b): the protein environment around Trp33 in the wild type enzyme is tightest around the 2-position, making this the most sterically challenging location at which to accommodate an extra methyl group.
19F NMR competition assays have been used to measure dissociation constants ranging from low nM to high μM 27, but unfortunately this experiment does not allow us to determine the binding affinities of these ligands in an accurate and quantitative way. Nonetheless, as a qualitative measure of binding this assay complements the enzymatic functional assay that will be presented below, for discriminating non-rescuing compounds that do not bind, versus those that bind but do not rescue.
To directly examine potential differences in the active site geometry resulting from rescue by alternate ligands, we soaked crystals of β-gly W33G with 3-methylindole, with N-methylaniline, and with several other ligands. While soaking with indole led to a very useful crystal structure of that complex 14, upon soaking with other ligands we found only the unbound protein. Attempts to co-crystallize other ligands with β-gly W33G also yielded the unbound protein. For this reason, we sought to probe for small structural differences indirectly, through effects on substrate recognition.
Others have shown for allosteric activators of an unrelated system that the level of activation can depend on the structural properties of a specific substrate, and is thus a function of the precise arrangement of the activator-enzyme-substrate complex 28. Our previous study implicated Trp433 as the switch that distinguishes the active and inactive states of β-gly W33G. Accordingly, we anticipated that the precise placement of Trp433 in response to the rescuing ligand would be crucial for dictating the level of catalysis in the holo enzyme – and that rescue may be dependent not only on the effector ligand, but also on the choice of substrate.
The structure of the wild type β-glycosidase has been solved in complex with two different covalent substrate analogs, 2-fluoro-2-deoxy-D-galactose (2F-Gal) and 2-fluoro-2-deoxy-D-glucose (2F-Glc) 29. While these two inhibitors form nearly identical interactions with the active site, the primary difference in these two structures is the interaction of the substrate analog with our “switch” residue, Trp433: 2F-Gal requires that the Trp433 sidechain splits its hydrogen bonding potential between the C3 and C4 hydroxyls (Figure 4a), whereas the different stereochemistry at the C4 position in 2F-Glc allows the Trp433 sidechain to hydrogen bond only with the C3 hydroxyl (Figure 4b). Based on this difference we anticipated that Gal-derived substrates would be more sensitive to structural perturbations of Trp433 than Glc-derived substrates. As a consequence, we further anticipated that Glc-derived substrates would be easier to rescue by effectors that do not perfectly fit the cavity, and may therefore not restore the perfectly ideal catalytic geometry of Trp433.
To test this hypothesis, we compared rescue using Gal-derived substrates versus Glc-derived substrates. Starting with the same Gal-derived substrate utilized in Figure 2 (fluorescein di-β-D-galactopyranoside, FDGal), we again observe better rescue with 2 mM indole than with 2 mM 3-methylindole. However, when we instead use the analogous Glc-derived substrate (fluorescein di-β-D-glucopyranoside, FDGlc) the difference in rescue between these two effectors is greatly reduced (Figure 4c). As anticipated, then, rescue of the Gal-derived substrate depends more on precisely recapitulating the catalytic geometry of Trp433. If 3-methylindole were perfectly superposed with the Trp33 sidechain in the wild type enzyme, its additional methyl group would align to the original Cβ position; in the rescued mutant, this would form a steric clash with the Cα to which it was previously covalently attached. We therefore expect that the 3-methylindole location would be slightly shifted relative to Trp33 (and relative to rescuing indole), which in turn explains its slightly decreased ability to rescue activity against FDGal. Though subtle, the ability of these rescuing ligands to shift the substrate preference for this enzyme is also highly reminiscent of “functional selectivity” in the pharmacology of natural signaling systems 30–33.
To definitively separate effects of binding affinity from effects on the active site geometry, we next examined the concentration dependences of these effectors on the rate of product formation, using FDGal as substrate. To mitigate potential effects on Km (and thus simplify analysis), we again carried out experiments in a regime at which the substrate concentration (750 μM) is higher than Km of the rescued enzyme (Figure S1); thus, the observed initial velocity depends primarily on Vmax (which, in turn, depends in part on the effector concentration).
To fit the resulting initial velocities, we adapted the rate equation of a simple allosteric kinetic mechanism 34 by simplifying under the limit of saturating substrate (see Methods). This gave an equation that relates the change in initial velocities due to rescue to two key parameters, KD and W. KD is the effector dissociation constant, and W is a “linkage” term describing the magnitude of the effect of the allosteric ligand on Vmax. Structurally, the term W can also be interpreted as the extent to which the rescuing ligand restores the enzyme’s catalytic activity, because it reports on the rate of the holo enzyme.
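As an illustration of how KD and W together shape the dose-response, the saturating-substrate limit of a single-substrate, single-effector allosteric scheme can be sketched as below. The exact equation is given in the Methods, which are not reproduced here, so the functional form, the function name `rescued_velocity`, and the parameter values are all assumptions made for illustration; they are not the fitted values from Figure 5.

```python
def rescued_velocity(effector_mM, kd_mM, w, v_apo=1.0):
    """Initial velocity at saturating substrate (assumed model form).

    v = v_apo * (1 + W * [X]/KD) / (1 + [X]/KD)

    v_apo is the velocity with no effector present; W scales it to the
    velocity of the fully effector-bound (holo) enzyme.
    """
    if effector_mM < 0 or kd_mM <= 0:
        raise ValueError("effector concentration must be >= 0 and KD must be > 0")
    ratio = effector_mM / kd_mM
    return v_apo * (1.0 + w * ratio) / (1.0 + ratio)


# Hypothetical parameters, chosen only to show the shape of the curve.
KD_MM = 5.0
W = 100.0
for conc_mM in (0.0, 1.0, 2.0, 5.0, 50.0):
    v = rescued_velocity(conc_mM, KD_MM, W)
    print(f"[effector] = {conc_mM:5.1f} mM -> v / v_apo = {v:6.2f}")
```

In this form the velocity equals the apo rate with no effector and approaches W times the apo rate at saturating effector, which is why W reports on the holo enzyme while KD sets the concentration range over which the transition occurs.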
Indole, 3-methylindole, and N-methylaniline each produce dose-dependent increases in the initial velocity that are well-described by this equation (Figure 5). To remain within the solubility limits for all three compounds, we were constrained to a maximum concentration of 5 mM. The curves have not fully saturated by this concentration, and so we present two views of the data: one that includes the extrapolated curve fitting (to show the full behavior of the model), and one that instead focuses only on the region for which experimental data was obtained (to show the agreement of the data to the model). Through these curve fits, the values of KD and W are obtained in tandem. To clearly establish the limits of this extrapolation, we report the bounds of KD and W (one standard error in either direction); the resulting bounds on the fitted curves are shown visually as well for comparison (Figure S3).
Among these three compounds, indole has the highest linkage term (W); this is consistent with our expectation that indole most effectively rescues the precise geometry of the active site. In fact, our extrapolation suggests that at high enough concentration the indole-rescued W33G mutant would actually surpass the rate of the wild type enzyme. While we cannot quite access these concentrations due to indole’s solubility, it is possible that the additional flexibility of the rescued enzyme allows β-gly W33G to adopt a conformation very slightly different than that of the wild type enzyme (and even more favorable for catalysis).
While the linkage term (W) for 3-methylindole is unsurprisingly worse than for indole, the KD value is actually slightly better than for indole. Thus, 3-methylindole appears to bind to β-gly W33G at least as tightly as indole, but does not rescue activity to the same extent because it does not fully restore the catalytic geometry. This observation is also consistent with those from the experiments described in the previous sections.
Finally, we compare the behavior of N-methylaniline to that of 3-methylindole. While the fitted values for W and KD are very similar for both ligands, the small differences provide an opportunity to visually highlight how these parameters together dictate the response to an effector ligand. 3-methylindole has a superior KD and thus more activity at low effector concentrations; in contrast, N-methylaniline has a superior W value and thus more activity at high effector concentrations – albeit at concentrations accessible only via extrapolation, due to solubility restrictions.
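The crossover described here, where a tighter-binding ligand wins at low concentration and a higher-W ligand wins at high concentration, can be made concrete with a short sketch. It assumes the same simplified saturating-substrate form, v/v_apo = (1 + W[X]/KD)/(1 + [X]/KD), which is an assumption since the Methods equation is not reproduced here; the function names and all numerical values are hypothetical and are not the fitted parameters for 3-methylindole or N-methylaniline.

```python
import math


def rescued_velocity(effector_mM, kd_mM, w):
    """Velocity relative to the apo enzyme at saturating substrate (assumed form)."""
    ratio = effector_mM / kd_mM
    return (1.0 + w * ratio) / (1.0 + ratio)


def crossover_concentration(kd_a, w_a, kd_b, w_b):
    """Positive effector concentration at which two ligands give equal velocity.

    Setting the two rate expressions equal and discarding the trivial root
    at zero gives x = (KD_b*(W_a - 1) - KD_a*(W_b - 1)) / (W_b - W_a).
    Returns None when the curves do not cross at a positive concentration.
    """
    if math.isclose(w_a, w_b):
        return None
    x = (kd_b * (w_a - 1.0) - kd_a * (w_b - 1.0)) / (w_b - w_a)
    return x if x > 0 else None


# Hypothetical ligands: A binds more tightly, B has the larger linkage term.
x_cross = crossover_concentration(kd_a=2.0, w_a=50.0, kd_b=4.0, w_b=80.0)
print("crossover concentration (mM):", x_cross)
```

Below the crossover the ligand with the steeper initial slope, (W - 1)/KD, gives more activity; above it the ligand with the larger W does. If the crossover lies beyond the solubility limit, the second regime is accessible only by extrapolation.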
Drawing from long-established “receptor theory” pharmacology of natural signaling systems 35, 3-methylindole and N-methylaniline behave as partial agonists for the β-gly W33G “receptor”, in relation to the full agonist, indole.
To test this model of β-gly W33G activation, we sought to rationally design a ligand with enhanced signaling relative to indole itself. Given our model, it will be very difficult to design a compound with more activity at high effector concentrations (better W), since this would require shifting the catalytic residues in a very prescribed way. Instead, we focused on ligands that might have better activity than indole at low effector concentrations, where binding affinity (better KD) is most responsible for determining activity.
Selectively adding fluorine is a common step in medicinal chemistry when seeking to improve potency of a compound: this makes the compound more lipophilic and also allows for productive halogen-pi interactions 36, 37. Because fluorine is isosteric with hydrogen, shape complementarity for the receptor is typically preserved. Further, fluorine-substituted indoles (such as the reporter compound used in the 19F NMR competition assay) are commercially available with substitution at almost all positions.
To begin, we solved the crystal structure of one of these, 5-fluoroindole, in complex with β-gly W33G (Figure 6a). Unsurprisingly, we found that 5-fluoroindole engages β-gly W33G in the same manner as indole, with no notable differences relative to the indole-bound or wild type structures of the enzyme. We then characterized rescue of activity from 5-fluoroindole, as we previously did for 3-methylindole and N-methylaniline. Indeed, relative to indole we find that 5-fluoroindole provides enhanced β-glycosidase activity at each of the concentrations we tested (Figure 6b). Notably, however, this difference is manifest most at low effector concentrations; as the concentration is increased indole starts to catch up with 5-fluoroindole, and is extrapolated to ultimately surpass the activity induced by 5-fluoroindole. This behavior is captured quantitatively through the parameters of each fit, in which KD favors 5-fluoroindole but W favors indole. While the crystal structure cannot provide an explanation for the difference in W, we attribute this to a very small difference in either the structure or dynamics of Trp433 in response to these two ligands.
Encouraged by the ability to rationally design an altered response by varying the rescuing ligand, we next sought to adjust the shape of the β-gly cavity, and test whether complementary rescuing ligands could be designed to these new shapes. As a first experiment, we reduced the size of the cavity by incorporating W33A instead of W33G. Earlier, we explained the preference of β-gly W33G for indole over 3-methylindole (Figure 5) by pointing out that while 3-methylindole restores the same number of (non-hydrogen) atoms that were removed by mutation, rescue would require close non-covalent contact between atoms that were previously bonded to one another (Cα and Cβ). Applying this logic to W33A, then, one would expect a steric clash between the Cβ of Ala33 and the atom replacing the Cγ of the original tryptophan sidechain (i.e. the indole 3-position). Instead, one would predict better rescue by the corresponding compound that lacks this atom: N-methylaniline. Spurred by this rationale, we therefore explored rescue of β-gly W33A by N-methylaniline, indole, and 3-methylindole.
Consistent with this expectation, we find that N-methylaniline rescues β-gly W33A better than either indole or 3-methylindole (Figure 7); however, the difference was surprisingly modest. A further surprise was the fact that 3-methylindole rescued activity better than indole, which was counterintuitive given that this experiment entailed reducing the size of the cavity.
The explanation for both of these puzzles is found in the separate contributions from active site geometry (W) and binding affinity (KD). N-methylaniline was expected to be most compatible with the Trp433 geometry that is optimal for catalysis; indeed, N-methylaniline has the highest value of W. Its relatively modest activation, meanwhile, derives solely from its poor binding affinity, which may reflect the entropic cost of ordering its freely rotatable bond (that is not present in the other compounds). In the same vein, the surprisingly good activity of 3-methylindole relative to indole also derives exclusively from a difference in binding affinity, presumably because 3-methylindole is more hydrophobic than indole. While it remains surprising that the value of W is not much worse for 3-methylindole than for indole, we speculate that 3-methylindole may flip to position its methyl group towards the exposed region used by indole’s amine group (Figure 3b); this hypothesis could best be tested through crystallographic evidence, however we have been unable to solve a structure of either β-gly mutant in complex with 3-methylindole. Finally, we note that the 3-methylindole and indole values of W are lower for W33A than for W33G; this is also consistent with the hypothesis that neither of the compounds can restore the active site structure quite as accurately when disrupted by Ala33.
In summary, while N-methylaniline proved to be the optimal ligand for rescuing this smaller cavity, the magnitude to which catalytic activity could be restored was nonetheless limited by the small size and flexible nature of this compound. To design a more sensitive switch, we therefore introduced a larger cavity into this enzyme.
Based on the structure of the wild type enzyme, there was only a single nearby sidechain in plane with the indole ring of Trp33, Val37 (Figure 8a). To expand the allosteric effector site, we therefore introduced an additional mutation, V37A, into β-gly W33G. Based on the location of this additional cavity-forming mutation, we expected that β-gly W33G_V37A would best be rescued by indole substituted at the 5-position.
We initially tested the activity of both indole and 5-methylindole against β-gly W33G, at a concentration of 2 mM (Figure 8b). As in our previous experiment (Figure 2), we found that indole conferred about twice the activity of 5-methylindole under these conditions. Testing these two compounds against β-gly W33G_V37A, however, showed a reversal of their activities: now 5-methylindole proved superior over indole.
To understand the basis for this difference, we explored β-gly W33G_V37A activity as a function of effector concentration (Figure 8c). Perhaps unsurprisingly, both compounds have very similar values of W: this implies that both compounds engage the effector site with very similar binding modes, and thus interact with Trp433 in the same way. Accordingly, the basis for the greater potency of 5-methylindole comes from its binding affinity (KD), which is almost an order of magnitude tighter than that of indole. This observation is also unsurprising, given the extra hydrophobic surface area that is buried upon binding of 5-methylindole instead of indole.
At the outset of this study, we tested a panel of indole analogs at 2 mM for rescue of β-gly W33G, and found that none of these surpassed the 25% of wild type activity recovered by indole itself (Figure 2). Now, with a more refined view of this system predicated on pharmacological receptor theory 35, we have designed a new protein-ligand pair that exhibits almost 80% of wild type activity at 2 mM effector (Figure 8b).
In the introduction to this study, we laid out two potential structural mechanisms for rescue at this designed allosteric effector site. In the first model the effector site behaves as a discrete active/inactive switch, activated through ligand binding. In the second model the effector site behaves as a rheostat, reporting not only on the presence of an activating ligand, but also on its precise structural complementarity for the effector site.
All of the accumulated evidence in our study points to the latter model. Numerous methylindoles show activity (all but 2-methylindole), highlighting the malleability of this site for accommodating alternate ligands. While certain ligands bind to this site more tightly than indole, none rescue the precise active site geometry optimal for catalysis quite as effectively as indole: instead, they serve as partial agonists of β-gly W33G.
Through the structure-activity relationship compiled for β-gly W33G, two principles emerge that guide design of agonists to fit a given cavity. First, the ideal rescuing ligand should replace the protein atoms removed by mutation as closely as possible: this leads to the most activity once the ligand is bound. Second, there may be a small tradeoff between a steric clash and a small cavity when replacing the deleted atoms: as demonstrated through indole’s rescue of β-gly W33G and N-methylaniline’s rescue of β-gly W33A, the small unfilled volume is less detrimental for activity than the structural rearrangement required to resolve the steric clash. With these design principles in mind, it became possible to rationalize the complete behavior of many different activating ligands, and also to design a new switch with enhanced activity.
The key strategy motivating design of the 5-methylindole / β-gly W33G_V37A switch was to improve binding affinity while at the same time maintaining the precise complementarity that restored the optimal geometry of Trp433. Given the very small ligands (indole has only 9 non-hydrogen atoms), there is an intrinsic biophysical limit to the potential binding affinity that can be achieved 38, 39. Indeed, the ligand efficiency (binding free energy per non-hydrogen atom) 38 of indole binding to β-gly W33G is −0.31 kcal/mol·atom (Table S2), which is comparable to the median ligand efficiency of other protein-ligand complexes (−0.34 kcal/mol·atom) 39. By design, 5-methylindole incorporates an additional potent interaction (burial of the methyl group in a tight cavity), increasing the ligand efficiency to −0.43 kcal/mol·atom. Given the typical limits of ligand efficiency 39, designing switches that respond to lower ligand concentrations will likely require the use of ligands that rescue larger cavities, or more sophisticated mechanisms that incorporate feed-forward signaling.
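The ligand efficiency arithmetic above can be sketched as follows, using the standard relation ΔG = RT ln(KD) with a 1 M standard state, divided by the number of non-hydrogen atoms. The temperature of 298.15 K and the function names are assumptions made for illustration; the temperature used for Table S2 is not stated in this section, so the back-calculated dissociation constants are indicative only.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal/(mol*K)


def ligand_efficiency(kd_molar, heavy_atoms, temperature_k=298.15):
    """Binding free energy per non-hydrogen atom, in kcal/mol per atom."""
    if kd_molar <= 0 or heavy_atoms <= 0:
        raise ValueError("KD and heavy atom count must be positive")
    delta_g = GAS_CONSTANT_KCAL * temperature_k * math.log(kd_molar)
    return delta_g / heavy_atoms


def kd_from_ligand_efficiency(le, heavy_atoms, temperature_k=298.15):
    """Dissociation constant (molar) implied by a given ligand efficiency."""
    return math.exp(le * heavy_atoms / (GAS_CONSTANT_KCAL * temperature_k))


# Ligand efficiencies quoted in the text; indole has 9 non-hydrogen atoms
# and 5-methylindole has 10. The temperature is an assumed value.
kd_indole = kd_from_ligand_efficiency(-0.31, 9)
kd_5_methylindole = kd_from_ligand_efficiency(-0.43, 10)
print(f"implied KD, indole:         {kd_indole * 1e3:.2f} mM")
print(f"implied KD, 5-methylindole: {kd_5_methylindole * 1e3:.2f} mM")
```

Under these assumptions the quoted efficiencies correspond to dissociation constants of roughly 9 mM for indole and roughly 0.7 mM for 5-methylindole; the fitted values themselves are those reported in Table S2.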
In our studies of other model systems activated by indole after incorporation of W➝G cavity-forming mutations, we found that inactivation and rescue could be mediated by local or global unfolding/refolding, rather than through a discrete conformational change 15. This mechanism enabled the use of effector sites located much further from the active site, but may come at the expense of sensitivity towards slightly different ligands. Here, we demonstrated that the response to different ligands could be tuned through a combination of binding affinity (KD) and precise structural complementarity (W). We anticipate that longer-ranged allosteric switches built in this manner are less likely to preserve the subtle structural response to different ligands as the signal is transduced through the protein; as a result, the response may instead be dominated simply by the binding affinity. The same limitation is also likely to apply to switches that are built by recombining modular domains 40, 41, because it is difficult to predict precisely how the linkers will transduce the input signal over to the output domain. An important advantage of chemical rescue of structure is the ability to place the effector site in close proximity to the output signal (i.e. the active site), which in turn may facilitate the graded responses we demonstrate here.
Looking forward, careful modulation of the response over a short distance may also facilitate design of switches that exhibit functional selectivity. Already we observe this property in β-gly W33G (though it was not explicitly designed) in the fact that indole and 3-methylindole tune the enzyme’s preference for substrates containing galactose versus glucose. Natural systems make extensive use of this paradigm, as exemplified by GPCRs that initiate signaling through either the canonical G protein-mediated pathway or the non-canonical β-arrestin G protein-independent pathway, depending on the particular ligand that is bound 19, 42, 43. We anticipate that the design principles emerging from this study of the β-gly W33G designed allosteric effector site will provide a first step towards rationally designing new synthetic switches capable of absorbing information from several different molecular cues, and providing distinct and meaningful responses to each of them 5.
A complete description of methods is available as Supporting Online Materials. Coordinates and structure factors for the crystal structure of β-gly W33G bound to 5-fluoroindole have been deposited with the Research Collaboratory for Structural Bioinformatics Protein Data Bank (PDB) with accession code 5IXE.
The weak affinity of the interactions we sought to measure places them outside the sensitivity limits of techniques such as surface plasmon resonance (SPR), isothermal titration calorimetry (ITC), and differential scanning fluorimetry (DSF). The sensitivity problem is exacerbated for some of these techniques by the particularly small ligands (< 200 Da) and the relatively large protein (56,000 Da). We recognized that the problem of detecting very weak binding of a small ligand is reminiscent of the challenges faced in fragment-based drug discovery campaigns, and therefore borrowed an emerging tool from their repertoire, 19F NMR 24, 25.
19F NMR spectra were acquired on a Bruker DRX spectrometer equipped with an 11.7 T magnet (19F resonance frequency of 470 MHz). Samples contained 500 μM β-glycosidase in 50 mM phosphate buffer in H2O with 10% protonated DMSO, 2 mM 6-fluoroindole, and 5 mM of the competitor ligand.
Protein samples were pre-treated with 2,4-dinitrophenyl 2-deoxy-2-fluoro-β-D-glucopyranoside. The 2,4-dinitrophenol serves as a leaving group such that the 2-deoxy-2-fluoro-β-D-glucose remains covalently attached to the protein. We confirmed labeling of the protein by broadening of the inhibitor’s 19F NMR peak, and also spectrophotometrically by production of 2,4-dinitrophenol.
All β-glycosidase enzyme assays were conducted using 58 nM β-glycosidase with fluorescein di-β-D-galactopyranoside (FDGal) or fluorescein di-β-D-glucopyranoside (FDGlc) as substrate, in a buffer of 50 mM sodium phosphate pH 6.5 and 10% DMSO at 37°C. Upon catalysis, FDGal is cleaved twice, yielding two molecules of D-galactose and one molecule of fluorescein. We detect product formation by following fluorescence with excitation at 485 nm and emission at 528 nm.
To fit the observed initial velocities, we adapted the rate equation of a simple allosteric kinetic mechanism 34 by simplifying under the limit of saturating substrate. This gave the relationship:
$$V = V_{apo} \, \frac{K_D + W[A]}{K_D + [A]}$$
where V is the initial velocity at a given effector concentration, [A] is the effector concentration, KD is the effector dissociation constant, Vapo is the maximal velocity in the absence of effector, and W is a “linkage” term describing the magnitude of the effect of the allosteric ligand on Vmax. Functionally, W is defined as the ratio of the maximal velocity at saturating effector, to the maximal velocity in the absence of effector (in other words, W is the fold-increase in Vmax that the effector can bring about). Throughout this work we interpret W as the extent to which the rescuing ligand restores the protein structure required for catalysis, because it reports on the rate of the holo enzyme.
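To make the roles of KD and W concrete, the following minimal sketch evaluates a single-site form that satisfies the limits described above, V = Vapo (KD + W[A]) / (KD + [A]): the velocity equals Vapo with no effector and approaches W·Vapo at saturating effector. The function name and the example parameter values are hypothetical and chosen only for illustration; they are not fitted values from this work.

```python
def rescued_velocity(effector_conc, kd, v_apo, w):
    """Initial velocity at saturating substrate for a given effector concentration.

    Assumed form: V = v_apo * (kd + w * [A]) / (kd + [A]).
    effector_conc and kd must share the same concentration units.
    """
    if kd <= 0:
        raise ValueError("kd must be positive")
    if effector_conc < 0:
        raise ValueError("effector_conc must be non-negative")
    return v_apo * (kd + w * effector_conc) / (kd + effector_conc)


if __name__ == "__main__":
    # Two hypothetical ligands: a tighter-binding partial agonist (low KD, low W)
    # and a weaker-binding full agonist (higher KD, higher W).
    concentrations_mm = [0.0, 0.5, 1.0, 5.0, 20.0]
    for a in concentrations_mm:
        partial = rescued_velocity(a, kd=0.5, v_apo=1.0, w=20.0)
        full = rescued_velocity(a, kd=2.0, v_apo=1.0, w=100.0)
        print(f"[A] = {a:5.1f} mM  partial = {partial:7.2f}  full = {full:7.2f}")
```

In this form, KD sets the effector concentration at which the response is half-way between Vapo and W·Vapo, whereas W alone sets the ceiling; a ligand can therefore bind more tightly and still rescue less activity at saturation.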
Standard errors for log(KD) and W were calculated with GraphPad Prism version 6.0 via:
$$SE(P_i) = \sqrt{\frac{SS}{DF} \cdot Cov(i,i)}$$
where Pi represents the ith adjustable (non-constant) parameter. SS is the sum of the squared residuals. DF is the degrees of freedom (the difference between the number of data points and parameters fit by regression). Cov(i,i) is the i-th diagonal element of the covariance matrix.
The standard error of log(KD) was used to calculate the upper and lower bounds of the KD values (one standard error above/below the best-fit value).
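The error propagation described above can be sketched as follows. The first function applies the standard error expression in terms of SS, DF, and Cov(i,i); the second converts a standard error on log(KD) into asymmetric bounds on KD. The function names are hypothetical, and the logarithm is assumed to be base 10, which is not stated explicitly here.

```python
import math


def parameter_standard_error(ss, df, cov_ii):
    """Standard error of a fitted parameter: sqrt((SS / DF) * Cov(i,i)).

    ss     : sum of squared residuals
    df     : degrees of freedom (data points minus fitted parameters)
    cov_ii : i-th diagonal element of the covariance matrix
    """
    if df <= 0:
        raise ValueError("df must be positive")
    return math.sqrt(ss / df * cov_ii)


def kd_bounds(log10_kd, se_log10_kd):
    """Return (lower, upper) KD bounds one standard error below/above the best fit.

    Assumes log(KD) is a base-10 logarithm (an assumption for this sketch).
    """
    lower = 10 ** (log10_kd - se_log10_kd)
    upper = 10 ** (log10_kd + se_log10_kd)
    return lower, upper


if __name__ == "__main__":
    # Hypothetical fit output, for illustration only.
    se = parameter_standard_error(ss=4.0, df=8, cov_ii=0.5)
    print(se, kd_bounds(-3.0, se))
```

Because the fit is performed on log(KD), the resulting interval is symmetric on a logarithmic scale and therefore asymmetric around the best-fit KD on a linear scale.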
We thank Andrea Bazzoli, Yan Xia, Jittasak Khowsathit, Andrew Beaven, and Wonpil Im for valuable discussions and feedback. Support for the NMR instrumentation was provided by NSF Major Research Instrumentation Grant 9977422 and NIH Center Grant 5P20GM103418. Use of the IMCA-CAT beamline 17-ID at the Advanced Photon Source was supported by the companies of the Industrial Macromolecular Crystallography Association through a contract with the Hauptman-Woodward Medical Research Institute. Use of the Advanced Photon Source was supported by the U.S. Department of Energy, Office of Science, Office of Basic Energy Sciences, under Contract No. DE-AC02-06CH11357. This work was supported by grants from the National Institute of General Medical Sciences (8P20GM103420 and 5P30GM110761), the National Center for Research Resources (5P20RR017708), the National Science Foundation through XSEDE allocation MCB130049, the Human Frontier Science Program (J.K.), and the Alfred P. Sloan Fellowship (J.K.). S. Jimmy Budiardjo was supported by a Post Baccalaureate Research Education grant (2R25GM078441) and the Pharmaceutical Aspects of Biotechnology training grant (5T32GM008359) from the National Institute of General Medical Sciences of the National Institutes of Health. Timothy J. Licknack was supported by the NSF REU Program (DBI-1156856). Michael B. Cory was supported by the Beckman Scholars Award from the Arnold and Mabel Beckman Foundation.
Associated Content (Supporting Information)
A complete description of experimental methods and procedures. Figure S1 showing Km values for W33G when rescued by three different effectors. Figure S2 showing indole and 6-fluoroindole activity for β-gly W33G. Figure S3 showing upper and lower bounds of initial velocity at ± 1 standard error. Figure S4 showing electron density maps of 5-fluoroindole bound to β-gly W33G. Figure S5 showing superposition of 5-fluoroindole-bound and indole-bound crystal structures. Table S1 containing 19F NMR peak integrals. Table S2 containing ligand efficiency values for each ligand/protein pair reported in this study. Table S3 containing crystallographic data for 5-fluoroindole-bound β-gly W33G. This material is available free of charge via the Internet at http://pubs.acs.org.