Modern molecular biology techniques have greatly accelerated the use of recombinant biological molecules in the clinic [1]. Accordingly, therapeutic proteins now comprise one of the most important and fastest-growing sectors of pharmaceutical development. Recombinant proteins have provided novel therapeutic options for a wide range of human diseases, including those for which no drugs produced by more traditional chemistry-based methods exist.
Many proteins, however, are unsuitable as therapeutic molecules in their naturally occurring state. Protein engineering methods provide routes to achieve a wide array of desired properties. One of the most popular and effective engineering techniques is to subject a protein of interest to an evolutionary process [2, 3, 4]. The power of protein evolution derives from its ability to select variants with a desired property (in particular, improved binding affinity) from a highly diverse pool. Using such methods for directed evolution, recombinant proteins can be engineered to bind a nearly limitless repertoire of potential targets with relatively high specificity and affinity.
Mammalian immune systems come equipped with natural protein engineering tools. Antibodies, as products of the adaptive immune system, are exceptionally well suited to combating infectious disease: hypervariability and a natural affinity maturation process allow for recognition of diverse antigens [5], and a constant region triggers potent immune mechanisms. Antibodies have therefore been commonly used as therapeutic molecules, either in their natural state or after further engineering.
Alternatives to antibodies have also been developed, largely in the form of diverse scaffolds for the design and engineering of recombinant proteins that often serve as high-affinity steric inhibitors of deleterious protein interactions (reviewed in [6, 7]). Like antibodies, these scaffold proteins are often able to bind a wide range of target proteins. The natural protein binding partners of drug targets, although more restricted in their specificity than generic scaffold proteins, provide additional alternatives for protein engineering that can lead to highly effective therapeutic proteins [8, 9, 10].
Protein engineering methods have traditionally been restricted to amino acid sequence variation of the initial protein architecture, and not expansion or modification of the architecture itself. Recent studies, however, have shown that engineering strategies that dispense with natural protein architectures, through the recombination and rearrangement of protein domains and modules, can result in engineered proteins that exhibit unique molecular recognition properties and novel functions (reviewed in [11]). Hybrid methods that lie somewhere between maintaining and dispensing with the initial protein architecture, by diversifying not only the sequence but also the length of protein loops within or near the protein-protein interface, have now been used successfully in numerous molecular systems. This is yet another engineering strategy that mimics nature, especially the length diversity of complementarity determining region (CDR) 3 loops in antibodies and T cell receptors (TCRs) [12]. Several groups using as a scaffold the 10th type III domain of human fibronectin, which has an immunoglobulin-like β-sandwich fold and CDR-like loops, have exploited loop length diversity to achieve significant affinity gains [13, 14, 15, 16]. Here we present the detailed molecular basis of a protein engineering process that combined amino acid residue diversity with loop length diversity and resulted in a greater than million-fold affinity increase in an engineered TCR-superantigen (SAG) complex.
We previously used a semi-rational protein engineering strategy that incorporated structure-based knowledge concerning protein complexes that are homologous to the targeted complex in order to increase the extent and degree of affinity maturation [17]. We generated an engineered TCR variant named G5-8, derived from the mouse TCR Vβ8.2 chain (mVβ8.2), that binds to the bacterial SAG staphylococcal enterotoxin B (SEB) with a three million-fold increase in affinity relative to the wild type mVβ8.2 (measured binding affinities of 48 pM [17] for G5-8 and 150 μM [18] for the wild type). Additionally, we showed that G5-8 acts as an inhibitor of SEB-mediated T cell activation and is completely protective in vivo when administered to animals challenged with a lethal dose of SEB [17]. A brief overview of our semi-rational, structure-based protein engineering strategy is presented here as a guide to the structural and energetic bases of the engineered affinity maturation described below.
Initially, following a standard directed evolution strategy, we created libraries with genetic variability in the mVβ8.2 region, but no sequence length changes, within the targeted mVβ8.2/SEB complex molecular interface and selected affinity-matured variants by yeast display (). Genetic diversity was focused entirely on the CDR2 loop of mVβ8.2 since it forms the majority of the protein-protein interface with SEB and contains several hot spot contacts [19]. From this process, we generated G2-5, a variant of mVβ8.2 that binds SEB with an affinity of 650 pM [17], an approximately 200,000-fold increase relative to the wild type complex.
Subsequently, building on the G2-5 platform, we followed a semi-rational directed evolution engineering strategy () that takes advantage of structure-based knowledge of a homologous TCR/SAG complex, the human TCR Vβ2.1 domain (hVβ2.1) in complex with streptococcal pyrogenic exotoxin C (SpeC). SpeC interacts with hVβ2.1, forming intermolecular contacts with each TCRβ hypervariable loop [20] (, left panel), while SEB contacts only the mVβ8.2 CDR2 and HV4 loops [21] (, left and middle panels). The hVβ2.1 CDR1 loop includes a non-canonical single amino acid residue insertion, which acts to push several residues C-terminal to it closer to the SpeC molecular surface, where they make numerous intermolecular interactions that have been shown to augment the affinity of the hVβ2.1/SpeC complex [22]. Conversely, residues from the shorter CDR1 loop of mVβ8.2 are located at too great a distance from SEB to make specific interactions (, middle panel).
With this more comprehensive structural understanding of TCR/SAG interactions and seeking to functionalize the CDR1 loop of mVβ8.2 as a meaningful contributor to increased SEB binding affinity, we extended the standard directed evolution approach (, right panel) by generating additional mVβ8.2 libraries that included randomized CDR1 loops with either one or two additional amino acid residues relative to the wild type sequence and selecting for affinity-matured variants (, right panel). After exhaustive, iterative rounds of mutagenesis and selection for mVβ8.2 variants with modified CDR1 loop length, most of the isolated variants contained a single additional residue. One of these, variant G5-8, incorporated the additional CDR1 loop residue (Ser27aG5-8), as well as two CDR1 loop variant residues (Tyr28G5-8). G5-8 binds to SEB with an affinity of 48 pM [17], 3 million-fold higher than the wild type mVβ8.2/SEB complex, and more than 10-fold higher than G2-5, the highest affinity variant with a conserved sequence length relative to the template for directed evolution, wild type mVβ8.2.
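The fold-improvements quoted above follow directly from the ratios of the reported dissociation constants, and the same ratios can be converted into approximate binding free energy differences. The short sketch below illustrates that arithmetic. The temperature of 298.15 K is an assumption made here for illustration (the text does not state the measurement temperature), and the function and variable names are hypothetical.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
T_ASSUMED = 298.15  # ASSUMPTION: temperature in K, not stated in the text

# Dissociation constants reported in the text, converted to molar units.
KD_MOLAR = {
    "wild type mVb8.2": 150e-6,  # 150 uM
    "G2-5": 650e-12,             # 650 pM
    "G5-8": 48e-12,              # 48 pM
}


def fold_improvement(kd_reference, kd_variant):
    """Return how many times tighter the variant binds than the reference."""
    return kd_reference / kd_variant


def ddg_kcal_per_mol(kd_reference, kd_variant, temperature=T_ASSUMED):
    """Return ddG = RT ln(Kd_variant / Kd_reference); negative means tighter binding."""
    return R_KCAL * temperature * math.log(kd_variant / kd_reference)


if __name__ == "__main__":
    wild_type = KD_MOLAR["wild type mVb8.2"]
    for name in ("G2-5", "G5-8"):
        fold = fold_improvement(wild_type, KD_MOLAR[name])
        ddg = ddg_kcal_per_mol(wild_type, KD_MOLAR[name])
        print(f"{name}: {fold:,.0f}-fold, ddG = {ddg:.2f} kcal/mol")
    step = fold_improvement(KD_MOLAR["G2-5"], KD_MOLAR["G5-8"])
    print(f"G5-8 versus G2-5: {step:.1f}-fold")
```

Under the assumed temperature, the ratio for G5-8 is about 3.1 million-fold, corresponding to roughly 8.9 kcal/mol of additional binding free energy, and the step from G2-5 to G5-8 is about 13.5-fold, consistent with the figures quoted in the text.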
The structural basis of this rationalized protein engineering method is now revealed by our 2.95 Å X-ray crystal structure of the G5-8/SEB complex, combined with a mutational analysis (see Supplementary Information – Materials and Methods). Crystallographic and refinement statistics for this structure are listed in . There were eight G5-8/SEB complexes per asymmetric unit in this crystal. The interface G5-8 variant residues and the SEB residues that they contact from all of these complexes superimpose essentially perfectly (Supplementary Figure 1), even though in the final stages of refinement the non-crystallographic constraints on all interface residues were relaxed. Electron density maps also clearly delineate the side chain positions of these residues (Supplementary Figure 2).
Crystallographic and refinement statistics for the G5-8/SEB complex structure
When the G5-8/SEB structure is superimposed onto the wild type mVβ8.2/SEB structure [21], the two main chains of the complexes are nearly indistinguishable except for the CDR1 loops of G5-8 and mVβ8.2 (). A schematic interaction map of the wild type mVβ8.2/SEB and affinity-matured G5-8/SEB protein-protein interfaces is shown in Supplementary Figure 3. Three G5-8 residues (Ser27aG5-8) replace two mVβ8.2 residues (Asn28mVβ8.2), which results in a longer CDR1 loop with a distinct conformation (). The structural effect of these CDR1 loop sequence and length changes in G5-8 is that the side chain of Tyr28G5-8 is pointed directly toward SEB (), confirming the rational basis of our engineering strategy.
Structural analysis of the affinity matured complex
Mutations in the CDR2 loop of G5-8, which are similar to those in G2-5, include two relatively large amino acid side chains that replace minimal side chains. These include substitutions of Ala52mVβ8.2, respectively (). Together, the variant CDR1 and CDR2 loop residues in G5-8 that make intermolecular contacts with SEB (Tyr28G5-8) encompass the β-sandwich domain of SEB, extending from the inter-domain cleft to its periphery (), a well-documented region of energetic importance for SAG/TCR complexes [19, 23].
The variant residues in G5-8 from both the CDR1 and CDR2 loops form numerous intermolecular contacts with SEB that are absent in the wild type mVβ8.2/SEB complex. This results in relative increases in buried TCR surface area (805 Å² versus 561 Å²), shape complementarity (0.67 versus 0.56), and hydrogen bonds (11 versus 3). The CDR1 loop mutation Tyr28G5-8 forms a pi-stacking interaction with Arg110SEB and a hydrogen bond with Asn60SEB (). This results in an additional ~70 Å² of buried surface area that is not present in the mVβ8.2/SEB complex. The CDR2 loop mutations, Ile52G5-8 and Arg53G5-8, form van der Waals contacts and a hydrogen bond with a trio of SEB asparagine residues, Asn31SEB, Asn60SEB and Asn88SEB. These three variant residues in the G5-8 CDR1 (Tyr28G5-8) and CDR2 (Ile52G5-8 and Arg53G5-8) loops comprise the majority of the increased buried surface area and intermolecular contacts in the G5-8/SEB complex relative to the wild type complex and form a contiguous interface with SEB centered around Asn60SEB ().
In protein-protein interactions, not all noncovalent contacts in the interface are energetically equivalent [24, 25]. To determine which mutations in G5-8 resulted in significant energetic changes in complex formation with SEB, relative to the mVβ8.2/SEB complex, we assessed relative binding affinity changes for reversion mutations of each variant residue, as well as alanine and/or phenylalanine mutations of Tyr28G5-8 (see Supplementary Information – Materials and Methods). Just as others have combined phage display with alanine scanning mutagenesis to create “shotgun” alanine-scanning mutagenesis [26, 27, 28], we combined yeast display and reversion/replacement mutagenesis, as we had done previously with another TCR/SAG interaction, human Vβ2.1 in complex with toxic shock syndrome toxin-1 [8], to obtain a facile and efficient method for the energetic evaluation of individual amino acid residues, or individual atoms thereof, in an evolved protein ().
Energetic dissection of the affinity increasing mutations
Using this approach, we found that several mutations in both the CDR1 and CDR2 loops of G5-8 were energetically important for complex formation (). Specifically, reversion mutations at the CDR1 position 28 (Tyr28→Asn; red in ) and the CDR2 positions 52 through 54 (Ile52→Ala, Arg53→Gly, Asn54→Ser; blue in ) resulted in significant reductions in binding affinity when displayed on the yeast surface in the context of the G5-8 background.
To further dissect the molecular basis of affinity maturation in the CDR1 loop, we performed a similar mutagenesis analysis with Tyr28G5-8→Phe and Tyr28G5-8→Ala mutations. These two replacement mutations abrogate the hydrogen bond and pi-stacking interactions, respectively, observed in the crystal structure (). These assays indicated that the binding energy ascribed to Tyr28G5-8 is derived primarily from the pi-stacking interaction between its phenyl ring and Arg110SEB, and not from the hydrogen bond formed between its hydroxyl group and Asn60SEB (red in ). Additionally, we observed no relative change in binding for the Ser27aG5-8→Ala mutation, confirming that this inserted residue does not itself make energetically significant interactions with SEB. Instead, the single residue insertion at this position probably acts as a spacer to lengthen the CDR1 loop such that Tyr28G5-8 can form energetically productive contacts with SEB (see below).
The contiguous stretch of CDR2 loop residues 52 through 54 is critically important for affinity maturation (blue in ). The reversion mutation at position 53 has the largest effect, indicating that this position contributes most significantly to the affinity maturation process, as might be expected from the ~210 Å² increase in buried surface area that results from Arg53G5-8 relative to that of Gly53mVβ8.2. Likewise, the ~75 Å² increase in buried surface area that results from mutating Ala52mVβ8.2 to Ile52G5-8 makes a significant contribution to binding in the G5-8/SEB complex. Although Asn54G5-8 makes no intermolecular contacts with SEB, it may contribute to the affinity maturation process, perhaps through intramolecular interactions that act to stabilize the conformation of the G5-8 CDR2 loop. According to our mutational analysis, several other residues in both the CDR1 and CDR2 loops, including Lys24G5-8, Met48G5-8 and Val55G5-8 (grey in ), may contribute to the affinity maturation process in a similar, although less significant, manner. The molecular basis of affinity maturation by these residues is uncertain, but it may be due to effects on the conformational flexibility of the CDR loops, as suggested by an Arg53G5-8 to Ala53G5-8 mutation that we observed to have an affinity intermediate between those of arginine and glycine residues at position 53. Based on our crystal structure, it is unlikely that the Ala53G5-8 Cβ atom can contact SEB, indicating that CDR2 loop entropy and/or flexibility may contribute to G5-8/SEB binding.
In addition to the experimental mutational analysis described above, we performed a computational analysis of the same set of individual reversion and replacement mutations using the Rosetta program [29] (see Supplementary Information – Materials and Methods). The results of this computational analysis strongly corroborate the experimental results. A plot of Rosetta ΔΔG scores versus experimentally measured binding free energies for all 18 CDR1 and CDR2 loop mutants () exhibits a correlation coefficient of 0.91. As with the experimental analysis, the computational analysis clearly distinguished mutations with profound binding effects from those with insignificant effects and clearly identified the pi-stacking interaction of Tyr28G5-8 as being of greater energetic importance than the hydrogen bond formed between its hydroxyl group and Asn60SEB (Supplementary Table 1). Only those residues for which we observed a small energetic effect experimentally, including Lys24G5-8, were in poor agreement with the computational results. These residues do not make specific intermolecular contacts with SEB () and, thus, their energetic effects likely involve backbone conformational changes, as mentioned above, which are more difficult to model using computational algorithms [30].
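For readers who wish to reproduce a comparison of this kind, the sketch below shows one way to compute a Pearson correlation coefficient between paired computed and measured free energy changes. It assumes that the reported correlation coefficient is a Pearson coefficient, which the text does not specify. The function name is hypothetical, and the per-mutant values themselves are not reproduced here, so the inputs must be supplied by the reader.

```python
import math


def pearson_r(computed, measured):
    """Pearson correlation coefficient between two equal-length sequences.

    `computed` could hold computed ddG scores and `measured` the matching
    experimental binding free energy changes, one pair per mutant.
    """
    if len(computed) != len(measured):
        raise ValueError("sequences must have the same length")
    n = len(computed)
    if n < 2:
        raise ValueError("at least two paired values are required")

    mean_c = sum(computed) / n
    mean_m = sum(measured) / n

    # Sums of squared deviations and of cross-products about the means.
    s_cc = sum((c - mean_c) ** 2 for c in computed)
    s_mm = sum((m - mean_m) ** 2 for m in measured)
    s_cm = sum((c - mean_c) * (m - mean_m) for c, m in zip(computed, measured))

    if s_cc == 0 or s_mm == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return s_cm / math.sqrt(s_cc * s_mm)
```

A coefficient close to 1 indicates that the computed scores rank and scale the mutants in nearly the same way as the measurements. A handful of poorly modeled mutants with small experimental effects can still coexist with a high overall coefficient when the large-effect mutants agree well.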
Computationally, we also assessed why the CDR1 residues in G5-8 may have given rise to higher affinity and, therefore, selection in the final round of the directed evolution process. We found that without the “spacer” residue, Ser27aG5-8, the tyrosine residue at position 28 makes very few, and no energetically favorable, contacts with SEB (Supplementary Table 2), supporting the need for CDR1 loop extension to achieve increased affinity. In addition, computational analysis revealed very few amino acids at position 28, other than tyrosine, to be energetically favorable in an extended CDR1 loop, with only tryptophan, phenylalanine (which was verified experimentally), and glutamic acid predicted to result in affinities commensurate with tyrosine at that position (Supplementary Table 2). The tryptophan is predicted to have favorable packing interactions in an orientation similar to Tyr28, while the modeled glutamic acid adopts a conformation for maximal electrostatic interactions with the positively charged side chain of Arg110 on SEB (Supplementary Figure 4).
The structural and energetic changes that arise from our structure-based, semi-rational directed evolution approach are entirely compatible with our original rationale for modifying the standard engineering strategy. Just as in the wild type hVβ2.1/SpeC structure [20], we find that lengthening the CDR1 loop of G5-8 by a single additional residue has the effect of pushing a residue C-terminal to the insertion site closer to SEB, with which it can form intermolecular contacts that significantly increase binding affinity. The evolution step of this approach is still required, however, as simply increasing the length of the wild type CDR1 loop would not provide for these energetically productive interactions, since the wild type residue Asn28mVβ8.2 would be unable to form similar pi-stacking or hydrogen bond interactions with SEB. Thus, by rationalizing the directed evolution process in a structure-based manner to augment its evolutionary power, we have achieved an unprecedented level of affinity maturation in a protein-protein interaction that, in turn, resulted in a highly effective protein therapeutic.