The computer-based design of protein-protein interactions is a rigorous test of our understanding of molecular recognition and an attractive approach for creating novel tools for cell and molecular research. Considerable attention has been given to redesigning the affinity and specificity of naturally occurring interactions. Several studies have shown that reducing the desolvation costs for binding while preserving shape complementarity and hydrogen bonding is an effective strategy for improving binding affinities. In favorable cases specificity has been designed by focusing only on interactions with the target protein, while in cases with closely related off-target proteins, it has been necessary to explicitly disfavor unwanted binding partners. The rational design of protein-protein interactions from scratch is still an unsolved problem, but recent developments in flexible backbone design and energy functions hold promise for the future.
Protein-protein interactions are critical to most biological processes. The ability to rationally create and destroy selective protein-protein interactions can be used to develop valuable therapeutic agents as well as novel tools for basic cell and molecular research. In general, protein designers have focused on three problems in interface design: enhancing the affinity of naturally occurring interactions, redesigning binding specificities within a family of interactions, and designing interactions from scratch. Two common approaches for achieving these goals are directed evolution and structure-based modeling. Recently, there has been significant progress in structure-based modeling as computational methods developed for stabilizing and designing monomeric proteins have been applied to protein interfaces. These programs use rapid optimization techniques to search for sequences that pack tightly, form good hydrogen bonds, and have favorable solvation energies [1,2]. Because the physical principles that determine protein stability and protein binding affinities are similar, sequence optimization algorithms and energy functions developed for de novo protein design can be used largely unaltered for protein interface design.
However, protein interfaces do have unique features that make them particularly challenging to design. Interfaces between proteins that interact transiently are generally more polar than the interior of a protein core. To successfully design these types of interfaces it is especially important to be able to model the subtle tradeoff between desolvation and hydrogen bond formation. Additionally, water-mediated interactions are more prevalent at interfaces than in the interior of proteins. Naturally occurring protein-protein interactions can be very specific despite the presence of competing proteins with very similar structures and sequences. To design high-specificity interactions, protein designers have been required to create new algorithms that allow for the explicit optimization of the energy gap between target and off-target interactions.
Here, we review recent progress in computer-based design of affinity and specificity at protein interfaces and describe new methodology that is likely to be important for achieving even more ambitious goals such as designing novel protein interactions from scratch. We focus on studies that have not been reviewed previously [2,7,8].
A systematic approach to identifying mutations that increase the affinity of a protein-protein interface will clearly be a useful complement to current selection/screening methods. Sammond et al. built and tested a protocol that focused on a set of detailed structure-based “rules”. Before a mutation was predicted to be stabilizing, it had to pass these rules, which fall into two categories. First, it must directly increase affinity by increasing the hydrophobic surface area buried by the interface. Second, it must maintain the structure of the interface. This was assessed via the interface interaction energy, by ensuring that the mutation did not disrupt any hydrogen bonds or lead to burial of additional polar groups, and by requiring that the mutation not destabilize the structure of the monomeric protein (each of these was evaluated using Rosetta).
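The two categories of rules can be pictured as a simple pass/fail filter over per-mutation quantities. The following is a minimal sketch only, not the published protocol: the field names, the zero thresholds, and the mutation labels are all hypothetical, and in practice the numbers would come from a modeling program such as Rosetta rather than being typed in by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MutationScores:
    """Hypothetical per-mutation quantities, each given as mutant minus wild type."""
    name: str
    d_buried_hydrophobic_sasa: float  # positive = more hydrophobic area buried
    d_interface_energy: float         # negative = more favorable interaction
    hbonds_lost: int                  # interface hydrogen bonds disrupted
    d_buried_polar_groups: int        # positive = additional polar groups buried
    d_monomer_energy: float           # positive = monomer destabilized


def passes_rules(m: MutationScores, max_monomer_penalty: float = 0.0) -> bool:
    """Return True only if the mutation satisfies both rule categories."""
    # Category 1: the mutation buries additional hydrophobic surface area.
    increases_burial = m.d_buried_hydrophobic_sasa > 0
    # Category 2: the interface and the monomer are not disturbed.
    keeps_interface = (
        m.d_interface_energy <= 0
        and m.hbonds_lost == 0
        and m.d_buried_polar_groups <= 0
    )
    keeps_monomer = m.d_monomer_energy <= max_monomer_penalty
    return increases_burial and keeps_interface and keeps_monomer


# Made-up candidates for illustration; none of these values are from the study.
candidates = [
    MutationScores("S35L", 40.0, -0.8, 0, 0, -0.2),
    MutationScores("T52F", 60.0, -1.1, 1, 0, 0.0),   # breaks a hydrogen bond
    MutationScores("A10G", -15.0, 0.1, 0, 0, 0.0),   # buries less hydrophobic area
    MutationScores("Q80I", 30.0, -0.5, 0, 0, 1.5),   # destabilizes the monomer
]
accepted = [m.name for m in candidates if passes_rules(m)]
```

A conjunctive filter of this kind is deliberately conservative: a mutation is rejected as soon as any single rule fails, which trades coverage for a lower false-positive rate.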
Given the physically reasonable basis of this approach, it is not surprising that other studies on different protein-protein interfaces have led to similar conclusions [10-12]. In two separate studies, Tidor and co-workers increased protein binding affinities by searching for mutations that had a net favorable Poisson-Boltzmann continuum electrostatic solvation and interaction score [13,14]. Many of the mutations swapped polar residues buried at the interface with similar-sized hydrophobic amino acids (Figure 1). Mutations that were found to increase the affinity of the SHV-1 β-lactamase / BLIP complex – while not designed with a particular focus on desolvation energies – served to emphasize their importance. Additionally, a reweighting of the terms in the Rosetta energy function to identify those which best predict affinity-increasing mutations in the TCR / MHC peptide complex pointed to sterics and solvation as most important. This is in keeping with the lesson that an efficient way to increase affinity – given the accuracy of current energy functions – is through creation of additional intermolecular contacts without increasing burial of charged groups. This is far from the only way to increase affinity, however, as other studies have found affinity-increasing mutations through consideration of long-range interactions between charged amino acids.
One of the most aesthetically appealing – and challenging – targets for designing specificity into protein-protein interactions is construction of an obligate heterodimeric interface from a homodimer. As pointed out by Bolon et al., this may represent the situation where “negative design” approaches are most essential. Since each component of the desired interface begins the design process with an identical backbone, “off-target” binding (i.e., formation of homodimers) is especially likely. An exciting biological application of this goal has recently been realized through design of heterodimeric nucleases [19,20]; while homodimeric nucleases cleave palindromic DNA sequences, construction of heterodimeric nucleases greatly expands the universe of sequences which can be targeted. This was achieved through “implicit” negative design techniques: introduction of opposite (and hence complementary) charges on each subunit, or else introduction of “all-big” sidechains on one subunit and “all-small” sidechains on the other.
Most computational studies that have focused on changing binding specificities have tested their designs with only one or a handful of potential off-target binders. Recent work by Grigoryan et al. is groundbreaking because they computationally and experimentally tested their designed bZIP-binding peptides against all 20 bZIP families in the human genome. Given the overlap in sequence and structure space between bZIP families, it is not surprising that explicit negative design was required to create specific binders. Negative design was especially important for disfavoring homodimer formation by the designed proteins. To rapidly compute target and off-target binding energies, a new method based on cluster expansion was developed that allows one to convert a structure-based model into a sequence-based scoring function. Integer linear programming was then combined with rapid energy evaluation by cluster expansion to identify sequences that had specified stabilities and specificities. It will be exciting to see if a similar approach can be used to design specificity in systems that are not as structurally conserved as the bZIP peptides. One potential limitation of the cluster expansion method is that it requires predetermined backbone coordinates for the target interactions, although it has been shown that cluster expansion can be accurately applied to families of very similar backbones. In cases where “undesirable” targets exhibit significant structural differences from the “desirable” target, examples continue to accumulate in support of the idea that mutations neutral or stabilizing to the “desirable” target are on average destabilizing to “undesirable” targets, and hence specificity can be achieved through explicit consideration of the target interaction only [25-28].
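The idea of a sequence-based scoring function combined with explicit negative design can be illustrated with a toy example. The sketch below assumes a cluster expansion truncated at pair terms (a constant, per-position terms, and position-pair terms) and replaces integer linear programming with brute-force enumeration, which is only feasible for very short sequences. All function names, the two-letter alphabet, and the numerical coefficients are invented for illustration and are not taken from the work described above.

```python
from itertools import product


def ce_energy(seq, constant, single, pair):
    """Energy from a pair-truncated cluster expansion of the sequence."""
    energy = constant
    for i, aa in enumerate(seq):
        energy += single.get((i, aa), 0.0)           # one-body terms
    for (i, j), table in pair.items():
        energy += table.get((seq[i], seq[j]), 0.0)   # two-body terms
    return energy


def specificity_gap(seq, target_model, off_target_models):
    """Energy of the best competing complex minus energy of the target complex."""
    e_target = ce_energy(seq, *target_model)
    e_best_off = min(ce_energy(seq, *m) for m in off_target_models)
    return e_best_off - e_target


def best_specific_sequence(alphabet, length, target_model,
                           off_target_models, max_target_energy):
    """Enumerate all sequences; keep the largest gap among stable-enough binders."""
    best = None
    for seq in product(alphabet, repeat=length):
        if ce_energy(seq, *target_model) > max_target_energy:
            continue  # positive-design constraint: must bind the target well
        gap = specificity_gap(seq, target_model, off_target_models)
        if best is None or gap > best[1]:
            best = ("".join(seq), gap)
    return best


# Toy models (constant, single-site terms, pair terms); values are made up.
target = (0.0, {(0, "E"): -1.0, (1, "K"): -1.0}, {(0, 1): {("E", "K"): -0.5}})
off_target = (0.0, {(0, "K"): -1.0, (1, "K"): -1.0}, {})

result = best_specific_sequence("EK", 2, target, [off_target],
                                max_target_energy=-1.0)
```

Because each energy evaluation is just a handful of table lookups, many competing states can be scored per candidate sequence, which is what makes explicit optimization of the gap practical.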
While rational design of a single protein sequence that binds a predetermined set of partners remains an unattained goal, the groundwork towards this has been laid through “multi-constraint” design. Using a set of proteins known to bind more than one partner, Humphris and Kortemme redesigned each protein either with a single constraint (to bind one partner in particular) or with multiple constraints (to optimally bind all known partners simultaneously). With this experimental setup, divergence of the single-constraint designs from the multiple-constraint design is indicative of compromise made by the promiscuous protein in order to maintain binding to multiple partners. Surprisingly, the degree of compromise in the sequence of most promiscuous proteins was found to be minimal; the structural basis for this phenomenon was traced to the fact that diverse interaction partners had evolved similar means to form key interactions with the promiscuous protein. In contrast, a small number of proteins – “hub” proteins with an exceptionally large number of interaction partners such as ubiquitin – were found to have evolved their vast promiscuity via a different approach. Extensive compromise over multiple surface patches was identified, probably representing the primary source of selective pressure on these protein sequences. These exciting revelations may provide direction for future efforts at rationally designing multi-functional proteins.
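The comparison between single-constraint and multi-constraint designs can be sketched with a deliberately simplified model. Here each partner is represented by a hypothetical table of independent per-position energies, the multi-constraint objective is assumed to be the plain sum of the partner energies, and compromise is measured as the fraction of positions at which a single-constraint optimum differs from the multi-constraint optimum. None of these choices is taken from the study itself, which used structure-based modeling.

```python
from itertools import product


def binding_energy(seq, partner_table):
    """Sum of per-position energies for one partner (toy, position-independent)."""
    return sum(partner_table.get((i, aa), 0.0) for i, aa in enumerate(seq))


def optimize(alphabet, length, partner_tables):
    """Lowest-energy sequence for the summed energy over the given partners."""
    return min(
        product(alphabet, repeat=length),
        key=lambda s: sum(binding_energy(s, t) for t in partner_tables),
    )


def compromise(single_seq, multi_seq):
    """Fraction of positions where the single-constraint design differs."""
    diffs = sum(a != b for a, b in zip(single_seq, multi_seq))
    return diffs / len(multi_seq)


# Made-up preference tables for two partners of one promiscuous protein.
partner_a = {(0, "L"): -1.0, (1, "L"): -1.0, (2, "A"): -0.5}
partner_b = {(0, "L"): -1.0, (1, "A"): -2.0, (2, "A"): -0.5}

single_a = optimize("AL", 3, [partner_a])
single_b = optimize("AL", 3, [partner_b])
multi = optimize("AL", 3, [partner_a, partner_b])
```

In this toy case the two partners agree at positions 0 and 2, so the multi-constraint sequence differs from the partner-A optimum at only one position, mirroring the low degree of compromise reported for most promiscuous proteins.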
Despite considerable success in the redesign of naturally occurring protein-protein interfaces, building protein-protein interactions from scratch has proven to be a challenging problem for computational design. In one case, grafting key residues from a known interface into a new scaffold protein created a novel interaction, but this is not a viable approach when the target interface has no known binding partners. In contrast to modeling-based approaches, there are many examples in which directed evolution has been used to design novel protein binders [33,34]. Strikingly, selecting from protein libraries that have surface loops containing only tyrosines and serines is sufficient to generate high-affinity binders. Like naturally occurring antibodies, the sequences and structures of the designed loops co-evolve to present complementary surfaces to the target protein. These results highlight the need for computational protocols to sample backbone conformational space as well as sequence space when designing protein interfaces.
A variety of approaches have been developed for combining backbone optimization with sequence optimization. Humphris et al. showed that a pre-generated ensemble of backbones built with small perturbations to Cα-Cβ bond vectors followed by sequence optimization improved recapitulation of favorable sequences at a protein-protein interface. Similar backbone perturbations have been incorporated into dead-end elimination algorithms for simultaneous optimization of backbone and sequence. Normal mode analysis has also been used to pre-generate alternative backbones, which were then used to design new helical peptides that bind Bcl-xL. Backbone optimization can also be iterated with sequence optimization. This strategy was recently used to design new conformations in a protein loop, a potential step in de novo interface design. In all of these examples, relatively small perturbations were made to the protein backbone. In contrast, large changes in loop structure are frequently seen in antibodies and binders evolved in the laboratory. Inclusion of large conformational perturbations in computational design will be challenging but potentially useful for creating novel interfaces.
In addition to efficient methods for sampling backbone and sequence space, it is critical to have an accurate energy function for evaluating the relative favorability of different models [40-42]. After producing a set of models, it is also common to use a set of structure-based filters to eliminate models that have defects. For example, a high number of buried polar groups with unsatisfied hydrogen bonds or the presence of low-probability side chain rotamers may be cause to throw out a model. It is particularly useful to have a way to compare the quality of the design models to naturally occurring proteins. Sheffler and Baker have created a procedure for identifying cavities within a protein and estimating the probability of observing a similar cavity in high-resolution crystal structures. Comparisons of this type may indicate that certain terms in the energy function are not being emphasized properly, or that more conformational sampling needs to be performed in order to find higher-quality design models.
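Post-design filtering of this kind amounts to discarding models with structural defects and then ranking the survivors by energy. The sketch below is one possible arrangement under stated assumptions: the `DesignModel` fields, the cutoffs of two buried unsatisfied polar groups and a minimum rotamer probability of 0.01, and the example values are all hypothetical rather than taken from any particular design program.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DesignModel:
    """Hypothetical summary of one design model."""
    name: str
    energy: float                       # lower = more favorable
    buried_unsatisfied_polars: int      # buried polar groups lacking hydrogen bonds
    rotamer_probabilities: List[float]  # one probability per designed side chain


def passes_filters(model: DesignModel, max_unsat: int = 2,
                   min_rotamer_prob: float = 0.01) -> bool:
    """Reject models with too many buried unsatisfied groups or rare rotamers."""
    if model.buried_unsatisfied_polars > max_unsat:
        return False
    return all(p >= min_rotamer_prob for p in model.rotamer_probabilities)


def rank_models(models: List[DesignModel]) -> List[DesignModel]:
    """Keep models that pass every filter, sorted from lowest to highest energy."""
    return sorted((m for m in models if passes_filters(m)),
                  key=lambda m: m.energy)


# Made-up models for illustration only.
models = [
    DesignModel("design_a", -310.0, 1, [0.40, 0.20, 0.05]),
    DesignModel("design_b", -325.0, 5, [0.30, 0.25]),   # too many unsatisfied groups
    DesignModel("design_c", -318.0, 0, [0.30, 0.004]),  # contains a rare rotamer
    DesignModel("design_d", -315.0, 2, [0.50, 0.10]),
]
ranked_names = [m.name for m in rank_models(models)]
```

Note that the lowest-energy model, `design_b`, is rejected here: filters catch defects that the energy function alone may under-penalize.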
As methods in computational interface design improve, an important aim will be to move beyond model studies and create proteins that are useful to others. When faced with pragmatic goals, it may be advantageous to combine computational methods with other techniques for protein engineering, such as directed evolution. Whether the goal is binding a specific target surface on a protein or designing a new enzyme, directed evolution usually requires that the starting sequences for selection contain some members that are at least partially functional. Computational design may be useful for creating libraries or individual sequences enriched in the target functionality, and thus provide a good starting point for directed evolution [45,46]. This approach was used in the recent design of an enzyme for a novel reaction. The long-term challenge is to learn from the optimization of current designs imparted by directed evolution, to further improve rational approaches.
The successful redesign of protein-protein binding affinities and specificities indicates that computational design is a viable approach for redesigning protein-signaling pathways. It is clear that explicit negative design will be needed in some cases to achieve specificity; however, techniques that focus only on positive design are suitable when the targets are significantly different in structure. Directed evolution experiments have shown that relatively simple libraries can be used to create novel protein binders, and they suggest means of improving current computational design methodologies.
This work was supported by NIH grant RO1GM073960 to B.K.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.