HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
Crit Rev Biochem Mol Biol. Author manuscript; available in PMC 2013 November 12.
Published in final edited form as:
PMCID: PMC3825456

Genetic Constraints on Protein Evolution


Evolution requires the generation and optimization of new traits (“adaptation”) and involves the selection of mutations that improve cellular function. These mutations were assumed to arise by selection of neutral mutations present at all times in the population. Here we review recent evidence that indicates that deleterious mutations are more frequent in the population than previously recognized and these mutations play a significant role in protein evolution through continuous positive selection. Positively selected mutations include adaptive mutations, i.e. mutations that directly affect enzymatic function, and compensatory mutations, which suppress the pleiotropic effects of adaptive mutations. Compensatory mutations are by far the most frequent of the two and would allow potentially adaptive but deleterious mutations to persist long enough in the population to be positively selected during episodes of adaptation. Compensatory mutations are, by definition, context-dependent and thus constrain the paths available for evolution. This provides a mechanistic basis for the examples of highly constrained evolutionary landscapes and parallel evolution reported in natural and experimental populations. The present review article describes these recent advances in the field of protein evolution and discusses their implications for understanding the genetic basis of disease and for protein engineering in vitro.

Keywords: adaptation, evolvability, enzyme evolution, epistasis, positive selection


Understanding evolution at the molecular level is a central goal in modern biology. The rates of protein sequence evolution provide information about the contribution of individual proteins to fitness, the location of functional sites within these proteins, and are relevant to understanding genetic diversity in disease. The present article reviews recent advances in the field of protein evolution and describes their implications for protein engineering and for understanding the genetic basis of disease.

Mutations occur at an approximately constant rate. They can be fixed stochastically (by random drift), eliminated by negative or “purifying” selection, or they can be positively selected. The present discussion will focus exclusively on point mutations, specifically missense mutations, and will not consider other genetic mechanisms contributing to protein evolution such as duplications/shuffling and horizontal transfer. This discussion will also focus on natural evolution, and will only consider “in vitro” evolution as a model for natural evolution. Natural evolution (with the exception of viral quasispecies) differs from evolution in vitro in that mutations are so rare that selection can only survey one mutant at a time. Therefore, in natural evolution fitness valleys limit the possible trajectories available for evolution.

The challenge of molecular adaptation

Biological functions have been optimized for the conditions under which they evolved. When these conditions change, existing activities need to be fine-tuned or new traits need to arise. This process, known as “adaptation”, occurs through the selection of mutations in genes controlling biological activities. Some sort of gain of function is usually involved, which is a challenge considering that mutations occur randomly and that the sequence space is astronomically large.

The potential of biological activities or of specific genes for molecular adaptation is of critical evolutionary and biotechnological importance. This property, known as “evolvability”, is at least partially inherent to proteins. In general, evolvability increases with mutation robustness (the ability of proteins to tolerate mutations), because it allows a wider exploration of sequence space. Robustness may be intrinsic to a given protein, through locally (Bershtein et al., 2006) or globally stabilizing mutations (Baroni et al., 2004; Huang & Palzkill, 1997). It can also be provided extrinsically, through chaperones or other interacting proteins (Ellis, 2007). Promiscuity, the recognition of alternative substrates or catalysis of alternative chemical reactions, is likely a major mechanism of functional diversification because enhancing preexisting activities requires far fewer amino acid changes than generating new ones. Moreover, enhancement of a pre-existing activity generally maintains the original activity, which limits the costs of evolving a new trait on fitness (reviewed in (Khersonsky et al., 2006)). Finally, modularity, i.e. the presence of functionally independent motifs, also contributes to increased evolvability. Given that evolvability appears to be an intrinsic property of proteins, it could be subject to selective pressures. Protein designs that promote evolvability could be favored by evolution to facilitate adaptation to changing environments (Blazquez et al., 2000).

Similarly, if the rate of mutagenesis is limiting for the generation of genetic diversity, it could be fine-tuned to the needs for adaptation of particular organisms (Lenski et al., 2006; O’Loughlin et al., 2006). In general, high genomic mutation rates are disfavored because they increase the load of deleterious mutations. In asexual populations, where linkage disequilibrium is strong, alleles conferring increased mutagenesis may be selected because of their increased probability of generating beneficial mutations. Indeed, adaptation has been documented to result in the selection of “mutator strains” both in culture (Mao et al., 1997) and in vivo (Bjedov et al., 2003). This selection, however, is very inefficient, as it operates at the population level rather than at the individual level. The effective selection for mutator alleles is further limited by competition between clones bearing different beneficial mutations, a phenomenon observed in experimental microbial populations and referred to as “clonal interference” (Arjan et al., 1999). Thus, evolvability may be a selectable trait in asexual populations in situations in which beneficial mutations are extremely infrequent: in the presence of bottlenecks (Levin et al., 2000), or when a population is initially well adapted (Arjan et al., 1999). Alternatively, evolvability may be simply incidental, except in situations that increase the ratio of beneficial versus deleterious mutations such as antigenic variation in loci under intense immune pressure (Sniegowski & Murphy, 2006).

Inconsistencies of the “nearly neutral” theory

For a long time, the general assumption has been that adaptive mutations are selected from a pool of neutral or nearly neutral mutations, i.e. mutations with little or no effect on fitness. The minimal effect of these mutations on the fitness of the organism would allow sequence space to drift enough to alter protein function so its performance can be adjusted to the demands of evolution. This view has been recently challenged by three different approaches: biophysical evidence that most missense mutations should affect protein function, phylogenetic evidence that the dispensability of proteins has little effect on rates of protein evolution, and the presence of signatures indicative of positive selection in the amino acid sequences of diverse genomes.

Most missense mutations affect protein function

Proteins appear to have little thermodynamic stability. Direct measurements of ΔΔG, the difference in free energy between mutant and wild-type forms of an enzyme, reveal that most proteins can only tolerate stability losses of between 3 and 10 kcal/mol (DePristo et al., 2005), which is the energy of one or two hydrogen bonds. Too much stability may also decrease activity, as proteins “breathe” and require mechanical flexibility for catalysis (James & Tawfik, 2003). Such a tradeoff between stability and activity has been demonstrated in in vitro studies on the evolution of thermostability (Akanuma et al., 1998) and drug resistance (Wang et al., 2002) and may well be a more general phenomenon (DePristo et al., 2005). Given that the effect of amino acid substitutions on ΔG is on the order of 0.5 to 5 kcal/mol, a significant fraction of missense mutations would be expected to affect enzyme function (DePristo et al., 2005). Screening and selection from libraries with random mutations provides a direct experimental approach to determine the probability that a random amino acid substitution inactivates the enzymatic activity of a given protein. This approach, however, is not sensitive to moderate or subtle effects on activity and therefore these deleterious mutation estimates represent only a lower limit. Studies of three different proteins established that approximately one-third of random mutations in proteins have severe deleterious effects on their function (>90% loss of activity). Two of these are monomers, human 3-methyladenine DNA glycosylase (AAG, 298 amino acids long) (Guo et al., 2004a), and the 430 amino acids of the E. coli Pol I Klenow fragment (Loh et al., 2007b). The third one is the E. coli lac repressor, a tetramer of four 360 amino acid polypeptides. The fact that these three studies independently estimated a probability of inactivation of ~33% suggests that this may be an intrinsic property of individual proteins based on general principles of protein folding and solubility. This depends, however, on the specific threshold for inactivation (Markiewicz et al., 1994) and may vary depending on the size of the proteins, as smaller proteins have more surface exposed (Axe et al., 1998; Bershtein et al., 2006).
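The link between a small stability margin and sensitivity to missense mutations can be illustrated with a minimal sketch. It assumes a simple two-state (folded/unfolded) equilibrium, which is a simplification not stated in the text above, and the function names and example values are hypothetical. Here dg_unfold is the free energy of unfolding (positive for a stable protein) and ddg is the destabilization caused by a mutation, both in kcal/mol.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_folded(dg_unfold, temp_k=298.15):
    """Fraction of folded molecules in a two-state model.

    dg_unfold = G(unfolded) - G(folded), in kcal/mol; positive means stable.
    """
    return 1.0 / (1.0 + math.exp(-dg_unfold / (R_KCAL * temp_k)))


def fraction_folded_after_mutation(dg_unfold, ddg, temp_k=298.15):
    """Apply a destabilizing mutation (ddg > 0 lowers stability)."""
    return fraction_folded(dg_unfold - ddg, temp_k)


if __name__ == "__main__":
    # Hypothetical protein with a 5 kcal/mol stability margin.
    for ddg in (0.5, 1.0, 3.0, 5.0):
        print(ddg, round(fraction_folded_after_mutation(5.0, ddg), 4))
```

Under these assumptions a mutation that consumes the whole stability margin leaves only half of the molecules folded, so a protein with a margin of a few kcal/mol can absorb very few substitutions in the 0.5 to 5 kcal/mol range.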

Chaperones are proteins that assist protein folding and thus would be expected to suppress the deleterious effects of destabilizing mutations. Consistent with this prediction, the chaperone HSP90 was shown to buffer inherent genetic variation in D. melanogaster (Rennell et al., 1991) and A. thaliana (Rutherford & Lindquist, 1998). This means that a significant number of variants present in a population are nearly neutral under basal conditions, but only with folding assistance. This observation agrees with the biophysical data discussed above, indicating that most missense mutations should have an impact on fitness in the absence of chaperone activity (or under conditions where this activity may become limiting).

Dispensability of proteins is not a major determinant of rates of protein evolution

Dispensability has only a marginal effect on rates of protein evolution in S. cerevisiae (Queitsch et al., 2002) and in rodents (Wall et al., 2005). Pleiotropy or “fitness density”, i.e. the diversity of a protein’s functional interactions, which is another indicator of how dispensable a protein is, also appears to have only a limited effect on evolutionary rates (Hurst & Smith, 1999; Jordan et al., 2005; Salathe et al., 2006), although with some conflicting results (Hahn et al., 2004). These results are at odds with the “near neutrality” scenario, which predicts that, by setting the rate of purifying selection, the dispensability of proteins should determine how fast they evolve.

Particularly surprising is the apparent lack of correlation between the number of functional interactions of a given protein and its rate of evolution. Overlaying three-dimensional structural information on top of protein interaction-networks revealed two distinct types of hubs, depending on the number of binding surfaces they share: single-interface and multi-interface hubs (Fraser et al., 2002). Single-interface hubs are the more transient of the two, due to competition for ligands. Flexibility is also built in the linear motifs mediating these interactions, typically 3 to 10 amino acids in length, of which only 2 or 3 are critical for function. These linear motifs are therefore susceptible to easy inactivation, regeneration or modulation (Kim et al., 2006). The transient nature and plasticity of these protein-protein interactions would allow selective pressures to operate on integrated functional complexes rather than on exact binary interactions. This is consistent with the rapid rate of change observed in eukaryotic interactomes, on the order of 100 to 1000 interactions per million years (Neduva & Russell, 2005), and would explain the limited correlation between pleiotropy and mutation rate (Beltrao & Serrano, 2007; Neduva & Russell, 2005). Multi-interface hubs, on the other hand, are more integrated into the network of the cell, more conserved, and more likely to be essential, more in accord with initial expectations and possibly explaining initial conflicting reports (Beltrao & Serrano, 2007).

Phylogenetic signatures of positive selection

Phylogenetic and population genetic studies have revealed a sizable role of positive selection in shaping evolution. In phylogenetic studies, an increase in the ratio of non-synonymous to synonymous substitutions (dN/dS) is indicative of positive selection, although this test has very low sensitivity and does not take into account possible contributions of negative selection. The most sensitive tests quantify amino acid divergence by normalizing the ratio of non-synonymous to synonymous mutations within a given species to the dN/dS ratio compared to a different but closely-related species. An excess of amino acid divergence (suggestive of positive selection) was found comparing D. melanogaster and D. simulans, comparing different D. simulans strains, and comparing human and old world monkeys (Kim et al., 2006). In D. melanogaster, this increase was limited to only a fraction of the genes, ruling out selective constraint. Using a more sophisticated variation of this test, Smith et al. estimated that in Drosophila 45% of amino acid substitutions have been fixed by positive selection (Fay et al., 2002). A different, highly-sensitive test for positive selection relies on comparing changes in codons involving more than one position between closely related species. The increased presence of certain types of non-synonymous codons in the same lineage is evidence of positive selection. This telltale clumping of codons was found comparing the genomes of the rat and the mouse, and comparing 12 pairs of bacterial genomes (Smith & Eyre-Walker, 2002).
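The logic of normalizing within-species polymorphism to between-species divergence can be sketched with a standard estimator from the McDonald-Kreitman framework, alpha = 1 - (Ds x Pn) / (Dn x Ps), where Dn and Ds are non-synonymous and synonymous fixed differences between species and Pn and Ps are non-synonymous and synonymous polymorphisms within a species. This is an illustration only and not necessarily the exact procedure used in the studies cited above; the function name and the counts are hypothetical.

```python
def mk_alpha(dn, ds, pn, ps):
    """Estimate the fraction of amino acid substitutions fixed by positive selection.

    dn, ds: non-synonymous / synonymous fixed differences between species.
    pn, ps: non-synonymous / synonymous polymorphisms within a species.
    Returns alpha = 1 - (ds * pn) / (dn * ps).
    """
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be non-zero")
    return 1.0 - (ds * pn) / (dn * ps)


if __name__ == "__main__":
    # Hypothetical counts: divergence is enriched for non-synonymous changes
    # relative to polymorphism, so alpha is positive.
    print(mk_alpha(dn=40, ds=80, pn=20, ps=80))  # 0.5
```

An alpha near zero indicates that divergence and polymorphism show the same non-synonymous to synonymous ratio, as expected if substitutions are neutral; a positive alpha indicates an excess of amino acid divergence.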

Compensatory mutations as key players in evolution

In sum, biophysical, phylogenetic and whole-genomic analyses point to a critical role of positive selection in shaping protein evolution. Positively-selected mutations, however, are not necessarily adaptive, i.e. do not necessarily improve protein function. They can maintain function in response to challenges, upholding the status quo. Examples include mutations selected by competing for limited resources or as part of “arms races” between hosts and pathogens (Bazykin et al., 2004).

Compensatory mutations would also fall into the category of “non-adaptive mutations” that are subject to positive selection. Adaptive mutations frequently have deleterious, pleiotropic effects, and these are suppressed by compensatory mutations. E. coli DNA polymerase I (Pol I) serves as an example to illustrate the pleiotropic effects of adaptive mutations and the need for compensatory mutations. Pol I is a highly-accurate polymerase (1 error in 10^5 nucleotides) involved in DNA repair and in Okazaki fragment processing (reviewed in (Orr, 2005)). The fidelity of a panel of Pol I active mutants and their level of activity was determined. Fig. 1 shows that mutations in Pol I frequently increase or decrease polymerase fidelity relative to the wild type, and also that this change in fidelity comes at a cost in overall activity ((Camps & Loeb, 2005), with permission). Thus, substantially changing the level of fidelity of Pol I would be expected to have pleiotropic effects, calling for compensatory mutations to restore the level of activity.

Fig. 1
Fidelity versus overall activity of active mutants of E. coli DNA Polymerase I

A remarkable convergence of directed evolution, population genetics, experimental adaptation, and whole genome comparison studies suggests that compensatory mutations may constitute the bulk of positively-selected mutations. This proposition rests on at least 4 lines of argument:

  1. A high prevalence of intragenic suppressor mutations has been found in multiple genetic screens.
  2. The fact that deleterious mutations are sometimes context-dependent. Mutations that are deleterious in one protein may not be in a different ortholog, presumably due to the presence of additional, compensatory mutations.
  3. The presence of telltale signatures of compensation: “mutation bursts” due to the rapid fixation of multiple mutations during adaptation.
  4. Evidence that thermodynamic stability is limiting for evolution. Compensatory mutations increase the pool of mutants available for selection during adaptation.

a) High prevalence of suppressor mutations

In genetically tractable organisms, mutations that modify the phenotypic effects of a given mutation (known as “suppressor mutations”) have been used to reveal functional interactions both within and between proteins.

A significant fraction of suppressor mutations occurs between interacting residues. The presence of one mutation was found to significantly increase the chances of another mutation at a structurally-interacting site. In the case of ionic interactions, the increase was almost 4-fold (Loh et al., 2007b). This is strong evidence of a role of compensatory mutations in shaping protein evolution.

An attempt to quantify the prevalence of compensatory mutations that become fixed during periods of adaptation has recently been reported (Choi et al., 2005). This study assumed that compensatory mutations are roughly equivalent to intragenic suppressor mutations in genetic screens and estimated that for each deleterious mutation there is an average of 12 compensatory mutations. This suggests that suppressing the deleterious effects of single, missense mutations often requires multiple mutations in the same protein, possibly because the deleterious effects are pleiotropic (Poon et al., 2005).

b) Context-dependent deleterious effects

Mutations that are pathogenic in one species are often fixed in another, suggesting that the deleterious effects of missense mutations have been compensated by other mutations (DePristo et al., 2005; Kulathinal et al., 2004; Weinreich & Chao, 2005; see also the discussion on sign epistasis below). This phenomenon, known as Compensated Pathogenic Deviation, appears to be widespread. For example, Kondrashov et al. compared 32 mammalian proteins with amino acid sites producing pathogenic deviation in humans. Of these pathogenic mutations, 10% are present in at least one non-human mammal. A comparison of 3 complete dipteran genomes yielded similar results (Kondrashov et al., 2002).

c) Signatures of compensation

Compensation needs to occur shortly after the deleterious mutation appears if it is to succeed in retaining the deleterious mutation in the population. Indeed, different lines of evidence show that compensatory mutations become fixed rapidly, creating signature “mutation bursts” in the sequence.

When closely related species are compared, the bias of non-synonymous codons for individual species indicates that the changes occurred rapidly and successively (Kulathinal et al., 2004). Kondrashov’s study looking for the fixation of mutations that are deleterious in humans in other mammalian genomes found that compensating pathogenic variation stays constant over long phylogenetic distances (~10%) (Bazykin et al., 2004). Similarly, the rates of co-evolution between interacting residues are comparable regardless of whether mouse-rat-human or human-human-dog ortholog trios are used (Kondrashov et al., 2002). Thus, three different tests of positive selection independently found that positively-selected mutations occurred within a much shorter time frame than the evolutionary time separating these species.

Speciation involves extensive adaptation. Pagel et al. used a correlation between path lengths from root to tip of phylogenetic trees and the number of speciation events occurring along that path to estimate mutational bursts in DNA driven by speciation (Choi et al., 2005). They found that ~22% of substitutional changes fall into this category. While there is no direct evidence that these bursts correspond to compensatory selection, the timing (during speciation) and fast rate of fixation (bursts) suggest they may.

d) Thermodynamic stability is limiting for evolution

Structural constraints have an impact on the tolerance of proteins to amino acid substitutions and therefore affect their rates of evolution. Several studies have shown that principles of protein folding and solubility constrain protein evolution in similar manners. In general, residues located on the surface of globular proteins are more tolerant of amino acid substitutions (Loh et al., 2007b; Pagel et al., 2006; Suckow et al., 1996). The hydrophobic protein core tends to be less tolerant to mutations due to the need for stable packing of atoms, to the limited availability of stabilizing bonds, and to the increased likelihood that these residues are involved in catalysis. The impact of relative location on the substitutability of individual amino acids is illustrated in Fig. 2 and Table 1. Fig. 2 shows the substitutability profile of the E. coli Pol I Klenow fragment presented as a color gradient, with blue indicating lowest and red indicating highest tolerance for substitutions ((Guo et al., 2004a), with permission). The most highly substitutable residues locate predominantly on the surface of the protein. Secondary structure also has an effect on mutation tolerance. Table 1 presents the average substitutability indices for different structural motifs of Pol I and also of the enzyme 3-methyladenine DNA glycosylase (AAG). Note that in the case of Pol I, surface amino acids are almost twice as tolerant of amino acid substitutions as internal residues. Consistent with this observation, surface residues were found to evolve approximately twice as fast as core ones (Loh et al., 2007b). Secondary structure also affects the rate of evolution. β-strands tend to have low tolerance for amino acid substitutions (substitutability of 0.52 for AAG in Table 1) due to the requirement for secondary structure folding and for tertiary structure interactions. Pol I appears to be an exception (β-strand substitutability of 0.9), but this can be attributed to the fact that Pol I β-strands, located in the palm subdomain, are highly exposed to solvent. The substitutability of mobile regions of the protein, such as loops and turns, depends largely on whether catalytically active residues are present or whether they adopt a specific secondary or tertiary structure. Both studies found significant disparities between evolutionary conservation and mean substitutability (Goldman et al., 1998; Loh et al., 2007a), presumably because artificial selection in culture of proteins expressed from multicopy plasmids likely does not accurately recapitulate the stringencies of natural selection (Guo et al., 2004a; Guo et al., 2004b).

Fig. 2
Substitutability of the polymerase domain of E. coli DNA Polymerase I
Table 1
Substitutability indices of Pol I and AAG

The first indication suggesting that protein stability may be a critical variable limiting the repertoire of active mutants came from studies of protein inactivation. In these studies, misfolding was found to be the main cause of protein inactivation rather than loss of catalytic activity (Loh et al., 2007b; Pakula et al., 1986). The role of thermodynamic stability in limiting mutation tolerance has been confirmed in later studies using thermostable proteins and modeling in silico. For example, the AroQ chorismate mutase from the thermophile Methanococcus jannaschii tolerates ~10-fold more mutations relative to its E. coli counterpart (Loeb et al., 1989). Parisi et al. modeled protein evolution under stability constraints (Besenmatter et al., 2007). At each step, this sequential in silico model introduced mutations and selected against structural perturbation. Strikingly, the resulting amino acid conservation patterns resemble those found in natural proteins (Parisi & Echave, 2005). The implication of these studies is that thermodynamic stability requirements may severely restrict the evolvability of a given protein. Direct, experimental proof of this concept has been provided by Bloom et al. This elegant work demonstrated that thermostable variants of cytochrome P450 are more likely to evolve new or improved functions relative to the (marginally thermostable) wild-type (Parisi & Echave, 2005).
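The idea that extra stability widens the pool of tolerated mutants can be illustrated with a toy threshold model. This is not the model of Parisi et al. or the experiment of Bloom et al.; it is a deliberately simplified sketch in which each random mutation shifts stability by a value drawn from an arbitrary, assumed uniform distribution (-1 to 3 kcal/mol, mostly destabilizing), and a variant counts as folded only while the summed destabilization stays within its stability margin. All names and parameters are hypothetical.

```python
import random


def tolerated_fraction(stability_margin, n_mutations, n_trials=2000, seed=0):
    """Fraction of random multi-mutants that remain folded in a toy threshold model.

    stability_margin: destabilization (kcal/mol) the protein can absorb.
    n_mutations: number of random substitutions per variant.
    Each substitution contributes a ddG drawn uniformly from [-1, 3] kcal/mol
    (an arbitrary assumption chosen only for illustration).
    """
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    folded = 0
    for _ in range(n_trials):
        total_ddg = sum(rng.uniform(-1.0, 3.0) for _ in range(n_mutations))
        if total_ddg <= stability_margin:
            folded += 1
    return folded / n_trials


if __name__ == "__main__":
    # Compare a marginally stable protein with a hypothetical thermostable variant.
    for margin in (2.0, 6.0):
        print(margin, tolerated_fraction(margin, n_mutations=3))
```

In this sketch the variant with the larger margin always retains at least as many folded multi-mutants as the marginally stable one, which is the qualitative behavior described above for thermostable proteins.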

Overall, these four different lines of evidence suggest that positive selection as a force driving evolution had previously been underestimated. Further, while mutations that generate new traits during adaptation (adaptive mutations) are the ones driving positive selection, adaptation involves a large number of concomitant compensatory mutations. These compensatory mutations appear to have been positively selected to suppress pleiotropic deleterious effects of adaptive mutations. The deleterious effects of adaptive mutations likely stem from the narrow window of thermodynamic stability of proteins, and the non-specific, pleiotropic nature of the effects frequently calls for multiple suppressor mutations, often within the same gene product.

Sign epistasis constrains evolutionary pathways

The effects of individual mutations may depend on the genetic context in which they occur. This phenomenon, known as epistasis, is illustrated in Table 2. This table presents several mutants of E. coli β-lactamase and the levels of resistance they confer to aztreonam, a monobactam antibiotic that is not the preferred substrate for β-lactamase. These mutants have different combinations of the following three mutations: E104K, R164H, and G267R. Mutations E104K and R164H by themselves increase resistance to aztreonam by ~2.5 fold. In combination, though, their effect on resistance is 40-fold. The third mutation, G267R has no effect on its own or in the presence of E104K or R164H, but it doubles the level of aztreonam resistance in the presence of both E104K and R164H (Bloom et al., 2006). In this case all three mutations have epistatic effects because their effect on aztreonam resistance varies depending on the presence of the other two mutations.
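The size of the epistatic effect in this example can be made explicit with a short calculation. The sketch below assumes a multiplicative null model (independent mutations multiply their fold-effects), which is a common convention but an assumption here, and it uses the approximate fold-changes quoted in the paragraph above; the function name is hypothetical.

```python
def epistasis_ratio(fold_a, fold_b, fold_ab):
    """Observed double-mutant fold-change divided by the multiplicative expectation.

    A ratio of 1 means no epistasis; above 1 means positive (synergistic) epistasis.
    """
    expected = fold_a * fold_b
    return fold_ab / expected


if __name__ == "__main__":
    # Approximate values from the text: E104K ~2.5-fold, R164H ~2.5-fold, both 40-fold.
    print(2.5 * 2.5)                         # expected without epistasis: 6.25-fold
    print(epistasis_ratio(2.5, 2.5, 40.0))   # observed / expected: 6.4
```

Under this null model the double mutant would be expected to raise resistance about 6-fold, so the observed 40-fold increase is roughly 6 times larger than independence predicts.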

Table 2
Epistatic effects of β-lactamase mutations on aztreonam resistance

There are two types of epistasis, magnitude epistasis and sign epistasis. In magnitude epistasis, the magnitude of the effect of individual mutations on fitness varies depending on the genetic background, but it always goes in the same direction. In the example presented above, E104K and R164H would fall into this category, as they always have a positive effect on aztreonam resistance. In sign epistasis, not only the magnitude but also the sign of the effect (i.e. positive, negative or neutral) changes depending on the genetic context (Camps et al., 2003). G267R in the example above illustrates this form of epistasis, as it has a neutral, slightly negative, or positive effect depending on the presence of E104K and R164H. Sign epistasis limits the number of mutational trajectories available to selection because some paths to an optimum contain fitness decreases (Weinreich & Chao, 2005). This was shown experimentally in the model enzyme β-lactamase. This study investigated all possible mutational pathways leading to five point mutations controlling resistance to cefotaxime. Strikingly, of the 120 possible direct mutational trajectories linking these alleles, only 18 were found to be accessible to selection (Weinreich & Chao, 2005). These constraints on the mutational trajectories matched the structure of sign epistasis of the five mutations.
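How sign epistasis prunes mutational trajectories can be sketched by enumerating every order in which a set of mutations can be acquired and keeping only the orders in which fitness increases at each step. The landscapes below are hypothetical and are not the measured β-lactamase data; mutation "C" is a made-up compensatory mutation that is slightly deleterious alone and beneficial only in the presence of "A". With five mutations there are 5! = 120 direct trajectories, which matches the number of paths mentioned above.

```python
from itertools import permutations


def accessible_paths(mutations, fitness):
    """Count acquisition orders in which every single step strictly increases fitness.

    mutations: iterable of mutation labels.
    fitness: function mapping a frozenset of labels to a fitness value.
    """
    count = 0
    for order in permutations(mutations):
        genotype = frozenset()
        current = fitness(genotype)
        accessible = True
        for mutation in order:
            genotype = genotype | {mutation}
            nxt = fitness(genotype)
            if nxt <= current:  # a neutral or deleterious step blocks selection
                accessible = False
                break
            current = nxt
        if accessible:
            count += 1
    return count


def toy_fitness(genotype):
    """Hypothetical landscape with sign epistasis between A and C."""
    value = 1.0
    if "A" in genotype:
        value += 1.0
    if "B" in genotype:
        value += 0.5
    if "C" in genotype:
        value += 0.5 if "A" in genotype else -0.2
    return value


def additive_fitness(genotype):
    """Hypothetical landscape without epistasis: every mutation adds 1."""
    return 1.0 + len(genotype)


if __name__ == "__main__":
    print(accessible_paths("ABC", toy_fitness))          # 3 of 6 orders
    print(accessible_paths("ABCDE", additive_fitness))   # all 120 orders
```

In the toy landscape only the orders in which A precedes C remain open to selection, which is the sense in which a compensatory mutation constrains the path to an optimum.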

In this study, the mechanistic basis of sign epistasis was traced back to one compensatory mutation, M182T. Alone, this mutation modestly reduced cefotaxime hydrolysis. M182T, however, suppressed the reduced thermodynamic stability associated with G238S, the mutation increasing cefotaxime hydrolysis. Thus, M182T has a dramatically different effect on cefotaxime resistance depending on the presence or absence of G238S. As illustrated by the M182T mutation, compensatory mutations are expected to exhibit frequent sign epistasis because they are selected to suppress effects of other mutations, which makes them context-dependent.

Sign epistasis associated with compensatory mutations arising during periods of adaptation has the following two implications: a) It “locks in” deleterious mutations, precluding reversion to wild-type sequence; b) It constrains possible trajectories to an optimum, dramatically limiting genetic diversity resulting from adaptation.

a) Reduced reversion to wild-type sequence

The presence of compensatory mutations rapidly creates selective valleys that preclude reversion to the ancestral, wild-type sequence. In phage, fixation of compensatory mutations was estimated to be twice as likely as reversion to wild-type sequence (Weinreich et al., 2006). In experimental microbial cultures, mutants selected under drug pressure often exhibit reduced fitness. Examples include HIV resistance to protease inhibitors (Poon & Chao, 2005), streptomycin resistance in E. coli or Salmonella (Borman et al., 1996; Schrag et al., 1997), isoniazid or rifampicin resistance in mycobacteria (Maisnier-Patin et al., 2002), fucidin resistance in Staphylococcus aureus (Gillespie, 2001), and resistance to fluconazole in Saccharomyces cerevisiae (Nagaev et al., 2001). In all these cases, growth in the absence of selective pressure resulted in a partial compensation of the fitness defect but not in reversion, indicating the presence of additional (presumably compensatory) mutations that created an adaptive valley before the original mutation had a chance to revert.

b) Constrained evolutionary trajectories

As discussed above, sign epistasis associated with compensatory mutations severely limits the number of evolutionary pathways available for selection. This has two consequences: it restricts the diversity of mutants coming out of positive selections and it increases the reproducibility of adaptation, at least under identical conditions and with a large population.

Selections for drug resistance typically yield a very limited number of mutants. For example, only 8 extended-spectrum mutants of β-lactamase (out of more than 90 known mutants from clinical isolates) were obtained in a selection in vitro under conditions resembling natural selection, and many of them shared mutations (Anderson et al., 2003). Another example of limited allelic representation following positive selection is an experiment replacing the thermostable adenylate kinase of Geobacillus stearothermophilus (a thermophilic organism) with adenylate kinase from Bacillus subtilis (a mesophile) to monitor adaptation to growth at high temperature at the level of a single gene. Only 6 alleles exhibiting increased thermal stability were observed, representing less than 1% of the total possible (Barlow & Hall, 2003). In both cases, the observed limited allelic representation likely reflects restrictions in the pathways available for selection, as each allele involves more than one mutation and a much larger number of mutants producing the desired effects is known. Multiple selective pressures, simultaneous or sequential, further restrict the outcome of positive selections. In the case of β-lactamase, this scenario would arise with alternating exposure to different β-lactam antibiotics in the clinic. Selections using a single antibiotic typically result in the isolation of resistant mutants that are different from those isolated in the clinic (Counago et al., 2006; Orencia et al., 2001). Exposing a culture of E. coli to amoxicillin and ceftazidime, however, resulted in the isolation of only naturally occurring β-lactamase mutants (Blazquez et al., 2000), suggesting that “fluctuating selection” likely contributed to restricting the allelic repertoire observed in clinical isolates. Fluctuating selection should favor “generalist” mutations, i.e. mutations that increase resistance to multiple antibiotics. More intriguingly, it would also select mutations with strong positive epistatic effects for resistance to one of the antibiotics, even if their effect against the other antibiotics to which bacteria are regularly exposed is neutral or detrimental (Blazquez et al., 2000). This could be an example of a protein whose evolvability is shaped by selective pressures.

Parallel evolution, i.e. the occurrence of the same substitution in two independent lineages, has been observed in natural and experimental populations of insects, bacteria and phage. It is commonly seen in selections for drug resistance under conditions that mimic natural evolution (Anderson et al., 2003; Blazquez et al., 1998). Similarly, during adaptation of adenylate kinase to high temperature, the same three major mutants were observed in two independent experimental runs (Barlow & Hall, 2003). However, the most striking examples of parallel evolution come from adaptation studies in two closely related phages of E. coli, ΦX174 and S13. Adaptation of ΦX174 to higher temperature and to a new host (Salmonella) resulted in 50% identical changes between independent lineages (Counago et al., 2006), and long-term adaptation to culture in the laboratory resulted in 40% identical independent changes (Wichman et al., 1999). In contrast, no such reproducibility has been observed in more complex situations such as the adaptation of E. coli to growth in liquid culture (Wichman et al., 2005) or to growth in glycerol (Woods et al., 2006), or in the development of human cancer (Herring et al., 2006). This lack of reproducibility is likely due to the plasticity built into networks of functional interactions.


A recent convergence of directed evolution, population genetics, genomic analysis and experimental adaptation studies has challenged traditional notions about protein evolution. Positive selection has been recognized as an important driver of evolution. A large portion of positively selected mutations appears to be compensatory, suppressing deleterious pleiotropic effects of adaptive mutations.

The realization that a significant fraction of adaptive mutations have detrimental, pleiotropic effects underscores the role of deleterious mutations in the generation of novel traits during evolution. While neutral or nearly-neutral mutations should have a long lifespan within a given population, they should rarely generate novel properties because of smaller phenotypic effects. In other words, neutral mutations are neutral because they have little effect on function (Thomas et al., 2007). On the other hand, pleiotropic mutations, while they may be short-lived in the population because of purifying selection, are more likely to modify protein function. The relative contributions to evolution of nearly neutral mutations versus pleiotropic ones may thus depend on a trade-off between the lifespan of missense mutations in the population and their functional impact. Compensatory mutations would play a key role in tipping the balance toward pleiotropic mutations by allowing them to persist longer in the population.
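The proposed effect of compensation on the lifespan of a deleterious mutation can be made concrete with a deliberately simple calculation. The sketch below assumes a deterministic haploid model in which a variant with selective disadvantage s changes in frequency as p' = p(1 - s)/(1 - sp). Drift, recurrent mutation and diploidy are ignored, and the function name and all parameter values are hypothetical choices for illustration rather than estimates from the literature. It counts the generations needed for purifying selection to push the variant below an arbitrary rarity threshold, with and without a compensatory mutation that reduces its cost.

```python
def generations_until_rare(s, p0=0.01, threshold=1e-4, max_generations=100_000):
    """Generations until a deleterious variant with cost s falls below
    `threshold` under deterministic haploid selection (no drift)."""
    if not 0 < s < 1:
        raise ValueError("s must lie strictly between 0 and 1")
    p = p0
    generations = 0
    while p >= threshold and generations < max_generations:
        # Standard haploid selection recursion: variant fitness is 1 - s,
        # wild-type fitness is 1, mean fitness is 1 - s * p.
        p = p * (1 - s) / (1 - s * p)
        generations += 1
    return generations


# Hypothetical costs: 10% for the uncompensated pleiotropic mutation,
# 1% once a compensatory mutation has suppressed most of the defect.
uncompensated = generations_until_rare(0.10)
compensated = generations_until_rare(0.01)

print("uncompensated lifespan (generations):", uncompensated)
print("compensated lifespan (generations):", compensated)
```

Under these assumptions the variant's lifespan scales roughly with the inverse of its cost, so a tenfold reduction in cost extends the window during which the variant is available for positive selection by about an order of magnitude.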

This emerging picture of protein evolution is summarized in Fig. 3 and Table 3. Mutations that generate novel traits (adaptive mutations) often have pleiotropic effects. Deleterious effects of adaptive mutations are subsequently suppressed through the fixation of compensatory mutations. Typically, suppression occurs within the time frame of adaptation and often involves multiple compensatory mutations. These compensatory mutations facilitate adaptation to novel environments by precluding reversion to wild-type once selective pressures are removed and by prolonging the lifespan of adaptive mutations in the population. The telltale signs of neutral, positive and negative selection, and the implications of each type of selection history for protein evolution, are summarized in Table 3.

Fig. 3. Effects of mutation fitness on protein evolution
Table 3. Selection history: signatures and implications for protein evolution

These new concepts in protein evolution have far-reaching implications:

  1. The contribution of deleterious mutations to protein evolution has likely been underestimated. Indeed, most amino acid substitution prediction methods estimate that between 20 and 25% of non-synonymous single-nucleotide polymorphisms (SNPs) in the human genome significantly reduce protein function (Lenski et al., 2006; Yue & Moult, 2006). Consistent with the predicted deleterious nature of these SNPs, they tend to be rare, which suggests that they are undergoing purifying selection (reviewed in Ng & Henikoff, 2006). This scenario agrees with the view that most spontaneous mutations are deleterious and that their removal by negative selection is delayed by the rapid fixation of compensatory mutations. The proposition that pleiotropic mutations facilitate adaptation provides an evolutionary explanation for their abundance in the human genome that is more general and therefore more plausible than overdominance, i.e. a fitness advantage conferred by the deleterious variants in the heterozygous state.
  2. The predicted abundance of deleterious SNPs, if confirmed, has profound implications for understanding genetic disease. It suggests that the contribution of deleterious mutations to “complex” (i.e. non-Mendelian) disease is more significant than anticipated and that the genetic component in these conditions may have been underestimated because they are caused by different alleles in different individuals.
  3. The newly-recognized relevance of thermodynamic stability for function has obvious implications for protein modeling and engineering. It has dramatically improved our ability to identify critical functional residues in a protein sequence and to predict the functional impact of mutations (Ng & Henikoff, 2006; Stone & Sidow, 2005). Enzymatic stability considerations should become key for protein engineering. Consensus design based on phylogenetic comparison has already been used to improve enzyme stability (Forrer et al., 2004; Ng & Henikoff, 2006) and thermostable enzymes will likely be chosen as templates in preference to their mesophile counterparts in future directed evolution experiments because of their increased evolvability (Besenmatter et al., 2007; Watanabe et al., 2006).
  4. Sign epistasis conferred by compensatory mutations confounds the information that can be gathered from phylogenetic comparisons. This explains the difficulty in obtaining meaningful insights into the structural determinants of thermal stability through sequence comparisons between enzymes naturally adapted to low, middle and high temperature (Bloom et al., 2006) and likely limits the specificity of methods to detect deleterious SNPs that are based solely on phylogenetic comparisons (Wintrode & Arnold, 2000).
  5. Sign epistasis, by reducing the chances of going back to an ancestral sequence even if the original selective pressure driving adaptation is removed, gives directionality to evolution. By constraining the trajectories available for evolution, sign epistasis also restricts genetic variability. This suggests that in natural or experimental populations large parts of the sequence space are left unexplored, which has obvious implications for drug resistance. In order to make predictions about the likelihood of drug resistance based on evolving mutants in vitro, we need to understand the molecular pathways available in natural evolution (Ng & Henikoff, 2006). Sign epistasis also has implications for the design of directed evolution experiments. Methods that minimize the likelihood of fitness valleys, such as using high mutation loads (Hall, 2004) or shuffling orthologous sequences (Orencia et al., 2001), may lead to solutions not found in nature.
  6. Extreme evolutionary constraints associated with sign epistasis have been proposed to focus genetic variability in a few residues, at least in small genomes such as phage or influenza (Wang et al., 2006). Given the limited number of residues that would de facto show variation in these organisms, these residues would control fitness at multiple levels, including host range, and susceptibility to coinfection and to the immune response. How general this phenomenon is remains to be established but it would have profound implications for understanding the genetic determinants of virulence and of immunogenicity in pathogens.
  7. Cancer progression has been recognized as a process akin to speciation and subject to the rules of ecology and population genetics (Wichman et al., 2000). Large-scale sequencing of tumor samples is becoming a reality, and genes involved in adaptation to malignancy (CAN genes) have been identified (Merlo et al., 2006). Current efforts are directed at distinguishing “driver” (i.e. adaptive) from “passenger” (i.e. hitchhiking, neutral) mutations amongst the mutations identified in tumors (Sjoblom et al., 2006). We predict that a significant fraction of mutations identified as “driver” based on evidence of positive selection will turn out to be compensatory. Consistent with this prediction, an analysis of more than 1000 somatic mutations in coding exons of 518 protein kinases found many driver mutations outside the kinase domains (Greenman et al., 2007). We also predict that CAN genes with a gain-of-function will carry multiple mutations: one adaptive mutation and other compensatory mutations added subsequently.

Rapid progress is being made in the area of protein evolution. A clearer picture of protein evolution that integrates observations from a variety of biological disciplines should emerge in the not too distant future. It would be extremely useful to find ways to distinguish between adaptive and compensatory mutations. Such methods may be based on biophysical predictions, on the presence of some sort of sequence signature, or on a combination of biophysical and phylogenetic methods. We also look forward to the extension of what we have learned at the level of single proteins and interactomes to higher levels of cellular organization.


The authors would like to thank Drs. David Haussler (UC Santa Cruz, CA), Ann Blank (University of Washington, WA), and Eddie Fox (St. Vincent’s University Hospital, Dublin, Ireland) for critical reading of the manuscript and for useful comments, and Cole Bower for help with the manuscript. Support was obtained from the National Institutes of Health grants K08 CA116429 (MC), CA102029 (LAL), CA788885 (LAL), and from a fellowship from the Cora May Poncin Foundation (EL).


  • Akanuma S, Yamagishi A, Tanaka N, Oshima T. Serial increase in the thermal stability of 3-isopropylmalate dehydrogenase from Bacillus subtilis by experimental evolution. Protein Sci. 1998;7:698–705. [PubMed]
  • Anderson JB, Sirjusingh C, Parsons AB, Boone C, Wickens C, Cowen LE, Kohn LM. Mode of selection and experimental evolution of antifungal drug resistance in Saccharomyces cerevisiae. Genetics. 2003;163:1287–1298. [PubMed]
  • Arjan JA, Visser M, Zeyl CW, Gerrish PJ, Blanchard JL, Lenski RE. Diminishing returns from mutation supply rate in asexual populations. Science. 1999;283:404–406. [PubMed]
  • Axe DD, Foster NW, Fersht AR. A search for single substitutions that eliminate enzymatic function in a bacterial ribonuclease. Biochemistry. 1998;37:7157–7166. [PubMed]
  • Barlow M, Hall BG. Experimental prediction of the natural evolution of antibiotic resistance. Genetics. 2003;163:1237–1241. [PubMed]
  • Baroni TE, Wang T, Qian H, Dearth LR, Truong LN, Zeng J, Denes AE, Chen SW, Brachmann RK. A global suppressor motif for p53 cancer mutants. Proc Natl Acad Sci U S A. 2004;101:4930–4935. [PubMed]
  • Bazykin GA, Kondrashov FA, Ogurtsov AY, Sunyaev S, Kondrashov AS. Positive selection at sites of multiple amino acid replacements since rat-mouse divergence. Nature. 2004;429:558–562. [PubMed]
  • Beltrao P, Serrano L. Specificity and Evolvability in Eukaryotic Protein Interaction Networks. PLoS Comput Biol. 2007;3:e25. [PubMed]
  • Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS. Robustness-epistasis link shapes the fitness landscape of a randomly drifting protein. Nature. 2006;444:929–932. [PubMed]
  • Besenmatter W, Kast P, Hilvert D. Relative tolerance of mesostable and thermostable protein homologs to extensive mutation. Proteins. 2007;66:500–506. [PubMed]
  • Bjedov I, Tenaillon O, Gerard B, Souza V, Denamur E, Radman M, Taddei F, Matic I. Stress-induced mutagenesis in bacteria. Science. 2003;300:1404–1409. [PubMed]
  • Blazquez J, Negri MC, Morosini MI, Gomez-Gomez JM, Baquero F. A237T as a modulating mutation in naturally occurring extended-spectrum TEM-type beta-lactamases. Antimicrob Agents Chemother. 1998;42:1042–1044. [PMC free article] [PubMed]
  • Blazquez J, Morosini MI, Negri MC, Baquero F. Selection of naturally occurring extended-spectrum TEM beta-lactamase variants by fluctuating beta-lactam pressure. Antimicrob Agents Chemother. 2000;44:2182–2184. [PMC free article] [PubMed]
  • Bloom JD, Labthavikul ST, Otey CR, Arnold FH. Protein stability promotes evolvability. Proc Natl Acad Sci U S A. 2006;103:5869–5874. [PubMed]
  • Borman AM, Paulous S, Clavel F. Resistance of human immunodeficiency virus type 1 to protease inhibitors: selection of resistance mutations in the presence and absence of the drug. J Gen Virol. 1996;77 (Pt 3):419–426. [PubMed]
  • Camps M, Naukkarinen J, Johnson BP, Loeb LA. Targeted gene evolution in Escherichia coli using a highly error-prone DNA polymerase I. Proc Natl Acad Sci U S A. 2003;100:9727–9732. [PubMed]
  • Camps M, Loeb LA. Critical role of R-loops in processing replication blocks. Front Biosci. 2005;10:689–698. [PubMed]
  • Choi SS, Li W, Lahn BT. Robust signals of coevolution of interacting residues in mammalian proteomes identified by phylogeny-aided structural analysis. Nat Genet. 2005;37:1367–1371. [PubMed]
  • Counago R, Chen S, Shamoo Y. In vivo molecular evolution reveals biophysical origins of organismal fitness. Mol Cell. 2006;22:441–449. [PubMed]
  • DePristo MA, Weinreich DM, Hartl DL. Missense meanderings in sequence space: a biophysical view of protein evolution. Nat Rev Genet. 2005;6:678–687. [PubMed]
  • Ellis RJ. Protein misassembly: macromolecular crowding and molecular chaperones. Adv Exp Med Biol. 2007;594:1–13. [PubMed]
  • Fay JC, Wyckoff GJ, Wu CI. Testing the neutral theory of molecular evolution with genomic data from Drosophila. Nature. 2002;415:1024–1026. [PubMed]
  • Forrer P, Binz HK, Stumpp MT, Pluckthun A. Consensus design of repeat proteins. Chembiochem. 2004;5:183–189. [PubMed]
  • Fraser HB, Hirsh AE, Steinmetz LM, Scharfe C, Feldman MW. Evolutionary rate in the protein interaction network. Science. 2002;296:750–752. [PubMed]
  • Gillespie SH. Antibiotic resistance in the absence of selective pressure. Int J Antimicrob Agents. 2001;17:171–176. [PubMed]
  • Goldman N, Thorne JL, Jones DT. Assessing the impact of secondary structure and solvent accessibility on protein evolution. Genetics. 1998;149:445–458. [PubMed]
  • Greenman C, Stephens P, Smith R, et al. Patterns of somatic mutation in human cancer genomes. Nature. 2007;446:153–158. [PMC free article] [PubMed]
  • Guo HH, Choe J, Loeb LA. Protein tolerance to random amino acid change. Proc Natl Acad Sci U S A. 2004a;101:9205–9210. [PubMed]
  • Guo HH, Choe J, Loeb LA. Protein tolerance to random amino acid change. Proc Natl Acad Sci U S A. 2004b;101:9205–9210. [PubMed]
  • Hahn MW, Conant GC, Wagner A. Molecular evolution in large genetic networks: does connectivity equal constraint? J Mol Evol. 2004;58:203–211. [PubMed]
  • Hall BG. Predicting the evolution of antibiotic resistance genes. Nat Rev Microbiol. 2004;2:430–435. [PubMed]
  • Herring CD, Raghunathan A, Honisch C, et al. Comparative genome sequencing of Escherichia coli allows observation of bacterial evolution on a laboratory timescale. Nat Genet. 2006;38:1406–1412. [PubMed]
  • Huang W, Palzkill T. A natural polymorphism in beta-lactamase is a global suppressor. Proc Natl Acad Sci U S A. 1997;94:8801–8806. [PubMed]
  • Hurst LD, Smith NG. Do essential genes evolve slowly? Curr Biol. 1999;9:747–750. [PubMed]
  • James LC, Tawfik DS. Conformational diversity and protein evolution-a 60-year-old hypothesis revisited. Trends Biochem Sci. 2003;28:361–368. [PubMed]
  • Jordan IK, Kondrashov FA, Adzhubei IA, Wolf YI, Koonin EV, Kondrashov AS, Sunyaev S. A universal trend of amino acid gain and loss in protein evolution. Nature. 2005;433:633–638. [PubMed]
  • Khersonsky O, Roodveldt C, Tawfik DS. Enzyme promiscuity: evolutionary and mechanistic aspects. Curr Opin Chem Biol. 2006;10:498–508. [PubMed]
  • Kim PM, Lu LJ, Xia Y, Gerstein MB. Relating three-dimensional structures to protein networks provides evolutionary insights. Science. 2006;314:1938–1941. [PubMed]
  • Kondrashov AS, Sunyaev S, Kondrashov FA. Dobzhansky-Muller incompatibilities in protein evolution. Proc Natl Acad Sci U S A. 2002;99:14878–14883. [PubMed]
  • Kulathinal RJ, Bettencourt BR, Hartl DL. Compensated deleterious mutations in insect genomes. Science. 2004;306:1553–1554. [PubMed]
  • Lenski RE, Barrick JE, Ofria C. Balancing robustness and evolvability. PLoS Biol. 2006;4:e428. [PMC free article] [PubMed]
  • Levin BR, Perrot V, Walker N. Compensatory mutations, antibiotic resistance and the population genetics of adaptive evolution in bacteria. Genetics. 2000;154:985–997. [PubMed]
  • Loeb DD, Swanstrom R, Everitt L, Manchester M, Stamper SE, Hutchison CA 3rd. Complete mutagenesis of the HIV-1 protease. Nature. 1989;340:397–400. [PubMed]
  • Loh E, Choe J, Loeb LA. Highly tolerated amino acid substitutions increase the fidelity of Escherichia coli DNA polymerase I. J Biol Chem. 2007a;282:12201–12209. [PubMed]
  • Loh E, Choe J, Loeb LA. Highly tolerated amino acid substitutions increase the fidelity of E. coli DNA polymerase I. J Biol Chem 2007b [PubMed]
  • Maisnier-Patin S, Berg OG, Liljas L, Andersson DI. Compensatory adaptation to the deleterious effect of antibiotic resistance in Salmonella typhimurium. Mol Microbiol. 2002;46:355–366. [PubMed]
  • Mao EF, Lane L, Lee J, Miller JH. Proliferation of mutators in a cell population. J Bacteriol. 1997;179:417–422. [PMC free article] [PubMed]
  • Markiewicz P, Kleina LG, Cruz C, Ehret S, Miller JH. Genetic studies of the lac repressor. XIV. Analysis of 4000 altered Escherichia coli lac repressors reveals essential and non-essential residues, as well as “spacers” which do not require a specific sequence. J Mol Biol. 1994;240:421–433. [PubMed]
  • Merlo LM, Pepper JW, Reid BJ, Maley CC. Cancer as an evolutionary and ecological process. Nat Rev Cancer. 2006;6:924–935. [PubMed]
  • Nagaev I, Bjorkman J, Andersson DI, Hughes D. Biological cost and compensatory evolution in fusidic acid-resistant Staphylococcus aureus. Mol Microbiol. 2001;40:433–439. [PubMed]
  • Neduva V, Russell RB. Linear motifs: evolutionary interaction switches. FEBS Lett. 2005;579:3342–3345. [PubMed]
  • Ng PC, Henikoff S. Predicting the effects of amino acid substitutions on protein function. Annu Rev Genomics Hum Genet. 2006;7:61–80. [PubMed]
  • O’Loughlin TL, Patrick WM, Matsumura I. Natural history as a predictor of protein evolvability. Protein Eng Des Sel. 2006;19:439–442. [PubMed]
  • Orencia MC, Yoon JS, Ness JE, Stemmer WP, Stevens RC. Predicting the emergence of antibiotic resistance by directed evolution and structural analysis. Nat Struct Biol. 2001;8:238–242. [PubMed]
  • Orr HA. The genetic theory of adaptation: a brief history. Nat Rev Genet. 2005;6:119–127. [PubMed]
  • Pagel M, Venditti C, Meade A. Large punctuational contribution of speciation to evolutionary divergence at the molecular level. Science. 2006;314:119–121. [PubMed]
  • Pakula AA, Young VB, Sauer RT. Bacteriophage lambda cro mutations: effects on activity and intracellular degradation. Proc Natl Acad Sci U S A. 1986;83:8829–8833. [PubMed]
  • Parisi G, Echave J. Generality of the structurally constrained protein evolution model: assessment on representatives of the four main fold classes. Gene. 2005;345:45–53. [PubMed]
  • Poon A, Chao L. The rate of compensatory mutation in the DNA bacteriophage phiX174. Genetics. 2005;170:989–999. [PubMed]
  • Poon A, Davis BH, Chao L. The coupon collector and the suppressor mutation: estimating the number of compensatory mutations by maximum likelihood. Genetics. 2005;170:1323–1332. [PubMed]
  • Queitsch C, Sangster TA, Lindquist S. Hsp90 as a capacitor of phenotypic variation. Nature. 2002;417:618–624. [PubMed]
  • Rennell D, Bouvier SE, Hardy LW, Poteete AR. Systematic mutation of bacteriophage T4 lysozyme. J Mol Biol. 1991;222:67–88. [PubMed]
  • Rutherford SL, Lindquist S. Hsp90 as a capacitor for morphological evolution. Nature. 1998;396:336–342. [PubMed]
  • Salathe M, Ackermann M, Bonhoeffer S. The effect of multifunctionality on the rate of evolution in yeast. Mol Biol Evol. 2006;23:721–722. [PubMed]
  • Schrag SJ, Perrot V, Levin BR. Adaptation to the fitness costs of antibiotic resistance in Escherichia coli. Proc Biol Sci. 1997;264:1287–1291. [PMC free article] [PubMed]
  • Sjoblom T, Jones S, Wood LD, et al. The consensus coding sequences of human breast and colorectal cancers. Science. 2006;314:268–274. [PubMed]
  • Smith NG, Eyre-Walker A. Adaptive protein evolution in Drosophila. Nature. 2002;415:1022–1024. [PubMed]
  • Sniegowski PD, Murphy HA. Evolvability. Curr Biol. 2006;16:R831–834. [PubMed]
  • Stone EA, Sidow A. Physicochemical constraint violation by missense substitutions mediates impairment of protein function and disease severity. Genome Res. 2005;15:978–986. [PubMed]
  • Suckow J, Markiewicz P, Kleina LG, Miller J, Kisters-Woike B, Muller-Hill B. Genetic studies of the Lac repressor. XV: 4000 single amino acid substitutions and analysis of the resulting phenotypes on the basis of the protein structure. J Mol Biol. 1996;261:509–523. [PubMed]
  • Thomas RK, Baker AC, Debiasi RM, et al. High-throughput oncogene mutation profiling in human cancer. Nat Genet. 2007;39:347–351. [PubMed]
  • Wall DP, Hirsh AE, Fraser HB, Kumm J, Giaever G, Eisen MB, Feldman MW. Functional genomic analysis of the rates of protein evolution. Proc Natl Acad Sci U S A. 2005;102:5483–5488. [PubMed]
  • Wang TW, Zhu H, Ma XY, Zhang T, Ma YS, Wei DZ. Mutant library construction in directed molecular evolution: casting a wider net. Mol Biotechnol. 2006;34:55–68. [PubMed]
  • Wang X, Minasov G, Shoichet BK. Evolution of an antibiotic resistance enzyme constrained by stability and activity trade-offs. J Mol Biol. 2002;320:85–95. [PubMed]
  • Watanabe K, Ohkuri T, Yokobori S, Yamagishi A. Designing thermostable proteins: ancestral mutants of 3-isopropylmalate dehydrogenase designed by using a phylogenetic tree. J Mol Biol. 2006;355:664–674. [PubMed]
  • Weinreich DM, Chao L. Rapid evolutionary escape by large populations from local fitness peaks is likely in nature. Evolution Int J Org Evolution. 2005;59:1175–1182. [PubMed]
  • Weinreich DM, Delaney NF, Depristo MA, Hartl DL. Darwinian evolution can follow only very few mutational paths to fitter proteins. Science. 2006;312:111–114. [PubMed]
  • Wichman HA, Badgett MR, Scott LA, Boulianne CM, Bull JJ. Different trajectories of parallel evolution during viral adaptation. Science. 1999;285:422–424. [PubMed]
  • Wichman HA, Scott LA, Yarber CD, Bull JJ. Experimental evolution recapitulates natural evolution. Philos Trans R Soc Lond B Biol Sci. 2000;355:1677–1684. [PMC free article] [PubMed]
  • Wichman HA, Millstein J, Bull JJ. Adaptive molecular evolution for 13,000 phage generations: a possible arms race. Genetics. 2005;170:19–31. [PubMed]
  • Wintrode PL, Arnold FH. Temperature adaptation of enzymes: lessons from laboratory evolution. Adv Protein Chem. 2000;55:161–225. [PubMed]
  • Woods R, Schneider D, Winkworth CL, Riley MA, Lenski RE. Tests of parallel molecular evolution in a long-term experiment with Escherichia coli. Proc Natl Acad Sci U S A. 2006;103:9107–9112. [PubMed]
  • Yue P, Moult J. Identification and analysis of deleterious human SNPs. J Mol Biol. 2006;356:1263–1274. [PubMed]