To our knowledge, this is the first study in which a stability landscape was constructed for an entire domain at single-residue resolution. By combining a stability-based in vitro selection with high-throughput sequencing, we could analyze the relationship between the sequence and stability of the CH3 domain of human IgG1 without the need to express, purify and measure each mutant protein individually.
In previous studies, similar approaches have been used for epitope mapping: a protein library was displayed on ribosomes [7] or phage [8,9] and was selected for ligand binding, followed by sequencing of functional clones. Residues involved in the interaction with the ligand are less tolerant to mutation, thereby allowing the identification of functional epitopes. Alternatively, Chao et al. displayed an EGFR (epidermal growth factor receptor) library on yeast and selected mutants that lost binding to three different monoclonal antibodies [32]. Sequencing of the selected pools (in this case, the selected pools contained the “non-binders”) enabled the identification of epitopes.
In those experiments, the library was constructed either by error‐prone PCR [7] or by shotgun scanning mutagenesis [8,9]. In the latter approach, degenerate oligonucleotides are used that preferentially allow either the wild-type residue or a defined amino acid to be expressed at particular positions. The advantage of shotgun scanning is the possibility of changing all analyzed positions to the same amino acid (e.g., to alanine). In contrast, error‐prone PCR mostly yields amino acid substitutions that can be reached by changing a single nucleotide of a codon; substitutions requiring two or even three nucleotide replacements are extremely rare. Thus, only a certain set of amino acid mutations, which depends on the initial codon, is incorporated. However, this may also be an advantage of error‐prone PCR, as it facilitates analysis of more than just one type of mutation. Moreover, error‐prone PCR allows randomization of a large region in a single step, whereas shotgun scanning mutagenesis is limited to a small number of positions because it requires degenerate oligonucleotides.
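The codon dependence described above can be made concrete with a short sketch. It is an illustration only, not part of the analysis in this study: it uses the standard genetic code and a hypothetical helper, `single_nucleotide_neighbors`, to list the amino acids reachable from a given codon by exactly one base substitution.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_nucleotide_neighbors(codon):
    """Amino acids reachable from `codon` by exactly one base substitution.

    Synonymous changes and stop codons are excluded (hypothetical helper,
    introduced here for illustration only).
    """
    wild_type = CODON_TABLE[codon]
    reachable = set()
    for position in range(3):
        for base in BASES:
            if base == codon[position]:
                continue  # not a substitution
            mutant = codon[:position] + base + codon[position + 1:]
            aa = CODON_TABLE[mutant]
            if aa not in (wild_type, "*"):
                reachable.add(aa)
    return reachable


# Example codons chosen arbitrarily: Trp (TGG), Met (ATG), Leu (CTG).
for codon in ("TGG", "ATG", "CTG"):
    print(codon, CODON_TABLE[codon], sorted(single_nucleotide_neighbors(codon)))
```

For the tryptophan codon TGG, for instance, only five of the 19 possible replacement amino acids (C, G, L, R, S) are reachable by a single nucleotide change, which illustrates why the accessible set of mutations depends on the initial codon.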
In the present study, error‐prone PCR allowed us to randomly mutagenize the entire IgG1-Fc gene (220 amino acid positions) and to analyze the effect of a set of mutations at each position. However, in order to increase the significance and reliability of the data, we analyzed only the change in the total mutation rate. This means that, at a specific position, mutations to different residues were not analyzed separately. In spite of this simplification, the obtained stability landscape of the CH3 domain of IgG1 is of high quality, as demonstrated using various approaches. Firstly, evolutionarily conserved residues were significantly less tolerant to mutation, which is in agreement with other studies [10,33]. Secondly, the median ΔΔG values (calculated from predicted ΔΔG values of all 19 possible mutations at a specific position) correlated with the data from the stability landscape: residues with higher median ΔΔG values were less tolerant to mutation. Thirdly, comparison with published data demonstrated that residues that are important for the stability [16,22–24] as well as for efficient folding [25] of the CH3 domain are highly intolerant to mutation. Finally, the reproducibility of the data was confirmed by the strong correlation of the mutation rate changes obtained from the two separately performed experiments (b).
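As a rough illustration of what a per-position change in the total mutation rate and a replicate comparison could look like, consider the following sketch. It rests on several assumptions that are not taken from this study: the change is expressed here as a log2 ratio of selected to unselected mutation rates, a small arbitrary pseudocount avoids division by zero, the function names are hypothetical, and the four-residue sequences are invented toy data.

```python
from math import log2, sqrt


def mutation_rate(sequences, wild_type):
    """Fraction of sequences carrying any substitution at each position.

    All substituted residues are pooled, i.e. this is a *total* rate.
    """
    n = len(sequences)
    return [
        sum(seq[i] != wt for seq in sequences) / n
        for i, wt in enumerate(wild_type)
    ]


def rate_change(selected, unselected, wild_type, pseudo=1e-6):
    """log2 ratio of selected vs. unselected mutation rate per position.

    The log2 form and the pseudocount are assumptions made for this sketch.
    """
    sel = mutation_rate(selected, wild_type)
    ref = mutation_rate(unselected, wild_type)
    return [log2((s + pseudo) / (r + pseudo)) for s, r in zip(sel, ref)]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Invented toy data: a four-residue "wild type" and tiny sequence pools.
WILD_TYPE = "GQPR"
unselected = ["GQPR", "AQPR", "GAPR", "GQAR"]
selected_1 = ["GQPR", "GQPR", "GAPR", "GAPR"]  # replicate 1
selected_2 = ["GQPR", "GAPR", "GAPR", "GAPR"]  # replicate 2

change_1 = rate_change(selected_1, unselected, WILD_TYPE)
change_2 = rate_change(selected_2, unselected, WILD_TYPE)
print(change_1)
print(change_2)
print(pearson(change_1, change_2))
```

In this toy example, a negative value marks a position where mutations are depleted after selection (low tolerance), and a positive value marks a position where they are enriched. The correlation between the two replicate vectors plays the role of the reproducibility check mentioned above.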
Another reason for only showing the change in the total mutation rate instead of analyzing each type of mutation individually is the graphic visualization of the data. Depicting all types of mutations in a single diagram would have been very confusing. However, for interested readers, we included such a stability landscape of the CH3 domain, where all types of mutations are analyzed separately, in the supplemental material (Supplemental Fig. 3).
In vitro selection and sequencing of functional protein variants have also been used by other groups for investigating the relationship between sequence and stability [34–36]. In these studies, the libraries were constructed by shotgun scanning. However, in contrast to the studies discussed above, the selected positions were completely randomized using NNK or NNS codons (N is a mixture of all four bases; K is a mixture of G and T; S is a mixture of C and G). This strategy allows the analysis of all amino acid substitutions at the targeted positions. However, in order to sample all (or the majority of) possible combinations of mutations and to avoid overlapping effects of too many mutations in one protein variant, these approaches were limited to mutagenesis of 1–6 residues within one library. Moreover, because only a small number of sequences could be analyzed by Sanger sequencing, these high mutation rates at a low number of positions were necessary in order to detect a sufficient number of mutations at a given position.
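The properties of NNK and NNS codons mentioned above can be checked with a short enumeration. This is a sketch for illustration only, using the standard genetic code; the helper names `expand` and `summarize` are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degenerate base symbols used in the text.
DEGENERATE = {"N": "ACGT", "K": "GT", "S": "CG"}


def expand(scheme):
    """All concrete codons covered by a degenerate scheme such as 'NNK'."""
    return ["".join(c) for c in product(*(DEGENERATE[s] for s in scheme))]


def summarize(scheme):
    """Return (number of codons, encoded amino acids, stop codons)."""
    codons = expand(scheme)
    amino_acids = sorted({CODON_TABLE[c] for c in codons} - {"*"})
    stops = [c for c in codons if CODON_TABLE[c] == "*"]
    return len(codons), amino_acids, stops


for scheme in ("NNK", "NNS"):
    n_codons, amino_acids, stops = summarize(scheme)
    print(scheme, n_codons, len(amino_acids), stops)
```

Both schemes comprise 32 codons that encode all 20 amino acids and include a single stop codon (TAG), which is why they permit complete randomization of a targeted position.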
As outlined in the Introduction, this limitation was overcome by application of high throughput sequencing: Fowler et al. resolved the relationship between the sequence of the WW domain and its function (binding to its peptide ligand) at high resolution [10]. However, two parameters were found to influence the mutational tolerance of a certain residue: (i) its involvement in binding to the peptide ligand and (ii) its impact on the structure and stability of the domain. Thus, the observed tolerances to mutation are determined by a mixture of two parameters, making it difficult to interpret the result.
In the present study, these mixed influences on the stability landscape of the CH3 domain were avoided by choosing ligands that interact with the CH2 domain. This strategy was enabled by the partial reversibility of the unfolding pathway of IgG1-Fc, as discussed above. As a consequence, the mutational tolerance of residues in the CH3 domain was determined solely by their impact on folding and stability of the CH3 domain and was not influenced by interference with ligand binding.
Apart from general insights into the relationship between the sequence and stability of proteins, this study may also prove to be very useful for protein engineering. For example, IgG1-Fc has been engineered for binding to the tumor antigen Her2/neu or to αvβ3 integrin by mutating the C-terminal structural loops of the CH3 domains [37,38]. However, changing these loop sequences resulted in decreased thermal stability of the CH3 domains [37]. One of the critical factors determining the fitness of a randomly mutated library is the part of the protein that is chosen for randomization. In this regard, the stability landscape of the CH3 domain might be valuable, as it provides information about the impact of specific residues on the stability of the CH3 domain. Mutating loop regions that are more tolerant to mutation might increase library fitness and enable the selection of stable binders.