In vitro scanning mutagenesis strategies are valuable tools to identify critical residues in proteins and to generate proteins with modified properties. We describe the fast and simple All-Codon Scanning (ACS) strategy that creates a defined gene library wherein each individual codon within a specific target region is changed into all possible codons with only a single codon change per mutagenesis product. ACS is based on a multiplexed overlapping mutagenesis primer design that saturates only the targeted gene region with single codon changes. We have used ACS to produce single amino-acid changes in small and large regions of the human tumor suppressor protein p53 to identify single amino-acid substitutions that can restore activity to inactive p53 found in human cancers. Single-tube reactions were used to saturate defined 30-nt regions with all possible codon changes. The same technique was used in 20 parallel reactions to scan the 600-bp fragment encoding the entire p53 core domain. Identification of several novel p53 cancer rescue mutations demonstrated the utility of the ACS approach. ACS is a fast, simple and versatile method, which is useful for protein structure–function analyses and protein design or evolution problems.
A simple and efficient approach that scans across a defined gene region and systematically produces all possible single-codon mutations within that region would be highly desirable for many applications. Random mutagenesis, coupled with genetic screening or selection strategies to identify desired phenotypes, is often used to isolate proteins with modified activities or to identify critical residues in proteins (1–3). Protein-coding DNA regions are typically mutagenized in vitro using codon cassette mutagenesis, site-directed mutagenesis, chemical modification of nucleotides, or error-prone PCR conditions (4–7). Error-prone PCR-based strategies are biased towards certain nucleotides or nucleotide combinations, rarely produce two or three base changes in a single codon, typically do not achieve saturation, may introduce multiple amino acid changes per product, and cannot be controlled in detail. Codon cassette mutagenesis requires specific sequence conditions and targets a single codon at a time, thus requiring considerable effort if saturation of extended gene regions is desired (5,8). More directed approaches, such as site-directed mutagenesis or its extensions to alanine scanning (9) and iterative saturation mutagenesis (10), sample only selected residues and are labor-intensive for saturation mutagenesis of large gene regions or whole genes. The All-Codon Scanning (ACS) strategy presented here overcomes many of the above restrictions. ACS allows the systematic change of each codon in a defined gene region into all other codons, while yielding only a single changed codon per mutagenesis product.
ACS mutagenesis of small and large regions of the human tumor suppressor gene p53 was used here to identify amino acid changes that can restore activity of p53 mutants found in human cancer. The p53 gene encodes a tumor suppressor protein that is a key cellular defense against cancer (11). p53 mutations occur in ~50% of human cancers, and about three-quarters of those mutations are single-point missense mutations in the p53 core domain (12,13). Restoring p53 activity in human tumors is therefore theoretically possible. Indeed, reactivated p53 holds great therapeutic promise because animal models have shown that reintroduction of active p53, even in advanced tumors, leads to tumor regression (14–16). As demonstrated in vivo, p53 cancer mutants can be reactivated through intragenic second-site suppressor mutations (17–19). These so-called ‘cancer rescue’ mutations suggest that drugs mimicking suppressor mutations might be designed (20). Therefore, studies on cancer rescue mutations aim to identify regions in p53 where perturbations may cause p53 reactivation. Cancer rescue mutations may also influence protein-protein interactions or protein stability without necessarily inducing major structural disruptions. Thus, identification of many or all p53 cancer rescue mutations could provide a better structure–function understanding of the phenomenon of p53 reactivation and support future drug development efforts.
As described below, the ACS strategy can be combined with synthetic DNA technologies to optimize hybridization fidelity. Several computational approaches exist that allow the design of synthetic DNA molecules and encompass a number of DNA sequence-determined properties without changing the encoded protein sequence (21–25). This is possible because of the degeneracy of the genetic code, and allows simultaneous optimization for several arbitrarily specified sequence properties, such as its own self-assembly, codon usage (26), GC/AT ratio (27,28), translational kinetics (29), codon-pair bias (30–32) and other DNA structural scales (33). Particularly useful for the ACS strategy is the Computationally Optimized DNA Assembly (CODA) system (34), which exploits the degeneracy of the genetic code to perform multiple sequence optimizations, including removal of deleterious RNA cross-hybridizations and secondary structures. The resulting global thermodynamic optimization of the DNA sequence implies that every location in the gene is assigned a globally unique thermodynamic address. However, any modern gene design software that removes RNA cross-hybridizations and secondary structure may be used instead of CODA.
The ACS strategy described here is a quick and simple method that overcomes several shortcomings of current scanning mutagenesis approaches. ACS creates a sequence-defined gene library with (i) each individual codon in a specific target region changed to all other possible codons, and (ii) only a single codon change per mutagenesis product.
Oligonucleotides were purchased from Integrated DNA Technologies (Coralville, IA, USA). PfuUltra II ultra DNA Polymerase was purchased from Stratagene (La Jolla, CA, USA). Restriction enzymes and T4 DNA ligase were obtained from New England Biolabs (Ipswich, MA, USA).
A pool of 10 forward and 10 reverse oligonucleotides, each with a single randomized NNN codon at its center, was used in each mutagenesis reaction. NNN yields all possible codons and so reduces the possibility of codon-level effects that might interfere with translation or expression (e.g. rare codons or bad codon pairs), while the stop codons are inactive in our genetic selection and so were not seen. The ACS method can be modified easily to use NNK or NNS (or any desired triplets) for applications in which codon-level effects are of less concern or stop codons would be deleterious. Similarly, in applications where DNA synthesis bias would be a concern, custom preparation of oligonucleotides with balanced base ratios is available (http://www.idtdna.com/catalog/analytical/Page1.aspx).
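The trade-off between NNN, NNK and NNS can be checked by expanding each degenerate triplet against the standard genetic code. The following is a minimal sketch for illustration only; the function names (`expand`, `summarize`) are hypothetical and are not part of the ACS protocol.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# IUPAC degenerate symbols used by the three randomization schemes.
IUPAC = {"N": "ACGT", "K": "GT", "S": "CG"}


def expand(scheme):
    """List every concrete codon encoded by a degenerate triplet such as 'NNK'."""
    return ["".join(c) for c in product(*(IUPAC[s] for s in scheme))]


def summarize(scheme):
    """Return (number of codons, number of stop codons, number of amino acids)."""
    translated = [CODON_TABLE[c] for c in expand(scheme)]
    stops = translated.count("*")
    amino_acids = len(set(translated) - {"*"})
    return len(translated), stops, amino_acids


for scheme in ("NNN", "NNK", "NNS"):
    n_codons, n_stops, n_aa = summarize(scheme)
    print(f"{scheme}: {n_codons} codons, {n_stops} stop codons, {n_aa} amino acids")
```

NNK and NNS halve the library size and retain a single stop codon while still encoding all 20 amino acids, at the cost of fixing the third-position base choice and thus losing the synonymous-codon diversity that NNN provides.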
The constant flanking regions surrounding each NNN codon matched the target incorporation site and were extended at least four bases past any other NNN codon. Consequently, incorporation of any mutagenesis oligonucleotide into a product must mask any previously incorporated NNN, and so only one NNN codon is possible in any product. Mutagenesis oligonucleotide extension can potentially lead to a bias towards products that are generated from oligonucleotides with higher annealing temperatures. To overcome this problem we used the Stratagene online mutagenesis oligonucleotide designer (http://www.stratagene.com/sdmdesigner/default.aspx) to obtain roughly equal annealing temperatures greater than 68°C.
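The masking rule can be illustrated with a short sketch that builds one primer pool for a single 10-codon window. It assumes the simplest reading of the design rule: every primer spans the whole window plus four constant bases on each side, so that any incorporated primer overwrites an NNN introduced earlier. The function names, the `flank` parameter and the toy template are hypothetical, and the sketch does not center the NNN codon or balance annealing temperatures, which in this study was done with the Stratagene designer.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string; N stays N."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def acs_primer_pool(template, window_start, n_codons=10, flank=4):
    """Return (forward, reverse) primer lists for one ACS window.

    template     -- coding-strand sequence (A/C/G/T)
    window_start -- 0-based index of the first base of the first target codon
    flank        -- constant bases kept beyond the outermost target codons
    """
    window_end = window_start + 3 * n_codons
    lo, hi = window_start - flank, window_end + flank
    if lo < 0 or hi > len(template):
        raise ValueError("window plus flanks must lie inside the template")
    forward = []
    for i in range(n_codons):
        pos = window_start + 3 * i
        # Wild-type sequence everywhere except one randomized codon.
        forward.append(template[lo:pos] + "NNN" + template[pos + 3:hi])
    reverse = [reverse_complement(p) for p in forward]
    return forward, reverse


# Toy template, not a p53 sequence.
toy_template = "ACGT" * 20
fwd, rev = acs_primer_pool(toy_template, window_start=10)
for primer in fwd:
    print(primer)
```

Because every primer in this sketch carries wild-type sequence at the other nine codons, a product can contain at most one randomized codon regardless of how many primers anneal during successive cycles.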
The 50 μl PCR reaction was carried out with 50 ng of template, 6 μM total primer concentration (300 nM of each forward and reverse primer), 400 μM dNTPs and 5 U of PfuUltra II polymerase. The reaction was initiated at 94°C for 10 min, followed by 18 cycles (30 s at 94°C, 40 s at 55°C and 200 s at 72°C), and concluded by incubation at 72°C for 15 min. The amplification products were digested with DpnI to remove the methylated template plasmids. At this point, a plasmid library containing completely saturated target regions of a gene of interest can be generated by transformation into Escherichia coli. Alternatively, the mutagenized products can be amplified by PCR and introduced into target vectors by homologous recombination or used in many other standard applications.
A region containing the p53 core domain (amino acids 50–334) was re-amplified from the mutagenesis mix. The PCR products were purified (QIAquick™ PCR purification kit, Qiagen, Germany) and used to generate p53-expression plasmids in yeast using the gap-repair strategy (35). To identify mutations that reactivate p53 cancer mutants we used a p53-tester yeast strain. Growth of this yeast strain in medium lacking uracil depends on active p53 because the URA3 gene is engineered to be under transcriptional control of p53 (18,36). The plasmid pTW300, which is a centromeric plasmid expressing human p53 under control of the ADH1 promoter (36), was gapped with AgeI/StuI. These restriction sites are positioned so that the ACS final PCR products overlapped with both ends of the gapped plasmids by at least 60 nt at both sites. Taking advantage of the highly efficient homologous recombination in Saccharomyces cerevisiae, 500 ng of the PCR products and 500 ng of the gapped plasmid were co-transformed into yeast strain RB379. The p53-tester yeast strain RB379 was engineered so that expression of the URA3 gene is dependent on active p53 (18). Cells were directly plated onto SC plates lacking uracil and incubated at 37°C, thus selecting for repaired, functional plasmids and a possibly functional p53 molecule. These plasmids were directly sequenced from yeast cells, after colony-PCR amplification (Genewiz, San Diego, CA, USA).
The p53 gene sequence was CODA-optimized to encode its own correct self-assembly as described (34). Computations were performed on a 64-node cluster of 3 GHz Xeon dual processors. Computer code is available upon request. The p53 sequence was also designed for optimal expression in S. cerevisiae using synonymous codon substitutions. Codon usage for yeast was estimated using the codon adaptation index (37) and codon pair statistics (30–32,38). The CODA output is a list of DNA oligonucleotide sequences and a set of instructions for combining the purchased oligonucleotides by polymerase extension and PCR into the full length synthetic gene.
For each overlapping intermediate DNA fragment, the constituent oligonucleotide set was mixed to final concentrations of 0.1 μM with an excess (1 μM) of the 5′ and 3′ primer oligonucleotide. Each oligonucleotide set was extended to an intermediate DNA fragment by a PCR reaction using 2.5 U of PfuUltra II ultra DNA polymerase (Stratagene), 200 μM dNTPs (Roche Diagnostics, Indianapolis, IN) and 1× PfuUltra reaction buffer. These reactions were performed in an MJ Research PTC-225 thermal cycler using the following calculated-control protocol: 10 min denaturation step at 95°C, followed by 35 cycles of 30 s at 95°C, 30 s at 68°C and 20 s at 72°C, and a final step of 5 min at 72°C. The PCR products were subjected to electrophoresis in a 1% agarose gel, PCR purified (QIAquick PCR clean up, Qiagen, Valencia, CA, USA), and quantified by NanoDrop (Thermo Scientific, Wilmington, DE, USA). Each overlapping intermediate DNA fragment was mixed to a final concentration of 0.1 μM along with 1 μM NdeI/BamHI cloning oligonucleotides, 2.5 U of PfuUltra II ultra DNA polymerase (Stratagene), 200 μM dNTPs (Roche Diagnostics, Indianapolis, IN), and 1× PfuUltra reaction buffer. These primer extension and PCR amplification reactions were performed in an MJ Research PTC-225 thermal cycler as above, but with 45 s extension at 72°C. The full-length p53 gene was separated in a 1% agarose gel, PCR purified, and cloned into a pCODA vector. The sequence was confirmed by DNA sequencing.
H1299 and Saos-2 cell lines (p53 negative) were grown in high-glucose DMEM with 10% fetal bovine serum. All transfections were done with Lipofectamine 2000 (Invitrogen, Carlsbad, CA, USA) according to the manufacturer’s protocol. All experiments were done in triplicate. p53-dependent reporter activity values were normalized to Renilla luciferase activity by co-transfection with expression plasmids carrying the Firefly luciferase gene under the control of the p21 promoter and a CMV promoter driven p53 (36). Cells were harvested 48 h after transfection and analyzed for luciferase activity with the Dual-Glo Luciferase Assay System (Promega, Madison, WI, USA) according to the manufacturer’s protocol. Light emission was detected with the Monolight 2010 luminometer (Analytical Luminescence Laboratory, San Diego, CA, USA).
The location of cancer mutants and cancer rescue mutants in the p53 3D structure was visualized using Visual Molecular Dynamics (VMD) (http://www.ks.uiuc.edu/Research/vmd/).
A CODA-optimized human p53 core domain DNA sequence was optimized to code for the same amino acid sequence as the native gene, to be easily self-assembled from synthetic oligonucleotides by polymerase extension, respect hypothesized codon usage and codon pair preferences (30–32) in yeast, and especially to avoid RNA secondary structure and off-target cross-hybridization, which is known to be a source of mutagenesis failure (33). Nucleotide changes required to achieve these properties (Figure 1A) were computed by the CODA algorithm as described (34).
Briefly, the p53 gene was divided into six overlapping intermediate DNA fragments, each of which was subdivided into 10 short overlapping and abutting oligonucleotides. There exists a temperature gap within which, with high thermodynamic probability, correct hybridizations are annealed and incorrect hybridizations are melted. The melting temperature distributions of the CODA-designed p53 sequence show a melting temperature gap of 20°C (Figure 1B), which essentially results in a globally unique thermodynamic address for each location in the target gene and thereby minimizes mutagenesis difficulties due to mispriming and cross-hybridization. Self-assembled overlapping intermediate DNA fragments were obtained by primer extension and PCR amplification. The overlapping intermediate DNA fragments were combined in another PCR amplification reaction to generate a single-product, sequence-verified, full-length p53 core domain (Figure 1C).
ACS is a single-tube PCR-based mutagenesis strategy that creates all 64 possible codons at each single position in a 30-bp-wide target region of a gene of interest. Individual reactions were restricted to 30-bp sections to limit oligonucleotide length and so avoid unwanted mutations due to oligonucleotide synthesis errors. The novel arrangement and site-specific hybridization of multiplexed overlapping oligonucleotides, described below, result in only a single codon change per gene product. Thus, library complexity remains practical for downstream applications (Figure 2).
First, we verified by sequencing more than 500 randomly chosen individual mutagenesis products that, at a given amino acid position, all 64 possible codon insertions were observed, and all 20 amino acids were represented at approximately their expected frequencies (correlation coefficient r = 0.79) (Figure 3). We observed a very high mutagenesis efficiency (>95%), which we attribute to the unique thermodynamic profile of the synthetic gene design. The melting temperature distributions of the CODA-designed p53 sequence show a melting temperature gap of 20°C (Figure 1B). Thus, correct annealing of any given oligonucleotide was strongly favored over hybridization at an incorrect position. Even with a 3-nt mismatch mutagenesis oligonucleotide, the temperature gap between correct hybridizations and incorrect hybridizations remained at a very selective 15°C. This high thermal gap avoids annealing to incorrect gene locations and solves one of the main obstacles to efficiently produce sequence-defined multiplexed DNA libraries (33).
Next, we chose the region encoding amino acids 232–241 of p53 to test the efficiency of ACS mutagenesis in identification of rescue mutations for the p53R158L cancer mutant. Amino acid region 232–241 in mutant p53R158L was specifically selected because it had been extensively explored previously by error-prone PCR and was identified as a ‘global suppressor motif’ due to the accumulation of cancer rescue mutations (36). The region containing the p53 core domain was re-amplified from the mutagenesis mix. The PCR products were purified and used to generate p53-expression plasmids in yeast using the gap-repair strategy (Figure 2). As described in detail in the ‘Materials and Methods’ section, a p53-tester yeast strain with p53-controlled expression of the selectable URA3 gene was used for these experiments to identify mutations that restore p53 activity (36).
For each ACS reaction we obtained at least 200 000 transformants, ensuring full coverage of all generated mutants. Specifically, considering mutations at 10 possible codon locations, each of which can have any one of 64 codons, the 640 theoretical different mutants are covered by >200 000 transformants with >300-fold coverage (as >200 000/640). Such high coverage almost certainly guarantees that every codon change was represented (the probability that any codon change is missing is P < 1.0 × 10^-130 by the binomial distribution with Bonferroni correction for 640 trials).
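The coverage argument can be reproduced with a few lines of arithmetic. This sketch assumes, as an idealization, that each transformant carries one of the 640 variants with equal probability; the function name is hypothetical. The calculation is done in log space because the probability is far too small to inspect conveniently otherwise.

```python
import math


def log10_prob_any_missing(n_transformants, n_variants):
    """log10 of the Bonferroni (union) bound on the chance that any variant is absent.

    Assumes every transformant carries one of n_variants equally likely variants.
    """
    # log of P(one given variant is absent from every transformant)
    log_p_missing_one = n_transformants * math.log1p(-1.0 / n_variants)
    # Bonferroni correction: multiply by the number of variants (add in log space)
    return (log_p_missing_one + math.log(n_variants)) / math.log(10)


n_variants = 10 * 64          # 10 codon positions x 64 codons
n_transformants = 200_000

print("fold coverage:", n_transformants / n_variants)
print("log10 P(any variant missing):", log10_prob_any_missing(n_transformants, n_variants))
```

Under these assumptions the bound comes out at roughly 10^-133, consistent with the reported P < 1.0 × 10^-130. Real libraries deviate from uniform representation, so the figure is best read as an indication of scale rather than an exact probability.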
ACS mutagenesis of the p53 region encoding amino acids 232–241 identified all five previously described rescue mutants and one new mutation (Y234L) that reactivated the p53R158L cancer mutant (Table 1). Interestingly, the Y234L mutation required a minimum of 2 nt changes and so is unlikely to be found using error-prone PCR, thus providing evidence for the value of our exhaustive approach.
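The claim that a substitution needs at least two nucleotide changes can be checked directly against the standard genetic code. The sketch below is illustrative only; `min_nt_changes` is a hypothetical helper, and because the codon present at position 234 of the gene used here is not stated in this text, both tyrosine codons are tested.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def min_nt_changes(codon, target_aa):
    """Smallest number of base differences between codon and any codon for target_aa."""
    return min(
        sum(a != b for a, b in zip(codon, target))
        for target, aa in CODON_TABLE.items()
        if aa == target_aa
    )


for tyr_codon in ("TAT", "TAC"):
    print(tyr_codon, "-> Leu requires at least", min_nt_changes(tyr_codon, "L"), "nt changes")
```

Whichever tyrosine codon is present, no leucine codon is reachable with a single base substitution, which is why a method dominated by single-base changes, such as error-prone PCR, would rarely produce Y234L.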
We next wanted to scan a novel region in p53 for reactivating second-site mutations that can restore activity to p53 mutants. We considered the 10 p53 cancer mutants that are most commonly found in human cancer and can be constructed so that they differ by two or more nucleic acid changes from the wild-type (R249S, G245S, H179R, R273H, R248L, R158L, R280T, P151S, P152L, P278L). Cancer mutants were restricted to those with multiple nucleotide changes from wild-type p53 to minimize false-positive errors due to spontaneous reversion, although analysis of the experimental results later indicated that this precaution was not necessary. These 10 cancer mutants were used for ACS mutagenesis of amino acid region 114–123. Note that one of these p53 mutants (p53R158L) was used in the ACS experiment targeting region 232–241. Region 114–123 was considered a potential cancer rescue region because multiple amino acid changes occurred there spontaneously in previous cancer rescue mutant screens but no single amino acid change cancer rescue mutations had been found previously [(36); Brachmann, R. K., personal communication]. In addition, a structure-based computational classifier trained to predict regions in p53 that likely contain cancer rescue mutations selected region 114–123 (39). Therefore, this region was thought likely to include single amino acid cancer rescue mutants and was selected for ACS.
We obtained a minimum of 200 000 yeast transformants for each library, which represents >300-fold coverage of the expected 640 mutants in each library. We isolated a total of 146 transformants that grew on plates lacking uracil. The p53 core domains of all 146 potential rescue mutants were amplified by colony PCR and sequenced. Three of the isolated uracil prototroph strains contained wild-type p53, which was likely a result of reversion of the cancer mutation due to selection pressure. However, the vast majority of sequences (143) retained the cancer mutation. A total of 13 different potential rescue mutants showed both the cancer mutation and a second mutation that changed the amino acid sequence in the mutagenized region. All potential rescue mutants were regenerated using site directed mutagenesis, transformed into the p53-tester yeast strain, and analyzed for their ability to confer uracil prototrophy (Figure 4). All rescue mutations were confirmed. Six of them had a strong reactivation activity, whereas seven showed only a very small gain in activity compared to the cancer mutant and were not further tested (Table 2). We measured p53 activity of the strong rescue mutants in human H1299 cells, which lack endogenous p53 activity. p53 cancer mutants and rescue mutants were expressed in H1299 cells and activity was measured using a p53-dependent luciferase reporter (18) (Figure 4). Without exception, mutations confirmed in the yeast-based screen also reactivated p53 cancer mutants in human cells.
All p53 cancer rescue mutants identified in amino acid region 114–123 were new and had not been identified by previous error-prone PCR mutagenesis approaches using the same yeast-based p53 activity assay. ACS resulted in a mutant that reactivated the p53 cancer mutant p53P152L, which had not been rescued before. Importantly, three out of the six identified amino acid changes that strongly reactivated p53 cancer mutants required 2- or 3-nt changes. Such mutations are underrepresented in current in vivo or in vitro mutagenesis approaches, but were easily identified using our complete and unbiased exhaustive approach.
ACS reactions can easily be performed in parallel to produce gene libraries that target much larger gene regions or entire genes. We used ACS to saturate all 200 amino acids of the p53 core domain with all possible single amino acid changes. The ACS method consists of single-tube reactions, each covering a region of 30 bases. Twenty parallel ACS reactions were performed and the mutagenesis products were introduced into the p53-tester yeast strain (Figure 2). Combining these single ACS reactions produced a completely saturated library with exactly one mutation in each product. At the same time, the modularity of the system makes it possible to exclude specific gene areas of interest, or even key residues such as the p53 cancer mutation itself, from mutagenesis.
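The bookkeeping for the parallel reactions amounts to tiling the target region into consecutive 10-codon windows and leaving out any protected codon. A minimal sketch follows; the function name, the relative codon numbering 1 to 200 and the protected position 42 are hypothetical choices made only for illustration.

```python
def acs_windows(first_codon, last_codon, codons_per_reaction=10, protected=()):
    """Split an inclusive codon range into consecutive ACS reaction windows.

    Returns a list of lists of codon numbers; protected codons are omitted,
    i.e. no NNN primer would be ordered for them.
    """
    skip = set(protected)
    windows = []
    for start in range(first_codon, last_codon + 1, codons_per_reaction):
        stop = min(start + codons_per_reaction - 1, last_codon)
        windows.append([c for c in range(start, stop + 1) if c not in skip])
    return windows


# 200 codons in relative numbering give 20 reactions of 10 codons (30 nt) each.
reactions = acs_windows(1, 200)
print(len(reactions), "reactions;", "first window:", reactions[0])

# Protecting one (hypothetical) codon removes it from its window only.
reactions_protected = acs_windows(1, 200, protected=(42,))
print("window containing the protected codon:", reactions_protected[4])
```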
The three p53 cancer mutants p53R175H, p53R273H and p53G245S were exhaustively scanned for single-change cancer rescue mutants throughout the 200 amino acid core region. We selected these particular cancer mutants due to their specific characteristics. p53R175H is the most common cancer mutant found in human tumors, but reversion to wild-type requires only a single nucleotide change and we were concerned about a potentially high spontaneous reversion rate interfering with ACS screening. Eventually, however, random reversion to wild-type p53 turned out to be insignificant. We also chose to analyze p53R273H, the most common cancer mutant that differs by 2-nt changes from wild-type p53, to avoid altogether the possibility of spontaneous reversion. Neither cancer mutant had been rescued previously by intragenic single amino acid changes, and thus might have been too stringent an initial test case for ACS. Therefore, we also analyzed p53G245S, the most common cancer mutant that had been rescued previously by a single amino acid change.
Table 3 summarizes the results of ACS saturation mutagenesis on the 200 amino acids in the p53 core domain for these three different cancer mutants. p53R175H could not be reactivated by a single amino acid change. For p53R273H and p53G245S, we recovered all previously known single rescue mutations and identified six new reactivation sites (Table 3). Three of the six new sites required two or three changes from wild-type p53, and so are unlikely to be seen using error-prone PCR. These results further demonstrate the utility of ACS for efficient, economic, and systematic saturation scanning mutagenesis in defined gene sections.
In summary, the ACS strategy creates a sequence-defined gene library, which changes each individual codon in a specific target region into all possible codons, with only a single codon change per mutagenesis product. The ACS approach is fast, simple, and generally applicable to any problem that requires systematic changes of defined regions in a DNA sequence, including protein design, directed evolution, structure/function analyses and promoter studies.
Individual ACS mutagenesis reactions can target changes in a single codon (or even a single base pair position) for as many as 30 contiguous base pairs. The upper base pair target limit restricts primer length to avoid incorporation of mutations due to primer synthesis errors, considering a typical error rate of one per 600 synthesized nucleotides (40–43). By performing parallel reactions, mutagenesis can be expanded to larger regions or whole genes. The ACS system allows complete control over the target regions; specific gene sections, or even specific nucleotides, easily can be protected from mutagenesis. Thus, ACS is an adaptable and versatile tool.
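The effect of primer length on synthesis errors can be estimated from the quoted error rate of one per 600 nucleotides. The sketch below assumes independent per-nucleotide errors, which is a simplification; the function name and the example lengths are hypothetical and are not primer lengths reported in this study.

```python
def fraction_error_free(length_nt, error_rate=1 / 600):
    """Expected fraction of oligonucleotides of a given length with no synthesis error,
    assuming errors occur independently at each nucleotide."""
    return (1.0 - error_rate) ** length_nt


# Illustrative lengths only.
for length in (40, 60, 100):
    print(f"{length} nt: {fraction_error_free(length):.3f} error-free")
```

The error-free fraction falls steadily with length, which is the rationale for capping each reaction at a 30-bp target region rather than using longer primers to cover more codons per tube.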
Here, ACS was used to mutagenize small and large regions of the human tumor suppressor gene p53. Eleven common p53 cancer mutants, which collectively are found in well over one million diagnosed cancer cases per year, were tested. Reactivating intragenic mutations for four of them were identified. Moreover, the p53 cancer mutants p53P152L and p53R273H had not been reactivated previously by single amino acid changes. Thus, ACS obtained more complete biological information than previously had been obtained using methods such as error-prone PCR (Supplementary Table S1). Notably, 7 out of the 12 strong new p53 cancer rescue mutants we identified in this study required 2- or 3-nt changes, confirming the utility of ACS as an unbiased approach.
To understand better the regions selected and their relationship to the p53 protein, it is helpful to consider molecular visualizations of p53 (Figure 5). Cancer mutations are indicated in red and rescue mutations in green. The 3D locations are consistent with the hypothesis that different p53 cancer rescue mutants require different rescue mechanisms (44–46), and suggest that different rescue regions may correspond to distinct modes of cancer mutant reactivation. For example, rescue mutations for the p53 cancer mutant p53R273H are located in distinct regions that do not overlap with rescue regions associated with other cancer mutants (Figure 5). A comprehensive identification of such rescue regions using ACS or similar approaches may contribute to targeted drug development, as it helps to identify surface exposed, drug accessible sites on p53 where perturbations may result in reactivation.
A significant proportion of human tumors express full length but mutated, inactive p53. Consequently, there are ongoing efforts to find small molecule drugs that mimic the cancer rescue effect of reactivating p53 second-site suppressor mutations (44,45,47,48). Despite impressive progress, efforts are hampered by a limited understanding of the p53 mutation–structure/function relationship (13,46,49,50). The ACS strategy described here has expanded and diversified the collection of rescue mutations that reactivate p53 cancer mutants. We hope that these efforts will lead to further insight into general structural changes that can rescue p53 cancer mutants, and thereby facilitate rational drug design approaches that seek to exploit similar effects.
Supplementary Data are available at NAR Online.
National Institutes of Health BISTI (grant number CA-112560, partial); National Science Foundation (grant number IIS-0326037). National Institutes of Health Biomedical Informatics Training Program (grant number LM-07443 to S.A.D.). Funding for open access charge: National Institutes of Health BISTI (grant number CA-112560); National Science Foundation (grant number IIS-0326037).
We thank Lydia Ho, Faezeh Salehi and Chris Wassman for helpful discussion and technical assistance. S. A. D. acknowledges support from the National Institutes of Health Biomedical Informatics Training Program. Materials and computer code will be provided under the terms of the standard Materials Transfer Agreement of the Regents of the University of California.