Directed evolution, the generation of a library of mutations followed by function-based selection, is a standard tool for many protein engineering, synthetic biology and metabolic engineering applications (1, 2). This method has been applied to a variety of problems, including epitope mapping of protein-protein interactions (3–5), generation of novel protein function (6, 7), optimization of metabolic pathways through promoter or transcription factor engineering (8, 9) and enzyme engineering, such as modulation of substrate use (10) or generation of novel catalytic activity (11). Approaches to generate libraries of variants include targeted and random mutagenesis in addition to recombination-based methods, such as DNA shuffling (12) and non-templated recombination (13), while still other methods use a combination of these techniques (14). Of these, random mutagenesis is a popular and effective tool when combined with a robust function-based screen (15, 16) and can be used to adjust a variety of protein biochemical or biophysical features, such as antibody affinity for ligands (17).
The goal of a directed evolution experiment is to identify clones with enhanced function; the probability of success is improved by generating large libraries that contain few wild-type clones. Generation of randomly mutated libraries in E. coli typically proceeds in four steps: (i) amplification of the region of interest under error-prone conditions, (ii) restriction enzyme digestion, (iii) ligation into a similarly digested plasmid and (iv) transformation into competent cells. Methods to improve error-prone amplification have been extensively studied (18). For example, PCR conditions can be optimized for library quality, but this has little impact on the final library size. Similarly, as long as highly competent E. coli are used for transformation, this step does not typically limit library size (19). Sequential enzymatic digestion and ligation steps can limit the size of the final library (due to inefficient digestion or elimination/introduction of restriction sites during error-prone PCR (20)), often require optimization of insert:template ratios during ligation (21), and can be both time- and resource-intensive. Techniques designed to avoid these pitfalls often require altered reaction conditions, specialized primer design or additional time-consuming sub-cloning steps (22, 23).
As a result, the cloning steps present the major bottleneck to library generation. Alternatives to enzymatic digestion and ligation typically employ PCR or in vivo mutagenesis (24). Direct DNA manipulation can be avoided by in vivo techniques, such as somatic hypermutation in B cells (25) and plasmid amplification in E. coli strains lacking recombination/repair enzymes (e.g., XL1-Red, Stratagene). However, XL1-Red suffers from a very low mutation rate, requiring propagation of plasmids for up to 100 generations to achieve sufficient error rates, and does not restrict mutations to a region of interest, but introduces them throughout the plasmid and bacterial genome (23). QuikChange (Stratagene) is a widely used method for site-directed mutagenesis and has been expanded for targeted mutagenesis libraries, including those using megaprimers (20, 21, 26–28). Here, primers encoding the desired mutation(s) are annealed to methylated template plasmid. A linear amplification step extends the primer to generate a replica of the entire plasmid. The wild-type template is digested by the methyl-DNA-specific enzyme DpnI and, after transformation, by endogenous nucleases (29). Almost all QuikChange-based methods require extensive primer design and optimization of PCR conditions, and they suffer from a high incidence of wild-type background in final libraries.
To avoid the high levels of wild-type plasmid complicating QuikChange mutagenesis, Kunkel mutagenesis offers an alternative approach (30). Kunkel mutagenesis relies on annealing of a mutagenic primer to uracil-containing single-stranded template DNA (dU-ssDNA) followed by primer extension to generate a mutated complementary strand without uracil (31). Upon transformation into dut+ ung+ E. coli, the template dU-ssDNA is digested by endogenous nucleases, allowing only the altered DNA strand to be propagated. Libraries of over 10^11 individual clones randomizing eight consecutive amino acids have been constructed using this method (4). Although multiple regions within a gene can be simultaneously randomized by annealing multiple primers, this method is currently limited to targeted mutations.
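To put the library size quoted above in context, the following minimal Python sketch compares a library of 10^11 clones with the theoretical sequence space of eight fully randomized positions. It is illustrative arithmetic only: the use of 32-codon NNK degenerate codons and the assumption that clones are sampled uniformly at random are ours and are not taken from the cited work, and the function names are hypothetical.

```python
import math

AMINO_ACIDS = 20   # standard amino acids per randomized position
NNK_CODONS = 32    # assumption: NNK degenerate codons (not stated in the text)


def theoretical_diversity(positions: int, options_per_position: int) -> int:
    """Number of distinct sequences when each position varies independently."""
    return options_per_position ** positions


def expected_coverage(library_size: float, diversity: float) -> float:
    """Expected fraction of distinct sequences present at least once.

    Assumes clones are drawn uniformly at random from the sequence space
    (Poisson approximation); real libraries are biased, so treat this as
    an optimistic estimate.
    """
    return 1.0 - math.exp(-library_size / diversity)


protein_space = theoretical_diversity(8, AMINO_ACIDS)  # 20**8 protein sequences
codon_space = theoretical_diversity(8, NNK_CODONS)     # 32**8 DNA sequences
library_size = 1e11                                    # clones, as quoted above

print(f"protein sequence space: {protein_space:.3g}")
print(f"NNK codon sequence space: {codon_space:.3g}")
print(f"coverage of protein space: {expected_coverage(library_size, protein_space):.3f}")
print(f"coverage of codon space: {expected_coverage(library_size, codon_space):.3f}")
```

Under these assumptions, a library of this size is larger than the protein-level sequence space but smaller than the codon-level space, which is one reason very large libraries matter even when only eight positions are randomized.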
Here, we present a restriction enzyme-free approach to in vitro library generation, MegAnneal, which allows generation of large random mutagenesis libraries in a single day. This method involves five steps. (1a) Stop codons are inserted into the gene of interest and (1b) the resulting phagemid is used to create dU-ssDNA. The stop codons in the dU-ssDNA template do not interfere with megaprimer annealing but ensure that background clones which do not incorporate the megaprimer are also unable to produce full-length protein. (2a) The region of interest is amplified from plasmid under error-prone conditions. (2b) Using the amplified product as template, asymmetric PCR with a single nested primer generates a library of randomly mutated 3′ megaprimers. (3) In a single reaction, the megaprimers are annealed to dUTP- and stop codon-containing, single-stranded template plasmid, (4) extended with T7 DNA polymerase to create double-stranded DNA in which one strand incorporates a megaprimer, and ligated with T4 DNA ligase. (5) Upon transformation into dut+ ung+ E. coli, the uracil-containing wild-type template is cleaved but the megaprimer-containing plasmid DNA is not, resulting in the final library. MegAnneal reliably and rapidly creates large libraries (~10^7 cfu/μg DNA/transformation). By omitting the traditional digestion and ligation steps, we avoid two problematic steps in library generation. Inclusion of stop codons in the dU-ssDNA template plasmid prevents functional wild-type sequences from contaminating the final library. Finally, we have used megaprimers ranging from 150 to 750 bp in length to focus random mutagenesis on desired regions of the plasmid. We have successfully employed this method to generate libraries for three different single-chain antibodies (scFv) and, in conjunction with phage display, have identified variants with enhanced function from two of these libraries.
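As a rough illustration of why megaprimer length and error rate together influence how many clones carry no mutation at all, the sketch below models the number of mutations per megaprimer as Poisson distributed. The Poisson model, the function name and the example rate of 5 mutations per kb are assumptions chosen for illustration; they are not measurements or parameters from this work.

```python
import math


def unmutated_fraction(length_bp: int, mutations_per_kb: float) -> float:
    """Expected fraction of megaprimers carrying zero mutations.

    Assumes mutations occur independently along the sequence, so the
    count per megaprimer is Poisson with mean length_bp * rate / 1000.
    """
    mean_mutations = length_bp * mutations_per_kb / 1000.0
    return math.exp(-mean_mutations)


# Hypothetical error rate, for illustration only.
example_rate = 5.0  # mutations per kb

# Lengths spanning the 150 to 750 bp megaprimer range mentioned above.
for length in (150, 450, 750):
    fraction = unmutated_fraction(length, example_rate)
    print(f"{length} bp megaprimer: {fraction:.1%} expected to be unmutated")
```

In this simple model, shorter megaprimers need a higher per-base error rate to keep the unmutated fraction low. This is separate from the stop-codon strategy described above, which addresses template-derived background rather than unmutated megaprimers.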