A procedure was developed that allows the assembly of a GOI into a linear DNA template with all the components necessary for in vitro TS–TL. Assembly is achieved using a coupled uracil excision–ligation strategy based on USER enzyme and T4 DNA ligase, which allows the simultaneous and seamless assembly of three different PCR products with a minimal number of work-up steps, thereby conferring simplicity, robustness and speed on the method. Several factors proved critical for its success.
To achieve a high efficiency of assembly and efficient PCR purification, it was critical to purify all substrates by PEG–MgCl2 precipitation prior to assembly (24). Otherwise, primer dimers and other small non-specific by-products that could not readily be visualized with ethidium bromide interfered with the assembly process (, L8-10). Precipitation with PEG–MgCl2 constitutes an inexpensive and fast method to purify PCR products. Size selectivity can be tuned by titrating the concentration of PEG-8000 for a given DNA fragment, although the resolution is lower than that of gel purification. Nonetheless, precipitation with PEG–MgCl2 proved necessary and sufficient in this case to generate clean substrates for efficient assembly through uracil excision–ligation.
Despite its applicability to any assembly procedure that features PCR products, including overlap extension PCR and restriction–digestion–ligation, precipitation with PEG–MgCl2 is virtually unused in practice. Our results emphasize that this step is critical and, if applied prior to the assembly step, overcomes the need to gel purify DNA templates after the assembly or purifying PCR steps. First, it saves time and labour, as precipitation can be completed in ~10–15 min as opposed to the several hours needed to separate and extract DNA fragments from agarose gels. Furthermore, it prevents the GOI from being damaged by exposure to UV light during gel excision and pre-empts any inhibitory effects that contaminants such as residual agarose and salts may have on downstream reactions (7). For instance, the majority of commercial in vitro TS–TL systems explicitly advise against purifying DNA templates by agarose gel electrophoresis (17). Similarly, it has been reported in the literature that templates excised from agarose gels using commercial purification kits are not sufficiently clean for in vitro TS–TL and generally need to be further purified by ethanol precipitation or subjected to additional washing steps, e.g. with washing buffer during purification with the QIAquick kit (7). We likewise found that the efficiency of PCR is markedly reduced if the DNA was gel purified prior to amplification without additional purification steps (data not shown).
If a PCR is particularly prone to non-specific by-products that cannot easily be removed by precipitation with PEG–MgCl2, it may be necessary to perform two successive nested PCRs (12): a standard PCR to recover the desired DNA template followed by a short second run of 10 cycles to generate the assembly substrate with two uracil residues. This may apply particularly in real selections, where very few DNA fragments serve as templates and the amplification efficiency of the target amplicon is compromised by greater competition from non-specific amplification reactions.
In terms of sequence requirements, template assembly through uracil excision–ligation imposes only minimal restrictions on the splice sites: all that is required is an adenine and a thymidine spaced several nucleotides apart in the 5′→3′ direction, so that the excised regions dissociate efficiently (22–24). In the context of directed evolution, the procedure allows rapid and flexible variation of the diversified regions, saving laborious subcloning steps when different regions of a gene are targeted, and no additional restriction sites need to be introduced. In practice, all splice sites need to differ by at least two base pairs for a complementary overlap of 5–6 bp to prevent single-stranded extensions from cross-hybridizing (data not shown). Equally, the single-stranded extensions must not be able to fold onto themselves, which would lead to the formation of covalently closed loops. Calculating the base pair probabilities of the single-stranded extensions with a suitable folding programme, e.g. the RNAfold web server (http://rna.tbi.univie.ac.at/), which conveniently displays base pair probabilities as a dot plot, gives a good indication of whether secondary structures pose a problem (31).
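The rule that splice sites should differ by at least two positions can be screened for before primers are ordered. The following is a minimal sketch, not part of the published protocol: it assumes that each junction is represented by the top-strand sequence of its 5–6 nt overlap, that all overlaps have the same length, and that an overlap which is too similar to another overlap or to its reverse complement is a cross-hybridization risk. The function names and the example sequences are hypothetical. It does not replace a folding analysis of the individual extensions.

```python
from itertools import combinations

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def find_conflicts(overlaps: dict, min_diff: int = 2) -> list:
    """List junction pairs whose overlaps differ at fewer than min_diff positions.

    Each pair is compared directly and against the reverse complement,
    because either strand of one junction could anneal to the other.
    """
    conflicts = []
    for (name1, seq1), (name2, seq2) in combinations(overlaps.items(), 2):
        direct = mismatches(seq1, seq2)
        if direct < min_diff:
            conflicts.append((name1, name2, "direct", direct))
        revcomp = mismatches(seq1, reverse_complement(seq2))
        if revcomp < min_diff:
            conflicts.append((name1, name2, "reverse complement", revcomp))
    return conflicts


if __name__ == "__main__":
    # Hypothetical overlaps, for illustration only.
    junctions = {"J1": "ACTGGT", "J2": "ACTGGA", "J3": "AGCTCT"}
    for conflict in find_conflicts(junctions):
        print(conflict)
```

In this illustration, J1 and J2 differ at a single position and are reported, whereas J3 differs sufficiently from both.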
In comparison, template assembly procedures based on overlap extension PCR or restriction–digestion and ligation take significantly longer: up to an entire day as opposed to ~90 min. This can be attributed to sequential restriction–digestion and ligation reactions, and to relatively long overlap extension and purifying PCRs featuring 25–30 cycles. Furthermore, the purifying PCR frequently generates non-specific fragments, which requires the desired DNA template to be purified by agarose gel electrophoresis (7–10). The frequent ineffectiveness of the purifying PCR is presumably a consequence of the reduced efficiency of the overlap extension and ligation reactions. For instance, in ligation reactions, template circularization faces additional competition from template concatemerization. Furthermore, the palindromic nature of the single-stranded extensions generated by the majority of restriction enzymes enables DNA fragments to form homodimers which, just like the desired product, can be exponentially amplified by PCR when they are formed between the flanking regions or the plasmid backbone. In addition, the efficiency of restriction–digestion and ligation strategies is often limited by the idiosyncrasies of individual restriction enzymes, which can cause unexpected problems such as inefficient digestion at restriction sites located near the end of a linear DNA fragment (33). In overlap extension PCR, a qualitative analysis of DNA hybridization kinetics suggests that even under perfect conditions, in the absence of any secondary structures, only a fraction of templates recombines in a given thermal cycle. For instance, the hybridization half-life for an overlap of 25 bp with both assembly substrates present at a concentration of 10 nM is on the order of 6–7 min (34). This compares to recombination times of ≤1 min that feature in most, if not all, overlap extension PCR protocols published in textbooks (36), commercial product manuals (38) and scientific publications that either specifically deal with the subject of overlap extension PCR (39) or apply it in the context of directed evolution to assemble and regenerate libraries (9). In fact, it is unclear to what extent overlap extension PCR is suitable for assembling gene libraries. For instance, templates that by chance recombine in an earlier thermal cycle will also enter exponential amplification earlier and thus increase in abundance, so that templates are not uniformly amplified. If the size of a library then exceeds the recombination efficiency so that stochastic hybridization events cannot be averaged out for subpopulations of identical mutants, the library will become randomly biased. This makes the enrichment less dependent on a functional trait and may even reduce the diversity of a library, as a portion of mutants can be eliminated during the assembly process.
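The gap between a half-life of 6–7 min and a recombination step of ≤1 min can be made concrete with a rough estimate. The sketch below assumes, for illustration only, ideal irreversible second-order kinetics with both substrates at the same initial concentration C0, for which the half-life is 1/(k·C0) and the fraction hybridized after time t is t/(t + t½). It ignores temperature ramps, secondary structure and competition from fully complementary strands, and the function names are hypothetical.

```python
def fraction_hybridized(t_min: float, half_life_min: float) -> float:
    """Fraction of strands hybridized after t_min minutes.

    Assumes irreversible second-order kinetics with equal initial
    concentrations, so that C(t) = C0 / (1 + t / t_half).
    """
    if t_min < 0 or half_life_min <= 0:
        raise ValueError("time must be >= 0 and half-life must be > 0")
    return t_min / (t_min + half_life_min)


def implied_rate_constant(half_life_min: float, conc_molar: float) -> float:
    """Second-order rate constant (per molar per second) implied by t_half = 1 / (k * C0)."""
    return 1.0 / (half_life_min * 60.0 * conc_molar)


if __name__ == "__main__":
    # Half-lives of 6 and 7 min bracket the range quoted in the text.
    for half_life in (6.0, 7.0):
        fraction = fraction_hybridized(1.0, half_life)
        print(f"t_half = {half_life} min: fraction after 1 min = {fraction:.3f}")
```

Under these assumptions, a 1 min recombination step would hybridize only about 12.5–14% of the substrates per cycle, which is consistent with the statement that only a fraction of templates recombines in a given thermal cycle.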
In some cases, the entire template has been amplified after selection so that template reassembly is unnecessary in the first place (5). This strategy is, however, not broadly applicable; e.g. it has been reported that PCR amplification is inefficient if an identical set of primers is used to prepare templates and recover genes after selection (7). This can be attributed to partial exonucleolytic degradation of the ends, either by the 3′→5′ exonuclease activity of a proof-reading polymerase or by exonuclease activities in the cell extract (7). Successive nested PCRs of the whole template are equally impractical; e.g. if synthetic modifications are introduced (5), each set of nested oligonucleotides needs to be modified separately. Depending on the type of modification, this may add significantly to the costs. Furthermore, PCR amplification of the whole template is expected to accumulate mutations in constant parts, particularly in the coding regions of long and essential fusion genes, which can significantly compromise the performance of a screening and selection process (9). An exception relates to streptavidin, which forms the basis of a non-covalent DNA display system (5) and has also been the subject of evolutionary optimization to improve binding towards a biotin analogue (16). In both cases, the entire streptavidin gene was amplified for at least five successive selection cycles with no detrimental effect on the selection process, suggesting that streptavidin can tolerate this level of mutation. This may, however, be a protein-specific effect and does not necessarily apply to systems that rely on less robust proteins.
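The concern about mutations accumulating in constant regions can be illustrated with a back-of-the-envelope estimate. The sketch below uses a common first approximation in which the mean number of polymerase errors per template is the product of the error rate per base per duplication, the template length and the number of duplications, and in which the error-free fraction follows from a Poisson distribution. All numerical values in the example are hypothetical placeholders rather than measurements from this work, and the function names are likewise assumptions.

```python
import math


def expected_mutations(error_rate: float, length_bp: int, doublings: float) -> float:
    """Approximate mean number of mutations per template.

    error_rate is given per base per duplication; this is a
    first-order estimate, not an exact model of PCR.
    """
    return error_rate * length_bp * doublings


def fraction_unmutated(error_rate: float, length_bp: int, doublings: float) -> float:
    """Poisson estimate of the fraction of templates carrying no mutation."""
    return math.exp(-expected_mutations(error_rate, length_bp, doublings))


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    error_rate = 1e-5          # errors per base per duplication
    doublings_per_round = 20   # effective duplications per selection round
    for length_bp in (500, 2000):
        for rounds in (1, 5):
            total = doublings_per_round * rounds
            clean = fraction_unmutated(error_rate, length_bp, total)
            print(f"{length_bp} bp, {rounds} round(s): {clean:.2f} error-free")
```

The estimate scales with template length and with the cumulative number of duplications, which is why long fusion genes amplified in full over many selection cycles are expected to be affected most.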
In summary, the assembly protocol presented here should be applicable to many different in vitro screening and selection systems, especially those that feature relatively long fusion genes that are susceptible to mutations and thus rely on nested PCRs. The protocol saves time and labour and enables many rapid successive selection cycles. This is highly practical when mimicking evolutionary processes such as genetic drift.