A number of cloning strategies have evolved over the past three decades, ranging from the conventional cassette cloning approach using restriction enzymes and ligase to more recent techniques involving Cre recombinase (1) and integrase–excisionase systems (2). The latter techniques are adequate for basic cloning where the goal is to transfer a single insert into a vector. However, these techniques are less powerful when the number of unique sequences to be inserted increases. Such is the case with aptamer libraries, which require the insertion of thousands or millions of different sequences into the same backbone vector.
At the DNA level, an aptamer library can be defined by a constant region shared by all clones and a variable region that is unique to every member of the library. To achieve a high-complexity library, it is necessary to insert a suitably large number of unique inserts into a specific site within the vector. This applies to peptide libraries and ribozyme libraries, among others (3). The number of unique clones within the library defines its complexity, and it is usually desirable to have a high complexity that represents as many different sequences as possible. The creation of such libraries can represent a significant share of the time invested in setting up a genetic screen.
The construction of a random peptide expression library requires a random central region, usually 27–45 nucleotides in length, flanked by regions of defined sequence, and the backbone vector chosen to carry the library. Examples of such libraries can be found in multiple publications (3). The process usually involves a modified version of the cassette cloning approach. In brief, a small oligonucleotide complementary to the non-random 3′ end of the library oligonucleotide is annealed to prime a polymerase reaction that makes the library insert double stranded (3). The now double-stranded insert is restricted with endonucleases, purified by gel electrophoresis and ligated into a vector previously digested with complementary restriction enzymes. Because the oligonucleotide is usually less than 100 bases long, it can be difficult to efficiently purify the double-stranded insert that was successfully cut with both restriction enzymes from incompletely digested material. Both the ligation of a small insert into a much larger vector and the inability to adequately purify the insert can result in loss of library complexity.
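To put the notion of complexity in numerical terms, the following is a minimal sketch, not part of the published method, that compares the theoretical sequence space of a fully random region with the number of distinct sequences expected among a given number of clones. It assumes that every position is drawn uniformly from the four bases and that each clone is an independent draw; real library oligonucleotides often use biased or restricted codon schemes, which this sketch ignores. The function names `sequence_space` and `expected_unique` are hypothetical and chosen for illustration only.

```python
import math


def sequence_space(n_nt: int) -> int:
    """Number of distinct DNA sequences of length n_nt (4 bases per position)."""
    return 4 ** n_nt


def expected_unique(n_clones: int, space: int) -> float:
    """Expected number of distinct sequences among n_clones draws.

    Assumption: each clone is an independent, uniform draw from `space`
    possible sequences, so E[unique] = S * (1 - (1 - 1/S)**N).
    log1p/expm1 keep the result accurate when S is very large.
    """
    return -space * math.expm1(n_clones * math.log1p(-1.0 / space))


if __name__ == "__main__":
    # Random-region lengths at the two ends of the 27-45 nt range
    for n_nt in (27, 45):
        space = sequence_space(n_nt)
        for n_clones in (10**6, 10**8):
            unique = expected_unique(n_clones, space)
            print(f"{n_nt} nt, {n_clones:.0e} clones: "
                  f"space = {space:.2e}, expected unique = {unique:.3e}")
```

Under these assumptions the sequence space of even a 27-nucleotide random region is many orders of magnitude larger than any practical number of clones, so essentially every clone that carries an insert is expected to be unique, and library complexity is limited by the number of insert-bearing transformants rather than by the oligonucleotide pool.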
Here we have considered a different strategy: the creation of a single-stranded backbone vector that is compatible with a single-stranded insert containing the aptamer library. Although such an approach has previously been used primarily for the substitution or incorporation of one or a few nucleotides, we were encouraged that such site-directed mutagenesis has been used to successfully integrate sequences as large as 27 bases, such as the HA1 epitope (15), a size equal to that of many libraries. However, conventional site-directed mutagenesis is an inefficient process that yields the desired product much less than 50% of the time (16), an efficiency too low for generating libraries of sufficient complexity. The more advanced QuikChange Mutagenesis method is still incapable of introducing sequences long enough to generate biologically active peptide libraries. When a 31-nucleotide sequence was introduced, more than 25% of the transformants failed to carry the insert even after substantial optimization (17). Nor does this procedure improve transformation efficiency, which is critical for the production of complex libraries.
The technique presented here uses a library oligonucleotide that hybridizes to the single-stranded vector and primes a polymerase reaction that uses the vector strand as template. The newly synthesized library strand is covalently closed, creating double-stranded DNA (dsDNA), and purified from template materials. Modifications to the technique ensure that nearly 100% of the resulting vectors can contain inserts. We demonstrate that the procedure is sufficiently efficient to generate libraries with a complexity of at least 1 × 10⁶. With optimization and increases in scale it should be possible to make libraries of 1 × 10⁸. The approach should simplify the creation of high-complexity oligomer-based libraries in a number of experimental settings.