Unlike the conventional cloning approach that relies on site-specific digestion and ligation, homologous recombination aligns complementary sequences and enables the exchange between homologous fragments. Homologous recombination is much more efficient in yeast than in bacteria or higher eukaryotes. Therefore, it has been exploited as one of the most important tools in gene cloning, site-specific mutagenesis, plasmid construction and target gene disruption and deletion (13). In this report, we further extended its application to assemble biochemical pathways in a single step.
The novelty of the DNA assembler method lies in the following aspects: (a) it is the first demonstration of yeast in vivo homologous recombination being used to assemble multiple-gene biochemical pathways in a single step, which distinguishes it from routine single-gene plasmid construction; (b) compared with the two related methods, the SLIC method and the domino method, DNA assembler is the most rapid and efficient approach to construct large recombinant DNA molecules; (c) unlike the SLIC method and the domino method, DNA assembler can be used to construct custom-designed DNA molecules not only on a plasmid but also on a chromosome. The latter enables the exogenous DNA to be stably maintained in the cell and offers the potential to assemble a very large biochemical pathway or even a whole genome.
Like E. coli, S. cerevisiae has been widely used as a platform organism for heterologous expression of biochemical pathways in pathway engineering and metabolic engineering because of its well-characterized physiology and genetics and its ample genetic tools (1–5). In contrast to the conventional techniques for pathway cloning, DNA assembler avoids repeated cycles of multiple-step cloning, does not rely on restriction digestion and in vitro ligation, and takes only 1–2 weeks to assemble multiple DNA fragments in S. cerevisiae, either on a plasmid or on a chromosome. It simply takes advantage of the pre-existing homologous recombination machinery in yeast and thus does not require any exogenous recombinase.
In addition, unlike the recently reported in vitro SLIC recombination method, which requires T4 DNA polymerase and RecA recombinase to treat the DNA substrates before transformation (11), DNA assembler requires only simple DNA preparation via PCR and one-step yeast transformation, yet yields a much higher assembly efficiency than the SLIC method. As shown in our report, for a shorter pathway (3–5 genes, ~10 kb), a very high efficiency of 80–100% was obtained with a ~50-bp overlap, while for a longer pathway (8 genes, ~19 kb), a relatively high efficiency of 40–70% was achieved with a slightly longer overlap (~125–430 bp). In addition, for a long pathway, increasing the ratio between the inserts and the backbone could further improve the assembly efficiency. For example, the eight-gene pathway was reassembled with a ~50-bp overlap. The amount of the inserts was doubled while the amount of the linearized vector was maintained (i.e. 600 ng of each insert was combined with 500 ng of the linearized vector). As a result, more colonies were observed on the SC-Ura plates, and an assembly efficiency as high as 70% was obtained, compared with an efficiency of 20% when a lower amount of the inserts was used. This indicates that DNA assembler can efficiently assemble a large DNA molecule even with a short overlap when the ratio between the inserts and the vector backbone is adjusted.
It is possible that deletions may be caused by a repeat or repeats in the DNA fragments. To address this concern, we examined the existence of repeats in pRS426m-xylose-zeaxanthin. We found that short repeats are very common in such a 24.6-kb DNA molecule. As shown in Supplementary Table 3, 7 sequences with a length of 14 bp appeared twice, and these were widely distributed among promoters, terminators and structural genes. In addition, 29 sequences (13 bp), 76 sequences (12 bp), 197 sequences (11 bp) and 527 sequences (10 bp) were found to occur at least twice in the same molecule. These abundant repeats of up to 14 bp did not appear to be problematic in our experiment, considering that assembly efficiencies of 40–70%, 50–60% and 10–20% were obtained with 270–430-bp, ~125-bp and ~50-bp overlaps, respectively. Previous homologous recombination-based gene-cloning experiments by other groups indicated the length of overlap required for efficient recombination between two DNA fragments: 40-bp overlaps yielded an efficiency of greater than 90% (18). The efficiency was slightly reduced with 30-bp overlaps (~80%) and dropped rapidly to 3.4% when 20-bp overlaps were used (40). Similarly, in terms of deletions, Manivasakam and coworkers (22) studied the length of a repeat required for targeted deletion. Repeats of 45 bp, 30 bp and 25 bp resulted in efficiencies of 84%, 54% and 4%, respectively (22). In addition, Langle-Rouault and Jacobs reported that selection of marker excision failed using 25-bp direct repeats (21). All these studies suggest that the efficiency of recombination-based deletions is fairly low when repeats shorter than 25 bp are involved. It should also be noted that in all these experiments, a selectable marker was used to specifically select the deletion event, and the efficiency was calculated as the percentage of correct deletions among all the transformants. Therefore, short repeats (<25 bp) in a long DNA molecule should not cause much trouble, especially because such a deletion event is not specifically selected during assembly, which makes its chance even lower. When the repeats become longer (>25 bp, but not too long), recombination between the repeats will likely compete with recombination between the overlaps at the ends of each fragment during assembly. In this case, a longer overlap (>100 bp) will favor recombination between the ends. Deletion between repeats could still happen, but its chance will be several orders of magnitude lower than the efficiencies cited above, considering that no selection is applied to such a deletion event. Of special note, we recently developed an efficient method for gene deletions in yeast, and we studied the length of the repeats required to yield decent deletion efficiencies (41). Our results showed that 25-bp repeats are not sufficient to generate deletions. It should be noted that when the repeats become very long (>100 bp), the deletion event may occur with a higher efficiency. However, such long repeats are not common in a typical biochemical pathway. If long repeats exist in structural genes, silent mutations may be incorporated during the fragment preparation process in order to reduce the length of the repeated regions.
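The repeat survey described above (counting sequences of a given length that occur at least twice in the construct) can be sketched as a simple k-mer count. This is a minimal illustration under stated assumptions, not the procedure used to produce Supplementary Table 3: the function name `repeated_kmers` and the toy sequence are hypothetical, only direct repeats on one strand are counted, and the circular junction of a plasmid is ignored.

```python
from collections import Counter


def repeated_kmers(seq: str, k: int) -> dict:
    """Return each k-mer occurring at least twice in seq, mapped to its count.

    Assumptions: one strand only (no reverse complements) and a linear
    sequence (no wrap-around across a plasmid junction).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = seq.upper()
    # Slide a window of width k over the sequence and tally every k-mer.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n >= 2}


# Toy sequence for illustration only (not taken from the construct).
toy = "ACGTACGT"
print(repeated_kmers(toy, 4))  # {'ACGT': 2}
print(repeated_kmers(toy, 5))  # {}
```

Applied to a full construct sequence, calling the function for k = 10 to 14 would give counts comparable in spirit to those quoted above, although the exact numbers depend on how overlapping and nested repeats are tallied.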
In our studies, S. cerevisiae YSG50 was used as a host for DNA assembly and integration. This strain carries a ura3-1 point mutation in its genome, which may be converted to wild type by the ura3 gene in the pRS426m vector (or in the helper fragment). To address this concern, we also tested strain HZ848, which has a complete deletion of ura3 (HZ848 was obtained by deleting the ura3 gene in YSG50). The same amount of DNA mixture (eight DNA fragments plus the digested vector) was transformed into YSG50 and HZ848, and in parallel, the same amount of the linearized vector was transformed into both strains as controls.
Similar numbers of colonies were obtained in the two samples (42 for YSG50 versus 38 for HZ848), whereas only two colonies appeared in each control. That the same number of colonies appeared in both controls suggested that the vector had not been completely digested. Although it is possible that the chromosomal ura3-1 gene may be converted to wild type, this should occur at a very low frequency (<10⁻⁶). In comparison, the frequency of DNA assembly was estimated to be 5.7 × 10⁻⁶ (~40 colonies appearing on the SC-Ura plates from 7 × 10⁶ cells), which is relatively high. Ten clones each from the YSG50 transformants and the HZ848 transformants were selected for further analysis. Five of 10 YSG50 transformants and 4 of 10 HZ848 transformants were correct, indicating that for a biochemical pathway consisting of up to eight genes, there is no advantage in using a strain with a complete deletion of ura3. However, when the assembly efficiency drops to a level similar to the reversion rate of ura3-1 back to wild type, using a strain with ura3 completely deleted might become advantageous. Such a scenario may occur when a much longer biochemical pathway is assembled.
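The frequency comparison above is simple arithmetic and can be reproduced as follows. This is a sketch for checking the numbers only; the function name `event_frequency` is hypothetical, and the reversion value is the upper bound quoted in the text rather than a measured rate.

```python
def event_frequency(colonies: int, cells_plated: float) -> float:
    """Colonies recovered per cell plated."""
    return colonies / cells_plated


# ~40 colonies from 7 x 10^6 cells, as stated in the text.
assembly_freq = event_frequency(40, 7e6)

# Upper bound quoted in the text for conversion of ura3-1 to wild type.
reversion_bound = 1e-6

print(f"assembly frequency: {assembly_freq:.1e}")            # 5.7e-06
print(f"fold over bound:    {assembly_freq / reversion_bound:.1f}")  # 5.7

# Fraction of correct clones among those analysed (counts from the text).
for strain, (correct, tested) in {"YSG50": (5, 10), "HZ848": (4, 10)}.items():
    print(f"{strain}: {correct / tested:.0%} correct")
```

The margin between the two frequencies is what would shrink for a much longer pathway, which is the case in which a full ura3 deletion strain might become worthwhile.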
It should be noted that DNA assembler has the potential to assemble much longer biochemical pathways in S. cerevisiae. Conceptually, a long biochemical pathway can be split into several segments (8–10 genes each) that are sequentially integrated into the yeast chromosome by using a recyclable selection marker such as Ura. Once a segment is integrated, 5-fluoroorotic acid can be used to remove the integrated Ura selection marker from the chromosome (42), so that another segment can be integrated subsequently. By recycling the selection marker, the horizontal transfer of any biochemical pathway from its native host to S. cerevisiae will no longer be limited by the size of the pathway. For example, a biochemical pathway composed of 30–50 genes (around 100–200 kb) could be integrated into the chromosome of S. cerevisiae within several weeks. Of course, when a biochemical pathway is very long, PCR-introduced mutations may become a concern even though high-fidelity DNA polymerases are used. Note that this concern is also common to other reported DNA assembly methods. Using high-fidelity polymerases such as the Phusion polymerase (with a very low error rate, 4.4 × 10⁻⁷) should greatly reduce the mutation frequency. Fortunately, with the development of powerful new DNA sequencing technologies, validation of a correctly assembled long biochemical pathway by DNA sequencing is no longer a burden in terms of time and cost. For example, the recently developed 454 sequencing technology (Roche, Branford, CT) supports the sequencing of samples from a wide variety of starting materials including genomic DNA, PCR products, BACs and cDNA (http://www.454.com/). It costs less than $15K to sequence the entire genome of a microorganism of typical size (~6 Mb) within a few days.
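To give a feel for why PCR-introduced mutations matter more as pathways grow, the following sketch applies a common first-order estimate: expected errors per molecule are roughly length × error rate × number of template doublings. Several things here are assumptions rather than statements from the text: the error rate is taken to be per base per doubling, the figure of 20 doublings is an arbitrary illustrative value, and the function names are hypothetical. The error-free fraction uses a Poisson approximation.

```python
import math

# Error rate quoted in the text; units assumed to be per base per doubling.
PHUSION_ERROR_RATE = 4.4e-7


def expected_errors(length_bp: int, doublings: int,
                    error_rate: float = PHUSION_ERROR_RATE) -> float:
    """First-order estimate of the mean number of PCR errors per molecule."""
    return length_bp * doublings * error_rate


def error_free_fraction(length_bp: int, doublings: int,
                        error_rate: float = PHUSION_ERROR_RATE) -> float:
    """Poisson estimate of the fraction of molecules carrying no error."""
    return math.exp(-expected_errors(length_bp, doublings, error_rate))


# Pathway sizes mentioned in the text; 20 doublings is an assumed value.
for length_bp in (19_000, 100_000, 200_000):
    mean = expected_errors(length_bp, doublings=20)
    clean = error_free_fraction(length_bp, doublings=20)
    print(f"{length_bp:>7} bp: {mean:.2f} expected errors, "
          f"{clean:.0%} error-free")
```

Because the estimate scales linearly with total amplified length, sequencing validation becomes correspondingly more important for pathways in the 100–200 kb range.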
In principle, this method may even be used to assemble a DNA molecule as large as an entire chromosome or genome. As a matter of fact, the success of Gibson and coworkers in constructing the entire M. genitalium genome from 101 fragments (29) indicates that it is possible to assemble a DNA molecule as large as a genome from short DNA fragments using DNA assembler. In addition to assembling biochemical pathways and genomes, DNA assembler has many other applications, such as library creation in combinatorial biology and construction of complicated DNA molecules in the field of synthetic biology.