As the field of synthetic biology expands, strategies and tools for the rapid construction of new biochemical pathways will become increasingly valuable. Purely rational design of complex biological pathways is inherently limited by the current state of our knowledge. Selection of optimal arrangements of genetic elements from randomized libraries may well be a useful approach for successful engineering. Here, we propose the construction and optimization of metabolic pathways using the inherent gene shuffling activity of a natural bacterial site-specific recombination system, the integron. As a proof of principle, we constructed and optimized a functional tryptophan biosynthetic operon in Escherichia coli. The trpA-E genes along with ‘regulatory’ elements were delivered as individual recombination cassettes in a synthetic integron platform. Integrase-mediated recombination generated thousands of genetic combinations overnight. We were able to isolate a large number of arrangements displaying varying fitness and tryptophan production capacities. Several assemblages required as many as six recombination events and produced as much as 11-fold more tryptophan than the natural gene order in the same context.
Synthetic biology aims to engineer useful novel functions in existing organisms (1–3). Recent efforts have shown how one can couple mathematical modelling with the precise characterization of libraries of genetic elements to rapidly construct synthetic networks with predictable functions (4). However, unpredictable interactions will probably remain a significant challenge to genetic engineering for some time to come. One way to circumvent this problem might be to admit the limits of our predictive abilities and to test large numbers of random designs. This approach has already proved successful for the directed evolution of proteins. Indeed, accurate prediction of protein folding and function from primary sequence remains a challenge (5). Nevertheless, methods such as gene shuffling have enabled the improvement of protein stability and performance as well as changes in their reaction and substrate specificity (6). These methods have mainly focused on mimicking and enhancing natural recombination to promote diversity. In contrast to rational design, directed evolution does not require a comprehensive knowledge of the system being implemented (7). We believe the same principles can be applied to the engineering of larger genetic systems and metabolic pathways. Combinatorial approaches have recently been used for promoter engineering (8), or for the modification of intergenic regions in synthetic metabolic pathways with the purpose of improving the balance of gene expression (9). Along these lines, a new method called MAGE allows the combinatorial mutagenesis of a few base pairs at multiple genomic loci to improve endogenous metabolic pathways (10). In this work, we used the unique recombination properties of the integron machinery to enable the rapid and large-scale generation of combinations in vivo.
Integrons were first discovered owing to their involvement in multiple antibiotic resistance phenotypes. They were later identified in the genomes of ~10% of sequenced bacteria (11) and can represent a significant proportion of these genomes (e.g. 3% in Vibrio cholerae; 12). Integrons are composed of a tyrosine recombinase (the integrase, IntI), a primary recombination site (attI), and an array of gene cassettes (13). Cassettes generally consist of promoterless ORFs flanked by attC recombination sites. In some Vibrionaceae, there can be more than 200 such cassettes. Upon stress (14), the integrase is expressed and can recombine attC sites, leading to the excision of circular cassettes that can be further integrated at the attI site. These recombination events lead to deletions and rearrangements in the cassette array as well as the capture of new cassettes by lateral gene transfer (15). They represent extremely powerful evolutionary devices, whose success is exemplified by their ubiquitous spread associated with multiple antibiotic resistances. Here, we designed three different setups based on the integron recombination parts to test the recombining power of this genetic system. We chose to assay the assembly of a known anabolic pathway, tryptophan biosynthesis, involving the consecutive action of multiple enzymes, and selected clones with increased production of the final metabolite.
Escherichia coli strains were grown in Luria Bertani broth (LB) or 63B1 minimal medium with glucose 0.4% (MM63B1) at 37°C. Antibiotics were used at the following concentrations: ampicillin (Ap), 100 µg/ml, chloramphenicol (Cm), 25 µg/ml. Diaminopimelic acid (DAP) was supplemented when necessary to a final concentration of 0.3 mM and tryptophan to a final concentration of 50 µg/ml. Chemicals were obtained from Sigma-Aldrich (France).
pSW plasmids were constructed as follows. First, the attP recombination site of the lambda phage was PCR amplified and cloned at the SacI site of the previously published pSW23T (16). The multiple cloning site was then replaced with the BioBrick standard restriction sites through a PCR with oligos o936 and o937 (see Supplementary Table S4 for the list of oligos). The tryptophan operon genes were PCR amplified from the genome of E. coli MG1655 with a first set of primers (trp(A-E)-F and trp(A-E)-R) framing them with common 3′- and 5′-ends. A second set of primers (pos(1-6)-F and pos(1-6)-R) was then used to add the attCaadA7 site on the 3′-end and either BioBrick standard restriction sites or BglI restriction sites, allowing the directional assembly of the cassette array in successive steps leading to pSWlib, pSW-BA and pSW-CED (see Supplementary Table S5 for the description of plasmids). The BglI and BioBrick restriction sites present in the trp genes were deleted through site-directed mutagenesis with primers o1002–o1007. The attI1 site was entered into the registry of standard biological parts as BBa_J99002 and attCaadA7 as BBa_J99001. BBa_J23100 is a constitutive promoter and BBa_B0015 a transcriptional terminator (BioBrick details can be found at partsregistry.org). The cat [CmR] gene of pSWlib was replaced by the aadA7 [SpecR] gene to give pSWKspec as described by Demarre et al. (16). J23100 and attI1 were cloned in pSWKspec through BioBrick standard assembly, giving p7421.
The tryptophan operon was deleted in the TG1 strain with the method described by Chaveroche et al. (17), giving the strain TG1Δtrp::km. The deletion of the tryptophan operon was then P1 transduced into MG1655recA::Tn10 [RecA was supplied from the plasmid pCY579 (18)] and the kanamycin resistance was subsequently excised through FRT recombination mediated by the pCP20 plasmid, resulting in strain ω7814 (see Supplementary Table S6 for the description of strains).
pSWlib and p7421 were integrated into the attB site of the ω7814 chromosome through lambda recombination mediated by plasmid pTSA29-CXI (19) to give ω7830 and ω7902, respectively.
Overnight cultures of ω7842 were diluted 1/100 in LB medium with arabinose 0.2% and grown overnight. These were then plated on MM63B1 and LB agar. Prototroph frequencies were established as the ratio of the number of clones on MM63B1 over LB. A total of 788 clones were screened for cassette integration at the attI site. Twenty-nine positive clones and 30 negative clones were further analysed to determine their precise gene order.
Overnight cultures of ω7661-int were diluted 1/100 in LB medium with arabinose 0.2% and grown overnight. Plasmid extractions were performed with Macherey–Nagel NucleoPlasmid kits and the extracted plasmids were transformed into electro-competent ω7814 strains. Transformants were plated on MM63B1 and LB + Cm. Prototrophs were analysed through series of PCRs.
Overnight cultures of the donor strain(s) (ω7893 or ω8066 + ω8067) and recipient strain (ω7902-int) were diluted 1/100 and grown in LB + DAP (300 µg/ml) and LB + arabinose 0.2%, respectively, to OD600 = 0.6. Two millilitres of the donor cells were mixed with 4 ml of the recipient cells and filtered onto 0.22 µm filters, which were incubated overnight on LB + DAP Petri dishes. The bacteria on the filter were then resuspended in LB and plated on MM63B1 or LB. Prototroph frequencies were established as the ratio of the number of clones on MM63B1 over LB. The cassette orders in the recovered prototrophs were established through series of PCRs.
We measured tryptophan production biologically, relying on a co-culture of a tryptophan-producing strain and an auxotrophic reporter strain. The growth of the reporter strain is indicative of the presence of tryptophan in the medium and thus of the tryptophan production by the other strain. Overnight cultures of the prototroph strains obtained from the recombination assays were diluted 1/100 in MM63B1 together with ω8072 in 96-well Corning culture plates (1/1 ratio). Fluorescence (em:560 nm, ex:600 nm) and OD600 were measured in a Tecan infinite200 plate reader. Tryptophan production was assessed as the difference in fluorescence between the sample and the control well (ω8072 only) divided by the sample OD600. Growth rates were measured in MM63B1 and MM63B1 + tryptophan in a Tecan infinite200 plate reader.
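The production score described above can be written as a short function. This is only a minimal sketch of the stated formula (sample fluorescence minus the fluorescence of the control well containing only the reporter strain, divided by the sample OD600). The function name, argument names and the numbers in the usage lines are hypothetical and are not taken from the experiments.

```python
def tryptophan_production(sample_fluorescence, control_fluorescence, sample_od600):
    """Background-corrected reporter fluorescence normalised by the sample OD600.

    Hypothetical helper: the argument names are illustrative, not from the article.
    """
    if sample_od600 <= 0:
        # An OD600 of zero or below cannot be used as a normalisation factor.
        raise ValueError("sample OD600 must be positive")
    return (sample_fluorescence - control_fluorescence) / sample_od600


# Illustrative values only (arbitrary fluorescence units), not measured data.
score = tryptophan_production(sample_fluorescence=1500.0,
                              control_fluorescence=300.0,
                              sample_od600=0.5)
print(score)
```

Dividing by the OD600 of the well is a simple way to avoid scoring a dense culture as a better producer merely because it contains more cells.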
Tryptophan biosynthesis involves seven reactions from chorismate as a starting point (20). In E. coli, five proteins are involved in this pathway (TrpA-E), with TrpE and TrpC each carrying two catalytic domains. In a first assay, the genes for each of these proteins were ‘packaged’ into identically designed cassettes in which the gene of interest was preceded by the same ribosome binding site (BBa_B0030) and followed by a well described 64 bp attC recombination site [attCaadA7 (21)]. These artificial cassettes were assembled downstream of an attI primary recombination site in an arbitrary order along with four ‘regulatory’ cassettes to form the starting library plasmid pSWlib. Two of the regulatory cassettes carried strong transcriptional terminators and were primarily intended to prevent the initial construct from producing tryptophan. One cassette carried the reporter gene lacZα and the last one a constitutive promoter (see Figure 1 for the detailed arrangement). The entire pSWlib plasmid was integrated into the chromosome of ω7814, an E. coli strain that is deleted for the whole tryptophan operon (see ‘Materials and Methods’ section). As expected, the resulting strain remained auxotrophic for tryptophan. To induce rearrangements that might lead to expression of the tryptophan biosynthetic pathway, the integrase IntI1 gene was expressed from an arabinose-inducible PBAD promoter located on a plasmid (pBAD-IntI1). After an overnight culture with arabinose, tryptophan prototrophs were recovered at a frequency of 3.4 × 10⁻³.
Rearrangements were then analysed through series of PCR reactions giving the exact integron gene order in the prototrophic strains. It appeared from these data that most (28/30) rearrangements only involved deletion events removing both transcriptional terminators. We thus screened a larger number of clones for cassette integration at the attI site and found that 3.7% (29/788) showed more extensive rearrangements. Their precise gene orders were determined and can all be explained by one or several attC × attC excisions followed by attC × attI integrations, as well as cassette duplications in some cases (Table 1). The frequency of such events (hereafter referred to as reordering events) in the unselected population is thus on the order of 10⁻⁴. Since a single reordering event only allows a small number of different gene arrangements to be attained, we also attempted to assess the frequency of multiple reordering and duplication events. These numbers are hard to determine precisely since they would require a very large number of PCR screens. Among the recovered prototrophs, 2.4% (19/788) displayed single reordering events, 0.5% (4/788) double reordering events and 0.8% (6/788) events involving the duplication of at least one cassette. We can thus estimate the order of magnitude for the frequency of double reordering events in the unselected population to be 10⁻⁵. It is noteworthy that this frequency is higher than the product of the frequencies of two single reordering events, suggesting that once a recombination event has occurred, a second one is more likely to happen. We can also note that none of the genotyped arrangements placed the promoter cassette upstream of the trp genes. Their expression is likely driven by the pL1 promoter of the oriRP4 region (22) upstream of the attI.
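The orders of magnitude quoted above can be checked with a few lines of arithmetic. The sketch below rests on one assumption about the reasoning, which the text does not spell out: the frequency of an event class in the unselected population is taken as the prototroph frequency (3.4 × 10⁻³) multiplied by the fraction of the 788 screened prototrophs showing that class. The function and variable names are illustrative.

```python
import math

PROTOTROPH_FREQUENCY = 3.4e-3   # prototrophs per plated cell, from the text
SCREENED_PROTOTROPHS = 788      # prototroph clones screened by PCR, from the text


def unselected_frequency(n_clones,
                         n_screened=SCREENED_PROTOTROPHS,
                         prototroph_frequency=PROTOTROPH_FREQUENCY):
    """Assumed estimate: prototroph frequency times the fraction of screened clones."""
    return prototroph_frequency * n_clones / n_screened


def order_of_magnitude(value):
    """Exponent of the power of ten at or just below a positive value."""
    return math.floor(math.log10(value))


any_reordering = unselected_frequency(29)     # all clones with extensive rearrangements
single_reordering = unselected_frequency(19)  # single reordering events
double_reordering = unselected_frequency(4)   # double reordering events

# Frequency expected for two reordering events if they occurred independently.
independent_double = single_reordering ** 2

print(f"any reordering:     {any_reordering:.2e}")
print(f"single reordering:  {single_reordering:.2e}")
print(f"double reordering:  {double_reordering:.2e}")
print(f"independent double: {independent_double:.2e}")
```

Under this assumption the double reordering estimate lies several orders of magnitude above the squared single reordering estimate, which is the comparison behind the remark that a second event is more likely once a first one has occurred.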
Since single reordering events are more frequent than multiple ones, the types and number of gene combinations actually obtained in a culture are highly dependent on the original gene order. In order to determine the number of unique combinations that retain the five trp genes and that can be obtained by a defined number of reordering and deletion events, we numerically generated all combinations and counted the unique ones. Single reordering events of the nine initial cassettes can yield 36 unique combinations, and 342 if we include the deletion of non-essential cassettes. Double reordering events can yield 904 unique combinations, and 5022 if we include deletions. We want to assess the expected number of unique combinations that we can obtain in a culture where C cells each randomly realize one of the N possible combinations. If we make the approximation that all combinations have the same probability, this problem is known as the ‘coupon collector’s problem’, which asks how many sample trials (C), with replacement, are required to collect a complete set of N coupons. We numerically simulated this experiment 10 000 times and computed the estimators of the expectation and of the standard deviation. In a 1 ml overnight culture of 10⁹ cells, we should have an average of C = 10⁴ cells realising one of the N = 5022 possible double reordering events. The expectation is to have 4337 (±20) unique combinations in 1 ml of overnight culture. Nevertheless, it seems that some rearrangements are more probable than others. For instance, the excision of the trpB cassette and its integration at the attI recombination site occurred in 6 of the 19 single reordering events isolated experimentally.
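Both counts discussed above can be explored with a short script. This is a sketch under explicit assumptions, not the code used for the article. First, a single reordering event is modelled as the excision of one contiguous run of cassettes followed by its reintegration, in the same orientation, at the first position next to attI; this simple model gives 36 arrangements for nine cassettes, but the article does not state that it is the model that was used. Second, the cassette labels c1 to c9 are placeholders, since the actual starting order is given only in Figure 1. Third, the expected number of unique combinations is computed with the standard occupancy formula N(1 − (1 − 1/N)^C) under the equal-probability approximation, alongside a small seeded simulation.

```python
import math
import random


def single_reorderings(order):
    """Arrangements reachable by moving one contiguous block to the first position.

    Assumed model of a single reordering event; the input order itself is excluded.
    """
    order = tuple(order)
    arrangements = set()
    for start in range(len(order)):
        for end in range(start, len(order)):
            block = order[start:end + 1]
            remainder = order[:start] + order[end + 1:]
            candidate = block + remainder
            if candidate != order:
                arrangements.add(candidate)
    return arrangements


def expected_unique(n_combinations, n_cells):
    """Expected number of distinct combinations among n_cells equiprobable draws."""
    p_missed = (1 - 1 / n_combinations) ** n_cells
    return n_combinations * (1 - p_missed)


def std_unique(n_combinations, n_cells):
    """Standard deviation of the number of distinct combinations (exact occupancy variance)."""
    n = n_combinations
    q1 = (1 - 1 / n) ** n_cells   # probability that a given combination is never drawn
    q2 = (1 - 2 / n) ** n_cells   # probability that two given combinations are never drawn
    variance = n * q1 + n * (n - 1) * q2 - (n * q1) ** 2
    return math.sqrt(variance)


def simulate_unique(n_combinations, n_cells, trials, seed=0):
    """Monte Carlo estimate of the mean number of distinct combinations."""
    rng = random.Random(seed)
    counts = [len({rng.randrange(n_combinations) for _ in range(n_cells)})
              for _ in range(trials)]
    return sum(counts) / trials


cassettes = [f"c{i}" for i in range(1, 10)]   # placeholder labels, hypothetical order
print(len(single_reorderings(cassettes)))
print(expected_unique(5022, 10_000), std_unique(5022, 10_000))
print(simulate_unique(5022, 10_000, trials=30))
```

With N = 5022 and C = 10⁴, the closed-form values come out close to the reported 4337 (±20), so the simulation serves mainly as a cross-check. The trial count is kept small here to keep the run short; the article reports 10 000 repetitions.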
In order to assess the functional significance of the selected genotypes, we then measured the tryptophan production and generation time for 10 arrangements. The sequence of the trpA-E genes in the parental strain was verified. Since the frequency at which we recovered prototroph strains is much higher than the point mutation frequency of E. coli, we can assume that the phenotype of the different arrangements is explained by cassette order only. We also constructed and tested the combination mimicking the natural gene order. The measured tryptophan productions ranged from 4-fold less to 2.8-fold more than the original strain carrying the natural trp operon (MG1655recA; Figure 2). Hence, both gene order and gene copy number in the operon have a drastic effect on tryptophan production. Interestingly, the wild-type order (EDCBA) has one of the lowest production levels. We also observed that growth rate could be affected by the cassette arrangement, with up to a 40% decrease in growth rate compared to the parental MG1655recA strain, without correlation to the tryptophan production level (Figure 2).
Overall, these results demonstrate that a synthetic integron can efficiently be used to generate functional gene combinations from a library of independent candidate gene cassettes. In order to improve the flexibility of the system presented above, we devised and tested two alternative methods that allow one to carry out the rearrangement and selection steps in different genetic backgrounds and enhance the delivery of cassettes. Having the synthetic integron on the chromosome presents some disadvantages in cases where one wants to generate combinations in one strain and select for good solutions in another genetic background. To address this concern, we assessed the potential of a synthetic integron carried by a plasmid (Figure 3B). The original cassette array was cloned into a low copy number plasmid [BBa_pSB4C5 (23)] and recombination was induced during an overnight culture. Plasmid DNA was recovered and transformed into a tryptophan auxotroph, and transformants were plated on minimal medium to select for plasmids carrying functional trp operons. The proportion of prototrophs among the transformants was 2.2 × 10⁻⁴. Subsequent analysis revealed that most of them contained multiple plasmids each carrying different genes of the tryptophan pathway. Nevertheless, out of the 96 colonies screened, we were able to identify six clones carrying all the genes in a single plasmid. They were all in different combinations, and three of them carried duplications of one gene or more (see Supplementary Table S1).
Because the capacity to test a large number of candidate genes, and the ease of including new genes of interest within an existing scaffold, are of prime importance, we considered the possibility of delivering integron cassettes through conjugation. This procedure presents the possibility of building large arrays of cassettes in independent cloning strains and delivering them for chromosomal integration and gene shuffling in a recipient bacterium. In a first assay, we used the pSWlib suicide plasmid. pSWlib was delivered through conjugation into a tryptophan auxotroph carrying an attI site on the chromosome and expressing the integrase from the pBAD-IntI1 plasmid. The transconjugants were selected for growth on minimal medium. Being unable to replicate in the recipient cell, pSWlib can only be maintained if it integrates at the attI site. Alternatively, the plasmid can be lost by recombination of excised trp cassettes into the recipient chromosome. The frequency of recipient cells becoming prototrophs was 2.3 × 10⁻⁵ (see Supplementary Table S2 for details on the recovered arrangements). Delivering the cassettes through conjugation also offers the possibility of delivering different cassette arrays at the same time, increasing the combinatorial power of the assay. In a second assay, the five trp genes were thus split between two donor plasmids, pSW-BA and pSW-CED. Two donor strains each carrying a different plasmid were used in a conjugation assay with the recipient cell described above (Figure 3C). Prototrophs were recovered at a frequency of 5 × 10⁻⁶. We determined the gene combinations obtained in eight prototroph colonies through series of PCRs. All arrangements were different from each other. The pSW-BA and pSW-CED plasmids were integrated in the chromosome of the recipient strain either through attI × attI recombination or attI × attC recombination. Further recombination events or deletion of the suicide pSW vector were also identified (see Supplementary Table S3).
We demonstrated here, for the first time, the ability to efficiently generate large numbers of genetic combinations and arrangements in vivo using site-specific recombination. The functional arrangements we isolated in the chromosomal assay required from two to six recombination events, leading to cassette loss, reordering events and duplications. The generated operons varied both in growth rate and tryptophan production capacities in an uncorrelated manner. The effect on strain fitness is presumably due to an imbalance in the expression of the operon enzymes, leading to toxic effects and non-optimal allocation of resources. It is for instance known that indole (the product of TrpA) has oxidative toxic effects (24). In addition, an excessive strain on the chorismate pool could deplete the biosynthetic pathways of amino acids and metabolites such as tyrosine, phenylalanine, ubiquinone and tetrahydrofolate. Similar effects are very likely to occur in any attempt to implement a synthetic metabolic pathway, but are mostly unpredictable, even in the present case where we used E. coli genes in E. coli and even though the tryptophan operon is one of the best studied biosynthesis pathways. This new method should thus find applications in metabolic engineering where rational decisions about candidate genes in a pathway, gene order and gene regulation are hard to make.
Among the setups we tested, we show that conjugation is a powerful way to deliver cassettes in a chromosomal platform of a recipient strain. Simultaneous conjugation from several donor strains could be an extremely practical way to combine ready-made elements. One could consider the possibility of having plasmid libraries of various biobricks (regulatory elements, genes, etc.) that could be easily reused for the combinatorial synthesis of new systems and pathways. Other obvious areas of applications could include the shuffling and recombination of protein domains, the study of larger chromosomal rearrangements and the random design of regulatory networks. One could indeed easily construct libraries of promoters and transcription factor cassettes that could be randomly combined using our method. A synthetic integron could also simply be used as a ‘landing platform’ for consecutive and targeted integrations of genetic elements in a host of interest using conjugation.
The question of the production of combinations in large numbers is central in both biotechnology and synthetic biology. The recently described MAGE method (10) illustrates how important this challenge is. Although both our synthetic integron and MAGE allow the generation of thousands of combinations, they have very different outputs. Whereas the synthetic integron can generate random arrangements of large exogenous genetic elements, the MAGE method permits the targeting of mutations of a few base pairs at several genomic loci simultaneously. Both methods could advantageously be combined to manipulate at the same time the arrangement of genetic elements and their sequence. The only constraint of these approaches is the availability of a screen powerful enough to discriminate good solutions in a large population. This problem is being tackled by the recent developments of ultrahigh-throughput screening technologies, notably using microfluidics (25,26).
For some applications, one could also see the necessity of packaging genetic elements of interest in integron cassettes as a limitation. However, attC recombination sites are remarkably flexible in sequence. It has indeed been shown that they recombine as folded single stranded DNA and that recombination is mostly driven by structural features of the stem-loop and not by primary sequence (27–29). This opens the possibility of creating attC sites ‘à la carte’, and to use them as protein linkers, for instance.
For the moment, the most time-consuming step in the use of this new approach is probably the assembly of large integron cassette arrays. Nevertheless, new methods in DNA assembly, such as SLIC (30) or DNA-assembler (31), promise to overcome this hurdle. Such developments, together with the exponential progress in DNA synthesis, are empowering synthetic biology in unprecedented ways, changing the speed at which we will increase our knowledge of living systems and our ability to manipulate them.
Supplementary Data are available at NAR Online.
Institut Pasteur; Centre National de la Recherche Scientifique; European Union (NoE EuroPathoGenomics; LSHB-CT-2005-512061); a PhD fellowship from the University Paris Diderot FdV Bettencourt PhD program. Funding for open access charge: Institut Pasteur.
Conflict of interest statement. None declared.
We are grateful to A. Lindner, C. Loot and S. Colloms for careful reading and revising of the article.