Synthetic genetic sensors and circuits enable programmable control over the timing and conditions of gene expression. They are being increasingly incorporated into the control of complex, multigene pathways and cellular functions. Here, we propose a design strategy to genetically separate the sensing/circuitry functions from the pathway to be controlled. This separation is achieved by having the output of the circuit drive the expression of a polymerase, which then activates the pathway from polymerase-specific promoters. The sensors, circuits and polymerase are encoded together on a ‘controller’ plasmid. Variants of T7 RNA polymerase that reduce toxicity were constructed and used as scaffolds for the construction of four orthogonal polymerases identified via part mining that bind to unique promoter sequences. This set is highly orthogonal and induces cognate promoters by 8- to 75-fold more than off-target promoters. These orthogonal polymerases enable four independent channels linking the outputs of circuits to the control of different cellular functions. As a demonstration, we constructed a controller plasmid that integrates two inducible systems, implements an AND logic operation and toggles between metabolic pathways that change Escherichia coli green (deoxychromoviridans) and red (lycopene). The advantages of this organization are that (i) the regulation of the pathway can be changed simply by introducing a different controller plasmid, (ii) transcription is orthogonal to host machinery and (iii) the pathway genes are not transcribed in the absence of a controller and are thus more easily carried without invoking evolutionary pressure.
Synthetic genetic circuits have been constructed by engineering biochemical interactions to generate signal processing or dynamic functions (e.g. a logic gate or oscillator) (1–4). These circuits can be connected to each other to implement more intricate computational operations (5,6). Further, genetic sensors, which respond to environmental stimuli, can be connected as inputs to the circuits (7,8). The outputs can be used to control actuators that determine what the cell is making or doing. A number of ‘toy systems’ have been constructed to develop the design principles by which multiple sensors, circuits and actuators can be combined into larger programs. Such programs link cell–cell communication for biofilm formation, enable Escherichia coli to form patterns, and control the direction of chemotaxis (9–12). For practical applications, simple circuits have been used to control metabolic pathways to implement feedback or to turn on at a particular growth phase (13–15). Increasingly sophisticated programs containing many sensors and circuits will be required to integrate multiple cellular processes, implement fine spatial and temporal control over gene expression and move process control algorithms into individual cells. As the programs become larger, it will be more challenging to design and test in isolation before moving to control relevant pathways. Here, we present a design strategy that relies on multiple engineered phage polymerases to link the output of genetic programs with the pathways to be controlled, both of which are carried in distinct genetic locations. This modular design allows the program and pathways to be constructed and tested independently and then combined to create a complete operational system.
When introducing synthetic DNA into a cell, it is desirable that the encoded processes be functionally distinct from host processes. This has been articulated as building a ‘virtual machine’ in cells that makes separate transcription/translation machinery and resources available to a synthetic system, thus reducing the load on the host (16). To this end, RNA regulator systems have been engineered to produce orthogonal parts libraries for controlling gene expression (17,18). Similarly, phage polymerases are a means to control orthogonal transcription and are one of the most used tools in genetic engineering. Specifically, T7 RNA polymerase (RNAP) has been shown to function in a variety of hosts, including Gram-negative and -positive bacteria, plant chloroplasts and mammalian cells (19–21). An advantage of T7 polymerase is that its promoters are tightly inactive in the absence of the polymerase, thus reducing the load on the cell when uninduced (20,22). Recent advances in manipulating and minimizing the ribosome and altering the 16S rRNA to bind to alternative Shine-Dalgarno sites are important steps towards the ultimate goal of simultaneous orthogonal translation (23,24).
In this article, we present a design strategy to separate sensing/circuitry functions from pathways/actuation (Figure 1A). They are encoded in genetically distinct regions and linked by having the output of the circuits drive the expression of phage polymerases. This enables the ‘controller’ to be optimized independently using fluorescent proteins and then transferred to the control of a pathway. Bioinformatics and protein engineering were applied to build four orthogonal polymerases by swapping the DNA-binding loop from homologues mined from sequence databases. This yielded orthogonal polymerases that do not cross-react with non-cognate promoters and thus can be used to control four distinct pathways or actuators. Promoter libraries for the polymerases were constructed to enable tuneable control over the expression levels. Further, multiple terminators were constructed to enable multicistron systems to be controlled by a polymerase without problems due to homologous recombination. Together, this approach decouples the design of a pathway from the regulation used to control it.
Escherichia coli strain DH10B was used for routine cloning, plasmid propagation and T7 characterization assays. E. coli strain MG1655 was used for pigment expression. Luria–Bertani (LB)-Miller (Difco) was used for strain propagation and assays. The antibiotics used were 34.4μg·ml−1 chloramphenicol, 100μg·ml−1 spectinomycin, 50μg·ml−1 kanamycin and/or 100μg·ml−1 ampicillin. The inducers used were isopropyl β-d-1-thiogalactopyranoside (Sigma-Aldrich) and/or anhydrotetracycline (Sigma-Aldrich).
RNAP homologues were identified using BLAST (http://blast.ncbi.nlm.nih.gov/) (25). The search was run against the non-redundant protein sequence database, with an expect threshold of 10, word size of 3, the BLOSUM62 matrix, gap costs of 11 for existence and 1 for extension and the conditional compositional score matrix adjustment. Protein sequences were aligned using ClustalW (http://www.ebi.ac.uk/Tools/msa/clustalw2/) (26). Pairwise alignment utilized the Gonnet weight matrix, a gap opening penalty of 100 and an extension penalty of 0.1. The multiple sequence alignment also utilized the Gonnet matrix, a gap opening penalty of 10, an extension penalty of 0.2, gap distances of 5 and no end gaps. Phage promoters were identified by scanning phage genomes with the PHIRE package (http://www.biw.kuleuven.be/logt/PHIRE.htm) (27). Default parameters were used (string length 20, degeneracy 4, dominanNum 4 and window size 20).
Cells harbouring the appropriate plasmids were inoculated into 5ml of LB media (supplemented with antibiotics, 37°C, 250rpm) in 15ml Falcon tubes and grown for 14h. The cultures were diluted into fresh 5ml LB media (supplemented with antibiotics and inducers) to a final OD600 of 0.25 in 15ml Falcon tubes and were incubated for 6h (37°C, 250rpm). To halt the assay, 2μl of cells were transferred from each tube to a 96-well plate containing phosphate-buffered saline supplemented with 2mg·ml−1 kanamycin. Fluorescence data were collected using a BD Biosciences LSRII flow cytometer. GFP measurements utilized a 488nm laser with a 530/30nm filter, and mRFP measurements utilized a 532nm laser with 575/26nm filter. Data were gated by forward and side scatter, and each dataset consisted of at least 25000 cells. FlowJo was used to calculate the geometric means of the fluorescence distributions and perform compensation. The autofluorescence value of E. coli cells harbouring no plasmid was subtracted to generate the values reported in this study.
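The data reduction described above (geometric mean of each gated population, minus the autofluorescence of plasmid-free cells) can be sketched in a few lines. This is only a minimal illustration and not the FlowJo workflow itself; the function names are hypothetical, and the fold-induction helper is an assumption about how ratios between induced and uninduced samples would be formed from the corrected values.

```python
import math


def geometric_mean(events):
    """Geometric mean of strictly positive per-cell fluorescence values."""
    if not events:
        raise ValueError("no events supplied")
    if any(value <= 0 for value in events):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(sum(math.log(value) for value in events) / len(events))


def corrected_fluorescence(sample_events, blank_events):
    """Subtract the autofluorescence of plasmid-free cells from a sample."""
    return geometric_mean(sample_events) - geometric_mean(blank_events)


def fold_induction(induced, uninduced):
    """Ratio of corrected fluorescence with and without inducer (assumed definition)."""
    if uninduced <= 0:
        raise ValueError("uninduced signal must be positive after correction")
    return induced / uninduced
```

The geometric mean is used rather than the arithmetic mean because fluorescence distributions from flow cytometry are typically closer to log-normal, so a few very bright cells do not dominate the summary value.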
To assess RNAP toxicity, we measured the impact of RNAP-mediated expression on cell viability. Cells were co-transformed with an RNAP plasmid and one of the three reporter plasmids. The reporter plasmids carried no insert (N23), a wild-type T7 promoter with lacO-binding site and mRFP that produces moderate gene expression (N155) or a wild-type T7 promoter without lacO-binding site and mRFP producing high levels of gene expression (N489). Cells carrying appropriate plasmids were grown as described for the assay to characterize fluorescence. After dilution to an OD600 of 0.25, cells were further diluted by a factor of 10^6, and 50μl of cells were plated on LB media containing antibiotics and 100μM IPTG. After 14h of incubation at 37°C, colonies that had formed on each plate were counted.
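For readers who want to convert plate counts into cell densities, the arithmetic implied by this protocol (a million-fold dilution of the OD600 0.25 culture, 50μl plated) can be sketched as follows. The function names are hypothetical, and expressing toxicity as a ratio to a reference strain is an assumption made for illustration, not a definition taken from the text.

```python
def cfu_per_ml(colonies, dilution_factor=1e6, plated_volume_ul=50.0):
    """Colony-forming units per ml of the culture before the final dilution."""
    if dilution_factor <= 0 or plated_volume_ul <= 0:
        raise ValueError("dilution factor and plated volume must be positive")
    # 1000 ul per ml converts the plated volume into ml.
    return colonies * dilution_factor * 1000.0 / plated_volume_ul


def relative_viability(colonies_test, colonies_reference):
    """Colony count of a test strain relative to a reference strain.

    Assumes both strains were diluted and plated identically, so the
    dilution factor and plated volume cancel out of the ratio.
    """
    if colonies_reference <= 0:
        raise ValueError("reference plate must have colonies")
    return colonies_test / colonies_reference
```

Under these defaults, each colony on a plate corresponds to 2 × 10^7 viable cells per ml of the OD600 0.25 culture.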
Strains were grown as in the Fluorescence Characterization assay with one exception: instead of incubation in inducing media for 6h, cells were grown for 24h before pigment extraction and quantification. After 24h of induction, cultures were diluted to an OD600 of 1.0, and 1ml of diluted culture was centrifuged at 15000g for 60sec. Supernatant was removed, cells were resuspended in 200μl H2O, and re-centrifuged at 15000g for 60sec. The pigments were then extracted from cell pellets following one of two methods. For lycopene extraction, cells were resuspended in 250μl acetone and incubated at 55°C for 15min with frequent vortexing. For deoxychromoviridans extraction, after removal of supernatant, cells were resuspended in 50μl of 10% sodium dodecyl sulfate and incubated at 55°C for 15min with frequent vortexing. Following this, 250μl of methanol was added to the tubes. The mixture was incubated at 55°C for an additional 15min with frequent vortexing. Pigment extraction mixtures (both lycopene and deoxychromoviridans) were then centrifuged at 15000g for 60sec and the supernatant was transferred to a 96-well plate for quantification. Absorbance spectra from 350 to 700nm were collected using a Tecan Safire plate spectrophotometer. Lycopene and deoxychromoviridans were quantified using measurements at 470 and 650nm, respectively. Data are reported as the percent of max absorbance observed after subtraction of the absorbance of wild-type E. coli MG1655 cell extracts.
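The normalization in the last sentence (subtract the wild-type extract, then scale to the largest corrected value) can be written out as a short sketch. The function name and the dictionary layout are hypothetical; one call would be made per pigment, using absorbances at 470nm for lycopene or 650nm for deoxychromoviridans.

```python
def percent_of_max(absorbance_by_condition, wild_type_absorbance):
    """Background-subtract each reading and scale to the maximum (in percent).

    absorbance_by_condition maps an inducer condition to the absorbance of
    the pigment extract at a single wavelength.
    """
    corrected = {
        condition: value - wild_type_absorbance
        for condition, value in absorbance_by_condition.items()
    }
    peak = max(corrected.values())
    if peak <= 0:
        raise ValueError("no condition exceeds the wild-type background")
    return {condition: 100.0 * value / peak for condition, value in corrected.items()}
```

Note that a fold induction computed from such values depends on the background subtraction, since a corrected uninduced reading near zero inflates the ratio.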
T7 promoters of varying strength were generated from a random library and cloned in front of mRFP (28). The promoter was randomized in the following way: TAATACGACTCACTANNNNNAGA. The library was screened by picking 36 random colonies and growing overnight in 5ml LB media containing 1mM IPTG. Fluorescence was measured by flow cytometry, and four representative clones that exhibited diverse strengths were selected for further analysis. These four clones were sequenced and characterized. T7 terminators were generated from a random library and inserted into the characterization vector N292 (SBa_000566). The T7 terminator library was diversified as follows: TANNNAACCSSWWSSNSSSSTCWWWCGSSSSSSWWSSGTTT. The terminator library was co-transformed with T7* to screen for active terminators. Thirty-six colonies were selected and grown overnight in 5ml LB media containing 1mM IPTG. Fluorescence was measured by flow cytometry, and eight clones that exhibited highest termination were selected for further analysis. These eight clones were sequenced and characterized. The RBS Calculator (‘Forward Engineering’ mode, ‘ACCTCCTTA’ as the 16S RNA sequence) was used to generate RBSs of 25000 AU for each gene in the lycopene biosynthetic pathway. Insulator sequences were designed using the Random DNA Generator using a random GC content of 50% (http://www.faculty.ucr.edu/~mmaduro/random.htm).
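The two degenerate templates above use standard IUPAC nucleotide codes (N = any base, S = C or G, W = A or T). As a rough sketch of how large each library is in principle, the following counts and, for small templates, enumerates the encoded sequences. The function and constant names are hypothetical; only the two template strings come from the text.

```python
from itertools import product

# Subset of the IUPAC nucleotide codes that appear in the two templates.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "S": "CG",    # strong: C or G
    "W": "AT",    # weak: A or T
    "N": "ACGT",  # any base
}

T7_PROMOTER_LIBRARY = "TAATACGACTCACTANNNNNAGA"
T7_TERMINATOR_LIBRARY = "TANNNAACCSSWWSSNSSSSTCWWWCGSSSSSSWWSSGTTT"


def library_size(template):
    """Number of distinct sequences encoded by a degenerate template."""
    size = 1
    for code in template:
        size *= len(IUPAC[code])
    return size


def expand(template):
    """Yield every concrete sequence encoded by a degenerate template.

    Only practical for small libraries; use library_size() first.
    """
    for combination in product(*(IUPAC[code] for code in template)):
        yield "".join(combination)
```

The promoter template contains five N positions and therefore encodes 4^5 = 1024 variants, so the 36 colonies picked for screening sample only a small fraction of the theoretical diversity. The terminator template is far more diverse and should be counted with `library_size` rather than enumerated.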
Each orthogonal polymerase was constructed in the T7* backbone. The specificity loop for the N4 subfamily was inserted in T7*, and the gene was cloned into pIncW without further modification. This construct is denoted T7*(N4). The K1F and T3 specificity loops were cloned into T7*, combined with the Lon-mediated N-terminal degradation tag (LFIKPADLREIVTFPLFSDLVQCGFPSPAADYVEQRIDL) and inserted in pIncW to produce T7*(K1F) and T7*(T3).
A practical challenge in using phage polymerases is that they can exhibit toxicity (20,29,30). To overcome this, we constructed a variant of T7 RNAP with reduced toxicity by combining mutations to lower its concentration and activity. The wild-type gene was placed under control of an IPTG-inducible promoter in a low copy plasmid and co-transformed with a second plasmid containing a T7 promoter and fluorescent reporter (Figure 1B). Toxicity was assessed by plating strains on inducing media and counting colonies after 24h of growth. Significant toxicity was observed when T7 RNAP is highly expressed, even in the absence of a T7 promoter (Figure 1C, v1). A Lon-mediated N-terminal degradation tag from the umuD gene in E. coli (31) was added to limit polymerase concentration (Figure 1C, v2). This resulted in a slightly lower toxicity. Next, RNAP expression was tightly controlled using a weak ribosome-binding site and GTG start codon. Finally, during cloning, a spontaneous mutation in the polymerase active site (R632S) arose. This mutation was found to significantly reduce host toxicity while maintaining activity (Figure 1C, v3). It is interesting to note that the previous studies have reported that mutations to this region of the polymerase can reduce its processivity (32–34).
Once the toxicity was reduced, we sought to expand the number of polymerases that are orthogonal and can be used simultaneously in a cell. In the context of a controller, this would increase the number of circuit outputs that could be linked to different pathways (Figure 1A). Orthogonality can be achieved by engineering polymerases to bind to different promoters and not cross-react. One approach would be to apply part mining (35), where homologous phage polymerases are identified from the sequence databases, constructed using DNA synthesis and screened for activity and orthogonality. However, we did not want to repeat the process of reducing the toxicity for each polymerase. To avoid this, we used the set of T7 backbones (v1 to v3) as scaffolds into which we inserted the peptide loop responsible for promoter recognition from different phage polymerases identified in the NCBI database (Figure 2A). These chimeras were then tested for activity and orthogonality.
The DNA sequence to which T7 RNAP binds is determined by a β-hairpin, known as the specificity loop, which contacts the −12 to −7bp region of the promoter (36). Changes to the specificity loop confer the ability to recognize different promoter sequences (37,38). Remarkably, it had been shown that a single mutation (N748D) switches T7 RNAP to preferentially bind a T3-like phage promoter (39). Thus, this is a good region to target mutagenesis; however, random mutations may be disruptive and it could require too many simultaneous mutations to generate orthogonality (40). Instead, we took the approach of exchanging the entire β-hairpin identified in RNAP from homologous phages. This is related to previous work, where orthogonal transcription factors were made by using information from a sequence database to guide mutations to the protein-DNA interface (39).
Each phage in the T7 family contains an RNAP and 10–20 promoters that provide a wealth of information about the interaction of the polymerase with DNA. To identify β-hairpins that confer different binding specificities, we identified homologues of T7 RNAP and computationally determined their DNA-binding preferences. First, 43 T7 RNAP homologues were identified from NCBI via a protein–protein BLAST against non-redundant protein sequences (25). RNAP with an E-value less than 1e−100 and for which a fully sequenced phage genome exists were selected for further analysis. A multiple sequence alignment of the RNAP amino acid sequences was performed using ClustalW (26) to identify and extract the β-hairpin in each RNAP corresponding to T7 amino acids G732-P780. Three RNAPs (RSB2, W5455 and ϕIBB-PF7A) were eliminated due to significant differences in length of the β-hairpins. A second multiple sequence alignment was performed with only the β-hairpin sequences, and 13 RNAP subfamilies were identified (distance between members <0.1 in the ClustalW guide tree). Putative promoters were identified from each phage genome using PHIRE, software that scans genomes for regulatory elements by identifying conserved sequences with a limited number of user-defined degeneracies (27). WebLogo was used to determine the consensus sequence for each phage RNAP (41). Remarkably, for each β-hairpin subfamily, the binding region of the consensus promoter is identical (Supplementary Figure S1).
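To make the notion of a consensus promoter concrete, the following sketch derives one from a set of equal-length putative promoters by simple per-position majority vote. This is a simplified stand-in for the information-content logos produced by WebLogo, not a reimplementation of it; the function name and the `min_fraction` threshold are assumptions made for illustration.

```python
from collections import Counter


def consensus(aligned_promoters, min_fraction=0.5):
    """Per-position majority consensus of equal-length promoter sequences.

    A position is reported as 'N' unless one base occurs in more than
    min_fraction of the sequences.
    """
    if not aligned_promoters:
        raise ValueError("no promoters supplied")
    length = len(aligned_promoters[0])
    if any(len(seq) != length for seq in aligned_promoters):
        raise ValueError("promoters must be aligned to the same length")
    result = []
    for column in zip(*aligned_promoters):
        base, count = Counter(column).most_common(1)[0]
        result.append(base if count / len(column) > min_fraction else "N")
    return "".join(result)
```

Comparing the consensus strings of phages within one β-hairpin subfamily, restricted to the positions contacted by the specificity loop, is one way to check the observation that the binding region is shared across the subfamily.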
Novel RNAPs were generated by swapping the β-hairpin from T7 RNAP (Q744 to I761) with the equivalent region from each subfamily (Figure 2B). The corresponding binding region of the T7 promoter (−12C to −7C) was replaced with the promoter subfamily consensus sequences (Figure 2A). The resulting RNAPs were screened for activity against their predicted promoters. Four RNAPs exhibited strong activity (42-, 12-, 17- and 40-fold induction by T7*, T7*(T3), T7*(K1F) and T7*(N4), respectively) (Supplementary Figure S4) and similar temporal induction (Supplementary Figure S6). The activity of these RNAPs against non-cognate promoters was then characterized. Each RNAP is highly orthogonal, even after significant mutations to both the specificity loop and promoter (Figure 2C).
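One simple way to summarize a cross-reactivity experiment of this kind is to compare, for each polymerase, its activity on the cognate promoter with its activity on every non-cognate promoter. The sketch below reports the worst-case (smallest) ratio per polymerase. The function name and the data layout are hypothetical, and the metric is an assumption about how such a matrix could be reduced rather than the exact calculation behind Figure 2C.

```python
def orthogonality_ratios(activity):
    """Smallest cognate/off-target activity ratio for each polymerase.

    activity[rnap][promoter] is the background-corrected reporter signal;
    each promoter is keyed by the name of its cognate polymerase.
    """
    ratios = {}
    for rnap, row in activity.items():
        cognate = row[rnap]
        off_target = [value for promoter, value in row.items() if promoter != rnap]
        if not off_target:
            raise ValueError(f"no off-target promoters measured for {rnap}")
        if any(value <= 0 for value in off_target):
            raise ValueError("off-target signals must be positive to form a ratio")
        ratios[rnap] = min(cognate / value for value in off_target)
    return ratios
```

Reporting the minimum rather than the mean ratio is conservative: a polymerase counts as orthogonal only if it discriminates against its most cross-reactive promoter.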
It is valuable to be able to tune the strength of a promoter to achieve varied levels of transcription. The T7 promoter has been shown to be modular, consisting of a 6bp RNAP-binding region and a 5bp strength-determining region (36,42,43). Mutations to the strength-determining region should alter promoter strength without affecting RNAP specificity (44,45). We created a promoter library by randomizing the wild-type T7 promoter from −2bp to +3bp. The library was screened using flow cytometry, and a set of representative promoters was identified that includes a broad range of expression levels spanning two orders of magnitude. These promoters were sequenced to determine the mutations to the strength-determining region (Figure 3A). These regions were then combined with a different specificity-determining region that is specific for T7*(T3). The rank order for strength persists with fair correlation between strengths of individual promoters (Figure 3B and Supplementary Figure S5). Orthogonality is retained for the hybrid promoters, where T7* only transcribes those promoters containing the cognate-binding sequence and vice versa for T7*(T3).
The set of orthogonal RNAPs enables the independent control of multiple pathways by a genetic program encoded on a controller. The modularity of the controller allows it to be tested using fluorescent reporters before implementing it to control metabolic pathways or difficult to assay cellular functions. To demonstrate this, we constructed a simple genetic program whose output is two RNAPs (T7* and T7*(K1F)) under the control of multiple inducing signals and a logic gate (Figure 4). T7* is expressed from the Ptac promoter by IPTG induction (characterized in Supplementary Figure S3), whereas T7*(K1F) is controlled by an AND gate that is active only in the presence of both IPTG and anhydrotetracycline (aTc). The AND function is achieved by placing a lacO-binding site after the transcription start site of the Ptet promoter. Such promoter engineering has been applied previously to build gates (46,47), and it is similar to that used to build an edge detector program (11). We characterized the performance of the genetic program using reporter plasmids for each RNAP and verified that it produced the expected circuit logic (Figure 4B). The reporters are placed under the control of promoters responsive to each RNAP in the same genetic context as the genes ultimately to be controlled. The T7* RNAP exhibits 7.8-fold induction, whereas the T7*(K1F) RNAP is induced 9.2-fold when tested using red fluorescent protein and a low copy pSC101* origin.
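The intended logic of this controller can be stated as a small truth table: T7* follows IPTG alone, while T7*(K1F) requires both IPTG and aTc. The sketch below is an idealized Boolean abstraction with hypothetical function names; it ignores leakiness and the graded fold inductions reported above.

```python
def controller_outputs(iptg, atc):
    """Idealized ON/OFF state of each polymerase for a given inducer combination."""
    return {
        "T7*": bool(iptg),                 # Ptac: induced by IPTG alone
        "T7*(K1F)": bool(iptg and atc),    # Ptet with lacO: AND of IPTG and aTc
    }


def truth_table():
    """All four inducer combinations with the resulting polymerase states."""
    return [
        (iptg, atc, controller_outputs(iptg, atc))
        for iptg in (False, True)
        for atc in (False, True)
    ]


if __name__ == "__main__":
    for iptg, atc, outputs in truth_table():
        print(f"IPTG={iptg!s:5} aTc={atc!s:5} -> {outputs}")
```

Because each polymerase reads only its own promoter family, the same table predicts the state of whatever reporter or pathway is placed downstream of each channel.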
Once the controller is verified, it can then be co-transformed with the target pathways, which are carried on a second plasmid. We applied the genetic program to control the expression of two small molecule pigments, lycopene (red) and deoxychromoviridans (green). Escherichia coli produces small amounts of lycopene through the 1-deoxy-d-xylulose-5-phosphate (DXP) pathway following the introduction of the carotenoid genes crtEBI (Figure 5B) (48). Lycopene production can be improved by overexpressing two genes, dxs and idi (49,50). Deoxychromoviridans is synthesized from l-tryptophan by the genes vioABE (51,52).
Each gene from both pathways was placed in a cistron comprising an insulator, T7 or K1F promoter, synthetic ribosome-binding site, the gene of interest and a T7 terminator (Figure 5A). A library of T7-derived terminators was created to avoid homologous recombination between cistrons (Supplementary Data, Supplementary Figures S8 and S9). Synthetic cistrons were assembled into either a lycopene operon or a deoxychromoviridans operon using the Gibson assembly method (53). This method enabled us to introduce a library of T7 promoters into each lycopene cistron, co-transform the library with T7*, and screen for efficient producers under inducing conditions. We identified and sequenced a clone that exhibited excellent pigment production and contrast (T7 promoters indicated in Figure 5A). For deoxychromoviridans, we obtained sufficient expression using wild-type promoters but found that clones were not stable when vioE was expressed as a cistron. Expressing vioB and vioE from a single K1F promoter is stable and eliminates leaky expression of deoxychromoviridans.
We then assessed the feasibility of connecting the genetic program that was tuned and characterized in isolation to the multigene pigment pathways. The genetic program was co-transformed with the two biosynthetic pathways, and cells were plated on varying inducer combinations (Figure 5C). Lycopene is synthesized in the presence of IPTG and is not affected by the presence of aTc. Deoxychromoviridans, in contrast, is only synthesized when both inducing molecules are present. Further quantification was performed by extracting pigments and measuring relative absorbance under the different inducing states (50,54). We measured a 7.9-fold induction of lycopene expression between the absence of inducers and the addition of IPTG (Figure 5D). Lycopene production in the presence of both IPTG and aTc yields a 5.6-fold induction, and induction of deoxychromoviridans is 3.3-fold. The circuit performance obtained using the fluorescent reporters (Figure 4) closely matches with that obtained using the more complex pathways (Figure 5D).
Genetic engineering is becoming increasingly complex, requiring integrated control over multiple many-gene pathways. Ultimately, it is envisioned that synthetic systems could approach the size and complexity of complete genomes. Our ability to engineer systems at this scale is going to require modularization in their design and testing. To this end, we present an approach to decouple the genetic regulation controlling the conditions and dynamics of gene expression from the pathways that are being controlled. The sensors and circuits can be constructed and tuned using fluorescent proteins, and the engineered pathways can be optimized under simplified inducing conditions. Because the output channels are orthogonal polymerases, the ‘controller’ can switch from testing to implementation simply by co-transforming it with the more complex pathways. This has several advantages. First, the pathways may be challenging to assay and inappropriate for the characterization of circuit dynamics. Second, this approach enables the future development of highly integrated genetic programs linking dozens of sensors and circuits that all can be encoded in the controller. Thus, the modularization of genetic programming enables it to be abstracted from the idiosyncrasies of the biology being controlled.
Supplementary Data are available at NAR Online: Supplementary Table 1, Supplementary Figures 1–10 and Supplementary References [56–63].
Office of Naval Research [N00014-10-1-0245 to C.A.V.]; NSF [CCF-0943385 to C.A.V.]; National Institutes of Health [AI067699 and AI067699 to C.A.V.]; NSF graduate student fellowships (to K.T. and F.M.) and NDSEG and Hertz graduate fellowships (to T.H.S.S.). K.T., R.H., T.H.S.S., F.M., and C.A.V. are part of the NSF Synthetic Biology Engineering Research Center (SynBERC). Funding for open access charge: National Institutes of Health [AI067699 and AI067699 to C.A.V.].
Conflict of interest statement. None declared.
The authors thank George Church and Harris Wang for plasmid pAC-LYC containing lycopene biosynthesis genes.