An important aim of synthetic biology is to uncover the design principles of natural biological systems through the rational design of gene and protein circuits. Here we highlight how the process of engineering biological systems — from synthetic promoters to the control of cell–cell interactions — has contributed to our understanding of how endogenous systems are put together and function. Synthetic biological devices allow us to intuitively grasp the ranges of behavior generated by simple biological circuits, such as linear cascades and interlocking feedback loops, as well as to exert control over natural processes such as gene expression and population dynamics.
One of the most astounding findings of the human genome project was that our genome contained as many genes as that of Drosophila melanogaster. This finding raised the question: how do you get one organism to look like a fly and another like a human with the same number of genes? One possibility is that the rich repertoire of non-protein coding sequence found in the genomes of complex organisms adds many new parts with which to generate complexity1. A decade of research has put forward the rather different idea that instead of looking at the length of the parts list as the determinant of organismal complexity, we should look at how those parts fit together2,3. From this perspective, complexity arises from novel combinations of pre-existing proteins and the ability to evolve new phenotypes rests on the MODULARITY of biological parts.
While natural examples have been found to illustrate this latter possibility3, strong evidence to support this post-genomic view of biology has come from the synthesis of new biological systems. Rational synthesis of biological systems can hint at the natural history of how a particular system came to acquire its properties4,5. More often, however, we use synthetic circuits to explore, in a hands-on fashion, the set of design principles that determine the structure and operation of biological systems.
The core aim of synthetic biology is to develop and apply engineering tools to control cellular behavior, using precisely characterized parts, such as cis-regulatory elements, to achieve desired functions. An important direction, for example, has been to engineer cells with an eye towards practical applications, such as BIOREMEDIATION6, biosensors7, biofuels8,9, and even the beginnings of clinical applications10–12. In this Review, however, we focus on how synthetic circuits help us to understand how natural biological systems are genetically assembled and how they operate in organisms from microbes to mammalian cells. In this light, synthetic circuits have been critical as simplified test-beds in which to refine our ideas of how similarly structured natural networks function and have served as tools to control natural networks. We highlight the contribution of synthetic biology to putting together an increasingly quantitative description of gene expression and signal transduction, in uncovering the diversity of behaviors that can arise from positive and negative feedback systems, and progress in rationally controlling spatial organization and cell-cell interactions. We pay particular attention to recent progress in using synthetic systems to uncover novel aspects of cell biology, such as how cells decide to undergo apoptosis and the molecular basis for communication between the endoplasmic reticulum and mitochondrion. We aim to show that synthetic biological approaches have given us a great deal of intuition on how the simple building blocks that underlie complex natural systems work as well as basic tools to quantitatively characterize natural phenomena, both of which are crucial for the field to progress into the analysis and complete control of natural circuits.
The first step in assembling a biological circuit is to gather the component parts. In the cell, circuits are implemented through gene expression, and so a great deal of effort in synthetic biology has gone into investigating the rules surrounding gene expression, particularly the processes of transcription and translation. The precise measurements afforded by artificially constructed systems allow us to transform qualitative notions of transcriptional repression and activation and post-transcriptional regulation into quantifiable effects such as how promoter architecture defines the rate of transcription and the specific degradation rate specified by a given sequence motif.
Among the earliest contributions of synthetic biology to understanding natural biological processes are detailed, quantitative measurements of transcriptional regulation, building on a foundation laid 50 years ago in the groundbreaking work of researchers such as Jacob and Monod13. Synthetic constructs have been used to map out the TRANSFER FUNCTION that relates the input concentration of transcription factor14,15 and inducers16 to the output concentration of a reporter gene14,17,18, single mRNA molecules19,20 or single proteins21. Many of these same constructs were also used to measure the mean output of the transcriptional process and also HIGHER-ORDER MOMENTS (such as the VARIANCE) in organisms ranging from Escherichia coli and Bacillus subtilis to mammalian cells. Single-molecule studies in these model organisms directly established that mRNA and proteins are produced in bursts of activity22.
A key question in the study of transcriptional regulation is how promoter architecture affects transcriptional activity. For example, below we describe several studies that have informed how the number and genomic positions of transcription factor (TF) binding sites affect transcriptional activity. Given the combinatorial control of gene expression, it is also critical to study how multiple TFs interact with DNA and with each other to tune mRNA production. Endogenous promoters use all of these parameters to specify either a desired transcription rate or a BOOLEAN FUNCTION, such as an AND gate that allows transcription to occur only when all TF binding sites in the promoter are occupied.
The experimental breakthrough that allowed quantitative measurements of the transcriptional power of different promoter architectures was the use of combinatorial promoter libraries23. Libraries of promoters driving reporter proteins, such as luciferase or fluorescent proteins, allow for an unbiased measurement of transcriptional activity over the space of possible promoters; such an unbiased method can then be used to try to ascertain rules that describe the responsiveness of a promoter to TFs. Earlier work used randomly mutated promoters to draw inferences about the functional subparts of the promoter, such as the TATA box; by contrast, the construction of combinatorial promoter libraries involves identifying specific operator sites that bind TFs and randomly ligating them together in a way that shuffles their relative positions and copy numbers (Figure 1). The studies highlighted below have combined such promoter libraries and modeling to show that the strength of a promoter is determined largely by the position of TF binding sites with respect to key promoter elements such as the TATA box and with respect to each other.
The simplest case is to understand how the positioning of a single operator affects the expression of a promoter. In prokaryotes, operators are classified as being in the core, proximal, or distal regions of the promoter (Figure 1). Working in E. coli24,25, Cox et al. and Kinkhabwala and Guet independently observed that repressors can effectively repress expression from all three promoter subregions, with Cox et al. showing that the strength of repression is greatest when the repressor site is in the core region of the promoter, less strong when in the proximal region, and weakest when in the distal region. Activators, on the other hand, work only in the distal site; Cox et al. showed that they have no effect in the core and proximal sites. Both studies go on to develop simple models of promoter activity by taking into account binding reactions of TF to DNA that are in thermodynamic equilibrium.
It was expected that the situation would be far more subtle in eukaryotes, where chromatin structure can strongly influence expression levels26. However, even in Saccharomyces cerevisiae 49% of the variation in expression in the promoter library could be explained by a simple thermodynamic model incorporating just TF-DNA and TF-TF interactions27, interactions that were also suggested in theoretical work28. More surprisingly, Gertz et al. provided evidence that weak binding sites, which are important for prokaryotic transcription, can also be important in eukaryotes. Focusing on the TF multicopy inhibitor of GAL-1 (Mig1), Gertz et al. showed that repression from one weak and one strong Mig1 binding site can be as effective as two strong Mig1 binding sites. This is particularly crucial given that 24% of all yeast promoters contain putative weak Mig1 binding sites.
The promoter library studies open the way to consider some general questions in transcriptional control. The theoretical frameworks in the E. coli and yeast studies, for example, differ slightly: the E. coli studies do not require TF-TF interactions and frame the issue mostly in the language of Boolean logic, whereas the yeast study makes heavy use of TF-TF interactions, particularly in the analysis of weak binding sites. Future single-molecule studies of transcriptional control can help to resolve the relative importance of TF-DNA and TF-TF interactions in generating transcriptional activity. Furthermore, the fact that simple equilibrium binding explains much, but not all, of the effect of promoter architecture on expression level suggests that the next goal should be to track down the source of the remaining variation. Genomic location can be an important contributor to expression and expression fluctuations29, perhaps by affecting local chromatin context. Knowing how to parcel out the variation due to these different effects will be particularly helpful when these studies are extended to mammalian systems, where there is considerably less control over where synthetic transgene constructs are integrated into the genome.
Although much of the early work in synthetic biology focused on transcriptional regulation, significant progress has also been made in incorporating post-transcriptional effects into synthetic circuits, affecting both RNA and protein. At the RNA level, for example, mutagenesis screens based on synthetic constructs have been used to determine the sequences that are recognized by RNA editing enzymes to change adenine into inosine30. Furthermore, as regulatory RNAs have been increasingly appreciated as important drivers of gene expression, synthetic circuits have included elements from the RNA interference pathway31, APTAMERS32-34, and RIBOSWITCHES35,36 to control the flow of genetic information37.
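To make the equilibrium picture concrete, the following is a minimal sketch of a thermodynamic occupancy model for a single repressor site. It assumes a weak promoter and a repressor that excludes polymerase, so that the fold-change in expression is 1 / (1 + [R]/K_d). The function name and the effective dissociation constants are hypothetical and chosen only to reproduce the qualitative ordering described above; they are not parameters from Cox et al. or Kinkhabwala and Guet.

```python
def repression_fold_change(repressor_conc, k_d):
    """Fold-change in expression for a single repressor site.

    Assumes equilibrium binding, a weak promoter, and mutually exclusive
    binding of repressor and polymerase: fold-change = 1 / (1 + [R] / K_d).
    """
    return 1.0 / (1.0 + repressor_conc / k_d)


# Hypothetical effective dissociation constants (arbitrary concentration
# units). A larger value stands in for whatever position-dependent effect
# weakens repression at that site; it is an illustrative assumption.
effective_k_d = {"core": 1.0, "proximal": 10.0, "distal": 100.0}

if __name__ == "__main__":
    repressor_conc = 50.0  # same arbitrary units as the K_d values
    for region, k_d in effective_k_d.items():
        fold = repression_fold_change(repressor_conc, k_d)
        print(f"{region:9s} fold-change = {fold:.3f}")
```

In this toy parameterization the core site gives the strongest repression and the distal site the weakest, mirroring the trend reported for the promoter library, although the real position dependence need not act through binding affinity alone.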
Synthetic circuits involving enzymatic RNAs have mostly been developed as platforms to tune gene expression, but many of these platforms can easily be extended to understand natural biological phenomena. In the study of Grate and Wilson, for example, an aptamer is used to control the expression of cyclinB-2 (Clb2), a key regulator of the cell cycle, in a tetramethylrosamine (TMR)-dependent manner34. The authors slowed the speed of the cell cycle by adding TMR; this method can be useful in measuring how the level of Clb2 protein affects the speed at which the cell cycle progresses while keeping all transcriptional feedback constant.
Synthetic studies have also directly tweaked how mRNA is translated into protein and how long proteins persist before being degraded. Several experiments in prokaryotic systems, especially those studying the stochastic nature of gene expression, have altered the translation rate by mutating ribosomal binding sites (RBS)17,38. Apart from demonstrating another possible layer of quantitative regulation of gene expression, studies involving RBS variants provided early evidence that E. coli cells could tune the stochasticity in the expression level of a given gene independently of its mean. Lastly, Grilly et al. have developed a circuit that controls the degradation of a target protein using the well-known ClpXP protease machinery from E. coli39. Typically, models of gene expression treat protein degradation as an exponential decay process, with the decay being due to growth of cell volume over time. Regulated proteolysis, however, can depend on the formation of enzyme-substrate complexes as intermediates on the way to degradation. In finding that the degradation follows Michaelis-Menten kinetics, Grilly et al. completed one of the few quantitative comparisons of specific protease activity to models of enzyme kinetics.
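The difference between the two degradation models can be illustrated with a short numerical sketch. It integrates dP/dt = -γP (dilution-like, first-order decay) and dP/dt = -V_max P / (K_m + P) (enzymatic, Michaelis-Menten decay) with a forward Euler scheme. All names and parameter values are illustrative assumptions and are not taken from Grilly et al.

```python
def simulate_decay(p0, rate_fn, dt=0.01, t_end=10.0):
    """Integrate dP/dt = -rate_fn(P) with forward Euler; return the trajectory."""
    steps = int(round(t_end / dt))
    p = p0
    trajectory = [p]
    for _ in range(steps):
        p = max(p - rate_fn(p) * dt, 0.0)  # protein level cannot go negative
        trajectory.append(p)
    return trajectory


def first_order(gamma):
    """Degradation rate proportional to protein level (exponential decay)."""
    return lambda p: gamma * p


def michaelis_menten(v_max, k_m):
    """Saturable degradation rate set by a limited pool of protease."""
    return lambda p: v_max * p / (k_m + p)


if __name__ == "__main__":
    # Hypothetical parameters in arbitrary units.
    exponential = simulate_decay(100.0, first_order(0.5))
    enzymatic = simulate_decay(100.0, michaelis_menten(5.0, 1.0))
    for step in (0, 200, 400, 600, 800, 1000):
        print(f"t={step * 0.01:5.1f}  first-order={exponential[step]:8.3f}  "
              f"Michaelis-Menten={enzymatic[step]:8.3f}")
```

When the substrate level is far above K_m, the enzymatic model loses protein at a nearly constant rate close to V_max (a roughly linear decline), whereas first-order decay removes a constant fraction per unit time. This qualitative difference is one way the two models could be distinguished in a time course.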
Taken together, these results point to some interesting similarities between transcription and translation – both are inherently noisy processes that can be quantitatively modulated by specific sequence elements, such as RBS and protease recognition sites. Future studies can use the ideas and methods from the study of transcription, such as combinatorial library approaches, to more systematically explore the process of translation.
Two approaches, engineering specific promoter architectures (or using a natural inducible promoter) to tune transcriptional activity and using specific sequence sites to tune translational yield, can be combined to achieve precise yet flexible control over gene expression17,31. A nice example of using these two ingredients to study natural processes in mammalian cells can be found in recent work in which Tet- and Lac-controlled regulation was adapted and combined with RNAi for use in HeLa cells (Figure 2)40.
As synthetic biology begins to recapitulate more realistic systems, which contain many moving parts, demand will increase for circuits that control every step of the process that turns DNA sequence into protein. Such layered circuits can help illuminate why certain regulatory schemes are employed to control gene expression over others in a given context. For example, gene expression in natural systems can be attenuated by epigenetic silencing, transcriptional repressors or post-transcriptional regulators such as microRNAs (either alone or in concert with other molecules); this raises the questions of why a system uses one mechanism rather than another and to what extent different layers of regulation generate collective effects that no one layer can accomplish. One area that will be increasingly under study, and that may help unravel the issues surrounding layered circuits, is the dynamics of the different steps that contribute to expression; the studies highlighted above almost exclusively focus their attention on steady state behavior. While intuition tells us that transcription factors act slowly compared to post-transcriptional players such as regulatory RNAs, as the latter presumably do not have to be transported back to the nucleus and then locate a specific genomic locus, there is currently a lack of data that would enable us to turn these intuitive notions into quantitative facts.
The act of engineering cellular pathways has allowed insight into two key properties: precise measurement and control of the input-output relationship of a pathway, and the functional architecture of the pathway constituents themselves. In particular, engineering signalling pathways has provided insights into the functional significance of specific protein sequences and structures by showing exactly which protein domains and which amino acid residues are responsible for mediating specific interactions along the pathway.
Initially, pathway engineering was primarily explored in the context of metabolism41. Metabolic engineering typically involved the use of genetic screens and DIRECTED EVOLUTION to maximize targeted METABOLIC FLUXES. Synthetic efforts in boosting metabolic fluxes have begun to pay off, as is exemplified in a recent study in which a synthetic protein scaffold was used to draw metabolic enzymes spatially closer to each other42; it should be noted, however, that this study does not involve any pathway rewiring. By contrast, rational rewiring of pathways involves specific manipulations to the components of the system to achieve a desired outcome. The most crucial aspect of protein and gene structure that synthetic biologists use to rewire pathways is the inherent modularity of many proteins43 (signalling proteins, for example, typically have dedicated domains for recognizing binding partners that act independently of other functional domains); most rewiring studies therefore focus on signal transduction and genetic cascades (Box 1). There are fewer examples of achieving metabolic control through specifically designed changes in protein sequence44,45. Changes in the structure of an ALLOSTERIC SITE in a metabolic enzyme are more prone to alter the active site of the enzyme than is the case with signaling proteins46. This property allows for regulation of metabolic fluxes by effects such as allostery, but the relative lack of modularity also makes it difficult to forward engineer new behaviors by altering one domain but holding all others constant.
As the central cartoon shows, membrane proteins (light blue) can be engineered to have sensors (green), and can be made to interact with adapters (red), which in turn can be made to interact with other adapters (dark blue). More formally, the input/output relationship can be controlled in two ways: by changing the stimulus that a receptor is triggered by (shown schematically in panel A), or by changing the transducing molecules that the receptor uses to pass the information from the environment to the cellular interior (panels B and C).
Chimeric photoreceptors illustrate the first type of change. Although chimeric receptors have been used previously104, photoreceptors allow for much higher sensitivity measurements and avoid crosstalk effects. In the case of E. coli, the rewiring is accomplished by transcriptionally fusing the cytosolic signal transduction domain of the pathway sensor, the histidine kinase domain of EnvZ, to cyanobacterial phytochrome 1 (Cph1), thereby resulting in a system in which EnvZ’s response regulator, OmpR, can be triggered by light50. Pathway activity is read out by placing the lacZ gene, whose product creates a black compound, under control of the OmpR-dependent ompC promoter.
The response to a light gradient input serves as a very precise measurement of the transfer function of the pathway (panel A, lower subpanel). The transfer curve seems to indicate that the pathway operates in a threshold linear manner, though whether that is due to the phytochrome sensor itself rather than the pathway needs to be explored. Such thresholding could serve to protect the cell from overreacting to small signals. Shimizu-Sato et al. operated on similar principles in yeast, but instead fused a Gal4 binding site domain (GBD) to the red-light absorbing phytochrome form Pr and a Gal activating domain (GAD) to the phytochrome’s binding partner phytochrome interacting factor 3 (PIF3), thus bypassing the galactose signaling cascade51. Any gene of interest can thus be controlled by placing it under the control of the gal1 promoter and simply exposing the cells to red light instead of galactose.
Once activated, the signal from the sensor must be specifically transduced to affect specific downstream processes. By studying covariance among residues from interacting proteins, one can use statistical scores such as mutual information to predict which residues determine the specificity of the interaction. As shown in panel B, when specificity-determining residues from the protein RstB (shown in bold) were substituted into EnvZ, resulting in the chimeric protein Chim1, phosphotransfer occurred from the chimeric kinase to RstA, the partner of RstB, rather than to EnvZ’s normal partner OmpR63.
Finally, a great deal of signal processing takes place in between the triggering of a sensor by the environment and the output of the pathway, especially in eukaryotes. One major intermediate in eukaryotes is the class of proteins known as guanine nucleotide exchange factors (GEFs), which control morphological pathways. Yeh et al. swapped wildtype GEFs controlling formation of filopodia and lamellipodia for synthetic GEFs that can be induced by the small molecule forskolin and that generate novel morphological outputs67. Specifically, GEFs contain an autoinhibitory domain that Yeh et al. substituted with a PKA-responsive inhibitory domain, PDZ. Placing an endogenous pathway under tunable control allows us to characterize crucial aspects of cell biology in quantitative detail. Interestingly, Yeh et al. found that the morphological output is only manifest probabilistically: it is the fraction of cells that display either filopodia (shown here) or lamellipodia (not shown) that increases with increasing forskolin.
Even within signaling systems, however, researchers are presented with severe challenges. Among the major limitations in understanding the signal propagation characteristics of many pathways is confusion over what cue triggers the cascade and whether the cue affects other processes taking place in the cell. Take for example the case of OSMOTIC SHOCK. While many organisms have dedicated signaling systems to relay information about an osmotic shock to the cell, the presence of abundant osmolyte will affect numerous processes besides signaling, such as global transcription-factor binding47. The examples described below illustrate how techniques that both specifically and sensitively activate a selected cascade allow one to focus on pathway behavior independently of such off-target effects.
One of the most direct ways of rewiring the input–output relationship of a pathway is by directly changing the cue that the pathway sensor responds to. If the cue is chosen such that its level can be directly modulated, then one can measure pathway transfer functions much as was described above for promoters. Armbruster et al., for example, generated a G protein-coupled receptor (GPCR) that responded to a pharmacologically inert compound that could then be titrated in to measure pathway response48, while Anderson et al. engineered sensors that can detect changes in tumour-related microenvironments12. Alternatively, one can manipulate the ligands that drive pathway activity, as was done by Cironi et al. when they linked together epidermal growth factor (EGF) and mutated forms of interferon α-2a (IFNα-2a) such that the only cells that could correctly respond to the IFNα-2a signal were those that coexpressed the EGF receptor49.
A particularly striking example of how sensor rewiring can shed light on the operation of a cascade in vivo in a sensitive and specific manner can be found in the use of chimeric photoreceptors (Box 1a). Two studies used light itself as the cue to drive a signaling system50,51; this approach is unlike traditional implementations of light-driven systems52–54, such as those that use light to activate a small molecule that then activates a desired biological process55. Levskaya et al. engineered the Escherichia coli EnvZ/OmpR two-component system to respond to light, while Shimizu-Sato et al. focused on the Saccharomyces cerevisiae galactose utilization pathway, by fusing a phytochrome and its binding partner to selected pathway proteins. Armed with this engineered cascade, Levskaya et al. proceeded to map out the input–output transfer function with very high precision by exposing a lawn of rewired bacteria to a light gradient. The transfer function measured in Levskaya et al. suggests that a threshold level of the environmental cue is needed before triggering pathway activity. While careful titration of an osmolyte would have allowed precision measurement of the transfer function, such as through the use of MICROFLUIDIC DEVICES56, matching the sensitivity of a simple light gradient will be difficult to accomplish. Furthermore, matching the specificity afforded by using light to drive pathway activity is probably impossible. However, given the ease with which we can deliver precisely controlled light signals to cells compared to delivering chemical signals, the Levskaya et al. and Shimizu-Sato et al. studies can be easily extended to perform tasks such as measuring BODE PLOTS, as was recently done for the yeast osmoresponse system57,58.
Whereas swapping the sensor in a signaling pathway is a way to engineer the input side of the input–output relationship, changing the identity of the molecules that carry the signal from sensor to downstream effectors can affect the output side. In fact, given the high degree of sequence homology between many sensor/transducer pairs, there is great interest in developing a detailed description of sensor/transducer interactions to understand the multiple ways in which pathways prevent crosstalk59 — for example, by using SCAFFOLD PROTEINS60, MUTUAL INHIBITION61, and KINETIC INSULATION62.
This is the basic strategy that was followed by Skerker et al. to rewire the EnvZ/OmpR system63. This study made heavy use of the large amount of sequence data available for TWO-COMPONENT SYSTEMS to computationally detect individual amino acid residues that covary between cognate pairs. Specifically, they calculated the mutual information between all possible pairs of residues from sensors and response regulators and found the pairs that maximized mutual information. These pairs were hypothesized to be the specificity-determining residues. Remarkably, they then substituted a given sensor’s specificity-determining residues for a different sensor’s specificity-determining residues, keeping all other residues intact, and thereby activated the latter sensor’s pathway with the former sensor’s trigger. Furthermore, they performed the same rewiring feat by substituting specificity-determining residues in the response regulator (Box 1b).
For now, the relative paucity of sequence data precludes the use of this technique for other systems, such as eukaryotic homologues of two-component systems. Nevertheless, this study provides a framework in which one can go beyond crude domain-level protein engineering all the way to molecular details. A particularly enticing possibility, which is explored in Skerker et al., is to unite the bioinformatically guided rewiring approach with data on crystal structure, especially structures of protein–protein complexes. Using a crystal structure of a complex made up of proteins similar to EnvZ and OmpR, Skerker et al. show that the specificity-determining residues for both sensor kinase and response regulator probably occur at the interface of the two proteins, suggesting that the coevolving residues interact physically rather than allosterically. Combining structural and rewired pathway data can indicate how to explore further the numerous systems in which docking site interactions have been identified64,65. Synthetic pathways and crystallography together can be key in unraveling fundamental biophysical interactions underlying signal transduction.
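The covariation analysis can be sketched in a few lines. The example below computes the mutual information between every column of a toy sensor alignment and every column of a toy response-regulator alignment, where row i of each alignment is assumed to come from the same cognate pair. The sequences, function names, and the absence of any finite-sample or phylogenetic correction are simplifying assumptions for illustration; this is not the pipeline used by Skerker et al.

```python
from collections import Counter
from math import log2


def mutual_information(col_x, col_y):
    """Mutual information (in bits) between two equal-length alignment columns."""
    n = len(col_x)
    count_x = Counter(col_x)
    count_y = Counter(col_y)
    count_xy = Counter(zip(col_x, col_y))
    mi = 0.0
    for (a, b), count in count_xy.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((count_x[a] / n) * (count_y[b] / n)))
    return mi


def score_column_pairs(sensors, regulators):
    """Score every (sensor column, regulator column) pair; return best pair and all scores."""
    scores = {}
    for i in range(len(sensors[0])):
        col_i = [seq[i] for seq in sensors]
        for j in range(len(regulators[0])):
            col_j = [seq[j] for seq in regulators]
            scores[(i, j)] = mutual_information(col_i, col_j)
    best = max(scores, key=scores.get)
    return best, scores


# Toy alignments (hypothetical): row k of each list is one cognate pair.
toy_sensors = ["AKT", "AKS", "GKT", "GKS"]
toy_regulators = ["DL", "DL", "EL", "EL"]

if __name__ == "__main__":
    best_pair, all_scores = score_column_pairs(toy_sensors, toy_regulators)
    print("highest-scoring column pair:", best_pair)
    print("mutual information (bits):", all_scores[best_pair])
```

In the toy data only sensor column 0 and regulator column 0 change together, so that pair carries one bit of mutual information while conserved or independently varying columns score zero. Real alignments would need many more sequences for the estimate to be reliable.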
Altering the way in which a sensor interacts with its environmental cues and its immediate downstream signaling partner represents the most obvious way to manipulate signal transduction. The next most obvious idea is to follow the signal and tackle the intermediate transducers in the pathway. Howard et al., for example, took the pro-apoptotic Fadd death domain and fused it to Grb2 and ShcA, members of the receptor tyrosine kinase (RTK) pathway; as a result, RTK-triggered signals could be used to drive apoptosis66.
At the adapter level, one key target for pathway engineering is the family of guanine nucleotide exchange factors (GEFs) that regulate the actin cytoskeleton through the Rho family of GTPases67. Yeh et al. exploited the presence of an autoinhibitory domain in GEFs that can be swapped for an inhibitory domain that itself is under the control of a small molecule. Yeh et al. swapped wildtype GEFs controlling formation of filopodia and lamellipodia for synthetic GEFs that can be induced by the small molecule forskolin (Box 1c). In this study, Yeh et al. daisy-chained two GEFs in series and showed that the combined, and thus longer, GEF system is both more sensitive to inducer and displays a sharper separation between ON and OFF states. These results are exactly what one would expect from previous synthetic studies examining the sensitivity and sharpness of transcriptional cascades as the cascade length is varied68. As seen above in the case of apoptosis in the RNAi switch, placing an endogenous pathway (morphological in this case) under tunable control allows us to characterize crucial aspects of cell biology in quantitative detail.
Another interesting and complementary theme that emerges from rewiring studies is how differently rewired circuits can yield the same output. The library of combinatorially synthesized gene networks constructed by Guet et al. contains instances of systems that have different connectivity properties but the same Boolean truth table and those that have the same connectivities but different Boolean truth tables69. Along these same lines, Isalan et al. show that randomly rewiring the transcriptional network of E. coli results in growth defects in only 5% of the rewirings, a level of tolerance difficult for manmade systems to replicate70. The idea of rewiring a circuit but maintaining its logic seems also to have been employed in the evolution of the mating type switch in yeast, where Candida albicans a genes activate the a mating type while Saccharomyces cerevisiae alpha genes repress the a mating type71. Theoretical studies on the evolvability of biochemical networks suggest that networks that are wired differently but produce the same output constitute a ‘neutral space’ that allows flexibility in the design of networks to perform some function and thus eases the way for phenotypic changes to take place72,73. Continuing in the theme of using rewired pathways to highlight system flexibility, Antunes et al. transplant a bacterial two-component system into the eukaryotic plant Arabidopsis thaliana. The prokaryotic transcriptional activator manages to cross into the nucleus to drive gene expression, fueling speculation that pathway evolution can even be driven by horizontal gene transfer between organisms from different kingdoms of life74.
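Why a longer cascade can be both more sensitive and sharper can be seen in a toy calculation. The sketch below treats each stage as a Hill function and feeds the output of one stage into the next, then estimates an effective Hill coefficient from the inputs that give 10% and 90% of the maximal response. The stage parameters are hypothetical and are not fitted to the GEF system of Yeh et al. or to any transcriptional cascade.

```python
import math


def hill(x, k, n):
    """Hill-type response with half-maximal input k and Hill coefficient n."""
    return x**n / (k**n + x**n)


def one_stage(x):
    return hill(x, 1.0, 2.0)


def two_stage(x):
    # The output of the first stage (between 0 and 1) drives a second stage.
    return hill(hill(x, 1.0, 2.0), 0.5, 2.0)


def dose_for_fraction(response, fraction, lo=1e-6, hi=1e6):
    """Input giving `fraction` of the maximal response, by log-scale bisection.

    Assumes `response` increases monotonically and is saturated at `hi`.
    """
    target = fraction * response(hi)
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if response(mid) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def effective_hill(response):
    """Effective Hill coefficient, ln(81) / ln(EC90 / EC10)."""
    ec10 = dose_for_fraction(response, 0.1)
    ec90 = dose_for_fraction(response, 0.9)
    return math.log(81.0) / math.log(ec90 / ec10)


if __name__ == "__main__":
    for name, response in (("one stage", one_stage), ("two stages", two_stage)):
        print(f"{name:10s} EC50 = {dose_for_fraction(response, 0.5):.3f}  "
              f"effective Hill coefficient = {effective_hill(response):.2f}")
```

With these particular parameters the two-stage system reaches half-maximal output at a lower input and has a larger effective Hill coefficient than the single stage. Whether a real cascade gains sensitivity depends on how the stages are matched, so this should be read as an illustration of the principle only.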
Synthesis has uncovered several rules governing how DNA is turned into proteins and then how proteins interact to generate diverse phenotypes without the need for a combinatorial explosion in the number of genes. In the examples considered above, however, the flow of information was largely an ordered sequence of events: diverse outcomes in these systems resulted from combinatorial rearrangements of modular parts. The complexity of naturally occurring cellular networks, however, is often dominated by feedback and feedforward loops. By incorporating these features, synthetic circuits have also taught us about the dynamics and systems-level function of more complex molecular interactions.
Initial work in this area primarily focused on the identification75 and experimental characterization of simple MOTIFS that occur frequently in genetic and signaling networks. In this first generation of synthetic biology, studies mimicked natural systems and confirmed theoretical expectations that positive feedback systems can be BISTABLE76-79, negative feedback systems are noise resistant80 and can speed up circuit dynamics81. More recently, engineered feedback loops have been extended to signaling and metabolic systems by generating novel protein-protein and genetic interactions to explore how signaling pathways set their sensitivity to input and how they tune their kinetics82,83. One concrete way in which synthetic circuits are helping us approach more complicated interaction networks is by serving as benchmarks against which theoretical and computational tools can be tested84,85 (see Box 2).
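The bistability expected of positive feedback can be illustrated with a one-variable model of a self-activating gene, dx/dt = basal + β xⁿ/(Kⁿ + xⁿ) - γx. The sketch below integrates this equation from different starting levels with a forward Euler scheme. The equation is a generic textbook form and the parameter values are hypothetical, chosen so that two stable states exist; they do not describe any of the cited circuits.

```python
def feedback_rate(x, basal=0.05, beta=1.0, k=0.5, n=4, gamma=1.0):
    """Net production rate of a self-activating gene (arbitrary units)."""
    return basal + beta * x**n / (k**n + x**n) - gamma * x


def steady_state_from(x0, dt=0.01, t_end=50.0):
    """Forward-Euler integration of dx/dt = feedback_rate(x) starting at x0."""
    x = x0
    for _ in range(int(round(t_end / dt))):
        x += feedback_rate(x) * dt
    return x


if __name__ == "__main__":
    for x0 in (0.0, 0.4, 0.5, 2.0):
        print(f"start at {x0:.1f} -> settles near {steady_state_from(x0):.3f}")
```

Starting levels below an unstable threshold relax to a low expression state, while those above it relax to a high state, so the same circuit with the same parameters can hold either of two outputs depending on its history.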
One of the most important functions that synthetic circuits have served has been their use in building and refining analytic and computational models of biological systems. When modeling a gene or protein circuit, one must make a series of choices. The first choice concerns the scale at which one wishes to model the input/output relationship; typically this boils down to whether one wants to view the system as a Boolean logic operator or as a dynamical system. The dynamical system framework can be further broken down along two dimensions, depending on whether spatial or stochastic effects need to be taken into account. Spatial effects can usually be ignored when the biochemical reactions that make up the system occur on timescales slower than the time it takes to mix the reactants by diffusion. Stochastic effects can usually be ignored if the dynamical variables of the system can be represented as continuous rather than discrete entities; that is, when we are interested in the concentration of a molecule rather than the number of molecules. Synthetic circuits have been used to explore all of these issues in some detail.
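The distinction between continuous and discrete descriptions can be illustrated with a minimal sketch. It assumes the simplest possible gene expression model, constant production at rate `k` and first-order degradation at rate `gamma`, and compares a deterministic rate equation with a Gillespie-type stochastic simulation of the same two reactions. The function names and parameter values are hypothetical and chosen only for illustration; they are not taken from any of the studies cited here.

```python
import random

def deterministic_level(k, gamma, t_end, dt=0.001):
    """Forward-Euler solution of dx/dt = k - gamma * x, starting from x = 0."""
    x = 0.0
    for _ in range(round(t_end / dt)):
        x += dt * (k - gamma * x)
    return x

def stochastic_mean_level(k, gamma, t_end, seed=0):
    """Gillespie simulation of the same two reactions (production at rate k,
    degradation at rate gamma * n); returns the time-averaged copy number."""
    rng = random.Random(seed)
    t, n, area = 0.0, 0, 0.0
    while True:
        total_rate = k + gamma * n
        wait = rng.expovariate(total_rate)   # time until the next reaction
        if t + wait >= t_end:
            area += n * (t_end - t)
            return area / t_end
        area += n * wait
        t += wait
        if rng.random() * total_rate < k:    # choose which reaction fired
            n += 1
        else:
            n -= 1

k, gamma = 20.0, 1.0                         # assumed illustrative rates
ode_value = deterministic_level(k, gamma, t_end=20.0)
ssa_value = stochastic_mean_level(k, gamma, t_end=2000.0)
print(f"ODE steady state: {ode_value:.3f}")
print(f"Gillespie time average: {ssa_value:.3f}")
print(f"Expected relative noise, 1/sqrt(mean): {(k / gamma) ** -0.5:.3f}")
```

For this simple model the two descriptions agree on the mean, and the relative size of the fluctuations scales as one over the square root of the mean copy number, which is why the continuous description becomes a poor approximation when molecules are scarce.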
Until recently, the choice of modeling methodology was based on one’s best guess for which effects were important to include, along with post-hoc comparison of the model with data. Detailed comparisons of different modeling paradigms have been lacking. Cantone et al.84 and Ellis et al.85 have offered the field some guidance through the introduction of benchmark networks — that is, a network that has a defined topology that interacts only minimally with endogenous systems, against which to test proposed modeling methods. In particular, Cantone et al. create a relatively sophisticated synthetic transcriptional network of 5 genes that serves as an oracle that is queried by different perturbations, such as overexpression of the network genes and induction by transcriptional inducers. Finally they test methods based on ordinary differential equations, BAYESIAN INFERENCE, and information theory to uncover the connectivity of the network; they find that differential equations and Bayesian inference were better at uncovering the functional relationships than the information theory-based approach, as expected for such a small network. Cantone et al. thus provide an example of how synthetic circuits can be helpful in refining our understanding of large-scale biological systems by improving the algorithms we use to analyse genomic and proteomic datasets.
To make the lessons concrete, we focus on how biological parts can be arranged to create a biologically relevant dynamical system: an oscillator (Box 3). Cells display a range of oscillatory behaviors. Some oscillators have tunable periods, such as the cell cycle, whose period depends on the nutrient levels available, whereas others are more robust to changes in parameters, such as the circadian oscillator. Other examples include nuclear factor kappa B (NFkB) signaling, which oscillates to control gene expression86, and the p53-murine double minute 2 (Mdm2) system, which oscillates to drive the DNA damage response87.
The simplest way to achieve oscillation is through use of a delayed negative feedback loop105. Imagine that you construct a system with two genes, A and B, and that protein A activates the transcription of B whereas protein B inhibits the transcription of A. Turning on gene A leads to build up of the protein A, but also of protein B. After some time, enough protein B builds up to cause protein A levels to decrease – this then results in a decrease in protein B levels, which allows protein A levels to rise, and so on.
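The two-gene loop described above can be written down as a pair of rate equations. The sketch below is one possible formalization, assuming Hill-type repression of A by B, linear activation of B by A, and equal decay rates; the parameter values are hypothetical. It contains no explicit delay, so in this deterministic form protein A overshoots and then rings down to a steady level rather than oscillating indefinitely.

```python
def simulate_two_gene_loop(alpha=10.0, hill_n=4, t_end=30.0, dt=0.01):
    """A activates B (linearly); B represses A through a Hill function.
    Units are scaled so that both proteins decay at rate 1."""
    a, b = 0.0, 0.0
    a_trace, b_trace = [a], [b]
    for _ in range(round(t_end / dt)):
        da = alpha / (1.0 + b ** hill_n) - a   # production of A, repressed by B
        db = a - b                             # production of B, activated by A
        a, b = a + dt * da, b + dt * db        # forward-Euler step
        a_trace.append(a)
        b_trace.append(b)
    return a_trace, b_trace

a_trace, b_trace = simulate_two_gene_loop()
print(f"peak A = {max(a_trace):.2f}")
print(f"final A = {a_trace[-1]:.2f}, final B = {b_trace[-1]:.2f}")
```

Sustained oscillation in a loop of this kind requires a sufficiently long delay or additional intermediate steps, which is one way to read the difficulties encountered by the simplest synthetic designs.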
However, when one builds a simple negative feedback circuit as described above, the oscillations are in general not robust. In the repressilator of Elowitz and Leibler106, which consists of a cycle of 3 transcriptional repressors and a fluorescent protein readout (panel A), the oscillators fall out of phase and damp out following a small number of cycles. Swinburne et al. engineered an autoinhibitory circuit in which the delay timescale in the negative feedback was set by the length of an intron engineered into the construct; they also found that even for a given intron length the oscillation period varies widely from cell to cell107. The source of the damping in both cases can be found in the stochastic nature of gene expression: random amounts of protein produced at random times result in uncoordinated behavior that causes the components making up the oscillator to fall out of phase. The synthetic genetic oscillator was missing a key ingredient.
A strong hint as to the identity of that key ingredient was provided by the analysis of naturally occurring oscillators. In particular, the cell cycle oscillator contains interlocking positive feedback loops in addition to the core negative feedback loop that was generally assumed to generate the oscillations (panel B). Experiments in the cell cycle of frog embryos along with computational simulations suggested that the positive feedback loops could stabilize two states that the system would cycle between via the negative feedback loop108–110, creating a RELAXATION OSCILLATOR. Could something as simple as positive feedback be responsible for robustness in genetic oscillators in organisms as diverse as bacteria and mammals? And can positive feedback enable cells to independently tune the amplitude and frequency of the oscillations?
Two recent studies, in agreement with earlier work111, indicate that coupling positive and negative feedback is indeed sufficient to ensure stable oscillations. Stricker et al. implemented a transcriptional circuit in E. coli that integrates a positive and negative feedback loop in a common inducible promoter88 (panel C), while Tigges et al., working in mammalian cells, used transcriptional positive feedback and negative feedback mediated by transcription of an antisense RNA89.
Experimentally, Stricker et al. observe that the dual feedback oscillator is robust to a number of perturbations, including changes in inducer level and temperature; these features could not be adequately described by their initial modeling of this circuit112. It was only through the addition of various biological steps in the negative feedback, such as TF-DNA binding and multimerization, that the model could reproduce the robustness of the oscillator to parameter changes. The authors conclude that from the point of view of the oscillator’s operation, what matters is not the details of what processes make up the negative feedback but instead that the negative feedback includes a delay; by contrast, the positive feedback only ensures robustness and tunability.
The system built by Tigges et al. shares many of these details, with the delay in the negative feedback coming from post-transcriptional repression of the circuit’s transcriptional activator, but the system itself is sensitive to molecular details such as the relative ratios of the circuit components – for some ratios of circuit components oscillations are abolished.
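The relaxation-oscillator idea discussed in this box can be illustrated with a generic textbook model. The sketch below uses FitzHugh-Nagumo-type equations as a stand-in: a fast, self-reinforcing variable `v` (positive feedback) coupled to a slow inhibitory variable `w` (negative feedback). It is not a model of the circuits of Stricker et al. or Tigges et al., and the parameter values are standard illustrative choices rather than measured quantities. Setting `positive_feedback=False` replaces the self-reinforcing term with plain decay, for comparison.

```python
def simulate(positive_feedback=True, t_end=400.0, dt=0.01,
             eps=0.08, a=0.7, b=0.8, drive=0.5):
    """FitzHugh-Nagumo-type model: fast variable v, slow negative feedback w.
    With positive_feedback=True, v is self-reinforcing (the v - v**3/3 term);
    with False, that term is replaced by plain decay (-v)."""
    v, w = 0.0, 0.0
    trace = [v]
    for _ in range(round(t_end / dt)):
        self_term = v - v ** 3 / 3.0 if positive_feedback else -v
        dv = self_term - w + drive          # fast dynamics
        dw = eps * (v + a - b * w)          # slow negative feedback
        v, w = v + dt * dv, w + dt * dw     # forward-Euler step
        trace.append(v)
    return trace

def late_swing(trace):
    """Peak-to-trough range of v over the second half of the run."""
    tail = trace[len(trace) // 2:]
    return max(tail) - min(tail)

def upward_crossings(trace, level=0.0):
    """Number of times v rises through `level` in the second half of the run."""
    tail = trace[len(trace) // 2:]
    return sum(1 for x, y in zip(tail, tail[1:]) if x < level <= y)

with_pf = simulate(positive_feedback=True)
without_pf = simulate(positive_feedback=False)
print(f"with positive feedback:    swing = {late_swing(with_pf):.3f}, "
      f"cycles = {upward_crossings(with_pf)}")
print(f"without positive feedback: swing = {late_swing(without_pf):.3g}")
```

In this toy system the version with the self-reinforcing term keeps switching between a high and a low state, while the version without it relaxes to a fixed level. Real genetic circuits add delays and noise that this two-variable sketch leaves out.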
How can one construct a robust yet tunable oscillator in a living cell? The construction of in vivo oscillators provides a particularly beautiful example of how the interplay of analysis of naturally occurring systems, modeling, and construction of synthetic systems can yield deep insights into biological phenomena. The story here begins with the observation that the simplest oscillator design, a delayed negative feedback, cannot sustain oscillations beyond a small number of periods when operating in a cell. Instead, as highlighted in Box 3, naturally occurring oscillators hinted at the crucial role of interlocking positive feedback in maintaining a robust oscillator, which was employed in the genetic oscillators recently synthesized by Stricker et al.88 and Tigges et al.89.
As the studies in Box 3 show, oscillators, in addition to being fun to watch, are among the simplest in vivo systems that can be used to understand interactions between different types of feedback loops. While systems biologists are increasingly comfortable with our understanding of simple motifs, we cannot say the same about interactions of those motifs. It is worth considering, for example, that even for interlocking positive and negative feedback loops multiple behaviors are possible as one varies the parameters of the system and includes stochastic effects. For example, in the yeast galactose utilization pathway, the negative feedback effectively counteracts the positive feedback and limits the parameter space over which the system is bistable90. Beyond two or three loops, however, we are usually at a loss to describe the system, especially a natural one that may contain even more interactions than are being accounted for. Synthetic circuits are helping us systematically understand how motifs interact to generate ever-richer behavior.
If there is one context in which all of the various biological processes tackled by synthetic biologists come together it is in the engineering of spatiotemporal interactions, both intracellular and intercellular. Engineering cell-cell interactions in a rational manner requires us to master the manipulation of communication devices (signaling pathways), to use promoters to specify desired transcriptional responses to a given signal strength, and to arrange these elements in a circuit architecture that robustly encodes the function we are trying to implement. If we hope to systematically build up our understanding of functional compartments of the cell, development, and ecology then it is imperative that we integrate lessons learned from diverse areas of synthetic biology.
Perhaps the most striking feature of the eukaryotic cell is its organization into functional subcompartments: the nucleus for genetic material, mitochondria for respiration, endoplasmic reticulum (ER) for protein production, etc. For the eukaryotic cell to accomplish its tasks, the behavior of these compartments must be coordinated in space and time. A recent study in S. cerevisiae has yielded new insight into how the mitochondrion and ER communicate, by using a genetic screen coupled with a synthetic construct that is designed to specifically tether the two organelles91. Kornmann et al. find that the synthetic tether complements mutations in maintenance of mitochondrial morphology 1 (Mmm1), mitochondrial distribution and morphology 10 (Mdm10), 12 (Mdm12), and 34 (Mdm34), thus identifying these 4 proteins as constituents of a complex that ties the organelles together and allows the exchange of phospholipids (needed by the mitochondrial membranes) and calcium (which acts as a signaling molecule between the two).
Two properties that we still cannot reliably engineer are the dynamics of a circuit and spatial control. Both these behaviours have one major biological process in common: development. In anticipation of one day tackling developmental processes and other intercellular pathways, some groups have designed circuits to spatiotemporally control gene expression. Using a network mimicking naturally occurring feedforward circuits, for example, Basu et al. have designed cells that can respond to the signal acyl-homoserine lactone (AHL) from nearby cells but ignore equal concentrations of this signal from faraway cells92. This feat is accomplished by a key property of the feedforward network in the signal receiving cells: it responds not only to the concentration of the signal but also to the rate of increase of that concentration. Sending cells located near the receiving cells raise the local AHL concentration more rapidly than distant sending cells do. Basu et al. built on this work to create a circuit that could respond to only a narrow range of AHL signal, much like a band-pass filter, thereby exhibiting another feature of developmental processes93.
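The band-filter behavior can be illustrated with a minimal steady-state sketch. It assumes an incoherent feedforward arrangement in which the signal switches the output on above a low threshold and, through an intermediate repressor, switches it off again above a high threshold. The Hill-function form, the thresholds `k_on` and `k_off`, and the Hill coefficient are hypothetical choices for illustration, not the parameters or the exact wiring of the circuits of Basu et al., and the sketch does not capture the rate-sensing behavior described above.

```python
def band_pass_response(signal, k_on=1.0, k_off=100.0, hill_n=2):
    """Steady-state output of an incoherent feedforward arrangement:
    the signal switches the output on above k_on and, through an
    intermediate repressor, switches it off again above k_off."""
    activation = signal ** hill_n / (k_on ** hill_n + signal ** hill_n)
    repression = k_off ** hill_n / (k_off ** hill_n + signal ** hill_n)
    return activation * repression

# Sweep the signal over six orders of magnitude (arbitrary units).
for s in (0.01, 0.1, 1, 10, 100, 1000, 10000):
    print(f"signal = {s:>8}: output = {band_pass_response(s):.4f}")
```

With these assumed thresholds the output is close to its maximum only for signals between roughly `k_on` and `k_off`, and falls toward zero on either side.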
The exquisite coordination that is a hallmark of development also almost certainly requires the use of networks that can act as genetic timers and counters. Friedland et al. have provided a design for a network that constitutively pumps out GFP mRNA transcripts that are translationally inhibited but whose inhibition can be lifted by a trans-activating RNA (taRNA)94; the transcription of the taRNA is inducible by arabinose and so the network output, in the form of discrete amounts of GFP, represents pulses of arabinose. Finally, Isalan et al. have gone as far as building a mock-up of a realistic D. melanogaster embryo, modeling the SYNCYTIUM as a collection of paramagnetic beads coated with DNA, in which genetic networks analogous to the gap gene system can be placed95. Interestingly, this ‘minimal embryo’ leads the authors to suggest that pattern formation in the real embryo requires activator molecules to propagate faster than inhibitors, implying that the gap system is a REACTION-DIFFUSION system that uses a mechanism quite unlike Turing instabilities to lay down patterns. As the authors point out, this is hardly surprising given that the gap system uses nonhomogeneous initial conditions in the form of spatially localized components deposited in the insect egg and as the activator is not autocatalytic. Whether these lessons carry over to their natural settings remains to be seen.
As is the case with the band filter circuits described above, most synthetic circuits involved in cell-cell communication make use of the QUORUM SENSING PATHWAY (Figure 3a)96. These circuits usually borrow components from organisms like Vibrio fischeri, although attempts at incorporating other systems have also been successful97,98. Examples of using such systems to study natural phenomena are more limited. Balagadde et al., by adapting an earlier design99, used the quorum sensing proteins to drive expression of an antibiotic to create a synthetic predator-prey system100, while Brenner et al. used a similar system to study the ability of cells to signal in the context of a biofilm101. Chuang et al. recently have used engineered circuits for cell-cell interactions to study the evolutionary phenomenon of Simpson’s paradox (Figure 3b), in which the cells that provide a useful product to the population make up a diminishing fraction of the population but nevertheless increase in absolute number by promoting population growth102. Gore et al. provide another example of synthetic ecology in their study of the evolutionary game dynamics underlying sucrose metabolism in yeast103. The study establishes that sucrose metabolism can be thought of as a snowdrift game, in which both cells that metabolize sucrose (cooperators) and those that do not (cheaters) stably coexist in a population, thereby opening an avenue to show how competition between different alleles can actually promote diversity in a population.
Studies such as these on fundamental aspects of ecology and evolution are difficult to carry out in natural environments due to the multiplicity of confounding factors, but synthetically engineered populations provide a way to cleanly separate different effects. Studies on engineered populations highlight not only the ability to connect the molecular details of a network to population level effects but also the utility of abstracting away from such details and focusing on general cell-cell interactions. Taking sucrose metabolism from Gore et al. as an example, it was possible to predict population level responses to changes in the cost of cooperation just on the basis of the game theoretic characterization of the interaction between cheaters and cooperators, with no direct knowledge of the molecular details. Indeed, this approach of constructing synthetic systems dedicated to characterizing how cells interact can be very useful in cases such as cancer dynamics, where the underlying molecular details are either poorly understood or exceedingly complicated but population level measurements are both feasible and relevant to understanding the phenomenon.
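The kind of prediction that a game theoretic description allows can be sketched with the textbook snowdrift game. The example below assumes the standard payoff scheme (a shared benefit, with the cost split between two cooperators or borne alone by a lone cooperator) and replicator dynamics. The `benefit` and `cost` values are hypothetical, and this is not the payoff structure measured by Gore et al.

```python
def cooperator_payoff_advantage(x, benefit, cost):
    """Average payoff of a cooperator minus that of a cheater when a
    fraction x of the population cooperates (textbook snowdrift payoffs)."""
    cooperator = x * (benefit - cost / 2.0) + (1.0 - x) * (benefit - cost)
    cheater = x * benefit
    return cooperator - cheater

def equilibrium_fraction(x0, benefit, cost, t_end=200.0, dt=0.01):
    """Integrate the replicator equation dx/dt = x (1 - x) * advantage."""
    x = x0
    for _ in range(round(t_end / dt)):
        x += dt * x * (1.0 - x) * cooperator_payoff_advantage(x, benefit, cost)
    return x

def predicted_fraction(benefit, cost):
    """Cooperator fraction at which both strategies earn the same payoff."""
    return (benefit - cost) / (benefit - cost / 2.0)

from_rare = equilibrium_fraction(0.1, benefit=1.0, cost=0.5)
from_common = equilibrium_fraction(0.9, benefit=1.0, cost=0.5)
higher_cost = equilibrium_fraction(0.5, benefit=1.0, cost=0.8)
print(f"cost 0.5: {from_rare:.4f} and {from_common:.4f}, "
      f"predicted {predicted_fraction(1.0, 0.5):.4f}")
print(f"cost 0.8: {higher_cost:.4f}, predicted {predicted_fraction(1.0, 0.8):.4f}")
```

In this sketch, populations that start with mostly cheaters or mostly cooperators converge on the same mixed composition, and raising the cost of cooperation lowers the equilibrium cooperator fraction, all without reference to any molecular mechanism.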
The synthetic biology community has made great strides in working out some of the most basic features of regulatory networks and cellular pathways. We are exerting greater control over the process of gene expression, and we have a wealth of information regarding the effects of network topology on system function. Topological details such as connectivity, cascade length, and feedback structure have been explored. But there is much work yet to do before we can treat biological circuits like we treat electronic ones (Box 4).
In the future, we can expect the synthetic circuits deployed in cells to grow in complexity and to integrate multiple cellular processes, as has been done for genetic regulation and metabolism83. We should also expect to see increasing contact with large-scale cell biology, such as through the creation of synthetic organelles, whose in vivo construction will be guided by synthetic regulatory networks. Progress along these fronts is limited by many of the same obstacles found across the subdisciplines of biology: we still need more ways to specifically modulate the expression level of genes of interest and the activity state of pathways of interest, and we require more sensitive techniques (ideally at single-molecule resolution) to measure the abundance of mRNAs, proteins and specifically modified proteins in live cells.
One of the main ways in which methodological advances will be useful is in tightly constraining models of biological networks. Obstacles to rapidly moving synthetic circuits from the blackboard to the cell can often be traced to the fact that the system under study does not behave as initial modeling indicates. This, in turn, is usually due to the fact that the systems are underdetermined, meaning that many different models can usually describe the circuit data. Higher resolution data, both in terms of abundances of the relevant molecules and as a function of time, will constrain the space of possible models significantly and should allow for more rational, predictable design processes.
Assuming these technical obstacles are overcome, in a future where man-made circuits increasingly look like their byzantine natural counterparts, it is not unreasonable to expect nearly synthetic or fully synthetic cells to make their appearance. At these extreme levels of complexity, it may prove difficult or even unhelpful to mechanistically model the relevant systems. It is likely, however, to prove useful to compare the performance of natural and synthetic circuits and cells in a rigorous fashion – perhaps through the formulation of a Turing test for synthetic biology – as differences in performance can point to possible design principles.
Looking back on the various examples of circuits and processes that synthetic biologists have examined, we can see that the utility of synthetic circuits can be measured along 3 different dimensions. First, synthetic circuits can serve as easily manipulable toy models that we can characterize in exacting quantitative detail and thereby build intuition for how similarly structured natural networks operate. Second, synthetic circuits can be used to allow us control over natural networks and so make discoveries about the molecular and cell biology underlying important physiological processes. Third, on a more conceptual level, synthetic systems provide clear evidence that one can generate complexity by rearranging even well-known parts, thus bolstering claims on the evolvability of natural systems.
While we are still very far from rationally assembling a living organism from scratch, and far from understanding all the design principles according to which biological networks operate, the first generation of synthetically designed systems has offered us a glimpse of the need to weave together tools from disparate processes, from transcriptional regulation to signal transduction, in order to approach fundamental questions in modern biology.
Shankar Mukherji is a graduate student in the Harvard-Massachusetts Institute of Technology (MIT) Division of Health Sciences and Technology in Cambridge, USA, working in the laboratory of Alexander van Oudenaarden at the MIT Departments of Physics and Biology. His research focuses on quantitative studies of gene expression and signal transduction. He received his undergraduate degrees in physics and mathematics from MIT.
Alexander van Oudenaarden is a Professor of Physics and Biology at the Massachusetts Institute of Technology (MIT) in Cambridge, USA. He obtained his Ph.D. in Physics in 1998 at Delft University of Technology in the Netherlands. He was a postdoctoral fellow at Stanford University, California (USA), where he worked with Steven Boxer and Julie Theriot. He joined the faculty at MIT in 2000. His research focuses on how single cells use gene and protein networks to accurately process intra- and extracellular signals. His laboratory is particularly interested in understanding stochastic gene expression and systems biology at the single-cell level. Current efforts in van Oudenaarden’s group are focused on developing an integrated theoretical and experimental approach to understand the role of stochastic gene expression during development and differentiation.