With the increasing amount of experimental data on gene expression and regulation, there is a growing need for quantitative models to describe the data and relate them to their respective context. Thermodynamic models provide a useful framework for the quantitative analysis of bacterial transcription regulation. This framework can facilitate the quantification of vastly different forms of gene expression from several well-characterized bacterial promoters that are regulated by one or two species of transcription factors; it is useful because it requires only a few parameters. As such, it provides a compact description useful for higher-level studies (e.g. of genetic networks) without the need to invoke the biochemical details of every component. Moreover, it can be used to generate hypotheses on the likely mechanisms of transcriptional control.
Biology is undergoing a transformation from a `component-centric' focus on the individual parts toward a `system-level' focus on how a limited number of parts work together to perform complex functions. For gene regulation, this theme has been discussed extensively in the context of simple genetic circuits [1•,2–4] in addition to complex developmental networks. The functional properties of a genetic circuit often critically depend on the degree of cooperativity (see Glossary) in the interactions between the molecular components. For gene regulation, this cooperativity is dictated to a large extent by the architecture of the cis-regulatory region (see Glossary) and the specific mechanism of transcriptional activation or repression [8••], which is mediated through interactions among various transcription factors (TFs) and the RNA polymerase (RNAP) complex. Often, even qualitative features of a gene circuit (e.g. whether a circuit can be bistable or whether it can spontaneously oscillate) cannot be determined without quantitative knowledge of the transcriptional regulation of key genes in the circuit.
Predicting the expression level of genes directly from the underlying biochemistry and biophysics is a difficult task. This is due most notably to ignorance of many biochemical parameters, especially their relevant in vivo values. However, the thermodynamic model reviewed in the preceding article [9••] yields several general mathematical forms for the dependence of the fold-change in gene expression on the concentration(s) of the TF(s) regulating transcription. These general forms contain only a few parameters characterizing the effective interactions between the molecular players. Thus, from a practical standpoint, it is expedient to quantify the transcriptional regulation of a gene by fitting expression data to the appropriate model function in order to obtain effective parameters that best describe the promoter [10,11]. This procedure might be useful even when the simplifying assumptions made by the thermodynamic models are not satisfied [9••]. By analyzing gene expression data within the thermodynamic framework, one can elucidate whether an assumed set of interactions between TFs and RNAP can consistently explain the data. Failure of the analysis can suggest important missing ingredients, such as unknown mechanisms of cooperativity, whereas success can lead to predictions for new experiments (e.g. how operator deletion would affect gene expression).
There has been much recent progress in understanding the mechanistic aspect of bacterial gene regulation [8••]. However, the systematic quantification of gene expression is still in its infancy. In this paper, we review several experimentally characterized cis-regulatory systems in bacteria. For each case, we provide what we believe to be the most appropriate form for the dependence of the promoter activity (see Glossary) on the TF concentration(s). For each system, we show graphically how the expected form depends on the effective parameters. We hope to demonstrate how the thermodynamic models can provide a direct link between the arrangements of interactions in a promoter region and the quantitative characteristics of gene expression.
Our quantitative discussion focuses on several well-characterized bacterial promoters controlled by one or two species of TFs. We use the results of the thermodynamic model listed in Table 1 of the preceding paper [9••], which we refer to as Table 1 throughout this review. We make the additional simplifying assumption that the in vivo promoters are weak, so that even at full activation the equilibrium gene expression is still small (e.g. <10% of the strongest promoters). Indeed, for a large number of bacterial promoters, expression in the exponential growth phase is small when compared with, for example, the expression of the ribosomal genes, which are fully turned on. In this weak promoter limit, the fold-change in promoter activity (henceforth simply referred to as `fold-change') is given directly by the regulation factor (Freg) listed in Table 1. We will consider two types of activators: those that recruit RNAP to its promoter, and those that stimulate the transition rate of bound RNAP from a closed to an open complex. Even though the latter is a kinetic effect, its impact on the overall promoter activity (e.g. transcription initiation rate) can, nevertheless, be effectively described by the thermodynamic model in the weak promoter limit that we study.
The simplest example of activation involves the binding of an induced TF to a single operator site, and the subsequent recruitment of RNAP. This is the case with the lac promoter of E. coli, shown in Figure 1a (in the absence of the lac repressor). The activating TF is a CRP (cAMP receptor protein) dimer in complex with the inducer cAMP [13,14]. We will denote this complex by CRP2* and use * to indicate the activated form of a TF. Case 2 in Table 1 gives the mathematical form of the expected fold-change for this situation with [A] = [CRP2*], and Figure 1b plots its dependence on the induced dimer concentration. The two parameters of the model are the effective in vivo dissociation constant (KA) between CRP and the operator, and the enhancement factor (f), which characterizes the degree of stimulation in transcription resulting from operator-bound CRP. These are readily revealed in a log–log plot of the relative promoter activity against the cellular concentration of the induced activator, [CRP2*]. As long as the range of [CRP2*] probed is sufficiently broad, one can read the enhancement factor (f) from the graph as the maximal fold-change between full activation at saturating levels of [CRP2*] and basal activity at low levels of [CRP2*]. One can also read off the effective dissociation constant (KA) as the value of [CRP2*] at half-activation. The steepness of the transition region, called the `sensitivity' (or `gain') in the literature (see Glossary) [15•], plays an important role in the function of genetic circuits. Here, we quantify transcriptional sensitivity by the log–log slope (s) at the mid-point of the transition region. For promoters containing a single operator, s ≤ 1, and s approaches 1 only for very large values of f. In contrast, functions such as amplification, bistability or spontaneous oscillation all require circuit components to have high sensitivity, with a value of s > 1.
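The dependence of the sensitivity on f can be checked numerically. The sketch below assumes that the single-activator fold-change has the form (1 + f·[A]/KA)/(1 + [A]/KA); this form is not quoted in the text above but is consistent with the limits described (fold-change 1 at low [A], f at saturation). The function names and the arbitrary concentration unit are illustrative choices.

```python
import math


def fold_change_single_activator(a, K_A, f):
    """Assumed single-activator form: (1 + f*a/K_A) / (1 + a/K_A)."""
    x = a / K_A
    return (1.0 + f * x) / (1.0 + x)


def loglog_slope(a, K_A, f, rel_step=1e-4):
    """Numerical d ln(fold-change) / d ln(a) by central difference."""
    hi = a * (1.0 + rel_step)
    lo = a * (1.0 - rel_step)
    d_log_fc = math.log(fold_change_single_activator(hi, K_A, f)) - math.log(
        fold_change_single_activator(lo, K_A, f)
    )
    return d_log_fc / (math.log(hi) - math.log(lo))


def midpoint_sensitivity(f):
    """Closed-form slope at a = K_A / sqrt(f), where the fold-change equals
    sqrt(f), i.e. the mid-point of the transition on a log scale."""
    r = math.sqrt(f)
    return (r - 1.0) / (r + 1.0)


if __name__ == "__main__":
    K_A = 1.0  # arbitrary concentration unit (assumption)
    for f in (11.0, 100.0, 1e4):
        a_mid = K_A / math.sqrt(f)
        print(f, loglog_slope(a_mid, K_A, f), midpoint_sensitivity(f))
```

Under this assumed form the mid-point slope is (√f − 1)/(√f + 1), which stays below 1 for any finite f and approaches 1 only as f becomes very large.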
TFs often have domains that enable interaction with one another when bound to adjacent operator sites, and this interaction can result in cooperativity in transcriptional activation. The PRM promoter of phage lambda, shown in Figure 2a, is such an example [1•]. Binding of the dimeric lambda repressor cI to the operator OR2 (the `activator' site) stimulates transcription, and binding of cI to the upstream operator OR1 (the `helper' site) helps to recruit cI to OR2. The expected fold-change (Case 3 in Table 1 with [A] = [H] = [cI2], KH = KR1 and KA = KR2) depends on the affinities KR1 and KR2 of cI to the two operators, the cooperative interaction (ω) between the two operator-bound cI dimers, and the enhancement factor f due to the OR2-bound cI. It is shown in the log–log plot of Figure 2b (thick solid line) as a function of [cI2]/KR2.
To quantify the possible role of the auxiliary operator OR1, we also plot in Figure 2b the fold-change for different ratios of KR1 and KR2. Comparing these curves, it is clear that the auxiliary operator OR1 does not change the degree of full activation, given by f. The most significant feature of this dual-activator system is perhaps the increase in the log–log slope of the transition region (compared with the extreme cases) for intermediate values of KR2/KR1. In fact, for the realistic parameter of KR2/KR1 ≈ 25 (thick solid line in Figure 2b), we have a sensitivity of s ≈ 0.93. This is close to the maximum attainable for this system, with its small enhancement factor (f ≈ 11), and is nearly double the maximum sensitivity (s ≈ 0.54) for the promoter with OR2 only (thin solid line in Figure 2b). For TFs with larger values of ω and f, this cis-regulatory construct can, in principle, provide more sensitivity, with s approaching 2.
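The effect of the helper site on the sensitivity can be explored with a short numerical sketch. It assumes a four-state form for the activator-plus-helper fold-change, (1 + f·a + h + f·ω·a·h)/(1 + a + h + ω·a·h) with a = [cI2]/KR2 and h = [cI2]/KR1, which is not quoted in the text above. The value ω = 100 is an illustrative assumption (the text does not state ω); f = 11 and KR2/KR1 = 25 are taken from the text.

```python
import math


def fold_change_helper_activator(x, k_ratio, f, omega):
    """Assumed four-state form; x = [cI2]/K_R2 and k_ratio = K_R2/K_R1."""
    a = x            # weight of cI bound at the activator site OR2
    h = k_ratio * x  # weight of cI bound at the helper site OR1
    numerator = 1.0 + f * a + h + f * omega * a * h
    denominator = 1.0 + a + h + omega * a * h
    return numerator / denominator


def max_loglog_slope(k_ratio, f, omega, x_min=1e-6, x_max=1e3, n=6000):
    """Largest secant slope of ln(fold-change) vs ln(x) on a log-spaced grid."""
    log_min = math.log(x_min)
    step = (math.log(x_max) - log_min) / n
    best = 0.0
    previous = math.log(fold_change_helper_activator(x_min, k_ratio, f, omega))
    for i in range(1, n + 1):
        x = math.exp(log_min + i * step)
        current = math.log(fold_change_helper_activator(x, k_ratio, f, omega))
        best = max(best, (current - previous) / step)
        previous = current
    return best


if __name__ == "__main__":
    f, omega = 11.0, 100.0  # omega is an assumed, illustrative value
    for k_ratio in (0.0, 1.0, 25.0, 1e4):
        print(k_ratio, round(max_loglog_slope(k_ratio, f, omega), 3))
```

Setting k_ratio to 0 removes the helper site and recovers the single-operator case, so the same function can be used to compare the two promoter architectures.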
The same cis-regulatory design can be used to implement co-activation, one of the simplest forms of signal integration (see Glossary), if the two operators are targets of two distinct TF species. A possible example of this is the variant of E. coli's melAB promoter studied by Wade et al. (see Figure 3a), where transcription is stimulated by an induced MelR dimer bound to the weak proximal operator, O2. Meanwhile, CRP bound to the upstream operator O1 helps recruit MelR but does not directly participate in activation. Assuming that the induction of MelR by melibiose results in an increase in MelR-operator binding affinity, we expect the form of the co-dependence to be given by Case 3 in Table 1, but with [A] = [MelR2*], [H] = [CRP2*] and KH = K1, KA = K2. The fold-change is plotted against the induced CRP concentration on the log–log plot of Figure 3b for different concentrations of the induced MelR. To better visualize the co-dependence on CRP and MelR, it is useful to plot the fold-change as a three-dimensional plot; see Figure 3c. The transition region (the yellow band) is clearly dependent on both TFs. Consider a simplified situation where CRP and MelR can each take on two possible concentrations, a pair of `low' and `high' values. Then it is possible to choose the pair of concentrations (e.g. those marked by the four open circles in Figure 3c) such that the fold-change is large (the green region) only when both concentrations are `high'. This mimics a logical AND function of the two inputs [17•]. It is also possible to choose the pair of concentrations as marked by the four solid circles such that the fold-change is large (the green region) unless both concentrations are `low'. The latter choice mimics a logical OR function. The flexibility of this cis-regulatory scheme makes the shape of the fold-change readily evolvable (e.g. between the AND/OR functions) by merely altering the operator sequences that encode the values of K1 and K2.
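The AND-like and OR-like behaviour can be illustrated with a small sketch. It assumes the same four-state activator-plus-helper form as for a single TF species, now with independent inputs a = [MelR2*]/K2 and h = [CRP2*]/K1. All numerical values (f, ω, the `low' and `high' levels, and the threshold that separates `small' from `large' fold-change) are hypothetical choices for illustration, not measured parameters of the melAB promoter.

```python
def fold_change_coactivation(a, h, f, omega):
    """Assumed four-state form; a = [MelR2*]/K2 (activator), h = [CRP2*]/K1 (helper)."""
    numerator = 1.0 + f * a + h + f * omega * a * h
    denominator = 1.0 + a + h + omega * a * h
    return numerator / denominator


def truth_table(a_levels, h_levels, f, omega, threshold):
    """Map (a_is_high, h_is_high) to whether the fold-change exceeds threshold.

    a_levels and h_levels are (low, high) pairs in units of K2 and K1.
    """
    table = {}
    for a_is_high in (False, True):
        for h_is_high in (False, True):
            a = a_levels[1] if a_is_high else a_levels[0]
            h = h_levels[1] if h_is_high else h_levels[0]
            table[(a_is_high, h_is_high)] = (
                fold_change_coactivation(a, h, f, omega) > threshold
            )
    return table


if __name__ == "__main__":
    f, omega, threshold = 100.0, 1000.0, 10.0  # hypothetical values
    # Both inputs well below their K values: only the cooperative state is active.
    and_like = truth_table((3e-4, 3e-2), (3e-4, 3e-2), f, omega, threshold)
    # Higher operating points: either input alone is enough.
    or_like = truth_table((5e-3, 10.0), (1e-3, 10.0), f, omega, threshold)
    print("AND-like:", and_like)
    print("OR-like: ", or_like)
```

The two tables are produced by the same function with the same f and ω; only the operating concentrations relative to K1 and K2 differ, which is the sense in which changing the operator affinities can switch between the two logic functions.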
An alternative mechanism for co-activation is synergistic or dual activation [19–21], where two operator-bound TFs can simultaneously contact different subunits of RNAP and activate transcription. This mechanism is limited to TFs that can activate transcription at different locations relative to the core promoter. Prominent examples of such synergistic activation in the bacterial literature [19–25] all involve the activator CRP because it can recruit RNAP from multiple locations at varying distances upstream of the promoter [8••,26].
The synthetic promoter studied by Joung et al. contained two operators: one for cI proximal to the core promoter (O2) and the other for CRP at an upstream operator (O1) (see Figure 4a). The data from the study by Joung et al. support the model where each operator-bound activator can independently interact with RNAP and enhance transcription. The expected fold-change is given by Case 8 in Table 1 (with [A1] = [CRP2*], [A2] = [cI2], KA1 = K1, KA2 = K2 and ω = 1) and shown in the log–log plot of Figure 4b as a function of [CRP2*] for various cI concentrations. Note that, since ω = 1, the dependence of gene expression on [CRP2*] is independent of [cI2], except for an overall vertical shift. This is a reflection of the multiplicative nature of independent synergistic activation. An alternative way of visualizing the same result is the three-dimensional plot of Figure 4c.
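The multiplicative behaviour at ω = 1 can be made explicit with a short sketch. It assumes a four-state form in which each operator-bound activator contributes its own enhancement factor; at ω = 1 this form factorizes into a product of two single-activator terms. The enhancement factors and concentrations used below are hypothetical.

```python
import math


def single_site_factor(x, f):
    """Assumed single-activator fold-change, with x = [A]/K_A."""
    return (1.0 + f * x) / (1.0 + x)


def fold_change_independent(a1, a2, f1, f2):
    """Assumed four-state form for two independent activators (omega = 1).

    a1 = [CRP2*]/K1 and a2 = [cI2]/K2; f1 and f2 are the enhancement factors.
    """
    numerator = 1.0 + f1 * a1 + f2 * a2 + f1 * f2 * a1 * a2
    denominator = 1.0 + a1 + a2 + a1 * a2
    return numerator / denominator


if __name__ == "__main__":
    f1, f2 = 20.0, 5.0  # hypothetical enhancement factors
    for a2 in (0.1, 1.0, 10.0):
        # Vertical shift on a log scale relative to the curve without cI.
        shifts = [
            math.log10(fold_change_independent(a1, a2, f1, f2))
            - math.log10(fold_change_independent(a1, 0.0, f1, f2))
            for a1 in (0.01, 0.1, 1.0, 10.0)
        ]
        print(a2, [round(s, 6) for s in shifts])
```

For each cI level the printed shifts are the same at every CRP concentration, which is the vertical-shift property described above.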
In another experiment by Joung et al., both the proximal site (O2) and the distal site (O1) were engineered to bind CRP (see Figure 5a, left). An important result of these experiments was that the fold-change with both CRP operators is larger than the product of the fold-changes with one operator alone. This is not consistent with the independent recruitment assumption and suggests additional cooperativity (ω > 1). A possible mechanism proposed by Joung et al. is that DNA bending (see Figure 5a, right) induced by the CRP bound to the proximal operator O2 facilitates the upstream CRP interaction with RNAP, without any direct protein–protein interaction between the two TFs. This cooperative effect can be included in the thermodynamic model as shown in Case 8 of Table 1 (with [A1] = [A2] = [CRP2*], KA1 = K1, KA2 = K2 and ω > 1) regardless of the specific molecular mechanism. Similar to the case of activation by cI, the expression level is most sensitive when the K values for the two binding sites are equal. In Figure 5b, we plot the expected fold-change, with K1 = K2 and different values of ω. The extra cooperativity increases both the enhancement factor (ω · f1 · f2) and the sensitivity (s) of the transition region.
The simplest example of repression involves the binding of a TF to a single operator site that interferes with the binding of RNAP to the core promoter. This is the case in the truncated lac promoter (e.g. lacUV5), which has only the main operator, Om, of LacI located closely downstream of the core promoter (Figure 6a). The expected fold-change is given by Case 1 of Table 1, with [R] = [LacI4], KR = Km and only one unknown parameter, Km, characterizing the effective dissociation constant of the operator Om. Here, it is possible to compute Km [28••] directly from the experimental data of Oehler et al., because the cellular concentration of LacI was quantified. In fact, because Oehler et al. characterized gene expression at two distinct LacI concentrations, the two data points can be used to check the consistency of the thermodynamic model.
This analysis was performed for the three lac operator sequences O1, O2 and O3 studied by Oehler et al. (results shown in Figure 6b). We note that the Km values obtained, K1 ≈ 0.22 nM, K2 ≈ 2.7 nM and K3 ≈ 110 nM for the three operators, are significantly different from, for example, the results K1 ≈ 10⁻³ nM, K2 ≈ 10⁻² nM and K3 ≈ 0.016 nM to 1 nM obtained from in vitro assays [29–31]. These results underscore the fact that the relevant TF-operator binding constant for the thermodynamic model is not given by the in vitro measurement, even if the appropriate physiological conditions are used, but must be corrected by considering the interaction of the TF with the genomic background [9••,32]. Consistent with the theoretical expectation, the ratios of the K values are in reasonable agreement between the in vivo and the in vitro results. We note also that the expected range of promoter activities is much larger than those for the activator-controlled promoters described above. This follows from the strong excluded-volume interaction between the repressor and RNAP, such that more repressor proteins generally lead to stronger repression; whereas in activation more activator protein does not lead to more activation beyond the enhancement factor (f), which is set by the weak activator-RNAP interaction¹. By contrast, the sensitivity is still limited to s ≤ 1 with a single operator site.
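The way a single dissociation constant is extracted from a measurement, and checked against a second one, can be sketched as follows. The sketch assumes the simple-repression form 1/(1 + [R]/KR), which is not quoted in the text above but is consistent with the behaviour described (repression grows without bound as [R] increases, with log–log slope of magnitude at most 1). The repressor concentrations are hypothetical; only the value 0.22 nM is taken from the text.

```python
def fold_change_simple_repression(r, K_R):
    """Assumed single-operator repression form: 1 / (1 + r/K_R)."""
    return 1.0 / (1.0 + r / K_R)


def infer_dissociation_constant(r, fold_change):
    """Invert the assumed form: K_R = r / (1/fold_change - 1)."""
    if not 0.0 < fold_change < 1.0:
        raise ValueError("fold-change must lie strictly between 0 and 1")
    return r / (1.0 / fold_change - 1.0)


if __name__ == "__main__":
    K_true = 0.22             # nM, value quoted in the text for O1
    concentrations = (10.0, 50.0)  # nM, hypothetical repressor levels
    for r in concentrations:
        fc = fold_change_simple_repression(r, K_true)
        # With real data, fc would be the measured fold-change at this r.
        # Agreement of the inferred K across concentrations is the
        # consistency check described in the text.
        print(r, 1.0 / fc, infer_dissociation_constant(r, fc))
```

With measured fold-changes in place of the simulated ones, a large discrepancy between the two inferred K values would indicate that the single-operator form does not describe the promoter.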
For the wild-type lac promoter, the degree of repression exceeds 1000-fold with only ~10 repressor molecules in a cell. This is substantially larger than the <100-fold repression achievable by the best of the truncated promoters (Figure 6) at the same repressor concentration. The additional repression is facilitated by the stabilization of the Om-bound LacI tetramer, which can simultaneously bind to an auxiliary operator Oa through DNA looping (see Figure 7a). The wild-type lac promoter has two such auxiliary operators: O2 located 401 bases downstream and O3 located 92 bases upstream. We describe the simpler case studied experimentally by Oehler et al., which involves only repression and looping between the main operator, Om, and the downstream auxiliary operator, O2. The expected fold-change is given by Case 9 of Table 1, with [R] = [LacI4].
Given that the three K values are already determined (see Figure 6b), there is only one unknown parameter in this case in the form for the fold-change (Case 9 of Table 1). This is [L], the effective concentration of repressors that are made available, as a result of DNA looping, for binding to one of the two operators. This looping is itself caused by the binding of a repressor to the other operator. Oehler et al. did experiments with the main operator, Om, replaced by one of the three operator sequences (O1, O2 and O3), each at two concentrations of LacI. The results of all six experiments are described consistently by the expected fold-changes according to the thermodynamic model (see Figure 7b), with [L] ≈ 660 nM [28••].
Quantitatively, the strong repression effect (compare Figure 6b and Figure 7b) results directly from the large value of [L] generated by DNA looping, which amplifies the effective concentration of one operator-bound repressor 660-fold. This enhancement of the local repressor concentration is a result of the linkage between Om and Oa, as already described qualitatively elsewhere [27,33]. Intuitively, once a LacI tetramer binds to one of the two operators, it is available within a small volume for binding to the other. The actual value of [L] is clearly dependent on the spacing between the two operators, in addition to the energetics of bending the DNA backbone. We have deduced the dependence of [L] on operator spacing (shown in Figure 7d) by analyzing the data of Müller et al., who measured the fold-changes in repression for promoter constructs with different spacing between the main and auxiliary operators (see Figure 7c). In Figure 7c, we also show the predicted transcriptional fold-changes for the same constructs of Müller et al., but at different LacI concentrations.
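The role of [L] can be illustrated by enumerating the promoter states explicitly. The sketch below is an assumed state-weight model, not a form quoted in the text: the operators can be empty, singly occupied, occupied by two separate tetramers, or bridged by one looped tetramer with weight [R][L]/(Km·Ka), and RNAP can bind only when Om is free. The K values and [L] are those quoted in the text; the repressor concentration of 10 nM is a hypothetical choice.

```python
def fold_change_looping(r, K_m, K_a, L):
    """Assumed state-weight sketch of repression with DNA looping.

    r   : free repressor concentration (nM)
    K_m : effective dissociation constant of the main operator (nM)
    K_a : effective dissociation constant of the auxiliary operator (nM)
    L   : effective local repressor concentration due to looping (nM)
    """
    w_main = r / K_m                # tetramer on Om only (repressed)
    w_aux = r / K_a                 # tetramer on Oa only (Om still free)
    w_both = w_main * w_aux         # two separate tetramers (repressed)
    w_loop = r * L / (K_m * K_a)    # one tetramer bridging Om and Oa (repressed)
    permissive = 1.0 + w_aux        # states in which RNAP can access the promoter
    total = 1.0 + w_main + w_aux + w_both + w_loop
    return permissive / total


if __name__ == "__main__":
    r = 10.0               # nM, hypothetical repressor concentration
    K1, K2 = 0.22, 2.7     # nM, in vivo values quoted in the text
    L = 660.0              # nM, value quoted in the text
    print("repression without looping:", 1.0 / fold_change_looping(r, K1, K2, 0.0))
    print("repression with looping:   ", 1.0 / fold_change_looping(r, K1, K2, L))
```

Setting L to 0 removes the looped state, and the expression then reduces to single-operator repression at Om, so the two printed numbers isolate the contribution of looping in this sketch.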
Interaction between the TFs can also enhance the sensitivity in transcriptional repression. The PR promoter, which controls the expression of cro in phage lambda (illustrated in Figure 8a), is a good example of this mode of repression [1•]. When bound to either OR1 or OR2, the lambda repressor, cI, blocks the access of RNAP to the core promoter, thereby repressing transcription. The combined effect of two repressive operators, reinforced by the cooperative interaction between the operator-bound cIs, results in both further repression and enhanced sensitivity. The expected form of fold-change is given by Case 6 in Table 1 ([R1] = [R2] = [cI2]) and plotted in Figure 8b. The maximum log–log slope (i.e. the sensitivity) in repression is largest when KR1 and KR2 are equal. Similar schemes have been generalized for co-repression by two species of repressors [35–37], and can be used to mimic the logical NAND function [17•].
In fact, enhanced sensitivity in repression does not require direct interaction between the repressor molecules. An example is the PLtetO-1 promoter, which contains two operators of TetR; see Figure 8c. The expected form of the fold-change is given by Case 5 in Table 1, with [R1] = [R2] = [TetR2*], and KR1 = K1, KR2 = K2. By appropriately decreasing K1 and K2, it is possible to make the activity of this promoter (not shown) nearly identical to that represented by the solid line in Figure 8b (i.e. with the steepened slope) even though the TetR dimers do not contact each other physically. The enhanced sensitivity is expected here because of the `collaborative' nature of repression: the occupation of either operator is sufficient to block RNAP from the core promoter, leaving the other operator site available for binding for `free'. We expect that a similar construct where the two operators are targets of different, non-interacting TFs would implement co-repression. Comparing the activating and repressive modes of transcription control, we find repressive control to be advantageous because high sensitivity can be generated by TFs without the need of TF–TF interaction, and fold-changes are not limited by the magnitude of the (typically weak) TF–RNAP interaction.
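The comparison between collaborative and cooperative repression can be sketched numerically. The two forms below are assumptions consistent with the description (either occupied operator blocks RNAP): 1/((1 + r/K1)(1 + r/K2)) for non-interacting repressors and 1/(1 + r/K1 + r/K2 + ω·r²/(K1·K2)) for interacting ones. The values of K, ω and the concentrations are hypothetical.

```python
import math


def fold_change_independent_repressors(r, K1, K2):
    """Assumed form for two non-interacting repressor sites."""
    return 1.0 / ((1.0 + r / K1) * (1.0 + r / K2))


def fold_change_cooperative_repressors(r, K1, K2, omega):
    """Assumed form for two sites whose bound repressors interact (omega > 1)."""
    return 1.0 / (1.0 + r / K1 + r / K2 + omega * r * r / (K1 * K2))


def loglog_slope(func, r, rel_step=1e-4):
    """Numerical d ln(func) / d ln(r) by central difference."""
    hi = r * (1.0 + rel_step)
    lo = r * (1.0 - rel_step)
    return (math.log(func(hi)) - math.log(func(lo))) / (math.log(hi) - math.log(lo))


if __name__ == "__main__":
    K, omega = 1.0, 100.0            # hypothetical values, arbitrary units
    K_tight = K / math.sqrt(omega)   # tighter sites mimic the cooperative curve
    for r in (0.01, 0.1, 1.0, 10.0):
        coop = fold_change_cooperative_repressors(r, K, K, omega)
        indep = fold_change_independent_repressors(r, K_tight, K_tight)
        slope = loglog_slope(
            lambda c: fold_change_independent_repressors(c, K_tight, K_tight), r
        )
        print(r, coop, indep, round(slope, 3))
```

In this sketch the slope of the non-interacting construct approaches −2 at high repressor concentration, and the two curves agree closely there, differing mainly in the transition region.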
The mathematical description for the different activation and repression mechanisms discussed above can be summarized by very simple forms. For a single TF species with up to two operators in the cis-regulatory region, all of the fold-changes described in Table 1 can be compactly represented by the general form
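A form consistent with this description, a ratio of second-degree polynomials in [TF] with four effective parameters, can be written as follows. The coefficient symbols a_1, a_2, b_1 and b_2 are assumed labels introduced here for illustration.

```latex
\begin{equation}
\text{fold-change} \;=\;
\frac{1 + a_1\,[\mathrm{TF}] + a_2\,[\mathrm{TF}]^2}
     {1 + b_1\,[\mathrm{TF}] + b_2\,[\mathrm{TF}]^2}
\tag{1}
\end{equation}
```

For example, simple repression by a single operator would correspond to a_1 = a_2 = b_2 = 0 and b_1 = 1/K_R in this notation.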
Similarly, for co-regulation by two TFs with cellular concentrations, [TF1] and [TF2], and for no more than one operator each in the regulatory region, the fold-change has the form
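A form consistent with this description, with at most one operator per TF species and six effective parameters, can be written as follows. The coefficient symbols a_1, a_2, a_3, b_1, b_2 and b_3 are assumed labels introduced here for illustration.

```latex
\begin{equation}
\text{fold-change} \;=\;
\frac{1 + a_1\,[\mathrm{TF}_1] + a_2\,[\mathrm{TF}_2] + a_3\,[\mathrm{TF}_1][\mathrm{TF}_2]}
     {1 + b_1\,[\mathrm{TF}_1] + b_2\,[\mathrm{TF}_2] + b_3\,[\mathrm{TF}_1][\mathrm{TF}_2]}
\tag{2}
\end{equation}
```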
The general forms in Equation 1 and Equation 2 include many possible mechanisms of activation and repression not discussed above. If three binding sites for the TF are involved in the regulatory process, then Equation 1 or Equation 2 would be generalized to a ratio of third-degree polynomials of the [TF]s.
The above analysis indicates that, by quantitatively measuring the fold-change as a function of the activated TF concentration(s), we can achieve two important goals. (i) By fitting experimental results to an expression such as Equation 1 or Equation 2, one obtains a quantitative characterization of the promoter at all TF concentrations, but with only a few (e.g. four or six) parameters. This can be done regardless of the validity of the thermodynamic model itself. As discussed previously, the compact description will facilitate quantitative higher-level study of gene circuits. (ii) By comparing the values of these parameters to the expected forms according to the thermodynamic model (e.g. Table 1), one can generate hypotheses on the likely mechanisms of transcriptional control for further experiments. Thus, the form of the fold-change in gene expression itself can be an effective diagnostic tool to distinguish subtle mechanisms of transcriptional control.
We have illustrated a variety of promoter activities implemented in different cis-regulatory designs. Also illustrated are important functional differences (e.g. in transcriptional cooperativity, and in the nature of combinatorial control) among promoters characterized by different parameters of the same cis-regulatory construct. These differences often cannot be discriminated by the qualitative characterization of promoter activity predominantly practiced in molecular biology today (e.g. fold-change in gene expression caused by deletion of a regulatory protein). Instead, they call for more quantitative characterization, particularly the quantification of the TF concentrations — or their relative concentrations — controlling promoter activity. The reward of quantitative characterization includes a compact phenomenological description of promoter activity for higher-level analysis and the elucidation of unknown mechanisms of transcriptional control.
We are grateful to Steve Busby, Ann Hochschild, Bill Loomis, Mark Ptashne, Milton Saier Jr and Jon Widom for discussions and comments. We are also thankful to Nigel Orme for his extensive contributions to the figures in this paper. This research is supported by the NIH Director's Pioneer Award (RP), NSF through grants 9984471, 0403997 (JK), and 0211308, 0216576, 0225630 (TH, TK). JK is a Cottrell Scholar of Research Corporation. UG acknowledges an `Emmy Noether' research grant from the DFG.
¹ Not discussed here is a lower plateau of promoter activity for saturating amounts of repressor, sometimes referred to as “promoter leakage”. Such leakage could result, for example, from the passage of the replication fork through a tightly repressed promoter, leading to basal transcription activity.
Papers of particular interest, published within the annual period of review, have been highlighted as: • of special interest; •• of outstanding interest.