The lactose repressor protein (LacI) was among the very first genetic regulatory proteins discovered, and more than 1000 members of the bacterial LacI/GalR family are now identified. LacI has been the prototype for understanding how transcription is controlled using small metabolites to modulate protein association with specific DNA sites. This understanding has been greatly expanded by the study of other LacI/GalR homologues. A general picture emerges in which the conserved fold provides a scaffold for multiple types of interactions —including oligomerization, small molecule binding, and protein•protein binding — that in turn influence target DNA binding and thereby regulate mRNA production. Although many different functions have evolved from this basic scaffold, each homologue retains functional flexibility: For the same protein, different small molecules can have disparate impact on DNA binding and hence transcriptional outcome. In turn, binding to alternative DNA sequences may impact the degree of allosteric response. Thus, this family exhibits a symphony of variations by which transcriptional control is achieved.
In virtually all bacteria, LacI/GalR family members regulate transcription for a wide range of processes. First catalogued in 1992 by Weickert and Adhya, sequences of >1000 characterized and hypothetical homologues are now known (2008 BLAST search of Swiss-Prot). These proteins have not been found in archaebacteria or eukaryotes, although proteins with homologous domains are ubiquitous.
The LacI/GalR family can be divided into >33 paralogue groups that appear to derive from an ancestral gene. As many as 22 paralogues co-exist in a single species. Many members coordinate available nutrients with expression of catabolic genes, but some regulate processes as diverse as nucleotide biosynthesis and toxin expression (e.g. [2,3]). Two members are “master” regulators: homologues CcpA and CRA control expression of enzymes that determine carbon flow in Gram-positive and Gram-negative bacteria, respectively. If these key proteins are disabled, virulence is altered in several pathogens (e.g. [4,5•, 6•,7]).
The common function of the LacI/GalR proteins, which features allosteric regulation of DNA binding to modulate transcription, is shown in Figure 1. Each homologue has evolved a unique variation: In addition to binding specific “operator” DNA sequences, each protein exhibits specificity for distinct effector ligands. Although most members repress transcription, some act as both repressors and activators (e.g. CcpA, as reviewed in [8•]). Some homologues control one operon (e.g., LacI), whereas others coordinate a set of related operons; for example, CRA controls >10 operons and PurR regulates at least 19 [9–12]. Binding the effector ligand may either decrease DNA-binding affinity (induction) or increase it (co-repression), thereby altering transcription levels of downstream genes (Figure 1). As might be anticipated in a regulatory loop, effector molecules are frequently metabolically related to the regulated operon (e.g. [1,3,9,13,14]). In addition to or instead of small molecules, some family members bind other proteins [1,15–17].
The common monomeric structure of the LacI/GalR proteins comprises both DNA-binding and regulatory domains (Figure 2). Homodimer formation is required for high-affinity binding to operator DNA, which is usually some variation on an inverted repeat sequence (Figure 2). The two functional domains are linked by ~18 amino acids that mediate key interactions (see below). In the LacI/GalR family, the regulatory domains have two essential roles: (i) They receive and transmit the “input” signal from binding the effector molecule, and (ii) they mediate homodimer formation [1,8•,18–20].
E. coli lactose repressor protein (LacI) represses the lac operon until it binds the physiological inducer allolactose or the gratuitous inducer IPTG (reviewed in [13,21]). The LacI dimer can effect repression and induction of the lac operon through binding a single high affinity operator [22,23]. In addition, wild-type LacI contains a sequence of ~20 amino acids at the C-terminus of the regulatory domain that promotes tetramer formation, allowing stronger repression through DNA-looping with two operator sequences (reviewed in ) (Figure 3). These loops have been visualized directly in single molecule experiments [24•]. At low in vivo inducer concentrations, one dimer within the tetramer appears to stochastically dissociate from the primary operator, leading to small bursts of gene expression [25••]. High inducer concentrations lead to LacI dissociation from both operators, increasing the duration of large bursts of gene expression [25••].
Induced LacI remains capable of binding DNA, but the affinity for the operator site is reduced ≥3 orders of magnitude, allowing excess genomic, nonspecific DNA to compete for the repressor protein. Indeed, LacI seldom dissociates from DNA in vivo. The number of inducers that elicit induction is unknown: Thermodynamic evidence is consistent with 2 inducers/dimer, but others argue that one is sufficient. Perhaps complexes with 0, 1, and 2 inducers bound/dimer result in distinct states with different DNA-binding properties. Gratuitous anti-inducer ligands are known that enhance LacI affinity for operator DNA, whereas “neutral” ligands bind the same effector site but elicit no change in DNA-binding affinity [27,29].
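The possibility of distinct 0-, 1-, and 2-inducer states can be pictured with a minimal occupancy sketch. It assumes, purely for illustration, two identical and independent inducer sites per dimer that share a single dissociation constant; the function name and the numerical values are hypothetical and are not taken from the studies cited above.

```python
def dimer_occupancy(inducer_conc, k_d):
    """Fractions of dimers carrying 0, 1, or 2 inducers.

    Assumption: two identical, independent sites with dissociation
    constant k_d (same units as inducer_conc). Real LacI sites may be
    coupled, in which case these fractions would not apply.
    """
    if inducer_conc < 0 or k_d <= 0:
        raise ValueError("inducer_conc must be >= 0 and k_d must be > 0")
    theta = inducer_conc / (k_d + inducer_conc)  # per-site occupancy
    return {
        0: (1 - theta) ** 2,
        1: 2 * theta * (1 - theta),
        2: theta ** 2,
    }

# Hypothetical example: inducer concentration equal to k_d.
half_saturated = dimer_occupancy(inducer_conc=1.0, k_d=1.0)
print(half_saturated)
```

Under these assumptions the singly bound species is the most populated one when the inducer concentration equals the dissociation constant, which is why its DNA-binding properties matter for interpreting partial induction.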
Despite extensive efforts, no high resolution structure shows a complete picture for even a single functional state of LacI (e.g. [20,30–33]). Nonetheless, these structures have been invaluable for successive analyses of allostery: Comparison of the LacI·OsymDNA·anti-inducer and LacI·inducer structures led to the hypothesis that inducer binding shifts the N-subdomains of the regulatory domain [20,32]. These changes would ultimately impact the spacing of the N-terminal DNA-binding domains, misaligning the sites and lowering affinity. Motions between these two regulatory domain conformations were simulated with targeted molecular dynamics. The predicted structural intermediates are in good agreement with existing experimental data and provide the basis for ongoing studies of LacI allostery (e.g., [27,35]).
The newest X-ray structures of LacI bound to either anti-inducer or neutral ligands show very few changes in the regulatory domain compared to the inducer-bound regulatory domain structure [36•]. Thus, these structures, including the LacI•IPTG structure, might represent “off-pathway” conformations. True allosteric changes might be seen only in the DNA-bound ternary complexes. To that end, small-angle X-ray scattering was carried out with full-length tetrameric and dimeric LacI bound to DNA and to DNA/IPTG. Only subtle conformational changes occur within the dimer•operator complex upon IPTG binding. Notably, the linker region that is extended in apo-LacI is compact in both the LacI•DNA complex and the induced LacI•DNA•IPTG complex. In tetrameric LacI, inducer binding led to a change in the dimer•dimer disposition, reflecting the inherent flexibility of the tetrameric arrangement.
In E. coli, the purine repressor protein (PurR) regulates 19 operons that control purine and pyrimidine metabolic pathways (e.g. [11,12]). The physiological allosteric response of PurR is opposite to that of LacI: high affinity operator binding requires the presence of co-repressor ligand [3,38,39] (Figure 1B). Co-repressors are guanine or hypoxanthine. When DNA-binding affinity is measured in the presence and absence of co-repressor, the allosteric response of PurR is about 2 orders of magnitude, significantly smaller than LacI induction, but near that observed for anti-inducers on LacI [40•]. As with LacI, the stoichiometry of PurR:co-repressor required to elicit the allosteric effect is unknown.
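To put the two allosteric magnitudes on a common scale, a fold change in DNA-binding affinity can be converted to a free energy through RT·ln(fold change). The short sketch below does this for the ≥3 orders of magnitude quoted for LacI induction and the ~2 orders quoted for PurR co-repression. The temperature (298.15 K) and the function name are assumptions made for illustration, and only magnitudes are reported, not signs.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def coupling_free_energy(fold_change, temperature_k=298.15):
    """Return RT*ln(fold_change) in kcal/mol.

    fold_change is the ratio of DNA-binding affinities with and without
    effector; temperature_k defaults to an assumed 298.15 K.
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be positive")
    return R_KCAL * temperature_k * math.log(fold_change)

laci = coupling_free_energy(1e3)  # lower bound for LacI induction
purr = coupling_free_energy(1e2)  # approximate PurR co-repression
print(f"LacI: {laci:.1f} kcal/mol, PurR: {purr:.1f} kcal/mol")
# prints "LacI: 4.1 kcal/mol, PurR: 2.7 kcal/mol"
```

Each additional order of magnitude in affinity corresponds to roughly 1.4 kcal/mol at this temperature, so the difference between the LacI and PurR responses is on the order of a single hydrogen bond or a few van der Waals contacts.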
PurR crystallizes more readily than most other family members, and a number of structures are available for wild-type and mutant homodimers bound to DNA and a variety of co-repressors (e.g. [41–45]). As with LacI, structures are not known for all possible functional states. Comparing structures of the apo-regulatory domain and corepressed full-length PurR, Brennan and colleagues hypothesized that large subdomain motions separate the DNA-binding domains too far to bind the operator half-sites. The Mowbray lab showed that effector binding to PurR exhibits a larger reorientation of the regulatory subdomains than does LacI. However, small-angle X-ray scattering results with a chimera comprising the LacI DNA-binding domain and the PurR regulatory domain show much smaller changes than LacI [47••]. This outcome may be an effect of either chimera formation or truncation of the PurR DNA-binding domain in the apo-PurR structure.
In Gram-positive bacteria, carbon catabolite protein A (CcpA) is a central regulator of carbon metabolism, controlling hundreds of genes; this homologue can function either as a repressor or an activator (reviewed in [8•,48]). Several structures of CcpA have been solved (e.g. [8•,49,50]). Unlike LacI and PurR, the primary allosteric effectors of CcpA are the proteins HPr or Crh. These cofactor proteins are phosphorylated at Ser46 under particular metabolic conditions. In turn, one phosphorylated cofactor binds to each monomer within a CcpA dimer, facilitating a structural change to a “closed” form and enhancing DNA binding (see Figure 1B). Interestingly, binding to different cofactor proteins can affect regulation of different operons (reviewed in [8•]). The HPr-Ser46-P/Crh-Ser46-P binding site is not the same as for the small molecule effector, but lies near residues on the three strands that link the N- and C-subdomains (Figure 2, yellow region). Upon phosphoprotein binding, the conformational change seen in the CcpA regulatory domain is similar to that seen when LacI and PurR bind small effector ligands. CcpA can also bind either glucose-6-phosphate or fructose-1,6-bisphosphate in the canonical effector binding site, which enhances the cofactor function of HPr-Ser46-P but not Crh-Ser46-P (see [8•]). Interactions with HPr-Ser46-P are also observed for the B. subtilis homologue RbsR.
The E. coli cytidine repressor protein (CytR) regulates at least nine transcriptional units encoding genes involved in purine and pyrimidine biosynthesis and utilization (reviewed in ). CytR binds to cytO DNA as a homodimer; DNA-binding is cooperative in the presence of two flanking catabolite repressor proteins (CRP) (Figure 4). Notably, the spacing of cytO half-sites is varied and can be much wider than for the other LacI/GalR proteins [54,55]. CytR binding to its small molecule effector (cytidine) has no effect on intrinsic DNA binding affinity (e.g. [56,57••]). Instead, the cytidine-induced conformational change disallows simultaneous CytR contacts with CRP and cytO. As a result (i) cooperative DNA binding of CytR and CRP is diminished, allowing RNA polymerase to compete for cytO, and (ii) direct interactions between CRP and RNA polymerase are altered [56, 57••, 58].
Some of the differences in CytR function may arise from differences in the sequence linking the two functional domains (see below). No high resolution structure has been obtained, but biophysical data suggest that CytR can adopt multiple conformations in the apo-state that are constrained differently when bound to operators with distinct half-site spacings [57••]. Unlike many members of the LacI/GalR family, altered CytR•CRP interactions provide a “rheostatic” rather than “on/off” switching mechanism.
The regulatory domain contains the effector and cognate protein binding sites, making this region the basic element for allostery (Figure 2). Structural changes of this domain are currently illuminated by comparison of the apo- and ligand-bound structures. LacI, PurR, and CcpA appear to have a common cleft closure, in which the N-subdomain moves and the C-subdomain remains fixed [32,42,46,50]. Changes in the regulatory domain appear to dictate the direction of allosteric response for the intact protein, as indicated by studies with chimeric repressors: When the LacI DNA-binding domain and linker are fused to the PurR regulatory domain, the chimera is co-repressed by hypoxanthine [47••], whereas when fused to the GalR regulatory domain, the chimera is induced by galactose [59•]. In LacI and GalR, several mutants that cannot respond to effector are found in the regulatory domain, in either the effector binding pocket or in regions that are crucial for allostery [60,61].
Despite its dominant role, the regulatory domain can be adapted for various functions. In addition to accommodating diverse specificities for different effectors, the regulatory domain can be either induced or co-repressed. Indeed, these alternate phenomena can occur on the same regulatory domain. As mentioned previously, LacI binds inducers, anti-inducers, and neutral ligands. Moreover, isothermal titration calorimetry experiments showed that ONPG, a neutral ligand for tetrameric LacI, behaved as an anti-inducer for dimeric LacI [40•]. The E. coli homologue GalR also has inducer (galactose and fucose) and anti-inducer (paradoxically, IPTG) ligands. Although we presented PurR co-repression as “opposite” to LacI induction, a better comparison might well be the LacI•DNA•anti-inducer relationship.
Based on these observations, we propose that all LacI/GalR regulatory domains have potential for multiple allosteric modes. For example, a gratuitous inducer might be identified for PurR. Further, mutations that arise in evolution or are designed in the laboratory might influence the allosteric effect of ligand. Such latent allosteric potential in an ancestral regulatory domain would enable an inducible regulator to evolve the co-repression required to shut down biosynthetic pathways (and vice versa).
The 18 amino acids that join the DNA-binding domain to the regulatory domain are involved in many interfaces. These are best understood by subdividing the linker into an unstructured N-linker, a central hinge helix, and an unstructured C-linker (Figure 2). One face of the hinge helix directly contacts DNA; another face forms an interface between the two helices of a dimer; and other helix residues interact with the regulatory domain. In addition, both the N- and C-linkers interact with the regulatory domain. From the available structures, hypotheses have been formed about how structural changes are propagated to and through the respective linkers (see above). However, the only structural information available on the true allosteric complexes is low resolution from small angle X-ray scattering [37,47••]. These data show that the LacI linker remains compact in the DNA complexes of either full-length LacI or an engineered chimera comprising LacI and PurR (“LLhP”).
Even though linker conformational changes may be small, mutagenesis illuminates several positions important to allostery. Formation of a disulfide bond between the LacI linkers abolished allostery for some operators, whereas inserted glycines diminished the allosteric response [63,64]. Some amino acid substitutions of the LacI C-linker position 61 abolish inducibility. Other substitutions at the same residue in LLhP dramatically enhanced the magnitude of the allosteric response to co-repressor [47••]. Mutagenesis of a second chimera (comprising the LacI DNA-binding domain and the GalR regulatory domain) suggests that at least four additional linker positions may participate in allostery [59•]. Because many of these substituted positions are not conserved among family members, the effects of mutagenesis might mirror the evolution of allosteric differences between family members.
Many family members possess a conserved linker motif: Y/FxPxxxAxxL/M. A key feature is the alternative L/M side chain, which inserts into the minor groove in the center of the DNA operator [20,42,50]. A few family members lack features of the motif and/or have multiple P or G residues that are anticipated to disrupt the hinge helix. In the bacterial phylum Firmicutes, homologues that lack the linker motif also have a distinct operator motif. Thus, the larger LacI/GalR family can be divided into two subfamilies [59•,65], which appear to have evolved different mechanisms by which the linkers bind DNA and convey allostery. For example, E. coli CytR lacks the L/M, has a P and a G in the “helical” region, and cytO is similar to the operator subfamily identified in Firmicutes. The linker of CytR appears to adopt multiple conformations, allowing this repressor to recognize variable spacing and rotations in the cytO half-sites (Figure 4) [57••].
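The linker motif Y/FxPxxxAxxL/M translates directly into a regular expression, which gives one simple way to screen sequences for the two subfamilies. The sketch below is illustrative only: the function name is hypothetical, the two test sequences are invented toy strings rather than real proteins, and a strict pattern match is an assumption that would miss homologues carrying conservative substitutions.

```python
import re

# Y/FxPxxxAxxL/M, where "x" is any residue.
LINKER_MOTIF = re.compile(r"[YF].P.{3}A.{2}[LM]")

def find_linker_motif(sequence):
    """Return (start_index, matched_text) for the first motif hit, or None."""
    match = LINKER_MOTIF.search(sequence.upper())
    if match is None:
        return None
    return match.start(), match.group()

# Toy sequences invented for illustration (not real proteins).
with_motif = "GSGSYTPAAAAGGLGSGS"     # carries Y.P...A..L
without_motif = "GSGSYTGAAAAGGQGSGS"  # P and L/M positions replaced

print(find_linker_motif(with_motif))     # (4, 'YTPAAAAGGL')
print(find_linker_motif(without_motif))  # None
```

A sequence that returns None here would be a candidate for the second subfamily described above, although confirming that assignment would also require inspecting the operator sequence and the P/G content of the hinge region.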
Thermodynamically, allostery occurs when binding to ligand A differs in the absence and presence of ligand B. To preserve a complete thermodynamic cycle, the complement must also occur. Since effector binding to LacI/GalR proteins alters DNA binding affinity, DNA binding by the LacI/GalR proteins must alter effector binding, a feature that has been directly measured (e.g., ). Given this behavior, each specific operator sequence might exhibit a different allosteric response to small molecule effectors. This relationship has been confirmed for variants of LacI [63,64] and chimera LLhP [47••], and conceivably could contribute to the operator-specific responses seen with variants of CcpA [66•]. Many LacI/GalR proteins are known to regulate multiple operons, and an alternative allosteric response to various DNA sequences would allow their differential, but simultaneous, regulation.
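The thermodynamic cycle described above can be written out explicitly. In the sketch below, the symbols are introduced here for illustration and do not appear in the original text: P is the repressor, O the operator DNA, E the effector, K_O and K_E are the dissociation constants for operator and effector binding to free protein, K_{O/E} is the constant for operator binding to effector-bound protein, and K_{E/O} is the constant for effector binding to operator-bound protein.

```latex
\begin{align}
  K_{O} &= \frac{[P][O]}{[PO]}, &
  K_{E/O} &= \frac{[PO][E]}{[POE]}, \\
  K_{E} &= \frac{[P][E]}{[PE]}, &
  K_{O/E} &= \frac{[PE][O]}{[POE]}, \\
  K_{O}\,K_{E/O} &= \frac{[P][O][E]}{[POE]} = K_{E}\,K_{O/E}
  & &\Longrightarrow \quad
  \frac{K_{O/E}}{K_{O}} = \frac{K_{E/O}}{K_{E}} .
\end{align}
```

Because both paths from free P, O, and E to the ternary complex POE must give the same overall equilibrium, the factor by which effector changes operator affinity equals the factor by which operator changes effector affinity. In this notation a ratio greater than 1 corresponds to induction and a ratio less than 1 to co-repression.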
The ubiquity of LacI/GalR regulatory proteins in prokaryotes testifies to the robust nature of this mechanism for conserving the energy required for mRNA and protein production [67•]. Their conserved structure has potential to be regulated by small molecules, by other proteins, or by their combination. The protein structure is adaptable, demonstrating both induction and co-repression within the same molecule. The structure can effect on/off switching (with >1,000-fold change in transcription) or can rheostatically modulate gene expression between ~10- and 100-fold. As we understand the intricacies of the LacI/GalR proteins, and the ways in which they can be varied, we gain the capacity to introduce “designed” regulatory systems into the cellular milieu.
We are grateful for the support of the National Institutes of Health (GM079423 and P20 RR17708 for LSK, GM22441 for KSM) and the Robert A. Welch Foundation (C-576 for KSM).
Papers of particular interest, published within the period of the review, have been highlighted as:
• of special interest
•• of outstanding interest