Response regulators (RRs) within two-component signal transduction systems control a variety of cellular processes. Most RRs contain DNA-binding output domains and serve as transcriptional regulators. Other RR types contain RNA-binding, ligand-binding, protein-binding or transporter output domains and exert regulation at the transcriptional, post-transcriptional or post-translational levels. In a significant fraction of RRs, output domains are enzymes that themselves participate in signal transduction: methylesterases, adenylate or diguanylate cyclases, c-di-GMP-specific phosphodiesterases, histidine kinases, serine/threonine protein kinases and protein phosphatases. In addition, there remain output domains whose functions are still unknown. Patterns of the distribution of various RR families are generally conserved within key microbial lineages and can be used to trace adaptations of various species to their unique ecological niches.
Bacterial adaptation to changing environmental conditions can be thought of as occurring on several different levels: 1) the level of individual genes and proteins (changes in gene expression, allosteric regulation of enzyme activity), 2) the level of global regulons (expression of multiple operons, stress response), 3) the whole-cell level (cellular motility, sporulation), and 4) the multicellular level (cell aggregation, biofilm formation). Two-component signal transduction systems (TCSs) affect processes at all these levels, primarily through transcriptional, post-transcriptional and post-translational regulation of gene expression, but also through a variety of protein-protein interactions. These regulatory processes are performed by various response regulators (RRs) that all share the common phosphoacceptor (receiver, REC) domain but differ in their output domains. The sheer number and diversity of RRs mirror the diversity of functions that they perform. At the time of this writing, public databases include more than 70,000 protein sequences that contain the REC domain; in the latest release of the Pfam database [••1] these sequences are classified into 1716 domain architectures. Although the latter figure primarily reflects the diversity of hybrid histidine kinases, it does not include combinations of REC with uncharacterized domains. Thus, the easiest way to investigate the diversity of RRs is by browsing various genomic databases, including those specifically dedicated to signal transduction. Such databases include the Microbial Signal Transduction database (MiST, http://genomics.ornl.gov/mist/), whose updated version was recently released at a new web site, http://mistdb.com [•3], and P2CS (http://www.p2cs.org/), a database of prokaryotic two-component systems [•4].
The author of this review maintains a manually curated listing of the RRs encoded in a non-redundant collection of prokaryotic genomes and classified based on their domain architectures, http://www.ncbi.nlm.nih.gov/Complete_Genomes/RRcensus.html. This listing has been recently updated and now covers all bacterial and archaeal species whose genomes were released by the end of 2009. Compared to the original 200-genome set, the current list provides a much better coverage of microbial diversity, particularly for such bacterial phyla as Aquificae, Bacteroidetes, Chlorobi, Chloroflexi, Thermotogae, and Verrucomicrobia, as well as for delta and epsilon subdivisions of Proteobacteria. However, the key trends deduced from the smaller genome sample still appear to hold true. Here, I briefly review the key rationale for RR classification [5,6] and discuss the growing list of RR output domains, as well as the recent progress in their structural and functional characterization.
Although most RRs combine the REC domain with one or more output domains, the term “response regulator” was coined in 1977 by Daniel Koshland to describe the chemotaxis protein CheY that consists solely of the REC domain. Such single-domain RRs comprise ~17% of bacterial RRs and almost half of all RRs in Archaea (Figure 1). These proteins control bacterial motility (e.g., CheY) and participate in signaling phosphorelays (e.g., Spo0F) and in protein-protein interactions governing cell development and division (e.g., DivK, CpdR).
A key problem in studying RRs using standard genome analysis techniques is that sequence similarity of RRs does not necessarily indicate functional identity. For example, sequences of Bacillus subtilis Spo0F and Caulobacter crescentus DivK share 30% identity, which is higher than between chorismate mutases from these two organisms. Still, these proteins have dramatically different functions. Transcriptional regulators OmpR and PhoB from Escherichia coli have 37% identical amino acid residues but regulate entirely different processes. As a result, assigning functions to newly sequenced RRs based solely on sequence similarity is problematic, and many such annotations in the current genome databases appear to be wrong. In contrast, family assignments of RRs based on their domain architectures are relatively robust, easy to make, and immediately indicate function. Indeed, it is often impossible to say whether a newly sequenced REC-only domain protein controls chemotaxis, cell development, or any other chemosensory pathway. Still, such a protein can be confidently assigned to the single-domain RR family, meaning that it is definitely not a transcriptional regulator. Likewise, although it is usually hard to predict the target operon(s) for an RR that combines a REC domain with a winged helix domain, such an RR can be confidently assigned to the OmpR/PhoB family, meaning that it is definitely a transcriptional regulator that likely shares the key structural features and the regulatory mechanism with the well-studied eponymous members of that family. Thus, functional assignment of an RR typically relies on the recognition of its output domain and its functional assignment to the DNA-binding, RNA-binding, ligand-binding, protein-binding, enzyme or transporter protein category.
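The architecture-based assignment described above can be illustrated with a minimal Python sketch. It is not the author's census pipeline: the domain labels (`winged_helix`, `methylesterase`, and so on) are hypothetical shorthand rather than database identifiers, the function name `classify_rr` is invented for illustration, and the lookup table covers only a few of the families named in this review. The sketch also assumes that domains have already been detected by some upstream tool.

```python
# Illustrative lookup: output-domain label -> (RR family, functional category).
# The labels are hypothetical shorthand, not identifiers from any database.
OUTPUT_DOMAINS = {
    "winged_helix": ("OmpR/PhoB", "DNA-binding"),
    "GerE": ("NarL/FixJ", "DNA-binding"),
    "LytTR": ("LytR/AgrA", "DNA-binding"),
    "methylesterase": ("CheB", "enzyme"),
    "EAL": ("PvrR", "enzyme"),
    "HD-GYP": ("RpfG", "enzyme"),
    "PP2C": ("RssB", "enzyme"),
}


def classify_rr(domains):
    """Return (family, category) for an ordered list of domain labels."""
    if "REC" not in domains:
        raise ValueError("not a response regulator: no REC domain")
    # Everything that is not a receiver domain is a candidate output domain.
    outputs = [d for d in domains if d != "REC"]
    if not outputs:
        # A stand-alone REC domain: family is known, precise role is not.
        return ("single-domain", "no output domain")
    for domain in outputs:
        if domain in OUTPUT_DOMAINS:
            return OUTPUT_DOMAINS[domain]
    # REC fused to a domain that this small table does not cover.
    return ("unclassified", "unknown")


print(classify_rr(["REC", "winged_helix"]))
print(classify_rr(["REC"]))
```

The point of the design is that the family call depends only on which output domain is present, not on how similar the REC sequences are, which is why it stays robust where similarity-based annotation fails.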
A great majority of bacterial RRs contain DNA-binding output domains and serve as transcriptional regulators (this does not appear to be the case in Archaea, see Figure 1b). The most common types of DNA-binding output domains are listed in Table 1. Most of them have known three-dimensional (3D) structures that feature different variants of the helix-turn-helix (HTH) DNA-binding structural motif. However, crystal structures of full-length RRs are currently available only for the OmpR/PhoB and NarL/FixJ families [11,12].
The last major gap in the list of output domain structures was filled last year when Ann Stock and colleagues reported the crystal structure of the LytTR domain, a unique DNA-binding domain that lacks the HTH motif and consists mostly of β-strands [••13]. Transcriptional regulators with the LytTR-type output domains are widespread in bacteria and control production of virulence factors in several important bacterial pathogens [14,15]. The complex of the LytTR domain from Staphylococcus aureus AgrA with its 15-bp DNA target fragment revealed a previously unknown mode of protein-DNA interaction, which involves side chains of amino acid residues that are located in the loops between the β-strands [••13]. The variability of these residues among the members of the LytR/AgrA family could explain the diversity of the DNA targets of these RRs. Another interesting observation from this structure is the significant bending of the target DNA fragment upon binding of the LytTR domain [••13]. This DNA bending is likely to increase the activity of the respective promoters and could explain the mechanism of transcriptional activation by the RRs of the LytR/AgrA family. A similar structure was recently submitted to the Protein Data Bank (PDB) by the Midwest Center for Structural Genomics (PDB: 3d6w, Osipiuk et al., LytTR DNA-binding domain of putative methyl-accepting/DNA response regulator from Bacillus cereus, unpublished), confirming the key observations of [••13].
For other transcriptional regulators, recent efforts have been targeted towards structural characterization of full-length RRs, analysis of the conformational changes brought about by phosphorylation of the REC domain, and analysis of the precise mechanisms of DNA binding and target specificity. For the OmpR/PhoB family, structures of a full-length RR and a DNA-bound complex of the winged-helix domain have been available since 2002 [11,16] and the conformational changes caused by phosphorylation have been studied in detail. Remarkably, even very similar RRs appear to differ in their activation mechanisms. A recent paper studied the interaction of the C-terminal DNA-binding domain of OmpR with its DNA target sequences and concluded that there are fundamental differences in how OmpR and PhoB are affected by phosphorylation of their REC domains [•18].
The NarL/FixJ family is the second most abundant family of bacterial RRs. Its members have a typical HTH DNA-binding output domain (referred to as GerE in Pfam) that is similar to the one in the transcriptional regulator LuxR. For that reason, RRs of this family are often annotated as members of the LuxR family. Sometimes, the NarL/FixJ family is further subdivided into TetR, IclR and other families. The recent work from Valley Stewart’s group revealed the fine-tuned regulation of assimilation of nitrate and nitrite in E. coli by closely related paralogous RRs NarL and NarP [19,•20].
For the RRs of the NtrC/DctD family, no full-length structure has been reported so far, although structural models have been built based on the structures of the three individual constituent domains and domain pairs [21,22]. Structural data indicated that the inactive (non-phosphorylated) NtrC molecule forms dimers in solution, and that its activation requires further oligomerization, with the central σ54-interacting ATPase domain forming a ring-like structure. Recent work clarified the stoichiometry of this ring by showing that the active form of NtrC is a hexamer [•23].
Although RRs are often assumed to serve as transcriptional regulators, a significant fraction of bacterial RRs do not regulate transcription, at least not in a direct way. In addition to single-domain RRs, such RRs include those with output domains that have enzymatic activity and themselves participate in signal transduction: methylesterases, adenylate cyclases, diguanylate cyclases, c-di-GMP-specific phosphodiesterases, histidine kinases, serine/threonine protein kinases and protein phosphatases (Table 2). Some of these RRs have been extensively characterized. These include the widespread CheB family of chemotaxis proteins that combine the REC domain with a C-terminal methylesterase domain [24,25] and PleD-like RRs that contain two REC domains followed by the diguanylate cyclase (GGDEF, PF00990) domain [26–28]. Two recent papers provided structural characterization of WspR, an RR with the REC-GGDEF domain architecture, whose activation mechanism is substantially different from that of PleD and appears to involve formation of a WspR tetramer [•29,•30]. The GGDEF-containing RRs of Anaplasma phagocytophilum (PleD family) and Borrelia burgdorferi (WspR family) have been shown to serve as global regulators of cell metabolism in these important human pathogens [•31,•32].
In many other RRs, output domains have well-characterized structures and functions but the structures of full-length RRs still remain to be solved. These include RRs of the PvrR and RpfG families, so named after their first characterized representatives [33,34], whose output domains are c-di-GMP-specific phosphodiesterases of two classes, represented, respectively, by the EAL and HD-GYP domains. Several 3D structures of the EAL domain have been solved, suggesting a catalytic mechanism for the c-di-GMP hydrolysis [•35,•36], whereas for the HD-GYP domain, only the structure of a generic HD-type phosphodiesterase is available at this time. Similarly, although the structures of class III adenylate cyclases, Ser/Thr protein kinases and CheC-type protein phosphatases are known, structures of full-length RRs with these output domains are still unavailable. The structure of a full-length RR with a PP2C-type protein phosphatase output domain has been recently released but has not yet been formally described (PDB: 3eq2, Levchenko et al., Structure of hexagonal crystal form of Pseudomonas aeruginosa RssB, unpublished). Based on this work, RRs with the PP2C-type output domain can now be referred to as the RssB family (to avoid confusion, it should be noted that the E. coli RssB protein, also referred to as Hnr or SprE, has a truncated PP2C-type domain that is apparently devoid of phosphatase activity and functions solely in protein-protein interactions).
Recent studies revealed another interesting RR that functions through protein-protein interactions. This RR, encoded in the genomes of most alpha-proteobacteria but so far not seen outside that lineage, combines a REC domain with an N-terminal domain that is very similar to the σE (RpoE, σ24) subunit of RNA polymerase. Based on the DNA-binding properties of σE, we and others initially speculated that these RRs were yet another group of DNA-binding transcriptional regulators [5,38], a suggestion that now appears to be incorrect. The actual story proved to be much more complex. First, this RR, named PhyR, was shown to regulate plant colonization by Methylobacterium extorquens. Subsequent studies demonstrated that PhyR is involved in regulation of the general stress response, including resistance to heat shock and desiccation in both M. extorquens and Bradyrhizobium japonicum [•39,•40]. Most importantly, PhyR did not appear to be involved in DNA binding or interaction with RNA polymerase, as might be expected of its sigma factor domain [••41]. Instead, it appeared to act through a partner-switching mechanism, somewhat similar to that regulating the activity of σB in B. subtilis. The σE domain of PhyR was shown to bind an anti-sigma factor, a newly characterized small protein NepR. This led to a model of PhyR action in which PhyR sequesters NepR and thereby frees a genuine sigma factor, σEcfG, to interact with the RNA polymerase and stimulate transcription of the stress-related genes. The environmental control of this signaling mechanism is achieved through phosphorylation of the PhyR REC domain, which could unlock the σE domain and allow its interaction with NepR [••41].
Continued genome sequencing of phylogenetically and ecologically diverse microorganisms results in rapid growth of the number of RR sequences and continuously introduces previously unknown RR domain architectures. In many cases, these architectures include well-characterized domains that just have not been previously associated with two-component signaling. For example, the recently sequenced genomes of the alkane-degrading sulfate-reducing delta-proteobacteria Desulfatibacillum alkenivorans and Desulfococcus oleovorans encode RRs whose output domains are membrane transporters for nitrate, sulfate, and dicarboxylates. Several other bacteria encode fusions of the REC domain with membrane-bound and/or ATPase components of the ABC-type transporters (see Supplementary Table 1). Thus, the list of functions controlled by RRs still keeps growing and now includes membrane transport of various ions.
Another clear trend among recently discovered RR sequences is the growing number of metabolic enzymes that are found fused to the REC domain in at least some bacteria. Such enzymes include P-loop-type ATPases of the MinD/ParA and PilB families, threonine synthase, nucleoside phosphorylase, sugar transferase, dolichyl-phosphate glucosyltransferase, NAD(P)-dependent glutamate dehydrogenase, potential metal-dependent hydrolase, and other enzymes (MYG, manuscript in preparation, see Supplementary Table 1). As discussed previously, such fusions appear to be evolutionarily harmless, but potentially advantageous, events that put these metabolic enzymes under environmental control.
Representatives of different bacterial lineages show clear differences in the families of RRs that they encode (Figure 1b). However, closely related species typically have very similar RR family profiles, i.e. they encode similar proportions of certain RR families and lack the same RR families (Figure 2). This trend is well preserved at the genus level and can often be followed up to the phylum level, even though the absolute numbers of encoded RRs differ greatly, in accordance with the organism’s genome size. For example, Figure 2 shows that genome compaction in the course of adaptation of Anoxybacillus flavithermus to its unique silica-saturated ecological niche was accompanied by a massive loss of RR genes. However, the relative numbers of RRs of each kind did not change much, meaning that different RR families were equally affected by this process. In addition to its potential use in taxonomy, this conservation of RR family profiles could have predictive power, eventually allowing us to define the ecological preferences of a given organism based solely on its signaling capacity deduced from the genome sequence (see Box 1).
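One way to make the notion of an RR family profile concrete is to convert per-family gene counts into fractions and compare the fractions between genomes. The Python sketch below is only an illustration under stated assumptions: the function names `family_profile` and `profile_distance` are hypothetical, the distance used (half the sum of absolute differences between fractions) is one simple choice and not a measure taken from this review, and the counts in the example are invented rather than taken from Figure 2.

```python
def family_profile(counts):
    """Convert per-family RR gene counts into fractions of the total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("genome encodes no response regulators")
    return {family: n / total for family, n in counts.items()}


def profile_distance(counts_a, counts_b):
    """Half the L1 distance between two profiles: 0 = identical, 1 = disjoint."""
    profile_a = family_profile(counts_a)
    profile_b = family_profile(counts_b)
    families = set(profile_a) | set(profile_b)
    return 0.5 * sum(
        abs(profile_a.get(f, 0.0) - profile_b.get(f, 0.0)) for f in families
    )


# Invented counts: a larger genome and a compacted relative that lost half
# of its RR genes evenly across families.
large_genome = {"OmpR/PhoB": 20, "NarL/FixJ": 10, "single-domain": 10}
compact_genome = {"OmpR/PhoB": 10, "NarL/FixJ": 5, "single-domain": 5}

print(profile_distance(large_genome, compact_genome))
```

Because the comparison uses fractions rather than absolute counts, an even loss of RR genes across families leaves the distance at zero, which is the pattern the paragraph above describes for genome compaction.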
The current knowledge of TCS comes from studies of only a few RRs in a handful of model organisms. The growing list of sequenced genomes and metagenomes provides opportunities for expanding this set ([••13,•31,•32,•40,••41] are good examples) and gaining a much better understanding of signaling mechanisms in a wide variety of microorganisms from diverse environments. In the near future, to make better use of the genomic data, it would be necessary to:
In the long run, accomplishing these goals should allow us to deduce the behavior and metabolism of each organism based solely on its genome sequence and compile a unified picture of microbial life in each microcosm.
Although TCSs are found in all three domains of life, Bacteria, Archaea and Eukaryotes, the presence and abundance of particular RR classes varies between the lineages. Archaea (at least the ones with sequenced genomes) encode very few RRs with DNA-binding output domains (Figure 1b). Almost half of all archaeal RRs are single-domain RRs (stand-alone REC domains); in some organisms they comprise 90–100% of all RRs [8,44]. Many archaea also encode RRs that combine a REC domain with one or more PAS and/or GAF domains. The ligands, if any, bound by these RRs remain uncharacterized and the pathways that they regulate remain obscure. These RRs, similarly to some single-domain RRs, likely participate in allosteric regulation of their cognate histidine kinases. Other widespread archaeal RRs include the chemotaxis regulator CheB (Figure 1b) and RRs with the HalX (PF08663) output domain, whose function still remains unknown.
Among eukaryotes, TCSs are found primarily in protozoa, fungi, algae and green plants. The few RR genes apparently found in metazoan genomes most likely come from bacterial contamination. In yeast and in other fungi, TCSs are involved in cellular response to osmotic, oxidative, and other environmental stresses [45,46,•47]. In plants, TCSs are additionally involved in circadian rhythms, adaptation to salinity, cytokinin signaling [48,49,•50] and in redox regulation of chloroplast gene expression [••51]. These pathways involve single-domain RRs, as well as RRs with various output domains, such as SANT (Myb_DNA-binding, PF00249). Studies of RRs from eukaryotic microorganisms are only beginning but hold great promise for understanding signal transduction mechanisms operating in higher organisms.
Bacterial response regulators are extremely diverse: they include almost all known types of DNA-binding domains, many ligand-binding and protein-binding domains, membrane transporters and a variety of metabolic and signaling enzymes. There appears to be no restriction on the variety of domains that could be fused to the REC domain and thereby put under environmental control. Obviously, the principal control mechanism is regulation of gene expression at the transcriptional level. However, widespread RRs with methylesterase, diguanylate cyclase, c-di-GMP-specific phosphodiesterase or Ser/Thr protein phosphatase output domains provide additional means of control that work at post-transcriptional levels. Such RRs allow TCSs to interfere with the functioning of other signal transduction systems and appear to put TCSs at the top of the bacterial signaling hierarchy.
I thank Armen Mulkidjanian and Sergei Mekhedov for helpful suggestions and many other colleagues for critical comments. This study was supported by the Intramural Research Program of the National Library of Medicine at the U.S. National Institutes of Health.