About half of the protein-coding genes in prokaryotic genomes are organized into operons to facilitate co-regulation during transcription. As genomes evolve, operon structures change, which could coordinate diverse gene expression patterns in response to various stimuli during the life cycle of a bacterial cell. Here we developed a graph-based model to elucidate the diversity of operon structures across a set of closely related bacterial genomes. In the constructed graph, each node represents one orthologous gene group (OGG), and two nodes are connected if any two genes, one from each of the corresponding OGGs, are located in the same operon as immediate neighbors in any of the considered genomes. By identifying the connected components of this graph, we found that genes in a connected component are likely to be functionally related and that these components tend to form tree-like topologies, such as paths and stars, corresponding to different biological mechanisms in transcriptional regulation. Specifically, (i) a path-structure component integrates genes encoding a protein complex, such as the ribosome; and (ii) a star-structure component not only groups related genes together, but also reflects the key functional role of the central node of this component, such as the ABC transporter with a transporter permease and substrate-binding proteins surrounding it. Most interestingly, the genes from organisms with highly diverse living environments, i.e., biomass degraders and animal pathogens of clostridia in our study, can be clearly classified into different topological groups on some connected components.
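The graph construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the input layout (operons as ordered lists of gene IDs pooled over all genomes, plus a gene-to-OGG dictionary) and the simple degree-based rules for labelling a component as a path or a star are all assumptions made for the example.

```python
from collections import defaultdict


def build_ogg_graph(operons, gene_to_ogg):
    """Connect two OGGs when their genes are immediate neighbors in any operon.

    operons: iterable of operons, each an ordered list of gene IDs
             (pooled over all genomes; hypothetical input layout).
    gene_to_ogg: dict mapping gene ID -> OGG ID.
    """
    graph = defaultdict(set)
    for operon in operons:
        for left, right in zip(operon, operon[1:]):
            a, b = gene_to_ogg.get(left), gene_to_ogg.get(right)
            if a is None or b is None or a == b:
                continue  # skip genes without an OGG and self-loops
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def connected_components(graph):
    """Return the connected components as a list of node sets (iterative DFS)."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


def classify_component(graph, component):
    """Label a component as 'path', 'star', 'tree' or 'cyclic' (assumed rules)."""
    n = len(component)
    n_edges = sum(len(graph[v]) for v in component) // 2
    if n_edges != n - 1:
        return "cyclic"  # a connected graph is a tree iff it has n - 1 edges
    max_degree = max(len(graph[v]) for v in component)
    if max_degree <= 2:
        return "path"
    if max_degree == n - 1:
        return "star"
    return "tree"


# Toy data: gene IDs and OGG labels are invented for illustration only.
operons = [["a1", "a2", "a3"], ["b2", "b4"], ["b5", "b6"]]
gene_to_ogg = {"a1": "OGG1", "a2": "OGG2", "a3": "OGG3",
               "b2": "OGG2", "b4": "OGG4", "b5": "OGG5", "b6": "OGG6"}
graph = build_ogg_graph(operons, gene_to_ogg)
for component in connected_components(graph):
    print(sorted(component), classify_component(graph, component))
```

In the toy data, OGG2 neighbors OGG1, OGG3 and OGG4 and therefore forms the center of a star, while OGG5 and OGG6 form a two-node path.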
The RegPredict web server provides comparative genomics tools for the reconstruction and analysis of microbial regulons. The server allows the user to rapidly generate reference sets of regulons and regulatory motif profiles in a group of prokaryotic genomes. The new concept of a cluster of co-regulated orthologous operons allows the user to distribute the analysis of large regulons and to perform the comparative analysis of multiple clusters independently. Two major workflows currently implemented in RegPredict are: (i) regulon reconstruction for a known regulatory motif and (ii) ab initio inference of a novel regulon using several scenarios for the generation of starting gene sets. RegPredict provides a comprehensive collection of manually curated positional weight matrices of regulatory motifs. It is based on genomic sequences and on ortholog and operon predictions from MicrobesOnline. The interactive web interface of RegPredict integrates and presents diverse genomic and functional information about the candidate regulon members from several web resources. RegPredict is freely accessible at http://regpredict.lbl.gov.
RegulonDB version 2.0, a database on transcriptional regulation and operon organization in Escherichia coli, is now available on the web at the following URL: http://www.cifn.unam.mx/Computational_Biology/regulondb/. In this paper we describe the main computational changes to the database, which include migrating the database to Sybase, providing graphical descriptions of the internal organization of operons and regulons, and direct links to MEDLINE references. The web interface offers searching either by mechanisms of regulation or by operon organization. The results of a search (operon organization, or site collection) are displayed as hypertext, and can also be displayed graphically. In terms of its contents, RegulonDB contains a large number of operons, as well as the absolute position in the completed genome sequence of sites, promoters, and individual genes of E. coli.
Operon structures play an important role in transcriptional regulation in prokaryotes. However, there have been fewer studies on complicated operon structures in which the transcriptional units vary with changing environmental conditions. Information about such complicated operons is helpful for predicting and analyzing operon structures, as well as understanding gene functions and transcriptional regulation.
We systematically analyzed the experimentally verified transcriptional units (TUs) in Bacillus subtilis and Escherichia coli obtained from ODB and RegulonDB. To understand the relationships between TUs and operons, we defined a new classification system for adjacent gene pairs, divided into three groups according to the level of gene co-regulation: operon pairs (OP) that belong to the same TU, sub-operon pairs (SOP) that are at the transcriptional boundaries within an operon, and non-operon pairs (NOP) that belong to different operons. We found that the level of gene co-regulation was correlated with intergenic distances and gene expression levels. Additional analysis revealed that it was also correlated with the level of conservation across about 200 prokaryotic genomes. Most interestingly, we found that functional associations in SOPs were observed more frequently in environmental and genetic information processes.
Complicated operon structures were correlated with genome organization and gene expression profiles. Such intricately regulated operons allow functional differences depending on environmental conditions. These regulatory mechanisms are helpful in accommodating the variety of changes that happen around the cell. In addition, such differences may play an important role in the evolution of gene order across genomes.
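The three-way classification of adjacent gene pairs (OP, SOP, NOP) can be illustrated with a short Python sketch. It rests on two assumptions that are not stated in the abstract: an operon is taken to be the union of all TUs that share at least one gene, and a pair is called SOP when at least one TU of the operon contains exactly one of the two genes. Function names and the toy TUs are hypothetical.

```python
def operons_from_tus(tus):
    """Merge TUs that share genes into operons (assumed operon definition)."""
    operons = []
    for tu in tus:
        merged = set(tu)
        overlapping = [op for op in operons if op & merged]
        for op in overlapping:
            merged |= op
            operons.remove(op)
        operons.append(merged)
    return operons


def classify_pair(gene1, gene2, tus, operons):
    """Return 'OP', 'SOP' or 'NOP' for two adjacent genes."""
    operon_of = {g: i for i, op in enumerate(operons) for g in op}
    op1, op2 = operon_of.get(gene1), operon_of.get(gene2)
    if op1 is None or op1 != op2:
        return "NOP"  # different operons, or at least one gene outside any TU
    for tu in tus:
        if (gene1 in tu) != (gene2 in tu):
            return "SOP"  # a TU boundary falls between the two genes
    return "OP"  # every TU covering one gene also covers the other


# Toy example: gene order a-b-c-d-e with three invented TUs.
gene_order = ["a", "b", "c", "d", "e"]
tus = [{"a", "b", "c"}, {"b", "c"}, {"d", "e"}]
operons = operons_from_tus(tus)
for g1, g2 in zip(gene_order, gene_order[1:]):
    print(g1, g2, classify_pair(g1, g2, tus, operons))
```

Here the internal TU {b, c} starts between a and b, so that pair is SOP; b and c are always co-transcribed (OP), c and d lie in different operons (NOP), and d and e form an OP.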
Though the bacterial transcription regulation apparatus is distinct in terms of several structural and functional features from its eukaryotic counterpart, the gross structure of the transcription regulatory network (TRN) is believed to be similar in both superkingdoms. Here, we explore the fine structure of the bacterial TRN and the underlying “co-regulatory network (CRN)” to show that despite the superficial similarities to eukaryotic networks, the bacterial networks display entirely different organizational principles. In particular, unlike in eukaryotes, the hubs of bacterial networks are both global regulators and integrators of diverse disparate transcriptional responses. These and other organizational differences might correlate with the fundamental differences in gene and promoter organization in the two superkingdoms, especially the presence of operons and regulons in bacteria. We further explored the interplay, if any, between network structures, mode of regulatory interactions and signal sensing of TFs in shaping the bacterial transcriptional regulatory responses. For this purpose, we first classified TFs according to their regulatory mode (activator, repressor or dual regulator) and sensory mechanism (one-component systems responding to internal or external signals, TFs from two-component systems and chromosomal structure modifying TFs) in the bacterial model organism E. coli, and then we studied the overall evolutionary optimization of network structures. The incorporation of TFs in different hierarchical elements of the TRN appears to involve a multi-dimensional selection process depending on regulatory and sensory modes of TFs in motifs, co-regulatory associations between TFs of different functional classes and transcript half-lives. As a result, it appears to have generated circuits that allow intricately regulated physiological state changes.
We identified the biological significance of most of these optimizations, which can be further used as the basis to explore similar controls in other bacteria. We also show that, although on the larger evolutionary scale unrelated TFs have evolved to become hubs, within lineages like γ-proteobacteria there is a strong tendency to retain hubs, as well as certain higher-order network modules that have emerged through lineage-specific paralog duplications.
Using profiles of phylogenetic profiles (P-cubic) we compared the evolutionary dynamics of different kinds of functional associations. Ordered from most to least evolutionarily stable, these associations were genes in the same operons, genes whose products participate in the same biochemical pathway, genes coding for physically interacting proteins and genes in the same regulons. Regulons showed the most plastic functional interactions with evolutionary stabilities barely better than those of unrelated genes. Further regulon analyses showed that global regulators contain less evolutionarily stable associations than local regulators. Genes co-repressed by global regulators had a higher evolutionary conservation than genes co-activated by global regulators. However, the reverse was true for genes co-repressed and co-activated by local regulators. Of all the regulon-related associations, the relationship between regulators and their target genes showed the most evolutionary stability. Different negative data sets built to contrast against each of the analysed kinds of modules also differed in evolutionary conservation revealing further underlying genome organization. Applying P-cubic analyses to other genomes might help visualize genome organization, understand the evolutionary importance and plasticity of functional associations and compare the quality of data sets expected to reflect functional interactions, such as those coming from high-throughput experiments.
The Yop virulon enables extracellularly located Yersinia, in close contact with a eukaryotic target cell, to inject bacterial toxic proteins directly into the cytosol of this cell. Several Ysc proteins, forming the Yop secretion apparatus, display homology with proteins of the flagellar basal body. To determine whether this relationship could extend to the regulatory pathways, we analyzed the influence of flhDC, the master regulatory operon of the flagellum, on the yop regulon. In an flhDC mutant, the yop regulon was up-regulated. The transcription of virF and the steady-state level of the transcriptional activator VirF were enhanced. yop transcription was increased at 37°C and could also be detected at a low temperature. Yop secretion was increased at 37°C and occurred even at a low temperature. The Ysc secretion machinery was thus functional at room temperature in the absence of flagella, implying that in wild-type bacteria, FlhD and/or FlhC, or the product of a gene downstream of flhDC, represses the yop regulon. In agreement with this notion, increased expression of flhDC in wild-type bacteria resulted in the oversecretion of flagellins at room temperature and in decreased Yop secretion at 37°C.
The RegPrecise database (http://regprecise.lbl.gov) was developed for capturing, visualization and analysis of predicted transcription factor regulons in prokaryotes that were reconstructed and manually curated by utilizing the comparative genomic approach. A significant number of high-quality inferences of transcriptional regulatory interactions have already been accumulated for diverse taxonomic groups of bacteria. The reconstructed regulons include transcription factors, their cognate DNA motifs and regulated genes/operons linked to the candidate transcription factor binding sites. RegPrecise allows for browsing the regulon collections for: (i) conservation of DNA binding sites and regulated genes for a particular regulon across diverse taxonomic lineages; (ii) sets of regulons for a family of transcription factors; (iii) repertoire of regulons in a particular taxonomic group of species; (iv) regulons associated with a metabolic pathway or a biological process in various genomes. The initial release of the database includes ∼11 500 candidate binding sites for ∼400 orthologous groups of transcription factors from over 350 prokaryotic genomes. The majority of these data are represented by genome-wide regulon reconstructions in the Shewanella and Streptococcus genera and a large-scale prediction of regulons for the LacI family of transcription factors. Another section in the database represents the results of accurate regulon propagation to the closely related genomes.
Intragenomic and intergenomic comparisons of upstream nucleotide sequences of archaeal genes were performed with the goal of predicting transcription regulatory sites (operators) and identifying likely regulons. Learning sets for the detection of regulatory sites were constructed using the available experimental data on archaeal transcription regulation or by analogy with known bacterial regulons, and further analysis was performed using iterative profile searches. The information content of the candidate signals detected by this method is insufficient for reliable predictions to be made. Therefore, this approach has to be complemented by examination of evolutionary conservation in different archaeal genomes. This combined strategy resulted in the prediction of a conserved heat shock regulon in all euryarchaea, a nitrogen fixation regulon in the methanogens Methanococcus jannaschii and Methanobacterium thermoautotrophicum and an aromatic amino acid regulon in M.thermoautotrophicum. Unexpectedly, the heat shock regulatory site was detected not only for genes that encode known chaperone proteins but also for archaeal histone genes. This suggests a possible function for archaeal histones in stress-related changes in DNA condensation. In addition, comparative analysis of the genomes of three Pyrococcus species resulted in the prediction of their purine metabolism and transport regulon. The results demonstrate the feasibility of prediction of at least some transcription regulatory sites by comparing poorly characterized prokaryotic genomes, particularly when several closely related genome sequences are available.
Gene regulatory circuits are often shared between two closely related organisms. Our web tool iCR (identify Conserved target of a Regulon) makes use of this fact and identifies conserved targets of a regulatory protein. iCR is a refined extension of our previous tool PredictRegulon, which predicts, genome-wide, the potential binding sites and target operons of a regulatory protein in a single user-selected genome. Like PredictRegulon, iCR accepts known binding sites of a regulatory protein as an ungapped multiple sequence alignment and provides the potential binding sites. The important differences, however, are that the user can select more than one genome at a time and that the output reports the genes that are common to two or more species. To achieve this, iCR makes use of Cluster of Orthologous Group (COG) indices for the genes. The tool analyses the upstream regions of all user-selected prokaryotic genomes and gives the output based on conservation of target orthologs. iCR also reports the functional class codes, based on COG classification, for the encoded proteins of downstream genes, which helps the user understand the nature of the co-regulated genes on the results page itself. iCR is freely accessible at .
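The idea of reporting targets that recur in two or more genomes through a shared COG index can be sketched as follows. This is only an illustration of the principle under stated assumptions, not the iCR implementation: the function name, the input dictionaries and the placeholder identifiers (such as `COG_X`) are hypothetical, and the per-genome target lists are assumed to come from a prior binding-site scan.

```python
from collections import defaultdict


def conserved_targets(hits_by_genome, gene_to_cog, min_genomes=2):
    """Return COGs whose member genes are predicted targets in >= min_genomes.

    hits_by_genome: dict genome name -> list of predicted target gene IDs.
    gene_to_cog: dict gene ID -> COG identifier (genes without a COG are skipped).
    """
    genomes_per_cog = defaultdict(set)
    for genome, genes in hits_by_genome.items():
        for gene in genes:
            cog = gene_to_cog.get(gene)
            if cog is not None:
                genomes_per_cog[cog].add(genome)
    return {cog: sorted(genomes)
            for cog, genomes in genomes_per_cog.items()
            if len(genomes) >= min_genomes}


# Invented example data: identifiers are placeholders, not real COG entries.
hits = {"genomeA": ["a1", "a2"], "genomeB": ["b1"], "genomeC": ["c1"]}
cogs = {"a1": "COG_X", "a2": "COG_Y", "b1": "COG_X", "c1": "COG_Z"}
print(conserved_targets(hits, cogs))
```

Only `COG_X` is reported in the example, because it is the only group with predicted targets in two genomes.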
The type III protein secretion system is an important pathogenicity factor of enteropathogenic and enterohaemorrhagic Escherichia coli pathotypes. The genes encoding this apparatus are located on a pathogenicity island (the locus of enterocyte effacement) and are transcriptionally activated by the master regulator Ler. In each pathotype Ler is also known to regulate genes located elsewhere on the chromosome, but the full extent of the Ler regulon is unclear, especially for enteropathogenic E. coli. The Ler regulon was defined for two strains of E. coli: E2348/69 (enteropathogenic) and EDL933 (enterohaemorrhagic) in mid and late log phases of growth by DNA microarray analysis of the transcriptomes of wild-type and ler mutant versions of each strain. In both strains the Ler regulon is focused on the locus of enterocyte effacement – all major transcriptional units of which are activated by Ler, with the sole exception of the LEE1 operon during mid-log phase growth in E2348/69. However, the Ler regulon does extend more widely and also includes unlinked pathogenicity genes: in E2348/69 more than 50 genes outside of this locus were regulated, including a number of known or potential pathogenicity determinants; in EDL933 only 4 extra-LEE genes, again including known pathogenicity factors, were activated. In E2348/69, where the Ler regulon is clearly growth phase dependent, a number of genes including the plasmid-encoded regulator operon perABC, were found to be negatively regulated by Ler. Negative regulation by Ler of PerC, itself a positive regulator of the ler promoter, suggests a negative feedback loop involving these proteins.
We present a study on computational identification of uber-operons in a prokaryotic genome, each of which represents a group of operons that are evolutionarily or functionally associated through operons in other (reference) genomes. Uber-operons represent a rich set of footprints of operon evolution, whose full utilization could lead to new and more powerful tools for elucidation of biological pathways and networks than what operons have provided, and a better understanding of prokaryotic genome structures and evolution. Our prediction algorithm predicts uber-operons through identifying groups of functionally or transcriptionally related operons, whose gene sets are conserved across the target and multiple reference genomes. Using this algorithm, we have predicted uber-operons for each of a group of 91 genomes, using the other 90 genomes as references. In particular, we predicted 158 uber-operons in Escherichia coli K12 covering 1830 genes, and found that many of the uber-operons correspond to parts of known regulons or biological pathways or are involved in highly related biological processes based on their Gene Ontology (GO) assignments. For some of the predicted uber-operons that are not parts of known regulons or pathways, our analyses indicate that their genes are highly likely to work together in the same biological processes, suggesting the possibility of new regulons and pathways. We believe that our uber-operon prediction provides a highly useful capability and a rich information source for elucidation of complex biological processes, such as pathways in microbes. All the prediction results are available at our Uber-Operon Database: , the first of its kind.
The Haemophilus ducreyi 35000HP genome encodes a homolog of the CpxRA two-component cell envelope stress response system originally characterized in Escherichia coli. CpxR, the cytoplasmic response regulator, was shown previously to be involved in repression of the expression of the lspB-lspA2 operon (M. Labandeira-Rey, J. R. Mock, and E. J. Hansen, Infect. Immun. 77:3402-3411, 2009). In the present study, the H. ducreyi CpxR and CpxA proteins were shown to closely resemble those of other well-studied bacterial species. A cpxA deletion mutant and a CpxR-overexpressing strain were used to explore the extent of the CpxRA regulon. DNA microarray and real-time reverse transcriptase (RT) PCR analyses indicated several potential regulatory targets for the H. ducreyi CpxRA two-component regulatory system. Electrophoretic mobility shift assays (EMSAs) were used to prove that H. ducreyi CpxR interacted with the promoter regions of genes encoding both known and putative virulence factors of H. ducreyi, including the lspB-lspA2 operon, the flp operon, and dsrA. Interestingly, the use of EMSAs also indicated that H. ducreyi CpxR did not bind to the promoter regions of several genes predicted to encode factors involved in the cell envelope stress response. Taken together, these data suggest that the CpxRA system in H. ducreyi, in contrast to that in E. coli, may be involved primarily in controlling expression of genes not involved in the cell envelope stress response.
Regulation of the genes required for bioluminescence in the marine bacterium Vibrio fischeri (the lux regulon) is a complex process requiring coordination of several systems. The primary level of regulation is mediated by a positive regulatory protein, LuxR, and a small diffusible molecule, N-(3-oxo-hexanoyl)-homoserine lactone, termed autoinducer. Transcription of the luxR gene, which encodes the regulatory protein, is positively regulated by the cyclic AMP-CAP system. The lux regulon of V. fischeri consists of two divergently transcribed operons designated operonL and operonR. Transcription of the rightward operon (operonR; luxICDABE), consisting of the genes required for autoinducer synthesis (luxI) and light production (luxCDABE), is activated by LuxR in an autoinducer-dependent fashion. The leftward operon (operonL) consists of a single known gene, luxR. The LuxR protein has also been shown to decrease transcription of operonL through an autoinducer-dependent mechanism, thereby negatively regulating its own synthesis. In this paper we demonstrate that the autoinducer-dependent repression of operonL transcription requires not only LuxR but also DNA sequences within operonR which occur upstream of the promoter for operonL. In the absence of these DNA sequences, the LuxR protein causes an autoinducer-dependent activation of transcription of operonL. The lux operator, located in the control region between the two operons, was required for both the positive and negative autoinducer-dependent responses. By titration of high levels of LuxR supplied in trans with synthetic autoinducer, we found that low levels of autoinducer could elicit a positive response even in the presence of the negative-acting DNA sequences, while higher levels of autoinducer resulted in a negative response. Without these DNA sequences in operonR, LuxR and autoinducer stimulated transcription regardless of the level of autoinducer. 
These results suggest that a switch between stimulation and repression of operonL transcription is mediated by the levels of the LuxR-autoinducer complex, which in these experiments reflects the level of autoinducer in the growth medium.
In Xanthomonas oryzae pv. oryzae, the causal agent of bacterial leaf blight of rice, HrpXo is known to be a transcriptional regulator for the hypersensitive response and pathogenicity (hrp) genes. Several HrpXo regulons are preceded by a consensus sequence (TTCGC-N15-TTCGC), called the plant-inducible promoter (PIP) box, which is required for expression of the gene that follows. Thus, the PIP box can be an effective marker for screening HrpXo regulons from the genome database. It is not known, however, whether mutations in the PIP box cause a complete loss of promoter activity. In this study, we introduced base substitutions at each of the consensus nucleotides in the PIP box of the hrpC operon in X. oryzae pv. oryzae, and the promoter activity was examined by using a β-glucuronidase (GUS) reporter gene. Although the GUS activity was generally reduced by base substitutions, several mutated PIP boxes conferred considerable promoter activity. In several cases, even imperfect PIP boxes with two base substitutions retained 20% of the promoter activity found in the nonsubstituted PIP box. We screened HrpXo regulon candidates with an imperfect PIP box obtained from the genome database of X. oryzae pv. oryzae and found that at least two genes preceded by an imperfect PIP box with two base substitutions were actually expressed in an HrpXo-dependent manner. These results indicate that a base substitution in the PIP box is quite permissible for HrpXo-dependent expression and suggest that X. oryzae pv. oryzae may possess more HrpXo regulons than expected.
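Screening a sequence for perfect and imperfect PIP boxes (TTCGC-N15-TTCGC) can be sketched as a sliding-window comparison against the ten consensus positions. The function name, the return format and the default tolerance of two substitutions are assumptions chosen to mirror the text; the sketch scans one strand only and does not reproduce the authors' screening pipeline.

```python
HALF_SITE = "TTCGC"   # consensus half-site from the text
SPACER_LENGTH = 15    # N15 spacer between the two half-sites


def find_pip_boxes(sequence, max_mismatches=2):
    """Return (position, mismatches, site) for every PIP-box-like window.

    position is 0-based; mismatches counts substitutions in the ten
    consensus nucleotides only (the spacer is unconstrained).
    """
    sequence = sequence.upper()
    half = len(HALF_SITE)
    width = 2 * half + SPACER_LENGTH
    consensus = HALF_SITE + HALF_SITE
    hits = []
    for i in range(len(sequence) - width + 1):
        left = sequence[i:i + half]
        right = sequence[i + half + SPACER_LENGTH:i + width]
        mismatches = sum(a != b for a, b in zip(left + right, consensus))
        if mismatches <= max_mismatches:
            hits.append((i, mismatches, sequence[i:i + width]))
    return hits


# Synthetic sequences, invented for illustration.
perfect = "AAA" + "TTCGC" + "A" * 15 + "TTCGC" + "GG"
imperfect = "TTCGC" + "A" * 15 + "TTAGA"   # two substitutions in the second half-site
print(find_pip_boxes(perfect))
print(find_pip_boxes(imperfect))
print(find_pip_boxes(imperfect, max_mismatches=1))
```

Raising or lowering `max_mismatches` corresponds to how permissive the screen is toward imperfect PIP boxes.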
The presence of overlapping genes (OGs) is a common phenomenon in bacterial genomes. Most frequently, overlapping genes share coding regions of as few as one nucleotide to as many as thousands of nucleotides. Overlapping genes are often co-regulated, transcriptionally and translationally. Overlapping genes are also subject to the whims of evolution, as the gene overlap is known to be disrupted in some species/strains and participating genes are sometimes lost in independent lineages. Therefore, a better understanding of evolutionary patterns and rates of the disruption of overlapping genes is important for understanding genome structure and the evolution of gene function. In this study, we investigate the fate of ancestrally overlapping genes in complete genomes from 15 contemporary strains of Salmonella species. We find that the fates of overlapping genes inside and outside operons are distinctly different. A larger fraction of overlapping genes inside operons conserve their overlap as compared to gene pairs outside of operons (average 0.89 vs. 0.83 per genome). However, when overlapping genes in operons separate, one partner is lost more frequently than for separated genes outside of operons (average 0.02 vs. 0.01 per genome). We also investigate the fate of a pan set of overlapping genes at the present and ancestral nodes over a phylogenetic tree based on genome sequence data. We propose that co-regulation plays an important role in the fates of genes. Furthermore, a vast majority of disruptions occurred prior to the common ancestor of all 15 Salmonella strains, which enables us to obtain an estimate of disruptions between Salmonella and E. coli.
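As a rough illustration of the bookkeeping behind such an analysis, the sketch below detects overlapping neighboring genes from coordinates and computes the fraction of ancestral overlapping pairs that still overlap in a given genome. The function names, the coordinate convention (1-based, inclusive) and the toy data are assumptions; strand, nested genes and gene loss are ignored, and this is not the procedure used in the study.

```python
def overlapping_pairs(genes):
    """Return (gene1, gene2, overlap_length) for overlapping neighbors.

    genes: list of (name, start, end) with 1-based inclusive coordinates.
    Only genes adjacent in start order are compared.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    pairs = []
    for (name1, _start1, end1), (name2, start2, end2) in zip(ordered, ordered[1:]):
        overlap = min(end1, end2) - start2 + 1
        if overlap > 0:
            pairs.append((name1, name2, overlap))
    return pairs


def conserved_fraction(ancestral_pairs, genome_pairs):
    """Fraction of ancestral overlapping pairs that still overlap in a genome."""
    ancestral = {frozenset(pair[:2]) for pair in ancestral_pairs}
    if not ancestral:
        return 0.0
    current = {frozenset(pair[:2]) for pair in genome_pairs}
    return len(ancestral & current) / len(ancestral)


# Invented coordinates: genes a and b share four nucleotides, c is separate.
genome = [("a", 1, 100), ("b", 97, 200), ("c", 250, 300)]
pairs = overlapping_pairs(genome)
print(pairs)
print(conserved_fraction([("a", "b", 4), ("c", "d", 1)], pairs))
```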
The regulation of the expression of the operons in the flagella-chemotaxis regulon in Escherichia coli has been shown to be a highly ordered cascade which closely parallels the assembly of the flagellar structure and the chemotaxis machinery (T. Iino, Annu. Rev. Genet. 11:161-182, 1977; Y. Komeda, J. Bacteriol. 168: 1315-1318). The master operon, flbB, has been sequenced, and one of its gene products (FlaI) has been identified. On the basis of the deduced amino acid sequence, the FlbB protein has similarity to an alternate sigma factor which is responsible for expression of flagella in Bacillus subtilis. In addition, we have sequenced the 5' regions of a number of flagellar operons and compared these sequences with the 5' region of flagellar operons directly and indirectly under FlbB and FlaI control. We found both a consensus sequence which has been shown to be in all other flagellar operons (J. D. Helmann and M. J. Chamberlin, Proc. Natl. Acad. Sci. USA 84:6422-6424) and a derivative consensus sequence, which is found only in the 5' region of operons directly under FlbB and FlaI control.
Quorum sensing is the process of cell-to-cell communication by which bacteria communicate via secreted signal molecules called autoinducers. As cell population density increases, the accumulation of autoinducers leads to co-ordinated changes in gene expression across the bacterial community. The marine bacterium, Vibrio harveyi, uses three autoinducers to achieve intra-species, intra-genera and inter-species cell–cell communication. The detection of these autoinducers ultimately leads to the production of LuxR, the quorum-sensing master regulator that controls expression of the genes in the quorum-sensing regulon. LuxR is a member of the TetR protein superfamily; however, unlike other TetR repressors that typically repress their own gene expression and that of an adjacent operon, LuxR is capable of activating and repressing a large number of genes. Here, we used protein binding microarrays and a two-layered bioinformatics approach to show that LuxR binds a 21 bp consensus operator with dyad symmetry. In vitro and in vivo analyses of two promoters directly regulated by LuxR allowed us to identify those bases that are critical for LuxR binding. Together, the in silico and biochemical results enabled us to scan the genome and identify novel targets of LuxR in V. harveyi and thus expand the understanding of the quorum-sensing regulon.
An operon is a fundamental unit of transcription and contains specific functional genes for the construction and regulation of networks at the entire genome level. The correct prediction of operons is vital for understanding gene regulation and function in newly sequenced genomes. As experimental methods for operon detection tend to be nontrivial and time consuming, various methods for operon prediction have been proposed in the literature. In this study, a binary particle swarm optimization is used for operon prediction in bacterial genomes. The intergenic distance, participation in the same metabolic pathway, the cluster of orthologous groups, the gene length ratio and the operon length are used to design a fitness function. We trained the proper values on the Escherichia coli genome, and used the above five properties to implement feature selection. Finally, our study used the intergenic distance, the metabolic pathway and the gene length ratio to predict operons. Experimental results show that the prediction accuracy of this method reached 92.1%, 93.3% and 95.9% on the Bacillus subtilis genome, the Pseudomonas aeruginosa PAO1 genome and the Staphylococcus aureus genome, respectively. This method has enabled us to predict operons with high accuracy for these three genomes, for which only limited data on the properties of the operon structure exist.
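A generic binary particle swarm optimization (velocity update followed by a sigmoid transfer to bit probabilities) can be sketched as below. Each bit is taken to mean that an adjacent gene pair lies in the same operon. The fitness function is a deliberately simplified stand-in that uses only intergenic distance with an arbitrary 50 bp threshold; it is an assumption for illustration and not the five-property fitness function of the study, and all parameter values are hypothetical defaults.

```python
import math
import random


def binary_pso(fitness, n_bits, n_particles=20, n_iter=50,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Maximize fitness over bit vectors with a standard binary PSO."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_particles)]
    vel = [[0.0] * n_bits for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    best_index = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[best_index][:], pbest_fit[best_index]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(n_bits):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                v = max(-v_max, min(v_max, v))  # clamp the velocity
                vel[i][d] = v
                # Sigmoid transfer: the velocity sets the probability of a 1 bit.
                pos[i][d] = 1 if rng.random() < 1.0 / (1.0 + math.exp(-v)) else 0
            score = fitness(pos[i])
            if score > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], score
                if score > gbest_fit:
                    gbest, gbest_fit = pos[i][:], score
    return gbest, gbest_fit


def make_fitness(intergenic_distances, threshold=50):
    """Toy fitness: reward bits that agree with a distance threshold (assumed)."""
    def fitness(bits):
        return sum((bit == 1) == (dist < threshold)
                   for bit, dist in zip(bits, intergenic_distances))
    return fitness


# Invented intergenic distances (bp) between consecutive genes.
distances = [10, 200, 5, 300, 20, 15]
best_bits, best_score = binary_pso(make_fitness(distances), n_bits=len(distances))
print(best_bits, best_score)
```

Runs of consecutive 1 bits in the returned vector would correspond to predicted operons; a realistic fitness function would combine several features, as the abstract describes.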
Accurate prediction of DNA motifs that are targets of RNA polymerases, sigma factors and transcription factors (TFs) in prokaryotes is a difficult mission mainly due to as yet undiscovered features in DNA sequences or structures in promoter regions. Improved prediction and comparison algorithms are currently available for identifying transcription factor binding sites (TFBSs) and their accompanying TFs and regulon members.
We here extend the current databases of TFs, TFBSs and regulons with our knowledge on Lactococcus lactis and developed a webserver for prediction, mining and visualization of prokaryote promoter elements and regulons via a novel concept. This new approach includes an all-in-one method of data mining for TFs, TFBSs, promoters, and regulons for any bacterial genome via a user-friendly webserver. We demonstrate the power of this method by mining WalRK regulons in Lactococci and Streptococci and, vice versa, use L. lactis regulon data (CodY) to mine closely related species.
The PePPER webserver offers, besides the all-in-one analysis method, a toolbox for mining for regulons, promoters and TFBSs and accommodates a new L. lactis regulon database in addition to already existing regulon data. Identification of putative regulons and full annotation of intergenic regions in any bacterial genome on the basis of existing knowledge on a related organism can now be performed by biologists, and it can be done for a wide range of regulons. On the basis of the PePPER output, biologists can design experiments to further verify the existence and extent of the proposed regulons. The PePPER webserver is freely accessible at http://pepper.molgenrug.nl.
Comparative analysis of genes, operons and regulatory elements was applied to the lysine biosynthetic pathway in available bacterial genomes. We report identification of a lysine-specific RNA element, named the LYS element, in the regulatory regions of bacterial genes involved in biosynthesis and transport of lysine. Similarly to the previously described RNA regulatory elements for three vitamins (riboflavin, thiamin and cobalamin), purine and methionine regulons, this regulatory RNA structure is highly conserved on the sequence and structural levels. The LYS element includes regions of lysine-constitutive mutations previously identified in Escherichia coli and Bacillus subtilis. A possible mechanism of the lysine-specific riboswitch is similar to the previously defined mechanisms for the other metabolite-specific riboswitches and involves either transcriptional or translational attenuation in various groups of bacteria. Identification of LYS elements in Gram-negative γ-proteobacteria, Gram-positive bacteria from the Bacillus/Clostridium group, and Thermotogales resulted in description of the previously uncharacterized lysine regulon in these bacterial species. Positional analysis of LYS elements led to identification of a number of new candidate lysine transporters, namely LysW, YvsH and LysXY. Finally, the most likely candidates for genes of lysine biosynthesis missing in Gram-positive bacteria were identified using the genome context analysis.
The specificity of DNA-dependent RNA polymerase for target promoters is largely due to the replaceable sigma subunit that it carries. Multiple sigma proteins, each conferring a unique promoter preference on RNA polymerase, are likely to be present in all bacteria; however, their abundance and diversity have been best characterized in Bacillus subtilis, the bacterium in which multiple sigma factors were first discovered. The 10 sigma factors thus far identified in B. subtilis directly contribute to the bacterium's ability to control gene expression. These proteins are not merely necessary for the expression of those operons whose promoters they recognize; in many instances, their appearance within the cell is sufficient to activate these operons. This review describes the discovery of each of the known B. subtilis sigma factors, their characteristics, the regulons they direct, and the complex restrictions placed on their synthesis and activities. These controls include the anticipated transcriptional regulation that modulates the expression of the sigma factor structural genes but, in the case of several of the B. subtilis sigma factors, go beyond this, adding novel posttranslational restraints on sigma factor activity. Two of the sigma factors (sigma E and sigma K) are, for example, synthesized as inactive precursor proteins. Their activities are kept in check by "pro-protein" sequences which are cleaved from the precursor molecules in response to intercellular cues. Other sigma factors (sigma B, sigma F, and sigma G) are inhibited by "anti-sigma factor" proteins that sequester them into complexes which block their ability to form RNA polymerase holoenzymes. The anti-sigma factors are, in turn, opposed by additional proteins which participate in the sigma factors' release. The devices used to control sigma factor activity in B. subtilis may prove to be as widespread as multiple sigma factors themselves, providing ways of coupling sigma factor activation to environmental or physiological signals that cannot be readily joined to other regulatory mechanisms.
The leucine-responsive regulatory protein (Lrp) regulates the expression of more than 40 genes and proteins in Escherichia coli. Among the operons that are positively regulated by Lrp are operons involved in amino acid biosynthesis (ilvIH, serA), in the biosynthesis of pili (pap, fan, fim), and in the assimilation of ammonia (glnA, gltBD). Negatively regulated operons include operons involved in amino acid catabolism (sdaA, tdh) and peptide transport (opp) and the operon coding for Lrp itself (lrp). Detailed studies of a few members of the regulon have shown that Lrp can act directly to activate or repress transcription of target operons. A substantial fraction of operons regulated by Lrp are also regulated by leucine, and the effect of leucine on expression of these operons requires a functional Lrp protein. The patterns of regulation are surprising and interesting: in some cases activation or repression mediated by Lrp is antagonized by leucine, in other cases Lrp-mediated activation or repression is potentiated by leucine, and in still other cases leucine has no effect on Lrp-mediated regulation. Current research is just beginning to elucidate the detailed mechanisms by which Lrp can mediate such a broad spectrum of regulatory effects. Our view of the role of Lrp in metabolism may change as more members of the regulon are identified and their regulation characterized, but at this point Lrp seems to be important in regulating nitrogen metabolism and one-carbon metabolism, permitting adaptations to feast and to famine.
Recognition of transcription regulation sites (operators) is a hard problem in computational molecular biology. In most cases, small sample size and a low degree of sequence conservation preclude the construction of reliable recognition rules. We suggest an approach to this problem based on simultaneous analysis of several related genomes. It appears that as long as a gene coding for a transcription regulator is conserved in the compared bacterial genomes, the regulation of the respective group of genes (the regulon) also tends to be maintained. Thus a gene can be confidently predicted to belong to a particular regulon if not only the gene itself, but also its orthologs in other genomes, have candidate operators in their regulatory regions. This provides greater sensitivity of operator identification, as even relatively weak signals are likely to be functionally relevant when conserved. We use this approach to analyze the purine (PurR), arginine (ArgR) and aromatic amino acid (TrpR and TyrR) regulons of Escherichia coli and Haemophilus influenzae. Candidate binding sites in regulatory regions of the respective H. influenzae genes are identified, a new family of purine transport proteins predicted to belong to the PurR regulon is described, and probable regulation of arginine transport by ArgR is demonstrated. Differences in the regulation of some orthologous genes in E. coli and H. influenzae, in particular the apparent lack of autoregulation of the purine repressor gene in H. influenzae, are demonstrated.
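The conservation criterion described above can be illustrated with a minimal sketch. It is not the authors' implementation: the log-odds weight matrix with a uniform background, the pseudocount, the score threshold, and the function names (build_pwm, best_site_score, conserved_regulon_members) are all assumptions chosen for illustration. Sequences are assumed to be uppercase and to contain only A, C, G and T.

```python
from math import log2


def build_pwm(sites, pseudocount=1.0):
    """Build a log-odds weight matrix from aligned, equal-length sites.

    Assumes a uniform background (0.25 per base); this is an
    illustrative choice, not taken from the abstract.
    """
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [site[i] for site in sites]
        pwm.append({
            base: log2(((column.count(base) + pseudocount) / total) / 0.25)
            for base in "ACGT"
        })
    return pwm


def best_site_score(pwm, region):
    """Return the best score of any window of the region against the matrix."""
    width = len(pwm)
    scores = [
        sum(pwm[i][region[start + i]] for i in range(width))
        for start in range(len(region) - width + 1)
    ]
    return max(scores) if scores else float("-inf")


def conserved_regulon_members(pwm, upstream, threshold, min_genomes=2):
    """Keep ortholog groups with a candidate site in at least min_genomes genomes.

    upstream maps ortholog group -> {genome name: upstream sequence}.
    """
    members = []
    for group, regions in upstream.items():
        hits = sum(
            1 for seq in regions.values()
            if best_site_score(pwm, seq) >= threshold
        )
        if hits >= min_genomes:
            members.append(group)
    return members
```

With min_genomes=2, a gene is retained only when a candidate operator is found upstream of the gene and of at least one ortholog, which is the sense in which a weak but conserved signal is treated as more credible than a strong signal in a single genome.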
The availability of the complete genome sequence for Shewanella oneidensis MR-1 has permitted a comprehensive characterization of the ferric uptake regulator (Fur) modulon in this dissimilatory metal-reducing bacterium. We have employed targeted gene mutagenesis, DNA microarrays, proteomic analysis using liquid chromatography-mass spectrometry, and computational motif discovery tools to define the S. oneidensis Fur regulon. Using this integrated approach, we identified nine probable operons (containing 24 genes) and 15 individual open reading frames (ORFs), either with unknown functions or encoding products annotated as transport or binding proteins, that are predicted to be direct targets of Fur-mediated repression. This study suggested, for the first time, possible roles for four operons and eight ORFs with unknown functions in iron metabolism or iron transport-related functions. Proteomic analysis clearly identified a number of transporters, binding proteins, and receptors related to iron uptake that were up-regulated in response to a fur deletion and verified the expression of nine genes originally annotated as pseudogenes. Comparison of the transcriptome and proteome data revealed strong correlation for genes shown to be undergoing large changes at the transcript level. A number of genes encoding components of the electron transport system were also differentially expressed in a fur deletion mutant. The gene omcA (SO1779), which encodes a decaheme cytochrome c, exhibited significant decreases in both mRNA and protein abundance in the fur mutant and possessed a strong candidate Fur-binding site in its upstream region, thus suggesting that omcA may be a direct target of Fur activation.