The RegPrecise database is publicly accessible through a web interface at http://regprecise.lbl.gov. The home page provides several ways to access the regulon descriptions. Two key entry points, ‘Regulon collections’ and ‘Browse and statistics’, allow browsing through the database content, whereas ‘Search gene/regulator’ is useful for finding information about specific target genes and TF regulators in individual genomes. For an overview of the database content, the ‘Browse and statistics’ link offers two types of browsing: ‘Browse by regulog’ and ‘Browse by genome’.
Following the ‘Regulon collections’ link, the user gets a list of all available collections of regulogs, organized into groups corresponding to the four types of collections described above. Each collection web page provides condensed information about all TF regulogs inferred by the comparative genomics approach for a particular group of genomes, TFs or biological pathways, and includes summary statistics on the number of genomes, regulogs, TFs and TFBSs within the collection. Each type of collection focuses on certain aspects of the evolution of transcriptional regulation, and thus requires a different data representation. The interface implemented in RegPrecise is illustrated below with two examples.
The representation of a collection of regulogs by taxonomic group (as illustrated by the ‘Shewanella’ collection in A) provides an overview table of 74 reconstructed TF regulogs, sorted by TF protein family, in a set of 13 Shewanella genomes, sorted by taxonomy. In this table, rows and columns correspond to regulogs and genomes, respectively, and each non-empty cell (colored green) links to a web page with a detailed description of a particular TF regulon in an individual genome. The table shows the distribution of orthologous TFs in a group of genomes, highlights universally conserved and narrowly distributed regulogs, and provides a general functional classification of target genes within the regulogs.
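The layout of such an overview table can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the RegPrecise implementation: the regulog names, family names, genome names and the distribution labels returned by `classify` are all hypothetical placeholders, and the genome list is assumed to be already in taxonomic order.

```python
from collections import defaultdict

# Hypothetical records: (regulog, TF family, genome) for each reconstructed regulon.
# None of these names come from RegPrecise; they are placeholders.
records = [
    ("RegB", "FamY", "genome_1"),
    ("RegB", "FamY", "genome_2"),
    ("RegB", "FamY", "genome_3"),
    ("RegA", "FamX", "genome_1"),
    ("RegA", "FamX", "genome_3"),
    ("RegC", "FamZ", "genome_2"),
]
genomes = ["genome_1", "genome_2", "genome_3"]  # assumed to be in taxonomic order


def build_overview(records, genomes):
    """Return rows of (family, regulog, presence flags), sorted by TF family."""
    present = defaultdict(set)
    family_of = {}
    for regulog, family, genome in records:
        present[regulog].add(genome)
        family_of[regulog] = family
    rows = []
    for regulog in sorted(present, key=lambda r: (family_of[r], r)):
        flags = [g in present[regulog] for g in genomes]
        rows.append((family_of[regulog], regulog, flags))
    return rows


def classify(flags):
    """Label a regulog by how widely it is distributed (labels are illustrative)."""
    count = sum(flags)
    if count == len(flags):
        return "universal"
    if count == 1:
        return "genome-specific"
    return "partial"


overview = build_overview(records, genomes)
for family, regulog, flags in overview:
    cells = " ".join("X" if f else "." for f in flags)  # X marks a non-empty cell
    print(f"{family:<6} {regulog:<6} {cells}  {classify(flags)}")
```

Each `X` corresponds to a non-empty cell of the table, that is, a regulon reconstructed in that genome; rows filled across all columns correspond to universally conserved regulogs.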
The representation of a collection of regulogs by TF (as illustrated by the Zur regulon collection in B) provides a summary of all regulogs reconstructed for orthologous TFs across diverse taxonomic groups of bacteria. Each regulog is annotated with a phylum name, and the regulog name gives a more precise definition of the taxonomic group in which it has been reconstructed. For this type of collection we also provide an alignment of TFBS motifs, each built from the set of TFBSs inferred for the corresponding regulog. These TFBS motifs are represented by sequence logos drawn with the WebLogo package v.2.6 (38). Sequence logos are particularly useful for the comparison and evolutionary analysis of TFBS motifs between orthologous TF regulogs from different taxonomic groups.
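To make concrete what a sequence logo encodes, the sketch below computes the per-position information content (in bits) of a set of aligned binding sites, which determines the height of each letter stack in a logo. It is a simplified illustration and not the WebLogo code: the site sequences are hypothetical, a uniform background is assumed, and the small-sample correction that logo software typically applies is omitted.

```python
import math
from collections import Counter

# Hypothetical aligned binding sites of equal length (not taken from RegPrecise).
sites = [
    "ACGT",
    "ACGA",
    "ACTT",
    "ACGT",
]


def information_content(sites, alphabet="ACGT"):
    """Per-position information content in bits: log2(|alphabet|) minus entropy."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    max_bits = math.log2(len(alphabet))
    bits = []
    for column in zip(*sites):
        counts = Counter(column)
        entropy = 0.0
        for base in alphabet:
            freq = counts[base] / len(column)
            if freq > 0:
                entropy -= freq * math.log2(freq)
        bits.append(max_bits - entropy)
    return bits


ic = information_content(sites)
for position, value in enumerate(ic, start=1):
    print(f"position {position}: {value:.2f} bits")
```

Fully conserved positions reach the maximum of 2 bits for DNA, whereas variable positions score lower. Comparing these profiles between regulogs is one simple way to see where orthologous TF motifs have diverged.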
Regulog collection web pages, being the upper level in the data hierarchy of RegPrecise, provide all necessary links to the web pages at the regulog and regulon levels. The regulog page provides a comparative table showing the conservation of gene regulation across genomes within a particular regulog (A). Essentially, this table shows a phylogenetic profile of gene regulation based on the presence or absence of regulation by a particular TF in every genome. This type of visualization allows the user to easily identify a core part of the regulon, that is, a set of genes controlled by the TF in most of the analyzed genomes, and a variable part of the regulon, populated by genes whose regulation is conserved in only a few genomes. The regulog web page also provides a brief description of the TF (TF family, effector), a list of analyzed genomes with the number of predicted target genes and operons, and a TFBS motif sequence logo.
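The distinction between the core and variable parts of a regulon can be expressed as a simple threshold on the phylogenetic profile. The sketch below is illustrative only: the gene and genome names are hypothetical, and the cut-off of more than half of the genomes is an assumption standing in for "most of the analyzed genomes", not a value defined by RegPrecise.

```python
# Hypothetical phylogenetic profile: gene (ortholog group) -> set of genomes
# in which the gene is predicted to be regulated by the TF.
profile = {
    "geneA": {"g1", "g2", "g3", "g4"},
    "geneB": {"g1", "g2", "g3"},
    "geneC": {"g2"},
    "geneD": {"g1", "g4"},
}
genomes = ["g1", "g2", "g3", "g4"]


def split_regulon(profile, genomes, min_fraction=0.5):
    """Split genes into core and variable parts of a regulon.

    A gene is 'core' if it is regulated in more than min_fraction of the
    genomes; the 0.5 default is an assumed cut-off for illustration.
    """
    genome_set = set(genomes)
    core, variable = [], []
    for gene in sorted(profile):
        fraction = len(profile[gene] & genome_set) / len(genome_set)
        if fraction > min_fraction:
            core.append(gene)
        else:
            variable.append(gene)
    return core, variable


core, variable = split_regulon(profile, genomes)
print("core:", core)
print("variable:", variable)
```

With this toy profile, `geneA` and `geneB` fall into the core, while `geneC` and `geneD` (regulated in a quarter and exactly half of the genomes, respectively) fall into the variable part.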
The lower level in the data hierarchy of RegPrecise is a regulon described in an individual genome. The regulon page shows detailed information about all inferred regulatory interactions for a particular TF in a particular genome (B). This web page gives a brief description of the TF (GenBank locus tag, TF family, effector) and a complete list of predicted target genes organized into putative transcriptional units, with detailed information about the associated TFBSs (site sequence, score and position relative to the start of the first gene). In addition to this plain view of all target operons within a particular genome, we provide an orthogonal view of a particular operon in all genomes analyzed for a particular regulog (C). The latter view allows the user to assess the conservation of regulation for a particular operon.
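The two views described above can be modeled with a small data structure. The following is a sketch under stated assumptions rather than the RegPrecise schema: all class names, field names and example values are hypothetical, and operons are matched across genomes by an assumed shared `label` instead of a real orthology mapping.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BindingSite:
    sequence: str
    score: float
    position: int  # relative to the start of the first gene of the operon


@dataclass
class Operon:
    label: str        # hypothetical shared label used to match operons across genomes
    genes: List[str]  # locus tags of the genes in the putative transcriptional unit
    sites: List[BindingSite] = field(default_factory=list)


@dataclass
class Regulon:
    genome: str
    tf_locus_tag: str
    tf_family: str
    effector: str
    operons: List[Operon] = field(default_factory=list)


def operon_view(regulons: List[Regulon], label: str) -> Dict[str, Optional[Operon]]:
    """Orthogonal view: one operon across all genomes of a regulog.

    Maps each genome to the matching operon, or None if the operon is not
    part of the regulon in that genome.
    """
    view: Dict[str, Optional[Operon]] = {}
    for regulon in regulons:
        match = next((op for op in regulon.operons if op.label == label), None)
        view[regulon.genome] = match
    return view


# Placeholder example data (not real loci, sites or scores).
r1 = Regulon("genome_1", "LOCUS1_0001", "FamX", "effector_1",
             [Operon("opA", ["LOCUS1_0010", "LOCUS1_0011"],
                     [BindingSite("ACGTACGT", 5.2, -45)])])
r2 = Regulon("genome_2", "LOCUS2_0001", "FamX", "effector_1", [])

view = operon_view([r1, r2], "opA")
for genome, operon in view.items():
    print(genome, "regulated" if operon else "not regulated")
```

Iterating over `Regulon.operons` corresponds to the per-genome view, while `operon_view` gives the cross-genome view of a single operon.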
Collections of regulogs, individual regulogs and regulon pages in the database are linked to the associated TFBS profile web pages that provide a list of all TFBSs identified for a particular regulog in a subset of genomes (including first gene locus tag, site sequence and relative position), and a TFBS profile represented as a sequence logo.