Accurate recognition of regulatory elements in promoters is an essential
prerequisite for understanding the mechanisms of gene regulation at the
level of transcription. Composite regulatory elements represent a particular
type of such transcriptional regulatory elements consisting of pairs of
individual DNA motifs. In contrast to the present approach, most available
recognition techniques are based purely on statistical evaluation of the
occurrence of single motifs. Such methods are limited in application, since
the accuracy of recognition is greatly dependent on the size and quality of
the sequence dataset. Methods that exploit available knowledge and have
broad applicability are evidently needed.
We developed a novel method to identify composite regulatory elements in
promoters using a library of known examples. In-depth investigation of
regularities encoded in known composite elements allowed us to introduce a
new characteristic measure and to improve the specificity compared with
other methods. Tests on an established benchmark and real genomic data show
that our method outperforms other available methods based either on known
examples or statistical evaluations. In addition to better recognition, the
method has two practical advantages: it can detect a large number of
different types of composite elements, and its results have a direct
biological interpretation. The program is
and includes an option to extend the provided library with user-supplied
The novel algorithm for the identification of composite regulatory elements
presented in this paper proved superior to existing methods. Its
application to tissue-specific promoters identified several highly specific
composite elements with relevance to their biological function. This
approach together with other methods will further advance the understanding
of transcriptional regulation of genes.
Massive gene expression changes in different cellular states measured by microarrays, in fact, reflect just an "echo" of real molecular processes in the cells. Transcription factors constitute a class of the regulatory molecules that typically require posttranscriptional modifications or ligand binding in order to exert their function. Therefore, such important functional changes of transcription factors are not directly visible in the microarray experiments.
We developed a novel approach to find key transcription factors that may explain concerted expression changes of specific components of the signal transduction network. The approach aims at revealing evidence of positive feedback loops in the signal transduction circuits through activation of pathway-specific transcription factors. We demonstrate that promoters of genes encoding components of many known signal transduction pathways are enriched in binding sites for those transcription factors that are endpoints of the considered pathways. Application of the approach to microarray gene expression data on TNF-alpha stimulated primary human endothelial cells helped to reveal novel key transcription factors potentially involved in the regulation of the signal transduction pathways of the cells.
We developed a novel computational approach for revealing key transcription factors by knowledge-based analysis of gene expression data with the help of databases on gene regulatory networks (TRANSFAC® and TRANSPATH®). The corresponding software and databases are available at .
Composite Module Analyst (CMA) is a novel software tool that aims to identify promoter-enhancer models based on the composition of transcription factor (TF) binding sites and their pairs. CMA is closely interconnected with the TRANSFAC® database. In particular, CMA uses the positional weight matrix (PWM) library collected in TRANSFAC® and can therefore search for a large variety of different TF binding sites. We model the structure of long gene regulatory regions by a Boolean function that joins several local modules, each consisting of co-localized TF binding sites. Given a set of co-regulated genes as input, CMA builds the promoter model and optimizes its parameters automatically by applying a genetic-regression algorithm. The algorithm uses a multicomponent fitness function that combines several statistical criteria in a weighted linear function. We show examples of successful application of CMA to microarray data on transcription profiling of TNF-alpha stimulated primary human endothelial cells. The CMA web server is freely accessible at . An advanced version of CMA is also a part of the commercial system ExPlain™ () designed for causal analysis of gene expression data.
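The weighted linear fitness function mentioned above can be illustrated with a minimal sketch. The abstract does not name the statistical criteria or their weights, so the criterion names and numbers below are hypothetical placeholders and this is not CMA's actual implementation.

```python
from typing import Dict


def weighted_fitness(criteria: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine several criterion scores into one fitness value.

    Both dictionaries must use the same keys. The keys used in the example
    below are illustrative assumptions, not criteria taken from CMA.
    """
    if criteria.keys() != weights.keys():
        raise ValueError("criteria and weights must have identical keys")
    # Weighted linear combination: sum over criteria of weight * score.
    return sum(weights[name] * criteria[name] for name in criteria)


# Hypothetical scores for one candidate promoter model.
example_scores = {"separation": 0.5, "coverage": 1.0}
example_weights = {"separation": 2.0, "coverage": 0.5}
example_fitness = weighted_fitness(example_scores, example_weights)
```

In a genetic algorithm, a value of this kind would be computed for every candidate model in the population and used to rank candidates for selection.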
The TRANSFAC® database on transcription factors, their binding sites, nucleotide distribution matrices and regulated genes as well as the complementing database TRANSCompel® on composite elements have been further enhanced on various levels. A new web interface with different search options and integrated versions of Match™ and Patch™ provides increased functionality for TRANSFAC®. The list of databases which are linked to the common GENE table of TRANSFAC® and TRANSCompel® has been extended by: Ensembl, UniGene, EntrezGene, HumanPSD™ and TRANSPRO™. Standard gene names from HGNC, MGI and RGD are included for human, mouse and rat genes, respectively. With the help of InterProScan, Pfam, SMART and PROSITE domains are assigned automatically to the protein sequences of the transcription factors. In addition to the COMPEL table, TRANSCompel® now contains a separate table for detailed information on the experimental EVIDENCE on which the composite elements are based. Finally, with respect to data growth in TRANSFAC®, the gain of Drosophila transcription factor binding sites (by courtesy of the Drosophila DNase I footprint database) and of Arabidopsis factors (by courtesy of DATF, Database of Arabidopsis Transcription Factors) deserves particular mention. The public releases described here, TRANSFAC® 7.0 and TRANSCompel® 7.0, are accessible under .
TRANSPATH® is a database about signal transduction events. It provides information about signaling molecules, their reactions and the pathways these reactions constitute. The representation of signaling molecules is organized in a number of orthogonal hierarchies reflecting the classification of the molecules, their species-specific or generic features, and their post-translational modifications. Reactions are similarly hierarchically organized in a three-layer architecture, differentiating between reactions that are evidenced by individual publications, generalizations of these reactions to construct species-independent ‘reference pathways’ and the ‘semantic projections’ of these pathways. A number of search and browse options allow easy access to the database contents, which can be visualized with the tool PathwayBuilder™. The module PathoSign adds data about pathologically relevant mutations in signaling components, including their genotypes and phenotypes. TRANSPATH® and PathoSign can be used as an encyclopaedia, in the educational process, for visualization and modeling of signal transduction networks and for the analysis of gene expression data. TRANSPATH® Public 6.0 is freely accessible for users from non-profit organizations under .
TRANSPATH® can either be used as an encyclopedia, for both specific and general
information on signal transduction, or can serve as a network analyser. Therefore,
three modules have been created: the first one is the data, which have been manually
extracted, mostly from the primary literature; the second is PathwayBuilder™,
which provides several different types of network visualization and hence facilitates
understanding; the third is ArrayAnalyzer™, which is particularly suited to gene
expression array interpretation, and is able to identify key molecules within signalling
networks (potential drug targets). These key molecules could be responsible for
the coordinated regulation of downstream events. Manual data extraction focuses
on direct reactions between signalling molecules and the experimental evidence for
them, including species of genes/proteins used in individual experiments, experimental
systems, materials and methods. This combination of materials and methods is
used in TRANSPATH® to assign a quality value to each experimentally proven
reaction, which reflects the probability that this reaction would happen under
physiological conditions. Another important feature in TRANSPATH® is the inclusion
of transcription factor–gene relations, which are transferred from TRANSFAC®,
a database focused on transcription regulation and transcription factors. Since
interactions between molecules are mainly direct, this allows a complete and
stepwise pathway reconstruction from ligands to regulated genes. More information is
available at www.biobase.de/pages/products/databases.html.
Match™ is a weight matrix-based tool for searching putative transcription factor binding sites in DNA sequences. Match™ is closely interconnected and distributed together with the TRANSFAC® database. In particular, Match™ uses the matrix library collected in TRANSFAC® and therefore provides the possibility to search for a great variety of different transcription factor binding sites. Several sets of optimised matrix cut-off values are built into the system to provide a variety of search modes of different stringency. Users may construct and save their own profiles, which are selected subsets of matrices with default or user-defined cut-off values. Furthermore, a number of tissue-specific profiles are provided that were compiled by the TRANSFAC® team. A public version of the Match™ tool is available at: http://www.gene-regulation.com/pub/programs.html#match. The same program with a different web interface can be found at http://compel.bionet.nsc.ru/Match/Match.html. An advanced version of the tool called Match™ Professional is available at http://www.biobase.de.
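To make the idea of a matrix search with a cut-off concrete, the following minimal sketch scans a sequence with a small count matrix and keeps windows whose normalised score reaches a chosen cut-off. It is a simplified illustration only: the matrix is a made-up example rather than a TRANSFAC® entry, the function names are hypothetical, and the score is a plain min-max normalised sum, which is an assumption and not necessarily the scoring used by Match™.

```python
from typing import Dict, List, Tuple

# Hypothetical 4-position count matrix (one dict per position); not from TRANSFAC.
MATRIX: List[Dict[str, int]] = [
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 8, "C": 1, "G": 1, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
]


def similarity(window: str, matrix: List[Dict[str, int]]) -> float:
    """Min-max normalised matrix score of one window, in the range 0..1."""
    current = sum(column[base] for column, base in zip(matrix, window))
    lowest = sum(min(column.values()) for column in matrix)
    highest = sum(max(column.values()) for column in matrix)
    return (current - lowest) / (highest - lowest)


def scan(seq: str, matrix: List[Dict[str, int]], cutoff: float) -> List[Tuple[int, str, float]]:
    """Return (start, window, score) for every window scoring at or above cutoff."""
    seq = seq.upper()
    width = len(matrix)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = similarity(window, matrix)
        if score >= cutoff:
            hits.append((start, window, score))
    return hits


strict_hits = scan("CCGTATCC", MATRIX, cutoff=0.9)
relaxed_hits = scan("GTCT", MATRIX, cutoff=0.8)
```

Raising the cut-off trades sensitivity for specificity: the window GTCT passes at 0.8 but is rejected at 0.9, which is the kind of stringency choice that a profile of per-matrix cut-offs encodes.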
The TRANSFAC® database on eukaryotic transcriptional regulation, comprising data on transcription factors, their target genes and regulatory binding sites, has been extended and further developed, both in number of entries and in the scope and structure of the collected data. Structured fields for expression patterns have been introduced for transcription factors from human and mouse, using the CYTOMER® database on anatomical structures and developmental stages. The functionality of Match™, a tool for matrix-based search of transcription factor binding sites, has been enhanced. For instance, the program now comes with a number of tissue-(or state-)specific profiles, and new profiles can be created and modified with Match™ Profiler. The GENE table was extended and gained in importance; it now contains, among others, links to LocusLink, RefSeq and OMIM. Further, direct links between factor and target gene on the one hand and between gene and encoded factor on the other hand were introduced. The TRANSFAC® public release is available at http://www.gene-regulation.com. For yeast, an additional release including the latest data was made available separately as TRANSFAC® Saccharomyces Module (TSM) at http://transfac.gbf.de. For CYTOMER®, free download versions are available at http://www.biobase.de:8080/index.html.
Originating from COMPEL, the TRANSCompel® database emphasizes the key role of specific interactions between transcription factors binding to their target sites, which provide specific features of gene regulation in a particular cellular context. Composite regulatory elements contain two closely situated binding sites for distinct transcription factors and represent minimal functional units providing combinatorial transcriptional regulation. Both specific factor–DNA and factor–factor interactions contribute to the function of composite elements (CEs). Information about the structure of known CEs and specific gene regulation achieved through such CEs appears to be extremely useful for promoter prediction, for gene function prediction and for applied gene engineering as well. Each database entry corresponds to an individual CE within a particular gene and contains information about two binding sites, two corresponding transcription factors and experiments confirming cooperative action between transcription factors. The COMPEL database, equipped with the search and browse tools, is available at http://www.gene-regulation.com/pub/databases.html#transcompel. Moreover, we have developed the program CATCH™ for searching potential CEs in DNA sequences. It is freely available as CompelPatternSearch at http://compel.bionet.nsc.ru/FunSite/CompelPatternSearch.html.
COMPEL is a database on composite regulatory elements, the basic structures of combinatorial regulation. Composite regulatory elements contain two closely situated binding sites for distinct transcription factors and represent minimal functional units providing combinatorial transcriptional regulation. Both specific factor–DNA and factor–factor interactions contribute to the function of composite elements (CEs). Information about the structure of known CEs and specific gene regulation achieved through such CEs appears to be extremely useful for promoter prediction, for gene function prediction and for applied gene engineering as well. The structure of the relational model of COMPEL is determined by the concept of molecular structure and regulatory role of CEs. Based on the set of a particular CE, a program has been developed for searching potential CEs in gene regulatory regions. WWW search and browse routines were developed for COMPEL release 3.0. The COMPEL database equipped with the search and browse tools is available at http://compel.bionet.nsc.ru/ . The program for prediction of potential CEs of NFAT type is available at http://compel.bionet.nsc.ru/FunSite.html and http://transfac.gbf.de/dbsearch/funsitep/s_comp.html
Transcription Regulatory Regions Database (TRRD) has been developed for accumulation of experimental information on the structure–function features of regulatory regions of eukaryotic genes. Each entry in TRRD corresponds to a particular gene and contains a description of structure–function features of its regulatory regions (transcription factor binding sites, promoters, enhancers, silencers, etc.) and gene expression regulation patterns. The current release, TRRD 4.2.5, comprises the description of 760 genes, 3403 expression patterns, and >4600 regulatory elements including 3604 transcription factor binding sites, 600 promoters and 152 enhancers. This information was obtained through annotation of 2537 scientific publications. TRRD 4.2.5 is available through the WWW at http://wwwmgs.bionet.nsc.ru/mgs/dbases/trrd4/
TRANSFAC is a database on transcription factors, their genomic binding sites and DNA-binding profiles. In addition to being updated and extended by new features, it has been complemented now by a series of additional database modules. Among them, modules which provide data about signal transduction pathways (TRANSPATH) or about cell types/organs/developmental stages (CYTOMER) are available as well as an updated version of the previously described COMPEL database. The databases are available on the WWW at http://transfac.gbf.de/
The Transcription Regulatory Regions Database (TRRD) is a curated database designed for accumulation of experimental data on extended regulatory regions of eukaryotic genes, the regulatory elements they contain, i.e., transcription factor binding sites, promoters, enhancers, silencers, etc., and expression patterns of the genes. Release 4.1 of TRRD offers a number of significant improvements, in particular, a more detailed description of transcription factor binding sites, transcription factors per se, and gene expression patterns in a computer-readable format. In addition, the new TRRD release provides considerably more references to other molecular biological databases. TRRD 4.1 is installed under SRS and is available through the WWW at http://www.bionet.nsc.ru/trrd/
TRANSFAC, TRRD (Transcription Regulatory Region Database) and COMPEL are databases which store information about transcriptional regulation in eukaryotic cells. The three databases provide distinct views on the components involved in transcription: transcription factors and their binding sites and binding profiles (TRANSFAC), the regulatory hierarchy of whole genes (TRRD), and the structural and functional properties of composite elements (COMPEL). The quantitative and qualitative changes of all three databases and connected programs are described. The databases are accessible via WWW: http://transfac.gbf.de/TRANSFAC or http://www.bionet.nsc.ru/TRRD
Three databases that provide data on transcriptional regulation are described. TRANSFAC is a database on transcription factors and their DNA binding sites. TRRD (Transcription Regulatory Region Database) collects information about complete regulatory regions, their regulation properties and architecture. COMPEL comprises specific information on composite regulatory elements. Here, we describe the present status of these databases and the first steps towards their federation.
Over the past years, evidence has been accumulating for a fundamental role of protein-protein interactions between transcription factors in gene-specific transcription regulation. Many of these interactions occur within composite elements containing binding sites for several factors. We have selected 101 composite regulatory elements identified experimentally in the regulatory regions of 64 genes of vertebrates and of their viruses and briefly described them in a compilation. Of these, 82 composite elements are of the synergistic type and 19 of the antagonistic type. Within synergistic composite elements, transcription factors bind to the corresponding sites simultaneously, thus cooperatively activating transcription. Factors binding to their target sites within antagonistic composite elements produce opposing effects on transcription. The nucleotide sequence and localization in the genes, and the names and brief description of transcription factors, are provided for each composite element, including a representation of experimental data on its functioning. Most of the composite elements (3/4) fall between -250 bp and the transcription start site. The distance between the binding sites within the composite elements described varies from complete overlap to 80 bp. The compilation of composite elements is presented in the database COMPEL, which is electronically accessible by anonymous ftp via internet.
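The spacing observation above (binding sites ranging from complete overlap to 80 bp apart) suggests a simple way to propose candidate composite elements from two lists of predicted sites. The sketch below is an illustration under stated assumptions: sites are half-open (start, end) intervals on the same sequence, the function name is hypothetical, and the 80 bp default merely mirrors the range reported in the compilation. It is not the search procedure of any COMPEL-related program.

```python
from typing import List, Tuple

Site = Tuple[int, int]  # half-open interval: (start, end)


def candidate_pairs(sites_a: List[Site], sites_b: List[Site],
                    max_gap: int = 80) -> List[Tuple[Site, Site, int]]:
    """Pair each site of factor A with each site of factor B lying within max_gap bp.

    The gap is the distance between the nearest edges of the two sites;
    overlapping sites have a gap of 0.
    """
    pairs = []
    for a in sites_a:
        for b in sites_b:
            # Negative when the intervals overlap, so clamp at zero.
            gap = max(max(a[0], b[0]) - min(a[1], b[1]), 0)
            if gap <= max_gap:
                pairs.append((a, b, gap))
    return pairs


# Hypothetical predicted sites for two distinct factors.
pairs = candidate_pairs([(10, 20)], [(15, 25), (25, 35), (200, 210)])
```

Pairs found this way are only candidates: the compilation stresses that composite elements are defined by experimentally demonstrated cooperation between factors, which proximity alone cannot establish.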
Computational analysis of master regulators through the search for transcription factor binding sites followed by analysis of signal transduction networks of a cell is a new approach of causal analysis of multi-omics data.
This paper contains results of the analysis of multi-omics data, including transcriptomics, proteomics and epigenomics data, of a methotrexate (MTX)-resistant colon cancer cell line. The data were used for analysis of mechanisms of resistance and for prediction of potential drug targets and promising compounds for reverting the MTX resistance of these cancer cells. We present all results of the analysis, including the lists of identified transcription factors and their binding sites in the genome and the list of predicted master regulators – potential drug targets.
These data were generated in the study recently published in the article “Multi-omics “Upstream Analysis” of regulatory genomic regions helps identifying targets against methotrexate resistance of colon cancer” (Kel et al., 2016).
These data are of interest for researchers from the field of multi-omics data analysis and for biologists who are interested in identification of novel drug targets against MTX resistance.
Amniotic fluid embolism (AFE) is a catastrophic consequence of labor and delivery that often results in maternal and neonatal death. These poor outcomes are related largely to the rarity of the event in a population overwhelmingly biased by overall good health. Despite the presence of national AFE registries, there are no published algorithmic approaches to its management, to our knowledge. The purpose of this article is to share a care pathway developed by a multidisciplinary group at a community teaching hospital. Post hoc analysis of a complicated case of AFE resulted in development of this pathway, which addresses many of the major consequences of AFE. We offer this algorithm as a template for use by any institution willing to implement a clinical pathway to treat AFE. It is accompanied by the remarkable case outcome that prompted its development.
GTRD—Gene Transcription Regulation Database (http://gtrd.biouml.org)—is a database of transcription factor binding sites (TFBSs) identified by ChIP-seq experiments for human and mouse. Raw ChIP-seq data were obtained from ENCODE and SRA and uniformly processed: (i) reads were aligned using Bowtie2; (ii) ChIP-seq peaks were called using peak callers MACS, SISSRs, GEM and PICS; (iii) peaks for the same factor and peak callers, but different experiment conditions (cell line, treatment, etc.), were merged into clusters; (iv) such clusters for different peak callers were merged into metaclusters that were considered as non-redundant sets of TFBSs. In addition to information on location in genome, the sets contain structured information about cell lines and experimental conditions extracted from descriptions of corresponding ChIP-seq experiments. A web interface to access GTRD was developed using the BioUML platform. It provides: (i) browsing and displaying information; (ii) advanced search possibilities, e.g. search of TFBSs near the specified gene or search of all genes potentially regulated by a specified transcription factor; (iii) integrated genome browser that provides visualization of the GTRD data: read alignments, peaks, clusters, metaclusters and information about gene structures from the Ensembl database and binding sites predicted using position weight matrices from the HOCOMOCO database.
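Step (iii) above, merging peaks from different experiments into clusters, can be sketched as an interval merge. The abstract does not state the merging rule, so this example assumes that peaks on the same chromosome are merged whenever they overlap; the function name and the input tuples are hypothetical and this is not GTRD's pipeline code.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end)


def merge_peaks(peaks: Iterable[Peak]) -> List[Peak]:
    """Merge overlapping peaks per chromosome into clusters.

    Assumption: two peaks belong to the same cluster when their
    intervals overlap or touch (start <= current cluster end).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))

    clusters: List[Peak] = []
    for chrom in sorted(by_chrom):
        intervals = sorted(by_chrom[chrom])
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:
                # Overlaps the running cluster: extend it.
                cur_end = max(cur_end, end)
            else:
                # Gap found: close the running cluster and start a new one.
                clusters.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        clusters.append((chrom, cur_start, cur_end))
    return clusters


# Hypothetical peaks for one factor and one peak caller across experiments.
clusters = merge_peaks([
    ("chr1", 150, 250),
    ("chr1", 100, 200),
    ("chr1", 400, 500),
    ("chr2", 10, 20),
])
```

Under the same assumption, the function could be applied a second time to the clusters produced by each peak caller to obtain something analogous to the metaclusters of step (iv).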
Although IL-10 promotes a regulatory phenotype of CD11c+ dendritic cells and macrophages in vitro, the role of IL-10 signaling in CD11c+ cells to maintain intestinal tolerance in vivo remains elusive. To this aim, we generated mice with a CD11c-specific deletion of the IL-10 receptor alpha (Cd11ccreIl10rafl/fl). In contrast to the colon, the small intestine of Cd11ccreIl10rafl/fl mice exhibited spontaneous crypt hyperplasia, increased numbers of intraepithelial lymphocytes and lamina propria T cells, associated with elevated levels of T cell-derived IFNγ and IL-17A. Whereas naive mucosal T-cell priming was not affected and oral tolerance to ovalbumin was intact, augmented T-cell function in the lamina propria was associated with elevated numbers of locally dividing T cells, expression of T-cell attracting chemokines and reduced T-cell apoptosis. Upon stimulation, intestinal IL-10Rα deficient CD11c+ cells exhibited increased activation associated with enhanced IL-6 and TNFα production. Following colonization with Helicobacter hepaticus Cd11ccreIl10rafl/fl mice developed severe large intestinal inflammation characterized by infiltrating T cells and increased levels of Il17a, Ifng, and Il12p40. Altogether these findings demonstrate a critical role of IL-10 signaling in CD11c+ cells to control small intestinal immune homeostasis by limiting reactivation of local memory T cells and to protect against Helicobacter hepaticus-induced colitis.
CD11c+ myeloid cells; dendritic cells; interleukin 10; small intestine; celiac disease
A strategy is presented that allows a causal analysis of co-expressed genes, which may be subject to common regulatory influences. A state-of-the-art promoter analysis for potential transcription factor (TF) binding sites in combination with a knowledge-based analysis of the upstream pathways that control the activity of these TFs is shown to lead to hypothetical master regulators. This strategy was implemented as a workflow in a comprehensive bioinformatic software platform. We applied this workflow to gene sets that were identified by a novel triclustering algorithm in naphthalene-induced gene expression signatures of murine liver and lung tissue. As a result, tissue-specific master regulators were identified that are known to be linked with tumorigenic and apoptotic processes. To our knowledge, this is the first time that genes of expression triclusters were used to identify upstream regulators.
microarray data; gene expression signatures; upstream analysis; promoter analysis; pathway analysis
To identify medically relevant aspects of blood pressure dysregulation (BPD) related to quality of life in individuals with spinal cord injury (SCI), and to propose an integrated conceptual framework based on input from both individuals with SCI and their clinical providers. This framework will serve as a guide for the development of a patient-reported outcome (PRO) measure specifically related to BPD.
Three focus groups with individuals with SCI and 3 groups with SCI providers were analyzed using grounded-theory based qualitative analysis to ascertain how blood pressure impacts health-related quality of life (HRQOL) in individuals with SCI.
Focus groups were conducted at 2 Veterans Affairs medical centers and a research center.
Individuals with SCI (n=27) in 3 focus groups and clinical providers (n=25) in 3 focus groups.
Main Outcome Measures
Qualitative analysis indicated that all focus groups spent the highest percentage of time discussing symptoms of BPD (39%), followed by precipitators/causes of BPD (16%), preventative actions (15%), corrective actions (12%), and the impact that BPD has on social or emotional functioning (8%). While patient/consumer focus groups and provider focus groups raised similar issues, providers spent more time discussing precipitators/causes of BPD and preventative actions (38%) than patient/consumer groups (24%).
These results suggest that BPD uniquely and adversely impacts HRQOL in persons with SCI. While both individuals with SCI and their providers highlighted the relevant symptoms of BPD, the SCI providers offered additional detailed information regarding the precipitators/causes and what can be done to prevent/treat BPD. Further, the results suggest that persons with SCI are aware of how BPD impacts their HRQOL and are able to distinguish between subtle signs and symptoms. These findings exemplify the need for a validated and sensitive clinical measurement tool that can assess the extent to which BPD impacts HRQOL in patients with SCI.
Blood pressure; Outcome assessment (health care); Quality of life; Rehabilitation; Spinal cord injuries
Biological processes are fundamentally driven by complex interactions between biomolecules. Integrated high-throughput omics studies enable multifaceted views of cells, organisms, or their communities. With the advent of new post-genomics technologies, omics studies are becoming increasingly prevalent; yet the full impact of these studies can only be realized through data harmonization, sharing, meta-analysis, and integrated research. These essential steps require consistent generation, capture, and distribution of metadata. To ensure transparency, facilitate data harmonization, and maximize reproducibility and usability of life sciences studies, we propose a simple common omics metadata checklist. The proposed checklist is built on the rich ontologies and standards already in use by the life sciences community. The checklist will serve as a common denominator to guide experimental design, capture important parameters, and be used as a standard format for stand-alone data publications. The omics metadata checklist and data publications will create efficient linkages between omics data and knowledge-based life sciences innovation and, importantly, allow for appropriate attribution to data generators and infrastructure science builders in the post-genomics era. We ask that the life sciences community test the proposed omics metadata checklist and data publications and provide feedback for their use and improvement.
Clinically, gentamicin has been used extensively to treat the debilitating symptoms of Ménière’s disease and is well known for its vestibulotoxic properties. Until recently, it was widely accepted that the round window membrane (RWM) was the primary entry route into the inner ear following intratympanic drug administration. In the current study, gentamicin was delivered to either the RWM or the stapes footplate of guinea pigs (GPs) to assess the hearing loss and histopathology associated with each procedure. Vestibulotoxicity of the utricular macula, saccular macula, and crista ampullaris in the posterior semicircular canal was assessed quantitatively with density counts of hair cells, supporting cells, and stereocilia in histological sections. Cochleotoxicity was assessed quantitatively by changes in threshold of auditory brainstem responses (ABR), along with hair cell and spiral ganglion cell counts in the basal and second turns of the cochlea. Animals receiving gentamicin applied to the stapes footplate exhibited markedly higher levels of hearing loss between 8 and 32 kHz, a greater reduction of outer hair cells in the basal turn of the cochlea and fewer normal type I cells in the utricle than those receiving gentamicin on the RWM or saline controls. This suggests that gentamicin more readily enters the ear when applied to the stapes footplate compared with RWM application. These data provide a potential explanation for why gentamicin preferentially ablates vestibular function while preserving hearing following transtympanic administration in humans.
Inner ear drug delivery; gentamicin; pharmacokinetics; oval window; stapes; stapediovestibular joint; annular ligament
Rescue of the p53 tumor suppressor is an attractive cancer therapy approach. However, pharmacologically activated p53 can induce diverse responses ranging from cell death to growth arrest and DNA repair, which limits the efficient application of p53-reactivating drugs in the clinic. Elucidation of the molecular mechanisms defining the biological outcome upon p53 activation remains a grand challenge in the p53 field. Here, we report that concurrent pharmacological activation of p53 and inhibition of thioredoxin reductase, followed by generation of reactive oxygen species (ROS), results in synthetic lethality in cancer cells. ROS promote the activation of c-Jun N-terminal kinase (JNK) and the DNA damage response, which establishes a positive feedback loop with p53. This converts the p53-induced growth arrest/senescence to apoptosis. We identified several survival oncogenes inhibited by p53 in a JNK-dependent manner, including Mcl1, PI3K and eIF4E, as well as the p53 inhibitors Wip1 and MdmX. Further, we show that Wip1 is one of the crucial executors downstream of JNK, whose ablation confers an enhanced and sustained p53 transcriptional response contributing to cell death. Our study provides novel insights for manipulating the p53 response in a controlled way. Further, our results may enable a new pharmacological strategy that exploits abnormally high ROS levels, often linked with higher aggressiveness in cancer, to selectively kill cancer cells upon pharmacological reactivation of p53.
TrxR; ROS; JNK; p53; Wip1; inhibition of oncogenes