Related Articles

1.  Dynamics of hepatic gene expression profile in a rat cecal ligation and puncture model 
The Journal of Surgical Research  2011;176(2):583-600.
Sepsis remains a major clinical challenge in intensive care units. The difficulty in developing new and more effective treatments for sepsis exemplifies our incomplete understanding of its underlying pathophysiology. One of the more widely used rodent models for studying polymicrobial sepsis is cecal ligation and puncture (CLP). While a number of CLP studies have investigated the ensuing systemic inflammatory response, they usually focus on a single time point post-CLP and therefore fail to describe the dynamics of the response. Furthermore, previous studies mostly use surgery without infection (herein referred to as sham CLP, SCLP) as a control for the CLP model; however, SCLP represents an aseptic injurious event that also stimulates a systemic inflammatory response. Thus, there is a need to better understand the dynamics and expression patterns of both injury- and sepsis-induced gene expression alterations to identify potential regulatory targets. To this end, we characterized the response of the liver within the first 24 h in a rat model of SCLP and CLP using a time series of microarray gene expression data.
Rats were randomly divided into three groups: sham, SCLP, and CLP. Rats in the CLP group were subjected to laparotomy, cecal ligation, and puncture, while those in the SCLP group were subjected to the same procedures without cecal ligation and puncture. Animals were saline resuscitated and sacrificed at defined time points (0, 2, 4, 8, 16, and 24 h). Liver tissues were explanted and analyzed for their gene expression profiles using microarray technology. Unoperated animals (sham) served as negative controls. After identifying differentially expressed probesets between sham and SCLP or CLP conditions over time, the concatenated data sets corresponding to these differentially expressed probesets in sham and SCLP or CLP groups were combined and analyzed using a “consensus clustering” approach. Promoters of genes that share common characteristics were extracted and compared with gene batteries comprised of co-expressed genes in order to identify putative transcription factors that could be responsible for the co-regulation of those genes.
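The abstract does not describe how the “consensus clustering” step works internally. As a rough illustration of the general idea only, the sketch below builds a co-association matrix: the fraction of clustering runs in which two probesets land in the same cluster. The function name `coassociation` and the input layout (one list of cluster labels per run) are assumptions made for this example, not the authors' implementation.

```python
def coassociation(labelings):
    """Return an n x n matrix whose entry (i, j) is the fraction of
    clustering runs that put items i and j in the same cluster.

    `labelings` is a list of runs; each run is a list of n cluster labels.
    (Hypothetical input layout, chosen for illustration.)
    """
    if not labelings:
        raise ValueError("need at least one clustering run")
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        if len(labels) != n:
            raise ValueError("all runs must label the same items")
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    runs = len(labelings)
    return [[c / runs for c in row] for row in counts]


# Two runs over three probesets: items 0 and 1 agree in only one run.
matrix = coassociation([[0, 0, 1], [0, 1, 1]])
```

Pairs with a high co-association value are grouped consistently across runs, so a final clustering can be derived from this matrix, for example by thresholding it.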
The SCLP/CLP genes whose expression patterns significantly changed compared to sham over time were identified, clustered, and finally analyzed for pathway enrichment. Our results indicate that both CLP and SCLP triggered the activation of a pro-inflammatory response, enhanced synthesis of acute-phase proteins, increased metabolism and tissue damage markers. Genes triggered by CLP which can be directly linked to bacteria removal functions were absent in SCLP injury. In addition, genes relevant to oxidative stress induced damage were unique to CLP injury, which may be due to the increased severity of CLP injury vs. SCLP injury. Pathway enrichment identified pathways with similar functionality but different dynamics in the two injury models, indicating that the functions controlled by those pathways are under the influence of different transcription factors and regulatory mechanisms. Putatively identified transcription factors, notably including CREB, NF-KB and STAT, were obtained through analysis of the promoter regions in the SCLP/CLP genes. Our results show that while transcription factors such as NF-KB, HOMF, and GATA were common in both injuries for the IL-6 signaling pathway, there were many other transcription factors associated with that pathway which were unique to CLP, including FKHD, HESF and IRFF. There were 17 transcription factors that were identified as important in at least 2 pathways in the CLP injury, but only 7 transcription factors with that property in the SCLP injury. This also supports the hypothesis of unique regulatory modules that govern the pathways present in both the CLP and SCLP response.
By using microarrays to assess multiple genes in a high-throughput manner, we demonstrate that an inflammatory response involving different dynamics and different genes is triggered by SCLP and CLP. From our analysis of the CLP data, the key characteristics of sepsis are a pro-inflammatory response that drives hypermetabolism, immune cell activation, and damage from oxidative stress. This contrasts with SCLP, which triggers a modified inflammatory response leading to no immune cell activation, decreased detoxification potential, and hypermetabolism. Many of the identified transcription factors that drive the CLP-induced response are not found in the SCLP group, suggesting that SCLP and CLP induce different types of inflammatory responses via different regulatory pathways.
PMCID: PMC3368040  PMID: 22381171
sepsis; trauma; gene expression; transcription factor; microarray; inflammation; liver
2.  Tissue-Specific Gene Expression and Regulation in Liver and Muscle Following Chronic Corticosteroid Administration 
Although corticosteroids (CSs) affect gene expression in multiple tissues, the array of genes that are regulated by these catabolic steroids is diverse, highly tissue specific, and depends on their functions in the tissue. Liver has many important functions in performing and regulating diverse metabolic processes. Muscle, in addition to its mechanical role, is critical in maintaining systemic energy homeostasis and accounts for about 80% of insulin-directed glucose disposal. Consequently, a better understanding of CS pharmacogenomic effects in these tissues would provide valuable information regarding the tissue-specificity of transcriptional dynamics, and would provide insights into the underlying molecular mechanisms of action for both beneficial and detrimental effects.
We performed an integrated analysis of transcriptional data from liver and muscle in response to methylprednisolone (MPL) infusion, which included clustering and functional annotation of clustered gene groups, promoter extraction and putative transcription factor (TF) identification, and finally, regulatory closeness (RC) identification.
This analysis allowed the identification of critical transcriptional responses and CS-responsive functions in liver and muscle during chronic MPL administration, the prediction of putative transcriptional regulators relevant to transcriptional responses of CS-affected genes which are also potential secondary bio-signals altering expression levels of target-genes, and the exploration of the tissue-specificity and biological significance of gene expression patterns, CS-responsive functions, and transcriptional regulation.
The analysis provided an integrated description of the genomic and functional effects of chronic MPL infusion in liver and muscle.
PMCID: PMC3956809  PMID: 24653645
liver; muscle; glucocorticoids; corticosteroids; gene expression; gene regulation; promoter analysis
3.  Genome-wide transcriptional plasticity underlies cellular adaptation to novel challenge 
By recruiting the essential HIS3 gene to the GAL regulatory system and switching to a repressing glucose medium, we confronted yeast cells with a novel challenge they had not encountered before in their evolutionary history. Adaptation to this challenge involved a global transcriptional response of a sizeable fraction of the genome, which relaxed on the time scale of the population adaptation, of the order of 10 generations. For a large fraction of the responding genes there is no simple biological interpretation connecting them to the specific cellular demands imposed by the novel challenge. Strikingly, repeating the experiment did not reproduce similar transcription patterns, either in the transient phase or in the adapted state in glucose. These results suggest that physiological selection operates on the new metabolic configurations generated by the non-specific large-scale transcriptional response to eventually stabilize an adaptive state.
Cells adjust their transcriptional state to accommodate environmental and genetic perturbations. Some common perturbations, such as changes in nutrient composition, elicit well-characterized transcriptional responses that can be understood by simple engineering-like design principles as satisfying specific demands imposed by the perturbation. However, cells also have the ability to adapt to novel and unforeseen challenges. This ability is central in realizing the evolvability potential of cells as they respond to dramatic genetic or environmental changes along evolution. Little is known about the mechanisms underlying such adaptations to novel challenges; in particular, the role of the transcriptional regulatory network in such adaptations has not been characterized. Genome-wide measurements have revealed that, in many cases, perturbations lead to a global transcriptional response involving a sizeable fraction of the genome (Gasch et al, 2000; Jelinsky et al, 2000; Causton et al, 2001; Ideker et al, 2001; Lai et al, 2005). Such global behavior suggests that general collective properties of the genetic network, rather than specific pre-designed pathways, determine an important part of the transcriptional response. It is not known however what fraction of genes within such massive transcriptional responses is essential to the specific cellular demands. It is also unknown whether the non-pre-designed part of the response can have a functional role in adaptation to novel challenges.
To study these questions, we confronted yeast cells with a novel challenge they had not encountered before along their history in evolution. A strain of the yeast Saccharomyces cerevisiae was engineered to recruit the gene HIS3, an essential enzyme from the histidine biosynthesis pathway (Hinnebusch, 1992), to the GAL regulatory system, responsible for galactose utilization (Stolovicki et al, 2006). The GAL system is known to be strongly repressed when the cells are exposed to glucose. Therefore, upon switching to a medium containing glucose and lacking histidine, the GAL system and with it HIS3 are highly repressed immediately following the switch and the cells encounter a severe challenge. We have recently shown that a cell population carrying this rewired genome can adapt to grow competitively in a chemostat in a medium containing pure glucose (Stolovicki et al, 2006). This adaptation occurred on a timescale of ∼10 generations; applying a stronger environmental pressure in the form of a competitive inhibitor to HIS3 (3AT) resulted in a similar adaptation albeit with a longer timescale. Figure 1 shows the dynamics of the population's cell density (blue lines, measured by OD) following a medium switch from galactose to glucose in the chemostat without (A) and with (B) 3AT. The experiments revealed that adaptation occurs on physiological timescales (much shorter than required by spontaneous random mutations), but the mechanisms underlying this adaptation have remained unclear (Stolovicki et al, 2006).
Yeast cells had not encountered recruitment of HIS3 to the GAL system along their evolutionary history, and their genome could not possibly have been selected to specifically address glucose repression of HIS3. This experiment, therefore, provides a unique opportunity to characterize the spontaneous transcriptional response during adaptation to a novel challenge and to assess the functional role of the regulatory system in this adaptation. We used DNA microarrays to measure the genome-wide expression levels at time points along the adaptation process, with and without 3AT. These measurements revealed that a sizeable fraction of the genome responded by induction or repression to the switch into glucose. Superimposed on the OD traces, Figure 1 shows the results of a clustering analysis of the expression of genes as measured by the arrays along time in the experiments. This analysis revealed two dominant clusters, each containing hundreds of genes in each experiment, which responded to the medium switch to glucose by a strong transient induction or repression followed by relaxation to steady state on the timescale of the adaptation process, ∼ 10 generations. The two clusters in each experiment show similar but opposite dynamics.
A detailed analysis of the gene content in the two clusters revealed that only a small portion of the response was induced by a change in carbon source (15% overlap between the corresponding clusters in the two experiments, with and without 3AT). Moreover, it revealed a very low overlap with the universal stress response observed for a wide range of environmental stresses (Gasch et al, 2000; Causton et al, 2001) and with the typical response to amino-acid starvation (Natarajan et al, 2001). Additionally, all known specific responses to stress in the literature are characterized by transient induction or repression with relaxation to steady state within a generation time (Gasch et al, 2000; Koerkamp et al, 2002; Wu et al, 2004), whereas in our experiments relaxation of the transcriptional response occurs over many generations. Taken together, these results show that the transcriptional response observed here is neither a metabolic response to the change in carbon source nor is it a standard response to stress or amino-acid starvation. This raises the possibility that it is a spontaneous collective response that is largely composed of genes that do not have a specific function. This possibility was tested directly by repeating the experiment with different populations and comparing their responses. This procedure revealed reproducible adaptation dynamics and steady states in terms of population density, but showed significantly different transcriptional transient responses and steady states for the two repeated experiments. Thus, a significant portion of the genes that changed their expression during the adaptation process do not have a well-defined and reproducible function in the challenging environment.
The application of a stronger environmental pressure in the form of 3AT had a dramatic effect on the global characteristics of the transcriptional response: it induced a markedly higher correlation among the hundreds of responding genes. Figure 3A compares the array data in color code for the two experiments. It is seen that the emergent pattern of transcription exhibits a higher degree of order by the introduction of high external pressure. Observation of the transcriptional patterns for specific metabolic pathways illustrates the different contributions to the correlated dynamics (Figure 3B–D). A general energetic module such as glycolysis exhibited similar patterns of induction and relaxation in experiments with and without 3AT (Figure 3B). However, in general, we found that more than one-third of the known metabolic modules (30 out of 88 modules described in KEGG) exhibited high expression correlation among their genes when the environmental pressure was high but not when it was low. As an example, Figure 3C shows the histidine biosynthesis pathway and Figure 3D the purine pathway. Note the highly ordered trajectories in the lower panels (with 3AT) compared to the disordered ones in the upper panels (no 3AT). This order extends also between genes belonging to different and even distant metabolic modules. It indicates that a global transcriptional regulatory mechanism is in operation, rather than a local specific one. Surprisingly, genes belonging to the same metabolic pathway exhibited simultaneous positively and negatively correlated dynamics. Thus, an important conclusion of this work is that the global transcriptional response to a novel challenge cannot be explained by a simple cellular or metabolic logic. This is to be expected if the response had not been specifically selected in evolution and was not pre-designed for the challenge.
Our data clearly reveal that the massive transcriptional response underlies the adaptation process to a novel challenge. The novelty of the challenge presented to the cells excludes the possibility that this response has been specifically selected toward this challenge. Thus, transcriptional regulation has dynamic properties resulting in a general massive nonspecific response to a novel perturbation. Such a response in turn allows for metabolic rearrangements, which by feeding back on transcription lead to adaptation of the cells to the unforeseen situation. The drastic change in the expression state of the cell opens multiple new metabolic pathways. Physiological selection works then on these multiple metabolic pathways to stabilize an adaptive state that causes relaxation of the perturbed expression pattern. This scenario, involving the creation of a library of possibilities and physiological selection over this library, is compatible with our understanding of a broad class of biological systems, placing the cellular metabolic/regulatory networks on the same footing as the neural or the immune systems (Gerhart and Kirschner, 1997).
Cells adjust their transcriptional state to accommodate environmental and genetic perturbations. An open question is to what extent transcriptional response to perturbations has been specifically selected along evolution. To test the possibility that transcriptional reprogramming does not need to be ‘pre-designed’ to lead to an adaptive metabolic state on physiological timescales, we confronted yeast cells with a novel challenge they had not previously encountered. We rewired the genome by recruiting an essential gene, HIS3, from the histidine biosynthesis pathway to a foreign regulatory system, the GAL network responsible for galactose utilization. Switching medium to glucose in a chemostat caused repression of the essential gene and presented the cells with a severe challenge to which they adapted over approximately 10 generations. Using genome-wide expression arrays, we show here that a global transcriptional reprogramming (>1200 genes) underlies the adaptation. A large fraction of the responding genes is nonreproducible in repeated experiments. These results show that a nonspecific transcriptional response reflecting the natural plasticity of the regulatory network supports adaptation of cells to novel challenges.
PMCID: PMC1865588  PMID: 17453047
adaptation; cellular metabolism; expression arrays; plasticity; transcriptional response
4.  Gene arrays and temporal patterns of drug response: corticosteroid effects on rat liver 
It was hypothesized that expression profiling using gene arrays can be used to distinguish temporal patterns of changes in gene expression in response to a drug in vivo, and that these patterns can be used to identify groups of genes regulated by common mechanisms. A corticosteroid, methylprednisolone (MPL), was administered intravenously to a group of 47 rats (Rattus rattus) that were sacrificed at 17 timepoints over 72 h after MPL administration. Plasma drug concentrations and hepatic glucocorticoid receptors were measured from each animal. In addition, RNAs prepared from individual livers were used to query Affymetrix genechips for mRNA expression patterns. Statistical analyses using Affymetrix and GeneSpring software were applied to the results. Cluster analysis revealed six major temporal patterns containing 196 corticosteroid-responsive probe sets representing 153 different genes. Four clusters showed increased expression with differences in lag-time, onset rate, and/or duration of transcriptional effect. A fifth cluster showed rapid reduction persisting for 18 h. The final cluster identified showed decreased expression followed by an extended period of increased expression. These results lend new insights into the diverse hepatic genes involved in the physiologic, therapeutic, and adverse effects of corticosteroids and suggest that a limited array of control processes account for the dynamics of their pharmacogenomic effects.
PMCID: PMC4207265  PMID: 12928814
Corticosteroids; Glucocorticoids; Expression profiling; Cluster analysis
5.  Incorporating Motif Analysis into Gene Co-expression Networks Reveals Novel Modular Expression Pattern and New Signaling Pathways 
PLoS Genetics  2013;9(10):e1003840.
Understanding of gene regulatory networks requires discovery of expression modules within gene co-expression networks and identification of promoter motifs and corresponding transcription factors that regulate their expression. A commonly used method for this purpose is a top-down approach based on clustering the network into a range of densely connected segments, treating these segments as expression modules, and extracting promoter motifs from these modules. Here, we describe a novel bottom-up approach to identify gene expression modules driven by known cis-regulatory motifs in the gene promoters. For a specific motif, genes in the co-expression network are ranked according to their probability of belonging to an expression module regulated by that motif. The ranking is conducted via motif enrichment or motif position bias analysis. Our results indicate that motif position bias analysis is an effective tool for genome-wide motif analysis. Sub-networks containing the top ranked genes are extracted and analyzed for inherent gene expression modules. This approach identified novel expression modules for the G-box, W-box, site II, and MYB motifs from an Arabidopsis thaliana gene co-expression network based on the graphical Gaussian model. The novel expression modules include those involved in house-keeping functions, primary and secondary metabolism, and abiotic and biotic stress responses. In addition to confirmation of previously described modules, we identified modules that include new signaling pathways. To associate transcription factors that regulate genes in these co-expression modules, we developed a novel reporter system. Using this approach, we evaluated MYB transcription factor-promoter interactions within MYB motif modules.
Author Summary
Gene co-expression networks unite genes with similar expression patterns. From these networks, gene co-expression modules can be identified. A specific family of transcription factors may regulate the genes within a co-expression module. Thus, module identification is important for deciphering the gene regulatory network. Previously, module identification relied on clustering the gene network into gene clusters that were then treated as modules. This represents a top-down approach. Here, we introduce a reverse approach that aims to identify gene co-expression modules regulated by known promoter motifs. For a given promoter motif, we calculated the probability that each gene within the network belongs to a module regulated by that motif, via motif enrichment analysis or motif position bias analysis. A sub-network containing the genes with a high probability of belonging to a motif-driven module was then extracted from the gene co-expression network. From this sub-network, the modular structure can be identified via visual inspection. Our bottom-up approach recovered many known and novel modules for the G-box, MYB, W-box, and site II element motifs, whose expression may be regulated by the transcription factors that bind to these motifs. Additionally, we developed a rapid transcription factor-promoter interaction screening system to validate predicted interactions.
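The summary above does not give the statistic behind its motif enrichment analysis. One common way to score enrichment, shown here purely as an illustrative sketch, is a hypergeometric tail probability: given a genome of `total_genes` genes of which `motif_genes` carry the motif, how likely is it that a group of `group_size` genes (for instance a gene plus its network neighbours) contains at least `hits` motif-bearing genes by chance? All names and the choice of test are assumptions for this example.

```python
from math import comb


def motif_enrichment_pvalue(hits, total_genes, motif_genes, group_size):
    """Hypergeometric upper tail P(X >= hits).

    X counts motif-bearing genes in a random draw of `group_size` genes
    from `total_genes`, of which `motif_genes` carry the motif.
    (Illustrative statistic; not necessarily the one used in the paper.)
    """
    if not 0 <= group_size <= total_genes:
        raise ValueError("group_size must be between 0 and total_genes")
    if not 0 <= motif_genes <= total_genes:
        raise ValueError("motif_genes must be between 0 and total_genes")
    denominator = comb(total_genes, group_size)
    upper = min(motif_genes, group_size)
    # math.comb returns 0 when k > n, so impossible draws contribute nothing.
    numerator = sum(
        comb(motif_genes, i) * comb(total_genes - motif_genes, group_size - i)
        for i in range(max(hits, 0), upper + 1)
    )
    return numerator / denominator


# A group of 3 genes that all carry a motif found in 3 of 10 genes overall.
p = motif_enrichment_pvalue(hits=3, total_genes=10, motif_genes=3, group_size=3)
```

Ranking genes by ascending p-value would then put the genes whose neighbourhoods are most enriched for the motif at the top, which is the kind of ranking the summary describes.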
PMCID: PMC3789834  PMID: 24098147
6.  Network motif-based identification of transcription factor-target gene relationships by integrating multi-source biological data 
BMC Bioinformatics  2008;9:203.
Integrating data from multiple global assays and curated databases is essential to understand the spatio-temporal interactions within cells. Different experiments measure cellular processes at various widths and depths, while databases contain biological information based on established facts or published data. Integrating these complementary datasets helps infer a mutually consistent transcriptional regulatory network (TRN) with strong similarity to the structure of the underlying genetic regulatory modules. Decomposing the TRN into a small set of recurring regulatory patterns, called network motifs (NM), facilitates the inference. Identifying NMs defined by specific transcription factors (TF) establishes the framework structure of a TRN and allows the inference of TF-target gene relationship. This paper introduces a computational framework for utilizing data from multiple sources to infer TF-target gene relationships on the basis of NMs. The data include time course gene expression profiles, genome-wide location analysis data, binding sequence data, and gene ontology (GO) information.
The proposed computational framework was tested using gene expression data associated with cell cycle progression in yeast. Among 800 cell cycle related genes, 85 were identified as candidate TFs and classified into four previously defined NMs. The NMs for a subset of TFs were obtained from the literature. Support vector machine (SVM) classifiers were used to estimate NMs for the remaining TFs. The potential downstream target genes for the TFs were clustered into 34 biologically significant groups. The relationships between TFs and potential target gene clusters were examined by training recurrent neural networks whose topologies mimic the NMs to which the TFs are classified. The identified relationships between TFs and gene clusters were evaluated using the following biological validation and statistical analyses: (1) gene set enrichment analysis (GSEA) to evaluate the clustering results; (2) leave-one-out cross-validation (LOOCV) to ensure that the SVM classifiers assign TFs to NM categories with high confidence; (3) binding site enrichment analysis (BSEA) to determine enrichment of the gene clusters for the cognate binding sites of their predicted TFs; (4) comparison with previously reported results in the literature to confirm the inferred regulations.
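To make the leave-one-out cross-validation (LOOCV) step concrete, the sketch below holds out each sample in turn, trains on the rest, and reports the fraction of held-out samples predicted correctly. To stay dependency-free it swaps in a simple nearest-centroid classifier in place of the SVM used in the study; the feature vectors, labels, and function names are all hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Predict the label whose class centroid is closest to x.

    Stand-in classifier for illustration; the study itself used SVMs.
    """
    sums, counts = {}, {}
    for vec, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for d, value in enumerate(vec):
            acc[d] += value
        counts[label] = counts.get(label, 0) + 1

    def squared_distance(label):
        return sum(
            (s / counts[label] - xi) ** 2 for s, xi in zip(sums[label], x)
        )

    return min(sums, key=squared_distance)


def loocv_accuracy(xs, ys):
    """Leave each sample out once, train on the rest, and score the hit rate."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            hits += 1
    return hits / len(xs)


# Toy data: two well-separated classes in one dimension.
features = [[0.0], [0.1], [1.0], [1.1]]
labels = ["motif_a", "motif_a", "motif_b", "motif_b"]
accuracy = loocv_accuracy(features, labels)
```

LOOCV suits small labelled sets such as a few dozen TFs with literature-derived NM labels, because every labelled sample is used for both training and testing.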
The major contribution of this study is the development of a computational framework to assist the inference of TRN by integrating heterogeneous data from multiple sources and by decomposing a TRN into NM-based modules. The inference capability of the proposed framework is verified statistically (e.g., LOOCV) and biologically (e.g., GSEA, BSEA, and literature validation). The proposed framework is useful for inferring small NM-based modules of TF-target gene relationships that can serve as a basis for generating new testable hypotheses.
PMCID: PMC2386822  PMID: 18426580
7.  Systematic image-driven analysis of the spatial Drosophila embryonic expression landscape 
We created an innovative virtual representation for our large-scale Drosophila in situ expression dataset. We aligned an elliptically shaped mesh comprised of small triangular regions to the outline of each embryo. Each triangle defines a unique location in the embryo, and comparing corresponding triangles allows easy identification of similar expression patterns. The virtual representation was used to organize the expression landscape at stages 4–6. We identified regions with similar expression in the embryo and clustered genes with similar expression patterns. We created algorithms to mine the dataset for adjacent non-overlapping patterns and anti-correlated patterns. We were able to mine the dataset to identify co-expressed and putative interacting genes. Using co-expression, we were able to assign putative functions to unknown genes.
Analyzing both temporal and spatial gene expression is essential for understanding development and regulatory networks of multicellular organisms. Interacting genes are commonly expressed in overlapping or adjacent domains. Thus, gene expression patterns can be used to assign putative gene functions and mined to infer candidates for networks.
We have generated a systematic two-dimensional mRNA expression atlas profiling embryonic development of Drosophila melanogaster (Tomancak et al, 2002, 2007). To date, we have collected over 70 000 images for over 6000 genes. To explore spatial relationships between gene expression patterns, we used a novel computational image-processing approach that converts expression patterns from the images into virtual representations (Figure 1). Using a custom-designed automated pipeline, for each image, we segmented and aligned the outline of the embryo to an elliptically shaped mesh comprised of 311 small triangular regions, each defining a unique location within the embryo. By comparing corresponding triangles, we produced a distance score to identify similar patterns. We generated those triangulated images (TIs) for our entire data set at all developmental stages and demonstrated that this representation can be used as an objective, computationally defined description of expression in in situ hybridization images from various sources, including images from the literature.
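The text does not define the distance score computed from corresponding triangles. As a minimal sketch under that caveat, the example below treats each TI as a list of per-triangle expression intensities (311 values in the study's mesh) and uses the mean absolute difference between corresponding triangles as the distance. The representation, the metric, and the name `ti_distance` are assumptions made for illustration.

```python
def ti_distance(ti_a, ti_b):
    """Mean absolute difference between corresponding triangles.

    Each TI is a list of per-triangle intensities in the same mesh order.
    (Assumed metric; the study's actual score is not specified here.)
    """
    if len(ti_a) != len(ti_b):
        raise ValueError("TIs must use the same mesh")
    if not ti_a:
        raise ValueError("TIs must contain at least one triangle")
    return sum(abs(a - b) for a, b in zip(ti_a, ti_b)) / len(ti_a)


# Toy 4-triangle meshes: an anterior stripe versus a shifted stripe.
stripe = [1.0, 1.0, 0.0, 0.0]
shifted = [0.0, 1.0, 1.0, 0.0]
score = ti_distance(stripe, shifted)
```

Because every embryo is mapped onto the same mesh, a per-triangle comparison like this needs no further image registration, which is what makes the TI representation convenient for clustering.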
We used the TIs to conduct a comprehensive analysis of the expression landscape. To this end, we created a novel approach to temporally sort and compact TIs to a non-redundant data set suitable for further computational processing. Although generally applicable for all developmental stages, for this study, we focused on developmental stages 4–6. For this stage range, we reduced the initial set of about 5800 TIs to 553 TIs containing 364 genes. Using this filtered data set, to discover how expression subdivides the embryo into regions, we clustered areas with similar expression and demonstrated that expression patterns divide the early embryo into distinct spatial regions resembling a fate map (Figure 3). To discover the range of unique expression patterns, we used affinity propagation clustering (Frey and Dueck, 2007) to group TIs with similar patterns and identified 39 clusters each representing a distinct pattern class. We integrated the remaining genes into the 39 clusters and studied the distribution of expression patterns and the relationships between the clusters.
The clustered expression patterns were used to identify putative positive and negative regulatory interactions. The similar TIs in each cluster grouped not only already known genes with related functions, but also previously undescribed genes. A comparative analysis identified subtle differences between the genes within each expression cluster. To investigate these differences, we developed a novel Markov Random Field (MRF) segmentation algorithm to extract patterns. We then extended the MRF algorithm to detect shared expression boundaries, generate similarity measurements, and discriminate even faint or uncertain patterns between two TIs. This enabled us to identify more subtle partial expression pattern overlaps and adjacent non-overlapping patterns. For example, by conducting this analysis on the cluster containing the gene snail, we identified the previously known huckebein, which restricts snail expression (Reuter and Leptin, 1994), and zfh1, which interacts with tinman (Broihier et al, 1998; Su et al, 1999).
By studying the functions of known genes, we assigned putative developmental roles to each of the 39 clusters. Of the 1800 genes investigated, only half of them had previously assigned functions.
Representing expression patterns with geometric meshes facilitates the analysis of a complex process involving thousands of genes. This approach is complementary to the cellular resolution 3D atlas for the Drosophila embryo (Fowlkes et al, 2008). Our method can be used as a rapid, fully automated, high-throughput approach to obtain a map of co-expression, which will serve to select specific genes for detailed multiplex in-situ hybridization and confocal analysis for a fine-grain atlas. Our data are similar to the data in the literature, and research groups studying reporter constructs, mutant animals, or orthologs can easily produce in situ hybridizations. TIs can be readily created and provide representations that are both comparable to each other and our data set. We have demonstrated that our approach can be used for predicting relationships in regulatory and developmental pathways.
Discovery of temporal and spatial patterns of gene expression is essential for understanding the regulatory networks and development in multicellular organisms. We analyzed the images from our large-scale spatial expression data set of early Drosophila embryonic development and present a comprehensive computational image analysis of the expression landscape. For this study, we created an innovative virtual representation of embryonic expression patterns using an elliptically shaped mesh grid that allows us to make quantitative comparisons of gene expression using a common frame of reference. Demonstrating the power of our approach, we used gene co-expression to identify distinct expression domains in the early embryo; the result is surprisingly similar to the fate map determined using laser ablation. We also used a clustering strategy to find genes with similar patterns and developed new analysis tools to detect variation within consensus patterns, adjacent non-overlapping patterns, and anti-correlated patterns. Of the 1800 genes investigated, only half had previously assigned functions. The known genes suggest developmental roles for the clusters, and identification of related patterns predicts requirements for co-occurring biological functions.
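The idea of comparing patterns on a common frame of reference can be illustrated with a minimal sketch. It assumes each expression pattern has already been reduced to a vector of intensities, one value per mesh element, in the same element order for every gene. Pearson correlation and the 0.8 threshold are assumptions chosen for illustration and are not the similarity measures of the study; the function names and toy vectors are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length intensity vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("patterns must share the same mesh and have at least 2 elements")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("a constant pattern has no defined correlation")
    return cov / sqrt(vx * vy)

def classify(x, y, threshold=0.8):
    """Label a pair of patterns using an assumed correlation threshold."""
    r = pearson(x, y)
    if r >= threshold:
        return "similar"
    if r <= -threshold:
        return "anti-correlated"
    return "unrelated or partially overlapping"

# Toy patterns on a hypothetical 4-element mesh (not real data).
anterior = [1.0, 1.0, 0.0, 0.0]
posterior = [0.0, 0.0, 1.0, 1.0]
print(classify(anterior, posterior))
```

Because every pattern lives on the same mesh, any pairwise measure of this kind can be computed without first aligning the underlying images.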
PMCID: PMC2824522  PMID: 20087342
biological function; embryo; gene expression; in situ hybridization; Markov Random Field
8.  Computational Identification of Transcriptional Regulators in Human Endotoxemia 
PLoS ONE  2011;6(5):e18889.
One of the great challenges in the post-genomic era is to decipher the underlying principles governing the dynamics of biological responses. As modulating gene expression levels is among the key regulatory responses of an organism to changes in its environment, identifying biologically relevant transcriptional regulators and their putative regulatory interactions with target genes is an essential step towards studying the complex dynamics of transcriptional regulation. We present an analysis that integrates various computational and biological aspects to explore the transcriptional regulation of systemic inflammatory responses through a human endotoxemia model. Given a high-dimensional transcriptional profiling dataset from human blood leukocytes, an elementary set of temporal dynamic responses which capture the essence of a pro-inflammatory phase, a counter-regulatory response and a dysregulation in leukocyte bioenergetics has been extracted. Upon identification of these expression patterns, fourteen inflammation-specific gene batteries that represent groups of hypothetically ‘coregulated’ genes are proposed. Subsequently, statistically significant cis-regulatory modules (CRMs) are identified and decomposed into a list of critical transcription factors (34) that are validated largely on primary literature. Finally, our analysis further allows for the construction of a dynamic representation of the temporal transcriptional regulatory program across the host, deciphering possible combinatorial interactions among factors under which they might be active. Although much remains to be explored, this study has computationally identified key transcription factors and proposed a putative time-dependent transcriptional regulatory program associated with critical transcriptional inflammatory responses. These results provide a solid foundation for future investigations to elucidate the underlying transcriptional regulatory mechanisms under the host inflammatory response. 
Also, the assumption that functionally relevant coexpressed genes are more likely to share a common transcriptional regulatory mechanism appears promising, making the proposed framework essential for unravelling context-specific transcriptional regulatory interactions underlying diverse mammalian biological processes.
PMCID: PMC3103499  PMID: 21637747
9.  A Predictive Model of the Oxygen and Heme Regulatory Network in Yeast 
PLoS Computational Biology  2008;4(11):e1000224.
Deciphering gene regulatory mechanisms through the analysis of high-throughput expression data is a challenging computational problem. Previous computational studies have used large expression datasets in order to resolve fine patterns of coexpression, producing clusters or modules of potentially coregulated genes. These methods typically examine promoter sequence information, such as DNA motifs or transcription factor occupancy data, in a separate step after clustering. We needed an alternative and more integrative approach to study the oxygen regulatory network in Saccharomyces cerevisiae using a small dataset of perturbation experiments. Mechanisms of oxygen sensing and regulation underlie many physiological and pathological processes, and only a handful of oxygen regulators have been identified in previous studies. We used a new machine learning algorithm called MEDUSA to uncover detailed information about the oxygen regulatory network using genome-wide expression changes in response to perturbations in the levels of oxygen, heme, Hap1, and Co2+. MEDUSA integrates mRNA expression, promoter sequence, and ChIP-chip occupancy data to learn a model that accurately predicts the differential expression of target genes in held-out data. We used a novel margin-based score to extract significant condition-specific regulators and assemble a global map of the oxygen sensing and regulatory network. This network includes both known oxygen and heme regulators, such as Hap1, Mga2, Hap4, and Upc2, as well as many new candidate regulators. MEDUSA also identified many DNA motifs that are consistent with previous experimentally identified transcription factor binding sites. Because MEDUSA's regulatory program associates regulators to target genes through their promoter sequences, we directly tested the predicted regulators for OLE1, a gene specifically induced under hypoxia, by experimental analysis of the activity of its promoter. 
In each case, deletion of the candidate regulator resulted in the predicted effect on promoter activity, confirming that several novel regulators identified by MEDUSA are indeed involved in oxygen regulation. MEDUSA can reveal important information from a small dataset and generate testable hypotheses for further experimental analysis. Supplemental data are included.
Author Summary
The cell uses complex regulatory networks to modulate the expression of genes in response to changes in cellular and environmental conditions. The transcript level of a gene is directly affected by the binding of transcriptional regulators to DNA motifs in its promoter sequence. Therefore, both expression levels of transcription factors and other regulatory proteins as well as sequence information in the promoters contribute to transcriptional gene regulation. In this study, we describe a new computational strategy for learning gene regulatory programs from gene expression data based on the MEDUSA algorithm. We learn a model that predicts differential expression of target genes from the expression levels of regulators, the presence of DNA motifs in promoter sequences, and binding data for transcription factors. Unlike many previous approaches, we do not assume that genes are regulated in clusters, and we learn DNA motifs de novo from promoter sequences as an integrated part of our algorithm. We use MEDUSA to produce a global map of the yeast oxygen and heme regulatory network. To demonstrate that MEDUSA can reveal detailed information about regulatory mechanisms, we perform biochemical experiments to confirm the predicted regulators for an important hypoxia gene.
PMCID: PMC2573020  PMID: 19008939
10.  microRNA-122 as a regulator of mitochondrial metabolic gene network in hepatocellular carcinoma 
A moderate loss of miR-122 function correlates with up-regulation of seed-matched genes and down-regulation of mitochondrially localized genes in both human hepatocellular carcinoma and in normal mice treated with anti-miR-122 antagomir. Putative direct targets up-regulated with loss of miR-122 and secondary targets down-regulated with loss of miR-122 are conserved between human beings and mice and are rapidly regulated in vitro in response to miR-122 over- and under-expression. Loss of miR-122 secondary target expression in either tumorous or adjacent non-tumorous tissue predicts poor survival of hepatocellular carcinoma patients.
Hepatocellular carcinoma (HCC) is one of the most aggressive human malignancies, common in Asia, Africa, and in areas with endemic infections of hepatitis-B or -C viruses (HBV or HCV) (But et al, 2008). Globally, the 5-year survival rate of HCC is <5% and about 600 000 HCC patients die each year. The high mortality associated with this disease is mainly attributed to the failure to diagnose HCC patients at an early stage and a lack of effective therapies for patients with advanced stage HCC. Understanding the relationships between phenotypic and molecular changes in HCC is, therefore, of paramount importance for the development of improved HCC diagnosis and treatment methods.
In this study, we examined mRNA and microRNA (miRNA)-expression profiles of tumor and adjacent non-tumor liver tissue from HCC patients. The patient population was selected from a region of endemic HBV infection, and HBV infection appears to contribute to the etiology of HCC in these patients. A total of 96 HCC patients were included in the study, of which about 88% tested positive for HBV antigen; patients testing positive for HCV antigen were excluded. Among the 220 miRNAs profiled, miR-122 was the most highly expressed miRNA in liver, and its expression was decreased almost two-fold in HCC tissue relative to adjacent non-tumor tissue, confirming earlier observations (Lagos-Quintana et al, 2002; Kutay et al, 2006; Budhu et al, 2008).
Over 1000 transcripts were correlated and over 1000 transcripts were anti-correlated with miR-122 expression. Consistent with the idea that transcripts anti-correlated with miR-122 are potential miR-122 targets, the most highly anti-correlated transcripts were highly enriched for the presence of the miR-122 central seed hexamer, CACTCC, in the 3′UTR. Although the complete set of negatively correlated genes was enriched for cell-cycle genes, the subset of seed-matched genes had no significant KEGG Pathway annotation, suggesting that miR-122 is unlikely to directly regulate the cell cycle in these patients. In contrast, transcripts positively correlated with miR-122 were not enriched for 3′UTR seed matches to miR-122. Interestingly, these 1042 transcripts were enriched for genes coding for mitochondrially localized proteins and for metabolic functions.
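The seed-match enrichment described above can be sketched as follows. This is a minimal illustration, assuming 3′UTRs are available as DNA strings and that the subset of interest (for example, the most anti-correlated transcripts) is drawn from the background set. The one-sided hypergeometric test is an assumed choice, since the statistic used in the study is not stated here, and the function names and toy sequences are hypothetical.

```python
from math import comb

SEED = "CACTCC"  # miR-122 central seed hexamer as quoted in the text

def has_seed(utr, seed=SEED):
    """True if the 3'UTR (DNA alphabet) contains the seed hexamer."""
    return seed in utr.upper()

def hypergeom_tail(k, n, K, N):
    """P(X >= k) when drawing n items from N, of which K are seed-matched."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def seed_enrichment(subset_utrs, background_utrs):
    """Return (k, n, K, N, p) for seed matches in a subset of the background."""
    N = len(background_utrs)
    K = sum(has_seed(u) for u in background_utrs)
    n = len(subset_utrs)
    k = sum(has_seed(u) for u in subset_utrs)
    return k, n, K, N, hypergeom_tail(k, n, K, N)

# Toy data (not real sequences): 10 UTRs, 3 of which carry the seed.
background = ["AAACACTCCAAA", "GGGCACTCCTTT", "cactccGG"] + ["AAAAAAAAAAAA"] * 7
subset = background[:3]  # stand-in for the most anti-correlated transcripts
print(seed_enrichment(subset, background))
```

A small p-value for the anti-correlated subset, together with no enrichment in the positively correlated subset, would match the pattern reported above.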
To analyze the impact of loss of miR-122 in vivo, silencing of miR-122 was performed by antisense inhibition (anti-miR-122) in wild-type mice (Figure 3). As with the genes negatively correlated with miR-122 in HCC patients, no significant biological annotation was associated with the seed-matched genes up-regulated by anti-miR-122 in mouse livers. The most significantly enriched biological annotation for anti-miR-122 down-regulated genes, as for positively correlated genes in HCC, was mitochondrial localization; the down-regulated mitochondrial genes were enriched for metabolic functions. Putative direct and downstream targets with orthologs on both the human and mouse microarrays showed significant overlap for regulations in the same direction. These overlaps defined sets of putative miR-122 primary and secondary targets. The results were further extended in the analysis of a separate dataset from 180 HCC, 40 cirrhotic, and 6 normal liver tissue samples (Figure 4), showing anti-correlation of proposed primary and secondary targets in non-healthy tissues.
To validate the direct correlation between miR-122 and some of the primary and secondary targets, we determined the expression of putative targets after transfection of miR-122 mimetic into PLC/PRF/5 HCC cells, including the putative direct targets SMARCD1 and MAP3K3 (MEKK3), a target described in the literature, CAT-1 (SLC7A1), and three putative secondary targets, PPARGC1A (PGC-1α) and succinate dehydrogenase subunits A and B. As expected, the putative direct targets showed reduced expression, whereas the putative secondary target genes showed increased expression in cells over-expressing miR-122 (Figure 4).
Functional classification of genes using the total ancestry method (Yu et al, 2007) identified PPARGC1A (PGC-1α) as the most connected secondary target. PPARGC1A has been proposed to function as a master regulator of mitochondrial biogenesis (Ventura-Clapier et al, 2008), suggesting that loss of PPARGC1A expression may contribute to the loss of mitochondrial gene expression correlated with loss of miR-122 expression. To further validate the link of miR-122 and PGC-1α protein, we transfected PLC/PRF/5 cells with miR-122-expression vector, and observed an increase in PGC-1α protein levels. Importantly, transfection of both miR-122 mimetic and miR-122-expression vector significantly reduced the lactate content of PLC/PRF/5 cells, whereas anti-miR-122 treatment increased lactate production. Together, the data support the function of miR-122 in mitochondrial metabolic functions.
Patient survival was not directly associated with miR-122-expression levels. However, miR-122 secondary targets were expressed at significantly higher levels in both tumor and adjacent non-tumor tissues among survivors as compared with deceased patients, providing supporting evidence for the potential relevance of loss of miR-122 function in HCC patient morbidity and mortality.
Overall, our findings reveal potentially new biological functions for miR-122 in liver physiology. We observed decreased expression of miR-122, a liver-specific miRNA, in HBV-associated HCC, and loss of miR-122 seemed to correlate with the decrease of mitochondrion-related metabolic pathway gene expression in HCC and in non-tumor liver tissues, a result that is consistent with the outcome of treatment of mice with anti-miR-122 and is of prognostic significance for HCC patients. Further investigation will be conducted to dissect the regulatory function of miR-122 on mitochondrial metabolism in HCC and to test whether increasing miR-122 expression can improve mitochondrial function in liver and perhaps in liver tumor tissues. Moreover, these results support the idea that primary targets of a given miRNA may be distributed over a variety of functional categories while resulting in a coordinated secondary response, potentially through synergistic action (Linsley et al, 2007).
Tumorigenesis involves multistep genetic alterations. To elucidate the microRNA (miRNA)–gene interaction network in carcinogenesis, we examined their genome-wide expression profiles in 96 pairs of tumor/non-tumor tissues from hepatocellular carcinoma (HCC). Comprehensive analysis of the coordinate expression of miRNAs and mRNAs reveals that miR-122 is under-expressed in HCC and that increased expression of miR-122 seed-matched genes leads to a loss of mitochondrial metabolic function. Furthermore, the miR-122 secondary targets, which decrease in expression, are good prognostic markers for HCC. Transcriptome profiling data from an additional 180 HCC and 40 liver cirrhotic patients in the same cohort were used to confirm the anti-correlation of miR-122 primary and secondary target gene sets. The HCC findings can be recapitulated in mouse liver by silencing miR-122 with antagomir treatment followed by gene-expression microarray analysis. In vitro miR-122 data further provided a direct link between induction of miR-122-controlled genes and impairment of mitochondrial metabolism. In conclusion, miR-122 regulates mitochondrial metabolism and its loss may be detrimental to sustaining critical liver function and contribute to morbidity and mortality of liver cancer patients.
PMCID: PMC2950084  PMID: 20739924
hepatocellular carcinoma; microarray; miR-122; mitochondrial; survival
11.  Quantitative Analysis of the Drosophila Segmentation Regulatory Network Using Pattern Generating Potentials 
PLoS Biology  2010;8(8):e1000456.
A new computational method uses gene expression databases and transcription factor binding specificities to describe regulatory elements in the Drosophila A/P patterning network in unprecedented detail.
Cis-regulatory modules that drive precise spatial-temporal patterns of gene expression are central to the process of metazoan development. We describe a new computational strategy to annotate genomic sequences based on their “pattern generating potential” and to produce quantitative descriptions of transcriptional regulatory networks at the level of individual protein-module interactions. We use this approach to convert the qualitative understanding of interactions that regulate Drosophila segmentation into a network model in which a confidence value is associated with each transcription factor-module interaction. Sequence information from multiple Drosophila species is integrated with transcription factor binding specificities to determine conserved binding site frequencies across the genome. These binding site profiles are combined with transcription factor expression information to create a model to predict module activity patterns. This model is used to scan genomic sequences for the potential to generate all or part of the expression pattern of a nearby gene, obtained from available gene expression databases. Interactions between individual transcription factors and modules are inferred by a statistical method to quantify a factor's contribution to the module's pattern generating potential. We use these pattern generating potentials to systematically describe the location and function of known and novel cis-regulatory modules in the segmentation network, identifying many examples of modules predicted to have overlapping expression activities. Surprisingly, conserved transcription factor binding site frequencies were as effective as experimental measurements of occupancy in predicting module expression patterns or factor-module interactions. Thus, unlike previous module prediction methods, this method predicts not only the location of modules but also their spatial activity pattern and the factors that directly determine this pattern. 
As databases of transcription factor specificities and in vivo gene expression patterns grow, analysis of pattern generating potentials provides a general method to decode transcriptional regulatory sequences and networks.
Author Summary
The developmental program specifying segmentation along the anterior-posterior axis of the Drosophila embryo is one of the best studied examples of transcriptional regulatory networks. Previous work has identified the location and function of dozens of DNA segments called cis-regulatory “modules” that regulate several genes in precise spatial patterns in the early embryo. In many cases, transcription factors that interact with such modules have also been identified. We present a novel computational framework that turns a qualitative and fragmented understanding of modules and factor-module interactions into a quantitative, systems-level view. The formalism utilizes experimentally characterized binding specificities of transcription factors and gene expression patterns to describe how multiple transcription factors (working as activators or repressors) act together in a module to determine its regulatory activity. This formalism can explain the expression patterns of known modules, infer factor-module interactions and quantify the potential of an arbitrary DNA segment to drive a gene's expression. We have also employed databases of gene expression patterns to find novel modules of the regulatory network. As databases of binding motifs and gene expression patterns grow, this new approach provides a general method to decode transcriptional regulatory sequences and networks.
PMCID: PMC2923081  PMID: 20808951
12.  The phosphoproteome of toll-like receptor-activated macrophages 
First global and quantitative analysis of phosphorylation cascades induced by toll-like receptor (TLR) stimulation in macrophages identifies nearly 7000 phosphorylation sites and shows extensive and dynamic up-regulation and down-regulation after lipopolysaccharide (LPS). In addition to the canonical TLR-associated pathways, mining of the phosphorylation data suggests an involvement of ATM/ATR kinases in signalling and shows that the cytoskeleton is a hotspot of TLR-induced phosphorylation. Intersecting transcription factor phosphorylation with bioinformatic promoter analysis of genes induced by LPS identified several candidate transcriptional regulators that were previously not implicated in TLR-induced transcriptional control.
Toll-like receptors (TLR) are a family of pattern recognition receptors that enable innate immune cells to sense infectious danger. Recognition of microbial structures, like lipopolysaccharide (LPS) of Gram-negative bacteria by TLR4, causes within hours substantial re-programming of macrophage gene expression, including up-regulation of chemokines driving inflammation, anti-microbial effector molecules and cytokines directing adaptive immune responses. TLR signalling is initiated by the adapter protein Myd88 and leads to the activation of kinase cascades that result in activation of the MAPK and NFkB pathways. Phosphorylation has an essential role in these early steps of TLR signalling, and in addition regulates critical transcription factors (TFs). Although TLR signalling has been extensively studied, a comprehensive analysis of phosphorylation events in TLR-activated macrophages is lacking. It is therefore unknown whether the canonical MAPK and NFkB pathways comprise the main phosphorylation events and which other molecular functions and processes are regulated by phosphorylation after stimulation with LPS.
Recent progress in mass spectrometry-based proteomics has opened the possibility to quantitatively investigate global changes in protein abundance and post-translational modifications. Stable isotope labelling with amino acids in cell culture (SILAC) allows highly accurate quantification, and has proved especially useful for direct comparison of phosphopeptide abundance in time-course or treatment analyses.
Here, we adapted SILAC to primary mouse macrophages, and performed a global, quantitative and kinetic analysis of the macrophage phosphoproteome after LPS stimulation. Bioinformatic analyses were used to identify kinases, pathways and biological processes enriched in the LPS-regulated phosphoproteome. To connect TF phosphorylation with transcription, we generated a parallel dataset of nascent RNA and used in silico promoter analysis to identify transcriptional regulators with binding site enrichment among the LPS-regulated gene set.
After establishing SILAC conditions for efficient labelling of primary bone marrow-derived macrophages, 1850 phosphoproteins with a total of 6956 phosphorylation sites were reproducibly identified in two independent experiments. Phosphoproteins were detected from all cellular compartments, with a clear enrichment for nuclear and cytoskeleton-associated proteins. LPS caused major regulation of a large fraction of phosphopeptides, with 24% of all sites up-regulated and 9% down-regulated after stimulation (Figure 3A and B). These changes were highly dynamic, as the majority of the regulated phosphopeptides were up-regulated or down-regulated transiently or in a delayed manner (Figure 3C). Overall, the extent of changes in the phosphoproteome was comparable to the transcriptional re-programming, underscoring the importance of phosphorylation cascades in TLR signalling. Our parallel transcriptome data also showed that widespread phosphorylation precedes massive transcriptional changes.
To obtain footprints of kinase activation in response to TLR ligation, we searched phosphopeptide sequences for known linear sequence motifs of 33 kinases and identified kinase motifs enriched among LPS-regulated phosphorylation sites (compared to non-regulated phosphorylation sites) (Table I). Motif ERK/MAPK was highly enriched, in accordance with the essential role of the MAPK module in TLR signalling. Other kinases with motif enrichment have also recently been linked to TLR signalling (e.g. PKD; AKT and its targets GSK3 and mTOR). However, the DNA damage-activated kinases ATM/ATR and the cell cycle-associated kinases AURORA and CHK1/2 have not been associated with the macrophage response to TLR activation yet. These findings shed new light on older data on the effect of TLR on macrophage proliferation in response to macrophage colony stimulating factor. Of interest, in follow-up experiments using pharmacological inhibitors of the kinases with motif enrichment, we observed that inhibition of ATM kinase activity caused increased LPS-induced expression of several cytokines and chemokines, suggesting that this pathway regulates inflammatory responses.
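The motif search described above can be sketched in a few lines. This is a minimal illustration, assuming each phosphorylation site is represented by a 13-residue sequence window with the phosphorylated residue at the centre (index 6). The proline-directed pattern used here, S or T followed by P, is a simplified stand-in for an ERK/MAPK-type motif; the 33 kinase motifs used in the study are not listed in the text, and the function names and toy windows are hypothetical.

```python
import re

# Assumed simplified motif: phospho-S/T at the window centre, proline at +1.
PROLINE_DIRECTED = re.compile(r"^.{6}[ST]P")

def matches_motif(window, motif=PROLINE_DIRECTED):
    """True if the centred site window matches the motif."""
    return bool(motif.match(window))

def motif_fraction(windows, motif=PROLINE_DIRECTED):
    """Fraction of site windows that match the motif."""
    if not windows:
        raise ValueError("no site windows supplied")
    return sum(matches_motif(w, motif) for w in windows) / len(windows)

def fold_enrichment(regulated, unregulated, motif=PROLINE_DIRECTED):
    """Motif frequency among regulated sites relative to non-regulated sites."""
    background = motif_fraction(unregulated, motif)
    if background == 0:
        raise ValueError("motif absent from the non-regulated sites")
    return motif_fraction(regulated, motif) / background

# Toy windows (not real phosphopeptides).
regulated = ["AAAAAASPAAAAA", "AAAAAATPAAAAA", "AAAAAASAAAAAA", "AAAAAASPAAAAA"]
unregulated = ["AAAAAASAAAAAA", "AAAAAASPAAAAA", "AAAAAATAAAAAA", "AAAAAASGAAAAA"]
print(fold_enrichment(regulated, unregulated))
```

A fold value alone does not establish enrichment; in practice it would be paired with a significance test and a correction for testing many motifs.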
In further bioinformatic analyses, the Gene Ontology and signalling pathway annotations of phosphoproteins were used to identify signalling pathways and cellular processes targeted by TLR4-controlled phosphorylation (Table II). Among the expected hits, based on the known TLR pathways, were TLR signalling, MAPK and AKT as well as mTOR signalling. Of interest, the annotation terms ‘Rho GTPase cycle' and ‘cytoskeleton' were significantly enriched among LPS-regulated phosphoproteins, indicating a more prominent role for cytoskeletal proteins in the transduction of TLR signals or in the biological response to it.
We were especially interested in the phosphorylation of TFs and its regulation by LPS (Figure 6A). We hypothesised that functionally important TFs should have an increased frequency of binding sites in the promoters of LPS-regulated genes (Figure 6B). To identify transcriptionally regulated genes with high sensitivity, we isolated nascent RNA after metabolic labelling (Figure 6C–E). In silico promoter scanning using Genomatix software for binding sites for all 50 TF families with phosphorylated members was used to test for enrichment in transcriptionally induced genes (Figure 6F). At the early time point, binding site enrichment for the canonical TLR-associated TF NFkB was detected, and in addition we found that several other TF families with an established role in the transcription of individual LPS-target genes showed binding site enrichment (CEBP, MEF2, NFAT and HEAT). In addition, enrichment for OCT and HOXC binding sites at the early time point and SORY matrices later after stimulation indicated an involvement of the phosphorylated members of the respective TF families in the execution of TLR-induced transcriptional responses. An initial test of the function for a few of these candidate transcriptional regulators was performed using siRNA knockdown in primary macrophages. These experiments suggested that knock down of the SORY binding phosphoprotein Capicua homolog (Cic) and to a lesser extent of the CREB family member Atf7 selectively attenuates LPS-induced expression of Il1a and Il1b.
In summary, this study provides a novel and global perspective on innate immune activation by TLR signalling (Figure 5). We quantitatively detected a large number of previously unknown site-specific phosphorylation events, which are now publicly available through the Phosida database. By combining different data mining approaches, we consistently identified canonical and newly implicated TLR-activated signalling modules. In particular, the PI3K/AKT and the related mTOR pathway were highlighted; furthermore, DNA damage–response associated ATM/ATR kinases and the cytoskeleton emerged as unexpected hotspots for phosphorylation. Finally, weaving together corresponding phosphoproteome and nascent transcriptome datasets through the loom of in silico promoter analysis we identified TFs with a likely role in mediating TLR-induced gene expression programmes.
Recognition of microbial danger signals by toll-like receptors (TLR) causes re-programming of macrophages. To investigate kinase cascades triggered by the TLR4 ligand lipopolysaccharide (LPS) on systems level, we performed a global, quantitative and kinetic analysis of the phosphoproteome of primary macrophages using stable isotope labelling with amino acids in cell culture, phosphopeptide enrichment and high-resolution mass spectrometry. In parallel, nascent RNA was profiled to link transcription factor (TF) phosphorylation to TLR4-induced transcriptional activation. We reproducibly identified 1850 phosphoproteins with 6956 phosphorylation sites, two thirds of which were not reported earlier. LPS caused major dynamic changes in the phosphoproteome (24% up-regulation and 9% down-regulation). Functional bioinformatic analyses confirmed canonical players of the TLR pathway and highlighted other signalling modules (e.g. mTOR, ATM/ATR kinases) and the cytoskeleton as hotspots of LPS-regulated phosphorylation. Finally, weaving together phosphoproteome and nascent transcriptome data by in silico promoter analysis, we implicated several phosphorylated TFs in primary LPS-controlled gene expression.
PMCID: PMC2913394  PMID: 20531401
macrophage; nascent RNA; phosphoproteome; SILAC; toll-like receptors
13.  Unraveling condition-dependent networks of transcription factors that control metabolic pathway activity in yeast 
While typically many expression levels change in transcription factor mutants, only few of these changes lead to functional changes. The predictive capability of expression and DNA binding data for such functional changes in metabolism is very limited. Large-scale 13C-flux data reveal the condition specificity of transcriptional control of metabolic function. Transcription control in yeast focuses on the switch between respiration and fermentation. Follow-up modeling on the basis of transcriptomics and proteomics data suggests the newly discovered Gcn4 control of respiration to be mediated via PKA and Snf1.
Effective control and modulation of cellular behavior is of paramount importance in medicine (Kreeger and Lauffenburger, 2010) and biotechnology (Haynes and Silver, 2009), and requires profound understanding of control mechanisms. In this study, we aim to elucidate the extent to which transcription factors control the operation of yeast metabolism. As a quantitative readout of metabolic function, we monitored the traffic of small molecules through various pathways of central metabolism by 13C-flux analysis (Sauer, 2006). The chosen growth conditions represent two different regulatory states of reduced (galactose) and maximal carbon source repression (glucose), as well as a different nitrogen metabolism and two common, permanent stress conditions.
Depending on the growth condition, between 7 and 13% of the deleted transcription factors altered the determined flux ratios (Figure 3). Of the six quantified flux ratios, only the glycolysis/pentose phosphate pathway split, and the convergent ratio of anaplerosis and TCA cycle were controlled by the deleted transcription factors. Thus, we concluded that 23 transcription factors control flux distributions under at least one of the tested growth conditions, leading to 42 condition-dependent interactions of transcription factors with metabolic pathway activity (Figure 4). With two exceptions, all other identified transcription factor interactions controlled the TCA cycle flux. This condition-specific active control of metabolic function could not have been predicted from DNA binding and expression data; that is, 26.1% false negatives, 48.6% true positives.
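The bookkeeping behind these counts can be made concrete with a minimal sketch. It assumes the screen has been reduced to a list of (transcription factor, condition) pairs, one pair for each deletion mutant whose flux ratios changed under that condition. The function name, labels, and toy values below are hypothetical and do not reproduce the study's data.

```python
from collections import defaultdict

def summarize_interactions(hits, n_mutants):
    """Summarize condition-dependent interactions.

    hits: iterable of (tf, condition) pairs where the deletion altered a flux ratio.
    n_mutants: number of deletion mutants screened per condition.
    """
    by_tf = defaultdict(set)
    by_condition = defaultdict(set)
    for tf, condition in set(hits):  # set() drops duplicate pairs
        by_tf[tf].add(condition)
        by_condition[condition].add(tf)
    return {
        "n_interactions": sum(len(conds) for conds in by_tf.values()),
        "n_tfs": len(by_tf),
        "multi_condition_tfs": sorted(tf for tf, conds in by_tf.items() if len(conds) > 1),
        "fraction_per_condition": {
            cond: len(tfs) / n_mutants for cond, tfs in by_condition.items()
        },
    }

# Toy screen (hypothetical labels): 10 mutants, 3 conditions.
toy_hits = [("TF_A", "cond_1"), ("TF_A", "cond_2"), ("TF_B", "cond_1"), ("TF_C", "cond_3")]
print(summarize_interactions(toy_hits, n_mutants=10))
```

In these terms, the text reports 42 interactions spread over 23 transcription factors, with a per-condition fraction between 7 and 13% of the deleted transcription factors.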
Of the 23 transcription factors that controlled TCA cycle flux distributions under the tested conditions, only Bas1, Gcn4, Gcr2 and Pho2 exerted control under more than one condition. We identified Cit1, Mdh1 and Idh1/2 with a proteomics approach as the relevant target enzymes that increase the TCA cycle flux. Next, we asked whether Bas1, Gcr2, Gcn4 and Pho2 act directly on the TCA cycle or mediate their effect indirectly. Based on the transcriptomics data, the pattern of differentially activated transcription factors inferred by the differential expression of their target genes suggested reduced glucose repression in all four mutants as the common mechanism.
Starting from the currently largest set of 13C-based flux distributions, we identified networks of individual transcription factors that control metabolic pathway activity. These networks of active metabolic control have the following properties. First, they are highly condition dependent, as at most four transcription factors control the same metabolic flux distribution under more than one growth condition. Second, they focus almost exclusively on the TCA cycle, thereby controlling the switch between respiratory and fermentative metabolism. Third, with four to 14 active transcription factors, they are small compared with gene regulation networks that were obtained from expression and DNA binding data. For the metabolic network studied here, robustness is also apparent from the fact that upregulated TCA cycle fluxes were not sufficient to achieve full respiratory metabolism; that is, absent or low ethanol formation. Several mechanisms could explain the observed robustness. The most likely explanation is that environmental signals might be transmitted by different signaling pathways to several transcription factors, whose orchestrated action on multiple target genes is necessary to achieve a functional flux response. This hypothesis would explain why several transcription factors exert flux effects on the same pathway, but each flux effect is relatively small, as further, coordinated manipulations would be necessary to further improve the respiratory flux. Our findings demonstrate the importance of identifying and quantifying the extent to which regulatory effectors alter cellular function.
Which transcription factors control the distribution of metabolic fluxes under a given condition? We address this question by systematically quantifying metabolic fluxes in 119 transcription factor deletion mutants of Saccharomyces cerevisiae under five growth conditions. While most knockouts did not affect fluxes, we identified 42 condition-dependent interactions that were mediated by a total of 23 transcription factors that control almost exclusively the cellular decision between respiration and fermentation. This relatively sparse, condition-specific network of active metabolic control contrasts with the much larger gene regulation network inferred from expression and DNA binding data. Based on protein and transcript analyses in key mutants, we identified three enzymes in the tricarboxylic acid cycle as the key targets of this transcriptional control. For the transcription factor Gcn4, we demonstrate that this control is mediated through the PKA and Snf1 signaling cascade. The discrepancy between flux response predictions, based on the known regulatory network architecture and our functional 13C-data, demonstrates the importance of identifying and quantifying the extent to which regulatory effectors alter cellular functions.
PMCID: PMC3010106  PMID: 21119627
metabolic flux; omics data; regulatory network; transcription factor; transcriptional regulation
14.  Modeling Co-Expression across Species for Complex Traits: Insights to the Difference of Human and Mouse Embryonic Stem Cells 
PLoS Computational Biology  2010;6(3):e1000707.
Complex interactions between genes or proteins contribute substantially to phenotypic evolution. We present a probabilistic model and a maximum likelihood approach for cross-species clustering analysis and for identification of conserved as well as species-specific co-expression modules. This model enables a “soft” cross-species clustering (SCSC) approach by encouraging but not enforcing orthologous genes to be grouped into the same cluster. SCSC is therefore robust to obscure orthologous relationships and can reflect different functional roles of orthologous genes in different species. We generated a time-course gene expression dataset for differentiating mouse embryonic stem (ES) cells, and compiled a dataset of published gene expression data on differentiating human ES cells. Applying SCSC to analyze these datasets, we identified conserved and species-specific gene regulatory modules. Together with protein-DNA binding data, an SCSC cluster specifically induced in murine ES cells indicated that the KLF2/4/5 transcription factors, although critical to maintaining the pluripotent phenotype in mouse ES cells, were decoupled from the OCT4/SOX2/NANOG regulatory module in human ES cells. Two of the target genes of murine KLF2/4/5, LIN28 and NODAL, were rewired to be targets of OCT4/SOX2/NANOG in human ES cells. Moreover, by mapping SCSC clusters onto KEGG signaling pathways, we identified the signal transduction components that were induced in pluripotent ES cells in either a conserved or a species-specific manner. These results suggest that the pluripotent cell identity can be established and maintained through more than one gene regulatory network.
Author Summary
A major goal in biology is to understand the evolution of complex traits, such as the development of multicellular body plans. To a certain extent, complex traits are governed by regulated gene expression. Comparing expression data between species requires more care than comparing sequences, because gene expression is not static and the level of expression is influenced by external conditions. Considering that co-expression patterns are often comparable across species, we developed a statistical model for cross-species clustering analysis. The model allows each species to create its own clusters of the genes but also encourages the species to borrow strength from each other's clusters of orthologous genes. The result is a pairing of clusters, one from each species, where the paired clusters share many but not necessarily all orthologous genes. The model-based approach not only reduces subjective influence but also enables effective use of evolutionary dependence. Applying this model to analyze human and mouse embryonic stem (ES) cell data, we identified the transcription factors and the signaling proteins that are specifically expressed in either human or mouse ES cells. These results suggest that the pluripotent cell identity can be established and maintained through more than one gene regulatory network.
PMCID: PMC2837392  PMID: 20300647
15.  Targeted interactomics reveals a complex core cell cycle machinery in Arabidopsis thaliana 
A protein interactome focused towards cell proliferation was mapped comprising 857 interactions among 393 proteins, leading to many new insights in plant cell cycle regulation. A comprehensive view on heterodimeric cyclin-dependent kinase (CDK)/cyclin complexes in plants is obtained, in relation with their regulators. Over 100 new candidate cell cycle proteins were predicted.
The basic underlying mechanisms that govern the cell cycle are conserved among all eukaryotes. Peculiar for plants, however, is that their genome contains a collection of cell cycle regulatory genes that is intriguingly large (Vandepoele et al, 2002; Menges et al, 2005) compared to other eukaryotes. Arabidopsis thaliana (Arabidopsis) encodes 71 genes in five regulatory classes versus only 15 in yeast and 23 in human.
Despite the discovery of numerous cell cycle genes, little is known about the protein complex machinery that steers plant cell division. Therefore, we applied a tandem affinity purification (TAP) approach coupled with mass spectrometry (MS) on Arabidopsis cell suspension cultures to isolate and analyze protein complexes involved in the cell cycle. This approach allowed us to successfully map a first draft of the basic cell cycle complex machinery of Arabidopsis, providing many new insights into plant cell division.
To map the interactome, we relied on a streamlined platform comprising generic Gateway-based vectors with high cloning flexibility, the fast generation of transgenic suspension cultures, TAP adapted for plant cells, and matrix-assisted laser desorption ionization (MALDI) tandem-MS for the identification of purified proteins (Van Leene et al, 2007, 2008). Complexes for 102 cell cycle proteins were analyzed using this approach, leading to a non-redundant data set of 857 interactions among 393 proteins (Figure 1A). Two subspaces were identified in this data set, domain I1, containing interactions confirmed in at least two independent experimental repeats or in the reciprocal purification experiment, and domain I2 consisting of uniquely observed interactions.
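The I1/I2 split described above is essentially a counting rule, which can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `split_domains`, the `(bait, prey, repeat_id)` record layout, and the placeholder protein names are all assumptions made for the example.

```python
from collections import defaultdict


def split_domains(observations):
    """Split observed interactions into domains I1 and I2.

    observations: iterable of (bait, prey, repeat_id) tuples (hypothetical layout).
    An interaction goes to I1 if the same bait recovered the prey in at least
    two repeats, or if it was also seen in the reciprocal purification.
    Everything else goes to I2. Pairs are returned as frozensets.
    """
    seen = defaultdict(set)  # unordered pair -> {(bait, repeat_id), ...}
    for bait, prey, repeat_id in observations:
        seen[frozenset((bait, prey))].add((bait, repeat_id))

    i1, i2 = set(), set()
    for pair, hits in seen.items():
        baits = {b for b, _ in hits}
        reciprocal = len(baits) == 2  # both partners worked as bait
        repeated = any(
            sum(1 for b, _ in hits if b == bait) >= 2 for bait in baits
        )
        (i1 if (reciprocal or repeated) else i2).add(pair)
    return i1, i2


# Placeholder names only; these are not interactions from the study.
observations = [
    ("baitA", "preyB", 1),
    ("baitA", "preyB", 2),   # seen in two repeats -> I1
    ("baitC", "preyD", 1),
    ("preyD", "baitC", 1),   # reciprocal purification -> I1
    ("baitA", "preyE", 1),   # seen once -> I2
]
domain_i1, domain_i2 = split_domains(observations)
```

Storing each pair as a `frozenset` makes the bait/prey direction irrelevant for identity, while the recorded bait still lets the rule detect a reciprocal purification.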
Several observations underlined the quality of both domains. All tested reverse purifications found the original interaction, and 150 known or predicted interactions were confirmed, meaning that a large number of new interactions were also revealed. An in-depth computational analysis revealed enrichment for many cell cycle-related features among the proteins of the network (Figure 1B), and many protein pairs were coregulated at the transcriptional level (Figure 1C). Through integration of known cell cycle-related features, more than 100 new candidate cell cycle proteins were predicted (Figure 1D). Besides common qualities of both interactome domains, their real significance appeared through mutual differences exposing two subspaces in the cell cycle interactome: a central regulatory network of stable complexes that are repeatedly isolated and represent core regulatory units, and a peripheral network comprising transient interactions identified less frequently, which are involved in other aspects of the process, such as crosstalk between core complexes or connections with other pathways. To evaluate the biological relevance of the cell cycle interactome in plants, we validated interactions from both domains by a transient split-luciferase assay in Arabidopsis plants (Marion et al, 2008), further sustaining the hypothesis-generating power of the data set to understand plant growth.
With respect to insights into the cell cycle physiology, the interactome was subdivided according to the functional classes of the baits and core protein complexes were extracted, covering cyclin-dependent kinase (CDK)/cyclin core complexes together with their positive and negative regulation networks, DNA replication complexes, the anaphase-promoting complex, and spindle checkpoint complexes. The data imply that mitotic A- and B-type cyclins exclusively form heterodimeric complexes with the plant-specific B-type CDKs and not with CDKA;1, whereas D-type cyclins seem to associate with CDKA;1. Besides the extraction of complexes previously shown in other organisms, our data also suggested many new functional links; for example, the link coupling cell division with the regulation of transcript splicing. The association of negative regulators of CDK/cyclin complexes with transcription factors suggests that their role in reallocation is not solely targeted to CDK/cyclin complexes. New members of the Siamese-related inhibitory proteins were identified, and for the first time potential inhibitors of plant-specific mitotic B-type CDKs have been found in plants. New evidence that the E2F–DP–RBR network is not only active at G1-to-S, but also at the G2-to-M transition is provided and many complexes involved in DNA replication or repair were isolated. For the first time, a plant APC has been isolated biochemically, identifying three potential new plant-specific APC interactors, and finally, complexes involved in the spindle checkpoint were isolated mapping many new but specific interactions.
Finally, to get a general view on the complex machinery, modules of interacting cyclins and core cell cycle regulators were ranked along the cell cycle phases according to the transcript expression peak of the cyclins, showing an assorted set of CDK–cyclin complexes with high regulatory differentiation (Figure 4). Even within the same subfamily (e.g. cyclin A3, B1, B2, D3, and D4), cyclins differ not only in their functional time frame but also in the type and number of CDKs, inhibitors, and scaffolding proteins they bind, further indicating their functional diversification. According to our interaction data, at least 92 different variants of CDK–cyclin complexes are found in Arabidopsis.
In conclusion, these results reflect how several rounds of gene duplication (Sterck et al, 2007) led to the evolution of a large set of cyclin paralogs and a myriad of regulators, resulting in a significant jump in the complexity of the cell cycle machinery that could accommodate unique plant-specific features such as an indeterminate mode of postembryonic development. Through their extensive regulation and connection with a myriad of up- and downstream pathways, the core cell cycle complexes might offer the plant a flexible toolkit to fine-tune cell proliferation in response to an ever-changing environment.
Cell proliferation is the main driving force for plant growth. Although genome sequence analysis revealed a high number of cell cycle genes in plants, little is known about the molecular complexes steering cell division. In a targeted proteomics approach, we mapped the core complex machinery at the heart of the Arabidopsis thaliana cell cycle control. Besides a central regulatory network of core complexes, we distinguished a peripheral network that links the core machinery to up- and downstream pathways. Over 100 new candidate cell cycle proteins were predicted and an in-depth biological interpretation demonstrated the hypothesis-generating power of the interaction data. The data set provided a comprehensive view on heterodimeric cyclin-dependent kinase (CDK)–cyclin complexes in plants. For the first time, inhibitory proteins of plant-specific B-type CDKs were discovered and the anaphase-promoting complex was characterized and extended. Important conclusions were that mitotic A- and B-type cyclins form complexes with the plant-specific B-type CDKs and not with CDKA;1, and that D-type cyclins and S-phase-specific A-type cyclins seem to be associated exclusively with CDKA;1. Furthermore, we could show that plants have evolved a combinatorial toolkit consisting of at least 92 different CDK–cyclin complex variants, which strongly underscores the functional diversification among the large family of cyclins and reflects the pivotal role of cell cycle regulation in the developmental plasticity of plants.
PMCID: PMC2950081  PMID: 20706207
Arabidopsis thaliana; cell cycle; interactome; protein complex; protein interactions
16.  Bell's palsy 
Clinical Evidence  2011;2011:1204.
Bell's palsy is characterised by an acute, unilateral, partial, or complete paralysis of the face (i.e., lower motor neurone pattern). The weakness may be partial (paresis) or complete (paralysis), and may be associated with mild pain, numbness, increased sensitivity to sound, and altered taste. Bell's palsy remains idiopathic, but a proportion of cases may be caused by reactivation of herpes viruses from the geniculate ganglion of the facial nerve. Bell's palsy is most common in people aged 15 to 40 years, with a 1 in 60 lifetime risk. Most make a spontaneous recovery within 1 month, but up to 30% show delayed or incomplete recovery.
Methods and outcomes
We conducted a systematic review to answer the following clinical question: What are the effects of treatments in adults and children? We searched: Medline, Embase, The Cochrane Library, and other important databases up to June 2010 (Clinical Evidence reviews are updated periodically; please check our website for the most up-to-date version of this review). We included harms alerts from relevant organisations such as the US Food and Drug Administration (FDA) and the UK Medicines and Healthcare products Regulatory Agency (MHRA).
We found 14 systematic reviews, RCTs, or observational studies that met our inclusion criteria. We performed a GRADE evaluation of the quality of evidence for interventions.
In this systematic review we present information relating to the effectiveness and safety of the following interventions: antiviral treatment, corticosteroids (alone or plus antiviral treatment), hyperbaric oxygen therapy, facial nerve decompression surgery, and facial retraining.
Key Points
Bell's palsy is an idiopathic, unilateral, acute paresis or paralysis of facial movement caused by dysfunction of the lower motor neurone. Up to 30% of people with acute peripheral facial palsy have an alternative cause diagnosed at presentation or during the course of their facial palsy. Alternative causes are higher in children (>50%), warranting specialist evaluation at presentation. Severe pain, vesicles (ear or oral), and hearing loss or imbalance, suggest Ramsay Hunt syndrome caused by herpes zoster virus infection, which requires specialist management. Most people with paresis (partial weakness) make a spontaneous recovery within 3 weeks. Up to 30% of people, typically people with paralysis (complete palsy), have a delayed or incomplete recovery.
Corticosteroids alone improve rate of recovery and the proportion of people who make a full recovery, and reduce cosmetically disabling sequelae, motor synkinesis, and autonomic dysfunction compared with placebo or no treatment.
Antiviral treatment alone is no more effective than placebo and is less effective than corticosteroid treatment at improving recovery of facial motor function and at reducing the risk of disabling sequelae.
For people with paresis at presentation (about 70%), there is no evidence of a clinically important additive effect of adding antivirals to corticosteroid therapy. For people who develop paralysis (about 30%), and may demonstrate a trend towards complete degeneration on electrophysiological testing, it is unknown whether adding antiviral treatment to corticosteroid therapy has a significant additive or synergistic effect.
Hyperbaric oxygen may improve time to recovery and the proportion of people who make a full recovery compared with corticosteroids; however, the evidence for this is weak.
We don't know whether facial nerve decompression surgery is beneficial in Bell's palsy.
Facial retraining may improve recovery of facial motor function scores including stiffness and lip mobility, and may reduce the risk of motor synkinesis in Bell's palsy, but the evidence is too weak to draw conclusions.
Clinical guide
Good evidence exists that corticosteroid therapy improves facial palsy in people with Bell's palsy independent of severity at presentation. Treatment is likely to be more effective when started within 72 hours of onset, and less effective after 7 days. Contraindications to corticosteroid therapy exist and adverse effects are more likely following 7 days of treatment. Combination therapy with a corticosteroid and antiviral is no more effective than corticosteroid therapy alone for Bell's palsy; however, combination therapy should be considered when there is evidence of viral infection with herpes zoster, such as zoster sine herpete and Ramsay Hunt syndrome. People presenting with complete facial paralysis should be offered a choice of combination therapy with a corticosteroid and antiviral, because the evidence for therapy without antivirals is not yet definitive for this group and antivirals have few adverse effects. In people presenting with mild facial paresis from Bell's palsy, there is a high rate of spontaneous resolution without treatment. Bell's palsy is a diagnosis of exclusion and clinicians should remain mindful of the causes of facial palsy, including tumour and infection. All children presenting with facial palsy and adults with delayed recovery should be referred for assessment by an otolaryngologist - head and neck surgeon or other appropriate specialist. The authors believe that facial palsy should not be treated only by protocol-driven practice. Bell's palsy is a diagnosis of exclusion, although a search for other causes of facial palsy must not delay treatment of likely Bell's palsy. Patients should have the opportunity to participate in an informed choice in their management where relevant.
PMCID: PMC3275144  PMID: 21375786
17.  Co-expression module analysis reveals biological processes, genomic gain, and regulatory mechanisms associated with breast cancer progression 
BMC Systems Biology  2010;4:74.
Gene expression signatures are typically identified by correlating gene expression patterns to a disease phenotype of interest. However, individual gene-based signatures usually suffer from low reproducibility and interpretability.
We have developed a novel algorithm, Iterative Clique Enumeration (ICE), for identifying relatively independent maximal cliques as co-expression modules and a module-based approach to the analysis of gene expression data. Applying this approach on a public breast cancer dataset identified 19 modules whose expression levels were significantly correlated with tumor grade. The correlations were reproducible for 17 modules in an independent breast cancer dataset, and the reproducibility was considerably higher than that based on individual genes or modules identified by other algorithms. Sixteen out of the 17 modules showed significant enrichment in certain Gene Ontology (GO) categories. Specifically, modules related to cell proliferation and immune response were up-regulated in high-grade tumors while those related to cell adhesion were down-regulated. Further analyses showed that transcription factors NYFB, E2F1/E2F3, NRF1, and ELK1 were responsible for the up-regulation of the cell proliferation modules. IRF family and ETS family proteins were responsible for the up-regulation of the immune response modules. Moreover, inhibition of the PPARA signaling pathway may also play an important role in tumor progression. The module without GO enrichment was found to be associated with a potential genomic gain in 8q21-23 in high-grade tumors. The 17-module signature of breast tumor progression clustered patients into subgroups with significantly different relapse-free survival times. Namely, patients with lower cell proliferation and higher cell adhesion levels had significantly lower risk of recurrence, both for all patients (p = 0.004) and for those with grade 2 tumors (p = 0.017).
The ICE algorithm is effective in identifying relatively independent co-expression modules from gene co-expression networks and the module-based approach illustrated in this study provides a robust, interpretable, and mechanistic characterization of transcriptional changes.
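The idea of treating maximal cliques of a co-expression network as modules can be illustrated with a small sketch. This is explicitly not the published ICE procedure, whose details are not given in the abstract; it is a simplified greedy stand-in that repeatedly takes the largest maximal clique and removes its genes so that later modules do not overlap earlier ones. The function names, the `min_size` parameter, and the toy gene labels are assumptions.

```python
def build_adjacency(edges):
    """Turn an undirected edge list into {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def maximal_cliques(adj):
    """Yield every maximal clique (Bron-Kerbosch with pivoting)."""
    def expand(r, p, x):
        if not p and not x:
            yield set(r)
            return
        # Pivot on the vertex with most neighbours in p to prune branches.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    yield from expand(set(), set(adj), set())


def greedy_clique_modules(adj, min_size=3):
    """Greedy stand-in: take the largest clique, delete its nodes, repeat."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    modules = []
    while adj:
        best = max(maximal_cliques(adj), key=len)
        if len(best) < min_size:
            break
        modules.append(best)
        for v in best:
            del adj[v]
        for nbrs in adj.values():
            nbrs -= best
    return modules


# Toy network: an edge stands for "expression correlation above a threshold".
edges = [
    ("g1", "g2"), ("g1", "g3"), ("g1", "g4"),
    ("g2", "g3"), ("g2", "g4"), ("g3", "g4"),   # 4-gene clique
    ("g5", "g6"), ("g5", "g7"), ("g6", "g7"),   # 3-gene clique
    ("g4", "g5"), ("g7", "g8"),                 # bridging edges
]
modules = greedy_clique_modules(build_adjacency(edges))
```

Removing a module's genes before the next round is one simple way to obtain non-overlapping modules; the published method aims for "relatively independent" cliques and may handle overlap differently.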
PMCID: PMC2902438  PMID: 20507583
18.  Genome-Wide Discovery of Drug-Dependent Human Liver Regulatory Elements 
PLoS Genetics  2014;10(10):e1004648.
Inter-individual variation in gene regulatory elements is hypothesized to play a causative role in adverse drug reactions and reduced drug activity. However, relatively little is known about the location and function of drug-dependent elements. To uncover drug-associated elements in a genome-wide manner, we performed RNA-seq and ChIP-seq using antibodies against the pregnane X receptor (PXR) and three active regulatory marks (p300, H3K4me1, H3K27ac) on primary human hepatocytes treated with rifampin or vehicle control. Rifampin and PXR were chosen since they are part of the CYP3A4 pathway, which is known to account for the metabolism of more than 50% of all prescribed drugs. We selected 227 proximal promoters for genes with rifampin-dependent expression or nearby PXR/p300 occupancy sites and assayed their ability to induce luciferase in rifampin-treated HepG2 cells, finding only 10 (4.4%) that exhibited drug-dependent activity. As this result suggested a role for distal enhancer modules, we searched more broadly to identify 1,297 genomic regions bearing a conditional PXR occupancy as well as all three active regulatory marks. These regions are enriched near genes that function in the metabolism of xenobiotics, specifically members of the cytochrome P450 family. We performed enhancer assays in rifampin-treated HepG2 cells for 42 of these sequences as well as 7 sequences that overlap linkage-disequilibrium blocks defined by lead SNPs from pharmacogenomic GWAS studies, revealing 15/42 and 4/7 to be functional enhancers, respectively. A common African haplotype in one of these enhancers in the GSTA locus was found to exhibit potential rifampin hypersensitivity. Combined, our results further suggest that enhancers are the predominant targets of rifampin-induced PXR activation, provide a genome-wide catalog of PXR targets and serve as a model for the identification of drug-responsive regulatory elements.
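The validation rates quoted above can be checked with a few lines of arithmetic. The counts are taken from the abstract; the helper name `percent` and the rounding to one decimal place are choices made for this sketch.

```python
def percent(hits, total):
    """Share of hits among total, as a percentage rounded to one decimal."""
    return round(100 * hits / total, 1)


promoter_rate = percent(10, 227)   # drug-dependent proximal promoters
enhancer_rate = percent(15, 42)    # functional enhancers among PXR/mark regions
gwas_rate = percent(4, 7)          # functional enhancers among GWAS-linked sequences

print(promoter_rate, enhancer_rate, gwas_rate)
```

The much higher hit rate among the distal candidates than among proximal promoters is the comparison behind the conclusion that enhancers are the predominant targets.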
Author Summary
Drug response varies between individuals and can be caused by genetic factors. Nucleotide variation in gene regulatory elements can have a significant effect on drug response, but due to the difficulty in identifying these elements, they remain understudied. Here, we used various genomic assays to analyze human liver cells treated with or without the antibiotic rifampin and identified drug-induced regulatory elements genome-wide. The testing of numerous active promoters in human liver cells showed only a few to be induced by rifampin treatment. A similar analysis of enhancers found several of them to be induced by the drug. Nucleotide variants in one of these enhancers were found to alter its activity. Combined, this work identifies numerous novel gene regulatory elements that can be activated due to drug response and thus provides candidate sequences in the human genome where nucleotide variation can lead to differences in drug response. It also provides a universally applicable method to detect these elements for other drugs.
PMCID: PMC4183418  PMID: 25275310
19.  Identification of gene co-regulatory modules and associated cis-elements involved in degenerative heart disease 
BMC Medical Genomics  2009;2:31.
Cardiomyopathies, degenerative diseases of cardiac muscle, are among the leading causes of death in the developed world. Microarray studies of cardiomyopathies have identified up to several hundred genes that significantly alter their expression patterns as the disease progresses. However, the regulatory mechanisms driving these changes, in particular the networks of transcription factors involved, remain poorly understood. Our goals are (A) to identify modules of co-regulated genes that undergo similar changes in expression in various types of cardiomyopathies, and (B) to reveal the specific pattern of transcription factor binding sites, cis-elements, in the proximal promoter region of genes comprising such modules.
We analyzed 149 microarray samples from human hypertrophic and dilated cardiomyopathies of various etiologies. Hierarchical clustering and Gene Ontology annotations were applied to identify modules enriched in genes with highly correlated expression and a similar physiological function. To discover motifs that may underlie changes in expression, we used the promoter regions for genes in three of the most interesting modules as input to motif discovery algorithms. The resulting motifs were used to construct a probabilistic model predictive of changes in expression across different cardiomyopathies.
We found that three modules with the highest degree of functional enrichment contain genes involved in myocardial contraction (n = 9), energy generation (n = 20), or protein translation (n = 20). Motif discovery tools revealed that genes in the contractile module contain a TATA-box followed by a CACC-box, and are depleted in other GC-rich motifs; whereas genes in the translation module contain a pyrimidine-rich initiator, Elk-1, SP-1, and a novel motif with a GCGC core. A naïve Bayes classifier revealed that patterns of motifs are statistically predictive of expression patterns, with odds ratios of 2.7 (contractile), 1.9 (energy generation), and 5.5 (protein translation).
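For readers unfamiliar with the statistic, an odds ratio for a classifier can be computed from a 2x2 table of predictions against observed module membership. The sketch below uses that generic definition; the abstract does not state exactly how the authors computed theirs, and the counts here are invented purely for illustration.

```python
def odds_ratio(tp, fp, fn, tn):
    """Odds ratio of a 2x2 table: (tp / fn) / (fp / tn) = tp * tn / (fp * fn).

    tp: predicted in module, truly in module
    fp: predicted in module, not in module
    fn: predicted out, truly in module
    tn: predicted out, not in module
    """
    if fp == 0 or fn == 0:
        raise ValueError("odds ratio is undefined when fp or fn is zero")
    return (tp * tn) / (fp * fn)


# Invented counts, not data from the study.
example = odds_ratio(tp=30, fp=10, fn=20, tn=20)
print(example)
```

A value above 1 means genes carrying the motif pattern are more likely to show the module's expression change than genes without it; a value of 1 means the pattern carries no information.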
We identified patterns comprised of putative cis-regulatory motifs enriched in the upstream promoter sequence of genes that undergo similar changes in expression secondary to cardiomyopathies of various etiologies. Our analysis is a first step towards understanding transcription factor networks that are active in regulating gene expression during degenerative heart disease.
PMCID: PMC2700136  PMID: 19476647
20.  Multi-Organ Expression Profiling Uncovers a Gene Module in Coronary Artery Disease Involving Transendothelial Migration of Leukocytes and LIM Domain Binding 2: The Stockholm Atherosclerosis Gene Expression (STAGE) Study 
PLoS Genetics  2009;5(12):e1000754.
Environmental exposures filtered through the genetic make-up of each individual alter the transcriptional repertoire in organs central to metabolic homeostasis, thereby affecting arterial lipid accumulation, inflammation, and the development of coronary artery disease (CAD). The primary aim of the Stockholm Atherosclerosis Gene Expression (STAGE) study was to determine whether there are functionally associated genes (rather than individual genes) important for CAD development. To this end, two-way clustering was used on 278 transcriptional profiles of liver, skeletal muscle, and visceral fat (n = 66/tissue) and atherosclerotic and unaffected arterial wall (n = 40/tissue) isolated from CAD patients during coronary artery bypass surgery. The first step, across all mRNA signals (n = 15,042/12,621 RefSeqs/genes) in each tissue, resulted in a total of 60 tissue clusters (n = 3958 genes). In the second step (performed within tissue clusters), one atherosclerotic lesion (n = 49/48) and one visceral fat (n = 59) cluster segregated the patients into two groups that differed in the extent of coronary stenosis (P = 0.008 and P = 0.00015). The associations of these clusters with coronary atherosclerosis were validated by analyzing carotid atherosclerosis expression profiles. Remarkably, in one cluster (n = 55/54) relating to carotid stenosis (P = 0.04), 27 genes in the two clusters relating to coronary stenosis were confirmed (n = 16/17, P<10^−27 and 10^−30). Genes in the transendothelial migration of leukocytes (TEML) pathway were overrepresented in all three clusters, referred to as the atherosclerosis module (A-module). In a second validation step, using three independent cohorts, the A-module was found to be genetically enriched with CAD risk by 1.8-fold (P<0.004).
The transcription co-factor LIM domain binding 2 (LDB2) was identified as a potential high-hierarchy regulator of the A-module, a notion supported by subnetwork analysis, by cellular and lesion expression of LDB2, and by the expression of 13 TEML genes in Ldb2–deficient arterial wall. Thus, the A-module appears to be important for atherosclerosis development and, together with LDB2, merits further attention in CAD research.
Author Summary
The WHO predicts that coronary artery disease (CAD) will become the leading cause of death worldwide in 2010. Currently, major research efforts are focused on understanding the genetics of CAD through multi-center, genome-wide association studies of tens of thousands of patients and controls. Such studies can identify common variants of general importance throughout the entire population, which are likely relatively few. The number of rare genetic variants and variants that act in the context of environmental risk factors for CAD is probably much higher. We performed whole-genome expression analyses in several organs to identify functionally associated genes important for CAD development. We found an atherosclerosis module (A-module) consisting of 128 genes, enriched with genetic risk for CAD, involving transendothelial migration of leukocytes (TEML) and LIM domain binding 2 (LDB2) as its high-hierarchy regulator. Our study design represents a novel way of understanding the molecular underpinnings of CAD, focusing on genome-wide expression sensing both environmental and genetic influences. Investigating the relative enrichment of genetic CAD risk in functional groups (modules and networks) is an alternative approach to extract additional relevant information from genome-wide association studies. The A-module and LDB2 are attractive targets for treatments to modulate TEML and atherosclerosis development.
PMCID: PMC2780352  PMID: 19997623
21.  Microarray Analysis of Mercury-Induced Changes in Gene Expression in Human Liver Carcinoma (HepG2) Cells: Importance in Immune Responses 
Mercury is widely distributed in the biosphere, and its toxic effects have been associated with human death and several ailments that include cardiovascular diseases, anemia, kidney and liver damage, developmental abnormalities, neurobehavioral disorders, autoimmune diseases, and cancers in experimental animals. At the cellular level, mercury has been shown to interact with sulphydryl groups of proteins and enzymes, to damage DNA, and to modulate cell cycle progression and/or apoptosis. However, the underlying molecular mechanisms of mercury toxicity remain to be elucidated. Our laboratory has demonstrated that mercury exposure induces cytotoxicity and apoptosis, modulates cell cycle, and transcriptionally activates specific stress genes in human liver carcinoma cells. The liver is one of the few organs capable of regeneration from injury. Dormant genes in the liver are therefore capable of reactivation. In this research, we hypothesize that mercury-induced hepatotoxicity is associated with the modulation of specific gene expressions in liver cells that can lead to several disease states involving immune system dysfunctions. In testing this hypothesis, we used an Affymetrix oligonucleotide microarray with probe sets complementary to more than 20,000 genes to determine whether patterns of gene expressions differ between controls and mercury (1–3μg/mL) treated cells. There was a clear separation in gene expression profiles between controls and mercury-treated cells. Hierarchical cluster analysis identified 2,211 target genes that were affected. One hundred and thirty-eight of these genes were up-regulated, among which forty three were significantly over-expressed (p = 0.001) with greater than a two-fold change, and ninety five genes were moderately over-expressed with an increase of more than one fold (p = 0.004). 
Two thousand and twenty-three genes were down-regulated with only forty five of them reaching a statistically significant decline at p = 0.05 according to the Welch’s ANOVA/Welch’s t-test. Further analyses of affected genes identified genes located on all human chromosomes except chromosome 22 with higher than normal effects on genes found on chromosomes 1–14, 17–20 (sex-determining region Y)-box18SRY, 21 (splicing factor, arginine/serine-rich 15 and ATP-binding), and X (including BCL6-co-repressor). These genes are categorized as control and regulatory genes for metabolic pathways involving the cell cycle (cyclin-dependent kinases), apoptosis, cytokine expression, Na+/K+ ATPase, stress responses, G-protein signal transduction, transcription factors, DNA repair as well as metal-regulatory transcription factor 1, MTF1 HGNC, chondroitin sulfate proteoglycan 5 (neuroglycan C), ATP-binding cassette, sub-family G (WHITE), cytochrome b-561 family protein, CDC-like kinase 1 (CLK1 HGNC) (protein tyrosine kinase STY), Na+/H+ exchanger regulatory factor (NHERF HGNC), potassium voltage-gated channel subfamily H member 2 (KCNH2), putative MAPK activating protein (PM20, PM21), ras homolog gene family, polymerase (DNA directed), δ regulatory subunit (50kDa), leptin receptor involved in hematopoietin/interferon-class (D200-domain) cytokine receptor activity and thymidine kinase 2, mitochondrial TK2 HGNC and related genes. Significant alterations in these specific genes provide new directions for deeper mechanistic investigations that would lead to a better understanding of the molecular basis of mercury-induced toxicity and human diseases that may result from disturbances in the immune system.
PMCID: PMC3807506  PMID: 16823088
Mercury; oligonucleotide microarray; gene expression profile; HepG2 cells; immune responses
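The mercury abstract above ranks genes by fold change and by a Welch's t-test between control and treated cells. The following is a minimal sketch of those two quantities for a single gene, assuming plain (not log-transformed) intensities. The function names and the toy numbers are hypothetical and are not taken from the study, and the p-value lookup is left out.

```python
from math import sqrt
from statistics import mean, variance


def fold_change(control, treated):
    """Ratio of mean treated intensity to mean control intensity."""
    return mean(treated) / mean(control)


def welch_t(control, treated):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Unlike Student's t-test, the two groups are not assumed to share
    a common variance.
    """
    n1, n2 = len(control), len(treated)
    v1 = variance(control) / n1  # squared standard error, control
    v2 = variance(treated) / n2  # squared standard error, treated
    t = (mean(treated) - mean(control)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical intensities for one made-up probe set (4 replicates per group).
control = [10.0, 12.0, 11.0, 9.0]
treated = [24.0, 26.0, 22.0, 28.0]

fc = fold_change(control, treated)
t_stat, dof = welch_t(control, treated)
print(f"fold change = {fc:.2f}, t = {t_stat:.2f}, df = {dof:.2f}")
```

A gene would then be flagged as over-expressed when the fold change exceeds the chosen cut-off (two-fold in the abstract) and the p-value derived from `t_stat` and `dof` falls below the chosen significance level.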
22.  The Molecular Phenotype of Endocapillary Proliferation: Novel Therapeutic Targets for IgA Nephropathy 
PLoS ONE  2014;9(8):e103413.
IgA nephropathy (IgAN) is a clinically and pathologically heterogeneous disease. Endocapillary proliferation is associated with higher risk of progressive disease, and clinical studies suggest that corticosteroids mitigate this risk. However, corticosteroids are associated with protean cellular effects and significant toxicity. Furthermore the precise mechanism by which they modulate kidney injury in IgAN is not well delineated. To better understand molecular pathways involved in the development of endocapillary proliferation and to identify novel specific therapeutic targets, we evaluated the glomerular transcriptome of microdissected kidney biopsies from 22 patients with IgAN. Endocapillary proliferation was defined according to the Oxford scoring system independently by 3 nephropathologists. We analyzed mRNA expression using microarrays and identified transcripts differentially expressed in patients with endocapillary proliferation compared to IgAN without endocapillary lesions. Next, we employed both transcription factor analysis and in silico drug screening and confirmed that the endocapillary proliferation transcriptome is significantly enriched with pathways that can be impacted by corticosteroids. With this approach we also identified novel therapeutic targets and bioactive small molecules that may be considered for therapeutic trials for the treatment of IgAN, including resveratrol and hydroquinine. In summary, we have defined the distinct molecular profile of a pathologic phenotype associated with progressive renal insufficiency in IgAN. Exploration of the pathways associated with endocapillary proliferation confirms a molecular basis for the clinical effectiveness of corticosteroids in this subgroup of IgAN, and elucidates new therapeutic strategies for IgAN.
PMCID: PMC4136785  PMID: 25133636
23.  Coexpression Network Analysis in Abdominal and Gluteal Adipose Tissue Reveals Regulatory Genetic Loci for Metabolic Syndrome and Related Phenotypes 
PLoS Genetics  2012;8(2):e1002505.
Metabolic Syndrome (MetS) is highly prevalent and has considerable public health impact, but its underlying genetic factors remain elusive. To identify gene networks involved in MetS, we conducted whole-genome expression and genotype profiling on abdominal (ABD) and gluteal (GLU) adipose tissue, and whole blood (WB), from 29 MetS cases and 44 controls. Co-expression network analysis for each tissue independently identified nine, six, and zero MetS–associated modules of coexpressed genes in ABD, GLU, and WB, respectively. Of 8,992 probesets expressed in ABD or GLU, 685 (7.6%) were expressed in ABD and 51 (0.6%) in GLU only. Differential eigengene network analysis of 8,256 shared probesets detected 22 shared modules with high preservation across adipose depots (D_ABD-GLU = 0.89), seven of which were associated with MetS (FDR P<0.01). The strongest associated module, significantly enriched for immune response–related processes, contained 94/620 (15%) genes with inter-depot differences. In an independent cohort of 145/141 twins with ABD and WB longitudinal expression data, median variability in ABD due to familiality was greater for MetS–associated versus un-associated modules (ABD: 0.48 versus 0.18, P = 0.08; GLU: 0.54 versus 0.20, P = 7.8×10^−4). Cis-eQTL analysis of probesets associated with MetS (FDR P<0.01) and/or inter-depot differences (FDR P<0.01) provided evidence for 32 eQTLs. Corresponding eSNPs were tested for association with MetS–related phenotypes in two GWAS of >100,000 individuals; rs10282458, affecting expression of RARRES2 (encoding chemerin), was associated with body mass index (BMI) (P = 6.0×10^−4); and rs2395185, affecting inter-depot differences of HLA-DRB1 expression, was associated with high-density lipoprotein (P = 8.7×10^−4) and BMI–adjusted waist-to-hip ratio (P = 2.4×10^−4). 
Since many genes and their interactions influence complex traits such as MetS, integrated analysis of genotypes and coexpression networks across multiple tissues relevant to clinical traits is an efficient strategy to identify novel associations.
Author Summary
Metabolic Syndrome (MetS) is a highly prevalent disorder with considerable public health concern, but its underlying genetic factors remain elusive. Given that most cellular components exert their functions through interactions with other cellular components, even the largest of genome-wide association (GWA) studies may often not detect their effects, nor necessarily provide insight into the complex molecular mechanisms of the disease. Rather than focusing on individual genes, the analysis of coexpression networks can be used for finding clusters (modules) of correlated expression levels across samples. In this study, we used a gene network–based approach for integrating clinical MetS, genotypic, and gene expression data from abdominal and gluteal adipose tissue and whole blood. We identified modules of genes related to MetS significantly enriched for immune response and oxidative phosphorylation pathways. We tested SNPs for association with MetS–associated expression (eSNPs), and tested prioritised eSNPs for association with MetS–related phenotypes in two large-scale GWA datasets. We identified two loci, neither of which had reached genome-wide significance levels in GWAs, associated with expression levels of RARRES2 and HLA-DRB1 and with MetS–related phenotypes, demonstrating that the integrated analysis of genotype and expression data from relevant multiple tissues can identify novel associations with complex traits such as MetS.
PMCID: PMC3285582  PMID: 22383892
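The idea in the summary above, finding clusters (modules) of genes whose expression levels are correlated across samples, can be illustrated with a deliberately simplified sketch. It is not the weighted network method used in the study: it assumes a hard threshold on the absolute Pearson correlation and takes connected components of the resulting graph as modules. The gene names, the threshold, and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_modules(expr, threshold=0.9):
    """Group genes into modules.

    Two genes are linked when |r| >= threshold; a module is a connected
    component of that graph. Using |r| means strongly anti-correlated
    genes land in the same module (an "unsigned" network).
    """
    genes = list(expr)
    neighbours = {g: set() for g in genes}
    for a, b in combinations(genes, 2):
        if abs(pearson(expr[a], expr[b])) >= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    modules, seen = [], set()
    for g in genes:
        if g in seen:
            continue
        # Depth-first search to collect every gene reachable from g.
        stack, component = [g], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        modules.append(sorted(component))
    return modules


# Hypothetical toy data: five made-up genes measured in six samples.
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "geneB": [2.1, 4.0, 6.2, 8.1, 9.9, 12.0],
    "geneC": [6.0, 5.0, 4.1, 3.0, 2.0, 1.1],
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0],
    "geneE": [2.0, 7.0, 1.0, 8.0, 2.0, 8.0],
}
modules = coexpression_modules(toy_expr, threshold=0.9)
print(modules)
```

In this toy data set geneA, geneB and geneC move together (geneC in the opposite direction) and form one module, while geneD and geneE remain singletons. A weighted approach would replace the hard threshold with a soft, continuous edge weight.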
24.  Mathematical Modeling of Corticosteroid Pharmacogenomics in Rat Muscle following Acute and Chronic Methylprednisolone Dosing 
Molecular pharmaceutics  2008;5(2):328-339.
The pharmacogenomic effects of a corticosteroid (CS) were assessed in rat skeletal muscle using microarrays. Adrenalectomized (ADX) rats were treated with methylprednisolone (MPL) by either 50 mg/kg intravenous injection or 7-day 0.3 mg/kg/h infusion through subcutaneously implanted pumps. RNAs extracted from individual rat muscles were hybridized to Affymetrix Rat Genome Genechips. Data mining yielded 653 and 2316 CS-responsive probe sets following MPL bolus and infusion treatments. Of these, 196 genes were controlled by MPL under both dosing conditions. Cluster analysis revealed that 124 probe sets exhibited three typical expression dynamic profiles following acute dosing. Cluster A consisted of up-regulated probe sets which were grouped into five subclusters each exhibiting unique temporal patterns during the infusion. Cluster B comprised down-regulated probe sets which were divided into two subclusters with distinct dynamics during the infusion. Cluster C probe sets exhibited delayed down-regulation under both bolus and infusion conditions. Among those, 104 probe sets were further grouped into subclusters based on their profiles following chronic MPL dosing. Several mathematical models were proposed and adequately captured the temporal patterns for each subcluster. Multiple types of dosing regimens are needed to resolve common determinants of gene regulation as chronic exposure results in unexpected differences in gene expression compared to acute dosing. Pharmacokinetic/pharmacodynamic (PK/PD) modeling provides a quantitative tool for elucidating the complexities of CS pharmacogenomics in skeletal muscle.
PMCID: PMC4196382  PMID: 18271548
Microarray studies; pharmacokinetics; pharmacodynamics; mathematical models; computational biology
25.  A Systems Genetics Approach Implicates USF1, FADS3, and Other Causal Candidate Genes for Familial Combined Hyperlipidemia 
PLoS Genetics  2009;5(9):e1000642.
We hypothesized that a common SNP in the 3' untranslated region of the upstream transcription factor 1 (USF1), rs3737787, may affect lipid traits by influencing gene expression levels, and we investigated this possibility utilizing the Mexican population, which has a high predisposition to dyslipidemia. We first associated rs3737787 genotypes in Mexican Familial Combined Hyperlipidemia (FCHL) case/control fat biopsies, with global expression patterns. To identify sets of co-expressed genes co-regulated by similar factors such as transcription factors, genetic variants, or environmental effects, we utilized weighted gene co-expression network analysis (WGCNA). Through WGCNA in the Mexican FCHL fat biopsies we identified two significant Triglyceride (TG)-associated co-expression modules. One of these modules was also associated with FCHL, the other FCHL component traits, and rs3737787 genotypes. This USF1-regulated FCHL-associated (URFA) module was enriched for genes involved in lipid metabolic processes. Using systems genetics procedures we identified 18 causal candidate genes in the URFA module. The FCHL causal candidate gene fatty acid desaturase 3 (FADS3) was associated with TGs in a recent Caucasian genome-wide significant association study and we replicated this association in Mexican FCHL families. Based on a USF1-regulated FCHL-associated co-expression module and SNP rs3737787, we identify a set of causal candidate genes for FCHL-related traits. We then provide evidence from two independent datasets supporting FADS3 as a causal gene for FCHL and elevated TGs in Mexicans.
Author Summary
By integrating a genetic polymorphism with genome-wide gene expression levels, we were able to attribute function to a genetic polymorphism in the USF1 gene. The USF1 gene has previously been associated with a common dyslipidemia, FCHL. FCHL is characterized by elevated levels of total cholesterol, triglycerides, or both. We demonstrate that this genetic polymorphism in USF1 contributes to FCHL disease risk by modulating the expression of a group of genes functionally related to lipid metabolism, and that this modulation is mediated by USF1. One of the genes whose expression is modulated by USF1 is FADS3, which was also implicated in a recent genome-wide association study for lipid traits. We demonstrated that a genetic polymorphism from the FADS3 region, which was associated with triglycerides in a GWAS study of Caucasians, was also associated with triglycerides in Mexican FCHL families. Our analysis provides novel insight into the gene expression profile contributing to FCHL disease risk, and identifies FADS3 as a new gene for FCHL in Mexicans.
PMCID: PMC2730565  PMID: 19750004
