Related Articles

1.  Inferring a transcriptional regulatory network of the cytokinesis-related genes by network component analysis 
BMC Systems Biology  2009;3:110.
Background
Network Component Analysis (NCA) is a network structure-driven framework for deducing regulatory signal dynamics. In contrast to principal component analysis, which can be employed to select the high-variance genes, NCA makes use of the connectivity structure from transcriptional regulatory networks to infer dynamics of transcription factor activities. Using the budding yeast Saccharomyces cerevisiae as a model system, we aim to deduce regulatory actions of cytokinesis-related genes, using precise spatial proximity (midbody) and/or temporal synchronicity (cytokinesis) to avoid full-scale computation from genome-wide databases.
Results
NCA was applied to infer regulatory actions of transcription factor activity from microarray data and partial transcription factor-gene connectivity information for cytokinesis-related genes, which were a subset of genome-wide datasets. No study has so far examined whether the results inferred through NCA are independent of the scale of the gene expression dataset. To avoid full-scale computation from genome-wide databases, four cytokinesis-related gene cases were selected for NCA by running computational analysis over the transcription factor database, in order to confirm that the approach is independent of scale. The inferred dynamics of transcription factor activity through NCA were independent of the scale of the data matrix selected from the four cytokinesis-related gene sets. Moreover, the inferred regulatory actions were nearly identical to published observations for the selected cytokinesis-related genes in the budding yeast; namely, Mcm1, Ndd1, and Fkh2, which form a transcription factor complex to control expression of the CLB2 cluster (i.e. BUD4, CHS2, IQG1, and CDC5).
Conclusion
In this study, using S. cerevisiae as a model system, NCA was successfully applied to infer similar regulatory actions of transcription factor activities from two different microarray databases and several partial transcription factor-gene connectivity datasets for selected cytokinesis-related genes, independent of data size. The regulatory action for the four selected cytokinesis-related genes (BUD4, CHS2, IQG1, and CDC5) belongs to the M-phase or M/G1 phase, consistent with the empirical observations that in S. cerevisiae, the Mcm1-Ndd1-Fkh2 transcription factor complex can regulate expression of the cytokinesis-related genes BUD4, CHS2, IQG1, and CDC5. Since Bud4, Iqg1, and Cdc5 are highly conserved between human and yeast, results obtained from NCA for cytokinesis in the budding yeast suggest that human cells should have transcription regulator(s) similar to the budding yeast Mcm1-Ndd1-Fkh2 transcription factor complex for controlling the occurrence of cytokinesis.
doi:10.1186/1752-0509-3-110
PMCID: PMC2800846  PMID: 19943917
2.  Network component analysis provides quantitative insights on an Arabidopsis transcription factor-gene regulatory network 
BMC Systems Biology  2013;7:126.
Background
Gene regulatory networks (GRNs) are models of molecule-gene interactions instrumental in the coordination of gene expression. Transcription factor (TF)-GRNs are an important subset of GRNs that characterize gene expression as the effect of TFs acting on their target genes. Although such networks can qualitatively summarize TF-gene interactions, it is highly desirable to quantitatively determine the strengths of the interactions in a TF-GRN as well as the magnitudes of TF activities. To our knowledge, such analysis is rare in plant biology. A computational methodology developed for this purpose is network component analysis (NCA), which has been used for studying large-scale microbial TF-GRNs to obtain nontrivial, mechanistic insights. In this work, we employed NCA to quantitatively analyze a plant TF-GRN important in floral development using available regulatory information from AGRIS, by processing previously reported gene expression data from four shoot apical meristem cell types.
Results
The NCA model satisfactorily accounted for gene expression measurements in a TF-GRN of seven TFs (LFY, AG, SEPALLATA3 [SEP3], AP2, AGL15, HY5 and AP3/PI) and 55 genes. NCA found strong interactions between certain TF-gene pairs including LFY → MYB17, AG → CRC, AP2 → RD20, AGL15 → RAV2 and HY5 → HLH1, and the direction of the interaction (activation or repression) for some AGL15 targets for which this information was not previously available. The activity trends of four TFs - LFY, AG, HY5 and AP3/PI as deduced by NCA correlated well with the changes in expression levels of the genes encoding these TFs across all four cell types; such a correlation was not observed for SEP3, AP2 and AGL15.
Conclusions
For the first time, we have reported the use of NCA to quantitatively analyze a plant TF-GRN important in floral development for obtaining nontrivial information about connectivity strengths between TFs and their target genes as well as TF activity. However, since NCA relies on documented connectivity information about the underlying TF-GRN, it is currently limited in its application to larger plant networks because of the lack of documented connectivities. In the future, the identification of interactions between plant TFs and their target genes on a genome scale would allow the use of NCA to provide quantitative regulatory information about plant TF-GRNs, leading to improved insights on cellular regulatory programs.
doi:10.1186/1752-0509-7-126
PMCID: PMC3843564  PMID: 24228871
3.  Reconstruction of Transcription Regulatory Networks by Stability-Based Network Component Analysis 
Reliable inference of transcription regulatory networks is still a challenging task in the field of computational biology. Network component analysis (NCA) has become a powerful scheme to uncover the networks behind complex biological processes, especially when gene expression data is integrated with binding motif information. However, the performance of NCA is impaired by the high rate of false connections in binding motif information and the high level of noise in gene expression data. Moreover, in real applications such as cancer research, the performance of NCA in simultaneously analyzing multiple candidate transcription factors (TFs) is further limited by the small sample number of gene expression data. In this paper, we propose a novel scheme, stability-based NCA, to overcome the above-mentioned problems by addressing the inconsistency between gene expression data and motif binding information (i.e., prior network knowledge). This method introduces small perturbations on prior network knowledge and utilizes the variation of estimated TF activities to reflect the stability of TF activities. Such a scheme is less limited by the sample size and is especially capable of identifying condition-specific TFs and their target genes. Experimental results on both simulation data and real breast cancer data demonstrate the efficiency and robustness of the proposed method.
PMCID: PMC3652899  PMID: 24407294
transcription regulatory network; network component analysis; stability analysis; transcription factor activity; target genes identification
4.  Inferring yeast cell cycle regulators and interactions using transcription factor activities 
BMC Genomics  2005;6:90.
Background
Since transcription factors are often regulated at the post-transcriptional level, their activities, rather than their expression levels, may provide valuable information for investigating their functions and interactions. The recently developed Network Component Analysis (NCA) and its generalized form (gNCA) provide a robust framework for deducing the transcription factor activities (TFAs) from various types of DNA microarray data and transcription factor-gene connectivity. The goal of this work is to demonstrate the utility of TFAs in inferring transcription factor functions and interactions in Saccharomyces cerevisiae cell cycle regulation.
Results
Using gNCA, we determined 74 TFAs from both wild type and fkh1 fkh2 deletion mutant microarray data encompassing 1529 ORFs. We hypothesized that transcription factors participating in the cell cycle regulation exhibit cyclic activity profiles. This hypothesis was supported by the TFA profiles of known cell cycle factors and was used as a basis to uncover other potential cell cycle factors. By combining the results from both cluster analysis and periodicity analysis, we recovered nearly 90% of the known cell cycle regulators, and identified 5 putative cell cycle-related transcription factors (Dal81, Hap2, Hir2, Mss11, and Rlm1). In addition, by analyzing expression data from transcription factor knockout strains, we determined 3 verified (Ace2, Ndd1, and Swi5) and 4 putative interaction partners (Cha4, Hap2, Fhl1, and Rts2) of the forkhead transcription factors. Sensitivity of TFAs to connectivity errors was determined to provide confidence level of these predictions.
Conclusion
By subjecting TFA profiles to analyses based upon physiological signatures we were able to identify cell cycle related transcription factors consistent with current literature, transcription factors with potential cell cycle dependent roles, and interactions between transcription factors.
doi:10.1186/1471-2164-6-90
PMCID: PMC1180827  PMID: 15949038
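The abstract above does not say how periodicity was scored, so the following is only a minimal sketch of one common way to flag cyclic activity profiles. It assumes a hypothetical `periodicity_score` function that measures the fraction of non-constant spectral power held by the single strongest Fourier frequency. The function name, the score, and the synthetic profiles are illustrative assumptions, not the authors' method.

```python
import numpy as np


def periodicity_score(profile):
    """Fraction of non-DC spectral power carried by the strongest frequency.

    Hypothetical score for illustration: values near 1 indicate a profile
    dominated by one oscillation, values near 0 indicate no dominant cycle.
    """
    x = np.asarray(profile, dtype=float)
    x = x - x.mean()                      # remove the constant offset
    power = np.abs(np.fft.rfft(x)) ** 2   # power at each Fourier frequency
    power = power[1:]                     # drop the zero-frequency term
    total = power.sum()
    return 0.0 if total == 0 else float(power.max() / total)


# Synthetic TFA-like profiles (assumed data, not from the study).
t = np.arange(24)
cyclic = np.sin(2 * np.pi * t / 12)              # two full cycles over the series
noise = np.random.default_rng(1).normal(size=24)  # no intended periodicity

score_cyclic = periodicity_score(cyclic)
score_noise = periodicity_score(noise)
print(f"cyclic: {score_cyclic:.3f}, noise: {score_noise:.3f}")
```

A score of this kind could be thresholded or combined with cluster membership to shortlist candidate cell cycle factors; the threshold itself would need to be calibrated against known regulators.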
5.  Trimming of mammalian transcriptional networks using network component analysis 
BMC Bioinformatics  2010;11:511.
Background
Network Component Analysis (NCA) has been used to deduce the activities of transcription factors (TFs) from gene expression data and the TF-gene binding relationship. However, the TF-gene interaction varies in different environmental conditions and tissues, but such information is rarely available and cannot be predicted simply by motif analysis. Thus, it is beneficial to identify key TF-gene interactions under the experimental condition based on transcriptome data. Such information would be useful in identifying key regulatory pathways and gene markers of TFs in further studies.
Results
We developed an algorithm to trim network connectivity such that the important regulatory interactions between the TFs and the genes were retained and the regulatory signals were deduced. Theoretical studies demonstrated that the regulatory signals were accurately reconstructed even in the case where only three independent transcriptome datasets were available. At least 80% of the main target genes were correctly predicted in the extreme condition of high noise level and small number of datasets. Our algorithm was tested with transcriptome data taken from mice under rapamycin treatment. The initial network topology from the literature contains 70 TFs, 778 genes, and 1423 edges between the TFs and genes. Our method retained 1074 edges (i.e. 75% of the original edge number) and identified 17 TFs as being significantly perturbed under the experimental condition. Twelve of these TFs are involved in MAPK signaling or myeloid leukemia pathways defined in the KEGG database, or are known to physically interact with each other. Additionally, four of these TFs, which are Hif1a, Cebpb, Nfkb1, and Atf1, are known targets of rapamycin. Furthermore, the trimmed network was able to predict Eno1 as an important target of Hif1a; this key interaction could not be detected without trimming the regulatory network.
Conclusions
The advantage of our new algorithm, relative to the original NCA, is that our algorithm can identify the important TF-gene interactions. Identifying the important TF-gene interactions is crucial for understanding the roles of pleiotropic global regulators, such as p53. Also, our algorithm has been developed to overcome NCA's inability to analyze large networks where multiple TFs regulate a single gene. Thus, our algorithm extends the applicability of NCA to the realm of mammalian regulatory network analysis.
doi:10.1186/1471-2105-11-511
PMCID: PMC2967563  PMID: 20942926
6.  A transcriptional dynamic network during Arabidopsis thaliana pollen development 
BMC Systems Biology  2011;5(Suppl 3):S8.
Background
To understand transcriptional regulatory networks (TRNs), especially the coordinated dynamic regulation between transcription factors (TFs) and their corresponding target genes during development, computational approaches would represent significant advances in genome-wide expression analysis. The major experimental challenges include monitoring time-specific TF activities and identifying the dynamic regulatory relationships between TFs and their target genes, neither of which can yet be measured at large scale. However, various methods have been proposed to computationally estimate those activities and regulations. During the past decade, significant progress has been made towards understanding pollen development at each developmental stage at the molecular level, yet the regulatory mechanisms that control the dynamic pollen development processes remain largely unknown. Here, we adopt Network Component Analysis (NCA) to identify TF activities over the time course, and infer their regulatory relationships based on the coexpression of TFs and their target genes during pollen development.
Results
We carried out a meta-analysis by integrating several sets of gene expression data related to Arabidopsis thaliana pollen development (stages range from UNM, BCP, TCP, HP to 0.5 hr pollen tube and 4 hr pollen tube). We constructed a regulatory network, including 19 TFs, 101 target genes and 319 regulatory interactions. The computationally estimated TF activities correlated well with the expression of their coordinated genes during the development process. We clustered the expression of their target genes in the context of regulatory influences, and inferred new regulatory relationships between those TFs and their target genes; for example, the transcription factor WRKY34, which was found to be specifically expressed in pollen, regulated several new target genes. Our findings facilitate a more biologically relevant interpretation of the expression patterns, since the clusters corresponding to the activity of a specific TF or a combination of TFs suggest the coordinated regulation of target genes by those TFs.
Conclusions
Through integrating different resources, we constructed a dynamic regulatory network of Arabidopsis thaliana during pollen development with gene coexpression and NCA. The network illustrated the relationships between the TFs' activities and their target genes' expression, as well as the interactions between TFs, which provide new insight into the molecular mechanisms that control the pollen development.
doi:10.1186/1752-0509-5-S3-S8
PMCID: PMC3287576  PMID: 22784627
7.  Regulatory component analysis: a semi-blind extraction approach to infer gene regulatory networks with imperfect biological knowledge 
Signal processing  2011;92(8):1902-1915.
With the advent of high-throughput biotechnology capable of monitoring genomic signals, it becomes increasingly promising to understand molecular cellular mechanisms through systems biology approaches. One of the active research topics in systems biology is to infer gene transcriptional regulatory networks using various genomic data; this inference problem can be formulated as a linear model with latent signals associated with some regulatory proteins called transcription factors (TFs). As common statistical assumptions may not hold for genomic signals, typical latent variable algorithms such as independent component analysis (ICA) are incapable of revealing the true underlying regulatory signals. Liao et al. [1] proposed to perform inference using an approach named network component analysis (NCA), the optimization of which is achieved by a least-squares fitting approach with biological knowledge constraints. However, the incompleteness of biological knowledge and its inconsistency with gene expression data are not considered in the original NCA solution, which could greatly affect the inference accuracy. To overcome these limitations, we propose a linear extraction scheme, namely regulatory component analysis (RCA), to infer underlying regulatory signals even with partial biological knowledge. Numerical simulations show a significant improvement of our proposed RCA over NCA, not only when the signal-to-noise ratio (SNR) is low, but also when the given biological knowledge is incomplete and inconsistent with gene expression data. Furthermore, real biological experiments on E. coli are performed for regulatory network inference in comparison with several typical linear latent variable methods, which again demonstrates the effectiveness and improved performance of the proposed algorithm.
doi:10.1016/j.sigpro.2011.11.028
PMCID: PMC3367667  PMID: 22685363
Transcriptional regulatory network inference; Source extraction; Gene expression; Genomic signal processing
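The linear model with knowledge constraints described in this abstract can be sketched as follows. This is a minimal illustration, assuming the usual NCA-style formulation in which an expression matrix `E` is approximated by `A @ P`, where `A` holds TF-gene connectivity strengths and is forced to zero wherever no connection is documented, and `P` holds TF activity profiles. The function name `nca_als`, the alternating least-squares scheme, and the synthetic data are assumptions for illustration; it is neither the RCA algorithm nor a reference NCA implementation.

```python
import numpy as np


def nca_als(E, Z, n_iter=300, seed=0):
    """Fit E ≈ A @ P by alternating least squares, with A zero wherever Z is False.

    E: (genes x samples) expression matrix.
    Z: (genes x TFs) boolean mask of allowed TF-gene connections (prior knowledge).
    Returns A (connectivity strengths) and P (TF activity profiles).
    """
    rng = np.random.default_rng(seed)
    A = np.where(Z, rng.normal(size=Z.shape), 0.0)  # random start on the allowed pattern
    for _ in range(n_iter):
        # With A fixed, the TF activities P are an ordinary least-squares solution.
        P, *_ = np.linalg.lstsq(A, E, rcond=None)
        # With P fixed, refit each gene's strengths using only its allowed TFs.
        for i in range(Z.shape[0]):
            idx = np.flatnonzero(Z[i])
            if idx.size:
                coef, *_ = np.linalg.lstsq(P[idx].T, E[i], rcond=None)
                A[i, idx] = coef
    return A, P


# Synthetic, noise-free example (assumed data): 8 genes, 2 TFs, 6 samples.
rng = np.random.default_rng(42)
Z = np.array([[1, 0]] * 3 + [[0, 1]] * 3 + [[1, 1]] * 2, dtype=bool)
A_true = np.where(Z, rng.uniform(0.5, 1.5, size=Z.shape), 0.0)
P_true = rng.normal(size=(2, 6))
E = A_true @ P_true

A_hat, P_hat = nca_als(E, Z)
rel_err = np.linalg.norm(E - A_hat @ P_hat) / np.linalg.norm(E)
print(f"relative reconstruction error: {rel_err:.2e}")
```

In a decomposition of this form, each TF's column of `A` and row of `P` are determined only up to a scaling factor, so recovered activity profiles are best compared by shape rather than by magnitude.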
8.  Behavioural Interventions for Urinary Incontinence in Community-Dwelling Seniors 
Executive Summary
In early August 2007, the Medical Advisory Secretariat began work on the Aging in the Community project, an evidence-based review of the literature surrounding healthy aging in the community. The Health System Strategy Division at the Ministry of Health and Long-Term Care subsequently asked the secretariat to provide an evidentiary platform for the ministry’s newly released Aging at Home Strategy.
After a broad literature review and consultation with experts, the secretariat identified 4 key areas that strongly predict an elderly person’s transition from independent community living to a long-term care home. Evidence-based analyses have been prepared for each of these 4 areas: falls and fall-related injuries, urinary incontinence, dementia, and social isolation. For the first area, falls and fall-related injuries, an economic model is described in a separate report.
Please visit the Medical Advisory Secretariat Web site, http://www.health.gov.on.ca/english/providers/program/mas/mas_about.html, to review these titles within the Aging in the Community series.
Aging in the Community: Summary of Evidence-Based Analyses
Prevention of Falls and Fall-Related Injuries in Community-Dwelling Seniors: An Evidence-Based Analysis
Behavioural Interventions for Urinary Incontinence in Community-Dwelling Seniors: An Evidence-Based Analysis
Caregiver- and Patient-Directed Interventions for Dementia: An Evidence-Based Analysis
Social Isolation in Community-Dwelling Seniors: An Evidence-Based Analysis
The Falls/Fractures Economic Model in Ontario Residents Aged 65 Years and Over (FEMOR)
Objective
To assess the effectiveness of behavioural interventions for the treatment and management of urinary incontinence (UI) in community-dwelling seniors.
Clinical Need: Target Population and Condition
Urinary incontinence, defined as “the complaint of any involuntary leakage of urine,” was identified as 1 of the key predictors in a senior’s transition from independent community living to admission to a long-term care (LTC) home. Urinary incontinence is a health problem that affects a substantial proportion of Ontario’s community-dwelling seniors (and indirectly affects caregivers), impacting their health, functioning, well-being and quality of life. Based on Canadian studies, prevalence estimates range from 9% to 30% for senior men and are nearly double, from 19% to 55%, for senior women. The direct and indirect costs associated with UI are substantial. It is estimated that the total annual costs in Canada are $1.5 billion (Cdn), and that each year a senior living at home will spend $1,000 to $1,500 on incontinence supplies.
Interventions to treat and manage UI can be classified into broad categories which include lifestyle modification, behavioural techniques, medications, devices (e.g., continence pessaries), surgical interventions and adjunctive measures (e.g., absorbent products).
The focus of this review is behavioural interventions, since they are commonly the first line of treatment considered in seniors given that they are the least invasive options with no reported side effects, do not limit future treatment options, and can be applied in combination with other therapies. In addition, many seniors would not be ideal candidates for other types of interventions involving more risk, such as surgical measures.
Note: It is recognized that the terms “senior” and “elderly” carry a range of meanings for different audiences; this report generally uses the former, but the terms are treated here as essentially interchangeable.
Description of Technology/Therapy
Behavioural interventions can be divided into 2 categories according to the target population: caregiver-dependent techniques and patient-directed techniques. Caregiver-dependent techniques (also known as toileting assistance) are targeted at medically complex, frail individuals living at home with the assistance of a caregiver, who tends to be a family member. These seniors may also have cognitive deficits and/or motor deficits. A health care professional trains the senior’s caregiver to deliver an intervention such as prompted voiding, habit retraining, or timed voiding. The health care professional who trains the caregiver is commonly a nurse or a nurse with advanced training in the management of UI, such as a nurse continence advisor (NCA) or a clinical nurse specialist (CNS).
The second category of behavioural interventions consists of patient-directed techniques targeted towards mobile, motivated seniors. Seniors in this population are cognitively able, free from any major physical deficits, and motivated to regain and/or improve their continence. A nurse or a nurse with advanced training in UI management, such as an NCA or CNS, delivers the patient-directed techniques. These are often provided as multicomponent interventions including a combination of bladder training techniques, pelvic floor muscle training (PFMT), education on bladder control strategies, and self-monitoring. Pelvic floor muscle training, defined as a program of repeated pelvic floor muscle contractions taught and supervised by a health care professional, may be employed as part of a multicomponent intervention or in isolation.
Education is a large component of both caregiver-dependent and patient-directed behavioural interventions, and patient and/or caregiver involvement as well as continued practice strongly affect the success of treatment. Incontinence products, which include a large variety of pads and devices for effective containment of urine, may be used in conjunction with behavioural techniques at any point in the patient’s management.
Evidence-Based Analysis Methods
A comprehensive search strategy was used to identify systematic reviews and randomized controlled trials that examined the effectiveness, safety, and cost-effectiveness of caregiver-dependent and patient-directed behavioural interventions for the treatment of UI in community-dwelling seniors (see Appendix 1).
Research Questions
Are caregiver-dependent behavioural interventions effective in improving UI in medically complex, frail community-dwelling seniors with/without cognitive deficits and/or motor deficits?
Are patient-directed behavioural interventions effective in improving UI in mobile, motivated community-dwelling seniors?
Are behavioural interventions delivered by NCAs or CNSs in a clinic setting effective in improving incontinence outcomes in community-dwelling seniors?
Assessment of Quality of Evidence
The quality of the evidence was assessed as high, moderate, low, or very low according to the GRADE methodology and GRADE Working Group. As per GRADE the following definitions apply:
Summary of Findings
Executive Summary Table 1 summarizes the results of the analysis.
The available evidence was limited by considerable variation in study populations and in the type and severity of UI for studies examining both caregiver-dependent and patient-directed interventions. The UI literature is frequently limited to reporting subjective outcome measures such as patient observations and symptoms. The primary outcome of interest, admission to an LTC home, was not reported in the UI literature. The number of eligible studies was low, and there were limited data on long-term follow-up.
Summary of Evidence on Behavioural Interventions for the Treatment of Urinary Incontinence in Community-Dwelling Seniors
Prompted voiding
Habit retraining
Timed voiding
Bladder training
PFMT (with or without biofeedback)
Bladder control strategies
Education
Self-monitoring
CI refers to confidence interval; CNS, clinical nurse specialist; NCA, nurse continence advisor; PFMT, pelvic floor muscle training; RCT, randomized controlled trial; WMD, weighted mean difference; UI, urinary incontinence.
Economic Analysis
A budget impact analysis was conducted to forecast costs for caregiver-dependent and patient-directed multicomponent behavioural techniques delivered by NCAs, and PFMT alone delivered by physiotherapists. All costs are reported in 2008 Canadian dollars. Based on epidemiological data, published medical literature and clinical expert opinion, the annual cost of caregiver-dependent behavioural techniques was estimated to be $9.2 M, while the annual costs of patient-directed behavioural techniques delivered by either an NCA or physiotherapist were estimated to be $25.5 M and $36.1 M, respectively. Estimates will vary if the underlying assumptions are changed.
Currently, the province of Ontario absorbs the cost of NCAs (available through the 42 Community Care Access Centres across the province) in the home setting. The 2007 Incontinence Care in the Community Report estimated that the total cost being absorbed by the public system of providing continence care in the home is $19.5 M in Ontario. This cost estimate included resources such as personnel, communication with physicians, record keeping and product costs. Clinic costs were not included in this estimation because currently these come out of the global budget of the respective hospital and very few continence clinics actually exist in the province. The budget impact analysis factored in a cost for the clinic setting, assuming that the public system would absorb the cost with this new model of community care.
Considerations for Ontario Health System
An expert panel on aging in the community met on 3 occasions from January to May 2008, and in part, discussed treatment of UI in seniors in Ontario with a focus on caregiver-dependent and patient-directed behavioural interventions. In particular, the panel discussed how treatment for UI is made available to seniors in Ontario and who provides the service. Some of the major themes arising from the discussions included:
Services/interventions that currently exist in Ontario offering behavioural interventions to treat UI are not consistent. There is a lack of consistency in how seniors access services for treatment of UI, who manages patients and what treatment patients receive.
Help-seeking behaviours are important to consider when designing optimal service delivery methods.
There is considerable social stigma associated with UI and therefore there is a need for public education and an awareness campaign.
The cost of incontinence supplies and the availability of NCAs were highlighted.
Conclusions
There is moderate-quality evidence that the following interventions are effective in improving UI in mobile motivated seniors:
Multicomponent behavioural interventions including a combination of bladder training techniques, PFMT (with or without biofeedback), education on bladder control strategies and self-monitoring techniques.
Pelvic floor muscle training alone.
There is moderate quality evidence that when behavioural interventions are led by NCAs or CNSs in a clinic setting, they are effective in improving UI in seniors.
There is limited low-quality evidence that prompted voiding may be effective in medically complex, frail seniors with motivated caregivers.
There is insufficient evidence for the following interventions in medically complex, frail seniors with motivated caregivers:
habit retraining, and
timed voiding.
PMCID: PMC3377527  PMID: 23074508
9.  iRegulon: From a Gene List to a Gene Regulatory Network Using Large Motif and Track Collections 
PLoS Computational Biology  2014;10(7):e1003731.
Identifying master regulators of biological processes and mapping their downstream gene networks are key challenges in systems biology. We developed a computational method, called iRegulon, to reverse-engineer the transcriptional regulatory network underlying a co-expressed gene set using cis-regulatory sequence analysis. iRegulon implements a genome-wide ranking-and-recovery approach to detect enriched transcription factor motifs and their optimal sets of direct targets. We increase the accuracy of network inference by using very large motif collections of up to ten thousand position weight matrices collected from various species, and linking these to candidate human TFs via a motif2TF procedure. We validate iRegulon on gene sets derived from ENCODE ChIP-seq data with increasing levels of noise, and we compare iRegulon with existing motif discovery methods. Next, we use iRegulon on more challenging types of gene lists, including microRNA target sets, protein-protein interaction networks, and genetic perturbation data. In particular, we over-activate p53 in breast cancer cells, followed by RNA-seq and ChIP-seq, and could identify an extensive up-regulated network controlled directly by p53. Similarly we map a repressive network with no indication of direct p53 regulation but rather an indirect effect via E2F and NFY. Finally, we generalize our computational framework to include regulatory tracks such as ChIP-seq data and show how motif and track discovery can be combined to map functional regulatory interactions among co-expressed genes. iRegulon is available as a Cytoscape plugin from http://iregulon.aertslab.org.
Author Summary
Gene regulatory networks control developmental, homeostatic, and disease processes by governing precise levels and spatio-temporal patterns of gene expression. Determining their topology can provide mechanistic insight into these processes. Gene regulatory networks consist of interactions between transcription factors and their direct target genes. Each regulatory interaction represents the binding of the transcription factor to a specific DNA binding site near its target gene. Here we present a computational method, called iRegulon, to identify master regulators and direct target genes in a human gene signature, i.e. a set of co-expressed genes. iRegulon relies on the analysis of the regulatory sequences around each gene in the gene set to detect enriched TF motifs or ChIP-seq peaks, using databases of nearly 10,000 TF motifs and 1000 ChIP-seq data sets or “tracks”. Next, it associates enriched motifs and tracks with candidate transcription factors and determines the optimal subset of direct target genes. We validate iRegulon on ENCODE data, and use it in combination with RNA-seq and ChIP-seq data to map a p53 downstream network with new predicted co-factors and targets. iRegulon is available as a Cytoscape plugin, supporting human, mouse, and Drosophila genes, and provides access to hundreds of cancer-related TF-target subnetworks or “regulons”.
doi:10.1371/journal.pcbi.1003731
PMCID: PMC4109854  PMID: 25058159
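The ranking-and-recovery idea mentioned in this entry can be illustrated with a small sketch. It assumes that each motif provides a ranking of all genes, and that enrichment is judged by how early the genes of the input set are recovered along that ranking, summarized as a normalized area under the recovery curve. The function `recovery_auc`, its normalization, the `top_n` cutoff, and the gene labels are hypothetical choices for illustration and do not reproduce iRegulon's actual scoring.

```python
def recovery_auc(ranked_genes, gene_set, top_n):
    """Normalized area under the recovery curve within the top_n ranked genes.

    Returns 1.0 when the gene set occupies the very top of the ranking and
    lower values the later its members appear (hypothetical normalization).
    """
    targets = set(gene_set)
    window = ranked_genes[:top_n]
    if not targets or not window:
        return 0.0
    recovered = 0
    area = 0
    for gene in window:
        if gene in targets:
            recovered += 1
        area += recovered  # curve height after this rank position
    # Best case: all recoverable targets sit at the first positions of the window.
    k = min(len(targets), len(window))
    max_area = k * (k + 1) // 2 + k * (len(window) - k)
    return area / max_area


# Assumed toy rankings of six genes under two hypothetical motifs.
signature = {"G1", "G2"}
motif_a = ["G1", "G2", "G3", "G4", "G5", "G6"]  # signature genes ranked first
motif_b = ["G5", "G1", "G6", "G2", "G3", "G4"]  # signature genes ranked later

auc_a = recovery_auc(motif_a, signature, top_n=4)
auc_b = recovery_auc(motif_b, signature, top_n=4)
print(auc_a, auc_b)
```

Comparing such scores across a large motif collection is what allows enriched motifs to stand out; a real analysis would also need a background distribution to decide which scores count as significant.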
10.  A dynamic network of transcription in LPS-treated human subjects 
BMC Systems Biology  2009;3:78.
Background
Understanding the transcriptional regulatory networks that map out the coordinated dynamic responses of signaling proteins, transcription factors and target genes over time would represent a significant advance in the application of genome wide expression analysis. The primary challenge is monitoring transcription factor activities over time, which is not yet available at the large scale. Instead, there have been several developments to estimate activities computationally. For example, Network Component Analysis (NCA) is an approach that can predict transcription factor activities over time as well as the relative regulatory influence of factors on each target gene.
Results
In this study, we analyzed a gene expression data set in blood leukocytes from human subjects administered with lipopolysaccharide (LPS), a prototypical inflammatory challenge, in the context of a reconstructed regulatory network including 10 transcription factors, 99 target genes and 149 regulatory interactions. We found that the computationally estimated activities were well correlated to their coordinated action. Furthermore, we found that clustering the genes in the context of regulatory influences greatly facilitated interpretation of the expression data, as clusters of gene expression corresponded to the activity of specific factors or more interestingly, factor combinations which suggest coordinated regulation of gene expression. The resulting clusters were therefore more biologically meaningful, and also led to identification of additional genes under the same regulation.
Conclusion
Using NCA, we were able to build a network that accounted for 8–11% of the genes in the known transcriptional response to LPS in humans. The dynamic network illustrated changes in transcription factor activities and gene expression, as well as interactions among signaling proteins, transcription factors and target genes.
doi:10.1186/1752-0509-3-78
PMCID: PMC2729748  PMID: 19638230
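The NCA decomposition referred to in the entry above can be illustrated with a short sketch. It assumes the usual NCA model, in which a gene-by-sample expression matrix E is approximated by A·P, where A (gene by TF) may be nonzero only where a regulatory interaction is known and P (TF by sample) holds the transcription factor activities. The function name `nca_als`, the alternating least-squares scheme and all parameters are illustrative assumptions, not the code used in any of the listed studies; a real analysis would also need to check the NCA identifiability conditions and fix the per-TF scaling ambiguity.

```python
import numpy as np

def nca_als(E, Z, n_iter=200, seed=0):
    """Alternating least squares for E ~= A @ P with a fixed zero pattern in A.

    E : (genes x samples) expression matrix.
    Z : (genes x TFs) connectivity mask; nonzero means "TF may regulate gene".
    Returns A (control strengths) and P (TF activities). Hypothetical helper.
    """
    rng = np.random.default_rng(seed)
    E = np.asarray(E, dtype=float)
    Z = np.asarray(Z, dtype=bool)
    # Random start that already respects the connectivity pattern.
    A = np.where(Z, rng.normal(size=Z.shape), 0.0)
    for _ in range(n_iter):
        # Step 1: with A fixed, solve for TF activities P by least squares.
        P, *_ = np.linalg.lstsq(A, E, rcond=None)
        # Step 2: with P fixed, refit each gene using only its connected TFs.
        for i in range(E.shape[0]):
            idx = np.flatnonzero(Z[i])
            if idx.size:
                coef, *_ = np.linalg.lstsq(P[idx].T, E[i], rcond=None)
                A[i, idx] = coef
    return A, P
```

Because each step solves a least-squares subproblem exactly, the residual of A·P against E cannot increase from one step to the next, but the scheme may still stop at a local optimum, and the recovered activities are determined only up to a scale factor per transcription factor.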
11.  An integrated machine learning approach for predicting DosR-regulated genes in Mycobacterium tuberculosis 
BMC Systems Biology  2010;4:37.
Background
DosR is an important regulator of the response to stress such as limited oxygen availability in Mycobacterium tuberculosis. Time course gene expression data enable us to dissect this response on the gene regulatory level. The mRNA expression profile of a regulator, however, is not necessarily a direct reflection of its activity. Knowing the transcription factor activity (TFA) can be exploited to predict novel target genes regulated by the same transcription factor. Various approaches have been proposed to reconstruct TFAs from gene expression data. Most of them capture only a first-order approximation to the complex transcriptional processes by assuming linear gene responses and linear dynamics in TFA, or ignore the temporal information in data from such systems.
Results
In this paper, we approach the problem of inferring dynamic hidden TFAs using Gaussian processes (GP). We are able to model dynamic TFAs and to account for both linear and nonlinear gene responses. To test the validity of the proposed approach, we reconstruct the hidden TFA of p53, a tumour suppressor activated by DNA damage, using published time course gene expression data. Our reconstructed TFA is closer to the experimentally determined profile of p53 concentration than that from the original study. We then apply the model to time course gene expression data obtained from chemostat cultures of M. tuberculosis under reduced oxygen availability. After estimation of the TFA of DosR based on a number of known target genes using the GP model, we predict novel DosR-regulated genes: the parameters of the model are interpreted as relevance parameters indicating an existing functional relationship between TFA and gene expression. We further improve the prediction by integrating promoter sequence information in a logistic regression model. Apart from the documented DosR-regulated genes, our prediction yields ten novel genes under direct control of DosR.
Conclusions
Chemostat cultures are an ideal experimental system for controlling noise and variability when monitoring the response of bacterial organisms such as M. tuberculosis to finely controlled changes in culture conditions and available metabolites. Nonlinear hidden TFA dynamics of regulators can be reconstructed remarkably well with Gaussian processes from such data. Moreover, estimated parameters of the GP can be used to assess whether a gene is controlled by the reconstructed TFA or not. It is straightforward to combine these parameters with further information, such as the presence of binding motifs, to increase prediction accuracy.
doi:10.1186/1752-0509-4-37
PMCID: PMC2867773  PMID: 20356371
12.  Dynamics of Regulatory Networks in Gastrin-Treated Adenocarcinoma Cells 
PLoS ONE  2014;9(1):e78349.
Understanding gene transcription regulatory networks is critical to deciphering the molecular mechanisms of different cellular states. Most studies focus on static transcriptional networks. In the current study, we used the gastrin-regulated system as a model to understand the dynamics of transcriptional networks composed of transcription factors (TFs) and target genes (TGs). The hormone gastrin activates and stimulates signaling pathways leading to various cellular states through transcriptional programs. Dysregulation of gastrin can result in cancerous tumors, for example. However, the regulatory networks involving gastrin are highly complex, and the roles of most of the components of these networks are unknown. We used time series microarray data of AR42J adenocarcinoma cells treated with gastrin combined with static TF-TG relationships integrated from different sources, and we reconstructed the dynamic activities of TFs using network component analysis (NCA). Based on the peak expression of TGs and activity of TFs, we created active sub-networks at four time ranges after gastrin treatment, namely immediate-early (IE), mid-early (ME), mid-late (ML) and very late (VL). Network analysis revealed that the active sub-networks were topologically different at the early and late time ranges. Gene ontology analysis unveiled that each active sub-network was highly enriched in a particular biological process. Interestingly, network motif patterns were also distinct between the sub-networks. This analysis can be applied to other time series microarray datasets, focusing on smaller sub-networks that are activated in a cascade, allowing better overview of the mechanisms involved at each time range.
doi:10.1371/journal.pone.0078349
PMCID: PMC3885390  PMID: 24416123
13.  Reconstructing genome-wide regulatory network of E. coli using transcriptome data and predicted transcription factor activities 
BMC Bioinformatics  2011;12:233.
Background
Gene regulatory networks play essential roles in living organisms to control growth, keep internal metabolism running and respond to external environmental changes. Understanding the connections and the activity levels of regulators is important for the research of gene regulatory networks. While relevance score based algorithms that reconstruct gene regulatory networks from transcriptome data can infer genome-wide gene regulatory networks, they are unfortunately prone to false positive results. Transcription factor activities (TFAs) quantitatively reflect the ability of the transcription factor to regulate target genes. However, classic relevance score based gene regulatory network reconstruction algorithms use models that do not include the TFA layer, thus missing a key regulatory element.
Results
This work integrates TFA prediction algorithms with relevance score based network reconstruction algorithms to reconstruct gene regulatory networks with improved accuracy over classic relevance score based algorithms. This method is called Gene expression and Transcription factor activity based Relevance Network (GTRNetwork). Different combinations of TFA prediction algorithms and relevance score functions have been applied to find the most efficient combination. When the integrated GTRNetwork method was applied to E. coli data, the reconstructed genome-wide gene regulatory network predicted 381 new regulatory links. This reconstructed gene regulatory network, including the predicted new regulatory links, shows promising biological significance. Many of the new links are verified by known TF binding site information, and many other links can be verified from the literature and databases such as EcoCyc. The reconstructed gene regulatory network is applied to a recent transcriptome analysis of E. coli during isobutanol stress. In addition to the 16 significantly changed TFAs detected in the original paper, another 7 significantly changed TFAs have been detected by using our reconstructed network.
Conclusions
The GTRNetwork algorithm introduces the hidden layer TFA into classic relevance score-based gene regulatory network reconstruction processes. Integrating the TFA biological information with regulatory network reconstruction algorithms significantly improves detection of new links and reduces the rate of false positives. The application of GTRNetwork to E. coli gene transcriptome data gives a set of potential regulatory links with promising biological significance for isobutanol stress and other conditions.
doi:10.1186/1471-2105-12-233
PMCID: PMC3224099  PMID: 21668997
14.  Genome-scale cold stress response regulatory networks in ten Arabidopsis thaliana ecotypes 
BMC Genomics  2013;14:722.
Background
Low temperature leads to major crop losses every year. Although several studies have been conducted focusing on diversity of cold tolerance level in multiple phenotypically divergent Arabidopsis thaliana (A. thaliana) ecotypes, genome-scale molecular understanding is still lacking.
Results
In this study, we report genome-scale transcript response diversity of 10 A. thaliana ecotypes originating from different geographical locations to non-freezing cold stress (10°C). To analyze the transcriptional response diversity, we initially compared transcriptome changes in all 10 ecotypes using Arabidopsis NimbleGen ATH6 microarrays. In total, 6061 transcripts were significantly cold regulated (p < 0.01) in 10 ecotypes, including 498 transcription factors and 315 transposable elements. The majority of the transcripts (75%) showed ecotype-specific expression patterns. By using sequence data available from the Arabidopsis thaliana 1001 genome project, we further investigated sequence polymorphisms in the core cold stress regulon genes. Significant numbers of non-synonymous amino acid changes were observed in the coding region of the CBF regulon genes. Considering the limited knowledge about regulatory interactions between transcription factors and their target genes in the model plant A. thaliana, we have adopted a powerful systems genetics approach, Network Component Analysis (NCA), to construct an in-silico transcriptional regulatory network model during response to cold stress. The resulting regulatory network contained 1,275 nodes and 7,720 connections, with 178 transcription factors and 1,331 target genes.
Conclusions
A. thaliana ecotypes exhibit considerable variation in transcriptome-level responses to non-freezing cold stress treatment. Ecotype-specific transcripts and related gene ontology (GO) categories were identified to delineate natural variation of cold stress regulated differential gene expression in the model plant A. thaliana. The predicted regulatory network model was able to identify new ecotype-specific transcription factors and their regulatory interactions, which might be crucial for their local geographic adaptation to cold temperature. Additionally, since the approach presented here is general, it could be adapted to study networks regulating biological processes in any biological system.
doi:10.1186/1471-2164-14-722
PMCID: PMC3829657  PMID: 24148294
Arabidopsis thaliana; Ecotypes; Cold stress; Natural variation; Adaptation; Gene expression; Regulatory networks; Arabidopsis thaliana 1001 genome; Systems biology; Network component analysis
15.  Dynamic Regulatory Network Reconstruction for Alzheimer's Disease Based on Matrix Decomposition Techniques 
Alzheimer's disease (AD) is the most common form of dementia and leads to irreversible neurodegenerative damage of the brain. Finding the dynamic responses of genes, signaling proteins, transcription factor (TF) activities, and regulatory networks during the progressive deterioration of AD would represent a significant advance in discovering the pathogenesis of AD. However, high-throughput technologies for measuring TF activities are not yet available on a genome-wide scale. In this study, based on DNA microarray gene expression data and a priori information on TFs, the network component analysis (NCA) algorithm was applied to determine the TF activities and regulatory influences on target genes (TGs) in incipient, moderate, and severe AD. Based on that, the dynamical gene regulatory networks of the deteriorative courses of AD were reconstructed. To select significant genes that are differentially expressed in different courses of AD, independent component analysis (ICA), which is better than the traditional clustering methods and can successfully group one gene in different meaningful biological processes, was used. The molecular biological analysis showed that the changes of TF activities and interactions of signaling proteins in mitosis, cell cycle, immune response, and inflammation play an important role in the deterioration of AD.
doi:10.1155/2014/891761
PMCID: PMC4082865  PMID: 25024739
16.  Connectivity in the Yeast Cell Cycle Transcription Network: Inferences from Neural Networks 
PLoS Computational Biology  2006;2(12):e169.
A current challenge is to develop computational approaches to infer gene network regulatory relationships based on multiple types of large-scale functional genomic data. We find that single-layer feed-forward artificial neural network (ANN) models can effectively discover gene network structure by integrating global in vivo protein:DNA interaction data (ChIP/Array) with genome-wide microarray RNA data. We test this on the yeast cell cycle transcription network, which is composed of several hundred genes with phase-specific RNA outputs. These ANNs were robust to noise in data and to a variety of perturbations. They reliably identified and ranked 10 of 12 known major cell cycle factors at the top of a set of 204, based on a sum-of-squared weights metric. Comparative analysis of motif occurrences among multiple yeast species independently confirmed relationships inferred from ANN weights analysis. ANN models can capitalize on properties of biological gene networks that other kinds of models do not. ANNs naturally take advantage of patterns of absence, as well as presence, of factor binding associated with specific expression output; they are easily subjected to in silico “mutation” to uncover biological redundancies; and they can use the full range of factor binding values. A prominent feature of cell cycle ANNs suggested an analogous property might exist in the biological network. This postulated that “network-local discrimination” occurs when regulatory connections (here between MBF and target genes) are explicitly disfavored in one network module (G2), relative to others and to the class of genes outside the mitotic network. If correct, this predicts that MBF motifs will be significantly depleted from the discriminated class and that the discrimination will persist through evolution. 
Analysis of distantly related Schizosaccharomyces pombe confirmed this, suggesting that network-local discrimination is real and complements well-known enrichment of MBF sites in G1 class genes.
Synopsis
A current challenge is to develop computational approaches to infer gene network regulatory relationships by integrating multiple types of large-scale functional genomic data. This paper shows that simple artificial neural networks (ANNs) employed in a new way do this very well. The ANN models are well-suited to capitalize on natural properties of gene networks in ways that many previous methods do not. Resulting gene network connections inferred between transcription factors and RNA output patterns are robust to noise in large-scale input datasets and to differences in RNA clustering class inputs. This was shown by using the yeast cell cycle gene network as a test case. The cycle has multiple classes of oscillatory RNAs, and Hart, Mjolsness, and Wold show that the ANNs identify key connections that associate genes from each cell cycle phase group with known and candidate regulators. Comparative analysis of network connectivity across multiple genomes showed strong conservation of basic factor-to-output relationships, although at the greatest evolutionary distances the specific target genes have mainly changed identity.
doi:10.1371/journal.pcbi.0020169
PMCID: PMC1761652  PMID: 17194216
17.  Discovering Motifs in Ranked Lists of DNA Sequences 
PLoS Computational Biology  2007;3(3):e39.
Computational methods for discovery of sequence elements that are enriched in a target set compared with a background set are fundamental in molecular biology research. One example is the discovery of transcription factor binding motifs that are inferred from ChIP–chip (chromatin immuno-precipitation on a microarray) measurements. Several major challenges in sequence motif discovery still require consideration: (i) the need for a principled approach to partitioning the data into target and background sets; (ii) the lack of rigorous models and of an exact p-value for measuring motif enrichment; (iii) the need for an appropriate framework for accounting for motif multiplicity; (iv) the tendency, in many of the existing methods, to report presumably significant motifs even when applied to randomly generated data. In this paper we present a statistical framework for discovering enriched sequence elements in ranked lists that resolves these four issues. We demonstrate the implementation of this framework in a software application, termed DRIM (discovery of rank imbalanced motifs), which identifies sequence motifs in lists of ranked DNA sequences. We applied DRIM to ChIP–chip and CpG methylation data and obtained the following results. (i) Identification of 50 novel putative transcription factor (TF) binding sites in yeast ChIP–chip data. The biological function of some of them was further investigated to gain new insights on transcription regulation networks in yeast. For example, our discoveries enable the elucidation of the network of the TF ARO80. Another finding concerns a systematic TF binding enhancement to sequences containing CA repeats. (ii) Discovery of novel motifs in human cancer CpG methylation data. Remarkably, most of these motifs are similar to DNA sequence elements bound by the Polycomb complex that promotes histone methylation. Our findings thus support a model in which histone methylation and CpG methylation are mechanistically linked. 
Overall, we demonstrate that the statistical framework embodied in the DRIM software tool is highly effective for identifying regulatory sequence elements in a variety of applications ranging from expression and ChIP–chip to CpG methylation data. DRIM is publicly available at http://bioinfo.cs.technion.ac.il/drim.
Author Summary
A computational problem with many applications in molecular biology is to identify short DNA sequence patterns (motifs) that are significantly overrepresented in a target set of genomic sequences relative to a background set of genomic sequences. One example is a target set that contains DNA sequences to which a specific transcription factor protein was experimentally measured as bound while the background set contains sequences to which the same transcription factor was not bound. Overrepresented sequence motifs in the target set may represent a subsequence that is molecularly recognized by the transcription factor. An inherent limitation of the above formulation of the problem lies in the fact that in many cases data cannot be clearly partitioned into distinct target and background sets in a biologically justified manner. We describe a statistical framework for discovering motifs in a list of genomic sequences that are ranked according to a biological parameter or measurement (e.g., transcription factor to sequence binding measurements). Our approach circumvents the need to partition the data into target and background sets using arbitrarily set parameters. The framework is implemented in a software tool called DRIM. The application of DRIM led to the identification of novel putative transcription factor binding sites in yeast and to the discovery of previously unknown motifs in CpG methylation regions in human cancer cell lines.
doi:10.1371/journal.pcbi.0030039
PMCID: PMC1829477  PMID: 17381235
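The idea of scoring motif enrichment in a ranked list without a fixed target/background split, as described in the entry above, can be sketched with a minimum hypergeometric (mHG) score. That DRIM is built on this statistic is an assumption here, since the abstract does not name it; the function names are illustrative, and the sketch returns only the raw minimum score, not the exact p-value mentioned in the abstract, which requires a further correction for having tried every cutoff.

```python
from math import comb

def hypergeom_tail(N, B, n, b):
    """P(X >= b) when drawing n items from N, of which B carry the motif."""
    total = comb(N, B)
    hits = sum(comb(n, i) * comb(N - n, B - i) for i in range(b, min(n, B) + 1))
    return hits / total

def mhg_score(occurrences):
    """Minimum hypergeometric tail over all cutoffs of a ranked 0/1 list.

    occurrences[k] is 1 if the k-th ranked sequence contains the motif.
    Returns (score, cutoff); cutoff is the size of the best top set
    (0 if no cutoff improves on a score of 1.0).
    """
    N, B = len(occurrences), sum(occurrences)
    best, best_n, b = 1.0, 0, 0
    for n in range(1, N + 1):
        b += occurrences[n - 1]          # motif count within the top n
        p = hypergeom_tail(N, B, n, b)
        if p < best:
            best, best_n = p, n
    return best, best_n
```

For a ranked list such as `[1, 1, 1, 0, 0, 0]`, the best cutoff is the top three sequences, where all three motif occurrences are concentrated.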
18.  Reconstruction of Gene Regulatory Modules in Cancer Cell Cycle by Multi-Source Data Integration 
PLoS ONE  2010;5(4):e10268.
Background
Precise regulation of the cell cycle is crucial to the growth and development of all organisms. Understanding the regulatory mechanism of the cell cycle is crucial to unraveling many complicated diseases, most notably cancer. Multiple sources of biological data are available to study the dynamic interactions among many genes that are related to the cancer cell cycle. Integrating these informative and complementary data sources can help to infer a mutually consistent gene transcriptional regulatory network with strong similarity to the underlying gene regulatory relationships in cancer cells.
Results and Principal Findings
We propose an integrative framework that infers gene regulatory modules from the cell cycle of cancer cells by incorporating multiple sources of biological data, including gene expression profiles, gene ontology, and molecular interaction. Among 846 human genes with putative roles in cell cycle regulation, we identified 46 transcription factors and 39 gene ontology groups. We reconstructed regulatory modules to infer the underlying regulatory relationships. Four regulatory network motifs were identified from the interaction network. The relationship between each transcription factor and predicted target gene groups was examined by training a recurrent neural network whose topology mimics the network motif(s) to which the transcription factor was assigned. Inferred network motifs related to eight well-known cell cycle genes were confirmed by gene set enrichment analysis, binding site enrichment analysis, and comparison with previously published experimental results.
Conclusions
We established a robust method that can accurately infer underlying relationships between a given transcription factor and its downstream target genes by integrating different layers of biological data. Our method could also be beneficial to biologists for predicting the components of regulatory modules in which any candidate gene is involved. Such predictions can then be used to design a more streamlined experimental approach for biological validation. Understanding the dynamics of these modules will shed light on the processes that occur in cancer cells resulting from errors in cell cycle regulation.
doi:10.1371/journal.pone.0010268
PMCID: PMC2858157  PMID: 20422009
19.  Putative cold acclimation pathways in Arabidopsis thaliana identified by a combined analysis of mRNA co-expression patterns, promoter motifs and transcription factors 
BMC Genomics  2007;8:304.
Background
With the advent of microarray technology, it has become feasible to identify virtually all genes in an organism that are induced by developmental or environmental changes. However, relying solely on gene expression data may be of limited value if the aim is to infer the underlying genetic networks. Development of computational methods to combine microarray data with other information sources is therefore necessary. Here we describe one such method.
Results
By means of our method, previously published Arabidopsis microarray data from cold acclimated plants at six different time points, promoter motif sequence data extracted from ~24,000 Arabidopsis promoters and known transcription factor binding sites were combined to construct a putative genetic regulatory interaction network. The inferred network includes both previously characterised and hitherto un-described regulatory interactions between transcription factor (TF) genes and genes that encode other TFs or other proteins. Part of the obtained transcription factor regulatory network is presented here. More detailed information is available in the additional files.
Conclusion
The rule-based method described here can be used to infer genetic networks by combining data from microarrays, promoter sequences and known promoter binding sites. This method should in principle be applicable to any biological system. We tested the method on the cold acclimation process in Arabidopsis and could identify a more complex putative genetic regulatory network than previously described. However, it should be noted that information on specific binding sites for individual TFs was in most cases not available. Thus, gene targets for entire TF gene families were predicted. In addition, the networks were built solely by a bioinformatics approach, and experimental verification will be necessary for their final validation. On the other hand, since our method highlights putative novel interactions, more directed experiments could now be performed.
doi:10.1186/1471-2164-8-304
PMCID: PMC2001198  PMID: 17764576
20.  Uncovering a Macrophage Transcriptional Program by Integrating Evidence from Motif Scanning and Expression Dynamics 
PLoS Computational Biology  2008;4(3):e1000021.
Macrophages are versatile immune cells that can detect a variety of pathogen-associated molecular patterns through their Toll-like receptors (TLRs). In response to microbial challenge, the TLR-stimulated macrophage undergoes an activation program controlled by a dynamically inducible transcriptional regulatory network. Mapping a complex mammalian transcriptional network poses significant challenges and requires the integration of multiple experimental data types. In this work, we inferred a transcriptional network underlying TLR-stimulated murine macrophage activation. Microarray-based expression profiling and transcription factor binding site motif scanning were used to infer a network of associations between transcription factor genes and clusters of co-expressed target genes. The time-lagged correlation was used to analyze temporal expression data in order to identify potential causal influences in the network. A novel statistical test was developed to assess the significance of the time-lagged correlation. Several associations in the resulting inferred network were validated using targeted ChIP-on-chip experiments. The network incorporates known regulators and gives insight into the transcriptional control of macrophage activation. Our analysis identified a novel regulator (TGIF1) that may have a role in macrophage activation.
Author Summary
Macrophages play a vital role in host defense against infection by recognizing pathogens through pattern recognition receptors, such as the Toll-like receptors (TLRs), and mounting an immune response. Stimulation of TLRs initiates a complex transcriptional program in which induced transcription factor genes dynamically regulate downstream genes. Microarray-based transcriptional profiling has proved useful for mapping such transcriptional programs in simpler model organisms; however, mammalian systems present difficulties such as post-translational regulation of transcription factors, combinatorial gene regulation, and a paucity of available gene-knockout expression data. Additional evidence sources, such as DNA sequence-based identification of transcription factor binding sites, are needed. In this work, we computationally inferred a transcriptional network for TLR-stimulated murine macrophages. Our approach combined sequence scanning with time-course expression data in a probabilistic framework. Expression data were analyzed using the time-lagged correlation. A novel, unbiased method was developed to assess the significance of the time-lagged correlation. The inferred network of associations between transcription factor genes and co-expressed gene clusters was validated with targeted ChIP-on-chip experiments, and yielded insights into the macrophage activation program, including a potential novel regulator. Our general approach could be used to analyze other complex mammalian systems for which time-course expression data are available.
doi:10.1371/journal.pcbi.1000021
PMCID: PMC2265556  PMID: 18369420
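The time-lagged correlation used in the entry above can be illustrated as follows. This is a minimal sketch that assumes evenly spaced time points and a plain Pearson correlation between a transcription factor profile and a target profile shifted by a whole number of time steps; the function names are hypothetical, and the significance test developed in the paper is not reproduced.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def time_lagged_correlation(tf, target, max_lag):
    """Return (lag, r) with the largest |r|, comparing tf[t] with target[t + lag]."""
    best = (0, pearson(tf, target))
    for lag in range(1, max_lag + 1):
        if len(tf) - lag < 3:            # too few overlapping points to correlate
            break
        r = pearson(tf[:-lag], target[lag:])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best
```

A positive best lag means the target profile follows the transcription factor profile after that many time steps, which is the kind of ordering the abstract treats as a potential causal influence.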
21.  GINI: From ISH Images to Gene Interaction Networks 
PLoS Computational Biology  2013;9(10):e1003227.
Accurate inference of molecular and functional interactions among genes, especially in multicellular organisms such as Drosophila, often requires statistical analysis of correlations not only between the magnitudes of gene expressions, but also between their temporal-spatial patterns. The ISH (in-situ-hybridization)-based gene expression micro-imaging technology offers an effective approach to perform large-scale spatial-temporal profiling of whole-body mRNA abundance. However, analytical tools for discovering gene interactions from such data remain an open challenge due to various reasons, including difficulties in extracting canonical representations of gene activities from images, and in inference of statistically meaningful networks from such representations. In this paper, we present GINI, a machine learning system for inferring gene interaction networks from Drosophila embryonic ISH images. GINI builds on a computer-vision-inspired vector-space representation of the spatial pattern of gene expression in ISH images, enabled by our recently developed system; and a new multi-instance-kernel algorithm that learns a sparse Markov network model, in which, every gene (i.e., node) in the network is represented by a vector-valued spatial pattern rather than a scalar-valued gene intensity as in conventional approaches such as a Gaussian graphical model. By capturing the notion of spatial similarity of gene expression, and at the same time properly taking into account the presence of multiple images per gene via multi-instance kernels, GINI is well-positioned to infer statistically sound, and biologically meaningful gene interaction networks from image data. Using both synthetic data and a small manually curated data set, we demonstrate the effectiveness of our approach in network building. 
Furthermore, we report results on a large publicly available collection of Drosophila embryonic ISH images from the Berkeley Drosophila Genome Project, where GINI makes novel and interesting predictions of gene interactions. Software for GINI is available at http://sailing.cs.cmu.edu/Drosophila_ISH_images/
Author Summary
As high-throughput technologies for molecular abundance profiling are becoming more inexpensive and accessible, computational inference of gene interaction networks from such data based on well-founded statistical principles is imperative to advance the understanding of regulatory mechanisms in various biological systems. Reverse engineering of gene networks has traditionally relied on analysis of whole-genome microarray data; here we present a new method, GINI, to infer gene networks from ISH images, thereby enabling exploration of spatial characteristics of gene expression for network inference. Our method generates a Markov network, which encapsulates globally meaningful statistical dependencies from vector-valued gene spatial patterns. In other words, we advance the state of the art in both the usage of richer forms of expression data and the employment of principled statistical methodology for sound network inference on this new form of data. Our results show that analyzing the spatial distribution of gene expression enables us to capture information not available from microarray data. Such an analysis is especially important in analyzing genes involved in embryonic development of Drosophila to reveal specific spatial patterning that determines the development of the 14 segments of the adult fly.
doi:10.1371/journal.pcbi.1003227
PMCID: PMC3794902  PMID: 24130465
22.  Pharmacokinetic similarity of biologics: analysis using nonlinear mixed-effects modeling 
Our objective was to show with two examples that a pharmacokinetic (PK) similarity analysis can be performed with nonlinear mixed effects models (NLMEM). We used two studies comparing different biosimilars: a three-way crossover trial on somatropin and a parallel group trial on epoetin alpha. For both datasets, NLMEM-based analysis was compared to non-compartmental analysis (NCA). As with NCA, we performed an NLMEM-based equivalence Wald test on secondary parameters of the model: the area under the curve and the maximal concentration. Somatropin PK was described by a one-compartment model, and epoetin alpha PK by a two-compartment model with linear and Michaelis-Menten elimination. For both studies, PK similarity was demonstrated by NCA and NLMEM, and both approaches led to similar results. Therefore, PK similarity data can be analyzed by either method. NCA is an easier approach, as it does not require data modelling, but NLMEM leads to a better understanding of the underlying biological system.
doi:10.1038/clpt.2011.216
PMCID: PMC3400548  PMID: 22205196
Biological Agents/pharmacokinetics; Clinical Trials as Topic/statistics & numerical data; Erythropoietin/pharmacokinetics; Human Growth Hormone/pharmacokinetics; Humans; Nonlinear Dynamics; Recombinant Proteins/pharmacokinetics; biologics; nonlinear mixed effects model; non-compartmental analysis; bioequivalence; pharmacokinetics
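The two quantities on which the PK similarity comparison above rests, the area under the curve and the maximal concentration, can be illustrated with a minimal non-compartmental sketch. The linear trapezoidal rule, the function name `nca_summary`, and the sample values are assumptions made for illustration; the abstract does not state which integration rule the authors used.

```python
def nca_summary(times, concentrations):
    """Return (AUC over the sampled interval, Cmax, Tmax).

    AUC is computed with the linear trapezoidal rule (an assumption;
    other rules such as log-linear trapezoids are also common).
    """
    if len(times) != len(concentrations) or len(times) < 2:
        raise ValueError("need two or more matched time/concentration points")
    auc = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        # Area of one trapezoid between consecutive samples.
        auc += 0.5 * dt * (concentrations[i] + concentrations[i - 1])
    cmax = max(concentrations)
    tmax = times[concentrations.index(cmax)]  # time of the first maximum
    return auc, cmax, tmax


# Hypothetical concentration-time profile (illustrative values only).
times = [0.0, 1.0, 2.0, 4.0, 8.0]
conc = [0.0, 4.0, 6.0, 3.0, 1.0]
auc, cmax, tmax = nca_summary(times, conc)
print(auc, cmax, tmax)
```

An equivalence analysis would then compare such summaries between formulations; that statistical step is not sketched here.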
23.  Trade-off between Responsiveness and Noise Suppression in Biomolecular System Responses to Environmental Cues 
PLoS Computational Biology  2011;7(6):e1002091.
When living systems detect changes in their external environment their response must be measured to balance the need to react appropriately with the need to remain stable, ignoring insignificant signals. Because this is a fundamental challenge of all biological systems that execute programs in response to stimuli, we developed a generalized time-frequency analysis (TFA) framework to systematically explore the dynamical properties of biomolecular networks. Using TFA, we focused on two well-characterized yeast gene regulatory networks responsive to carbon-source shifts and a mammalian innate immune regulatory network responsive to lipopolysaccharides (LPS). The networks are comprised of two different basic architectures. Dual positive and negative feedback loops make up the yeast galactose network; whereas overlapping positive and negative feed-forward loops are common to the yeast fatty-acid response network and the LPS-induced network of macrophages. TFA revealed remarkably distinct network behaviors in terms of trade-offs in responsiveness and noise suppression that are appropriately tuned to each biological response. The wild type galactose network was found to be highly responsive while the oleate network has greater noise suppression ability. The LPS network appeared more balanced, exhibiting less bias toward noise suppression or responsiveness. Exploration of the network parameter space exposed dramatic differences in system behaviors for each network. These studies highlight fundamental structural and dynamical principles that underlie each network, reveal constrained parameters of positive and negative feedback and feed-forward strengths that tune the networks appropriately for their respective biological roles, and demonstrate the general utility of the TFA approach for systems and synthetic biology.
Author Summary
Biological systems constantly balance noise suppression with responsiveness. In a fluctuating environment, some changes are insignificant to living cells while others represent cues to which they must respond. These stimuli are interpreted by molecular circuits that enable the cell to strike an appropriate balance between responsiveness and noise suppression. This trade-off is governed by the structure and kinetic parameters of molecular networks, which have been tuned by evolutionary selection for different stimuli and responses. We consider three regulatory circuits (two from yeast and one from mammalian cells), which respond to different environments and involve very different physiological processes. To investigate the responses to a time varying signal, we developed a generalized time-frequency analysis framework for studying such trade-offs using mathematical models of regulatory circuits and explore how the structure and parameters of the circuit affect the trade-offs between noise suppression and responsiveness. The generalized TFA approach represents an effective tool for exploring and analyzing different systems-level dynamical properties. Making use of such properties can facilitate prediction and network control for systems- and synthetic biology applications.
doi:10.1371/journal.pcbi.1002091
PMCID: PMC3127798  PMID: 21738459
24.  TimeDelay-ARACNE: Reverse engineering of gene networks from time-course data by an information theoretic approach 
BMC Bioinformatics  2010;11:154.
Background
One of the main aims of molecular biology is to gain knowledge about how molecular components interact with each other and to understand the regulation of gene function. Using microarray technology, it is possible to measure thousands of genes in a single analysis step, yielding a picture of the cell's gene expression. Several methods have been developed to infer gene networks from steady-state data, but much less literature exists on time-course data, so the development of algorithms to infer gene networks from time-series measurements is a current challenge in bioinformatics research. In order to detect dependencies between genes at different time delays, we propose an approach to infer gene regulatory networks from time-series measurements, starting from a well-known algorithm based on information theory.
Results
In this paper we show how the ARACNE (Algorithm for the Reconstruction of Accurate Cellular Networks) algorithm can be used for gene regulatory network inference in the case of time-course expression profiles. The resulting method is called TimeDelay-ARACNE. It extracts dependencies between two genes at different time delays and provides a measure of these dependencies in terms of mutual information. The basic idea of the proposed algorithm is to detect time-delayed dependencies between the expression profiles by assuming a stationary Markov random field as the underlying probabilistic model. Less informative dependencies are filtered out using an automatically calculated threshold, retaining the most reliable connections. TimeDelay-ARACNE can infer small local networks of time-regulated gene-gene interactions, detecting their direction, and can discover cyclic interactions even when only a small-to-medium number of measurements is available. We test the algorithm both on synthetic networks and on microarray expression profiles. The microarray measurements concern the S. cerevisiae cell cycle, E. coli SOS pathways, and a recently developed network for in vivo assessment of reverse engineering algorithms. Our results are compared with those of ARACNE itself and of two previously published approaches, Dynamic Bayesian Networks and systems of ODEs, showing that TimeDelay-ARACNE has good accuracy, recall and F-score for the network reconstruction task.
Conclusions
Here we report the adaptation of the ARACNE algorithm to infer gene regulatory networks from time-course data, so that the resulting network is represented as a directed graph. The proposed algorithm is expected to be useful in the reconstruction of small directed biological networks from time-course data.
doi:10.1186/1471-2105-11-154
PMCID: PMC2862045  PMID: 20338053
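The idea of scoring a dependency between two expression profiles at different time delays by mutual information can be sketched as follows. This is a simplified illustration, not the TimeDelay-ARACNE implementation: the histogram (equal-width binning) estimator, the function names, the number of bins, and the synthetic profiles are all assumptions, and the thresholding and pruning steps mentioned in the abstract are omitted.

```python
import math
from collections import Counter


def discretize(values, n_bins):
    """Map each value to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(a, b):
    """Plug-in mutual information (in nats) of two discrete sequences."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a, count_b = Counter(a), Counter(b)
    return sum(
        (c / n) * math.log(c * n / (count_a[i] * count_b[j]))
        for (i, j), c in joint.items()
    )


def delayed_mi(x, y, delay, n_bins=3):
    """Mutual information between x[t] and y[t + delay]."""
    if len(x) != len(y) or not 0 <= delay < len(x):
        raise ValueError("profiles must match in length; delay must fit")
    xs, ys = discretize(x, n_bins), discretize(y, n_bins)
    return mutual_information(xs[: len(xs) - delay], ys[delay:])


# Synthetic profiles (illustrative only): y copies x two time points later.
x = [0.1, 0.9, 0.5, 0.2, 0.8, 0.4, 0.95, 0.05,
     0.6, 0.3, 0.7, 0.15, 0.85, 0.45, 0.55, 0.25]
y = [x[0], x[0]] + x[:-2]

scores = {d: delayed_mi(x, y, d) for d in range(5)}
best_delay = max(scores, key=scores.get)
print(best_delay, scores)
```

Because the dependency is strongest at the delay where y actually follows x, the delay with the highest score also suggests a direction for the edge, which is the intuition behind obtaining a directed graph from time-course data.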
25.  Global coordination of transcriptional control and mRNA decay during cellular differentiation 
We have systematically identified the targets of the Schizosaccharomyces pombe RNA-binding protein Meu5p, which is transiently induced during cellular differentiation. Meu5p-bound transcripts (>80) are expressed at low levels and have shorter half-lives in meu5 mutants, suggesting that Meu5p binding stabilizes its RNA targets. Most Meu5p targets are induced during differentiation by the activity of the Mei4p transcription factor. However, although most Mei4p targets display a sharp peak of expression, Meu5p targets are expressed for a longer period. In the absence of Meu5p, all Mei4p targets are expressed with similar kinetics (similar to non-Meu5p targets). Therefore, Meu5p determines the temporal profile of its targets. As the meu5 gene is itself a target of the transcription factor Mei4p, the RNA-binding protein Meu5p and their shared targets form a feed-forward loop (FFL), a network motif that is common in transcriptional networks. Our data highlight the importance of considering both transcriptional and posttranscriptional controls to understand dynamic changes in RNA levels, and provide insight into the structure of the regulatory networks that integrate transcription and RNA decay.
RNA levels are determined by the balance between RNA production (transcription) and degradation (decay or turnover). Therefore, cells can alter transcript levels by modulating either or both processes. Regulation of transcriptional initiation is one of the most common ways to regulate RNA levels. This function is frequently performed by transcription factors (TFs), which recognize specific sequence motifs on the promoters of their target genes and activate or repress their transcription. At the posttranscriptional level, RNA-binding proteins (RBPs) can bind to specific sequences on their target RNAs and regulate their rates of turnover.
RNA decay can be studied at the genome-wide level using microarrays or next-generation sequencing. The contribution of RNA turnover to transcript levels can be assessed by directly measuring decay rates. This is usually achieved by using microarrays to follow the decrease of RNA levels after inactivation of RNA polymerase II, or by in vivo labelling of newly synthesized RNA with modified nucleosides. These approaches can be applied to mutants in genes encoding RBPs, allowing the dissection of their specific functions in RNA turnover. Moreover, direct RBP targets can be identified by purifying RBP–RNA complexes, which are then analysed using microarrays (RIp-chip, for RBP Immunoprecipitation followed by analysis with DNA chips).
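As a rough illustration of how a decay rate can be read off a time course measured after transcription is shut off, the sketch below fits a first-order decay model by least squares on log-transformed levels. The first-order assumption, the function name `estimate_half_life`, and the synthetic data are assumptions for illustration and are not taken from the study.

```python
import math


def estimate_half_life(times_h, levels):
    """Fit ln(level) = ln(level0) - k * t by ordinary least squares.

    Returns (k, half_life), with half_life = ln(2) / k in hours.
    Assumes first-order decay and strictly positive levels.
    """
    if len(times_h) != len(levels) or len(times_h) < 2:
        raise ValueError("need two or more matched time/level points")
    if any(v <= 0 for v in levels):
        raise ValueError("levels must be positive to take logarithms")
    logs = [math.log(v) for v in levels]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    k = -sxy / sxx  # decay rate is the negative slope
    return k, math.log(2) / k


# Synthetic, noise-free time course with a decay rate of 0.5 per hour.
times_h = [0.0, 0.5, 1.0, 2.0, 3.0]
levels = [100.0 * math.exp(-0.5 * t) for t in times_h]
k, half_life = estimate_half_life(times_h, levels)
print(round(k, 3), round(half_life, 3))
```

Comparing half-lives estimated this way between wild-type and mutant cells is one simple way to express the kind of stabilization effect described in the text.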
Many biological processes involve the establishment of complex programs of gene expression, in which the levels of hundreds of mRNAs are dynamically regulated. Although the genome-wide function of TFs in these processes has been studied extensively, much less is known about the contribution of RBPs, and especially about how the activity of TFs and RBPs is coordinated. Sexual differentiation of the fission yeast Schizosaccharomyces pombe culminates in meiosis and sporulation and is driven by an extensive gene expression program during which ∼40% of the genome (∼2000 genes) is regulated in complex temporal patterns. Transcriptional control is essential for the implementation of this program, and TFs responsible for the induction of most groups of upregulated genes have been identified. In particular, a transcription factor called Mei4p, which is itself transiently expressed during the meiotic divisions, induces the temporary expression of over 500 genes.
Here, we use genome-wide approaches to investigate the function of the Meu5p RBP, which is transiently induced by the Mei4p TF during the meiotic divisions. RIp-chip experiments identified >80 transcripts bound to Meu5p during meiosis, most of which were also targets of the Mei4p transcription factor. In meu5 mutants, Meu5p targets are expressed at low levels and have shorter half-lives, indicating that Meu5p stabilizes the transcripts it binds to. This stabilization has biological importance, as cells without meu5 are defective in spore formation.
Although the majority of Mei4p TF targets reach their peak in expression levels with similar kinetics, we noticed that the timing of their downregulation was heterogeneous. We could identify two discrete groups among Mei4p targets: a set of mRNAs with short (∼1 h) and sharp gene expression profiles (early decrease), and a group that displayed a broader expression pattern, with high levels of expression for 2–3 h (late decrease).
Most Meu5p RBP targets belonged to the late-decrease group, suggesting a simple model in which Meu5p might stabilize its targets, thus extending the duration of their expression. To test this idea, we followed gene expression in synchronized cultures of wild-type and meu5Δ meiotic cells. Although the expression of early-decrease genes was not affected by the absence of meu5, late-decrease genes switched their profile to a pattern similar to that of early-decrease genes. As transcription of meu5 is under the control of Mei4p, the TF Mei4p, the RBP Meu5p, and their common targets form a so-called feed-forward loop, in which a protein regulates a target both directly and indirectly through a second protein. This arrangement is common in transcriptional and protein phosphorylation networks.
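The feed-forward arrangement described above can be explored with a toy simulation in which a transient transcription factor pulse drives both a target mRNA and an RNA-binding protein that slows the target's decay. Everything in this sketch is hypothetical: the equations, the parameter values, and the function names are illustrative assumptions, not a model fitted to the data in the study.

```python
def simulate_target(with_rbp, t_end=8.0, dt=0.001):
    """Euler-integrate a toy feed-forward loop; return (times, target levels).

    x: transcription factor activity (a 1 h pulse), y: RNA-binding protein,
    z: target mRNA. All rates are assumed values in arbitrary units per hour.
    """
    k_fast = 3.0  # decay rate of the unprotected target
    d_y = 0.5     # decay rate of the RNA-binding protein
    K = 0.1       # RBP level at which target decay is halved
    y = z = 0.0
    times, levels = [], []
    for i in range(int(round(t_end / dt)) + 1):
        t = i * dt
        times.append(t)
        levels.append(z)
        x = 1.0 if t < 1.0 else 0.0            # transient TF pulse
        dy = (x - d_y * y) if with_rbp else 0.0  # no RBP made in the mutant
        decay = k_fast / (1.0 + y / K)          # RBP slows target decay
        dz = x - decay * z
        y += dt * dy
        z += dt * dz
    return times, levels


def duration_above_half_max(times, levels):
    """Width of the interval during which the level is at least half its peak."""
    half = 0.5 * max(levels)
    above = [t for t, v in zip(times, levels) if v >= half]
    return above[-1] - above[0]


wild_type = duration_above_half_max(*simulate_target(with_rbp=True))
mutant = duration_above_half_max(*simulate_target(with_rbp=False))
print(round(wild_type, 2), round(mutant, 2))
```

With these assumed parameters the target stays above half of its peak for longer when the RNA-binding protein is present, which mirrors the qualitative switch from a broad to a sharp profile reported for late-decrease genes in the mutant.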
Our results serve as a paradigm of how the coordination of the action of TFs and RBPs determines how RNA levels are dynamically regulated.
The function of transcription in dynamic gene expression programs has been extensively studied, but little is known about how it is integrated with RNA turnover at the genome-wide level. We investigated these questions using the meiotic gene expression program of Schizosaccharomyces pombe. We identified over 80 transcripts that co-purify with the meiotic-specific Meu5p RNA-binding protein. Their levels and half-lives were reduced in meu5 mutants, demonstrating that Meu5p stabilizes its targets. Most Meu5p-bound RNAs were also targets of the Mei4p transcription factor, which induces the transient expression of ∼500 meiotic genes. Although many Mei4p targets showed sharp expression peaks, Meu5p targets had broad expression profiles. In the absence of meu5, all Mei4p targets were expressed with similar kinetics, indicating that Meu5p alters the global features of the gene expression program. As Mei4p activates meu5 transcription, Mei4p, Meu5p and their common targets form a feed-forward loop, a motif common in transcriptional networks but not studied in the context of mRNA decay. Our data provide insight into the topology of regulatory networks integrating transcriptional and posttranscriptional controls.
doi:10.1038/msb.2010.38
PMCID: PMC2913401  PMID: 20531409
mRNA decay; RIp-chip; posttranscriptional control
