The editors of BMC Systems Biology would like to thank all our reviewers who have contributed to the journal in Volume 8 (2014).
Rho GTPases function as molecular switches in many different signaling pathways and control a wide range of cellular processes. Rho GDP-dissociation inhibitors (RhoGDIs) regulate Rho GTPase signaling and can function as both negative and positive regulators. The role of RhoGDIs as negative regulators of Rho GTPase signaling has been extensively investigated; however, little is known about how RhoGDIs act as positive regulators. Furthermore, it is unclear how this opposing role of GDIs influences the Rho GTPase cycle. We constructed ordinary differential equation models of the Rho GTPase cycle in which RhoGDIs inhibit the regulatory activities of guanine nucleotide exchange factors (GEFs) and GTPase-activating proteins (GAPs) by interacting with them directly as well as by sequestering the Rho GTPases. Using this model, we analyzed the role of RhoGDIs in Rho GTPase signaling.
The model constructed in this study showed that the functions of GEFs and GAPs are integrated into Rho GTPase signaling through the interactions of these regulators with GDIs, and that the negative role of GDIs is to suppress the overall Rho activity by inhibiting GEFs. Furthermore, the positive role of GDIs is to sustain Rho activation by inhibiting GAPs under certain conditions. The interconversion between transient and sustained Rho activation occurs mainly through changes in the affinities of GDIs to GAPs and the concentrations of GAPs.
RhoGDIs positively regulate Rho GTPase signaling primarily by interacting with GAPs and may participate in the switching between transient and sustained signals of the Rho GTPases. These findings enhance our understanding of the physiological roles of RhoGDIs and Rho GTPase signaling.
Electronic supplementary material
The online version of this article (doi:10.1186/s12918-015-0143-5) contains supplementary material, which is available to authorized users.
RhoGDI; Rho GTPases; Ordinary differential equation; GAPs; GEFs
Initial success of inhibitors targeting oncogenes is often followed by tumor relapse due to acquired resistance. In addition to mutations in targeted oncogenes, signaling cross-talks among pathways play a vital role in such drug inefficacy. These include activation of compensatory pathways and altered activities of key effectors in other cell survival and growth-associated pathways.
We propose a computational framework using Bayesian modeling to systematically characterize potential cross-talks among breast cancer signaling pathways. We employed a fully Bayesian approach known as the p1-model to infer posterior probabilities of gene-pairs in networks derived from the gene expression datasets of ErbB2-positive breast cancer cell-lines (parental, lapatinib-sensitive cell-line SKBR3 and the lapatinib-resistant cell-line SKBR3-R, derived from SKBR3). Using this computational framework, we searched for cross-talks between EGFR/ErbB and other signaling pathways from Reactome, KEGG and WikiPathway databases that contribute to lapatinib resistance. We identified 104, 188 and 299 gene-pairs as putative drug-resistant cross-talks, respectively, each comprised of a gene in the EGFR/ErbB signaling pathway and a gene from another signaling pathway, that appear to be interacting in resistant cells but not in parental cells. In 168 of these (distinct) gene-pairs, both of the interacting partners are up-regulated in resistant conditions relative to parental conditions. These gene-pairs are prime candidates for novel cross-talks contributing to lapatinib resistance. They associate EGFR/ErbB signaling with six other signaling pathways: Notch, Wnt, GPCR, hedgehog, insulin receptor/IGF1R and TGF-β receptor signaling. We conducted a literature survey to validate these cross-talks, and found evidence supporting a role for many of them in contributing to drug resistance. We also analyzed an independent study of lapatinib resistance in the BT474 breast cancer cell-line and found the same signaling pathways making cross-talks with the EGFR/ErbB signaling pathway as in the primary dataset.
Our results indicate that the activation of compensatory pathways can potentially cause up-regulation of EGFR/ErbB pathway genes (counteracting the inhibiting effect of lapatinib) via signaling cross-talk. Thus, the up-regulated members of these compensatory pathways along with the members of the EGFR/ErbB signaling pathway are interesting as potential targets for designing novel anti-cancer therapeutics.
Electronic supplementary material
The online version of this article (doi:10.1186/s12918-014-0135-x) contains supplementary material, which is available to authorized users.
Drug resistance; Signaling cross-talk; Bayesian statistical modeling; p1-model; EGFR signaling; Breast cancer; Lapatinib
Mortierella alpina is an oleaginous fungus used in the industrial scale production of arachidonic acid (ARA). In order to investigate the metabolic characteristics at a systems level and to explore potential strategies for enhanced lipid production, a genome-scale metabolic model of M. alpina was reconstructed.
This model included 1106 genes, 1854 reactions and 1732 metabolites. On minimal growth medium, 86 genes were identified as essential, whereas 49 essential genes were identified on yeast extract medium. A series of sequential desaturase and elongase catalysed steps are involved in the synthesis of polyunsaturated fatty acids (PUFAs) from acetyl-CoA precursors, with concomitant NADPH consumption, and these steps were investigated in this study. Oxygen is known to affect the degree of unsaturation of PUFAs, and robustness analysis determined that an oxygen uptake rate of 2.0 mmol gDW⁻¹ h⁻¹ was optimal for ARA accumulation. The flux of 53 reactions involving NADPH was significantly altered at different ARA levels. Of these, malic enzyme (ME) was confirmed as a key component in ARA production and NADPH generation. When using minimization of metabolic adjustment, a knock-out of ME led to a 38.28% decrease in ARA production.
The simulation results confirmed the model as a useful tool for future research on the metabolism of PUFAs.
Electronic supplementary material
The online version of this article (doi:10.1186/s12918-014-0137-8) contains supplementary material, which is available to authorized users.
Mortierella alpina; Arachidonic acid; Genome-scale metabolic model; Polyunsaturated fatty acids; Malic enzyme
Over the past years, tremendous efforts have been made to elucidate the molecular basis of the initiation and progression of ovarian cancer. However, most existing studies have been focused on individual genes or a single type of data, which may lack the power to detect the complex mechanisms of cancer formation by overlooking the interactions of different genetic and epigenetic factors.
We propose an integrative framework to identify genetic and epigenetic features related to ovarian cancer and to quantify the causal relationships among these features using a probabilistic graphical model based on the Cancer Genome Atlas (TCGA) data. In the feature selection, we first defined a set of seed genes by including 48 candidate tumor suppressors or oncogenes and an additional 20 ovarian cancer related genes reported in the literature. The seed genes were then fed into a stepwise correlation-based selector to identify 271 additional features including 177 genes, 82 copy number variation sites, 11 methylation sites and 1 somatic mutation (at gene TP53). We built a Bayesian network model with a logit link function to quantify the causal relationships among these features and discovered a set of 13 hub genes including ARID1A, C19orf53, CSNK2A1 and COL5A2. The directed graph revealed many potential genetic pathways, some of which confirmed the existing results in the literature. Clustering analysis further suggested four gene clusters, three of which correspond to well-defined cellular processes including cell division, tumor invasion and mitochondrial system. In addition, two genes related to glycoprotein synthesis, PSG11 and GALNT10, were found to be highly predictive of the overall survival time of ovarian cancer patients.
The proposed framework is effective in identifying potentially important genetic and epigenetic features that are related to complex cancer diseases. The constructed Bayesian network has identified some new genetic/epigenetic pathways, which may shed new light on the molecular mechanisms of ovarian cancer.
Electronic supplementary material
The online version of this article (doi:10.1186/s12918-014-0136-9) contains supplementary material, which is available to authorized users.
The Cancer Genome Atlas; Bayesian network; Pathway analysis; Feature selection; Causal inference; Directed network
Given the complex nature of cardiovascular disease (CVD), information derived from a systems-level analysis will allow us to fully interrogate features of CVD to better understand disease pathogenesis and to identify new drug targets.
Here, we describe a systematic assessment of the multi-layer interactions underlying cardiovascular drugs, targets, genes and disorders to reveal comprehensive insights into cardiovascular systems biology and pharmacology. We have identified 206 effect-mediating drug targets, which are modulated by 254 unique drugs, of which, 43% display activities across different protein families (sequence similarity < 30%), highlighting the fact that multitarget therapy is suitable for CVD. Although there is little overlap between cardiovascular protein targets and disease genes, the two groups have similar pleiotropy and intimate relationships in the human disease gene-gene and cellular networks, supporting their similar characteristics in disease development and response to therapy. We also characterize the relationships between different cardiovascular disorders, which reveal that they share more etiological commonalities with each other rooted in the global disease-disease networks. Furthermore, the disease modular analysis demonstrates apparent molecular connection between 227 cardiovascular disease pairs.
Together, these results provide important consensus as to the cause, prevention, and treatment of various cardiovascular disorders from a systems-level perspective.
Electronic supplementary material
The online version of this article (doi:10.1186/s12918-014-0141-z) contains supplementary material, which is available to authorized users.
Cardiovascular disease; Network pharmacology; Network analysis; Drug discovery; Drug-target network; Gene-disease network
Constraint-based metabolic models and flux balance analysis (FBA) have been used extensively in recent years to investigate the behavior of cells and as a basis for different industrial applications. In this context, this work provides a validation of a small-sized FBA model of the yeast Pichia pastoris. Our main objective is to test how accurately the hypothesis of maximum growth predicts the behavior of P. pastoris in a range of experimental environments.
A constraint-based model of P. pastoris was previously validated using metabolic flux analysis (MFA). In this paper we verify the model's ability to predict the cells' behavior in different conditions without introducing measurements, experimental parameters, or any additional constraint, just by assuming that cells will make the best use of the available resources to maximize their growth. In particular, we have tested the FBA model's ability to: (a) predict growth yields over single substrates (glucose, glycerol, and methanol); (b) predict growth rate, substrate uptakes, respiration rates, and by-product formation in scenarios where different substrates are available (glucose, glycerol, methanol, or mixes of methanol and glycerol); (c) predict the different behaviors of P. pastoris cultures in aerobic and hypoxic conditions for each single substrate. In every case, experimental data from the literature are used for validation.
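For orientation, the maximum-growth hypothesis is usually written as a linear program. The form below is the generic textbook statement of FBA, not the specific P. pastoris model; the symbols S (stoichiometric matrix), v (flux vector), c (a vector selecting the biomass reaction) and the bounds v^min, v^max are notation assumed here for illustration.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min} \le v \le v^{\max}.
\end{aligned}
```

In this reading, "without introducing measurements" means that only substrate availability enters through the bounds, and every other flux is left for the optimization to determine.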
We conclude that our predictions based on growth maximisation are reasonably accurate, but still far from perfect. The deviations are significant in scenarios where P. pastoris grows on methanol, suggesting that the hypothesis of maximum growth may not be dominant in these situations. However, predictions are much better when glycerol or glucose is used as the substrate. In these scenarios, even if our FBA model is small and imposes a strong assumption regarding how cells will regulate their metabolic fluxes, it provides reasonably good predictions in terms of growth, substrate preference, product formation, and respiration rates.
Electronic supplementary material
The online version of this article (doi:10.1186/s12918-014-0142-y) contains supplementary material, which is available to authorized users.
Constraint-based model; Flux balance analysis; Possibilistic metabolic flux analysis; Pichia pastoris
Although the growth-factor G-CSF is widely used to prevent granulotoxic side effects of cytotoxic chemotherapies, its optimal use is still unknown since treatment outcome depends on many parameters such as dosing and timing of chemotherapies, pharmaceutical derivative of G-CSF used and individual risk factors. We showed in the past that a pharmacokinetic and pharmacodynamic model of G-CSF and human granulopoiesis can be used to predict the performance of yet untested G-CSF schedules. However, only a single chemotherapy was considered so far.
In the present paper, we propose a comprehensive model of chemotherapy toxicity and combine it with our cell kinetic model of granulopoiesis. Major assumptions are: proportionality of cell numbers and cell loss, delayed action of chemotherapy, drug, drug-dose and cell stage specific toxicities, no interaction of drugs and higher toxicity of drugs at the first time of application. Correspondingly, chemotherapies can be characterized by a set of toxicity parameters which can be estimated by fitting the predictions of our model to clinical time series data of patients under therapy. Data were either extracted from the literature or were received from cooperating clinical study groups.
Model assumptions proved to be feasible in explaining granulotoxicity of 10 different chemotherapeutic drugs or drug-combinations applied in 33 different schedules with and without G-CSF. Risk groups of granulotoxicity were traced back to differences in toxicity parameters.
We established a comprehensive model of combined G-CSF and chemotherapy action in humans which allows us to predict and compare the outcome of alternative G-CSF schedules. We aim to apply the model in different clinical contexts to optimize and individualize G-CSF treatment.
Electronic supplementary material
The online version of this article (doi:10.1186/s12918-014-0138-7) contains supplementary material, which is available to authorized users.
Leukopenia; G-CSF; Filgrastim; Pegfilgrastim; Chemotherapy
Metabolomic responses to extreme thermal stress have recently been investigated in Drosophila melanogaster. However, a network level understanding of metabolomic responses to longer and less drastic temperature changes, which more closely reflect variation in natural ambient temperatures experienced during development and adulthood, is currently lacking. Here we use high-resolution, non-targeted metabolomics to dissect metabolomic changes in D. melanogaster elicited by moderately cool (18°C) or warm (27°C) developmental and adult temperature exposures.
We find that the temperature at which larvae are reared has a dramatic effect on metabolomic network structure measured in adults. Using network analysis, we are able to identify modules that are highly differentially expressed in response to changing developmental temperature, as well as modules whose correlation structure is strongly preserved across temperature.
Our results suggest that the effect of temperature on the metabolome provides an easily studied and powerful model for understanding the forces that influence invariance and plasticity in biological networks.
Electronic supplementary material
The online version of this article (doi:10.1186/s12918-014-0139-6) contains supplementary material, which is available to authorized users.
Drosophila melanogaster; Temperature; Metabolomics; Networks; Differential coexpression
Understanding how cells make decisions, and why they make the decisions they make, is of fundamental interest in systems biology. To address this, we study the decisions made by E. coli on which genes to express when presented with two different sugars. It is well-known that glucose, E. coli’s preferred carbon source, represses the uptake of other sugars by means of global and gene-specific mechanisms. However, less is known about the utilization of glucose-free sugar mixtures which are found in the natural environment of E. coli and in biotechnology.
Here, we combine experiment and theory to map the choices of E. coli among 6 different non-glucose carbon sources. We used robotic assays and fluorescence reporter strains to make precise measurements of promoter activity and growth rate in all pairs of these sugars. We find that the sugars can be ranked in a hierarchy: in a mixture of a higher and a lower sugar, the lower sugar system shows reduced promoter activity. The hierarchy corresponds to the growth rate supported by each sugar: the faster the growth rate, the higher the sugar on the hierarchy. The hierarchy is ‘soft’ in the sense that the lower sugar promoters are not completely repressed. Measurement of the activity of the master regulator CRP-cAMP shows that the hierarchy can be quantitatively explained based on differential activation of the promoters by CRP-cAMP. Comparing sugar system activation as a function of time in sugar pair mixtures at sub-saturating concentrations, we find cases of sequential activation, and also cases of simultaneous expression of both systems. Such simultaneous expression is not predicted by simple models of growth rate optimization, which predict only sequential activation. We extend these models by suggesting multi-objective optimization for both growing rapidly now and preparing the cell for future growth on the poorer sugar.
We find a defined hierarchy of sugar utilization, which can be quantitatively explained by differential activation by the master regulator CRP-cAMP. The present approach can be used to understand cell decisions when presented with mixtures of conditions.
Electronic supplementary material
The online version of this article (doi:10.1186/s12918-014-0133-z) contains supplementary material, which is available to authorized users.
E. coli; Carbon catabolic repression; CCR; Diauxic shift; Non-PTS sugars; Cellular decision making; cAMP; CRP; CAP
Clinical trials are the main method for evaluating safety and efficacy of medical interventions and have produced many advances in improving human health. The Women’s Health Initiative overturned a half-century of harmful practice in hormone therapy, the National Lung Screening Trial identified the first successful lung cancer screening tool and the Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial overturned decades-long assumptions. While some trials identify unforeseen safety issues or harms, many fail to demonstrate efficacy. Large trials require substantial resources; to ensure reliable outcomes, we must seek ways to improve the predictive information used as the basis of trials.
Here we demonstrate a modeling framework for linking knowledge of underlying biological mechanism to evaluate the expectation of trial outcomes. Key features include the ability to propagate uncertainty in biological mechanism to uncertainty in trial outcome and mechanisms for identifying knowledge gaps most responsible for unexpected outcomes. The framework was used to model the effect of selenium supplementation for prostate cancer prevention and parallels the Selenium and Vitamin E Cancer Prevention Trial that showed no efficacy despite suggestive data from secondary endpoints in the Nutritional Prevention of Cancer trial and found increased incidence of high-grade prostate cancer in certain subgroups.
Using machine learning methods, we identified the parameters of the model that are most predictive of trial outcome and found that the top four are directly related to the rates of reactions producing methylselenol and transporting extracellular selenium into the cell as selenide. This modeling process demonstrates how the approach can be used in advance of a large clinical trial to identify the best targets for conducting further research to reduce the uncertainty in the trial outcome.
Electronic supplementary material
The online version of this article (doi:10.1186/s12918-014-0140-0) contains supplementary material, which is available to authorized users.
Multilevel modeling; Clinical trials; Decision support; Value of information (VOI); Cancer chemoprevention
Cancer metabolism is emerging as an important focus area in cancer research. However, the in vitro cell culture conditions under which much cellular metabolism research is performed differ drastically from in vivo tumor conditions, which are characterized by variations in the levels of oxygen, nutrients like glucose, and other molecules like chemotherapeutics. Moreover, it is important to know how the diverse cell types in a tumor, including cancer stem cells that are believed to be a major cause of cancer recurrence, respond to these variations. Here, in vitro environmental perturbations designed to mimic different aspects of the in vivo environment were used to characterize how an ovarian cancer cell line and its derived, isogenic cancer stem cells metabolically respond to environmental cues.
Mass spectrometry was used to profile metabolite levels in response to in vitro environmental perturbations. Docetaxel, the chemotherapeutic used for this experiment, caused significant metabolic changes in amino acid and carbohydrate metabolism in ovarian cancer cells, but had virtually no metabolic effect on isogenic ovarian cancer stem cells. Glucose deprivation, hypoxia, and the combination thereof altered ovarian cancer cell and cancer stem cell metabolism to varying extents for the two cell types. Hypoxia had a much larger effect on ovarian cancer cell metabolism, while glucose deprivation had a greater effect on ovarian cancer stem cell metabolism. Core metabolites and pathways affected by these perturbations were identified, along with pathways that were unique to cell types or perturbations.
The metabolic responses of an ovarian cancer cell line and its derived isogenic cancer stem cells differ greatly under most conditions, suggesting that these two cell types may behave quite differently in an in vivo tumor microenvironment. While cancer metabolism and cancer stem cells are each promising potential therapeutic targets, such varied behaviors in vivo would need to be considered in the design and early testing of such treatments.
Electronic supplementary material
The online version of this article (doi:10.1186/s12918-014-0134-y) contains supplementary material, which is available to authorized users.
Metabolomics; Cancer stem cells; Ovarian cancer; Cancer metabolism; Biologically-inspired metabolic perturbations
Although much is understood about the enzymatic cascades that underlie cellular biosynthesis, comparatively little is known about the rules that determine their cellular organization. We performed a detailed analysis of the localization of E. coli GFP-tagged enzymes for cells growing exponentially.
We found that out of 857 globular enzymes, at least 219 have a discrete punctate localization in the cytoplasm and catalyze the first or the last reaction in 60% of biosynthetic pathways. A graph-theoretic analysis of E. coli’s metabolic network shows that localized enzymes, in contrast to non-localized ones, form a tree-like hierarchical structure, have a higher within-group connectivity, and are traversed by a higher number of feed-forward and feedback loops than their non-localized counterparts. A Gene Ontology analysis of these enzymes reveals an enrichment of terms related to essential metabolic functions in growing cells. Given that these findings suggest a distinct metabolic role for localization, we studied the dynamics of cellular localization of the cell wall synthesizing enzymes in B. subtilis and found that enzymes localize during exponential growth but not during stationary growth.
We conclude that active biochemical pathways inside the cytoplasm are organized spatially following a rule where their first or their last enzymes localize to effectively connect the different active pathways and thus could reflect the activity state of the cell’s metabolic network.
Electronic supplementary material
The online version of this article (doi:10.1186/s12918-014-0131-1) contains supplementary material, which is available to authorized users.
Metabolism; Enzyme localization; Fluorescent imaging; Metabolic network; Enzymatic activity; Metabolic pathways; Bacteria
One of the distinctive features of biological oscillators such as circadian clocks and cell cycles is robustness, which is the ability to resume reliable operation in the face of different types of perturbations. In a previous study, we proposed multiparameter sensitivity (MPS) as an intelligible measure of robustness to fluctuations in kinetic parameters. Analytical solutions directly connect the mechanisms and kinetic parameters to dynamic properties such as period, amplitude and their associated MPSs. Although negative feedback loops are known to be common structures in biological oscillators, analytical solutions have not been presented for a general model of negative feedback oscillators.
We present the analytical expressions for the period, amplitude and their associated MPSs for a general model of negative feedback oscillators. The analytical solutions are validated by comparing them with numerical solutions. The analytical solutions explicitly show how the dynamic properties depend on the kinetic parameters. The ratio of a threshold to the amplitude has a strong impact on the period MPS. As the ratio approaches one, the MPS increases, indicating that the period becomes more sensitive to changes in kinetic parameters. We present the first mathematical proof that the distributed time-delay mechanism contributes to making the oscillation period robust to parameter fluctuations. The MPS decreases with an increase in the feedback loop length (i.e., the number of molecular species constituting the feedback loop).
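The abstract does not state the MPS formula, so the following is only a minimal numerical sketch under an assumed definition: MPS taken as the sum of squared single-parameter relative sensitivities, d ln q / d ln p_i, of a dynamic property q such as the period. The function toy_period is a hypothetical stand-in for a period expression and is not the oscillator model analyzed in the paper.

```python
import math

def log_sensitivities(f, params, h=1e-4):
    """Central finite-difference estimate of d ln f / d ln p_i for each parameter."""
    sens = []
    for i, p in enumerate(params):
        up = list(params)
        down = list(params)
        up[i] = p * math.exp(h)      # perturb parameter i upward in log space
        down[i] = p * math.exp(-h)   # and downward by the same log step
        sens.append((math.log(f(up)) - math.log(f(down))) / (2 * h))
    return sens

def multiparameter_sensitivity(f, params, h=1e-4):
    """ASSUMED definition: sum of squared single-parameter relative sensitivities."""
    return sum(s * s for s in log_sensitivities(f, params, h))

def toy_period(params):
    """Hypothetical period expression, used only to exercise the functions above."""
    k1, k2 = params
    return 2 * math.pi / math.sqrt(k1 * k2)

# Each parameter has relative sensitivity -0.5, so the sum of squares is 0.5.
mps = multiparameter_sensitivity(toy_period, [1.5, 0.8])
```

A smaller value under this definition means that the property changes less when all kinetic parameters fluctuate together, which is the sense in which the abstract speaks of a robust period.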
Since a general model of negative feedback oscillators was employed, the results shown in this paper are expected to be true for many biological oscillators. This study strongly supports the hypothesis that phosphorylations of clock proteins contribute to the robustness of circadian rhythms. The analytical solutions give synthetic biologists some clues to design gene oscillators with a robust and desired period.
Transcriptional regulation of gene expression is usually accomplished by multiple interacting transcription factors (TFs). Therefore, it is crucial to understand the precise cooperative interactions among TFs. Various kinds of experimental data including ChIP-chip, TF binding site (TFBS), gene expression, TF knockout and protein-protein interaction data have been used to identify cooperative TF pairs in existing methods. Nucleosome occupancy data have not yet been used for this purpose, even though several studies have revealed an association between nucleosomes and TFBSs.
In this study, we developed a novel method to infer the cooperativity between two TFs by integrating the TF-gene documented regulation, TFBS and nucleosome occupancy data. TF-gene documented regulation and TFBS data were used to determine the target genes of a TF, and the genome-wide nucleosome occupancy data was used to assess the nucleosome occupancy on TFBSs. Our method identifies cooperative TF pairs based on two biologically plausible assumptions. If two TFs cooperate, then (i) they should have a significantly higher number of common target genes than random expectation and (ii) their binding sites (in the promoters of their common target genes) should tend to be co-depleted of nucleosomes in order to make these binding sites simultaneously accessible to TF binding. Each TF pair is given a cooperativity score by our method. The higher the score, the more likely the TF pair is to be cooperative. Finally, a list of 27 cooperative TF pairs has been predicted by our method. Among these 27 TF pairs, 19 pairs are also predicted by existing methods. The other 8 pairs are novel cooperative TF pairs predicted by our method. The biological relevance of these 8 novel cooperative TF pairs is justified by the existence of protein-protein interactions and co-annotation in the same MIPS functional categories. Moreover, we adopted three performance indices to compare our predictions with 11 existing methods' predictions. We show that our method performs better than these 11 existing methods in identifying cooperative TF pairs in yeast. Finally, the cooperative TF network constructed from the 27 predicted cooperative TF pairs shows that our method has the power to find cooperative TF pairs of different biological processes.
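Assumption (i) can be illustrated with a one-sided hypergeometric test on the overlap of two target-gene sets. This is a sketch of one common way to quantify "more common targets than random expectation"; the abstract does not say which test the method uses, the function name and arguments are hypothetical, and the nucleosome co-depletion criterion (ii) is not modeled here.

```python
from math import comb

def overlap_p_value(n_genes, n_targets_a, n_targets_b, n_common):
    """P(overlap >= n_common) when the targets of TF B are drawn at random
    from n_genes genes, of which n_targets_a are targets of TF A."""
    total = comb(n_genes, n_targets_b)
    upper = min(n_targets_a, n_targets_b)
    tail = 0
    for i in range(n_common, upper + 1):
        # comb returns 0 when the draw is impossible, so no extra bounds check is needed
        tail += comb(n_targets_a, i) * comb(n_genes - n_targets_a, n_targets_b - i)
    return tail / total

# Toy numbers, not taken from the paper: 10 genes, 5 targets each, all 5 shared.
p_full_overlap = overlap_p_value(10, 5, 5, 5)
```

A small p-value indicates that the two TFs share more targets than chance would explain; a cooperativity score would then also need to reflect criterion (ii).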
Our method is effective in identifying cooperative TF pairs in yeast. Many of our predictions are validated by the literature, and our method outperforms 11 existing methods. We believe that our study will help biologists to understand the mechanisms of transcriptional regulation in eukaryotic cells.
transcription factor cooperativity; nucleosome; transcription factor binding site; yeast
The prediction of small complexes (consisting of two or three distinct proteins) is an important and challenging subtask in protein complex prediction from protein-protein interaction (PPI) networks. The prediction of small complexes is especially susceptible to noise (missing or spurious interactions) in the PPI network, while smaller groups of proteins are likelier to take on topological characteristics of real complexes by chance.
We propose a two-stage approach, SSS and Extract, for discovering small complexes. First, the PPI network is weighted by size-specific supervised weighting (SSS), which integrates heterogeneous data and their topological features with an overall topological isolatedness feature. SSS uses a naive-Bayes maximum-likelihood model to weight the edges with two posterior probabilities: that of being in a small complex, and of being in a large complex. The second stage, Extract, analyzes the SSS-weighted network to extract putative small complexes and scores them by cohesiveness-weighted density, which incorporates both small-co-complex and large-co-complex weights of edges within and surrounding the complexes.
We test our approach on the prediction of yeast and human small complexes, and demonstrate that our approach attains higher precision and recall than some popular complex prediction algorithms. Furthermore, our approach generates a greater number of novel predictions with higher quality in terms of functional coherence.
protein complex; protein interaction; data integration; machine learning
Progress in systems biology offers sophisticated approaches toward a comprehensive understanding of biological systems. Yet, computational analyses are held back due to difficulties in determining suitable model parameter values from experimental data which naturally are subject to biological fluctuations. The data may also be corrupted by experimental uncertainties and sometimes do not contain all information regarding variables that cannot be measured for technical reasons.
We show here a streamlined approach for the construction of a coarse model that allows us to set up dynamic models with minimal input information. The approach, which we name the Unity (U)-system, uses a hybrid between a pure mass action system and a generalized mass action (GMA) system in the framework of biochemical systems theory (BST), with rate constants of 1, normal kinetic orders of 1, and kinetic orders of -0.5 and 0.5 for inhibitory and activating effects, respectively. The U-system model does not necessarily fit all data well but is often sufficient for predicting the behavior of metabolites that cannot be measured simultaneously, identifying inconsistencies between experimental data and the assumed underlying pathway structure, and predicting system responses to the modification of a gene or enzyme. The U-system approach was validated with small, generic systems and implemented to model a large-scale metabolic reaction network of a higher plant, Arabidopsis. The dynamic behaviors obtained by predictive simulations agreed with the available metabolomic time-series data, identified probable errors in the experimental datasets, and estimated the probable behavior of unmeasurable metabolites in a qualitative manner. The model could also predict metabolic responses of Arabidopsis with altered network structures due to genetic modification.
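The parameter convention described above (rate constant 1, substrate order 1, activator order 0.5, inhibitor order -0.5) can be sketched on a toy pathway. The three-reaction network, the species names X0, X1, X2, the dictionary layout and the explicit Euler integrator are all assumptions made for illustration; they are not the Arabidopsis model or the authors' implementation.

```python
def flux(rxn, x):
    """Power-law flux with U-system defaults: rate constant 1, orders 1 / 0.5 / -0.5."""
    v = 1.0
    for s in rxn["substrates"]:
        v *= x[s]                # normal kinetic order 1
    for a in rxn.get("activators", ()):
        v *= x[a] ** 0.5         # activating effect
    for i in rxn.get("inhibitors", ()):
        v *= x[i] ** -0.5        # inhibitory effect
    return v

def simulate(reactions, x0, fixed=(), dt=0.01, steps=5000):
    """Explicit Euler integration; species listed in `fixed` are held constant."""
    x = dict(x0)
    for _ in range(steps):
        dx = {name: 0.0 for name in x}
        for rxn in reactions:
            v = flux(rxn, x)
            for s in rxn["substrates"]:
                dx[s] -= v
            for p in rxn["products"]:
                dx[p] += v
        for name in x:
            if name not in fixed:
                x[name] += dt * dx[name]
    return x

# Hypothetical pathway: X0 -> X1 -> X2 -> (out), with X2 inhibiting the first step.
reactions = [
    {"substrates": ["X0"], "products": ["X1"], "inhibitors": ["X2"]},
    {"substrates": ["X1"], "products": ["X2"]},
    {"substrates": ["X2"], "products": []},
]
final = simulate(reactions, {"X0": 1.0, "X1": 2.0, "X2": 0.5}, fixed={"X0"})
```

Because every parameter is fixed in advance, only the network structure has to be supplied, which is what makes such a model usable before any kinetic data are available.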
The U-system approach can effectively predict metabolic behaviors and responses based on the structure of an assumed metabolic reaction network. Thus, it can serve as a useful first-line tool for data analysis and model diagnostics, and can aid the design of next-step experiments.
Arabidopsis thaliana; kinetic parameter; mathematical modeling; metabolic reaction network; metabolomics
A quantitative understanding of interactions between transcription factors (TFs) and their DNA binding sites is key to the rational design of gene regulatory networks. Recent advances in high-throughput technologies have enabled high-resolution measurements of protein-DNA binding affinity. Importantly, such experiments revealed the complex nature of TF-DNA interactions, whereby the effects of nucleotide changes on the binding affinity were observed to be context dependent. A systematic method to give high-quality estimates of such complex affinity landscapes is, thus, essential to the control of gene expression and the advance of synthetic biology.
Here, we propose a two-round prediction method that is based on support vector regression (SVR) with weighted degree (WD) kernels. In the first round, a WD kernel with shifts and mismatches is used with SVR to detect the importance of subsequences with different lengths at different positions. The subsequences identified as important in the first round are then fed into a second WD kernel to fit the experimentally measured affinities. To our knowledge, this is the first attempt to increase the accuracy of the affinity prediction by applying two rounds of string kernels and by identifying a small number of crucial k-mers. The proposed method was tested by predicting the binding affinity landscape of Gcn4p in Saccharomyces cerevisiae using datasets from HiTS-FLIP. Our method explicitly identified important subsequences and showed significant performance improvements when compared with other state-of-the-art methods. Based on the identified important subsequences, we discovered two surprisingly stable 10-mers and one sensitive 10-mer that had not been reported before. Further tests on four other TFs in S. cerevisiae demonstrated the generality of our method.
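To make the kernel concrete, the sketch below computes a plain weighted degree kernel between two equal-length sequences. It is a simplified illustration rather than the method of the study: it omits the shifts and mismatches mentioned above, and it assumes the commonly used weights beta_d = 2(D - d + 1) / (D(D + 1)). The function name and the max_degree parameter are hypothetical.

```python
def wd_kernel(x, y, max_degree=3):
    """Plain weighted degree kernel between two equal-length sequences.

    Counts the positions at which the length-d substrings of x and y agree,
    for d = 1..max_degree, and sums the counts with weights that decrease
    linearly in d. Shifts and mismatches are NOT handled in this sketch.
    """
    if len(x) != len(y):
        raise ValueError("the WD kernel is defined for sequences of equal length")
    big_d = max_degree
    total = 0.0
    for d in range(1, big_d + 1):
        # Assumed weighting scheme; the weights sum to 1 over d = 1..D.
        beta = 2.0 * (big_d - d + 1) / (big_d * (big_d + 1))
        matches = sum(
            1 for i in range(len(x) - d + 1) if x[i:i + d] == y[i:i + d]
        )
        total += beta * matches
    return total
```

A kernel matrix built from such pairwise values could then be passed to any SVR implementation that accepts precomputed kernels. The second round described above would restrict the comparison to the subsequences found to be important in the first round.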
We proposed in this paper a two-round method to quantitatively model the DNA binding affinity landscape. Since the ability to modify genetic parts to fine-tune gene expression rates is crucial to the design of biological systems, such a tool may play an important role in the success of synthetic biology going forward.
binding affinity; protein-DNA interaction; support vector regression; weighted degree kernel
Candida albicans has emerged as an important model organism for the study of infectious disease. Using simultaneously quantified, high-throughput time-course transcriptomics data, this study constructed host-pathogen interspecies interaction networks between C. albicans and zebrafish during the adhesion, invasion, and damage stages. Given that iron and glucose have been identified as crucial resources during the infection process between C. albicans and zebrafish, we focused on the construction of the interspecies networks associated with them. Furthermore, a randomization technique was proposed to identify differentially regulated proteins that are statistically significant for the three infection stages. The behaviors of the highly connected or differentially regulated proteins identified from the resulting networks were further investigated.
"Robustness" is an important system property that measures the ability of a system to tolerate intrinsic perturbations in a dynamic network. This characteristic provides a systematic and quantitative view of the dynamics of iron and glucose competition in terms of the interspecies interaction networks. Here, we further estimated the robustness of our constructed interspecies interaction networks for the three infection stages.
The constructed networks and robustness analysis provided significant insight into dynamic interactions related to iron and glucose competition during infection and enabled us to quantify the system's ability to tolerate intrinsic perturbations during iron and glucose competition throughout the three infection stages. Moreover, the networks help elucidate the offensive and defensive mechanisms of C. albicans and zebrafish during their competition for iron and glucose. Our proposed method can be easily extended to identify other such networks involved in the competition for essential resources during infection.
Cell population control allows for the maintenance of a specific cell population density. In this study, we use the lysis gene BBa_K117000 from the Registry of Standard Biological Parts, established by MIT, to lyse Escherichia coli (E. coli). The lysis gene is regulated by a synthetic genetic lysis circuit, using an inducer-regulated promoter-RBS component. To make the design easier, a systematic approach is needed for a genetic lysis circuit to achieve control of cell population density.
First, the lytic ability of the constructed genetic lysis circuit is described by the relationship between the promoter-RBS components and inducer concentration in a steady-state model. Then, three types of promoter-RBS libraries are established. Finally, according to design specifications, a systematic design approach is proposed to provide synthetic biologists with a prescribed I/O response by selecting a proper set of promoter-RBS components in combination with suitable inducer concentrations, within a feasible range.
This study provides an important systematic design method for the development of next-generation synthetic gene circuits, from component library construction to genetic circuit assembly. In the future, when libraries are more complete, more precise cell density control can be achieved.
Genetic lysis circuit; Cell population control; Promoter-RBS library
Gene Ontology (GO) provides rich information and a convenient way to study gene functional similarity, and it has been used successfully in various applications. However, existing GO-based similarity measures are limited because each considers only a subset of the information in GO. An appropriate integration of the existing measures that takes more of the information in GO into account is needed.
We propose a novel integrative measure called InteGO2 that automatically selects appropriate seed measures and then integrates them using a metaheuristic search method. The experimental results show that InteGO2 significantly improves the performance of gene similarity measurement in human, Arabidopsis and yeast on both the molecular function and biological process GO categories.
InteGO2 computes gene-to-gene similarities more accurately than tested existing measures and has high robustness. The supplementary document and software are available at http://mlg.hit.edu.cn:8082/.
Gene ontology; Semantic similarity; Integrative approach
Defining a measure for regulatory similarity (RS) of two genes is an important step toward identifying co-regulated genes. To date, transcription factor binding sites (TFBSs) have been widely used to measure the RS of two genes because the binding of transcription factors (TFs) to TFBSs in promoters is the most crucial and best understood step in gene regulation. However, existing TFBS-based RS measures treat the relation of a TFBS to a gene as a Boolean (either 'presence' or 'absence') without utilizing the information on TFBS locations in promoters.
Functional TFBSs of many TFs in yeast are known to have a strong positional preference to occur in a small region in the promoters. This biological knowledge prompts us to develop a novel RS measure that exploits the TFBS location information. The performances of different RS measures are evaluated by the fraction of gene pairs that are co-regulated (validated by literature evidence) by at least one common TF under different RS scores. The experimental results show that the proposed RS measure is the best co-regulation indicator among the six compared RS measures. In addition, the co-regulated genes identified by the proposed RS measure are also shown to be able to benefit three co-regulation-based applications: detecting gene co-function, gene co-expression and protein-protein interactions.
The proposed RS measure provides a good indicator for gene co-regulation. Moreover, its good performance reveals the importance of the location information in TFBS-based RS measures.
Metabolic reactions have been extensively studied and compiled over the last century. These have provided a theoretical base to implement models, simulations of which are used to identify drug targets and optimize metabolic throughput at a systemic level. While tools for the perturbation of metabolic networks are available, their applications are limited because they require varied dependencies and often a commercial platform for full functionality. We have developed MetaNET, an open-source, user-friendly, platform-independent and web-accessible resource consisting of several pre-defined workflows for metabolic network analysis.
MetaNET is a web-accessible platform that incorporates a range of functions which can be combined to produce different simulations related to metabolic networks. These include (i) optimization of an objective function for the wild-type strain and gene/catalyst/reaction knock-out/knock-down analysis using flux balance analysis, (ii) flux variability analysis, (iii) chemical species participation, (iv) identification of cycles and extreme paths, and (v) choke point reaction analysis to facilitate identification of potential drug targets. The platform is built using custom scripts along with the open-source Galaxy workflow and Systems Biology Research Tool as components. Pre-defined workflows are available for common processes, and an exhaustive list of over 50 functions is provided for user-defined workflows.
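As a rough illustration of item (v), the sketch below flags choke point reactions in a toy network. It assumes the common definition of a choke point as a reaction that is the only producer or the only consumer of some metabolite. The data layout and function name are hypothetical and do not reflect MetaNET's actual implementation or interface.

```python
def choke_points(reactions):
    """Return the names of reactions that uniquely produce or consume a metabolite.

    `reactions` maps a reaction name to a (substrates, products) pair,
    each an iterable of metabolite names. Reactions are treated as
    irreversible in the written direction (an assumption of this sketch).
    """
    producers, consumers = {}, {}
    for name, (substrates, products) in reactions.items():
        for met in substrates:
            consumers.setdefault(met, set()).add(name)
        for met in products:
            producers.setdefault(met, set()).add(name)

    chokes = set()
    for table in (producers, consumers):
        for rxns in table.values():
            if len(rxns) == 1:      # sole producer or sole consumer
                chokes |= rxns
    return chokes


# Toy network: two parallel routes from A to B, then a single step B -> C.
toy_network = {
    "R1": (["A"], ["B"]),
    "R2": (["A"], ["B"]),
    "R3": (["B"], ["C"]),
}
```

In the toy network only R3 is flagged, because blocking it leaves B with no consumer and C with no producer, whereas R1 and R2 can substitute for each other.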
MetaNET, available at http://metanet.osdd.net, provides a rich, user-friendly interface for the analysis of genome-scale metabolic networks under various genetic and environmental conditions. The framework allows users to store previous results, repeat analyses, share results with other users over the internet, and run different tools simultaneously using pre-defined workflows and user-created custom workflows.
Flux balance analysis; Metabolic network; Systems biology; in silico gene knock-out; Perturbation analysis
Flux analysis methods lie at the core of Metabolic Engineering (ME), providing methods for phenotype simulation that allow the determination of flux distributions under different conditions. Although many constraint-based modeling software tools have been developed and published, none provides a free user-friendly application that makes available the full portfolio of flux analysis methods.
This work presents Constraint-based Flux Analysis (CBFA), an open-source software application for flux analysis in metabolic models that implements several methods for phenotype prediction, allowing users to define constraints associated with measured fluxes and/or flux ratios, together with environmental conditions (e.g. media) and reaction/gene knockouts. CBFA identifies the set of applicable methods based on the constraints defined from user inputs, encompassing algebraic and constraint-based simulation methods. The integration of CBFA within the OptFlux framework for ME enables the utilization of different model formats and standards and the integration with complementary methods for phenotype simulation and visualization of results.
A general-purpose and flexible application is proposed that is independent of the origin of the constraints defined for a given simulation. The aim is to provide a simple-to-use software tool focused on the application of several flux prediction methods.
Electronic supplementary material
The online version of this article (doi:10.1186/s12918-014-0123-1) contains supplementary material, which is available to authorized users.
Constraint-based modeling; Metabolic Flux analysis; Metabolic engineering; Open-source software
Metabolic network models describing the biochemical reaction network and material fluxes inside microorganisms open interesting routes for the model-based optimization of bioprocesses. Dynamic metabolic flux analysis (dMFA) has lately been studied as an extension of regular metabolic flux analysis (MFA), rendering a dynamic view of the fluxes, also in non-stationary conditions. Recent dMFA implementations suffer from some drawbacks, though. More specifically, the fluxes are not estimated as specific fluxes, which are more biologically relevant. Also, the flux profiles are not smooth, and additional constraints like, e.g., irreversibility constraints on the fluxes, cannot be taken into account. Finally, in all previous methods, a basis for the null space of the stoichiometric matrix, i.e., which set of free fluxes is used, needs to be chosen. This choice is not trivial, and has a large influence on the resulting estimates.
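The dependence on the choice of free fluxes can be seen in a small example. The sketch below computes one null-space basis of a stoichiometric matrix S by exact Gaussian elimination, assuming the usual balance constraint S v = 0 on internal metabolites. The left-to-right pivoting rule, the function name, and the toy network are assumptions made for illustration; a different pivoting order would yield a different, equally valid set of free fluxes.

```python
from fractions import Fraction


def null_space_basis(S):
    """Return (free_columns, basis) with S @ k == 0 for every basis vector k.

    Uses reduced row echelon form with exact rational arithmetic. Pivot
    columns are chosen left to right, so the remaining columns play the
    role of the free fluxes.
    """
    rows = [[Fraction(v) for v in row] for row in S]
    n_cols = len(rows[0])
    pivots, r = [], 0
    for c in range(n_cols):
        if r == len(rows):
            break
        pivot_row = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot_row is None:
            continue                      # no pivot here: column c stays free
        rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
        pivot_value = rows[r][c]
        rows[r] = [v / pivot_value for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1

    free = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for f in free:
        vec = [Fraction(0)] * n_cols
        vec[f] = Fraction(1)              # set one free flux to 1, the others to 0
        for row_idx, p in enumerate(pivots):
            vec[p] = -rows[row_idx][f]    # dependent fluxes follow from the RREF
        basis.append(vec)
    return free, basis


# Toy network: v1: -> A, v2: A -> B, v3: B ->, v4: A ->  (rows: A, B)
S_toy = [[1, -1, 0, -1],
         [0, 1, -1, 0]]
```

For the toy network the elimination selects v3 and v4 as free fluxes, and every balanced flux vector is a combination of the two returned basis vectors.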
In this work, a new methodology based on a B-spline parameterization of the fluxes is presented. Because of the high degree of non-linearity due to this parameterization, an incremental knot insertion strategy has been devised, resulting in a sequence of non-linear dynamic optimization problems. These are solved using state-of-the-art dynamic optimization methods and tools, i.e., orthogonal collocation, an interior-point optimizer and automatic differentiation. Also, a procedure to choose an optimal basis for the null space of the stoichiometric matrix is described, discarding the need to make a choice beforehand. The proposed methodology is validated on two simulated case studies: (i) a small-scale network with 7 fluxes, to illustrate the operation of the algorithm, and (ii) a medium-scale network with 68 fluxes, to show the algorithm’s capabilities for a realistic network. The results show an accurate correspondence to the reference fluxes used to simulate the measurements, both in a theoretically ideal setting with no experimental noise, and in a realistic noise setting.
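The B-spline parameterization mentioned above can be sketched as follows: a flux profile is written as a weighted sum of B-spline basis functions, which are evaluated here with the Cox-de Boor recursion. The spline order, the knot vector, and the function names are assumptions made for this example. The incremental knot insertion and the dynamic optimization described in the text are not implemented.

```python
def bspline_basis(i, k, knots, t):
    """Value at t of the i-th B-spline basis function of order k (degree k - 1)."""
    if k == 1:
        # Piecewise-constant base case on the half-open interval [knots[i], knots[i+1]).
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left = right = 0.0
    span_left = knots[i + k - 1] - knots[i]
    if span_left > 0:                     # skip terms with repeated knots (0/0 := 0)
        left = (t - knots[i]) / span_left * bspline_basis(i, k - 1, knots, t)
    span_right = knots[i + k] - knots[i + 1]
    if span_right > 0:
        right = (knots[i + k] - t) / span_right * bspline_basis(i + 1, k - 1, knots, t)
    return left + right


def flux_profile(coeffs, k, knots, t):
    """Flux value at time t as a weighted sum of B-spline basis functions."""
    if len(coeffs) != len(knots) - k:
        raise ValueError("expected len(knots) - k spline coefficients")
    return sum(c * bspline_basis(i, k, knots, t) for i, c in enumerate(coeffs))


# Assumed example: quadratic splines (order 3) on [0, 10) with one interior knot at t = 5.
example_knots = [0.0, 0.0, 0.0, 5.0, 10.0, 10.0, 10.0]
```

With this knot vector a flux is described by four coefficients. Inserting another interior knot adds one coefficient and allows the profile to change shape more freely, which is the degree of freedom an incremental knot insertion strategy would adjust. Note that the half-open base case makes the sketch valid for t in [0, 10) only.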
Because no input is required beyond the metabolic reaction network and the measurements, the resulting algorithm is a systematic, integrated and accurate methodology for dynamic metabolic flux analysis that can be run online in real time if necessary.
Electronic supplementary material
The online version of this article (doi:10.1186/s12918-014-0132-0) contains supplementary material, which is available to authorized users.
Dynamic metabolic flux analysis; B-spline parameterizations; Non-linear optimization; Parameter estimation