Results 1-6 (6)

1.  Improving Breast Cancer Survival Analysis through Competition-Based Multidimensional Modeling 
PLoS Computational Biology  2013;9(5):e1003047.
Breast cancer is the most common malignancy in women and is responsible for hundreds of thousands of deaths annually. As with most cancers, it is a heterogeneous disease and different breast cancer subtypes are treated differently. Understanding the difference in prognosis for breast cancer based on its molecular and phenotypic features is one avenue for improving treatment by matching the proper treatment with molecular subtypes of the disease. In this work, we employed a competition-based approach to modeling breast cancer prognosis using large datasets containing genomic and clinical information and an online real-time leaderboard program used to speed feedback to the modeling team and to encourage each modeler to work towards achieving a higher ranked submission. We find that machine learning methods combined with molecular features selected based on expert prior knowledge can improve survival predictions compared to current best-in-class methodologies and that ensemble models trained across multiple user submissions systematically outperform individual models within the ensemble. We also find that model scores are highly consistent across multiple independent evaluations. This study serves as the pilot phase of a much larger competition open to the whole research community, with the goal of understanding general strategies for model optimization using clinical and molecular profiling data and providing an objective, transparent system for assessing prognostic models.
Author Summary
We developed an extensible software framework for sharing molecular prognostic models of breast cancer survival in a transparent collaborative environment and subjecting each model to automated evaluation using objective metrics. The computational framework presented in this study, our detailed post-hoc analysis of hundreds of modeling approaches, and the use of a novel cutting-edge data resource together represent one of the largest-scale systematic studies to date assessing the factors influencing accuracy of molecular-based prognostic models in breast cancer. Our results demonstrate the ability to infer prognostic models with accuracy on par with or greater than that of previously reported studies, with significant performance improvements from state-of-the-art machine learning approaches trained on clinical covariates. Our results also demonstrate the difficulty of incorporating molecular data to achieve substantial performance improvements over clinical covariates alone. However, improvement was achieved by combining clinical feature data with intelligent selection of important molecular features based on domain-specific prior knowledge. We observe that ensemble models aggregating the information across many diverse models achieve among the highest scores of all models and systematically outperform individual models within the ensemble, suggesting a general strategy for leveraging the wisdom of crowds to develop robust predictive models.
PMCID: PMC3649990  PMID: 23671412
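The abstract above describes scoring survival models and combining many submissions into an ensemble. The following is a minimal sketch of that idea, not the study's actual pipeline: it assumes a concordance index as the scoring metric and rank averaging as the ensembling rule, neither of which is stated in the abstract. The function names and toy inputs are hypothetical.

```python
from itertools import combinations


def concordance_index(times, events, risk):
    """Fraction of comparable patient pairs that the risk score orders correctly.

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    risk   : predicted risk (higher means shorter expected survival)
    """
    concordant, comparable = 0.0, 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient a has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the earlier time is an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk[a] > risk[b]:
            concordant += 1.0
        elif risk[a] == risk[b]:
            concordant += 0.5  # ties in predicted risk count as half
    return concordant / comparable


def to_ranks(scores):
    """Convert raw scores to ranks (0 = lowest score)."""
    order = sorted(range(len(scores)), key=lambda k: scores[k])
    ranks = [0] * len(scores)
    for rank, k in enumerate(order):
        ranks[k] = rank
    return ranks


def rank_average_ensemble(model_scores):
    """Assumed ensembling rule: average each patient's rank across models."""
    rank_lists = [to_ranks(s) for s in model_scores]
    n_models = len(rank_lists)
    n_patients = len(rank_lists[0])
    return [sum(r[k] for r in rank_lists) / n_models for k in range(n_patients)]
```

Rank averaging is used here because it makes models with different output scales comparable before they are combined; other aggregation rules would fit the same description.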
2.  Multi-Scale Stochastic Simulation of Diffusion-Coupled Agents and Its Application to Cell Culture Simulation 
PLoS ONE  2011;6(12):e29298.
Many biological systems consist of multiple cells that interact by secretion and binding of diffusing molecules, thus coordinating responses across cells. Techniques for simulating systems coupling extracellular and intracellular processes are very limited. Here we present an efficient method to stochastically simulate diffusion processes, which at the same time allows synchronization between internal and external cellular conditions through a modification of Gillespie's chemical reaction algorithm. Individual cells are simulated as independent agents, and each cell accurately reacts to changes in its local environment affected by diffusing molecules. Such a simulation provides time-scale separation between the intra-cellular and extra-cellular processes. We use our methodology to study how human monocyte-derived dendritic cells alert neighboring cells about viral infection using diffusing interferon molecules. A subpopulation of the infected cells reacts early to the infection and secretes interferon into the extra-cellular medium, which helps activate other cells. Findings predicted by our simulation and confirmed by experimental results suggest that the early activation is largely independent of the fraction of infected cells and is thus both sensitive and robust. The concordance with the experimental results supports the value of our method for overcoming the challenges of accurately simulating multiscale biological signaling systems.
PMCID: PMC3244460  PMID: 22216238
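The method in the abstract above is described as a modification of Gillespie's chemical reaction algorithm. For orientation, here is a minimal sketch of the standard, unmodified algorithm applied to a single species with constant production and first-order degradation. It does not include the diffusion coupling or agent synchronization that the paper adds; the function name, parameters, and reaction system are illustrative assumptions.

```python
import math
import random


def gillespie_birth_death(k_prod, k_deg, x0, t_end, seed=0):
    """Standard Gillespie simulation of  0 -> X (rate k_prod),  X -> 0 (rate k_deg * x).

    Returns a list of (time, molecule_count) pairs, starting at (0.0, x0).
    """
    rng = random.Random(seed)
    t, x = 0.0, x0
    trajectory = [(t, x)]
    while True:
        # Propensity of each reaction in the current state.
        a_prod = k_prod
        a_deg = k_deg * x
        a_total = a_prod + a_deg
        if a_total == 0:
            break  # no reaction can fire any more
        # Waiting time to the next reaction is exponential with rate a_total.
        t += -math.log(1.0 - rng.random()) / a_total
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_prod:
            x += 1
        else:
            x -= 1
        trajectory.append((t, x))
    return trajectory
```

A call such as `gillespie_birth_death(10.0, 1.0, 0, 50.0)` produces one stochastic trajectory; repeating it with different seeds gives the cell-to-cell spread that a deterministic rate equation would average away.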
3.  Role of Cell-to-Cell Variability in Activating a Positive Feedback Antiviral Response in Human Dendritic Cells 
PLoS ONE  2011;6(2):e16614.
In the first few hours following Newcastle disease viral infection of human monocyte-derived dendritic cells, the induction of IFNB1 is extremely low and the secreted type I interferon response is below the detection limit of ELISA. However, many interferon-induced genes are activated at this time, for example DDX58 (RIGI), which in response to viral RNA induces IFNB1. We investigated whether the early induction of IFNB1 in only a small percentage of infected cells leads to low-level IFN secretion that then induces IFN-responsive genes in all cells. We developed an agent-based mathematical model to explore the IFNB1 and DDX58 temporal dynamics. Simulations showed that a small number of early responder cells provide a mechanism for efficient and controlled activation of the DDX58-IFNB1 positive feedback loop. The model predicted distributions of single cell responses that were confirmed by single cell mRNA measurements. The results suggest that large cell-to-cell variation plays an important role in the early innate immune response, and that the variability is essential for the efficient activation of the IFNB1-based feedback loop.
PMCID: PMC3035661  PMID: 21347441
4.  Plato's Cave Algorithm: Inferring Functional Signaling Networks from Early Gene Expression Shadows 
PLoS Computational Biology  2010;6(6):e1000828.
Improving the ability to reverse engineer biochemical networks is a major goal of systems biology. Lesions in signaling networks lead to alterations in gene expression, which in principle should allow network reconstruction. However, the information about the activity levels of signaling proteins conveyed in overall gene expression is limited by the complexity of gene expression dynamics and of regulatory network topology. Two observations provide the basis for overcoming this limitation: (a) genes induced without de-novo protein synthesis (early genes) show a linear accumulation of product in the first hour after the change in the cell's state; (b) the signaling components in the network largely function in the linear range of their stimulus-response curves. Therefore, unlike most genes or most time points, expression profiles of early genes at an early time point provide direct biochemical assays that represent the activity levels of upstream signaling components. Such expression data provide the basis for an efficient algorithm (Plato's Cave algorithm; PLACA) to reverse engineer functional signaling networks. Unlike conventional reverse engineering algorithms that use steady state values, PLACA uses stimulated early gene expression measurements associated with systematic perturbations of signaling components, without measuring the signaling components themselves. Besides the reverse engineered network, PLACA also identifies the genes detecting the functional interaction, thereby facilitating validation of the predicted functional network. Using simulated datasets, the algorithm is shown to be robust to experimental noise. Using experimental data obtained from gonadotropes, PLACA reverse engineered the interaction network of six perturbed signaling components. The network recapitulated many known interactions and identified novel functional interactions that were validated by further experiments.
PLACA uses the results of experiments that are feasible for any signaling network to predict the functional topology of the network and to identify novel relationships.
Author Summary
Elucidating the biochemical interactions in living cells is essential to understanding their behavior under various external conditions. Some of these interactions occur between signaling components with many active states, and their activity levels may be difficult to measure directly. However, most methods to reverse engineer interaction networks rely on measuring gene activity at steady state under various cellular stimuli. Such gene measurements therefore ignore the intermediate effects of signaling components, and cannot reliably convey the interactions between the signaling components themselves. We propose using the changes in activity of early genes shortly after the stimulus to infer the functional interactions between the unmeasured signaling components. The change in expression in such genes at these times is directly and linearly affected by the signaling components, since there is insufficient time for other genes to be transcribed and interfere with the early genes' expression. We present an algorithm that uses such measurements to reverse engineer the functional interaction network between signaling components, and also provides a means for testing these predictions. The algorithm therefore uses feasible experiments to reconstruct functional networks. We applied the algorithm to experimental measurements and uncovered known interactions, as well as novel interactions that were then confirmed experimentally.
PMCID: PMC2891706  PMID: 20585619
5.  Stochastic Analysis of the SOS Response in Escherichia coli 
PLoS ONE  2009;4(5):e5363.
DNA damage in Escherichia coli evokes a response mechanism called the SOS response. The genetic circuit of this mechanism includes the genes recA and lexA, which regulate each other via a mixed feedback loop involving transcriptional regulation and protein-protein interaction. Under normal conditions, recA is transcriptionally repressed by LexA, which also functions as an auto-repressor. In the presence of DNA damage, RecA proteins recognize stalled replication forks and participate in the DNA repair process. Under these conditions, RecA marks LexA for fast degradation. Generally, such mixed feedback loops are known to exhibit either bi-stability or a single steady state. However, when the dynamics of the SOS system following DNA damage were recently studied in single cells, ordered peaks were observed in the promoter activity of both genes (Friedman et al., 2005, PLoS Biol. 3(7):e238). This surprising phenomenon was masked in previous studies of cell populations. Previous attempts to explain these results added further genes to the system and deployed complex deterministic mathematical models that were only partially successful in explaining the results.
Methodology/Principal Findings
Here we apply stochastic methods, which are better suited for dynamic simulations of single cells. We show that a simple model, involving only the basic components of the circuit, is sufficient to explain the peaks in the promoter activities of recA and lexA. Notably, deterministic simulations of the same model do not produce peaks in the promoter activities.
We conclude that the double negative mixed feedback loop with auto-repression accounts for the experimentally observed peaks in the promoter activities. In addition to explaining the experimental results, this result shows that including additional regulations in a mixed feedback loop may dramatically change the dynamic functionality of this regulatory module. Furthermore, our results suggest that stochastic fluctuations strongly affect the qualitative behavior of important regulatory modules even under biologically relevant conditions, thus emphasizing the importance of stochastic analysis of regulatory circuits.
PMCID: PMC2675100  PMID: 19424504
6.  Regulation of gene expression by small non-coding RNAs: a quantitative view 
The importance of post-transcriptional regulation by small non-coding RNAs has recently been recognized in both pro- and eukaryotes. Small RNAs (sRNAs) regulate gene expression post-transcriptionally by base pairing with the mRNA. Here we use dynamical simulations to characterize this regulation mode in comparison to transcriptional regulation mediated by protein–DNA interaction and to post-translational regulation achieved by protein–protein interaction. We show quantitatively that regulation by sRNA is advantageous when fast responses to external signals are needed, consistent with experimental data about its involvement in stress responses. Our analysis indicates that the half-life of the sRNA–mRNA complex and the ratio of their production rates determine the steady-state level of the target protein, suggesting that regulation by sRNA may provide fine-tuning of gene expression. We also describe the network of regulation by sRNA in Escherichia coli, and integrate it with the transcription regulation network, uncovering mixed regulatory circuits, such as mixed feed-forward loops. The integration of sRNAs in feed-forward loops provides tight repression, guaranteed by the combination of transcriptional and post-transcriptional regulations.
PMCID: PMC2013925  PMID: 17893699
cellular networks; network motifs; small non-coding RNA; transcriptional and post-transcriptional regulation
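The abstract above states that the ratio of sRNA and mRNA production rates helps determine the steady-state level of the target. A minimal sketch of why that ratio matters is given below. It is a simplified mass-action model assumed for illustration, not the paper's model: the sRNA–mRNA complex is not tracked explicitly (pairing is treated as immediate, irreversible co-degradation), the protein level is taken to follow the free mRNA level, and all parameter names and values are hypothetical.

```python
def simulate_srna(a_s, a_m, b_s=0.02, b_m=0.02, k=0.1, dt=0.01, t_end=2000.0):
    """Forward-Euler integration of a simplified sRNA/mRNA model.

        ds/dt = a_s - b_s * s - k * s * m
        dm/dt = a_m - b_m * m - k * s * m

    a_s, a_m : production rates of sRNA and mRNA
    b_s, b_m : independent degradation rates
    k        : pairing rate; each pairing event removes one sRNA and one mRNA
    Returns the (s, m) levels reached at t_end.
    """
    s, m = 0.0, 0.0
    for _ in range(int(t_end / dt)):
        pairing = k * s * m
        ds = a_s - b_s * s - pairing
        dm = a_m - b_m * m - pairing
        s += ds * dt
        m += dm * dt
    return s, m


if __name__ == "__main__":
    # Compare free mRNA levels for three sRNA production rates (a_m fixed at 1.0).
    for a_s in (0.0, 0.5, 2.0):
        s, m = simulate_srna(a_s=a_s, a_m=1.0)
        print(f"a_s/a_m = {a_s:.1f}: free mRNA at t_end = {m:.2f}")
```

In this sketch, free mRNA stays well above zero while sRNA production is below mRNA production and is strongly depleted once sRNA production exceeds it, which illustrates how the production-rate ratio sets the steady-state target level.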
