Studies of developmental biology are often facilitated by diagram “models” that summarize the current understanding of underlying mechanisms. The increasing complexity of our understanding of development necessitates computational models that can extend these representations to include their dynamic behavior. Here we present a prototype model of C. elegans vulval precursor cell fate specification that represents many processes crucial for this developmental event but that are hard to integrate using other modeling methodologies. We demonstrate the integrative capabilities of our methodology by comprehensively incorporating the contents of three seminal papers, showing that this methodology can lead to comprehensive models of developmental biology. The prototype computational model was built and is run using a language (Live Sequence Charts) and tool (the Play-Engine) that facilitate the same conceptual processes biologists use to construct and probe diagram-type models. We demonstrate that this modeling approach permits rigorous tests of mutual consistency between experimental data and mechanistic hypotheses and can identify specific conflicting results, providing a useful approach to probe developmental systems.
Simple diagram “models” are used in experimental biology to summarize mechanisms inferred from detailed inter-related experimental results (e.g., see Fig. 1A). While executable computational models are becoming more prevalent, most models represent isolated aspects of what is known about a biological system, or they are geared to large-scale data sets and limited in terms of the types of data they represent (for reviews, see de Jong, 2002; Ideker and Lauffenburger, 2003; Reeves et al., 2006). Moreover, the complexity of mathematical models makes them difficult for the average biologist to comprehend, use, or extend. Therefore, the vast majority of biological understanding is still represented using text and static diagrams, with dynamics and implications provided by human intuition. A methodology that can incorporate a system’s dynamic behavior, expand the information content of current diagrammatic models to include both its explicit and implicit contextual underpinnings, and formalize the semantics to make it computationally testable would tremendously enhance our current representations of biology.
Much information about biological systems derives from small-scale “reductionist” studies. This information is typically non-quantitative, compiled over time by multiple individuals using a variety of experimental approaches, and acquired and reported using non-systematic methods. This makes it relatively recalcitrant to conventional computational modeling approaches. Nevertheless, these data need to be represented in comprehensive models of biological systems. Here we present a computational modeling approach that facilitates the integration and analysis of diverse types of standard biological information. The graphical nature of both the interface (the GUI, see Fig. 1B) and the computational language itself (Live Sequence Charts, LSCs; Fig. 1C,D,E; Damm and Harel, 2001; Harel and Marelly, 2003) makes this approach intuitive and user-friendly to biologists. To illustrate the approach, we have represented a portion of vulval precursor cell (VPC) fate specification in the nematode Caenorhabditis elegans (for review, see Sternberg, 2005).
System design for software and system engineering seeks to represent all aspects of a modeled system. Similarly, our ultimate modeling goal is to represent all known aspects of a biological system (Harel, 2003). The model presented here is a comprehensive representation of virtually all of the information and experiments reported in three seminal papers that helped establish the field of VPC fate specification in C. elegans (Sternberg, 1988; Sternberg and Horvitz, 1986; Sternberg and Horvitz, 1989). It represents and integrates different kinds of experimental results, including anatomical and genetic perturbations, and is a working prototype for an updatable model.
Our model is available to readers (see Supplementary Materials), who are encouraged to download the model and use it to investigate its contents and run simulations. The Supplementary Material accompanying this report contains a “User Guide” that describes how to manipulate the model. It also contains a set of movie files of recorded runs of simulations, showcasing the model without necessitating its download. Additional detailed explanations of the model and its testing can also be found in the Supplementary Material.
A key attractive feature of our methodology is that it does not impose a computational way to re-think the biology. Instead, it uses the same conceptual process as the building and reason-based testing of static model diagrams that biologists are accustomed to. The building blocks of the model are “scenarios”: “if-then” logic statements about the behavior or mechanistic basis of a limited “piece” of the system. The statements are time-constrained and have a precise syntax (that is, they are formal). This approach is particularly amenable to representing the understanding gained from reductionist analyses of biological systems. Each statement is captured in a Live Sequence Chart (LSC), which is a representation of conditions known to trigger a resulting behavior (Fig. 1C,D). The triggering conditions are represented in the “prechart”; resulting behaviors are represented in the “main chart.” These modular descriptions of behavioral mechanisms are linked by events and objects shared between LSCs.
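The prechart/main-chart structure of a scenario can be sketched in ordinary code. The Python below is only an analogy, not the LSC language or the Play-Engine interface (both are graphical); the `Scenario` class, the state keys, and the fate label `"induced"` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical representation: the system state is a flat dictionary.
State = Dict[str, object]


@dataclass
class Scenario:
    """One if-then rule: a triggering condition and the behavior it causes."""
    name: str
    prechart: Callable[[State], bool]    # triggering conditions
    main_chart: Callable[[State], None]  # resulting behavior


def signal_is_high(state: State) -> bool:
    # Illustrative condition: an undetermined cell senses a high signal level.
    return state.get("vpc.fate") is None and state.get("vpc.signal") == "high"


def adopt_induced_fate(state: State) -> None:
    # Illustrative result: the cell's fate property is set.
    state["vpc.fate"] = "induced"


rule = Scenario("respond_to_high_signal", signal_is_high, adopt_induced_fate)

state: State = {"vpc.signal": "high", "vpc.fate": None}
if rule.prechart(state):    # the "if" part
    rule.main_chart(state)  # the "then" part
print(state["vpc.fate"])
```

Because each scenario touches only the state entries it names, two scenarios interact only through the objects and events they share, which is the sense in which the descriptions are modular.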
A graphical user interface (GUI; Fig. 1B) serves as a dynamic visualization of the biology, and is used both in the construction of the LSCs (Play-In) and in simulations (Play-Out). LSC scenarios are not written by programming, but rather by actually performing the desired behavior using the GUI and menu-driven components (see User Guide in Supplementary Material). Thus, the model can be modified and expanded by users with virtually no training in computer programming. Simulated perturbations (in silico experiments) are reproduced by manipulating objects (relevant cells and genes). The Play-Engine tool runs all aspects of our model.
Developmental time underlies the dynamics of the model (Kam et al., 2004). LSCs refer to a clock function correlated with developmental time, thus allowing developmental time to drive the progress of a simulation. The Play-Engine monitors all LSCs based on the state of the system, assessing which LSCs should be active and implementing events in their main charts when the requisite conditions are fulfilled. Events implemented by a main chart may, in turn, affect other precharts or main charts.
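A minimal sketch of such a clock-driven execution loop follows, assuming a simplified setting in which each rule fires at most once and time advances in integer ticks. It is not the Play-Engine's actual algorithm; `play_out`, the rule names, and the state keys are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]
# A rule is (name, prechart, main_chart); all names here are hypothetical.
Rule = Tuple[str, Callable[[State], bool], Callable[[State], None]]


def play_out(rules: List[Rule], state: State, end_time: int) -> List[Tuple[int, str]]:
    """Advance a clock; at each tick fire every rule whose condition holds."""
    trace: List[Tuple[int, str]] = []
    fired = set()  # simplification: each rule fires at most once
    for t in range(end_time + 1):
        state["time"] = t
        progress = True
        while progress:  # repeat until no further rule is triggered at this tick
            progress = False
            for name, prechart, main_chart in rules:
                if name not in fired and prechart(state):
                    main_chart(state)
                    fired.add(name)
                    trace.append((t, name))
                    progress = True
    return trace


def set_signal(state: State) -> None:
    state["signal"] = True


def set_fate(state: State) -> None:
    state["fate"] = "induced"


rules: List[Rule] = [
    # Listed first, but it cannot fire until the signal exists.
    ("cell_responds", lambda s: s.get("signal") is True, set_fate),
    # Driven purely by the clock.
    ("signal_on", lambda s: s["time"] >= 2, set_signal),
]

trace = play_out(rules, {}, end_time=4)
print(trace)  # [(2, 'signal_on'), (2, 'cell_responds')]
```

The second rule is triggered by the clock alone, and its result is what satisfies the first rule's condition, so both fire at tick 2 regardless of the order in which they are listed.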
The behavior of our model is controlled by a set of 86 universal LSCs (uLSCs) that specify the mechanistic rules inferred from the preponderance of existing data. Some uLSCs contain probabilistic events that generate the large number of possible outcomes seen in vivo. For example, ablation of certain VPCs allows other VPCs to occupy the vacated positions. Consistent with biological observations, the model produces alternative outcomes for specific ablations (“non-determinism”). Thus, simulations are not based on rote reproduction of experimental observations. Rather, they are based on mechanistic rules, explicitly stated as uLSCs. The model’s predictive power comes from the fact that these general mechanistic rules, which are hypothesized to control the behaviors of the system, can be used to execute and display the consequences of system perturbations (in silico “experiments”). The uLSC “VPCresponse50LIN3” (Fig. 1D) provides an illustration. uLSCs can define behaviors at different levels of detail, offering important flexibility. Mechanisms that are well understood can be described in great detail, while those that are not as well understood, but are nonetheless important to drive simulations, can still be included. Additional mechanistic details can be added later without altering unrelated aspects of the model.
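How a single probabilistic event inside a rule yields alternative outcomes across runs can be illustrated with a short sketch. The probability value, the function name, and the outcome labels below are assumptions made for illustration; they are not taken from the model.

```python
import random
from collections import Counter


def neighbor_after_ablation(rng: random.Random, p_move: float = 0.5) -> str:
    """One probabilistic event: a neighboring cell may or may not move into
    the position vacated by an ablated cell. p_move is a made-up value."""
    return "moves_in" if rng.random() < p_move else "stays"


# The same rule, executed many times, produces a distribution of outcomes
# rather than a single fixed result.
rng = random.Random(0)  # seeded so that a batch of runs is repeatable
outcomes = Counter(neighbor_after_ablation(rng) for _ in range(200))
print(dict(outcomes))
```

Seeding the generator keeps a batch reproducible while still letting individual runs differ from one another.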
The model represents behaviors that influence VPC fate specification, either directly or indirectly. Direct influences include the establishment of the gradient of the LIN-3 inducing signal, a set of rules governing the movements of the VPCs following cell ablation experiments, and inductive and lateral signaling mechanisms. Details are in the Supplementary Materials.
A good working model can account for all the experimental observations from which its mechanistic rules were inferred. Working hypotheses represented by static diagrammatic models are typically tested using thought-based analyses. Computational models can be tested more systematically, matching the actual biological outcomes that result from specific experimental conditions to the outcomes of simulations that start with the same set of specific conditions.
In our methodology, a method called “play-out” allows simulations of system behavior under a set of in silico “experimental” conditions. Manual play-out allows the user to manipulate the system for a single run and directly observe the simulation. Batch-run play-out allows automated system runs for high-throughput testing, generating a number of different files that store the results of the simulations at various levels of detail (see Supplementary Materials).
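The idea of a batch run can be sketched as a loop over starting conditions that tallies outcomes and writes a summary table. This is a hedged illustration only: `simulate` is a stand-in for a real simulation run, and the condition names, outcome labels, and CSV layout are hypothetical rather than the file formats the Play-Engine produces.

```python
import csv
import io
import random
from collections import Counter
from typing import Dict, List


def simulate(condition: str, rng: random.Random) -> str:
    """Stand-in for one simulation run; returns a made-up outcome label."""
    if condition == "intact":
        return "pattern_A"  # deterministic in this toy example
    return rng.choice(["pattern_A", "pattern_B"])


def batch_run(conditions: List[str], runs_per_condition: int, seed: int = 0) -> List[Dict[str, object]]:
    """Run every condition repeatedly and tally the outcomes."""
    rng = random.Random(seed)
    rows: List[Dict[str, object]] = []
    for condition in conditions:
        tally = Counter(simulate(condition, rng) for _ in range(runs_per_condition))
        for outcome, count in sorted(tally.items()):
            rows.append({"condition": condition, "outcome": outcome, "count": count})
    return rows


rows = batch_run(["intact", "ablated"], runs_per_condition=50)

# Write the summary as CSV (to an in-memory buffer here; a file works the same way).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["condition", "outcome", "count"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```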
The Play-Engine is ideally suited for testing experimental outcomes against mechanistic hypotheses. During a simulation run, the Play-Engine tracks the states of all objects and traces all events. In performing this function, it activates the relevant LSCs and traces the progress of the events described in each LSC as they occur. Thus it can match the events driven by uLSCs during a simulation to a specific experimental result when the latter is described as an LSC. Experimental results are described using a second type of LSC: existential LSCs (eLSCs) (Fig. 1E). eLSCs differ from uLSCs in that they do not drive system behavior, but are monitored to determine whether a given simulation run of the system satisfies the statements they contain. Therefore, eLSCs do not have separate “condition” and “result” portions (Fig. 1E). We used a set of 260 eLSCs to represent essentially all of the actual experiments and results (table by table, line by line) reported in the core papers. Using the systematic testing capabilities of the Play-Engine, we have shown that our model can reproduce essentially all of the results observed for each experiment that was conducted in the core papers. An analysis of exceptions is in the Supplementary Materials.
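Monitoring a run against an expected result can be sketched as checking whether the expected events occur, in order, within the recorded trace. This is a simplified analogy to an eLSC, not the Play-Engine's matching procedure; the `satisfies` function and the event names are hypothetical.

```python
from typing import Iterable, List


def satisfies(trace: Iterable[str], expected: List[str]) -> bool:
    """Return True if the expected events appear in the trace in the given
    order (other events may occur in between). The check only observes the
    trace; it never changes what the simulation does."""
    remaining = iter(trace)
    # `event in remaining` consumes the iterator up to the first match,
    # so later events must be found after earlier ones.
    return all(event in remaining for event in expected)


# Hypothetical trace recorded during one simulation run.
trace = ["ablate_cell", "signal_on", "cell_responds", "fate_assigned"]

print(satisfies(trace, ["ablate_cell", "fate_assigned"]))  # True
print(satisfies(trace, ["fate_assigned", "ablate_cell"]))  # False
```

Applying such a check to every recorded run, for every expected result, gives a systematic pass/fail table of the kind needed to locate specific conflicts between rules and observations.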
Biologists are often faced with more than one mechanistic hypothesis that appears to be compatible with the experimental data. Our modeling methodology easily represents alternative hypotheses. Each “Execution Configuration” in the Play-Engine’s setup stores a specific subset of uLSCs that the Play-Engine will use during execution. Thus, different Execution Configurations can be used to include and exclude the specific uLSCs that make up the key mechanistic differences between alternative hypotheses, while leaving common elements of the model intact.
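The idea of an Execution Configuration as a named subset of rules can be sketched as follows. The rule names and configuration names are hypothetical placeholders, and the data structure is an assumption about how one might mirror the concept in code, not a description of the Play-Engine's internals.

```python
from typing import Dict, FrozenSet, List

# Hypothetical rule names standing in for uLSCs.
ALL_RULES: FrozenSet[str] = frozenset({
    "establish_gradient",      # shared by every hypothesis
    "graded_response_high",    # specific to graded signaling
    "graded_response_medium",  # specific to graded signaling
    "lateral_signal",          # specific to sequential signaling
})

# Each configuration stores the subset of rules used during execution.
CONFIGURATIONS: Dict[str, FrozenSet[str]] = {
    "combined": ALL_RULES,
    "graded_only": ALL_RULES - {"lateral_signal"},
    "sequential_only": ALL_RULES - {"graded_response_high", "graded_response_medium"},
}


def active_rules(config_name: str) -> List[str]:
    """Return the rules a given configuration would execute."""
    return sorted(CONFIGURATIONS[config_name])


for name in CONFIGURATIONS:
    print(name, active_rules(name))
```

Only the rules that distinguish the hypotheses differ between configurations; shared rules such as `establish_gradient` are defined once and reused unchanged.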
For example, determining the relative roles played by the inductive and lateral signaling mechanisms has been a long-standing issue in the study of VPC fate specification (see reviews by Sternberg, 2005; Sundaram, 2004). Figure 1F highlights the two uLSCs that allow graded inductive signaling to influence the fates of the VPCs. Removal of these two uLSCs eliminates the differential response to inductive signal (the thin blue arrows in Fig. 1A). A similarly small number of uLSCs allows the lateral signaling mechanism to promote sequential signaling. In addition to testing the complete model that incorporates all mechanisms, we tested the model’s behavior under only the “Graded” or only the “Sequential” signaling hypothesis (Sternberg and Horvitz, 1986) by defining two additional Execution Configurations. Among the experimental observations that the combined model reproduces, our testing identified outcomes that the restricted “Graded” or “Sequential” Execution Configurations fail to reproduce (see Supplementary Materials).
The prototype model we present here was built to determine the extent to which our methodology can be applied to typical studies of developmental biology. The most important advantages this approach offers over the current reason-driven static models are: (1) visualization of explicit dynamic behavior based on a set of mechanistic rules; (2) the capability to follow multiple simultaneous events throughout a simulation; (3) systematic testing of all experimental results; and (4) incorporation of multiple data types. Although the set of core papers represented is small and historically distant, subsequent progress within the field can now be modeled within the context of the early data, rather than being represented in isolation.
The ability to extend an existing computational model as new data become available is critical to the development of comprehensive models. The extendibility of this model is both its greatest strength and its future challenge. The challenge lies in the dramatic increase in complexity and scale as additional genes, alleles, processes and interactions are incorporated. Because of their modular nature, “scenario”-based descriptions of behavior are simpler to modify than non-scenario-based approaches. The addition of new data, or even paradigmatic shifts in our understanding, requires modification of only the affected scenario modules, and not a reconstruction of the entire model.
The generic nature of the Play-Engine tool will allow the translation of our modeling efforts to many other biological systems. New system-specific GUIs will allow similar representations of other systems, while the solutions we have found to represent the processes and behaviors of vulval fate specification should be applicable to similar aspects of other systems. Our current model can be extended and deepened to represent a growing proportion of this specific system, while also providing adaptable tools to represent other biological systems.
We gratefully acknowledge the contributions of D. Barak and M. Yano to various aspects of this work. In addition, we thank R. Posner, L. Cooley, and K. Birnbaum for critical reading of early versions of the manuscript. This work was supported by collaborative NIH grant R24-GM066969, the Yale-Weizmann Exchange Program, the John von Neumann Minerva Center at the Weizmann Institute, and a grant from the Kahn Fund for Systems Biology at the Weizmann Institute of Science.