The development of a complete organism from a single cell involves extraordinarily complex orchestration of biological processes that vary intricately across space and time. Systems biology seeks to describe how all elements of a biological system interact in order to understand, model, and ultimately predict aspects of emergent biological processes. Embryogenesis represents an extraordinary opportunity – and challenge – for the application of systems biology. Systems approaches have already been used successfully to study various aspects of development, from complex intracellular networks to 4D models of organogenesis. Going forward, great advancements and discoveries can be expected from systems approaches applied to embryogenesis and developmental biology.
Systems biology is an experimental and theoretical framework that treats biology as an informational science, and seeks to study the behavior of biological systems as a whole. In particular, the deep complexity of developmental processes motivates the use of systematic and integrative analyses to garner biological insights. In these approaches, biological information is represented as being transmitted, modulated and integrated by biological networks that are then executed by molecular 'machines' (Hood et al. 2004; Price et al. 2009). At its heart, systems biology seeks to understand the dynamic behavior of complex biological systems in sufficient detail to construct computational models that can predict how various perturbations will affect a living system. An iterative model-building process is often employed, wherein an in silico model evolves through successive iterations and increases in complexity, completeness and predictive accuracy as it is informed by more experimental data. The constructed computational model thus serves as a large-scale hypothesis about how an integrated biological process works as a whole. That is, the model characterizes explicitly the relevant components, their relationships to one another, and the dynamics of the interacting system. For example, a systems analysis of a mouse embryo might involve as a first step the identification of all expressed genes and proteins using transcriptomics and shotgun proteomics. Then the systems biologist might try to identify how these interact using yeast two-hybrid screening or affinity purification. Based on the results, a model is built and the system is tested ‘virtually’ by imposing different constraints or perturbations (such as a knockout) – and the predictions are compared with experiments. The model is further refined based on the novel results, and new and more accurate predictions are made. 
This process is repeated over and over again, with understanding and predictive accuracy increasing accordingly. Ultimately what is sought is the ability to predict how perturbations to any individual component (or combination of components) will affect the system, in order to effect desired changes to the system such as to recover health from a disease state.
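The build, perturb, compare, and refine cycle described above can be sketched with a toy Boolean network. Everything in this sketch is hypothetical and chosen only for illustration: the three genes `A`, `B`, `C`, the maternal `cue`, the wiring, and the "observed" knockout phenotype are assumptions, not data or a model from any study cited here.

```python
# Toy Boolean model of a hypothetical three-gene circuit (illustrative only):
# a maternal cue activates A, A activates B, and B represses C.
RULES = {
    "A": lambda s: s["cue"],
    "B": lambda s: s["A"],
    "C": lambda s: not s["B"],
}

def steady_state(rules, cue=True, knockouts=(), max_steps=20):
    """Update all genes synchronously until the state stops changing."""
    state = {gene: False for gene in rules}
    state["cue"] = cue
    for _ in range(max_steps):
        new = dict(state)
        for gene, rule in rules.items():
            # A knocked-out gene is held off regardless of its inputs.
            new[gene] = False if gene in knockouts else bool(rule(state))
        if new == state:
            return {gene: state[gene] for gene in rules}
        state = new
    raise RuntimeError("no steady state reached (the network may oscillate)")

def mismatches(predicted, observed):
    """Genes whose predicted on/off state disagrees with the observation."""
    return sorted(g for g in observed if predicted[g] != observed[g])

# Hypothetical experimental readout of an A knockout (assumed, not real data).
OBSERVED_A_KO = {"A": False, "B": False, "C": False}

# Iteration 1: the first model predicts C is on in the A knockout.
first_prediction = steady_state(RULES, knockouts=("A",))
print("model v1 disagrees on:", mismatches(first_prediction, OBSERVED_A_KO))

# Iteration 2: refine the hypothesis so that C also requires A, then re-test.
RULES_V2 = dict(RULES, C=lambda s: s["A"] and not s["B"])
second_prediction = steady_state(RULES_V2, knockouts=("A",))
print("model v2 disagrees on:", mismatches(second_prediction, OBSERVED_A_KO))
```

A real analysis would replace the hand-written rules with networks inferred from high-throughput data and would compare against many perturbations at once, but the loop has the same shape: a disagreement between prediction and experiment points to the part of the model that needs revision.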
The need for a systems biology approach in complex biological systems such as embryogenesis can be illustrated by analogy to complex physical machines such as automobiles. The ultimate structure and function of a car is dependent upon interactions between several constituent components – from small, regulatory electronics to the engine and chassis. To fully understand the nature of a car – and to be able to fix or build one – the behavior of each part must be understood, but so too must the complex schema of how all parts interact to create the full ‘emergent behavior’ of an automobile. Biological ‘machines’ exhibit even more multi-scale complexity than do cars, and it is thus just as important to understand how the constituent parts work together in the biological setting as it is the mechanical. Systems biology strives to characterize this interactivity using rigorous and systematic approaches. This challenge is daunting, but the capabilities to succeed in this endeavor are ever increasing.
Systems biology research typically requires large amounts of high-throughput experimental measurements to be successful. Probably the most familiar and frequently used high-throughput technology today is microarrays, which provide large-scale quantitative data for both transcriptional and genomic measurement (Allison et al. 2006; Gunderson et al. 2005; Schena et al. 1995). Among the most rapidly advancing measurement technologies is nucleotide sequencing, with novel ‘Next-Generation’ sequencing methods promising improved efficiency and significantly reduced expense (Eid et al. 2009; Mardis 2008; Mir et al. 2008; Shendure and Ji 2008). Using reverse transcription techniques these sequencing technologies can also be applied for the de novo interrogation of transcription, small RNA species (Landgraf et al. 2007), alternative splicing (Pan et al. 2008), and noncoding transcripts (Core et al. 2008). Of great importance to systems biology as it applies to human development and medicine is the advent of technologies for personalized genomics – which has already enabled sequencing of several individual human genomes (Bentley et al. 2008; Kidd et al. 2008; Wang et al. 2008; Wheeler et al. 2008). Major efforts are underway in both academic and industry settings to bring the cost of a full genome sequence down to $1000 or even less (Quail et al. 2008; Rothberg and Leamon 2008). Example large projects in academia for personalized genomics include the Personal Genome Project at Harvard led by George Church (Kaiser 2008), and a multi-institution Systems Medicine project led by Leroy Hood at the Institute for Systems Biology in Seattle (Langton 2008). These programs strive to catalyze advancements in generating and analyzing personalized sequence data by validating their use in large-scale experimental efforts. 
Increasingly robust and inexpensive measurement technologies for nucleotide measurement – both genomics and transcriptomics – will continue to empower systems-based studies of biology highly relevant to human embryogenesis.
Data-intensive pursuits in systems biology are also aided by advancing technologies for measurement of protein concentrations, post-translational modifications, and molecular interactions. Applications of proteomics in systems biology can be broadly divided into two groups: 1) shotgun proteomics, which focuses on global discovery of proteins relevant to specific biological processes without prior knowledge of what those components may be; and 2) targeted proteomics, where the specific proteins being searched for are known and the focus is on their quantitative measurement and dynamic characterization. Discovery of cellular or serum-borne proteins is possible through the application of diverse techniques in tandem mass spectrometry (Bantscheff et al. 2007; Gupta et al. 2007; Nesvizhskii et al. 2007), as well as two-dimensional gel electrophoresis (Froment et al. 2005; Kolkman et al. 2005). These methodologies allow large-scale examination of up to thousands of proteins, and enable the discovery of relevant species participating in biological events. Once important participating agents in a signaling or regulatory network are known, other proteomic methods enable high-resolution study of causal linkages and dynamic parameters. Techniques such as Multiple Reaction Monitoring of isotope-labeled peptides and other diverse label-based approaches enable elucidation of quantitative and dynamic relationships between signaling and regulatory proteins. Recently developed surface plasmon resonance antibody microarrays (Boozer et al. 2006; Lausted et al. 2008; Usui-Aoki et al. 2005) also enable such targeted proteomic screens for proteins that have adequate capture agents. Complementing these interrogations of protein concentrations, diverse techniques have been developed to assay protein-protein interactions and to characterize the substantial human protein ‘Interactome’ (Braun et al. 2009; Gandhi et al. 2006; Rual et al. 2005). 
Proteomic technologies contribute to systems-based analyses at multiple scales of biological complexity.
Multiple online databases have been created for the storage and distribution of diverse genome-scale data. A small subset of what is available for different data types includes: 1) transcriptomics from the Gene Expression Omnibus (Edgar et al. 2002) and the Stanford Microarray Database (Sherlock et al. 2001); 2) regulatory sequences from the Eukaryotic Promoter Database (Perier et al.) and the Transcription Regulatory Element Database (Zhao et al. 2005); and 3) proteomics from the Proteomics Identification Database (Martens et al. 2005). Additionally, standardized data formats such as the Systems Biology Markup Language (Hucka et al. 2003) and CellML (Lloyd et al. 2004), as well as software packages such as Cytoscape (Shannon et al. 2003) and the Gaggle (Shannon et al. 2006), have been developed to enable rapid porting of datasets, annotations, and models between different programs and formats. Cumulatively, these technologies for both experimental measurement and data analysis embody powerful modalities for rigorous and quantitative characterization of biological systems.
The process of embryogenesis exhibits diverse layers of complexity, including large-scale modulation of transcription programs, and the propagation of information from the molecular level to the scale of tissues and organs. The process involves more interacting biological variables than are easily accounted for manually, from complex transcription regulatory networks to multi-dimensional morphogen gradients. Furthermore, embryogenesis is an intrinsically dynamical process in which both intracellular and tissue-level phenotypes change immensely across time and space. However, such complexity does not make the system inherently vulnerable to perturbations. In fact, biological robustness is profoundly exemplified during embryogenesis – leading to the observed fact that networks involved in development have evolved dramatic robustness to changes in kinetic parameters associated with most system components (Eldar et al. 2002; Meir et al. 2002; von Dassow et al. 2000). Improved characterization and modeling of the robustness of embryogenesis amidst genetic heterogeneity and environmental perturbations is a major ongoing challenge for developmental systems biology.
An important concept underlying much systems-based thinking within and outside of biology is the notion of ‘Complex Adaptive Systems’ (Gell-Mann 1994). These systems are referred to as ‘complex’ because they are composed of many linked components; they are modular in nature and form interconnected networks. 'Adaptive' refers to a capacity to modify behavior in response to changing environment or context. At many different scales, life – in the form of cells, tissues, organisms, and even ecosystems – exhibits the qualities of complex adaptive systems. A complex and adaptive nature is particularly characteristic of higher organisms, such as humans, in both health and disease (Coffey 1998; Schwab and Pienta 1996). Particularly, cell lineages and differentiation programs can be considered highly adaptive systems – for they achieve specific phenotypic character even in diverse and dynamic biological contexts (Theise 2006; Theise and d'Inverno 2004). Such systems are notoriously difficult to model and control accurately because their emergent behaviors are dependent on the interactions of diffuse agents, and thus the challenge of modeling such a complex and adaptive process as embryogenesis is great indeed.
Embryogenesis can be considered a ‘unidirectional’ adaptive process, in which individual system components modulate behavior in response to changing biological context, but do so in a (typically) ‘irreversible’ manner with clear directionality. Furthermore, these developmental processes exhibit discrete ‘stages.’ That is, as cells differentiate, they occupy a series of discrete phenotypic states, and are less stable outside of these defined conditions. Quantitative transcriptomic analyses have shown that differentiating cells are drawn towards ‘attractor states’ – particular configurations which exhibit high stability, and a corresponding tendency of cells to reach and remain therein (Chang et al. 2006; Huang et al. 2005; Kashiwagi et al. 2006). The phenomenon of ‘canalization’ is one feedback process underlying such attraction dynamics – in which differentiating cells are ‘guided’ into particular phenotypic states by internal processes, such as transcriptional regulatory circuits (Lott et al. 2007) and microRNA (Hornstein and Shomron 2006). Canalization is one example of the systems-level, emergent phenomena witnessed in complex adaptive processes such as embryogenesis.
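The idea of an attractor state can be made concrete with a minimal sketch of a two-gene mutual-repression circuit, a textbook toy model rather than any network from the studies cited above. The equations, the parameter values (`alpha`, `n`), and the labels `A-high` and `B-high` are assumptions introduced for illustration. Each gene product decays at unit rate and is produced at a rate that falls as the other gene's product rises.

```python
def simulate_toggle(u0, v0, alpha=4.0, n=2, dt=0.01, steps=5000):
    """Euler-integrate a toy mutual-repression circuit.

    du/dt = alpha / (1 + v**n) - u
    dv/dt = alpha / (1 + u**n) - v
    """
    u, v = u0, v0
    for _ in range(steps):
        du = alpha / (1.0 + v**n) - u
        dv = alpha / (1.0 + u**n) - v
        u, v = u + dt * du, v + dt * dv
    return u, v

def fate(u, v):
    """Label the final state by whichever gene product dominates."""
    return "A-high" if u > v else "B-high"

if __name__ == "__main__":
    # Different starting expression levels settle into one of two stable states.
    for start in [(3.0, 0.5), (1.2, 1.0), (1.0, 1.2), (0.5, 3.0)]:
        u, v = simulate_toggle(*start)
        print(start, "->", fate(u, v), round(u, 3), round(v, 3))
```

With these assumed parameters the circuit has two stable states, and a starting point with even a slight bias towards one gene ends in the state where that gene dominates. In this caricature, a 'guided' cell fate is simply the basin of attraction in which a cell begins.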
Biological networks operating in the process of embryogenesis exhibit marked robustness to perturbations due to redundancy in control and feedback across a wide and diffuse network. The survival need and evolutionary selection for robustness is clear in that embryogenesis is such an immensely complex process that, remarkably, operates correctly so much of the time. Such robustness to uncertainty and noise is considered a hallmark of living systems (Kitano 2004). One influential study found that the segment polarity network in Drosophila development was robust to large changes in the biochemical kinetic constants that govern its behavior (von Dassow et al. 2000). This intriguing study showed that picking a kinetic parameter in this network at random, even over multiple orders of magnitude, would on average work to produce proper segmentation over 90% of the time. This robustness of the overall phenotype to differences in individual kinetic properties makes the network resistant to the effects of mutations and other developmental 'noise' that alters the dynamics of catalyzed reactions occurring during embryogenesis. Similar behavior has been observed in networks governing cell growth, development, metabolism and chemotaxis (Stelling et al. 2004). Biological robustness should not be misunderstood to imply a phenotypic constancy regardless of stimuli or mutations: rather, robust networks maintain specific functionalities in the context of perturbations, while remaining sufficiently flexible to change modes of operation in a suitable manner. Certain biological systems thus exist at a 'critical' state – at the boundary of ordered and chaotic states, imbuing sufficient rigidity to resist small uncertainties, but flexible enough to adapt to major environmental changes (Nykter et al. 2008). Such design principles can potentially be exploited to study dynamic and complex biological systems such as those at work in embryogenesis.
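The style of analysis behind such robustness claims, drawing kinetic parameters at random over orders of magnitude and asking how often the qualitative outcome survives, can be sketched as follows. This is not the segment polarity model of von Dassow et al. and it does not reproduce their figures. The toy mutual-repression circuit, the function names, the sampling range, and the pass criterion are all assumptions made for illustration.

```python
import random

def settle(u, v, alpha, n, dt=0.01, steps=2000):
    """Euler-integrate a toy two-gene mutual-repression circuit."""
    for _ in range(steps):
        du = alpha / (1.0 + v**n) - u
        dv = alpha / (1.0 + u**n) - v
        u, v = u + dt * du, v + dt * dv
    return u, v

def keeps_two_fates(alpha, n):
    """True if opposite starting biases end in clearly opposite states.

    The factor of 2 is an arbitrary threshold for 'clearly opposite'.
    """
    ua, va = settle(alpha, 0.0, alpha, n)
    ub, vb = settle(0.0, alpha, alpha, n)
    return ua > 2.0 * va and vb > 2.0 * ub

def robust_fraction(samples=200, alpha_ref=10.0, decades=1.0, n=2.0, seed=0):
    """Fraction of random parameter draws that preserve both fates.

    The production rate alpha is drawn log-uniformly within `decades`
    orders of magnitude above and below the assumed reference value.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        alpha = alpha_ref * 10.0 ** rng.uniform(-decades, decades)
        hits += keeps_two_fates(alpha, n)
    return hits / samples

if __name__ == "__main__":
    print("fraction of draws preserving both fates:", robust_fraction())
```

A published analysis would vary every kinetic constant in a far larger network and score a biologically meaningful pattern. The sketch shows only the sampling logic, in which robustness is reported as the fraction of parameter space that still yields the expected phenotype.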
Given this multifaceted complexity, embryogenesis lends itself to characterization through systems-based approaches. Detailed experimental investigations identify a substantial ‘parts list’ in which genes, signals, and phenotypes important to embryogenesis are elucidated. Systems biology, complementary to this important process, presents a framework to collate the behaviors of these diverse biological components and to form cohesive, integrated explications of biological phenomena. Systems biology presents a powerful set of analytical tools well-suited to recording and assessing the complex phenomena of embryogenesis. Systems analysis has already been brought to bear on a number of aspects of the developmental process, including 1) intracellular networks, 2) communication signals between cells and with the environment, and 3) multi-scale integration (Box 1).
Box 1. Common types of systems-based models of embryogenic and developmental biology, including experimental data sources, analytical frameworks, and representative examples of prior studies for each model type.
Perhaps the most common form of systems-based study has been the characterization of intracellular networks involved in developmental processes. In silico modeling of these networks describes variously the genetic, transcriptional, and signaling events relevant to large-scale cellular and tissue phenotype. A pioneering achievement was the elucidation of a gene regulatory network that regulates the specification of endoderm and mesoderm layers in the sea urchin embryo by Eric Davidson’s group (Davidson et al. 2002). This study involved large-scale perturbation experiments, genomic and transcriptomic measurement, and computational analyses to define the transcriptional regulatory ‘circuit’ that underlies this developmental process – and was among the first systems-based characterizations of molecular networks governing embryogenesis. This network showed a basis for irreversibility in cellular development in rigorous, readily visualized form – including activating cues from maternal tissues, and internal regulatory feedback processes which progress and ‘lock-down’ the embryogenic process such that it is self-sustaining. This project also spoke to the idea of biological attractor states, as the study’s multi-component transcriptional program propels differentiating cells towards specific phenotypic trajectories. These discrete developmental states exhibit particular stability in the environmental and genetic context wherein they reside. Such preferentially stable states represent an intrinsic quality of any dynamical system with internal or external sources of feedback, and developmental systems are frequently found to have both (Alon 2007).
Following this seminal study, many other intracellular networks involved in development have been characterized using similar and advancing methods, with multiple reviews available on the topic (Davidson and Erwin 2006; Levine and Davidson 2005). Several projects have extended the study of germ layer specification to elucidate similar networks in other organisms, including Drosophila (Sandmann et al. 2007) and the tadpole (Imai et al. 2006). These studies have demonstrated that such networks exhibit extensive combinatoric regulation of developmental genes, and have revealed important biomolecular control points in embryogenic events. Intracellular processes in development are now commonly referred to as Genetic Regulatory Networks (GRNs), and are becoming represented by increasingly quantitative and genome-scale models (Longabaugh et al. 2005; Reeves et al. 2006). For example, recent investigations were able to define ‘clusters’ of genes related according to biological function and expression patterns through global gene expression analysis in Drosophila embryogenesis (Hornstein and Shomron 2006; Shalgi et al. 2007; Tomancak et al. 2007). Another major focus has been analysis of cis-regulatory motifs. Recent studies have combined computational and experimental approaches to identify the nature of transcriptional regulatory sites across the genome, and the connectivity between transcription factors and their cognate binding sites (Kuntz et al. 2008; Li et al. 2007; Van Loo et al. 2008). Noncoding transcripts such as microRNA also play important roles in developmental processes. Recent studies have revealed that miRNA-mediated events are a ubiquitous mechanism to augment the robustness and precision of gene regulation (Aboobaker et al. 2005; Johnston et al. 2005; Tsang et al. 2007). 
By examining biomolecular networks at the systems level, these methods elucidate both important individual components involved in developmental processes, and the systems-level architecture of the circuits they compose.
The ability of cells to respond appropriately to surrounding cells and the extracellular environment serves as a major basis for developmental processes. The complex relationships observed between intracellular networks and extracellular cues warrant rigorous modeling and simulation if they are to be understood more fully. A major area for human systems biology and medicine going forward involves characterizing and interpreting the information content of the ‘secretome’ – the set of all biomolecules secreted by cells and tissues (Price et al. 2009). Importantly, the secretome can serve as a fully non-invasive proxy to survey cell/cell and cell/tissue interactions – including those involved in development. Learning to non-invasively monitor important aspects of embryogenesis and development via the secretome has been the subject of intensive research in recent years (Dominguez et al. 2009; Katz-Jaffe et al. 2009). These investigations used a combination of high-throughput proteomic techniques, including two-dimensional gel electrophoresis and tandem mass spectrometry, to characterize major elements of the secreted proteome. They have revealed a substantially global portrait of the secretome and elucidated biomolecules important to differentiation and development. Similar recent studies have identified secreted proteins associated with the differentiation of mesenchymal stem cells into osteoblasts and adipocytes (Chiellini et al. 2008; Schinköthe et al. 2008). The molecular processes that drive the highly multi-potent character of mesenchymal stem cells are intrinsically complex and involve many regulatory agents. In striving to consider a large repertoire of cellular factors simultaneously, systems approaches are well-suited to examine such integrated, multi-component processes.
Similar proteomic techniques have been employed to study the extracellular matrix and its application to developmental processes in cells and tissues (Xiao et al. 2009). These investigations have examined roles played by the ECM in structural support, intercellular communication, and the modulation of cell growth and behavior. Detailed understanding of the extracellular matrix is necessary for successfully culturing cells in synthetic conditions that mimic the in vivo scenario. Many model tissue systems employ this knowledge in an attempt to recapitulate natural ECM to study and rationally direct cellular differentiation on artificial substrates (El-Ali et al. 2006). In systematically characterizing the complex extracellular environments observed in developmental processes, these studies inform more precise laboratory investigations for further experimental research. Large-scale proteomics has also been applied directly to clinical medical tasks. Importantly, a recent analysis of the secretome of human embryos identified extracellular predictors of viable embryos for implantation (Katz-Jaffe et al. 2006). Extracellular phenomena represent the mechanistic bridge between intracellular networks and tissue-level events; they thus embody a crucial component of any systems-based study of embryogenesis and development.
Multiscale modeling approaches strive to connect microscopic events with macroscopic phenomena, and to model the molecular processes that propagate information from the scale of genes to the scales of cells, tissues, and organs (Lewis 2008; Walker and Southgate 2009). Such models have progressed substantially from the simple mathematical models of neurons by Hodgkin and Huxley, up to modern models of morphogen diffusion in developing embryos (Tomlin and Axelrod 2007). Digital simulation of three-dimensional structures formed during embryonic development can be performed using tools such as Compucell3D (Cickovski et al. 2007). For example, this tool has been employed to model skeletal development in a vertebrate limb (Chaturvedi et al. 2005). An impressive recent project elucidated a four-dimensional in silico model of pancreatic organogenesis as it progresses in space and time (Setty et al. 2008). This model considers interactions from the level of genes to the level of organs to model and visualize pancreatic development over multiple time scales. Similar multiscale models have examined the migration of germ layers in Xenopus laevis as a function of Wnt/β-catenin signaling (Peirce et al. 2006; Robertson et al. 2007), and the effects of extracellular growth factors on epithelial development (Walker et al. 2006).
Models of embryogenesis have been aided by advancements in technologies such as in situ hybridization and MRI. These tools have enabled detailed three-dimensional recording of gene expression patterns during embryogenesis and development in both humans (Matsuda 2005) and several model organisms such as Drosophila (Christiansen et al. 2006; Peng et al. 2007; Pisarev et al. 2009; Wei et al. 2006). Additionally, spatial gene expression patterns from such diverse organisms have been integrated into a large public database, enabling comparison of gene expression patterns across species (Haudry et al. 2008). These spatial models do not explicitly account for the transfer of information from the molecular to the tissue level. Nonetheless, they enable a large-scale consideration of gene expression programs – including the dynamics by which genes and gene clusters are activated, both spatially across tissues and temporally throughout developmental processes.
A long-term objective of systems approaches to embryogenesis is the construction of in silico biological models of the developing embryo that accurately recapitulate in vivo biological events. One could potentially predict and perform in silico experiments using such computational models, thereby providing focused hypotheses for further experimental study. This exciting prospect would permit not only a rigorous description of biological activities, but also enable the prediction of responses given by cells or organisms to exogenous and therapeutic perturbations. Such technologies would empower researchers to conduct virtual trials of medical procedures or therapeutics, reducing the need for expensive and potentially high-risk experimentation.
The promises of systems biology are grand and exciting, yet many hurdles must be cleared before such approaches come to fruition. Our knowledge of the cell is incomplete, and systems models of cellular processes are still in their infancy. For example, epigenetic phenomena are now understood to play crucial roles in the differentiation and maintenance of cell type, but systems-level computational models of epigenetic events in embryogenesis are only beginning to emerge (Bock and Lengauer 2008; Jones and Martienssen 2005). Similarly, recent studies have shown that different forms of noncoding transcription are ubiquitous throughout the genome (Seila et al. 2008), and it has been speculated that such activity forms a major locus for the regulation of developmental processes (Dinger et al. 2008; Mattick 2007; Taft et al. 2007). These phenomena too, due perhaps to their recent discovery, have not yet been studied extensively using systems-based approaches. Another factor that stifles the creation of most large-scale models is the lack of quantitative data. Until recently, the bulk of data in biology were qualitative and not amenable to modeling. With the development of new high-throughput methods that generate quantitative data, more systems analyses like those outlined above will emerge. Finally, it should be clear that while computational modeling will aid tremendously in generating hypotheses, predicting the effects of perturbations, and yielding insight into system behavior, it can never replace physical experiments.
In summary, systems approaches have contributed substantially to the study of early embryogenesis and development – from the level of transcriptional regulation and intracellular signaling, to cellular differentiation and tissue patterning. These innovative studies have begun to elucidate embryogenic processes that are network-based, multi-component, and distributed across several scales of biological and physical complexity. While presenting significant technological, experimental, and theoretical challenges, this pursuit promises to advance our understanding of embryogenesis and development beyond what has been attainable previously.
The authors gratefully acknowledge insightful discussions with Leroy Hood and James Eddy, and funding from the NIH Howard Temin Pathway to Independence Award in Cancer Research (NDP).