Immune responses are symphonies of molecular and cellular interactions, with each player doing its part to produce the composite behavior we see as effective host defense or, when dis-coordinated, as immunopathology or immunodeficiency. Just as listening separately to the notes played by individual instruments fails to capture the ensemble effect achieved when an entire orchestra plays together, so too are we limited in our understanding of how the immune system operates when we focus only on the properties or actions of one or a few unconnected components.
In the 19th and early 20th centuries, biology was largely a study of physiology, the integrated basis for the functionality of an organism. However, with the advent of new instrumentation and technology in the late 1970s, especially recombinant DNA methods, biology became progressively more reductionist, with a focus on individual cells and molecules, and immunology was no exception to this trend. The new knowledge acquired by the field through many such detailed studies has been enormously important in developing a parts list of the components involved in immune processes and in identifying some of the contributions of these molecular and cellular elements to the overall functioning of the system. Nonetheless, this information has yielded only limited insight into the way these various elements integrate with each other to give rise to complex immunological behaviors, and especially into how small quantitative changes in individual component function affect more global properties. This latter issue is of substantial importance, as just one example, in understanding how polymorphisms linked to disease by large-scale genetic studies influence immune function.
At the same time as new tools were developed for and applied to ever finer dissection of the cells, genes, and proteins of the immune system, another set of technological advances increased the rate of data acquisition from a trickle to a stream to a river that has, with the commoditization of microarrays (1), the widespread use of deep sequencing methods (2), the advent of highly multiplexed flow cytometry (3), and the availability of high-throughput proteomics (4), turned into a torrent. Rather than exploring a single element in depth, these latter technologies are employed for broad probing of the state of biological systems (gene expression, protein identity, or substrate modification). This has led to a major change in how research is conducted in many laboratories – rather than experiments being designed based on pre-formed hypotheses derived from past training and knowledge of the literature, high-throughput methods are being used for unbiased exploration of the properties of a system to generate novel hypotheses (5). It is up to the investigator to sort through the massive amount of new data flowing from the various multiplex technologies, a process that requires substantial ability in statistics and/or the capacity to properly use algorithms and software developed by experts in mathematics and computation. We have thus entered a new domain, in which the skills of ‘bioinformaticians’ are becoming essential elements in the research efforts of many laboratories. The technical capacity to generate these large-scale, in some cases global, datasets has in turn led to the emergence of the new discipline of ‘systems biology’, which in its simplest form is the old physiology recast in modern guise. It constitutes an attempt by the field to move from the very specific, from the detail, from the single molecule or gene, to a quantitative analysis of such elements as they operate together to give rise to behaviors not possessed by any single component alone – so-called ‘emergent properties’ (6): the symphony rather than the notes of just the violin or oboe.
Many investigators consider bioinformatics synonymous with systems biology, but the truth is more complex. While statistical analysis of large datasets to look for trends, to cluster individual components into related groups, or to uncover connectivity among elements and so produce large network maps is essential to making use of such extensive information, these approaches alone fall short of moving the field from mere organization of knowledge to a deeper understanding of the principles underlying a system’s behavior or to an explanation of its mode of operation. The most common output from informatic manipulation of data elements is a non-parametric graph that shows qualitative interactions – often referred to as a ‘hairball’ or ‘ridiculome’ because of the enormous complexity of such global depictions. In truth, these graphs are extremely useful for illustrating relationships between elements and for understanding the organization of components into operational modules, but they do not allow the investigator to predict how alterations in the concentration or functional efficiency of a particular element will influence the overall system’s activity, or to discern why and how certain properties of the system arise from its elements. Yet in the end, this is just what we want from such a systems analysis: the ability to fathom how higher-level function emerges from components that on their own lack the capacity in question, and to predict how perturbations of individual elements will change this behavior, both for the basic insight this provides and for the potential clinical utility of such information.
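To make the distinction concrete, consider how such a qualitative network is typically assembled and partitioned. The sketch below, written in Python with the networkx library, builds a small interaction graph from a hypothetical edge list (the gene pairs shown are illustrative placeholders, not curated data) and clusters it into modules; note that nothing in the resulting structure encodes concentrations, rates, or dynamics.

```python
# Minimal sketch of a 'hairball'-style analysis: assemble a qualitative
# interaction graph and partition it into putative modules.
# The edge list is hypothetical; real inputs would come from co-expression
# analysis or interaction databases.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Hypothetical qualitative interactions (no rates, no quantities)
edges = [
    ("IFNG", "STAT1"), ("STAT1", "IRF1"), ("IRF1", "CXCL10"),
    ("TNF", "NFKB1"), ("NFKB1", "CXCL10"),
    ("IL2", "STAT5"), ("STAT5", "FOXP3"), ("IL2", "CD25"),
]

G = nx.Graph(edges)

# Greedy modularity clustering groups nodes into operational modules
for i, module in enumerate(greedy_modularity_communities(G)):
    print(f"module {i}: {sorted(module)}")

# The graph records who interacts with whom, but it cannot predict how
# changing the abundance or activity of one node alters system output.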
It is the domain of ‘modelers’ to move from informatic analyses into this more functional realm. Mathematical or computational modeling is not a new endeavor, especially in immunology, but it is a less widely employed and appreciated aspect of the emerging discipline of systems biology than bioinformatic analysis of data. We believe, however, that the two are complementary and, indeed, that neither can reach its full potential without the other. Computational simulation is effective only if the modeler has in hand the properly processed and analyzed data necessary to instantiate a model close to biological reality (in terms of element identity, organization, and quantitative parameters). At any level of resolution, from molecules, to cells, to tissues, to a complete organism, the modeler needs the contributions of informaticians to develop a realistic and valid model structure for further computational processing. On the other hand, without modeling, the mere organization of data does not provide the insight into global system performance sought by biologists.
From the perspective of the practicing immunologist, what does systems biology in all its guises have to offer? Isn’t experimentation – not mathematical twiddling – really the essential activity involved in gaining new understanding of how the immune system functions? And yes, informatics is useful for handling large datasets, but isn’t its major value the discovery of new, interesting molecules or genes, so one can go back into the lab to study these in detail using comfortable experimental tools and techniques? In this review we argue that these existing paradigms are changing – that the value of traditional experimental studies will increase dramatically if more quantitative tools are introduced into mainstream immunology research in the form of analytic measurements, formalized model generation, simulations, and computer predictions, built on a foundation of systematic measurements organized and parsed by informatics approaches.
What is the basis for this view? We suggest that too often, interpretation of experimental data is limited by a failure to intuit the complex, non-linear behaviors typical of highly connected systems with large numbers of feedback connections (7), which of course perfectly describes the immune system. Add to this the exponential increase and decrease in lymphocyte numbers during adaptive immune responses, properties that markedly amplify the influence of very small differences in the activity of molecular circuits or cells (9), and the need for more formal representations and quantitative analyses of immune function becomes even more evident.
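A toy calculation makes the point. The sketch below, with purely illustrative parameter values, compares two clones whose per-day growth rates differ by only 10%; over a week of expansion that small difference compounds into a roughly two-fold divergence in clone size, a magnitude that is easy to underestimate by intuition alone.

```python
# Sketch: why intuition fails for exponentially expanding populations.
# A 10% difference in net growth rate, negligible in a short assay,
# compounds into a ~2-fold difference in clone size over a week.
# All parameter values are illustrative, not measured.
import numpy as np

N0 = 100.0           # starting number of antigen-specific cells
r_a, r_b = 1.0, 1.1  # net growth rates (per day) for two clones
t = np.arange(0, 8)  # days of the expansion phase

Na = N0 * np.exp(r_a * t)
Nb = N0 * np.exp(r_b * t)

for day, a, b in zip(t, Na, Nb):
    print(f"day {day}: clone A = {a:,.0f}, clone B = {b:,.0f}, ratio = {b/a:.2f}")
# By day 7, exp(0.1 * 7) ~ 2.0: the 10% rate difference has doubled the gap.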
We are not talking here about the type of ad hoc ‘theoretical immunobiology’ that has acquired a questionable reputation in the past. Rather, we are referring to combining rich experimental datasets and existing knowledge in the field with newer efforts to obtain more quantitative measurements of biochemical or cellular parameters suitable for computer modeling and simulation. Predictions of system behavior generated by such simulations, run under defined conditions that correspond to experimentally testable situations, amount to ‘in silico experimentation’ (11). These predicted outcomes must then be put to the test at the bench, to examine the strength of the underlying computational model. Through iterative cycles of such model building, simulation, prediction, experiment, and model refinement (when experimental results and predictions disagree, as they inevitably will), one can develop much more complete and informative models of immunological processes than those we formulate purely in our imagination or represent as simplified cartoons in reviews.
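One turn of this cycle can be caricatured in a few lines of code. In the sketch below (Python with scipy; the ‘bench’ measurements are synthetic and the exponential-growth model is a deliberately simple stand-in), a model is instantiated by fitting to existing data and then used to predict an untested condition; it is that prediction which would next be compared against a fresh experiment.

```python
# Schematic of one turn of the model -> simulation -> prediction -> experiment
# cycle. The 'bench' data are synthetic; in practice they would come from the
# wet lab, and a large residual would drive refinement of the model itself.
import numpy as np
from scipy.optimize import curve_fit

def model(t, r, N0):
    """Candidate model: simple exponential clonal expansion."""
    return N0 * np.exp(r * t)

# Step 1: instantiate the model by fitting existing (synthetic) measurements
t_train = np.array([0.0, 1.0, 2.0, 3.0])
n_train = np.array([100.0, 270.0, 740.0, 2000.0])
(r_fit, n0_fit), _ = curve_fit(model, t_train, n_train, p0=(1.0, 100.0))

# Step 2: 'in silico experiment', predicting an untested condition (day 5)
prediction = model(5.0, r_fit, n0_fit)
print(f"fitted rate = {r_fit:.2f}/day; predicted day-5 count = {prediction:,.0f}")

# Step 3 (at the bench): measure day 5 for real; disagreement between
# prediction and measurement signals that the model needs revision.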
What about the scale of such models? Many investigators bemoan the presumed need for ‘completeness’ in order to achieve a useful model and despair of obtaining the data needed to reach this ultimate goal. Although for bacteria or yeast it is possible to undertake truly ‘system-level’ studies involving the measurement of all gene transcripts or proteins expressed by a cell under various conditions and the systematic perturbation of each of these elements through mutation, this is clearly impractical or impossible for more complex organisms. However, useful models that capture emergent behaviors need not encompass the entire system – the complex properties of subcircuits or modules that form key parts of larger networks are valuable to investigate and simulate on their own (7), even if the eventual goal must be to stitch such incomplete models together into a grander scheme that more truly reflects overall physiology. To conduct studies at the ‘systems’ level, one need only move up the scale from individual component dissection to a consideration of the integrated behavior of sets of connected components (molecules, cells, even disparate tissues). Such efforts help us organize our thinking about the aspects of a subnetwork’s structure that give rise to its specific properties [amplification, noise suppression, time-gated function, and so on (13)], and can assist in understanding the underlying control circuits that regulate behaviors such as switching between tolerant and immune states (8), the antigen thresholds required for induction of responses, original antigenic sin, the choice of CD4 effector fate, and many others; the sketch following this paragraph illustrates the switch-like behavior such a feedback module can produce. In concert with the critical efforts already underway to systematically obtain data on gene expression in immune cells (14) and to quantify aspects of immune function previously examined in a more qualitative manner, we can begin to generate a body of models for many such modules of immune function. These in turn can each be refined by contributions from many investigators in the field, hopefully approaching the underlying reality more and more closely over time, and leading eventually to the generation of an integrated ‘supermodel’ as these smaller pieces prove their worth through rigorous experimental testing. This is an opportunity for all immunologists to contribute to, and receive back from, a group undertaking that ultimately supports their own specific research interests while advancing the entire field. Rather than being concerned about ‘big science’ in thinking about systems approaches, we hope that immunologists will view systems immunology as the new immunophysiology, with opportunities for all to participate and to benefit.
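As a concrete example of such a module, the sketch below implements the textbook positive-feedback motif often invoked for switch-like transitions. The Hill-function parameters are illustrative and the model is not fitted to any specific immune circuit, but it shows how transient inputs of different strengths can leave the same module in persistently different states.

```python
# Toy module: a single positive-feedback loop, the generic motif behind
# switch-like transitions such as tolerant vs. activated states.
# All parameters are illustrative; this is not a fitted immune model.
import numpy as np
from scipy.integrate import solve_ivp

beta, K, n, gamma = 4.0, 1.0, 4, 1.0  # max production, threshold, Hill coeff, decay

def dxdt(t, x):
    # Hill-type self-activation minus first-order decay
    return beta * x**n / (K**n + x**n) - gamma * x

for x0, label in [(0.3, "weak stimulus"), (0.9, "strong stimulus")]:
    sol = solve_ivp(dxdt, (0.0, 20.0), [x0])
    print(f"{label}: start {x0} -> steady state {sol.y[0, -1]:.2f}")
# The two initial conditions settle into distinct stable states (near 0 and
# near beta/gamma), so a transient input can flip the module persistently:
# a minimal caricature of switching between tolerant and immune states.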
As discussed above, there are two major threads in systems biology: informatics and modeling. Each has become such a large enterprise that we cannot do justice to both here, and so we have opted to focus on the less commonly used modeling and simulation aspect. In the body of this review, we discuss the computational approaches and tools available for translating data into models suitable for simulation and prediction of biological behavior, the key technologies for data acquisition that contribute to effective computational modeling, and the limitations that must be overcome for their more effective use in supporting these endeavors. In each section, we provide examples of how these technologies and tools have already begun to contribute to a better understanding of the immune system. We end with a perspective on what can be expected in the next few years in this rapidly changing arena.