Idiopathic pulmonary fibrosis (IPF) is a progressive and relatively poorly understood fibrotic lung disease whose median survival (2.5–3 yr) is unaffected by currently available medical therapies (1). In the last two decades, we have experienced an unprecedented increase in our understanding of lung fibrosis in general. Studies using advanced molecular biology approaches, genetically modified animals, virally administered genes, and high-throughput transcriptional profiling approaches have provided evidence for multiple pathways, molecules, and systems that may be involved in fibrosis. On the basis of these studies, it seems that pulmonary fibrosis, at least the form induced by bleomycin in the mouse lung, is in part dependent on intact tumor necrosis factor (TNF) pathways (2), intact transforming growth factor (TGF)-β activation and signaling pathways (3), angiogenesis, cell trafficking and recruitment (4), coagulation cascades (5), apoptosis (6), lipid mediator metabolism (7), and expression of multiple regulatory molecules by the alveolar epithelium (8). We have data to support the role of alveolar epithelial cells, myofibroblasts, circulating mesenchymal pluripotent cells, T cells, macrophages, and endothelial cells in lung fibrosis (8–10). The success of these studies has presented us with a biological Rashômon. Like the viewers of the film by the legendary Japanese filmmaker Akira Kurosawa (11), which provides multiple versions of a crime observed by four witnesses, we observe multiple, mismatched, often conflicting versions of the same event that leave us wondering whether we can resolve the picture or, more importantly in this case, whether we can really understand human pulmonary fibrosis in a way that will allow us to significantly impact the disease.
The answer, we believe, lies in several major developments of the last decade. The availability of complete sequences of the human and other genomes and the introduction of high-throughput technologies for gene and protein expression provide at least part of the answer. It is now relatively easy to invert the experimental design: instead of identifying a molecule in a mouse model of disease, one can profile or mine publicly available transcriptional profiles of human disease, identify a gene or protein of interest, and study its function using genetically modified animals or other methods for gene knockdown. In addition, instead of studying only the expected phenotypic outcomes of experiments that manipulate the expression or function of a gene of interest in an animal model of disease, we can now look at the global impact of these perturbations. The advent of high-throughput methodologies to study genetic background variability, epigenetic regulation, and transcription factor–based gene expression regulation should allow us to provide an additional part of the answer by showing how individual variability explains disease susceptibility in mice and humans. Furthermore, these technologies allow us to query how biological information gets dynamically translated into context- and cell-specific gene and protein expression patterns, which, in turn, serve to change the cellular context. The rapid increase in computing power and connectivity translates into more rapid and efficient data manipulation and sharing; this releases us from the need to query multiple articles to generate a hypothesis because much of the data we require to “connect and project” are now readily available in relatively easy-to-use and widely accessible databases. Indeed, within a relatively brief period of time, the use of high-throughput techniques for gene expression profiling and of computerized databases has become a mainstay of biomedical research.
Although all of these exciting technological advances, which exponentially increase the level of knowledge about every disease and model, serve as facilitators of integration, they do not inherently provide integrative models of disease. For this to occur, a shift in thinking is required. Instead of a single-factor/reductionist approach, which is highly effective in the lab, we need to think “globally.” We need to shift from an approach that tries to explain lung fibrosis using “one molecule at a time, one cell type at a time” to an approach that looks at the network of interactions between multiple molecules, pathways, and cells, and characteristics of the organism, as they converge to determine the lung phenotype in pulmonary fibrosis. Systems biology is the field of biology that aims to provide this “holistic” view. In this review, we will define and discuss systems biology and its application and relevance to the study of IPF. We will provide a brief description of microarray studies of IPF (recently reviewed by us [12]) as well as examples of systems biology analysis of these data. We will also describe the requirements and challenges in applying systems biology to IPF research.