Research embracing systems biology approaches and careful analysis of the critical host response has greatly expanded our understanding of infectious diseases. First-generation studies based on genomics and proteomics have made significant progress in establishing the foundation for network-based investigations of virus-host interactions. More recently, data from complementary high-throughput technologies, such as siRNA and microRNA screens and next-generation sequencing, are augmenting systems-level analyses and providing a more detailed and insightful multidimensional view of virus-host networks. Together with advances in data integration, systems biology approaches now have the potential to profoundly impact translational research, leading to more rapid development of new therapeutics and vaccines for infectious diseases. In this review, we highlight new high-throughput technologies, describe a new philosophy for studying virus-host interactions, and discuss the potential of systems biology to facilitate bench-to-bedside research and create novel strategies to combat disease. Can we save the world using these approaches? Read on.
For the past 50 years, the discipline of virology has been overly focused on the pathogen itself. We now know that “the host response” is equally or more important in defining the eventual pathological outcome of infection. Based on recent vaccine failures and inadequate production of novel antiviral therapeutics, it is clear that the virology/immunology community must take more daring and innovative approaches to help combat viral infection on multiple fronts. We believe that systems biology approaches, combined with sophisticated computational strategies, are required to combat the AIDS virus, other chronic and acute infections, and most recently a potential “swine flu” pandemic. Why is that? Systems biology analyzes all of the components, interactions, and dynamics of a biological system in a comprehensive, quantitative, and integrative fashion. To better understand human diseases, systems biology involves iterative cycles in which model organisms with different levels of complexity are perturbed and then measured using combinations of high-throughput technologies. After the multidimensional data are mined, predictive computational models are developed, evaluated, and then refined by testing their predictions in new rounds of perturbation and measurement.
Systems biology, or more specifically components of a systems biology approach, has already greatly expanded our understanding of human diseases, including host responses to viral infection [1-3]. Only a few years ago, first-generation studies began to characterize patterns of gene expression on a genomic scale, distinguishing diseased states from normal states at the molecular level. With the development of new high-throughput technologies, and their successful application to infectious disease research, new-generation studies have demonstrated great potential for providing a more insightful multidimensional network view of virus-host interactions. Unfortunately, the potential of systems-wide approaches to study human diseases has not yet been fully realized.
In particular, as systems biology evolves, it is expected to have a profound impact on translational research. Not only does systems biology deliver scientific discoveries and analytical tools for translational research, it also confronts several barriers that the two fields share. These include establishing efficient inter-disciplinary collaborations (indeed, the differing cultures within any interdisciplinary team present a multitude of cross-cultural communication challenges), building integrated computational infrastructures for data management and data sharing, and developing sophisticated computational tools to translate large volumes of multidimensional data into a better understanding of basic biology and disease. This review highlights new developments in high-throughput technologies for the systematic study of virus-host interactions and the value of systems biology in facilitating translational research (Figure 1).
The interplay between the development of new high-throughput technologies and the specific needs of biological and medical researchers is an essential component of systems biology. Microarray-based functional genomics, which provides a global view of transcriptional changes occurring in host cells during viral infection, has been the most common high-throughput method used to study virus-host interactions (reviewed in [2,3]). In this section, we highlight several additional high-throughput technologies that have recently been applied to the study of virus-host interactions.
Small interfering RNAs (siRNAs) are a class of double-stranded short RNA molecules that are endogenously expressed by host cells in response to the nucleic acids of foreign organisms, including viruses. An siRNA specifically induces the cleavage of foreign RNAs through RNA interference (RNAi). By introducing a pool of synthetic siRNAs into cells, genome-wide siRNA screening provides a powerful (but potentially flawed) forward-genetic approach for identifying cellular genes with specific functions. For example, recent reports have demonstrated the efficacy of large-scale siRNA screens in identifying cellular factors important for infection by and replication of a variety of viruses, including influenza virus, hepatitis C virus (HCV), West Nile virus and human immunodeficiency virus-1 (HIV-1) [8-10] (see Table 1 for a summary). Although some of the identified host factors may be system-specific [8-10], by identifying hundreds of infection-related host factors simultaneously, this approach has great utility in systematically locating key components of protein interaction networks. However, one of the key limiting factors is the need to define and enforce data-reporting standards to facilitate data dissemination and re-analysis by the scientific community (Minimum Information About an RNAi Experiment, URL: http://miare.sourceforge.net/HomePage), and to make the structured data available in a public repository, analogous to what has been advocated by the microarray field [11,12]. Even with such a centralized database, it will require years (or even decades) of intensive biological confirmatory assays to prove the real worth of these RNAi screens.
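To make the idea of "identifying hundreds of infection-related host factors" concrete, the following is a minimal sketch of how candidate hits might be called from a screen readout. It is not the method of any of the cited screens: the robust z-score (median and median absolute deviation), the cutoff of -3, the gene names, and the readout values are all assumptions chosen for illustration.

```python
from statistics import median

def robust_z_scores(readouts):
    """Score each knockdown against the plate median, scaled by the MAD.

    `readouts` maps a gene name to an infection readout (hypothetical units,
    e.g. fraction of infected cells relative to a control siRNA).
    """
    values = list(readouts.values())
    center = median(values)
    mad = median([abs(v - center) for v in values])
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    if scale == 0:
        raise ValueError("readouts show no spread; z-scores are undefined")
    return {gene: (v - center) / scale for gene, v in readouts.items()}

def call_hits(readouts, threshold=-3.0):
    """Return genes whose knockdown reduces infection below the cutoff."""
    scores = robust_z_scores(readouts)
    return sorted(gene for gene, z in scores.items() if z <= threshold)

# Hypothetical readouts: most knockdowns leave infection near 1.0.
readouts = {
    "GENE_A": 1.00, "GENE_B": 0.95, "GENE_C": 1.05, "GENE_D": 0.98,
    "GENE_E": 1.02, "GENE_F": 0.30, "GENE_G": 1.03,
}

hits = call_hits(readouts)
print(hits)
```

A median-based score is used here only because a handful of strong hits would otherwise inflate a mean and standard deviation; any hit called this way would still need the biological confirmation discussed above.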
Physical interactions between host and pathogen proteins are critical to establishing an infection and influencing pathogenesis. Until recently, large-scale interaction studies have primarily focused on intra-species interactions. Interactions between host and virus proteins have been identified from small directed experiments, focused on particular pathogen proteins or pathways of interest. Two recent studies have used a high-throughput yeast two-hybrid approach to identify physical interactions between human and viral proteins from Epstein-Barr virus (EBV) and HCV, respectively (Table 1). After integrating their data with previously cataloged human protein interactions, both studies noted that human proteins interacting with viral proteins tended to be hubs (proteins with many interacting partners) and bottlenecks (proteins that are central to many paths in the network) in human protein interaction networks, hinting at a conserved mechanism among viral systems for efficiently controlling and manipulating host cellular processes. The yeast two-hybrid datasets, when integrated with other ‘omics’ data, can potentially provide mechanistic insights into the changes observed at the transcriptional and proteome levels. However, the same challenges associated with tedious biological validation studies will nonetheless delay any true progress associated with these approaches.
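The distinction between hubs and bottlenecks can be illustrated with a small sketch. Here "hub" is measured by degree (number of partners) and "bottleneck" by betweenness centrality (number of shortest paths passing through a protein); these are common choices, assumed here rather than taken from the cited studies. The protein names and the toy network are hypothetical.

```python
from collections import deque

def build_graph(edges):
    """Undirected interaction network as a dict of neighbor sets."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def degree(graph):
    """Number of interaction partners per protein (hub measure)."""
    return {node: len(neighbors) for node, neighbors in graph.items()}

def betweenness(graph):
    """Unnormalized betweenness centrality via Brandes' algorithm."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search from source, counting shortest paths.
        order = []
        preds = {node: [] for node in graph}
        n_paths = {node: 0 for node in graph}
        dist = {node: -1 for node in graph}
        n_paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Accumulate path dependencies, starting from the farthest nodes.
        dep = {node: 0.0 for node in graph}
        for w in reversed(order):
            for v in preds[w]:
                dep[v] += n_paths[v] / n_paths[w] * (1.0 + dep[w])
            if w != source:
                score[w] += dep[w]
    # Each undirected path was counted once from each endpoint.
    return {node: s / 2.0 for node, s in score.items()}

# Toy network: HUB has many partners; BRIDGE has few but links two modules.
EDGES = [
    ("HUB", "P1"), ("HUB", "P2"), ("HUB", "P3"), ("HUB", "BRIDGE"),
    ("BRIDGE", "Q1"), ("Q1", "Q2"), ("Q1", "Q3"),
]
network = build_graph(EDGES)
deg = degree(network)
btw = betweenness(network)
for protein in sorted(network, key=lambda n: -btw[n]):
    print(protein, deg[protein], btw[protein])
```

In this toy network BRIDGE has only two partners yet lies on every path between the two modules, which is why a protein can be a bottleneck without being a hub. For real interaction datasets a graph library would normally replace the hand-written routine.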
Like the siRNAs described above, microRNAs (miRNAs) are a group of endogenously expressed single-stranded small non-coding regulatory RNA molecules. Unlike an siRNA, however, a miRNA represses its target gene expression by inducing not only RNAi but also translational inhibition. During some virus infections, the virus and cell mutually regulate each other's gene expression through self-encoded miRNAs. To date, over 5,000 microRNAs have been identified, including approximately 800 human microRNAs. Various hybridization- and PCR-based strategies for profiling miRNA expression have been developed and applied to the study of virus-host interactions.
Microarray-based miRNA profiling revealed that specific miRNA signatures may correlate with CD4+ T-cell counts and viral loads in HIV-infected individuals. Similarly, we have found multiple cellular miRNAs in transplanted human liver biopsies that may regulate host responses at the mRNA level during HCV infection (Peng et al, submitted).
With continued refinement in miRNA profiling technologies, the greater challenge is to study the biological functions of a miRNA, mainly because functional targets of a specific miRNA are difficult to identify. While computational target predictions may at best achieve a 50% success rate, target identification through mRNA expression profiling alone will overlook targets regulated mainly at the protein level. To experimentally enrich for functional miRNA targets, regardless of the regulatory mechanisms used, Hannon et al. used an assay coupling RNA-induced silencing complex (RISC) immuno-precipitation with microarray analysis of bound mRNAs. In this approach, even miRNA targets exclusively regulated at the translational level would be detected by microarray analysis downstream, as miRNAs and their corresponding target mRNAs would be incorporated into the RISC before immuno-precipitation.
Next-generation sequencing is dramatically accelerating biomedical research by enabling comprehensive analyses of genomic variation, mRNA and small RNA expression and discovery, and DNA-protein interactions (see recent reviews on the technology and its applications [21,22]). This technology may allow investigators to conduct studies on virus-host interactions which were previously not feasible or affordable, such as the identification of important alternate splice isoforms, miRNA discovery and profiling, and expression profiling in organisms for which a complete genomic sequence has not been determined. RNA-seq, a new method for whole transcriptome analysis based on next-generation sequencing technology, offers a much greater dynamic range than microarrays, and therefore a better platform to quantify low-abundance transcripts. Compared to microarrays, prior sequence information requirements are less exacting. This is particularly important for studies on those organisms with very limited or low-coverage genome sequences, such as nonhuman primates, for which the current standard for expression profiling is to use rhesus or human microarrays. Although these microarrays provide useful information for a large number of genes, there are complications in analyzing the data due to potential hybridization differences between species. However, sequence information from RNA-seq does not depend upon hybridization to a specific, predetermined probe.
Information obtained from RNA-seq can also be used to improve current microarrays through alternative splice form detection, gene and exon boundary mapping, and novel transcript discovery. Although the technology is still in the early stages of use, a variety of software tools are being developed to deal with the vast amount of data generated by next-generation sequencing (see reference  for a short summary). There are also ongoing efforts to establish the guidelines for reporting and archiving next-generation sequencing data. Public repositories such as the NCBI Short Read Archive (SRA, URL: http://www.ncbi.nlm.nih.gov/Traces/sra) have been set up to store both raw sequencing data from different platforms and associated metadata regarding experimental annotations and instrument runs. As the technology matures, the major challenge will be to use new bioinformatics approaches and computational methods to extract biologically relevant information from the large volume of data.
Until recently, systems level analyses have mostly been performed on single-cell organisms, or in the case of virus-host interactions, using cell-line infection models. Although a great deal may be learned from this approach, knowledge of how a specific cell type responds in isolation to virus infection provides an incomplete picture of what happens in the complex milieu of an infected animal. Therefore, in the context of virus-host interactions, systems biology must leverage not only multiple high-throughput data types but also multiple model systems and perturbations to gain a systems level view of pathogenesis and disease. As an example, Figure 2 illustrates a three-dimensional matrix of comparisons that might be performed when studying influenza virus infection using multiple infection models. Assessing phenotypic differences, from the gross to the molecular level and across model systems and viral strains, will allow researchers to define commonalities and differences in the host response and to identify both common and unique targets for therapeutic intervention.
The real strength of this approach, but potentially also its biggest challenge, will come from integrating vast amounts of data into a single systems-level view. This challenge is already being met in other fields where new computational tools have been developed to take advantage of the data generated by high-throughput technologies. For example, Gunsalus et al. have integrated transcriptomics, protein interaction, and RNAi data to provide a multidimensional view of how molecular machines work together in early embryogenesis in Caenorhabditis elegans. In the same light, Chuang et al. have developed a network-based classifier by integrating gene expression profiles of metastatic and non-metastatic cancer patients with the human protein interaction network. Similarly, Pujana et al. combined gene expression profiling with functional genomic and proteomic data from different species to identify genes associated with a higher risk of breast cancer.
One goal of virus-host systems biology is to make predictions on host responses and dynamic interactions between viruses and hosts by computational modeling. As structures and interactions of virus-host networks unearthed by high-throughput studies are often static, predictive computational modeling attempts to identify key network components and interactions by predicting the dynamics of the whole network during viral infections, in turn revealing potential translational targets for assessments. Depending on the scope of the study and the availability of experimental data, modeling may be done at different levels of abstraction: genes, proteins, cells, and organisms.
Several recent studies have attempted to model the dynamic host immune responses to different pathogens such as influenza A virus, vesicular stomatitis virus, and Bordetellae bacteria. In these cell-based models, molecular-level descriptions of host response were not available; yet, in order to generate more testable hypotheses, relevant molecular details such as key signaling pathways and regulatory networks should be included, especially for identifying host targets for antiviral therapies. As an interesting example, Zhang et al. proposed a multi-scale model to simulate cancer heterogeneity, integrating a simplified epidermal growth factor receptor gene-protein network coupled with a cell cycle module. While computational modeling of host responses to viral infection is still in its infancy, continued development of predictive models will have significant impacts on identifying host cell factors for antiviral therapies.
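As a rough illustration of what a cell-level dynamic model looks like, the sketch below simulates a generic target-cell-limited infection: uninfected target cells T are infected by free virus V, infected cells I produce virus and die, and virus is cleared. This is a textbook-style form assumed for illustration, not a reproduction of the cited models, and every parameter value is a hypothetical placeholder rather than a fitted estimate.

```python
def simulate(beta, delta, p, c, T0, V0, days=10.0, dt=0.001):
    """Forward-Euler integration of a target-cell-limited model.

    dT/dt = -beta*T*V            (target cells become infected)
    dI/dt =  beta*T*V - delta*I  (infected cells die at rate delta)
    dV/dt =  p*I - c*V           (virus is produced and cleared)

    Returns a list of (time_in_days, T, I, V) tuples.
    """
    T, I, V = float(T0), 0.0, float(V0)
    steps = int(round(days / dt))
    trajectory = [(0.0, T, I, V)]
    for step in range(1, steps + 1):
        infection = beta * T * V
        dT = -infection
        dI = infection - delta * I
        dV = p * I - c * V
        T += dT * dt
        I += dI * dt
        V += dV * dt
        trajectory.append((step * dt, T, I, V))
    return trajectory

# Hypothetical parameter values, chosen only so that the infection takes off.
PARAMS = {
    "beta": 2.7e-5,  # infection rate per virion per day
    "delta": 4.0,    # death rate of infected cells per day
    "p": 1.2e-2,     # virus production per infected cell per day
    "c": 3.0,        # virus clearance rate per day
    "T0": 4.0e8,     # initial number of target cells
    "V0": 0.093,     # initial viral titer
}

trajectory = simulate(**PARAMS)
peak_time, _, _, peak_titer = max(trajectory, key=lambda row: row[3])
print(f"peak titer {peak_titer:.3g} at day {peak_time:.2f}")
```

A model of this kind contains no molecular detail, which is the limitation noted above: terms for signaling pathways or regulatory networks would have to be added before it could point to specific host targets. The fixed-step Euler scheme is used only to keep the sketch dependency-free; an adaptive ODE solver would be the usual choice.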
Effective multi-disciplinary collaboration is key to the success of both systems biology and translational research (Box 1), as attested by funding for translational research infrastructure through the Clinical and Translational Science Awards (CTSAs; URL: http://www.ctsaweb.org/) from the National Institutes of Health. To promote the use of systems biology approaches for delivering translational discoveries with clinical impacts, investments in organizational infrastructure in the public sector and public-private partnerships will be necessary. Computational infrastructure presents a major challenge for utilizing systems biology approaches in translational research, as both basic researchers and clinical scientists need ready access to clinical data and biomolecular data, a significant challenge from both technological and regulatory standpoints. In this regard, increasing awareness in both the research community and funding agencies has transformed the conduct of biomedical research in multi-disciplinary collaborations. For example, researchers from the University of Washington and their collaborators are building strong inter-disciplinary research teams through several large research programs (NIDA Center for Functional Genomics; URL: http://nida.viromics.washington.edu/; Systems Virology; URL: https://www.systemsvirology.org).
Clearly, systems biology is a rapidly evolving discipline. To fulfill its promise to provide us with a raft of new drug targets, systems biology needs to address several major challenges: 1) Computation. To enable targeted search for antiviral therapeutics, it is necessary to develop predictive computational models that incorporate sufficient molecular details and make effective predictions of virus-host interactions for experimental testing. 2) Data. Relative to the complexity of biological systems being studied, the field is actually data-poor even though some may be overwhelmed by the sheer data volume. A body of self-consistent, comprehensive, and viral infection-specific experimental data is needed for building effective computational models. We are currently building a “virus-host genomic compendium” of both our own data and data of others; this effort will enable us to identify common and unique features of the host response to dozens of highly virulent viruses. 3) Money. Systems Biology is not cheap. It costs literally millions of dollars to establish the proper biological, clinical, and computational infrastructure. 4) Engagement and Dedication. Systems biology researchers need to be fully engaged in translational research, especially bridging bench and bedside scientists during the development of its own infrastructures and collaborations.
“It's easy for a biomedical researcher studying a disease to forget that his or her work someday actually needs to help treat an actual patient”.
Translational research transforms scientific discoveries into clinical applications to prevent, detect and cure human disease. Typically such discoveries begin at “the bench” with basic research at a molecular or cellular level, then progress to the patient's “bedside”, that is, clinical practice.
‘Bench-to-bedside’ translational research is a two-way process. Researchers provide new analytic tools for use and new targets (biomarkers, drug targets, mechanisms of gene and pathway action) for assessment in patients. Clinicians make novel observations about disease progression and provide clinical samples and variables for laboratory investigation, which often requires detailed experimental studies using model systems. A key barrier to translational research is the lack of ready access, for both researchers and clinicians, to two types of data: 1) clinical data, including medical records, pathology reports, diagnostic results and clinical trial information; and 2) biomolecular data, including sequencing, genomics, proteomics and other high-throughput research data.
We thank Sean Proll for discussions and assistance with the preparation of the original figures. We thank Matt Dyer for insightful suggestions and assistance with the preparation of the manuscript. Research in the authors' laboratory is supported by Public Health Service grants R01AI022646, R01HL080621, R24RR016354, P30DA015625, P01AI058113, and P51RR00166 from the National Institutes of Health, USA.