We performed three independent human viral challenge studies (HRV, RSV, and influenza) to define host-based peripheral blood gene expression patterns characteristic of response to viral respiratory infection. The results provide clear evidence that a unique, biologically relevant peripheral blood gene expression signature classifies respiratory viral infection with a remarkable degree of accuracy. These findings underscore the conserved nature of the host response to viral infection, which is also evident in the cross-validation between experimental cohorts. The “acute respiratory viral” gene expression signature derived from these cohorts was validated in an independently derived external dataset and, importantly, can distinguish respiratory viral infection from bacterial infection. These findings provide compelling evidence that peripheral blood gene expression can function as a biomarker for specific classes of infectious pathogens and may serve as a useful diagnostic for triaging treatment decisions for ARI.
Discrimination between infectious causes of illness is a critical component of acute care of the medical patient, as such distinctions facilitate both triage and treatment decisions. While traditional culture, antigen-based, and PCR-based diagnostics are useful in pathogen classification, these assays are not without limitations (
Bryant et al., 2004;
Campbell and Ghazal, 2004). Current rapid diagnostic methods either lack sensitivity, with influenza and RSV tests (e.g. BinaxNOW antigen testing) reporting sensitivities of 53-80% (
Jonathan, 2006;
Landry et al., 2008;
Rahman et al., 2008) or are labor-intensive, such as direct-fluorescent antibody (DFA) testing. Categorizing infection based on host response is an emerging approach that not only enhances our diagnostic capabilities but may also provide additional insight into the pathobiology of infection. We have identified gene expression patterns that characterize host response to viral infection and that identify infected individuals with a high degree of accuracy. Several lines of evidence validate our findings, including the internal cross-validation between exposure cohorts as well as validation with the free-living influenza A and bacterial infection pediatric cohort (
Ramilo et al., 2007). Other investigators have identified host gene expression patterns – in nasal epithelium – that are associated with viral infection. Differentially expressed genes in nasal epithelium exposed to HRV 16 (
in vitro and from experimentally infected subjects) were similar to those found in the current study in peripheral blood (
Proud et al., 2008). In particular, RSAD2 (viperin), a potential antiviral molecule (
Chin and Cresswell, 2001;
Jiang et al., 2008;
Wang et al., 2007b), was the most highly differentially expressed gene in nasal epithelium between infected and uninfected individuals at 48 hours post-inoculation. Our HRV (HRV-16) predictive factor included RSAD2 (viperin), and the probit regression model selected it as the key differentially expressed gene in blood for determining infected state in the HRV cohort. Whole blood gene expression studies of RSV infection in hospitalized infants shared differentially expressed genes with the RSV factor found in our study, with a predominance of interferon-response elements, FCγ1AR, and OAS3 (
Fjaerli et al., 2006). Finally, data from the naturally-occurring influenza A/bacterial infection study (
Ramilo et al., 2007) confirmed a distinct host response signature to viral infection occurring both in this cohort and our experimentally infected cohorts. Taken together, these findings provide strong evidence for highly accurate
in vivo detection of human viral respiratory infection through analysis of peripheral blood gene expression. Notably, different peripheral blood immune cell types induce varying gene expression programs in response to pathogen exposure. Thus, the peripheral blood gene expression signatures derived and validated in these cohorts may be applicable only to individuals without underlying immune deficiencies. Additional studies in immunodeficient populations will be needed to generalize the current findings to these rare but clinically important patient subsets.
Evident from the genes in each factor, signatures that discriminate subjects with symptomatic respiratory viral infections from healthy subjects and subjects with bacterial infection contain biologically plausible gene networks involved in host viral response. The acute respiratory viral factor was most heavily represented by genes in the interferon signaling canonical pathway (p = 9.75 × 10⁻⁹) and the pattern recognition pathway for bacteria and viruses (p = 5.67 × 10⁻⁵). This over-representation of interferon response elements remained when individual viral challenges were analyzed as separate entities (HRV p = 1.38 × 10⁻¹⁰, RSV p = 2.25 × 10⁻⁹, influenza p = 1.25 × 10⁻⁷). (
www.ingenuity.com). Overlap between the genes defining each factor (discriminating symptomatic from asymptomatic individuals, or discriminating viral respiratory infection from bacterial infection) was strong. Baseline gene expression among all challenge subjects was similar and indistinguishable from the later timepoints for asymptomatic subjects, and classification of subjects from one cohort based on the other cohorts was remarkably accurate. Discovery of discriminant factors for disease states such as this one is inherently blind to biology, as the model is not aware of data labels. Despite differences in study design, commonalities between adults experimentally infected with HRV, RSV, or influenza A and community-infected children with influenza A predominated over virus-specific aspects of each signature. However, when selecting the gene or genes with the greatest discriminating power for leave-one-out cross-validation, the model chose different genes for each viral illness (HRV: RSAD2; RSV: RTP4; influenza A: ISG15; viral vs. bacterial: IFI27, RSAD2, IFI6, CXCL10, FLJ20035, GBP1, and SIGLEC1; and viral vs.
S. pneumoniae: RSAD2). Thus, with careful exploration of disease biology or with additional cohorts for validation, disease-specific markers of infection may arise, adding parity to the diagnostic signatures. Overlap is minimal with differentially expressed genes from other studies of the peripheral blood response to environmental stress, namely exposure of humans to ionizing radiation, the genotoxic stress of chemotherapy, and LPS (
Dressman et al., 2007;
Meadows et al., 2008), decreasing the likelihood that these genes are part of a generalized response program inherent to immune effector cells.
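The pathway p-values reported above quantify over-representation of pathway genes within a factor. As a minimal sketch of how such a value can be computed, the following uses an upper-tail hypergeometric test. This is an assumption for illustration: the test used by the pathway software is not described here, and the function name `overrepresentation_p` and all gene counts are hypothetical, not values from this study.

```python
from math import comb


def overrepresentation_p(n_background, n_pathway, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value, P(X >= n_overlap).

    n_background: genes measured on the array (hypothetical universe)
    n_pathway:    genes in the background annotated to the pathway
    n_selected:   genes in the factor being tested
    n_overlap:    factor genes that are also pathway genes
    """
    max_overlap = min(n_pathway, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts chosen only to illustrate the calculation.
p = overrepresentation_p(n_background=20000, n_pathway=30, n_selected=30, n_overlap=8)
print(f"p = {p:.3g}")
```

A small p-value here means that the observed overlap would be unlikely if the factor genes had been drawn at random from the background.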
Despite data acquisition and processing differences, gene expression patterns derived from publicly available microarray data for individuals with influenza A infection were similar to those with experimentally acquired symptomatic HRV, RSV, or influenza A infection. Genes found to characterize the response to respiratory viral infection in our cohorts overlap with genes found in many gene expression studies of host response to viral infections, both
in vivo (
Bhoj et al., 2008;
Proud et al., 2008;
Ramilo et al., 2007) and
in vitro (
Jenner and Young, 2005). First, the generalizability of the respiratory viral response signature illustrates that the host response to respiratory viral infections is robust and conserved, such that it can be discerned in divergent patient populations (healthy adult volunteers experimentally infected with HRV or RSV and children hospitalized with influenza A). Second, this finding illustrates the dominance of a pathogen-specific response at the time of peak symptoms over a generalized “infection” response, as discrimination between viral and bacterial infection is possible. The ability of these signatures to differentiate between pathogen classes (viral versus bacterial) provides a marked distinction between these findings and current methods of infectious or inflammatory illness classification (e.g. peripheral white blood cell count or measurement of inflammatory markers such as C-reactive protein). The sensitivity and specificity of these markers, both in our experimental setting and when applied to a cohort from the literature, represent an improvement on the performance of current rapid diagnostics (e.g. rapid antigen testing) as well as current culture-based diagnostics. A combination of these tests may ultimately prove to offer the best sensitivity and specificity for disease diagnosis. These data provide important support for the concept that host peripheral blood gene expression may be a valuable tool, alone or in conjunction with standard microbiologic testing, for infectious diseases. Validation in an additional community-based cohort and development of signatures to diagnose pre-symptomatic viral respiratory infections are desirable.
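To make the sensitivity and specificity estimates discussed above concrete, the following minimal sketch runs leave-one-out cross-validation on a single marker gene. It makes several assumptions: a simple midpoint-threshold classifier stands in for the probit regression model, the gene is assumed to be up-regulated in infected subjects, and the expression values, labels, and function names are hypothetical illustrations rather than study data.

```python
from statistics import mean


def loocv_predictions(values, labels):
    """Leave-one-out predictions from a midpoint-threshold classifier.

    For each held-out subject, the threshold is the midpoint between the
    mean expression of infected (1) and uninfected (0) training subjects.
    """
    preds = []
    for i in range(len(values)):
        train = [(v, y) for j, (v, y) in enumerate(zip(values, labels)) if j != i]
        pos_mean = mean(v for v, y in train if y == 1)
        neg_mean = mean(v for v, y in train if y == 0)
        threshold = (pos_mean + neg_mean) / 2
        # Assumes higher expression indicates infection.
        preds.append(1 if values[i] > threshold else 0)
    return preds


def sensitivity_specificity(labels, preds):
    """Return (sensitivity, specificity) from true labels and predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical log-scale expression of one marker gene in eight subjects.
expression = [8.1, 7.9, 8.4, 6.0, 5.1, 5.3, 4.9, 5.5]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = infected, 0 = uninfected

preds = loocv_predictions(expression, labels)
sens, spec = sensitivity_specificity(labels, preds)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Because each subject is classified by a threshold fitted without that subject, the resulting estimates are less optimistic than those obtained by fitting and evaluating on the same samples.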
An important question is whether the changes in host gene expression described here occur before peak symptoms. Although the analysis is still preliminary, we have time course data on subsets of these cohorts. The factor analysis was applied to the RSV, HRV, and influenza data from all samples at all times, from which the factor discussed above [Factor 16] was constituted. We plotted the factor score (strength) of the discriminative factor as a function of time. Two curves are depicted, representing the average factor scores, averaged separately for those who would eventually become symptomatic and those who would not. The differences in factor scores between individuals who remain asymptomatic and those who become symptomatic reach statistical significance (p = 0.028) at 45.5 hours following inoculation. This factor was found to be detectable prior to development of peak symptoms among symptomatic individuals. Thus, using host response as the diagnostic paradigm, presymptomatic diagnosis may be possible.
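One way to compare factor scores between the two groups at each time point is sketched below. It is an assumption for illustration only: the statistical test behind the reported p-value is not stated here, so a two-sided permutation test on the difference in group means is used instead, and the factor scores, the baseline time point, and the function name `permutation_p_value` are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign group membership and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical factor scores: hours after inoculation ->
# (eventually symptomatic subjects, asymptomatic subjects).
scores_by_hour = {
    0.0: ([0.1, -0.2, 0.0, 0.2, -0.1], [0.0, 0.1, -0.1, 0.2, -0.2]),
    45.5: ([2.1, 2.4, 1.9, 2.6, 2.2], [0.1, -0.2, 0.3, 0.0, 0.2]),
}

for hour, (symptomatic, asymptomatic) in scores_by_hour.items():
    p = permutation_p_value(symptomatic, asymptomatic, n_permutations=2000)
    gap = mean(symptomatic) - mean(asymptomatic)
    print(f"{hour:5.1f} h: mean difference = {gap:.2f}, p = {p:.3f}")
```

Scanning the time points in order gives the earliest hour at which the two average curves separate beyond what random group assignment would produce.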
Signature validation across experimentally infected cohorts illustrates the robust nature of the host response to viral infection. Additional validation of the gene expression signatures in other community-based cohorts would elevate these findings to a true diagnostic test that could enhance or supersede traditional microbiologic diagnostics. Additionally, such data would be extremely valuable if they could be used either to diagnose infection class prior to standard microbiologic studies (i.e. in the early phases of disease) or to indicate prognosis following disease acquisition or therapeutic intervention. In our study, we were able to use an easily obtained sample (peripheral blood) to characterize response to a respiratory infection. While development of a diagnostic test that uses host gene expression to characterize or predict infectious diseases is not yet possible from the data generated in this study, the study represents an important advance, showing that peripheral blood gene expression can be used to characterize host response to infection.