|Home | About | Journals | Submit | Contact Us | Français|
Background.Annual vaccination is the primary means for preventing influenza. However, great interindividual variability exists in vaccine responses, the cellular events that take place in vivo after vaccination are poorly understood, and appropriate biomarkers for vaccine responsiveness have not been developed.
Methods.We immunized a cohort of healthy male adults with a licensed trivalent influenza vaccine and performed a timed assessment of global gene expression before and after vaccination. We analyzed the relationship between gene expression patterns and the humoral immune response to vaccination.
Results.Marked up regulation of expression of genes involved in interferon signaling, positive IL-6 regulation, and antigen processing and presentation, were detected within 24 hours of immunization. The late vaccine response showed a transcriptional pattern suggestive of increased protein biosynthesis and cellular proliferation. Integrative analyses revealed a 494-gene expression signature—including STAT1, CD74, and E2F2—which strongly correlates with the magnitude of the antibody response. High vaccine responder status correlates with increased early expression of interferon signaling and antigen processing and presentation genes.
Conclusions.The results highlight the role of a systems biology approach in understanding the molecular events that take place in vivo after influenza vaccination and in the development of better predictors of vaccine responsiveness.
Influenza viruses circulate worldwide, causing an estimated 250,000–500,000 deaths each year . In the United States alone, yearly epidemics affect an estimated 2–5% of the population  and are responsible for an average of 200,000 hospitalizations  and 36,000 deaths . Since 1977, influenza A (H3N2), influenza A (H1N1), and influenza B viruses have been responsible for the majority of documented infections. Trivalent inactivated vaccines, updated yearly based on the currently circulating strains, are the primary tools used for preventing influenza [5, 6]. Recent systematic reviews have highlighted the need for better vaccines [7–9]. It is clear that the protection offered by the currently available vaccines is incomplete, and that large interindividual variation exists . Furthermore, the cellular events that take place in vivo after vaccination and the correlates of vaccine-induced protection are incompletely understood. Genome-wide transcriptional analysis has recently been introduced as a useful tool to study the mechanisms of viral infection [11, 12] and host responses to vaccination [13–15]. However, an integrated, systematic evaluation of the transcriptional response to influenza vaccination using multiple time points and incorporating antibody response data has not been performed to date.
We immunized a cohort of 119 healthy male adults, ages 18–40, with a licensed trivalent influenza vaccine and assessed the genome-wide gene expression patterns in peripheral blood cells before and on days 1, 3, and 14 after vaccination. Antibody titers were measured before and on days 14 and 28 after vaccination. We then studied the correlation between changes in gene expression and the antibody response following influenza vaccination.
Healthy volunteers ages 18–40 years were enrolled (n = 119). Given the systematic differences in gene expression between sexes and among ethnic groups [16, 17], this initial cohort was limited to male individuals of self-reported Caucasian ancestry. Individuals who were known to have received an influenza vaccine in the previous three years were not included. Enrollment, vaccination, and sample collection were conducted at a university campus. The protocol was approved by the institutional review boards of all participating institutions. Informed consent was obtained from each subject prior to enrollment.
Participants were immunized on day 0 with the 2008–2009 trivalent influenza vaccine [A/Brisbane/59/2007(H1N1), A/Brisbane/10/2007(H3N2), B/Florida/4/2006; Sanofi-Pasteur].
Peripheral blood samples for RNA purification (n = 461) were obtained before and on days 1, 3, and 14 after immunization. To minimize changes in gene expression induced by sample handling and processing, whole blood samples (2.5 mL) were collected in PAXgene RNA stabilization tubes (QIAGEN) and frozen at −80°C. RNA purification was performed using the PAXgene Blood RNA system (QIAGEN). Spectrophotometry (NanoDrop-1000 Spectrophotometer, Thermo Fisher Scientific) and microfluidic electrophoresis (Experion Automated Electrophoresis System, Bio-Rad Laboratories) were used for quality control.
In vitro transcription was performed using Ambion Illumina TotalPrep RNA Amplification Kits (Applied Biosystems/Ambion). cRNA was hybridized onto Illumina Human HT-12v3 Expression BeadChips (Illumina), following the manufacturer’s protocol. All samples for each individual were processed simultaneously. The arrays included over 25,000 well-characterized genes (including splice variants) and nonannotated gene candidates. Standard quality control thresholds were applied after preprocessing of signal intensity data, and failed microarrays were removed [18, 19]. We also excluded arrays from individuals for which data was not available at all time points. The final set of 368 arrays represented 92 individuals at four time points. We required a detection P value of <.05 in at least 65% of the samples for a transcript to be considered detected. There were 12,795 detected transcripts in our data. To identify genes with levels of expression that changed after vaccination, we used the expression values of each transcript at the four time points in a linear mixed-effects analysis of variance (ANOVA) model, using day as a fixed effect and person as a random effect.
Whole blood (10 mL) was collected in Vacutainer Serum Separator Tubes (Beckton-Dickinson). Serum was separated by centrifugation prior to storage at −20°C. Hemagglutination inhibition (HAI) tests were performed as previously described , except for a starting serum sample dilution of 1:4 and the use of turkey red blood cells (RBCs). HAI test antigens were allantoic fluid harvests from infected embryonated hen’s eggs (whole virus antigens). Neutralizing antibody tests were performed as previously described  except that hamster serum was not included. Test strains were the same as those used in the vaccine.
Initial quality control of the microarray signal intensity data was performed using the lumi Bioconductor package  in the R programming language . The same software was used for variance stabilization transformation  and robust spline normalization of the expression data. Regression and ANOVA were carried out in R.
The Gene Ontology (GO) vocabulary was obtained from the GO Web site (http://www.geneontology.org, 2009 build). Sequence information for reporters in the microarrays was converted to nuID annotation . The corresponding Entrez identifiers were mapped to the GO data structure. Using our own ontology analysis system, OntologyTraverser , we tabulated the genes annotated at or below each GO node for the entire array. We used a hypergeometric sampling model to examine the statistical representation of each GO node for genes in our gene sets. To compare sets, we took differences between the standardized scores determined for each gene set. Gene lists were simultaneously analyzed using Ingenuity Pathway Analysis (IPA) software (Ingenuity Systems) and DAVID Ontology [24, 25] (http://www.david.abcc.ncifcrf.gov) to confirm significantly associated pathways.
We performed regression analyses for each vaccine component in each of the two antibody titer measurements (Figure S1, supplementary data). Because the results for the two antibody titer data sets (HAI and neutralization) were strongly correlated, we combined the information from the two assays for downstream TRI calculations. We related the change in titer between pre- and postvaccination measurements (response variable) to the prevaccination titer (explanatory variable) using a simple linear model with an additional term to account for spurious variation attributable to recruitment dates (Figure S1, supplementary data). We next determined the residuals from the above linear regressions and used them as the input values for the individual response scores. The residuals are the distance of each individual point from the regression fit (Figure S2, supplementary data). These residuals reflect how the antibody titer rise of an individual deviates from the expected rise given the typical response of individuals within the cohort with a given prevaccination titer level. To increase the accuracy of combining the titer scores across vaccine components, we standardized the residuals by dividing by the residual standard deviation for each component; we then computed four outcome scores within each titer response measurement for each individual. These four summaries are the mean standardized residual across vaccine components, median standardized residual across vaccine components, mean rank (across all individuals, across vaccine components), and median rank. These scores were correlated (Figure S2, supplementary data). The scores were then standardized again (mean = 0, variance = 1 across all individuals), and scores within each individual were averaged to obtain the titer response index (TRI). The TRI, therefore, is a summary of the aggregate residuals of all antigen components in the vaccine with both HAI and neutralization antibody titers incorporated. The TRI is designed specifically for carrying out the correlation analyses between gene expression and antibody response.
In this regression model, we treated the expression data for all detected transcripts as the outcome, and the day and TRI as explanatory variables. We computed false discovery rate (FDR) q values (Benjamini-Hochberg method) for the partial F statistics of titer score against expression. We used an FDR cutoff of 0.05 and a TRI slope absolute magnitude cutoff of 0.1.
We sequentially excluded the values for each individual and used the expression data and titer scores on the remaining individuals to determine the correlation between the TRI and the expression values. We then attempted to predict the TRI value of the individual that was excluded from the model selection and fitting process using the expression values for the target individual and the model fit from the rest of the cohort. To make the process more robust and to balance the effect of each gene, we standardized the expression data to have a mean of zero and variance of 1 among the set of genes and individuals considered. We used a subset of genes with known functions in antigen presentation, interferon response, and peroxidase activity processes (Table 1). We identified day 1 and day 3 after vaccination as the days with the strongest correlation. We then fit the TRI to expression relation within each of these days. The cross-validation prediction for each individual was determined as the maximum absolute prediction value obtained from either day 1 or day 3.
We identified expression changes in 4740 transcripts (Dataset 1) after imposing a FDR cutoff of 0.01 on the P value for the day effect. Due to the assay’s design, some reporters mapped to different regions or alternatively spliced transcripts of the same gene. The reporters corresponding to differentially expressed transcripts in our data set uniquely mapped to 3854 RefSeq-annotated genes and 242 nonannotated gene candidates. The maximum expression change occurred on day 1, while day 14 had the greatest number of genes with maximum expression values. Interindividual differences in expression are reflected in the magnitude of person-effects in the ANOVA model (Dataset 1). The top 1% differentially expressed genes included interferon-inducible genes (eg, IFIT1, MX1, and IRF9), the signal transducer and activator of transcription (STAT) gene family, genes with ribosomal functions (eg, RPL7, RGS18, and RPS27), and genes with unclear functions (eg, WARS and the family of guanylate-binding proteins).
A visual summary of the dynamic changes in the transcriptional response to vaccination over time shows three major gene expression patterns: downregulation, early upregulation, and late upregulation (Figure 1A). Genes with late upregulation of expression were not significantly different between the last two time points. We evaluated the gene content of each of the above patterns of expression using three methodologies: association of Gene Ontology (GO) terms (http://www.geneontology.org) using our own R software package , the DAVID bioinformatics database [17, 18], and Ingenuity Pathway Analysis software. These results are summarized in Figure 1B, and complete gene lists are provided in Datasets 2–8. As expected, the early phase of transcriptional activation was marked by upregulation of genes involved in multiple immune processes. The data suggest early expression of genes whose products act on host viral sensing via Toll-like receptors TLR7 and TLR8. Overall, there were significantly increased transcript levels for genes participating in antiviral defense response (eg, GO:0009615), cellular activation (eg, GO:0001775), and cellular differentiation (eg, GO:0045582). Activation of antigen processing and presentation is implied by the increased expression of major histocompatibility complex (MHC) Class I genes (eg, GO:002474). Associations were most significant for the interferon response pathway (GO:0005062, GO:0034341, and GO:0034340) when compared with other cytokines. Of the signal transduction pathways, the Janus kinase/signal transducers and activators of transcription (JAK/STAT) (GO:007259) and nuclear factor kappa beta (NFκβ) (GO:0043122) axes were particularly enriched. In contrast, the late vaccine response was characterized by increased detection of transcripts for genes involved in intracellular processes suggestive of active cellular proliferation, including macromolecule biosynthesis (eg, GO:0009059), cellular protein metabolism (eg, GO:0044267), RNA metabolic processing (eg, GO:0016070), and regulation of antiapoptosis (GO:0045767).
Antibody titers were measured by both hemagglutination inhibition (HAI) and neutralization assays. The titer results from these assays were strongly correlated, and the titers generally plateaued after day 14. We constructed a classification model for vaccine responsiveness (the Titer Response Index, TRI) based on the aggregate titer changes for the three antibodies. The TRI robustly aggregates both the individual’s response across the vaccine components and the individual’s response relative to the rest of the cohort. Interestingly, the distribution of the TRI scores is trimodal, facilitating an unbiased classification of individuals as high, intermediate, and low vaccine responders (Figure 2A), depending on where the individual’s TRI falls under this distribution. The TRI and rise in HAI titer for the 92 individuals show a significant positive relationship (r2 = 0.4585).
To assess whether specific patterns of gene expression correlated with the magnitude of the antibody response, we studied the relationship between gene expression and the TRI in all subjects. We found that the abundance level of 494 transcripts significantly correlated with the TRI (Dataset 8). These mapped to 481 known genes and 13 nonannotated gene candidates. Two opposing expression trends influenced the magnitude of the antibody response, and the maximum effects were observed at specific time points as illustrated by the genes STAT1 and E2F2 in Figure 2B. A visual heatmap display of the entire gene expression signature in those individuals with the 10 highest and 10 lowest TRI scores is given in Figure 2C (bottom). Remarkably, the difference between STAT1 and E2F2 expression alone generates a gradient that corresponds clearly to the TRI at the two extremes of the response spectrum (Figure 2C, top).
We then performed a cross-validated linear prediction procedure utilizing the expression values of a subset of differentially expressed genes (Table 1) to predict the antibody response scores of those individuals with the highest and lowest antibody responses as depicted in Figure 2C. We show that the residual expression values on days 1 and 3 are predictive of the observed TRI for each individual and, more importantly, that the cross-validated predicted TRI recategorized the highest and lowest responders into the appropriate groups (Figure 2D). The prediction performed particularly well in the individuals who mounted the least immune response.
Content analysis of the genes that were upregulated in the high-responder group showed enrichment of GO categories involved in cellular immune responses, most notably the interferon response and antigen presentation pathways (Figure 3A and Datasets 9, 10). Genes within the interferon response pathway were highly upregulated in the high-responder group (Figure 3B). A functional interaction network for this subset of genes is presented in Figure 3C. In contrast, the majority of upregulated genes and enriched functional pathways in low responders were not specifically related to cell-mediated immune responses (Figure S3 and Datasets 11, 12).
This study illustrates how a systems biology approach can be applied in a clinical scenario to understand the complex molecular events that take place in vivo after trivalent influenza vaccination, and to develop better molecular biomarkers for vaccine responsiveness. The implications of the findings, therefore, extend from vaccine biology to vaccine development and clinical vaccinology.
The data support a model in which genes involved in interferon signaling and antigen presentation pathways are strongly upregulated in the initial 24 hours after vaccination, and the expression pattern of early-activation genes correlates strongly with the magnitude of the antibody response measured 14 and 28 days after vaccination. The simultaneous assessment of transcript abundance for over 25,000 genes and gene candidates offers an unbiased, genome-wide view of the transcriptional events that take place after vaccination. By performing this analysis before and at three time points after vaccination, we have gained a deeper understanding of these events and their timing than had been possible before. The three patterns of coexpression that became evident from our analysis are quite distinct and suggest a previously undescribed biphasic transcriptional response to vaccination. The observation of maximum expression changes in the initial 24 hours after vaccination and the correlation between early expression and the magnitude of the antibody response are important new contributions to the understanding of the vaccination response sequence. Content analysis suggests a central role in the early vaccine response for genes whose products are involved in viral sensing via TLR7 and TLR8, MHC Class I presentation, interferon signaling, IL-6, and the NFκβ and JAK/STAT signaling pathways. These genes are also known to be activated in innate cells (ie, macrophages and dendritic cells) during viral infections . Days 3 and 14 show similar patterns of expression, more suggestive of increased RNA processing and protein synthesis. For inactivated influenza vaccines, therefore, these findings suggest that the first 24 hours after vaccination are of great biological importance.
As the field of clinical vaccinology moves into an era of high-throughput experimental data generation and systems-level biology, it is imperative to give serious statistical consideration to the limitations of current methods of assessment of the magnitude of immune responses. With trivalent influenza vaccines, which in the case of influenza are of high clinical importance, three factors are known to complicate the quantitative analysis of antibody titer data in clinical studies. First, by young adulthood, most individuals have been exposed to influenza antigens through infection or prior vaccination. Indeed, despite our exclusion of individuals who had been vaccinated over the previous three years, most of the study participants had measurable prevaccination antibody titers. Second, individuals with higher prevaccination titers have smaller differences between pre- and postvaccination titers. Accordingly, we observed in our data an approximately linear inverse correlation between prevaccination titers and the rise in titer. For this reason, a simple calculation of the titer change (the titer delta) is inadequate as a way to classify the vaccine responsiveness of individuals. Finally, influenza vaccination currently involves the concomitant administration of three antigens and, while the immune response takes place simultaneously and separations are somewhat artificial, there is inter- and intraindividual variation in the response to each. These observations are depicted in Figure S2. We, therefore, developed the TRI (Titer Response Index) as a classification method for these data. The TRI accounts for an individual’s prevaccination antibody titers, their titers for the three antigens, and the magnitude of their response relative to other individuals in the population. The observed trimodal distribution clearly deviates from what would be expected by chance alone, permitting a statistically significant classification of trivalent influenza vaccine recipients on the basis of responsiveness.
Compared to previous studies, our study has a larger sample size, which provides sufficient power to detect significant correlations between transcript abundance and responsiveness. By assessing the relationship between the early patterns of gene expression and the magnitude of the antibody response, we have discovered 494 transcriptional biomarkers that strongly correlate with the humoral immune response to trivalent influenza vaccines. Our content and network analyses of this predictive gene expression signature again point clearly toward the interferon and antigen presentation pathways. Genes in these pathways, as well as other genes with levels of expression that are clearly different between the highest and lowest vaccine responders in our study (CD74, HLA-E, E2F2, and PTEN) are potential targets for functional studies linking the early molecular events that follow vaccination with the subsequent adaptive immune response. Genomic regions that control expression of these genes could be important for studies seeking to explain interindividual variation in vaccine responsiveness at the DNA level.
A full predictive analysis using the 92 individuals for which expression data was analyzed would be ideal for validating these correlated transcripts. However, our power to perform cross-validation on individuals in the intermediate response range is affected by the expression differences that result from interindividual variation. Fortunately, these subjects appear to develop protective immune responses after vaccination and are less crucial to the interpretation of the data. Since individuals who did not mount a response to the trivalent influenza vaccine are the most clinically interesting and, given that the vaccine response of the entire cohort follows a trimodal distribution, we sampled from the extreme ends of this distribution in order to maximize the power of the cross-validation analyses. Specific patterns of gene expression characterize individuals at the two extremes of the antibody response spectrum, and cross-validation illustrates their predictive value. The fact that a simple gradient of STAT1 to E2F2 expression can predict an individual’s response status at the extremes of the spectrum, and the fact that cross-validation could be performed using a small subset of representative genes, underscore the magnitude of the differences and suggest that simplified predictors based on early gene expression patterns are possible. Further validation of our gene expression-based prediction model in another and preferably larger group of volunteers is warranted. Ideally, this would include individuals of both sexes and different ethnicities.
Early molecular predictors of vaccine responsiveness can be useful in the development and comparison of new vaccines and adjuvants. They can also play a role in studies of vaccine response among different subgroups (children, the elderly, or immunocompromised patients, for example) and open the door for studies of individualized vaccine regimens. Past and recent influenza pandemics highlight the need for studies geared toward rapid implementation of clinical and translational research findings.
The following datasets will be available for download from our laboratory’s Web site (www.bcm.edu/cnrc/faculty/cnrcbelmont):
Dataset 1. List of Differentially Expressed Transcripts.
Dataset 2. Gene Ontology Results for Downregulated Genes.
Dataset 3. Ingenuity Pathway Analysis Results for Downregulated Genes.
Dataset 4. Gene Ontology (GO) Results for Early Upregulated Genes.
Dataset 5. Ingenuity Pathway Analysis Results for Early Upregulated Genes.
Dataset 6. Gene Ontology (GO) Results for Early Upregulated Genes.
Dataset 7. Ingenuity Pathway Analysis Results for Late Upregulated Genes.
Dataset 8. Gene Expression Signature for Influenza Vaccine Responses.
Dataset 9. DAVID Gene Ontology (GO) Results for Highly Expressed Genes in High Responders.
Dataset 10. Ingenuity Pathway Analysis Results for Highly Expressed Genes in High Responders.
Dataset 11. DAVID Gene Ontology (GO) Results for Highly Expressed Genes in Low Responders.
Dataset 12. Ingenuity Pathway Analysis Results for Highly Expressed Genes in Low Responders.
This work is supported by a contract from the U.S. National Institutes of Health (NO1-AI-030039). K.L.B. is supported by a Ruth L. Kirschtein NRSA Pre-doctoral Training Grant (F31AI071372-03).