Influenza viral passaging through pre-vaccinated mice shows that emergent antigenic site mutations on the viral hemagglutinin (HA) impact host receptor-binding affinity and, therefore, the evolution of fitter influenza strains. To understand this phenomenon, we computed the Significant Interactions Network (SIN) for each residue and mapped the networks of antigenic site residues on a representative H1N1 HA. Specific antigenic site residues are ‘linked’ to receptor-binding site (RBS) residues via their SIN and mutations within “RBS-linked” antigenic residues can significantly influence receptor-binding affinity by impacting the SIN of key RBS residues. In contrast, other antigenic site residues do not have such “RBS-links” and do not impact receptor-binding affinity upon mutation. Thus, a potential mechanism emerges for how immunologic pressure on RBS-linked antigenic residues can contribute to evolution of fitter influenza strains by modulating the host receptor-binding affinity.
Influenza A virus infects host cells via interaction of the HA attachment protein with sialylated glycan receptors on host-cell membranes1,2,3. The primary host immune response to influenza involves antibodies with high neutralizing activity that recognize epitopes on the antigenic sites of HA, designated Ca, Cb, Sa, and Sb for H1 subtype HA4,5. The ability to circumvent these host antibodies via accumulation of amino acid mutations within the antigenic sites of HA results in “antigenic drift” of influenza viruses. Tracking this drift is a global burden, and the drift itself challenges vaccine development efforts6,7.
While antigenic and receptor binding sites were historically perceived as distinct regions on HA8,9, recent studies have shown that mutations at antigenic sites — including those at sites distant from the RBS — can notably modulate glycan receptor binding properties10,11,12. Receptor-binding properties of HA are a critical determinant of influenza evolution, and there is a need to understand how host antigenic pressure shapes the receptor-binding site properties of HA13,14,15,16. Such an understanding is important to enhance pandemic preparedness, especially in light of still circulating virulent H5N1 strains, evolution and spread of Tamiflu-resistant H1N1 strains, and widespread cross-host reassortment at a global-scale (as evidenced by the 2009 swine-origin H1N1 pandemic strain)17.
It has long been known that amino acid interactions are important determinants of protein fold-function-evolution relationships18,19,20,21. Towards understanding the structural underpinnings of how antigenic site mutations modulate RBS properties and thus influence influenza virus evolution, we considered the networks of amino acid residue interactions for each residue on HA – termed the Significant Interactions Network (SIN) (Figure 1). Inter-residue atomic interactions — including hydrogen bonds, disulfide bonds, pi-bonds, polar interactions, salt bridges, and van der Waals interactions — were computed between all pairs of amino acid residues within the trimeric HA structure. Integration of all such inter-residue interactions provided a quantitative measure for each HA residue, which we termed the SIN score (Figure 1 - see Methods for details). The SIN scores of all HA residues were normalized based on the highest SIN score amino acid within HA, such that the scores varied from 0 (minimum) to 1 (maximum) for each residue.
The SIN perspective on HA structure provides a good correlation between SIN score of a residue and its conservation in sequence space across multiple HA subtypes (Supplementary Figure S1). Residues with higher SIN scores are highly conserved given that they are highly constrained to mutate from a network perspective. The residues with a high propensity to mutate all have low SIN scores due to lower constraints from a network perspective. Some residues with low SIN score are also seen to be highly conserved. These residues may have a higher propensity to mutate if there is any selection pressure (compared to high SIN score residues) due to lower constraints from a network perspective.
To classify the SIN scores of residues in HA, these scores were grouped based on the location of the residues in a representative trimeric H1N1 HA structure (Supplementary Figure S2). The solvent exposed residues (not involved in glycan receptor binding) predominantly had SIN scores in the range of 0–0.25. Given that these residues were outside the core or interface or RBS of the trimeric HA, they had a higher propensity to mutate which correlated with the lesser constraints on these residues from a network perspective. On the other hand a relatively higher fraction of residues (compared to solvent exposed residues) that were buried in the core or in the interface of trimeric HA structure or were involved in anchoring sialic acid of glycan receptor (as described below) had higher SIN scores (in the range of 0.25–0.5 or >0.5). Given the critical structural and functional role of these residues, they have a lower propensity to mutate which correlated with more constraints imposed on these residues from a network perspective (higher SIN scores). Based on the distribution of SIN scores in these contexts of residues in HA, each residue was classified as having a high SIN score [0.5–1], medium SIN score [0.25–0.5], or low SIN score [0–0.25]. In contrast to the classical Ribbon diagram, the resulting SIN diagram perspective to influenza HA structure captures all residues (nodes) and their integrated inter-residue atomic interactions (edges) (Figure 1).
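The three-way classification described above can be sketched in a few lines of Python. This is only an illustration, not the authors' MATLAB code: the function name `classify_sin_score` is hypothetical, and because the stated ranges share their endpoints (0.25 and 0.5), the sketch assumes that a boundary value falls into the higher class.

```python
def classify_sin_score(score: float) -> str:
    """Classify a normalized SIN score as 'low', 'medium', or 'high'.

    Ranges follow the text: low [0-0.25], medium [0.25-0.5], high [0.5-1].
    Assumption: a score exactly on a boundary goes to the higher class,
    since the text does not say how the shared endpoints are assigned.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("normalized SIN scores must lie between 0 and 1")
    if score < 0.25:
        return "low"
    if score < 0.5:
        return "medium"
    return "high"


if __name__ == "__main__":
    # Illustrative values only; these are not scores from the study.
    for value in (0.1, 0.3, 0.8):
        print(value, classify_sin_score(value))
```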
The SIN perspective on HA structures permits intuitive contrast of the degree of “networking” of amino acid residues constituting HA structures, highlighted here for illustrative examples of “poorly networked” residues (Figure 2A) and “highly networked” residues (Figure 2B). In this study, we focus on the SIN of the residues constituting the antigenic sites of influenza H1N1 HA so as to evaluate the impact of antigenic site mutations on RBS residues. For this purpose, we use the HA protein of the A/Puerto Rico/8/1934 (PR8) H1N1 influenza virus as a model system (see Methods). The PR8 HA protein was chosen because of recently obtained in vivo experimental data on the antigenic site escape mutants that emerged from PR8 virus passaging through pre-vaccinated mice or under monoclonal antibody selection pressure10.
The 150-loop (W153, T155), 130-loop (G134, T136), 180/190-loop (H183, E190, L194), 90-loop (Y98), and 220-loop (Q226, G228) are involved in anchoring the Sialic Acid (SA) monosaccharide of the host glycan receptor to PR8 HA (Figure 3A). The composition, relative orientation of the side-chains, stability, and interactions for each of these receptor binding site (RBS) residues are critical determinants of host receptor-binding affinity for H1N1 HA21,22,23,24,25. The PR8 antigenic site residues are L79, L80, P81, V82, R83, S84 (Cb antigenic site); P128, N129, E156, K157, E158, G159, S160, P162, K163, L164, K165, N166, S167 (Sa antigenic site); S140, H141, E142, G143, K144, S145, V169, N170, K171, K172, G173, T206, S207, N208, R224, D225, K238, P239, G240 (Ca antigenic site); and N187, S188, K189, E190, Q191, Q192, N193, L194, Y195, Q196, N197, E198 (Sb antigenic site) (Figure 3B). Thus, in this study, we focus on the SIN of each of these antigenic site residues and evaluate how antigenic mutations can impact HA affinity to the glycan receptor.
SIN analysis of the PR8 trimeric HA crystal structure (obtained from PDB ID:1RVZ) shows that all of the experimentally observed mutations impinging on glycan receptor-binding affinity10 are on antigenic residues with a SIN that includes SA-anchoring RBS residues (Table 1). An illustrative example is the SIN of K165 that includes H183 and E190 which are key SA-anchoring RBS residues, despite the fact that K165 has nearly 20 angstroms distance separation from H183/E190 on the PR8 HA structure (Figure 4). In addition, the SIN of K165 contains residues from one other neighboring HA monomer, thus “connecting” the HA glycoprotein across the HA1-HA1 protein-protein interface in the trimeric structure. Similarly, the SIN of I244 includes the SA-anchoring RBS residues H183 and L194 residues from the neighboring HA monomer, despite more than 20 angstroms distance between I244 and H183/L194 (Figure 4). In addition to mutations on PR8 HA antigenic residues that have cross-monomer “RBS links”, antigenic mutations affecting receptor-binding affinity are also seen to be on residues that have intra-monomer “RBS links”. An illustrative example is the SIN of L164 which includes the SA-anchoring RBS residues W153, H183, Y98, and Q226 (Figure 4). An additional example is the SIN of N129 that includes the SA-anchoring RBS residue T155. Thus, escape mutations that impinge on glycan receptor binding affinity are observed to be on antigenic residues with intra-monomer or inter-monomer RBS-links (i.e. antigenic residues with a SIN that contains SA-anchoring RBS residues).
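The notion of an “RBS-link” reduces to a set intersection: an antigenic residue is RBS-linked when its SIN contains at least one SA-anchoring RBS residue. The Python sketch below illustrates this under stated assumptions. The helper names `rbs_links` and `is_rbs_linked` are hypothetical, and the SIN memberships listed are partial: they contain only the RBS residues named in the text for K165, L164, and N129, not the full networks from Table 1.

```python
# SA-anchoring RBS residues of PR8 HA, as listed in the text.
SA_ANCHORING_RBS = frozenset({
    "W153", "T155",          # 150-loop
    "G134", "T136",          # 130-loop
    "H183", "E190", "L194",  # 180/190-loop
    "Y98",                   # 90-loop
    "Q226", "G228",          # 220-loop
})


def rbs_links(sin_members):
    """Return the SA-anchoring RBS residues found in a residue's SIN."""
    return sorted(set(sin_members) & SA_ANCHORING_RBS)


def is_rbs_linked(sin_members):
    """True if the SIN contains at least one SA-anchoring RBS residue."""
    return bool(rbs_links(sin_members))


# Partial SIN memberships: only the RBS members mentioned in the text.
# A complete SIN would also hold non-RBS residues (see Table 1).
partial_sin = {
    "K165": {"H183", "E190"},
    "L164": {"W153", "H183", "Y98", "Q226"},
    "N129": {"T155"},
}

if __name__ == "__main__":
    for residue, members in partial_sin.items():
        print(residue, is_rbs_linked(members), rbs_links(members))
```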
Further analysis of the specific antigenic site escape mutants with modified receptor-binding affinity shows that these HA mutants possess altered SIN for one or more SA-anchoring RBS residues (Figure 5). For instance, we observed a general trend of increase in the SIN score of H183 (a critical SA-anchoring RBS residue) as compared to its score in wild-type PR8 HA in each of the following antigenic mutants, N129K, E156G, E156K, L164Q, K165E, N166K, Q196R, E198G, R224I, and I244T. H183 is contained within the SIN of each of these antigenic residues that were mutated (Table 1). The PR8 HA-SA co-complex crystal structure shows that His-183 forms a hydrogen bond with the 9-hydroxyl group of SA, in addition to a hydrogen bond with the SA-anchoring Tyr-98 as well as with the RBS-proximal Y195 (Supplementary Figure S3). Site-directed mutageneses confirm that the extended hydrogen bond network associated with these residues is an important determinant of SA-binding affinity of HA1. The increase in SIN score of H183 correlates with increasing its stability in the RBS and offers an explanation for increase in glycan-receptor binding affinity by antigenic mutations that are far removed from the RBS in three-dimensional space. Conversely, the SIN score of H183 is lowered by I93T (another antigenic residue in its SIN) and this correlates with reducing the stability of this residue in the context of this mutation and hence offers an explanation of the observed reduction in glycan-binding affinity.
The above relationship between antigenic mutations that increase or decrease SIN score of key SA-anchoring RBS residues and the respective increase or decrease in glycan-binding affinity is consistently observed for all the antigenic escape mutants (Figure 5; Table 1). For instance, some of the mutations on other “RBS-linked” antigenic residues of PR8 are observed to modify the SIN of the critical SA-anchoring residues W153 (that stabilizes the RBS via extensive van der Waals interaction networks), Y98 (that hydrogen bonds with the 8-hydroxyl group of the SA moiety on the receptor), and L194 (that makes non-polar contacts with the N-acetyl methyl group of the SA moiety on the receptor)1.
Amongst the few emergent PR8 escape mutants that showed no change in glycan receptor binding affinity, the mutating residues almost always have no SA-anchoring RBS residues in their SIN (i.e. antigenic residues without any “RBS-link”) (Figure 5; Table 1). The only exceptions, in which such mutations occurred on “RBS-linked” antigenic residues, are the N129Y, S160L, K163T, Q192L, and S140P mutations. However, none of these escape mutations has any effect on the SIN of SA-anchoring RBS residues.
Taken together, the above results demonstrate that antigenic site residues whose SIN contains SA-anchoring RBS residues (RBS-linked antigenic residues) may undergo specific types of mutations that influence SA-anchorage and, thus, receptor-binding affinity of HA. "RBS-linked" antigenic residues thus emerge as an important factor shaping the phenotype of escape mutants emerging from H1N1 virus evolution.
Towards understanding the immunological implications of our observations, we considered the known B-cell epitopes for PR8 HA from the immune epitope database (IEDB; www.immuneepitope.org). This analysis shows that many of the known B-cell epitopes (including epitope IDs 72805, 77507, 77508, 77509, 77510, 12285, and 76992) are constituted from one or more "RBS-linked" antigenic residues. The experimental data analyzed (Table 1)10, shows that the specific RBS-linked antigenic residues contained within these epitopes are able to harbor mutations that modulate the SIN of key SA-anchoring RBS residues (Table 2; Supplementary Figure S4). This suggests that B-cell targeting of influenza PR8 HA (and potentially HA in general) — involving host antibodies recognizing surface epitopes within the Sa/Sb/Ca/Cb antigenic sites of HA — may be contributing to the emergence of potentially "fitter" influenza strains associated with increased host receptor-binding affinity.
In addition to identification of "RBS-linked" antigenic residues as an important determinant of H1N1 evolution, SIN analysis of PR8 HA identified a remarkably high number of stabilizing atomic interactions between the highly networked SA-anchoring amino acids in the RBS of PR8 HA (many of which are indicated in Figure 2). A higher SIN profile of the RBS may be an evolutionary solution to limit the vibrational entropy of HA RBS residues, which is an important consideration for enhancing protein-glycan interaction affinity1 and, in turn, influenza infection and transmission efficiency23,24,25.
While the results presented in this study were derived by analyzing PR8 H1N1 HA as an illustrative model system, SIN analysis may be readily applied to analyze HA structures regardless of strain or subtype. More broadly, the results obtained here suggest that an effective "antigenic-RBS linkage density" is a critical determinant of the evolutionary abilities of different H1N1 strains and, for that matter, of other influenza strains. The analysis of known B-cell epitopes on PR8 HA suggests that immunologic targeting of influenza HA by host antibodies can select for escape mutants with increased receptor-binding affinity, thus aiding the propagation of potentially fitter strains. As more structures of antibodies complexed to H1N1 HA are determined in the future, the database of B-cell epitopes will expand, permitting a more comprehensive analysis of how B-cell targeting of influenza HA contributes to the evolution of receptor-binding properties.
This study emphasizes that SIN analysis may be a valuable tool to factor into mapping of the influenza antigenic site mutants that may be likely to emerge from global influenza circulation under herd immunologic pressure — particularly across the heavily pre-vaccinated communities during each influenza season. Indeed, SIN analysis of influenza HA structures provides a new perspective on the link between the receptor-binding affinity of HA and antigenic site mutants. This new perspective will be valuable in complementing current methods that track global circulation of influenza strains, such as antigenic cartography. In this capacity, continual network analysis of all circulating H1N1 HA structures can potentially accelerate and optimize the selection of the ideal vaccine strains for each flu season.
All protein sequences of H1N1 subtypes were obtained from www.fludb.org/brc/home.do. Sequences were aligned with the MATLAB multialign and Jalview MUSCLE multiple sequence alignment algorithms. Phylogenetic analyses were performed as required using the Phylowidget tool (www.phylowidget.org/full/index.html). Protein modeling was performed using Accelrys Discovery Studio (DS) by employing the build multiple homology models protocol (www.accelrys.com/products/discovery-studio/). The PR8 crystal structure (PDB ID:1RVZ) obtained from the Protein Data Bank (www.rcsb.org) was used to build homology models of the antigenic mutant forms described throughout the manuscript. PyMOL and Python scripting were used for visualization of the modeled molecular structures. Modeled protein structures were analyzed with our significant interactions network (SIN) computation MATLAB protocols in the following manner. Using the coordinates of each protein structure (PDB file), we computed instances of putative hydrogen bonds (including water-bridged ones), disulfide bonds, pi-bonds, polar interactions, salt bridges, and van der Waals interactions (non-hydrogen) occurring between pairs of residues, using appropriate distance thresholds (each of these chemical and physical atomic interactions is described extensively in the literature; see Supplementary References S1–S45 for further information on these atomic interactions).
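As a rough illustration of the distance-threshold step, the Python sketch below marks two residues as interacting when any pair of their atoms lies within a cutoff. It is not the authors' MATLAB protocol: the function names, the toy coordinates, and the 4.0 angstrom cutoff are all assumptions made for the example, and a real analysis would use a separate criterion for each interaction type (hydrogen bond, salt bridge, and so on).

```python
import math


def residues_in_contact(atoms_a, atoms_b, cutoff):
    """True if any atom of one residue lies within `cutoff` of the other.

    Each residue is given as a list of (x, y, z) coordinates in angstroms.
    """
    return any(math.dist(p, q) <= cutoff for p in atoms_a for q in atoms_b)


def contact_matrix(residues, cutoff):
    """Build a symmetric 0/1 matrix of residue pairs within the cutoff.

    `residues` maps a residue label to its list of atom coordinates.
    Returns the residue labels (row/column order) and the matrix.
    """
    names = list(residues)
    n = len(names)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if residues_in_contact(residues[names[i]], residues[names[j]], cutoff):
                matrix[i][j] = matrix[j][i] = 1
    return names, matrix


if __name__ == "__main__":
    # Toy coordinates, not taken from PDB ID 1RVZ.
    toy = {
        "A": [(0.0, 0.0, 0.0)],
        "B": [(3.0, 0.0, 0.0)],
        "C": [(10.0, 0.0, 0.0)],
    }
    names, matrix = contact_matrix(toy, cutoff=4.0)  # hypothetical cutoff
    print(names)
    for row in matrix:
        print(row)
```

Running one such pass per interaction type, each with its own criterion, would yield the set of per-type matrices that the scoring step takes as input.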
These data were assembled into an array of eight atomic interaction matrices. A weighted sum of the eight atomic interaction matrices was then computed to produce a single matrix that accounts for the strength of atomic interaction between residue pairs, using weights derived from relative atomic interaction energies and including weights for inter-chain interactions and long-range over short-range interactions (the relative energies of atomic interactions are described in the literature extensively; Supplementary References S1–S45). The resulting inter-residue energetic interaction matrix describes all first-order interactions for the analyzed molecular structure. All interaction pathways, regardless of length, were then enumerated. Using the collection of paths identified (and their corresponding scores), the complete SIN matrix was created, wherein each element i, j is the sum of the path scores of all paths. The degree of networking (henceforth termed SIN score) for each residue was computed by summing across the rows of the matrix, which was meant to correspond to the extent of "networking" for each residue. The degree of networking scores were normalized with the maximum score for each protein so that the scores varied from 0 (minimum) to 1 (maximum) for each protein analyzed. MATLAB was used to develop the analytical methods outlined here. An R script was used to visualize the SIN diagram of each protein to visually appreciate the degree of networks constituting each protein structure (Figure 1). SIN scores were calculated for representative crystal structures at different resolutions (for the same protein) to demonstrate that small variations in the resolution of the structures did not alter the SIN of the residues in that structure (Supplementary Table S1).
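The scoring pipeline described above (weighted sum, path enumeration, row sums, normalization) can be sketched in Python as follows. This is a minimal illustration under several assumptions that the text does not confirm: element i, j is taken to sum over paths that start at residue i and end at residue j, a path's score is taken to be the product of its edge weights, only simple paths (no repeated residue) are counted, and path length is capped by a hypothetical `max_len` parameter to keep the enumeration tractable. The function names and toy weights are likewise illustrative, not the authors' MATLAB implementation.

```python
def weighted_sum(matrices, weights):
    """Combine per-interaction-type matrices into one first-order matrix."""
    n = len(matrices[0])
    return [
        [sum(w * m[i][j] for m, w in zip(matrices, weights)) for j in range(n)]
        for i in range(n)
    ]


def sin_matrix(first_order, max_len):
    """Sum path scores over simple paths of up to `max_len` edges.

    Assumptions: element [i][j] covers paths from i to j, and a path's
    score is the product of its edge weights.
    """
    n = len(first_order)
    sin = [[0.0] * n for _ in range(n)]

    def walk(start, node, visited, score, depth):
        for nxt in range(n):
            weight = first_order[node][nxt]
            if weight == 0 or nxt in visited:
                continue
            sin[start][nxt] += score * weight
            if depth + 1 < max_len:
                walk(start, nxt, visited | {nxt}, score * weight, depth + 1)

    for start in range(n):
        walk(start, start, {start}, 1.0, 0)
    return sin


def sin_scores(sin):
    """Row sums of the SIN matrix, divided by the largest row sum."""
    totals = [sum(row) for row in sin]
    top = max(totals)
    return [t / top if top else 0.0 for t in totals]


if __name__ == "__main__":
    # Toy three-residue chain 0-1-2 with two interaction types.
    type_a = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    type_b = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    first_order = weighted_sum([type_a, type_b], [0.25, 0.25])  # toy weights
    print(sin_scores(sin_matrix(first_order, max_len=2)))
```

In this toy chain the middle residue receives the highest score because it lies on every path, which matches the intuition that highly networked residues score closer to 1.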
Venkataramanan S., S.Z., N.P., K.W., R.R. and S.R. carried out analyses and wrote various codes for the analysis. Venkataramanan, S., S.R., V.S. and R.S. designed the methods and the overall conceptual definitions. I.A.W. provided additional structural perspectives to the approach. Venkataramanan S. and R.S. did the major writing of the manuscript. R.R. and R.S. reviewed the manuscript.
This work was supported by National Institutes of Health grant GM R37 GM057073-13 and Singapore–MIT Alliance for Research and Technology (SMART) to RS and the grant AI058113 to IAW. The authors would like to thank Mike Rooney for his help with developing a MATLAB and R based network visualization tool.