Influenza A virus infects host cells via interaction of the hemagglutinin (HA) attachment protein with sialylated glycan receptors on host-cell membranes1,2,3
. The primary host immune response to influenza involves antibodies with high neutralizing activity that recognize epitopes on the antigenic sites of HA, designated Ca, Cb, Sa, and Sb for H1 subtype HA4,5
. The ability to circumvent these host antibodies via accumulation of amino acid mutations within the antigenic sites of HA results in “antigenic drift” of influenza viruses. This capacity is a global burden to track and challenges vaccine development efforts6,7.
While antigenic and receptor binding sites were historically perceived as distinct regions on HA8,9
, recent studies have shown that mutations at antigenic sites, including sites distant from the receptor-binding site (RBS), can notably modulate glycan receptor-binding properties10,11,12
. Receptor-binding properties of HA are a critical determinant of influenza evolution, and there is a need to understand how host antigenic pressure shapes the receptor-binding site properties of HA13,14,15,16
. Such an understanding is important for enhancing pandemic preparedness, especially in light of still-circulating virulent H5N1 strains, the evolution and spread of Tamiflu-resistant H1N1 strains, and widespread cross-host reassortment on a global scale (as evidenced by the 2009 swine-origin H1N1 pandemic strain)17.
It has long been known that amino acid interactions are important determinants of protein fold-function-evolution relationships18,19,20,21
. Towards understanding the structural underpinnings of how antigenic site mutations modulate RBS properties and thus influence influenza virus evolution, we considered the networks of amino acid residue interactions for each residue on HA – termed the Significant Interactions Network (SIN)
. Inter-residue atomic interactions (hydrogen bonds, disulfide bonds, pi-bonds, polar interactions, salt bridges, and van der Waals interactions) were computed between all pairs of amino acid residues within the trimeric HA structure. Integration of all such inter-residue interactions provided a quantitative measure for each HA residue, which we termed the SIN score
(see Methods for details). The SIN scores of all HA residues were normalized to the highest SIN score of any residue within HA, so that the scores ranged from 0 (minimum) to 1 (maximum).
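The normalization step can be illustrated with a minimal sketch. The raw per-residue score below is simply the sum of hypothetical interaction weights over a residue's contacts; this is an assumption made for illustration and is not the scoring scheme detailed in Methods. Only the final division by the highest score follows the description above. The residue labels and weights are invented toy values.

```python
from collections import defaultdict


def raw_scores(interactions):
    """Sum interaction weights per residue.

    Illustrative stand-in for the integration of inter-residue
    interactions; the actual SIN scoring scheme is described in Methods.
    """
    totals = defaultdict(float)
    for res_a, res_b, weight in interactions:
        # Each pairwise interaction contributes to both residues.
        totals[res_a] += weight
        totals[res_b] += weight
    return dict(totals)


def normalize_by_max(scores):
    """Divide every score by the highest score, so the top residue gets 1."""
    top = max(scores.values())
    if top == 0:
        return {res: 0.0 for res in scores}
    return {res: score / top for res, score in scores.items()}


# Toy data: (residue, residue, hypothetical interaction weight).
toy_interactions = [
    ("A1", "B2", 2.0),
    ("A1", "C3", 1.0),
    ("B2", "C3", 0.5),
    ("C3", "D4", 0.5),
]

normalized = normalize_by_max(raw_scores(toy_interactions))
for residue, score in sorted(normalized.items()):
    print(residue, round(score, 3))
```

Note that dividing by the maximum guarantees an upper bound of 1, while a score of exactly 0 occurs only for a residue with no scored interactions.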
Illustrating the significant interaction networks (SIN) for amino acid residues constituting the influenza virus HA structure.
The SIN perspective on HA structure reveals a good correlation between the SIN score of a residue and its conservation in sequence space across multiple HA subtypes (Supplementary Figure S1). Residues with higher SIN scores are highly conserved because, from a network perspective, they are strongly constrained against mutation. The residues with a high propensity to mutate all have low SIN scores, reflecting weaker network constraints. Some residues with low SIN scores are nonetheless highly conserved. Compared with high-SIN-score residues, these residues may be more prone to mutate under selection pressure because they are less constrained from a network perspective.
To classify the SIN scores of residues in HA, the scores were grouped by the location of the residues in a representative trimeric H1N1 HA structure (Supplementary Figure S2). The solvent-exposed residues not involved in glycan receptor binding predominantly had SIN scores in the range of 0–0.25. Because these residues lie outside the core, the interface, and the RBS of the trimeric HA, they have a higher propensity to mutate, consistent with the weaker constraints on them from a network perspective. By comparison, a relatively higher fraction of the residues that are buried in the core or in the interface of the trimeric HA structure, or that are involved in anchoring the sialic acid of the glycan receptor (as described below), had higher SIN scores (in the range of 0.25–0.5 or >0.5). Given their critical structural and functional roles, these residues have a lower propensity to mutate, consistent with the stronger network constraints reflected in their higher SIN scores. Based on the distribution of SIN scores across these structural contexts, each residue was classified as having a high SIN score [0.5–1], medium SIN score [0.25–0.5], or low SIN score [0–0.25]. In contrast to the classical ribbon diagram
, the resulting SIN diagram perspective on influenza HA structure captures all residues (nodes) and their integrated inter-residue atomic interactions (edges).
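The three-way classification of normalized SIN scores can be sketched as a simple binning function. The stated ranges share their endpoints (0.25 and 0.5), so the sketch assumes, purely for illustration, that a boundary value belongs to the higher class; the function name and this tie-breaking rule are not taken from the text.

```python
def classify_sin(score):
    """Assign a normalized SIN score (0 to 1) to the low, medium, or high class.

    Assumption: boundary values 0.25 and 0.5 fall into the higher class,
    since the stated ranges do not say which class owns the endpoints.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("normalized SIN scores must lie between 0 and 1")
    if score >= 0.5:
        return "high"
    if score >= 0.25:
        return "medium"
    return "low"


# Toy scores, not values from the study.
for value in (0.10, 0.25, 0.40, 0.50, 1.00):
    print(value, classify_sin(value))
```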
The SIN perspective on HA structures permits an intuitive contrast of the degree of “networking” of the amino acid residues constituting HA, highlighted here with illustrative examples of “poorly networked” and “highly networked” residues. In this study, we focus on the SIN of the residues constituting the antigenic sites of influenza H1N1 HA so as to evaluate the impact of antigenic site mutations on RBS residues. For this purpose, we use the HA protein of the A/Puerto Rico/8/1934 (PR8) H1N1 influenza virus as a model system (see Methods
). The PR8 HA protein was chosen as a model system because of recently obtained in vivo experimental data on antigenic site escape mutants that emerged when PR8 virus was passaged through pre-vaccinated mice or subjected to monoclonal antibody selection pressure10.
Illustrative examples of poorly networked (low SIN score) residues and highly networked (high SIN score) residues of influenza H1N1 HA-1 domain.
Residues in the 150-loop (W153, T155), 130-loop (G134, T136), 180/190-loop (H183, E190, L194), 90-loop (Y98), and 220-loop (Q226, G228) are involved in anchoring the sialic acid (SA) monosaccharide of the host glycan receptor to PR8 HA. The composition, relative orientation of the side chains, stability, and interactions of each of these receptor-binding site (RBS) residues are critical determinants of host receptor-binding affinity for H1N1 HA21,22,23,24,25
. The PR8 antigenic site residues are L79, L80, P81, V82, R83, S84 (Cb antigenic site); P128, N129, E156, K157, E158, G159, S160, P162, K163, L164, K165, N166, S167 (Sa antigenic site); S140, H141, E142, G143, K144, S145, V169, N170, K171, K172, G173, T206, S207, N208, R224, D225, K238, P239, G240 (Ca antigenic site); and N187, S188, K189, E190, Q191, Q192, N193, L194, Y195, Q196, N197, E198 (Sb antigenic site). Thus, in this study, we focus on the SIN of each of these antigenic site residues and evaluate how antigenic mutations can impact HA affinity for the glycan receptor.
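As a small bookkeeping aid, the residue lists above can be encoded as sets to check which antigenic-site residues also belong to the sialic acid anchoring RBS. The sketch below only transcribes the residues named in the text; the variable and function names are illustrative choices, and the comparison is plain set arithmetic, not part of the SIN analysis itself.

```python
# Residue lists transcribed from the text above (PR8 H1N1 HA numbering).
RBS_RESIDUES = set("Y98 G134 T136 W153 T155 H183 E190 L194 Q226 G228".split())

ANTIGENIC_SITES = {
    "Cb": set("L79 L80 P81 V82 R83 S84".split()),
    "Sa": set(
        "P128 N129 E156 K157 E158 G159 S160 P162 K163 L164 K165 N166 S167".split()
    ),
    "Ca": set(
        "S140 H141 E142 G143 K144 S145 V169 N170 K171 K172 G173 "
        "T206 S207 N208 R224 D225 K238 P239 G240".split()
    ),
    "Sb": set(
        "N187 S188 K189 E190 Q191 Q192 N193 L194 Y195 Q196 N197 E198".split()
    ),
}


def position(residue):
    """Return the sequence position from a label such as 'E190'."""
    return int(residue[1:])


# For each antigenic site, list the residues it shares with the RBS.
overlap = {
    site: sorted(residues & RBS_RESIDUES, key=position)
    for site, residues in ANTIGENIC_SITES.items()
}

for site, shared in overlap.items():
    print(site, len(ANTIGENIC_SITES[site]), "residues; shared with RBS:", shared)
```

With the lists as given, only the Sb site shares residues with the RBS (E190 and L194), so mutations at those two positions act on antigenic and receptor-binding residues at once.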
Highlighting the amino acid residues constituting the sialic acid anchoring RBS residues and the antigenic site (Sa, Sb, Ca, Cb) residues of influenza H1N1 PR8 HA.