Phylogeny and principal component cluster analysis revealed that the H gene of A/H1N1 is most closely related to the triple-reassortant swine influenza H1N1 and H1N2 viruses found in North America since 2000 (Figure ). These results, which suggest that A/H1N1 emerged from this cluster of viruses, are consistent with the recently reported findings of Trifonov and co-workers [15]. The H genes of swine H1N1 and H1N2 from North America were analyzed by the ISM and compared to A/H1N1 strains. Figure shows the typical IS profile of a selected virus (Figure ) as well as the cumulated CIS (Figure ) of all viruses of each group.
The CIS of HA1 of swine H1N2/H1N1 and A/H1N1 strains have characteristic dominant peaks at the IS frequencies F(0.055) and F(0.295), respectively (Figure and ). According to the ISM concept, this reflects a differential interaction pattern of the HA1 of the two groups of viruses. The tropism of the viruses suggests that a high amplitude at F(0.055) corresponds to a higher propensity to interact with swine protein(s), while a high amplitude at F(0.295) may correspond to a preferred interaction with human protein(s). The CIS of the HA1 gene of swine H1N1 viruses isolated from humans in the US (before 2008) contains characteristic peaks at both frequencies, F(0.055) and F(0.295) (Figure ). This suggests that these viruses, which sporadically infected humans, display both the distinct "swine" interaction pattern shared with the swine H1N1/N2 viruses and the characteristic "human" interaction pattern shared with A/H1N1 viruses. Thus, these three groups of viruses have distinct propensities to interact with swine and human proteins, which can be described by ISM analysis. These results also provide additional strong evidence that HA1 from swine viruses infecting humans in the US before 2008 were the likely precursors of A/H1N1.
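The spectra discussed above can be illustrated with a minimal sketch. It assumes the usual ISM recipe: each residue is encoded by its EIIP value, the numerical series is Fourier transformed, and the CIS is taken as the point-wise product of the individual spectra. The EIIP values below are those commonly tabulated in the ISM literature and should be checked against the table used in this study; the function names, the mean removal and the zero-padding to a common length are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

# EIIP values (in Ry) as commonly tabulated in the ISM literature.
# Assumption: verify against the EIIP table referenced in the text.
EIIP = {
    "L": 0.0000, "I": 0.0000, "N": 0.0036, "G": 0.0050, "V": 0.0057,
    "E": 0.0058, "P": 0.0198, "H": 0.0242, "K": 0.0371, "A": 0.0373,
    "Y": 0.0516, "W": 0.0548, "Q": 0.0761, "M": 0.0823, "S": 0.0829,
    "C": 0.0829, "T": 0.0941, "F": 0.0946, "R": 0.0959, "D": 0.1263,
}

def encode(seq):
    """Map a protein sequence to its EIIP series with the mean removed."""
    x = np.array([EIIP[a] for a in seq], dtype=float)
    return x - x.mean()

def informational_spectrum(seq, n_points):
    """Energy density spectrum of one sequence, zero-padded to n_points."""
    return np.abs(np.fft.rfft(encode(seq), n=n_points)) ** 2

def consensus_spectrum(seqs):
    """Point-wise product of the spectra of all sequences in a group."""
    n = max(len(s) for s in seqs)
    cis = np.ones(n // 2 + 1)
    for s in seqs:
        cis *= informational_spectrum(s, n)
    return np.fft.rfftfreq(n), cis  # frequencies run from 0 to 0.5
```

Under these assumptions, a frequency shared by every sequence in a group survives the product as a dominant peak, while peaks specific to single sequences are suppressed.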
Changes in H1N1/N2 viruses that lead to enhanced transmission in humans are of particular interest. The comparison of HA1 sequences of H1N1/N2 viruses which infected only swine with the early Mexican A/H1N1 strain A/Mexico/4115/2009 revealed 14 amino acid substitutions which are highly specific for A/H1N1 viruses (Table ).
| Table 2. Effect of HA1 polymorphisms on the amplitudes corresponding to IS frequencies F(0.055) and F(0.295). |
We further investigated which of these mutations, or combinations thereof, are most important for the switch in interaction pattern from swine H1N1/N2 to A/H1N1 strains. As shown above, the interaction between H1N1/N2 and swine protein(s) and the interaction between A/H1N1 and human protein(s) are characterized by the frequencies F(0.055) and F(0.295), respectively. According to the ISM concept [9,16], mutations in HA1 which increase the amplitude at F(0.295) and decrease the amplitude at F(0.055) would potentially contribute to the switch of the viral host tropism from swine to human. Seven of the 14 mutations presented in Table (R36K, F71S, T128S, T216I, S271P, E302K, M314L) increase the amplitude at F(0.295) and decrease the amplitude at F(0.055), suggesting that these mutations may be critical for the switch of H1N1/N2 from a swine to a human tropism. It is of note that three of these mutations (R36K, T216I, S271P) are also present in swine H1N1 viruses that infected humans in the US between 2005 and 2007 (Figure ). ISM analysis also showed that any combination of the mutations F71S, T128S, E302K and M314L, which are present only in A/H1N1, decreases the amplitude at F(0.055) and increases the amplitude at F(0.295). This suggests that these four mutations may play an important role in the efficient infection of humans by A/H1N1, and perhaps in effective human-to-human transmission. Of the remaining seven mutations in Table , six decrease the amplitudes at both frequencies or have no effect, and only one (N168D) increases the amplitude at F(0.055) and decreases the amplitude at F(0.295). This suggests that the mutations in A/H1N1 that predispose for the human interaction pattern are remarkably more prevalent than mutations that predispose for the swine interaction pattern.
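The classification of mutations described above can be sketched as follows. This is a minimal illustration, not the procedure used to produce the table: the amplitude is evaluated directly at the requested frequency with a discrete-time Fourier sum, positions are assumed to be 1-based indices into the supplied sequence, and `amplitude_at`, `apply_mutation` and `classify_mutation` are hypothetical helper names. The EIIP values are those commonly tabulated in the ISM literature and should be checked against the table used in this study.

```python
import re
import numpy as np

# EIIP values (in Ry) as commonly tabulated in the ISM literature.
# Assumption: verify against the EIIP table referenced in the text.
EIIP = {
    "L": 0.0000, "I": 0.0000, "N": 0.0036, "G": 0.0050, "V": 0.0057,
    "E": 0.0058, "P": 0.0198, "H": 0.0242, "K": 0.0371, "A": 0.0373,
    "Y": 0.0516, "W": 0.0548, "Q": 0.0761, "M": 0.0823, "S": 0.0829,
    "C": 0.0829, "T": 0.0941, "F": 0.0946, "R": 0.0959, "D": 0.1263,
}

def amplitude_at(seq, freq):
    """Spectral amplitude of the mean-removed EIIP series at one frequency."""
    x = np.array([EIIP[a] for a in seq], dtype=float)
    x = x - x.mean()
    m = np.arange(len(x))
    return abs(np.sum(x * np.exp(-2j * np.pi * freq * m)))

def apply_mutation(seq, mutation):
    """Apply a substitution such as 'R36K' (1-based position, assumed)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"cannot parse mutation {mutation!r}")
    old, pos, new = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= pos <= len(seq) or seq[pos - 1] != old:
        raise ValueError(f"{mutation}: residue {old} not found at position {pos}")
    return seq[:pos - 1] + new + seq[pos:]

def classify_mutation(seq, mutation, f_swine=0.055, f_human=0.295):
    """Return amplitude changes at both frequencies and a coarse label."""
    mutant = apply_mutation(seq, mutation)
    d_swine = amplitude_at(mutant, f_swine) - amplitude_at(seq, f_swine)
    d_human = amplitude_at(mutant, f_human) - amplitude_at(seq, f_human)
    if d_human > 0 and d_swine < 0:
        label = "toward human pattern"
    elif d_human < 0 and d_swine > 0:
        label = "toward swine pattern"
    else:
        label = "mixed or no effect"
    return d_swine, d_human, label
```

Combinations of mutations can be examined in the same way by chaining `apply_mutation` calls before comparing amplitudes with the parent sequence.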
It can be expected that A/H1N1 strains will accumulate additional polymorphisms in their HA1 genes which further favor the "human" interaction pattern and which, according to the ISM concept, would be associated with an increase of the amplitude at frequency F(0.295) and a decrease of the amplitude at F(0.055). To identify candidate residues ("hot-spots") for such polymorphisms, we performed an in silico alanine scan of the complete HA1. This analysis revealed that mutations of residues 94D, 196D and 274D would increase the amplitude at the critical frequency F(0.295) (Figure ). Since Asp has the highest EIIP value (Table ), substitutions in any of the above positions will increase the amplitude at frequency F(0.295). Interestingly, Asp (single letter code D) in positions 94, 196 and 274 is highly conserved in all North American swine H1N1/N2 strains and in all A/H1N1 HA1 genes. The only notable exceptions are four A/H1N1 isolates from Spain (A/Castilla-La Mancha/GP13/2009, A/Castilla-La Mancha/GP9/2009, A/Valencia/GP4/2009, A/Catalonia/P148/2009), two isolates from Italy (A/Italy/06/2009) and four isolates from the US (A/South Carolina/09/2009; A/South Dakota/05/2009; A/South Carolina/10/2009; A/Missouri/023/2009) that have a D274E mutation (Figure ). This mutation significantly increases the amplitude at the frequency F(0.295) and probably enhances the "human" interaction pattern. The same result was obtained for a scan with any other amino acid, with the exception of Asp (results not shown). As can be seen in Figure , A/Castilla-La Mancha/GP13/2009 (FJ985753), which is identical to A/South Carolina/09/2009 (GQ221794), differs from the early Mexican A/H1N1 isolate A/Mexico/4115/2009 (EPI177288) only in positions I32L and E257D. The amino acids L and I have the same EIIP value (see Table ) and the L>I substitution does not affect the informational spectrum.
In contrast, the EIIP values of amino acids D and E are significantly different (see Table ) and the D>E mutation increases the amplitude at frequency F(0.295) by 15%, with little effect on the structural properties of the protein since both amino acids are negatively charged. The stable mutation D274E found in the US, Spain and Italy may correspond to further adaptation of A/H1N1 to humans. Interestingly, Spain was also one of the first (European) countries with indigenous chains of transmission, at a time when in other countries the virus was mostly found in imported cases.
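An in silico substitution scan of the kind described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: every residue is replaced in turn by a single amino acid (alanine by default), the amplitude at F(0.295) is evaluated by a direct Fourier sum over the mean-removed EIIP series, and positions are ranked by the resulting change. The names `amplitude_at` and `substitution_scan` are hypothetical, and the EIIP values are those commonly tabulated in the ISM literature and should be checked against the table used in this study.

```python
import numpy as np

# EIIP values (in Ry) as commonly tabulated in the ISM literature.
# Assumption: verify against the EIIP table referenced in the text.
EIIP = {
    "L": 0.0000, "I": 0.0000, "N": 0.0036, "G": 0.0050, "V": 0.0057,
    "E": 0.0058, "P": 0.0198, "H": 0.0242, "K": 0.0371, "A": 0.0373,
    "Y": 0.0516, "W": 0.0548, "Q": 0.0761, "M": 0.0823, "S": 0.0829,
    "C": 0.0829, "T": 0.0941, "F": 0.0946, "R": 0.0959, "D": 0.1263,
}

def amplitude_at(seq, freq):
    """Spectral amplitude of the mean-removed EIIP series at one frequency."""
    x = np.array([EIIP[a] for a in seq], dtype=float)
    x = x - x.mean()
    m = np.arange(len(x))
    return abs(np.sum(x * np.exp(-2j * np.pi * freq * m)))

def substitution_scan(seq, freq=0.295, replacement="A"):
    """Replace each residue in turn and rank positions by amplitude change.

    Returns (position, wild_type_residue, delta_amplitude) tuples with
    1-based positions, sorted from largest increase to largest decrease.
    """
    base = amplitude_at(seq, freq)
    results = []
    for i, wild_type in enumerate(seq):
        if wild_type == replacement:
            continue  # substitution would leave the sequence unchanged
        mutant = seq[:i] + replacement + seq[i + 1:]
        results.append((i + 1, wild_type, amplitude_at(mutant, freq) - base))
    return sorted(results, key=lambda r: r[2], reverse=True)
```

Calling `substitution_scan(ha1, replacement="E")` on a hypothetical HA1 sequence `ha1` would repeat the scan with glutamate, in line with the observation that scans with other amino acids give the same hot-spots.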
The computer scanning survey of the HA1 amino acid sequence of A/H1N1 strains showed that the main contribution to the information represented by the frequency F(0.295) comes from a domain located in the C-terminus of the protein which encompasses residues 286 - 326 (denoted VIN2) of the mature protein. Figure shows the IS and the position of the VIN2 domain in the 3D structure of the A/H1N1 isolate A/California/04/2009. It is of note that VIN2 is conserved in all A/H1N1 and that two of the four polymorphisms (E302K and M314L) which are identified as critical for human infection are located within this domain. The significance of these polymorphisms is strengthened by the fact that domain 286 - 326 is also highly conserved in HA1 of swine viruses (only 30 of 500 swine H1N1 HA1 sequences in the UniProt database have mutations in this domain).
The relative position of the receptor binding domain and the receptor targeting domain (VIN2) in the 3D structure of A/H1N1 HA1 is similar to the position of these two domains in seasonal flu H1N1 viruses but different from that in the 1918 H1N1 viruses [10]. This suggests that the efficiency of the interaction between A/H1N1 and its receptor is similar to that of seasonal flu H1N1 viruses but lower than that of the 1918 viruses.