Nucleotide sequences of all eight RNA segments of 10 human H3N2 influenza viruses isolated during a 5-year period from 1993 to 1997 were determined and analyzed phylogenetically in order to define the evolutionary pathways of all genes in a parallel fashion. It was evident that the hemagglutinin and neuraminidase genes of these viruses evolved essentially in a single lineage and that amino acid changes accumulated sequentially with respect to time. In contrast, amino acid differences in the internal proteins were erratic and did not accumulate over time. Parallel analysis of the phylogenetic patterns of all genes revealed that the evolutionary pathways of the six internal genes were not linked to the surface glycoproteins. Genes coding for the basic polymerase-1, nucleoprotein, and matrix proteins of 1997 isolates were closest phylogenetically to those of earlier isolates of 1993 and 1994. Furthermore, all six internal genes of four viruses isolated in the 1995 epidemic season consistently divided into two distinct branch clusters, and two 1995 isolates contained PB2 genes apparently originating from those of viruses before 1993. It was apparent that the lack of correlation between the topologies of the phylogenetic trees of the genes coding for the surface glycoproteins and internal proteins was a reflection of genetic reassortment among human H3N2 viruses. This is the first evidence demonstrating the occurrence of genetic reassortment involving the internal genes of human H3N2 viruses. Furthermore, internal protein variability coincided with marked increases in the activity of H3N2 viruses in 1995 and 1997.
The surface hemagglutinin (HA) glycoprotein of influenza viruses is the major target for neutralizing antibodies, and point mutations in the potential antigenic domains of this protein are thought to allow viruses to evade established immune antibodies in the human population. Some influenza seasons are more severe than others, and this is thought to be a reflection of the degree of antigenic change in the HA protein of newly appearing antigenic variants from those of the former strain. Viruses with antigenically drifted HA proteins thus have a selective advantage in becoming the subsequent epidemic strain. Analyses of epidemic influenza virus isolates, therefore, have chiefly focused on antigenic characterization of the HA glycoprotein in order to detect new variants of each epidemic strain for the recommendation of vaccine strains in each season.
Although there have been a number of reports on the characterization of human H3N2 influenza viruses, it was recently reported that detection of a new H3N2 antigenic variant is often associated with the rapid disappearance of the former variant (14, 31). For example, during the 1992-1993 epidemic season in Japan and England an A/Beijing/352/89-like H3N2 variant that circulated early in the season was displaced by an A/Beijing/32/92-like variant (14, 31) during the latter part of the same season. Other reports, however, have demonstrated cocirculation of phylogenetically distinct H3N2 viruses in the same season (12, 34, 35).
It has been reported repeatedly that the virulence and growth of influenza viruses are influenced by changes in the internal proteins. The matrix (M1) protein is a multifunctional protein which contributes to the control of virulence, growth (15, 52, 62, 63), and host specificity of influenza viruses (10, 33). In experiments with single gene reassortants of influenza viruses, it was shown that changes in the NP, basic polymerase-2 (PB2), and M1 proteins were involved in host restriction in monkeys (33), while attenuation of human viruses was confirmed in human volunteers by changes in the NP, nonstructural-1 (NS1), PB1, and PB2 proteins (10). Considering this evidence, the severity of an influenza epidemic season may be influenced not only by variability in the surface glycoproteins (HA and NA) but also by differences in the internal proteins (PB2, PB1, PA, NP, M1, M2, NS1, and NS2) of circulating influenza viruses. In this study, parallel evolutionary analyses of all eight gene segments and amino acid comparisons of all 10 viral proteins of 10 human H3N2 epidemic strains isolated over a 5-year period from 1993 to 1997 in Japan were done in order to provide a complete profile of protein variability, as well as the evolutionary patterns of all gene segments of these viruses. Also, phylogenetic and amino acid sequence data were compared with seasonal epidemiological data, including the total number of influenza-like illnesses and the numbers of virus isolations in Japan.
The viruses used in this study and their abbreviations are summarized in Table 1. Abbreviations for Japanese strains isolated between 1993 and 1997 reflect the epidemic season in which the strain was isolated and not necessarily the year of isolation. For example, isolate A/Aichi/69/94, which was isolated during the influenza season of 1994-1995, was assigned the abbreviation Aic95 in order to more easily distinguish this virus from viruses of the 1993-1994 season, such as A/Akita/1/94 (Aki94). Viruses whose genes were sequenced in this report were propagated in 11-day-old embryonated chicken eggs (E) or MDCK-cell monolayers (M) as follows: A/Sichuan/2/87 (E), A/Hokkaido/20/89 (E), A/Beijing/352/89 (E), A/Washington/15/91 (E), A/Beijing/32/92 (E), A/Hebei/12/93 (E), A/Tianjin/33/92 (E), A/Kitakyushu/159/93 (E), A/Akita/1/94 (E), A/Tochigi/44/95 (M), A/Shiga/20/95 (E), A/Akita/1/95 (E), A/Wuhan/359/95 (E), A/Miyagi/29/95 (M), A/Fukushima/114/96 (E), A/Niigata/137/96 (M), A/Fukushima/140/96 (E), and A/Shiga/25/97 (M).
RNA from 100 μl of virus sample was purified as described previously (9) and suspended in 50 μl of RNase-free distilled water. Whenever possible, samples were taken directly from virus stocks received from municipal and prefectural centers for hygiene without being propagated further in our laboratory. Reverse transcription (RT)-PCR was performed by using a slightly modified protocol of a commercial kit (RT-PCR kit with avian myeloblastosis virus [AMV]; version 2.1; Takara). Briefly, first-strand cDNA synthesis was done by mixing 9.5 μl of RNA with 1 μl of “influenza A universal RT” primer (3′-AGCAAAAGCAGG-5′) (20 μM) and 9.5 μl of RT reaction mixture (4 μl of 25 mM MgCl2, 2 μl of 10× RT-PCR buffer, 2 μl of 10 mM dNTP, 1 μl of AMV reverse transcriptase, 0.5 μl of RNase inhibitor). This mixture was incubated at 30°C for 10 min, at 42°C for 60 min, and finally at 95°C for 5 min. Resultant cDNA was then used in all subsequent PCR amplifications of overlapping cassettes covering the complete protein coding domain of each gene. Nucleotide sequences for all oligonucleotide primers used for PCR amplification are available from the authors upon request. RT-PCR products were purified and sequenced directly as described previously (48) on an ABI377 autosequencer (Perkin-Elmer). All nucleotide sequence data reported here will appear in the GSDB, DDBJ, EMBL, and NCBI nucleotide sequence databases under the accession numbers listed in Table 1. Accession numbers for previously reported sequence data used in this report for the PB2 (21, 24, 29, 45), PB1 (4, 22, 25), PA (5), HA (14, 35, 37), NP (1, 6, 18, 50), NA (2, 32, 53, 59), M (23, 28, 43, 60, 64), and NS (7, 27, 42) genes are also summarized in Table 1.
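The volumes given for the first-strand cDNA reaction can be checked with simple bookkeeping. The following is a minimal sketch in Python, not part of the original protocol: the variable and function names are hypothetical, and the final concentrations are derived here only from the stated stock concentrations and volumes by the standard dilution formula.

```python
# Volumes (in microliters) of the RT reaction mixture components, as listed in the text.
RT_MIX_UL = {
    "25 mM MgCl2": 4.0,
    "10x RT-PCR buffer": 2.0,
    "10 mM dNTP": 2.0,
    "AMV reverse transcriptase": 1.0,
    "RNase inhibitor": 0.5,
}
RNA_UL = 9.5      # RNA template volume stated in the text
PRIMER_UL = 1.0   # 20 uM primer volume stated in the text


def final_concentration(stock, volume_ul, total_ul):
    """Dilution formula: C_final = C_stock * V_added / V_total (units follow the stock)."""
    return stock * volume_ul / total_ul


mix_ul = sum(RT_MIX_UL.values())            # should equal the stated 9.5 ul of RT mixture
total_ul = RNA_UL + PRIMER_UL + mix_ul      # total reaction volume

mgcl2_mM = final_concentration(25.0, RT_MIX_UL["25 mM MgCl2"], total_ul)
dntp_mM = final_concentration(10.0, RT_MIX_UL["10 mM dNTP"], total_ul)
primer_uM = final_concentration(20.0, PRIMER_UL, total_ul)

print(f"RT mixture: {mix_ul} ul; total reaction: {total_ul} ul")
print(f"MgCl2: {mgcl2_mM} mM; dNTP: {dntp_mM} mM; primer: {primer_uM} uM")
```

Under these assumptions the listed components account for the stated 9.5 μl of RT reaction mixture, and the complete reaction has a volume of 20 μl.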
Phylogenetic analyses of the complete protein coding regions of all RNA segments were done using the neighbor-joining (NJ) method (20, 47). Nucleotide distance matrices were estimated by the six-parameter method (19) based on the number of total nucleotide substitutions, and evolutionary trees for the HA (Fig. 1A), NA (Fig. 1B), PB2 (Fig. 2A), PB1 (Fig. 2B), PA (Fig. 2C), NP (Fig. 2D), M (Fig. 3A), and NS (Fig. 3B) genes were constructed (20). To evaluate the robustness of the trees, the probabilities of the internal branches were determined by 500 bootstrap replications (16).
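The principle of bootstrap evaluation can be illustrated with a deliberately simplified sketch. The Python example below is not the procedure used in this study: it substitutes the simple proportion of differing sites (p-distance) for the six-parameter distance, and instead of building NJ trees it only asks how often a given pair of sequences remains the closest pair when alignment columns are resampled with replacement. All function names and the toy alignment are hypothetical.

```python
import random
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def closest_pair(alignment):
    """Return the pair of names with the smallest p-distance (ties: first in sorted order)."""
    return min(
        combinations(sorted(alignment), 2),
        key=lambda pair: p_distance(alignment[pair[0]], alignment[pair[1]]),
    )


def bootstrap_support(alignment, pair, replicates=500, seed=0):
    """Percentage of column-resampled replicates in which `pair` is the closest pair."""
    rng = random.Random(seed)
    names = sorted(alignment)
    length = len(alignment[names[0]])
    target = tuple(sorted(pair))
    hits = 0
    for _ in range(replicates):
        # Resample alignment columns with replacement, keeping the alignment length.
        cols = [rng.randrange(length) for _ in range(length)]
        resampled = {n: "".join(alignment[n][c] for c in cols) for n in names}
        if closest_pair(resampled) == target:
            hits += 1
    return 100.0 * hits / replicates


# Toy alignment with made-up sequences (not data from this study).
toy = {"A": "AAAAAAAA", "B": "AAAAAAAA", "C": "CCCCCCCC", "D": "GGGGGGGG"}
print(closest_pair(toy), bootstrap_support(toy, ("A", "B")))
```

In a full analysis each replicate would be used to rebuild an NJ tree, and the support for an internal branch would be the percentage of replicate trees containing that branch.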
As shown in Table 2, amino acid changes in the HA and NA glycoproteins accumulated in a sequential fashion over time. Indeed, when compared to that of Kit93, 10 amino acid differences in the HA1 protein of 1995 isolates were also found in those of 1997 isolates (Fuk97 and Shi97). Previously uncharacterized substitutions were observed in proposed antigenic sites A and B, as well as the receptor binding domain of the HA protein (55, 56). Even though a novel change at position 226 of the receptor binding domain of the HA protein in Japanese and Chinese isolates of the 1994-1995 influenza season was reported (31), this residue was found to have changed again from isoleucine to valine in viruses of 1996 and 1997. Changes from leucine to glutamine at position 226 have been reported to be involved in determining binding specificity to host cell receptors (36, 41, 46); however, the effect of isoleucine or valine at this critical site has not been determined and warrants further study.
As with the HA, amino acid differences in the NA proteins were maintained in those of later viruses. Substitutions were observed throughout the NA protein, although most changes were located in the functional globular region. Amino acid differences were also revealed in locations which have been implicated as antigenic sites of the NA protein (11) at position 336 (Y to H) in viruses of 1996 and 1997 as well as at residue 342 (N to D) in Fuk97. One change was also revealed in the substrate-binding pocket at position 151 (D to G), although this change was observed in only one isolate, Fuk96.
Analysis of the M genes revealed that the M1 proteins of influenza viruses were highly conserved, supporting the observations by Ito et al. (23). Indeed, amino acid substitutions were only observed in the most recent isolates of 1997 (Fuk97 and Shi97), which contained three amino acid substitutions at positions 219 (I to V), 227 (A to T), and 230 (K to R). Also, it was of interest that, although most M2 proteins of viruses isolated from the 1993–1997 period were completely conserved, those of the 1995 isolates showed as many as four amino acid changes (Table 2) at positions 16 (G to E) and 21 (D to G) in the external amino domain, position 57 (H to Y) in the internal carboxyl domain, and position 43 (L to F) in the functional transmembrane domain.
Twelve amino acid substitutions were observed in the NS1 proteins of viruses isolated from 1994 to 1997, although most changes were restricted to a single virus or to viruses of a particular season. Only one substitution, at position 164 (S to P), was shown in all viruses from 1994 to 1997 when compared to that of Kit93. In contrast to the high variability of the NS1 proteins, the NS2 proteins of these viruses were highly conserved. Only one difference was observed in the NS2 protein of Shi95, at position 26 (E to V).
Analysis of amino acid sequence changes in the PB2, PB1, and PA proteins of recent Japanese viruses found only a low number of amino acid changes in these proteins, changes which were often not maintained in other isolates. Of the nine amino acid substitutions observed in the PB2 proteins of viruses from 1993 to 1997, only two were maintained, at position 570 (I to M) in all viruses isolated after 1994 and at position 697 (L to I) in isolates of 1996 and 1997. Similarly, only 2 of 11 amino acid substitutions among the PB1 proteins were conserved, at positions 56 (R to K) and 622 (Q to R), in all of the viruses isolated after 1994. However, with the exception of Fuk96 and Nii96, one amino acid difference was revealed in all of the isolates, at position 216 (S to G). Remaining substitutions in the PB1 protein were restricted to one isolate or to viruses of a single season. The PA protein had slightly higher variability, with 17 differences observed between isolates, although 9 of them were erratic and 5 were shared only among viruses of their respective epidemic seasons. Only two changes, at positions 432 (V to I) and 712 (T to K), were maintained in isolates of 1995 and 1996, although these disappeared in viruses of 1997 which contained changes at residues 312 (R to K), 350 (T to N), and 557 (M to I).
The pattern of amino acid differences in the NP proteins is noteworthy from an evolutionary point of view. For example, all six amino acid changes observed in the NP protein of Aki94 also existed in those of the 1997 viruses (Fuk97 and Shi97) but were not found in those of the 1995-1996 viruses. Also, a change at residue 451 (A to T) in viruses isolated in 1995 and 1996 (Miy95, Shi95, Fuk96, and Nii96) was not maintained in viruses of 1997. Two additional amino acid differences were found in isolates of 1997 at positions 18 (E to D) and 239 (M to V) which were not observed in those of other isolates.
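The amino acid comparisons described above, in which each protein is compared with that of a reference isolate (Kit93) and substitutions are classed as maintained or erratic, can be sketched as follows. This is an illustrative Python example under stated assumptions: the function names are hypothetical, the sequences are made-up toy peptides rather than data from Table 2, and positions are plain 1-based indices into pre-aligned sequences of equal length, not necessarily the residue numbering used for each protein in this report.

```python
def substitutions(reference, isolate):
    """List (position, reference residue, isolate residue) for each differing site.

    Positions are 1-based indices into two pre-aligned sequences of equal length.
    """
    if len(reference) != len(isolate):
        raise ValueError("sequences must be aligned to equal length")
    return [
        (pos, ref_aa, iso_aa)
        for pos, (ref_aa, iso_aa) in enumerate(zip(reference, isolate), start=1)
        if ref_aa != iso_aa
    ]


def maintained(reference, isolates):
    """Substitutions relative to the reference that are shared by every isolate."""
    sets = [set(substitutions(reference, seq)) for seq in isolates]
    return sorted(set.intersection(*sets)) if sets else []


# Toy example with made-up peptides (hypothetical, not sequences from this study).
ref = "MKTII"
iso_a = "MKTVI"   # one change: position 4, I to V
iso_b = "MRTVI"   # shares the position 4 change and has one of its own at position 2

for pos, ref_aa, iso_aa in substitutions(ref, iso_b):
    print(f"{pos} ({ref_aa} to {iso_aa})")
print(maintained(ref, [iso_a, iso_b]))
```

A substitution returned by `maintained` corresponds to a change conserved in all of the compared isolates, whereas a substitution found in only one isolate would be reported by `substitutions` alone.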
Phylogenetic profiles of the HA gene (Fig. 1A) were consistent with those of previous reports (14, 31) and showed that the HA gene had evolved in a sequential fashion, with isolates of each season forming distinct clades. The HA genes of viruses of 1995 formed a clade which could be further divided into two branch clusters (BC) represented by 95i (Aki95, Toc95, and Aic95) and 95ii (Shi95 and Miy95), respectively, with a bootstrap value (BSV) of 87. Although 1995 viruses were divided phylogenetically into two branch clusters, variability among the HA genes of these isolates was a reflection of silent nucleotide mutations in the HA1 domain. For instance, although Miy95 and Toc95 were on different branch clusters, the HA1 protein of Miy95 was identical to that of Toc95 (Table 2). Also, the 1996 viruses Fuk96 and Nii96 formed a branch cluster together with the vaccine strain, Wuh95 (BC 96) (BSV 35), while Fuk97 and Shi97 (BC 97) (BSV 87) were located on the newest lineage.
Like the HA gene, the NA genes of H3N2 isolates were shown to evolve in an essentially linear fashion (Fig. 1B). Analysis of the NA gene in this report showed the evolutionary patterns of the NA genes of the most recent viruses of 1996 and 1997 to have divided into two clades including Wuh95, Fuk96, and Nii96 (BC 96) (BSV 56) and Fuk97 and Shi97 (BC 97) (BSV 100). However, in contrast to the HA gene, cocirculation of distinct NA genes in the same epidemic season was observed, a finding supporting the phylogenetic variability observed by Xu et al. (59). Viruses of 1995 were revealed to be located in two distinct branch clusters, including 95i (Aki95 and Toc95) (BSV 94) and 95ii (Miy95 and Shi95) (BSV 97).
Results of parallel phylogenetic analyses of the genes coding for the proteins of the RNP complex were particularly noteworthy. The PB2 genes (Fig. 2A) of viruses isolated from 1993 to 1997 were found to evolve generally in a sequential fashion, as isolates of 1993 (Kit93), 1994 (Aki94), 1995 (BC 95ii), 1996 (BC 96), and 1997 (BC 97) were on the same lineage. However, two viruses of 1995 (BC 95i) (BSV 100) formed a branch cluster which was distinguished from other viruses and apparently diverged before 1993. In the case of the PB1 gene (Fig. 2B), bootstrap analysis determined that isolates of 1997 (BC 97) (BSV 100) were most similar genetically to that of an earlier isolate of 1994 (Aki94) (BSV 67), whereas those of other isolates of 1995 and 1996 formed a separate lineage which was further divided into three branch clusters containing those of 1995 (BC 95i and 95ii) (BSV 99 and 98, respectively) and 1996 isolates (BC 96) (BSV 97). Construction of the evolutionary tree for the PA genes (Fig. 2C) revealed yet another distinct topology which showed little correlation with the chronology of the virus isolates, although the probabilities of the internal branches of the tree were very high. Viruses of 1995 (BC 95i and 95ii) (BSV 91 and 99, respectively) and 1996 (BC 96) (BSV 99) formed clades distinct from the viruses of 1997 (BC 97) (BSV 100), which appeared to have diverged sometime earlier. Also, the evolutionary position of a 1994 virus (Aki94) indicated that this gene branched off prior to that of a 1993 virus (Kit93) (BSV 100).
As shown in Fig. 2D, results of phylogenetic analysis of the NP genes of many human H3N2 viruses showed that these genes had evolved essentially as a single lineage, supporting observations by Shu et al. (50). However, an examination of the NP genes of viruses isolated from 1993 to 1997 sequenced in this study demonstrated that those of 1997 (BC 97) (BSV 100) viruses formed a distinct lineage with that of a 1994 isolate (Aki94) (BSV 100) that evidently had diverged prior to 1993. Viruses of 1995 and 1996, which appeared to have derived from the NP genes similar to 1993 viruses, evolved into three clades distinguishing the viruses of 1995 (BC 95i) (BSV 99) and (BC 95ii) (BSV 100) and those of 1996 (BC 96) (BSV 74).
Even though the M genes of human H3N2 viruses were highly conserved, the present study revealed distinct phylogenetic differences (Fig. 3A). The M genes of the 95i branch (BSV 98) were distantly related to other viruses of 1995 (BC 95ii) (BSV 78), while the M genes of two 1996 isolates were also different from one another. Most interestingly, it was clearly shown that the M genes of viruses of 1997 (BC 97) (BSV 99) were distinguished from those of other recent Japanese viruses and showed the highest degree of genetic similarity to that of a Chinese isolate of 1993, Heb93 (BSV 95).
The results of an evolutionary analysis of the NS genes of recent isolates revealed nonlinear evolutionary pathways (Fig. 3B). The phylogenetic relationships among 1996 and 1997 viruses could not be elucidated, since the BSVs were relatively low, allowing for other possible branching patterns. However, it was apparent that viruses of the 95ii clade (BSV 96) were distinct from other 1995 viruses of the 95i clade (BSV 57), which were more similar to that of Aki94 (BSV 63).
From year to year, the level of human morbidity and mortality attributed to influenza virus activity may vary considerably. In order to estimate the levels of influenza virus activity in Japan, various types of data are collected annually from local hospitals, clinics, and schools, as well as from prefectural and municipal institutes of hygiene. The numbers of reported influenza-like illnesses (ILIs) and the numbers of virus isolates of A/H3N2, A/H1N1, and B influenza virus are good indices in the estimation of morbidity due to influenza (Table 3). Also, these indices allow the estimation of the yearly relative morbidity in humans caused by each influenza virus. As shown in Table 3, the numbers of ILIs reported in the five influenza seasons from 1993 to 1997 varied considerably, as did the numbers of isolates of each influenza virus (A/H1N1, A/H3N2, and B). By calculation of the relative morbidity due to A/H3N2 viruses for each season, it was revealed that A/H3N2 activity was very high in the 1993, 1995, and 1997 seasons, with morbidity cases of 304,665, 361,831, and 229,490, respectively. These values are in sharp contrast to the values determined for the 1994 and 1996 seasons of 65,192 and 11,245 cases, respectively.
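The text does not state the formula used to calculate relative morbidity. One plausible reading, shown here only as an assumption, is that the ILIs reported in a season are apportioned among A/H3N2, A/H1N1, and B viruses according to each virus's share of that season's isolates. The Python sketch below follows that assumed definition; the function name and the input numbers are hypothetical placeholders and are not the values of Table 3.

```python
def relative_morbidity(total_ili, isolates):
    """Apportion reported ILIs by each virus's share of the season's isolates.

    ASSUMED definition (not stated in the text):
        morbidity(virus) = total ILIs * isolates(virus) / total isolates
    """
    total_isolates = sum(isolates.values())
    if total_isolates == 0:
        raise ValueError("no virus isolates reported for this season")
    return {
        virus: total_ili * count / total_isolates
        for virus, count in isolates.items()
    }


# Placeholder inputs for illustration only (not data from Table 3).
example_season = {"A/H3N2": 50, "A/H1N1": 25, "B": 25}
estimates = relative_morbidity(1000, example_season)
print(estimates)
```

With these placeholder inputs, half of the isolates are A/H3N2, so half of the reported ILIs are attributed to that virus under the assumed definition.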
Influenza viruses contain a segmented genome consisting of eight minus-sense RNA segments which code for 10 known viral proteins that are capable of reassortment during coinfection of a single host with two influenza viruses. Indeed, reassortment between human influenza viruses of different subtypes and between human and swine influenza viruses has been demonstrated repeatedly (3, 37–40, 51, 61), and it has been suggested that swine may serve as an intermediate host in which reassortment between human, swine, and avian influenza viruses may occur to give rise to new pandemic strains (3, 8, 26, 37, 44, 51, 54, 58). Also, a natural H1N2 reassortant containing the HA of recent human H1N1 viruses and the NA of recent human H3N2 viruses has been isolated from humans in China (30). With this evidence, it may be expected that reassortment between cocirculating human influenza viruses of the same subtype occurs. Indeed, recent phylogenetic divergence of the NA gene was suggested to be the result of genetic reassortment between recent human viruses (59). However, through evolutionary analysis of the internal NS gene, it was proposed that reassortment among cocirculating viruses appears not to occur very often and, therefore, it has been suggested that fixation of mutations in the genes coding for the internal proteins is dependent on immune pressure on the HA protein, effectively linking the evolution of the internal genes of influenza viruses to the HA gene (7, 17). Although this may indeed often be the case, parallel analyses of the genes coding for the surface glycoproteins and internal proteins have never been reported. A scarcity of sequence data of the internal genes of human H3N2 viruses has made it very difficult to analyze the phylogenetic patterns of these genes in a parallel manner. 
Also, the available sequence data are from different viruses, isolated in different years and in different parts of the world, making an accurate comparison of the evolutionary patterns of these genes very difficult. The determination in this study of the phylogenetic pathways of all eight genes of 10 recent influenza viruses revealed that the gene segments coding for the internal proteins of these viruses are evolving in a more independent manner than was previously speculated and are not linked to the evolution of the HA gene.
Although the HA proteins of four viruses isolated in the epidemic season of 1995 were almost identical, differences were observed in the PB1, PA, NP, NA, M2, and NS1 proteins which distinguished these viruses into two pairs. With the exception of the HA gene, the differences in the proteins of 1995 viruses were further supported by evolutionary analysis, indicating that the genes of 1995 isolates consistently diverged into two distinct branch clusters (95i and 95ii) with bootstrap probabilities of 95 to 100. It was, therefore, understood that considerable variability existed among cocirculating viruses in the same epidemic season which was not apparent through analysis of the HA protein alone. Also, it appeared that distinct RNA segment constellations of 95i and 95ii viruses may have been established through reassortment among cocirculating H3N2 viruses.
As summarized in Table 4, evolutionary patterns determined for each of the eight genes of human H3N2 viruses isolated from 1993 to 1997 indicated that reassortment between cocirculating human influenza viruses apparently occurred during this 5-year period. Two isolates of 1995, Aki95 and Toc95, contained PB2 and NP genes which were genetically more similar to those of a 1993 virus than to those of other viruses isolated in the same epidemic season. Japanese viruses of 1997 contained HA, NA, and PB2 genes that appeared to have originated from those of the previous variant of 1996, while the genes coding for the internal PB1, PA, NP, and M proteins apparently derived from earlier viruses of the 1993-1994 season. Most notably, the evolutionary patterns of the internal genes clearly demonstrated with bootstrap probabilities of 99 to 100 that the PB1 and NP genes of 1997 viruses were most similar to those of a 1994 isolate, whereas the M genes were distinct from all Japanese isolates from 1993 to 1996 and were instead most similar to that of a Chinese isolate of 1993. Even though the topology of the phylogenetic tree for the PA gene did not correlate well with the dates of the isolates, the PA genes of the 1997 isolates were shown to have diverged from a virus other than those of 1995 or 1996. Although it was reported that the NS genes of human H3N2 viruses evolve in a rapid and sequential fashion (7, 17), our analyses of the NS genes of recent H3N2 viruses revealed a nonlinear pattern of evolution and amino acid substitutions which seldom survived longer than one season. The nonlinear evolutionary pattern of the NS genes created difficulty in the determination of the origins of the NS genes of recent viruses, since significant probabilities for the internal branches could not be calculated.
Nevertheless, the evidence suggested that distinct RNA constellations appeared to be a result of genetic reassortment between cocirculating human H3N2 viruses and that genetic exchange may or may not accompany antigenic drift of the HA protein.
It has been suspected for some time that new epidemic variants of H3N2 influenza virus originated from China (49, 54), since antigenically variable viruses may circulate for some time in China before becoming epidemic. This was apparent when A/Beijing/32/92-like viruses were isolated as early as 1990 in China but did not become epidemic until 1993 (13, 14, 31, 57). A/Beijing/32/92-like viruses, therefore, appear to have circulated for at least 3 years in China before becoming the predominant epidemic strain globally. It is unclear why a particular strain will suddenly emerge after circulating for years in China, although it is generally thought that this is because the HA protein has not undergone sufficient antigenic change to effectively evade established immunity in the human population. In the five epidemic seasons in Japan investigated in this report (1993 to 1997), the relative morbidity due to H3N2 viruses in the 1993, 1995, and 1997 seasons was considerably higher than in the 1994 or 1996 seasons. High levels of morbidity due to H3N2 virus activity in 1995 and 1997 coincided with observed variability in the internal proteins, which was apparently the result of genetic reassortment. The results of this study provide evidence that strongly suggests for the first time that genetic exchange among cocirculating H3N2 influenza virus strains involving gene segments coding for the internal proteins occurs naturally in the human population and that this mechanism of genetic reassortment may be important in virus evolution and pathogenicity. In addition to antigenic drift of the HA protein, emergence of new epidemic H3N2 strains may be influenced by the establishment of a suitable RNA segment constellation through a combination of genetic mutation and reassortment between cocirculating viruses. This study establishes the importance of analyzing the entire genome of human influenza viruses when studying new epidemic strains. 
Changes in the internal proteins, as well as antigenic variability in the surface glycoproteins, should be considered when analyzing and predicting newly emerging influenza viruses in humans.
The authors thank T. Gojobori for suggestions about the evolutionary calculations used in this report.
This work was supported by grants from the Japanese Ministry of Health and Welfare.