The 36 new isolates reported here greatly expand the amount of whole-genome sequence data available from recent avian influenza (H5N1) isolates. Before our project, GenBank contained only 5 other complete genomes from Europe for the 2004–2006 period, and it contained no whole genomes from the Middle East or northern Africa. Our analysis yielded several new findings. First, all European, Middle Eastern, and African samples fall into a clade that is distinct from other contemporary Asian clades, all of which share common ancestry with the original 1997 Hong Kong strain. Phylogenetic trees built on each of the 8 segments show a consistent picture of 3 lineages, as illustrated by the HA tree shown in . Two of the clades contain exclusively Vietnamese isolates; the smaller of these, with 5 isolates, we label V1; the larger clade, with 9 isolates, is V2. The remaining 22 isolates all fall into a third, clearly distinct clade, labeled EMA, which comprises samples from Europe, the Middle East, and Africa. Trees for the other 7 segments display a similar topology, with clades V1, V2, and EMA clearly separated in each case. Analyses of all available complete influenza (H5N1) genomes and of 589 HA sequences placed the EMA clade as distinct from the major clades circulating in the People’s Republic of China, Indonesia, and Southeast Asia.
The influenza (H5N1) viruses isolated in Europe, the Middle East, and Africa show a close relationship, although they were collected from a widely dispersed geographic region, including Côte d’Ivoire, Nigeria, Niger, Sudan, Egypt, Afghanistan, Iran, Slovenia, Croatia, and Italy. The shared lineage of the viruses suggests a single genetic source for introduction of influenza (H5N1) into western Europe and northern and western Africa; our analysis places this source most recently in either Russia or Qinghai Province in China (Appendix Table). The broad dispersal of these isolates throughout these countries during a relatively short period, coupled with weak biosecurity standards in place in most rural areas, implicates human-related movement of live poultry and poultry commodities as the source of introduction of influenza (H5N1) into some of these countries. The virus’ presence in wild birds leaves open the alternative possibility that migratory birds may have been the primary source, with secondary spread possibly caused by human-related activities.
A phylogenetic tree containing 589 isolates from 2001 through 2006 (Appendix Figure 3) shows the relationship of the 36 recent isolates from this study to previous isolates and shows the 3 major lineages of influenza (H5N1) that are now circulating in Asia plus the fourth lineage, EMA, that has spread west into Europe and Africa. depicts a consensus view of the parsimony-based analysis of 74 isolates of complete genomes from the EMA lineage. The EMA clade contains all known European, Middle Eastern, and North African cases (which began appearing in late 2005), as well as cases from China, Russia, and Mongolia in 2005 and 2006. Some of the EMA clade isolates appear in clusters of influenza (H5N1) infection that were reported in geese in Qinghai Province, China (14), and in mute swans in Astrakhan (15), both of which are possible sources of spread through migration.
The evolutionary relationships shown in provide clear evidence that 3 distinct clades, labeled EMA 1–3, are circulating in the European and African region. These clades clearly share a common ancestor in Asia. The 3 clades may represent separate introductions or, alternatively, a single introduction from Asia into Russia, Europe, or another western site that has subsequently evolved into 3 lineages. More data will be required to pinpoint when and where the 3 clades split apart. All previously reported European and Middle Eastern isolates belong to EMA-1.
Our results show that EMA-2 has spread to Europe and that EMA-3 has spread to both Europe and the Middle East. These results agree in part with a recent study (16) that reported 3 distinct introductions of influenza (H5N1) into Nigeria. Our analysis, based on all available HA sequences (Appendix Figure 3), indicates that the Nigerian isolates fall into just 2 clades, EMA-1 and EMA-2, that likely resulted from at least 2 introductions of influenza (H5N1).
European countries have been affected by each of the 3 introductions of the EMA strains. For example, the Italian sequences can be segregated into 2 subgroups. Two isolates in EMA-1 (Co/Italy/808/06 and Md/Italy/835/2006) are closely related in all segments and likely share a common ancestor with isolates found in Slovenia (Sw/Slovenia/760/2006), Bavaria, and the Czech Republic (Co/Czech Republic/5170/2006). The third Italian strain from our study (Co/Italy/742/2006) falls into EMA-3, along with our newly sequenced isolates from Iran (Co/Iran/754/2006) and Afghanistan (Ck/Afghanistan/1207/2006). EMA-2 contains 1 European isolate, from a swan in Croatia, and multiple isolates from domesticated birds in Nigeria and Niger. This group shares a common ancestor with a group of isolates from Astrakhan and Kurgan (Russia).
Of the 22 EMA isolates newly sequenced in this study, 20 have the amino acid lysine (K) at position 627 of the polymerase basic protein 2 (PB2), while only 2 have glutamic acid (E). (These last 2 are both from Italy and both in EMA-1.) The 627K mutation is associated with virulence in mice and adaptation to mammalian hosts (17) and with increased host range (18). Lysine at this position is common in human viruses: all 65 human influenza (H5N1) isolates from 2001 through 2006 for which the PB2 sequence is available have lysine at position 627. Before the analysis of our collection, the PB2 627K was a relatively rare finding in avian influenza (H5N1) viruses: it was present in only 42 of 385 isolates previously collected from 2001 through 2006. Our analysis shows that all 42 of these fall in the EMA clade (supplementary data available in Technical Appendix 2). Excluding our current European, Middle Eastern, and African isolates, this mutation appears primarily in isolates obtained from wild birds in Astrakhan (15) and at Qinghai Lake (14). This mutation also occurs in the recent isolate A/Guinea fowl/Shantou/1341/2006 and in a mouse-adapted 2001 Asian isolate, A/pheasant/Hong Kong/Fy155/01-MB. This finding is in keeping with current knowledge of the acquisition of such mutations.
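The residue counts at PB2 position 627 reported above amount to a simple tally over protein sequences. The following is a minimal sketch of such a tally, not the procedure used in this study. It assumes ungapped, full-length PB2 protein sequences that follow standard numbering; the function names and the toy sequences (strings of placeholder residues, not real isolates) are hypothetical.

```python
from collections import Counter


def residue_at(protein_seq, position):
    """Return the amino acid at a 1-based position, or None if out of range."""
    if position < 1 or position > len(protein_seq):
        return None
    return protein_seq[position - 1]


def tally_position(pb2_by_isolate, position=627):
    """Count the residues observed at one PB2 position across isolates."""
    return Counter(residue_at(seq, position) for seq in pb2_by_isolate.values())


def toy_pb2(residue):
    """Build a placeholder sequence with the chosen residue at position 627.

    Hypothetical data for illustration only; not a real PB2 sequence.
    """
    return "A" * 626 + residue + "A" * 132


toy_isolates = {
    "toy_isolate_1": toy_pb2("K"),  # lysine at 627
    "toy_isolate_2": toy_pb2("K"),
    "toy_isolate_3": toy_pb2("E"),  # glutamic acid at 627
}

counts = tally_position(toy_isolates)
print(counts)
```

With real data, the sequences would first need to be aligned or otherwise checked for insertions and deletions, because the positional lookup is valid only when numbering is consistent across isolates.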
Our study increases current knowledge on strains circulating in Asia before the westward spread of influenza A (H5N1). The Vietnamese samples fall into 2 clusters, the larger of which (V2) is the same strain responsible for multiple cases in Southeast Asia since 2004, particularly in Vietnam and Thailand. These isolates all seem to derive from earlier Hong Kong samples (including 2 cases of human infection) in 2002 and 2003. The second cluster, V1, which contains 5 samples, significantly expands our understanding of this distinct Vietnamese influenza (H5N1) lineage. The only other isolate from this cluster was recently reported in a Vietnamese duck (A/duck/Vietnam/568/2005) and labeled a “recent Vietnam introduction” (4). This sample groups with the V1 clade when shown in the context of a larger tree of HA sequences (Appendix Figure 3). The 5 newly sequenced isolates in clade V1 show the same phylogenetic relationship for all segments except PB2 (Appendix Figure 1). The isolates in clade V1 appear to have undergone the same reassortment as was suggested (4) for the 1 previous example of this Vietnamese clade, A/duck/Vietnam/568/2005; i.e., they have acquired a new PB2 segment. This PB2 is similar to that of older (1996–2002) A/duck/Guangdong/1/96-like viruses from China. V1 clade isolates are associated with a distinct set of human cases, from China’s Anhui and Guangxi Provinces in 2005, a finding that provides additional support to the hypothesis that this group of influenza (H5N1) viruses was introduced into Vietnam from China (4).
Although EMA has split into 3 independently evolving clades, 1 isolate, A/chicken/Nigeria/1047–62/2006, shows clear evidence of reassortment. In this genome, 4 segments (HA, nucleocapsid protein, nonstructural protein, and PB1) belong to EMA-1, as seen in Appendix Figure 1. The other 4 segments (neuraminidase, matrix protein, PA, and PB2) belong to EMA-2 (Appendix Figure 1). Individual segment trees based on all available sequences in GenBank corroborate this pattern and consistently split the 8 segments of this Nigerian isolate into 2 distinct clades. Reassortment events such as this can only be discovered by sequencing multiple virus segments.
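The reasoning above, in which a genome is flagged as a reassortant when its segments fall into different clades, can be expressed as a small bookkeeping step once each segment has been placed on its own tree. The sketch below is illustrative only: the clade assignments are those stated in the text for A/chicken/Nigeria/1047–62/2006, whereas the function names and the segment abbreviations (NP, NS, NA, MP) are assumptions introduced here. The phylogenetic placement itself is not performed by this code.

```python
from collections import defaultdict


def group_segments_by_clade(segment_clades):
    """Group segment names by the clade each was assigned to."""
    groups = defaultdict(list)
    for segment, clade in segment_clades.items():
        groups[clade].append(segment)
    return {clade: sorted(segments) for clade, segments in groups.items()}


def is_candidate_reassortant(segment_clades):
    """Flag a genome whose segments do not all share one clade."""
    return len(set(segment_clades.values())) > 1


# Per-segment clade assignments as described in the text for
# A/chicken/Nigeria/1047-62/2006 (abbreviations are an assumption).
nigeria_1047_62 = {
    "HA": "EMA-1", "NP": "EMA-1", "NS": "EMA-1", "PB1": "EMA-1",
    "NA": "EMA-2", "MP": "EMA-2", "PA": "EMA-2", "PB2": "EMA-2",
}

groups = group_segments_by_clade(nigeria_1047_62)
print(groups)
print(is_candidate_reassortant(nigeria_1047_62))
```

A genome sequenced for HA alone would always pass as a single-clade isolate in this check, which is why multiple segments must be sequenced to detect reassortment.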
The presence of all 3 EMA sublineages in the same geographic region creates ample opportunities for reassortment. Isolate A/chicken/Nigeria/1047–62/2006 is the most recent of the Nigerian isolates, consistent with the hypothesis that this reassortant was generated in Africa. Additional surveillance will be necessary to determine if this reassortant strain spreads further in the avian population and to assess its ability to infect mammals.
As shown in , the EMA clade is a distinct lineage evolving independently of the 3 exclusively Asian lineages. All 3 human influenza (H5N1) cases that have been sequenced outside east Asia, from Iraq (19), Djibouti, and Egypt, belong to the EMA lineage. The human sequences A/Djibouti/5691/NAMRU3/06 and A/Egypt/2782/NAMRU3/06 group closely together and consistently fall in EMA-1. The placement of A/Iraq/207/NAMRU3/06 is slightly less certain; it also groups with EMA-1 but with lower bootstrap support. EMA viruses isolated from humans are thus quite distinct from the recent large clusters of human cases in Indonesia and China, which fall into separate clades containing none of our samples. The EMA isolates are also distinct from other human cases in Southeast Asia, which fall into the clades (V1 and V2) containing our Vietnamese samples.
The emergence of 3 (or more) substrains from the EMA clade represents multiple new opportunities for avian influenza (H5N1) to evolve into a human pandemic strain. In contrast to strains circulating in Southeast Asia, EMA viruses are derived from a progenitor that has the PB2 627K mutation. These viruses are expected to have enhanced replication characteristics in mammals, and indeed the spread of EMA has coincided with the rapid appearance of cases in mammals, including humans in Turkey, Egypt, Iraq, and Djibouti, and cats in Germany, Austria, and Iraq. Unfortunately, the EMA-type viruses appear to be as virulent as the exclusively Asian strains: of 34 human infections outside of Asia through mid-2006, 15 have been fatal (2).
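The fatality figure above is a proportion drawn from a small number of cases, so it carries considerable uncertainty. As a rough illustration, the sketch below computes the proportion for 15 fatal cases among 34 infections together with a Wilson score interval. The interval is an addition made here for context and is not part of the original analysis; the function name is hypothetical.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


# Counts taken from the text: 15 fatal cases among 34 human infections.
fatal, infections = 15, 34
proportion = fatal / infections
low, high = wilson_interval(fatal, infections)

print(f"fatal proportion: {proportion:.1%}")
print(f"approximate 95% interval: {low:.1%} to {high:.1%}")
```

The width of the interval reflects only sampling variation; it does not account for underreporting of mild infections, which would affect the denominator.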
Analyses of the complete HA tree (Appendix Figure 3) suggest that the earliest sequenced relatives of the EMA clade are from the Yunnan region of China (A/duck/Yunnan/6255,6445/2003), Hong Kong (A/chicken/Hong Kong/WF157/2003), and South Korea (A/chicken/Korea/ES/2003, A/duck/Korea/ESD1/2003), which were part of a regional outbreak in 2003 (20). Experiments on the 2 Korean isolates showed them to be infectious but not fatal in mice (21).
These findings show how whole-genome analysis of influenza (H5N1) viruses is instrumental to a better understanding of the evolution and epidemiology of this infection, which is now present on the 3 continents that contain most of the world’s population. This and related analyses, facilitated by global initiatives on sharing influenza data (22), will help us understand the dynamics of infection between wild and domesticated bird populations, which in turn should promote the development of control and prevention strategies.