Since it emerged in Japan in the 1870s, Japanese encephalitis has spread across Asia and has become the most important cause of epidemic encephalitis worldwide. Four genotypes of Japanese encephalitis virus (JEV) are presently recognized (representatives of genotypes I to III have been fully sequenced), but its origin is not known. We have determined the complete nucleotide and amino acid sequence of a genotype IV Indonesian isolate (JKT6468) which represents the oldest lineage, compared it with other fully sequenced genomes, and examined the geographical distribution of all known isolates. JKT6468 was the least similar, with nucleotide divergence ranging from 17.4 to 19.6% and amino acid divergence ranging from 4.7 to 6.5%. It included an unusual series of amino acids at the carboxy terminus of the core protein unlike that seen in other JEV strains. Three signature amino acids in the envelope protein (including E327 Leu→Thr/Ser on the exposed lateral surface of the putative receptor binding domain) distinguished genotype IV strains from more recent genotypes. Analysis of all 290 JEV isolates for which sequence data are available showed that the Indonesia-Malaysia region has all genotypes of JEV circulating, whereas only more recent genotypes circulate in other areas (P < 0.0001). These results suggest that JEV originated from its ancestral virus in the Indonesia-Malaysia region and evolved there into the different genotypes which then spread across Asia. Our data, together with recent evidence on the origins of other emerging viruses, including dengue virus and Nipah virus, imply that tropical southeast Asia may be an important zone for emerging pathogens.
Japanese encephalitis virus (JEV) is the most important cause of epidemic encephalitis worldwide, with an estimated 35,000 to 50,000 cases and 10,000 deaths annually (36). The virus is a member of the JE serogroup of the genus Flavivirus, family Flaviviridae, and is transmitted between vertebrate hosts by mosquitoes, principally by Culex tritaeniorhynchus. Other important members of the same serogroup of flaviviruses include West Nile virus (WNV), which has recently spread to cause outbreaks of encephalitis in North America (2, 18), St. Louis encephalitis virus (SLEV), Kunjin virus (KUNV), and Murray Valley encephalitis virus (MVEV). Like other flaviviruses, JEV consists of a small (50 nm) glycoprotein-containing lipid envelope surrounding a nucleocapsid which encloses one molecule of single-stranded positive-sense RNA. This 11-kb molecule comprises 5′- and 3′-untranslated regions (UTRs), between which lies a single open reading frame carrying genes for three structural proteins (capsid [C], premembrane [PrM], and envelope [E]) and seven nonstructural (NS) proteins (3, 34).
Epidemics of encephalitis were described in Japan from the 1870s onwards. JEV was first isolated in 1935 and has subsequently been found across most of Asia (30). The origins of the virus are uncertain, but phylogenetic comparisons with other flaviviruses suggest it evolved from an African ancestral virus, perhaps as recently as a few centuries ago (9, 10). Clinical features of infection with JEV range from a nonspecific febrile illness to a severe meningoencephalomyelitis, often associated with seizures (31), or a polio-like flaccid paralysis (32). Two epidemiological patterns of JE are recognized (39). In northern temperate areas JE occurs in summer epidemics, whereas in southern tropical areas the disease is endemic and occurs year-round. Based on limited nucleotide sequencing of C/PrM and E genes, four genotypes of virus have been identified (5, 6), and representatives of three of these have been fully sequenced. Genotype I includes isolates from northern Thailand, Cambodia, and Korea; genotype II includes isolates from southern Thailand, Malaysia, Indonesia, and northern Australia; genotype III includes isolates from mostly temperate regions of Asia, including Japan, China, Taiwan, the Philippines, and the Asian subcontinent; and genotype IV includes isolates from Indonesia. In addition, a strain of JEV isolated in Singapore in 1952 from a patient who originated in Muar, Malaysia (Muar strain), may represent a fifth genotype, according to cross-neutralization and limited phylogenetic evidence (13, 38). Because genotypes I and III occur in epidemic regions, whereas II and IV are associated with endemic disease, it has been postulated that differences in strain virulence may explain the clinical epidemiology (5, 6, 42). However, strains are increasingly being identified that do not fit this paradigm. 
For example, in Vietnam epidemic disease occurs in the north and endemic disease occurs in the south, yet genotype III strains have been isolated in both areas (14, 30). More recently, a genotype I strain has been found in northern Australia (26). To understand better the origin and geographical distribution of JEV genotypes we have determined the complete nucleotide and deduced amino acid sequence of a representative of a genotype IV strain (Indonesian isolate JKT6468), compared it with other full-length and partial genomes, and investigated the geographical distribution of all known JEV isolates.
JEV strain JKT6468, isolated in 1981 from C. tritaeniorhynchus mosquitoes in Flores, Indonesia, was inoculated onto confluent monolayers of C6/36 cells in 2% fetal calf serum. Culture supernatant containing the virus was harvested at 4 to 5 days postinfection, when cytopathic effects were observed, and virus RNA was extracted as described previously (23). RNA was reverse transcribed by using RAV2 reverse transcriptase (RT; Amersham Pharmacia Biotech Inc., Piscataway, N.J.) and was PCR amplified by using Taq DNA polymerase (Roche Diagnostics Corp.). Primers were designed to amplify overlapping fragments of the genome on the basis of a consensus alignment of the complete JEV nucleotide sequences available or on the sequence of JKT6468 already derived. RT-PCR products were purified by using the QIAGEN gel extraction kit. Purified cDNA was directly sequenced in both directions by using the appropriate sense and antisense primers and was ligated into pGEM-T Easy vector according to the manufacturer's protocol (Promega Corp., Madison, Wis.). Plasmids were screened for possession of the insert by diagnostic restriction digests with EcoRI. At least three clones of plasmids containing inserts were sequenced in both directions by using universal M13 forward and reverse primers. The full-length JKT6468 strain genome was compiled by using the Vector NTI suite software package (version 7.0; Informax Inc.). The complete sequences of JKT6468 and the derived polyprotein were aligned with the other complete JEV genomes (Table 1) by using the Vector NTI sequence analysis program, and percentage differences were calculated. Attenuated strains were not used if the sequences of their parental wild-type isolates were available (e.g., SA14-14-2 and SA14). Phylogenetic trees were constructed by using the neighbor-joining, maximum parsimony, and maximum likelihood methods in PAUP* version 4.04a (Sinauer Associates, Sutherland, Mass.). 
The robustness of phylograms was evaluated by 1,000 bootstrap replicates.
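The percentage differences between aligned sequences mentioned above can be illustrated with a minimal Python sketch. It is not the Vector NTI calculation itself: the function name `percent_divergence`, the toy sequences, and the choice to skip any position where either sequence has a gap are all assumptions made for illustration.

```python
def percent_divergence(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of aligned, ungapped positions at which two sequences differ.

    Assumption: positions where either sequence has a gap are ignored.
    The software used in the study may treat gaps differently.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0   # positions with a residue in both sequences
    differing = 0  # compared positions where the residues differ
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1

    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * differing / compared


# Toy aligned fragments (hypothetical, not JEV data): 2 of 10 positions differ.
toy_divergence = percent_divergence("ACGTACGTAC", "ACGTTCGTAA")
```

The same function works for amino acid alignments, since it only compares characters position by position.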
To understand better the geographic distribution of JEV genotypes, a list of all strains of JEV for which there are any sequence data available was made. This was done by searching GenBank, PubMed, and where necessary, contacting authors directly. In addition we sequenced the structural genes of two strains from Sarawak, Malaysian Borneo, that have not been studied previously: Sarawak strain isolated from mosquitoes in 1965 and CNS138-11 isolated from the brain of a fatal human case in 1999. Based on the available sequence data, phylogenetic trees were constructed for E gene, C/PrM gene, and NS5/3′ UTRs. Each JEV isolate was thus placed into a genotype by using the classification of Chen et al. (5), Ali and Igarashi (1), and Uchil and Satchidanandam (38).
To determine the rate of nucleotide substitutions for JEV, the number of pairwise nonsynonymous substitutions per site, obtained by comparing each member of a clade to its predicted ancestor based on maximum likelihood analysis, was plotted against the time period in years that separates the isolates to determine k, the rate of nonsynonymous substitutions per site. Nucleotide divergence rates were also estimated by comparing sister sequences that were closely related and isolated at least 5 years apart in the same geographical region. The differences in nonsynonymous changes depicted in branch lengths separating each sister sequence from the predicted common ancestor were divided by the number of years between isolates to yield rates expressed as the number of nonsynonymous changes per year.
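The regression approach described above amounts to fitting a straight line to substitutions per site against years of separation and reading the slope as k. The following Python sketch shows an ordinary least-squares fit under that assumption; the function name `regression_rate` and the data points are hypothetical and are not taken from the study.

```python
def regression_rate(years, subs_per_site):
    """Ordinary least-squares fit of substitutions per site against time.

    Returns (slope, intercept); the slope is an estimate of k, the number
    of substitutions per site per year.
    """
    if len(years) != len(subs_per_site) or len(years) < 2:
        raise ValueError("need at least two paired observations")

    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(subs_per_site) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("all time points are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, subs_per_site))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical points chosen to lie exactly on a line (illustration only).
toy_years = [0, 10, 20, 30]
toy_subs = [0.0, 0.001, 0.002, 0.003]
k_estimate, offset = regression_rate(toy_years, toy_subs)
```

With real data the points scatter around the line, so the fit is only as good as the assumption that substitutions accumulate at a roughly constant rate.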
To look for differences in pathogenicity, a representative strain of each genotype (KE093-83, genotype I; JKT1724, genotype II; VN118, genotype III; JKT6468, genotype IV) was injected intracerebrally (20 μl at 10-fold serial dilutions) into groups of five female NIH Swiss mice (3 to 4 weeks old). Animal experiments were conducted in accordance with the American Association for Accreditation of Laboratory Animal Care, and an Animal Welfare Assurance is on file with OPRR-NIH.
The relationships between geographical location and genotype were analyzed by using Fisher's exact test (SAS/STAT version 8; SAS Institute, Cary, N.C.).
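For readers unfamiliar with Fisher's exact test, the Python sketch below implements the two-sided test for a 2 × 2 table using the hypergeometric distribution. It is a simplification: the study used SAS/STAT and its contingency tables may have more than two rows or columns. The function name `fisher_exact_2x2` is hypothetical, and the example counts are not the study's data.

```python
from math import comb


def fisher_exact_2x2(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]].

    The P value is the sum of the probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def table_prob(x: int) -> float:
        # Hypergeometric probability that the top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = table_prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)

    p_value = 0.0
    for x in range(lowest, highest + 1):
        p = table_prob(x)
        # Small relative tolerance guards against floating-point ties.
        if p <= p_observed * (1 + 1e-9):
            p_value += p
    return min(1.0, p_value)


# Hypothetical 2 x 2 layout: rows are two regions, columns are
# "older genotype" versus "more recent genotype" (counts are invented).
toy_p = fisher_exact_2x2(3, 1, 1, 3)
```

A table that collapses regions and genotype groups into two categories each, as in this layout, would be one way to apply the sketch to a region-by-genotype comparison.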
The JEV isolates JKT6468, JKT7003, CNS138-11, and Sarawak were assigned GenBank database accession no. AY184212, AY184215, AY184213, and AY184214, respectively.
The JKT6468 virus genome was 10,978 nucleotides long (GenBank accession no. AY184212). Comparison with 17 other fully sequenced JEV genomes showed JKT6468 to have the least similarity, with nucleotide divergence ranging from 17.4 to 19.6%, and protein divergence from 4.8 to 6.5%. JKT6468 is compared with a fully sequenced representative of genotype I (K94P05), genotype II (FU) and genotype III (JaOArS982) in Table 2. Analysis by individual genes showed the PrM gene had the greatest divergence (28% compared with that of JaOArS982), while the 3′ noncoding region had the least divergence (9%). Phylogenetic analysis of all 18 full-length genomes for JEV and a representative strain for other viruses in the JE serogroup suggested that SLEV diverged at the deepest node followed by WNV and KUNV, MVEV, and finally JEV (Fig. 1A). (Indeed, recent flavivirus phylogenies based on the NS5 gene suggest SLEV may belong to the Ntaya serogroup rather than the JE serogroup [8, 9]). Similar trees were produced by the neighbor-joining, maximum-parsimony, and maximum likelihood methods, differing only in a few terminal groupings of genotype III. Of the JEV genotypes, genotype IV was the basal group in all trees, diverging at the deepest node with 100% bootstrap support, suggesting that it represents the most ancient lineage, which branched off before genotypes I, II, and III. Sequence data are available only for Muar strain's structural proteins. We therefore derived a phylogenetic tree from structural proteins, which suggested that the Muar strain belonged to a separate genotype (genotype V), which diverged at a deeper node than genotype IV (as is shown in the E gene phylogeny [Fig. 1B]).
To estimate the rate of evolution of JEV we examined full-length genomes with a sister pair comparison method and a regression analysis method (38, 41, 43). The sister pair method estimated the mean (standard deviation) rate of nonsynonymous substitutions per nonsynonymous site per year as 7.4 × 10⁻⁵ (4.2 × 10⁻⁵), and the regression method estimated it as 8.0 × 10⁻⁵. From these rates we determined that genotype IV JEV diverged from the common ancestor approximately 350 (±150) years ago, and genotypes I, II, and III diverged more recently.
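The arithmetic behind a sister pair estimate and the conversion of a rate into a divergence time can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names, the use of the sample standard deviation, the simple distance-divided-by-rate conversion, and all numbers are hypothetical.

```python
from statistics import mean, stdev


def sister_pair_rate(branch_early: float, branch_late: float, years_apart: float) -> float:
    """Rate from one sister pair: difference in branch length from the
    predicted common ancestor, divided by the years between isolations."""
    if years_apart <= 0:
        raise ValueError("isolates must be separated by a positive number of years")
    return (branch_late - branch_early) / years_apart


def divergence_time(distance: float, rate: float) -> float:
    """Years needed to accumulate `distance` substitutions per site at a
    constant `rate` (assumes a strict molecular clock)."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return distance / rate


# Invented sister pairs: (earlier branch length, later branch length, years apart).
toy_pairs = [(0.0010, 0.0020, 10), (0.0005, 0.0011, 12), (0.0020, 0.0032, 8)]
toy_rates = [sister_pair_rate(e, l, y) for e, l, y in toy_pairs]

toy_mean_rate = mean(toy_rates)  # mean rate across pairs
toy_sd_rate = stdev(toy_rates)   # sample standard deviation (assumption)
toy_years = divergence_time(0.02, toy_mean_rate)
```

Because the divergence time scales inversely with the rate, uncertainty in the rate carries over directly into the estimated date, which is one reason such estimates are reported with wide margins.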
Because differences in virulence between the genotypes have been postulated to explain the differing clinical epidemiology of JE, we examined whether the different genotypes have different phenotypes in mice by intracerebral inoculation of representative strains. There were no significant differences between genotypes in mouse neurovirulence. The dose which caused 50% mortality ranged from 0.25 to 3 PFU, with survival times from 5 to 10 days.
Most isolates of JEV have been characterized genetically by sequencing the E gene, a region spanning part of C/PrM genes, or a region spanning the NS5/3′ UTR. To examine the geographical distribution of JEV genotypes, we constructed phylogenetic trees for all available sequences using these genes. Figure 1B, the E gene phylogeny, which for clarity includes only a single representative strain for each genotype in each country, shows that whereas some genotypes have been found only in a single country (e.g., genotype IV was found only in Indonesia), others (especially genotype III) are more widely distributed. Similarly, whereas some countries have had several genotypes of JEV isolated (e.g., Indonesia has genotypes II, III, and IV), other countries have only had a single genotype isolated (e.g., India has only genotype III). The figure also shows that Muar strain, representing the putative fifth genotype, diverges at a node more basal than that of genotype IV.
Basing the geographical distribution of isolates on the political boundaries of countries includes many geographical anomalies which are unlikely to have a bearing on the distribution of a virus in nature. For example, the island of Borneo is divided politically into Indonesia (Kalimantan), Malaysia (Sarawak and Sabah), and Brunei; the island of New Guinea is divided into Indonesia (Irian Jaya) and Papua New Guinea. Moreover, some of the political delineations have changed with time. We therefore grouped countries into geographical regions after considering their geographical proximity, temperature, rainfall, and vegetation and examined the distribution of JEV genotypes according to region. This also showed an uneven distribution of genotypes (Fig. 2). Whereas most regions had only one or two genotypes present, the Indonesia-Malaysia region had all five genotypes circulating. The number of different genotypes detected in a region depends to some extent on the number of virus strains that have been isolated in that region (i.e., the more strains looked at, the more likely one is to identify different genotypes). Therefore, to determine whether the observed difference represents a true difference in genotype distribution or is simply a result of sampling bias, we constructed a contingency table to analyze all strains by geographical region and genotype (Table 3). Every strain of JEV for which there is information available was placed into a genotype by using phylogenetic trees constructed from analyzing E, C/PrM, or NS5/3′ UTR genes. For the few strains where there was discordance between the E and C/PrM trees, the E gene was given precedence because it has been shown to be a better predictor of phylogenies determined by complete genomes (38). Comparing the Indonesia-Malaysia region with all other regions combined showed a highly significant difference in the geographical distribution of genotypes (Fisher's exact test, P < 0.0001). 
An alternative analysis, in which no geographical areas were combined, but genotypes IV and V (which diverged earlier) were compared with I, II, and III (more recently evolved genotypes), confirmed that the observed distribution of genotypes represented a true difference (Fisher's exact test, P < 0.0001). Pairwise analysis showed no significant difference between geographical regions C, E, and F. A further analysis was therefore performed in which these three regions combined were compared with regions A, B, and D for genotypes I, II, III, and IV (the single genotype V isolate was grouped with genotype IV for this analysis). Again this showed a significant difference in the geographical distribution of genotypes (Fisher's exact test, P < 0.001). Alternative geographical groupings, e.g., including Taiwan in region E or D, did not significantly alter these findings. The time intervals during which viruses had been isolated did not differ significantly between regions and thus could not explain the differences in genotypes found. All genotypes, including the most divergent genotypes (IV and V), are present in the Indonesia-Malaysia region, but only the most recently evolved genotypes (I, II, and III) are present in the other geographical regions (Fig. 2).
JEV uses a wide range of mosquito species and vertebrate hosts across Asia, and geographical differences in vector and host availability may explain why genotype IV has never spread. A direct comparison of the different mosquito species from which viruses have been isolated was inconclusive because details are known for too few isolates. We therefore examined the virus' complete amino acid sequence to look for molecular determinants that might relate to differences in vector or host preference. Genotype IV had the greatest amino acid divergence compared with other genotypes (Table 2), with 19% of changes occurring in the C protein. These included an unusual series of amino acids near to its carboxy terminus, which are unlike those seen in other JEV strains (Fig. 3). Structural analysis showed that despite these differences, the C protein retained the hydrophobic transmembrane domain at its carboxy terminus (by which it attaches to endoplasmic reticulum during assembly) and its central hydrophobic region (which is attached to membrane in the virion associated form). Because this region of the C protein was so different from the other JEV genotypes, a second JEV isolate from genotype IV (JKT7003 isolated from mosquitoes in Java, Indonesia, in 1981) was also sequenced for this region (GenBank accession no. AY184215). It was found to have an identical sequence to that of JKT6468. Comparison with the Muar strain of JEV and MVEV showed the amino acid sequence of these two viruses was also very different from that of JEV genotypes I to III (Fig. 3).
Examination of the E protein of JKT6468 showed it was similar to other flaviviruses, with 12 cysteine residues thought to form six disulfide bridges. The E protein is thought to play a critical role in viral attachment and entry into cells and is highly conserved among flaviviruses (21). X-ray crystallography of the E protein of the flavivirus tick-borne encephalitis virus (TBEV) has shown the three-dimensional structure comprises three domains: domain III, which is the receptor binding domain, domain II, the dimerization domain which also contains the fusion peptide, and domain I, which acts as a hinge between the other two domains (27). A comprehensive alignment of E genes from 120 JEV strains identified three signature amino acid residues that distinguished genotype IV from other genotypes (E38 Arg→Lys, E327 Leu→Thr/Ser, E399 Pro→Ala/Thr) (Fig. 4). The predicted three-dimensional structure of the E protein of JKT6468, modeled on the structure for TBEV, showed E38 mapped to domain I, E399 was in the stem anchor region, and E327 mapped to the exposed lateral surface of domain III in a region thought to be involved in receptor binding (Fig. 3) (22).
Arthropod-borne viruses that damage the nervous system are being recognized as increasingly important emerging and reemerging human pathogens, but their origins and evolution remain unclear (29). JE was described in Japan from the 1870s onwards, and the prototype Nakayama strain was isolated from a fatal case in 1935. Since then the disease apparently spread across Asia to affect most of China and the Asian subcontinent, all of southeast Asia, and the Pacific Rim, reaching northern Australia in 1998. Although it is numerically less important than JE, the related WNV has recently reached southern Europe and North America (18, 37), drawing attention to the devastating potential of these viruses to spread. There is a clear need for a better understanding of the origins and spread of these major causes of morbidity and mortality.
Previous attempts to understand the distribution of JEV have tended to consider genotypes according to the countries most frequently represented, which may simply be a function of how many strains have been isolated from a particular place. Hence, genotypes I and III were considered northern epidemic genotypes and genotypes II and IV were considered southern endemic genotypes. Even when a genotype I isolate was identified in the south (26), it was considered that the genotype may have moved from epidemic to endemic areas. In our study, by concentrating on the presence or absence of a particular genotype in each geographical area and by using a robust statistical analysis to determine whether the observed differences are likely to represent the true distribution in nature, we have shown that not only is the Indonesia-Malaysia region the only area where all genotypes are represented, it is also the only region where the most divergent genotypes (IV and V), which are thought to represent the oldest lineages, have so far been found. The Indonesia-Malaysia region is geographically close to Australia, where JEV's closest fully sequenced relative, MVEV, is found. Alfuy virus, another member of the JE serogroup, which is thought to occur in Indonesia, New Guinea, and Australia, may actually be genetically closer to JEV (20), but it has not been fully sequenced. (Usutu virus, an African flavivirus, also lies close to JEV in phylogenies based on limited sequence, but it too has not been fully sequenced.) Taken together these observations suggest that JEV originated in the Indonesia-Malaysia region from an ancestral virus common to JEV and MVEV. From this ancestral virus JEV genotypes IV and V diverged, followed by the more recent genotypes I, II, and III. Whereas more recent genotypes have spread to other areas, the more divergent genotypes (IV and V) appear geographically confined to the Indonesia-Malaysia region. 
Indeed, only a single genotype V isolate has ever been found, and that was isolated in 1952 (13). This and its marked serological differences from other JE strains suggest it may represent a transitional lineage between MVEV and JEV, which has subsequently become extinct.
We also considered other explanations for our observations. For example, the most divergent genotypes (IV and V) may yet be found in different geographical areas, though the intensive surveillance in Japan, Taiwan, and Australia suggests that they are unlikely to have been missed in these countries. An alternative possibility is that these genotypes of JEV existed in more northern latitudes and have subsequently disappeared. However, epidemiological evidence argues against this. Studies have shown that when JEV arrives in new areas, adults are affected as well as children because they have no preexisting immunity (35). The fact that all ages were affected when JE was first seen in Japan thus provides a strong indication that this was the first time the virus (a genotype III virus) had arrived there (33). Interestingly, such large epidemics never seem to have occurred in the Indonesia-Malaysia region, which is also consistent with the virus having evolved there. Nevertheless, it is possible that our conclusions will need revising as the complete nucleotide sequences of other JEV strains and other flaviviruses become available.
For most of the major human pathogens, we have little idea about where they originated. Because JEV has recently evolved and has so rapidly spread to new areas, tracing its geographical origin has been possible. JEV is naturally transmitted between vertebrate hosts by Culex and other mosquitoes. The reasons for its spread are uncertain but may include changing agricultural practices, such as increasing irrigation (which provides mosquito breeding sites), and animal husbandry (which provides host animals) (35). Migrating birds, particularly the black-crowned night heron (Nycticorax nycticorax) and the Asiatic cattle egret (Bubulcus ibis coromandus), are thought to be important in the virus' dispersal to new geographic areas (15), but wind-blown mosquitoes may also play a role (12). Interestingly, the Asiatic cattle egret's range dramatically expanded across Asia in the 19th century following changing agricultural practices (11), which coincides with the evolution and spread of the more recent JEV genotypes. JEV continues to spread with recent outbreaks in India, Nepal, and Australia. Genotype III is the most widely distributed genotype and is the only genotype that has so far been found in India, despite the fact that fifteen strains from diverse geographical locations over 26 years have been genotyped. Ultimately we might expect genotypes I and II to be found there also. Similarly, we might predict genotype III will eventually reach Australia, and genotype II will be found further north in China or Japan.
Why the more ancient lineages of JEV have not spread is uncertain. JEV uses a wide range of mosquito species and vertebrate hosts across Asia, and geographical differences in vector and host availability may explain why genotype IV has never spread. We examined the virus' complete amino acid sequence to look for molecular determinants that might relate to differences in host preference. We found an unusual series of amino acids near to the carboxy terminus of the C protein, which was different from that seen in other JEV strains and was consistent with divergent evolution of this genotype. However, structural analysis suggested the C protein retained its hydrophobic domains and its excess of basic amino acids, which are thought to be important for its function in forming the nucleocapsid. There were fewer changes in the E protein, but a comprehensive alignment of E proteins identified three signature amino acids, one of which (E327) mapped to the exposed lateral surface of domain III in a region thought to be involved in receptor binding. These findings are consistent with the hypothesis that the change from an aliphatic amino acid (Leu) to a hydroxyl-containing amino acid (Ser/Thr) in this critical region of the E protein, by altering vector and/or host preference, could have contributed to the wider dispersal of the newer JEV genotypes compared with those of genotype IV. Previously a small number of changes in E protein sequence have been shown to affect mosquito oral infectivity (4, 24). Clearly, further studies are needed to substantiate this hypothesis.
The suggestion that JEV originated in the Indonesia-Malaysia region raises intriguing questions about the emergence of viruses in Southeast Asia. Nearly 150 years ago the evolutionary biologist Wallace recognized that this region, the Malay Archipelago as it was then known, has a unique environment that allows divergent evolution of plants and animals (40). Its tropical climate and great diversity of insect and vertebrate life may also facilitate the emergence and rapid evolution of viruses. This is supported by one hypothesis that in this same region WNV may have evolved into its Australian subtype, KUNV (25, 28), and dengue may have evolved from a sylvatic to a human virus (41). The greater diversity of MVEV strains found in Papua New Guinea than in Australia (19) is consistent with the idea that this flavivirus also spread from here to Australia. Traditionally, Africa is thought to be the zone of emerging pathogens. Our data, together with recent evidence on the origins of dengue virus (41) and Nipah virus (7), indicate that tropical Southeast Asia may also be an important region for emerging pathogens.
We thank R. Tesh for providing strain JKT6468 from the World Arbovirus Reference Collection; our colleagues in Sarawak for providing the clinical material from which CNS138-11 was isolated; S. Weaver and J. Bryant for assistance with the rates of evolution and times of divergence; S. White and D. Freeman for statistical help; R. Telfair for helpful discussions on egret migration; M. Cutchin for geographical advice; and C. A. Hart, R. Tesh, and R. Shope for useful critiques.
This work was funded in part by the World Health Organization Vaccine Development Initiative, NIH grant AI10986, and the Wellcome Trust of Great Britain. T.S. holds a Wellcome Trust Career Development Award.