In this study we show that GII.4 norovirus evolution is epochal, with periods of stasis followed by the emergence of novel epidemic strains that evolve in a linear manner over time, and we map the antigenic variation onto the surface-exposed P2 capsid structure. Using a time-ordered panel of GII.4 VLPs from 1987–2005, we demonstrate that specific changes proximal to interaction site 2 regulate carbohydrate binding patterns, which have changed over the 20 y interval. Using sera from a human outbreak in 1988 and antisera from mice, we used ELISAs and an in vitro carbohydrate blockade as a surrogate neutralization assay to demonstrate that the noted variation alters the serologic and blockade responses consistent with a model of antigenic drift. Our data suggest a model of molecular evolution in which norovirus GII.4 strains persist by evolving novel carbohydrate-binding domains over time in response to immune-driven selection and by antigenic drift in the receptor-binding regions of the P2 subdomain.
Evolution in the norovirus capsid gene is complex, and our data are in agreement with other recent studies that underscore the critical importance of using protein structure to guide molecular phylogenetic analyses based on the hypothesis that protein domains evolve at different rates dependent on structural and functional constraints and environmental selective pressures [61
]. In our analyses, the shell domain appears to be evolving by random drift, as only 5% of changes are informative (i.e., became fixed in the population). To a limited extent the P1 subdomain, and in particular the P2 subdomain, are evolving at higher evolutionary rates, consistent with our hypothesis that surface-exposed residues are evolving in the presence of immune selection. High rates of evolution in surface-exposed residues have also been reported in chronological sets of HIV samples within individual patients [71
]. As the majority of the 176 ORF2 sequences included in our study belonged to the Grimsby and Farmington Hills clusters, the limited sequence information for contemporary clusters reaffirms the critical need for continued surveillance, the collection of full-length capsid sequence information, and detailed studies on the ORF2 evolutionary patterns of change noted in 2005 and beyond.
Our phylogenetic and evolutionary analyses of the P2 domain of ORF2 suggest that the GII.4 viruses have evolved linearly over the last 20 y in a fashion similar to influenza viruses, with serial replacements occurring sporadically, suggesting an epochal evolution in which periods of stasis are followed by sudden transitions [72
]. The periods of stasis are likely the result of entropy barriers that generally occur in highly degenerate genotype-to-fitness populations in which many genotypes give rise to the same phenotype [72
]. During the evolution of the GII.4 viruses, a long period of stasis of about 8 y or more occurred within the ancestral Camberwell cluster, prior to the emergence of the epidemic Grimsby cluster. Of note, the majority of informative sites within the S domain occurred during the emergence of the Grimsby cluster, and these sites became fixed in the population. With our analyses, we cannot rule out the possibility that these changes to S were the key changes structurally necessary to facilitate the emergence of the GII.4 cluster as the predominant epidemic strain. The Grimsby cluster endured a shorter period of stasis, after which subsequent clusters appear to have evolved from the previous cluster in a linear manner. Later clusters appear to emerge every 1–2 y from 2002 to present. Although all six clusters are distinct, there is overlap between some clusters, and the dates of isolation of some strains that group with ancestral clusters clearly occur after the emergence of later clusters. This variation suggests that strains from earlier clusters may continue to circulate, but likely cause asymptomatic infection, or persist at low levels in the population prior to going extinct.
Analyses of the evolutionary profiles of the GII.4 viruses suggest that many of the outlier sequences are recombinant viruses, consistent with earlier reports by other groups [63
]. The recombination break-point is predicted to occur near the first P1/P2 boundary (nucleotide position 794/amino acid 265), suggesting that viable recombination may be restricted to crossover sites that preserve essential protein domain function. In addition, some sites in the P2 region appear to cycle among a select subset of amino acid replacements. These sites include 329, 333, 340, 355, and 365. We predict that these are important sites of antigenic variation, but are structurally limited in that they must maintain a specific physicochemical property important for the overall capsid structure or the interaction with carbohydrate, or are structurally constrained by entry mechanisms.
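The correspondence between nucleotide position 794 and amino acid 265 quoted above can be checked with simple codon arithmetic. The following is a minimal sketch, assuming 1-based numbering that starts at the first nucleotide of the ORF2 start codon; the function name is hypothetical and is not taken from the study.

```python
def codon_for_nucleotide(nt_pos: int) -> tuple[int, int]:
    """Map a 1-based nucleotide position within an ORF to
    (1-based codon number, 1-based position within that codon).

    Assumption: position 1 is the first base of the start codon.
    """
    if nt_pos < 1:
        raise ValueError("nucleotide positions are 1-based")
    codon = (nt_pos - 1) // 3 + 1   # which amino acid the base encodes
    offset = (nt_pos - 1) % 3 + 1   # 1, 2, or 3 within the codon
    return codon, offset


codon, offset = codon_for_nucleotide(794)
print(codon, offset)  # 265 2
```

Under this numbering, codon 265 spans nucleotides 793 to 795, so position 794 is its second base, which is consistent with the paired coordinates given in the text.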
Sites of heterogeneity predominantly occurred in the exposed P2 subdomain in and around the two carbohydrate-interaction sites that form the receptor binding pocket [23
]. Site 2 was the most variable region in our model and changes in this region affected carbohydrate binding profiles. Our empirical studies suggest that escape from herd immunity may represent the selective force that drives antigenic variation within and around the receptor binding pocket on the surface of the GII.4 P2 domain of ORF2. Variation within the RBD in ORF2 variants is likely under strong coselection to maintain recognition of one or more HBGA carbohydrate receptors for docking and perhaps entry, allowing the GII.4 noroviruses to persist and simultaneously circumvent highly penetrant susceptibility alleles that are common in human populations. Alternatively, as the current contemporary strains do not bind any carbohydrates tested, the receptor binding pocket may evolve to recognize other fucosylated carbohydrates or proteins for docking.
In influenza viruses, herd immunity—mediated primarily by neutralizing IgG antibodies [74
]—positively selects for antigenic variation in hemagglutinin, although the exact effect of individual mutations on antigenicity is complex. Mutations may occur in one or more of five neutralizing epitopes or in the sialic acid-binding site in the hemagglutinin glycoprotein, thus selecting for replacement strains that circumvent antibody neutralization [64
]. Among noroviruses, the concept of herd immunity is controversial; early human challenge studies suggested that strain-specific, long-term immunity can be elicited following challenge, as 50% of volunteers did not become infected after multiple challenges with NV. However, the same study demonstrated that in some volunteers only short-term immunity was evident [75
]. In more recent studies, we and others have argued that long-term immunity is possible and that pre-exposure history may influence the duration of the protective immune response against individual strains [16
]. Although early mucosal IgA [16
] and T cell [22
] responses may include components of a long-term protective immune response in uninfected, challenged volunteers, the role of serum IgG in protective immunity remains controversial. Norovirus-challenged volunteers or outbreak patients mount strong serum IgG antibody responses that block carbohydrate–VLP interactions in a genogroup-specific manner in a surrogate neutralizing assay potentially representing a component of a long-term protective immunity [57
]. However, IgG antibody levels are usually too low in prechallenge sera, or in salivary or fecal samples, for assaying by these methods. Importantly, the years following the emergence of a new epidemic strain in Europe were characterized by decreased numbers of outbreaks, speculated to be associated with increased herd immunity [62
]. If herd immunity drives GII.4 norovirus evolution, these data predict that serologic relationships among temporal GII.4 epidemic strains should change over time.
Although GII.4–1987 and GII.4–1997 VLPs differed by seven amino acids, no significant differences in antibody reactivity were noted with sera derived from humans and experimentally immunized mice, suggesting that the few amino acid changes that accumulated during the long period of stasis did not significantly alter the antigenicity of the two strains. We speculate that pre-1995 Camberwell-like strains typically produced low-level endemic disease in human populations. By the mid-1990s, a series of mutations evolved that promoted epidemic spread of the post-1996 Lordsdale/Grimsby strains in human populations, perhaps by allowing for more efficient binding with additional HBGA ligands on mucosal surfaces, altering the stability of the capsid, or promoting transmissibility. The epidemic spread of the GII.4–1997-like strains in human populations may have subsequently allowed for higher levels of herd immunity and selected for faster antigenic changes in future strains. Influenza viruses show similar trends, in that genetic variation often, but not always, tracks with antigenic variation, because some mutations result in disproportionately large antigenic changes [78
]. However, global serologic responses between GII.4–1987/1997 and GII.4–2002/2002a demonstrated significant antigenic differences, reflecting the increased number of variant residues. Concordant with these findings, GII.4–2004 and GII.4–2005 epidemic strains were also serologically quite distinct from GII.4–1987 and GII.4–1997, and to a lesser extent distinct from GII.4–2002, but not from 2002a. Thus, epidemic replacement strain ORF2 capsid sequences were antigenically related yet distinct due to antigenic drift.
Given the high amount of GII.4 cross-reactivity, it is clear that one or more highly conserved epitopes define the serology of this genocluster. Immunodominant neutralizing epitopes have been described for a number of viruses, including West Nile virus [79
], HIV-1 [80
], and foot-and-mouth disease virus [81
]. Findings with the GII.4–2002 and GII.4–2002a ORF2 capsid proteins support the possibility that GII.4 noroviruses may also encode a limited number of strong immunodominant epitopes. Compared to GII.4–2002, the GII.4–2002a norovirus ORF2 protein differs by two residues, defined by changes in P1 (P226S) and P2 (A395T), yet is antigenically quite distinct from all other strains tested. Previous work with foot-and-mouth disease virus demonstrated that a single amino acid change in an immunodominant epitope resulted in two antigenic specificities and a lack of virus cross-neutralization [82
]. Although this is speculative, the noted P2 variation is unlikely to encode this strong serologic change in GII.4–2002a, as other time-ordered VLPs encode amino acid changes at this position as well. Rather, we predict that the alteration in P1 (P226S) might well define a major immunodominant epitope. Experiments are in progress to test this hypothesis. On the structural model, the side chain of Ser226 is much smaller than the Pro side chain, and it extends away from the surface into an open cavity below the dimer interface region. This change may alter the final conformation of the viral capsid by relaxing the constraints on the hinge movement. Detailed structural analyses of the time-ordered GII.4 VLP set will likely prove informative.
All convalescent outbreak sera blocked carbohydrate binding of GII.4–1987 and GII.4–1997 VLPs but were less capable of blocking GII.4–2002/2002a binding. Interestingly, the mouse anti-GII.4–2004 and GII.4–2005 sera more efficiently blocked binding of GII.4–1987 and GII.4–1997 to H type 3 than GII.4–2002/2002a binding to Ley and A. Of note, amino acids at positions 329, 355, and 365 in GII.4–2004 and GII.4–2005 are the same as in GII.4–1987 and GII.4–1997, but not GII.4–2002/2002a, which implies that these sites may account for the cross-blockade of GII.4–1987 and GII.4–1997 carbohydrate binding by anti-GII.4–2004 and anti-GII.4–2005 sera. These sites may also be important determinants of antigenic variation within the GII.4 genocluster.
The absence of a robust cell culture model for noroviruses prevents the development of classical neutralization assays. However, studies with numerous virus families have indicated that antibodies that block virus receptor–ligand interactions provide one mechanism to neutralize virus infectivity [83
]. Previous studies by our group and others have demonstrated that noroviruses bind to HBGAs, and that HBGAs are necessary for infection, since the FUT2 gene is a susceptibility allele for Norwalk virus infection in vivo [16
]. Although a GII.4 human challenge model does not exist, some GII.4 noroviruses have been reported to bind specifically to H type 3 and to a lesser extent to the A and B carbohydrates, suggesting usage of HBGAs in infection [57
]. Further, GII.4 outbreak investigations have established a strong correlation between a secretor-positive phenotype and symptomatic infection [21
]. However, carbohydrate-binding patterns within a temporal panel of norovirus VLPs have not been reported until now. Consonant with clear variations in overall serologic identity among the GII.4 VLP panel, we have also demonstrated that the GII.4 VLPs display variant binding patterns to carbohydrates typically regulated by FUT1 (Lex), FUT2 (H Type 3), FUT3, the Lewis enzyme (Lea), and the A and B enzymes. These findings suggest that some GII.4 noroviruses not only bind carbohydrates regulated by the FUT2 susceptibility allele, but can also bind carbohydrates regulated by the FUT1 and A alleles. However, to date, FUT1 expression has not been demonstrated in the gut mucosa [87
]. As fucosyltransferase enzymes lack tight core chain fidelity in vitro, it is possible that the FUT2 enzyme, or another fucosyltransferase, may express typically FUT1-regulated carbohydrates in the gut, as has been observed in saliva where FUT2 activity produces both Lex from type 2 core chain [87
]. Further, we and others have not seen in vivo evidence of core chain usage by alternative fucosyltransferases, as FUT2-negative individuals were completely resistant to NV infection [16
] and more likely to be asymptomatic after GII.4 exposure [21
], regardless of the presence of other fucosyltransferases. Most surprisingly, the GII.4–2004 and GII.4–2005 strains did not bind any HBGA carbohydrates or saliva tested, suggesting either that their carbohydrate ligands are not represented within the panel of biotinylated HBGA carbohydrates available for testing, that carbohydrate patterns differ in saliva and in intestinal mucosa, or that these strains utilize non-HBGA-mediated pathways for entry. Thus, over time, it is reasonable to predict that noroviruses have the capacity to utilize the large number of related HBGAs as ligands. The potential plasticity in the carbohydrate-binding site would likely accommodate sufficient amounts of antigenic drift to escape herd immunity, while simultaneously preserving carbohydrate-binding potential and altering strain susceptibility to the many different human alleles that regulate HBGA expression.
Fucose ligand binding site 1 was strictly conserved in the GII.4 viruses, including, paradoxically, extant strains that only weakly bind saliva and do not bind any carbohydrate tested. In contrast, the secondary interaction site appears to facilitate carbohydrate specificity, as binding characteristics of the time-ordered VLP panel varied extensively. In interaction site 2, positions 390, 391, 392, and 443 were conserved throughout the GII.4 strains, while sites 393, 394, and 395 were variable. In two instances, binding characteristics could be directly correlated to residue changes within this region. First, structural models predicted that carbohydrate binding would differ between the Camberwell cluster and the Grimsby cluster (including VA387), based primarily upon an Asp-to-Asn change at position 393 in site 2. In agreement with our hypothesis, binding differed between GII.4–1987 and GII.4–1997. The substitution of an Asp at position 393 was predicted and then empirically demonstrated to sterically hinder or otherwise alter binding of the larger trisaccharide moieties of A and B antigens, as the Camberwell representative VLP binds H type 3 and Ley but not A or B. In contrast, both GII.4–1997 and VA387 bind H type 3, Ley, A, and B [57]; and they encode Gly and Asn at the 393 position, respectively. Interestingly, our data suggest that the primary impact of the mutations that occurred between the Camberwell and Grimsby clusters was an expansion of HBGA usage, as representative strains GII.4–1987 and GII.4–1997 were indistinguishable antigenically. In the second case, a Thr at position 395, as exhibited by GII.4–2002a (E), altered the receptor binding pattern, as this mutant bound to the Lewis enzyme product, Lea, as well as the FUT2-dependent product, A antigen. GII.4–2002a is the first GII.4 strain reported to bind FUT2-independent products, indicating a possible pathway for infection of secretor-negative individuals. Alanine at this position facilitates binding of H type 3 in GII.4–2002. These results are also in agreement with our hypothesis that microevolution in site 2 alters carbohydrate-binding interactions; more detailed genetic studies should confirm this hypothesis. Of note, the synthetic HBGAs used in this study lack the complex structures often found in vivo. Larger polysaccharide moieties likely play a crucial role in carbohydrate affinity and avidity by interacting directly with interaction site 2.
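The binding patterns described in this discussion can be collected into a small lookup table, which makes it easier to ask which strains bind a given class of ligand. The sketch below is illustrative only: it transcribes the ligands named in the prose (not the full experimental data set), the plain-text ligand labels and the helper name `strains_binding_any` are assumptions introduced here, and only Lea is treated as FUT2-independent because that is the only ligand the text identifies as such.

```python
# Ligands each VLP is described as binding in the discussion text.
# Transcribed from the prose only; not the complete experimental results.
BINDING = {
    "GII.4-1987": {"H type 3", "Ley"},
    "GII.4-1997": {"H type 3", "Ley", "A", "B"},
    "VA387": {"H type 3", "Ley", "A", "B"},
    "GII.4-2002": {"H type 3"},
    "GII.4-2002a": {"Lea", "A"},
    "GII.4-2004": set(),  # no HBGA tested was bound
    "GII.4-2005": set(),  # no HBGA tested was bound
}

# Assumption: Lea is the only ligand the text calls FUT2-independent.
FUT2_INDEPENDENT = {"Lea"}


def strains_binding_any(ligands: set[str]) -> list[str]:
    """Return the strains that bind at least one of the given ligands."""
    return sorted(s for s, bound in BINDING.items() if bound & ligands)


print(strains_binding_any(FUT2_INDEPENDENT))  # ['GII.4-2002a']
print(strains_binding_any({"A", "B"}))        # ['GII.4-1997', 'GII.4-2002a', 'VA387']
```

Keeping the table as sets per strain means that new ligands or strains can be added without changing the query function.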
Taken together, our structural models suggest that heterogeneity in the receptor interaction site 2 likely determines HBGA affinity and avidity, and subtle changes in this region may govern HBGA specificity. Tan et al. [30
] demonstrated that binding of VA387 to HBGAs could be ablated by mutating the Thr at position 338 to Ala. While Thr338 is not directly involved in ligand binding, it does form hydrogen bonds to Arg345, which directly hydrogen bonds to the ligand [23
]. It seems likely that hydrogen bonding patterns also influence which ligands the virus can bind. Subtle changes to residues that form hydrogen bonds with the primary ligand interaction residues may drastically alter ligand affinity and avidity. In addition, the length and charge of the side chain of a given residue likely allosterically regulate the site by sterically hindering some interactions. Studies with foot-and-mouth disease virus have demonstrated that this virus contains a conserved shallow pocket on its surface that is predisposed to evolve a high affinity for its heparan sulfate receptor. Mutations remodel the surface by increasing the positive charge, which results in an increased affinity for its receptor [89
]. Differences in electrostatic potential may regulate HBGA affinity in GII.4 viruses as well, as the addition of a charge at position 393 alters the surface charge and binding of the GII.4–1987 virus (A–D). Studies with influenza virus have shown broad serologic differences between temporally distinct strains consistent with a phenotype of antigenic drift and variation, especially in antigenic sites, receptor-binding sites, and codons previously identified as being under positive selection [64].
Virus recognition of variant carbohydrate receptor moieties is not unprecedented; influenza viruses recognize variant sialic acid moieties for infection of aquatic birds (α2–3 sialic acid) and humans (α2–6 sialic acid), and other viruses utilize similar mechanisms [90
]. However, the recognition specificities are much more subtle and complex than originally appreciated. Recently developed glycan microarray tools have demonstrated that different human and avian influenza virus strains bind to different glycan ligands depending upon downstream fucosylation, sulfation, and additional sialylation processing patterns, although the biological significance of these interactions is not fully known [93
]. Our data and those of others [62
] suggest that antigenic drift in norovirus ORF2s (HBGA antigens) and perhaps influenza virus (sialic acid-containing antigens) hemagglutinin may evolve by similar mechanisms. The combined flexibility of the ligand-binding pocket and the wide range of variant, yet related carbohydrate ligands, may provide the plasticity in both the receptor targets and viral attachment proteins necessary to allow for extensive antigenic drift in the face of herd immunity.
The data presented in this manuscript provide support for the hypothesis that antigenic drift and receptor switching may function synergistically to maintain the GII.4 noroviruses in the presence of human herd immunity. Our data suggest that strain-specific protective immunity is possible and that vaccines and immune prophylaxis must be formulated to protect against contemporary strains. As shown with influenza viruses, new therapeutic formulations will be necessary. Moreover, continued norovirus surveillance will be essential for maintaining vaccine and drug effectiveness.
At this time, it is unclear whether GII.4 noroviruses will continue to predominate as the major cause of epidemic gastroenteritis worldwide, or (like influenza virus) undergo an antigenic shift to a variant GI or GII genocluster that is currently circulating at low levels in human populations, or whether a new strain will be introduced from zoonotic pools. However, important caveats must be considered when evaluating this work. While it is clear that the mucosal compartment has high concentrations of IgG, carbohydrate–VLP blockade assays use serum IgG, whereas mucosal IgA and IgG responses may be more important in protective immunity [16
]. Unfortunately, mucosal antibody concentrations are usually insufficient for blockade studies, and mucosal samples were not obtained during norovirus outbreaks, preventing the testing of this possibility. In the absence of a robust cell culture model, blockade studies themselves represent a surrogate assay for neutralization, and it is possible that antibodies might neutralize virus infectivity by binding to regions distinct from the carbohydrate-binding pocket, or even outside of P2, and inhibit other steps in entry, as shown with West Nile virus, among others [94
]. Research is clearly needed to define the number and location of the neutralizing sites in the norovirus particle and the impact of positively selected mutations on the neutralization phenotype. Structural studies solving variant carbohydrate-binding characteristics in the time-ordered VLP set will be imperative for understanding the role of the secondary sites in receptor specificity and binding affinity. Further, the VLPs used in this study are composed of ORF2 major capsid protein, whereas native virions would also include one or two copies of the ORF3 protein [95
]. The function of the ORF3 protein is still unclear, and its effect on virus structure and interaction with ligands is unknown. Although no examples of norovirus VLP post-translational modifications have been reported, any such modifications may impact ligand interaction. In this study, VLPs were produced in a mammalian expression system; thus, post-translational modifications should reflect natural processing. Finally, although HBGAs clearly function as ligands for Norwalk virus entry, the evidence that HBGAs function in GII.4 docking and entry is less robust, though suggestive, and it is not clear whether differential binding patterns noted in vitro reflect in vivo binding and susceptibility phenotypes [18
]. In the absence of time-ordered GII.4 human challenge inocula, it will be difficult to definitively prove that contemporary strains circumvent immune responses to preexisting strains. Additional studies will be needed to determine whether the evolutionary patterns are unique to the GII.4 noroviruses or represent a general evolutionary pattern of the norovirus family. Our study, however, presents a predictive model for future empirical studies investigating the relationships among antigenic change, norovirus pathogenesis, vaccine design, and human disease.