In this study, we used sequence alignment, structural analysis and site-directed mutagenesis to examine the evolutionary relatedness of human noroviruses in terms of their interaction with the HBGA receptors. We showed that strains with distinct HBGA binding patterns within genogroups share common receptor binding interfaces in their interactions with various HBGAs, with specificity likely fine-tuned by subtle structural differences within the binding interfaces. At the same time, strains in different genogroups that use different binding interfaces, as defined by their locations and sequence motifs, can recognize the same HBGA targets, pointing to the overall functional and structural similarity of these distinct binding sites. These results provide evidence that the human HBGAs exert an important selection pressure in norovirus evolution. The two major genogroups (GI and GII) of human noroviruses that cause acute gastroenteritis represent two major evolutionary lineages, while strains in the A/B and Lewis binding groups within the two genogroups, such as those represented by the Norwalk virus and Boxer in GI and by VA387 and VA207 in GII, may further divide into evolutionary sub-lineages as a result of divergent evolution within each branch.
A schematic relationship of the known carbohydrate-binding phenotypes of caliciviruses.
The HBGA-binding interfaces of the two major genogroups of human noroviruses share some similarity in overall structure and location: both are located in the outermost P2 regions of the capsids, and both are composed of three major structural components, corresponding to the bottom and the walls of the binding pocket. However, the two binding interfaces differ in their primary sequences, detailed locations, and modes of interaction with the HBGA receptors. The binding interface of the GI strains (Norwalk virus) is formed by three groups of amino acids from the P2 subdomain and is positioned mainly in one P monomer, although it lies near the interface of the two P monomers of the Norwalk virus P dimer. In contrast, the binding interface of the GII viruses (VA387) is composed of residues from both the P1 and P2 subdomains and is located directly at the interface of the two monomers in the VA387 P dimer. The conservation of the binding interfaces within GII has been confirmed by the crystal structures of the VA207 P dimers in complex with the Lewis x and Lewis y tetrasaccharides, respectively, in which the binding interface of VA207, a Lewis binding strain, is formed by the conserved amino acids and interacts with the α-1,3/4 fucose of the Lewis y antigen in a manner similar to that of VA387 (Y. Chen, X. Jiang and X. Li, data to be published).
The two types of binding interfaces also differ in their modes of binding to HBGAs. Based on crystal structures of the P dimers complexed with oligosaccharides, Norwalk virus has a smaller or narrower binding interface, while VA387 has a larger or broader one. As a result, only two sugars of the A trisaccharide and the H pentasaccharide are involved in the interaction with Norwalk virus, as opposed to all three sugars of the A and B trisaccharides in the case of VA387. In addition, more amino acid residues of VA387 than of Norwalk virus appear to be involved in binding to HBGAs. Specifically, crystal structures revealed 11 residues of the VA387 P domain interacting with the B trisaccharide, as opposed to only 7 in the case of the Norwalk virus P domain binding to the A or H oligosaccharides. Furthermore, mutagenesis studies mapped another 8 amino acids around the binding interface of VA387 that affect the binding function, while only 2 such residues were found for Norwalk virus (this report). Nevertheless, these binding modes are based solely on P dimers interacting with oligosaccharides under co-crystallization conditions. The native interactions between noroviruses and HBGAs in vivo remain to be elucidated.
Another observation emerging from this study is the possible interplay between convergent and divergent evolution of noroviruses. The two major genogroups (GI and GII) of human noroviruses are characterized by distinct genetic traits, with significant differences in the primary sequences of their P domains. These two distinct lineages may have arisen by divergent evolution from a common ancestor. On the other hand, the acquisition of the common function of binding to HBGAs through distinct binding interfaces and modes is consistent with functional convergence as a result of adaptation to, and selection by, the same niche of human HBGAs. The two strains described in this study, VA387 and Norwalk virus, provide strong support for this hypothesis. Convergent evolution of protein function and/or structure in conjunction with acquired ligand binding specificity has been observed previously. One such example is the sugar-binding families of LacI/GalR repressors and their PBP analogues, in which evolutionarily divergent lineages independently acquired similar ligand binding patterns through convergent evolution.
The fact that almost all known HBGAs have their noroviral counterparts suggests that noroviruses are highly adaptive human pathogens. In addition, it has been noted that some strains with conserved binding interfaces appear not to recognize HBGAs, such as the Desert Shield virus (DSV, GI-3) and Hunter virus (GII-4), while other strains lacking the conserved binding interfaces retain the HBGA-binding ability, such as OIF of the GII-13 noroviruses. These variations further highlight the adaptive nature of noroviruses, which may recognize other carbohydrates or even non-carbohydrate molecules as receptors. As long as noroviruses remain human pathogens, the diversity of HBGA-binding patterns seen today will probably extend into the future.
Limited studies have shown that the GI and GII noroviruses are biologically different. For example, the GI noroviruses are more often involved in environmental contamination and cause outbreaks year-round without apparent seasonal peaks, while GII strains spread more readily via person-to-person contact and commonly cause outbreaks with clear fall/winter peaks. While future studies are required to identify the factors and genetic markers responsible for these differences, this work can help to elucidate the evolutionary relatedness of the GI and GII noroviruses and improve the classification of caliciviruses. Each of the four major genera and the two newly discovered “Becovirus” and “Recovirus” genera should represent an evolutionary lineage in this virus family. While each of them has adapted well to its individual host species, binding to carbohydrates has apparently been maintained or acquired in at least some strains of most calicivirus genera beyond the human noroviruses. For example, the rabbit hemorrhagic disease virus (RHDV) of the genus Lagovirus and the Tulane virus (TV) of the genus Recovirus recognize HBGAs, while feline calicivirus (FCV) binds to sialic acid. Since the common ancestor of these genetically distinct species might not have possessed the HBGA binding trait, one might speculate that these common characteristics were acquired independently as a result of adaptation to similar biological niches, suggesting possible convergent evolution among caliciviruses.
Our mutagenesis study further demonstrated that, in addition to the conserved binding sites, a number of nearby and less conserved amino acids also play an important role in the binding specificity to HBGAs, possibly by contributing to the conformational flexibility of the carbohydrate binding interfaces. For example, residues Q331 and G392 of VA387 are likely involved in binding to the A but not the B antigen, while S338 of Norwalk virus affects binding strongly to the H antigen but only weakly to the A antigen (this study). A similar role for D393 of another GII-4 strain has also been observed. Recent studies on the globally dominant GII-4 noroviruses suggest that host herd immunity may play a role in the epochal evolution of GII-4 viruses. Future studies focusing on the potential roles of these non-conserved residues in the antigenicity and immunogenicity of the viruses may be necessary.
In this study, 39 mutant P particles of four strains (Norwalk, Boxer, MOH, and VA207) were generated to address the conservation of the HBGA binding interfaces of noroviruses. This task would be very difficult to complete using VLPs as the model, because VLP production is far more time-consuming than P particle production. In our previous studies we demonstrated that the P particle is a good model for studying norovirus-HBGA interactions, based on the observations that the P particle uses the same HBGA binding interface as its VLP counterpart and shares a very similar HBGA binding profile with it. In addition, we used the saliva binding assay for its simplicity, convenience and sensitivity. All saliva samples used in this report have been well characterized for their phenotypes and binding patterns to noroviruses in our previous studies. We do not expect significant differences from synthetic oligosaccharide-based assays in evaluating the importance of HBGA binding sites.
The finding that HBGA-binding interfaces are conserved within genogroups can greatly facilitate the design and development of therapeutics against noroviruses. For example, a single compound that inhibits the function of a conserved HBGA-binding interface may be capable of blocking infection by all strains with the same type of HBGA binding interface. Thus, only two compounds might be sufficient to block most noroviruses in the two genogroups studied here, since each genogroup shares a similar binding interface that could be blocked by one common inhibitor.