In this study, we performed an up-to-date phylogenetic analysis of RVs representing all 35 known P genotypes in GenBank. We focused on the VP8* sequences instead of the entire VP4 because VP8* is the portion of the spike protein that interacts with the host receptor and is therefore believed to be the most biologically relevant. In addition, since VP4 is large and many VP4 sequences in GenBank are incomplete, only a limited number of sequences would be available if full-length VP4 sequences were studied. Thus, our choice of the VP8* sequences is both rational and informative.
Following construction of a phylogenetic tree and validation of genetic distances, five P genogroups (P[I] to P[V]) covering all 35 known P genotypes of RVs have been identified. While RVs have been found in many human and animal species, each of the five P genogroups seemed to have a limited host range or tropism. Three P genogroups (P[I], P[IV], and P[V]) were almost exclusively found in animals, one P genogroup (P[II]) infected only humans, and the remaining one (P[III]) infected both humans and animals. The sialic acid-dependent RVs formed a subcluster within P[I], supporting the previous observation that the requirement of sialic acid is related to the VP4 genotypes but not the host origins (7
Genetic relatedness in host receptor recognition is also seen among human RVs. Our previous studies showed that multiple strains within each of the P, P, and P genotypes in the P[II] genogroup showed the same binding specificity, with the H-related antigens (Leb and/or H type 1) as the common interaction epitope for all three genotypes. In this study, three strains from each of the three P genotypes (P, P, and P) of P[III] RVs were studied, and all three strains showed the same binding to the type A HBGAs. Furthermore, the recent study by Hu et al. (21) characterized another P strain and found the same binding to the type A antigen as that observed in our study. Thus, we hypothesize that RVs within a genogroup may share a consensus HBGA profile based on their interaction with a common epitope, such as the H-related antigens for P[II] and the A antigen for P[III].
In this study, we estimated pairwise distances within and between genotypes as well as within and between genogroups and derived putative P genotype and P genogroup classifications based on this phylogenetic analysis. The 86% cutoff value for genotypes defined in this study was slightly lower than the 89% cutoff value described previously based on the full-length VP4 (15). This is because the VP8* regions used in this study are genetically more diverse than full-length VP4s, possibly due to evolutionary selection from the hosts. While the majority of the P genotypes fell well within the 86% cutoff value, exceptions were observed. In P and P, clear branches of sequences were observed within each of the two P genotypes, so that small numbers of sequences had apparently lower identities (77%) than the cutoff value. Although these sequences are grouped within their respective P genotypes, the biological differences between these branches deserve special attention in future studies.
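The cutoff-based comparison described above can be illustrated with a minimal sketch. It assumes pre-aligned sequences, a simple identity measure that skips gapped columns, and an "at or above the cutoff" rule for calling two sequences the same genotype. All of these are assumptions made for illustration and are not necessarily the distance method used in this study. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple

# 86% identity cutoff for VP8* genotypes, as reported in the text.
GENOTYPE_CUTOFF = 0.86


def percent_identity(a: str, b: str) -> float:
    """Fraction of identical positions between two pre-aligned sequences.

    Columns where either sequence has a gap ('-') are skipped (an assumption).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return matches / compared


def pairwise_calls(
    aligned: Dict[str, str], cutoff: float = GENOTYPE_CUTOFF
) -> Dict[Tuple[str, str], Tuple[float, bool]]:
    """For every pair, return (identity, identity >= cutoff)."""
    calls = {}
    for (n1, s1), (n2, s2) in combinations(aligned.items(), 2):
        identity = percent_identity(s1, s2)
        calls[(n1, n2)] = (identity, identity >= cutoff)
    return calls


# Hypothetical toy alignment, not real VP8* data.
toy = {
    "strain_a": "ACGTACGTAC",
    "strain_b": "ACGTACGTAA",
    "strain_c": "ACGTTTTTAC",
}
calls = pairwise_calls(toy)
```

In this toy alignment, strain_a and strain_b differ at one of ten positions and fall within the cutoff, whereas strain_a and strain_c differ at three positions and do not. Real analyses would typically use a model-based distance rather than raw identity.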
A further issue is the genetic relatedness between P and P. Although the majority of strains in the two P genotypes were segregated by the 86% cutoff value, there were a number of sequences that fell within the cutoff value, indicating that these two P genotypes were closely related. These results are supported by the fact that both P and P RVs recognize the common H-related HBGAs (24). Epidemiologically, both P and P are highly predominant in many countries (1) and likely to have overlapping target populations (24). Thus, one could view the two most predominant P genotypes as one genetic lineage in assessment of disease burden and epidemic control.
Our results also raise a question about the potential role of the HBGAs in RV evolution. The facts that all three major human RV P genotypes (P, P, and P) recognize the H-related antigens (Leb and H type 1) and that the vast majority of them infect humans suggest a strong evolutionary selection pressure from the human HBGAs. The variations in binding to different HBGAs (the H-related versus the A antigens) by different P genotypes within and between genogroups suggest a mechanism of selection by the polymorphic HBGAs of humans similar to that of human NoVs (45). The shared carbohydrate binding sites between the sialic acid and A antigen binders, and possibly the H antigen binders, further suggest that the requirement of a carbohydrate receptor for host attachment is a common feature of RVs. Furthermore, the RV VP8* proteins are highly diverse, analogous to the P domain of NoVs (57). Thus, one can expect diverse interaction profiles, with multiple HBGAs and/or non-HBGA carbohydrates being recognized by different RV genotypes and/or genogroups, particularly for the sialic acid-independent animal RVs in P[I], P[IV], and P[V].
It is known that RV transmission is restricted within certain species, which is supported by our phylogenetic analysis. We hypothesize that the absence in certain species of the carbohydrates required for RV infection could be a barrier against RV transmission to these species. For example, one of the major forms of sialic acid recognized by the sialic acid-dependent RVs is Neu5Gc (N-glycolylneuraminic acid) (9), which is found in many primate and other vertebrate species but not in humans (3). This might explain why many sialic acid-dependent RVs infect animals but not humans. In addition, the polymorphism of the secretor (FUT 2) and Lewis (FUT 3) genes that are responsible for synthesis of the major Lewis antigens (Lea, and Ley) has been found in humans but not in other mammals (31), which may explain why the P, P, and P RVs exclusively infect humans. On the other hand, carbohydrates shared among species may facilitate cross-species transmission of certain RVs. For example, P and P of P[III] recognize the type A antigens of humans. Similar A antigens are also expressed in many animal species (51), which could be the reason that P and P RVs infect both humans and animals. Likewise, the sialic acid-dependent strains in P, P, P, and P might share binding to the Neu5Gc antigens in those species that are susceptible to these P types, while the Neu5Gc binding is missing in many other species that are sialic acid independent. Alternatively, those animals may express forms of sialic acids differing from Neu5Gc. Future studies are needed to explore these possibilities.
The predicted relationship between P genogroups and the host range or tropism of RVs based on their carbohydrate receptor specificities may be important for understanding the epidemiology of RV disease and for developing strategies to control and prevent it. For example, the predominance of the P[II] RVs, which are responsible for more than 95% of RV epidemics in humans, is likely associated with their broad host range through recognition of the H-related antigens in the general population. This feature is similar to that of the predominant GII (mainly GII.4) NoVs, which also recognize the H-related HBGAs. Thus, strategies such as vaccines targeting these P genotypes would be important for disease control and prevention. On the other hand, the relatively lower prevalence of P and P could be due to their relatively narrow target populations, namely, individuals who have the type A antigen (~30% of the general population in North American and European countries). While these P genotypes may not be responsible for major epidemics, strategies to prevent cross-species transmission are necessary.
Further proof of the A antigen as a receptor of RVs came from a separate study (21) on another P RV, supporting our hypothesis that P and maybe all members of P[III] recognize the type A antigens. That study elucidated the crystal structure of the A antigen-binding site on VP8* and demonstrated abrogation of RV infectivity by anti-A-type antibodies. It also confirmed the A antigen-binding site predicted by our docking simulation experiments, with 6 (R101, S187, Y188, Y189, L190, and T191) of the 9 amino acids predicted by our study having direct contacts with the A antigen according to the crystallography study (21). Our mutagenesis studies on two of the other three amino acids (S146, Y155, and S156) resulted in loss of A antigen binding for both the S146A and Y155A mutants. These data suggest that both mutations changed the local conformation, indirectly affecting the structural integrity and function of the binding site. Another possibility is that the binding mode of native A antigens differs somewhat from that observed in the crystallography study, which used a trisaccharide.
In summary, our study provides strong evidence for the genetic relatedness of VP8* in RV recognition of HBGAs. Two major genogroups (P[II] and P[III]) of RVs cause acute gastroenteritis in humans, and both genogroups recognize the human HBGAs, indicating that the human HBGAs play an important role in the infection and therefore the evolution of RVs. The genotype- and genogroup-specific variations of RVs in their interactions with different HBGAs suggest a functional selection of RVs by the polymorphic human HBGAs. These variations may also play a role in the host tropism and cross-species transmission of RVs among humans and many animal species. Further studies to elucidate these relationships, including exploring variations among animal RVs, are warranted.