The species Alphapapillomavirus 7 (alpha-7) contains human papillomavirus genotypes that account for 15% of invasive cervical cancers and are disproportionately associated with adenocarcinoma of the cervix. Complete genome analyses enable identification and nomenclature of variant lineages and sublineages.
The URR/E6 region was sequenced to screen for novel variants of HPV18, 39, 45, 59, 68, 70, 85 and 97 from 1147 cervical samples obtained from multiple geographic regions that had previously been shown to contain an alpha-7 HPV isolate. To study viral heterogeneity, the complete 8 kb genomes of 128 isolates, including 109 sequenced for this analysis, were annotated and analyzed. Viral evolution was characterized by constructing phylogenetic trees using maximum-likelihood and Bayesian algorithms. Global and pairwise alignments were used to calculate total and ORF/region nucleotide differences; lineages and sublineages were assigned using an alphanumeric system. The prototype genome was assigned to the A lineage or A1 sublineage.
The genomic diversity of alpha-7 HPV types ranged from 1.1% to 6.7% nucleotide sequence differences; the extent of genome-genome pairwise intratype heterogeneity was 1.1% for HPV39, 1.3% for HPV59, 1.5% for HPV45, 1.6% for HPV70, 2.1% for HPV18, and 6.7% for HPV68. ME180 (previously a subtype of HPV68) was designated as the representative genome for HPV68 sublineage C1. Each ORF/region differed in sequence diversity, from most variable to least variable: noncoding region 1 (NCR1) / noncoding region 2 (NCR2) > upstream regulatory region (URR) > E6 / E7 > E2 / L2 > E1 / L1.
These data provide estimates of the maximum viral genomic heterogeneity of alpha-7 HPV type variants. The proposed taxonomic system facilitates the comparison of variants across epidemiological and molecular studies. Sequence diversity, geographic distribution and phylogenetic topology of this clinically important group of HPVs suggest an independent evolutionary history for each type.
Recent studies indicate that human papillomaviruses (HPVs) from the genera Betapapillomavirus and Gammapapillomavirus are abundant in the human oral cavity. We report the cloning and characterization of a 7304 bp HPV120 genome from the oral cavity that is related most closely to HPV23 (L1 ORF, 83.7 % similarity), clustering it in the genus Betapapillomavirus (β-PV). HPV120 contains five early and two late genes, but no E5 ORF. HPV120 was detected from heterogeneous human biological niches, including the oral cavity, eyebrow hairs, anal canal and penile, vulvar and perianal warts. Characterization of the clinical spectrum of HPV120 infections indicates a broader spectrum of epithelial tropism than appreciated previously for HPV types from the genus β-PV.
Infection by a human papillomavirus (HPV) may result in a variety of clinical conditions ranging from benign warts to invasive cancer depending on the viral type. The HPV E2 protein represses transcription of the E6 and E7 genes in integrated papillomavirus genomes and together with the E1 protein is required for viral replication. E2 proteins bind with high affinity to palindromic DNA sequences consisting of two highly conserved four base pair sequences flanking a variable ‘spacer’ of identical length. The E2 proteins directly contact the conserved DNA but not the spacer DNA. However, variation in naturally occurring spacer sequences results in differential protein binding affinity. This discrimination in binding is dependent on their sensitivity to the unique conformational and/or dynamic properties of the spacer DNA in a process termed ‘indirect readout’. This article explores the structure of the E2 proteins and their interaction with DNA and other proteins, the effects of ions on affinity and specificity, and the phylogenetic and biophysical nature of this core viral protein. We have analyzed the sequence conservation and electrostatic features of three-dimensional models of the DNA binding domains of 146 papillomavirus types and variants with the goal of identifying characteristics associated with risk of virally caused malignancy. The amino acid sequence, three-dimensional structure, and the electrostatic features of the E2 protein DNA binding domain showed high conservation among all papillomavirus types. This indicates that the specific interactions between the E2 protein and its binding sites on DNA have been conserved throughout PV evolution. Analysis of the E2 protein’s transactivation domain showed that unlike the DNA binding domain, the transactivation domain does not have extensive surfaces of highly conserved residues. Rather, the regions of high conservation are localized to small surface patches.
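The binding-site architecture described above (two conserved four base pair half-sites flanking a four base pair spacer) can be illustrated with a short pattern scan. This is a minimal sketch only: the consensus ACCG-N4-CGGT is an assumption taken from the general E2 literature and is not stated in the abstract, and the function name `find_e2_sites` and the toy sequence are hypothetical.

```python
import re

# Assumed consensus (not given in the abstract): ACCG, any 4 bases, CGGT.
# The lookahead allows overlapping candidate sites to be reported.
E2_SITE = re.compile(r"(?=(ACCG([ACGT]{4})CGGT))")

def find_e2_sites(seq):
    """Return (start, site, spacer) for each candidate E2 binding site."""
    seq = seq.upper()
    return [(m.start(), m.group(1), m.group(2)) for m in E2_SITE.finditer(seq)]

# Toy sequence invented for illustration; it is not a real HPV region.
toy = "ttgaACCGAAAACGGTccatACCGTTTTCGGTaa"
sites = find_e2_sites(toy)
for start, site, spacer in sites:
    print(start, site, spacer)
```

Reporting the spacer separately is deliberate: the conserved half-sites are what the protein contacts, while the spacer is the part that varies between naturally occurring sites and modulates affinity through indirect readout.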
The invariance of the E2 DNA binding domain structure, electrostatics and sequence suggests that it may be a suitable target for the development of vaccines effective against a broad spectrum of HPV types.
Papillomavirus; DNA; Protein-DNA interactions; Electrostatics; E2; Review
Background. Human papillomaviruses (HPVs) primarily sort into 3 genera: Alphapapillomavirus (α-HPV), predominantly isolated from mucosa, and Betapapillomavirus (β-HPV) and Gammapapillomavirus (γ-HPV), predominantly isolated from skin. HPV types might infect body sites that are different from those from which they were originally isolated.
Methods. We investigated the spectrum of HPV type distribution in oral rinse samples from 2 populations: 52 human immunodeficiency virus (HIV)–positive men and women and 317 men who provided a sample for genomic DNA for a prostate cancer study. HPV types were detected with the MY09/MY11 and FAP59/64 primer systems and identified by dot blot hybridization and/or direct sequencing.
Results. Oral rinse specimens from 35 (67%) of 52 HIV-positive individuals and 117 (37%) of 317 older male participants tested positive for HPV DNA. We found 117 type-specific HPV infections in the HIV-positive individuals, including 73 α-HPV, 33 β-HPV, and 11 γ-HPV infections. In contrast, the 317 male participants had 168 type-specific infections: 46 α-HPV, 108 β-HPV, and 14 γ-HPV.
Conclusions. The oral cavity contains a wide spectrum of HPV types predominantly from the β-HPV and γ-HPV genera, which were previously considered to be cutaneous types. These results could have significant implications for understanding the biology of HPV and the epidemiological associations of HPV with oral and skin neoplasia.
Alpha human papillomaviruses (HPVs) are among the most common sexually transmitted agents of which a subset causes cervical neoplasia and cancer in humans. Alpha-PVs have also been identified in non-human primates although few studies have systematically characterized such mucosal PVs. We cloned and characterized 10 distinct types of PVs from exfoliated cervicovaginal cells from different populations of female cynomolgus macaques (Macaca fascicularis) originating from China and Indonesia. These include 5 novel genotypes and 5 previously identified genotypes found in rhesus (Macaca mulatta) (RhPV-1, RhPV-a, RhPV-b and RhPV-d) and cynomolgus macaques (MfPV-a). Type-specific primers were designed to amplify the complete PV genomes using an overlapping PCR method. Four MfPVs were associated with cervical intraepithelial neoplasia (CIN). The most prevalent virus type was MfPV-3 (formerly RhPV-d), which was identified in 60% of animals with CIN. In addition, the complete genomes of variants of MfPV-3 and RhPV-1 were characterized. These variants are 97.1% and 97.7% similar across the L1 nucleotide sequences with the prototype genomes, respectively. Sequence comparisons and phylogenetic analyses indicate that these novel MfPVs cluster together within the alpha PV α12 species closely related to the α9 (e.g., HPV16) and α11 species (e.g., HPV34), and all share a most recent common ancestor. Our data expand the molecular diversity of non-human primate PVs and suggest the recent expansion of alpha PV species groups. Moreover, identification of an overlapping set of MfPVs in rhesus and cynomolgus macaques indicates that non-human primate alpha PVs might not be strictly species specific and that “subtypes” may represent recent divergence of host species or past interspecies infection.
alpha papillomavirus; Macaca fascicularis; novel PVs; genomic diversity; evolution
We present an expansion of the classification of the family Papillomaviridae, which now contains 29 genera formed by 189 papillomavirus (PV) types isolated from humans (120 types), non-human mammals, birds and reptiles (64, 3 and 2 types, respectively). To accommodate the number of PV genera exceeding the Greek alphabet, the prefix “dyo” is used, continuing after the Omega-PVs with Dyodelta-PVs. The current set of human PVs are contained within five genera, whereas mammalian, avian and reptile PVs are contained within 20, 3 and 1 genera, respectively. We propose standardizations to the names of a number of animal PVs. As prerequisite for a coherent nomenclature of animal PVs, we propose founding a Reference Center for Animal PVs. We discuss that based on emerging species concepts derived from genome sequences, PV types could be promoted to the taxonomic level of species, but do not recommend implementing this change at the current time.
The rapidly expanding field of microbiome studies offers investigators a large choice of methods for each step in the process of determining the microorganisms in a sample. The human cervicovaginal microbiome affects female reproductive health, susceptibility to and natural history of many sexually transmitted infections, including human papillomavirus (HPV). At present, long-term behavior of the cervical microbiome in early sexual life is poorly understood.
The V6 and V6–V9 regions of the 16S ribosomal RNA gene were amplified from DNA isolated from exfoliated cervical cells. Specimens from 10 women participating in the Natural History Study of HPV in Guanacaste, Costa Rica were sampled successively over a period of 5–7 years. We sequenced amplicons using 3 different platforms (Sanger, Roche 454, and Illumina HiSeq 2000) and analyzed sequences using pipelines based on 3 different classification algorithms (usearch, RDP Classifier, and pplacer).
Usearch and pplacer provided consistent microbiome classifications for all sequencing methods, whereas RDP Classifier deviated significantly when characterizing Illumina reads. Comparing across sequencing platforms indicated 7%–41% of the reads were reclassified, while comparing across software pipelines reclassified up to 32% of the reads. Variability in classification was shown not to be due to a difference in read lengths. Six cervical microbiome community types were observed and are characterized by a predominance of either G. vaginalis or Lactobacillus spp. Over the 5–7 year period, subjects displayed fluctuation between community types. A PERMANOVA analysis on pairwise Kantorovich-Rubinstein distances between the microbiota of all samples yielded an F-test ratio of 2.86 (p<0.01), indicating a significant difference comparing within and between subjects’ microbiota.
Amplification and sequencing methods affected the characterization of the microbiome more than classification algorithms. Pplacer and usearch performed consistently with all sequencing methods. The analyses identified 6 community types consistent with those previously reported. The long-term behavior of the cervical microbiome indicated that fluctuations were subject dependent.
Human Papillomavirus type 16 (HPV16) causes over half of all cervical cancer and some HPV16 variants are more oncogenic than others. The genetic basis for the extraordinary oncogenic properties of HPV16 compared to other HPVs is unknown. In addition, we neither know which nucleotides vary across and within HPV types and lineages, nor which of the single nucleotide polymorphisms (SNPs) determine oncogenicity.
A reference set of 62 HPV16 complete genome sequences was established and used to examine patterns of evolutionary relatedness amongst variants using a pairwise identity heatmap and HPV16 phylogeny. A BLAST-based algorithm was developed to impute complete genome data from partial sequence information using the reference database. To interrogate the oncogenic risk of determined and imputed HPV16 SNPs, odds-ratios for each SNP were calculated in a case-control viral genome-wide association study (VWAS) using biopsy confirmed high-grade cervix neoplasia and self-limited HPV16 infections from Guanacaste, Costa Rica.
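The per-SNP odds ratio used in a case-control comparison of this kind can be sketched as follows. This is an illustration under stated assumptions, not the study's analysis code: the function name `odds_ratio`, the Woolf (log-scale) 95% confidence interval, the Haldane-Anscombe zero-cell correction, and the example counts are all assumptions introduced here.

```python
import math

def odds_ratio(case_with, case_without, ctrl_with, ctrl_without):
    """Odds ratio and 95% Woolf confidence interval for one SNP.

    case_with / case_without: cases carrying / not carrying the SNP allele.
    ctrl_with / ctrl_without: controls carrying / not carrying it.
    """
    cells = [case_with, case_without, ctrl_with, ctrl_without]
    # Assumed convention: add 0.5 to every cell when any cell is zero
    # (Haldane-Anscombe correction) so the ratio stays defined.
    if any(c == 0 for c in cells):
        cells = [c + 0.5 for c in cells]
    a, b, c, d = cells
    ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ratio) - 1.96 * se)
    upper = math.exp(math.log(ratio) + 1.96 * se)
    return ratio, lower, upper

# Made-up counts for illustration only.
ratio, lower, upper = odds_ratio(20, 10, 10, 20)
print(f"OR = {ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because SNPs within a lineage are inherited together, a high odds ratio for one SNP in such a table marks the lineage rather than proving that SNP is causal.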
HPV16 variants display evolutionarily stable lineages that contain conserved diagnostic SNPs. The imputation algorithm indicated that an average of 97.5±1.03% of SNPs could be accurately imputed. The VWAS revealed specific HPV16 viral SNPs associated with variant lineages and elevated odds ratios; however, individual causal SNPs could not be distinguished with certainty due to the nature of HPV evolution.
Conserved and lineage-specific SNPs can be imputed with a high degree of accuracy from limited viral polymorphic data due to the lack of recombination and the stochastic mechanism of variation accumulation in the HPV genome. However, to determine the role of novel variants or non-lineage-specific SNPs by VWAS will require direct sequence analysis. The investigation of patterns of genetic variation and the identification of diagnostic SNPs for lineages of HPV16 variants provides a valuable resource for future studies of HPV16 pathogenicity.
Human papillomavirus 16 (HPV16) species group (alpha-9) of the Alphapapillomavirus genus contains HPV16, HPV31, HPV33, HPV35, HPV52, HPV58 and HPV67. These HPVs account for 75% of invasive cervical cancers worldwide. Viral variants of these HPVs differ in evolutionary history and pathogenicity. Moreover, a comprehensive nomenclature system for HPV variants is lacking, limiting comparisons between studies.
DNA from cervical samples previously characterized for HPV type was obtained from multiple geographic regions to screen for novel variants. The complete 8 kb genomes of 120 variants representing the major and minor lineages of the HPV16-related alpha-9 HPV types were sequenced to capture maximum viral heterogeneity. Viral evolution was characterized by constructing phylogenetic trees based on complete genomes using multiple algorithms. Maximal and viral region specific divergence was calculated by global and pairwise alignments. Variant lineages were classified and named using an alphanumeric system; the prototype genome was assigned to the A lineage for all types.
The range of genome-genome sequence heterogeneity varied from 0.6% for HPV35 to 2.2% for HPV52 and included 1.4% for HPV31, 1.1% for HPV33, 1.7% for HPV58 and 1.1% for HPV67. Nucleotide differences of approximately 1.0%–10.0% and 0.5%–1.0% of the complete genomes were used to define variant lineages and sublineages, respectively. Each gene/region differs in sequence diversity, from most variable to least variable: noncoding region 1 (NCR1) / noncoding region 2 (NCR2) > upstream regulatory region (URR) > E6 / E7 > E2 / L2 > E1 / L1.
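The lineage and sublineage cutoffs above can be turned into a small classifier over pairwise genome differences. The sketch below assumes the two genomes are already globally aligned; the function names, the treatment of gap columns, and the handling of the exact boundary values (the abstract gives the cutoffs only as approximate) are assumptions made for illustration.

```python
def percent_difference(a, b):
    """Percent of differing columns between two aligned sequences.

    Columns where both sequences carry a gap ('-') are skipped; a gap
    against a base counts as a difference (an assumed convention).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x != y:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * diffs / compared

def classify_variant(pct):
    """Map a whole-genome percent difference to a variant category."""
    if pct < 0.5:
        return "same sublineage"
    if pct < 1.0:
        return "different sublineage"
    if pct <= 10.0:
        return "different lineage"
    return "outside variant range"

# Toy aligned fragments, invented for illustration.
pct = percent_difference("ACGTACGTAC", "ACGTACGTAT")
print(pct, classify_variant(pct))
```

Applied to the maxima reported above, the 2.2% difference within HPV52 falls in the lineage range, while a 0.6% difference would separate sublineages only.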
These data define maximum viral genomic heterogeneity of HPV16-related alpha-9 HPV variants. The proposed nomenclature system facilitates the comparison of variants across epidemiological studies. Sequence diversity and phylogenies of this clinically important group of HPVs provide the basis for further studies of discrete viral evolution, epidemiology, pathogenesis and preventative/therapeutic interventions.
HPV types differ profoundly in cervical carcinogenicity. For the most carcinogenic type, HPV16, variant lineages representing further evolutionary divergence also differ in cancer risk. Variants of the remaining 10-15 carcinogenic HPV types have not been well-studied.
In the first prospective, population-based study of HPV variants, we explored whether, on average, the oldest evolutionary branches within each carcinogenic type predicted different risks of ≥2-year viral persistence and/or precancer and cancer (CIN3+). We examined the natural history of HPV variants in the 7-year, 10,049-woman Guanacaste Cohort Study, using a nested case-control design. Infections were assigned to a variant lineage determined by phylogenetic parsimony methods based on URR/E6 sequences. We used Fisher's combination test to evaluate significance of the risk associations, cumulating evidence across types.
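Fisher's combination test pools k independent p-values through the statistic X = -2 Σ ln(p_i), which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis. The following is a minimal standard-library sketch of that calculation; the function name `fisher_combined` and the example p-values are hypothetical and are not the per-type values from the study.

```python
import math

def fisher_combined(p_values):
    """Return (statistic, combined p-value) for independent p-values."""
    k = len(p_values)
    if k == 0 or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("need at least one p-value in (0, 1]")
    stat = -2.0 * sum(math.log(p) for p in p_values)
    # For even degrees of freedom 2k, the chi-square upper tail has the
    # closed form exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = stat / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return stat, math.exp(-half) * total

# Hypothetical per-type p-values, for illustration only.
stat, combined = fisher_combined([0.5, 0.5])
print(round(stat, 3), round(combined, 3))
```

The method assumes the per-type tests are independent; with a single p-value it returns that p-value unchanged, which is a convenient sanity check.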
Globally, for HPV types including HPV16, the p-value was 0.01 for persistence and 0.07 for CIN3+. Excluding HPV16, the p-values were 0.04 and 0.37, respectively. For HPV16, non-European viral variants were significantly more likely than European variants to cause persistence (OR = 2.6, p = 0.01) and CIN3+ (OR = 2.4, p = 0.004). HPV35 and HPV51 variant lineages also predicted CIN3+.
HPV variants generally differ in risk of persistence. For some HPV types, especially HPV16, variant lineages differ in risk of CIN3+. The findings indicate that continued evolution of HPV types has led to even finer genetic discrimination linked to HPV natural history and cervical cancer risk. Larger viral genomic studies are warranted, especially to identify the genetic basis for HPV16's unique carcinogenicity.
HPV; variants; evolution; cervix; cancer
Human Papillomavirus (HPV) E6 induced p53 degradation is thought to be an essential activity by which high-risk human Alphapapillomaviruses (alpha-HPVs) contribute to cervical cancer development. However, most of our understanding is derived from the comparison of HPV16 and HPV11. These two viruses are relatively distinct viruses, making the extrapolation of these results difficult. In the present study, we expand the tested strains (types) to include members of all known HPV species groups within the Alphapapillomavirus genus.
We report the biochemical activity of E6 proteins from 27 HPV types representing all alpha-HPV species groups to degrade p53 in human cells. Expression of E6 from all HPV types epidemiologically classified as group 1 carcinogens significantly reduced p53 levels. However, several types not associated with cancer (e.g., HPV53, HPV70 and HPV71) were equally active in degrading p53. HPV types within species groups alpha 5, 6, 7, 9 and 11 share a most recent common ancestor (MRCA) and all contain E6 ORFs that degrade p53. A unique exception was HPV71, whose E6 ORF degraded p53 although the type lies outside this clade; HPV71 was one of the most prevalent HPV types infecting the cervix in a population-based study of 10,000 women. Alignment of E6 ORFs identified an amino acid site that was highly correlated with the biochemical ability to degrade p53. Alteration of this amino acid in HPV71 E6 abrogated its ability to degrade p53, while alteration of this site in HPV71-related HPV90 and HPV106 E6s enhanced their capacity to degrade p53.
These data suggest that the alpha-HPV E6 proteins' ability to degrade p53 is an evolved phenotype inherited from a most recent common ancestor of the high-risk species that does not always segregate with carcinogenicity. In addition, we identified an amino-acid residue strongly correlated with viral p53 degrading potential.
Persistent infection by specific oncogenic human papillomaviruses (HPVs) is established as the necessary cause of cervix cancer. DNA sequence differences between HPV genomes determine whether an HPV has the potential to cause cancer. Of the more than 100 HPV genotypes characterized at the genetic level, at least 15 are associated, to varying degrees, with cervical cancer. Classification based on nucleotide similarity places nearly all HPVs that infect the cervicovaginal area within the α-PV genus. Within this genus, phylogenetic trees inferred from the entire viral genome cluster all cancer-causing types together, suggesting the existence of a common ancestor for the oncogenic HPVs. However, in separate trees built from the early open reading frames (ORFs; i.e. E1, E2, E6, E7) or the late ORFs (i.e. L1, L2), the carcinogenic potential sorts with the early region of the genome, but not the late region. Thus, genetic differences within the early region specify the pathogenic potential of α-HPV infections. Since the HPV genomes are monophyletic and sites are highly correlated across the genome, diagnosis of oncogenic types and non-oncogenic types can be accomplished using any region across the genome. Here we review our current understanding of the evolutionary history of the oncogenic HPVs. In particular, we focus on the importance of viral genome heterogeneity and discuss the genetic basis for the oncogenic phenotype in some but not all α-PVs.
Human papillomavirus; Cervix cancer; Evolution; Phylogeny
To assess the role of human papillomavirus (HPV) genetics in cervical lesions, we sequenced the E7 gene of HPV16, 31, or 73 from singly infected women who (1) cleared the infection quickly, (2) had type-specific persistent infection, or (3) progressed to CIN2 or worse lesions. Four of the 296 HPV16 E7 nucleotides were variable, compared with 7 of 296 for HPV31 E7 and 4 of 296 for HPV73 E7. While most of the polymorphisms in HPV31 and -73 resulted in non-synonymous amino acid changes, the polymorphisms in the HPV16 E7 resulted in synonymous changes. The lack of heterogeneity of HPV16 E7 suggests high evolutionary purifying selection that might be related to the unique carcinogenicity of HPV16.
Human papillomavirus type 18 (HPV18) and HPV45 account for approximately 20% of all cervix cancers. We show that HPV18, HPV45, and the recently discovered HPV97 comprise a clade sharing a most recent common ancestor within HPV α7 species. Variant lineages of these HPV types were classified by sequence analysis of the upstream regulatory region/E6 region among cervical samples from a population-based study in Costa Rica, and 27 representative genomes from each major variant lineage were sequenced. Nucleotide variation within HPV18 and HPV45 was 3.82% and 2.39%, respectively, and amino acid variation was 4.73% and 2.87%, respectively. Only 18 nucleotide variations, of which 10 were nonsynonymous, were identified among three HPV97 genomes. Full-genome comparisons revealed maximal diversity between HPV18 African and non-African variants (2.6% dissimilarity), whereas HPV18 Asian-American [E1 (AA)] and European (E2) variants were closely related (less than 0.5% dissimilarity); HPV45 genomes had a maximal difference of 1.6% nucleotides. Using a Bayesian Markov chain Monte Carlo (MCMC) method, the divergence times of HPV18, -45, and -97 from their most recent common ancestors indicated that HPV18 diverged approximately 7.7 million years (Myr) ago, whereas HPV45 and HPV97 split off around 5.7 Myr ago, in a period encompassing the divergence of the great ape species. Variants within the HPV18/45/97 lineages were estimated to have diverged from their common ancestors in the genus Homo within the last 1 Myr (<0.7 Myr). To investigate the molecular basis of HPV18, HPV45, and HPV97 evolution, regression models of codon substitution were used to identify lineages and amino acid sites under selective pressure. The E5 open reading frame (ORF) of HPV18 and the E4 ORFs of HPV18, HPV45, and HPV18/45/97 had nonsynonymous/synonymous substitution rate ratios (dN/dS) over 1 indicative of positive Darwinian selection. 
The L1 ORF of HPV18 genomes had an increased proportion of nonsynonymous substitutions (4.93%; average dN/dS ratio [M3] = 0.3356) compared to HPV45 (1.86%; M3 = 0.1268) and HPV16 (2.26%; M3 = 0.1330) L1 ORFs. In contrast, HPV18 and HPV16 genomes had similar amino acid substitution rates within the E1 ORF (2.89% and 3.24%, respectively), while HPV45 E1 was highly conserved (amino acid substitution rate was 0.77%). These data provide an evolutionary history of this medically important clade of HPVs and identify an unexpected divergence of the L1 gene of HPV18 that may have clinical implications for the long-term use of an L1-virus-like particle-based prophylactic vaccine.
Cervical cancer is one of the leading causes of cancer mortality in women worldwide, yet few suitable animal models currently exist for study of this disease. Virtually all cases of cervical cancer in women are caused by specific types of genital human papillomavirus (HPV). In this study, we investigated naturally occurring genital PVs in female cynomolgus macaques (Macaca fascicularis) without breeding contact for at least 3.5 years. Exfoliated cervicovaginal cells from 19 of 54 animals tested positive for at least one PV. Seven different PVs were identified, including four novel genotypes and two genotypes (RhPV-d and RhPV-a) previously identified in rhesus macaques (Macaca mulatta). Four PV types were associated with cervical intraepithelial neoplasia (CIN), which resembled human CIN by endoscopy, cervical cytology, histology, and immunohistochemistry. The presence of CIN was highly associated with PV infection (P < 0.0001). The most prevalent virus type was RhPV-d, which was identified in 60% of animals with CIN. An RhPV-d genome sequenced from a high-grade CIN lesion was found to be phylogenetically related to the highly oncogenic HPV16. Transfer of cervical cytobrush samples from donor animals naturally carrying RhPV-d resulted in new infections in 4 of 12 previously virus-free animals and abnormal cytology and histology in 1 of 4 infected animals after 18 weeks of infection. Experimental transmission was confirmed by E1^E4 reverse transcription-PCR products and RhPV-d sequence identity with the donor variant. These findings identify key similarities between macaque and human oncogenic PVs which should prove useful in the study of viral persistence, carcinogenesis, and therapeutic development.
Complete genomes of HPV101 and HPV103 were PCR amplified and cloned from cervicovaginal cells of a 34-year-old female with cervical intraepithelial neoplasia grade 3 (CIN 3) and a 30-year-old female with a normal Pap test, respectively. HPV101 and HPV103 contain 4 early genes (E7, E1, E2 and E4) and 2 late genes (L2 and L1), but both lack the canonical E6 ORF. Pairwise alignment similarity of the L1 ORF nucleotide sequences of HPV101 and HPV103 indicated that they are at least 30 % dissimilar to each other and all known PVs. However, similarities of the other ORFs (E7, E1, E2, and L2) indicated that HPV101 and HPV103 are most related to each other. Phylogenetic analyses revealed that these two types form a monophyletic clade, clustering together with the gamma- and pi-PV groups. These data demonstrated that HPV genomes closely related to papillomaviruses identified from cutaneous epithelia can be isolated from the genital mucosal region. Moreover, this is the first report of HPVs lacking an E6 ORF and phylogenetic evidence suggests this occurred subsequent to their emergence from the gamma-/pi-PVs.
Human papillomavirus; novel type; complete genome; phylogeny; molecular clock
The human papillomaviruses (HPVs) have long been thought to follow a monophyletic pattern of evolution with little if any evidence for recombination between genomes. On the basis of this model, both oncogenicity and tissue tropism appear to have evolved once. Still, no systematic statistical analyses have shown whether monophyly is the rule across all HPV open reading frames (ORFs). We conducted a taxonomic analysis of 59 mucosal/genital HPVs using whole-genome and sliding-window similarity measures; maximum-parsimony, neighbor-joining, and Bayesian phylogenetic analyses; and localized incongruence length difference (LILD) analyses. The algorithm for the LILD analyses localized incongruence by calculating the tree length differences between constrained and unconstrained nodes in a total-evidence tree across all HPV ORFs. The process allows statistical evaluation of every ORF/node pair in the total-evidence tree. The most significant incongruence was observed at the putative high-risk (i.e., cancer-associated) node, the common oncogenic ancestor for alpha HPV species 9 (e.g., HPV type 16 [HPV16]), 11, 7 (e.g., HPV18), 5, and 6. Although these groups share early-gene homology, including high degrees of similarity among E6 and E7, groups 9 and 11 diverge from groups 7, 5, and 6 with respect to L2 and L1. The HPV species groups primarily associated with cervical and anogenital cancers appear to follow two distinct evolutionary paths, one conferred by the early genes and another by the late genes. The incongruence in the genital HPV phylogeny could have occurred from an early recombination event, an ecological niche change, and/or asymmetric genome convergence driven by intense selection. These data indicate that the phylogeny of the oncogenic HPVs is complex and that their evolution may not be monophyletic across all genes.
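A sliding-window similarity measure of the kind mentioned above reports percent identity in successive windows along a pairwise alignment, so that regions where two genomes agree (for example the early genes) can be contrasted with regions where they diverge (for example L2 and L1). The sketch below is illustrative only: the function name, the default window and step sizes, and the choice to compare gap characters like any other symbol are assumptions, not the parameters used in the study.

```python
def sliding_window_identity(a, b, window=100, step=50):
    """Percent identity per window along two aligned sequences.

    Returns a list of (window_start, percent_identity). Gap characters
    are compared like any other symbol (an assumed simplification).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    a, b = a.upper(), b.upper()
    results = []
    for start in range(0, len(a) - window + 1, step):
        matches = sum(
            1 for x, y in zip(a[start:start + window], b[start:start + window])
            if x == y
        )
        results.append((start, 100.0 * matches / window))
    return results

# Toy alignment: identical first half, fully divergent second half.
profile = sliding_window_identity("AAAAAAAAAA", "AAAAATTTTT", window=5, step=5)
print(profile)
```

A sharp change in identity between adjacent windows is only a cue for closer inspection; statistical support for incongruence requires the tree-based tests described in the abstract.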
Human papillomavirus type 16 (HPV16) is the primary etiological agent of cervical cancer, the second most common cancer in women worldwide. Complete genomes of 12 isolates representing the major lineages of HPV16 were cloned and sequenced from cervicovaginal cells. The sequence variations within the open reading frames (ORFs) and noncoding regions were identified and compared with the HPV16R reference sequence (50). This whole-genome approach gives us unprecedented precision in detailing sequence-level changes that are under selection on a whole-viral-genome scale. Of 7,908 base pair nucleotide positions, 313 (4.0%) were variable. Within the 2,452 amino acids (aa) comprising 8 ORFs, 243 (9.9%) amino acid positions were variable. In order to investigate the molecular evolution of HPV16 variants, maximum likelihood models of codon substitution were used to identify lineages and amino acid sites under selective pressure. Five codon sites in the E5 (aa 48, 65) and E6 (aa 10, 14, 83) ORFs were demonstrated to be under diversifying selective pressure. The E5 ORF had the overall highest nonsynonymous/synonymous substitution rate (ω) ratio (M3 = 0.7965). The E2 gene had the next-highest ω ratio (M3 = 0.5611); however, no specific codons were under positive selection. These data indicate that the E6 and E5 ORFs are evolving under positive Darwinian selection and have done so in a relatively short time period. Whether response to selective pressure upon the E5 and E6 ORFs contributes to the biological success of HPV16, its specific biological niche, and/or its oncogenic potential remains to be established.
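The variable-position counts reported above amount to counting alignment columns that contain more than one distinct residue. A minimal sketch, assuming a gap-free multiple alignment held as equal-length strings (the function name and toy alignment are hypothetical), with the reported percentages rechecked from the stated counts:

```python
def variable_positions(alignment):
    """Return indices of alignment columns with more than one residue."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all sequences must have the same aligned length")
    return [
        i for i, column in enumerate(zip(*alignment))
        if len(set(column)) > 1
    ]

# Toy alignment invented for illustration.
toy_alignment = ["ACGT", "ACGA", "ATGT"]
variable = variable_positions(toy_alignment)
print(variable, f"{100.0 * len(variable) / len(toy_alignment[0]):.1f}%")

# Rechecking the reported proportions from the counts in the abstract.
nucleotide_pct = round(100.0 * 313 / 7908, 1)   # variable nucleotide positions
amino_acid_pct = round(100.0 * 243 / 2452, 1)   # variable amino acid positions
print(nucleotide_pct, amino_acid_pct)
```

The same function works on protein alignments, which is how the nucleotide and amino acid proportions can be computed with one routine.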