Human papillomavirus (HPV) 58 accounts for a notable proportion of cervical cancers in East Asia and parts of Latin America, but it is uncommon elsewhere. The reason for such ethnogeographical predilection is unknown. In our study, nucleotide sequences of E6 and E7 genes of 401 HPV58 isolates collected from 15 countries/cities across four continents were examined. Phylogenetic relationship, geographical distribution and risk association of nucleotide sequence variations were analyzed. We found that the E6 genes of HPV58 variants were more conserved than E7. Thus, E6 is a more appropriate target for type-specific detection, whereas E7 is more appropriate for strain differentiation. The frequency of sequence variation varied geographically. Africa had significantly more isolates with E6-367A (D86E) but significantly fewer isolates with E6-203G, -245G, -367C (prototype-like) than other regions (p ≤ 0.003). E7-632T, -760A (T20I, G63S) was more frequently found in Asia, and E7-793G (T74A) was more frequent in Africa (p < 0.001). Variants with T20I and G63S substitutions at E7 conferred a significantly higher risk for cervical intraepithelial neoplasia grade III and invasive cervical cancer compared to other HPV58 variants (odds ratio = 4.44, p = 0.007). In conclusion, T20I and/or G63S substitution(s) at E7 of HPV58 is/are associated with a higher risk for cervical neoplasia. These substitutions are more commonly found in Asia and the Americas, which may account for the higher disease attribution of HPV58 in these areas.
High-risk human papillomavirus (HPV) plays a necessary role in the development of cervical cancer, which is the third most common cancer in women worldwide.1 Among the 15 HPV types classified as high-risk, HPV16 and HPV18 are the two most important types accounting for about 70% of cervical cancers across the world. The remaining cancers are caused by HPV31, 33, 35, 39, 45, 51, 52, 56, 58 and 59. The distribution of these less common types shows some degree of ethnogeographical variation.2,3 HPV58 is an uncommon type with skewed ethnogeographical distribution. The prevalence of HPV58 is remarkably high among cervical squamous cell carcinomas in East Asia (33% in Shaanxi, 26% in Shanghai, 16% in South Korea, 10% in Hong Kong and Taiwan and 8% in Japan),4–9 whereas it only accounts for about 2% of cervical cancers in other regions.2,3 In addition to East Asia, a higher prevalence of HPV58 has also been observed in a few places in Latin America. HPV58 was found in 12% of cervical cancers in Costa Rica,10 21% in Southeastern Mexico,11 9.5% in Northeastern Brazil12 and was the second most common type (about 13%) found in precancerous lesions in Central and Southern Brazil.13,14
The reason for an ethnogeographical predilection in the attribution of HPV58 to cervical cancer is not fully understood. HPV58 variants are classified into four lineages. These lineages exhibit a distribution pattern that varies across the world.15,16 Studies comparing the risk association of these variant lineages are very limited and a conclusion has not yet been reached.17,18
The E6 and E7 proteins encoded by high-risk HPV play a key role in cellular transformation. Previous studies from Eastern19 and Southern Chinese populations20 suggested that sequence variations within these regions may be associated with a higher risk for cervical neoplasia. At present, information on sequence variation of the E6 and E7 open reading frames (ORF) of HPV58 has been derived mainly from Chinese populations.18–20 Our study examined the sequence polymorphisms of HPV58 E6 and E7 ORF and assessed the geographical distribution and risk association with cervical neoplasia from a global perspective.
A total of 401 samples including 295 cervical scrapes or swabs, 69 vaginal swabs and 37 anal samples collected from 15 countries and cities were examined in our study (Table 1). These samples have been used to establish a lineage classification system for HPV58 variants.16
The full length of E6 and E7 ORF was amplified by polymerase chain reaction (PCR) using primers 5′-CGA AAA CGG TCT GAC CGA AA-3′ and 5′-TAT CGT CTG CTG TTT CGT CC-3′, followed by a second PCR using inner primers (5′-GAC CGA AAC CGG TGC ATA TA-3′ and 5′-ACC GCT TCT ACC TCA AAC CA-3′) when necessary. Briefly, the PCR was conducted in a 50-μL reaction mix containing 4 μL of extracted DNA, 200 μM of deoxynucleotide triphosphates, forward and reverse primers at 0.25 μM each and 1.25 units of HotStarTaq Plus polymerase (Qiagen, Hilden, Germany). The cycling conditions were activation of polymerase at 95°C for 5 min, 40 cycles of denaturation at 94°C for 50 sec, annealing at 57°C (59°C for nested PCR) for 1 min and extension at 72°C for 1 min, followed by a final extension at 72°C for 8 min. Ten microliters of the purified PCR products were mixed with 2 μL of BigDye Terminator v.3.1 sequencing reaction mix (Applied Biosystems, Foster City, USA), 3 μL of 5× sequencing buffer and 3.2 pmol of the sequencing primer, which was the corresponding PCR primer described above, and made up to a final volume of 20 μL. The cycling conditions for the labeling PCR were 25 cycles at 95°C for 15 sec, 50°C for 15 sec and 65°C for 75 sec. Fluorescence-labeled PCR products were purified with DyeEx (Qiagen) and run on an ABI 3130 automated sequence analyzer (Applied Biosystems). Sequence data were obtained from both directions and analyzed with SeqScape v2.5 (Applied Biosystems). Variations that occurred only in one isolate were confirmed by repeating PCR sequencing from the original sample.
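As a rough planning aid, the first-round cycling program described above can be written down as data and its total programmed hold time summed. This is only a sketch: the constant and function names are illustrative, and ramp times between temperatures are ignored (an assumption), so the result is a lower bound on the actual run time.

```python
# First-round PCR program as described in the text.
# Ramp times between steps are ignored (an assumption), so the total is a lower bound.
ACTIVATION_S = 5 * 60        # polymerase activation, 95 °C for 5 min
CYCLES = 40
DENATURE_S = 50              # 94 °C for 50 sec
ANNEAL_S = 60                # 57 °C (59 °C for nested PCR) for 1 min
EXTEND_S = 60                # 72 °C for 1 min
FINAL_EXTENSION_S = 8 * 60   # 72 °C for 8 min


def total_hold_time_s() -> int:
    """Sum of all programmed hold times, in seconds."""
    per_cycle = DENATURE_S + ANNEAL_S + EXTEND_S
    return ACTIVATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


if __name__ == "__main__":
    seconds = total_hold_time_s()
    print(f"{seconds} s = {seconds / 60:.1f} min")
```

Under these assumptions the holds alone take a little over two hours per round, before any nested PCR.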
The program MEGA5 was applied to identify the best evolutionary model and to construct the maximum-likelihood trees using the close-neighbor-interchange (CNI) search approach.21 The data were bootstrap resampled 1,000 times for tree topology evaluation. HPV33 was used as an outgroup sequence. The program Datamonkey (http://www.datamonkey.org) was used to detect positive or negative selection.22,23
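The bootstrap step can be illustrated with a minimal sketch. It is not MEGA5's implementation; it only shows the resampling idea, in which alignment columns are drawn with replacement to build each pseudo-alignment, from which a tree would then be re-estimated. The toy sequences, function names and seed are hypothetical.

```python
import random


def bootstrap_alignment(seqs, rng):
    """Return one bootstrap replicate: alignment columns sampled with replacement."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    # Draw as many column indices as the alignment has columns, with replacement.
    cols = [rng.randrange(length) for _ in range(length)]
    return ["".join(s[c] for c in cols) for s in seqs]


def bootstrap_replicates(seqs, n=1000, seed=0):
    """Generate n replicates; n=1000 mirrors the number of resamplings in the text."""
    rng = random.Random(seed)
    return [bootstrap_alignment(seqs, rng) for _ in range(n)]


if __name__ == "__main__":
    toy = ["ACGTAC", "ACGAAC", "TCGAAC"]  # hypothetical aligned sequences
    reps = bootstrap_replicates(toy, n=1000, seed=1)
    print(len(reps), reps[0])
```

Bootstrap support for a branch is then the fraction of replicate trees in which that branch reappears; tree building itself is left out of this sketch.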
All data analyses were performed with the Statistical Package for Social Sciences (version 18.0.0, IBM Corp., NY). The rate of detection of E6 and E7 sequence patterns was compared between the four regions using the χ2 test, with the Bonferroni method applied to account for multiple comparisons.24 For the purpose of risk association analysis, women with normal cytology or absence of lesion at colposcopy or biopsy were grouped as “normal cervices.” Using cervical intraepithelial neoplasia grade III (CIN3) and cervical cancer as an outcome variable and normal cervices as control, the risk association of nonsynonymous substitutions was first assessed by univariate analysis using χ2 test or Fisher's exact test. Those mutations found to be significant for the association (p < 0.05) were tested for interactions, followed by unconditional binary logistic regression analyses controlling for patients' age and their country of origin. All p values <0.05 were regarded as statistically significant.
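The regional comparison can be sketched as follows, assuming each comparison reduces to a 2×2 table (pattern present or absent in two regions) and that the Bonferroni adjustment multiplies each p value by the number of comparisons. The counts and the number of comparisons in the usage lines are hypothetical, and this is not the SPSS procedure itself.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denom


def p_value_1df(chi2):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def bonferroni(p, n_comparisons):
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p * n_comparisons)


if __name__ == "__main__":
    # Hypothetical counts: pattern present/absent in region X vs region Y.
    stat = chi2_2x2(20, 19, 90, 10)
    raw_p = p_value_1df(stat)
    print(stat, raw_p, bonferroni(raw_p, n_comparisons=3))
```

Equivalently, one can leave the p values unadjusted and compare them against 0.05 divided by the number of comparisons.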
This study included 37 anal samples collected from men in the USA. In our previous study on HPV58 variant lineage based on the same cohort of samples, no difference in the distribution of variants between these anal samples and cervical samples collected from the same region was observed.16 In the present study, too, there were no differences in the distribution of E6 and E7 variants between these two sources of samples. For these reasons, anal samples were pooled together with cervical samples for the purpose of analysis in our study.
Altogether, 19 variants showing nucleotide sequence variations in the E6 ORF were identified (Fig. 1). The maximum variation between any two variants was nine nucleotides, a variability of 2.0% over the 450-bp E6 ORF. The mean dN/dS was 0.916, and no site with significant positive or negative selection was found. Altogether, 16 nucleotide positions showing sequence polymorphism were identified, of which 11 had nonsynonymous substitutions leading to amino acid changes. Based on the phylogenetic relationship, the E6 sequence patterns could be classified into four groups designated as E6-A, -B, -C, and -D, and each group was associated with a signature nucleotide sequence variation as shown in Figure 2a. The maximum intergroup nucleotide sequence variability ranged from 0.9% (E6-A and -B) to 2.0% (E6-B and -D), whereas the maximum intragroup variability ranged from 0.7% (E6-A and -C) to 1.3% (E6-B).
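The variability figures quoted here are the largest number of pairwise nucleotide differences divided by the ORF length. A minimal sketch of that arithmetic, with illustrative function names and hypothetical toy sequences, is:

```python
from itertools import combinations


def n_differences(a, b):
    """Number of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def max_pairwise_differences(seqs):
    """Largest number of differences between any two sequences in the set."""
    return max(n_differences(a, b) for a, b in combinations(seqs, 2))


def percent_variability(n_diff, orf_length):
    """Differences expressed as a percentage of the ORF length."""
    return 100.0 * n_diff / orf_length


if __name__ == "__main__":
    # Nine differences over the 450-bp E6 ORF, as reported in the text.
    print(round(percent_variability(9, 450), 1))
    # Hypothetical toy alignment to exercise the pairwise comparison.
    print(max_pairwise_differences(["AAAA", "AAAT", "TTAT"]))
```

The same function applied within or between the E6-A to E6-D groups would give the intragroup and intergroup maxima, given the variant sequences themselves.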
Twenty-one variants showing sequence variations in the E7 ORF were identified (Fig. 3). The maximum nucleotide sequence variation between any two variants was nine, a variability of 3.0% over the 297-bp ORF. The mean dN/dS of E7 was 0.748; a ratio below 1 does not indicate positive selection overall. Overall, 18 nucleotide positions showing sequence polymorphisms were identified, and 11 had nonsynonymous substitutions (Fig. 3). Four groups of E7 sequence patterns designated as E7-A with signature sequence variation of 632C, 694G, 793A; E7-B (694A); E7-C (632T) and E7-D (793G) were identified. The phylogenetic relationship and their associated signature nucleotide sequence variations are shown in Figure 2b.
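The link between genome nucleotide positions (632, 694, 760, 793) and the amino acid substitutions named in this paper (T20I, G41R, G63S, T74A) can be sketched as below. The E7 start coordinate of 574 is an assumption not stated in the text; it is used because it reproduces the codon numbers that the paper pairs with each position. The example codons (`ACC`, `GGC`) are illustrative and not taken from the HPV58 sequence.

```python
# Assumption: first nucleotide of the E7 ORF in prototype genome numbering.
E7_ORF_START = 574

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_of(position, orf_start):
    """Map a genome position to (1-based codon number, 0-based offset in codon)."""
    offset = position - orf_start
    if offset < 0:
        raise ValueError("position lies upstream of the ORF start")
    return offset // 3 + 1, offset % 3


def substitute(codon, offset, new_base):
    """Return (original amino acid, amino acid after changing one base)."""
    mutated = codon[:offset] + new_base + codon[offset + 1:]
    return CODON_TABLE[codon], CODON_TABLE[mutated]


if __name__ == "__main__":
    for pos in (632, 694, 760, 793):
        print(pos, codon_of(pos, E7_ORF_START))
    # Illustrative codons only: a C-to-T change at the second base of a
    # threonine codon, and a G-to-A change at the first base of a glycine codon.
    print(substitute("ACC", 1, "T"), substitute("GGC", 0, "A"))
```

A change at the second base of codon 20 (632C to 632T) and at the first base of codon 63 (760G to 760A) is what one would expect for T20I and G63S, respectively.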
The frequencies of different E6 and E7 sequence variation patterns among isolates collected from different parts of the world are shown in Figure 4. E6-A was the most common pattern found overall (Fig. 4a). In Asia, Americas and Europe, E6-A predominated and accounted for 86–96% of isolates. In Africa, E6-A (51%) cocirculated with E6-C (40%). E6-A was significantly less frequent in Africa compared to the other three regions (p = 0.003 for Americas, p < 0.001 for Asia and Europe); whereas E6-C was significantly more frequent in Africa than in other regions (p < 0.001 for all three regions).
The geographical distribution of E7 sequence patterns was more diverse (Fig. 4b). In Asia, both E7-B and E7-C patterns were commonly found, whereas E7-B and E7-D patterns codominated in Africa. E7-C was significantly more frequent in Asia (p < 0.001 for all three regions), whereas E7-D was significantly more frequent in Africa (p < 0.001 for all three regions).
Among the 401 samples examined, 33 were histology-confirmed invasive cervical cancer, 31 were CIN3 and 79 were normal cervices as defined by normal cytology or no lesion detected at colposcopy or biopsy. These cases were used to analyze the risk association of sequence variations. Two E7 substitutions (G41R and G63D) showed a significantly lower risk, and the other two E7 substitutions (T20I and G63S) showed a significantly higher risk based on univariate analysis (Table 2). All isolates with G41R also carried G63D, and a significant interaction was found (p = 0.023). Similarly, all isolates with T20I also carried G63S, and a significant interaction was found (p = 0.002). We therefore entered G41R and T20I into two separate final regression models, each controlling for age and country of origin. It was found that T20I and/or G63S conferred an independent increase in risk for CIN3 and invasive cervical cancer (adjusted odds ratio 4.44, 95% CI 1.51–13.05, p = 0.007). The risk association observed for G41R and/or G63D was no longer significant after multivariate analysis.
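The reported odds ratio, confidence interval and p value can be checked against one another with a short calculation. The sketch assumes the interval is a Wald interval, symmetric on the log-odds scale, which the text does not state; the function name is illustrative.

```python
import math


def wald_summary(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover log-odds coefficient, standard error, z and two-sided p
    from an odds ratio and its 95% CI, assuming a Wald interval."""
    beta = math.log(odds_ratio)
    # The CI spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p value
    return beta, se, z, p


if __name__ == "__main__":
    beta, se, z, p = wald_summary(4.44, 1.51, 13.05)
    print(f"beta={beta:.3f} se={se:.3f} z={z:.2f} p={p:.4f}")
```

Under that assumption, the p value recovered from the interval rounds to the reported 0.007, so the three figures are mutually consistent.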
Our study has delineated the sequence variations of two oncoprotein-encoding genes of HPV58. This dataset provides essential information for further in-depth studies to elucidate the epidemiology of this characteristic HPV type that exhibits a higher disease attribution in certain ethnogeographical groups. While our study has examined a large series of samples covering a wide range of geographical regions, one should note that the number of samples available from certain areas was small and might not be representative of those areas.
The degree of nucleotide sequence variation among HPV58 isolates was found to be small (maximum 2% for E6, 3% for E7), as expected for HPV in general.15,25–27 Phylogenetically related E6 and E7 sequence patterns could be identified by distinct signatures of nonsynonymous sequence variations. These molecular signatures not only provide convenient markers for large-scale epidemiological studies, but the substitutions themselves may carry biological implications.
Overexpression of E6 and E7 proteins is required to maintain the transformed phenotype of keratinocytes, and thus they are constitutively retained in cancer cells and not affected by viral genome disruption resulting from integration. Therefore, E6 and E7 are good diagnostic targets, particularly for type-specific detection as E6 and E7 genes are diverse among HPV types. Our results show that the sequences of E7 were more variable than those of E6 and more heterogeneously distributed across the world. This observation for HPV58 is in contrast to those reported for HPV16, 31, 33, 35 and 52 where the E7 ORF is less variable, and therefore E7 is often chosen as the target for diagnostic detection of HPV16.28,29 Based on the current findings, E6 rather than E7 is a better target for type-specific detection of HPV58.
The first clone of HPV58 was obtained from a Japanese patient with invasive cervical cancer.30 This strain is regarded as the prototype and commonly used for primer design. The E6-A2 and E7-A1 sequence pattern, as assigned by our study, corresponds, respectively, to the E6 and E7 sequence of the prototype. Our data showed that these “prototype-like” sequences were uncommon (10%) in Asia and rare (<3%) in other regions. Primers and probes designed based on the prototype sequence may have the problem of mismatch with the more commonly found strains.
Geographical predilection in the distribution of E6 and E7 sequence patterns was observed. E6-A predominated in most parts of the world except Africa, where it codominated with E6-C that was uncommon elsewhere. Two characteristics in the distribution of E7 sequence variation patterns were observed. First, E7-D with a nucleotide signature of 793G was a common sequence pattern found in 51% of isolates from Africa but uncommon (14%) in Americas and rare in Asia (4%) and Europe (7%). The second observation was that E7-C with an amino acid substitution signature of T20I and G63S was commonly found in Asia (34%) and Americas (11%) but rare elsewhere. Geographical variation in the distribution of HPV58 E6 and E7 sequence patterns should be considered in designing diagnostic assays.
HPV58 is rare worldwide but accounts for a higher proportion of cervical cancers in East Asia and some parts of Latin America. Our study attempted to identify HPV58 E6 and E7 sequence variations that might carry a higher oncogenic risk and examined whether the distribution of these variations could account for the ethnogeographical predilection in disease burden. Isolates carrying G63S and/or T20I were found to have an independent increase in risk for CIN3 and invasive cervical cancer. Only one variant harbored G63S alone without T20I. This variant was found in only four samples, precluding separate analysis of these two amino acid variations.
The high-risk mutations T20I and G63S were found in 33% of isolates in Asia and 10% in the Americas but were rare in Europe (3%) and Africa (0%). This distribution is reminiscent of the geographical predilection in cervical cancers associated with HPV58. A previous study from southern China (Hong Kong) also revealed that substitutions T20I and/or G63S were associated with a 6.9-fold increase in odds ratio for developing invasive cervical cancer, and that patients carrying them were significantly younger than those infected with other variants of HPV58.20 Another study from eastern China (Zhejiang) also reported a significant positive trend of association between these substitutions and the severity of cervical lesions.19
It has been shown that HPV58 variants can be divided into four lineages, a classification based mainly on the sequence variations within the long control region (LCR).15,16 Although naturally occurring LCR sequence variations have been shown to carry different promoter activity in vitro,31 two previous studies that analyzed epidemiological risk association based on the LCR-lineage classification system did not reveal any significant differences among the lineages or sublineages.16,18 This sets HPV58 apart, as an association between LCR polymorphisms and lesions has been reported for other types. Our study suggests that the nucleotide sequence variations at E7 are more useful for predicting the risk of HPV58 variants. An in vitro study to compare the transforming activities among E7 proteins carrying these amino acid substitutions is ongoing.
In conclusion, this study provides a comprehensive analysis on the sequence variation of E6 and E7 of HPV58 from a worldwide perspective. E6 is less variable and more appropriate as a target for type-specific diagnostic detection. Geographical variation in the distribution of HPV58 E6 and E7 sequence variations exists, which should be considered when designing diagnostic assays and therapeutic vaccines. Amino acid substitution(s) T20I and/or G63S at E7 is(are) associated with a higher risk for CIN3 and invasive cervical cancer. This high-risk sequence signature is more commonly found in isolates from Asia and the Americas than elsewhere, resembling the geographical predilection of cervical cancers attributed to HPV58.
For unknown reasons, human papillomavirus (HPV) 58 accounts for a significant proportion of cervical cancers in East Asia and parts of Latin America, but it is uncommon elsewhere. In this study, the authors analyzed the E6 and E7 genes from HPV58 samples from around the world. They found that patients with an HPV58 variant containing T20I and G63S substitutions in E7 had a greater risk of developing cancer. This variant was also more prevalent in Asia than in other regions, and may therefore help to explain the higher disease burden observed in this region.
The authors are grateful for the contributions of Dr. Sergio Andres Tonon and Dr. Cláudia Renata F. Martins, who unfortunately passed away before the completion of this study. Dr. Francois Coutlée's research on HPV variants is supported by the Cancer Society of Canada. Dr. Federico De Marco is partly supported by the Italian Ministry of Foreign Affairs, DGPC Uff V and by the Italian Ministry of Health. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Dr. Francois Coutlée has received financial support for research projects and honoraria for oral presentations on HPV from Roche Diagnostics and Merck. Dr. Karen Smith McCune is a member of the Scientific Advisory Board and owner of shares of stock in OncoHealth Inc., a startup company developing diagnostic assays for cervical cancer screening.
Grant sponsors: International Centre for Genetic Engineering and Biotechnology (ICGEB) (project no. CRP/CHN08-03)