HHS Public Access. Author manuscript; accepted for publication in a peer-reviewed journal.
Int J Cancer. Author manuscript; available in PMC 2014 March 15.
Published in final edited form as:
PMCID: PMC3493709

Elevated methylation of HPV16 DNA is associated with the development of high grade cervical intraepithelial neoplasia

Abstract

We explored the association of HPV16 DNA methylation with age, viral load, viral persistence, and risk of incident and prevalent high grade CIN (CIN2+) in serially collected specimens from the Guanacaste, Costa Rica cohort. 273 exfoliated cervical cell specimens (diagnostic and pre-diagnostic) were selected: 1) 92 with HPV16 DNA clearance (controls), 2) 72 with HPV16 DNA persistence (without CIN2+), and 3) 109 with CIN2+. DNA was extracted, bisulfite converted and methylation was quantified using pyrosequencing assays at 66 CpGs across the HPV genome. The Kruskal-Wallis test was used to determine significant differences among groups, and receiver operating characteristic curve analyses were used to evaluate how well methylation identified women with CIN2+. In diagnostic specimens, 88% of CpG sites had significantly higher methylation levels in CIN2+ after correction for multiple tests compared with controls. The highest AUC was 0.82 for CpG site 6457 in L1, and a diagnostic sensitivity of 91% corresponded to a specificity of 60% for CIN2+. Prospectively, 17% of CpG sites had significantly higher methylation in pre-diagnostic CIN2+ specimens (median time of 3 years before diagnosis) vs. controls. The strongest pre-diagnostic CpG site was 6367 in L1 with an AUC of 0.76. Age-stratified analyses suggested that women older than the median age of 28 years have an increased risk of precancer associated with high methylation. Higher methylation in CIN2+ cases was not explained by higher viral load. We conclude that elevated levels of HPV16 DNA methylation may be useful to predict concurrently diagnosed as well as future CIN2+.

Keywords: HPV16, methylation, epidemiology, receiver operating curve, biomarker

Introduction

Persistent infection with human papillomavirus type 16 (HPV16) leads to the majority of cervical cancers and related cervical precancer.1, 2 However, neoplastic progression is an uncommon outcome,3, 4 and identifying biomarkers that distinguish HPV infections that clear spontaneously from those that progress to cervical precancer, or that detect prevalent precancer, has been the subject of considerable effort.5

HPV16 DNA methylation has recently been shown to be associated with the risk of precancer, i.e., cervical intraepithelial neoplasia grade 2 or worse (CIN2+) (6–15). Although some of the published literature has reported elevated methylation at specific CpG sites in CIN2+,6–13 there has been considerable variation in methylation levels within outcome groups and inconsistent conclusions [reviewed in Mirabello et al.12]. The two studies that examined the most CpG sites within the HPV16 genome had limited sample sizes of fewer than 20 women and found elevated methylation associated with CIN2+.11, 13 Larger studies (>50 women) have examined only a limited number of CpG sites, mostly within the upstream regulatory region (URR); the largest study to date, of 121 women, examined 16 URR CpG sites and found decreased methylation associated with progression.14 One prospective study15 examined 6 URR CpG sites and found that higher methylation was associated with a lower risk of CIN2. These variable reports may be related to the study of different CpG sites, including different E2 binding sites, or to methodological differences.

We recently assessed HPV DNA methylation using an accurate and quantitative method in a large nested case-control and prospective epidemiological study12 in the Guanacaste, Costa Rica cohort. We found that methylation of several CpG sites within the HPV16 L1, L2, and E2-E4 ORFs was associated with infection outcome. Higher methylation at each of these CpG sites was associated with an increased risk of CIN3 compared to women who cleared their HPV16 infections. The combined effect of having increased methylation at specific CpG sites in L1, L2, and E2-E4 was associated with an odds ratio of 52 (95% CI 4.0–670) for CIN3 compared with low methylation at all three of these sites.12

Here we report additional exploration of methylation in the Guanacaste, Costa Rica cohort beyond what we described in the previous report.12 We have enlarged the sample size by using more longitudinal and serial specimens, and by adding CIN2 and invasive cancer cases and controls. In this report, we further explore the association between methylation and infection outcome, assess methylation changes over time, and evaluate the performance of HPV16 DNA methylation to distinguish women who will develop CIN2+ from women who will clear their HPV16 infections without progression. We additionally explored the effects of age and viral load on DNA methylation.

Materials and Methods

Study population

Relevant details of the Guanacaste, Costa Rica cohort have been described previously.12 In brief, the cohort is population-based and the 10,049 participants were recruited for screening and follow-up between June 1993 and December 1994 as part of a natural history study of HPV infection and cervical neoplasia of women aged 18+ years.16 The participation rate was 93.6% and loss to follow-up was <10% over seven years.17 The study protocol was reviewed and reapproved annually by National Cancer Institute and Costa Rican Institutional Review Boards.

Women were referred to colposcopy if they had abnormal cytology, abnormal direct visual examinations, or the appearance of severe cervical abnormalities on review of their Cervigrams (magnified digital images of the cervix), as previously detailed.17 HPV16 infections were identified in 503 women. Swab-derived cervical cell specimens,16 previously documented to contain HPV16 DNA, were selected from 205 women: 1) 100 with HPV16 DNA clearance (in <2 years; controls); 2) 38 with HPV16 DNA persistence (2+ years) without observed progression to CIN2+ (persistence); 3) 22 with HPV16 infection and CIN2; 4) 31 with HPV16 infection and CIN3; and, 5) 14 women with HPV16 infection and cervical cancer. Final diagnosis pathology reports and HPV genotyping were confirmed for all study participants. Controls were chosen from a set of 300 available controls to match the age distributions and proportion of HPV16 variant lineages (European, non-European) in the case outcome groups, and we analyzed the last HPV16-positive sample collected before HPV16 clearance. For the case outcome groups we analyzed specimens collected at two time points (diagnostic and pre-diagnostic). We analyzed 36 HPV16 samples collected at the last HPV16-positive screening visit (referred to as diagnostic samples) from women with persistence, and 16 HPV16 samples from the screening visit closest to diagnosis of CIN2 (median time before diagnosis = 0 months), 28 from CIN3 (median time before diagnosis = 3 months) and 13 from cancer (median time before diagnosis = 0 months). We also analyzed samples collected at the first HPV16-positive screening visit in the case groups for those with these samples available (referred to as pre-diagnostic samples): 36 with persistence (median time before last HPV16-positive visit = 72 months), 17 with CIN2 (median time before diagnosis = 27 months), 16 with CIN3 (median time before diagnosis = 45 months), and 3 with cancer (median time before diagnosis = 34 months).

Seven women had multiple HPV16 specimens collected at successive screening visits prior to CIN3 diagnosis, and for these women we additionally analyzed all serial HPV16 samples collected before diagnosis (3–7 serial samples per woman); there were 30 serial samples in total for these 7 women (16 additional serial samples plus the 14 pre-diagnostic and diagnostic samples for the 7 women described above).

DNA isolation and bisulfite conversion

Genomic DNA was extracted from cervical specimens using 200 μl of the standard transport medium (STM) with the QIAamp DNA Mini Kit (Qiagen Inc., Hilden, Germany) as recommended by the manufacturer. DNA was quantified by UV absorption, yielding in total approximately 800 ng of DNA per specimen. 250 ng of DNA was used in the bisulfite conversion reactions, in which unmethylated cytosines were converted to uracil with the EZ DNA methylation kit (Zymo Research, Irvine, CA) according to the manufacturer's instructions. Converted DNA was eluted in 25 μl Buffer EB.

HPV16 DNA methylation assay

All methylation levels in the data shown here are from new measurements in the Lorincz lab, performed blinded with respect to CIN status and to results from the Burk lab. The Lorincz methylation assays were run after the Burk assays, and the QC from the Burk lab did inform the Lorincz assay protocol, although no analysis results were shared.

Primer sets with one biotin-labeled primer were used to amplify the bisulfite converted DNA. New primers for 16 PCRs in E2-E4, L2, L1, URR, E6 and E7 covering 66 CpG positions were designed using PyroMark Assay Design software version (Qiagen). Care was taken to avoid any primer overlapping CG dyads to prevent amplification biases. To provide an internal control for complete bisulfite conversion, a non-CG cytosine in the region for pyrosequencing was included where possible. PCRs were performed using a converted DNA equivalent of 1500 cells employing the PyroMark PCR kit (Qiagen). The cell genome-equivalents of DNA calculations assumed 6.6 pg DNA per diploid cell. Briefly, 12.5 μl of PCR master mix, 2.5 μl Coral red, 1.2 to 1.5 μl of sample DNA, 1 to 2 μl of primer (7.5 pmol of each primer), and an optimized concentration of MgCl2 were combined, adjusted with water to a final reaction volume of 25 μl, and run under the following thermal cycling conditions: 95°C for 15 min; then 50 cycles of 30 sec at 94°C, 30 sec at the optimized primer-specific annealing temperature, and 30 sec at 72°C; and a final extension for 10 min at 72°C. Details on the primers, the amount of MgCl2 and the annealing temperatures used are given in Supplement Table 1. The amplified DNA was confirmed using the QIAxcel capillary electrophoresis instrument (Qiagen). 10 μl of PCR product was pyrosequenced using a PyroMark™ Q96 ID (Qiagen) instrument as previously described by Vasiljevic et al.18
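The cell-equivalent arithmetic mentioned above can be illustrated with a minimal sketch. It uses only the 6.6 pg per diploid cell figure stated in the text; the function names are hypothetical and are not part of the published protocol.

```python
# Illustrative conversion between cell genome-equivalents and DNA mass.
# The only value taken from the text is 6.6 pg of DNA per diploid cell.
PG_DNA_PER_DIPLOID_CELL = 6.6


def cells_to_dna_ng(n_cells: float) -> float:
    """Mass of human genomic DNA (ng) corresponding to n_cells diploid cells."""
    return n_cells * PG_DNA_PER_DIPLOID_CELL / 1000.0  # pg -> ng


def dna_ng_to_cells(dna_ng: float) -> float:
    """Number of diploid cell equivalents contained in dna_ng of human DNA."""
    return dna_ng * 1000.0 / PG_DNA_PER_DIPLOID_CELL


if __name__ == "__main__":
    # A 1500-cell equivalent, as used per PCR, is roughly 9.9 ng of DNA.
    print(round(cells_to_dna_ng(1500), 2))
```

Under this assumption, a 1500-cell input corresponds to about 9.9 ng of genomic DNA per reaction.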

The nucleotide positions for CpG sites in the L2 gene region were adjusted by 2 base pairs (added) in order to parallel the CpG positions reported in the previous related study12 by our team. This 2 base discrepancy is a result of using different primers and reference HPV16 DNA sequences. 58/66 CpG sites analyzed were included in the previous study.12

The final sample size was 273 specimens from 196 women after removal of specimens from 9 women that failed to amplify. Twenty quality control (QC) samples were repeat tested blinded to disease outcome from bisulfite treatment through to methylation pyrosequencing assays to assess variability at 20 CpG sites in L1, URR, E6 and E7 regions. The individual intra-class correlation coefficients (ICC) were >90% at 18 of the 20 CpG sites evaluated, one was 83% in E7 and one was 14% in the URR (data not shown).

Viral Load Determination

Quantitative real time PCR was carried out in 20 μl reactions containing 10 μl of (2×) QuantiFast SYBR Green PCR Master Mix (Qiagen, Hilden, Germany), 0.8 μM of each F and R HPV16 E6 primer19 or 10 μM of each F and R GAPDH primer19 and 1 μl of 1:20 diluted extracted DNA. Reactions were run on a Rotor-Gene Instrument (RG-6000, Corbett Research) with an initial denaturation step at 95°C for 5 min followed by 40 cycles of 95°C for 10 s and 55°C for 30 s for HPV16 E6, or 30 cycles of 95°C for 15 s, 50°C for 20 s and 72°C for 20 s for GAPDH. Data acquisition at 510 nm was performed at 55°C and at 72°C, respectively. Standard curves for HPV were obtained by amplification of a 10-fold dilution series of 10^7 to 10^1 copies of plasmid HPV16 in a fixed amount of 100 pg of human placental DNA (Sigma-Aldrich). The standard curve for GAPDH was obtained by amplification of 250, 25, and 2.5 ng and 250 and 25 pg of DNA. The amount of human DNA was then converted to number of cells by the assumption that 6.6 pg of DNA is present per diploid cell. Standard curves were generated from mean threshold cycle (CT) values of each dilution in triplicate, and mean CT values of samples in duplicate were used for quantity calculation. The amount of E6 and GAPDH genes was determined by linear interpolation of the crossing point (Cp) value using the equation of the regression line obtained from the corresponding absolute standard curves. Viral load was estimated only in controls and in CIN2 and CIN3 diagnostic specimens.
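The standard-curve interpolation described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes the usual log-linear relationship between threshold cycle and starting quantity, and it assumes that viral load is expressed as HPV16 copies per cell, which the text does not state explicitly. All function names and the example Ct values are hypothetical.

```python
import math


def fit_standard_curve(quantities, mean_cts):
    """Least-squares line Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(mean_cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, mean_cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the regression line to estimate the starting quantity."""
    return 10 ** ((ct - intercept) / slope)


def viral_load_per_cell(hpv_copies, human_dna_pg, pg_per_cell=6.6):
    """Assumed definition: HPV16 copies divided by diploid cell equivalents."""
    return hpv_copies / (human_dna_pg / pg_per_cell)


if __name__ == "__main__":
    # Hypothetical mean Ct values for the 10^7 to 10^1 copy dilution series.
    copies = [10 ** k for k in range(7, 0, -1)]
    cts = [40.0 - 3.32 * math.log10(c) for c in copies]
    slope, intercept = fit_standard_curve(copies, cts)
    estimated_copies = quantity_from_ct(26.72, slope, intercept)
    print(viral_load_per_cell(estimated_copies, human_dna_pg=6600.0))
```

The same fitting function would apply to the GAPDH curve, with DNA mass in place of copy number as the known quantity.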

Replication and extension of previously reported analysis

Previously, we reported HPV16 methylation data from a smaller but overlapping set of specimens from women with CIN3, HPV persistence, or clearance (controls).12 As a prelude to our current larger analysis, we performed masked re-testing on specimens from 93 women (33 controls, 35 women with persistence, and 25 women with CIN3); in addition, the current analysis includes previously unanalyzed specimens from 59 more controls and 6 more women with CIN3 (including 16 more pre-diagnostic specimens from women with CIN3). Also, to cover more completely the spectrum of cervical neoplasia, we have included 22 women with CIN2 and 14 with cervical cancer. This analysis, therefore, serves to both replicate and extend the previously reported work. Moreover, the two laboratories (RDB and ATL) were able to optimize further the bisulfite treatment and pyrosequencing assay for this extended analysis. The intra-laboratory reliability of our assay as performed in the Lorincz lab, taking advantage of additional optimization, was excellent, with 90% of the ICCs >90%.

In terms of resultant inter-laboratory reliability, the methylation data for the replicated individuals and CpG sites was not highly correlated between the two data sets, as would be expected during assay optimization. The correlations were highest for CpG sites located in the L1 (median = 0.57), L2 (median = 0.58), and E2 (median = 0.63) regions.

Statistical methods

The non-parametric Kruskal-Wallis test was used to determine whether the proportion of methylation at each individual CpG site was associated with HPV infection outcome, since methylation was not normally distributed in the outcome groups. To account for multiple testing, P values were corrected using the Benjamini-Hochberg method, which controls the expected proportion of false positives among the significant results at the nominal level of 0.05.20 Logistic regression models were used to obtain the odds ratio (OR) and 95% confidence intervals (CI) for CIN2+ using the controls as the referent group for each individual CpG site. For obtaining ORs for a CpG site, methylation levels were dichotomized using the second tertile (66.7 percentile point), based on the distribution for that site in the controls. Women with methylation levels above the second tertile (high methylation) were compared to women with methylation levels below the second tertile (average to low methylation). To obtain the ORs for a combination of CpG sites, a categorical variable representing the number of sites with % methylation in the top tertile (0, 1, 2, or 3) was created; we fit a logistic regression model with the categorical variable as the predictor of CIN2+ and then compared the odds of CIN2+ for women with any 1 of the 3 sites highly methylated (>2nd tertile of methylation), any 2 of the 3 sites highly methylated, and all 3 sites highly methylated versus women with average to low methylation (<2nd tertile) at all three sites. Analyses were performed using all diagnostic or pre-diagnostic specimens unless it is noted that only the specimens not in the previous study were included.12 For the age analyses, the median age was used to stratify the cases and controls to determine if there was effect modification by age. Viral load was analyzed as a continuous variable and included as a potential confounder in the logistic regression models for each individual CpG site.
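Two of the steps above, the Benjamini-Hochberg adjustment and the tertile-based odds ratio, can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' SPSS or R code. The percentile definition is whatever Python's `statistics.quantiles` uses by default, which may differ from the software used in the study, and the treatment of values exactly equal to the cut point (counted as low here) is an assumption. For a single binary predictor with no covariates, the logistic regression OR equals the cross-product ratio of the 2×2 table and the Wald interval equals the interval computed below, so no model fitting is needed in the sketch.

```python
import math
import statistics


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def second_tertile(control_values):
    """66.7th percentile of the control distribution (default quantile method)."""
    return statistics.quantiles(control_values, n=3)[1]


def odds_ratio_high_vs_low(case_values, control_values, z=1.96):
    """OR and 95% CI for methylation above the control second tertile."""
    cut = second_tertile(control_values)
    a = sum(v > cut for v in case_values)      # cases with high methylation
    b = len(case_values) - a                   # cases with average/low
    c = sum(v > cut for v in control_values)   # controls with high methylation
    d = len(control_values) - c                # controls with average/low
    if min(a, b, c, d) == 0:
        raise ValueError("empty cell: the odds ratio is not estimable")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

Adjusting for viral load, as described at the end of the paragraph, would require fitting an actual logistic regression model and is not covered by this sketch.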
Spearman rank correlations were used to investigate the associations between viral methylation levels at each CpG site and time before CIN3 diagnosis. Receiver operating characteristic (ROC) curve analyses, including calculation of area under the ROC curve (AUC), were used to evaluate the ability of methylation at individual CpG sites to separate women with CIN2+ from controls (clearance). Partial areas under the curve (pAUC) were estimated for a specificity of 40% or higher, and a correction was applied in order to have a maximal AUC of 1.0 and a non-discriminant AUC of 0.5.21 Analyses were performed using SPSS 15.0 and R software packages.
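The ROC quantities described above can be illustrated with a short sketch. The AUC is computed here through its equivalence with the Mann-Whitney statistic (the probability that a randomly chosen case has a higher methylation value than a randomly chosen control, counting ties as one half). The partial-AUC rescaling shown is the McClish standardization, which matches the stated behaviour (maximum 1.0, non-discriminant 0.5); that this is the exact correction used in the study is an assumption, and the function names are hypothetical.

```python
def auc(case_values, control_values):
    """Area under the ROC curve via pairwise case/control comparisons."""
    wins = 0.0
    for x in case_values:
        for y in control_values:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


def sensitivity_specificity(case_values, control_values, threshold):
    """Sensitivity and specificity when values >= threshold are called positive."""
    sensitivity = sum(v >= threshold for v in case_values) / len(case_values)
    specificity = sum(v < threshold for v in control_values) / len(control_values)
    return sensitivity, specificity


def standardized_partial_auc(raw_pauc, min_specificity):
    """McClish rescaling of a raw partial AUC over specificity >= min_specificity.

    raw_pauc is the unscaled area under the ROC curve over the false positive
    range 0 to (1 - min_specificity).
    """
    width = 1.0 - min_specificity
    chance_area = width ** 2 / 2.0   # area under the diagonal on that range
    perfect_area = width             # area under a perfect classifier
    return 0.5 * (1.0 + (raw_pauc - chance_area) / (perfect_area - chance_area))


if __name__ == "__main__":
    cases = [3, 5, 7, 9]       # hypothetical % methylation in cases
    controls = [1, 2, 4, 6]    # hypothetical % methylation in controls
    print(auc(cases, controls))
    print(sensitivity_specificity(cases, controls, threshold=5))
```

Because the standardized partial AUC is rescaled relative to chance, a marker that performs worse than chance over the high-specificity range yields a value below 0.5.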

Results

We analyzed methylation data from 92 controls (clearance), 93 diagnostic and 72 pre-diagnostic persistence or CIN2+ case specimens, and 16 serial specimens from 7 women who were eventually diagnosed with CIN3 (total = 273 samples from 196 women) (Table 1).

Table 1
Pre-diagnostic and diagnostic specimens and number of women in each outcome group from Guanacaste, Costa Rica between 1993 and 2001.

High methylation associated with CIN2+ diagnosis

Diagnostic specimens

94% of CpG sites (62/66, excluding 4 sites in URR) had significantly higher methylation levels in CIN2+ cases compared to controls when including only specimens not in the previous study12 after correction for multiple tests. 87.9% of CpG sites (58/66) had significantly different methylation levels among the 3 outcome groups (controls, persistence, CIN2+) when including all specimens, even after accounting for multiple tests, particularly in L1, E2-E4 and L2, with P-values of <10^-5 (Supplementary Table 2). Methylation levels were not significantly different between women with CIN2 and those with CIN3 (P >0.05 for every CpG site; Supplemental Figure 1), so these outcome groups were combined. Methylation was high in the L2 and L1 gene regions, particularly in women with cancer (Figure 1A). High methylation (top tertile of methylation) at all of the significant CpG sites was associated with an increased risk of CIN2+ (Table 2 and Supplementary Table 2). The strongest associations were in L1 (nucleotide position 6457; OR 11.5, 95% CI 4.8–27.7) and L2 (nucleotide position 4261; OR 20.8, 95% CI 6.6–65.7).

Figure 1
Median % methylation for the outcome groups at each CpG site for specimens collected at diagnosis (A.) and 27–72 months (median time) before diagnosis (B.). The legend indicates the color of each outcome group and the number of women in each group; ...
Table 2
Tests of association and measures of predictive capacity for HPV16 CpG sites exhibiting diagnostic AUC ≥0.75 and pre-diagnostic AUC ≥0.7.

There were no significant methylation differences by cytology among the controls (normal vs. ASCUS/LSIL, data not shown). When we stratified the CIN2-3 cases by co-infection status, the single HPV16 infections had slightly higher methylation levels at most CpG sites compared to the multi-HPV infections (data not shown).

Stratifying the cases and controls by the median age of 28 years showed that the CIN2-3 cases (excluding 13 cancer cases, since they were generally only in the older women) aged >28 years had higher methylation at most sites compared to those aged ≤28 years, and the controls aged >28 years had lower methylation at most sites (data not shown). A similar trend was observed at most sites when women were categorized as young, middle aged, and older (aged 18–25, 26–45, and 46+ years): in the controls, methylation tended to decrease with age, and in the cases, methylation tended to increase with age (data not shown). High methylation (top tertile of methylation) was associated with a stronger increased risk of CIN2-3 in women aged >28 years at most CpG sites (Figure 2), in particular for the CpG site at nucleotide position 6457 in L1 (>28 years, OR 17.6, 95% CI 4.2–72.6), and there was a significant interaction between methylation and age group at L1 6457 (P = 0.03).

Figure 2
Odds ratio estimates for the association between high methylation and CIN2-3 stratified by the median age of 28 years. Logistic regression models were used to obtain the odds ratios for CIN2-3, using the controls as the referent group, for methylation ...

Pre-diagnostic specimens

23% of CpG sites (15/66) had significantly higher methylation levels in CIN2+ cases compared to controls when including only specimens not in the previous study12 before correction for multiple tests; 3 CpG sites (2 in L2 and 1 in L1) remained significant after correction. Eleven CpG sites (16.7%) had significantly different methylation levels among the 3 outcome groups after accounting for multiple tests when including all specimens: 1 site in URR, 1 in E2-E4, 3 in L2, and 6 in L1 (Table 2 and data not shown). Methylation levels were higher in CIN2, CIN3, and cancer case groups primarily in the L2 and L1 gene regions (Figure 1B). High methylation (top tertile of methylation) at each of these 11 significant CpG sites was associated with an increased risk of CIN2+ (ORs ranged from 3.3 to 9.3) (Table 2 and data not shown), with the strongest risk estimate for position 4261 in L2 of 9.3 (95% CI 2.3–45.1).

After stratifying by the median age of 28 years, the CIN2+ cases aged >28 years had higher methylation at most sites compared to those aged ≤28 years, and the controls aged >28 years had lower methylation at most sites (data not shown), as observed for the diagnostic specimens.

Serial CIN3 methylation increases over time

Seven women had samples collected at time points 0 to 7 years before CIN3 diagnosis in addition to the pre-diagnostic and diagnostic samples (30 samples total). There was a trend of increasing levels of methylation in samples collected with less time to diagnosis that was significant at 10 CpG sites in L2, 8 sites in L1, 2 sites in E6, and 1 site in E2 (Spearman correlation coefficients of −0.38 to −0.56), and weak inverse associations at most other CpG sites (Figure 3). Some CpG positions in the URR showed the opposite trend (Figure 3); however, none were significant.
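To make the sign of these coefficients concrete, the following sketch computes a Spearman rank correlation between time before diagnosis and % methylation. It assumes time is coded as years before diagnosis, so a negative coefficient means that methylation is higher in samples collected closer to diagnosis. The data and function names are hypothetical, and the sketch ignores the fact that the 30 samples are repeated measures from 7 women.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


if __name__ == "__main__":
    years_before_diagnosis = [7, 5, 3, 1, 0]   # hypothetical serial samples
    percent_methylation = [5, 10, 20, 40, 60]  # hypothetical values at one CpG
    print(spearman(years_before_diagnosis, percent_methylation))
```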

Figure 3
Mean % methylation by CpG site for 30 serial samples from 7 women collected 0–7 years prior to diagnosis of CIN3. The legend indicates the color of each time interval. The x-axis indicates each individual CpG site by nucleotide position grouped ...

Viral load correlations with methylation

There were no significant correlations between individual CpG site methylation and viral load in the controls, CIN2, or CIN3 diagnostic specimens after correction for multiple tests (data not shown). Viral load did not substantially attenuate the risk estimates for the association between high methylation and CIN2-3 for each CpG site when adjusted for in the model (Supplemental Figure 2). The risk estimate for the positive association seen between viral load and CIN2-3 was not significantly affected by adding each CpG methylation to the model (data not shown).

Methylation distinguishes women with precancer

ROC analyses were used to assess the ability of methylation to distinguish women with CIN2+ from controls at each CpG site and for a combination of sites. ROC curves were generated showing the percent sensitivity versus 100 minus the percent specificity.

Diagnostic specimens

The AUCs ranged from 0.52 to 0.82, and the pAUCs for a specificity of 40% or higher ranged from 0.45 to 0.79 (Supplementary Table 2). A majority of CpG sites in L1 and L2, but only one each in E6 and E7 and 2 sites in E2-E4 gave good separation (AUC ≥0.75, Table 2). The highest AUC was for a CpG site in L1 at nucleotide position 6457 (AUC 0.82, 95% CI 0.75–0.89; pAUC 0.79); the ROC curve for this site is shown in Figure 4A and at a sensitivity of 91.1% the corresponding specificity for CIN2+ was 60.2%. For women aged >28 years, the AUC at nucleotide position 6457 in L1 was 0.89 (95% CI 0.81–0.97), and a sensitivity of 90.0% corresponded to a specificity of 75.6% for CIN2+. The AUCs for CIN2-3 in women aged >28 years and ≤28 years at nucleotide position 6457 in L1 were not statistically significantly different (AUC 0.85, 95% CI 0.75–0.96, vs. 0.73, 95% CI 0.61–0.86; P >0.05).

Figure 4
Receiver operating characteristic (ROC) curves for the CpG site at nucleotide position 6457 (A.) in L1 using diagnostic specimens and at 6367 (B.) in L1 using pre-diagnostic specimens. These are the strongest diagnostic and pre-diagnostic CpG sites with ...

There is a large amount of correlation among methylation levels at CpG sites within the L1 and L2 gene regions (Supplemental Figure 3). Therefore, we chose as an a posteriori exploratory analysis to combine the effects of 3 sites that were weakly correlated (correlation coefficient, r <0.6) and had the highest AUCs and strongest ORs. For diagnostic specimens, we analyzed the following combination of 3 CpG sites: position 4261 in L2, position 6457 in L1, and position 790 in E7. The combined OR was 216 (95% CI 21 to >999) for women with high methylation at all 3 CpG sites compared to women with low methylation at all 3 sites. For women with high methylation at any 1 or any 2 of the 3 CpG sites, the OR for CIN2+ was 16.5 (95% CI 2–139) and 33.2 (95% CI 4–278), respectively. The AUC for the ROC curve for this 3 CpG site combination was 0.82 (95% CI 0.75–0.90). Combining high methylation at the same L2 (position 4261) and L1 (position 6457) CpG sites with high methylation at a site in E2-E4 (position 3436) or in E6 (position 218) gave similar results (data not shown), with AUCs of 0.83 (95% CI 0.76–0.91) and 0.82 (95% CI 0.75–0.90), respectively.
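The categorical variable behind these combined ORs (the number of the 3 sites with methylation in the top tertile) can be sketched as follows. The site labels, cutoffs and values are hypothetical. With this categorical variable as the only predictor, the logistic regression OR for each category versus the reference category of 0 high sites equals the cross-product ratio computed below; confidence intervals are omitted.

```python
from collections import Counter

# Hypothetical labels for the three CpG positions named in the text.
SITES = ("L2_4261", "L1_6457", "E7_790")


def count_high_sites(methylation, cutoffs):
    """Number of sites (0 to 3) whose % methylation exceeds its control-based cutoff."""
    return sum(methylation[site] > cutoffs[site] for site in SITES)


def tabulate(samples, cutoffs):
    """Map the number of highly methylated sites to the number of women."""
    return Counter(count_high_sites(sample, cutoffs) for sample in samples)


def odds_ratio_vs_reference(case_counts, control_counts, k):
    """OR for k highly methylated sites versus none (all cells must be non-zero)."""
    return (case_counts[k] * control_counts[0]) / (case_counts[0] * control_counts[k])


if __name__ == "__main__":
    cutoffs = {"L2_4261": 20.0, "L1_6457": 15.0, "E7_790": 2.0}
    woman = {"L2_4261": 30.0, "L1_6457": 10.0, "E7_790": 5.0}
    print(count_high_sites(woman, cutoffs))
```

Very wide intervals such as the one reported for all 3 sites are expected when one of the cells in this table contains very few women.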

Pre-diagnostic specimens

AUCs ranged from 0.4 to 0.76, with all CpG sites in the E2-E4, E6, E7, and URR gene regions giving AUCs <0.7 (data not shown). Four CpG sites had AUCs of ≥0.7: 2 in L2 at nucleotide positions 5173 (AUC 0.73, 95% CI 0.62–0.84; pAUC 0.75) and 5128 (AUC 0.71, 95% CI 0.59–0.82; pAUC 0.71), and 2 in L1 at nucleotide positions 6367 (AUC 0.76, 95% CI 0.66–0.86; pAUC 0.75) and 5927 (AUC 0.70, 95% CI 0.59–0.82; pAUC 0.77) (Table 2). The ROC curve for L1 6367 is shown in Figure 4B; at a sensitivity of 90.0% the corresponding specificity for CIN2+ was 33.8%. CpG sites at positions L2 5173, L1 5927, and L1 6367 were strong diagnostic and pre-diagnostic sites, with both specimen types having ORs >3 and AUCs >0.7 for developing CIN2+ (Table 2). For women aged >28 years, the AUC at nucleotide position 6367 in L1 was 0.84 (95% CI 0.71–0.96), and a sensitivity of 93.8% corresponded to a specificity of 42.5% for CIN2+.

Discussion

The molecular determinants of CIN2+ development in women infected with oncogenic HPV types are unknown; we need effective markers that could retain high sensitivity and increase the specificity of DNA-based HPV assays to distinguish the very common benign HPV16 infections from rare malignant HPV16 infections. Our study shows that accurate quantitation of HPV16 DNA methylation, with classification of women by methylation levels, has good sensitivity and specificity (AUC >0.8, sensitivity 91%, specificity 60%) for detecting CIN2+ in HPV16 screen-positive women and for distinguishing CIN2+ from HPV16 infections that clear. In older women, the diagnostic performance of DNA methylation triage appears to increase. We provide a list of the stronger CpG biomarker sites on which, if confirmed, a diagnostic and/or a prognostic assay could possibly be based (Table 2).

Using pre-diagnostic specimens, CpG sites in L2 and L1 had the strongest associations with outcome. The one other longitudinal study, which examined 6 CpG sites in the URR, found the opposite trend: high methylation was associated with a lower likelihood of CIN2+.15 Using diagnostic specimens, numerous sites in L1, L2, E2-E4, E6, and E7 had significantly higher methylation levels in women with CIN2+ compared to controls. The published literature has been recently reviewed,12 and overall other cross-sectional studies have also observed high HPV16 DNA methylation in the L1 (6, 8, 10, 11, 13) and L2 (11, 13) gene regions associated with advanced disease. Replication of select samples and CpG sites indicates that methylation levels in the URR may be unreliable, likely due to the very low methylation levels in this region, and this may account for some of the inconsistencies in the published literature.7–10, 13–15, 22

A previous report on HPV16 methylation12 included information from a subset of the women analyzed in the current report, including 25 women with CIN3. Unlike our previous report, this report includes women with CIN2 and cancer, as well as serial samples for select women with CIN3; we also now include information from 59 additional controls and 6 additional women with CIN3. Our measures of methylation are all from a different lab: extraction, bisulfite conversion, primer design, and methylation assays for the earlier report were performed in the Burk lab in the US, and were then performed independently for the current report in a different laboratory in the UK. The 2 studies ran successively. To avoid potential bias, the laboratory results of the previous study did not inform the current analyses.

Our independent replication corroborates the finding that women with CIN3 have higher methylation in L1, L2, and E2-E4 gene regions;12 additionally, we detected strong associations between high methylation and CIN2+ at more CpG sites in all gene regions. In comparison, 10/11 CpG sites that were significantly associated with disease outcome (clearance, persistence, CIN3) in the previous study12 were included in this study and were also significantly associated with outcome (clearance, persistence, CIN2+; noted in Table 2 and Supplementary Table 2). Our risk estimates were similar or stronger at all sites, except that L1 7136 showed an increased risk of CIN2+ in the current study and a decreased risk of CIN3 previously.12 The strongest site in the previous study,12 L1 5378, was not included in the current study, and similarly the 2 strongest sites in the current study, L1 6457 and L1 6367, were not tested in the earlier study; however, the combined data highlight the importance of the L1 gene region.

We have also extended the analyses to an evaluation of age and HPV viral load and show that viral load is not a significant confounding factor and that age may have modifying effects on the DNA methylation results. Our findings expand the data from the previous study12 by utilizing a larger sample size and cervical disease spectrum to evaluate the utility of viral methylation as a biomarker, as well as serially collected specimens on the same women. In particular, we have shown that methylation at specific CpG sites can distinguish women who will be diagnosed with CIN2+ several years in the future.

In cervical cancer screening, the most important application for novel biomarkers5 is to triage women with positive cytology and/or HPV tests. For example, p16 immunocytochemical staining is a promising biomarker for this triage.23–25 The performance of our HPV16 methylation assay to detect CIN2+ (sensitivity of 91.1% and specificity of 60.2% for CpG L1 6457) was comparable to p16 [sensitivity of 92.6% and specificity of 63.2%],25 although a comparison in similar populations is currently lacking. Since HPV infections are the necessary cause of almost all cervical cancers, detection of HPV DNA has a high diagnostic sensitivity. However, the majority of infections clear, resulting in poor specificity of HPV DNA tests.5 Therefore, as a triage among HPV positive women, a biomarker that retains a relatively high sensitivity and has a reasonable triage specificity would be most valuable. Based on data from the ASCUS-LSIL Triage Study (ALTS) and other similar studies, there has been a general acceptance of HPV tests to triage ASCUS with characteristics of >90% sensitivity and >45% specificity.26, 27 Our data demonstrate that HPV genomic methylation could be a triage marker after primary HPV testing that would not need a morphological interpretation as required for current p16 and other similar biomarkers. A suitable DNA methylation assay could in theory detect most of the potentially transforming HPV infections while reducing the numbers of women who may be unnecessarily referred to colposcopy or to a “see-and-treat” approach.5, 28 Clearly this concept needs to be extended in a joint test to the other important HPV types (18, 31, 45, etc.) and to much larger studies in diverse settings for it to be considered a realistic contender as a triage for HPV DNA screening positives.

Most CpG sites in the HPV16 genome appear to be methylated in a coordinated fashion and this appears to be the reason why the diagnostic performance of HPV16 DNA methylation was not significantly improved by combining the strongest CpG sites. Despite a very strong association with methylation, illustrated by the high odds ratios, the AUCs did not improve much for any of the combinations of the 3 strongest CpG sites compared to the best single site (0.83 compared to 0.82), even though the sites were chosen so the methylation levels of sites in the set were not highly correlated.
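The limited gain from combining sites can be sketched in Python. The example computes the AUC as the proportion of case-control pairs in which the case has the higher value (ties count one half), then compares a single site with a simple average of two sites. The averaging rule and all values are hypothetical assumptions for illustration. They are not the combination method or data of this study, and the toy sites are deliberately made to move together.

```python
from itertools import product
from typing import List, Sequence


def auc(cases: Sequence[float], controls: Sequence[float]) -> float:
    """AUC as the fraction of (case, control) pairs where the case
    value is higher; tied pairs contribute 0.5."""
    wins = 0.0
    for case_value, control_value in product(cases, controls):
        if case_value > control_value:
            wins += 1.0
        elif case_value == control_value:
            wins += 0.5
    return wins / (len(cases) * len(controls))


def average_sites(*sites: Sequence[float]) -> List[float]:
    """Per-specimen mean across several CpG sites (assumed combiner)."""
    return [sum(values) / len(values) for values in zip(*sites)]


if __name__ == "__main__":
    # Hypothetical percent methylation at two coordinated CpG sites.
    cases_site_a, cases_site_b = [10, 20, 30], [12, 22, 28]
    controls_site_a, controls_site_b = [5, 15, 8], [6, 14, 9]

    single = auc(cases_site_a, controls_site_a)
    combined = auc(average_sites(cases_site_a, cases_site_b),
                   average_sites(controls_site_a, controls_site_b))
    print(single, combined)
```

When two sites rank the specimens in nearly the same order, their average ranks them the same way, so the AUC does not change. This is one simple way to see why coordinated methylation would leave little room for improvement from combinations.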

Since our pre-diagnostic samples were collected approximately 3 years prior to diagnosis of CIN2+, as a biomarker they could potentially provide a period of prognostic risk stratification. Methylation of these pre-diagnostic specimens, particularly at CpG sites within L1 and L2, showed good performance. The 4 CpG sites (2 in L1 and 2 in L2) that had AUCs >0.7 in pre-diagnostic specimens also had AUCs >0.7 in diagnostic specimens. Because the AUCs showed little or no additivity, the best HPV DNA methylation assay may rely on an accurate measurement of just one biologically robust CpG site or, perhaps, on a simple average of a few sites per HPV type in a cocktail. This approach would minimize costs and may provide for a sufficiently accurate and robust assay selected from several possible CpGs in either the L1, the L2, or the E2 ORFs.

The serial samples collected 0‒7 years before CIN3 diagnosis suggest that methylation at most regions of the HPV16 genome increases over time until the time of diagnosis, particularly in the L2 and L1 gene regions. This pattern is also reflected in the pre-diagnostic (the first HPV16 positive sample) and diagnostic samples for women with CIN2+, which show consistently higher methylation levels in the diagnostic samples. However, more extensive longitudinal analyses are needed to further evaluate whether changes in methylation predict detection of CIN2+ later.

In conclusion, we have shown that HPV16 DNA methylation may be a useful diagnostic biomarker for CIN2+ to triage HPV DNA positives and may also be a reasonable prognostic marker. Viral methylation appears to occur years before detection of CIN2+ in this cohort and may provide high sensitivity and reasonable specificity for the development of CIN2+. Additional follow-up studies are needed to extend our pre-diagnostic, age modification, and serial sample findings in cohorts of women in different settings and with more in-depth longitudinal data.

Brief description of the novelty and impact of the work

We have shown that HPV16 DNA methylation, with classification of women by methylation levels, may be a useful diagnostic biomarker for CIN2+ to triage HPV DNA positive women and may also be a reasonable prognostic marker. We show that viral methylation occurs years before detection of CIN2+ and has the potential to provide high sensitivity and reasonable specificity for the development of CIN2+. In older women the diagnostic performance of DNA methylation triage is increased. Follow-up studies are needed to extend our findings to other carcinogenic HPV types to validate the utility of methylation as a diagnostic triage option.

Supplementary Material

Supplementary Tables S1-S2 and Figures S1-S3


We are grateful to the women who participated in this study and to the Guanacaste Project staff who so carefully collected the samples over the years. This work was supported in part by the Intramural Research Program of the NIH, National Cancer Institute, Division of Cancer Epidemiology and Genetics, and a grant from Cancer Research UK (C569/A10404).


HPV16: human papillomavirus type 16
CIN: cervical intraepithelial neoplasia
CIN2+: CIN grade 2 or worse
URR: upstream regulatory region
ICC: intra-class correlation coefficients
OR: odds ratio
CI: confidence intervals
ROC: receiver operating characteristic
AUC: area under the ROC curve


1. Smith J, Lindsay L, Hoots B, Keys J, Franceschi S, Winer R, Clifford G. Human papillomavirus type distribution in invasive cervical cancer and high-grade cervical lesions: a meta-analysis update. Int J Cancer. 2007;121:621–32.
2. Wheeler C, Hunt W, Joste N, Key C, Quint W, Castle P. Human papillomavirus genotype distributions: implications for vaccination and cancer screening in the United States. J Natl Cancer Inst. 2009;101:475–87.
3. Ho G, Bierman R, Beardsley L, Chang C, Burk R. Natural history of cervicovaginal papillomavirus infection in young women. N Engl J Med. 1998;338:423–8.
4. McCredie M, Sharples K, Paul C, Baranyai J, Medley G, Jones R, Skegg D. Natural history of cervical neoplasia and risk of invasive cancer in women with cervical intraepithelial neoplasia 3: a retrospective cohort study. Lancet Oncol. 2008;9:425–34.
5. Schiffman M, Wentzensen N, Wacholder S, Kinney W, Gage J, Castle P. Human papillomavirus testing in the prevention of cervical cancer. J Natl Cancer Inst. 2011;103:368–83.
6. Kalantari M, Garcia-Carranca A, Morales-Vazquez CD, Zuna R, Montiel DP, Calleja-Macias IE, Johansson B, Andersson S, Bernard H-U. Laser capture microdissection of cervical human papillomavirus infections: Copy number of the virus in cancerous and normal tissue and heterogeneous DNA methylation. Virology. 2009;390:261–67.
7. Ding D, Chiang M, Lai H, Hsiung C, Hsieh C, Chu T. Methylation of the long control region of HPV16 is related to the severity of cervical neoplasia. Eur J Obstet Gynecol Reprod Biol. 2009;147:215–20.
8. Kalantari M, Calleja-Macias IE, Tewari D, Hagmar B, Lie K, Barrera-Saldana H, Wiley D, Bernard H-U. Conserved methylation patterns of human papillomavirus type 16 DNA in asymptomatic infection and cervical neoplasia. J Virol. 2004;78:12762–72.
9. Bhattacharjee B, Sengupta S. CpG methylation of HPV 16 LCR at E2 binding site proximal to P97 is associated with cervical cancer in presence of intact E2. Virology. 2006;354:280–5.
10. Sun C, Reimers L, Burk R. Methylation of HPV16 genome CpG sites is associated with cervix precancer and cancer. Gynecol Oncol. 2011;121:59–63.
11. Brandsma J, Sun Y, Lizardi PM, Tuck DP, Zelterman D, Haines GK, III, Martel M, Harigopal M, Schofield K, Neapolitano M. Distinct human papillomavirus type 16 methylomes in cervical cells at different stages of premalignancy. Virology. 2009;389:100–07.
12. Mirabello L, Sun C, Ghosh A, Rodriguez A, Schiffman M, Wentzensen N, Hildesheim A, Herrero R, Wacholder S, Lorincz A, Burk R. Methylation of Human Papillomavirus Type 16 Genome and Risk of Cervical Precancer in a Costa Rican Population. J Natl Cancer Inst. 2012;104:556–65.
13. Fernandez A, Rosales C, Lopez-Nieva P, Graña O, Ballestar E, Ropero S, Espada J, Melo S, Lujambio A, Fraga M, Pino I, Javierre B, et al. The dynamic DNA methylomes of double-stranded DNA viruses associated with human cancer. Genome Res. 2009;19:438–51.
14. Hublarova P, Hrstka R, Rotterova P, Rotter L, Coupkova M, Badal V, Nenutil R, Vojtesek B. Prediction of human papillomavirus 16 E6 gene expression and cervical intraepithelial neoplasia progression by methylation status. Int J Gynecol Cancer. 2009;19:321–5.
15. Piyathilake C, Macaluso M, Alvarez R, Chen M, Badiga S, Edberg J, Partridge E, Johanning G. A higher degree of methylation of the HPV 16 E6 gene is associated with a lower likelihood of being diagnosed with cervical intraepithelial neoplasia. Cancer. 2011;117:957–63.
16. Herrero R, Schiffman M, Bratti C, Hildesheim A, Balmaceda I, Sherman M, Greenberg M, Cárdenas F, Gómez V, Helgesen K, Morales J, Hutchinson M, et al. Design and methods of a population-based natural history study of cervical neoplasia in a rural province of Costa Rica: the Guanacaste Project. Rev Panam Salud Publica. 1997;1:362–75.
17. Bratti M, Rodríguez A, Schiffman M, Hildesheim A, Morales J, Alfaro M, Guillén D, Hutchinson M, Sherman M, Eklund C, Schussler J, Buckland J, et al. Description of a seven-year prospective study of human papillomavirus infection and cervical neoplasia among 10000 women in Guanacaste, Costa Rica. Rev Panam Salud Publica. 2004;15:75–89.
18. Vasiljević N, Wu K, Brentnall A, Kim D, Thorat M, Kudahetti S, Mao X, Xue L, Yu Y, Shaw G, Beltran L, Lu Y, et al. Absolute quantitation of DNA methylation of 28 candidate genes in prostate cancer using pyrosequencing. Dis Markers. 2011;30:151–61.
19. Cricca M, Morselli-Labate AM, Venturoli S, Ambretti S, Gentilomi GA, Gallinella G, Costa S, Musiani M, Zerbini M. Viral DNA load, physical status and E2/E6 ratio as markers to grade HPV16 positive women for high-grade cervical lesions. Gynecol Oncol. 2007;106:549–57.
20. Benjamini Y, Yekutieli D. The control of the false discovery rate in multiple testing under dependency. Annals of Statistics. 2001;29:1165–88.
21. McClish D. Analyzing a Portion of the ROC Curve. Medical Decision Making. 1989;9:190–95.
22. Badal V, Chuang L, Tan E, et al. CpG methylation of human papillomavirus type 16 DNA in cervical cancer cell lines and in clinical specimens: genomic hypomethylation correlates with carcinogenic progression. J Virol. 2003;77:6227–34.
23. Wentzensen N, Bergeron C, Cas F, Vinokurova S, von Knebel Doeberitz M. Triage of women with ASCUS and LSIL cytology: use of qualitative assessment of p16INK4a positive cells to identify patients with high-grade cervical intraepithelial neoplasia. Cancer. 2007;111:58–66.
24. Tsoumpou I, Arbyn M, Kyrgiou M, Wentzensen N, Koliopoulos G, Martin-Hirsch P, Malamou-Mitsi V, Paraskevaidis E. p16(INK4a) immunostaining in cytological and histological specimens from the uterine cervix: a systematic review and meta-analysis. Cancer Treat Rev. 2009;35:210–20.
25. Denton K, Bergeron C, Klement P, Trunk M, Keller T, Ridder R, European CINtec Cytology Study Group. The sensitivity and specificity of p16(INK4a) cytology vs HPV testing for detecting high-grade cervical disease in the triage of ASC-US and LSIL pap cytology results. Am J Clin Pathol. 2010;134:12–21.
26. ASCUS-LSIL Triage Study (ALTS) Group. Results of a randomized trial on the management of cytology interpretations of atypical squamous cells of undetermined significance. Am J Obstet Gynecol. 2003;188:1383–92.
27. Arbyn M, Sasieni P, Meijer C, Clavel C, Koliopoulos G, Dillner J. Chapter 9: Clinical applications of HPV testing: a summary of meta-analyses. Vaccine. 2006;24:78–89.
28. Wright J, Rader J, Davila R, Powell M, Mutch D, Gao F, Gibb R. Human papillomavirus triage for young women with atypical squamous cells of undetermined significance. Obstet Gynecol. 2006;107:822–9.