Given that the integration of human papillomavirus type 16 (HPV16) into the host genome occurs preferentially with the disruption of the E2 gene, a ratio of E2 to E7 gene copies is often used as a marker for integration. It is largely undetermined, however, whether ratio estimates are affected by HPV intratypic variations. We assembled four plasmid constructs, each containing a DNA fragment from an HPV16 European, Asian-American, African-1, or African-2 variant. These constructs and nine cervical swab samples were assayed by real-time PCR with two primer-probe sets for each gene: a specific set, fully complementary to the HPV16 prototype, and a degenerate set, incorporating degenerate bases at positions where nucleotides differed among the variants. The ratio of E2 to E7 gene copies for the European variant construct was close to 1, no matter which sets of primers and probes were used. While the ratios for the African-1 and Asian-American variant constructs remained close to 1 with the degenerate sets of primers and probes, the ratios were 0.36 and 2.57, respectively, with the specific sets of primers and probes. In addition, a nucleotide alteration at the position immediately following the 3′ end of the E2 forward primer binding site was found to be responsible for an underestimation of E2 gene copies for the African-2 variant construct. Similar patterns were found in nine cervical samples. In conclusion, mismatches between the primers and probes and their targets due to HPV16 intratypic variations would introduce errors in testing for integration; this situation can be sufficiently ameliorated by incorporating degenerate bases into the primers and probes.
Infection with oncogenic human papillomavirus (HPV) is necessary but insufficient to cause invasive cervical cancer, with additional virus-host interactions needed to lead to the malignant phenotype (7, 20, 30). The integration of a high-risk HPV genome into the host chromosome is thought to be a key event in cervical carcinogenesis (23), as it may result in increased stability of HPV E6/E7 mRNA (16), augmented viral immortalization capacity (25), a high level of chromosomal instability (24), and specific alterations of cellular gene expression (1). Evidence from a study in vitro (15) indicates that cells with integrated, compared to episomal, HPV type 16 (HPV16) DNA have a selective growth advantage. Because the integration of the HPV genome occurs preferentially with the disruption of the coding sequence for the E2 protein, a ratio of E2 to E7 (or E6) gene copy numbers determined by real-time PCR has been commonly used as a marker for integration in epidemiological and clinical investigations (2, 3, 11, 13, 14, 21, 22, 26, 28, 34). Theoretically, the ratio ranges from 0 (completely integrated) to 1 (episomal only) if the integration occurs within the E2 target region; values between 0 and 1 reflect a mixed status.
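The theoretical range described above can be illustrated with a minimal Python sketch. It assumes that every viral genome contributes one E7 copy and that only genomes with an intact E2 target contribute one E2 copy, so the integrated fraction is one minus the E2-to-E7 ratio. The function name and the clamping of out-of-range ratios are illustrative choices, not part of the assay described here.

```python
def integrated_fraction(e2_copies: float, e7_copies: float) -> float:
    """Estimate the fraction of genomes whose E2 target is disrupted.

    Assumption: each genome carries one E7 target, and only genomes with an
    intact E2 target region contribute one E2 copy.
    """
    if e7_copies <= 0:
        raise ValueError("E7 copy number must be positive")
    ratio = e2_copies / e7_copies
    # Measurement error can push the ratio outside [0, 1]; clamp it.
    ratio = min(max(ratio, 0.0), 1.0)
    return 1.0 - ratio


print(integrated_fraction(10_000, 10_000))  # episomal only -> 0.0
print(integrated_fraction(0, 10_000))       # completely integrated -> 1.0
print(integrated_fraction(5_000, 10_000))   # mixed status -> 0.5
```

A ratio above 1 has no meaning under this model, which is one reason such values point to a measurement problem rather than to a physical state of the virus.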
Previous studies of HPV DNA integration have focused mainly on HPV16, the type that confers the highest risk of cervical cancer and also the type most commonly detected in women with normal cervical cytology (5). The reported rates of HPV16 integration (including cases of mixed integrated and episomal forms) vary from 8 to 100% in women without detectable cervical lesions and from 36 to 100% in those with high-grade cervical intraepithelial neoplasia (2, 3, 6, 9-11, 13, 14, 18, 29). Such a wide range may be explained in part by differences in study populations and/or issues in integration detection. Sequence analyses of various regions of the HPV16 genome have demonstrated a variety of naturally occurring variants. These variants are phylogenetically classified and named according to their geographic relatedness, as follows: European (E), Asian (As), Asian-American (AA), African-1 (Af1), and African-2 (Af2) (8, 33). In view of the fact that nucleotide alterations are common in the regions targeted by many previously reported primers and probes, the findings from these studies may be affected by the intragenomic diversity of the virus.
To address this issue, we assembled a set of plasmid constructs, each containing a DNA fragment derived from the HPV16 E, AA, Af1, or Af2 variant. These constructs and a set of cervical swab samples were assayed by real-time PCR to examine whether, and to what extent, testing for integration by using ratios of E2 to E7 gene copies would be affected by HPV16 intratypic variations.
PCR products from the HPV16 genome were generated from three cervical swab samples known to be positive for Af1, Af2, and AA variants. The amplification of the entire E2 and E7 regions was carried out using a LongRange PCR kit (Qiagen, Valencia, CA) with a pair of primers (forward primer [nucleotide positions 419 to 443], 5′-CGGAATTCTGTCAAAAGCCACTGTGTCCTGAAG, and reverse primer [nucleotide positions 3891 to 3866], 5′-CGGGATCCGCACGCCAGTAATGTTGTGGATGTA) which contained EcoRI and BamHI sites, respectively, at their 5′ termini. PCR products were digested with EcoRI and BamHI (New England Biolabs, Beverly, MA), purified with a QIAquick PCR purification kit (Qiagen, Valencia, CA), and ligated into pUC19 by using a quick-ligation kit (New England Biolabs, Beverly, MA). The Escherichia coli strain TOP10 (Invitrogen, Carlsbad, CA) was transformed with the ligation products.
Clones containing target inserts were identified by digestion with appropriate restriction enzymes, followed by electrophoresis. Plasmid DNA was purified with a QIAWell-8 plasmid kit (Qiagen, Valencia, CA). The purified DNA templates were sequenced using a BigDye Terminator cycle sequencing kit (Applied Biosystems, Foster City, CA). The pBR322 plasmid with the full length of the HPV16 prototype sequence (i.e., the E variant) was kindly provided by Denise Galloway (Fred Hutchinson Cancer Research Center, Seattle, WA).
Nine HPV16-positive cervical swab samples from a screening population (17) were selected for the determination of ratios of E2 to E7 gene copies. Of these samples, two were from women with normal cytology, five were from women with atypical squamous cells of undetermined significance (ASC-US), and two were from women with low-grade squamous intraepithelial lesions (LSIL). HPV16 variants in these samples, including six samples with E variants, one with an Af1 variant, one with an Af2 variant, and one with an AA variant, were previously characterized by sequencing part of the long control region and the entire E6 region (unpublished data).
Listed in Table 1 are primers and probes for real-time PCR which were designed using Primer Express 3.0 (Applied Biosystems, Foster City, CA). Because the purpose of the present study was to examine the impact of HPV16 intratypic sequence variations on the detection of viral DNA integration rather than to assess the impact of the number and locations of the mismatches, the primers and probes were designed according to sequence variations found in the natural variants. The type-specific primers and probes were fully complementary to HPV16 prototype sequences, whereas the degenerate primers and probes incorporated degenerate bases at positions where the nucleotides differ among the variants. We additionally designed three modified degenerate E2 forward primers for the Af2 variant construct.
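The idea of a degenerate primer can be sketched in Python as follows. Given aligned copies of one binding site from several variants, each column is collapsed to the IUPAC code covering all observed bases. The sequences in the example are hypothetical placeholders, not the binding sites used in this study, and the function name is an illustrative assumption.

```python
# IUPAC nucleotide codes keyed by the set of bases observed at one position.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def degenerate_consensus(aligned_sites):
    """Collapse aligned, equal-length sequences into one degenerate sequence."""
    if len({len(s) for s in aligned_sites}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    columns = zip(*(s.upper() for s in aligned_sites))
    return "".join(IUPAC[frozenset(column)] for column in columns)


# Hypothetical binding-site sequences (placeholders, not real HPV16 data).
sites = ["ACGTTGCA", "ACATTGCA", "ACGTTGTA"]
print(degenerate_consensus(sites))  # ACRTTGYA
```

Each degenerate position multiplies the number of distinct oligonucleotides in the synthesized pool, so in practice degeneracy is kept to the positions that actually vary.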
Real-time PCR was performed on the ABI Prism 7900HT sequence detection system (Applied Biosystems, Foster City, CA). Briefly, the assay was set up in a 15-μl reaction volume containing 1× TaqMan universal PCR master mix (Applied Biosystems, Foster City, CA), 0.1 μM (each) primers and probe, and 1 μl of the DNA template. PCR amplification was carried out with a cycling program of holding at 50°C for 2 min and then at 95°C for 10 min, followed by a two-step cycle of 95°C for 15 s and 60°C for 1 min for a total of 40 cycles.
The log-phase five-point standard curves for a known number of HPV16 prototype copies (from $10^{2}$ to $10^{6}$ in a 10-fold dilution series) were included in each set of assays. A separate standard curve was generated with each set of primers and probe. The PCR run efficiency, which measures amplification rates, was calculated using the following formula (12):
$$E = 10^{-1/S} - 1$$
where E is the run efficiency and S is the slope of the standard curve. To minimize assay-to-assay variation, all reagents except for the PCR master mix were freshly made from the same stock solution. The numbers of E2 and E7 gene copies were measured in separate reactions but on the same plate to avoid potential competition and minimize variation in sample loading. Each sample was assayed in triplicate, and a mean value of three measurements was used for analysis.
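The standard-curve calculation can be sketched in Python as follows. The sketch assumes the commonly used relation E = 10^(-1/S) - 1 for the run efficiency and a straight-line fit of threshold cycle against log10 copy number. The function names and the threshold-cycle values are hypothetical and serve only as an illustration.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def run_efficiency(slope):
    """Run efficiency E from the slope S, assuming E = 10^(-1/S) - 1."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_ct(ct, slope, intercept):
    """Read an unknown sample's copy number off the standard curve."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical five-point standard (10^2 to 10^6 copies) with made-up Ct values.
standard_copies = [1e2, 1e3, 1e4, 1e5, 1e6]
standard_ct = [32.6, 28.9, 25.2, 21.5, 17.8]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
print(round(slope, 2), round(run_efficiency(slope), 2))
print(round(copies_from_ct(25.2, slope, intercept)))
```

Because E2 and E7 copy numbers are each read off their own standard curve, any factor that shifts the threshold cycle of a sample but not of the standard carries straight through to the ratio.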
The reliability of the mean of the triplicate measures was evaluated by one-way analysis of variance with random effects. The ratios of HPV16 E2 to E7 gene copy numbers in four plasmid samples were independently measured five times. The coefficient of variation (CV) was used to describe interassay variations.
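As a minimal sketch of the interassay CV, the Python snippet below expresses the standard deviation of repeated ratio measurements as a percentage of their mean. It assumes the sample standard deviation (n - 1 denominator), which the text does not specify, and the five ratios are hypothetical values.

```python
import statistics


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical E2/E7 ratios from five independent runs of one construct.
ratios = [0.98, 1.03, 0.95, 1.01, 1.04]
print(round(coefficient_of_variation(ratios), 1))
```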
Table 2 shows sequence variations in part of the E2 gene (from nucleotide position 3224 to 3852) and the entire E7 gene (from nucleotide position 562 to 858) within the plasmid constructs. Compared to the HPV16 prototype sequence, the Af1, Af2, and AA variants had a total of 13 consensus changes in the regions examined, including G to A, A to G, C to G, C to T, G to A, C to A, T to G, C to A, T to A, G to T, C to A, T to C, and T to G at positions 3249, 3362, 3377, 3410, 3449, 3516, 3566, 3684, 3694, 3778, 3787, 789, and 795, respectively. Each of these variants had several additional unique changes. Within the E2 probe binding site, there was a G-to-A change at position 3449 in all three non-E variants. In addition, the Af1 variant had an A-to-T change at position 3425 in the E2 forward primer binding site, and the AA variant had a T-to-C change at position 732 in the E7 probe binding site.
The numbers of E2 and E7 gene copies in four plasmid samples were measured in triplicate with the same equipment at five different times, with the known copy number of the HPV16 prototype as the standard. The overall reliability of the mean of the triplicate measures was 0.988, with an intraclass correlation of 0.965 (95% confidence interval, 0.951 to 0.978). The R2 values that describe the correlation between the threshold cycle and the log of the starting copy number of the standard (1.00 indicates perfect correlation) were no less than 0.99 for both E2 and E7 genes with either specific or degenerate sets of primers and probes. The efficiencies of five independent runs for the amplification of prototype HPV16 E2 and E7 genes varied from 0.84 to 0.88 and 0.83 to 0.86, respectively, with the specific sets of primers and probes and from 0.86 to 0.88 and 0.83 to 0.88, respectively, with the degenerate sets of primers and probes.
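The relation between the intraclass correlation of a single measurement and the reliability of a mean of replicates can be sketched with the Spearman-Brown formula. The text does not state which formula was used, so this is an assumption, but the reported values are consistent with it.

```python
def reliability_of_mean(icc: float, k: int) -> float:
    """Spearman-Brown reliability of the mean of k replicate measurements."""
    return k * icc / (1.0 + (k - 1) * icc)


# Intraclass correlation reported for the plasmid constructs, triplicate assays.
print(round(reliability_of_mean(0.965, 3), 3))  # 0.988
```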
The CVs of the ratios of E2 to E7 gene copies for five independent runs were less than 10% for all except for the AA variant construct (CV = 18%) with the specific sets of primers and probes (Table 3). The mean ratio for the E variant construct was close to 1, no matter which sets of primers and probes were used. However, while the ratios for the Af1 and AA variant constructs remained close to 1 with the degenerate sets of primers and probes, they differed substantially from 1 with the specific sets of primers and probes, with mean ratios of 0.36 for the Af1 construct and 2.57 for the AA construct. The numbers of E2 and E7 gene copies estimated by the degenerate sets of primers and probes were substantially larger than those estimated by the specific sets of primers and probes. The mean ratio for the Af2 variant construct estimated by the degenerate sets of primers and probes, although larger than that estimated by the specific sets of primers and probes (0.48 versus 0.17), reflected an approximately twofold difference between the numbers of E2 and E7 gene copies. The result remained similar when this construct was assayed another five times (data not shown).
The sequence variations of the Af2 variant in the plasmid construct were reconfirmed by DNA sequencing. We noticed that the Af2 variant carried a unique nucleotide alteration at the position immediately following the 3′ end of the initially designed E2 forward primer (a G-to-A change at position 3431). To examine whether this change would introduce errors into the determination of the E2 gene copy number, we redesigned three E2 forward primers by moving the original primer sequence 1 to 3 bases away from that alteration. When the E2 degenerate forward primer was replaced by these newly designed forward primers, the ratios of E2 to E7 gene copies gradually approached 1, reaching 0.97 (mean copy numbers of 9,899 for the E2 gene and 10,212 for the E7 gene) with modified E2 degenerate forward primer 3, the 3′ end of which was 3 bases away from the alteration. The modification of the E2 degenerate forward primer did not appreciably affect ratio estimates for the other three variant constructs (data not shown).
Nine cervical swab samples were assayed in triplicate by real-time PCR with the specific and modified degenerate sets of primers and probes. The overall reliability of the mean of the triplicate measures was 0.997, with an intraclass correlation of 0.992 (95% confidence interval, 0.987 to 0.996). The ratios of E2 to E7 gene copies in six cervical samples that were positive for the E variant were close or equal to 1 with either set of primers and probes (Table 4). However, the ratios for the three non-E variants differed substantially: in contrast to the ratio of ≥0.94 for the Af1 or AA variant with the modified degenerate sets of primers and probes, the ratio for the Af1 variant was 0.41 and that for the AA variant was 2.42 with the specific sets of primers and probes. The ratio of E2 to E7 gene copies for the Af2 variant was 0.56 with the modified degenerate sets of primers and probes and 0.07 with the specific sets of primers and probes.
In this study, we quantitatively evaluated the impacts of mismatches between the primers and probes and their binding sites on the detection of HPV16 DNA integration by real-time PCR. Because the numbers of E2 and E7 gene copies within the plasmid construct are known to be equal, comparisons of ratios of E2 to E7 gene copies estimated by the specific versus the degenerate sets of primers and probes reflect to what extent the physical status of the virus would be misinterpreted simply because of base mismatches.
We have assumed that the estimates derived by using primers and probes that were fully complementary to the targets were the actual copy numbers. As expected, the specific sets of primers and probes, which had no mismatches with the E variant construct, gave a ratio of E2 to E7 gene copies of close to 1 for this construct. The ratio of 1 for the E variant construct, derived by using the degenerate sets of primers and probes, suggested that the incorporation of degenerate bases, relative to the use of fully complementary bases, did not appreciably affect the determination of copy number. Notably, results for the three non-E variant constructs differed substantially between tests that used the specific versus the degenerate sets of primers and probes. While the ratio estimates remained close to 1 for all except the Af2 variant construct with the degenerate sets of primers and probes, the estimates were substantially different from 1 with the specific sets of primers and probes. These differences cannot be explained by variations in specimen loading or any other assay procedures, because the aliquots for E2 and E7 that were tested on the same plate were from the same premix and the results from five independent runs were comparable.
The departure from the ratio of 1 for the Af1 and AA variant constructs with the specific sets of primers and probes is likely to be a result of the mismatch-related reduction of amplification efficiency (32). For example, the ratio of <1 for the Af1 plasmid construct could be explained by the presence of mismatches in the E2 forward primer and probe binding sites but not in the E7 primer and probe binding sites. The AA variant had a G-to-A change in the E2 probe target and a T-to-C change in the E7 probe target. Both mismatches reduced the amplification efficiency, but the impact of the E7 mismatch was more substantial than that of the E2 mismatch, thereby leading to the reduced copy numbers of both E2 and E7 genes and the ratio of >1. The ratio of <1 for the Af2 variant construct that was derived by using the degenerate sets of primers and probes was unexpected because this variant had only a single mismatch in the E2 probe binding site, the one that was also present in the Af1 and AA variant constructs. The ratio of close to 1 obtained by testing with the modified degenerate sets of primers and probes suggested that the underestimation of E2 gene copies by the degenerate set may be attributed to a nucleotide alteration at the position immediately following the 3′ end of the E2 forward primer: from guanine at position 3431 in the prototype that was used to generate the standard curve to adenine in the Af2 variant construct. Knowing about this pattern, although the underlying mechanism is unclear, is useful for a better design of primers and probes for real-time PCR.
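How a mismatch-related loss of efficiency could distort the ratio can be illustrated with a deliberately simplified Python model. It assumes a constant per-cycle efficiency for both the sample and the standard, which is an idealization and not something measured in this study. Under that assumption, a sample that reaches the threshold at cycle Ct is read off the standard curve as ((1 + E_sample) / (1 + E_standard))^Ct times its true copy number. All efficiencies and cycle numbers below are hypothetical.

```python
def apparent_fraction(e_sample: float, e_standard: float, ct: float) -> float:
    """Apparent/true copy number when sample and standard differ in efficiency.

    Simplified model: both amplify with a constant per-cycle efficiency, and
    the sample crosses the detection threshold at cycle `ct`.
    """
    return ((1.0 + e_sample) / (1.0 + e_standard)) ** ct


# Hypothetical case 1: only the E2 assay is affected (ratio falls below 1).
e2 = apparent_fraction(0.80, 0.86, 25)
e7 = apparent_fraction(0.86, 0.86, 25)
print(round(e2 / e7, 2))

# Hypothetical case 2: both are affected, E7 more than E2 (ratio exceeds 1).
e2 = apparent_fraction(0.83, 0.86, 25)
e7 = apparent_fraction(0.79, 0.86, 25)
print(round(e2 / e7, 2))
```

Even a small per-cycle loss compounds over many cycles, which is one way to see why a single mismatch can shift the ratio severalfold in either direction.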
The present study further showed evidence of the variant-associated misinterpretation of HPV16 integration by testing a set of cervical swab samples. Because proportions of HPV16 non-E variants are related to geography and vary from one population to another, the findings of this study may in part explain the wide range of integration rates reported so far. Given that HPV16 variants differ biologically and etiologically (31), the variant-related base mismatches may somewhat bias estimates of the integration-related risk of cervical neoplasia. We noticed a ratio of <1 estimated by the modified degenerate sets of primers and probes for a clinical sample that was positive for the HPV16 Af2 variant. One interpretation for this result is that the variant might have existed in both episomal and integrated forms. Alternatively, this variant may carry additional nucleotide alterations that were not covered by our modified degenerate sets of primers and probes.
To our knowledge, this is one of the first studies, if not the first, to quantitatively address the impact of HPV16 variants on the detection of viral integration. Although the influence of HPV16 E2 polymorphisms on the detection of HPV16 integration was reported previously (4), the findings of the study were limited by the fact that the evaluation was based only on clinical samples in which a true integration status was unknown. The use of plasmid constructs for illustration avoided any integration-related misinterpretation that might arise with the use of clinical samples. We are, however, aware that our degenerate sets of primers and probes, although better than the specific sets, were designed based on limited sequence information. In natural infections, some HPV16 variants may carry more nucleotide alterations than those currently identified. Because of the intratypic diversity of many HPV types, the careful design of primers and probes is a prerequisite for a valid assay to test the integration of the HPV genome. One way to achieve this design is to choose a conserved region as the target (4). However, it is sometimes difficult to find such a region, particularly when one needs to take into account the appropriate amplicon size (usually less than 100 bp for real-time PCR) and comparable efficiencies of amplification. Alternatively, as shown in the present study, we may incorporate degenerate bases into the primers and probes at sites where nucleotides differ among the variants.
The implications of the results of this study go beyond studying viral DNA integration. Currently, quantitative analysis of the HPV DNA load by real-time PCR has been widely used in many clinical and epidemiological investigations. The primers and probes for the assay are usually designed based on the prototype sequence. It is likely that errors in viral load estimation may arise if the sample is positive for nonprototype HPV variants.
In summary, our data indicate that mismatches between the primers and probe and their targets can introduce significant errors into the determination of copy number. Incorporating degenerate bases into the primers and probe can sufficiently compensate for the mismatch-reduced efficiency of amplification.
This work was supported by Public Health Service grants CA34493 and CA84396 from the National Cancer Institute. L.A.K. has received research funds from Merck Research Laboratories. Other authors have no commercial or other associations that might pose a conflict of interest.
We thank the participants who provided biologic specimens and clinical data to this study.
Published ahead of print on 30 December 2008.