To evaluate the performance of the Human Papillomavirus High-Risk DNA test in patients 30 years and older.
Screening (N=835) and diagnosis (N=518) groups were defined based on prior Papanicolaou smear results as part of a clinical trial for cervical cancer detection. We compared the Hybrid Capture II® (HCII) test result to the worst histological report. We used cervical intraepithelial neoplasia (CIN) 2/3 or worse as the reference of disease. We calculated sensitivities, specificities, positive and negative likelihood ratios (LR+ and LR−), receiver operating characteristic (ROC) curves, and areas under the ROC curves for the HCII test. We also considered alternative strategies, including Papanicolaou smear, a combination of Papanicolaou smear and the HCII test, a sequence of Papanicolaou smear followed by the HCII test, and a sequence of the HCII test followed by Papanicolaou smear.
For the screening group, the sensitivity was 0.69 and the specificity was 0.93; the area under the ROC curve was 0.81. The LR+ and LR− were 10.24 and 0.34, respectively. For the diagnosis group, the sensitivity was 0.89 and the specificity was 0.78; the area under the ROC curve was 0.83. The LR+ and LR− were 4.06 and 0.14, respectively. Sequential testing showed little or no improvement over the combination testing.
The HCII test in the screening group had a greater LR+ for the detection of CIN 2/3 or worse. HCII testing may be an additional screening tool for cervical cancer in women 30 years and older.
Current U.S. guidelines recommend that human papillomavirus (HPV) testing be used after an abnormal Papanicolaou smear (specifically, when the result is ASCUS, atypical squamous cells of uncertain significance) or as a screening tool for cervical dysplasia in patients 30 years and older (1,2).
Although the HPV test is recommended as a screening test, few clinical trials evaluating the accuracy of the tests available in the market have been conducted in screening populations. A recent review identified 25 non-randomized trials evaluating the HPV test as the primary screening tool for cervical cancer, and among those, only 4 studies adjusted for verification bias by referring all participants to receive a colposcopy and biopsy (3). Studies failing to correct for verification bias tend to overestimate sensitivity and underestimate specificity.
Adjusting for verification bias by using histology in a screening trial may be difficult due to the incomplete follow-up of participants and their refusal to accept biopsies after a negative test. If participants are referred to colposcopy, biopsies are typically taken only from tissue of abnormal appearance. In this situation, because of the low sensitivity of colposcopy in screening populations, a participant whose colposcopic examination appears normal may be counted as a true negative for disease without ever having disease status verified through histology.
In a randomized trial comparing the performance of the HPV test to that of the Papanicolaou smear in a screening population, researchers found that the sensitivity of the HPV test dropped from 83% to 46% when adjusting for verification bias (4). These results did not show a substantial advantage of the HPV test over the Papanicolaou smear (adjusted sensitivity 43%), although the reported specificity of the HPV test was substantial.
The accuracy of the Papanicolaou smear has been challenged due to the rate of false negative results that lead to missed cases of advanced cervical disease. Although the HPV test has been considered a reasonable alternative due to its high sensitivity, recent research shows conflicting results regarding the test's operating characteristics.
To evaluate the test's potential as a screening tool for cervical cancer, the performance of the HPV test needs to be studied in diverse settings and adjusted for verification bias. We aimed to evaluate the performance of the HPV High-Risk DNA test, a particular form of HPV testing, for oncogenic types of HPV in women 30 years and older recruited from screening and diagnostic settings.
We studied women participating in a phase II clinical trial to evaluate fluorescence and reflectance spectroscopy, an emerging technology for the detection of cervical pre-cancer that uses an optical probe to inspect the cervix. Nonpregnant women age 18 and older were enrolled in this trial from October 1998 to November 2005. Women were allocated to a screening group if they stated that they had no history of abnormal Papanicolaou smear results; women were allocated to a diagnosis group if they stated that they had had an abnormal Papanicolaou smear at any previous time. Women were excluded from participation in the study if they had a history of cervical cancer or cervical intraepithelial neoplasia (CIN). The trial was conducted at three clinical locations: a comprehensive cancer center in the United States (The University of Texas M. D. Anderson Cancer Center), a community general hospital in the United States (Lyndon Baines Johnson Hospital Health District), and a cancer center in Canada (British Columbia Cancer Agency).
Advertising in the local media was one of the strategies used to increase the participation of women from local communities, and the strategy was expected to be effective in increasing the participation of minorities. Details of the recruitment strategies and the socio-demographic characteristics of the women have previously been reported (5). The study protocol was approved by the Institutional Review Boards of the three clinical locations. The women provided informed consent before participating in the trial.
The clinical trial protocol included a complete clinical exam as well as several tests routinely used for the screening and detection of gynecological disease. Each woman provided a complete medical history, including an interview to assess cervical cancer risk factors, and was given a physical and pelvic exam. The pelvic exam included a conventional Papanicolaou smear, cervical cultures to test for Chlamydia and gonorrhea, specimens for HPV testing, and a colposcopic examination of the vulva, vagina, and cervix, including two to four fluorescence and spectroscopic measurements.
After the fluorescence and spectroscopic measurements were performed, biopsies were obtained from colposcopic abnormal sites and from normal sites. If there was an area of abnormal colposcopic impression, the colposcopist took one or two colposcopically directed biopsies of the area with the worst overall colposcopic impression, as is done in usual care. The colposcopist also took one or two biopsies of squamous and columnar epithelium from an area of normal appearance, typically at the 6 o'clock and 12 o'clock positions, whether an abnormal area was identified with colposcopy or not. At the time of the conventional Papanicolaou smear, endocervical and ectocervical samples were obtained and used to prepare the conventional smear sample. Cells for the HPV test were then obtained by immersing the cervical brush in a 0.9% sodium chloride sterile solution vial.
We selected the High-Risk DNA test, a particular form of HPV test, using the Food and Drug Administration-approved Hybrid Capture II® (HCII) test (Digene Corporation, Gaithersburg, Maryland). The test was performed by a clinical laboratory (Laboratory Corporation of America®) using the standardized procedure recommended by the test manufacturer.
In performing the HCII test, HPV DNA was denatured and then incubated with two RNA probes (probe B for high-risk types and probe A for low-risk types), resulting in the formation of RNA-DNA hybrids. The hybrids were captured by antibodies against RNA-DNA hybrids bound to a solid phase, and any unbound hybrids were washed off. Antibodies conjugated to alkaline phosphatase were allowed to attach to the hybrids. The chemiluminescent product obtained from the conjugated antibody-hybrid constructs corresponded to the amount of DNA in the sample, was measured by a luminometer, and was reported in relative light units (RLUs). A test was considered positive for high-risk types if, for probe B, the RLUs of the sample were equal to or greater than the mean RLUs of the positive control, which was equivalent to 1 picogram of HPV high-risk DNA per milliliter. A negative control was used to evaluate negative tests.
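The positivity rule described above can be sketched in a few lines. This is only an illustration of the stated cutoff (sample RLU at or above the mean RLU of the positive control); the function name and the numeric readings are hypothetical and are not taken from the assay documentation or from the study data.

```python
from statistics import mean

def hcii_high_risk_positive(sample_rlu, positive_control_rlus):
    """Return True when the probe B signal meets the cutoff described in the text.

    The cutoff is the mean RLU of the positive-control replicates.
    """
    cutoff = mean(positive_control_rlus)
    return sample_rlu >= cutoff

# Hypothetical luminometer readings, for illustration only.
control_replicates = [210.0, 190.0, 200.0]   # mean is 200.0
print(hcii_high_risk_positive(350.0, control_replicates))  # at or above cutoff
print(hcii_high_risk_positive(120.0, control_replicates))  # below cutoff
```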
For the gold standard of “disease,” we selected the worst histological report from among all of the biopsies obtained for each woman. We defined “disease” as CIN 2/3 or worse, which is the disease threshold used in clinical practices for the treatment of patients.
All specimens were read twice at the study site by pathologists blinded to the women's clinical history. If the readings disagreed, the biopsy was read by a third pathologist. Details on the high level of agreement among participating pathologists have been previously published (6).
We compared the sociodemographic characteristics (age, race, education, marital status, and employment), clinical characteristics (Papanicolaou smear result, menopausal status, and gravidity), and risk factors (smoking and alcohol intake) of the screening and diagnosis groups. The Student's t-test was used to determine the differences in age by study group, and the Pearson chi-squared test was used to analyze categorical variables.
We initially determined the prevalence of CIN 2/3 or worse for the screening group based on the histology. Within the screening group, we evaluated the performance of the HCII test, the Papanicolaou smear, and a combination of the HCII test and the Papanicolaou smear (for which an abnormal result was defined as a Papanicolaou smear result of ASCUS or worse or a positive HCII test). We then evaluated the performance of the HCII test in the subgroup of women with Papanicolaou smear results of ASCUS or worse. Finally, we evaluated the performance of the Papanicolaou smear in the subgroup of women with positive HCII tests. We then recomputed the prevalence of CIN 2/3 or worse for these subgroups (i.e. after an initial abnormal test result) within the screening group.
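The difference between the combination strategy (abnormal if either test is abnormal) and the sequential strategy (one test evaluated only in women with an abnormal result on the other) can be made concrete with a small sketch. The records below are a toy data set invented for illustration, not study data, and the class and function names are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Woman:
    pap_abnormal: bool    # Papanicolaou smear result of ASCUS or worse
    hcii_positive: bool   # positive HCII high-risk result
    diseased: bool        # worst histology is CIN 2/3 or worse

def sens_spec(records, test):
    """Sensitivity and specificity of `test` (a function of one record)."""
    tp = sum(1 for r in records if r.diseased and test(r))
    fn = sum(1 for r in records if r.diseased and not test(r))
    tn = sum(1 for r in records if not r.diseased and not test(r))
    fp = sum(1 for r in records if not r.diseased and test(r))
    return tp / (tp + fn), tn / (tn + fp)

def hcii(r):
    return r.hcii_positive

def combined(r):
    # Combination strategy: abnormal if either test is abnormal.
    return r.pap_abnormal or r.hcii_positive

# Toy records, invented for illustration only.
toy = [
    Woman(True, True, True),
    Woman(False, True, True),
    Woman(True, False, True),
    Woman(True, False, False),
    Woman(False, True, False),
    Woman(False, False, False),
    Woman(False, False, False),
    Woman(False, False, False),
]

hcii_alone = sens_spec(toy, hcii)
either_test = sens_spec(toy, combined)

# Sequential strategy: HCII evaluated only among women with an abnormal smear,
# which yields the conditional sensitivity and specificity.
after_abnormal_pap = [r for r in toy if r.pap_abnormal]
hcii_conditional = sens_spec(after_abnormal_pap, hcii)
```

In this toy example the combination raises sensitivity and lowers specificity relative to the HCII test alone, which is the general trade-off expected when two tests are combined with an "or" rule.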
For each test or test combination, we determined the sensitivity, specificity, and respective 95% confidence intervals and computed the positive likelihood ratio (LR+) and negative likelihood ratio (LR−). We then calculated positive and negative predictive values for each group of women based on the prevalence, sensitivity, and specificity of the tests.
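As a minimal sketch of these calculations, the likelihood ratios follow directly from sensitivity and specificity, and the predictive values follow from Bayes' theorem once a prevalence is supplied. The function names are hypothetical, and the prevalence used in the usage lines is a placeholder chosen for illustration, not the prevalence observed in the study. Because the inputs here are the rounded values of 0.69 and 0.93, the output will not exactly match the likelihood ratios reported from the unrounded data.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg

def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Rounded screening-group values from the text.
lr_pos, lr_neg = likelihood_ratios(0.69, 0.93)

# Placeholder prevalence (assumption, for illustration only).
ppv, npv = predictive_values(0.69, 0.93, prevalence=0.02)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, PPV = {ppv:.3f}, NPV = {npv:.3f}")
```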
We repeated the above analyses for the diagnosis group. We constructed receiver operating characteristic (ROC) curves for the HCII test in the screening and diagnosis groups and compared the areas under the curves using a chi-squared test with a nonparametric approach, as suggested by DeLong et al. (7).
We used SPSS© 12.0 for Windows (SPSS Inc., Chicago, Illinois) and Stata version 9 statistical software (StataCorp LP, College Station, Texas) for the statistical analysis.
We enrolled 1000 women with a history of normal Papanicolaou smears into the screening group and 850 women with a history of abnormal smears into the diagnosis group. In accordance with clinical practice guidelines, we then considered only those women who were 30 and older, which left 873 and 571 in the screening and diagnosis groups, respectively. Of those, we excluded women who did not have results from both an HCII test and an internal Papanicolaou smear; we also excluded women who did not have histology results. These steps left 835 women in the screening group and 518 women in the diagnosis group with complete data for the analysis. More women had missing data in the diagnosis group (9.3%) than in the screening group (4.4%), and the difference was significant (p < 0.001; see Figure 1).
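The missing-data percentages can be checked from the counts given above (873 to 835 in the screening group, 571 to 518 in the diagnosis group). The sketch below also applies a Pearson chi-squared test without continuity correction to the resulting 2×2 table; the text does not state which test produced the reported p-value, so that choice is an assumption, and the function name is hypothetical.

```python
import math

def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) and p-value
    for the 2x2 table [[a, b], [c, d]] with 1 degree of freedom."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of chi-squared with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Counts derived from the text: eligible women minus women with complete data.
screening_missing, screening_complete = 873 - 835, 835
diagnosis_missing, diagnosis_complete = 571 - 518, 518

pct_screening = 100.0 * screening_missing / 873   # about 4.4%
pct_diagnosis = 100.0 * diagnosis_missing / 571   # about 9.3%

chi2, p_value = pearson_chi2_2x2(
    screening_missing, screening_complete,
    diagnosis_missing, diagnosis_complete,
)
print(f"{pct_screening:.1f}% vs {pct_diagnosis:.1f}%, chi2 = {chi2:.2f}, p = {p_value:.5f}")
```

Under these assumptions the p-value falls below 0.001, in line with the reported result.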
Most of the women 30 years and older with complete data (61.7%) were recruited into the screening group. Women in this group tended to be older than those in the diagnosis group (46.7 years vs. 42.3 years, respectively; p < 0.001). Significant differences in race, marital status, smoking habits, and menopausal status were observed between the groups. The prevalence of abnormal Papanicolaou smears (ASCUS and worse) in the screening group and the diagnosis group was 7.1% and 40.2%, respectively. Details are shown in Table 1.
Table 2 presents the frequencies of histology and test results in the screening and diagnosis groups. For the screening group, the HCII test had a sensitivity of 0.69 and a specificity of 0.93. The LR+ and LR− were 10.24 and 0.34, respectively. In the diagnosis group, the sensitivity was 0.89 and the specificity was 0.78. The LR+ and LR− in this group were 4.06 and 0.14, respectively (see Table 3).
We constructed a ROC curve for each of the study groups and estimated the area under each curve (see Figure 2). The area under the ROC curve for the screening group was 0.81 (95% CI 0.78, 0.84) and the area under the ROC curve for the diagnosis group was 0.83 (95% CI 0.80, 0.87). The areas under the two curves did not differ significantly (p = 0.69).
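When a test is treated as a single dichotomous result, its empirical ROC curve has one operating point, and the trapezoidal area under it reduces to the average of sensitivity and specificity. The sketch below assumes the HCII result was analyzed this way, which the text does not state explicitly; the function name is hypothetical.

```python
def auc_single_threshold(sensitivity, specificity):
    """Trapezoidal area under a ROC curve through (0, 0),
    (1 - specificity, sensitivity), and (1, 1)."""
    fpr = 1.0 - specificity
    left_triangle = 0.5 * fpr * sensitivity
    right_trapezoid = 0.5 * (1.0 - fpr) * (sensitivity + 1.0)
    # Algebraically equal to (sensitivity + specificity) / 2.
    return left_triangle + right_trapezoid

screening_auc = auc_single_threshold(0.69, 0.93)   # approximately 0.81
diagnosis_auc = auc_single_threshold(0.89, 0.78)   # approximately 0.835
```

With the rounded sensitivities and specificities from the text, this gives values close to the reported areas of 0.81 and 0.83.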
The results for testing with the Papanicolaou smear alone were slightly worse than the results for the HCII test alone. For the Papanicolaou smear alone, the sensitivities were lower and the specificities were approximately the same in the screening and diagnosis groups. The areas under the curve were lower for the Papanicolaou smear alone compared to the HCII test alone; similarly, the likelihood ratios were inferior for the Papanicolaou smear alone compared to the HCII test alone, although the positive likelihood ratio for the screening group was 6.89.
In the screening group, the operating characteristics of the combination of the Papanicolaou smear and the HCII test were the same as for the HCII test alone because all of the women who had an abnormal Papanicolaou smear also had a positive HCII test. In the diagnosis group, the sensitivity improved to 0.96 (95% CI 0.92, 0.99) but the specificity decreased to 0.65 (95% CI 0.60, 0.69). In that case, the predictive value negative (PVN = 0.98) was the lowest of any test sequence.
We then considered the use of a reference test. When the reference test was the Papanicolaou smear, we found that 47 women in the screening group had an abnormal result of ASCUS or worse. The HCII test was positive for all women who had CIN 2/3 or worse (7/47). With this small sample size, the conditional sensitivity was 1.00 (one-sided 97.5% CI 0.59, 1.00) and the conditional specificity was 0.65 (95% CI 0.48, 0.79). The area under the ROC curve was 0.82. In the diagnosis group, for the women who had an abnormal Papanicolaou smear, the HCII test had a conditional sensitivity of 0.90 (95% CI 0.82, 0.95) and a conditional specificity of 0.41 (95% CI 0.31, 0.52). The area under the ROC curve was 0.66.
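The one-sided interval for the conditional sensitivity in the screening group is consistent with an exact binomial (Clopper-Pearson) bound. When all n cases test positive, the lower limit of a one-sided interval with confidence level 1 − α is α^(1/n). The sketch below assumes this is the method that was used, which the text does not confirm; the function name is hypothetical.

```python
def exact_lower_bound_all_positive(n, confidence=0.975):
    """Exact one-sided lower confidence limit for a proportion
    when all n observations are successes."""
    alpha = 1.0 - confidence
    return alpha ** (1.0 / n)

# Seven women with CIN 2/3 or worse, all HCII positive.
lower = exact_lower_bound_all_positive(7)
print(f"conditional sensitivity 1.00, lower limit {lower:.2f}")
```

With n = 7 the lower limit is about 0.59, matching the reported interval.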
When evaluating the performance of the Papanicolaou smear to detect CIN 2/3 or worse after an HCII reference test, we found for both the screening and diagnosis groups that the conditional sensitivity was lower than in the corresponding analysis in which the Papanicolaou smear was the reference test. In addition, the respective areas under the ROC curves and the positive and negative likelihood ratios were worse when the HCII test was the reference test compared to when the Papanicolaou smear was the reference test. Details can be found in Table 3.
We found superior accuracy of the HCII test to detect CIN 2/3 or worse in the screening group. No significant difference was observed between the areas under the ROC curves for the screening and diagnosis groups, but a higher LR+ was observed in the screening group when compared to the diagnosis group (10.24 vs. 4.06, respectively).
This finding points toward the use of the HCII test as an alternative to the Papanicolaou smear for the screening of cervical cancer in similar populations. Other researchers have reported results similar to ours. A recent randomized clinical trial conducted by Mayrand et al. (4) compared the HCII test to the Papanicolaou smear and reported the performance of both tests in a screening population. Considering their definition of disease and use of strategies to correct verification bias, their results may be comparable to our screening group results. Mayrand et al. (4) defined two categories of disease: liberal and conservative. The former was defined using any histologic result available from the colposcopically directed biopsies or loop electrosurgical excision procedure (LEEP) specimens, and the latter was defined using only LEEP specimens. In this way, their liberal definition of disease is more like the definition used in our study, which allows us to compare the two sets of results. Similar to our study, the researchers implemented strategies to reduce verification bias by confirming the histology diagnosis in a random sample of participants with negative tests, and they reported both the crude and the corrected operating characteristics of the HCII test. Using the liberal definition of disease, Mayrand et al. (4) reported a sensitivity of 82.8% and a specificity of 61.1% as the crude operating characteristics of the HCII test, and a sensitivity of 45.9% and a specificity of 94.2% as the corrected operating characteristics of the HCII test.
The sensitivity in our screening group was higher than the adjusted sensitivity reported by Mayrand et al. (4) (68.8% vs. 45.9%, respectively). The differences in our results may be explained by the study sampling and the natural history of HPV infection. While women in our study were allocated into a screening or diagnosis group based on their history of Papanicolaou smear results, participants in the Mayrand et al. (4) study were recruited from a screening program. Indeed, Mayrand et al. (4) reported that approximately 28% of the participants had a history of abnormal Papanicolaou smears, and 2% had never been tested. Thus, participants in the Mayrand et al. (4) study may have been infected with HPV at the time of the first abnormal test and had an increased length of exposure to HPV compared to our group of women with a history of normal smears.
Differences in the performance of the HCII test have also been explained by local differences in the processing and handling of samples, as suggested by Sankaranarayanan et al. while conducting a multicenter study of the HCII test in India (8). The three other studies that avoided verification bias (9-11) showed operating characteristics that were quite similar to the ones found in our study. Small variations exist, which may indicate that the test as used in our study was operating at a different point on what is hypothetically the same ROC curve.
Our evaluation of HCII test performance may not reflect actual clinical practice because not all patients in a screening group are referred to colposcopy, and biopsies are not usually obtained from tissue of normal colposcopic appearance. Moreover, endocervical curettages are not performed in screening populations. While our study methods helped to establish the disease or non-disease status of each woman, they may have increased the detection of small lesions that are not usually discovered when performing a routine colposcopic examination.
Current U.S. guidelines for cervical cancer screening suggest that the tests can be used simultaneously in women over 30 years old with the option of extending the screening intervals up to every three years if both tests are negative. Our data, in fact, support the use of the HCII test in conjunction with the Papanicolaou smear rather than as a follow-up test for an abnormal cytological test. We observed the best sensitivity and specificity with the concurrent use of either the Papanicolaou smear or the HCII test with near perfect predictive value negative. Even the use of the HCII test alone in either the screening group or the diagnosis group generated superior sensitivity and specificity when compared to previous reports for Papanicolaou testing alone. Additional issues, such as economic and epidemiological considerations, would need to be considered before making evidence-based guidelines for public policy. These questions are particularly relevant for determining policy in limited resource settings.
The HCII test is limited to detecting the 13 most common HPV high-risk viral strains associated with cervical cancer, and cannot determine specific viral strains. Moreover, a false negative result may be caused by the presence of small numbers of viral copies in the study sample. From the perspective of epidemiological studies, the test may provide a general overview of the prevalence of infection with high-risk viral strains, but the distribution of each high-risk strain in a study population needs to be addressed using other tests that are not clinically approved.
Using the HCII test, we have demonstrated the potential of HPV testing in two subgroups of women. There is greater potential for clinical applicability using the HCII test for women in a screening population than in a diagnosis population. The operating characteristics of the HCII test seem superior when compared to those of the conventional Papanicolaou smear, and are perhaps competitive with liquid-based technologies. For a screening population, we would like to minimize the number of false positives that would require additional testing or intervention. Since cervical cancer is fairly slow in developing, a lower sensitivity can be accommodated by repeated screening. However, we want to minimize false negatives for a diagnosis population which has likely experienced previous cytological abnormalities and is likely to be at an increased risk for future abnormalities. This allows for minimal numbers of missed diagnoses. In addition, the use of the HCII test alone in this population may cause some pathologies to be missed; this could be avoided by adding cytological screening to HPV DNA testing. Our results support the use of the HCII test in screening for cervical dysplasia. However, further study may be required to determine whether the use of HPV testing as a primary screening tool for cervical pre-cancer should become standard practice.
The authors thank Rebecca Partida for editorial contributions that enhanced the clarity of the manuscript.
Financial support for this study was provided by grant number CA82710 from the National Cancer Institute. The funding agreement ensured the authors' independence in designing the study, interpreting the data, and writing and publishing the report.
This study was conducted by The University of Texas M. D. Anderson Cancer Center Departments of Biostatistics and Gynecologic Oncology, the University of British Columbia Department of Obstetrics and Gynaecology, and Fox Chase Cancer Center Division of Population Science. This study was conducted at sites in Houston, Texas and Vancouver, British Columbia, Canada.