(flowchart) shows the study recruitment. A total of 1,117 women consented to participate in the study, and the tests from 1,099 women were analyzed. The median length of time between the referral smear and colposcopy was 1.8 months (interquartile range [IQR], 1.4 to 2.6 months; range, 0.3 to 59 months). The median age of the women was 29 years (IQR, 26.6 to 34.3 years). Over three-quarters of the women were under 35 years of age, with 55% being under the age of 29 years.
Approximately 24% of the referral population had high-grade dyskaryosis, and 76% had low-grade disease (16.7% borderline, 44.2% with a single mildly dyskaryotic cytology, and 15.4% with mild dyskaryosis and one or more borderline smears or worse). Over 20% of the concurrent smears were borderline, 32% were mild dyskaryosis, and approximately 26% were high grade. Just under one-fifth (19%) of the concurrent smear specimens taken in the colposcopy clinic were negative, which may reflect sampling variability in the smear or regression of disease.
tabulates referral cytology against worst histology. Twenty-seven percent of the women either had a normal colposcopy with no biopsy or a negative biopsy, 37% of women had CIN1 or other minor abnormalities, 12% had CIN2, and 20% had CIN3 or worse.
Results of referral smear and worst-reviewed consensus histology within 9 months of baseline cytology
Cytology of samples taken at colposcopy (mild dyskaryosis or worse) had 88.9% sensitivity, 58.1% specificity, and 50.7% PPV. The cytology was read with the knowledge that the patient was having colposcopy and so may not reflect normal screening practice.
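For readers who want to reproduce accuracy measures of this kind, the standard 2x2 definitions of sensitivity, specificity, and PPV can be sketched as follows. This is a minimal illustration only: the function name `diagnostic_accuracy` and the counts passed to it are hypothetical and are not study data.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Return (sensitivity, specificity, PPV) as fractions from 2x2 counts.

    tp: disease present, test positive
    fp: disease absent, test positive
    fn: disease present, test negative
    tn: disease absent, test negative
    """
    sensitivity = tp / (tp + fn)  # proportion of diseased women who test positive
    specificity = tn / (tn + fp)  # proportion of non-diseased women who test negative
    ppv = tp / (tp + fp)          # proportion of positive tests that are true disease
    return sensitivity, specificity, ppv


# Hypothetical counts chosen for illustration, NOT taken from the study.
sens, spec, ppv = diagnostic_accuracy(tp=90, fp=40, fn=10, tn=60)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} PPV={ppv:.1%}")
```

Because PPV depends on disease prevalence, values obtained in a colposcopy referral population would not be expected to carry over to a primary screening population.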
The overall HPV positivity of the different tests is shown in and ranged from 79% to 86%, apart from p16INK4a and PreTect HPV-Proofer, at 58.9% and 43.9%, respectively. The proportion of women positive for HPV type 16 and/or 18 was similar across tests (for HPV type 16, the positivity ranged from 26.2% to 31.9%, and for HPV type 18, it ranged from 7.8% to 11.8%; ).
HPV positivity and type-specific results of different tests
Sensitivity, specificity, and PPV for CIN2+, CIN2 alone, and CIN3+ of the different tests (all samples taken concurrently at colposcopy) are reported in . We have chosen to report separately the results for CIN2+ and CIN3+ because CIN2 is less likely to progress and has greater variability in diagnosis. CIN3 has been shown to have greater reproducibility.
Sensitivity, specificity, and PPV of different tests for detection of high-grade disease
Five adjunctive tests had a sensitivity greater than 95% for CIN3+: Hybrid Capture 2, Cobas 4800, Abbott RealTime, BD HPV, and Aptima. The Abbott RealTime test was significantly less sensitive for CIN2+ than Hybrid Capture 2, BD HPV, and Aptima and had a marginally but not significantly lower sensitivity than Cobas 4800. All these tests were significantly more sensitive than p16INK4a or PreTect HPV-Proofer.
There were seven cancers, of which all but one (a microinvasive stage 1A cancer missed by PreTect HPV-Proofer only) were detected by all the HPV tests. This sample tested positive for HPV-52 by BD HPV and for other HPV types by Abbott RealTime and Cobas 4800.
Overall, the highest specificity was achieved with the PreTect HPV-Proofer (70.8%, for CIN2+), but this test had relatively low sensitivity (). The CINtec p16INK4a cytology test had a significantly lower specificity (54.7% and 49.4% for CIN2+ and CIN3+, respectively) than PreTect HPV-Proofer but a higher specificity than the other tests. Of the five highly sensitive tests, Hybrid Capture 2 showed significantly lower specificity than the other four tests (McNemar's test). Aptima and Abbott RealTime were also significantly more specific than Cobas 4800 and BD HPV. When focusing on CIN3+, sensitivity was slightly higher overall; however, the ordering of the tests remained similar (). Because there is uncertainty about the progressive potential of CIN2, we provide sensitivity estimates for both CIN2+ and CIN3+; however, few would consider CIN2 to be a false positive, so we do not report specificity for CIN3+.
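Because every test was run on samples from the same women, comparisons of specificity between two tests are paired, which is why McNemar's test is appropriate. The sketch below illustrates one common variant, the exact (binomial) form, applied to the discordant pairs among women without the disease outcome. The text does not state whether an exact or chi-square version was used, so the choice here is an assumption, and the function name `mcnemar_exact` and the example counts are hypothetical.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from discordant pair counts.

    b: women positive on test A but negative on test B
    c: women negative on test A but positive on test B
    Concordant pairs do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis each discordant pair falls either way with p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts among non-diseased women, NOT study data.
p_value = mcnemar_exact(b=12, c=30)
print(f"exact McNemar p = {p_value:.4f}")
```

For specificity, only women without the outcome (for example, those without CIN2+) are included, and a small p-value suggests that one test produces false positives more often than the other.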
shows the effects on sensitivity and specificity of using different cutoffs for Hybrid Capture 2, Cobas 4800, Abbott RealTime, Aptima, BD HPV, and p16INK4a in predicting histologically confirmed high-grade disease (CIN2+ and CIN3+). The ROC curves for the highly sensitive tests are very similar. If the cutoff for Hybrid Capture 2 were raised (from ≥1 RLU to ≥2 RLU), the sensitivity remained relatively unchanged, while the specificity slightly improved. Using a cutoff ≥3 instead of ≥2 (see Materials and Methods) for p16INK4a reduced the sensitivity for CIN3+ (from 90.2% to 79.0%), although it increased the specificity from 49.4% to 75.9%. For the BD HPV test, lowering the cutoff to a cycle threshold (CT) value of ≤33 from the nominal ≤36.2 would result in a small decrease in sensitivity for CIN3+ (from 97.8% to 97.3%) and for CIN2+ (from 95% to 93.8%), while it would result in an increase in specificity for CIN3+ (from 21.9% to 26.7%) and for CIN2+ (from 24.1% to 29.4%).
Fig 2 ROC curves for tests to detect CIN2+ and CIN3+. Sensitivity plotted against specificity calculated at different cutoffs for each test is shown. Plots are presented separately for CIN3+ and CIN2+, with a dashed line used to represent the line of chance.
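The cutoff analysis underlying ROC curves of this kind can be illustrated with a short sketch: each candidate cutoff reclassifies the samples as positive or negative, and sensitivity and specificity are recomputed. The function name `sens_spec_at_cutoff`, the signal values, and the disease labels below are hypothetical and are not study data. The sketch assumes a test in which a higher signal means positive (as with an RLU-type ratio); for a cycle-threshold readout, where a lower value means positive, the comparison would be reversed.

```python
def sens_spec_at_cutoff(values, diseased, cutoff):
    """Call a sample positive when value >= cutoff; return (sensitivity, specificity)."""
    tp = sum(1 for v, d in zip(values, diseased) if d and v >= cutoff)
    fn = sum(1 for v, d in zip(values, diseased) if d and v < cutoff)
    tn = sum(1 for v, d in zip(values, diseased) if not d and v < cutoff)
    fp = sum(1 for v, d in zip(values, diseased) if not d and v >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical signal values and disease status, NOT study data.
values = [0.4, 1.2, 1.8, 2.5, 15.0, 0.7, 2.2, 40.0]
diseased = [False, False, False, False, True, False, True, True]

# One (cutoff, sensitivity, specificity) point per candidate cutoff.
roc_points = [(c, *sens_spec_at_cutoff(values, diseased, c)) for c in (1.0, 2.0, 3.0)]
for cutoff, sens, spec in roc_points:
    print(f"cutoff >= {cutoff}: sensitivity={sens:.1%}, specificity={spec:.1%}")
```

Raising the cutoff can only hold or lower sensitivity and hold or raise specificity, which is the trade-off described for the Hybrid Capture 2, p16INK4a, and BD HPV cutoffs.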
Sensitivity was slightly higher in younger women (age <30 years) for all tests. For CIN3+, it was between 3.0% and 5.3% higher for all tests except PreTect HPV-Proofer, whose sensitivity was 88.9% in women under age 30 years and 70.3% in those over age 30 years. All tests showed higher specificity in the older age group (data not shown).
To more accurately reflect the use of these tests in a situation where HPV triage has been recommended, we also considered the restricted population of women who had a borderline or mildly dyskaryotic referral smear. The results were generally similar (). For the five tests that showed the highest sensitivity (Hybrid Capture 2, Abbott RealTime, Cobas 4800, BD HPV, and Aptima), specificities were similar when only women with borderline or mildly dyskaryotic smears were considered, and the relative performance (i.e., the order) of the tests was unchanged in these lower-risk cytology categories.
It should be noted that the tests did not all miss the same cases, making comparisons more complex. There were 120 CIN2+ individuals missed by at least one test: 87 individuals were negative by one test only (60 by PreTect HPV-Proofer, 26 by CINtec p16INK4a cytology, and 1 by Hybrid Capture 2); 10 individuals were negative by two tests (7 by PreTect HPV-Proofer and CINtec p16INK4a cytology, 3 by PreTect HPV-Proofer and Abbott RealTime); 23 individuals were negative by three or more tests. Data on individual HPV types were complicated by the large number of multiple infections and will be reported separately.