CONTEXT AND CAVEATS
Recent studies suggest that screening for epithelial ovarian cancer by annual transvaginal ultrasound (TVU) and CA125 measurement does not confer a survival benefit and may lead to unnecessary surgery for some patients.
A multimodal strategy combining measurement of CA125 and other serum markers specific for ovarian cancer was investigated as a potential screening strategy and compared with TVU. Serum samples from invasive ovarian cancer patients and matched control subjects from the Prostate, Lung, Colorectal, and Ovarian trial were used to evaluate candidate serum markers for their association with malignancy and increasing CA125.
Of the six serum markers investigated, human epididymis protein 4 (HE4) was identified as a potential serum marker with the highest relative increase in serum levels and was associated with malignancy. In a subanalysis of patients with increasing CA125, HE4 was found to confirm more cancers than TVU as a second-line screen.
HE4 may be a valid serum marker for use in a multimodal screening strategy for ovarian cancer.
Measuring HE4 may not make screening itself effective. Although adding HE4 measurement to a screening strategy may improve specificity, it may also reduce sensitivity. Further studies to determine the effect of detecting HE4-associated tumors on patient outcomes and confirmation of better performance as a first- and second-line screen relative to TVU are needed.
From the Editors
A recent report from the Prostate, Lung, Colorectal, and Ovarian (PLCO) trial suggests that screening for epithelial ovarian cancer (EOC) using annual transvaginal ultrasound (TVU) and CA125 measurement leads to unnecessary surgery (1) without reducing mortality (2). A multimodal strategy using increasing CA125 serum levels measured annually to select women for TVU yielded an acceptable positive predictive value of 35% at the initial screen in the UK Collaborative Trial of Ovarian Cancer Screening (3); data on patient mortality have not yet been reported. Poor performance by TVU in efficacy trials suggests that use of a second serum marker in participants with increasing CA125 serum levels may offer advantages as a second-line screening method.
Using proximate samples from 112 women with EOC and 706 matched non-oophorectomized control subjects from the PLCO trial, we evaluated six candidate serum markers for their elevation in patients with and without increasing CA125 serum levels. CA125, human epididymis protein 4 (HE4), mesothelin, matrix metalloproteinase 7 (MMP7), SLPI, Spondin-2, and insulin-like growth factor binding protein 2 (IGFBP2) previously showed at least 30% sensitivity at 95% specificity in clinical studies (4) or in validation studies using preclinical samples (8
PLCO trial design and study population (1) are described in the Supplementary Methods (available online). Previous reports from the PLCO validation study did not analyze TVU results or serial measures of CA125 (9). Also unique to this analysis is the inclusion of 237 control subjects with a family history of breast or ovarian cancer to improve generalizability to the high-risk population. Written informed consent was obtained from each participant. EOC was defined as ovarian, fallopian tube, or primary peritoneal cancer, excluding granulosa cell tumors. Assays used for the study are described in the Supplementary Methods (available online). Marker concentrations were rescaled, and covariable adjustment was performed using analysis of covariance on the standardized scale (8
In the PLCO trial, women aged 55–74 years at enrollment were screened annually for EOC for 6 years by CA125 and considered positive if CA125 levels were 35 U/mL or higher. Serial CA125 results obtained in the trial were available for all years, but only one proximate serum sample was provided for the measurement of novel markers. A clinically significant increase in CA125 in the proximate sample (“increasing CA125” hereafter) was defined by a personal threshold consistent with 96.9% specificity as determined by the parametric empirical Bayes rule, a longitudinal algorithm (11) designed to detect increasing marker levels. Because serum was not stored from blood collected at the fourth screen, the proximate sample was obtained at the third screen for cancers diagnosed between the third and fifth screens (Supplementary Table 3, available online), adversely affecting sensitivity estimates in this study. PLCO study participants were also screened annually by TVU in the first 4 years, resulting in a subset of 84 patients and 516 control subjects (including 175 family history control subjects) with TVU results available.
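The idea of a personal threshold can be illustrated with a minimal sketch of a parametric empirical Bayes rule. It assumes that marker values on the analysis scale (for example, log CA125) vary normally around each woman's own mean and that these personal means vary normally across the population. The function names, the population mean, and the two variance components below are hypothetical placeholders chosen for illustration; they are not values from the PLCO trial, and this is not necessarily the exact implementation used in the study.

```python
from math import log, sqrt
from statistics import NormalDist, fmean


def peb_threshold(history, pop_mean, var_between, var_within, specificity=0.969):
    """Personal positivity threshold for a woman's next marker value.

    history     : her earlier marker values on the analysis scale (may be empty)
    pop_mean    : mean among cancer-free women (assumed known)
    var_between : variance of personal means across women (assumed known)
    var_within  : variance of repeated values within one woman (assumed known)
    """
    n = len(history)
    total_var = var_between + var_within
    b1 = var_between / total_var  # share of variance due to personal baseline
    # Shrinkage weight after n prior values; zero when there is no history.
    bn = var_between / (var_between + var_within / n) if n else 0.0
    personal_mean = fmean(history) if n else pop_mean
    # Expected next value: population mean pulled toward the personal mean.
    centre = pop_mean + bn * (personal_mean - pop_mean)
    # Predictive variance of the next value given the history.
    pred_var = total_var * (1.0 - b1 * bn)
    z = NormalDist().inv_cdf(specificity)
    return centre + z * sqrt(pred_var)


def is_increasing(history, new_value, **params):
    """Flag the new value when it exceeds the personal threshold."""
    return new_value > peb_threshold(history, **params)


# Hypothetical parameters on the log(U/mL) scale, for illustration only.
params = dict(pop_mean=log(12.0), var_between=0.20, var_within=0.05)
stable_low = [log(8.0), log(9.0), log(8.5)]
print(is_increasing(stable_low, log(20.0), **params))
print(is_increasing([], log(20.0), **params))
```

With a history of consistently low values the threshold tightens around the woman's own baseline, so a value well below a fixed 35 U/mL cutoff can still be flagged; with no history the rule falls back to a population-level threshold.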
Patients and control subjects did not differ in most categories; however, the patients did statistically significantly differ from the control subjects in terms of family history of breast or ovarian cancer (by design, P < .001 for all participants and P < .001 for the subset of participants with TVU results available) and in history of endometriosis (P = .029 for all participants and P < .012 for the subset of participants with TVU results available); family history control subjects did not statistically significantly differ from the randomly selected PLCO population control subjects (Supplementary Tables 1, available online).
Using all 112 patients and 706 control subjects with marker results available, we investigated the associations of CA125, HE4, mesothelin, MMP7, IGFBP2, Spondin-2, and SLPI marker levels with malignancy, accounting for increasing CA125 to address validity as a second-line screen and simultaneously adjusting for participant characteristics (13). Statistically significantly increased levels of every marker were observed for cancers identified by increasing CA125 serum levels (Table 1). The highest average signal was recorded for HE4 and was 4.26 SDs above the mean HE4 measurement in control subjects (P < .001), suggesting that HE4 is the best marker of those tested for use as a second-line screening modality. For cancers not detected by increasing CA125, an HE4 signal of 0.495 SDs above the mean in control subjects (P = .006) was observed; statistically significant changes in the levels of other markers were not observed in these patients (Table 1).
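The signals above are expressed in SDs above the control mean. A minimal sketch of that rescaling follows, assuming a plain z-score against the control distribution. The covariable adjustment by analysis of covariance is omitted, the function name is hypothetical, and the concentrations are made up for illustration.

```python
from statistics import fmean, stdev


def standardize_to_controls(values, control_values):
    """Express marker values in SDs above the control mean."""
    mu = fmean(control_values)
    sd = stdev(control_values)  # sample SD among control subjects
    return [(v - mu) / sd for v in values]


# Made-up marker concentrations, for illustration only.
controls = [40.0, 45.0, 50.0, 55.0, 60.0]
cases = [58.0, 90.0]
signals = standardize_to_controls(cases, controls)
print([round(s, 2) for s in signals])
```

On this scale a signal of 0 means a value equal to the control mean, so markers measured in different units can be compared directly.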
Table 1 Associations of marker levels with patient status by increasing CA125 and participant characteristics reported at the time of enrollment for 112 patients and 706 control subjects in the Prostate, Lung, Colorectal, and Ovarian trial with CA125 and HE4
We also investigated associations between marker levels and each participant characteristic after adjusting for the remaining characteristics (Table 1). The characteristics studied included age at first blood draw (continuous), body mass index (continuous), nonwhite race (yes or no), family history (breast or ovarian cancer, yes or no), oral contraceptive use (≥1 year, yes or no), nulliparous (yes or no), history of endometriosis (yes or no), current smoker (yes or no), prior hysterectomy (yes or no), current hormone therapy with intact uterus (yes or no), and current hormone therapy with prior hysterectomy (yes or no). All markers, including CA125, were associated with at least two characteristics, but no markers were associated with family history, history of endometriosis, oral contraceptive use, or nulliparity. An increasing CA125 serum level was the only marker independent of all participant characteristics. HE4 serum levels increased with age and smoking, suggesting that analysis using a longitudinal algorithm might further improve performance of HE4 as a screening modality, but serial measures were not available to perform this analysis.
HE4 was evaluated for its potential value in multimodal screening using the subset of 84 patients and 516 control subjects with TVU results available (Table 2). Positivity thresholds for HE4 and for increasing CA125 were chosen to yield 96.9% specificity, consistent with TVU specificity in this dataset. The McNemar test was used to test the hypothesis that the number of cancers detected by measuring HE4 in women with increasing CA125 is greater than the number of cancers detected by performing TVU. Using increasing CA125 levels to select women for follow-up testing, we found that 27 of 39 cancers with increasing CA125 were confirmed by measuring HE4 levels compared with 17 cancers confirmed by TVU (P = .03). Positivity of CA125 may be defined by a single threshold rule rather than by an increase; for example, in the PLCO trial, positivity was defined by a CA125 level of at least 35 U/mL (specificity = 98.8%). Using this rule to select women for a confirmatory test, we found that 26 of the 34 CA125-positive cancers were confirmed by measuring HE4 compared with 16 cancers confirmed by TVU (P = .02) (data not shown).
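The McNemar comparison of two confirmatory tests applied to the same cancers depends only on the discordant pairs, that is, cancers confirmed by one test but not the other. The text reports only the totals confirmed by each test, not the discordant counts, and does not state which variant of the test was used, so the sketch below assumes the exact two-sided binomial form and uses invented counts. The function name is hypothetical.

```python
from math import comb


def mcnemar_exact(only_first, only_second):
    """Two-sided exact McNemar P value from the discordant pair counts.

    only_first  : cancers positive on the first test but not the second
    only_second : cancers positive on the second test but not the first
    Concordant pairs do not enter the test.
    """
    n = only_first + only_second
    if n == 0:
        return 1.0
    k = min(only_first, only_second)
    # Under the null hypothesis each discordant pair is a fair coin flip.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Illustrative discordant counts; NOT taken from the study.
p_value = mcnemar_exact(only_first=9, only_second=2)
print(f"P = {p_value:.3f}")
```

Because cancers confirmed by both tests or by neither carry no information about which test is better, two studies with the same totals can give different P values if their discordant counts differ.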
Table 2 Cancers detected and specificity of increasing CA125, TVU, and HE4 alone and in combination, by tumor type and stage on the basis of 84 patients and 516 control subjects in the Prostate, Lung, Colorectal, and Ovarian trial with CA125, HE4, and TVU imaging
Measurement of HE4 had higher sensitivity in confirming all stages of type 2 EOC (includes grade 3 and 4 tumors of serous, undifferentiated, or adenocarcinoma not otherwise specified histology) and lower sensitivity in confirming early-stage type 1 EOC (all remaining tumors including grades 1 and 2 serous tumors and all clear cell, endometrioid, and mucinous histology tumors) (15) compared with TVU. We found that sensitivity is maximized by both measuring HE4 and performing TVU in the second-line screen and calling the screen positive if either confirmatory test is positive. We found that HE4 also performs better than TVU as a first-line screen. As reported in Table 2, increasing CA125 and HE4 serum levels both had higher sensitivity for EOC than did TVU when used alone as a first-line screen at the same 96.9% specificity, identifying 39 and 30 of 84 cancers, respectively, compared with 24 cancers identified by TVU.
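Comparing tests at a common specificity amounts to fixing each test's positivity threshold from the control subjects and then counting the cancers detected. The sketch below shows one simple way to pick such a threshold as an empirical quantile of control values, and converts the detection counts reported above into sensitivities. The helper names and the control values are hypothetical; only the counts out of 84 cancers come from the text.

```python
import math


def threshold_at_specificity(control_values, specificity):
    """Smallest observed cutoff such that at least the requested share of
    control subjects fall at or below it (positive means value > cutoff)."""
    ordered = sorted(control_values)
    # Number of control subjects that must test negative.
    needed = math.ceil(specificity * len(ordered))
    return ordered[max(needed, 1) - 1]


def sensitivity(detected, total):
    """Share of cancers flagged by a test."""
    return detected / total


# Hypothetical control values, only to exercise the threshold helper.
controls = list(range(1, 101))
print(threshold_at_specificity(controls, 0.969))

# Counts reported in the text: cancers detected out of 84 at 96.9% specificity.
for name, detected in [("increasing CA125", 39), ("HE4", 30), ("TVU", 24)]:
    print(f"{name}: {sensitivity(detected, 84):.1%}")
```

The reported counts correspond to first-line sensitivities of about 46.4% for increasing CA125, 35.7% for HE4, and 28.6% for TVU.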
This study used preclinical samples to retrospectively validate clinical decision rules. There are several limitations to the study, including the lack of access to serial samples and the fact that serum samples were not available from the fourth screen. Also, control subjects who underwent oophorectomy during the trial were excluded from the study, upwardly biasing estimates of specificity for screening modalities that falsely identify disease conditions other than EOC for which oophorectomy is performed. The final limitation of the study is that the PLCO trial used decades-old screening test technology, which is still in use by others because of a lack of technical advances in the field. Potential bias introduced by these limitations likely leads to conservative conclusions. Use of an automated clinical platform to measure analytes in serum, use of increasing HE4 in decision rules, and use of a morphology index (16) for TVU interpretation would likely improve overall screening performance.
Our analysis provides empiric support for the multimodal screening strategy that is being tested in the UK Collaborative Trial of Ovarian Cancer Screening. We conclude that HE4 may have a role in multimodal screening. However, measuring HE4 will not make screening itself effective. Requiring two tests both to be positive improves specificity, but use of an “and” rule can be expected to reduce sensitivity relative to use of an “or” rule. The efficacy of multimodal screening for ovarian cancer has not been demonstrated, and there are no reports showing that diagnosing HE4-associated tumors leads to better health outcomes than diagnosing TVU-associated tumors. Accordingly, these results have more utility for future research than for clinical practice. Confirmation by an independent group that measurement of HE4 outperforms TVU as both a first- and second-line screen is needed.
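The trade-off between the “and” and “or” rules can be made concrete with a small sketch. The data below are toy results invented for illustration (4 cancers followed by 6 control subjects), and the function names are hypothetical; nothing here is drawn from the PLCO samples.

```python
def combine(first, second, rule):
    """Combine two lists of binary test results with an 'and' or 'or' rule."""
    if rule == "and":
        return [a and b for a, b in zip(first, second)]
    if rule == "or":
        return [a or b for a, b in zip(first, second)]
    raise ValueError("rule must be 'and' or 'or'")


def sensitivity(results, has_cancer):
    """Fraction of cancers with a positive result."""
    hits = [r for r, c in zip(results, has_cancer) if c]
    return sum(hits) / len(hits)


def specificity(results, has_cancer):
    """Fraction of cancer-free subjects with a negative result."""
    negatives = [not r for r, c in zip(results, has_cancer) if not c]
    return sum(negatives) / len(negatives)


# Toy data, invented for illustration: 4 cancers followed by 6 controls.
has_cancer = [True] * 4 + [False] * 6
test_a = [True, True, False, False, True, False, False, False, False, False]
test_b = [True, False, True, False, False, True, False, False, False, False]

for rule in ("and", "or"):
    combined = combine(test_a, test_b, rule)
    print(rule, sensitivity(combined, has_cancer), specificity(combined, has_cancer))
```

Every subject who is positive under the “and” rule is also positive under the “or” rule, so the “and” rule can never have higher sensitivity or lower specificity than the “or” rule applied to the same two tests.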