This study aimed to investigate the performance of the serum tumour markers CA125 and HE4, and of the risk stratification tool ROMA, in a prospective collection of serum samples from patients with an ovarian mass. We found a significant difference between benign and malignant disease with respect to serum CA125, HE4 and ROMA levels. When the ROC–AUCs of the different tumour markers were compared, HE4 and CA125 performed similarly, except in post-menopausal patients, in whom CA125 performed better. This similar performance of HE4 and CA125 was also noted in other studies (Hellstrom et al, 2003; Scholler et al, 2006; Palmer et al, 2008; Montagnana et al, 2009; Andersen et al, 2010). Combining HE4 and CA125 in the ROMA improved on the performance of HE4 but not on that of CA125, regardless of menopausal status. As CA125 is the current standard for comparison, this means that neither HE4 nor the ROMA improved the diagnosis of ovarian cancer. This is in contrast to the results of Moore et al (2008b), who found that a combination of CA125 and HE4 performed better than CA125 alone. However, Moore et al (2008b) excluded all borderline tumours, NEOC and metastatic tumours when calculating the performance of the tumour markers they tested. We decided not to exclude these tumours in our initial analysis, as we wanted to study a patient population that reflected a normal clinical setting. When borderline tumours, NEOC and metastatic cancers were excluded, the ROC–AUC for CA125 was 0.937 in our data vs 0.836 in the study by Moore et al (2008b). In contrast, the ROC–AUCs of HE4 were similar: 0.914 (our data) vs 0.908 (Moore et al, 2008b). In a more recent study, Moore et al (2010) also included borderline tumours in their analysis. In that study, the comparison of benign cases vs all stages of EOC and borderline tumours revealed an ROC–AUC of 0.913. Within the setting of a multicentre prospective trial with central review and monitoring, it seems plausible that a diagnostic test would perform slightly better.
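As a reminder of what the ROC–AUC values compared above measure, the following minimal sketch computes an AUC as the probability that a randomly chosen malignant case has a higher marker value than a randomly chosen benign case, with ties counted as one half. The function name and all marker values are hypothetical and serve only as an illustration; they are not study data, and this is not necessarily the software procedure used in the analysis.

```python
def roc_auc(benign, malignant):
    """AUC as P(malignant value > benign value); ties count one half."""
    wins = 0.0
    for m in malignant:
        for b in benign:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant) * len(benign))


# Hypothetical marker values, for illustration only (not study data).
benign_values = [12.0, 20.0, 35.0, 48.0, 90.0]
malignant_values = [30.0, 75.0, 120.0, 400.0]

auc = roc_auc(benign_values, malignant_values)
print(f"ROC-AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to a marker that cannot separate the two groups, and an AUC of 1.0 to perfect separation.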
Compared with CA125, HE4 is inversely influenced by age: whereas CA125 is higher in healthy pre-menopausal patients (Bon et al, 1996; Bonfrer et al, 1997), HE4 tends to be higher in post-menopausal patients (Moore et al, 2008a; Andersen et al, 2010). These slightly higher normal values influence the performance of the tumour markers concerned. Although not significant, this can also be seen in our study population: the ROC–AUC of CA125 was higher in the post-menopausal group. Of particular interest, HE4 seems to have a slightly higher ROC–AUC in the pre-menopausal group than in the post-menopausal group. Although this difference is not significant, it causes the ROC curves of CA125 and HE4 to converge in the pre-menopausal group and to diverge in the post-menopausal group. In other words, the performance of HE4 is similar to that of CA125 in the pre-menopausal group, but significantly worse in the post-menopausal group. This better performance of HE4 in the pre-menopausal group is in agreement with previous studies (Moore et al, 2008b; Andersen et al, 2010), and confirms that CA125 and HE4 function independently of each other.
Because ROC curves are not used in clinical practice, we aimed to find cutoff points for the different tumour markers. The cutoff values corresponding to the highest accuracy (minimal false-negative and false-positive results) for all patients were 62.5 U/ml for CA125, 72.2 pM for HE4 and 22.2% for ROMA. According to the product insert, 94.4% of the healthy female subjects studied (n=179) had an HE4 value of 150 pM or below. If we define the reference value as the value that includes 95% of healthy controls, and we use this as a cutoff point to minimise the false-positive rate, we obtain a sensitivity of 50.3% and a specificity of 96.5%. In clinical practice, this means that 3.5% of patients with a benign tumour will be treated as if they had a malignant tumour (overtreatment), and 49.7% of patients with a malignant tumour will be treated as if they had a benign tumour (undertreatment). Therefore, in our study, this cutoff point is not useful for differentiating benign from malignant cysts. Andersen et al (2010) also determined their cutoff at the 95th percentile in a healthy control group. On the basis of this cutoff, they obtained a sensitivity of 77.0% and a specificity of 94.9%. Unfortunately, they did not report their cutoff value. Using the cutoff point of 70 pM, as previously suggested by Moore et al (2008b), we reached a sensitivity of 74.5% and a specificity of 83.3%. This cutoff is comparable to our ideal cutoff point of 72.2 pM, and is thus a reasonable cutoff point for HE4. With regard to ROMA, different cutoff points are used in pre-menopausal and post-menopausal patients. Both suggested cutoff points are determined to provide a specificity level of 75% for the CA125 plus HE4 assay combination. Our ideal cutoff points of 16.6% for the pre-menopausal patients and 35.9% for the post-menopausal patients were somewhat different from those suggested previously. However, ours were not established at 75% specificity, but at the point on the ROC curve at which we had minimal false-negative and false-positive results. Irrespective of whether we analyse only invasive EOC, our ideal cutoff points in the pre-menopausal and post-menopausal categories are higher than the suggested cutoff points of 12.5 and 14.4%, respectively.
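The cutoff selection described above (the point with minimal false-negative plus false-positive results) and the translation of sensitivity and specificity into undertreatment and overtreatment rates can be sketched as follows. This is an illustrative sketch only: the function names and marker values are hypothetical, a value above the cutoff is assumed to count as test-positive, and the only figures taken from the text are the 50.3% sensitivity and 96.5% specificity reported for the 95th-percentile cutoff.

```python
def sens_spec(benign, malignant, cutoff):
    """Sensitivity and specificity when values above the cutoff are test-positive."""
    true_pos = sum(1 for v in malignant if v > cutoff)
    true_neg = sum(1 for v in benign if v <= cutoff)
    return true_pos / len(malignant), true_neg / len(benign)


def best_accuracy_cutoff(benign, malignant):
    """Candidate cutoff with the fewest false negatives plus false positives.

    On ties, the lowest such cutoff is kept.
    """
    best_cutoff, best_errors = None, None
    for c in sorted(set(benign) | set(malignant)):
        false_neg = sum(1 for v in malignant if v <= c)
        false_pos = sum(1 for v in benign if v > c)
        errors = false_neg + false_pos
        if best_errors is None or errors < best_errors:
            best_cutoff, best_errors = c, errors
    return best_cutoff, best_errors


# Hypothetical marker values, for illustration only (not study data).
benign_values = [40, 45, 52, 60, 68, 95]
malignant_values = [55, 80, 110, 150, 300, 620]

cutoff, errors = best_accuracy_cutoff(benign_values, malignant_values)
sensitivity, specificity = sens_spec(benign_values, malignant_values, cutoff)

# Figures reported in the text for the 95th-percentile HE4 cutoff.
reported_sensitivity, reported_specificity = 50.3, 96.5
undertreated_pct = round(100 - reported_sensitivity, 1)  # malignant, treated as benign
overtreated_pct = round(100 - reported_specificity, 1)   # benign, treated as malignant
```

With equal weight given to both error types, the chosen point depends on the case mix: a population with relatively more benign cases pushes the accuracy-maximising cutoff upwards.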
As expected, histological subtypes seem to be important for the performance of the different tumour markers. With regard to benign tumours, it was interesting to see that the fibromas/thecomas group and the endometriomas had the highest levels of CA125, whereas for HE4, the endometriomas had the lowest level. As already mentioned by Huhtinen et al (2009), measuring both CA125 and HE4 together could be of particular interest in differentiating endometriosis from ovarian cancer, as ovarian cancer will cause a raised CA125 and HE4, whereas endometriosis will only cause a raised CA125. This could explain why HE4 performs better in pre-menopausal patients compared with post-menopausal patients, and vice versa for CA125. However, even for the pre-menopausal patients, HE4 and ROMA did not perform better than CA125. All malignant tumours expressed high levels of CA125 and HE4, but the highest levels were noted for the serous subtype. High expression levels of HE4 for the different epithelial subtypes, with the exception of the mucinous subtype, were already noticed in previous studies (Lu et al, 2004; Drapkin et al, 2005; Gilks et al, 2005; Galgano et al, 2006).
Although CA125, HE4 and ROMA are not currently recommended as screening tools, it is interesting to see how well a tumour marker performs in the early stage of disease. A definite trend could be seen from stage I to stage IV disease, and CA125 and HE4 performed significantly worse in early-stage than in late-stage disease. As a consequence, the ROMA also performed worse in early-stage disease. With these ROC–AUCs, the chances that HE4 or ROMA will be successful as a screening marker are low, as very high specificities are required in screening for low-prevalence disease.
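The need for very high specificity at low prevalence follows from the positive predictive value (PPV), which can be sketched as below. The sensitivity and specificity are those reported above for HE4 at the cutoff suggested by Moore et al (2008b); they were obtained in patients with an ovarian mass, so applying them to a screening population is itself an assumption. The prevalence is a purely hypothetical value chosen for illustration and is not taken from this study.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of test-positive subjects who truly have the disease."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical prevalence (40 cases per 100,000 women screened); an assumption.
ASSUMED_PREVALENCE = 0.0004

# Sensitivity 74.5% and specificity 83.3%, as reported in the text for HE4.
ppv = positive_predictive_value(0.745, 0.833, ASSUMED_PREVALENCE)

# Same sensitivity with a hypothetical, much higher specificity for comparison.
ppv_high_spec = positive_predictive_value(0.745, 0.996, ASSUMED_PREVALENCE)

print(f"PPV at 83.3% specificity: {ppv:.2%}")
print(f"PPV at 99.6% specificity: {ppv_high_spec:.2%}")
```

Under these assumptions, fewer than 1 in 100 positive tests would correspond to a true malignancy at the reported specificity, which illustrates why a marker with good diagnostic performance can still be unsuitable for screening.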
In summary, this large independent validation study demonstrated performance indices similar to those recently published in the literature. However, in our study, neither HE4 nor ROMA increased the detection of malignant disease. Human epididymis secretory protein 4 (HE4), alone or in combination with CA125, could be useful in diagnosing certain benign or malignant subtypes; however, this needs to be explored in more detail.