Human epididymis protein 4 (HE4) is approved for clinical use with CA125 to predict epithelial ovarian cancer (EOC) in women with a pelvic mass or in remission after chemotherapy. Previously reported reference ranges for HE4 are inconsistent.
We report positivity thresholds yielding 90%, 95%, 98% and 99% specificity for age-defined populations of healthy women for HE4, CA125 and Risk of Malignancy Algorithm (ROMA), a weighted average of HE4 and CA125. HE4 and CA125 were measured in 1780 samples from 778 healthy women aged >25 years with a documented deleterious mutation, or aged >35 years with a significant family history. Effects on marker levels of a woman’s age, ethnicity and epidemiologic characteristics were estimated, as were the population-specific means, variances and within- and between-woman variances used to generate longitudinal screening algorithms for these markers.
CA125 levels were lower with Black ethnicity (p=0.008). Smoking was associated with higher HE4 (p=0.007) and ROMA (p<0.019). Continuous oral contraceptive use decreased levels of CA125 (p=0.041) and ROMA (p=0.012). CA125 was lower in women age ≥55, and HE4 increased with age (p<0.01), particularly among women age ≥55.
Due to the strong effect of age on HE4, thresholds for HE4 are best defined for women of specific ages. Age-specific population thresholds for HE4 for 95% specificity ranged from 41.4 pmol/L for women age 30 to 82.1 pmol/L for women age 80.
Incorporation of serial marker values from screening history reduces personalized thresholds for CA125 and HE4 but is inappropriate for ROMA.
Human epididymis protein 4 (HE4) is an epithelial ovarian cancer (EOC) serum marker that was identified originally as a candidate early detection marker. Since then it has been shown to be potentially useful for remission monitoring [2–4] and has been cleared by the U.S. Food and Drug Administration (FDA) for that use. It has also been shown to perform as well as or better than CA125 in distinguishing benign from malignant tumors in women with a pelvic mass [5–7]. The Risk of Ovarian Malignancy Algorithm (ROMA), which includes CA125 as well as HE4, has been cleared by the FDA for evaluation of a pelvic mass. HE4 has also been reported to contribute to identification of malignancy in women with nonspecific abdominal/pelvic symptoms. Most recently it has been reported to provide detectable signal about a year prior to diagnosis [10, 11], renewing interest in it as an early detection marker. We have recently reported that HE4 outperforms imaging as a second-line screen when rising CA125 is used as a first-line screen.
The efficacy of screening for EOC has not been shown. Use of both imaging and CA125 (using a threshold of 35 U/ml for all women) annually in women aged 55–74 leads to unnecessary surgery without reducing mortality. However, a multimodal strategy using rising CA125 annually to select women for imaging is yielding acceptable positive predictive value (PPV) in an efficacy trial in the U.K., where the longitudinal Risk of Ovarian Cancer Algorithm (ROCA) is used to measure rising CA125. The ROCA has not been applied to HE4. An alternative strategy for using novel markers in a longitudinal algorithm is the Parametric Empirical Bayes (PEB) longitudinal algorithm. Use of the PEB algorithm has been reported for CA125 previously [17, 18], and it can be readily adapted for use with HE4 because it is fit using data from healthy women only.
HE4 is in clinical use, but reference ranges for HE4 have been reported only very recently and are inconsistent across studies. Park et al. report the 97.5% upper reference limit for HE4 to be 33.2 pmol/L for a young Asian population, low compared to the median value of 48 pmol/L in healthy controls reported by the same authors for an Asian hospital population. Moore et al. report the upper 95th percentile of HE4 to be 89 pmol/L for premenopausal women, 128 pmol/L for postmenopausal women, and 115 pmol/L for all women, lower in pregnancy and rising with age. Other investigators have used positivity thresholds for HE4 of 70 pmol/L and 150 pmol/L in clinical validation studies.
Here we report HE4 positivity thresholds yielding 90%, 95%, 98% and 99% specificity that account for age, derived from analysis of 1780 samples from 778 healthy women aged >25 years with a documented deleterious mutation, or aged >35 years with a significant family history. We also report parameters characterizing the within- and between-woman components of variance for HE4 in order to allow researchers to calculate the thresholds for HE4 when adjusting for a woman’s screening history as well as her age using the PEB rule. Whenever the total variance of a marker is dominated by between-woman differences, the PEB decision rule will maintain overall test specificity but may achieve better sensitivity, longer lead time, and lower thresholds for the majority of women compared to a single threshold (ST) rule [18, 24]. Interpretation of serial measures of HE4 using a longitudinal algorithm has not been previously reported. For comparison we also report the PEB-relevant parameters for CA125 and ROMA in this population.
Women enrolled in the Novel Markers Trial (NMT) through February 29, 2012 were eligible for the current study if at the time of enrollment they were aged ≥25 years and reported having tested positive for a deleterious mutation in BRCA1 or BRCA2, or were aged ≥35 years with a significant family history, and if they were not diagnosed with EOC during follow-up. The NMT is a two-arm randomized multi-institutional Phase I screening trial sponsored by the National Cancer Institute. The NMT introduces HE4 as either a first- or second-line screen in a multimodal screening strategy that includes CA125, HE4 and transvaginal ultrasound (TVU). Participants completed a baseline questionnaire that provided information about ovarian cancer family history and other risk factors, and contributed up to 5 blood samples about 6 months apart. HE4 and CA125 were measured in 1780 samples from 778 healthy women on the Abbott Architect™ automated platform in a Clinical Laboratory Improvement Amendments (CLIA)-approved laboratory using FDA-approved kits. The ROMA predictive index (PI) was calculated using previously defined pre- and post-menopause formulae. We used [PI = −12.0 + 2.38 * ln(HE4) + 0.0626 * ln(CA125)] for women under age 50, and [PI = −8.09 + 1.04 * ln(HE4) + 0.732 * ln(CA125)] for the remaining women.
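The two PI formulae above can be sketched in Python as follows. This is a minimal illustration, not the trial's software: the function name, argument names and example marker values are hypothetical, and age 50 is used as the proxy for menopause exactly as described above.

```python
import math


def roma_pi(he4_pmol_l: float, ca125_u_ml: float, age_years: float,
            menopause_age: float = 50.0) -> float:
    """ROMA predictive index (PI) using the two formulae quoted in the text.

    Women under `menopause_age` get the pre-menopause formula; all other
    women get the post-menopause formula.
    """
    ln_he4 = math.log(he4_pmol_l)      # natural log of HE4 (pmol/L)
    ln_ca125 = math.log(ca125_u_ml)    # natural log of CA125 (U/ml)
    if age_years < menopause_age:
        return -12.0 + 2.38 * ln_he4 + 0.0626 * ln_ca125
    return -8.09 + 1.04 * ln_he4 + 0.732 * ln_ca125


# Hypothetical marker values, for illustration only.
pi_younger = roma_pi(he4_pmol_l=40.0, ca125_u_ml=15.0, age_years=42)
pi_older = roma_pi(he4_pmol_l=60.0, ca125_u_ml=15.0, age_years=61)
```

Because the weights differ between the two formulae, the same pair of marker values yields a different PI on either side of age 50.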
To address interpretation of single and longitudinal measures of markers, we first stratified women into four clinically relevant age strata: age<45, 45≤age<55, 55≤age<65, and age≥65. Natural-log (ln) transformed marker values were used in all analyses and then transformed back to their raw scale for reporting purposes. Because the ROMA PI is a linear combination of log-transformed CA125 and HE4 values, no additional transformations were needed and results are reported on its raw scale.
A linear regression model relating marker concentration (Y) to covariates was fit using Generalized Estimating Equations (GEE), clustering on participant identifier in order to adjust for varying numbers of observations per participant. The model is Y = β0 + β1X1 + … + βkXk + ε, where ε is a normally distributed residual with mean 0 and variance V. The coefficients were used to determine the magnitude of effect for each epidemiological covariate. The regression predictors included epidemiological covariates, main effects of the age strata, age as a continuous variable, and an interaction between age and age stratum to allow the effect of age to differ among the age-defined strata. The regression line was used to produce reference ranges for HE4, which has a strong trend with age within each age stratum. Age-defined population thresholds are calculated from the fitted values and residual standard error, based on an assumption that the residual errors are approximately normally distributed. For CA125, where levels are constant in the younger and older women, age-defined population thresholds were determined from empirically defined percentiles of the marker within those age ranges.
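The regression-based threshold calculation described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function name and the numeric inputs are hypothetical (they are not values from Table 2 or Table 3), the fitted value is taken on the natural-log scale, and the residuals are assumed to be normal with variance V.

```python
import math
from statistics import NormalDist


def population_threshold(fitted_ln: float, residual_var: float,
                         specificity: float) -> float:
    """Raw-scale threshold from a log-scale fitted value.

    threshold = exp(fitted + z * sqrt(V)), where z is the standard normal
    quantile for the desired specificity and V is the residual variance.
    """
    z = NormalDist().inv_cdf(specificity)
    return math.exp(fitted_ln + z * math.sqrt(residual_var))


# Hypothetical inputs: fitted ln(HE4) for some age, and residual variance V.
fitted = math.log(35.0)
V = 0.06
thresholds = {s: population_threshold(fitted, V, s)
              for s in (0.90, 0.95, 0.98, 0.99)}
```

Higher specificity moves the threshold further above the fitted value, and back-transforming with exp() returns it to the raw concentration scale used for reporting.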
The marker threshold for an individual woman can be personalized if screening history is known, including age and marker value at each screen. To control for screening history we used the PEB screening rule, taking advantage of its ability to accommodate covariates. We assume that each woman’s marker levels may differ systematically, on average by a constant amount, from the regression equation above, and that her marker concentrations vary around this person-specific regression line with residual variance S, where S<V. The intraclass correlation (ICC) is defined here by ICC = 1 − S/V, with 0<ICC<1. The ICC is estimated by first computing the regression residuals, Z = Y − (β0 + β1X1 + … + βkXk), then calculating the ICC using the ICC1 function of the ‘multilevel’ package of R. To approximate thresholds controlling for both age and covariates we use B to represent the ICC of Z and use Bn = nB/(1 + (n − 1)B) to represent the ICC of a sample average of n independent values of Z, denoted Z̄n [17, 18]. The PEB threshold is calculated from the regression equation and the within- and between-woman variance parameters using the following formula:
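In the notation of this section, the standard PEB threshold for a woman's next screen can be written as below. This is a reconstruction from the definitions given here, offered as an assumption; McIntosh and Urban give the authoritative form. T denotes the threshold on the log scale, and Z̄n the mean of the woman's n prior residuals.

```latex
\[
T_{n+1} \;=\; \bigl(\beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k\bigr)
\;+\; B_n\,\bar{Z}_n
\;+\; c_\alpha \sqrt{V\,\bigl(1 - B\,B_n\bigr)},
\qquad
B_n \;=\; \frac{nB}{1 + (n-1)B}
\]
```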
where cα is the standard normal quantile corresponding to the desired specificity (e.g., 1.64 for 95% specificity). See McIntosh and Urban for derivation and details. Parameters required for calculating the PEB thresholds for HE4 and CA125 are provided in the Results section. With longer screening histories Bn increases to a maximum of 1 and the intercept term equals a woman’s unique individual level. When n=0, Bn=0 and the PEB threshold defaults to the population-defined linear regression values. The person-specific residual variance also decreases as screening history accumulates, meaning that the reference range for a woman narrows over time. Theoretical results show that at comparable population-wide specificities the PEB uniformly improves upon rules that ignore screening history and improves over simpler rules that make decisions based on simple change-from-baseline [17, 18].
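The behavior described above (Bn = 0 with no history, Bn approaching 1 with long history, and a narrowing person-specific range) can be illustrated with a short Python sketch. It assumes the standard PEB form, threshold = fitted value + Bn × (mean of prior residuals) + cα × sqrt(V × (1 − B × Bn)) on the log scale; the function names and all numeric inputs are hypothetical and are not taken from Table 2 or Table 4.

```python
import math
from statistics import NormalDist, fmean


def shrinkage(n: int, icc: float) -> float:
    """Bn = n*B / (1 + (n - 1)*B); equals 0 when n = 0 and B when n = 1."""
    return n * icc / (1.0 + (n - 1) * icc)


def peb_threshold(fitted_ln: float, prior_residuals: list,
                  total_var: float, icc: float,
                  specificity: float = 0.95) -> float:
    """Raw-scale PEB threshold for the next screen (assumed standard form).

    fitted_ln       : regression prediction (log scale) for the current screen
    prior_residuals : Z values (log marker minus fitted) from earlier screens
    total_var       : V, the residual variance of the regression
    icc             : B, the intraclass correlation of the residuals
    """
    n = len(prior_residuals)
    b_n = shrinkage(n, icc)
    z_bar = fmean(prior_residuals) if n else 0.0
    c_alpha = NormalDist().inv_cdf(specificity)
    center = fitted_ln + b_n * z_bar                      # personalized mean
    offset = c_alpha * math.sqrt(total_var * (1.0 - icc * b_n))
    return math.exp(center + offset)


# Hypothetical woman whose prior values sit slightly below the regression line.
no_history = peb_threshold(math.log(50.0), [], total_var=0.09, icc=0.6)
with_history = peb_threshold(math.log(50.0), [-0.10, -0.05, -0.12],
                             total_var=0.09, icc=0.6)
```

With an empty history the function returns the population threshold; with prior values below the regression line, both the center and the offset shrink, so the personalized threshold is lower.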
Baseline characteristics of NMT participants included in this report are reported in Table 1. The majority of women were aged 45–64, Caucasian, non-Hispanic, parous, non-smoking, non-hysterectomized and non-users of hormone replacement therapy (HRT), and most did not have a prior tubal ligation. A deleterious mutation in a BRCA1 or BRCA2 gene was reported by 17.7% of women included in this study. A personal history of breast cancer was reported by 17.2% and a family history of breast cancer was reported by 75.3% of women. A family history of ovarian cancer was reported by 43.1%, and Ashkenazi Jewish lineage was reported by 19.5%, of women. Current oral contraceptive (OC) use was reported by 5.0% of women; half of these reported continuous OC use resulting in cessation of periods.
Effects of age and age-adjusted effects of other covariates on mean marker levels are reported in Table 2. We first evaluated the effect of age on the marker levels within each age stratum. CA125 levels are lower for women with age≥55 than for women with age<45 (p<0.001), but levels are constant with respect to age within these categories (p=0.49 for age<45 and p=0.797 for age≥55). Rapid changes in CA125 were found in the stratum defined by 45≤age<55, with a decline of 30% over the 10-year period (p=0.006).
For HE4 an increasing trend was found in all age strata, but the most dramatic change was found in women after age 55. Before age 45, ln(HE4) concentrations rise by an average of 0.0082 per year, or (exp(0.0082 × 10) − 1) × 100 = 8.5% per decade on the raw scale (p=0.013); after age 55 they rise 0.0158 per year faster than in the reference group, or (exp((0.0158 + 0.0082) × 10) − 1) × 100 = 27.1% per decade on the raw scale. Although the slope among women with 45≤age<55 was just 2.7%, it did not differ significantly from that of women age <45 (p=0.40), suggesting that these age categories could potentially be combined when interpreting HE4. An analysis combining these two younger periods finds HE4 with a slope of 0.00602, or 6.2% change per decade on the raw scale before age 55. We did not combine these groups in our analyses, however.
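The conversion from a per-year slope on the log scale to a percent change per decade on the raw scale is a one-line calculation. The sketch below reproduces the three figures quoted above; the function name is ours.

```python
import math


def percent_change_per_decade(ln_slope_per_year: float) -> float:
    """Percent change in the raw marker over 10 years for a given ln-scale slope."""
    return (math.exp(ln_slope_per_year * 10) - 1.0) * 100.0


before_45 = percent_change_per_decade(0.0082)            # reference stratum
after_55 = percent_change_per_decade(0.0158 + 0.0082)    # reference slope plus increment
before_55_combined = percent_change_per_decade(0.00602)  # combined younger strata

print(round(before_45, 1), round(after_55, 1), round(before_55_combined, 1))
# prints: 8.5 27.1 6.2
```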
For ROMA, which includes both CA125 and HE4, an increasing trend was apparent in the reference group (age<45, p=0.012), and the slope was higher among the older age groups, though the difference was significant only in women 45≤age<55 (p<0.005). However, because the definition of ROMA differs for women before and after menopause (age 50 here), it is difficult to interpret the reference ranges of ROMA over time. We provide them here for reference.
Among all other covariates, concentrations of CA125 were lower in women with Black ethnicity (26.6%; p=0.008) and with continuous OC use (18.9%; p=0.041). Concentrations of HE4 were higher in current smokers (21%; p=0.007). ROMA was higher in smokers (p<0.019) and lower in women reporting continuous oral contraceptive use (p=0.012).
Table 3 reports age-defined population thresholds. For CA125, where concentrations are constant within the age ranges age<45 and age≥55, a single reference range is appropriate for all women in those age ranges, and thresholds are determined empirically and without distributional assumptions. For HE4, because it changes so dramatically with age, we provide the thresholds at specific ages predicted from the regression equation. Because the thresholds are linear on the log scale, one can predict reference ranges for any intermediate age by linear interpolation on the log scale; e.g., the 95th percentile for age 65 is exp(0.5 ln(50.8) + 0.5 ln(64.6)) = 57.3 pmol/L. Note that only age is accounted for in Table 3. Applying the appropriate percent adjustment using the effect sizes from Table 2 can approximate adjustments for other covariates (e.g., a 21% increase in HE4 for a current smoker). However, because sample sizes for the significant predictors of CA125 and HE4 are small, caution is warranted: the confidence intervals around their effects are likely to be wide.
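The interpolation example above generalizes to any intermediate age. The following sketch uses a hypothetical function name and reproduces the age-65 figure from the two tabulated values quoted in the text (50.8 pmol/L at age 60 and 64.6 pmol/L at age 70).

```python
import math


def interpolate_threshold(age: float, age_lo: float, thr_lo: float,
                          age_hi: float, thr_hi: float) -> float:
    """Linear interpolation on the log scale between two tabulated thresholds."""
    w = (age - age_lo) / (age_hi - age_lo)   # weight given to the upper age
    return math.exp((1.0 - w) * math.log(thr_lo) + w * math.log(thr_hi))


he4_95_at_65 = interpolate_threshold(65, 60, 50.8, 70, 64.6)
print(round(he4_95_at_65, 1))  # prints: 57.3
```

Because the regression allows the slope of age to differ between age strata, this kind of interpolation is presumably best kept between two tabulated ages that fall in the same stratum.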
For HE4, age-specific population thresholds for 95% specificity ranged from 41.4 pmol/L for women age 30 to 82.1 pmol/L for women age 80. The magnitude of the effect of age on HE4 means and thresholds can be seen in Table 3. The 10 year span from age 30 to 40 increases the 95th percentile for HE4 by only 4 points, but a 10 year span from age 60 to 70 increases HE4 by nearly 14 points. The association with age for HE4 has been previously reported, but not the distinct behavior on either side of 55 years [12, 21, 29]. ROMA thresholds similarly increase with age. For CA125, thresholds yielding 95% specificity are 31.8 U/ml, 28.5 U/ml and 22.2 U/ml for women age <45, 45≤age<55 and age ≥55 respectively.
Incorporation of serial marker values from screening history potentially reduces thresholds for CA125 and HE4 but was found to be inappropriate for ROMA. Thresholds personalized for both covariates and screening history are obtained using the PEB algorithm for Ln(HE4) and Ln(CA125). We do not include ROMA in the PEB calculation, because it is not recommended for use with healthy women, to whom longitudinal screening algorithms will be applied, and because applying a longitudinal algorithm to ROMA as a means to combine markers is not necessarily optimal or superior to applying the PEB algorithm to CA125 and HE4 separately and defining positivity as either marker positive. The PEB parameters V and B, from the equation, are shown in Table 4; the ICC (or B) is equal to Bn when n=1. We also provide Bn for scenarios when screening history is between 0 and 6 screens. These values and the results of Table 2 can be used to calculate the threshold for any screening history for any woman. For HE4, the residual variance V and B are computed from the regression equation represented in the footnote to Table 2. For CA125, however, these values were computed separately using data within each age stratum, and parameters are reported separately for each stratum. Note that because CA125 values change so dramatically in the 45≤age<55 stratum, use of the PEB rule is not recommended during this time; we report the parameters here for completeness. The ICC for HE4 is smaller than for CA125, suggesting that its screening history is less informative than that of CA125, but its value exceeds 0.5, suggesting that personalizing thresholds may have a meaningful effect compared to ignoring screening history [18, 24].
The effectiveness of controlling for screening history is shown by the ‘% reduction in offset’ in Table 4, referring to the percentage that the “offset” in the PEB expression, which represents the deviation from the expected value that is tolerated before a screen is declared positive, was reduced. The goal of the PEB rule is to provide a narrower predicted reference for each woman while maintaining overall test specificity. The narrower reference range will facilitate earlier detection of a tumor: a reference range that is half the width can detect tumors when markers elevate half the amount. As seen in Table 4, by the 4th screen the reference range for HE4 is on average 30% narrower when controlling for screening history than when ignoring prior marker values. CA125, because of its higher ICC, can achieve an even greater reduction in reference range size. Note that the ICC compares screening rules that ignore history to those using the same marker accounting for screening history; the ICC does not measure the quality of the marker otherwise. As seen in Table 4 the incremental benefit of controlling for screening history diminishes over time. For CA125 with women age≥55 two previous screens can reduce the offset by 43% but a 49% reduction is not obtained until 6 screens.
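The diminishing incremental benefit described above can be illustrated numerically. The sketch assumes that the offset is proportional to sqrt(V × (1 − B × Bn)), as in the standard PEB form, so that the percent reduction relative to no screening history is (1 − sqrt(1 − B × Bn)) × 100. The ICC used below is a hypothetical value chosen for illustration; it is not taken from Table 4.

```python
import math


def offset_reduction_pct(n_prior: int, icc: float) -> float:
    """Percent reduction in the PEB offset after n_prior screens (assumed form)."""
    b_n = n_prior * icc / (1.0 + (n_prior - 1) * icc)   # Bn = nB / (1 + (n - 1)B)
    return (1.0 - math.sqrt(1.0 - icc * b_n)) * 100.0


HYPOTHETICAL_ICC = 0.6  # illustrative only, not a reported estimate
reductions = [offset_reduction_pct(n, HYPOTHETICAL_ICC) for n in range(0, 7)]
gains = [later - earlier for earlier, later in zip(reductions, reductions[1:])]
```

Under this assumed form each additional screen adds less than the one before it, and the reduction can never exceed (1 − sqrt(1 − B)) × 100 however long the history, which is why a marker with a higher ICC can achieve a narrower personalized range.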
Interpretation of single and serial measures of HE4 and CA125 in women at high risk for ovarian cancer can be summarized in thresholds for positivity at first and subsequent screens. We investigated characteristics that might affect population thresholds for use at first screens. For HE4 we found concentrations to depend on age, with a quite dramatic change over time after age 55. The thresholds reported by us and others (e.g., Moore et al. or Park et al.) are difficult to compare owing to differences in the ages of the populations, in addition to potential differences in race. We have shown that adequately controlling for effects of age on HE4, given its rapid rise after age 55, will require reference ranges that depend on an individual woman’s specific age. This is different from CA125, where all women in broad age categories share the same CA125 mean levels. For CA125, if 95% specificity is desired, population thresholds are 31.8 U/ml and 22.2 U/ml for women age <45 and ≥55 respectively. For HE4 it is useful to categorize women more finely because HE4 increases steadily with age. The age-specific population thresholds for HE4 for 95% specificity nearly double over the age range studied, increasing from 41.4 pmol/L for a 30-year-old woman to 82.1 pmol/L for an 80-year-old woman. Although our analysis focused on high-risk women, it is likely to be generalizable to low-risk women because most of the high-risk women in this study were at modestly increased risk, and because prior work suggests that risk does not influence marker levels.
Reference ranges for ROMA may provide a means to screen women by combining CA125 and HE4. However, although the ROMA PI was derived using statistically optimal procedures that are appropriate for use in a cross-sectional study, its optimality for use in a longitudinal study has not been shown. It may be inferior to alternative approaches such as using the markers separately and then making decisions based on whether one or both are elevated. In its current form ROMA is not appropriate for use in early detection among asymptomatic women.
A limitation of this study is that the healthy women included in the NMT were not routinely characterized for presence or absence of uterine leiomyoma, which may cause elevation in CA125. To address this possibility we reviewed medical charts as well as imaging reports for all 53 women who had NMT-protocol-indicated TVU to identify any leiomyoma occurring in study participants with elevated markers. Four such women were identified. In all four, TVU was performed due to transitory elevation in CA125. Presence of a leiomyoma appears to increase within-woman variability in CA125 and to cause occasional elevation above 99% thresholds.
Results of this study are more relevant to research than to clinical application of markers because screening for ovarian cancer is currently not recommended. It is especially important that primary care physicians understand that there is currently no evidence to support screening, especially using imaging, which is often their preferred modality. The Prostate Lung Colon and Ovary (PLCO) trial in the U.S. failed to demonstrate a reduction in EOC-specific mortality with annual screening using both CA125 and TVU concurrently. At the prevalence screen in the PLCO trial, rates of surgery were 2.6 times higher for TVU positive only than for CA125 positive only. Reports from the United Kingdom Collaborative Trial of Ovarian Cancer Screening (UKCTOCS) suggest that CA125 may outperform TVU in screening for EOC when a longitudinal algorithm is used to measure rising CA125. The sensitivity, specificity, and positive predictive value for all primary invasive EOCs identified at the prevalence screen in the UKCTOCS were 89.5%, 99.8%, and 35.1% respectively for the multimodal strategy using CA125 to select women for TVU, and 75.0%, 98.2%, and 2.8% respectively for TVU alone. While results for the CA125 arm are promising, it is not yet known whether the sequential multimodal screening strategy will reduce mortality. The clinical benefit of using HE4 and the PEB algorithm for ovarian cancer screening has also not been documented. Data on clinical outcomes of the NMT cohort are currently being gathered. HE4 and CA125 interpreted by a longitudinal algorithm may have a role to play in EOC screening trials in the future. If research were to identify markers with improved operating characteristics, the high-risk patient population (mutation carriers and those with significant pedigrees) might one day benefit.
Acknowledgments and Grant Support: We gratefully acknowledge helpful comments provided during review of the manuscript by Beth Schodin and Barry Dowell of Abbott Diagnostics. This work was supported by the Pacific Ovarian Cancer Research Consortium, Award Number P50 CA083636 from the National Institutes of Health/National Cancer Institute (NIH/NCI). Also gratefully acknowledged is support for the Translational and Outcomes Research Laboratory from NIH/NCI U01 CA152637 and the Canary Foundation, support for clinical centers from the Marsha Rivkin Center for Ovarian Cancer Research and Canary Foundation, and a grant of no-charge study materials from Abbott Laboratories. The content is solely the responsibility of the authors and does not necessarily represent official views of NIH/NCI, Canary Foundation, Marsha Rivkin Center for Ovarian Cancer Research or Abbott Laboratories.
Funding Source: National Institutes of Health/National Cancer Institute P50 CA083636 and U01 CA152637
Conflict of Interest: None of the authors listed above (Nicole Urban, Jason Thorpe, Beth Karlan, Martin McIntosh, Melanie Palomares, Mary Daly, Pam Paley, and Charles Drescher) has declared any conflict of interest with the above manuscript.