Cancer Res. Author manuscript; available in PMC 2010 May 15.
Published in final edited form as:
PMCID: PMC2752375

A multicenter, double-blinded validation study of methylation biomarkers for progression prediction in Barrett’s esophagus


Esophageal adenocarcinoma risk in Barrett’s esophagus (BE) is increased 30- to 125-fold versus the general population. Among all BE patients, however, neoplastic progression occurs only once per 200 patient-years. Molecular biomarkers are therefore needed to risk-stratify patients for more efficient surveillance endoscopy and to improve the early detection of progression. We therefore performed a retrospective, multicenter, double-blinded validation study of 8 BE progression prediction methylation biomarkers. Progression or nonprogression were determined at 2 years (tier 1) and 4 years (tier 2). Methylation was assayed in 145 nonprogressors (NPs) and 50 progressors (Ps) using real-time quantitative methylation-specific PCR. Ps were significantly older than NPs (70.6 vs. 62.5 years, p < 0.001). We evaluated a linear combination of the 8 markers, using coefficients from a multivariate logistic regression analysis. Areas under the ROC curve (AUCs) were high in the 2-, 4-year and combined data models (0.843, 0.829 and 0.840; p<0.001, p<0.001 and p<0.001, respectively). In addition, even after rigorous overfitting correction, the incremental AUCs contributed by panels based on the 8 markers plus age vs. age alone were substantial (Δ-AUC = 0.152, 0.114 and 0.118, respectively) in all three models. A methylation biomarker-based panel to predict neoplastic progression in BE has potential clinical value in improving both the efficiency of surveillance endoscopy and the early detection of neoplasia.

Barrett’s esophagus (BE) is a metaplastic condition where the normal squamous epithelium of the lower esophagus is replaced by a small intestinal-like columnar lining(1). Esophageal adenocarcinoma (EAC) risk in BE is increased 30- to 125-fold relative to the general population(2), and endoscopic surveillance in BE patients is recommended at intervals of two to three years(1, 3). EACs detected in surveillance programs occur at earlier stages and have better prognoses(4, 5), but endoscopic surveillance suffers from high cost, inconvenience, patient anxiety, low yield, and procedure-related risks. In addition, the current marker of EAC risk in BE, dysplasia, is plagued by high inter-observer variability and limited predictive accuracy(6-8). Because neoplastic progression is infrequent in BE, the merits of and appropriate interval for endoscopic surveillance in BE have led to frequent debate(3, 5). This process would benefit greatly from effective biomarkers to stratify patients according to their level of neoplastic progression risk.

In 2005, we reported that hypermethylation of p16, RUNX3, and HPP1 occurs early in BE-associated neoplastic progression and predicts progression risk(9). Later, we developed a tiered risk stratification model to predict progression in BE using epigenetic and clinical features(10). We also studied methylation levels and frequencies of individual genes using real-time quantitative methylation-specific PCR (qMSP) in 259 endoscopic esophageal biopsy specimens of differing histologies. Among 10 genes evaluated, five, namely nel-like 1 (NELL1), tachykinin-1 (TAC1), somatostatin (SST), AKAP12, and CDH13, were methylated early and often in BE-associated neoplastic progression(11-15). In the above studies, methylation status and levels correlated inversely with mRNA expression levels (9-15). In light of these findings, we performed a retrospective, multicenter, double-blinded validation study of these 8 methylation biomarkers (i.e., p16, RUNX3, HPP1, NELL1, TAC1, SST, AKAP12, and CDH13) for their accuracy in predicting neoplastic progression in BE.

Materials and Methods

Definition of Barrett’s esophagus progressor and nonprogressor patients and sample collection

Progressors (Ps) and nonprogressors (NPs) were defined as described previously(10). Ps were considered both as a single combined group and in two tiers: progression within 2 years (tier 1) or 4 years (tier 2). A total of 195 BE biopsies (145 NPs and 50 Ps) were obtained from 5 participating centers: the Mayo Clinic at Rochester/Jacksonville, the University of Arizona, the University of North Carolina, and Johns Hopkins University. All patients provided written informed consent under a protocol approved by the Institutional Review Boards at their institutions. Biopsies were taken using a standardized biopsy protocol(9, 10). Clinicopathologic features are summarized in Supplementary Table 1.

Bisulfite Treatment and Real-Time Quantitative Methylation-Specific PCR (qMSP)

Bisulfite treatment was performed as described(11). Promoter methylation levels of 8 genes (p16, HPP1, RUNX3, CDH13, TAC1, NELL1, AKAP12 and SST) were determined by qMSP on an ABI 7900 Sequence Detection (Taqman) System(11). β-actin was used for normalization. Primers and probes for qMSP are described in Supplementary Table 2. A standard curve was generated using serial dilutions of CpGenome Universal Methylated DNA (CHEMICON, Temecula, CA). A normalized methylation value (NMV) for each gene of interest was defined as described(11). Wetlab analysts (ZJ and YC) and all SJM laboratory personnel were blinded to specimen P or NP status.

Data Analysis and Statistics

Associations between progression status and patient characteristics were tested using Student’s t-test or Chi-squared testing. Relationships between biomarkers and patient progression status were examined using Wilcoxon rank-sum testing.

To evaluate the predictive utility of the markers, we constructed receiver operating characteristic (ROC) curves. ROC curve analyses were first conducted on individual markers, then in combination to determine whether a panel performed better than any single marker. Our algorithm rendered a single composite score, using the linear predictor from a binary regression model justified under the linearity assumption(16). The predictive accuracy of composite scores was evaluated based on a resampling algorithm: we randomly split data into a learning set containing 2/3 and a test set including 1/3 of observations. The combination rule derived from the learning set produced two ROC curves, from the learning and test sets, respectively. Vertical differences between these two ROC curves yielded the overestimation of sensitivities at given specificities. This procedure was repeated 200 times, and these 200 differences were averaged to estimate the expected overfitting.
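The resampling procedure described above can be sketched in a few functions. This is a minimal illustration, not the authors' code (the original analyses were run in R): the function names, the plain gradient-ascent logistic fit, the stratified split, and the synthetic data are all assumptions made for the example. It fits the combination rule on a 2/3 learning set, compares sensitivity at a fixed specificity between the learning and test sets, and averages the difference over repeated splits.

```python
import math
import random


def fit_logistic(X, y, steps=300, lr=0.1):
    """Fit a logistic model by batch gradient ascent; returns [intercept, coefs...]."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            z = max(-30.0, min(30.0, score(w, xi)))  # clamp to avoid overflow
            resid = yi - 1.0 / (1.0 + math.exp(-z))
            grad[0] += resid
            for j, xj in enumerate(xi):
                grad[j + 1] += resid * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w


def score(w, x):
    """Composite score: the linear predictor of the fitted model."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))


def sens_at_spec(scores, labels, spec):
    """Sensitivity at the cutoff that keeps about `spec` of controls at or below it."""
    controls = sorted(s for s, d in zip(scores, labels) if d == 0)
    cases = [s for s, d in zip(scores, labels) if d == 1]
    k = min(len(controls) - 1, max(0, math.ceil(spec * len(controls)) - 1))
    cutoff = controls[k]
    return sum(s > cutoff for s in cases) / len(cases)


def expected_overfit(X, y, spec=0.9, n_rep=200, seed=0):
    """Average (learning minus test) sensitivity at `spec` over random 2/3 vs 1/3 splits."""
    rng = random.Random(seed)
    by_class = {c: [i for i, yi in enumerate(y) if yi == c] for c in (0, 1)}
    diffs = []
    for _ in range(n_rep):
        learn, test = [], []
        for members in by_class.values():
            idx = members[:]
            rng.shuffle(idx)
            cut = (2 * len(idx)) // 3  # stratified split: an assumption of this sketch
            learn += idx[:cut]
            test += idx[cut:]
        w = fit_logistic([X[i] for i in learn], [y[i] for i in learn])
        s_learn = sens_at_spec([score(w, X[i]) for i in learn], [y[i] for i in learn], spec)
        s_test = sens_at_spec([score(w, X[i]) for i in test], [y[i] for i in test], spec)
        diffs.append(s_learn - s_test)
    return sum(diffs) / len(diffs)


# Synthetic two-marker data, for illustration only (not study data).
demo_rng = random.Random(1)
demo_y = [1] * 20 + [0] * 40
demo_X = [[demo_rng.gauss(0.5 * d, 1.0), demo_rng.gauss(0.0, 1.0)] for d in demo_y]
print(expected_overfit(demo_X, demo_y, spec=0.9, n_rep=20))
```

The sketch evaluates the difference at a single specificity; repeating the call over a grid of specificities would give the vertical difference along the whole ROC curve.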

We also used predictiveness curves(17) to display risk distribution as a function of the combined marker in the population. This curve plots the risk associated with the $v$th quantile of the marker, $P\{D = 1 \mid Y = F^{-1}(v)\}$, against $v$, where $F(\cdot)$ is the cumulative distribution function of the marker. These plots display population proportions at different risk levels more clearly than do other metrics (like ROC curves). Since a case-control sample was studied, we used an external progression prevalence rate to calculate risk in the targeted screening population. To calibrate for future samples, a shrinkage coefficient estimated from the logistic regression model was applied to the linear predictors from which risk was calculated(18).
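As a rough illustration of how risks from a case-control model might be re-expressed for a screening population, the sketch below applies a standard intercept offset (the log-odds of the external prevalence minus the log-odds of the sample case fraction) and then orders risks against population-weighted quantiles. The text does not state which adjustment formula was used, so the offset, the placement of the shrinkage factor on the marker term only, and all function and parameter names here are assumptions.

```python
import math


def population_risk(intercept, marker_term, n_cases, n_controls,
                    prevalence, shrinkage=1.0):
    """Risk in the target population from a case-control logistic fit.

    intercept, marker_term: parts of the fitted linear predictor (log-odds).
    shrinkage: factor applied to the marker term (assumed placement).
    """
    offset = math.log(prevalence / (1.0 - prevalence)) - math.log(n_cases / n_controls)
    z = intercept + shrinkage * marker_term + offset
    return 1.0 / (1.0 + math.exp(-z))


def predictiveness_curve(risks, labels, prevalence):
    """Return (v, risk) points, with v the population-weighted quantile of each risk."""
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    points, v = [], 0.0
    for risk, d in sorted(zip(risks, labels)):
        # Each case carries weight prevalence/n_cases; each control (1-prevalence)/n_controls.
        v += prevalence / n_cases if d == 1 else (1.0 - prevalence) / n_controls
        points.append((v, risk))
    return points


# Hypothetical risks for four subjects (not study data), with 7.5% prevalence.
for v, risk in predictiveness_curve([0.20, 0.05, 0.50, 0.10], [0, 0, 1, 1], 0.075):
    print(round(v, 4), risk)
```

Reading the curve at a given risk threshold gives the estimated proportion of the population that falls below that threshold.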

All analyses were performed in R. Statistical data analysts (Y.Z., W.G., and Z.F.) were blinded to the identities of the 8 biomarkers.


Results

Clinical characteristics

Ps vs. NPs did not differ significantly by gender, body mass index, BE segment length, LGD patient percentage, family history of BE, LGD, HGD or EAC, cigarette smoking, or alcohol use; however, Ps were significantly older than NPs (70.6 vs. 62.5 years; p < 0.001, Student’s t test; Supplementary Table 1). Samples consisted of one biopsy from each of 50 Ps and 145 NPs (195 patients) in the combined model. In the 2-year model, we redefined progressors whose interval from index to final procedure exceeded 2 years as nonprogressors, yielding 36 Ps and 159 NPs. In the 4-year model, we redefined progressors whose interval from index to final procedure exceeded 4 years as nonprogressors, yielding 47 Ps and 148 NPs.

Univariate analyses

NMVs of HPP1, p16 and RUNX3 were significantly higher in Ps vs. NPs by Wilcoxon test (0.456, 0.138, and 0.104 vs. 0.273, 0.069 and 0.063; p = 0.0025, 0.0066 and 0.0002, respectively). The remaining 5 markers did not differ significantly in Ps vs. NPs (Supplementary Table 3). We further assessed the classification accuracy of single markers using ROC curve analyses. Areas under the ROC curve (AUCs) for HPP1, p16 and RUNX3 were all significantly greater than 0.50 (Supplementary Table 4).
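The link between the Wilcoxon rank-sum comparison and the single-marker AUC can be made concrete: the empirical AUC equals the proportion of progressor/nonprogressor pairs in which the progressor has the higher marker value, with ties counted as one half. The function name and the example values below are hypothetical and serve only to illustrate that calculation.

```python
def auc(case_values, control_values):
    """Empirical AUC: fraction of case/control pairs where the case is higher (ties count 1/2)."""
    wins = 0.0
    for c in case_values:
        for k in control_values:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical NMVs for one marker (not study data).
progressors = [0.35, 0.80, 0.46]
nonprogressors = [0.10, 0.40, 0.27]
print(auc(progressors, nonprogressors))
```

An AUC of 0.50 corresponds to a marker that ranks progressors above nonprogressors no more often than chance.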

Logistic regression analyses of the 8-marker panel

We then combined all 8 markers by performing logistic regression and treating them as linear predictors (Table 1, Supplementary Figure 1). All models exhibited high AUCs (0.843, 0.829 and 0.840, respectively; Table 1, Supplementary Figures 1A-1C). We performed overfitting correction based on 3-fold cross-validation and 200 bootstraps. The overfitting-corrected AUCs remained high (0.745, 0.720 and 0.732, respectively), while shrinkages from overfitting correction were modest (0.098, 0.109 and 0.108, respectively) in the three models (Table 1, Supplementary Figures 1A-1C).

Table 1
Logistic regression and overfitting correction for the 2-year, 4-year and combined models

We also explored the incremental AUC value contributed by an 8-marker-plus-age panel to that of age alone (Table 2, Supplementary Figure 1). The AUCs of the 8-marker-plus-age panels in the three models (0.858, 0.850 and 0.855, respectively) were higher than those of age alone (0.604, 0.630 and 0.635, respectively; Table 2, Supplementary Figures 1D-1F). Overfitting-corrected AUCs remained high (0.756, 0.744 and 0.753, respectively), and increments contributed by the age-plus-biomarker panel vs. age were substantial (0.152, 0.114 and 0.118, respectively) in the three models (Table 2, Supplementary Figures 1D-1F).

Table 2
Incremental values above age alone for the 2-year, 4-year and combined models
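The shrinkage and incremental values quoted in the text follow directly from the reported AUCs. The short check below recomputes them from the figures given above; the variable names are illustrative, and the model order (2-year, 4-year, combined) is taken from the abstract.

```python
models = ["2-year", "4-year", "combined"]

# 8-marker panel: AUC before and after overfitting correction.
apparent_auc = [0.843, 0.829, 0.840]
corrected_auc = [0.745, 0.720, 0.732]

# Age-plus-8-marker panel (overfitting-corrected) and age alone.
corrected_auc_age_panel = [0.756, 0.744, 0.753]
auc_age_alone = [0.604, 0.630, 0.635]

shrinkage = [round(a - c, 3) for a, c in zip(apparent_auc, corrected_auc)]
increment = [round(p - a, 3) for p, a in zip(corrected_auc_age_panel, auc_age_alone)]

for name, s, d in zip(models, shrinkage, increment):
    print(f"{name}: shrinkage={s}, increment over age={d}")
```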

Sensitivity and specificity of the 8-marker panel

While maintaining high specificity to minimize false-positive results, our model still predicted a number of new early diagnoses, i.e., diagnoses that would not have occurred earlier without the panel (Table 3). While maintaining specificity at 0.9 or 0.8, sensitivities (0.443 and 0.629 for the combined model, 0.607 and 0.721 for the 2-year model, and 0.465 and 0.606 for the 4-year model, respectively) were above or approached 50% in all three models based on the 8-marker panel alone. Furthermore, at 0.9 or 0.8 specificities, sensitivities (0.457 and 0.757 for the combined model, 0.536 and 0.786 for the 2-year model, and 0.450 and 0.724 for the 4-year model, respectively) exceeded or approached 50% in all models based on the 8-marker-plus-age panel.

Table 3
Specificity and sensitivity for 2-year, 4-year and combined models

Risk stratification of BE patients

ROC curves derived from these marker-based models were used to establish thresholds to stratify patients into risk categories. This procedure was performed to identify high-risk (HR) individuals for more frequent endoscopic screening. The threshold above which patients were classified as HR was chosen at specificity = 90%, to minimize false-positive results leading to unnecessary endoscopies (type I error). A second threshold was established to identify low-risk (LR) individuals for less frequent endoscopic screening. The threshold below which patients were classified as LR was chosen at sensitivity = 90%, to minimize false-negative results, i.e., missed HR individuals (type II error). Based on the combined P and NP classification, we classified patients as LR with a threshold that corresponded to 90% true-positives and 43% false-positives; the HR group was defined using a threshold that yielded 43% true-positives and 10% false-positives. Assuming a cumulative progression rate to HGD and/or EAC of 7.5% over 5 years(19), the corresponding negative predictive value relating to our LR threshold was 98.7% (i.e., progression risk in the LR group was 1.3%) and the positive predictive value relating to HR was 27% (i.e., progression risk in the HR group was 27%).
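The predictive values quoted above can be approximated from the stated operating points and the assumed 7.5% progression rate using Bayes' rule. The sketch below uses the rounded true-positive and false-positive fractions given in the text, so it reproduces the reported 27% and 98.7% only approximately; the function names and the three-way `risk_group` helper with its cutoffs are illustrative assumptions.

```python
def ppv(sens, spec, prev):
    """Positive predictive value from sensitivity, specificity and prevalence."""
    true_pos = sens * prev
    false_pos = (1.0 - spec) * (1.0 - prev)
    return true_pos / (true_pos + false_pos)


def npv(sens, spec, prev):
    """Negative predictive value from sensitivity, specificity and prevalence."""
    true_neg = spec * (1.0 - prev)
    false_neg = (1.0 - sens) * prev
    return true_neg / (true_neg + false_neg)


def risk_group(score, lr_cut, hr_cut):
    """Assign LR / IR / HR given two cutoffs on the composite score (hypothetical helper)."""
    if score >= hr_cut:
        return "HR"
    if score < lr_cut:
        return "LR"
    return "IR"


prev = 0.075  # assumed 5-year progression rate, as in the text

# HR threshold: 43% true-positives, 10% false-positives (specificity 0.90).
print(round(ppv(0.43, 0.90, prev), 3))  # 0.259

# LR threshold: 90% true-positives, 43% false-positives (specificity 0.57).
print(round(npv(0.90, 0.57, prev), 3))  # 0.986
```

The small gaps from the published figures are consistent with the operating points in the text being rounded to two digits.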

Predictiveness curve analyses

We used predictiveness curves (also known as risk plots) to assess the clinical utility of the combined classification rules in stratifying patients according to risk levels in the target population. To create predictiveness curves, we ordered and plotted risks from lowest to highest value. A progression rate to HGD and/or EAC of 7.5% over 5 years(19) was assumed in adjusting estimates from the case-control sample to reflect population risk and its distribution. Results are shown in Table 4 and Supplementary Figure 2. After overfitting correction, by age alone, nearly 90% of BE patients were classified as intermediate-risk (IR), whereas patients were well-stratified into low-risk (LR), IR, or high-risk (HR) categories by both the 8-marker alone and age plus 8-marker panels in all three models (Table 4, Supplementary Figure 2).

Table 4
Overfitting-corrected predictiveness curve analyses in 2-year, 4-year and combined models


Discussion

In the current study, with specificity at 0.9, sensitivities of progression prediction approached 50% based on both the 8-marker panel alone and the 8-marker-plus-age panel in all three models. These findings indicate that even while performing at high specificity, these biomarker models predicted approximately half of progressors to HGD and EAC that would not have been diagnosed earlier without using these biomarkers.

Based on age alone, with specificity at 90%, only 17.6%, 23.2% and 22.1% of progressors were predicted in the three models. However, with panels based on age plus biomarkers or on biomarkers alone, approximately 60%, 50% and 50% of progressors were accurately predicted in these three models. Predicted progressors represent patients in whom we can intercede earlier, resulting in higher cure rates. Finally, our combined risk model outperformed known risk in the general BE population (7.5% progression risk over 5 years), both in terms of negative predictive value (1.3% progression risk over 5 years for the LR group) and positive predictive value (27% progression risk over 5 years for the HR group).

Age is a common risk factor for many cancers, including EAC(20). In the current study, Ps were significantly older than NPs, and the AUCs of age alone were 0.604, 0.630 and 0.635, respectively in the three models, suggesting that age per se predicts neoplastic progression in BE. However, methylation of tissues increases with aging, even in the absence of neoplastic progression (21-22). Thus, aging may exert risk on progression either independently, or through its influence on methylation. Nevertheless, the incremental prediction accuracy (above age) contributed by the 8-marker panel was substantial in all three models.

Thus, the current findings suggest that this 8-marker panel is more objective and quantifiable and possesses higher predictive sensitivity and specificity than do clinical features, including age. Furthermore, although age was a good classifier for disease progression, predictiveness curves revealed that age did not successfully stratify BE patients according to their progression risk. Moreover, age per se is not an accepted risk marker on which to base clinical decisions regarding surveillance interval or neoplastic progression risk in BE. In contrast, models based on both the 8-marker panel and the age-plus-8-marker panel provided estimated progression risks either close to 0 (i.e., LR) or between 0.1 and 0.5 (i.e., IR) in the majority of individuals, suggesting that these markers exerted a substantial impact on risk category. This finding also suggests that in clinical practice, separate thresholds can be chosen to define high, intermediate, and low risk, based on predictiveness curves.

In conclusion, we have developed a risk stratification strategy to predict neoplastic progression in BE patients based on an 8-marker tissue methylation panel. At high specificity levels, this model accurately predicted approximately half of HGDs and EACs that would not have otherwise been predicted. This model is expected to reduce endoscopic procedures performed in BE surveillance while simultaneously increasing detection at earlier stages. Future studies should explore additional potentially predictive methylation targets, along with alternative means of assessing methylation biomarkers (such as immunohistochemical staining for reduced biomarker expression). Thus, these findings suggest that a methylation biomarker panel offers promise as a clinically useful tool in the risk stratification of BE patients.



We recognize the key contributions of Ms. Kim Nicolini to the design and execution of this study.

Supported by CA085069, CA001808, CA106763, CA95060, CA062924, CA106991, and the Early Detection Research Network.


References

1. Cameron AJ. Management of Barrett’s esophagus. Mayo Clin Proc. 1998;73:457–61.
2. Hameeteman W, Tytgat GN, Houthoff HJ, van den Tweel JG. Barrett’s esophagus: development of dysplasia and adenocarcinoma. Gastroenterology. 1989;96:1249–56.
3. Wang KK, Sampliner RE. Updated guidelines 2008 for the diagnosis, surveillance and therapy of Barrett’s esophagus. Am J Gastroenterol. 2008;103:788–97.
4. Streitz JM, Andrews CW, Ellis FH. Endoscopic surveillance of Barrett’s esophagus. Does it help? J Thorac Cardiovasc Surg. 1993;105:383–8.
5. Corley DA, Levin TR, Habel LA, Weiss NS, Buffler PA. Surveillance and survival in Barrett’s adenocarcinomas: a population-based study. Gastroenterology. 2002;122:633–40.
6. Montgomery E, Bronner MP, Goldblum JR, et al. Reproducibility of the diagnosis of dysplasia in Barrett esophagus: a reaffirmation. Hum Pathol. 2001;32:368–78.
7. Montgomery E, Goldblum JR, Greenson JK, et al. Dysplasia as a predictive marker for invasive carcinoma in Barrett esophagus: a follow-up study based on 138 cases from a diagnostic variability study. Hum Pathol. 2001;32:379–88.
8. Alikhan M, Rex D, Khan A, Rahmani E, Cummings O, Ulbright TM. Variable pathologic interpretation of columnar lined esophagus by general pathologists in community practice. Gastrointest Endosc. 1999;50:23–6.
9. Schulmann K, Sterian A, Berki A, et al. Inactivation of p16, RUNX3, and HPP1 occurs early in Barrett’s-associated neoplastic progression and predicts progression risk. Oncogene. 2005;24:4138–48.
10. Sato F, Jin Z, Schulmann K, et al. Three-tiered risk stratification model to predict progression in Barrett’s esophagus using epigenetic and clinical features. PLoS ONE. 2008;3:e1890.
11. Jin Z, Mori Y, Yang J, et al. Hypermethylation of the nel-like 1 gene is a common and early event and is associated with poor prognosis in early-stage esophageal adenocarcinoma. Oncogene. 2007;26:6332–40.
12. Jin Z, Olaru A, Yang J, et al. Hypermethylation of tachykinin-1 is a potential biomarker in human esophageal cancer. Clin Cancer Res. 2007;13:6293–300.
13. Jin Z, Mori Y, Hamilton JP, et al. Hypermethylation of the somatostatin promoter is a common, early event in human esophageal carcinogenesis. Cancer. 2008;112:43–9.
14. Jin Z, Hamilton JP, Yang J, et al. Hypermethylation of the AKAP12 promoter is a biomarker of Barrett’s-associated esophageal neoplastic progression. Cancer Epidemiol Biomarkers Prev. 2008;17:111–7.
15. Jin Z, Cheng Y, Olaru A, et al. Promoter hypermethylation of CDH13 is a common, early event in human esophageal adenocarcinogenesis and correlates with clinical risk factors. Int J Cancer. 2008;123:2331–6.
16. McIntosh MW, Pepe MS. Combining several screening tests: optimality of the risk score. Biometrics. 2002;58:657–64.
17. Pepe MS, Feng Z, Huang Y, et al. Integrating the predictiveness of a marker with its performance as a classifier. Am J Epidemiol. 2008;167:362–8.
18. Van Houwelingen J, Le Cessie S. Predictive value of statistical models. Stat Med. 1990;9:1303–25.
19. Wani S, Choi W, Sharma P. Low-grade dysplasia in Barrett’s esophagus - an innocent bystander? Pro. Endoscopy. 2007;39:643–6.
20. Lagergren J. Adenocarcinoma of oesophagus: what exactly is the size of the problem and who is at risk? Gut. 2005;54(Suppl 1):i1–5.
21. Ahuja N, Issa JP. Aging, methylation and cancer. Histol Histopathol. 2000;15:835–42.
22. Ahuja N, Li Q, Mohan AL, Baylin SB, Issa JP. Aging and DNA methylation in colorectal mucosa and cancer. Cancer Res. 1998;58:5489–94.