Cancer staging determines extent of disease, facilitating prognostication and treatment decision making. The American Joint Committee on Cancer (AJCC) TNM classification system is the most commonly used staging algorithm for colon cancer, categorizing patients on the basis of only these three variables (tumor, node, and metastasis). The purpose of this study was to extend the seventh edition of the AJCC staging system for colon cancer to incorporate additional information available from tumor registries, thereby improving prognostic accuracy.
Records from 128,853 patients with primary colon cancer reported to the Surveillance, Epidemiology and End Results Program from 1994 to 2005 were used to construct and validate three survival models for patients with primary curative-intent surgery. Independent training/test data sets were used to develop and test alternative models. The seventh edition TNM staging system was compared with models supplementing TNM staging with additional demographic and tumor variables available from the registry by calculating a concordance index, performing calibration, and identifying the area under receiver operating characteristic (ROC) curves.
Inclusion of additional registry covariates improved prognostic estimates. The concordance index rose from 0.60 (95% CI, 0.59 to 0.61) for the AJCC model, with T- and N-stage variables, to 0.68 (95% CI, 0.67 to 0.68) for the model including tumor grade, number of lymph nodes collected, number of metastatic lymph nodes, age, and sex. ROC curves for the extended model had higher sensitivity, at all values of specificity, than the TNM system; calibration curves indicated no deviation from the reference line.
Prognostic models incorporating readily available data elements outperform the current AJCC system. These models can assist in personalizing treatment and follow-up for patients with colon cancer.
The purpose of cancer staging systems is to convey prognosis to clinicians and patients in a clear, succinct manner. Current algorithms categorize patients on the basis of a parsimonious set of data elements available from pathologic reports. Survival estimates for each group are used to determine clinical trial eligibility, inform treatment decisions, and develop surveillance schedules. Survival estimates enhance a patient's ability to make informed choices and life plans. The majority of solid tumors, including colon cancer, are staged according to a system devised by the American Joint Committee on Cancer (AJCC) and the International Union Against Cancer. The AJCC classification system categorizes patients on the basis of anatomic depth of penetration of tumor into the intestinal wall (T stage) and nodal status (N stage) as well as presence of distant metastasis (M stage). Although the system is easy to implement, survival estimates for patients with the same AJCC stage vary considerably because other factors influence prognosis. Recognizing the limitations of current algorithms, the AJCC issued a request for proposals to develop alternative staging algorithms based on readily “available information beyond classical Tumor Node Metastases (TNM) staging.”1
Cancer registries routinely collect patient demographics as well as information about tumor characteristics such as the specific number of involved lymph nodes. Yet, only TNM information is included in the AJCC schema. We sought to determine whether prognostic models that leverage the full complement of routinely available data elements could more accurately predict overall survival. We constructed nomograms to convey survival estimates after curative-intent colon cancer surgery based on these models.
The Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute was our source of population-based data for model development and testing. The SEER Program tracks cancer incidence and survival data using 14 population-based cancer registries and three supplemental registries covering approximately 26% of the US population. Overall, SEER registries include 23% of African Americans, 40% of Hispanics, 42% of American Indians and Alaska Natives, 53% of Asians, and 70% of Hawaiian/Pacific Islanders with cancer.2
Of 278,023 patients diagnosed with colon cancer between 1994 and 2005 in a SEER region, 128,853 had curative-intent resection of a first primary adenocarcinoma of the colon. We further restricted analyses to patients for whom complete staging information, including locoregional lymph node examination, was reported (Fig 1).
The primary outcome was overall survival. Patients were observed from diagnosis until death or through December 2005. Cox proportional hazards regression was used to predict survival probabilities. Five-year probabilities, along with 95% CIs, were reported, although the model has the flexibility to predict earlier time points as well.
Variables that have previously been shown to influence disease-specific and overall survival after colon cancer and that are reliably measured and routinely available from tumor registry data were incorporated into multivariable models. These include the T and N elements as well as age, sex, tumor differentiation/grade, and number of regional lymph nodes evaluated (Table 1).
Three nomograms were created with prespecified variables and using multivariable regression with Cox proportional hazards modeling.3–5 In implementing this approach, we followed the guidelines for developing models set forth by Iasonos et al,6 as published in the Journal of Clinical Oncology. To reduce over-fitting and upwardly biased estimates of performance, the models were constructed using a training set and evaluated using a test set. To identify a training and a test cohort, the entire cohort was randomly divided in a two-to-one manner (training and test sets) and stratified by year; this was done to ensure that any potential imbalance based on differences in treatment over time would not bias the evaluation. The proportional hazards assumption was carefully verified by tests of correlations with time and examination of residual plots. To allow flexibility in representing nonlinear covariate effects on outcome, continuous variables, such as age at diagnosis, were modeled using restricted cubic splines.7
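The two-to-one random split stratified by year can be illustrated with a minimal sketch. It is not the authors' code: the function name `stratified_split`, the record layout, and the toy cohort of 30 patients per diagnosis year are all hypothetical and serve only to show how splitting within each year keeps the training and test sets balanced over time.

```python
import random
from collections import defaultdict


def stratified_split(records, key, train_fraction=2 / 3, seed=0):
    """Randomly split records into training and test sets within each stratum."""
    rng = random.Random(seed)

    # Group the records by stratum (here, year of diagnosis).
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)

    train, test = [], []
    for stratum in sorted(strata):
        shuffled = strata[stratum][:]
        rng.shuffle(shuffled)
        # About two thirds of each stratum go to the training set.
        cut = round(len(shuffled) * train_fraction)
        train.extend(shuffled[:cut])
        test.extend(shuffled[cut:])
    return train, test


# Hypothetical toy cohort: 30 patients for each diagnosis year from 1994 to 2005.
cohort = [
    {"id": f"{year}-{i}", "year": year}
    for year in range(1994, 2006)
    for i in range(30)
]
train_set, test_set = stratified_split(cohort, key=lambda r: r["year"])
print(len(train_set), len(test_set))
```

Because the split is made separately within each year, every year contributes the same two-to-one ratio to the two sets, so a change in treatment practice over time cannot be concentrated in one of them by chance.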
Each of the three multivariable models incorporated an increasing number of variables. The first, most basic model was a version of the AJCC staging algorithm that included only its T- and N-stage elements. The second model included number of lymph nodes examined, as well as number of examined lymph nodes containing tumor, instead of the N-stage element. The third model included the elements in the second model, as well as tumor differentiation, patient age, and sex. These variables were chosen a priori on the basis of their well-established independent associations with overall survival8 and their availability in the SEER registry.
Models were compared with one another and with the AJCC TNM system by assessing concordance index and calibration curves on the test set. Concordance probability is a measure of discrimination,7 and its interpretation is similar to that of the area under the receiver operating characteristic (ROC) curve.9 In essence, it represents the probability that, given two randomly selected patients, the patient who dies first had a lower estimated probability of survival. The concordance probability for each model was estimated using a recently proposed method that yields consistent estimates with censored data.10 In addition to measuring their discriminative capacity, each of the three models was evaluated with calibration curves in which predicted versus observed outcomes are graphically depicted, allowing further comparison of their accuracy in estimating prognosis.11,12 Time-dependent ROC curve analysis was also used, employing recently developed methods13,14 to assess the final model's discriminatory power at 5 years.
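The pairwise definition of concordance given above can be made concrete with a short sketch. It uses a simple Harrell-style pairwise count, which is an assumption made for illustration and not the estimator cited in the text; the function name `concordance_index` and the five-patient data set are hypothetical.

```python
def concordance_index(times, events, predicted_survival):
    """Harrell-style C: share of usable pairs that the model orders correctly."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable when patient i is observed to die before
            # patient j's last follow-up time.
            if events[i] and times[i] < times[j]:
                usable += 1
                if predicted_survival[i] < predicted_survival[j]:
                    concordant += 1.0  # earlier death had lower predicted survival
                elif predicted_survival[i] == predicted_survival[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / usable


# Hypothetical toy data: follow-up in years, 1 = died, 0 = censored.
times = [2, 4, 6, 8, 10]
events = [1, 1, 0, 1, 0]
predicted = [0.3, 0.6, 0.5, 0.7, 0.9]  # predicted 5-year survival

c_index = concordance_index(times, events, predicted)
print(c_index)  # 7 of 8 usable pairs are concordant: 0.875
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why the reported indices can be read in much the same way as an area under the ROC curve.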
Cohort demographics and tumor-related characteristics are described in Table 1. Kaplan-Meier overall survival curves for the entire population, according to the AJCC classification schema (seventh edition), are shown in Figure 2. As seen in previous staging iterations, a rank-ordered hierarchical relationship between advancing stage and overall survival is not present. Overall survival by stage, in decreasing order, is as follows: I, IIIA, IIA, IIB, IIIB, IIC, IIIC.
Each of the three prognostic models and their associated nomograms are shown in Figure 3. Model performance improved as more detailed pathologic and demographic variables were used. The concordance index for the simplest nomogram (Fig 3A), based only on T and N elements, was 0.61 (95% CI, 0.60 to 0.62). When the number of lymph nodes examined and the number of examined lymph nodes containing metastasis were substituted for the N-staging element (Fig 3B), the concordance index rose to 0.63 (95% CI, 0.62 to 0.64). Finally, when pathologic tumor differentiation and demographic variables of age and sex were added to the model (Fig 3C), the concordance index increased to 0.68 (95% CI, 0.67 to 0.68). The concordance index for the model that used only the variables in the seventh edition of the TNM staging system was 0.60 (95% CI, 0.59 to 0.61).
The calibration curve for the highest-performing model, which included detailed pathologic and demographic variables, is shown in Figure 4A. It reveals no deviations from the reference line and no need for recalibration. The 5-year ROC curve is shown in Figure 4B. This model discriminates well at all thresholds of predicted probability. Three particular thresholds are marked on the figure: 0.25, 0.30, and 0.40.
Cancer staging provides critical information on prognosis to patients and clinicians, guiding therapy. The majority of solid malignancies, including colon cancer, are staged according to the AJCC system, which groups patients on the basis of anatomic variables. As therapeutic options have expanded, more refined and accurate predictions of survival are needed to direct treatment decision making. Individualized prognostication empowers patients, enhancing their ability to make informed and meaningful choices.
We extended the AJCC TNM schema to include data elements routinely available from tumor registry data. With the addition of just a few variables, we were able to generate a prognostic model with performance superior to that of the TNM system. Because it is based on SEER data, the model can be regularly updated.
In an attempt to improve colon cancer staging, the AJCC revised and published the seventh edition of the TNM anatomic classification scheme in 2009.16,17 The major change in the seventh edition is the creation of additional substages to help refine prognostic groups. The staging elements remain the same: depth of tumor invasion into the colon wall (T stage) and number of lymph nodes involved by metastatic disease (N stage). In the new edition, N1 nodal disease is subdivided into three groups (N1a, N1b, and N1c) and N2 disease into two groups (N2a and N2b), resulting in three substages each for stage II and III colon cancer (IIA, IIB, IIC, and IIIA, IIIB, IIIC, respectively). Although prognostication improves within each stage, there is a loss of the clear rank ordering that is the hallmark of categorical staging systems (Fig 2). Thus stage IIA (T3N0) patients have inferior survival compared with stage IIIA (T1-2N1) patients; stage IIC (T4N0) patients fare worse than stage IIIB patients (T1-2N2). Without clear improvement in prognostication, the added complexity of multiple subcategories seems to detract from the simple AJCC scheme, which was based on the work of Dukes18,19 60 years ago. A need for more individualized staging that uses a greater number of demographic and clinicopathologic variables known to impact cancer survival is readily apparent.
In an attempt to develop more personalized prognostication, we previously created a colon cancer nomogram for predicting recurrence, based on assessment of patients treated at a specialty cancer center.20 The model had greater predictive accuracy than the TNM staging system (sixth edition) and included multiple prognostic factors, both continuous and discrete, as well as nonlinear and complex mathematical relationships. This model functions well; however, it is based on a specific population treated at a specialty center and uses variables that are not readily available in all clinical settings.
To develop prognostic models generalizable to the population at large, we sought to improve on the AJCC prognostic classification schema using variables readily available from tumor registries such as SEER. The most basic model included the T and N elements and performed similarly to the categorical seventh edition AJCC staging system. We improved on this by removing the N-stage element and replacing it with number of lymph nodes evaluated and number of metastatic lymph nodes. The number of lymph nodes examined has been shown to be correlated with outcome in many studies,21,22 and using the actual number of positive lymph nodes rather than categories (ie, N0, N1, or N2) further improves model performance. The best-performing model includes tumor- and patient-related variables such as T category, number of positive and negative lymph nodes, tumor grade, patient age, and sex. ROC curve analysis, concordance index, and calibration plots confirm the predictive superiority of this model over the AJCC (seventh edition) TNM system.
A more individualized prognostic scheme, as developed in this study, may have its greatest impact on the use of adjuvant therapy after colon cancer resection. Six months of cytotoxic chemotherapy is advised for patients who are believed to harbor occult metastatic disease.23 These high-risk patients are commonly defined by the TNM staging system as having nodal metastasis, and the postoperative chemotherapeutic strategy has been successful in reducing cancer-related mortality by 30% to 50%.24,25 However, using node positivity as the sole determinant for adjuvant therapy ignores the fact that as many as 25% of patients with node-negative disease eventually experience recurrence. Similarly, not all patients with node-positive disease have poor outcome.20,26 Thus a more refined, risk-adapted strategy would assist in selection of patients to receive additional treatment and would also facilitate choice of treatment regimen.
Risk groups can be defined by models, which is useful in clinical trial stratification and for comparison of results across trials or in meta-analyses. Using ROC curves, risk groups are easily discerned. For example, using an arbitrary 0.30 probability of death at 5 years as a cutoff, patients can be segregated into high- and low-risk groups with a sensitivity of 0.72 and a specificity of 0.64. Decreasing the cutoff threshold to 0.25 increases sensitivity but lowers specificity (0.81 and 0.52, respectively); increasing the threshold to 0.40 reduces sensitivity but raises specificity (0.53 and 0.80, respectively). Depending on relative levels of acceptable false-positive and false-negative rates, one can find a threshold that maximizes the accuracy of classification. In this way, the models offer flexibility in choosing a threshold for defining risk groups, whereas a categorical staging system has fixed thresholds that cannot be tailored to the application at hand.
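The trade-off between sensitivity and specificity as the cutoff moves can be sketched in a few lines. This is a simplified illustration that treats 5-year vital status as fully observed, whereas the analysis in this article used time-dependent ROC methods that account for censoring; the function name `sensitivity_specificity` and the eight-patient data set are hypothetical and do not reproduce the reported values.

```python
def sensitivity_specificity(predicted_risk, died, threshold):
    """Classify patients as high risk when predicted risk >= threshold."""
    pairs = list(zip(predicted_risk, died))
    true_pos = sum(1 for p, d in pairs if d and p >= threshold)
    false_neg = sum(1 for p, d in pairs if d and p < threshold)
    true_neg = sum(1 for p, d in pairs if not d and p < threshold)
    false_pos = sum(1 for p, d in pairs if not d and p >= threshold)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical toy data: predicted probability of death at 5 years and
# observed status at 5 years (1 = died, 0 = alive).
risk = [0.10, 0.20, 0.27, 0.28, 0.35, 0.45, 0.50, 0.60]
died = [0, 0, 1, 0, 0, 1, 1, 1]

for cutoff in (0.25, 0.30, 0.40):
    sens, spec = sensitivity_specificity(risk, died, cutoff)
    print(f"cutoff {cutoff:.2f}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Lowering the cutoff labels more patients as high risk, so sensitivity can only stay the same or rise while specificity can only stay the same or fall; raising it has the opposite effect.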
An important application for creating risk groups is patient selection/enrichment for clinical trials. As an example, suppose that, in a randomized clinical trial of adjuvant therapy for patients with stage II disease, the treatment is anticipated to increase 5-year survival from approximately 76% to 80%, corresponding to a hazard ratio of 0.813.27 A total of 740 events would be required for 80% power, which could be met by enrolling 3,400 patients and following them for a median of 5 years. However, if the investigators use a nomogram-predicted probability of death at 5 years of at least 0.25 as an inclusion criterion, only 2,000 patients would need to be enrolled to observe the same number of events with median 5-year follow-up. In this example, the nomogram reduces the sample size requirement by approximately 40%. This approach has been used in prospective randomized trials28 and would be valuable in colon cancer, where the benefit of adjuvant therapy for node-negative disease has not been established.29,30
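The arithmetic in this example can be checked with a short sketch. The hazard ratio follows from the proportional hazards assumption; the event count uses Schoenfeld's approximation for a 1:1 randomized trial with a two-sided significance level of 0.05, both of which are assumptions because the article does not state the formula or significance level behind the figure of 740.

```python
import math
from statistics import NormalDist

# Under proportional hazards, S_treated(t) = S_control(t) ** HR,
# so HR = ln(S_treated) / ln(S_control).
control_survival = 0.76
treated_survival = 0.80
hazard_ratio = math.log(treated_survival) / math.log(control_survival)

# Schoenfeld's approximation for the number of events with 1:1 allocation.
# Assumed: two-sided alpha of 0.05 (not stated in the article), power of 0.80.
alpha, power = 0.05, 0.80
z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
z_power = NormalDist().inv_cdf(power)
events_needed = 4 * (z_alpha + z_power) ** 2 / math.log(hazard_ratio) ** 2

# Expected deaths by 5 years among 3,400 unselected patients, half per arm.
expected_events_unselected = 3400 * ((1 - control_survival) + (1 - treated_survival)) / 2

# Pooled 5-year death rate an enriched cohort of 2,000 would need for 740 events.
required_event_rate_enriched = 740 / 2000

# Relative reduction in enrollment from 3,400 to 2,000 patients.
enrollment_reduction = 1 - 2000 / 3400

print(f"hazard ratio: {hazard_ratio:.3f}")
print(f"events needed (Schoenfeld): {events_needed:.0f}")
print(f"expected events, 3,400 unselected: {expected_events_unselected:.0f}")
print(f"required event rate, 2,000 enriched: {required_event_rate_enriched:.2f}")
print(f"enrollment reduction: {enrollment_reduction:.0%}")
```

Under these assumptions the approximation gives roughly 730 events, in the same range as the 740 quoted, and the enriched cohort would need a pooled 5-year death rate of about 0.37, which is plausible only if enrollment is restricted to patients at higher predicted risk.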
Additionally, improved survival estimates based on nomograms would enhance the development of rational surveillance schedules after curative-intent surgery. The National Comprehensive Cancer Network recommends yearly computed tomography for patients at high risk of recurrence.31 As noted before, defining high risk exclusively with AJCC staging criteria is limited; some patients with stage II disease fare better than some stage III patients, and not all patients with stage III disease have uniformly poor outcome. This nomogram would enable patients and physicians to create a more individualized surveillance program based on prognosis.
The strengths of our prognostic models include the large, representative, population-based cohort and the fact that the included data elements are widely available from tumor registries. The outcome measure of overall survival is consistently and reliably determined, is of greatest interest to patients, and is most commonly used to develop staging schemes. Although models that predict risk of recurrence would also be valuable in treatment decision making, recurrence is not reliably captured and is subject to ascertainment bias based on surveillance intervals. Cancer-specific survival would be an alternative metric, but determining cause of death is unreliable in population databases such as SEER. Relative survival, defined as the ratio of the proportion of observed survivors in a cohort of patients with cancer to the proportion of expected survivors in a comparable cohort of cancer-free individuals, is commonly used as a surrogate for cancer-specific survival, but it cannot be used when patient-specific data points are necessary for model development; moreover, this statistical construct depends on the attributes of the reference population and is not transparent. For these reasons, we have developed prognostic models for population-based cohorts that consider overall survival as the primary outcome.
Physicians vary considerably in the extent to which they communicate with their patients about prognosis or review this information themselves. Although nearly all clinicians are aware of prognostic factors in a general sense, relatively few rely on explicit models in clinical visits. Certainly, clinicians use the prognostic factors included, informally discussing survival and the potential benefit of adjuvant chemotherapy with patients.29,32 However, explicitly defined models provide a more systematic approach in conveying this information to patients and physicians alike. A web-based interface affords easy access to and use of these tools, allowing clinicians and patients to view the most appropriate nomogram based on available data. In all cases, estimation of survival after colon cancer surgery is provided with 95% confidence intervals.15
In summary, we have developed prognostic models that incorporate data elements routinely available from tumor registries. For colon cancer, the models we have developed improve on the traditional categorical TNM system. Such an individualized approach to prognostication will aid in the development of risk-adaptive therapies and rational follow-up schedules and will assist patients in planning for their future.
Supported in part by a grant from the American Joint Committee on Cancer (M.R.W.) and a grant from the Society of Memorial Sloan-Kettering Cancer Center (M.R.W.).
Authors' disclosures of potential conflicts of interest and author contributions are found at the end of this article.
The author(s) indicated no potential conflicts of interest.
Conception and design: Martin R. Weiser, Mithat Gönen, Deborah Schrag
Collection and assembly of data: Martin R. Weiser, Joanne F. Chou, Deborah Schrag
Data analysis and interpretation: All authors
Manuscript writing: All authors
Final approval of manuscript: All authors