J Clin Oncol. 2009 February 10; 27(5): 686–693.
Published online 2008 December 29. doi:  10.1200/JCO.2008.17.4797
PMCID: PMC2645090

Colorectal Cancer Risk Prediction Tool for White Men and Women Without Known Susceptibility

Abstract

Purpose

Given the high incidence of colorectal cancer (CRC), and the availability of procedures that can detect disease and remove precancerous lesions, there is a need for a model that estimates the probability of developing CRC across various age intervals and risk factor profiles.

Methods

The development of separate CRC absolute risk models for men and women included estimating relative risks and attributable risk parameters from population-based case-control data separately for proximal, distal, and rectal cancer and combining these estimates with baseline age-specific cancer hazard rates based on Surveillance, Epidemiology, and End Results (SEER) incidence rates and competing mortality risks.

Results

For men, the model included a cancer-negative sigmoidoscopy/colonoscopy in the last 10 years, polyp history in the last 10 years, history of CRC in first-degree relatives, aspirin and nonsteroidal anti-inflammatory drug (NSAID) use, cigarette smoking, body mass index (BMI), current leisure-time vigorous activity, and vegetable consumption. For women, the model included sigmoidoscopy/colonoscopy, polyp history, history of CRC in first-degree relatives, aspirin and NSAID use, BMI, leisure-time vigorous activity, vegetable consumption, hormone-replacement therapy (HRT), and estrogen exposure on the basis of menopausal status. For men and women, relative risks differed slightly by tumor site. A validation study in independent data indicates that the models for men and women are well calibrated.

Conclusion

We developed absolute risk prediction models for CRC from population-based data, and a simple questionnaire suitable for self-administration. This model is potentially useful for counseling, for designing research intervention studies, and for other applications.

INTRODUCTION

Colorectal cancer (CRC) is the third most commonly diagnosed cancer and the third leading cause of cancer death in the United States. During 2008, an estimated 148,810 new cases of CRC will be diagnosed, and 49,960 persons will die as a result of the disease.1 Approximately one in 18 persons in the United States will develop CRC during his or her life.

Currently, several CRC screening strategies are available, including the fecal occult blood test (FOBT), double-contrast barium enema, flexible sigmoidoscopy, colonoscopy, virtual colonography, and combinations of these tests. Many of these strategies have been shown to be effective for reducing CRC incidence and mortality.2,3

Given the high incidence of CRC, its significant cost to society, and the availability of screening tests, a model that estimates an individual's probability of developing CRC using risk factor information that can be obtained easily in a clinical setting may aid physicians and their patients in deciding on screening regimens, and can also be useful in designing chemoprevention and screening intervention trials.4

Several risk and protective factors for CRC have been consistently identified in epidemiologic studies, including physical activity, cigarette smoking, and body mass index (BMI).5,6 However, no quantitative risk model that takes competing causes of death into account is currently available to estimate the absolute risk or probability of developing CRC. Existing models are qualitative and based on expert opinion,7 or applicable only to special populations.8-10 We present a model that, given a set of risk and protective factors and age, predicts the absolute risk of developing CRC over a given time period, accounting for competing causes of death. We used data from two population-based case-control studies to assess risk or protective factors. After describing the study populations, we describe the development of the CRC absolute risk model and give examples of risk estimates for various combinations of factors. We also present a short, self-administered risk assessment questionnaire that can be used to obtain information about risk factors for the model.

METHODS

Study Populations Used to Estimate Relative Risk

We used data from two population-based case-control studies, one for colon cancer11-13 and one for rectal cancer,14-16 to estimate relative risks (RRs) of CRC. Controls for both case-control studies were matched to cases on sex and 5-year age groups. These studies were conducted by investigators at the Universities of Utah (Salt Lake City, UT), Minnesota (Minneapolis, MN), and the Kaiser Permanente Medical Care Program (KPMCP) of Northern California (Oakland, CA). The Appendix (online only) provides additional details.

We restricted our analyses to non-Hispanic white men and women age 50 years and older, who comprised the majority of participants in both studies. We differentiated proximal (cecum through transverse colon), distal (splenic flexure, descending, and sigmoid colon), and rectal (rectosigmoid junction and rectum) tumor sites because incidence rates for these sites differ dramatically by age, and because risk factors and prior screening may have different effects on each site (Fig 1).

Fig 1.
Colorectal cancer incidence by tumor site for white non-Hispanic men and women (13 Surveillance, Epidemiology, and End Results [SEER] sites 1992-2002).

Risk Factors to Estimate RR Models

We assessed a variety of factors that have been consistently associated with colon or rectal cancer,6,17,18 including age; history of CRC in first-degree relatives; history of sigmoidoscopy and/or colonoscopy; history of polyps; use of multivitamins; red meat, vegetable, and fruit consumption; alcohol intake; BMI (kg/m2); cigarette smoking; use of aspirin and other nonsteroidal anti-inflammatory drugs (NSAIDs); current leisure-time vigorous activity; and estrogen status as assessed by menopausal status and hormone-replacement therapy (HRT) use. Although additional nutrient variables including calcium, dietary fiber, and iron have been related to CRC risk in some studies,6,18 a detailed dietary assessment and supporting nutrient database would be needed to capture these intakes accurately, making such an assessment unfeasible in most clinical settings. Therefore, we did not include dietary variables that require a detailed assessment. The Appendix includes further description of the variables we evaluated in developing our RR models.

Projecting Probabilities (absolute risk) of Developing CRC

Our approach19 included (1) estimating RR parameters from population-based case-control data separately for proximal, distal, and rectal cancer; (2) estimating baseline age-specific cancer hazard rates (based on the National Cancer Institute's Surveillance, Epidemiology, and End Results [SEER] Program incidence rates) and attributable risks (ARs) from the case-control data; and (3) combining competing risks, RRs, and baseline hazards to estimate the probability of developing the first of proximal, distal, or rectal cancer over a prespecified time interval (eg, 5, 10, or 20 years) given a person's age and risk factors. The advantage of modeling the sites separately is that the covariates have different associations with the various sites, and thus discriminatory accuracy can be improved by modeling each site separately.20 The Appendix contains additional details.

Estimating the RR Models

We analyzed proximal and distal colon cancer cases separately and used all eligible controls from the colon cancer study11-13 to estimate separate RR models. Age was included in the models in two categories (≤ 65 and > 65) when significant, to account for the matching. We determined RR estimates for all factors described earlier herein and assessed interactions among these factors and with age. Because a substantial number of participants in both studies had missing information on sigmoidoscopy/colonoscopy, we included an “unknown” category for that variable in all models. The following variables were coded as 0, 1, 2 for one df (trend) models: men proximal: smoking, BMI, and family history; men distal: BMI and family history; men rectal: current vigorous exercise; women proximal: current vigorous exercise and family history; and women distal: family history. Odds ratios (ORs) estimating RRs and corresponding 95% CIs were computed from unconditional logistic regression models. Variable selection for inclusion in the final model was based on Wald tests for individual parameters as well as information on previously established risk factors. Statistical analyses were performed using SAS software (version 8.2; SAS Institute Inc, Cary, NC).

Estimating the Baseline Age-Specific CRC Hazard Rates

The baseline hazard rate was defined as the hazard rate for individuals each of whose risk factors are at the lowest risk level. The age-specific baseline hazard rates were computed by multiplying the age-specific SEER incidence rates by 1 – [the estimate of the AR for each CRC cancer site]19 (Appendix).

The age- and sex-specific SEER incidence rates for proximal, distal, and rectal cancer (Appendix) were obtained for white non-Hispanics in 13 SEER registries between 1992 and 2002 that cover 14% of the US population.21 Cancers of the appendix, second primary, and recurrence of colorectal cancers were not included in the computation of these rates. Competing mortality hazards from causes other than CRC were obtained from US mortality data between 1990 and 2002 (Appendix).

CIs on the Absolute Risk Estimates

We extended the influence function method of Graubard and Fears,22 for the three outcomes (proximal, distal, or rectal cancer) to estimate the variance of the absolute risk estimate (Appendix). Approximate normality of the estimates was used to obtain 95% CIs for the estimated absolute risks.

Risk Assessment Questionnaire

We constructed a short, self-administered risk assessment questionnaire to capture the information used in the CRC risk prediction models. We tested the questionnaire using cognitive interviewing techniques.23 Cognitive interviews involved four “rounds” of nine participants each. We used verbal probing and think-aloud techniques to evaluate sources of misunderstanding and inaccuracy in reporting. Based on the results of our cognitive testing, the questionnaire was improved after each round to address potential sources of response errors24 and until it was usually understandable and yielded responses that appeared to classify individuals into risk categories accurately.

RESULTS

RR Models

In our analyses, we included 1,599 colon cancer cases (665 men, 708 women) with 1,974 controls (1,058 men, 916 women) and 664 rectal cancer cases (397 men, 267 women) with 859 controls (478 men, 381 women). Table 1 displays frequencies for age and recruitment site for men and women respectively, and for proximal, distal, and rectal tumor sites.

Table 1.
Demographic Characteristics of the Colon and Rectal Case–Control Study Populations (restricted to white men and women, age ≥ 50 years)

Several factors were not related to CRC risk in our data, including FOBT; multivitamin use; alcohol use; and red meat and fruit consumption. We examined variables for smoking status and smoking pack-years for each risk model, but included only the variables “smoking duration and usual number of cigarettes smoked per day for current and former smokers” when significant, because this variable seemed to have the strongest effect of all the smoking variables, as noted previously in these studies.25

For men, the predictors of proximal colon cancer in the final model were prior negative sigmoidoscopy and/or colonoscopy, polyp history, number of relatives with CRC, aspirin and NSAID use, usual number of cigarettes smoked per day and years of smoking in current and former smokers, BMI, and servings of vegetables per day (Table 2). For example, men with two or more first-degree relatives with CRC were more likely to be diagnosed with proximal colon cancer than those without a positive family history (OR = 3.28; 95% CI, 1.84 to 5.84), whereas regular users of aspirin or NSAIDS were less likely to be diagnosed with proximal colon cancer (OR = 0.65; 95% CI, 0.51 to 0.82). For distal colon cancer, the final model included the same factors except for smoking and vegetable consumption (Table 2). The RR model for rectal cancer included sigmoidoscopy and/or colonoscopy, polyp history, number of relatives with CRC, NSAID use, and current vigorous leisure-time activity (Table 2). Only four controls and two cases in the rectal case-control study reported having two or more family members with CRC; therefore, family history was incorporated into the rectal RR model as “one or more relatives with CRC.” None of the models showed significant two-way interactions among these variables. All three RR models for men included separate intercepts for study center.
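The relationship between a logistic regression coefficient, its OR, and a 95% CI can be illustrated with the family history example above. This is a minimal sketch that assumes Wald-type intervals (symmetric on the log scale); the function names are hypothetical, and the standard error is backed out from the reported CI rather than taken from the study output.

```python
import math

def odds_ratio_with_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) for a logistic coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def se_from_reported_ci(lower, upper, z=1.96):
    """Back out the standard error of log(OR) from a reported 95% CI (Wald assumption)."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

# Reported for men with two or more first-degree relatives with CRC
# (proximal colon cancer): OR = 3.28; 95% CI, 1.84 to 5.84.
beta = math.log(3.28)
se = se_from_reported_ci(1.84, 5.84)
odds_ratio, lower, upper = odds_ratio_with_ci(beta, se)
print(f"OR = {odds_ratio:.2f}; 95% CI, {lower:.2f} to {upper:.2f}")
```

The reported interval is close to symmetric around log(3.28), which is what a Wald interval would produce.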

Table 2.
Relative Risk Estimates for Proximal, Distal, and Rectal Cancers for White Men Age ≥ 50 Years

For women, the proximal colon RR model included prior negative sigmoidoscopy and/or colonoscopy, polyp history, number of relatives with CRC, estrogen status within the last 2 years, current vigorous activity, aspirin and NSAID use, and servings of vegetables per day (Table 3). The distal colon RR model included sigmoidoscopy and/or colonoscopy and polyp history, number of relatives with CRC, aspirin and NSAID use, and estrogen status within the last 2 years, an age indicator (≥ 65 years), BMI, and an interaction between BMI and estrogen status (Table 3). Older obese women (≥ 30 kg/m2) had an increased risk of CRC (OR = 2.68; 95% CI, 1.39 to 5.20). The main effect of BMI, however, was not statistically significant (OR = 1.08; 95% CI, 0.75 to 1.54). No other statistically significant interactions were found in any of the other models. Sigmoidoscopy and/or colonoscopy, polyp history, number of relatives with CRC, estrogen status within the last 2 years, current vigorous leisure-time activity, aspirin and NSAID use, and BMI were all included in the rectal cancer RR model for women (Table 3). Again, because of small numbers in the rectal RR model, family history of colorectal cancer was reduced to “one or more relatives with CRC.” All three RR models for women included separate intercepts for study center.

Table 3.
Relative Risk Estimates for Proximal, Distal, and Rectal Cancers for White Women Age ≥ 50 Years

Estimates of the Baseline Age-Specific CRC Hazard Rates

To compute baseline hazard rates, we estimated separate ARs for the three models for men and women. Because the ARs did not differ by study site, we obtained combined AR estimates separately for proximal, distal and rectal cancer. The AR estimates for men were 0.86 (95% CI, 0.79 to 0.91) for proximal, 0.72 (95% CI, 0.63 to 0.80) for distal, and 0.90 (95% CI, 0.69 to 0.97) for rectal cancer. For women, the AR estimates were 0.81 (95% CI, 0.69 to 0.90) for proximal, 0.82 (95% CI, 0.73 to 0.89) for distal in women younger than 65 years, 0.85 (95% CI, 0.76 to 0.91) for distal in women age 65 years or older, and 0.93 (95% CI, 0.57 to 0.99) for rectal cancer.
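As an illustration of the baseline hazard calculation, the sketch below multiplies SEER incidence rates by 1 minus the AR. The incidence values are the proximal colon rates for men in the first four age groups of Appendix Table A1, and 0.86 is the AR estimate for proximal cancer in men reported above; the function and variable names are hypothetical.

```python
# SEER incidence (cases per 100,000), white non-Hispanic men, proximal colon,
# copied from Appendix Table A1 (first four age groups only).
seer_proximal_men = {"50-54": 15.4, "55-59": 27.9, "60-64": 49.2, "65-69": 80.1}

# AR estimate for proximal cancer in men, as reported in the text.
ar_proximal_men = 0.86

def baseline_rates(seer_rates, ar):
    """Baseline rate = SEER rate x (1 - AR), for each age group."""
    return {age: rate * (1.0 - ar) for age, rate in seer_rates.items()}

baseline = baseline_rates(seer_proximal_men, ar_proximal_men)
for age, rate in baseline.items():
    print(f"{age}: {rate:.2f} per 100,000")
```

For the 50-54 age group this gives 15.4 × 0.14, or about 2.16 cases per 100,000, for a man whose risk factors are all at the lowest risk level.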

Examples of Individual Absolute Risk Estimates for CRC

Table 4 presents estimates of the 10- and 20-year projected absolute risks of developing CRC for white men with various ages and risk factor profiles. The first risk profile, the lowest risk example, describes a 50-year-old man who had a colonoscopy in the last 10 years without evidence of polyps. He has no family history of CRC, vigorously exercises 5 hours/week, takes aspirin daily, never smoked, eats more than five servings of vegetables/day, and has a BMI of 24 kg/m2. His 10-year predicted absolute risk of developing CRC is only 0.16% (95% CI, 0.11 to 0.22), and his 20-year risk is 0.53% (95% CI, 0.38 to 0.73). In risk profile 9, a high-risk example, we consider a 60-year-old man who had a colonoscopy in the last 10 years and was found to have a polyp. He has two relatives with CRC, does not exercise regularly, does not take aspirin or NSAIDS regularly, smokes more than 20 cigarettes a day, eats fewer than five servings of vegetables per day, and has a BMI of 31 kg/m2. His 10-year predicted absolute risk of developing CRC is 7.14% (95% CI, 3.9 to 12.8), and his 20-year risk is 16.7% (95% CI, 9.1 to 28.5). Similar 10- and 20-year projected absolute risks of developing CRC for white women with various ages and risk factor profiles are presented in Table 5.

Table 4.
Examples of 10- and 20-Year Absolute Risk Estimates for CRC for White Men of Different Ages and Risk Factor Profiles

Table 5.
Examples of 10- and 20-Year Absolute Risk Estimates for CRC for White Women of Different Ages and Risk Factor Profiles

Risk Assessment Questionnaire

The short, paper-based, self-administered risk assessment questionnaire we constructed to capture the information required by the models (Tables 2 and 3) requires 5 to 8 minutes to complete. A Web version of the questionnaire is available at http://www.cancer.gov/colorectalcancerrisk/.

DISCUSSION

We present a model that predicts the probability or absolute risk of developing CRC for men and women age 50 years and older. We combined separate RRs and ARs and baseline hazards for proximal, distal, and rectal cancers to project the risk of the earliest of these tumors. We also developed a short, simple, self-administered risk assessment questionnaire that can be used to obtain information for risk estimation.

In related work, we used independent data from the National Institutes of Health (NIH)-AARP Diet and Health Cohort Study26 of men and women age 55 years and older to assess the performance of our models.27 We found that the models had discriminatory accuracy comparable with absolute risk models for other cancers and were well calibrated.

Although the models were developed from cases and controls age 50 years and older, one could project risk for younger individuals by assuming that our RRs and ARs apply to younger populations and by using age-specific SEER rates for younger ages. However, such assumptions would need to be checked in independent data because risk factors and biologic mechanisms may differ among those developing CRC at younger ages.

Although absolute risk models exist for breast cancer and lung cancer,19,28 this is the first such model for CRC. The four other CRC risk prediction models currently available either apply to special populations, such as patients who were referred by general practitioners to gastroenterologists for symptoms,8 provide a qualitative index of CRC risk,7,10 or predict different outcomes, such as the risk of having an advanced polyp or cancer in the proximal portion of the colon.9

Our model estimates the probability of developing CRC over a prespecified time interval from data collected from two large US population-based case-control studies of colon and rectal cancer, from incidence data from 13 SEER registries, which are generally representative of the US population,29 and from national mortality rates. Thus, our risk prediction models would be expected to apply to the general non-Hispanic white US population.

We used factors in our models that, in addition to having strong predictive ability, can also be ascertained easily in a clinical setting. Thus, we did not include some factors that may have predictive value, such as calcium intake or long-term vigorous activity,30-33 but which would require a much more complex questionnaire.34

Our risk prediction model has some limitations. Because the majority of participants in the case-control studies were white, we could not estimate RRs for other racial or ethnic groups. A first step to developing models for other racial/ethnic groups could be to combine RR and AR estimates for whites with SEER rates for blacks, Asians, or Hispanics. However, the assumption of constant AR and RR estimates across racial groups needs to be validated in minority populations. Our model is not applicable to individuals with ulcerative colitis, Crohn's disease and familial adenomatous polyposis, because these conditions carry a high risk of CRC, and individuals with these conditions were excluded from the studies. Additionally, our model is not applicable to individuals with hereditary nonpolyposis CRC.

Because we used US mortality data from 1990 to 2002 for our competing mortality hazards, we did not adjust these estimates for potential confounders such as BMI given that our sensitivity analyses indicated that changes in the risk estimates were small (data not shown). Although our two case-control studies were conducted at slightly different time periods, we believe any changes in the distribution of risk factors would have a minimal effect on our risk estimates, considering that RRs and ARs were estimated separately for the two studies.

Another limitation of our model is that we estimated our RRs and ARs from case-control rather than from cohort studies. Although case-control studies have been used in the development of risk prediction models for melanoma,35 breast,19,36,37 bladder,38 and lung cancer,39 a general criticism is that such estimates could be subject to recall bias. However, recall bias likely plays a minor role in our models as most of the RRs we found, including BMI, physical activity, HRT, and aspirin and NSAIDs use, were consistent with RRs summarized in a recent comprehensive review of the epidemiologic literature.6 Although not covered in this review, our risk estimates for screening and polyp history,2,40,41 smoking,42 and family history43,44 are also consistent with many previously published findings, including results from cohort studies. Additionally, the models were well calibrated in an independent validation study using the AARP cohort.27

In summary, we developed an absolute CRC risk projection model for white men and women age 50 years or older that may aid physicians and their patients in deciding on screening regimens, and can also be useful in designing chemoprevention and screening intervention trials.

Supplementary Material

[Data Supplement]

Acknowledgment

We thank Anne Rodgers for her helpful comments and editorial assistance.

Appendix

Study Populations

For the colon cancer study, cases with a first primary CRC (ICD-O second edition codes 18.0, 18.2 to 18.9) diagnosed between October 1, 1991, and September 30, 1994, were identified in Utah, the metropolitan Twin Cities area in Minnesota, and KPMCP of Northern California using rapid-reporting systems. The second study included cases with a first primary tumor in the rectosigmoid junction or rectum identified between May 1997 and May 2001 in Utah or Northern California, using a rapid case-ascertainment program conducted in conjunction with the SEER Program. Individuals with a previous CRC, familial adenomatous polyposis, ulcerative colitis, or Crohn's disease as determined by the pathology report were not eligible for these studies.

Controls for both case-control studies were matched to cases (during their respective recruitment periods) by sex and 5-year age groups. In Utah, controls age 65 years and older were randomly selected from Health Care Financing Administration (now the Centers for Medicare & Medicaid Services) lists. Controls younger than 65 years were randomly selected from driver's license lists. In Minnesota, all controls were randomly selected from driver's license lists. In Northern California, controls were randomly selected from KPMCP membership lists. Data for both studies were collected using the same study questionnaire and the same quality-control procedures that have been described in detail elsewhere (Slattery ML, Caan BJ, Duncan D, et al. J Am Diet Assoc 94:761-766, 1994; Edwards S, Slattery ML, Mori M, et al. Am J Epidemiol 140:1020-1028, 1994; Slattery ML, Jacobs DR Jr. Ann Epidemiol 5:292-296, 1995).

Description of Variables

For all of the variables in our analyses, including age, the referent year was the calendar year 2 years before the date of diagnosis for cases and 2 years before the date of selection for controls.

History of CRC in first-degree relatives.

We used information reported by participants on the number of first-degree relatives (ie, father, mother, siblings, and children) diagnosed with CRC. We categorized the number of first-degree relatives with CRC as none, one, or two or more in our analyses.

FOBT.

We categorized participants based on whether they had reported an FOBT in the last 10 years before the referent year.

Sigmoidoscopy/colonoscopy.

We also categorized participants on the basis of whether they reported having a sigmoidoscopy and/or colonoscopy in the last 10 years before the referent year. Because these examinations occurred 2 years before the date of diagnosis for cases, none of these examinations resulted in a CRC diagnosis.

History of polyps.

Among those who reported having had a sigmoidoscopy and/or colonoscopy, we categorized participants by whether they had been told by a physician that they had a polyp in the last 10 years before the referent year.

Use of multivitamins.

We categorized participants as regular users of multivitamins if they reported using multivitamins at least three times/week for at least 1 month during the referent year.

Red meat, vegetable, and fruit consumption.

We used information on the number of servings of red meat, vegetables, and fruits per week reported by participants during the referent year. These dietary exposures were collected during in-person interviews conducted by trained and certified interviewers.

Alcohol intake.

We combined reported weekly intake of servings of red wine, beer, and liquor to derive an estimate of total alcohol consumption per week. We categorized these estimates into quartiles among controls for our analyses.

BMI.

Self-reported weight during the referent year and measured height at interview were used to calculate BMI, defined as weight (kg) divided by the square of height (m). We categorized participants' BMI based on WHO classifications: underweight and normal weight (BMI ≤ 24.9 kg/m2), overweight (BMI 25-29.9 kg/m2), and obese (BMI ≥ 30 kg/m2).
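The BMI calculation and the three categories above can be sketched as follows. The function names are hypothetical, and the handling of values between 24.9 and 25 kg/m2 (assigned here to the lower category) is an assumption, because the stated cut points leave that gap open.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by the square of height (m)."""
    return weight_kg / height_m ** 2

def bmi_category(bmi_value):
    """Map a BMI value to the three categories used in the analyses."""
    if bmi_value >= 30.0:
        return "obese"
    if bmi_value >= 25.0:
        return "overweight"
    # Assumption: values between 24.9 and 25.0 fall in the lower category.
    return "underweight or normal weight"

value = bmi(95.0, 1.75)  # hypothetical participant
print(f"BMI {value:.1f} kg/m2: {bmi_category(value)}")
```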

Smoking.

We categorized participants by (1) cigarette smoking status (never smoker, current smoker, and former smoker), (2) pack-years (calculated by dividing the average number of cigarettes smoked in a usual day by 20 and multiplying this number by the number of years smoked), (3) usual number of cigarettes smoked per day for current and former smokers, and (4) years of smoking.
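The pack-years definition above amounts to a one-line calculation. The sketch below uses a hypothetical function name and hypothetical inputs.

```python
def pack_years(cigarettes_per_day, years_smoked):
    """Packs per day (cigarettes / 20) multiplied by the number of years smoked."""
    return cigarettes_per_day / 20.0 * years_smoked

# Hypothetical smoker: 30 cigarettes per day for 20 years.
print(pack_years(30, 20))
```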

Aspirin and other NSAIDs.

We categorized participants as regular users of aspirin and other NSAIDs if they reported using these medications at least three times a week for at least 1 month during the referent year.

Physical activity.

We assessed participants' current leisure-time vigorous activity during the referent year and grouped participants into four categories: 0 hours/week, up to 2 hours/week, 2 to 4 hours/week, and more than 4 hours/week. Vigorous leisure-time activities were defined as “those activities which make you sweat or get out of breath” and included questions about racket sports, jogging, running and biking, exercise or dance class, weight lifting, hiking, vigorous swimming, scrubbing floors or mowing the lawn, and gardening and heavy labor.

Estrogen status.

We created a variable “estrogen status,” which was categorized as “estrogen positive” or “estrogen negative” based on a combination of menopausal status and use of HRT. Premenopausal women and women who had used HRT within the 2 years before the referent year were considered estrogen positive. Postmenopausal women not reporting the use of HRT during the 2 years before the referent year were considered estrogen negative.
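The estrogen status rule described above can be written as a small classification function. This is a sketch with hypothetical names; it assumes the two inputs have already been derived from the questionnaire responses.

```python
def estrogen_status(premenopausal, hrt_within_2_years):
    """Classify estrogen status from menopausal status and recent HRT use.

    premenopausal: True if the woman was premenopausal in the referent year.
    hrt_within_2_years: True if HRT was used within the 2 years before the referent year.
    """
    if premenopausal or hrt_within_2_years:
        return "estrogen positive"
    return "estrogen negative"

print(estrogen_status(premenopausal=False, hrt_within_2_years=True))
```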

Projecting Probabilities (absolute risk) of Developing CRC

The absolute risk A*(a,b) of CRC in the age interval (a,b) is the probability of developing CRC during that interval, given that one is alive and free of previous CRC at the beginning of the interval. The absolute risk is reduced by death from causes other than CRC. The absolute risk is defined mathematically by

$$A^*(a,b) = \int_a^b \lambda_{crc}(u,x)\,\frac{S^*(u)}{S^*(a)}\,du \qquad (1)$$

where $S^*(a) = \exp\left[-\int_0^a \{\lambda_{crc}(u,x) + \lambda_M(u)\}\,du\right]$.

The main building blocks for our model are the hazard rates λcrc for CRC incidence and λM for competing causes of death other than CRC. Because incidence rates and the impact of risk factors differ by CRC tumor site, we decomposed the CRC incidence hazard rate into the sum of proximal, distal, and rectal cancer hazards, denoted by i = P,D,R. Let T denote the earlier of the age of onset of the first colorectal outcome or age at death resulting from other causes, and let J indicate the type of that first event. The cause-specific hazards that may depend on covariates x are defined as λi(a,x) = lim ε→0 P(a ≤ T < a + ε, J = i|T > a,x)/ε, i = P,D,R,M.

We used λcrc(a,x) = λP(a,x) + λD(a,x) + λR(a,x), and modeled λi(a,x) = λi(a) rri(a,x) as the product of the age-specific baseline hazard rate and the RR, rri(a,x) that includes covariates, for outcomes i = P,D,R. We thus estimated three separate RR models for the CRC outcomes. We did not include covariates in the hazard for competing causes of death, and used λM(a,x) = λM(a). The absolute risk of developing CRC thus is the probability of developing the first of proximal, distal, or rectal cancer in a given age interval. Following the approach outlined by Willis (Willis GB: Cognitive Interviewing: A Tool for Improving Questionnaire Design. Thousand Oaks, CA, SAGE Publications, 2004) we estimated the RR parameters from case-control data and then estimated baseline age-specific cancer hazard rates λi(a) for i = P,D,R as described in the following sections.
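The way the site-specific hazards and the competing mortality hazard combine into an absolute risk can be sketched numerically. The example below assumes hazards that are constant within each 1-year age interval, which allows the integral to be evaluated exactly year by year; all baseline hazards, RRs, and the mortality hazard are hypothetical values chosen for illustration, not estimates from the study.

```python
import math

def absolute_risk(a, b, crc_hazard, other_death_hazard):
    """Probability of a first CRC between integer ages a and b, given alive and
    CRC-free at age a, with hazards constant within each year of age."""
    risk = 0.0
    surviving = 1.0  # S*(u)/S*(a): alive and CRC-free at the start of each year
    for age in range(a, b):
        h_crc = crc_hazard(age)
        h_total = h_crc + other_death_hazard(age)
        if h_total > 0.0:
            # Chance that CRC (rather than death) is the first event in this year
            risk += surviving * (h_crc / h_total) * (1.0 - math.exp(-h_total))
        surviving *= math.exp(-h_total)
    return risk

# Hypothetical annual baseline hazards and RRs by site (P, D, R); not study values.
BASELINE = {"P": 2.0e-5, "D": 4.0e-5, "R": 2.0e-5}
RR = {"P": 2.0, "D": 1.5, "R": 1.0}

def crc_hazard(age):
    """Sum of site-specific hazards, each a baseline hazard times its RR."""
    return sum(BASELINE[site] * RR[site] for site in BASELINE)

def other_death_hazard(age):
    return 0.01  # hypothetical constant competing mortality hazard

ten_year_risk = absolute_risk(50, 60, crc_hazard, other_death_hazard)
print(f"10-year absolute risk: {ten_year_risk:.4%}")
```

Setting the competing mortality hazard to zero gives a larger risk, which shows how death from other causes reduces the absolute risk of CRC.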

Estimating the Baseline Age-Specific CRC Hazard Rates and ARs

The baseline hazard rate λi(a) at age a for i = P,D,R is the hazard rate for individuals each of whose risk factors are at the lowest risk level. The age-specific baseline hazard rates are computed by multiplying the age-specific SEER incidence rates λi*(a) by 1 – the estimate of the appropriate AR as described in Smith and Dean (Smith T, Dean D: Colorectal Cancer Health Assessment Testing of a Self-Administered Survey: Summary Report—Technical Report Prepared for the National Cancer Institute. Rockville, MD, Westat, 2005), that is, λi(a) = λi*(a) (1 – ARi(a)). To estimate ARi for each of the three sites, we used $AR_i = 1 - (1/m_i) \sum_j 1/r_{ji}$, where mi is the number of cases of cancer type i, rji is the estimated RR for the jth case of type i, and the summation is over all the cases of that type (Bruzzi P, Green SB, Byar DP, et al. Am J Epidemiol 122:904-914, 1985).
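The AR estimator described above, 1 minus the average over cases of the reciprocal of each case's estimated RR, can be sketched in a few lines. The RR values below are hypothetical and serve only to show the arithmetic.

```python
def attributable_risk(case_rrs):
    """AR = 1 - (1/m) * sum over cases of 1/RR, where m is the number of cases."""
    m = len(case_rrs)
    return 1.0 - sum(1.0 / rr for rr in case_rrs) / m

# Hypothetical estimated RRs for four cases of one cancer type.
example_rrs = [1.0, 2.0, 4.0, 4.0]
ar = attributable_risk(example_rrs)
print(f"AR = {ar:.2f}")
```

A case whose risk factors are all at the lowest risk level has an RR of 1 and contributes nothing to the AR; if every case had an RR of 1, the AR would be 0.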

Colorectal Cancer Incidence and Mortality

Appendix Table A1 denotes colorectal cancer incidence for white non-Hispanic men. Appendix Table A2 denotes colorectal cancer incidence for white non-Hispanic women. Appendix Table A3 denotes mortality for white non-Hispanic women. Appendix Table A4 denotes mortality for white non-Hispanic men.

Variance Calculations

Recall that the age-specific baseline hazard rates λ(t) at age t are computed as λ (t) = λ*(t) (1 – AR(t)), the product of the composite rates λ* from SEER and 1 – AR, the attributable risk. Thus, we can write λi(a,x) in equation 1 as λi(a,x) = λi*(a) (1 – ARi(a)) rri(x), i = P,D,R. In what follows, we assume that the composite rates λ* are known without error; thus, only the factors (1 – ARi(a)) rri(x) contribute to the variance of A* in equation 1.

For ease of presentation, we let Hi = (1 – ARi(a)) rri(x). First, we compute the covariance matrix of (HP, HD, HR) by adapting the influence function–based approach of Graubard and Fears (Graubard BI, Fears TR: Biometrics 61:847-855, 2005). Let Yij denote case-control status, that is, Yij = 1 for a case with cancer type i and 0 for a control, and xij the vector of covariates for the jth subject, including an intercept term. Also let $p_{ij} = \exp(x_{ij}b)\{1 + \exp(x_{ij}b)\}^{-1}$. The AR for the ith outcome is estimated as

equation image

(Bruzzi P, Green SB, Byar DP, et al. Am J Epidemiol 122:904-914, 1985). Letting

equation image

and

equation image

equation image

where xi denotes the risk factors specific for ith cancer model. The influence of observation j on Hi is

equation image

equation image

equation image

equation image

The random variables zj for cases and controls from the three study centers are independent and are assumed to be random samples from six separate strata, defined by case-control status and site. The variance of Hi is thus estimated as

equation image

where ns is the number of cases or controls in stratum s and zs is the stratum mean of the variables zj. Because the same controls were used to estimate the RR parameters for proximal and distal cancers, we also compute the covariance between HP and HD based on the controls as

equation image

The superscripts u denote the scores from the controls from the three study sites, and nu stands for the corresponding numbers of controls.
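The stratified variance and the shared-control covariance described above can be sketched in Python as follows. This is an illustration under a stated assumption, not the authors' exact estimator: it takes the influence values zj to be already scaled so that the variance is the sum over strata of ns/(ns – 1) times the within-stratum sum of squared deviations, which is one common convention. The function names and toy inputs are hypothetical.

```python
def stratified_covariance(strata_a, strata_b):
    """Sum over strata of n/(n-1) * sum_j (a_j - mean_a) * (b_j - mean_b).

    strata_a and strata_b hold paired influence values for the same
    subjects, one inner list per stratum.
    """
    if len(strata_a) != len(strata_b):
        raise ValueError("both inputs must have the same number of strata")
    total = 0.0
    for a, b in zip(strata_a, strata_b):
        n = len(a)
        if n != len(b) or n < 2:
            raise ValueError("each stratum needs >= 2 paired values")
        mean_a = sum(a) / n
        mean_b = sum(b) / n
        cross = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
        total += n / (n - 1) * cross
    return total


def stratified_variance(strata):
    """Variance of H_i: the covariance of the influence values with themselves."""
    return stratified_covariance(strata, strata)


# Toy influence values for two strata (hypothetical numbers).
z = [[1.0, 3.0], [0.0, 2.0, 4.0]]
var_h = stratified_variance(z)
```

For the covariance between HP and HD, the strata passed in would be the control groups from the three study sites, with each control contributing one score to each list.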

After the covariance matrix Σ of (HP, HD, HR) is computed, the variance of A* in equation 1 is obtained by applying the delta method as $D^T \Sigma D$, where $D^T = (\partial A^*/\partial H_P,\ \partial A^*/\partial H_D,\ \partial A^*/\partial H_R)$.
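The delta-method step can be sketched in Python as below. The gradient D is approximated by central finite differences, which is a substitution made here for illustration; analytic derivatives of A* could be used instead. The function name, the linear test function, and the covariance matrix are hypothetical.

```python
def delta_method_variance(f, h, cov, rel_step=1e-6):
    """Approximate Var(f(H)) as D^T * Sigma * D with D = gradient of f at h.

    f   : function taking a list of floats (H_P, H_D, H_R) and returning a float
    h   : point estimates of (H_P, H_D, H_R)
    cov : covariance matrix Sigma as a list of rows
    """
    k = len(h)
    grad = []
    for idx in range(k):
        step = rel_step * max(1.0, abs(h[idx]))
        up = list(h)
        down = list(h)
        up[idx] += step
        down[idx] -= step
        grad.append((f(up) - f(down)) / (2.0 * step))
    # Quadratic form D^T Sigma D.
    return sum(grad[r] * cov[r][c] * grad[c] for r in range(k) for c in range(k))


# Hypothetical check with a linear function, where the result is exact:
# Var(2*H_P + 3*H_D + H_R) = 4 + 9 + 1 + 2*2*3*0.5 = 20.
sigma = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
variance = delta_method_variance(
    lambda v: 2.0 * v[0] + 3.0 * v[1] + v[2], [0.6, 0.7, 0.8], sigma
)
```

The nonzero off-diagonal entry mirrors the covariance between HP and HD that arises from the shared controls.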

Table A1.

Colorectal Cancer Incidence for White Non-Hispanic Men

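Incidence (cases per 100,000), by site

| Age (years) | Proximal | Distal | Rectal |
|---|---|---|---|
| 50-54 | 15.4 | 14.4 | 21.0 |
| 55-59 | 27.9 | 27.4 | 35.2 |
| 60-64 | 49.2 | 46.0 | 52.5 |
| 65-69 | 80.1 | 69.6 | 76.7 |
| 70-74 | 121.5 | 88.3 | 90.2 |
| 75-79 | 169.0 | 107.3 | 103.5 |
| 80-84 | 219.02 | 118.1 | 115.1 |
| ≥ 85 | 250.5 | 113.7 | 111.0 |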

NOTE. Age-adjusted rates (2000 US Standard). Selected only for the first matching record for each person with diagnosis of proximal, distal or rectal cancer. Incidence, Surveillance, Epidemiology, and End Results (SEER) 13 Registries, November 2004 submission for Hispanics (1992-2002).

Table A2.

Colorectal Cancer Incidence for White Non-Hispanic Women

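Incidence (cases per 100,000), by site

| Age (years) | Proximal | Distal | Rectal |
|---|---|---|---|
| 50-54 | 12.3 | 11.5 | 13.3 |
| 55-59 | 23.3 | 18.6 | 20.9 |
| 60-64 | 42.1 | 29.9 | 30.5 |
| 65-69 | 67.8 | 41.2 | 40.8 |
| 70-74 | 103.1 | 57.6 | 50.9 |
| 75-79 | 144.9 | 65.7 | 62.7 |
| 80-84 | 191.4 | 74.9 | 72.8 |
| ≥ 85 | 220.2 | 77.2 | 75.4 |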

NOTE. Age-adjusted rates (2000 US Standard). Selected only for the first matching record for each person with diagnosis of proximal, distal or rectal cancer. Incidence, Surveillance, Epidemiology, and End Results (SEER) 13 Registries, November 2004 submission for Hispanics (1992-2002).

Table A3.

Mortality for White Non-Hispanic Men

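Mortality (cases per 100,000)

| Age (years) | All Cause | CRC | All Cause Minus CRC |
|---|---|---|---|
| 50-54 | 612.9 | 20.4 | 592.5 |
| 55-59 | 979.7 | 37.5 | 942.2 |
| 60-64 | 1,594.6 | 68.7 | 1,525.9 |
| 65-69 | 2,503.0 | 99.2 | 2,403.8 |
| 70-74 | 3,813.6 | 139.5 | 3,674.1 |
| 75-79 | 5,870.7 | 190.6 | 5,680.1 |
| 80-84 | 9,338.6 | 259.7 | 9,078.9 |
| ≥ 85 | 17,577.5 | 371.5 | 17,206.0 |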

NOTE. Age-adjusted rates (2000 US Standard). Mortality-all cause of death, public-use with state, total US for Hispanics (1990-2002). These data are collected by the National Center for Health Statistics (NCHS).

Table A4.

Mortality in White Non-Hispanic Women

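Mortality (cases per 100,000)

| Age (years) | All Cause | CRC | All Cause Minus CRC |
|---|---|---|---|
| 50-54 | 361.0 | 14.0 | 347.0 |
| 55-59 | 589.5 | 24.7 | 564.8 |
| 60-64 | 954.6 | 39.7 | 914.9 |
| 65-69 | 1,480.1 | 59.7 | 1,420.4 |
| 70-74 | 2,319.1 | 86.9 | 2,232.2 |
| 75-79 | 3,680.4 | 123.1 | 3,557.3 |
| 80-84 | 6,175.8 | 177.2 | 5,998.6 |
| ≥ 85 | 14,316.6 | 285.8 | 14,030.8 |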

NOTE. Age-adjusted rates (2000 US Standard). Mortality-all cause of death, public-use with state, total US for Hispanics (1990-2002). These data are collected by the National Center for Health Statistics (NCHS).

Table A5.

Cases and Controls for Covariates Used in Relative Risk Estimation (in Table 2) for Proximal, Distal, and Rectal Cancers for White Men Age ≥ 50 Years

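No. of subjects

| Variable | Proximal Cancer Cases | Distal Cancer Cases | Proximal and Distal Controls | Rectal Cancer Cases | Rectal Cancer Controls |
|---|---|---|---|---|---|
| Sigmoidoscopy and/or colonoscopy and polyp history | | | | | |
| Sigmoidoscopy and/or colonoscopy in last 10 years, and no history of polyps | 101 | 71 | 348 | 56 | 172 |
| No sigmoidoscopy and/or colonoscopy in last 10 years | 225 | 305 | 511 | 312 | 249 |
| Sigmoidoscopy and/or colonoscopy in last 10 years and history of polyps | 57 | 28 | 97 | 26 | 40 |
| Sigmoidoscopy and/or colonoscopy and polyps unknown | 46 | 58 | 102 | 3 | 17 |
| No. of relatives with CRC | | | | | |
| 0 | 360 | 390 | 960 | 37 | 441 |
| 1 | 54 | 64 | 91 | 39 | 37 |
| ≥ 2 | 15 | 8 | 7 | | |
| Current leisure-time activity, h/wk | | | | | |
| 0 | | | | 123 | 100 |
| > 0 and ≤ 2 | | | | 155 | 191 |
| > 2 and ≤ 4 | | | | 44 | 92 |
| > 4 | | | | 75 | 95 |
| Aspirin/NSAID use | | | | | |
| Nonuser | 262 | 275 | 536 | 337 | 365 |
| Regular user | 167 | 187 | 522 | 60 | 113 |
| Smoking, cigarettes/d | | | | | |
| Never smoker | 130 | | 386 | | |
| > 0 and < 11 | 41 | | 125 | | |
| ≥ 11 and ≤ 20 | 138 | | 263 | | |
| > 20 | 120 | | 284 | | |
| Years of smoking | | | | | |
| 0 | 130 | | 386 | | |
| > 0 and < 15 | 41 | | 125 | | |
| ≥ 15 and < 35 | 138 | | 263 | | |
| ≥ 35 | 120 | | 284 | | |
| Vegetable intake, servings/d | | | | | |
| < 5 | 373 | | 842 | | |
| ≥ 5 | 56 | | 216 | | |
| Body mass index, kg/m2 | | | | | |
| ≤ 24.9 | 110 | 110 | 337 | | |
| 25.0 to ≤ 30 | 205 | 221 | 514 | | |
| > 30 | 114 | 131 | 207 | | |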

Abbreviations: CRC, colorectal cancer; NSAID, nonsteroidal anti-inflammatory drug.

Table A6.

Cases and Controls for Covariates Used in Relative Risk Estimation (in Table 3) for Proximal, Distal, and Rectal Cancers for White Women Age ≥ 50 Years

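No. of subjects

| Variable | Proximal Cases | Distal Cases | Proximal and Distal Controls | Rectal Cases | Rectal Controls |
|---|---|---|---|---|---|
| Sigmoidoscopy and/or colonoscopy and polyp history | | | | | |
| Sigmoidoscopy and/or colonoscopy in last 10 years, and no history of polyps | 66 | 35 | 269 | 32 | 110 |
| No sigmoidoscopy and/or colonoscopy in last 10 years | 220 | 225 | 493 | 217 | 248 |
| Sigmoidoscopy and/or colonoscopy in last 10 years and history of polyps | 31 | 26 | 45 | 17 | 15 |
| Sigmoidoscopy and/or colonoscopy and polyps unknown | 54 | 48 | 109 | 1 | 8 |
| No. of relatives with CRC | | | | | |
| 0 | 308 | 285 | 811 | 39 | 352 |
| 1 | 55 | 41 | 93 | 37 | 39 |
| ≥ 2 | 8 | 8 | 12 | | |
| Current leisure-time activity, h/wk | | | | | |
| 0 | | 152 | 400 | 110 | 120 |
| > 0 and ≤ 2 | | 112 | 351 | 101 | 166 |
| > 2 and ≤ 4 | | 40 | 68 | 29 | 48 |
| > 4 | | 30 | 97 | 27 | 47 |
| Aspirin/NSAID use | | | | | |
| Nonuser | 235 | 207 | 462 | 149 | 169 |
| Regular user | 136 | 127 | 454 | 118 | 212 |
| Vegetable intake, servings/d | | | | | |
| < 5 | 317 | | 743 | | |
| ≥ 5 | 54 | | 173 | | |
| Body mass index, kg/m2 | | | | | |
| ≤ 29.9 | | 246 | 726 | 192 | 298 |
| ≥ 30 | | 88 | 190 | 75 | 83 |
| Age ≤ 65, > 65 years | | 139, 195 | 287, 629 | | |
| Estrogen status within the last 2 years | | | | | |
| Negative | 270 | 236 | 593 | 153 | 169 |
| Positive | 101 | 98 | 323 | 114 | 212 |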

Abbreviations: CRC, colorectal cancer; NSAID, nonsteroidal anti-inflammatory drug.

Footnotes

Authors' disclosures of potential conflicts of interest and author contributions are found at the end of this article.

AUTHORS' DISCLOSURES OF POTENTIAL CONFLICTS OF INTEREST

The author(s) indicated no potential conflicts of interest.

AUTHOR CONTRIBUTIONS

Conception and design: Andrew N. Freedman, Martha L. Slattery, Rachel Ballard-Barbash, Mitchell H. Gail, Ruth M. Pfeiffer

Financial support: Andrew N. Freedman, Rachel Ballard-Barbash

Administrative support: Andrew N. Freedman, Martha L. Slattery, Rachel Ballard-Barbash, David Pee

Provision of study materials or patients: Martha L. Slattery, Bette J. Cann

Collection and assembly of data: Martha L. Slattery, Gordon Willis, Bette J. Cann

Data analysis and interpretation: Andrew N. Freedman, Martha L. Slattery, Rachel Ballard-Barbash, Gordon Willis, Bette J. Cann, David Pee, Mitchell H. Gail, Ruth M. Pfeiffer

Manuscript writing: Andrew N. Freedman, Martha L. Slattery, Rachel Ballard-Barbash, Gordon Willis, Bette J. Cann, Mitchell H. Gail, Ruth M. Pfeiffer

Final approval of manuscript: Andrew N. Freedman, Martha L. Slattery, Rachel Ballard-Barbash, Gordon Willis, Bette J. Cann, David Pee, Mitchell H. Gail

REFERENCES

1. Cancer Facts and Figures 2008. Atlanta, GA: American Cancer Society; 2008.
2. Winawer SJ, Zauber AG, Fletcher RH, et al. Guidelines for colonoscopy surveillance after polypectomy: A consensus update by the US Multi-Society Task Force on Colorectal Cancer and the American Cancer Society. Gastroenterology. 2006;130:1872–1885.
3. Levin B, Lieberman DA, McFarland B, et al. Screening and surveillance for the early detection of colorectal cancer and adenomatous polyps, 2008: A joint guideline from the American Cancer Society, the US Multi-Society Task Force on Colorectal Cancer, and the American College of Radiology. CA Cancer J Clin. 2008;58:130–160.
4. Freedman AN, Seminara D, Gail MH, et al. Cancer risk prediction models: A workshop on development, evaluation, and application. J Natl Cancer Inst. 2005;97:715–723.
5. Giovannucci E. Modifiable risk factors for colon cancer. Gastroenterol Clin North Am. 2002;31:925–943.
6. World Cancer Research Fund (WCRF) Panel (Marmot M, Chair). Food, Nutrition, Physical Activity and the Prevention of Cancer: A Global Perspective. Washington, DC: WCRF/American Institute for Cancer Research; 2007.
7. Colditz GA, Atwood KA, Emmons K, et al. Harvard Report on Cancer Prevention Volume 4: Harvard Cancer Risk Index. Risk Index Working Group, Harvard Center for Cancer Prevention. Cancer Causes Control. 2000;11:477–488.
8. Selvachandran SN, Hodder RJ, Ballal MS, et al. Prediction of colorectal cancer by a patient consultation questionnaire and scoring system: A prospective study. Lancet. 2002;360:278–283.
9. Imperiale TF, Wagner DR, Lin CY, et al. Using risk for advanced proximal colonic neoplasia to tailor endoscopic screening for colorectal cancer. Ann Intern Med. 2003;139:959–965.
10. Driver JA, Gaziano JM, Gelber RP, et al. Development of a risk score for colorectal cancer in men. Am J Med. 2007;120:257–263.
11. Slattery ML, Edwards SL, Ma KN, et al. Colon cancer screening, lifestyle, and risk of colon cancer. Cancer Causes Control. 2000;11:555–563.
12. Slattery ML, Ballard-Barbash R, Edwards S, et al. Body mass index and colon cancer: An evaluation of the modifying effects of estrogen. Cancer Causes Control. 2003;14:75–84.
13. Coates AO, Potter JD, Caan BJ, et al. Eating frequency and the risk of colon cancer. Nutr Cancer. 2002;43:121–126.
14. Slattery ML, Curtin KP, Edwards SL, et al. Plant foods, fiber, and rectal cancer. Am J Clin Nutr. 2004;79:274–281.
15. Murtaugh MA, Ma KN, Benson J, et al. Antioxidants, carotenoids, and risk of rectal cancer. Am J Epidemiol. 2004;159:32–41.
16. Slattery ML, Caan BJ, Benson J, et al. Energy balance and rectal cancer: An evaluation of energy intake, energy expenditure, and body mass index. Nutr Cancer. 2003;46:166–171.
17. Potter JD. Colorectal cancer: Molecules and populations. J Natl Cancer Inst. 1999;91:916–932.
18. Tomeo CA, Colditz GA, Willett WC, et al. Harvard Report on Cancer Prevention Volume 3: Prevention of colon cancer in the United States. Cancer Causes Control. 1999;10:167–180.
19. Gail MH, Brinton LA, Byar DP, et al. Projecting individualized probabilities of developing breast cancer for white females who are being examined annually. J Natl Cancer Inst. 1989;81:1879–1886.
20. Colditz GA, Rosner BA, Chen WY, et al. Risk factors for breast cancer according to estrogen and progesterone receptor status. J Natl Cancer Inst. 2004;96:218–228.
21. Surveillance, Epidemiology, and End Results (SEER). http://seer.cancer.gov/
22. Graubard BI, Fears TR. Standard errors for attributable risk for simple and complex sample designs. Biometrics. 2005;61:847–855.
23. Willis GB. Cognitive Interviewing: A Tool for Improving Questionnaire Design. Thousand Oaks, CA: SAGE Publications; 2004.
24. Smith T, Dean D. Colorectal Cancer Health Assessment Testing of a Self-Administered Survey. Rockville, MD: Westat; 2005.
25. Slattery ML, Potter JD, Friedman GD, et al. Tobacco use and colon cancer. Int J Cancer. 1997;70:259–264.
26. NIH-AARP Diet and Health Study. http://dietandhealth.cancer.gov/
27. Park Y, Freedman AN, Gail MH, et al. Validation of a colorectal cancer risk prediction model among whites 50 years old and over. J Clin Oncol. 2009;27:694–698.
28. Bach PB, Kattan MW, Thornquist MD, et al. Variations in lung cancer risk among smokers. J Natl Cancer Inst. 2003;95:470–478.
29. Hankey BF, Ries L, Edwards BK. The Surveillance, Epidemiology, and End Results Program: A national resource. Cancer Epidemiol Biomarkers Prev. 1999;8:1117–1121.
30. Slattery M. Physical activity and colorectal cancer. Sports Med. 2004;34:239–252.
31. Slattery ML, Edwards SL, Ma KN, et al. Physical activity and colon cancer: A public health perspective. Ann Epidemiol. 1997;7:137–145.
32. Lee IM, Paffenbarger RS Jr, Hsieh C. Physical activity and risk of developing colorectal cancer among college alumni. J Natl Cancer Inst. 1991;83:1324–1329.
33. Severson RK, Nomura AM, Grove JS, et al. A prospective analysis of physical activity and cancer. Am J Epidemiol. 1989;130:522–529.
34. Ferrari P, Friedenreich C, Matthews CE. The role of measurement error in estimating levels of physical activity. Am J Epidemiol. 2007;166:832–840.
35. Fears TR, Guerry D IV, Pfeiffer RM, et al. Identifying individuals at high risk of melanoma: A practical predictor of absolute risk. J Clin Oncol. 2006;24:3590–3596.
36. Gail MH, Costantino JP, Pee D, et al. Projecting individualized absolute invasive breast cancer risk in African American women. J Natl Cancer Inst. 2007;99:1782–1792.
37. Claus EB, Risch N, Thompson WD. The calculation of breast cancer risk for women with a first degree family history of ovarian cancer. Breast Cancer Res Treat. 1993;28:115–120.
38. Wu X, Lin J, Grossman HB, et al. Projecting individualized probabilities of developing bladder cancer in white individuals. J Clin Oncol. 2007;25:4974–4981.
39. Spitz MR, Hong WK, Amos CI, et al. A risk model for prediction of lung cancer. J Natl Cancer Inst. 2007;99:715–726.
40. Martínez ME, Sampliner R, Marshall JR. Adenoma characteristics as risk factors for recurrence of advanced adenomas. Gastroenterology. 2001;120:1077–1083.
41. Bonithon-Kopp C, Piard F, Fenger C, et al. Colorectal adenoma characteristics as predictors of recurrence. Dis Colon Rectum. 2004;47:323–333.
42. Giovannucci E, Martinez ME. Tobacco, colorectal cancer, and adenomas: A review of the evidence. J Natl Cancer Inst. 1996;88:1717–1730.
43. St John DJ, McDermott FT, Hopper JL, et al. Cancer risk in relatives of patients with common colorectal cancer. Ann Intern Med. 1993;118:785–790.
44. Fuchs CS, Giovannucci EL, Colditz GA, et al. A prospective study of family history and the risk of colorectal cancer. N Engl J Med. 1994;331:1669–1674.

Articles from Journal of Clinical Oncology are provided here courtesy of American Society of Clinical Oncology