The authors developed a comprehensive model of colon cancer incidence that allows for nonproportional hazards and accounts for the temporal nature of risk factors. They estimated relative risk based on cumulative incidence of colon cancer by age 70 years. Using multivariate, nonlinear Poisson regression, they determined colon cancer risk among 83,767 participants in the Nurses’ Health Study. The authors observed 701 cases of colon cancer between 1980 and June 1, 2004. There was increased risk for a positive family history of colon or rectal cancer (55%), 10 or more pack-years of cigarette smoking before age 30 years (16%), and tallness (67 inches (170 cm) vs. 61 inches (155 cm): 19%). Reduced risk was observed for current postmenopausal hormone use (−23%), being physically active (21 metabolic equivalent (MET)-hours/week vs. 2 MET-hours/week: −49%), taking aspirin (7 tablets/week vs. none: −29%), and being screened (−24%). Women who smoked, had a consistently high relative weight, had a low physical activity level, consumed red or processed meat daily, were never screened, and consumed low daily amounts of folate had almost a 4-fold higher cumulative risk of colon cancer by age 70 years. For women with a high risk factor profile, adopting a healthier lifestyle could dramatically reduce colon cancer risk.
The influence of many risk factors for colon cancer often varies depending on when they occur, both in relation to age and in relation to other exposures. The dependence of the effects of risk factors on age or on other exposures implies a violation of the standard Cox proportional hazards model and requires a different modeling approach in order to incorporate all risk factors in a single model. The use of biomathematical models, of which the log-incidence model is an example, to describe human cancer incidence evolved from hypotheses regarding the transformation of normal cells to malignant cells; such models are particularly suited to the adenoma-carcinoma model of colon cancer (1, 2).
Factors that have been shown to increase risk of colon cancer include disturbances in energy balance (e.g., physical inactivity and/or obesity) (3–5), high intake of red or processed meat (6), low folate intake, high alcohol intake, and cigarette smoking (7). In contrast, aspirin use (8–10) and postmenopausal hormone use (11) have been shown to reduce a woman's risk of colon cancer. However, most previous studies evaluating the role of these risk factors have focused on them individually. Our aim was to develop a comprehensive model that combines known risk factors for colon cancer and allows modeling of different risk factor profiles. The nonlinear model we developed allows for relative risk estimates based on the cumulative incidence of colon cancer up to age 70 years.
We used data from the Nurses’ Health Study, a cohort study begun in 1976 when 121,701 female registered nurses aged 30–54 years who were married and residing in one of 11 US states (California, Connecticut, Florida, Maryland, Massachusetts, Michigan, New Jersey, New York, Ohio, Pennsylvania, and Texas) completed a self-administered questionnaire. Since 1976, the participants have completed questionnaires biennially to update information on their lifestyle, medications, and medical conditions.
We excluded 29,233 persons who did not return the 1980 dietary questionnaire, persons who were male or had duplicate responses at entry (n = 46), persons with an unknown date of diagnosis (n = 1), and all persons who reported breast cancer or another type of cancer (excluding nonmelanoma skin cancer) on the 1976 questionnaire (n = 2,271). A total of 732 women reported an age at menarche of less than 9 years or more than 21 years and were excluded. We also excluded women who were missing information on age at menopause (n = 15) or menopausal status (n = 15).
Further exclusions among these 89,388 women included those who did not return any questionnaire after 1978 and were without updated information on risk factors or disease status (n = 1,350), those who were missing data on duration of postmenopausal hormone therapy or had an unknown history of use (n = 995), those with missing data on height or an unknown weight at age 18 years (n = 2,514), and those who were missing information on postmenopausal hormone use (n = 762).
After exclusions, we observed 701 cases of colon cancer among 83,767 women over a period of 1,607,643 person-years. For this analysis, women were followed from the return of the 1980 questionnaire to June 1, 2004, the date of return of the last questionnaire, the development of any cancer, or death, whichever occurred first. The study was approved by the institutional review board of Brigham and Women's Hospital in Boston, Massachusetts. Completion of the self-administered questionnaire was considered to imply informed consent.
In 1980, 1984, 1986, 1990, 1994, and 1998, a validated (12–14) semiquantitative food frequency questionnaire was included with the general questionnaire to elicit data on the long-term average diets of the participants. We calculated the median values of the quintiles of intake for each dietary variable and used these medians as a continuous variable. If a questionnaire was missing, the participant's intake was assumed to be unchanged from that of the previous report.
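The carry-forward rule for missing dietary questionnaires can be illustrated with a short sketch. The function name `carry_forward` and the example intake values are hypothetical and chosen only for illustration; this is not the authors' code.

```python
from typing import List, Optional


def carry_forward(values: List[Optional[float]]) -> List[Optional[float]]:
    """Replace each missing report (None) with the most recent earlier report.

    Reports that are missing before any value has been observed stay None.
    """
    filled = []
    last = None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Hypothetical folate intakes (ug/day) for four consecutive dietary cycles;
# the second and fourth questionnaires are missing.
intakes = [310.0, None, 420.0, None]
filled_intakes = carry_forward(intakes)
```

A report that is missing before any intake has been observed has no previous value to inherit, so the sketch leaves it as `None` rather than guessing.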
Detailed questions about physical activity were asked every 4 years starting in 1986. Based on the types of activities reported and the duration of time spent in each activity, we derived total metabolic equivalent (MET)-hours/week and entered this as a continuous variable in the model.
Aspirin use was reported starting in 1980, and frequency of use was reported starting in 1984. Using a detailed algorithm, we derived cumulative averages of frequency, dose, and duration of aspirin use that were consistent across follow-up years. For this analysis, we included the derived variable of cumulative average number of tablets per week as a continuous variable (nonusers were assigned a value of 0).
History of colon or rectal cancer in an immediate family member (mother, father, or sibling) was assessed in 1982, and the information was updated in 1988, 1992, 1996, and 2000.
Information on colorectal cancer screening was provided in 1988, 1990, 1992, and every 2 years after that. On the 1990 questionnaire, participants reported their history of colorectal cancer screening between 1980 and 1990. When a woman reported that she had undergone screening by sigmoidoscopy or colonoscopy, we assigned her 2 cycles (4 years) of screening “coverage” starting from the age at which she reported being screened (to approximately account for appropriate screening intervals of 5 and 10 years for sigmoidoscopy and colonoscopy, respectively). We then summed the total number of years of screening coverage for each woman.
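One way to picture the coverage calculation is the sketch below. It assigns a 4-year window to each reported endoscopy and sums the covered years. The text does not say how windows from closely spaced screenings were handled, so counting overlapping years only once is an assumption of this sketch, as are the function name and the example ages.

```python
def screening_coverage_years(screen_ages, years_per_screen=4):
    """Total years of screening coverage from a list of ages at endoscopy.

    Each screening covers `years_per_screen` years starting at the reported
    age. ASSUMPTION: overlapping windows are merged so that a given year is
    counted once; the paper only states that coverage years were summed.
    """
    # One (start, end) coverage window per reported endoscopy, in age order.
    windows = sorted((age, age + years_per_screen) for age in screen_ages)
    total = 0
    current_start = current_end = None
    for start, end in windows:
        if current_end is None or start > current_end:
            # Close the previous merged window before starting a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Window overlaps or touches the current one: extend it.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


# Hypothetical woman screened at ages 50, 52, and 60.
coverage = screening_coverage_years([50, 52, 60])
```

Here the windows starting at ages 50 and 52 merge into ages 50 to 56 (6 years), and the screening at age 60 adds 4 more, for 10 years of coverage.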
Because screening during follow-up could affect the natural history of the disease differentially according to risk factor status, we also conducted our analysis after censoring women at the time of diagnosis of an adenoma and removing the screening variable from the model.
Reproductive information included age at menarche, age at menopause, and use of postmenopausal hormones (including type). The participants updated information on menopausal status every 2 years, and our analysis accounted for newly menopausal women at each questionnaire cycle.
Height was reported once at baseline (1976). Current weight was reported on every questionnaire starting in 1976, and weight at age 18 years was reported in 1980. The validity of both self-reported current weight and weight at age 18 years is high (15, 16). Using linear interpolation methods, we estimated weight and body mass index (weight (kg)/height (m)²) at single ages.
Information regarding cigarette smoking was collected on all biennial questionnaires starting in 1976. For this analysis, we used information on smoking in early adulthood (pack-years accrued before age 30 years as a continuous variable) in order to account for the long latency of the effect of smoking on colon cancer risk that has been previously observed (17, 18).
On each biennial questionnaire, we asked the participant whether in the past 2 years she had been diagnosed with colorectal cancer; if she had, we requested permission to obtain and review her medical records to confirm the diagnosis. Family members, the postal system, and the National Death Index were used to identify cohort member deaths.
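The linear interpolation of weight to single ages, mentioned above for the anthropometric variables, might look like the following sketch. The function names are hypothetical. The example uses the weights and height given later in the text for the "average" woman in this cohort (56 kg at age 18, 65 kg at age 50, height 163 cm); refusing to extrapolate outside the reported ages is a choice made for this sketch, not something the text specifies.

```python
def interpolate_weight(reports, age):
    """Linearly interpolate weight (kg) at `age` from (age, weight_kg) reports.

    Raises ValueError if `age` lies outside the range of reported ages
    (this sketch does not extrapolate).
    """
    points = sorted(reports)
    if not points or age < points[0][0] or age > points[-1][0]:
        raise ValueError("age is outside the range of reported weights")
    for (age0, w0), (age1, w1) in zip(points, points[1:]):
        if age1 == age0:
            continue  # skip duplicate ages to avoid dividing by zero
        if age0 <= age <= age1:
            fraction = (age - age0) / (age1 - age0)
            return w0 + fraction * (w1 - w0)
    return points[-1][1]  # single report, requested at exactly that age


def body_mass_index(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


# Weights of the cohort's "average" woman at ages 18 and 50, height 1.63 m.
reports = [(18, 56.0), (50, 65.0)]
weight_at_34 = interpolate_weight(reports, 34)
bmi_at_34 = body_mass_index(weight_at_34, 1.63)
```

Age 34 is halfway between the two reports, so the interpolated weight is 60.5 kg, which corresponds to a body mass index of about 22.8.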
The log-incidence model is fitted using iteratively reweighted least squares, with PROC NLIN in SAS (SAS Institute Inc., Cary, North Carolina). The parameters of the model are readily interpretable in a relative risk context. We also calculated cumulative incidence from age 30 years to age 70 years, and we report relative risks for the cumulative risks. Confidence intervals were calculated for these relative risks using methods described previously (19).
We fitted the model assuming that incidence at time t ($I_t$) is proportional to the number of carcinogenic events $C_t$ accumulated throughout life up to age t, that is, $I_t = k \times C_t$ for some constant of proportionality $k$.
The cumulative number of colon cell divisions is factored as follows:
The representation of Ct is described in more detail in Appendix 1 of a previous paper (19). For this analysis, our final comprehensive model was
where $t$ = age, $t_{80}$ = age in 1980, and $t_{86}$ = age in 1986;
$\mathrm{FHX}$ = 1 if there was a family history of colon or rectal cancer;
$\mathrm{REDandPRO}_j$ = intake of red meat or processed meat (servings/day) at age j;
$\mathrm{SMK}_j$ = total pack-years of smoking before age 30 years;
$\mathrm{PMH}_{\mathrm{cur},t}$ = 1 if the participant was a current user of postmenopausal hormones at age t, and 0 otherwise;
$\mathrm{PMH}_{\mathrm{past},t}$ = 1 if the participant was a past user of postmenopausal hormones at age t, and 0 otherwise;
$\mathrm{ASP}_t$ = number of tablets of aspirin taken per week at age t;
$\mathrm{ACT}_t$ = physical activity in MET-hours/week at age t;
$\mathrm{BMI}_j$ = body mass index at age j;
$\mathrm{HGT}$ = adult attained height;
$\mathrm{FOL}_j$ = total intake of folate (μg/day) at age j; and
$\mathrm{SCREEN}_t$ = number of years of endoscopy screening coverage.
Interpretation of each of the coefficients is as follows. $\beta_0$ represents the rate of increase in log incidence per year of increase in age. $\beta_1$ represents the rate of increase in log incidence per year among persons with a family history of colon or rectal cancer. $\beta_2$ represents the effect on log incidence for each additional daily serving of red and processed meat (summed together) per year, starting from age in 1980 (the first year in which we collected dietary information from the Nurses’ Health Study women). $\beta_3$ represents the effect on log incidence per year per pack-year of smoking before age 30 years. $\beta_4$ and $\beta_5$ represent the immediate and past effects, respectively, of using postmenopausal hormones after menopause. $\beta_6$ represents the effect on log incidence per year with increasing number of aspirin tablets used per week. $\beta_7$ represents the effect on log incidence per year with each MET-hour/week of physical activity. $\beta_8$ represents the effect on log incidence per body mass index-year (that is, an increase of 1 unit of body mass index for 1 year). $\beta_9$ and $\beta_{10}$ represent the effect on log incidence per year for height and folate intake, respectively. $\beta_{11}$ represents the effect on log incidence per year of screening coverage by endoscopy (sigmoidoscopy or colonoscopy).
Table 1 shows the mean values and distributions of the variables of interest at approximately the midpoint of the follow-up period (1990). Table 2 presents results from our final model. Women who reported a family history of colon or rectal cancer had a statistically significantly increased risk; we found a 55% increased risk of colon cancer (relative risk (RR) = $e^{0.4383}$ = 1.55) among persons who had at least 1 first-degree relative diagnosed with colon or rectal cancer.
Current use of postmenopausal hormones was associated with a statistically significant 23% reduction in risk of colon cancer (RR = $e^{-0.2625}$ = 0.77). Although past users had an overall 2% decreased risk, the effect was much weaker than that for current use and was essentially null (RR = $e^{-0.0228}$ = 0.98).
Measures of energy balance were supportive of previous findings. Specifically, height was positively associated with risk. Compared with a 61-inch-tall (155-cm) woman, a 67-inch-tall (170-cm) woman had a 19% increased risk (RR = $e^{(0.000736)(6)(40)}$ = 1.19). Body mass index was suggestive of a positive association; for a body mass index of 30 compared with a body mass index of 20, there was a 16% increased risk (RR = $e^{(0.000373)(10)(40)}$ = 1.16). Physical activity showed an expected inverse association, which was statistically significant. Women who engaged in physical activity at a level of 21 MET-hours/week for 40 years had a 49% reduction in risk compared with women who had only 2 MET-hours of activity per week (RR = $e^{(-0.00089)(19)(40)}$ = 0.51; Figure 1, part A).
Intake of red and processed meat was associated with an increased risk. A woman who consumed 1 serving of red or processed meat daily for 40 years had a 20% increased risk of colon cancer compared with a woman who did not eat any red or processed meat (RR = $e^{(0.00452)(40)}$ = 1.20; Figure 1, part B). Women who consumed 600 μg/day of folate for 40 years had a modest 16% reduction in risk compared with women who had a folate intake of 150 μg/day (RR = $e^{(-9.76 \times 10^{-6})(450)(40)}$ = 0.84).
Smoking increased risk for colon cancer; women who accrued 10 pack-years of smoking before the age of 30 years had a 16% increased risk compared with nonsmokers (RR = $e^{(0.000368)(10)(40)}$ = 1.16). Figure 1, part C, shows the age-specific incidence rates for a woman who had 10 pack-years of smoking before age 30 years as compared with the same age-specific rates for a never smoker.
Screening (by endoscopy) was associated with a significantly reduced risk. A woman who was screened from age 50 years to age 70 years had a 24% reduced risk of developing colon cancer. Lastly, we confirmed that aspirin has an important inverse association with colon cancer risk. The use of 7 aspirin tablets per week for 40 years (1 tablet per day) was associated with a 29% reduction in risk (RR = $e^{(-0.00122)(7)(40)}$ = 0.71; Figure 1, part D).
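The relative risks quoted in the preceding paragraphs all follow the same arithmetic: the coefficient is multiplied by the difference in exposure and by the number of years over which that difference is maintained, and the product is exponentiated. The sketch below reproduces the reported values from the coefficients given in the text; the function name `relative_risk` and the dictionary layout are illustrative choices, not part of the original analysis.

```python
import math


def relative_risk(beta, exposure_difference=1.0, years=1.0):
    """Relative risk implied by a log-incidence coefficient.

    beta: coefficient reported in the text.
    exposure_difference: contrast in exposure units (e.g., 6 inches).
    years: number of years the contrast is maintained.
    """
    return math.exp(beta * exposure_difference * years)


# Coefficients and contrasts exactly as quoted in the text.
reported = {
    "family history": relative_risk(0.4383),
    "current hormone use": relative_risk(-0.2625),
    "past hormone use": relative_risk(-0.0228),
    "height, 67 vs. 61 inches": relative_risk(0.000736, 6, 40),
    "BMI, 30 vs. 20": relative_risk(0.000373, 10, 40),
    "activity, 21 vs. 2 MET-h/wk": relative_risk(-0.00089, 19, 40),
    "red/processed meat, 1 vs. 0 servings/day": relative_risk(0.00452, 1, 40),
    "folate, 600 vs. 150 ug/day": relative_risk(-9.76e-06, 450, 40),
    "smoking, 10 vs. 0 pack-years": relative_risk(0.000368, 10, 40),
    "aspirin, 7 vs. 0 tablets/wk": relative_risk(-0.00122, 7, 40),
}

for label, rr in reported.items():
    print(f"{label}: RR = {rr:.2f}")
```

Rounded to 2 decimal places, these match the values in the text (1.55, 0.77, 0.98, 1.19, 1.16, 0.51, 1.20, 0.84, 1.16, and 0.71, respectively).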
In order to evaluate the influence of screening on the natural history of the disease, we also censored women who were diagnosed with adenoma during the follow-up period (on the date of their adenoma diagnosis). For this analysis, we had 631 cases accumulated over 1,588,035 person-years. The results were essentially the same as the main results from the full model with screening as a covariate (data not shown).
After choosing the variables in our final model (Table 2), we used the coefficients to generate age-specific incidence rates from age 30 years to age 70 years. Using the age-specific incidence rates, we then estimated the cumulative incidence of colon cancer up to age 70 years for hypothetical women with varying levels of specific risk factors, holding the other variables constant (Table 3). Increased risk was observed for height, having a consistently high relative body weight from age 18 years to age 70 years, having a family history of colon or rectal cancer, and smoking before age 30 years. Important inverse associations were observed for current postmenopausal hormone use, being consistently lean, being physically active, taking aspirin, and being screened by endoscopy. Although the overall association with body mass index was not statistically significant (Table 2), a lifetime pattern of overweight was associated with increased risk. Given the same average height of 64 inches (163 cm), women who were consistently lean (at the 10th age-specific percentile from age 18 years to age 70 years; 105 pounds (48 kg) at age 18, 118 pounds (54 kg) at age 50, 120 pounds (55 kg) at age 60, and 118 pounds (54 kg) at age 70) had a 7% reduced risk of colon cancer (95% confidence interval (CI): −1%, −13%) compared with the “average” woman in this cohort (a weight of 123 pounds (56 kg) at age 18, 142 pounds (65 kg) at age 50, 146 pounds (66 kg) at age 60, and 145 pounds (66 kg) at age 70). More strikingly, a woman who had a consistently high relative body weight (at the 90th age-specific percentile; 150 pounds (68 kg) at age 18, 185 pounds (84 kg) at age 50, 190 pounds (86 kg) at age 60, and 185 pounds (84 kg) at age 70) had a 68% increased risk (95% CI: 39%, 103%) compared with a woman of “average” weight and height.
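The step from age-specific incidence rates to a cumulative incidence by age 70 years can be sketched as follows. The conversion used here, one minus the exponential of the negative summed annual rates, is the standard survival-analysis relation and is an assumption of this sketch; the text does not state the exact formula the authors used. The baseline rates and the rate ratio of 1.5 are hypothetical numbers, not estimates from the model.

```python
import math


def cumulative_incidence(annual_rates):
    """Cumulative incidence implied by a sequence of annual incidence rates.

    annual_rates: incidence rate per person-year at each single year of age.
    ASSUMPTION: uses 1 - exp(-sum of rates); competing risks are ignored.
    """
    return 1.0 - math.exp(-sum(annual_rates))


def cumulative_relative_risk(rates_profile, rates_reference):
    """Ratio of cumulative incidences for two risk factor profiles."""
    return cumulative_incidence(rates_profile) / cumulative_incidence(rates_reference)


# Hypothetical reference rates for the 40 single years of age from 30 to 69,
# rising log-linearly with age (illustrative values only).
ages = range(30, 70)
reference_rates = [1e-5 * math.exp(0.08 * (age - 30)) for age in ages]

# Hypothetical profile whose rate is 1.5 times the reference at every age.
profile_rates = [1.5 * rate for rate in reference_rates]

rr_cumulative = cumulative_relative_risk(profile_rates, reference_rates)
```

Because colon cancer is rare over this age range, the cumulative relative risk stays very close to the underlying rate ratio (just under 1.5 in this example). With age-varying effects, as in the fitted model, the two would generally differ, which is why summarizing by cumulative incidence is convenient.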
To evaluate the combined effect of multiple modifiable risk factors, we compared women with various risk factor profiles, keeping the other (nonmodifiable) variables constant. Women with a moderate risk profile had an elevated and statistically significant 44% increased risk compared with the low-risk group (RR = 1.44, 95% CI: 1.05, 1.96). Women with a high risk factor profile (women who smoked, had a consistently high relative weight, had a low physical activity level, consumed 1 serving per day of red or processed meat, were never screened, and consumed low amounts of folate (150 μg/day)) had a 3.8-fold increased risk (RR = 3.84, 95% CI: 1.61, 9.16; Table 3 and Figure 2). We observed a slightly stronger effect of aspirin use for high-risk women versus moderate-risk women when compared with women at low risk (Figure 3). Lastly, we evaluated the effect of screening among high-risk women (Figure 4). Although endoscopic screening from age 50 years to age 70 years reduced the risk among high-risk women, their risk was still much higher than that of women whose lifestyle behaviors placed them in the moderate- or low-risk category.
We evaluated the predictive ability of our log-incidence model using receiver operating characteristic (ROC) curve analysis. First, we calculated the predicted absolute risk of colon cancer for each woman using all of the risk factors in our final model and stratified the data by 5-year age group. Within each age group, we then calculated the Mann-Whitney U statistic, thus obtaining an index that can be interpreted as the probability that within a specific 5-year age group a random case will have a higher predicted risk than a random noncase or, alternatively, as the area under the ROC curve based on our predicted model for women in a specific age group. We then computed a weighted average of the age-specific Mann-Whitney U statistics with weights equal to the inverse variance of the age-specific statistics. Overall, the area under the ROC curve, adjusted for age, was 0.61 (95% CI: 0.59, 0.63). This is consistent with other published reports on the discriminatory accuracy of breast and ovarian cancer models (19, 20), as well as a recent colon cancer model developed by Park et al. (21) and Freedman et al. (22).
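The age-adjusted area under the ROC curve described above can be sketched in a few lines. Within each age stratum, the Mann-Whitney statistic is the fraction of case-noncase pairs in which the case has the higher predicted risk (ties count one half); the strata are then pooled with inverse-variance weights. The text does not say how the stratum variances were estimated, so the Hanley-McNeil approximation used here is an assumption, and the function names and predicted risks are hypothetical.

```python
import math


def auc_mann_whitney(case_scores, noncase_scores):
    """Probability that a random case scores higher than a random noncase."""
    wins = 0.0
    for case in case_scores:
        for noncase in noncase_scores:
            if case > noncase:
                wins += 1.0
            elif case == noncase:
                wins += 0.5  # ties count as half a win
    return wins / (len(case_scores) * len(noncase_scores))


def hanley_mcneil_variance(auc, n_cases, n_noncases):
    """Approximate variance of an AUC (ASSUMPTION: Hanley-McNeil formula)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    numerator = (auc * (1.0 - auc)
                 + (n_cases - 1) * (q1 - auc ** 2)
                 + (n_noncases - 1) * (q2 - auc ** 2))
    return numerator / (n_cases * n_noncases)


def age_adjusted_auc(strata):
    """Inverse-variance weighted average of stratum-specific AUCs.

    strata: list of (case_scores, noncase_scores), one pair per age group.
    Returns (pooled_auc, standard_error).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for cases, noncases in strata:
        auc = auc_mann_whitney(cases, noncases)
        variance = hanley_mcneil_variance(auc, len(cases), len(noncases))
        if variance <= 0:
            raise ValueError("stratum variance is zero; weights are undefined")
        weight = 1.0 / variance
        weighted_sum += weight * auc
        weight_total += weight
    return weighted_sum / weight_total, math.sqrt(1.0 / weight_total)


# Hypothetical predicted risks for two 5-year age groups.
example_strata = [
    ([0.012, 0.020, 0.009], [0.008, 0.011, 0.015, 0.007]),
    ([0.030, 0.022], [0.018, 0.025, 0.020, 0.035]),
]
pooled_auc, pooled_se = age_adjusted_auc(example_strata)
```

A stratum in which cases and noncases are perfectly separated has an estimated variance of zero under this approximation, so the sketch raises an error rather than assigning it infinite weight.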
In this large, prospective cohort study, women who did not exercise, consumed high amounts of red and processed meat, had a low folate intake, and had a consistently high relative body weight had over 3.5 times the cumulative incidence of colon cancer, by age 70 years, of women who maintained a low-risk lifestyle and diet (defined as exercising regularly, consuming low amounts of red and processed meat, maintaining a low relative body weight, and consuming 400 μg/day of folate). We also found that while endoscopic screening significantly reduced risk, the magnitude of the risk reduction was less than the reduction we observed for lifestyle and dietary changes alone. These findings support most of the previously established risk factors for colon cancer and underscore the importance of primary prevention measures to decrease colon cancer incidence. Colorectal cancer screening beyond age 70 years will become an increasingly important issue as the US population ages; however, we did not include women over age 70 years in our cumulative incidence relative risk calculations because we had limited numbers of cases in the highest age categories.
Using Surveillance, Epidemiology, and End Results (SEER) data (1988–1992), we compared our age-specific incidence rates (35–84 years) with those of the US population using the observed and expected numbers of cases in each 5-year age group and found a relative risk of 0.72. In the higher age categories, the Nurses’ Health Study population appeared to have fewer cases than expected when compared with the SEER data. One possible explanation for this discrepancy is higher screening rates in our population as compared with the population of the SEER registries.
Our model has several similarities to a model developed recently by Freedman et al. (22) for use as a clinical prediction tool. Our final models (for women) were similar in terms of risk factors included, with the exception of several variables that were significant in our model: red and processed meat (nonsignificant in Freedman et al.’s model), folate (not considered), height (not considered), and smoking (nonsignificant in women). Conversely, the Freedman et al. model included vegetable intake and an interaction between body mass index and estrogen status. We did not consider vegetable intake because we have found no prior evidence that vegetables are a risk factor in this cohort, a similar cohort of men, or a separate cohort of women (23, 24). Moreover, fruit and vegetable intake is not yet an established risk factor for colon cancer overall (25). Also, in contrast, we found no evidence for an interaction between body mass index and estrogen status (data not shown). Freedman et al.’s screening variable was slightly more detailed, and they evaluated the proximal colon, distal colon, and rectum separately. The association we observed for smoking is consistent with a recent meta-analysis of prospective studies (26) in which findings were not reported by sex; Freedman et al. only found a significant association between smoking and proximal colon cancer (22).
Several strengths of our analysis include the fact that our data were entirely based on a prospective cohort study with detailed repeated dietary data, whereas the Freedman et al. model relied on combining data from 2 separate case-control studies. We also were able to control for time-varying covariates; multiple biennial questionnaires allowed us to update our risk factor information throughout the follow-up period. Requiring detailed diet information may be less feasible for use in a clinical prediction setting; however, our goal was to develop an accurate model based on the current state of knowledge on colon cancer risk factors. Modifications for clinical feasibility, more detailed analyses by subsite, and a similar evaluation in a male population are some of the many avenues for future development of our model.
The majority of epidemiologic studies of colon cancer have focused on risk factors individually and have not accounted for the changes in magnitude and direction of association that occur throughout the life span. By calculating the cumulative incidence of colon cancer up to age 70 years for various risk profiles based on our nonlinear regression model, we can avoid the limitations of age-specific incidence rates and obtain simple relative risk estimates for any combination of risk factors. Furthermore, our findings suggest that primary prevention of colon cancer through lifestyle changes is an important complement to colorectal cancer screening in reducing colon cancer incidence (Figure 4). These lifestyle and dietary changes also have the benefit of reducing the risk of other major chronic diseases.
In conclusion, our results support the overwhelming body of evidence that specific lifestyle and dietary changes, especially if started early in life and maintained over time, are an efficacious strategy for reducing the burden of colon cancer.
Author affiliations: Channing Laboratory, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts (Esther K. Wei, Charles S. Fuchs, Edward L. Giovannucci, Bernard A. Rosner); California Pacific Medical Center Research Institute, San Francisco, California (Esther K. Wei); Siteman Cancer Center, Washington University School of Medicine, St. Louis, Missouri (Graham A. Colditz); and Department of Nutrition, Harvard School of Public Health, Boston, Massachusetts (Edward L. Giovannucci).
This work was supported by the National Cancer Institute (grant CA87969).
The authors thank Drs. Sue Hankinson, Meir Stampfer, and Walter Willett for their comments on earlier drafts of this article and Marion McPhee, Barbara Egan, and Karen Corsano for technical support.
Conflict of interest: none declared.