We evaluated the calibration and discriminatory power of a CRC risk prediction model (ie, the Freedman et al1a model) by using data from a large prospective cohort study. The Freedman et al1a model was well calibrated overall in both men and women and in most categories of risk factors. However, the model overpredicted risk in the AARP cohort in people with a family history of CRC and in those with a history of screening with polyps, and it underestimated risk in those with a history of screening but no polyps. In addition, we found that the prediction model had modest discriminatory accuracy, as measured by the AUC or concordance statistic.
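The two measures above can be illustrated with a minimal sketch. The functions and data here are hypothetical and are not the authors' methods: calibration is summarized as the ratio of model-expected to observed cases (E/O, with 1.0 indicating perfect calibration and values above 1.0 indicating overprediction), and discrimination as the concordance statistic, which equals the AUC.

```python
# Minimal illustrative sketch (hypothetical data, not the study's analysis).

def expected_observed_ratio(predicted_risks, outcomes):
    """E/O ratio: sum of model-predicted risks divided by observed cases.
    E/O > 1 means the model overpredicts; E/O < 1 means it underpredicts."""
    expected = sum(predicted_risks)
    observed = sum(outcomes)
    return expected / observed

def concordance(predicted_risks, outcomes):
    """Concordance statistic (AUC): the probability that a randomly chosen
    case has a higher predicted risk than a randomly chosen non-case
    (ties count as 0.5)."""
    case_risks = [r for r, y in zip(predicted_risks, outcomes) if y == 1]
    noncase_risks = [r for r, y in zip(predicted_risks, outcomes) if y == 0]
    pairs = 0
    concordant = 0.0
    for c in case_risks:
        for n in noncase_risks:
            pairs += 1
            if c > n:
                concordant += 1
            elif c == n:
                concordant += 0.5
    return concordant / pairs

# Hypothetical example: six subjects with predicted absolute risks
# and observed outcomes (1 = developed CRC, 0 = did not).
risks = [0.02, 0.05, 0.01, 0.08, 0.03, 0.06]
cases = [0, 1, 0, 1, 0, 0]
print(expected_observed_ratio(risks, cases))  # 0.25 expected / 2 observed
print(concordance(risks, cases))
```

An AUC of 0.5 corresponds to no discrimination beyond chance, which is why values in the 0.60 to 0.70 range, as reported below, are described as modest.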
Differences between the NIH-AARP questionnaires and those used to develop the model7,8 may account for some of the discrepancies in calibration. First, the screening time period covered in the NIH-AARP questionnaire was the past 3 years, compared with the past 10 years in the Freedman et al model. Second, there was no time restriction on history of polyps in the NIH-AARP study, whereas Freedman et al used polyps found in the past 10 years. Because a person with screening and polyps in the data used by Freedman et al may have been screened as much as 10 years before, the model might overestimate risk in a person in the NIH-AARP cohort who was screened only 3 years before and who was treated for a polyp found then.
Another disagreement between the observed and the expected numbers of CRC patient cases was found for family history of CRC. In the NIH-AARP study, participants with a family history of CRC tended to undergo screening and thus may also have had polyps identified. Given that family history was correlated with screening and polyp history, and that the risk prediction model overpredicted risk in those with a history of screening and polyps, overprediction by family history is not surprising.
In women taking aspirin/NSAIDs in the NIH-AARP cohort, the model overpredicted risk by 16%. This may reflect the weak inverse association between aspirin/NSAID use and CRC risk in the NIH-AARP cohort.
The discriminatory power of the CRC prediction model was modest and comparable to that of other cancer risk models. Studies that validate cancer risk prediction models have reported discriminatory accuracy, as measured by the AUC, of 0.60 to 0.63 for breast cancer,9,10 0.69 for lung cancer,11 0.60 for ovarian cancer,12 and 0.62 for melanoma.13 Although these cancer risk prediction models were developed on the basis of well-established risk factors for these cancers, the modest discriminatory power suggests the need to find additional strong risk predictors.
Overall, our validation results were comparable with those for other cancer risk models, such as those for breast and lung cancer.11,14
The risk prediction model was well calibrated in the NIH-AARP cohort, with some exceptions, despite some differences between this population and the population used to develop the model. The NIH-AARP study population was younger (mostly younger than 70 years) and had a higher socioeconomic status and education level; these factors have been related to healthier lifestyle, easier access to regular cancer screening, and lower CRC risk. These differences may account for differences in the strengths of some associations with risk factors.
The relative risk estimates in the prediction model were based on retrospective case-control data. It is possible that some risk factors, especially those related to behavior, may be subject to recall bias. Although the risk estimates in the prediction model were comparable to those from other published studies, the associations between risk factors and CRC tended to be stronger in the case-control studies than in the prospective cohort studies.
In conclusion, the CRC risk prediction model developed by Freedman et al was well calibrated when evaluated in a large prospective cohort study, and it had modest power to discriminate between individuals at higher and lower CRC risk. This CRC prediction model can be recommended for broader use.