We conducted several “experiments” with three independently developed CRC microsimulation models to understand the impact of superimposing a screening mechanism on alternative specifications of the adenoma-carcinoma process. The natural history models represent a wide spectrum of dwell times, although all are consistent with the observed data on adenoma prevalence and cancer incidence. The projected 20-year cumulative CRC incidence for the subgroup of cancer-free 55-year-old individuals with an underlying lesion varied across models: 8.6% (MISCAN), 13.1% (SimCRC), and 13.5% (CRC-SPIN). This result implies that the expected effectiveness of screening would be greater for the SimCRC and CRC-SPIN models than for the MISCAN model. We also found that the 20-year cumulative CRC risk after polypectomy was much greater with the MISCAN model (5.9%) than with the other two models (1.4% and 1.7% for CRC-SPIN and SimCRC, respectively). This finding implies that post-polypectomy surveillance would be more beneficial with the MISCAN model than with the other two models. Both findings for the MISCAN model are consistent with a shorter overall dwell time, and the lower screening effectiveness implied by the former could be offset by the higher surveillance effectiveness implied by the latter. We also found that, with the MISCAN model, a relatively high percentage of diagnosed cancers would not have had a chance of being prevented by colonoscopy 10 years prior because the associated adenoma developed during the 10-year interval. This indicates that the MISCAN model would favor strategies with repeated screenings or shorter intervals, particularly strategies without surveillance.
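The link between dwell time and cancers that a colonoscopy 10 years earlier could not have prevented can be illustrated with a deliberately simple calculation. The sketch below is not taken from MISCAN, SimCRC, or CRC-SPIN. It assumes, purely for illustration, that adenoma onset has been constant for a long time and that dwell time among adenomas that reach cancer is exponential. The function name and the two mean dwell times (the ends of the 10 to 26 year range reported here) are our own choices.

```python
import math


def share_from_new_adenomas(mean_dwell_years, interval_years=10.0):
    """Share of cancers whose adenoma arose within the last `interval_years`.

    Toy assumptions (not from any of the three models): adenoma onset has
    been constant for a long time, and dwell time among adenomas that reach
    cancer is exponential. The share then equals P(dwell < interval).
    """
    return 1.0 - math.exp(-interval_years / mean_dwell_years)


# Hypothetical mean dwell times spanning the range discussed in the text.
for mean_dwell in (10.0, 26.0):
    share = share_from_new_adenomas(mean_dwell)
    print(f"mean dwell {mean_dwell:.0f} y: {share:.0%} of cancers "
          f"arise from adenomas younger than 10 y")
```

Under these assumptions the shorter mean dwell time gives a larger share of cancers from adenomas that did not yet exist at the earlier examination, which is the direction of the MISCAN finding. The magnitudes are not comparable to the model outputs.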
The CRC natural history model parameters were selected to fit observed data on adenoma prevalence from autopsy studies and cancer incidence from SEER. Because adenoma prevalence carries a larger range of uncertainty than cancer incidence, the natural history models’ projections of cancer incidence were closer to one another than their projections of adenoma prevalence, yet all projections were consistent with the cross-sectional data. The underlying assumptions incorporated to simulate the adenoma-carcinoma sequence yielded average overall/total dwell times ranging from 10 to 26 years. While dwell times are a function of growth rates and the variability among growth rates, incorporating non-progressive adenomas (i.e., adenomas that cannot progress to cancer) within the MISCAN model was necessary to achieve mean overall/total dwell times as short as 10 years and still calibrate to the empirical data. This assumption results in extreme heterogeneity in the progression rate from adenoma to cancer: one group of adenomas has rate 0 (i.e., non-progressive adenomas), and the other group has rates that are faster on average than those in the CRC-SPIN or SimCRC models, in which all adenomas have the potential to progress to cancer. The effect of incorporating non-progressive adenomas (and thus modeling a shorter dwell time) is generally a decreased effectiveness of screening.
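The reason a short mean dwell time requires a non-progressive group can be seen in a minimal sketch. It is an illustration under stated assumptions, not the calibration procedure of any of the three models: dwell times are taken to be exponential, and "calibration" is reduced to matching a single quantity, the probability that an adenoma has become cancer by an arbitrary 20-year horizon. The function name and the horizon are hypothetical; the mean dwell times of 26 and 10 years are the ends of the range reported above.

```python
import math


def progressed_by(horizon_years, mean_dwell_years, progressive_fraction=1.0):
    """Probability that an adenoma has become cancer by `horizon_years`.

    Toy model: a fraction of adenomas is progressive with exponential dwell
    time; the remainder never progresses (rate 0).
    """
    return progressive_fraction * (1.0 - math.exp(-horizon_years / mean_dwell_years))


HORIZON = 20.0  # years; arbitrary illustrative horizon

# Homogeneous model: every adenoma can progress, long mean dwell time.
target = progressed_by(HORIZON, mean_dwell_years=26.0)

# Mixture model: progressors are faster (mean dwell 10 y). Solve for the
# progressive fraction that reproduces the same probability at the horizon.
fraction = target / progressed_by(HORIZON, mean_dwell_years=10.0)

print(f"target probability at {HORIZON:.0f} y: {target:.3f}")
print(f"progressive fraction needed with 10 y mean dwell: {fraction:.3f}")
```

In this toy setting the faster progressors overshoot the target unless some adenomas are held at rate 0, so the fitted progressive fraction is below 1. The actual models calibrate to age-specific prevalence and incidence rather than to one number.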
We have shown substantial differences in generated outcomes with our three natural history models, particularly for the MISCAN model relative to the CRC-SPIN and SimCRC models (also see the companion paper [26]). However, when we have used these models to generate outcomes associated with screening strategies, which include both screening and surveillance, we often reach similar conclusions [43-45]. Although models with longer overall/total dwell times predict greater effectiveness of screening than models with shorter dwell times, they also show poorer effectiveness of surveillance, since it takes longer for a new adenoma to progress to cancer. Hence, programs that involve both screening and surveillance may produce similar conclusions when comparing screening strategies. While it is not likely that we will ever have direct evidence on dwell times because of ethical considerations, we anticipate that the large prospective trials of sigmoidoscopy that are currently underway or recently concluded [10,46-48] will provide opportunities to further validate our models and provide insights into the natural history process. In particular, these trials vary in their surveillance protocols (intensive surveillance vs. none prescribed). If the underlying dwell times are longer, we would expect the differences in screening effectiveness between trials with surveillance and trials without surveillance to be relatively modest, all else being equal. Another difference across trial protocols that could shed light on the natural history parameters is the post-sigmoidoscopy follow-up (referral to colonoscopy for large or high-risk adenoma vs. referral to colonoscopy for any adenoma). If the underlying dwell times are shorter, we would expect the differences in screening effectiveness between trials with follow-up of any adenoma and trials with follow-up of only high-risk adenomas to be relatively modest, all else being equal (because missed lesions with shorter dwell times have lower cancer incidence than missed lesions with longer dwell times).
We do not feel that we need to resolve the uncertainties in the natural history of the adenoma-carcinoma process in order for our CRC screening models to be potentially useful for policy makers. That being said, it is important to continue to evaluate our models critically and to be able to evaluate both parameter and structural uncertainty in policy applications. The comparison of outputs from independently developed models provides a sensitivity analysis of the uncertainties surrounding each model’s “deep” structural parameters. Consistent conclusions across models would be helpful for policy makers. If the model conclusions are not consistent with each other, then we would conclude that the models are less helpful for those situations, although they still add value by highlighting the uncertainty.
The CISNET consortium has been a strong proponent of adopting a “comparative modeling” approach, both for understanding the implications of model differences and for conducting policy analyses. Using multiple models designed to address the same question provides a sensitivity analysis on underlying model structure, or on the “deep” model parameters. We have used multiple CRC CISNET models to project life-years gained and resources used associated with various screening strategies to inform the deliberations of the United States Preventive Services Task Force aimed at updating the recommendations for CRC screening for the average-risk population [43] and to inform the Centers for Medicare and Medicaid Services on the cost-effectiveness of new screening technologies [44,45]. Despite the apparent model differences, our findings in those analyses were very similar across models, strengthening the validity of the results.
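As a small illustration of how a comparative modeling summary can be tabulated, the sketch below reports the lowest and highest model projection and their spread for the two outcomes quoted earlier in this discussion. The numbers are the ones reported above; the dictionary layout and the helper name `cross_model_range` are hypothetical and are not part of any CISNET tooling.

```python
# Model projections quoted in this discussion (percent).
outcomes = {
    "20-year CRC incidence, 55-year-olds with an underlying lesion": {
        "MISCAN": 8.6, "SimCRC": 13.1, "CRC-SPIN": 13.5,
    },
    "20-year CRC risk after polypectomy": {
        "MISCAN": 5.9, "CRC-SPIN": 1.4, "SimCRC": 1.7,
    },
}


def cross_model_range(by_model):
    """Return (lowest model, highest model, spread) for one outcome."""
    low = min(by_model, key=by_model.get)
    high = max(by_model, key=by_model.get)
    return low, high, by_model[high] - by_model[low]


for name, by_model in outcomes.items():
    low, high, spread = cross_model_range(by_model)
    print(f"{name}: {by_model[low]}% ({low}) to {by_model[high]}% ({high}), "
          f"spread {spread:.1f} percentage points")
```

A wide spread flags an outcome that is sensitive to structural assumptions, while a narrow spread suggests a conclusion that is robust to them.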
There are several limitations to note. First, while several sets of underlying assumptions would be consistent with data on adenoma prevalence and cancer incidence, we considered only three. Although we show that the models represent a broad spectrum of underlying natural history assumptions, three models cannot cover the full range of plausible assumptions. Second, we recognize that there are many “moving parts” in these models and that it is difficult to identify the exact cause of differences across models. We focus much of our discussion on dwell times, but other assumptions may contribute to the observed differences in model output. Lastly, the ideal study would directly estimate dwell time; unfortunately, such a study could not be conducted in an ethical manner.
In conclusion, we provide results from several hypothetical exercises designed to gain insights into the impact of screening interventions superimposed on alternative approaches to specifying the adenoma-carcinoma process, using three CRC models that are part of the CISNET consortium. We found that differences in dwell times had differential effects on the effectiveness of screening vs. the effectiveness of surveillance. Without direct evidence on the dwell time of adenomas, we anticipate that the large prospective trials of sigmoidoscopy may provide insights into the natural history process because of the variations in surveillance and post-sigmoidoscopy follow-up protocols. When conducting applied analyses to inform policy, using multiple models provides a reasonable sensitivity analysis on the key (unobserved) “deep” model parameters.