A complete description of the structure and inputs of a model is necessary but provides limited insight into the natural history assumed by the model. In line with the idea of developing standard outputs to describe models, we propose including the projected MCLIR as a prediction measure in the description of screening models. By comparing MCLIRs between models, similarities or differences in natural history, and more specifically in the models' implicitly assumed length of the preclinical disease phase, become apparent. The MCLIR should be relatively easy to calculate with any screening model; because it is based on model output rather than input, it is uniformly applicable regardless of the type of model. We propose this approach as a general way to describe models that are used to estimate the effectiveness or cost-effectiveness of screening strategies.
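To make the calculation concrete, the MCLIR can be read off a small microsimulation. The sketch below uses purely hypothetical natural-history parameters (uniform onset of detectable disease, exponential dwell time with a 10-year mean); it does not represent any of the published models, only a minimal example of the output-based calculation.

```python
import random

def simulate_mclir(n=100_000, intervention_age=65, horizon=20, seed=1):
    """Sketch of the MCLIR: reduction in clinical incidence, by whole year
    since complete one-time removal of all detectable preclinical disease.
    All natural-history parameters are hypothetical, for illustration only."""
    rng = random.Random(seed)
    base = [0] * horizon   # clinical cases per year, no intervention
    after = [0] * horizon  # clinical cases per year, removal at intervention_age
    for _ in range(n):
        onset = rng.uniform(40, 80)        # age at onset of detectable preclinical disease
        dwell = rng.expovariate(1 / 10.0)  # dwell time, mean 10 years (assumption)
        clinical = onset + dwell           # age at clinical diagnosis
        if clinical < intervention_age:
            continue                       # diagnosed before the intervention age
        t = int(clinical - intervention_age)
        if t < horizon:
            base[t] += 1
            # removal at the intervention age prevents every case whose
            # preclinical onset was before that age
            if onset > intervention_age:
                after[t] += 1
    return [1 - a / b if b else 0.0 for a, b in zip(after, base)]
```

With a long assumed dwell time the simulated MCLIR stays high for many years after the intervention; with a short dwell time it falls quickly, which is exactly the contrast between models that the metric is meant to expose.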
Output measures closely related to the MCLIR that express the impact of dwell time are lead time (restricted to disease that would progress to clinical cancer without intervention) and dwell time itself (from disease onset to clinical cancer). In MISCAN, CRC-SPIN, and SimCRC, average dwell time was 8, 25, and 21 years, respectively (13). Given that dwell time is the main driver of the MCLIR, it clearly conveys the same type of information, arranged differently. The MCLIR is presented by time since complete removal of disease at a given age (e.g., age 65 years). The difference between the metrics lies in their relation to clinical incidence by age: at each age, the incident cases represent a difficult-to-grasp mix of shorter and longer dwell times. This relation is not straightforward for the dwell time distribution, whereas it is built into the MCLIR. Dwell time is closer to the inputs of the model; the MCLIR is closer to the effects of screening that could be observed. Even though lead time, like the MCLIR, is tied to an age of intervention, it lacks a straightforward relation to cancer incidence by age. In addition, the lead time distribution (as opposed to the average lead time) is difficult, if not impossible, to output for many models that do not produce paired life histories with and without intervention (e.g., models built with the TreeAge program).
We chose to present the MCLIR for age 65 years because this is the middle of the age range (50–80 years) in which individuals are often recommended to undergo CRC screening. As pointed out in the Methods section, it may be useful to present the MCLIR for different ages (e.g., the MCLIR55, MCLIR65, and MCLIR75) if dwell time assumptions differ by age. Similarly, if dwell time depends on disease characteristics (such as location, e.g., colon cancer v. rectal cancer) or on patient characteristics other than age (such as gender or race), then it would be valuable to present the MCLIR for each of these groups.
Are model differences a limitation, or even a failure, of modeling? We think the differences between our CRC screening models reflect genuine uncertainty, because all 3 models provide a good fit to observed data such as CRC incidence and adenoma prevalence rates (13). The demonstrated differences indicate areas where additional data are needed to inform models and where, in the absence of data, strong assumptions must be made. Only when more relevant data become available will it be clear which model is more accurate. Very recently, the incidence and mortality endpoint results of a large randomized controlled trial of once-only sigmoidoscopy have become available (23). This study's 11-year follow-up contains strong information on dwell time in combination with endoscopy sensitivity, at least for the distal colon. We expect that after the 3 model groups have calibrated their models to these new data, the MCLIRs will differ substantially less. Remaining uncertainty will concern the proximal part of the colon, which is not reached by sigmoidoscopy. In the meantime, the way to handle uncertainties is to perform sensitivity analyses that investigate the robustness of the results to these uncertainties.
In this article, we presented CRC screening models. Our approach, however, is relevant for any screening model, including those for nonneoplastic disease, because screening by definition presumes a detectable preclinical phase before disease becomes symptomatic. The duration of this phase is always an important determinant of the potential of screening. Although the concept is generalizable, specific issues may need attention when the MCLIR is applied to other diseases. One issue is that, unlike for CRC, the possibility of 'incidental' detection of asymptomatic disease (e.g., breast or lung cancer on a computed tomography examination for an unrelated indication) may be important. As pointed out in the Methods section, whether incidental detection is simulated when calculating the MCLIR should be specified, because the MCLIR assumes no further screening. MCLIRs are best compared between models that handle incidental detection in the same way.
In the models presented, the simulation of natural history begins at the onset of detectable disease (i.e., the onset of small adenomas). In some models, the simulation of natural history may begin before the disease is in a detectable state. Because nondetectable disease is not relevant for the effectiveness of screening, we included detectability in the definition of the MCLIR. If the onset of detectable disease is not defined in a model, the MCLIR calculation will automatically simulate the removal of any and all preclinical disease. In that case, however, it becomes more important to also present the clinical incidence reduction (CLIR) after removal that is incomplete because of realistically assumed imperfect sensitivity (see next paragraph).
The MCLIR addresses the modeling of the limitation imposed by natural history on screening effectiveness. Describing the limitation imposed by imperfect test sensitivity would be a logical next step. Interestingly, one could use the same method as for the MCLIR by presenting the CLIR after screening with the base-case sensitivity. Because of the high sensitivity of colonoscopy, the CLIR after colonoscopy was only slightly lower than the MCLIR for our models (results not shown). The difference between the MCLIR and the CLIR after removal of disease detected with a fecal occult blood test (FOBT), of course, will be much larger. Also, the dwell time of detectable preclinical disease and the sensitivity of the test under evaluation are to some extent interchangeable when calibrating models to data on screening effectiveness: a model with a long dwell time combined with low sensitivity can project effectiveness similar to that of a model with a short dwell time and high sensitivity. Typically, these models will show different MCLIRs but similar CLIRs after detection with the estimated sensitivity.
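The interchangeability of dwell time and sensitivity can be made explicit under a hypothetical single-lesion simplification: if a one-time screen detects (and removes) each person's prevalent preclinical lesion with probability equal to the test sensitivity, the expected CLIR at each time point is simply the sensitivity multiplied by the MCLIR. The sketch below illustrates how two hypothetical models with different dwell-time and sensitivity combinations can then produce similar CLIRs despite different MCLIRs; the numbers are invented for illustration, not outputs of the published models.

```python
def expected_clir(mclir_curve, sensitivity):
    """CLIR implied by a one-time screen under the single-lesion
    simplification: a fraction `sensitivity` of the preventable
    (MCLIR) cases is prevented at each time point."""
    return [sensitivity * m for m in mclir_curve]

# Two hypothetical models: long dwell time + low sensitivity v.
# short dwell time + high sensitivity (MCLIR values are invented).
long_dwell_mclir = [0.90, 0.80, 0.70]   # long dwell time -> high MCLIR
short_dwell_mclir = [0.60, 0.53, 0.47]  # short dwell time -> lower MCLIR

clir_a = expected_clir(long_dwell_mclir, 0.6)   # low test sensitivity
clir_b = expected_clir(short_dwell_mclir, 0.9)  # high test sensitivity
# clir_a and clir_b are nearly identical, although the MCLIRs differ.
```

This is why calibration to screening-effectiveness data alone cannot separate the two assumptions, and why reporting the MCLIR alongside the CLIR is informative.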
Suggestions for using projections to describe models date back to the 1990s (11). We build on those suggestions by proposing a standardized metric for comparing models. Projections cannot replace a description of the inputs, which is necessary for reproducibility, but projections do have value: where needed, they can show the implications of implicit assumptions in a concise manner. Projections, combined with a restricted input description, are suitable for inclusion in the main text of a journal article, whereas complete lists of input parameters can be placed in a supplementary (online) document.
In conclusion, adding the simulated maximum clinical incidence reduction (MCLIR) after complete removal of precursor disease is a simple way to clarify the impact of natural history in (CRC) screening models. It would be worthwhile to include such a measure in all screening modeling papers. We have described how to calculate the MCLIR and proposed standard notation for reporting it.