This collaborative modeling project demonstrates that the choice of optimal breast cancer screening strategies is complex. All six modeling groups concluded that the most efficient screening strategies are those that include a biennial screening interval. Initiation of screening at age 40 provides small added benefits but is accompanied by a large increase in the number of screening examinations and a high false positive rate. Extending screening beyond age 74 yields moderate mortality reductions and lower false positive rates but at the expense of some women being over-diagnosed. The absolute difference in lifetime probability of death that would be expected under the Task Force recommendation for biennial screening from ages 50 to 74 compared with annual screening from ages 40 to 84 is very small (0.5%).
Screening intervals are somewhat arbitrary. Screening every month or every six months would detect the greatest number of cancers, but would be infeasible in terms of time, mammography resources, and the burden of false positive exams. The finding in this study that biennial screening is more efficient than annual screening is consistent with previous screening trials, most of which used 2-year intervals [1,2]. The efficiency of biennial screening is largely due to the biology of breast cancer and the specificity of mammography. Slow growing tumors are much more common than rapidly growing tumors, and the ratio of slow to fast growing tumors increases with age [32], so that little survival benefit is lost by screening every other year rather than every year. For the small sub-set of younger women with aggressive, faster-growing tumors, even annual screening is not likely to confer a survival advantage. Since the specificity of mammography is less than 100%, the less often screening occurs, the lower the number of false positive results and unnecessary biopsies.
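The relationship between screening frequency and false positive results can be illustrated with a minimal sketch. The specificity value below is a hypothetical placeholder rather than an input or result of the six models, and the sketch assumes constant specificity, independent exams, inclusive start and stop ages, and a woman who never develops cancer.

```python
# Illustrative arithmetic only. ASSUMED_SPECIFICITY is a hypothetical value,
# not a parameter taken from the models described in this paper.
ASSUMED_SPECIFICITY = 0.90


def screening_ages(start, stop, interval):
    """Ages at which screening occurs, treating both endpoints as inclusive."""
    return list(range(start, stop + 1, interval))


def expected_false_positives(n_screens, specificity):
    """Expected number of false positive exams over n_screens exams,
    assuming the same specificity at every exam."""
    return n_screens * (1.0 - specificity)


def prob_at_least_one_false_positive(n_screens, specificity):
    """Probability of one or more false positives, assuming independent exams."""
    return 1.0 - specificity ** n_screens


biennial_50_74 = screening_ages(50, 74, 2)
annual_40_84 = screening_ages(40, 84, 1)

for label, ages in [("biennial 50-74", biennial_50_74),
                    ("annual 40-84", annual_40_84)]:
    n = len(ages)
    print(label, n,
          round(expected_false_positives(n, ASSUMED_SPECIFICITY), 2),
          round(prob_at_least_one_false_positive(n, ASSUMED_SPECIFICITY), 3))
```

Under these assumptions the annual schedule involves more than three times as many exams as the biennial schedule, and the expected number of false positives scales in direct proportion to the number of exams, whatever specificity is assumed.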
In all models, some reductions in breast cancer mortality, albeit small, were seen with strategies initiating screening at age 40 versus age 50. This is consistent with the recent Age trial in the United Kingdom [3,4]. The small magnitude of benefit is attributable to the low incidence of disease from age 40 to 49 and the low sensitivity of mammography in this age group. The same factors that lead to the small benefits in the younger age group also contribute to the harms of screening: high rates of false positive screens and unnecessary biopsies. In addition, since the proportion of DCIS is highest in younger women, screen detection of DCIS that may not be clinically significant could be considered a further harm. Thus, decisions to screen before age 50 largely depend on women's willingness to tolerate these harms for a small chance that cancer is present and that screening will reduce the probability of death from that cancer.
At the other end of the age spectrum, all six models found that screening beyond age 74 remains on the efficiency frontier. This result is consistent with previously reported results of screening benefit from observational and modeled data [33–36]. As with the situation for younger women, any benefits of screening older women must be balanced against possible harms. For instance, the probability of over-diagnosis accelerates among women over age 74. Model estimates for the oldest age groups also carry some inherent uncertainty because of the limited primary data on the natural history of breast cancer and the absence of screening trial data after age 74 years.
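The notion of an efficiency frontier can be made concrete with a short sketch. It applies simple dominance only: a strategy is dropped if another strategy achieves at least as much benefit with no more mammograms. This is a simplification and may not match the exact frontier definition used by the modeling groups. The strategy names and numbers are hypothetical and are not results from this study.

```python
from typing import List, NamedTuple


class Strategy(NamedTuple):
    name: str
    screens: float  # average number of mammograms per woman (resource use)
    benefit: float  # any benefit measure, e.g. percent mortality reduction


def efficiency_frontier(strategies: List[Strategy]) -> List[Strategy]:
    """Return the strategies that are not simply dominated.

    Strategies are visited in order of increasing resource use; one is kept
    only if it yields more benefit than every cheaper (or equally costly)
    strategy already kept.
    """
    frontier = []
    best_benefit = float("-inf")
    for s in sorted(strategies, key=lambda s: (s.screens, -s.benefit)):
        if s.benefit > best_benefit:
            frontier.append(s)
            best_benefit = s.benefit
    return frontier


# Hypothetical values chosen only to show one dominated strategy (D).
candidates = [
    Strategy("A", 10, 15.0),
    Strategy("B", 13, 20.0),
    Strategy("C", 18, 22.0),
    Strategy("D", 20, 21.0),  # more screens than C but less benefit
    Strategy("E", 25, 24.0),
]

frontier_names = [s.name for s in efficiency_frontier(candidates)]
print(frontier_names)
```

In this toy example strategy D is excluded because strategy C delivers more benefit with fewer exams. A strategy that remains on the frontier, as screening beyond age 74 does in all six models, is one for which no less intensive alternative provides equal or greater benefit.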
It is logical to assume that more screening will save substantially more lives. However, when comparing the strategy recommended by the US Preventive Services Task Force [8] to the most intensive regimen we evaluated (annual screening from ages 40 to 84), there was only a one-half of one percent additional reduction in the lifetime probability of death from breast cancer. This somewhat counter-intuitive result reflects several factors, including the fact that most women never develop breast cancer and that, when cancer is diagnosed, treatment is very effective in avoiding death for most women. Additional variables, such as slow tumor growth rates and low incidence rates in young women, also mean that a less intensive screening schedule still maintains the majority of the benefits of more intensive strategies and uses far fewer mammography screening and diagnostic resources.
The collaboration of six groups with different modeling philosophies and approaches to estimate the same end-points by using a common set of data provides an excellent opportunity to cross-replicate results and to depict uncertainty related to modeling assumptions and structure by providing a range of results. The resulting conclusions about the ranking of screening strategies were very robust and should provide greater credibility than inferences based on one model alone.
Despite our consistent results, our study had some limitations. First, our models project mortality reductions similar to those observed in clinical trials, but the range of results includes higher mortality reductions than seen in the trials because we model lifetime screening (rather than screening only for the period of a trial, among women invited to screening) and assume adherence to all screening and treatment. The trials followed women for limited numbers of years and had some non-adherence. Second, we do not consider morbidity associated with surgery for screening-detected disease [37] or decrements in quality of life associated with false-positive results, living with earlier knowledge of a cancer diagnosis, or over-diagnosis [38]. Third, in estimating lifetime results, we projected breast cancer trends from background incidence rates of a 1960 birth cohort extrapolated forward in time. However, future background incidence (and mortality) may change as the result of different forces, and/or results may vary for groups with higher than average risk [39]. Fourth, we assumed 100% adherence to screening and treatment to evaluate program efficacy. Benefits will always fall short of the projected results because adherence is not perfect. If actual adherence varies systematically by age or other factors, the ranking of strategies could change. Finally, we did not include costs in our analysis, although the average number of mammograms per woman (and false-positive results) provides some proxy of resource consumption. Even with these acknowledged limitations, the models demonstrate meaningful, qualitatively similar outcomes.
Choices about optimal ages of initiation and cessation will ultimately depend on program goals, resources, weight attached to the balance of harms and benefits, and considerations of efficiency and equity.