We observed statistically significant heterogeneity (P < .001) among the effect size estimates (ORs) in the 25 studies (27 estimates), indicating variability in effect estimates of repeat mammography rates. Of the 15 categorical covariates identified a priori as study characteristics that may influence effect size, no single covariate resolved the heterogeneity in univariate analyses. In multivariable meta-regression, only the intervention strategy of reminders vs the combined categories of education/motivation and counseling remained a statistically significant predictor of the magnitude of the intervention effect as measured by the odds ratio; however, substantial residual heterogeneity persisted in the model. The summary odds ratio for the eight heterogeneous studies using reminders was the largest observed (OR = 1.79, 95% CI = 1.41 to 2.29 computed under a random-effects model) and was statistically significantly (Pdiff = .008) greater than the summary odds ratio for the homogeneous group of 17 studies that used the more intensive strategies of education/motivation or counseling (OR = 1.27, 95% CI = 1.17 to 1.37) regardless of whether it was computed under a fixed- or random-effects model. It is important to note that the eight studies using reminders showed statistically significant heterogeneity despite the notoriously low statistical power of homogeneity testing (80). Moreover, all eight studies were alike in terms of using medical records or administrative data to ascertain mammography status using design 1 (one pre- and one postintervention mammogram) and, except for one study, being conducted within a health-care setting. The results of the influence analyses confirmed that the observed heterogeneity was mostly attributable to one or two of the studies using reminders (65, 67). Because of this heterogeneity, we cannot conclude that the use of a reminder intervention strategy within a health-care setting is more effective than alternate intervention strategies in the same or different study settings. Therefore, additional studies are needed to help resolve the remaining heterogeneity in this subgroup by identifying the explanatory study characteristics or research methodologies that are the key factors in increasing repeat mammography.
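The comparison of the two subgroup summary odds ratios can be illustrated with a short calculation. The sketch below is not the procedure used in the analysis, which is not described in this section. It assumes a Wald z-test on the log odds scale, treats the two subgroups as independent, and back-calculates each standard error from the reported 95% confidence interval. The function names `se_log_or` and `compare_summary_ors` are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def se_log_or(ci_lower, ci_upper):
    """Back-calculate the standard error of log(OR) from a 95% CI (assumption)."""
    return (math.log(ci_upper) - math.log(ci_lower)) / (2 * Z_95)


def compare_summary_ors(or_a, ci_a, or_b, ci_b):
    """Wald z-test for the difference between two independent summary ORs."""
    diff = math.log(or_a) - math.log(or_b)
    se_diff = math.sqrt(se_log_or(*ci_a) ** 2 + se_log_or(*ci_b) ** 2)
    z = diff / se_diff
    # Two-sided P value from the standard normal distribution.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Reminders (random-effects summary) vs education/motivation or counseling,
# using the odds ratios and confidence intervals reported above.
z, p = compare_summary_ors(1.79, (1.41, 2.29), 1.27, (1.17, 1.37))
print(f"z = {z:.2f}, two-sided P = {p:.3f}")
```

Under these assumptions the two-sided P value comes out at roughly .008, in line with the reported Pdiff, although agreement of this kind does not establish that the same test was used.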
The 17 studies that used education/motivation or counseling were remarkably homogeneous in their effect sizes with a narrow confidence interval, suggesting a high degree of consistency among the studies and that the true intervention effect of these strategies may be, at best, moderate, that is, odds ratios between 1.18 and 1.36. The results of the meta-regression modeling further suggest that, among these homogeneous studies, there was no detectable advantage or disadvantage in the different study designs, methods, settings, populations, intervention strategies, delivery modes, outcome measurements, screening intervals, or use of theory. This finding raises a question as to whether substantial increases in regular mammography screening can be expected from education/motivation or counseling interventions, regardless of how intensive, rigorous, innovative, or expensive the approach. In other words, changes in regular mammography screening behavior may not be particularly sensitive to variations in education/motivation or counseling interventions. In the current US environment, substantial increases in regular cancer screening behavior may depend more on factors at the systems level (eg, regulations relating to health-care access such as insurance coverage and standards of preventive care) than on factors at the individual level such as perceived risk of breast cancer.
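The link between homogeneity and the agreement of fixed- and random-effects summaries can be sketched as follows. This is an illustration only: it assumes the DerSimonian-Laird estimator of between-study variance, which this section does not name, and the study-level log odds ratios and standard errors are hypothetical values chosen for demonstration, not data from the reviewed studies. The function name `pool_log_ors` is also hypothetical.

```python
import math


def pool_log_ors(log_ors, ses):
    """Inverse-variance pooling with a DerSimonian-Laird tau^2 (assumed estimator)."""
    w = [1.0 / s ** 2 for s in ses]
    sum_w = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect summary.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    df = len(log_ors) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero when Q does not exceed df.
    tau2 = max(0.0, (q - df) / c)
    w_re = [1.0 / (s ** 2 + tau2) for s in ses]
    random = sum(wi * y for wi, y in zip(w_re, log_ors)) / sum(w_re)
    return {"q": q, "df": df, "tau2": tau2, "fixed": fixed, "random": random}


# Hypothetical homogeneous subgroup: estimates differ by less than their SEs.
homogeneous = pool_log_ors([0.20, 0.25, 0.24, 0.22], [0.10, 0.12, 0.09, 0.11])

# Hypothetical heterogeneous subgroup: estimates differ by far more than their SEs.
heterogeneous = pool_log_ors([0.10, 0.90, 0.50, 1.20], [0.10, 0.10, 0.10, 0.10])

for label, res in (("homogeneous", homogeneous), ("heterogeneous", heterogeneous)):
    print(label, "Q =", round(res["q"], 2), "tau2 =", round(res["tau2"], 4),
          "fixed OR =", round(math.exp(res["fixed"]), 2),
          "random OR =", round(math.exp(res["random"]), 2))
```

When Q does not exceed its degrees of freedom, the estimated between-study variance is truncated to zero and the random-effects weights reduce to the fixed-effect weights, so the two summaries are identical. That is one way a homogeneous subgroup can yield the same estimate under either model.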
Our finding of a relatively modest intervention effect for the subgroup of more intensive intervention studies is consistent with the finding in a meta-analysis (16) that was restricted to tailored interventions and that focused on one-time mammography screening. Sohl and Moyer (16) found that interventions promoting repeat mammography had a smaller effect size (OR = 1.17) compared with those promoting one-time use (OR = 1.53). As discussed previously, however, we cannot conclude that lower intensity interventions such as reminders are a better strategy without additional research to determine whether a particular type of reminder strategy is effective across different study settings, populations, and study methodologies.
An unexpected finding was that there were two types of study designs used in repeat mammography interventions. In design 1, women who had had a recent preintervention mammogram at study baseline were followed long enough to complete one on-schedule mammogram during the study period. In design 2, women with diverse mammography histories at study baseline were followed long enough to complete two on-schedule mammograms during the study period. In design 2 studies, women who were overdue or had never been screened may have been more resistant to attempts to get them to complete screening. Although the confidence intervals overlapped, the odds ratio for design 2 studies was smaller compared with design 1 studies, suggesting that women who were not overdue at study baseline were more likely to complete another mammogram on schedule compared with a group of women that included some who were overdue. Future studies should consider how mammography history may affect receptivity to different types of interventions (81). For example, if a woman experiences the procedure as painful, she may be unwilling to return for her next mammogram when it is due and may disregard reminders or messages promoting mammography.
There is no consensus about how to classify types of intervention strategies. Systematic reviews and meta-analyses of one-time mammography screening have used a number of different classifications (7–19). The lack of consistency may result, in part, from lack of consensus about how to operationalize our theoretic frameworks and constructs (24). In addition, many of the interventions reviewed here were complex, multicomponent interventions and, therefore, difficult to classify. For these reasons, we explored three approaches to classifying intervention strategies. None of these approaches yielded homogeneous subgroups across all categories of a variable, and, with the possible exception of reminders, the effect sizes were generally similar.
In our assessment of the quality of reporting for eight characteristics of internal validity, most studies tested for equivalence of study groups at baseline and reported the response rate at follow-up. Fewer studies reported whether they compared the characteristics of participants who remained in the study with those who dropped out, and even fewer reported whether there was differential attrition by study group. Only 11 studies conducted an intention-to-treat analysis, so it is likely that the effect sizes were overestimated because data on other cancer screening behaviors suggest that dropouts are less likely to complete screening tests (82–84). There was far less attention to reporting study characteristics that affect external validity such as the representativeness of participants and settings. As noted by Steckler and McElroy (85), “Systematic reviews and meta-analyses are limited in the conclusions that can be drawn when external validity data are not reported.” This limitation needs to be addressed if we are to successfully disseminate effective interventions. Application of frameworks to address internal (41, 86) and external (44, 87) validity in the implementation and evaluation of interventions will enable us to learn from our successes as well as our failures.
A limitation of our systematic review is that several of the studies reviewed here were not explicitly designed to promote repeat mammography, although they provided data that allowed us to calculate effect estimates. It may be that had their interventions been designed to address repeat mammography in addition to one-time mammography, the effect estimates in those studies would have been different. Although there is an extensive body of intervention research on one-time mammography screening, the number of intervention studies of repeat mammography is comparatively small, and estimates in some categories of the predictor variables were unstable. In addition, most of the studies were conducted more than 10 years ago, and most were of non-Hispanic white women, thus limiting our ability to generalize the findings to the present day and to other ethnic groups.
If we are to reap the benefits of mortality reduction from mammography screening, we need a better understanding of the determinants of repeat screening behavior so that we can develop more effective interventions. This review called attention to a number of characteristics in the studies to date, which, if attended to in future studies, will increase our understanding of how to develop and implement interventions to increase regular mammography screening. Because reporting standards are increasingly being adopted by journal editors, it will be easier to synthesize the literature and draw conclusions about what works, under what circumstances, and for what reasons.