Diagnostic mammograms obtained as short-interval follow-up examinations of a probably benign abnormality have a low sensitivity for detecting cancers diagnosed within the following 12 months (61%) compared with what previous studies have shown for other types of diagnostic mammograms in the same population (usually ≈ 80%) [1]. Few patient or radiologist characteristics were statistically significantly associated with sensitivity.
To our knowledge, this is the first article to describe the accuracy of initial short-interval follow-up mammograms and to evaluate the accuracy by patient and radiologist characteristics. Sickles et al. [1] showed a similarly low sensitivity at 12 months (for all short-interval follow-up examinations) in results that were posted on the public BCSC Website; these results were not published or discussed in the original articles. Our study population was also drawn from the BCSC but included one new study site and additional years of data beyond those included in Sickles’ articles. We also examined the influence of both patient and radiologist characteristics on sensitivity and specificity. These differences, and the fact that we matched the laterality of cancer diagnoses and examinations, might have accounted for the slightly higher sensitivity noted in our results (60.5% vs 55.8% in Sickles’ earlier article [1]). Other previous studies have evaluated the accuracy of the examination producing the initial short-interval follow-up recommendation (i.e., the examination given the probably benign assessment), but these studies did not evaluate the sensitivity or specificity of the initial short-interval follow-up examinations themselves [4
The reason for the low 12-month sensitivity of initial short-interval follow-up examinations is unclear, but there are several possible explanations. First, cancers assessed as probably benign (BI-RADS category 3) may not grow as rapidly as cancers that appear more suspicious for malignancy (BI-RADS category 4 or 5). Therefore, it may be more difficult to identify interval change (and hence to recommend biopsy) at initial short-interval follow-up examinations, because these examinations usually are performed 6 months rather than 1 year after the index mammogram that prompted the short-interval follow-up. The rationale for recommending an initial short-interval follow-up examination is to identify 6 months earlier those poorer-prognosis “probably benign” cancers that do grow sufficiently rapidly to be detected early [10
A second possible reason for the low sensitivity of initial short-interval follow-up examinations is that radiologists interpreting them might be reassured by the previous radiologist’s probably benign interpretation and thus have a higher threshold for calling the initial examination suspicious compared with other diagnostic examinations. Such reassurance would be false if it results in low sensitivity for diagnosing breast cancer. It would be interesting to evaluate short-interval follow-up sensitivity among radiologists who interpreted both the examination that resulted in a probably benign assessment and the initial short-interval follow-up examination; however, we were unable to do this in our study.
A third possible reason for the low sensitivity of initial short-interval follow-up examinations may be that the standard BI-RADS definition, which uses a 12-month follow-up period for evaluating sensitivity and specificity, does not match the 6-month follow-up period recommended after an initial short-interval follow-up examination. Our data showed that using a 6-month follow-up interval for the definition of sensitivity increased the unadjusted sensitivity of short-interval follow-up examinations to 83%, which is similar to the 12-month sensitivity for other diagnostic examinations (≈ 80%) [14]. However, a trade-off in defining the follow-up interval for sensitivity always exists: shortening the follow-up interval increases sensitivity, because fewer false-negative examinations appear as interval cancers during the shorter time. It has been recommended that the follow-up interval for defining sensitivity and specificity of a screening or diagnostic test should match the follow-up interval recommended for that test, as long as that is what occurs in clinical practice [23]. Future research should evaluate the specific follow-up intervals that radiologists are actually recommending after initial short-interval follow-up examinations and whether women are complying with those recommendations.
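The dependence of sensitivity on the follow-up window can be illustrated with a minimal sketch. The `Exam` record, the `sensitivity` helper, and the toy values below are hypothetical and chosen only for illustration; they are not study data and do not reproduce our analysis. The sketch assumes that sensitivity is the proportion of cancers diagnosed within the window for which the examination was assessed as positive.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Exam:
    # True if the examination was assessed as positive (hypothetical field).
    positive: bool
    # Months from the examination to a cancer diagnosis; None if no cancer
    # was diagnosed during the observed follow-up (hypothetical field).
    months_to_cancer: Optional[float]


def sensitivity(exams: List[Exam], window_months: float) -> Optional[float]:
    """Proportion of cancers diagnosed within the window whose exam was positive."""
    cancers = [
        e for e in exams
        if e.months_to_cancer is not None and e.months_to_cancer <= window_months
    ]
    if not cancers:
        return None  # sensitivity is undefined when no cancers fall in the window
    true_positives = sum(1 for e in cancers if e.positive)
    return true_positives / len(cancers)


# Toy values for illustration only (not study data).
exams = [
    Exam(positive=True, months_to_cancer=1.0),
    Exam(positive=True, months_to_cancer=2.0),
    Exam(positive=True, months_to_cancer=3.0),
    Exam(positive=False, months_to_cancer=5.0),   # false negative in both windows
    Exam(positive=False, months_to_cancer=8.0),   # false negative only at 12 months
    Exam(positive=False, months_to_cancer=11.0),  # false negative only at 12 months
    Exam(positive=False, months_to_cancer=None),
    Exam(positive=True, months_to_cancer=None),
]

sens_6 = sensitivity(exams, 6)
sens_12 = sensitivity(exams, 12)
print(f"6-month sensitivity:  {sens_6:.2f}")
print(f"12-month sensitivity: {sens_12:.2f}")
```

In this toy example, the cancers diagnosed at 8 and 11 months after a negative examination count as false negatives only under the 12-month definition, so the 6-month sensitivity is the higher of the two.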
Our study had several limitations. We were unable to retrospectively evaluate whether cancers did or did not show interval progression on short-interval follow-up mammograms, which might have helped to determine more effective thresholds for recommending biopsy rather than continued surveillance. In addition, we were unable to evaluate performance of initial short-interval follow-up examinations by the types of lesions requiring follow-up (e.g., mass, focal asymmetry, calcifications) or by the size and stage of the cancers that were diagnosed. These analyses were beyond the scope of our project.
Although our sample included 130 radiologists from three geographic areas in the United States, we had limited ability to evaluate the importance of individual radiologist characteristics because of a small sample size in some categories. However, all sensitivity and specificity analyses were based on multiple short-interval follow-up mammograms interpreted by the radiologists, increasing the statistical power of these calculations. Despite the large size of our cohort, we were also limited by the small number of women with breast cancer used to evaluate sensitivity, especially within the 6-month follow-up period. Overall, the rate of cancer diagnoses in this study was lower than has been reported for other short-interval follow-up studies. This may be because a large proportion of short-interval follow-up examinations in our study directly followed screening mammograms, which have a lower cancer rate than other diagnostic examinations. Given the low cancer rate and the differences between this and previous studies, we caution the reader in interpreting our results.
Our study had several unique strengths. We were able to evaluate the interpretive performance of short-interval follow-up examinations in a large, geographically diverse population. The large population size allowed us to make several exclusions (such as women with a history of breast cancer or women without outcome information) and to analyze a population of women eligible for short-interval follow-up mammograms. Had we not made these exclusions, we likely would have increased the variability in our sensitivity and specificity estimates, thus decreasing the ease of interpretation of our results. Our population included unique detailed information on patient and radiologist characteristics available from linking several study databases. We also had the ability to link to cancer registry data, which enabled us to calculate sensitivity and specificity.
In conclusion, the sensitivity of diagnostic mammograms obtained as initial short-interval follow-up examinations is low when using the standard 12-month auditing definition for the follow-up period. The reasons for this low sensitivity should be elucidated. We noted increases in sensitivity among women who underwent unilateral short-interval follow-up examinations and among radiologists who spent 10 or more hours per week in breast imaging, but overall, few patient or radiologist characteristics were associated with accuracy. The value of using a 6-month (rather than a 12-month) follow-up period for defining sensitivity also should be examined in future studies.