We found that most diagnostic mammograms assessed as probably benign were given a recommendation for short-interval follow-up, as suggested by the BI-RADS Atlas, and that the percentage of short-interval follow-up recommendations increased over time, reaching 96.7% in 2006. These are much higher proportions of short-interval follow-up recommendations after BI-RADS category 3 assessments than reported in previous studies [4]. One explanation for the increased consistency between assessment and recommendation in our study is that we limited our analysis to diagnostic examinations that were performed for the additional evaluation of a recent mammogram. Previous studies showing lower consistency included screening mammographic examinations, for which BI-RADS category 3 assessments should not be made at all [4]. Only Geller et al. [13] evaluated concordance rates among diagnostic examinations that were performed to evaluate a breast problem, but even these rates of concordance were lower than those we found (43.8% in 1996 vs 55.8% in 2001). We did not include diagnostic examinations performed to evaluate a breast problem in our study, which also could have accounted for some of the difference in results. Finally, our finding of increased consistency over time may be the result of additional guidance put forth by the American College of Radiology in the fourth edition of BI-RADS in 2004 [1].
Mammographic examinations that were interpreted using a system that automatically linked assessment and recommendation were associated with the highest odds of short-interval follow-up recommendations. Over 99% of these mammographic examinations received short-interval follow-up recommendations, though a small proportion (0.9%) were given different recommendations, suggesting that radiologists infrequently overrode the linked short-interval follow-up recommendation. This result is consistent with those of a previous study by Geller et al. [18], who showed that linked systems improved the concordance between BI-RADS assessments and recommendations. Approximately 15% of mammograms in the study by Geller et al. were interpreted using linked systems, compared with 40% in the present article. Systems that automatically link assessment and recommendation are increasingly being used in practice, and linkage may be the most effective way to improve agreement, not only among BI-RADS category 3 assessments but also among all BI-RADS assessment categories.
Use of short-interval follow-up recommendations for lesions assessed as probably benign varied by both patient and radiologist characteristics. Both increased mammographic breast density and reporting of breast symptoms at the time of mammography were associated with a decreased probability of receiving the short-interval follow-up recommendation that is consistent with a probably benign assessment. Instead, these women tended to receive recommendations for immediate follow-up (additional imaging or biopsy). It is important to note that few of the 15,515 probably benign assessments were given to women with extremely dense breasts (n = 1,005) or with symptoms (n = 266 with a breast lump and n = 893 with other symptoms). Radiologists may have recommended these women for immediate follow-up because they doubted the accuracy of mammography in women with dense breasts [19] or because they considered symptomatic women to be at greater than average breast cancer risk [21]. The majority of mammography malpractice suits citing delays in cancer diagnoses are initiated by women with symptoms, which could help explain the more aggressive recommendations for this group [22].
Radiologist characteristics were also associated with the consistency between BI-RADS category 3 assessments and short-interval follow-up recommendations. Older radiologists (≥ 55 years old) and those with more than 20 years of experience were less likely to recommend short-interval follow-up compared with younger radiologists or radiologists with fewer years of experience. These two characteristics are highly correlated and may suggest that radiologists who have been practicing longer are more resistant to changing their ways of practice when new or revised interpretive guidance is published. In addition, these radiologists were trained long before the revised American College of Radiology guidance was published, suggesting that there is a need for continuing education for all radiologists, regardless of age or years of experience. Several studies have shown that changing established practice patterns among physicians is difficult and often requires multiple interventions [23]. Training in BI-RADS specifically has been shown to improve agreement in final mammogram assessments among radiologists [25], but we are unaware of any studies that have evaluated whether BI-RADS training improves agreement between recommendations and assessments.
Radiologists who were more likely to recommend short-interval follow-up after a probably benign examination included those with a primary affiliation at an academic medical center. These radiologists may be more up to date on BI-RADS guidance or may have more opportunities for continuing education in mammography than do radiologists who are not affiliated with an academic medical center. In addition, radiologists who reported interpreting ≥ 1,000 mammograms in a single year were more likely to recommend short-interval follow-up, compared with radiologists who reported interpreting < 1,000 mammograms. We are not aware of other studies that have evaluated radiologist characteristics, such as age, experience, academic affiliation, or volume in relation to the consistency of following the BI-RADS guidance of linking mammography assessments and recommendations.
One limitation of our study was that we evaluated radiologist characteristics from a survey collected at one point in time against mammography assessments and recommendations that changed over time. To account for this fact, we recoded radiologist age and experience so that they changed over time with the year of the mammography interpretation. However, most other radiologist characteristics, such as sex and academic affiliation, do not change over time. In addition, our study population was limited to women with their first diagnostic examination assessed as probably benign, and our results cannot be generalized to subsequent examinations with probably benign assessments. Our data may be subject to some misclassification if women had prior mammographic examinations outside of the Breast Cancer Surveillance Consortium registries that we could not identify. Finally, we could not assess patient preferences for follow-up. Some women may have requested an immediate biopsy or additional imaging instead of waiting 6 months for another mammogram, which could have accounted for some of the variability in recommendations.
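The recoding of radiologist age and experience by interpretation year, mentioned among the limitations above, can be illustrated with a minimal sketch. It is not the code used in the study: the function names, the survey year of 2002, and the choice to floor experience at zero are all assumptions made for illustration. Only the cut points (≥ 55 years old, more than 20 years of experience) are taken from the text.

```python
# Illustrative sketch only; names and the example survey year are hypothetical.

def value_at_interpretation(value_at_survey: int, survey_year: int, exam_year: int) -> int:
    """Shift a survey-time value (age or years of experience) to the exam year."""
    return value_at_survey + (exam_year - survey_year)


def recode_radiologist(age_at_survey: int, experience_at_survey: int,
                       survey_year: int, exam_year: int) -> dict:
    """Return age, experience, and the two dichotomized groups for one exam year."""
    age = value_at_interpretation(age_at_survey, survey_year, exam_year)
    # Assumption: experience cannot be negative for exams read before the survey.
    experience = max(0, value_at_interpretation(experience_at_survey, survey_year, exam_year))
    return {
        "age": age,
        "experience": experience,
        "age_55_or_older": age >= 55,        # cut point reported in the text
        "more_than_20_years": experience > 20,  # cut point reported in the text
    }


# A radiologist aged 52 with 19 years of experience at a (hypothetical) 2002 survey.
early = recode_radiologist(52, 19, survey_year=2002, exam_year=2001)
late = recode_radiologist(52, 19, survey_year=2002, exam_year=2006)
```

Under these assumptions the same radiologist falls in the younger, less experienced group for a 2001 examination and in the older, more experienced group for a 2006 examination, which is the effect the recoding is meant to capture.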
Our study also has several strengths. We were able to link a large number of mammographic examinations with both patient characteristics and characteristics of the radiologists who interpreted them. Our study population included diagnostic mammographic examinations performed for the additional evaluation of a recent screening examination across seven U.S. states, in Breast Cancer Surveillance Consortium sites that collect data from facilities that range from small rural practices to large urban hospitals. Therefore, our results should be representative of community practice in the United States [3].
In conclusion, we noted a high degree of consistency between radiologists' recommendations for short-interval follow-up after a probably benign assessment and BI-RADS guidance; this consistency improved between 2001 and 2006. The use of computerized systems that automatically link assessment and recommendations may be an effective method for further increasing consistency with BI-RADS guidance. In addition, continuing medical education on BI-RADS guidance would be relevant for all practicing mammographers to reduce variability in assessment and recommendation and, ultimately, to improve mammographic performance.