American College of Radiology BI-RADS guidance suggests that women with a probably benign finding on mammography receive a management recommendation for short-interval follow-up; historically, radiologists in community practice have not consistently linked this assessment with short-interval follow-up. We evaluated predictors of discordance between probably benign assessments and short-interval follow-up recommendations.
We linked data on 196 radiologists who completed a survey on demographic and practice patterns to 15,515 diagnostic mammograms they interpreted with probably benign assessments between 2001 and 2006. Patient characteristics were collected at the time of the mammography. Using logistic regression, we examined whether patient and radiologist characteristics were associated with the odds of short-interval follow-up recommendations (relative to a recommendation for normal follow-up, additional imaging evaluation, or biopsy or surgical consultation).
Overall, 90.9% of mammograms with probably benign findings were recommended for short-interval follow-up; 4.3% were recommended for normal follow-up, 3.0% for additional imaging, and 1.8% for biopsy or surgical consultation. Women with probably benign findings were less likely to receive a short-interval follow-up recommendation if they had extremely dense breasts versus almost entirely fatty breasts (odds ratio [OR], 0.61; 95% CI, 0.39–0.96) or had a breast lump versus no symptoms (OR, 0.55; 95% CI, 0.38–0.78). Radiologists were less likely to recommend short-interval follow-up if they had ≥ 20 years of experience versus < 10 years of experience (OR, 0.57; 95% CI, 0.36–0.90) but more likely if they practiced primarily at an academic medical center versus other institutions (OR, 2.66; 95% CI, 1.14–6.21).
In contrast to older studies, the majority of probably benign assessments are now recommended for short-interval follow-up, but the probability of short-interval follow-up recommendations varies by patient and radiologist characteristics.
BI-RADS, which was developed by the American College of Radiology, outlines a standardized reporting system of mammographic lesions. The BI-RADS Atlas  outlines seven assessment categories, each of which has an associated follow-up recommendation to guide radiologists and referring clinicians in the management of their patients . BI-RADS assessment category 3 (probably benign finding) is associated with a suggested recommendation for short-interval (< 1 year, usually 6 months) follow-up. The lesions intended for a category 3 assessment are those believed to have an extremely low (< 2%) probability of being malignant [1, 3–9]. Short-interval follow-up mammography monitors lesions for changes at a more frequent interval than regular screening and is intended to serve as an alternative to invasive procedures, such as biopsy or fine-needle aspiration [1, 10]. Given the low probability of malignancy, discordance between probably benign assessments and short-interval follow-up recommendation can lead to unnecessary and costly workup for the majority of findings that are truly benign.
Previous studies have reported that up to 14% of screening and diagnostic mammography examinations are assigned BI-RADS category 3 [4–6, 11–14]. These studies have also shown that only 40–71% of these examinations are given the suggested recommendation for short-interval follow-up [4, 11, 13, 15]. The inconsistency between BI-RADS category 3 assessment and short-interval follow-up recommendation may be related to patient characteristics, such as age, family history of breast cancer, body mass index, and breast density [5, 12, 14, 15], or to variation in the practice patterns of individual radiologists [12, 14]. Some of the studies describing the use of short-interval follow-up recommendations took place before 1999, before BI-RADS was used as widely in clinical practice as it is now . Fortunately, concordance has increased over time (from 54% between 1996 and 1999 to 71% between 1999 and 2001), perhaps reflecting changes in the BI-RADS Atlas  and Mammography Quality Standards Act . However, no study has evaluated concordance since publication of the most recent edition of BI-RADS Atlas in 2003 , the only edition that provides more explicit guidance concerning the linkage of assessment categories and management recommendations.
Using data from the Breast Cancer Surveillance Consortium, we assessed current trends in the use of short-interval follow-up recommendations for probably benign assessments and examined whether concordant recommendations are associated with patient or radiologist characteristics. A recommendation for prompt additional imaging, biopsy, or surgical consultation instead of short-interval follow-up mammography may lead to unnecessary patient anxiety, workup, and increased health care costs. A recommendation for normal follow-up instead of short-interval follow-up may lead to delay in diagnosis of the infrequent (< 2%) cancers found among probably benign lesions. Understanding the factors that influence a radiologist's management of probably benign lesions may help in designing practice-based interventions to reduce variability in mammographic performance.
This analysis used data from seven mammography registries that are part of the National Cancer Institute–funded Breast Cancer Surveillance Consortium : Group Health, Western Washington; Colorado Mammography Project; New Hampshire Mammography Network; Carolina Mammography Registry, North Carolina; New Mexico Mammography Project; San Francisco Mammography Registry, California; and Vermont Breast Cancer Surveillance System. The Breast Cancer Surveillance Consortium statistical coordinating center pooled data for analysis. Each registry and the statistical coordinating center received institutional review board approval to enroll participants, link data, and perform analytic studies. All procedures were HIPAA compliant, and all registries and the statistical coordinating center have received a federal certificate of confidentiality and other protection for the identities of women, physicians, and facilities who are subjects of this research.
We invited radiologists who interpreted mammograms during 2005–2006 at any of the Breast Cancer Surveillance Consortium sites to complete a self-administered mailed survey between January 2006 and September 2007. Surveys were distributed in four sites (Colorado, North Carolina, New Hampshire, and Washington) in 2006 and in the remaining three sites (California, New Mexico, and Vermont) in 2007. We linked survey results from 196 participating radiologists to the diagnostic mammograms they interpreted between 2001 and 2006. A copy of the survey is available at www.breastscreening.cancer.gov/collaborations/favor.html.
The Breast Cancer Surveillance Consortium collected information on patient demographics (e.g., age and race) and breast cancer risk factors (e.g., family history of breast cancer, breast symptoms, menopausal status, and current hormone therapy use) using standardized questionnaires completed by women at the time of their mammographic examinations .
We included diagnostic mammography examinations performed for the additional evaluation of a recent mammogram and given a final BI-RADS assessment of category 3 (probably benign) during 2001–2006. We did not include diagnostic examinations performed to evaluate a breast problem, an indication usually used for symptomatic women. This classification was based on the radiologist's indication for the examination; however, women could still self-report having symptoms or breast problems at the time of the examination regardless of the radiologist's indication. BI-RADS Atlas  recommends that radiologists give probably benign assessments for 2 to 3 years after the first probably benign assessment to evaluate stability versus change in the lesion. However, only the first two follow-up examinations in the surveillance protocol are recommended at an interval shorter than routine annual screening mammography (usually 6 months). If women had multiple examinations with a probably benign assessment, we included their first mammography with a probably benign assessment. We excluded mammography examinations for women with a history of breast cancer or a prior short-interval follow-up mammography. Finally, we excluded mammograms that were missing key covariate or outcome information, including breast density, time since previous mammography, and the final follow-up recommendation made by the radiologist. The final sample included 15,515 women, which was 5.6% of all women from the same time period without missing data who underwent mammography indicated by the radiologist to be for the additional evaluation of a recent mammogram (n = 277,446).
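The inclusion and exclusion rules above can be read as a filter followed by a "first examination per woman" selection. The following is a minimal sketch of that reading, not the study's actual code: the `Exam` record, its field names, and the indication labels are hypothetical, and applying the exclusions before picking each woman's first examination is an assumption, since the text does not state the order of operations.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Exam:
    """Hypothetical examination record; field names are illustrative only."""
    woman_id: str
    exam_date: date
    indication: str                     # "additional_evaluation" or "breast_problem"
    birads: int                         # final BI-RADS assessment category
    prior_breast_cancer: bool
    prior_short_interval_followup: bool
    breast_density: Optional[int]       # None when missing
    months_since_previous: Optional[float]
    recommendation: Optional[str]


def is_eligible(exam: Exam) -> bool:
    """Apply the stated inclusion and exclusion criteria to one examination."""
    return (
        exam.indication == "additional_evaluation"
        and exam.birads == 3
        and 2001 <= exam.exam_date.year <= 2006
        and not exam.prior_breast_cancer
        and not exam.prior_short_interval_followup
        # Exclude examinations missing key covariate or outcome information.
        and exam.breast_density is not None
        and exam.months_since_previous is not None
        and exam.recommendation is not None
    )


def select_cohort(exams: Iterable[Exam]) -> List[Exam]:
    """Keep the earliest eligible examination for each woman."""
    first: Dict[str, Exam] = {}
    for exam in sorted(exams, key=lambda e: e.exam_date):
        if is_eligible(exam) and exam.woman_id not in first:
            first[exam.woman_id] = exam
    return list(first.values())
```

Under this reading, the reported 5.6% is simply the selected cohort (15,515) divided by the 277,446 women with complete data who had an examination for additional evaluation in the same period.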
Recommendations were classified from the radiologists' reports as short-interval follow-up, normal follow-up, additional imaging evaluation, or biopsy, surgical evaluation, or clinical examination. All short-interval follow-up recommendations were included regardless of the recommended time interval for follow-up, although 84% of the 15,515 examinations were recommended for follow-up in 6 months. Recommendations for additional imaging evaluation in 6 months (n = 640) were classified as short-interval follow-up instead of additional imaging evaluation. In a sensitivity analysis, keeping the original recommendation for additional imaging evaluation did not change our results. Some radiology facilities use software that automatically links assessment and recommendation based on BI-RADS guidelines; however, in some systems, it is possible to override the linkage and make a different recommendation. For each mammogram, we included as a covariate in the analysis an indicator of whether the mammographic recommendation was linked to the final assessment at the time of interpretation.
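The classification rule described above, including the reclassification of additional imaging at 6 months and the sensitivity analysis that leaves it unchanged, might be sketched as follows. The category labels, function names, and the `reclassify_delayed_imaging` flag are hypothetical and chosen only for illustration.

```python
from typing import Optional

# Hypothetical labels for the four recommendation categories.
SHORT_INTERVAL = "short_interval_follow_up"
NORMAL = "normal_follow_up"
ADDITIONAL_IMAGING = "additional_imaging"
BIOPSY_SURGICAL_CLINICAL = "biopsy_surgical_or_clinical"

CATEGORIES = {SHORT_INTERVAL, NORMAL, ADDITIONAL_IMAGING, BIOPSY_SURGICAL_CLINICAL}


def classify_recommendation(
    raw: str,
    interval_months: Optional[int] = None,
    reclassify_delayed_imaging: bool = True,
) -> str:
    """Map a reported recommendation to its analysis category.

    Additional imaging recommended at 6 months is treated as short-interval
    follow-up. Passing reclassify_delayed_imaging=False keeps the original
    category, as in the sensitivity analysis.
    """
    if raw not in CATEGORIES:
        raise ValueError(f"unknown recommendation: {raw!r}")
    if (
        reclassify_delayed_imaging
        and raw == ADDITIONAL_IMAGING
        and interval_months == 6
    ):
        return SHORT_INTERVAL
    return raw


def short_interval_outcome(category: str) -> int:
    """Binary outcome for the regression: 1 = short-interval follow-up, 0 = any other."""
    return 1 if category == SHORT_INTERVAL else 0
```

Keeping the reclassification behind a flag makes it straightforward to rerun the same analysis both ways and compare the results.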
We describe the characteristics of women who received probably benign assessments and of the radiologists who made the probably benign assessments. We also describe the distribution of patient and radiologist characteristics by follow-up recommendation after the probably benign assessment.
We used logistic regression to examine the association between patient and radiologist characteristics and the odds of receiving a recommendation for short-interval follow-up mammography given a probably benign assessment. Using generalized estimating equations, we accounted for clustering by radiologist and adjusted for patient age, year of the mammography, mammographic breast density, and Breast Cancer Surveillance Consortium registry.
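For readers less familiar with how logistic regression output becomes an odds ratio, the sketch below shows the usual relationship: the OR is the exponentiated coefficient, and a 95% CI is obtained by exponentiating the coefficient plus or minus about 1.96 standard errors. It assumes a Wald-type interval on the log-odds scale, which the text does not state explicitly, and the function names are illustrative. It does not fit the model itself; in a generalized estimating equations analysis the standard errors would come from the fitted model, with clustering by radiologist taken into account.

```python
import math
from typing import Tuple

# Two-sided 95% critical value of the standard normal distribution.
Z_95 = 1.959963984540054


def odds_ratio_with_ci(beta: float, se: float, z: float = Z_95) -> Tuple[float, float, float]:
    """Convert a log-odds coefficient and its standard error to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coefficient_from_reported(
    odds_ratio: float, lower: float, upper: float, z: float = Z_95
) -> Tuple[float, float]:
    """Recover the approximate coefficient and standard error from a reported OR and CI.

    Assumes the interval is symmetric on the log scale (Wald-type).
    """
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Example with the academic-affiliation estimate reported in the abstract:
# OR, 2.66; 95% CI, 1.14-6.21.
beta, se = coefficient_from_reported(2.66, 1.14, 6.21)
or_hat, ci_low, ci_high = odds_ratio_with_ci(beta, se)
```

A reported interval that is roughly symmetric around the OR on the log scale, as in this example, is consistent with that assumption; the same check can help spot transcription errors in reported intervals.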
In Table 1, we show the distribution of radiologist characteristics among those who gave BI-RADS category 3 assessments. About 70% of radiologists were male, and 65% had more than 10 years of experience interpreting mammograms. Few radiologists spent the majority of their time in breast imaging; 84.7% spent less than 25% of their workload interpreting screening mammograms, and 72.1% spent less than 10% of their workload interpreting diagnostic mammograms.
Among 15,515 women with probably benign assessments, 64.6% were between the ages of 40 and 59 years, and 16.1% reported having a first-degree family history of breast cancer (Table 2). The majority of examinations with probably benign assessments occurred among women who reported no symptoms at the time of mammography. Even though we only included examinations that were indicated by the radiologist to be for additional evaluation, 1,159 women reported having breast symptoms (a lump or other symptoms) at the time of their examination. Mammographic assessment and recommendation were linked among 41.4% of mammographic examinations.
Table 2 also shows the distribution of follow-up recommendations by patient characteristics. Almost 91% of women with a probably benign assessment received a recommendation for a short-interval follow-up mammography, but 4.3% were recommended for normal follow-up, 3.0% for additional imaging, and 1.8% for biopsy or surgical or clinical evaluation. Women were less likely to be recommended for short-interval follow-up if they were postmenopausal (odds ratio [OR], 0.78; 95% CI, 0.61–0.98), had extremely dense breasts (OR, 0.61; 95% CI, 0.38–0.96), or reported a breast lump (OR, 0.55; 95% CI, 0.38–0.78). A higher proportion of women with extremely dense breasts tended to be recommended for more aggressive management in the form of additional imaging or biopsy, compared with women with fatty breasts. Women with symptoms were also more likely to be recommended for breast biopsy, surgical consultation, or clinical evaluation than were women without symptoms. The adjusted odds of a short-interval follow-up recommendation were statistically significantly increased from 2003 through 2006 compared with 2001 and for women who had mammographic examinations at facilities that linked assessment with recommendation (OR, 18.05; 95% CI, 7.8–41.77). The increase in short-interval follow-up recommendations over time and among facilities with linked assessments and recommendations was accompanied by decreases in all other types of recommendations.
Radiologists who were ≥ 55 years old (OR, 0.48; 95% CI, 0.26–0.88) or who had more than 20 years of experience interpreting mammograms (OR, 0.53; 95% CI, 0.29–0.95) were less likely to give a short-interval follow-up recommendation than were younger radiologists and radiologists with fewer years of experience interpreting mammograms (Table 3). Radiologists with a primary appointment at an academic medical center were statistically significantly more likely to recommend short-interval follow-up (OR, 2.66; 95% CI, 1.14–6.21). In addition, radiologists with a total mammography volume ≥ 1,000 annually were more likely to recommend short-interval follow-up compared with radiologists with an annual volume < 1,000.
We found that most diagnostic mammograms assessed as probably benign were given a recommendation for short-interval follow-up as suggested by the BI-RADS Atlas and that the percentage of short-interval follow-up recommendations increased over time, reaching 96.7% in 2006. These are much higher proportions of short-interval follow-up recommendations after BI-RADS category 3 assessments than reported in previous studies [4, 11, 13, 15]. One explanation for the increased consistency between assessment and recommendation in our study is that we limited our analysis to diagnostic examinations that were performed for the additional evaluation of a recent mammogram. Previous studies showing lower consistency included screening mammographic examinations, for which BI-RADS category 3 assessments should not be made at all [4, 11, 13, 15]. Only Geller et al. evaluated concordance rates among diagnostic examinations that were performed to evaluate a breast problem, but even these rates of concordance (43.8% in 1996 vs 55.8% in 2001) were lower than those we found. We did not include diagnostic examinations performed to evaluate a breast problem in our study, which also could have accounted for some of the difference in results. Finally, our finding of increased consistency over time may be the result of additional guidance put forth by the American College of Radiology in the fourth edition of BI-RADS in 2003.
Mammographic examinations that were interpreted using a system that automatically linked assessment and recommendation were associated with the highest odds of short-interval follow-up recommendations. Over 99% of these mammographic examinations received short-interval follow-up recommendations, though a small proportion (0.9%) were given different recommendations, suggesting that radiologists infrequently overrode the linked short-interval follow-up recommendation. This result is consistent with those of a previous study by Geller et al. , who showed that linked systems improved the concordance between BI-RADS assessments and recommendations. Approximately 15% of mammograms in the study by Geller et al. were interpreted using linked systems, compared with 40% in the present article. Systems that automatically link assessment and recommendation are increasingly being used in practice, and linkage may be the most effective way to improve agreement, not only among BI-RADS category 3 assessments but also among all BI-RADS assessment categories.
Use of short-interval follow-up recommendations for lesions assessed as probably benign varied by both patient and radiologist characteristics. Both increased mammographic breast density and reporting of breast symptoms at the time of mammography were associated with a decreased probability of receiving the short-interval follow-up recommendation that is consistent with a probably benign assessment. Instead, these women tended to receive recommendations for immediate follow-up (additional imaging or biopsy). It is important to note that few of the 15,515 probably benign assessments were given to women with extremely dense breasts (n = 1,005) or with symptoms (n = 266 with a breast lump and n = 893 with other symptoms). Radiologists may have recommended these women for immediate follow-up because they doubted the accuracy of mammography in women with dense breasts [19, 20] or because they considered symptomatic women to be at greater than average breast cancer risk. The majority of mammography malpractice suits citing delays in cancer diagnoses are initiated by women with symptoms, which could help explain the more aggressive recommendations for this group.
Radiologist characteristics were also associated with the consistency between BI-RADS category 3 assessments and short-interval follow-up recommendations. Older radiologists (≥ 55 years old) and those with more than 20 years of experience were less likely to recommend short-interval follow-up compared with younger radiologists or radiologists with fewer years of experience. These two characteristics are highly correlated and may suggest that radiologists who have been practicing longer are more resistant to changing their ways of practice when new or revised interpretive guidance is published. In addition, these radiologists were trained long before the revised American College of Radiology guidance was published, suggesting that there is a need for continuing education for all radiologists, regardless of age or years of experience. Several studies have shown that changing established practice patterns among physicians is difficult and often requires multiple interventions [23, 24]. Training in BI-RADS specifically has been shown to improve agreement in final mammogram assessments among radiologists , but we are unaware of any studies that have evaluated whether BI-RADS training improves agreement between recommendations and assessments.
Radiologists who were more likely to recommend short-interval follow-up after a probably benign examination included those with a primary affiliation at an academic medical center. These radiologists may be more up to date on BI-RADS guidance or may have more opportunities for continuing education in mammography than do radiologists who are not affiliated with an academic medical center. In addition, radiologists who reported interpreting ≥ 1,000 mammograms in a single year were more likely to recommend short-interval follow-up, compared with radiologists who reported interpreting < 1,000 mammograms. We are not aware of other studies that have evaluated radiologist characteristics, such as age, experience, academic affiliation, or volume in relation to the consistency of following the BI-RADS guidance of linking mammography assessments and recommendations.
One limitation of our study was that we evaluated radiologist characteristics from a survey collected at one point in time with mammography assessments and recommendations that changed over time. To account for this fact, we recoded radiologist age and experience so that they changed over time with the year of the mammography interpretation. However, most other radiologist characteristics, such as sex and academic affiliation, do not change over time. In addition, our study population was limited to women with their first diagnostic examination assessed as probably benign and cannot be generalized to subsequent examinations with probably benign assessments. Our data may be subject to some misclassification if women had prior mammographic examinations outside of the Breast Cancer Surveillance Consortium registries that we could not identify. Finally, we could not assess patient preferences for follow-up. Some women may have requested an immediate biopsy or additional imaging instead of waiting 6 months for another mammography, which could have accounted for some of the variability in recommendations.
Our study also has several strengths. We were able to link a large number of mammographic examinations with both patient characteristics and characteristics of the radiologists who interpreted them. Our study population included diagnostic mammographic examinations performed for the additional evaluation of a recent screening examination across seven U.S. states, in Breast Cancer Surveillance Consortium sites that collect data from facilities that range from small rural practices to large urban hospitals. Therefore, our results should be representative of community practice in the United States .
In conclusion, we noted a high degree of consistency between radiologists' recommendations for short-interval follow-up after a probably benign assessment and BI-RADS guidance; this consistency improved between 2001 and 2006. The use of computerized systems that automatically link assessment and recommendations may be an effective method for further increasing consistency with BI-RADS guidance. In addition, continuing medical education on BI-RADS guidance would be relevant for all practicing mammographers to reduce variability in assessment and recommendation and, ultimately, to improve mammographic performance.
The authors had full responsibility in the design of the study, the collection of the data, the analysis and interpretation of the data, the decision to submit the manuscript for publication, and the writing of the manuscript. We thank the participating women, mammography facilities, and radiologists for the data they have provided for this study. A list of the Breast Cancer Surveillance Consortium investigators and procedures for requesting Breast Cancer Surveillance Consortium data for research purposes are provided at breastscreening.cancer.gov/.
This work was supported by the National Cancer Institute (grants 1R01 CA107623 and 1K05 CA104699; and grants U01CA63740, U01CA86076, U01CA86082, U01CA63736, U01CA70013, U01CA69976, U01CA63731, and U01CA70040 to the Breast Cancer Surveillance Consortium), the Agency for Healthcare Research and Quality (grant 1R01 CA107623), the Breast Cancer Stamp Fund, and the American Cancer Society, through a generous donation from the Longaberger Company's Horizon of Hope Campaign (grants SIRGS-07-271-01, SIRGS-07-272-01, SIRGS-07-273-01, SIRGS-07-274-01, SIRGS-07-275-01, and SIRGS-06-281-01). The collection of cancer data used in this study was supported in part by several state public health departments and cancer registries throughout the United States. A full description of these sources is available at breastscreening.cancer.gov/work/acknowledgement.html.