Ann Intern Med. Author manuscript; available in PMC 2012 April 18.
PMCID: PMC3209800
NIHMSID: NIHMS332518

Cumulative probability of false-positive recall or biopsy recommendation after 10 years of screening mammography

Abstract

Background

False-positive mammography results are common. Biennial screening may decrease the cumulative probability of false-positive results across many years of repeat screening but could also delay cancer diagnosis.

Objective

To compare the cumulative probability of false-positive results and the stage distribution of incident breast cancer after 10 years of annual or biennial screening mammography.

Design

Prospective cohort.

Setting

Seven mammography registries in the National Cancer Institute–funded Breast Cancer Surveillance Consortium.

Participants

169,456 women who received a first screening mammogram at age 40–59 between 1994 and 2006 and 4,492 women with an incident invasive breast cancer diagnosed between 1996 and 2006.

Measurements

False-positive recalls and biopsy recommendations; stage distribution of incident breast cancer.

Results

False-positive recall probability was 16.3% at first and 9.6% at subsequent mammography. False-positive biopsy recommendation probability was 2.5% at first and 1.0% at subsequent examinations. Availability of comparison films halved the odds of a false-positive recall (adjusted OR 0.50 (CI 0.45, 0.56)). When screening began at age 40, the cumulative probability of a woman receiving at least one false-positive recall after 10 years was 61.3% (95% CI, 59.4% to 63.1%) with annual and 41.6% (CI, 40.6% to 42.5%) with biennial screening. Cumulative probability of false-positive biopsy recommendation was 7.0% (CI, 6.1% to 7.8%) with annual and 4.8% (CI, 4.4% to 5.2%) with biennial screening. Estimates were comparable when screening began at age 50. We observed a non-statistically significant increase in the proportion of late-stage cancers with biennial compared to annual screening (absolute increase 3.3% (CI −1.1, 7.8) age 40–49, 2.3% (CI −1.0, 5.7) age 50–59) among a population of women with incident breast cancer.

Limitations

Few women underwent screening over the entire 10 year period. Radiologist characteristics influence recall rates and were unavailable. Most mammograms were film rather than digital exams. Incident cancers were analyzed in a small population of women who developed cancer.

Conclusions

After 10 years of annual screening, more than half of women will receive at least one false-positive recall, and 7–9% will receive a false-positive biopsy recommendation. Biennial screening appears to reduce the cumulative probability of false-positive results after 10 years but may be associated with a small absolute increase in the probability of being diagnosed with late stage cancer.

INTRODUCTION

Mammography is the only screening test shown to reduce breast cancer mortality in clinical trials (1–5). However, screening a healthy population confers both harms and benefits. False-positive (FP) recalls for additional imaging after screening mammography occur for 14% of women at first screening and for 8% at subsequent exams (2, 6), causing many women inconvenience and anxiety. Recommendations for fine needle aspiration or surgical biopsy after screening mammography are less common (2) but have more severe consequences (7, 8).

Women will undergo 12 screening mammography examinations in their lifetimes if, following updated U.S. Preventive Services Task Force guidelines, they start biennial screening at age 50 and stop at age 74 (9). They will undergo 17 examinations if they start biennial screening at age 40, 24 if they start annual screening at age 50, and 34 if they start annual screening at age 40. Estimates of the probability that a woman will experience at least one FP recall after 10 screening examinations range from 29% to 77% (10–12), and are about 8–9% for benign biopsy (12, 13). These estimates, however, are based on extrapolations, are limited by statistical methodology that assumes women participating in multiple screening rounds are representative of all women recommended for screening, and do not take into account factors shown in prior studies to be associated with wide variability in FP rates, such as radiologist recall rates (14–17) and patient age, breast density, hormone therapy use, and screening interval (6, 15, 18).
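The examination counts quoted above can be reproduced with simple arithmetic. The sketch below assumes that the count is the number of screening intervals between the starting age and age 74, that is (stop age − start age) / interval. This convention is our assumption, chosen because it reproduces the figures in the text, and the function name is illustrative.

```python
def lifetime_exam_count(start_age: int, interval_years: int, stop_age: int = 74) -> int:
    """Number of screening intervals between start_age and stop_age.

    Assumption: the count is (stop_age - start_age) // interval_years,
    which matches the figures quoted in the text. Counting an exam at
    both the start age and the stop age would give one more.
    """
    if interval_years <= 0:
        raise ValueError("interval_years must be positive")
    return (stop_age - start_age) // interval_years


# The four strategies discussed in the text.
strategies = {
    "biennial from 50": lifetime_exam_count(50, 2),
    "biennial from 40": lifetime_exam_count(40, 2),
    "annual from 50": lifetime_exam_count(50, 1),
    "annual from 40": lifetime_exam_count(40, 1),
}

for name, count in strategies.items():
    print(f"{name}: {count} examinations")
```

Under this convention, starting 10 years earlier adds 10 examinations with annual screening and 5 with biennial screening.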

To address these limitations, we estimated the cumulative probability of FP recall and biopsy recommendation after 10 years of annual or biennial screening using data from the National Cancer Institute–funded Breast Cancer Surveillance Consortium (BCSC) (19), a nationally representative longitudinal sample of screening mammograms from community practice, and using newer statistical methods that account for duration of observation and informative censoring (20). In addition, we aimed to estimate how patient characteristics and variability in radiologist FP rates might affect cumulative FP probability, and compared the stage at diagnosis of incident breast cancers among women whose preceding screening interval was approximately biennial or annual.

METHODS

Study Population

We used data from seven BCSC mammography registries (http://breastscreening.cancer.gov) (19) (see Appendix A). Registries collected patient characteristics and clinical information at each mammogram, including radiologists’ assessments and recommendations based on the American College of Radiology’s Breast Imaging Reporting and Data System (BI-RADS®) (21). Each registry is linked to a state cancer registry or regional Surveillance, Epidemiology, and End Results (SEER) program, which we used to determine cancer status following mammography. Six of seven sites also linked to pathology databases. Data were pooled at a central Statistical Coordinating Center. Registries and the Coordinating Center received Institutional Review Board approval for active or passive consenting processes or a waiver of consent to enroll participants, link data, and perform analysis. All procedures were Health Insurance Portability and Accountability Act compliant, and registries and the Coordinating Center received a Federal Certificate of Confidentiality and other protection for the identities of women, physicians, and facilities.

In analyses of FP probabilities we included women who were age 40–59 at first screening mammogram performed at a participating BCSC facility. Mammograms were considered screening exams if the radiologist indicated routine screening. To avoid misclassifying diagnostic mammograms as screening exams, we excluded mammograms when a breast-imaging exam occurred within the prior nine months. We included screening mammograms from 1994 to the most recent year with complete breast cancer capture, which varied from 2004 to 2007 across the seven registries.

A separate cohort was constructed for analyses of cancer stage, which included women age 40–59 at the time of diagnosis of an incident invasive breast cancer between 1996 and 2006, at or following a screening mammogram and who had at least one additional prior mammogram. We excluded women with cancer diagnoses between 1994 and 1996 to allow for capture of mammography up to two years prior to diagnosis (for women undergoing biennial screening). We also excluded women with cancer diagnoses at or after age 60 to focus on incident cancers in 10-year periods similar to those in our analyses of cumulative FP probability (ages 40–49 for women starting screening at age 40, 50–59 for women starting screening at age 50). All breast cancers were classified according to the American Joint Committee on Cancer (AJCC) staging system (22). Late-stage cancer was defined as stage IIB, III, or IV. We also restricted our sample to women with breast cancer diagnoses that occurred within a fixed follow-up period after each woman’s most recent prior screening mammogram (the index mammogram): within 1 year for women with 9–18 months separating the index and next most recent prior mammograms (annual screeners); and within 2 years for women with 19–30 months between the index and most recent prior mammograms (biennial screeners). A flow diagram summarizing inclusion and exclusion criteria for the cancer cohort is provided in Appendix B.

Measures and Definitions

Patient characteristics, including birth date, postmenopausal hormone therapy use, and history of breast cancer in a first-degree relative, were collected by questionnaire at each examination. We considered mammograms first examinations if there were no prior mammograms in the BCSC database, no indication of comparison films, and no self-report of a prior mammogram. We defined screening intervals using women’s self-report and information from the BCSC database on the date of the prior screening mammogram. Screening interval was categorized into annual (9–18 months), biennial (19–30 months), or longer than biennial (>30 months). We censored 4,323 women (2.5%) whose self-reported time since last mammogram differed from that in the database by more than six months, to ensure that women had not obtained a mammogram outside of the BCSC, and we used database-derived time since last mammogram when discrepancies of less than six months occurred. We also censored women at time of cancer diagnosis or the end of the study.
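The interval categories and the censoring rule for discrepant self-reports can be expressed compactly. The following is a minimal sketch, assuming time since the prior mammogram is available in whole months; the function names and return labels are hypothetical and are not taken from the BCSC data dictionary.

```python
from typing import Optional


def categorize_screening_interval(months_since_prior: Optional[int]) -> str:
    """Map whole months since the prior mammogram to an interval category.

    None means no prior mammogram (a first examination). Exams within
    nine months of a prior breast-imaging exam were excluded from the
    screening cohort, so they are labelled "excluded" here.
    """
    if months_since_prior is None:
        return "first"
    if months_since_prior < 9:
        return "excluded"
    if months_since_prior <= 18:
        return "annual"
    if months_since_prior <= 30:
        return "biennial"
    return "longer than biennial"


def should_censor_for_discrepancy(self_report_months: int, database_months: int) -> bool:
    """Censor when self-report and database differ by more than six months."""
    return abs(self_report_months - database_months) > 6


print(categorize_screening_interval(12))
print(categorize_screening_interval(24))
print(should_censor_for_discrepancy(12, 24))
```

When the discrepancy is six months or less, the database-derived value would be passed to `categorize_screening_interval`, following the rule stated above.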

Recall was defined as a BI-RADS assessment for the initial screening mammogram (initial assessment) of 0 (needs additional imaging evaluation); 4 (suspicious abnormality); 5 (highly suggestive of malignancy); or 3 (probably benign finding) with a recommendation for immediate follow-up.
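As an illustration of this definition, the sketch below encodes the recall rule as a small predicate. The function and parameter names are hypothetical, and BI-RADS assessments are assumed to be coded as the integers 0 to 5.

```python
def is_recall(initial_birads: int, immediate_followup_recommended: bool = False) -> bool:
    """Return True if an initial screening assessment counts as a recall.

    Recall: initial BI-RADS 0, 4, or 5, or BI-RADS 3 accompanied by a
    recommendation for immediate follow-up.
    """
    if initial_birads in (0, 4, 5):
        return True
    return initial_birads == 3 and immediate_followup_recommended


# BI-RADS 3 counts as a recall only with an immediate follow-up recommendation.
print(is_recall(3, immediate_followup_recommended=True))
print(is_recall(3, immediate_followup_recommended=False))
```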

Biopsy recommendation was defined as a final BI-RADS assessment of 4 or 5, or 0 or 3 with a recommendation for biopsy, fine needle aspiration, or surgical consult after all imaging workup and within 90 days of the screening exam (final assessment) (23). Final BI-RADS assessments were set to missing and the exam was excluded from biopsy recommendation analyses if the screening exam was performed at a facility that does not capture follow-up exams or if the final assessment was 0 with recommendation for additional imaging, non-specified workup, or a missing recommendation (N = 25,045, 6.5%).

A recall or biopsy recommendation was considered false-positive when there was no diagnosis of invasive carcinoma or ductal carcinoma in situ within 1 year of the screening examination or before the next screening mammogram, whichever occurred first.
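The false-positive rule combines two time windows. A minimal sketch follows, assuming "1 year" is taken as 365 days and that "before the next screening mammogram" means strictly before that date; both are our assumptions, and the function name is illustrative.

```python
from datetime import date, timedelta
from typing import Optional

ONE_YEAR = timedelta(days=365)  # assumption: "1 year" taken as 365 days


def is_false_positive(
    exam_date: date,
    positive: bool,
    cancer_dx_date: Optional[date] = None,
    next_screen_date: Optional[date] = None,
) -> bool:
    """Classify a positive result (recall or biopsy recommendation).

    The result is false-positive when no invasive carcinoma or DCIS is
    diagnosed within 1 year of the exam or before the next screening
    mammogram, whichever occurs first.
    """
    if not positive:
        return False  # negative results cannot be false-positive
    if cancer_dx_date is None or cancer_dx_date < exam_date:
        return True  # no diagnosis falls in the follow-up window
    within_one_year = cancer_dx_date <= exam_date + ONE_YEAR
    before_next_screen = next_screen_date is None or cancer_dx_date < next_screen_date
    return not (within_one_year and before_next_screen)


# A diagnosis five months after the exam makes the recall a true positive.
print(is_false_positive(date(2000, 1, 1), True, cancer_dx_date=date(2000, 6, 1)))
```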

For our analyses of breast cancer stage, we defined the screening interval (annual or biennial) associated with cancer diagnosis as the interval between screening mammograms immediately preceding the diagnosis, specifically the interval between a woman’s most recent screening mammogram prior to the date of cancer diagnosis (index mammogram) and the screening mammogram before that exam. This approach is described in a previous publication (24).

Statistical Analysis

The proportion of mammograms resulting in a FP recall or biopsy recommendation at a single screening was computed for first and subsequent screening rounds. Generalized linear mixed models estimated the effect of age, family history of breast cancer, breast density estimated using density categories defined by the BI-RADS Atlas, postmenopausal hormone therapy use, and year of first exam on odds of a mammogram resulting in a FP result at a single screening round, while accounting for BCSC registry and random variation among radiologists. Models for FP results at subsequent rounds also adjusted for availability of comparison films and time since previous mammography. Adjusted FP probabilities were estimated from these models using indirect standardization (25, 26).

We used data from women age 40–59 at first mammography to build a model for the probability of FP results at each screening round using generalized linear mixed models for FP results conditional on screening round number, total number of screening rounds before censoring, the covariates adjusted for in analyses of single screening rounds, and interpreting radiologist. We then used this model to estimate the cumulative probability of FP results for a woman who begins screening at age 40 or age 50 after 10 years of screening, by aggregating probabilities at individual screening rounds to the individual woman level. We used the method of Hubbard et al. (20), which accounts for the informative censoring that may arise when the length of time that a subject is under observation is associated with the outcome. We assumed that covariate effects were the same at each screening round, except for screening interval, which was assumed to have no effect at the first screening round, and age, which was allowed to have a differing effect at first and subsequent screening rounds. We used indirect standardization to ensure a common distribution of BCSC registries across risk profiles (25, 26). We also report unadjusted estimates that do not account for covariates or between-radiologist variation. Appendix C has method details for estimating cumulative FP probabilities.
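To give intuition for how round-level probabilities aggregate into a cumulative probability, the sketch below uses a deliberately simplified calculation: one minus the product of the per-round probabilities of no FP result. This is not the method of Hubbard et al. (20). It assumes independent rounds, ignores covariates, radiologist variation, and informative censoring, and assumes 10 rounds for annual and 5 rounds for biennial screening over 10 years. The inputs are the unadjusted single-round FP recall probabilities reported in this paper (16.3% at first and 9.6% at subsequent mammography).

```python
from typing import Sequence


def naive_cumulative_fp_probability(round_probs: Sequence[float]) -> float:
    """P(at least one FP) = 1 - prod(1 - p_k), assuming independent rounds."""
    prob_no_fp = 1.0
    for p in round_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must lie in [0, 1]")
        prob_no_fp *= 1.0 - p
    return 1.0 - prob_no_fp


FIRST_ROUND = 0.163       # unadjusted FP recall probability, first exam
SUBSEQUENT_ROUND = 0.096  # unadjusted FP recall probability, later exams

# Assumed round counts over 10 years: 10 annual, 5 biennial.
annual = naive_cumulative_fp_probability([FIRST_ROUND] + [SUBSEQUENT_ROUND] * 9)
biennial = naive_cumulative_fp_probability([FIRST_ROUND] + [SUBSEQUENT_ROUND] * 4)

print(f"annual:   {annual:.3f}")
print(f"biennial: {biennial:.3f}")
```

This back-of-envelope calculation gives roughly 66% for annual and 44% for biennial screening. The two approaches rest on different assumptions, so the model-based estimates reported in this paper should not be expected to match these values.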

We report fitted values from our model for combinations of covariates chosen to represent a range of patient characteristics (age, year, hormone therapy use, family history of breast cancer, breast density, registry at the time of their first screening exams) for risk categories of low (no family history of breast cancer, BI-RADS 1 breast density), intermediate (no family history of breast cancer, BI-RADS 2 breast density), high (no family history of breast cancer, BI-RADS 3 breast density), and very high (family history of breast cancer, BI-RADS 3 breast density) and for quartiles of the distribution of radiologist FP rates. We defined the risk categories in terms of family history and BI-RADS breast density based on the results of previous studies of factors associated with FP results (15, 18). All risk profiles are for women who used no postmenopausal hormone therapy and had comparison films available for all subsequent exams. Quartiles for radiologist FP rates were constructed based on radiologist random effects from the FP risk models.

For invasive cancers, we estimated adjusted probabilities of each cancer stage and late-stage cancer using logistic regression models stratified by age at diagnosis (40–49 and 50–59) and including covariates for screening interval, race, family history, and BCSC registry. This set of adjustment variables was selected on the basis of prior research into the relationship between screening interval and risk of late stage cancer (24).

We defined statistical significance using a two-sided alpha-level of 0.05. Analyses were performed in R 2.10.1 (R Foundation for Statistical Computing, Vienna, Austria).

Role of the Funding Source

The National Cancer Institute supported this project through the BCSC cooperative agreements. All study authors and members of the BCSC Steering Committee approved the final version of the manuscript. The authors had full responsibility in designing the study, collecting the data, analyzing and interpreting the data, deciding to submit the manuscript for publication, and writing the manuscript.

RESULTS

We included 386,799 mammograms from 169,456 women interpreted by 997 radiologists. Nearly half the women (47.7%) had only 1 screening mammogram; 11.8% had 5 or more examinations (Table 1). The complete distribution of observed numbers of rounds of screening and patterns of screening intervals are provided in Appendix D. In our cohort, 9,331 women (5.5%) had only one year of follow-up, and 4,891 (2.9%) were observed for 10 or more years.

Table 1
Characteristics of Women Aged 40–49 and 50–59 Years at Time of First Mammographic Screening, Stratified By First or Second Exam.*

Most mammograms (78.9%) were for women aged 40–49 years at first mammogram. Median age at first screening was 42 for women who began screening in their forties and 53 for women who began in their fifties. Among subsequent mammograms, 55.6% occurred at an approximately annual screening interval (within 9–18 months of a prior mammogram) and 27.6% occurred approximately biennially (within 19–30 months of a prior mammogram). The remainder of mammograms occurred at longer than biennial intervals.

Most mammograms were assessed as negative (BI-RADS 1) or benign (BI-RADS 2); a BI-RADS score of 0, indicating need for additional imaging, was the third most common initial assessment for first and second mammograms and both age strata (Table 1). Of the 44,992 mammograms with initial BI-RADS scores of 0 across all observed screening rounds, most (71.5%) resolved to negative or benign readings; 12.3% remained a BI-RADS 0, 10.1% had suspicious abnormalities, and data were missing for 6.0% (Appendix E).

Probability of a Mammogram Leading To False-Positive Recall or Biopsy Recommendation

Unadjusted FP recall probability was 16.3% for first and 9.6% for subsequent mammograms. FP recall probabilities were higher for mammograms among women who started screening more recently and those with heterogeneously dense breasts and, in first exams only, for older women and those with a family history of breast cancer (Table 2). Availability of comparison films halved the odds of FP recall on subsequent screening exams (odds ratio [OR], 0.50; 95% CI, 0.45 to 0.56), and a biennial screening interval (last exam within 19–30 months) increased the odds of FP recall relative to an annual interval (within 9–18 months) (OR, 1.13; CI, 1.08 to 1.19).
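Because these effects are reported as odds ratios, halving the odds does not halve the probability. The sketch below converts a baseline probability to odds, applies an odds ratio, and converts back. The baseline of 20% is a hypothetical value chosen only for illustration and is not an estimate from this study.

```python
def apply_odds_ratio(baseline_prob: float, odds_ratio: float) -> float:
    """Return the probability implied by applying an odds ratio to a baseline."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must lie strictly between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Hypothetical baseline of 20% without comparison films, OR 0.50 with films.
with_films = apply_odds_ratio(0.20, 0.50)
print(f"{with_films:.3f}")  # 0.111, more than half of the 0.20 baseline
```

At low baseline probabilities the odds ratio and the risk ratio are close, which is why "halving the odds" is a reasonable shorthand for the effect of comparison films here.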

Table 2
Adjusted False-Positive Recall Probabilities at First and Subsequent Exam by Associated Characteristics with Odds Ratios and 95% Confidence Intervals.

Unadjusted FP biopsy recommendation probability was 2.5% for first and 1.0% for subsequent exams. FP biopsy recommendation probabilities were higher for older women and those with heterogeneously dense breasts and, in first exams only, for those with a family history of breast cancer (Table 3).

Table 3
Adjusted False-Positive Biopsy Recommendation Probabilities at First and Subsequent Exam by Associated Characteristics with Odds Ratios and 95% Confidence Intervals.

Cumulative Probability of a Woman Experiencing a False-Positive Recall or Biopsy Recommendation After 10 Years of Screening

For a woman who starts screening at age 40 the unadjusted cumulative probability of a FP recall after 10 years of screening was 61.3% (CI, 59.4% to 63.1%) with annual and 41.6% (CI, 40.6% to 42.5%) with biennial screening (Table 4). For a woman who starts screening at age 50, the unadjusted probability was 61.3% (CI, 58.0% to 64.7%) under annual and 42.0% (CI, 40.4% to 43.7%) under biennial screening.

Table 4
Cumulative Probability and 95% Confidence Intervals for False-Positive Recall after 10 Years of Screening under Four Screening Strategies (Start Age 40 vs. 50; Annual vs. Biennial Screening) by Radiologist’s and Woman’s Risk Level for ...

The adjusted cumulative probability of a FP recall under biennial screening was less than that under annual screening for each risk profile we modeled (Table 4). For women at intermediate risk of having FP recall (no family history of breast cancer, BI-RADS density 2) who start screening at age 40 and whose films are read by a radiologist with a median FP recall rate, for example, we estimated the cumulative probability of a FP recall after 10 years of biennial screening to be 37.8% compared to 52.4% for annual screening. Estimated reductions in cumulative FP probabilities with biennial compared to annual screening were comparable for women who began screening at age 50.

Estimates of a woman’s adjusted cumulative probability of experiencing a FP recall after 10 years of screening increased across FP risk profiles and radiologists’ recall rates and were higher with annual than with biennial screening (Table 4). Within each stratum of FP risk, radiologist risk, and screening frequency, there was little difference in the 10-year risk of FP recall associated with age at first mammogram (e.g., 29.4% for low FP risk, low radiologist risk, annual screening begun at age 40, compared to 32.4% for low FP risk, low radiologist risk, annual screening begun at age 50).

Estimates of a woman’s adjusted cumulative probability of experiencing a FP biopsy recommendation after 10 years of screening increased across FP risk profiles and with increasing radiologists’ FP biopsy recommendation rates (Table 5). Probabilities were higher with annual compared to biennial screening and for older (starting age 50) compared to younger ages (starting age 40) at first mammogram.

Table 5
Cumulative Probability and 95% Confidence Interval for False-Positive Biopsy Recommendation after 10 Years of Screening under Four Screening Strategies (Start Age 40 vs. 50; Annual vs. Biennial Screening) by Radiologist and Woman Risk Level for False ...

At their first screening mammogram, 6.5% of BCSC women had BI-RADS 1 breast density and reported no family history of breast cancer. After 10 years of screening, these women would be expected to have cumulative FP probabilities similar to those of the low FP risk profile. In the BCSC sample, 37.1% had BI-RADS 2 breast density and reported no family history of breast cancer, like the intermediate FP risk profile; 39.2% had BI-RADS 3 breast density and reported no family history of breast cancer, like the high FP risk profile; and 3.7% had BI-RADS 3 breast density and reported a family history of breast cancer, like the very high FP risk profile. The remainder of the BCSC sample had characteristics not reflected by the example risk profiles reported in Tables 4 and 5.

Incident Cancers

We identified 4,492 women with an incident invasive breast cancer diagnosis at age 40–59 following an annual or biennial screening interval (Appendix B). After adjusting for family history, race, and BCSC registry, a non-statistically significantly greater proportion of women who were screened biennially were diagnosed with a late-stage cancer (24.6% vs. 21.3%; absolute difference, 3.3% [CI, −1.1 to 7.8] for women age 40–49; 24.6% vs. 21.9%; absolute difference, 2.3% [CI, −1.0 to 5.7] for women age 50–59) (Table 6). Odds ratios for model covariates are in Appendix F.

Table 6
Adjusted Proportion of Cancer Stage at Diagnosis and 95% Confidence Intervals for Annual and Biennial Screeners Stratified by Age 40–49 or 50–59 at Time of Cancer Diagnosis.

DISCUSSION

In an analysis of probabilities of false-positive recall or biopsy recommendation using registry data collected in community practice, we estimated that the risk of a FP result was higher following a biennial screening interval than an annual interval. However, after 10 years of repeat screening at approximately annual or biennial intervals, the cumulative probability of receiving at least one FP recall or biopsy recommendation was lower with biennial compared to annual screening whether women started screening at age 40 or age 50. Compared to annual screening, biennial screening was associated with a non-statistically significant absolute increase of 2 and 3 percent in the proportion of women diagnosed with late-stage cancer in a cohort of those who developed cancer.

Our estimates of a woman’s cumulative probability of a FP mammogram result after repeat screening are higher than previously reported (10, 11, 13, 15). This is partly explained by a higher probability of FP results at each exam for our cohort. Our estimate of the FP recall probability at a single screening round was 16.3% at first exam and 9.6% at subsequent exams, compared to estimates of 6.5% for other cohorts (10). Additionally, previous methods to estimate the cumulative FP probability assumed that censoring was non-informative, which leads to underestimation if women at higher risk of a FP are more likely to be observed for fewer screening rounds (20).

Our analysis identified covariates associated with FP mammography results that resemble previous reports. Positive associations between FP recall and previous breast biopsies, family history of breast cancer, postmenopausal hormone therapy use, more recent exam year, and time between screening exams have been reported previously, as have negative associations between FP recall and older age and comparison film availability (6, 15, 18, 27–29). We also identified statistically significant associations between FP recall and family history of breast cancer, exam year, time since previous mammogram, and availability of comparison films. Surprisingly, we found older age was associated with FP recall only at the first exam. This induced small differences in the cumulative probability of FP recall by starting age.

Few previous studies have estimated the cumulative probability of FP mammography results after repeat screening in U.S. community practice. We searched PubMed Central using the terms “cumulative”, “false positive”, and “mammography” to identify all studies evaluating the cumulative probability of FP recall and biopsy recommendation after repeat screening mammography. From among these, we reviewed the titles and abstracts to identify all studies providing estimates of the cumulative FP probability based on screening mammography in the U.S. We then searched all references citing these papers using Web of Science and reviewed titles and abstracts of these manuscripts to identify any additional references we may have missed. This review found 7 studies reporting cumulative FP probabilities for repeat screening mammography in the U.S. (10–13, 15, 20, 30). Elmore (1998) reported a 49.1% probability after 10 rounds of screening (10). Christiansen (2000) found a FP probability of 22% after five screening mammograms under biennial screening for an intermediate-risk woman and median radiologist (15), compared to estimates for our population of 38–40%. Studies of benign biopsy have found a probability of 8–9% after 10 screening rounds (12, 13), which is similar to our estimate. Based on our review, we believe ours is the first study to incorporate covariate effects and variation among radiologists into estimates of cumulative FP biopsy recommendation rates.

Our results on the risk of late-stage cancer following annual and biennial screening intervals are similar to those previously reported. A previous BCSC study found a statistically significantly higher proportion of late-stage cancers among women 40–49 participating in biennial compared to annual screening (28% vs. 21%), but no significant difference among women 50–59 (22% vs. 21%) (24). Although we found no statistically significant absolute difference in the overall proportion of late-stage cancers with biennial compared to annual screening, our findings could not exclude an increase in late stage cancer of as much as 7.8% among women in their 40s and as much as 5.7% among women in their 50s based on the upper confidence bound of the estimate of absolute difference. The relatively broad confidence limits around our estimates of difference are likely attributable to the small sample size available for our analysis of incident cancer, and a larger future study is required to exclude the possibility of a clinically significant increase in late stage cancer with biennial compared to annual screening, or even a smaller and less clinically significant decrease.

We have investigated two types of FP mammography results: recall for additional imaging and recommendation for biopsy. Our definitions of FP recall and biopsy recommendation are consistent with the BI-RADS Atlas, which distinguishes these two types of false-positives (21). Previous research on the effects of FP mammograms suggests that women receiving a FP recall or benign biopsy experienced elevated anxiety and distress (31). Benign biopsy poses additional risks of pain and scarring (32, 33). So FP recalls, although common, exert smaller effects than do FP biopsy recommendations. Both the relative frequency and severity of these two types of FP results should be considered when evaluating the harms of screening mammography.

Most screening mammograms had an initial assessment of negative or benign (BI-RADS 1 or 2) or of BI-RADS 0, needs additional imaging. Most in the latter category resolved on further evaluation to a negative or benign result; about 10% were interpreted as having suspicious abnormalities, and status continued to be unresolved (BI-RADS 0) or was missing for about 19%. This could be because the woman did not return for follow-up imaging within 90 days of her screening mammogram or because she went to a facility outside the BCSC. In our analysis these observations have been defined as recalls but have been excluded from biopsy recommendation analyses. If the sub-group with missing final assessments is likely to go on to receive a biopsy recommendation, then this would tend to bias estimates of FP biopsy recommendation downward. However, these missing observations make up only 6% of the total sample, so the magnitude of this bias is expected to be small.

Our study has limitations. Although it was based on a large sample, it included 10 or more rounds of screening for a very small number of women, so our cumulative probability estimates after 10 years of annual screening depend on statistical modeling. However, we were able to incorporate information from women with fewer than 10 exams using statistical methods developed for this purpose that accommodate informative censoring; previous methods for estimating cumulative FP probabilities are downwardly biased when FP recall is more common among women with fewer observed rounds of screening, as in our cohort (20).

We lacked information on radiologist characteristics associated with FP recall. Previous research identified variation in interpretive performance by radiologist characteristics such as fellowship training and years of experience as influencing FP recall (14, 16, 17, 34, 35). We attempted to capture differences in radiologist FP rates using random effects to estimate FP recall and biopsy recommendation variability in the middle 50% of radiologists. Variation is even larger when comparing radiologists with the highest and lowest FP rates.

Most mammograms in this analysis were film-screen exams. Digital screening mammography is rapidly becoming the predominant screening modality, with 76.2% of accredited facilities using full field digital machines as of May 1, 2011 (36). However, research on the performance of digital mammography has indicated similar specificity, and hence FP rates, for digital and film-screen exams (37, 38). A slight, non-statistically significant decrease in specificity has been observed for some sub-groups (38). This would result in increased FP probabilities relative to those observed in this study.

The study’s cumulative FP risk estimates apply only to the first 10 years of screening. Over the course of a lifetime of screening, beginning screening 10 years earlier would result in an additional 10 screening mammograms under annual screening and 5 under biennial, and the lifetime risk of FP mammography results will thereby be increased. We could not estimate lifetime cumulative FP risks because doing so would require extrapolation beyond the length of observation in the current study. We found no statistical difference in FP recall probabilities among women age 60 and over and those aged 40–44 years, but estimated that FP biopsy recommendation probabilities were statistically significantly higher in women age 65 or older. Therefore, cumulative FP biopsy recommendation probabilities for the ten years beginning at age 60 might be higher than those we have reported for women who began screening at younger ages.

In summary, we estimate that after 10 years of annual screening, a majority of women will receive at least one FP recall, and 7–9% will receive a FP biopsy recommendation. Both probabilities are lowered with biennial screening. In a population of women diagnosed with cancer, we also identified a non-statistically significant increase in the proportion diagnosed with late stage cancer after biennial screening compared to annual. Biennial screening thus decreases risks but may also attenuate the benefits of routine screening. Women and physicians should be aware of the possibility of these harms associated with different screening intervals so they can make informed decisions about screening and be prepared for what to expect when they receive their results. They should also ensure that prior mammograms, when they exist, are available to the interpreting radiologist, as it seems clear from these data that availability of prior studies may halve the odds of a FP recall.

Supplementary Material

Appendix A-F

Acknowledgments

We thank the participating women, mammography facilities, and radiologists for the data they have provided for this study. A list of BCSC investigators and procedures for requesting BCSC data for research purposes are at: http://breastscreening.cancer.gov/.

Grant Support

By the National Cancer Institute–funded Breast Cancer Surveillance Consortium cooperative agreement (grants U01CA63740, U01CA86076, U01CA86082, U01CA63736, U01CA70013, U01CA69976, U01CA63731, and U01CA70040) and the National Cancer Institute–funded grants R03CA150007 and RC2CA148577. The collection of cancer data used in this study was supported in part by several state public health departments and cancer registries throughout the U.S. For a full description of these sources, please see: http://breastscreening.cancer.gov/work/acknowledgement.html.

Primary Funding Source: National Cancer Institute

Footnotes

Reproducible Research Statement

Protocol: Available to interested readers by contacting Dr. Hubbard at hubbard.r/at/ghc.org

Statistical Code: Available to interested readers by contacting Dr. Hubbard at hubbard.r/at/ghc.org

Data: Available following approval by the BCSC Steering Committee at http://breastscreening.cancer.gov/

REFERENCES

1. Humphrey LL, Helfand M, Chan BK, Woolf SH. Breast cancer screening: a summary of the evidence for the U.S. Preventive Services Task Force. Ann Intern Med. 2002;137:347–360.
2. Nelson HD, Tyne K, Naik A, Bougatsos C, Chan BK, Humphrey L. Screening for breast cancer: an update for the U.S. Preventive Services Task Force. Ann Intern Med. 2009;151(10):727–737. W237-42.
3. Smith RA, Duffy SW, Gabe R, Tabar L, Yen AM, Chen TH. The randomized trials of breast cancer screening: what have we learned? Radiol Clin North Am. 2004;42:793–806.
4. Nystrom L, Andersson I, Bjurstam N, Frisell J, Nordenskjold B, Rutqvist LE. Long-term effects of mammography screening: updated overview of the Swedish randomised trials. Lancet. 2002;359(9310):909–919.
5. Tabar L, Vitak B, Chen HH, et al. The Swedish Two-County Trial twenty years later. Updated mortality results and new insights from long-term follow-up. Radiol Clin North Am. 2000;38(4):625–651.
6. Yankaskas BC, Taplin SH, Ichikawa L, et al. Association between mammography timing and measures of screening performance in the United States. Radiology. 2005;234(2):363–373.
7. Montgomery M. Uncertainty during breast diagnostic evaluation: state of the science. Oncol Nurs Forum. 2010;37(1):77–83.
8. Taplin SH, Abraham L, Geller BM, et al. Effect of previous benign breast biopsy on the interpretive performance of subsequent screening mammography. J Natl Cancer Inst. 2010;102(14):1040–1051.
9. USPSTF. Screening for breast cancer: U.S. Preventive Services Task Force recommendation statement. Ann Intern Med. 2009;151(10):716–726. W-236.
10. Elmore JG, Barton MB, Moceri VM, Polk S, Arena PJ, Fletcher SW. Ten-year risk of false positive screening mammograms and clinical breast examinations. N Engl J Med. 1998;338(16):1089–1096.
11. Xu JL, Fagerstrom RM, Prorok PC, Kramer BS. Estimating the cumulative risk of a false-positive test in a repeated screening program. Biometrics. 2004;60(3):651–660.
12. Blanchard K, Colbert JA, Kopans DB, et al. Long-term risk of false-positive screening results and subsequent biopsy as a function of mammography use. Radiology. 2006;240(2):335–342.
13. Baker SG, Erwin D, Kramer BS. Estimating the cumulative risk of false positive cancer screenings. BMC Med Res Methodol. 2003;3:11.
14. Barlow WE, Chi C, Carney PA, et al. Accuracy of screening mammography interpretation by characteristics of radiologists. J Natl Cancer Inst. 2004;96(24):1840–1850.
15. Christiansen CL, Wang F, Barton MB, et al. Predicting the cumulative risk of false-positive mammograms. J Natl Cancer Inst. 2000;92(20):1657–1666.
16. Elmore JG, Jackson SL, Abraham L, et al. Variability in interpretive performance at screening mammography and radiologists' characteristics associated with accuracy. Radiology. 2009;253(3):641–651.
17. Smith-Bindman R, Chu P, Miglioretti DL, et al. Physician predictors of mammographic accuracy. J Natl Cancer Inst. 2005;97(5):358–367.
18. Carney PA, Miglioretti DL, Yankaskas BC, et al. Individual and combined effects of age, breast density, and hormone replacement therapy use on the accuracy of screening mammography. Ann Intern Med. 2003;138(3):168–175.
19. Ballard-Barbash R, Taplin SH, Yankaskas BC, et al. Breast Cancer Surveillance Consortium: a national mammography screening and outcomes database. AJR Am J Roentgenol. 1997;169:1001–1008.
20. Hubbard RA, Miglioretti DL, Smith RA. Modeling the cumulative risk of a false-positive screening test. Stat Methods Med Res. 2010;19:429–449.
21. American College of Radiology. Breast Imaging Reporting and Data System (BI-RADS) Breast Imaging Atlas. Reston, VA: American College of Radiology; 2003.
22. American Joint Committee on Cancer. Manual for Staging of Cancer. 6th ed. Philadelphia: JB Lippincott; 2002.
23. Breast Cancer Surveillance Consortium. BCSC Glossary of Terms. BCSC. 2010.
24. White E, Miglioretti DL, Yankaskas BC, et al. Biennial versus annual mammography and the risk of late-stage breast cancer. J Natl Cancer Inst. 2004;96(24):1832–1839.
25. Graubard B, Korn E. Predictive margins with survey data. Biometrics. 1999;55(2):652–659.
26. Lane P, Nelder J. Analysis of covariance and standardization as instances of prediction. Biometrics. 1982;38(3):613–621.
27. Cook AJ, Elmore JG, Miglioretti DL, et al. Decreased accuracy in interpretation of community-based screening mammography for women with multiple clinical risk factors. J Clin Epidemiol. 2010;63(4):441–451.
28. Kerlikowske K, Carney PA, Geller B, et al. Performance of screening mammography among women with and without a first-degree relative with breast cancer. Ann Intern Med. 2000;133(11):855–863.
29. Ichikawa LE, Barlow WE, Anderson ML, Taplin SH, Geller BM, Brenner RJ. Time trends in radiologists' interpretive performance at screening mammography from the community-based Breast Cancer Surveillance Consortium, 1996–2004. Radiology. 2010;256(1):74–82.
30. Gelfand AE, Wang F. Modelling the cumulative risk for a false-positive under repeated screening events. Stat Med. 2000;19(14):1865–1879.
31. Brewer NT, Salz T, Lillie SE. Systematic review: the long-term effects of false-positive mammograms. Ann Intern Med. 2007;146(7):502–510.
32. Zagouri F, Sergentanis TN, Gounaris A, et al. Pain in different methods of breast biopsy: Emphasis on vacuum-assisted breast biopsy. Breast. 2008;17(1):71–75.
33. Yazici B, Sever AR, Mills P, Fish D, Jones SE, Jones PA. Scar formation after stereotactic vacuum-assisted core biopsy of benign breast lesions. Clin Radiol. 2006;61(7):619–624.
34. Beam CA, Layde PM, Sullivan DC. Variability in the interpretation of screening mammograms by US radiologists. Findings from a national sample. Arch Intern Med. 1996;156:209–213.
35. Elmore JG, Miglioretti DL, Reisch LM, et al. Screening mammograms by community radiologists: variability in false-positive rates. J Natl Cancer Inst. 2002;94(18):1373–1380.
37. Pisano ED, Gatsonis C, Hendrick E, et al. Diagnostic performance of digital versus film mammography for breast-cancer screening. N Engl J Med. 2005;353(17):1773–1783.
38. Pisano ED, Hendrick RE, Yaffe MJ, et al. Diagnostic accuracy of digital versus film mammography: exploratory analysis of selected population subgroups in DMIST. Radiology. 2008;246(2):376–383.