HHS Public Access author manuscript; accepted for publication in a peer-reviewed journal.
Ann Intern Med. Author manuscript; available in PMC 2010 January 7.
Published in final edited form as:
PMCID: PMC2803099

Short-term hormone therapy suspension and mammography recall: the Radiological Evaluation and Breast Density (READ) Randomized Trial



Without population-based evidence, some clinicians recommend short-term hormone therapy (HT) suspension to improve mammography performance. HT raises breast density and abnormal screening mammograms are more common among women with greater breast density and among women using HT.


Test whether brief (1–2 month) HT suspension before screening mammography decreases additional mammographic imaging (recall) in women aged 45–80.


Three-group randomized controlled trial.


Integrated health plan in Washington state, US between 2004–2007.


1704 women aged 45–80 who used HT at their most recent screening (index) mammogram, were due for a screening (study) mammogram, and were still using HT.


Block randomization (by breast density and HT type) to: no HT suspension (N=567) or suspension for 1 (N=570) or 2 (N=567) months before the study mammogram. One blinded expert radiologist interpreted all mammograms.


Recall was the primary outcome and change in mammographic breast density (percentage and dense area) between index and study mammograms was the secondary outcome.


Mammography recall rates were 11.3% (61/542 no-suspension), 12.3% (59/478 1-month), and 9.8% (44/451 2-month). We identified no subgroups where brief suspension resulted in decreased mammography recall. With suspension, decreases in percentage breast density were ordered and statistically significant: 0.1% (no-suspension), −0.9% (1-month), and −1.5% (2-month). Comparable ordered declines were observed for dense area. Suspension groups experienced increased menopause symptoms.


Our results can only be generalized to women aged 45–80 years who have used HT for at least 1 year and will consider short-term suspension; the majority (61%) of eligible women refused participation. Mammography recall was determined by one expert radiologist.


Brief HT suspension was associated with small changes in breast density and did not affect recall rates. There is no evidence to support short-term HT suspension before mammography.

Keywords: Randomized controlled trial, mammographic breast density, mammography recall, mammography performance, hormone therapy, hormone therapy suspension


Each year, breast cancer is diagnosed in over 182,000 U.S. women.(1) Mammography is the only available screening method proven to reduce breast cancer mortality,(2) yet it is imperfect. Over 10% of U.S. women screened for breast cancer receive a recommendation for additional imaging at the time of their mammogram and only 4.4% of U.S. women who receive additional imaging after a screening mammogram will be diagnosed with breast cancer within 1 year.(3,4) False-positive mammograms have an enormous impact on women (e.g., increased patient anxiety) and subsequent healthcare use and cost.(5) False-positive mammograms raise annual U.S. healthcare costs by approximately $200 million ($500 per workup).(6)

As mammographic breast density increases, both sensitivity and specificity of mammography are decreased.(7–13) Breast density is not static and is affected by exogenous and endogenous reproductive hormones.(14–17) We have previously reported clinically and statistically significant higher rates of mammography recall among HT users(18,19) and have demonstrated HT discontinuation between mammography exams (average time between exams=2 years) can reduce density and recall to levels comparable to non-users of HT.(18,19) Several reports encourage women to consider short-term HT suspension before receiving a mammogram, with the idea that cancers are more easily detected.(18,20–22) Some providers already recommend short-term HT suspension in the hopes it will improve mammography performance; however, population-based evidence has been lacking for widespread or ad-hoc adoption in clinical practice.

This study was designed to test whether brief HT suspension before a screening mammogram improves mammography performance. To assess the impact of HT suspension on mammography sensitivity and specificity, a three-group randomized trial would need to enroll 700,000 women, making it unfeasible. The vast literature on screening mammography performance supports that recall rate is an important measure of the interpretive performance of screening tests.(23) There is evidence that US radiologists could lower their recall rates without negatively impacting their cancer detection rates.(24–27) Screening mammography aims to maximize performance, which involves weighing the sensitivity against the specificity of a test. As recall rates increase, specificity decreases. The recall rate includes both false- and true-positive mammograms, but the majority of positive screening mammograms are false-positive; therefore, lowering recall rate in a screening population mostly translates into lowering the false positive rate. We therefore chose mammography recall rates as our primary outcome and change in mammographic breast density as our secondary outcome. This trial specifically tests whether brief (1–2 month) HT suspension before screening mammography decreases mammographic recall in women aged 45–80.


Design Overview

This study was a three-group randomized controlled trial to test whether brief HT suspension before screening mammography decreases mammography recall in women aged 45–80. We identified and attempted to recruit 5861 potentially eligible women over 35 months from November 2004 through September 2007 and enrolled a total of 1704 women. Participants were considered enrolled only after we verified their eligibility, they returned a consent form and Health Insurance Portability and Accountability Act authorization form, and randomization was completed. Women were followed for 12 months following their study screening mammogram (through December 2008) for adverse events. The institutional review boards (IRBs) at Group Health and the U.S. Department of Defense reviewed and approved this study.

Setting and Participants

The Radiological Evaluation and Breast Density (READ) randomized controlled trial was conducted within Group Health, an integrated health plan based in western Washington State (US), with approximately 550,000 members. Group Health has a population-based Breast Cancer Screening Program (BCSP) that women are invited to join when they turn 40 or join Group Health.(28, 29) Demographic and breast cancer risk factor information is gathered by a self-administered questionnaire completed when women enter the BCSP and at each mammogram.(29)

We used automated administrative databases to identify and recruit women aged 45–80 years who were due for a screening (“study”) mammogram. Women had to have a previous mammogram (“index”) at Group Health within the past 2 years where they self-reported using HT, and evidence of continued HT dispensing at the time of recruitment. Women were ineligible if they had a history of myocardial infarction, angina treated with medication, coronary revascularization surgery, stroke, blood clots, breast cancer, mastectomy, or breast implants. We further excluded women with any dispensings of tamoxifen or raloxifene because of these medications’ potential influence on breast density. Women who had a breast density rating of “entirely fat” [Breast Imaging-Reporting and Data System® (30) (BI-RADS) breast density 1] at their index mammogram were also excluded since their breast density was unlikely to influence mammography recall.

Randomization and Interventions

Women were randomized to one of three study groups (no-suspension or 1- or 2-month suspension) using a permuted block algorithm. Once a week, the study programmer identified women eligible for randomization, and each was assigned to one of six stratification cells, based on BI-RADS breast density at index mammogram and type of HT used in the 6 months before recruitment based on electronic pharmacy dispensing records [estrogen plus progestin (EPT) versus estrogen alone (ET)]. Within each stratification cell, a random number was generated for each woman using the ranuni function in SAS (version 9.0 SAS Institute Inc., Cary, NC), and women were assigned to study groups in order based on the rank of their random number within the stratification cell. Randomization group assignment was maintained in the study database, available to the study programmer, statistician, and staff responsible for mailing follow-up questionnaires, but the study radiologist and investigators were blinded to randomization group.
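The allocation step described above can be sketched as follows. This is a minimal illustration in Python, not the trial's SAS program. The function name `assign_groups`, the group labels, the example batch, and the rule that rank modulo 3 selects the group are all assumptions made for the example. The text states only that women were assigned to groups in order of the rank of their random number within each stratification cell.

```python
import random
from collections import defaultdict

GROUPS = ["no-suspension", "1-month", "2-month"]  # illustrative labels

def assign_groups(women, seed=None):
    """Assign women to study groups within stratification cells.

    `women` is a list of (woman_id, birads_density, ht_type) tuples.
    Each woman receives a uniform random number; within a cell the women
    are ranked by that number and groups are dealt out in rank order
    (assumed rule: rank modulo 3).
    """
    rng = random.Random(seed)
    cells = defaultdict(list)
    for woman_id, density, ht_type in women:
        cells[(density, ht_type)].append((rng.random(), woman_id))

    assignment = {}
    for members in cells.values():
        members.sort()  # rank by random number within the cell
        for rank, (_, woman_id) in enumerate(members):
            assignment[woman_id] = GROUPS[rank % len(GROUPS)]
    return assignment

# Hypothetical weekly batch: 12 women in one cell (BI-RADS density 3, EPT).
batch = [(i, 3, "EPT") for i in range(12)]
result = assign_groups(batch, seed=1)
```

Dealing groups out by rank keeps the three arms balanced within each cell, which is the purpose of blocking on breast density and HT type.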

All women were mailed a letter with a baseline questionnaire and instructions personalized to their randomization group. The instructions included the participant’s mammography appointment date and, for women in the suspension groups, the date they should stop HT. Participants were given the study nurse’s toll-free number to discuss any study-related concerns and report any symptoms or adverse events. The study nurse called all women participants to answer questions, and confirm the baseline questionnaire had been completed and returned. These calls occurred two weeks before their scheduled HT suspension date (suspension groups) and one month before their scheduled mammogram (no-suspension).

Data Collection

We collected participants’ self-reported information from four sources: BCSP questionnaires from the index and study mammograms; baseline study questionnaire at enrollment; and follow-up study questionnaire at study mammogram. Body mass index was calculated from self-reported height and weight collected at the index and study mammograms.

The baseline study questionnaire collected information on medical history, physical activity, tobacco and alcohol use. Both study questionnaires gathered information about HT use, sleep and menopausal symptoms. Participants rated the importance of getting a good night’s sleep (1=not important to 10=very important) and reported the frequency (days/week) of nine common sleep disturbances selected from the General Sleep Disturbance Scale.(31) We used a modified Wiklund Menopause Symptom Checklist(32,33) to ascertain the presence and severity (0=not present to 10=severe) of menopausal symptoms (sleep, mood swings, vaginal bleeding, vaginal dryness, hot flashes and night sweats).

Hormone therapy groups

We used automated pharmacy dispensing data to identify potentially eligible women and to classify type of HT used for stratification. Women had to have two estrogen dispensings in the 6 months before recruitment; HT type was distinguished by the presence of any progestin dispensing in those 6 months (EPT).(34) We further classified HT dose using information on the highest daily conjugated equine estrogen equivalent dose and categorized these into: low (<0.625 mg), medium (0.625 mg), and high (>0.625 mg).(34–36) Progestin dose was defined as: low (2.5 mg medroxyprogesterone acetate or 100 mg micronized progesterone) and standard dose (all other progestins).

Outcomes and Measurements

Mammography recall (primary outcome)

Women completed their study mammogram as part of routine clinical care. All mammograms were interpreted in accordance with Group Health standard clinical practice during the trial, and clinical care was continued as usual. Within 1–2 months of the screening mammogram, the expert study radiologist (JCG) independently interpreted each study mammogram using the index mammogram for comparison and provided assessments and recommendations blinded to randomization group, clinical assessment, and any clinical recommendations. Mammography recall was defined as any assessment requiring additional imaging or evaluation of either breast (BI-RADS assessment of 0 for either or both breasts)(30) determined exclusively from the study radiologist’s interpretation. All women in the study had ≥1 screening mammogram at Group Health before being recruited, so all women had comparison films that were used at the time of the study screening mammogram. Our primary outcome was defined a priori as recommended recall for any imaging (for technical reasons or otherwise). Our expert radiologist indicated a technical reason for recall primarily when one or more of the mammographic views did not include enough tissue to give complete confidence that the breast had not changed from the prior views. We believe it is important to include mammograms that were recalled for technical reasons because there is no reason to expect such recalls to differ across randomization groups unless cessation affected the ability to get a technically adequate image, including the ability to sufficiently compress the breast or to get clear images. Time between mammograms was determined in part by a woman’s recommended screening interval (determined by her risk profile)(37) and her choice (median time between index and study mammograms=720 days).

Breast density (secondary outcome)

We digitized the left breast craniocaudal projection from the index and study mammograms using a Kodak Lumisys 85 scanner. We used Cumulus software(38) (University of Toronto) to measure dense area and total breast area in pixels. Small films were scanned at 87 microns/pixel and large films at 116 microns/pixel, and pixel counts were converted to cm² for dense area and breast area as follows: 7.554×10⁻⁵ cm²/pixel (small films) and 1.350×10⁻⁴ cm²/pixel (large films). Percent breast density was computed as the ratio of dense area to total breast area. All breast density measures were interpreted by the same trained reader (EAB) in batches of 50 films each. Each participant’s index and study films were read in random order within the same batch of films, with the reader blinded to participant identifiers, randomization group, and the timing of the exam. For quality assurance, we included four intra-batch repeats within each batch. The mean absolute difference (standard error, SE) in quality control samples was 3.3% (0.2%) for percent breast density, 4.7 (0.4) cm² for dense area, and 2.5 (0.2) cm² for breast area. The concordance correlation between intra-batch repeats was 0.9 for percent breast density, 0.95 for dense area, and 1.0 for breast area.
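The unit conversion and the percent density calculation described above can be illustrated with a short Python sketch. The conversion factors are the ones quoted in the text. The function name `density_measures` and the pixel counts are hypothetical and are not study data.

```python
# Conversion factors quoted in the text (cm^2 per pixel).
CM2_PER_PIXEL = {"small": 7.554e-5, "large": 1.350e-4}

def density_measures(dense_pixels, breast_pixels, film_size):
    """Return (dense area in cm^2, breast area in cm^2, percent density)."""
    factor = CM2_PER_PIXEL[film_size]
    dense_cm2 = dense_pixels * factor
    breast_cm2 = breast_pixels * factor
    # Percent density is a ratio of areas, so the pixel size cancels out.
    percent = 100.0 * dense_pixels / breast_pixels
    return dense_cm2, breast_cm2, percent

# Hypothetical pixel counts for one small film (not study data).
dense, breast, pct = density_measures(200_000, 1_600_000, "small")
```

For these made-up counts the dense area is about 15.1 cm², the breast area about 120.9 cm², and the percent density 12.5%.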

Follow-up Procedures and Monitoring

The self-administered follow-up study questionnaire was completed before the scheduled mammogram. The questionnaire asked about tolerance of and adherence to HT suspension, menopausal symptoms, and specific adverse events, but did not ask about anxiety related to mammographic procedures, possible recall, or mammographic findings. Adherence was based on self-report from the follow-up questionnaire. Women in the 1- and 2-month suspension groups were asked “Were you able to stay off your HT as instructed for this study?” Women who responded “yes” were considered adherent. Adherence among the no-suspension group was determined from a “yes” response to the question “Did you take any estrogen in the last month?” We could not assign adherence status for the 3.8% of women who did not complete these questions.

Adverse experiences

We collected adverse experience information, from recruitment to 12 months following study mammogram, for serious adverse events (myocardial infarction, angina, stroke, coronary bypass surgery, coronary angioplasty/stent, pulmonary embolism, deep vein thrombosis, or death), anticipated adverse events (depression, dizziness or faint spells, palpitations, or vaginal bleeding), and unexpected adverse events (any event not previously listed). Adverse experiences were collected from: the study hotline, the follow-up questionnaire, and monthly reviews of Group Health automated data using International Classification of Diseases, 9th Revision and Current Procedural Terminology codes. The study physician (SDR) and a designated medical monitor adjudicated all potential events identified. All serious adverse events were also submitted to the Data Safety and Monitoring Board and Group Health’s and the U.S. Department of Defense’s IRBs. The Data Safety and Monitoring Board met four times and monitored serious adverse events, anticipated adverse events, and unexpected adverse events; interim analyses were examined at two meetings to determine if stopping guidelines had been met.

Statistical analysis

In designing our study, we estimated study power based on preliminary data demonstrating a 13% mammography recall rate among women using HT.(19) Enrolling 1500 women in three equal-size groups provided 85% power to detect a 5-percentage-point decrease in the mammography recall rate (to 8% recall in the combined suspension groups).
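The stated power can be checked approximately with a normal-approximation calculation for two proportions. The sketch below is in Python and rests on assumptions the text does not spell out: a two-sided test at alpha = 0.05, 500 women in the no-suspension group, and 1000 women in the combined suspension groups. The function names are illustrative.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def power_two_proportions(p1, n1, p2, n2, z_alpha=1.959964):
    """Approximate power of a two-sided z-test comparing two proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se_null = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    se_alt = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return normal_cdf((abs(p1 - p2) - z_alpha * se_null) / se_alt)

# 13% recall with continued HT (n=500) vs. 8% with suspension (n=1000).
power = power_two_proportions(0.13, 500, 0.08, 1000)
```

Under these assumptions the result is close to the 85% reported in the text.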

Chi-square tests, Kruskal-Wallis tests, and analysis of variance were used to compare demographic and potential confounding variables among the three groups. We used generalized linear models with a log link to estimate the relative risk (RR) and 95% confidence intervals (CI) of mammography recall for women in the suspension groups relative to the no-suspension group. We used linear regression to compare the mean change from index to study mammograms in breast density across study groups. All statistical analyses were completed using STATA, v.10.0 (StataCorp. College Station, TX).
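For a single binary exposure, the unadjusted relative risk from a log-link model matches the ratio of the two observed proportions, with a Wald interval computed on the log scale. The Python sketch below illustrates this with the recall counts given in the abstract for the 2-month and no-suspension groups (44/451 and 61/542). It is an illustration of the calculation, not the authors' STATA code, and the function name is hypothetical.

```python
from math import exp, log, sqrt

def relative_risk(events_exposed, n_exposed, events_ref, n_ref, z=1.959964):
    """Unadjusted relative risk with a Wald 95% CI on the log scale."""
    rr = (events_exposed / n_exposed) / (events_ref / n_ref)
    se_log_rr = sqrt(1 / events_exposed - 1 / n_exposed
                     + 1 / events_ref - 1 / n_ref)
    lower = exp(log(rr) - z * se_log_rr)
    upper = exp(log(rr) + z * se_log_rr)
    return rr, lower, upper

# 2-month suspension (44/451) relative to no suspension (61/542).
rr, lower, upper = relative_risk(44, 451, 61, 542)
```

With these counts the relative risk is about 0.87, with an interval of roughly 0.60 to 1.25 that includes 1.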

All main analyses were based on a modified intention to treat analysis, where group differences were estimated based on randomized group assignment, regardless of adherence. Since mammography recall and breast density (the primary and secondary outcomes) could only be assessed among women completing their mammogram, all analyses were limited to women who did not withdraw from the study before their study mammogram, making a strict intention to treat analysis impossible. We report unadjusted results, because sensitivity analyses adjusted for covariates (body mass index, age, and HT type) did not alter the findings. We repeated the main analysis limited to women who adhered to the study protocol.

To investigate effect modification, we specified a priori four participant characteristics for subgroup analyses that we thought could alter the association between HT suspension and the study outcomes: age at study enrollment; duration of HT use at baseline; and estrogen and progestin strength. The associations between mammography recall and brief suspension in these subgroups were estimated by including an interaction term between randomization group and the subgroup variables in the multivariable regression models.


We approached 5861 women to assess their eligibility and to invite them to participate in the trial. We excluded 4157 women: 2999 (72.1%) refused; 977 (23.5%) were determined ineligible after initial contact (e.g., no longer taking HT, heart disease history, had recent mammogram); 179 (4.3%) could not be contacted; and 2 had unknown reasons for exclusion. Most women who refused participation (58.4%) were unwilling to stop HT, even briefly (Figure 1). Among the 4884 potentially eligible women invited, 61.4% declined participation. There were a few small differences between women who refused and those who participated. Women who refused were older than participants (mean 61.3 vs. 60.3 years, p<.001), leaner (mean=27.1 vs. 27.6 kg/m2, p<.002), less likely to be Caucasian (89.6% vs. 91.3%, p<.001), and less educated (≤high school 19.2% vs. 14.0%, p<.001).

Figure 1
Recruitment for READ Study

We randomly assigned 1704 women to the three study arms: 567 no suspension, 570 1-month suspension, and 567 2-month suspension. A total of 228 (13.4%) randomly assigned women withdrew from the study. Withdrawal differed across groups: 24/567 (4.2%) no suspension; 90/570 (15.8%) 1-month suspension; and 114/567 (20.1%) 2-month suspension. Reasons for withdrawal varied, but the majority of women in the suspension groups withdrew because they refused to discontinue HT after receiving their randomization assignment. Nearly one third of the women who withdrew from the no-suspension group (7/24) did so because they discontinued HT before their mammogram. Despite differential withdrawal by randomization group, there were no statistically significant differences in characteristics between women who withdrew after randomization and women who completed participation: age (p=.81), HT type (p=.50), body mass index (p=.181), race (p=.29), education (p=.21), breast cancer risk (p=.32), family history (p=.25), HT duration (p=.71) or breast density (p=.25). We had complete study data including mammogram assessments and both questionnaires for 95.7% of the 1476 enrolled women who remained in the study. An additional five women (one no-suspension, two 1-month suspension, and two 2-month suspension) were excluded from the primary analysis because they did not complete their study mammogram or their film was not available for assessment by the study radiologist. This resulted in a final analytic sample size of 1471.

The mean age at recruitment among enrolled women was 60.3 years, mean body mass index 27.6 kg/m2, and 91.5% of the study population was Caucasian (Table 1). There were no statistically significant differences in age, body mass index, race, ethnicity, education, or breast cancer risk across randomization groups. Most of the randomly assigned women (62.0%) used unopposed estrogen, and nearly half had previously tried to quit HT. Approximately one third of the women self-reported having hypertension, high cholesterol, and/or depression. Health status and behaviors did not differ across randomization groups. Self-reported health status was high across all groups, with <10% reporting fair or poor health (data not shown). Most women were non-smokers (56.5% never, 37.2% former) and were not heavy alcohol drinkers (28.9% never-drinkers and 14.1% drinking daily, with 70.9% of drinkers reporting having 1 drink/day on days they drank).

Table 1
Baseline Characteristics of READ Study Participants by Randomization Assignment

Adherence to random assignment was high across randomization arms (99.0% no suspension, 92.7% 1-month, and 87.3% 2-month) but was statistically significantly lower (p<.006) in the 2-month suspension group than in the 1-month group. There were no crossovers designed in this study. Women in the suspension groups were asked to report all reasons for nonadherence, and their most frequent responses included hot flashes (70.5%), night sweats (65.9%), sleep disturbances (65.9%), and “just feeling better” taking HT (64.8%).

Mammography recall

Just over one in 10 women (11.2%) were recalled for additional mammography imaging, with the largest proportion of mammography recall occurring in the women randomly assigned to 1-month suspension (12.3%), followed by no suspension (11.3%) and 2-month suspension (9.8%) (p=.45) (Table 2). Recall rates did not change when limited to women adherent with randomization (p=.39). Reasons for recall included technical (7.3%), work-up of a probably benign lesion (84.1%), work-up of a possible malignant lesion (6.1%), and other (reason not specified, 3.1%). Technical reasons were indicated as the reason for recall for 3.3% of recalls in the no-suspension group, 11.9% in the 1-month suspension group, and 6.8% in the 2-month suspension group. After excluding the women who had a recall for technical reasons from the numerator and denominator, the pattern of recall remained the same by randomization group: 1-month suspension (11.0%) followed by no suspension (10.9%) and 2-month suspension (9.2%) (p=.57).
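The three-group comparison of recall rates can be reproduced approximately with a Pearson chi-square test on the recall counts. The Python sketch below assumes that test, although the text does not state which procedure produced the reported p-value. The counts are 61/542, 59/478, and 44/451, where 59 is the 1-month count implied by the reported 12.3% rate among 478 women. The function name is illustrative.

```python
from math import exp

def chi_square_3x2(recalled, totals):
    """Pearson chi-square test for recall (yes/no) across three groups."""
    overall = sum(recalled) / sum(totals)
    stat = 0.0
    for r, n in zip(recalled, totals):
        expected_yes = n * overall
        expected_no = n * (1 - overall)
        stat += (r - expected_yes) ** 2 / expected_yes
        stat += ((n - r) - expected_no) ** 2 / expected_no
    # With 2 degrees of freedom the chi-square upper tail is exp(-x / 2).
    p_value = exp(-stat / 2)
    return stat, p_value

# No suspension, 1-month suspension, 2-month suspension.
stat, p_value = chi_square_3x2([61, 59, 44], [542, 478, 451])
```

Under these assumptions the statistic is about 1.58 and the p-value about .45, in line with the value reported above.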

Table 2
Mammography Recall Rates by Randomization Group, Stratified by Type of Hormone Therapy

Recall patterns varied by HT strata. ET users had statistically significantly lower recall (9.7%) than EPT users (13.5%) (p=.03). There were no statistically significant differences in recall across randomization groups for ET (p=.42) or EPT users (p=.80). Among ET users, there was a suggestion that women in the 1-month suspension group had a higher recall rate compared to the no-suspension group. Among EPT users, recall rates had statistically non-significant drops from 14.5% among continued users to 12.1% for women who stopped EPT for 2 months.

Pre-specified effect modification analyses by age at recruitment, HT use duration, and HT strength identified no subgroup with statistically significant decreases in mammography recall (data not shown).

Mammographic breast density

Recall rates increased with increasing amounts of dense area on mammograms but were affected little by percent density (Table 3). After adjusting for body mass index and age, there was an increase in the relative risk of recall with increasing mammographic breast density (percent and dense area), but there was no relation between change in density and recall rates.

Table 3
Recall rate and the adjusted* relative risk of recall by mammographic breast density (percent and dense area) at study mammogram and change in breast density from the index to study mammogram

There were ordered and statistically significant decreases in percent breast density and dense area with 1 and 2 months of HT suspension (Table 4); decreases in density were greater with longer suspension. Percent breast density decreased in both suspension groups, with a mean (SE) change of −0.9% (0.3%) in the 1-month suspension group and −1.5% (0.4%) in the 2-month group, with no change [0.1% (0.3%)] in the no-suspension group. Comparable ordered declines were observed for dense area. Declines in percent breast density and dense area were larger after excluding non-adherent women.

Table 4
Breast density (Mean, SE) and Change in Breast density by Randomization Group

When stratified by HT type, changes in percent breast density and dense area followed the same pattern as for users overall, but statistically significant differences were limited to EPT users: mean changes in percent breast density were −1.3% with 1-month suspension (p=.009 vs. no suspension) and −2.2% with 2-month suspension (p<.0001 vs. no suspension), compared with essentially no change (0.6%) in the no-suspension group.

In our sub-analyses by age, duration of HT use before index mammogram, and estrogen and progestin strength, we observed the greatest declines in percent breast density and dense area among women <60 years and in women using standard progestin dose (Figure 2). Virtually every subgroup analysis showed ordered decreases in breast density with increasing HT suspension. We observed a nonsignificant increase in mammography recall for women on progestin doses of medroxyprogesterone acetate 2.5 mg compared to higher doses. Although we did not assess continuous vs. cyclical patterns, medroxyprogesterone acetate 2.5 mg is almost never given in a cyclical pattern and is the most common progestin dose for a continuous regimen.

Figure 2
Difference Between Index and Study Mammogram in Means of Percent Breast density and Dense Area for 1-Month Suspension vs. No-Suspension and 2-Month Suspension vs. No-Suspension, by Age, Duration of Hormone Therapy Use, Body Mass Index, and Estrogen and ...

Adverse events and symptoms

Eight serious adverse events occurred among randomized participants (through 7/18/08): two myocardial infarctions (no-suspension), two strokes (one no-suspension, one 2-month), and four deaths (two no-suspension, one 1-month, one 2-month) (data not shown). The no-suspension group had the most unanticipated events: three diagnoses of breast cancer, two women with atypical chest pain, and eight other events (e.g., fracture/fall, laceration). Six unanticipated events occurred among women in the 1-month group, but none occurred in the 2-month group. All women diagnosed with breast cancer adhered to randomization. Of the women diagnosed with breast cancer, all three in the no-suspension group used EPT, and one in the 1-month suspension group used ET until 1 month before her mammogram.

Vaginal bleeding was the most common anticipated adverse event, occurring in seven women in 1-month, six in 2-month, and three in no-suspension. Hot flashes, sleep disturbances and night sweats were the symptoms that increased the most in the suspension groups, with the greatest increase in symptoms in the 2-month suspension group (Table 5). None of our measured menopausal symptoms improved in either suspension group.

Table 5
Mean of Symptom Severity Scores by Intervention Group at Baseline and Follow-up and Change in Severity of Symptoms


We have demonstrated that short-term suspension of ET or EPT for 1 or 2 months is associated with small changes in breast density and that suspension does not affect mammography recall rates. Although most women who agreed to participate in our study tolerated short-term HT suspension, suspension was associated with statistically significant increases in menopausal symptoms for women using ET or EPT.

No large-scale randomized trials have measured how quickly density changes with HT suspension. Moreover, no studies have quantified the amount of density reduction needed to improve mammography performance (recall, sensitivity, specificity or overall accuracy). Theoretically, even small decreases in breast density and recall have the potential to have a clinically meaningful impact on screening performance in larger groups. However, no studies have quantified the amount of decrease in continuous mammographic breast density (percent or area) needed to reduce mammography recall.

We designed our intervention to test 1- and 2-month HT suspension because we believed it was impractical for women to suspend HT for longer periods between annual and biennial screening mammograms. Previous studies(11, 18, 19) have demonstrated clinically and statistically significant improvements in mammography recall with HT suspension, but those studies were not randomized and were based on longer periods of HT suspension. The majority (61%) of eligible women approached to participate in this trial refused because they were unwilling to consider short-term HT suspension, even if short-term suspension might improve mammography quality. There were no clinical or statistical differences in breast density or HT type or duration for women who refused compared to women who participated. Among women who agreed to participate in the trial, 13% withdrew after randomization; there were also no clinical or statistical differences between women who withdrew after randomization and women who remained in the study. Our study was designed to determine the effectiveness of short-term HT suspension, making the high proportion of refusals an important finding of this study.

Linear trends of increased recall were seen with increasing percent density and increasing dense area at the study mammogram, with slightly stronger effects for dense area. However, these trends were not observed between recall and change in percent density or change in dense area. These findings emphasize that density is an important factor influencing recall rates, but that a change in density likely needs to be large to have any clinical effect on recall rates.

Percent density changes in our study are similar in magnitude to changes observed during a menstrual cycle(14) and the menopause transition.(39) They are also comparable to other smaller studies measuring change in breast density over longer (12-month) periods. The exception is the Postmenopausal Estrogen/Progestin Intervention (PEPI) Trial, a randomized trial including 571 women aged 45–64(40), designed to examine differences in heart disease risk factors by ET and EPT vs. placebo. PEPI reported some of the largest mean changes in percent breast density over a 12-month period, with breast density increasing by 3.1% to 4.8% for women in the EPT groups and decreasing by 0.1% in the placebo group.(40) Additionally, one small (n=48) observational study found a reduction in density from 2 weeks of HT suspension, and suggested brief suspension could improve mammography performance.(20) However, this study was limited to women with new or enlarging circumscribed breast masses, which excluded abnormalities that were unlikely to be related to HT use. Our current study was not designed to examine pattern changes (e.g., masses/calcifications) in density in response to HT suspension, but this may be an important area for future examination.

Assessing clinically meaningful changes in breast density is challenging for continuous and BI-RADS density measures.(41) Large changes in density are needed to shift BI-RADS categories if radiologists use the recommended distribution of 25% density in each BI-RADS category.(30) Moving one BI-RADS breast density category amounts to a 14%-18% change in breast density.(40) Boyd(42) previously estimated the relative risk of breast cancer increases by 2% for each 1% increase in percent breast density when measured continuously. However, the clinical significance of changing breast density remains unknown, particularly in the subgroup of women for whom HT increases breast density. Understanding characteristics of women whose breast density changes with HT and other exogenous exposures, including which breast-level changes result, may be critical to understanding hormone exposures in breast cancer development and detection.
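As a rough illustration of the scale involved, the following sketch compares the mean changes in percent density reported in this trial with the 14%-18% change associated with moving one BI-RADS category. The function and variable names are hypothetical, and treating the category shift as a simple threshold in percentage points is a simplifying assumption, not part of the trial analysis.

```python
# Illustrative arithmetic only; not the authors' analysis.
# Mean change in percent density by study group, as reported for this trial.
MEAN_CHANGE_PCT_POINTS = {
    "no suspension": 0.1,
    "1-month suspension": -0.9,
    "2-month suspension": -1.5,
}

# Approximate change in percent density needed to move one BI-RADS
# density category, as quoted in the text (14%-18%).
CATEGORY_SHIFT_RANGE = (14.0, 18.0)


def fraction_of_category_shift(change_pts: float, shift_pts: float) -> float:
    """Return the size of a density change as a fraction of one category shift."""
    return abs(change_pts) / shift_pts


if __name__ == "__main__":
    low_shift, high_shift = CATEGORY_SHIFT_RANGE
    for group, change in MEAN_CHANGE_PCT_POINTS.items():
        # The larger threshold gives the smaller fraction, and vice versa.
        smallest = fraction_of_category_shift(change, high_shift)
        largest = fraction_of_category_shift(change, low_shift)
        print(f"{group}: {smallest:.1%} to {largest:.1%} of one category shift")
```

Under these assumptions, the mean change in the 2-month group amounts to roughly 8% to 11% of the change needed to shift one BI-RADS category.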

Several small randomized trials, and one large study nested in a randomized trial, have assessed the effects on density of various HT formulations, including dose, drug type, application, and duration of use.(43–48) Together, these studies suggest estrogens and progestins increase density in some women, with a differential effect by route and dose. Our study was insufficiently large to estimate recall and density changes by HT dose, mode (continuous sequential vs. cyclical EPT), route, or type (synthetic vs. natural). Interestingly, we observed a suggestion of higher mammography recall for women using medroxyprogesterone acetate 2.5 mg (commonly used in continuous regimens) than for women using higher progestin doses (used cyclically), suggesting that suspension of continuous EPT may increase recall rates, although full interpretation of these findings is limited by a small sample size. In vitro studies of breast tissue suggest continuous progestin inhibits cell growth.(49, 50) If continuous progestins have an anti-proliferative effect, it is possible that upon withdrawal, particularly in the month following suspension, breast tissue may experience rebound proliferation leading to an increase in recall.

We recruited a population of women who were due for a screening mammogram and for whom we had access to index and study mammogram images within our clinical system. Electronic pharmacy records enabled us to target recruitment to women who were still receiving HT dispensings and to block-randomize women by HT type (ET vs. EPT). Automated pharmacy data allowed evaluation of outcome differences by HT type, but we were unable to examine differences by HT mode, although inferences based on progestin dose can be made.

We designed this study to minimize measurement error, including having one breast density interpreter; using one computer, monitor, and scanner; ensuring distribution of women across randomization groups in each breast density batch; and including intra- and inter-rater quality control readings in each batch. Despite all these controls, the variability in quantitative density assessments is important, and can limit the ability to detect change.(41) The standard error of the mean change in percent breast density ranged from 0.3% to 0.4%. This resembles other studies that have reported measurement error in mammographic breast density.
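To give a sense of what a standard error of this size implies, the sketch below converts the reported range into an approximate 95% confidence interval half-width. It assumes a normal approximation (a multiplier of 1.96), which is an illustrative assumption and not a description of the statistical methods used in the trial; the function name is hypothetical.

```python
# Illustrative arithmetic only; assumes a normal approximation.
Z_95 = 1.96  # two-sided 95% multiplier under the normal approximation

# Range of standard errors for mean change in percent density quoted in the text.
STANDARD_ERRORS_PCT_POINTS = (0.3, 0.4)


def ci_half_width(standard_error: float, z: float = Z_95) -> float:
    """Return the half-width of an approximate confidence interval."""
    return z * standard_error


if __name__ == "__main__":
    for se in STANDARD_ERRORS_PCT_POINTS:
        print(f"SE {se:.1f} points -> about +/- {ci_half_width(se):.2f} points")
```

Under this approximation, a mean change would be estimated to within roughly 0.6 to 0.8 percentage points.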

There is an extensive literature on variability in mammography interpretive performance.(23, 51–54) We purposefully used an expert radiologist to interpret all study mammograms in the same setting to decrease recall variability that could arise from including multiple radiologists.(23, 51–54) Having recall defined by one expert is a strength of the study, which increases internal validity by reducing variation in recall due to inter-reader variation, but it could also decrease the generalizability of the findings if other radiologists have different patterns of recall.(55)

Our results are generalizable to women aged 45–80 years who have used HT for at least 1 year; nearly half our study population had used HT for ≥15 years. With no evidence of an interaction between duration of EPT or ET use and mammography recall, we have no reason to believe that results from this trial cannot be generalized to women who have shorter-term HT use than the women in our study. We found no evidence of a benefit for short-term HT suspension in any age group, including women over 70 years, an age group previously demonstrated to have the greatest reduction in recall from HT suspension.(18, 19)

Any intervention that can decrease mammography recall has important public health consequences regardless of the intervention effect size; false-positive mammograms account for 25% of the overall costs of US mammography.(56) In the US population in 2005, Medicare reimbursed 11.4 million screening mammograms among non-HMO women between ages 50 and 79 years.(57) Finding interventions to decrease recall for additional imaging without compromising sensitivity is critical to the sustainability of breast cancer detection in the US. Results from this large, population-based randomized trial provide evidence that brief HT suspension before screening mammography does not decrease mammography recall for any subgroups of women using HT, despite decreases in breast density. Our results also demonstrate a substantial negative impact on women from increased menopausal symptoms. For breast cancer detection and prevention, we need to understand whether subsets of women whose breast density changes in response to endogenous exposures (e.g., parity, cyclical changes) and exogenous exposures (e.g., various HT types and doses, tamoxifen, aromatase inhibitors) may benefit differentially from interventions designed to decrease mammographic breast density.


Role of funding source

The READ study was funded by the Department of Defense (PI, D Buist; DAMD17-03-1-0447). Registered clinical trial number: NCT00117663. Study participants were recruited from the Group Health Breast Cancer Screening Program funded by the National Cancer Institute (PI: D Buist, U01CA63731). The study sponsors had no involvement in the study design; the collection, analysis, and interpretation of the data; the writing of the report; or the decision to submit the paper for publication.

The study team had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. This study could not have been completed without the assistance of Tammy Dodd, Linda Palmer, RN, BS or Melissa Rabelhofer. We would further like to thank members of our advisory board: Hermien Watkins, Paula Hoffman, Deb Schiro, and Margrit Schubiger; members of the Data Safety and Monitoring Board: Susan Heckbert, MD, PhD, Chair, University of Washington Department of Epidemiology; Ben Anderson, MD, University of Washington; Mary Anne Rossing, DVM, PhD, Fred Hutchinson Cancer Research Center; Robert D. Rosenberg, MD, University of New Mexico; Thomas Lumley, PhD, University of Washington Department of Biostatistics; and Elizabeth Lin, MD, Group Health Permanente medical monitor. We would further like to thank Robert Karl, MD, Donna White, MD and Jo Ellen Callahan for their support for implementing this trial at Group Health. We also acknowledge Stephen Taplin, MD, MPH for his collaboration in getting this study funded when he was an investigator at Group Health Cooperative. We thank Rebecca Hughes for her editorial assistance.

Drs. Reed and Newton receive grant support from Pfizer.

Grant support:

“A population-based trial to assess the effects of brief hormone therapy (HT) suspension on mammography assessments and breast density.” The READ study was funded by the Department of Defense (PI, D Buist; DAMD17-03-1-0447). Registered clinical trial number: NCT00117663. Study participants were recruited from the Group Health Breast Cancer Screening Program funded by the National Cancer Institute (PI: D Buist, U01CA63731).


Publisher's Disclaimer: “This is the prepublication, author-produced version of a manuscript accepted for publication in Annals of Internal Medicine. This version does not include post-acceptance editing and formatting. The American College of Physicians, the publisher of Annals of Internal Medicine, is not responsible for the content or presentation of the author-produced accepted version of the manuscript or any version that a third party derives from it. Readers who wish to access the definitive published version of this manuscript and any ancillary material related to this manuscript (e.g., correspondence, corrections, editorials, linked articles) should go to or to the print issue in which the article appears. Those who cite this manuscript should cite the published version, as it is the official version of record.”

Trial Registration: ClinicalTrials.gov identifier: NCT00117663

Protocol: Available to interested readers by contacting Diana Buist, PhD at buist.d@ghc.org

Statistical Code: Available to interested readers by contacting Melissa Anderson, MS at Anderson.Melissa@ghc.org

Data: not available


1. American Cancer Society. Cancer Facts and Figures 2008. 2008
2. Humphrey LL, Helfand M, Chan BK, Woolf SH. Breast cancer screening: a summary of the evidence for the U.S. Preventive Services Task Force. Ann Intern Med. 2002;137(5 Part 1):347–360. [PubMed]
3. National Cancer Institute BCSC. Abnormal Interpretations for 4,032,556 Screening Mammography Examinations from 1996–2005. 2007
4. Brown ML, Houn F, Sickles EA, Kessler LG. Screening mammography in community practice: positive predictive value of abnormal findings and yield of follow-up diagnostic procedures. Am J Roentgenol. 1995;165(6):1373–1377. [PubMed]
5. Barton MB, Moore S, Polk S, Shtatland E, Elmore JG, Fletcher SW. Increased patient concern after false-positive mammograms: clinician documentation and subsequent ambulatory visits. J Gen Intern Med. 2001;16(3):150–156. [PMC free article] [PubMed]
6. Randal J. For breast cancer screening, there is room for improvement, report concludes. J Natl Cancer Inst. 2004;96(16):1200–1201. [PubMed]
7. Carney PA, Miglioretti DL, Yankaskas BC, et al. Individual and combined effects of age, breast density, and hormone replacement therapy use on the accuracy of screening mammography. Ann Intern Med. 2003;138(3):168–175. [PubMed]
8. Ma L, Fishell E, Wright B, Hanna W, Allan S, Boyd NF. Case-control study of factors associated with failure to detect breast cancer by mammography. J Natl Cancer Inst. 1992;84(10):781–785. [PubMed]
9. Mandelson MT, Oestreicher N, Porter PL, et al. Breast density as a predictor of mammographic detection: comparison of interval-and screen-detected cancers. J Natl Cancer Inst. 2000;92(13):1081–1087. [PubMed]
10. Byrne C, Schairer C, Wolfe J, et al. Mammographic features and breast cancer risk: effects with time, age, and menopause status. J Natl Cancer Inst. 1995;87(21):1622–1629. [PubMed]
11. Laya MB, Gallagher JC, Schreiman JS, Larson EB, Watson P, Weinstein L. Effect of postmenopausal hormonal replacement therapy on mammographic density and parenchymal pattern. Radiology. 1995;196(2):433–437. [PubMed]
12. Rosenberg RD, Hunt WC, Williamson MR, et al. Effects of age, breast density, ethnicity, and estrogen replacement therapy on screening mammographic sensitivity and cancer stage at diagnosis: review of 183,134 screening mammograms in Albuquerque, New Mexico. Radiology. 1998;209(2):511–518. [PubMed]
13. Laya MB, Larson EB, Taplin SH, White E. Effect of estrogen replacement therapy on the specificity and sensitivity of screening mammography. J Natl Cancer Inst. 1996;88(10):643–649. [PubMed]
14. Buist DS, Aiello EJ, Miglioretti DL, White E. Mammographic breast density, dense area, and breast area differences by phase in the menstrual cycle. Cancer Epidemiol Biomarkers Prev. 2006;15(11):2303–2306. [PubMed]
15. White E, Velentgas P, Mandelson M, et al. Variation in mammographic breast density by time in menstrual cycle among women aged 40–49 years. J Natl Cancer Inst. 1998;90(12):906–910. [PubMed]
16. Ursin G, Parisky YR, Pike MC, Spicer DV. Mammographic density changes during the menstrual cycle. Cancer Epidemiol Biomarkers Prev. 2001;10(2):141–142. [PubMed]
17. Baines CJ, Vidmar M, McKeown-Eyssen G, Tibshirani R. Impact of menstrual phase on false-negative mammograms in the Canadian National Breast Screening Study. Cancer. 1997;80(4):720–724. [PubMed]
18. Boudreau DM, Buist DS, Rutter CM, Fishman PA, Beverly KR, Taplin S. Impact of hormone therapy on false-positive recall and costs among women undergoing screening mammography. Med Care. 2006;44(1):62–69. [PubMed]
19. Rutter CM, Mandelson MT, Laya MB, Seger DJ, Taplin S. Changes in breast density associated with initiation, discontinuation, and continuing use of hormone replacement therapy. JAMA. 2001;285(2):171–176. [PubMed]
20. Harvey JA, Pinkerton JV, Herman CR. Short-term cessation of hormone replacement therapy and improvement of mammographic specificity. J Natl Cancer Inst. 1997;89(21):1623–1625. [PubMed]
21. Speroff L. The impact of the Women's Health Initiative on clinical practice. J Soc Gynecol Investig. 2002;9(5):251–253. [PubMed]
22. Burnside ES, Trentham-Dietz A, Kelcz F, Collins J. An example of breast cancer regression on imaging. Radiology Case Reports. 2006;1:27–37. [PMC free article] [PubMed]
23. Institute of Medicine. Improving Breast Imaging Quality Standards. Washington, D.C: The National Academies Press; 2005.
24. Smith-Bindman R, Chu PW, Miglioretti DL, et al. Comparison of screening mammography in the United States and the United Kingdom. JAMA. 2003;290(16):2129–2137. [PubMed]
25. Elmore JG, Nakano CY, Koepsell TD, Desnick LM, D'Orsi CJ, Ransohoff DF. International variation in screening mammography interpretations in community-based programs. J Natl Cancer Inst. 2003;95(18):1384–1393. [PMC free article] [PubMed]
26. Smith-Bindman R, Chu P, Miglioretti DL, et al. Physician predictors of mammographic accuracy. J Natl Cancer Inst. 2005;97(5):358–367. [PubMed]
27. Hofvind S, Vacek PM, Skelly J, Weaver DL, Geller BM. Comparing screening mammography for early breast cancer detection in Vermont and Norway. J Natl Cancer Inst. 2008;100(15):1082–1091. [PMC free article] [PubMed]
28. Taplin SH, Thompson RS, Schnitzer F, Anderman C, Immanuel V. Revisions in the risk-based Breast Cancer Screening Program at Group Health Cooperative. Cancer. 1990;66(4):812–818. [PubMed]
29. Taplin SH, Ichikawa L, Buist DS, Seger D, White E. Evaluating organized breast cancer screening implementation: the prevention of late-stage disease? Cancer Epidemiol Biomarkers Prev. 2004;13(2):225–234. [PubMed]
30. American College of Radiology. Breast imaging reporting and data system (BIRADS™) 3rd ed. Reston, VA: American College of Radiology; 1998.
31. Lee HP, Gourley L, Duffy SW, Esteve J, Lee J, Day NE. Risk factors for breast cancer by age and menopausal status: a case-control study in Singapore. Cancer Causes Control. 1992;3(4):313–322. [PubMed]
32. Wiklund I, Holst J, Karlberg J, et al. A new methodological approach to the evaluation of quality of life in postmenopausal women. Maturitas. 1992;14(3):211–224. [PubMed]
33. Wiklund I, Karlberg J. Evaluation of quality of life in clinical trials. Selecting quality-of-life measures. Control Clin Trials. 1991;12(4 Suppl):204S–216S. [PubMed]
34. Buist DS, Newton KM, Miglioretti DL, et al. Hormone therapy prescribing patterns in the United States. Obstet Gynecol. 2004;104(5 Pt 1):1042–1050. [PubMed]
35. North American Menopause Society. Menopause Core Curriculum Study Guide. 2nd ed. Cleveland, OH: North American Menopause Society; 2002.
36. A new conjugated estrogen. Med Lett Drugs Ther. 1999;41(1058):67–68. [PubMed]
37. Taplin SH, Ichikawa L, Yood MU, et al. Reason for late-stage breast cancer: absence of screening or detection, or breakdown in follow-up? J Natl Cancer Inst. 2004;96(20):1518–1527. [PubMed]
38. Byng JW, Boyd NF, Fishell E, Jong RA, Yaffe MJ. Automated analysis of mammographic densities. Phys Med Biol. 1996;41(5):909–923. [PubMed]
39. Kelemen LE, Pankratz VS, Sellers TA, et al. Age-specific trends in mammographic density: the Minnesota Breast Cancer Family Study. Am J Epidemiol. 2008;167(9):1027–1036. [PubMed]
40. Greendale GA, Reboussin BA, Slone S, Wasilauskas C, Pike MC, Ursin G. Postmenopausal hormone therapy and change in mammographic density. J Natl Cancer Inst. 2003;95(1):30–37. [PubMed]
41. Byrne C. Invited commentary: assessing breast density change--lessons for future studies. Am J Epidemiol. 2008;167(9):1037–1040. [PubMed]
42. Boyd NF, Lockwood GA, Byng JW, Tritchler DL, Yaffe MJ. Mammographic densities and breast cancer risk. Cancer Epidemiol Biomarkers Prev. 1998;7(12):1133–1144. [PubMed]
43. Marchesoni D, Driul L, Ianni A, et al. Postmenopausal hormone therapy and mammographic breast density. Maturitas. 2006;53(1):59–64. [PubMed]
44. Christodoulakos GE, Lambrinoudaki IV, Vourtsi AD, et al. The effect of low dose hormone therapy on mammographic breast density. Maturitas. 2006;54(1):78–85. [PubMed]
45. Erel CT, Esen G, Seyisoglu H, et al. Mammographic density increase in women receiving different hormone replacement regimens. Maturitas. 2001;40(2):151–157. [PubMed]
46. Junkermann H, von Holst T, Lang E, Rakov V. Influence of different HRT regimens on mammographic density. Maturitas. 2005;50(2):105–110. [PubMed]
47. Lundstrom E, Wilczek B, von Palffy Z, Soderqvist G, von Schoultz B. Mammographic breast density during hormone replacement therapy: differences according to treatment. Am J Obstet Gynecol. 1999;181(2):348–352. [PubMed]
48. Harvey J, Scheurer C, Kawakami FT, Quebe-Fehling E, de Palacios PI, Ragavan VV. Hormone replacement therapy and breast density changes. Climacteric. 2005;8(2):185–192. [PubMed]
49. Lange CA, Richer JK, Horwitz KB. Hypothesis: Progesterone primes breast cancer cells for cross-talk with proliferative or antiproliferative signals. Mol Endocrinol. 1999;13(6):829–836. [PubMed]
50. Groshong SD, Owen GI, Grimison B, et al. Biphasic regulation of breast cancer cell growth by progesterone: role of the cyclin-dependent kinase inhibitors, p21 and p27(Kip1). Mol Endocrinol. 1997;11(11):1593–1607. [PubMed]
51. Beam CA, Guse C, Sullivan D. A sequential chart for the audit-based evaluation of screening mammogram interpretation. Acad Radiol. 1999;6:216–223. [PubMed]
52. Beam CA, Layde P, Sullivan D. Variability in the interpretation of screening mammograms by US radiologists. Arch Intern Med. 1996;156:209–213. [PubMed]
53. Elmore JG, Wells C, Lee C, Howard D, Feinstein A. Variability in radiologists' interpretations of mammograms. N Engl J Med. 1994;331:1493–1499. [PubMed]
54. Kerlikowske K, Grady D, Barclay J, et al. Variability and accuracy in mammographic interpretation using the American College of Radiology breast imaging reporting and data system. J Natl Cancer Inst. 1999;90(23):1801–1809. [PubMed]
55. Taplin SH, Rutter CM, Lehman CD. Testing the effect of computer-assisted detection on interpretive performance in screening mammography. AJR Am J Roentgenol. 2006;187(6):1475–1482. [PubMed]
56. Elmore JG, Barton MB, Moceri VM, Polk S, Arena PJ, Fletcher SW. Ten-year risk of false positive screening mammograms and clinical breast examinations. N Engl J Med. 1998;338(16):1089–1096. [PubMed]
57. Centers for Medicare & Medicaid Services. In the Medicare population in the year 2005, 11.4 million non-HMO women between the ages of 50 to 79 years of age received a reimbursed screening mammogram. Centers for Medicare & Medicaid Services. 2005;Vol. 2008
58. Reed SD, Newton KM, LaCroix AZ, Grothaus LC, Ehrlich K. Night sweats, sleep disturbance, and depression associated with diminished libido in late menopausal transition and early postmenopause: baseline data from the Herbal Alternatives for Menopause Trial (HALT). Am J Obstet Gynecol. 2007;196(6):593.e1–593.e7; discussion 593.e7. [PMC free article] [PubMed]