As the influence of estrogen alone on breast cancer detection is not established, we examined this issue in the Women's Health Initiative trial, which randomly assigned 10,739 postmenopausal women with prior hysterectomy to conjugated equine estrogen (CEE; 0.625 mg/d) or placebo.
Screening mammography and breast exams were performed at baseline and annually. Breast biopsies were based on clinical findings. Effects of CEE alone on breast cancer detection were determined by using receiver operating characteristic (ROC) analyses of mammogram performance.
After a 7.1-year mean follow-up, fewer invasive breast cancers were diagnosed in the CEE than in the placebo group, but the difference was not statistically significant. Use of CEE alone increased mammograms with short-interval follow-up recommendations (cumulative, 39.2% v 29.6%; P < .001) but not abnormal mammograms (ie, those suggestive of or highly suggestive of malignancy; cumulative, 7.3% v 7.0%; P = .41). Breast biopsies were more frequent in the CEE group (cumulative, 12.5% v 10.7%; P = .004) and less commonly diagnosed as cancer (8.9% v 15.8%, respectively, with positive biopsies; P = .04). Mammographic breast cancer detection in the CEE group was significantly compromised only in the early years of use.
CEE alone use for 5 years results in approximately one in 11 and one in 50 women having otherwise avoidable mammograms with short-interval follow-up recommendations or breast biopsies, respectively. Although the breast biopsies on CEE were less commonly diagnosed as cancer, breast cancer detection was not substantially compromised. These findings differ from estrogen-plus-progestin use, for which significantly increased abnormal mammograms and a compromise in breast cancer detection are seen.
After initial reports from the Women's Health Initiative (WHI) trial of combined hormone therapy,1,2 use of menopausal hormonal therapy substantially decreased3,4 but remains widespread. In particular, estrogen alone, indicated for women with prior hysterectomy primarily for climacteric symptoms, is still used by millions of women in the United States.5
In the Women's Health Initiative (WHI) clinical trial for postmenopausal women without prior hysterectomy, combined hormone therapy with conjugated equine estrogen (CEE) plus medroxyprogesterone acetate (MPA) increased breast cancer incidence2,6 and compromised breast cancer detection, contributing to diagnostic delay.7 In contrast, in the WHI randomized trial evaluating CEE alone in women with prior hysterectomy, breast cancer incidence was not increased.8 Although mammograms with short interval follow-up recommendations were increased,8 details of CEE effects on breast cancer detection were not reported. Therefore, we assessed CEE influence on breast cancer detection by means of screening mammography and breast biopsy during the trial.
A total of 10,739 postmenopausal women were enrolled in the WHI trial evaluating CEE alone at 40 clinical centers between 1993 and 1998. Detailed eligibility criteria and recruitment procedures have been described.9,10,11 Eligibility included age between 50 and 79 years, postmenopausal status, and requirement for written informed consent. Major exclusions included prior breast cancer, other prior cancer within 10 years except nonmelanoma skin cancer, or medical conditions likely to result in death in 3 years. Women using menopausal hormones were eligible after a 3-month washout. A baseline mammogram and breast clinical exam not suggestive of cancer were eligibility requirements. Women in the CEE-alone trial also could participate in the WHI dietary modification (DM) trial and/or the WHI trial of calcium and vitamin D (CaD). Approximately 30% and 60% joined the DM and CaD trials, respectively. The study was approved by human subjects committees at each institution, and all participants provided written informed consent.
Participants were randomly assigned to CEE (Premarin 0.625 mg; Wyeth Ayerst, Collegeville, PA) or an identical-appearing placebo. A computerized randomization procedure was developed by the WHI Clinical Coordinating Center (CCC) and was implemented at local clinical centers. Both participants and staff were blinded to study medication allocation.
Baseline information was collected by using standardized self-report instruments and a brief physical exam. Interviewer-administered questionnaires were used to collect information on past hormone therapy use.9
Follow-up included a contact 6 weeks after random assignment to assess adherence, at 6-month intervals to assess clinical outcomes, and annually for clinic visits. Mammography was performed at WHI clinical centers as well as at more than 3,000 community sites.12 Mammogram reports were obtained, reviewed at the local clinical centers, and coded to reflect radiologist recommendations. Screening mammograms and clinical breast exams were required annually. Dispensing of study medications required their completion and clearance of findings suggestive of breast cancer. Therefore, mammograms with recommendations for additional imaging evaluation (equivalent to BIRADS category 0)13 were considered incomplete studies and were treated as missing mammograms for these analyses. Work-up of breast findings, including recommendations for biopsy, was directed primarily by community physicians (Fig 1, CONSORT diagram).
Breast cancer self reports were verified by centrally trained WHI physician adjudicators who reviewed medical records and pathology reports (available in 98.2% of participants).14 Breast cancers required histologic confirmation. Final adjudication and coding were performed at the WHI Clinical Coordinating Center by using the Surveillance Epidemiology and End Results coding system.15 The intervention was stopped by the National Institutes of Health after a mean intervention period of 7.1 years on the basis of increased stroke risk and assessment that benefit for coronary heart disease was unlikely.16
Participant characteristics were compared in the two randomized groups by using χ2 statistics or t tests. Breast cancer results by randomly assigned group were assessed with time-to-event methods and were based on the intent-to-treat principle. Breast cancer incidence was compared by using hazard ratios (HRs) and corresponding 95% CIs estimated from Cox proportional hazard models stratified by age and dietary modification trial random assignment. Subgroup analyses were determined with Cox proportional hazard models, for which P values were determined with Wald statistics.
The observations in this report are based on mammogram recommendations after screening mammograms performed in protocol-defined time windows. Information on performance of diagnostic mammograms (ie, those driven by clinical findings between screenings) was not captured and is not a component of these results.
Estimates of mammogram sensitivity, specificity, and positive and negative predictive values were compared by randomly assigned group. CIs were calculated by using the efficient-score method with continuity correction. Breast cancer occurrences were defined as invasive breast cancer or ductal carcinoma in situ. Mammograms with findings suggestive or highly suggestive of malignancy were considered abnormal or positive. All other completed mammograms were considered negative. A positive mammogram was considered a true positive if breast cancer was diagnosed within 1 year of the mammogram. A negative mammogram was considered a true negative if breast cancer was not diagnosed within 1 year. Specificity was defined as the percentage of true-negative exams among women without a breast cancer diagnosis within 1 year. Sensitivity was defined as the percentage of true-positive exams among women with a breast cancer diagnosis made within 1 year. Positive predictive value was defined as the percentage of true-positive exams among women with positive exams. Similarly, negative predictive value was the percentage of true-negative exams among women with negative exams.
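The four performance measures defined above, together with a score-type CI, can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes that the "efficient-score method with continuity correction" corresponds to the continuity-corrected Wilson score interval, the function names are hypothetical, and the 2 × 2 counts are invented for the example and are not trial data.

```python
import math


def wilson_cc_interval(successes, n, z=1.959964):
    """Continuity-corrected Wilson score interval for a proportion.

    Assumed here to be what the text calls the efficient-score method
    with continuity correction; z defaults to the 95% normal quantile.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    q = 1.0 - p
    denom = 2.0 * (n + z * z)
    if successes == 0:
        lower = 0.0
    else:
        root = math.sqrt(z * z - 2 - 1 / n + 4 * p * (n * q + 1))
        lower = (2 * n * p + z * z - 1 - z * root) / denom
    if successes == n:
        upper = 1.0
    else:
        root = math.sqrt(z * z + 2 - 1 / n + 4 * p * (n * q - 1))
        upper = (2 * n * p + z * z + 1 + z * root) / denom
    return max(0.0, lower), min(1.0, upper)


def screening_metrics(tp, fp, fn, tn):
    """Return {name: (estimate, lower, upper)} for a 2 x 2 screening table.

    tp/fn: positive/negative mammograms followed by a cancer diagnosis
    within 1 year; fp/tn: positive/negative mammograms without one.
    """
    pairs = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "ppv": (tp, tp + fp),
        "npv": (tn, tn + fn),
    }
    return {
        name: (num / den, *wilson_cc_interval(num, den))
        for name, (num, den) in pairs.items()
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the functions.
    for name, (est, lo, hi) in screening_metrics(40, 160, 10, 4790).items():
        print(f"{name}: {est:.3f} ({lo:.3f} to {hi:.3f})")
```

Because each measure is a simple proportion, the same interval function serves all four; only the numerator and denominator change.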
Receiver operating characteristic curves (ROCs) and the corresponding area under the curve (AUC) were used to assess the diagnostic accuracy of mammograms.17 The ROCs plot the true-positive rate (ie, sensitivity) versus the false-positive rate (ie, 1 − specificity) for a mammogram result, as the cutoff for defining an abnormal test is allowed to vary across the range of the five possible mammogram recommendation categories (ie, normal, benign, short-interval follow-up recommended, suggestive of cancer, and highly suggestive of cancer). The AUC is a general measure of the diagnostic accuracy of the mammography. An ROC that corresponds to a fair coin toss classifier (ie, a nonpredictive model) is a straight line connecting the coordinates (0,0) to (1,1) and has an AUC of 0.50. An ROC that corresponds to a perfect classifier is a pair of vertical and horizontal lines connecting the coordinates (0,0) to (0,1) to (1,1) and has an AUC of 1.00. CIs for the AUC were computed by using the bootstrap method.
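As a rough illustration of how an ROC is traced over the five ordered recommendation categories, the following Python sketch lowers the cutoff one category at a time and integrates the resulting points with the trapezoid rule. The function names and all counts are hypothetical; the bootstrap CIs and the GEE-based comparisons used in the trial are not reproduced here.

```python
# Ordered from least to most suspicious, as in the text.
CATEGORIES = (
    "normal",
    "benign",
    "short-interval follow-up",
    "suggestive of cancer",
    "highly suggestive of cancer",
)


def roc_points(cancer_counts, no_cancer_counts):
    """Return (false-positive rate, true-positive rate) points.

    Each argument holds counts per category in CATEGORIES order. The
    cutoff for "abnormal" starts above the top category and is lowered
    one category at a time.
    """
    total_pos = sum(cancer_counts)
    total_neg = sum(no_cancer_counts)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for pos, neg in zip(reversed(cancer_counts), reversed(no_cancer_counts)):
        tp += pos
        fp += neg
        points.append((fp / total_neg, tp / total_pos))
    return points


def auc_trapezoid(points):
    """Area under a piecewise-linear ROC given as ordered points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    # Hypothetical counts: a nonpredictive test and a perfect test.
    coin_toss = roc_points([2, 2, 2, 2, 2], [20, 20, 20, 20, 20])
    perfect = roc_points([0, 0, 0, 5, 5], [50, 30, 20, 0, 0])
    print(round(auc_trapezoid(coin_toss), 2))
    print(round(auc_trapezoid(perfect), 2))
```

The two hypothetical inputs reproduce the reference cases described above: identical category distributions in women with and without cancer give the diagonal, and complete separation gives the path through (0,1).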
ROCs are presented for three distinct time periods (ie, 1 to 2, 3 to 4, and ≥ 5 years from entry) and are compared across random assignment groups. A generalized estimating equation (GEE) approach was used to take into account the correlation between multiple mammogram results for a participant within each time period. Testing was done within the GEE model to assess the interaction between random assignment group and the mammography result. Statistical significance was determined for each period after categorizing the mammogram into abnormal versus normal, for which abnormal is defined as mammograms with a recommendation for short-interval follow-up or with findings suggestive or highly suggestive of malignancy.
Biopsy frequency was evaluated by using the semi-annual reports of breast biopsies, and the time to each woman's first biopsy report was determined and compared between random assignment groups using a log-rank statistic. Biopsy dates were collected for women with breast cancer diagnoses. Because the dates of biopsies with benign findings were not collected, a breast biopsy was considered to be a true positive if an invasive breast cancer or ductal carcinoma in situ was diagnosed during the 6-month interval when a biopsy was reported or within 2 months after the interval when the biopsy was reported.
Analyses regarding mammogram performance by random assignment were based on an intent-to-treat principle. Additional analyses adjusted for study medication adherence were conducted by censoring follow-up for women 6 months after they became nonadherent (ie, consuming < 80% of study pills or starting nonstudy hormone therapy during the most recent study interval).
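The censoring rule used for the adherence-adjusted analyses can be sketched as follows. This is a minimal Python illustration of the rule as stated in the text (nonadherence defined as consuming < 80% of study pills or starting nonstudy hormone therapy, with follow-up censored 6 months later); the function names, the time unit of years, and the example values are assumptions made for the example.

```python
def is_nonadherent(pill_fraction, started_nonstudy_hormones):
    """Apply the stated definition for the most recent study interval."""
    return pill_fraction < 0.80 or started_nonstudy_hormones


def censored_follow_up(follow_up_years, nonadherent_at_years=None,
                       grace_years=0.5):
    """Return (analysis time in years, censored-for-nonadherence flag).

    follow_up_years: total observed follow-up for the participant.
    nonadherent_at_years: time at which nonadherence was first recorded,
        or None if the participant remained adherent.
    grace_years: the 6-month window after nonadherence (0.5 years).
    """
    if nonadherent_at_years is None:
        return follow_up_years, False
    cutoff = nonadherent_at_years + grace_years
    if cutoff < follow_up_years:
        return cutoff, True
    return follow_up_years, False


if __name__ == "__main__":
    # Hypothetical participant: nonadherent at year 2 of 7.1 years.
    print(censored_follow_up(7.1, nonadherent_at_years=2.0))
    print(is_nonadherent(0.75, False))
```

Events and mammograms after the returned analysis time would be excluded from the adherence-adjusted comparison, whereas the intent-to-treat analysis keeps the full follow-up.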
All women were postmenopausal and had prior hysterectomy. As reported, 41% had prior bilateral oophorectomy, and the median age was 63 years; 31% entered onto the study between 50 and 59 years of age. Most factors associated with breast cancer risk and with abnormal mammograms were balanced between the hormone and placebo groups (Table 1). Participation in other WHI clinical trial components was also balanced between random assignment groups (Fig 2). A full CONSORT diagram is available (Fig 1).11
As previously reported,8 fewer invasive breast cancers were diagnosed in the CEE alone than in the placebo group, but the difference was not statistically significant (104 cancers in 5,310 hormone-group participants v 133 cancers in 5,429 placebo-group participants; hazard ratio, 0.80; 95% CI, 0.62 to 1.04; P = .09; Table 2). Staging information new to this report now includes results from an additional 47 occurrences, previously reported as having missing data.8 Significantly fewer invasive breast cancers were diagnosed with tumors ≤ 2 cm in women in the CEE group (65 and 102 cancers for CEE alone v placebo, respectively; P = .008; Table 2), but there was no effect on tumors greater than 2 cm (26 and 23 cancers for CEE alone v placebo, respectively; P = .63). As a result of having fewer small cancers in the hormone group, the average cancer size was somewhat greater than in the placebo group. Although there were five more cancers with lymph node–positive involvement in the hormone group (33 v 28, respectively; P = .49), there were 32 fewer cancers with lymph node–negative disease in the hormone group (60 v 92 cancers, respectively; P = .02).
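As a plausibility check on the counts above, a crude ratio of cumulative risks can be computed directly from the reported cases and group sizes. The Python sketch below is not the stratified Cox model used in the trial: it ignores follow-up time and stratification, and its Wald interval on the log scale is only a rough stand-in, so its output need not match the reported hazard ratio or CI exactly. The function name is hypothetical.

```python
import math


def crude_risk_ratio(events_a, n_a, events_b, n_b, z=1.959964):
    """Crude risk ratio (group a v group b) with a log-scale Wald CI."""
    ratio = (events_a / n_a) / (events_b / n_b)
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper


if __name__ == "__main__":
    # Invasive breast cancers: 104 of 5,310 (CEE) v 133 of 5,429 (placebo).
    ratio, lower, upper = crude_risk_ratio(104, 5310, 133, 5429)
    print(f"crude risk ratio {ratio:.2f} ({lower:.2f} to {upper:.2f})")
```

The crude ratio comes out at approximately 0.80, in line with the reported hazard ratio, which is expected when follow-up is similar in the two groups and events are uncommon.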
The cumulative frequency of mammograms with a short-interval follow-up recommendation by year and random assignment group is outlined in Figure 3A. Women in the hormone group had approximately a 4% greater absolute risk of having such a mammogram after 1 year and approximately a 9% greater absolute risk after 5 years (30.7% v 22.0%; P < .001; Fig 3A). However, in the same period, there was no increase in mammograms with more serious findings either suggestive or highly suggestive of breast cancer (cumulative, 5.4% v 5.1% for CEE v placebo; P = .53; Fig 3A).
The frequency of clinically indicated breast biopsies by year and random assignment group is outlined in Figure 3B. The cumulative percent of women with a biopsy through year 7 was significantly greater in the CEE-alone group than in the placebo group (12.5% v 10.7%; P = .004), and the time to first biopsy report was significantly shorter (P = .006). Of 1,838 breast biopsies performed, breast cancer was diagnosed in 112 (8.9%) of 1,007 biopsies in the CEE group and in 127 (15.8%) of 831 biopsies in the placebo group (P = .04).
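The 5-year percentages above translate into the "approximately one in 11" figure quoted for short-interval follow-up recommendations. The arithmetic can be sketched in Python as follows; the function names are hypothetical, and the calculation is simply the reciprocal of the absolute difference in cumulative risk, which may not be exactly how the authors derived their rounded figure.

```python
def absolute_excess(risk_treated, risk_control):
    """Absolute difference in cumulative risk (as proportions)."""
    return risk_treated - risk_control


def one_in_n(excess):
    """Number of women treated per one additional affected woman."""
    if excess <= 0:
        raise ValueError("no excess risk in the treated group")
    return 1.0 / excess


if __name__ == "__main__":
    # 5-year cumulative short-interval follow-up: 30.7% v 22.0%.
    excess = absolute_excess(0.307, 0.220)
    print(f"excess {excess:.1%}, about one in {one_in_n(excess):.1f}")
```

An excess of 8.7 percentage points corresponds to roughly one additional woman in every 11 to 12 receiving a short-interval follow-up recommendation over 5 years.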
The year-by-year performance characteristics of mammograms by random assignment group are listed in Table 3. As seen, the sensitivity and positive predictive value of mammograms were compromised by CEE-alone use, whereas the specificity and the negative predictive value of mammograms were similar in the CEE-alone and placebo groups.
ROCs and AUC statistics were used to compare mammogram diagnostic performance by time on study and by random assignment group (Fig 4). Overall, performance was significantly inferior in the CEE-alone group in the first 2 years but not in subsequent periods. Looking across time intervals for women in the placebo group, the ROC AUCs that assessed mammogram performance were 0.89, 0.85, and 0.92 for mammograms in years 1 to 2, 3 to 4, and ≥ 5, respectively. For women in the CEE-alone group, the ROC AUCs were 0.85 (P = .03), 0.88 (P = .79), and 0.91 (P = .58) for mammograms performed in years 1 to 2, 3 to 4, and ≥ 5, respectively, with P values evaluating the effect of random assignment on mammogram diagnostic performance (Fig 4A). In adherence-adjusted analyses that censored follow-up 6 months after nonadherence to study medication use, mammogram diagnostic performance for women in the CEE-alone group was not significantly different from that in the placebo group for any period examined (P = .38, .49, and .06 in years 1 to 2, 3 to 4, and ≥ 5, respectively; Fig 4B).
Information regarding cumulative adherence is provided by year and random assignment group in Appendix Table A1 (online only). Drop-outs represent those patients discontinuing study medication use. Drop-ins represent those patients initiating nonprotocol hormone therapy.
Use of CEE alone significantly increased mammograms with short-interval follow-up recommendations but not those with more serious findings. Significantly more breast biopsies were performed for clinical indications in the hormone group, yet they less frequently diagnosed cancer. However, mammogram diagnostic performance differences between random assignment groups were seen only in the early years of exposure. Thus, mammograms in women in the CEE-alone group were generally able to diagnose breast cancer in a timely manner at a cost of an increase in both mammograms with short-interval follow-up recommendation and breast biopsies, which were more likely to be false positive.
The influence of CEE alone on breast cancer detection in women with prior hysterectomy can be compared with that of CEE plus MPA in women with no prior hysterectomy in the parallel WHI randomized trial.1,2 There were similarities, as both increased mammograms with short-interval follow-up recommendation and both significantly increased breast biopsies that less reliably detected cancer. However, combined hormone therapy also increased abnormal mammograms with more serious findings and more commonly compromised mammogram performance.2,7 During 5.6 years of combined use of CEE plus MPA, invasive breast cancers were increased, and they were diagnosed at higher stage,2 consistent with combined hormone therapy effects to both stimulate breast cancer growth and delay breast cancer diagnosis.2,6,7 However, in the CEE-alone trial with a longer intervention duration of 7.1 years, there were fewer invasive breast cancers in the CEE group, and mammographic cancer detection was not substantially compromised.8 Breast cancers were diagnosed without substantial delay for women using CEE alone, but more breast biopsies were needed to find the tumors.
Post hoc analyses in the CEE-alone group found significantly (P = .008) fewer small invasive breast cancers ≤ 2 cm (65 v 102, respectively), and significantly fewer (P = .02) cancers with negative lymph nodes (60 v 92, respectively), whereas there was no significant increase in larger invasive breast cancers or those with positive nodes. These findings may indicate CEE reduces the incidence of smaller, early-stage cancers. Ongoing postintervention follow-up will provide additional information regarding the long-term effects of CEE exposure on breast cancer incidence.
To our knowledge, this is the first comprehensive description of the time course of the influence of CEE alone on the diagnostic performance of mammography and breast biopsy in a randomized clinical trial. Most observational studies have combined results for estrogen alone with those for estrogen plus progestin use18,19,20 or have reported no difference in performance between regimens.21,22,23 In the Million Women Study, conjugated estrogen either alone or combined with progestin had closely comparable adverse influence on the false-positive recall rate.24 However, as this end point includes both abnormal mammograms and those given a short follow-up recommendation, it did not address the difference seen in this report, in which CEE alone increased only mammograms with a short-interval follow-up recommendation.
As an increase or even preservation of breast density could potentially adversely affect mammogram diagnostic performance,21,25 breast density was assessed in ancillary studies in the WHI hormone therapy trials. In 445 randomly assigned women, CEE alone use for 2 years modestly increased breast density compared with placebo (absolute difference in percent breast density, 2.9%; P < .001).26 The effect of combined CEE plus MPA, in a parallel study of similar design and size, was substantially greater, with an absolute difference in percent breast density for combined hormone therapy compared with placebo after 2 years of 6.9% (P < .001).27 Although additional study is needed to correlate breast density change with mammogram recommendations in individuals, breast density change may contribute to mammographic performance differences seen.
The mammography performance regarding sensitivity and specificity reported in this randomized trial reflects that of a large number of mammography centers and interpreting radiologists throughout the nation. Direct comparison to national benchmarks28 for performance is not appropriate, given substantial differences in study population and protocol directives. For example, when considering the performance of the year 1 mammogram in the WHI trial, all women were required to have a clear mammogram in the past year, information on only completed mammograms was recorded, and no reliable evidence on estrogen effects on performance was available. Nonetheless, comparisons of current results with the Breast Cancer Surveillance Consortium (BCSC) mammogram screening program are favorable. With respect to tumor size, the BCSC program found tumors less than 1 cm in 37% and tumors of 1 to 2 cm in 42%, and the mean tumor size was 1.6 cm. In the placebo arm of the WHI study, tumors less than 1 cm were found in 40%, tumors of 1 to 2 cm were found in 40%, and the mean tumor size was 1.4 cm, which is a performance similar to that of the BCSC program.28,29
Study strengths include the randomized design, large sample size, requirement for annual mammography, and the central adjudication of breast cancers by reviewers blinded to random assignment. The trial evaluated CEE at 0.625 mg/d, and the results may not apply to other oral or transdermal hormonal therapies.
These findings have clinical implications. Women using estrogen alone for climacteric symptoms for durations comparable to those in the study can be reassured regarding breast cancer risk and detection. However, they will experience more mammograms with short-interval follow-up recommendations and more false-positive breast biopsies. For mammographers, identification of the nature of the findings leading to short-interval follow-up recommendations in women on estrogen should be a priority. Given the suggestion of early interference with mammographic performance after estrogen-alone initiation, extra diligence in review of mammograms obtained in this setting also could be recommended.
In conclusion, use of CEE alone for about 5 years results in approximately one in 11 and one in 50 women who had an otherwise avoidable mammogram with recommendation for short-interval follow-up or a breast biopsy, respectively. Clinically indicated breast biopsies were significantly more frequent among CEE users but less commonly led to breast cancer diagnoses. These findings differ from those with combined CEE plus MPA use, for which abnormal mammograms were increased and for which there was strong evidence of diagnostic delay.
(National Heart, Lung, and Blood Institute, Bethesda, Maryland) Elizabeth Nabel, Jacques Rossouw, Shari Ludlam, Linda Pottern, Joan McGowan, Leslie Ford, and Nancy Geller.
(Fred Hutchinson Cancer Research Center, Seattle, WA) Ross Prentice, Garnet Anderson, Andrea LaCroix, Charles L. Kooperberg, Ruth E. Patterson, Anne McTiernan; (Wake Forest University School of Medicine, Winston-Salem, NC) Sally Shumaker; (Medical Research Labs, Highland Heights, KY) Evan Stein; (University of California at San Francisco, San Francisco, CA) Steven Cummings.
(Albert Einstein College of Medicine, Bronx, NY) Sylvia Wassertheil-Smoller; (Baylor College of Medicine, Houston, TX) Jennifer Hays; (Brigham and Women's Hospital, Harvard Medical School, Boston, MA) JoAnn Manson; (Brown University, Providence, RI) Annlouise R. Assaf; (Emory University, Atlanta, GA) Lawrence Phillips; (Fred Hutchinson Cancer Research Center, Seattle, WA) Shirley Beresford; (George Washington University Medical Center, Washington, DC) Judith Hsia; (Los Angeles Biomedical Research Institute at Harbor–University of California, Los Angeles Medical Center, Torrance, CA) Rowan Chlebowski; (Kaiser Permanente Center for Health Research, Portland, OR) Evelyn Whitlock; (Kaiser Permanente Division of Research, Oakland, CA) Bette Caan; (Medical College of Wisconsin, Milwaukee, WI) Jane Morley Kotchen; (MedStar Research Institute/Howard University, Washington, DC) Barbara V. Howard; (Northwestern University, Chicago/Evanston, IL) Linda Van Horn; (Rush Medical Center, Chicago, IL) Henry Black; (Stanford Prevention Research Center, Stanford, CA) Marcia L. Stefanick; (State University of New York at Stony Brook, Stony Brook, NY) Dorothy Lane; (The Ohio State University, Columbus, OH) Rebecca Jackson; (University of Alabama at Birmingham, Birmingham, AL) Cora E. Lewis; (University of Arizona, Tucson/Phoenix, AZ) Tamsen Bassford; (University at Buffalo, Buffalo, NY) Jean Wactawski-Wende; (University of California at Davis, Sacramento, CA) John Robbins; (University of California at Irvine, CA) F. Allan Hubbell; (University of California at Los Angeles, Los Angeles, CA) Howard Judd; (University of California at San Diego, LaJolla/Chula Vista, CA) Robert D. Langer; (University of Cincinnati, Cincinnati, OH) Margery Gass; (University of Florida, Gainesville/Jacksonville, FL) Marian Limacher; (University of Hawaii, Honolulu, HI) David Curb; (University of Iowa, Iowa City/Davenport, IA) Robert Wallace; (University of Massachusetts/Fallon Clinic, Worcester, MA) Judith Ockene; (University of Medicine and Dentistry of New Jersey, Newark, NJ) Norman Lasser; (University of Miami, Miami, FL) Mary Jo O'Sullivan; (University of Minnesota, Minneapolis, MN) Karen Margolis; (University of Nevada, Reno, NV) Robert Brunner; (University of North Carolina, Chapel Hill, NC) Gerardo Heiss; (University of Pittsburgh, Pittsburgh, PA) Lewis Kuller; (University of Tennessee, Memphis, TN) Karen C. Johnson; (University of Texas Health Science Center, San Antonio, TX) Robert Brzyski; (University of Wisconsin, Madison, WI) Gloria E. Sarto; (Wake Forest University School of Medicine, Winston-Salem, NC) Denise Bonds; (Wayne State University School of Medicine/Hutzel Hospital, Detroit, MI) Susan Hendrix.
|Follow-Up Year||Treatment Group|
|CEE (n = 5,310)||Placebo (n = 5,429)|
|Cumulative Drop-Out Rate (%)*||Cumulative Drop-In Rate (%)†||Cumulative Drop-Out Rate (%)*||Cumulative Drop-In Rate (%)†|
Abbreviation: CEE, conjugated equine estrogen.
Supported by Contracts No. N01WH22110, 24152, 32100-2, 32105-6, 32108-9, 32111-13, 32115, 32118-32119, 32122, 42107-26, 42129-32, and 44221 from the National Heart, Lung, and Blood Institute, National Institutes of Health, US Department of Health and Human Services (to Women's Health Initiative).
Authors' disclosures of potential conflicts of interest and author contributions are found at the end of this article.
Clinical trial information can be found for the following: NCT00000611.
Although all authors completed the disclosure declaration, the following author(s) indicated a financial or other interest that is relevant to the subject matter under consideration in this article. Certain relationships marked with a “U” are those for which no compensation was received; those relationships marked with a “C” were compensated. For a detailed description of the disclosure categories, or for more information about ASCO's conflict of interest policy, please refer to the Author Disclosure Declaration and the Disclosures of Potential Conflicts of Interest section in Information for Contributors.
Employment or Leadership Position: None
Consultant or Advisory Role: Rowan T. Chlebowski, AstraZeneca (C), Novartis (C), Amgen (C), Eli Lilly (C), Pfizer (C); Anne McTiernan, Merck (C)
Stock Ownership: Anne McTiernan, Merck
Honoraria: Rowan T. Chlebowski, AstraZeneca, Novartis
Research Funding: None
Expert Testimony: Robert D. Langer, Wyeth (C)
Other Remuneration: None
Conception and design: Rowan T. Chlebowski, Garnet Anderson
Provision of study materials or patients: Rowan T. Chlebowski, JoAnn E. Manson, Dorothy Lane, Robert D. Langer, F. Allan Hubbell, Susan Hendrix, Marcia L. Stefanick
Collection and assembly of data: Garnet Anderson, Mary Pettinger
Data analysis and interpretation: Rowan T. Chlebowski, Garnet Anderson, JoAnn E. Manson, Mary Pettinger, Shagufta Yasmeen, Dorothy Lane, Robert D. Langer, F. Allan Hubbell, Anne McTiernan, Susan Hendrix, Robert Schenken, Marcia L. Stefanick
Manuscript writing: Rowan T. Chlebowski, Garnet Anderson, JoAnn E. Manson, Mary Pettinger, Shagufta Yasmeen, Dorothy Lane, Robert D. Langer, F. Allan Hubbell, Anne McTiernan, Susan Hendrix, Robert Schenken, Marcia L. Stefanick
Final approval of manuscript: Rowan T. Chlebowski, Garnet Anderson, JoAnn E. Manson, Mary Pettinger, Shagufta Yasmeen, Dorothy Lane, Robert D. Langer, F. Allan Hubbell, Anne McTiernan, Susan Hendrix, Robert Schenken, Marcia L. Stefanick