To compare the effects of two selective estrogen receptor modulators, tamoxifen and raloxifene, on global and domain-specific cognitive function.
The National Surgical Adjuvant Breast and Bowel Project's Study of Tamoxifen and Raloxifene (STAR) was a randomized clinical trial of tamoxifen 20 mg/d or raloxifene 60 mg/d in healthy postmenopausal women at increased risk of breast cancer. The 1,498 women enrolled in the Cognition in the Study of Tamoxifen and Raloxifene (Co-STAR) trial had been randomly assigned in STAR, were age 65 years and older, and had not been diagnosed with dementia; Co-STAR enrollment began 18 months after STAR enrollment started. A cognitive test battery modeled after the one used in the Women's Health Initiative Study of Cognitive Aging (WHISCA) was administered. Technicians were centrally trained to administer the battery and recertified every 6 months. Analyses were conducted on all participants and on 273 women who completed the first cognitive battery before they started taking their medications.
Overall, there were no significant differences in adjusted mean cognitive scores between the two treatment groups across visits. There were significant time effects across the three visits for some of the cognitive measures. Similar results were obtained for the subset of women with true baseline measures.
Tamoxifen and raloxifene are associated with similar patterns of cognitive function in postmenopausal women at increased risk of breast cancer. Future comparisons between these findings and patterns of cognitive function in hormone therapy and placebo groups in WHISCA should provide additional insights into the effects of tamoxifen and raloxifene on cognitive function in older women.
The National Surgical Adjuvant Breast and Bowel Project's Study of Tamoxifen and Raloxifene (STAR) showed that raloxifene was as effective as tamoxifen in reducing the risk of invasive breast cancer and was associated with similar risk for stroke.1 In light of similarities in efficacy between the two interventions for prevention of breast cancer, potential effects on cognition assume greater importance.
Small placebo-controlled studies have shown little effect of raloxifene on cognitive function,2–4 although one case-control study reported a significant worsening of attention following 8 weeks of treatment with raloxifene (60 mg/d).5 Findings from the Multiple Outcomes of Raloxifene Evaluation (MORE) trial indicated no overall benefit of raloxifene on cognitive function6 in women with osteoporosis. However, secondary analyses demonstrated a significant benefit of raloxifene on verbal memory and psychomotor speed in women age 70 years and older.6 In a follow-up investigation of 5,386 women in MORE, raloxifene did not reduce the risk for Alzheimer's disease, but the 120-mg dose reduced the risk of cognitive impairment.7 Studies of the effects of tamoxifen on cognitive function typically have been conducted in combination with other chemotherapeutic agents or radiation therapy, and it has been difficult to determine the effects of tamoxifen alone.8–10
This article presents the primary results of Cognition in the Study of Tamoxifen and Raloxifene (Co-STAR), a STAR ancillary study comparing the effects of these two selective estrogen receptor modulators (SERMs) on global and domain-specific cognitive function. On the basis of the limited data from prior studies, we hypothesized that women randomly assigned to receive raloxifene would have better cognitive performance, particularly on tests of verbal and figural memory, and less decline over time in comparison to women randomly assigned to receive tamoxifen.
STAR was a multicenter, randomized clinical trial of oral tamoxifen 20 mg/d or oral raloxifene 60 mg/d for a maximum of 5 years, among 19,747 postmenopausal women age 35 years or older at increased risk for breast cancer according to the modified Gail model.11–12
Co-STAR enrolled 1,498 women randomly assigned in the STAR trial who were age 65 years and older and had not been diagnosed with dementia. Previous diagnoses of neurologic or psychiatric conditions, history of head injury, depression, alcohol addiction, and drug addiction were recorded but did not serve as exclusion factors for this study. All participants were fluent in English and provided written informed consent for the Co-STAR study. Co-STAR was coordinated by the Wake Forest University School of Medicine, approved by its institutional review board, and sponsored by the National Institute on Aging.
Co-STAR enrollment began in October 2001, 18 months after STAR enrollment started, and continued until the unmasking of STAR in June 2006. Visit 1 refers to the first assessment when a participant enrolled in Co-STAR, and visits 2 and 3 refer to the beginning of years 2 and 3 in Co-STAR, respectively, corresponding to 1-year and 2-year follow-up. A maximum of three Co-STAR assessments per participant were included in this article because few participants had more than three visits.
Co-STAR was conducted at 153 STAR/Co-STAR clinical sites across the United States and Canada, chosen on the basis of their strong enrollment and retention in STAR and the age and ethnic distribution of participants. Co-STAR originally planned to recruit participants at their STAR randomization. However, because few women were older than age 65 years at STAR randomization, the protocol was amended to allow age-eligible women to join Co-STAR any time during their first 4 years of STAR follow-up. Therefore, most participants did not receive cognitive assessments until after study drugs had been initiated. As a result, visit 1 corresponds to an on-treatment visit for 1,225 participants and to a pretreatment baseline visit for 273 women.
A standardized 83-minute test battery (Table 1) modeled after the cognitive battery used in the Women's Health Initiative Study of Cognitive Aging (WHISCA)25 was administered. The battery was designed to include measures that have been shown to be sensitive to subtle cognitive changes associated with aging and hormone therapy. Measures of verbal and figural memory were expected to show the greatest sensitivity to treatment, because WHISCA demonstrated the greatest effects of hormone therapy on these two outcomes.25 The test battery additionally included the Modified Mini Mental State Examination (3MS) to assess global cognitive function and the Positive and Negative Affect Schedule (PANAS) and Geriatric Depression Scale to measure changes in positive affect and negative affect and depression, respectively. A description of all tests can be found online. Given that performance on memory tests improves with exposure and practice, we used a modified version of the California Verbal Learning Test (CVLT). The modification reduced the number of immediate recall trials for List A, a shopping list of 16 words from four semantic categories, from the original five to three and omitted the extra category-cued recall trials. Participants were also asked to recall List A after a short delay and after a long delay (20 minutes) and to recall a second interference list (List B) before the short-delay recall. Forms with different shopping lists were also used at the third and fourth evaluations to reduce practice effects across annual evaluations.
Quality assurance of the cognitive measures is described elsewhere,25 and it included central training sessions and formal certification processes. Trained and certified technicians administered the cognitive battery at each of the 153 clinical centers.
Four sets of statistical analyses were conducted. The first set focused on the effects of treatment on age-related changes in cognition and involved all 1,498 Co-STAR participants over 3 years, regardless of whether they had a valid pretreatment baseline assessment or not. Years 4 and 5 were excluded because data were available only for 121 and 13 participants, respectively, and analyses including year 4 were similar to those with years 1 to 3. Repeated measures analysis of covariance models included visit, treatment, and visit by treatment interaction and were adjusted for age at Co-STAR enrollment, years between random assignment and Co-STAR enrollment, years since last menstruation, race/ethnicity, education, prior use of estrogen, and prior use of progestin (categories for last four variables are listed in Table 2). Interactions of treatment with age (< 70 v ≥ 70 years) and with years since last menstrual period were included to investigate differences in treatment effect by age and time since menopause. No interactions were statistically significant except age by treatment for letter fluency. P values for treatment and visit are reported. Models using z scores were fitted for CVLT to further take into account practice effects and the interaction between age and practice effects.26 Results were similar to those of the original models (data not shown).
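As a sketch only, a repeated measures analysis of covariance of the general form described above can be written as follows. The notation is illustrative and is not taken from the study: $Y_{ij}$ is the score of participant $i$ at visit $j$, $T_i$ indicates the treatment arm, $\mathbf{x}_i$ collects the adjustment covariates listed above, and the within-participant correlation structure of the errors $\varepsilon_{ij}$ is left unspecified because the text does not state it.

```latex
Y_{ij} = \mu + \alpha_j + \tau\,T_i + (\alpha\tau)_j\,T_i
         + \boldsymbol{\gamma}^{\top}\mathbf{x}_i + \varepsilon_{ij},
\qquad j = 1, 2, 3
```

Here $\alpha_j$ is the visit effect, $\tau$ the treatment effect, and $(\alpha\tau)_j$ the visit by treatment interaction; the reported P values for treatment and visit would correspond to tests of $\tau$ and of the $\alpha_j$ terms, respectively.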
The second set of analyses focused on changes from pre- to post-treatment for 273 participants who completed the first cognitive battery before they started taking their medication. These analyses included visit, treatment, and their interaction and were adjusted for age at Co-STAR enrollment, race/ethnicity, education, and baseline scores for the cognitive test. These analyses were repeated replacing race and education with prior use of estrogen and prior use of progestin, and similar results were obtained. Characteristics for both cohorts are listed in Tables 2 and 4.
Finally, analyses were repeated with 1,225 participants who completed the first tests after starting their medication and with 450 participants who completed all three visits; repeat analyses showed similar results to the first analyses (data not shown). A significance level of .01 was adopted a priori for all outcomes to control for multiple outcomes; a Bonferroni adjustment would have been too strict, given the correlation among these outcomes.
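The trade-off between a fixed .01 threshold and a Bonferroni adjustment can be illustrated with a minimal sketch. The family-wise level of .05 and the count of 15 outcomes are hypothetical values chosen for illustration; the text does not state them. Treating P ≤ .01 as significant is an assumption inferred from how P values are reported in the Results.

```python
def bonferroni_threshold(family_alpha: float, n_outcomes: int) -> float:
    """Per-outcome threshold under a Bonferroni adjustment."""
    if n_outcomes < 1:
        raise ValueError("n_outcomes must be at least 1")
    return family_alpha / n_outcomes


def is_significant(p_value: float, threshold: float = 0.01) -> bool:
    """Compare a P value with the a priori threshold (assumed inclusive)."""
    return p_value <= threshold


# Hypothetical illustration: 15 outcomes at a family-wise level of .05.
strict = bonferroni_threshold(0.05, 15)  # about .0033, stricter than .01

# The List B interference result reported in this article (P = .04)
# does not meet the .01 threshold and is therefore described as a trend.
list_b_significant = is_significant(0.04)
```

With these hypothetical inputs the Bonferroni threshold is roughly one third of the .01 level adopted here, which shows why it would be conservative when the outcomes are correlated.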
Of the 7,944 age-eligible STAR participants, 733 previously randomly assigned to receive tamoxifen and 765 randomly assigned to receive raloxifene were enrolled in Co-STAR (Fig 1). Three hundred twenty-one participants (21%) entered Co-STAR at the same time they entered STAR. Of these, 273 had their first Co-STAR visit before they started their medication. The remainder entered the trial after they started their medication, up to 5 years after STAR began. Sixty-nine participants withdrew from each arm during the trial for reasons including a dislike of cognitive testing; family, personal, or physical problems; and, rarely, death.
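The enrollment counts above can be cross-checked with simple arithmetic. The short sketch below uses only figures reported in this article; the variable names are illustrative.

```python
# Counts reported in the text.
tamoxifen_enrolled = 733
raloxifene_enrolled = 765
entered_with_star = 321
pretreatment_baseline = 273

# Total Co-STAR enrollment across both arms.
total_enrolled = tamoxifen_enrolled + raloxifene_enrolled

# Participants whose first Co-STAR visit was an on-treatment visit.
on_treatment_first_visit = total_enrolled - pretreatment_baseline

# Share of participants who entered Co-STAR when they entered STAR.
pct_entered_with_star = round(100 * entered_with_star / total_enrolled)

print(total_enrolled, on_treatment_first_visit, pct_entered_with_star)
```

The derived values agree with the totals used elsewhere in the article: 1,498 enrolled women, 1,225 with an on-treatment first visit, and 21% entering both studies at the same time.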
No statistically significant differences in baseline demographic factors, clinical factors/medical history, personal habits, or the timing of enrollment were detected between the two treatment groups (Table 2). The average age (standard deviation) of the cohort at the time of Co-STAR enrollment was 69.9 (4.2) years ranging from 65 to 83 years, and 60% of women had had their last menstrual period more than 20 years ago. The majority were white (94%) and 68% had attended at least some college. More than half (55%) reported that they had undergone a hysterectomy, with 79% reporting prior usage of estrogen. Hypertension was fairly prevalent (43%), and 20% had experienced depression. 3MS scores at the first assessment were ≥ 95 for 67% of the participants. Only 5% reported current smoking. On average, there was a 2.3-year interval between STAR random assignment and Co-STAR enrollment. Approximately two thirds of the participants returned for a 1-year follow-up after Co-STAR enrollment, and one third had 2-year follow-up assessments, because some women had already completed their participation in STAR.
No differences in mean cognitive measures between treatment groups were statistically significant at the initial assessment (data not shown). There were no significant differences in adjusted mean cognitive scores between the two treatment groups across visits. However, CVLT List B interference scores, reflecting difficulty learning a new word list after having been exposed to the primary word list, tended to be higher in the raloxifene group than in the tamoxifen group (P = .04) (Table 3). In the raloxifene group, letter fluency scores were significantly higher in women younger than age 70 years (40.7 ± 0.5) than in older women (37.8 ± 0.7); no such difference was seen in the tamoxifen group (39.0 ± 0.6 and 39.3 ± 0.7, respectively). Scores for global cognition, verbal and visual memory, visuospatial skills, verbal knowledge, PANAS-positive, and general depression changed significantly over the course of the study independently of treatment. Performance improved over time on the 3MS, Benton Visual Retention Test, card rotations, Primary Mental Abilities-Vocabulary, and Geriatric Depression Scale (P ≤ .01). CVLT scores generally improved from visit 1 to visit 2 and then declined at visit 3 (P < .0001); for the majority of women, the decline at visit 3 coincided with the introduction of a more difficult alternate form. Evidence that the form was more difficult comes from WHISCA, where the placebo group showed a significant decrease in performance on that form. We controlled for form effect by repeating these analyses for the subset of 450 women who completed all three visits and for the 1,225 women whose first cognitive test session was conducted after random assignment and found similar results.
The 273 women who had their first Co-STAR evaluation completed before starting their medication were similar by treatment group, except that 46% of women in the raloxifene group had used progestin compared with 31% in the tamoxifen group (Table 4). These women also tended to be younger (P = .01), were more likely to have undergone a hysterectomy (P = .01), to have reported prior estrogen usage (P = .01), and to have hypertension (P = .0002) or diabetes (P = .002) than the remaining 1,225 participants (data not shown).
Analysis of the baseline cognitive scores revealed statistically significant group differences only for PANAS-positive affect (P = .01; Table 5), with higher mean positive scores for the raloxifene group (3.6) than for the tamoxifen group (3.4). Across two follow-up years (Table 5), there were no significant treatment differences in adjusted means for any of the measures. There was a trend (P = .06) for the raloxifene group to show higher positive affect than the tamoxifen group. In addition, there were some significant time effects across the two follow-up years, with the most notable effects occurring for the CVLT measures, where scores declined from follow-up year 1 to follow-up year 2 (P ≤ .001) because a more difficult test form was introduced.
In Co-STAR, we hypothesized that raloxifene would confer comparatively greater cognitive benefits, particularly in the domain of verbal memory. Contrary to our hypothesis, there were no significant differences in cognitive test performance between raloxifene and tamoxifen groups. The lack of a robust difference between the two treatments was evident in all 1,498 enrolled women and in an analysis restricted to 273 women with pretreatment baseline data. The only trend observed for cognitive measures was that raloxifene was associated with higher scores compared with tamoxifen (P = .04) on the List B interference trial, one of four verbal memory measures in the analysis involving all 1,498 women. Overall, these results demonstrated no significant differences in the effect of tamoxifen versus raloxifene on global or domain-specific cognitive function.
In contrast to this study, modest cognitive benefits were observed with raloxifene in the MORE trial, which examined cognitive function in 7,478 women with osteoporosis randomly assigned to receive raloxifene at 60 or 120 mg/d or placebo.6 Over a 3-year period, there were no overall differences in cognitive function between women randomly assigned to receive either dose of raloxifene versus placebo in a sample with a mean age of 66 years. There was a trend in the overall sample (P = .05) for women randomly assigned to receive raloxifene to have a reduced risk of cognitive impairment on verbal memory. In addition, secondary analyses restricted to women age 70 years and older demonstrated a significant benefit of raloxifene on verbal memory and psychomotor speed in MORE. Given that Co-STAR participants were recruited to be age 65 years and older, we hypothesized that raloxifene would confer cognitive benefits compared with tamoxifen. Although the raloxifene group showed a trend to better performance than the tamoxifen group on the List B outcome of the CVLT, this finding was not confirmed in the subset of women with a pretreatment baseline. Therefore, our findings did not support this hypothesis. Although 59% of the Co-STAR sample was younger than age 69 years and thus would not be expected to enjoy the possible age-related benefit of raloxifene, we observed only one interaction between treatment and age (< 70 v ≥ 70 years) suggesting improved fluency with raloxifene in younger versus older women. Also importantly, in MORE, raloxifene was compared with placebo, whereas in Co-STAR raloxifene was compared with tamoxifen.
Several other differences between Co-STAR and MORE are worth considering. More than 4,000 women in the MORE trial completed pre- and post-treatment cognitive assessments, leading to greater power to detect an effect of raloxifene on cognitive function, especially if that effect was greatest from baseline to 1-year post-treatment. Second, unlike Co-STAR participants, all MORE participants had osteoporosis. A strong risk factor for osteoporosis is estrogen deficiency.27 Conversely, early menses and older age at first birth—two factors in the Gail model for determination of breast cancer risk—are associated with higher levels of estrogen. In preclinical studies, raloxifene in the absence of estradiol exerted partial agonist effects in the hippocampus, but in the presence of estrogen, it exerted mixed agonist/antagonist effects.28 The hippocampus is a critical structure in mediating verbal memory29 and the effects of estrogen compounds, including raloxifene, on memory.30–32 Thus, raloxifene may have different effects on tasks mediated by the hippocampus such as verbal memory in women with low estrogen compared with women with higher estrogen, such that greater cognitive benefits may be evident in women with low estrogen. Another difference between MORE and Co-STAR was that in Co-STAR, there was only a 60 mg/d dose of raloxifene, whereas in MORE, there were doses of 60 and 120 mg/d.
Earlier observational studies provided mixed evidence concerning the effects of tamoxifen on cognition. Previous clinical studies provided some suggestion that tamoxifen might produce impairments in cognitive function. For example, a study of women with breast cancer found that those receiving treatment with chemotherapy and tamoxifen performed worse than women receiving chemotherapy alone on tests of visual memory and visuospatial function.8 Conversely, in a cross-sectional study of early-stage breast cancer, anastrozole led to significant impairments in verbal and visual learning and memory compared with tamoxifen.33 In a cross-sectional study of elderly nursing home patients, women treated with tamoxifen showed a reduced risk of Alzheimer's disease, improved activities of daily living, and improved decision making.34 To our knowledge, Co-STAR is the first clinical trial to examine the effects of tamoxifen on cognitive function in healthy women, and no significant differences were observed between tamoxifen and raloxifene.
The study has two important limitations. First, there was no placebo arm for comparison with the tamoxifen and raloxifene treatment arms. If both tamoxifen and raloxifene had beneficial or adverse effects on memory in Co-STAR, then cognitive effects would not be evident. Therefore, we cannot rule out the possibility that either or both treatments would have positive or negative effects on cognition when compared with placebo. Second, only a minority (approximately 20%) of participants completed assessments at baseline and throughout the trial, resulting in low power to detect treatment effects occurring within the first year of treatment. Notably, Co-STAR has several strengths. The results address the important clinical issue of whether cognitive effects should be considered when choosing between two SERMs that show similar efficacy in preventing breast cancer.1 The test battery was the same as that used in the Women's Health Initiative Memory Study,25 which will allow for comparisons of tamoxifen and raloxifene with placebo and conjugated equine estrogen with and without medroxyprogesterone acetate.
In summary, the present findings indicate that tamoxifen and raloxifene are associated with similar patterns of cognitive function in healthy postmenopausal women at increased risk of breast cancer. These findings will help women and their health care providers make more informed decisions regarding the use of tamoxifen or raloxifene for the prevention of breast cancer, because the data do not support one SERM conferring a cognitive advantage over the other. These results, however, should be interpreted with caution because of the absence of a placebo group. Future comparisons between these findings and patterns of cognitive function in hormone therapy and placebo groups in WHISCA should provide further insights into the effects of tamoxifen and raloxifene on cognitive function in older women.
Co-STAR Cognitive Battery: A standardized neuropsychologic test battery was administered according to a standardized set of instructions. The Cognition in the Study of Tamoxifen and Raloxifene (Co-STAR) test battery was modeled after the cognitive battery used in the Women's Health Initiative Study of Cognitive Aging (WHISCA).25 The test battery additionally included the Modified Mini Mental State Examination (3MS) to assess global cognitive function. The battery was designed to include measures that have been shown to be sensitive to subtle changes in cognitive function associated with aging and hormone therapy. General cognitive function, sleep, menopausal symptoms, and affect were also assessed. The battery required 83 minutes for administration (Table 1) and included the following measures.
Benton Visual Retention Test (BVRT)16,17 is a test of figural memory. Participants viewed each of 10 line drawings for 10 seconds before the figure was removed and participants were asked to draw it from memory. Drawings were progressively more complex, and each trial was scored independently for errors by two trained examiners according to standard procedures.17 The score on the BVRT was the total number of figures with errors and ranged from 0 to 26.
Modified California Verbal Learning Test (CVLT)18 (Delis DC: Clin Neuropsychol 5:154-162, 1991) is a measure of learning and memory. A target list of 16 words (List A), a shopping list comprising four words from each of four semantic categories, was presented three times. The participant was asked to recall as many words as possible after each presentation. The outcome measures were the total number of words learned during the three immediate free recall trials, the number of words recalled from a second “interference” shopping list (List B), the number of words recalled after a short delay (short-delay free recall), and the number of words recalled after a 20-minute delay (long-delay free recall). Given that performance on memory tests improves with exposure and practice, independent of any treatment effects, we used a version of the CVLT that was modified by reducing the original five immediate free recall trials to three trials and by omitting extra category-cued recall trials. Alternate forms were also used at the third and fourth evaluations for the same purpose. Among all the tests in the battery, this test may be most sensitive to the effects of hormone therapy (Maki PM: Am J Psychiatry 158:227-233, 2001; Maki PM: Neurology 69:1322-1330, 2007). The maximum score is 48 on the total List A trials, and the maximum score is 16 on each of the short-delay free recall, long-delay free recall, and List B trials.
Modified Mini Mental State Examination (3MS)13 consists of 15 items whose scores sum to 0 to 100; higher scores reflect better general cognitive functioning. Test items measure temporal and spatial orientation, immediate and delayed recall, executive function, naming, verbal fluency, abstract reasoning, praxis, writing, and visuo-constructional abilities. It has good reliability, sensitivity, and specificity for detecting cognitive impairment and dementia ([No authors listed]: Control Clin Trials 19:61-109, 1998; Wong M: Brain Res 543:148-152, 1991).
Primary Mental Abilities-Vocabulary (PMA-V) Test14 assesses verbal knowledge and reasoning ability. For each of 50 target words, participants selected the one word out of four choices that was most similar in meaning to the target word. The score was the total number of correct words completed within 3 minutes minus one third of the total number incorrect, for a maximum score of 50.
Letter (F, A, S) Fluency Test15 gives the participant 1 minute to generate as many words as possible beginning with each letter (F, A, and S).
Semantic Fluency Test15 gives the participant 1 minute to generate as many examples as possible of members of two categories of items (fruits and vegetables). The total number of words generated for letters and categories were the outcome measures.
Digit Span19 requires the participant to immediately recall a series of digits in the same order as presented initially (forward) or in the reverse order (backward). The maximum score is 14 for each subscale.
Card Rotations Test,30 with 28 trials, requires the participant to view sample line drawings of a geometric figure and eight alterations representing two or three-dimensional rotations of the drawing. Participants were asked to identify alternatives that showed the sample in two, but not in three, dimensions. The maximum score is 160.
Finger Tapping Test21 is a test of motor speed and dexterity. Participants are asked to depress a lever as many times as possible in each of seven 10-second trials, first with the right hand and then with the left hand. The highest and lowest scores are dropped; the score is the average of the remaining five trials for each hand.
Positive and Negative Affect Schedule (PANAS)22 assesses mood. PANAS is a list of 10 pleasant mood states (eg, interested, proud, inspired) and 10 unpleasant mood states (eg, irritable, guilty, jittery). Respondents were asked to rate on a five-point scale the extent to which they have experienced each mood during the previous 2 weeks. Ratings for each item can range from 1 to 5.
Geriatric Depression Scale-Short Form (GDS-SF)23 assesses mood with the 15-item short form of the GDS which measures nonsomatic features of depressed mood. Participants indicate the presence or absence of each symptom. The GDS-SF score is the total number of positive depressive items. The GDS-SF has comparable sensitivity and specificity to the Center for Epidemiologic Studies Depression Scale (CES-D) for detecting late life depression in a nonpsychiatric population.24 Scores greater than 10 typically indicate depression.
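Three of the scoring rules described above (PMA-V, Finger Tapping Test, and GDS-SF) are simple enough to express directly. The following is a minimal sketch under stated assumptions, not the study's scoring software: the function names are hypothetical, and the GDS-SF function assumes each item has already been coded as a depressive symptom being present or absent.

```python
from statistics import mean
from typing import Sequence


def pma_v_score(n_correct: int, n_incorrect: int) -> float:
    """PMA-V: number correct minus one third of the number incorrect."""
    if n_correct < 0 or n_incorrect < 0 or n_correct + n_incorrect > 50:
        raise ValueError("counts must be non-negative and total at most 50")
    return n_correct - n_incorrect / 3


def finger_tapping_score(trials: Sequence[float]) -> float:
    """Finger Tapping: drop the highest and lowest of seven trials for one
    hand, then average the remaining five."""
    if len(trials) != 7:
        raise ValueError("expected seven 10-second trials")
    trimmed = sorted(trials)[1:-1]
    return mean(trimmed)


def gds_sf_score(symptom_present: Sequence[bool]) -> int:
    """GDS-SF: total number of depressive items endorsed out of 15.
    Assumes items are already coded so that True means a symptom is present."""
    if len(symptom_present) != 15:
        raise ValueError("expected 15 items")
    return sum(1 for present in symptom_present if present)
```

For example, `pma_v_score(40, 6)` gives 38.0, and `finger_tapping_score([50, 52, 48, 55, 47, 51, 49])` discards 47 and 55 and averages the remaining five trials to 50. The finger tapping function would be called once per hand.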
See accompanying editorial on page 5119
Written on behalf of the Co-STAR Study.
Supported by the National Institute on Aging Grant No. NO1-AG-2106 and in part by the Intramural Research Program, National Institute on Aging, National Institutes of Health (Co-STAR); supported by Public Health Service Grants No. U10-CA-37377, U10-CA-69974, U10CA-12027, and U10CA-69651 from the National Cancer Institute, the Department of Health and Human Services, AstraZeneca, and Eli Lilly (STAR).
Presented in part at the Alzheimer's Association International Conference on Alzheimer's Disease, July 26-31, 2008, Chicago, IL.
Authors' disclosures of potential conflicts of interest and author contributions are found at the end of this article.
Although all authors completed the disclosure declaration, the following author(s) indicated a financial or other interest that is relevant to the subject matter under consideration in this article. Certain relationships marked with a “U” are those for which no compensation was received; those relationships marked with a “C” were compensated. For a detailed description of the disclosure categories, or for more information about ASCO's conflict of interest policy, please refer to the Author Disclosure Declaration and the Disclosures of Potential Conflicts of Interest section in Information for Contributors.
Employment or Leadership Position: None
Consultant or Advisory Role: Therese B. Bevers, Eli Lilly (C)
Stock Ownership: None
Honoraria: Susan M. Resnick, Eli Lilly, AstraZeneca
Research Funding: Pauline M. Maki, Wyeth Pharmaceuticals; Therese B. Bevers, Eli Lilly, National Cancer Institute
Expert Testimony: None
Other Remuneration: None
Conception and design: Claudine Legault, Pauline M. Maki, Susan M. Resnick, Laura Coker, Sally A. Shumaker
Financial support: Susan M. Resnick, Sally A. Shumaker
Administrative support: Claudine Legault, Sally A. Shumaker
Provision of study materials or patients: Claudine Legault, Susan M. Resnick, Therese B. Bevers, Sally A. Shumaker
Collection and assembly of data: Claudine Legault, Laura Coker, Therese B. Bevers, Sally A. Shumaker
Data analysis and interpretation: Claudine Legault, Pauline M. Maki, Susan M. Resnick, Patricia Hogan, Therese B. Bevers, Sally A. Shumaker
Manuscript writing: Claudine Legault, Pauline M. Maki, Laura Coker, Patricia Hogan, Therese B. Bevers, Sally A. Shumaker
Final approval of manuscript: Claudine Legault, Pauline M. Maki, Susan M. Resnick, Laura Coker, Therese B. Bevers, Sally A. Shumaker