Clinicians and researchers synthesize data from randomized controlled trials (RCTs) of antidepressants to make conclusions about the efficacy of medications for depression. All treatments include nonspecific factors in addition to the specific effects of drugs, and study design may influence patient outcomes via nonspecific factors. This study investigated whether placebo control and treatment duration affect outcome in antidepressant RCTs.
Medline and the Cochrane Database were searched to identify RCTs of FDA-approved antidepressants for major depression. Included studies enrolled outpatient participants aged 18–65, lasted 6–12 weeks, compared an antidepressant to placebo or another antidepressant, and were published in English after 1985. Excluded trials enrolled inpatients, pregnant women, and subjects with psychosis or mania. Mixed effects logistic regression models including study type (placebo-controlled or comparator) and study duration (6, 8, or 12 weeks) as fixed effects determined whether these factors affected response and remission rates.
In the 90 trials analyzed, the odds of depression response (OR 1.79, 95% CI = 1.45 – 2.17, p < 0.001) and remission (OR 1.53, 95% CI = 1.11 – 2.11, p < 0.001) were significantly higher in comparator relative to placebo-controlled trials. Eight week (OR 1.37, CI 1.14 – 1.64, p = 0.001) and 12 week (OR 1.52, CI 1.12 – 2.07, p = 0.008) duration trials had significantly greater response rates than 6 week trials but did not differ from each other.
Response and remission rates to antidepressants are significantly affected by study type and duration. Clinicians and researchers must consider study design when interpreting and designing RCTs of antidepressant medications.
When a psychiatrist prescribes an antidepressant, his or her patient may reasonably ask “what are the chances I will get better on this medication?” and “how long will it take me to feel better?” In answering these questions, a psychiatrist practicing Evidence Based Medicine (EBM) is informed by research studies testing the proposed medication for depression. However, there are many studies to choose among when gathering evidence about the anticipated effectiveness of antidepressants and speed of treatment response.
For example, sources of information about the effectiveness of antidepressants include open studies, placebo-controlled randomized clinical trials (RCTs), and comparator (i.e., medication vs. medication) RCTs. Response rates are generally higher in open studies compared to placebo-controlled RCTs. Similarly, information about the speed of antidepressant response may come from observing the time course of response within individual studies or comparing response rates across trials of different durations. Within a single 12 week RCT, medication response rates are greater at the trial endpoint compared to 8 weeks, but similar response rates have been observed across trials of 6, 8, and 12 week duration [4,5]. These discrepant results suggest that study design may affect treatment outcome, and they leave unclear which studies constitute the best evidence to answer specific clinical questions.
Few previous investigators have directly addressed the questions of whether or how study design impacts treatment response. In one of the few available studies, higher antidepressant response rates were found in placebo-controlled trials relative to comparator trials (58.1% and 50.6%, respectively). However, this analysis included unipolar as well as bipolar depression and examined RCTs dating to 1959, when methodological problems plagued many trials. Subsequent investigators found an average medication response rate of 49% in placebo-controlled versus 59% in comparator trials for late life depression. However, they did not conduct a formal literature search, provide inclusion and exclusion criteria for the studies they examined, or test whether the observed difference was statistically significant.
Sneed et al. (2007) recently examined antidepressant response rates in 9 placebo-controlled and 7 comparator trials for late life depression. A 46% response rate to medication was found in placebo-controlled trials compared to 63% in comparator trials. The odds of medication response in comparator trials were nearly two times the odds in placebo-controlled trials (OR 1.78, 95% CI = 1.10 – 2.90, p < 0.001). This study used rigorous trial selection criteria and statistical methods but was limited to subjects over 60 years old, leaving unclear whether these results can be generalized to all adults.
Different antidepressant study designs may affect treatment outcome by influencing nonspecific factors. In addition to their treatment specific effects, all treatments comprise nonspecific factors such as healthcare provider attention, treatment credibility, and patient and doctor expectations. In the National Institute of Mental Health (NIMH) Treatment of Depression Collaborative Research Program (TDCRP), subjects’ higher expectation of therapeutic gain predicted greater likelihood of depression response and lower final depression scores in all four treatment conditions. In another trial, 90% of participants with high expectations of improvement responded to treatment compared to 33% of subjects with lower expectations.
To further explore the influence of RCT design on antidepressant treatment outcome, response and remission rates to antidepressant medications were compared across placebo-controlled and comparator trials enrolling outpatient participants aged 18 to 65. Analyses of antidepressant response and remission rates in trials of 6, 8, and 12 week duration were also performed. The primary hypotheses were that response and remission rates to medications in comparator trials would be significantly higher than those observed in placebo-controlled studies and that response and remission rates to medication in 6, 8, and 12 week duration trials would not be significantly different.
Although this study shares some characteristics with meta analyses, it is not a true meta analysis. Meta analyses pool odds ratios or effect sizes from many individual studies comparing a treatment of interest with a reference group. Combining the effect sizes from these comparisons increases power, helps adjudicate between conflicting individual study results, and provides a more accurate estimate of the true effect size. A different question was addressed in this study, which was whether specific design parameters influence antidepressant treatment outcome across studies. Answering this question required comparing antidepressant response and remission rates rather than odds ratios or effect sizes. Therefore, traditional meta analytic methodology was not appropriate in this case, and different methods of data analysis were utilized.
A Medline search was conducted to identify RCTs contrasting antidepressants to placebo or active comparator in adults with depression. The index terms “depression—drug therapy,” “depressive disorder—drug therapy,” and “antidepressant agents,” in addition to the class and individual generic name of all antidepressants were combined using the ‘or’ operator. This returned 19,338 results, which were limited to 1) English language articles, 2) publication year from 1985 to 2006, 3) age group >= 18 (to be inclusive), and 4) publication types including clinical trials, controlled clinical trials, meta-analysis, multi-center study, randomized controlled trial, or review, which yielded 2,821 journal articles. The year 1985 was chosen to select trials utilizing more rigorous methods. The first author (BRR) conducted a review of these titles to rule out those which were not clinical trials of antidepressants for depression, resulting in 564 titles.
Three judges (BRR, JRS, and SPR) reviewed the 564 titles, sequentially proceeding from article title, to abstract, and finally full paper text, to determine whether they met inclusion or exclusion criteria (see Figure 1). These evaluations were pooled, and any differences between judges were resolved by discussion. To further ensure all relevant papers were reviewed, the references of all meta analyses and review articles published since 2000 among the 2,821 journal articles were searched for pertinent references. In addition, the Cochrane Database of Systematic Reviews was electronically searched using the topics depression, anxiety, and neurosis. This yielded 24 protocols and completed reviews, each of whose references was reviewed to ensure they were among the reviewed trials.
Inclusion criteria stipulated that articles report RCTs of a Food and Drug Administration (FDA) approved antidepressant medication for Major Depressive Disorder (MDD) in outpatient subjects aged 18–65. Further criteria required trials to last between 6 and 12 weeks (inclusive), have a comparison group of placebo or another FDA-approved antidepressant medication, be written in English, be published in 1985 or later, and have response or remission rates specified using a standardized outcome measurement (e.g., Hamilton Rating Scale for Depression (HRSD), Beck Depression Inventory (BDI), Montgomery-Asberg Depression Rating Scale (MADRS), Clinical Global Impression (CGI)). Trials were excluded for enrolling inpatients, pregnant women, subjects who were psychotic, or those defined to have treatment resistant depression. Also excluded were antidepressant augmentation studies and trials requiring as inclusion criteria a specific subtype of Major Depression, a specific medical illness, or an Axis I disorder other than depression.
Publication information (year of publication, funding source, type of study, number of groups), demographic characteristics of the included subjects (sample size, age, gender, race, clinical characteristics), details of the treatment condition (medication name, mean dose), and outcome data (pre and post-treatment means, standard deviations, response and remission rates) were extracted from each included RCT. Study quality was also measured by determining whether studies reported critical methodological aspects such as (1) concealment of treatment allocation, (2) blinding of outcome assessment, and (3) use of intent to treat data analyses. Three judges (BRR, JRS, and SPR) extracted the data, and any differences were resolved by consensus.
Data analyses followed those in a prior manuscript, where the procedures are described in greater detail. Mixed effects logistic regression models were used, similar to the approach taken by Bryk and Raudenbush, Hox, and Haddock, Rindskopf, and Shadish. Response rates in all medication cells in the placebo-controlled and comparator trials were included, even when multiple cells comparing the same medication at different doses were present. Meta analyses typically combine these cells to avoid making multiple contrasts with the same placebo comparison group, but since such contrasts were not of interest in this analysis, all medication cells were included separately.
Analyses proceeded in a stepwise fashion as follows. First, an unconditional model was fit to the data on antidepressant response rates to determine whether significant variability in response rates existed across studies. The unconditional model is described by a within-studies and a between-studies equation, which accommodate the nested structure of subjects within medication conditions within study. If variability in antidepressant response rates was greater than that expected by chance alone, then the analysis proceeded with a conditional model adding study type (placebo-controlled or comparator) as a fixed effect in the between-studies equation. Odds ratios and estimated probabilities of response to antidepressant medication in the different study types were computed. For completeness, the estimated probability of response to placebo in placebo-controlled trials was also computed. Finally, a full interaction model was constructed by adding study duration to the conditional model of antidepressant response rates. In these analyses, 6 and 7 week duration trials were grouped together under the heading of 6 week trials, 8 and 9 week duration trials were grouped as 8 week trials, and 10 to 12 week duration trials were grouped as 12 week trials.
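The stepwise models described above can be written out as a two-level logistic model. The block below is a sketch in conventional hierarchical-model notation, not the authors' own specification: the symbols (Y, m, φ, η, β, γ, u, τ) and the indicator names TYPE, D8, and D12 are assumptions introduced here for illustration, with i indexing medication cells and j indexing studies.

```latex
% Within-studies equation: Y_{ij} responders among m_{ij} subjects in cell i of study j
Y_{ij} \mid \phi_{ij} \sim \mathrm{Binomial}(m_{ij}, \phi_{ij}), \qquad
\eta_{ij} = \log\!\left(\frac{\phi_{ij}}{1 - \phi_{ij}}\right) = \beta_{0j}

% Between-studies equation, unconditional model
\beta_{0j} = \gamma_{00} + u_{0j}, \qquad u_{0j} \sim N(0, \tau_{00})

% Conditional model: study type as a fixed effect (TYPE_j = 1 for comparator trials)
\beta_{0j} = \gamma_{00} + \gamma_{01}\,\mathrm{TYPE}_j + u_{0j}

% Full interaction model: duration indicators (6 week trials as reference) and their
% interactions with study type
\beta_{0j} = \gamma_{00} + \gamma_{01}\,\mathrm{TYPE}_j + \gamma_{02}\,\mathrm{D8}_j
           + \gamma_{03}\,\mathrm{D12}_j + \gamma_{04}\,\mathrm{TYPE}_j \mathrm{D8}_j
           + \gamma_{05}\,\mathrm{TYPE}_j \mathrm{D12}_j + u_{0j}
```

Under this parameterization the odds ratio for comparator versus placebo-controlled trials is exp(γ01), and an estimated probability of response is obtained by applying the inverse logit, 1/(1 + exp(−η)), to the fitted log odds.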
This analysis of response rates was repeated in an identical fashion for the data on antidepressant remission rates. The regression models were estimated using HLM 6. Differences in participant characteristics between trials were investigated using two-tailed independent samples t tests for continuous variables and chi-square (Χ2) tests for categorical variables (SPSS version 15).
Forty eight placebo-controlled and 42 comparator trials met the study’s inclusion and exclusion criteria (Tables 1 and 2). As shown in Table 3, there were 100 active treatment conditions enrolling 9,515 participants in the 48 placebo-controlled RCTs, since many trials compared more than one medication to placebo. Among the placebo-controlled trials, 80% demonstrated significant differences in depression response rates between medication and placebo. There were 84 active treatment conditions enrolling 7,030 subjects in the 42 comparator RCTs. Among the comparator trials, 10% demonstrated significant differences in depression response rates between medications.
Response rates to medication ranged from 25–74% (mean 52.2 ± 10.4) in the placebo-controlled and 39–91% (mean 65.2 ± 11.9) in the comparator trials, while remission rates ranged from 22–62% (mean 39.7 ± 10.6) in the placebo-controlled and 27–70% (mean 48.4 ± 13.0) in the comparator trials. Response rates to placebo in the placebo-controlled trials ranged from 13–53% (mean 34.7 ± 10.4), while remission rates ranged from 10–37% (mean 24.5 ± 8.0).
Relative to comparator trials, placebo-controlled trials enrolled individuals who were significantly younger (40.4 ± 2.6 vs. 42.9 ± 3.2, t = −3.86, df 153, p < 0.001), more likely to be of white ethnicity (Pearson Χ2 = 7.77, df 1, p = 0.041), and depressed for longer periods of time (Pearson Χ2 = 17.11, df 3, p < 0.001), and they had higher drop out rates (34.6 ± 14.4 vs. 22.0 ± 9.7, t = 5.54, df 145, p < 0.001). The two types of trials did not differ in subjects’ pre-treatment HRSD scores, mean sample size, medications used, gender, prior depressive episodes, response and remission outcome measures, or quality ratings.
In the unconditional model of antidepressant response rates, variability between studies was over 10 times that expected by chance alone (Χ2/df = 936.9/88 = 10.6). Therefore, the null hypothesis that antidepressant response rates are homogeneous across studies was rejected, and the analysis proceeded with the conditional model. The variability found in antidepressant remission rates across studies was over 12 times that expected by chance alone (Χ2/df = 463.3/37 = 12.5), so this analysis likewise proceeded with the conditional model.
In the conditional model of antidepressant response rates, study type accounted for 34% of the variability observed (see Table 4, (0.25 – 0.16)/0.25 = 0.34). As shown in Table 5, the odds of responding to medication in comparator trials were 1.79 times the odds in placebo-controlled trials (95% CI = 1.45 – 2.17, p < 0.001). The estimated response rate in placebo-controlled trials was 52% compared to 65% in comparator trials. For the purposes of comparison, the estimated response rate to placebo in the placebo-controlled trials was 40%.
In the remission rate analysis, study type accounted for 16% of the variability observed (see Table 4, (0.24 – 0.20)/0.24 = 0.16). The odds of remitting to medication in comparator trials were 1.53 times the odds in placebo-controlled trials (95% CI = 1.11 – 2.11, p < 0.001), and the estimated remission rate in placebo-controlled trials was 38% versus 49% in comparator trials. The estimated remission rate to placebo in the placebo-controlled trials was 29%.
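To make the relationship between the reported odds ratios and the estimated rates concrete, the following sketch converts between probabilities and odds. It is an illustration only: the function names are hypothetical, and because the published percentages are rounded and the odds ratios come from the mixed effects model, recomputed values will only approximate the reported ones.

```python
def odds(p):
    """Convert a probability (0 < p < 1) to odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return p / (1.0 - p)


def odds_ratio(p_a, p_b):
    """Odds ratio of outcome probability p_a relative to p_b."""
    return odds(p_a) / odds(p_b)


def apply_odds_ratio(p_ref, ratio):
    """Probability implied by multiplying the odds of p_ref by an odds ratio."""
    new_odds = odds(p_ref) * ratio
    return new_odds / (1.0 + new_odds)


# Odds ratios implied by the rounded estimated rates (comparator vs. placebo-controlled)
print("response OR from rounded rates: ", round(odds_ratio(0.65, 0.52), 2))
print("remission OR from rounded rates:", round(odds_ratio(0.49, 0.38), 2))

# Rates implied by applying the reported odds ratios to the placebo-controlled rates
print("implied comparator response rate: ", round(apply_odds_ratio(0.52, 1.79), 3))
print("implied comparator remission rate:", round(apply_odds_ratio(0.38, 1.53), 3))
```

Working on the odds scale is what allows a single ratio to be applied to any baseline rate; the same odds ratio corresponds to different absolute percentage differences depending on the baseline.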
Study duration accounted for 17% of the variability in antidepressant response rates once study design was taken into account (see Table 6, (0.16 – 0.14)/0.16 = 0.17). The odds of medication response were significantly greater in 8 week (OR 1.37, CI 1.14 – 1.64, p = 0.001) and 12 week (OR 1.52, CI 1.12 – 2.07, p = 0.008) compared to 6 week duration clinical trials. However, there was no difference between 8 and 12 week trials (OR 1.11, 95% CI 0.81 – 1.50, p = 0.497). No significant interactions between study duration and study type were found for medication response rates.
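Readers who wish to check that an odds ratio, its confidence interval, and its p value are mutually consistent can do so with a Wald-type calculation on the log odds scale. The sketch below assumes the interval is a symmetric 95% interval on the log scale and uses a standard normal reference distribution; the software used in the study may have used a different reference distribution, so small discrepancies from the published p values are expected. The function name is hypothetical.

```python
import math


def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover SE of log(OR), the z statistic, and a two-sided p value.

    Assumes ci_low and ci_high bound a symmetric interval on the log scale
    with half-width z_crit * SE (z_crit = 1.959964 for a 95% interval).
    """
    if not 0.0 < ci_low < ci_high:
        raise ValueError("confidence limits must satisfy 0 < ci_low < ci_high")
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal p value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Response rate contrasts between trial durations, as reported in the text
contrasts = {
    "8 vs 6 weeks": (1.37, 1.14, 1.64),
    "12 vs 6 weeks": (1.52, 1.12, 2.07),
    "12 vs 8 weeks": (1.11, 0.81, 1.50),
}
for label, (or_, low, high) in contrasts.items():
    se, z, p = wald_from_ci(or_, low, high)
    print(f"{label}: SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Under these assumptions the recomputed p values land close to the reported 0.001, 0.008, and 0.497, which suggests the published intervals and significance levels are internally consistent.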
Duration accounted for 11% of the variability in antidepressant remission rates once study design was taken into account (see Table 6, (0.20 – 0.18)/0.20 = 0.11). There was a trend toward higher remission rates to medication in 8 versus 6 week duration clinical trials (OR 1.803, CI 0.987 – 3.295, p = 0.055), but no significant differences were observed between remission rates in 12 and 6 week clinical trials (OR 1.810, CI 0.641 – 5.111, p = 0.254) or 8 and 12 week trials (OR 1.003, 95% CI 0.398 – 2.532, p = 0.993). No significant interactions between study duration and study type were found for medication remission rates.
Consistent with the primary hypothesis, response and remission rates to antidepressants were significantly higher in comparator relative to placebo-controlled trials. The odds of responding to medication in a comparator trial were nearly twice those in placebo-controlled trials, while the odds of remitting were one and a half times as great. In contrast to the stated hypotheses, antidepressant treatment outcome also depended on study duration to some extent. The odds of treatment response were higher in 8 and 12 week duration trials compared to those lasting 6 weeks, while there was a trend toward greater odds of remission in 8 week compared to 6 week duration trials (p = 0.055). There were no significant differences between 8 and 12 week duration RCTs.
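The variance-explained figures quoted in the Results are proportional reductions in between-study variance when a predictor is added. Recomputing them from the two-decimal variance components printed in the text does not reproduce every percentage exactly, which is what one would expect if the percentages were computed from unrounded components. The sketch below illustrates this; the rounding explanation is an assumption made here, not something the text states, and the function names are hypothetical.

```python
def variance_explained(tau_before, tau_after):
    """Proportional reduction in between-study variance."""
    return (tau_before - tau_after) / tau_before


def variance_explained_bounds(tau_before, tau_after, half_width=0.005):
    """Range of reductions compatible with components rounded to two decimals.

    Assumes each printed component may differ from its unrounded value by up
    to half_width in either direction.
    """
    low = variance_explained(tau_before - half_width, tau_after + half_width)
    high = variance_explained(tau_before + half_width, tau_after - half_width)
    return low, high


# (label, variance before, variance after, reported proportion) from the text
reported = [
    ("response, study type", 0.25, 0.16, 0.34),
    ("remission, study type", 0.24, 0.20, 0.16),
    ("response, duration", 0.16, 0.14, 0.17),
    ("remission, duration", 0.20, 0.18, 0.11),
]
for label, before, after, stated in reported:
    low, high = variance_explained_bounds(before, after)
    point = variance_explained(before, after)
    print(f"{label}: from printed values {point:.3f}, "
          f"compatible range {low:.3f} to {high:.3f}, reported {stated:.2f}")
```

Each reported proportion falls inside the range compatible with two-decimal rounding of the printed components, so the figures are consistent with one another under that assumption.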
The factors explaining the large differences observed in antidepressant response and remission rates between placebo-controlled and comparator trials are unknown. One obvious dissimilarity between these study types is that subjects, clinicians, and outcome raters in comparator trials know the subjects are receiving medications demonstrated to be effective for depression, while participants in placebo-controlled trials may be taking placebo. This raises the possibility that higher expectations of improvement among these individuals in comparator trials may account for the observed differences in treatment outcome. Greater expectations may lead subjects to form stronger therapeutic alliances, continue treatment during periods of clinical worsening or increased side effects, and report less severe symptoms. Similarly, clinicians and raters who are aware subjects are receiving medication rather than placebo may evaluate them more optimistically. Alternatively, lower expectations for therapeutic gain in placebo-controlled trials may decrease medication response rates in those trials. Given that placebo is not administered in clinical practice, comparator trials and open studies may approximate more closely the clinical effectiveness of antidepressants.
Patient and doctor expectations may also play a role in the speed of response to antidepressants. Some investigators have argued that 12 week trials of antidepressants are necessary based on finding within a single trial that medication response rates are higher at 12 weeks compared to 8 weeks. In fact, the American College of Neuropsychopharmacology (ACNP) recently advocated antidepressant trials up to 20 weeks long when remission of depression is the goal of treatment. However, in the present study, no differences were found in response and remission rates between 8 and 12 week trials. This finding suggests conclusions regarding the necessary duration of antidepressant trials cannot be based on within-study comparisons, but instead must come from comparisons between trials of different durations. More subjects appear to respond to medications as the end of a trial approaches, regardless of whether it is 8 or 12 weeks long.
A major significance of these findings has to do with the appropriate design of studies comparing medication and psychotherapy in the same trial. The design of medication and psychotherapy trials has been heavily influenced by debate over the NIMH TDCRP, in which many pharmacologists argued that “internal calibration” with pill placebo is necessary to demonstrate the sample represented a drug-responsive population. While the argument for internal calibration of medication and psychotherapy studies is cogent, the results of this study suggest such a design may create other problems.
These problems can be illustrated by considering the Treatment for Adolescents with Depression Study (TADS), which randomized adolescents with Major Depression to cognitive behavior therapy (CBT) alone, fluoxetine alone, combined CBT and fluoxetine, and pill placebo. The authors found fluoxetine alone and CBT alone were significantly better than placebo but not different from one another, while combined CBT and fluoxetine was superior to either monotherapy. In this study participants in the CBT alone condition knew they were receiving psychotherapy, while subjects taking pills did not know whether they were fluoxetine or placebo. Similarly, subjects in the combined cell knew they were receiving two active treatments rather than one (CBT alone) or possibly none (fluoxetine and pill placebo). Antidepressant response and remission rates are lower when subjects do not know they are receiving active treatment (i.e., placebo-controlled RCTs) versus when they know they are receiving medication without knowing the exact agent (i.e., comparator RCTs). Therefore, in comparing openly administered psychotherapy to blinded medication treatment, such combined studies may be biased against medication.
The longstanding absence of a psychotherapy placebo makes it difficult to both internally calibrate a combined treatment study with pill placebo and avoid biasing the study against medication. The study design one chooses will then be determined by the question one wishes to answer. To determine the relative efficacy of medication and psychotherapy, a three cell design similar to Keller et al.’s (2000) study of combined psychotherapy/nefazodone versus nefazodone and psychotherapy alone may be better. However, using this design one would not be assured that the individuals studied represented a medication-responsive population.
Lastly, a number of limitations should be considered when interpreting the results of this study. Publication bias may have affected which studies were included in these analyses, since RCTs failing to demonstrate significant differences between medication and placebo may not have been published. Analysis of the FDA clinical trial database has revealed that 48% of trials involving an investigational antidepressant and 64% of trials involving established agents demonstrate a significant difference between drug and placebo. These values are lower than the 80% of the placebo-controlled trials included in this study that report a significant difference between drug and placebo, which indicates the presence of publication bias among the included studies. However, the implications of publication bias for the results reported here are unclear, since it is not the efficacy of drugs compared to placebo that is being investigated. Publication bias seems unlikely to affect the overall pattern of placebo-controlled trials having lower response rates than comparator trials. In fact, the inclusion of unpublished RCTs not demonstrating a difference between medication and placebo seems likely to strengthen the observed results, since lower antidepressant response rates in placebo-controlled RCTs would increase their relative differences with response rates in comparator trials.
Second, because this analysis combines subjects across different trials, the results may be due to differences among participants enrolled in the different study types and durations rather than study design itself. This possibility was investigated by comparing subjects in placebo-controlled and comparator trials on a number of demographic and clinical characteristics, which revealed higher drop out rates and longer episodes of current depression among subjects in placebo-controlled trials. This potential limitation illustrates how retrospective analyses alone can never definitively answer the question of whether study design influences treatment outcome. It is essential to randomize a single sample to different study types or durations and prospectively study their treatment outcomes. Such a prospective trial is now underway by the current authors.
In summary, the finding that response and remission rates to antidepressant medications differ significantly when they are administered in placebo-controlled versus comparator trials shows that study design affects treatment outcome. Furthermore, study design affects the speed of treatment response. While the overall percentages of subjects responding to medications in 8 and 12 week trials are not significantly different, subjects in 12 week trials respond more slowly.
The authors thank David Rindskopf, PhD at the Graduate Center of The City University of New York for his assistance in planning data analyses.
Disclosures: This study was supported by grants from the National Institute of Mental Health. Dr. Rutherford receives funding under T32 – MH1514144 “Research Fellowship Training: Affective and Related Disorders,” and Dr. Sneed receives funding under K23 – MH70056 “Vascular Depression: A Distinct Diagnostic Entity?” No other financial disclosures are reported for Drs. Rutherford and Sneed. Dr. Roose reports receiving research grants from Forest Laboratories and Novartis Pharmaceuticals as well as consultant fees from Forest Laboratories, Organon, Wyeth Pharmaceuticals, and Eli Lilly and Company.