We assembled the most comprehensive collection to date of youth depression treatment trials (both peer-reviewed and non-peer-reviewed), required random assignment for inclusion, contacted study authors repeatedly until we had obtained all data needed for the most precise ES calculation, and applied stringent meta-analytic methods (e.g., depression measures only, one mean ES per study, correction for small samples, weighted ES, and random effects analyses) to the data we obtained. The meta-analytic findings that emerged differed from those of previous reports in some very significant ways.
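As an illustration of the kind of procedure summarized above (small-sample correction, weighted ES, and random effects pooling), the following is a minimal sketch rather than the analysis code used in this meta-analysis. It assumes Hedges' correction factor and the DerSimonian-Laird estimator of between-study variance, neither of which is confirmed by the text; the function names and the study values in the usage lines are hypothetical.

```python
import math


def hedges_g(mean_tx, mean_ctl, sd_tx, sd_ctl, n_tx, n_ctl):
    """Standardized mean difference with a small-sample correction.

    Assumption: lower scores mean less depression, so the difference is
    taken as control minus treatment and a positive value favors treatment.
    """
    df = n_tx + n_ctl - 2
    sd_pooled = math.sqrt(((n_tx - 1) * sd_tx ** 2 + (n_ctl - 1) * sd_ctl ** 2) / df)
    d = (mean_ctl - mean_tx) / sd_pooled
    j = 1 - 3 / (4 * df - 1)  # Hedges' correction factor (assumed method)
    g = j * d
    var_g = j ** 2 * ((n_tx + n_ctl) / (n_tx * n_ctl) + d ** 2 / (2 * (n_tx + n_ctl)))
    return g, var_g


def random_effects_mean(effects, variances):
    """Pooled ES under a random effects model (DerSimonian-Laird, assumed)."""
    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * gi for wi, gi in zip(w, effects)) / sum_w
    # Q statistic: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (gi - fixed) ** 2 for wi, gi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    # Random effects weights add the between-study variance to each study
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * gi for wi, gi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, se, tau2


# Hypothetical posttreatment depression scores for three studies:
# (mean_tx, mean_ctl, sd_tx, sd_ctl, n_tx, n_ctl)
studies = [(12.0, 14.0, 5.0, 5.5, 20, 20),
           (10.5, 11.0, 4.0, 4.2, 35, 33),
           (9.0, 13.0, 6.0, 6.5, 15, 16)]
results = [hedges_g(*s) for s in studies]
pooled, se, tau2 = random_effects_mean([g for g, _ in results],
                                       [v for _, v in results])
print(f"pooled ES = {pooled:.2f}, SE = {se:.2f}, tau^2 = {tau2:.3f}")
```

Each study contributes a single ES here, mirroring the one-mean-ES-per-study rule; a study with several depression measures would first have its measure-level values averaged.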
Perhaps the most striking difference between our findings and those of previous meta-analyses concerned the overall magnitude of treatment benefit. The mean effect of psychotherapy in our analyses was 0.34, falling between Cohen’s (1988) benchmarks for a small (i.e., 0.20) and medium (i.e., 0.50) effect. Psychotherapy effects in previous youth depression meta-analyses (Lewinsohn & Clarke, 1999; Michael & Crowley, 2002; Reinecke et al., 1998a) had averaged 0.99, comparing favorably with Cohen’s benchmark of 0.80 for a large effect. The surprisingly modest treatment effect evident in our analyses suggests a new perspective on the success of youth depression psychotherapy. Our findings, including our direct comparison with previous meta-analytic findings with problems and disorders other than depression, indicate that youth depression treatment does not surpass but instead may lag significantly behind treatments for other youth conditions.
Such an inference would need to be considered with caution. The years in which the depression and nondepression treatment studies were published overlapped substantially but not perfectly, which could complicate the comparison if year of publication were associated with ES; however, such an association is not evident in the youth treatment literature (see Weisz et al., 1995). The depression versus nondepression study comparison did control for multiple factors that previous literature suggests might explain differences between different collections of studies (mean age, gender distribution, recruited vs. referred youths, active vs. passive control group, methods of ES calculation, and peer-reviewed studies vs. non-peer-reviewed); these factors did not account for the difference between depression and nondepression studies. However, different collections of studies will inevitably differ in diverse ways, so it is possible that factors not identified and controlled for in our analysis might account for the ES difference between depression and nondepression studies.
In light of our finding that psychotherapies for youth depression have relatively modest mean effects, several potentially useful next steps might be considered. These could include (a) strengthening the substance or ramping up the dose of current treatments, (b) combining currently separate depression treatments into more potent multicomponent packages, and (c) developing and testing entirely new methods that produce more substantial benefit. That said, it is important to note that ES values showed a broad range across studies in our collection; indeed, five different treatment programs generated effects exceeding 1.0. Thus, some treatments in the current armamentarium may already have strong potential.
In this connection, it must be noted that the strongest potential may not attach to the most popular treatments. In the current zeitgeist, treatments that focus on altering unrealistic, negative cognitions have particularly prominent status. Indeed, 33 of the 44 treatments in our study set emphasized cognitive change (i.e., through CBT or other cognitive approaches). This broad approach is also popular in adult depression treatment; however, some of the most provocative adult research (see, e.g., Hollon, 2000; Jacobson et al., 1996) has highlighted the potential of noncognitive behavioral-activation strategies, providing evidence that the impact of such strategies is not improved on by treatment with a cognitive focus. Our analyses of youth treatment evidence indicated, similarly, that noncognitive treatments demonstrated effects that were at least as robust as those of the cognitive treatments, suggesting that beneficial treatment for youth depression may not require altering cognitions.
Although our overall mean ES for psychotherapy was much lower than in previous reports, it was significantly different from zero, suggesting reliable treatment effects across the group of studies as a whole. However, these effects proved durable only in the relatively short term. ES at follow-up periods of 1 year or longer showed no lasting treatment effect. This supports the potential value of booster sessions and continuation treatment (Clarke et al., 1999; Weissman, 1994) in extending treatment benefit over time. However, two caveats should be noted. First, more than one third of the studies reviewed did not include follow-up assessments with treatment versus control comparisons; we do not know how lasting effects were in those studies. Second, only five studies included follow-up at 1 year or beyond; this limits our ability to generalize, and it highlights the need for studies with longer term follow-up.
We also assessed the generality versus specificity of treatment effects, investigating the extent to which effects on depression-related outcome measures were replicated with nondepression outcome measures. Previous meta-analytic findings on generality–specificity across an array of youth treatments and treated problems (Weisz et al., 1995) had shown significant treatment effects on both targeted and nontargeted outcomes but with effects stronger for targeted than nontargeted outcomes, suggesting specificity of benefit. However, the previous work had not focused on depression in particular or on the question of whether carryover effects might depend on conceptual similarity between depression and the outcomes being measured. When we addressed this question here, we found evidence of both generality and specificity in treatment effects. Depression treatment was associated with significant improvement in the conceptually similar domain of anxiety. In fact, following depression treatment, the reduction in anxiety symptoms was only marginally lower than the reduction in depressive symptoms. By contrast, we found that effects for the conceptually dissimilar domain of externalizing problems were significantly smaller than effects on depression measures, and that the mean effect on externalizing outcomes was not significantly different from zero. Our finding that depression treatment has beneficial effects on anxiety is consistent with growing evidence that youth depression and anxiety are closely associated empirically (see, e.g., Achenbach & Rescorla, 2001) and that they share a common core of negative affectivity (see, e.g., Cole, Peeke, Martin, Truglio, & Seroczynski, 1998; King, Ollendick, & Gullone, 1991). A useful question for future research is whether the effects of youth depression treatment on anxiety result from increased skill in addressing the negative affectivity that is apparently shared by the two syndromes. Whatever the answer to this question, the findings do offer some support for the possibility that youth depression and anxiety might be treated by a common intervention encompassing emotional disorders (see Barlow et al., 2004).
Taken together, these findings on the magnitude and specificity of effects may help inform the debate over alternatives to antidepressant medication, as discussed in the introduction (see Glass, 2004; Safer, 1997; TADS Team, 2004; Vitiello & Swedo, 2004; Weisz & Jensen, 1999; Whittington et al., 2004). Our results suggest that for those who seek an alternative to antidepressants, psychotherapy offers a reasonable option, generating a small to medium ES that generalizes to comorbid anxiety symptoms and shows substantial holding power for some months after treatment ends. Because recent concerns over SSRIs relate to elevated risk of suicidality, it may warrant attention that our study set included six investigations that assessed suicidality as an outcome, and that these studies averaged a small reduction in suicidality (mean ES = 0.18, marginally greater than zero).
As another perspective on these issues, one might construe psychotherapy as a potentially useful complement to, rather than a replacement for, antidepressants, that is, a form of intervention that may boost outcomes when combined with medication. This perspective is consistent with the findings of the most complete and sophisticated direct comparison, to date, of medication with psychotherapy in youth depression treatment, namely the TADS (see TADS Team, 2004). In this study, adolescents treated with fluoxetine alone showed outcomes superior to those in a placebo condition, but adolescents treated with a combination of fluoxetine and a 12-week course of CBT showed the most positive treatment response, supporting the idea that psychotherapy may complement the effects of antidepressant medication. An important additional finding was that CBT alone did not significantly outperform the placebo condition, supporting concerns that psychotherapy alone (at least in its CBT form) may not be a very potent treatment force. If this finding were taken as definitive evidence on the potential of CBT, then the results could be quite discouraging to those who seek a psychotherapeutic alternative to medication. However, a close look at the table of studies indicates that the CBT ES generated in TADS is not characteristic of most CBT or psychotherapy effects on youth depression; 20 of the 23 other CBT programs in the table showed larger ES than the TADS version of CBT, and the mean ES value across the non-TADS CBT programs in the table was 0.48, markedly higher than the −0.07 ES associated with the TADS CBT intervention. What is not clear from the available data is whether this picture results from a low-potency version of CBT in TADS, from the unusual and challenging comparison of CBT with a medication placebo condition in TADS (see Baskin et al., 2003), from a combination of the two, or from other factors not identified.
A concern raised by some (e.g., Jensen, 2003; Weisz, 2004) about the evidence on youth treatment research in general is that so many of the trials have compared active treatment with passive control conditions, including no treatment and waitlist. This was evident in the current depression study set as well, with 20 of the 35 studies having used passive control conditions. Those studies showed a relatively strong treatment effect (mean ES = 0.41), significantly greater than zero at the .01 level. In contrast, the 14 studies comparing treatment with an active control group generated a mean ES of only 0.24, markedly lower but still greater than zero. Thus, our findings showed rather modest benefits of depression treatment when compared with the most rigorous comparison conditions (see Baskin et al., 2003). As Jensen (2003) stressed, we need studies that use “control groups comparable in intensity of exposure to the supposed active treatment. Such studies are critical, if we are to conclude that something about a given therapy is specifically effective, over and above simple compassion, friendliness, attention, and belief” (p. 37). The fact that passive control groups generate higher ES may also help explain why previous youth depression treatment meta-analyses have yielded higher mean ES than the current one, because the study collections in those prior meta-analyses involved somewhat heavier reliance than the current meta-analysis on no-treatment and waitlist comparison groups. Specifically, 40% of the studies in our meta-analysis used active control groups, in contrast to 26% averaging across the three previous youth depression meta-analyses: 14% in Michael and Crowley’s (2002) meta-analysis, 33% in Reinecke et al.’s (1998a) meta-analysis, and 33% in Lewinsohn and Clarke’s (1999) meta-analysis.
In the debate over empirically supported treatments, concern has been raised that the empirical support comes largely from efficacy studies in which experimental control is achieved at the cost of clinical representativeness (e.g., Weisz, 2004; Westen et al., 2004). One result, the argument goes, is that for many treatment programs we do not know whether the procedures actually work with clinically referred youths, treated by clinical practitioners, in clinical practice settings. Our findings may alleviate some of these concerns to some degree. Although we did find that ES values were somewhat larger for research therapists than for clinical practitioner therapists, and somewhat larger for treatments delivered in research settings than in service settings, neither difference was significant. Moreover, ES was reliably superior to zero for referred youths, practitioner therapists, and clinical service settings, suggesting that significant treatment benefit can be obtained across all three clinical representativeness dimensions.
Despite the modest overall treatment effects evident across the depression trials, we found that treatment benefit proved rather robust across some notable variations in person and treatment characteristics. For example, significant treatment effects were identified for (a) both child-majority and adolescent-majority samples considered separately, (b) samples identified as having depressive disorders and samples identified through depression symptom measures, (c) both group and individual treatments, and (d) treatments with and without a cognitive emphasis. In addition, treatment duration was not correlated with outcome, suggesting that some briefer treatments may have the potential to be as effective as lengthier ones. Thus, the benefits of psychotherapy, though modest on average, were evident across rather diverse characteristics of treated youths and across variations in the format, content, and duration of therapy.
We also found effects to be rather consistent across published and non-peer-reviewed studies, suggesting that publication bias may not be a major problem in the youth depression treatment literature thus far. In the youth psychotherapy literature generally, unpublished/non-peer-reviewed studies show significantly lower ES than published studies (see McLeod & Weisz, 2004). However, most youth depression psychotherapy research is relatively recent compared with treatment research with other youth conditions (see Weisz, Hawley, & Jensen Doss, 2004); more recent research may profit from an increased focus, in journal reviews, on the quality of the research procedures rather than on the statistical significance of intervention effects.
Our findings shed light on another area of discussion among youth depression treatment experts: intervention outcome as perceived by different informants. We found that depression-related outcomes looked significantly better when the outcome information was provided by the youngsters themselves (e.g., via symptom self-report measures) than when their parents provided the information. Moreover, youth-report outcomes were significantly better than zero, whereas parent-report depression outcomes were not. Such a finding may reflect the fact that youths themselves have better access to information on their own internal state than do outside observers (see Hammen & Rudolph, 1996). Collateral reporters rather consistently report lower levels of depression than children themselves (Angold & Costello, 1993; Capaldi & Stoolmiller, 1999; Hammen & Rudolph, 1996). It is possible that parents’ difficulty in evaluating their children’s internal states may make them relatively insensitive to the changes that would need to be noted to detect improvement at the end of treatment. It may also be relevant that parents of depressed children are more likely than other parents to be depressed themselves (Kovacs & Devlin, 1998); some have argued that relatively depressed mothers may perceive their children’s behavior in a more negative light than it actually warrants (Breslau, Davis, & Prabucki, 1988), but because maternal depression is in fact associated with more actual child disorder, it is not clear whether bias is involved (Boyle & Pickles, 1997; Richters, 1992). Whatever the reason for our findings, they do raise a concern that the evidence for beneficial effects of psychotherapy for youth depression rests almost entirely on reports by the youths themselves without confirmation from other more objective informants.
Our findings, together with the scrutiny of studies required for this meta-analysis, suggest several observations about the state of the evidence and ways to improve it. As noted previously, we need more studies designed to test whether specific depression treatments can outperform active conditions that control for attention and other nonspecific factors. In addition, the fact that more than one third of the studies in our collection generated no usable follow-up comparisons of treatment and control conditions is a reminder that we need more studies that include follow-up assessments, and in which the control condition remains unaltered throughout the follow-up period, so that treatment–control comparisons can be meaningful at the time of follow-up. The episodic nature of depression may also argue for extending the lag between posttreatment and follow-up. An important counter to this point is that maintaining control conditions for extended periods raises ethical concerns, if doing so exposes depressed youths to long periods without treatment. This point, in our view, underscores the potential value of the treatment-as-usual control condition (see, e.g., Weisz, 2004). Ethical concerns should not attach to procedures that provide youths with the intervention they would have received in the absence of the study. This suggestion and the previous one are quite compatible; no treatment and waitlist, currently the most commonly used control conditions, are not only the weakest experimentally but also the most difficult to sustain throughout a follow-up period, given ethical and humane concerns.
Another design limitation evident in the studies that we reviewed was the relative absence of intent-to-treat analyses (only 11 studies explicitly reported such analyses), a state of affairs that constrains interpretation of positive effects. Without such analyses, one cannot rule out the possibility that youths who dropped out of treatment (and were thus dropped from analyses) did so because they were not benefiting from treatment, and that the resulting ES values are overestimates.
Two final areas of concern involve the search for moderators and mediators of treatment impact (see discussion in the following books: Kazdin, 2000; Weisz, 2004). On the moderator front, many of the studies can be faulted for a failure to characterize the samples fully enough to address the role of participant characteristics. As an example, only 13 of the 35 studies provided detailed information on the race/ethnicity of their samples, and only 6 of the 35 studies included any test of any potential moderator. Even more striking is the relative inattention to the question of what change processes underlie improvement. In only one of the studies was a candidate mediator or change process identified and its mediating role tested.
Taken together, our findings and our observations on the evidence suggest an agenda for future research on youth depression treatment. Clearly, a useful foundation has been laid, with evidence from 35 studies pointing to treatment effects that are significant, albeit markedly more modest than those reported in previous meta-analyses. Effects appear to be durable for the initial months following treatment but not when followed for 1 year or more; however, more evidence is needed regarding long-term holding power. Critical examination of the evidence suggests a need for increased use of active control conditions, meaningful follow-up assessment, intent-to-treat analysis, moderator assessment, and tests of proposed mechanisms of change. Much has been accomplished in 25 years of youth depression treatment research, but important work remains for the years ahead.