Community clinic therapists were randomized to (a) brief training and supervision in CBT for youth depression or (b) usual care (UC). The therapists treated 57 youths (56% girls), aged 8–15, 33% Caucasian, 26% African-American, and 26% Latino; most youths were from low-income families; all had DSM-IV depressive disorders (plus multiple comorbidities). All youths were randomized to CBT or UC and treated until normal termination. Session coding showed more use of CBT by CBT therapists, more psychodynamic and family approaches by UC therapists. At post-treatment, depression symptom measures were at sub-clinical levels, and 75% of youths had no remaining depressive disorder, but CBT and UC groups did not differ on these outcomes. However, compared to UC, CBT was (a) briefer (24 vs. 39 weeks), (b) superior in parent-rated therapeutic alliance, (c) less likely to require additional services [including all psychotropics combined and depression medication in particular], and (d) less costly. The findings showed advantages for CBT in parent engagement, reduced use of medication and other services, overall cost, and possibly speed of improvement, a hypothesis that warrants testing in future research.
Advocates for evidence-based treatments (EBTs; e.g., National Advisory Mental Health Council Workgroup, 2001; Office of the Surgeon General, 1999, 2004; President’s New Freedom Commission, 2003) have made a case for transporting these treatments to a broad array of everyday practice contexts. This perspective may make sense, in principle. However, before major resources are devoted to large-scale dissemination, it may be wise to study the implementation process, to learn what steps are needed to transport these treatments effectively.
As several researchers have suggested (e.g., Southam-Gerow, Chorpita, Miller, & Gleacher, 2008; Southam-Gerow, Weisz, & Kendall, 2003; Weisz, Donenberg, Han, & Weiss, 1995; Weisz, 2004), there are many differences between the contexts in which EBTs are usually tested and the contexts of everyday clinical care. Bridging all these differences may not be a simple process. This point is made clear in a recent review of research on the implementation of tested programs in practice settings (Fixsen, Naoom, Blase, Friedman, & Wallace, 2005). Fixsen et al. report that many implementation efforts have not succeeded, despite evidence that the programs have worked well within controlled trials; by contrast, a few studies have shown successful implementation in practice settings when extensive, multilevel procedures are used, including careful selection of the “implementers,” thorough training, and active coaching/supervision.
The complex array of findings reviewed by Fixsen et al. (2005) suggests that extensive research will be needed to understand what implementation procedures work in any particular domain (see also Southam-Gerow, Austin, & Marder, 2008). The present study was one step toward such a body of research, focused on a specific form of treatment for depression in children and adolescents (here referred to as “youth”), as implemented in clinical practice.
The need for this research is underscored by questions about whether skilled and effective use of EBTs can be achieved in the limited time available to many practitioners. Many of the most skilled users of EBTs have built their skills over several years of graduate or postdoctoral training. Such training often occurs in specialized training clinics that focus on one treatment protocol, or a few related ones, for a narrow range of clients, and with faculty mentors and peers all concentrating on a similar skill set (see e.g., Southam-Gerow & Kendall, 2006). Such extended, intensive, and highly focused skill-building is not likely to be feasible for most practitioners, given their mandate to serve a broad array of clients and given the time constraints, work demands, and productivity pressures of everyday clinical care. Similarly, the procedures used in many clinical trials—including selection of only the most talented therapists, extensive training, skill-building through practice cases, and assigning study cases only after high skill levels are achieved in practice cases—may not be feasible in practice contexts, given the service mandates and financial and time constraints under which clinics and clinicians must now operate.
These implementation challenges are quite relevant to cognitive-behavioral therapy (CBT) for depression. CBT is widely recommended for youth depression by professional groups (e.g., American Academy of Child and Adolescent Psychiatry—see Birmaher et al., 1998), and has the most extensively replicated success in clinical trials for youth depression (see Weisz, McCarty, & Valeri, 2006). However, preparation for a CBT trial often involves extensive therapist preparation. For example, CBT therapists in an RCT by Brent et al. (1997) were required to (a) have six months of intensive training on the specific CBT treatment manual, (b) “treat two cases in adherence to the treatment model before becoming a therapist for the study” (p. 878), and (c) after passing a and b, receive 1 hour of group and 1 hour of individual supervision per week throughout their work as therapists. Because such an extensive time commitment would not be feasible for most clinicians employed in everyday practice, an important implementation question arises: Can clinicians employed in practice settings learn sufficient skills in CBT, in the time that is available to them, to be successful in treating youth depression?
Two recent studies suggest that it may be difficult to train practitioners to use CBT in ways that outperform the usual interventions provided in everyday youth mental health care. In one study, Clarke et al. (2002) compared usual care (UC) for youth depression in the Kaiser Permanente HMO (e.g., outpatient visits plus psychotropic medication) to UC plus CBT. The depressed offspring of depressed parents were treated using the Coping with Depression Course for Adolescents (CWD-A), a CBT protocol that had shown beneficial effects in two previous efficacy trials. Therapists received initial training in the CWD-A followed by supervision every other week. At the end of the study, “the authors were unable to detect any significant advantage of the CBT program over usual care” (Clarke et al., 2002, p. 305).
In another study, Kerfoot, Harrington, Rogers, and Verduyn (2004) randomly assigned social workers and other community support workers to receive training and supervision in CBT or to continue their usual care (UC) procedures, with both groups treating depressed adolescents. At the end of treatment, the adolescents in both groups had shown similar reductions on depression symptom self-report measures, with no significant difference between CBT and UC. Kerfoot et al. concluded from these findings that “training community-based social workers in [CBT] is neither practical nor effective in improving the outcomes of their clients” (p. 92).
These important studies highlight the challenge of moving CBT into community service settings, particularly when the task includes training professionals in their first-ever use of CBT, and when CBT is pitted against UC in which professionals use familiar procedures and do their best to help their young clients improve. The research to date has taken valuable steps toward investigating whether there are training and supervision procedures through which CBT might produce more beneficial effects than UC. As a next step, we set out to (a) use a more complete effectiveness design than has been tried thus far, (b) use a fully randomized experimental procedure, and (c) avoid built-in dose differences that might favor CBT. Given these goals, our study design differed from Clarke et al. (2002) and Kerfoot et al. (2004) in significant ways:
We used clinically representative treatment settings, youths, and therapists. All sessions took place within routine outpatient care in public mental health clinics; we included only youths who had been referred through normal pathways, contacting them only after their families had called the clinics to seek services; and we included only clinicians who were already employed as therapists in those clinics. By contrast, the setting for CBT in Clarke et al. was a research center, with UC done in HMO offices; and in Kerfoot et al. both CBT and UC were done in various community social service settings. Youths in Clarke et al. were recruited from an HMO database, and Kerfoot et al. had study therapists recruit their own cases. Clinicians in Clarke et al. were research center staff for CBT and HMO clinical staff for UC; clinicians in Kerfoot et al. were social workers and community support workers.
Our experimental design was structured to create as fair and balanced a comparison as possible between CBT and UC; we randomized therapists to CBT or UC, then randomized youths to CBT or UC. By contrast, Clarke et al. (2002) randomized youths but not therapists; and Kerfoot et al. (2004) randomized social workers to CBT or UC, but youths were then recruited by the therapists, precluding youth randomization.
We also sought to avoid built-in dose differences that would favor CBT over UC (we tracked treatment duration, but left it free to vary). By contrast, the CBT condition in Clarke et al. (2002) was CBT plus usual HMO services (including medication)—a design that offers advantages but does create a built-in dose difference favoring CBT over UC.
To minimize conflict with clinic productivity rules, we limited initial CBT training to one day and stressed learning via weekly case supervision. By contrast, Clarke et al. and Kerfoot et al. mainly emphasized initial training—30 hours in Clarke et al., three days in Kerfoot et al. After training, Clarke et al. provided supervision every other week for 30 minutes in year 1, 15 minutes in year 2, with most supervision focused on attendance issues (personal communication, G. Clarke, 3/23/07). After their training, Kerfoot et al. offered voluntary supervision but had poor clinician participation (median number of supervision meetings attended was three; almost a third of therapists attended none or one).
The CBT program in the present study was Primary and Secondary Control Enhancement Training (PASCET; Weisz, Thurber, Sweeney, Proffitt, & LeGagnoux, 1997). A previous RCT (Weisz et al., 1997) had shown beneficial effects of PASCET with elementary to middle school youths who had elevated depression symptoms, and PASCET was especially appropriate for this study because (a) it included the most common core elements of CBT [e.g., activity selection, cognitive restructuring, relaxation training], (b) it was designed to fit the relatively broad age range needed in this study, (c) it was designed to accommodate the variations in pace and session attendance often seen in outpatient care, and (d) its focus on surveying multiple depression coping skills and then identifying and practicing a few “best fit” skills appeared to fit the array of different forms of youth depression seen in outpatient settings (see e.g., Weiss et al., 1991).
To encompass outcomes across a relatively broad spectrum, we built on the assessment model of Hoagwood, Jensen, Petti, and Burns (1996), examining CBT vs. UC group differences on clinical outcomes, consumer response (i.e., therapeutic alliance), treatment duration, use of additional clinical services beyond psychotherapy, and cost. Our study appears to be the first trial of CBT for youth depression to encompass all three major effectiveness trial dimensions (i.e., clinically representative treatment setting, referred youths, and psychotherapists); the rigor of a fully randomized design (i.e., with both therapists and youths randomized to CBT or UC); and broad assessment encompassing clinical process, clinical outcome, and cost. We tested the hypotheses that CBT would prove superior to UC on clinical outcomes and therapeutic alliance, with lower cost and less need for additional services than UC. The primary study outcome was depression symptomatology, assessed via youth- and parent-report measures.
The sample included 57 youths, 56% girls, aged 8–15 years (M = 11.77, SD = 2.14); 33% were Caucasian, 26% African-American, 26% Latino, 11% “mixed/other”, and 4% not reporting. Some 34% of our families reported yearly income at or below $15,000; 32% between $15,000 and $30,000; 14% between $30,000 and $45,000; 12% between $45,000 and $60,000; and 8% above $60,000 (seven families did not report income). Primary diagnoses (based on DISC 4.0 combined parent and youth report) were 56% major depressive disorder (MDD), 12% dysthymic disorder (DD), and 32% minor depressive disorder (MinDD; American Psychiatric Association [APA], 1994). Comorbidity with other disorders was high (see Table 1). For example, 14% of the sample met criteria for conduct disorder, 60% oppositional defiant disorder, 47% attention deficit hyperactivity disorder, 19% generalized anxiety disorder, and 39% separation anxiety disorder. There were no significant condition differences on any of these variables or on number of disruptive, anxiety, or total comorbid disorders (all ps > .48).
Therapists (n=26 in CBT, 28 in UC) averaged 32.0 years of age (SD = 7.02, range 25–55), and 75% were female; 43% were Caucasian, 34% Latino, 13% Asian/Pacific, 6% mixed ethnicity, and 4% African-American. Some 22% were social workers, 14% doctoral-level psychologists, 56% masters level psychologists, and 8% other masters level professionals (e.g., Marriage and Family). Therapists averaged 4.30 (SD = 1.70) years of training and 2.40 (SD = 3.50) years of additional professional experience prior to the study. Previous research in community samples (e.g., Weersing, Weisz, & Donenberg, 2002) has reported similar clinician professional and demographic characteristics. CBT and UC therapists did not differ significantly on any of these characteristics, suggesting successful randomization.
Participants were routine referrals to seven public urban community mental health clinics within the most populous US county (US Department of Commerce, 2003). Enrollment began in 1998 and ended in 2003, with assessments ending in 2005. If families’ initial request for services included any mention of internalizing symptoms, and youth age was 8–15 years, families were told about the study. Given permission, a phone screen followed. Those who passed the phone screen (i.e., positive for any relevant symptom, no signs of psychotic or developmental disorders) were invited to a project interview (see below). If the interview identified a diagnosis of MDD, DD, or MinDD, we then assessed treatment priority (see below) to determine whether to invite the family to participate.1 Figure 1 shows the flow of enrollment using the CONSORT format. Of the 268 youths assessed for eligibility, 185 did not meet study criteria2; 26 who were eligible did not enroll, most often because the family decided not to start therapy at the clinic.
To determine treatment priority, we obtained Diagnostic Interview Schedule for Children (DISC-IV; Shaffer et al., 2000) DSM-IV (APA, 1994) diagnoses, disorder symptom counts, and symptom-report measures (described below) separately from parents and youths. Parents and youths also reported their top three reasons for seeking services (parent examples: “She is unhappy all the time,” “dark clouds that she goes through.” Youth examples: “bad mood,” “I’m sad”) and gave a severity rating (0–10 scale) associated with each of these. These diagnostic, symptom, referral problem, and severity data were discussed by project staff, senior clinic staff, and family; if it was agreed that the depressive disorder had treatment priority, the youth was invited to enroll in the trial.3
Medication decisions followed usual clinic procedures, guided by clinic staff psychiatrists. This fit our goal of adhering to usual clinic procedures wherever possible, and clinic staff were unwilling to relinquish control over medication, in any event. We recorded medication use and included the data in analyses (as discussed later).
By agreement with clinics, project assessments were conducted before and after treatment (T1 and T2). Assessments were carried out by interviewer pairs including one clinical psychology graduate student and one post-BA research assistant, both blind to condition. Assessments were highly standardized, so instead of assessing inter-interviewer reliability, we focused on ensuring that standardized procedures were followed strictly. Interviewer training by post-doctoral staff included didactics, modeling, and individually supervised practice interviews; videotapes of full completed interviews were reviewed throughout the study to prevent drift and ensure adherence to standardized protocols.
T1 assessments were done prior to the start of therapy. Timing of T2 assessments reflected the fact that the study was a hybrid of efficacy and effectiveness research (e.g., Southam-Gerow et al., in press). Because treatment duration was free to vary in UC and in CBT, post-treatment assessment could not be set at the same time points for all participants (as in an efficacy trial with fixed treatment duration). Assessment at any fixed point (e.g., 16 weeks) would have included some youths who had terminated and others who had not; moreover, some administrators and clinicians objected to assessments during treatment, believing they would alter the therapy process or relationship. So, T2 assessments were set at the end of treatment for all participants. This did not control for duration, but it did ensure that assessment would come after the treatment—be it CBT or UC—had been given a full opportunity to produce effects. The median lag between the date of the last session and the date of the T2 assessment was 59 days (32 for CBT, 63 for UC), reflecting the fact that termination decisions were often not announced by families (only becoming evident after several weeks of non-attendance), and the fact that scheduling was difficult for our low-income sample, often requiring substantial advance planning (e.g., to arrange time off work).
Treatment research has been criticized for relying on too narrow a range of outcomes (esp. diagnoses and symptoms; see Hoagwood et al., 1996; Kazdin & Weisz, 1998). To address the concern, we built on the Hoagwood et al. (1996) model, assessing three outcome dimensions: (a) symptoms and diagnoses; (b) consumer perspectives, including youth and parent ratings of therapeutic alliance; and (c) treatment process and systems impact—including duration, cost, and use of other clinical services. Following CONSORT principles, depression symptom reduction was the primary outcome of interest in the study, with the other measures regarded as secondary.
To assess diagnoses and symptoms, we used the Diagnostic Interview Schedule for Children (DISC; Shaffer et al., 2000), version 4.0. Earlier versions generated extensive reliability and validity data (e.g., Rubio-Stipec et al., 1996; Schwab-Stone et al., 1996; Shaffer et al., 1996), and the DISC team has continued to assess psychometrics of the DISC 4.0 (e.g., Shaffer et al., 2000). We used DISC 4.0 to obtain (a) youth-report diagnoses and symptom counts, (b) parent-report diagnoses and symptom counts, and (c) combined diagnoses and symptom counts. To obtain combined values, we counted a symptom as positive if either youth or parent endorsed it. To reduce youth burden and focus their reports on the categories thought to be most validly reported by youths (as per consultation with DISC authors), we used only parent-report modules for oppositional defiant disorder, conduct disorder, and ADHD. Other DISC modules were administered to both parents and youths.
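The either-informant rule for combined symptom counts can be sketched in a few lines. This is a minimal illustration, not the DISC scoring software: the function name and the symptom labels are hypothetical, and each report is assumed to be a mapping from symptom to a yes/no endorsement.

```python
def combined_symptom_count(youth_report, parent_report):
    """Count symptoms endorsed by the youth OR the parent (either-informant rule).

    Both arguments are dicts mapping a symptom label to True/False.
    A symptom missing from one report is treated as not endorsed there.
    """
    symptoms = set(youth_report) | set(parent_report)
    return sum(
        1
        for s in symptoms
        if youth_report.get(s, False) or parent_report.get(s, False)
    )


# Hypothetical symptom labels, for illustration only.
youth = {"sad_mood": True, "sleep_problems": False, "fatigue": False}
parent = {"sad_mood": False, "sleep_problems": True, "fatigue": False}
print(combined_symptom_count(youth, parent))  # 2
```

Under this rule the combined count can never be lower than either single-informant count.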
The 27-item CDI is a widely used youth self-report measure of depressive symptoms. In clinical samples, Cronbach’s alphas for the CDI have ranged from .71 to .89, and test-retest reliability coefficients have ranged from .50 to .87 (see Kovacs, 2003). The CDI and several other measures of youth depression have been found to correlate at .5 and higher (e.g., McCauley, Mitchell, Burke, & Moss, 1988; see Kovacs, 1992 for review). We used the CDI Total Depression scale for this study.
To parallel the youth-report CDI, we used the parent-report CDI-P. This measure has shown good test-retest reliability and internal consistency (Wierzbicki et al., 1987; Kazdin, French, & Unis, 1983), and has distinguished children with depressive disorders from children with other diagnoses (Kazdin et al., 1983). Some research has shown substantial correlations between parent and youth versions of the CDI in non-clinical populations (e.g., Slotkin et al., 1988; Wierzbicki et al., 1987), but the relation between parent and youth reports of youth depression is often weak (Kazdin et al., 1983), which argues for assessing both perspectives.
The CBCL is a widely used 118-item scale that obtains parent ratings for an array of behavioral and emotional problems. Extensive reliability and validity evidence has been reported (see Achenbach, 1991a). We focused mainly on the broad-band Internalizing scale and two narrow-band scales: Anxious-Depressed and Withdrawn-Depressed, all reported as T-scores with mean 50 and SD 10.
Youth and parent forms of the ETOS were used to assess UC vs. CBT differences in pre-treatment expectations (Sample item: “How do you expect to feel when therapy is over?” 1=much worse, 9=much better.). ETOS scores have been found to be significantly related to (a) the information youngsters and parents receive prior to treatment, and (b) therapists’ expectations at therapy onset (Bonner & Everett, 1982, 1986).
In the post-treatment assessment, we used the 7-item TASC (Shirk & Saiz, 1992) youth-report and parent-report forms to assess youth and parent therapeutic alliance with the therapist. Hawley and Weisz (2005) have reported good internal consistency and test-retest reliability for the TASC among clinic-referred youths (alpha = .93, r = .79) and parents (alpha = .81, r = .82).
A key question to ask about any treatment is whether it reduces the need for other services. For this question, we used the SACA (Horwitz et al., 2001), a standardized parent-report interview assessing use of multiple mental health services (outpatient, inpatient, and other). SACA reliability and validity are well-documented (Hoagwood et al., 2000; Horwitz et al., 2001; Stiffman et al., 2000).
All the clinics kept detailed records on study clients, following standard protocol required for reporting and billing. The records contained diverse information including treatment duration, session attendance, no-shows/cancellations, and other data required for service and cost documentation. Three reviewers recorded data from these records. Each record was coded by two reviewers, with discrepancies resolved by consulting the original record.
We assigned youths to either UC or CBT, using block randomization (e.g., Friedman, Furberg, & DeMets, 1998), to support balance on clinic, youth gender, and bilingual therapist requirement (i.e., parent preferred Spanish). Youths with identical status on the three dimensions were blocked and assigned randomly within blocks. To illustrate, the first bilingual requirement boy in clinic A was randomly assigned to either CBT or UC; the next bilingual requirement boy in clinic A was assigned to the other condition. This procedure provided reasonable but not perfect balance across conditions, because randomization occurred at the clinic level, each clinic had four blocks, and some blocks were left incomplete at the end of the study.
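The youth assignment scheme described above (blocks of two within each clinic, gender, and bilingual-requirement stratum) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the study's actual procedure or software: the class name, method names, and seed argument are hypothetical.

```python
import random


class BlockRandomizer:
    """Blocks of two within each stratum (clinic, gender, bilingual requirement).

    The first youth in a stratum is assigned at random; the next youth in
    the same stratum receives the other condition, which closes the block.
    """

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._pending = {}  # stratum -> condition owed to the next youth

    def assign(self, clinic, gender, bilingual):
        stratum = (clinic, gender, bilingual)
        if stratum in self._pending:
            # Second member of the block: take the remaining condition.
            return self._pending.pop(stratum)
        first = self._rng.choice(["CBT", "UC"])
        self._pending[stratum] = "UC" if first == "CBT" else "CBT"
        return first

    def open_blocks(self):
        """Number of blocks still waiting for their second youth."""
        return len(self._pending)


randomizer = BlockRandomizer(seed=2024)
print(randomizer.assign("A", "boy", True))
print(randomizer.assign("A", "boy", True))   # always the other condition
print(randomizer.assign("A", "girl", False)) # starts a new block
print(randomizer.open_blocks())              # 1
```

Two genders by two bilingual statuses give four strata per clinic, which matches the four blocks per clinic mentioned above. Any block still open when enrollment ends leaves one condition with an extra youth, which is why balance is reasonable but not perfect.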
To ensure comparability of therapists in the two conditions, we randomly assigned all therapists to either UC or CBT, using a randomized block procedure (see Friedman et al., 1998) to balance the conditions on inclusion of bilingual therapists and on representation of disciplines—psychologists, social workers/MFCCs, and psychiatrists. Note that we adhered strictly to therapist randomization, not excluding any CBT therapists, regardless of their performance in training or competence with CBT during therapy.
Power was calculated for between- and within-group comparisons, using actual sample sizes and assuming alpha = .05 (Lenth, 2006). Between-group power was based on a two-group t-test model with equal variances; power was .80 to detect an effect size d = .76 difference. For within-group comparisons, we used a paired-group t-test model with equal variances; power was .80 to detect d = .51 for the CBT condition, and .80 to detect d = .58 for the UC condition. To detect a medium effect size of .50, power for between-group comparisons was .45. Power for within-group comparisons was .67 for the UC condition, .78 for CBT.
UC therapists agreed to use the treatment procedures they used regularly and believed to be effective in their clinic practice. Therapy continued in UC until a normal client termination. CBT therapists were trained to use the Primary and Secondary Control Enhancement Training (PASCET) program (Weisz, Thurber, Sweeney, Proffitt, & LeGagnoux, 1997). PASCET is built on findings concerning cognitive and behavioral features of, and beneficial treatments for, youth depression (e.g., Lewinsohn et al., 1990; Stark, Reynolds, & Kaslow, 1987), and on the two-process model of perceived control and coping (see Rothbaum, Weisz, & Snyder, 1982; Weisz, Rothbaum, & Blackburn, 1984a,b). In this model, primary control involves coping by making objective conditions (e.g., school grades, peer relationships) fit one’s wishes. Secondary control involves coping by adjusting oneself (e.g., one’s expectations, interpretations) to fit objective conditions, influencing their subjective impact without altering the actual conditions. Depression is addressed by applying primary coping to distressing conditions that are modifiable and secondary coping to conditions that are not. The goal for each youth is to learn and try out a set of primary and secondary control skills, identify a subset of the skills that work best for that individual, then practice applying those “best-fit skills” in situations that trigger depressive symptoms. Sessions are mainly individual therapy, but with periodic parent meetings and with brief summary meetings for youth, parent, and therapist after each individual youth session. Therapists are guided by a manual (Weisz et al., 1999) and youths by a parallel practice book. The expanded PASCET manual used for this study contains detailed plans for 10 individual sessions and outlines to guide up to 5 more sessions, but treatment can be extended flexibly for youths who need more than 15 sessions to learn and practice the concepts and skills fully.
Training and supervision time had to fit the constraints of busy clinics, including clinician productivity requirements. Based on these constraints, therapists randomized to PASCET received a one-day, six-hour training, then about 30 minutes of weekly case supervision in use of the protocol from one of six doctoral level clinical psychologists who had prior experience with PASCET. Whenever possible, we used group supervision, targeting 30 minutes of case discussion per therapist. This fit routines in all the clinics, where all study therapists—PASCET and UC—received regular supervision from clinic staff; regular supervision time was typically reduced for PASCET therapists to accommodate their project-related supervision.4 Supervision attendance was good, with virtually all scheduled meetings held or rescheduled.
Both PASCET and UC treatment sessions were recorded, then coded for protocol adherence, characterization of UC, and treatment differentiation, using two coding systems.
To gauge PASCET protocol adherence, we used the PBA, which contains 16 items (rated Present/Absent) reflecting whether or not the session included specific elements of PASCET session procedures. Sessions were randomly selected from sessions #1–10—i.e., the structured sessions featuring primary and secondary control skills (other sessions lacked the prescriptive content required for adherence coding). We restricted coding of tapes to those cases with at least 6 session tapes (n = 30). From this sub-sample, we randomly selected 50% of the cases; one of two expert raters coded all available tapes for each of the cases. Some 25% of coded tapes were selected randomly and coded by the second rater; kappa was 1.00 for inter-rater agreement.
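For readers unfamiliar with the agreement statistic, the sketch below computes Cohen's kappa for two raters coding items as present or absent. It assumes the reported kappa is Cohen's kappa, which the text does not state explicitly; the function name and the example ratings are hypothetical.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("ratings must be non-empty and of equal length")
    # Observed proportion of items on which the raters agree.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal rates.
    categories = set(rater_a) | set(rater_b)
    p_chance = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when both raters use one category")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical present (1) / absent (0) codes for four checklist items.
print(cohens_kappa([1, 1, 0, 1], [1, 1, 0, 1]))  # 1.0
```

A kappa of 1.00 means the two raters agreed on every doubly coded item.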
We used the TPOCS-S to characterize the therapy provided in UC, and to assess differentiation between the PASCET and UC conditions. TPOCS-S items are designed to represent prominent therapeutic approaches in youth psychotherapy. We used four TPOCS-S subscales: CBT (14 items—e.g., Cognitive Distortions); Psychodynamic (4 items—e.g., Transference); Family (5 items—e.g., Parenting Style); Client-Centered (4 items—e.g., Client Perspective). TPOCS-S coders make 7-point extensiveness ratings on each item—i.e., for the extent to which the therapist used each therapeutic intervention during an entire session.
Four graduate students and one Ph.D. clinical psychologist formed the coding team. Two sessions from each case (excluding first and last) were randomly sampled if the case had 20 sessions or fewer (otherwise, 3 were coded). This produced 94 coded sessions (51 UC, 43 PASCET; the counts differ because UC therapy had more sessions). Independent coding of 53 sessions showed good item level inter-rater reliability (mean intraclass correlation = .71, SD = .14).
Analyses involved (a) addressing missing data, (b) data reduction, (c) tests of group comparability, (d) primary analyses testing treatment group effects, and (e) post hoc analyses. All analyses used the intent-to-treat (ITT) sample. For measures with fewer than 15% missing items, we scaled the sum of completed items to fit the scores of the completed measures (e.g., if two of the 27 CDI items were missing, we multiplied total score by 27/25). Participants missing a measure at any time point were excluded from analyses with that measure at that time point.
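The proration rule for partially completed measures can be written out as a short sketch. The function name is hypothetical, unanswered items are represented as `None`, and returning `None` for measures at or above the 15% threshold is an assumption about how such measures were handled.

```python
def prorated_total(responses, max_missing_fraction=0.15):
    """Scale the sum of answered items up to the full length of the measure.

    `responses` holds one entry per item, with None for unanswered items.
    Returns None (treated as a missing measure; an assumption) when the
    fraction of missing items is not below `max_missing_fraction`.
    """
    n_items = len(responses)
    answered = [r for r in responses if r is not None]
    if n_items == 0 or not answered:
        return None
    n_missing = n_items - len(answered)
    if n_missing / n_items >= max_missing_fraction:
        return None
    return sum(answered) * n_items / len(answered)


# A 27-item measure with 2 items unanswered: the sum is multiplied by 27/25.
cdi = [1] * 25 + [None, None]
print(prorated_total(cdi))  # 27.0
```

Proration is equivalent to replacing each missing item with the respondent's own mean on the items that were answered.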
Because we sought to make inferences about latent constructs rather than specific outcome measures, and to reduce the number of significance tests, we used EFA to identify latent factors underlying the parent- and youth-report depression measures. Tests of normality showed generally normal distributions, with only youth-report CDI showing moderate positive skew. Maximum-likelihood estimation produced two factors based on a scree-test. We used an oblique promax rotation (Hendrickson & White, 1964) to identify the two-factor solution with the simplest structure. One factor, parent-reported depression, included all parent-report measures: CDI-P (loading: .89), DISC-P depression symptoms (.53), CBCL withdrawn-depressed (.69), and CBCL anxious-depressed (.74). The other factor, youth-reported depression, included both youth-report measures: CDI (.99) and DISC-C depression symptoms (.37). Although the distinction between factors is clear, the .99 loading of the CDI on the youth-report factor suggests caution in interpreting loadings on each factor (Van Driel, 1978). Using unit weighting (Kline, 2004), we averaged standardized values of each factor’s measures to compute a factor score for each participant. To obtain post-treatment factor scores on metrics comparable to those of the pre-treatment factors, we standardized each of the post-treatment factor’s measures using the measures’ pre-treatment means and SDs.
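The unit-weighted scoring step can be illustrated with a short sketch. It assumes complete data, the sample (n - 1) standard deviation, and the same participant order in every list; the function name, measure labels, and scores are hypothetical and are not study data.

```python
from statistics import mean, stdev


def unit_weighted_scores(pre, post):
    """Return (pre_scores, post_scores), one factor score per participant.

    `pre` and `post` map each measure name to a list of raw scores.
    Every measure is z-scored with its PRE-treatment mean and SD, so the
    post-treatment scores stay on the pre-treatment metric.
    """
    params = {m: (mean(v), stdev(v)) for m, v in pre.items()}

    def score(data):
        n = len(next(iter(data.values())))
        return [
            mean((data[m][i] - params[m][0]) / params[m][1] for m in data)
            for i in range(n)
        ]

    return score(pre), score(post)


# Hypothetical scores for three participants on two youth-report measures.
pre = {"cdi": [10, 20, 30], "disc_symptoms": [2, 4, 6]}
post = {"cdi": [10, 10, 20], "disc_symptoms": [2, 2, 4]}
pre_scores, post_scores = unit_weighted_scores(pre, post)
print(pre_scores)
print(post_scores)
```

Because the post-treatment values are not re-centered on their own mean, a drop in the average factor score from pre to post reflects actual symptom change rather than an artifact of standardization.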
As a context for later analyses of depression measures, we assessed the success of randomization in producing similar PASCET and UC groups, the content of the treatment sessions, duration of treatment in PASCET and UC, and patterns of session attendance.
To test whether randomization created comparable PASCET and UC groups, we compared the groups at T1, using t-tests for continuous measures and chi square tests for categorical measures. Our tests included demographic measures (ethnicity, gender, family income), youth-report clinical measures (CDI, DISC depression symptom count, youth depression factor score), parent-report clinical measures (parent CDI, DISC depression symptom counts, DISC symptom counts for externalizing disorders, CBCL anxious-depressed, CBCL withdrawn, CBCL externalizing, parent depression factor score, ETOS), and combined measures (DISC parent-plus-youth combined diagnoses). The two groups did not differ significantly on any pretreatment measure (see Table 1).6
The PBA scale used to gauge PASCET adherence contains 16 items rated present or absent. For the PASCET cases coded, the mean proportion of required elements present in the sessions was .98 (range: .88 – 1.00, SD: .01).
Interventions used by UC therapists showed high variability, with essentially no use of any PASCET-specific protocol procedures. Thus, the TPOCS-S was used to characterize UC in terms of the broad therapeutic approaches employed, including CBT, Psychodynamic, Family, and Client-Centered. UC therapy sessions were rated higher on client-centered (M = 3.58, SD = .77) than on CBT (M = 1.62, SD = .45; t(19) = 9.00, p < .001), psychodynamic (M = 1.73, SD = .74; t(19) = 12.28, p < .001), and family (M = 2.12, SD = .96; t(19) = 5.04, p < .001) interventions. UC sessions also had higher ratings on Family intervention than on CBT, t(19) = 2.34, p < .05. In general, UC therapists used interventions from multiple theoretical orientations, with an emphasis on non-behavioral approaches.
We compared PASCET and UC via TPOCS-S coding of 20 PASCET and 20 UC cases. PASCET sessions showed more CBT than UC sessions; UC scored higher than PASCET on the Psychodynamic and Family subscales (see Table 2).
Mean number of sessions was 20.52 (SD = 16.07) for UC and 16.45 (SD = 6.07) for PASCET, t(29.53) = 1.20, p = .24, d = .34. Mean treatment duration was 39.26 weeks (SD = 28.98) for UC and 25.20 weeks (SD = 15.40) for PASCET, t(34.75) = 2.19, p = .04, d = .60. For the 75% of the sample that had no depressive disorder at post-treatment (see below), mean duration was 42.39 weeks (SD = 29.44) for UC and 24.19 weeks (SD = 16.60) for PASCET, t(29.66) = 2.42, p = .02, d = .79.
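The fractional degrees of freedom in these comparisons come from the Welch statistic (see Footnote 5). A minimal sketch of that calculation from summary statistics follows. The group sizes used here are illustrative placeholders, since this section does not report them, and the effect-size helper assumes d is computed with the root-mean-square of the two SDs, which may differ from the pooling the authors used.

```python
from math import sqrt

def welch_t(m1, sd1, n1, m2, sd2, n2):
    """Welch's t and Welch-Satterthwaite df from group means, SDs, and sizes."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2  # squared standard errors
    t = (m1 - m2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

def cohens_d_rms(m1, sd1, m2, sd2):
    """Cohen's d using the root-mean-square SD (an equal-n assumption)."""
    return (m1 - m2) / sqrt((sd1 ** 2 + sd2 ** 2) / 2)

# ASSUMPTION: hypothetical group sizes, not taken from this section.
N_UC, N_PASCET = 25, 31

# Number of sessions: means and SDs as reported in the text above.
t, df = welch_t(20.52, 16.07, N_UC, 16.45, 6.07, N_PASCET)
d = cohens_d_rms(20.52, 16.07, 16.45, 6.07)
print(f"t({df:.2f}) = {t:.2f}, d = {d:.2f}")
```

When the two variances are very unequal, as with the session counts here, the Welch df falls well below n1 + n2 - 2, which is why the reported df are far smaller than the total sample size would suggest.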
UC families canceled sessions marginally more often and no-showed nonsignificantly more often than PASCET families (Table 3). The total number of missed sessions (cancellations + no-shows) was marginally higher for UC than for PASCET, t(29.14) = 1.70, p = .10, d = .49.
We compared UC and PASCET groups using three dimensions of the Hoagwood et al. (1996) model: Diagnoses/symptoms, systems, and consumer perspectives. We used analyses of covariance for all continuous measures, with treatment entered as the IV and pretreatment score as a covariate. We considered HLM, but this was rejected because our sample size would not have produced stable HLM estimates, and we only had assessment data from two time points.
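The core idea of an ANCOVA with pretreatment score as covariate can be sketched as covariate-adjusted group means: each group's post-treatment mean is shifted along a pooled within-group slope to the value expected at the grand pretreatment mean. This is only an illustration under that textbook formulation. It omits the F test and the clinic random effect used in the study, and the function name and all data below are hypothetical.

```python
from statistics import mean

def ancova_adjusted_means(groups):
    """Return (adjusted_means, pooled_slope).

    groups: dict mapping group name -> (pre_scores, post_scores).
    """
    sxy = sxx = 0.0
    all_pre = []
    for pre, post in groups.values():
        mx, my = mean(pre), mean(post)
        # Accumulate within-group cross-products and sums of squares.
        sxy += sum((x - mx) * (y - my) for x, y in zip(pre, post))
        sxx += sum((x - mx) ** 2 for x in pre)
        all_pre.extend(pre)
    slope = sxy / sxx              # pooled within-group regression slope
    grand_pre = mean(all_pre)      # grand mean of the covariate
    adjusted = {
        name: mean(post) - slope * (mean(pre) - grand_pre)
        for name, (pre, post) in groups.items()
    }
    return adjusted, slope

# Hypothetical factor scores (not study data).
demo_groups = {
    "PASCET": ([0.5, -0.2, 1.0, 0.1], [-0.8, -1.2, -0.3, -1.0]),
    "UC": ([0.3, -0.5, 0.8, -0.1], [-0.9, -1.5, -0.4, -1.1]),
}
adjusted_means, pooled_slope = ancova_adjusted_means(demo_groups)
```

If randomization has produced groups with similar pretreatment means, the adjustment is small and the main benefit of the covariate is reduced error variance.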
All youths had met criteria for a depressive disorder at pretreatment; 75% showed no depressive disorder at post-treatment. There was no significant difference between PASCET and UC in the percentage who shed their diagnosis, regardless of whether we focused on any depressive disorder (73.3% vs. 77.3%) or on MDD, MinDD, or DD separately. ANCOVAs, with clinic as a random effect, used youth-report and parent-report factor scores as composite indices of depression symptomatology, adding other measures of interest (see Table 3). None of the ANCOVAs showed a significant effect of treatment group.
Analyses of additional services provided during the treatment phase of the study (Table 4) showed that UC families were more likely than PASCET families to receive additional mental health services (e.g., treatment from a second therapist, school-based services, psychotropic medication). Compared to PASCET youths, more UC youths used all psychotropics combined, and more UC youths used depression-specific psychotropics.
An increasingly important consideration in youth treatment, from a systems perspective, is cost. Guided by clinic administrators, we estimated cost of treatment for each youth by adding the costs (obtained from the internet for medications [average manufacturer’s price], and from clinic administrators for all other elements) of (a) psychotherapist time for sessions; (b) psychiatric staff costs for medication assessment, prescription, and monitoring/titration sessions; (c) medications for depression; and (d) weekly clinic administrative costs of keeping a case open and keeping records current (this includes ongoing costs during weeks when sessions are missed, such as continued updating of clinic records, staff calls to families who miss sessions, efforts to reschedule the next session, and reminder calls before sessions). Mean cost per case was significantly higher for UC ($4,930.56) than for PASCET ($3,221.34), as shown in Table 3.
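The four cost components listed above combine additively, which can be sketched as follows. The class names, field names, and every unit cost in this example are hypothetical placeholders. The actual rates came from clinic administrators and published medication prices and are not given in this section.

```python
from dataclasses import dataclass

@dataclass
class CaseUsage:
    therapy_sessions: int      # (a) psychotherapy sessions attended
    psychiatric_sessions: int  # (b) medication assessment/monitoring visits
    medication_cost: float     # (c) total cost of depression medications
    weeks_open: int            # (d) weeks the case was kept open

@dataclass
class UnitCosts:
    per_therapy_session: float
    per_psychiatric_session: float
    per_week_admin: float

def case_cost(usage: CaseUsage, rates: UnitCosts) -> float:
    """Sum components (a) through (d) for one youth."""
    return (
        usage.therapy_sessions * rates.per_therapy_session
        + usage.psychiatric_sessions * rates.per_psychiatric_session
        + usage.medication_cost
        + usage.weeks_open * rates.per_week_admin
    )

# ASSUMPTION: illustrative rates and usage only, not the study's figures.
rates = UnitCosts(per_therapy_session=100.0,
                  per_psychiatric_session=150.0,
                  per_week_admin=5.0)
example = case_cost(CaseUsage(10, 2, 50.0, 20), rates)
```

Note that component (d) accrues for every week a case stays open, including weeks with missed sessions, so longer treatment duration raises cost even when the number of sessions is similar.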
Youths’ TASC-C scores did not differ reliably by condition. However, on the TASC-P, PASCET parents rated therapeutic alliance as significantly stronger than UC parents did (see Table 3).
Because UC showed considerable variability across the sample, with some representation of CBT, client-centered, psychodynamic, and family approaches, we sought to learn whether any of these approaches might predict depression reduction in UC. We conducted hierarchical regressions predicting the parent-report and youth-report depression factor scores from the four TPOCS-S subscales. For each regression analysis, the baseline score on the outcome measure was entered prior to the TPOCS-S subscale, to control for initial severity. Three of the subscales failed to predict, but the Psychodynamic subscale predicted reduced parent-reported depression (β = −.44, R² Change = .20, p < .05), F(2, 14) = 4.11, p < .05, R² = .37. Post-hoc analyses showed two of the four items on the Psychodynamic subscale to be significant predictors: Transference [e.g., noting how youth’s interactions with the therapist resemble other relationships in the youth’s life] (β = −.52, R² Change = .24, p < .05), F(2, 14) = 5.37, p < .05, R² = .43; and Explores Past [e.g., noting a connection between past behavior and experiences and current youth behavior and experiences] (β = −.44, R² Change = .19, p < .05), F(2, 14) = 4.13, p < .05, R² = .37.
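The two-step logic of these hierarchical regressions (baseline first, then one TPOCS-S subscale) can be sketched with the standard two-predictor formula for R squared based on pairwise correlations. This is an illustration only: the function names are hypothetical, and the small data set is constructed for the example rather than drawn from the study. The overall F helper uses the usual relation between R squared and F.

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def r2_change(outcome, baseline, predictor):
    """Return (R2 step 1, R2 step 2, R2 change) for a two-step regression."""
    r_yb = pearson(outcome, baseline)
    r_yp = pearson(outcome, predictor)
    r_bp = pearson(baseline, predictor)
    r2_step1 = r_yb ** 2  # baseline score alone
    r2_step2 = (r_yb ** 2 + r_yp ** 2 - 2 * r_yb * r_yp * r_bp) / (1 - r_bp ** 2)
    return r2_step1, r2_step2, r2_step2 - r2_step1

def overall_f(r2, n_predictors, df_error):
    """Overall F for a regression model from its R squared."""
    return (r2 / n_predictors) / ((1 - r2) / df_error)

# Constructed example: outcome is exactly baseline + predictor.
baseline = [1, 2, 3, 4]
predictor = [1, -1, -1, 1]
outcome = [b + p for b, p in zip(baseline, predictor)]
step1, step2, change = r2_change(outcome, baseline, predictor)

# Consistency check on the reported Psychodynamic model (R2 = .37, df = 2, 14).
f_reported_model = overall_f(0.37, 2, 14)
```

Plugging the reported R² of .37 into `overall_f` with 2 and 14 degrees of freedom gives a value of about 4.11, in line with the F reported for the Psychodynamic subscale model.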
We assessed the extent to which brief training and limited case consultation in PASCET would influence therapist behavior and clinical, consumer, and systems outcomes in community clinic practice. Our coding of session tapes indicated that the training and consultation had a measurable impact on therapist behavior, with PASCET sessions including coded elements of the protocol and with TPOCS-S coding showing significantly higher ratings for PASCET than UC on the use of CBT approaches, and significantly lower ratings than UC on psychodynamic and family approaches that were not in the PASCET protocol. This suggests that the training and supervision procedures did lead to implementation of the basic protocol elements. However, the quality of that implementation deserves attention (as discussed later).
On clinical outcomes, while all youths began the study with a depressive disorder, more than 70% in both conditions had no depressive disorder at post-treatment. Scores on continuous depression measures also dropped to subclinical levels—e.g., mean CDI scores in both the PASCET (M = 8.00) and UC (M = 8.22) groups dropped well below established clinical range cutpoints (see Craighead, Curry, & Ilardi, 1995; Fristad, Weller, Weller, Teare, & Preskorn, 1988; Kovacs, 2003; Timbremont & Braet, 2001). This pattern points to clinically meaningful change in depression levels from pre- to post-treatment. However, PASCET and UC youths were not significantly different at post-treatment in rates of depressive disorders or on symptom measures. On the other hand, the duration of treatment preceding these reductions in depressive disorders and symptoms was markedly shorter for PASCET than for UC youths (24 vs. 39 weeks). The overall pattern of findings could fit two different interpretations. We consider each.
One possibility is that PASCET led to faster improvement than UC. This interpretation warrants attention in the light of certain data from our study and two lines of previous research. First, we found that treatment duration was shorter for PASCET than UC in the full sample, and that the difference was somewhat larger among PASCET vs. UC youths who achieved remission. It is possible that UC, by averaging about four months longer than PASCET, may have benefited from the natural time course of depression remission. Median time to recovery from an MDD episode in clinically referred youths has been found to range from 7–9 months (see Kovacs, 1996); our average UC youth was in treatment more than 9 months. By contrast, our average PASCET case was in treatment for only six months; Kovacs (1996) reported six-month recovery rates of only 33–40%. In sum, the average duration of UC in our study approximated the median time to recovery documented by Kovacs (1996), whereas the average duration of PASCET was well below that median.
A potentially relevant line of evidence relates to the Hawthorne effect and related findings suggesting that subjecting any process to study can change that process. Data available from six of our seven participating clinics, during the decade prior to the beginning of the current study (from Weersing & Weisz, 2002), showed that average duration of UC for depressed youths was 23 weeks in the absence of a randomized trial; in the current trial, pitting UC against PASCET, UC for depressed youths averaged 39 weeks—a 70% increase over the period when there was no trial. The fact that PASCET and UC outcomes were to be compared did seem to engender a competitive climate. As an example, one UC clinician was overheard telling other UC clinicians, “We’re going to beat those manual people!” Given the natural time course of depression, outcomes could be enhanced to the extent that cases are held open for extended periods. It is possible that the significantly longer duration of UC, relative to PASCET, reflected a tendency by UC clinicians to continue treating youths until improvement was noted. Improvement may also have been boosted by the UC group’s significantly higher rate of additional services, medication in general, and antidepressants in particular, compared to the PASCET group.
Studies of youth depression treatment follow-up (e.g., Clarke et al., 1999; Birmaher et al., 2000; TADS Team, 2007) have rather consistently shown that participants continue to improve after acute intervention and that eventually all treated groups show roughly equal (and high) levels of improvement and remission. This suggests that the primary benefit of a more successful treatment may lie not in ultimate outcome but in speed of improvement. This in turn suggests a way to frame the present findings. Several lines of evidence all connect to the possibility that PASCET may have produced faster depression relief than UC: (a) our data showing that UC lasted longer than PASCET, particularly when depression remitted and especially when the therapist supported termination; (b) our data showing that duration of UC in participating clinics increased by 70% when the PASCET vs. UC trial was introduced; (c) studies of the natural time course of remission, resembling duration of UC in our study (and markedly longer than the duration of PASCET); and (d) studies of youth depression treatment follow-up suggesting that the main benefit of successful treatments lies in speed of improvement, not ultimate outcome. However, because our design did not include measurement of depression at common time points for all participants, a definitive test of this interpretation awaits further research.
We therefore note an alternative interpretation of our findings: that the absence of post-treatment condition effects on depression symptoms or diagnoses means that PASCET and UC did not differ in effectiveness in the community clinic context of the present study. If this were the case, several strands of research would be relevant. Previous findings (e.g., Brent et al., 1998; Hammen, Rudolph, Weisz, Rao, & Burge, 1999; Southam-Gerow et al., 2008; Southam-Gerow et al., 2003; Wright, Ehrenreich, Pincus, Hourigan, Southam-Gerow, & Weisz, 2007) suggest that youths referred to community clinics through normal community channels are more likely than those treated in research clinics or traditional efficacy trials to show high levels of externalizing comorbidity, poverty, and family stress; other studies have shown that effects of CBT are markedly diminished in youths with significant externalizing comorbidity (Rohde, Clarke, Mace, Jorgensen, & Seeley, 2004) and youths referred from community sources (rather than recruited through ads; Brent et al., 1998; Weersing, Iyengar, Kolko, Birmaher, & Brent, 2006). Our sample was completely community-referred, and it had high rates of externalizing comorbidity. The notion that CBT faces challenges with community-referred youths is consistent with findings of Kerfoot et al. (2004; noted in the introduction) that teaching CBT to practitioners did not improve outcomes for depressed youth referred from and treated in the community.
If it were true that PASCET did not outperform UC, two other perspectives on our findings would deserve attention. First, Fixsen et al. (2005) stressed that when a previously tested intervention is applied in a new context, null findings may reflect, not a problem with the intervention itself, but rather incomplete implementation in the new context. Fixsen et al. point out that information dissemination and training alone “repeatedly have been shown to be ineffective” (p. 70; see also, Grimshaw et al., 2001) and that “successful implementation efforts…require a longer-term multilevel approach”. Fixsen et al. note that the implementation approaches supported by evidence include: (a) skill-based training, (b) practice-based coaching, (c) practitioner selection, (d) practitioner performance evaluation, (e) program evaluation, (f) facilitative administrative practices, and (g) systems interventions. Our procedures included only (a) and (b); some of the best-supported approaches—e.g., selecting the best practitioners to do the new intervention—were ruled out by the need to create a fair experimental test of PASCET vs. UC (which required random assignment of clinicians). The evidence from Fixsen et al. suggests that our approach to examining PASCET in community clinics is a first step, with some of the best-documented requirements for implementing interventions in new settings not yet included.
A closely related point concerns the marked difference between therapists in their familiarity and skill with the treatment procedures used in the two conditions. UC therapists used their own preferred and familiar treatment procedures, which they themselves had selected. By contrast, PASCET therapists used an unfamiliar approach guided by a manual they had not selected or even seen before. The PASCET therapists’ total exposure to PASCET prior to starting treatment with study cases consisted of one six-hour training program. No PASCET therapist had any practice case prior to the first study case, and for 18 of the 24 PASCET therapists, their first PASCET case was their only study case. Most PASCET therapists did show acceptable fidelity, but fidelity measures, including ours, only assess whether the main components and skills of the manual are included in the sessions. Not covered in such measures is the competence or skill with which therapists used the components and skills. This is a study limitation, but we are not aware of a relevant measure that has been validated.
Our session videotape reviews suggested a broad range in therapist skill with PASCET; some therapists introduced treatment components and skills effectively, but many appeared uncomfortable and unnatural. Some therapists read portions of the manual to the youth, some lost track of where they were in the protocol, and others introduced key points in ways that left youngsters confused or bored. Therapist lack of skill in the protocol may have undermined PASCET effectiveness. In the future, it will be useful to assess the impact of PASCET when delivered by community clinic therapists who have gained experience and familiarity with the protocol and can deliver it comfortably and skillfully.
If it were true that PASCET did not outperform UC in reducing depressive symptoms and disorders, it would be worthwhile to ask whether approaches other than CBT might be a better fit for the therapists, youths, and settings of community clinic care. This possibility is suggested by our unexpected finding that clinical outcomes in UC were predicted, not by therapists’ use of CBT, but by their use of psychodynamic methods. Of course, our data on frequency of various approaches tell us nothing about the quality with which the approaches were used; poorly done CBT might well not predict outcomes, whereas skillfully done CBT might. Despite these caveats, the finding warrants attention in future research (see below).
Beyond symptom and diagnostic change, our findings showed a number of condition differences that have clear clinical relevance. The PASCET group, compared to UC, used significantly fewer adjunctive services, was lower in total cost, and generated higher parent ratings of therapeutic alliance. Thus, along some significant clinical and practical dimensions, PASCET showed advantages over UC that might be of real value to practitioners and provider organizations. Using fewer adjunctive services can mean more efficient intervention that does not require complex case management and liaison with other providers. Stronger alliance with parents can enhance their participation in the treatment process, and increase the likelihood of getting their children to scheduled appointments—and our findings did show marginally better session attendance by PASCET youths. Finally, shortened time in treatment, and reduced cost, can mean reduced waitlists, shorter waits, and more families served.
It is instructive to consider both limitations and strengths of our study. One limitation, a relatively small sample, reduced power to detect effects; this may have made a difference on measures showing small to medium effect sizes, but not on others. Another limitation was heterogeneity of the therapists and youths, including highly mixed problem profiles and multiple comorbidities, with numerous externalizing disorders; this certainly increased error variance, and may have undermined treatment focus on depression. In addition, the heterogeneity of UC interventions limited our ability to interpret findings in relation to a specific alternative treatment. We could have limited heterogeneity in these various forms, by (a) using only a few carefully selected therapists rather than randomly assigning all who volunteered, (b) ruling out youths with comorbidities, particularly those that might complicate treatment [e.g., ADHD, ODD, conduct disorder], and (c) designing our own homogeneous control condition rather than having UC therapists use their own familiar and preferred approaches. Our failure to do (a), (b), or (c), while problematic in some respects, can also be viewed as a strength, increasing clinical representativeness by including the therapists, youths, and treatments of true usual clinical care.
Our results highlight several objectives for research on evidence-based practice and everyday clinical care. One part of the research agenda needs to be testing practices that have worked well in efficacy trials, to determine whether these practices—or some adaptation of them—can be effective in an everyday care context. Our study suggests that such research is feasible, and it highlights several ways such research can be strengthened in the future. Our finding that psychodynamic procedures predicted outcome in UC suggests a treatment model that may warrant testing in the future and, more broadly, illustrates the potential value of carefully documenting the contents of UC (rather than regarding UC as a mere “control condition”). Our study highlights the fact that efforts to identify effective practices for everyday care must contend with complex clients, significant comorbidity, and challenging economic and time constraints in clinical care settings, which limit the time available for learning new skills. Such challenges place the development of strategies for efficient training and flexible treatment of comorbidity and an ever-changing landscape of stressors high on the research agenda.
Ultimately, the challenge of bringing evidence-based treatment into everyday practice is essentially a balancing task—i.e., ensuring sufficient therapist skill-building to produce real competence and thus beneficial effects, while ensuring that the clinician time required fits the economic realities of clinical practice. In this era of widespread calls for exporting tested treatments to practice settings, there is remarkably little research on how to accomplish that goal. The present study provides one example of an array of possible approaches (see also, for example, Chambers, Ringeisen, & Hickman, 2005; Chorpita, 2006; Chorpita et al., 2002; Mufson et al., 2004; Schoenwald & Hoagwood, 2001; Southam-Gerow, 2005; Southam-Gerow, Austin, & Marder, 2008), many more of which will be developed and tested in the years ahead.
The study was funded by National Institute of Mental Health grant R01-MH57347 (JRW, PI), and the authors were supported by additional grants from NIMH (R21-MH63302 [JRW], R01-MH068806 [JRW], K23-MH069421 [MSG], F31-MH64993 [BDM], F31-MH079631 [DAL]) and the John D. and Catherine T. MacArthur Foundation (Research Network on Youth Mental Health [JRW]).
We thank our clinic administrative colleagues for their help, in many forms, throughout the study. These include Herb Blaufarb, Thomas Ciesla, Kita Curry, Carol Falender, Anita Feltes, Susan Hall-Marley, Joseph Ho, Amy Hulberg, Cynthia Kelly, Jill Morgan, Philip Pannell, Robert Parsons, Allison Pinto, Terry Rattray, Rebecca Refuerzo, David Slay, and Marian Williams. We also thank the therapists, parents, and youths who participated in the study, as well as our project administrative and graduate student colleagues, including Amie Bettencourt, Ashley Borders, Jen Durham, Aileen Echiverri, Samantha Fordwood, Sarah Francis, Andrea Kasimian, May Lim, Tamara Sharpe, and Irina Tauber.
1We complied with APA ethical standards in the treatment of our sample.
2Why were 185 youths excluded? Note that we tried to assess all whose referral concerns included any depression or anxiety symptom; most either did not meet criteria for a depressive disorder, or met criteria for another disorder that had treatment priority (see procedure, in text). Youths for whom anxiety disorders had priority (N = 48) were invited to enroll in an anxiety trial. Youths for whom neither a depressive nor an anxiety disorder was primary (N = 137) were offered regular treatment in their clinic. Because we lacked detailed information on youths at the time of their initial clinic contact, we had to interview many who ultimately did not qualify for the study.
3In any situation where decisions are influenced by parent, youth, or staff judgment as well as standardized measures, bias of various kinds could enter the process. Thus, it is important to note that randomization to PASCET vs. UC occurred after study eligibility was determined.
4Mean supervision time per week, by therapist report, was 3.47 hours, including individual, group, treatment team and staff meeting supervision. We do not have data on the exact amount by which regular supervision was reduced for PASCET therapists.
5When group variances differed significantly (p < .05) and violated the equal variance assumption of the standard t-test, we used the relatively robust Welch statistic and its associated degrees of freedom (Welch, 1951; Blalock, 1972).
6Because our sample came from seven clinics, we conducted preliminary analyses to see if the youths differed among these clinics, using tests structured like those described in this paragraph for clinical and sociodemographic variables at T1, with clinic entered as a random effect. Using ANOVA for continuous measures and chi-square for categorical measures, we found only one significant difference: clinics differed on youth CDI scores (p = .046).
Publisher's Disclaimer: The following manuscript is the final accepted manuscript. It has not been subjected to the final copyediting, fact-checking, and proofreading required for formal publication. It is not the definitive, publisher-authenticated version. The American Psychological Association and its Council of Editors disclaim any responsibility or liabilities for errors or omissions of this manuscript version, any version derived from this manuscript by NIH, or other third parties. The published version is available at http://www.apa.org/journals/ccp/
John R. Weisz, Harvard University and Judge Baker Children’s Center.
Michael A. Southam-Gerow, Virginia Commonwealth University.
Elana B. Gordis, University at Albany, SUNY.
Jennifer K. Connor-Smith, Oregon State University.
Brian C. Chu, Rutgers, the State University of New Jersey.
David A. Langer, University of California, Los Angeles.
Bryce D. McLeod, Virginia Commonwealth University.
Amanda Jensen-Doss, Texas A&M University.
Alanna Updegraff, Akron Children’s Hospital.
Bahr Weiss, Vanderbilt University.