This study describes a multimethod evaluation of treatment fidelity to the family therapy (FT) approach demonstrated by front-line therapists in a community behavioral health clinic that utilized FT as its routine standard of care. Study cases (N = 50) were adolescents with conduct and/or substance use problems randomly assigned to routine family therapy (RFT) or to a treatment-as-usual clinic not aligned with the FT approach (TAU). Observational analyses showed that RFT therapists consistently achieved a level of adherence to core FT techniques comparable to the adherence benchmark established during an efficacy trial of a research-based FT. Analyses of therapist-report measures found that compared to TAU, RFT demonstrated strong adherence to FT and differentiation from three other evidence-based practices: cognitive-behavioral therapy, motivational interviewing, and drug counseling. Implications for rigorous fidelity assessments of evidence-based practices in usual care settings are discussed.
The goal of this study was to determine whether a community clinic that featured family therapy (FT) as its routine standard of care demonstrated fidelity to the FT approach when treating adolescent behavior problems, utilizing assessment methods that appear well-suited for efficient fidelity evaluation in usual care. Treatment fidelity is an index of the degree to which interventions are delivered in accordance with essential theoretical and procedural aspects of a given model (Hogue et al., 1998). Fidelity consists of three related components (Waltz, Addis, Koerner, & Jacobson, 1993): adherence refers to the quantity or extent to which interventions are delivered; competence refers to the quality or skill of delivery; and differentiation refers to the degree to which comparative treatment approaches differ from one another in practice based on guiding theory and prescribed interventions. Whereas the past decade has witnessed noteworthy gains in the knowledge base and technology base of fidelity evaluation in controlled research settings, parallel efforts in usual care settings have been slow in coming (Schoenwald et al., 2011).
Family therapy has matured into an empirically supported treatment approach for adolescent behavior problems (ABPs) that include conduct problems, delinquency, and substance misuse. Specific manualized FT models that have proven efficacious for ABPs include brief structural family therapy, functional family therapy, multidimensional family therapy, and multisystemic therapy (for reviews see Henggeler & Sheidow, 2012; Waldron & Turner, 2008). Although these FT models differ from one another along several dimensions of treatment focus (e.g., systemic versus behavioral) and service delivery (e.g., office-based versus home-based), they all endorse two treatment foci based on their common grounding in the FT approach: (1) emphasis on core FT intervention techniques for ABPs (for review of relevant fidelity and process research see Hogue & Liddle, 2009), which include but are not limited to: convening multiple family members in most sessions; creating a family-focused reframe of the referring problem and specifying treatment goals that are family-based; working to bring about in-session change in family interaction patterns intended to restructure problematic relationships, increase interpersonal attachments and communication, and improve family problem-solving; and working to improve parenting behaviors; and (2) an “ecological” orientation tailored to ABP youth that entails active intervention in extrafamilial systems (school, peer, community, juvenile justice) within which adolescents demonstrate clinical and developmental problems (Becker & Curry, 2008; Henggeler & Sheidow, 2012).
FT models for ABPs have accumulated a robust portfolio of treatment outcome effects across the adolescent behavioral health spectrum, with strong efficacy and effectiveness results for conduct and substance use problems in clinical samples (for a meta-analytic review see Baldwin et al., 2012). Moreover, FT process-outcome studies have demonstrated links between adherence to core FT techniques and improvements in adolescent and family functioning as well as in-session changes in parenting practices and family interactions (Hogue & Liddle, 2009). These and related findings underscore the empirical validity of the FT approach for ABPs and its potential suitability as a first-line treatment option for multiproblem adolescents in outpatient behavioral care.
To date the bulk of implementation and outcome evaluations of FT for ABPs has been conducted in controlled conditions, either as clinical trials in research settings or as effectiveness studies in community clinics that benefitted from intensive training and oversight by model experts. Unfortunately, little is known about the fidelity and potency of FT when implemented in standard treatment conditions, that is, as an evidence-based practice supported by routine supervisory and administrative resources in usual care (UC) (Kaslow, Broth, Smith, & Collins, 2012). The term “evidence-based practice” (EBP) refers broadly to intervention techniques, models, or approaches that, having been originally validated in controlled research contexts, are implemented by front-line providers in the course of everyday clinical care (McHugh & Barlow, 2010).
The current study evaluated both adherence and differentiation of FT as practiced by community therapists in UC. For treatment differentiation purposes, we compared the degree to which study therapists used FT techniques to their use of treatment techniques derived from three alternative approaches for ABPs: cognitive-behavioral therapy (CBT), motivational interviewing (MI), and drug counseling (DC). These three non-FT approaches were selected for comparison because (1) they each have a substantial base of empirical support for addressing ABPs (Becker & Curry, 2008; Chorpita et al., 2011; Eyberg, Nelson, & Boggs, 2008; Winters, Stinchfield, Latimer, & Lee, 2007; Winters, Stinchfield, Opland, Weller, & Latimer, 2000); (2) they are all widely endorsed in UC settings for treating ABP youth; and (3) like FT, CBT and MI boast several manualized treatment models that have proven efficacious for a range of ABPs (see Hogue & Liddle, 2009; Miller & Rose, 2009; Waldron & Turner, 2008), making them ideal candidates to serve as transdiagnostic interventions capable of treating the heterogeneous, multiple-disorder populations typical in front-line settings (Garland, Bickman, & Chorpita, 2010; McHugh, Murray, & Barlow, 2009). Note that these four approaches are not intended to represent wholly discrete, non-overlapping EBPs. In research settings, MI and CBT are commonly packaged as a unified multicomponent treatment for ABPs (e.g., Dennis et al., 2004), and FT models are frequently combined with CBT and/or contain behaviorally oriented interventions characteristic of the CBT approach (e.g., Henggeler, Schoenwald, Borduin, Rowland, & Cunningham, 1998; Waldron & Turner, 2008).
Assessment methods that can effectively and efficiently assess treatment fidelity in UC are urgently needed to advance EBP dissemination efforts (Schoenwald et al., 2011). Field-based evaluation of EBPs delivered in UC is complicated by several features of routine practice that present stiff challenges to rigorous implementation assessment, including the use of eclectic intervention approaches and techniques by community therapists, heterogeneous client pools, and limited resources and expertise for conducting fidelity evaluations (Garland, Hurlburt, Brookman-Frazee, Taylor, & Accurso, 2010). To overcome such challenges Garland, Hurlburt, & Hawley (2006) recommend that UC implementation studies employ a “hybrid” assessment design that targets traditional aspects of psychotherapy process research (e.g., dose, fidelity, therapeutic relationship) but also emphasizes field-flexible assessment strategies and strong collaboration with community providers to develop context-sensitive evaluation tools.
Two resource-efficient evaluation methods provide an excellent fit for hybrid implementation assessment in UC: benchmarking analyses and collection of therapist self-report data. Benchmarking studies typically compare the performance of community providers to accepted gold standards (i.e., benchmarks) in critical areas such as retention, implementation, and outcomes (Hunsley & Lee, 2007). By examining how EBP implementation in UC compares to fidelity standards achieved in controlled research on empirically supported treatments, benchmarking analyses can play a pivotal role in discovering whether evidence-based treatments and practice elements are feasible, potent, and durable when delivered in UC (Hogue, Ozechowski, Robbins, & Waldron, in press).
Also, whereas observational assessment of treatment implementation remains the gold standard for fidelity research even in UC settings (Garland, Bickman, & Chorpita, 2010), it is critical to develop reliable complements or even alternatives to observational methods that are cost-effective and easy to use by non-researchers in clinical practice. The most promising such method is the therapist self-report measure, which offers several advantages over observational ratings (Carroll, Nich, & Rounsaville, 1998; Weersing, Weisz, & Donenberg, 2002): it is quick, inexpensive, and nonintrusive, and it can be completed throughout treatment, facilitating evaluation of infrequent but clinically meaningful interventions. As described below in the Measures section, we developed a new therapist-report measure of EBP implementation for ABPs based on focus group feedback from the six UC treatment sites in which the study was conducted.
We used a three-phase evaluation design to assess fidelity to the FT approach in usual care for ABP youth. Study participants were randomly assigned to one of two conditions: (1) Routine Family Therapy (RFT): a single clinic that featured FT as its routine standard of care; or (2) Treatment As Usual (TAU): one of five clinics in the same catchment area that were not specifically aligned with the FT approach. We predicted that RFT would demonstrate basic adherence to FT treatment techniques and also basic differentiation from alternative EBPs featured in TAU: CBT, MI, and DC. There were three specific hypotheses: (1) FT adherence levels achieved by RFT therapists would be comparable to benchmark FT adherence levels established by research therapists in a controlled trial of an empirically supported FT model for ABPs (multidimensional family therapy; Liddle et al., 2008); (2) RFT therapists would report a higher overall level of allegiance and skill in FT techniques compared to non-FT techniques, and also, their FT allegiance and skill levels would be higher than those reported by TAU therapists; (3) RFT therapists would report stronger adherence to FT techniques than to techniques associated with the three alternative EBPs, and also, they would report a higher FT adherence level than that reported by TAU therapists. The final hypothesis is a flexible application of treatment differentiation analyses suitable for hybrid process research evaluation (Garland et al., 2006): In field settings where there is an unknown degree of overlap in treatment techniques among therapists who favor eclectic and/or combined approaches, differentiation analyses can provide useful information on which techniques show overlap and the amount of overlap observed.
Participants (N = 50) included adolescents (44% male; M age 15.2 years [SD = 1.4]) and their primary caregivers. Self-reported ethnicities were Hispanic (62%), African American (16%), multiracial (12%), White (4%), other (6%). Households were headed by a single parent (68%), two parents (23%), grandparents (6%), or other (3%); 50% earned less than $15,000 per year, 18% received public assistance, and 46% reported a history of child welfare involvement. Adolescents were referred primarily from schools (72%) but also from family service agencies (18%) and other sources (10%). Psychiatric diagnosis rates for DSM-IV disorders (Diagnostic and Statistical Manual of Mental Disorders—Fourth Edition; American Psychiatric Association, 2000), collected via the Mini International Neuropsychiatric Interview (MINI, Version 5.0; Sheehan et al., 1998), and based on meeting diagnostic threshold according to either adolescent or caregiver report, were: oppositional defiant disorder (ODD) = 90%, attention-deficit/hyperactivity disorder = 84%, conduct disorder (CD) = 52%, mood disorder or dysthymia = 49%, substance use disorder (SUD) = 20% (80% cannabis use, 20% alcohol), generalized anxiety disorder = 18%, posttraumatic stress disorder = 20%. A total of 94% of adolescents were diagnosed with more than one disorder.
Participants in this study were part of a larger randomized field trial designed to identify adolescents with untreated behavioral health problems, enroll them in available outpatient treatment services, and assess treatment effects up to one year later (see Hogue & Dauber, in press). Research staff developed a referral network of high schools, family service agencies, and youth programs serving clients in inner-city areas of a large northeastern city. Network partners made referrals to research staff during site visits and also by phone and confidential email. Staff then contacted referred families by phone and offered them an opportunity to participate in a home-based interview to assess the reason for study referral and discuss study enrollment.
Assessment interviews were conducted by research staff primarily in the home but also in other locations upon request. Caregivers and teens were consented and interviewed separately; caregivers consented for themselves and their adolescents, and adolescents assented for themselves. Caregivers and teens each received an honorarium in vouchers for completing interviews, which typically lasted 60–90 minutes.
After interview completion, families of teenagers who (1) met diagnostic criteria for ODD, CD, and/or SUD and (2) were interested in receiving treatment services, were randomly assigned to either the RFT site or one of five TAU sites. Urn randomization procedures were employed to promote balance between conditions on three variables: sex, ethnicity, and juvenile justice involvement. One TAU site that specialized in addiction treatment was withheld from randomization of ODD and CD cases, and one TAU site that did not accept substance users was withheld from randomization of SUD cases. Significant differences between study conditions were found for two variables: Families in RFT were more likely to have household income below $15K (χ2(1) = 3.9, p < .05) and less likely to have a history of child welfare involvement (χ2(1) = 4.2, p < .05). No other group differences on demographic or diagnostic variables were found.
Study therapists were asked to complete a post-session EBP fidelity checklist (described in Measures section) after the first two sessions of treatment. Post-session checklists were provided by 12 RFT therapists and 13 TAU therapists who treated the 50 study cases (25 RFT cases, 25 TAU cases). For the 25 cases assigned to five separate TAU sites, individual sites treated 7, 6, 5, 4, and 3 cases respectively, based on site availability to accept referrals. There were 6 cases for which only 1 post-session checklist was collected, resulting in 94 checklists collected overall; 16 of these (17%) corresponded to later sessions because session 1 and/or session 2 were not provided.
All treatment sites were outpatient clinical settings that accepted study cases as standard community referrals. No external training, financial support, or logistical support of any kind was provided to treat study cases, and therapists were not required to alter their routine clinical practices in any way. Therefore, all study sites provided usual-care services to referred families. Each site prescribed weekly treatment sessions and had in-house psychiatric support. Sites were in close geographical proximity and easily accessible to all families via public transportation.
The RFT condition consisted of a single community mental health clinic that featured family therapy as the routine treatment approach for behavioral interventions with youth. RFT therapists (n = 12) were licensed Marriage and Family Therapists, licensed Social Workers with training in family therapy, or advanced clinical trainees with family therapy experience. Therapists ranged in age from 28 to 59 years; 7 were female; 1 was European American, 7 Hispanic American, and 1 of another ethnicity; and as a group they averaged 3.1 years (SD = 4.3) postgraduate therapy experience. All RFT therapists received weekly group and individual supervision designed to promote the FT approach.
The TAU condition included a set of five clinics, rather than a single comparison clinic, in order to capture a broader range of EBPs and sample the full spectrum of outpatient UC treatment options available for adolescent behavior problems. Among the five TAU sites, two sites were community mental health clinics (as was the RFT site). Two other sites were outpatient clinics housed within the child and adolescent psychiatry departments of teaching hospitals. The fifth site was an independent addictions treatment clinic with an adolescent program that featured group-based treatment with supportive individual sessions. TAU sites provided treatment services that, based on focus group feedback (see Measures section), were nominally consistent with the MI, CBT, and DC approaches. No TAU site contained a supervisor or staff therapist with extensive training in family therapy, and no site appeared to promote or feature implementation of signature FT techniques. Across the five sites, participating therapists (n = 13) ranged in age from 25 to 45 years; 9 were female; 9 were European American, 1 Hispanic American, and 3 Asian American; and as a group they averaged 3.3 years (SD = 3.1) postgraduate therapy experience.
We developed a brief face-valid questionnaire to assess therapists’ own judgments about their clinical orientation and technical skill related to four treatment approaches that have generated a substantial base of empirical support for treating adolescent conduct and substance use problems: family therapy (FT), cognitive-behavioral therapy (CBT), motivational interviewing (MI), and drug counseling (DC). The questionnaire asks therapists to self-rate their degree of allegiance to, as well as their skills in implementing, each of the four approaches using a 5-point scale: 1 = None, 2 = A Little, 3 = Moderate, 4 = Considerable, 5 = High.
We developed the 27-item ITT-ABP to gather therapist-report data on fidelity to the FT, CBT, MI, and DC approaches. ITT-ABP items were selected from validated observational fidelity scales representing each approach, using a two-stage development process. First, we reviewed the original psychometric studies of the scales under consideration to examine strength of factor loadings and interrater reliability for the validated scales. Second, we conducted two focus groups with all available therapists at each study site to review prospective items, gather information about the fit between prospective items and the treatment practices favored at each site as reported in focus groups, and accordingly trim prospective items to create the final set of four ITT-ABP scales: FT, CBT, MI, and DC. Table 1 contains a list of all 27 items for all four scales. The ITT-ABP measures the thoroughness and frequency (i.e., quantity, or adherence) with which each of the 27 treatment techniques was utilized based on a 5-point Likert scale: 1 = Not at all, 2 = A little bit, 3 = Moderately, 4 = Considerably, 5 = Extensively.
Items on the FT scale (7 items) and CBT scale (7 items) were drawn from the Therapist Behavior Rating Scale (TBRS) (Hogue et al., 1998, 2004), a macroanalytic tool designed to identify therapeutic techniques prescribed by FT and CBT for adolescent behavior problems. The TBRS has demonstrated strong psychometric properties in studies of treatment adherence (Hogue, Dauber, et al., 2008; Hogue et al., 1998), therapist competence (Hogue, Dauber, et al., 2008), and fidelity-outcome links (Hogue, Dauber, Samuolis, & Liddle, 2006; Hogue, Henderson, et al., 2008). Factor analytic studies of the TBRS revealed a Family Focus dimension including family relationship and family interaction items, and a CBT dimension including behavior/skills items and cognition-focused items (Hogue et al., 1998, 2004). ITT-ABP items on the MI scale (7 items) and DC scale (6 items) were drawn from the Motivational Enhancement Therapy and Twelve Step Facilitation subscales, respectively, of the Yale Adherence and Competence Scale (YACS) (Carroll et al., 2000). The YACS is a general system for rating adherence and competence in delivering behavioral treatments for substance use disorders. It has demonstrated strong reliability and validity for MI and DC fidelity scores in an efficacy trial comparing CBT, 12-step facilitation, and clinical management for adult substance use disorders (Carroll et al., 2000) as well as two multisite effectiveness trials comparing manualized treatment to UC (Ball et al., 2007; Carroll et al., 2006).
Preliminary analyses of ITT-ABP scores for a small sample of sessions were conducted to estimate the basic reliability of therapist-reported EBP adherence levels compared to observational reports using indices of percentage agreement.
In Study Phase 1 we conducted observational benchmarking analyses of 15 archived videotaped sessions held by therapists at the RFT site prior to referral of study cases. Benchmarking analyses were conducted using a probability sampling method known as statistical process control analysis (SPC; Deming, 1986) to compare within-sample variance in the RFT sample versus the benchmark MDFT sample. SPC provides a systematic means of monitoring the amount of variability in a continuous production process. It has been used in mental health settings to measure consistency of service delivery at outpatient clinics (Green, 1999) and variations in treatment staff performance over time (Dey, Sluyter, & Keating, 1994). In SPC, samples are taken from a given process under investigation (e.g., FT fidelity in RFT) and plotted on a control chart to check for patterns that suggest systematic variation within the target sample when compared to pre-established control limits (e.g., fidelity benchmark values derived from MDFT studies). To make a valid comparison, we coded the 15 archived RFT sessions using the same FT adherence measure that had been used to code the benchmark MDFT sessions, the TBRS. The TBRS (psychometric properties discussed above in Measures) contains a 7-point Likert-type scale with the following anchors: 1 = Not at all, 3 = Somewhat, 5 = Considerably, and 7 = Extensively. (This original 7-point scale was shortened to the 5-point scale presented above for the ITT-ABP in order to provide a simpler range of choices, in which each scoring option had a specific anchor, for self-report by community therapists.)
In Study Phase 2, just prior to enrolling study cases, we analyzed self-report data from 25 study therapists (12 RFT, 13 TAU) who reported on (1) their degree of theoretical allegiance to each of the four targeted treatment approaches (FT, CBT, MI, DC) and (2) their perceived skill in implementing each approach when treating teens. In Study Phase 3, we analyzed therapist post-session self-reports of adherence to FT, CBT, MI, and DC treatment techniques that were collected on 94 sessions from 50 study cases randomly assigned to RFT versus TAU. Analyses for both Phase 2 and Phase 3 featured mean comparisons of therapist-reported data via dependent and independent t-tests that compared scores across the RFT and TAU conditions.
We conducted two preliminary analyses (see Tables 1 and 2) to provide initial support for the reliability of the ITT-ABP. First, we completed observational coding on the first thirteen study sessions (9 RFT, 4 TAU) for which therapists submitted a session videotape along with a post-session ITT-ABP (see Table 1). These 13 videotapes were coded using the ITT-ABP items by two expert raters with established reliability in the source TBRS instrument (Hogue et al., 2006), using the same 5-point Likert scale used by study therapists, in order to generate observational “gold standard” scores against which to compare the self-ratings made by therapists. Coders scored tapes independently and used consensus to reconcile scoring differences in order to select a final gold standard score for each item. Although this sample of tapes is not large enough to calculate intraclass correlation coefficients representing the reliability between coders and therapists, it does provide preliminary reliability evidence in the form of raw percentage agreement scores between informants.
Table 1 lists three types of agreement for each of the 27 items: % absolute agreement (i.e., same score reported by both informants); % scoring within 1 rating point; and % agreement on item presence/absence (i.e., both informants scored either “1” [absent] or any score > 1 [present]). The data suggest an encouraging degree of therapist self-report reliability: For 15 items, absolute agreement occurred on at least 50% of tapes; for 20 items, informants were no more than 1 point apart for at least 70% of tapes; and for 23 items, raters agreed on the presence or absence of the given item for at least 69% of tapes. However, as seen in Table 2, the average ratings for these items are relatively low across the board, representing a high percentage of “1” scores reflecting no use of the intervention in the session. Because of this clustering of scores at the low end of the scale, there is increased likelihood of absolute agreement and also of score correspondence within one point.
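The three agreement indices described above can be illustrated with a minimal Python sketch for a single item. The function name `agreement_indices` and the ratings below are hypothetical and are not taken from the study data; the only assumptions carried over from the text are the 5-point scale and the rule that a score of 1 means the technique was absent.

```python
from typing import Sequence


def agreement_indices(therapist: Sequence[int], observer: Sequence[int]) -> dict[str, float]:
    """Percentage agreement between two informants on one item across tapes."""
    if len(therapist) != len(observer) or len(therapist) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    pairs = list(zip(therapist, observer))
    n = len(pairs)
    # Same score reported by both informants.
    exact = sum(t == o for t, o in pairs)
    # Scores no more than one rating point apart.
    within_one = sum(abs(t - o) <= 1 for t, o in pairs)
    # Both informants scored 1 (absent) or both scored above 1 (present).
    presence = sum((t > 1) == (o > 1) for t, o in pairs)
    return {
        "absolute": 100.0 * exact / n,
        "within_one": 100.0 * within_one / n,
        "presence": 100.0 * presence / n,
    }


# Hypothetical ratings for one item across 13 tapes (illustration only).
therapist_scores = [1, 1, 2, 3, 1, 4, 1, 2, 1, 1, 3, 1, 5]
observer_scores = [1, 1, 1, 3, 1, 2, 1, 2, 2, 1, 3, 1, 4]
item_agreement = agreement_indices(therapist_scores, observer_scores)
```

In this made-up example, six of the nine exact matches are pairs of "1" scores, which shows how clustering at the low end of the scale can inflate raw agreement.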
Table 2 contains data on scale-level ITT-ABP scores across the 13 sessions, including bivariate correlations between observer and therapist scale scores (the sample size was deemed too small to generate meaningful alternative indices of interrater agreement in the form of Kappa statistics [Flack, Afifi, Lachenbruch, & Schouten, 1988] or intraclass correlation coefficients [McGraw & Wong, 1996]). These data suggest a strong degree of concordance between informants for the FT items, moderate concordance for CBT items, and weak concordance for MI items (with a negative correlation among TAU sites). The base rate for CBT items is low in this small sample of mostly RFT sessions, and DC items were virtually unscored. Also, because this pool of ITT-ABP sessions represents the first two treatment sessions only, the mean scores reported by both therapists and observational coders are likely to be lower than would be reported across the entire length of treatment due to the focus on client assessment and alliance building that is typical of early session work in all four approaches.
The SPC chart presented in Figure 1 depicts FT adherence data drawn from 15 randomly selected archived sessions held at the RFT site prior to the start of the current study. Videotaped RFT sessions were coded with the FT adherence scale of the TBRS (described in Measures section) by two non-participant raters with strong interrater reliability established in a previous TBRS study (Hogue et al., 2006). The FT treatment adherence mean scores for each of the 15 RFT sessions are plotted in the SPC chart. The chart also depicts a control sample mean, as well as upper and lower control limits (labeled MDFT UCL and MDFT LCL in Figure 1), that are based on criterion values (i.e., benchmarks) derived from an observational adherence process study of MDFT that used the exact same 7-point FT adherence scale of the TBRS (Hogue et al., 2006) during a controlled efficacy study of MDFT for ABPs. One virtue of SPC charts is that control limits can be narrowed or widened based on the purpose of the evaluation. Another virtue is that every score is depicted, allowing evaluators to visualize how scores cluster and identify which individual scores place nearer to or further from an established mean.
The observed mean FT adherence score for the 15 archived RFT cases was 3.4 (SD = .51), and the benchmark mean from the MDFT trial was 3.5 (SD = .60). The SPC chart (Figure 1) plotting RFT implementation data against MDFT benchmarks demonstrates that the FT adherence scores for RFT cluster closely around the average FT adherence score obtained for MDFT in a controlled study. Also, no RFT session falls beyond three standard deviations (UCL or LCL) of the MDFT mean. This SPC analysis indicates that the standard family-based services delivered at the RFT site prior to enrollment of study cases adhered closely to gold-standard levels of core FT techniques implemented by a family-based EST treating a client population that matches the current study sample. Note that the mean observational FT adherence score for the archived RFT pool cannot be directly compared to the mean observational FT score for the ITT-ABP pool (reported in Table 2), because the former was coded using a 7-point scale (to match the benchmark MDFT sample) and the latter a 5-point scale (to match the therapist-report ITT-ABP version).
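The control-limit check described above can be sketched in a few lines of Python. This is an illustration only: it assumes limits of three standard deviations around the benchmark mean, as stated in the text, and the session scores are hypothetical placeholders, not the 15 coded RFT sessions. The names `ControlLimits`, `benchmark_limits`, and `out_of_control` are introduced here for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ControlLimits:
    center: float
    lower: float
    upper: float


def benchmark_limits(mean: float, sd: float, k: float = 3.0) -> ControlLimits:
    """Control limits set k standard deviations around a benchmark mean."""
    return ControlLimits(center=mean, lower=mean - k * sd, upper=mean + k * sd)


def out_of_control(scores: list[float], limits: ControlLimits) -> list[tuple[int, float]]:
    """Return (session number, score) for every score outside the limits."""
    return [
        (i, s)
        for i, s in enumerate(scores, start=1)
        if s < limits.lower or s > limits.upper
    ]


# Benchmark values reported in the text for the MDFT trial (7-point TBRS scale).
mdft_limits = benchmark_limits(mean=3.5, sd=0.60)

# Hypothetical session-level adherence means (NOT the study's data).
hypothetical_rft_scores = [3.1, 3.9, 2.8, 3.4, 3.6, 4.2, 3.0, 3.3,
                           3.7, 2.9, 3.5, 4.0, 3.2, 2.7, 3.8]
flagged = out_of_control(hypothetical_rft_scores, mdft_limits)
```

Narrowing the limits, for example with `k=2.0`, gives a stricter test, which corresponds to the flexibility of SPC control limits noted earlier.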
Table 3 presents averaged self-reported EBP allegiance and EBP skill data provided by participating therapists at the six study sites. For ease of comparison, in Table 3 the TAU data were grouped based on site organizational characteristics: two community mental health clinics, two outpatient child psychiatry clinics, and one addictions treatment clinic. Note that only 9 of the 12 RFT therapists provided complete data on EBP allegiance and skill.
First, within the RFT condition only, dependent sample t-tests were used to compare the levels of allegiance and skill in FT versus each of the other three approaches. As expected, RFT therapists reported greater allegiance to FT than to CBT (t(8) = 3.8, p < .01, d = 2.7), MI (t(8) = 3.3, p < .05, d = 2.3), and DC (t(8) = 3.8, p < .01, d = 2.7), respectively. Similarly, RFT therapists reported greater perceived skill in FT techniques than in CBT (t(8) = 2.0, p = .08, d = 1.4), MI (t(8) = 2.2, p < .06, d = 1.6), and DC (t(8) = 3.3, p < .05, d = 2.3).
To test hypothesized between-group differences in EBP allegiance and skill, independent samples t-tests were used to compare RFT scores to TAU scores that were averaged across all five TAU sites (see Table 3). Allegiance to the FT approach was stronger in RFT (M = 3.8; SD = 1.3) than in the pooled TAU sample (M = 2.7; SD = .63) (t(20) = 2.6, p < .05), with a very large effect size for this mean difference (d = 1.2). In contrast, there were variable effect sizes when testing for differences in the other three approaches, with RFT tending to report lower allegiance to CBT (t(20) = −1.8, p = .09, d = .79) and equivalent allegiance to MI (t(20) = −.18, p = .86) and DC (t(20) = −.45, p = .66).
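As a worked check on how the effect size and test statistic relate, the following Python sketch recomputes Cohen's d and the pooled-variance t statistic from the reported FT allegiance means and standard deviations. The group sizes are an assumption: 9 RFT therapists with complete data (as noted for Table 3) and 13 TAU therapists, which is consistent with the reported 20 degrees of freedom. The function names are introduced here for illustration.

```python
import math


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def cohens_d(m1: float, sd1: float, n1: int, m2: float, sd2: float, n2: int) -> float:
    """Standardized mean difference using the pooled standard deviation."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


def pooled_t(m1: float, sd1: float, n1: int, m2: float, sd2: float, n2: int) -> float:
    """Independent-samples t statistic assuming equal variances."""
    return cohens_d(m1, sd1, n1, m2, sd2, n2) / math.sqrt(1 / n1 + 1 / n2)


# Reported FT allegiance values; n = 9 and n = 13 are assumed group sizes.
d_ft = cohens_d(3.8, 1.3, 9, 2.7, 0.63, 13)
t_ft = pooled_t(3.8, 1.3, 9, 2.7, 0.63, 13)
```

Under these assumed group sizes, the recomputed values round to the reported d = 1.2 and t(20) = 2.6, allowing for rounding in the published means and standard deviations.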
Parallel analyses were then conducted for therapist self-ratings of EBP skillfulness. As predicted, perceived FT skill was significantly stronger in RFT (M = 3.7; SD = .71) than in TAU (M = 2.4; SD = .65) (t(20) = 4.4, p < .001), again showing a very large effect size (d = 1.9). In contrast, there were no between-condition differences in perceived skill for CBT (t(20) = −.97, p = .35), MI (t(20) = −.51, p = .61), and DC (t(20) = −.29, p = .77). As seen in Table 3, the addictions clinic site in the TAU condition had markedly higher mean scores for DC allegiance and skill than did all other sites, but this gap was erased once scores from all TAU sites were averaged for comparison to RFT. Overall, these data demonstrate that RFT therapists had much greater allegiance to the FT approach and confidence in their FT skills than did TAU therapists.
Prior to analyzing ITT-ABP scores, we calculated the sample-specific internal consistency of each scale using Cronbach’s α coefficient. Across the full sample of 94 sessions, internal consistency was as follows: FT scale (7 items) α = .71; CBT scale (7 items) α = .72; MI scale (7 items) α = .78; DC scale (6 items) α = .84. These data indicate that the individual items of each respective scale represent a highly correlated set of interventions comprising a unified treatment approach.
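For readers who want to reproduce this kind of reliability check, Cronbach's α for a k-item scale is k/(k − 1) × (1 − Σ item variances / variance of total scores). The following minimal sketch implements that standard formula with the Python standard library. The session-by-item ratings are hypothetical and are not ITT-ABP data; the function name and data layout are illustrative choices.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for rows of sessions and columns of scale items."""
    k = len(scores[0])                      # number of items on the scale
    items = list(zip(*scores))              # transpose to one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical ratings: 5 sessions scored on a 3-item scale (not study data).
ratings = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 4],
]
alpha = cronbach_alpha(ratings)
print(f"alpha = {alpha:.2f}")
```

Items that rise and fall together across sessions push α toward 1, whereas unrelated items pull it toward 0.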
Dependent samples t-tests were conducted to compare RFT therapists’ self-reported use of FT interventions to their use of CBT, MI, and DC interventions (see Table 4). RFT therapists reported significantly higher FT scores than CBT scores (t(47) = 10.7, p < .001, d = 3.1) and DC scores (t(47) = 18.1, p < .001, d = 5.3). Unexpectedly, they also reported higher MI scores than FT scores (t(47) = −2.0, p < .05), though this effect size was only moderate (d = .60). Independent samples t-tests were then conducted to compare RFT therapists to TAU therapists on self-reported utilization of the four approaches (Table 4). As predicted, RFT therapists reported significantly greater use of FT interventions than TAU therapists (t(74.8) = 4.5, p < .001, d = 1.0). In contrast, RFT therapists reported less use of CBT than TAU therapists (t(66.9) = −2.4, p < .01, d = .59). RFT also reported less use of DC than TAU therapists, but only at a trend level (t(52.8) = −1.8, p < .10, d = .50). There were no between-condition differences in the use of MI interventions (t(80.7) = −.58, p = .56). Finally, we compared the FT score reported by RFT to the average score reported by the two TAU sites that were CMHCs and thus organizationally comparable to RFT. As seen in Table 4, RFT reported significantly greater use of FT interventions (M = 2.6; SD = .48) than did CMHC sites in TAU (M = 2.0; SD = .78) (t(30.3) = 3.6, p < .01).
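The fractional degrees of freedom reported for the between-condition comparisons (for example, 74.8 and 30.3) are what an unequal-variance (Welch) t-test with the Welch-Satterthwaite approximation would produce. The text does not name the procedure, so treating it as a Welch test is an assumption. The sketch below shows that calculation from summary statistics, using hypothetical numbers rather than study data.

```python
from math import sqrt


def welch_t(m1, sd1, n1, m2, sd2, n2):
    """Return (t, df) for an unequal-variance independent samples t-test."""
    v1 = sd1**2 / n1  # squared standard error of the mean, group 1
    v2 = sd2**2 / n2  # squared standard error of the mean, group 2
    t = (m1 - m2) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# Hypothetical summary statistics (not study data).
t_eq, df_eq = welch_t(5.0, 2.0, 10, 3.0, 2.0, 10)      # equal SDs and sizes
t_uneq, df_uneq = welch_t(5.0, 1.0, 10, 3.0, 3.0, 10)  # unequal SDs
print(f"equal variances:   t({df_eq:.1f}) = {t_eq:.2f}")
print(f"unequal variances: t({df_uneq:.1f}) = {t_uneq:.2f}")
```

When the two groups have equal variances and sizes, the approximated df equals n1 + n2 − 2. As the variances diverge, the df shrinks and typically takes a non-integer value.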
This study found that family therapy implemented in usual care for adolescent behavior problems demonstrated (1) basic adherence to the FT approach and (2) differentiation from other evidence-based approaches for this clinical population, based on three main findings. First, observational benchmarking analysis of treatment sessions recorded at the RFT site before enrollment of study cases demonstrated that site therapists consistently achieved a level of adherence to core FT techniques that was comparable to the adherence benchmark established during a controlled efficacy trial of a research-proven FT model. Second, RFT therapists reported stronger allegiance to and self-perceived skill in the FT approach than in CBT, MI, or DC; moreover, they reported comparatively greater levels of FT allegiance and skill than did TAU therapists. Third, RFT therapists reported using more FT techniques than either CBT or DC techniques when treating study cases; moreover, they utilized comparatively more FT techniques and fewer CBT and DC techniques than did TAU therapists.
Results of this multimethod fidelity assessment suggest that the FT approach can be faithfully implemented in usual care by a community-based clinic with basic adherence to signature FT techniques. Moreover, the observational benchmarking and therapist-report methods used to evaluate FT adherence and differentiation appear to be user-friendly procedures that program evaluators can comfortably employ to assess EBP models and techniques of any kind, if the following prerequisites are in place: (1) existence of a corresponding observational and/or self-report fidelity measure for the given EBP(s), many of which can be readily found in the research literature or acquired in the implementation toolkits for many evidence-based models; (2) access to audio/videotaping equipment, which is increasingly inexpensive and compatible with standard computer technology; and (3) therapist goodwill. Although this study was conducted as part of a multisite evaluation, study methods can be readily employed at individual sites to examine EBP adherence and perhaps to differentiate among a variety of EBPs routinely delivered within a single site. However, even the most routine collection of implementation data in community sites requires agency-wide commitment and regular monitoring to be successful; to wit, even with all three of the above prerequisites in place, this study failed to capture therapist allegiance and skill data on 25% of the RFT therapist sample.
In both of the observationally coded samples used in SPC analyses—archived RFT sessions and benchmark MDFT sessions—the mean FT adherence score fell between the anchor values of 3 (Somewhat) and 5 (Considerably), just below the midpoint of the 7-point rating scale. These adherence levels are consistent with levels reported in previous observational fidelity studies across a range of treatment approaches and populations (e.g., Barber, Crits-Christoph, & Luborsky, 1996; Carroll et al., 2000; Hill, O’Grady, & Elkin, 1992; Hogue et al., 2008). These below-midpoint mean scores likely reflect the fact that therapists cannot (and perhaps should not) be expected to deliver the complete roster of discrete intervention techniques constituting each scale in any one session, given reasonable time and client tolerance limits. That is, an active therapist can implement one or two interventions very thoroughly during a given session yet still receive a below-midpoint mean adherence score that has been averaged across multiple scale items. A more informative metric for judging the density of EBP delivery in UC might be tabulating the number or proportion of discrete techniques scored at or above the midpoint value, indicating areas of considerable/extensive activity by the therapist (Hurlburt, Garland, Nguyen, & Brookman-Frazee, 2010).
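To make the contrast between the two metrics concrete, the sketch below scores one hypothetical session on a 7-item scale. It assumes ratings run from 1 to 7, so that the midpoint is 4; the function name and the ratings are illustrative and are not drawn from study data.

```python
from statistics import mean

MIDPOINT = 4  # assumed midpoint of a 1-to-7 rating scale


def density_summary(item_scores, midpoint=MIDPOINT):
    """Return mean score plus count and proportion of items at/above midpoint."""
    at_or_above = [s for s in item_scores if s >= midpoint]
    return {
        "mean": mean(item_scores),
        "count": len(at_or_above),
        "proportion": len(at_or_above) / len(item_scores),
    }


# Hypothetical session: two techniques delivered thoroughly, five barely used.
session = [6, 5, 1, 1, 1, 2, 1]
summary = density_summary(session)
print(summary)
```

In this example the mean falls well below the midpoint even though two of the seven techniques were delivered extensively, which is the pattern that a count or proportion metric would surface and an averaged score would hide.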
It is important to emphasize that the therapist-report data in this study should be considered preliminary only, pending future observation-based analyses that can confirm (or disconfirm) the self-report data and provide a less biased assessment of FT adherence and skill. The handful of existing studies comparing observer versus therapist reports of EBP implementation (e.g., Carroll et al., 1998; Hurlburt et al., 2010; Martino et al., 2009) have found only modest to weak correspondence between informants, with therapists tending to overestimate the quantity of their EBP utilization. Thus it is possible that in this study, RFT therapists overstated their use of FT techniques with study cases. That said, support for the fundamental reliability of the therapist-report data can be found in the favorable (though not definitive) indices of agreement between therapists and observers on ITT-ABP items for a small subsample of videotaped sessions. Looking ahead, if reliable therapist-report measures can be established, they may prove essential for efficient fidelity evaluation in UC, in several formats: as a self-check by therapists to mark their own progress in treating individual cases; as a supervision aid for EBP trainers and agency supervisors to monitor fidelity; and as administrative data for stakeholders and external reviewers to evaluate therapist- and agency-level clinical performance (Carroll et al., 1998; Garland, Bickman, & Chorpita, 2010).
The design of this study would have been considerably improved by sampling additional clinics in the RFT condition. Until study results are replicated in other UC sites that feature FT, it is impossible to determine whether findings are generalizable, or instead, a by-product of site-specific historical and organizational factors. Along these lines, it is difficult to know whether study sites are fully representative of standard outpatient behavioral care, especially given their willingness to participate freely in EBP implementation research; unfortunately, an informative evaluation of the organizational context (work climate, worker attitudes, etc.) of each site was beyond the scope of this study. It would also have been preferable to sample FT implementation across all phases of therapy, rather than the early phase (first two sessions) only, on the premise that adherence levels may fluctuate dramatically over the course of treatment due to any combination of therapist, client, and external factors.
The design would also have been enhanced by observationally measuring the competence (quality) of FT implementation, which has the potential to predict clinical outcomes over and above adherence (quantity) (e.g., Carroll et al., 2000; Hogue, Henderson, et al., 2008). However, there are several obstacles to conducting competence assessments in UC. For example, although we collected one-time self-reports of therapist skill for treatment differentiation purposes, this variable is a weak proxy for therapeutic competence in implementing specific treatment techniques for a given client in a given session. Proper assessment of competence virtually requires observational coding by expert judges (Waltz et al., 1993), which is highly resource-intensive. Also, although virtually every manualized FT model offers robust guidelines for skillful implementation (how to deliver the appropriate interventions at the appropriate time for a given client), it has proven exceedingly difficult to operationalize competence reliably when creating fidelity measurement tools (Barber, Sharpless, Klostermann, & McCarthy, 2007). Finally, it appears that even reliable measures of adherence and competence do not consistently predict client outcomes, a counterintuitive but persistent finding (e.g., Barber et al., 2007). Moreover, in those instances when fidelity scores do predict outcomes, the effect sizes are typically small (Webb, DeRubeis, & Barber, 2010) and the relation may be curvilinear in nature (e.g., Barber et al., 2006; Hogue, Henderson, et al., 2008). It remains to be seen whether significant statistical correlations between EBP implementation data and client outcome data will be a required feature, or a desired but difficult-to-reach goal, for the development of instrumentation and procedures for assessing EBP fidelity in usual care.
Despite these missed evaluation opportunities, this study contributes encouraging new evidence on the feasibility of implementing the FT approach in everyday practice. Study findings indicate that UC therapists can achieve FT adherence levels comparable to benchmarks set in controlled research and differentiated from other EBPs delivered in similar settings. If it is eventually shown that FT can be delivered widely in UC settings with high fidelity, and furthermore that it can effectively treat the full spectrum of disruptive behaviors in clinically referred teens, this would help meet the urgent demand for adaptable, transdiagnostic interventions capable of treating multiproblem adolescents (McHugh et al., 2009).
The work described in this article was supported by the National Institute on Drug Abuse (R01 DA019607 and R01 DA023945). The authors would like to acknowledge the dedicated work of the clinical research staff for this project: Cynthia Arnao, Molly Bobek, Daniela Caraballo, Benjamin Goldman, Diana Graizbord, Jacqueline Horan, Candace Johnson, Emily Lichvar, Emily McSpadden, Catlin Rideout, and Gabi Spiewak. We thank Jessica Samuolis for coding RFT session videotapes. We are grateful to the adolescents and caregivers who invited our staff into their homes to complete assessment interviews, share stories and hopes, and follow our lead into the outpatient behavioral health treatment system.
Sarah Dauber, Ph.D. is Senior Research Associate in the Division of Health and Treatment Research at the National Center on Addiction and Substance Abuse (CASA) at Columbia University. Her clinical research interests include applied developmental psychology, comorbidity and developmental trajectories of adolescent depression and substance use, treatment fidelity and process-outcome research on evidence-based practices for adolescent behavioral health problems, and parenting interventions for substance-involved families in the child welfare system.
Aaron Hogue, Ph.D. is Associate Director of Health and Treatment Research at the National Center on Addiction and Substance Abuse (CASA) at Columbia University. His clinical research interests include the adoption and sustainability of empirically supported behavioral treatments for adolescent substance abuse in routine care, treatment fidelity and process-outcome research on evidence-based practices for adolescent behavioral health problems, and combined behavioral and pharmacological interventions for teens with co-occurring ADHD and substance use disorder.
The authors have no conflicts of interest to report.