Expectancy of Therapeutic Outcome
Because the expectancies were evaluated during the first and last therapy sessions, pretreatment data were not available for participants who never attended a session. These data were treated in two ways. First, the four expectancy questions at pretreatment were subjected to a multivariate analysis of variance (MANOVA) to determine whether the women who dropped out had different expectations from those who completed treatment. The MANOVA was nonsignificant. Next, we conducted a repeated measures MANOVA (pretreatment–posttreatment) with type of therapy (CPT or PE) as the independent variable. There was no interaction between groups and sessions. The group effect was nonsignificant; there were no differences between the two therapies on the therapeutic expectation questions at either pretreatment or posttreatment. The session effect was significant, F(4, 76) = 12.68, p < .001, and paired-sample t tests indicated that there were significant differences on each of the four questions: Question 1, t(80) = −3.93, p < .001; Question 2, t(80) = −5.29, p < .001; Question 3, t(79) = −2.88, p < .005; and Question 4, t(80) = −5.65, p < .001. Participants’ ratings increased from pretreatment to posttreatment on each of the questions for both therapies.
Analysis Plan
The results were analyzed in three different ways for comparison purposes. Unfortunately, this study was designed and conducted before intent-to-treat (ITT) analysis became standard. Therefore, we did not continue to assess women who dropped out of treatment, and we administered only one scale, the PSS, during treatment. The PSS data were analyzed separately from the measures for which we had only pretreatment, posttreatment, and follow-up data. Because only pretreatment data were available for the treatment dropouts and for those who never started, the main analyses involving the CAPS or BDI could not determine whether partial therapy was at all beneficial for participants.
Initially, all of the participants who were accepted and randomized into the trial were analyzed with their last observations carried forward (LOCF). These ITT data allowed a more complete picture of the results regardless of whether the women completed the treatment or even began treatment. Another method of handling nonrandom missing data due to dropout is to use mixed-effects linear regression analysis or random regression. Random regression has several advantages over LOCF (Heyting, Tolboom, & Essers, 1992; Mazumdar, Liu, Houck, & Reynolds, 1999). Supplementing the use of LOCF data with random regression models as a converging test of our hypotheses allowed us added protection against misleading findings (Gibbons et al., 1993; Hedeker & Gibbons, 1996).
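As an illustration of the LOCF approach described above, the following minimal sketch fills each missing assessment with the participant's most recent observed score. The function name and the CAPS values are hypothetical and are not taken from the study data; this is not the software used in the original analysis.

```python
from typing import List, Optional, Sequence


def locf(scores: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Carry the last observed value forward over later missing assessments."""
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for score in scores:
        if score is not None:
            last = score  # remember the most recent observed score
        filled.append(last)
    return filled


# Hypothetical CAPS scores at pretreatment, posttreatment, 3 months, 9 months.
dropout = [78.0, None, None, None]     # assessed at pretreatment only
completer = [75.0, 32.0, None, 30.0]   # missed the 3-month follow-up

locf_dropout = locf(dropout)
locf_completer = locf(completer)
```

Under this scheme a participant with only a pretreatment score contributes that score at every later session, so she is counted as unchanged.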
Finally, those women who completed treatment were analyzed separately. Although this might be viewed as a “censored” data set from a statistical standpoint, these results are very important from a clinical standpoint. The question addressed here is how effective these treatments are if someone completes the whole course of treatment. This might be particularly important for a therapy such as CPT in which the therapist is teaching new and different skills at each session and no two sessions are exactly alike. In the case of completer analyses, two different sets of analyses were conducted. First, a repeated measures pretreatment to posttreatment MANOVA was conducted for the three groups (CPT, PE, and MA). Second, a two-group (CPT and PE) analysis was conducted across the four assessment periods, including the 3- and 9-month follow-ups. Because there were two MANOVAs for the completer data set, Bonferroni corrections were calculated, and the p value was set at .025.
ITT Analyses With LOCF
ITT analyses with LOCF were conducted on 171 participants, including the 13 women who never attended a session but had been accepted into the study. A 3 (group: CPT, PE, or MA) × 4 (session: pretreatment, posttreatment, 3-month follow-up, or 9-month follow-up) repeated measures MANOVA using LOCF with CAPS and BDI scores as dependent variables produced a significant interaction, F(12, 320) = 4.1, Pillai’s trace = .27, p < .001; a significant session effect, F(6, 159) = 17.9, Pillai’s trace = .40, p < .001; and a significant group effect, F(4, 328) = 4.6, Pillai’s trace = .10, p < .001. Follow-up one-way analyses of variance (ANOVAs) indicated no pretreatment differences among any of the three groups on either measure. At the posttreatment assessment, there were significant differences between the groups on the CAPS, F(2, 168) = 15.5, p < .0001, and BDI, F(2, 167) = 12.1, p < .0001. A post hoc Tukey’s honestly significant difference (HSD) test indicated that the MA group had significantly higher symptom scores than either the CPT or PE group. At the 3-month and 9-month follow-ups, the results were the same: 3-month CAPS, F(2, 168) = 12.5, p < .0001; 3-month BDI, F(2, 167) = 10.1, p < .0001; 9-month CAPS, F(2, 168) = 12.1, p < .0001; and 9-month BDI, F(2, 167) = 8.4, p < .0001. In each case, the MA group had significantly higher scores than the treatment groups, which did not differ from each other. Means and standard deviations for each group at each session are listed in Table 1.
Table 1. CPT, PE, and MA Mean Scores Over Time: Intent-to-Treat Sample
Simple repeated measures MANOVAs for each group across the four assessment sessions indicated that both the CPT, F(6, 55) = 12.6, Pillai’s trace = .58, p < .0001, and PE, F(6, 55) = 10.2, Pillai’s trace = .53, p < .001, groups changed significantly over time. The MA group did not change across the assessment periods. For the CPT and PE groups, the decreases in scores occurred from pretreatment to posttreatment. There were no significant changes from posttreatment to the 3-month or 9-month follow-up. CAPS scores for the ITT sample are depicted in the accompanying figure.
The effect sizes for the two active treatments at posttreatment (relative to the MA condition) with the LOCF data set are presented in Table 2. Hedges’ g effect sizes (Hedges, 1982) were computed so that the results would be directly comparable to the effect sizes calculated in the International Society for Traumatic Stress Studies (ISTSS) treatment guidelines for PTSD (Foa, Keane, & Friedman, 2000). Basic Hedges’ g values are part of the Cohen’s d effect-size family. Effect sizes were calculated as the mean difference between the experimental (CPT and PE) and comparison (MA) groups divided by the pooled standard deviation within each of the samples. When the CPT and PE groups were compared directly, the experimental group was CPT, and PE was used as the comparison group. Effect sizes were then converted to unbiased Hedges’ g values to correct for variations due to small sample sizes (Hedges, 1982; Rosenthal, 1991). To assist with interpretation, Cohen (1988) proposed a set of qualitative descriptors to accompany individual effect sizes. Demarcations between descriptors are meant to be approximate rather than absolute in nature. Small effect sizes are operationally defined as 0.2; medium effect sizes, as 0.5; and large effect sizes, as 0.8 (Cohen, 1988). The CPT and PE groups showed large effects for symptoms (relative to the MA group) in the ITT sample. When we compared the CPT and PE groups directly (rather than with the MA condition), CPT resulted in small but positive effect-size differences for PTSD, depression, and guilt measures at posttreatment, 3 months, and 9 months, indicating modestly greater symptomatic improvement relative to the participants in the PE condition.
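The effect-size computation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the small-sample correction uses the common approximation 1 − 3/(4(n1 + n2) − 9), the sign convention (comparison minus experimental, so that lower symptom scores in the treated group give a positive value) is an assumption, and the means, standard deviations, and group sizes are hypothetical.

```python
import math


def pooled_sd(sd_exp: float, n_exp: int, sd_cmp: float, n_cmp: int) -> float:
    """Pooled within-group standard deviation of two independent samples."""
    pooled_var = ((n_exp - 1) * sd_exp ** 2 + (n_cmp - 1) * sd_cmp ** 2) / (
        n_exp + n_cmp - 2
    )
    return math.sqrt(pooled_var)


def standardized_difference(m_exp, sd_exp, n_exp, m_cmp, sd_cmp, n_cmp):
    """Mean difference divided by the pooled SD.

    Assumed sign convention: comparison minus experimental, so a lower
    symptom score in the experimental group yields a positive effect size.
    """
    return (m_cmp - m_exp) / pooled_sd(sd_exp, n_exp, sd_cmp, n_cmp)


def unbiased_g(d: float, n_exp: int, n_cmp: int) -> float:
    """Apply the approximate small-sample bias correction."""
    return d * (1 - 3 / (4 * (n_exp + n_cmp) - 9))


# Hypothetical posttreatment symptom scores: treated group vs. comparison group.
d = standardized_difference(40.0, 20.0, 62, 70.0, 20.0, 45)
g = unbiased_g(d, 62, 45)
```

With samples of this size the correction shrinks the estimate by less than 1%, so the biased and unbiased values lead to the same qualitative descriptor.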
Table 2. Unbiased Hedges’ g Effect Sizes: Cognitive-Processing Therapy (CPT), Prolonged Exposure (PE), and Minimal Attention (MA)
Random Effects Regression Analyses
Pretreatment–Posttreatment Effects: CAPS and BDI
We tested the accuracy of the major analyses conducted with LOCF data by running analyses testing the same hypotheses using the random regression method (or mixed-effects regression). Given that MA participants received pretreatment and posttreatment MA assessments and then were moved into one of the two active treatments, only two time points were assessed in these three-group random regression analyses. Random regression models handle nonrandom missing data due to dropout by estimating time trend lines for each individual based on available data for that individual as well as information about the parameters of the entire sample. The MIXREG (Hedeker & Gibbons, 1996) program for random effects regression was used to compare the three treatment groups on the CAPS and then on the BDI. It was necessary to run two separate analyses for each dependent variable (CAPS and BDI) to compare changes over time across three groups using MIXREG (R. Gibbons, personal communication, April 2000). As a means of guarding against increased experimentwise error rates, results were only considered significant at the .0125 level.
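The .0125 threshold is what a Bonferroni correction gives for four tests (two analyses for each of two dependent variables), assuming a familywise alpha of .05. The familywise alpha is an assumption here, since the text does not state it. A minimal sketch:

```python
def bonferroni_alpha(family_alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_alpha / n_tests


# Two MIXREG analyses for each of two dependent variables (CAPS and BDI).
alpha_mixreg = bonferroni_alpha(0.05, 2 * 2)

# The two completer MANOVAs described in the analysis plan.
alpha_completer = bonferroni_alpha(0.05, 2)
```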
The results of these analyses were consistent with the main findings of the LOCF analyses. The CPT group showed significantly more change in CAPS scores than the MA group (estimated improvement difference: −53.87, SE = 4.51, z = −11.96, p < .0001). The PE group also showed significantly greater change in CAPS score over time than the MA group (estimated improvement difference: −50.51, SE = 4.57, z = −11.06, p < .0001), but the CPT and PE groups were not significantly different from each other.
A second set of random effects regression analyses examined differences in BDI score change over time among the CPT, PE, and MA groups. As in the CAPS analyses, there was no indication of significant serial error correlations. CPT participants showed significantly greater changes in BDI scores over time than MA participants (estimated improvement difference: −15.93, SE = 2.09, z = −7.61, p < .0001). PE participants showed similarly significant decreases in BDI scores relative to the MA group (estimated improvement difference: −11.72, SE = 2.11, z = −5.55, p < .0001). The difference in BDI score change over time between the PE and CPT groups did not reach significance at the corrected .0125 level (estimated improvement difference: 4.21, SE = 2.03, z = 2.07, p < .04).
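The z statistics reported above correspond to each estimate divided by its standard error. The sketch below recomputes the PE versus CPT comparison on the BDI and checks it against the .0125 criterion. It assumes a two-sided normal test, and because it works from the rounded values printed in the text, the results can differ from the published figures in the second decimal place.

```python
import math


def wald_z(estimate: float, standard_error: float) -> float:
    """z statistic for an estimate relative to its standard error."""
    return estimate / standard_error


def two_sided_p(z: float) -> float:
    """Two-sided p value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# PE vs. CPT difference in BDI change, using the values given in the text.
z_bdi = wald_z(4.21, 2.03)
p_bdi = two_sided_p(z_bdi)

# Compare against the corrected significance level used for these analyses.
significant_at_corrected_alpha = p_bdi < 0.0125
```

The resulting p value falls between .03 and .04, which is below the conventional .05 level but above the corrected threshold, consistent with the reported conclusion of no significant difference.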
Random Effects Regression on PSS Change
Participants were given the PSS at the initial assessment, at the beginning of every other session, and at the posttherapy assessments. The two therapies were equal in terms of overall number of hours but differed in number and length of sessions. Mixed-effects linear regression analysis (MIXREG) was used to assess group differences in PSS score change over the course of therapy. The PSS was given to the CPT and PE groups on a regular basis, but the MA participants were not attending sessions. Therefore, only the CPT and PE groups were included in this analysis. A MIXREG run not allowing for serial correlation of errors showed a trend toward a significant difference between the CPT and PE groups on PSS score over time. The trend suggested a larger decrease in PSS scores over the course of CPT than over the course of PE. However, it was found that there was substantial serial correlation of errors (r = .68, p < .0001), best described by a first-order nonstationary autoregressive error pattern. In this pattern, any given PSS score is predicted to a much greater degree by the score at the previous time point than by the score at the time point before that. A MIXREG analysis accounting for serially correlated errors showed no trend toward significant treatment group differences on the PSS. There were no significant differences in PSS scores between the two conditions at baseline or over time in therapy.
Analyses of Treatment Completers
Means and standard deviations for the completer sample are shown in Table 3, and CAPS scores are plotted in the accompanying figure. In the first analysis, the three groups were compared in a repeated measures MANOVA from pretreatment to posttreatment with CAPS total score and BDI score as dependent variables. The MANOVA resulted in a significant interaction, F(4, 214) = 23.7, Pillai’s trace = .61, p < .0001, and significant treatment group, F(4, 214) = 9.4, Pillai’s trace = .30, p < .001, and session, F(2, 106) = 141.1, Pillai’s trace = .73, p < .0001, effects. Univariate repeated measures ANOVAs indicated that both the CAPS, F(2, 118) = 79.2, p < .0001, and the BDI, F(2, 107) = 26.0, p < .0001, resulted in significant interactions. Follow-up one-way ANOVAs indicated no pretreatment session differences but significant posttreatment effects on the CAPS, F(2, 118) = 76.1, p < .0001, and BDI, F(2, 110) = 32.8, p < .0001. Post hoc Tukey’s HSD tests indicated that the group differences on both the CAPS and BDI were between the MA group and the two treatment groups. A second analysis was conducted between the two treatment groups over the four assessment sessions. This repeated measures MANOVA resulted in significant session effects, F(6, 38) = 55.6, Pillai’s trace = .90, p < .0001, but no treatment type effect or interaction. On the CAPS, both groups exhibited a strong decrease in scores from pretreatment to posttreatment, F(1, 80) = 407.4, p < .0001; some increase from posttreatment to the 3-month follow-up, F(1, 73) = 8.5, p < .005; and no change from 3 months to 9 months. On the BDI, the groups improved significantly from pretreatment to posttreatment, F(1, 75) = 142.5, p < .0001. From posttreatment to 3 months, there were no significant changes, nor were there significant changes from 3 months to 9 months posttreatment.
Table 3. Mean Scores for Treatment Completers: Cognitive-Processing Therapy (CPT), Prolonged Exposure (PE), and Minimal Attention (MA) Groups
The effect sizes for the completer sample are shown in Table 2. Both therapies had large effects relative to MA at posttreatment on PTSD, depression, and guilt scores. Effect sizes were also calculated for CPT relative to PE at posttreatment and the 3- and 9-month follow-ups. At posttreatment and the 3-month follow-up, there were small CAPS effect-size differences for CPT as compared with PE. For the BDI, there were moderate effect-size differences between the two active treatments favoring CPT at posttreatment and the 3-month follow-up. Contrary to the ITT analyses, at the 9-month follow-up, PE showed a small effect-size difference relative to CPT for the CAPS; there were no differences for the PSS and BDI.
Diagnosis and Treatment Outcome
Finally, diagnoses were examined in the three groups at posttreatment using the symptom but not time criteria. First, in the ITT sample, only 1 MA client of 45 (2.2%) was PTSD negative at the post-MA assessment. In comparison with the MA group, 33 of the 62 women randomized into CPT (53%) and 33 of the 62 PE clients (53%) were negative for PTSD at posttreatment, χ2(2) = 35.9, p < .0001. In comparisons of those who received CPT versus PE over time, there were no significant differences in diagnosis at any of the time points. At the 3-month follow-up, 42% of CPT and 53% of PE clients still met criteria for PTSD. At the 9-month follow-up, 45% of CPT and 50% of PE clients were PTSD positive.
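The posttreatment chi-square value can be reproduced from the counts given above (1 of 45 MA, 33 of 62 CPT, and 33 of 62 PE clients PTSD negative). The sketch below is an illustrative Pearson chi-square test of independence written in plain Python, not the software used in the study. The p value uses the closed form exp(−χ²/2), which holds only for 2 degrees of freedom.

```python
import math
from typing import List, Tuple


def chi_square_independence(table: List[List[int]]) -> Tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Rows: MA, CPT, PE. Columns: PTSD negative, PTSD positive at posttreatment.
observed = [
    [1, 44],
    [33, 29],
    [33, 29],
]
chi2, df = chi_square_independence(observed)

# For df = 2 the chi-square survival function is exactly exp(-x / 2).
p_value = math.exp(-chi2 / 2)
```

The statistic rounds to 35.9 on 2 degrees of freedom, matching the reported value.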
There were also no significant differences between the two active treatments for the SCID (major depression) in the ITT sample. The SCID module for major depressive disorder (MDD) was not readministered until the 3-month follow-up because at posttreatment the assessment would have had to involve the last third of the treatment (2 weeks). At pretreatment, 43.5% of the CPT and 47.5% of the PE clients met criteria for MDD. At the 3-month follow-up, 30.6% of CPT and 29.5% of PE clients still met criteria for MDD. At the 9-month follow-up, 22.6% of the CPT clients and 29.5% of the PE clients continued to meet criteria for MDD.
Completing the treatments as designed, of course, yielded a very different picture. Of those who completed treatment, only 19.5% of CPT and 17.5% of PE clients still met criteria for PTSD. At the 3-month follow-up, 16.2% of CPT and 29.7% of PE clients were PTSD positive. At the 9-month follow-up, 19.2% of CPT and 15.4% of PE clients were still PTSD positive. There were no significant differences in PTSD diagnostic status between the CPT and PE groups at any time point.
With regard to depression comorbidity among treatment completers, 46.3% of CPT and 52.6% of PE participants also met criteria for current MDD at pretreatment. At the 3-month follow-up, 17.6% of CPT and 22.2% of PE participants still met criteria for MDD. At the 9-month follow-up, 3.8% of CPT and 15.4% of PE clients continued to meet criteria for MDD. All of the chi-square analyses were nonsignificant.
End-State Functioning
To determine the percentage of participants who achieved good end-state functioning, we computed an index that combined scores from the PSS and BDI using the same cutoffs as Foa et al. (1999). Good end-state functioning was defined as at or below a cutoff of 20 on the PSS and at or below 10 on the BDI. In the ITT sample at posttreatment, 53% of the CPT and 37% of the PE participants had good end-state functioning, and there was a trend for CPT participants to have better functioning than PE participants, χ2(1) = 3.3, p < .08. At 3 months posttreatment, there was also a trend, χ2(1) = 2.7, p < .11, with 50% of the CPT and 36% of the PE participants reporting good end-state functioning. At 9 months posttreatment, there was no difference between groups, with 45% of the CPT and 40% of the PE participants reporting good end-state functioning.
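The end-state index defined above (PSS at or below 20 and BDI at or below 10) can be expressed as a simple classification rule. The function names and the participant scores in this sketch are hypothetical; only the two cutoffs come from the text.

```python
from typing import Iterable, Tuple


def good_end_state(pss: float, bdi: float,
                   pss_cutoff: float = 20, bdi_cutoff: float = 10) -> bool:
    """True when both scores are at or below their cutoffs."""
    return pss <= pss_cutoff and bdi <= bdi_cutoff


def percent_good_end_state(scores: Iterable[Tuple[float, float]]) -> float:
    """Percentage of (PSS, BDI) pairs meeting the good end-state criteria."""
    flags = [good_end_state(pss, bdi) for pss, bdi in scores]
    return 100.0 * sum(flags) / len(flags)


# Hypothetical (PSS, BDI) pairs for four participants.
sample = [(12, 6), (20, 10), (21, 4), (8, 15)]
percent = percent_good_end_state(sample)
```

Because both conditions must hold, a participant with a low PSS score but a BDI score above 10 (or the reverse) does not count as having good end-state functioning.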
In the completer sample, 76% of the CPT and 58% of the PE participants reported good end-state functioning, resulting in a trend, χ2(1) = 2.9, p < .09. At the 3-month follow-up, 72% of CPT and 50% of PE participants reported good end-state functioning, again a trend favoring CPT, χ2(1) = 3.6, p < .06. At 9 months, there was no significant difference between the two treatments, with 64% of CPT and 68% of PE participants reporting good end-state functioning.
Supplementary Analyses
Length of Time Since Index Rape
Because the length of time since the index rape varied from 3 months to 33 years, it is possible that treatment outcome was affected by chronicity. The distribution of years since rape was somewhat skewed toward more recent index traumas (within the previous 2 years). Therefore, instead of using years since index rape as a continuous variable, we divided the ITT and completer samples into three relatively equal groups based on percentile: 3 months to 2.25 years (n = 56), 2.3 to 10 years (n = 54), and more than 10 years (n = 58). There was no significant difference in the distribution of the chronicity groups across the three therapy conditions. The data were analyzed by means of 3 × 3 (Treatment Group × Time Group) ANOVAs at posttreatment with pretreatment CAPS, BDI, and PSS scores as covariates. There were no interactions or main effects for length of time since index rape for either the ITT sample or the completer sample.
Effect of Treatment on Guilt
One of the aims of the study was also to examine the effect of treatment on cognitions. After the study was under way, a decision was made to administer some of the measures to the MA participants only at the second assessment session to reduce the size of the assessment battery. However, we did administer these measures at both time periods initially, so the MA sample size was sufficient to compare the MA condition with the other two treatments. Also, we administered a reduced battery at the 3-month follow-up once we decided to implement a 9-month follow-up. We did not administer the TRGI at the 3-month follow-up. Therefore, in the case of these analyses, treatments were compared at pretreatment, posttreatment, and the 9-month follow-up.
First, using the ITT data, we conducted a repeated measures MANOVA comparing the groups over the three assessments. There were four dependent variables: global guilt, hindsight bias–responsibility, lack of justification, and wrongdoing. The MANOVA resulted in significant group, F(8, 244) = 2.3, Pillai’s trace = .14, p < .02, and session, F(4, 117) = 4.4, Pillai’s trace = .23, p < .001, effects. The interaction term was not significant. As with the symptom measures, there were no significant pretreatment differences between groups. However, at posttreatment, the groups were different on all four subscales: global guilt, F(2, 159) = 8.8, p < .0001; hindsight bias, F(2, 157) = 9.1, p < .0001; lack of justification, F(2, 154) = 10.6, p < .0001; and wrongdoing, F(2, 153) = 6.3, p < .005. Post hoc Tukey’s HSD tests indicated that both the CPT and PE groups had significantly lower global guilt and wrongdoing scores than the MA group. However, the CPT group had significantly lower hindsight bias and lack of justification scores than either the PE group or the MA group, which did not differ from each other.
At the 9-month assessment, with MA posttreatment scores carried forward, there were also significant differences on all four measures: global guilt, F(2, 159) = 10.1, p < .0001; hindsight bias, F(2, 157) = 11.3, p < .0001; lack of justification, F(2, 154) = 8.8, p < .001; and wrongdoing, F(2, 153) = 8.5, p < .001. The Tukey’s HSD test revealed the same pattern as the posttreatment assessment, with the CPT group having lower hindsight bias and lack of justification scores than the PE and MA groups and both active treatments resulting in lower scores than MA on global guilt and wrongdoing. Repeated measures MANOVAs for each group individually indicated that the MA group did not improve over time, whereas both the CPT and PE groups improved significantly over time: CPT, F(8, 49) = 5.0, p < .001, and PE, F(8, 48) = 4.8, p < .001. There were no significant changes from posttreatment to the 9-month follow-up for either active treatment.
The analyses for treatment completers replicated those for the ITT sample (and are available from Patricia A. Resick on request). Effect sizes for guilt cognitions are listed in Table 2 for the ITT and completer samples. In the ITT sample, CPT showed a large effect size for guilt cognitions, whereas PE showed a medium effect size. In the completer sample, both groups exhibited very large effects; however, there were moderate-to-large effect sizes for CPT relative to PE at posttreatment and 9 months posttreatment.
Delayed Treatment Results
Finally, on completion of the MA condition, interested participants were randomly assigned to one of the two active treatments. The treatment results in these delayed condition groups replicated those for the participants who were assigned directly to treatment. The accompanying figures depict total CAPS scores in the ITT and completer samples for all four groups (the initial findings for the three groups [CPT, PE, and MA] and the delayed treatment results for the women who were subsequently assigned to CPT or PE on completion of the MA condition).