To compare the rates of physical, psychiatric, and suicide-related events in adolescents with MDD treated with fluoxetine alone (FLX), cognitive-behavioral therapy (CBT), combination treatment (COMB), or placebo (PBO).
Safety assessments included adverse events (AEs) collected by spontaneous report, as well as systematic measures for specific physical and psychiatric symptoms. Suicidal ideation and suicidal behavior were systematically assessed by self- and clinician reports. Suicidal events were also reanalyzed by the Columbia Group and expert raters using the Columbia Classification Algorithm of Suicide Assessment, the system used in the U.S. Food and Drug Administration reclassification effort.
Depressed adolescents reported high rates of physical symptoms at baseline, which improved as depression improved. Sedation, insomnia, vomiting, and upper abdominal pain occurred in at least 2% of those treated with FLX and/or COMB and at twice the rate of placebo. The rate of psychiatric AEs was 11% in FLX, 5.6% in COMB, 4.5% in PBO, and 0.9% in CBT. Suicidal ideation improved overall, with greatest improvement in COMB. Twenty-four suicide-related events occurred during the 12-week period: 5 patients (4.7%) in COMB, 10 (9.2%) in FLX, 5 (4.5%) in CBT, and 3 (2.7%) in placebo. Statistically, only FLX had more suicide-related events than PBO (p = .0402, odds ratio [OR] = 3.7, 95% CI 1.00–13.7). Only five actual attempts occurred (2 COMB, 2 FLX, 1 CBT, 0 PBO). There were no suicide completions.
Different methods for eliciting AEs produce different results. In general, as depression improves, physical complaints and suicidal ideation decrease in proportion to treatment benefit. In this study, psychiatric AEs and suicide-related events are more common in FLX-treated patients. COMB treatment may offer a more favorable safety profile than medication alone in adolescent depression.
Placebo-controlled, randomized clinical trials (RCTs) are the gold standard for evaluating efficacy and are also of critical importance in assessing safety and tolerability (March et al., 2004). Although they provide substantial information on the measurement of safety, most clinical trials in pediatric psychopharmacology are generally poorly designed for this purpose (Vitiello et al., 2003). For example, adverse events (AEs) in most of the pediatric antidepressant trials conducted before the Treatment for Adolescents With Depression (TADS) were primarily elicited by general inquiry (e.g., “Have you had any problems since your last visit?”), with no systematic ascertainment of AEs. Only one pediatric depression trial (Emslie et al., 2002) systematically collected information on AEs via a self-report questionnaire at each visit. Recently, Greenhill et al. (2003) reported on AE elicitation methods across pediatric psychopharmacological trials and found that the inconsistency between trials likely impairs the ability to accurately and promptly identify drug-induced AEs. Using the Safety Monitoring Uniform Report Form (SMURF; Greenhill et al., 2004), which was designed to improve detailed elicitation of AEs associated with psychopharmacological treatments, more AEs were identified by comprehensive body system review than by general inquiry, suggesting that the latter may not be a sufficient method for eliciting important and clinically-relevant AEs.
Clearly, medications can cause a variety of AEs involving many body systems. Major depressive disorder (MDD) itself, however, is associated with high rates of physical symptoms and behavioral changes, including pain, headaches, and stomachaches. Thus, disentangling medication-related AEs from changes in illness is difficult, both with respect to physical and behavioral symptoms, not to mention suicidality. Unfortunately, RCTs have not generally required a thorough evaluation of physical symptoms at baseline, before assessment of treatment-emergent AEs during acute treatment.
What do we know about AEs associated with selective serotonin reuptake inhibitors (SSRIs)? A number of physical and psychiatric AEs, such as headache, nausea, abdominal pain, insomnia, somnolence, tremor, and agitation, were more likely to occur with an SSRI than with placebo, in several of the pediatric clinical trials (Emslie et al., 2002; Keller et al., 2001; March et al., 1998; Wagner et al., 2003, 2004). In a few studies, more children on an SSRI displayed manic symptoms than children on placebo, but the incidence was too small to be statistically significant (Emslie et al., 1997).
Suicidality is generally too rare for clinical trials or even meta-analyses of clinical trials for a single medication to be informative. However, in a recent pooled analysis of suicidality, defined as suicidal thinking or behavior (attempts, preparatory behaviors, or ideation), suicidal AEs occurred in approximately 4% of children randomized to an antidepressant as compared to 2% on placebo (Hammad et al., 2006). Conversely, neither analyses of epidemiological samples nor analyses of large databases of naturalistically treated patients have found an association between SSRI use and an increased rate of completed suicide in any age group. In fact, most have actually shown an association between SSRI use and a decreasing rate of suicide (Gibbons et al., 2005; Olfson et al., 2003). Similarly, the lack of positive toxicology screening for antidepressants in completed youth suicides does not support the notion that antidepressants are a consistent factor in death by suicide (Gray et al., 2002; Isacsson et al., 2005; Leon et al., 2004). Considering the available data, it appears that antidepressant use is correlated with increased suicidality in a small select group of individuals, but not correlated with completed suicide.
The safety data from TADS are unique in several ways. First, it is the only trial that can compare safety outcomes from medication, cognitive-behavioral therapy (CBT), the combination of medication and CBT (COMB), and medical management with pill placebo (PBO). Therefore, not only does PBO provide a double-blind control, but the CBT-only arm provides an additional nondrug comparison condition. Because MDD carries with it both physical (e.g., insomnia) and psychiatric (e.g., irritability) symptoms, not to mention suicidal behavior, which can be a symptom of MDD, it is often difficult to disentangle AEs from a lack of efficacy. Beyond the contrast with PBO, the availability of safety data from treatment with CBT alone and in combination with fluoxetine (FLX) may shed additional light on this issue. However, as highlighted in the introduction to this special section by March et al., the clinicians and study participants were not blind to treatment in the COMB and CBT arms, which undoubtedly affected not only efficacy, but also safety outcomes. Second, this is the first CBT trial to assess AEs. Earlier psychotherapy trials did not assess safety at all, so this trial provides preliminary reports on AEs collected by CBT therapists. Third, TADS is the first trial to compare COMB treatment to monotherapies in depressed adolescents. On the primary outcomes, TADS demonstrated that COMB was the most effective treatment, followed closely by FLX; CBT alone was equivalent to PBO (TADS, 2004). Assessing the safety of individual treatment arms will provide additional information for the risk-benefit ratio of each treatment.
Finally, unlike most of the prior antidepressant trials, which used only spontaneous reporting of AEs, TADS used two methods to elicit data relevant to safety issues: spontaneous reporting of AEs and systematic assessments using self-reports, clinician-rated and independent evaluator–rated measures of physical symptoms, psychiatric symptoms, and suicide-related behaviors. Hence, although AE reporting was not a primary outcome of the study, TADS is able to provide more details about AEs than ordinarily available from RCTs.
In this study, which replicates and extends the earlier intent-to-treat acute treatment safety outcomes (TADS, 2004), we tested whether the four treatment interventions of TADS differ as to physical symptoms, general psychiatric symptoms (including symptoms of mania/hypomania), and suicidal ideation and suicide-related AEs. For each of these safety outcomes, a dual approach was taken, examining systematically administered rating scales as well as individually reported AEs.
The rationale, design, methods, and sample characteristics have been described in prior reports (TADS, 2003, 2005) and are included in the Special Section introduction by March et al. The patients were adolescents (N = 439), ages 12 to 17 years (mean ± SD, 14.6 ± 1.5 years), who were outpatients, with a primary DSM-IV diagnosis of MDD. The baseline mean Children's Depression Rating Scale-Revised (CDRS-R) total score was 60.1, and 86% of the sample were in their first episode of MDD. Patients were randomly assigned to receive one of four possible treatment conditions: FLX (n = 109), CBT (n = 111), COMB (n = 107), or PBO (n = 112). The trial was double blind for the FLX and PBO conditions, and single blind for the CBT and COMB conditions.
The primary efficacy and safety outcomes paper (TADS, 2004) reported on the intention-to-treat (ITT) sample composed of enrolled patients randomly assigned to one of the four treatment arms, regardless of protocol adherence or completion. However, some participants may have received out-of-protocol treatments (e.g., subjects assigned to CBT may have been prematurely terminated if the addition of antidepressant treatment was clinically indicated or vice versa). These subjects were allowed to continue in the study and are included in the ITT analyses. In this paper, however, we focus on events occurring only within the assigned treatment condition. To minimize confounding events associated with out-of-protocol treatments, the focus here is on observed cases (OCs), defined as randomized patients who at the time of the specified assessment or AE were still in their assigned treatment arm. That is, the subject was still active in the study (had not dropped out), and the randomized treatment had not been prematurely terminated (for ethical reasons, discontinued or modified by the study clinician, such as adding one of the other treatments or an out-of-protocol treatment) before the assessment or AE under consideration. However, baseline rates of symptoms are the same as those reported in the primary acute ITT paper (TADS, 2005), as all subjects remained within their assigned treatment arm at the baseline assessment.
At each treatment visit, the patient, and often his or her primary caregiver, was asked through general inquiry how the teen was doing and if he or she had experienced any problems since the last visit. An AE was defined as any unfavorable medical change occurring post-randomization that was deemed to be clinically significant, independent of relatedness to treatment. An AE was considered clinically significant if it (1) was accompanied by interference in functioning or (2) required medical attention. Such events would normally be considered of moderate severity or greater in traditional RCTs of medication where AEs were classified as mild, moderate, or severe. Each event was documented as to whether it was related to FLX or CBT. Using an AE form that was completed at each treatment visit, pharmacotherapists captured AEs for all medication patients (FLX, PBO, or COMB), and CBT therapists captured AEs for CBT only patients. A serious AE (SAE) was defined using standard U.S. Food and Drug Administration (FDA) language: Life threatening (at immediate risk of death), requires hospitalization for any reason, results in persistent or significant disability or incapacity, results in congenital anomaly or birth defect, results in death, or other significant medical event, including cancer.
Additional information regarding physical and psychiatric symptoms was obtained using two systematically collected rating scales, the Physical Symptoms Checklist (PSC) and the Adolescent Depression Scale (ADS), which were developed specifically for TADS.
The PSC, a symptoms checklist, was obtained at baseline and weeks 6 and 12. The PSC is a 47-item Likert-style self-report measure that ascertains both somatic and CNS signs and symptoms. Ratings range from 0 to 3: 0 = not at all; 1 = just a little; 2 = pretty much; 3 = very much. Patients were asked to rate symptoms present during the prior week. Excluding items pertaining to female gynecological symptoms, a PSC total severity score was derived from items 1 to 43. A principal components analysis (PCA) was conducted on these items using a Varimax rotation method to derive factor-based scores. Initially, 11 factors were identified by a minimum eigenvalue criterion of 1.0. Through the PCA process, 29 items were retained in the final model based on 416 patients. Fourteen items were eliminated because the item did not have a factor loading of 0.40 on any of the factors, the item did not consistently load on one factor, or the item did not make conceptual sense within that factor. Eight components that accounted for 60.5% of the variance were identified and labeled sleep, upper respiratory, pain, cardiac, panic, elimination, nausea, and skin. Factor-based scores were derived from those items within each component with a factor loading of 0.45 or greater. Headaches, which were not represented as a separate factor on the PSC but are commonly reported as an AE of SSRI treatment, also are analyzed separately. (Requests for the PSC should be directed to John March, M.D., at Duke University.)
The ADS is a 31-item Likert-style inventory that was completed at each treatment visit based on information from the teen. The ADS systematically assessed depressive symptoms, suicidality, symptoms associated with mania, and family, peer, and school functioning for each subject. Patients were asked to answer yes or no whether each symptom and functional problem was present in the week before the visit. The clinician then interviewed the subject to evaluate severity of each symptom. Severity ratings (0–3) were then coded for each ADS item. Severity was assessed as not present (0), mild (1), moderate (2), or severe (3). Pharmacotherapists rated patients randomized to any of the medication groups, and CBT therapists rated the patients in the CBT-only group. If a subject in the COMB group saw a CBT therapist on a day he or she did not see a pharmacotherapist, then the CBT therapist completed the ADS. A mania total score (range 0–27) was calculated from the clinician severity ratings for nine mania-related items. (Requests for the ADS should be directed to John March, M.D., at Duke University.)
Spontaneous reports of suicide-related behaviors were captured through standard AE reporting methods. Clinicians and site supervisors determined how to label these events, which were then reviewed on a trialwide conference call, and finally recoded for consistency by three raters at the TADS coordinating center. Because of the controversies surrounding SSRIs and suicidality, the FDA commissioned suicidology specialists at Columbia University to conduct a reclassification of all possibly suicidal events (suicidal thoughts or behaviors), applied by an internationally recognized expert panel, across all clinical trials of antidepressants (Posner et al., 2004). The group developed a highly reliable classification system that distinguishes suicidal from nonsuicidal events known as the Columbia Classification Algorithm of Suicide Assessment. To provide consistency across pediatric trials, the TADS investigators requested the Columbia-led group reclassify the TADS harm-related events, not only the events within the FLX and PBO groups (which had already been done for the FDA), but also to review and classify the events within COMB and CBT-only groups.
Both electronic and manual searches of all potential harm-related events were conducted using the search strategy used for the recent FDA analyses. In general, text strings such as “suicide” and “overdose” were assessed. Additional details about specific text strings used are available on the Journal's Web site at www.jaacap.com via the Article Plus feature. Narratives for all possible suicidal and aggressive events were submitted to Columbia, where each event was reviewed and coded by two independent raters from the expert panel. The Columbia coding system used seven codes for assessing all suicidal and harm-related events. Three of the codes were considered suicidality (ideation and behavior): suicide attempt (FDA Code 1: any self-injurious behavior associated with some intent to die), preparatory actions toward imminent suicidal behavior (FDA Code 2), and suicidal ideation (FDA Code 6: includes passive as well as active suicidal ideation). A fourth code, self-injurious behavior with unknown intent (FDA Code 3), was included in sensitivity analyses in the FDA work as possibly suicide-related events. These classification categories are reported here.
The SIQ-Jr (Reynolds, 1987) is a 15-item self-report focusing on suicidal thinking. Patients completed the SIQ-Jr at baseline and weeks 6 and 12. A total score ≥31 on the SIQ-Jr is considered a flag indicating elevated suicidal risk. In addition, a score of 5 or 6 on two or more items of a subset of individual items (items 2–4, 7–9) is also considered a flag for potentially serious self-destructive behavior.
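The two SIQ-Jr flags described above can be written as a short rule. The following is a minimal sketch, assuming each of the 15 items is scored as an integer and supplied in item order; the function name and the returned field names are illustrative and are not part of the TADS materials.

```python
from typing import Sequence

TOTAL_CUTOFF = 31                     # total-score threshold stated in the text
CRITICAL_ITEMS = (2, 3, 4, 7, 8, 9)   # 1-based item numbers named in the text
CRITICAL_SCORES = (5, 6)              # item scores that count toward the item flag


def siq_jr_flags(items: Sequence[int]) -> dict:
    """Return the total score and both risk flags for one SIQ-Jr administration."""
    if len(items) != 15:
        raise ValueError("the SIQ-Jr has 15 items")
    total = sum(items)
    # Count critical items (2-4, 7-9) rated 5 or 6.
    n_critical = sum(1 for i in CRITICAL_ITEMS if items[i - 1] in CRITICAL_SCORES)
    return {
        "total": total,
        "total_flag": total >= TOTAL_CUTOFF,
        "item_flag": n_critical >= 2,
    }
```

Keeping the two flags separate, rather than collapsing them into a single indicator, mirrors the text, which treats the total score and the item subset as distinct signals.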
The CDRS-R (Poznanski and Mokros, 1996) is a 17-item clinician-rated depression severity scale that was completed by independent evaluators blind to treatment assignment at baseline and weeks 6 and 12. Item 13 on the CDRS-R evaluates suicidal behavior and is rated on a scale of 1 to 7. Scores of 1 to 2 are suggestive of little or no suicidal thinking or behavior; a score of 3 suggests thoughts of suicide or suicidal ideation. Scores of 5 or greater suggest significant interference in functioning caused by suicidal thinking or behaviors.
As stated previously, data are included only for patients who were active in their assigned treatment arm at the time of the assessment. For example, if a subject was prematurely terminated from or dropped out of their assigned treatment arm, then subsequent AEs/SAEs would not be addressed in this article. In this way, only events occurring within the actual treatment assignment are analyzed. The rationale for this approach is that some patients sought or were provided additional or alternative treatment options following premature termination from their assigned treatment. For example, a subject assigned to CBT could be prematurely terminated because of lack of benefit and, as a result, begin treatment with FLX, yet continue to receive study CBT and complete study assessments. By censoring the data to use OCs only, events cannot be misattributed to the treatment in the original study arm.
Tests for differences in proportions (e.g., χ² and Fisher exact tests) were used to examine between-treatment differences in event rates derived from the spontaneous AE reporting and the systematic assessments. Odds ratios (ORs) and 95% confidence intervals (CIs) were also calculated for event rates in order to measure the relative risk in an active treatment arm when compared to PBO or in SSRI-treated cases relative to non-SSRI treated cases. Linear random regression models for repeated measurements were employed to test for between-treatment differences in rate of change (slope) across time as well as at the end of the acute treatment period (week 12) in the SIQ-Jr and PSC total scores. Fixed effects within the model were treatment, the natural log of time, and treatment × time, and patient and patient × time were included as random effects. Nondirectional statistical tests were conducted and the α level for each omnibus test was 0.05. If a significant treatment effect or treatment × time interaction was detected, a closed testing approach was applied to the paired contrasts and the level of significance for each between-treatment comparison was set at .05. When the treatment effect or treatment × time interaction was not significant at the .05 level, paired contrasts were conducted, but a sequential rejective method was applied in order to protect the type I error rate (Koch and Gransky, 1996).
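The text does not spell out which sequential rejective method was applied to the paired contrasts. As an illustration only, the sketch below assumes the Holm step-down procedure, one widely used sequentially rejective method; the function name and the p-values in the usage note are hypothetical and are not study results.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure: return one reject/retain decision per p-value.

    The smallest p-value is compared with alpha/m, the next with alpha/(m-1),
    and so on; testing stops at the first comparison that fails.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all larger p-values are retained as well
    return reject
```

For three hypothetical contrasts with p-values 0.01, 0.04, and 0.03, only the first would be rejected: 0.01 passes the 0.05/3 threshold, but 0.03 exceeds 0.05/2, so testing stops there.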
TADS treatments proved acceptable and tolerable. Of the 439 randomized patients, 359 (81.8%) remained in their assigned treatment arm through 12 weeks of acute treatment, although eight of these missed the week 12 assessment visit. The proportion of patients who remained in the assigned treatment arm was numerically greater among fluoxetine-treated patients. Specifically, 86.0% in COMB, 83.5% in FLX, 78.4% in CBT, and 79.5% in PBO were in their assigned treatment arm at the week 12 assessment point. There were no significant differences in discontinuation rate (dropout and/or premature termination) between the treatment arms. The mean maximal dose (in milligrams) prescribed was 27.9 ± 8.4 for the COMB arm, 32.8 ± 10.6 for the FLX arm, and 33.5 ± 9.6 for the PBO arm.
Baseline rates of self-reported physical symptoms (i.e., not including psychiatric symptoms or self-harm) were similar across the four groups and were quite high. For example, 53.8% of the sample reported trouble sleeping, 33.8% reported headaches, and 26.6% reported stomach pain or ache before initiation of treatment. Rates of each physical symptom from the PSC, a self-report checklist, can be found in the Article Plus material. Across the entire sample, girls had higher mean PSC total scores than boys (24.7 ± 17.9 versus 18.1 ± 15.4; p < .0001).
There was a decrease in the total severity of self-reported physical symptoms across all groups during the 12 weeks of acute treatment. Interestingly, COMB, FLX, and PBO all had significantly lower total PSC scores at week 12 (11.0 ± 8.2, p = .0036; 13.0 ± 10.5, p = .0328; and 12.3 ± 8.9, p = .0280, respectively) than CBT (18.5 ± 17.5). There were no significant differences between the other three groups.
On the eight factor–based scores (sleep, upper respiratory, pain, cardiac, panic, elimination, nausea, and skin), each of the four treatment groups showed improvement on all eight factors. Participants receiving CBT showed less improvement over time than the other three groups on all eight factors, although most did not reach statistical significance. The only factor to show a treatment × time interaction was pain, with participants receiving either FLX or COMB significantly more improved than those receiving CBT (p = .0017 and p = .0011, respectively).
Two hundred eleven spontaneous (but not necessarily related to treatment) physical AEs that required medical attention or caused dysfunction were reported in 113 patients during the course of the 12 weeks of acute treatment. No significant differences were found in number of patients reporting AEs between the three groups receiving pills: FLX (n = 35), COMB (n = 37), or PBO (n = 34). However, more events were reported by those in the FLX-only group (within similar numbers of patients) compared to those receiving either COMB or PBO (FLX = 81, COMB = 61, PBO = 60). Relatively few AEs were reported by CBT therapists (n = 9) compared with pharmacotherapists (n = 202).
Sedation, insomnia, vomiting, and upper abdominal pain were reported in at least 2% of patients and at rates at least two times greater with FLX and/or COMB than PBO (Table 1). Nonetheless, these AEs were infrequent (<5%). The only AE occurring in more than 5% of patients was headache, which occurred at similar rates in those treated with FLX (11.9%), COMB (6.8%), and PBO (10.7%). As stated, few AEs were reported for the CBT group, and no symptoms were reported at a greater rate with CBT than with PBO.
All items on the PSC were analyzed for worsening or emergence of symptoms, with the caveat that treatment-emergent symptoms on the PSC were not necessarily coded as AEs by spontaneous report. To determine worsening or emergence of symptoms, all patients receiving a 0 or 1 (0 = not at all, 1 = just a little) at baseline and increasing to a 2 or 3 (2 = pretty much, 3 = very much) at week 6 or 12 were identified, as well as those who increased in severity from a 2 (pretty much) to 3 (very much). Based on PSC self-report, treatment with fluoxetine did not lead to significantly higher rates of symptom worsening or emergence on any physical symptoms. Table 2 illustrates the rates of worsening or emergence of symptoms based on self-report for all four treatment groups. Furthermore, in comparing only the double-blind treatment groups (FLX versus PBO), there were no significant differences in emergence or worsening of any physical symptoms.
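The worsening-or-emergence rule for a single PSC item can be expressed compactly. The sketch below is one reading of that rule, assuming item ratings are integers from 0 to 3 and that a missing follow-up visit is passed as None; the function name is illustrative.

```python
from typing import Optional


def psc_item_worsened(baseline: int,
                      week6: Optional[int] = None,
                      week12: Optional[int] = None) -> bool:
    """Apply the PSC worsening/emergence rule to one item.

    Emergence: baseline 0 or 1, rising to 2 or 3 at week 6 or week 12.
    Worsening: baseline 2, rising to 3 at week 6 or week 12.
    A baseline of 3 cannot worsen on this scale.
    """
    followups = [s for s in (week6, week12) if s is not None]
    if baseline in (0, 1):
        return any(s >= 2 for s in followups)
    if baseline == 2:
        return any(s == 3 for s in followups)
    return False
```

Note that under this rule a change from 0 to 1 does not count, so the definition captures only shifts into the upper half of the scale or to its maximum.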
Clinical Global Impression-Improvement (CGI-I) ratings were completed at week 6 and week 12 by independent evaluators. CGI-I was based on improvement of depression. Three patients (2.8%) on FLX and two (1.9%) on COMB received a 5 or greater, indicating slightly worse (score of 5), much worse (6), or very much worse (7) at either week 6 or 12, compared to 10 (9.0%) on CBT and 7 (6.3%) on PBO. There was a statistically significant difference between subjects receiving fluoxetine (FLX and COMB) and subjects not receiving FLX (CBT and PBO; p = .01); however, the individual arms were not compared because the frequencies were too small in each arm. These rates are based on OCs and are similar to the ITT analyses presented by March (2005).
Worsening of the CGI-S score produced similar results. A worsening of at least 1 point from baseline to either week 6 or 12 was found in 2 (1.8%) patients receiving FLX, 2 (1.9%) in COMB, 5 (4.5%) in CBT, and 9 (8.0%) in PBO.
Table 3 presents rates for psychiatric-related AEs (spontaneous reports occurring at any time during the 12 weeks) grouped by clusters: mania spectrum, irritability/depression spectrum, agitation spectrum, anxiety, and other. There were more psychiatric AEs in patients treated with FLX (either alone or in combination) than in patients treated with CBT or PBO. Within mania spectrum (mania, hypomania, and elevated mood), one (0.92%) subject assigned to FLX developed mania and four patients developed hypomania (one COMB, one PBO, and two FLX). One subject was reported to have elevated mood (FLX). On the irritability/depression spectrum (hypersensitivity, irritability, anger, worsening of depression, and crying), two patients developed hypersensitivity (FLX), two irritability (one COMB, one FLX), one anger (FLX), one worsening of depression (PBO), and one crying (COMB). A total of five patients exhibited symptoms in the agitation spectrum (agitation, akathisia, nervousness, restlessness, and hyperactivity), including one subject with akathisia (COMB), one with nervousness (PBO), one with hyperactivity (FLX), and two (FLX, PBO) reporting restlessness. Three patients reported anxiety/panic symptoms, including two with panic attacks (FLX, CBT) and one with anxiety (FLX). Three patients developed tremor (two FLX, one COMB), one developed “abnormal behavior” (PBO), and one subject reported “feeling spacy” (COMB). More patients on FLX alone had psychiatric AEs than those on PBO; however, the numbers are too small to detect statistical significance.
Mania symptoms were also assessed using the ADS. The ADS is initially completed by the teen, who is asked simply whether a symptom is present. Clinicians then interviewed the teen about symptoms and rated the severity of each item (0–3), for a mania subscale score of 0 to 27. The majority of adolescents (83.4%) reported the presence of at least one symptom within the mania items on the ADS, and 59% reported presence of at least two symptoms. The most common symptoms endorsed by teens as positive were “having trouble paying attention or keeping your mind on what you are doing” (68.8%), “racing thoughts or having too many ideas in your head at one time” (33.5%), and “talking on and on, or talking very fast” (23.8%).
At baseline, mean (± SD) mania baseline severity scores were very low and similar for all groups: COMB = 2.6 ± 2.4, FLX = 2.2 ± 2.2, CBT = 2.5 ± 2.4, and PBO = 2.2 ± 2.3 (total sample 2.4 ± 2.3; range 0–12).
All four treatment groups showed a decrease in the total mania score on the ADS during the 12 weeks of treatment; final ADS Mania subscale scores were 0.5 ± 0.8 COMB, 1.1 ± 1.0 FLX, 1.1 ± 0.1 PBO, 1.0 ± 1.2 CBT (total sample, 0.9 ± 1.4). Total scores for COMB at endpoint were significantly lower than those for FLX (p = .013), PBO (p = .003), and CBT (p = .012).
Change scores on the ADS Mania subscale were constructed to assess for emergence or worsening of behavioral symptoms during treatment. Only subjects with at least two mania total scores were included (n = 424). A total of 65 of 424 (15.3%) adolescents had an increase of 3 points or more during the 12 weeks of treatment: 20% (n = 21) COMB, 14.2% (n = 15) FLX, 12.3% (n = 13) CBT, and 15.0% (n = 16) PBO. Most increases were with trouble paying attention (mean increase of 0.71 ± 1.3), racing thoughts (mean increase of 0.57 ± 0.9), excessive talking/talking very fast (mean increase of 0.49 ± 0.7), increase in activities (mean increase 0.49 ± 0.9), and impulsivity (mean increase 0.49 ± 1.1).
Of the five patients who developed mania or hypomania based on AE reporting, three were on FLX, one was on COMB, and one was on PBO. The three on FLX had high baseline ADS mania symptom scores (≥5). The patients on COMB and PBO who developed hypomania had baseline ADS scores of 0 and 1, respectively. Thirty-eight patients randomized to FLX treatment (with or without CBT) had an ADS Mania subscale total score ≥5 at baseline. Thus, most (92%) patients with elevated ADS Mania scores tolerated FLX treatment without developing additional symptoms of mania or hypomania.
Suicidal ideation was common at baseline, with 29.2% of patients reporting significant suicidal ideation on a self-report measure (SIQ-Jr). Based on independent evaluator assessment, 21.4% of patients reported some suicidal behavior at baseline (CDRS-R item 13 ≥3). The COMB group reported more suicidal ideation at baseline on the SIQ-Jr, compared with the other treatment arms, although this difference did not reach statistical significance.
Suicidal ideation improved across all groups during acute treatment (Fig. 1). Improvements in suicidal ideation have been reported previously (TADS, 2004), and the OCs findings were similar. Adjusted mean scores at the end of 12 weeks were not statistically different between the four individual treatment groups (COMB: 10.9 ± 0.3, FLX: 13.7 ± 0.2, CBT: 11.3 ± 0.3, PBO: 14.5 ± 0.6). However, as the group in COMB started with higher baseline severity scores, the overall improvement in suicidal ideation was greater for COMB than FLX (p = .004), CBT (p = .04), and PBO (p = .02).
Self-reported emergence or worsening of suicidal ideation on the SIQ-Jr was also analyzed. Emergence or worsening of suicidal behavior was defined as a baseline SIQ-Jr total score below 31 followed by a total score ≥31 at either week 6 or 12. A total of 4.8% (18/374) of patients showed emergence or worsening of suicidal behavior based on this self-report: 2.2% (2/93) in COMB, 7.3% (7/96) in FLX, 2.2% (2/93) in CBT, and 7.6% (7/92) in PBO. There were no significant differences between the four individual groups.
Clinician ratings were also analyzed for worsening and emergence of suicidality (CDRS-R item 13). To be conservative, any worsening of 1 or more points was considered worsening of suicidality. Five percent of COMB patients, 13.4% on FLX, 15.2% on CBT, and 7.2% on PBO showed worsening of suicidality from baseline to either week 6 or 12 on this measure. For emergence of suicidality, any subject who reported little or no suicidal ideation or behavior at baseline (score of 1 or 2), but who worsened to marked suicidal ideation/behavior or worse (score ≥5) at week 6 or 12 was reviewed. Based on these more stringent criteria, none of the COMB treated patients reported emergence of suicidality, compared to 3.7% on FLX, 1.3% on CBT, and 2.6% on PBO. There were no statistically significant differences between groups on worsening of suicidal ideation or behavior based on self- or clinician reports.
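The two clinician-rated definitions differ in stringency, which a small sketch makes explicit. It assumes CDRS-R item 13 ratings are integers from 1 to 7 and that the available follow-up ratings (week 6 and/or week 12) are passed as a list; the function names are illustrative.

```python
from typing import Iterable


def item13_worsened(baseline: int, followups: Iterable[int]) -> bool:
    """Conservative definition: any increase of 1 or more points from baseline."""
    return any(score - baseline >= 1 for score in followups)


def item13_emerged(baseline: int, followups: Iterable[int]) -> bool:
    """Stringent definition: baseline of 1 or 2, reaching 5 or higher at follow-up."""
    return baseline in (1, 2) and any(score >= 5 for score in followups)
```

A patient moving from 2 to 3 would count as worsened but not as emergent, which is consistent with the lower rates reported under the more stringent criteria.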
Suicide-related AEs were reanalyzed using the Columbia rating format. In instances where a subject had more than one suicide-related event, this subject was represented only once in the analysis and the most severe code was used, as in the FDA safety analysis. An event was rated as a suicide attempt if there was self-injurious behavior with some intent to die. Following the format used in the FDA analyses, suicide attempts (Code 1), as well as preparatory actions toward imminent suicidal behavior (Code 2), and suicidal ideation (Code 6) were considered suicide-related events in the primary analyses. There were no incidences of preparatory actions toward imminent suicidal behavior (Code 2) in the TADS. Additional sensitivity analyses also included self-injurious behavior with unknown intent (Code 3). Table 4 details the number of events for each group.
Based on the primary analyses (Codes 1, 2, and 6), 5 patients (4.7%) in COMB, 10 (9.2%) in FLX, 5 (4.5%) in CBT, and 3 (2.7%) in PBO had suicide-related events reported during the 12-week period. FLX alone had more incidences of suicide-related events than PBO (p = .0402, OR 3.7, 95% CI 1.00–13.7). However, patients taking FLX with CBT (COMB) had the same number of events as those with CBT and no medication. Very few actual suicide attempts occurred: two in COMB, two in FLX, one in CBT, and none in PBO. These rates are not statistically different between the four treatment groups. In the original TADS report (TADS, 2004), four COMB patients were reported as making a suicide attempt; however, one of these patients did so after prematurely terminating from the assigned treatment arm, and one subject was not coded by the Columbia group as making a suicide attempt, so these subjects are not counted here.
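For readers who want to see how an odds ratio of this size arises from the event counts, the following is a minimal sketch using the standard log-odds (Wald) approximation for the confidence interval. The group sizes of 109 (FLX) and 112 (PBO) are assumptions inferred from the reported percentages (10/109 = 9.2%, 3/112 = 2.7%), and the text does not state which interval method the authors used, so the bounds from this approximation need not match the published values exactly.

```python
import math


def odds_ratio_wald(events_a, n_a, events_b, n_b, z=1.96):
    """Odds ratio of group A versus group B with a Wald interval on the log scale."""
    a, b = events_a, n_a - events_a  # group A: events, non-events
    c, d = events_b, n_b - events_b  # group B: events, non-events
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Event counts from the text; group sizes are ASSUMED from the reported percentages.
or_flx_pbo, ci_low, ci_high = odds_ratio_wald(10, 109, 3, 112)
print(f"OR = {or_flx_pbo:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

Under these assumptions the point estimate is about 3.7, with a lower bound close to 1 and an upper bound near 13.7. An interval whose lower bound sits this close to 1 indicates that the difference between FLX and PBO is at the margin of statistical significance.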
Table 5 provides details of the 24 events included in the sensitivity analysis (suicide attempt, preparatory acts, suicidal ideation, and self-injurious behavior with intent unknown). Fifteen of the 24 events were identified as SAEs, and 10 subjects required hospitalization. Just over one third (38%) of the patients continued in the assigned treatment arm following the suicidal event. However, most of those with an actual suicide attempt (four of five) discontinued TADS treatment or sought additional treatment. Of those who experienced the 24 events, just over half were male (n = 14). More than half of the patients had at least one comorbid psychiatric disorder (58.3%), and 10 (41.7%) had reported high levels of suicidal ideation at baseline (either by total score or on at least two planning items on the SIQ-Jr scale).
Clinical characteristics before the attempt also varied. Most patients (18/24; 75.0%) were considered to be at least moderately depressed (CGI-S ≥4) at the visit before the suicidal event. Only six patients were considered mildly ill or better (CGI-S ≤3). In two of these cases, the last available CGI-S score was about 3 weeks before the attempt, so depression severity immediately before the attempt is unknown. In addition, psychosocial stressors immediately before the event were reported by the study clinician in 70.8% of the narratives. Seven patients (29.2%) had an ADS mania score ≥3 within 2 weeks before the event, which is slightly higher than the mean baseline score for all patients. However, none of the patients who developed mania, hypomania, or elevated mood (n = 6) had a suicidal event.
The timing of the suicide-related event was similar among the groups, with slightly longer mean time to event in the COMB group: 52.0 ± 20.8 days on treatment for COMB, 38.0 ± 21.7 for FLX, 45.4 ± 26.7 for CBT, and 32.0 ± 15.0 for PBO. Table 5 includes the number of days from treatment initiation to the suicidal event (Days on Tx). Most (70.8%) had been in treatment for at least 1 month before the event. For those receiving FLX (with or without CBT), 11 of 16 (68.8%) had been on the antidepressant for at least 1 month; six of eight (75.0%) of those not on medication (CBT or PBO) had been in treatment for at least 1 month. For those on FLX, the dose varied, but was consistent with the dosing for the entire subject population. Only four patients (16.7%) had had a recent dose change before the event. More specifically, two in COMB and one in PBO increased to 30 mg and one in FLX increased to 20 mg within 7 days before the event. None of these patients made a suicide attempt.
For those making a suicide attempt, four of five patients had high rates of suicidal ideation at baseline; four had expressed psychosocial stressors before the attempt. All five had been in treatment for at least 1 month before the attempt, and none of the adolescents in the medication groups had had a recent dose change before the attempt.
The TADS was the first study to compare efficacy and safety of FLX, CBT, their combination, and placebo in the treatment of pediatric depression. The study is also among a relatively small number of trials that used systematic assessment of AEs, as well as the general inquiry used in most clinical trials. Over 80% of the adolescents completed 12 weeks of treatment in their assigned treatment arm, indicating the TADS treatments were generally acceptable and tolerable.
Depressed adolescents had high rates of physical symptoms at baseline by systematic assessment, and as depression improved, physical symptoms also improved. Careful assessment of physical symptoms before initiating treatment is important to establish a baseline and identify change with treatment.
Systematic review of symptoms and spontaneous AE reporting produced quite different results, with systematic review eliciting substantially more positive symptoms. This is consistent with findings by Greenhill et al. (2004). Unlike many clinical trials, the definition of an AE in TADS included the requirement that the event be “clinically significant” (i.e., characterized by either interference in functioning or need to seek medical attention). It is unclear whether this change in definition affected the results between treatment conditions, but it is possible because there were fewer AE reports overall in this trial than in other clinical trials.
The difference in systematic versus spontaneous symptom reporting was most evident in the CBT group, where only nine spontaneous AEs were reported in the 111 patients randomized to that treatment group. Through systematic review, however, the CBT group had higher total adverse symptom scores at weeks 6 and 12 than the other three groups, suggesting under-reporting of AEs by CBT therapists. Hence, to compare AEs across treatment groups, future trials comparing different treatment modalities would profit from including a structured systematic assessment of AEs.
Few patients showed a worsening of psychiatric symptoms during the trial based on spontaneous report. Specifically, 22 showed a worsening of depression symptoms, with only three on FLX and two on COMB showing a worsening of depression. However, more patients on FLX reported psychiatric symptoms overall than those in COMB, PBO, or CBT. The numbers were too small, however, to detect statistical differences. Only one teen developed mania during the course of the study (in FLX), and the development of hypomania was also rare (one COMB, two FLX, one PBO). Interestingly, COMB did not have significantly more psychiatric AEs or more mania than PBO. It remains unclear how or why patients receiving FLX in combination with CBT had fewer psychiatric AEs than those receiving FLX alone. Is it CBT that provides protection or is it a dosing issue? The mean dose for the combination group was lower than in FLX alone, but it is unclear whether this led to a difference in physical or psychiatric event reporting.
Despite excluding patients in imminent danger of attempting suicide, rates of suicidal ideation were quite high at baseline (29.2%). Suicidal ideation improved over time, and only 9.6% had suicidal ideation at the end of 12 weeks. The COMB group had higher baseline severity scores, so the overall improvement in suicidal ideation was greater for COMB than the other three groups.
During the study, there were 24 reports of suicidal behavior (attempt, preparatory actions, self-injurious behavior with intent unknown, or suicidal ideation). Only five suicide attempts were reported during the acute trial in patients remaining in their assigned treatment arm (two COMB, two FLX, one CBT). Unlike the two previous placebo-controlled trials of FLX (Emslie et al., 1997, 2002), there were significantly more suicide-related events reported in patients taking FLX than those on PBO. The majority of the suicide-related events in the trial consisted of suicidal ideation (n = 18). Patients who had suicide-related events had a high number of risk factors, including moderate depression at the time of the event, psychosocial stressors before the event, high levels of suicidal ideation at baseline, and so forth. In spite of the common belief that suicide-related events occur shortly after initiating antidepressant treatment, in this study most events occurred over 1 month after initiating treatment, and this was consistent across all four groups.
It is often difficult to disentangle the factors that precipitate either suicidal ideation or suicide attempts. Is it a failure to treat the depression? Is it increased activation or agitation? Does the medication induce suicidal behavior? If it is simply a result of receiving medication, FLX with and without CBT would both be expected to have similar rates of suicide-related events; however, in this study, FLX monotherapy had more events than COMB, yet COMB had no more suicide-related events than CBT without medication. It is possible that CBT provides skills (e.g., coping skills, family conflict management) that can be used to reduce suicide-related events. It is also possible that the increased medication dose in the FLX-alone group influenced suicidal behavior. Finally, it may be that the reduced rate of suicide-related events in the combination group was related to the greater overall benefit and improvement of depression in that group. Other factors that may affect suicidal behavior are stressors and behavioral activation. TADS did not have adequate measures of these factors, so it is unclear what impact these factors may have had.
As noted in the Introduction to this special section by March et al., one important limitation to the TADS was that patients and clinicians in the COMB and CBT groups were not blind to treatment assignment, whereas the FLX and PBO arms were double blind. It is possible that spontaneous AE reporting was affected by this study design. Furthermore, subjects in COMB treatment had more frequent sessions, as these patients received full CBT sessions plus full pharmacotherapy sessions. Increased frequency of visits may have improved the safety outcomes for this treatment arm.
Another limitation is that CBT therapists have not historically assessed AEs per se, which was evident in this trial with so few reports of AEs in that group. It is likely that a rater's discipline (e.g., medicine versus psychology), amount of contact with subjects, and awareness of treatment conditions affect his or her assessment of safety. For example, discussion among CBT therapists revealed that general inquiry was at times either skipped or minimized in order to maximize the time available for topics and goals directly relevant to the CBT intervention. If CBT or other psychotherapy is evaluated as a comparator to medication in future research, it is essential that systematic assessment of these events be conducted. As seen with the self-report assessment in TADS, physical symptoms were reported substantially more often than spontaneous AEs, which is consistent with a recent study by Greenhill et al. (2004). Furthermore, systematic assessment should be done at each visit. In this study, one of the limitations was that the systematic assessment of physical symptoms was only done at weeks 6 and 12 of treatment. A subject who experienced one of these symptoms during the interim period may not have reported the event at the time the information was collected at the assessment visit.
Another limitation of the TADS trial was that other than mania, there was no systematic assessment of psychiatric symptoms, such as aggression, agitation, akathisia, and disinhibition, which are potentially associated with FLX. Because TADS provides only spontaneous reports of these events, which were rare and not associated with suicidality, a definitive answer to the question of whether an association between psychiatric symptoms and suicide-related events exists is not possible.
The TADS trial did provide a systematic measure of mania symptoms through weekly monitoring visits. However, the method used (ADS mania subscale) is not a standardized scale. Furthermore, it was noted by clinicians (both pharmacotherapists and CBT therapists) that scoring of the items was inconsistent, varying from clinician to clinician. Interpretation of data collected with this scale is further complicated because some of the endorsed items may have been the result of disorders other than mania/hypomania. For example, inattention, which was one of the most common positive items on the subscale, may have been caused by depression or ADHD, and it is not clear that clinicians discriminated in scoring based on cause of the symptom. Thus, if the symptom was problematic for the patient, it was likely to be coded as positive on the mania subscale, despite the fact that there may have been no other symptoms of mania or activation for that patient. This is likely to have resulted in selected items on this scale being evaluated as positive even in adolescents without mania or hypomania. As such, the results from the ADS mania subscale should be interpreted with caution. Still, TADS is the first depression trial to provide any systematic assessment of mania-related symptoms. Future studies will benefit from incorporating systematic measures of these symptoms both at baseline and throughout treatment.
Although this is the first study to report substantial detail regarding suicide-related events occurring within the treatment trial, several unanswered questions remain. First, although adolescents completed a self-report about suicidal ideation at baseline, specific details about past suicidal behavior were not collected. For future studies, detailed prospective assessment of suicidal behavior is needed to better evaluate treatment-emergent suicidality. In this study, there were no significant differences between the four treatment groups on emergence or worsening of suicidal ideation or behaviors based on self- and clinician-report. However, these systematic assessments were only conducted at weeks 6 and 12, and only provide a snapshot of the suicidality at the time of those assessments. Another limitation related to suicidality assessment involved the recording of the actual suicide events. Suicidal behaviors were identified and recorded based on site discretion, which has been reported as one of the limitations of the safety outcomes in most clinical trials. To address this problem, all potentially suicide-related events were reevaluated by a national expert panel; thus, events were not simply interpreted by the site. However, the descriptions of events provided to the national expert panel were those submitted regarding the event in question, and the level of available information on these events varied greatly. Therefore, in some instances, limited information was available to the group to determine the classification. Future studies may want to conduct more intensive and standardized assessment of suicide-related AEs in order to complete a more comprehensive assessment.
Finally, suicidal events are a very low base-rate phenomenon, which requires extremely large samples for study. The number of suicide-related events in the present study was small, although they were more frequent than in the two prior FLX trials. Several possible explanations for the increase in this study exist. First, it is possible that the increased incidence was by chance. Second, it is possible that it is related to dosing (the prior two FLX trials were 20-mg fixed-dose studies). Third, this study included only adolescents, whereas the prior two trials included children as well. Adolescents have a higher base rate of suicidal behavior than children, which could also account for the increase.
Finally, the increased rate of suicidal behavior in this trial may be a result of the longer trial length (12 weeks versus 8–9 weeks). As mentioned, more systematic assessment of the possibly related behavioral activation symptoms mentioned by the FDA in the black box warning will help disentangle these unknowns. This study was simply too small to answer these questions.
Although research in the treatment of pediatric depression has expanded significantly over the past decade, TADS is the first study to directly compare pharmacological and psychosocial treatment modalities. As more trials are being published on this population, the tendency may be for clinicians to focus primarily (or exclusively) on the efficacy results; yet the safety results are equally important, if not more so.
Clinicians treating depressed youths should note that these patients report relatively high incidences of physical symptoms, as has been described in several reports (Lewinsohn et al., 1996; Rhee, 2003; Williams et al., 2002; Yates et al., 2004). It is possible that depressed adolescents visiting their pediatrician or primary care physician may present with physical complaints rather than specific mood symptoms. It is also important to note that physical symptoms decreased as depression improved.
Given the recent FDA warning, it is important to routinely assess psychiatric symptoms, such as hostility, agitation, and mania. Based on the present study, psychiatric AEs causing moderate to severe impairment are rare, but do occur. It is possible that psychiatric AEs are dose related or that concurrent psychotherapy offers some protection against such changes, but this area needs further study.
Finally, despite the fact that all of the treatment arms were effective in reducing suicidal ideation, individual teens in clinical trials such as TADS will express suicidal ideation and make suicide attempts in the context of preexisting suicidal ideation, worsening of depression, stressful life events, and perhaps also in the context of improving depressive symptoms. Clearly, suicidal thinking and behavior need to be assessed carefully, both before and prospectively during treatment. Clear understanding and identification of any event are also necessary. That is, clinicians need to collect adequate information about the event to determine whether it was suicidal ideation, self-injurious behavior without intent to die, or an actual suicide attempt (self-injurious behavior with intent to die). Based on guidelines from the FDA, patients beginning medication treatment should be observed closely for clinical worsening, suicidality, or unusual changes in behavior (face-to-face with the clinician weekly for the first 4 weeks of treatment, then every other week for 1 month, and again at 12 weeks). In addition, parents and teens should be given adequate information about risks associated with antidepressants and the importance of parental monitoring of the youths. Furthermore, in this study, suicide-related events often occurred more than 1 month after treatment began, so continued monitoring of suicidal ideation and behavior throughout the course of treatment is important. In summary, this article, combined with the TADS primary efficacy paper (TADS, 2004), provides important information for clinicians to better help their patients and families make an informed assessment of the risk-benefit ratio when treating adolescent depression.
The authors thank and acknowledge the independent expert raters from the Columbia Suicidality Classification Project: David Brent, M.D., Greg Brown, Ph.D., David Rudd, Ph.D., Cheryl King, Ph.D., Anthony Spirito, Ph.D., A.B.P.P., Peter Marzuk, M.D., Patrick O'Carroll, M.D., M.P.H., Annette Beautrais, Ph.D., Kees Van Heeringen, M.D., Ph.D., and Alec Miller, Psy.D.
TADS is supported by contract N01 MH80008 from the National Institute of Mental Health to Duke University Medical Center (John S. March, Principal Investigator).
The Columbia Suicidality Classification Group is led by Kelly Posner, Ph.D., Maria Oquendo, M.D., Madelyn Gould, Ph.D., M.P.H., and Barbara Stanley, Ph.D.
Disclosure: Dr. Emslie receives research support from Eli Lilly, Organon, and Forest Laboratories; is a consultant for Eli Lilly, GlaxoSmithKline, Forest Laboratories, Wyeth-Ayerst, and Pfizer; and is on the speakers' bureau of McNeil. Dr. Kratochvil is a consultant or scientific advisor to Eli Lilly, Shire, Cephalon, Organon, AstraZeneca, Boehringer-Ingelheim, Abbott and Pfizer; receives research support from Eli Lilly, McNeil, Cephalon, and Abbott; receives study drug for an NIMH-funded study from Lilly; and is on the Eli Lilly speakers' bureau. Dr. Silva is a consultant to Pfizer. Dr. Weller is a consultant to and receives grants from Otsuka, Johnson & Johnson, AstraZeneca, Organon, Pharma, Shire, and GlaxoSmithKline. Dr. Waslick receives research support from Eli Lilly. Dr. Casat receives research support from Eli Lilly, GlaxoSmithKline, Shire, Bristol-Myers Squibb, AstraZeneca, Sanofi-Synthelabo, Pfizer, and McNeil; is on the Advisory Board and the speakers' bureaus of Eli Lilly and GlaxoSmithKline. Dr. Walkup receives research support from Eli Lilly, Pfizer, and Abbott; is a consultant to Eli Lilly, Pfizer, Jazz, and Cephalon; and has received honoraria from Eli Lilly and Pfizer. Dr. Pathak receives research support from Forest. Dr. Posner has received research support from GlaxoSmithKline, Forest, Eisai, AstraZeneca, Johnson & Johnson, Abbott, Wyeth, Organon, Bristol-Myers Squibb, Sanofi-Aventis, Cephalon, Novartis, Shire, and UCB Pharma. Dr. March is on the speakers' bureaus of Pfizer and Eli Lilly and has received research support from Eli Lilly, Pfizer, and Wyeth. The other authors have no financial relationships to disclose.