Three major symptom factors as predictors of outcome in GENDEP
The observed mood factor was significantly associated with older age at study entry (Spearman’s ρ=0.11, p=0.0014) and later age of depression onset (Spearman’s ρ=0.14, p=0.0001) but not with other baseline and treatment characteristics (sex, age, marital status, employment, episode duration, antidepressant treatment history, attrition or dose of either antidepressant; all p>0.05). Higher observed mood scores at baseline predicted worse outcome of treatment on all three scales, with the strongest effect on BDI. The effect was independent of drug and confirmed in sensitivity analyses incorporating additional covariates, including age of onset.
Prediction of treatment outcome from the three baseline symptom dimensions in GENDEP
The cognitive symptom factor score was unrelated to other baseline characteristics (all p>0.05) but was associated with higher exit dose of escitalopram (Spearman’s ρ=0.22, p<0.0001) and higher exit dose of nortriptyline (Spearman’s ρ=0.16, p=0.0074). Higher baseline cognitive symptom scores strongly and significantly predicted worse outcome of treatment on MADRS and HAMD-17, but not on BDI. Similar results were obtained in sensitivity analyses with additional covariates.
The neurovegetative symptom factor score was not a significant predictor of outcome.
Six specific symptom dimensions as predictors of outcome in GENDEP
Four of the six specific symptom dimensions significantly predicted outcome on MADRS. The interest and activity dimension was the strongest predictor and predicted outcome on each of the three depression rating scales at p<0.0001, independently of overall baseline severity and of which antidepressant was used. These effects were confirmed in sensitivity analyses restricted to randomly allocated individuals [e.g. for interest-activity and MADRS: β=0.21, 95% confidence interval (CI) 0.13–0.27, p=7.0×10⁻⁹]. Higher interest-activity scores were associated with later depression onset (Spearman’s ρ=0.07, p=0.0486) and more previous depressive episodes (Spearman’s ρ=0.08, p=0.0211) but no other baseline variables (all p>0.05). The interest-activity dimension was also associated with a higher dose of escitalopram (Spearman’s ρ=0.21, p=0.0001) and of nortriptyline (Spearman’s ρ=0.18, p=0.0033). The prediction of outcome by the interest-activity dimension remained unchanged after controlling for potential confounders, including age of onset, number of depressive episodes and dose of antidepressants (β=0.18, 95% CI 0.13–0.24, p=4.7×10⁻¹⁰).
Prediction of treatment outcome from the six baseline symptom dimensions in GENDEP
The only symptom dimension with evidence of differential prediction by drug was anxiety. Higher baseline scores on the anxiety dimension predicted worse outcome with nortriptyline but slightly better outcome with escitalopram (i.e. the effects were in opposite directions in the two medication groups; interaction p=0.0233).
Specificity of prediction by the interest-activity symptom dimension in GENDEP
As the two symptom dimensions most predictive of outcome (interest-activity and mood) shared cross-loaded items measuring activity and energy, we explored their relative contributions in an additional analysis with both interest-activity and mood dimensions entered as predictors of the primary outcome. This showed that the strong effect of the interest-activity dimension was independent of mood (β=0.18, 95% CI 0.12–0.24, p=3.4×10⁻¹⁰) and that the mood dimension did not carry additional predictive information independent of interest-activity (p>0.1). This result confirmed the decision that only the interest-activity dimension should be followed up in the replication sample.
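The logic of entering two overlapping predictors jointly can be illustrated with a minimal sketch. It uses synthetic data, not GENDEP data, and a plain two-predictor least-squares fit as a simplified stand-in for the study's models; the variable names, the simulated correlation between the two dimensions and the simulated effect size of 0.2 are all assumptions made for illustration. When the outcome depends only on the first predictor, the joint fit attributes the effect to it and leaves the correlated second predictor near zero.

```python
import random


def ols_two_predictors(x1, x2, y):
    """Least-squares slopes of y on x1 and x2 (with intercept).

    Solves the 2x2 normal equations on mean-centred variables.
    """
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


# Synthetic illustration only: "mood" is correlated with "interest",
# but the simulated outcome depends on "interest" alone.
rng = random.Random(0)
n = 20000
interest = [rng.gauss(0, 1) for _ in range(n)]
mood = [0.6 * i + 0.8 * rng.gauss(0, 1) for i in interest]
outcome = [0.2 * i + rng.gauss(0, 1) for i in interest]

b_interest, b_mood = ols_two_predictors(interest, mood, outcome)
print(round(b_interest, 2), round(b_mood, 2))
```

Fitting either predictor alone in this simulation would show an association for both, which is why the joint model is needed to separate their contributions.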
Replication of the interest-activity dimension as a predictor of outcome in STAR*D
The interest-activity symptom dimension fulfilled a priori criteria for pursuing replication (association at the corrected p<0.005 in primary analysis and concordant results with all outcome measures). An equivalent score in STAR*D was constructed by summing HAMD-17, IDS and QIDS items corresponding to items forming the interest-activity score (Supplementary Table S3). Items with equivalent content and source (clinician versus self-report) were identified for all items, except that no equivalent to the self-reported work/activity on BDI was identified in STAR*D. The resulting scores were normally distributed (Supplementary Fig. S1).
A higher baseline interest-activity symptom score significantly predicted worse outcome of treatment with citalopram on all three outcome scales in STAR*D after correcting for overall baseline severity (all p<0.001; model A).
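The construction of a summed dimension score can be sketched as follows. The item names are hypothetical placeholders, not the items listed in Supplementary Table S3, and the sketch assumes complete ratings for every item in the score.

```python
# Hypothetical item names for illustration; the actual item mapping
# is given in Supplementary Table S3 and is not reproduced here.
INTEREST_ACTIVITY_ITEMS = (
    "hamd_work_activities",
    "ids_interest",
    "qids_energy",
)


def interest_activity_score(ratings):
    """Sum the ratings of the items that make up the dimension score."""
    missing = [item for item in INTEREST_ACTIVITY_ITEMS if item not in ratings]
    if missing:
        raise ValueError(f"missing items: {missing}")
    return sum(ratings[item] for item in INTEREST_ACTIVITY_ITEMS)


# One participant's baseline ratings; items outside the dimension are ignored.
example = {
    "hamd_work_activities": 2,
    "ids_interest": 1,
    "qids_energy": 3,
    "hamd_insomnia": 2,
}
print(interest_activity_score(example))  # 6
```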
Prediction of treatment outcome from the baseline interest-activity symptom score in STAR*D
The outcome deteriorated gradually with increasing levels of the interest-activity score (Fig. 2). The prediction of outcome by the interest-activity dimension was complementary to the previously reported prediction by the somatization-anxiety score, with both scores independently contributing to the prediction of outcome (model B). The strength of the prediction remained unchanged in sensitivity analyses controlling for a comprehensive list of baseline covariates, including ethnicity, marital status, employment, income, age of onset, number of episodes, family history of mood disorder, co-morbid post-traumatic stress and obsessive–compulsive disorder as identified by self-report, number of co-morbid axis I disorders by self-report, and anxiety-somatization score in addition to age, sex, baseline severity and recruiting centre.
Fig. 2 Association between the interest-activity symptom dimension at baseline and percentage improvement over 12 weeks of treatment on the primary outcome measures in (a) Genome-based Therapeutic Drugs for Depression (GENDEP) and (b) Sequenced Treatment Alternatives to Relieve Depression (STAR*D).
Although the prediction was replicated with each of the three outcome measures, there were differences in effect size: the prediction was strongest for the QIDS-C and weakest for HAMD-17. To differentiate effects of scale sensitivity from subject selection, we repeated the analyses with QIDS-C for the 2734 subjects who also had HAMD-17 ratings. We found that the prediction of outcome on QIDS-C in this restricted sample was at least as strong as in the whole sample (β=0.38, 95% CI 0.32–0.44, p=5.5×10⁻³³). This rules out a selection effect and suggests that the QIDS-C outcome measure may be more sensitive to the prediction of outcome by interest-activity symptoms.
Clinical significance of the prediction
The results reported here establish that the prediction of outcome by the interest-activity symptom dimension is statistically significant and highly unlikely to be due to chance. In addition, we wanted to establish whether the effect size of this prediction was sufficient for applications in clinical settings. For this purpose, we repeated the analyses with the outcome of remission (defined as a HAMD-17 score of ≤7 at the last visit in both studies) with no imputation and no covariates. In good agreement with the primary analyses, higher baseline scores on the interest-activity symptoms predicted lower rates of remission in GENDEP [odds ratio (OR) 0.59, 95% CI 0.50–0.68] and in STAR*D (OR 0.62, 95% CI 0.56–0.67). Fig. 3 shows that the proportion of individuals who reach remission declines monotonically with increasing baseline scores on the interest-activity symptom dimension. Compared to the lowest scoring fifth of the participants (quintile 1), the rate of remission in the highest scoring fifth of participants (quintile 5) was reduced threefold in GENDEP and halved in STAR*D. We next sought to translate this effect into a more clinically useful metric. The AUC was 0.65 in GENDEP and 0.62 in STAR*D (Supplementary Fig. S2), which translates to an NNA of 3 and 4 respectively. In other words, measuring the interest-activity symptom dimension in every three to four patients will help to predict one additional remission accurately compared to chance.
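This section does not state how the AUC was converted to an NNA. One conversion that reproduces the reported values is NNA = 1 / (2 × AUC − 1), rounded to the nearest whole number; the sketch below assumes that formula and that rounding rule, so it should be read as a consistency check rather than as the authors' method.

```python
def number_needed_to_assess(auc):
    """Assumed conversion: NNA = 1 / (2 * AUC - 1).

    Undefined at or below chance level (AUC <= 0.5).
    """
    if auc <= 0.5:
        raise ValueError("AUC must exceed 0.5 (chance level)")
    return 1.0 / (2.0 * auc - 1.0)


# AUC values as reported for the two samples.
for study, auc in (("GENDEP", 0.65), ("STAR*D", 0.62)):
    nna = number_needed_to_assess(auc)
    print(study, round(nna, 2), round(nna))
```

Under this assumption the unrounded values are about 3.33 and 4.17, which round to the reported 3 and 4.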
Fig. 3 Association between the interest-activity symptom dimension at baseline and remission [Hamilton Rating Scale for Depression (HAMD)-17 score ≤7] in (a) Genome-based Therapeutic Drugs for Depression (GENDEP) and (b) Sequenced Treatment Alternatives to Relieve Depression (STAR*D).