The aim of this report was to evaluate empirically, for the first time to our knowledge, the DSM-IV definition of severity of MD. Our analysis shows that this construct was neither simple in structure nor uniform in validity. Four specific findings are noteworthy. First, the correlations among the three DSM-IV indices of depressive severity were only moderate in magnitude. This finding is all the more striking given that symptom severity and overall syndromal impairment partly overlap in content. In addition, when examined together, the three severity indices did not form a highly coherent factor. Second, the individual measures of severity, and also the general severity factor, were validated against a fairly wide range of characteristics of depressed patients, none of which plays any role in the diagnostic process. Classifying depressed subjects by severity conveys important information about the expected patterns of co-morbidity, other clinical features and prognosis. Third, the patterns of relationships between the severity indices and our set of validators differed meaningfully across the three indices. Fourth, for most validators (17 of 23), at least one severity index explained a significant unique proportion of variance when added to a model containing one of the other indices. That is, these three different measures of depressive severity were often associated with different clinical characteristics. In summary, these results suggest that, as operationalized in DSM-IV, the concept of severity of MD is best understood as a multifaceted, heterogeneous construct.
We were surprised at the low loadings of some of our measures of symptom severity on the common factor (e.g. appetite/weight and sleep). However, this has been seen in one other study (Olsen et al. 2003) and there was very limited evidence in our sample for a second distinct symptom severity factor. In addition, although not entirely comparable to our study, a weak performance of various disaggregated weight and sleep items as severity measures was also found in studies on different severity measures (Faravelli et al. 1996; Santor & Coyne, 2001; Zimmerman et al. 2006).
Specific findings in our sample for inter-relationships between the three indices of depressive severity and a range of external validators also have precedent in the literature. Prior studies have reported, for example, that impairment is related to risk for future depressive episodes (Rodriguez et al. 2005), co-morbidities with anxiety or substance use disorders (Mojtabai, 2001) and co-morbid panic-depression (Roy-Byrne et al. 2000); and that impairment is not related to sex (Sheehan et al. 1996) or age of onset of depression (Zisook et al. 2004). In addition, our finding that all three severity indices were significantly associated with chronic depression also corresponds to earlier findings (e.g. Pettit et al. 2009). In contrast to our finding that males reported higher symptom severity, Scheibe et al. (2003) found no sex differences in severity of depression for interview-based measures. Our findings are also consistent with an earlier study on the same sample that found, using structural equation twin modeling, that the factors that impact on functional impairment in MD are partly separable from those that alter risk for the disorder (based on meeting sufficient DSM-IV criteria) (Foley et al. 2003).
The classification of the severity subtypes of MD in the ICD-10 clinical (WHO, 1992) and research criteria (WHO, 1993) differs in several ways from that proposed in DSM-IV: (i) the additional criterion ‘loss of confidence and self-esteem’, (ii) the use of ‘type’ of symptoms, especially somatic symptoms, as additional severity measures, and (iii) the inclusion of distress in the syndromal impairment in the clinical criteria. Despite these differences, our results carry at least two implications for the ICD-10 classification of a mild, moderate and severe depressive episode. First, by specifying, in both the clinical and research criteria, a minimum number of symptoms for each severity subtype, the ICD-10 definition emphasizes criteria count as crucial to the overall assessment of severity of MD, an approach not entirely supported by our results. Second, surprisingly, syndromal impairment is included as part of the definition of depressive severity in the clinical (WHO, 1992) but not in the research criteria (WHO, 1993). This is not consistent with our own findings, where syndromal impairment explained unique proportions of variance as an index of depressive severity independent of symptom severity or criteria count.
There are several well-established depression scales providing valuable severity measures [e.g. the Hamilton Rating Scale for Depression (HAMD; Hamilton, 1960, 1967), the Beck Depression Inventory (BDI; Beck et al. 1961, 1996), the Montgomery–Åsberg Depression Rating Scale (MADRS; Montgomery & Åsberg, 1979), or the Zung Self-Rating Depression Scale (SDS; Zung, 1965)] that combine a symptom count and symptom frequency or intensity to form a sum score. The HAMD and the BDI also include a work impairment question. Validation studies suggest that the MADRS and the BDI are superior to the HAMD, especially the long version, as an index of depressive syndrome severity (e.g. Gibbons et al. 1993; Licht et al. 2005; Carmody et al. 2006). However, none of these measures relies strictly on the DSM-IV definition of severity of MD. Either they are not restricted to the nine criterion A symptoms or they consider impairment and symptom severity as interchangeable and not parallel measures. Our data set did not contain any of these scales so we were unable to evaluate their performance. Of note, the notion of unidimensionality of severity that these scales typically assume (see Gibbons et al. 1993) was not entirely supported by our results.
Limitations
These results should be interpreted in the light of five potentially important methodological concerns. First, our sample is limited to white twins born in the Commonwealth of Virginia and these results may not generalize to other samples. Second, the clinical characteristics we used as validators probably vary in the degree to which they reflect underlying severity, and so including some and excluding others could influence the general performance of the three severity measures. That is, the results of this comparison are necessarily limited to this particular set of validators.
Third, the nature of our analyses made it difficult to account formally, in most cases, for the non-independence of observations in our twin data. However, only about 17% of our data come from twin pairs, and the correlations for all three severity measures within these pairs were fairly low (≤0.20). Thus, it is very unlikely that the twin character of our data influenced our results substantially. In addition, we explored formal corrections for the binary logit models and found no substantial effects. Fourth, our results could be affected by missing data regarding symptom severity. As the degree of symptom severity was obtained only when the symptom was endorsed, the problem of missing data reflects the inherent non-independence of symptom count and symptom severity and is unavoidable in this or any other similar analysis.
Fifth, as noted above, the DSM-IV is ambiguous about whether distress should be included in measures of the severity of MD. Although distress is included in the overall definition of severity as part of syndromal impairment, it is not further mentioned in the specification of the subgroups ‘mild’, ‘moderate’ and ‘severe’. Therefore, our main analyses did not include distress ratings in our severity measures. To address whether our findings would change were we to incorporate measures of distress, we repeated all of the analyses conducted above in our MMMF subsample (where an item assessing distress was added after the introduction of DSM-IV), with and without a single-item measure of distress added to the factor analysis from which we derived the syndromal impairment index.
When we compared the correlations of syndromal impairment (n=788), with and without the distress measure, with our other two measures of depressive severity, the correlations rose slightly for criteria count (from 0.23 to 0.27) and for symptom severity (from 0.37 to 0.43). The strength of association of our measure of syndromal impairment with our wide range of validators also increased slightly, with the mean (S.D.) of the ORs rising from 1.28 (0.25) to 1.33 (0.32), although the OR improved for only 14 of the 23 validators (for details see Table A1). These results suggest a slight increase in the coherence and predictive power of the severity measures if distress is included in the measure of syndromal impairment. This comes, however, at the cost of a reduction in conceptual clarity, as the constructs of syndrome-related distress and syndrome-related functional impairment are at least partially distinct.
| Table A1. Comparison of syndromal impairment with and without distress |