Our results show that studies performed to date provide insufficient evidence to confirm the stable existence of clear data-driven symptomatic subtypes of depression. First, relatively few studies have been dedicated to the detection of data-driven MDD subtypes (20 of 1176 articles). Second, the outcomes of these few studies are conflicting. Latent class analyses mainly grouped patients by overall severity rather than into classes with qualitatively different symptom profiles, whereas latent factor analyses most consistently identified a factor explaining the variance of a mixture of cognitive and somatic symptoms (s1, s2, s5b, s6), which appears to contradict the existence of a purely cognitive or a purely somatic symptom dimension. However, the 13 identified factors differed to such an extent that generalizable conclusions are questionable. In short, the collected studies fail to give adequate evidence for the existence of qualitatively different subtypes or symptom dimensions of MDD. This lack of empirical support therefore also holds for theoretically motivated subtypes, such as melancholic or atypical depression, and for cognitive and somatic symptom dimensions in MDD.
Particularly notable is the great diversity in the results. How can we explain it? The collected results show that many factors influence the outcomes of latent variable analyses; for instance, the number of patients included, and the severity and number of their symptoms, clearly affect the resulting latent classes and factors. Presumably, symptom endorsement rates differ considerably between the studies, as was reported in a recent mega-analysis of genome-wide association studies of patients with MDD [55]. Furthermore, the large number of questionnaires used and the differences between them, together with the symptoms that some of them do not record (such as weight gain and hypersomnia), are likely contributors to the diverging results. On top of that, previous studies have found that, for latent factor analyses, several choices affect the stability and correspondence of the resulting factors: model choice (for example, component or common factor model), sample size, number and communalities of the variables, selection of the number of factors and model fit criteria, rotation method, and the degree of overdetermination [56]. An additional influence to take into account is the manifestly high correlation of symptoms within one questionnaire, as shown by some (but not all) latent factor analyses performed on several questionnaires simultaneously. Thus, many theoretical choices preceding analysis are important determinants of the retrieved classes or dimensions.
The major insight from this review is that we should improve our research techniques considerably if we are to find data-driven subtypes of depression. Of course, it remains an open question whether such subtypes exist. It is possible that there simply are no symptomatic subtypes or dimensions, and that latent variable techniques have failed to show consistent subtypes for that reason. One could argue that if there were clear symptomatic subtypes or dimensions, a more consistent pattern would have emerged from the data, regardless of the theoretical and measurement choices involved. The other possibility is that symptomatic subtypes and dimensions do exist, but that the techniques used to date did not succeed in identifying them. If the second possibility is true, a crucial question is how study methods could be improved to detect patterns in the data that have not yet been detected. Careful consideration of all theoretical and modeling choices that influence study outcomes will be required to answer that question.
One possible strategy to improve study methods is data enrichment, because the quality of the data crucially determines the quality of study outcomes. First, some of the studies reviewed above indicate the possible benefit of dynamic measurements, as they showed that differentiated symptom profiles might be clearer in more severe than in less severe cases, or in patients earlier rather than later in the treatment process. Therefore, it would seem prudent to study changes in symptom structure and severity over the course of treatment, preferably also distinguishing the influence of medication. A second set of choices involves the scales used to assess these symptoms. Some diagnostic instruments use dichotomous (yes/no) measures for each symptom, whereas others use graded assessments. Graded assessments would be expected to provide more textured differentiation, and in this way possibly lead to greater precision in detecting meaningful subtypes. Another concern is that the frequently used rating scales are primarily designed to be sensitive to change rather than to capture the detailed phenomenological picture of MDD. To address the potential heterogeneity in patients with depression, at least all DSM criteria, including all disaggregated symptoms (s3-s5), should be measured in a standardized fashion. A third possibility to enrich data would be the inclusion of other variables in addition to depressive symptoms in analyses. Concerning the symptoms to include in the evaluation, it seems clear that this set should be a broad one, so as to allow for the possibility of detecting subtypes associated with symptoms beyond those in the current DSM and ICD systems. For example, there is some suggestion in the literature that there might be value in differentiating irritable from non-irritable MDD [59] and in assessing the influence of anxiety [60]. Moreover, apart from symptoms, other indicators, such as hormone status, genetic profile, sex, and age, could also be included as variables in future analyses. The inclusion of these non-symptomatic variables may contribute to the unraveling of causal mechanisms of depressive subtypes, although we firmly believe that symptoms are the basis on which to start. Symptoms are non-invasive measures of disease; clinicians are trained to recognize and classify patients based on symptoms; and data on symptoms have been collected world-wide.
A second strategy to improve study methods involves the statistical approaches used to uncover depressive dimensions and subtypes. In addition to the latent factor and latent class approaches, we should consider complex CA models, especially those that use a canonical formulation to predict diverse outcomes [62], and mixture models that combine features of latent class and item response theory models [64] or latent class and latent factor models [14]. Some recent studies have used factor mixture analyses to identify subtypes in other psychiatric disorders, such as attention-deficit/hyperactivity disorder [65], post-traumatic stress disorder [13], and schizophrenia [66]. To date, these approaches have not yet been used to search for subtypes of MDD, but are attractive alternatives in light of their success in detecting useful subtypes of other disorders. Obviously, in accordance with the described latent variable models, many theoretical factors should be considered before applying those new techniques [67]. However, if subtypes of MDD do exist, using different statistical methods to reveal their structure could be worthwhile.
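To make concrete what "combining latent class and latent factor models" means, a factor mixture model can be sketched as below. This is a generic textbook-style formulation for continuous indicators, not the specification used in any of the cited studies; the symbols ($y_i$ for the symptom vector of patient $i$, $c_i$ for the latent class, $\eta_i$ for the latent factor scores, and the class-specific parameters) are notation introduced here for illustration. Ordinal symptom ratings would in practice require an additional link function.

```latex
\begin{align}
  P(c_i = k) &= \pi_k, \qquad \sum_{k=1}^{K} \pi_k = 1, \\
  y_i \mid (c_i = k) &= \nu_k + \Lambda_k \eta_i + \varepsilon_i, \\
  \eta_i \mid (c_i = k) &\sim \mathcal{N}(\alpha_k, \Psi_k), \qquad
  \varepsilon_i \mid (c_i = k) \sim \mathcal{N}(0, \Theta_k), \\
  f(y_i) &= \sum_{k=1}^{K} \pi_k \,
    \mathcal{N}\!\left(y_i ;\; \nu_k + \Lambda_k \alpha_k ,\;
    \Lambda_k \Psi_k \Lambda_k^{\top} + \Theta_k \right).
\end{align}
```

Under this sketch, setting $K = 1$ recovers an ordinary common factor model, whereas setting $\Psi_k = 0$ for all $k$ removes within-class factor variation and leaves a latent class (profile) model. The hybrid therefore allows qualitatively different classes and graded severity within each class at the same time.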
Thus, future analyses should ideally explore several advanced statistical techniques on enriched datasets. An investigation of the possibilities and limitations of different modeling techniques seems more reasonable than adhering exclusively to the latent factor and latent class models used up to now. Mega-analyses of the MDD symptoms of different samples could be worthwhile, as combined data may improve the robustness and generalizability of the results [55]. However, when performing mega-analyses, it is even more important to have rich datasets and to apply sophisticated modeling techniques that accommodate inter-study heterogeneity [55]. Experience with these new symptom-based classification attempts might inform other data-driven classification attempts that go beyond the DSM, such as the Research Domain Criteria [69].
Finally, what is the value of searching for data-driven subtypes of MDD? We started our review with the observation that patients with MDD differ considerably in their symptomatic presentation, with over 200 possible symptom combinations. To date, theoretically motivated subtypes have not resolved the substantial population heterogeneity of MDD. Empirical discernment of subtypes with similar symptoms could give an impetus to research on etiology, course, and treatment. Improved statistical tools are available to discover patterns in rich datasets; therefore, data-driven subtyping of depression is a valuable approach to explore.
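As a rough illustration of where a figure of "over 200 possible symptom combinations" can come from, the sketch below counts the symptom sets that satisfy a DSM-style rule: at least five of nine aggregated criterion symptoms, at least one of which is a core symptom (depressed mood or loss of interest). Whether the count in the text was derived in exactly this way is an assumption, and the symptom labels and function names are hypothetical, chosen only for this example.

```python
from itertools import combinations
from math import comb

# Hypothetical labels for the nine aggregated DSM criterion symptoms.
SYMPTOMS = (
    "depressed_mood",
    "loss_of_interest",
    "weight_or_appetite_change",
    "sleep_disturbance",
    "psychomotor_change",
    "fatigue",
    "worthlessness_or_guilt",
    "concentration_problems",
    "suicidal_ideation",
)
# Assumption: at least one of these two core symptoms must be present.
CORE = frozenset({"depressed_mood", "loss_of_interest"})
# Assumption: at least five symptoms are required in total.
MIN_SYMPTOMS = 5


def count_qualifying_profiles(symptoms=SYMPTOMS, core=CORE, minimum=MIN_SYMPTOMS):
    """Enumerate symptom sets with at least `minimum` symptoms, one of them core."""
    total = 0
    for size in range(minimum, len(symptoms) + 1):
        for profile in combinations(symptoms, size):
            if core.intersection(profile):
                total += 1
    return total


def count_by_formula(n_symptoms=9, n_core=2, minimum=MIN_SYMPTOMS):
    """Closed form: all large-enough sets minus those containing no core symptom."""
    non_core = n_symptoms - n_core
    return sum(
        comb(n_symptoms, k) - comb(non_core, k)
        for k in range(minimum, n_symptoms + 1)
    )


if __name__ == "__main__":
    print(count_qualifying_profiles())  # 227
    print(count_by_formula())           # 227
```

Both functions give 227 under these assumptions. Disaggregating compound criteria (for example, separating insomnia from hypersomnia) would increase the number of distinguishable profiles further.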