We do not know whether the clinical criteria for major depression (MD) reflect a single or multiple dimensions of genetic risk.
To determine the structure of genetic and environmental risk factors for the 9 DSM-IV symptomatic MD criteria.
Population-based twin registry.
Seven thousand five hundred members of adult twin pairs from the Virginia Adult Twin Study of Psychiatric and Substance Use Disorders.
Symptoms of lifetime MD as assessed at personal interview.
The best-fit twin model was multidimensional, requiring 3 genetic, 1 shared environmental, and 3 unique environmental common factors, and criterion-specific unique environmental factors. The first genetic factor was characterized by high loadings on cognitive and psychomotor depressive symptoms. The second and third genetic factors had strong loadings for mood and neurovegetative depressive symptoms, respectively. Genetic factor scores derived from these 3 factors differentially predicted patterns of comorbidity, other historical/clinical features of MD, and demographic variables. These results suggested that the first genetic factor reflected a general liability to internalizing disorders, while the third genetic factor was more specific for melancholic MD. The 3 unique environmental common factors reflected, respectively, global depressive, core mood, and cognitive depressive symptoms.
The DSM-IV syndrome of MD does not reflect a single dimension of genetic liability. Rather, these criteria reflect 3 underlying dimensions that index genetic risk for cognitive/psychomotor, mood, and neurovegetative symptoms. While in need of replication, these results, validated by predictions using estimated genetic factor scores, have implications for gene-finding efforts for MD.
The historical origins of the DSM-IV criteria for major depression (MD)1 are traceable back through DSM-III2 and Research Diagnostic and Feighner criteria3,4 to proposals made in the 1950s.5,6 Clinical judgment and not psychometrics guided the development of these criteria.
These criteria are diverse, reflecting mood, hedonia, vegetative functions, psychomotor activity, and cognitive function and content. Perhaps surprisingly, factor analyses in both epidemiological7,8 and clinical9 studies suggest that these 9 criteria form a single dimension. However, such studies have been conducted solely at the phenotypic level.
Twin and family studies consistently show the etiological importance of genetic factors for the diagnosis of MD.10,11 However, we are unaware of any prior study applying multivariate genetic methods to determine whether the 9 DSM symptomatic criteria for MD reflect a single underlying genetic factor. We herein report such a study using personal interview–based assessments of lifetime MD in the Virginia Adult Twin Study of Psychiatric and Substance Use Disorders.
Participants in this study derived from 2 interrelated Virginia Adult Twin Study of Psychiatric and Substance Use Disorders studies of white same-sex twin pairs.12 Twins were ascertained from the birth certificate–based Virginia Twin Registry. Female-female (FF) twin pairs, born in 1934 to 1974, were eligible if both members responded to a mailed questionnaire in 1987 or 1988. Reports on symptoms of lifetime MD used in this report were collected at the third interview wave (FF3), conducted in 1992 to 1995, with cooperation in individual waves ranging from 88% to 93%.12 Data on the male-male and male-female pairs (MMMF) came from a sample (birth years 1940–1974) initially ascertained, with a 72% cooperation rate, directly from registry records containing all twin births. The first interview (MMMF1) was completed largely by telephone in 1993 to 1996. The second wave of interviews (MMMF2), conducted in 1994 to 1998, had a response rate of 83%. Data on lifetime symptoms of MD were used from the MMMF2 wave.
Zygosity was determined by discriminant function analyses using standard twin questions validated against DNA genotyping in 496 pairs.13 The mean (SD) age and years of education of the twins were 35.1 (7.5) and 14.3 (2.2) at the FF4 interview and 37.0 (9.1) and 13.6 (2.6) at the MMMF2 interview.
These analyses used data from 7500 twins, including both members of 3084 pairs (503 monozygotic FF twin pairs, 346 dizygotic FF twin pairs, 703 monozygotic MM twin pairs, 485 dizygotic MM twin pairs, and 1047 opposite-sex dizygotic pairs) and 1325 twins without their co-twin. (These numbers do not sum because all possible pairings for triplet and quadruplet sets were included.)
The lifetime MD section of the interview was adapted from the Structured Clinical Interview for DSM-III-R14 modified to include DSM-IV criteria. Subjects had to respond positively to 1 or more lifetime episodes that included “feeling depressed or down” and/or “uninterested in things or unable to enjoy the things you used to do” to continue in the section. Episodes that were judged to result from illness, medications, or normal grief were excluded. The worst lifetime episode was defined, and individual questions were asked about the presence of each of the remaining DSM-IV MD symptomatic criteria for that time “when things were at their worst.”
The human subject committees at Virginia Commonwealth University approved this project. Written informed consent was obtained prior to face-to-face interviews, and verbal consent was obtained prior to telephone interviews. Interviewers had a master’s degree in a mental health–related field or a bachelor’s degree in this area plus 2 years of clinical experience. Members of a twin pair were always interviewed by different interviewers.
Twin models decompose the sources of individual differences in liability to MD into 3 components: additive genetic effects (A), shared family environment (C), and unique environment (E).12 Shared environment reflects family and community experiences that increase similarity in siblings raised together. Unique environment includes random developmental effects, environmental experiences not shared by siblings, and random error.
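As a rough illustration of this decomposition, the classic Falconer approximations recover the A, C, and E proportions from monozygotic and dizygotic twin correlations. This is a minimal sketch and not the structural equation approach fitted in Mx for this report; the function name and the example correlations are hypothetical.

```python
def falconer_ace(r_mz: float, r_dz: float) -> tuple[float, float, float]:
    """Approximate ACE variance proportions from twin correlations.

    r_mz: correlation in liability between monozygotic co-twins
    r_dz: correlation in liability between dizygotic co-twins
    Assumes additive genetic effects only and equal environments.
    """
    a2 = 2.0 * (r_mz - r_dz)   # additive genetic effects (A)
    c2 = 2.0 * r_dz - r_mz     # shared family environment (C)
    e2 = 1.0 - r_mz            # unique environment plus error (E)
    return a2, c2, e2


# Hypothetical correlations, chosen only for illustration.
a2, c2, e2 = falconer_ace(r_mz=0.5, r_dz=0.3)
print(f"A={a2:.2f}  C={c2:.2f}  E={e2:.2f}")
```

The three proportions sum to 1 by construction, which mirrors the way the twin model partitions total variance in liability.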
Multivariate twin models estimate the degree to which genetic and environmental influences are shared across the 9 DSM-IV A criteria for MD1 (which we term common factors) vs those specific to each individual criterion (termed criterion specific). This is done by including in the model genetic and environmental common factors that influence risk for all the criteria as well as criterion-specific influences.
Independent pathway structural equation twin models were fitted using the full information maximum likelihood method in Mx.15 We tested for both quantitative sex effects (ie, is the magnitude of genetic influences on MD the same in both sexes?) as well as qualitative sex effects (ie, are the same genetic factors influencing risk to MD in men vs women?). To obtain unbiased population-based parameter estimates, the model had to take into account the structure of the interview through which individuals were either selected into or skipped out of the MD section. That is, subjects who denied lifetime episodes of either sad mood (DSM-IV criterion A1 for MD) or loss of interest/pleasure (criterion A2) could not meet diagnostic criteria for MD and were skipped out of the MD section. No data were recorded from them on the presence or absence of MD criteria A3 through A9.
The full information maximum likelihood method implemented in Mx makes use of all available twin information and can provide asymptotically unbiased parameters when selection items (here criteria A1 and A2) are included and missingness can be assumed to be “at random.”16 Mx optimizations were performed using both the try-hard option and different starting values to reduce the possibility that a solution found was a local rather than global minimum.
We began the sequence of model fitting with a basic 1-1-1 model with specifics (where the first, second, and third numbers reflect the number of genetic, shared environmental, and unique environmental common factors). Next, tests for quantitative and qualitative sex effects were carried out. Then, attempts were made to simplify the resulting model by deleting 1 by 1 all the common factors and then trying to delete the criterion-specific genetic and common environmental effects. (We did not attempt to eliminate the criteria-specific unique environmental effects as this unrealistically assumes that each criterion is assessed without error.) Next, we made the model progressively more complex systematically searching, at each step, for the “best-fit” model.
The primary goal was to find the model that reflected the optimal balance between parsimony and explanatory power. This goal was operationalized using the Akaike information criterion (AIC),17,18 which equals χ² − 2 df, where df equals the difference in the df of the 2 models. We sought to minimize the AIC value. (We considered using the Bayesian information criterion, which has some desirable properties with complex models.19 However, probably because of the substantial missingness in our data, the Bayesian information criterion performed erratically and was dropped early in our model fitting.)
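The AIC comparison described above can be sketched as follows, assuming χ² is taken as the difference in −2 log-likelihood between a candidate model and a reference model. The function name and the numbers are hypothetical and are not values from Table 2.

```python
def delta_aic(minus2ll_model: float, df_model: int,
              minus2ll_ref: float, df_ref: int) -> float:
    """AIC of a candidate model relative to a reference model.

    chi2 is the difference in -2 log-likelihood and ddf the difference
    in degrees of freedom (candidate minus reference). Lower is better;
    a negative value favors the candidate over the reference.
    """
    chi2 = minus2ll_model - minus2ll_ref
    ddf = df_model - df_ref
    return chi2 - 2 * ddf


# Hypothetical case: the candidate adds 9 parameters (9 fewer df)
# and lowers -2 log-likelihood by 30 units.
print(delta_aic(minus2ll_model=1000.0, df_model=100,
                minus2ll_ref=1030.0, df_ref=109))
```

Under this convention, an added factor is retained only when the drop in −2 log-likelihood exceeds twice the number of parameters it costs.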
After determining a best-fitting model based on AIC, the Mx estimated factor loadings for this model were extracted and rotated in SAS20 using a Varimax rotation criterion to improve interpretability. We considered as “prominent” loadings that were 0.31 or more (ie, accounted for ≥10% of the variance).
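To make the rotation step concrete, the sketch below applies the raw varimax criterion to a 2-factor loading matrix by brute-force search over the rotation angle, then flags prominent loadings with the 0.31 cutoff. It is an illustration under simplifying assumptions (2 factors, grid search, no Kaiser normalization) and is not the SAS procedure used in this report; all names and example loadings are hypothetical.

```python
import math


def rotate(loadings, theta):
    """Orthogonally rotate 2-column loadings by angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(a * c + b * s, -a * s + b * c) for a, b in loadings]


def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    p = len(loadings)
    total = 0.0
    for j in range(2):
        squared = [row[j] ** 2 for row in loadings]
        mean = sum(squared) / p
        total += sum((x - mean) ** 2 for x in squared) / p
    return total


def varimax_2d(loadings, steps=900):
    """Pick the rotation angle on a grid that maximizes the criterion."""
    angles = (k * (math.pi / 2) / steps for k in range(steps))
    best = max(angles, key=lambda t: varimax_criterion(rotate(loadings, t)))
    return rotate(loadings, best)


def prominent(loadings, cutoff=0.31):
    """Flag loadings whose absolute value meets the cutoff."""
    return [[abs(x) >= cutoff for x in row] for row in loadings]


# Hypothetical loadings: a clean structure deliberately mixed by 45 degrees.
mixed = rotate([(0.8, 0.0), (0.7, 0.0), (0.0, 0.6), (0.0, 0.5)], math.pi / 4)
rotated = varimax_2d(mixed)
print(prominent(rotated))
```

An orthogonal rotation leaves each criterion's communality (the row sum of squared loadings) unchanged, so it affects interpretability but not model fit.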
Maximum likelihood genetic factor scores were estimated by computing the conditional likelihood of the twin pairs’ item responses, weighted by the joint likelihood of the factor score estimates. This is an application of Bayes’ theorem, in that the joint likelihood p(F,R) of the factor scores F and the item responses R is evaluated as p(R|F)p(F). This factor score model was iteratively fitted, separately for each of the 5 different zygosity/sex groups, to each twin pair’s raw data to estimate genetic factor scores for each individual. To validate the genetic factors found in our best-fit twin model, we predicted from these genetic factor scores a representative group of “validating” variables unrelated to the DSM MD criteria. These measures included 1 personality dimension known to be strongly associated with MD,21–23 2 representative internalizing and 1 representative externalizing psychiatric disorder known to be highly comorbid with MD24; and 6 clinical and historical features of MD and 2 demographic variables (sex and years of education as an index of social class) known to be associated with MD.25,26
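The Bayes' theorem step, p(F,R) = p(R|F)p(F), can be illustrated with a much simpler case than the one fitted here: a single latent factor, one individual, binary items under a probit link, and a standard normal prior on the factor. The sketch below finds the factor score that maximizes the joint likelihood on a grid. The loadings, thresholds, and function names are hypothetical, and the twin-pair structure and multiple genetic factors of the actual model are deliberately omitted.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def log_joint(f, responses, loadings, thresholds):
    """log p(R|F=f) + log p(F=f), up to an additive constant."""
    total = -0.5 * f * f  # standard normal prior on the factor score
    for r, lam, tau in zip(responses, loadings, thresholds):
        if r is None:  # item not asked (skipped out), contributes nothing
            continue
        resid_sd = math.sqrt(1.0 - lam * lam)
        p = norm_cdf((lam * f - tau) / resid_sd)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += math.log(p if r == 1 else 1.0 - p)
    return total


def map_factor_score(responses, loadings, thresholds,
                     lo=-4.0, hi=4.0, steps=801):
    """Grid search for the factor score with the highest joint likelihood."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return max(grid, key=lambda f: log_joint(f, responses, loadings, thresholds))


# Hypothetical item parameters for three binary criteria.
loadings = [0.7, 0.6, 0.5]
thresholds = [0.0, 0.5, 1.0]
print(map_factor_score([1, 1, 0], loadings, thresholds))
```

Items coded `None` are simply left out of the likelihood, which is one simple way to handle responses that were never collected.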
To examine whether the genetic factor scores differed from each other in their prediction of the external validators, 2 regression analyses were performed. First, separate regressions were conducted to examine the pattern of differences in prediction for each validator. Second, a model constraining the 3 genetic regression coefficients to be equal within each validator variable was specified. Since the outcome variables were binary or ordinal (or rescaled to be ordinal), the robust weighted least squares mean and variance adjusted estimator in Mplus version 6.027 was used to optimize models. This model estimates probit regression coefficients for each of the genetic factor scores predicting the validator. Since the estimated genetic factor scores are calibrated on a uniform standard scale, the effect size units are more readily interpretable when comparing the coefficients.
Of the sample of 7500 interviewed twins, 3829 (51.1%) denied lifetime episodes of sad mood and loss of interest/pleasure, skipping out of the remainder of the MD section. Table 1 shows, in all subjects and in those screened into the MD section, the endorsement rates for the 9 criteria. In the entire sample, a period of 2 weeks or more of depressed mood was endorsed more frequently than a similar period of loss of interest or pleasure. Among the remaining 7 criteria reported for the time of the worst depressive episode, sleep problems was the most and suicidal ideation the least frequently endorsed.
Model fitting started with basic model 1 (Table 2), which includes 1 genetic, 1 shared environmental, and 1 specific environmental common factor as well as criteria-specific genetic, shared environmental, and specific environmental effects. This is a 111_111 model where the first 3 numbers reflect the number, respectively, of the genetic, shared, and specific environmental common factors and the second set of numbers, the presence (designated by 1) or the absence (designated by 0) of criteria-specific genetic, shared environmental, and specific environmental factors.
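The labeling convention can be made explicit with a small parser. This is only a reading aid for labels such as 111_111; the class and function names are hypothetical, and the sketch assumes single-digit factor counts.

```python
from typing import NamedTuple


class ModelSpec(NamedTuple):
    common_a: int      # number of genetic common factors
    common_c: int      # number of shared environmental common factors
    common_e: int      # number of specific environmental common factors
    specific_a: bool   # criteria-specific genetic effects present
    specific_c: bool   # criteria-specific shared environmental effects present
    specific_e: bool   # criteria-specific specific environmental effects present

    @property
    def n_common(self) -> int:
        return self.common_a + self.common_c + self.common_e


def parse_model_label(label: str) -> ModelSpec:
    """Parse a label of the form '111_111' into its six components."""
    common, specific = label.split("_")
    if len(common) != 3 or len(specific) != 3:
        raise ValueError(f"expected a label like '111_111', got {label!r}")
    if any(ch not in "01" for ch in specific):
        raise ValueError("criteria-specific flags must be 0 or 1")
    a, c, e = (int(ch) for ch in common)
    sa, sc, se = (ch == "1" for ch in specific)
    return ModelSpec(a, c, e, sa, sc, se)


print(parse_model_label("111_111"))
```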
Models 2, 3, and 4 modify the basic model, respectively, testing for quantitative sex effects, qualitative sex effects, and both quantitative and qualitative effects. Of these, only model 3 produced an improved fit index, suggesting the presence of qualitative sex effects.
Working from model 3, we simplified the model by dropping individual criteria-specific genetic (model 5), shared environmental (model 6), and both criteria-specific effects (model 7). All of these models fit better by AIC than model 3 but model 6 produced the best fit.
The next set of models (8, 9, and 10 in Table 2) eliminated, 1 at a time, the single genetic, shared environmental, and unique environmental common factors. None of these improved on the fit of model 6. Dropping the common unshared environmental factor (model 10) produced a large change in AIC relative to the other models, suggesting that unshared environmental factors are a prominent source of risk common to these symptomatic criteria for MD.
Additional models were then fit that added additional common factors to model 6. Models 11, 12, and 13 added, respectively, a second genetic, shared environmental, and unique environmental common factor. All of these models produced much better values of AIC than model 6, with the best result coming from model 11. Adding a second genetic factor improved the AIC by nearly 100 points, robust evidence for multiple genetic factors underlying the MD symptomatic criteria.
Trying to improve on these models, a fifth common factor was added to model 11. Models 14 to 16 explore the fit of a third genetic, a second shared environmental, or second unique environmental common factor. Of these, model 16 fit best. The substantial model improvement supports the existence of multiple unique environmental common factors for the depressive criteria.
Working from model 16, we then added, in models 17 to 19, a third genetic, second shared environmental, or third specific environmental common factor. Both models 17 and 19 produced an improved fit over that observed for model 16, but model 17 fit better. The improvement in fit was 12 AIC units.
Now, given model 17, models 20 to 22 added a fourth genetic, second shared environmental, or third specific environmental common factor. Only model 22 improved on the AIC value of model 17 but only modestly (0.5 AIC unit).
In models 23 to 25, models were further extended to include a fourth genetic, second shared environmental, or fourth specific environmental common factor. None of these models improved on the fit of model 22, making it our best-fit model.
The best-fit model included 3 orthogonal genetic factors that could be substantively interpreted. Factor 1 had prominent loadings on 4 A criteria: “psychomotor changes,” “feelings of worthlessness,” “difficulty concentrating,” and “suicidal ideation.” These criteria reflect the psychomotor/cognitive symptoms of MD. Symptom criteria with the highest loadings on factor 2 were the 2 stem criteria for MD, “sad mood” and “loss of interest/pleasure,” and the symptom of “worthlessness.” These reflect the core mood symptoms of MD. The third factor was defined primarily by neurovegetative-related symptoms, with prominent loadings on “sleep difficulties” and “fatigue.” This factor also had the highest loading for “changes in weight.” Only 1 symptom, fatigue, had a substantial genetic criterion-specific loading.
While significant in aggregate, shared environmental effects were modest for all the DSM-IV A criteria for MD with no particularly coherent pattern of loadings (Figure, B).
The 3 individual-specific environmental common factors were, by contrast, readily interpretable. The first factor had prominent loadings on 7 of the A criteria (all but “guilt” and “suicidal ideation”). This came closest to reflecting a general depressive symptom factor. The second individual-specific environmental common factor had relatively strong loadings on both the screening items and “reduced concentration.” Like the second genetic factor, this unique environmental factor was defined predominantly by mood-related symptoms of MD. The third individual-specific environmental factor had robust factor loadings on 2 criteria that have traditionally reflected core cognitive symptoms of MD: “guilt” and “suicidal ideation.” The criterion-specific environmental factors (which include measurement error) varied from a low of +0.11 for anhedonia to a high of +0.74 for weight/appetite.
Table 3 presents the results of the best-fit model in a different but informative way. Noteworthy is the generally modest total heritability of most of the individual depressive symptoms, with estimates ranging from a low of 12% for weight/appetite changes to a high of 36% for fatigue. Collectively, for lifetime reports of the 9 DSM-IV MD criteria A symptoms, a large proportion of the risk liability was attributable to unique environmental sources.
Based on the rotated results from the overall best-fit model, the level of genetic liability for each of the 3 identified factors was estimated as a factor score for each subject in our sample. These 3 genetic factor scores were then examined for their differential prediction of a representative set of external “validators.” As outlined in Table 4, the 3 factors differed significantly in the magnitude of their predictive effect sizes with 11 of the 12 variables. For 7 of the 12 analyses (neuroticism; lifetime history of generalized anxiety disorder, panic disorder, and alcoholism; age at onset; number of episodes; and treatment seeking), the effect was strongest with the first or cognitive/motor genetic factor. In these analyses, the second strongest loading varied between the second mood and the third neurovegetative genetic factor.
The neurovegetative factor was the strongest predictor for 2 variables: the melancholia subtype and 1 key “endogenous” symptom, early-morning awakening. The other key endogenous symptom (unreactive mood) was equally well predicted by the first and third genetic factors. The second or core mood genetic factor was the only factor that had a significant effect and differential prediction of sex and the most strongly associated with educational status. While the second genetic factor was associated with higher educational status, both the first and third genetic factors were associated with lower educational attainment.
The goal of this report was to investigate the dimensionality of the underlying structure of genetic and environmental risk factors for the DSM-IV symptomatic criteria for MD. We were particularly interested in determining whether genetic influences on these 9 criteria were homogeneous or heterogeneous. The sequence of model-fitting comparisons produced strong statistical evidence against the hypothesis of genetic homogeneity for the symptomatic criteria for MD. The best-fit model required 3 genetic factors. Based on the rotated profile of factor loadings, these 3 factors reflected the psychomotor/cognitive, mood, and neurovegetative features of MD. We estimated maximum likelihood genetic factor scores from the parameters of this best-fit model and showed that the resulting scores were meaningfully differentiated in their patterns of predictions for selected validating criteria. The psychomotor/cognitive genetic factor was the strongest predictor of the personality trait of neuroticism and comorbidity with anxiety disorders. This factor appears to reflect, at least in part, a nonspecific genetic liability to a broad range of internalizing disorders (as detected in multivariate analyses of 2 twin cohorts28,29). By contrast, the third genetic factor (neurovegetative) was most strongly predictive of the melancholic subtype and may reflect genetic risks more specific to MD.
Our model fitting produced evidence for qualitative sex effects on the genetic risk factors for depressive symptoms. These results are consistent with what has been seen in twin models for syndromal MD both in this sample30 and in the large national Swedish twin sample.11
Fatigue was the 1 MD criterion with substantial specific genetic effects. Consistent with prior studies,31,32 this suggests that considerable genetic variation in risk for fatigue exists in the population, much of which is independent of the vulnerability to depression.
We are unaware of any prior multivariate genetic study of DSM criteria for MD with which to compare our results. The most comparable prior study is that of Korszun et al,33 who examined depressive symptom factors using items assessed by the Schedules for Clinical Assessment in Neuropsychiatry in 475 sibling pairs concordant for recurrent MD. They identified 4 factors, the first 3 of which (mood symptoms and psychomotor retardation, anxiety and psychomotor agitation, guilt, and suicidality) were significantly correlated in the sibling pairs. While the sample and assessment techniques and details of results differed from those of our study, both studies detected multiple, largely independent familial dimensions of depressive symptoms.
We also found that the latent common individual-specific environment was multidimensional and a prominent feature of the underlying risk liability structure for the MD symptom criteria. The 3 factors reflected general depressive, mood, and core cognitive symptoms. These results, which reflect a pathoplastic effect of the environment on depressive symptom patterns,34 are consistent with more fine-grained prior work in this cohort demonstrating different depressive symptomatic profiles associated with different classes of stressful life events.35 Environmental stressors for MD do not simply produce aggregate increase in risk for the full depressive syndrome but rather impact, at least in part, selected groups of depressive symptoms.36
A rich clinical literature has long debated the unity vs heterogeneity of the depressive syndrome,37,38 several themes of which are echoed in our findings. First, Beck and Alford39 suggested etiological primacy for the cognitive changes in depression. Of their cognitive triad, only 1—worthlessness—is in the DSM-IV A criteria. However, this symptom is closely associated with suicidality,40,41 so these 2 criteria might roughly index cognitive depressive symptoms. This is consistent with a recent factor analytic study of patients with MD using the Zung scale, which identified a “cognitive” factor that includes “suicidal rumination.”42 These 2 criteria—worthlessness and suicidal ideation—form 1 distinct environmental common factor and a key part of 1 genetic common factor.
Second, neurovegetative symptoms often cluster together in factor analyses of depressive symptoms.43–46 Trainees are typically taught that they reflect the “biological” or “hypothalamic” symptoms of depression. In our analyses, the 3 DSM A criteria assessing neurovegetative symptoms—weight/appetite and sleep changes and fatigue—had high loadings on the same genetic factor. Environmental influences on these symptoms were shared with other criteria in the first general symptom environmental common factor or were criteria specific in their effects.
Third, it has been debated whether sad mood or anhedonia better reflects the core symptoms of MD.47 In our analyses, these 2 symptoms closely tracked each other in both the genetic and environmental factor structures. These criteria appear to reflect quite closely related genetic and environmental causes that were, in turn, partially distinct from those that impacted the more specific depressive symptoms.
In an earlier clinical examination of this cohort,48 we examined individuals meeting DSM-IV criteria for MD in the last year and found that the symptomatic criteria for MD differed substantially in their predictive relationship with a range of clinical validators. Part of this heterogeneity was captured by the distinction between cognitive and neurovegetative symptoms, a distinction reflected in the current analysis in both our genetic and environmental factors. Our current results provide further support for our conclusion of this early report: “These results challenge the equivalence assumption for the symptomatic criteria for MD and suggest a more than expected degree of ‘covert’ heterogeneity among these criteria.”48(p1679)
Results from these current analyses should be viewed in light of findings that DSM criteria for alcohol dependence,49 antisocial personality disorder,50 and conduct disorder51 reflect multiple diverse genetic factors. Many of the historical-clinical syndromes that populate our nosology may reflect multiple dimensions of genetic risk.
Compared with results obtained for schizophrenia and bipolar disorder, efforts to identify, by genome-wide association, molecular genetic variants impacting risk for MD have, to date, been less successful.52–54 There are many reasons why this may be so, including the higher population prevalence and lower heritability of MD.10 However, one possible contributing factor may be the genetic complexity of MD. If the disorder results from the joint effects of multiple independent dimensions of genetic risk, case-control designs may not be an optimal strategy. Greater power may arise from assessing and analyzing these dimensions independently rather than as aggregated together in the clinical syndrome of MD.
These results need to be considered in the context of 6 potential methodological limitations. First, these results could be biased by methodological limitations of the twin method, particularly failure of the equal environment assumption or high levels of assortative mating. Prior analyses of these potential biases in this sample have suggested that their impact is likely to be modest at most.12,55
Second, this is a general population sample so a considerable amount of information is coming from individuals with subsyndromal depressive symptoms. Furthermore, on average, those who meet DSM-IV criteria for MD are only moderately ill. It is not clear how our results would compare with a more severely ill, hospitalized sample of individuals with MD.
Third, for the data analyzed herein, reports of lifetime episodes of MD have been shown in this sample to be of only moderate reliability.56,57 While our results could have arisen from recall bias, it is difficult to construct a plausible pattern of biases that would produce the results observed.
Fourth, the model-fitting process was challenging. Although we used a simple algorithm for model selection and a commonly used and well-tested fit index, how confident can we be in the model selection process?58 The best response to this legitimate concern is to review the model-fitting results. Our conclusion that more than a single genetic factor is needed to account for the patterning of risk liability underlying the 9 MD criteria has robust statistical support: the large improvement in fit (99 AIC units) going from 1 to 2 genetic factors (model 6 vs 11). We are also reasonably sure that 3 genetic factors were present because of the improvement in fit in going from model 16 to model 17 (from 2 to 3 genetic factors; 12 AIC units). By similar logic, it is likely that MD has multiple individual-specific environmental factors (eg, the gain of 17 AIC units from model 16 vs 11). However, we are much less sure whether the correct number of environmental factors is 2 or 3 because of the nearly identical fit of models 22 and 17. As to the specific parameter estimates, some skepticism is appropriate. The patterns of loadings in our best-fit model, especially for the multiple genetic and individual-specific environmental factors, are clinically sensible and consistent with prior literature. They were also relatively stable across different models. Only the results of replication will address these appropriate questions more definitively.
Fifth, it would have been ideal if we had information on the presence of each depressive criterion in all twins. However, in the absence of depressive episodes, it is difficult to meaningfully assess, over a lifetime, the occurrence of many of these symptoms such as problems with sleep or weight. Because we had responses to the 2 “stem” items (sad mood and loss of interest) on our entire sample and especially in twins discordant for endorsing these stem items, the conditional missing data patterns generated by the skip-out structure could be directly modeled.59 Furthermore, we assessed in these twins the presence of all DSM-IV symptoms of depression experienced in the last year. We found, as predicted by our missingness model, that the intercorrelations of nonstem MD criteria were similar in subjects who did vs did not endorse one of the stems. While our model of missing data is unlikely to be precisely true, it is improbable that major biases were thereby introduced. Furthermore, this is a far preferable approach to restricting analysis to only subjects who met full MD criteria.33
Finally, our analyses were restricted to the 9 symptomatic DSM-IV criteria for MD. Our results might have differed considerably if we had analyzed a larger and more diverse set of depressive symptoms.
These results have potentially important implications for the investigation of genetic risk factors for MD. Prior twin studies that have examined the magnitude of aggregate genetic effects for MD, their development over time, or the resulting patterns of comorbidity have assumed that MD reflected a unitary dimension of genetic liability. These results will need to be reconsidered in light of evidence for multiple genetic factors underlying MD. Molecular genetic studies, particularly candidate gene and genome-wide association studies, have also focused almost exclusively on the comparison of subjects meeting criteria for MD with matched controls. If accurate, our results suggest that this approach would likely be inefficient. While these results should be replicated before any widespread changes in analytic strategy are appropriate, they raise our awareness of the widely accepted but rarely tested assumption that the criteria for DSM-IV psychiatric and substance use disorders reflect single dimensions of genetic risk. This assumption is probably unwarranted and should be empirically evaluated rather than assumed.
Funding/Support: This research was supported in part by grants MH-40828 and MH/AA/DA-49492 from the National Institutes of Health. The Virginia Twin Registry is supported by grant UL1RR031990.
Conflict of Interest Disclosures: None reported.