Assessment of eating disorders at the symptom level can facilitate the refinement of phenotypes. We examined genetic and environmental contributions to liability to anorexia nervosa (AN) symptoms in a population-based twin sample using a genetic common pathway model.
Participants were from the Norwegian Institute of Public Health Twin Panel and included all female monozygotic (n = 448 complete pairs and 4 singletons) and dizygotic (n = 263 complete pairs and 4 singletons) twins who completed the Composite International Diagnostic Interview assessing DSM-IV axis I and ICD-10 criteria. Responses to items assessing AN symptoms were included in a model fitted using marginal maximum likelihood.
Heritability of the overall AN diagnosis was moderate (a2 = .22, 95% CI: 0.0; .50), whereas heritabilities of the specific items varied. Heritability estimates for weight loss items were moderate (a2 estimates ranged from .31 to .34), and estimates for items assessing weight concern when at a low weight were smaller (ranging from .18 to .29). Additive genetic factors contributed little to the variance of amenorrhea, which was most strongly influenced by unshared environment (a2 = .16; e2 = .71).
AN symptoms are differentially heritable. Specific criteria such as those related to body weight and weight loss history represent more biologically driven potential endophenotypes or liability indices. Results regarding weight concern differ somewhat from those of previous studies, which highlights the importance of assessing genetic and environmental influences on variance of traits within specific subgroups of interest.
Anorexia nervosa (AN) is a chronic disorder with severe medical and psychological consequences (Becker, Grinspoon, Klibanski, & Herzog, 1999; Garvin & Striegel-Moore, 2001). AN has the highest mortality rate of any psychiatric disorder (Keel et al., 2003; Sullivan, 1995) and is associated with numerous psychological problems, including depression, anxiety, and suicide (Berkman, Lohr, & Bulik, 2007; Birmingham, Su, Hylinsky, Goldner, & Gao, 2005). Yet many questions about the etiology of AN remain (Chavez & Insel, 2007; Stice, 2001).
In the last two decades, investigators have highlighted the influence of genetic factors on eating disorders (see Bulik, 2005; Mazzeo, Slof-Op’t Landt, van Furth, & Bulik, 2006 for reviews). However, examination of genetic and environmental contributions to AN has proven challenging, because of the relative rarity of the disorder, with prevalence estimates among women in the United States and Western Europe approximately 1% (Hoek & van Hoeken, 2003; Hudson, Hiripi, Pope, & Kessler, 2007). In the only twin study to date to examine the heritability of the narrowly-defined DSM-IV AN diagnosis, Bulik et al. (2006) obtained a heritability estimate of .56 (confidence interval [CI]: 0.00–0.87). Bulik et al.’s findings also suggested that unshared environment significantly influences AN symptomatology (i.e., unshared environment accounted for about one-third of the variance in AN). Similarly, studies of broadly-defined AN have supported the role of genetic factors in the etiology of this pernicious disorder (Klump, Miller, Keel, McGue, & Iacono, 2001; Kortegaard, Hoerder, Joergensen, Gilberg, & Kyvik, 2001; Wade, Bulik, Neale, & Kendler, 2000).
Although these diagnostic-level findings are meaningful, and provide direction for future studies, researchers have recently emphasized the importance of assessing eating disorders at the symptom level. As Striegel-Moore and Bulik (2007) noted: “A DSM-IV diagnostic category…might actually represent an occasionally co-occurring yet etiologically diverse mixture of genetically and environmentally influenced symptoms…” (p. 191). Thus, it is important to assess eating disorders at the symptom level to facilitate the refinement of phenotypes. Such refinement could ultimately lead to improvements in treatment and targeted prevention, by clarifying sources of variation for specific components of eating disorder symptomatology (Bulik, 2005).
The purpose of the current study was to assess genetic and environmental influences on AN in a large population-based female twin sample at both the diagnostic and symptom level. In order to do so, analyses were conducted using a marginal maximum likelihood (MML) approach to modeling genetic and environmental effects. This approach overcomes many problems associated with summing items assessing symptoms of an overall diagnosis (or using a single item to assess a diagnosis composed of multiple symptoms). Specifically, as Neale and colleagues (Neale, Lubke, Aggen, & Dolan, 2005) noted, individual items are rarely pure indicators of a latent trait or diagnosis (in this case, AN). Thus, sum scores contaminate the measure of the latent trait with item-specific variance components. For example, the latent trait might have no heritable variation, but if residual symptom variance is heritable then sum scores would also prove heritable. The MML approach makes multivariate analysis of all symptoms practical. In essence, it combines elements of both factor analysis (which enables assessment of the latent trait or diagnosis) and item response theory (IRT), which allows for examination of how “difficult” it is to meet a specific diagnostic criterion (or, in this case, endorse a specific item). This information also provides an indication of how individual items contribute differentially to a diagnosis. Thus, the joint analysis of symptom-level data is much more informative than the sum score approach, in which items of differing quality contribute equally to an overall composite (Neale et al., 2005).
Given the paucity of previous research on the heritability of specific symptoms of AN, no specific a priori hypotheses were proposed. However, the following paragraphs briefly review what is known about the heritability of several specific symptoms of AN, based on studies of broadly defined eating disorders. These studies have examined the relative contributions of three components of variance to specific eating disorder symptoms: additive genetic (A), shared environment (C), and unshared or specific environment (E).
A few recent studies have identified differences in the contributions of genetic and environmental factors to specific AN symptoms (Reichborn-Kjennerud et al., 2004; Wade & Bulik, 2007; Wade, Martin, & Tiggemann, 1998). Reichborn-Kjennerud et al. found that the undue influence of weight on self-evaluation was accounted for by shared and unshared environmental factors; genetic factors did not contribute significantly to the variance of this symptom among either men or women. Similar results were obtained in two studies by Wade and colleagues (Wade et al., 1998; Wade & Bulik, 2007). Specifically, in an earlier study, Wade and colleagues (Wade et al., 1998) found that Eating Disorder Examination Weight Concern scale scores (which also assess the undue influence of body weight on self-concept) were best accounted for by a combination of shared and unshared environmental factors. More recently, Wade and Bulik (2007) found that additive genetic effects had a small but significant contribution to variance in the undue influence of body weight or shape on self-evaluation. However, non-shared environmental factors accounted for the majority of the variance in the undue influence of weight and shape concerns.
In contrast, a study using the Eating Disorder Inventory (EDI) examined the related, yet distinct constructs of Body Dissatisfaction (BD) and Drive for Thinness (DFT) (Keski-Rahkonen, Bulik et al., 2005), yielding evidence for relatively high heritability of DFT and BD among female twins (i.e., a2DFT=.51, 95% CI: 43.7–57.5; a2BD=.59, 95% CI: 53.2–64.7). Similar results regarding BD and DFT were obtained in two earlier studies (Klump, McGue, & Iacono, 2000; Rutherford, McGuffin, Katz, & Murray, 1993). In all of these studies, shared environmental factors did not contribute significantly to the variance of BD and/or DFT. These findings appear to contradict those of Reichborn-Kjennerud et al. (2004) and Wade and colleagues (Wade et al., 1998; Wade & Bulik, 2007). However, these constructs (i.e., weight concerns, undue influence, body dissatisfaction and drive for thinness) are related, yet distinct from one another. As Bulik and colleagues have noted (Bulik et al., 2007), “undue influence of weight on self-evaluation is sometimes confused with body dissatisfaction (Cooper & Fairburn, 1993). However, ‘undue influence…’ has a specific meaning solely relating to the degree that self-evaluation is influenced by weight or shape relative to other factors in the person’s life (e.g., work, specific skills, relationships)” (Bulik et al., 2007, p. S55).
Measurement differences across these studies are important to consider. For example, Reichborn-Kjennerud et al. (2004) used a single-item self-report question (“Is it important for your self-evaluation that you keep a certain weight?”). This item was assessed at an ordinal level and subsequently transformed into a binary item, which results in a loss of information. In contrast, Wade and Bulik (2007) used the Eating Disorders Examination, summed items assessing the undue influence of weight and shape concern, and used their mean in analyses. Keski-Rahkonen, Bulik et al. (2005), as noted above, used the EDI DFT and BD subscales, which assess slightly different facets of the influence of weight on self-evaluation. These differences highlight the importance of construct validity issues, as measurement error can influence estimates of genetic and environmental variance.
BMI is a highly heritable trait (Maes, Neale, & Eaves, 1997), which appears to be influenced by numerous different genes (Rankinen et al., 2006). However, relatively little is known about genetic influences on low BMI and whether available data about the biology of low BMI are relevant to AN (Bulik et al., 2007). One study of Finnish twins (Keski-Rahkonen, Neale et al., 2005) found that, among women, intentional weight loss (≥ five kilograms) was strongly influenced by genetic factors (heritability = 66%, 95% CI: 55–75%). Moreover, the genetic covariance of intentional weight loss and BMI among women in the study was .45, suggesting that the majority of genetic factors affecting BMI differ from those affecting intentional weight loss. However, this study (as well as others which have examined the heritability of BMI, e.g., Maes et al., 1997) did not specifically focus on individuals who were at a low weight. It is possible that genetic and environmental influences could operate differently within the subset of the population who already has a low BMI. Thus, the characteristics of a particular sample or subsample are important to consider in studies of heritability.
Genetic epidemiological studies have not examined the heritability of amenorrhea (Bulik et al., 2007). Nonetheless, it is noted here because it has long been a controversial component of the AN diagnosis (Cachelin & Maher, 1998; Garfinkel et al., 1996). Further, amenorrhea is not limited to any specific eating disorder subtype (Pinheiro et al., 2007). Thus, these authors recommend reconsidering amenorrhea as a diagnostic criterion and propose that it be considered an associated feature of all eating disorders in women.
Although the relevance of genetic factors to eating disorders is becoming increasingly recognized (Bulik, 2005), many questions remain about the influence of environmental and genetic factors on both the overall diagnosis of AN, as well as its specific symptoms. Use of methodology such as MML could facilitate identification of promising endophenotypes or liability indices, which, in turn, could promote the refinement of diagnostic criteria to reflect underlying biological mechanisms more closely (Bulik et al., 2007). The current study represents an early step in this line of research by examining the heritability of the AN diagnosis and its component symptoms in a population-based twin sample.
Participants were from the Norwegian Institute of Public Health Twin Panel (NIPHTP). Twins in the NIPHTP are identified through the Norwegian Medical Birth Registry, which receives mandatory notification of all births. The NIPHTP is described in detail elsewhere (Harris, Magnus, & Tambs, 2002, 2006; Kendler, Aggen, Tambs, & Reichborn-Kjennerud, 2006). Data for the present study came from an interview study of Axis I and Axis II Psychiatric Disorders, which began in 1999. A description of the sample is available in Kendler et al. (2006).
Zygosity was initially based on questionnaire methodology using discriminant analyses. These classifications were recently updated using results from a sub-set of twins for whom zygosity was established from genetic marker analyses and which indicated 97.5% correct original classification (Harris et al., 2006). From these data, we estimated that in our entire interview sample, zygosity misclassification rates are under 1%, a rate unlikely to substantially bias results (Neale, 2003).
Our final sample consisted of 1,722 females; 448 monozygotic female (MZF) pairs, 261 dizygotic female (DZF) pairs, and 22 single responders. From this sample, all female monozygotic (MZ; n = 448 complete pairs and 4 singletons) and dizygotic (DZ; n = 263 complete pairs and 4 singletons) twins were included in the analyses (total N = 1430). Ages of participants ranged from 19.0 to 36.0 (M = 28.19, SD = 3.89). Only women were included in the current study, due to extremely low prevalence rates of AN among men (American Psychiatric Association, 1994).
Data for the present study came from the Norwegian version of the computerized Composite International Diagnostic Interview (CIDI; Wittchen & Pfister, 1997), a comprehensive structured diagnostic interview for the assessment of DSM-IV axis I disorders (American Psychiatric Association, 1994) and ICD-10 diagnoses. A total of 44% of eligible twins participated in the CIDI interview. Interviews were conducted between June 1999 and May 2004. Interviewers were predominantly psychology students in the final part of their studies (equivalent to U.S. students in the final two years of a clinical psychology doctoral program) as well as experienced psychiatric nurses. They were trained in a standardized program by teachers certified by the World Health Organization (WHO) and were supervised closely. Interviews were largely conducted face to face; for practical reasons, 231 interviews (8.3%) were done by telephone. The two twins in each pair were interviewed by different interviewers.
The CIDI was developed by the WHO and the former United States Alcohol, Drug Abuse and Mental Health Administration, and has been shown to have good test-retest and inter-rater reliability (Wittchen, 1994; Wittchen, Lachner, Wunderlich, & Pfister, 1998). Both the paper-and-pencil version of CIDI and the computerized version identical to the one used in this investigation have previously been used in Norway (Kringlen, Torgersen, & Cramer, 2001; Landheim, Bakken, & Vaglum, 2003).
In the current study, eating disorder items were used as observed variables for the latent factor AN. These items were based on responses to interview questions (see Table 1). Participants were first asked if they had ever lost a lot of weight (≥ 15 lbs.) either by dieting or without meaning to (item 1). Second, they were asked if friends or relatives had ever said that they were much too thin or “looked like a skeleton” (item 2). A total of 550 participants endorsed item 1, and 471 endorsed item 2; in total, 765 participants endorsed at least one of these items. If participants endorsed neither, they skipped to the next section of the interview, and their data were coded as missing for the subsequent eating disorder questions. Third, participants were asked the lowest weight they dropped to (or had) after the age of 14 and their height at that time (item 3). If their reported lowest weight was not less than 125 lbs., they skipped to the next section of the interview. A total of 663 participants reported a weight of less than 125 lbs.
Participants who endorsed at least one of the first two items as well as the low weight criterion were subsequently asked questions regarding their fears about regaining weight (at the time of low weight; item 4), whether they considered themselves (item 5) or parts of their bodies (item 6) fat at this time, whether weight impacted their self-evaluation (item 7), whether others told them that their low weight was a hazard to their health (item 8) and whether they missed menstrual periods during this time (i.e., amenorrhea; item 9). The number of participants responding to these questions ranged from 541–546. Scores on these items, except for weight and height, were binary (yes/no).
BMI was calculated based on responses to the question regarding lowest weight since age of 14 and height at that time (item 3). This variable was then divided into quintiles for multivariate ordinal data analysis; 67.4% of participants who reported a period of time when they had lost a lot of weight (item 1) and looked too thin (item 2) reported lowest BMIs less than 18.5, meeting the criteria for underweight (World Health Organization, http://www.who.int/bmi/index.jsp?introPage=intro_3.html). A score of 0 on the polychotomized BMI variable indicated a BMI less than or equal to 16.65. Scores of 1, 2, 3 or 4 indicated BMIs ranging from 16.73–17.58, 17.63–18.49, 18.59–19.49 and greater than or equal to 19.53, respectively.
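The coding of the BMI item described above can be illustrated with a short sketch. It assumes weight in kilograms and height in meters (the interview units are not stated here), and it treats the upper bounds of the reported observed ranges as cut points, so that a value falling in a gap between observed ranges (e.g., 16.70) is assigned to the next category up. The function names are hypothetical and this is not the authors' code.

```python
from bisect import bisect_left

# Upper bounds of categories 0-3, taken from the observed ranges in the text.
# Using them as cut points is an assumption made for this illustration.
UPPER_BOUNDS = [16.65, 17.58, 18.49, 19.49]


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by squared height in meters."""
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> int:
    """Return the ordinal score 0-4 for a BMI value.

    bisect_left counts how many upper bounds are strictly below the value,
    so a BMI equal to a bound stays in the lower category.
    """
    return bisect_left(UPPER_BOUNDS, value)


if __name__ == "__main__":
    lowest = bmi(50.0, 1.70)  # hypothetical participant
    print(round(lowest, 2), bmi_category(lowest))
```

Under this coding, categories 0 through 2 fall below the underweight cutoff of 18.5 mentioned in the text.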
In the current study, we were interested in the extent to which the observed variables (i.e., eating disorder items) were related to the latent trait AN (indicated by item factor loadings) as well as the genetic influences on the latent trait and individual items. Similar to item response theory, an item’s factor loading represents its discrimination, or the likelihood of a symptomatic or non-symptomatic response. Thus, an item-factor approach (Neale, Aggen, Maes, Kubarych, & Schmitt, 2006) was used for the analyses. This procedure can be considered an implementation of the common factor model to multivariate binary or ordinal data, such that the likelihood of item data is computed conditional on the latent trait. To improve speed, we used an MML approach in which the overall likelihood is computed by integrating over the latent trait, which is achieved by specifying a finite mixture distribution for points on the latent trait. Gaussian quadrature weights are assigned to these points along the distribution of the factor; these weighted likelihoods are summed in order to compute the overall likelihood. Of note, use of at least 10 points provides a good approximation of normality (Neale, Aggen et al., 2006).
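The idea of integrating over the latent trait can be sketched for a single individual as follows. This is a simplified illustration, not the authors' implementation: it uses an equally spaced grid with normalized normal-density weights in place of true Gaussian quadrature nodes, it assumes a normal-ogive item model with standardized items, and it ignores the twin structure and the ordinal BMI item. All function names and parameter values are hypothetical.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def grid(n_points: int = 21, limit: float = 4.0):
    """Equally spaced points on the latent trait with normalized normal weights."""
    step = 2.0 * limit / (n_points - 1)
    points = [-limit + k * step for k in range(n_points)]
    density = [math.exp(-0.5 * p * p) for p in points]
    total = sum(density)
    return points, [d / total for d in density]


def endorse_prob(theta: float, loading: float, threshold: float) -> float:
    """P(item endorsed | latent trait), assuming unit total item variance."""
    residual_sd = math.sqrt(1.0 - loading ** 2)
    return normal_cdf((loading * theta - threshold) / residual_sd)


def marginal_likelihood(responses, loadings, thresholds, n_points: int = 21) -> float:
    """Weighted sum over grid points of the conditional pattern likelihood.

    Responses are 1 (endorsed), 0 (not endorsed), or None (missing, e.g.
    skipped because a gateway item was not endorsed).
    """
    points, weights = grid(n_points)
    total = 0.0
    for theta, weight in zip(points, weights):
        conditional = 1.0
        for y, lam, tau in zip(responses, loadings, thresholds):
            if y is None:
                continue  # missing items contribute nothing to the likelihood
            p = endorse_prob(theta, lam, tau)
            conditional *= p if y == 1 else 1.0 - p
        total += weight * conditional
    return total


if __name__ == "__main__":
    # Hypothetical two-item example with the second item missing.
    print(marginal_likelihood([1, None], [0.8, 0.5], [1.0, 0.5]))
```

Because items are conditionally independent given the latent trait, the pattern likelihood at each grid point is a simple product, which is what makes a joint analysis of many items computationally practical.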
Due to skip patterns in the interviews, there were considerable missing data. Moreover, selection on these “gateway” items impacts the estimation of covariation among the items, which is essential for fitting the factor model. Specifically, there will be no variance on the gateway items when data on the probe items are available, because individuals must endorse the gateway items in order to be asked the probe items. Ultimately, this zero variance problem can affect the validity of factor analyses (Neale, Aggen et al., 2006). However, joint analysis of gateway and probe items collected from pairs of twins overcomes this problem, because the covariance between the gateway item and the co-twin’s probe items is available (Neale, Harvey, Maes, Sullivan, & Kendler, 2006).
The model used estimates three main types of parameters. First are the thresholds, which reflect the probabilities that the AN symptoms are endorsed. In the case of BMI, the thresholds subdivide BMI into its categories. Second are the factor loadings, which estimate the association between the latent trait and each of the symptoms. Third are the additive genetic (A), shared environment (C), and specific or individual environment (E) influences on the latent factor. Of note, additive genetic effects are specified to contribute twice as much to the covariance between MZ twins as DZ twins because, for most intents and purposes, MZ twins share all of their genes, and DZ twins share, on average, half of their genes. Shared environmental influences are assumed to be equal among MZ and DZ twins. Specific environmental influences are assumed to be uncorrelated in MZ and DZ twin pairs. In addition, two types of variance are estimated for each item: that which is contributed by the latent factor and residual variance. In this model, residual variance for each item (R in Figure 1) was partitioned into A, C, and E influences.
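The twin covariance assumptions just described imply simple expected correlations for a standardized latent trait: a2 + c2 for MZ pairs and one half of a2 plus c2 for DZ pairs. The sketch below illustrates this with the latent-trait estimates reported for this study (a2 = .22, c2 = .14); the function name is hypothetical.

```python
def expected_twin_correlations(a2: float, c2: float):
    """Expected MZ and DZ correlations for a standardized trait under ACE.

    A contributes fully to MZ covariance and half as much to DZ covariance;
    C contributes equally to both; E contributes to neither.
    """
    r_mz = a2 + c2
    r_dz = 0.5 * a2 + c2
    return r_mz, r_dz


if __name__ == "__main__":
    r_mz, r_dz = expected_twin_correlations(0.22, 0.14)
    print(round(r_mz, 2), round(r_dz, 2))
```

These implied values (.36 and .25) are close to the estimated latent-trait correlations of .37 and .24 reported in the results, with the small differences presumably reflecting rounding of the published estimates.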
Lastly, the significance of the A and C contributions to the latent factor was tested using submodel comparisons (with the full ACE model compared to AE and CE models) as well as the computation of confidence intervals. Parameters for A and C were constrained in two separate submodels; each of these nested models was compared to the full model using a likelihood ratio test (Δχ2). A significant chi-square difference indicates that model fit worsens when parameters are fixed to zero. This procedure is used to determine whether genetic and environmental influences contribute significantly to the latent construct AN. Additionally, the Akaike’s Information Criterion (AIC) for the models, computed as -2lnL - 2df (Akaike, 1987) was examined. However, this index was not exclusively used to determine which model provided the best fit, as it may sometimes yield incorrect results (Sullivan & Eaves, 2002).
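As an illustration of the submodel comparison, the following sketch computes a likelihood ratio test for a nested model that fixes one parameter to zero. It assumes a reference chi-square distribution with 1 degree of freedom, for which the p-value has a closed form using the complementary error function. The function name and the -2lnL values are hypothetical and are not taken from the study.

```python
import math


def lrt_one_df(minus2lnl_full: float, minus2lnl_nested: float):
    """Likelihood ratio test of a nested model with one parameter fixed to zero.

    The test statistic is the difference in -2lnL between the nested and full
    models. For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    """
    delta = max(minus2lnl_nested - minus2lnl_full, 0.0)
    p_value = math.erfc(math.sqrt(delta / 2.0))
    return delta, p_value


if __name__ == "__main__":
    # Hypothetical fit values for a full ACE model and a CE submodel.
    delta, p = lrt_one_df(1000.0, 1001.2)
    print(round(delta, 3), round(p, 3))
```

A p-value above the chosen significance level would indicate that dropping the parameter does not significantly worsen fit.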
Descriptive statistics indicated that 1.9% of the sample met criteria for a lifetime diagnosis of AN. An ACE model (see Figure 1), using an item-factor approach with MML, was first fit to the data. The estimated MZ correlation for the latent trait was .37, while that for DZ pairs was .24. This suggests that the latent trait AN is somewhat heritable. Consistent with this observation, E had the largest contribution to variance in the latent trait [e2 = .64, (95% CI: .49; .79)], and additive genetic and common environmental influences on the latent trait AN were modest [a2 = .22 (0; .50); c2 = .14 (0; .44)]. The majority of items (numbers 1, 4, 5, 6, and 7) had relatively large factor loadings (range .76–.93; see Table 2). Items 2, 8, and 9 had more modest factor loadings (range .43–.58; see Table 2), indicating that relatives or friends telling participants they were too thin, others telling them that their low weight was a hazard to their health, and amenorrhea were less strongly associated with the latent trait. A somewhat surprising finding, however, was that the factor loading for BMI (item 3) was quite low (coefficient = −.05).
Residual variance for each item (i.e., variance that was not due to the latent trait) was partitioned into A, C, and E influences. For all items, the largest amount of residual variance was due to unique environmental factors (see Table 2). However, several items (1, 2, 3, 4, and 7) had moderate proportions of residual variance due to genetic influences. For nearly all items, the amount of residual variance due to common environmental factors was nil, with the exception of items 4 and 9, which had 19% and 14% of residual variance, respectively, due to C.
The total heritability for each individual item (i) was computed as the product of the item’s squared factor loading (λ) and a2 for the latent trait, added to the product of one minus the item’s squared factor loading and the proportion of the item’s residual variance due to A. This equation, where λi is the factor loading for the ith item and a2Ri denotes the proportion of the ith item’s residual variance due to A, is as follows:

$$a^2_{\text{total},i} = \lambda_i^2\, a^2 + \left(1 - \lambda_i^2\right) a^2_{R_i}$$
Similarly, total shared and unique environmental influences on each item were computed using this equation, respectively substituting c2 or e2 and residual variance due to C or E. Four items (numbers 1, 2, 3, and 7) had estimates of heritability ranging from .29 to .34. These items assessed whether participants had ever lost a lot of weight, whether friends and relatives had said they were too thin, BMI, and whether weight affected how they felt about themselves at lowest weight. Items 4 and 5 (whether participants were afraid they would regain the weight at time of lowest weight and whether they still thought they were too fat) had heritabilities of .27 and .23, respectively. Lastly, items 6, 8, and 9 (whether participants still thought parts of their bodies were too fat, whether others told them their weight was a hazard to their health, and amenorrhea) had the lowest heritability estimates (.18, .09, and .16, respectively).
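The item-level computation described above can be sketched as a small function: the total contribution of a variance component to an item is the squared factor loading times that component's share of latent trait variance, plus one minus the squared loading times the component's share of the item's residual variance. The latent-trait values (a2 = .22, c2 = .14, e2 = .64) and the BMI loading (−.05) are taken from the text; the residual proportions below are hypothetical placeholders, not the values in Table 2.

```python
def total_item_component(loading: float, latent_share: float, residual_share: float) -> float:
    """Total share of an item's variance due to one component (A, C, or E).

    loading        : item factor loading on the latent trait
    latent_share   : proportion of latent trait variance due to the component
    residual_share : proportion of the item's residual variance due to it
    """
    common = loading ** 2  # proportion of item variance explained by the factor
    return common * latent_share + (1.0 - common) * residual_share


if __name__ == "__main__":
    # BMI loads only weakly on the latent trait, so its total heritability is
    # driven almost entirely by the residual genetic proportion (hypothetical .30).
    print(round(total_item_component(-0.05, 0.22, 0.30), 4))
```

When both the latent and the residual proportions each sum to one across A, C, and E, the three totals for an item also sum to one.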
Two submodels, an AE and a CE model, were compared to the full ACE model to determine whether additive genetic and common environmental factors significantly influenced the latent trait AN. Results of chi-square tests indicated that dropping A and C separately did not significantly worsen model fit (see Table 3 for a summary of fit information for the full ACE model and each submodel). In addition, confidence intervals for A and C included zero, further indicating that A and C individually were non-significant. However, the confidence interval for E did not include 1.0. This indicates that unique environmental influences alone do not fully explain the etiology of AN. There is evidence for familial aggregation of this latent trait, but there are insufficient data to ascertain whether its origin is genetic, shared environmental, or (most likely) both. Given these results and the sample size, parameters from the full ACE model are more likely to represent the true model than either submodel (Sullivan & Eaves, 2002).
This study examined the relative heritability of specific AN symptoms in a large population-based twin sample using an item-factor approach. The overall heritability of AN was moderate, and lower than that obtained in both the only previous study to examine the full AN diagnosis (Bulik et al., 2006), as well as that found in studies using broader definitions of AN (e.g., Klump et al., 2001; Kortegaard et al., 2001; Wade et al., 2000). However, the current estimate is within the (albeit wide) CI obtained in the Bulik et al. study. The use of sum scores in previous studies (e.g., Bulik et al., 2006), which assessed contributions to the variance of AN at a diagnostic level, may also account for differing results. Heterogeneity of items assessing a given trait, which is not accounted for in models using sum scores, can bias parameter estimates (Neale et al., 2005).
Thus, of particular interest in this study were the symptom-level analyses using the MML method. Items assessing weight loss and weight itself were moderately heritable. Heritability estimates for items assessing weight concern at low weight were somewhat lower, clustering around .25. The amenorrhea item was most strongly influenced by unshared environment. This result further supports the argument that amenorrhea is not a promising endophenotype or liability index for AN, and may be of limited value to the overall diagnosis if we are seeking more biologically valid diagnostic criteria (Bulik et al., 2007; Pinheiro et al., 2007).
Results regarding the influence of weight on self-evaluation differ from those of Reichborn-Kjennerud et al. (2004), who found greater support for the influence of shared and unique environmental factors on this construct. However, current results are more consistent with those of Wade and Bulik (2007), who found small to moderate heritability estimates for the undue influence of weight and shape concern on self-evaluation. Perhaps some of these differences among studies are related to the varying items used to assess this construct. For example, Reichborn-Kjennerud et al. used a single item, self-report question to assess undue influence, whereas Wade and Bulik used EDE items. In the current study, participants were asked, using a single question, how they felt about themselves when their weight was at its lowest.
Further, we only assessed a specific subgroup of the sample, most notably, those with a low enough BMI to be considered for the AN diagnosis. Specifically, participants had to endorse the gateway items to even be asked about self-evaluation. This is a common problem in large scale epidemiological studies, in which participant burden and fatigue must be considered. In Wade and Bulik’s (2007) study, all participants completed the EDE. In theirs as well as Reichborn-Kjennerud et al.’s (2004) investigations, participants did not need a history of low weight to respond to these items assessing undue influence/weight concern. These contrasting results across studies suggest that perhaps genetic and environmental factors operate differently within individuals who are already at a low BMI, compared to the general population. Moreover, it seems important to examine heritability within specific subgroups of interest, as it is possible that heritability estimates obtained at a population level differ from estimates obtained from specific subsets of individuals. Future research should address this possibility.
Current findings also highlight the importance of unshared or unique environmental factors, which contributed significantly to all AN symptoms. These results are similar to those of Wade and colleagues (Wade, Bergin, Martin, Gillespie, & Fairburn, 2006), who found that unshared environmental factors contributed significantly to the number of lifetime eating disordered behaviors. This influence of the unshared environment may reflect individual experiences twins had outside of their family environment that affected their weight-related behaviors, such as comments made by peers, coaches, or other influential people. Future research should examine the interaction of these unique environmental experiences with underlying genetic vulnerabilities. This line of work may help to identify triggering experiences among the subset of the population particularly vulnerable to AN.
Further, it should be noted that, in addition to measuring unshared environmental experiences, the E component of the ACE model captures variance attributable to measurement error. Thus, the relatively higher influence of E found in the current study, compared to others which have evaluated AN at the diagnostic level (e.g., Bulik et al., 2006; Klump et al., 2001; Kortegaard et al., 2001; Wade et al., 2000) could reflect both measurement error and nonshared environmental experiences. It is not possible to determine exactly what proportion of variance accounted for by E in this study is due to either true unshared experiences or to measurement error. Consequently, it is important for future studies to replicate the current methodology, particularly given that estimates of heritability are sample dependent. For example, previous studies have identified significant developmental differences in the influence of genetic and environmental factors on eating disorder symptoms (e.g., Klump, Burt, McGue, & Iacono, 2007; Klump et al., 2000; Silberg & Bulik, 2005). In addition, studies in other areas (e.g., smoking) have found that birth cohort influences estimates of A, C, and E parameters (e.g., Kendler, Thornton, & Pedersen, 2000). In sum, no single study can provide a definitive value regarding the heritability of AN that would be applicable to all. Rather, multiple studies such as this one, which examine genetic and environmental influences on specific AN symptoms, can lead to an accumulation of evidence which will facilitate identification of particularly promising targets for intervention and prevention efforts.
Several limitations of this study should be noted. First, the sample included exclusively Norwegian female twins. Thus, it is unclear whether these results are applicable to men, nontwins, or other cultural groups. Further, measurement issues should be considered, particularly the issue of the gateway items. Use of gateway items is helpful in reducing participant burden and response biases due to fatigue; however, because these items, by definition, screen out the majority of the sample, heritability estimates derived from studies using gateway items assess this component of variance among those individuals who have met the screening criteria. These individuals are likely to differ from those in the total population. In addition, the use of gateway items may have led to an underestimate of the number of women affected by AN, because, as Wade (2007) has noted, AN symptoms are ego-syntonic, and, thus, are likely under-reported by affected individuals. Consequently, our results may not represent the full range of individuals with AN, but may include individuals with more chronic or severe cases. A final measurement limitation is that participants were classified as low weight if their BMI value was <18.5. Age- and gender-specific BMI percentiles are considered more accurate for individuals under the age of 18 (Cole, Flegal, Nicholls, & Jackson, 2007); consequently, the current study may have incorrectly classified as underweight some individuals whose weight was truly in the low-normal range. However, these same individuals would have had to have met all other AN criteria to be diagnosed with the disorder. Thus, it is unlikely that this decision regarding BMI cutoffs significantly influenced the overall results.
Further, substantial attrition was observed in this sample from the original birth registry through three waves of contact. Detailed analyses of the predictors of non-response across waves will be presented elsewhere (Harris et al., unpublished observations), and suggest that cooperation was predicted by female sex, monozygosity, older age, and higher educational status. Few of the mental or physical health measures showed significant effects. Analyses did not show evidence of changes in the genetic and environmental covariance structure due to recruitment bias for a broad range of mental health indicators. While we cannot be certain that our sample was representative with respect to AN psychopathology, these findings suggest that significant bias is unlikely. Finally, in order to increase statistical power, the measure used in the current study assessed lifetime history of AN. Thus, results may have been influenced by recall bias.
Despite these limitations, this study has several strengths, including the use of a large, population-based sample. Further, use of symptom level modeling provides much richer data that can prove informative to the development of endophenotypes or liability indices (Bulik et al., 2007).
This research was supported by the National Institutes of Health Grants MH-068520 (Mazzeo), MH-20030 (Mitchell), MH66117-05 (Bulik, Devlin PI), MH-65322 (Neale), and MH-068643 (PI Kendler). The twin program of research at the Norwegian Institute of Public Health is supported by grants from The Norwegian Research Council, The Norwegian Foundation for Health and Rehabilitation, and by the European Commission under the program “Quality of Life and Management of the Living Resources” of 5th Framework Program (no. QLG2-CT-2002-01254). Genotyping on the twins was performed at the Starr Genotyping Resource Centre at the Rockefeller University. We are very thankful to the twins for their participation.