Eating disorder not otherwise specified (EDNOS) is the most prevalent eating disorder (ED) diagnosis. This meta-analysis aimed to inform DSM revisions by comparing the psychopathology of EDNOS to that of the officially recognized EDs: anorexia nervosa (AN), bulimia nervosa (BN), and binge eating disorder (BED). A comprehensive literature search identified 125 eligible studies (published and unpublished) appearing in the literature from 1987 to 2007. Random effects analyses indicated that while EDNOS did not differ significantly from AN and BED on eating pathology or general psychopathology, BN exhibited greater eating and general psychopathology than EDNOS. Moderator analyses indicated that EDNOS groups who met all diagnostic criteria for AN except for amenorrhea did not differ significantly from full syndrome cases. Similarly, EDNOS groups who met all criteria for BN or BED except for binge frequency did not differ significantly from full syndrome cases. Results suggest that EDNOS represents a set of disorders associated with substantial psychological and physiological morbidity. While certain EDNOS subtypes could be incorporated into existing DSM-IV categories, others, such as purging disorder and non-fat-phobic AN, may be best conceptualized as distinct syndromes.
The systematic classification of mental disorders is an essential enterprise for both clinical research and treatment formulation because clearly defined symptom sets are the sine qua non of valid and reliable assessment. However, clinical cases do not necessarily fall neatly into pre-defined categories. For nearly every class of mental illness, the psychiatric community recognizes the possibility of clinically significant psychopathology that does not meet criteria for an established or emerging disorder. Since the third revision of the Diagnostic and Statistical Manual of Mental Disorders (DSM-III-R, 1987), these unclassifiable syndromes have been labeled “not otherwise specified” (NOS) and have been considered “atypical” remnants of our nosologic system. Residual categories have been accepted as a necessary compromise for the expedience of retaining clear thresholds for the diagnosis of established disorders and preserving the homogeneity of their clinical presentations, but this convenience comes at a cost. Recent data indicate that the majority of individuals with personality disorders (Johnson et al., 2005; Westen & Arkowitz-Westen, 1998), dissociative disorders (Sar, Akyuz, & Dogan, 2007), somatoform disorders (Kuwabara et al., 2007), and eating disorders (Fairburn & Bohn, 2005) are diagnosed with the NOS subtype of these conditions. Because the heterogeneous NOS diagnosis undermines the utility of classifying mental disorders into homogeneous subtypes, its high prevalence can thwart areas as diverse as clinical communication, treatment planning, epidemiological inquiry, primary prevention, and basic research. The present study is a meta-analysis that examines the relationship between eating disorder NOS (EDNOS) and the officially recognized eating disorders.
EDNOS has received a great deal of research attention and therefore provides an illustrative example of the challenges inherent in assigning the majority of cases to the “atypical” category. Of the three eating disorders formally recognized in DSM-IV (i.e., anorexia nervosa, bulimia nervosa, and EDNOS; APA, 2000), EDNOS is by far the most prevalent. Recent research indicates that EDNOS comprises 40% (Button, Benson, Nollett, & Palmer, 2005; Ricca et al., 2001; Rockert, Kaplan, & Olmsted, 2007) to 60% (Fairburn et al., 2007; Martin, Williamson, & Thaw, 2000; Nollett & Button, 2005; Turner & Bryant-Waugh, 2004; Williamson et al., 2002) of treatment-seekers at eating disorder specialty clinics. EDNOS may be even more widespread in non-specialty settings: 90% of eating disorder patients in a community-based outpatient psychiatry practice (Zimmerman, Francione-Witt, Chelminski, Young, & Tortolani, 2008) and 75% of young women with eating disorders in a community prevalence study (Machado, Machado, Gonçalves, & Hoek, 2007) received EDNOS diagnoses. EDNOS is especially common among populations that have received less research attention such as males (Striegel-Moore, Garvin, Dohm, & Rosenheck, 1999), ethnic minority groups (Alegria et al., 2007), aesthetically-oriented athletes (Ringham et al., 2006), young children (Nicholls, Chater, & Lask, 2000), and the elderly (Mangweth-Matzek et al., 2006).
The EDNOS category is so diverse that the label confers little information about a patient’s likely symptoms, course, or outcome, thus undermining its utility as a diagnosis. Only one subtype of EDNOS, binge eating disorder (BED), features its own set of diagnostic criteria in DSM-IV. The remainder of EDNOS cases encompass individuals who exhibit partial syndromes of anorexia nervosa (AN) or bulimia nervosa (BN), show mixed features of both disorders, or have extremely atypical eating behaviors (e.g., chewing and spitting out large amounts of food without swallowing; APA, 2000). Because no DSM-defined boundary clearly differentiates EDNOS from patterns of unusual but non-pathological eating behavior, the conceptual definition of an “eating disorder” is ambiguous, and eating disorder caseness can only be identified in practice through idiosyncratic clinical judgments. Longitudinal studies highlight presentational heterogeneity, suggesting that while some cases of EDNOS are prone to spontaneous remission, others follow a chronic course. Although data on diagnostic crossover in this population are limited, available findings suggest that approximately 40% of individuals with EDNOS go on to develop AN or BN within one year (Milos, Spindler, Schnyder, & Fairburn, 2006) to two years (Herzog, Hopkins, & Burns, 1993) of initial presentation. Longer-term follow-up studies of individuals with EDNOS have identified remission rates of 50% after three years (Milos et al., 2006) and 80% after five years (Ben-Tovim et al., 2001; Grilo et al., 2007), with the remainder of patients retaining eating disorder diagnoses.
The high prevalence of EDNOS relative to officially recognized eating disorders renders the selection of empirically supported treatments difficult for the majority of eating disorder patients. Although cognitive behavioral therapy and fluoxetine have demonstrated clear efficacy for the treatment of BN (Shapiro et al., 2007) and family-based therapy shows promise for the treatment of adolescent AN (Le Grange & Lock, 2005), to date no evidence-based therapy has been developed specifically for EDNOS (with the notable exception of BED; Grilo, Masheb, & Wilson, 2006; Peterson et al., 2001; Wilfley et al., 2002). EDNOS treatments are difficult to operationalize due to the heterogeneity of cases. In the absence of empirical guidance, clinicians are encouraged to cobble together techniques developed for the treatment of the other eating disorders (e.g., National Institute for Clinical Excellence, 2004). However, the efficacy of these borrowed approaches remains unknown. Early randomized controlled trials included only those participants who met strict diagnostic criteria for AN (e.g., Eisler et al., 1997) and BN (e.g., Fairburn et al., 1991), which could restrict their generalizability to a subset of patients seen in actual clinical practice. In response to the low base rate of officially recognized disorders as well as an increasing number of studies that have questioned the validity of particular DSM-IV diagnostic requirements (Cachelin & Maher, 1998; Le Grange et al., 2006), many treatment trials have begun combining therapy outcome data from EDNOS with those of AN and BN. Most notably, several AN trials have included participants who do not meet the amenorrhea criterion (McIntosh et al., 2005; Walsh et al., 2006) and several BN trials have included participants who report binge eating less than twice per week (Bara-Carril et al., 2004; Schmidt et al., 2006).
The application of AN and BN treatments to EDNOS patients, and, conversely, the inclusion of EDNOS patients in AN and BN treatment trials, is predicated on the debatable hypothesis that the psychopathology of EDNOS is commensurate with that of the officially recognized eating disorders.
The current DSM-IV criteria can also impede epidemiological inquiries into the prevalence of eating disorders. Traditional diagnostic instruments such as the Structured Clinical Interview for DSM-IV (First, Spitzer, Gibbon, & Williams, 1997) are likely to overlook EDNOS cases because they instruct assessors to skip out of the eating disorder module as soon as respondents fail to endorse one of the hallmark diagnostic criteria for AN or BN. Therefore, with a few recent exceptions (e.g., Hudson, Hiripi, Pope, & Kessler, 2007; Wade, Bergin, Tiggemann, Bulik, & Fairburn, 2006), the majority of epidemiological studies assessing eating disorder prevalence do not report population rates of EDNOS. Moreover, because EDNOS is defined in DSM-IV solely as a disorder that does not meet criteria for AN or BN and thus features no clear inclusion criteria, individual investigators must create operational definitions of EDNOS for use in epidemiological assessment. Even those investigations that do query for EDNOS typically limit their investigations to a circumscribed set of presentations, such as BED (Hudson et al., 2007), partial syndrome AN, or purging disorder (Wade et al., 2006), which may result in the underestimation of overall rates of eating disorder psychopathology in community samples.
The identification of volitional self-starvation as the core phenomenology of AN (Gull, 1874; Lasègue, 1873) and the binge-purge cycle as the hallmark of BN (Russell, 1979) emanated from idiographic assessment of clinical case series, which formed the first generation of eating disorder classification research. Although other clinical presentations were described in the early literature, including binge eating disorder (Stunkard, 1959) and night eating syndrome (Stunkard, Grace, & Wolff, 1955), only AN and BN were formally recognized in the psychiatric nomenclature through the publication of DSM-III (APA, 1980). To encompass alternative presentations, DSM-III also debuted a category for “atypical eating disorders,” or disorders of clinical significance that did not meet full criteria for AN or BN. At the time, atypical eating disorders were thought to be rare (Ash & Piazza, 1995) and received little empirical attention. As successive versions of the DSM (APA, 1987; APA, 1994) prioritized the diagnostic reliability of AN and BN by promulgating objective thresholds for diagnosis (i.e., revising the suggested AN weight cut-off from 75% of original body weight to 85% of that expected; adding requirements that individuals with BN binge twice weekly and endorse compensatory behaviors), it became clear that a growing number of patients did not meet these new criteria. Thus “atypical” presentations, renamed EDNOS in DSM-III-R to represent “disorders of eating that do not meet the criteria for a specific eating disorder” (APA, 1987; p. 71) and later exemplified with a non-exhaustive list of six possible presentations (APA, 1994; see Table 1), became the topic of increased study. Longitudinal investigations suggest that the prevalence of EDNOS relative to AN and BN has increased over time (Ash & Piazza, 1995), reflecting the growing heterogeneity of clinical cases.
A wealth of data now suggest that, ironically, “atypical” eating disorders represent the most prevalent eating disorder diagnosis in both clinical (Fairburn et al., 2007; Zimmerman et al., 2008) and community (Machado et al., 2007) samples. Moreover, despite the perceived “subclinical” status of these atypical eating disorders, they may exhibit psychopathology commensurate with that of AN and BN (Fairburn et al., 2007). The strikingly high prevalence and severity of EDNOS relative to the officially recognized eating disorders introduce a nosologic paradox into our diagnostic system and call into question how much is really known about the phenomenology of disordered eating.
In response to the shortcomings of the current diagnostic system, investigators have embarked on a third generation of classification research assessing the validity of specific diagnostic criteria, and, in turn, suggesting potential classificatory changes that could be adopted in DSM-V. Proposals parallel suggestions for nosologic improvement across several classes of mental disorders, and, to some extent, reflect deep-rooted taxonomic debates between “lumpers” and “splitters.” The first and most conservative solution would be to simply relax one or more criteria for the main eating disorders so as to subsume the majority of EDNOS patients into the officially recognized categories (Andersen, Bowers, & Watson, 2001; Mitchell et al., 2007). For example, the diagnostic criteria for AN could be made more lenient by omitting the amenorrhea criterion (Cachelin & Maher, 1998) or increasing the weight cut-off (Watson & Andersen, 2003). Similarly, the criteria for BN could be relaxed so as to include individuals who binge less than twice per week (Le Grange et al., 2006; Rockert et al., 2007) or whose binge episodes are not objectively large (Keel, Mayer, & Harnden-Fischer, 2001). Relaxing the current criteria would represent a constructive solution to the EDNOS problem if specific criteria could be identified that do not distinguish well between full and partial syndrome cases. To date, attempts to identify such criteria have not yet converged on obvious candidates. For example, while some investigations have found few differences between individuals with BN versus those who meet all criteria for BN except binge frequency (e.g., Le Grange, Loeb, Van Orman, & Jellar, 2004; Le Grange et al., 2006), others have identified significantly greater psychopathology among individuals with full syndrome BN (Crow, Agras, Halmi, Mitchell, & Kraemer, 2002).
Alternatively, Fairburn and Bohn (2005) have proposed a more revolutionary transdiagnostic solution that would aggregate individuals currently diagnosed with AN, BN, and EDNOS into the single superordinate category of “eating disorder.” Under the transdiagnostic model, eating disorder caseness would be established by evaluating the overall level of clinical impairment engendered by aberrant eating attitudes and behaviors, rather than prioritizing the frequency or severity of individual symptoms (Bohn et al., 2008; Fairburn & Bohn, 2005). This strategy would eliminate specific eating disorder diagnoses altogether in order to create a unitary diagnosis that underscores key similarities across the eating disorders, including dietary restraint, binge eating, compensatory purging, body checking, and weight preoccupation (Fairburn & Bohn, 2005). Combining the three eating disorders under one umbrella diagnosis in DSM-V would better reflect the frequent diagnostic migration observed in longitudinal studies (Herzog et al., 1993; Milos et al., 2006), which may be indicative of a shared etiological mechanism (Milos et al., 2006). Data supporting the adoption of the transdiagnostic model are mixed. While some studies have observed similar levels of psychopathology in EDNOS versus officially recognized disorders (Garfinkel et al., 1995; Moor, Vartanian, Touyz, & Beumont, 2004), others support the notion that EDNOS is a milder variant of eating pathology than either AN or BN (Dancyger & Garfinkel, 1995).
A third possible solution to the EDNOS problem would be to identify and extract distinct diagnostic categories from within the heterogeneous EDNOS group. BED (Spitzer et al., 1991) has already received provisional DSM-IV status as a diagnostic category nominated for further research, and candidates for new DSM-V eating disorder diagnoses include both purging disorder (recurrent purging in the absence of objectively large binge episodes; Keel, Haedt, & Edler, 2005) and night eating syndrome (a pattern of frequent and distressing nocturnal overeating; Allison, Grilo, Masheb, & Stunkard, 2005). However, the benefits of enhanced nosologic coverage must be balanced with the possible risks of over-pathologizing normative behaviors and introducing unwanted redundancies with existing categories (Pincus, Frances, Davis, First, & Widiger, 1992). Identifying new eating disorders would only be prudent if investigators can unearth homogeneous subtypes of EDNOS that both differ meaningfully from AN and BN and are associated with substantial psychosocial impairment.
At an even finer-grained level of distinction, the DSM-V could conceptualize eating disorder subtypes as occupying specific positions in a more general multi-dimensional space (Beumont, Garner, & Touyz, 1994; Williamson, Gleaves, & Stewart, 2005). A dimensional model of eating disorders dovetails with proposals to define personality disorders as maladaptive extremes of the Big Five personality traits (cf. Widiger, 1993). The frequent need for NOS diagnoses represents a limitation inherent to any category-based nosological system, and thus a dimensional model of eating pathology would obviate the need for an atypical category. Beumont et al. (1994) have proposed a three-dimensional system in which all individuals are diagnosed with a “dieting disorder,” but are then further differentiated by the severity of key symptoms including body mass index, binge eating, and purging. Recent taxometric investigations provide some support for a model in which anorexic symptoms are continuous with normality (Williamson et al., 2002), but findings suggesting that bulimic symptoms represent a discrete latent taxon (Gleaves, Lowe, Snow, Green, & Murphy-Eberenz, 2000; Williamson et al., 2002) challenge purely dimensional conceptualizations.
All of the new diagnostic proposals represent creative attempts to solve a diagnostic dilemma in which the majority of clinical cases are relegated to the atypical category. In order to choose among them, or design another alternative solution, the field must weigh the empirical merits of each. In the absence of more robust data on diagnoses, DSM-III-R and DSM-IV decision makers were forced to rely on an extremely modest empirical base when considering potential diagnostic revisions. Fortunately, more than 100 studies have compared various subtypes of EDNOS to the officially recognized eating disorders since the 1994 publication of DSM-IV. These comparisons provide a wealth of data on whether and where DSM-V diagnostic boundaries should be drawn among AN, BN, and EDNOS subtypes. However, conflicting findings, low statistical power, and different definitions of EDNOS across studies have hindered the ability of this literature to foster consensus on suggested revisions. Meta-analysis overcomes these methodological and interpretive difficulties by pooling effect sizes across studies, which enhances statistical power to determine the magnitude and statistical significance of overall effects. Meta-analytic techniques also capitalize on heterogeneity in study design by identifying study-level characteristics (moderator variables) that are systematically associated with larger versus smaller effects.
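To make the pooling step concrete, the following is a minimal sketch of random-effects pooling of standardized mean differences. It assumes the DerSimonian-Laird estimator of between-study variance, which the text does not specify; the function name `random_effects_pool` and any input values are hypothetical illustrations rather than data from the included studies.

```python
import math


def random_effects_pool(effects, variances):
    """Pool study effect sizes under a random-effects model.

    Assumption: between-study variance (tau^2) is estimated with the
    DerSimonian-Laird method; the source text does not name an estimator.
    Returns (pooled effect, standard error, tau^2).
    """
    if len(effects) != len(variances) or not effects:
        raise ValueError("effects and variances must be non-empty and equal in length")

    k = len(effects)
    # Fixed-effect (inverse-variance) weights and mean.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * d for wi, d in zip(w, effects)) / sum_w

    # Cochran's Q measures dispersion of effects around the fixed-effect mean.
    q = sum(wi * (d - fixed_mean) ** 2 for wi, d in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w

    # Truncate tau^2 at zero; with a single study it cannot be estimated.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each study's sampling variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * d for wi, d in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2
```

When the studies agree closely, tau^2 is truncated to zero and the result coincides with the fixed-effect estimate; when they disagree, the weights become more equal across studies and the standard error widens.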
The purpose of the present study was to conduct the first meta-analytic comparison of EDNOS versus the officially recognized DSM-IV eating disorders (AN, BN, and BED). Our study comprised three overarching objectives. First, we aimed to provide a comprehensive, quantitative summary of the differences in eating pathology, general psychopathology, and physical health between EDNOS and each of the officially recognized eating disorders. While small to negligible differences would support the potential application of a transdiagnostic approach in DSM-V, larger differences would confirm the construct validity of extant categories. Second, we evaluated potential moderators of effect size with special emphasis on identifying specific diagnostic criteria that do not distinguish well between full syndrome and partial syndrome AN, BN, and BED. Such criteria could be considered potential candidates for omission in DSM-V. Third, we identified theoretical and methodological limitations of the literature in order to highlight productive directions for future research. In sum, we hoped to first clarify the relative status of each EDNOS subtype, and subsequently highlight potential subgroups that could be removed from this problematic diagnostic category in future schemes for eating disorder classification in order to enhance clinical communication, treatment planning, epidemiological inquiry, and basic research.
Despite its provisional DSM-IV status, we treated BED as an officially recognized eating disorder in the present meta-analysis. We anticipated that organizing the data in this way would be optimally informative because a great deal is already known about BED; a recent review of its nosological status revealed that it has been the topic of more than 1,000 scientific articles (Striegel-Moore & Franko, 2008). Importantly, several latent class analyses support the distinctiveness of BED from BN (Bulik, Sullivan, & Kendler, 2005; Pinheiro, Bulik, Sullivan, & Machado, 2008; Striegel-Moore et al., 2005), long-term outcome studies suggest that BED is a stable syndrome (Pope et al., 2006), and behavioral genetic investigations highlight a pattern of familial aggregation consistent with genetic effects (Javaras et al., 2008). Thus, although the DSM-V Work Group has not yet determined whether BED will be promoted to official eating disorder status in DSM-V, the more pressing question to eating disorder diagnosticians is how DSM-V should classify the heterogeneous remainder of EDNOS cases. Indeed, many classification studies already treat BED as a disorder distinct from AN, BN, and other EDNOS subtypes, including several that have compared BED to subthreshold BED in order to evaluate the validity of specific diagnostic criteria (e.g., Cachelin et al., 1999; Crow et al., 2002; Fitzgibbon, Sanchez-Johnsen, & Martinovich, 2003). Treating BED as an officially recognized eating disorder in the present meta-analysis allowed us to evaluate the potential magnitude and significance of these effects.
The moderator of primary interest to our investigation was the diagnostic composition of the EDNOS group featured in each study. While some studies have compared relatively heterogeneous groups of EDNOS participants to their counterparts diagnosed with AN, BN, or BED (e.g., Clinton & Norring, 1999; Moor et al., 2004), others have examined specific groups of individuals who meet some but not all criteria for AN (e.g., Cachelin & Maher, 1998; Roberto, Steinglass, Mayer, Attia, & Walsh, 2007), BN (e.g., Fitzgibbon et al., 2003; Le Grange et al., 2006), or BED (e.g., Friederich et al., 2007). Variation in the diagnostic features of EDNOS across studies provides an ideal source of data for evaluating the validity of competing proposals for DSM-V revisions. Specifically, non-significant differences between individuals who meet all but one criterion for a particular full syndrome diagnosis (e.g., AN without amenorrhea or BN with less than twice-weekly binge eating) across multiple studies would provide empirical support for recommendations to omit (or relax) that criterion in DSM-V. In contrast, large differences between officially recognized disorders and individuals who fail to meet a particular diagnostic criterion would be consistent with the retention of that criterion, and would support the discriminant validity of extant DSM-IV thresholds. Lastly, small to negligible differences between officially recognized eating disorders and individual subtypes of EDNOS across all three meta-analyses could be interpreted as consistent with recommendations to embrace a transdiagnostic conceptualization of eating disorders (Fairburn & Bohn, 2005) in DSM-V, although admittedly our meta-analysis cannot provide evidence for or against collapsing across the officially recognized categories of AN, BN, or BED.
The use of structured clinical interviews to establish psychiatric diagnoses enhances inter-rater reliability and diagnostic accuracy (Garb, 1998). Because power to detect differences between EDNOS and officially recognized eating disorders should be enhanced with increasingly valid and reliable assessments, we predicted that studies using structured interviews to establish eating disorder diagnoses would exhibit larger effect sizes than those relying on unstructured interviews or self-report measures. Alternatively, the use of structured interviews could potentially lead to smaller differences between EDNOS and officially recognized disorders if assessors who relied on unstructured assessment protocols conferred the EDNOS diagnosis to all treatment-seekers reporting eating-related distress, rather than placing systematic exclusion criteria on the EDNOS diagnosis. In other words, the inclusion of relatively healthy individuals in the EDNOS group could dilute observed levels of eating pathology. In order to evaluate these competing hypotheses, we examined the use of structured interviews as a potential moderator variable.
While the majority of studies evaluating differences between EDNOS and officially recognized eating disorders are based on patient samples, an increasing number of studies have recruited non-treatment-seekers from the community. Previous research has established that greater psychosocial impairment and general psychopathology are associated with increased treatment utilization among individuals with eating disorders (Keel et al., 2002). Greater overall levels of psychopathology in treatment-seekers could overshadow between-group differences in diagnostic status by creating a ceiling effect. Therefore, we hypothesized that differences between EDNOS and officially recognized eating disorders would be smaller in samples consisting of psychiatric inpatients or outpatients than in non-patient community samples.
Disproportionately high rates of EDNOS among children have prompted criticism of the DSM-IV criteria for their presumed inability to capture clinically significant eating pathology in young people (Nicholls et al., 2000). Indeed, investigators have described new eating disorders, such as selective eating disorder (Bryant-Waugh, 2000) and food avoidance emotional disorder (Higgs, Goodyer, & Birch, 1989), which may better encapsulate these unique presentations. In contrast, other theorists have proposed that adolescent EDNOS may signify a prodromal “disorder in evolution” (Le Grange et al., 2004, p. 481) which presages the ultimate development of full-blown AN or BN. Longitudinal investigations have provided some support for adolescent EDNOS as a risk factor for AN and BN (Chamay-Weber, Narring, & Michaud, 2005). Thus, we predicted that differences between EDNOS and officially recognized disorders would vary by age. More pronounced differences in younger age groups would provide support for the recognition of unique childhood eating disorders, whereas smaller differences in younger age groups would bolster the conceptualization of EDNOS as a precursor to one of the established eating disorders.
In order to obtain an exhaustive list of EDNOS studies, we conducted a four-step literature search. First, we searched five electronic databases (PsycINFO, Medline, PubMed, Excerpta Medica Database [EMBASE], and Cumulative Index to Nursing and Allied Health Literature [CINAHL]) to identify studies containing the terms “eating disorder$ not otherwise specified” and “EDNOS.” Because authors do not always use the term EDNOS to describe eating disorders not meeting full criteria for AN, BN, or BED, the four databases that feature the capability to search for adjacent words within the body of an article (PsycINFO, Medline, EMBASE, and CINAHL) were additionally queried with the terms “eating disorder,” “anorex$” (anorexia or anorexic), “bulimi$” (bulimia or bulimic), and “binge eating disorder” adjacent within five words to the terms “atypical,” “partial,” “residual,” “subclinical,” “subthreshold,” “subsyndromal,” “continuum,” “unspecified,” “non-specified,” “NOS,” or “non-classified.” Second, we scanned the reference sections of all eligible studies identified through the electronic database search for additional citations. Third, we hand-searched the January 1987 through February 2007 issues of the four journals identified by the SCOPUS database to publish the highest number of articles on eating disorders (International Journal of Eating Disorders, European Eating Disorders Review, Eating and Weight Disorders, and American Journal of Psychiatry) to locate eligible studies.
Fourth, in appreciation of the potential misrepresentation of population effect sizes due to the greater likelihood that significant (versus non-significant) findings will be accepted for publication (Rosenthal, 1979), we sought out unpublished works via (a) searching for the terms “eating disorder$ not otherwise specified” and “EDNOS” in the ProQuest Dissertations and Theses electronic database, and (b) emailing requests for unpublished or in-press studies to the corresponding authors of each of the eligible studies.
In order to ensure a minimum standard of methodological quality and comparability among studies included in the meta-analysis, we required that studies meet each of the following eligibility criteria:
At the end of the search process, we identified 125 eligible research reports, including 118 published studies and 7 unpublished studies. Of these, 84 reported AN versus EDNOS comparisons, 99 reported BN versus EDNOS comparisons, and 30 reported BED versus EDNOS comparisons.
Two coders independently extracted moderator and effect size data from each study. Coders included the second author, a post-doctoral fellow who coded all of the studies, and three clinical psychology doctoral students who coded the AN, BN, and BED studies, respectively. Before coding, each coder was trained in the use of the coding manual and coded seven practice studies. To investigate potential moderator effects, coders recorded the following features of each study: the diagnostic features of EDNOS subgroup(s) (Table 2), whether or not investigators utilized structured interviews to assign participants to diagnostic categories, whether participants consisted of patients or non-patients, and participants’ mean age. Coders reviewed studies in sets of 20 to avoid coder drift and fatigue, and participated in biweekly consensus meetings with the first author subsequent to each study set to ensure reliability and adherence to the coding manual. Table 3 presents inter-rater reliability coefficients for coders’ initial ratings of putative moderator variables (prior to consensus meetings) in terms of κ for categorical moderator variables and intraclass correlation coefficients for continuous moderator variables. Coefficients ranged from .71 to 1.00, indicating an adequate level of agreement (Landis & Koch, 1977). In cases where coders disagreed on study ratings, discrepancies were resolved through coder discussion, additional review of the article in question, and, when necessary, revision of the coding manual in order to achieve consensus and avoid future discrepancies.
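As an illustration of the agreement statistic reported for categorical moderators, the sketch below computes Cohen's κ for two coders who rated the same set of studies. The function name `cohens_kappa` and the example labels in the usage note are hypothetical; the sketch does not reproduce the coefficients in Table 3 and does not cover the intraclass correlations used for continuous moderators.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two coders rating the same items on a categorical scale."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and equal in length")

    n = len(ratings_a)
    # Observed agreement: proportion of items given the same category.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: product of each coder's marginal proportions per category.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2

    if expected == 1.0:
        # Both coders used one identical category throughout; kappa is
        # undefined here, and this sketch reports perfect agreement.
        return 1.0
    return (observed - expected) / (1 - expected)
```

For example, `cohens_kappa(["patient", "community"], ["patient", "community"])` describes two coders who agree on every study, so agreement beyond chance is complete.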
All effect sizes were calculated as the standardized mean difference of the dependent variable for AN versus EDNOS, BN versus EDNOS, or BED versus EDNOS. When studies provided means and standard deviations on continuous measures, we subtracted the mean of the EDNOS group from the mean of the AN, BN, or BED group and divided by the pooled standard deviation. Therefore, positive d effect sizes indicate that individuals with officially recognized disorders exhibited greater pathology on a particular construct than individuals with EDNOS, whereas negative d effect sizes indicate that EDNOS pathology was greater. When studies provided symptom frequency counts, we calculated odds ratios and transformed them into standardized mean differences. We interpreted effect sizes according to Cohen’s (1988) criteria with absolute values of 0.2, 0.5, and 0.8 representing the guidelines for small, medium, and large effects, respectively.
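The calculations described above can be sketched as follows. The standardized mean difference follows the text directly (officially recognized disorder minus EDNOS, divided by the pooled standard deviation). The odds-ratio conversion is an assumption: the text does not name the transformation, so the sketch uses the common logit method, d = ln(OR) × √3 / π. All function names are hypothetical, and the "negligible" label for values below 0.2 is an illustrative convenience rather than one of Cohen's categories.

```python
import math


def pooled_sd(n1, sd1, n2, sd2):
    """Pooled within-group standard deviation for two independent groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def standardized_mean_difference(mean_dx, sd_dx, n_dx, mean_ednos, sd_ednos, n_ednos):
    """d > 0: the recognized disorder (AN, BN, or BED) shows greater pathology.
    d < 0: the EDNOS group shows greater pathology."""
    return (mean_dx - mean_ednos) / pooled_sd(n_dx, sd_dx, n_ednos, sd_ednos)


def odds_ratio_to_d(odds_ratio):
    """Convert an odds ratio to d with the logit method (an assumed choice)."""
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    return math.log(odds_ratio) * math.sqrt(3) / math.pi


def cohen_label(d):
    """Label |d| using Cohen's (1988) guideline values of 0.2, 0.5, and 0.8."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"
```

Because the EDNOS mean is always subtracted from the recognized-disorder mean, the sign convention stays the same across the AN, BN, and BED comparisons.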
Eight of the 125 eligible studies did not provide sufficient information from which to calculate effect sizes directly. Specifically, five studies reported group means but not standard deviations on well-validated dependent measures (e.g., Eating Disorder Inventory, Symptom Checklist-90), and three additional studies provided graphical rather than numerical representations of group differences. Omitting these studies from the meta-analysis would have (1) defeated our intended purpose of synthesizing all available comparisons between EDNOS and officially recognized eating disorders, and (2) potentially introduced bias into our analyses to the extent that studies with missing data differed on important design features from studies with non-missing data. Therefore, we estimated effect sizes based on past precedent. First, as recommended by Furukawa and colleagues (2006), we obtained standard deviations from previously established norms for eating disorder patients for use in effect size calculations in the case of missing variance estimates. Second, following Keel and Klump (2003), in three cases where data were presented in graphic form only, coders measured the height of the graphs (intraclass correlation coefficient for inter-rater reliability = 1.00) to determine group means.5
In order to ensure the independence of effect sizes within each meta-analysis, we aggregated effect sizes at the study, sample, and construct levels. At the study level, multiple papers occasionally emanated from the same sample of participants. We considered studies to be duplicates only if investigators presented clearly identical data in more than one study, or explicitly cited another study that was also eligible for inclusion in the meta-analysis in which their data had already been reported. In the case of duplicate samples, we selected more recent studies over earlier studies, and peer-reviewed studies over book chapters or unpublished manuscripts. At the sample level, when investigators reported data separately for multiple subgroups of AN, BN, or BED participants (e.g., separate data for BN-purging type and BN-non-purging type), we collapsed data from the two subgroups by taking the sample size-weighted mean and the pooled standard deviation in order to create an aggregate effect size for the overarching diagnostic category. Finally, at the construct level, the majority of studies utilized multiple measures of eating pathology, general psychopathology, and physical health. Rather than violating the independence assumption by allowing each sample to contribute multiple effect sizes or losing information by randomly selecting a single effect size, we averaged all measures of eating pathology emanating from the same study into a single effect size; we followed the same procedure for general psychopathology and physical health outcomes (cf. Ackerman, Beier, & Boyle, 2005; Spielmans, Pasek, & McFall, 2007). The only instance in which we allowed a study to contribute more than one effect size to the same meta-analysis was when investigators reported data for more than one diagnostic subtype of EDNOS. In that case we included comparisons between each EDNOS subtype and its full syndrome counterpart in order to preserve information for the diagnostic criteria moderator analyses (which is analogous to previous meta-analyses comparing multiple treatment groups to the same control group in order to evaluate differential treatment efficacy, e.g., Weisz, McCarty, & Valeri, 2006).
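The sample-level and construct-level aggregation steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the pooled standard deviation is taken as the usual within-subgroup pooled estimate, and the function names, construct labels, and numbers are hypothetical.

```python
import math
from statistics import mean


def collapse_subgroups(subgroups):
    """Collapse (mean, sd, n) tuples into one (weighted mean, pooled sd, total n)."""
    total_n = sum(n for _, _, n in subgroups)
    weighted_mean = sum(m * n for m, _, n in subgroups) / total_n
    # Pooled within-subgroup variance, weighting each subgroup by n - 1.
    pooled_var = sum((n - 1) * sd**2 for _, sd, n in subgroups) / (total_n - len(subgroups))
    return weighted_mean, math.sqrt(pooled_var), total_n


def average_by_construct(effects):
    """Average all d values from one study that belong to the same construct."""
    grouped = {}
    for construct, d in effects:
        grouped.setdefault(construct, []).append(d)
    return {construct: mean(ds) for construct, ds in grouped.items()}


# Hypothetical BN-purging and BN-non-purging subgroups from a single study.
bn_mean, bn_sd, bn_n = collapse_subgroups([(20.0, 4.0, 30), (16.0, 4.0, 10)])

# Hypothetical effect sizes from one study: two eating measures, one general measure.
study_effects = average_by_construct(
    [("eating pathology", 0.4), ("eating pathology", 0.6), ("general psychopathology", 0.1)]
)
print(bn_mean, bn_sd, bn_n, study_effects)
```

Averaging within a construct means each study contributes at most one effect size per construct, which is what preserves independence across the effect size distribution.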
In order to determine the magnitude and significance of the overall effect size for each construct, we fit a random effects model. In contrast to a fixed effects approach, which assumes that individual effect sizes differ from the population mean through sampling error alone, a random effects approach assumes that variability among effect sizes is due to sampling error as well as unsystematic, random sources that vary across studies. Random effects models are more appropriate than fixed effects models for use with real-world data in which sample effect sizes are likely drawn from an underlying population in which effect size parameters themselves may vary (Field, 2003). Random effects models also feature the advantage of substantially reduced Type I error rates and enhanced ability to generalize findings beyond studies included in the meta-analytic sample because random effects approaches are more statistically conservative than fixed effects approaches (Field, 2003; Lipsey & Wilson, 2001).
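To make the random effects logic concrete, the sketch below pools a set of effect sizes using the DerSimonian-Laird (method of moments) estimate of between-study variance. The choice of this estimator is an assumption for illustration; the text specifies only that a random effects model was fit. Function names and inputs are hypothetical.

```python
import math


def random_effects_summary(effects, variances):
    """Random effects pooled mean, using a DerSimonian-Laird tau^2 (assumed estimator)."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two effects with matching variances")
    # Fixed effect step: inverse-variance weights and weighted mean.
    w = [1.0 / v for v in variances]
    fixed_mean = sum(wi * d for wi, d in zip(w, effects)) / sum(w)
    # Q quantifies the observed dispersion around the fixed effect mean.
    q = sum(wi * (d - fixed_mean) ** 2 for wi, d in zip(w, effects))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random effects step: add tau^2 to every study's sampling variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    re_mean = sum(wi * d for wi, d in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {
        "mean": re_mean,
        "se": se,
        "ci95": (re_mean - 1.96 * se, re_mean + 1.96 * se),
        "tau2": tau2,
    }


# Two hypothetical studies with discrepant effects and equal precision.
summary = random_effects_summary([0.0, 0.8], [0.01, 0.01])
print(summary)
```

In this example the fixed effect standard error would be about 0.07, whereas the random effects standard error is 0.40, which illustrates why the random effects approach is the more conservative of the two when studies disagree.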
After calculating the overall effect size for each dependent variable, we evaluated the degree of heterogeneity in each effect size distribution by calculating the Q statistic. We followed up significant Q statistics with moderator analyses designed to evaluate whether variation could be explained in part by study-level characteristics recorded during the coding process. In order to evaluate the influence of categorical moderator variables, we utilized a mixed effects analogue to ANOVA which treated individual effect sizes as random effects, and moderator variables as fixed effects. Conceptualizing moderators as fixed (rather than random) in an overall mixed effects model was theoretically consistent with levels of each moderator representing mutually exclusive and collectively exhaustive definitions of naturally occurring population categories. For each ANOVA, we have provided a Q statistic that evaluates whether the moderator accounts for a statistically significant proportion of the variation in the effect size distribution. Because each effect size in the present meta-analysis represents the standardized mean difference between EDNOS and an officially recognized eating disorder, a significant Q statistic in the context of a moderator analysis indicates that these mean differences are significantly larger in some subgroups of studies and smaller in others. In order to investigate the influence of continuous moderator variables, we applied mixed effects regression based on the method of moments estimation procedure. We conducted all statistical analyses using the Comprehensive Meta-Analysis Version 2.0 software program (Borenstein, Hedges, Higgins, & Rothstein, 2005).
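The heterogeneity test and the categorical moderator test described above rest on the same computation, sketched here under simplifying assumptions. The Q statistic is the weighted sum of squared deviations from the weighted mean; the between-groups Q is that same quantity applied to subgroup summary means and their variances. The subgroup values are hypothetical, and the sketch does not reproduce the internals of the software used in the study.

```python
def q_statistic(effects, variances):
    """Return (Q, df) for effects with known variances, using inverse-variance weights."""
    w = [1.0 / v for v in variances]
    weighted_mean = sum(wi * d for wi, d in zip(w, effects)) / sum(w)
    q = sum(wi * (d - weighted_mean) ** 2 for wi, d in zip(w, effects))
    return q, len(effects) - 1


def q_between(subgroup_means, subgroup_variances):
    """Moderator test: Q computed across subgroup means (df = number of subgroups - 1)."""
    return q_statistic(subgroup_means, subgroup_variances)


# Hypothetical summary effects for three EDNOS subtypes, each with variance 0.01.
qb, df = q_between([0.1, 0.5, 0.3], [0.01, 0.01, 0.01])

# Critical chi-square value for df = 2 at alpha = .05.
CRITICAL_CHI2_DF2 = 5.991
print(round(qb, 2), df, qb > CRITICAL_CHI2_DF2)
```

Q is referred to a chi-square distribution with the stated degrees of freedom, so a between-groups Q above the critical value indicates that the mean difference between EDNOS and the full syndrome varies across the moderator's levels.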
There were a total of 456 effect sizes from the 84 eligible studies that compared AN to EDNOS. Adjusting for dependencies in the effect size distribution by averaging non-independent effects produced 73 effect sizes for eating pathology, 53 for general psychopathology, and 11 for physical health. Individual effect sizes from each study are displayed in Table 4, and results of moderator analyses are displayed in Table 5. Studies featured a median sample size of 88 participants, and aggregating across studies, a total of 11,557 participants were included in the AN meta-analyses.
There was a trend for individuals with AN to score higher than individuals with EDNOS on measures of eating pathology, d = 0.09, standard error = 0.06, 95% CI [−0.01, 0.20], but the difference was not statistically significant, p = .09. The effect size distribution exhibited significant heterogeneity, Q(72) = 249.10, p < .001, with effect sizes ranging from −1.50 to 2.72 (median = 0.11), inviting the evaluation of moderator hypotheses. As predicted, the diagnostic features of EDNOS accounted for a significant proportion of the variance in the effect size distribution in a mixed effects ANOVA, Q(4) = 10.50, p = .03. Specifically, AN exhibited significantly higher levels of eating pathology than EDNOS groups meeting all criteria for AN except fat phobia, d = 0.74, p = .0016. In contrast, AN did not differ significantly from EDNOS groups meeting all criteria for AN except amenorrhea (d = 0.20, p = .32), EDNOS groups meeting all criteria for AN except the weight criterion (d = −0.02, p = .93), and EDNOS groups meeting at least two criteria for AN (partial AN; d = −0.03, p = .72). AN also did not differ from heterogeneously defined EDNOS groups or those that featured no information about their diagnostic characteristics, d = 0.06, p = .37. In contrast to our predictions, none of the remaining putative moderators accounted for a significant proportion of variability in the effect size distribution, including the use of structured interviews to assign participants to diagnostic categories, Q(1) = 2.20, p = .14, participant age, Q(1) = 2.53, p = .11, and the use of a patient versus non-patient sample, Q(1) = 0.00, p = .96.
Individuals with AN did not differ from individuals with EDNOS in terms of general psychopathology, d = 0.02, standard error = 0.05, 95% CI [−0.07, 0.11], p = .68. Effect sizes ranged from −0.91 to 0.94 (median = 0.02), and exhibited significant heterogeneity, Q(52) = 101.90, p < .001. With regard to putative moderators, patient samples (d = −0.01, p = .76) exhibited significantly smaller effects than non-patient samples (d = 0.39, p = .001), Q(1) = 10.87, p = .001, and effect sizes increased with increasing age, Q(1) = 5.94, p = .01. However, neither the diagnostic features of EDNOS, Q(4) = 0.94, p = .92, nor the use of clinical interviews to assign diagnoses, Q(1) = 0.78, p = .38, was a significant moderator of effect size.
Lastly, AN did not differ significantly from EDNOS on measures of physical health, d = 0.14, standard error = 0.16, 95% CI [−0.18, 0.45], p = .40. The Q statistic was indicative of significant heterogeneity in the effect size distribution, Q(10) = 32.09, p < .001, with effect sizes ranging from −0.60 to 0.93 (median = 0.09). Neither the use of a structured interview to assign diagnoses, Q(1) = 0.30, p = .59, nor participant age, Q(1) = 0.00, p = .97, accounted for a significant proportion of the variance in the effect size distribution. We did not investigate the potential effects of EDNOS diagnostic features and patient versus non-patient sample because some levels of these moderators featured too few studies (i.e., zero or one).
There were a total of 561 effect sizes from the 99 eligible studies that compared BN to EDNOS. Adjusting for dependencies in the effect size distribution by averaging non-independent effects produced 82 effect sizes for eating pathology, 62 for general psychopathology, and 14 for physical health. Studies featured a median sample size of 84 participants, and aggregating across studies, a total of 13,682 independent participants were included in the BN meta-analyses. Effect sizes for individual studies are displayed in Table 6. Table 7 summarizes the results of BN moderator analyses.
BN scored significantly higher than EDNOS on measures of eating pathology, d = 0.39, standard error = 0.05, 95% CI [0.29, 0.50], p < .001. Effect sizes ranged from −1.11 to 2.50, with a median of 0.41. The Q statistic confirmed significant heterogeneity in the effect size distribution, Q(81) = 329.81, p < .001, inviting the evaluation of moderator hypotheses. Contrary to our prediction, the diagnostic features of EDNOS did not explain a significant proportion of the heterogeneity in the effect size distribution in a mixed effects ANOVA, Q(3) = 5.12, p = .16. However, recognizing the potential importance of this putative moderator for DSM-V revisions, we conducted focal analyses of a priori planned contrasts for each of the four diagnostic groups subsequent to the non-significant omnibus test (as described in Rosenthal, 1995; Rosenthal, Rosnow, & Rubin, 2000). Planned contrasts suggested that effects were small to moderate when the EDNOS group met all criteria for BN except for objectively large binge episodes (i.e., purging disorder; d = 0.39, p < .001), when the EDNOS group missed two or more of the diagnostic criteria for BN (partial BN; d = 0.29, p = .01), and when the EDNOS group was heterogeneously defined or no information was given about its clinical characteristics (d = 0.43, p < .001). In contrast, when the EDNOS group met all criteria for BN except the binge frequency criterion, differences from BN were small and non-significant, d = 0.10, p = .49. In addition, mixed effects regression indicated that mean participant age was inversely associated with effect size, Q(1) = 4.72, p = .03, such that younger participants exhibited larger differences than older participants. Finally, use of a structured interview versus an unstructured interview or self-report instrument [Q(1) = 0.02, p = .89] and recruitment of a patient versus non-patient sample [Q(1) = 1.38, p = .24] did not account for a significant proportion of the heterogeneity in the effect size distribution.
BN groups scored significantly higher than EDNOS groups on general psychopathology, although the mean effect size was small (d = 0.19, standard error = 0.03, 95% CI [0.13, 0.25], p < .001). Effect sizes ranged from −1.36 to 0.79 with a median of 0.17. The Q statistic did not reach significance, Q(61) = 72.82, p = .14, so potential moderator effects were not investigated.
In contrast to patterns for eating pathology and general psychopathology, EDNOS groups exhibited significantly poorer physical health than BN groups, d = −0.18, standard error = 0.08, 95% CI [−0.34, −0.02], p = .03. Effect sizes ranged from −0.65 to 0.68 with a median of −0.16. Moderators were not evaluated because the Q statistic did not reveal significant heterogeneity in the effect size distribution, Q(13) = 8.95, p = .78.
There were a total of 260 effect sizes from the 30 eligible studies that compared BED to EDNOS. Aggregating effect sizes to adjust for non-independence produced 29 effect sizes for eating pathology and 24 for general psychopathology. We did not identify any studies comparing physical health variables between individuals with BED and EDNOS. The median sample size was 67 participants, and summing across all 30 studies, a total of 2,707 participants were included in the following analyses. Table 8 presents effect sizes for each of the individual BED studies, and Table 9 provides a summary of BED moderator findings.
There was a trend for BED to score higher than EDNOS on measures of eating pathology, d = 0.17, standard error = 0.09, 95% CI [−0.01, 0.35], but the difference was not statistically significant, p = .06. The effect size distribution exhibited significant heterogeneity, Q(28) = 100.39, p < .001, with effect sizes ranging from −0.79 to 1.17 (median = 0.18). As predicted, the diagnostic features of EDNOS accounted for a significant proportion of this heterogeneity, Q(2) = 9.85, p = .01, according to mixed effects ANOVA. Specifically, BED groups scored significantly higher on measures of eating pathology than groups of partial BED participants who missed two or more of the diagnostic criteria for BED (d = 0.46, p < .001). However, BED did not differ significantly from EDNOS groups in which participants met all diagnostic criteria for BED except the binge frequency criterion (d = 0.28, p = .37), nor did it differ significantly from EDNOS groups that were heterogeneously defined or whose diagnostic characteristics were not specifically described (d = −0.07, p = .58). In addition, studies that recruited patient samples yielded smaller effects (d = −0.07, p = .67) than studies recruiting non-patient or mixed samples (d = 0.35, p < .001), Q(1) = 5.45, p = .02. There was also a trend for studies utilizing structured interviews to find smaller differences (d = −0.02, p = .91) than studies using unstructured interviews or self-report measures (d = 0.33, p = .01), Q(1) = 3.64, p = .06. Finally, mean age of participants did not significantly predict effect size, Q(1) = 1.89, p = .17.
In line with findings for eating pathology, BED did not differ from EDNOS with regard to general psychopathology, d = 0.03 (standard error = 0.07, 95% CI [−0.10, 0.16], p = .66). Effects ranged from −0.84 to 1.28 with a median of 0.02, and the heterogeneity statistic again reached significance, Q(23) = 40.37, p = .01. The diagnostic features of EDNOS accounted for a significant proportion of effect size heterogeneity in a mixed effects ANOVA, Q(2) = 7.36, p = .03. Specifically, there was a trend for BED to score lower on general psychopathology than heterogeneously defined EDNOS groups (d = −0.13, p = .06). However, BED did not differ significantly from EDNOS groups in which participants met all criteria for BED except binge frequency (d = 0.26, p = .14), nor did BED differ significantly from partial BED groups who missed at least two of the BED diagnostic criteria (d = 0.16, p = .19). With regard to the remaining putative moderators, studies using structured interviews did not differ from studies using unstructured interviews or self-report measures, Q(1) = 2.71, p = .10, nor did studies using patient versus non-patient samples, Q(1) = 0.13, p = .72. Finally, mean participant age [Q(1) = 1.63, p = .20] was not a significant predictor of effect size.
Because neither AN nor BED differed significantly from EDNOS on eating pathology, general psychopathology, or physical health, publication bias did not represent a plausible explanation for the findings of those analyses. However, we did investigate the possibility that a bias favoring the publication of significant results accounted for the BN findings by evaluating publication status (published vs. unpublished) as a moderator variable and also by calculating Orwin’s fail-safe N (the number of studies with a Cohen’s d of 0 that would need to be added to the meta-analysis in order to bring the overall effect size to a negligible level, set to an absolute value of 0.10 in this case; Orwin, 1983). Moderator analyses indicated that effect sizes did not differ by publication status for either eating pathology [Q(1) = 0.20, p = .65] or general psychopathology [Q(1) = 0.68, p = .41]. The fail-safe N was 208 studies for eating pathology and 51 studies for general psychopathology, suggesting that these two findings would be relatively robust to file drawer discoveries. In contrast, the fail-safe N for the BN physical health meta-analysis was only 12 studies, and publication status could not be investigated as a moderator because only published studies reported effect sizes for physical health.
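The fail-safe N logic described above can be sketched with the standard form of Orwin's formula, N = k(|d| − d_c) / (d_c − d_fs), where d_c is the negligible criterion (0.10 here) and d_fs is the assumed effect of the added studies (0 here). The function name and example inputs are hypothetical. The values reported in the text depend on the exact inputs used by the analysis software, so this sketch is not expected to reproduce them digit for digit.

```python
def orwin_failsafe_n(k, mean_d, criterion_d=0.10, missing_d=0.0):
    """Number of added studies averaging missing_d needed to pull |mean_d| to criterion_d."""
    if criterion_d <= missing_d:
        raise ValueError("criterion_d must exceed missing_d")
    magnitude = abs(mean_d)  # the criterion is defined on the absolute effect size
    if magnitude <= criterion_d:
        return 0.0  # the observed mean is already negligible
    return k * (magnitude - criterion_d) / (criterion_d - missing_d)


# Hypothetical meta-analysis: 10 studies with a mean d of 0.50.
n_needed = orwin_failsafe_n(k=10, mean_d=0.50)

# Check: adding n_needed null studies brings the unweighted mean down to the criterion.
diluted_mean = (10 * 0.50 + n_needed * 0.0) / (10 + n_needed)
print(round(n_needed, 1), round(diluted_mean, 2))
```

In practice the result is rounded up to the next whole study. Because the formula scales with both k and the distance of the mean from the criterion, a small effect based on few studies yields a small fail-safe N.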
Approximately 40% (Button et al., 2005; Ricca et al., 2001) to 60% (Fairburn et al., 2007; Martin et al., 2000; Nollet & Button, 2005; Turner & Bryant-Waugh, 2004) of individuals with eating disorders do not fulfill diagnostic criteria for the officially recognized DSM-IV eating disorders and are therefore given the residual EDNOS diagnosis. Our meta-analysis examined the differences in EDNOS versus AN, BN, and BED in order to inform potential improvements to eating disorder classification. Results demonstrated that while EDNOS did not differ significantly from AN or BED in terms of eating pathology or general psychopathology, individuals with BN scored significantly higher than individuals with EDNOS on measures of eating (d = 0.39) and general (d = 0.19) psychopathology. In contrast, EDNOS exhibited significantly poorer physical health than BN (d = −0.18) but did not differ from AN in this regard.
Overall, despite its perceived subclinical nosological status in DSM-IV, EDNOS did not exhibit large differences in eating and general psychopathology compared to AN and BED. The non-significant differences between AN and EDNOS observed in our meta-analysis can be integrated with the findings of recent taxometric analyses suggesting that AN represents the severe end of a continuum that is dimensional with normal eating behavior (Williamson et al., 2002). Within a continuum model, EDNOS may lie closer to AN, with restrained eaters and chronic dieters lying between cases and non-cases. In contrast, significant differences between EDNOS and BN support findings that BN may represent a latent taxon distinct from normality (Gleaves et al., 2000; Williamson et al., 2002). Although differences in psychopathology between EDNOS and BN were statistically significant, the modest size of these differences does not fully support the notion of EDNOS as a mild variant of an eating disorder. Consider that even an effect size of 0.39 (the difference between BN and EDNOS in eating pathology) is well below the average standardized mean difference in post-treatment bulimic symptoms (0.95) reported in a meta-analysis of studies comparing cognitive behavioral therapy versus no treatment for BN (Hay, Bacaltchuk, & Stefano, 2004). Moreover, in keeping with Wonderlich and colleagues’ (2007) observation that different clinical validators may lead to different conclusions regarding the status and severity of eating disorder subtypes, EDNOS exhibited significantly poorer physical health than BN. Although this latter finding is based on a relatively small number of studies (k = 14) and could be subject to treatment-seeking bias, it underscores the potential severity of the medical complications associated with EDNOS as an important area for clinical attention and future research. 
In summary, the findings of the present study highlight EDNOS as a set of clinically meaningful eating disorders associated with high levels of psychiatric and general medical morbidity.
It is noteworthy that patterns of results differed across officially recognized diagnoses as well as EDNOS subtypes. The significant differences between BN and EDNOS observed in this meta-analysis are not consistent with the wholesale application of a transdiagnostic approach that would eliminate diagnostic distinctions by classifying EDNOS in the same superordinate category as the officially recognized eating disorders. Although this parsimonious model is attractive in its potential to simplify the dissemination of empirically supported psychotherapies and to clarify the ambiguous definition of eating disorder caseness (Fairburn, 2008; Fairburn & Bohn, 2005), it could potentially cause clinicians and researchers to overlook the observed differences between full and partial syndrome cases. Indeed, taken together with existing literature indicating clear differences in treatment outcome and mortality for AN (Steinhausen, 2002), BN (Keel, Mitchell, Miller, Davis, & Crow, 1999) and BED (Fairburn, Cooper, Doll, Norman, & O’Connor, 2000), our moderator analyses were more consistent with nosologic recommendations to relax certain diagnostic criteria for AN, BN, and BED and, potentially, to identify and extract homogeneous subtypes from within the heterogeneous EDNOS category.
Moderator analyses indicated that certain diagnostic features were associated with larger differences between full and partial syndrome AN. Despite recent calls for a culture-free conceptualization of AN that encompasses alternate rationales for food refusal (Lee, Ho, & Hsu, 1993; Lee, Lee, Ngai, Lee, & Wing, 2001), moderator analyses indicated that individuals who met all criteria for AN except fat phobia exhibited significantly lower levels of eating pathology than individuals with full syndrome AN. Although this finding was based on a modest number of studies (k = 5), similar effects were observed across research groups in two different non-Western societies (China and Japan), and the magnitude of the difference in psychopathology between fat-phobic and non-fat-phobic AN (d = .74) was the largest observed in the present meta-analysis. Our cross-sectional findings dovetail with prospective longitudinal data that trace a more benign naturalistic course for non-fat-phobic AN. Compared to individuals with typical AN, individuals endorsing AN without fat phobia experience higher rates of long-term remission and reduced tendency to develop bulimic symptoms over time (Lee, Chan, & Hsu, 2003; Strober, Freeman, & Morrell, 1999). Taken together, available data support the retention of fat phobia as a core AN diagnostic criterion in DSM-V. In contrast, individuals who met all diagnostic criteria for AN except amenorrhea did not differ significantly in eating pathology from individuals who met all diagnostic criteria. Although drawn from a small number of studies (k = 4), this finding is consistent with recommendations to drop the amenorrhea criterion (Andersen et al., 2001; Mitchell, Cook-Myers, & Wonderlich, 2005) based on theories that menstrual dysfunction is secondary to the substantial weight loss already required for AN. 
Moderator analyses similarly revealed that individuals who met all criteria for AN except the weight cut-off did not differ significantly from full AN in terms of eating pathology. The finding that individuals who restrict their food intake at slightly higher weights exhibit psychopathology commensurate with their lower-weight counterparts may stem in part from recent population increases in overweight and obesity (Hedley et al., 2004), which render diagnostically low weights increasingly difficult to achieve. As important caveats, moderator analyses examining the utility of the weight criterion were based on a small number of studies (k = 2), and did not converge on a new weight cut-off that could be recommended for DSM-V. Although Andersen et al. (2001) required EDNOS participants to weigh <80% of pre-morbid weight, Watson and Andersen (2003) did not set a specific weight cut-off for EDNOS. Complicating matters further, a recent methodological review identified 10 distinct methods for calculating eating disorder patients’ expected body weights, highlighting discrepancies of up to 25 pounds in the weight at which DSM-IV AN would be diagnosed (Thomas, Roberto, & Brownell, in press).
The partial AN subgroup in which participants missed two diagnostic criteria for AN is of particular interest when considering possible DSM-V revisions because three (Bunnell, Cooper, Hertz, & Shenker, 1992; Ricca et al., 2001; Watson & Andersen, 2003) of the seven studies comprising this group examined individuals who failed to meet both the amenorrhea and weight criterion. The finding that partial AN did not differ significantly in eating pathology from full syndrome AN suggests that it may be possible to drop the amenorrhea criterion and increase the weight criterion simultaneously without jeopardizing the homogeneity of the AN category. A recent study of eating disorder patients found that revising both of these diagnostic criteria concurrently could go far to reduce the overcrowding of the EDNOS category by re-appropriating 15.5% of eating disorder patients from EDNOS to AN (Thaw, Williamson, & Martin, 2001).
Although the omnibus Q statistic for the diagnostic criteria moderator analysis was not statistically significant, we examined differential effects for EDNOS subgroups on an exploratory basis. Analyses revealed that individuals with BN scored significantly higher on measures of eating pathology than individuals with purging disorder, underscoring the importance of requiring that binge episodes be characterized by the consumption of objectively large quantities of food. In contrast, the twice weekly binge frequency criterion did not reliably distinguish between BN and EDNOS on measures of eating pathology. Although statistical power to detect small effects in this analysis was limited due to small sample size (k = 5), findings can confidently be interpreted as evidence that differences between individuals who meet all diagnostic criteria for BN versus individuals who meet all criteria except binge frequency are not large, thus highlighting the binge frequency criterion as a potential candidate for revision in DSM-V. Non-significant differences between full syndrome BN and low binge frequency BN are consistent with findings from a behavioral genetic study reporting that risk for binge eating in one co-twin did not decrease substantially when the other co-twin reported binge eating less than twice per week (Sullivan, Bulik, & Kendler, 1998). Unfortunately, studies included in the present meta-analysis did not converge on a clear threshold that could be recommended for use in DSM-V. Required binge frequencies for partial syndrome BN ranged from once per week (Le Grange et al., 2004; Le Grange et al., 2006) to once per month (Wildes, 2003) or less (Garfinkel et al., 1995; Noma et al., 2006). Available data suggest that relaxing the BN binge frequency criterion would have a relatively small impact on the proportion of individuals currently diagnosed with EDNOS. 
Two recent simulations found that relaxing this criterion to once per week (Fairburn et al., 2007) or omitting it entirely (Thaw et al., 2001) would re-appropriate only 4 to 5% of the total pool of eating disorder patients from EDNOS to BN.
BED findings echoed BN findings in questioning the retention of the twice weekly binge frequency criterion in DSM-V. Specifically, individuals with BED did not differ significantly on measures of eating pathology or general psychopathology from individuals who met all criteria for BED except binge frequency. Again, statistical power was limited by the small number of studies (k = 3) available for this analysis. Studies did not converge on an optimal binge frequency for use in DSM-V because frequencies for partial syndrome BED ranged from one day per week (Cachelin et al., 1999) to one day per month (Crow et al., 2002) or less (Fitzgibbon et al., 2003). In the absence of data, the potential impact of altering the binge frequency criterion on the prevalence of BED versus partial syndrome BED remains unknown.
There is clear evidence from our meta-analysis that specific diagnostic criteria for AN, BN, and BED could be revised without sacrificing the homogeneity of eating disorder diagnostic categories. Revising these criteria would likely re-allocate a substantial proportion of individuals from the heterogeneous EDNOS category to the more homogeneous AN, BN, and BED diagnoses, where they could be recruited into controlled treatment trials, classified as cases in epidemiological studies, and empirically studied in greater detail. However, there is also considerable evidence that EDNOS—even in its more heterogeneous forms—is of comparable severity to the officially recognized eating disorders. Therefore, we recommend a possible two-tiered approach to eating disorder classification in future versions of DSM. The first tier of eating disorders could include revised versions of AN (with a more lenient weight criterion and without amenorrhea), BN (with a more lenient binge frequency requirement), and BED (also with a more lenient binge frequency requirement). Given the likelihood that a substantial proportion of individuals with EDNOS would retain this residual diagnosis even after the revision of clinically irrelevant criteria (Fairburn et al., 2007; Thaw et al., 2001), a second tier of eating disorders could be defined. The second tier would be reserved for uniquely defined eating disorders that are of clinically significant concern but that (1) have received somewhat less research attention, and (2) clearly differ from first-tier eating disorders on important clinical validators such as level of eating pathology, treatment outcome, or longitudinal course. This approach would parallel, for example, the inclusion of dysthymic disorder alongside major depressive disorder (MDD) in the DSM-IV mood disorders category. 
Mood disorders provide a good analogy here because there is no assumption that the greater symptom acuity of MDD somehow renders it a “truer” or “more severe” mood disorder than dysthymia, which is defined in part by greater chronicity. In other words, significant differences across a matrix of clinical validators should assist us in demarcating the boundaries between diagnostic categories, but differences on a single validator should not be interpreted as reflecting greater overall severity in one disorder versus another (Wonderlich et al., 2007). To that end, inclusion in the second tier of eating disorder diagnoses should be predicated on meeting a minimum standard of empirical study and clinical impairment above and beyond mere inclusion in the DSM-IV list of example EDNOS presentations. Based on our findings, both purging disorder (Keel et al., 2005; Keel, Wolfe, Liddle, De Young, & Jimerson, 2007) and non-fat-phobic AN (Lee et al., 1993; Lee et al., 2001) would be prime candidates for inclusion in the second tier. In contrast, we did not identify any studies examining individuals who solely engage in chewing and spitting in the absence of more traditional compensatory behaviors; thus, this condition should probably not qualify for inclusion in the second tier as a stand-alone syndrome, but rather remain in the EDNOS category. The combined approach of relaxing current criteria and identifying unique disorders would reduce the prevalence of EDNOS relative to AN, BN, and BED, thus preserving the residual category for those atypical cases for which it was originally intended.
In addition to ascertaining the magnitude of differences between EDNOS and officially recognized eating disorders, a secondary purpose of the present study was to identify theoretical and methodological limitations of the EDNOS literature in order to inform the next generation of nosological research. The following six recommendations are offered in order to enhance the marginal utility of future contributions.
The most important methodological limitation of the 125 studies included in the meta-analysis was that nearly half did not provide any information on the eating disorder features of EDNOS participants. It was unclear from many articles whether individuals in the EDNOS group met criteria for established EDNOS subtypes or whether they exhibited new and unique variants of eating pathology. This lack of conceptual clarity renders findings of individual studies difficult to interpret because the specific clinical group(s) to which results might generalize remains unknown. Therefore, the existing knowledge base will benefit most from studies that unlock this “black box” by clearly delineating the eating disorder symptoms of EDNOS subgroups in order to (1) evaluate the clinical utility of specific diagnostic criteria for AN and BN (e.g., Le Grange et al., 2006), or (2) explore the distinctiveness of newly proposed eating disorders (e.g., Napolitano, Head, Babyak, & Blumenthal, 2001). Of note, many of the research reports included in the present meta-analysis emerged from well-characterized large-scale databases which could potentially provide a wealth of secondary data analyses along these lines.
The research reports included in the present meta-analysis referred to EDNOS by more than 30 different names. For example, BN-like eating disorders characterized by the presence of purging behaviors in the absence of objective binge episodes were variously called “compensatory eating disorder” (Tobin, Griffing, & Griffing, 1997), “subjective bulimia nervosa” (Keel et al., 2001), “purging disorder” (Keel et al., 2005; Keel et al., 2007), and “EDNOS-P” (Binford & Le Grange, 2005). In contrast, the term “subthreshold BED” (Fitzgibbon et al., 2003; Gladis et al., 1998b; Striegel-Moore et al., 2000) became a homonym for multiple permutations of meeting some but not all diagnostic criteria for BED. Although each of these labels is descriptive and represents positive efforts to delineate the clinical characteristics of EDNOS subtypes, nomenclatural inconsistencies obstruct comparisons across studies. In future studies, investigators should attempt to adhere to the EDNOS subtyping nomenclature established in prior literature, however provisional, because consistent labeling schemes will facilitate cross-study comparisons. An ideal naming scheme would feature labels that are as descriptive as possible without becoming cumbersomely long, and would include the names of key symptoms that are either highly characteristic of that subgroup, or, alternatively, symptoms that keep the group from meeting full criteria for an established disorder. Under these suggested guidelines, referents such as non-fat-phobic AN, non-amenorrheic AN, and high weight AN would be ideal for AN-like EDNOS variants; and purging disorder and night eating syndrome would be ideal for newly characterized disorders. Acronyms that are not readily interpretable (e.g., EDNOS-P, ANXW) should be avoided.
Fewer than half of the studies included in the present meta-analysis reported having used structured clinical interviews to establish eating disorder diagnoses. This is worrisome in light of research demonstrating that clinicians do not always adhere to DSM criteria when they confer diagnoses, resulting in poor to fair agreement between structured interview-based and clinician-based diagnoses (Miller, Dasher, Collins, Griffiths, & Brown, 2001; Shear et al., 2000). In the context of the present meta-analysis, moderator analyses revealed a trend for studies using structured interviews to report smaller differences in eating pathology between EDNOS and BED than studies using unstructured interviews or self-report measures. It is possible that this pattern of findings stems from the differential construction of the ambiguous boundary between EDNOS and nonpathological eating behavior in clinical versus non-clinical settings. Because the DSM-IV merely cites examples of clinical presentations that are eligible for the EDNOS label rather than providing explicit inclusion and exclusion criteria, conferring this diagnosis in clinical practice is a highly subjective process. Only a handful of studies included in the present meta-analysis described having applied exclusion criteria to the EDNOS group by requiring that patients meet specific operational definitions of EDNOS. For example, Crow and colleagues (2002) diagnosed patients with “partial AN” only if they either (1) met all criteria for AN except weighing less than 90% of expected body weight, or (2) met full criteria for AN in the past 12 months. Similarly, Binford and Le Grange (2005) diagnosed patients with “EDNOS-P” only if they had purged at least once per week in the past six months in the absence of objective binge episodes. 
In light of the trend we observed for studies using structured interviews to obtain smaller effects, it is interesting that the majority of studies that created and utilized operational definitions of EDNOS also cited the use of structured interviews to confer diagnoses (e.g., Binford & Le Grange, 2005; Crow et al., 2002; Fairburn et al., 2007; Turner & Bryant-Waugh, 2004). Given the high rates of body dissatisfaction and disordered eating behaviors observed in community samples (Ackard, Fulkerson, & Neumark-Sztainer, 2007; Sullivan et al., 1998), it is clearly not necessary to diagnose all individuals seeking treatment for eating-related difficulties with a clinically significant eating disorder. In the context of the present meta-analysis, a lack of exclusion criteria on the EDNOS diagnosis in studies relying on unstructured assessment protocols may have artificially inflated observed differences between EDNOS and BN if healthy controls were included in the EDNOS group. Alternatively, the misclassification of partial syndrome cases of AN, BN, or BED in the full syndrome group as a consequence of unreliable definitions of EDNOS could also have attenuated observed effects. Therefore, future efforts should be made to explicitly delineate inclusion criteria for EDNOS in both clinical and research settings. Valid and reliable diagnoses of DSM-IV EDNOS subtypes are best derived from eating disorder-specific assessments rather than general psychiatric interviews.
For example, the Eating Disorder Examination (Fairburn, Cooper, & O’Connor, 2008), the Interview for the Diagnosis of Eating Disorders (Kutlesic, Williamson, Gleaves, Barbin, & Murphy-Eberenz, 1998), and the Structured Interview for Anorexic and Bulimic Disorders (Fichter, Herpertz, Quadflieg, & Herpertz-Dahlmann, 1998) thoroughly assess a wide variety of eating disorder features, such as subjective binge episodes and chewing/spitting, which are characteristic of specific EDNOS variants but not explicitly diagnostic of AN, BN, or BED.
Moderator analyses demonstrated that sample characteristics other than diagnostic criteria also influenced the magnitude of observed effects. Patient samples yielded smaller discrepancies in eating pathology between BED and EDNOS than non-patient or mixed samples. This is consistent with well-replicated findings that treatment-seeking is associated with greater psychopathology (Keel et al., 2002), which may have obfuscated any differences between EDNOS and officially recognized disorders. It is noteworthy that more than two thirds of the studies included in the meta-analysis examined exclusively patient samples. Community samples, which feature wider variability between full and partial syndrome cases, might prove especially useful for future empirical evaluation of diagnostic thresholds, such as the optimal weight cut-off for AN and binge frequency requirement for BN.
Participant age represented another demographic characteristic that influenced the magnitude of observed differences between EDNOS and officially recognized disorders. Younger samples demonstrated greater discrepancies in eating pathology between BN and EDNOS than older samples. In contrast, younger samples demonstrated greater similarities in general psychopathology between AN and EDNOS. Differential age effects between AN and BN are to some extent consistent with epidemiological research indicating that BN may exhibit an older age of onset and a longer duration of illness (Hudson et al., 2007) than AN. It is possible that EDNOS may develop among young people as a milder variant of psychopathology for which early, less intensive interventions would prevent the subsequent onset of full-blown BN. Indeed, to the extent that DSM-IV defined eating disorders do not adequately capture the differential clinical presentation of younger samples (Nicholls et al., 2000), childhood eating disorders typically focus on restricting (e.g., food avoidance emotional disorder, selective eating disorder) rather than purging behaviors (Bryant-Waugh, 2000). Multivariate modeling techniques could be utilized to ascertain the validity of distinct syndromes within this demographic segment.
The finding that BN differed significantly from EDNOS with regard to eating pathology is inconsistent with recent comparisons finding no significant differences between the two groups (e.g., Binford & Le Grange, 2005; Fairburn et al., 2007), and could be interpreted as challenging burgeoning transdiagnostic theories of eating disorders. One likely explanation for the discrepancy between our meta-analytic findings and the results of the individual studies is differential statistical power. The median sample size for BN versus EDNOS comparisons was 84 total participants, which is far fewer than the 99 participants per group necessary for 80% power to detect an effect size of 0.40 at α = .05 in an independent samples t-test (Cohen, 1988). Thus, although many studies were individually underpowered to detect small effects, their combined findings revealed significant meta-analytic differences. Now that this meta-analysis has determined that differences between EDNOS and officially recognized eating disorders are relatively small, future studies could demonstrate greater statistical conclusion validity by recruiting a sufficient number of participants to detect effects of this modest magnitude. Adequate sample size is especially important for nosological research because heterogeneously defined EDNOS groups are likely to feature large intra-group variances, which could further thwart efforts to detect significant findings.
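The per-group sample size cited above can be approximated with a short calculation. The following is a minimal sketch, assuming a two-tailed test and the standard normal approximation to the independent samples t-test; the function name `n_per_group` is illustrative and does not come from the original analysis. Exact calculations based on the noncentral t distribution can yield a marginally larger figure.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group for a two-tailed,
    two-group comparison, using the normal approximation:
    n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = standard_normal.inv_cdf(power)          # quantile for desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


required = n_per_group(0.40)
print(required)  # 99
```

By comparison, a median of 84 total participants corresponds to roughly 42 per group if the groups are evenly split, which is well under half of the approximate requirement.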
To date, the EDNOS literature has relied heavily on univariate group comparisons between partial and full syndrome cases. These investigations have been fruitful in challenging the validity of DSM-IV thresholds (i.e., binge frequency for BN, 85% expected body weight for AN), but are limited by their inability to nominate more appropriate thresholds, to reveal whether significant effects reflect qualitative differences in kind versus quantitative differences in degree, or to identify the ideal boundary between cases and non-cases. Thus, as we move towards DSM-V, we encourage investigators to consider the application of alternative statistical methods which may be better suited toward the creation of a new system versus the continued critique of the old one. Multivariate techniques such as latent class analysis (Bulik et al., 2000; Striegel-Moore et al., 2005) and taxometrics (Gleaves et al., 2000; Williamson et al., 2002) have already been used to good effect along these lines, and could potentially prove even more useful if clinically relevant variables that do not overlap with extant DSM-IV diagnostic criteria (such as egosyntonicity of symptoms and treatment response) were incorporated to further differentiate among groups. Non-linear regression and receiver operating characteristic (ROC) curve analyses might also be used to generate proposals for new diagnostic thresholds. New measures designed to assess the functional impairment associated with eating disorder features (Bohn et al., 2008; Engel et al., 2006) could provide a wealth of promising assessments through which newly proposed weight and binge frequency thresholds could subsequently be validated.
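As one illustration of how a ROC-style analysis might nominate a candidate threshold, the sketch below selects the cut-off that maximizes Youden's J (sensitivity + specificity - 1). This is an assumed approach for illustration only, not a method used in the present meta-analysis, and the binge frequency and impairment values are invented toy data rather than results from any study.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing sensitivity + specificity - 1,
    where a score >= threshold is classified as a case.
    labels: 1 = clinically impaired, 0 = not impaired."""
    positives = sum(labels)
    negatives = len(labels) - positives
    best = None
    for threshold in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        true_neg = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
        j = true_pos / positives + true_neg / negatives - 1
        if best is None or j > best[1]:
            best = (threshold, j)
    return best


# Hypothetical weekly binge episodes and impairment status (toy values only).
weekly_binges = [0, 0, 1, 1, 1, 2, 2, 2, 3, 4, 5, 6]
impaired = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1]

threshold, j_statistic = youden_threshold(weekly_binges, impaired)
print(threshold)  # 2
```

In practice, the external criterion would be a validated impairment measure, and any proposed threshold would need to be cross-validated in an independent sample.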
The findings of this meta-analysis should be interpreted with the following caveats in mind. The first limitation is the modest statistical power to detect small effects in analyses that relied on small sample sizes. Because BED represents a relatively recent addition to the nosological scheme (APA, 1994), only 30 studies could be identified that examined its relationship to other types of EDNOS. Similarly, all three study sets contained a limited number of comparisons for certain EDNOS subtypes such as high weight AN (k = 2), AN without amenorrhea (k = 4), non-fat-phobic AN (k = 5), purging disorder (k = 5), and BN and BED with low binge frequency (k = 5 and k = 3, respectively). At most, the results of small-sample moderator analyses examining specific diagnostic criteria could be interpreted as evidence that effect sizes are not large, and the generalizability of our results may be limited. One positive feature of our analytic plan, however, is that utilizing a random (as opposed to fixed) effects approach enhances the potential generalizability of the findings (Field, 2003; Lipsey & Wilson, 2001). In addition, although the small sample size of specific moderator analyses may have limited the statistical power of the present meta-analysis, we hope that our quantitative review draws the field’s attention to under-researched corners of the literature that are in need of additional empirical attention.
Second, inter-rater reliability was lower for the classification of EDNOS subgroups according to BN diagnostic criteria (κ = .71) than for the other moderator variables. Although a κ of .71 is considered substantial (Landis & Koch, 1977) and coder discrepancies were each ultimately resolved through consensus, some disagreements stemmed from the limited availability of information in original research reports, which may have introduced error into the classification of EDNOS subtypes. A final limitation of the present study is that current measures of eating disorder psychopathology are based on contemporary conceptualizations of symptoms which stem in part from extensive clinical and research experience with AN and BN. Thus, it is possible that effect sizes from individual studies, and, therefore, our overall meta-analysis, would be attenuated if assessments were more attuned to assessing the specific psychopathology of EDNOS cases.
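For readers less familiar with the κ statistic reported above, the following minimal sketch shows how Cohen's kappa corrects raw agreement between two coders for agreement expected by chance. The coder labels are invented for illustration and are not the study's coding data; the function name `cohens_kappa` is likewise hypothetical.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders rating the same items:
    kappa = (p_observed - p_expected) / (1 - p_expected).
    Undefined (division by zero) when p_expected equals 1."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must rate the same non-empty set of items.")
    n = len(coder_a)
    p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    counts_a = Counter(coder_a)
    counts_b = Counter(coder_b)
    # Chance agreement: product of each coder's marginal proportions, summed over categories.
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Toy example: two coders classify 10 hypothetical EDNOS groups.
coder_a = ["low_freq", "low_freq", "other", "other", "low_freq",
           "other", "low_freq", "other", "low_freq", "other"]
coder_b = ["low_freq", "other", "other", "other", "low_freq",
           "other", "low_freq", "other", "low_freq", "other"]

kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 2))  # 0.8
```

Here the coders agree on 9 of 10 items (raw agreement of .90), but because half of that agreement would be expected by chance, κ is .80.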
A meta-analysis of 125 studies highlighted the clinical severity of EDNOS by demonstrating that individuals who receive this residual eating disorder diagnosis exhibit small to no significant differences in eating pathology, general psychopathology, and physical health compared to individuals diagnosed with officially recognized DSM-IV eating disorders.
Moderator analyses suggest that a combination of relaxing current criteria and carving out homogeneous NOS subtypes would be superior to a transdiagnostic solution when considering DSM-V revisions. The marginal utility of future contributions to the EDNOS literature would be greatly enhanced through clearer identification of diagnostic features, adherence to a provisional nomenclature, rigorous diagnostic assessment, attention to sample demographics, enhanced statistical power, and the adoption of innovative statistical techniques. Because the high prevalence of EDNOS represents a special case of overflowing atypical categories across a wide range of psychiatric disorders, the design and implications of the current study could be adapted for evaluating the utility of other nosological categories. Diagnostic boundaries between officially recognized and “not otherwise specified” psychiatric disorders should ideally reflect empirical evaluation of differential symptom severity, functional impairment, and treatment response.
We would like to thank the National Institute of Mental Health for its financial support of this research project through the auspices of Ruth L. Kirschstein National Research Service Award F31MH078394. We also acknowledge the meticulous work of our study coders, Andres De Los Reyes, Kate A. McLaughlin, and Jessica M. Cronce. Lastly, we thank the investigators who contributed their in-press and unpublished studies for inclusion in our meta-analysis.
1. We will use the term “partial syndrome” throughout this manuscript to describe individuals who meet some but not all of the diagnostic criteria for one of the officially recognized eating disorders. Our use of the term “partial syndrome” is not meant to imply that individuals meeting some but not all diagnostic criteria for AN, BN, or BED exhibit eating pathology that is only “partially” as severe as individuals meeting full criteria. Indeed, the relative severity of EDNOS versus officially recognized eating disorders is exactly what our meta-analysis was designed to evaluate empirically.
2. The “$” following the operand enables the search engine to identify terms that begin with the operand but feature multiple endings (e.g., noun, adjective, or plural forms).
3. While piloting the coding manual, coders attempted to evaluate individual study quality. Specifically, they attempted to record (a) whether the EDNOS and full syndrome groups were matched on demographic characteristics such as age and sex, (b) whether authors had 80% power to detect a medium effect size, and (c) the reliability of diagnostic and dependent measure(s). Because the majority of studies did not report this information and it was thus not possible to include quality as a weighting variable, we attempted to investigate the effects of a key measure of study quality integral to the diagnostic process (the use of a structured interview to assign participants to diagnostic categories) in our moderator analyses.
4. Studies utilizing patient samples were not required to explicitly state that they had diagnosed patients using a clinical interview, because treatment status was assumed to imply that some form of clinical assessment had taken place.
5. Re-running our meta-analysis while excluding the eight studies that featured missing data did not substantively alter either the overall findings or the findings of individual moderator analyses.
6. In order to evaluate whether potential overlap between the independent (fat phobia) and dependent (eating pathology) variables accounted for the large differences we observed between full syndrome AN and non-fat-phobic AN, we conducted a second trial of the AN diagnostic criteria moderator analysis in which we applied a more conservative definition of eating pathology among studies in the non-fat-phobic AN subgroup. Under this more conservative definition, we excluded dependent variables that had potential for overlap with fat phobia (such as the Drive For Thinness subscale of the Eating Disorder Inventory), and we obtained similar results.
Publisher's Disclaimer: The following manuscript is the final accepted manuscript. It has not been subjected to the final copyediting, fact-checking, and proofreading required for formal publication. It is not the definitive, publisher-authenticated version. The American Psychological Association and its Council of Editors disclaim any responsibility or liabilities for errors or omissions of this manuscript version, any version derived from this manuscript by NIH, or other third parties. The published version is available at www.apa.org/pubs/journals/bul
Studies preceded by an asterisk (*) were included in the meta-analysis