J Clin Psychiatry. Author manuscript; available in PMC 2012 November 8.
Published in final edited form as:
PMCID: PMC3493484

Can Personality Disorder Experts Recognize DSM-IV Personality Disorders from Five-Factor Model Descriptions of Patient Cases?



Dimensional models of personality are under consideration for integration into the next Diagnostic and Statistical Manual of Mental Disorders (DSM-V), but the clinical utility of such models is unclear.


To test the ability of clinical researchers who specialize in personality disorders to diagnose personality disorders using dimensional assessments, and to compare these researchers’ ratings of clinical utility for a dimensional system versus for the DSM-IV.


A sample of 73 researchers who had each published at least three (Median=15) articles on personality disorders participated between December 2008 and January 2009. The Five-Factor Model (FFM), one of the most-studied dimensional models to date, was compared to the DSM-IV. Participants provided diagnoses for case profiles in DSM-IV and FFM formats, and then rated the DSM-IV and FFM on six aspects of clinical utility.


Overall, participants had difficulty identifying correct diagnoses from FFM profiles, and the same held true for a subset reporting equal familiarity with the DSM-IV and FFM. Participants rated the FFM as less clinically useful than the DSM for making prognoses, devising treatment plans, and communicating with professionals, but more useful for communicating with patients.


The results suggest that personality disorder expertise and familiarity with the FFM are insufficient to correctly diagnose personality disorders using FFM profiles. Because of ambiguity inherent in FFM profile descriptors, it may be that this insufficiency is unlikely to be attenuated with increased clinical familiarity with the FFM.

Keywords: FFM, DSM-IV, DSM-V, Clinical Utility, Expertise, Concepts, Categorization

The Diagnostic and Statistical Manual of Mental Disorders (4th ed.; DSM-IV-TR)1 is currently under revision, and among the proposals under discussion for the pending DSM-V is the possibility of dimensionalizing mental disorders,2 particularly Axis II personality disorders. Before adopting any proposal, however, it is important to consider whether the proposed assessment system would be useful to clinicians with respect to making treatment plans and prognoses, communicating with patients or other clinicians, and describing a patient’s global personality or important personality problems.3,4 The current study examines the clinical utility of dimensional systems.

Two general approaches to dimensionalizing Axis II disorders have been proposed. One is to preserve personality disorder types (e.g., borderline personality disorder), and assess how close a person is to a given type (type-based dimensional system, henceforth).5 The other approach departs further from the DSM-IV by profiling a person along underlying traits, such as introversion (trait-based dimensional system, henceforth).6–9 The current study examines the clinical utility of a trait-based dimensional system.

Five-Factor Model: A Trait-based Dimensional System

The trait-based dimensional proposal for personality disorders that has received the most attention is the Five Factor Model (FFM, henceforth) of personality (e.g., Costa & McCrae’s Revised NEO Personality Inventory;6 see Clark10 for a recent review), and it is therefore the example we chose to examine in the current research. Note, however, that our broader intent in comparing the DSM-IV and the FFM is to provide new information about the clinical utility of trait-based dimensional models in general, as will be discussed later in the paper.

As diagnostic/assessment systems for personality disorders, the current DSM-IV and FFM each have distinct benefits and disadvantages. The DSM-IV classifies maladaptive personality with 10 personality disorders, each defined by unique criteria. For example, to be diagnosed with antisocial personality disorder, one must have, pervasively and across contexts, at least 3 of the 7 symptoms shown in Figure 1a. This approach has an important advantage in terms of cognitive processing; using discrete categories is cognitively efficient. Instead of describing or remembering all the features and characteristics of each person, one can simply describe or remember a person as having antisocial personality disorder.

Figure 1
DSM-IV and FFM Descriptions of a Prototypic Case of Antisocial Personality Disorder.

However, compared to the FFM and other dimensional systems, there are certain disadvantages to the DSM’s categorical assessment, and there exist many useful reviews on this topic.11,12 These include diagnostic comorbidities that may be due to criterion overlap, arbitrary diagnostic thresholds of the number of criteria necessary to count as having a disorder, and clinical heterogeneity among people with the same diagnosis. These problems have led some to argue that DSM-IV disorder categories are neither discrete nor well defined. Some critics have argued that the DSM-IV personality disorders do not cover all important personality problems,13,14 yet adding additional personality disorders could exacerbate the comorbidity problem. In sum, dissatisfaction with the current diagnostic system has generally been on the rise.15

In contrast, the FFM does not presuppose any personality disorder categories and instead describes personality in a continuous manner along a set of 30 traits (or “facets”) grouped into five overarching factors. Figure 1b shows the FFM profile of a prototypic patient with antisocial personality disorder. If a person has a high score on a given facet (e.g., anxiousness), he/she is better described by the high adjectives (e.g., fearful) than the low adjectives (e.g., relaxed).

The FFM has a number of advantages regarding construct validity: it has been shown to be biologically based, universal, temporally stable, and related to life outcomes.12 Furthermore, because the FFM describes people continuously along 30 facets rather than with discrete disorders, it avoids many of the aforementioned disadvantages of the DSM-IV. For example, the issue of high comorbidity is irrelevant with the FFM because no categorical diagnoses are given; similarly, the problem of arbitrary diagnostic thresholds of personality disorders is also moot because the FFM does not implement cutoffs specifying the presence versus absence of a disorder. Other trait-based dimensional systems,7–9 although differing in the choice of specific traits, share these strengths with the FFM.

However, there are considerable cognitive-processing challenges that may be inherent to any trait-based dimensional system. Specifically, the facets or traits may be fundamentally ambiguous. Previous research in cognitive science suggests that descriptors are relative to the categories they describe (e.g., large molecule versus large mountain; open hand versus open bottle; strong woman versus strong man).16–20 As a result, descriptors are inherently ambiguous without the context of an accompanying category. Translated to the domain of personality pathology, when an FFM facet is used without the context of a diagnostic category, it can be ambiguous in a similar way. For instance, a low score on the “gregariousness” FFM facet could correspond to either paranoid fears (as in paranoid personality disorder), fear of not being liked by others (avoidant personality disorder), or indifference to others (schizoid personality disorder). A high score on the “anger” facet could correspond to either temper tantrums (histrionic personality disorder) or lack of control over anger (borderline personality disorder).21 While the features used in the DSM-IV diagnostic criteria are less likely to be ambiguous because the descriptors tend to refer to observable behaviors (e.g., “perceives attacks on his or her character or reputation that are not apparent to others and is quick to react angrily or to counterattack”) and are framed in the context of a diagnosis, FFM traits are unobservable (e.g., in Figure 1b, “angry” or “bitter”) and can be ambiguous if presented without any diagnostic context. This ambiguity in FFM patient descriptions could pose problems for clinical functions such as determining prognoses or developing treatment plans.

Previous studies22–24 comparing the clinical utility of the FFM with that of the DSM-IV have not examined this issue of ambiguity inherent to the FFM, with the exception of a recent study by Rottman et al.25 Because the current study utilizes the task used in that work, we describe the task in detail here.

Back-translation Task used in Rottman et al.25

Both Rottman et al.’s25 study and the current study used a back-translation task to examine whether trait-based descriptions of patients may be clinically ambiguous. In the back-translation task, participants are presented with patient descriptions in the FFM format (Figure 1b), which were taken from previous studies in which experienced clinicians thought about a prototypic case of each of the 10 DSM-IV personality disorders26 or comorbid cases22 and rated each case on the 30 FFM facets. Then, participants are asked to “back-translate” these FFM descriptions by identifying any known DSM-IV disorders found in the descriptions.

The logic behind this back-translation task is the following. We begin with the assumption that practicing clinicians are familiar with the DSM-IV personality disorders. Presenting clinicians with the DSM-IV diagnostic criteria and having them make DSM-IV diagnoses tests the validity of this assumption. Having demonstrated that the DSM-IV personality disorders are familiar concepts to the clinicians, the next step is to determine whether they can recognize these known concepts from the FFM trait descriptions. If traits are indeed ambiguous, such that a score on one facet (e.g., a low score on gregariousness) could correspond to multiple DSM-IV diagnoses, then clinicians should have difficulty identifying correct DSM-IV diagnoses from FFM descriptions alone (e.g., Figure 1b). That is, identifying DSM-IV diagnoses from FFM profiles would be a one-to-many mapping. If traits or a set of traits taken as a whole are not ambiguous, clinicians should be able to readily recognize the DSM-IV personality disorders from FFM descriptions alone.

The outcome of this task is not obvious and needs to be tested empirically for the following reasons. Previous studies have demonstrated that the FFM is comprehensive enough to reliably describe the DSM-IV personality disorders. For example, in Lynam and Widiger’s study,27 experts in personality disorders were asked to consider prototypic cases of each of the DSM-IV personality disorders and to rate them on the 30 facets comprising the FFM. Average inter-rater reliability was good, ranging from .48 to .66. Samuel and Widiger26 (see also Sprock28) also demonstrated that practicing clinicians could also describe the personality disorders in terms of the FFM with fairly high inter-rater agreement, ranging from .64 to .78. Samuel and Widiger26 also found extremely high agreement between the prototypes derived from practicing clinicians and those from experts in personality disorders27. These studies suggest that clinicians can reliably translate existing concepts of personality disorders into FFM ratings. Based on these results, Samuel and Widiger stated that “the DSM-IV PDs can be understood from the dimensional perspective of the FFM.”26 In a review article, Clark10 also stated that the DSM-IV personality disorders “can be characterized with the FFM conceptually… and empirically.” If the traits can capture the DSM-IV personality disorders in a reasonably unambiguous manner, clinicians should at least be able to recognize prototypical DSM-IV diagnoses from the trait-based descriptions alone. The current study examines whether this is indeed the case.

One might criticize the back-translation task for relying on the DSM-IV diagnoses, the very concepts that the proponents of the trait-based dimensional systems propose eliminating due to the problems discussed earlier. This, however, is irrelevant to the aims of the current study; even if the DSM-V does not use the same diagnostic categories as DSM-IV (or even if it eliminates diagnoses entirely), the back-translation task should nonetheless effectively assess whether there are ambiguities in traits. The back-translation task merely uses the categories already known to the clinicians as an established baseline, and is agnostic as to the validity of these categories (see also the General Discussion section for a more detailed discussion of this issue).

It may also be argued that there are ways to disambiguate or contextualize the traits with supplementary information, such as identifying dysfunctional behaviors associated with extreme trait scores29 (see the General Discussion section for a more detailed consideration of this possibility). Yet, the first order of business before endeavoring to implement such steps is to empirically examine whether or not traits are indeed ambiguous.

Rottman et al.25 presented the back-translation task to practicing clinical psychologists, psychiatrists, and clinical social workers. They found that, on average, clinicians identified correct diagnoses for only 47% of prototypic cases and only 21% of comorbid cases when the cases were described by the FFM traits alone. This finding cannot be attributed to a lack of knowledge of the DSM-IV, because the same clinicians had relatively little difficulty identifying correct diagnoses presented in the DSM-IV format; on average, the clinicians identified correct diagnoses for 82% of prototypic cases and 60% of comorbid cases when they were written in the DSM-IV format. In other words, clinicians had difficulty disambiguating the meaning of FFM patient descriptions even for well-known, prototypic DSM-IV disorders. The clinicians also rated the FFM as less clinically useful than the DSM. In sum, it appears that the FFM requires supplementary contextual information for clinicians to effectively disambiguate the meanings of the FFM’s facets for any given patient.

Experts in Personality Disorders

Rottman et al.’s25 focus on practicing clinicians demonstrated some of the cognitive difficulties that would be faced by mental health professionals using the FFM to make personality disorder diagnoses. However, two important issues were not fully addressed in this previous work.

First, many practicing clinicians, such as those tested by Rottman et al.,25 likely specialize in disorders other than personality disorders (e.g., Axis I disorders), and as such may not have been able to use the FFM to its full potential in that study, which focused solely on personality disorders. In contrast, research in cognitive science30,31 would suggest that clinical-research personality disorder experts, who have specialized in building knowledge and theories about the causal workings of personality disorders relevant to FFM facets, may be better able to identify important correlations between scores on FFM facets for personality disorders. Furthermore, identifying important correlations between FFM facets could help personality disorder specialists integrate the information across the 30 facets and form a more coherent concept of a patient, benefiting diagnosis and other clinical functions. For instance, although a low score on the ‘Gregariousness’ facet may be ambiguous on its own, a combination of low ‘Gregariousness’ and low ‘Trust’ scores may indicate that a patient has paranoid personality disorder, whereas a combination of low ‘Gregariousness’ and high ‘Self-Consciousness’ scores may indicate that a patient has avoidant personality disorder. A similar finding has been demonstrated in chess experts, who are able to quickly perceive combinations of chess pieces and positions as meaningful “chunks” bound by relations such as attack and defense.32 In sum, having specialized knowledge in personality disorders may help reduce the effects of ambiguity in the FFM, in which case personality disorder experts should be able to overcome these challenges of working with the FFM.

If this is true, then it is conceivable that the problems with ambiguity documented by Rottman et al.25 are not necessarily inherent to the FFM, but rather could be attributed to the background of the clinicians in that previous study, a limitation that might readily be overcome with specialized training. To test this possibility, the current study recruited personality disorder researchers, a population highly likely to have maximal knowledge about personality disorders.

The second critical issue addressed in the current study is that Rottman et al.’s25 clinicians self-reported being relatively unfamiliar with the FFM, and also being considerably less familiar with the FFM than with the DSM-IV. It remains possible that these clinicians had a harder time working with the FFM simply because the system was new to them. If so, it may be that any potential cognitive difficulties with an FFM-based assessment system would be attenuated once the system becomes more familiar. The personality disorder researchers tested in the current study, in contrast, should be familiar with both the FFM and the DSM-IV. We were also able to identify a subset of researchers reporting equal familiarity with the FFM and DSM-IV.

If expert knowledge contributes to perceived clinical utility of the FFM, then the current study provides a more comprehensive test of the utility of the FFM compared to Rottman et al.’s25 study of practicing clinicians. However, if a group of personality disorder researchers, including those with notable FFM expertise, have difficulty disambiguating FFM descriptions, this would suggest that the cognitive difficulties previously attributed to the FFM in Rottman et al.’s25 study are not likely to be overcome with experience or increased knowledge of personality disorders or through more extensive exposure to and experience with the FFM.



In line with previous research,27 we identified people with specialized knowledge of personality pathology by conducting a search in the PsycINFO database for authors who had published at least three papers with the keyword “Personality Disorder” in peer-reviewed journals, and who had published at least one article from January 2006 through mid-November 2008 (the time at which our search was conducted). We then excluded those for whom we could not find contact information and those who were highly likely to already be familiar with Rottman et al.’s25 study. Recruitment emails were sent to the remaining 476 researchers in December 2008. At the beginning of the study, we requested that participants verify that they consider personality disorders to be among their primary research interests and that they have been conducting research on personality disorders for at least four years. This allowed us to exclude those who collaborated on personality disorder papers only because of expertise in other fields (e.g., statisticians). Seventy-three participants completed the experiment. The experiment took 29 minutes on average, and participants received either a $60 gift certificate to an online retailer or a $60 check.

Materials and Design

Twelve different cases were described in both the FFM and DSM styles depicted in Figure 1. Ten described prototypic patients, each having only one of the 10 DSM-IV personality disorders. The remaining two were comorbid cases with two personality disorders each; these were included because comorbid cases have been argued to be more representative of real-world patients.33

The FFM facet scores were taken from previous studies in which practicing clinicians thought about prototypic personality disorder cases26 and about comorbid case vignettes22 and rated each on the 30 FFM facets. The FFM-style descriptions presented to participants contained both the average rating for each facet obtained from these studies, and a plot of the facet scores, anchored by high (e.g., “fearful, apprehensive” for anxiousness) and low (e.g., “relaxed, unconcerned, cool” for anxiousness) adjectives (the same descriptions used by Rottman et al.;25 e.g., Figure 1b). For the DSM-style descriptions, each prototypic case comprised all the DSM-IV-TR diagnostic criteria for that personality disorder (e.g., Figure 1a). The comorbid DSM-style descriptions were taken from a pretest by Rottman et al.,25 in which clinicians identified all the DSM-IV-TR personality disorder symptoms they found to be present in the comorbid vignettes.22

The 12 cases were divided into two groups, each containing five prototypic cases and one comorbid case. For diversity, each group included at least one disorder from each of the three clusters of personality disorders in the DSM-IV, and the diagnoses of the comorbid case did not match the diagnoses of any of the prototypic cases in the group. To the extent possible, we also matched the two groups of prototypic cases for difficulty of diagnosis, as previously determined.25

Each participant saw one group of six cases presented in the FFM style and the other group in the DSM style. Thus, descriptive style (DSM vs. FFM) was a within-subject variable. The pairing of cases with descriptive style, presentation order of the two groups, and order of the styles were counterbalanced across participants. The order of the six cases within each group was randomized.
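The assignment scheme described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' procedure: the case labels, the `build_session` helper, and the rotation of participants through four cells are all assumptions. It also assumes that, once the group-to-style pairing and the first style are fixed, the presentation order of the two groups follows automatically.

```python
import itertools
import random

# Placeholder case labels; the actual assignment of disorders to groups is not listed here.
GROUPS = {
    "A": [f"A_prototypic_{i}" for i in range(1, 6)] + ["A_comorbid"],
    "B": [f"B_prototypic_{i}" for i in range(1, 6)] + ["B_comorbid"],
}

# Four cells: which case group is shown in FFM style x which style is shown first.
CELLS = list(itertools.product(["A", "B"], ["FFM", "DSM"]))


def build_session(participant_index, rng):
    """Return [(style, ordered_cases), (style, ordered_cases)] for one participant."""
    ffm_group, first_style = CELLS[participant_index % len(CELLS)]
    dsm_group = "B" if ffm_group == "A" else "A"
    blocks = {"FFM": list(GROUPS[ffm_group]), "DSM": list(GROUPS[dsm_group])}
    for cases in blocks.values():
        rng.shuffle(cases)  # case order within each group is randomized
    second_style = "DSM" if first_style == "FFM" else "FFM"
    return [(first_style, blocks[first_style]), (second_style, blocks[second_style])]


if __name__ == "__main__":
    rng = random.Random(2008)  # fixed seed only so the sketch is repeatable
    for p in range(4):
        for style, cases in build_session(p, rng):
            print(p, style, cases)
```

Because every participant sees both styles but never the same case twice, descriptive style stays a within-subject variable while case content is balanced across styles.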


The study was performed online using Qualtrics software. Participants were told that they would be presented with descriptions of adult patients, and were asked to imagine that these patients were referred to them along with a patient description from a previous consultation. Participants were told that the patients “do not have schizophrenia or any other psychotic disorder, and their symptoms do not occur due to the direct effect of any general medical condition.” This instruction was included to prevent participants from avoiding giving personality disorder diagnoses for reasons not of experimental interest (e.g., in the DSM-IV, a schizoid personality disorder diagnosis is not allowed if it occurs exclusively during the course of schizophrenia). Finally, participants were instructed not to consult the DSM-IV or other references during the experiment.

Next, participants were presented with the first group of six cases in either the DSM or FFM style. After each individual case, participants were asked to “provide any DSM-IV diagnoses you believe this patient to have.” Participants also rated their confidence in each diagnosis on a seven-point scale (where 1 = “not confident at all,” 4 = “somewhat confident,” and 7 = “very confident”).

After the first group of cases was presented, participants rated the utility of the descriptive system that they just saw by answering the following six questions on a five-point scale (1 = “not at all,” 2 = “slightly,” 3 = “moderately,” 4 = “very,” 5 = “extremely”):25

  1. “How informative is this description in making a prognosis for this person?”
  2. “How informative is this description in devising treatment plans for this person?”
  3. “How useful do you feel the system used to describe this person would be for communicating information about this individual with other mental health professionals?”
  4. “How useful do you feel the system used to describe this person would be for communicating information about the individual to him or herself?”
  5. “How useful is the system used to describe this person for comprehensively describing all the important personality problems this individual has?”
  6. “How useful was the system used to describe this person for describing the individual’s global personality?”

Participants then performed the same series of tasks for the second group of cases. Finally, participants provided demographic information and rated their own familiarity with the DSM-IV and FFM systems, respectively, on a seven-point scale (where 1 = “not at all familiar,” 4 = “moderately familiar,” 7 = “extremely familiar”). Participants gave informed consent and this study was approved by the Yale Institutional Review Board.



Seventy-three researchers (51 Ph.D.’s, 16 M.D.’s, 2 M.D./Ph.D.’s, 3 M.A.’s, and 1 M.S.W.) participated.i On average, they received their highest degree in 1994, about 14 years before this study was conducted. Participants had published a median of 15 papers on personality disorders (Mean=24, Range=[3,160]) and had been conducting research on personality disorders for an average of 15 (SD=8) years. Overall, participants reported being more familiar with the DSM than the FFM, t(72)=7.70, p<.01. However, the current participants were more familiar with the DSM (M=6.40, SD=.95) and, more importantly, with the FFM (M=4.97, SD=1.66) than were the clinicians in Rottman et al.’s study25 (M=5.68, SD=1.26, for the DSM; M=2.17, SD=1.65, for the FFM, t(174.26)=4.89, p<.01,ii for the DSM; t(252)=12.24, p<.01, for the FFM). Furthermore, a 2 (DSM vs. FFM) × 2 (clinicians vs. researchers) ANOVA revealed a significant interaction, F(1, 252)=71.91, p<.01; although Rottman et al.’s25 clinicians were much more familiar with the DSM-IV than the FFM, this difference was markedly smaller for the researchers in the current study.iii Whereas Rottman et al.’s25 clinicians rated themselves as significantly below the midpoint of “moderately familiar” with the FFM, t(181)=14.94, p<.01, the researchers in the current study rated themselves significantly above the midpoint t(72)=5.10, p<.01.
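As a rough plausibility check, the familiarity comparisons can be approximated from the reported means and standard deviations alone. The sketch below rests on two assumptions that are not stated in this section: that the clinician sample size was 182 (inferred from the reported t(181)) and that the fractional degrees of freedom reflect an unequal-variance (Welch) test. Because the inputs are rounded summary values, the results should land near, but not exactly on, the reported statistics.

```python
import math


def one_sample_t(mean, sd, n, mu0):
    """One-sample t statistic for a mean against a fixed value mu0."""
    return (mean - mu0) / (sd / math.sqrt(n))


def welch_t(m1, s1, n1, m2, s2, n2):
    """Unequal-variance t statistic and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


N_RESEARCHERS = 73
N_CLINICIANS = 182  # assumption: inferred from t(181), not stated in the text
MIDPOINT = 4        # "moderately familiar" on the seven-point scale

# FFM familiarity against the scale midpoint, for each sample.
t_researchers = one_sample_t(4.97, 1.66, N_RESEARCHERS, MIDPOINT)
t_clinicians = one_sample_t(2.17, 1.65, N_CLINICIANS, MIDPOINT)

# DSM familiarity, researchers versus clinicians.
t_dsm, df_dsm = welch_t(6.40, 0.95, N_RESEARCHERS, 5.68, 1.26, N_CLINICIANS)

if __name__ == "__main__":
    print(round(t_researchers, 2), round(t_clinicians, 2))
    print(round(t_dsm, 2), round(df_dsm, 2))
```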

For each of the analyses below, we will also refer to a subgroup of participants who rated themselves as equally familiar with the FFM and DSM (M=6.42, SD=.93, for both systems), again significantly above the midpoint of “moderately familiar,” t(23)=12.74, p<.01. This subgroup consists of 24 researchers: 18 Ph.D.’s, 2 M.D.’s, 2 M.D./Ph.D.’s, and 2 M.A.’s. On average, they received their highest degree in 1991, 17 years before this study was conducted, and had published a median of 20 papers on personality disorders.

Diagnostic Accuracy

The prototypic cases were analyzed by averaging across the five prototypic cases seen by each individual in each system. Thus, a score of 1 means that a participant gave correct diagnoses for all five cases and a score of 0 means that the participant gave no correct diagnoses for any case. Participants almost always gave the correct diagnosis in the DSM condition (M=.99, SD=.06) and were much more accurate in the DSM condition than in the FFM condition (M=.62, SD=.25), t(72)=12.36, p<.01 (Figure 2). See also Table 1 for the results broken down by personality disorder. This pattern of means held true across all ten disorders.

Figure 2
Correct Diagnoses (95% Confidence Intervals)
Table 1
Mean Number of Correct and Incorrect Diagnoses per Case by Personality Disorder.

The comorbid cases were analyzed by examining the proportion of correct diagnoses within each condition (i.e., the FFM or the DSM-style comorbid case). Since there are two correct diagnoses for a given comorbid case, a score of 1 means that a participant correctly identified both diagnoses, a score of 0.5 means that a participant identified one of the two correct diagnoses, and a score of 0 means that a participant identified neither of the correct diagnoses. For comorbid cases, participants were again more likely to give the correct diagnoses in the DSM (M=.77, SD=.26) than in the FFM condition (M=.48, SD=.33), Z=5.03, N=73, p<.01 (Figure 2).iv
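The two accuracy scores can be written out as a short Python sketch. The function names and the example diagnoses are hypothetical, and the sketch assumes that a prototypic case counts as correct whenever the correct diagnosis appears among the diagnoses a participant listed, with any additional diagnoses tallied separately as incorrect.

```python
def prototypic_accuracy(responses, answer_key):
    """Proportion of prototypic cases whose correct diagnosis was among those given.

    responses:  {case_id: list of diagnoses the participant gave}
    answer_key: {case_id: the single correct diagnosis}
    """
    hits = [answer_key[case] in given for case, given in responses.items()]
    return sum(hits) / len(hits)


def comorbid_accuracy(given, correct_pair):
    """Share of the two correct diagnoses identified: 0, 0.5, or 1."""
    return len(set(given) & set(correct_pair)) / len(correct_pair)


# Hypothetical responses for one participant in one condition.
answer_key = {
    "case1": "paranoid",
    "case2": "schizoid",
    "case3": "avoidant",
    "case4": "histrionic",
    "case5": "antisocial",
}
responses = {
    "case1": ["paranoid"],
    "case2": ["avoidant"],
    "case3": ["avoidant", "dependent"],
    "case4": [],
    "case5": ["antisocial"],
}

if __name__ == "__main__":
    print(prototypic_accuracy(responses, answer_key))  # 3 of 5 cases correct
    print(comorbid_accuracy(["borderline", "histrionic"], ("borderline", "antisocial")))
```

Averaging these per-participant scores within each condition gives the kind of condition means (M) reported in the text.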

Figure 2 also re-presents results from Rottman et al.’s25 investigation of clinicians not necessarily specializing in personality disorders for comparison with the current results. Due to differences in design, inferential statistics are not possible. As can be seen in the figure, across the prototypic and comorbid cases, the personality disorder researchers in the current study provided more accurate diagnoses than Rottman et al.’s25 practicing clinicians, but importantly, they did so in both the DSM and FFM conditions. Other methods of counting correct or incorrect diagnoses (e.g., not counting features or traits, or not counting “obsessive-compulsive” as Axis II obsessive-compulsive personality disorder but rather as Axis I obsessive-compulsive disorder) would not change the main results.

Incorrect diagnoses (Figure 3), defined as any DSM-IV diagnosis mismatching the correct diagnosis and any non-DSM-IV diagnosis, were examined. Participants could provide any number of incorrect diagnoses per case. Participants gave significantly more incorrect diagnoses per prototypic case in the FFM (M=.79, SD=.48) than DSM condition (M=.16, SD=.35), t(71)=9.82, p<.01. Again, this pattern of means held true across all ten disorders (see Table 1). For the comorbid cases, they also gave more incorrect diagnoses per case in the FFM (M=.81, SD=.84) than DSM condition (M=.30, SD=.64), Z=3.95, N=73, p<.01. Again, these results differed little from those of Rottman et al.’s25 practicing clinicians except for being generally more accurate across both conditions.

Figure 3
Incorrect Diagnoses (95% Confidence Intervals)

Next, the frequency of correct and incorrect diagnoses within the subgroup of participants who rated themselves as equally familiar with the DSM-IV and FFM was examined. The results closely replicate those for the entire sample, suggesting that familiarity did not influence diagnostic accuracy. The participants in this subgroup more frequently gave correct diagnoses for prototypic cases in the DSM (M=1.00, SD=0.00; these participants always gave the correct diagnosis in the DSM) than FFM (M=.62, SD=.27), t(23)=6.96, p<.01, and gave more incorrect diagnoses for the FFM (M=.75, SD=.54) than the DSM (M=.13, SD=.25), t(23)=5.15, p<.01. The same results hold for comorbid cases: participants more frequently gave correct diagnoses in the DSM (M=.75, SD=.26) than FFM condition (M=.48, SD=.31), Z=2.97, N=24, p<.01, and gave more incorrect diagnoses in the FFM (M=.75, SD=.94) than DSM condition (M=.13, SD=.34), Z=2.58, N=24, p=.01.

Correlational analyses were also conducted between familiarity ratings and frequency of correct/incorrect diagnoses for the entire set of participants. The most important reason to look at these correlations is to determine whether familiarity with the FFM increases accuracy in identifying diagnoses from FFM patient profiles. If so, this would suggest that familiarity with the FFM facilitates being able to form a coherent image of a patient from an FFM patient profile. However, this possibility was not supported. Familiarity with the FFM did not correlate significantly with providing correct diagnoses in the FFM condition (r=.08 and r=.11 for prototypic and comorbid cases respectively, ns). Familiarity with the FFM also did not help participants avoid providing incorrect diagnoses in the FFM condition (r<.01 and r=−.01 for prototypic and comorbid cases respectively, ns).

Confidence in Diagnoses

A 2 (correct vs. incorrect diagnosis) × 2 (DSM vs. FFM) repeated-measures ANOVAv revealed that participants were more confident making diagnoses in the DSM than in the FFM condition, F(1, 25)=45.15, p<.01, partial η²=.64, and more confident for correct than incorrect diagnoses overall, F(1, 25)=75.25, p<.01, partial η²=.75. In addition, there was an interaction, F(1, 25)=53.15, p<.01, partial η²=.68, indicating that, although participants were much more confident in correct than incorrect diagnoses in the DSM condition, there was a much smaller difference in confidence between correct and incorrect diagnoses in the FFM condition (see Figure 4). These findings suggest that participants were more aware of the accuracy of their diagnoses in the DSM than in the FFM condition. Familiarity with the FFM was not significantly correlated with confidence for correct or incorrect diagnoses.

Figure 4
Confidence Ratings (Std. Error)

Clinical Utility Ratings

The mean clinical utility ratings broken down by the DSM and the FFM condition are presented in Figure 5. Paired t-tests revealed that participants found the DSM-IV to be more useful than the FFM on three measures: prognosis, treatment plans, and communicating with professionals (all t’s(69)>2.19, p’s<.05). Participants rated the FFM as more useful than the DSM for communicating with patients, t(69)=3.03, p<.01, possibly because the DSM-IV disorder names are considered to be stigmatizing and because the FFM facets are common terms rather than technical disorder names. There was no difference between the DSM and FFM for comprehensively describing all important personality problems and global personality description, p’s>.10. All of these patterns of results also hold when only including data from the condition presented first.

Figure 5
Clinical Utility Ratings for Entire Sample (Std. Error)

The clinical utility ratings for the subset of participants who were equally familiar with the DSM-IV and FFM were examined (see Figure 6 for means). Paired t-tests showed that the FFM was not rated as more useful than the DSM for four out of the six aspects of clinical utility. Participants in the subgroup did rate the FFM as more useful than the DSM for communicating with patients and describing global personality, which makes sense because the FFM is based on common adjectives describing personality and was meant to describe all types of personality, not just pathological personality. In summary, however, the clinical utility ratings do not suggest that participants found the FFM to be clearly more useful than the DSM, which would be necessary to support a switch to the FFM on grounds of increased clinical utility.

Figure 6
Clinical Utility Ratings for Subsample (Std. Error)

General Discussion

Recent work has shown that the FFM poses cognitive challenges and has relatively low clinical utility for practicing clinicians, if presented without context to disambiguate FFM traits.25 However, these previous findings were obtained in a broad sample of practicing clinicians who rated themselves as much more familiar with the DSM-IV than the FFM and who may not necessarily specialize in personality disorders per se (as opposed to other, Axis I disorders). In the current study, we examined whether people with specialized experience and knowledge about personality disorders, personality disorder researchers, are able to overcome the cognitive challenges of the FFM—especially those who are equally familiar with the FFM and the DSM. The current results, in conjunction with those reported by Rottman et al.,25 suggest that FFM traits alone may be too ambiguous as a diagnostic tool for practicing clinicians.

In the current study, experts in personality disorders, as established by their record of published research and self-identified primary interests, had difficulty identifying even highly familiar, prototypic DSM-IV diagnoses from FFM profiles. Correlational analyses ruled out the possibility that participants’ degree of familiarity with the FFM was likely to be responsible for the observed problems in identifying correct diagnoses from FFM profiles. A subgroup of participants reporting equal familiarity with the DSM-IV and FFM also had difficulty identifying prototypic DSM-IV diagnoses from FFM patient profiles and were less confident in these diagnoses than in their diagnoses of DSM profiles. This finding is consequential because it suggests that even equal familiarity with the FFM and DSM-IV is not sufficient to form a coherent image of a patient from an FFM profile alone.

One could argue that, to the extent that the DSM-IV personality disorders lack validity, it is not particularly important to be able to use them to conceive of a case. Yet, completely abandoning them would pose considerable disruption from a practical standpoint (e.g., disruption of ongoing research; difficulty for clinicians in implementing past research findings into clinical practice).3 Perhaps most importantly, clinicians have been working with a categorical personality disorder system since 1980; they cannot simply turn off their prior knowledge and experience, nor could it conceivably be desirable for them to do so. Furthermore, as mentioned earlier, certain DSM-IV personality disorders, particularly borderline personality disorder, have been acknowledged to be useful constructs even by proponents of the FFM.29 Yet, only 55% of researchers in the current study were able to identify the prototypic FFM trait pattern as borderline. Such difficulty in recognizing useful constructs in FFM case profiles suggests a problem with the FFM’s clinical utility.

The researcher-participants in the current study also judged the clinical utility of the FFM to be low in a number of aspects, further suggesting that they found the FFM descriptors to be ambiguous. Specifically, participants judged an abstract FFM patient description (e.g., a neurotic, anxious, and introverted person) to be less useful in making treatment plans and predictions about the course and outcome of the patient than a DSM description. Participants also thought that the disorder category names of the DSM-IV greatly facilitated communication between mental health professionals who know the terminology, though they thought that the commonplace adjectives used by the FFM were more useful for communicating with patients who, presumably, are less likely to know diagnostic terminology.

Investigating how people use the FFM may reveal additional issues to be considered in formulating potential diagnostic systems incorporating the FFM or other trait-based dimensional systems. In the current study, participants were able to identify some DSM-IV disorders much better than others when examining FFM profiles (see Table 1). For example, histrionic personality disorder was correctly identified only 20% of the time on the basis of an FFM profile. In contrast, obsessive-compulsive personality disorder (OCPD) was correctly identified 88% of the time from an FFM profile. OCPD may be particularly easy to identify because its conscientiousness facets all receive very high scores (between 4 and 5), whereas none of the conscientiousness facets receive above a 4 in any of the other personality disorder prototypes.25 (In fact, a cluster analysis of the 10 FFM prototypes revealed that OCPD is the most distinctive of the 10 personality disorders.) Because the FFM conscientiousness facets are diagnostic of OCPD, high scores on the conscientiousness facets are not ambiguous: they occur primarily for OCPD. Thus, highly distinctive and diagnostic facets allowed our participants to more easily recognize the disorder. Participants’ poor performance on histrionic personality disorder may be due to the fact that its facet scores are quite moderate, and thus it may be hard to determine which facets are clinically relevant.

Future Research

To what extent will other trait-based dimensional systems face the same cognitive challenges relating to ambiguity demonstrated in the current study? Many other systems have scales that can be mapped closely onto the FFM facets.34 For example, like the FFM, the Schedule for Nonadaptive and Adaptive Personality (SNAP-2)7,35 and the Dimensional Assessment of Personality Pathology (DAPP)8,36 have ‘Impulsivity’ and ‘Mistrust’/‘Suspiciousness’ scales. Given the degree to which such systems overlap with the FFM, we speculate that they may contain a similar degree of ambiguity. Future research, however, will be necessary to assess definitively whether this is the case.

Another crucial future research direction is to identify ways in which the trait descriptors can be successfully disambiguated. One possible remedy is to combine trait-based dimensional systems with type-based ones (e.g., the prototypes of DSM-IV personality disorders tested by Spitzer et al).23 For example, a clinician might first determine whether a patient is similar to different personality prototypes (e.g., borderline, antisocial, etc.) along a dimensional scale and then use a trait-based assessment to further describe the patient. The idea is that the initial prototype assessment would instantiate the more specific meanings of the traits for this patient. For example, rather than thinking about a patient as ‘withdrawn,’ a clinician could think of the patient as ‘withdrawn due to paranoid fears’ (as in the paranoid type) or ‘withdrawn due to indifference to others’ (as in schizoid type). We suggest that in general, instantiated descriptors are likely to be more clinically meaningful and useful for clinicians than uninstantiated (and thereby ambiguous) ones.

Finally, future work might test whether optional steps proposed to supplement the FFM will successfully reduce ambiguity. One such proposal is to assess dysfunctional behaviors associated with abnormal trait scores (e.g., dysfunctional behaviors such as “overspending” or “excessive gambling” or “excessive use of drugs” are associated with high impulsiveness scores; behaviors such as “readily perceives malevolent intentions within benign, innocent remarks or behaviors,” or “is often involved in acrimonious arguments with friends” are associated with low trust scores;29 see Clark10 for a review). Unfortunately, there have not yet been any empirical studies examining whether clinicians find supplemental dysfunctional behaviors to be useful and whether they actually can use these supplements. For instance, although Samuel and Widiger26 and Lynam and Widiger27 have shown that researchers and clinicians can reliably assess FFM traits for prototypic cases of the DSM-IV personality disorders, no studies have yet examined whether they can also reliably identify which dysfunctional behaviors are associated with these prototypic cases.

Once the translatability of these dysfunctional behaviors from the familiar DSM-IV constructs is empirically established, additional research can further examine whether the descriptions based on these dysfunctional behaviors are unambiguous enough to be translated back to the DSM-IV constructs, as in the current study. It may be that dysfunctional behaviors would help to clarify the context of an extreme trait and improve clinicians’ ability to back-translate to the DSM. On the other hand, the existing catalog of dysfunctional behaviors29,37 may not clarify the context of the traits, because it may not have been developed with the ambiguities of traits in mind. For example, the existing catalog29,37 lists only two dysfunctional behaviors associated with low ratings on the gregariousness trait (“is socially isolated” and “has no apparent support network due to his or her own social withdrawal”), but clinicians may not be able to determine, based on these specified dysfunctions, whether this low gregariousness is due to paranoid fear, fear of not being liked by others, or indifference to others. To give yet another example, high excitement seeking is associated with the following dysfunctional behaviors: “engages in a variety of reckless and even highly dangerous activities; behavior is rash, foolhardy, and careless”29 and “easily bored; excessive thrill seeker.”37 It is not clear whether clinicians will be able to use these dysfunctional behaviors to differentiate between Narcissistic, Antisocial, Borderline, and Histrionic personality types. All existing studies on the clinical utility of dimensional systems have focused on the trait description itself without consideration of the associated dysfunctions. If inclusion of such dysfunctions is to be considered for the DSM-V, it will therefore be important for future research to assess empirically whether the dysfunction assessment does indeed disambiguate traits.


This research was supported by an NSF Graduate Research Fellowship (Rottman) and NIMH Grants MH084047 (Kim), MH57737 (Ahn), and MH73708 (Sanislow). Some of these findings were presented in July 2009 at the 31st Annual Conference of the Cognitive Science Society, Amsterdam, The Netherlands.

The authors thank Rachel Litwin for help identifying and recruiting the personality disorder researchers and coding data.


i. Forty-eight of these researchers also saw patients. This subset of clinician-researchers had been in practice for an average of 15 (SD=9) years and worked specifically with patients with personality disorders an average of 13 (SD=11) hours weekly.

ii. Equal variances not assumed.

iii. Both main effects were also significant. The current researcher-participants were more familiar with the DSM than FFM, F(1, 252)=127.03, p<.01, and generally gave higher familiarity ratings than Rottman et al.’s25 clinicians, F(1, 252)=400.02, p<.01.

iv. Non-parametric Wilcoxon signed-rank tests were used to analyze the accuracy of comorbid cases because there are few levels of the outcome variables.

v. This within-subjects analysis can only be conducted for the 26 participants who gave at least one correct and one incorrect diagnosis (and consequently their corresponding confidence ratings) in both the DSM and FFM. To increase the number of subjects who could be included in this analysis, prototypic and comorbid cases were both included. For example, to obtain the average confidence rating for correct diagnoses in the FFM condition, the average was computed over whichever of the six FFM cases (five prototypic and one comorbid) the participant diagnosed correctly.

vi. One participant in the subset of 24 did not include utility ratings for the DSM and is excluded from analyses.

The authors report no competing interests.

Contributor Information

Benjamin M. Rottman, Department of Psychology, Yale University; New Haven, CT.

Nancy S. Kim, Department of Psychology, Northeastern University; Boston, MA.

Woo-kyoung Ahn, Department of Psychology, Yale University; New Haven, CT.

Charles A. Sanislow, Department of Psychology, Wesleyan University; Middletown, CT.

Works Cited

1. Diagnostic and statistical manual of mental disorders: text revision. 4. Washington, DC: American Psychiatric Association; 2000.
2. Kupfer DJ, First MB, Regier DE. Externalizing psychopathology in adulthood: a dimensional-spectrum conceptualization and its implications for DSM-V. J Abnorm Psychol. 2002;114:537–550. [PMC free article] [PubMed]
3. First MB. Clinical utility: a prerequisite for the adoption of a dimensional approach in DSM. J Abnorm Psychol. 2005;114:560–564. [PubMed]
4. First MB, Pincus HA, Levine JB, Williams JBW, Ustun B, Peele R. Clinical utility as a criterion for revising psychiatric diagnoses. Am J Psychiatry. 2004;161:946–954. [PubMed]
5. Shedler J, Westen D. Refining personality disorder diagnosis: Integrating science and practice. Am J Psychiatry. 2004;161:1350–1365. [PubMed]
6. Costa PT, McCrae RR. Revised NEO personality inventory: professional manual (NEO-PI-R) Odessa, Fla: Psychological Assessment Resources; 1992.
7. Clark LA, Simms LJ, Wu KD, Casillas A. Schedule for Nonadaptive and Adaptive Personality (SNAP-2) 2. Minneapolis, MN: University of Minnesota Press; in press.
8. Livesley WJ, Jackson DN. Manual for the Dimensional Assessment of Personality Pathology. Port Huron, MI: Sigma; 2006.
9. Wiggins JS. Paradigms of Personality Assessment. New York: Guilford; 2003. [PubMed]
10. Clark LA. Assessment and diagnosis of personality disorder: perennial issues and emerging reconceptualization. Annu Rev Psychol. 2007;58:227–57. [PubMed]
11. Livesley WJ. Diagnostic dilemmas in classifying personality disorder. In: Phillips KA, First MB, Pincus HA, editors. Advancing DSM. Dilemmas in psychiatric diagnosis. Washington, DC: American Psychiatric Association; 2003. pp. 153–190.pp. 323–345.
12. Widiger TA, Trull TJ. Plate tectonics in the classification of personality disorder: shifting to a dimensional model. Am Psychol. 2007;62(2):71–83. [PubMed]
13. Verheul R, Widiger TA. A meta-analysis of the prevalence and usage of the personality disorder not otherwise specified (PDNOS) diagnosis. J Personal Disord. 2004;18:283–302. [PubMed]
14. Westen D, Arkowitz-Westen L. Limitations of axis II in diagnosing personality pathology in clinical practice. Am J Psychiatry. 1998;155:1767–1771. [PubMed]
15. Bernstein DP, Iscan C, Maser J. Board of Directors of the Association for Research in Personality Disorders, Board of Directors of the International Society for the Study of Personality Disorders. J Personal Disord. 2007;21:536–551. [PubMed]
16. Brooks LR, Hannah SD. Instantiated features and the use of “rules” J Exp Psychol Gen. 2006;135:133–151. [PubMed]
17. Kamp H. Two theories about adjectives. In: Keenan E, editor. Formal Semantics of Natural Language. Cambridge, UK: Cambridge University Press; 1975. pp. 123–55.
18. Murphy GL. Comprehending complex concepts. Cognitive Science. 1988;12:529–562.
19. Murphy GL, Medin DL. The role of theories in conceptual coherence. Psychol Rev. 1985;92:289–316. [PubMed]
20. Rosch E. Principles of categorization. In: Rosch E, Lloyd BB, editors. Cognition and Categorization. Hillsdale, NJ: Erlbaum; 1978. pp. 27–48.
21. Benjamin LS. Interpersonal diagnosis and treatment of personality disorders. NY: Guilford; 1993.
22. Samuel DB, Widiger TA. Clinicians’ judgments of clinical utility: a comparison of DSM-IV and Five Factor Models. J Abnorm Psychol. 2006;115:298–308. [PubMed]
23. Spitzer RL, First MB, Shedler J, Westen D, Skodol AE. Clinical utility of five dimensional systems for personality diagnosis: a “consumer preference” study. J Nerv Ment Dis. 2008;196:356–374. [PubMed]
24. Sprock J. Dimensional versus categorical classification of prototypic and nonprototypic cases of personality disorder. J Clin Psychol. 2003;59:991–1014. [PubMed]
25. Rottman BM, Ahn W, Sanislow CA, Kim NS. Can clinicians recognize personality disorders from Five-Factor Model descriptions of patient cases? Am J Psychiatry. 2009;166:427–433. [PMC free article] [PubMed]
26. Samuel DB, Widiger TA. Clinicians’ personality descriptions of prototypic personality disorders. J Personal Disord. 2004;18:286–308. [PubMed]
27. Lynam DR, Widiger TA. Using the five-factor model to represent the DSM-IV personality disorders: an expert consensus approach. J Abnorm Psychol. 2001;110:401–12. [PubMed]
28. Sprock J. A comparative study of the dimensions and facets of the five-factor model in the diagnosis of cases of personality disorder. J Personal Disord. 2002;16:402–423. [PubMed]
29. Widiger TA, Costa PT, McCrae RR. A proposal for axis II: Diagnosing personality disorders using the five-factor model. In: Costa PT, Widiger TA, editors. Personality disorders and the five-factor model of personality. 2. Washington, DC: American Psychological Association; 2005. pp. 431–457.
30. Ahn W, Marsh JK, Luhmann CC, Lee K. Effect of theory-based feature correlations on typicality judgments. Mem Cognit. 2002;30:107–118. [PubMed]
31. Wattenmaker WD, Dewey GI, Murphy TD, Medin DL. Linear separability and concept learning: context, relational properties, and concept naturalness. Cogn Psychol. 1986;18:158–194. [PubMed]
32. Chase WG, Simon HA. Perception in chess. Cogn Psychol. 1973;4:55–81.
33. Bornstein RF. Reconceptualizing personality disorder diagnosis in the DSM-V: the discriminant validity challenge. Clinical Psychology: Science and Practice. 1998;5:333–343.
34. Widiger TA, Simonsen E. Alternative dimensional models of personality disorder: Finding a common ground. J Personal Disord. 2005;19:110–30. [PubMed]
35. Simms LJ, Clark LA. The Schedule for Nonadaptive and Adaptive Personality (SNAP): A Dimensional Measure of Traits Relevant to Personality and Personality Pathology. In: Strack S, editor. Differentiating Normal and Abnormal Personality. New York: Springer; 2006. pp. 431–450.
36. Livesley W. The dimensional assessment of personality pathology (DAPP) approach to personality disorder. In: Strack S, editor. Differentiating Normal and Abnormal Personality. New York: Springer; 2006. pp. 401–430.
37. McCrae RR, Löckenhoff CE, Costa PT. A step toward DSM-V: Cataloguing personality-related problems in living. Eur J Pers. 2005;19:269–286.