Obsessive Compulsive Disorder (OCD) is an anxiety disorder marked by the recurrence of intrusive thoughts and repeated engagement in behaviors in an attempt to neutralize these thoughts and reduce anxiety. In assessing the severity of obsessive and compulsive behaviors that are the hallmark of OCD, the Yale-Brown Obsessive Compulsive Scale (Y-BOCS) is generally accepted to be the “gold standard” (
Antony, Orsillo, & Roemer, 2001). Indeed, in meta-analyses examining the efficacy of pharmacological and behavioral interventions for OCD (
Abramowitz, 1997;
Norton & Price, 2007), the Y-BOCS was the most frequently utilized primary or secondary measure of treatment outcome. Despite the widespread acceptance and use of the Y-BOCS, however, there are still significant gaps in the psychometric literature establishing the validity of this measure. Two such weaknesses will be addressed herein: establishment of a stable factor structure and cross-cultural validation.
Although many examinations of the factor structure of the Y-BOCS have been undertaken, no structure has been consistently identified across these studies. The scale, as created by
Goodman and colleagues (1989a;
1989b), is based on a two-factor structure composed of an Obsessions factor and a Compulsions factor. This structure was also found by
McKay and colleagues (1995) in a clinical sample. Two-factor solutions have also been found in other studies, though the two factors did not correspond to the previously found Obsessions and Compulsions factors.
Amir and colleagues (1997) identified a factor solution in which the two factors were best described as Disturbance and Symptom Severity.
Deacon and Abramowitz (2005) also found a different two-factor solution, composed of a Severity factor and a Resistance and Control factor.
This inconsistency in findings across studies may be due in part to limitations of the methodologies employed in these investigations. Some previous investigations (
Deacon & Abramowitz, 2005;
Fals-Stewart, 1992;
Moritz et al., 2002) have used data reduction techniques, such as Principal Components Analysis (PCA). While PCA is commonly used in the literature, the assumptions made by this technique are generally not suited to psychological questionnaire data, such as those from the Y-BOCS (
Widaman, 1993). One central reason for this is that PCA ignores measurement error, which is typically considerable in data from self-report psychological assessments (
Floyd & Widaman, 1995;
Widaman, 1993;
Widaman, 2007). Principal Components Analysis is also typically paired with orthogonal rotations, which force factors to be uncorrelated. When constructs are believed to be related, as Obsessions and Compulsions are, such an approach is not ideal. Additionally, PCA regards items as continuous. That is, the ordinal items comprising the scale are treated as though they have interval properties when they do not (e.g., the difference in symptom severity between a response of ‘0’ and ‘1’ is not necessarily the same as between ‘3’ and ‘4’). For example, on item 3 (
Distress associated with obsessive thoughts) it is unclear if the distance between
None (0) and
Not too disturbing (1) is identical or similar to the distance between
Very disturbing (3) and
Near constant and disabling distress (4). This problem also exists in the application of conventional factor analysis to ordinal data, since this approach also treats variables as continuous. Using continuous data approaches with ordinal data has been repeatedly shown to result in distortion of factor analytic results (
Bernstein & Teng, 1989;
Jöreskog & Moustaki, 2001). As this method has also been applied in factor analyses of the Y-BOCS (
Deacon & Abramowitz, 2005;
Fals-Stewart, 1992;
McKay, Danyko, Neziroglu, & Yaryura-Tobias, 1995), this has likely also contributed to difficulties in firmly establishing a factor structure. A more appropriate approach for investigating the factor structure of the Y-BOCS is confirmatory factor analysis (CFA) for ordered-categorical variables. This approach treats item responses as ordinal and uses the observed response patterns to reconstruct the continuous distributions assumed to underlie each variable, rather than treating Likert-type items as continuous, as the previously discussed approaches do.
Another limitation in the psychometric validation of the Y-BOCS concerns the appropriateness of this measure for cross-cultural comparisons. Measurement invariance and differential item functioning (DIF) studies evaluate whether items function in the same way and are on the same metric across groups (e.g., race/ethnicity, sex). If an assessment tool lacks measurement invariance or exhibits DIF across groups, then comparisons between groups may not accurately reflect real group differences and may lead to erroneous conclusions. Similarly, if an individual’s score is evaluated against norms from a group for which such comparisons are unfounded, this can lead to over-pathologizing or to under-identification or underestimation of psychopathology.
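The threshold idea behind the ordered-categorical approach, and the reason non-invariance can distort group comparisons, can be illustrated with a minimal sketch. The code below is an illustration only, not the analysis used in this study. It assumes a standard-normal latent response underlying a single 0-4 item, and the response counts for the two groups are hypothetical. It also assumes that both groups share the same latent distribution, so that all differences in response proportions are attributed to thresholds; an actual invariance analysis separates threshold differences from true group differences using multiple items in a multi-group model.

```python
from statistics import NormalDist


def estimate_thresholds(counts):
    """Estimate thresholds on a standard-normal latent response.

    counts[k] is the number of respondents choosing category k.
    Returns len(counts) - 1 thresholds; threshold k separates
    category k from category k + 1.
    """
    total = sum(counts)
    thresholds = []
    cumulative = 0
    for count in counts[:-1]:
        cumulative += count
        proportion = cumulative / total
        # inv_cdf is undefined at 0 and 1, so clamp the proportion.
        proportion = min(max(proportion, 1e-6), 1 - 1e-6)
        thresholds.append(NormalDist().inv_cdf(proportion))
    return thresholds


def category_for(latent_value, thresholds):
    """Map a latent value to the ordinal category it would produce."""
    return sum(latent_value > t for t in thresholds)


# Hypothetical response counts for one 0-4 item (not real data).
group_a_counts = [50, 25, 15, 7, 3]
group_b_counts = [30, 30, 20, 12, 8]

thresholds_a = estimate_thresholds(group_a_counts)
thresholds_b = estimate_thresholds(group_b_counts)

# The same latent severity yields different observed responses
# when the thresholds differ between groups.
latent_value = 0.5
observed_a = category_for(latent_value, thresholds_a)
observed_b = category_for(latent_value, thresholds_b)
print(observed_a, observed_b)
```

Under these assumptions, a respondent with the same latent severity would give a different observed response depending on group membership, which is the kind of discrepancy that measurement invariance and DIF analyses are designed to detect.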
Although the presence of OCD has been widely identified across cultural and racial/ethnic groups (e.g.
Fontenelle & Hasler, 2008;
Fontenelle, Mendlowicz, & Veriani, 2006), there have been few efforts to validate severity measures, such as the Y-BOCS, for use in cross-cultural contexts. However, the research that has been done in this area suggests that some of these measures may not function similarly across all cultural and racial/ethnic groups. Measures of OCD symptom severity that have been evaluated for measurement invariance, or differential item functioning, across racial/ethnic groups include the Padua Inventory and the Maudsley Obsessive-Compulsive Inventory (MOCI).
Williams and colleagues (2005) investigated DIF of the Padua Inventory across Black and White participants. DIF was identified, resulting in overestimation of scores for Black participants relative to White participants. A study by
Thomas and colleagues (2000) investigating DIF in the MOCI found similar differences in the psychometric functioning of this measure between Black and White participants on subscales measuring severity of Cleaning and Checking behaviors. Black participants were more likely to endorse items at a lower level of severity relative to White participants, which could lead to overestimation of the severity of these behaviors. The findings of these studies highlight the importance of validating OCD measures across racial/ethnic groups, rather than simply assuming that no differences exist.
Although these findings suggest non-invariance across groups when assessed using the MOCI and the Padua Inventory, distinct differences exist between these measures and the Y-BOCS. Most notably, the MOCI and Padua Inventory identify specific beliefs, obsessions, or compulsions (e.g., “I avoid using public telephones because of possible contamination”), whereas the Y-BOCS assesses dimensions (e.g., “Time occupied by obsessive thoughts”) of overall obsessions and compulsions regardless of specific content. As noted by
Washington, Norton, and Temple (2008), “measures designed to assess these symptoms may artificially under- or over-pathologize individuals from diverse backgrounds. This may be particularly so with measures of obsessive-compulsive disorder (OCD), as practices such as washing or cleaning, and rule-based behaviors, are likely to be heavily influenced by cultural norms” (p. 456).
To our knowledge, no studies have been conducted to evaluate measurement invariance in the Y-BOCS across racial/ethnic groups. As the Y-BOCS is a well-known and frequently used measure of OCD symptom severity, we sought to address the previously noted limitations in the psychometric literature on this measure. First, we evaluated the factor structure of the Y-BOCS in both community and clinical samples using confirmatory factor analysis for ordered-categorical data. Second, we evaluated measurement invariance of the Y-BOCS across four racial/ethnic groups in a community sample. As previous research does not offer strong grounds for hypothesizing one factor structure over another, or for expecting the presence or absence of measurement invariance across racial/ethnic groups for this measure, we chose to abstain from hypothesizing about the outcome of these analyses.