Efforts to better understand bipolar spectrum disorders across ethnic groups are often hampered by the lack of commonly used self-report instruments to assess mania and depression in individuals who speak languages other than English. This article describes the translation into Spanish of 2 self-report measures of manic symptoms (i.e., the Internal State Scale and the Hypomanic Personality Scale) and 2 self-report measures of depression (i.e., the Inventory to Diagnose Depression and the Inventory to Diagnose Depression, Lifetime version). The authors translated these measures into Spanish and assessed their psychometric properties among bilingual college students (N = 88). Results suggest that the Spanish versions have psychometric properties comparable to the English versions of the instruments.
During the 1990s, the Hispanic population was the fastest growing minority group in the United States and became the largest minority population in the country (U.S. Census Bureau, 2001). This demographic phenomenon makes salient the need to incorporate this group into clinical research on mental health. A major barrier to this endeavor is the lack of valid and reliable Spanish-language assessment instruments for those Hispanics who speak Spanish as their primary language (Ginzberg, 1991; Norvy, Stanley, Averill, & Daza, 2001). Measures of psychopathology that do exist in Spanish tend to be developed for, and normed on, largely non-Hispanic, White samples; moreover, these instruments lack psychometric support for use with bilingual Americans (Norvy et al., 2001).
Although lifetime prevalence rates of bipolar disorder are comparable across ethnic groups (American Psychiatric Association, 2000), there is some preliminary evidence that cultural variables may influence the course of the disorder (Nandi, Banerjee, Mukherjee, Nandi, & Nandi, 2000). Nonetheless, little research exists comparing the course of the disorder or the mechanisms by which it unfolds across ethnic groups. Spanish-language measures of bipolar disorder are needed to facilitate this type of research. Although interview-based instruments exist, we are aware of no Spanish versions of self-report measures to assess current manic symptoms, and we are aware of only a single effort to translate a measure of lifetime vulnerability to mania (Rawlings, Barrantes-Vidal, Claridge, McCreery, & Galanos, 2000).
In this article, we report on the translation into Spanish of two self-report measures of mania, the Internal State Scale (ISS; Bauer et al., 1991) and the Hypomanic Personality Scale (HPS; Eckblad & Chapman, 1986), and on the psychometric properties of these measures. We also report on the translation into Spanish of two widely used self-report measures of depression and lifetime history of depression, the Inventory to Diagnose Depression (IDD; Zimmerman, Coryell, Corenthal, & Wilson, 1986) and the Inventory to Diagnose Depression, Lifetime version (IDD-L; Zimmerman & Coryell, 1987), and on the psychometric properties of these measures. These instruments have been extensively used in research with English-speaking participants.
The goal of the current study was to develop Spanish-language versions of these commonly used measures of mania and depression and to compare the psychometric properties of the English versions of these measures with the Spanish versions among a sample of bilingual individuals. Specifically, internal consistency estimates and mean scale score differences were calculated and compared across language versions; intraclass correlations were calculated to assess the relationship between the measures in English and Spanish. In so comparing the two language versions, we did not intend to make cross-cultural comparisons; rather, we intended to obtain initial information on the psychometric properties of the Spanish versions of each measure. Positive findings in this respect would provide support for further testing of these measures among more diverse clinical samples.
The sample initially consisted of 90 English–Spanish bilingual undergraduates at the University of Miami who received, for participation, partial credit for an introductory psychology research assignment. To be included, participants were required to pass an English and Spanish comprehension test consisting of 12 words at the 7th-grade level (drawn from the Peabody Picture Vocabulary Test—Third Edition, Dunn & Dunn, 1997; Test De Vocabulario En Imagenes Peabody: Adaptacion Hispanoamericana, Dunn, Padilla, Lugo, & Dunn, 1986). Participants scored well above the recommended threshold of 4 correct words on both scales. That is, they correctly identified an average of 9.38 (SD = 1.93) Spanish and 11.38 (SD = 1.11) English words. A review of responses suggested a random pattern of responding for two individuals; they were removed from subsequent analyses.
Of the 88 participants retained for the analyses, 61.4% were men and 38.6% were women. Participants ranged in age from 17 to 47 (M = 19.81, SD = 4.46). Thirty-one percent of the participants were freshmen, 23.9% were sophomores, 22.7% were juniors, and the remaining 22.7% were either seniors or graduate-level students. In terms of ethnicity, 81.8% of the participants reported being Hispanic, and 13.7% designated themselves as Caucasian, 1.1% as Asian, 1.1% as African American, and 1.1% as other. With respect to country of origin, 67% of participants were born in the United States or Canada (5.7% of the participants were from Puerto Rico), 21.6% were born in South America, 3.4% were born in Central America, and 3.4% were born in Cuba. The remaining 4.5% of the participants were born in other regions of the world.
Participants met with the experimenter in small groups. Participants were told that they would be completing English and Spanish versions of various questionnaires. All participants completed written informed-consent procedures with none declining to participate in the study. Each participant completed the language comprehension test noted above and then completed computerized versions of the measures. The language-order of the administration was varied, with some participants completing the English versions first and others completing the Spanish versions first. Within each language, however, the order of instrument administration was the same. To reduce the possibility that a participant would remember his or her response to any particular item, participants first completed all the measures in one language before completing the measures in the second language.
We took several steps to obtain Spanish versions of the measures. A medical translator was hired to provide an initial translation of all measures into Spanish. To ensure that different Hispanic groups could readily understand items, a team of translators who had lived in Mexico, Cuba, Argentina, Chile, and Spain reviewed the initial translations and replaced culture-specific wording with culture-neutral wording. A second team of translators back-translated the items of the Spanish versions. If the wording of an original item and its back-translation was discrepant, the team of translators consulted with one another and selected wording that was mutually acceptable.
The ISS is a 16-item self-report instrument designed to assess the severity of current manic and depressive symptoms. In the original version, participants respond to each item using a 100-mm visual analog scale. This response format was modified in the current research by adopting a Likert-type scale that ranged from 1 (not at all) to 10 (extremely) and that has been validated in previous research (Glick, McBride, & Bauer, 2003; Johnson, Ruggero, & Carver, in press; B. Meyer, Johnson, & Carver, 1999). Bauer et al.’s (1991) principal-components analysis yielded four subscales: Activation (ACT), Well-Being (WB), Perceived Conflict (PC), and the Depression Index (DI).
All subscales had good internal consistency reliability (ACT, α = .84; WB, α = .87; PC, α = .81; DI, α = .92; Bauer et al., 1991). With respect to validation, ACT was significantly correlated (r = .60, p < .0001) with the Young Mania Rating Scale (Young, Biggs, Ziegler, & Meyer, 1978), whereas DI was significantly correlated (r = .84, p < .0001) with the Hamilton Depression Rating Scale (Hamilton, 1960). Both scales succeeded in not only distinguishing diagnostic groups, but in discriminating changes in symptom severity (Bauer et al., 1991). As expected, the ACT and DI scales were not highly correlated (r = .17; Bauer et al., 1991).
The HPS is a 48-item self-report measure designed to identify individuals at risk for manic episodes. The HPS is one of the few measures that has been shown to predict the development of hypomania and bipolar disorder over time (Kwapil et al., 2000). The items assess positive affect, energy, extraversion, and goal-driven behavior. Each of the 48 items is keyed either “true” or “false.” Sample items include “I often feel excited and happy for no apparent reason,” “I often have moods where I feel so energetic and optimistic that I feel I could outperform almost anyone at anything,” and “There have often been times when I had such an excess of energy that I felt little need to sleep at night.”
The HPS has been shown to differentiate individuals with and without manic symptoms: More than 75% of individuals with high scores were found to meet diagnostic criteria for bipolar disorder (Eckblad & Chapman, 1986). The measure has high reliability (15-week test–retest reliability = .81; coefficient α = .87) and correlates with other measures of risk for bipolar disorder (General Behavior Inventory: r = .47, n = 768; Eckblad & Chapman, 1986). The measure is uncorrelated with indices of social desirability (Crowne-Marlowe Scale for Social Desirability: r = .05, n = 768; Eckblad & Chapman, 1986). High scores have been shown to predict the onset of bipolar disorder and related conditions over a 10-year period (Kwapil et al., 2000). In addition, the HPS has been shown to relate to symptoms of mania more robustly than other scales, such as the NEO-V (T. D. Meyer, 2002).
The IDD is a 22-item self-report measure designed to assess the symptoms of major depressive disorder. Unlike the Beck Depression Inventory (Beck, Rush, Shaw, & Emery, 1979), the IDD and IDD-L were designed to closely correspond with Diagnostic and Statistical Manual of Mental Disorders (3rd ed.; DSM–III; American Psychiatric Association, 1980) criteria for the diagnosis of a major depressive episode and also closely correspond with the criteria of the Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM–IV–TR; American Psychiatric Association, 2000). Each item consists of five statements assessing the degree to which one has experienced a specific symptom of depression (0 = absence of the symptom, 1 = subclinical severity, 2–4 = clinically significant symptoms).
The IDD has been shown to differentiate between individuals with and without major depression (Zimmerman et al., 1986). The measure has high reliability (split-half reliability = .93; Cronbach’s α = .92) and correlates significantly with other measures of depression (Hamilton Rating Scale: r = .80, p < .001; Beck Depression Inventory: r = .87, p < .001; Zimmerman et al., 1986). Moreover, the IDD is sensitive to changes in depression severity from inpatient admission to discharge (Zimmerman et al., 1986).
The IDD-L is a 22-item self-report measure designed to assess lifetime history of depression. It is identical to the IDD with one exception: Rather than referring to current symptoms, the IDD-L asks respondents to focus on the week in their life when they felt the most profoundly sad or depressed. The IDD-L was originally designed to diagnose a lifetime history of major depressive disorder according to the DSM–III. However, the items cover all of the criteria necessary to make this diagnosis according to the standards of the DSM–IV–TR.
The IDD-L has good reliability (Cronbach’s α = .92; split-half reliability = .90) and has demonstrated significant concordance with other measures of lifetime history of depression (Diagnostic Interview Schedule [DIS]: κ = .66). Using the DIS as the criterion, the sensitivity of the IDD-L was 74% and its specificity was 93%. The chance-corrected level of agreement between the IDD-L and DIS was κ = .60.
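As an illustration of how sensitivity, specificity, and chance-corrected agreement (Cohen's kappa) are obtained from a 2 × 2 classification table, consider the minimal sketch below. The function name `diagnostic_agreement` and the cell counts are hypothetical; they are not the counts underlying the values reported by the instrument's authors.

```python
def diagnostic_agreement(tp, fn, fp, tn):
    """Sensitivity, specificity, and Cohen's kappa for a 2 x 2 table.

    tp: screen positive, criterion positive
    fn: screen negative, criterion positive
    fp: screen positive, criterion negative
    tn: screen negative, criterion negative
    """
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)

    # Observed proportion of agreement between the two classifications.
    p_observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal proportions.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return sensitivity, specificity, kappa


# Hypothetical counts chosen only to make the arithmetic easy to follow.
sens, spec, kappa = diagnostic_agreement(tp=40, fn=10, fp=5, tn=45)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}, kappa = {kappa:.2f}")
```

Because kappa subtracts the agreement expected from the marginal rates alone, it is typically lower than the raw proportion of agreement.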
To examine links between symptoms and current affect, a list of six positive and six negative affect adjectives was also included. For each adjective, individuals were asked to describe how they were feeling “right now” on a scale of 0 (not at all) to 8 (extremely). Positive affect items included amused, elated, enthusiastic, euphoric, excited, happy, and surprised. Negative affect items included annoyed, anxious, distressed, fearful, hostile, and nervous. These items were drawn from the Current Affective State Inventory (Gross, Sutton, & Ketelaar, 1998).
Table 1 presents descriptive statistics for the sample (M, SD, and α) on the English and Spanish versions of the HPS, ISS, IDD, IDD-L, and positive and negative affect measures. Alpha coefficients of internal consistency for the Spanish versions of each scale were .70 or greater (M = .83), and all were comparable to the alpha coefficients for the English versions (M = .82). Among our measures, the HPS is the only one previously translated into Spanish (Rawlings et al., 2000); however, the current Spanish translation of the HPS, which differed by relying on back-translation, had a significantly higher alpha coefficient of internal consistency than the previously translated version (.86 vs. .70).
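For readers who wish to compute comparable internal consistency estimates, the following is a minimal sketch of coefficient alpha calculated from a respondent-by-item score matrix. The function name `cronbach_alpha` and the responses are hypothetical illustrations; they are not the data or the analysis code of the present study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Coefficient alpha; rows are respondents, columns are items."""
    k = len(scores[0])  # number of items on the scale
    if k < 2:
        raise ValueError("alpha requires at least two items")
    # Sample variance of each item across respondents.
    item_variances = [variance(item) for item in zip(*scores)]
    # Sample variance of the respondents' total scale scores.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 4 respondents x 3 items (not data from this study).
example_scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [5, 4, 4],
]
alpha = cronbach_alpha(example_scores)
print(f"alpha = {alpha:.2f}")
```

Applying the same function separately to the English and Spanish responses to a scale would yield the pair of coefficients that a comparison across language versions requires.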
Intraclass correlations using the agreement model were calculated to assess the correspondence between the language versions of the measures and are presented in Table 2. The intraclass correlations between English and Spanish versions of each scale exceeded .67, ranging from .68 (p < .01) for the ISS Activation scale to .97 (p < .01) for the IDD-L scale.1 The relatively small size of the current sample did not allow for extensive analyses of the performance of the Spanish versions of the instruments at the item level (e.g., confirmatory factor analysis). However, a preliminary comparison of item performance between the two language versions revealed only a few discrepancies.2
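The logic of an agreement-model intraclass correlation can be sketched as follows. The sketch assumes the two-way, absolute-agreement, single-measure form of the coefficient; the text does not state which specific form was used, so this choice, the function name `icc_agreement`, and the scores are assumptions made for illustration only.

```python
def icc_agreement(ratings):
    """Two-way, absolute-agreement, single-measure intraclass correlation.

    ratings: one row per participant, one column per language version.
    """
    n = len(ratings)     # participants
    k = len(ratings[0])  # versions (here, English and Spanish)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(col) / n for col in zip(*ratings)]

    # Sums of squares from a two-way layout without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Mean differences between versions (ms_cols) lower the coefficient.
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical scores: every participant scores 1 point higher in one version.
shifted = [[1, 2], [2, 3], [3, 4]]
print(f"ICC (agreement) = {icc_agreement(shifted):.2f}")
```

In this hypothetical example the two versions rank participants identically, yet the coefficient falls below 1 because the agreement model also penalizes a systematic mean difference between versions.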
As would be expected in a nonclinical sample, only 4.5% of participants endorsed symptoms of a current major depressive episode on the IDD. This result was the same for both the English and Spanish versions of the measure. A lifetime major depressive episode as assessed by IDD-L was reported by 18.2% of the sample on the Spanish IDD-L and 17% on the English IDD-L (one participant met criteria on the Spanish, but not the English IDD-L). Both rates are congruent with what would be expected for lifetime history of depression among a nonclinical sample. Only 3.4% of the sample obtained a score of 36 or higher on the HPS, which was the original cut-off used in concurrent validity studies by Eckblad and Chapman (1986).
Analyses suggested that the convergent validity in Spanish, as indicated by the correlation among related scales, mirrored the convergent validity among the English versions of the scales. As would be expected, English measures of lifetime risk for mania, current hypomanic symptoms, and positive affect were robustly correlated (see Table 3). The pattern of scale correlations for the Spanish versions closely mirrored the pattern of scale correlations for the English versions.
English measures of current depression and negative affect were also robustly correlated (see Table 4). Lifetime history of depression (IDD-L) was correlated with the IDD, but not with current depression on the ISS or the negative affect measure. This finding is not too surprising given the low rates of current depression among individuals with a lifetime history of depression. As before, the pattern of scale correlations for the Spanish versions closely mirrored the pattern of scale correlations for the English versions.
The goal of this study was to provide Spanish-language versions of self-report measures of manic symptoms, as well as measures of current and lifetime depression. Results of this study provided preliminary support for the Spanish versions of these measures. Specifically, the measures had acceptable internal consistency estimates; the Spanish versions of the measures were all comparable to their English counterparts, having high intraclass correlations between language versions. Although strong support emerged for the compatibility between the English and Spanish versions of the mania and depression measures, we address some issues and limitations that influence the interpretation of results.
Most importantly, we relied on a relatively small, undergraduate, and bilingual sample. This has several repercussions. First, the small sample size limited our ability to examine the factor structure of the measures. Second, it may be that individuals with more severe forms of psychopathology would describe their symptoms differently, and it will be important to assess how well measures cohere among clinical samples. Third, because of the relatively low levels of symptoms within this sample, one would expect restriction of range. This is most likely for rarer symptoms, including current symptoms (which should be endorsed at a lower rate than lifetime symptoms), and hypomanic symptoms (which occur in a much smaller proportion of the population). This restriction-of-range issue would be expected to artificially lower the magnitude of the correlation between scales. Fourth, the properties of the current measures need to be explored in other populations, in particular, among nonbilingual participants and among participants from Spanish-speaking regions other than those of the current sample. Future research may also benefit from considering the need to adapt the current measures to accommodate regional differences in language usage. Finally, language proficiency issues were poorly considered in this study. The current study did not include assessment of the language difficulty of measures, nor the language proficiency levels of participants. Moreover, our language-proficiency standard was quite minimal. It will be important for future studies to consider the extent to which language proficiency influences the psychometric properties of the measures. Beyond the limitations discussed so far, interpretation of the results is limited by the fact that participants knew they would complete the measures in two languages, which may have introduced bias into their responses.
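The effect of restriction of range on a correlation can be illustrated with a small simulation. The sketch below uses simulated standard-normal scores with an assumed population correlation of .60 and then retains only the lower-scoring half of the sample; the seed, sample size, correlation, and cutoff are all arbitrary assumptions, and none of the values correspond to data from the present study.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the simulation is repeatable
true_r = 0.6            # assumed population correlation between two scales

# Simulate pairs of scale scores with the assumed correlation.
pairs = []
for _ in range(2000):
    x = rng.gauss(0, 1)
    y = true_r * x + math.sqrt(1 - true_r**2) * rng.gauss(0, 1)
    pairs.append((x, y))

full_r = pearson_r(*zip(*pairs))

# Keep only simulated respondents scoring below the mean on the first scale,
# mimicking a sample in which elevated symptom levels are rare.
low_scorers = [(x, y) for x, y in pairs if x < 0]
restricted_r = pearson_r(*zip(*low_scorers))

print(f"full-range r = {full_r:.2f}, restricted-range r = {restricted_r:.2f}")
```

Under these assumptions the correlation in the restricted subsample is smaller than in the full simulated sample, which is the direction of bias described above.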
The availability of Spanish versions of commonly used measures of mania and depression is a necessary prerequisite to understanding cross-cultural differences in the course of bipolar disorder. Results in the current study provide preliminary support for the use of these measures in Spanish-speaking participants; however, before further endorsement can be given, investigation of the psychometric properties of these measures in clinical as well as nonbilingual samples will be necessary.3
We extend special thanks to Sandra Trifonovic, Daniela Malazzo, Wendy Vega, Iruma Bello, Amy E. Hutchings, and Jose Menendez, who assisted in the translation of measures and data collection.
1. The ISS Activation scale is the primary measure of current manic symptoms. However, a single-item measure of mania, “Today I feel manic,” was included. This item fared poorly, with a low intraclass correlation between the English and Spanish versions, ri = .32, p < .01.
2. For scales based on items with a continuous response format, we examined the item-scale correlations for each scale in the two language versions to help us identify items that were possibly performing differently across the two languages. With only one exception, all of the items across the scales had similar item-scale correlations for the two language versions and all were significantly greater than 0. The exception was Item 5 of the IDD, where the item-scale correlation was .40 for the English version but .59 for the Spanish version. The item-scale correlation for Item 13 of the ISS Activation scale was low, but this was true for both language versions (rEnglish = .34; rSpanish = .35) and suggested a difficulty with the original instrument for our sample, not the translation. Item-level performance for the HPS was explored by comparing the rates of endorsement for each item in the English and Spanish versions. Thirty-nine of the 48 items had similar (less than a 10-point difference) rates of endorsement between the Spanish and English versions. The rates of endorsement between the language versions of the remaining 9 items (Items 2, 9, 14, 18, 21, 23, 31, 44, and 47) differed by, at most, 18 percentage points.
3. The Spanish versions of the ISS and HPS described here can be downloaded directly from the Web at http://www.psy.miami.edu/faculty/sjohnson