Given the emphasis on modesty and self-effacement in Asian societies, the present study explored differential item responses for 2 positive affect items (5 = Hopeful and 8 = Happy) on a short form of the Center for Epidemiologic Studies-Depression scale. The samples consisted of elderly non-Hispanic Whites (n = 450), Korean Americans (n = 519), and Koreans (n = 2,030).
Multiple Indicator Multiple Cause models were estimated to identify the impact of group membership on responses to the positive affect items while controlling for the latent trait of depressive symptoms.
The data revealed that Koreans and Korean Americans were less likely than non-Hispanic Whites to endorse the positive affect items. Compared with Korean Americans who were more acculturated to mainstream American culture, those who were less acculturated were less likely to endorse the positive affect items.
Our findings support the notion that the way in which people endorse depressive symptoms is substantially influenced by cultural orientation. These findings call into question the common use of simple mean comparisons and a universal cutoff point across diverse cultural groups.
Since its initial development in the 1970s, the Center for Epidemiologic Studies-Depression scale (CES-D; Radloff, 1977) has been widely used as a screening tool for depression in both clinical and research settings. Although it was based on samples composed largely of European Americans, the CES-D has now been translated into more than 50 different languages. These versions of the CES-D have excellent psychometric properties, and the instrument is generally held in high regard by researchers around the world (Nezu, Nezu, McClure, & Zwick, 2002). The availability of multiple translations and the wide acceptance of the instrument have led to both the use of the CES-D in diverse cultural groups and concerns about its cross-group comparability or measurement equivalence (e.g., Crockett, Randall, Shen, Russell, & Driscoll, 2005).
Problems with measurement equivalence are evidenced by differential item functioning (DIF). DIF occurs when respondents from different groups show differing tendencies toward endorsing an item despite having been matched with respect to the overall ability or attribute that the item is intended to measure (Dorans & Holland, 1993; Zumbo, 1999). Studies using several analytic techniques have identified DIF in the CES-D across diverse groups. Two of the more common findings are that women are more likely than men to endorse the “crying spells” item (e.g., Cole, Kawachi, Maller, & Berkman, 2000; Stommel et al., 1993) and that Blacks are more likely than non-Hispanic Whites to endorse items reflecting problematic interpersonal relationships (“people are unfriendly” and “people disliked me”; e.g., Blazer, Landerman, Hays, Simonsick, & Saunders, 1998; Cole et al.; Kim, Chiriboga, & Jang, 2009). These findings of sex- and race-based DIF suggest that researchers should take caution when using the CES-D in cross-group comparisons because the observed mean differences may be attributable to item response bias rather than to true differences in mental health status (Mui, Kang, Chen, & Domanski, 2003).
The present study was designed to explore cultural bias in responses to the two positive affect items in a short form of the CES-D (Andresen, Malmgren, Carter, & Patrick, 1994). Most mental health screening instruments, including the CES-D, contain both positively and negatively worded questions or statements. Using a mix of positive and negative wording is one way to avoid agreement bias or acquiescence (e.g., Anastasi, 1982; Nunnally, 1978). However, recent studies have called attention to potential response biases associated with the phrasings of items (e.g., Chodosh, Buckwalter, Blazer, & Seeman, 2004; DiStefano & Motl, 2006; Quilty, Oakman, & Risko, 2006). It has been suggested that culture has a strong influence on the over- or under-endorsement of items phrased in certain ways (DiStefano & Motl; Iwata & Buka, 2002).
Response biases on positive affect items are a particular concern for researchers studying persons from Asian cultures, which often value modesty and self-effacement as cultural virtues and inhibit the expression of positive emotion (Iwata & Buka, 2002; Y. Jang, Kim, & Chiriboga, 2005). Previous studies on Korean populations have speculated that the reluctance of Koreans to respond positively to positive affect items might elevate overall CES-D scores within this population (e.g., Cho, Nam, & Suh, 1998; Y. Jang et al., 2005; Noh, Kaspar, & Chen, 1998; Young, Fogg, & Choi, 2002); however, this issue has not been fully explored.
The present study conducted DIF analysis with three large data sets of elderly non-Hispanic Whites, Korean Americans, and Koreans living in Korea, each of which included the 10-item CES-D. Although race/ethnicity is not synonymous with culture, we refer to the samples as cultural groups here for simplicity. Our first hypothesis was that the positive affect items would show DIF for the Korean Americans and Koreans relative to the non-Hispanic Whites. In addition, the sample of Korean Americans provided a unique opportunity to further explore the role of culture and acculturation in response patterns. Acculturation refers to the degree to which a person from another culture has learned the language and behaviors expected of persons who live in the host culture (Sam & Berry, 2006). People may also adopt the ways of thinking and expressing that are prevalent in the host culture (Iwata & Buka, 2002; Y. Jang et al., 2005). The literature suggests that people vary greatly in the degree to which they are acculturated (Sam & Berry, 2006). Thus, our second hypothesis was that DIF for the positive affect items would also be observed between Korean American older adults who were less acculturated to mainstream American culture and those who were highly acculturated.
The non-Hispanic White sample was drawn from the Survey of Older Floridians (SOF; Zayac, Salmon, & Chiriboga, 2005). Data were collected through computer-assisted telephone interviews between 2004 and 2005. Eligibility criteria included being age 65 or older, living in the community, and being free of mental status impairment (no more than five errors on the Short Portable Mental Status Questionnaire [SPMSQ; Pfeiffer, 1975]). The SOF, which used a statewide sampling frame complemented by an oversampling of racial/ethnic minorities, had a total sample of 1,433 respondents (non-Hispanic Whites, African Americans, non-Cuban Hispanics, and Cubans). Respondents were selected through random digit dialing. The overall response rate was 62%. More information on the SOF is available elsewhere (e.g., Y. Jang, Chiriboga, Kim, & Phillips, 2008; Zayac et al.). For the present study, only non-Hispanic Whites (n = 504) were considered.
The Mental Health Literacy among Korean American Elders project provided a sample of 675 community-dwelling Korean Americans aged 60 and older. Using multiple sample recruitment strategies, researchers collected data between March and August 2008 in Tampa and Orlando, Florida, using both face-to-face visits and mail surveys. The survey questionnaires, which were in Korean, were developed through a translation/back translation process that also included pilot testing with 20 Korean American older adults who were representative of the anticipated sample. Detailed information on the sampling procedures and validation of the multimethod recruitment strategy is available elsewhere (e.g., Y. Jang, Chiriboga, Allen, Kwak, & Haley, 2010).
Data on Koreans living in Korea were drawn from the baseline survey of the Korean Longitudinal Study of Ageing (KLoSA) conducted in 2006 by the Korean Ministry of Labor Institute. The KLoSA is a nationally representative sample of community-dwelling Koreans aged 45 and older selected via multistage stratified probability sampling. In-home face-to-face interviews were conducted by trained interviewers with 10,254 respondents. Response rates were 70.7% for households and 75.4% for individuals within the households. More detailed information on the KLoSA can be found elsewhere (e.g., S. Jang et al., 2009).
For the purpose of comparative analysis, only individuals aged 65 and older who had intact mental status were included in the present study. Because the SOF had an age requirement of 65 or older and because its participants had been screened with the SPMSQ prior to participating in the survey, no further reduction of this sample was necessary. In the Mental Health Literacy among Korean American Elders project, no formal screening for mental status was done; however, all participants had sufficiently intact mental status to understand and respond to the surveys. In the KLoSA, mental status was assessed with the Korean Mini-Mental State Examination (Kang, Na, & Han, 1997; see also Folstein, Folstein, & McHugh, 1975); respondents who scored less than 24 were excluded. After excluding respondents who were aged 60 to 64 years or who lacked complete information on the CES-D, the final sample consisted of 450 Whites, 519 Korean Americans, and 2,030 Koreans.
In all three data sets, the 10-item CES-D (Andresen et al., 1994; see also Radloff, 1977) was used. For Korean Americans and Koreans, the Korean language version was used. Items are rated on a 4-point scale that assesses the frequency of experiencing depressive symptoms during the past week (0 = rarely, 1 = some of the time, 2 = moderate amount of time, and 3 = all of the time). The scale includes two positively worded items (“I felt hopeful” and “I was happy”) and eight negatively worded items (e.g., “I felt depressed” and “I felt lonely”). The positive items were reverse coded, and all items were summed into a total score that could range from 0 (no depressive symptoms) to 30 (severe depressive symptoms). The Korean version of the CES-D was developed and its psychometric properties validated with various groups of Koreans residing in Korea and North America (e.g., Cho et al., 1998; Y. Jang et al., 2005; Noh et al., 1998). The internal consistency of the 10-item scale was satisfactory in the present samples (αs = .75 for Whites, .76 for Korean Americans, and .79 for Koreans).
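The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the function name and input layout are hypothetical, and the item positions (5 = Hopeful, 8 = Happy) are taken from the item numbering given in the abstract.

```python
from typing import Sequence

# 1-based positions of the two positively worded items (5 = Hopeful, 8 = Happy).
POSITIVE_ITEMS = (5, 8)
MAX_ITEM_SCORE = 3  # responses are coded 0 (rarely) to 3 (all of the time)


def score_cesd10(responses: Sequence[int]) -> int:
    """Return the 10-item CES-D total (0-30) after reverse coding positive items.

    Hypothetical helper for illustration; not the authors' scoring code.
    """
    if len(responses) != 10:
        raise ValueError("expected exactly 10 item responses")
    total = 0
    for position, value in enumerate(responses, start=1):
        if not 0 <= value <= MAX_ITEM_SCORE:
            raise ValueError(f"item {position} is outside the 0-3 range")
        if position in POSITIVE_ITEMS:
            # Reverse code: frequent positive affect counts as fewer symptoms.
            value = MAX_ITEM_SCORE - value
        total += value
    return total
```

Under this rule, a respondent who reports no negative symptoms but also answers "rarely" to both positive items still receives a total of 6, which is how reluctance to endorse positive affect can raise the total score.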
The Korean American sample completed a 12-item acculturation inventory (Y. Jang, Kim, Chiriboga, & King-Kallimanis, 2007) that was based on three existing instruments (i.e., Hazuda, Stern, & Haffner, 1988; Suinn, Ahuna, & Khoo, 1992; Ying, 1995). The inventory covered language use, media consumption, food consumption, social relations, sense of belonging, and familiarity with culture. Satisfactory psychometric properties have been found in studies with both older (e.g., Y. Jang et al., 2007) and younger (e.g., Y. Jang, Chiriboga, & Okazaki, 2009) Korean Americans. Each response was coded from 0 to 3. Total scores could range from 0 to 36, with a higher score indicating a greater level of acculturation to mainstream American culture. The internal consistency of the scale in the present sample of Korean Americans was high (α = .92).
All three data sets included information on age (in years), sex (0 = male, 1 = female), marital status (0 = married, 1 = unmarried), and educational attainment (0 = less than high school, 1 = high school or greater).
DIF analyses seek to determine the extent to which group membership itself affects the probability of endorsing particular items on a scale. DIF approaches assess the probability that an item response for one group will differ from that of another group when a common variable is held constant. In the present study, Multiple Indicator Multiple Cause (MIMIC) models were used to identify DIF between groups. MIMIC models can examine the direct effects of group membership on individual item responses with simultaneous factor analysis and regression of a latent trait on group differences while controlling for covariates (Muthén & Muthén, 2009). Because of their ability to control for the level of a latent trait, MIMIC models have been increasingly used as a method of detecting DIF (e.g., Gallo, Anthony, & Muthén, 1994; Grayson, Mackinnon, Jorm, Creasey, & Broe, 2000; Nuevo et al., 2009).
In the present analysis, a separate MIMIC model was tested for each of the two positive affect items. All 10 CES-D items were considered indicators of a latent trait (i.e., depressive symptoms). The differences in depressive symptoms between groups (γ0.1) were controlled along with a set of covariates: age (γ0.2), sex (γ0.3), marital status (γ0.4), and education (γ0.5). A significant direct effect of group membership (γ1.1) on the item would indicate the presence of group-based DIF, such that one group would be more or less likely to endorse the item than the other, even at the same overall level of depressive symptoms. For the first hypothesis, three comparisons were made (Korean Americans vs. Whites, Koreans vs. Whites, and Koreans vs. Korean Americans). For the second hypothesis, two subgroups of Korean Americans were selected to represent low and high acculturation, and these two groups were compared. All analyses were carried out using Mplus Version 5.21 (Muthén & Muthén, 2009).
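One common way to write a MIMIC model of this kind is sketched below, using the γ labels from the text. The remaining notation is assumed for illustration and does not come from the original: $y_{ij}$ is the response of person $i$ to item $j$, $\eta_i$ is the latent level of depressive symptoms, $g_i$ is the group indicator, $\lambda_j$ are factor loadings, $\zeta_i$ and $\varepsilon_{ij}$ are residuals, and $k$ is the positive affect item being tested. How the 4-point responses were treated in estimation is not specified here, so the measurement equations are written generically.

```latex
\begin{align}
\eta_i &= \gamma_{0.1}\, g_i + \gamma_{0.2}\,\mathrm{age}_i + \gamma_{0.3}\,\mathrm{sex}_i
          + \gamma_{0.4}\,\mathrm{marital}_i + \gamma_{0.5}\,\mathrm{education}_i + \zeta_i \\
y_{ij} &= \lambda_j\, \eta_i + \varepsilon_{ij}, \qquad j \neq k \\
y_{ik} &= \lambda_k\, \eta_i + \gamma_{1.1}\, g_i + \varepsilon_{ik}
\end{align}
```

In this form, $\gamma_{1.1} \neq 0$ means that group membership shifts the response to item $k$ over and above what the latent trait predicts, which is the definition of DIF used in the text.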
Mean ages were 74.6 (SD = 6.88) for Whites, 72.5 (SD = 5.99) for Korean Americans, and 70.9 (SD = 4.94) for Koreans. The proportion of women in each group was 63.3%, 53.4%, and 43.2%, respectively. More than half (53.1%) of the White sample were unmarried, whereas the corresponding figures for the Korean Americans and Koreans were less than 25%. A substantial difference was observed in educational attainment: Less than 5% of the Whites had less than a high school education compared with 30% of the Korean Americans and 70% of the Koreans.
Individual items of the CES-D were correlated, and inter-item correlations averaged .25 for non-Hispanic Whites, .29 for Korean Americans, and .36 for Koreans. Figure 1 shows the responses to individual items on the CES-D. It should be noted that the positive items are reverse coded and that greater scores on the horizontal axis represent a greater frequency of symptoms of depression for all items. Visual inspection indicated that in the White sample, the response pattern was consistent across all 10 items. However, in the Korean American sample and the Korean sample, a distinct pattern was observed for the two reverse-coded positive affect items. When we excluded the two positive items and recomputed the internal consistencies, the alpha coefficients improved noticeably for the Korean Americans (.76 to .83) and Koreans (.79 to .91). However, only a minimal change was observed in the White sample (.75 to .73).
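The comparison of alpha with and without the positive items can be illustrated with the standard Cronbach's alpha formula. The sketch below assumes item responses are stored as one list per respondent; the function names are hypothetical, and the source does not state which software or variance estimator was used for this step.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items table (sample variances)."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def drop_items(rows: Sequence[Sequence[float]], positions: Sequence[int]) -> List[List[float]]:
    """Return a copy of the table without the given 1-based item positions."""
    excluded = set(positions)
    return [
        [value for position, value in enumerate(row, start=1) if position not in excluded]
        for row in rows
    ]
```

With real data, `cronbach_alpha(rows)` and `cronbach_alpha(drop_items(rows, (5, 8)))` would give the 10-item and 8-item coefficients, respectively, assuming the positive items sit at positions 5 and 8.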
Figure 2 compares CES-D scores (total, negative item total, and positive item total) across the three groups. For the total scores, the Korean Americans had the highest score followed by the Koreans and the Whites. When only negative items were included, the Korean Americans again had the highest score followed by the Whites and the Koreans. When only positive items (reverse coded) were included, the Koreans ranked the highest followed by the Korean Americans and the Whites. The mean differences were all statistically significant, F(2, 2998) = 41.9, p < .001, for CES-D total, F(2, 2998) = 22.9, p < .001, for negative item total, and F(2, 2998) = 286.1, p < .001, for positive item total. It is also notable that positive item and negative item totals were positively correlated in the White sample (r = .38, p < .001) and the Korean American sample (r = .22, p < .001) but were not correlated in the Korean sample (r = .03, p > .05).
We separately tested MIMIC models for each of the positive affect items (5 = Hopeful and 8 = Happy) by group while controlling for the latent trait of depressive symptoms and a set of covariates (age, sex, marital status, and education). DIF was tested in Korean Americans compared with Whites (Comparison 1), Koreans compared with Whites (Comparison 2), and Koreans compared with Korean Americans (Comparison 3). Parameter estimates and indices of model fit for the MIMIC models are presented in Table 1.
All three comparison models showed moderate fit. Chi-square values were large, which was to be expected given the relatively large sample sizes. Considerations of residual covariance and correlations between latent and observed variables might have improved model fit. However, because the focus of the study was not on finding a desirable model fit, we did not follow modification procedures. Such practice is common in DIF analysis using MIMIC (e.g., Gallo et al., 1994).
Supporting our first hypothesis, the direct effect of cultural group membership (γ1.1) on positive items was significant in all models. Compared with Whites, both Korean Americans and Koreans were less likely to endorse each of the two positive affect items, even when the overall level of depressive symptoms and covariates were controlled. It is interesting to note that the same pattern of DIF was observed in the comparison between Koreans and Korean Americans, with the Koreans being less likely to endorse the positive items than were the Korean Americans.
To further explore the role of culture, we replicated the MIMIC models among Korean Americans of low acculturation (i.e., a score 1 SD less than the total sample mean on the acculturation measure, ≤8; N = 125) and high acculturation (i.e., a score 1 SD more than the total sample mean on the acculturation measure, ≥23; N = 134). As shown in Table 2, acculturation (γ1.1) had a significant direct effect on each of the two positive affect items when the level of depressive symptoms and the covariates were controlled. Supporting our second hypothesis, responses to the positive affect items were substantially influenced by individuals’ level of acculturation to mainstream American culture. Like Koreans living in Korea, Korean American older adults who were less acculturated were less likely to endorse positive emotions, whereas highly acculturated Korean Americans endorsed these emotions in a way similar to that of Whites.
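The subgroup selection described above (scores at least 1 SD below or above the sample mean) can be sketched as follows. The function name is hypothetical, and using the sample standard deviation is an assumption, since the text does not say which estimator produced the cutoffs of ≤8 and ≥23.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def split_by_acculturation(scores: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Return (low, high) groups: at least 1 SD below or above the sample mean."""
    center = mean(scores)
    spread = stdev(scores)  # sample SD; an assumption, see the note above
    low = [score for score in scores if score <= center - spread]
    high = [score for score in scores if score >= center + spread]
    return low, high
```

Respondents who fall within 1 SD of the mean are left out of both groups, which is why the two subgroups (125 and 134) cover only part of the 519 Korean Americans.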
Over the past decade, the issue of measurement equivalence or cultural invariance in responses to the CES-D has been examined for several racial and ethnic groups (e.g., Cole et al., 2000; Crockett et al., 2005; Kim et al., 2009). This attention has been prompted by the recognition that measurement equivalence is a fundamental issue in cross-cultural studies. Although limited by relatively small sample sizes and descriptive analytical strategies, a few studies have addressed cultural response bias in the CES-D in Asian populations (e.g., Iwata & Buka, 2002; Y. Jang et al., 2005; Young et al., 2002). The availability of large data sets of non-Hispanic Whites, Korean Americans, and Koreans that used the 10-item CES-D enabled us to perform an advanced DIF analysis focusing on the role of culture. The present analysis was strengthened by the further exploration of subsamples of Korean American immigrants with varying levels of acculturation to mainstream American culture.
A series of DIF analyses using MIMIC models provided evidence that supported our hypotheses. Our results indicated that both Korean Americans and Koreans were less likely to endorse positive affect items compared with Whites, but at the same time, Korean Americans were more likely to endorse these items than those in the Korean sample. Similarly, within the Korean American sample, the same reduced likelihood of endorsing positively worded items was observed among the less acculturated subgroup compared with the more acculturated subgroup.
Given that measurement invariance is a prerequisite for a valid cross-group comparison, our findings on DIF suggest that researchers should be cautious about using simple mean comparisons and a universal cutoff point. In the presence of DIF, such approaches may lead to inaccurate prevalence estimation and invalid group comparisons (Crockett et al., 2005). Previous studies have consistently reported relatively high levels of depressive symptoms in Korean populations (e.g., Y. Jang & Chiriboga, 2010; Kuo, 1984; Lee, Moon, & Knight, 2005; cf. Mui et al., 2003). Indeed, Korean Americans are often profiled as a high-risk group in mental health research because their rates of probable depression as indexed by standardized depression screening tools are up to four times greater than those of Whites or African Americans (Y. Jang & Chiriboga). Our findings suggest that the CES-D scores of Korean populations may be inflated because of their biased response patterns independent of mental health status. However, as shown in Figure 2, negative item total scores were still relatively high for the Korean American sample. Similarly, Kim and colleagues (in press) found that Korean Americans had the highest depressive affect scores among five ethnic groups of Asian Americans in the 2007 California Health Interview Survey, where the K6 (Kessler et al., 2002), an instrument that only includes items representing negative emotion, was employed.
Some limitations to the present study must be noted. The present comparative analyses used three data sets that were established as a result of independent research efforts. Therefore, some degree of discrepancy in study designs and measures is unavoidable. There were also procedural differences among the three studies (e.g., telephone survey/mail survey/in-person interview, 2- to 4-year differences in when the data were collected). Different mental status screening tools were used in the SOF and KLoSA, and there was no formal screening of the Korean American sample. Although the present study used the suggested cutoffs for the SPMSQ and Mini-Mental State Examination to exclude individuals with impairment in mental status, there is a lack of empirical evidence for the validity of these cutoffs in cross-cultural settings. The representativeness of the samples is another potential limitation. Although the KLoSA is a nationally representative sample and the SOF is a statewide probability sample of Florida, the Korean American sample was drawn from only two cities in Florida. Given this geographic restriction, the generalizability of the findings to whole populations may be limited. It should also be noted that we inferred differences in underlying cultural attributes across the groups in the absence of direct measures of cultural values (e.g., modesty, self-effacement). Future studies need to assess cultural values and their influence on response patterns. Because the current analysis was restricted to individuals aged 65 and older, future investigations should broaden the age range to include younger individuals and life-span perspectives. Given that neither culture nor historical period is static, longitudinal and cohort sequential assessment will also provide a needed perspective.
Despite these limitations, our findings clearly show that the way in which people respond to the CES-D varies by cultural orientation. Given that most research on depressive symptoms relies on self-report inventories and that cultural orientations may systematically bias responses to such inventories, caution should be exercised in interpreting the results of this research, particularly in the context of cross-group comparisons. Regrettably, there is at present no standard approach to correcting for DIF, which calls attention to the urgent need for more research in this area.
The Survey of Older Floridians was supported by the Administration on Aging (Grant No. 90AM2750; Jennifer Salmon, PhD, Principal Investigator). The Mental Health Literacy among Korean American Elders project was supported by the National Institute of Mental Health (Grant No. 1R21MH081094-01A1; Yuri Jang, PhD, Principal Investigator).
Data from the Korean Longitudinal Study of Ageing are publicly available at http://www.kli.re.kr/.