Most measures of stigma are illness-specific and do not allow for comparisons across conditions. As part of a study of health-related quality of life for people with neurological disorders, our team developed an instrument to assess the stigma for people with chronic illnesses.
We based item content on literature review, responses from focus groups, and cognitive interviews. We then administered the items to people with neurological disorders for psychometric testing.
Five hundred and eleven participants completed items of the stigma scale. Exploratory factor analysis produced 2 factors that were highly correlated (r = 0.81). Confirmatory factor analysis produced high standardized loadings on an overall stigma factor (0.68 to 0.94), with poorer loadings on the two sub-domains (−0.12 to 0.53). These results demonstrated a sufficiently unidimensional scale that corresponded with the bifactor model. Item response theory modeling suggested good model fit, and differential item functioning analyses indicated that the 24-item scale showed potential for measurement equivalence across conditions.
Our efforts produced a stigma scale that had promising psychometric properties. Further study can provide additional information about the SSCI and its benefit in measuring the impact of stigma across conditions.
In 1963, Erving Goffman defined stigma as “the situation of the individual who is disqualified from full social acceptance (Preface)”, and since his time, social scientists have studied stigma manifested as stereotypes, prejudice, and discrimination. Public stigma, or a negative attitude held by a community member, has consequences for people with stigmatizing conditions, such as loss of employment or social isolation. These consequences can be exacerbated by six factors: concealability, course, disruptiveness, aesthetic qualities, origin, and peril. Concealability refers to whether the condition is obvious or can be hidden, course refers to the severity and pattern of the condition over time, and disruptiveness refers to the degree of interference with usual patterns of social interaction. The term ‘aesthetic qualities’ refers to how much the condition upsets others by way of the five senses, origin refers to the perceived cause and degree of responsibility a person has for contracting the illness, and finally, peril refers to the amount of fear and danger associated with a person’s illness.
Self stigma, or the internalized cognitive, emotional, and behavioral impact of others’ negative attitudes on a person who possesses a devalued characteristic, has been associated with lowered self esteem, depression, anxiety, and decreased service utilization [6, 7]. Corrigan and colleagues described a theoretical model of self stigma as a process by which public attitudes lead to personal responses and ultimately, self stigmatization (Figure 1) [8, 9, 10]. First, a person with a stigmatizing condition experiences discrimination and becomes aware of negative stereotypes around his or her illness. The awareness of the stereotype is sometimes called felt or perceived stigma, and the actual experience of a discriminatory behavior, such as social exclusion, is called enacted stigma. In the final step of the process, the person concurs that negative stereotypes apply to them and then internalizes the stereotype. The internalization is termed self, or internalized, stigma, which then has negative consequences for the self (e.g., lowered self esteem).
Researchers have developed measures of stigma associated with a variety of medical and psychological conditions, including mental illness [12, 13], epilepsy, and HIV infection [14, 15]. Van Brakel (2006) identified similarities in the experience of health related stigma across conditions and pointed to a lack of psychometrically valid instruments designed to assess stigma across conditions. He recommended the development of a ‘generic’ measure of stigma in order to avoid duplication of effort across disciplines [16, 17].
As part of a larger study of health-related quality of life for people with neurological disorders, we developed an instrument that measures stigma for people across chronic illnesses. The study also developed item banks, scales, and short forms in areas such as cognition, social functioning, and fatigue. This report focuses on the development of the stigma instrument, its preliminary psychometric findings, and its use with a sample of people with neurological disorders. Neurological disorders are typically incurable and produce permanent disabilities that remain stable or progress unpredictably. In addition, individuals with neurological disorders can become dependent on others, need assistive devices to perform daily tasks, and have significant educational and occupational setbacks [18, 19, 20]. Thus, people with neurological disorders experience considerable stigma, and as such, we set out to examine the stigma scale with this population.
We used a multistep process to develop the stigma scale as part of the “Quality of Life in Neurological Disorders (NeuroQOL)” study. These methods were consistent with the National Institutes of Health Roadmap initiative, the Patient Reported Outcomes Measurement Information System (PROMIS), which prioritizes patient information to guide instrument development . We first conducted focus groups, reviewed the literature, examined items from existing instruments, developed the instrument, conducted cognitive interviews, refined items, and sent the items for psychometric testing.
Five separate focus groups were conducted with adults diagnosed with Alzheimer’s disease, ALS, Epilepsy, Parkinson’s disease, and Stroke, and two groups were conducted with adult patients with multiple sclerosis. Participants were recruited from medical clinics in Merced, California; Cleveland, Ohio; Columbia, Maryland; and Chicago, Illinois. These focus groups occurred between October 2005 and March 2006. Participants were included if they were 18 years of age or older, diagnosed with one of the six conditions, and per physician report, possessed the cognitive and physical ability to participate in a focus group.
The purpose of the focus groups was to query participants on salient aspects of health-related quality of life, not to ask about stigma specifically. Participants were asked open-ended questions about their quality of life and areas most affected by their illness and treatment. Recordings of discussions were transcribed and NVivo 2.0 (QSR International, 2002) was used to organize the qualitative data. The data were analyzed by coders trained in qualitative data analytic techniques, and coding disagreements were reconciled through discussion. A grounded theory approach guided the identification of themes, and incorporated the following techniques: coding, memo writing, and the constant comparative method .
We conducted a literature review in order to examine the wording and content of existing illness-specific measures of stigma. We examined measures of stigma associated with a number of conditions including physical distinctions resulting from severe burns, mental illness, epilepsy [24, 25, 26], HIV/AIDS, amyotrophic lateral sclerosis (ALS), multiple sclerosis [28, 29, 30], and Parkinson’s Disease (measures are listed in Table 1). We created an item pool using ‘binning’ and ‘winnowing’ techniques. We selected items from an existing library of Functional Assessment of Chronic Illness Therapy (FACIT) items and wrote additional items to correspond with patient-identified concerns. Each item was assigned to a primary ‘bin’, or area. For stigma, each bin corresponded with one of six dimensions of stigma: concealability, course, disruptiveness, aesthetic qualities, origin, and peril. Once items were assigned to bins, we “winnowed”, or systematically removed, items because of redundancy, vague or confusing language, poor translatability, or narrowness of coverage.
We conducted cognitive interviews with draft stigma items to help ensure that items would be understood as intended . Research assistants received training in cognitive interviewing, and once trained, they queried participants on the language, comprehensibility, and relevance of the items. The interviews were audio-taped and transcribed verbatim.
We gathered a team of experts to analyze the content of responses  and revise items in a series of discussions. The expert reviewers were neurology professionals, chosen based on reputation via publications and presentations, and outcomes measurement specialists from the fields of health psychology, rehabilitation medicine, psychometrics, and cross-cultural translation. The neurology and outcomes measurement experts reviewed item content and wording independently. In addition, our team scrutinized response categories (1=Never, 2=Rarely, 3=Sometimes, 4=Often, 5=Always) and the context of the questionnaire ‘the past 7 days’ versus ‘lately’.
We recruited patients with Stroke, Multiple Sclerosis, Parkinson’s Disease, Epilepsy, and ALS via an online internet panel. The participants completed a set of study measures, including a socio-demographic questionnaire and the stigma items. The participants provided online informed consent before completing study measures. We were concerned about recruiting adequate numbers of people with particular neurological disorders. Thus, we prioritized recruitment of people with specific neurological disorders, and stratified the sample by neurological disorders but not other characteristics, such as racial/ethnic background. Online data collection enabled us to recruit participants in adequate numbers from across the United States. The panel testing company used illness-stratified random sampling from their registered panel of over one million members. Participants were members of the online testing company who completed questionnaires regularly for incentives provided by the company.
We examined Cronbach’s alpha and item-total correlations of the stigma item bank. We used exploratory factor analysis (EFA) on the total item set, and then conducted single factor and bifactor confirmatory factor analysis (CFA) using Mplus 4.1 (Mplus, 2006) to assess dimensionality. The EFA and CFA used polychoric correlation coefficients, considered robust with ordinal item responses. CFA was conducted using data from participants who had complete data (no missing item responses). Goodness of fit was examined for each set of items using the comparative-fit index (CFI) and the root-mean-square error of approximation (RMSEA). The criteria for good (or acceptable) model fit were: CFI values greater than 0.95 (0.90) and RMSEA values less than 0.05 (0.08). We then fitted the items with the graded response model, which assumes ordinal item responses, using the MULTILOG software program. All items were examined for satisfactory model fit using the S-X² statistic. Local dependence, or item redundancy, was identified using a standard cutoff supported by our previous work (i.e., residual correlation ≥ 0.20).
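The internal-consistency statistics named in this paragraph can be illustrated with a short sketch. This is not the authors' analysis code: the function names and the toy response matrix are hypothetical, and the item-total correlation shown is the corrected form (each item against the sum of the remaining items), which is an assumption because the text does not state which variant was used.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items matrix of item scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # sample variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of the summed score
    return k / (k - 1) * (1.0 - item_vars.sum() / total_var)

def corrected_item_total(scores):
    """Correlation of each item with the sum of all OTHER items (assumed variant)."""
    scores = np.asarray(scores, dtype=float)
    total = scores.sum(axis=1)
    correlations = []
    for j in range(scores.shape[1]):
        rest = total - scores[:, j]              # total score without item j
        correlations.append(np.corrcoef(scores[:, j], rest)[0, 1])
    return np.array(correlations)

# Hypothetical toy data: 5 respondents, 4 items on the 1-5 response scale.
toy = np.array([
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 4, 3, 3],
    [4, 4, 5, 4],
    [5, 5, 4, 5],
])

alpha = cronbach_alpha(toy)
item_total = corrected_item_total(toy)
print(round(alpha, 3), np.round(item_total, 3))
```

With real data, the same two functions would be applied to the full respondents-by-items matrix after excluding respondents with missing item responses.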
To assess convergent and divergent validity, we conducted three ANOVAs to examine the relationship between total scores on the finalized scale and (1) a self-rating of psychological distress (“Please indicate the statement that best describes your current level of anxiety/depression: 1= not anxious/depressed, 2=moderately anxious/depressed, 3= extremely anxious/depressed”), (2) patient rated performance status (“Please indicate which statement below best describes your current activity level: 0=normal activity, 1=some symptoms, but no bed rest during day, 2=bed rest for < 50% of day, 3= bed rest for > 50% of day, 4=unable to get out of bed”), and (3) a self-rating of pain (“Please indicate the statement that best describes your current level of pain/discomfort: 1=no pain/discomfort, 2=moderate pain/discomfort, 3=extreme pain/discomfort”).
Several studies have demonstrated that depression and anxiety are associated with stigma [37, 38], and thus we examined the relationship between stigma and the combined construct we called ‘psychological distress’ to demonstrate convergent validity. In addition, patient rated performance status has been used extensively in health-related quality of life studies as a measure of general functional status reflecting the course of illness, and thus, we set out to examine the relationship between stigma and performance status. Furthermore, pain has not been associated with stigma, and we examined the relationship between pain and stigma to demonstrate divergent validity. Cohen considered 0.40 to be a large effect size and 0.10 to be a small effect size for a one-way ANOVA. Given this information, we expected that an analysis of stigma and psychological distress would produce a large effect size (greater than 0.40) and performance status would be moderately related to stigma and produce an effect size less than 0.40. Although pain could theoretically be related to stigma, we expected an analysis of pain and stigma to produce a small effect size (less than 0.10).
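The effect-size benchmarks quoted above (0.10 small, 0.40 large) match Cohen's f for a one-way ANOVA. The text does not name the statistic, so treating it as Cohen's f is an assumption. The sketch below shows how f can be computed from grouped scores; the function name and the toy groups are hypothetical.

```python
import numpy as np

def cohens_f(groups):
    """Cohen's f for a one-way ANOVA: sqrt(eta^2 / (1 - eta^2))."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()
    # Between-group sum of squares: group size times squared mean deviation.
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    # Total sum of squares around the grand mean.
    ss_total = ((all_values - grand_mean) ** 2).sum()
    eta_squared = ss_between / ss_total
    return float(np.sqrt(eta_squared / (1.0 - eta_squared)))

# Hypothetical total scores for three levels of a grouping variable.
toy_groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
effect = cohens_f(toy_groups)
print(effect)
```

In this framing, the grouping variable would be the distress, performance status, or pain rating, and the scores within each group would be the scale totals.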
In post-hoc analyses, we evaluated measurement equivalence (or differential item functioning: DIF) between responses from participants with epilepsy and stroke. These groups were chosen because numbers of participants in each group were large enough to allow for DIF analyses to be conducted. DIF was evaluated in order to examine the differing probabilities of item endorsement in selected groups of people. A nonparametric DIF detection technique was used, based on the Mantel-Haenszel and Liu-Agresti cumulative common log odds ratio statistics as implemented in the DIFAS software. We examined the log odds ratio z statistic (z) in order to determine the magnitude and direction of DIF across item responses from people with epilepsy and stroke.
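As a rough illustration of the idea behind these statistics, the sketch below computes the Mantel-Haenszel common log odds ratio for a single dichotomous item, with respondents matched on a total score. This is a deliberate simplification: the study's items are polytomous and were analyzed with the Liu-Agresti cumulative extension in DIFAS, which is not reproduced here. The function name, the 0/1 group coding, and the toy data are all hypothetical.

```python
import numpy as np

def mh_log_odds_ratio(item, group, matching_score):
    """Mantel-Haenszel common log odds ratio for one dichotomous (0/1) item.

    group: 0 = reference group, 1 = focal group (hypothetical coding).
    matching_score: score used to stratify respondents (e.g. a total score).
    A value near 0 suggests no DIF; positive values mean the reference group
    is more likely to endorse the item at matched score levels.
    """
    item = np.asarray(item)
    group = np.asarray(group)
    matching_score = np.asarray(matching_score)
    numerator = 0.0
    denominator = 0.0
    for s in np.unique(matching_score):
        in_stratum = matching_score == s
        a = np.sum(in_stratum & (group == 0) & (item == 1))  # reference, endorsed
        b = np.sum(in_stratum & (group == 0) & (item == 0))  # reference, not endorsed
        c = np.sum(in_stratum & (group == 1) & (item == 1))  # focal, endorsed
        d = np.sum(in_stratum & (group == 1) & (item == 0))  # focal, not endorsed
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return float(np.log(numerator / denominator))

# Toy data with two score strata and identical endorsement rates in both groups.
item = np.array([1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0])
group = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1])
score = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
no_dif = mh_log_odds_ratio(item, group, score)
print(no_dif)
```

A z statistic such as the one reported in the study divides a log odds ratio estimate by its standard error; the standard error is omitted from this sketch.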
Results from the focus groups pertaining to stigma are summarized here, and full results are reported elsewhere . Each group was comprised of 8 patients for a total of 56 participants (sample characteristics are reported in Table 2). The concept of stigma emerged spontaneously in the patient focus groups. For example, the adult epilepsy group mentioned stigma on two occasions (4.26% of all coded responses from this group), particularly as it related to having seizures in public. During the multiple sclerosis groups, comments pertaining to stigma were mentioned in relation to looking as though intoxicated when having difficulty walking in public (10 instances, 2.48%) and when others questioned their need for assistive devices (11 instances, 2.73%). In the Parkinson’s disease groups, participants reported stigma (11 instances, 9.91%) when family and friends avoided them and doubted the legitimacy of their status. Stroke participants endorsed their experiences with stigma (10 instances, 4.46%) in relation to slurred speech and difficulty walking.
We developed a 33-item pool, and 17 participants with epilepsy, multiple sclerosis, ALS, Parkinson’s disease, and stroke responded to these items in cognitive interviews (sample characteristics are provided in Table 2). Participants commented in cognitive interviews that the draft item “I am unpredictable” was inapplicable to their situation, and so the item was dropped. Item wordings were further revised based on participant feedback and expert review. For example, the word ‘condition’ that appeared in the original item pool was changed to ‘illness’ based on its potential for easier translation to Spanish. The response categories were generally well understood by patients. However, the item context ‘the past 7 days’ was modified to ‘lately’, as many of the items appeared to be more stable over short periods of time. After the participant and expert item review, 26 items remained.
Five hundred and eleven patients with neurological disorders completed the 26-item bank via the online internet panel. Fifty-three percent of the respondents were male. In terms of race and ethnicity, 5.3% self-reported as having Hispanic/Latino ethnicity and 95% reported that they were European-American on a separate question about race. Participants had a self-reported diagnosis of stroke, multiple sclerosis, Parkinson’s disease, epilepsy, or ALS. Details of the socio-demographic and clinical characteristics of the sample are in Table 2.
Cronbach’s alpha was 0.97 for the 26-item bank. All items had item-total correlations greater than 0.50, but item analyses showed the item distributions were skewed. Accordingly, polychoric correlations were used in our EFA and CFA, and this method is known to be more robust with non-normally distributed and ordinal data. In addition, the unweighted least squares (ULS) estimator was used for EFA, and weighted least squares with adjustment for means and variances (WLSMV) estimator was used for CFA.
When EFA results were interpreted using the Cattell (1966) method (number of factors before a break in the scree plot), the analysis produced one factor, whereas using the Kaiser (1960) method (number of factors with eigenvalues greater than 1), the EFA produced three factors with eigenvalues of 17.78, 1.44, and 1.12. We then used QUARTIMIN rotation to minimize the number of factors and allow the factors to be correlated. Thirteen items loaded onto a factor measuring ‘self/internalized stigma’, and 11 items loaded onto a second factor measuring ‘enacted stigma.’ Two items, “I was careful who I told that I have this illness” and “I worried that people who know I have this illness will tell others”, loaded onto a third factor. We dropped the items on this third factor because they were applicable only to conditions that are concealable, and our goal was to develop a ‘generic’ stigma scale. After the two items were dropped, we completed the EFA again. Using the Kaiser (1960) method of interpretation, a 2-factor solution was suggested by eigenvalues of 16.90 and 1.12, accounting for 70% of the variance. Eleven items loaded onto the ‘enacted stigma’ factor and 13 items onto the ‘self stigma’ factor. The two factors were highly correlated (r = 0.81).
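The Kaiser rule described above amounts to counting the eigenvalues of the item correlation matrix that exceed 1. The sketch below illustrates the rule on a hypothetical toy correlation matrix and on the leading eigenvalues reported in the text. It takes any correlation matrix as input; the study used polychoric correlations, which this sketch does not compute, and the function name is hypothetical.

```python
import numpy as np

def kaiser_factor_count(corr):
    """Return (number of eigenvalues > 1, eigenvalues sorted descending)."""
    eigenvalues = np.linalg.eigvalsh(np.asarray(corr, dtype=float))
    eigenvalues = eigenvalues[::-1]  # eigvalsh returns ascending order
    return int(np.sum(eigenvalues > 1.0)), eigenvalues

# Hypothetical correlation matrix: two uncorrelated pairs of items,
# each pair correlated at 0.8.
two_block = np.array([
    [1.0, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.8],
    [0.0, 0.0, 0.8, 1.0],
])
n_factors, eigenvalues = kaiser_factor_count(two_block)

# Applying the same rule to the leading eigenvalues reported in the text.
first_pass = sum(ev > 1.0 for ev in [17.78, 1.44, 1.12])   # 26 items
second_pass = sum(ev > 1.0 for ev in [16.90, 1.12])        # 24 items
print(n_factors, first_pass, second_pass)
```

The scree (Cattell) approach, by contrast, relies on visual inspection of where the sorted eigenvalues level off, so it is not reduced to a single threshold here.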
Single factor and bifactor Confirmatory Factor Analyses were completed on the 24-item pool. For single factor CFA, RMSEA was 0.131 and CFI was 0.905. Two items had a tendency towards local dependence (r = 0.18): “People with my illness lost jobs when their employers found out about it” and “I lost friends by telling them that I have this illness,” and with the items “Because of my illness, people were unkind to me” and “Because of my illness, people made fun of me.” We retained these items because they did not meet the 0.20 cutoff, and because the items appeared to measure different facets of enacted stigma (i.e., employment and social relationships).
Then, a bifactor CFA was run because the two factors were highly correlated and we believed that an underlying general factor of stigma dominated. For this analysis, the RMSEA was 0.096 and CFI was 0.939. These fit statistics did not support good model fit. However, McDonald (1999) suggested that sufficient unidimensionality can be interpreted by examining standardized loadings for bifactor analyses . Lai, Crane, and Cella (2006) demonstrated that even with poor CFA fit statistics, items of a fatigue scale fit a bifactor model by examining factor loadings . Accordingly, the bifactor CFA indicated that standardized loadings on an overall stigma factor ranged from 0.68 to 0.94. Within the same model, items from the self and enacted stigma sub-domains had poorer standardized loadings; on the self stigma sub-domain, loadings ranged from −0.12 to 0.43 and on the enacted stigma sub-domain, loadings ranged from 0.17 to 0.53. The dominant general factor explained the residual correlations among the 2 subscales . Furthermore, the examination of factor loadings suggested that a dominant general stigma factor was sufficiently unidimensional for analyses requiring that the underlying construct be unidimensional, such as item response theory (IRT) analyses.
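To make the comparison with the stated fit criteria concrete, the sketch below rates each index separately against the cutoffs given in the methods (CFI greater than 0.95 or 0.90; RMSEA less than 0.05 or 0.08). The function names and the label "poor" for values meeting neither cutoff are illustrative assumptions, not terminology from the study.

```python
def rate_cfi(cfi):
    """Rate a CFI value against the cutoffs stated in the methods."""
    if cfi > 0.95:
        return "good"
    if cfi > 0.90:
        return "acceptable"
    return "poor"  # assumed label for values meeting neither cutoff

def rate_rmsea(rmsea):
    """Rate an RMSEA value against the cutoffs stated in the methods."""
    if rmsea < 0.05:
        return "good"
    if rmsea < 0.08:
        return "acceptable"
    return "poor"  # assumed label for values meeting neither cutoff

# Fit statistics reported in the text for the two CFA models.
reported = {
    "single factor": {"cfi": 0.905, "rmsea": 0.131},
    "bifactor": {"cfi": 0.939, "rmsea": 0.096},
}
ratings = {
    model: (rate_cfi(fit["cfi"]), rate_rmsea(fit["rmsea"]))
    for model, fit in reported.items()
}
print(ratings)
```

Under these cutoffs, both models reach the acceptable range on CFI but not on RMSEA, which is consistent with the reliance on standardized loadings rather than fit indices alone.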
The IRT analyses indicated that items of the stigma scale fit the latent unidimensional construct of stigma. Item parameters from this analysis are shown in Table 3. Figure 2 shows the information function and trait distribution for the stigma items. The bimodal trait distribution showed that two groups of people with different levels of stigma severity were sampled. The information function suggested that the items measured the construct at moderate to severe levels of stigma, implying that the items would lack precision if used with people experiencing less severe stigma. The instrument is best used with participants whose stigma severity levels are located where the vertical marks are placed at the bottom of the figure.
Our analyses of convergent and divergent validity were conducted on the total score (sum of 24 items) of the scale we called the Stigma Scale for Chronic Illness (SSCI). The mean total score on the SSCI was 42.7 (standard deviation = 19.7). Details on the frequency of responses for psychological distress, performance status, and pain are listed in Table 2. The first ANOVA indicated a strong relationship between the total score on the SSCI (dependent variable) and psychological distress (independent variable), producing an effect size of 0.58. The second ANOVA, which analyzed the relationship between the SSCI (dependent variable) and performance status (independent variable), produced a smaller effect size of 0.47. Lastly, the ANOVA with pain (independent variable) and the SSCI (dependent variable) produced an even smaller effect size of 0.36.
We examined DIF on a subset of the responses obtained in order to understand the scale’s performance across conditions, in this case epilepsy (N = 165) and stroke (N = 190). Extreme categories (“Often” or “Always”) were rarely indicated in this sample, and so the response categories with fewer than five observations per group were collapsed with lower categories. A total of 16 items had one or more categories collapsed. Four items of the stigma scale demonstrated DIF: “Because of my illness, I felt embarrassed in social situations” (z = +2.00), “Because of my illness, people avoided looking at me” (z = +2.11), “Because of my illness, people tended to ignore my good points” (z = −2.35), and “Because of my illness, I felt different from others” (z = −2.11). The positive and negative z statistics indicated that the stroke sample tended to respond to the first two items (‘embarrassed in social situations’ and ‘avoided looking at me’) with more severe stigma than the epilepsy sample, and the epilepsy sample tended to respond with more severe stigma than the stroke sample on the last two items (‘good points’ and ‘different from others’), at the same levels of the underlying trait. Figure 3 depicts the expected item score as a function of the logit transformed total score, corrected for overlap, separately for people with epilepsy and stroke. Although DIF was observed on these 4 items, the overall impact is likely to be small because the magnitude of DIF, as shown graphically and by the Root Mean Square Deviation values in Figure 3, was small and the directions of DIF were balanced. The effects of the DIF would likely cancel out at the test level.
Our multistep process for measurement development resulted in the 24-item Stigma Scale for Chronic Illness (SSCI). The SSCI demonstrated essential unidimensionality while measuring the theoretically-supported areas of self and enacted stigma. The scale had good internal consistency, convergent validity, and IRT model fit. In addition, DIF analyses provided preliminary evidence of measurement equivalence across the neurological conditions of epilepsy and stroke. Further psychometric testing will strengthen the validation of this instrument, and comparisons across other neurological and non-neurological conditions will help evaluate the generalizability of the SSCI to other chronic conditions.
Our approach was unique in that it began with input on stigma from people with chronic illnesses. Focus group participants underscored the impact of stigma on quality of life, and this input then guided item development. The items underwent rigorous psychometric analyses, and results showed that the SSCI conformed to the bifactor model, demonstrating sufficient unidimensionality. Although we built the scale informed by six dimensions of stigma, our factor analytic results supported the more parsimonious bifactor model. Therefore, our analyses suggested that the SSCI can be examined as one total or as two subscales of enacted and self/internalized stigma. The resulting enacted and self/internalized stigma subscales locate the SSCI within the theoretical model depicted in Figure 1. Corrigan and colleagues (2002) developed the theoretical model, but further study with the SSCI can provide more information about the relationships between enacted and self/internalized stigma.
The scale demonstrated stronger than expected associations with psychological distress, performance status, and pain. It is possible that an unmeasured factor, severity of illness, may be driving the stronger than expected associations between these variables. Regardless, the SSCI did not demonstrate adequate divergent validity when compared to pain scores. Further study of the scale’s divergence from constructs unrelated to stigma should be undertaken.
The SSCI demonstrated high internal consistency, suggesting that items within the scale could be similar and the scale could be shortened without impacting its psychometric properties. Furthermore, our IRT analyses showed that people with two distinct stigma levels participated in our study, indicating that across neurological conditions, our participants reported varying severity of stigma. The results showed that the scale might not be well utilized as a screening measure to distinguish between those who do and do not experience stigma. Instead, the SSCI would be best used to measure stigma at moderate to severe levels.
Finally, results from the DIF analysis suggested that the measure has good potential to be used across conditions with minimal bias. However, the DIF analyses were conducted with small samples of people with epilepsy and stroke. Although response categories were collapsed in these analyses, requiring fewer location parameters to be estimated per item, DIF studies with larger samples can more fully determine the benefit of using the measure across conditions.
The study had some limitations. First, the participants from online panels lacked ethnic/racial diversity; 95% were of European-American racial background. Therefore, caution should be taken in generalizing these results to other ethnic/racial groups, where stigma may be compounded by social differences, disadvantages, or discrimination. In addition, the item distributions were skewed, and although our EFA and CFA analyses used polychoric correlations to account for this non-normality, the IRT parameter estimates could be less stable than their associated standard error estimates might indicate.
Overall, we developed the SSCI to be useful in better understanding the impact of stigma on people across chronic illnesses and the effectiveness of stigma reduction interventions. Future study of the SSCI can provide more information on the instrument’s psychometric properties and further evidence about its use in comparing stigma across conditions. More study is needed, but the SSCI demonstrated potential to measure important aspects of stigma.
This study was funded by the National Institute of Neurological Disorders and Stroke (NINDS) contract number HHSN265200423601C. Deepa Rao is supported by a career development award funded by the National Institute of Mental Health (NIMH: grant number K23 MH 084551). The authors would like to thank Claudia S. Moy for assistance in making this study possible, Paul K. Crane for his psychometric guidance, and Patrick W. Corrigan and Nicolas Rüsch for providing comments on earlier drafts of this paper.