This article compares Positive and Negative Syndrome Scale (PANSS) data from Chinese and American inpatients with chronic schizophrenia to show how differences in item ratings may reflect cultural attitudes of raters. The Chinese sample (N=504) came from Beijing Huilongguan Hospital. The American sample came from 268 PANSS assessments of CATIE subjects hospitalized for 15 days or more, a restriction chosen to optimize equivalence of the samples. Controlling for age and gender, the Chinese sample scored significantly lower on the total score by 25% (p<.0001), on the positive sub-scale by 35% (p<.0001), and on the general sub-scale by 32% (p<.0001), but did not differ significantly on the negative sub-scale score (+0.26%, p=0.76). However, the Chinese sample scored 26% higher on the item on poor rapport (p<.0001), 10.2% higher on passive social withdrawal (p=.003), and most notably 46% higher on the item on lack of judgment and insight (p<.0001). These results remain broadly consistent across gender sub-group analyses. Differences seem best explained both by cultural differences in patient clinical presentations and by differing American and Chinese cultural values affecting rater judgment.
Cross-cultural psychiatric researchers have struggled to investigate mental illness across cultures. Cultures possess distinct norms of normal and abnormal behavior, influencing symptom patterns and raising questions about the homogeneity of illnesses across societies (Devereux, 2000). Psychiatrists have typically assumed that emotions are expressed and can be measured in similar ways across populations, although ethnographers have observed cross-cultural variations in their manifestation, apprehension, and sociolinguistic expression (Abu-Lughod and Lutz, 1990). These considerations have guided research in cultural psychiatry on the universality or specificity of psychopathology, the challenges in providing services in inter-cultural patient-physician interactions, and methods of assessment in the absence of universal pathogenic biomarkers (Kirmayer, 2007).
Instrument translation exemplifies dilemmas in cultural psychiatry. Questionnaires from one setting that do not consider idioms of distress or different symptom manifestations in another setting may neglect cultural influences on the manifestation or assessment of illness (Kleinman, 1988). In cultures without medical vocabularies to express psychological states, researchers must decide between strict lexical and broad conceptual translations (Aggarwal, 2007). In addition, questions from European languages may need considerable rethinking in Asian or African languages with different grammatical structures (Ertan and Eker, 2000). Instruments can only be applied cross culturally with appreciation that illnesses and their assessment may vary across cultures and that even well translated instruments may be applied and interpreted differently.
This article confronts these challenges by analyzing data from the widely used Positive and Negative Syndrome Scale (PANSS) among Chinese and American inpatients with schizophrenia over the past decade. The PANSS includes 30 items that generate a total symptom severity score and three sub-scales of mutually exclusive items measuring positive symptoms (e.g. hallucinations, delusions, hostility), negative symptoms (e.g. blunted affect, social withdrawal or poor rapport) and general symptoms of psychological distress (e.g. anxiety, depression, lack of insight) (Kay et al., 1987), though alternative subscale configurations are possible. Since its publication in 1987, the PANSS has been translated into many languages, including Swedish (von Knorring and Lindström, 1992), Japanese (Igarashi et al., 1998), French (Lançon et al., 1999), and Thai (Nilchaikovit et al., 2000). The Chinese translation of the PANSS appeared in 1991 after a four-center study confirmed its inter-rater reliability, short-term test-retest reliability, and internal consistency, with principal components analysis validating the separateness of positive and negative factors (Phillips et al., 1991). The authors asked the publisher of the PANSS, Multi-Health Systems, for a full list of translations, but none had been provided at the time of this writing.
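As a minimal sketch of how the total and subscale scores are assembled, the snippet below assumes the standard PANSS layout, which this article does not spell out: 7 positive items (P1–P7), 7 negative items (N1–N7), and 16 general items (G1–G16), each rated from 1 to 7. The function name `score_panss` and the item keys are illustrative, not part of any official scoring software.

```python
# Assumed standard PANSS layout (not stated in the article itself):
# 7 positive, 7 negative and 16 general items, each rated 1-7.
SUBSCALES = {
    "positive": [f"P{i}" for i in range(1, 8)],
    "negative": [f"N{i}" for i in range(1, 8)],
    "general": [f"G{i}" for i in range(1, 17)],
}


def score_panss(ratings):
    """Return subscale and total scores from a dict of item -> rating."""
    expected = {item for items in SUBSCALES.values() for item in items}
    if set(ratings) != expected:
        raise ValueError("ratings must cover exactly the 30 PANSS items")
    if any(not 1 <= r <= 7 for r in ratings.values()):
        raise ValueError("each item is rated from 1 to 7")
    # Subscales are mutually exclusive, so the total is simply their sum.
    scores = {
        name: sum(ratings[item] for item in items)
        for name, items in SUBSCALES.items()
    }
    scores["total"] = sum(scores.values())
    return scores
```

Because every item belongs to exactly one subscale, a lower total can coexist with a higher rating on a single item, which is the pattern the item-level comparisons in this article look for.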
Few studies have analyzed cross-cultural PANSS data. International, multicenter clinical medication trials typically randomize or stratify PANSS data by treatment group since their interest is drug efficacy (Beasley et al., 1997; Costa e Silva et al., 2001; Findling et al., 2008; Garcia et al., 2009; Kane et al., 2006; Kasper et al., 2003; Lasser et al., 2004; Lee et al., 2002; Olié et al., 2002; Ryckmans et al., 2009; van Os et al., 2004), with little interest in differences in PANSS responses between countries, ethnicities, or languages. In a previous study, Aggarwal and colleagues (2011) found substantial differences between psychiatric inpatients from four hospitals in Changsha, China and inpatients in the US CATIE schizophrenia trial (Lieberman et al., 2005). Generally, Chinese scores were substantially lower than American scores, but Chinese ratings were more than 20% higher on items such as hostility, disturbance of volition, lack of insight and poor impulse control, which the authors attributed to rater differences in the application of interpersonal cultural norms emphasizing harmonious relationships and adherence to authority in China. In another rare exception, Mechri et al. (2009) organized PANSS data according to subscale and total scores for patients with schizophrenia from France and Tunisia. They found that neurological soft signs were associated with the PANSS disorganization sub-scale in the French sample, but with the PANSS negative sub-scale in the Tunisian sample. Most studies using the PANSS focus on treatment comparisons, assuming that cross-cultural impacts are negligible.
This study compares differences among inpatients in the CATIE study and a Beijing sample on total and subscale scores. Individual items are identified that differ by sample in the total PANSS scale and subscales. Assuming that schizophrenia manifests similarly in China and the US and that translation from English to Chinese is reasonably equivalent, variant item ratings countering broad trends may reflect cultural differences in clinical presentations, patient responses, or PANSS application by raters.
The Chinese sample came from hospitalized inpatients in Beijing (N=504) and included age, gender, days of hospitalization, and PANSS individual item ratings. The study was conducted at the Beijing Hui Long Guan Hospital and HeBei Province Veteran Psychiatric Hospital. Beijing Hui Long Guan Hospital is one of the largest psychiatric hospitals in China with more than 1300 inpatient beds and is located 30 km from central Beijing. This hospital serves a catchment population of 13 million people. These patients had a duration of illness averaging 24.4 ± 9.6 years with 10.0 ± 9.5 years of hospitalization.
This hospital also has an outpatient setting, which treats individuals with mild to moderate psychosis and follows up patients after an inpatient stay. Patients with moderate to severe symptoms [generally with Clinical Global Impression (CGI) ≥4], first-episode psychosis, or acute exacerbations are hospitalized. For first-episode patients, the length of hospitalization is short, generally 2–6 months, due to good response to antipsychotic treatment. Commonly used antipsychotics include chlorpromazine, haloperidol or perphenazine among first-generation antipsychotics and risperidone, clozapine, olanzapine or aripiprazole among second-generation antipsychotics. The study was performed between December 2006 and May 2007.
In the Chinese sample, all patients with schizophrenia were of the chronic subtype, with a duration of illness of at least 5 years. The recruitment criteria included: 1) age 16–75 years; 2) Han Chinese ethnicity; 3) confirmed DSM-IV diagnosis of schizophrenia; 4) at least 5 years of illness; 5) stable prescriptions of oral antipsychotic drugs for at least 6 months before entry into the study; 6) written informed consent and ability to take part in neuropsychological assessment. Patients with medical or psychiatric co-morbidities and those needing other medications were included.
The US sample was derived from the Clinical Antipsychotic Trials of Intervention Effectiveness (CATIE), a large, NIMH-funded antipsychotic drug trial conducted between January 2001 and December 2004 at 57 US sites and organized into algorithmically determined treatment phases. Patients were initially assigned to olanzapine, perphenazine, quetiapine, risperidone, or ziprasidone under double-blind conditions. Since there were few differences in drug outcomes across multiple measures (Rosenheck and Sernyak, 2009), we pooled results from all treatment arms for these analyses.
In both samples, patients with first-episode psychosis were excluded if they did not meet a DSM-IV diagnosis of schizophrenia to avoid enrollment of subjects whose diagnostic categories might change over time. Since both samples comprise treatment efficacy studies, patients with treatment-refractory schizophrenia were excluded for their low rate of response to multiple trials of adequate doses of antipsychotics.
In Beijing, four psychiatrists, blind to clinical status and treatment condition, assessed each patient’s psychopathology using the PANSS upon admission. To ensure consistency and reliability of rating across the study, these four psychiatrists, each of whom had worked at least 5 years in clinical practice, were trained on the PANSS before the start of the study. After training, a correlation coefficient greater than 0.8 was maintained for inter-rater reliability on the PANSS total score at repeated assessments during the study.
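The reliability check described above can be sketched as follows. The article does not say which correlation coefficient was used, so this example assumes a Pearson correlation between two raters' PANSS total scores. The rater scores are hypothetical values made up for illustration, not study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical PANSS total scores given by two raters to the same six patients.
rater_a = [62, 75, 58, 90, 71, 66]
rater_b = [60, 78, 61, 88, 69, 70]

r = pearson_r(rater_a, rater_b)
meets_threshold = r > 0.8  # the reliability criterion quoted in the text
```

A correlation of this kind captures whether raters rank patients similarly. It does not detect a systematic offset shared by all raters at one site, which is the kind of cross-site difference this article is concerned with.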
In the CATIE sample, PANSS assessments were completed at 1 month and then every 3 months after randomization up to 18 months. Hospital days were assessed every month. Data presented here include all baseline assessments of patients hospitalized for at least 15 days during the month prior to assessment (N=268) to maximize comparability to the Beijing sample.
Symptom outcomes were assessed with the PANSS, in which higher scores reflect worse symptoms, and a 20% difference in scores was used in this analysis to represent a clinically important difference (Cramer et al., 2001; Leucht et al., 2005). Information from both data sets was also available on age, gender and duration of illness.
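The percentage comparison can be illustrated with a short sketch. It assumes that differences are expressed relative to the CATIE mean and that "a 20% difference" means an absolute difference of at least 20%. Neither convention is stated explicitly in the text, and the function names are illustrative.

```python
CLINICAL_THRESHOLD_PCT = 20.0  # threshold cited in the text


def percent_difference(beijing_mean, catie_mean):
    """Signed % difference of the Beijing mean relative to the CATIE mean.

    Using the CATIE mean as the reference is an assumption of this sketch.
    """
    return 100.0 * (beijing_mean - catie_mean) / catie_mean


def is_clinically_important(beijing_mean, catie_mean):
    """True when the absolute percentage difference reaches the threshold."""
    diff = abs(percent_difference(beijing_mean, catie_mean))
    return diff >= CLINICAL_THRESHOLD_PCT
```

For example, hypothetical means of 60 and 80 give a difference of -25%, which would count as clinically important under this rule, whereas a 10% difference could be statistically significant in a large sample without reaching the threshold.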
First, t-tests and Pearson chi-square tests were used to assess the significance of differences between the samples in age and gender. Next, we used generalized linear models to assess the significance of differences between the two samples on PANSS summary scores, the three subscales, and individual items, adjusting for differences in age and gender. We present least square means and compare the percentage difference in scores between the Beijing and CATIE samples. We could not control for duration of hospitalization because the measurements of duration of hospital stay were different in the two studies. Statistical analyses were performed using SAS 9.1 statistical software (SAS Institute Inc., Cary, North Carolina, USA).
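The adjustment step can be sketched in plain Python. This is not the authors' SAS code. It assumes a linear model with terms for sample, age, and gender, fitted by ordinary least squares, and it evaluates adjusted means at the overall mean age with the two genders weighted equally, which mirrors a common convention for least square means. The records are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def adjusted_means(records):
    """records: iterable of (is_beijing, age, is_male, score) tuples."""
    rows = [[1.0, float(g), float(age), float(male)] for g, age, male, _ in records]
    y = [float(score) for *_, score in records]
    b0, b_group, b_age, b_male = fit_ols(rows, y)
    mean_age = sum(r[2] for r in rows) / len(rows)
    # Evaluate both samples at the same covariate values (assumed convention).
    base = b0 + b_age * mean_age + b_male * 0.5
    return {"catie": base, "beijing": base + b_group}


# Hypothetical records: score = 80 - 20*beijing + 0.1*age + 2*male, no noise.
records = [
    (g, age, male, 80 - 20 * g + 0.1 * age + 2 * male)
    for g in (0, 1) for male in (0, 1) for age in (20, 40, 60)
]
means = adjusted_means(records)
```

The point of the design is that both adjusted means are computed at identical age and gender values, so the gap between them reflects the sample term alone rather than the older age and different gender mix of the Beijing sample.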
The sample included 772 observations. The Chinese sample ranged in age from 16 to 76 years, with a mean of 50 years (SD=10.712), compared to the CATIE sample, which ranged in age from 18 to 65 years, with a mean of 37.8 years (SD=11.7) (t=14.53, DF=764, p<.0001). There were 332 (65.9%) males and 172 (34.1%) females in the Chinese sample compared to 205 (76.5%) males and 63 (23.5%) females in the CATIE sample (chi-square=7.1, DF=1, p=.007).
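As a rough check on the age comparison, a two-sample t statistic can be recomputed from the summary values above. This sketch assumes a pooled-variance Student's t-test and the full sample sizes of 504 and 268. The reported DF of 764 is smaller than 504 + 268 − 2 = 770, so the published test presumably used slightly fewer observations, and the inputs here are rounded. The result should therefore be close to the reported statistic rather than identical to it.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Summary statistics as reported in the text (Beijing first, CATIE second).
t_stat, df = pooled_t(50.0, 10.712, 504, 37.8, 11.7, 268)
```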
Controlling for age and gender, scores in the Chinese sample were significantly lower than in the CATIE sample for total score by 25% (p<.0001), for the positive sub-scale by 35% (p<.0001), and for the general sub-scale by 32% (p<.0001) [Table 1]. The Chinese sample also scored 24–49% lower than the CATIE sample on all positive items (all p<.0001). Scores on the negative sub-scale, in contrast, were not significantly different. Notably, the items rated lower in the Chinese sample span all three sub-scales and include hallucinatory behavior, grandiosity, difficulty in abstract thinking, stereotypy, somatic concerns, anxiety, guilt, tension, mannerisms, depression, and disorientation. One hypothesis is that the Chinese sample exhibited less psychopathology than the CATIE sample. In this sense, the American sample could be conceived as “more ill” clinically than the Chinese sample. Another hypothesis concerns the raters' administration of the PANSS itself. These items are rated from the patient's direct verbal responses as judged by trained raters. Cultural differences in stigma could influence whether American patients were more likely than Chinese patients to endorse these symptoms.
However, the Chinese sample scored 26% higher on the item representing poor rapport or lack of interpersonal empathy (p<.0001), and 10.2% higher on passive social withdrawal (p=.003). On the general subscale, the Chinese sample scored significantly lower by 20–57% on all items except for lack of judgment and insight, where they scored 46% higher than CATIE subjects (p<.0001). Poor rapport, passive social withdrawal, and lack of judgment and insight are scored interpersonally through rater interpretations of patient behavior and deserve further cross-cultural scrutiny through qualitative analysis.
Gender sub-analyses across the CATIE and Chinese samples follow similar trends. Chinese men scored significantly lower than CATIE men on the total score by 26.1% (p<.0001), on the positive subscale by 37.5% (p<.001), and on the general subscale by 34.3% (p<.0001) [Table 2]. Chinese men scored 24–54% lower than CATIE men on all positive items (p<.0001). Similarly, Chinese men scored 24.4% higher on the item representing poor rapport (p<.0001) and 10.4% higher on social withdrawal (p=.008). On the general subscale, Chinese men scored significantly lower by 15–61% on all items except for lack of judgment and insight where they scored 42.2% higher than CATIE men (all p<.0001).
Likewise, Chinese women scored significantly lower than CATIE women on the total score by 22% (p<.0001), on the positive subscale by 29.1% (p<.001), and on the general subscale by 28.5% (p<.0001) [Table 3]. Chinese women scored 20–42% lower than CATIE women on all positive items (p=.02 or less for all scores). Like Chinese men, Chinese women scored 26.5% higher on the item representing poor rapport (p=.0009). However, some gender differences emerge. Compared to women in the CATIE study, Chinese women did not show statistically significant differences on social withdrawal, though they did score 38.4% lower on stereotyped thinking (p<.0001). Furthermore, the same dramatic differences appear within the general subscale between Chinese and CATIE women. Like Chinese men, Chinese women scored substantially higher on lack of judgment and insight (53.7%, p<.0001), while scores for other items were lower by 26–61% (p=.004 or less for all scores) except for non-significant differences in motor retardation, uncooperativeness, and unusual thought content.
This study used PANSS data collected from people with schizophrenia in a sample of hospitalized patients from Beijing, China and in the inpatient subgroup of the CATIE study, a large clinical trial enrolling Americans with schizophrenia from multiple sites across the country. We assume from the PANSS ratings that the clinical presentation of patients was comparable with generally lower symptom levels for all domains in the Chinese sample, perhaps due to longer hospitalizations. To maximize validity of this assumption, we used multivariate analyses, adjusting for differences in age and gender.
Ratings from the Chinese sample were consistently and statistically significantly lower than in the American sample for total PANSS score, two of the three subscales, and most individual items. We ascribe these findings to general response differences that are inevitable with non-equivalent sampling strategies. If the Chinese were hospitalized longer and patients achieved clinical stability, their lower positive symptoms may represent residual symptoms, contributing to poor insight, social withdrawal, and poor rapport. A study of first-episode schizophrenic illness found that ethnic Han scored lower than Malay subjects across all PANSS subscales (Lim et al., 2011). It has been claimed that the doctor-patient relationship in China inherits Confucian notions of moral self-cultivation and harmonious relationships in which patients are expected to obey their families and physicians (Tsai, 2001). This may account for lower overall PANSS scores as well. Perhaps most obviously, the Chinese sample could have also demonstrated less severe clinical psychopathology than the American sample.
Of special interest are the two items on which the Chinese sample was rated substantially (by more than 20%) higher than the US sample at a statistically significant level (poor rapport and lack of judgment and insight), and to a lesser extent the items that were significantly higher but with a smaller magnitude of difference (blunted affect and emotional withdrawal). From a purely clinical standpoint, it is unclear how to interpret these findings given that Chinese subjects were hospitalized longer: improved rather than poor rapport, judgment, and insight could be expected with longer inpatient stays as symptoms improve and patients receive psychoeducation, but such differences were not found. We therefore hypothesize that cultural differences may have affected assessments of these items by the Chinese raters, influenced by distinctive interpersonal norms. These items represent behaviors, especially the lack of conventional judgment and insight associated with schizophrenia, that are perhaps less tolerated in Chinese culture and more highly scored by Chinese raters, and conversely relatively more tolerated and less highly scored by US raters. PANSS data from inpatients in Changsha, China compared to inpatients in the American CATIE sample showed similar trends of higher ratings for rater-based interpretations of patient behavior (e.g. insight, hostility, disturbance of volition and poor impulse control) in contrast to item scores based more exclusively on patient verbal response (Aggarwal et al., 2011), suggesting a potentially generalizable finding. Bioethics in China has focused on the “relational” nature of the Chinese self, defined, shaped, and situated interpersonally, where dignity and humaneness unfold through proper interaction with others (Cong, 2004). Chinese raters may have perceived the behavior of study subjects to represent delinquency in their interpersonal responsibilities, resulting in higher scores for these PANSS items.
In contrast, Chinese raters evaluated symptoms of individual psychopathology lower than American raters on items such as hallucinatory behavior, grandiosity, difficulty in abstract thinking, stereotypy, somatic concerns, anxiety, guilt, tension, mannerisms, depression and disorientation. While, as noted above, there may be non-cultural explanations for these differences, American raters may have scored these symptoms higher given their cultural emphasis on individualism and the cultivation of higher faculties such as reason, abstraction, and volition that may be diminished in schizophrenia. Americans may regard hallucinations, false and idiosyncratic beliefs (delusions, grandiosity), neglect of personal hygiene (disorganization), and beliefs that others want to harm them (paranoia) as representing more severe deficiencies (Roe and Davidson, 2001; Sass, 2001). Differences in Chinese and American scores may therefore reflect cultural differences in norms for personal conduct and its abnormality in schizophrenia.
Differences between the Chinese and CATIE samples followed broadly similar trends when stratified by gender. This Chinese sample has previously been reported to include fewer women, who in homogeneous ethnic Han samples have been reported to have lower PANSS scores (Zhang et al., 2009) as well as more positive and affective symptoms than men (Tang et al., 2007). Chinese men and women both scored lower than their American counterparts on the total PANSS, positive subscale, and general subscale scores and higher on the particular items for lack of judgment and insight and poor rapport. Interestingly, Chinese women scored lower than American women on stereotyped thinking. “Excessive thinking” as a widespread Chinese idiom of distress seems to de-stigmatize paranoid ideation with related disrupted behavior among patients with schizophrenia (Yang et al., 2010). It is unclear whether Chinese patients exhibited less paranoia in this sample or whether their symptoms were rated lower. If they were rated lower for cultural reasons, this idiom of distress appears to be applied mostly to women, at least in this sample. Chinese women did not show statistically significant differences compared to American women on motor retardation, uncooperativeness, and unusual thought content. This suggests either that clinical severity was comparable between the two sets of women or that the PANSS was administered in the same way across both groups.
Several methodological limitations require comment. First, the Chinese sample comes from one city whereas the American sample draws patients from 57 sites. Thus, our study represents findings from one Chinese region and may not represent findings throughout all of China, though they broadly conform to findings from the southern city of Changsha (Aggarwal et al., 2011). The training of PANSS raters is assumed to be uniform, so findings from a small number of raters in China could reflect local practice. Conversely, the large number of CATIE raters may have introduced significant discrepancies in the ways that the American sample was generally rated. Therefore, we cannot estimate the importance of this limitation. Second, the PANSS was administered at different times throughout hospitalization. Variant ratings could be attributed to the differential effect of various phases in the course of illness, with higher PANSS scores presumably reflecting earlier intervals in the hospitalization before clinical stability. However, this is not likely to have affected the differential pattern of item responses. Our data do not allow for finer analyses to estimate confounding variables such as length of time in hospital, type and dose of any medication(s), and the role of psychotherapeutic interventions such as individual, group, or family therapies in changing PANSS scores, although these factors are also unlikely to have affected the differential pattern of item responses. Third, hospitalization criteria differ in China and the US, possibly leading to sample differences, although these may well be reflected in the generally lower PANSS scores in the Chinese sample. Fourth, the Chinese sample comprises ethnic Han patients whereas the American sample includes Caucasians, African-Americans, Native Americans, Asians, Pacific Islanders, and those of mixed race (Lieberman et al., 2005).
Even though we compared the samples by nationality, we may have overlooked interracial variations by analyzing the American sample together.
Nevertheless, we believe that our results signal an innovative strategy toward quantitative-qualitative analysis of the cultural patterning of responses to symptoms measured through psychiatric instruments. Without laboratory or radiological biomarkers, psychiatry remains an interpretation of an interpretation: the first interpretation is a patient’s report of symptoms, and the second is a clinician’s assessment of that report (Kleinman, 1996). Psychiatric instruments with accurate linguistic translations can standardize the process of obtaining the first interpretation. Our data suggest that the second interpretation (here, the clinician’s assessment expressed through rater scores) may be taken for granted within instrument translations when it is not at all clear that the process of rating has been calibrated. This preliminary study calls attention to the cultural shaping of rater application of structured instruments in psychiatric assessment and the opportunity to rethink the international standardization of raters, a focus for future research.
Several important conclusions can be drawn from this study. First, while instruments appear to be standardized for content, there is not yet a process for standardizing raters in cross-cultural contexts. The data in this paper suggest that this would be a vital area for further research. Second, while the rating of symptoms appears to be clinically influenced, there may also be cultural influences based on local interpretations of normal and abnormal behavior especially in the evaluation of judgment and insight.
This work was supported, in part, by the Yale-China Program, New Haven, CT, through the Chia Fellowship Program made possible by the Chia Family Foundation, which facilitated this research collaboration. The funding source did not have a role in study design; in the collection, analysis and interpretation of data; in the writing of the report; or in the decision to submit the paper for publication.
CONFLICTS OF INTEREST
None of the authors have any conflicts to declare.