Documentation of language usage in medical settings could be effective in identifying and addressing language barriers and would improve understanding of health disparities.
This study evaluated the availability and accuracy of medical records information on language for 1,664 cancer patients likely to have poor English proficiency. Accuracy was assessed by comparison to language obtained from interview-based research studies.
For patients diagnosed at facilities where information on language was not abstracted electronically, 81.6% had language information in their medical records, most often in admissions documents. For all 37 hospitals, agreement between medical records and interview language was 79.3% overall and was greater for those speaking English than another language.
Language information is widely available in hospital medical records of cancer patients. However, for the data to be useful for research and reducing language barriers in medical care, the information must be collected in a consistent and accurate manner.
In the United States (US), 4% of the population does not speak English well or at all, and in California, which has a large immigrant population, this figure is nearly three times higher (1). Hispanics and Asians are the populations with the highest proportions of linguistic isolation, meaning that no household member speaks English “very well.” In California, where Hispanics and Asians make up a large proportion of the total population, 26% of Spanish-speaking households and 31% of Asian language-speaking households are considered linguistically isolated. Similarly, in California, 31% of Hispanic and 24% of Asian individuals do not speak English well or at all (1).
In health care settings, language barriers between patient and health care provider can result in misunderstandings, poor compliance, or inferior or delayed care (2, 3), contributing to health disparities among this growing segment of the population. Miscommunications are common between English-speaking doctors and patients who speak other languages (4–7). In the presence of a language barrier, there is often less interaction between doctor and patient; patients are less likely to talk to their doctors and more likely to have their comments ignored (6). As part of ensuring that all individuals receive quality health care, researchers and various organizations have advocated the routine collection of language information in health care settings (8, 9), which can help in determining patients’ language needs and in providing the necessary language assistance services, such as interpreters and language-appropriate written materials. Provision of these resources could, in turn, help to provide those with poor English skills improved communication with their health care providers and, ultimately, help to remove language as a barrier in the delivery of medical care.
To the extent that medical records data are subsequently used for research, such as in cancer registry data, language preference and/or usage, which serve as proxies for acculturation (11–15), can be used to identify disparities in disease incidence and health outcomes. Therefore, documentation of language usage in medical records could be effective in identifying and decreasing adverse public health consequences associated with language barriers and may also serve ultimately to increase our understanding of racial/ethnic disparities in health, help target the most vulnerable groups for interventions, and aid in the evaluation of programs designed to decrease health disparities (16, 17).
Despite these recognized needs and calls to action, the extent to which language capability, preference, and usage are noted in the medical records for linguistically isolated groups is unknown. Therefore, we designed a study to evaluate the availability and accuracy of language information in cancer patients’ medical records across hospitals in the Greater San Francisco Bay Area, a region with large Asian and Hispanic populations. As this study was conducted within the context of examining the feasibility of reporting patient language information routinely to the population-based cancer registry, we also sought to determine the efficiency (i.e., additional time required for abstraction) with which these data could be collected from the medical records.
Cancer cases were identified from the Greater Bay Area Cancer Registry (GBACR), a Surveillance, Epidemiology, and End Results (SEER) and California Cancer Registry-supported population-based cancer registry whose catchment area comprises the counties of San Francisco, Alameda, Contra Costa, Marin, San Mateo, Santa Clara, Santa Cruz, Monterey, and San Benito in Northern California. We first identified the 37 GBACR hospitals (out of 64 total hospitals in the area) at which the certified tumor registrars (CTR) who routinely abstract medical records data for the GBACR were available and willing to conduct the medical record abstraction for this study. These 37 facilities did not differ from the remaining facilities in hospital size (estimated by number of patient beds) or teaching status, but they were slightly more likely to be privately owned or part of a health maintenance organization (HMO) network. From these 37 facilities, we selected two mutually exclusive groups of cancer cases: 1) cases diagnosed in the period 1998 through 2002 and for whom language data had been obtained through patient report during 14 prior research studies (18–33) (described in detail in Gomez et al. (34)), and 2) those diagnosed in the period 1998 through 2004 who had not participated in the prior research studies. To maximize the proportion speaking a language other than English, cases were further selected based on having: 1) Asian or Hispanic race/ethnicity; 2) non-Asian or non-Hispanic (NH) race/ethnicity but foreign birthplace based on registry data; or 3) non-Asian or NH race/ethnicity without registry birthplace information but whose social security number had been assigned after age 18 years, indicating that they were likely to be foreign-born.
Patient characteristics obtained from the GBACR included age at diagnosis, race/ethnicity, sex, stage at diagnosis, year of diagnosis, country of birth and/or nativity (US- or foreign-born), and reporting hospital. Information on the reporting hospitals included the number of beds and ownership status, and was obtained from the Office of Statewide Health Planning and Development (OSHPD) (35); information on hospital teaching status (defined as membership in the Council of Teaching Hospitals) was obtained from the American Hospital Association (36).
For the 2,298 eligible study cases for whom we attempted to review medical records for information on language and interpreter use, medical records could not be located for 51. For the remaining 2,247 cases, 1,085 (48.3%) also had self-reported data on language usage assessed in prior research studies. For the prior studies in which language usage questions were not asked directly, language usage was determined based on the language of interview.
Based on the expertise of the CTRs for each of their hospitals, information on the availability of data on language and interpreter use was abstracted from up to ten documents in the medical records onto a study-specific form for the majority of cases (N=1,664). Because language and interpreter information was abstracted from a central electronic database for 10 facilities, the results for availability of language information by medical record location (Table I) are restricted to the 1,664 cases from the other 27 facilities for which the information was not abstracted electronically.
To determine accuracy, or agreement with self-report, of language information in the medical records, cases were categorized as: 1) speaking English, if the language specified in the medical records was English or if both English and another language were specified; or 2) speaking another language (“non-English”), if only a language other than English was specified. Cases were similarly categorized for self-reported language usage.
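The two-category rule described above can be sketched in a few lines of Python. This is only an illustration, not the study's actual abstraction procedure: the function name `categorize_language`, the list-of-strings input, and the handling of records with no language noted (returned as `None`) are assumptions made for the example.

```python
from typing import Iterable, Optional


def categorize_language(languages: Iterable[str]) -> Optional[str]:
    """Collapse the languages noted for one case into the analysis categories.

    Hypothetical helper: returns "English" if English appears at all
    (alone or alongside another language), "non-English" if only other
    languages appear, and None if no language was documented.
    """
    # Normalize case and whitespace, and drop blank entries.
    recorded = {lang.strip().lower() for lang in languages if lang and lang.strip()}
    if not recorded:
        return None  # no language information available for this case
    if "english" in recorded:
        return "English"  # English only, or English plus another language
    return "non-English"  # only a language other than English was specified
```

Under this rule, a case with both "Cantonese" and "English" noted is classified as "English", which matches the categorization stated in the text for both the medical record and the self-reported language.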
The availability of language data was calculated as the proportion of cases for whom language information was located in the medical records; this analysis was based on the 1,664 cases from the 27 facilities where language information was not abstracted electronically. Differences in availability of language information according to patient and hospital characteristics were evaluated using the chi-square statistic.
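As a rough sketch of this calculation, the following Python computes an availability proportion and a Pearson chi-square test for a single two-level characteristic. The counts are invented for illustration and are not study data; the function names are hypothetical, and the closed-form p-value applies only to a 2x2 table (1 degree of freedom) without continuity correction, which may differ from the software and settings the authors used.

```python
import math


def availability(available: int, total: int) -> float:
    """Proportion of cases with language information in the medical record."""
    return available / total


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Rows are the two levels of a characteristic (e.g., hospital ownership);
    columns are language information available / not available.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Illustrative (made-up) counts: 90 of 100 cases with language information
# at one hospital type versus 70 of 100 at another.
chi2, p_value = chi_square_2x2(90, 10, 70, 30)
overall = availability(90 + 70, 200)
print(f"availability={overall:.3f}, chi2={chi2:.2f}, p={p_value:.4f}")
```

Characteristics with more than two levels (such as year of diagnosis) would need the general r x c form of the test, which this sketch does not cover.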
To determine the accuracy of language information in the medical records, we calculated the positive predictive value (not adjusted for chance) as the percentage of cases having the same language noted in the medical records as from interview (considered the gold standard); this analysis was based on data from all 37 facilities. Differences in the positive predictive value according to patient and hospital characteristics were evaluated using the chi-square statistic, and multivariate logistic regression was used to assess their associations with language disagreement. The kappa statistic was additionally calculated to determine the level of agreement between interview language and language noted in the medical records. These analyses were restricted to cases with explicit mention of language in their medical records and documented language from interview.
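The agreement measures named above can be illustrated with a small Python sketch for a 2x2 table of interview language by medical record language. The counts are invented for the example and are not the study's data; the function name and dictionary keys are hypothetical. The percent agreement here is the chance-unadjusted quantity the text refers to as positive predictive value, and kappa is Cohen's kappa, assumed to be the statistic intended.

```python
def agreement_stats(a: int, b: int, c: int, d: int) -> dict:
    """Agreement statistics for a 2x2 table.

    Rows are interview language (treated as the gold standard),
    columns are medical record language:
        a = interview English,     record English
        b = interview English,     record non-English
        c = interview non-English, record English
        d = interview non-English, record non-English
    """
    n = a + b + c + d
    observed = (a + d) / n  # overall percent agreement, not adjusted for chance
    # Agreement expected by chance from the row and column marginals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    kappa = (observed - expected) / (1 - expected)  # Cohen's kappa
    return {
        "overall_agreement": observed,
        "agreement_english": a / (a + b),
        "agreement_non_english": d / (c + d),
        "kappa": kappa,
    }


# Illustrative (made-up) counts only.
stats = agreement_stats(80, 20, 10, 40)
print({name: round(value, 3) for name, value in stats.items()})
```

Because kappa subtracts the agreement expected by chance, it can be considerably lower than the raw percent agreement when one category (here, English) dominates the marginals.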
To assess the efficiency of obtaining data on language and interpreter use from medical records at facilities where this information was not abstracted electronically, we calculated mean and median estimates of the additional time reported by the CTRs to abstract these data items over and above the usual abstracting time for routine cancer reporting.
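A minimal sketch of these summary statistics, using only Python's standard library, is given here. The times are invented for illustration, the function name is hypothetical, and cases with no recorded time are assumed to be represented as `None` and excluded. Quartile conventions vary between software packages, so the interquartile range from `statistics.quantiles` may differ slightly from values produced by other tools.

```python
import statistics
from typing import Optional, Sequence


def summarize_times(minutes: Sequence[Optional[float]]) -> dict:
    """Summarize additional abstraction time, skipping unrecorded cases."""
    recorded = [m for m in minutes if m is not None]
    # quantiles(n=4) returns the three quartile cut points (Q1, Q2, Q3).
    q1, _, q3 = statistics.quantiles(recorded, n=4)
    return {
        "n_recorded": len(recorded),
        "n_missing": len(minutes) - len(recorded),
        "mean": statistics.mean(recorded),
        "sd": statistics.stdev(recorded),  # sample standard deviation
        "median": statistics.median(recorded),
        "iqr": (q1, q3),
    }


# Illustrative (made-up) times in minutes; None marks a case with no time recorded.
summary = summarize_times([2.0, 3.0, None, 5.0, 5.0, 9.0, 12.0])
print(summary)
```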
This study was approved by the Institutional Review Board of the Northern California Cancer Center.
Of the 2,247 cancer patients included in the study, 65.1% were female. The cases had the following racial/ethnic distribution: 34.0% Hispanic; 38.0% NH Asian/Pacific Islander; 25.2% NH White; and 2.8% NH Black. Foreign-born cases represented 71.9% of the study sample.
Overall, information on spoken language in the medical records was available for 86.4% of the study cancer patients. For patients diagnosed at the 27 facilities for which language information was not abstracted electronically, this percentage was 81.6%, and such information was most often found in admission registration and billing documents (91.5%), admission assessment examinations (88.7%), surgery records (82.4%) and social services documents (71.0%) (Table I). Information on interpreter use was most commonly found in consent forms (41.6%) and nurses’ notes (36.9%).
Availability of language information was significantly greater for patients of non-White race/ethnicity and those diagnosed in larger facilities or in private or non-teaching hospitals (Table II). There was also a significant difference by year of diagnosis (which was primarily driven by greater availability for cases diagnosed in 2002) and a non-significant increase in availability for cases diagnosed with more advanced stage. These patterns were consistent across all study facilities.
Overall agreement between medical record-based and interview language was 79.3%. Among cases that self-reported speaking at least some English or had spoken English at the interview, 85.3% also had English reported in their medical records. Among cases who self-reported not speaking English or had spoken another language at the interview, 64.2% also had a language other than English indicated in their medical records (Table III). The overall kappa statistic of 0.493 indicates fair agreement between interview and medical record-based language usage overall. Agreement was greater for females, for patients diagnosed at younger ages, for NH White and NH Black patients, for patients diagnosed at private hospitals, teaching hospitals, or hospitals where information was abstracted electronically, and for US-born patients (Table IV).
Multivariate logistic regression analyses revealed similar results (data not shown) with disagreement between self-reported language and language noted in the medical record being significantly associated with age 70 years and older (compared to age less than 40 years), Asian/Pacific Islander and Hispanic race/ethnicity (compared to Whites), diagnosis at public hospitals (compared to private hospitals), diagnosis in 1998 (compared to 2002), and being foreign-born (compared to US-born).
For the facilities where language information was not abstracted electronically, abstraction time was not recorded for 232 (13.9%) cases; for the remaining cases, mean and median times for abstraction were 5.9 (standard deviation=5.0) and 5.0 minutes (interquartile range = 2–9 minutes), respectively. Abstraction times were significantly shorter for records of patients diagnosed in private facilities and in more recent years (data not shown).
With the continued and substantial growth of foreign-born and non-English speaking populations in the US, disparities in medical care may be further exacerbated by increasing language barriers. In response to this growing concern, many states, including California in 2006, have required that patient language be recorded routinely in medical facilities. However, compliance with these requirements is unknown. This study of cancer patients diagnosed at Greater San Francisco Bay Area hospitals and selected for their likelihood to have language barriers found that at least some information on language usage was present in the medical records for the vast majority of cases. At facilities where patient information on spoken language was not abstracted electronically, language information was most often found in admission registration/billing and admission assessment examination records, and most of those documents contained some information on the patient’s language. These findings indicate that language information is not only available in the medical records for most cancer cases but also is found consistently in specific locations regardless of the abstractor or facility. Language information was not found consistently in other medical record locations, however, particularly outpatient admissions, discharge/transfer summaries, and medical history/exam records. This is important for patients with diseases that are less likely to involve a hospital admission and who, therefore, may have incomplete admission documents.
In addition to discrepancies in the location of language information, availability differed according to hospital characteristics with greater availability found in larger, private, and teaching hospitals. This may be due to greater resources available at hospitals with these characteristics, which may include electronic records systems, more highly trained staff, more systematized protocols regarding data collection, better evaluation and reporting procedures, and possibly greater appreciation (particularly among teaching hospitals) for the value of patient language information.
Although language information is largely available, our kappa and positive predictive value statistics demonstrate that the language documented in the medical records may not consistently agree with language recorded in a research interview setting. As expected, disagreement was associated with being foreign-born, of Asian/Pacific Islander or Hispanic race/ethnicity, and of older age. Greater disagreement in the year 1998 may be due to decreased awareness of the importance of this information in earlier years. Disagreement was also associated with diagnosis at public hospitals, in part due to greater agreement at those private facilities where data were abstracted electronically.
Discrepancies in language usage between interview settings and medical records may be due partly to differences in the context and/or purpose of recording or reporting language usage between medical records and research study interview settings. In addition, differences may occur because a health professional records the patient’s language based solely on observation and without soliciting the patient directly (16). Patients may also feel uncomfortable specifying a language other than English in a medical setting for fear of discrimination (10). While such fear also may be present during research study interviews, it is likely that more accurate, or at least more objective, information on spoken language can be obtained through self-report than by health professionals, as is true with the collection of race/ethnicity data (17).
Our results regarding language availability are similar to those of Polednak (37), who found language information for 82% of minority patients sampled for medical record abstraction in Connecticut hospitals, although this information was most often found in locations different than those in the current study. These similarly high percentages of language availability in hospital medical records are encouraging; however, in both studies, it appears that the information is not collected in a consistent or uniform manner. Hasnain-Wynia et al. (16) evaluated whether and how patient race/ethnicity and primary language were recorded in patient records and found that, while there appears to be a theoretical commitment to collecting such data, collection practices are inconsistent both across and within medical facilities. The results of these studies along with the present findings highlight the need for standardized protocols to collect information on language usage in medical records in order for this information to be useful for research purposes and to reduce language barriers in the delivery of medical care.
The present study has some limitations, including the difficulty in interpreting the meaning of the language documented in the medical records due to the inconsistency with which it may have been collected. For example, if a patient’s language is described as “Chinese” in his/her records, was this information obtained by asking the patient, by observing that the patient spoke Chinese but not English, or by observing that the patient spoke both Chinese and English? What does it mean if both “Chinese” and “English” are noted, but in different areas of the medical records? In the current analyses, cases with both English and another language noted in their medical record were categorized as “English” based on the assumption that their English proficiency would be sufficient to navigate the healthcare system. However, we were unable to determine which of the stated languages was considered the patient’s primary language. A second study limitation is the likely variability in searching methods used by the abstractors to collect language information from the medical records, given the lack of established protocols for collecting this information at the facilities studied. Because of the potential differences in searching methods used by abstractors, we are limited in our ability to generalize the findings regarding location of language information in the medical record to all facilities. A third limitation is the potential for selection bias in the sample of hospitals contacted to participate in the study. These facilities did not differ from those not contacted, with the exception of being slightly more likely to be privately owned. Given that availability of language information and agreement with self-reported data were both greater for private hospitals, this limits our ability to generalize these results to all hospitals in the region.
The present study demonstrates that information on language usage is largely available in the medical records of Greater San Francisco Bay Area cancer patients selected for their likelihood to be foreign-born or to have poor English proficiency. However, our results indicate that, while language information may be available in the medical records, it may not always be accurate. Hasnain-Wynia et al. (16) argue that inconsistencies in data collection practices demonstrate the need for state or federal policies to enforce complete and accurate data collection. On January 1, 2006, California implemented Assembly Bill 800 (AB800), which requires “all health facilities and all primary care clinics… to include a patient’s principal spoken language on the patient’s health records.” The existence of such a mandate significantly increases the likelihood that hospitals will now collect these data (16), so this new requirement in California should help to improve the availability of these data in the medical records. However, in order for the data to be useful for research and for reducing language barriers in medical care, the information needs to be collected in a consistent and accurate manner. To the extent that this information can then be included in data used for research and public health surveillance, our understanding of disparities in disease incidence and outcomes will be improved. Until the accuracy of medical records data on patient language can be demonstrated, however, we do not recommend that these data be used for research and public health surveillance.
The authors thank Patricia Weeks, Sarah Aroner, and Cammie d’Entremont for their contributions to this study. The authors also acknowledge the following researchers for their contributions of patient interview data to this study: J. Bloom, V. Ernster, S. Glaser, E. Holly, P. Horn-Ross, E. John, K. Kerlikowske, M. Lee, M. Schlitz, D. West, A. Whittemore, and M. Wrensch. This study was supported by a grant from the National Cancer Institute’s Surveillance, Epidemiology and End Results (SEER) Rapid Response Surveillance Study under contract N01-PC-35136 awarded to the Northern California Cancer Center. The collection of cancer incidence data used in this study was supported by the California Department of Health Services as part of the statewide cancer reporting program mandated by California Health and Safety Code Section 103885, the NCI SEER Program under contract N01-PC-35136 awarded to the Northern California Cancer Center, and the Centers for Disease Control and Prevention’s National Program of Cancer Registries, under agreement #U55/CCR921930-02 awarded to the Public Health Institute. The ideas and opinions expressed herein are those of the authors and endorsement by the State of California, Department of Health Services, the National Cancer Institute, and the Centers for Disease Control and Prevention or their contractors and subcontractors is not intended nor should be inferred. The Breast Cancer Family Registry (Breast CFR) was supported by the National Cancer Institute, National Institutes of Health under RFA-CA-06-503 and through cooperative agreements with members of the Breast CFR and Principal Investigators. This analysis included Breast CFR data collected by the Northern California Cancer Center (U01 CA69417). 
The content of this manuscript does not necessarily reflect the views or policies of the National Cancer Institute or any of the collaborating centers in the Breast CFR, nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government or the Breast CFR.