The Hispanic population in the United States represents more than 40 million individuals, with Mexican Americans (MA) as the largest subgroup. To assess the utility of death certificates and medical records as the source of race/ethnicity data for epidemiologic studies, we compared self-reported race/ethnicity to race/ethnicity recorded on death certificates and medical records in a bi-ethnic, non-immigrant U.S. community with a significant MA population.
This study utilized data collected from a subset of 1,856 participants of the Brain Attack Surveillance in Corpus Christi (BASIC) project. In-person interviews were conducted to determine self-reported race/ethnicity. Of those interviewed, 480 subsequently expired. Using self-reported race/ethnicity as the gold standard, we determined percent agreement, sensitivity, and specificity of the death certificate and medical record.
Of the 480 subjects, 259 self-reported their race/ethnicity as non-Hispanic white (NHW), 195 self-reported as MA, and 26 self-reported as non-Hispanic black. Median age was 78.5 years and 55.8% were female. Percent agreement between self-reported race/ethnicity and race/ethnicity recorded on the death certificate and medical record was 97.1% and 96.3% respectively. Five percent of MAs were misclassified as NHW on their death certificates and 3% on their medical records.
Results indicated that Hispanic designation recorded on death certificates and medical records in this community was largely consistent with that of self-report. This study suggests that vital statistics data in non-immigrant U.S. Hispanic communities can be used with confidence to investigate ethnic-specific aspects of disease and mortality. Similar studies in other multi-racial communities should be conducted to confirm and generalize these results.
The Hispanic population currently represents more than 40 million individuals in the United States, 12.5% of the total U.S. population.1 It is expected that by 2050, 24.4% of the U.S. population will consist of individuals of Hispanic origin.1 Studying the Hispanic American population as a whole is important; however, it is a highly heterogeneous group consisting of several diverse subgroups. The largest subgroup of Hispanics in the United States is Mexican Americans (MA), comprising 67% of the Hispanic population in the United States.2
Due to biological and social factors, Hispanic Americans may be at a higher risk than other ethnic groups for developing certain diseases.3,4 Historically, researchers have used death certificates to study disease patterns, population life expectancy rates, and cause-specific mortality rates in various race/ethnic populations.5,6 However, several researchers have drawn attention to errors in the recording of race/ethnicity on the death certificate that could bias epidemiologic studies using these data sources.5–15 Poe and colleagues found that when Hispanic ethnicity was documented on the death certificate, ethnicity data from the National Mortality Followback Survey (NMFS) agreed in 98.9% of the cases.5 However, roughly 20% more Hispanic cases were identified from the NMFS data compared with the death certificate data, suggesting that Hispanic deaths may be greatly underestimated by vital statistics data.
Misclassification of race/ethnicity information is not unique to the death certificate. Indeed, there are multiple reports that misclassification of this information exists in other data sources used for research or planning purposes, including cancer registries,16–21 Medicare and health maintenance organization (HMO) data,22–24 and AIDS surveillance data,25,26 to name a few. One of the primary sources of race/ethnicity data for the above-listed data sources is the medical record, which has also been criticized for inaccuracies in demographic data due to inconsistent recording policies and lack of clarity in definitions of race and ethnicity.27,28
Whereas bias has been identified in the recording of race/ethnicity on death certificates and medical records in the past, it is likely to decrease as the U.S. population shifts toward an increasingly multi-ethnic structure. The current study's objective was to assess the utility of vital statistics for use in epidemiologic studies that compare ethnic groups. We compared race/ethnicity status on death certificates with that of the person's own report in a bi-ethnic, non-immigrant U.S. community with a large MA population. Secondarily, we compared race/ethnic status on the medical record, an alternative data source, with self-reported race/ethnicity.
This study is part of the Brain Attack Surveillance in Corpus Christi (BASIC) Project. Methods of BASIC have been reported previously.29,30 BASIC is a population-based stroke surveillance study conducted in Nueces County, Texas. Nueces County is on the Texas Gulf Coast and 95% of its population resides in the city of Corpus Christi. Corpus Christi is approximately 145 miles from San Antonio and more than 200 miles from Houston; thus it serves as the regional referral center for the sparsely populated surrounding counties. This distance affords complete case capture for initial presentation of acute medical conditions such as stroke. The population size of the county is 313,645, with non-Hispanic whites (NHW) comprising 38% of the population, and MAs comprising 56%.31 The MA community is predominantly second- and third-generation U.S. citizens.32 Nueces County is clearly not an immigrant community.
Acute cerebrovascular events (i.e., completed ischemic strokes, transient ischemic attacks, intracerebral hemorrhages, and subarachnoid hemorrhages) were identified among patients 45 years and older who were seen at one of the seven area hospitals between January 1, 2000, and June 30, 2003, using active and passive surveillance methods. Cases that did not present to a hospital were identified through a simple random sample from 45 primary care physicians from the community, four nursing homes, and from all 11 neurologists practicing in the county. Cerebrovascular events were validated, based on published criteria,33 by fellowship-trained stroke neurologists who were blinded to subjects' age and ethnicity.
The Texas Department of Health (TDH) electronically provided all death certificates for Nueces County residents for January 1, 2000, through June 30, 2003. To ensure complete capture, six months were allowed to elapse before the data were accessed. The TDH reported >99% case capture at this interval regardless of location of death (Nueces County, elsewhere in Texas, or elsewhere in the United States). Five identifiers collected from the medical record (first name, last name, social security number, date of birth, and permanent address) were cross-referenced with the TDH death certificate database. At least three of the five items had to be identical for a record to be considered a match with the death certificate data. For a small fraction of cases (<5%), manual review of the data was necessary due to errors in name spelling and transposed numbers.
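The three-of-five matching rule described above can be sketched as follows. This is a minimal illustration, not the BASIC project's linkage software: the `Identifiers` class, its field names, the function names, and the sample values are all hypothetical, and the sketch assumes that "identical" means exact string equality with no normalization.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Identifiers:
    """The five identifiers named in the text (field names are hypothetical)."""
    first_name: str
    last_name: str
    ssn: str
    date_of_birth: str
    address: str


def count_identical(a: Identifiers, b: Identifiers) -> int:
    """Count how many of the five identifiers are exactly equal."""
    return sum(getattr(a, f.name) == getattr(b, f.name) for f in fields(Identifiers))


def is_match(a: Identifiers, b: Identifiers, threshold: int = 3) -> bool:
    """Apply the rule from the text: at least three of five items identical."""
    return count_identical(a, b) >= threshold


# Invented example: a misspelled last name and a changed address
# still leave three identical items, so the pair counts as a match.
medical_record = Identifiers("Maria", "Garcia", "000-00-0000", "1925-03-14", "12 Bay St")
death_certificate = Identifiers("Maria", "Garcai", "000-00-0000", "1925-03-14", "40 Ocean Dr")

n_identical = count_identical(medical_record, death_certificate)
matched = is_match(medical_record, death_certificate)
```

Pairs that fall just short of the threshold because of misspellings or transposed digits are the kind of case the authors describe resolving by manual review.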
A random sample comprising two-thirds of cases with potential cerebrovascular events were asked to participate in an in-person interview. Of those randomized, 84.6% were located and approached for participation in the interviews. Of those who were asked to participate, 88.9% agreed and completed an interview. There was no difference in response rate by race/ethnicity (MA 89.4%, NHW 88.2%, black 89.8%; p-value=0.71). The interview consisted of questions regarding demographics, including race and ethnicity, and other items related to stroke. Patients unable to respond accurately to a series of orientation questions were interviewed by proxy in the presence of the patient when possible. Proxy interviews were conducted with the person who best knew the patient's daily activities and medical history. Interviews were performed in English or Spanish depending on the subject's language preference. Race/ethnicity in the BASIC interview is collected similarly to the 2000 U.S. Census.31 Hispanic patients were also asked questions regarding their family ancestry.
Since local hospitals do not record race/ethnicity as two separate categories as is done in the U.S. Census, a hierarchy was used to determine race and ethnicity from the medical record. If race and/or ethnicity were expressly mentioned in the medical record, this information was used first. If race and/or ethnicity were not mentioned in the medical record, race and/or ethnicity recorded on the face page were used. If a patient was recorded as Hispanic only, this information was recorded and race was documented as “not recorded.” If a patient was recorded as white only in the medical record, race was recorded appropriately and ethnicity was documented as “not of Hispanic origin.”
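The abstraction hierarchy above can be expressed as a small decision function. This is a sketch under stated assumptions: the function name, the argument names, and the string labels are hypothetical, and the handling of values other than "Hispanic" and "white", and of records with no entry at all, is not described in the text and is marked as an assumption in the comments.

```python
from typing import Optional, Tuple


def classify_from_medical_record(
    record_mention: Optional[str], face_page: Optional[str]
) -> Tuple[str, Optional[str]]:
    """Return (race, ethnicity) following the hierarchy described in the text.

    record_mention: race/ethnicity expressly mentioned in the medical record.
    face_page: race/ethnicity recorded on the face page (used only as a fallback).
    """
    # Step 1 and 2: an express mention takes priority over the face page.
    source = record_mention or face_page
    if not source:
        # Assumption: nothing recorded anywhere is labeled "not recorded".
        return ("not recorded", "not recorded")

    value = source.strip().lower()
    if value == "hispanic":
        # Hispanic only: ethnicity kept, race documented as "not recorded".
        return ("not recorded", "Hispanic")
    if value == "white":
        # White only: race kept, ethnicity documented as "not of Hispanic origin".
        return ("white", "not of Hispanic origin")

    # Assumption: the text does not cover other values, so the value is
    # passed through as race and ethnicity is left undetermined.
    return (source.strip(), None)


mention_wins = classify_from_medical_record("Hispanic", "White")
face_page_only = classify_from_medical_record(None, "white")
nothing_recorded = classify_from_medical_record(None, None)
```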
From these data, it was determined appropriate to classify Hispanic subjects in this population as MA. From the BASIC interview data regarding family ancestry, only 1% of Hispanic patients interviewed reported ancestry other than Mexican. Similarly, only 6% of death certificates for Hispanic subjects in this population had a country of origin other than Mexico.
Using self-reported race/ethnicity as the gold standard, we determined percent agreement, sensitivity, and specificity of the death certificate and medical record for recording of race/ethnicity. Analyses were first performed using the total study population and then separately by gender and age. As a separate analysis, race/ethnicity from the medical record was descriptively compared for those individuals with multiple strokes during the study time period (n=111).
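For readers who want to reproduce this type of analysis, the three measures can be computed from paired labels as sketched below. The function name and the toy data are invented for illustration and are not study data; the sketch assumes sensitivity and specificity are defined with respect to a single target category (here MA), with self-report as the reference.

```python
from typing import Dict, Sequence


def agreement_stats(
    self_report: Sequence[str], recorded: Sequence[str], target: str = "MA"
) -> Dict[str, float]:
    """Percent agreement, sensitivity, and specificity for one target category.

    self_report is treated as the gold standard; recorded is the source
    being evaluated (e.g., death certificate or medical record).
    """
    pairs = list(zip(self_report, recorded))
    agree = sum(s == r for s, r in pairs)
    tp = sum(s == target and r == target for s, r in pairs)  # true positives
    fn = sum(s == target and r != target for s, r in pairs)  # false negatives
    tn = sum(s != target and r != target for s, r in pairs)  # true negatives
    fp = sum(s != target and r == target for s, r in pairs)  # false positives
    return {
        "percent_agreement": 100 * agree / len(pairs),
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
    }


# Invented toy data: five subjects, two of them recorded incorrectly.
toy_self_report = ["MA", "MA", "NHW", "NHW", "black"]
toy_recorded = ["MA", "NHW", "NHW", "MA", "black"]
stats = agreement_stats(toy_self_report, toy_recorded)
```

Percent agreement counts any mismatch across all categories, whereas sensitivity and specificity depend only on whether the target category was assigned.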
The institutional review boards of the University of Michigan, the University of Texas, Houston, and each of the Nueces County hospitals approved this project.
During the study period, January 1, 2000, to June 30, 2003, 20,460 cases were screened and evaluated for study eligibility. Of these, 4,392 were eligible and screened positive for “potential stroke.” A random subset of 1,856 interviews was conducted. There were 511 interviews available from subjects who subsequently expired. Thirty-one of the 511 interviews were excluded from the analysis. Of these, four subjects did not have death certificate data since the death was found by another data source such as the Social Security Death Index. Eleven subjects did not have complete self-reported race/ethnicity information. Four subjects of American Indian/Alaska Native race were excluded due to small numbers. Ten subjects were interviewed multiple times by BASIC. For these individuals, all interviews subsequent to the initial interview were excluded. One patient with two interviews provided inconsistent race/ethnicity information. Therefore, both of the interviews were excluded. Thus, 480 independent interviews (i.e., one interview per unique individual) remained in the dataset. Of the final 480, 43 were validated by neurologists as having “no stroke.” Two hundred seventy-four subjects were found to have completed ischemic stroke, 79 had intracerebral hemorrhage, 12 had subarachnoid hemorrhage, 78 had a transient ischemic attack, and six had strokes of unknown type as no neuroimaging was available. Three hundred and fourteen (65%) of the interviews were with a proxy and 166 were with the patient. However, 95% (297/314) of proxy interviews were with family members and the overwhelming majority of these were with an immediate family member (spouse, child, or sibling). The remaining 5% of proxy interviews were with friends, caregivers, powers of attorney, etc.
Of the 480 subjects, 259 self-reported race/ethnicity as NHW, 195 MA, and 26 non-Hispanic black. The median age of subjects was 78.5 years (interquartile range [IQR]: 71.5–85.9), and 55.8% were female. Percent agreement between self-reported race/ethnicity and race/ethnicity recorded on the death certificate and medical record was 97.1% and 96.3% respectively (Tables 1 and 2). The medical record was slightly more sensitive for recording of Hispanic ethnicity (sensitivity=96.9%) than the death certificate (sensitivity=95.4%), but slightly less specific (medical record 96.8% vs. death certificate 99.0%). In the case of the death certificate, nine MA subjects were incorrectly classified (false negatives) as NHW and five NHW subjects were incorrectly classified (false positives) as MA (3) or black (2). Results for the medical record were somewhat opposite, with five MA subjects incorrectly classified as NHW and 11 NHW subjects incorrectly classified as MA (9) or black (2). Two cases did not include race/ethnicity on the medical record. Percent agreement between self-reported race/ethnicity and race/ethnicity recorded on the death certificate did not differ depending on the use of proxy interviews (no proxy=97.1%; proxy=97.1%), nor did the percent agreement between self-report and the medical record (no proxy=96.6%; proxy=96.1%). Death certificates were slightly more sensitive for recording Hispanic ethnicity in those 75 years of age or older compared with younger ages (Table 3). Medical records were slightly more sensitive for recording Hispanic ethnicity in males compared with females. Percent agreement between race/ethnicity recorded on the death certificate and the medical record was 95.0%.
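As an arithmetic check, the death certificate sensitivity and percent agreement can be recovered from the counts given in the text. The sketch below is illustrative only: the variable names are hypothetical, and it assumes that the nine MA and five NHW misclassifications listed above are the only disagreements between self-report and the death certificate.

```python
# Counts taken from the text (death certificate vs. self-report).
n_total = 480
n_ma = 195                   # self-reported Mexican American
ma_recorded_as_nhw = 9       # false negatives for Hispanic ethnicity
nhw_recorded_otherwise = 5   # NHW recorded as MA (3) or black (2)

# Sensitivity: MA subjects correctly recorded as MA, i.e. 186 of 195.
sensitivity = 100 * (n_ma - ma_recorded_as_nhw) / n_ma

# Percent agreement: assumes the 14 listed misclassifications are the
# only disagreements, i.e. 466 of 480 records agree.
disagreements = ma_recorded_as_nhw + nhw_recorded_otherwise
percent_agreement = 100 * (n_total - disagreements) / n_total

# Share of MA subjects recorded as NHW on the death certificate.
ma_misclassified_pct = 100 * ma_recorded_as_nhw / n_ma

print(round(sensitivity, 1))        # 95.4
print(round(percent_agreement, 1))  # 97.1
print(round(ma_misclassified_pct))  # 5
```

Under that assumption the results line up with the reported 95.4% sensitivity, 97.1% agreement, and roughly 5% of MA subjects misclassified as NHW.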
There were 111 subjects with multiple strokes during the study time period and thus multiple recordings of race/ethnicity information from the medical record. Of these 111 subjects, nine (8.3%) had inconsistent recording of their race/ethnicity information over time in their medical record(s). Three of the nine documented inconsistencies were due to race/ethnicity not recorded for one of the events.
Establishing reliable information regarding mortality of and specific health risks to members of ethnic groups is currently an important focus of national public health.34 Our results indicated that race/ethnicity information reported on death certificates and medical records in this bi-ethnic, non-immigrant U.S. population is largely consistent with that of self-report, with accuracy reaching 97% and 96% respectively. There should be a reasonable degree of confidence in the accuracy of ethnic-specific mortality statistics in this population derived from these data. Although concordance was high, it did not reach 100%, with 5% of MA individuals misclassified as NHW on their death certificate and 3% on their medical record.
Comparison of our findings with those of other studies, while useful, is challenging due to the different populations studied and approaches employed. Sorlie and colleagues6 used race/ethnicity data from the Current Population Surveys as the gold standard and compared this to race/ethnicity on the death certificate. They estimated agreement of 89.7% for any Hispanic designation and 84.9% for Hispanics of Mexican origin. Our estimate for sensitivity, which was calculated in the same manner, was higher at 95.4%. Poe and colleagues used a counter approach and specified the death certificate as the gold standard; they estimated agreement at 98.9% for the reporting of Hispanic ethnicity in the National Mortality Followback Survey.5 Although not presented as part of our results because we believed self-report should be considered the gold standard, our data if recalculated in this manner are in line with the findings of Poe and colleagues at 98.4%. Other work on the misclassification of race/ethnicity on the death certificate has addressed the issue from different perspectives, including inconsistencies with the reporting of race/ethnicity on birth and death records;8,10,15 errors in the recording of race on death certificates of American Indians, Alaskan Natives, and Asian Americans;7,9,13,35,36 inconsistencies between race-ethnicity on death certificates and AIDS surveillance data;25 and errors in the recording of race on death certificates of multiple-race individuals.11 Thus, direct comparison with these studies is not warranted.
The discrepancy between our findings and those of Sorlie and colleagues6 is perhaps due to the different populations studied and a trend over time in the growth of the Hispanic American population and more specifically the MA population. The demographic composition of Nueces County is more than 50% MA, as opposed to previous work conducted using national population-based samples,5,6,12 which included smaller proportions of Hispanics. Indeed, it is likely that the individuals recording race/ethnicity information in this community were more familiar with individuals of Hispanic ethnicity or were themselves Hispanic. Research in the U.S. American Indian population has suggested that accuracy of race coding on death certificates is improved in geographic areas more heavily populated by American Indians10,36 or with a higher percentage of American Indian ancestry.7 Although these data are based on a different race/ethnic minority group, such a trend is likely in other geographic regions with high minority representation.
Results of the present study suggest that there is still room for improvement in the recording of race and ethnicity data on medical records and death certificates, even in this population that is more than 50% MA. Funeral directors could benefit from training on completion of death certificates, including information about race/ethnicity classification and the importance of accurately reporting race/ethnicity data on these documents. In a national survey of funeral directors, 52% of respondents indicated that they had never had formal training on completion of death certificates.37 Further, when asked if there were particular items on the death certificate that they had difficulty with, 26% of funeral directors indicated that recording of race information was problematic because of “inadequate criteria for judgment/unclear” and “people wonder why it is necessary.” Education and training would not only encourage funeral directors to be more diligent in the completion of these data, but would also provide them with the information they need to better explain the importance of this information to family members from whom they acquire the information. Indeed, some have suggested that mortuary personnel collect race and ethnicity information at the time of burial prearrangements to avoid emotional circumstances, and in many cases, to collect this information from the individuals themselves.12 Recording of race/ethnicity solely from assumptions, whether by familiarity with the family or surname, should be discouraged in the profession, as it is likely to lead to mistakes. Similar suggestions could be made for the recording of demographic data in medical records in addition to the establishment of structured protocols for collecting these data. Hospital staff should also be informed of the importance of accurately and completely recording race/ethnicity information.
The present study has a number of strengths. First, utilization of the BASIC study data allowed for analysis of a large sample of 480 individuals. Whereas most previous studies have focused solely on death certificate information, this study had the benefit of availability of self-report information in addition to data from both the death certificate and medical record, allowing for comparisons of all sources of demographic data. Also, whereas most previous studies examined ethnicity data from the nation as a whole, combining all Hispanic subgroups, the present study specifically examined a large group of Mexican Americans, an important and rapidly expanding subgroup in the U.S. population. Finally, contrary to most studies of the Hispanic population, the area studied was a non-immigrant community. Thus, the present study is more representative of the future distribution of minority groups in this country, with fewer being first-generation immigrants, and more being U.S.-born.
This study, however, has limitations, primarily with generalizability. It was conducted in one county in Texas with a large population of only one of the several Hispanic subgroups in the U.S. Therefore, the hospital employees and mortuary personnel recording race/ethnicity data on the medical records and death certificates are likely to be highly familiar with individuals of Hispanic ethnicity, and possibly Hispanic themselves. As a result, the 96% to 97% concordance found in this study is likely to be higher than would be found in other parts of the country with a smaller Hispanic population and possibly with different Hispanic subgroups. In addition, this study examined a population of patients 45 years of age and older with suspected stroke, so caution is advised regarding generalizability to the general population of Hispanic individuals. Finally, a large number of interviews were conducted with proxy subjects, though the vast majority were with immediate family members and the percent agreement between the different sources for race/ethnic classification did not vary with the use of proxy subjects.
In conclusion, results indicated that Hispanic designation recorded on death certificates and medical records in this community was largely consistent with that of self-report. Similar studies in other multi-racial, multi-ethnic communities should be conducted to confirm and generalize these results.
This study was funded by National Institutes of Health grant number R01 NS38916.