Small sample sizes in Asian, Hispanic, and Native American groups and misreporting of race/ethnicity across all groups (including blacks and whites) limit the usefulness of racial/ethnic comparisons based on Medicare data. The objective of this paper is to compare procedure rates for these groups using Medicare data, to assess how small sample size and misreporting affect the validity of comparisons, and to compare rates after correcting for misreporting.
We use 1997 physician claims data for a 5 percent sample of Medicare beneficiaries aged 65 and older to study cardiac procedures and tests.
We calculate age and sex-adjusted rates and confidence intervals by race/ethnicity. Confidence intervals are compared among the groups. Out-of-sample data on misreporting of race/ethnicity are used to assess potential bias due to misreporting, and to correct for the bias.
Sample sizes are sufficient to find significant ethnic and racial differences for most procedures studied. Blacks' rates tend to be lower than whites'. Asian and Hispanic rates also tend to be lower than whites', and about the same as blacks'. Sample sizes for Native Americans are very small (about 0.1 percent of the data); nonetheless, some significant differences from whites can still be identified. Biases in rates due to misreporting are small (less than 10 percent) for blacks, Hispanics, and whites. Biases in rates for Asians and Native Americans are greater, and exceed 20 percent for some procedures.
Sample sizes for Asians, blacks, and Hispanics are generally adequate to permit meaningful comparisons with whites. Implementing a correction for misreporting makes Medicare data useful for all ethnic groups. Misreporting race/ethnicity and small sample sizes do not materially limit the usefulness of Medicare data for comparing rates among racial and ethnic groups.
Research on racial and ethnic disparities in health care in the United States has mainly compared blacks and whites (Institute of Medicine 2002; Mayberry, Mili, and Ofili 2000). Pioneering and influential research on treatment for cardiac care and cancer, for example, does not include Asians, Hispanics, or Native Americans (Sheifer, Escarce, and Schulman, 2000; Blustein, Arons, and Shea 1995; Ball and Elixhauser 1996; Bach et al. 1999). Identifying disparities arising from within the clinical encounter requires that insurance, other system factors, and health status be ruled out as possible causes for differential treatment. Detailed medical and financial records are used to isolate clinical discrimination in research by Bach et al. (1999) for cancer and Watson et al. (2001) for cardiac care, and in many other studies. This type of research is costly per case studied, and is usually conducted in one or a few clinical settings. Comparison across more than two ethnic groups is generally not feasible. Other research uses large national databases within a given payment system (e.g., Medicare) to control for insurance factors, and then relies on statistical adjustment to control for health status. This second type of research is inferior in terms of controls for health status, but has the advantage of being better able to document the lay of the land in terms of disparities across many treatment areas (Escarce et al. 1993; Gornick et al. 1996), and potentially for the many ethnic and racial groups.
Two problems limit the utility of large national databases for comparing rates of use for nonblack minorities. First, sample sizes of these groups are much smaller, particularly among the elderly, than for blacks and whites. Second, the ethnic/racial group is sometimes not recorded correctly. Arday et al. (2000) recently compared the reporting of race/ethnicity in two Medicare databases and concluded that despite recent improvements in accuracy, “one cannot yet utilize all the other [i.e., nonblack, nonwhite] categories with equal confidence.” In a recent study, Sehgal (2003) used Medicare files to study disparities in hemodialysis between blacks and whites. In a companion editorial Aaron and Clancy (2003) note the limitations of Medicare data and call for research to enable extension of such comparisons to other ethnic groups using Medicare data. Given the importance of information for nonblack minorities, it is worthwhile to pursue the question of the degree to which small samples and misreporting limit the validity of comparisons involving these groups. This paper conducts this assessment.
Medicare is one of the most important sources of payment for health care in the United States and one of the most important sources of data for comparing rates of health care use for racial and ethnic groups. Medicare provides health insurance to virtually every person in the United States over 65 years old, about forty million people in total. Medicare records information about race and ethnicity at enrollment. In 1997, 87.0 percent of Medicare beneficiaries were white, 8.4 percent black, 2.3 percent Hispanic, 1.1 percent Asian, with the balance distributed among American Indian/Alaska Native, Other, and Unknown. Health care data in Medicare, based on paid claims, are highly reliable. In the paper serving as a model for the present paper (Escarce et al. 1993), differences in health care use between blacks and whites for selected procedures and tests were studied using Medicare data for 1986. For most services, and particularly for newer or high technology services, Escarce et al. found whites had age–sex adjusted rates of use exceeding that for blacks. We apply Escarce et al.'s (1993) methodology for defining rates of use to all ethnic and racial groups.
Two data files were merged to combine information on health care use with information about the beneficiary. The 1997 Part B Beneficiary File available from the Centers for Medicare and Medicaid Services (CMS) contains a detailed record for physician and other services paid for in the fee-for-service system in Medicare for a 5 percent sample of beneficiaries. Information about the beneficiaries is contained in the Enrollment Data Base (EDB), also maintained by CMS. This information includes age, sex, race/ethnicity, date of Medicare eligibility, Medicaid eligibility status, enrollment status in a Health Maintenance Organization (HMO), and date of death, if applicable.
Data on race and ethnicity come from the Social Security Administration's Master Beneficiary Record and from surveys recently conducted by CMS. In a special analysis for 1997, researchers at CMS compared the reported race/ethnicities from the EDB with the more detailed (and accurate) information from the Medicare Current Beneficiary Survey (MCBS) (Arday et al. 2000). CMS employs six mutually exclusive categories: white (non-Hispanic), black (non-Hispanic), Hispanic, Asian/Pacific Islander, American Indian, and Other. Data on race and ethnicity are collected in the MCBS in face-to-face interviews (Adler 1994). The MCBS asks separate questions for race and ethnicity, as required by the Office of Management and Budget (1997). Following Arday et al. (2000), this paper uses the MCBS as a standard for assessing misreporting.
Medicare beneficiaries less than 65 years old or those with end stage renal disease were excluded, as were individuals with partial-year enrollments due to death or other reasons. Those enrolled in HMOs were also excluded.1 After applying exclusions, our study sample consisted of 1,547,000 elders.
We began with the ten cardiac procedures used in Escarce et al. (1993), and added five new cardiac procedures (the last five in the tables) that were not in common use in 1986. Study services were defined by the CPT-4 code used as a basis for procedure information in Medicare (American Medical Association 1997). Procedures and tests may be defined by more than one code. Some procedures were only counted if they occurred following other diagnostic procedures (e.g., coronary angiography among beneficiaries with an exercise stress test). For purposes of rate calculations, we counted only one procedure/test of each type for each person per year. Our rates therefore should be interpreted as the rate of persons with at least one procedure/test in each category per year. The algorithms used to define rates are available upon request.
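The counting rule described above (at most one procedure/test of each type per person per year) can be illustrated with a minimal sketch. The function name, the category labels, and the claims records are hypothetical; the actual algorithms, which are keyed to CPT-4 codes, are not reproduced here.

```python
from collections import defaultdict


def persons_with_procedure(claims):
    """Count distinct beneficiaries with at least one claim per category.

    `claims` is an iterable of (beneficiary_id, procedure_category) pairs
    for a single calendar year. Repeated claims for the same person and
    category are counted once.
    """
    people = defaultdict(set)
    for beneficiary_id, category in claims:
        people[category].add(beneficiary_id)
    return {category: len(ids) for category, ids in people.items()}


# Hypothetical claims: beneficiary "A" has two stress tests in the year.
claims = [
    ("A", "stress_test"),
    ("A", "stress_test"),
    ("B", "stress_test"),
    ("A", "angiography"),
]
counts = persons_with_procedure(claims)
```

Dividing each count by the number of eligible beneficiaries in a group then gives the rate of persons with at least one procedure/test per year.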
We calculated age- and sex-adjusted rates of use among the elderly for each group for each service studied. To focus on misreporting and sample size, we do not adjust for other factors that might account for differences in rates among the population groups, such as state, urban/rural residence, hospital, or general medical risk.
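As an illustration of age- and sex-adjustment, the following sketch applies direct standardization, the method named in the statistical methods below. The strata, weights, and stratum-specific rates are hypothetical and are not taken from the study data.

```python
def directly_standardized_rate(stratum_rates, standard_weights):
    """Weighted average of stratum-specific rates.

    `stratum_rates` maps an (age band, sex) stratum to a group's rate in
    that stratum; `standard_weights` maps the same strata to the standard
    population's share (weights need not sum to exactly 1).
    """
    total_weight = sum(standard_weights.values())
    weighted_sum = sum(
        stratum_rates[stratum] * weight
        for stratum, weight in standard_weights.items()
    )
    return weighted_sum / total_weight


# Hypothetical standard population shares and stratum-specific rates.
weights = {
    ("65-74", "F"): 0.30,
    ("65-74", "M"): 0.25,
    ("75+", "F"): 0.28,
    ("75+", "M"): 0.17,
}
group_rates = {
    ("65-74", "F"): 0.010,
    ("65-74", "M"): 0.020,
    ("75+", "F"): 0.015,
    ("75+", "M"): 0.030,
}
adjusted_rate = directly_standardized_rate(group_rates, weights)
```

Because every group is weighted by the same standard population, differences in adjusted rates are not driven by differences in the groups' age and sex composition.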
We define five mutually exclusive population groups: American Indian/Alaska Native, Asian/Pacific Islander, black, Hispanic, and white. These are the same groups used in both CMS data sources. The “Other” group is ignored in our correction. Groups are indexed by i or j. We are interested in knowing the true rate of a procedure for each group, defined as ri for group i. We assume that groups are homogeneous in the sense that the rate of use for each person in group i is ri. The true number of people in each group is ni, but we do not observe this number directly. In studies of this kind, the data available are reports of race and ethnicity, some of which are incorrect.2
Data available are the number of people reported to be in group j, $\hat{n}_j$, and the rate of use for these people, $\hat{r}_j$. We can relate the reported to the true rate of use as follows. Define $p_{ij}$ to be the probability that someone who reports themselves in group j is actually a member of group i. Note that $p_{jj}$ is the probability that someone reporting group j is actually a member of group j. The $p_{jj}$ is known as the positive predictive value (ppv) for each group. The reported rate of use in group j can be expressed as a weighted average of the true rates for all the groups who report themselves to be a member of group j. Thus,
$$\hat{r}_j = \sum_{i} p_{ij}\, r_i \qquad (1)$$
Expression 1 can be regarded as five linear equations in five unknowns, the ri's. With information on the pattern of misreporting, the pij's, it is straightforward to solve for the true rates, the ri's.3
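The correction can be sketched numerically as follows. Each reported rate is a weighted average of the true rates, with weights given by the probabilities that a person reporting group j truly belongs to group i. The matrix below puts the ppvs reported in Table 2 on the diagonal, but every off-diagonal entry and every rate is hypothetical, chosen only so that each row sums to 1.0. The small Gaussian-elimination solver is an illustrative stand-in for any linear solver.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


# Order: white, black, Hispanic, Asian, Native American.
# Row j = reported group, column i = true group, so each row sums to 1.0.
# Diagonals are the Table 2 ppvs; off-diagonals are HYPOTHETICAL.
p = [
    [0.954, 0.005, 0.036, 0.003, 0.002],
    [0.040, 0.943, 0.010, 0.004, 0.003],
    [0.015, 0.003, 0.977, 0.003, 0.002],
    [0.180, 0.020, 0.040, 0.753, 0.007],
    [0.200, 0.030, 0.040, 0.008, 0.722],
]

# HYPOTHETICAL true rates, used only to generate reported rates.
true_rates = [0.050, 0.035, 0.038, 0.030, 0.032]
reported_rates = [
    sum(p[j][i] * true_rates[i] for i in range(5)) for j in range(5)
]

# Correction: recover the true rates from the reported rates.
corrected_rates = solve_linear_system(p, reported_rates)
```

In this made-up example the reported Asian rate exceeds the true Asian rate because members of higher-rate groups are mixed into the reported Asian group; solving the system removes that contamination.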
For purposes of this paper, we wrote to officials at CMS and obtained the cross-tabulation of race and ethnicity as recorded in the Enrollment Data Base (EDB) and as reported in the Medicare Current Beneficiary Survey (MCBS).4 We used these data to calculate the pij's.5 The MCBS is regarded as more accurate than the EDB for purposes of this data element, and we refer to the rates after application of the correction contained in (1) as the “corrected” rates.
Adjustment for age and sex is by direct standardization using 1997 population weights. We assessed differences in rates of use by comparing each ethnic/racial minority to whites using relative risks (RRs) adjusted for age and sex using the Mantel-Haenszel method (Kleinbaum, Kupper, and Morgenstern 1982). We constructed test-based confidence intervals (CIs) for the relative risks to assess the effect of sample size on our estimates. We compared the magnitude of the mean differences across groups, and the frequency with which differences from whites are found to be significant for each of the minority group's rates.
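A minimal sketch of a Mantel-Haenszel relative risk with test-based confidence limits is given below. It uses the standard textbook formulas for stratified cumulative-incidence data and is not the authors' own program; the stratum counts are hypothetical. The index group plays the role of whites, so an RR above 1.0 means the index group uses more.

```python
import math


def mantel_haenszel_rr(strata, z=1.96):
    """Mantel-Haenszel RR with test-based confidence limits.

    `strata` is a list of (a, n1, b, n0) tuples, one per age-sex stratum:
    a events among n1 persons in the index group, and b events among n0
    persons in the comparison group. Assumes the Mantel-Haenszel
    chi statistic is nonzero.
    """
    numerator = denominator = 0.0
    observed = expected = variance = 0.0
    for a, n1, b, n0 in strata:
        total = n1 + n0
        events = a + b
        non_events = total - events
        numerator += a * n0 / total
        denominator += b * n1 / total
        observed += a
        expected += n1 * events / total
        variance += n1 * n0 * events * non_events / (total**2 * (total - 1))
    rr = numerator / denominator
    chi = abs(observed - expected) / math.sqrt(variance)
    # Test-based limits: RR raised to the power (1 +/- z / chi).
    limit_a = rr ** (1 - z / chi)
    limit_b = rr ** (1 + z / chi)
    return rr, min(limit_a, limit_b), max(limit_a, limit_b)


# Hypothetical counts for two strata.
strata = [
    (500, 10000, 30, 1000),
    (800, 12000, 45, 900),
]
rr, lower, upper = mantel_haenszel_rr(strata)
```

The width of the interval depends on the chi statistic, which shrinks as the comparison group gets smaller; this is the channel through which small minority samples widen the confidence intervals.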
Table 1 shows the characteristics of the elderly sample by racial/ethnic group, ordered by size. Elderly whites were 88.4 percent of the study population, and elderly blacks 7.5 percent. The numbers for the other groups were much smaller: Hispanics (1.9 percent), Asians (1.0 percent), and Native Americans (0.1 percent). Elderly blacks are more likely than elderly whites to be female, to be younger, and to live in an urban county. In spite of being younger and more often female, blacks die at a higher rate than whites. Hispanics, Asians, and Native Americans are more balanced in terms of gender than either blacks or whites, but tend to be even younger than blacks. Hispanics and Asians are the most urban, whereas Native Americans are by far the least urban of any group. The unadjusted death rate for these three smaller minorities is equal to or below that of whites, probably due to their younger age. The largest difference between blacks and whites is in the rate of Medicaid coverage: 26.7 percent of elderly blacks in Medicare also have Medicaid, whereas only 6.9 percent of whites do. Hispanics, Asians, and Native Americans have very high rates of Medicaid coverage: 38.7 percent, 54.8 percent, and 40.4 percent, respectively.
Table 2 contains information on the pattern of race/ethnicity reporting in the EDB and the MCBS, both for 1997. An entry in the table shows the fraction of the respondents in the row group (EDB) who reported themselves to be in the various groups in the MCBS. The fractions sum to 1.0 along a row. The total number of respondents was 15,184, distributed in terms of the entry in the EDB according to the last column of Table 2. Diagonals in Table 2 are the ppvs: .954 for whites, .943 for blacks, .977 for Hispanics, .753 for Asians, and .722 for Native Americans. In general, the entries in the table are the pij's from equation 1. With this information, we can apply expression 1 to solve for the corrected rates of use.
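To make the construction of Table 2 concrete, the sketch below converts a cross-tabulation of counts into row fractions and reads the ppvs off the diagonal. The three-group table of counts is hypothetical and much smaller than the actual EDB-by-MCBS cross-tabulation.

```python
def row_fractions(crosstab):
    """Convert counts into row fractions.

    Rows are the group recorded in the EDB; columns are the group
    reported in the MCBS. Each returned row sums to 1.0.
    """
    return [[count / sum(row) for count in row] for row in crosstab]


def positive_predictive_values(fractions):
    """The ppv for each group is the diagonal entry of its row."""
    return [fractions[k][k] for k in range(len(fractions))]


# Hypothetical counts for three groups (rows: EDB, columns: MCBS).
crosstab = [
    [950, 10, 40],
    [30, 960, 10],
    [20, 5, 75],
]
fractions = row_fractions(crosstab)
ppvs = positive_predictive_values(fractions)
```

The third group in this example has the fewest respondents and the lowest ppv, the same pattern Table 2 shows for Asians and Native Americans.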
If we regard the MCBS as the more accurate information, we can see the pattern of “mistakes” in the EDB. More than three-quarters of those misclassified as whites on the EDB are Hispanic on the MCBS. Also, half or more of those misclassified as blacks, Hispanics, and Asians on the EDB are whites on the MCBS. There are too few Native Americans among the respondents to say much about the pattern of misclassification for this group.
Expression 1 makes clear the importance of a high ppv when we correct for misreporting. Expression 1 can be rewritten slightly as
$$\hat{r}_j = p_{jj}\, r_j + \sum_{i \neq j} p_{ij}\, r_i \qquad (1')$$
Expression 1′ simply takes the ppv for group j, $p_{jj}$, out of the summation sign for emphasis. When $p_{jj}$ is near 1.0, the reported rate $\hat{r}_j$ will be very close to the true rate $r_j$.
To illustrate the importance of a high ppv, suppose many Hispanics misreport themselves as non-Hispanic whites, but the ppv for Hispanics remains high. This would occur if few people in other groups misreport themselves as Hispanic. Then, with the high ppv (those who say they are Hispanic really are), the reported Hispanic rate would still be close to the true rate.
If the predominant form of misreporting is minority groups misreporting themselves to be white, then the ppv can be high for all groups. It is high for each minority because very few whites call themselves minority group members (and other minority groups are [by definition] small). The ppv for whites is also high because in spite of the fact that the white group is the destination for most misreports, the white group is very large, and the misreports are a small fraction of the total of reported whites. In Medicare data, the ratio of whites 65 years old or older to Hispanics, for example, is nearly 50:1, and 100:1 for Asians. The white group can readily withstand some misclassified Asians and Hispanics with little effect on the accuracy of the white estimate.
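The arithmetic behind this argument can be checked with a small sketch. The 50:1 ratio of whites to Hispanics comes from the text; the misreporting shares (20 percent of Hispanics reporting as white, 0.1 percent of whites reporting as Hispanic) are hypothetical values chosen for illustration.

```python
def ppv(correct_reporters, incoming_misreports):
    """Share of a reported group who truly belong to it."""
    return correct_reporters / (correct_reporters + incoming_misreports)


whites_per_hispanic = 50.0        # ratio stated in the text
hispanic_share_as_white = 0.20    # HYPOTHETICAL misreporting share
white_share_as_hispanic = 0.001   # HYPOTHETICAL misreporting share

# Per one true Hispanic: nearly 50 whites report white, joined by 0.2 of a
# Hispanic person who reports white.
ppv_white = ppv(
    whites_per_hispanic * (1 - white_share_as_hispanic),
    hispanic_share_as_white,
)

# Per one true Hispanic: 0.8 report Hispanic, joined by 0.05 whites.
ppv_hispanic = ppv(
    1 - hispanic_share_as_white,
    whites_per_hispanic * white_share_as_hispanic,
)
```

Under these assumptions the white ppv stays above 0.99 even though a fifth of Hispanics misreport as white, while the Hispanic ppv is far more sensitive: a tiny share of whites misreporting as Hispanic already moves it noticeably below 1.0.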
Table 3a contains reported rate information for all groups for the 15 procedures studied in this paper, adjusted for age and sex. Table 3b contains the rates after correction for misreporting. The rates in Table 3b are found as the solution to the series of equations in expression 1. We compare the uncorrected results in Table 3a to the “corrected” results in Table 3b to assess the effect of misreporting in Medicare. Our most important result is the following: for whites, blacks, and Hispanics, all rates in Table 3b are within 10 percent of the rates in Table 3a. For Asians and Native Americans, however, the correction matters more. The corrected rates for these two groups are generally lower than the reported rates, in some cases substantially. For Asians, in 8 of 15 cases, the reported rates are 10 percent or more too high in relation to the corrected rates, and in 3 of these the reported rates are 20 percent or more too high. The story is about the same for Native Americans. The main reason for this is the lower ppv for these groups. Non-Asians and non-Native Americans are more frequently mixed in with these groups, and the mix-ins have higher rates. After correcting for this, the rates for these two groups fall.
Table 4 contains information about the relative risks for procedures among populations. We use this to assess the power to detect differences among the groups. Relative risks reported here use the data from Table 3a. Table 4 shows the relative risks for whites in relation to each minority. The numbers in the black column, for example, are the ratio of the adjusted rate of use of whites to blacks for each procedure. A relative risk (RR) greater than 1.0 means whites use more. The 95 percent CI is shown for the reported RRs. Reported RRs with CI ranges all above 1.0 are listed in bold and those with CI ranges below 1.0 are put in italics.
Looking at the reported RRs, it is clear that whites generally use more than other groups. Of the 15 RRs for blacks, 12 are greater than 1.0, and three are less. All RRs for blacks are significantly different from 1.0. The effect of sample size can be seen when examining the RRs for the nonblack minorities. In the case of Hispanics, 10 RRs are significantly greater than 1.0, three are significantly less than 1.0, and two are not significantly different from 1.0. As a simple summary measure, we compute the average of the RRs across the 15 procedures for each group. The average RR for Hispanics is 1.31, a little less than for blacks (1.49). One thing to keep in mind is that the health status of minorities in Medicare tends to be worse than for whites (Bierman, Haffer, and Hwang 2001), so if adjustment were made for underlying condition, the minority–white rates would diverge even more.
The average RRs for Asians, Native Americans, and Others are 1.53, 1.46, and 2.89, respectively. For Asians, only 1 percent of the data, we still find 12 of 15 procedure RRs significantly greater than 1.0, with 3 not significantly different from 1.0. Even for Native Americans, about 0.1 percent of the data, five of 15 procedure RRs are significantly greater than 1.0. Another way to assess the impact of small samples is to examine the confidence intervals in Table 4. There are nearly eight times as many blacks in the data as Asians (116,406 versus 15,035). The CIs for the white–black relative risks are about half as wide as the CIs for the white–Asian comparisons. The CIs for white–Hispanic comparisons (there are 30,067 Hispanics) are closer to the black CIs than to the Asian CIs. The CIs for the least numerous group, the Native Americans (1,652), are of course the widest.
As the 2002 Institute of Medicine report, Unequal Treatment, makes clear, health care disparities are a problem for Native Americans, Asians, and Hispanics, as well as for blacks. Although groups for which numbers are small present the greatest challenges for research, these are also the groups for which current information is less available. Comparative research including nonblack minorities is clearly a high priority.
Use of race and ethnicity in epidemiology and service research is complex and controversial (Lillie-Blanton and LaVeist 1996; Williams, Lavizzo-Mourey, and Warren 1996). The U.S. Census Bureau definitions tend to govern the coding of race/ethnicity in government databases; these have recently become more “fine,” allowing for more and multiple responses. (See the papers from a symposium on Race/Ethnicity in the 2000 Census published in the American Journal of Public Health, November 2000, vol. 90, no. 11.) Some advocates and researchers argue that collecting data on race/ethnicity and studying differences does more harm than good (Stolley 1999), but a more widely held view is that reliable race/ethnicity data are essential to “monitor progress or setbacks” in inequalities in health and health care use (Krieger 2000).
Medicare is the single most important payer of health care in the United States, and is a natural source of study for all racial and ethnic groups. In general, the findings of this paper support the usefulness of expanding applications of Medicare data beyond black/white comparisons. In terms of sample size, while the number of beneficiaries among nonblack minorities is much smaller than blacks, when rate comparisons are made in the cardiac area, the smaller numbers still have sufficient power to find many significant differences. We conducted our comparison with one year's data only; a natural next step would be to increase power by including multiple years, and by extending comparisons to other clinical areas.
More recent data would also reinforce the utility of data for nonblack minorities. As the Asian and Hispanic populations grow (and age), more data will become available for these population groups at a faster rate than for blacks or whites. To illustrate the impact of aging alone, in the Medicare data analyzed here, there are 63 whites for every Asian or Hispanic enrollee aged 85 or greater, but there are only 25 whites for every Asian or Hispanic beneficiary aged 65–69.
In health care databases, information on race/ethnicity is collected in a variety of ways, including self-report, report by proxy (such as a relative), or recorded by an observer (such as an admission official at a hospital). While improving the accuracy of data is the direct way to deal with misclassification, this may not always be feasible. The main implication of our paper for data collection is to demonstrate the utility of special studies on race/ethnicity that reveal the pattern of misreporting. In some cases, a special study of how race/ethnicity is reported in relation to the actual race/ethnicity could be the most effective way to derive accurate group-specific estimates.
As racial/ethnic categories are made more “fine,” the issues of misclassification and sample size become more salient. For example, the thirty thousand Hispanics in the Medicare data are composed of Cubans, Mexican-Americans, Puerto Ricans, and others—groups with distinct patterns of health care use (Wolinsky et al. 1989). Application of the methods outlined here to subgroup analysis may be particularly useful.
One limitation of our study is important to keep in mind. We have assumed that misclassification is “random” in the sense that those from a group who misreport their race/ethnicity have the same rates of use as those who report correctly. There are reasons to think this assumption is inaccurate. For example, those minorities who “misreport” themselves to be white might have true rates closer to those of the white group. In this case, an “uncorrected” comparison might be more meaningful. It would be worth considering how sensitive the findings here are to violations of the random-misclassification assumption.
Misreporting of race/ethnicity in standard Medicare files presents less of a barrier to research than may have been previously thought. Even with no correction for misreporting, we find that rates for whites, blacks, and Hispanics are accurate within 10 percent. Furthermore, a correction using information about the patterns of misreporting is readily applicable to Medicare, bringing Native Americans and Asians within the groups for which comparisons can be usefully made.
We are grateful to Zhun Cao and JiTian Sheu for programming assistance. Margarita Alegria, Susan Arday, Robert Arday, Ana Balsa, Kevin Lang, Robert Roberts, and Alan Zaslavsky provided helpful comments on an earlier draft.
1Differential enrollment rates among racial/ethnic groups might introduce differences in rates due to selection effects on unobserved health status. We analyzed the enrollment rates in HMOs in Medicare by ethnic group, and even after adjustment for gender and age, the black (18.8%) and white (17.1%) rates of enrollment are quite a bit less than the rate for Asians (29.0%) and Hispanics (23.6%). The adjusted rate for Native Americans was 8.6%. Differential rates of HMO enrollment may therefore account for some of the Asian/white and Hispanic/white differences, but not the black/white differences.
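3In matrix notation, $\hat{r}$ is a (5×1) vector of reported rates, $p$ a (5×5) matrix of the $p_{ij}$ with rows indexed by the reported group j (as in Table 2), and $r$ a (5×1) vector of true rates. Then $\hat{r} = p\,r$ and $r = p^{-1}\hat{r}$, where $p^{-1}$ is the inverse matrix of $p$.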
4We are grateful to Susan Arday of CMS for providing us with this information.
5The pij's we calculate could, themselves, be regarded as estimates, in which case an additional source of error would need to be recognized in the estimates of the corrected rates.
This research was supported by grant P01 MH59876 and R01 MH 59254 from the National Institute of Mental Health.