HHS Public Access author manuscript; accepted for publication in a peer-reviewed journal.
Health Serv Outcomes Res Methodol. Author manuscript; available in PMC 2010 March 25.
Published in final edited form as:
Health Serv Outcomes Res Methodol. 2009 March 1; 9(1): 1–21.
doi:  10.1007/s10742-008-0039-6
PMCID: PMC2845167

Adjusting for Health Status in Non-Linear Models of Health Care Disparities


This article compared conceptual and empirical strengths of alternative methods for estimating racial disparities using non-linear models of health care access. Three methods were presented (propensity score, rank and replace, and a combined method) that adjust for health status while allowing SES variables to mediate the relationship between race and access to care. Applying these methods to a nationally representative sample of blacks and non-Hispanic whites surveyed in the 2003 and 2004 Medical Expenditure Panel Surveys (MEPS), we assessed the concordance of each of these methods with the Institute of Medicine (IOM) definition of racial disparities, and empirically compared the methods' predicted disparity estimates, the variance of the estimates, and the sensitivity of the estimates to limitations of available data. The rank and replace and combined methods (but not the propensity score method) are concordant with the IOM definition of racial disparities in that each creates a comparison group with the appropriate marginal distributions of health status and SES variables. Predicted disparities and prediction variances were similar for the rank and replace and combined methods, but the rank and replace method was sensitive to limitations on SES information. For all methods, limiting health status information significantly reduced estimates of disparities compared to a more comprehensive dataset. We conclude that the two IOM-concordant methods were similar enough that either could be considered in disparity predictions. In datasets with limited SES information, the combined method is the better choice.

Keywords: Racial disparities, statistical adjustment for health status, propensity score, rank and replace

1. Introduction

In Unequal Treatment, the Institute of Medicine (IOM) defines a health services disparity as the difference in treatment provided to members of racial (or ethnic) minorities that is not justified by the health status or treatment preferences of the patient (IOM 2002). When implementing the IOM definition, estimates of “disparities” should be adjusted for clinical appropriateness and need, but not for socio-economic status (SES) variables (Cook 2007; McGuire et al. 2006; Zaslavsky and Ayanian 2005). Using a health care survey, clinical appropriateness and need can be approximated by health status and closely related variables. For example, lesser use of health care by blacks because they are younger or healthier than whites would not constitute a disparity; thus estimates of disparities should adjust for measures of age and health. On the other hand, less health care use by blacks because they disproportionately live in poverty does constitute a disparity according to the IOM definition, and thus SES variables like poverty status should not be held constant across racial groups in disparity estimation.

Although the IOM definition is clear in principle, translation of the definition to empirical work is not straightforward in the nonlinear models typically used to model health care utilization. This paper evaluates three methods for implementation of the IOM definition of racial/ethnic disparities in the context of these non-linear models, assessing their concordance with the IOM definition, applying them to an analysis of racial disparities in total medical expenditure, and assessing their sensitivity to missing SES and health status variables.

1.1 Background

In health services research, estimates are often adjusted to hold health status variables constant across individuals or groups to isolate effects of non-health variables. For example, quality-of-care reports may be adjusted for health status to provide comparisons among hospitals (Hargraves et al. 2001; Kupersmith 2005; O'Malley et al. 2005) and physicians (Hofer et al. 1999). Capitated payments to health plans are risk-adjusted by elements of health status to pay plans fairly for the enrollees they attract (Ellis and van de Ven 2000; Howland et al. 1987; Zaslavsky et al. 2001). Similarly, the IOM definition of disparities suggests that comparisons should hold the health-related case mix constant across racial groups.

The operation of health care systems and their legal and regulatory climate may affect individuals of different races differently, thus contributing to health care disparities. These differences are in part mediated by SES variables. For example, individuals of lower SES may have more difficulty paying for care and navigating the health care system. If this leads to poorer health care, and minorities are disproportionately represented in lower SES categories, then SES mediates disparities. The quality of health care also varies widely across geographic regions, engendering disparities in utilization of some medical procedures because minorities are disproportionately located in low use areas (Baicker et al. 2004; Chandra and Skinner 2003; Skinner et al. 2003; Zaslavsky and Ayanian 2005). On the other hand, race may influence health care independent of these geographic or SES effects, perhaps through discriminatory practices or cultural “incompetence” that interfere with access to and utilization of healthcare.

Clinically-based measurement of disparities tries to compare members of different groups with similar health status by selecting study participants based on their appropriateness for a specific procedure or course of treatment. For example, clinically-based studies measuring disparities in cardiac care treatment and outcomes have compared minorities and whites presenting at cardiac care centers with similar symptoms, finding that African-Americans have lower rates of cardiac catheterization (Ford, Newman, and Deosaransingh 2000; Kressin et al. 2004; Kressin and Petersen 2001), other cardiac tests (Ford et al. 2000), bypass surgery (Kressin and Petersen 2001; Petersen et al. 2002), coronary angioplasty and bypass surgery after cardiac catheterization (Ford et al. 2000; Ibrahim et al. 2003; Kressin and Petersen 2001; Peterson et al. 1997), and overall quality of cardiac care (Ayanian et al. 1999).

Because the inclusion criteria for these studies are based on health status but not SES factors, the distributions of health status in the populations are approximately the same, whereas the SES distributions may be different. Thus, a simple comparison of quality and access (with no further adjustment for SES factors) yields a disparity measure concordant with the IOM definition, including racial differences mediated through SES.1

In contrast, population-based observational surveys usually lack the information required to set strict exclusion criteria based on clinical need.2 In this setting, to adjust for health status variables but not SES variables requires the creation of a hypothetical comparison group. To illustrate, consider a trial of a new treatment. The researcher cannot compare the treated group's outcomes to those that would have ensued without treatment, since the latter are counterfactual, i.e. not actually observed. Instead, she uses experimental or quasi-experimental methods to compare outcomes in the treatment group and a control group with similar characteristics to the treatment group except for exposure to the treatment. A difference between the observed and counterfactual outcomes can then be estimated.

In racial disparities studies, minority race is the “treatment” of interest, and the counterfactual group is a group identical in all aspects to a factual minority group, except for minority race status. Experimental randomization of the race of individuals is clearly not feasible, but other methods of balancing groups can be undertaken.3 For example, Schulman et al. (1999) presented physicians with videos of black and white actors matched on all characteristics except race to measure differences in diagnosis and treatment patterns associated with patient race. Economists and social scientists have matched black and white applications to test for discrimination in the housing and labor markets (Riach and Rich 2002). One recent study demonstrated discrimination in hiring practices by submitting job applications with given names typical of blacks and whites to signal membership in minority groups (Bertrand and Mullainathan 2004). Implementing the IOM definition of racial disparities also requires construction of a hypothetical control group with counterfactual distributions of health status variables. As we discuss below, although statistically implementing this counterfactual is straightforward in a linear model, it is not straightforward in the context of non-linear models.

1.2 Defining and Measuring Disparities

This section defines the distributional properties that we regard as necessary for a disparity method to be concordant with the IOM definition of racial/ethnic disparities in healthcare use. Methods used in previous disparities studies are then described and evaluated against these concordance criteria.

1.2.1. Empirical Definition of Disparity

Suppose we wish to estimate disparities in health care utilization by two racial/ethnic groups, which we will call “white” and “black”. Let expected health care expenditure be y = y(H,S,R), a function of a vector of health status variables (H), a vector of SES variables (S), and race R (for purposes of exposition assumed to be black or white). At a population level, the disparity can be defined as the difference between the average utilization of a counterfactual white population with black health status and the actual utilization of the factual black population:4

$$\text{Disparity} = y_{BWW} - y_{BBB} \tag{1}$$

where the three subscripts on y represent the distribution by race of H, S, and R, respectively. yBWW represents the expected value of a utilization function averaged over a black distribution of health status, a white distribution of SES, and assuming white race. Actual expenditure can be expressed as yBBB representing the expected value of utilization averaged over black distributions of health status and SES, assuming black race. The disparity is thus the difference in use between the two groups due to factors other than health status, because both estimates assume the same distribution of health status (that for blacks). The first estimate is a counterfactual because it assumes that white race and the corresponding SES distribution are combined with the black distribution of health variables.

Comparisons of factual and counterfactual distributions are common in causal inference (Rubin 1974). We formalize these counterfactual distributions in order to be precise about the comparisons we make. The factual black expenditure expectation can be expressed as:

$$y_{BBB} = \iint y(H, S, \text{Race} = \text{Black})\, f_{BB}(H, S)\, dH\, dS \tag{2}$$

where the function of health status and SES conditional on race y(H,S,Race = Black) is integrated over the joint density of health status and SES for blacks, fBB(H,S). Analogously, the counterfactual expenditure expectation, yBWW, can be expressed as:

$$y_{BWW} = \iint y(H, S, \text{Race} = \text{White})\, f_{BW}(H, S)\, dH\, dS \tag{3}$$

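where the expected utilization for whites is integrated over a counterfactual joint density of health status and SES, fBW(H,S). We do not observe fBW(H,S), but construct it using different approaches described in detail below. Disparities5 are then:

$$y_{BWW} - y_{BBB} = \iint y(H, S, \text{Race} = \text{White})\, f_{BW}(H, S)\, dH\, dS - \iint y(H, S, \text{Race} = \text{Black})\, f_{BB}(H, S)\, dH\, dS \tag{4}$$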

Key to implementing a definition of disparities based on (4) is the choice of fBW(H,S). The defining property of fBW(H,S) is that the marginal distribution of health status is like that of the black population while the marginal distribution of SES is that of whites:

$$\int f_{BW}(H, S)\, dS = f_B(H) \tag{5a}$$

$$\int f_{BW}(H, S)\, dH = f_W(S) \tag{5b}$$

The marginals are factual and observable, but the joint distribution fBW(H,S) is not. We will define a disparity measure based on (4) to be concordant with the IOM definition of racial disparities if fBW(H,S) has the marginal properties described in (5a-b). Property (5a) represents control for health status, since both expectations assume the same distribution of that variable, and (5b) addresses the goal of preserving the distribution of SES characteristics in the white population in the counterfactual fBW(H,S).

There are many hypothetical joint distributions fBW(H,S) that satisfy (5a-b), with various correlation structures of health status and SES. For example, if the distribution of white health status is made to match that of blacks by replacing all white health status indicators with values randomly drawn from the black sample, then the correlation between health status and SES for this hypothetical joint distribution will be close to 0. Other methods produce more plausible correlation structures that better approximate those of actual (white or black) joint distributions of health status and SES. In order to avoid causal assumptions about the relationship between health status and SES, we do not require a specific counterfactual joint distribution.
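The marginal conditions (5a-b) can be illustrated with a small discrete example. The sketch below is not the paper's procedure: the binary variables, the probabilities, and the log-link expenditure function `expected_use` are all hypothetical values chosen for illustration. It builds the zero-correlation counterfactual mentioned above (the product of the black health status marginal and the white SES marginal), confirms that both marginals hold, and evaluates the disparity in (4).

```python
import math

# Hypothetical factual joint distributions over binary H (1 = poor health)
# and binary S (1 = low SES), keyed by (h, s). All numbers are made up.
f_BB = {(0, 0): 0.30, (0, 1): 0.30, (1, 0): 0.10, (1, 1): 0.30}
f_WW = {(0, 0): 0.60, (0, 1): 0.20, (1, 0): 0.10, (1, 1): 0.10}

def marginal(joint, axis):
    """Marginal distribution over H (axis=0) or S (axis=1)."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out

f_B_H = marginal(f_BB, 0)  # black marginal of health status
f_W_S = marginal(f_WW, 1)  # white marginal of SES

# One admissible counterfactual f_BW: the independent product of the
# two marginals (health status and SES uncorrelated).
f_BW = {(h, s): f_B_H[h] * f_W_S[s] for h in f_B_H for s in f_W_S}

def expected_use(h, s, race):
    """Hypothetical non-linear (log-link) expenditure function y(H, S, R)."""
    race_effect = -0.2 if race == "black" else 0.0
    return math.exp(7.0 + 0.8 * h - 0.4 * s + race_effect)

def mean_use(joint, race):
    """Average expenditure over a joint distribution of (H, S)."""
    return sum(p * expected_use(h, s, race) for (h, s), p in joint.items())

y_BBB = mean_use(f_BB, "black")  # factual black expectation, as in (2)
y_BWW = mean_use(f_BW, "white")  # counterfactual expectation, as in (3)
disparity = y_BWW - y_BBB        # disparity, as in (4)

print("H marginal of f_BW:", marginal(f_BW, 0), "target:", f_B_H)
print("S marginal of f_BW:", marginal(f_BW, 1), "target:", f_W_S)
print("disparity:", round(disparity, 2))
```

Any other joint distribution with the same two marginals would also satisfy (5a-b), which is why the choice of correlation structure is left open in the text.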

1.2.2. Assessment of Previous Methods for Estimating Disparities

To implement the empirical definition of disparities in population-based surveys, the researcher must statistically adjust for health status variation, while at the same time incorporating SES variation into disparity calculations. Table 1 provides a framework for classifying disparity measurement methods, categorizing previous studies by whether they adjust for health status and SES or health status alone, and whether they rely on non-linear or linear models.

Table 1
Adjustment methods in health services racial disparities research

Methods Adjusting For All Variables Including Health Status And SES: Boxes 1 And 2

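“Race residual”, or residual direct effect (RDE), methods measure racial disparities by estimating the race coefficient (and possibly race interactions) in a multivariate regression, either linear (Box 1) or non-linear (Box 2), controlling for all other observable factors. In the notation introduced above, the RDE is:

$$\text{RDE} = y_{BBW} - y_{BBB} = \iint \left[ y(H, S, \text{Race} = \text{White}) - y(H, S, \text{Race} = \text{Black}) \right] f_{BB}(H, S)\, dH\, dS \tag{6}$$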

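The difference between the IOM definition (4) and the RDE is:

$$\left( y_{BWW} - y_{BBB} \right) - \text{RDE} = \iint y(H, S, \text{Race} = \text{White}) \left[ f_{BW}(H, S) - f_{BB}(H, S) \right] dH\, dS \tag{7}$$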

The term in (7) is the difference in utilization due to substitution of white for black SES (given health status and race). Thus, the RDE excludes from the disparity the racial differences mediated through SES differences.
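The relationship between the IOM-concordant disparity and the RDE can be checked numerically. The following is a minimal sketch under stated assumptions: the two joint distributions and the log-link function `y` are hypothetical, and the RDE is computed by holding the black distribution of health status and SES fixed while switching race, which is the reading of the RDE that makes the gap equal to the SES substitution term in (7).

```python
import math

# Hypothetical joint densities over binary H (1 = poor health) and
# binary S (1 = low SES). All probabilities are invented.
f_BB = {(0, 0): 0.30, (0, 1): 0.30, (1, 0): 0.10, (1, 1): 0.30}
# Hypothetical counterfactual f_BW: black marginal of H (0.6, 0.4) and
# white marginal of S (0.7, 0.3), combined as an independent product.
f_BW = {(0, 0): 0.42, (0, 1): 0.18, (1, 0): 0.28, (1, 1): 0.12}

def y(h, s, race):
    """Hypothetical log-link expenditure function y(H, S, R)."""
    race_effect = -0.2 if race == "black" else 0.0
    return math.exp(7.0 + 0.8 * h - 0.4 * s + race_effect)

def average(joint, race):
    """Expected expenditure over a joint distribution of (H, S)."""
    return sum(p * y(h, s, race) for (h, s), p in joint.items())

y_BBB = average(f_BB, "black")  # factual black expenditure
y_BBW = average(f_BB, "white")  # black H and S, white race
y_BWW = average(f_BW, "white")  # black H, white S, white race

iom_disparity = y_BWW - y_BBB   # includes SES-mediated differences
rde = y_BBW - y_BBB             # race effect with H and S held fixed

# SES-mediated component: white-race expenditure integrated over the
# difference between the counterfactual and black joint densities.
ses_mediated = sum(
    y(h, s, "white") * (f_BW[(h, s)] - f_BB[(h, s)]) for (h, s) in f_BB
)

print("IOM disparity:", round(iom_disparity, 2))
print("RDE:", round(rde, 2))
print("SES-mediated part:", round(ses_mediated, 2))
```

In this toy setting the RDE is smaller than the IOM-concordant disparity by exactly the SES-mediated part, because whites are assumed to have the more favorable SES distribution.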

A notable non-linear application of the RDE appears in parts of the 2004–2007 National Healthcare Disparities Reports (NHDR) (AHRQ 2004, 2005, 2006, 2007), in which disparities are calculated for some measures using multivariate logistic regressions adjusting for “age, sex, race, ethnicity, household income, education, insurance, and residence location as appropriate” (AHRQ 2004, p.13). Significant odds ratios for race/ethnicity, in the presence of these controls, are regarded as indicating a disparity. For example, controlling for all of these covariates in a 2001 sample of individuals with diabetes, blacks were 38% and Hispanics were 33% less likely than whites to receive effective diabetes management (AHRQ 2004).

Similarly, Guevara et al. (2006) demonstrated disparities in children's ambulatory visits and use of prescription medications, using race coefficients in logistic regressions controlling for age, gender, family size, income, health insurance, and health status variables. Fiscella et al. (2002) found that Spanish-speaking Hispanics had a lower probability of a physician or mental health visit, and that blacks had a significantly lower probability of visiting a mental health professional, after adjustment for age, gender, health status, demographics, health insurance, income, and telephone access.

Methods that Adjust for Health Status and Allow for SES Mediation in Linear Models: Box 3

Mean replacement and decomposition methods used in health services research and other areas (e.g. labor economics) adjust for health status in a linear model while allowing for the mediation of SES factors (Box 3 of Table 1). Suppose the relation between a dependent variable representing health care utilization (y) and the factors affecting it, including health status, SES, and race, is linear:

$$y = \beta_0 + \beta_1 H + \beta_2 S + \beta_3 R + \varepsilon$$

Then, the expected value of the counterfactual predictions for whites, yBWW, can be determined by inserting into the prediction equation the expected value of health status given black race, EB[H], the expected value of SES given white race, EW[S], and white race, R = White:

$$y_{BWW} = \beta_0 + \beta_1 E_B[H] + \beta_2 E_W[S] + \beta_3 R_{\text{White}}$$

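In linear models, these expected values are calculated by simply taking the mean of each variable within each racial group, so that $\bar{H}_{\text{Black}}$ indicates the vector of means for blacks of the health status variables and $\bar{S}_{\text{White}}$ the vector of SES means for whites:

$$y_{BWW} = \beta_0 + \beta_1 \bar{H}_{\text{Black}} + \beta_2 \bar{S}_{\text{White}} + \beta_3 R_{\text{White}}$$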

and the disparity can be calculated as $y_{BWW} - y_{BBB}$. This disparity estimate is the sum of the direct effect of race and the effect mediated through SES, or $\beta_3 + \beta_2(\bar{S}_{\text{White}} - \bar{S}_{\text{Black}})$, taking R to be an indicator of white race. If a model with Race*SES and Race*HS interactions better fits the data, then the health status can be adjusted by entering the mean black health status into the calculation of both the main and interaction effects. This method is equivalent to comparison to a health status counterfactual concordant with the IOM definition because whites retain their SES and race characteristics. Because of the linearity of the model, means of the predictor variables (e.g. health status) determine the predicted means of the outcome, without the necessity of specifying the joint distribution of the predictors.
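The reliance on linearity in this argument can be seen in a few lines of arithmetic. The sketch below uses a hypothetical list of index values and two hypothetical prediction functions (a linear one and an exponential one, standing in for a log-link model). In the linear case, the prediction at the mean equals the mean of the predictions; with curvature, the two differ, so plugging in group means no longer recovers the group's mean outcome.

```python
import math

# Hypothetical linear-predictor values for four individuals.
health_index = [0.0, 0.0, 0.0, 3.0]

def mean(values):
    return sum(values) / len(values)

# Linear prediction: plugging in the mean gives the mean prediction.
linear_at_mean = 2.0 * mean(health_index)
linear_mean = mean([2.0 * x for x in health_index])

# Non-linear (exponential) prediction: the two quantities diverge.
nonlinear_at_mean = math.exp(mean(health_index))
nonlinear_mean = mean([math.exp(x) for x in health_index])

print("linear:", linear_at_mean, linear_mean)
print("non-linear:", round(nonlinear_at_mean, 3), round(nonlinear_mean, 3))
```

For a convex function such as the exponential, the mean of the predictions exceeds the prediction at the mean, and the size of the gap depends on the whole distribution of the predictors.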

The Oaxaca-Blinder (henceforth “O-B”) decomposition (Blinder 1973; Oaxaca 1973) uses fully race-stratified models, thus including all possible interactions with race. In addition to including differences in underlying characteristics, or “endowments” (the term appropriate to the original applications in labor economics), the O-B method also allows for differences in the coefficients representing the effect of these characteristics for the two racial groups. Calculating disparities by assigning whites the mean health status of blacks, the O-B method yields a health status adjustment concordant with the IOM definition of racial disparity within a linear model. In the context of healthcare disparities, the O-B decomposition can be regarded as dividing the disparity into a part due to racial differences in SES distribution and a part due to racial differences in the effects of SES and health status on access to care. Decomposition methods based on linear probability models have been used to measure the contribution of separate factors to racial/ethnic differences in having a usual source of care (Hargraves and Hadley 2003; Kirby, Taliaferro, and Zuvekas 2006; Waidmann and Rajan 2000; Weinick, Zuvekas, and Cohen 2000; Zuvekas and Taliaferro 2003). These studies showed that Hispanics are 15% less likely (in raw percentages) to have a usual source of care than whites, and blacks are between 4% and 8% less likely to have a usual source of care than whites, with health insurance accounting for between 5% and 40% of these differences. Decomposition studies can be used to separate out the contribution of health status factors to the difference between races.

Mean replacement or O-B decomposition in linear models can be used to estimate disparities concordant with the IOM definition because they adjust health status variables without affecting the independent contributions of other variables, and the adjustment of means (rather than distributions) is sufficient in a linear model without curvature or interaction terms.6 These methods do not translate directly to non-linear models because full joint distributions of the predictors, rather than their means alone, are required to calculate the predicted mean of the outcome.

SES Mediation of Disparities in Non-Linear Models: Box 4

Measures of health services costs, utilization or outcomes often bear a nonlinear relationship to predictors (Basu and Rathouz 2005). Methods that transform distributions (rather than means) have been used to decompose differences in nonlinear models in labor economics and demography, thus extending O-B decomposition methods to non-linear models. Transformation methods include sequential replacement (Fairlie 2006) and the use of regression-based weights (Even and Macpherson 1993; Nielsen 1998; Powers and Pullman 2006; Yun 2004). The first method replaces the value of each variable of group A with that of group B sequentially one by one. The order of replacement can be determined randomly or in a rank order based on respondents' predicted probability of having the outcome of interest. Critics of sequential replacement argue that results are sensitive to the order of switching and that results rely strongly on the choice of a base model (Fournier 2005; Ham, Svejnar, and Terrell 1998, p.1137; Yun 2004). The second method extends O-B decomposition to non-linear models. Rather than using other group means to decompose differences as in linear O-B decomposition, these methods apply weights based on endowments and difference model coefficients that equalize groups on the distributions of covariates (Yun 2004).

This paper studies three new, straightforward transformation methods, one a propensity score-based weighting method, the second a rank-and-replace method, and the third a combination of the two. We discuss the conceptual and empirical advantages and disadvantages of these methods in an analysis of racial disparities in medical expenditure.

2. Data

We pooled data for whites and blacks from the 2003 and 2004 Medical Expenditure Panel Surveys (MEPS), which contain variables related to individuals' health care expenditures, utilization, and demographic, SES and health status characteristics. The main dependent variable is total medical expenditure, generated from compiled claims data and self-reported out-of-pocket expenditure, and standardized to 2004 dollars. For purposes of analysis, covariates are split into two broad categories: SES variables, and health status variables (Table 2). SES variables include education level, poverty status, region of the country, insurance coverage and participation in an HMO. Health status variables include gender, age, marital status, self-assessed health test scores, Body Mass Index (BMI), and binary indicators for 11 health conditions.

Table 2
2003-2004 Medical Expenditure Panel Survey (MEPS)

The MEPS sample contains 22,209 whites and 5,406 blacks 18 years of age or older, and is weighted to be representative of the black and white adult population of the United States. MEPS has previously been used in numerous racial disparities studies (e.g., Guevara et al. 2006; Shi 1999; Weinick et al. 2000; Zuvekas and Taliaferro 2003), and most notably was a data source for the AHRQ National Healthcare Disparities Reports (AHRQ 2004, 2005, 2006, 2007).

3. Methods

Three methods of health status adjustment were evaluated in the context of non-linear models: propensity score weighting, the rank and replace method, and a combined method. The objective of each of the adjustment methods is to produce a plausible hypothetical joint distribution fBW*(H,S) to be used in calculation of disparities.

We calculated disparity estimates in the following manner: coefficients were estimated from an appropriate model for expenditure, the counterfactual was generated by adjusting white health status to mimic the black distribution using one of the three methods, and predictions were calculated and averaged for each racial group. To focus on methods of adjustment, we estimated one non-linear model of expenditures, and applied adjustments to that model throughout.

We used a generalized linear model (GLM) with quasi-likelihoods (McCullagh and Nelder 1989), similar to that used in a recent study on racial disparities in mental health utilization (McGuire et al. 2006). The GLM is preferable to an ordinary least-squares regression because it allows a non-linear link function to capture the relationship between the dependent and independent variables. The GLM also allows for heteroscedastic residual variances related to the predicted mean (Buntin and Zaslavsky 2004). Standard errors for regression parameter estimates were calculated using the survey module in Stata 9 (StataCorp 2005) which accounts for the complex survey sampling design and the fact that the same primary sampling units were used in each year's panel selection.

3.1. Health Status Propensity Score Method

The propensity score is the probability of assignment to treatment conditional on a vector of observed covariates, P(D=1|X=x) where D is treatment status and X is a vector of observed covariates measured prior to assignment to treatment. Conditional on the propensity score, the distributions of observed covariates are the same for the treatment and the control group, approximating the randomization of individuals to these groups (Rubin 1997).

Balancing on observed covariates makes the propensity score an appealing tool to adjust for health status in non-linear models. Instead of modeling the predicted probability of being in the black race group conditional on all observable variables, however, we modeled that probability conditional on only health status variables; we do not seek balance on SES mediators. The estimated propensity score $\hat{e}_i(H)$ is the predicted value from a logit model of this probability. Weighting based on the predicted score (Rosenbaum 1987) balances the distributions of health status for whites and blacks. (This method could also be implemented using a propensity score matching approach (Rosenbaum and Rubin 1984).) We applied a unity weight to the treatment units and a weight of $\hat{e}_i(H) / (1 - \hat{e}_i(H))$ to the control units (Hirano and Imbens 2001).7 This weighting adjusts the white (“control”) distribution of health status to equal the black (“treatment”) distribution, as in (3). Note that this weight, the odds form of the estimated propensity score, is approximately proportional to the ratio of black and white health status densities, $f_B(H) / f_W(H)$.
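The weighting scheme can be sketched with a single categorical health status variable. This is an illustration only: the sample counts are hypothetical, and the within-category share of blacks stands in for the fitted logit propensity score (an assumption that holds exactly only for a saturated model on one categorical covariate). Blacks receive a weight of 1 and whites receive e/(1 - e).

```python
from collections import Counter

# Hypothetical sample of (race, health_category) pairs; 1 = poor health.
sample = (
    [("black", 0)] * 30 + [("black", 1)] * 20
    + [("white", 0)] * 160 + [("white", 1)] * 40
)

n_black = Counter(h for race, h in sample if race == "black")
n_all = Counter(h for race, h in sample)

def propensity(h):
    """Estimated P(black | H = h); cell shares stand in for a logit fit."""
    return n_black[h] / n_all[h]

def weight(race, h):
    """Weight of 1 for blacks (treated), e / (1 - e) for whites (controls)."""
    if race == "black":
        return 1.0
    e = propensity(h)
    return e / (1.0 - e)

def weighted_share(race, h_value):
    """Weighted share of a racial group falling in health category h_value."""
    num = sum(weight(r, h) for r, h in sample if r == race and h == h_value)
    den = sum(weight(r, h) for r, h in sample if r == race)
    return num / den

print("black poor-health share:", weighted_share("black", 1))
print("white poor-health share after weighting:", weighted_share("white", 1))
```

Before weighting, 20% of the hypothetical white sample is in poor health; after weighting, the white share matches the black share, which is the balance on health status that the method is designed to achieve.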

To the extent that SES and health status variables are correlated, however, the health status propensity score method is not concordant with the IOM definition. As described in (5a), concordance requires the weighted distribution of health status to be equal to the marginal black health status distribution. However, propensity score weighting of individuals based only on their health status values also alters the distribution of SES,8 contrary to (5b), since

$$\int \frac{f_B(H)}{f_W(H)}\, f_{WW}(H, S)\, dH \neq f_W(S)$$

unless health status and SES are independent, i.e., fWW(H,S) = fW(H) fW(S).

We can anticipate the direction of bias in this adjustment method. Assuming whites have overall higher health status and SES, and that the covariance between health status and SES variables is positive, then white individuals with lower health status will be weighted more heavily to balance the white and black distributions of health status. This, in turn, will cause white individuals with lower SES to also be weighted more heavily. Assuming that those in lower SES categories have lower health utilization and expenditure, we would predict that this method will underestimate the counterfactual white expenditure, and thus underestimate the disparity.
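The mechanism behind this anticipated bias can be shown with a hypothetical two-by-two white sample in which poor health and low SES are positively associated. The cell counts and the weights (which depend on health status only) are invented for illustration; the sketch simply compares the low-SES share of whites before and after weighting.

```python
# Hypothetical white sample: (poor_health, low_ses) -> number of people.
white_cells = {(0, 0): 140, (0, 1): 20, (1, 0): 10, (1, 1): 30}

# Hypothetical propensity score weights e / (1 - e), by health status only.
weight_by_health = {0: 0.1875, 1: 0.5}

def low_ses_share(cells, weights=None):
    """Share of the (optionally weighted) sample with low SES."""
    total = 0.0
    low = 0.0
    for (h, s), count in cells.items():
        w = 1.0 if weights is None else weights[h]
        total += count * w
        low += count * w * s
    return low / total

unweighted = low_ses_share(white_cells)
weighted = low_ses_share(white_cells, weight_by_health)

print("low-SES share, original white sample:", unweighted)
print("low-SES share, health-status-weighted white sample:", weighted)
```

Because whites in poor health are weighted up and are disproportionately of low SES in this example, the weighted sample has a larger low-SES share than the original white sample, so the white SES marginal in (5b) is not preserved.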

3.2. Rank and Replace Method

The “rank and replace” method (McGuire et al. 2006) matches individuals based on their rank of health status measures and transforms the distribution of health status conditional on white race to equal the health status distribution conditional on black race. SES values are not altered. In previous studies implementing the IOM definition of disparities, health status variables were transformed seriatim, so that the Hispanic and black distributions for each health status variable were identical to white distributions (Cook 2007; McGuire et al. 2006). A “rank and replace” method was used to adjust continuous health status variables: blacks, Hispanics and whites were ranked according to their scores on continuous health status variables and the values of Hispanic and black individuals were adjusted to equal their equivalently ranked white individual. These analyses adjusted dichotomous health status variables by randomly changing minority health status indicators from 1 to 0 (or 0 to 1) until white and minority proportions were equivalent.

In this study, we simplified the adjustment by applying the rank and replace method to a model-based index of health status, rather than to each health status variable. Also, in contrast to prior work, we adjusted the white health status distribution to equal the black health status distribution to increase the policy relevance of the analysis. To implement this procedure, we first estimated a GLM for total health care expenditure as a function of all relevant independent covariates, including appropriate interactions. Writing the GLM in standard form, with link function g, as

$$g\left( E[y \mid H, S, R] \right) = \beta_0 + \beta_1 H + \beta_2 S + \beta_3 R \tag{10}$$

a health status summary index score is defined as β1H, the combined contribution to the prediction of all health-status variables. Blacks and whites were then ranked according to their scores, and scores of white individuals were replaced by the scores of equivalently ranked black individuals, creating a hypothetical white subgroup with a distribution of this health status index that was identical to that of the black subgroup. Predicted expenditures were then calculated under model (10) using the revised values of the health-status index but the original (white) values of SES and race. Because the predictions depend on health status only through the index β1H, transforming the distribution of the index is equivalent to transforming the entire health status distribution, maintaining the joint relationships of health status variables and the rank relationship of SES and health status. However, this method requires transforming only a scalar distribution and does not require arbitrary random transformation of dichotomous variables.
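The replacement step on the scalar index can be sketched as a quantile match. The function below is a simplified illustration, not the authors' code: the name `rank_and_replace`, the mid-rank percentile rule, and the example scores are assumptions, and survey weights and tie handling are ignored.

```python
def rank_and_replace(white_scores, black_scores):
    """Replace each white index score with the equivalently ranked black score.

    Each white individual's rank is converted to a percentile, and the
    black score at that percentile is assigned. The groups may differ in size.
    """
    black_sorted = sorted(black_scores)
    n_white, n_black = len(white_scores), len(black_sorted)
    # Indices of white individuals, ordered from lowest to highest score.
    order = sorted(range(n_white), key=lambda i: white_scores[i])
    replaced = [0.0] * n_white
    for rank, i in enumerate(order):
        percentile = (rank + 0.5) / n_white
        position = min(int(percentile * n_black), n_black - 1)
        replaced[i] = black_sorted[position]
    return replaced

# Hypothetical health status index scores for four whites and four blacks.
white = [0.1, 0.9, 0.4, 0.2]
black = [0.5, 1.5, 1.0, 2.0]
adjusted = rank_and_replace(white, black)
print(adjusted)  # [0.5, 2.0, 1.5, 1.0]
```

Each white individual keeps their position in the ordering of health status (and therefore their own SES values and rank relationship with SES), while the distribution of the index becomes that of the black group.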

3.3. Combined Method: Health Status Propensity Score Method with SES Adjustment

A third method, the propensity score method with SES adjustment, employs a similar adjustment applied to a summary SES index to undo the undesired reweighting of the distributions of SES variables during the propensity score weighting process. First, we estimated the GLM (10) for total health expenditure regressed on the complete set of SES and health status variables. Second, we calculated the SES index score β2S. Third, the SES predictive index was weighted using the health status propensity score weight described above. Fourth, we restored the original white SES characteristics using the rank and replace method, replacing the propensity-score-weighted white SES index values with the equivalently ranked white SES index values from before propensity score weighting. With this adjustment, the propensity-score-weighted distribution of SES index values for whites is approximately the same as the original distribution of the index values before reweighting. Predicted expenditures were then calculated under model (10) using the original health-status index, the adjusted SES index, and health status propensity score weights.

Propensity score weighting reweights white individuals based on a composite health status propensity score, so that integrating the joint density fBW*(H,S) over SES yields the marginal distribution given black race for every health status variable. The white SES variable distribution is recaptured through the SES index adjustment.
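One way to picture the SES index adjustment is as a weighted quantile match. The sketch below is an interpretation under stated assumptions rather than the authors' implementation: the function name `restore_ses_index`, the mid-point percentile rule, and the example data are hypothetical. Each white individual's percentile in the propensity-score-weighted SES index distribution is computed, and the value at the same percentile of the original (unweighted) white distribution is assigned.

```python
def restore_ses_index(ses_index, ps_weights):
    """Map weighted ranks of the SES index back to the unweighted distribution."""
    n = len(ses_index)
    original_sorted = sorted(ses_index)
    order = sorted(range(n), key=lambda i: ses_index[i])
    total_weight = sum(ps_weights)
    restored = [0.0] * n
    cumulative = 0.0
    for i in order:
        # Percentile of individual i in the weighted distribution (mid-point).
        percentile = (cumulative + ps_weights[i] / 2.0) / total_weight
        position = min(int(percentile * n), n - 1)
        restored[i] = original_sorted[position]
        cumulative += ps_weights[i]
    return restored

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical SES index for 100 whites; low-index individuals are assumed
# to receive larger health status propensity score weights.
ses_index = [float(i) for i in range(100)]
ps_weights = [2.0 if i < 50 else 1.0 for i in range(100)]

restored = restore_ses_index(ses_index, ps_weights)
original_mean = sum(ses_index) / len(ses_index)

print("original unweighted mean:", original_mean)
print("weighted mean before adjustment:", weighted_mean(ses_index, ps_weights))
print("weighted mean after adjustment:", weighted_mean(restored, ps_weights))
```

In this example the weighting pulls the mean of the SES index down, and the adjustment brings the weighted mean back close to its original value. The match is approximate because the sample is finite, in line with the "approximately the same" wording above.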

3.4. Empirical Evaluation Criteria

3.4.1 Variance

To measure the efficiency of the three approaches just described, we compared absolute and relative variances of estimates under these three approaches, using the bootstrap procedure (Efron 1979). We drew simulated samples with replacement, replicating the sampling design (stratification and clustering) of the MEPS to create 100 samples. Predicted expenditures and IOM-based disparities were calculated for each sample, and the variance of these 100 estimates was determined.
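A design-respecting bootstrap of this kind can be sketched as follows. This is a simplified illustration under assumptions: the strata, the primary sampling units (PSUs), and the expenditure values are invented, whole PSUs are resampled with replacement within each stratum, and a plain mean stands in for the disparity estimator, which in the actual analysis would be recomputed on each replicate.

```python
import random
import statistics

def bootstrap_variance(strata, statistic, n_reps=100, seed=0):
    """Variance of a statistic across bootstrap replicates.

    `strata` maps a stratum label to a list of PSUs, each PSU being a list
    of observations. Whole PSUs are drawn with replacement within strata.
    """
    rng = random.Random(seed)
    estimates = []
    for _rep in range(n_reps):
        resampled = []
        for psus in strata.values():
            for _draw in range(len(psus)):
                resampled.extend(rng.choice(psus))
        estimates.append(statistic(resampled))
    return statistics.variance(estimates), estimates

# Hypothetical expenditure data: two strata with three PSUs each.
strata = {
    "stratum_1": [[100.0, 120.0], [300.0, 280.0], [150.0, 90.0]],
    "stratum_2": [[500.0, 450.0], [200.0, 260.0], [700.0, 640.0]],
}

variance, estimates = bootstrap_variance(strata, statistics.mean)
print("number of replicates:", len(estimates))
print("bootstrap variance of the mean:", round(variance, 2))
```

Resampling PSUs rather than individuals keeps the within-cluster correlation intact in every replicate, which is the reason for replicating the stratification and clustering of the survey.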

3.4.2. Application of These Methods to Datasets with Less Complete Measures of SES or Health Status

Despite the importance of SES and health status variables in rigorous measurement of racial disparities, available datasets on health care quality and access may not include all desirable variables. Medicare and Medicaid claims data (Schneider et al. 2001; Skinner et al. 2003), like many clinically-based studies, are rich in health status information, but have very limited information on the SES of individuals. Conversely, the Current Population Survey (CPS) collects detailed SES information, but none on conditions or diagnoses other than a self-reported health status measure. Thus, a method of health status adjustment might be preferable if it is less dependent on the availability of extensive health status and SES data. To address this issue, we tested the sensitivity of the different methods to a number of data scenarios. Because age and sex are powerful predictors of health spending in risk adjustment models and are readily available,9 we used these as baseline measures of health status. We then created four hypothetical datasets by paring down the MEPS dataset: weakly measured health status (age and sex only) combined with a full set of SES variables, moderately well-measured health status (age, sex, marital status and self-reported health status) with a full set of SES variables, weakly measured SES (education only) with a full set of health status variables, and moderately well-measured SES (education and poverty status) with a full set of health status variables. Disparity estimates from each method and each of these datasets were compared to those from the full MEPS dataset, which has additional information on geographical region, health insurance status and type, and specific health status conditions and scales. The stability of the estimates and variances across these datasets provides insight into the effects of the limitations of different datasets.

4. Results

Table 2 shows significant unadjusted differences between blacks and whites in nearly all SES and health status measures. Unadjusted mean expenditure was $1082 lower for blacks than for whites (p<.05).

We modeled the relationship between race and total health care expenditure using a GLM with a log link and variance proportional to the square of the mean, controlling for numerous SES and health status variables (Table 3). The link and variance functions were determined using diagnostics suggested by Buntin and Zaslavsky (2004), including the “Park test” to estimate the variance function (Park 1966). The significant negative coefficient on the race variable shows that blacks receive fewer resources than whites, controlling for all other variables. As expected, older age, lower scores on health scales, and the presence of medical conditions positively influenced expenditure. Higher income and education, and living in the Midwest, were other predictors of greater medical expenditure. Being uninsured was a significant negative predictor of medical expenditure.
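To illustrate how a Park-type diagnostic points to a variance function, the sketch below regresses the log of squared residuals on the log of the predicted mean; a slope near 2 suggests variance proportional to the square of the mean. It is a simplified sketch under stated assumptions: the data are simulated, the true mean stands in for fitted GLM predictions, ordinary least squares is used for the auxiliary regression, and the names (`ols_slope`, `park_slope`) are hypothetical.

```python
import math
import random

def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(42)
shape = 2.0                                   # gamma shape; variance = mean^2 / shape
log_mu = [rng.uniform(6.0, 9.0) for _ in range(5000)]
mu = [math.exp(v) for v in log_mu]            # stand-in for predicted expenditure
y = [rng.gammavariate(shape, m / shape) for m in mu]   # simulated expenditure

# Park-type regression: ln((y - mu)^2) on ln(mu).
log_sq_resid = [math.log((yi - mi) ** 2) for yi, mi in zip(y, mu)]
park_slope = ols_slope(log_mu, log_sq_resid)
print(round(park_slope, 2))   # expected to be close to 2 for these simulated data
```

By the same reading, a slope near 1 would point toward a Poisson-like variance function and a slope near 0 toward constant variance.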

Table 3
GLM Model of total medical expenditures ($2004) in last year on race with demographic, SES and health status variables used as independent controls.1 (n= 27,615)

Table 4 compares different methods of computing disparities, which differ in the variables included in the adjustment and in the use of linear or non-linear models. The residual direct effect (RDE) method adjusts for all variables in order to isolate the effect of race, controlling for mediating factors including SES, health status and preferences. Using a linear model, this method calculated a disparity of $489. The RDE method, using the non-linear GLM described above, calculated a disparity of $842, nearly twice as large. Allowing differences in SES to mediate the relationship between race and total expenditure, by adjusting only for health status, increased estimates of disparity in both linear and non-linear models. The O-B decomposition calculated from separate black and white linear regressions found a disparity of $913. The four methods of adjusting for health status while allowing for SES mediation in non-linear models led to broadly similar disparity predictions, ranging from $1126 to $1454. The RDE of disparity from a non-linear model that included only health status variables was $1126, the health status propensity score method calculated a disparity of $1285, the rank and replace method found a disparity of $1407, and the combined propensity score with SES adjustment method calculated a disparity of $1454. The disparity estimates from the rank and replace and combined methods had very similar variances both in absolute magnitude and as a percentage of the disparity estimate (both methods had coefficients of variation (CV) of approximately 34%).

Table 5 shows estimates of disparities using the complete 2003-2004 dataset, the same dataset with missing SES variables and a full set of health status variables, and again with missing health status variables and a full set of SES variables. All three disparity calculations were sensitive to changes in the availability of health status data. For example, using a model with only gender and age as health status variables, the rank and replace method found a disparity of $732, nearly half the disparity estimated using a full model. On the other hand, disparity estimates differed little when SES variables were removed. Using only educational status as an SES variable, the combined method found a disparity of $1318, very similar to the disparity in the full model of $1454. Comparing the rank and replace method to the combined method, the former was more sensitive to missing SES information. The rank and replace method calculated a disparity of $1094 in the education-only model and $1407 in the full model, compared to $1318 and $1454 respectively calculated by the combined method.
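The relative size of these changes can be checked with simple arithmetic on the dollar figures quoted above. The short sketch below only re-expresses those reported estimates as proportions; the helper name `relative_drop` is hypothetical.

```python
def relative_drop(full, reduced):
    """Proportional fall in the disparity estimate when variables are removed."""
    return (full - reduced) / full

# Dollar figures as quoted in the text for the full and pared-down datasets.
rr_ses_drop = relative_drop(1407, 1094)        # rank and replace, education-only SES
combined_ses_drop = relative_drop(1454, 1318)  # combined method, education-only SES
rr_hs_ratio = 732 / 1407                       # rank and replace, age and sex only

print(f"{rr_ses_drop:.1%} {combined_ses_drop:.1%} {rr_hs_ratio:.1%}")
# prints "22.2% 9.4% 52.0%"
```

Expressed this way, limiting SES information to education lowers the rank and replace estimate by about 22% but the combined estimate by about 9%, while limiting health status to age and sex leaves the rank and replace estimate at about 52% of its full-model value.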

Table 4
Comparison of methods of calculating black-white disparity in total medical expenditure using linear and non-linear models

Table 5
Comparing disparity estimates from datasets with varying omissions of health status and SES variables

5. Discussion

Adjusting for health status in a regression framework is difficult because of the non-linearity of the relationships of many health-related dependent variables to their predictors, and because health status is often highly correlated with SES characteristics. The IOM-related definition of racial disparities has been implemented using the rank and replace method (Cook 2007; Cook, McGuire, and Zuvekas 2008; Cook, Miranda, and McGuire 2007; McGuire et al. 2006) which should be evaluated against alternatives. This paper presents methods of implementing the IOM-related definition of racial disparities and a framework for evaluating the concordance of each method with the IOM definition of racial disparities, efficiency (low variance), and sensitivity to availability of different amounts of data on SES and health status.

The rank and replace method adjusts an index representing the combined effect of the health status variables to have exactly the same distribution of values for both blacks and whites while preserving the rank order of individuals' health status indicators. The propensity score method aligns black and white health status using a scalar weight. This method is appealing because the foundation of propensity score adjustment is well established in the statistical and health services literatures, but it requires a second step to “undo” the weighting of SES variables that are correlated with weighted health status variables. The combined method accomplishes this second step by replacing the white post-weight SES index distribution with the white pre-weight SES index distribution.
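The rank and replace idea can be sketched as a quantile mapping: each white individual's health status index is replaced by the black index value at the same rank, so the adjusted white distribution matches the black distribution while rank order among whites is preserved. This is a minimal unweighted sketch; ignoring survey weights, the mid-rank convention, and the function name `rank_and_replace` are simplifying assumptions, and the index values are invented.

```python
def rank_and_replace(white_index, black_index):
    """Give each white value the black value found at the same quantile."""
    black_sorted = sorted(black_index)
    n_w, n_b = len(white_index), len(black_sorted)
    order = sorted(range(n_w), key=lambda i: white_index[i])
    adjusted = [0.0] * n_w
    for rank, i in enumerate(order):
        q = (rank + 0.5) / n_w                 # mid-rank quantile among whites
        j = min(int(q * n_b), n_b - 1)         # matching position among blacks
        adjusted[i] = black_sorted[j]
    return adjusted

# Invented health status index values (log-expenditure scale).
white_hs = [0.1, 0.5, 0.3]
black_hs = [1.0, 2.0, 3.0]
adjusted_white_hs = rank_and_replace(white_hs, black_hs)
print(adjusted_white_hs)   # [1.0, 3.0, 2.0]
```

The healthiest white individual receives the healthiest black index value and so on up the ranking, which is what distinguishes this approach from shifting the mean of the index.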

Disparities research using linear models can use mean adjustment methods, or a variant of the Oaxaca-Blinder decomposition method that adjusts health status means, to implement the IOM definition of racial disparities. In this analysis we find, as in previous health services research, that accounting for the non-linearity of health expenditure is necessary for accurate modeling predictions.
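A small numerical example shows why adjusting means alone is not enough under a log link. In the sketch below, two hypothetical health status index distributions share the same mean but differ in spread, and their mean predicted expenditures differ substantially; the intercept and index values are invented for illustration.

```python
import math

def mean_predicted(index, intercept=7.0):
    """Mean predicted expenditure under a log link: average of exp(intercept + index)."""
    return sum(math.exp(intercept + v) for v in index) / len(index)

narrow = [-0.1, 0.0, 0.1]   # index values tightly clustered around 0
wide = [-1.0, 0.0, 1.0]     # same mean, much larger spread

pred_narrow = mean_predicted(narrow)
pred_wide = mean_predicted(wide)
print(round(pred_narrow, 1), round(pred_wide, 1))
```

In a linear model the two groups would have identical mean predictions, because only the mean of the index matters. With the exponential link the wider distribution produces a higher mean prediction, so the whole distribution of the index has to be aligned, not just its mean.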

It is straightforward to estimate the RDE in a non-linear model with only health status variables, but this method does not account for the full mediated contribution to disparities of SES variables. Models that include health status variables while omitting SES variables unintentionally adjust for SES characteristics that are both predictors of expenditure and correlated with health status. For example, calculating disparities using a model that includes age as a covariate but not income unintentionally adjusts for the effect of age on disparities that is mediated through income. Adjusting for this partial contribution of income to the disparity calculation is not concordant with the IOM definition of disparities. However, if the relationship between health status and SES variables is modest, failing to account for the effect of health status mediated through SES typically will have little impact on the disparity estimate.

All three methods of adjusting for health status differences are relatively easy to implement empirically. The rank and replace method and the combined method presented in this paper estimate a disparity that is concordant with the IOM definition, and both methods had similar variance in this empirical analysis.

Because all three of the methods were sensitive to the omission of health status variables, caution is warranted when applying these methods to datasets with limited health status data. When health status information is missing, racial differences in the effects of health status load into the race coefficient and are included in the estimated disparities. Failure to adjust for health status compares blacks who are sicker and need more health care to whites who are healthier, thus decreasing the estimated disparity.

As we expected, the three disparity methods were less sensitive to the omission of SES variables than to the omission of health status variables, because SES variables indirectly contribute to the disparity estimate through the race coefficient. Comparing across methods, we found that the propensity score and combined methods were less sensitive to the unavailability of SES variables than the rank and replace method.

The accuracy of the comparison of racial disparities methods in this paper is limited by the available data. For example, the inclusion of other SES-related variables such as wealth would improve confidence in the predicted disparity results. Another limitation is the absence of patient preference measures, preventing us from adjusting disparity calculations for preferences. Patient preference measures are often problematic because patients are rarely “fully informed” about their healthcare options (Ashton et al. 2003; IOM 2002), and measures used in other surveys do not take into account different levels of knowledge and experience with the health care system (Cooper-Patrick et al. 1997) or the extent to which expressed preferences might represent a realistic response to inferior access to and quality of health care, rather than an exogenously-determined preference.

In this paper, we define concordance to the IOM definition of racial and ethnic disparities and apply three different methods to the estimation of disparities in medical expenditures. Future research should investigate the applicability of these methods to other types of health services variables, and datasets with more comprehensive variables of preferences and SES. Other possible extensions are to apply the IOM-concordant methods to evaluate disparities at various quantiles of medical expenditures or among different subgroups of patients.


This research was supported by grants P50 MH073469 and P01 MH059876 from the National Institute of Mental Health, and P60 MD002261 and MD0300 from the National Center for Minority Health and Disparities.


1Peterson et al. (1997) state as a limitation of their methods that “race may only be a surrogate marker for other socioeconomic factors… that may affect decisions about care to an equal or greater extent.” (p.485). This is actually not a limitation according to the IOM definition of disparities.

2In some studies using population-based surveys, researchers use a clinical diagnosis or a set of diagnostic items to establish need. A study typical of this group (Wells et al. 2001) identifies individuals with a depression diagnosis using an SF-12 mental health score, and then assesses the utilization of care by this subpopulation.

3Rubin argues that race is not a treatment at the individual level in the sense of his causal model since it cannot be manipulated by any conceivable experiment; however since our purpose is to compare group rather than individual outcomes we do not consider this an obstacle to our analysis.

4An alternative (not used in this paper) is to compare a counterfactual black population that has the same health status distribution as whites with an actual white population. Our choice is based on the presumption that remediation of disparities would entail providing blacks with the (generally superior) level of health services received by whites, rather than the reverse.

5Though we apply the disparities definition to detect disparities between means for two groups (one counterfactual and one factual), our methods can easily be applied to measure disparities within subgroups (e.g., racial disparities among the critically ill), either by fitting a separate model within that subgroup or by applying the overall model (calibrated to balance in the subgroup) to the appropriate subsample.

6In a model with interactions, the expected value of the interaction (product) term enters into the prediction. Specifically, in a model with HS × SES interactions, adjustment of health status means will alter the effects of SES variables.

7In our analysis, propensity score weights were capped at 0.9 to prevent single individuals from having undue influence on the predicted outcome. As a result, two of the 22,209 white individuals had their final weights (propensity score weight times MEPS survey weights) reduced from 960,797 to 40,380 and 8,700,800 to 131,648 respectively. This adjustment greatly improved the match between black and white health status variable distributions.

8The propensity score's discordance from the IOM definition is similar to a regression of expenditure on health status variables, ignoring adjustment of SES variables (e.g., see preliminary models in Saha et al. 2003). Leaving SES out of the model will load racial differences in SES on to the race coefficient, allowing for an approximation of the IOM-defined disparity (Balsa et al. 2007), but will not identify the contribution of racial differences in SES that is correlated with health status.

9Insurance companies with limited information on individuals' health status often use age and sex alone as risk adjusters for determining premiums. In addition, recent studies on disparities among Medicare beneficiaries have used age and sex to adjust for health status (e.g., Escarce and McGuire 2004).

Contributor Information

Benjamin L. Cook, Center for Multicultural Mental Health Research, Cambridge Health Alliance – Harvard Medical School, 120 Beacon Street, 4th floor, Somerville, MA 02143, 617-503-8449.

Thomas G. McGuire, Department of Health Care Policy, Harvard Medical School, 180 Longwood Ave., 2nd Floor, Boston, MA 02115.

Ellen Meara, Department of Health Care Policy, Harvard Medical School, 180 Longwood Ave., 2nd Floor, Boston, MA 02115 and National Bureau of Economic Research, 1050 Massachusetts Ave., Cambridge, MA 02138.

Alan M. Zaslavsky, Department of Health Care Policy, Harvard Medical School, 180 Longwood Ave., 2nd Floor, Boston, MA 02115.


  • AHRQ. National Healthcare Disparities Report, 2004. Rockville, MD: Agency for Healthcare Research and Quality; 2004.
  • AHRQ. National Healthcare Disparities Report, 2005. Rockville, MD: Agency for Healthcare Research and Quality; 2005.
  • AHRQ. National Healthcare Disparities Report, 2006. Rockville, MD: Agency for Healthcare Research and Quality; 2006.
  • AHRQ. National Healthcare Disparities Report, 2007. Rockville, MD: Agency for Healthcare Research and Quality; 2007.
  • Ashton CM, Haidet P, Paterniti DA, Collins TC, Gordon HS, O'Malley K, Petersen LA, Sharf BF, Suarez-Almazor ME, Wray NP, Street RL., Jr Racial and ethnic disparities in the use of health services: bias, preferences, or poor communication? J Gen Intern Med. 2003;18(2):146–52. [PMC free article] [PubMed]
  • Ayanian JZ, Cleary PD, Weissman JS, Epstein AM. The effect of patients' preferences on racial differences in access to renal transplantation. N Engl J Med. 1999;341(22):1661–9. [PubMed]
  • Baicker K, Chandra A, Skinner JS, Wennberg JE. Who you are and where you live: how race and geography affect the treatment of medicare beneficiaries. Health Aff (Millwood) 2004:33–44. Suppl Web Exclusives: VAR. [PubMed]
  • Balsa AI, Cao Z, McGuire TG. Does managed health care reduce health care disparities between minorities and Whites? J Health Econ. 2007;27(1):101–21. [PubMed]
  • Basu A, Rathouz PJ. Estimating marginal and incremental effects on health outcomes using flexible link and variance function models. Biostatistics. 2005;6(1):93–109. [PubMed]
  • Bertrand M, Mullainathan S. Are Emily and Brendan More Employable than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination. American Economic Review. 2004;94(4):991–1013.
  • Blinder A. Wage discrimination: reduced form and structural estimates. Journal of Human Resources. 1973;8:436–55.
  • Buntin MB, Zaslavsky AM. Too much ado about two-part models and transformation? Comparing methods of modeling Medicare expenditures. Journal of Health Economics. 2004;23(3):525–42. [PubMed]
  • Chandra A, Skinner J. NBER Working Paper No 9513. Cambridge, MA: 2003. Geography and Racial Disparities in Health and Health Care.
  • Cook BL. Effect of Medicaid managed care on racial disparities in health care access. Health Services Research. 2007;42(1):124–45. [PMC free article] [PubMed]
  • Cook BL, McGuire TG, Zuvekas SH. Measuring Trends in Racial/Ethnic Health Care Disparities. Medical Care Research and Review. 2008 In Press. [PMC free article] [PubMed]
  • Cook BL, Miranda J, McGuire TG. Measuring Trends in Mental Health Care Disparities, 2000-2004. Psychiatric Services. 2007;58(12):1533–40. [PubMed]
  • Cooper-Patrick L, Powe NR, Jenckes MW, Gonzales JJ, Levine DM, Ford DE. Identification of patient attitudes and preferences regarding treatment of depression. J Gen Intern Med. 1997;12(7):431–8. [PMC free article] [PubMed]
  • Efron B. Bootstrap Methods: Another Look at the Jackknife. Annals of Statistics. 1979;7:1–26.
  • Ellis R, van de Ven W. Risk Adjustment in Competitive Health Insurance Markets. In: Culyer AJ, Newhouse JP, editors. Handbook of Health Economics. Elsevier; 2000. pp. 755–845.
  • Even WE, Macpherson DA. The Decline of Private-Sector Unionization and the Gender Wage Gap. Journal of Human Resources. 1993:279–96.
  • Fairlie R. IZA Discussion Papers. Bonn, Germany: Institute for the Study of Labor (IZA); 2006. An Extension of the Blinder-Oaxaca Decomposition Technique to Logit and Probit Models.
  • Fiscella K, Franks P, Doescher MP, Saver BG. Disparities in health care by race, ethnicity, and language among the insured: findings from a national sample. Medical Care. 2002;40(1):52–9. [PubMed]
  • Ford E, Newman J, Deosaransingh K. Racial and ethnic differences in the use of cardiovascular procedures: findings from the California Cooperative Cardiovascular Project. Am J Public Health. 2000;90(7):1128–34. [PubMed]
  • Fournier M. Exploiting information from path dependence in Oaxaca-Blinder decomposition procedures. Applied Economics Letters. 2005;12:669–72.
  • Guevara JP, Mandell DS, Rostain AL, Zhao H, Hadley TR. Disparities in the reporting and treatment of health conditions in children: an analysis of the Medical Expenditure Panel Survey. Health Serv Res. 2006;41(2):532–49. [PMC free article] [PubMed]
  • Ham J, Svejnar J, Terrell K. Unemployment and the Social Safety Net During Transitions to a Market Economy: Evidence from the Czech and Slovak Republics. American Economic Review. 1998;88(5)
  • Hargraves JL, Hadley J. The contribution of insurance coverage and community resources to reducing racial/ethnic disparities in access to care. Health Serv Res. 2003;38(3):809–29. [PMC free article] [PubMed]
  • Hargraves JL, Wilson IB, Zaslavsky A, James C, Walker JD, Rogers G, Cleary PD. Adjusting for patient characteristics when analyzing reports from patients about hospital care. Med Care. 2001;39(6):635–41. [PubMed]
  • Hirano K, Imbens G. Estimation of Causal Effects using Propensity Score Weighting: An Application to Data on Right Heart Catheterization. Health Services and Outcomes Research Methodology. 2001;2:259–78.
  • Hofer TP, Hayward RA, Greenfield S, Wagner EH, Kaplan SH, Manning WG. The unreliability of individual physician “report cards” for assessing the costs and quality of care of a chronic disease. JAMA. 1999;281(22):2098–105. [PubMed]
  • Howland J, Stokes J, 3rd, Crane SC, Belanger AJ. Adjusting capitation using chronic disease risk factors: a preliminary study. Health Care Financ Rev. 1987;9(2):15–23. [PMC free article] [PubMed]
  • Ibrahim SA, Whittle J, Bean-Mayberry B, Kelley ME, Good C, Conigliaro J. Racial/ethnic variations in physician recommendations for cardiac revascularization. Am J Public Health. 2003;93(10):1689–93. [PubMed]
  • IOM. Unequal Treatment: Confronting Racial and Ethnic Disparities in Health Care. Washington, DC: National Academies Press; 2002. [PMC free article] [PubMed]
  • Kirby JB, Taliaferro G, Zuvekas SH. Explaining racial and ethnic disparities in health care. Med Care. 2006;44(5 Suppl):I64–72. [PubMed]
  • Kressin NR, Chang BH, Whittle J, Peterson ED, Clark JA, Rosen AK, Orner M, Collins TC, Alley LG, Petersen LA. Racial differences in cardiac catheterization as a function of patients' beliefs. Am J Public Health. 2004;94(12):2091–7. [PubMed]
  • Kressin NR, Petersen LA. Racial differences in the use of invasive cardiovascular procedures: review of the literature and prescription for future research. Ann Intern Med. 2001;135(5):352–66. [PubMed]
  • Kupersmith J. Quality of care in teaching hospitals: a literature review. Acad Med. 2005;80(5):458–66. [PubMed]
  • McCullagh P, Nelder JA. Generalized Linear Models. London: Chapman & Hall; 1989.
  • McGuire TG, Alegria M, Cook BL, Wells KB, Zaslavsky AM. Implementing the Institute of Medicine definition of disparities: an application to mental health care. Health Services Research. 2006;41(5):1979–2005. [PMC free article] [PubMed]
  • Nielsen HS. Discrimination and Detailed Decomposition in a Logit Model. Economics Letters. 1998;61:115–20.
  • O'Malley AJ, Zaslavsky AM, Elliott MN, Zaborski L, Cleary PD. Case-mix adjustment of the CAHPS Hospital Survey. Health Serv Res. 2005;40(6 Pt 2):2162–81. [PMC free article] [PubMed]
  • Oaxaca R. Male-female wage differentials in urban labor markets. International Economic Review. 1973;9:693–709.
  • Park R. Estimation with heteroscedastic error terms. Econometrica. 1966;34:888.
  • Petersen LA, Wright SM, Peterson ED, Daley J. Impact of race on cardiac care and outcomes in veterans with acute myocardial infarction. Med Care. 2002;40(1 Suppl):I86–96. [PubMed]
  • Peterson ED, Shaw LK, DeLong ER, Pryor DB, Califf RM, Mark DB. Racial variation in the use of coronary-revascularization procedures. Are the differences real? Do they matter? N Engl J Med. 1997;336(7):480–6. [PubMed]
  • Powers DA, Pullman TW. Multivariate Decomposition for nonlinear models. Population Association of America 2006 Annual Meeting; Princeton, NJ.
  • Riach PA, Rich J. Field experiments of discrimination in the market place. The Economic Journal. 2002;112:F480–F518.
  • Rosenbaum P. Model-Based Direct Adjustment. Journal of the American Statistical Association. 1987;82:387–94.
  • Rosenbaum P, Rubin D. Reducing bias in observational studies using subclassification on the propensity score. Journal of the American Statistical Association. 1984;79:516–24.
  • Rubin D. Estimating Causal Effects of Treatments in Randomized and Nonrandomized Studies. Journal of Educational Psychology. 1974;66(5):688–701.
  • Rubin DB. Estimating causal effects from large data sets using propensity scores. Ann Intern Med. 1997;127(8 Pt 2):757–63. [PubMed]
  • Saha S, Arbelaez JJ, Cooper LA. Patient-physician relationships and racial disparities in the quality of health care. Am J Public Health. 2003;93(10):1713–9. [PubMed]
  • Schneider EC, Leape LL, Weissman JS, Piana RN, Gatsonis C, Epstein AM. Racial differences in cardiac revascularization rates: does “overuse” explain higher rates among white patients? Ann Intern Med. 2001;135(5):328–37. [PubMed]
  • Schulman KA, Berlin JA, Harless W, Kerner JF, Sistrunk S, Gersh BJ, Dube R, Taleghani CK, Burke JE, Williams S, Eisenberg JM, Escarce JJ. The effect of race and sex on physicians' recommendations for cardiac catheterization. N Engl J Med. 1999;340(8):618–26. [PubMed]
  • Shi L. Experience of primary care by racial and ethnic groups in the United States. Med Care. 1999;37(10):1068–77. [PubMed]
  • Skinner J, Weinstein JN, Sporer SM, Wennberg JE. Racial, ethnic, and geographic disparities in rates of knee arthroplasty among Medicare patients. N Engl J Med. 2003;349(14):1350–9. [PubMed]
  • StataCorp. Stata Statistical Software, release 9.0. College Station, TX: Stata Corporation; 2005.
  • Waidmann TA, Rajan S. Race and ethnic disparities in health care access and utilization: an examination of state variation. Med Care Res Rev. 2000;57(Suppl 1):55–84. [PubMed]
  • Weinick RM, Zuvekas SH, Cohen JW. Racial and ethnic differences in access to and use of health care services, 1977 to 1996. Med Care Res Rev. 2000;57(Suppl 1):36–54. [PubMed]
  • Wells K, Klap R, Koike A, Sherbourne C. Ethnic disparities in unmet need for alcoholism, drug abuse, and mental health care. American Journal of Psychiatry. 2001;158(12):2027–32. [PubMed]
  • Yun M. Decomposing Differences in the First Moment. Economics Letters. 2004;82:275–80.
  • Zaslavsky A, Zaborski L, Ding L, Shaul J, Cioffi M, Cleary P. Adjusting performance measures to ensure equitable plan comparisons. Health Care Financing Review. 2001;22(3):109–26. [PMC free article] [PubMed]
  • Zaslavsky AM, Ayanian JZ. Integrating research on racial and ethnic disparities in health care over place and time. Medical Care. 2005;43(4):303–7. [PubMed]
  • Zuvekas SH, Taliaferro GS. Pathways to access: health insurance, the health care delivery system, and racial/ethnic disparities, 1996-1999. Health Aff (Millwood) 2003;22(2):139–53. [PubMed]