HHS Public Access. Author manuscript; accepted for publication in a peer-reviewed journal.
Stat Med. Author manuscript; available in PMC 2009 September 10.
Published in final edited form as:
Stat Med. 2008 September 10; 27(20): 3941–3956.
doi:  10.1002/sim.3283
PMCID: PMC2715701

Disparities in Defining Disparities: Statistical Conceptual Frameworks


Motivated by the need to meaningfully implement the Institute of Medicine’s (IOM’s) definition of health care disparity, this paper proposes statistical frameworks that lay out explicitly the causal assumptions needed for defining disparity measures. Our key emphasis is that a scientifically defensible disparity measure must take into account the direction of the causal relationship between allowable covariates (those not considered to be contributors to disparity) and non-allowable covariates (those considered to be contributors to disparity), in order to avoid flawed disparity measures based on implausible populations that are not relevant for clinical or policy decisions. However, these causal relationships are usually unknown and undetectable from observed data. Consequently, we must make strong causal assumptions in order to proceed. Two frameworks are proposed in this paper. One is the conditional disparity framework, under the assumption that allowable covariates impact non-allowable covariates but not vice versa. The other is the marginal disparity framework, under the assumption that non-allowable covariates impact allowable ones but not vice versa. We establish theoretical conditions under which the two disparity measures are the same, and present a theoretical example showing that the difference between the two disparity measures can be arbitrarily large. Using data from the Collaborative Psychiatric Epidemiology Survey, we also provide an example where the conditional disparity is misled by Simpson’s paradox, while the marginal disparity approach handles it correctly.

Keywords: Counterfactual populations, Disparities, Potential outcomes, Weighting, Mental health, Simpson’s paradox


1.1 The Causal Implication of the IOM Definition

The Institute of Medicine (IOM) [1] defines health care disparities as “racial or ethnic differences in the quality of health care that are not due to access-related factors or clinical needs, preference, and appropriateness of intervention.” This definition represents an important advance in disparity research, because it explicitly recognizes the role of causality in the determination of disparities through its reference to the causal expression “not due to”. However, it leaves open the interpretation of the causal model underlying this causal statement. In this paper we identify several causal models under which the IOM definition can be implemented meaningfully, and propose the corresponding frameworks for defining and comparing statistically justifiable disparity measures following these models. Our work can be viewed as a statistically oriented conceptualization of research in this area (e.g., [2, 3, 4, 5, 6]). Although our work was directly motivated by the IOM definition, the proposed general frameworks are equally applicable to other areas, such as legal settings (e.g., [7, 8, 9]).

The statistical frameworks proposed in this paper assume that the covariates of interest have been classified into allowable and non-allowable categories. Allowable covariates are those considered to be justifiable sources of difference, and hence should be adjusted for before measuring disparity. The remaining covariates are classified as non-allowable.

It is important to note that the classification of allowable versus non-allowable covariates can, and should, vary from study to study, depending on the purpose of the study. For example, IOM’s classification of access-related factors as allowable is appropriate for studying disparity at the level of the patient-clinician encounter, with the focus being the treatment delivered during the encounter, controlled for all historical factors that occurred prior to the encounter. However, when studying health care disparity at the level of service systems, it would be more appropriate to classify access-related factors as non-allowable, thus holding the service systems accountable for failure to engage disadvantaged patients in care. The statistical frameworks we establish in this paper apply to any such classification.

As a specific example for illustration, suppose that covariates that might be predictive of health care are classified as follows:

  • Clinical needs and preference are considered allowable. Differences in health care due to these covariates are not considered to be part of health care disparity.
  • All other covariates, such as knowledge about health, state of residence, insurance coverage, and education (to name a few), are considered non-allowable. Differences in health care due to these covariates are considered to be health care disparity.

Given such a classification, our goal then is to measure the disparity that is “not due to” the allowable covariates.

A seemingly obvious, and hence very common, approach is to substitute the levels of allowable covariates of, for example, Afro-Caribbeans with those of their non-Latino white counterparts, while leaving the levels of non-allowable covariates unchanged. This procedure is often used in analysis of covariance models that adjust for allowable covariates across racial/ethnic groups. However, this approach is sensible in general only if the allowable covariates are statistically independent of the non-allowable covariates, a condition that is unlikely to hold in practice. Without this independence condition, this direct substitution may lead to an implausible population, such as a hypothetical population with a high level of income (a non-allowable covariate that remains unchanged) and a high level of chronic disease (an allowable covariate that was substituted with the levels from the reference population). As a result, the disparity estimates obtained from this procedure may not be relevant for clinical, policy or other purposes, because they are based on a postulated population that cannot be realized by policy changes or disparities interventions.

Not accounting properly for the causal relationships between allowable and non-allowable covariates is especially problematic when the two sets of covariates are highly correlated in the observed data, and both sets of variables are included in our outcome model. In such cases, the allowable covariates might appear to be very weak predictors of the outcome in the fitted model due to the well-known “collinearity” phenomenon. Consequently, replacing a minority group’s allowable covariates by their counterparts in the non-Latino white group in the fitted model may produce only a trivial adjustment, even if in reality a substantial part of the observed racial/ethnic difference is indeed due to the difference in the allowable covariates. This could be because of their direct impact on the outcome (which would not be detected by the fitted regression model because of the strong collinearity), because of their impact on the non-allowable covariates, or both. The frameworks proposed in this paper can help to substantially reduce such serious misestimation of disparity because they explicitly take into account the causal relationship between the allowable and non-allowable covariates. For example, our approaches permit an adjustment in allowable covariates to cause substantial adjustment in the non-allowable ones, which in turn may lead to substantial adjustment in the predicted outcome, even if the allowable predictor appears to be very weak in the fitted model for predicting the outcome.

1.2 Explicating the Underlying Causal Assumptions

In order to measure disparity meaningfully, such as to implement the IOM definition for health care disparity, one must be explicit about the underlying causal assumptions that are embedded in any disparity measure. The fact that the exact causal mechanisms may not be known or may not even be knowable is not a reason to “sweep everything under the rug”. On the contrary, this is precisely the reason for us to be explicit about our assumptions, so that policy makers and subsequent researchers can correctly interpret the disparity measures/estimates we obtain and determine the directions for correction or improvement when newer information about the underlying causal relationships becomes available.

The key reason that we need to make causal assumptions is that once an action is forced upon a particular variable (e.g., by changing a minority group’s distribution of clinical needs to match that of the non-Latino white population), it will have a ripple effect—in real life—on other variables (e.g., income level) that are impacted by the one adjusted. However, this ripple effect is not estimable without carrying out the actual (social) experiment, because the observed relationships in a natural population may or may not be preserved after an intervention. As an illustrative example, in a natural population, a person’s left-eye visual acuity (AV) may be highly correlated with the person’s right-eye AV. However, this correlation will be destroyed or at least reduced if we perform a vision correction laser surgery on the right eye only. The two AVs will become independent shortly after the surgery, but may become correlated again over time, though the cross-sectional data from a natural population would tell us little about how large this correlation could be or whether it would ever reach the same level as in the natural population.

Therefore, in order to measure the disparity “not due to” the allowable covariates, we must postulate causal directions, as well as how any relationships among relevant variables are preserved or altered with the change from a natural population to a hypothetical one. There are two extreme types of unidirectional causal relationships: (A) allowable covariates impact non-allowable covariates but not vice versa; and (B) non-allowable covariates impact allowable covariates but not vice versa. The more realistic relationships are likely to be either (C) allowable covariates and non-allowable covariates are inter-related and reciprocally impact each other, or (D), which is (C) plus the possibility that both allowable and non-allowable covariates are also impacted by the outcome itself (over time).

While (C) and (D) are most dynamic and realistic, they do not permit useful modeling without further specifications on how the variables involved impact each other. As these specifications are content dependent and can be extremely difficult to postulate, we will pursue them in future work. In this paper, we lay out the statistical frameworks for the simpler causal relationships (A) and (B). These two frameworks serve as building blocks for more complex causal specifications, and at the same time provide plausible specifications that might yield useful bounds on the true disparity when more complicated causal relationships are present. In some applications, such as the one presented in Section 3.2, such simplistic causal assumptions are actually reasonable, leading to sensible practical solutions.


2.1 Linking Natural and Hypothetical Joint Distributions

Let XN denote non-allowable covariates such as knowledge about health, and let XA denote allowable covariates such as clinical needs. Let Y denote the outcome of interest, such as log of the health care expenditure. To measure the disparity, we want to adjust the levels of allowable covariates (XA) but not the levels of non-allowable covariates (XN). Note here that all variables are measured for each individual i, but we suppress the subscript i throughout the text to simplify the notation. To describe the distribution of these variables, we use the common generic notation P( ), e.g., P(XN). Whenever needed, we will use subscript 1 to denote the reference group (e.g., the non-Latino whites) and 2 the group of interest (e.g., a minority group), for example, P1(XN) and P2(XN).

The goal of our modeling is to estimate the potential outcome if the group of interest has the same levels of allowable covariates as the reference group. The first step in setting up our proposed frameworks is to explicitly consider the joint distribution of (Y, XA, XN), and recognize that there are two joint distributions of interest: one for the natural population, and one for the adjusted hypothetical population. We use the superscript (H) to denote different populations, e.g. P2(H) ( ), where (H) can refer to either an adjustment rule for a hypothetical population (e.g. P2(A) ( ), for adjustment rule (A)) or a natural (or non-adjusted) population (e.g. P2(N) ( )). But for any (H), we always have the following decomposition:

$$P_2^{(H)}(Y, X_N, X_A) = P_2^{(H)}(Y \mid X_N, X_A)\, P_2^{(H)}(X_N, X_A). \tag{1}$$


The importance of recognizing the dependence on H here is that only the natural population, P(N) (Y, XN, XA), can be estimated from the data. Therefore, in order to calculate disparities under a hypothetical population, we need to make strong assumptions to link the hypothetical population, such as P2(A)(Y,XN,XA), to the natural population P2(N)(Y,XN,XA). Our first assumption, which appears to be taken for granted in much of the existing literature, is that the “forced action” of the adjustment has no impact on the conditional distribution of Y given (XN, XA). That is, for any adjustment rule (A), we assume

$$P_2^{(A)}(Y \mid X_N, X_A) = P_2^{(N)}(Y \mid X_N, X_A). \tag{2}$$


We will refer to (2) as the “predictively nature preserving” (PNP) assumption, meaning that the predictive nature of {XN, XA} on Y is preserved despite the “forced action” on XA.

One can easily conceive of a scenario under which the PNP assumption is false, but without such an assumption, the estimation of the disparity is essentially impossible. For example, in our hypothetical eye vision example, two people may have identical AVs for both eyes (e.g., both are 20/20 in the right eye but 20/40 in the left eye), but they can have quite different probabilities of having automobile accidents if one of them was born with such vision and the other achieved it via laser surgery on the right eye. Clearly, if this occurs, then it is impossible to estimate, using only the data collected from the natural population, the accident rate for the group of people with vision corrections done to their right eyes only.

To carry the decomposition (1) further, we can decompose the component P2(H)(XN,XA) in (1) into one conditional and one marginal distribution. This time, there are two possibilities:

$$P_2^{(H)}(X_N, X_A) = P_2^{(H)}(X_N \mid X_A)\, P_2^{(H)}(X_A), \tag{3}$$

$$P_2^{(H)}(X_N, X_A) = P_2^{(H)}(X_A \mid X_N)\, P_2^{(H)}(X_N). \tag{4}$$




The first decomposition is the basis for our conditional framework, which assumes that the non-allowable covariates XN are causally dependent on the allowable covariates XA but not vice versa. The second decomposition is suitable for the marginal causal framework, which assumes that the allowable covariates XA are causally dependent on the non-allowable covariates XN but not vice versa. Below we show how we can create different counterfactual populations, a standard practice in causal inference (e.g., see [10]), using these decompositions.

2.2 Conditional Disparity

Under the conditional framework, we adjust the marginal distribution of the allowable covariates XA from the natural population (such as Latinos) to the corresponding marginal distribution of the reference group (such as non-Latino whites), while preserving the conditional distribution for non-allowable covariates XN given allowable covariates XA as in the natural population. Specifically, the hypothetical joint distribution is obtained by replacing the marginal distribution of XA in the natural population

$$P_2^{(N)}(Y, X_N, X_A) = P_2^{(N)}(Y \mid X_N, X_A)\, P_2^{(N)}(X_N \mid X_A)\, P_2^{(N)}(X_A) \tag{5}$$


by that of the reference population (e.g., non-Latino whites), and thereby creating the following hypothetical population distribution:

$$P_2^{(C)}(Y, X_N, X_A) = P_2^{(N)}(Y \mid X_N, X_A)\, P_2^{(N)}(X_N \mid X_A)\, P_1^{(N)}(X_A). \tag{6}$$


Although P1(N)(XA) is taken from the natural population of the reference group, its insertion into (5) leads to a hypothetical population that retains the natural conditional distributions P2(N)(Y|XN,XA) and P2(N)(XN|XA), with the component P2(N)(XA) “mutated” into P1(N)(XA). We denote this adjustment rule under the conditional disparity framework as adjustment (C).

In order for (6) to be a meaningful hypothetical population, our assumptions are (i) the PNP assumption holds, and (ii) the adjustment action has no impact on the conditional distribution of XN given XA either, that is,

$$P_2^{(C)}(X_N \mid X_A) = P_2^{(N)}(X_N \mid X_A), \tag{7}$$


which is plausible when the causal direction is from XA to XN but not vice versa. We will refer to (7) as the “conditionally nature preserving” (CNP) assumption, meaning that the natural conditional distribution P2(XN|XA) is preserved after the adjustment on XA.

The ratio between the adjusted joint density (6) and the natural joint density (5) is simply the ratio of the marginal densities

$$R_C(X_A) = \frac{P_2^{(C)}(Y, X_N, X_A)}{P_2^{(N)}(Y, X_N, X_A)} = \frac{P_1^{(N)}(X_A)}{P_2^{(N)}(X_A)}. \tag{8}$$


Following the principle of importance weighting, the expected outcome under the hypothetical population (6) can be expressed as the following weighted expectation of Y under the natural population (5), with the importance weight RC(XA):

$$E_2^{(C)}[Y] = E_2^{(N)}\big[\,Y\, R_C(X_A)\,\big], \tag{9}$$


where E2(C) denotes the expectation with respect to the hypothetical population in (6), and E2(N) denotes the expectation with respect to the natural population in (5).

Expression (9) gives us a practical way to estimate E2(C) [Y] because its right hand side involves only expectations with respect to the natural population (5), which we can estimate from the sample data. Since the current paper focuses on setting up conceptual frameworks, the detailed estimation procedures, particularly for estimating RC(XA), will be presented in a subsequent paper.
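
As a minimal sketch of how (9) could be used when the allowable covariate is discrete, the following Python snippet reweights a toy minority sample by the ratio RC(XA). The records, the reference distribution `p1_xa`, and the function names are hypothetical and chosen only for illustration; the estimation procedure deferred to the subsequent paper may well differ.

```python
from collections import Counter

def density_ratio_weights(xa_minority, p1_xa):
    """Return R_C(x) = P1(X_A = x) / P2(X_A = x) for each observed level x."""
    n = len(xa_minority)
    p2_xa = {x: c / n for x, c in Counter(xa_minority).items()}
    return {x: p1_xa[x] / p2_xa[x] for x in p2_xa}

def conditional_adjusted_mean(y, xa, p1_xa):
    """Estimate E2^(C)[Y] = E2^(N)[Y * R_C(X_A)] by a weighted sample mean."""
    r_c = density_ratio_weights(xa, p1_xa)
    return sum(yi * r_c[xi] for yi, xi in zip(y, xa)) / len(y)

# Hypothetical toy data for the group of interest (not from the paper).
y_minority = [1, 0, 0, 1, 0, 0, 0, 1]    # outcome Y
xa_minority = [1, 1, 0, 1, 0, 0, 0, 1]   # allowable covariate X_A (binary)
p1_xa = {0: 0.75, 1: 0.25}               # assumed reference-group marginal of X_A

adjusted = conditional_adjusted_mean(y_minority, xa_minority, p1_xa)
```

With these toy numbers the weighted mean equals 0.1875, which agrees with direct standardization: the minority-group means of Y given XA (0 for XA = 0 and 0.75 for XA = 1) averaged over the assumed reference distribution of XA.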

Intuitively, the adjustment under our conditional framework amounts to weighting the level of health care (Y) among minorities by the density ratio RC(XA). Minorities with higher density ratio RC get weighted up because a value of RC(XA) > 1 tells us that there are more non-Latino whites with the levels of XA than minorities with the same levels of XA. The corresponding disparity is then measured as the difference between the expected value of Y for the adjusted (hypothetical) population (6) and that of the reference population:

$$D_C = E_2^{(C)}[Y] - E_1^{(N)}[Y]. \tag{10}$$


We term DC of (10) the conditional disparity because the main source of disparity is the difference between the conditional distributions P2(N)(XN|XA) and P1(N)(XN|XA). The difference between P2(N)(Y|XN,XA) and P1(N)(Y|XN,XA) may also be of interest in its own right, an issue we shall not pursue here because of space limitations, but will briefly touch upon in Section 3.3.

Applying expression (9) to the definition (10), we have the following expression for conditional disparity that can be estimated using sample data:

$$D_C = E_2^{(N)}\big[\,Y\, R_C(X_A)\,\big] - E_1^{(N)}[Y]. \tag{11}$$


Notice that this expression for conditional disparity does not involve the non-allowable covariates, XN. This is possible because of the assumption that XN is caused by XA. Under such an assumption, we can greatly simplify the estimation task since (11) bypasses the need to model XN. The theoretical implication of this simplification will be discussed in Section 4.
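
To see why XN drops out, the following short derivation may help. It is a sketch written for a discrete XA (an assumption made here only to keep the notation simple), and it uses nothing beyond the definition of RC(XA) as the ratio of the two marginal densities and the law of iterated expectations:

$$
\begin{aligned}
E_2^{(N)}\big[\,Y\, R_C(X_A)\,\big]
&= E_2^{(N)}\big[\, E_2^{(N)}[Y \mid X_A]\; R_C(X_A) \,\big] \\
&= \sum_{x_A} E_2^{(N)}[Y \mid X_A = x_A]\; \frac{P_1^{(N)}(x_A)}{P_2^{(N)}(x_A)}\; P_2^{(N)}(x_A) \\
&= \sum_{x_A} E_2^{(N)}[Y \mid X_A = x_A]\; P_1^{(N)}(x_A).
\end{aligned}
$$

Only the regression of Y on XA in the group of interest and the marginal distribution of XA in the reference group appear in the last line.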

2.3 Marginal Disparity

In contrast to conditional disparity, which equates the two marginal distributions of XA, the marginal disparity framework replaces the conditional distribution of XA, conditioning on XN, of the population of interest (e.g., Latinos) by that of the reference population (e.g., non-Latino whites). Specifically, we replace the conditional distribution P2(N)(XA|XN) in the natural population

$$P_2^{(N)}(Y, X_N, X_A) = P_2^{(N)}(Y \mid X_N, X_A)\, P_2^{(N)}(X_A \mid X_N)\, P_2^{(N)}(X_N) \tag{12}$$


by that of the reference population to create the following hypothetical population

$$P_2^{(M)}(Y, X_N, X_A) = P_2^{(N)}(Y \mid X_N, X_A)\, P_1^{(N)}(X_A \mid X_N)\, P_2^{(N)}(X_N). \tag{13}$$


We denote this adjustment rule under the marginal disparity framework as adjustment (M).

Similar to the conditional disparity framework, in order for (13) to be a meaningful hypothetical population, we have assumed that (i) the PNP assumption holds, and (ii) the adjustment action has no impact on the marginal distribution of XN either; that is,

$$P_2^{(M)}(X_N) = P_2^{(N)}(X_N), \tag{14}$$


which is plausible when the causal direction is from XN to XA but not vice versa. We will refer to (14) as the “marginally nature preserving” (MNP) assumption, meaning that the marginal distribution P2(XN) is preserved after the adjustment on XA.

Similar to (8), the ratio between the joint densities (13) and (12) is given by the ratio between the two conditional densities

$$R_M(X_A; X_N) = \frac{P_2^{(M)}(Y, X_N, X_A)}{P_2^{(N)}(Y, X_N, X_A)} = \frac{P_1^{(N)}(X_A \mid X_N)}{P_2^{(N)}(X_A \mid X_N)}. \tag{15}$$


Again, the ratio (15) can be used as the importance weight to express:

$$E_2^{(M)}[Y] = E_2^{(N)}\big[\,Y\, R_M(X_A; X_N)\,\big], \tag{16}$$


where E2(M) denotes expectation under the hypothetical population (13), and E2(N) denotes expectation under the natural population (12). Note that the right hand side of (16) can be estimated from sample data obtained in the natural population (12).

It is useful to visualize the adjustment under our marginal framework as first stratifying the minority population by the level of the non-allowable covariates (e.g., knowledge of health). We then apply the same weighting scheme as in the conditional disparity approach, but now within each stratum; the weight there, namely the ratio of marginal densities RC(XA), is therefore replaced by the ratio of the corresponding conditional densities RM(XA; XN). Minorities within a particular stratum, as defined by their values of XN, with a higher conditional density ratio RM get weighted up because, within that stratum, there are proportionally more non-Latino whites than minorities with those levels of XA.
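
The stratify-then-weight description can be sketched in a few lines of Python for discrete covariates. The toy records, the reference conditional distribution `p1_xa_given_xn`, and the function names are hypothetical and serve only to illustrate the weighting in (16); they are not the estimation procedure of the paper.

```python
from collections import Counter

def conditional_dist(xa, xn):
    """Empirical P(X_A = a | X_N = n), returned as a dict keyed by (a, n)."""
    joint = Counter(zip(xa, xn))
    stratum = Counter(xn)
    return {(a, n): c / stratum[n] for (a, n), c in joint.items()}

def marginal_adjusted_mean(y, xa, xn, p1_xa_given_xn):
    """Estimate E2^(M)[Y] = E2^(N)[Y * R_M(X_A; X_N)] by a weighted mean."""
    p2 = conditional_dist(xa, xn)
    total = 0.0
    for yi, a, n in zip(y, xa, xn):
        r_m = p1_xa_given_xn[(a, n)] / p2[(a, n)]  # stratum-specific weight
        total += yi * r_m
    return total / len(y)

# Hypothetical toy data for the group of interest (not from the paper).
y = [1, 0, 1, 0, 0, 0, 1, 0]    # outcome Y
xa = [1, 1, 0, 0, 1, 0, 0, 0]   # allowable covariate X_A
xn = [0, 0, 0, 0, 1, 1, 1, 1]   # non-allowable covariate X_N (the strata)
# Assumed reference-group P1(X_A = a | X_N = n), keyed by (a, n).
p1_xa_given_xn = {(1, 0): 0.25, (0, 0): 0.75, (1, 1): 0.5, (0, 1): 0.5}

adjusted_m = marginal_adjusted_mean(y, xa, xn, p1_xa_given_xn)
```

For these toy inputs the result is 1/3, the same value obtained by standardizing the stratum-specific means of Y to the assumed reference conditional distribution and then averaging over the minority group's own distribution of XN.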

The marginal disparity measure is then defined as the difference between the expected value of Y for the adjusted (hypothetical) population (13) and that of the reference population:

$$D_M = E_2^{(M)}[Y] - E_1^{(N)}[Y]. \tag{17}$$


We term DM as marginal disparity because the main source of the disparity is in the difference in the marginal distributions P2(N)(XN) and P1(N)(XN), in addition to any difference in P2(N)(Y|XN,XA) and P1(N)(Y|XN,XA). Again, applying expression (16) to the definition (17), we have the following expression for marginal disparity that can be estimated using sample data:

$$D_M = E_2^{(N)}\big[\,Y\, R_M(X_A; X_N)\,\big] - E_1^{(N)}[Y]. \tag{18}$$


The estimation of RM(XA; XN) is more complicated than estimating RC(XA) due to the higher dimensionality. Again, these technical details will be addressed in a subsequent paper.


With the two frameworks given above, a natural question is when they give the same disparity estimates or, more profoundly, whether they give different values that would matter in practice. The answer to the first part is a clear-cut theoretical result, which we present below. The answer to the second part is obviously “it depends”, because it depends critically on the nature of the dependence structure between XA and XN, as well as the dependence of Y on (XA, XN), in particular applications. We will illustrate this with two examples: one shows the difference between getting it right and getting it wrong, and the other gives a class of cases where the difference can be made arbitrarily large. For the rest of this paper, we suppress the superscript (N) denoting the natural population whenever the context is clear.

3.1 A Theoretical Result Related to Local Dependence Function

The difference between DC and DM can be expressed as

$$\Delta D = D_C - D_M = E_2\big[\,Y\,\{R_C(X_A) - R_M(X_A; X_N)\}\,\big]. \tag{19}$$


The two disparity measures will be identical, ΔD = 0, if

$$R_C(X_A) = R_M(X_A; X_N), \quad \text{that is,} \quad \frac{P_1(X_A)}{P_2(X_A)} = \frac{P_1(X_A \mid X_N)}{P_2(X_A \mid X_N)}. \tag{20}$$


This condition is equivalent to the condition that

$$G_1(X_N, X_A) = G_2(X_N, X_A), \quad \text{where} \quad G_k(X_N, X_A) = \frac{P_k(X_N, X_A)}{P_k(X_N)\, P_k(X_A)}, \quad k = 1, 2. \tag{21}$$


Here the G function can be viewed as a measure of the dependence structure between XN and XA, and therefore condition (21) says that as long as the dependence structure is the same for the two groups (e.g., it remains the same across the two racial/ethnic groups), the two disparity measures would be identical. As a special case, if XN and XA are independent under both populations, then the two measures are the same because both G1 and G2 are then identical to 1.
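
The equivalence of the two conditions follows from a one-line rearrangement. The sketch below assumes that G_k denotes the ratio of the joint distribution of (XN, XA) in group k to the product of its marginals, which is the reading consistent with G being identically 1 under independence:

$$
\frac{P_1(X_A)}{P_2(X_A)} = \frac{P_1(X_A \mid X_N)}{P_2(X_A \mid X_N)}
\;\Longleftrightarrow\;
\frac{P_1(X_A \mid X_N)}{P_1(X_A)} = \frac{P_2(X_A \mid X_N)}{P_2(X_A)}
\;\Longleftrightarrow\;
G_1(X_N, X_A) = G_2(X_N, X_A),
\qquad
G_k(X_N, X_A) = \frac{P_k(X_N, X_A)}{P_k(X_N)\, P_k(X_A)}.
$$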

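For continuous variables, the notion that G is a measure of dependence structure can also be examined through the local dependence function (LDF), as defined in [11] and studied in [12] and [13],

$$\gamma_k(x_N, x_A) = \frac{\partial^2}{\partial x_N\, \partial x_A} \log p_k(x_N, x_A), \quad k = 1, 2. \tag{22}$$

Because the marginal densities drop out under the mixed partial derivative, so that

$$\gamma_k(x_N, x_A) = \frac{\partial^2}{\partial x_N\, \partial x_A} \log G_k(x_N, x_A), \tag{23}$$

it is obvious that condition (21) implies that the LDF is independent of the group index, i.e., the LDF does not change with the racial/ethnic group. Note however that the reverse is not necessarily true; that is, the LDF can be invariant to the group index while condition (21) does not hold. In this sense, the measure of dependence by the G function is more stringent than that by the LDF.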

Finally, we note that condition (21) is sufficient but not necessary for ΔD = 0. A simple example is that ΔD = 0 when the regression of Y on XA and XN, that is, E2[Y|XN, XA], is free of both XN and XA (note that this is a weaker requirement than independence between Y and (XN, XA), since only the conditional mean of Y is involved). This, of course, does not happen when XN and/or XA are useful predictors of Y. But it reminds us that the difference between DC and DM also depends on the relationship between Y and (XN, XA), and the difference will be small when both XN and XA are weak predictors.
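
The constant-regression example can be checked directly. As a sketch, suppose E2[Y|XN, XA] = c for some constant c (a symbol introduced here only for illustration), and write sums for the discrete case (integrals in the continuous case). Both importance weights have mean one under the natural population of the group of interest:

$$
E_2[R_C(X_A)] = \sum_{x_A} P_2(x_A)\, \frac{P_1(x_A)}{P_2(x_A)} = 1,
\qquad
E_2[R_M(X_A; X_N)] = \sum_{x_N} P_2(x_N) \sum_{x_A} P_2(x_A \mid x_N)\, \frac{P_1(x_A \mid x_N)}{P_2(x_A \mid x_N)} = 1,
$$

so that

$$
E_2[\,Y R_C(X_A)\,] = c\, E_2[R_C(X_A)] = c = c\, E_2[R_M(X_A; X_N)] = E_2[\,Y R_M(X_A; X_N)\,],
$$

and hence the two disparity measures coincide, each being equal to c − E1[Y], whatever the dependence between XN and XA.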

We emphasize here that the statement we just made is true only when both XN and XA are weak predictors. If one is weak but the other is not, the difference between the two measures can still be very large if there is high correlation between XN and XA. Indeed, the “one-weak and one-strong” scenario is quite common in practice when the two predictors are highly correlated, because of the well-known “collinearity” problem among the predictors. And it is precisely in such cases that recognizing the impact of the allowable covariates on the non-allowable ones, or vice versa, is of critical importance. As mentioned in Section 1, the common approach of adjusting only the allowable covariates without considering their impact on the non-allowable covariates can lead to serious misestimation of the disparity when the allowable covariates appear to be weak predictors in the regression of Y on XN and XA.

3.2 A Discrete-Distribution Example

We start with a simple 2 × 2 × 2 contingency table example to illustrate both the basic calculations for DC and DM and the differences between them. We use the combined data set of the three large epidemiological studies that make up the NIMH Collaborative Psychiatric Epidemiology Survey (CPES): the National Latino and Asian American Study (NLAAS) [14], the National Comorbidity Survey Replication (NCS-R) [15], and the National Survey of American Life (NSAL) [16]. These studies focus on collecting epidemiological information on mental health and substance disorders and services utilization among the general population, with special emphasis on ethnic minority groups in the NLAAS (Latinos and Asians) and NSAL (African Americans and Afro-Caribbeans) and with non-Latino white comparisons from the NCS-R. The studies were designed to allow integration as though they were a single, nationally-representative study [17]. The combined data set is the largest epidemiological data set available for examining the patterns and correlates of mental health services use in minority populations in the United States. The sampling frames and sample selection procedures are described in detail elsewhere [18]. For illustration purposes, here we treat this combined data set as a population by itself, and therefore all the numbers below are regarded as population quantities (e.g., probabilities), instead of sample estimates (e.g., sample proportions).

For simplicity, we focus on a dichotomous outcome: Y = 1 means the respondent had at least one visit to any mental health service provider (either specialist or generalist) in the past year, and Y = 0 otherwise. The allowable covariate is also a binary variable indicating clinical need: XA = 1 if there was a need, and XA = 0 if there was not. The non-allowable covariate is a binary variable indicating nativity: XN = 1 if the respondent is an immigrant, and XN = 0 if the respondent was born in the United States.

Table I provides the data for the non-Latino white population, from which we can easily calculate the service use rate for this population. In Table I, there are two numbers in each of the cells in the 2 × 2 layout. The top number is the percentage of individuals who fall into the (i, j)-cell defined by the values of (XN = i, XA = j), and the bottom bracketed number μij is the percentage of people in that cell who have used services, that is, μij = P(Y = 1|XN = i, XA = j). Consequently, the overall service rate for the non-Latino white population, namely E1[Y] is obtained by multiplying the two numbers in each cell, and adding them up across all cells. This leads to E1[Y] = 14.39%. Similarly, for the Afro-Caribbean population (Table II), E2[Y] = 6.75%, so the observed racial/ethnic difference is

$$\mathrm{RD} = E_2[Y] - E_1[Y] = 6.75\% - 14.39\% = -7.64\%. \tag{24}$$

Table I. Non-Latino White Population, where μij = P1(Y = 1|XN = i, XA = j).
Table II. Afro-Caribbean Population, where μij = P2(Y = 1|XN = i, XA = j).

This, however, is not necessarily the disparity in the sense of the IOM definition because it has not adjusted for the difference in clinical needs.

Comparing Table I and Table II, we observe an interesting phenomenon. Conditional on nativity, the percentage of people in need is greater in the Afro-Caribbean population than in the non-Latino white population: 55.75% versus 41.62% among the US born, and 33.90% versus 30.91% among immigrants. The pattern, however, is reversed for the marginal rates, that is, when we combine the US born and the immigrants: 41.18% for Afro-Caribbeans versus 41.28% for non-Latino whites. The difference between these two marginal rates is minimal, although it involves no estimation error because we are treating the data as if they were the entire population; it is nevertheless an example of the well-known Simpson’s paradox [19]. The reason is the extreme imbalance of the nativity groups in the two populations: more than 95% of the non-Latino whites were US born, but only 1/3 of the Afro-Caribbeans were US born.
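
The reversal can be checked with a few lines of arithmetic. The sketch below uses only the six need rates quoted above; the US-born shares are back-calculated from those rounded percentages (an assumption of this sketch, not values read from Tables I and II), and the variable names are hypothetical.

```python
# Need rates P(X_A = 1 | X_N) and P(X_A = 1) quoted in the text, in percent.
need_us_born = {"afro_caribbean": 55.75, "white": 41.62}
need_immigrant = {"afro_caribbean": 33.90, "white": 30.91}
need_overall = {"afro_caribbean": 41.18, "white": 41.28}

def implied_us_born_share(group):
    """Solve overall = s * us_born + (1 - s) * immigrant for the share s."""
    lo, hi = need_immigrant[group], need_us_born[group]
    return (need_overall[group] - lo) / (hi - lo)

# ASSUMPTION: shares are implied by rounded figures, so they are approximate.
shares = {g: implied_us_born_share(g) for g in need_overall}

# Within each nativity stratum the Afro-Caribbean need rate is higher ...
higher_within = (need_us_born["afro_caribbean"] > need_us_born["white"]
                 and need_immigrant["afro_caribbean"] > need_immigrant["white"])
# ... yet the pooled rate is lower: the reversal that defines Simpson's paradox.
lower_overall = need_overall["afro_caribbean"] < need_overall["white"]
```

The implied US-born shares come out at roughly 97% for non-Latino whites and roughly one third for Afro-Caribbeans, in line with the imbalance described in the text.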

The implication of this phenomenon for our disparity measure is clear. First, given that the difference in the marginal rates is so small, 41.18% versus 41.28%, one would expect that the conditional disparity, which results from adjusting the Afro-Caribbeans’ marginal rate from 41.18% to the non-Latino whites’ marginal rate of 41.28%, will differ only minimally from the value of RD in (24). Indeed, as shown below, the conditional disparity in this case is DC = −7.62%, nearly identical to RD = −7.64%.

Second, this adjustment in fact is in the wrong direction, because in this case the causal assumption underlying the conditional disparity, namely that the allowable covariate (clinical need) causes the non-allowable covariate (nativity), is clearly very implausible. The marginal disparity approach is much more sensible, because it adjusts clinical needs within each nativity category. Given that the two nativity groups have very different levels of clinical need, with the US born having more need, it is intuitive that we should make the adjustment after stratifying by nativity. Because the Afro-Caribbean population has more need in each of the nativity groups, it is also intuitive that had their needs been the same as those of the non-Latino whites, the observed racial/ethnic difference would be even larger. Indeed, as shown below, the marginal disparity in this case is DM = −8.84%. In contrast to DC, which points in the wrong direction, DM shows that the disparity is actually more pronounced than the unadjusted racial/ethnic difference by about (8.84 − 7.64)/7.64 ≈ 16%.

3.3 Disparity Calculations

The calculations of DC and DM can be best illustrated by creating two adjusted versions of Table II, corresponding respectively to the two hypothetical populations as defined in (6) and (13). They are given in Table III and Table IV respectively. To construct Table III, which is for the conditional disparity, we need to compute the density ratio RC of (8). From the last row of Table I and Table II respectively, we can obtain this easily as

$$R_C(0) = \frac{P_1(X_A = 0)}{P_2(X_A = 0)} = \frac{58.72\%}{58.82\%} \approx 0.998, \qquad R_C(1) = \frac{P_1(X_A = 1)}{P_2(X_A = 1)} = \frac{41.28\%}{41.18\%} \approx 1.002.$$

Table III. Adjusted Afro-Caribbean Population for Computing DC.
Table IV. Adjusted Afro-Caribbean Population for Computing DM.

We can then multiply each of the three un-bracketed proportions in the “No (0)” column of Table II by RC(0), and each of the three un-bracketed proportions in the “Yes (1)” column of Table II by RC(1). This yields the adjusted population corresponding to the conditional disparity approach, as given in Table III, where the last column P(XA = 1|XN) has also been recomputed from the adjusted cell probabilities. We see that Table III and Table I have the same marginal distribution for XA (rounding errors notwithstanding), as intended. The expected value of Y under this adjusted population is obtained by multiplying each cell probability in Table III by the corresponding μij from Table II and then summing the products. This leads to E2(C)[Y] = 6.77%, and hence

$$D_C = E_2^{(C)}[Y] - E_1[Y] = 6.77\% - 14.39\% = -7.62\%.$$
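The reweighting just described can be sketched in a few lines of Python. This is an illustration only: the cell probabilities and cell means below are hypothetical placeholders rather than the entries of Tables I and II, and only the two marginal need rates (41.28% and 41.18%) are taken from the text. The function names are likewise illustrative.

```python
# Conditional (R_C) reweighting of a 2x2 table: rows are X_N, columns are X_A.

def density_ratio_c(p1_xa, p2_xa):
    """R_C(a) = P1(X_A = a) / P2(X_A = a) for each level a of X_A."""
    return [p1 / p2 for p1, p2 in zip(p1_xa, p2_xa)]


def conditional_adjust(p2_joint, p1_xa):
    """Multiply every cell in column a of the minority table by R_C(a)."""
    p2_xa = [sum(row[a] for row in p2_joint) for a in range(2)]
    rc = density_ratio_c(p1_xa, p2_xa)
    return [[row[a] * rc[a] for a in range(2)] for row in p2_joint]


def expected_outcome(joint, mu):
    """Sum of (cell probability x cell mean) over the four cells."""
    return sum(joint[i][a] * mu[i][a] for i in range(2) for a in range(2))


# Marginal need rates quoted in the text: 41.28% (reference), 41.18% (minority).
p1_xa = [1 - 0.4128, 0.4128]

# HYPOTHETICAL minority cell probabilities, chosen only so that the X_A
# marginal equals (58.82%, 41.18%); they are not the Table II entries.
p2_joint = [[0.4000, 0.2000],
            [0.1882, 0.2118]]

# HYPOTHETICAL cell means mu_ij of the outcome Y.
mu2 = [[0.02, 0.10],
       [0.04, 0.15]]

adjusted_c = conditional_adjust(p2_joint, p1_xa)
e2_c = expected_outcome(adjusted_c, mu2)  # analogue of E2(C)[Y]
```

Because the two marginal need rates almost coincide, both ratios are very close to one, so the adjusted table, and hence the adjusted expectation, barely moves away from the unadjusted one. This mirrors the near equality of DC and RD noted above.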


To calculate the marginal disparity, we first need to compute the RM function of (15), which is determined by the rightmost columns, labeled “P(XA = 1|XN)”, in Table I and Table II. Specifically, we have

$$R_M(i; j) = \frac{P_1(X_A = j \mid X_N = i)}{P_2(X_A = j \mid X_N = i)}, \qquad i, j = 0, 1.$$


Table IV is then obtained by multiplying the (i, j)-cell proportion (the top un-bracketed percentage) in Table II by the RM(i; j) just obtained, for i, j = 0, 1, and then computing the corresponding P(XA = 1|XN) and P2(M)(XA) accordingly. We note that the resulting conditional distribution P2(M)(XA|XN) is the same as that from Table I (rounding errors notwithstanding), as it should be, but the marginal distribution P2(M)(XA) is now markedly different from P1(XA), that of the non-Latino whites. This difference reflects the difference between the two approaches, because with the conditional disparity approach we have P2(C)(XA) = P1(XA). As we discussed previously, the seemingly natural “equating-the-need-level” approach is actually misleading in this application because of Simpson’s paradox. Equating the need level after stratifying on nativity is a much more sensible approach.

To find the expectation of Y under this adjusted Afro-Caribbean population, we multiply the four cell percentages in Table IV respectively by the four μij values in Table II and then sum them up. This yields E2(M)[Y] = 5.55%. Consequently, the marginal disparity, which in this example can be regarded as a sensible measure of disparity, is given by

$$D_M = E_2^{(M)}[Y] - E_1[Y] = 5.55\% - 14.39\% = -8.84\%.$$
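The marginal adjustment can be sketched in the same way. Again, both 2×2 tables below are hypothetical placeholders, not the entries of Tables I and II. They are constructed only so that the minority group has the higher need rate within each XN row while the two XA marginals are nearly equal (41.18% versus 41.28%), which is the qualitative pattern described in the text. The function names are illustrative.

```python
# Marginal (R_M) reweighting of a 2x2 table: rows are X_N, columns are X_A.

def xa_given_xn(joint):
    """Row-normalise a joint table to obtain P(X_A = j | X_N = i)."""
    return [[cell / sum(row) for cell in row] for row in joint]


def marginal_adjust(p2_joint, p1_joint):
    """Multiply cell (i, j) of the minority table by R_M(i; j)."""
    c1, c2 = xa_given_xn(p1_joint), xa_given_xn(p2_joint)
    rm = [[c1[i][j] / c2[i][j] for j in range(2)] for i in range(2)]
    return [[p2_joint[i][j] * rm[i][j] for j in range(2)] for i in range(2)]


# HYPOTHETICAL reference-group table; X_A marginal is (58.72%, 41.28%).
p1_joint = [[0.0300, 0.0100],
            [0.5572, 0.4028]]

# HYPOTHETICAL minority-group table; X_A marginal is (58.82%, 41.18%).
p2_joint = [[0.4000, 0.2000],
            [0.1882, 0.2118]]

adjusted_m = marginal_adjust(p2_joint, p1_joint)

# Need rate P2(M)(X_A = 1) in the adjusted minority population.
p2m_xa1 = sum(row[1] for row in adjusted_m)
```

In this sketch the adjusted table keeps the minority group's distribution of XN and inherits the reference group's conditional distribution of XA given XN, so its XA marginal is free to move well away from that of the reference group.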


3.4. A Continuous-Distribution Example

This theoretical example establishes the mathematical fact that the difference between the conditional disparity and the marginal disparity can be arbitrarily large. It also illustrates another form of Simpson’s paradox: even when there is no disparity in any stratum defined by the non-allowable variables XN, one can still observe a disparity in the aggregated population, due to the correlation between XN and race/ethnicity in the aggregated population and the fact that XN is classified as non-allowable.

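To see this, let us consider a simple linear regression case

$$E_k[Y \mid X_N, X_A] = \beta_0^{(k)} + \beta_N^{(k)} X_N + \beta_A^{(k)} X_A, \qquad k = 1, 2, \tag{25}$$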

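where k = 1 indexes the non-Latino white population and k = 2 the minority population. To simplify the algebra, suppose that in the natural populations (XN, XA) is bivariate normal, with mean (μN(k),μA(k)), unit variances and correlation ρ(k). That is,

$$\begin{pmatrix} X_N \\ X_A \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_N^{(k)} \\ \mu_A^{(k)} \end{pmatrix}, \begin{pmatrix} 1 & \rho^{(k)} \\ \rho^{(k)} & 1 \end{pmatrix} \right) \quad \text{under } P_k, \qquad k = 1, 2. \tag{26}$$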

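Under this setting, for the conditional disparity, the hypothetical joint distribution P2(C)(XN,XA)=P2(XN|XA)P1(XA) is bivariate normal with the following means and covariance matrix:

$$\begin{pmatrix} X_N \\ X_A \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_N^{(2)} + \rho^{(2)} \left( \mu_A^{(1)} - \mu_A^{(2)} \right) \\ \mu_A^{(1)} \end{pmatrix}, \begin{pmatrix} 1 & \rho^{(2)} \\ \rho^{(2)} & 1 \end{pmatrix} \right). \tag{27}$$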

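In contrast, under the marginal disparity approach, the hypothetical joint distribution for (XN, XA) is given by P1(XA|XN)P2(XN), which is also bivariate normal but with the following means and covariance matrix:

$$\begin{pmatrix} X_N \\ X_A \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_N^{(2)} \\ \mu_A^{(1)} + \rho^{(1)} \left( \mu_N^{(2)} - \mu_N^{(1)} \right) \end{pmatrix}, \begin{pmatrix} 1 & \rho^{(1)} \\ \rho^{(1)} & 1 \end{pmatrix} \right). \tag{28}$$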

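Simple algebra then yields that the difference between the two measures is

$$\Delta D \equiv D_M - D_C = \beta_A^{(2)} \rho^{(1)} \left( \mu_N^{(2)} - \mu_N^{(1)} \right) - \beta_N^{(2)} \rho^{(2)} \left( \mu_A^{(1)} - \mu_A^{(2)} \right). \tag{29}$$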

From (29), we have the following observations, two of which are special cases of what we have discussed in general in Section 3.1. Specifically, we see that ΔD = 0 whenever one of the following three conditions holds:

  (a) ρ(1) = ρ(2) = 0; that is, when XN and XA are independent in both populations;
  (b) βN(2)=βA(2)=0; that is, when the regression (25) does not depend on either XN or XA in the population of interest (not necessarily in the reference population);
  (c) μN(1)=μN(2) and μA(1)=μA(2); that is, when the two populations have the same marginal distributions for both XN and XA.

Of course, ΔD can be zero for many other (incidental) combinations of the parameter values, but the above three are the most useful for theoretical insight. Note in particular that conditions (a) and (b) are applicable in general, but condition (c) only works when the regression of Y is linear in both XN and XA. We emphasize that since the parameters in (29) have no restrictions other than |ρ(k)| ≤ 1, ΔD can be arbitrarily large, including approaching infinity.
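These observations can be checked numerically with a small sketch. It assumes the sign convention ΔD = DM − DC, uses the means of the two hypothetical bivariate normal populations implied by P2(XN|XA)P1(XA) and P1(XA|XN)P2(XN), and evaluates the regression with the minority group's coefficients. The function name and the parameter values are illustrative, not taken from the paper.

```python
def delta_d(beta_n2, beta_a2, mu_n, mu_a, rho):
    """Delta D = D_M - D_C (assumed sign convention).

    mu_n, mu_a and rho are dicts keyed by the population index k = 1, 2.
    """
    # Means of (X_N, X_A) under P2(X_N | X_A) P1(X_A)
    cond_n = mu_n[2] + rho[2] * (mu_a[1] - mu_a[2])
    cond_a = mu_a[1]
    # Means of (X_N, X_A) under P1(X_A | X_N) P2(X_N)
    marg_n = mu_n[2]
    marg_a = mu_a[1] + rho[1] * (mu_n[2] - mu_n[1])
    # The intercept and E1[Y] cancel in the difference, so they are omitted.
    e_c = beta_n2 * cond_n + beta_a2 * cond_a
    e_m = beta_n2 * marg_n + beta_a2 * marg_a
    return e_m - e_c


# Illustrative parameter values (hypothetical).
mu_n = {1: 0.0, 2: 1.0}
mu_a = {1: 0.0, 2: 0.5}
rho = {1: 0.5, 2: -0.3}

# Delta D scales linearly in the slope, so it grows without bound.
growth = [delta_d(0.0, b, mu_n, mu_a, rho) for b in (1.0, 10.0, 100.0)]
```

With these values `growth` increases in proportion to the slope on XA, which is the sense in which the difference between the two measures is unbounded.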

We also remark on a special case of interest, namely when Ek[Y|XN, XA] of (25) is free of both k (i.e., the race/ethnicity index) and XN (i.e., βN(k)=0). In such cases, there is no racial/ethnic disparity under the conditional disparity model, since XA is adjusted to have the same distribution for both racial/ethnic groups and (11) does not involve XN. Under the marginal disparity model, however, the matter is more complicated. Although XN does not impact Y directly, it impacts XA when it is correlated with XA. Consequently, the difference in the marginal distributions of XN in the two racial/ethnic groups will result in differences in the marginal distribution of XA even when, or rather especially when, the conditional distribution Pk(N)(XA|XN) is adjusted to be invariant to the race/ethnicity index k. It follows that there will be racial/ethnic disparity due to the indirect impact of XN on Y via XA. Indeed, it is easy to verify for the current example that the marginal disparity is given by

$$D_M = \beta_A^{(2)} \rho^{(1)} \left( \mu_N^{(2)} - \mu_N^{(1)} \right). \tag{30}$$

This is zero only when (i) ρ(1) = 0 and hence XA and XN are independent in the reference population so XN cannot impact XA in the hypothetical population, or (ii) βA(2)=0 and hence the impact of XN on XA does not translate into any impact on Y in the hypothetical population, or (iii) μN(2)=μN(1) and hence the distribution of XN is actually invariant to race/ethnicity.

Perhaps most important here is to notice Simpson’s paradox again. Although in the aggregated population there is a marginal disparity for the case above, clearly there is no disparity in any subpopulation defined by a particular value of XN, that is, when we condition on XN, because the conditional distribution P2(XA|XN) has been adjusted to be the same as P1(XA|XN). This of course is not paradoxical, just as Simpson’s paradox is not a real paradox in the mathematical sense. Once we classify XN as a non-allowable variable, then logically we have to accept any difference caused by it as a part of the overall disparity, regardless of whether the difference comes from its direct or indirect impact on the outcome Y. Of course, one may question whether the indirect part really should be viewed as disparity, which is not an easy issue to address, as one is then implying that XN is both a non-allowable variable (for the direct impact) and an allowable variable (for the indirect impact via XA). We shall pursue this complex issue in subsequent work.
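The contrast between the stratum-level and aggregated comparisons can be illustrated with a small simulation. It is only a sketch under the special case above: a common regression with no direct effect of XN, unit variances, and XA given XN drawn from the reference group's conditional distribution in both groups. All parameter values and function names are hypothetical.

```python
import random


def stratum_mean(x_n, beta0, beta_a, mu_n1, mu_a1, rho1):
    """E[Y | X_N = x_n]; by construction it is the same function for both groups."""
    return beta0 + beta_a * (mu_a1 + rho1 * (x_n - mu_n1))


def aggregate_gap(n, beta0, beta_a, mu_n, mu_a1, rho1, seed=0):
    """Monte Carlo estimate of E2(M)[Y] - E1[Y]."""
    rng = random.Random(seed)
    sd = (1.0 - rho1 ** 2) ** 0.5  # conditional s.d. of X_A given X_N
    total = {1: 0.0, 2: 0.0}
    for k in (1, 2):
        for _ in range(n):
            # X_N follows group k's own marginal distribution ...
            x_n = rng.gauss(mu_n[k], 1.0)
            # ... while X_A | X_N always follows the reference group's law.
            x_a = rng.gauss(mu_a1 + rho1 * (x_n - mu_n[1]), sd)
            total[k] += beta0 + beta_a * x_a
    return total[2] / n - total[1] / n


# Hypothetical parameter values.
mu_n = {1: 0.0, 2: 1.0}
gap = aggregate_gap(100_000, beta0=1.0, beta_a=2.0, mu_n=mu_n, mu_a1=0.0, rho1=0.5)
```

Within any stratum of XN the two groups share the same `stratum_mean`, so there is nothing to compare, yet `gap` should land close to βA ρ(1) (μN(2) − μN(1)), which is 1.0 for these values, up to simulation error.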


The IOM definition of disparities takes an indirect approach of elimination, and defines health care disparity as the difference in health care that is not due to allowable covariates. While this approach is appropriate for capturing disparity in its entirety irrespective of source attribution, it leaves open the question of plausible causes for the disparity, and what can be done to eliminate or reduce the disparity.

An alternative, direct and constructive approach is to define health care disparity attributable to specific non-allowable covariates as the difference in health care that is due to these covariates. This alternative approach can be implemented using statistical frameworks similar to those proposed above, but with the roles of allowable and non-allowable covariates switched. This approach does not capture disparity in its entirety, because it only captures disparity attributable to the specific non-allowable covariates, and may miss the disparity attributable to other non-allowable covariates, including those that may not have been observed. However, this approach may have more direct policy implications, providing guidance on the potential to reduce or even eliminate health care disparities through specific policy implementations regarding the specific non-allowable covariates.

In practice, we believe both versions of the disparity are important. The elimination approach is useful for estimating the magnitude of the overall disparity, whereas the constructive approach is a tool for estimating how much disparity can be eliminated through specific policy interventions. A comparison between the two is also important in revealing how much of the overall disparity the policy intervention can eliminate. If a large portion remains, a new policy intervention needs to be identified. We plan to explore these issues in subsequent work, especially in the context of longitudinal data.

Another issue that we plan to investigate is that of variables that are not included in the model for predicting the outcome Y but may actually be important. Traditionally there is not much one can do about such variables other than trying one’s best to include as many variables as one can find and afford to measure. For the conditional disparity framework as we outlined it, one may have noticed that the conditional disparity as defined by (11) does not involve the non-allowable variables. This provides an opportunity to realize the implicit assumption carried in the IOM definition, that is, that the non-allowable category is the “catch-all” category that includes all covariates that have not been named explicitly in the allowable category. Of course, without strong assumptions, nothing can be done for variables that are not even identified. Recall that the fundamental assumption underlying our conditional disparity model is that the allowable variables, which clearly need to be identified and measured, are causes of the non-allowable variables. Therefore, in specific applications where such an assumption can be viewed as reasonable even when the non-allowable variables form the “catch-all” category, the conditional disparity measure is more general than what we have discussed in the current paper.

However, the “catch-all” formulation of the non-allowable variables would not produce anything meaningful under the marginal disparity model, because we simply cannot stratify on variables that are not measured; nor should we expect otherwise, since logically nothing can be done when the causes are not even identified. All these issues remind us again of the fundamental importance of explicitly formulating, identifying, and stating the causal assumptions underlying any disparity measure.


We thank J. Gastwirth, X. Xie and A. Zaslavsky for helpful exchanges.

Contract/grant sponsor: NIH; contract/grant number: P50-MHO73469-03, U01-MH06220-06A2


1. Institute of Medicine. Unequal Treatment: Confronting Racial and Ethnic Disparities in Health Care. Washington, DC: National Academy Press; 2002. [PMC free article] [PubMed]
2. Asch DA, Armstrong K. Aggregating and partitioning populations in health care disparities research: differences in perspective. Journal of Clinical Oncology. 2007;25(15):2117–2121. [PubMed]
3. Cook B. Effect of medicaid managed care on racial disparities in health care access. Health Services Research. 2007;42:124–145. [PMC free article] [PubMed]
4. McGuire TG, Alegria M, Cook BL, Wells KB, Zaslavsky AM. Implementing the Institute of Medicine definition of disparities: an application to mental health care. Health Services Research. 2006;41:1979–2005. [PMC free article] [PubMed]
5. Rao RS, Graubard BI, Breen N, Gastwirth JL. Understanding the factors underlying disparities in cancer screening rates using the Peters-Belson approach. Mediacal Care. 2004;42(8):789–800. [PubMed]
6. Fiscella K, Franks P, Doescher MP, Saver BG. Disparities in health care by race, ethnicity, and language among the insured: findings from a national sample. Medical Care. 2002;40(1):52–59. [PubMed]
7. Gastwirth JL. A clerification of some statistical issues in Watson V. Fort Worth Bank and Trust. Jurimetrics Journal. 1989;29:267–284.
8. Gastwirth JL, Greenhouse SW. Biostatistical concepts and methods in the legal setting. Statistics in Medicine. 1995;14:1641–1653. [PubMed]
9. Nayak TK, Gastwirth JL. Statistical measures of economic discrimination useful in evelauating fairness. Proceedins of Biopharmaceutical Section of the American Statistical Association. 1995:87–94. (1995)
10. Gelman A, Meng X-L, editors. Applied Bayesian Modeling and Causal Inference from Incomplete-data Perspectives. U.K.: Wiley & Sons; 2004.
11. Holland PW, Wang YJ. Depedence function for continuous bivariate densities. Comm. Statist. Theory Methods. 1987;16:863–876.
12. Wang YJ. Construction of continuous bivariate density functions. Statistica Sinica. 1993;3:173–187.
13. Molenberghs G, Lesaffre E. Non-linear integral euqations to approximate bivariate densities with given marginals and depedence functions. Statistica Sinica. 1997;7:713–738.
14. Alegria M, Takeuchi D, Canino G, Duan N, Shrout P, Meng X-L, Vega W, Zane N, Vila D, Woo M, Vera M, Guarnaccia P, Aguilar-Gaxiola S, Sue S, Escobar J, Lin K, Gong F. Considering Context, Place and Culture: the National Latino and Asian American Study. International Journal of Methods in Psychiatric Research. 2004;13(4):208–220. [PMC free article] [PubMed]
15. Kessler R, Merikangas K. The National Comorbidity Survey Replication (NCS-R) International Journal of Methods in Psychiatric Research. 2004;13(2):60–68. [PubMed]
16. Jackson J, Torres M, Caldwell C, Neighbors H, Nesse R, Taylor RJ, Treirweiler S, Williams DR. The National Survey of American Life: a study of racial, ethnic and cultural influences on mental disorders and mental health. International Journal of Methods in Psychiatric Research. 2004;13(4):196–207. [PubMed]
17. Heeringa S. National Institutes of Mental Health (NIMH) Data Set, Collaborative Psychiatric Epidemiology Survey Program (CPES): Integrated Weights and Sampling Error Codes for Design-based Analysis.
18. Heeringa S, Wagner J, Torres M, Duan N, Adams T, Berglund P. Sample Designs and Sampling Methods for the Collaborative Psychiatric Epidemiology Studies (CPES) International Journal of Methods in Psychiatric Research. 2004;13(4):221–240. [PubMed]
19. Simpson EH. The interpretation of interaction in contingency tables. Journal of the Royal Statistical Society, Series B. 1951;13:238–241.