With the two frameworks given above, a natural question is when they give the same disparity estimates or, more importantly, whether they give different values that would matter in practice. The answer to the first part is a clean-cut theoretical result, which we present below. The answer to the second part is “it depends”: it depends critically on the dependence structure between XA and XN, as well as on the dependence of Y on (XA, XN), in the particular application. We will illustrate this with two examples. The first shows the difference between getting the disparity right and getting it wrong; the second gives a class of cases where the difference can be made arbitrarily large. For the rest of this paper, we suppress the superscript (N), the notation for the natural population, whenever the context is clear.
3.1 A Theoretical Result Related to Local Dependence Function
The difference between DC and DM can be expressed as
The two disparity measures will be identical, ΔD = 0, if
This condition is equivalent to the condition that
Here the G function can be viewed as a measure of the dependence structure between XN and XA, and therefore condition (21) says that as long as the dependence structure is the same for the two groups (e.g., it remains the same across the two racial/ethnic groups), the two disparity measures will be identical. As a special case, if XN and XA are independent under both populations, then the two measures are the same because both G1 and G2 are then identical to 1.
For continuous variables, the notion that G is a measure of dependence structure can also be examined through the local dependence function (LDF), as defined in [11] and studied in [12] and [13],
Because
it is obvious that condition (21) implies that the LDF is independent of the group index, i.e., the LDF does not change with the racial/ethnic group. Note, however, that the reverse is not necessarily true; that is, the LDF can be invariant to the group index while condition (21) does not hold. In this sense, the measure of dependence by the G function is more stringent than that by the LDF.
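The step from condition (21) to the LDF can be sketched as follows. The display assumes that $G_k$ is the ratio of the joint density of $(X_N, X_A)$ in group $k$ to the product of its marginal densities (consistent with the remark that $G_1$ and $G_2$ equal 1 under independence), and that the LDF is the mixed second derivative of the log joint density. The symbols $p_k$ and $\gamma_k$ are notation introduced here for illustration and are not taken from the original equations.

```latex
\begin{align*}
G_k(x_N, x_A) &= \frac{p_k(x_N, x_A)}{p_k(x_N)\,p_k(x_A)}, \\
\gamma_k(x_N, x_A)
  &= \frac{\partial^2}{\partial x_N\,\partial x_A}\log p_k(x_N, x_A)
   = \frac{\partial^2}{\partial x_N\,\partial x_A}\log G_k(x_N, x_A).
\end{align*}
```

The second equality holds because $\log p_k(x_N)$ and $\log p_k(x_A)$ each depend on a single argument and so vanish under the mixed derivative. Hence $G_1 = G_2$ forces $\gamma_1 = \gamma_2$, whereas $\gamma_1 = \gamma_2$ only determines $\log G_k$ up to additive terms in $x_N$ alone and $x_A$ alone. For instance, two bivariate normal distributions with unit variances and a common nonzero correlation $\rho$ but different means share the constant LDF $\rho/(1-\rho^2)$, yet have different $G$ functions.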
Finally, we note that condition (21) is sufficient but not necessary for ΔD = 0. A simple example is that ΔD = 0 when the regression of Y on XA and XN, that is, E2[Y|XN, XA], is free of both XN and XA (note that this is a weaker requirement than independence between Y and (XN, XA), since only the conditional mean of Y is involved). This, of course, does not happen when XN and/or XA are useful predictors of Y. But it reminds us that the difference between DC and DM also depends on the relationship between Y and (XN, XA), and the difference will be small when both XN and XA are weak predictors.
We emphasize that the statement just made is true only when both XN and XA are weak predictors. If one is weak but the other is not, the difference between the two measures can still be very large if XN and XA are highly correlated. Indeed, the “one-weak and one-strong” scenario is quite common in practice when the two predictors are highly correlated, because of the well-known “collinearity” problem among predictors. And it is precisely in such cases that recognizing the impact of the allowable covariates on the non-allowable ones, or vice versa, is of critical importance. As mentioned in Section 1, the common approach of adjusting only the allowable covariates without considering their impact on the non-allowable covariates can lead to serious mis-estimation of the disparity when the allowable covariates appear to be weak predictors in the regression of Y on XN and XA.
3.2 A Discrete-Distribution Example
We start with a simple 2 × 2 × 2 contingency table example to illustrate both the basic calculations for DC and DM and their differences. We use the combined data set of three large epidemiological studies that make up the NIMH Collaborative Psychiatric Epidemiology Surveys (CPES): the National Latino and Asian American Study (NLAAS) [14], the National Comorbidity Survey Replication (NCS-R) [15], and the National Survey of American Life (NSAL) [16]. These studies focus on collecting epidemiological information on mental health and substance disorders and services utilization among the general population, with special emphasis on ethnic minority groups in the NLAAS (Latinos and Asians) and NSAL (African Americans and Afro-Caribbeans), with non-Latino white comparisons from the NCS-R. The studies were designed to allow integration as though they were a single, nationally representative study [17]. The combined data set is the largest epidemiological data set available for examining the patterns and correlates of mental health services use in minority populations in the United States. The sampling frames and sample selection procedures are described in detail elsewhere [18]. For illustration purposes, here we treat this combined data set as a population by itself, and therefore all the numbers below are regarded as population quantities (e.g., probabilities), instead of sample estimates (e.g., sample proportions).
For simplicity, we focus on a dichotomous outcome: Y = 1 if the respondent had at least one visit to any mental health service provider (either specialist or generalist) in the past year, and Y = 0 otherwise. The allowable covariate is also a binary variable, indicating clinical need: XA = 1 if there was a need, and XA = 0 if there was not. The non-allowable covariate is a binary variable indicating nativity: XN = 1 if the respondent is an immigrant, and XN = 0 if the respondent was born in the United States.
Table I provides the data for the non-Latino white population, from which we can easily calculate the service use rate for this population. In Table I, there are two numbers in each of the cells in the 2 × 2 layout. The top number is the percentage of individuals who fall into the (i, j)-cell defined by the values (XN = i, XA = j), and the bottom bracketed number μij is the percentage of people in that cell who have used services, that is, μij = P(Y = 1|XN = i, XA = j). Consequently, the overall service rate for the non-Latino white population, namely E1[Y], is obtained by multiplying the two numbers in each cell and adding the products across all cells. This leads to E1[Y] = 14.39%. Similarly, for the Afro-Caribbean population (Table II), E2[Y] = 6.75%, so the observed racial/ethnic difference is
RD = E2[Y] − E1[Y] = 6.75% − 14.39% = −7.64%. (24)
Table I. Non-Latino White Population, where μij = P1(Y = 1|XN = i, XA = j).
Table II. Afro-Caribbean Population, where μij = P2(Y = 1|XN = i, XA = j).
This, however, is not necessarily the disparity in the sense of the IOM definition because it has not been adjusted for the difference in clinical needs.
Comparing Tables I and II, we observe an interesting phenomenon. The percentages of people in need are greater in the Afro-Caribbean population than in the non-Latino white population when conditioning on nativity: 55.75% versus 41.62% for the US-born population and 33.90% versus 30.91% for the immigrant population. The pattern, however, is reversed for the marginal rates, that is, when we combine the US born and the immigrants: 41.18% for Afro-Caribbeans versus 41.28% for non-Latino whites. Although the difference between these two marginal rates is minimal (there is no estimation error here, since we are treating the data as the entire population), it is nevertheless an example of the well-known Simpson’s paradox [19]. The reason is the extreme imbalance of the nativity groups in the two populations: more than 95% of the non-Latino whites were US born, but only 1/3 of the Afro-Caribbeans were US born.
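The reversal can be checked directly from the percentages quoted in this paragraph. The sketch below back-solves the US-born share of each population from its conditional and marginal need rates. The function name `us_born_share` is introduced here for illustration, and the implied shares are approximations derived from the rounded percentages rather than values read from Tables I and II.

```python
def us_born_share(need_us_born, need_immigrant, need_overall):
    """Solve need_overall = w * need_us_born + (1 - w) * need_immigrant for w."""
    return (need_overall - need_immigrant) / (need_us_born - need_immigrant)


# Percentages in clinical need quoted in the text: (US born, immigrant, overall).
rates = {
    "non-Latino white": (41.62, 30.91, 41.28),
    "Afro-Caribbean": (55.75, 33.90, 41.18),
}

shares = {}
for group, (us_born, immigrant, overall) in rates.items():
    w = us_born_share(us_born, immigrant, overall)
    shares[group] = w
    # Rebuild the marginal rate as a nativity-weighted average of the
    # conditional rates, to confirm the back-solved weight is consistent.
    rebuilt = w * us_born + (1 - w) * immigrant
    print(f"{group}: implied US-born share = {w:.3f}, marginal need = {rebuilt:.2f}%")
```

The Afro-Caribbean need rate is higher within each nativity group, but its marginal rate puts about two thirds of its weight on the lower immigrant rate, whereas the non-Latino white marginal rate is almost entirely the US-born rate.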
The implication of this phenomenon for our disparity measure is clear. First, given that the difference in the marginal rates is so small, 41.18% versus 41.28%, one would expect that the conditional disparity, which results from adjusting the Afro-Caribbeans’ marginal rate from 41.18% to the non-Latino whites’ marginal rate of 41.28%, will differ minimally from the value of RD in (24). Indeed, as shown below, the conditional disparity in this case is DC = −7.62%, nearly identical to RD = −7.64%.
Second, this adjustment is in fact in the wrong direction, because in this case the causal assumption underlying the conditional disparity, namely that the allowable covariate (clinical need) causes the non-allowable one (nativity), is clearly very implausible. The marginal disparity approach is much more sensible, because it adjusts for clinical needs within each nativity category. Given that the two nativity groups have very different levels of clinical need, with the US born having more need, it is intuitive that we should make the adjustment after stratifying by nativity. Because the Afro-Caribbean population has more need in each of the nativity groups, it is also intuitive that, had their needs been the same as those of the non-Latino whites, the observed racial/ethnic difference would be even larger. Indeed, as shown below, the marginal disparity in this case is DM = −8.84%. In contrast to DC, which points in the wrong direction, DM shows that the disparity is actually more pronounced than the unadjusted racial/ethnic difference by about (8.84 − 7.64)/7.64 ≈ 16%.
3.3 Disparity Calculations
The calculations of DC and DM can be best illustrated by creating two adjusted versions of Table II, corresponding respectively to the two hypothetical populations defined in (6) and (13). They are given in Tables III and IV, respectively. To construct Table III, which is for the conditional disparity, we need to compute the density ratio RC of (8). From the last rows of Tables I and II, we can obtain this easily as the ratio of the two marginal probabilities of XA, that is, RC(j) = P1(XA = j)/P2(XA = j) for j = 0, 1.
Table III. Adjusted Afro-Caribbean Population for Computing DC.
Table IV. Adjusted Afro-Caribbean Population for Computing DM.
We can then multiply each of the three un-bracketed proportions in the “No (0)” column of Table II by RC(0), and multiply each of the three un-bracketed proportions in the “Yes (1)” column of Table II by RC(1). This yields the adjusted population corresponding to the conditional disparity approach, as given in Table III, where the last column P(XA = 1|XN) has also been recomputed using the adjusted cell probabilities. We see that Tables I and III have the same marginal distribution for XA (rounding errors notwithstanding), as intended. The expected value of Y under this adjusted population can be easily obtained by multiplying each cell probability in Table III by the corresponding μij from Table II and then summing the products. This leads to the adjusted expected value of Y, and hence to DC = −7.62%.
To calculate the marginal disparity, we first need to compute the RM function of (15), which is determined by the rightmost columns labeled “P(XA = 1|XN)” in Tables I and II. Specifically, RM(i; j) is the ratio of P1(XA = j|XN = i) to P2(XA = j|XN = i). Table IV is then obtained by multiplying the (i, j)-cell proportion (the top un-bracketed percentage) in Table II by RM(i; j) for i, j = 0, 1, and then computing the corresponding P(XA = 1|XN) and the adjusted marginal distribution of XA accordingly. We note that the resulting conditional distribution of XA given XN is the same as that from Table I (rounding errors notwithstanding), as it should be, but the adjusted marginal distribution of XA is now markedly different from the one for the non-Latino whites, P1(XA). This difference reflects the difference between the two approaches, because under the conditional disparity approach the adjusted marginal distribution of XA equals P1(XA) by construction. As we discussed previously, the seemingly natural “equating-the-need-level” approach is actually misleading in this application because of Simpson’s paradox. Equating the need level after stratifying on nativity is a much more sensible approach.
To find the expectation of Y under this adjusted Afro-Caribbean population, we multiply the four cell percentages in Table IV by the corresponding four μij values in Table II and then sum the products. This yields the adjusted expected value of Y. Consequently, the marginal disparity, which in this example can be regarded as a sensible measure of disparity, is DM = −8.84%.
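The two reweighting schemes described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the original computation. The Afro-Caribbean nativity share is back-solved from the need rates quoted in Section 3.2 rather than read from Table II, all variable and function names are hypothetical, and the service-use rates μij are left as an input to `expected_y` because their values are not reproduced in the text.

```python
# Need rates quoted in the text, P(XA = 1 | XN = i); i = 0 US born, i = 1 immigrant.
need_white = {0: 0.4162, 1: 0.3091}
need_carib = {0: 0.5575, 1: 0.3390}
marg_white, marg_carib = 0.4128, 0.4118  # P1(XA = 1) and P2(XA = 1)

# ASSUMPTION: P2(XN = 0) is back-solved from the quoted rates, not taken from Table II.
w0 = (marg_carib - need_carib[1]) / (need_carib[0] - need_carib[1])
p_xn = {0: w0, 1: 1 - w0}


def cond(need, i, j):
    """P(XA = j | XN = i) for a table of need rates."""
    return need[i] if j == 1 else 1 - need[i]


cells = [(i, j) for i in (0, 1) for j in (0, 1)]
# Afro-Caribbean cell probabilities P2(XN = i, XA = j).
p2 = {(i, j): p_xn[i] * cond(need_carib, i, j) for i, j in cells}

# Conditional approach: scale each XA column by RC(j) = P1(XA = j) / P2(XA = j).
rc = {1: marg_white / marg_carib, 0: (1 - marg_white) / (1 - marg_carib)}
table3 = {(i, j): p2[i, j] * rc[j] for i, j in cells}

# Marginal approach: scale each cell by RM(i; j) = P1(XA = j | XN = i) / P2(XA = j | XN = i).
rm = {(i, j): cond(need_white, i, j) / cond(need_carib, i, j) for i, j in cells}
table4 = {(i, j): p2[i, j] * rm[i, j] for i, j in cells}


def marginal_need(table):
    """Adjusted P(XA = 1), summed over both nativity groups."""
    return table[0, 1] + table[1, 1]


def expected_y(table, mu):
    """Sum of cell probability times mu[i, j]; mu must be supplied by the user."""
    return sum(table[c] * mu[c] for c in table)


print(f"adjusted P(XA = 1), conditional approach: {marginal_need(table3):.4f}")
print(f"adjusted P(XA = 1), marginal approach:    {marginal_need(table4):.4f}")
```

Under the conditional approach the adjusted need rate matches the non-Latino white marginal rate by construction. Under the marginal approach only the within-nativity need rates are matched, so the adjusted marginal need rate falls well below 41.28%. Passing either adjusted table to `expected_y` together with the μij of Table II, and subtracting E1[Y], would give the corresponding disparity.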
3.4 A Continuous-Distribution Example
This theoretical example establishes the mathematical fact that the difference between the conditional disparity and the marginal disparity can be arbitrarily large. It also illustrates another form of Simpson’s paradox: even when there is no disparity in any stratum defined by the non-allowable variable XN, one can still observe a disparity in the aggregated population, due to the correlation between XN and race/ethnicity and the fact that XN is classified as non-allowable.
To see this, let us consider a simple linear regression case
where k = 1 indexes the non-Latino white population and k = 2 the minority population. To simplify the algebra, suppose that in the natural populations (XN, XA) is bivariate normal, with group-specific means, unit variances and correlation ρ(k). That is
Under this setting, for the conditional disparity, the hypothetical joint distribution of (XN, XA) is a bivariate normal with the following distribution:
In contrast, under the marginal disparity approach, the hypothetical joint distribution for (XN, XA) is given by P1(XA|XN)P2(XN), which is also bivariate normal but with the following means and covariance matrix:
Simple algebra then yields that the difference between the two measures is
From (29), we have the following observations, two of which are special cases of what we have discussed in general in Section 3.1. Specifically, we see that ΔD = 0 whenever one of the following three conditions holds:
- (a) ρ(1) = ρ(2) = 0; that is, when XN and XA are independent in both populations;
- (b) the coefficients of XN and XA in (25) are both zero for k = 2; that is, when the regression (25) does not depend on either XN or XA in the population of interest (not necessarily in the reference population);
- (c) the means of XN and of XA are each the same for k = 1 and k = 2; that is, when the two populations have the same marginal distributions for both XN and XA.
Of course, ΔD can be zero for many other (incidental) combinations of the parameter values, but the above three are the most useful for theoretical insights. Note in particular that conditions (a) and (b) are applicable in general, but condition (c) only works when the regression of Y is linear in both XN and XA. We emphasize that since the parameters in (29) have no restrictions other than |ρ(k)| ≤ 1, ΔD can be arbitrarily large, including approaching infinity.
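The three conditions can be read off from a short calculation, sketched here under stated assumptions. The regression in group $k$ is taken to be $E_k[Y \mid X_N, X_A] = \alpha^{(k)} + \beta_N^{(k)} X_N + \beta_A^{(k)} X_A$, the means of $(X_N, X_A)$ in group $k$ are written $(\mu_N^{(k)}, \mu_A^{(k)})$, the conditional approach is taken to use $P_2(X_N \mid X_A)P_1(X_A)$, and the sign convention $\Delta D = D_C - D_M$ is assumed. These symbols and the sign convention are introduced here for illustration and may differ from those of (25) through (29).

```latex
\begin{align*}
\text{conditional:}\quad & \tilde E_C[X_A] = \mu_A^{(1)}, \qquad
  \tilde E_C[X_N] = \mu_N^{(2)} + \rho^{(2)}\bigl(\mu_A^{(1)} - \mu_A^{(2)}\bigr), \\
\text{marginal:}\quad & \tilde E_M[X_N] = \mu_N^{(2)}, \qquad
  \tilde E_M[X_A] = \mu_A^{(1)} + \rho^{(1)}\bigl(\mu_N^{(2)} - \mu_N^{(1)}\bigr), \\
D_C - D_M &= \beta_N^{(2)}\rho^{(2)}\bigl(\mu_A^{(1)} - \mu_A^{(2)}\bigr)
  - \beta_A^{(2)}\rho^{(1)}\bigl(\mu_N^{(2)} - \mu_N^{(1)}\bigr).
\end{align*}
```

The first two lines use the fact that, for a bivariate normal with unit variances, the conditional mean of one coordinate is linear in the other with slope equal to the correlation. The last line follows because both disparities subtract the same $E_1[Y]$ and apply the same group-2 regression. Each of conditions (a), (b) and (c) makes both terms vanish, and scaling the coefficients or the mean differences makes the difference arbitrarily large.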
We also remark on a special case of interest, that is, when Ek[Y|XN, XA] of (25) is free of both k (i.e., the race/ethnicity index) and XN (i.e., the coefficient of XN is zero). In such cases, there is no racial/ethnic disparity under the conditional disparity model, since XA is adjusted to have the same distribution for both racial/ethnic groups and (11) does not involve XN. Under the marginal disparity model, however, the matter is more complicated. Although XN does not impact Y directly, it impacts XA when it is correlated with XA. Consequently, the difference in the marginal distributions of XN in the two racial/ethnic groups will result in differences in the marginal distribution of XA even when, or rather especially when, the conditional distribution of XA given XN is adjusted to be invariant to the race/ethnicity index k. It follows then that there will be a racial/ethnic disparity due to the indirect impact of XN on Y via XA. Indeed, it is easy to verify for the current example that the marginal disparity is given by
This is zero only when (i) ρ(1) = 0 and hence XA and XN are independent in the reference population, so XN cannot impact XA in the hypothetical population; or (ii) the coefficient of XA in (25) is zero and hence the impact of XN on XA does not translate into any impact on Y in the hypothetical population; or (iii) XN has the same mean in both populations and hence the distribution of XN is actually invariant to race/ethnicity.
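The special case can be illustrated numerically. The sketch below assumes the linear-normal setup of this section, with regression intercept and slopes named `alpha`, `beta_n` and `beta_a` and means named `mu_n` and `mu_a`. These names, the `Group` class and the parameter values are hypothetical choices made for illustration, and each disparity is taken to be the adjusted minority mean of Y minus the reference mean of Y.

```python
from dataclasses import dataclass


@dataclass
class Group:
    mu_n: float    # mean of XN
    mu_a: float    # mean of XA
    rho: float     # correlation of (XN, XA); variances are 1
    alpha: float   # regression intercept
    beta_n: float  # regression slope on XN
    beta_a: float  # regression slope on XA

    def mean_y(self, e_xn, e_xa):
        """E[Y] under a linear regression, given the means of XN and XA."""
        return self.alpha + self.beta_n * e_xn + self.beta_a * e_xa


def conditional_disparity(ref, minority):
    # Hypothetical population: XA follows the reference marginal,
    # XN given XA follows the minority conditional.
    e_xa = ref.mu_a
    e_xn = minority.mu_n + minority.rho * (ref.mu_a - minority.mu_a)
    return minority.mean_y(e_xn, e_xa) - ref.mean_y(ref.mu_n, ref.mu_a)


def marginal_disparity(ref, minority):
    # Hypothetical population: XN follows the minority marginal,
    # XA given XN follows the reference conditional.
    e_xn = minority.mu_n
    e_xa = ref.mu_a + ref.rho * (minority.mu_n - ref.mu_n)
    return minority.mean_y(e_xn, e_xa) - ref.mean_y(ref.mu_n, ref.mu_a)


# Special case: the same regression in both groups and no direct effect of XN.
white = Group(mu_n=0.0, mu_a=0.0, rho=0.5, alpha=1.0, beta_n=0.0, beta_a=2.0)
minority = Group(mu_n=1.0, mu_a=0.3, rho=0.2, alpha=1.0, beta_n=0.0, beta_a=2.0)

d_c = conditional_disparity(white, minority)
d_m = marginal_disparity(white, minority)
print(f"conditional disparity: {d_c:.3f}")
print(f"marginal disparity:    {d_m:.3f}")
```

With these illustrative values the conditional disparity vanishes, while the marginal disparity equals the slope on XA times the reference correlation times the difference in the means of XN. Setting any one of those three factors to zero removes the marginal disparity, in line with cases (i) to (iii).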
Perhaps most important here is to notice Simpson’s paradox again. Although in the aggregated population there is a marginal disparity for the case above, clearly there is no disparity in any subpopulation defined by a particular value of XN, that is, when we condition on XN, because the conditional distribution P2(XA|XN) has been adjusted to be the same as P1(XA|XN). This of course is not paradoxical, just as Simpson’s paradox is not a real paradox in the mathematical sense. Once we classify XN as a non-allowable variable, then logically we have to accept any difference caused by it as a part of the overall disparity, regardless of whether the difference comes from its direct or indirect impact on the outcome Y. Of course, one may question whether the indirect part really should be viewed as disparity. This is not an easy issue to address, because one is then implying that XN is both a non-allowable variable (for the direct impact) and an allowable variable (for the indirect impact via XA). We shall pursue this complex issue in subsequent work.