
J Health Econ. Author manuscript; available in PMC 2012 May 10.

Published in final edited form as:

Published online 2011 November 22. doi: 10.1016/j.jhealeco.2011.10.009

PMCID: PMC3349439

NIHMSID: NIHMS374595

Guido ERREYGERS, Department of Economics, University of Antwerp, City Campus, Prinsstraat 13, 2000 Antwerpen, Belgium;

Guido ERREYGERS: eb.ca.au@sregyerre.odiug; Philip CLARKE: ua.ude.dysu.htlaeh@cpilihp; Tom VAN OURTI: ln.rue.ese@itruonav

The publisher's final edited version of this article is available at J Health Econ


This paper explores four alternative indices for measuring health inequalities in a way that takes into account attitudes towards inequality. First, we revisit the extended concentration index which has been proposed to make it possible to introduce changes into the distributional value judgements implicit in the standard concentration index. Next, we suggest an alternative index based on a different weighting scheme. In contrast to the extended concentration index, this new index has the ‘symmetry’ property. We also show how these indices can be generalized so that they satisfy the ‘mirror’ property, which may be seen as a desirable property when dealing with bounded variables. We compare the different indices empirically for under-five mortality rates and the number of antenatal visits in developing countries.

Pereira (1998) and more recently Wagstaff (2002) have proposed to extend the concentration index by including a distributional judgement parameter. The extension is seen as a device which makes it possible to incorporate attitudes towards inequality into the calculation of the index of socioeconomic inequality of health. It builds on suggestions of Kakwani (1980) developed by Yitzhaki (1983), who shows how a similar extension of the Gini coefficient allows the expression of distributional judgements in the context of income inequality measurement.

The extended concentration index can be applied to a broad range of health and health care variables. Following Pereira (1998) and Wagstaff (2002), who have used the index to calculate the degree of socioeconomic inequality in child mortality in developed as well as developing countries, there is now a growing number of empirical studies which have applied the index to various health variables. Examples include health limitations within eight European countries across time (Hernández-Quevedo et al. 2006), child malnutrition in Nigeria (Uthman 2009), immunization ratios in developing countries (Gaudin and Yazbeck 2006; Meheus and Van Doorslaer 2008), and child mortality and child malnutrition in India (Arokiasamy and Pradhan 2010).

In line with recent research on health inequality measurement, we make a clear distinction between bounded and unbounded variables, and hence treat them separately. The main reason for this different treatment is that bounded variables, in contrast to unbounded variables, can be looked at from two points of view: the positive side, where the focus is on ‘good health’ (e.g. the proportion of children without malnutrition), and the negative side, where the emphasis is on ‘ill health’ (e.g. the proportion of children with malnutrition).

In this paper we first explore whether the extended concentration index is an appropriate tool to take into account attitudes to inequality when measuring the socioeconomic inequality of health. This form of inequality measurement tries to answer the question: “To what extent are there inequalities in health that are systematically related to socioeconomic status?” (Wagstaff, Paci and van Doorslaer 1991: 546). Our initial focus is on understanding the precise way the extended concentration index incorporates distributional sensitivity when it is applied to unbounded health variables (section 3). Next, we identify a property which the index does *not* have, and suggest an alternative index – the symmetric index – based upon a different distributional weighting scheme (section 4). We then move to bounded variables, and generalize both the extended concentration index and the symmetric index (section 5). An empirical study serves to illustrate the differences between the indices (section 6). We also include an appendix specifying how we deal with small-sample bias, ties in the ranking variable, and differences in (ex-post) sampling probabilities when doing empirical work using finite samples.

In the first part of this paper we consider unbounded ratio-scale health variables. These are variables which have no natural upper bound and vary between 0 and +∞; health expenditure is an example. In section 5 we will turn our attention to bounded variables, which occur very frequently in the domain of health.

Suppose the population *N* consists of *n* individuals, where *n* is a finite, positive natural number. Let *N* = {1, 2,…,*n*}, and assume that individuals are ranked according to their socioeconomic position, in ascending order (i.e. individual 1 is the poorest, and individual *n* the richest person). If individual *i* is not tied to any other, his rank ρ*_{i}* coincides with his number *i*, and his fractional rank is *R_{i}* = (2*i* − 1)/(2*n*).

When *n* becomes very large, the fractional rank can be approximated by a continuous variable *p* defined over the interval [0,1]. The interval [0, *p*] then represents the 100 *p*% poorest individuals of the population, just as those with fractional ranks 1/(2*n*),3/(2*n*),…,(2*i*−1)/(2*n*) represent the 100(*i*/*n*)% poorest individuals. The function *h*(*p*) expresses the health status of an individual as a function of where this individual is located in the interval [0,1]. Clearly, *h*(0) is the health status of the poorest individual and *h*(1) that of the richest.

In case of a finite number of individuals, the average health status is defined as:

$${\mu}_{h}=\frac{1}{n}\sum _{i=1}^{n}{h}_{i}$$

(1)

For the continuous case we have:

$${\mu}_{h}={\int}_{0}^{1}h(p)dp$$

(2)

Following suggestions by Kakwani (1980) and Yitzhaki (1983), both Pereira (1998) and Wagstaff (2002) have introduced the following extended concentration index:

$$C(h,\nu )=1-\frac{\nu}{n{\mu}_{h}}\sum _{i=1}^{n}{(1-{R}_{i})}^{\nu -1}{h}_{i}$$

(3)

where ν ≥1 is a distributional sensitivity parameter. Expression (3) can be formulated in many equivalent ways. Using the definitions of the previous section, (3) can be transformed into a weighted sum of health shares:

$$C(h,\nu )=\frac{1}{{\mu}_{h}}\sum _{i=1}^{n}\left[\frac{1-\nu {(1-{R}_{i})}^{\nu -1}}{n}\right]{h}_{i}$$

(4)
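The finite-sample forms (3) and (4) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes there are no ties, so that the fractional rank of individual *i* is (2*i* − 1)/(2*n*), it ignores the small-sample and sampling-weight corrections discussed in the appendix, and the function names and sample values are hypothetical.

```python
def fractional_ranks(n):
    """Fractional ranks (2i - 1) / (2n) for i = 1..n, assuming no ties."""
    return [(2 * i - 1) / (2 * n) for i in range(1, n + 1)]


def extended_ci(h, nu):
    """Extended concentration index in form (3); h is ordered from poorest to richest."""
    n = len(h)
    mu = sum(h) / n
    ranks = fractional_ranks(n)
    total = sum((1 - r) ** (nu - 1) * hi for r, hi in zip(ranks, h))
    return 1 - nu * total / (n * mu)


def extended_ci_weighted(h, nu):
    """The same index written as the weighted sum of health levels in form (4)."""
    n = len(h)
    mu = sum(h) / n
    ranks = fractional_ranks(n)
    weights = [(1 - nu * (1 - r) ** (nu - 1)) / n for r in ranks]
    return sum(w * hi for w, hi in zip(weights, h)) / mu


health = [1.0, 2.0, 3.0, 4.0]  # illustrative values, poorest individual first
print(extended_ci(health, 2))           # standard concentration index
print(extended_ci_weighted(health, 3))  # a higher distributional sensitivity
```

Both functions return the same value for any ν, which is a quick way to check that (3) and (4) are indeed equivalent.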

Since *p* corresponds to *R _{i}*, the continuous counterpart of (4) is:

$$C(h,\nu )=\frac{1}{{\mu}_{h}}{\int}_{0}^{1}[1-\nu {(1-p)}^{\nu -1}]h(p)dp$$

(5)

The extended concentration index (5), in a similar fashion to income inequality measures such as the extended Gini index (Yitzhaki 1983), assigns weights to individuals based upon their fractional rank *p* modified by the distributional sensitivity parameter ν. A good way to understand how the extended concentration index is influenced by ν is to focus on the weighting function, which expresses how the weight of a person depends on her fractional rank *p* and the distributional sensitivity parameter ν:

$$w(p,\nu )=1-\nu {(1-p)}^{\nu -1}$$

(6)

Figure 1 plots the weighting function for a range of values of ν.

What we will term the *standard* concentration index is simply a special case of (5) with ν being set equal to 2. In this case the weighting function is *w*(*p*, 2) = 2*p*−1, a linear function of *p* which goes from −1 to +1 as the individual’s position increases from the lowest to the highest in the population. Those above the median have positive weights, and those below it negative ones. In case all have the same health level, the standard concentration index is zero; a negative *C*(*h*, 2) indicates that health is concentrated more among the poor than the rich, and a positive one the reverse.

With regard to other values of the distributional sensitivity parameter, if we take ν =1 the weighting function is constant and equal to 0. Therefore the index will always have the value of 0 and so inequalities are not taken into account in the extended index. From now on we assume that ν >1; the weighting function is then a strictly increasing function of the fractional rank, with some individuals having a negative weight and some a positive. (The cut-off point between positive and negative values can be determined by searching for the individual whose fractional rank *p* is such that *p* =1−(1/ν)^{1/(ν−1)}.) For values 1< ν < 2 only the individuals at the higher end of the income distribution have positive weights. For values ν > 2 also individuals below the median receive positive weights, but those at the bottom of the income distribution have quite large negative weights. As the value of ν increases, gradually more and more individuals have positive weights (which will all tend to 1) and in the end only the poorest individual has a negative (and very large) weight.^{1} In the most extreme case when ν → +∞, the extended concentration index in equation (5) tends to
$\frac{{\mu}_{h}-h(0)}{{\mu}_{h}}$. So unless we take ν = 2 the weighting scheme is asymmetric, and the more so the higher the value of ν.
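The behaviour of the weighting function (6) and of its cut-off point can be explored numerically. The sketch below is illustrative only; the function names are hypothetical and the values of ν are arbitrary choices.

```python
def weight(p, nu):
    """Weighting function (6): w(p, nu) = 1 - nu * (1 - p)**(nu - 1)."""
    return 1 - nu * (1 - p) ** (nu - 1)


def cutoff(nu):
    """Fractional rank at which the weight changes sign (requires nu > 1)."""
    return 1 - (1 / nu) ** (1 / (nu - 1))


for nu in (1.5, 2, 3, 5, 10):
    # Weight of the poorest person, weight of the richest person, cut-off rank.
    print(nu, weight(0, nu), weight(1, nu), round(cutoff(nu), 4))
```

For every ν the poorest individual receives weight 1 − ν and the richest weight 1, while the cut-off lies above the median for 1 < ν < 2, at the median for ν = 2, and below it for ν > 2.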

The bounds of the extended concentration index can be derived by assuming that either the poorest or the richest individual is the only one with a positive level of health. Using the continuous version of the index we obtain:

$$1-\nu \le C(h,\nu )\le 1$$

(7)

Except for the case when ν = 2, these bounds are not symmetric. An intuitive interpretation of these bounds is that they provide the weights given to the poorest and the richest person when calculating the extended index. In the standard case these weights are −1 and +1, which means that the absolute distance between the two is equal to 2. The choice of a particular value of ν can therefore be made dependent on the desired distance between the two.

The extended concentration index has been obtained by applying a concept used for the measurement of the inequality of income – the extended Gini coefficient – to the measurement of the socioeconomic inequality of health. Basically, a one-dimensional construction is transplanted into a two-dimensional context. It cannot be taken for granted, however, that anything which works well in a univariate environment is automatically suited for a bivariate world.

Here we propose a simple test to check whether an indicator is a good measure of the degree of association between socioeconomic status and health. Imagine that we turn the world upside down: for a brief moment of time the poor and the rich switch roles (one may think of Carnival). More specifically, let us assume that the poorest person and the richest person switch their health levels, that the second poorest and the second richest person switch their health levels, etc. In formal terms, this leads to a new health function *g*(*p*) = *h*(1 − *p*), which is the health function *h*(*p*) turned upside down. Our test consists of looking at the reaction of the indicator when the health function *h*(*p*) is replaced by *g*(*p*). We say that an indicator *I* passes the ‘upside down’ test if *I*(*h*) and *I*(*g*) are always of the opposite sign (or both equal to zero). In other terms, if the indicator states that distribution *h*(*p*) is pro-poor (respectively pro-rich), then it must always state that distribution *g*(*p*) is pro-rich (respectively pro-poor).

It is not difficult to verify that the extended concentration index does not pass this test, except when ν =1, a case we excluded, or when ν = 2, the standard concentration index. The reason for this lies in the asymmetric nature of the weighting function. This can be illustrated by looking at the case where the chances of having high or low health levels are symmetrically distributed over the rich and the poor. An extreme example of such a symmetric distribution is the one in which only the richest and the poorest individuals have a very high health level, and all others the minimum level. This is of course a very unequal distribution, but it may be argued that since there is no systematic bias in favour of either the rich or the poor, the index of socioeconomic inequality should therefore be equal to zero. This is exactly what we find if we use the standard concentration index, but not if we use the extended concentration index with ν different from 1 or 2.
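The extreme example can be checked on a small finite sample. The sketch below is a hedged illustration: it assumes untied fractional ranks (2*i* − 1)/(2*n*), uses a hypothetical five-person population in which only the poorest and the richest have a high health level, and the helper names are invented for this example.

```python
def extended_ci(h, nu):
    """Finite-sample extended concentration index; h is ordered poorest to richest."""
    n = len(h)
    mu = sum(h) / n
    total = sum(
        (1 - (2 * i - 1) / (2 * n)) ** (nu - 1) * hi
        for i, hi in enumerate(h, start=1)
    )
    return 1 - nu * total / (n * mu)


def upside_down(h):
    """Health distribution after the poor and the rich switch health levels."""
    return list(reversed(h))


def passes_upside_down(h, nu, tol=1e-12):
    """True if the index values for h and its mirror image have opposite signs
    or are both (numerically) zero."""
    a = extended_ci(h, nu)
    b = extended_ci(upside_down(h), nu)
    if abs(a) < tol and abs(b) < tol:
        return True
    return a * b < 0


extremes = [10.0, 0.0, 0.0, 0.0, 10.0]  # only the poorest and richest are healthy
print(extended_ci(extremes, 2), passes_upside_down(extremes, 2))
print(extended_ci(extremes, 3), passes_upside_down(extremes, 3))
```

Because this distribution is its own mirror image, an index can only pass the test by returning zero. That happens for ν = 2, whereas for ν = 3 the index is negative for both *h* and *g*.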

When looking for an alternative, we will try to remain as close as possible to the extended concentration index. Let us consider indices of the following type:^{2}

$$I(h,\epsilon )=f({\mu}_{h},\epsilon ){\int}_{0}^{1}w(p,\epsilon )h(p)dp$$

(8)

where *f*(μ*_{h}*, ε) is a normalization function, *w*(*p*, ε) a weighting function, and ε a distributional sensitivity parameter.

The following result specifies for which type of weighting function an indicator passes the ‘upside down’ test, i.e. is such that *I*(*h*, ε) and *I*(*g*, ε) are always of the opposite sign (or both equal to zero).

Theorem 1. The index *I*(*h*, ε) passes the ‘upside down’ test if and only if the weighting function is inversely symmetric around
${\scriptstyle \frac{1}{2}}$, i.e. if and only if we have *w*(*p*, ε) = −*w*(1 − *p*, ε) for any 0 ≤ *p* ≤ 1.

(i) Let us rewrite (8) as
$I(h,\epsilon )=f({\mu}_{h},\epsilon )\left[{\int}_{0}^{1/2}w(p,\epsilon )h(p)dp+{\int}_{1/2}^{1}w(p,\epsilon )h(p)dp\right].$ If *w*(*p*, ε) = −*w*(1− *p*, ε) for any 0≤ *p* ≤1, then obviously we have
${\int}_{1/2}^{1}w(p,\epsilon )h(p)dp=-{\int}_{0}^{1/2}w(p,\epsilon )h(1-p)dp$. Hence we obtain
$I(h,\epsilon )=f({\mu}_{h},\epsilon ){\int}_{0}^{1/2}w(p,\epsilon )[h(p)-h(1-p)]dp$. Likewise we derive that
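$I(g,\epsilon )=f({\mu}_{g},\epsilon ){\int}_{0}^{1/2}w(p,\epsilon )[g(p)-g(1-p)]dp$. Since *g*(*p*) = *h*(1 − *p*) and μ*_{g}* = μ*_{h}*, it follows that *I*(*g*, ε) = −*I*(*h*, ε), so the two values are of opposite sign (or both equal to zero).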

The sufficiency part of the proof shows that indicators which pass the upside down test are always such that *I*(*h*, ε) = −*I*(*g*, ε). This is what we call the *symmetry* property. Although the symmetry property is at first sight a stronger requirement than passing the upside down test, Theorem 1 reveals that they are equivalent. The theorem also shows that if we want the symmetry property to hold, then we are obliged to abandon the weighting function of the extended concentration index.

In order to construct an index with the symmetry property, we have to replace the asymmetric weighting scheme of the extended concentration index by an inversely symmetric weighting scheme. This implies that if we want to maintain relatively high negative weights for the poorest individuals, we need to give relatively high positive weights to the richest individuals. The symmetric index we propose here is defined as follows:

$$S(h,\beta )=\frac{1}{{\mu}_{h}}{\int}_{0}^{1}\beta {2}^{\beta -2}{\left[{(p-{\scriptstyle \frac{1}{2}})}^{2}\right]}^{{\scriptstyle \frac{\beta -2}{2}}}(p-{\scriptstyle \frac{1}{2}})h(p)dp$$

(9)

with β >1.^{4} In terms of expression (8), we have ε = β and:

$$f({\mu}_{h},\beta )=\frac{1}{{\mu}_{h}},\phantom{\rule{0.38889em}{0ex}}w(p,\beta )=\beta {2}^{\beta -2}{\left[{(p-{\scriptstyle \frac{1}{2}})}^{2}\right]}^{{\scriptstyle \frac{\beta -2}{2}}}(p-{\scriptstyle \frac{1}{2}})$$

(10)

One can check that for β = 2 we have *w*(*p*, 2) = 2*p*−1, which means that for this value the symmetric index coincides with the extended concentration index with ν = 2.

The weighting scheme has been devised in such a way that those with fractional ranks above the median always have positive weights, and those below the median always negative weights. As can be seen from Figure 2, by taking 1< β < 2 we give relatively higher weights to those with a fractional rank close to the median, while by taking β > 2 we give relatively higher weights to those at the upper and lower end of income distribution. In the most extreme case (β → +∞) the symmetric index tends to
$\frac{h(1)-h(0)}{4{\mu}_{h}}$. The value of the index varies between −β/2 and +β/2.^{5} Just as for the extended concentration index, the distance between the bounds is equal to the value of the distributional sensitivity parameter, β, and coincides with the distance between the weights of the richest and the poorest individual.
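A discrete analogue of the symmetric index can be sketched as follows. This is an assumption-laden illustration rather than the estimator used in the paper: the integral in (9) is replaced by an average over untied fractional ranks (2*i* − 1)/(2*n*), the weight is written in the algebraically equivalent form β2^{β−2}|*p* − ½|^{β−1} sign(*p* − ½) to avoid a division by zero at the median when β < 2, and the function names are hypothetical.

```python
def symmetric_weight(p, beta):
    """Weight (10), written as beta * 2**(beta-2) * |p - 1/2|**(beta-1) * sign(p - 1/2)."""
    d = p - 0.5
    sign = (d > 0) - (d < 0)
    return beta * 2 ** (beta - 2) * abs(d) ** (beta - 1) * sign


def symmetric_index(h, beta):
    """Discrete analogue of (9); h is ordered from poorest to richest."""
    n = len(h)
    mu = sum(h) / n
    ranks = [(2 * i - 1) / (2 * n) for i in range(1, n + 1)]
    return sum(symmetric_weight(r, beta) * hi for r, hi in zip(ranks, h)) / (n * mu)


health = [1.0, 2.0, 3.0, 4.0]  # illustrative values
for beta in (1.5, 2, 4):
    # The index changes sign, but not magnitude, when the distribution is reversed.
    print(beta, symmetric_index(health, beta), symmetric_index(health[::-1], beta))
```

The weights at the two ends of the distribution are −β/2 and +β/2, and for β = 2 the function reproduces the linear weights 2*p* − 1 of the standard concentration index.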

Because of the symmetry property, the individual who occupies the median position in the income distribution plays a pivotal role in the calculation of the symmetric index. Suppose there is a ceteris paribus increase of the health level of one person located at position *p* in the socioeconomic distribution. What would be the effect of such a change upon the value of the index measuring socioeconomic health inequalities? Let us start at *p* = 0 (the poorest individual). Obviously this is a pro-poor change, and we expect the index to become more pro-poor, i.e. to decrease in value. This implies that we always have *w*(0, ε) < 0. Next, let us increase *p* and wonder from what value of *p* the change becomes pro-rich, i.e. at which point *w*(*p*, ε) turns positive. If we think that this threshold value *p*^{*} should be lower than the median, we could opt for the extended concentration index: given *p*^{*} < 1/2, if we choose the value of the distributional sensitivity parameter ν^{*}, where ν^{*} is such that *p*^{*} =1−(1/ν^{*})^{1/(ν* −1)}, we obtain the desired result. If, however, we decide that the threshold value should always be equal to the median, the symmetric index seems a more appropriate choice.
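Choosing ν^{*} for a desired threshold *p*^{*} requires solving the cut-off condition numerically, since it has no closed-form inverse. The sketch below uses simple bisection and relies on the cut-off being a decreasing function of ν; the function names, the search interval and the example threshold of 0.25 are illustrative assumptions.

```python
def cutoff(nu):
    """Fractional rank at which the extended concentration index weight turns positive."""
    return 1 - (1 / nu) ** (1 / (nu - 1))


def nu_for_threshold(p_star, lo=2.0, hi=1000.0, tol=1e-10):
    """Find nu* with cutoff(nu*) = p_star by bisection.

    Assumes cutoff(hi) < p_star < 1/2, so that the solution lies in [lo, hi].
    """
    if not cutoff(hi) < p_star < 0.5:
        raise ValueError("p_star must lie between cutoff(hi) and 1/2")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cutoff(mid) > p_star:
            lo = mid  # cut-off still too high: increase nu
        else:
            hi = mid
    return (lo + hi) / 2


nu_star = nu_for_threshold(0.25)
print(nu_star, cutoff(nu_star))
```

For a threshold at the first quartile the required parameter lies between 8 and 9, which illustrates how quickly ν must grow to move the threshold well below the median.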

The threshold value *p*^{*} demarcates the group of the poor from the group of the non-poor. We believe that the choice of *p*^{*} = 0.5 is a reasonable point of departure as 0.5 is the expected location of a person. In other words, the lower half of the population is considered as poor, and the upper half as rich. We do not exclude that another value, say *p*^{*} = 0.25, might be more appropriate than our *a priori* choice, but without additional information (e.g. on income levels) we think it is very hard to make a case for such an alternative boundary. By construction, rank-dependent inequality measures leave that kind of information out of consideration, and therefore naturally lead us to take *p*^{*} = 0.5, at least as a starting point.

Another issue concerns the reaction of the index of socioeconomic health inequality to health transfers at different locations in the distribution. Suppose there is a transfer of health Δ from a person located at position *p _{j}* to a person located at position

When the issue is the measurement of one-dimensional inequality, for instance of incomes, we believe ‘sensitivity to poverty’ is the appropriate distributional concept. But it can be questioned whether the same concept is also the most appropriate one in the case of two-dimensional inequality, for instance of health in relation to socioeconomic rank. In the latter case, we are not measuring the inequality of health as such, but the degree of association between the distribution of health and the socioeconomic ranking. The measure of this degree of association should take into account the whole spectrum of possibilities, and not privilege inequality in one dimension over inequality in the other. By making the measure more sensitive to one end of the income spectrum (‘the poor’) than to the other (‘the rich’), we run the risk of reducing or even neglecting part of the existing inequality. Why should a person with a low income rank but a high health level count more than a person with a high income rank but a low health level? While the symmetric index does not address this issue directly, it expresses the idea that what is happening at the extremities of the income distribution, whether it be at the high end or the low end, should carry more weight than what is happening in the middle.

This brings us to an interesting resemblance between rank-dependent indices that satisfy the symmetry property and the range, which is probably the oldest and most frequently used measure in the field of health inequality (e.g. Townsend and Davidson 1982). The range compares the health levels of the top and bottom income groups, and its implicit value judgement is that the difference between the best-off and worst-off income groups is what matters for health inequality. This is very much in line with the value judgements of the symmetric index with a high ‘sensitivity to extremity’ (i.e. a high value of β). Hence, the symmetric index proposed in this paper makes it possible to bring together the value judgements underlying rank-dependent indices, such as the concentration index, and those underlying indices focusing on ‘extremes’, such as the range.

Due to its exclusive focus on income poverty, the extended concentration index may lead to counterintuitive results. Consider a health distribution in which the poorest 10% of the population have a very high health level, say *c* > 0, the richest 20% also, and all the rest a very low health level, say 0. Since there are twice as many rich persons in good health as poor persons, we believe few people would doubt that health is distributed rather strongly in favour of the rich, and therefore we expect a positive value of the index. Yet, the extended concentration index will be *negative* for any value of ν (approximately) higher than 3.33 (the value of the extended concentration index is in this example equal to
$\frac{10}{3}[{(0.9)}^{\nu}-{(0.2)}^{\nu}-0.7]$)^{6}. By contrast, the symmetric index will always be positive.
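The sign change in this example can be located numerically. The sketch below evaluates the closed-form expression given above for the extended concentration index, together with a closed form for the symmetric index that we derive here by integrating the weight in (10) over the two healthy groups (the antiderivative of the weight is 2^{β−2}|*p* − ½|^{β}); that second expression and all function names are our own working, not taken from the text.

```python
def extended_ci_example(nu):
    """Extended concentration index for the 10% / 20% example (continuous case)."""
    return (10 / 3) * (0.9 ** nu - 0.2 ** nu - 0.7)


def symmetric_example(beta):
    """Symmetric index for the same distribution, from integrating weight (10)."""
    return 2 ** (beta - 2) * (0.4 ** beta - 0.3 ** beta) / 0.3


def sign_change(lo=2.0, hi=5.0, tol=1e-10):
    """Bisection for the nu at which the extended index turns negative.

    Assumes the index is positive at lo and negative at hi.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if extended_ci_example(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(sign_change())
for beta in (1.5, 2, 4, 8):
    print(beta, symmetric_example(beta))
```

At ν = β = 2 the two expressions coincide, as they should, and the symmetric index stays positive for every β because 0.4^{β} always exceeds 0.3^{β}.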

The explanation of the divergence lies in the way in which the two indices treat different combinations of ranks and health levels. Low health levels always have a small contribution to the value of the extended concentration index (positive in case of a high rank and negative for low ranks); and this also holds for the symmetric index. But things are different for high health levels. In case of the extended concentration index, these lead to a moderately positive contribution for high ranks, and a very large negative contribution for low ranks; while there is no such difference (apart from the sign of the contribution) for the symmetric index, i.e. there is a large positive contribution for high ranks, and a large negative contribution for low ranks.

While the extended and symmetric concentration indices can be applied to unbounded variables, our focus now shifts to the case of bounded health variables, i.e. health variables which can be looked at from two points of view: the positive side, where the focus is on ‘good health’ (e.g. the proportion of children without malnutrition), and the negative side, where the emphasis is on ‘ill health’ (e.g. the proportion of children with malnutrition). For any good health variable which has a finite upper bound it is in principle possible to define a corresponding ill health variable by calculating the shortfall with regard to the maximum. This twofold character of many health variables introduces an element into the measurement of health inequality which does not occur in the measurement of income inequality (Wagstaff 2005, Erreygers 2009a, Wagstaff 2009, Erreygers 2009b, Lambert and Zheng, 2011, Erreygers and Van Ourti 2011a, Wagstaff 2011a, Erreygers and Van Ourti 2011b, Wagstaff 2011b, Kjellsson and Gerdtham 2011).

In this paper we limit ourselves to bounded health variables of the ratio-scale type.^{7} Any such variable can always be transformed into a standardized variable with a range equal to the interval [0,1]: if *h _{i}* is a variable with a range equal to [

It is now well-known that the standard concentration index may give conflicting information when applied separately to health and ill health. When comparing two different distributions, it can occur that the distribution with the highest measured degree of health inequality does not show the highest degree of measured ill health inequality (Clarke et al. 2002; Erreygers 2009a). With regard to one-dimensional inequality, Lambert and Zheng (2011) show that the requirement that the ranking of distributions generated by the health index should be the same as the ranking generated by the ill health index, is equivalent to imposing the perfect complementarity property, by which we mean that for a given distribution the value of the health index must always be exactly equal to the value of the ill-health index. In the two-dimensional case, this translates into the mirror property: the value of the health index should be exactly the opposite of the value of the ill health index, i.e. *I*(*h*^{*},ε) = −*I*(*s*^{*},ε). Erreygers and Van Ourti (2011a) show that the mirror property is incompatible with rank-dependent inequality indices focusing on relative (ill) health differences between individuals; and this explains why the violation of the mirror property carries over to the extended and symmetric concentration indices. If the mirror property is believed to be more important than the focus on relative differences, then obviously the extended and symmetric concentration indices must be abandoned or modified.

For the class of indices studied by Erreygers and Van Ourti (2011a) the mirror property requires the normalization functions of the health and ill-health indices to be symmetrical around
${\mu}_{{h}^{*}}={\scriptstyle \frac{1}{2}}$. This is straightforwardly applied to the class of indices introduced in (8), i.e. we must have *f*(μ_{h*}, ε) = *f*(1 − μ_{h*}, ε) for any given value of μ_{h*}.

The reason why the extended concentration index and the symmetric index fail to satisfy the mirror property is that their normalization functions do not have the required property. One obvious way to remedy the situation is therefore to modify the normalization function, keeping the weighting function intact. The simplest way of ensuring that *f* (μ_{h*}, ε) =*f* (1 − μ_{h*}, ε) holds for any value of μ_{h*} is to make the function independent of μ_{h*}. This procedure constitutes also the basis of the corrected version of the standard concentration index proposed by Erreygers (2009a) (hereafter the Erreygers index), which is itself closely related to the so-called generalized concentration index.
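The effect of the normalization function on the mirror property can be illustrated with a small example. The sketch below compares the standard concentration index, whose normalization 1/μ depends on the mean, with a version whose normalization is the constant 4 (the value that the text associates with the Erreygers index). It assumes a standardized health variable on [0,1], defines ill health as the shortfall 1 − *h*^{*}, uses untied fractional ranks, and the function names and data are hypothetical.

```python
def rank_weighted_mean(x):
    """(1/n) * sum of (2 R_i - 1) * x_i with fractional ranks R_i = (2i - 1) / (2n)."""
    n = len(x)
    return sum(((2 * i - 1) / n - 1) * xi for i, xi in enumerate(x, start=1)) / n


def standard_ci(x):
    """Standard concentration index: normalization 1 / mean."""
    return rank_weighted_mean(x) / (sum(x) / len(x))


def mean_free_index(x):
    """Same weights, but with the constant normalization 4."""
    return 4 * rank_weighted_mean(x)


health = [0.1, 0.2, 0.4, 0.9]             # standardized health, poorest first
ill_health = [1 - hi for hi in health]    # shortfall from the maximum

print(standard_ci(health), standard_ci(ill_health))          # not mirror images
print(mean_free_index(health), mean_free_index(ill_health))  # exact opposites
```

With the mean-dependent normalization the two values differ in magnitude because health and ill health have different means; with the constant normalization they are exact opposites.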

The generalized extended concentration index we propose here is defined as follows:

$$GC({h}^{*},\nu )=\frac{{\nu}^{\nu /(\nu -1)}}{\nu -1}{\int}_{0}^{1}[1-\nu {(1-p)}^{\nu -1}]{h}^{*}(p)dp$$

(11)

In terms of (8), this means we take ε = ν and:

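$$f({\mu}_{{h}^{*}},\nu )=\frac{{\nu}^{\nu /(\nu -1)}}{\nu -1},\phantom{\rule{0.38889em}{0ex}}w(p,\nu )=1-\nu {(1-p)}^{\nu -1}$$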

(12)

For ν = 2 we have *f* (μ_{h*}, 2) = 4 and *w*(*p*, 2) = 2*p* − 1, and we obtain the continuous version of the Erreygers index.

We have already observed that, for a given value of ν, those with fractional ranks below 1−(1/ν)^{1/(ν−1)} have negative weights and those with fractional ranks above this value positive weights. Since ∫ [1− ν(1− *p*)^{ν−1}]*dp* = *p* + (1 − *p*)^{ν} + constant, the sum of the positive weights is equal to (ν −1)ν^{−ν/(ν−1)}, and that of the negative weights to −(ν −1)ν^{−ν/(ν−1)}. This implies that when all individuals with positive weights have maximum health and all individuals with negative weights minimum health, then the value of index (11) is equal to +1; in the opposite case it is equal to −1. In all other cases the value of the index will be strictly higher than −1 and strictly lower than +1. In fact, for given values of ν and μ_{h*} the lower and upper bounds of the index are such that:

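$$\frac{{\nu}^{\nu /(\nu -1)}}{\nu -1}(1-{\mu}_{{h}^{*}})[{(1-{\mu}_{{h}^{*}})}^{\nu -1}-1]\le GC({h}^{*},\nu )\le \frac{{\nu}^{\nu /(\nu -1)}}{\nu -1}{\mu}_{{h}^{*}}[1-{({\mu}_{{h}^{*}})}^{\nu -1}]$$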

(13)

(The maximum upper bound of +1 can be reached only when μ_{h*} = (1/ν)^{1/(ν−1)}, and the minimum lower bound of −1 only when μ_{h*} = 1 − (1/ν)^{1/(ν−1)}). Finally, when ν → +∞, the generalized extended concentration index in equation (11) tends to μ_{h*} − *h*^{*} (0).
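The statements about the bounds can be verified numerically. The sketch below assumes that the normalization of the generalized extended concentration index is the reciprocal of the sum of the positive weights, ν^{ν/(ν−1)}/(ν−1), and obtains the bounds for a given mean by placing all health in the richest (upper bound) or poorest (lower bound) share μ_{h*} of the population, using the antiderivative *p* + (1 − *p*)^{ν} quoted above. The function names are hypothetical.

```python
def gc_normalization(nu):
    """Reciprocal of the sum of the positive weights, (nu - 1) * nu**(-nu / (nu - 1))."""
    return nu ** (nu / (nu - 1)) / (nu - 1)


def gc_bounds(mu, nu):
    """Lower and upper bound of the generalized extended index for mean health mu."""
    f = gc_normalization(nu)
    lower = f * ((1 - mu) ** nu - (1 - mu))  # all health held by the poorest share mu
    upper = f * (mu - mu ** nu)              # all health held by the richest share mu
    return lower, upper


for nu in (1.5, 2, 3, 6):
    mu_max = (1 / nu) ** (1 / (nu - 1))  # mean at which the upper bound should reach +1
    print(nu, gc_normalization(nu), gc_bounds(mu_max, nu), gc_bounds(1 - mu_max, nu))
```

For ν = 2 the normalization equals 4 and both bounds reduce to ±4μ_{h*}(1 − μ_{h*}), which attain ±1 only at a mean of one half.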

The generalized version of the symmetric index is defined as:

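$$GS({h}^{*},\beta )=4{\int}_{0}^{1}\beta {2}^{\beta -2}{\left[{(p-{\scriptstyle \frac{1}{2}})}^{2}\right]}^{{\scriptstyle \frac{\beta -2}{2}}}(p-{\scriptstyle \frac{1}{2}}){h}^{*}(p)dp$$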

(14)

This means that we take ε = β in (8) and:

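$$f({\mu}_{{h}^{*}},\beta )=4,\phantom{\rule{0.38889em}{0ex}}w(p,\beta )=\beta {2}^{\beta -2}{\left[{(p-{\scriptstyle \frac{1}{2}})}^{2}\right]}^{{\scriptstyle \frac{\beta -2}{2}}}(p-{\scriptstyle \frac{1}{2}})$$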

(15)

In this case too, the value of β = 2 leads to the Erreygers index. The sum of the normalized positive weights is always equal to 1, which implies that the value of index (14) is equal to +1 when all individuals above the median have maximum health and all individuals below it minimum health; in the opposite case the index is −1. In general, for given values of β and μ_{h*} the bounds of the index are equal to:

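$$-1+{2}^{\beta}{\left[{({\scriptstyle \frac{1}{2}}-{\mu}_{{h}^{*}})}^{2}\right]}^{{\scriptstyle \frac{\beta}{2}}}\le GS({h}^{*},\beta )\le 1-{2}^{\beta}{\left[{({\scriptstyle \frac{1}{2}}-{\mu}_{{h}^{*}})}^{2}\right]}^{{\scriptstyle \frac{\beta}{2}}}$$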

(16)

(The maximum bound of +1 and the minimum bound of −1 can be reached only when
${\mu}_{{h}^{*}}={\scriptstyle \frac{1}{2}}$.) When β → +∞, the generalized symmetric index tends to *h*^{*}(1) − *h*^{*}(0).
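The bounds of the generalized symmetric index can be checked in the same spirit. The sketch below assumes the normalization is the constant 4 (so that the normalized positive weights sum to 1, as stated above) and derives the bounds for a given mean by concentrating all health in the richest or poorest share μ_{h*} of the population; the antiderivative of the normalized weight is 2^{β}|*p* − ½|^{β}. The function name is hypothetical.

```python
def gs_bounds(mu, beta):
    """Lower and upper bound of the generalized symmetric index for mean health mu."""
    reach = 1 - 2 ** beta * abs(0.5 - mu) ** beta
    return -reach, reach


for beta in (1.5, 2, 4, 10):
    # Bounds at a mean of one half, and at a mean far from one half.
    print(beta, gs_bounds(0.5, beta), gs_bounds(0.1, beta))
```

The bounds are symmetric around zero for every mean, they reach ±1 only when the mean equals one half, and for β = 2 they reduce to ±4μ_{h*}(1 − μ_{h*}).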

We end up with four indices, two of which were developed for unbounded variables and two for bounded variables. We can classify the available indices according to whether they have the symmetry and/or mirror property (see Table 1). For bounded variables we think the mirror property is crucial; in terms of value judgements, we believe it is better to use an index which has the mirror property than an index which focuses exclusively on relative (ill) health differences. Those who believe that accounting for relative (ill) health differences is more important than satisfying the mirror property can resort to the extended concentration and symmetric index when using bounded variables (see also Erreygers and Van Ourti 2011a). Whether the symmetry property is essential or not is open to debate; depending on one’s preference, there is a choice between the (generalized) symmetric index and the (generalized) extended concentration index.

Before we move to an empirical application, it deserves to be mentioned that we have not explored all possible indices. There is much to be said in favour of a procedure which starts from a very broad set of indices and then narrows it down after specifying a list of desirable properties. We fully realize that the (generalized) extended concentration index and the (generalized) symmetric index are not the only indices that incorporate distributional sensitivity, but since the extended concentration index has been used in practice and the other indices are closely related to it, we decided to limit the discussion to these four indices.

In this section we illustrate some of the measurement issues concerning each of the four previously discussed inequality measures using real data. All calculations^{8} are based on data from the Demographic and Health Surveys (DHS), which cover a range of measures of health (and ill health) and of the use of different types of health care, collected in over 40 developing countries. The DHS has been used before in other studies of health inequalities (among others, Wagstaff 2002 and Van De Poel et al. 2007). The surveys we use range in size from around 2,500 to over 30,000 individuals and refer to the period between 1996 and 2004.

In this study we focus on two health variables from subsets of these data: (a) under-five mortality for children born between 5 and 15 years from the time of the survey^{9}; and (b) the number of antenatal visits the mother had for her lastborn child. Under-five mortality is a bounded variable and hence will be used to illustrate the ‘generalized’ measures. Since this variable is binary and only takes the values of 0 and 1, there is no need to standardize before applying these indices. We use the number of antenatal visits as an example of an unbounded variable to illustrate the standard, i.e. ‘relative’, versions of the two indices.^{10} Key characteristics of the surveys including the total number in each sub-sample as well as the proportion of children dying under five years of age, and summary statistics of the number of antenatal visits are reported in Table 2 by country.

In some countries, a high proportion of households have the same value for the constructed socioeconomic variable, which results in ties of the socioeconomic rank.^{11} For example, more than 35 percent of the sample in Comoros, Haiti, Nepal and Zambia have the same value for the constructed socioeconomic variable. In addition, small-sample bias and differences in (ex-post) sampling probabilities need to be addressed when undertaking analyses based on finite samples. In the appendix we provide detailed information on how to adjust for these biases when using these indices.

Before applying the indices to all countries it is useful to examine two countries (Niger and The Philippines) in detail in order to understand how and why the indices vary for different levels of the distributional sensitivity parameters β and ν. Figure 3 presents the distribution of under-five mortality across socioeconomic deciles in these two countries. In Niger under-five mortality rates are very high, with around 34% of children in households in the lowest socioeconomic decile dying before the age of five, compared to around 19% in the highest decile (see Figure 3a). In The Philippines the rates of mortality range from 9% in the lowest to 3% in the highest decile (see Figure 3b).

Since the infant mortality rate is a bounded variable, we use the generalized versions of our indices to compare the two countries. Figures 3c and 3d present the values of the indices for both Niger and The Philippines for a range of values of the distributional sensitivity parameters based on the proportion of children dying in each decile. There is a striking difference between the messages conveyed by the generalized extended concentration index and the generalized symmetric index. For low values of the distributional sensitivity parameters both indices conclude that infant mortality has a pro-poor bias in both countries, and that the absolute level of socioeconomic inequality is higher in Niger than in The Philippines. For high values of the distributional sensitivity parameter ν, however, the generalized extended concentration index becomes very small for Niger, whereas the values for The Philippines remain fairly constant.^{12} As a result, the inequality ranking of Niger and The Philippines based on the generalized extended concentration index depends on the value of the distributional sensitivity parameter. The generalized symmetric index, by contrast, consistently measures pro-poor bias in both countries and absolute values which are higher in Niger than in The Philippines.^{13} The main reason for the difference between the two indices lies in the non-monotonic nature of the distribution of under-five mortality according to socioeconomic status in Niger. For high values of ν the generalized extended concentration index is, in fact, dominated by the rising mortality rates among the poorest deciles. The generalized symmetric index also takes this into account, but compensates for it by giving just as much weight to the falling mortality rates among the richest deciles.
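The ranking reversal can be checked directly from the decile-based values quoted in footnotes 12 and 13. The following minimal Python sketch merely tabulates those eight numbers; the dictionary layout and the helper `larger_inequality` are hypothetical names introduced here for illustration.

```python
# Decile-based index values quoted in footnotes 12 and 13.
# Inner keys are the values of the distributional sensitivity parameter.
GC = {"Niger": {1.5: -0.094, 10: -0.008},
      "Philippines": {1.5: -0.045, 10: -0.042}}
GS = {"Niger": {1.5: -0.067, 10: -0.139},
      "Philippines": {1.5: -0.042, 10: -0.058}}


def larger_inequality(index, param):
    """Return the country with the larger absolute index value (hypothetical helper)."""
    return max(index, key=lambda country: abs(index[country][param]))


for name, index in (("GC", GC), ("GS", GS)):
    for param in (1.5, 10):
        print(name, param, larger_inequality(index, param))
```

With these values the generalized extended concentration index points to Niger at ν = 1.5 but to The Philippines at ν = 10, whereas the generalized symmetric index points to Niger at both parameter values.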

As an example of an unbounded variable we examine the average number of antenatal visits. For this type of variable we use the (relative) extended concentration index and the (relative) symmetric index. Figures 4a and 4b show there is a clear pro-rich social gradient in The Philippines; there is also a pro-rich social gradient in Niger, but the average number of visits declines across the first few deciles. Figures 4c and 4d reveal the different reactions of the two indices. For higher values of ν the extended concentration index for Niger declines, while it increases for The Philippines; again, there is a reversal of the inequality ranking of the two countries when ν increases. By contrast, for all values of β Niger has higher values of the symmetric index than The Philippines.

A more complete picture emerges when we take all countries in the DHS database into account. Table 3 lists the values of the indices for the variable infant mortality, and Table 4 for the variable number of antenatal visits. For the sake of brevity we present the results for only four values of the distributional sensitivity parameters: 1.5, 2, 3 and 6.

A good indication of the difference between the various indices is provided by a pairwise comparison of the ranks of the different countries for comparable values of the distributional sensitivity parameters. Figure 5 plots the rankings of countries produced by the generalized extended concentration index *GC*(*h*^{*},ν) and the generalized symmetric index *GS*(*h*^{*},β) for infant mortality, for different values of ν = β. In each case, the countries are ranked from most pro-poor to most pro-rich. For ν = β = 2 the values of these indices coincide, so the ranks are perfectly correlated (see Figure 5b). For values of ν and β close to 2 the ranks are no longer perfectly correlated, but remain fairly similar (see Figures 5a and 5c). For ν = β = 6, however, the correlation is much weaker (see Figure 5d): the Spearman rank correlation coefficient drops to 0.770, which indicates an increasing divergence in the ranking of countries across indices.
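As a rough sketch of how such a rank comparison can be computed, the Python fragment below obtains the Spearman coefficient as the Pearson correlation of the two rank vectors. The function names are hypothetical, ties are assumed away, and the two input vectors are made-up placeholders rather than values from Table 3.

```python
import numpy as np


def ranks(values):
    """Ranks 1..n from the smallest to the largest value (assumes no ties)."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values)
    r = np.empty(values.size, dtype=float)
    r[order] = np.arange(1, values.size + 1)
    return r


def spearman(x, y):
    """Spearman coefficient as the Pearson correlation of the rank vectors."""
    return float(np.corrcoef(ranks(x), ranks(y))[0, 1])


# Hypothetical index values for five countries (placeholders, not DHS estimates).
gc_values = [-0.09, -0.05, -0.02, 0.01, 0.03]
gs_values = [-0.07, -0.08, -0.01, 0.02, 0.00]
print(round(spearman(gc_values, gs_values), 3))
```

Ranking in ascending order of the index value corresponds to ordering the countries from most pro-poor to most pro-rich, which is the convention used in Figure 5.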

Figure 6 shows the same information, but now for the number of antenatal visits and using the relative inequality measures, i.e. the extended concentration index and the symmetric index. A broadly similar pattern emerges, although in this case there seems to be more consistency between the measures: the Spearman rank correlation coefficients tend to be slightly higher.

In this paper we have explored various ways to incorporate attitudes towards inequality into the measurement of socioeconomic health inequalities. We started by revisiting the extended concentration index that was proposed by Pereira (1998) and Wagstaff (2002). Its asymmetric weighting scheme is based on the idea – borrowed from the income inequality literature – that we should give relatively higher (absolute) weights to the poor than to the rich. But this comes at a price: the index can in some instances conclude there exists pro-poor bias in cases when there is no systematic bias in favour of either the rich or the poor, i.e. when the chances of having high or low health levels happen to be symmetrically distributed over the rich and the poor. Informed by this analysis, we then introduced a new requirement – the symmetry property – and a new index – the symmetric index – based on a distributional weighting scheme which is (inversely) symmetric around the median income rank and gives higher (absolute) weights to both the poor *and* the rich.

We also paid special attention to bounded variables. Similarly to the standard concentration index, it turns out that neither the extended concentration nor the symmetric index satisfies the mirror condition, i.e. the requirement that socioeconomic inequality in health attainments should mirror socioeconomic inequality in health shortfalls (or ill-health). We then showed, however, that a simple re-normalization suffices to transform both of these indices into a generalized form which does satisfy the mirror condition.

In an empirical section we illustrated that the rankings of countries generated by the indices can differ substantially, especially for high values of the distributional sensitivity parameters. This holds for both unbounded and bounded variables. Unlike what is observed for the extended Gini coefficient, increasing the degree of distributional sensitivity does not always lead to increasing pro-rich inequality: these indices can rise or fall in value and can even switch sign whenever measures of health (or ill-health) are not monotonically changing across socio-economic groups. The issue appears to be most relevant for the extended and generalized extended concentration indices. Hence more caution needs to be exercised when choosing the value of the distributional sensitivity parameter than in the case of the extended Gini coefficient. Important for all indices in empirical work are the biases due to finite samples, ties in the ranking variable and differences in (ex-post) sampling probabilities. In the appendix we point out how we have dealt with those issues.

Finally, while we have examined the properties of the four indices incorporating different weighting schemes and different normalization functions, we can still say very little about the preferences of society regarding the distribution of health across income. Developing a plausible range of weighting schemes which can be employed in empirical work to reliably inform policy analysis therefore remains an important challenge.

Tom Van Ourti is supported by the NETSPAR project ‘Income, health and work across the life cycle II’, and acknowledges support by the National Institute on Ageing, under grant R01AG037398-01. We have benefited from the comments and suggestions of two anonymous referees, Ramses Abul Naga, Paul Allanson, Clément de Chaisemartin, Gustav Kjellsson, Andreas Knabe, Ann Lecluyse, Dennis Petrie, and participants of seminars at Erasmus University Rotterdam, University of Antwerp, Lund University, the Health, happiness and inequality conference at the Technische Universität Darmstadt, the 2010 Irdes workshop on applied health economics and policy evaluation in Paris, the LowLands Health Economics Study Group in Egmond aan Zee, and the 2011 iHEA conference in Toronto. We also thank Ellen Van de Poel for assistance with the DHS data. The usual caveats apply and all remaining errors are our responsibility.

In empirical work one works with finite samples, which means that *p* is not a continuous variable and that a discrete version of the formulas has to be used. The value of *p* is usually approximated by the fractional rank *R*_{i} (Lerman and Yitzhaki 1989). The fractional rank versions of the four indices are:

$${C}^{R}(h,\nu )=\frac{1}{{\mu}_{h}}\sum _{i=1}^{n}\left[\frac{1-\nu {(1-{R}_{i})}^{\nu -1}}{n}\right]{h}_{i}$$

(A1)

$${GC}^{R}({h}^{*},\nu )=\frac{{\nu}^{\nu /(\nu -1)}}{\nu -1}\sum _{i=1}^{n}\left[\frac{1-\nu {(1-{R}_{i})}^{\nu -1}}{n}\right]{h}_{i}^{*}$$

(A2)

$${S}^{R}(h,\beta )=\frac{1}{{\mu}_{h}}\sum _{i=1}^{n}\left\{\frac{\beta {2}^{\beta -2}{\left[{({R}_{i}-{\scriptstyle \frac{1}{2}})}^{2}\right]}^{{\scriptstyle \frac{\beta -2}{2}}}\left({R}_{i}-{\scriptstyle \frac{1}{2}}\right)}{n}\right\}{h}_{i}$$

(A3)

$${GS}^{R}({h}^{*},\beta )=4\sum _{i=1}^{n}\left\{\frac{\beta {2}^{\beta -2}{\left[{({R}_{i}-{\scriptstyle \frac{1}{2}})}^{2}\right]}^{{\scriptstyle \frac{\beta -2}{2}}}\left({R}_{i}-{\scriptstyle \frac{1}{2}}\right)}{n}\right\}{h}_{i}^{*}$$

(A4)

(The superscript *R* is added to indicate that these expressions are based on the fractional ranks *R*_{i}.)
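The four fractional rank expressions can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the STATA code used for the paper: the function names are hypothetical, the fractional rank is taken to be (i − 0.5)/n (no ties, equal sampling weights), observations are assumed to be sorted from poorest to richest, and for the generalized indices the health variable is assumed to be already scaled to [0, 1].

```python
import numpy as np


def fractional_ranks(n):
    """Assumed fractional ranks R_i = (i - 0.5) / n, poorest to richest."""
    return (np.arange(1, n + 1) - 0.5) / n


def extended_weights(R, nu):
    """Weights 1 - nu * (1 - R)^(nu - 1) appearing in (A1) and (A2)."""
    return 1.0 - nu * (1.0 - R) ** (nu - 1.0)


def symmetric_weights(R, beta):
    """Weights appearing in (A3) and (A4).

    [(R - 1/2)^2]^((beta - 2) / 2) * (R - 1/2) equals sign(d) * |d|^(beta - 1)
    with d = R - 1/2; this form avoids raising 0 to a negative power.
    """
    d = R - 0.5
    return beta * 2.0 ** (beta - 2.0) * np.sign(d) * np.abs(d) ** (beta - 1.0)


def relative_indices(h, nu, beta):
    """Return (C^R, S^R) as in (A1) and (A3); h is ordered by socioeconomic rank."""
    h = np.asarray(h, dtype=float)
    R = fractional_ranks(h.size)
    c = np.mean(extended_weights(R, nu) * h) / h.mean()
    s = np.mean(symmetric_weights(R, beta) * h) / h.mean()
    return c, s


def generalized_indices(h, nu, beta):
    """Return (GC^R, GS^R) as in (A2) and (A4); requires nu > 1, h in [0, 1]."""
    h = np.asarray(h, dtype=float)
    R = fractional_ranks(h.size)
    f_nu = nu ** (nu / (nu - 1.0)) / (nu - 1.0)
    gc = f_nu * np.mean(extended_weights(R, nu) * h)
    gs = 4.0 * np.mean(symmetric_weights(R, beta) * h)
    return gc, gs


# Made-up health values for five individuals, ordered from poorest to richest.
h = [0.2, 0.5, 0.4, 0.9, 0.7]
print(relative_indices(h, nu=2, beta=2))
print(generalized_indices(h, nu=2, beta=2))
```

For ν = β = 2 both weighting schemes reduce to 2*R*_{i} − 1, so the extended and symmetric versions return the same value. Applying `relative_indices` to a perfectly equal distribution with ν = 3 gives a small positive number rather than zero.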

In general it can be said that the value of the fractional rank indices will be very close to the value of the continuous indices for high values of the number of persons *n*, and that the degree of approximation increases with *n*. However, for relatively small values of *n*, the deviation between the two indices may be substantial. The magnitude of this ‘small-sample bias’ is distribution-specific and will be larger for values of the distributional sensitivity parameters ν or β that are relatively further away from 2.

One of the remarkable things, though, is that the small-sample bias also shows up for the extended and generalized extended concentration indices with ν ≠ 2 in the case of an *equal* distribution of health. By replacing *p* by *R*_{i}, the sum of the weights becomes slightly positive (negative) when ν > 2 (ν < 2), whereas it should be equal to zero. An additional reason for the small-sample bias that shows up for an

Our solution to this small-sample bias is based on the idea that the individual weights should be adjusted in such a way that they are equal to the corresponding continuous weights. We assume that individual *i* in a sample of *n* individuals corresponds to the interval $\left[\frac{i-1}{n},\frac{i}{n}\right]$ in the continuous population [0,1]. Given the continuous weighting function *w*(*p*, ε), the corresponding ‘small-sample corrected’ weight *w*^{S}(*i*, ε) is defined as:

$${w}^{S}(i,\epsilon )={\int}_{(i-1)/n}^{i/n}w(p,\epsilon )dp$$

(A5)

The small-sample adjusted version of our general family of indices becomes:

$${I}^{S}(h,\epsilon )=f({\mu}_{h},\epsilon )\sum _{i=1}^{n}{w}^{S}(i,\epsilon ){h}_{i}$$

(A6)

(The superscript *S* indicates that this expression refers to the small-sample adjusted version).

On the basis of (A5) the small-sample adjusted weights are equal to:

$${w}^{S}(i,\nu )=\frac{1}{n}-\left[{\left(1-\frac{i-1}{n}\right)}^{\nu}-{\left(1-\frac{i}{n}\right)}^{\nu}\right]$$

(A7)

$${w}^{S}(i,\beta )={2}^{\beta -2}\left[{\left|\frac{i}{n}-\frac{1}{2}\right|}^{\beta}-{\left|\frac{i-1}{n}-\frac{1}{2}\right|}^{\beta}\right]$$

(A8)

It can be checked that the small-sample adjusted and the fractional rank based weights coincide only when ν = β = 2.
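A short numerical check of these statements is sketched below in Python. The function names are hypothetical, and the fractional rank is again assumed to be (i − 0.5)/n; the per-person fractional rank weight is the bracketed term of (A1), i.e. already divided by *n*.

```python
import numpy as np


def adjusted_extended_weights(n, nu):
    """Small-sample adjusted weights w^S(i, nu) of (A7) for i = 1..n."""
    i = np.arange(1, n + 1)
    return 1.0 / n - ((1.0 - (i - 1) / n) ** nu - (1.0 - i / n) ** nu)


def adjusted_symmetric_weights(n, beta):
    """Small-sample adjusted weights w^S(i, beta) of (A8) for i = 1..n."""
    i = np.arange(1, n + 1)
    upper = np.abs(i / n - 0.5) ** beta
    lower = np.abs((i - 1) / n - 0.5) ** beta
    return 2.0 ** (beta - 2.0) * (upper - lower)


def fractional_rank_weights(n, nu):
    """Bracketed term of (A1), assuming R_i = (i - 0.5) / n."""
    R = (np.arange(1, n + 1) - 0.5) / n
    return (1.0 - nu * (1.0 - R) ** (nu - 1.0)) / n


n = 7
for nu in (1.5, 2.0, 3.0):
    adjusted = adjusted_extended_weights(n, nu)
    gap = np.max(np.abs(adjusted - fractional_rank_weights(n, nu)))
    # Sum of adjusted weights, sum of fractional rank weights, largest gap.
    print(nu, adjusted.sum(), fractional_rank_weights(n, nu).sum(), gap)
```

The adjusted weights telescope, so they sum to zero for every parameter value, while the fractional rank weights do so only at ν = 2, where the two sets of weights are identical.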

When there are ties, the fractional ranks of the tied individuals are based on the average rank within the group to which they belong, which creates an additional source of bias. The latter source of bias is identical to the so-called bias from grouping which arises when data are grouped into categories or ranges, e.g. income quintiles (Clarke and Van Ourti, 2010; Van Ourti and Clarke, 2011). Wagstaff suggested correcting for the bias due to grouping by subtracting the excess value of the sum of the weights from the value of the fractional rank index (Wagstaff 2002, Appendix A.2; O’Donnell et al. 2008, equation 9.6). We follow an alternative approach that generalizes the idea that the adjusted weights should be equal to the corresponding continuous weights, which allows us to address both the small-sample bias *and* the bias due to grouping.

Suppose there are *K* groups in the population, denoted as 1, 2,…,*K*, with 1 referring to the poorest group, 2 to the second poorest, etc. The number of people in group *J* is equal to *n*_{J}. Let us define the total number of people in all groups up to and including group *J* as *I*_{J}, with *I*_{0} = 0. The small-sample adjusted weights of group *J* are then:

$${w}^{s}(J,\nu )=\frac{{n}_{J}}{n}-\left[{\left(1-\frac{{I}_{J-1}}{n}\right)}^{\nu}-{\left(1-\frac{{I}_{J}}{n}\right)}^{\nu}\right]$$

(A9)

$${w}^{s}(J,\beta )={2}^{\beta -2}\left[{\left|\frac{{I}_{J}}{n}-\frac{1}{2}\right|}^{\beta}-{\left|\frac{{I}_{J-1}}{n}-\frac{1}{2}\right|}^{\beta}\right]$$

(A10)
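The grouped weights (A9) and (A10) can be sketched along the same lines. In this Python fragment the function name is hypothetical, and *I*_{J} is assumed to be the cumulative number of people in groups 1 to *J*, with *I*_{0} = 0, which is how we read the definitions above; group sizes are assumed to be whole-number counts.

```python
import numpy as np


def grouped_adjusted_weights(group_sizes, nu, beta):
    """Return the weights of (A9) and (A10) for groups ordered poorest first."""
    n_J = np.asarray(group_sizes, dtype=float)
    n = n_J.sum()
    upper = np.cumsum(n_J) / n                   # I_J / n
    lower = np.concatenate(([0.0], upper[:-1]))  # I_{J-1} / n, with I_0 = 0
    w_nu = n_J / n - ((1.0 - lower) ** nu - (1.0 - upper) ** nu)
    w_beta = 2.0 ** (beta - 2.0) * (
        np.abs(upper - 0.5) ** beta - np.abs(lower - 0.5) ** beta
    )
    return w_nu, w_beta


# Made-up group sizes: four groups of tied individuals, ten people in total.
w_nu, w_beta = grouped_adjusted_weights([3, 1, 4, 2], nu=3.0, beta=3.0)
print(w_nu, w_nu.sum())
print(w_beta, w_beta.sum())
```

When every group contains exactly one person the expressions reduce to (A7) and (A8), and the weight of a larger group equals the sum of the individual weights of its members, so the grouped weights also sum to zero.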

In survey design differences in (ex-post) sampling probabilities are usually counterbalanced by using so-called ‘sampling weights’. If we denote the sampling weight of individual *i* by *w*_{i}, the only adjustments to equations (A9)–(A10) are that: (a) the number of people in group

We know that the biases discussed in sections A.1–A.4 are distribution-specific, but their magnitude and the way they vary with the number of observations, the parameters ν and β, and the shape of the distribution are unknown. Similarly, it is not a priori clear how much of the bias is removed by applying the small-sample adjusted weights in equations (A9)–(A10).

In order to increase our understanding, we have performed Monte Carlo simulations using the beta distribution, which allows us to analyze these issues under a wide variety of shapes of the distribution function, including bimodal and left- and right-skewed distributions. We have also checked a wide variety of values for ν = β in order to reflect a wide range of distributional concerns.

More details on the Monte Carlo simulations can be obtained from the authors upon request, but the general message is that the magnitude of the small-sample bias can be large and that it is more important when health and the income rank are strongly correlated. We also find that the small-sample adjustments reduce the small-sample bias, although there are some cases where the point estimates of the indices did not improve.

^{1}For ν → +∞, the value of the index based on a finite number of individuals becomes distorted and needs to be adjusted. The appendix provides more details.

^{2}This is the continuous version; we introduce the version for a finite number of individuals in the appendix.

^{3}These conditions ensure that the property of income-related health transfers holds (Bleichrodt and van Doorslaer 2006), and that the value of the index is equal to zero when health is distributed equally.

^{4}For β < 1 the weighting function is decreasing in *p*, and for β = 1 it has a discontinuity at *p* = 1/2 (it jumps from −1/2 to +1/2), so we exclude these values.

^{5}By changing the normalization function into
$f({\mu}_{h},\beta )=\frac{2}{\beta {\mu}_{h}}$, we would obtain an index which always varies between −1 and +1.

^{6}This can be checked by noting that ∫ [1 − ν (1 − *p*)^{ν−1}] *dp* = *p* + (1 − *p*)* ^{ν}*+

^{7}Erreygers and Van Ourti (2011a) discuss the implications for (rank-dependent) inequality measurement of different measurement scales.

^{8}The STATA-programs used to calculate the four indices can be obtained from the authors upon request.

^{9}This variable has previously been used by Wagstaff (2002) to illustrate the extended concentration index.

^{10}The DHS does not include information on health care expenditures which would constitute an ideal candidate for an unbounded ratio-scale variable (see e.g. Table 1 in Erreygers and Van Ourti 2011a). We use the number of antenatal visits as a proxy for an unbounded variable which seems a reasonable assumption given that it might take reasonably high values in practice.

^{11}The socioeconomic variable on the basis of which ranks are assigned is constructed using principal component analysis by combining information on a set of household assets and living conditions into one indicator (Filmer and Pritchett 2001).

^{12}For Niger the value of the generalized extended concentration index goes from −0.094 (ν = 1.5) to −0.008 (ν = 10), while for the Philippines it goes from −0.045 (ν = 1.5) to −0.042 (ν = 10). These values are based on decile data; the values reported in Table 3, which are slightly different, are based on individual data (and therefore more accurate).

^{13}For Niger the value of the generalized symmetric index goes from −0.067 (β = 1.5) to −0.139 (β = 10), while for the Philippines it goes from −0.042 (β = 1.5) to −0.058 (β = 10).

Guido ERREYGERS, Department of Economics, University of Antwerp, City Campus, Prinsstraat 13, 2000 Antwerpen, Belgium.

Philip CLARKE, School of Public Health, University of Sydney, Edward Ford Building, NSW 2006 Australia.

Tom VAN OURTI, Erasmus School of Economics, Erasmus University Rotterdam, PO Box 1738, 3000 DR Rotterdam, The Netherlands; Tinbergen Institute and NETSPAR.

- Arokiasamy P, Pradhan J. Measuring wealth-based health inequality among Indian children: the importance of equity vs. efficiency. Health Policy and Planning. 2010. doi: 10.1093/heapol/czq075.
- Bleichrodt H, van Doorslaer E. A welfare economics foundation for health inequality measurement. Journal of Health Economics. 2006;25:945–957.
- Clarke P, Gerdtham UG, Johannesson M, Bingefors K, Smith L. On the measurement of relative and absolute income-related health inequality. Social Science & Medicine. 2002;55:1923–1928.
- Clarke P, Van Ourti T. Calculating the concentration index when income is grouped. Journal of Health Economics. 2010;29:151–157.
- Erreygers G. Correcting the concentration index. Journal of Health Economics. 2009a;28:504–515.
- Erreygers G. Correcting the concentration index: A reply to Wagstaff. Journal of Health Economics. 2009b;28:521–524.
- Erreygers G, Van Ourti T. Measuring socioeconomic inequality in health, health care and health financing by means of rank-dependent indices: A recipe for good practice. Journal of Health Economics. 2011a;30(4):685–694.
- Erreygers G, Van Ourti T. Putting the cart before the horse. A comment on Wagstaff on inequality measurement in the presence of binary variables. Health Economics. 2011b;20(10):1161–1165.
- Filmer D, Pritchett L. Estimating wealth effects without expenditure data – or tears: An application to educational enrollments in states of India. Demography. 2001;38:115–132.
- Gaudin S, Yazbeck AS. Immunization in India: An Equity-Adjusted Approach. World Bank, Health, Nutrition and Population Discussion Paper. 2006.
- Hernández Quevedo C, Jones AM, López Nicolás Á, Rice N. Socioeconomic inequalities in health: a comparative longitudinal analysis using the European Community Household Panel. Social Science & Medicine. 2006;63:1246–1261.
- Kakwani NC. Income Inequality and Poverty: Methods of Estimation and Policy Applications. Oxford University Press; Oxford: 1980.
- Kjellsson G, Gerdtham U-G. Correcting the Concentration Index for Binary Variables. Department of Economics, Lund University; 2011. Working Paper No 2011:4.
- Lambert P, Zheng B. On the consistent measurement of achievement and shortfall inequality. Journal of Health Economics. 2011;30(1):214–219.
- Lerman R, Yitzhaki S. Improving the accuracy of estimates of Gini coefficients. Journal of Econometrics. 1989;42:43–47.
- Meheus F, Van Doorslaer E. Achieving better measles immunization in developing countries: does higher coverage imply lower inequality? Social Science & Medicine. 2008;66:1709–1718.
- Mehran F. Linear measures of income inequality. Econometrica. 1976;44:805–809.
- O’Donnell O, van Doorslaer E, Wagstaff A, Lindelow M. Analyzing health equity using household survey data: A guide to techniques and their implementation. The World Bank; Washington DC: 2008.
- Pereira JA. Inequality in infant mortality in Portugal, 1971–1991. In: Zweifel P, editor. Health, the Medical Profession, and Regulation. Developments in Health Economics and Public Policy. Vol. 6. Kluwer: Boston/Dordrecht/London; 1998. pp. 75–93.
- Townsend P, Davidson N. Inequalities in health: the black report. Penguin; Harmondsworth: 1982.
- Uthman O. Using extended concentration and achievement indices to study socioeconomic inequality in chronic childhood malnutrition: the case of Nigeria. International Journal for Equity in Health. 2009;8:22. doi: 10.1186/1475-9276-8-22.
- Van de Poel E, O’Donnell O, Van Doorslaer E. Are urban children really healthier? Evidence from 47 developing countries. Social Science & Medicine. 2007;65:1986–2003.
- Van Ourti T, Clarke P. A simple correction to remove the bias of the Gini coefficient due to income grouping. Review of Economics and Statistics. 2011;93(3):982–994.
- Wagstaff A. Inequality aversion, health inequalities, and health achievement. Journal of Health Economics. 2002;21:627–641.
- Wagstaff A. The bounds of the concentration index when the variable of interest is binary, with an application to immunization inequality. Health Economics. 2005;14(4):429–432.
- Wagstaff A. Correcting the concentration index: A comment. Journal of Health Economics. 2009;28:516–520.
- Wagstaff A. The concentration index of a binary outcome revisited. Health Economics. 2011a;20(10):1155–1160.
- Wagstaff A. Reply to Guido Erreygers and Tom Van Ourti’s comment on ‘The concentration index of a binary outcome revisited’. Health Economics. 2011b;20(10):1166–1168.
- Wagstaff A, Paci P, Van Doorslaer E. On the measurement of inequalities in health. Social Science & Medicine. 1991;33(5):545–557.
- Yitzhaki S. On an extension of the Gini inequality index. International Economic Review. 1983;24:617–628.
