Demography. 2008 November; 45(4): 785–801.

PMCID: PMC2832329

NIHMSID: NIHMS89135

Shiro Horiuchi, Program in Urban Public Health, Hunter College; and CUNY Institute for Demographic Research.

John R. Wilmoth, Department of Demography, University of California, Berkeley.

Scott D. Pletcher, Huffington Center on Aging and Molecular and Human Genetics, Baylor College of Medicine.

Address correspondence to Shiro Horiuchi, Program in Urban Public Health, Hunter College, 425 East 25th Street, Box 816, New York, NY 10010-2590; e-mail: shoriuch@hunter.cuny.edu.

Copyright © 2008 Population Association of America


A demographic measure is often expressed as a deterministic or stochastic function of multiple variables (covariates), and a general problem (the decomposition problem) is to assess contributions of individual covariates to a difference in the demographic measure (dependent variable) between two populations. We propose a method of decomposition analysis based on an assumption that covariates change continuously along an actual or hypothetical dimension. This assumption leads to a general model that logically justifies the additivity of covariate effects and the elimination of interaction terms, even if the dependent variable itself is a nonadditive function. A comparison with earlier methods illustrates other practical advantages of the method: in addition to an absence of residuals or interaction terms, the method can easily handle a large number of covariates and does not require a logically meaningful ordering of covariates. Two empirical examples show that the method can be applied flexibly to a wide variety of decomposition problems. This study also suggests that when data are available at multiple time points over a long interval, it is more accurate to compute an aggregated decomposition based on multiple subintervals than to compute a single decomposition for the entire study period.

Demographers often wish to compare two populations—either two populations at the same moment or the same population at two points in time—in terms of some variable of interest. Measures of most demographic processes (e.g., fertility, mortality, migration, marriage) have changed significantly over time and show considerable variation across populations or groups (e.g., race/ethnicity, nationality, sex, and region of residence). Such a measure can often be expressed as a function of several covariates (with or without an error term) and is thus regarded as the dependent variable. A general problem (the decomposition problem) is to assess contributions of changes or differences in the covariates between the two populations to the corresponding change or difference in the dependent variable. For example, women tend to live longer than men, and we may inquire about the relative contribution of differences in death rates by age and cause to the sex difference in life expectancy (Arriaga 1984; Pollard 1982, 1988). Likewise, the mean completed parity has increased or decreased in various contexts, and we may wish to express these temporal changes as functions of trends in parity progression ratios (Pullum, Tedrow, and Herting 1989).

As Das Gupta (1991) noted, there are two fundamentally different types of decomposition problems, depending on whether the populations involved are treated as homogeneous or heterogeneous with respect to the dependent variable of interest and its covariates. In the first type of decomposition problem, the dependent variable is described as a function of covariates; only a single set of values for the dependent variable and its covariates is known for a population at any given moment, and thus the change in the dependent variable is decomposed into effects due to the change in each covariate for the population as a whole. In the second type, the dependent variable takes on different values for population subgroups; these subgroups are defined by the associated set of covariate values, and thus the change in the population mean of the dependent variable is decomposed into effects due to changes in the dependent variable within the various subgroups and effects due to changes in the population distribution across the subgroups. The distinction between these two types will be presented more formally later in the article (Eqs. (13) – (15)). Various decomposition methods of these two types are reviewed by Canudas Romo (2003).

This article focuses on the first type, in which the difference in a dependent variable is expressed as a sum of the effects of differences in its covariates. (Hereafter, “decomposition” means the first type only, unless specified otherwise.) Decomposing the difference in the dependent variable is straightforward if it can be written as an *additive* function of its covariates. For example, the total fertility rate (TFR) is the sum of age-specific birth rates. Therefore, the contribution to a change in the TFR that is attributable to a change in the birth rate at some age is merely the change in the age-specific birth rate itself. Likewise, a difference in the dependent variable of any standard linear regression equation can be expressed as an additive function of differences in the covariates and the error term. However, many demographic measures cannot be expressed in a simple additive format. This problem has stimulated the development of various decomposition methods that deal with nonadditive relationships.

Previous approaches to decomposition problems have been based on some manner of *discrete* change in the value of each covariate from the first population to the second. In this article, we propose a method of decomposition analysis relying on an assumption that values of the covariates change *continuously*, or gradually, along an actual or hypothetical dimension. This assumption seems like the natural choice for time-trend analyses because many variables change gradually over time. As we discuss later, this assumption provides a reasonable justification for the additivity of covariate effects, which is a fundamental condition for decomposition analysis, and also for the elimination of interaction effects, which was an important issue in many previous decomposition studies.

The proposed method also requires an assumption about the relationship between covariates as they change gradually between two observation points. A convenient assumption is that changes in the covariates are proportional to one another. In other words, equal proportions of the total change in each covariate are assumed to occur simultaneously. This assumption has a precise mathematical specification, as described later in the article.

Although a number of previous methods were developed for specific dependent variables (e.g., life expectancy, mean completed parity, proportion of population in old age), this method can be applied to any type of dependent variable and its covariates, so long as the former is a differentiable function of the latter. Thus, the method is not limited to demography but is applicable in any scientific field concerned with the difference between two observations of a function of multiple variables. The relationship between the dependent variable and its covariates can be deterministic or stochastic.

This method was devised as a generalization of two methods that had been independently developed for some specific dependent variables (Pletcher, Khazaeli, and Curtsinger 2000; Wilmoth and Horiuchi 1999). We and our collaborators used this method in a few recent studies (Glei and Horiuchi 2007; Wilmoth et al. 2000), but their reports described the method very briefly and cited an early, unpublished version of this article for methodological details. In what follows, we describe the method, present two examples of its application, and compare the method with previous methods.

Let us start by clarifying the meanings of some key terms. In this article, *x* is called a *covariate* of *y* if *y* can be expressed as a mathematical function of *x* (and some other variables), regardless of whether *x* is associated with *y* through some causal pathway. An *effect* of *x* on *y* is a change or difference in *y* produced by a change or difference in *x*. (In some cases, we write “effect of the change (difference) in *x*” instead of “effect of *x*” to emphasize the variation in *x*.) A change or difference in *y* is *decomposed* by expressing it as the sum of effects of its covariates (and in some cases, we include additional terms such as *interaction effects* and *residuals*).

In decomposition analysis, effects of the covariates are assumed to be additive, even though the dependent variable is not usually an additive function of the covariates. (If it is an additive function, decomposition is simple and no special method is needed.) In the previous literature, it has been unclear whether this apparent paradox (nonadditive function of covariates, yet additive covariate effects) is justifiable, or if the decomposition is simply a computational trick without a firm theoretical foundation. The method proposed here is based on a mathematical model that logically justifies the additivity of covariate effects.

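Suppose that a population is described by a numerical characteristic *y*, which is a differentiable function *f* of *n* covariates denoted by **x** = [*x*_{1}, *x*_{2}, . . . , *x*_{*n*}]. Assume that both *y* and **x** depend on an underlying dimension *t* (such as time) and that each *x*_{*i*} is a differentiable function of *t* between *t*_{1} and *t*_{2}. Because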

$$y(t)=f(\mathbf{\text{x}}(t))=f({x}_{1}(t),{x}_{2}(t),\dots ,{x}_{n}(t)),$$

(1)

we have

$$y({t}_{2})-y({t}_{1})={\int}_{{t}_{1}}^{{t}_{2}}\frac{d}{dt}y(t)dt,$$

(2)

by the fundamental theorem of calculus. By applying the chain rule for partial derivatives of a composite function, we obtain

$$y({t}_{2})-y({t}_{1})={\int}_{{t}_{1}}^{{t}_{2}}\left\{\sum _{i=1}^{n}\frac{\partial}{\partial {x}_{i}(t)}y(t)\cdot \frac{d}{dt}{x}_{i}(t)\right\}dt.$$

(3)

Exchange of the integration and the summation, and application of the substitution rule of definite integrals lead to the following equation:

$$y({t}_{2})-y({t}_{1})=\sum _{i=1}^{n}{\int}_{{x}_{i}({t}_{1})}^{{x}_{i}({t}_{2})}\frac{\partial}{\partial {x}_{i}(t)}y(t)d{x}_{i}(t).$$

(4)

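Writing *y*(*t*_{1}), *y*(*t*_{2}), *x*_{*i*}(*t*_{1}), and *x*_{*i*}(*t*_{2}) as *y*_{1}, *y*_{2}, *x*_{*i*1}, and *x*_{*i*2}, respectively, we have

$${y}_{2}-{y}_{1}=\sum _{i=1}^{n}{c}_{i},\quad \text{where}\quad {c}_{i}={\int}_{{x}_{i1}}^{{x}_{i2}}\frac{\partial y}{\partial {x}_{i}}d{x}_{i}.$$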

(5)

In this notation, *c*_{*i*} is the total change in *y* produced by the change in *x*_{*i*} between the two observations.

The preceding discussion provides a general theoretical foundation for decomposition analysis because it implies that even if a dependent variable is not an additive function of its covariates, a change in the dependent variable can be expressed as a sum of effects of the covariates. Geometrically, the vector **x**(*t*) is a point in an *n*-dimensional space, and the difference between *y*(*t*_{1}) and *y*(*t*_{2}) is an integral of the change in *y* along a curve on which the point moves from **x**(*t*_{1}) to **x**(*t*_{2}). The integral can be split into *n* additive components, as shown in Eq. (3). This type of integral is called a *line integral*, which is widely used in mechanics (Williamson, Crowell, and Trotter 1968). In this sense, the mathematical framework represented by Eqs. (1)–(5) may be labeled the line integral model of decomposition.

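It is important to note that regression analyses and decomposition analyses are based on different notions of the “effect” of a covariate on a dependent variable. The linear regression coefficient for *x*_{*i*} shows a change in *y* per unit change in *x*_{*i*} when the other covariates are held constant, whereas the effect *c*_{*i*} in Eq. (5) is the total change in *y* produced by the change in *x*_{*i*} between the two observations while the other covariates are changing as well.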

The computational procedure of this method is essentially a combination of the delta method (i.e., the approximation of small finite changes by derivatives) and numerical integration.^{1} A change in the dependent variable can be considered an accumulation of many small changes. Each of these small finite changes in the dependent variable can be approximated by a linear combination of *n* partial derivatives of the dependent variable with respect to the covariates. Then the additive terms of the linear combinations can be aggregated for each of the *n* covariates over the entire path of change.

To compute these partial derivatives, we need some information about the trajectory of the curve in the *n*-dimensional space (i.e., the joint patterns of change in the *x*_{*i*}’s between *t*_{1} and *t*_{2}).

Probably the simplest assumption is that *x*_{*i*}(*t*) changes linearly with *t*, that is,

$$\frac{{x}_{i}(t)-{x}_{i}({t}_{1})}{{x}_{i}({t}_{2})-{x}_{i}({t}_{1})}=\frac{t-{t}_{1}}{{t}_{2}-{t}_{1}}$$

(6)

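for all *i* and any *t* between *t*_{1} and *t*_{2}. With this assumption, it is possible to compute ∂*y*/∂*x*_{*i*} over the range of *x*_{*i*} between *x*_{*i*1} and *x*_{*i*2}. More generally, it is sufficient to assume that the covariates change in proportion to one another, that is,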

$$\frac{{x}_{i}(t)-{x}_{i}({t}_{1})}{{x}_{i}({t}_{2})-{x}_{i}({t}_{1})}=g(t),$$

(7)

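for all *i* and *t* ∈ [*t*_{1}, *t*_{2}]. Note that *g*(*t*_{1}) = 0, and *g*(*t*_{2}) = 1. With this assumption, given some intermediate value of *x*_{*i*} (i.e., a value between *x*_{*i*1} and *x*_{*i*2}), the corresponding values of all the other covariates are determined, so the function *g* itself need not be specified.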

The proportionality assumption, presented as Eq. (7), is equivalent to assuming that the curve between **x**(*t*_{1}) and **x**(*t*_{2}) is a straight line. Lacking information about the true path between the two data points, this linear path is justified by the principle of Occam’s razor.^{2}
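As a small worked illustration (a hypothetical example, not one of the applications in this article), consider the nonadditive function *y* = *x*_{1}*x*_{2}. Along the straight-line path implied by Eq. (7), the second covariate increases linearly while the first moves from *x*_{11} to *x*_{12}, so Eq. (5) gives

$${c}_{1}={\int}_{{x}_{11}}^{{x}_{12}}{x}_{2}\,d{x}_{1}=({x}_{12}-{x}_{11})\frac{{x}_{21}+{x}_{22}}{2},\qquad {c}_{2}={\int}_{{x}_{21}}^{{x}_{22}}{x}_{1}\,d{x}_{2}=({x}_{22}-{x}_{21})\frac{{x}_{11}+{x}_{12}}{2},$$

and therefore

$${c}_{1}+{c}_{2}={x}_{12}{x}_{22}-{x}_{11}{x}_{21}={y}_{2}-{y}_{1}.$$

The two effects add up exactly to the change in *y*, with no interaction term, even though *y* is a product rather than a sum of its covariates.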

Eq. (7) can be easily adapted to specific decomposition problems. The assumption of proportionality can be applied to the covariates in their original scale or to some transformations thereof. For example, we may assume proportional changes in log *x*_{*i*} rather than in *x*_{*i*} itself, as is done for death rates in Eq. (11).

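With the simple assumption of Eq. (7), the *c*_{*i*}’s in Eq. (5) can be found by numerical integration (i.e., by dividing the path from **x**(*t*_{1}) to **x**(*t*_{2}) into *N* subintervals and summing the effects computed for each subinterval). Writing *ĉ*_{*i*} for the resulting estimate of *c*_{*i*}, the accuracy of the numerical integration can be checked with the proportional error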

$$\epsilon =\left|\frac{\sum _{i=1}^{n}{\widehat{c}}_{i}}{{y}_{2}-{y}_{1}}-1\right|,$$

(8)

and we can select a value of *N* that makes ε practically zero.
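The numerical integration can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' own program: the function names, the midpoint scheme used within each subinterval, and the toy function *y* = *x*_{1}*x*_{2} are all choices made here for illustration.

```python
import numpy as np

def line_integral_decomposition(f, x1, x2, n_steps=20):
    """Approximate the effects c_i of Eq. (5) along the straight path of Eq. (7).

    f       : callable mapping a 1-D array of covariates to a scalar y
    x1, x2  : covariate vectors of the two populations
    n_steps : number of subintervals N used for the numerical integration
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    step = (x2 - x1) / n_steps            # change in each covariate per subinterval
    effects = np.zeros_like(x1)
    for k in range(n_steps):
        midpoint = x1 + (k + 0.5) * step  # all covariates at the middle of subinterval k
        for i in range(x1.size):
            upper = midpoint.copy()
            lower = midpoint.copy()
            upper[i] += step[i] / 2
            lower[i] -= step[i] / 2
            # change in y when only covariate i moves across this subinterval
            effects[i] += f(upper) - f(lower)
    return effects

def proportional_error(effects, y1, y2):
    """Proportional error of Eq. (8)."""
    return abs(effects.sum() / (y2 - y1) - 1.0)

def toy(x):
    # hypothetical nonadditive function y = x_1 * x_2
    return x[0] * x[1]

x_a = np.array([1.0, 2.0])  # hypothetical first population
x_b = np.array([3.0, 5.0])  # hypothetical second population
effects = line_integral_decomposition(toy, x_a, x_b, n_steps=20)
eps = proportional_error(effects, toy(x_a), toy(x_b))
print(effects, eps)
```

For this toy function the two effects are 7 and 6, which sum to the total change of 13, so the proportional error is zero up to rounding. For less regular functions, *N* would be increased until ε is acceptably small.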

Additional information on the trajectory, if available, will help to improve the accuracy of the assumed trajectory and, in turn, the accuracy of decomposition results. In some cases, we have observations at several intermediate points on the curve between the initial and end points. For example, if we want to decompose some demographic change between the 1950 and 2000 censuses in a country with decennial censuses, we should decompose the change in each of the five decades between 1950 and 2000 by assuming proportional changes during the 10-year period and then aggregate the five sets of decomposition results, instead of decomposing the change in the entire 50-year period directly by assuming proportional changes from 1950 to 2000. (This will be illustrated later with some empirical examples.)
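The aggregation over subperiods described above can be sketched as follows. The function names, the toy function, and the three hypothetical observation points are assumptions made for illustration only; each successive pair of observations is decomposed under the proportional-change assumption, and the results are summed.

```python
import numpy as np

def decompose_pair(f, x1, x2, n_steps=20):
    """Effects of each covariate between two observations (straight path)."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    step = (x2 - x1) / n_steps
    effects = np.zeros_like(x1)
    for k in range(n_steps):
        midpoint = x1 + (k + 0.5) * step
        for i in range(x1.size):
            upper = midpoint.copy()
            lower = midpoint.copy()
            upper[i] += step[i] / 2
            lower[i] -= step[i] / 2
            effects[i] += f(upper) - f(lower)
    return effects

def decompose_series(f, observations, n_steps=20):
    """Sum the decompositions of all successive pairs of observations."""
    observations = np.asarray(observations, dtype=float)
    total = np.zeros(observations.shape[1])
    for start, end in zip(observations[:-1], observations[1:]):
        total += decompose_pair(f, start, end, n_steps)
    return total

def toy(x):
    # hypothetical nonadditive function y = x_1 * x_2
    return x[0] * x[1]

# Hypothetical path: x_1 changes first, then x_2 (three observation points).
observed = np.array([[1.0, 2.0], [3.0, 2.0], [3.0, 5.0]])
aggregated = decompose_series(toy, observed)
direct = decompose_pair(toy, observed[0], observed[-1])
print(aggregated, direct)
```

Both results sum to the same total change, but the allocation between the two covariates differs: the aggregated decomposition follows the observed intermediate point, whereas the direct decomposition assumes a straight path between the end points.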

A few words are needed about interaction effects. If a dependent variable is not an additive function of its covariates, the effect of an individual covariate often depends on values of the other covariates. In regression analysis, this interdependency among the effects of different covariates is called *interaction*. Similarly, in some previous decomposition studies, if the sum of covariate effects (also called *main effects*) did not match *y*_{2} − *y*_{1}, the discrepancy was called an *interaction effect.* However, such interaction effects are more difficult to interpret than simple main effects; furthermore, they represent an incomplete separation of the contributions of individual covariates to the overall change (or difference) in a dependent variable. For these reasons, it has usually been considered desirable to reallocate the interaction effect among the main effects (Das Gupta 1993: chap. 1).

In previous studies, decomposition was based on a *discrete* change of each covariate from the first population to the second, while holding constant the other covariates at certain levels. In order to avoid interaction effects, these constant values must be selected in such a way that the main effects add up exactly to *y*_{2} − *y*_{1}. However, the method proposed here relies on an assumption of *gradual* changes in the covariates, which makes it impossible for any interaction effect to enter the decomposition equation. From this viewpoint, interaction effects in decomposition studies are merely the result of insufficient information. If we know all details of the (continuous) transition process between the two populations, the change from *y*_{1} to *y*_{2} can be described in the additive format of Eq. (5), which fully separates the effects of individual covariates without any interaction component.
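To make the contrast concrete, the following sketch shows how a discrete-change calculation can leave an interaction term. It changes one covariate at a time while holding the others at the values of the first population, which is only one of several possible choices of constant values; the function name and the toy data are hypothetical.

```python
import numpy as np

def discrete_change_effects(f, x_ref, x_other):
    """Main effects from changing one covariate at a time, with the others
    held at their values in the reference population, plus the discrepancy
    (interaction effect) between their sum and the total difference."""
    x_ref = np.asarray(x_ref, dtype=float)
    x_other = np.asarray(x_other, dtype=float)
    y_ref = f(x_ref)
    main_effects = np.zeros_like(x_ref)
    for i in range(x_ref.size):
        changed = x_ref.copy()
        changed[i] = x_other[i]           # discrete change in covariate i only
        main_effects[i] = f(changed) - y_ref
    interaction = f(x_other) - y_ref - main_effects.sum()
    return main_effects, interaction

def toy(x):
    # hypothetical nonadditive function y = x_1 * x_2
    return x[0] * x[1]

main_effects, interaction = discrete_change_effects(toy, [1.0, 2.0], [3.0, 5.0])
print(main_effects, interaction)
```

With these toy values the main effects are 4 and 3, leaving an interaction effect of 6 out of a total difference of 13. Under the gradual-change assumption the same difference is allocated entirely to the two covariates.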

A distinction can be made between two kinds of decomposition problems to which the present method can be applied. In the first case, the dependent variable and its covariates change gradually between two sets of observations, typically as a function of time, and thus the variable *t* refers to some real dimension of change. In the second case, however, the two sets of observations refer to populations that are qualitatively different (e.g., males and females), and thus *t* is merely a hypothetical underlying dimension. In applying this method to the latter case, we assume implicitly that *y* and **x** change gradually between two qualitatively different populations—as if they were changing over time—even though actual changes are discrete, not continuous.

Furthermore, decomposition analyses (here and in general) can be classified according to the form of the underlying functional relationship. Although deterministic relationships are assumed in Eqs. (1)–(5) and (8), the proposed method can easily be extended to probabilistic models. For example, if *y = ŷ + e*, where *ŷ* is some function of **x** and *e* is an observed value of some random variable, then a change or difference in *y* can be decomposed into *c*_{*i*}’s and the change or difference in *e*.

This section presents two applications of the proposed method to show that it can be used in different ways for different purposes. In Example 1, we decompose changes over time in three summary measures of mortality (the median, mean, and standard deviation of ages at death in the life table) into effects attributable to changes in death rates at various ages. In most previous decompositions of life table quantities, the sole variable of interest was the mean age at death, or life expectancy at birth *e*_{0} (Arriaga 1984; Carlson 2006; Pollard 1982, 1988; Ponnapalli 2005; Vaupel and Canudas Romo 2003). Example 1 shows that the proposed method can be applied not only to the mean but also to the median, the standard deviation, and other summary measures, such as the interquartile range. (However, it would not be appropriate to apply this method in a similar way to the modal age at death, which is not a differentiable function of age-specific death rates.)

In Example 2, the regional difference in self-reported health between Minnesota and Mississippi is decomposed into effects of some socioeconomic and behavioral characteristics. The dimension *t* is time in Example 1 but a hypothetical dimension in Example 2. Example 1 deals with deterministic relationships between the dependent variable and its covariates, but Example 2 illustrates an application of the proposed method to a relationship that includes a stochastic error term.^{3}

After World War II, the level of mortality in Japan declined at an unprecedented pace. The age distribution of deaths in the life table has shifted to older ages, raising the median and mean ages at death considerably. In addition, the mortality decline in Japan reduced the standard deviation of ages at death by lowering the proportion of deaths at young ages and concentrating deaths into old ages (Wilmoth and Horiuchi 1999).

However, the increases in the median and mean and the decrease in the standard deviation proceeded differently (Figure 1). Although changes in these measures generally slowed, the deceleration was most pronounced for the decline of the standard deviation, followed by the rise of the mean, but was modest for the rise of the median age.

We used the decomposition method in order to investigate reasons for these somewhat different trends among the median, mean, and standard deviation. Some methods were developed previously for decomposing changes or differences in life expectancy, but to our knowledge no comparable technique has been proposed for the median or standard deviation. The present method can be used flexibly to decompose various summary measures of the life table into effects of age-specific, or age- and cause-specific death rates.

We decomposed changes in these three measures into effects due to changes in death rates by single years of age (0, 1, 2, . . . , 102, 103, and 104+). The method was applied to each of the 54 pairs of successive years between 1950 and 2004. Changes in the *logarithms* of the 105 age-specific death rates between two successive years were assumed to be proportional to each other.^{4}

Thus, the decomposition of changes in life expectancy, for example, was based on the following equations. The effect of the death rate for the *i*-th age group on the change in life expectancy *e*_{0}(*t*) from the period *t*_{1} to the next period *t*_{2} can be calculated as

$${c}_{i}={\int}_{{M}_{i}({t}_{1})}^{{M}_{i}({t}_{2})}\frac{\partial {e}_{0}(t)}{\partial {M}_{i}(t)}d{M}_{i}(t),$$

(9)

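where *M*_{*i*}(*t*) is the death rate for the *i*-th age group in period *t*. Life expectancy is a function of the age-specific death rates,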

$${e}_{0}(t)=f({M}_{1}(t),{M}_{2}(t),\dots ,{M}_{n}(t)),$$

(10)

where *f* indicates an algorithm that transforms the vector of death rates into the value of life expectancy at birth. The numerical integration relies on the following assumption:

$$\frac{\text{ln}{M}_{i}(t)-\text{ln}{M}_{i}({t}_{1})}{\text{ln}{M}_{i}({t}_{2})-\text{ln}{M}_{i}({t}_{1})}=\frac{\text{ln}{M}_{j}(t)-\text{ln}{M}_{j}({t}_{1})}{\text{ln}{M}_{j}({t}_{2})-\text{ln}{M}_{j}({t}_{1})}$$

(11)

for any pair of age groups *i* and *j* and for any *t* between *t*_{1} and *t*_{2}. The decomposition of changes in the median age and that of changes in the standard deviation were done in a similar manner.
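Eqs. (9)–(11) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the computation used for Table 1: the life table algorithm (constant death rate within each single-year age group, open-ended last group), the Gompertz-like death rates, and all function names are hypothetical choices made here.

```python
import numpy as np

def life_expectancy(rates):
    """e_0 from death rates by single years of age (plays the role of f in Eq. (10)).

    Assumption of this sketch: the death rate is constant within each age
    group, and the last age group is open-ended.
    """
    rates = np.asarray(rates, dtype=float)
    # survivors to the start of each age group (radix 1)
    survivors = np.exp(-np.concatenate(([0.0], np.cumsum(rates[:-1]))))
    # person-years lived in the closed age groups
    closed = survivors[:-1] * (1.0 - np.exp(-rates[:-1])) / rates[:-1]
    # person-years lived in the open-ended group
    open_ended = survivors[-1] / rates[-1]
    return closed.sum() + open_ended

def decompose_log_rates(f, m1, m2, n_steps=20):
    """Effects of each death rate, with log rates changing proportionally (Eq. (11))."""
    z1 = np.log(np.asarray(m1, dtype=float))
    z2 = np.log(np.asarray(m2, dtype=float))
    step = (z2 - z1) / n_steps
    effects = np.zeros_like(z1)
    for k in range(n_steps):
        midpoint = z1 + (k + 0.5) * step
        for i in range(z1.size):
            upper = midpoint.copy()
            lower = midpoint.copy()
            upper[i] += step[i] / 2
            lower[i] -= step[i] / 2
            effects[i] += f(np.exp(upper)) - f(np.exp(lower))
    return effects

# Hypothetical death rates for ages 0, 1, ..., 103, and 104+ (105 groups).
ages = np.arange(105)
rates_t1 = 0.0005 * np.exp(0.09 * ages)
rates_t2 = rates_t1 * (0.7 + 0.002 * ages)   # larger declines at younger ages

effects = decompose_log_rates(life_expectancy, rates_t1, rates_t2, n_steps=20)
change = life_expectancy(rates_t2) - life_expectancy(rates_t1)
eps = abs(effects.sum() / change - 1.0)      # proportional error, Eq. (8)
print(change, effects.sum(), eps)
```

The same routine can decompose another summary measure by passing a different function in place of `life_expectancy`, provided that the measure is a differentiable function of the death rates.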

The number of intervals (*N*) used for numerical integration was set at 20 for each pair of successive years. The proportional errors of the various decompositions (ε’s) were very small: the maximum ε for the 54 pairs of period life tables was 0.1% for the median, 0.005% for the mean, and 0.001% for the standard deviation.

In Table 1, the decomposition results are aggregated for three 18-year time periods and four broad age categories: 0 (infants), 1–14 (children), 15–64 (adults), and 65 and older (the elderly). The 12 (3 × 4) effects for each measure in Table 1 are actually a summary of 5,670 (54 × 105) computed effects. The major findings of this analysis can be summarized as follows: (1) for each of the three measures, the effects of changing infant and child mortality diminished over time, resulting in decelerating rates of change in the summary measure; (2) nevertheless, the median and mean ages continued to rise noticeably, thanks to the growing significance of mortality reduction at older ages; (3) in contrast, the trend in the standard deviation virtually leveled off, since the effect of old-age mortality reductions on the standard deviation was small or even positive (because mortality reduction at older ages stretches out the upper tail of the distribution of ages at death); and (4) the rise of the median age in earlier periods was less pronounced than the rise of the mean age, since reductions in infant and child mortality affected the median much less than the mean. In summary, the decomposition analysis shows that age-specific death rates affected trends in these three measures of mortality in noticeably different ways.

Health conditions differ substantially by region. For instance, the proportion of residents whose health conditions are reported as “fair” or “poor” varies among U.S. states. The age-adjusted proportion for adults above age 18 in 2003–2005 ranges from 10.9% in Minnesota and New Hampshire to 23.1% in Mississippi and West Virginia (National Center for Health Statistics 2007). What factors account for the difference between, for example, Minnesota and Mississippi?

In order to investigate regional differences in health status, we assume that the age-adjusted proportion of population in the state whose health conditions are reportedly fair or poor, denoted by θ, can be expressed as

$$\theta =\frac{\exp (\mathbf{x}\boldsymbol{\beta }+e)}{1+\exp (\mathbf{x}\boldsymbol{\beta }+e)},$$

(12)

where **x** is a row vector of covariates including a constant of 1, **β** is a column vector of their coefficients, and *e* is an independent random variable that is normally distributed with a mean of 0 and the same variance for each state.

State-level data on self-reported health as well as some socioeconomic and lifestyle characteristics were downloaded from the Web sites of the U.S. Census Bureau (2007) and the National Center for Health Statistics (2007). The regression coefficients were estimated from data for the 50 states and the District of Columbia around 2005 by minimizing the squared errors of the following model: logit(θ) = **xβ** + *e*. (Although this estimation procedure appears similar to the usual form of logistic regression, it is fundamentally different because the dependent variable here is not binary but continuous between 0 and 1, and the coefficients are not estimated on the basis of maximum likelihood and binomial distributions.)

Nine covariates were included in the initial model of Eq. (12), but the correlation matrix of those covariates included a number of notably high values. After stepwise removal of variables whose coefficients seemed to be strongly affected by the multicollinearity problem, four covariates remained in the final model (*R*^{2} = .89): the proportion of persons aged 25 years and older who completed high school (including equivalency), the proportion of persons aged 18–64 who are not covered by health insurance, the age-adjusted proportion of those aged 18 and older who are currently smoking, and the age-adjusted proportion of those aged 20 and older who are obese.

Results of the regression analysis are shown in the rightmost column of Table 2. In terms of the *p* value, the strongest among the four factors is the proportion of adults who completed high school. This probably reflects substantial impacts of socioeconomic status on health through various pathways (other than health insurance coverage, smoking, and obesity) as well as contextual effects on health of residing in well-to-do states.

Table 2. Decomposition of Difference Between Minnesota and Mississippi in Self-Reported Health Status, 2003–2005

The decomposition analysis was applied to the difference in θ between Minnesota and Mississippi using the four-covariate model of Eq. (12). Splitting the difference into six intervals was sufficient to make the proportional error as low as 0.001%. Table 2 shows that about 95% of the difference is “explained” by the four factors. More than half of the difference is attributed to the proportion who completed high school, partly because of the large difference in the proportion between the two states (90.9% in Minnesota and 78.5% in Mississippi) and partly because of its relatively large regression coefficient. This analysis confirms the well-known socioeconomic effects on health and indicates that the difference between Minnesota and Mississippi is no exception.
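A decomposition of this kind, in which the error term is treated as one more additive component of the linear predictor, can be sketched as follows. All numbers below (coefficients, covariate values, and residuals) are hypothetical placeholders, not the estimates or data underlying Table 2, and the function names are chosen here for illustration.

```python
import numpy as np

# Hypothetical coefficients (intercept first); NOT the estimates in Table 2.
BETA = np.array([0.0, -3.0, 2.0, 1.5, 1.0])

def fair_or_poor(z):
    """Eq. (12): z holds the four covariates followed by the error term e."""
    linear = BETA[0] + z[:-1] @ BETA[1:] + z[-1]
    return 1.0 / (1.0 + np.exp(-linear))

def decompose(f, x1, x2, n_steps=20):
    """Effects of each element of x along a straight path from x1 to x2."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    step = (x2 - x1) / n_steps
    effects = np.zeros_like(x1)
    for k in range(n_steps):
        midpoint = x1 + (k + 0.5) * step
        for i in range(x1.size):
            upper = midpoint.copy()
            lower = midpoint.copy()
            upper[i] += step[i] / 2
            lower[i] -= step[i] / 2
            effects[i] += f(upper) - f(lower)
    return effects

# Hypothetical states: high school completion, uninsured, smoking, obesity, residual e.
state_a = np.array([0.90, 0.10, 0.18, 0.24, -0.02])
state_b = np.array([0.80, 0.20, 0.25, 0.31, 0.03])

effects = decompose(fair_or_poor, state_a, state_b)
difference = fair_or_poor(state_b) - fair_or_poor(state_a)
explained_share = effects[:-1].sum() / difference   # share due to the four covariates
print(effects, difference, explained_share)
```

The last element of `effects` is the part of the difference attributed to the change in the error term; the remaining elements are the effects of the four covariates, and their share of the total corresponds to the "explained" proportion.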

We now try to clarify characteristics of the proposed method through a comparison with previous ones. First, as mentioned earlier, there are two fundamentally different types of decomposition analysis. The distinction, originally made descriptively by Das Gupta (1991, 1993), can be expressed more formally as follows. In the first type, the variable of interest is a function of multiple variables, that is,

$$y(t)=f({x}_{1}(t),\dots ,{x}_{n}(t)),$$

(13)

and the decomposition analysis expresses a change of the dependent variable (*y*) as the sum of effects of its covariates (*x*_{*i*}’s). In the second type, the variable of interest is the mean of a function *f* of the covariates, taken over the population distribution of the covariates.

If the covariates are continuous variables, the mean of *y* can be expressed formally as

$$\overline{y}(t)=\int \dots \int f({x}_{1},\dots ,{x}_{n};t)w({x}_{1},\dots ,{x}_{n};t)d{x}_{1}\dots d{x}_{n},$$

(14)

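where *w*(*x*_{1}, . . . , *x*_{*n*}; *t*) is the joint density of the covariates in the population at *t*. If the covariates are categorical variables, the mean can be expressed as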

$$\overline{y}(t)=\sum _{{j}_{1}=1}^{{k}_{1}}\dots \sum _{{j}_{n}=1}^{{k}_{n}}{f}_{{j}_{1}\dots {j}_{n}}(t){w}_{{j}_{1}\dots {j}_{n}}(t),$$

(15)

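where *f*_{*j*1…*jn*}(*t*) and *w*_{*j*1…*jn*}(*t*) are, respectively, the value of the function and the proportion of the population in the subgroup defined by categories *j*_{1}, . . . , *j*_{*n*} of the *n* covariates, and *k*_{*i*} is the number of categories of the *i*-th covariate.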

The goal of the second type of decomposition analysis is to separate the change (or difference) in *ȳ* into two distinct parts: a component due to changes in the functional relationship, *f*, and another one due to changes in the joint distribution of the *x*_{*i*}’s, *w*.

Since the proposed method belongs to the first type, we should compare it only with others of the same type. As described earlier, the method is based on the assumption that the change in *y* is produced by gradual changes of its covariates, but in all previous methods, the effect of a covariate is calculated as the change in *y* produced by a discrete change of the covariate from *t*_{1} to *t*_{2}, while holding constant the other covariates at certain values. Thus, different choices of constant values of the other covariates lead to different methods, which may be grouped as *discrete-change methods* as opposed to the *continuous-change method* proposed here. We will discuss four different discrete-change approaches (labeled here as Methods A, B, C, and D) adopted in previous decomposition studies. All of them are widely applicable methods, and methods that are limited to particular dependent variables are not considered here.

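In Method A (Kitagawa’s method), one of the two populations is chosen as the reference population. The effect of the *i*-th covariate, *c*_{*i*}, is calculated as the change in *y* produced by changing *x*_{*i*} from its value in the reference population to its value in the other population, while holding the other covariates constant at their values in the reference population. The effects calculated in this way generally do not add up to *y*_{2} − *y*_{1}, and the discrepancy is treated as an interaction effect. This method has three versions, depending on which of the two populations is taken as the reference (Methods A1 and A2) or whether the two results are averaged (Method A3).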

However, interaction effects are not only difficult to interpret; they also make the exercise unsatisfying because the purpose of a decomposition is to separate the effects of individual covariates. Method B avoids an interaction effect by changing values of covariates in a certain order. This idea is called *stepwise replacement* (Andreev, Shkolnikov, and Begun 2002). Effects of covariates are estimated in the order of *x*_{1}, *x*_{2}, . . . , *x*_{*n*}, and unlike Method A, once the value of *x*_{*i*} has been changed from *x*_{*i*1} to *x*_{*i*2}, it is kept at *x*_{*i*2} when the effects of the subsequent covariates are calculated.

For a given data set, this method also has three versions: ascending order from *x*_{1} to *x*_{*n*} (Method B1), descending order from *x*_{*n*} to *x*_{1} (Method B2), and the average of the two (Method B3).

However, it is not always possible to arrange covariates in a meaningful order. If it is impossible to select one particular sequence of the covariates, Method C (Das Gupta’s method) seems more appropriate: the stepwise replacement is carried out for each of all mathematically possible sequences (permutations), and their average is taken as the final decomposition result. Andreev et al. (2002) found that this algorithm is equivalent to the method developed by Das Gupta (1999).
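Stepwise replacement and its average over all orderings can be sketched as follows. The function names and the toy data are hypothetical, and the code is meant only to illustrate the two ideas, not to reproduce any published implementation.

```python
import numpy as np
from itertools import permutations

def stepwise_replacement(f, x1, x2, order):
    """Method B: replace covariates one at a time in the given order."""
    current = np.asarray(x1, dtype=float).copy()
    x2 = np.asarray(x2, dtype=float)
    effects = np.zeros_like(current)
    y_prev = f(current)
    for i in order:
        current[i] = x2[i]        # once replaced, covariate i keeps its second value
        y_new = f(current)
        effects[i] = y_new - y_prev
        y_prev = y_new
    return effects

def average_over_permutations(f, x1, x2):
    """Method C: average of stepwise replacement over all n! orderings."""
    orders = list(permutations(range(len(x1))))
    total = np.zeros(len(x1))
    for order in orders:
        total += stepwise_replacement(f, x1, x2, order)
    return total / len(orders)

def toy(x):
    # hypothetical nonadditive function y = x_1 * x_2
    return x[0] * x[1]

x_a, x_b = [1.0, 2.0], [3.0, 5.0]
ascending = stepwise_replacement(toy, x_a, x_b, order=(0, 1))
descending = stepwise_replacement(toy, x_a, x_b, order=(1, 0))
averaged = average_over_permutations(toy, x_a, x_b)
print(ascending, descending, averaged)
```

Each ordering adds up exactly to the total difference, but the allocation depends on the order. Because the number of orderings is *n*!, exhaustive averaging as written here becomes impractical when there are many covariates.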

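Method D (the delta method) implicitly assumes continuous changes but uses a single discrete change for actual calculation. Although the partial derivative of *y* with respect to *x*_{*i*} varies between *t*_{1} and *t*_{2}, this method evaluates the derivative at a single point and multiplies it by the total change in *x*_{*i*}.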

In order to understand differences among these methods further, we applied them to the same data set and compared the decomposition results. The mortality data for Japanese women in Example 1 were used, and the *e*_{0} change between 1950 and 2004 was decomposed using each method in two different ways: by examining changes in each pair of successive calendar years and then aggregating 54 decompositions across the entire period (Table 3), and by examining the difference between 1950 and 2004 without using data for calendar years between them (Table 4). In each case, effects of 105 single-year age groups were calculated and then aggregated for four broad age categories as in Table 1.^{7}

Table 3. Comparison of Different Methods: Sum of 54 Decompositions of Annual Changes in the Expectation of Life at Birth for Japanese Females, 1950–2004^{a}

Table 4. Comparison of Different Methods: Single Decomposition of the Difference Between 1950 and 2004 in the Expectation of Life at Birth for Japanese Females^{a}

Table 3 shows that the aggregated results of 54 annual-change decompositions using Methods A3, B3, C, D, and the continuous-change method are nearly identical. The selection of reference population in Method A (A1 and A2) and the reversal of order of stepwise replacement in Method B (B1 versus B2) make some difference, but the averaging out of those differences (A3 and B3) makes the results of Methods A and B very close to those of the other methods. Table 3 seems to suggest that if those methods are used for decomposition of a small change, they tend to produce similar results.

Table 4 shows results of a single decomposition (of the difference between 1950 and 2004) and compares them with the average of the five nearly identical results in Table 3, which may be considered good proxies of “true” effects. Differences among the methods in Table 4 are larger than those in Table 3, suggesting that the choice of decomposition method may make nonnegligible differences if applied to relatively large changes in a long period. The estimated effects differ notably between A1 and A2, and also between B1 and B2, indicating that the decomposition result may be sensitive to the selection of the reference population for Method A and the order of stepwise replacement for Method B. In terms of the index of dissimilarity, the most accurate results were produced by the continuous-change method, but the results of Methods B1, B3, C, and D seem fairly close to the “true” result as well.^{8}

However, the comparative study in Table 4 does not necessarily suggest that the continuous-change method always produces the most accurate results. For example, if actual changes follow the stepwise-replacement scenario more closely than the proportional-change scenario, results of Method B should be more accurate than those of the continuous-change method, provided that the right sequence of covariates is chosen.

The discrete-change decomposition methods can be interpreted in terms of the line integral model. Method B can be considered as a special version of the continuous-change decomposition method, with the assumption that the point in *n*-dimensional space (as defined by vector **x**) follows a stepwise trajectory with *n* − 1 orthogonal turns: first, the point moves from its initial location along the *x*_{1} axis, then turns perpendicularly and moves along the *x*_{2} axis, and so on, until it reaches its final location by moving along the *x** _{n}* axis. Method C is the average of results for all possible stepwise paths. Methods A and D can be considered as attempts to evaluate the line integral by replacing the function that yields the dependent variable with an additive and a linear approximation, respectively. The error of approximation is regarded as the interaction effect or the residual. No assumption about the trajectory is needed for Methods A and D because if the function is additive or linear,

In this article, we proposed a method for decomposing a change or difference in a function of multiple variables. The method relies on the assumption that covariates change gradually along an actual or hypothetical dimension and the dependent variable is a differentiable function of the covariates. It has a few major theoretical and practical advantages, as summarized below.

The proposed method is based on a mathematical model (the line integral model of decomposition) that justifies the additivity of covariate effects and the elimination of interaction effects. In decomposition analysis, the effects are assumed to be additive, even though the dependent variable is usually a nonadditive function of its covariates. In the previous literature, it was not fully clear whether this apparent paradox was logically justifiable. The line integral model provides a theoretical foundation for decomposition analysis.

The model also implies that interaction terms should be eliminated, not merely because they complicate decomposition results, but because in a model of continuous change, they do not exist. From this viewpoint, interaction effects in previous decomposition studies may be regarded as the result of incomplete information about patterns of change between observation points.

A few general methods for decomposing a change or difference in a multivariate function were developed previously, but they have some practical limitations. Method A (Kitagawa’s method) produces an interaction term, Method D (delta method) produces a residual term, and neither term is easily interpretable. Method B (stepwise replacement) should not be used if covariates cannot be ordered in a meaningful sequence. Furthermore, a logically meaningful sequence is not necessarily justifiable as an appropriate order of stepwise replacement: for example, vital rates at younger ages do not necessarily tend to change earlier (or later) than those at older ages. Method C (Das Gupta’s method) is based on permutations of the covariates, which may require an astronomical amount of memory and computation if the number of covariates is large.^{9}

The proposed method has none of these limitations. It has no interaction term and no nonnegligible residual term, and it does not require a meaningful ordering of covariates. It can easily handle data with many covariates because the amount of computation increases linearly with the number of covariates, rather than geometrically or in proportion to the number of their permutations (see the Appendix for details on the amount of computation).

Its major difference from previous methods is the assumption that covariates change gradually along an actual or hypothetical dimension. This assumption fits some decomposition problems very naturally. For example, this method seems highly appropriate for decomposing time trends if relevant variables can be reasonably assumed to change gradually over time. On the other hand, if some covariates actually change in a noticeably discrete manner, the assumption is not compatible with reality. This could be a limitation of the proposed method in certain cases. However, although vital events (such as birth and death) are discrete changes at the individual level, many of the corresponding measures at the aggregate level (such as birth rates and death rates) can be reasonably approximated as continuous variables.

The method requires an additional assumption about the trajectory of changes between the two data points. Recommended as the “default” is the straight-line path—that is, the assumption that increments of the covariates are proportional to each other. The validity of this assumption should vary among research subjects and data.

In addition, our empirical results (Tables 3 and 4) suggest (as seems logically reasonable) that it is better to aggregate decomposition results for relatively short time intervals than to carry out one decomposition for the entire period if data for some intermediate time points in the decomposition period are available.
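The effect of aggregating over subintervals can be seen in a small Python sketch. It assumes a midpoint-rule approximation of the straight-line decomposition between each pair of observation points. The function names, the toy function y = x_1 × x_2, and the three observation points are hypothetical and serve only as an illustration.

```python
def line_integral_effects(f, x_start, x_end, N=20):
    """Midpoint-rule sketch of a straight-line decomposition between two points."""
    n = len(x_start)
    dx = [(b - a) / N for a, b in zip(x_start, x_end)]
    effects = [0.0] * n
    for k in range(1, N + 1):
        # All covariates at the midpoint of the k-th interval.
        mid = [a + (k - 0.5) * d for a, d in zip(x_start, dx)]
        for i in range(n):
            upper, lower = list(mid), list(mid)
            upper[i] = x_start[i] + k * dx[i]        # end of interval k for x_i
            lower[i] = x_start[i] + (k - 1) * dx[i]  # start of interval k for x_i
            effects[i] += f(upper) - f(lower)
    return effects


def aggregated_effects(f, series, N=20):
    """Sum the decompositions of each pair of successive observation points."""
    totals = [0.0] * len(series[0])
    for x_start, x_end in zip(series, series[1:]):
        for i, effect in enumerate(line_integral_effects(f, x_start, x_end, N)):
            totals[i] += effect
    return totals


def product(x):
    return x[0] * x[1]


# Hypothetical observations: x_1 changes first, then x_2.
series = [[1.0, 1.0], [2.0, 1.0], [2.0, 2.0]]

single = line_integral_effects(product, series[0], series[-1])  # endpoints only
aggregated = aggregated_effects(product, series)                # uses midpoint data
```

Both results sum to the total change of 3, but the single decomposition splits it evenly (about 1.5 and 1.5), whereas the aggregated version attributes about 1 to the first covariate and 2 to the second. If the intermediate observation reflects the actual path, the aggregated result is the closer description of it.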

We are grateful to Juha Alho, Ronald Lee, and the anonymous reviewers for comments on earlier versions of this article. Joel E. Cohen gave us a useful technical suggestion. Supplementary documents for this article (including sensitivity analyses, additional examples, and a MATLAB program) are available online at http://www.demog.berkeley.edu/~jrw/Papers/decomp.suppl.pdf.

The right side of Eq. (5) can be approximated by numerical integration. For each covariate *x _{i}*, the range between

$$\Delta {x}_{i}=({x}_{i2}-{x}_{i1})/N.$$

(A1)

In order to change the value of *x _{i}* in the

$${\mathbf{x}}_{ik+}=[{x}_{j}\mid {x}_{j}={x}_{ik+}\ \text{if}\ j=i;\ {x}_{j}={x}_{ik\bullet}\ \text{if}\ j\ne i]$$

and

$${\mathbf{x}}_{ik-}=[{x}_{j}\mid {x}_{j}={x}_{ik-}\ \text{if}\ j=i;\ {x}_{j}={x}_{ik\bullet}\ \text{if}\ j\ne i],$$

(A2)

where *x** _{ik+}* =

If *N* is large, we have

$${y}_{2}-{y}_{1}\cong \sum _{i=1}^{n}{\widehat{c}}_{i},\quad \text{where}\quad {\widehat{c}}_{i}=\sum _{k=1}^{N}\left\{f({\mathbf{x}}_{ik+})-f({\mathbf{x}}_{ik-})\right\}.$$

(A3)
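The numerical integration in Eqs. (A1) to (A3) can be sketched in Python as follows. This is an illustrative reading, not the authors' MATLAB function. It assumes that, in the *k*th interval, the "+" and "−" values of *x_i* are the upper and lower ends of that interval, and that every other covariate is held at the midpoint of its own *k*th interval. The function name and the toy examples are hypothetical.

```python
def continuous_change_decomposition(f, x1, x2, N=20):
    """Approximate the effect of each covariate on f(x2) - f(x1).

    f  : function taking a list of covariate values and returning a number
    x1 : covariate values in the first population
    x2 : covariate values in the second population
    N  : number of equal intervals along the straight-line path
    """
    n = len(x1)
    dx = [(b - a) / N for a, b in zip(x1, x2)]  # Eq. (A1)
    effects = [0.0] * n
    for k in range(1, N + 1):
        # Assumed midpoint values of all covariates in interval k.
        mid = [a + (k - 0.5) * d for a, d in zip(x1, dx)]
        for i in range(n):
            upper = list(mid)
            lower = list(mid)
            upper[i] = x1[i] + k * dx[i]        # assumed "+" value of x_i
            lower[i] = x1[i] + (k - 1) * dx[i]  # assumed "-" value of x_i
            effects[i] += f(upper) - f(lower)   # inner sum of Eq. (A3)
    return effects


def product(x):
    result = 1.0
    for value in x:
        result *= value
    return result


# Hypothetical toy data: y is the product of the covariates.
two_var = continuous_change_decomposition(product, [1.0, 2.0], [3.0, 4.0])
three_var = continuous_change_decomposition(
    product, [1.0, 2.0, 3.0], [2.0, 3.0, 5.0]
)
```

For the two-covariate product the effects are about 6 and 4, which sum to the full difference of 10. For the three-covariate product the effects sum to slightly less than the full difference of 24, and the gap shrinks as `N` increases. Each interval requires only a fixed number of function evaluations per covariate, so the work grows in proportion to the number of covariates.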

This method is computationally intensive. For Example 1, with *N* = 20, a complete life table had to be constructed 113,455 times ((20 intervals × 105 variables × 54 period-pairs) + 55 periods). Nevertheless, it took only about 8 minutes of CPU time for the MATLAB 6.1 program on a PC with 1.8 GHz, 256 MB RDRAM, and 384 MB of virtual memory to carry out the entire calculation. The MATLAB function of the proposed method is available from the first author upon request.

This research was supported by Grant R01-AG11552 from the National Institute on Aging.

^{1.}The Divisia decomposition (Divisia 1925), which is widely used in economics for analyzing changes in monetary aggregates, may be considered a simple case of this approach.

^{2.}We conducted sensitivity analyses with two sets of empirical data and found that the decomposition results were reasonably insensitive to deviations from the proportionality assumptions. Details of the sensitivity analyses are given online at http://www.demog.berkeley.edu/~jrw/Papers/decomp.suppl.pdf.

^{3.}Two other examples—a decomposition of changes in the intrinsic growth rate in Sweden into the effects due to changes in age-specific death rates and a decomposition of sex difference in the life expectancy of fruit flies into the effects of the logistic model parameters—are shown online at http://www.demog.berkeley.edu/~jrw/Papers/decomp.suppl.pdf.

^{4.}Lee and Carter (1992) adopted the same assumption in their model of mortality change.

^{5.}The original method by Kitagawa (1955), which decomposes a difference in the proportion of those who have a characteristic of interest, belongs to both of the types. It is a special case of the second type in which there is only one covariate that has a frequency distribution among *n* categories at *t*. Thus, Eq. (15) becomes a simple form,
$\overline{y}(t)=\sum _{j=1}^{n}{f}_{j}(t){w}_{j}(t)$, where *w** _{j}*(

^{6.}The other version, which does not include an interaction term, can be considered a special case of Method B3 and also Method C.

^{7.}An exception was necessary in the case of Method C. To apply this method in the standard fashion would have required computing life tables while making sequential changes in all possible permutations of 105 single-year age groups. Clearly, the computational demands of such an exercise are overwhelming. As a practical alternative, changes were introduced for all ages simultaneously within one of the four broad age groups, and life tables were computed (as for the other methods) using single-year data. Thus, this adaptation of the method took into consideration all possible orderings of changes for four broad age groups, rather than for 105 single-year age groups.

^{8.}The index of dissimilarity was calculated using the broad age categories in Table 4.

^{9.}Our MATLAB 6.1 program of Method C on a PC with 1.8 GHz, 256 MB RDRAM, and 384 MB of virtual memory worked with nine or fewer covariates, but not with 10 or more covariates because of insufficient memory. Although more sophisticated programming and enlarged virtual memory will increase the possible number of covariates, it may be difficult to use Method C even for a problem of modest size (e.g., decomposing a difference in an overall demographic measure into effects of about 20 vital rates for five-year age groups).

- Andreev EM, Shkolnikov VM, Begun AZ. “Algorithm for Decomposition of Differences Between Aggregate Demographic Measures and Its Application to Life Expectancies, Healthy Life Expectancies, Parity-Progression Ratios and Total Fertility Rates.” Demographic Research. 2002;7(article 14):499–522. Available online at http://www.demographic-research.org/Volumes/Vol7/14
- Arriaga EE. “Measuring and Explaining the Change in Life Expectancies” Demography. 1984;21:83–96.
- Canudas Romo V. Decomposition Methods in Demography. Amsterdam: Rozenberg Publishers; 2003.
- Carlson E. “Age of Origin and Destination for a Difference in Life Expectancy.” Demographic Research. 2006;14(article 11):217–36. Available online at http://www.demographic-research.org/volumes/vol14/11/14-11.pdf.
- Clogg CC. “Adjustment of Rates Using Multiplicative Models” Demography. 1978;15:523–39.
- Das Gupta P. “Decomposition of the Difference Between Two Rates When the Factors Are Nonmultiplicative With Applications to the U.S. Life Tables.” Mathematical Population Studies. 1991;3:105–25.
- Das Gupta P. Standardization and Decomposition of Rates: A User’s Manual. Washington, DC: U.S. Bureau of the Census; 1993. (Current Population Reports, Special Studies, P23-186)
- Das Gupta P. “Standardization and Decomposition of Rates From Cross-Classified Data” Genus. 1994;50:171–96.
- Das Gupta P. “Decomposing the Difference Between Rates When the Rate Is a Function of Factors That Are Not Cross-Classified” Genus. 1999;55:9–26.
- Divisia F. “L’Indice monétaire et la théorie de la monnaie” Revue d’Economie Politique. 1925;39:980–1008. [The price index and the theory of money]
- Glei D, Horiuchi S. “The Narrowing Sex Gap in Life Expectancy: Effects of Sex Differences in the Age Pattern of Mortality” Population Studies. 2007;61:141–59.
- Human Mortality Database (HMD). University of California, Berkeley (USA), and Max Planck Institute for Demographic Research (Germany). 2007. Available online at http://www.mortality.org.
- Keyfitz N. Introduction to the Mathematics of Population. Reading, MA: Addison-Wesley; 1968.
- Kitagawa EM. “Components of a Difference Between Two Rates” Journal of the American Statistical Association. 1955;50:1168–94.
- Lee RD, Carter LR. “Modeling and Forecasting U.S. Mortality” Journal of the American Statistical Association. 1992;87:659–71.
- Liao TF. “A Flexible Approach for the Decomposition of Rate Differences” Demography. 1989;26:717–26.
- National Center for Health Statistics. Health Data for All Ages. 2007. Available online at http://www.cdc.gov/nchs/health_data_for_al_ages.htm.
- Pletcher SD, Khazaeli AA, Curtsinger JW. “Why Do Lifespans Differ? Partitioning Mean Longevity Differences in Terms of Age-Specific Mortality Parameters” Journal of Gerontology: Biological Sciences. 2000;55:B381–B389.
- Pollard JH. “The Expectation of Life and its Relationship to Mortality” Journal of the Institute of Actuaries. 1982;109:225–40.
- Pollard JH. “On the Decomposition of Changes in Expectation of Life and Differentials in Life Expectancy” Demography. 1988;25:265–76.
- Ponnapalli KM. “A Comparison of Different Methods for Decomposition of Changes in Expectation of Life at Birth and Differentials in Life Expectancy at Birth.” Demographic Research. 2005;12(article 7):141–72. Available online at http://www.demographic-research.org/volumes/vol12/7
- Pullum TW, Tedrow LM, Herting JR. “Measuring Change and Continuity in Parity Distributions” Demography. 1989;26:485–98.
- U.S. Census Bureau. 2005 American Community Survey. 2007. Available online at http://www.census.gov.acs/www.
- Vaupel JW, Canudas Romo V. “Decomposing Demographic Change Into Direct vs. Compositional Components” Demographic Research. 2002;7:2–14.
- Vaupel JW, Canudas Romo V. “Decomposing Change in Life Expectancy: A Bouquet of Formulas in Honor of Nathan Keyfitz’s 90th Birthday” Demography. 2003;40:201–16.
- Williamson RE, Crowell RH, Trotter HF. Calculus of Vector Functions. Englewood Cliffs, NJ: Prentice Hall; 1968.
- Wilmoth JR, Deegan LJ, Lundström H, Horiuchi S. “Increase in Maximum Life Span in Sweden, 1861–1999” Science. 2000;289:2366–68.
- Wilmoth JR, Horiuchi S. “Rectangularization Revisited: Variability of Age at Death Within Human Populations” Demography. 1999;36:475–95.
- Xie Y. “An Alternative Purging Method: Controlling the Composition-Dependent Interaction in an Analysis of Rates” Demography. 1989;26:711–16.

Articles from Demography are provided here courtesy of **The Population Association of America**
