Sociol Methodol. Author manuscript; available in PMC 2010 August 16.

Published in final edited form as:

Sociol Methodol. 2009 August 1; 39(1): 185–232.

doi: 10.1111/j.1467-9531.2009.01212.x

PMCID: PMC2921856

NIHMSID: NIHMS132809

Lawrence L. Wu, Department of Sociology New York University;

Direct correspondence to Lawrence L. Wu, Department of Sociology, Puck Building, 295 Lafayette Street, 4th floor, New York University, New York, NY 10012-9605, Email: ude.uyn@uw.ecnerwal.


This paper outlines decomposition methods for assessing how exposure affects prevalence and cumulative relative risk. Let **x** denote a vector of exogenous covariates and suppose that a single dimension of time *t* governs two event processes *T*_{1} and *T*_{2}. If the occurrence of the event *T*_{1} determines entry into the risk of the event *T*_{2}, then subgroup variation in *T*_{1} will affect the prevalence of *T*_{2}, even if subgroups in the population are otherwise identical. Although researchers often acknowledge this phenomenon, the literature has not provided procedures to assess the magnitude of an exposure effect of *T*_{1} on the prevalence of *T*_{2}. We derive decompositions that assess how variation in exposure generated by direct and indirect effects of the covariates **x** affects measures of absolute and relative prevalence of *T*_{2}. We employ a parametric but highly flexible specification for the baseline hazards of the *T*_{1} and *T*_{2} processes and use the resulting parametric proportional hazard model to illustrate the direct and indirect effects of family structure when *T*_{1} is age at first sexual intercourse and *T*_{2} is age at a premarital first birth for data on a cohort of non-Hispanic white U.S. women.

Social scientists have long understood that exposure and prevalence are related: how long people are exposed to the risk of an outcome will affect the prevalence of that outcome. Consider nonmarital births, which have been increasingly prevalent in the United States and which accounted for 38 percent of all U.S. births in 2006 (Hamilton, Martin, and Ventura 2007). Declining trends in the sexual activity of U.S. teens are often hailed as promising news in efforts to curb nonmarital childbearing among teen women; conversely, a declining age at first intercourse within particular teen subpopulations is often seen as cause for concern. Implicit is a classic social scientific insight: all else being equal, later entry into sexual activity will decrease an unmarried young woman’s exposure to the risk of a birth. An implication is that in aggregate populations, the prevalence of first births prior to a first marriage will then vary with shifts in the onset of sexual activity, that is, in exposure to risk.

The above example has several components that can be readily formalized. We have posited two events of interest, with the occurrence and timing of a first event, *T*_{1}, placing individuals at risk of a second event, *T*_{2}. Variation in *T*_{1} timing will generate variation in exposure to risk of the *T*_{2} event, which in turn implies variation in the prevalence of the *T*_{2} event.

Issues of exposure and prevalence also extend naturally to the influence of covariates. For example, a substantial literature has documented that nonmarital birth risks vary with social and demographic characteristics, as does women’s age at onset of sexual activity. Covariates affecting both event processes will then have both direct and indirect effects on the second event process in ways that are similar to the classical recursive structural equation model for metric outcomes (see, e.g., Duncan 1975).

These issues also often arise in questions that are of considerable substantive and policy importance. Take, for example, the finding that women from single mother families have a higher probability of a premarital first birth than women who resided in families with two biological parents. Some of this greater prevalence will be due to increased exposure resulting from the earlier average onset of sexual activity among women residing in single mother families; likewise, some of this greater prevalence will be due to higher premarital first birth risks for these women in the period following onset of sexual activity. A natural question then is which of these is larger. This paper answers this question by providing methods that decompose the effect of a covariate into direct and indirect components, including an indirect component reflecting the influence of exposure on prevalence.

Although there have long existed methods for tracing the implications of differential exposure to risk in aggregate populations (see, e.g., Kitagawa 1955; Das Gupta 1993; Smith, Morgan, and Koropeckyj-Cox 1996), we lack corresponding methods relating exposure and prevalence for the hazard regression context and when using individual-level data. This paper fills this gap for the case in which the occurrence of one event determines entry into risk of a second event process.

Focusing attention on situations in which a first event governs entry into the risk of a second event may at first glance appear highly restrictive, but such situations are common to a number of phenomena studied by social scientists. Examples include career mobility, in which a first job necessarily precedes a second job (Tuma 1976); movement through educational institutions that typically requires completion of earlier levels before proceeding to higher levels (Mare 1980); birth parity progressions, in which a third birth is necessarily preceded by two prior births (Westoff, Potter, and Sagi 1963); and marital transitions, in which divorce can occur only after marriage (Hannan, Tuma, and Groeneveld 1978).

A consistent pattern that emerges from our empirical decompositions is that the direct effects of covariates are, in general, larger in magnitude than their corresponding indirect effects. Although these findings will for many social scientists appear unsurprising (direct effects very often dominate indirect effects), they could also be taken to suggest a smaller role of exposure than is often presumed. Because our data are observational, these findings cannot be regarded as identifying causal effects; rather, they provide estimates of the regression-adjusted association between various quantities. However, we also show below that unobservables must have highly specific characteristics if indirect effects are to dominate direct effects, a result that follows from the structure of our formal model. This implies a far smaller class of relevant unobserved confounds for these hypotheses than is usually the case.

The paper is organized as follows. We begin by stating the problem informally and comparing it with the classical recursive structural equation model. We then formalize ideas. We outline procedures for decomposing direct and indirect effects of covariates on the prevalence of the second event, which shows how variation in the timing of the first event affects prevalence of the second event. The essential ideas involve left truncating the hazard rate for the second event on the timing of the first event and decomposing the integrated hazard and survival functions for the second event into components representing the direct effects of covariates and indirect effects of exposure. These formal results apply to both observed and unobserved covariates. We then illustrate these methods using data on age at first sexual intercourse and at a premarital first birth from the 1988 National Survey of Family Growth.

We begin by comparing the classical three-variable recursive model for an uncensored metric outcome with an analogous three-variable recursive hazard model. We begin informally and then formalize details. To simplify the discussion, we consider two dichotomous events, *T*_{1} and *T*_{2}, in which there are two transitions of interest: an initial transition to *T*_{1}, and a second transition from *T*_{1} to *T*_{2}. We then present the familiar derivation for the direct effect of exogenous covariates **x** on *T*_{2}. A formally similar derivation is given for the effect of exposure on prevalence of *T*_{2} resulting from variation in *T*_{1}. The section concludes with derivations for the indirect effects of **x** on *T*_{2} prevalence.

The classical recursive model for uncensored metric outcomes can be depicted as:

$$\underset{{\mathbf{b}}_{z\mathbf{x}}}{\mathbf{x}}\underset{\searrow}{\overset{{\mathbf{b}}_{y\mathbf{x}}}{\to}}\underset{\underset{Z}{\downarrow}{b}_{zy}}{Y}$$

or equivalently as the pair of equations

$$\begin{array}{cc}\hfill Y& ={\mathbf{b}}_{y\mathbf{x}}\mathbf{x}+{\epsilon}_{y}\hfill \\ \hfill Z& ={b}_{zy}Y+{\mathbf{b}}_{z\mathbf{x}}\mathbf{x}+{\epsilon}_{z}\hfill \end{array}$$

(1)

where *Y* is a metric variable that is predetermined with respect to a metric variable *Z*, and **x** is a vector of covariates that are predetermined with respect to both *Y* and *Z*.^{1} A classic sociological example is status attainment (Featherman and Hauser 1978), in which *Z* is a measure of the respondent’s occupational status during early adulthood, *Y* is the respondent’s educational attainment, and **x** includes background factors such as number of siblings and the occupational status and educational attainment of the respondent’s father.

Using (1), one can decompose the gross effect of covariate *x* into a direct effect *b*_{zx} and an indirect effect *b*_{zy} × *b*_{yx}:

$$E\left[Z\right]={b}_{zy}\left({\mathbf{b}}_{y\mathbf{x}}\mathbf{x}\right)+{\mathbf{b}}_{z\mathbf{x}}\mathbf{x},$$

(2)

under the usual assumptions that *E*[*ε*_{y}] = *E*[*ε*_{z}] = 0.

Suppose instead that *Z* is an event that may be subject to censoring. If *Y* is a metric variable, then no particular complications arise in this “mixed” case, and much of the apparatus familiar from the more classical metric outcome setting carries over.

In this paper, we treat the case in which both *Y* and *Z* are events that may be subject to censoring. Under these restrictions, we derive direct and indirect effects of **x** on *Z* that are roughly analogous to those in the classical recursive model for a metric variable *Y*. There are three ways, however, in which this case varies from the classical one. A first observation is that the timing of the event *Y* determines entry into the risk of the event *Z*. That is, prior to the occurrence of the event *Y*, individuals are, by assumption, not at risk of the event *Z*; hence, there can be no direct effect of **x** on *Z* prior to the occurrence of *Y*, an issue that does not arise in recursive models with a metric variable *Y*. Accounting for this difference involves conditioning in the likelihood for *Z* on entry into risk at time *Y*, where *Y* provides the so-called left-truncation time for the process *Z*.

A second observation is that variation in *Y* affects the duration that individuals are exposed to the risk of *Z*. Such variation will yield an effect of exposure on the *prevalence* of *Z* even in otherwise homogeneous populations. To see this informally, consider a population with two homogeneous subgroups, *A* and *B*, that differ only in the timing of *Y*. If the event *Y* occurs later in group *A* than in *B*, then the event *Z* will occur less frequently in group *A* than in *B* because members in group *A* spend less time exposed to the risk of event *Z* than members in group *B*. Because groups *A* and *B* are identical in all respects save for the timing of *Y*, differences in the prevalence of *Z* arise solely from differences in exposure.
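The two-group argument above can be sketched numerically. The following is a minimal illustration, assuming a constant hazard for *Z* once individuals enter risk; the function name `prevalence`, the rate of 0.05, and the entry times of 16 and 18 are hypothetical values chosen for the example, not quantities from the paper.

```python
import math

def prevalence(t, entry_time, rate):
    """Share experiencing Z by time t, given entry into risk at entry_time
    and a constant hazard `rate` afterwards (an illustrative assumption)."""
    exposure = max(0.0, t - entry_time)  # no risk of Z before Y occurs
    return 1.0 - math.exp(-rate * exposure)

RATE = 0.05                              # hypothetical hazard of Z per unit time
prev_A = prevalence(25.0, 18.0, RATE)    # group A: Y occurs later
prev_B = prevalence(25.0, 16.0, RATE)    # group B: Y occurs earlier
print(f"A: {prev_A:.3f}  B: {prev_B:.3f}  difference: {prev_B - prev_A:.3f}")
```

Both groups face the same hazard once at risk, so the gap in prevalence at *t* = 25 reflects only the two additional units of exposure in group *B*.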

The above discussion implies *two* indirect effects of **x** on the prevalence of *Z*. The first arises because the timing of *Y* determines entry into the risk of *Z*. The second is a hazard model variant of the usual indirect effect **b**_{yx} of **x** on *Z* for metric outcomes, in which (1) **x** affects the *risk* of *Y*, (2) *Y* risks determine the *timing* of *Y*, and (3) the *timing* of *Y* affects the prevalence of *Z*. Thus, the structure of this problem is more complicated than the classic recursive model for a metric *Y* and *Z*, in which there is a single indirect effect of **x** on *Z*.

Another difference between recursive models for metric and event history outcomes involves censoring. As noted above, observed differences in *Y* may be due to compositional differences as reflected in the covariates **x**; this is true for recursive models in both settings. However, in a hazard model context, two additional complications arise: first, some individuals may not experience the event *Y* during the period of observation; and second, some individuals may never experience the event *Y*, even if observed for an infinitely long time. If the second condition holds, the distribution of *Y* is said to be defective and the prevalence of *Z* will necessarily be selected only on those who experience *Y*. A technical point about defective distributions is that the expected value for the timing of the event is undefined. To deal with this issue, we summarize distributional aspects of the timing of an event using percentiles such as the median rather than statistics such as the mean.

Finally, it should be noted that for metric outcomes, interpreting direct and indirect effects is straightforward because the indirect effect *b*_{yx} ×

To make clear the sequential dependence between *Y* and *Z*, we henceforth write them as (*T*_{1}, *δ*_{1}) and (*T*_{2}, *δ*_{2}), respectively, where *T*_{1} and *T*_{2} are random variables denoting the event times and *δ*_{1} and *δ*_{2} are indicator variables for the two events. To simplify matters further, we often write *T*_{1} and *T*_{2} in place of (*T*_{1}, *δ*_{1}) and (*T*_{2}, *δ*_{2}); similarly, we engage in a slight abuse of language by referring, for example, to “the event *T*_{1}” or “the risk of *T*_{2}.” We further assume that both *T*_{1} and *T*_{2} are governed by a common dimension of time *t*, that all cases enter at risk of *T*_{1} at *t* = 0, and that the occurrence of *T*_{1} determines entry into the risk of the event *T*_{2}.

To establish basic results, we proceed as follows. We begin with the transition to *T*_{1}, reviewing standard results for the proportional hazard model on how variation in an exogenous covariate between two groups affects the predicted hazard rate for the transition to *T*_{1}. We then trace how change in the predicted hazard rate for the transition to *T*_{1} induces change in the distribution of *T*_{1} event times. Percentiles of the *T*_{1} event time distribution can then be used to show when a proportion *p* of persons with specified characteristics would be expected to experience this event. Inverting this problem shows how change in an exogenous covariate induces change in timing of the transition to *T*_{1}. Thus, (1) change in an exogenous covariate *x* between two groups implies corresponding differences in the *rate* for the transition to *T*_{1}, (2) group differences in the *rate* for the transition to *T*_{1} imply differences in the *T*_{1} distribution; and (3) group differences in *timing* of the transition to *T*_{1} can be traced via percentiles of the *T*_{1} distribution.

We then turn to the transition from *T*_{1} to *T*_{2}, again under a proportional specification. We first show how to model staggered entries of persons into the risk of the transition to *T*_{2} using standard methods for left-truncated event history data. Covariate effects in such an equation can then be interpreted as direct effects for exogenous covariates. Because variation in *T*_{1} induces variation in the exposure to risk of *T*_{2}, we show how differences in the timing of *T*_{1} affect the cumulative hazard of *T*_{2} and two measures of *T*_{2} prevalence; these relationships thus show how *T*_{1} variation induces exposure consequences for the *T*_{2} process. The final step combines this latter result with a previous result linking variation between groups in an exogenous covariate and variation in *T*_{1}, with the resulting analytical expressions showing how covariates exert indirect effects on the *T*_{2} process when persons become exposed to the risk of *T*_{2}.

In the classical recursive model, the realization *y* of the random variable *Y* appears as a right-hand-side covariate in the *Z* equation. This can occur in an event history model as well, since one can condition on any aspect of an individual’s past history when modeling the *T*_{2} process, including, for example, an individual’s age at time of entry into the left-truncated risk of *T*_{2}. This would involve including the realization *t*_{1} of *T*_{1} as an ordinary right-hand-side covariate in the *T*_{2} equation. The key point is that differences between two groups in an exogenous covariate will induce variation in *T*_{1}, which in turn will induce variation in *T*_{2}.

Suppose that both *T*_{1} and *T*_{2} are governed by a common dimension of time *t*, that all cases enter at risk of *T*_{1} at *t* = 0, and that the occurrence of *T*_{1} determines entry into the risk of the event *T*_{2}. Under these assumptions and further assuming a proportional hazard specification, the hazard rate for *T*_{1} is given by:

$${r}_{1}(t\mid \mathbf{x})={q}_{1}\left(t\right)\mathrm{exp}\left(\mathbf{a}\mathbf{x}\right),$$

(3)

where *q*_{1}(*t*) denotes the baseline hazard for *T*_{1}, **x** is a vector of covariates, and **a** denotes a vector of parameters to be estimated. Two fundamental quantities are the cumulative risk function *H*_{1}(*t* | **x**) and the survivor function *S*_{1}(*t* | **x**) given by:

$${H}_{1}(t\mid \mathbf{x})={\int}_{0}^{t}{r}_{1}(u\mid \mathbf{x})du=\mathrm{exp}\left(\mathbf{a}\mathbf{x}\right){Q}_{1}\left(t\right)$$

(4)

where

$${Q}_{1}\left(t\right)={\int}_{0}^{t}{q}_{1}\left(s\right)ds,$$

(5)

and

$${S}_{1}(t\mid \mathbf{x})=\mathrm{Pr}({T}_{1}>t\mid \mathbf{x})=\mathrm{exp}[-{H}_{1}(t\mid \mathbf{x}\left)\right]=\mathrm{exp}[-\mathrm{exp}(\mathbf{a}\mathbf{x}\left){Q}_{1}\right(t\left)\right].$$

(6)
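Equation (6) also shows when the distribution of *T*_{1} is defective. As a brief worked illustration, write *Q*_{1}(∞) for the limit of the integrated baseline hazard (a shorthand assumed here for convenience):

$$S_1(\infty \mid \mathbf{x}) = \exp\left[-\exp(\mathbf{a}\mathbf{x})\,Q_1(\infty)\right] > 0 \quad \text{whenever} \quad Q_1(\infty) = \int_0^{\infty} q_1(s)\,ds < \infty .$$

In that case a fraction *S*_{1}(∞ | **x**) of the group never experiences *T*_{1}, and the *p*th percentile of *T*_{1} exists only if 1 – (*p*/100) exceeds this fraction. The median, for example, is defined only when fewer than half of the group never experiences the event.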

As in the linear regression context, compositional differences in the population will generate variation in *T*_{1}. Note, however, that (6) provides predictions for a distribution of *T*_{1} event times; thus unlike a linear regression, where a homogeneous subgroup will have a single predicted value of the outcome *Y*, under (6), a homogeneous subgroup will have a predicted distribution for the event times *T*_{1} given by *S*_{1}(*t* | **x**). Because of this, because some persons may be censored, and because the distribution of *T*_{1} may be defective, it is more natural to compare percentiles of *T*_{1} across groups than to compare expectations of *T*_{1} across groups. Let *t*_{1p} denote the *p*th percentile of the distribution of *T*_{1}. Suppose *A* and *B* are two groups of substantive interest, with covariate means given by ${\bar{\mathbf{x}}}_{A}$ and ${\bar{\mathbf{x}}}_{B}$, respectively. We wish to use (6) to obtain predictions for the timing of *T*_{1} for groups *A* and *B*, evaluated at the *p*th percentile of the *T*_{1} distribution. Since *S*_{1}(*t*) varies between 0 and 1, consider the simple transformation from a *p*th percentile to the corresponding proportion given by *π* = 1 – (*p*/100); then from (6), we have:

$${t}_{1p}={Q}_{1}^{-1}[-\mathrm{log}(\pi )/\mathrm{exp}(\mathbf{a}\mathbf{x})],$$

(7)

where ${Q}_{1}^{-1}\left(v\right)$ is the function such that if *v* = *Q*_{1}(*t*) then $t={Q}_{1}^{-1}\left(v\right)$. Given the above, note that compositional differences between groups *A* and *B* will generate differences in the *p*th percentiles of *T*_{1} corresponding to:

$$\Delta {t}_{1p}={t}_{1p}^{B}-{t}_{1p}^{A}={Q}_{1}^{-1}[-\mathrm{log}(\pi )/\mathrm{exp}(\mathbf{a}{\bar{\mathbf{x}}}_{B})]-{Q}_{1}^{-1}[-\mathrm{log}(\pi )/\mathrm{exp}(\mathbf{a}{\bar{\mathbf{x}}}_{A})],$$

(8)

where ${t}_{1p}^{A}$ and ${t}_{1p}^{B}$ denote the *p*th percentiles of *T*_{1} in groups *A* and *B*, respectively. Note that obtaining ${t}_{1p}^{A}$, ${t}_{1p}^{B}$, and Δ*t*_{1p} requires inverting the integrated baseline hazard *Q*_{1}(*t*) either analytically or numerically. Analytic expressions are available for choices of *q*_{1}(*t*) such as the exponential, Weibull, and Gompertz models and piecewise variants of these models (see, e.g., self-citation 2 and Appendix 1); similarly, generalizing (7) and (8) to incorporate time-varying variables is straightforward (see Appendix 2). Note, however, that the Cox proportional hazard model (1972) is more difficult to use in this context because it does not specify a parametric form for *q*_{1}(*t*); in particular, standard proposals for estimating the integrated hazard under a Cox model (see, e.g., Breslow 1974) will not, in general, yield a well-defined inverse function ${Q}_{1}^{-1}$.
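To illustrate the inversion in (7) and (8), the sketch below assumes a Weibull-type integrated baseline hazard, *Q*_{1}(*t*) = λ*t*^{k}, for which the inverse is available in closed form. The function names, the baseline parameters `LAM` and `K`, and the linear predictors `AX_A` and `AX_B` are hypothetical and are not estimates from the paper.

```python
import math

def Q1(t, lam, k):
    """Integrated Weibull-type baseline hazard, Q1(t) = lam * t**k (assumed form)."""
    return lam * t ** k

def Q1_inv(v, lam, k):
    """Analytic inverse of Q1: the t such that Q1(t) = v."""
    return (v / lam) ** (1.0 / k)

def t1_percentile(p, ax, lam, k):
    """Equation (7): p-th percentile of T1 for linear predictor ax."""
    pi = 1.0 - p / 100.0
    return Q1_inv(-math.log(pi) / math.exp(ax), lam, k)

def S1(t, ax, lam, k):
    """Equation (6): survivor function for T1."""
    return math.exp(-math.exp(ax) * Q1(t, lam, k))

LAM, K = 1e-6, 5.0        # hypothetical baseline parameters
AX_A, AX_B = 0.0, 0.4     # hypothetical linear predictors for groups A and B

median_A = t1_percentile(50, AX_A, LAM, K)
median_B = t1_percentile(50, AX_B, LAM, K)
delta_t1 = median_B - median_A    # equation (8) evaluated at p = 50
print(median_A, median_B, delta_t1)
```

Because group *B* has the larger linear predictor, its hazard is higher and its median *T*_{1} is earlier, so `delta_t1` is negative. For baselines without a closed-form inverse, a numerical root finder could replace `Q1_inv`.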

As noted above, one way in which *T*_{1} influences the process *T*_{2} is that individuals are not at risk of *T*_{2} until the occurrence of *T*_{1}. In addition, when modeling *T*_{2}, one can condition on any relevant aspect of an individual’s history (Aalen 1978; Tuma and Hannan 1984), including the timing of the event *T*_{1}. Consequently, one can treat the observed value *t*_{1} as an ordinary right-hand-side covariate in the same way as *Y* appears as a covariate in the linear regression of *Z* in the standard three-variable recursive model for metric outcomes. Let *u* = *t* – *t*_{1} denote duration since the *T*_{1} event (sexual onset in our application) and let the hazard *r*_{2}(*t*, *u* | *t*_{1}, **x**) for *T*_{2} be given by:

$${r}_{2}(t,u\mid {t}_{1},\mathbf{x})={q}_{21}(t\mid {t}_{1}){q}_{22}\left(u\right)\mathrm{exp}(\alpha {t}_{1}+\mathbf{b}\mathbf{x}),$$

(9)

where *q*_{21}(*t* | *t*_{1}) reflects the dependence of *T*_{2} on *t*, given entry into risk at time *t*_{1}, and *q*_{22}(*u*) reflects the dependence of *T*_{2} on duration *u*. To simplify the exposition below, we first focus attention on the simpler case in which *q*_{22}(*u*) = 1, deferring for the moment details of the more general case of age and duration dependence.

When *q*_{22}(*u*) = 1 in (9), the role of *T*_{1} in determining entry into risk can be seen more easily in the expression for the survivor function, *S*_{2}(*t* | *t*_{1}, **x**):

$${S}_{2}(t\mid {t}_{1},\mathbf{x})=\mathrm{exp}\left[-{\int}_{{t}_{1}}^{t}{r}_{2}(s\mid {t}_{1},\mathbf{x})ds\right]=\mathrm{exp}\left[-\mathrm{exp}(\alpha {t}_{1}+\mathbf{b}\mathbf{x}){\int}_{{t}_{1}}^{t}{q}_{21}\left(s\right)ds\right],$$

(10)

where the quantity *t*_{1} appears in (10) both as a right-hand-side covariate in the expression exp(*αt*_{1}) and by left-truncating the period of risk via the lower limit of integration. As such, the *T*_{1} equation in (6) provides, by assumption, the selection mechanism governing entry into risk of *T*_{2}. If the distribution of *T*_{1} is defective, then some fraction of the population will never experience the event *T*_{1} and hence will never enter into risk of *T*_{2}.
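The two roles of *t*_{1} in (10) can be seen in a short sketch. It assumes, purely for illustration, a baseline whose integral is λ*t*^{k}; the function names and all parameter values are hypothetical.

```python
import math

def Q21(t, lam=0.001, k=2.0):
    """Integrated baseline hazard for T2, Q21(t) = lam * t**k (assumed form)."""
    return lam * t ** k

def H2(t, t1, bx, alpha):
    """Cumulative hazard for T2 with q22(u) = 1: risk accrues only after t1."""
    if t <= t1:
        return 0.0
    # t1 enters twice: as a covariate (alpha * t1) and as the lower limit.
    return math.exp(alpha * t1 + bx) * (Q21(t) - Q21(t1))

def S2(t, t1, bx, alpha):
    """Equation (10): left-truncated survivor function for T2."""
    return math.exp(-H2(t, t1, bx, alpha))

# Hypothetical inputs: entry into risk at t1 = 16, evaluated at t = 25.
print(S2(25.0, 16.0, bx=0.3, alpha=-0.05))
```

Before *t*_{1} the survivor function equals 1, reflecting that no one can experience *T*_{2} before entering risk.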

The following derivation replicates an elementary result covered in standard texts on hazard methods (albeit with a slight modification for the left truncation by *t*_{1}). Suppose that groups *A* and *B* are identical in all respects except that *x*_{1} for groups *A* and *B* differs by a constant, i.e., *x*_{1B} = *x*_{1A} + Δ. Consider the cumulative relative risk defined as the ratio of the cumulative hazard for group *B* to that for group *A*. Under (9), this ratio is:

$$\begin{array}{cc}\hfill \frac{{H}_{2}(t\mid {t}_{1},{\mathbf{x}}_{B})}{{H}_{2}(t\mid {t}_{1},{\mathbf{x}}_{A})}& =\frac{{\int}_{{t}_{1}}^{t}{q}_{21}\left(s\right)\mathrm{exp}[\alpha {t}_{1}+{b}_{1}({x}_{1i}+\Delta )+\cdots ]ds}{{\int}_{{t}_{1}}^{t}{q}_{21}\left(s\right)\mathrm{exp}[\alpha {t}_{1}+{b}_{1}{x}_{1i}+\cdots ]ds}\hfill \\ \hfill & =\frac{\mathrm{exp}[\alpha {t}_{1}+{b}_{1}({x}_{1i}+\Delta )+\cdots ]{\int}_{{t}_{1}}^{t}{q}_{21}\left(s\right)ds}{\mathrm{exp}[\alpha {t}_{1}+{b}_{1}{x}_{1i}+\cdots ]{\int}_{{t}_{1}}^{t}{q}_{21}\left(s\right)ds}\hfill \\ \hfill & =\mathrm{exp}\left({b}_{1}\Delta \right).\hfill \end{array}$$

(11)

Thus, the direct effect on the cumulative relative risk of a shift from *x*_{1} to *x*_{1} + Δ is given by the usual estimate of relative risk.

How is (11) related to prevalence? Substantively, one might be interested in two quantities related to prevalence, one involving *absolute* prevalence (the arithmetic difference in prevalence between groups *A* and *B*) and the other involving *relative* prevalence (the ratio of prevalence for the two groups). Unlike the case for relative risk, the expressions for the arithmetic difference and for the ratio do not simplify algebraically. Instead, one must evaluate (10) at each of the values of **x** to be compared, then subtract or divide the two resulting prevalences, 1 – *S*_{2}, as desired.
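The contrast between (11) and the two prevalence measures can be checked numerically. The sketch below assumes the same illustrative baseline (integral λ*t*^{k}) and hypothetical values for *b*_{1}, Δ, *α*, and the ages involved; none of these are estimates from the paper.

```python
import math

def H2(t, t1, bx, alpha=-0.05, lam=0.001, k=2.0):
    """Cumulative hazard under (9) with q22(u) = 1 and an assumed baseline
    whose integral is lam * t**k."""
    return math.exp(alpha * t1 + bx) * lam * (t ** k - t1 ** k)

def prevalence(t, t1, bx):
    """1 - S2(t | t1, x), from equation (10)."""
    return 1.0 - math.exp(-H2(t, t1, bx))

B1, DELTA = 0.5, 1.0      # hypothetical coefficient and covariate shift
T, T1 = 25.0, 16.0        # hypothetical evaluation time and entry time
bx_A, bx_B = 0.0, B1 * DELTA

crr = H2(T, T1, bx_B) / H2(T, T1, bx_A)                        # equation (11)
abs_diff = prevalence(T, T1, bx_B) - prevalence(T, T1, bx_A)   # absolute prevalence
rel_prev = prevalence(T, T1, bx_B) / prevalence(T, T1, bx_A)   # relative prevalence
print(crr, math.exp(B1 * DELTA), abs_diff, rel_prev)
```

The cumulative relative risk reproduces exp(*b*_{1}Δ) exactly, whereas the relative prevalence depends on *t*, *t*_{1}, and the baseline, and in this example is smaller than exp(*b*_{1}Δ).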

Variation in *T*_{1} will affect the prevalence of *T*_{2} even in otherwise homogeneous populations because some individuals will have longer durations of exposure to the risk of *T*_{2} by virtue of quicker *T*_{1} transitions. Standard hazard regressions *adjust* for such variations in exposure in the hazard rate, but do not quantify the magnitude of the effect of exposure on prevalence. However, such exposure effects of *T*_{1} on *T*_{2} can be derived via the same ideas as used above. Consider two groups of individuals, *A* and *B*, who are identical in all respects save for the timing of *T*_{1}, and suppose that the random variable *T*_{1A} is realized as *t*_{1} for group *A* and that *T*_{1B} is realized as *t*_{1} + Δ for group *B*. Then the cumulative relative risk is given by:

$$\begin{array}{cc}\hfill \frac{{H}_{2}(t\mid {t}_{1B},\mathbf{x})}{{H}_{2}(t\mid {t}_{1A},\mathbf{x})}& =\frac{{\int}_{{t}_{1}+\Delta}^{t}{q}_{21}\left(s\right)\mathrm{exp}[\alpha ({t}_{1}+\Delta )+\mathbf{b}\mathbf{x}]ds}{{\int}_{{t}_{1}}^{t}{q}_{21}\left(s\right)\mathrm{exp}[\alpha {t}_{1}+\mathbf{b}\mathbf{x}]ds}\hfill \\ \hfill & =\frac{\mathrm{exp}[\alpha ({t}_{1}+\Delta )+\mathbf{b}\mathbf{x}]{\int}_{{t}_{1}+\Delta}^{t}{q}_{21}\left(s\right)ds}{\mathrm{exp}[\alpha {t}_{1}+\mathbf{b}\mathbf{x}]{\int}_{{t}_{1}}^{t}{q}_{21}\left(s\right)ds}\hfill \\ \hfill & =\mathrm{exp}\left(\alpha \Delta \right)\left[{\int}_{{t}_{1}+\Delta}^{t}{q}_{21}\left(s\right)ds/{\int}_{{t}_{1}}^{t}{q}_{21}\left(s\right)ds\right].\hfill \end{array}$$

(12)

To motivate the notion of an exposure effect, we postulated a counterfactual tracing the consequences for an otherwise homogeneous population at risk of *T*_{2} in which the timing of *T*_{1} was shifted, yielding a change in exposure to risk. Such a “pure” effect of exposure corresponds to *α* = 0, in which case a shift from *t*_{1} to *t*_{1} + Δ affects the cumulative relative risk only through the bracketed ratio of integrals in (12).

By contrast, *α* ≠ 0 suggests that the observed realization *t*_{1} of the random variable *T*_{1} has an effect on *T*_{2} as a usual covariate in the *T*_{2} equation. Such a situation could arise if *T*_{1} has a causal effect on *T*_{2} conditional on the other covariates in the model, or if the realization *t*_{1} of the random variable *T*_{1} is correlated with unobserved covariates that influence *T*_{2}. In our empirical application below, we rely on this second interpretation, interpreting *α* as reflecting the association of unobservables that are correlated with *T*_{1} but that are not captured by the other covariates in (10).

Equation (12) provides a decomposition of the cumulative relative risk into two multiplicative components corresponding to the exposure effect given by the bracketed ratio of integrals and a more “standard” proportional effect of *T*_{1} on *T*_{2} given by exp(*α*Δ). Note that while the “standard” effect does not vary with *t* by assumption, the effect of exposure will in general vary in nonlinear ways with *t*. As a result, it can be useful to evaluate the effect of exposure over a range of *t*.
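The two multiplicative components of (12) can be evaluated over a range of *t*, as suggested above. The sketch below again assumes an illustrative baseline with integral λ*t*^{k}; the names `exposure_ratio` and `standard_effect` and all numerical values are hypothetical.

```python
import math

LAM, K = 0.001, 2.0       # hypothetical baseline: integral of q21 is LAM * t**K
ALPHA = -0.05             # hypothetical coefficient on t1

def Q21(t):
    return LAM * t ** K

def H2(t, t1, bx=0.0):
    """Cumulative hazard under (9) with q22(u) = 1."""
    return math.exp(ALPHA * t1 + bx) * (Q21(t) - Q21(t1))

def exposure_ratio(t, t1, delta):
    """Bracketed ratio of integrals in (12): the pure exposure effect."""
    return (Q21(t) - Q21(t1 + delta)) / (Q21(t) - Q21(t1))

def standard_effect(delta):
    """Proportional effect exp(alpha * delta) in (12); constant in t."""
    return math.exp(ALPHA * delta)

T1, DELTA = 16.0, 2.0
for t in (20.0, 25.0, 30.0, 40.0):
    expo = exposure_ratio(t, T1, DELTA)
    total = standard_effect(DELTA) * expo   # cumulative relative risk in (12)
    print(t, round(expo, 3), round(total, 3))
```

Under this assumed baseline the exposure component is below 1 for a delay Δ > 0 and moves toward 1 as *t* increases, since the lost exposure becomes a smaller share of total time at risk.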

One objection to (9) is that *t*_{1} enters linearly in the specification of log *r*_{2}(*t*). Weakening this assumption involves no major difficulties; for example, suppose that the effect of *t*_{1} on log *r*_{2}(*t*) is not linear, but instead captured by some nonlinear function *f*(*t*_{1}), i.e.,

$${r}_{2}(t\mid {t}_{1},\mathbf{x})={q}_{21}(t\mid {t}_{1})\mathrm{exp}[f({t}_{1})+\mathbf{b}\mathbf{x}].$$

(13)

Employing (13) rather than (9) yields

$$\frac{{H}_{2}(t\mid {t}_{1B},\mathbf{x})}{{H}_{2}(t\mid {t}_{1A},\mathbf{x})}=\frac{\mathrm{exp}[f({t}_{1}+\Delta )]}{\mathrm{exp}[f({t}_{1})]}\left[{\int}_{{t}_{1}+\Delta}^{t}{q}_{21}\left(s\right)ds/{\int}_{{t}_{1}}^{t}{q}_{21}\left(s\right)ds\right].$$

(14)

Note in particular that the “standard” effect arising from the presence of *t*_{1} as a right-hand-side covariate is more complicated in (14) than in (12), but that the bracketed term for exposure is identical in (12) and (14).

The expression in (12) concerns the cumulative relative risk but does not speak directly to prevalence. As noted above, one might assess prevalence by examining the difference in *T*_{2} prevalence between groups *B* and *A*. This yields:

$$\begin{array}{cc}\hfill [1-{S}_{2}(t\mid {t}_{1B},\mathbf{x})]-[1-{S}_{2}(t\mid {t}_{1A},\mathbf{x})]=& (1-\mathrm{exp}[-{H}_{2}(t\mid {t}_{1B},\mathbf{x})])-(1-\mathrm{exp}[-{H}_{2}(t\mid {t}_{1A},\mathbf{x})])\hfill \\ \hfill =& \mathrm{exp}\left[-\mathrm{exp}(\alpha {t}_{1}+\mathbf{b}\mathbf{x}){\int}_{{t}_{1}}^{t}{q}_{21}\left(s\right)ds\right]-\hfill \\ \hfill & \mathrm{exp}\left[-\mathrm{exp}(\alpha [{t}_{1}+\Delta ]+\mathbf{b}\mathbf{x}){\int}_{{t}_{1}+\Delta}^{t}{q}_{21}\left(s\right)ds\right].\hfill \end{array}$$

(15)

Similarly, relative *T*_{2} prevalence for groups *B* and *A* is given by the ratio:

$$\frac{1-{S}_{2}(t\mid {t}_{1B},\mathbf{x})}{1-{S}_{2}(t\mid {t}_{1A},\mathbf{x})}=\frac{1-\mathrm{exp}[-{H}_{2}(t\mid {t}_{1B},\mathbf{x}\left)\right]}{1-\mathrm{exp}[-{H}_{2}(t\mid {t}_{1A},\mathbf{x}\left)\right]}=\frac{1-\mathrm{exp}[-\mathrm{exp}(\alpha [{t}_{1}+\Delta ]+\mathbf{b}\mathbf{x}\left){\int}_{{t}_{1}+\Delta}^{t}{q}_{21}\right(s\left)ds\right]}{1-\mathrm{exp}[-\mathrm{exp}(\alpha {t}_{1}+\mathbf{b}\mathbf{x}\left){\int}_{{t}_{1}}^{t}{q}_{21}\right(s\left)ds\right]}.$$

(16)

Note that the decomposition in (14) of the indirect effect into standard and exposure effects does not hold for either (15) or (16).
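A small numerical check shows why the multiplicative decomposition does not carry over to prevalence. The sketch assumes an illustrative baseline with integral λ*t*^{k} and hypothetical values for *α*, *t*_{1}, Δ, and *t*.

```python
import math

def H2(t, t1, alpha=-0.05, bx=0.0, lam=0.001, k=2.0):
    """Cumulative hazard under (9) with q22(u) = 1 (assumed baseline)."""
    return math.exp(alpha * t1 + bx) * lam * (t ** k - t1 ** k)

def prevalence(t, t1):
    """1 - S2(t | t1, x) for a group entering risk at t1."""
    return 1.0 - math.exp(-H2(t, t1))

T, T1, DELTA = 25.0, 16.0, 2.0      # hypothetical ages and shift in t1

crr = H2(T, T1 + DELTA) / H2(T, T1)                        # equation (12)
abs_diff = prevalence(T, T1 + DELTA) - prevalence(T, T1)   # equation (15)
rel_prev = prevalence(T, T1 + DELTA) / prevalence(T, T1)   # equation (16)
print(crr, abs_diff, rel_prev)
```

In this example the relative prevalence lies closer to 1 than the cumulative relative risk, so it cannot equal the product of the standard and exposure components from (12).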

Assessing the indirect effect of **x** on *T*_{2} proceeds along the same formal lines as in the previous sections, with **x** affecting *T*_{1} and *T*_{1} affecting *T*_{2} in two ways—via an indirect effect of exposure and a more “standard” indirect effect. A first step is to trace the effect of *x* on the timing of *T*_{1}. Consider the pool of individuals who have not yet experienced the event *T*_{1} and suppose that two groups, *A* and *B*, are identical in all respects save for their values of *x*_{1}. As before, set *x*_{1B} = *x*_{1A} + Δ; then from (8), the effect of composition on the timing of *T*_{1} is given by:

$$\begin{array}{cc}\hfill \Delta {t}_{1p}& ={t}_{1Bp}-{t}_{1Ap}\hfill \\ \hfill & ={Q}_{1}^{-1}[-\mathrm{log}(\pi )/\mathrm{exp}(\mathbf{a}{\mathbf{x}}_{B})]-{Q}_{1}^{-1}[-\mathrm{log}(\pi )/\mathrm{exp}(\mathbf{a}{\mathbf{x}}_{A})]\hfill \\ \hfill & ={Q}_{1}^{-1}[-\mathrm{log}(\pi )/\mathrm{exp}({a}_{1}({x}_{1}+\Delta )+\cdots )]-{Q}_{1}^{-1}[-\mathrm{log}(\pi )/\mathrm{exp}({a}_{1}{x}_{1}+\cdots )],\hfill \end{array}$$

(17)

where *π* = 1 – (*p*/100) and *p* corresponds to the *p*th percentile for the distribution of *T*_{1}. Note that the function *Q*_{1}^{−1} is highly nonlinear; hence, the predicted effect of a covariate *x* on the percentiles of the *T*_{1} distribution will vary with the percentile *p* at which the effect is evaluated.
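The percentile shift in (17) can be sketched numerically. The example below assumes a Gompertz baseline for *T*_{1} with a time origin at zero, so that *Q*_{1} and its inverse have closed forms; the baseline parameters and the coefficient value are hypothetical and are chosen only to show that Δ*t*_{1p} depends on *p*.

```python
import math

A, C = -8.0, 0.02  # hypothetical Gompertz parameters: q1(s) = exp(A + C*s)

def Q1(t):
    """Integrated baseline hazard for T1 over [0, t]."""
    return math.exp(A) / C * (math.exp(C * t) - 1.0)

def Q1_inv(y):
    """Inverse of Q1: the time t at which Q1(t) equals y."""
    return math.log(1.0 + C * y / math.exp(A)) / C

def percentile(p, bx):
    """p-th percentile of T1 for linear predictor bx, as in (17)."""
    pi = 1.0 - p / 100.0
    return Q1_inv(-math.log(pi) / math.exp(bx))

def delta_t1p(p, b1, delta, bx_a=0.0):
    """Shift in the p-th percentile of T1 when x1 rises by delta."""
    return percentile(p, bx_a + b1 * delta) - percentile(p, bx_a)

for p in (25, 50, 75):
    print(p, round(delta_t1p(p, b1=0.51, delta=1.0), 2))
```

Because `Q1_inv` is nonlinear, the same shift in *x*_{1} moves the 25th, 50th, and 75th percentiles by different amounts under these assumed values.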

Recall from (12) that a shift from *t*_{1} to *t*_{1} + Δ influences the cumulative relative risk in two ways, through an indirect effect of exposure and a “standard” indirect effect. But a shift from *x*_{1} to *x*_{1} + Δ will induce a shift in *t*_{1}, thus generating both direct and indirect effects for the *T*_{2} equation. This is given by combining (12) and (17), from which one can derive the indirect effect of shifting *x*_{1} to *x*_{1} + Δ on the cumulative relative risk:

$$\frac{{H}_{2}(t\mid {t}_{1B},{\mathbf{x}}_{B})}{{H}_{2}(t\mid {t}_{1A},{\mathbf{x}}_{A})}=\mathrm{exp}\left({b}_{1}\Delta \right)\mathrm{exp}\left(\alpha \Delta {t}_{1p}\right)\left[{\int}_{{t}_{1A\pi}+\Delta {t}_{1p}}^{t}{q}_{21}\left(s\right)ds\u2215{\int}_{{t}_{1A\pi}}^{t}{q}_{21}\left(s\right)ds\right]$$

(18)

Thus, the consequence of shifting from *x*_{1} to *x*_{1}+ Δ in the *T*_{1} equation appears in three places in the *T*_{2} equation in (18): a direct effect of *x*_{1} on *T*_{2} represented by the quantity *b*_{1}Δ, and two indirect effects of *x*_{1} via *T*_{1}—an indirect effect of exposure represented by the lower limit of integration in (18), and a more usual indirect effect represented by the quantity *α*Δ*t*_{1p}.
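The three multiplicative components of the cumulative relative risk can be checked numerically. The sketch below assumes a Gompertz baseline for *q*_{21} with a closed-form integral and a single covariate *x*_{1}; all parameter values and function names are hypothetical.

```python
import math

A, C = -6.0, 0.01  # hypothetical Gompertz baseline: q21(s) = exp(A + C*s)

def int_q21(lo, hi):
    """Closed-form integral of q21 over [lo, hi]."""
    return math.exp(A) / C * (math.exp(C * hi) - math.exp(C * lo))

def crr_components(t, t1a, dt1p, b1, delta, alpha):
    """Return the (direct, usual indirect, exposure) factors."""
    direct = math.exp(b1 * delta)        # direct effect of x1 on T2
    usual = math.exp(alpha * dt1p)       # usual indirect effect via t1
    exposure = int_q21(t1a + dt1p, t) / int_q21(t1a, t)  # limits of integration
    return direct, usual, exposure

def cum_hazard(t, t1, x1, b1, alpha):
    """H2(t | t1, x) with a single covariate x1."""
    return math.exp(alpha * t1 + b1 * x1) * int_q21(t1, t)

# Assumed values (months); group B has x1 higher by delta and onset shifted by dt1p.
t, t1a, dt1p = 286.5, 226.5, -13.5
b1, delta, alpha = 0.30, 1.0, -0.02

direct, usual, exposure = crr_components(t, t1a, dt1p, b1, delta, alpha)
ratio = (cum_hazard(t, t1a + dt1p, delta, b1, alpha)
         / cum_hazard(t, t1a, 0.0, b1, alpha))

print(direct, usual, exposure)
print(direct * usual * exposure, ratio)  # the two values agree
```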

A shift from *x*_{1} to *x*_{1} + Δ will also generate differences between groups *B* and *A* in *T*_{2} prevalence. As before, one may wish to assess these differences in either absolute or relative terms. The absolute difference in *T*_{2} prevalence generated in this way is given by:

$$\begin{array}{cc}\hfill [1-{S}_{2}(t\mid {t}_{1B},{\mathbf{x}}_{B})]-[1-{S}_{2}(t\mid {t}_{1A},{\mathbf{x}}_{A})]=& \mathrm{exp}[-{H}_{2}(t\mid {t}_{1A},{\mathbf{x}}_{A})]-\mathrm{exp}[-{H}_{2}(t\mid {t}_{1B},{\mathbf{x}}_{B})]\hfill \\ \hfill =& \mathrm{exp}\left[-\mathrm{exp}(\alpha {t}_{1A\pi}+{b}_{1}{x}_{1}+\cdots ){\int}_{{t}_{1A\pi}}^{t}{q}_{21}(s)\,ds\right]-\hfill \\ \hfill & \mathrm{exp}\left[-\mathrm{exp}(\alpha [{t}_{1A\pi}+\Delta {t}_{1p}]+{b}_{1}({x}_{1}+\Delta )+\cdots ){\int}_{{t}_{1A\pi}+\Delta {t}_{1p}}^{t}{q}_{21}(s)\,ds\right].\hfill \end{array}$$

(19)

Similarly, the relative difference in *T*_{2} prevalence from shifting *x*_{1} to *x*_{1} + Δ is given by:

$$\begin{array}{cc}\hfill \frac{1-{S}_{2}(t\mid {t}_{1B},{\mathbf{x}}_{B})}{1-{S}_{2}(t\mid {t}_{1A},{\mathbf{x}}_{A})}& =\frac{1-\mathrm{exp}[-{H}_{2}(t\mid {t}_{1B},{\mathbf{x}}_{B})]}{1-\mathrm{exp}[-{H}_{2}(t\mid {t}_{1A},{\mathbf{x}}_{A})]}\hfill \\ \hfill & =\frac{1-\mathrm{exp}\left[-\mathrm{exp}(\alpha [{t}_{1A\pi}+\Delta {t}_{1p}]+{b}_{1}({x}_{1}+\Delta )+\cdots ){\int}_{{t}_{1A\pi}+\Delta {t}_{1p}}^{t}{q}_{21}(s)\,ds\right]}{1-\mathrm{exp}\left[-\mathrm{exp}(\alpha {t}_{1A\pi}+{b}_{1}{x}_{1}+\cdots ){\int}_{{t}_{1A\pi}}^{t}{q}_{21}(s)\,ds\right]}.\hfill \end{array}$$

(20)

Neither (19) nor (20) decomposes into separable multiplicative components, as does (18) for the cumulative relative risk. However, there exist standard demographic procedures that let one scale quantities such that the direct and indirect components in (19) sum to the absolute difference in prevalence (Das Gupta 1993; see also Smith, Morgan, and Koropeckyj-Cox 1996). To motivate such a decomposition for the difference in absolute prevalence in (19), let

$$\begin{array}{cc}\hfill & {G}_{A}\left(t\right)=G(t\mid {t}_{1A},{\mathbf{x}}_{A})=1-{S}_{2}(t\mid {t}_{1A},{\mathbf{x}}_{A})\hfill \\ \hfill & {G}_{B}\left(t\right)=G(t\mid {t}_{1B},{\mathbf{x}}_{B})=1-{S}_{2}(t\mid {t}_{1B},{\mathbf{x}}_{B}),\hfill \end{array}$$

(21)

where for notational clarity, we suppress the dependence of *G* on *t* and **x** by writing *G*(*t*) = *g*(*γ*_{1}, *γ*_{2}, *γ*_{3}), that is, as a function of three terms–a direct effect *γ*_{1}, an indirect effect of exposure *γ*_{2}, and an indirect covariate effect *γ*_{3}. Then roughly speaking, the main idea in such a Das-Gupta decomposition for (19) is to evaluate the influence of switching from group *A* to *B* across all possible permutations of *γ*_{1}, *γ*_{2}, and *γ*_{3}. For example, one set of permutations will be:

$$\begin{array}{cc}\hfill & g({\gamma}_{1B},{\gamma}_{2A},{\gamma}_{3A})=1-\mathrm{exp}\left[-\mathrm{exp}(\alpha {t}_{1A}+\mathbf{b}{\mathbf{x}}_{B}){\int}_{{t}_{1A}}^{t}{q}_{21}\left(s\right)ds\right]\hfill \\ \hfill & g({\gamma}_{1A},{\gamma}_{2B},{\gamma}_{3A})=1-\mathrm{exp}\left[-\mathrm{exp}(\alpha {t}_{1A}+\mathbf{b}{\mathbf{x}}_{A}){\int}_{{t}_{1B}}^{t}{q}_{21}\left(s\right)ds\right]\hfill \\ \hfill & g({\gamma}_{1A},{\gamma}_{2A},{\gamma}_{3B})=1-\mathrm{exp}\left[-\mathrm{exp}(\alpha {t}_{1B}+\mathbf{b}{\mathbf{x}}_{A}){\int}_{{t}_{1A}}^{t}{q}_{21}\left(s\right)ds\right]\hfill \end{array}$$

(22)

with similar expressions holding for the remaining possible permutations *g*(*γ*_{1A}, *γ*_{2A}, *γ*_{3A}), …, *g*(*γ*_{1B}, *γ*_{2B}, *γ*_{3B}).

To impose the standardization constraint that the sum of the three direct and indirect effects equal the total difference in (19), let Δ*g*_{1}, Δ*g*_{2}, and Δ*g*_{3} denote the standardized decomposition for the direct effect *γ*_{1}, the indirect exposure effect *γ*_{2}, and the indirect covariate effect *γ*_{3}, respectively. Then consider the following three standardized decompositions:

$$\begin{array}{cc}\hfill \Delta {g}_{1}=& g({\gamma}_{1B},\cdot ,\cdot )-g({\gamma}_{1A},\cdot ,\cdot )\hfill \\ \hfill =& \frac{2}{6}\left[g\right({\gamma}_{1B},{\gamma}_{2A},{\gamma}_{3A})-g({\gamma}_{1A},{\gamma}_{2A},{\gamma}_{3A}\left)\right]+\hfill \\ \hfill & \frac{2}{6}\left[g\right({\gamma}_{1B},{\gamma}_{2B},{\gamma}_{3B})-g({\gamma}_{1A},{\gamma}_{2B},{\gamma}_{3B}\left)\right]+\hfill \\ \hfill & \frac{1}{6}\left[g\right({\gamma}_{1B},{\gamma}_{2A},{\gamma}_{3B})-g({\gamma}_{1A},{\gamma}_{2A},{\gamma}_{3B}\left)\right]+\hfill \\ \hfill & \frac{1}{6}\left[g\right({\gamma}_{1B},{\gamma}_{2B},{\gamma}_{3A})-g({\gamma}_{1A},{\gamma}_{2B},{\gamma}_{3A}\left)\right]\hfill \end{array}$$

(23)

$$\begin{array}{cc}\hfill \Delta {g}_{2}=& g(\cdot ,{\gamma}_{2B},\cdot )-g(\cdot ,{\gamma}_{2A},\cdot )\hfill \\ \hfill =& \frac{2}{6}[g({\gamma}_{1A},{\gamma}_{2B},{\gamma}_{3A})-g({\gamma}_{1A},{\gamma}_{2A},{\gamma}_{3A})]+\hfill \\ \hfill & \frac{2}{6}[g({\gamma}_{1B},{\gamma}_{2B},{\gamma}_{3B})-g({\gamma}_{1B},{\gamma}_{2A},{\gamma}_{3B})]+\hfill \\ \hfill & \frac{1}{6}[g({\gamma}_{1A},{\gamma}_{2B},{\gamma}_{3B})-g({\gamma}_{1A},{\gamma}_{2A},{\gamma}_{3B})]+\hfill \\ \hfill & \frac{1}{6}[g({\gamma}_{1B},{\gamma}_{2B},{\gamma}_{3A})-g({\gamma}_{1B},{\gamma}_{2A},{\gamma}_{3A})]\hfill \end{array}$$

(24)

$$\begin{array}{cc}\hfill \Delta {g}_{3}=& g(\cdot ,\cdot ,{\gamma}_{3B})-g(\cdot ,\cdot ,{\gamma}_{3A})\hfill \\ \hfill =& \frac{2}{6}[g({\gamma}_{1A},{\gamma}_{2A},{\gamma}_{3B})-g({\gamma}_{1A},{\gamma}_{2A},{\gamma}_{3A})]+\hfill \\ \hfill & \frac{2}{6}[g({\gamma}_{1B},{\gamma}_{2B},{\gamma}_{3B})-g({\gamma}_{1B},{\gamma}_{2B},{\gamma}_{3A})]+\hfill \\ \hfill & \frac{1}{6}[g({\gamma}_{1A},{\gamma}_{2B},{\gamma}_{3B})-g({\gamma}_{1A},{\gamma}_{2B},{\gamma}_{3A})]+\hfill \\ \hfill & \frac{1}{6}[g({\gamma}_{1B},{\gamma}_{2A},{\gamma}_{3B})-g({\gamma}_{1B},{\gamma}_{2A},{\gamma}_{3A})],\hfill \end{array}$$

(25)

where the three terms in (23)–(25) sum to the “total” difference:

$$\Delta {g}_{1}+\Delta {g}_{2}+\Delta {g}_{3}=g({\gamma}_{1B},{\gamma}_{2B},{\gamma}_{3B})-g({\gamma}_{1A},{\gamma}_{2A},{\gamma}_{3A}).$$

(26)
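A minimal Python sketch of the three-factor standardization in (23)–(25) follows. It evaluates *g* at every combination of group labels and applies the 2/6 and 1/6 weights, then confirms the adding-up property in (26). The Gompertz baseline, the group values in `PAR`, and the function names are assumptions for illustration only.

```python
import math

A, C = -6.0, 0.01  # hypothetical Gompertz baseline: q21(s) = exp(A + C*s)

def int_q21(lo, hi):
    """Closed-form integral of q21 over [lo, hi]."""
    return math.exp(A) / C * (math.exp(C * hi) - math.exp(C * lo))

# Assumed group values: linear predictor b*x and onset time t1 (months).
PAR = {"t": 286.5, "alpha": -0.02,
       "bx": {"A": 0.0, "B": 0.30},
       "t1": {"A": 226.5, "B": 213.0}}

def g(direct, exposure, covariate, par=PAR):
    """Prevalence g(gamma1, gamma2, gamma3); each argument is 'A' or 'B'."""
    bx = par["bx"][direct]            # gamma1: direct effect of x
    lower = par["t1"][exposure]       # gamma2: lower limit of integration
    t1_cov = par["t1"][covariate]     # gamma3: t1 entering exp(alpha*t1)
    hazard = math.exp(par["alpha"] * t1_cov + bx) * int_q21(lower, par["t"])
    return 1.0 - math.exp(-hazard)

def das_gupta(f):
    """Standardized effects (dg1, dg2, dg3) with the weights in (23)-(25)."""
    def effect(k):
        def diff(o1, o2):
            # Switch factor k from A to B, holding the other two at (o1, o2).
            b = [o1, o2]; b.insert(k, "B")
            a = [o1, o2]; a.insert(k, "A")
            return f(*b) - f(*a)
        return (2 * diff("A", "A") + 2 * diff("B", "B")
                + diff("A", "B") + diff("B", "A")) / 6.0
    return effect(0), effect(1), effect(2)

dg1, dg2, dg3 = das_gupta(g)
total = g("B", "B", "B") - g("A", "A", "A")
print(dg1, dg2, dg3)
print(dg1 + dg2 + dg3, total)  # the two values agree
```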

The expressions in (23)–(25) provide a Das Gupta-type decomposition for the absolute difference in (19). To obtain a similar decomposition for (20), rewrite the left-hand-side ratio in (20) as

$$\frac{1-{S}_{2}(t\mid {t}_{1B},{\mathbf{x}}_{B})}{1-{S}_{2}(t\mid {t}_{1A},{\mathbf{x}}_{A})}=\frac{{g}_{B}({\gamma}_{1},{\gamma}_{2},{\gamma}_{3})}{{g}_{A}({\gamma}_{1},{\gamma}_{2},{\gamma}_{3})}.$$

(27)

Then taking logarithms yields

$$\mathrm{log}\left(\frac{1-{S}_{2}(t\mid {t}_{1B},{\mathbf{x}}_{B})}{1-{S}_{2}(t\mid {t}_{1A},{\mathbf{x}}_{A})}\right)=\mathrm{log}\phantom{\rule{thinmathspace}{0ex}}{g}_{B}({\gamma}_{1},{\gamma}_{2},{\gamma}_{3})-\mathrm{log}\phantom{\rule{thinmathspace}{0ex}}{g}_{A}({\gamma}_{1},{\gamma}_{2},{\gamma}_{3}).$$

(28)

But because (28) is an arithmetic difference in log *g*, replacing *g* by log *g* in Eqs. (23)–(26) will provide an analogous decomposition for relative prevalence.
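The log-scale version can be sketched in the same way: apply the standardization to log *g*, then exponentiate each component to obtain factors whose product equals the relative prevalence. For brevity this sketch assumes a constant baseline hazard; the values of `Q`, `T`, `ALPHA`, `BX`, and `T1` are hypothetical.

```python
import math

Q, T, ALPHA = 0.05, 286.5, -0.02         # assumed constant baseline, age t, alpha
BX = {"A": 0.0, "B": 0.30}               # assumed linear predictors b*x
T1 = {"A": 226.5, "B": 213.0}            # assumed onset times (months)

def g(direct, exposure, covariate):
    """Prevalence with each of the three factors taken from group 'A' or 'B'."""
    hazard = math.exp(ALPHA * T1[covariate] + BX[direct]) * Q * (T - T1[exposure])
    return 1.0 - math.exp(-hazard)

def log_g(direct, exposure, covariate):
    return math.log(g(direct, exposure, covariate))

def das_gupta(f):
    """Standardized three-factor effects with weights 2/6, 2/6, 1/6, 1/6."""
    def effect(k):
        def diff(o1, o2):
            b = [o1, o2]; b.insert(k, "B")
            a = [o1, o2]; a.insert(k, "A")
            return f(*b) - f(*a)
        return (2 * diff("A", "A") + 2 * diff("B", "B")
                + diff("A", "B") + diff("B", "A")) / 6.0
    return effect(0), effect(1), effect(2)

# Additive components on the log scale become multiplicative factors.
f1, f2, f3 = (math.exp(c) for c in das_gupta(log_g))
relative_prevalence = g("B", "B", "B") / g("A", "A", "A")
print(f1, f2, f3)
print(f1 * f2 * f3, relative_prevalence)  # the two values agree
```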

We now turn to the more general case in which the transition from *T*_{1} to *T*_{2} varies both with age and duration. We generalize (9) by considering the case in which the hazard *r*_{2}(*t* | *t*_{1}, **x**) for *T*_{2} is given by a specification in which the age and duration dependence are separable:

$${r}_{2}(t,u\mid {t}_{1},\mathbf{x})={q}_{21}(t\mid {t}_{1}){q}_{22}\left(u\right)\mathrm{exp}(\alpha {t}_{1}+\mathbf{b}\mathbf{x}),$$

(29)

where *q*_{21}(*t* | *t*_{1}) reflects the dependence of *T*_{2} on *t* given entry into risk at time *t*_{1} and *q*_{22}(*u*) the dependence of *T*_{2} on duration *u*, with *u* = *t* – *t*_{1} denoting duration since *T*_{1} onset. Under (29), the survivor function *S*_{2}(*t*, *u* | *t*_{1}, **x**) will be given by the double integral

$$\begin{array}{cc}\hfill {S}_{2}(t,u\mid {t}_{1},\mathbf{x})& =\mathrm{exp}\left[-{\int}_{{t}_{1}}^{t}{\int}_{0}^{u}{r}_{2}(s,v\mid {t}_{1},\mathbf{x})\,ds\,dv\right]\hfill \\ \hfill & =\mathrm{exp}\left[-\mathrm{exp}(\alpha {t}_{1}+\mathbf{b}\mathbf{x}){\int}_{{t}_{1}}^{t}{q}_{21}(s)\,ds{\int}_{0}^{u}{q}_{22}(v)\,dv\right].\hfill \end{array}$$

(30)

As before, consider a thought experiment in which groups *A* and *B* are identical in all respects, including the timing of *T*_{1} onset, except that *x*_{1} for groups *A* and *B* differs by a constant, i.e., *x*_{1B} = *x*_{1A} + Δ. Then from (30), the expression in (11) for the cumulative relative risk for group *B* to that for group *A* remains unchanged, since

$$\begin{array}{cc}\hfill \frac{{H}_{2}(t,u\mid {t}_{1},{\mathbf{x}}_{B})}{{H}_{2}(t,u\mid {t}_{1},{\mathbf{x}}_{A})}& =\frac{{\int}_{{t}_{1}}^{t}{\int}_{0}^{u}{q}_{21}\left(s\right)\phantom{\rule{thinmathspace}{0ex}}{q}_{22}\left(v\right)\phantom{\rule{thinmathspace}{0ex}}\mathrm{exp}[\alpha {t}_{1}+{b}_{1}({x}_{1i}+\Delta )+\cdots ]\phantom{\rule{thinmathspace}{0ex}}ds\phantom{\rule{thinmathspace}{0ex}}dv}{{\int}_{{t}_{1}}^{t}{\int}_{0}^{u}{q}_{21}\left(s\right)\phantom{\rule{thinmathspace}{0ex}}{q}_{22}\left(v\right)\phantom{\rule{thinmathspace}{0ex}}\mathrm{exp}[\alpha {t}_{1}+{b}_{1}{x}_{1i}+\cdots ]\phantom{\rule{thinmathspace}{0ex}}ds\phantom{\rule{thinmathspace}{0ex}}dv}\hfill \\ \hfill & =\mathrm{exp}\left({b}_{1}\Delta \right).\hfill \end{array}$$

(31)

By contrast, the absolute and relative difference in *T*_{2} prevalence in (12) and (13) are now given, respectively, by

$${S}_{2}(t,u\mid {t}_{1},{\mathbf{x}}_{A})-{S}_{2}(t,u\mid {t}_{1},{\mathbf{x}}_{B})=\mathrm{exp}[-{H}_{2}(t,u\mid {t}_{1},{\mathbf{x}}_{A}\left)\right]-\mathrm{exp}[-\mathrm{exp}({b}_{1}\Delta \left){H}_{2}\right(t,u\mid {t}_{1},{\mathbf{x}}_{A}\left)\right],$$

(32)

and

$$\frac{1-{S}_{2}(t,u\mid {t}_{1},{\mathbf{x}}_{B})}{1-{S}_{2}(t,u\mid {t}_{1},{\mathbf{x}}_{A})}=\frac{1-\mathrm{exp}[-{H}_{2}(t,u\mid {t}_{1},{\mathbf{x}}_{B}\left)\right]}{1-\mathrm{exp}[-{H}_{2}(t,u\mid {t}_{1},{\mathbf{x}}_{A}\left)\right]}=\frac{1-\mathrm{exp}[-\mathrm{exp}({b}_{1}\Delta \left){H}_{2}\right(t,u\mid {t}_{1},{\mathbf{x}}_{A}\left)\right]}{1-\mathrm{exp}[-{H}_{2}(t,u\mid {t}_{1},{\mathbf{x}}_{A}\left)\right]}.$$

(33)

Now consider two groups of individuals, *A* and *B*, who are identical in all respects save for the timing of *T*_{1}, and suppose that the random variable *T*_{1A} is realized as *t*_{1} for group *A* and that *T*_{1B} is realized as *t*_{1} + Δ for group *B*. Then the cumulative relative risk in (12) will now be given by the double integral:

$$\begin{array}{cc}\hfill \frac{{H}_{2}(t,u\mid {t}_{1B},\mathbf{x})}{{H}_{2}(t,u\mid {t}_{1A},\mathbf{x})}& =\frac{{\int}_{{t}_{1}+\Delta}^{t}{\int}_{0}^{u-\Delta}{q}_{21}\left(s\right)\phantom{\rule{thinmathspace}{0ex}}{q}_{22}\left(v\right)\phantom{\rule{thinmathspace}{0ex}}\mathrm{exp}\left[\alpha \right({t}_{1}+\Delta )+\mathbf{b}\mathbf{x}]\phantom{\rule{thinmathspace}{0ex}}ds\phantom{\rule{thinmathspace}{0ex}}dv}{{\int}_{{t}_{1}}^{t}{q}_{21}\left(s\right){\int}_{0}^{u}{q}_{22}\left(v\right)\phantom{\rule{thinmathspace}{0ex}}\mathrm{exp}[\alpha {t}_{1}+\mathbf{b}\mathbf{x}]\phantom{\rule{thinmathspace}{0ex}}ds\phantom{\rule{thinmathspace}{0ex}}dv}\hfill \\ \hfill & =\mathrm{exp}\left(\alpha \Delta \right)\frac{{\int}_{{t}_{1}+\Delta}^{t}{q}_{21}\left(s\right)\phantom{\rule{thinmathspace}{0ex}}ds{\int}_{0}^{u-\Delta}{q}_{22}\left(v\right)\phantom{\rule{thinmathspace}{0ex}}dv}{{\int}_{{t}_{1}}^{t}{q}_{21}\left(s\right)\phantom{\rule{thinmathspace}{0ex}}ds{\int}_{0}^{u}{q}_{22}\left(v\right)\phantom{\rule{thinmathspace}{0ex}}dv}.\hfill \end{array}$$

(34)
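The factors in (34) can be sketched as follows. The age baseline is an assumed Gompertz form, and the duration baseline is piecewise constant with breaks at 36 and 72 months; the duration multipliers exp(−.40) and exp(−1.01) are borrowed from the duration effects we report for Model 6 purely for illustration, and every other value and name is hypothetical.

```python
import math

A, C = -6.0, 0.01  # hypothetical Gompertz age baseline: q21(s) = exp(A + C*s)

def int_q21(lo, hi):
    """Closed-form integral of q21 over [lo, hi]."""
    return math.exp(A) / C * (math.exp(C * hi) - math.exp(C * lo))

def int_q22(u, cuts=(36.0, 72.0),
            rates=(1.0, math.exp(-0.40), math.exp(-1.01))):
    """Integral over [0, u] of a piecewise-constant duration baseline q22."""
    total, start = 0.0, 0.0
    for end, rate in zip(cuts + (math.inf,), rates):
        if u <= start:
            break
        total += rate * (min(u, end) - start)
        start = end
    return total

def crr_shift_t1(t, t1, delta, alpha):
    """Factors of (34): exp(alpha*delta), age ratio, duration ratio."""
    u = t - t1                                   # duration for group A
    usual = math.exp(alpha * delta)
    age_ratio = int_q21(t1 + delta, t) / int_q21(t1, t)
    dur_ratio = int_q22(u - delta) / int_q22(u)  # group B has duration u - delta
    return usual, age_ratio, dur_ratio

# Assumed values: group B's onset is 13.5 months earlier than group A's.
usual, age_ratio, dur_ratio = crr_shift_t1(t=286.5, t1=226.5, delta=-13.5, alpha=-0.02)
print(usual, age_ratio, dur_ratio, usual * age_ratio * dur_ratio)
```

With duration dependence present, exposure enters through both ratios: the lower limit of the age integral and the upper limit of the duration integral.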

The difference in *T*_{2} prevalence between groups *B* and *A* also involves a change to double integrals:

$$\begin{array}{cc}\hfill [1-{S}_{2}(t,u\mid {t}_{1B},\mathbf{x})]-[1-{S}_{2}(t,u\mid {t}_{1A},\mathbf{x})]=& (1-\mathrm{exp}[-{H}_{2}(t,u\mid {t}_{1B},\mathbf{x})])-(1-\mathrm{exp}[-{H}_{2}(t,u\mid {t}_{1A},\mathbf{x})])\hfill \\ \hfill =& \mathrm{exp}\left[-\mathrm{exp}(\alpha {t}_{1}+\mathbf{b}\mathbf{x}){\int}_{{t}_{1}}^{t}{q}_{21}(s)\,ds{\int}_{0}^{u}{q}_{22}(v)\,dv\right]-\hfill \\ \hfill & \mathrm{exp}\left[-\mathrm{exp}(\alpha [{t}_{1}+\Delta ]+\mathbf{b}\mathbf{x}){\int}_{{t}_{1}+\Delta}^{t}{q}_{21}(s)\,ds{\int}_{0}^{u-\Delta}{q}_{22}(v)\,dv\right],\hfill \end{array}$$

(35)

with the relative *T*_{2} prevalence for groups *B* and *A* given by the ratio:

$$\frac{1-{S}_{2}(t,u\mid {t}_{1B},\mathbf{x})}{1-{S}_{2}(t,u\mid {t}_{1A},\mathbf{x})}=\frac{1-\mathrm{exp}[-\mathrm{exp}(\alpha [{t}_{1}+\Delta ]+\mathbf{b}\mathbf{x}\left){\int}_{{t}_{1}+\Delta}^{t}{q}_{21}\right(s\left)\phantom{\rule{thinmathspace}{0ex}}ds{\int}_{0}^{u-\Delta}{q}_{22}\right(v\left)\phantom{\rule{thinmathspace}{0ex}}dv\right]}{1-\mathrm{exp}[-\mathrm{exp}(\alpha {t}_{1}+\mathbf{b}\mathbf{x}\left){\int}_{{t}_{1}}^{t}{q}_{21}\right(s\left)\phantom{\rule{thinmathspace}{0ex}}ds{\int}_{0}^{u}{q}_{22}\right(v\left)\phantom{\rule{thinmathspace}{0ex}}dv\right]}.$$

(36)

Finally, we consider the general case in which a shift from *x*_{1} to *x*_{1} + Δ induces a shift from *t*_{1} to *t*_{1} + Δ*t*_{1p}, thus generating both direct and indirect effects for the *T*_{2} equation. Let *u* = *t* – Δ*t*_{1p}; then the cumulative relative risk is given by combining (34) and (36):

$$\frac{{H}_{2}(t\mid {t}_{1B},{\mathbf{x}}_{B})}{{H}_{2}(t\mid {t}_{1A},{\mathbf{x}}_{A})}=\mathrm{exp}\left({b}_{1}\Delta \right)\mathrm{exp}\left(\alpha \Delta {t}_{1p}\right)\frac{{\int}_{{t}_{1A\pi}+\Delta {t}_{1p}}^{t}{q}_{21}\left(s\right)ds{\int}_{0}^{u-\Delta {t}_{1p}}{q}_{22}\left(v\right)dv}{{\int}_{{t}_{1A\pi}}^{t}{q}_{21}\left(s\right)ds{\int}_{0}^{u}{q}_{22}\left(v\right)dv}.$$

(37)

Thus as in (18), the consequence of shifting from *x*_{1} to *x*_{1} + Δ in the *T*_{1} equation appears in three places in (36) and (37): a direct effect of *x*_{1} on *T*_{2} represented by the quantity *b*_{1}Δ, and two indirect effects of *x*_{1} via *T*_{1}—an indirect effect of exposure represented by the lower and upper limits of integration, and the more usual indirect effect represented by the quantity *α*Δ*t*_{1p}.

We illustrate these methods using data on white women from the 1988 National Survey of Family Growth (NSFG), a retrospective survey of women aged 15–44 in 1988. We examine two young adult transitions, initiation of sexual activity and a premarital first birth—that is, a first birth that occurs prior to a first marriage. NSFG respondents were asked to supply the calendar year and month of first sexual intercourse; we converted these data into age (in months) at first intercourse. For this transition, we censored a woman’s data either at her age at the 1988 NSFG interview if she reported never having experienced sexual intercourse or at her age of first marriage if she reported initiating sexual activity during the same month of first marriage or at some later time. We constructed age at a premarital first birth using the NSFG first birth and first marriage histories. For this transition, we censored a woman’s data either at her age at first marriage if she married prior to a first birth or at her age at interview if neither a birth nor marriage occurred prior to interview.

Our empirical examples use a uniform set of covariates to model both the *T*_{1} transition (to first sexual intercourse) and the *T*_{2} transition (to a premarital first birth). The covariates we examine are: snapshot measures of the respondent’s family structure at age 14, education of the respondent’s mother, age of the respondent’s mother at first birth, a dummy variable for Catholic religion at interview, and a dummy variable equal to one if we employed a hot-deck procedure to impute calendar month at first intercourse. Our family structure variables are snapshot measures that contrast women who resided in two-biological-parent, mother-only, step, and other types of families at age 14. As noted in the previous section, in modeling *T*_{2}, one can also include the timing of *T*_{1} as a non-time-varying, and duration since *T*_{1} as a time-varying, right-hand-side covariate. Age at first intercourse and duration since first intercourse, then, are the only variables that appear in the *T*_{2} equation but not the *T*_{1} equation.

The NSFG contains data on 4,911 white women. We dropped cases if a first birth was reported prior to first intercourse or if data on first intercourse, first birth, first marriage, or family structure were missing. Of the full sample of 4,911 white women, 157 were missing data on either first intercourse or a premarital first birth, and another 2 women were missing data on family structure, yielding a sample of 4,752 white women.

Figure 1 presents smoothed nonparametric estimates using a procedure described in self-citation 3 for the age-graded risk of entry into sexual activity, the age-graded risk of a premarital first birth, and the duration-graded risk of a premarital first birth conditional on entry into sexual activity. The first panel of Figure 1 plots smoothed nonparametric estimates of the logarithm of the hazard rate of first sexual intercourse by age, the middle panel plots two different estimates of the logarithm of the hazard rate for a premarital first birth, and the bottom panel plots estimates of the logarithm of the hazard rate for a premarital first birth by duration since sexual onset. In the upper two panels, the curves for the logarithm of the rate rise in a roughly linear fashion to about age 18.5, after which the curves decline, again in a roughly linear fashion.

Smoothed nonparametric estimates of: (a) age dependence in the logarithm of the hazard for the transition to first sexual intercourse, (b) age dependence in the transition to a premarital first birth, and (c) duration dependence in the transition from **...**

In the middle panel of Figure 1, the two curves differ in the assumptions they make about when women become at risk of a premarital first birth. The solid curve presents estimates that do not place a woman at risk of a premarital first birth until she reports becoming sexually active; hence, for this curve, we use a woman’s report of age at first intercourse to left-truncate her premarital birth history. The dotted curve presents estimates that ignore this left truncation; hence, while this curve can be viewed as the average of the logarithm of premarital first birth risks in the population, it ignores variation in onset of sexual activity and implicitly assumes that women are at risk of a premarital first birth even if they have not initiated sexual activity, an implausible assumption. A comparison of the two curves in the middle panel of Figure 1 shows that left truncation affects estimates substantially, with the curve ignoring left truncation systematically underestimating premarital first birth risks relative to the curve that incorporates left truncation. Differences between these two curves are especially apparent at younger ages; substantively, this reflects the tendency for premarital birth risks to be especially high among teenaged women in the period just following the initiation of sexual activity.

The nonparametric estimates in the bottom panel of Figure 1 exhibit less variation in premarital first birth risks by duration, as opposed to the clear patterns of age variation observed in the middle panel of Figure 1. Based on these nonparametric results, we model age dependence in both the *T*_{1} and *T*_{2} equations using a splined piecewise Gompertz specification with nodes at ages 16.5, 18.5, and 21.0 (e.g. self-citation 4; Lillard 1993). For the *T*_{2} equation, age at first sex was modeled as a simple fixed covariate, and duration-dependence (since onset of sexual activity) was modeled using a piecewise exponential specification for durations 0 to 35, 36 to 71, and 72 or more months.^{3}
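One way to picture the splined piecewise Gompertz form is as a log hazard that is linear within each age segment and continuous at the nodes. The sketch below uses the nodes given above (in years of age); the intercept, the slopes, and the function name are hypothetical values chosen to mimic a rise to about age 18.5 followed by a decline, and are not our estimates.

```python
import math

KNOTS = (16.5, 18.5, 21.0)  # nodes in years of age, as stated in the text

def log_hazard(age, intercept, slopes):
    """Continuous piecewise-linear log hazard.

    slopes holds one slope per segment, so len(slopes) == len(KNOTS) + 1.
    """
    edges = (0.0,) + KNOTS + (math.inf,)
    value = intercept
    for lo, hi, slope in zip(edges[:-1], edges[1:], slopes):
        if age > lo:
            # Add this segment's contribution up to min(age, segment end).
            value += slope * (min(age, hi) - lo)
    return value

# Hypothetical parameters: log hazard rises to age 18.5, then declines.
INTERCEPT = -12.0
SLOPES = (0.40, 0.25, -0.10, -0.15)

for age in (15.0, 16.5, 18.5, 21.0, 25.0):
    print(age, round(log_hazard(age, INTERCEPT, SLOPES), 3))
```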

We now proceed to our parametric models. To simplify the discussion, we focus attention on tracing the direct and indirect effects of family structure on our two transitions; however, our underlying models control both for family structure and the other background characteristics. Finally, and as noted above, when modeling *T*_{2}, one can also specify an effect for the observed timing of *T*_{1} as a non-time-varying right-hand-side covariate.

Table 1 presents coefficient estimates for family structure and, in some models of premarital first birth risks, coefficient estimates for age at first intercourse and for duration since sexual onset.^{4}

Effects of family structure on the age-specific rate of first sexual intercourse, a premarital first birth ignoring exposure since onset of first sexual intercourse, and a premarital first birth correcting for exposure since onset of first sexual intercourse. **...**

Model 1 presents estimates for the transition to first sexual intercourse. As in previous research, we observe higher risks for women who resided at age 14 in families that did not contain both biological parents relative to women who resided with both biological parents at age 14. The coefficient of .51 for the effect of residing in a mother-only family at age 14 indicates an age-specific rate of first sexual intercourse that is exp(.51) or 1.67 times higher than the rate for women who resided with both biological parents at age 14. For women residing in a step family at age 14 and for women in some other family arrangement at age 14, the model coefficients of .69 and .58 indicate rates of first sexual intercourse 1.99 and 1.79 times greater, respectively, than the rate for women who resided with both biological parents at age 14.
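The conversion from a proportional hazard coefficient to a rate ratio is simple exponentiation. The short check below uses the three Model 1 coefficients quoted above; the dictionary keys are labels introduced here for convenience.

```python
import math

# Model 1 family structure coefficients as quoted in the text
# (reference group: two biological parents at age 14).
coefs = {"mother-only": 0.51, "step": 0.69, "other": 0.58}

# A coefficient b implies a rate exp(b) times that of the reference group.
ratios = {family: math.exp(b) for family, b in coefs.items()}

for family, ratio in ratios.items():
    print(f"{family}: {ratio:.2f}")
```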

The remaining models in Table 1 examine the transition to a premarital first birth and reflect different assumptions about the effects of age at first intercourse on premarital first births and about the variation with duration in premarital first birth risks. Model 2 neither left-truncates exposure to risk using the respondent’s reported age at first intercourse nor includes an effect for age at first intercourse specified as a non-time-varying right-hand-side covariate. Thus, Model 2 contains no term involving *T*_{1}; hence, it assumes that risks are *identical* (conditional on family structure and the other background variables) for women who have and have not initiated sexual activity. Model 3 differs from Model 2 by including the respondent’s reported age at first intercourse as a non-time-varying right-hand-side covariate for the respondent’s risk of a premarital first birth. Model 3 does condition on *T*_{1}, but does so by including the realized value of *T*_{1} for a given respondent as a non-time-varying right-hand-side covariate. Hence, it assumes, somewhat implausibly, that women are at risk of a premarital first birth both before and after initiation of sexual activity, but models variation in premarital first birth risks for women who initiate sexual activity at different ages by a proportionality constant. See also Kiernan and Hobcraft (1997).

By contrast, Models 4–6 use a woman’s reported age at first sexual intercourse to left truncate her risk of a premarital first birth; thus, these models do not place a woman at risk of a premarital first birth until she reports initiating sexual activity. Model 5 differs from Model 4 by adding age at first sexual intercourse as a non-time-varying right-hand-side covariate. Model 6 uses the full specification in (29), incorporating one baseline hazard for age dependence and a second baseline hazard for duration dependence.^{5} Comparing coefficient estimates in Models 5 and 6 shows that despite substantially lower risks (exp(−.40) = .67 and exp(−1.01) = .36, corresponding to a 33 and 64 percent lower risk) at durations 36–71 and 72+ months since sexual onset, adding a baseline hazard for duration dependence in premarital first birth risks changes the coefficient estimates for the other covariates only slightly.

Table 2 gives predicted values for the median age at first intercourse by family structure at age 14 using estimates from Model 1 in Table 1, with estimates evaluated at the mean values for all background covariates. The predicted median age at first intercourse is 226.5, 213.0, 209.1, and 211.3 months for white women who resided at age 14 in a two biological parent family, a mother-only family, a step family, or in the residual category for other types of families, respectively. The second column gives deviations from the predicted median for women who resided in an intact family at age 14. The median age at first intercourse is between 13.5 months and 17.4 months later for women who resided in an intact family at age 14 relative to the other three types of nonintact families, with the magnitude of these deviations corresponding to the hazard model parameter estimates reported in column 1 of Table 1.

Predicted median age (in months) at first sexual intercourse for women residing in intact, mother-only, step, and other families at age 14 and predicted difference relative to women residing in an intact family at age 14. White women, National Survey **...**

Table 3 evaluates the indirect effects of the family structure variables on the cumulative relative risk of a premarital first birth given by (18) for white women, using estimated coefficients from Model 6 of Table 1. For most women in this sample, exposure to the risk of a premarital birth is ended by a transition to first marriage, so time t in columns two and four can be viewed counterfactually as the risks that a hypothetical woman would experience following onset of sexual activity but prior to entry into marriage. As in the hazard regressions in Table 1, white women who resided in a two biological parent family at age 14 form the baseline group, while as in Table 2, we set the remaining background covariates to their respective sample means and assess variation in age at first intercourse using the predicted median age at onset.

Decomposition of direct and indirect effects (in percentage change) for the cumulative relative risk of a premarital first birth for white women under alternative simulations for the median age at first sexual intercourse by family structure at age 14. **...**

We begin with the comparison in row 1 of Table 3. This row gives the indirect effect of a covariate *x* on *T*_{2}, in this case, having resided in a mother-only family at age 14 on the cumulative relative risk of a premarital first birth, given in expression (18). Recall that the indirect effect of *x* on *T*_{2} occurs through its influence on *T*_{1}, which we evaluate using the predicted median of the *T*_{1} distribution. From Table 2, we have that the predicted median age at first intercourse was 226.5 and 213.0 months for white women who resided in an intact and mother-only family at age 14, respectively; these in turn imply values of *t*_{1} = 226.5 and Δ = −13.5 in the first row of Table 3. In addition, the expression for the cumulative relative risk in (18) requires that one choose some age t at which to evaluate the cumulative relative risk. In the first row of results, we have chosen an age equal to 60 months after the predicted median age at onset for women in the omitted category—those who resided in a two-biological family at age 14; this corresponds to *t*_{1} + 60 = 286.5 months, or roughly age 24.

What is the indirect effect of having resided in a mother-only family at age 14 on a premarital first birth, relative to having resided in an intact family at age 14? The total indirect effect, evaluated at age 286.5 months, is to raise the cumulative risk by 51.8 percent relative to women who resided in an intact family at age 14. This 51.8 percent increase can be decomposed into an indirect component for exposure, corresponding to a 19.0 percent increase in the cumulative relative risk, and a more usual indirect component, corresponding to a 27.5 percent increase, with the decomposition 1.518 = 1.190 × 1.275 given by the expression in (18). Put another way, these results state that, all else being equal, having resided in a mother-only family at age 14 is associated with a 13.5 month earlier entry into sexual activity relative to women who resided in an intact family at age 14. The consequence of this 13.5 month difference is to increase the cumulative relative risk of a premarital first birth by 51.8 percent over a 5 year period, with .37 of this increase (19.0/51.8) attributable to the 13.5 months of increased exposure to risk.
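The arithmetic behind this decomposition can be verified directly from the three reported factors. The variable names below are introduced here for illustration; the numbers are those quoted above, so the product matches the total only up to rounding.

```python
# Factors reported in the text for the mother-only comparison (row 1 of Table 3).
exposure, usual, total = 1.190, 1.275, 1.518

# The exposure and usual indirect factors multiply to the total indirect effect.
product = exposure * usual

# Share of the percentage increase attributed to exposure: 19.0 / 51.8.
share = (exposure - 1.0) / (total - 1.0)

print(f"product of factors: {product:.4f}")
print(f"share attributed to exposure: {share:.2f}")
```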

The next two rows of Table 3 present results for having resided in a step family or the residual “other” category of family situation at age 14. In both cases, the indirect effect of having resided in these nonintact family situations is to raise the cumulative risk of a premarital first birth relative to having resided in an intact family. Across all three rows, the indirect effect of exposure is roughly two-thirds the magnitude of the more usual indirect effect. Note that risks are highest for women who, at age 14, resided in a step family, followed by those who resided in other types of families, then for those who resided in a mother-only family. Because Table 3 reports estimates for the *indirect* effects of family structure on the cumulative relative risk of *T*_{2}, this ordering of effects is generated by the parameter estimates for family structure in the *T*_{1} equation (column 1 of Table 1), not by the parameter estimates for family structure in the *T*_{2} equation (column 5 of Table 1).

The results presented in the remaining rows show that the magnitude of the indirect effects of exposure varies with the time frame used to gauge exposure. While the comparisons in rows 1–3 suppose that women remain at risk of a premarital first birth during the first 60 months after the median age of first sex for women from an intact family, the comparisons in rows 4–6 extend the exposure time to 90 months. As noted earlier, women typically exit the risk of a premarital first birth through first marriage, and the 90 month exposure time will not be representative of the average cumulative exposure time for respondents in the 1988 NSFG, but could be appropriate for more recent cohorts of women for whom first marriage has been increasingly delayed.

Because the more usual indirect effect corresponds to the estimated parameter for *T*_{1} when entered as a right-hand-side covariate in the *T*_{2} equation, the value of this effect, under a proportional hazard specification, does not vary with time. This is reflected in the next three rows of results, which evaluate the cumulative relative risk at 90 months after onset of sexual activity for women who resided in an intact family at age 14, with the usual indirect effect identical when evaluating the cumulative relative risk at 60 and 90 months after onset. By contrast, the indirect effect of exposure declines with greater durations of exposure as initially large differences in the cumulative relative risk decrease at later exposures. These decreases are also reflected in the values for the total indirect effects, which are smaller in rows 4–6 than in rows 1–3.
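The contrast drawn in this paragraph can be illustrated with a toy calculation. The sketch below assumes that the cumulative relative risk factors into an exposure ratio times exp(αΔ), and it further assumes a constant baseline hazard purely to keep the arithmetic transparent (the paper's baseline is a splined piecewise Gompertz, so the exposure values here are not expected to reproduce Table 3). The value α = −0.018 is the Gompertz estimate for age at first intercourse reported in the appendix table, and Δ = −13.5 months is the difference in onset quoted in the text; the function names are hypothetical.

```python
import math

ALPHA = -0.018   # coefficient on age at first intercourse (per month), as reported
DELTA = -13.5    # months by which the comparison group's onset is earlier, as reported

def usual_indirect_factor(alpha, delta):
    """Proportional shift in the T2 hazard from a delta-month change in T1.
    Under proportional hazards this does not depend on the evaluation time."""
    return math.exp(alpha * delta)

def exposure_factor_constant_hazard(duration, delta):
    """Ratio of integrated hazards when the comparison group has |delta| extra
    months at risk. ASSUMPTION: constant baseline hazard, so the integrated
    hazard is simply proportional to time at risk."""
    return (duration + abs(delta)) / duration

for months in (60, 90):
    print(months,
          round(usual_indirect_factor(ALPHA, DELTA), 3),
          round(exposure_factor_constant_hazard(months, DELTA), 3))
```

Under these assumptions the first factor is identical at 60 and 90 months, while the second shrinks toward one as the baseline duration grows, which is the qualitative pattern described above.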

The first 6 rows of Table 3 employ a single age *t* at which to base comparisons, using predicted medians for *T*_{1} to generate variation in exposure to the risk of *T*_{2}, a type of comparison depicted graphically in Figure 2. Rows 7–12 of Table 3 report indirect effects of family structure when equalizing the duration of exposure between the baseline and comparison groups (see Figure 3). These comparisons can be motivated substantively as supposing that women, on average, exit the risk of a premarital first birth not at particular ages but rather at particular durations following sexual onset, say, if the onset of sexual activity marked the start of women’s marital search.

Hypothetical indirect effect of a Δ change in *t*_{1A} when evaluated at a fixed age *t* when integrating the hazard rate for *T*_{2} for the baseline and comparison groups.

Hypothetical indirect effect of a Δ change in *t*_{1A} when evaluated at the same duration since onset when integrating the hazard rate for *T*_{2} for the baseline and comparison groups.

Equalizing the duration of exposure for the baseline and comparison groups virtually eliminates the indirect effect of exposure, with estimates of this effect close to zero.^{6} In these results, the overall indirect effect of residing in a nonintact family at age 14 is considerable, raising the relative risk of a premarital first birth by at least 25%. However, the indirect effect of family structure due to increased exposure to risk, in which women from nonintact families are observed to have earlier entries into sexual activity (and thus longer exposure to the risk of a premarital birth), can be attributed entirely to differences in their duration at risk, and not differences in the ages at which they are at risk, since equalizing durations of exposure in Table 3 reduces these indirect effects to values close to zero. That the overall indirect effect of family structure remains large is due to the coefficient for age at first intercourse when entered as a usual right-hand-side covariate, which will reflect differences between women with different ages at sexual onset not controlled for in the other covariates in the model.
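To make the distinction between fixed-age and equal-duration comparisons concrete, the sketch below integrates a baseline hazard over the two kinds of exposure windows. It assumes, for illustration only, that the *T*_{2} baseline is a single Gompertz segment in age with hypothetical parameters `beta` and `gamma`; these are not estimates from the paper. The onset ages of 226.5 and 213.0 months are those implied by the text (an evaluation age of 286.5 months, 60 months of exposure for the baseline group, and a 13.5 month earlier onset for the comparison group).

```python
import math

def gompertz_cumhaz(start, stop, beta, gamma):
    """Integral of exp(beta + gamma * t) from start to stop (gamma != 0)."""
    return math.exp(beta) * (math.exp(gamma * stop) - math.exp(gamma * start)) / gamma

def exposure_ratio(onset_a, onset_b, end_a, end_b, beta, gamma):
    """Integrated baseline hazard for group B divided by that for group A."""
    return (gompertz_cumhaz(onset_b, end_b, beta, gamma)
            / gompertz_cumhaz(onset_a, end_a, beta, gamma))

# Hypothetical baseline parameters; ages are in months.
beta, gamma = -6.0, 0.002
onset_a, onset_b, d = 226.5, 213.0, 60.0

# Fixed evaluation age: both groups are followed to the same age,
# so group B is at risk for longer than group A.
fixed_age = exposure_ratio(onset_a, onset_b, onset_a + d, onset_a + d, beta, gamma)

# Equal durations: each group is followed for d months after its own onset.
equal_dur = exposure_ratio(onset_a, onset_b, onset_a + d, onset_b + d, beta, gamma)

print(round(fixed_age, 3), round(equal_dur, 3))
```

Under this single-segment assumption the equal-duration ratio reduces to exp(γΔ), which is close to one whenever the baseline varies slowly with age and equals one only when the two integrated hazards coincide.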

It is important to note that marriage is an implicit competing risk in our model. Thus, the results in Table 3 as well as subsequent tables should be interpreted in terms of the usual competing risk counterfactual in which we ask how prevalence would be affected by exposure to risk were women to remain unmarried.

Table 4 reports estimates for absolute and relative *T*_{2} prevalence given in (19) and (20), respectively, paralleling the results in Table 3. Note, however, that these estimates report the consequence of both direct and indirect effects of a shift from *x*_{1} to *x*_{1} + Δ on *T*_{2} prevalence.

Absolute and relative differences for the probability of a premarital first birth for white women under alternative simulations for the median age at first sexual intercourse by family structure at age 14.

In the first row, we compare women who resided in an intact family and a mother-only family at age 14, and trace the implications of differences in age at onset of sexual activity for the probability of a premarital first birth in 60 months following onset of sexual activity. The column labeled Pr_{A} shows that 7.6 percent of women from an intact family are estimated to have a premarital first birth, while the column labeled Pr_{B} shows that the corresponding estimate for women from a mother-only family is 17.5 percent. The absolute difference (9.9 percent) and ratio (2.30) of these two estimates are reported in the subsequent two columns.
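The two summary columns described here are simple functions of the two estimated probabilities. The following lines reproduce them from the rounded values quoted in the text; the variable names are illustrative.

```python
# Estimated probabilities of a premarital first birth within 60 months of onset,
# as quoted in the text (rounded).
pr_a = 0.076   # women from an intact family
pr_b = 0.175   # women from a mother-only family

abs_diff = pr_b - pr_a    # absolute difference in prevalence
ratio = pr_b / pr_a       # relative prevalence

print(round(abs_diff, 3), round(ratio, 2))
```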

The next two rows give estimates for the probability of a premarital first birth during the first 60 months following onset of sexual activity for women who resided at age 14 in step and other types of families. Prevalence is nearly identical for women who at age 14 resided in a step family or in other types of families, in contrast to the higher cumulative relative risk for step families relative to other families in Table 3. This result, that the two groups are more similar in Table 4 than in Table 3, occurs for two related reasons. First, the estimates in Table 3 referred only to the indirect effect on *T*_{2} of a shift from *x*_{1} to *x*_{1} + Δ, while those in Table 4 combine both indirect *and* direct effects of a shift from *x*_{1} to *x*_{1} + Δ. Second, the relative magnitude of coefficients for family structure differs for the *T*_{1} and *T*_{2} equations (compare columns 1 and 5 of Table 1); note, in particular, that the coefficient for having resided in a step family is larger than that for having resided in an “other” type of family in the *T*_{1} equation, with this relationship reversing for the *T*_{2} equation. As a result, the ordering of coefficients in the *T*_{1} equation in Table 1 is directly reflected in the ordering of cumulative relative risks in Table 3, but attenuated for prevalence in Table 4.

Decomposition of direct and indirect effects for the absolute prevalence of a premarital first birth for white women under alternative simulations for the median age at first sexual intercourse by family structure at age 14.

The next three rows present estimates of absolute and relative prevalence of a premarital first birth for a 90 month period following onset of sexual activity. This longer period generates larger estimates of absolute prevalence, corresponding to longer durations at risk of the event, but slightly smaller estimates of relative prevalence.

The final three rows of Table 4 present estimates that equalize the duration of exposure between the baseline and comparison groups. These estimates exhibit slightly smaller differences because group *B*’s exposure to risk is now more similar to group *A*’s exposure to risk, with the difference in absolute prevalence decreasing, for example, from 9.9 to 7.4 percent and that for relative prevalence from 2.30 to 1.98 when comparing rows 1 and 7.

Taken together, Tables 3 and 4 show how the cumulative relative risk and probability of a premarital first birth are affected by direct and indirect effects of family structure.^{7} Table 3 shows that the indirect effects of family structure on cumulative relative risk, while somewhat smaller than the direct effects, are still substantial, with important indirect contributions both from exposure and from the usual indirect effect of family structure as a right-hand-side covariate. Table 4 shows that the combination of indirect and direct effects increases prevalence substantially—by about 10 percent in absolute terms and by about a factor of two in relative terms.

We now turn to decomposing the direct and indirect effects, not in terms of cumulative relative risk, but in terms of the absolute and relative differences in the probability of a premarital birth. Table 5 presents a decomposition of the direct and indirect effects on absolute differences in the prevalence of a premarital first birth, using the Das Gupta decomposition technique described previously. The first row of Table 5 decomposes the absolute differences between women who resided in a mother-only family at age 14 and an intact family at age 14, evaluated 60 months after the predicted median age at first sex for women in an intact family at age 14. Under this decomposition, the direct effect of living in a mother-only family at age 14 adds 5.2% to the prevalence of premarital first births. By comparison, the two indirect effects of living in a mother-only family via earlier age at first sex add 2.0% due to exposure and 2.7% due to the usual indirect effect to the prevalence of premarital first births.
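Because this decomposition is additive, the three components quoted for the first row should sum to the absolute difference in prevalence of 9.9 percentage points reported for the same comparison. The short sketch below checks this and expresses each component as a share of the total; the dictionary keys are illustrative labels, and the inputs are the rounded values from the text.

```python
# Percentage-point contributions quoted in the text for the first row of Table 5.
components = {"direct": 5.2, "indirect_exposure": 2.0, "indirect_usual": 2.7}

total = sum(components.values())
shares = {name: value / total for name, value in components.items()}

for name, share in shares.items():
    print(f"{name}: {share:.2f}")
print(f"total: {total:.1f}")
```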

The remaining rows in Table 5 mirror earlier results in Table 3, with the indirect effects generally smaller than the direct effects. In Table 5, however, we have decomposed the effects, not in terms of cumulative relative risk, but in terms of the increased percent of women who would have a premarital birth. The direct effects of the various types of nonintact family structure increase the prevalence of a premarital birth by about 5 to 10 percent relative to prevalence for those in intact families, while the pattern of results for the two indirect effects mirrors the pattern in Table 3, with the indirect effect of exposure declining to nearly zero values when equalizing exposure.

Table 6 presents a set of Das Gupta-style decompositions comparable to those in Table 5, but with the decompositions giving direct and indirect effects for the relative prevalence of a premarital first birth. The direct effects of residing in a nonintact family at age 14 increase the relative prevalence of a premarital birth by a factor of roughly 1.5 to 1.8. Again, the separate indirect effects are smaller than the direct effects, with the indirect effect of exposure close to one when equalizing exposure.

Decomposition of direct and indirect effects for the relative prevalence of a premarital first birth for white women under alternative simulations for the median age at first sexual intercourse by family structure at age 14.

Overall, these results provide potential insights into premarital first birth risks that cannot be obtained from conventional analyses that ignore the timing of sexual onset. For example, recall that our model of premarital first birth risks conditional on first sexual intercourse allows us to specify two dimensions of time: age and duration since onset. Our empirical results suggest one way in which such a specification may be potentially useful by letting one ask whether the indirect effect of exposure can be attributed to differences in the durations at risk, in the ages at risk, both, or neither. In our empirical results, women from nonintact families are observed to have earlier entries into sexual activity and thus longer exposure to the risk of a premarital birth. In addition, our results show that equalizing durations exposed to risk reduces the indirect effect of exposure to values close to zero. Thus, we find only small differences in the probability of a premarital first birth over the five-year period following initiation of sexual activity between women who initiated sexual activity at different ages, for example, at age 17 or 19. These findings thus suggest that the indirect effect of family structure on exposure can be attributed almost entirely to differences between these groups in their durations of exposure, and not differences in the ages at which they are exposed to risk. However, it is also important when weighing these observations to note that our estimates of the indirect effect of exposure are small relative to our estimates of direct effects and the more usual indirect effect.

This pattern of results (that direct effects are larger, sometimes substantially so, than the indirect effects of covariates, and that the indirect effect of exposure is consistently smaller than the more usual indirect effect) is perhaps the main finding that emerges from our empirical application. Taken at face value, these results might appear contrary to the view that the longer period of abstinence in intact families is responsible for the lower numbers of premarital first births for women who grew up in such families. Likewise, these results might appear to bolster the view that premarital birth risks are most strongly affected by factors that operate in the period following sexual onset. But because our results are based on observational data, our findings should not be interpreted as providing causal evidence but rather as providing regression-adjusted estimates of various quantities. Furthermore, a proponent of the abstinence and family structure hypothesis might object that our estimates are subject to omitted variable bias.

Nevertheless, the structure of our model together with this pattern of findings implies a narrower range of relevant unobservables than is usually the case. Consider, for example, our results in Table 6 for the direct and indirect effects of having resided in a single-mother family. The estimated direct and indirect effects in Table 6 are all positive; hence, consistent with previous findings, we find that having resided in a single-mother family raises the probability of a premarital first birth, with our estimated coefficients suggesting a direct effect of 56%, an indirect effect of exposure of 18%, and an indirect covariate effect of 26%. However, our estimated direct effect could well be subject to omitted variable bias.

For concreteness, consider the example of wealth as a potential candidate for an omitted *w* correlated with *x* whose omission may bias inferences about *x*. Because typical expectations concerning wealth would predict effects of the same sign as those for family structure, controlling for wealth would typically be expected to yield smaller estimates of both the direct and indirect effects of family structure. If so, the inclusion of an unobserved variable such as wealth would not alter the finding that the direct effect of *x* is larger than its corresponding indirect effect on exposure.

This example nevertheless suggests variants on standard expectations in which the inclusion of *w* could reverse the empirical finding that direct effects dominate indirect effects. For example, consider a *w* with the property that omitting *w* yields large biases in the estimated direct effect of *x* but no bias in the estimated indirect effect of *x* on exposure. Under this possibility, the “true” direct effect of having resided in a single-mother family could then be smaller than the true indirect effect of exposure (if, for example, controlling for *w* were to reduce this estimate from 56% to something smaller than 18%). Note, however, that this possibility also leaves the estimate for the indirect effect of *x* on exposure unchanged.

It is also possible to posit conditions under which omitting *w* will *downwardly* bias estimates of the indirect effect of exposure. As noted above, standard expectations about *w* include the assumption that an omitted *w* will have similar signed effects as *x*. If this does not hold—if wealth in our example were to have a positive effect on *T*_{1} but a negative effect on *T*_{2}—then omission of such a *w* could yield a too small estimate of the indirect effect of *x* on exposure. But as the wealth example also illustrates, it is more difficult to posit an unobservable *w* correlated with *x* in which *w* has the same signed effect as *x* for one transition but an effect that is opposite in sign from *x* for the other transition.

Overall, these examples illustrate how unobserved heterogeneity can yield biased estimates of the indirect effect of exposure, but they also suggest a far narrower range of potential unobservables *w* than is usually the case. It is important to emphasize, however, that these observations pertain to a specific set of hypotheses—those positing a large role for *T*_{1} exposure—and to situations in which (as is often the case) indirect effects are substantially smaller than direct effects. Note, in particular, that nothing in our framework or estimation procedures provides any novel insight concerning unobservables for hypotheses about the direct effects of covariates.

This paper has outlined a three-variable recursive proportional hazard model analogous to the classical three-variable recursive model for metric outcomes. As in the classical recursive model for metric outcomes, we posit a vector **x** of exogenous covariates, but depart from the classical model by considering two event processes *T*_{1} and *T*_{2} governed by a common time dimension *t*, in which the occurrence of *T*_{1} is assumed to determine entry into risk of *T*_{2}. An issue that arises in the recursive hazard model but not in the recursive linear regression model concerns the effect of exposure: variation in *T*_{1} will generate differential exposure to risk, and hence differential prevalence in the event *T*_{2}. We show that under proportionality, one can decompose the cumulative relative risk of *T*_{2} conditional on *T*_{1} into two multiplicative components, one of which reflects an effect of exposure and the other of which reflects an effect analogous to the traditional indirect effect in linear regression. Although this multiplicative decomposition does not obtain when comparing relative or absolute prevalence, we provide derivations for direct and indirect effects of **x** on *T*_{2} for absolute and relative measures of prevalence, as well as for the cumulative relative risk.

Because our method requires that one be able to evaluate the integral of the baseline hazard function, it is most easily applied to parametric models of the baseline hazard function. Our examples use a splined piecewise Gompertz specification for the baseline hazard, which is a highly flexible model for the baseline hazard, but other parametric choices can be utilized as well. Note, however, that use of a Cox proportional hazard model would present difficulties because it does not provide direct estimates of the baseline hazard function and, in particular, estimates suited to inverting the integral of the baseline hazard function.

We illustrate these methods using data on age at first intercourse and age at a premarital first birth, estimating the direct and indirect effects of family structure on the age-specific risk of a premarital birth. Our empirical analyses suggest that roughly one-half of the effect of nonintact family structure on the risk of a premarital first birth is due to the conventional direct effect of family structure, in which the risk of a premarital first birth among sexually active women is higher for women from nonintact families. However, we also found that between one-fourth and one-half of the effect of nonintact family structure on the risk of a premarital first birth is due to an indirect effect of family structure, in which family structure influences the risk of a premarital first birth indirectly through its effect on age at first sexual intercourse. The largest component of this indirect effect is that women from nonintact families have an earlier onset of sexual activity, with an earlier age at first sex, entered as a right-hand-side covariate for women’s risk of a premarital first birth, associated with higher risks of a premarital first birth. An additional indirect effect, which arises in a hazard context, is that an earlier onset of sexual activity will increase a woman’s duration of exposure to the risk of a premarital first birth. In all our examples, this component was the smallest of the effects of family structure on premarital first birth risks.

Thus, a consistent pattern that emerges from our empirical decompositions is that the direct effects of covariates are, in general, larger in magnitude than their corresponding indirect effects. These findings will be unsurprising to many social scientists, since direct effects very often dominate indirect effects. Because our data are observational, however, these findings cannot be regarded as causal, but rather as providing estimates of various regression-adjusted quantities. Nevertheless, we also show that unobservables must have highly specific characteristics if indirect effects are to dominate direct effects, a result that follows from the structure of our formal model. This implies a far smaller class of relevant unobserved confounds than is usually the case for hypotheses positing large indirect effects of exposure.

More generally, our empirical example of how one can decompose the direct and indirect effects of family structure on nonmarital childbearing illustrates the potential of this decomposition technique for modeling many other event processes. Whenever a hazard model outcome can be viewed in terms of two linked event processes, one gains the ability to distribute the effects of a covariate across the two event processes. By distinguishing between direct and indirect effects of the covariate, and by further distinguishing between a conventional indirect effect *and* an indirect effect related to duration of exposure, researchers can gain additional insight into the circumstances and dynamics under which an explanatory variable is associated with a particular outcome.

We gratefully acknowledge funding from NICHD (HD 29550) and helpful comments from the editor and anonymous reviewers. Earlier versions of this paper also benefitted from comments by participants at the 2008 Low Income Workshop, Institute for Poverty, University of Wisconsin-Madison.

Obtaining the indirect effect of **x** on *T*_{2} prevalence requires inverting the integral of *q*_{1}(*t*). The examples in this paper employ a piecewise splined Gompertz specification for the baseline hazard *q*_{1}(*t*) of *T*_{1}. Under proportionality, we have

$${r}_{1}\left(t\right)={q}_{1}\left(t\right)\phantom{\rule{thinmathspace}{0ex}}\mathrm{exp}\left(\mathbf{a}\mathbf{x}\right).$$

(A1)

Consider partitioning the time interval (*τ*_{0}, ∞) into *K* prespecified intervals (*τ*_{0}, *τ*_{1}], (*τ*_{1}, *τ*_{2}], …, (*τ*_{K–1}, ∞]; then a piecewise splined Gompertz specification for *q*_{1}(*t*) can be written as:

$${q}_{1}(t)=\begin{cases}\exp({\beta}_{1}+{\gamma}_{1}t) & t\in ({\tau}_{0},{\tau}_{1}];\\ \exp({\beta}_{2}+{\gamma}_{2}t) & t\in ({\tau}_{1},{\tau}_{2}];\\ \cdots & \\ \exp({\beta}_{K}+{\gamma}_{K}t) & t\in ({\tau}_{K-1},\infty ],\end{cases}$$

(A2)

for *γ*_{k} ≠ 0 and

$${H}_{1}\left(t\right)={\int}_{{\tau}_{0}}^{t}{r}_{1}\left(s\right)ds=\mathrm{exp}\left(\mathbf{a}\mathbf{x}\right){\int}_{{\tau}_{0}}^{t}{q}_{1}\left(s\right)ds=\mathrm{exp}\left(\mathbf{a}\mathbf{x}\right){Q}_{1}\left(t\right).$$

(A3)

Integrating *q*_{1}(*t*) yields:

$${Q}_{1}(t)=\begin{cases}{e}^{{\beta}_{1}}\left({e}^{{\gamma}_{1}t}-{e}^{{\gamma}_{1}{\tau}_{0}}\right)/{\gamma}_{1} & t\in ({\tau}_{0},{\tau}_{1}];\\ {Q}_{1}({\tau}_{1})+{e}^{{\beta}_{2}}\left({e}^{{\gamma}_{2}t}-{e}^{{\gamma}_{2}{\tau}_{1}}\right)/{\gamma}_{2} & t\in ({\tau}_{1},{\tau}_{2}];\\ \cdots & \\ {Q}_{1}({\tau}_{K-1})+{e}^{{\beta}_{K}}\left({e}^{{\gamma}_{K}t}-{e}^{{\gamma}_{K}{\tau}_{K-1}}\right)/{\gamma}_{K} & t\in ({\tau}_{K-1},\infty ].\end{cases}$$

(A4)

As noted in the text, our goal is to determine the *p*th percentile of the *T*_{1} distribution; this corresponds to inverting the integral of *q*_{1}(*t*). Set *Q*_{1}(*t*) = *x* and define the inverse function implicitly through ${Q}_{1}^{-1}\left(x\right)=t$. Suppose the desired percentile lies in the *k*th interval (*τ*_{k–1}, *τ*_{k}]; then from (A4)

$$\begin{aligned} x &= {Q}_{1}({\tau}_{k-1})+\left[\exp({\beta}_{k}+{\gamma}_{k}t)-\exp({\beta}_{k}+{\gamma}_{k}{\tau}_{k-1})\right]/{\gamma}_{k}\\ \exp({\beta}_{k}+{\gamma}_{k}t) &= \exp({\beta}_{k}+{\gamma}_{k}{\tau}_{k-1})+{\gamma}_{k}\left[x-{Q}_{1}({\tau}_{k-1})\right]\\ t &= \left[\log\left(\exp({\beta}_{k}+{\gamma}_{k}{\tau}_{k-1})+{\gamma}_{k}\left[x-{Q}_{1}({\tau}_{k-1})\right]\right)-{\beta}_{k}\right]/{\gamma}_{k}\end{aligned}$$

(A5)

Hence for *t* ∈ (*τ*_{k–1}, *τ*_{k}],

$${Q}_{1}^{-1}(x)=\left[\log\left(\exp({\beta}_{k}+{\gamma}_{k}{\tau}_{k-1})+{\gamma}_{k}\left[x-{Q}_{1}({\tau}_{k-1})\right]\right)-{\beta}_{k}\right]/{\gamma}_{k}.$$

(A6)
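The inversion in (A5) and (A6) can be sketched in a few lines of code. The functions below are a minimal illustration, not the authors' implementation: the function names, the knots `taus`, and the parameter values `betas`, `gammas`, and `ax` are all hypothetical, chosen only so that the log baseline is continuous at the knots. The percentile helper uses the relation between survival probability π and the integrated hazard, so that the target value is −log(π)/exp(**ax**).

```python
import math

def cumulative_baseline(t, taus, betas, gammas):
    """Q1(t): integral of the piecewise Gompertz baseline from taus[0] to t.

    taus holds the K finite knots tau_0 < ... < tau_{K-1}; the last interval
    is open-ended. Every gamma is assumed to be nonzero.
    """
    total = 0.0
    for k, (beta, gamma) in enumerate(zip(betas, gammas)):
        lower = taus[k]
        upper = taus[k + 1] if k + 1 < len(taus) else math.inf
        stop = min(t, upper)
        if stop <= lower:
            break
        total += (math.exp(beta + gamma * stop) - math.exp(beta + gamma * lower)) / gamma
    return total

def inverse_cumulative_baseline(x, taus, betas, gammas):
    """Q1^{-1}(x) as in (A6); raises if x is never reached (defective case)."""
    for k, (beta, gamma) in enumerate(zip(betas, gammas)):
        lower = taus[k]
        is_last = k + 1 == len(taus)
        if not is_last and x > cumulative_baseline(taus[k + 1], taus, betas, gammas):
            continue  # the solution lies in a later interval
        q_lower = cumulative_baseline(lower, taus, betas, gammas)
        arg = math.exp(beta + gamma * lower) + gamma * (x - q_lower)
        if arg <= 0.0:
            raise ValueError("x exceeds the limit of Q1(t)")
        return (math.log(arg) - beta) / gamma

def percentile_time(pi, ax, taus, betas, gammas):
    """Time t at which the survivor function exp(-exp(ax) * Q1(t)) equals pi."""
    return inverse_cumulative_baseline(-math.log(pi) / math.exp(ax), taus, betas, gammas)

# Hypothetical knots (months) and parameters, continuous in log hazard at the knots.
taus = [120.0, 180.0, 240.0]
gammas = [0.05, 0.01, -0.02]
betas = [-12.0, -4.8, 2.4]
ax = 0.3  # hypothetical linear predictor

t_median = percentile_time(0.5, ax, taus, betas, gammas)
print(round(t_median, 1))
```

A simple way to check such a routine is a round trip: evaluating `cumulative_baseline` at a time in each interval and then applying `inverse_cumulative_baseline` should return the original time.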

Minor complications arise when the distribution of *T*_{1} is defective—that is, when some individuals will not experience the event *T*_{1} even when *t* → ∞. For the piecewise Gompertz specification, this is determined by parameters in the last open interval, (*τ*_{K–1}, ∞]. In this interval, the *T*_{1} distribution will be defective if

$${\gamma}_{K}{e}^{-{\beta}_{K}}\left[x-{Q}_{1}({\tau}_{K-1})\right]<-{e}^{{\gamma}_{K}{\tau}_{K-1}}.$$

(A7)

Inspecting (A7) shows that *γ*_{K} < 0 is a necessary but not sufficient condition for the distribution of *T*_{1} to be defective.
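The defective case can also be checked numerically. In the final open interval the integrated baseline has a finite limit only when the last slope is negative, and a percentile exists only if its target value lies below that limit. The sketch below is illustrative: the function names and all numeric inputs are hypothetical, and the slope is assumed to be nonzero.

```python
import math

def limiting_cumulative_baseline(q_at_last_knot, tau_last, beta_last, gamma_last):
    """Limit of Q1(t) as t grows without bound in the final open interval."""
    if gamma_last > 0:
        return math.inf  # the hazard keeps rising, so Q1(t) is unbounded
    # For gamma_last < 0 the exponential term vanishes in the limit.
    return q_at_last_knot - math.exp(beta_last + gamma_last * tau_last) / gamma_last

def percentile_exists(x, q_at_last_knot, tau_last, beta_last, gamma_last):
    """True when some finite t satisfies Q1(t) = x."""
    return x < limiting_cumulative_baseline(q_at_last_knot, tau_last, beta_last, gamma_last)

# Hypothetical values for the last interval.
q_last, tau_last, beta_last, gamma_last = 1.5, 240.0, 2.4, -0.02

limit = limiting_cumulative_baseline(q_last, tau_last, beta_last, gamma_last)
never_event = math.exp(-limit)  # limiting survivor probability when exp(ax) = 1

print(round(limit, 3), round(never_event, 4))
```

With a negative final slope, the limiting survivor probability is strictly positive, which is the sense in which some individuals never experience the event.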

In this appendix, we sketch details when there are time-varying covariates. For the transition to *T*_{1}, we seek, as before, to make comparisons between groups *A* and *B* using the *p*th percentile of the distribution of *T*_{1}, *t*_{1p}. To fix ideas, consider the case of a single time-varying covariate, *x*_{1}(*t*). Then a natural means of comparison for groups *A* and *B* (see, e.g., self-citation 5) is to compare two (often hypothetical) trajectories for *x*_{1}, with the remaining covariates set to their means, with ${\stackrel{~}{x}}_{1A}\left(t\right)$ and ${\stackrel{~}{x}}_{1B}\left(t\right)$ denoting these two hypothetical trajectories for *x*_{1}(*t*). This then generalizes to ${\stackrel{~}{\mathbf{x}}}_{1A}\left(t\right)$ and ${\stackrel{~}{\mathbf{x}}}_{1B}\left(t\right)$, with the vector **x**(*t*) comprised of a mix of non-time-varying and time-varying covariates. The expressions in (7) and (8) then need modification as:

$${t}_{1p}={Q}_{1}^{-1}\left[-\log(\pi)/\exp\left(\mathbf{a}\tilde{\mathbf{x}}(t)\right)\right],$$

(7.A2)

and

$$\begin{aligned}\Delta {t}_{1p} &= {t}_{1p}^{B}-{t}_{1p}^{A}\\ &= {Q}_{1}^{-1}\left\{-\log(\pi)/\exp\left[\mathbf{a}{\tilde{\mathbf{x}}}_{B}(t)\right]\right\}-{Q}_{1}^{-1}\left\{-\log(\pi)/\exp\left[\mathbf{a}{\tilde{\mathbf{x}}}_{A}(t)\right]\right\},\end{aligned}$$

(8.A2)

where, as before, ${t}_{1p}^{A}$ and ${t}_{1p}^{B}$ denote the *p*th percentiles of *T*_{1} in groups *A* and *B*, respectively, and where the expression for ${Q}_{1}^{-1}\left(t\right)$ now requires integration over $\stackrel{~}{\mathbf{x}}\left(t\right)$.

Other expressions follow similarly.
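One way to see what integrating over a covariate trajectory involves is a small numerical sketch. The example below makes several simplifying assumptions that are not part of the paper: a single Gompertz segment starting at time zero, one binary covariate that switches from 0 to 1 at a time `switch`, and bisection to solve for the percentile. All names and parameter values are hypothetical.

```python
import math

def gompertz_Q(t, beta, gamma):
    """Integral of exp(beta + gamma * s) from 0 to t (gamma != 0)."""
    return math.exp(beta) * (math.exp(gamma * t) - 1.0) / gamma

def cumulative_hazard(t, beta, gamma, a, switch):
    """Integrated hazard when the covariate is 0 before `switch` and 1 afterwards."""
    before = gompertz_Q(min(t, switch), beta, gamma)
    after = (gompertz_Q(t, beta, gamma) - gompertz_Q(switch, beta, gamma)) if t > switch else 0.0
    return before + math.exp(a) * after

def percentile_time(pi, beta, gamma, a, switch, upper=2000.0, tol=1e-10):
    """Solve cumulative_hazard(t) = -log(pi) by bisection on [0, upper]."""
    target = -math.log(pi)
    lo, hi = 0.0, upper
    if cumulative_hazard(hi, beta, gamma, a, switch) < target:
        raise ValueError("target not reached before `upper`")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cumulative_hazard(mid, beta, gamma, a, switch) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical parameters; times are in months.
beta, gamma, a = -8.0, 0.02, 0.5

# Trajectory A: the covariate never switches. Trajectory B: it switches at month 150.
t_a = percentile_time(0.5, beta, gamma, a, math.inf)
t_b = percentile_time(0.5, beta, gamma, a, 150.0)
delta_t = t_b - t_a  # difference in medians between the two trajectories

print(round(t_a, 2), round(t_b, 2), round(delta_t, 2))
```

For trajectory A the result can be compared with the closed-form inverse of the Gompertz integral, which gives a simple check on the bisection step.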

| | Unconditional transition to a premarital first birth (no adjustment for left truncation) | | | | Transition to first sexual intercourse | | Transition to a premarital first birth given sexual onset | |
|---|---|---|---|---|---|---|---|---|
| | Gmp | Cox | Gmp | Cox | Gmp | Cox | Gmp | Cox |
| mother-only family | .96^{***} (.16) | .96^{***} (.16) | .40^{*} (.17) | .40^{*} (.17) | .51^{***} (.06) | .50^{***} (.06) | .47^{**} (.16) | .48^{**} (.16) |
| step family | 1.06^{***} (.20) | 1.05^{***} (.20) | .66^{**} (.20) | .66^{**} (.20) | .69^{***} (.07) | .67^{***} (.07) | .58^{**} (.20) | .58^{**} (.20) |
| other family | 1.04^{***} (.20) | 1.03^{***} (.20) | .75^{***} (.20) | .74^{***} (.20) | .58^{***} (.07) | .58^{***} (.07) | .64^{**} (.20) | .62^{**} (.20) |
| catholic | −.23 (.14) | −.23 (.14) | −.15 (.14) | −.16 (.14) | −.01 (.04) | −.01 (.04) | −.22 (.14) | −.22 (.14) |
| mother’s education | −.08^{**} (.02) | −.07^{**} (.02) | −.09^{***} (.02) | −.09^{***} (.02) | .01 (.01) | .01 (.01) | −.10^{***} (.02) | −.10^{***} (.02) |
| mother’s age at first birth | −.10^{***} (.02) | −.10^{***} (.02) | −.06^{**} (.02) | −.06^{**} (.02) | −.05^{***} (.00) | −.05^{***} (.00) | −.06^{**} (.02) | −.06^{**} (.02) |
| age at first intercourse | | | −.026^{***} (.002) | −.026^{***} (.002) | | | −.018^{***} (.004) | −.016^{***} (.004) |
| duration 32–59 months | | | | | | | −.40^{*} (.20) | −.28 (.21) |
| duration 60+ months | | | | | | | −1.01^{*} (.40) | −.77 (.44) |

^{1}“Predetermination” of *Y* with respect to *Z* is a central assumption in our model. Examples in which this assumption is violated include situations in which *Y* and *Z* are jointly determined.

^{2}Our use of “treatment” and “control” is meant to clarify the logic underlying these comparisons, not to assert that causal effects can be obtained using such models on observational data. Similarly, our use of the terms “direct effect” and “indirect effect” should not be taken as implying that these quantities are causal.

^{3}An alternative to our piecewise splined Gompertz model is a piecewise constant model, which is commonly used by researchers to approximate a Cox model. Note that the piecewise constant model is a special case of a piecewise Gompertz model, since the latter yields a piecewise linear baseline for log *r*(*t*).

^{4}Parameter estimates for all variables, as well as those for the baseline hazard for both transitions, are available upon request. As a sensitivity check, we also estimated covariate effects in Table 1 using a Cox specification; see Appendix Table 1. Estimated coefficients from the two specifications differ in only slight ways for nearly all covariates.

^{5}In somewhat more detail, consider a woman who initiated sexual activity at age 20. Models 2 and 3 assume that the woman has a nonzero risk of a premarital first birth at all ages, both before and after age 20. These two models differ in that Model 3 adds a proportionality term *α* to Model 2; hence, these models assume that the risk of a premarital birth for a woman who initiated sexual activity at age 20 relative, say, to an otherwise identical woman who initiated activity at age 19, is exp(*α* × 20)/ exp(*α* × 19) = exp[*α*(20 – 19)] = exp(*α*), with this proportionality factor assumed constant across all ages for the woman, including those prior to her onset of sexual activity. By contrast, Models 4 and 5 place the woman at risk of a premarital first birth only after age 20; put another way, Models 4 and 5 assume that the woman’s risk of a premarital first birth is identically zero prior to age 20. Then considering two women who initiated sexual activity at ages 19 and 20 but who are otherwise identical in all respects, Model 4 assumes that these two women have identical premarital birth risks after age 20, while Model 5 lets the risks differ for these two women after age 20 by the proportional factor exp(*α*).

^{6}Note that in general, equalizing the duration of exposure does not imply that the indirect effect of exposure will be zero. Returning again to the thought experiment in which we compare groups *A* and *B*, it can be shown that the indirect effect of exposure will be identically zero only if the integrated hazards for groups *A* and *B* are equal, i.e., if the shaded areas are identical in the hypothetical example in Figure 3.

^{7}In interpreting the results in both Tables 3 and 4, it is important to emphasize that we have employed estimates from a competing risk hazard model, in which women are censored if they marry prior to giving birth. While substantively sensible, a consequence is that both Tables 3 and 4 are best interpreted as speaking to a particular counterfactual: the consequences of exposure on prevalence or cumulative relative risk if a woman were to remain unmarried (and hence at risk of a premarital first birth) during the entire 60 or 90 month period following onset of sexual activity. Note that this counterfactual is in contrast to the behavior of any actual sample of women, in which some percentage of women will exit the risk of a premarital birth during the 60 or 90 month period following onset of sexual activity by virtue of marriage prior to a birth, while other women will remain unmarried and at risk of a premarital birth for a much longer period than 60 or 90 months. Put another way, our empirical example assumes that endogeneities are absent from the censoring mechanism (i.e., that no endogeneity exists between marriage and a premarital first birth), an implausible assumption.

Lawrence L. Wu, Department of Sociology, New York University.

Steven P. Martin, Department of Sociology, University of Maryland, College Park.

- Aalen Odd O. Nonparametric Inference for a Family of Counting Processes. Annals of Statistics. 1978;6(4):701–26.
- Breslow NE. Covariance Analysis of Censored Survival Data. Biometrics. 1974;30(1):89–99.
- Cox DR. Regression Models and Life Tables (with discussion). Journal of the Royal Statistical Society. 1972;B34(2):187–220.
- Das Gupta Prithwis. Current Population Reports Special Studies P-23, No. 186. Bureau of the Census; Washington, DC: 1993. Standardization and Decomposition of Rates: A User’s Manual.
- Duncan Otis Dudley. Introduction to Structural Equation Models. Academic Press; New York: 1975.
- Featherman David L., Hauser Robert M. Opportunity and Change. Academic Press; New York: 1978.
- Hamilton Brady E., Martin Joyce A., Ventura Stephanie J. National Vital Statistics Reports. No. 7. Volume 56. National Center for Health Statistics; Rockville, MD: 2007. Births: Preliminary Data for 2006.
- Kalbfleisch John D., Prentice Ross L. The Statistical Analysis of Failure Time Data. Wiley; New York: 1980.
- Kiernan Kathleen E., Hobcraft John. Parental Divorce during Childhood: Age at First Intercourse, Partnership, and Parenthood. Population Studies. 1997;51(1):41–55.
- Kitagawa Evelyn M. Components of a Difference Between Two Rates. Journal of the American Statistical Association. 1955;50(272):1168–94.
- Lillard Lee. Simultaneous Equations for Hazards: Marriage Duration and Fertility Timing. Journal of Econometrics. 1993;56(1/2):189–217.
- Smith Herbert L., Morgan S. Philip, Koropeckyj-Cox Tanya. A Decomposition of Trends in the Nonmarital Fertility Ratios of Blacks and Whites in the United States, 1960 to 1992. Demography. 1996;33(2):141–52.
- Stolzenberg Ross M. The Measurement and Decomposition of Causal Effects in Nonlinear and Nonadditive Models. In: Schuessler Karl, editor. Sociological Methodology 1980. Jossey-Bass; San Francisco: 1979. pp. 459–88.
- Winship Christopher, Mare Robert D. Regression Models with Ordinal Variables. American Sociological Review. 1984;49(4):512–25.
