
Stat Med. Author manuscript; available in PMC 2010 September 30.

Published in final edited form as:

Stat Med. 2009 September 30; 28(22): 2748–2768.

doi: 10.1002/sim.3640; PMCID: PMC2773213

NIHMSID: NIHMS137898


In analyzing competing risks data, a quantity of considerable interest is the cumulative incidence function. Often the effect of covariates on the cumulative incidence function is modeled via the proportional hazards model for the cause-specific hazard function. As the proportionality assumption may be too restrictive in practice, we consider an alternative more flexible semiparametric additive hazards model of [1] for the cause-specific hazard. This model specifies the effect of covariates on the cause-specific hazard to be additive as well as allows the effect of some covariates to be fixed and that of others to be time-varying. We present an approach for constructing confidence intervals as well as confidence bands for the cause-specific cumulative incidence function of subjects with given values of the covariates. Furthermore, we also present an approach for constructing confidence intervals and confidence bands for comparing two cumulative incidence functions given values of the covariates. The finite sample property of the proposed estimators is investigated through simulations. We conclude our paper with an analysis of the well-known malignant melanoma data using our method.

Competing risks data are typically encountered in medical studies, for example, in studies of time to spontaneous labor, where labor due to medical intervention (for example, delivery by cesarean) or membrane rupture leading to labor is treated as a competing cause. Another example can be found in the well-known malignant melanoma data [2], where patients with malignant melanoma were at risk of dying either from malignant melanoma or from other causes. Typically, response to a treatment can be classified in terms of failure from the disease of interest and/or from non-disease-related causes. In the competing risks framework, each individual is thus exposed to *K* distinct types of risk and the eventual failure can be attributed to precisely one of them. Suppose each subject has an underlying continuous failure time $\tilde{T}$ that may be subject to censoring. The cause of failure *J* ∈ {1, …, *K*} is observed along with covariate information. The cause-specific hazard function for a subject with covariate vector *z* is defined by

$${\lambda}_{k}(t\mid z)=\underset{\Delta t\to 0}{\mathrm{lim}}\frac{1}{\Delta t}P\left(t\le \stackrel{~}{T}<t+\Delta t,J=k\mid \stackrel{~}{T}\ge t,z\right)$$

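for *k* = 1, …, *K*. The cause-specific hazard function *λ*_{k}(*t*|*z*) is the instantaneous rate of failure from cause *k* at time *t* in the presence of the other causes. The cumulative incidence function for cause *k* is defined by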

$${F}_{k}(t\mid z)=P\left(\stackrel{~}{T}\le t,J=k\mid z\right).$$

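The cumulative incidence function *F*_{k}(*t*|*z*) is the probability that a subject with covariate vector *z* fails from cause *k* by time *t*.

Note that *F*_{k}(*t*|*z*) can be written in terms of the cause-specific hazard functions as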

$${F}_{k}(t\mid z)={\int}_{0}^{t}S(u\mid z){\lambda}_{k}(u\mid z)\phantom{\rule{thinmathspace}{0ex}}\mathit{du}$$

with the overall survival function $S(t\mid z)=\mathrm{exp}\left(-{\int}_{0}^{t}\sum_{l=1}^{K}{\lambda}_{l}(u\mid z)\,\mathit{du}\right)$. Thus, it is natural to estimate the cumulative incidence function through the cause-specific hazard functions. Typically, one models the effect of covariates using the proportional hazards model ([4], [5]). Cheng *et al.* have studied the estimation of the cumulative incidence function based on Cox's regression model in a competing risks model [6]. However, the proportionality assumption under the proportional hazards model may be too restrictive in practice. For instance, the proportionality assumption for the malignant melanoma data fails based on the score test provided by [7] (see the numerical section for further details). Thus, it is of interest to investigate alternative approaches, and an important alternative is the additive risk model.
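The relation between the cause-specific hazards and the cumulative incidence function can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function name `cumulative_incidence` and its interface (a list of vectorized hazard callables) are assumptions made for illustration, and the integrals are approximated by the trapezoid rule on a uniform grid.

```python
import numpy as np

def cumulative_incidence(hazards, k, t_max, n_grid=20001):
    """Approximate F_k(t) = int_0^t S(u) * lambda_k(u) du on a uniform grid.

    hazards : list of vectorized callables, one per cause (hypothetical interface)
    k       : zero-based index of the cause of interest
    """
    u = np.linspace(0.0, t_max, n_grid)
    lam = np.array([h(u) for h in hazards])      # shape (K, n_grid)
    total = lam.sum(axis=0)                      # all-cause hazard
    du = np.diff(u)
    # Cumulative all-cause hazard by the trapezoid rule, then S(u) = exp(-cumulative hazard).
    cum_total = np.concatenate([[0.0], np.cumsum(0.5 * (total[1:] + total[:-1]) * du)])
    surv = np.exp(-cum_total)
    integrand = surv * lam[k]
    F = np.concatenate([[0.0], np.cumsum(0.5 * (integrand[1:] + integrand[:-1]) * du)])
    return u, F

# Two constant hazards (1.0 and 2.0): F_1(t) = (1/3) * (1 - exp(-3 t)) in closed form.
const = lambda c: (lambda u: np.full_like(u, c))
u, F1 = cumulative_incidence([const(1.0), const(2.0)], k=0, t_max=1.0)
```

With constant hazards the closed form is available, which makes this a convenient sanity check before moving to covariate-dependent hazards.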

Shen and Cheng [8] presented an approach to estimating the cumulative incidence function under the additive risk model first proposed by [9]

$${\lambda}_{k}(t\mid z)={\lambda}_{0k}\left(t\right)+{z}^{T}{\beta}_{k},$$

(1)

where *λ*_{0k}(·) is an unspecified baseline hazard function for the *k*-th failure type and *β _{k}* an unknown parameter vector. Often, this model only provides a rough summary of the effect of covariates, because in their model the influence of all the covariates was restricted to be constant, i.e., time-varying effects of covariates cannot be captured by this model. Aalen

In many practical instances, we may know that some of the covariates have time-constant effects (for example, demographic characteristics) while others may be suspected to have time-varying effects (for example, treatment). In particular, using the test provided by [11] for checking whether a covariate has a time-varying or a constant effect under the additive model, we found that age and sex have time-constant effects and that thickness of the tumor has a time-varying effect for the malignant melanoma data. Motivated by this, we study a flexible additive risk model which allows some covariates to be modeled parametrically and others to be modeled nonparametrically. The suggested approach is to study the semiparametric additive model of [1] in the competing risks setup, which is more appropriate in some applications.

To be specific, under the semiparametric additive risk model the cause-specific hazard function for cause *k*, given covariates *x* and *z* takes the form

$${\lambda}_{k}(t\mid x,z)={x}^{T}{\alpha}_{k}\left(t\right)+{z}^{T}{\beta}_{k},$$

where the covariates are partitioned into *x*, a *p*-dimensional covariate vector with time-varying effects and *z*, a *q*-dimensional covariate vector with time constant effects, *α _{k}*(

The paper is structured as follows. In Section 2 we present our methods for constructing confidence intervals and bands for the cumulative incidence function and for the difference and the ratio of two cumulative incidence functions. The finite sample properties of the proposed estimators are investigated extensively through simulations in Section 3. In Section 4 we illustrate the proposed method with data from the malignant melanoma study. The details of our asymptotic results are made explicit in the Appendix.

Suppose that there are *K* distinct failure types. Let *T _{ki}* be the

$${\lambda}_{\mathit{ki}}(t\mid {x}_{i},{z}_{i})={x}_{i}^{T}{\alpha}_{k}\left(t\right)+{z}_{i}^{T}{\beta}_{k},$$

(2)

for *k* {1, 2, … ,*K*}. Let *Y _{i}*(

$$X\left(t\right)={\left({Y}_{1}\left(t\right){x}_{1},\dots ,{Y}_{n}\left(t\right){x}_{n}\right)}^{T}\phantom{\rule{thinmathspace}{0ex}}\text{and}\phantom{\rule{thinmathspace}{0ex}}Z\left(t\right)={\left({Y}_{1}\left(t\right){z}_{1},\dots ,{Y}_{n}\left(t\right){z}_{n}\right)}^{T}.$$

Let *δ _{ki}* =

$${N}_{k}\left(t\right)={\left({N}_{k1}\left(t\right),\dots ,{N}_{\mathit{kn}}\left(t\right)\right)}^{T}\phantom{\rule{thinmathspace}{0ex}}\text{and}\phantom{\rule{thinmathspace}{0ex}}{\lambda}_{k}\left(t\right)={\left({\lambda}_{k1}\left(t\right),\dots ,{\lambda}_{\mathit{kn}}\left(t\right)\right)}^{T}$$

be the *n*-dimensional counting process and its intensity. Denote by *ω _{k}*(

$${\widehat{\beta}}_{k}={\left[{\int}_{0}^{\tau}{Z}^{T}\left(t\right){H}_{k}\left(t\right)Z\left(t\right)\mathit{dt}\right]}^{-1}{\int}_{0}^{\tau}{Z}^{T}\left(t\right){H}_{k}\left(t\right){\mathit{dN}}_{k}\left(t\right)$$

and

$${\widehat{A}}_{k}\left(t\right)={\int}_{0}^{t}{X}_{k}^{-}\left(u\right)\left({\mathit{dN}}_{k}\left(u\right)-Z\left(u\right){\widehat{\beta}}_{k}\phantom{\rule{thinmathspace}{0ex}}\mathit{du}\right),$$

where *H _{k}*(

However, the above estimates involve the weight function *ω _{k}*(

By combining these we can consistently estimate the cumulative incidence function for the cause 1, for example, with a particular set of covariates *x*_{0} and *z*_{0} by

$${\widehat{F}}_{1}(t\mid {x}_{0},{z}_{0})={\int}_{0}^{t}\widehat{S}(u\mid {x}_{0},{z}_{0})\phantom{\rule{thinmathspace}{0ex}}d{\widehat{\Lambda}}_{1}(u\mid {x}_{0},{z}_{0})$$

where $\widehat{S}(t\mid {x}_{0},{z}_{0})=\mathrm{exp}\left(-\sum_{k=1}^{K}{\widehat{\Lambda}}_{k}(t\mid {x}_{0},{z}_{0})\right)$ is an estimator of the overall survival probability *S*(*t*|*x*_{0}, *z*_{0}) and ${\widehat{\Lambda}}_{k}(t\mid {x}_{0},{z}_{0})={\int}_{0}^{t}{x}_{0}^{T}\,d{\widehat{A}}_{k}\left(u\right)+t\,{z}_{0}^{T}{\widehat{\beta}}_{k}$ is the estimator of the cumulative cause-specific hazard function. We next present our approach for constructing confidence intervals and confidence bands for a cause-specific cumulative incidence function.
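To make the plug-in step concrete, the following minimal sketch assembles the estimated cumulative incidence function from increments of the estimated cumulative regression functions and the estimated constant coefficients. It is an illustration under stated assumptions rather than the authors' implementation: the function name `plug_in_cif`, the array layout, and the use of the left limit of the survival estimate inside the sum are all choices made here.

```python
import numpy as np

def plug_in_cif(times, dA, beta, x0, z0, cause=0):
    """Plug-in estimate of F_cause(t | x0, z0) on a time grid.

    times : (m,) increasing grid of positive time points
    dA    : (K, m, p) increments of A_hat_k over (times[j-1], times[j]]
    beta  : (K, q) estimated constant coefficients beta_hat_k
    x0,z0 : covariate values of length p and q
    All argument names and shapes are hypothetical.
    """
    dt = np.diff(np.concatenate([[0.0], times]))
    # Increments of Lambda_hat_k(t) = int x0' dA_hat_k + t * z0' beta_hat_k.
    dLam = dA @ x0 + np.outer(beta @ z0, dt)          # shape (K, m)
    Lam_total = np.cumsum(dLam.sum(axis=0))           # all-cause cumulative hazard
    # Assumption: evaluate S_hat just before each grid point (left limit).
    S_left = np.exp(-np.concatenate([[0.0], Lam_total[:-1]]))
    return np.cumsum(S_left * dLam[cause])
```

Because the additive model does not constrain the estimated increments to be nonnegative, an estimate computed this way need not be monotone in small samples.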

In the Appendix, we show that $\sqrt{n}\left({\widehat{F}}_{1}(t\mid {x}_{0},{z}_{0})-{F}_{1}(t\mid {x}_{0},{z}_{0})\right)$ is asymptotically equivalent to a sum of square integrable martingales *U*_{1}(*t*|*x*_{0}, *z*_{0}) given by

$$\begin{aligned}
U_{1}(t\mid x_{0},z_{0}) &= \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\epsilon_{1i}(t\mid x_{0},z_{0})\\
&= \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\Biggl\{\int_{0}^{t}S(u\mid x_{0},z_{0})\,x_{0}^{T}\left(n^{-1}W_{1}(u)\right)^{-1}\omega_{1i}(u)\,x_{i}\,dM_{1i}(u)\\
&\qquad -\int_{0}^{t}S(u\mid x_{0},z_{0})\,x_{0}^{T}X_{1}^{-}(u)Z(u)\,du\;C_{1}^{-1}D_{1i}\\
&\qquad -\int_{0}^{t}S(u\mid x_{0},z_{0})\,du\;z_{0}^{T}C_{1}^{-1}D_{1i}\\
&\qquad -\sum_{k=1}^{K}\Biggl(\int_{0}^{t}F_{1}^{C}(t,u)\,x_{0}^{T}\left(n^{-1}W_{k}(u)\right)^{-1}\omega_{ki}(u)\,x_{i}\,dM_{ki}(u)\\
&\qquad\qquad -\int_{0}^{t}F_{1}^{C}(t,u)\,x_{0}^{T}\omega_{k}(u)X^{-}(u)Z(u)\,du\;C_{k}^{-1}D_{ki}\\
&\qquad\qquad -\int_{0}^{t}F_{1}^{C}(t,u)\,du\;z_{0}^{T}C_{k}^{-1}D_{ki}\Biggr)\Biggr\},
\end{aligned}$$

(3)

where

$$\begin{array}{cc}\hfill {F}_{k}^{C}(t,u)& ={F}_{k}(t\mid {x}_{0},{z}_{0})-{F}_{k}(u\mid {x}_{0},{z}_{0})\hfill \\ \hfill {C}_{k}& =\frac{1}{n}{\int}_{0}^{\tau}{Z}^{T}\left(t\right){H}_{k}\left(t\right)Z\left(t\right)\phantom{\rule{thinmathspace}{0ex}}\mathit{dt}\hfill \\ \hfill {D}_{\mathit{ki}}& ={\int}_{0}^{\tau}\left\{{z}_{i}{\omega}_{\mathit{ki}}\left(t\right)-{Z}^{T}\left(t\right){\omega}_{k}\left(t\right)X\left(t\right){W}_{k}{\left(t\right)}^{-1}{\omega}_{\mathit{ki}}\left(t\right){x}_{i}\right\}\phantom{\rule{thinmathspace}{0ex}}{\mathit{dM}}_{\mathit{ki}}\left(t\right)\hfill \\ \hfill {M}_{\mathit{ki}}\left(t\right)& ={N}_{\mathit{ki}}\left(t\right)-{\int}_{0}^{t}{Y}_{i}\left(u\right)\left\{{x}_{i}^{T}{\alpha}_{k}\left(u\right)+{z}_{i}^{T}{\beta}_{k}\right\}\phantom{\rule{thinmathspace}{0ex}}\mathit{du}\hfill \end{array}$$

for *k* ∈ {1, 2, …, *K*} and *i* = 1, 2, …, *n*. Furthermore, by the martingale central limit theorem, *U*_{1}(*t*|*x*_{0}, *z*_{0}) converges in distribution to a Gaussian process. The martingale representation of *U*_{1}(*t*|*x*_{0}, *z*_{0}) can also be used to construct a consistent estimator of the asymptotic variance function. The variance function at time *t* can be consistently estimated by

$${\widehat{\sigma}}_{1}^{2}(t\mid {x}_{0},{z}_{0})=\frac{1}{n}\sum_{i=1}^{n}{\left({\widehat{\epsilon}}_{1i}(t\mid {x}_{0},{z}_{0})\right)}^{2}$$

obtained by replacing all the terms with their empirical version and the parameters *β _{k}* and

Next, combining the asymptotic normality of ${\widehat{F}}_{1}(t\mid {x}_{0},{z}_{0})$ and the consistent estimator ${\widehat{\sigma}}_{1}^{2}(t\mid {x}_{0},{z}_{0})$ for the asymptotic variance, we can construct a (1 − *α*) × 100% confidence interval for *F*_{1}(*t*|*x*_{0}, *z*_{0}) as follows:

$${g}^{-1}\left(g\left({\widehat{F}}_{1}(t\mid {x}_{0},{z}_{0})\right)\pm {n}^{-1/2}{g}^{\prime}\left({\widehat{F}}_{1}(t\mid {x}_{0},{z}_{0})\right){\widehat{\sigma}}_{1}(t\mid {x}_{0},{z}_{0})\,{z}_{\alpha /2}\right),$$

where *g* is a smooth function chosen such that it retains the range of the distribution *F*_{1}, *g′* is its continuous derivative, *g*^{−1} denotes the inverse function of *g*, and $z_{\alpha/2}$ is the upper *α*/2 quantile of the standard normal distribution. Typically, the transformation *g* is chosen when constructing confidence intervals for distributions so as to retain their range as well as to improve the coverage probability. Observe that, by the functional delta method, the asymptotic normality of $g\left({\widehat{F}}_{1}(t\mid {x}_{0},{z}_{0})\right)$ follows from that of ${\widehat{F}}_{1}(t\mid {x}_{0},{z}_{0})$, but with asymptotic variance given by ${\left({g}^{\prime}\left({F}_{1}(t\mid {x}_{0},{z}_{0})\right)\right)}^{2}{\sigma}_{1}^{2}(t\mid {x}_{0},{z}_{0})$.
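As an illustration of the transformed interval, the sketch below uses the complementary log-log transformation *g*(*x*) = log(−log *x*), a common range-preserving choice. The text does not state which *g* was used, so this choice, like the function name `transformed_ci`, is an assumption made for illustration only.

```python
import math
from statistics import NormalDist

def transformed_ci(F_hat, sigma_hat, n, alpha=0.05):
    """Pointwise CI g^{-1}( g(F_hat) +/- n^{-1/2} g'(F_hat) sigma_hat z_{alpha/2} ).

    Assumes g(x) = log(-log(x)), so g'(x) = 1 / (x log x) and g^{-1}(y) = exp(-exp(y)).
    Requires 0 < F_hat < 1.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    g = math.log(-math.log(F_hat))
    g_prime = 1.0 / (F_hat * math.log(F_hat))     # negative on (0, 1)
    half_width = abs(g_prime) * sigma_hat * z / math.sqrt(n)
    # g is decreasing, so the larger transformed value maps to the lower limit.
    lower = math.exp(-math.exp(g + half_width))
    upper = math.exp(-math.exp(g - half_width))
    return lower, upper

lower, upper = transformed_ci(F_hat=0.3, sigma_hat=0.5, n=100)
```

By construction both limits stay inside (0, 1), which is the motivation for transforming before inverting.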

Another quantity of interest is the confidence band for the cumulative incidence function. However, in order to construct a (1 − *α*) × 100% confidence band for *F*_{1}(*t*|*x*_{0}, *z*_{0}) simultaneously for *t* ∈ [0, *τ*], we need to investigate the distribution of the supremum of the process *U*_{1}(*t*|*x*_{0}, *z*_{0}), 0 ≤ *t* ≤ *τ*. This is analytically challenging as *U*_{1} does not have an independent increment structure. Instead, we adapt the general procedure suggested by [7] to get an approximation of the distribution of the process *U*_{1}. We approximate the distribution of *U*_{1}(*t*|*x*_{0}, *z*_{0}) by a zero-mean Gaussian process, denoted by *Û*_{1}(*t*|*x*_{0}, *z*_{0}), whose distribution can be easily generated through simulations. We replace {*M _{ki}*(

$$E\left\{{M}_{\mathit{ki}}\left(t\right)\right\}=0,\phantom{\rule{1em}{0ex}}V\mathit{ar}\left\{{M}_{\mathit{ki}}\left(t\right)\right\}=E\left\{{N}_{\mathit{ki}}\left(t\right)\right\}.$$

To approximate the distribution of *U*_{1}(*t*|*x*_{0}, *z*_{0}), we simply obtain a large number, say *M*, of realizations from *Û*_{1}(*t*|*x*_{0}, *z*_{0}) by repeatedly generating random samples {*G _{i}*}, while fixing the data {(

A (1 − *α*) × 100% confidence band for *F*_{1}(*t*|*x*_{0}, *z*_{0}) on [*t*_{1}, *t*_{2}] is given by

$${\widehat{F}}_{1}(t\mid {x}_{0},{z}_{0})\pm {n}^{-1/2}{c}_{\alpha}{\widehat{\sigma}}_{1}(t\mid {x}_{0},{z}_{0}),$$

where the cutoff value *c*_{α} satisfies

$$P\left[\underset{{t}_{1}\le t\le {t}_{2}}{\mathrm{sup}}\left|{\widehat{U}}_{1}(t\mid {x}_{0},{z}_{0})/{\widehat{\sigma}}_{1}(t\mid {x}_{0},{z}_{0})\right|>{c}_{\alpha}\right]=\alpha $$

and can be estimated through replicates of *Û*_{1}(*t*|*x*_{0}, *z*_{0}) given the observed data. Here *t*_{1} and *t*_{2} (*t*_{1} < *t*_{2}) can be any time points in (0, *t*_{0}), where *t*_{0} = inf{*t* : *EY*_{1}(*t*) = 0}.
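The resampling step can be sketched as follows. The sketch assumes that the estimated influence terms have already been evaluated on a grid of time points in [*t*_{1}, *t*_{2}] and stored as an *n* × *m* array, and that each replicate has the multiplier form $n^{-1/2}\sum_{i}\widehat{\epsilon}_{1i}(t)G_{i}$ with independent standard normal *G*_{i}; the function name `multiplier_band` and its arguments are hypothetical.

```python
import numpy as np

def multiplier_band(F_hat, eps_hat, alpha=0.05, n_draws=1000, seed=0):
    """Simultaneous band F_hat(t) +/- n^{-1/2} c_alpha sigma_hat(t) on a time grid.

    F_hat   : (m,) estimated cumulative incidence on the grid
    eps_hat : (n, m) estimated influence terms, one row per subject (assumed given)
    """
    rng = np.random.default_rng(seed)
    n = eps_hat.shape[0]
    sigma = np.sqrt(np.mean(eps_hat ** 2, axis=0))     # sigma_hat(t), assumed > 0
    G = rng.standard_normal((n_draws, n))              # multipliers, data held fixed
    U = G @ eps_hat / np.sqrt(n)                       # (n_draws, m) replicates of U_hat
    sup_stat = np.max(np.abs(U) / sigma, axis=1)       # sup over the grid per replicate
    c_alpha = np.quantile(sup_stat, 1.0 - alpha)
    half_width = c_alpha * sigma / np.sqrt(n)
    return F_hat - half_width, F_hat + half_width, c_alpha
```

When the grid contains a single time point, the estimated cutoff is close to the pointwise normal quantile; over a wider grid it is larger, which is what makes the band wider than the pointwise intervals.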

In many practical situations, one is interested in comparing cumulative incidence functions for a specific cause given two different values of covariates or comparing cumulative incidence functions for two different causes given a specific value of the covariate. For instance in malignant melanoma data, the difference in cumulative incidence functions for malignant melanoma between females and males measures the difference in the probability of dying from malignant melanoma between females and males. Similarly, the difference between cumulative incidence functions at different values of tumor thickness shows how the probability of dying from malignant melanoma is changed by tumor thickness. While the difference of two cumulative incidence functions measures additional probability of survival/dying from a cause, the ratio shows a relative rate of survival/dying from a cause at two different covariate values.

We begin by presenting a way to compare cumulative incidence functions for the *k*-th cause, say the first cause, given two different covariate values (*x*, *z*) = (*x*_{1}, *z*_{1}) and (*x*, *z*) = (*x*_{2}, *z*_{2}). We first present the method for calculating the confidence interval and confidence band for their ratio. Using the consistency of ${\widehat{F}}_{k}$ and boundedness in probability of $\sqrt{n}({\widehat{F}}_{k}(t\mid x,z)-{F}_{k}(t\mid x,z))$ for

$$\begin{array}{cc}\hfill \sqrt{n}\left(\frac{{\widehat{F}}_{1}(t\mid {x}_{1},{z}_{1})}{{\widehat{F}}_{1}(t\mid {x}_{2},{z}_{2})}-\frac{{F}_{1}(t\mid {x}_{1},{z}_{1})}{{F}_{1}(t\mid {x}_{2},{z}_{2})}\right)\approx & \frac{1}{{F}_{1}(t\mid {x}_{2},{z}_{2})}\sqrt{n}({\widehat{F}}_{1}(t\mid {x}_{1},{z}_{1})-{F}_{1}(t\mid {x}_{1},{z}_{1}))\hfill \\ \hfill & \phantom{\rule{1em}{0ex}}\phantom{\rule{1em}{0ex}}\phantom{\rule{thinmathspace}{0ex}}-\frac{{F}_{1}(t\mid {x}_{1},{z}_{1})}{{F}_{1}^{2}(t\mid {x}_{2},{z}_{2})}\sqrt{n}({\widehat{F}}_{1}(t\mid {x}_{2},{z}_{2})-{F}_{1}(t\mid {x}_{2},{z}_{2})),\hfill \end{array}$$

where ≈ denotes asymptotic equivalence. For notational simplicity, let *a*(*t*) = 1/*F*_{1}(*t*|*x*_{2}, *z*_{2}) and $b\left(t\right)={F}_{1}(t\mid {x}_{1},{z}_{1})/{F}_{1}^{2}(t\mid {x}_{2},{z}_{2})$. Using similar arguments as in (3), it follows that $\sqrt{n}\left({\widehat{F}}_{1}(t\mid {x}_{1},{z}_{1})/{\widehat{F}}_{1}(t\mid {x}_{2},{z}_{2})-{F}_{1}(t\mid {x}_{1},{z}_{1})/{F}_{1}(t\mid {x}_{2},{z}_{2})\right)$ is asymptotically equivalent to a sum of square integrable martingales given by

$$\begin{aligned}
\tilde{U}_{1}(t\mid x_{1},z_{1},x_{2},z_{2}) &= \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\tilde{\epsilon}_{i}(t\mid x_{1},z_{1},x_{2},z_{2})\\
&= \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\Biggl\{\int_{0}^{t}\bigl(a(u)S(u\mid x_{1},z_{1})x_{1}-b(u)S(u\mid x_{2},z_{2})x_{2}\bigr)^{T}\left(n^{-1}W_{1}(u)\right)^{-1}\omega_{1i}(u)\,x_{i}\,dM_{1i}(u)\\
&\qquad -\int_{0}^{t}\bigl(a(u)S(u\mid x_{1},z_{1})x_{1}-b(u)S(u\mid x_{2},z_{2})x_{2}\bigr)^{T}X_{1}^{-}(u)Z(u)\,du\;C_{1}^{-1}D_{1i}\\
&\qquad -\int_{0}^{t}\bigl(a(u)S(u\mid x_{1},z_{1})z_{1}-b(u)S(u\mid x_{2},z_{2})z_{2}\bigr)^{T}du\;C_{1}^{-1}D_{1i}\\
&\qquad -\sum_{k=1}^{K}\Biggl(\int_{0}^{t}\bigl(a(u)F_{1}^{C}(t,u\mid x_{1},z_{1})x_{1}-b(u)F_{1}^{C}(t,u\mid x_{2},z_{2})x_{2}\bigr)^{T}\left(n^{-1}W_{k}(u)\right)^{-1}\omega_{ki}(u)\,x_{i}\,dM_{ki}(u)\\
&\qquad\qquad -\int_{0}^{t}\bigl(a(u)F_{1}^{C}(t,u\mid x_{1},z_{1})x_{1}-b(u)F_{1}^{C}(t,u\mid x_{2},z_{2})x_{2}\bigr)^{T}X_{k}^{-}(u)Z(u)\,du\;C_{k}^{-1}D_{ki}\\
&\qquad\qquad -\int_{0}^{t}\bigl(a(u)F_{1}^{C}(t,u\mid x_{1},z_{1})z_{1}-b(u)F_{1}^{C}(t,u\mid x_{2},z_{2})z_{2}\bigr)^{T}du\;C_{k}^{-1}D_{ki}\Biggr)\Biggr\},
\end{aligned}$$

(4)

where ${F}_{1}^{C}(t,u\mid {x}_{1},{z}_{1})={F}_{1}(t\mid {x}_{1},{z}_{1})-{F}_{1}(u\mid {x}_{1},{z}_{1}),\phantom{\rule{thinmathspace}{0ex}}{F}_{1}^{C}(t,u\mid {x}_{2},{z}_{2})={F}_{1}(t\mid {x}_{2},{z}_{2})-{F}_{1}(u\mid {x}_{2},{z}_{2})$, *C _{k}*,

$${\widehat{\tilde{\sigma}}}^{2}(t\mid {x}_{1},{z}_{1},{x}_{2},{z}_{2})=\frac{1}{n}\sum_{i=1}^{n}{\left({\widehat{\tilde{\epsilon}}}_{i}(t\mid {x}_{1},{z}_{1},{x}_{2},{z}_{2})\right)}^{2}$$

(5)

obtained by replacing all the terms in (4) with their empirical version and *F _{k}*(

Next, to construct a (1 − *α*) × 100% confidence band for *F*_{1}(*t*|*x*_{1}, *z*_{1})/*F*_{1}(*t*|*x*_{2}, *z*_{2}) for *t* ∈ [*t*_{1}, *t*_{2}], we propose a Gaussian multiplier approach. Let

$$\widehat{\tilde{U}}(t\mid {x}_{1},{z}_{1},{x}_{2},{z}_{2})=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}{\widehat{\tilde{\epsilon}}}_{i}(t\mid {x}_{1},{z}_{1},{x}_{2},{z}_{2})\,{G}_{i},$$

where {*G _{i}* :

$$\frac{{\widehat{F}}_{1}(t\mid {x}_{1},{z}_{1})}{{\widehat{F}}_{1}(t\mid {x}_{2},{z}_{2})}\pm {n}^{-1/2}{c}_{\alpha}\widehat{\tilde{\sigma}}(t\mid {x}_{1},{z}_{1},{x}_{2},{z}_{2}),$$

where the cut-off value *c*_{α} is given by

$$P\left[\underset{{t}_{1}\le t\le {t}_{2}}{\mathrm{sup}}\left|\widehat{\tilde{U}}(t\mid {x}_{1},{z}_{1},{x}_{2},{z}_{2})/\widehat{\tilde{\sigma}}(t\mid {x}_{1},{z}_{1},{x}_{2},{z}_{2})\right|>{c}_{\alpha}\right]=\alpha .$$

The cut-off value can be estimated by approximating the distribution of *Ũ*(*t*|*x*_{1}, *z*_{1}, *x*_{2}, *z*_{2}) by obtaining a large number of realizations of $\widehat{\stackrel{~}{U}}(t\mid {x}_{1},{z}_{1},{x}_{2},{z}_{2})$ given the observed data by repeatedly generating random samples of {*G*_{i} : *i* = 1, …, *n*}.

Similarly, we can construct confidence interval and confidence band for the difference between cumulative incidence functions for a specific cause-*k* given different values of covariates, i.e., for *F _{k}*(

$$\sqrt{n}\left({\widehat{F}}_{1}(t\mid {x}_{1},{z}_{1})-{F}_{1}(t\mid {x}_{1},{z}_{1})\right)-\sqrt{n}\left({\widehat{F}}_{1}(t\mid {x}_{2},{z}_{2})-{F}_{1}(t\mid {x}_{2},{z}_{2})\right)$$

is asymptotically equivalent to $\frac{1}{\sqrt{n}}\sum_{i=1}^{n}{\tilde{\epsilon}}_{i}(t\mid {x}_{1},{z}_{1},{x}_{2},{z}_{2})$ with *a*(*t*) = *b*(*t*) = 1 for all *t* ∈ [0, *t*_{0}]. Hence, a (1 − *α*) × 100% confidence interval and confidence band for the difference over *t* ∈ [*t*_{1}, *t*_{2}] can be constructed as

$$({\widehat{F}}_{1}(t\mid {x}_{1},{z}_{1})-{\widehat{F}}_{1}(t\mid {x}_{2},{z}_{2}))\pm {n}^{-1/2}{c}_{\alpha}\widehat{\tilde{\sigma}}(t\mid {x}_{1},{z}_{1},{x}_{2},{z}_{2}),$$

where *c _{α}* and $\widehat{\stackrel{~}{\sigma}}(\cdot )$ can be calculated as mentioned above with

Next, we discuss constructing confidence intervals and bands for comparing cumulative incidence functions for two different causes given a specific value of the covariates. We begin by considering their ratio, i.e., *F*_{1}(*t*|*x*_{0}, *z*_{0})/*F*_{2}(*t*|*x*_{0}, *z*_{0}). Using the consistency of ${\widehat{F}}_{k}$ and boundedness in probability of $\sqrt{n}\left({\widehat{F}}_{k}(t\mid x,z)-{F}_{k}(t\mid x,z)\right)$ for

$$\begin{array}{cc}\hfill \sqrt{n}\left(\frac{{\widehat{F}}_{1}(t\mid {x}_{0},{z}_{0})}{{\widehat{F}}_{2}(t\mid {x}_{0},{z}_{0})}-\frac{{F}_{1}(t\mid {x}_{0},{z}_{0})}{{F}_{2}(t\mid {x}_{0},{z}_{0})}\right)& \approx \frac{1}{{F}_{2}(t\mid {x}_{0},{z}_{0})}\sqrt{n}({\widehat{F}}_{1}(t\mid {x}_{0},{z}_{0})-{F}_{1}(t\mid {x}_{0},{z}_{0}))\hfill \\ \hfill & -\frac{{F}_{1}(t\mid {x}_{0},{z}_{0})}{{F}_{2}^{2}(t\mid {x}_{0},{z}_{0})}\sqrt{n}({\widehat{F}}_{2}(t\mid {x}_{0},{z}_{0})-{F}_{2}(t\mid {x}_{0},{z}_{0})).\hfill \end{array}$$

For notational simplicity, let *a*(*t*) = 1/*F*_{2}(*t*|*x*_{0}, *z*_{0}) and $b\left(t\right)={F}_{1}(t\mid {x}_{0},{z}_{0})/{F}_{2}^{2}(t\mid {x}_{0},{z}_{0})$, the two coefficients appearing in the expansion above. Using similar arguments as in (4), we can show that $\sqrt{n}\left({\widehat{F}}_{1}(t\mid {x}_{0},{z}_{0})/{\widehat{F}}_{2}(t\mid {x}_{0},{z}_{0})-{F}_{1}(t\mid {x}_{0},{z}_{0})/{F}_{2}(t\mid {x}_{0},{z}_{0})\right)$ is asymptotically equivalent to a sum of square integrable martingales given by

$$\begin{aligned}
\tilde{U}(t\mid x_{0},z_{0}) &= \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\tilde{\epsilon}_{i}(t\mid x_{0},z_{0})\\
&= \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\Biggl\{\int_{0}^{t}S(u\mid x_{0},z_{0})\,x_{0}^{T}\Bigl(a(u)\left(n^{-1}W_{1}(u)\right)^{-1}\omega_{1i}(u)\,x_{i}\,dM_{1i}(u)\\
&\qquad\qquad -b(u)\left(n^{-1}W_{2}(u)\right)^{-1}\omega_{2i}(u)\,x_{i}\,dM_{2i}(u)\Bigr)\\
&\qquad -\int_{0}^{t}S(u\mid x_{0},z_{0})\,x_{0}^{T}\Bigl(a(u)X_{1}^{-}(u)Z(u)C_{1}^{-1}D_{1i}-b(u)X_{2}^{-}(u)Z(u)C_{2}^{-1}D_{2i}\Bigr)\,du\\
&\qquad -\int_{0}^{t}S(u\mid x_{0},z_{0})\,z_{0}^{T}\Bigl(a(u)C_{1}^{-1}D_{1i}-b(u)C_{2}^{-1}D_{2i}\Bigr)\,du\\
&\qquad -\sum_{k=1}^{K}\Biggl(\int_{0}^{t}\bigl(a(u)F_{1}^{C}(t,u)-b(u)F_{2}^{C}(t,u)\bigr)\,x_{0}^{T}\left(n^{-1}W_{k}(u)\right)^{-1}\omega_{ki}(u)\,x_{i}\,dM_{ki}(u)\\
&\qquad\qquad -\int_{0}^{t}\bigl(a(u)F_{1}^{C}(t,u)-b(u)F_{2}^{C}(t,u)\bigr)\,x_{0}^{T}X_{k}^{-}(u)Z(u)\,du\;C_{k}^{-1}D_{ki}\\
&\qquad\qquad -\int_{0}^{t}\bigl(a(u)F_{1}^{C}(t,u)-b(u)F_{2}^{C}(t,u)\bigr)\,du\;z_{0}^{T}C_{k}^{-1}D_{ki}\Biggr)\Biggr\}.
\end{aligned}$$

(6)

Using the martingale central limit theorem, the above process converges in distribution to a Gaussian process. Furthermore, using the martingale representation of *Ũ*(*t*|*x*_{0}, *z*_{0}) we can construct a consistent estimator of the asymptotic variance function. The variance function at time *t* can be consistently estimated by

$${\widehat{\tilde{\sigma}}}^{2}(t\mid {x}_{0},{z}_{0})=\frac{1}{n}\sum_{i=1}^{n}{\left({\widehat{\tilde{\epsilon}}}_{i}(t\mid {x}_{0},{z}_{0})\right)}^{2}$$

obtained by replacing all the terms with their empirical version and *F _{k}*(

We investigate the finite sample properties of the proposed confidence intervals and confidence bands for the cumulative incidence function(s) and apply the methods to a real data set. We begin with simulation studies conducted to evaluate the performance of our proposed method. We generated data from the semiparametric additive risk model (2) with two causes. The covariate *x* associated with the time-varying parameter was chosen to be a 0/1 variable, each value with probability .5. We also included a covariate *z*, associated with the time-constant effect, generated from the uniform distribution on [0, 1]. The *K* = 2 latent failure times were taken to be conditionally independent with cause-specific hazard functions given by

$${\lambda}_{k}(t\mid x,z)=x{\alpha}_{k}\left(t\right)+z{\beta}_{k}$$

where for *k* = 1 (cause 1), *α*_{1}(*t*) = 2.0 for 0 ≤ *t* ≤ 0.5 and 1.0 for *t* > 0.5 and *β*_{1} = 1.0; for *k* = 2 (cause 2), *α*_{2}(*t*) = 1.0 for *t* > 0 and *β*_{2} = 1.5. The censoring *C* was taken to be exponentially distributed with the rate parameter adjusted, so that 30% and 50% of the observations were censored. We simulated data of sample sizes *n* = 100 and *n* = 250, each replicated 1000 times. In order to illustrate the performance of the proposed estimator of the cumulative incidence functions, in Figure 3 we present a plot of the estimates of cumulative incidence function for cause 1 for covariate value *x* = 1 and *z* = .5 under the above model based on 50 replicates of the sample. The estimates are based on the sample sizes 100 and 250 with 30% censoring, respectively. Figure 3 shows that the estimates tend to get closer to the true cumulative incidence function as the sample size increases.
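The data-generating mechanism described above can be sketched as follows. The latent failure times are drawn by inverting the cumulative cause-specific hazards, which are piecewise linear for cause 1 and linear for cause 2. The censoring rate needed to reach 30% or 50% censoring is not reported in the text, so `censor_rate` is left as a free argument; the function name and return layout are likewise assumptions.

```python
import numpy as np

def simulate_competing_risks(n, censor_rate, seed=0):
    """Simulate (time, cause, x, z) under the two-cause additive hazards design.

    cause is 0 for censored observations, 1 or 2 for the observed failure type.
    censor_rate is a hypothetical tuning parameter (exponential censoring rate).
    """
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=n).astype(float)    # binary covariate, P(x = 1) = .5
    z = rng.uniform(0.0, 1.0, size=n)               # uniform covariate on [0, 1]

    # Cause 1: hazard x * alpha_1(t) + z * 1.0, alpha_1 = 2 on [0, .5] and 1 afterwards.
    e1 = rng.exponential(size=n)
    h_early, h_late = 2.0 * x + z, 1.0 * x + z
    cum_at_half = 0.5 * h_early                     # cumulative hazard at t = 0.5
    with np.errstate(divide="ignore", invalid="ignore"):
        t1 = np.where(e1 <= cum_at_half,
                      e1 / h_early,
                      0.5 + (e1 - cum_at_half) / h_late)

    # Cause 2: constant hazard x * 1.0 + z * 1.5.
    with np.errstate(divide="ignore"):
        t2 = rng.exponential(size=n) / (x + 1.5 * z)

    c = rng.exponential(scale=1.0 / censor_rate, size=n)
    first_failure = np.minimum(t1, t2)
    time = np.minimum(first_failure, c)
    cause = np.where(c <= first_failure, 0, np.where(t1 <= t2, 1, 2))
    return time, cause, x, z
```

In practice one would tune `censor_rate` by trial until the empirical censoring fraction, `np.mean(cause == 0)`, matches the target level.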

Predicted cumulative incidence function of malignant melanoma death for a 52-year-old female with ulceration and tumor thickness of 6.76 mm with 95% confidence intervals and bands.

Next, in order to examine the performance of the estimates of cumulative incidence functions and their confidence intervals at given time points for a specific covariate value, we considered *t* = .1, .2, .3, .4, .5 and .6, where .6 is close to the 70th percentile of the underlying distribution of the time to failure from the first cause, i.e., *k* = 1. We also investigated the performance of the confidence band for *t* ∈ [.1, .6]. The numbers reported in Table I are the bias, sampling standard deviation, and estimated asymptotic standard deviation of the predicted cumulative incidence function for cause 1 for covariates (*x*, *z*) = (1, .5) at the time points mentioned above. Table I also presents the coverage probability of the 95% confidence interval for the cumulative incidence function for cause 1 at these time points, as well as the coverage probability of its 95% confidence band in the last column. These results are based on 1000 replications for each of the sample sizes (*n* = 100, 250) and censoring percentages (30% and 50%).
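The summary measures reported for each time point can be computed from the replicate-level output as in the sketch below. It assumes that each replicate supplies a point estimate and an estimated standard error at a fixed time point, and it uses an untransformed normal interval for the coverage calculation, which may differ from the interval actually used in the study; the function name `summarize_replicates` is hypothetical.

```python
import numpy as np
from statistics import NormalDist

def summarize_replicates(estimates, std_errors, truth, alpha=0.05):
    """Bias, sampling SD, mean estimated SE and CI coverage across replicates.

    estimates, std_errors : (R,) arrays, one entry per simulated data set
    truth                 : true cumulative incidence at the time point
    """
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(std_errors, dtype=float)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    covered = (est - z * se <= truth) & (truth <= est + z * se)
    return {
        "bias": est.mean() - truth,
        "sampling_sd": est.std(ddof=1),
        "mean_est_sd": se.mean(),
        "coverage": covered.mean(),
    }
```

Close agreement between `sampling_sd` and `mean_est_sd` is the diagnostic used to judge whether the asymptotic variance estimator is adequate in finite samples.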

The numbers indicate overall small biases of the proposed cumulative incidence function estimates, and their sampling standard deviations decrease as the sample size increases for each time point and each censoring level. The performance remains the same when the censoring percentage increases from 30% to 50%. The estimated asymptotic standard deviations are very close to the sampling standard deviations, indicating the appropriateness of the finite sample approximation used to estimate the asymptotic standard deviation. Furthermore, the coverage probabilities of the 95% confidence intervals for the cumulative incidence function are good for *n* = 100 and get very close to the nominal level of 95% as the sample size increases to *n* = 250, for both the 30% and 50% censoring percentages. We next studied the performance of the proposed estimators for comparing cumulative incidence functions given different values of the covariates. In Table II, we present the bias, sampling standard deviation, estimated asymptotic standard deviation and the coverage probability of the 95% confidence interval for the difference between the cumulative incidence functions for cause 1 at (*x*_{1}, *z*_{1}) = (1, .5) and at (*x*_{2}, *z*_{2}) = (0, .5), as well as the coverage of its confidence band for *t* ∈ [.1, .6]. Overall, the performance is similar to that found in Table I, indicating that the proposed procedure for constructing confidence intervals and bands for comparing cumulative incidence functions performs very well even when the weight function is chosen to be the identity. Our experience indicates that choosing the weight function based on *λ* improves efficiency and, consequently, the coverage probability.

We now illustrate our methodology with an analysis of data on 205 malignant melanoma patients observed during the years 1962 to 1977 [2]. Among these 205 patients, 57 died from malignant melanoma, 14 died from causes other than malignant melanoma, and the remaining 134 were alive at the end of follow-up. The covariates considered here were tumor thickness (mean: 2.92 mm; standard deviation: 2.96 mm), ulceration status (90 present and 115 not present), age (mean: 52 years; standard deviation: 17 years) and gender (79 male and 126 female). In our analysis, tumor thickness and age were standardized. The main interest in this melanoma study is to predict the patient-specific cumulative incidence probability of melanoma death.

We began by fitting the proportional hazards model to the data and used the score process test proposed by [7] to test for proportionality of each covariate of interest. We found that the proportionality assumption was reasonable for age and sex, with *p*-values of 0.554 and 0.380, respectively. However, the proportionality tests for tumor thickness and ulceration status gave *p*-values of 0.032 and 0.008, respectively, indicating lack of fit. This suggests that Cox's model may be inappropriate here.

We next fitted the additive model (1) in which all covariates had nonparametric time-varying effects. Applying the testing procedures of [11] (results not presented here) indicated that gender and age did not have significant time-varying effects. Based on this preliminary analysis, our final model treats the effects of gender and age as time-constant and those of tumor thickness and ulceration status as time-varying nonparametric effects. We used the weight function introduced in Section 2 throughout this analysis, as we found that it improved efficiency compared with estimates based on the identity weight matrix. Based on this final model, the cumulative effects of the baseline, tumor thickness and ulceration, with 95% pointwise confidence intervals and bands, are given in Figure 4. The cumulative regression function for tumor thickness shows a positive effect in the first 4 years, which then seems to flatten out. Ulceration shows a positive effect within the first 4 years and a slower effect after that. The 95% confidence intervals for the constant coefficients of gender and age are (−.1021, .1450) and (−.0535, .0691), respectively.

Predicted difference between two cumulative incidence functions of malignant melanoma death at three sets of covariate values specified in the last paragraph of Section 4.

Next, we estimated the cumulative incidence function of malignant melanoma for a 52-year-old female patient with ulceration and a tumor thickness of 6.76 mm (the 90th percentile of tumor thickness), and displayed its 95% confidence intervals and bands under the semiparametric additive model in Figure 4. We used the transformation *g*(·) = log(−log(·)).
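To illustrate what the log(−log) transformation does to a pointwise interval, here is a minimal delta-method sketch. It assumes a point estimate strictly between 0 and 1 and a standard error on the original scale; the function name `loglog_interval` and the numeric inputs are hypothetical, and a confidence band would replace the normal quantile `z` with a band-specific critical value that is not reproduced here.

```python
import math


def loglog_interval(f_hat, se, z=1.96):
    """Pointwise interval for a probability using g(F) = log(-log(F)).

    f_hat : estimated cumulative incidence, strictly between 0 and 1
    se    : standard error of f_hat on the original scale
    z     : normal quantile (1.96 gives a 95% pointwise interval)
    """
    if not 0.0 < f_hat < 1.0:
        raise ValueError("f_hat must lie strictly between 0 and 1")
    # Delta method: |g'(F)| = 1 / (F * |log F|), so the half-width on the g scale is
    half_width = z * se / (f_hat * abs(math.log(f_hat)))
    # Back-transforming g(f_hat) +/- half_width gives f_hat ** exp(+/- half_width).
    # g is decreasing in F, so the larger exponent yields the lower limit.
    lower = f_hat ** math.exp(half_width)
    upper = f_hat ** math.exp(-half_width)
    return lower, upper


# Hypothetical inputs, not values from the melanoma analysis.
lower, upper = loglog_interval(0.40, 0.05)
```

The practical benefit of the transformation is that both limits stay inside (0, 1), which an untransformed symmetric interval does not guarantee.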

Finally, in Figure 4 we estimate the difference between the cumulative incidence functions for malignant melanoma at different covariate values. In Figure 4(a), to ascertain the effect of tumor thickness on the cumulative incidence function, we estimated the difference between the cumulative incidence function for a 52-year-old female with ulceration and a tumor thickness one standard deviation above the observed mean and that for a female of the same age with ulceration and tumor thickness at the mean value. In Figure 4(b), to ascertain the effect of ulceration, we estimated the difference between the cumulative incidence function for a 52-year-old female with mean tumor thickness and ulceration and that for a 52-year-old female with mean tumor thickness but no ulceration. In Figure 4(c), to ascertain the gender effect, we estimated the difference between the cumulative incidence function for a 52-year-old male with ulceration and mean tumor thickness and that for a 52-year-old female with ulceration and mean tumor thickness. For a 52-year-old female with ulceration, Figure 4(a) estimates the additional probability over time of dying from malignant melanoma for those with a tumor thickness of 5.88 mm (one standard deviation above the mean) compared with those with a tumor thickness of 2.92 mm (the mean). For a 52-year-old female with mean tumor thickness, Figure 4(b) shows the estimated increase in the probability of dying from malignant melanoma for those with ulceration compared with those without. Figure 4(c) indicates that, among 52-year-old patients with ulceration and mean tumor thickness, men have a higher probability of dying from malignant melanoma than women.
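Because tumor thickness and age enter the model on a standardized scale, the three comparisons can be written as pairs of covariate profiles. The sketch below is only illustrative: the helper names, the dictionary keys, and the 0/1 codings for ulceration and sex are assumptions, since the coding used in the analysis is not stated here.

```python
# Sample summaries reported for the melanoma data.
THICKNESS_MEAN_MM, THICKNESS_SD_MM = 2.92, 2.96
AGE_MEAN_YEARS, AGE_SD_YEARS = 52.0, 17.0


def standardize(value, mean, sd):
    """Return (value - mean) / sd."""
    return (value - mean) / sd


def profile(thickness_mm, age_years, ulceration, male):
    """Build one covariate profile (hypothetical 0/1 coding for the indicators)."""
    return {
        "thickness_std": standardize(thickness_mm, THICKNESS_MEAN_MM, THICKNESS_SD_MM),
        "age_std": standardize(age_years, AGE_MEAN_YEARS, AGE_SD_YEARS),
        "ulceration": ulceration,
        "male": male,
    }


# (a) thickness effect: 5.88 mm vs 2.92 mm, female, age 52, with ulceration
contrast_a = (profile(5.88, 52, 1, 0), profile(2.92, 52, 1, 0))
# (b) ulceration effect: with vs without, female, age 52, mean thickness
contrast_b = (profile(2.92, 52, 1, 0), profile(2.92, 52, 0, 0))
# (c) gender effect: male vs female, age 52, with ulceration, mean thickness
contrast_c = (profile(2.92, 52, 1, 1), profile(2.92, 52, 1, 0))

# The 6.76 mm thickness used for the single-patient prediction, on the same scale.
thickness_p90_std = standardize(6.76, THICKNESS_MEAN_MM, THICKNESS_SD_MM)
```

Each pair differs in exactly one covariate, so the estimated difference in cumulative incidence isolates that covariate's effect at the chosen reference values of the others.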

In this paper, we have proposed using the semiparametric additive model for the prediction and comparison of cause-specific cumulative incidence functions for given values of the covariates. This model adds to the existing approaches for analyzing competing risks data, which include the multiplicative model of [6], Aalen's additive model (with only fixed effects of covariates) of [8] and Aalen's nonparametric additive model [10], among others. Most of the existing literature studies cumulative incidence functions by modeling the cause-specific hazard functions; direct modeling of cumulative incidence functions has recently been studied by [12]. This collection of models gives a rich variety from which a user can choose the best-fitting model for the data at hand. Under the semiparametric additive hazards model, we developed statistical procedures to construct confidence intervals and bands for the difference and ratio of two cumulative incidence functions given the covariates. The proposed methods can be used to estimate the differential (or relative) probability of failure from one cause under two sets of covariate values over time. Compared with direct modeling of cumulative incidence functions, our approach is more flexible in that we do not need to assume simple forms for covariate effects on the cumulative incidence function. Thus, our approach can capture covariate effects that are nonlinear on the cumulative incidence function. Our R package for the proposed methods is available upon request from the corresponding author and will be posted on a website.

Cumulative incidence function of cause 1 (solid line) at (*x*, *z*) = (1, .5) with 50 random estimates (dotted line) for sample sizes 100 and 250 with 30% censoring.

The research of the first author was supported in part by the intramural research program of the National Institutes of Health and in part by NSF grants DMS-0304922 and DMS-0604576. The research of the second author was partially supported by NSF grant DMS-0604576 and NIH grant 2 RO1 AI054165-04. The research of the third author was supported by the intramural research program of the *Eunice Kennedy Shriver* National Institute of Child Health and Human Development. The authorship is listed in alphabetical order.

The following results show that $\sqrt{n}\left({\widehat{F}}_{1}(t\mid {x}_{0},{z}_{0})-{F}_{1}(t\mid {x}_{0},{z}_{0})\right)$ is asymptotically equivalent to a sum of martingale terms, *U*_{1}(*t*|*x*_{0}, *z*_{0}), and that it converges in distribution to a Gaussian process. Using arguments similar to those given by [1], it can be shown that

$$\sqrt{n}\left({\widehat{\beta}}_{k}-{\beta}_{k}\right)={C}_{k}^{-1}\frac{1}{\sqrt{n}}\underset{i=1}{\overset{n}{\Sigma}}{D}_{\mathit{ki}}+{o}_{p}\left(1\right),$$

and

$$\sqrt{n}\left({\widehat{A}}_{k}\left(t\right)-{A}_{k}\left(t\right)\right)=\frac{1}{\sqrt{n}}\underset{i=1}{\overset{n}{\Sigma}}{Q}_{\mathit{ki}}\left(t\right)+{o}_{p}\left(1\right),$$

where

$${Q}_{\mathit{ki}}\left(t\right)={\int}_{0}^{t}{\left({n}^{-1}{W}_{k}\left(u\right)\right)}^{-1}\phantom{\rule{thinmathspace}{0ex}}{\omega}_{\mathit{ki}}\left(u\right){x}_{i}\phantom{\rule{thinmathspace}{0ex}}{\mathit{dM}}_{\mathit{ki}}\left(u\right)-{\int}_{0}^{t}{X}_{k}^{-}\left(u\right)Z\left(u\right)\phantom{\rule{thinmathspace}{0ex}}{\mathit{duC}}_{k}^{-1}{D}_{\mathit{ki}}.$$

Note that *M _{ki}* is a martingale with respect to the counting process

Define ${W}_{k}(t\mid {x}_{0},{z}_{0})=\sqrt{n}\left({\widehat{\Lambda}}_{k}(t\mid {x}_{0},{z}_{0})-{\Lambda}_{k}(t\mid {x}_{0},{z}_{0})\right)$. Then the process *W*_{*k*}(*t*|*x*_{0}, *z*_{0}) can be written as

$$\begin{array}{cc}\hfill {W}_{k}(t\mid {x}_{0},{z}_{0})& =\sqrt{n}\left({x}_{0}^{T}{\widehat{A}}_{k}\left(t\right)+{\mathit{tz}}_{0}^{T}{\widehat{\beta}}_{k}-{x}_{0}^{T}{A}_{k}\left(t\right)-{\mathit{tz}}_{0}^{T}{\beta}_{k}\right)\hfill \\ \hfill & ={x}_{0}^{T}\left[\sqrt{n}\left({\widehat{A}}_{k}\left(t\right)-{A}_{k}\left(t\right)\right)\right]+{\mathit{tz}}_{0}^{T}\left[\sqrt{n}\left({\widehat{\beta}}_{k}-{\beta}_{k}\right)\right]\hfill \\ \hfill & \approx \frac{1}{\sqrt{n}}\underset{i=1}{\overset{n}{\Sigma}}\left({x}_{0}^{T}{Q}_{\mathit{ki}}\left(t\right)+{\mathit{tz}}_{0}^{T}{C}_{k}^{-1}{D}_{\mathit{ki}}\right)\equiv {\stackrel{~}{W}}_{k}(t\mid {x}_{0},{z}_{0}).\hfill \end{array}$$
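The display above uses the additive form $\Lambda_k(t\mid x_0,z_0)=x_0^TA_k(t)+t\,z_0^T\beta_k$. As a rough numerical illustration of how a cumulative incidence function follows from such cumulative hazards, the sketch below evaluates $F_1(t)=\int_0^t S(u)\,d\Lambda_1(u)$ on a grid, assuming the usual relation $S(t)=\exp\{-\sum_k\Lambda_k(t)\}$. The function names, the grid, and the constant-hazard toy inputs are all assumptions made for the example; this is not the authors' R implementation.

```python
import math


def additive_cum_hazard(times, x0, A_values, z0, beta):
    """Lambda_k(t | x0, z0) = x0' A_k(t) + t * z0' beta_k on a time grid.

    A_values[j] is the vector A_k(times[j]); beta is the constant-effect vector.
    """
    z_part = sum(z * b for z, b in zip(z0, beta))
    return [
        sum(x * a for x, a in zip(x0, A_values[j])) + t * z_part
        for j, t in enumerate(times)
    ]


def cumulative_incidence(times, cum_hazards, cause=0):
    """Left-endpoint Riemann-Stieltjes sum for F_cause(t) = int_0^t S(u) dLambda_cause(u).

    cum_hazards[k][j] is Lambda_k evaluated at times[j]; times[0] must be 0.
    """
    cif = [0.0]
    for j in range(1, len(times)):
        total_prev = sum(lam[j - 1] for lam in cum_hazards)
        surv_prev = math.exp(-total_prev)  # overall survival just before times[j]
        increment = cum_hazards[cause][j] - cum_hazards[cause][j - 1]
        cif.append(cif[-1] + surv_prev * increment)
    return cif


# Toy example (hypothetical): two causes with constant hazards 0.7 and 0.3.
times = [j * 0.001 for j in range(601)]  # grid on [0, 0.6]
x0, z0 = [1.0], [0.5]
lam1 = additive_cum_hazard(times, x0, [[0.5 * t] for t in times], z0, [0.4])
lam2 = additive_cum_hazard(times, x0, [[0.3 * t] for t in times], z0, [0.0])
cif1 = cumulative_incidence(times, [lam1, lam2], cause=0)
```

In this toy case the total hazard is 1, so the exact answer is $0.7\,(1-e^{-t})$, which the grid sum should reproduce up to discretization error.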

Note that

$$\begin{array}{cc}\hfill {\widehat{F}}_{1}(t\mid {x}_{0},{z}_{0})-{F}_{1}(t\mid {x}_{0},{z}_{0})=& {\int}_{0}^{t}S(u\mid {x}_{0},{z}_{0})d\{{\widehat{\Lambda}}_{1}(u\mid {x}_{0},{z}_{0})-{\Lambda}_{1}(u\mid {x}_{0},{z}_{0})\}\hfill \\ \hfill & +{\int}_{0}^{t}\{\widehat{S}(u\mid {x}_{0},{z}_{0})-S(u\mid {x}_{0},{z}_{0})\}\phantom{\rule{thinmathspace}{0ex}}d{\widehat{\Lambda}}_{1}(u\mid {x}_{0},{z}_{0}).\hfill \end{array}$$

Applying a first-order Taylor expansion to the second component in the preceding expression gives

$$\begin{aligned} & \int_{0}^{t}\{\widehat{S}(u\mid x_{0},z_{0})-S(u\mid x_{0},z_{0})\}\,d\widehat{\Lambda}_{1}(u\mid x_{0},z_{0}) \\ & \approx -\int_{0}^{t}S(u\mid x_{0},z_{0})\left(\sum_{k=1}^{K}\widehat{\Lambda}_{k}(u\mid x_{0},z_{0})-\sum_{k=1}^{K}\Lambda_{k}(u\mid x_{0},z_{0})\right)d\widehat{\Lambda}_{1}(u\mid x_{0},z_{0}). \end{aligned}$$

Then it follows that

$$\begin{array}{cc}\hfill & \sqrt{n}\left({\widehat{F}}_{1}(t\mid {x}_{0},{z}_{0})-{F}_{1}(t\mid {x}_{0},{z}_{0})\right)\hfill \\ \hfill & \approx {\int}_{0}^{t}S(u\mid {x}_{0},{z}_{0})\phantom{\rule{thinmathspace}{0ex}}d{\stackrel{~}{W}}_{1}(u\mid {x}_{0},{z}_{0})-\underset{k=1}{\overset{K}{\Sigma}}{\int}_{0}^{t}S(u\mid {x}_{0},{z}_{0}){\stackrel{~}{W}}_{k}(u\mid {x}_{0},{z}_{0})\phantom{\rule{thinmathspace}{0ex}}d{\Lambda}_{1}(u\mid {x}_{0},{z}_{0}).\hfill \end{array}$$

(7)

From integration by parts,

$$\begin{array}{cc}\hfill {\int}_{0}^{t}& S(u\mid {x}_{0},{z}_{0}){\stackrel{~}{W}}_{k}(u\mid {x}_{0},{z}_{0})\phantom{\rule{thinmathspace}{0ex}}d{\Lambda}_{1}(u\mid {x}_{0},{z}_{0})\hfill \\ \hfill & ={\int}_{0}^{t}{\stackrel{~}{W}}_{k}(u\mid {x}_{0},{z}_{0})S(u\mid {x}_{0},{z}_{0})\phantom{\rule{thinmathspace}{0ex}}d{\Lambda}_{1}(u\mid {x}_{0},{z}_{0})\hfill \\ \hfill & ={\int}_{0}^{t}{\stackrel{~}{W}}_{k}(u\mid {x}_{0},{z}_{0})\phantom{\rule{thinmathspace}{0ex}}{\mathit{dF}}_{1}(u\mid {x}_{0},{z}_{0})\hfill \\ \hfill & ={F}_{1}(t\mid {x}_{0},{z}_{0}){\stackrel{~}{W}}_{k}(t\mid {x}_{0},{z}_{0})-{\int}_{0}^{t}{F}_{1}(u\mid {x}_{0},{z}_{0})\phantom{\rule{thinmathspace}{0ex}}d{\stackrel{~}{W}}_{k}(u\mid {x}_{0},{z}_{0})\hfill \\ \hfill & ={\int}_{0}^{t}\{{F}_{1}(t\mid {x}_{0},{z}_{0})-{F}_{1}(u\mid {x}_{0},{z}_{0})\}\phantom{\rule{thinmathspace}{0ex}}d{\stackrel{~}{W}}_{k}(u\mid {x}_{0},{z}_{0})\hfill \\ \hfill & ={\int}_{0}^{t}{F}_{1}^{C}(t,u)\phantom{\rule{thinmathspace}{0ex}}d{\stackrel{~}{W}}_{k}(u\mid {x}_{0},{z}_{0}).\hfill \end{array}$$
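The chain of equalities above relies on three small facts that are worth spelling out. The following worked step is a sketch that suppresses the conditioning on $(x_0, z_0)$ and assumes $F_1(0)=0$ and $\widetilde{W}_k(0)=0$, both of which hold because the defining integrals start at 0. First, $dF_1(u)=S(u)\,d\Lambda_1(u)$ gives the third line. Second, the boundary term at 0 in the integration by parts vanishes. Third,

$$F_1(t)\,\widetilde{W}_k(t)=F_1(t)\int_0^t d\widetilde{W}_k(u)=\int_0^t F_1(t)\,d\widetilde{W}_k(u),$$

so that

$$F_1(t)\,\widetilde{W}_k(t)-\int_0^t F_1(u)\,d\widetilde{W}_k(u)=\int_0^t\{F_1(t)-F_1(u)\}\,d\widetilde{W}_k(u),$$

with the shorthand $F_1^C(t,u)\equiv F_1(t)-F_1(u)$ used in the last line.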

Since

$${\stackrel{~}{W}}_{k}(t\mid {x}_{0},{z}_{0})=\frac{1}{\sqrt{n}}\underset{i=1}{\overset{n}{\Sigma}}\left({x}_{0}^{T}{Q}_{\mathit{ki}}\left(t\right)+{\mathit{tz}}_{0}^{T}{C}_{k}^{-1}{D}_{\mathit{ki}}\right),$$

it follows that

$$\begin{aligned} \int_{0}^{t} & S(u\mid x_{0},z_{0})\,\widetilde{W}_{k}(u\mid x_{0},z_{0})\,d\Lambda_{1}(u\mid x_{0},z_{0}) \\ & =\int_{0}^{t}F_{1}^{C}(t,u)\,d\widetilde{W}_{k}(u\mid x_{0},z_{0}) \\ & =\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\Big(\int_{0}^{t}F_{1}^{C}(t,u)\,x_{0}^{T}\left(n^{-1}W_{k}(u)\right)^{-1}\omega_{ki}(u)\,x_{i}\,dM_{ki}(u) \\ & \qquad\qquad -\int_{0}^{t}F_{1}^{C}(t,u)\,x_{0}^{T}X_{k}^{-}(u)Z(u)\,du\,C_{k}^{-1}D_{ki} \\ & \qquad\qquad +\int_{0}^{t}F_{1}^{C}(t,u)\,du\,z_{0}^{T}C_{k}^{-1}D_{ki}\Big). \end{aligned}$$

(8)

Then from (7) and (8) $\sqrt{n}\left({\widehat{F}}_{1}(t\mid {x}_{0},{z}_{0})-{F}_{1}(t\mid {x}_{0},{z}_{0})\right)$ can be approximated by martingale processes

$$U_{1}(t\mid x_{0},z_{0})=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\epsilon_{1i}(t\mid x_{0},z_{0}),$$

(9)

where

$$\begin{aligned} \epsilon_{1i}(t\mid x_{0},z_{0}) & =\int_{0}^{t}S(u\mid x_{0},z_{0})\,x_{0}^{T}\left(n^{-1}W_{1}(u)\right)^{-1}\omega_{1i}(u)\,x_{i}\,dM_{1i}(u) \\ & \quad -\int_{0}^{t}S(u\mid x_{0},z_{0})\,x_{0}^{T}X_{1}^{-}(u)Z(u)\,du\,C_{1}^{-1}D_{1i} \\ & \quad +\int_{0}^{t}S(u\mid x_{0},z_{0})\,du\,z_{0}^{T}C_{1}^{-1}D_{1i} \\ & \quad -\sum_{k=1}^{K}\Big(\int_{0}^{t}F_{1}^{C}(t,u)\,x_{0}^{T}\left(n^{-1}W_{k}(u)\right)^{-1}\omega_{ki}(u)\,x_{i}\,dM_{ki}(u) \\ & \qquad\qquad -\int_{0}^{t}F_{1}^{C}(t,u)\,x_{0}^{T}X_{k}^{-}(u)Z(u)\,du\,C_{k}^{-1}D_{ki} \\ & \qquad\qquad +\int_{0}^{t}F_{1}^{C}(t,u)\,du\,z_{0}^{T}C_{k}^{-1}D_{ki}\Big). \end{aligned}$$

(10)

Under our assumption that *T _{ki}* does not equal

$$\widehat{\sigma}_{1}^{2}(t\mid x_{0},z_{0})=\frac{1}{n}\sum_{i=1}^{n}\left(\widehat{\epsilon}_{1i}(t\mid x_{0},z_{0})\right)^{2},$$

(11)

and $\widehat{\epsilon}_{1i}(t\mid x_{0},z_{0})$ is obtained from (10) by plugging in estimates of the unknown quantities.
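Once the plug-in terms are available, the variance estimator (11) is just an average of squares. The sketch below shows that computation together with an untransformed pointwise interval of the form estimate ± z·σ̂/√n. The function names and the toy influence values are hypothetical, and computing the plug-in terms themselves (the hard part) is not attempted here.

```python
import math


def variance_from_influence(eps_hat):
    """Estimator (11): the average of the squared plug-in terms at a fixed t."""
    n = len(eps_hat)
    return sum(e * e for e in eps_hat) / n


def pointwise_interval(f_hat, eps_hat, z=1.96):
    """Untransformed pointwise interval f_hat +/- z * sigma_hat / sqrt(n)."""
    n = len(eps_hat)
    sigma_hat = math.sqrt(variance_from_influence(eps_hat))
    half_width = z * sigma_hat / math.sqrt(n)
    return f_hat - half_width, f_hat + half_width


# Toy influence terms (hypothetical): n = 100 values of +/- 0.2.
eps_hat = [0.2, -0.2] * 50
sigma2_hat = variance_from_influence(eps_hat)
lower, upper = pointwise_interval(0.30, eps_hat)
```

With these toy values the estimated variance is 0.04, so the 95% half-width is 1.96 × 0.2 / 10 = 0.0392.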

1. McKeague IW, Sasieni PD. A partly parametric additive risk model. Biometrika. 1994;81:501–514.

2. Andersen PK, Borgan Ø, Gill RD, Keiding N. Statistical Models Based on Counting Processes. Springer-Verlag; New York: 1993.

3. Kalbfleisch JD, Prentice RL. The Statistical Analysis of Failure Time Data. Wiley; New York: 1980.

4. Cox DR. Regression models and life tables (with discussion). Journal of the Royal Statistical Society, Series B. 1972;34:187–220.

5. Andersen PK, Gill RD. Cox's regression model for counting processes: A large sample study. Annals of Statistics. 1982;10:1100–1120.

6. Cheng SC, Fine JP, Wei LJ. Prediction of cumulative incidence function under the proportional hazards model. Biometrics. 1998;54:219–228.

7. Lin DY, Wei LJ, Ying Z. Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika. 1993;80:557–572.

8. Shen Y, Cheng SC. Confidence bands for cumulative incidence curves under the additive risk model. Biometrics. 1999;55:1093–1100.

9. Lin DY, Ying Z. Semiparametric analysis of the additive risk model. Biometrika. 1994;81:61–71.

10. Aalen OO, Borgan O, Fekjaer H. Covariate adjustment of event histories estimated from Markov chains: The additive approach. Biometrics. 2001;57:993–1001.

11. Martinussen T, Scheike TH. Dynamic Regression Models for Survival Data. Springer; New York: 2006.

12. Scheike TH, Zhang MJ, Gerds T. Predicting cumulative incidence probability by direct binomial regression. Biometrika. 2008;95:1–16. DOI: 10.1093/biomet/asm096.
