
J Am Stat Assoc. Author manuscript; available in PMC 2010 August 27.

Published in final edited form as:

J Am Stat Assoc. 2010 June 1; 105(490): 683–691.

doi: 10.1198/jasa.2010.tm09302. PMCID: PMC2929143

NIHMSID: NIHMS201448

Wenbin Lu, (Email: ude.uscn.tats@4ulw), Department of Statistics, North Carolina State University, Raleigh, NC 27695.


We study a general class of partially linear transformation models, which extend linear transformation models by incorporating nonlinear covariate effects in survival data analysis. A new martingale-based estimating equation approach, consisting of both global and kernel-weighted local estimating equations, is developed for estimating the parametric and nonparametric covariate effects in a unified manner. We show that, with a proper choice of the kernel bandwidth parameter, one can obtain consistent and asymptotically normal estimates of the linear effects. Asymptotic properties of the estimated nonlinear effects are established as well. We further suggest a simple resampling method to estimate the asymptotic variance of the linear estimates and show its effectiveness. To facilitate the implementation of the new procedure, an iterative algorithm is developed. Numerical examples are given to illustrate the finite-sample performance of the procedure.

Linear transformation models provide a general framework for model estimation and inference in censored survival data analysis, and they have recently attracted considerable attention due to their high flexibility (Clayton and Cuzick, 1985; Bickel et al., 1993; Cheng et al., 1995, 1997; Fine et al., 1998; Chen et al., 2002; Zeng and Lin, 2006; among others). Let *T* be the survival time and **Z** be the *p*-dimensional covariate vector. To model the effects of **Z** on the response *T*, linear transformation models assume that

$$H(T)=-\beta^{\prime}\mathbf{Z}+\epsilon,$$

(1)

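where *H* is a completely unspecified strictly increasing function, **β** is a *p*-dimensional vector of regression coefficients, and $\epsilon$ is the error term with a known distribution.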

Linear transformation models include many useful models as special cases. For example, if $\epsilon$ follows the extreme value distribution, then model (1) becomes Cox’s proportional hazards model (Cox, 1972); if $\epsilon$ follows the standard logistic distribution, then (1) reduces to the proportional odds model (Pettitt, 1982, 1984; Bennett, 1983); if there is no censoring and $\epsilon$ follows the standard normal distribution, (1) generalizes the usual Box-Cox transformation models. Many procedures have been suggested for estimating **β** in (1). Among them, Cheng et al. (1995) proposed the inverse censoring-probability weighted (ICPW) method for estimating **β**.
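As a quick check of the first special case, the short derivation below recovers the proportional hazards form from model (1). It assumes the extreme value survival function $P(\epsilon > x) = \exp\{-\exp(x)\}$ and uses only the monotonicity of *H*:

$$P(T>t\mid \mathbf{Z}) = P\{H(T)>H(t)\mid \mathbf{Z}\} = P\{\epsilon > H(t)+\beta^{\prime}\mathbf{Z}\} = \exp\left[-\exp\{H(t)\}\exp(\beta^{\prime}\mathbf{Z})\right].$$

The conditional cumulative hazard is therefore $\exp\{H(t)\}\exp(\beta^{\prime}\mathbf{Z})$, a baseline cumulative hazard $\exp\{H(t)\}$ multiplied by the relative risk $\exp(\beta^{\prime}\mathbf{Z})$.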

Despite these accomplishments, one limitation of linear transformation models is that all covariate effects are assumed to be linear. This assumption is sometimes too restrictive or unrealistic. For example, in the analysis of the lung cancer data from the Veteran’s Administration lung cancer trial (Kalbfleisch and Prentice, 2002), the covariate age shows a strong nonlinear (U-shaped) effect on patient survival time. Such an important effect may be missed if the covariate effects are assumed to be linear (Tibshirani, 1997; Lu and Zhang, 2007). Another motivation for partially linear transformation models comes from the New York University Women’s Health Study (NYUWHS). In this study, one primary interest is the effect of sex hormone levels on the time to developing breast carcinoma, and these effects usually show strongly nonlinear trends. The common practice is to break the continuous hormone levels into discrete quantiles (Zeleniuch-Jacquotte et al., 2004). However, such discretization cannot make effective use of the entire data and, worse, the final fit cannot retain the smooth curve of the nonlinear effect. Therefore, it is desirable to provide a more powerful class of semiparametric survival models that can accommodate both linear and nonlinear covariate effects under one unified framework.

In this paper, we consider a class of partially linear transformation models and study their estimation and inference properties. In particular, we consider

$$H(T)=-\beta^{\prime}\mathbf{Z}-f(X)+\epsilon,$$

(2)

where **β** is the vector of regression parameters for linear covariates and *f* is an unspecified smooth function describing the nonlinear effect of the covariate *X*.

In this paper we propose a system of martingale representation based global and local estimating equations to naturally deal with all these difficulties. Furthermore, under appropriate regularity conditions, we show that with a range of choices of the smoothing parameter (the kernel bandwidth), the estimator of **β** is root-*n* consistent and asymptotically normal.

The rest of the article is organized as follows. In Section 2, we describe the global and local estimating equations for the parameters **β**, *H*, and *f*.

In this section, we propose a system of estimating equations based on the martingale representation to simultaneously estimate *H*, **β**, and *f*.

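Suppose there are *n* subjects in the study. Let *T*_{i}, *C*_{i} and $(\mathbf{Z}_{i}^{\prime},X_{i})^{\prime}$ be respectively the failure time, censoring time, and covariates of the *i*th subject. With the observed time $\tilde{T}_{i}=\min(T_{i},C_{i})$ and the censoring indicator $\delta_{i}=I(T_{i}\le C_{i})$, let $N_{i}(t)=\delta_{i}I(\tilde{T}_{i}\le t)$ and $Y_{i}(t)=I(\tilde{T}_{i}\ge t)$ denote the counting and at-risk processes, and define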

$$M_{i}(t)=N_{i}(t)-\int_{0}^{t}Y_{i}(s)\,d\Lambda\{H_{0}(s)+\beta_{0}^{\prime}\mathbf{Z}_{i}+f_{0}(X_{i})\},\quad i=1,\cdots,n,$$

(3)

where Λ(·) is the known cumulative hazard function of $\epsilon$ and (*β*_{0}, *H*_{0}, *f*_{0}) are the true values of (**β**, *H*, *f*).
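The ingredients of the martingale in (3) can be illustrated with a few lines of Python. This is a minimal sketch that assumes the standard right-censoring definitions (observed time as the minimum of failure and censoring times, a counting process that jumps at an observed failure, and an at-risk indicator); the function names and the toy data are hypothetical.

```python
def counting_process(obs_time, event, t):
    """N_i(t): 1 if subject i has an observed failure by time t, else 0."""
    return 1 if (event == 1 and obs_time <= t) else 0


def at_risk(obs_time, t):
    """Y_i(t): 1 if subject i is still under observation at time t, else 0."""
    return 1 if obs_time >= t else 0


# Toy (failure time, censoring time) pairs, purely illustrative.
pairs = [(2.0, 5.0), (4.0, 3.0), (1.5, 1.0)]
obs = [min(T, C) for T, C in pairs]          # observed times
delta = [int(T <= C) for T, C in pairs]      # 1 = failure observed, 0 = censored
```

Only the first subject in the toy data fails before being censored, so only that subject's counting process ever jumps.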

If the nonparametric function *f* were known, partially linear transformation models would reduce to linear transformation models, and thus we could adopt the estimating equations suggested by Chen et al. (2002) to estimate **β** and *H*:

$$\sum_{i=1}^{n}\left[dN_{i}(t)-Y_{i}(t)\,d\Lambda\{H(t)+\beta^{\prime}\mathbf{Z}_{i}+f(X_{i})\}\right]=0,\quad t\ge 0,\quad H(0)=-\infty,$$

(4)

$$\sum_{i=1}^{n}\int_{0}^{\tau}\mathbf{Z}_{i}\left[dN_{i}(t)-Y_{i}(t)\,d\Lambda\{H(t)+\beta^{\prime}\mathbf{Z}_{i}+f(X_{i})\}\right]=0,$$

(5)

where $\tau=\inf\{t:P(\tilde{T}>t)=0\}$. Note that (4) is a martingale difference equation used for estimating the transformation function *H* when **β** is fixed, while (5) is a martingale integral equation used for identifying **β**.
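For a step function *H* that jumps only at observed failure times, equation (4) can be solved one failure time at a time. The sketch below is one possible implementation under stated assumptions: the linear predictor `eta[i]` (standing for the sum of the linear and nonlinear covariate terms) is held fixed, Λ defaults to the exponential function (the proportional hazards case), and each scalar equation is solved by bisection. The name `solve_H` and the bisection bounds are hypothetical, and the equation (5) for **β** is not handled here.

```python
import math


def solve_H(obs_time, event, eta, cum_hazard=math.exp, lo=-50.0, hi=50.0):
    """Step-function solution of estimating equation (4) for fixed eta_i.

    At each distinct failure time t_k the jump of H is chosen so that the
    increments of Lambda{H + eta_i} over the risk set sum to the number of
    failures at t_k. Before the first failure H is -infinity, so Lambda = 0.
    """
    times = sorted({t for t, d in zip(obs_time, event) if d == 1})
    H_vals, prev = [], None
    for tk in times:
        d_k = sum(1 for t, d in zip(obs_time, event) if d == 1 and t == tk)
        risk = [e for t, e in zip(obs_time, eta) if t >= tk]  # Y_i(t_k) = 1
        base = 0.0 if prev is None else sum(cum_hazard(prev + e) for e in risk)

        def g(h):
            return sum(cum_hazard(h + e) for e in risk) - base - d_k

        a, b = (lo if prev is None else prev), hi
        for _ in range(200):  # bisection; g is increasing in h
            mid = 0.5 * (a + b)
            if g(mid) > 0:
                b = mid
            else:
                a = mid
        prev = 0.5 * (a + b)
        H_vals.append(prev)
    return times, H_vals
```

With Λ = exp and all `eta` equal to zero, the exponentiated solution reduces to the Nelson-Aalen cumulative hazard estimator, which gives a convenient sanity check.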

Next, in order to estimate *f*, we approximate it locally by a linear function

$$f\left(x\right)\approx {\gamma}_{0}\left(u\right)+{\gamma}_{1}\left(u\right)(x-u),$$

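for *x* in a neighborhood of *u*, where *γ*_{0}(*u*) = *f*(*u*) and $\gamma_{1}(u)=\dot{f}(u)$. The superscript dot denotes the first-order derivative. Let *K* be a symmetric probability density function and $K_{h}(\cdot)=K(\cdot/h)/h$, where *h* > 0 is the bandwidth. For given **β** and *H*, we estimate $\gamma_{0}(x)$ and $\gamma_{1}(x)$ by solving the kernel-weighted local estimating equations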

$$\sum_{i=1}^{n}\int_{0}^{\tau}K_{h}(X_{i}-x)\left[dN_{i}(t)-Y_{i}(t)\,d\Lambda\{H(t)+\beta^{\prime}\mathbf{Z}_{i}+\gamma_{0}(x)+\gamma_{1}(x)(X_{i}-x)\}\right]=0,$$

(6)

$$\sum_{i=1}^{n}\int_{0}^{\tau}(X_{i}-x)K_{h}(X_{i}-x)\left[dN_{i}(t)-Y_{i}(t)\,d\Lambda\{H(t)+\beta^{\prime}\mathbf{Z}_{i}+\gamma_{0}(x)+\gamma_{1}(x)(X_{i}-x)\}\right]=0.$$

(7)
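The local equations (6) and (7) can be solved numerically at a single point *x*. The following is a minimal sketch, not the authors' implementation, under several assumptions: Λ = exp (the proportional hazards case), an Epanechnikov kernel, and **β** and *H* held fixed so that `a[i]` stands for the transformation evaluated at the *i*th observed time plus the linear covariate term. Because *H*(0) = −∞ and Λ(−∞) = 0, the time integral of the at-risk term collapses to Λ evaluated at the observed time, which is what the code uses. All function names are hypothetical, and Newton's method is used without safeguards.

```python
import math


def epanechnikov(v):
    """Symmetric kernel density on [-1, 1] (an illustrative choice of K)."""
    return 0.75 * (1.0 - v * v) if abs(v) <= 1.0 else 0.0


def local_score(g0, g1, u, w, delta, a):
    """Left-hand sides of (6) and (7) plus weighted moments, for Lambda = exp."""
    m = [math.exp(ai + g0 + g1 * ui) for ai, ui in zip(a, u)]
    F0 = sum(wi * (di - mi) for wi, di, mi in zip(w, delta, m))
    F1 = sum(wi * ui * (di - mi) for wi, ui, di, mi in zip(w, u, delta, m))
    S0 = sum(wi * mi for wi, mi in zip(w, m))
    S1 = sum(wi * ui * mi for wi, ui, mi in zip(w, u, m))
    S2 = sum(wi * ui * ui * mi for wi, ui, mi in zip(w, u, m))
    return F0, F1, S0, S1, S2


def local_linear_fit(x, X, delta, a, h, n_iter=50):
    """Newton iterations for (gamma0, gamma1) at the point x."""
    u = [xi - x for xi in X]
    w = [epanechnikov(ui / h) / h for ui in u]
    # Start from the solution of (6) with gamma1 = 0 (needs a local failure).
    g0 = math.log(sum(wi * di for wi, di in zip(w, delta))
                  / sum(wi * math.exp(ai) for wi, ai in zip(w, a)))
    g1 = 0.0
    for _ in range(n_iter):
        F0, F1, S0, S1, S2 = local_score(g0, g1, u, w, delta, a)
        det = S0 * S2 - S1 * S1
        g0 += (S2 * F0 - S1 * F1) / det   # Newton step: solve M * step = F
        g1 += (S0 * F1 - S1 * F0) / det
    return g0, g1
```

The returned `g0` plays the role of the local estimate of *f*(*x*), and `g1` that of its derivative; a production version would add step-size control and a convergence check.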

Altogether, we need to solve four estimating equations (4)-(7) iteratively: solve (4)-(5) for **β** and *H* with *f* held fixed, and solve (6)-(7) for *f* with **β** and *H* held fixed.

We now present an iterative algorithm to implement our estimation procedure. The algorithm is given as follows:

**Step 0** (Initialization step). Choose an initial estimate $\widehat{f}(\cdot)=\widehat{f}^{(0)}(\cdot)$. Solve (4) and (5) to obtain $\widehat{\beta}_{0}$ and $\widehat{H}_{0}$ using Chen et al. (2002) for linear transformation models. Set $\widehat{\beta}=\widehat{\beta}_{0}$ and $\widehat{H}=\widehat{H}_{0}$.

**Step 3**. Repeat Steps 1 and 2 until convergence.

**Step 4**. Fix $(\widehat{\beta},\widehat{H})$ at its estimated value from Step 3. The final estimate of *f*(*x*) is $\widehat{\gamma}_{0}(x)=\widehat{\gamma}_{0}(x;h,\widehat{\beta},\widehat{H})$, where $\{\widehat{\gamma}_{0}(x;h,\widehat{\beta},\widehat{H}),\widehat{\gamma}_{1}(x;h,\widehat{\beta},\widehat{H})\}$ are obtained by solving (6) and (7).
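The control flow of the iteration can be sketched as a generic alternating loop. The sketch assumes that the local step refreshes the estimate of *f* from (6)-(7) with the current (**β**, *H*), and that the global step refreshes (**β**, *H*) from (4)-(5) with the current *f*; both updates are passed in as callables. The names `alternate`, `update_f`, and `update_beta_H` are hypothetical, and **β** is treated as a scalar to keep the stopping rule simple.

```python
def alternate(update_f, update_beta_H, f0, max_iter=100, tol=1e-8):
    """Alternate local and global updates until beta stabilizes.

    update_f(beta, H)   -> new estimate of f      (local equations)
    update_beta_H(f)    -> (new beta, new H)      (global equations)
    Returns (beta, H, f, number of iterations used).
    """
    f = f0
    beta, H = update_beta_H(f)            # initialization with the starting f
    for it in range(max_iter):
        f = update_f(beta, H)             # local step
        new_beta, H = update_beta_H(f)    # global step
        if abs(new_beta - beta) < tol:
            return new_beta, H, f, it + 1
        beta = new_beta
    return beta, H, f, max_iter
```

In a toy setting where the two updates are simple contractions, the loop converges to their joint fixed point; with real estimating-equation solvers, convergence is not guaranteed and the iteration cap matters.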

In the supplementary Appendix A, we propose a *one-step* estimator as the initial estimator $\widehat{f}^{(0)}(\cdot)$ and show its local consistency. In practice, to reduce the computational cost, one can posit a parametric form for *f* and estimate it using the method of Chen et al. (2002) for the linear transformation model to obtain $\widehat{f}^{(0)}(\cdot)$.

The proposed algorithm is similar in spirit to that of Carroll et al. (1997) for generalized partially linear single-index models and those of Cai et al. (2007, 2008) for partially linear hazard regression models with multivariate survival data. All these algorithms, in their own contexts, alternately optimize the global and local quasi-likelihood functions until convergence.

In the estimation procedure above, one needs to select the smoothing parameter *h* in Steps 1-3 and Step 4. It is worth noting that *h* plays different roles in different steps: in the first three steps, *h* should be chosen to ensure the proper estimation of *β* and *H*; in the last step, however, *h* should be optimal for estimating the nonparametric function *f*. Consequently, we suggest using one value of *h* in Steps 1-3 and using another value of *h* in Step 4. In the simulation section, we give more details on how to select the optimal *h* based on either the theoretical convergence rate or some data-adaptive tuning criteria.

In this section we establish the asymptotic properties of the estimators $\widehat{\beta}$, $\widehat{H}(\cdot)$, $\widehat{\gamma}_{0}(x)$ and $\widehat{\gamma}_{1}(x)$. Let $\lambda(t)=\dot{\Lambda}(t)$ and $\psi(t)=\dot{\lambda}(t)/\lambda(t)$. Following the notation used in Chen et al. (2002), for any $t,s\in(0,\tau]$, define

$$\begin{aligned}
B(t,s)&=\exp\left(\int_{s}^{t}\frac{E[\dot{\lambda}\{H_{0}(u)+\beta_{0}^{\prime}\mathbf{Z}+f_{0}(X)\}Y(u)]}{E[\lambda\{H_{0}(u)+\beta_{0}^{\prime}\mathbf{Z}+f_{0}(X)\}Y(u)]}\,dH_{0}(u)\right),\\
\mu_{\mathbf{Z}}(t)&=\frac{E[\mathbf{Z}\lambda\{H_{0}(\tilde{T})+\beta_{0}^{\prime}\mathbf{Z}+f_{0}(X)\}Y(t)B(t,\tilde{T})]}{E[\lambda\{H_{0}(t)+\beta_{0}^{\prime}\mathbf{Z}+f_{0}(X)\}Y(t)]},\\
B_{1}(t)&=\int_{0}^{t}E[\dot{\lambda}\{H_{0}(u)+\beta_{0}^{\prime}\mathbf{Z}+f_{0}(X)\}Y(u)]\,dH_{0}(u),\\
B_{2}(t)&=E[\lambda\{H_{0}(t)+\beta_{0}^{\prime}\mathbf{Z}+f_{0}(X)\}Y(t)],\\
\lambda^{\ast}\{H_{0}(t)\}&=B(t,0),\quad \Lambda^{\ast}(x)=\int_{-\infty}^{x}\lambda^{\ast}(u)\,du,\quad\text{for } x\in(-\infty,+\infty).
\end{aligned}$$

In addition, we define

$$\begin{aligned}
\mathit{A}_{1}&=\int_{0}^{\tau}E\left[\{\mathbf{Z}-\mu_{\mathbf{Z}}(t)\}\mathbf{Z}^{\prime}\dot{\lambda}\{H_{0}(t)+\beta_{0}^{\prime}\mathbf{Z}+f_{0}(X)\}Y(t)\right]dH_{0}(t),\\
\mathit{A}_{2}&=\int_{0}^{\tau}E\left[\{\mathbf{Z}-\mathit{m}_{\mathbf{Z}}(t)\}\{\rho(X)\}^{\prime}\dot{\lambda}\{H_{0}(t)+\beta_{0}^{\prime}\mathbf{Z}+f_{0}(X)\}Y(t)\right]dH_{0}(t),\\
\Sigma&=\int_{0}^{\tau}E\left[\{\mathbf{Z}-\mathit{m}_{\mathbf{Z}}(t)-(\mathbf{Z}^{\ast}-\mathit{m}_{\mathbf{Z}^{\ast}})\}^{\otimes 2}\lambda\{H_{0}(t)+\beta_{0}^{\prime}\mathbf{Z}+f_{0}(X)\}Y(t)\right]dH_{0}(t),
\end{aligned}$$

where $\mathbf{b}^{\otimes 2}=\mathbf{b}\mathbf{b}^{\prime}$ for any vector **b**, and

$$\begin{aligned}
\mathbf{Z}_{i}^{\ast}&=\int_{0}^{\tau}\frac{E[\mathbf{Z}\dot{\lambda}\{H_{0}(t)+\beta_{0}^{\prime}\mathbf{Z}+f_{0}(X)\}Y(t)\mid X=X_{i}]}{E[\lambda\{H_{0}(t)+\beta_{0}^{\prime}\mathbf{Z}+f_{0}(X)\}\mid X=X_{i}]}\,dH_{0}(t),\\
\mathit{m}_{\mathbf{Z}^{\ast},i}&=\int_{0}^{\tau}\mathit{m}_{\mathbf{Z}}(t)\frac{E[\dot{\lambda}\{H_{0}(t)+\beta_{0}^{\prime}\mathbf{Z}+f_{0}(X)\}Y(t)\mid X=X_{i}]}{E[\lambda\{H_{0}(t)+\beta_{0}^{\prime}\mathbf{Z}+f_{0}(X)\}\mid X=X_{i}]}\,dH_{0}(t),
\end{aligned}$$

*m*_{Z}(*t*) = ** α**(

$$\alpha \left(t\right)-{\int}_{0}^{\tau}{D}_{1}(s,t)\alpha \left(s\right)d{H}_{0}\left(s\right)={\mathit{D}}_{2}\left(t\right),\phantom{\rule{1em}{0ex}}t\in [0,\tau ],$$

(8)

where *D*_{1}(·, ·), *D*_{2}(·) and **ρ**(·) are defined in the supplementary Appendix B.

To derive the asymptotic properties of our estimators, we need the following regularity conditions:

- (C1) The covariates **Z** and *X* are of compact support, and the density *g*(·) of *X* has a bounded second derivative.
- (C2) *β*_{0} belongs to the interior of a known compact set $\mathcal{B}_{0}$, *H*_{0} has a continuous and positive derivative, and *f*_{0} has a continuous second derivative.
- (C3) *λ*(·) is positive, *ψ*(·) is continuous, and $\lim_{t\to-\infty}\lambda(t)=0=\lim_{t\to-\infty}\psi(t)$.
- (C4) *τ* is finite with *P*(*T* > *τ*) > 0 and *P*(*C* > *τ*) > 0.
- (C5) There exist positive constants *ζ*_{0} and *ζ*_{1} such that $\sup_{t\in[0,\tau]}B_{2}(t)>\zeta_{0}$ and $\sup_{t\in[0,\tau]}\{B_{2}(t)+B_{1}(t)\}\le\zeta_{1}$.
- (C6) The kernel *D*_{1}(·, ·) in (8) satisfies $\sup_{t\in[0,\tau]}\int_{0}^{\tau}\mid D_{1}(s,t)\mid dH_{0}(s)<\infty$.
- (C7) The matrices *A* = *A*_{1} − *A*_{2} and **Σ** are finite and nondegenerate.

Conditions (C1)-(C5) are similar to those in Chen et al. (2002) for establishing asymptotic results for linear transformation models. Condition (C6) is used to assure that there exists a unique solution to the integral equation (8), which is usually satisfied when the covariates are bounded and also the functions *f*(·), *λ*(·) and $\stackrel{.}{\lambda}(\cdot )$ are bounded on their supports. Condition (C7) is needed for establishing the asymptotic normality of the estimators.

The following theorems establish the asymptotic properties of $\widehat{\beta}$, $\widehat{H}(\cdot )$, ${\widehat{\gamma}}_{0}\left(x\right)$ and ${\widehat{\gamma}}_{1}\left(x\right)$. The proofs of all theorems and the necessary regularity conditions are relegated to the supplementary Appendix B for ease of exposition.

**Theorem 1.** Under the regularity conditions (C1)-(C7), if $nh^{2}/\log(1/h)\to\infty$ and $nh^{4}\to 0$, we have that, given $\widehat{\beta}$ in a small neighborhood of **β**_{0}, $\widehat{\beta}\to_{p}\beta_{0}$ and

$$n^{1/2}(\widehat{\beta}-\beta_{0})\to N\{0,\mathit{A}^{-1}\Sigma(\mathit{A}^{-1})^{\prime}\}$$

(9)

in distribution as *n* → ∞.
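As a quick check of the two bandwidth conditions stated above, take $h=\alpha n^{-1/3}$ for a fixed constant $\alpha>0$ (the constant $\alpha$ is just a placeholder here). Then

$$nh^{2}=\alpha^{2}n^{1/3},\qquad \log(1/h)=\tfrac{1}{3}\log n-\log\alpha,\qquad nh^{4}=\alpha^{4}n^{-1/3},$$

so $nh^{2}/\log(1/h)\to\infty$ and $nh^{4}\to 0$ both hold. By contrast, the usual nonparametric rate $h=\alpha n^{-1/5}$ gives $nh^{4}=\alpha^{4}n^{1/5}\to\infty$, which violates the second condition; this is consistent with using a smaller bandwidth when the target is the parametric component than when the target is the nonparametric function.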

**Theorem 2.** Under the regularity conditions (C1)-(C7), if $nh^{2}/\log(1/h)\to\infty$ and $nh^{4}\to 0$, we have the asymptotic representation

$$\sqrt{n}\{\widehat{H}(t)-H_{0}(t)\}=\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\frac{\kappa_{i}(t)}{\lambda^{\ast}\{H_{0}(t)\}}+o_{p}(1)$$

for $t\in(0,\tau]$, where the $\kappa_{i}(t)$ are independent mean-zero functions whose definitions are given in the supplementary Appendix B.

**Theorem 3.** Assume that conditions (C1)-(C4) hold. If $nh^{5}$ is bounded, and **β** and *H* are estimated at the order $O_{p}(n^{-1/2})$, then

$$\sqrt{nh}\left(\begin{bmatrix}\widehat{\gamma}_{0}(x)-f_{0}(x)\\ h\{\widehat{\gamma}_{1}(x)-\dot{f}_{0}(x)\}\end{bmatrix}-\mathit{b}_{n}(x)\right)\to N\{0,\mathit{V}(x)\}$$

(10)

in distribution as n → ∞, where $\mathit{V}\left(x\right)={\mathit{V}}_{1}^{-1}\left(x\right){\mathit{V}}_{2}\left(x\right){\mathit{V}}_{1}^{-1}\left(x\right)$ and the definitions of **V**_{1}(x), **V**_{2}(x) and **b**_{n}(x) are given in the supplementary Appendix B.

Theorems 1 and 2 establish the root-*n* consistency of $\widehat{\beta}$ and $\widehat{H}(\cdot )$, respectively. They are used to establish the standard nonparametric rate for the estimates of the nonlinear covariate effect presented in Theorem 3.

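As shown in Theorem 1, the asymptotic variance of $\widehat{\beta}$ has a standard sandwich form *A*^{−1}**Σ**(*A*^{−1})’. However, the matrices **A** and **Σ** have complicated forms and are difficult to estimate directly, which motivates the resampling method described next.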

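The resampling algorithm proceeds as follows. First, we generate *n* i.i.d. exponential random variables {*ξ*_{i}, *i* = 1, …, *n*} with mean 1 and variance 1. Fixing the data at their observed values, we then solve the following *ξ*-weighted versions of (4)-(7):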

$$\sum_{i=1}^{n}\xi_{i}\left[dN_{i}(t)-Y_{i}(t)\,d\Lambda\{H(t)+\beta^{\prime}\mathbf{Z}_{i}+f(X_{i})\}\right]=0,\quad t\ge 0,\quad H(0)=-\infty,$$

(11)

$$\sum_{i=1}^{n}\xi_{i}\int_{0}^{\tau}\mathbf{Z}_{i}\left[dN_{i}(t)-Y_{i}(t)\,d\Lambda\{H(t)+\beta^{\prime}\mathbf{Z}_{i}+f(X_{i})\}\right]=0,$$

(12)

$$\sum_{i=1}^{n}\xi_{i}\int_{0}^{\tau}K_{h}(X_{i}-x)\left[dN_{i}(t)-Y_{i}(t)\,d\Lambda\{H(t)+\beta^{\prime}\mathbf{Z}_{i}+\gamma_{0}(x)+\gamma_{1}(x)(X_{i}-x)\}\right]=0,$$

(13)

$$\sum_{i=1}^{n}\xi_{i}\int_{0}^{\tau}(X_{i}-x)K_{h}(X_{i}-x)\left[dN_{i}(t)-Y_{i}(t)\,d\Lambda\{H(t)+\beta^{\prime}\mathbf{Z}_{i}+\gamma_{0}(x)+\gamma_{1}(x)(X_{i}-x)\}\right]=0.$$

(14)

The estimates **β***, *H**, and *f** are obtained by solving (11)-(14) with the same iterative algorithm as above.

**Theorem 4.** Under the regularity conditions (C1)-(C7) and the same rate of *h*, the conditional distribution of $n^{1/2}(\beta^{\ast}-\widehat{\beta})$ given the observed data converges almost surely to the asymptotic distribution of $n^{1/2}(\widehat{\beta}-\beta_{0})$.

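Based on Theorem 4, by repeatedly generating {*ξ*_{1}, … , *ξ*_{n}} many times, we may obtain a large number of realizations of **β***, whose empirical variance can be used to estimate the asymptotic variance of $\widehat{\beta}$.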
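The mechanics of the perturbation resampling can be illustrated on a deliberately simple stand-in problem. The sketch below does not solve the weighted equations (11)-(14); instead it assumes a toy weighted estimating equation whose solution is a weighted mean, so that the resampling loop itself is easy to follow. The names `perturbation_variance` and `solve_weighted` are hypothetical, and the weights are drawn from the unit exponential distribution.

```python
import random
import statistics


def perturbation_variance(solve_weighted, n, n_resample=500, seed=0):
    """Empirical variance of the estimates obtained under random weights.

    solve_weighted(xi) must return the (scalar) solution of the weighted
    estimating equations for the weight vector xi of length n.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_resample):
        xi = [rng.expovariate(1.0) for _ in range(n)]  # mean 1, variance 1
        draws.append(solve_weighted(xi))
    return statistics.variance(draws)


# Toy stand-in: sum_i xi_i * (y_i - beta) = 0 is solved by a weighted mean.
y = [float(i) for i in range(50)]


def weighted_mean(xi):
    return sum(w * v for w, v in zip(xi, y)) / sum(xi)


var_hat = perturbation_variance(weighted_mean, n=len(y))
```

In the actual procedure, `solve_weighted` would rerun the full iterative algorithm on the weighted equations, which is why each resample costs about as much as the original fit.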

We examine in this section the finite sample performance of the proposed estimators. The failure times *T _{i}*’s are generated from the partially linear transformation model (2). For the linear component, two independent covariates (

- Design I: *f*(*x*) = 8(*x* − *x*^{3}),
- Design II: *f*(*x*) = 0.05{exp(3*x*) − 1}.

The hazard function of the error term is chosen as

$$\lambda(t)=\exp(t)/\{1+\zeta\,\exp(t)\},$$

with *ζ* = 0, 1, 0.5 (Dabrowska and Doksum, 1988). Note that the partially linear proportional hazards (PLPH) and the partially linear proportional odds (PLPO) models correspond to *ζ* = 0 and *ζ* = 1, respectively. The function *H*(*t*) is chosen respectively as log(*t*) for *ζ* = 0, log(*e ^{t}* − 1) for
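Error terms with the hazard function above can be simulated by inversion. The sketch below integrates the hazard in closed form to get the cumulative hazard, then inverts it at a unit exponential draw. The function names are hypothetical, and the failure-time helper is written only for the log transformation paired with *ζ* = 0 in the text; the linear predictor `eta` (linear plus nonlinear covariate terms) is supplied by the caller, since the covariate distributions and coefficient values are not reproduced here.

```python
import math
import random


def cum_hazard(t, zeta):
    """Lambda(t): integral of exp(s) / (1 + zeta * exp(s)) over (-inf, t]."""
    if zeta == 0:
        return math.exp(t)
    return math.log1p(zeta * math.exp(t)) / zeta


def inverse_cum_hazard(e, zeta):
    """Solve Lambda(eps) = e for eps, with e > 0."""
    if zeta == 0:
        return math.log(e)
    return math.log(math.expm1(zeta * e) / zeta)


def sample_error(zeta, rng):
    """Draw eps whose cumulative hazard at eps is a unit exponential variate."""
    return inverse_cum_hazard(rng.expovariate(1.0), zeta)


def failure_time_log_H(eps, eta):
    """T solving log(T) = -eta + eps, i.e. model (2) with H(t) = log(t)."""
    return math.exp(eps - eta)


rng = random.Random(2010)
eps = sample_error(0.5, rng)
t = failure_time_log_H(sample_error(0.0, rng), eta=0.3)
```

Setting `zeta` to 0 gives the extreme value error of the proportional hazards case, and setting it to 1 gives the standard logistic error of the proportional odds case.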

We consider two types of censoring mechanisms: covariate-independent censoring and covariate-dependent censoring. For covariate-independent censoring, the censoring times *C _{i}*’s are generated from a uniform distribution on (0,

In our computational algorithm, we choose the initial values as $\widehat{f}^{(0)}(\cdot)\equiv 0$. For the estimation of the parametric component, we set the bandwidth parameter *h* = *α*_{1}*n*^{−1/3} as suggested by the asymptotic theory. We tried various values of *α*_{1} from 0.01 to 0.5 and found that *α*_{1} = 0.05 works very well under all the scenarios. Therefore, we only report the simulation results based on this choice. In practice, one may also use a cross-validation method similar to that of Tian et al. (2005), developed for kernel estimation of the proportional hazards model with time-varying coefficients, to choose the optimal bandwidth based on estimating equations. In order to assess the performance of the proposed resampling method for variance estimation, we generated *M* = 500 sets of *ξ*’s for each simulated data set and computed the asymptotic variance estimates of $\widehat{\beta}$ based on the empirical variance of the **β*** values. The estimation results for

For the estimation of the nonparametric function *f*, a finer tuning with the mean integrated squared error (MISE) score was conducted. In particular, we set *h* = *α*_{2}*n*^{−1/5}, where *α*_{2} = 0.5, 0.25, 0.1, 0.05, 0.025, and selected the optimal *α*_{2} by minimizing the MISE. Here we only present the results for design I under the PLPH and PLPO models. The results for design II and the model corresponding to *ζ* = 0.5 are quite similar and hence omitted. To show the performance of our nonparametric estimation procedure, we plot the estimated functions for the PLPH model obtained under various scenarios in Figure 1. The left column of Figure 1 depicts the typical estimated functions corresponding to the 10th best, the 50th best (median), and the 90th best according to MISE among 500 simulations. The top plot is for *n* = 100 and the bottom plot is for *n* = 200. It is evident that the estimated curves capture the shape of the true function very well, and their performance improves as the sample size increases. In order to describe the sampling variability of the estimated nonparametric function at each point, we also depict a 95% pointwise confidence interval for *f* in the right column of Figure 1. The upper and lower bounds of the confidence interval are given respectively by the 97.5th and 2.5th percentiles of the estimated function at each grid point among 500 simulations. The results show that the function *f* is estimated with reasonably good accuracy. As the sample size increases from 100 to 200, the confidence interval becomes narrower, as expected. In Figure 2, we plot the estimated nonparametric function and the associated 95% pointwise confidence interval for the PLPO model. Conclusions similar to those for the PLPH model can be drawn from Figure 2.
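The MISE-based choice of the bandwidth constant can be sketched as follows. This is a minimal illustration, assuming that for each candidate constant a collection of estimated curves on a common grid is already available from simulation and that the integral is approximated by the trapezoidal rule; the function names and the toy curves are hypothetical.

```python
def integrated_squared_error(est, truth, grid):
    """Trapezoidal approximation of the integral of (est - truth)^2 over grid."""
    sq = [(e - t) ** 2 for e, t in zip(est, truth)]
    return sum(0.5 * (sq[k] + sq[k + 1]) * (grid[k + 1] - grid[k])
               for k in range(len(grid) - 1))


def select_alpha(curves_by_alpha, truth, grid):
    """Return the constant whose simulated curves have the smallest average ISE."""
    def mise(curves):
        total = sum(integrated_squared_error(c, truth, grid) for c in curves)
        return total / len(curves)
    return min(curves_by_alpha, key=lambda alpha: mise(curves_by_alpha[alpha]))


# Toy check with the Design I function and two artificially shifted "estimates".
grid = [k / 10 for k in range(11)]
truth = [8 * (x - x ** 3) for x in grid]
curves_by_alpha = {
    0.5: [[v + 0.2 for v in truth]],
    0.1: [[v + 0.05 for v in truth]],
}
best = select_alpha(curves_by_alpha, truth, grid)
```

This criterion needs the true function, so it is available in simulations only; for real data a data-adaptive criterion such as cross-validation would be needed instead.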

Figure 1. The estimated nonlinear function, confidence envelope, and 95% pointwise confidence interval for the PLPH model.

Figure 2. The estimated nonlinear function, confidence envelope, and 95% pointwise confidence interval for the PLPO model.

In order to examine the numerical stability and efficiency of the proposed iterative algorithm, we also conducted several simulations to study: (i) the effects of different initial values for the nonlinear covariate on the solution; and (ii) how much efficiency is lost if the true covariate effect is linear while the proposed method is used. The results are presented in the supplementary Appendix C. Based on these results, we observe that the estimated linear parameters produced from different initial values are quite similar to each other and close to the truth, and the efficiency loss of our method is relatively small compared with that of Chen et al. (2002) for the linear transformation model when the true covariate effects are really linear.

In this section, we apply our method to the lung cancer data from the Veteran’s Administration lung cancer trial (Kalbfleisch and Prentice, 2002). In this trial, 137 males with advanced inoperable lung cancer were randomized to either a standard treatment or chemotherapy. Besides the treatment indicator, there were five covariates: Cell type (1=squamous, 2=small cell, 3=adeno, 4=large), Karnofsky score, Months from Diagnosis, Age, and Prior therapy (0=no, 10=yes). The data set has been analyzed by many authors; for example, Tibshirani (1997) fitted the proportional hazards model and Lu and Zhang (2007) considered the proportional odds model. They found that both Cell type and Karnofsky score were significant while the others were not. In both methods, all covariates were assumed to have linear effects, which may not be true, particularly for age. It is well known that age is a complex confounding factor, and its effect usually shows a nonlinear trend.

We fitted both the PLPH and PLPO models to the data with three covariates: treatment, cell type and age, where the effect of age is assumed to be nonlinear. For estimation, we first rescaled age to lie between 0 and 1, and set *h* = 0.05*n*^{−1/3} as in the simulations. Table 3 summarizes the estimated coefficients and their standard errors, obtained from 500 resamplings, for both models. As found in the literature, Cell type (small vs large, adeno vs large) is significant while treatment is not in both models. Moreover, Figure 3 gives the estimates of the nonlinear components: the left panel for the PLPH model and the right panel for the PLPO model. The red curves are the estimated nonparametric functions and the blue curves are the 95% pointwise confidence intervals constructed by the resampling method. Based on the plots, the covariate age clearly showed a nonlinear (U-shaped) effect on survival times. It is noted that the zero line is not included in the 95% confidence intervals. This example suggests that partially linear transformation models can be more powerful in discovering significant covariates than models assuming only linear covariate effects.
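The two preprocessing choices described above, rescaling the nonlinear covariate to the unit interval and setting the bandwidth from the sample size, amount to a few lines. This is a small sketch; the function names are hypothetical and the three ages are made-up values used only to exercise the rescaling, while the sample size 137 and the constant 0.05 come from the text.

```python
def rescale_unit(values):
    """Min-max rescaling of a covariate to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def bandwidth(n, alpha=0.05, rate=1.0 / 3.0):
    """Bandwidth of the form alpha * n^(-rate)."""
    return alpha * n ** (-rate)


ages = [34, 50, 81]              # illustrative values, not the trial data
scaled = rescale_unit(ages)
h = bandwidth(137)               # roughly 0.0097 for the 137 subjects
```

A bandwidth this small relative to the unit interval means that each local fit uses only observations with nearly the same rescaled age.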

We study a general class of partially linear transformation models and develop the corresponding inference procedure by solving a unified *global* and *local* estimating equation system based on the martingale representation. We established the root-*n* consistency and asymptotic normality of the estimates of the regression coefficients, and studied the convergence rates of the estimates of the nonparametric components, including the transformation function and the nonlinear covariate effect. We also provide consistent estimates of the asymptotic variance of the regression coefficient estimates based on a feasible resampling scheme. It is noted that the proposed martingale-based estimating equations are ad hoc and are generally not efficient. Recently, Chen (2009) proposed an approach using the weighted Breslow-type estimator to construct efficient estimating equations for the linear transformation model. It would be interesting to explore whether such an approach can be generalized to the partially linear transformation model; a thorough investigation is left for future research.

The authors thank the editor, an associate editor, and two referees for their constructive comments and suggestions, and Professor Kani Chen for inspiring discussions on the consistency of the proposed estimators. This research was partially supported by the NSF awards DMS-0504269 and DMS-0645293, and the NIH awards R01 CA-085848 and R01 CA-140632.

SUPPLEMENTAL MATERIALS Extended Derivations and Additional Simulation Results: The pdf file contains extended derivations of theoretical properties and additional simulation results. (jasa_plt_suppl_rev2.pdf)

Wenbin Lu, (Email: ude.uscn.tats@4ulw), Department of Statistics, North Carolina State University, Raleigh, NC 27695.

Hao Helen Zhang, (Email: ude.uscn.tats@2gnahzh), Department of Statistics, North Carolina State University, Raleigh, NC 27695.

- Andersen PK, Borgan O, Gill R, Keiding N. Statistical Models Based on Counting Processes. Springer; New York: 1993.
- Bennett S. Analysis of survival data by the proportional odds model. Statist. Med. 1983;2:273–277.
- Bickel PJ, Klaassen CAJ, Ritov Y, Wellner JA. Efficient and Adaptive Estimation for Semiparametric Models. Johns Hopkins University Press; Baltimore: 1993.
- Cai J, Fan J, Jiang J, Zhou H. Partially linear hazard regression for multivariate survival data. J. Am. Statist. Assoc. 2007;102:538–551.
- Cai J, Fan J, Jiang J, Zhou H. Partially linear hazard regression with varying-coefficients for multivariate survival data. J. R. Statist. Soc. Ser. B. 2008;70:141–158.
- Cai T, Wei LJ, Wilcox M. Semi-Parametric Regression Analysis for Clustered Failure Time Data. Biometrika. 2000;87:867–78.
- Carroll RJ, Fan J, Gijbels I, Wand MP. Generalized partially linear single-Index models. J. Am. Statist. Assoc. 1997;92:477–489.
- Carroll RJ, Ruppert D, Welsh A. Local estimating equations. J. Am. Statist. Assoc. 1998;93:214–227.
- Chen K, Jin Z, Ying Z. Semiparametric analysis of transformation models with censored data. Biometrika. 2002;89:659–668.
- Chen Y-H. Weighted Breslow-type and maximum likelihood estimation in semi-parametric transformation models. Biometrika. 2009;96:591–600.
- Cheng SC, Wei LJ, Ying Z. Analysis of transformation models with censored data. Biometrika. 1995;82:835–45.
- Cheng SC, Wei LJ, Ying Z. Prediction of survival probabilities with semi-parametric transformation models. J. Am. Statist. Assoc. 1997;92:227–235.
- Clayton D, Cuzick J. Multivariate generalizations of the proportional hazards model (with Discussion) J. R. Statist. Soc. Ser. A. 1985;148:82–117.
- Cox DR. Regression models and life-tables (with discussion) J. R. Statist. Soc. Ser. B. 1972;34:187–220.
- Cox DR. Partial Likelihood. Biometrika. 1975;62:269–276.
- Dabrowska DM, Doksum KA. Estimation and testing in the two-sample generalized odds rate model. J. Am. Statist. Assoc. 1988;83:744–9.
- Fan J, Gijbels I. Local Polynomial Modeling and Its Applications. Chapman and Hall; London: 1996.
- Fan J, Gijbels I, King M. Local likelihood and local partial likelihood in hazard regression. Ann. Statist. 1997;25:1661–1690.
- Fine J, Ying Z, Wei LJ. On the linear transformation model for censored data. Biometrika. 1998;85:980–986.
- Fleming TR, Harrington DP. Counting Processes and Survival Analysis. Wiley; New York: 1991.
- Grambsch P, Therneau T. Proportional hazards tests and diagnostics based on weighted residuals. Biometrika. 1994;81:515–526.
- Gray RJ. Flexible methods for analyzing survival data using splines, with application to breast cancer prognosis. J. Am. Statist. Assoc. 1992;87:942–951.
- Hastie T, Tibshirani R. Generalized additive models. Chapman & Hall; 1990.
- Huang J. Efficient estimation of the partially linear Cox model. Ann. Statist. 1999;27:1536–1563.
- Huang J, Kooperberg C, Stone C, Truong Y. Functional ANOVA modeling for proportional hazards regression. Ann. Statist. 2000;28:961–999.
- Jin Z, Ying Z, Wei LJ. A simple resampling method by perturbing the minimand. Biometrika. 2001;88:381–390.
- Kalbfleisch JD, Prentice RL. The Statistical Analysis of Failure Time Data. 2nd ed. Wiley; New Jersey: 2002.
- Lu W, Zhang HH. Variable selection for proportional odds model. Statist. Med. 2007;26:3771–3781.
- Ma S, Kosorok MR. Penalized log-likelihood estimation for partly linear transformation models with current status data. Ann. Statist. 2005;33:2256–2290.
- O’Sullivan F. Nonparametric estimation in the Cox model. Ann. Statist. 1993;21:124–145.
- Pettitt AN. Inference for the linear model using a likelihood based on ranks. J. R. Statist. Soc. Ser. B. 1982;44:234–243.
- Pettitt AN. Proportional odds model for survival data and estimates using ranks. Appl. Statist. 1984;33:169–175.
- Pollard D. Empirical Processes: Theory and Applications. NSF-CBMS Regional Conference Series in Probability and Statistics Volume 2. Hayward: 1990.
- Sasieni P. Information bounds for the conditional hazard ratio in a nested family of regression models. J. R. Statist. Soc. Ser. B. 1992;54:627–635.
- Shorack GR, Wellner JA. Empirical Processes with Applications to Statistics. John Wiley & Sons; New York: 1986.
- Tian L, Zucker D, Wei LJ. On the Cox model with time-varying regression coefficients. J. Am. Statist. Assoc. 2005;100:172–183.
- Tibshirani R. The lasso method for variable selection in the Cox model. Statist. Med. 1997;16:385–395.
- van der Vaart AW, Wellner JA. Weak Convergence and Empirical Processes: With Applications to Statistics. Springer; New York: 1996.
- Zeleniuch-Jacquotte A, Shore RE, Koenig KL, Akhmedkhanov A, Afanasyeva Y, Kato I, Kim MY, Rinaldi S, Kaaks R, Toniolo P. Postmenopausal levels of oestrogen, androgen and SHBG and breast cancer: long-term results of a prospective study. Brit. J. of Can. 2004;90:153–159.
- Zeng D, Lin D. Efficient estimation of semiparametric transformation models for counting processes. Biometrika. 2006;93:627–640.
