Home | About | Journals | Submit | Contact Us | Français |

**|**HHS Author Manuscripts**|**PMC2893401

Formats

Article sections

- Abstract
- 1. Introduction
- 2. Quantile calculus
- 3. Proposed estimator and algorithm
- 4. Asymptotic study and inference
- 5. Numerical studies
- 6. Discussion
- References

Authors

Related links

Ann Stat. Author manuscript; available in PMC 2010 June 29.

Published in final edited form as:

Ann Stat. 2010 June 1; 38(3): 1607–1637.

PMCID: PMC2893401

NIHMSID: NIHMS173372

Yijian Huang^{*}

Yijian Huang, Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA, Email: ude.yrome@5gnauhy;.

See other articles in PMC that cite the published article.

Quantile regression has been advocated in survival analysis to assess evolving covariate effects. However, challenges arise when the censoring time is not always observed and may be covariate-dependent, particularly in the presence of continuously-distributed covariates. In spite of several recent advances, existing methods either involve algorithmic complications or impose a probability grid. The former leads to difficulties in the implementation and asymptotics, whereas the latter introduces undesirable grid dependence. To resolve these issues, we develop fundamental and general quantile calculus on cumulative probability scale in this article, upon recognizing that probability and time scales do not always have a one-to-one mapping given a survival distribution. These results give rise to a novel estimation procedure for censored quantile regression, based on estimating integral equations. A numerically reliable and efficient Progressive Localized Minimization (PLMIN) algorithm is proposed for the computation. This procedure reduces exactly to the Kaplan–Meier method in the *k*-sample problem, and to standard uncensored quantile regression in the absence of censoring. Under regularity conditions, the proposed quantile coefficient estimator is uniformly consistent and converges weakly to a Gaussian process. Simulations show good statistical and algorithmic performance. The proposal is illustrated in the application to a clinical study.

Quantile regression (Koenker & Bassett, 1978), concerning models for conditional quantile functions, has developed into a primary statistical methodology to investigate functional relationship between a response and covariates. Targeting the full spectrum of quantiles, it provides a far more complete statistical analysis than, say, classical linear regression. This technique has a long history in econometric applications. More recently, quantile regression has also been advocated for survival analysis to address evolving covariate effects which is a common phenomenon in demographic and clinical research among others. For instance, the aging process as well as the effects of its determinants can be vastly different at various stages of life (cf. Koenker & Geling, 2001). On the other hand, a clinical intervention can rarely be expected to have a constant effect, due to the time lag in reaching full effect and to drug resistance. The quantile regression model allows for varying regression coefficients and thus suits these applications well. However, a main challenge arises from censoring.

Denote the survival time by *T* and the censoring time by *C*. As a result of censoring, *T* is not directly observed but through follow-up time *X* = *T* ∧ *C* and censoring indicator Δ = *I*(*T* ≤ *C*), where ∧ is the minimization operator and *I*(·) is the indicator function. Of interest is the relationship between *T* and a *p* × 1 covariate vector *Z* with constant 1 as the leading component. Quantile function is an inverse of the distribution function *F _{Z}*(

$${Q}_{Z}(\tau )\equiv \text{sup}\{t:{F}_{Z}(t)\le \tau \}\text{}\tau \in [0,1).$$

(1)

The quantile regression model postulates that

$${Q}_{Z}(\tau )={Z}^{\top}{\beta}_{0}(\tau )\text{}\forall \tau \in [0,1),$$

(2)

where β_{0}(τ), referred to as the quantile coefficient, is a function of probability τ. This model is semiparametric in general, but nonparametric in the *k*-sample problem. The interest in evolving covariate effects necessitates the functional modeling of (2), which distinguishes itself from the modeling on a specific quantile as in, e.g., median regression; see Section 6 for further discussion. Note that the time scale may be, say, logarithm-transformed, and accordingly the supports of *T*, *C* and *X* are not necessarily restricted to the nonnegative half line. When all components of β_{0}(τ) except the intercept are constant in τ, this model reduces to the accelerated failure time mode as studied by Buckley & James (1979) and Tsiatis (1990) among others. In this regard, the quantile regression model is a varying-coefficient generalization. To provide an interpretation, suppose for the moment that model (2) holds on the logarithmic time scale. Then, the effect of a covariate, other than the leading 1 of *Z*, is to stretch or compress the baseline survival time (on the original scale) with a quantile-dependent stretching or compressing factor.

With uncensored data, Koenker & Bassett (1978) generalized sample quantile and proposed regression quantile as a quantile coefficient estimator via a convex objective function. An adaptation of the well-known Barrodale–Roberts algorithm was later suggested by Koenker & D’Orey (1987) for the computation. The reliability and efficiency of this algorithm contribute to broader acceptance of the standard methodology. In the presence of censoring, Powell (1984, 1986) proposed an estimation procedure when censoring time *C* is always observed. His approach applies uncensored quantile regression to *X* as the τ-th quantile of *X* turns out to be *Q _{Z}*(τ) ∧

However, in most survival studies, not only is the survival time subject to censoring but also the censoring time is unobserved for uncensored individuals. Taking the missing-data perspective of censoring, Ying, Jung & Wei (1995) and Honoré, Khan & Powell (2002) developed different methods but with the common consistent estimation requirement of the censoring distribution given covariates. This amounts to either an unconditional independence censoring mechanism, or a finite-number limitation on covariate values, or additional censoring-time modeling to achieve a root-*n* convergence rate of the estimated censoring distribution. Obviously, none of these is desirable. Ying et al. (2005) indicated that such a restriction may be relieved by employing smoothing techniques to nonparametrically estimate the conditional censoring distribution. As an alternative, Wang & Wang (2009) developed a method by nonparametrically estimating the conditional survival distribution via kernel smoothing. Nevertheless, Robins & Ritov (1997) argued that these smoothing-based methods may not be practical with moderate sample size in the presence of multiple continuously-distributed covariates; see also Portnoy (2003).

Our investigation focuses on the same preceding data structure, but aims to allow for generalities on both the censoring mechanism and covariates. Specifically, we consider conditional independence censoring mechanism:

$$T\perp C\phantom{\rule{thinmathspace}{0ex}}|\phantom{\rule{thinmathspace}{0ex}}Z,$$

(3)

where denotes statistical independence. This problem was first investigated by Portnoy (2003), who suggested the pivoting method employing the redistribution-to-the-right imputation scheme for censoring (Efron, 1967). The mass of censored observations is recursively redistributed to adopt standard uncensored quantile regression. However, one premise is the quantile monotonicity so that the “right”, or future, is unequivocal in the redistribution. In the *k*-sample case, the monotonicity holds in uncensored sample quantile, and the method reduces to the Kaplan–Meier method, i.e., taking an inverse of the Kaplan–Meier estimator. Unfortunately, uncensored quantile regression in general does not respect the monotonicity, leading to both algorithmic and analytic difficulties with Portnoy (2003). Indeed, the asymptotic properties of the estimator have not yet been established; see Neocleous, Vanden Branden & Portnoy (2006). As an alternative, Neocleous et al. (2006) advocated a closely related grid method. Most recently, Peng & Huang (2008) proposed a functional estimating function upon discovering a martingale structure, and developed a grid-based quantile coefficient estimator. Both uniform consistency and weak convergence have been established. As for the last two methods, the grid dependence, however, might not be completely satisfactory.

This article makes two main contributions to this problem. First of all, fundamental and general quantile calculus is developed on probability scale, establishing the probability-scale dynamics with allowance for zero-density intervals and discontinuities in a distribution. Second, from quantile calculus a well-defined estimator and a reliable and efficient algorithm for censored quantile regression naturally emerge on the basis of estimating integral equations. As compared with Portnoy (2003), Neocleous et al. (2006) and Peng & Huang (2008), this new approach entails neither algorithmic complications nor a probability grid. For the rest of this article, quantile calculus is presented in Section 2, and the proposed estimator and algorithm in Section 3. The asymptotic properties are investigated and an inference procedure suggested in Section 4. Section 5 presents simulation results on statistical and algorithmic performance, and an illustration with a clinical study. Section 6 concludes with discussion. All proofs are collected in the Appendices.

Given a survival distribution, a one-to-many mapping from probability to time scale may arise from zero-density intervals; adopting the cadlag definition of quantile function is a solution given in Section 1. Reciprocally, a one-to-many mapping from time to probability scale may also arise, resulting from distributional discontinuities. Thus, time-scale theories including counting-process martingales cannot be applied to probability scale, unless continuity restriction is imposed on the distribution. In uncensored quantile regression, this issue may be bypassed by formulating the estimation as an optimization problem. However, such an approach may not be feasible in censored quantile regression, which calls for the development of quantile calculus.

Drop *Z* from the notation in this case. By definition, Pr{*T* < *Q*(τ)} ≤ τ ≤ Pr{*T* ≤ *Q*(τ)}. Thus, *Q*(τ) does not correspond to a unique probability τ when Pr{*T* = *Q*(τ)} > 0. To fill in the missing piece, we introduce the τ-*th quantile equality fraction*:

$$\xi (\tau )=\frac{\tau -\text{Pr}\{T<Q(\tau )\}}{\text{Pr}\{T=Q(\tau )\}},$$

which is the fraction of the probability mass at the quantile that brings the cumulative probability up to τ. Here and throughout, we define 0/0 0. Elementary algebra then gives

$$\begin{array}{c}\text{Pr}\{T<Q(\tau )\}+\text{Pr}\{T=Q(\tau )\}\xi (\tau )\hfill \\ ={\displaystyle {\int}_{0}^{T}\left[\text{Pr}\{T\ge Q(\nu )\}-\text{Pr}\{T=Q(\nu )\}\xi (\nu )\right]\frac{d\nu}{1-\nu}\text{}\forall \tau \in [0,1).}\hfill \end{array}$$

(4)

This result establishes the quantile dynamics on probability scale. More significantly, it can be readily exploited to accommodate censoring. Denote the limit of identifiability by = sup{τ : Pr{*C* ≥ Q(τ)} > 0}.

Proposition 1. *Suppose that T and C are independent and their distributions do not have jump points in common. Consider integral equation*

$$\begin{array}{c}E(\mathrm{\Delta}[I\{X<q(\tau )\}+I\{X=q(\tau )\}\eta (\tau )])\hfill \\ =E{\displaystyle {\int}_{0}^{T}[I\{X\ge q(\nu )\}-I\{X=q(\nu )\}\eta (\nu )]\frac{d\nu}{1-\nu}}\text{}\forall \tau \in [0,1),\hfill \end{array}$$

(5)

*where q(·) is a cadlag function and η(·) takes values in* [0, 1].

*If equation (5) holds, then*$$q(\tau )=Q(\tau ),$$(6)$$E[\mathrm{\Delta}I\{X=q(\tau )\}\eta (\tau )]=E[\mathrm{\Delta}I\{X=Q(\tau )\}\xi (\tau )],$$(7)*for all*τ (0, ).

Remark 1. The condition that the distributions of *T* and *C* do not share jump points is necessary for the identifiability of the former and therefore the corresponding quantile function as well. The role of η(τ) is to split probability mass in the case of Pr{*T* = *Q*(τ)} > 0. Equation (5), however, does not determine η(τ) elsewhere. But instead of, say, setting η(τ) to 0 in those occasions, keeping the more general form would be advantageous for later developments.

Similar to the one-sample case, we assume that

Assumption 1. The conditional distribution functions of *T* and *C* given *Z* do not have jump points in common for all values of *Z*.

Write ξ_{Z}(τ) as the τ -th quantile equality fraction for the distribution of *T* given *Z*. Generalize the definition of identifiability limit as

$$\overline{\tau}=\text{sup}\{\tau :E[{Z}^{\otimes 2}I\{C\ge {Z}^{\top}{\beta}_{0}(\tau )\}]\phantom{\rule{thinmathspace}{0ex}}\text{is}\phantom{\rule{thinmathspace}{0ex}}\text{nonsingular}\},$$

where *v*^{2} *vv*^{}. The one-sample result of Proposition 1 can then be extended.

Proposition 2. *Suppose that quantile regression model (2) and censoring mechanism (3) hold along with Assumption 1. Consider integral equation*

$$\begin{array}{c}\hfill \text{}E(Z\mathrm{\Delta}[I\{X{Z}^{\top}\beta (\tau )\}+I\{X={Z}^{\top}\beta (\tau )\}\eta Z(\tau )])\hfill \\ =E{\displaystyle {\int}_{0}^{\tau}Z}[I\{X\ge {Z}^{\top}\beta (\nu )\}-I\{X={Z}^{\top}\beta (\nu )\}\eta Z(\nu )]\frac{d\nu}{1-\nu}\hfill \\ \hfill \forall \mathrm{\lambda}\in [0,1),\end{array}$$

(8)

*where β(·) is a cadlag function and Z-dependent η_{Z}(·) takes values in [0, 1].*

*If β(·) = β*Pr{_{0}(·) and, for any given*Z*, η_{Z}(τ) = ξ_{Z}(τ) for all τ such that*T*=*Q*(τ) |_{Z}*Z*} > 0,*then equation (8) holds.**In the case that both C and Z are discretely distributed, if equation (8) holds, then*$$\beta (\tau )={\beta}_{0}(\tau ),$$(9)$$E[Z\mathrm{\Delta}I\{X={Z}^{\top}\beta (\tau )\}{\eta}_{Z}(\tau )]=E[Z\mathrm{\Delta}I\{X={Z}^{\top}{\beta}_{0}(\tau )\}{\xi}_{Z}(\tau )]$$(10)*for all*τ (0, ).

Remark 2. The admission of β_{0}(·) as a solution to integral equation (8) is general; no restriction on the survival and censoring distributions is imposed except for a necessary identifiability condition. But the uniqueness result of β_{0}(·) is provided only for the case that *C* and *Z* are discretely distributed. It is also established under some other conditions in Section 4. However, we do not yet have a proof for the most general case.

Remark 3. Consider the censoring-absent special case with nonsingular *E*(*Z*^{2}). Then, integral equation (8) reduces to

$$D(\tau )={\displaystyle {\int}_{0}^{\tau}\{\mathit{\text{EZ}}-D(\nu )\}\frac{d\nu}{1-\nu},}$$

where *D*(τ) is the left-hand side of (8). This equation has a unique and closed-form solution *D*(τ) = τ*EZ*, or

$$E(Z[I\{T<{Z}^{\top}\beta (\tau )\}+I\{T={Z}^{\top}\beta (\tau )\}{\eta}_{Z}(\tau )])=\tau EZ.$$

(11)

Note that η_{Z}(τ) affects the left-hand side at a nonsmooth point only, i.e., when Pr{*T* = *Z*^{}β(τ)} ≠ 0. With fixed η_{Z}(τ), the left-hand side may not be smooth in β(τ). Nonetheless, thanks to η_{Z}(τ), it can always be smooth in τ and therefore, the equality of (11) is attainable. As far as β(τ) is concerned, the equation is equivalent to the minimization problem with *E*[{*T* −*Z*^{}β(τ)}^{−}+τ{*T* −*Z*^{}β(τ)}], where *a*^{−} −*aI* (*a* < 0). Thus, Proposition 2 reduces to a well-known result in uncensored quantile regression.

Remark 4. Quantile equality fraction ξ_{Z}(τ) is a nuisance parameter. When Pr(*Z* = *z*) = 0 for a given value *z*, Pr{*T* = *Q _{Z}*(τ) |

Remark 5. Proposition 2 is more general than the martingale result of Peng & Huang (2008, equation 4), whose validity is limited to the circumstance of continuous survival distribution. In that special case, the former may reduce to the latter since the mapping between time and probability scales becomes one to one and all terms involving η_{Z}(·) in integral equation (8) may vanish. Even so, the more general form of equation (8) is still desirable in order to derive a natural estimating integral equation by the plug-in principle. After all, an empirical distribution is always discrete, i.e., full of discontinuities and zero-density intervals.

To facilitate probability-scale analysis both conceptually and algebraically, we introduce the notion of *relative quantile*. Anchored at the τ -th quantile, the {τ + λ(1 − τ)}-th quantile coefficient for λ [0, 1) is referred to as the λ-th relative quantile coefficient, written as β_{0}(λ, τ) β_{0}{τ + λ(1 − τ)}. This notion provides a convenient vehicle to study quantile coefficient β_{0}(·) forward from a given probability, similar in spirit to the concept of hazard in survival analysis.

An integral equation for β_{0}(λ, τ) can be derived with given *D*(τ), the left-side of equation (8) at τ. By algebraic manipulation, integral equation (8) implies

$$\begin{array}{c}\hfill \text{}E(Z\mathrm{\Delta}[I\{X{Z}^{\top}\beta (\mathrm{\lambda},\tau )\}+I\{X={Z}^{\top}\beta (\mathrm{\lambda},\tau )\}{\eta}_{Z}(\mathrm{\lambda},\tau )])-D(\tau )\hfill \\ =E{\displaystyle {\int}_{0}^{\tau}Z}[I\{X\ge {Z}^{\top}\beta (\nu ,\tau )\}-I\{X={Z}^{\top}\beta (\nu ,\tau )\}{\eta}_{Z}(\nu ,\tau )]\frac{d\nu}{1-\nu}\hfill \\ \hfill \forall \mathrm{\lambda}\in [0,1),\end{array}$$

(12)

where β(λ, τ) β{τ+λ(1−τ)} and η_{Z}(λ, τ) η_{Z}{τ+λ(1−τ)}. Apparently, this is a dual equation since equation (8) becomes a special case when τ = 0.

The data consist of (*X _{i}*,Δ

$$\begin{array}{c}{\displaystyle \sum _{i=1}^{n}{Z}_{i}{\mathrm{\Delta}}_{i}\left[I\{{X}_{i}<{Z}_{i}^{\top}\beta (\tau )\}+I\{{X}_{i}={Z}_{i}^{\top}\beta (\tau )\}{w}_{i}(\tau )\right]}\hfill \\ ={\displaystyle \sum _{i=1}^{n}{\displaystyle {\int}_{0}^{\tau}{Z}_{i}\left[I\{{X}_{i}\ge {Z}_{i}^{\top}\beta (\nu )\}-I\{{X}_{i}={Z}_{i}^{\top}\beta (\nu )\}{w}_{i}(\nu )\right]\frac{d\nu}{1-\nu},}}\hfill \end{array}$$

(13)

where *w _{i}*(·) takes values in [0, 1]. Representing a convenient reparameterization of η

We shall say that *b interpolates* an observation (*X*,Δ,*Z*) if *X* = *Z*^{}*b*.

Theorem 1. *Suppose that* $\sum}_{i=1}^{n}{Z}_{i}^{\otimes 2$ *is nonsingular. Estimating integral equation (13) admits a solution (·) over τ [0, 1) with the following properties: (i) (·) is cadlag; and (ii) (τ) interpolates at least p individuals and the covariate matrix for the interpolated set is of full rank, for each and every* τ [0, 1).

Remark 6. Estimating integral equation (13) also admits a solution in the case of singular $\sum}_{i=1}^{n}{Z}_{i}^{\otimes 2$ by Theorem 1 upon eliminating parametrization redundancy of the quantile coefficient.

Remark 7. A subtle issue concerns the fact that identifiability limit is unknown. Empirically, cannot even be determined *definitively* to exceed *any* τ > 0, which may be easily seen in the one-sample case. Although it is possible to estimate , we do not terminate the estimation of β_{0}(·) at such an estimate but rather provide (τ) for all τ [0, 1); the properties of (·) would otherwise become more complicated. Precisely speaking, (τ) is an estimator of β_{0}(τ) provided τ < . This strategy of separating the estimation of β_{0}(τ) provided τ < from that of is similar to that adopted by Peng & Huang (2008). In contrast, Portnoy (2003) and Neocleous et al. (2006) terminated their estimation of β_{0}(·) once the estimate becomes non-unique, which might partly explain the difficulties in their interval estimation.

Geometrically, (τ) for each τ is a hyperplane, partitioning the sample into

$$\{\begin{array}{cc}\{i:{X}_{i}<{Z}_{i}^{\top}\widehat{\beta}(\tau )\}\hfill & \text{below}\phantom{\rule{thinmathspace}{0ex}}\text{set}\hfill \\ \{i:{X}_{i}={Z}_{i}^{\top}\widehat{\beta}(\tau )\}\hfill & \text{interpolated}\phantom{\rule{thinmathspace}{0ex}}\text{set}.\hfill \\ \{i:{X}_{i}>{Z}_{i}^{\top}\widehat{\beta}(\tau )\}\hfill & \text{above}\phantom{\rule{thinmathspace}{0ex}}\text{set}\hfill \end{array}$$

Each interpolated individual on the hyperplane may be split in a ratio of *w _{i}*(τ) : {1 −

Estimating integral equation (13) may be solved exactly with the proposed Progressive Localized Minimization (PLMIN) algorithm. The algorithm proceeds from the 0th quantile coefficient upward in a progressive fashion. Due to sample discreteness, (·) is piecewise constant. We thus conveniently decompose the computation into sequential rounds, each involving that of a 0th relative quantile coefficient and a potential breakpoint.

Suppose that equation (13) is solved up to τ−, and thus ϕ_{i}(τ−) of every uncensored individual is available. Then, by continuity ϕ_{i}(τ) = ϕ_{i}(τ−) of uncensored individual *i* is determined; obviously ϕ_{i}(τ) = 0 in the case of τ = 0. Inherited from the relationship between integral equations (8) and (12), estimating integral equation (13) is equivalent to the following equation for relative quantile coefficient:

$$\begin{array}{c}\text{\hspace{1em}}{\displaystyle \sum _{i=1}^{n}{Z}_{i}{\mathrm{\Delta}}_{i}[I\{{X}_{i}<{Z}_{i}^{\top}\beta (\mathrm{\lambda},\tau )\}+I\{{X}_{i}={Z}_{i}^{\top}\beta (\mathrm{\lambda},\tau )\}{w}_{i}(\mathrm{\lambda},\tau )-{\varphi}_{i}}(\tau )]\hfill \\ ={\displaystyle \sum _{i=1}^{n}{\displaystyle {\int}_{0}^{\mathrm{\lambda}}{Z}_{i}}[I\{{X}_{i}\ge {Z}_{i}^{\top}\beta (\nu ,\tau ))\}-I\{{X}_{i}={Z}_{i}^{\top}\beta (\nu ,\tau )\}{w}_{i}(\nu ,\tau )]\frac{d\nu}{1-\nu},}\hfill \end{array}$$

(14)

where *w _{i}*(λ, τ)

$$L(\mathrm{\lambda})={\displaystyle {\int}_{0}^{\mathrm{\lambda}}\{H-L(\nu )\}\frac{d\nu}{1-\nu}\text{}\mathrm{\lambda}\in [0,{\mathrm{\lambda}}_{b}),}$$

which admits a unique solution *L*(λ) = λ*H* or equivalently,

$$\begin{array}{c}{\displaystyle \sum _{i=1}^{n}{Z}_{i}{\mathrm{\Delta}}_{i}\left[I\{{X}_{i}<{Z}_{i}^{\top}\beta (\mathrm{\lambda},\tau )\}+I\{{X}_{i}={Z}_{i}^{\top}\beta (\mathrm{\lambda},\tau )\}{w}_{i}(\mathrm{\lambda},\tau )-{\varphi}_{i}(\tau )\right]}\hfill \\ =\mathrm{\lambda}{\displaystyle \sum _{i=1}^{n}{Z}_{i}}\left[I\{{X}_{i}\ge {Z}_{i}^{\top}\beta (\mathrm{\lambda},\tau )\}-I\{{X}_{i}={Z}_{i}^{\top}\beta (\mathrm{\lambda},\tau )\}{w}_{i}(\tau )\right].\hfill \end{array}$$

(15)

Write (λ, τ) {τ+λ(1−τ)}. Since (·) is cadlag, (0, τ) is the solution to the above equation with λ ↓ 0. Subsequently, λ_{b} is a λ, typically the supremum λ, such that the equation holds with β(λ, τ) = (0, τ). Furthermore, *w _{i}*(λ

With the same arguments following equation (11), solving (15) for (0, τ) can be reformulated as a minimization problem:

$$\begin{array}{cc}\widehat{\beta}(0,\tau )=\hfill & \underset{\mathrm{\lambda}\downarrow 0}{\text{lim}}\phantom{\rule{thinmathspace}{0ex}}\text{arg}\phantom{\rule{thinmathspace}{0ex}}\underset{b}{\text{min}}{\displaystyle \sum _{i=1}^{n}({X}_{i}-{Z}_{i}^{\top}b)}\hfill \\ \hfill & \text{\hspace{1em}\hspace{1em}\hspace{1em}\hspace{1em}}\times \left[I({X}_{i}\ge {Z}_{i}^{\top}b)-{\mathrm{\lambda}}^{-1}{\mathrm{\Delta}}_{i}\{I({X}_{i}<{Z}_{i}^{\top}b)-{\varphi}_{i}(\tau )\}\right],\hfill \end{array}$$

which no longer involves *w _{i}*(·, τ). Further algebraic simplification gives

$$\widehat{\beta}(0,\tau )=\text{arg}\phantom{\rule{thinmathspace}{0ex}}\underset{b}{\text{min}}{\displaystyle \sum _{i=1}^{n}{({X}_{i}-{Z}_{i}^{\top}b)}^{+}}$$

(16)

$$\begin{array}{cc}\text{subject}\phantom{\rule{thinmathspace}{0ex}}\text{to}\hfill & {X}_{i}\le {Z}_{i}^{\top}b\text{}\forall i\in {\mathbb{D}}_{-}\equiv \{j:{\mathrm{\Delta}}_{j}=1,\phantom{\rule{thinmathspace}{0ex}}{\varphi}_{j}(\tau )=1\}\hfill \\ \hfill & {X}_{i}={Z}_{i}^{\top}b\text{}\forall i\in {\mathbb{D}}_{0}\equiv \{j:{\mathrm{\Delta}}_{j}=1,{\varphi}_{j}(\tau )\in (0,1)\},\hfill \\ \hfill & {X}_{i}\ge {Z}_{i}^{\top}b\text{}\forall i\in {\mathbb{D}}_{+}\equiv \{j:{\mathrm{\Delta}}_{j}=1,{\varphi}_{j}(\tau )=0\}\hfill \end{array}$$

where a^{+} *aI*(*a* > 0). For the special case of the 0th quantile coefficient,

$$\widehat{\beta}(0)=\text{arg}\phantom{\rule{thinmathspace}{0ex}}\underset{b}{\text{min}}{\displaystyle \sum _{i=1}^{n}{({X}_{i}-{Z}_{i}^{\top}b)}^{+}}$$

(17)

$$\begin{array}{cc}\text{subject}\phantom{\rule{thinmathspace}{0ex}}\text{to}\hfill & {X}_{i}\ge {Z}_{i}^{\top}b\text{}\forall i:{\mathrm{\Delta}}_{i}=1.\hfill \end{array}$$

The minimization of (16) is a piecewise-linear programming problem with convex objective function, characterized by the following lemma to Theorem 1. Note that, once (0, τ) is determined, so is *w _{i}*(τ) of a (0, τ)-interpolated uncensored individual by continuity of ϕ

Lemma 1. *Suppose that the condition of Theorem 1 holds and that covariates Z _{i}, i _{0}, are linearly independent. There exists a minimizer (0, τ) for problem (16) such that the covariate matrix for (0, τ)-interpolated observations is of full rank. Furthermore, there exist (i) a p-member subset of (0, τ)-interpolated observations with _{0} ; and (ii) for any (0, τ)-interpolated censored individual i,*

$${w}_{i}(\tau )\in \{\text{}\begin{array}{cc}[0,1]\hfill & \text{if}\phantom{\rule{thinmathspace}{0ex}}i\in \mathbb{S}\hfill \\ \{0,1\}\hfill & \mathit{\text{otherwise}}\hfill \end{array}$$

*such that (iii) Z _{} = {Z_{i} : i } is of full rank; and (iv)* $\widehat{H}\equiv {\displaystyle {\sum}_{i=1}^{n}{Z}_{i}[I\{{X}_{i}\ge {Z}_{i}^{\top}\widehat{\beta}(0,\tau )\}-I\{{X}_{i}={Z}_{i}^{\top}\widehat{\beta}(0,\tau )\}{w}_{i}(\tau )]}$

$$\sum _{i\in \mathbb{S}}{Z}_{i}{\mathrm{\Delta}}_{i}{\gamma}_{i}=\widehat{H}$$

(18)

* for some γ _{i}, where γ_{i} ≤ 0 for i _{−} and γ_{i} ≥ 0 for i _{+}.*

Piecewise-linear programming can be viewed as extended linear programming, although a (0, τ)-interpolated individual may be a censored one and thus not involved in the constraints. We devise an algorithm aiming at the determination of the p-member interpolated subset , the same strategy as the simplex method of linear programming (e.g., Gill, Murray & Wright, 1991). To locate a candidate member of , the method of steepest descent is used. Note that a feasible value for (0, τ) is readily available. In the case of τ = 0, any value with a sufficiently small intercept component is feasible. Subsequently, (τ−) is feasible as necessary by continuity of ϕ_{i}(·) for uncensored individuals. The minimization along a given feasible direction is reached once an uncensored observation becomes interpolated, or potentially so if the interpolated observation is a censored one instead. The constrained space is of dimension *p* minus the size of _{0}. For (0), there is no equality constraint and the dimension is *p*. For following 0th relative quantile coefficients, typically the dimension is 1 in which case the minimization involves only a line search. To deal with the possibility of more than *p* interpolated individuals, the perturbation anti-cycling technique in linear programming (e.g., Gill et al., 1991, Section 8.3.3) can be adapted. In the perturbation, one may follow a tie-breaking convention to let individuals in _{+} precede censored ones, which in turn precede those in _{−}. This minimization is numerically reliable and efficient.

The minimization determines (0, τ), , *w _{i}*(τ) for each in , and

$$\sum _{i\in \mathbb{S}}{Z}_{i}{\mathrm{\Delta}}_{i}\{{w}_{i}(\mathrm{\lambda},\tau )-{w}_{i}(\tau )\}=\mathrm{\lambda}{\displaystyle \sum _{i\in \mathbb{S}}{Z}_{i}{\mathrm{\Delta}}_{i}{\gamma}_{i}.}$$

(19)

Simple algebra then gives the potential relative breakpoint

$${\mathrm{\lambda}}_{b}=\{\begin{array}{ccc}\underset{i\in \mathbb{S}:{\mathrm{\Delta}}_{i}=1,{\gamma}_{i}\ne 0}{\text{min}}\hfill & \frac{I({\gamma}_{i}>0)-{w}_{i}(\tau )}{{\gamma}_{i}}\hfill & \{i\in \mathbb{S}:{\mathrm{\Delta}}_{i}=1,{\gamma}_{i}\ne 0\}\ne \varnothing \hfill \\ \hfill & 1\hfill & \text{otherwise}\hfill \end{array},$$

(20)

which is proper in the sense of 0 < λ_{b} ≤ 1. The lower bound of λ_{b} is obvious, whereas the upper bound can easily be established by analyzing the intercept component of equations (18) and (19). If λ_{b} = 1, the final quantile is reached. Otherwise, for those uncensored,

$$\begin{array}{cc}{w}_{i}({\mathrm{\lambda}}_{b}-,\tau )={w}_{i}(\tau )+{\mathrm{\lambda}}_{b}{\gamma}_{i}\hfill & \text{\hspace{1em}\hspace{1em}}i\in \mathbb{S}:{\mathrm{\Delta}}_{i}=1.\hfill \end{array}$$

At least one *w _{i}*(λ

When τ is small, typically consists of uncensored individuals only. But as τ becomes larger, interpolated censored individuals could emerge when (τ) might still be uniquely determined nonetheless. Eventually, the computation could reach a point beyond which (τ) is no longer unique. Apparently, this phenomenon relates to the identifiability issue; see Remark 7. On a different note, just like uncensored quantile regression, this censored quantile regression may not respect quantile monotonicity in general.

In the absence of censoring, estimating integral equation (13) reduces to

$$\sum _{i=1}^{n}{Z}_{i}\left[I\{{T}_{i}<{Z}_{i}^{\top}\beta (\tau )\}+I\{{T}_{i}={Z}_{i}^{\top}\beta (\tau )\}{w}_{i}(\tau )\right]=\tau {\displaystyle \sum _{i=1}^{n}{Z}_{i},}$$

(21)

by the same approach to obtaining (11) from (8). Thus, (·) is the cadlag function β(·) that minimizes ${\sum}_{i=1}^{n}[{\{{T}_{i}-{Z}_{i}^{\top}\beta (\tau )\}}^{-}+\tau \{{T}_{i}-{Z}_{i}^{\top}\beta (\tau )\}]$, reducing to one regression quantile of Koenker & Bassett (1978); note that the Koenker–Bassett estimator is not always uniquely defined. In the mean time, 1 − ϕ_{i}(τ) becomes $I\{{T}_{i}\ge {Z}_{i}^{\top}\beta (\tau )\}-I\{{T}_{i}={Z}_{i}^{\top}\beta (\tau )\}{w}_{i}(\tau )$, which is the regression rank score of Gutenbrunner & Jureková (1992).

On the other hand, in the one-sample case, (·) reduces exactly to the cadlag inverse of the Kaplan–Meier estimator. It is clear from (17) that (0) is the first failure time and from (20) that the breakpoint is the Nelson–Aalen estimate of the hazard at (0). Subsequently and more generally, each estimated 0th relative quantile is a failure time and the relative breakpoint is the Nelson–Aalen hazard estimate. In case that the last observation is censored, the final estimated quantile is defined as this last follow-up time by convention. More generally, in the *k*-sample problem, (·) is a linear combination of cadlag inverses of the *k* Kaplan–Meier estimators.

In our developments thus far, we have kept our assumptions to minimal. But the generality challenges large-sample developments in both exposition and technicalities; see Section 6 for further discussion. In this section, we shall focus on the situation that *F _{Z}* is continuous and free of zero-density intervals, and additionally

A parameter space needs to be specified. In light of the interpolation property of the estimator by Theorem 1, we require that any *b* in such a parameter space satisfies that *E*[*Z*^{2}*I*{*Z*^{}β_{0}(0) ≤ *Z*^{}*b* ≤ *Z*^{}β_{0}(1−) ∧ *C*}] is nonsingular. Write eigmin as the minimum eigenvalue of a matrix. Specifically, a parameter space containing β_{0}(τ) for all τ [0, *u*] is given by

$$\begin{array}{cc}\mathbb{B}(u)=\hfill & \{b\in \mathbb{R}\phantom{\rule{thinmathspace}{0ex}}\times \phantom{\rule{thinmathspace}{0ex}}{\u2102}_{p-1}:\hfill \\ \hfill & \text{eigmin}E[{Z}^{\otimes 2}I\{{Z}^{\top}{\beta}_{0}(0)\le {Z}^{\top}b\le {Z}^{\top}{\beta}_{0}(1-)\wedge C\}]c(u)\},\hfill \end{array}$$

where constant *u* < , compact space _{p−1} ^{p−1}, and positive constant *c*(*u*) < eigmin*E*[*Z*^{2}*I*{*C* ≥ *Z*^{} β_{0}(*u*)}]. Thus, all slope components are bounded but the intercept may be −∞.

Write ‖ · ‖ as the Euclidean norm. Let _{Z}(*t*) = 1 − *F _{Z}*(

- C1. > 0 and ‖
*Z*‖ is bounded; - C2.
*F*and_{Z}*G*have density functions_{Z}*f*and_{Z}*g*, which both are continuous and bounded, uniformly in_{Z}*t*and*Z*; - C3. β
_{0}(·) is Lipschitz-continuous on [τ_{1}, τ_{2}] for any τ_{1}and τ_{2}such that 0 < τ_{1}< τ_{2}< 1; - C4. There exist
*u*(0, ) and a parameter space (*u*) such that the maximum singular value ofis bounded uniformly in$$\mathrm{\Psi}(b)=E\left\{{Z}^{\otimes 2}{\overline{F}}_{Z}({Z}^{\top}b)\mathit{\text{gz}}({Z}^{\top}b)\right\}\phantom{\rule{thinmathspace}{0ex}}{\left[E\left\{{Z}^{\otimes 2}{\overline{G}}_{Z}({Z}^{\top}b)\mathit{\text{fz}}({Z}^{\top}b)\right\}\right]}^{-1}$$*b*(*u*) \ (*u*), where denotes the boundary.

The first two conditions are self-explanatory. Conditions C3 implies that the survival distribution does not have zero-density intervals between *Q _{Z}*(0) and

Throughout this section, (·) is any cadlag solution to estimating integral equation (13). The solution may not be unique, nor is the interpolation property in Theorem 1 necessary.

Theorem 2. *Suppose that quantile regression model (2) and censoring mechanism (3) hold along with conditions C1–C4. Equation (8) implies β(τ) = β _{0}(τ) for all τ (0, u]. For any l (0, u), sup_{τ [l,u]} ‖ (τ)−β_{0}(τ) ‖ → 0 almost surely. Furthermore, n^{1/2}{(τ) − β_{0}(τ)} converges weakly to a Gaussian process on [l, u].*

Remark 8. Integral equation (8) is an initial value problem, and estimating integral equation (13) is its empirical version. Accordingly, the large-sample study as provided in Appendix D exploits classical differential equation theory and modern empirical process theory. Our study bears similarities with that of Peng & Huang (2008). Indeed, under the continuity condition of C2, equation (13) is essentially equivalent to the estimating function of Peng & Huang (2008, equation 5) since *w _{i}*(·) becomes negligible; see also Remark 5. Nevertheless, we spare the inductive arguments of Peng & Huang (2008) in their asymptotic study as typically necessary for a grid method, by virtue of the fact that (·) is an exact solution to equation (13). Equally noteworthy is that the generality here on the 0th quantile requires a more delicate treatment.

Remark 9. In the case that *Z*^{}β_{0}(0) is −∞ for all *Z*, our estimator (·) is asymptotically equivalent to that of Peng & Huang (2008) provided that mesh size of the grid as required by the latter is of order *o*(*n*^{−1/2}).

To make inference, the distribution of *n*^{1/2}{(·) − β_{0}(·)} needs to be estimated. For their estimator, Peng & Huang (2008) adapted the resampling approach of Jin, Ying & Wei (2001). We adopt the same approach by perturbing estimating integral equation (13). This procedure is equivalent to a multiplier bootstrap as described in Kosorok (2008, section 2.2.3).

Theorem 3. *Suppose that the conditions of Theorem 2 hold, and that nonnegative random variable ξ has unit mean and unit variance and satisfies* ${\int}_{0}^{\infty}\text{Pr}{(\xi >x)}^{1/2}\mathit{\text{dx}}<\infty .$ *Perturb estimating integral equation (13) by assigning iid random variables of the same distribution as ξ and independent of the data to individuals in the sample, and denote a solution to the perturbed equation by *(·). On [l, u], n ^{1/2} {(·) − β_{0}(·)} has the same asymptotic distribution as n^{1/2} {*(·) − (·)} conditionally on the data.*

The standard exponential distribution, for example, may be used to generate these perturbing random variables. By repeatedly simulating perturbed samples, the conditional distribution of *(·) can be obtained as an approximation for the distribution of (·).

The quantile regression model is formulated in β_{0}(·). But alternative covariate-effect measures can be practically useful and were used in our application (Section 5.3). Write

$${\mu}_{0}({\tau}_{1},{\tau}_{2})\equiv {({\tau}_{2,}-{\tau}_{1})}^{-1}{\displaystyle {\int}_{{\tau}_{1}}^{\tau 2}{\beta}_{0}(\nu )d\nu .}$$

Model (2) implies

$${({\tau}_{2}-{\tau}_{1})}^{-1}{\displaystyle {\int}_{{\tau}_{1}}^{{\tau}_{2}}{Q}_{Z}(\nu )d\nu ={Z}^{\top}{\mu}_{0}({\tau}_{1},{\tau}_{2}),}$$

(22)

where the left-hand side is a trimmed mean of *T*. Therefore, μ_{0}(τ_{1}, τ_{2}) measures *trimmed mean effect*. This measure is versatile through the choices of τ_{1} and τ_{2}. In fact, β_{0}(τ) = lim_{ν↓τ} μ_{0}(τ, ν) is a special case. On the other hand, μ_{0}(0, 1) is the mean effect, i.e., the regression coefficient in the linear regression model. Originally suggested as an average effect measure by Peng & Huang (2008), μ_{0}(τ_{1}, τ_{2}) becomes even more appealing in light of its specific interpretation as revealed. With censored data, μ_{0}(τ_{1}, τ_{2}) is identifiable when τ_{2} ≤ , and a natural estimator is given by

$$\widehat{\mu}({\tau}_{1},{\tau}_{2})={({\tau}_{2}-{\tau}_{1})}^{-1}{\displaystyle {\int}_{{\tau}_{1}}^{{\tau}_{2}}\widehat{\beta}(\nu )\phantom{\rule{thinmathspace}{0ex}}d\nu .}$$

Obviously, (τ_{1}, τ_{2}) with 0 < τ_{1} < τ_{2} ≤ *u* is strongly consistent and asymptotically normal under the conditions of Theorem 2. The variance can be estimated by using the simulated distribution of *(·). Our numerical experience suggested that (τ_{1}, τ_{2}) behaves reasonably well even when τ_{1} takes 0.

Simulations were conducted to mimic a clinical trial. On the original time scale, the baseline survival distribution was standard exponential and the censoring distribution was uniform on [0, 5]. The quantile regression model held on the logarithmic time scale, with two non-constant covariates: *Z*_{1} was Bernoulli of probability 0.5 and *Z*_{2} uniform on [0, 1]. We considered two scenarios with the following conditional quantile functions:

$$\begin{array}{c}{Q}_{z}(\tau )\text{\hspace{1em}}=\text{\hspace{1em}log}\{-\text{log}(1-\tau )\}+(1.25\tau \phantom{\rule{thinmathspace}{0ex}}\wedge \phantom{\rule{thinmathspace}{0ex}}0.5){Z}_{1}+0.5{Z}_{2},\hfill \\ {Q}_{Z}(\tau )\text{\hspace{1em}}=\text{\hspace{1em}log}\{-\text{log}(1-\tau )\}+0.5{Z}_{1}+0.5{Z}_{2}.\hfill \end{array}$$

Scenario 1 above involved a ramp-up effect of Z_{1}, going from none to full linearly with probability τ and staying constant afterwards. In contrast, Scenario 2 followed the accelerated failure time model.

The sample size was 200. Under each scenario, simulations were conducted with 1,000 iterations. For both scenarios, the censoring rate was approximately 32%. Table 1 reports the summary statistics for the proposed τ -th quantile coefficient estimates ranging from τ = 0.1 to 0.7, along with estimates based on the pivoting method of Portnoy (2003), the grid method of Portnoy (Neocleous et al., 2006), and Peng & Huang (2008). The two Portnoy’s methods are available in R Quantreg package, of which the latest version at the time of this research, 4.20, was used. The default mesh size, 0.01 in this case, was adopted for the grid method of Portnoy, and the same mesh size for Peng & Huang. For point estimation, the pivoting method of Portnoy had large bias and variance at τ = 0.7 under both scenarios. Other than that, these estimators all had negligible bias and similar efficiency. But the bias of the proposed estimator seemed smaller. These findings are not surprising since the estimator of Peng & Huang is asymptotically equivalent to the proposed in the settings under study; see Remark 9. In addition, similar efficiency between Peng & Huang and the grid method of Portnoy was already observed in Peng & Huang (2008). For interval estimation, Peng & Huang employs the same procedure as the proposed, whereas the methods of Portnoy use bootstrap. The resampling size was set to 200 for all these methods. The standard error of the proposed estimator agreed with the standard deviation well. The Wald-type 95% confidence intervals of both the proposed and Peng & Huang achieved reasonably accurate coverage probability. In contrast, the bootstrap confidence intervals of Portnoy’s methods had undercoverage particularly for the intercept, a finding consistent with that reported in Peng & Huang (2008).

These preceding stimulation settings conform to the conditions of the asymptotic study in Section 4. Additional settings with distributional discontinuities and zero-density intervals of the survival time were also considered. One such simulation involved a setting similar to the preceding ones but having a discontinuous baseline survival distribution:

$${Q}_{Z}(\tau )=\text{log}\{-\text{log}(1-\tau \phantom{\rule{thinmathspace}{0ex}}\vee \phantom{\rule{thinmathspace}{0ex}}0.4)\}+(\tau \phantom{\rule{thinmathspace}{0ex}}\vee \phantom{\rule{thinmathspace}{0ex}}0.4){Z}_{1}+0.5{Z}_{2},$$

where denotes the maximization operator. Unfortunately, the pivoting and grid methods of Portnoy as implemented in R Quantreg package had numerical difficulties and their appropriate evaluation was not permitted. Both the estimator of Peng & Huang and the proposed had negligible bias at τ = 0.1 and 0.3. However, for τ = 0.5 and 0.7, the absolute median-bias of Peng & Huang reached 0.136, 0.065, and 0.041 for the intercept and two slopes, respectively. In comparison, the absolute median-bias of the proposed estimator was no larger than 0.026 for all three coefficients. Here, medianbias is a more appropriate summary than mean-bias due to the skewness of these estimators resulting from discontinuity in the survival distribution. These results were expected since the validity of Peng & Huang is tied to the assumption of continuous survival distribution; see Remark 5.

The proposed method was compared against the pivoting method of Portnoy, the grid method of Portnoy, and the Peng & Huang method implemented by Koenker (2008). All these existing methods are implemented with Fortran source code in R Quantreg package. The original implementation of Peng & Huang in R language was inappropriate for comparison due to the inherent slower speed of R. For the two grid methods, the default mesh size, 0.01 ∧ *n*^{−0.7}/2 for sample size *n*, was adopted. The proposed method was also implemented in R with Fortran source code. To minimize the impact of R overhead, calling the Fortran function of a method from R was timed. The computation was performed on a Dell 2950 computer with 3.0 GHz Pentium Xeon X5365 CPUs.

The survival time followed the accelerated failure time model with p − 1 non-constant covariates:

$$\text{log}T=\epsilon +{\displaystyle \sum _{m=2}^{p}\frac{{(-1)}^{m-1}}{2}{Z}^{(m)},}$$

where ε followed the extreme-value distribution, and *Z*^{(m)}, *m* = 2,, *p*, were independent and uniformly distributed on [0, 1]. The number of non-constant covariates ranged from 1 to 8, and the sample size from 100 to 1,600. Three levels of censoring, 0%, 25%, and 50%, were investigated. Unless there was no censoring, the censoring time followed the uniform distribution between 0 and a censoring rate-determined upper bound. Computational reliability and efficiency of various methods for point estimation of the quantile coefficient process were assessed with 1,000 iterations, shown in Table 2.

Both the proposed and Peng & Huang methods were reliable. However, the pivoting and grid methods of Portnoy tended to cause frequent R session crashes in case of no censoring and more covariates. Furthermore, in the presence of censoring, the pivoting method of Portnoy might terminate with warning or error messages. This rate increased with the number of covariates and censoring rate, up to 67%.

The computer time of the proposed method was roughly constant over different censoring levels, given sample size and number of covariates. Comparatively, the proposed is faster than other methods uniformly in all scenarios considered. Specifically, the pivoting method of Portnoy cost 1.6 to 6.7 times as much computer time, the grid method of Portnoy cost 1.8 to 5.9 times, and the Peng & Huang method cost 10.5 to 47.5 times. This result is remarkable since the grid methods involve much fewer grid points than breakpoints of the proposed method at larger sample size, suggesting that a grid point could be much more costly to compute.

For illustration, we applied the proposed estimation procedure to the Mayo primary biliary cirrhosis dataset (Fleming & Harrington, 1991, app. D). Conducted at Mayo Clinic between 1974 and 1984, the study followed 418 patients with primary biliary cirrhosis, a rare but fatal chronic liver disease. One question of interest was concerned with prognostic factors associated with survival. In this analysis, we considered five baseline measures: age, edema, log(bilirubin), log(albumin), and log(prothrombin time). Two participants had incomplete measures and were thus removed. Our analysis data consisted of 416 patients, with a median follow-up time of 4.74 years and a censoring rate of 61.5%.

We adopted the quantile regression model on the logarithmic time scale, with the five baseline measures as covariates. The estimated quantile coefficient processes are shown in Figure 1, along with pointwise Wald 95% confidence intervals. The resampling size for the interval estimation was 200. The maximum cumulative probability up to which the estimated quantile coefficient was unique is 0.91. Among the five covariates, log(prothrombin time) in particular exhibited a prominent varying effect. It was negatively associated with survival time for short survivors, but the effect diminished gradually for longer survivors. This result echos findings from analyses of this dataset with other varying-coefficient models, e.g., by Tian, Zucker & Wei (2005) using the varying-coefficient Cox model. Nevertheless, this varying effect was not apparent in the model with log(prothrombin time) as the only covariate, shown in Figure 2.

Analysis of the Mayo primary biliary cirrhosis data. Estimated quantile coefficient processes are shown in rugged lines, along with pointwise Wald 95% confidence intervals given by vertical bars. The three regions on which the estimated quantile coefficient **...**

Analysis of the Mayo primary biliary cirrhosis data with log(prothrombin time) as the only covariate. Dots and open circles represent uncensored and censored individuals, respectively. Estimated decile coefficients are shown from τ = 0 up to 0.8. **...**

The graphical presentation is revealing of the covariate effect evolution. To summarize, estimated upper-trimmed mean effects and standard errors are given in Table 3. For comparison, the estimates based on the accelerated failure time model using the log-rank and Gehan estimating functions are also included. Notice that the two estimated regression coefficients deviate from each other for log(prothrombin time) with the accelerated failure time model. This disparity also suggests a lack of fit of this sub-model. In this situation, estimates from the accelerated failure time model are difficult to interpret. In contrast, the estimated upper-trimmed mean effects from the quantile regression model are meaningful, for covariates with constant or varying effects alike.

Quantile calculus as developed proves useful and effective for quantile regression. With uncensored data, it offers a new perspective of the standard regression procedure. Most importantly, censoring can be naturally accommodated, and it gives rise to our proposed censored regression via solving a well-defined estimating integral equation. To focus on the main ideas, we have not addressed second-stage inference and model diagnostics, which are practically useful. They can be developed along the lines similar to those in Peng & Huang (2008).

For survival data, alternative models exist to address varying covariate effects. One better known varying-coefficient model is the additive hazards model of Aalen (1980). There is also an extensive literature on the varying-coefficient Cox model, but most available estimation methods require smoothing; see Tian et al. (2005) and the references therein. More recently, Peng & Huang (2007) extended the class of semiparametric linear transformation models to allow for varying coefficients. In comparison with all these alternatives, the quantile regression model is particularly attractive with its simple interpretation; see Section 1.

When the inferential goal is on a specific quantile, the quantile regression model for the given τ only, or the singleton model, is of direct interest. In this case, methods for uncensored quantile regression (Koenker & Bassett, 1978) and censored one with always-observed censoring time (Powell, 1984; 1986) are still applicable. But when censoring time is only observed for censored individuals, the proposed method as well as Portnoy (2003), Neocleous et al. (2006) and Peng & Huang (2008) may not be applied unless the quantile regression model holds from the 0th through the τ -th quantile. In contrast, the approaches of Ying et al. (1995) and Honoré et al. (2002) do not necessarily require the functional model but at the price of a more restrictive censoring mechanism. A choice between these two classes of methods depends on whether functional survival-time modeling or censoring-time modeling might be more reasonable and justifiable in a specific application. The method of Wang & Wang (2009) is appealing in this regard, but might have practicality concerns in the case of multiple continuously-distributed covariates, as discussed in Section 1.

Generalizing the asymptotic results given in Section 4 is of interest, say, to allow for zero-density intervals and discontinuities in the survival distribution. Unfortunately, difficulties include the open question on solution uniqueness for integral equation (8), as indicated in Remark 2, and more. These additional ones can be readily seen in the one-sample case. First, the notion of consistency might not even be appropriate in evaluating the estimated quantile corresponding to a zero-density interval. Indeed, consistent estimation might be impossible in this circumstance but the estimated quantile is nonetheless informative of the estimand. Second, a distributional discontinuity might ruin asymptotic normality of the corresponding estimated quantile. These issues become much more complex and also have broader implication in the general case. Due to the sequential nature of the estimation, of concern are not only those corresponding quantile coefficients but also the subsequent ones. Nevertheless, it seems reasonable to conjecture that consistency and asymptotic normality might still hold for estimated quantile coefficients other than those corresponding to zero-density intervals and distributional discontinuities.

Several additional problems are also worth further investigation. First, our focus has been on the estimation of β_{0}(τ) provided τ < , and the estimation of identifiability limit remains to be addressed; see Remark 7. Second, the sub-model with a mixture of constant- and varying-coefficients would be useful when constant effects are determined *a priori* for some covariates. Efficiency gain might be expected over the more general method in this article. Third, in addition to right censoring, quantile regression with other types of censoring and truncation is also of interest. But new techniques might be needed. Finally, the proposed 0th quantile coefficient estimator as given by (17) might be of interest in its own right. In the absence of censoring, our estimator reduces to the extreme regression quantile studied by Smith (1994), Portnoy & Jureková (1999) and Chernozhukov (2005) among others. Some of their results may be extended.

The author extends his gratitude to Limin Peng for discussions over the course of this research and her review of an earlier version of this paper. He also thanks the seminar participants at the University of North Carolina at Charlotte and the reviewers for helpful comments.

Consider assertion (i). Given that η(τ) = ξ(τ) for all τ such that Pr{*T* = *Q*(τ)} > 0, equation (4) still holds when ξ(·) is replaced by η(·). Therefore,

$$\begin{array}{c}{\displaystyle {\int}_{0}^{\tau}\left[\text{Pr}\{C\ge Q(\nu )\}-\text{Pr}\{C=Q(\nu )\}\eta (\nu )\right]}\hfill \\ \hfill \times \phantom{\rule{thinmathspace}{0ex}}d\left[\text{Pr}\{T<Q(\nu )\}+\text{Pr}\{T=Q(\nu )\}\eta (\nu )\right]\\ {\displaystyle \text{}={\int}_{0}^{\tau}\left[\text{Pr}\{C\ge Q(\nu )\}-\text{Pr}\{C=Q(\nu )\}\eta (\nu )\right]}\hfill \\ \text{\hspace{1em}}\times \left[\text{Pr}\{T\ge Q(\nu )\}-\text{Pr}\{T=Q(\nu )\}\eta (\nu )\right]\frac{d\nu}{1-\nu}\text{}\forall \tau \in [0,1).\hfill \end{array}$$

The above equation simplifies to (5) under the given conditions.

For assertion (ii), only the case of > 0 needs to be considered. The definition of implies *E*(Δ[*I*{*X* ≥ *Q*(τ)} − *I*{*X* = *Q*( )}ξ( )]) = 0. Thus,

$$E\left(\mathrm{\Delta}[I\{X<Q(\overline{\tau})\}+I\{X=Q(\overline{\tau})\}\xi (\overline{\tau})]\right)=E\mathrm{\Delta}.$$

(A.1)

Define τ* = sup{τ : Pr{*C* ≥ *q*(ν)} > 0 ∀ ν [0, τ ]}. The same argument as before gives

$$E\left(\mathrm{\Delta}[I\{X<q({\tau}^{*})\}+I\{X=q({\tau}^{*})\}\eta ({\tau}^{*})]\right)=E\mathrm{\Delta}.$$

(A.2)

Given the continuity of the left-hand side of equation (5) in τ, the above equation implies τ* > 0. Since Pr{*C* ≥ *q*(τ)} > 0 for any τ [0, τ*), equation (5) under the given conditions implies

$$\begin{array}{c}\text{Pr}\{T<q(\tau )\}+\text{Pr}\{T=q(\tau )\}\eta (\tau )\hfill \\ \text{}={\displaystyle {\int}_{0}^{\tau}\left[1-\text{Pr}\{Tq(\nu )\}-\text{Pr}\{T=q(\nu )\}\eta (\nu )\right]\frac{d\nu}{1-\nu}\text{}\forall \tau \in [0,{\tau}^{*}).}\hfill \end{array}$$

The above integral equation has a unique solution:

$$\text{Pr}\{T<q(\tau )\}+\text{Pr}\{T=q(\tau )\}\eta (\tau )=\tau \text{}\forall \tau \in [0,{\tau}^{*}),$$

from which (6) and (7) follow for τ (0, τ*). Furthermore, equations (A.1) and (A.2) imply τ* = . This completes the proof.

Existence result (i) follows directly from Proposition 1. We now prove uniqueness result (ii) by construction. Start from τ = 0. Write as the discrete distributional support of (*C,Z*), and define

$$\begin{array}{c}{\tau}_{1}=\text{sup}\phantom{\rule{thinmathspace}{0ex}}\{T:I\{c\ge {z}^{\top}\beta ({\nu}_{1})\}=\underset{{\nu}_{2}\downarrow 0}{\text{lim}}I\{c\ge {z}^{\top}\beta ({\nu}_{2})\}\phantom{\rule{thinmathspace}{0ex}}\text{and}\phantom{\rule{thinmathspace}{0ex}}c\ne {z}^{\top}\beta ({\nu}_{1})\hfill \\ \hfill \forall {\nu}_{1}\in (0,\tau ]\phantom{\rule{thinmathspace}{0ex}}\text{and}\phantom{\rule{thinmathspace}{0ex}}(c,z)\in \mathbb{H}\}.\end{array}$$

Thus, *I*{*C* ≥ *Z*^{}β(τ)} remains constant and *C* ≠ *Z*^{}β(τ) over τ (0, τ_{1}) almost surely. Locally, equation (8) reduces to

$$D(\tau )={\displaystyle {\int}_{0}^{\tau}\{Y(\tau )-D(\nu )\}\frac{d\nu}{1-\nu}\text{}\tau \in [0,{\tau}_{1}),}$$

where *D*(τ) is the left-hand side of (8) and *Y* (τ) *E*[*ZI*{*C* ≥ *Z*^{}β(τ)}] is constant over τ (0, τ_{1}). The above equation admits a unique solution *D*(τ) = τ*Y* (τ), or equivalently

$$\begin{array}{c}E\left(\mathit{\text{ZI}}\{C\ge {Z}^{\top}\beta (\tau )\}[I\{T<{Z}^{\top}\beta (\tau )\}+I\{T={Z}^{\top}\beta (\tau )\}{\eta}_{Z}(\tau )-\tau ]\right)\phantom{\rule{thinmathspace}{0ex}}=\phantom{\rule{thinmathspace}{0ex}}0\hfill \\ \hfill \tau \in (0,{\tau}_{1}).\end{array}$$

By arguments similar to Remark 3, β(τ) is the minimizer of *E*[{*X* − *Z*^{}β(τ) ∧ *C*}^{−} + τ{*X* − *Z*^{}β(τ) ∧ *C*}]. Recognizing that this minimization problem is the basis for Powell’s (1984, 1986) estimator, we then obtain (9) for τ (0, τ_{1}) so long as > 0. Given (9), integral equation (8) implies

$$J(\tau )=-{\displaystyle {\int}_{0}^{\tau}J(\nu )\frac{d\nu}{1-\nu}\text{}\tau \in [0,{\tau}_{1}),}$$

where *J*(τ) is the difference between the two sides of (10). Thus, (10) is obtained for τ (0, τ_{1}) by an application of the Gronwall’s inequality.

Under Assumption 1, one can show

$$\begin{array}{c}\underset{\tau \downarrow {\tau}_{1}}{\text{lim}}\phantom{\rule{thinmathspace}{0ex}}E\left(Z[I\{X\ge {Z}^{\top}\beta (\tau )\}-I\{X={Z}^{\top}\beta (\tau )\}{\eta}_{Z}(\tau )]\right)\hfill \\ \hfill =(1-{\tau}_{1})\underset{\tau \downarrow {\tau}_{1}}{\text{lim}}E[\mathit{\text{ZI}}\{C\ge {Z}^{\top}\beta (\tau )\}].\end{array}$$

Then, by taking advantage of the notion of relative quantile and integral equation (12), results (9) and (10) can be established inductively beyond τ_{1}, up to .

With the developments in Section 3, it remains to establish Lemma 1. Given the existence of a feasible value for (0, τ), nonnegativity of the objective function in (16) ensures the existence of a minimizer. Furthermore, note that the objective function becomes linear upon adding ${X}_{i}\le {Z}_{i}^{\top}b\phantom{\rule{thinmathspace}{0ex}}\text{or}\phantom{\rule{thinmathspace}{0ex}}{X}_{i}\ge {Z}_{i}^{\top}b$ for each censored individual to the constraints. Therefore, this piecewise-linear programming problem becomes the minimization of a set of linear programming problems, where each member involves additional constraints concerning censored individuals. It is known that a linear programming problem has a vertex solution if a bounded solution exists (e.g., Gill et al., 1991, Section 7.8.2). Assertion (iii) of Lemma 1 then follows.

For assertion (iv), we only consider the situation that the interpolated set is of size *p*; otherwise one may work with the corresponding perturbed problem. Write _{C} as the subset of censored individuals in . The following two linear programming problems have the same solution as (16):

$$\begin{array}{cccc}\hfill {\text{min}}_{b}& -{A}^{\top}b\hfill & \hfill {\text{min}}_{b}& -{(A+{\displaystyle {\sum}_{i\in {\mathbb{S}}_{C}}{Z}_{i}})}^{\top}b\hfill \\ \text{subject}\phantom{\rule{thinmathspace}{0ex}}\text{to}\hfill & {X}_{i}\le {Z}_{i}^{\top}b\text{}\forall i\in {\mathbb{D}}_{-}\hfill & \text{subject}\phantom{\rule{thinmathspace}{0ex}}\text{to}\hfill & {X}_{i}\le {Z}_{i}^{\top}b\text{}\forall i\in {\mathbb{D}}_{-}\hfill \\ \hfill & {X}_{i}={Z}_{i}^{\top}b\text{}\forall i\in {\mathbb{D}}_{0},\hfill & \hfill & {X}_{i}={Z}_{i}^{\top}b\text{}\forall i\in {\mathbb{D}}_{0},\hfill \\ \hfill & {X}_{i}\ge {Z}_{i}^{\top}b\text{}\forall i\in {\mathbb{D}}_{+}\hfill & \hfill & {X}_{i}\ge {Z}_{i}^{\top}b\text{}\forall i\in {\mathbb{D}}_{+}\hfill \\ \hfill & {X}_{i}\le {Z}_{i}^{\top}b\text{}\forall i\in {\mathbb{S}}_{C}\hfill & \hfill & {X}_{i}\ge {Z}_{i}^{\top}b\text{}\forall i\in {\mathbb{S}}_{C}\hfill \end{array}$$

where $A={\displaystyle {\sum}_{i\in {\mathbb{D}}_{0}}\{1-{w}_{i}(\tau )\}{Z}_{i}+{\displaystyle {\sum}_{i\in {\mathbb{D}}_{+}}{Z}_{i}+{\displaystyle {\sum}_{i:{\mathrm{\Delta}}_{i}=0,{X}_{i}>{Z}_{i}^{\top}\widehat{\beta}(0,\tau )}{Z}_{i}}}}$. Of course, the above two coincide in the case of _{C} = . Applying Gill et al. (1991, theorem 7.8.1) yields

$$\begin{array}{c}A={\displaystyle \sum _{i\in \mathbb{S}}{Z}_{i}{\gamma}_{i}^{(1)},\text{}A+{\displaystyle \sum _{i\in {\mathbb{S}}_{C}}{Z}_{i}={\displaystyle \sum _{i\in \mathbb{S}}{Z}_{i}{\gamma}_{i}^{(2)}},}}\hfill \end{array}$$

for some ${\gamma}_{i}^{(\xb7)}$, where ${\gamma}_{i}^{(\xb7)}\le 0$ for *i* _{−}, ${\gamma}_{i}^{(\xb7)}\ge 0$ for *i* _{+}, and ${\gamma}_{i}^{(1)}\le 0\phantom{\rule{thinmathspace}{0ex}}\text{and}\phantom{\rule{thinmathspace}{0ex}}{\gamma}_{i}^{(2)}\ge 0$ for *i* _{C}. Since *Z*_{} is of full rank, ${\gamma}_{i}^{(1)}={\gamma}_{i}^{(2)}$ for *i* \ _{C} and ${\gamma}_{i}^{(1)}={\gamma}_{i}^{(2)}-1$ for *i* _{C}. Therefore, *Ĥ* as determined upon setting ${w}_{i}(\tau )={\gamma}_{i}^{(2)}\in [0,1]$ for *i* _{C} satisfies equation (18), with ${\gamma}_{i}={\gamma}_{i}^{(\xb7)}$ for *i* \ _{C}. This completes the proof.

Similar to Peng & Huang (2008), we introduce monotone maps Γ_{1}(*b*) = *E*{*Z*Δ*I*(*X* ≤ *Z*^{}*b*)} and Γ_{2}(*b*) = *E*{*ZI*(*X* ≥ *Z*^{}*b*)}. Write their empirical counterparts as ${\widehat{\mathrm{\Gamma}}}_{1}(b)={n}^{-1}{\displaystyle {\sum}_{i=1}^{n}{Z}_{i}{\mathrm{\Delta}}_{i}I({X}_{i}\le {Z}_{i}^{\top}b)\phantom{\rule{thinmathspace}{0ex}}\text{and}\phantom{\rule{thinmathspace}{0ex}}{\widehat{\mathrm{\Gamma}}}_{2}(b)={n}^{-1}{\displaystyle {\sum}_{i=1}^{n}{Z}_{i}I({X}_{i}\ge {Z}_{i}^{\top}b)}}$ Under condition C3, Γ_{1}(*b*) is a one-to-one map for *b* (*u*) and ${\mathrm{\Gamma}}_{1}^{-1}(\xb7)$ exists. Write $H(a)={\mathrm{\Gamma}}_{2}\{{\mathrm{\Gamma}}_{1}^{-1}(a)\}$ and note $\partial H(a)/\partial {a}^{\top}=-\mathrm{\Psi}\{{\mathrm{\Gamma}}_{1}^{-1}(a)\}-\mathrm{\Pi}$, where Ψ(·) is defined in condition C4 and Π is the *p* × *p* identity matrix.

**Identifiability.** Write *α*(τ) = Γ_{1}{β(τ)}, and integral equation (8) can be written as

$$\alpha (\tau )={\displaystyle {\int}_{0}^{\tau}H\{\alpha (\nu )\}\frac{d\nu}{1-\nu};}$$

note that terms involving η_{Z}(·) vanish under condition C2. Condition C4 implies that *H*(*a*) is Lipschitz-continuous. The Picard–Lindelöf theorem then asserts the solution uniqueness, i.e., α(·) = α_{0}(·) Γ_{1}{β0(τ)}. It follows that β(τ) = β_{0}(τ) for all τ (0, *u*].

**Consistency.** It is known that {*I*(*X* ≤ *Z*^{}*b*) : *b* ^{p}} is Donsker (e.g., Kosorok, 2008, lemma 9.12). Furthermore, *Z* is bounded under condition C1. By permanence property of the Donsker class, {*Z*Δ*I*(*X* ≤ *Z*^{}*b*) : *b* ^{p}} is Donsker. So is {*ZI*(*X* ≥ *Z*^{}*b*) : *b* ^{p}} by similar arguments. Since Donsker implies Glivenko–Cantelli, almost surely

$$\underset{b\in {\mathbb{R}}^{p}}{\text{sup}}\Vert \widehat{{\mathrm{\Gamma}}_{j}}(b)-{\mathrm{\Gamma}}_{j}(b)\Vert =0(1)\text{}j=1,2.$$

(A.3)

On the other hand, condition C2 implies that sup_{bp} ${\sum}_{i=1}^{n}I({X}_{i}={Z}_{i}^{\top}b)\le p$ almost surely. Then, coupled with condition C1, with any *w _{i}* [0, 1], almost surely

$$\begin{array}{c}\underset{b\in {\mathbb{R}}^{p}}{\text{sup}}\Vert {n}^{-1}{\displaystyle \sum _{i=1}^{n}{Z}_{i}{\mathrm{\Delta}}_{i}I({X}_{i}={Z}_{i}^{\top}b)({w}_{i}-1)}\Vert =O({n}^{-1}),\hfill \\ \hfill \underset{b\in {\mathbb{R}}^{p}}{\text{sup}}\Vert {n}^{-1}{\displaystyle \sum _{i=1}^{n}{Z}_{i}I({X}_{i}={Z}_{i}^{\top}b){w}_{i}}\Vert =O({n}^{-1}).\end{array}$$

Therefore, almost surely

$$\underset{\tau \in [0,u]}{\text{sup}}\Vert {\widehat{\mathrm{\Gamma}}}_{1}\{\widehat{\beta}(\tau )\}-{\displaystyle {\int}_{0}^{\tau}{\widehat{\mathrm{\Gamma}}}_{2}\{\widehat{\beta}(\nu )\}\frac{d\nu}{1-\nu}}\Vert =O({n}^{-1}),$$

(A.4)

since equation (13) can be written as

$$\begin{array}{c}{\widehat{\mathrm{\Gamma}}}_{1}\{\beta (\tau )\}+{n}^{-1}{\displaystyle \sum _{i=1}^{n}{Z}_{i}{\mathrm{\Delta}}_{i}I\{{X}_{i}={Z}_{i}^{\top}\beta (\tau )\}\{{w}_{i}(\tau )-1\}}\hfill \\ ={\displaystyle {\int}_{0}^{\tau}\left[{\widehat{\mathrm{\Gamma}}}_{2}\{\beta (\nu )\}-{n}^{-1}{\displaystyle \sum _{i=1}^{n}{Z}_{i}I\{{X}_{i}={Z}_{i}^{\top}\beta (\nu )\}{w}_{i}(\nu )}\right]\frac{d\nu}{1-\nu}}\hfill \end{array}$$

and (·) is a solution.

Following equations (A.3) and (A.4), almost surely

$$\underset{\tau \in [0,u]}{\text{sup}}\Vert \widehat{\alpha}(\tau )-{\displaystyle {\int}_{0}^{\tau}H\{\widehat{\alpha}(\nu )\}\frac{d\nu}{1-\nu}}\Vert =o(1),$$

where (τ) = Γ_{1}{(τ)}. Write *L* as the Lipschitz constant of *H*(·). Thus, for every ε > 0 and with sufficiently large *n*, almost surely

$$\begin{array}{cc}\Vert \widehat{\alpha}(\tau )-{\alpha}_{o}(\tau )\Vert \hfill & \le {\displaystyle {\int}_{0}^{\tau}\Vert H\{\widehat{\alpha}(\nu )\}-H\{{\alpha}_{0}(\nu )\}\Vert \frac{d\nu}{1-\nu}+\epsilon}\hfill \\ \hfill & \le {\displaystyle {\int}_{0}^{\tau}L\Vert \widehat{\alpha}(\nu )-{\alpha}_{0}(\nu )\Vert \frac{d\nu}{1-\nu}+\epsilon},\hfill \end{array}$$

which leads to

$$\Vert \widehat{\alpha}(\tau )-{\alpha}_{0}(\tau )\Vert \le \epsilon {(1-\tau )}^{-L}\text{}\tau \in [0,u],$$

(A.5)

by the Gronwall’s inequality. Therefore, (τ) is strongly consistent for α_{0}(τ) uniformly in τ [0, *u*].

It remains to show that, for any ε > 0, there exists δ > 0 such that sup_{τ[l,u]} ‖α(τ) − α_{0}(τ) ‖ < δ implies sup_{τ [l,u]} ‖β(τ) − β_{0}(τ) ‖ < ε. Suppose that the assertion is false. Thus, for each δ > 0, there exists (*b*, ν) such that ‖Γ_{1}(*b*) − α_{0}(ν) ‖ < δ and ‖*b* − β_{0}(ν) ‖ > *d* for some constant *d* > 0. Then, there is a subsequence of (*b*, ν) that converges to, say, (*b*_{0}, ν_{0}). This means that Γ_{1}(*b*_{0}) = Γ_{1}{β_{0}(ν_{0})} but *b*_{0} ≠ β_{0}(ν_{0}). However, conditions C1 and C3 imply that *f _{Z}*{

**Weak convergence.**

Lemma A.1 *Under the conditions in Theorem 2*,

$$\begin{array}{c}\underset{\tau \in [0,u]}{\text{sup}}\Vert {\widehat{\mathrm{\Gamma}}}_{1}\{\widehat{\beta}(\tau )\}-{\mathrm{\Gamma}}_{1}\{\widehat{\beta}(\tau )\}-{\widehat{\mathrm{\Gamma}}}_{1}\{{\beta}_{0}(\tau )\}+{\mathrm{\Gamma}}_{1}\{{\beta}_{0}(\tau )\}\Vert \hfill \\ \hfill ={o}_{p}({n}^{-1/2}),\hfill \end{array}$$

(A.6)

$$\begin{array}{c}\underset{\tau \in [0,u]}{\text{sup}}\Vert {\displaystyle {\int}_{0}^{\tau}[{\widehat{\mathrm{\Gamma}}}_{2}\left\{\widehat{\beta}\left(\nu \right)\right\}-{\mathrm{\Gamma}}_{2}\left\{\widehat{\beta}\left(\nu \right)\right\}-{\widehat{\mathrm{\Gamma}}}_{2}\left\{{\beta}_{0}\left(\nu \right)\right\}+{\mathrm{\Gamma}}_{2}\left\{{\beta}_{0}\left(\nu \right)\right\}]\frac{d\nu}{1-\nu}}\Vert \hfill \\ \hfill ={o}_{p}\left({n}^{-1/2}\right).\hfill \end{array}$$

(A.7)

*Proof of Lemma A.1.* Consider (A.6) first. Since {*Z*Δ*I*(*X* ≤ *Z*^{}*b*) : *b* ^{p}} is Donsker, *n*^{1/2}{_{1}(*b*) − Γ_{1}(*b*)} converges weakly to a tight Gaussian process. The tightness implies that, for every ε > 0 and *m* = 1, , *p*,

$$\begin{array}{c}\underset{\delta \downarrow 0}{\text{lim}}\phantom{\rule{thinmathspace}{0ex}}\underset{n\to \infty}{\text{lim}\phantom{\rule{thinmathspace}{0ex}}\text{sup}}\phantom{\rule{thinmathspace}{0ex}}\text{Pr}\phantom{\rule{thinmathspace}{0ex}}(\underset{{b}_{1},{b}_{2}:\sigma [{n}^{1/2}\{{\widehat{\mathrm{\Gamma}}}_{1}^{(m)}({b}_{1})-{\widehat{\mathrm{\Gamma}}}_{1}^{(m)}({b}_{2})\}]<\delta}{\text{sup}}\hfill \\ \hfill \text{}{n}^{1/2}\left|{\widehat{\mathrm{\Gamma}}}_{1}^{(m)}({b}_{1})-{\mathrm{\Gamma}}_{1}^{(m)}({b}_{1})-{\widehat{\mathrm{\Gamma}}}_{1}^{(m)}({b}_{2})+{\mathrm{\Gamma}}_{1}^{(m)}({b}_{2})\right|\epsilon )=0,\end{array}$$

where σ(·) denotes standard deviation and superscript ^{(m)} the *m*-th component of a vector; see, e.g., Kosorok (2008). Furthermore, note that

$$\begin{array}{c}{\sigma}^{2}[{n}^{1/2}\{{\widehat{\mathrm{\Gamma}}}_{1}^{(m)}({b}_{1})-{\widehat{\mathrm{\Gamma}}}_{1}^{(m)}({b}_{2})\}]\hfill \\ \text{\hspace{1em}\hspace{1em}\hspace{1em}\hspace{1em}\hspace{1em}}={\sigma}^{2}[{Z}^{(m)}\mathrm{\Delta}\{I(X\le {Z}^{\top}{b}_{1})-I(X\le {Z}^{\top}{b}_{2})\}]\hfill \\ \text{\hspace{1em}\hspace{1em}\hspace{1em}\hspace{1em}\hspace{1em}}\le E\{{Z}^{(m)2}\mathrm{\Delta}\left|I(X\le {Z}^{\top}{b}_{1})-I(X\le {Z}^{\top}{b}_{2})\right|\}.\hfill \end{array}$$

Write (β_{0}, , τ) = *E*[Δ|*I*(*X* ≤ *Z*^{}*b*) − *I*{X ≤ *Z*^{}β_{0}(τ)}|] |_{b=(τ)}. Given condition C1, it then suffices to show sup

$$\underset{\tau \in [0,u]}{\text{sup}}\Upsilon ({\beta}_{0},\widehat{\beta},\tau )=o(1)$$

(A.8)

almost surely. Let *c _{f}* be the upper bound of

$$\begin{array}{cc}\Upsilon ({\beta}_{0},\widehat{\beta},\tau )\hfill & \le {c}_{f}\phantom{\rule{thinmathspace}{0ex}}E\phantom{\rule{thinmathspace}{0ex}}\left[\left|{Z}^{\top}\{b-{\beta}_{0}(\tau )\}\right|\right]{|}_{b=\widehat{\beta}(\tau )}\hfill \\ \hfill & \le {c}_{f}\Vert \widehat{\beta}(\tau )-{\beta}_{0}(\tau )\Vert E\Vert Z\Vert .\hfill \end{array}$$

Following the consistency of (·), for every *l* > 0,

$$\underset{\tau \in [l,u]}{\text{sup}}\Upsilon ({\beta}_{0},\widehat{\beta},\tau )=o(1)$$

(A.9)

almost surely. On the other hand,

$$\begin{array}{cc}\Upsilon ({\beta}_{0},\widehat{\beta},\tau )\hfill & \le {\mathrm{\Gamma}}_{1}^{(1)}\{\widehat{\beta}(\tau )\}+{\mathrm{\Gamma}}_{1}^{(1)}\{{\beta}_{0}(\tau )\}\hfill \\ \hfill & \le 2{\mathrm{\Gamma}}_{1}^{(1)}\{{\beta}_{0}(\tau )\}+\left|{\mathrm{\Gamma}}_{1}^{(1)}\{\widehat{\beta}(\tau )\}-{\mathrm{\Gamma}}_{1}^{(1)}\{{\beta}_{0}(\tau )\}\right|.\hfill \end{array}$$

Therefore, following the consistency of (·), for every ε > 0, there exists τ_{ε} > 0 such that

$$\underset{\tau \in [0,{\tau}_{e}]}{\text{sup}}\Upsilon ({\beta}_{0},\widehat{\beta},\tau )<\epsilon $$

(A.10)

almost surely for sufficiently large *n*. Combining (A.9) and (A.10) gives (A.8).

Now consider (A.7). Arguments similar to the above establish that, for every *l* > 0,

$$\underset{\tau \in [l,u]}{\text{sup}}\Vert {\widehat{\mathrm{\Gamma}}}_{2}\{\widehat{\beta}(\tau )\}-{\mathrm{\Gamma}}_{2}\{\widehat{\beta}(\tau )\}-{\widehat{\mathrm{\Gamma}}}_{2}\{{\beta}_{0}(\tau )\}+{\mathrm{\Gamma}}_{2}\{{\beta}_{0}(\tau )\}\Vert ={o}_{p}({n}^{-1/2}).$$

(A.11)

Since sup_{bp} ‖_{2}(*b*) − Γ_{2}(*b*)‖ = *O _{p}*(

$$\begin{array}{c}\text{Pr}\phantom{\rule{thinmathspace}{0ex}}(\underset{\tau \in [0,{\tau}_{e}]}{\text{sup}}\Vert {\displaystyle {\int}_{0}^{\tau}{n}^{1/2}[{\widehat{\mathrm{\Gamma}}}_{2}\{\widehat{\beta}(\nu )\}-{\mathrm{\Gamma}}_{2}\{\widehat{\beta}(\nu )\}}\hfill \\ \text{\hspace{1em}\hspace{1em}\hspace{1em}\hspace{1em}\hspace{1em}\hspace{1em}}-{\widehat{\mathrm{\Gamma}}}_{2}\{{\beta}_{0}(\nu )\}+{\mathrm{\Gamma}}_{2}\{{\beta}_{0}(\nu )\}]\frac{d\nu}{1-\nu}\Vert >\epsilon )\to 0.\hfill \end{array}$$

(A.12)

Then, (A.7) follows from (A.11) and (A.12).

Plugging equations (A.6) and (A.7) into (A.4) yields that, for τ [0, *u*],

$$\begin{array}{c}\widehat{\alpha}(\tau )-{\alpha}_{0}(\tau )-{\displaystyle {\int}_{0}^{\tau}[H\{\widehat{\alpha}(\nu )\}-H\{{\alpha}_{0}(\nu )\}]\frac{d\nu}{1-\nu}+{o}_{p}({n}^{-1/2})}\hfill \\ \text{\hspace{1em}\hspace{1em}\hspace{1em}\hspace{1em}\hspace{1em}\hspace{1em}}=-{\widehat{\mathrm{\Gamma}}}_{1}\{{\beta}_{0}(\tau )\}+{\displaystyle {\int}_{0}^{\tau}{\widehat{\mathrm{\Gamma}}}_{2}}\{{\beta}_{0}(\nu )\}\frac{d\nu}{1-\nu},\hfill \end{array}$$

where *o _{p}*(·) is uniform for τ [0,

$$\begin{array}{c}\widehat{\alpha}(\tau )-{\alpha}_{0}(\tau )+{\displaystyle {\int}_{0}^{\tau}[\mathrm{\Psi}\{{\beta}_{0}(\nu )\}+\mathrm{\Pi}][\widehat{\alpha}(\nu )-{\alpha}_{0}(\nu )]\frac{d\nu}{1-\nu}}\hfill \\ \hfill \text{\hspace{1em}\hspace{1em}\hspace{1em}\hspace{1em}\hspace{1em}}+{o}_{p}({n}^{-1/2}+\Vert \widehat{\alpha}(\xb7)-{\alpha}_{0}(\xb7)\Vert )\hfill \\ \hfill \text{\hspace{1em}\hspace{1em}}=-{\widehat{\mathrm{\Gamma}}}_{1}\{{\beta}_{0}(\tau )\}+{\displaystyle {\int}_{0}^{\tau}{\widehat{\mathrm{\Gamma}}}_{2}\{{\beta}_{0}(\nu )\}\frac{d\nu}{1-\nu}.}\hfill \end{array}$$

(A.13)

The remaining proof is sketched since it essentially follows that of theorem 2 in Peng & Huang (2008), where more details can be found. Note that the right-hand side of (A.13) is a martingale and converges weakly to a Gaussian process by the martingale central limit theorem (e.g., Fleming & Harrington, 1991). Furthermore, (A.13) as a differential equation of (·) − α_{0}(·) can be solved by using product integration theory (Gill & Johansen, 1990), establishing that (·)−α_{0}(·) as a linear map of the right-hand side converges weakly to a Gaussian process. The weak convergence of (·) then follows by the functional delta method.

Throughout, a quantity based on the perturbed sample is denoted by adding an asterisk. For example,
${\widehat{\mathrm{\Gamma}}}_{1}^{*}(b)$ is the perturbed version of _{1}(*b*).

The same arguments of the consistency proof in Appendix D may be used to show the strong consistency of *(·) for α_{0}(·) on [0, *u*] and that of *(·) for β_{0}(·) on [*l, u*], upon establishment of the following two results. First, by Kosorok (2008, theorem 10.13), almost surely

$$\underset{b\in {\mathbb{R}}^{p}}{\text{sup}}\Vert {\widehat{\mathrm{\Gamma}}}_{j}^{*}(b)-{\mathrm{\Gamma}}_{j}(b)\Vert =o(1)\text{}j=1,2.$$

Second, the terms involving *w _{i}*(·) in the perturbed version of equation (13) are negligible, by the fact that the maximum of the n iid perturbing random variables is almost surely o(

By an unconditional multiplier central limit theorem (Kosorok, 2008, theorem 10.1 and corollary 10.3), ${n}^{1/2}\{{\widehat{\mathrm{\Gamma}}}_{j}^{*}(\xb7)-{\mathrm{\Gamma}}_{j}(\xb7)\},\phantom{\rule{thinmathspace}{0ex}}j=1,2$ converge weakly to tight processes. The arguments in the proof of Lemma A.1 then can be used to establish

$$\begin{array}{c}\underset{\tau \in [0,u]}{\text{sup}}\Vert {\widehat{\mathrm{\Gamma}}}_{1}^{*}\{{\widehat{\beta}}^{*}(\tau )\}-{\mathrm{\Gamma}}_{1}\{{\widehat{\beta}}^{*}(\tau )\}-{\widehat{\mathrm{\Gamma}}}_{1}^{*}\{{\beta}_{0}(\tau )\}+{\mathrm{\Gamma}}_{1}\{{\beta}_{0}(\tau )\}\Vert \hfill \\ \hfill ={o}_{p}({n}^{-1/2}),\hfill \\ \underset{\tau \in [0,u]}{\text{sup}}\Vert {\displaystyle {\int}_{0}^{\tau}[{\widehat{\mathrm{\Gamma}}}_{2}^{*}\{{\widehat{\beta}}^{*}(\nu )\}-{\mathrm{\Gamma}}_{2}\{{\widehat{\beta}}^{*}(\nu )\}-{\widehat{\mathrm{\Gamma}}}_{2}^{*}\{{\beta}_{0}(\nu )\}+{\mathrm{\Gamma}}_{2}\{{\beta}_{0}(\nu )\}]\frac{d\nu}{1-\nu}}\Vert \hfill \\ \hfill ={o}_{p}({n}^{-1/2}).\hfill \end{array}$$

Thus, along the lines to establish equation (A.13), one obtains

$$\begin{array}{c}{\widehat{\alpha}}^{*}(\tau )-{\alpha}_{0}(\tau )+{\displaystyle {\int}_{0}^{\tau}[\mathrm{\Psi}\{{\beta}_{0}(\nu )\}+\mathrm{\Pi}][{\widehat{\alpha}}^{*}(\nu )-{\alpha}_{0}(\nu )]\frac{d\nu}{1-\nu}}\hfill \\ \hfill +{o}_{p}({n}^{-1/2}+\Vert {\widehat{\alpha}}^{*}(\xb7)-{\alpha}_{0}(\xb7)\Vert )\\ \hfill =-{\widehat{\mathrm{\Gamma}}}_{1}^{*}\{{\beta}_{0}(\tau )\}+{\displaystyle {\int}_{0}^{\tau}{\widehat{\mathrm{\Gamma}}}_{2}^{*}\{{\beta}_{0}(\nu )\}\frac{d\nu}{1-\nu},}\hfill \end{array}$$

(A.14)

given that a perturbing random variable is almost surely *o*(*n*^{1/2}).

Following from equations (A.13) and (A.14),

$$\begin{array}{c}\text{}{\widehat{\alpha}}^{*}(\tau )-\widehat{\alpha}(\tau )+{\displaystyle {\int}_{0}^{\tau}[\mathrm{\Psi}\{{\beta}_{0}(\nu )\}+\mathrm{\Pi}][{\widehat{\alpha}}^{*}(\nu )-\widehat{\alpha}(\nu )]\frac{d\nu}{1-\nu}}\hfill \\ \hfill \text{\hspace{1em}\hspace{1em}\hspace{1em}}+{o}_{p}({n}^{-1/2}+\Vert {\widehat{\alpha}}^{*}(\xb7)-\widehat{\alpha}(\xb7)\Vert )\hfill \\ =-{\widehat{\mathrm{\Gamma}}}_{1}^{*}\{{\beta}_{0}(\tau )\}+{\widehat{\mathrm{\Gamma}}}_{1}\{{\beta}_{0}(\tau )\}+{\displaystyle {\int}_{0}^{\tau}[{\widehat{\mathrm{\Gamma}}}_{2}^{*}\{{\beta}_{0}(\nu )\}-{\widehat{\mathrm{\Gamma}}}_{2}\{{\beta}_{0}(\nu )\}]\frac{d\nu}{1-\nu},}\hfill \end{array}$$

(A.15)

since ‖(·) − α_{0}(·) ‖ = *O _{p}*(

^{*}Supported in part by grants R01 CA090747 and R01 AI078835 from the U.S. National Institutes of Health

- Aalen OO. A model for nonparametric regression analysis of counting processes. In: Klonecki W, Kozek A, Rosiński J, editors. Lecture Notes in Statistics, Volume 2. New York: Springer-Verlag; 1980. pp. 1–25.
- Buckley J, James I. Linear regression with censored data. Biometrika. 1979;66:429–436.
- Chernozhukov V. Extremal quantile regression. Ann. Statist. 2005;33:806–839.
- Efron B. Proceedings of the Fifth Berkeley Symposiums in Mathematical Statistics IV. New York: Prentice-Hall; 1967. The two sample problem with censored data; pp. 831–853.
- Fleming TR, Harrington DP. Counting Processes and Survival Analysis. New York: Wiley; 1991.
- Gill PE, Murray W, Wright MH. Numerical Linear Algebra and Optimization. Redwood City, CA: Addison–Wesley; 1991.
- Gill RD, Johansen S. A survey of product-integration with a view towards application in survival analysis. Ann. Statist. 1990;18:1501–1555.
- Gutenbrunner C, Jureková J. Regression rank scores and regression quantiles. Ann. Statist. 1992;20:305–330.
- Honoré B, Khan S, Powell JL. Quantile regression under random censoring. J. Economet. 2002;109:67–105.
- Jin Z, Ying Z, Wei LJ. A simple resampling method by perturbing the minimand. Biometrika. 2001;88:381–390.
- Koenker R. Censored quantile regression redux. J. Statist. Software. 2008;27(Issue 6):1–25.
- Koenker R, Bassett G. Regression quantiles. Econometrica. 1978;46:33–50.
- Koenker R, D’Orey V. Computing regression quantiles. Appl. Statist. 1987;36:383–393.
- Koenker R, Geling O. Reappraising medfly longevity: a quantile regression survival analysis. J. Am. Statist. Assoc. 2001;96:458–468.
- Kosorok MR. Introduction to Empirical Processes and Semi-parametric Inference. New York: Springer; 2008.
- Neocleous T, Vanden Branden K, Portnoy S. Correction to censored regression quantiles by S. Portnoy, 98 (2003), 1001–1012. J. Am. Statist. Assoc. 2006;101:860–861.
- Owen A. Empirical likelihood ratio confidence regions. Ann. Statist. 1990;18:90–120.
- Peng L, Huang Y. Survival Analysis with Temporal Covariate Effects. Biometrika. 2007;94:719–733.
- Peng L, Huang Y. Survival analysis with quantile regression models. J. Am. Statist. Assoc. 2008;103:637–649.
- Portnoy S. Censored regression quantiles. J. Am. Statist. Assoc. 2003;98:1001–1012.
- Portnoy S, Jureková J. On extreme regression quantiles. Extremes. 1999;2:227–243.
- Powell JL. Least absolute deviations estimation for the censored regression model. J. Economet. 1984;25:303–325.
- Powell JL. Censored regression quantiles. J. Economet. 1986;32:143–155.
- Robins JM, Ritov Y. Toward a curse of dimensionality appropriate (CODA) asymptotic theory for semi-parametric models. Statist. Med. 1997;16:285–319. [PubMed]
- Smith R. Nonregular regression. Biometrika. 1994;81:173–183.
- Tian L, Zucker D, Wei LJ. On the Cox model with time-varying regression coefficients. J. Am. Statist. Assoc. 2005;100:172–183.
- Tsiatis AA. Estimating regression parameters using linear rank tests for censored data. Ann. Statist. 1990;18:354–372.
- Wang H, Wang L. Locally weighed censored quantile regression. J. Am. Statist. Assoc. 2009;104:1117–1128.
- Ying Z, Jung SH, Wei LJ. Survival analysis with median regression models. J. Am. Statist. Assoc. 1995;90:178–184.

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |