Fan et al. are to be congratulated for this important contribution to the analysis of multivariate failure time data. They have provided three regression parameter estimators for multiple covariates in the marginal hazard model. Using the weighted estimating equation approach, they proposed sets of weights chosen to minimize, respectively, the component-wise variances, the total variance, and the variance of a linear function of the parameter estimates.
They showed that each of their proposed estimators can consistently outperform estimates derived using the working independence model.
In this short note, we show that in the presence of high-dimensional covariates Fan et al.’s ideas can be combined with those of  to achieve these optimal estimates along with simultaneous variable selection. That is, our interest lies in controlling the variances of the estimates of β = (β1,…,βp)T associated with high dimensional covariates, while simultaneously selecting the “important” covariates in order to construct a parsimonious model. Here, we consider p as a large but fixed constant, as opposed to  which considered the situation where p may increase with the sample size n.
The key idea is to add a penalty function pλj (|βj|) to Fan et al.’s weighted partial likelihood function (10), which leads to the following penalized likelihood function
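As a sketch of the general form only: writing $\ell_w(\beta)$ for Fan et al.'s weighted (log) partial likelihood (10) and $Q(\beta)$ for the penalized criterion (both symbols are notation assumed here), and assuming the scaling by $n$ that is conventional in penalized partial likelihood, the criterion would read

$$
Q(\beta) = \ell_w(\beta) - n \sum_{j=1}^{p} p_{\lambda_j}(|\beta_j|).
$$

The penalty enters with a negative sign because the criterion is maximized, so that large coefficients are penalized rather than rewarded.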
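Denote by $\beta_0 = (\beta_{01},\ldots,\beta_{0p})^T$ the true value of $\beta$ and suppose, without loss of generality, that $\beta_{0k} \neq 0$ for $k \le s$ and $\beta_{0k} = 0$ for $k > s$, for some $s \le p$. Let $\boldsymbol{\beta}_{01} = (\beta_{01},\ldots,\beta_{0s})^T$. Finally, let $\hat{\boldsymbol{\beta}} = (\hat{\boldsymbol{\beta}}_1^T, \hat{\boldsymbol{\beta}}_2^T)^T$ be the solution that maximizes (1), where $\hat{\boldsymbol{\beta}}_1 = (\hat\beta_1,\ldots,\hat\beta_s)^T$ and $\hat{\boldsymbol{\beta}}_2 = (\hat\beta_{s+1},\ldots,\hat\beta_p)^T$.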
With an appropriate penalty function, our estimator may enjoy the oracle property. That is, the procedure should select the true model with probability tending to 1 and, given the true model, the coefficient estimates should asymptotically behave like maximum (partial) likelihood estimators. More specifically, consider
Here, since λj depends on n, we write it as λjn. Along the lines of Theorem 2 of , we can show that if the penalty function is such that and bn → 0, and in addition λjn → 0 and ,
Therefore, by (2) we have that the asymptotic covariance of is
where is the penalized version of var(W) defined in section 2.3 of Fan et al. One example of an appropriate pλj (θ) is the smoothly clipped absolute deviation penalty of , where
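$$
p'_{\lambda_j}(\theta) = \lambda_j \left\{ I(\theta \le \lambda_j) + \frac{(a\lambda_j - \theta)_+}{(a-1)\lambda_j}\, I(\theta > \lambda_j) \right\}
$$
for some $a > 2$ and $\theta > 0$. More examples can be found in .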
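The smoothly clipped absolute deviation (SCAD) penalty can be sketched in Python as follows. This is a minimal illustration, assuming the standard definition of SCAD through its first derivative and the commonly recommended default a = 3.7; the function names are hypothetical and not taken from Fan et al.

```python
import numpy as np

def scad_derivative(theta, lam, a=3.7):
    """First derivative p'_lam(theta) of the SCAD penalty, for theta >= 0 and lam > 0."""
    theta = np.asarray(theta, dtype=float)
    # Equal to lam on [0, lam], decays linearly on (lam, a*lam], zero beyond a*lam.
    tail = np.maximum(a * lam - theta, 0.0) / ((a - 1.0) * lam)
    return lam * np.where(theta <= lam, 1.0, tail)

def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty p_lam(|theta|), obtained by integrating the derivative from 0."""
    t = np.abs(np.asarray(theta, dtype=float))
    linear = lam * t                                               # |theta| <= lam
    quad = (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1.0))     # lam < |theta| <= a*lam
    const = lam**2 * (a + 1.0) / 2.0                               # |theta| > a*lam
    return np.where(t <= lam, linear, np.where(t <= a * lam, quad, const))
```

Because the derivative vanishes for arguments beyond a·λ, large coefficients are left essentially unpenalized, which is the property that makes an oracle-type result plausible for this penalty.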
We can now follow Fan et al. and define optimality criteria that will allow us to simultaneously estimate the optimal weights w = (w1,…,wJ). Minimizing the component-wise variance may not be ideal because the variance of $\hat\beta_j$, $j > s$, is irrelevant when the true β0j = 0. Minimizing the variance of any arbitrary linear function of the parameter estimates is also not always feasible, as was explained in Subsection 3.3 of Fan et al. Hence, we focus on minimizing the total variance:
analogous to (14). Following the derivation in Subsection 3.2, we assume that Σj(β) ≈ bjΓ for some Γ. Thus, if we constrain , we are to minimize
over w, where Γ11 and Dkl11 are the first s × s submatrices of Γ and Dkl(β), respectively. If HP is a symmetric matrix with diagonal elements tr([Γ11 + Σ11]−1Σj11[Γ11 + Σ11]−1) and off-diagonal elements tr([Γ11 + Σ11]−1Dkl11[Γ11 + Σ11]−1), the solution is given by
where b = (b1,…,bJ)T. Γ11 can be estimated by , and suggestions for possible choices for b were given in Subsection 3.2.
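The weight computation described above can be sketched numerically. The sketch below makes two assumptions that are not legible in the text: that the constraint is the linear normalization Σj wj bj = 1 (which reduces Σj wj Σj(β) to Γ when Σj(β) ≈ bjΓ), and that the objective is the quadratic form wᵀ HP w. Under those assumptions a Lagrange-multiplier argument gives w = HP⁻¹ b / (bᵀ HP⁻¹ b). All function and argument names are hypothetical.

```python
import numpy as np

def build_H(core, Sigma_list, D):
    """Assemble H_P from trace terms.

    core          : inverse of (Gamma11 + Sigma11), an s x s array (assumed given)
    Sigma_list[j] : the s x s block Sigma_j11
    D[k][l]       : the s x s block D_kl11 for k != l (diagonal entries unused)
    """
    J = len(Sigma_list)
    H = np.empty((J, J))
    for k in range(J):
        for l in range(J):
            M = Sigma_list[k] if k == l else D[k][l]
            H[k, l] = np.trace(core @ M @ core)
    # Symmetrize to guard against rounding asymmetry in the inputs.
    return (H + H.T) / 2.0

def optimal_weights(H, b):
    """Minimize w' H w subject to b' w = 1 (assumed constraint)."""
    b = np.asarray(b, dtype=float)
    Hinv_b = np.linalg.solve(H, b)
    return Hinv_b / (b @ Hinv_b)
```

For example, with H = diag(1, 4) and b = (1, 1), the sketch returns weights (0.8, 0.2), placing more weight on the component with the smaller trace term.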
Finally, we can choose the parameters λj by iteratively minimizing the generalized cross-validation statistic of Cai et al. and solving for the optimal weights wj. Let
If λ = (λ1,…,λp)T, we choose $\hat\lambda = \arg\min_{\lambda} \mathrm{GCV}(\lambda)$.
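The alternation between tuning-parameter selection and weight estimation can be sketched as a simple loop. This is an illustration only: `gcv` and `update_weights` are hypothetical user-supplied callables standing in for the generalized cross-validation statistic and the optimal-weight solution, and the search over a finite candidate grid is an assumption made to keep the example short.

```python
def alternate_tuning_and_weights(gcv, update_weights, lambda_grid, w_init,
                                 max_iter=20, tol=1e-6):
    """Alternate between choosing lambda by GCV and re-solving for the weights.

    gcv(lam, w)         : returns the GCV value for candidate lam given weights w
    update_weights(lam) : returns the weights computed for the chosen lam
    lambda_grid         : iterable of candidate lambda values (scalars or tuples)
    """
    w = list(w_init)
    lam = None
    for _ in range(max_iter):
        # Step 1: with the weights fixed, pick the candidate minimizing GCV.
        lam = min(lambda_grid, key=lambda cand: gcv(cand, w))
        # Step 2: with lambda fixed, recompute the weights.
        w_new = list(update_weights(lam))
        converged = max(abs(x - y) for x, y in zip(w_new, w)) < tol
        w = w_new
        if converged:
            break
    return lam, w
```

The loop stops when the weights change by less than `tol` between iterations or after `max_iter` passes; nothing here guarantees convergence in general, so the iteration cap is a practical safeguard.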
To conclude, we have suggested a way in which the work of Fan et al. can be extended to perform variable selection in models for multivariate survival times with high-dimensional covariates, while simultaneously providing optimally efficient parameter estimates for the selected covariates. Our future work will use empirical data and simulations to evaluate the accuracy and stability of the variable selection process, as well as the performance of the estimates derived using these optimal weights.
This work was supported by the U.S. National Cancer Institute (Grant No. R01 CA95747).