Home | About | Journals | Submit | Contact Us | Français |

**|**HHS Author Manuscripts**|**PMC2875380

Formats

Article sections

- Summary
- 1. Introduction
- 2. The model
- 3. Estimation
- 4. Inference
- 5. Simulation
- 6. Example: PSA for Prostate Cancer Prognosis
- 7. Discussion
- Supplementary Material
- References

Authors

Related links

Biometrics. Author manuscript; available in PMC 2010 May 25.

Published in final edited form as:

Published online 2009 April 13. doi: 10.1111/j.1541-0420.2009.01246.x

PMCID: PMC2875380

NIHMSID: NIHMS154286

The publisher's final edited version of this article is available at Biometrics

See other articles in PMC that cite the published article.

Rigorous statistical evaluation of the predictive values of novel biomarkers is critical prior to applying novel biomarkers into routine standard care. It is important to identify factors that influence the performance of a biomarker in order to determine the optimal conditions for test performance. We propose a covariate-specific time-dependent PPV curve to quantify the predictive accuracy of a prognostic marker measured on a continuous scale and with censored failure time outcome. The covariate effect is accommodated with a semiparametric regression model framework. In particular we adopt a smoothed survival time regression technique (Dabrowska, 1997) to account for the situation where risk for the disease occurrence and progression is likely to change over time. In addition, we provide asymptotic distribution theory and resampling-based procedures for making statistical inference on the covariate specific positive predictive values. We illustrate our approach with numerical studies and a dataset from a prostate cancer study.

The utility of novel molecular markers and clinical information for predicting disease risk and future prognostic outcome has great potential for advancing treatment and management decisions. In the Pacific Northwest Prostate Cancer SPORE (Specialized Program of Research Excellence), a cohort of patients diagnosed with prostate cancer (PC) are under follow-up for clinical outcomes. Information on biomarkers such as serum prostate-specific antigen (PSA) at diagnosis and genotype data on single nucleotide polymorphisms (SNPs) have been collected. One goal of the study is to identify biomarkers that can predict PC-specific mortality so that treatment options can be tailored to the individual. In this setting there are two key prognostic questions. First, should a test be taken by a patient to predict his/her prognosis? Second, given a patient has the test, what is the probability that a poor outcome will be realized in the near future based on the test result?

The first prognostic question may be addressed with retrospective accuracy characterized by correct classification rates such as the true positive fraction (TPF) and true negative fraction (TNF). Traditionally for a marker measured on a continuous scale, denoted by *Y*, measured concurrently with the disease outcome *D*, its discrimination potential can be summarized with the receiver operating characteristic (ROC) curve. The ROC curve is a plot of the TPF, *P*(*Y* > *c*|*D* = 1) against the false positive fraction (FPF), *P*(*Y* > *c*|*D* = 0) for all possible values *c*, where *D* = 1 indicates the presence of disease. These summaries reflect how well the test results reflect the true outcome, and they are crucial for evaluating the public health implications of clinical guidelines. Extending the traditional cross-sectional concept of ROC curve to accommodate the censored failure time outcome has been an area of active research for the past decade (e.g., Etzioni et al. (1999); Heagerty et al. (2000); Cai and Pepe (2002); Heagerty and Zheng (2005), for review, see Pepe et al. (2008)).

Physicians and individual patients are often most concerned with disease prediction given their clinical status data. Thus the prospective accuracy such as positive predictive values (PPV) and negative predictive values (NPV) are critical in addressing the second prognostic question. When data from a prospective cohort are available, calculation of PPV and NPV is usually performed. In contrast to research on retrospective accuracy, statistical methods for prospective accuracy have not been well developed, especially when a continuous marker is measured and the outcome is followed prospectively from a cohort study. The two prospective accuracy summaries are traditionally defined for a dichotomous test *Y* as

$$\text{PPV}=P(D=1\mid Y=1)\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\text{and}\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\text{NPV}=P(D=0\mid Y=0).$$

To accommodate a continuous marker, Moskowitz and Pepe (2004b) proposed to define PPV(*v*) = *P*{*D* = 1|*F*(*y*) ≥ *v*} and NPV(*v*) = *P*{*D* = 0|*F*(*y*) < *v*}, where *F* is the cumulative distribution function of *Y* in the population. Analogous to the idea of an ROC curve for thresholding the continuous marker values, they consider the PPV curve, a plot of PPV(*v*) versus *v*, for 0 < *v* < 1, where subjects with marker values exceed the *v*th percentile of the population are regarded as having a positive test. The consideration of *v* rather than the raw marker values as the x-axis of the PPV curve is appealing. First, it provides a common scale for comparing multiple markers with raw values measured on different scales. Second, it highlights the need for considering both predictive probability and marker distribution in the targeted population when evaluating the predictive accuracy of the marker.

Zheng et al. (2008) considered extending the PPV curve concept to a setting where disease status is not measured concurrently with the test and the time to event can be censored. Specifically, they consider disease status as a function of time, *D*(*t*) = 1(*T* < *t*) for a clinical event time *T*. Consequently PPV and NPV were time-dependent functions of the form

$$\text{PPV}(t,v)=P\{T<t\mid Y\ge {F}^{-1}(v)\},\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\text{and}\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\phantom{\rule{0.38889em}{0ex}}\text{NPV}(t,v)=P\{T\ge t\mid Y<{F}^{-1}(v)\}.$$

A time-dependent PPV curve at a predictive time *t* is then defined as a plot of PPV(*t, v*) versus *v* for *v* in an open interval of (0, 1). Therefore the x-axis of a PPV curve shows the proportion of subjects testing positive when a positive test is defined by exceeding the threshold corresponding to the *v*th percentile of *Y* in the population: *Y* ≥ *F*^{−1}(*v*), whereas the y-axis of a PPV curve shows the risk of developing an event by time *t* for subjects who satisfy the positivity criterion. Zheng et al. (2008) focuses on a class of nonparametric PPV curve estimators and their associated indices that provide a natural way for handling censored event outcomes and involve little parametric assumption. The methods are useful for evaluating individual PPV curves and comparing PPV curves.

In this article we focus on the statistical method for evaluating factors that may influence the predictive accuracy of a continuous biomarker. In practice there are many variables that can affect the accuracy of a marker, such as a subject’s age, gender, or body mass index (BMI). For example PSA levels tend to increase with age, and older patients may have a higher risk of prostate cancer progression depending upon treatment. It is likely that the predictive performance of PSA can be altered by these confounding variables. Marker performance may also depend on the means by which a biomarker-based test is administered or varying protocols between different institutions participating in the study. For example, variations in the timing for collecting and processing biological specimens should be considered. Furthermore, the heterogeneous nature of the disease will often affect the performance of the test. Gleason score is a PC aggressiveness index. A biomarker such as PSA could have differential performance in predicting prognosis in patients with a high Gleason score as compared to those in a low Gleason score group. Genetic studies will often identify new genotypes that give rise to sub-populations differing in their disease susceptibility, and the performance of PSA among these sub-populations may be substantially different. When the performance of a biomarker among sub-populations may be substantially different from the general population, covariate-specific PPV curve should be considered. It provides a graphical tool for investigating the impact of these confounding factors on the predictive performance of the marker, and therefore helps to identify appropriate sub-populations with effective test performance.

A broad range of questions can be addressed with covariate-specific PPV curves, in a regression model framework. It can be used to characterize the simultaneous or independent effects of covariates on predictive accuracy and to compare tests with regard to predictive accuracy while controlling for potential confounding variables. Using a test indicator as a covariate in the model permits more efficient comparison of two tests. Finally, it may provide a way to evaluate the incremental values of a new test, for example by examining whether the new test is predictive beyond previously identified prognostic information.

Statistical methods for incorporating covariate effects into PPV analysis are limited. One approach to incorporating covariate effects into time-dependent predictive values has been suggested by Moskowitz and Pepe (2004a) for a binary marker *Y*, where *Y* = 1 indicating a positive test result. For PPV** _{z}**(

$$\lambda (t\mid \mathbf{Z},Y=1)={\lambda}_{01}(t)exp({\mathit{\beta}}_{1}^{\text{T}}\mathbf{Z}),$$

in which *β*_{1} quantifies the difference in positive predictive accuracy between two levels of covariate **Z**. However, the idea is not directly applicable to continuous markers. The nonparametric time-dependent PPV curve estimators considered in Zheng et al. (2007) are quite flexible in specifying the relationship between a continuous marker *Y* and clinical event *T*, but it can be difficult to handle multiple covariates. In this article, we extend the method of the time-dependent PPV curve studied by Zheng et al. (2008) to the estimation of covariate-specific PPV curves. Similar to Moskowitz and Pepe (2004b), we will accommodate the covariate effects in a regression framework, but we will seek to preserve the flexible relationship between biomarker and event time that accommodated by the nonparametric PPV curve estimators in Zheng et al. (2008).

A formal definition of the covariate-specific PPV curve is given in Section 2. We describe two alternative estimation procedures and characterize their large sample properties in Section 3 and 4, respectively. The finite sample performance is investigated in Section 5, and in Section 6 we illustrate the method by analyzing a prostate cancer dataset. A discussion of the potential extensions of current work is given in section 7.

Consider a setting where one is interested in determining the effect of **Z** on the predictive accuracy of a continuous marker *Y*. In particular, we have a prospective study where each subject indexed by the subscript *i* is followed over time for a clinical event, and measured along with a marker *Y _{i}* at the baseline and other covariates

We define a time-dependent covariate-specific PPV curve for a prediction time *t* as a plot of

$${\text{PPV}}_{\mathbf{z}}(t,v)=P\{T<t\mid Y\ge {c}_{\mathbf{z}}(v),\mathbf{Z}=\mathbf{z}\},$$

versus *v*, for *v* in an open interval of (0,1). In contrast to a PPV curve that ignores covariates, the x-axis of a covariate-specific PPV curve displays among the subgroups who share the same covariate values **z** the fraction of subjects testing positive, when a positive test is defined by exceeding the threshold corresponding to the *v*th percentile of *Y* in the subpopulation: *Y* ≥ *c*** _{z}**(

$${\text{NPV}}_{\mathbf{z}}(t,v)=P\{T\ge t\mid Y<{c}_{\mathbf{z}}(v),\mathbf{Z}=\mathbf{z}\}.$$

Observe that the differences among PPV** _{z}**(

A simple approach for estimating the covariate-specific PPV curves is to fit an individual PPV curve using only data with **Z** = **z**. The procedure developed in Zheng et al. (2008) can be directly applied. However, such a subgroup analysis can not accommodate continuous covariates and may not be feasible due to sparseness of data. We therefore consider here flexible regression based procedures to accommodate more general structures of **Z**.

Observe that the covariate-specific PPV curve can be written as

$${\text{PPV}}_{\mathbf{z}}(t,v)=1-{(1-v)}^{-1}{\int}_{{c}_{\mathbf{z}}(v)}^{\infty}P(T\ge t\mid Y=y,\mathbf{Z}=\mathbf{z})d{F}_{Y\mid \mathbf{z}}(y).$$

The estimation of PPV** _{z}**(

To estimate *F _{Y}*

$${\widehat{F}}_{Y\mid \mathbf{z}}(y)=\sum _{i=1}^{n}I({Y}_{i}<y,{\mathbf{Z}}_{i}=\mathbf{z})/\sum _{i=1}^{n}I({\mathbf{Z}}_{i}=\mathbf{z}).$$

For multiple covariates or continuous variables, we will adopt the semi-parametric regression quantile method of Heagerty and Pepe (1999). Specifically, it specifies biomarker *Y* as a function of **Z** with a location-scale model: *Y _{i}* =

$${\widehat{F}}_{Y\mid \mathbf{z}}(y)={n}^{-1}\sum _{i=1}^{n}I\left[\frac{{Y}_{i}-{\widehat{\mathit{\gamma}}}^{\text{T}}({\mathbf{Z}}_{i}-\mathbf{z})}{exp\{{\widehat{\mathit{\kappa}}}^{\text{T}}({\mathbf{Z}}_{i}-\mathbf{z})\}}<y\right].$$

(3.1)

The other key quantity is the conditional survival function *S _{y,}*

In order to estimate the absolute risk function *S _{y,}*

$${S}_{y,\mathbf{z}}(t)=exp\{-{\mathrm{\Lambda}}_{0}(t)exp({\alpha}_{0}y+{\mathit{\beta}}_{0}^{\text{T}}\mathbf{z})\},$$

(3.2)

where ${\mathrm{\Lambda}}_{0}(t)={\int}_{0}^{t}{\lambda}_{0}(u)du$ is the cumulative baseline hazard function. Under the Cox model assumption, the survival function can be consistently estimated by

$${\stackrel{\sim}{S}}_{y,\mathbf{z}}(t)=exp\left\{-{\stackrel{\sim}{\mathrm{\Lambda}}}_{0}(t)exp\left(\stackrel{\sim}{\alpha}y+{\stackrel{\sim}{\mathit{\beta}}}^{\text{T}}\mathbf{z}\right)\right\},$$

where and ** are the maximum partial likelihood estimators of ***α*_{0} and *β*_{0} respectively, and

$${\stackrel{\sim}{\mathrm{\Lambda}}}_{0}(t)=\sum _{i=1}^{n}{\int}_{0}^{t}\frac{d{N}_{i}(u)}{{\sum}_{j=1}^{n}I({X}_{j}\ge u)exp(\stackrel{\sim}{\alpha}{Y}_{j}+{\stackrel{\sim}{\mathit{\beta}}}^{\text{T}}{\mathbf{Z}}_{j})}.$$

is the Breslow estimator of Λ_{0}(*t*) and *N _{i}*(

A plug-in estimator for PPV** _{z}**(

$${\stackrel{\sim}{\text{PPV}}}_{\mathbf{z}}(t,v)=1-{(1-v)}^{-1}{\int}_{{\widehat{c}}_{\mathbf{z}}(v)}^{\infty}{\stackrel{\sim}{S}}_{y,\mathbf{z}}(t)d{\widehat{F}}_{Y\mid \mathbf{z}}(y),$$

(3.3)

where ${\widehat{c}}_{\mathbf{z}}(v)={\widehat{F}}_{Y\mid \mathbf{z}}^{-1}(v)$. The approach has two advantages. First, the maximum partial likelihood based estimator will be most efficient when the assumption of a proportional hazards model holds. Second, it is easy to implement with standard software.

It is often anticipated in biomarker studies that the effects of markers may be time-varying and/or non-linear. As a result, the standard proportional hazards assumption may not be satisfied. For example, it is generally observed in HIV studies that the predictive capacity of CD4 cell counts on time to HIV infection among sero-converters tend to be stronger early in time, however the effect wanes over time (Zheng and Heagerty, 2005). Mis-specification of *T*|*Y* can have a substantial impact on estimating the PPV curve (Zheng et al., 2008). A more flexible specification of the relationship between *T*, marker value *Y* and covariates **Z** would be useful in practice. In the absence of covariates, Zheng et al. (2008) considered a nonparametric kernel estimator for *P*(*T* ≥ *t*|*Y*), which results in a more robust PPV estimator. To extend such a non-parametric procedure to incorporate covariates, we propose to preserve the non-parametric form of the functional relationship between *T* and *Y*, but cast the effects of other time-constant covariates within a regression model framework. This motivates the consideration of a “smoothed Cox regression model” by Dabrowska (1997),

$${\lambda}_{Y,\mathbf{Z}}(t)={\lambda}_{0}(t,Y)exp({\mathit{\beta}}_{0}^{\text{T}}\mathbf{Z}).$$

(3.4)

This model, while allowing possibly a complex non-linear effect of *Y* on *T*, can easily accommodate multiple continuous covariates. Note that interactions between *Y* and **Z** can be accommodated by adding additional covariates that corresponding to the interaction terms in the parametric part of the model. The estimator ** can be obtained as the maximizer of the following (pseudo) smoothed profile likelihood:
**

$$\widehat{C}(\mathit{\beta})={n}^{-1}\sum _{i=1}^{n}\int \int \{{\mathit{\beta}}^{\text{T}}{\mathbf{Z}}_{i}-log{\widehat{\pi}}_{y}(u,\mathit{\beta})\}d{\widehat{N}}_{y}(u)dy,$$

where
${\widehat{N}}_{y}(u)={n}^{-1}{\displaystyle \sum _{i=1}^{n}}{K}_{h}({Y}_{i}-y){N}_{i}(u),\phantom{\rule{0.38889em}{0ex}}{\widehat{\pi}}_{y}(u,\mathit{\beta})={n}^{-1}{\displaystyle \sum _{i=1}^{n}}{K}_{h}({Y}_{i}-y)I({X}_{i}\ge u)exp({\mathit{\beta}}^{\text{T}}{\mathbf{Z}}_{i})$, and *K _{h}*(

$${\widehat{C}}^{\ast}(\mathit{\beta})={n}^{-1}\sum _{i=1}^{n}\int \{{\mathit{\beta}}^{\text{T}}{\mathbf{Z}}_{i}-log{\widehat{\pi}}_{{Y}_{i}}(u,\mathit{\beta})\}{N}_{i}(du).$$

The maximizer of *Ĉ**(** β**) was shown to be asymptotically equivalent to

$${\widehat{S}}_{y,\mathbf{z}}(t)=exp\left\{-{\widehat{\mathrm{\Lambda}}}_{y,\mathbf{z}}(t)\right\}=exp\left\{-{\widehat{\mathrm{\Lambda}}}_{0y}(t)exp({\widehat{\mathit{\beta}}}^{\text{T}}\mathbf{z})\right\},$$

where
${\widehat{\mathrm{\Lambda}}}_{0y}(t)={\int}_{0}^{t}{\scriptstyle \frac{d{\widehat{N}}_{y}(u)}{{\widehat{\pi}}_{y}(u,\widehat{\mathit{\beta}})}}$. Again, a plug-in estimator of PPV** _{z}**(

$${\widehat{\text{PPV}}}_{\mathbf{z}}(t,v)=1-{(1-v)}^{-1}{\int}_{{\widehat{c}}_{\mathbf{z}}(v)}^{\infty}{\widehat{S}}_{y,\mathbf{z}}(t)d{\widehat{F}}_{Y\mid \mathbf{z}}(y).$$

(3.5)

In order to test whether a covariate has a significant impact on predictive accuracies, we consider an omnibus test to test the joint hypotheses:

$${H}_{0}:{\text{PPV}}_{{\mathbf{z}}_{1}}(t,v)={\text{PPV}}_{{\mathbf{z}}_{2}}(t,v),\phantom{\rule{0.38889em}{0ex}}\text{and}\phantom{\rule{0.16667em}{0ex}}{\text{NPV}}_{{\mathbf{z}}_{1}}(t,v)={\text{NPV}}_{{\mathbf{z}}_{2}}(t,v),$$

at every (*t, v*) and for all **z**_{1} and **z**_{2}, where **z**_{1} and **z**_{2} are different covariate values that define subpopulations under investigation. Suppose we use the semiparametric location-scale model as specified in model (3.1) for the distribution of the marker. We show in Web Appendix A that under a proportional hazards regression model as specified in model (3.2), *H*_{0} holds if and only if _{0}: *α*_{0}*κ*_{0} = **0** and *β*_{0} = −*α*_{0}*γ*_{0} holds. Thus, the covariate effect could be tested by simultaneously testing
${({\alpha}_{0}{\mathit{\kappa}}_{0}^{\text{T}},{\mathit{\beta}}_{0}^{\text{T}}+{\alpha}_{0}{\mathit{\gamma}}_{0}^{\text{T}})}^{\text{T}}=\mathbf{0}$. A *χ*^{2} test could be constructed based on the asymptotic distribution of the corresponding estimates. When the assumption of a proportional hazards model does not hold, and the distribution of the disease process is specified as a smoothed Cox regression model (model (3.4)), we can show that testing *H*_{0} is equivalent to testing *β*_{0} = *γ*_{0} = *κ*_{0} = **0**. A *χ*^{2} test can be derived using the variance-covariance matrix of all parameters involved. In practice, one can first test the proportional hazards assumption in the failure time model, then select the appropriate *χ*^{2} test to test whether predictive accuracies vary by different values of the covariate.

To make inference about PPV** _{z}**(

To construct point-wise and simultaneous confidence intervals for PPV** _{z}**(

$${\widehat{\text{PPV}}}_{\mathbf{z}}(t,v)\mp {d}_{\alpha}{\widehat{\sigma}}_{\mathbf{z}}(t,v),$$

where for a pointwise confidence interval, *d _{α}* is the 100(1 −

$$P\left\{\underset{v\in \{{v}_{a},{v}_{b}\}}{sup}\frac{\mid {\mathbb{W}}_{\mathbf{z}}^{(j)}(t,v)\mid}{{\widehat{\sigma}}_{\mathbf{z}}(t,v)}\le {d}_{\alpha}\right\}=1-\alpha .$$

We conduct numerical studies to assess the performance of the estimators under several scenarios. We first consider their finite sample performance using a moderate sample size of *n* = 500 and then a larger sample of *n* = 2000. We consider the case when the hazard of failure depends on the marker *Y* and a single covariate *Z* through the proportional hazards model of the form:
$\lambda (t)={\scriptstyle \frac{1}{10}}exp\{\mathit{log}(2)Y+\mathit{log}(1.5)Z\}$. We generate the censoring time from a uniform distribution with mean 10. This yields a censoring rate of 25% approximately. The covariate *Z* is a binary random variable from a Bernoulli distribution with probability equal to 0.5. The marker *Y* is generated from a conditional normal distribution with mean 0.5*Z* and variance 1. We estimate PPV values for *Z* = 0, 1 at *t* = 7 and *v* = 0.1, 0.3, 0.5, 0.7, 0.9. We evaluate the performance of PPV* _{z}*(

Simulation Results: Estimates (Est.), Standard Errors (SE), Means of the Standard Error Estimators (
$\mathit{Mean}(\widehat{SE})$), and the 95% Coverage Probabilities (95% cov) under Cox Regression Models

We also investigate the robustness of the two estimators under two scenarios. We first consider a model of the form *λ*(*t*) = (2*Y* +1.5*Y*^{2} +3*Z*}/10, but fit a mis-specified Cox model ignoring the quadratic term for
${\stackrel{\sim}{\text{PPV}}}_{z}(t,v)$. In the second scenario, for the *i*th subject *Y _{i}* is from a mixture of the log-normal distribution:

Prostate cancer is a major health problem in aging men. The disease is unique in that it manifests in two forms with no clear distinction: an aggressive form, and an indolent form, which develops slowly over many years and does not pose a threat to the patient’s life. Many patients with indolent prostate cancer are subjected unnecessarily to invasive procedures that may lead to important side effects and significant morbidity. One of the goals of our prostate cancer research is to identify novel markers, used in combination with other existing markers and clinical factors, for improved predictions of prostate cancer prognosis. Since PSA has been used as a prognostic marker in the past, it is a crucial first step to evaluate the predictive performance of PSA-based criteria for treatment decisions in our population. Specifically, our cohort consists of 752 incident PC cases who were diagnosed at ages 40–64 years in 1993–1996 and who are followed for disease progression and survival. We include 671 patients with PSA measures prior to prostate cancer primary treatment in our analysis. By the time the data are analyzed, 124 patients are deceased and the median followup time among patients alive is 12 years. Pre-treatment PSA (logarithm transformed throughout this section) is a strong predictor for all-cause mortality, with an unadjusted hazard ratio of 1.58 [95% CI: 1.39, 1.8].

One question of interest is whether the predictive performance of PSA is affected by other characteristics such as tumor grade. For example, do PPV curves for PSA differ between groups of patients with low grade (i.e., Gleason score ≤ 6) and moderate to high grade (i.e., Gleason score ≥ 7)? Figure 1 shows the distributions of log-transformed pre-treatment PSA for both the low grade and moderate to high grade patients. Although the mean of PSA appears to be slightly higher for the high grade group, the two distributions are essentially overlapped. This implies that a positive test based on PSA would likely include both low and high grade patients. We use covariate-specific PPV curves to address this question, considering both estimators that are based on the ordinary Cox proportional hazards model or the smoothed Cox regression model. To calculate
${\stackrel{\sim}{\text{PPV}}}_{z}(t)$ we fit a proportional hazards model of the form *λ*(*t*|*Y, Z*) = *λ*_{0}(*t*)exp{*αlog*(PSA) + *βI*(grade=high)}. For
${\widehat{\text{PPV}}}_{z}(t)$ we fit a smoothed hazards model of the form *λ*(*t*|*Y, Z*) = *λ*_{0}{*t, log*(PSA)}*exp*{*βI*(grade=high)} with the smoothing bandwidth *h* = *cn*^{−1/3}, using the standard deviation of the logarithm of PSA for *c. F _{Y}*

Box plot shows the distributions of log-transformed PSA values from low grade (Gleason score ≤ 6), and moderate to high grade (Gleason score > 6) prostate cancer patients.

The estimated PPV and NPV for different groups at *t*=10 years and their estimated 95% confidence intervals and simultaneous confidence bands are shown in Tables 3 and and4,4, respectively. For comparison we also calculated the corresponding PPV and NPV using the nonparametric estimators introduced in Zheng et al. (2008), using the data from each covariate group separately. The estimator is the most flexible as it does not involve any parametric assumptions about the failure time and either PSA or Gleason score. The estimates from the two semiparametric approaches agree well with those from the nonparametric procedure, suggesting that the assumptions used in the semiparametric models work well in this sample. The efficiencies of the two semiparametric models are comparable; as expected, the nonparametric estimators are relatively less efficient. Since NPVs for both groups do not appear to vary much over the entire range of PSA values we focus presentation below only on PPV. The covariate-specific PPV curves for *v* ranging from 0.1 to 0.9 from the proposed procedures are displayed in Figure 7. It appears that for low grade PC patients the averaged 10-year PC mortality risk stays at a very low level across a wide range of PSA values used as positivity criteria for defining a positive test. The averaged risk for 10-year mortality among these whose PSA values are at the top 10% of this subpopulation is only about 6% (
${\widehat{\text{PPV}}}_{\text{low}\phantom{\rule{0.16667em}{0ex}}\text{grade}}(10,0.9)=0.06$, 95% CI: [0.01, 0.12]). In contrast, for the subpopulation of patients with moderate to high grade cancer, the PPV curve increases relatively more steeply, an indication that PSA is quite informative at identifying patients with elevated risk of death by 10 years in this subpopulation. For example, at *v* = 0.1, the corresponding
${\widehat{\text{PPV}}}_{\text{moderate}/\text{high}\phantom{\rule{0.16667em}{0ex}}\text{grade}}(10,0.1)$ is 0.15 (95% CI: [0.09, 0.18]), whereas
${\widehat{\text{PPV}}}_{\text{moderate}/\text{high}\phantom{\rule{0.16667em}{0ex}}\text{grade}}(10,0.9)$ equals 0.44 (95% CI: [0.28, 0.60]) at *v* = 0.9. In other words, PSA is a more useful marker in helping patients with moderate or high grade tumors to make treatment decisions. Such information may be helpful for urologists to determine how many patients are eligible for more aggressive treatment based on both PSA and Gleason scores. For example, if patients with a 10-year mortality risk of at least 30% are eligible for an aggressive therapy, we may opt to only administer the treatment to those with high grade cancer and PSA values in the top 30% of the subpopulation. In contrast, if only the crude PPV curve is considered (as illustrated in Figure 1), we would use the same PSA threshold for these two groups. Nevertheless, those eligible patients with low Gleason scores would in fact have a very low risk of death within 10 years and therefore would not really benefit from the treatment. We conclude that covariate-specific PPV curves in this case indeed lead to more informed treatment decisions.

Estimates (95% confidence intervals) and [95% confidence bands] of PPV_{z}(t, v) at t = 10 years with various sample percentile (v) in the PC SPORE study.

When applying novel biomarkers into routine standard care, it is important to consider risk threshold to ensure that medical decisions are satisfying at the individual patient level. It is also important to identify factors that influence the performance of a biomarker in order to determine the optimal or suboptimal conditions or populations for test performance. By understanding the effects of the biomarker on different subgroups, intervention can be targeted for those individuals most likely to benefit, at a reasonable expense with nominal risk of complications. The covariate-specific time-dependent PPV curve provides a novel way to carry out this task.

We develop a regression model framework to accommodate the covariate effect when quantifying time-dependent predictive accuracies. Existing methods often fail to account for the fact that risk for disease occurrence and progression is likely to change over time. Such a limitation is critical given that model checking for failure time data has not become a routine practice. By adopting a smoothed survival time regression technique, we provide estimating procedures that will be broadly applicable to many biomarker study settings. Our procedures are simple yet meaningful for clinical practice, amenable to the complex covariate structure, and flexible in its assumption about the underlying model and censoring mechanism. Other flexible modeling approaches can also be considered. One natural choice is the Cox model with time-varying coefficients, which offers greater robustness than the proportional hazards model while retaining the loglinear relationship between the baseline hazard and the marker. This model can be applied in a parametric version, going back to Cox (1972), or in a nonparametric version, as described by Tian et al. (2005) and other authors. General model checking procedures can be helpful for selecting a parsimonious model.

The proposed covariate-specific PPV curve is developed for the setting where a test is administered and subjects are classified into being tested positive and negative based on
$Y\ge {F}_{\mathbf{z}}^{-1}(v)$. Therefore the assumption of a monotonic function for *P*(*T* < *t*|*Y* = *y*, **Z** = **z**) is imposed to reflect the rationale for classifying subjects as short term or long term survivors based on the binary classification rule. In practice, to ensure the monotonicity of the estimated risk function when the underlying true risk is expected to be increasing, one may construct such a monotone estimate by considering various isotonic regression techniques (Friedman and Tibshirani (1984); Bloch and Silverman (1997); Hall and Huang (2001)). However inference procedure may need to be modified to reflect the constrained inference. A simpler remedy is to replace * _{y}*(

Research on time-dependent PPV curves can be extended in a number of directions. First, once factors that may have a substantial impact on the accuracy of a clinical prediction are identified, a natural next step might be seeking a new algorithm for incorporating other factors by combining multiple factors into a new test. In our PC application it is of great interest to decide whether new markers (e.g., SNPs) provide significant improvement in predictive accuracy when combined with standard variables (pre-operative PSA, Gleason score, clinical stage) for predicting PC mortality. Statistical procedures for deriving and evaluating a new combinatory rule for improved predictive accuracy for this setting are yet to be developed. Second, the analysis of time-to-event data is often complicated by the presence of competing risk. In the prostate cancer example, invasive procedures are only necessary for individuals with aggressive forms of the disease (i.e., those who are at risk of dying *from* rather than dying *with* prostate cancer). Therefore the most important outcome when considering PC-related treatment is PC-specific death. Deaths due to other causes are competing risk events. Characterizing the predictive accuracy of a marker on different causes of failures, might allow more rational or cost-effective use of specific medical or surgical treatment strategies. Statistical methods that accommodate competing risk events are subjects of current development.

Estimated PPV curves (left panel:
${\stackrel{\sim}{\text{PPV}}}_{\mathbf{z}}(t,v)$; right panel:
${\widehat{\text{PPV}}}_{\mathbf{z}}(t,v)$) and 95% confidence intervals and bands for *v* (0.10, 0.90) and at *t* = 10 years after diagnosis in PC study. Solid lines: estimates of PPV_{z}(*t*, *v*); dotted lines: nonparametric **...**

The authors thank Dr. David Zucker for several suggestions that have greatly improved the article. This research was supported by grant U01-CA86368, P01-CA053996 and R01-GM079330 awarded by the National Institutes of Health.

Throughout we assume that the joint density of *T*, *C* and *Y* is twice continuously differentiable and **W** = (*Y*, **Z**^{T})^{T} belongs to a compact set Ω** _{W}**. We consider

$$\underset{y}{sup}\mid {n}^{1/2}\{{\widehat{F}}_{Y\mid \mathbf{z}}(y)-{F}_{Y\mid \mathbf{z}}(y)\}-{n}^{-{\scriptstyle \frac{1}{2}}}\sum _{i=1}^{n}{\mathcal{P}}_{i}(y,\mathbf{z})\mid ={o}_{p}(1),$$

(A.1)

where (*y*, **z**) = (*y*, **z**, *X _{i}*, Δ

We require the same conditions as specified in Dabrowska (1997) and the bandwidth *h* is chosen such that *nh*^{2} → ∞ and *nh*^{4} → 0 as *n* → ∞. It follows from Dabrowska (1997) that sup_{t}_{,}* _{y}*|

The uniform convergence of _{y}_{,}** _{z}**(

$${\widehat{\mathcal{W}}}_{\mathbf{z}}(t,v)={n}^{{\scriptstyle \frac{1}{2}}}\{{\widehat{\text{PPV}}}_{\mathbf{z}}(t,v)-\text{PPV}\mathbf{z}(t,v)\}=\{{\widehat{\mathcal{W}}}_{\mathbf{z}1}(t,v)+{\widehat{\mathcal{W}}}_{\mathbf{z}2}(t,v)\}/(1-v),$$

where ${\widehat{\mathcal{W}}}_{\mathbf{z}1}(t,v)={n}^{{\scriptstyle \frac{1}{2}}}{\int}_{{\widehat{c}}_{\mathbf{z}}(v)}^{\infty}\left\{{e}^{-{\widehat{\mathrm{\Lambda}}}_{y,\mathbf{z}}(t)}-{e}^{-{\mathrm{\Lambda}}_{y,\mathbf{z}}(t)}\right\}d{\widehat{F}}_{Y\mid z}(y)$ and

$${\widehat{\mathcal{W}}}_{\mathbf{z}2}(t,v)={n}^{{\scriptstyle \frac{1}{2}}}\left\{{\int}_{{\widehat{c}}_{\mathbf{z}}(v)}^{\infty}{S}_{y,\mathbf{z}}(t)d{\widehat{F}}_{Y\mid z}(y)-{\int}_{{c}_{\mathbf{z}}(v)}^{\infty}{S}_{y,\mathbf{z}}(t)d{F}_{Y\mid \mathbf{z}}(y)\right\}.$$

To approximate the distribution of (*t*, *v*), we again invoke Lemma A.3 of Bilias, Gu and Ying (1997) and use the fact that sup_{t}_{,}* _{y}*|

$${\widehat{\mathcal{W}}}_{\mathbf{z}1}(t,v)=-{n}^{{\scriptstyle \frac{1}{2}}}{\int}_{{c}_{\mathbf{z}}(v)}^{\infty}{S}_{y,\mathbf{z}}(t)\left\{{\widehat{\mathrm{\Lambda}}}_{y,\mathbf{z}}(t)-{\mathrm{\Lambda}}_{y,\mathbf{z}}(t)\right\}d{F}_{Y\mid \mathbf{z}}(y)+{o}_{p}(1).$$

Now, it follows from the asymptotic expansions for _{y}_{,}** _{z}**(

$${\widehat{\mathrm{\Lambda}}}_{y,\mathbf{z}}(t)-{\mathrm{\Lambda}}_{y,\mathbf{z}}(t)\simeq {n}^{-1}\sum _{i=1}^{n}\{{\mathcal{B}}_{\mathbf{z}}(t,y){\mathcal{A}}_{i}+{K}_{h}({Y}_{i}-y){M}_{y,\mathbf{z}}(t,{X}_{i},{\mathrm{\Delta}}_{i},{\mathbf{Z}}_{i})\}+{o}_{p}({n}^{-{\scriptstyle \frac{1}{2}}})$$

where *A _{y}*(

$${\widehat{\mathcal{W}}}_{\mathbf{z}1}(t,v)\simeq -{n}^{-{\scriptstyle \frac{1}{2}}}\sum _{i=1}^{n}{\int}_{{c}_{\mathbf{z}}(v)}^{\infty}{S}_{y,\mathbf{z}}(t)[{K}_{h}(y-{Y}_{i}){M}_{y,\mathbf{z}}(t;{X}_{i},{\mathrm{\Delta}}_{i})+{\mathcal{B}}_{\mathbf{z}}(t,y){\mathcal{A}}_{i}]{F}_{Y\mid \mathbf{z}}^{\prime}(y)dy$$

Now, by a change variable
$\psi ={\scriptstyle \frac{y-{Y}_{i}}{h}}$ and assuming that *nh*^{4} = *o _{p}*(1),

$$\begin{array}{l}{n}^{-{\scriptstyle \frac{1}{2}}}{h}^{-1}\sum _{i=1}^{n}{\int}_{{c}_{\mathbf{z}}(v)}^{\infty}K\left(\frac{y-{Y}_{i}}{h}\right){S}_{y,\mathbf{z}}(t){F}_{Y\mid \mathbf{z}}^{\prime}(y){M}_{y,\mathbf{z}}(t;{X}_{i},{\mathrm{\Delta}}_{i},{\mathbf{Z}}_{i})dy\\ ={n}^{-{\scriptstyle \frac{1}{2}}}\sum _{i=1}^{n}I({Y}_{i}\ge {c}_{\mathbf{z}}(v)){S}_{{Y}_{i},\mathbf{z}}(t){F}_{Y\mid \mathbf{z}}^{\prime}({Y}_{i}){M}_{{Y}_{i},\mathbf{z}}(t;{X}_{i},{\mathrm{\Delta}}_{i},{\mathbf{Z}}_{i})+{o}_{p}(1)\end{array}$$

Therefore, ${\widehat{\mathcal{W}}}_{\mathbf{z}1}(t,v)=-{n}^{-{\scriptstyle \frac{1}{2}}}{\sum}_{i=1}^{n}{\xi}_{i1}(t,v,\mathbf{z})+{o}_{p}(1)$, where

$${\xi}_{i1}(t,v,\mathbf{z})=I({Y}_{i}\ge {c}_{\mathbf{z}}(v)){S}_{{Y}_{i},\mathbf{z}}(t){F}_{Y\mid \mathbf{z}}^{\prime}({Y}_{i}){M}_{{Y}_{i},\mathbf{z}}(t;{X}_{i},{\mathrm{\Delta}}_{i},{\mathbf{Z}}_{i})+{\mathcal{A}}_{i}{\int}_{{c}_{\mathbf{z}}(v)}^{\infty}{S}_{y,\mathbf{z}}(t){\mathcal{B}}_{\mathbf{z}}(t,y){F}_{Y\mid \mathbf{z}}^{\prime}(y)dy.$$

On the other hand, the process (*t*, *v*) can be approximated by
${n}^{-{\scriptstyle \frac{1}{2}}}{\sum}_{i=1}^{n}{\zeta}_{i2}(t,v,\mathbf{z})$ as for (*t*, *v*). Hence,
${\widehat{\mathcal{W}}}_{\mathbf{z}}(t,v)={n}^{-{\scriptstyle \frac{1}{2}}}{\sum}_{i=1}^{n}{\xi}_{i}(t,v,\mathbf{z})+{o}_{p}(1)$, where *ξ _{i}*(

Web Appendices referenced in Sections 4 and Appendix are available under the Paper Information link at the Biometrics website http://www.biometrics.tibs.org.

- Bloch DA, Silverman BW. Monotone discriminant functions and their applications in rheumatology. Journal of the American Statistical Association. 1997;92:144–153.
- Cai T, Pepe MS. Semiparametric receiver operating characteristic analysis to evaluate biomarkers for disease. Journal of the American Statistical Association. 2002;97:1099–1107.
- Cai T, Tian L, Wei L. Semiparametric Box-Cox power transformation models for censored survival observations. Biometrika. 2005;92:619–32.
- Cox DR. Regression models and life-tables (with discussion) Journal of the Royal Statistical Society, Series B, Methodological. 1972;34:187–220.
- Dabrowska DM. Smoothed Cox regression. The Annals of Statistics. 1997;25:1510–1540.
- Etzioni R, Pepe M, Longton G, Hu C, Goodman G. Incorporating the time dimension in receiver operating characteristic curves: A case study of prostate cancer. Medical Decision Making. 1999;19:242–251. [PubMed]
- Friedman J, Tibshirani R. The monotone smoothing of scatterplots. Technometrics. 1984;26:243–250.
- Hall P, Huang LS. Nonparametric kernel regression subject to monotonicity constraints. The Annals of Statistics. 2001;29:624–647.
- Heagerty P, Zheng Y. Survival model accuracy and roc curves. Biometrics. 2005;61:92–105. [PubMed]
- Heagerty PJ, Lumley T, Pepe MS. Time-dependent ROC curves for censored survival data and a diagnostic marker. Biometrics. 2000;56:337–344. [PubMed]
- Heagerty PJ, Pepe MS. Semiparametric estimation of regression quantiles with application to standardizing weight for height and age in US children. Applied Statistics. 1999;48:533–551.
- Lin DY, Sun W, Ying Z. Nonparametric estimation of the gap time distributions for serial events with censored data. Biometrika. 1999;86:59–70.
- Moskowitz C, Pepe M. Quantifying and comparing the accuracy of binary biomarkers when predicting a failure time outcome. Statistics in Medicine. 2004a;23:1555–1570. [PubMed]
- Moskowitz C, Pepe M. Quantifying and comparing the predictive accuracy of continuous prognostic factors for binary outcomes. Biostatistics. 2004b;5:113–127. [PubMed]
- Park Y, Wei LJ. Estimating subject-specific survival functions under the accelerated failure time model. Biometrika. 2003;90:717–23.
- Peng L, Huang Y. Survival analysis with temporal covariate effects. Biometrika. 2007;94:719–733.
- Pepe M, Zheng Y, Jin Y, Huang Y, Parikh C, Levy W. Evaluating the ROC performance of markers for future events. Lifetime Data Analysis. 2008;14:86–113. [PubMed]
- Pollard D. Empirical processes: theory and applications. Institute of Mathematical Statistics; 1990.
- Tian L, Zucker D, Wei L. On the Cox model with time-varying regression coefficients. Journal of the American Statistical Association. 2005;100:172–183.
- Zheng Y, Cai T, Pepe M, Levy W. Time-dependent predictive values of prognostic biomarkers with failure time outcome. Journal of the American Statistical Association. 2008;103:362–368. [PMC free article] [PubMed]
- Zheng Y, Heagerty P. Partly conditional survival model for longitudinal data. Biometrics. 2005;61:379–391. [PubMed]

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |