Search tips
Search criteria 


Logo of nihpaAbout Author manuscriptsSubmit a manuscriptHHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal;
Lifetime Data Anal. Author manuscript; available in PMC 2014 April 1.
Published in final edited form as:
PMCID: PMC3633735

Subgroup specific incremental value of new markers for risk prediction


In many clinical applications, understanding when measurement of new markers is necessary to provide added accuracy to existing prediction tools could lead to more cost effective disease management. Many statistical tools for evaluating the incremental value (IncV) of the novel markers over the routine clinical risk factors have been developed in recent years. However, most existing literature focuses primarily on global assessment. Since the IncVs of new markers often vary across subgroups, it would be of great interest to identify subgroups for which the new markers are most/least useful in improving risk prediction. In this paper we provide novel statistical procedures for systematically identifying potential traditional-marker based subgroups in whom it might be beneficial to apply a new model with measurements of both the novel and traditional markers. We consider various conditional time-dependent accuracy parameters for censored failure time outcome to assess the subgroup-specific IncVs. We provide non-parametric kernel-based estimation procedures to calculate the proposed parameters. Simultaneous interval estimation procedures are provided to account for sampling variation and adjust for multiple testing. Simulation studies suggest that our proposed procedures work well in finite samples. The proposed procedures are applied to the Framingham Offspring Study to examine the added value of an inflammation marker, C-reactive protein, on top of the traditional Framingham risk score for predicting 10-year risk of cardiovascular disease.

Keywords: Incremental value, Partial area under the ROC curve, Prognostic accuracy, Risk prediction, Subgroup analysis, Time dependent ROC analysis

1 Introduction

Risk models have been applied in medical practice for prediction of long-term incidence or progression of many chronic diseases such as cardiovascular disease (CVD) and cancer. With the advancement in science and technology, a wide range of biological and genomic markers have now become available to assist in risk prediction. However, due to the potential financial and medical costs associated with measuring these markers, their ability in improving the prediction of disease outcomes and treatment response over existing risk models needs to be rigorously accessed.

Effective statistical tools for evaluating the incremental value (IncV) of the novel markers over the routine clinical risk factors are crucial in the field of outcome prediction. Many of newly discovered markers, while promising and strongly associated with clinical outcomes, may have limited capacity in improving risk prediction over and above routine clinical variables (Tice et al. 2005; Wacholder et al. 2010). For example, on top of traditional risk variables from the Framingham risk score (FRS) (Wilson et al. 1998), the inflammation biomarker, C-reactive protein (CRP), was shown to provide modest prognostic information (Cook et al. 2006; Blumenthal et al. 2007; Ridker et al. 2007) while a genetic risk score consisting of 101 single nucleotide polymorphisms was reported as not useful (Paynter et al. 2010). In a recent paper, Wang et al. (2006) concluded that almost all new contemporary biomarkers for prevention of coronary heart disease added rather moderate overall predictive values to the FRS.

One possible explanation for the minimal improvement at the population average level is that the new markers may only be useful for certain subpopulations. For example, while much debate about the clinical utility of CRP remains, there is empirical evidence that CRP may substantially improve the prediction for subjects at intermediate risk (Ridker 2007). Such finding, if valid, would be extremely useful in clinical practice, since identifying the subgroups where markers can provide valuable improvement in prediction will not only lead to more informed clinical decisions but also reduced cost and effort compared to measuring novel markers on the entire population. However, to ensure the validity of such claims and more precisely pinpoint such specific subgroups, rigorous and systematic analytical tools for IncV evaluation are needed.

To quantify the global IncV of new markers for risk prediction, various approaches have been advocated. For example with the most popular one being focused on a comparison of summary measures of accuracy under a conventional and new models respectively (Heagerty and Zheng 2005; Uno et al. 2007; Cai and Cheng 2008). Excellent discussions on the choices of different accuracy measures can be found in Gail and Pfeiffer (2005). However, these measures quantify the overall IncV of new markers averaged over the entire study population and do not provide information on how the IncV may vary across different groups of subjects. If there are pre-defined subgroups, these measures could be estimated for each of the subgroups. However, in practice, it is often unclear how to optimally select subgroups for comparisons and ad-hoc subgroup analyses without careful planning and execution may lead to invalid results (Rothwell 2005; Pfeffer and Jarcho 2006; Wang et al. 2007). Furthermore, it is vitally important to adjust for multiple comparisons when conducting any subgroup analysis. Thus, an important question is how to systematically identify the potential subgroups who would benefit from the additional markers properly adjusting for multiple comparisons. There is a paucity of statistical literature on approaches for identifying such subgroups (D’Agostino 2006). Tian et al. (2009) proposed an inference procedure to estimate the IncVs in absolute prediction error of new markers in various subgroups of patients classified by the conventional markers. However, their method does not incorporate censoring. In addition, the subgroups in their paper were defined as groups of subjects whose conventional risk scores lie in different pre-assigned intervals. However, how to determine the length of intervals could be an issue. Uno et al. (2011b) proposed estimation procedures for the conditional quantiles of the improvement in the predicted risk separately for the cases and the controls. However, they did not provide procedures for determining which subgroups should be recommended to have the new markers measured. Furthermore, no procedures were provided to account for the sampling variation or control overall type I error which is particularly important in subgroup analysis.

In this paper, we propose systematic approaches to analyzing censored event time data for identifying subgroups of patients for whom the new markers have the most or least IncV. We consider two common accuracy measures, the partial area under the ROC curve (pAUC) and the integrated discrimination improvement (IDI) index. Compared with the standard C-statistic, for many applications, the pAUC is often advocated as a better summary measure (Dwyer 1996; Dodd and Pepe 2003; Cai and Dodd 2008), since clinical interests often lie only in a specific range of the false positive rates (FPRs) or true positive rates (TPRs). For example, the region with low FPR is of more concern for disease screening (Baker and Pinsky 2001); while the region with high TPR is of more concern for the prognosis of serious disease (Jiang et al. 1996). However, the ROC curve does not capture certain aspects of the predicted absolute risk, since it is scale invariant. Many model performance measures, including the reclassification table (Cook and Ridker 2009), net reclassification improvement (NRI) and IDI (Pencina et al. 2008), proportion of case followed (PCF) and proportion needed to follow-up (PNF) (Pfeiffer and Gail 2010), have been proposed recently to overcome the limitation of the ROC curve method. Many of these measures, such as the reclassification table, NRI, PCF and PNF, rely on pre-specified clinically meaningful risk or quantile threshold values which may not be available for most diseases. For illustration purposes, we focus primarily on pAUC and IDI in this paper but note that our procedures can be easily extended to accommodate other accuracy measures.

The rest of paper is organized as follows. In Sect. 2, we present our proposed non-parametric estimation procedure for subgroup-specific IncV of new markers and along with their corresponding interval estimation procedures. In particular, resampling based simultaneous interval estimation procedures are provided as convenient and effective tools to control for multiple comparisons. We describe results from our simulation studies in Sect. 3 and the analyses of the Framingham Offspring Study using our proposed procedures in Sect. 4. Concluding remarks are given in Sect. 5. All the technical details are included in the appendices.

2 Methods

2.1 Risk modeling with and without new markers

Let X denote a set of conventional markers and let Z denote a set of new markers. Due to censoring, for the event time T, one can only observe T = min(T, C), Δ = I (TC), where C is the censoring time, which is assumed to be independent of T conditional on (X, Z). See below for more discussions about censoring assumptions. Furthermore, define Y = I (Tt0), where t0 is the prediction time of clinical interest, and Y = I (Tt0). Let equation M1 and equation M2 be the true conditional risk of developing the event by time t0 conditional on X only and (X, Z), respectively. Suppose a data set for analysis consists of n independent realizations of (T, Δ, X, Z), {(Ti, Δi, Xi, Zi)}. Although Y and the conditional risk functions depend on t0, we suppress t0 from the notation throughout for the ease of presentation. From the Neyman–Pearson Lemma and similar arguments as given in McIntosh and Pepe (2002), it is not difficult to show that equation M3 achieves the optimal ROC curve for predicting Y based on X only. Similarly, equation M4 is the optimal score for prediction Y given (X, Z).

To estimate equation M5 and equation M6, one may consider a fully non-parametrical approach (Li and Doss 1995). However, in practice, such non-parametric estimates may perform poorly when the dimension of X or Z is not small due to the curse of dimensionality (Robins and Ya’Acov 1997). An alternative feasible way is approximate equation M7 and equation M8 by imposing simple working models

equation M9

where V, a p × 1 vector, is a function of X, W, a q × 1 vector, is a function of X and Z, β and γ are vectors of unknown regression parameters, and g1 and g2 are known, smooth, increasing functions. An estimator of β and γ can be obtained respectively by solving the following inverse probability weighted (IPW) estimating equations as given in Uno et al. (2007):

equation M10

where equation M11, and equation M12 is a root-n consistent estimator of GX,Z(t) = pr(Ct|X, Z). This IPW estimator may be justified heuristically the argument that equation M13, for y = 0, 1. Let equation M14 and equation M15 be the resulting estimator of β and γ, respectively. For a subject with X = x, Z = z whose V = v and W = w, the risk is estimated by equation M16 based on X alone and by equation M17 based on X and Z. It has been previously shown in Uno et al. (2007) that regardless of the adequacy of the working model (1), equation M18 converges in probability to a deterministic vector equation M19 as n → ∞. Let equation M20 and equation M21. When the models in (1) are correctly specified, equation M22 and equation M23. To obtain a valid estimator equation M24, one may impose a semi-parametric model, such as the proportional hazards (PHs) model (Cox 1972), for GX,Z(t) and obtain equation M25 as equation M26, where Wc is a function of (X, Z), equation M27 is the maximum partial likelihood estimator and equation M28 is the Breslow’s estimator. When the censoring is independent of both T and (X, Z), one may obtain equation M29 simply as the Kaplan–Meier estimator. It is important to note that if the in (1) only hold for a given t0 and the dimension of (X, Z) is not small, root-n consistent estimators of β and γ may not exist without imposing additional modeling assumptions about GX,Z(t) due to the curse of dimensionality (Robins and Ya’Acov 1997).

2.2 Subgroup specific IncVs

For illustration purposes, we consider two accuracy measures, the pAUC and the IDI index. We first define both concepts in the context of evaluating a risk score/model. Suppose that we use equation M30 as a risk score for classifying the event status Y, and without loss of generality, we assume that a higher value of equation M31 is associated with a higher risk and refer to the two states, Y = 1 and Y = 0, as “diseased” and “disease-free” or “cases” and “controls”. The discrimination capacity of equation M32 can be quantified based on the ROC curve, which is a plot of the TPR function, equation M33, against the FPR function, equation M34. The ROC curve, equation M35 describes the inherent capacity of distinguishing “cases” from “controls”. The pAUC with a restricted region of FPR, say FPR ≤ f, is given by equation M36, for f [set membership] [0, 1]. The IDI index, is simply equation M37.

To evaluate how the IncV of Z may vary across subgroups defined by X, we define new conditional pAUC and IDI index. We propose to use equation M38 as a scoring system for grouping subjects with potentially similar initial risk estimates and create subgroups [upsilon]s = {X : [p with macron]1(X) = s}. Then we evaluate the IncV of Z for each [upsilon]s based on how well equation M40 can further discriminate subjects within [upsilon]s with Y = 1 from those with Y = 0. Suppose equation M41 is used to classify Y for subjects in [upsilon]s. The TPR and FPR of the classification rule equation M42 given [upsilon]s equation M43 and equation M44, respectively, where

equation M45

Conditional on equation M46, the ROC curve of equation M47 is equation M48, for u [set membership] [0, 1]. The conditional pAUC is given by equation M49, f [set membership] [0,1]. Note that f = 1 yields conditional AUC(s. If Z is non-informative for [upsilon]s, the corresponding ROC curve would be a diagonal line, and we expect that pAUCs = f2/2, which is the area under a diagonal line. Thus, the subgroup [upsilon]s specific IncV of Z with respect to (wrt) the pAUC is given by pAUCf(s) − f2/2. The IDI index conditional on equation M50 is given by

equation M51

If Z is non-informative for this subgroup [upsilon]s, the conditional IDI index would be 0, and therefore, the subgroup [upsilon]s specific IncV of Z wrt the IDI index is IDI(s). Based on these subgroup-specific IncVs, we are able to identify the set of s such that Z is useful to improve the prediction accuracy for [upsilon]s, which is referred to as the effective subpopulation [upsilon]* in our paper. Specifically, the effective subpopulation wrt pAUC is defined as [upsilon]* = {X : pAUCf([p with macron]1(X))−f2/2>c1}; the effective subpopulation wrt IDI are defined as [upsilon]* = {X : IDI([p with macron]1(X))>c2ý, where c1 and c2 are some possibly data dependent threshold values. For the subjects in the effective subpopulation, measurement of new markers would provide added accuracy to the conventional risk model.

2.3 Inference about subgroup-specific IncVs

We first discuss the estimation for the conditional TPR and FPR functions equation M54 since both pAUCf(s) and IDI(s) are simple functionals of these two functions. Let equation M55 and equation M56. To obtain a consistent estimator of equation M57, since equation M58 is between 0 and 1, we consider a non-parametric local likelihood estimation method (Tibshirani and Hastie 1987) along with IPW accounting for censoring. Specifically, we obtain equation M59 as the solution to the IPW local likelihood score equation,

equation M60

where equation M61, g(x) = exp(x)/{1 + exp(x)}, Kh(x) = K (x/h)/h, K(·) is a known smooth symmetric kernel density function with a bounded support, and the bandwidth hy > 0 is assumed to be O(nν), for 1/5 < ν < 1/2, and [var phi](·) is a known, non-decreasing transformation function that can potentially be helpful in improving the performance of the smoothed estimator (Wand et al. 1991; Park et al. 1997). Then, equation M62 can be estimated by equation M63 for y = 0, 1. In the Appendix A.1, we show that equation M64 in probability as n → ∞, uniformly in c [set membership] [0, 1] and equation M65 where [ρl, ρu] is a subset of the support of equation M66 and β0 is the limit of equation M67. As a special case, by setting b in (4) to 0, one may obtain a local constant estimator,

equation M68

2.4 Inference procedures for pAUCf(s)

Based on equation M69 can be estimated as

equation M70

where equation M71 and (h0, h1) is the pair of optimal band-widths for estimating equation M72 and equation M73, respectively. In the Appendix A.2, we show that equation M74 is uniformly consistent for pAUCf(s).

As a special case, when both X and Z are univariate, the ROC curve of equation M75 conditional on equation M76 is equivalent to the ROC curve of Z conditional on X since the ROC curve is scale invariant. A simple local constant IPW estimator of equation M77 is given by

equation M78

The resulting estimator of pAUCf(x) is

equation M79

where equation M80 is the estimated truncated placement value proposed by Cai and Dodd (2008).

It is difficult to directly estimate the variance of equation M81 since it involves unknown derivative functions. We propose a perturbation-resampling method to approximate the distribution of equation M82. This method has been widely used in survival analyses (see for example, Jin et al. 2001; Park and Wei 2003; Cai et al. 2005). To be specific, let Ξ = {ξi, i = 1, …, n} be n independent positive random variables following a known distribution with mean 1 and variance 1, and Ξ is independent of the data. For each set of Ξ, we first obtain β* and γ*, as the respective solutions to

equation M83

where equation M84 and equation M85 is the perturbed estimator of GX,Z (·) with Ξ being the weights. Let equation M86, equation M87, and equation M88. Subsequently, we obtain the perturbed counterpart of equation M89 as equation M90, where equation M91 is the solution to the perturbed score equation

equation M92

Then, the perturbed pAUC is given by, equation M93, where equation M94. In the Appendix A.3, we show that the unconditional distribution of equation M95 can be approximated by the conditional distribution of

equation M96

given the data. With the above resampling method, for any fixed equation M97, one may obtain a variance estimator of equation M98, equation M99, based on the empirical the variance of B realizations from (5). For any fixed equation M100 and α [set membership] (0, 1), a pointwise 100(1 − α)% confidence interval (CI) for pAUCf(s) can be constructed via equation M101, where cα is the 100(1 − α)th percentile of the standard normal distribution.

2.4.1 Inference for IDI(s)

Based on equation M102, we may obtain plug-in estimators for equation M103 and equation M104 respectively as

equation M105

Thus, IDI(s) can be estimated by equation M106. Similar to the derivations given in the Appendix A for equation M107, the asymptotic results for equation M108 can be directly used to establish the consistency and asymptotic normality for equation M109. In addition, the unconditional distribution of equation M110 can be approximated the conditional distribution of equation M111, given the data, where equation M112 and equation M113 is the perturbed counterpart of equation M114. The pointwise CIs for any fixed equation M115 are constructed in a similar way as the inference for pAUCf(s). As a special case, a kernel local constant estimator of IDI(s) is given by

equation M116

with the perturbed counterpart given by

equation M117

Selection of the optimal bandwidths for pAUCf(s) and IDI(s) is illustrated in the Appendix B.

2.4.2 Identifying the effective subpopulation

To identify the effective subpopulation, one may simultaneously assess the subgroup-specific IncV wrt a certain accuracy measure, denoted by equation M118, for example pAUCf(s) − f2/2 or IDI(s), over a range of s values by constructing simultaneous CI for equation M119. Unfortunately, the distribution of equation M120 does not converge as a process in s, as n → ∞. Thus, we cannot apply the standard large sample theory for stochastic processes to approximate the distribution of equation M121. Nevertheless, by the strong approximation argument and extreme value limit theorem (Bickel and Rosenblatt 1973), we show in the Appendix A.3 that a standardized version of the sup-statistic equation M122 converges in distribution to a proper random variable, where equation M123 denotes the variance estimator of equation M124. In practice, for large n, one can approximate the distribution of Γ based on realizations of equation M125, where equation M126 is the perturbed counterpart of equation M127. Therefore, a 100(1 − α) % simultaneous CI for equation M128 can be obtained as equation M129, where dα is the empirical 100(1 − α)th quantile of Γ*. Thus, to account for sampling variation and multiple testing, the effective subpopulation is chosen as equation M130 in real data analyses.

2.4.3 Test for heterogeneous IncV

Another question of interest is whether the subgroup-specific equation M131, for example pAUCf(s), is constant across different values of s over a certain interval [sl, su]. We define the average IncV over [sl, su] as

equation M132

where equation M133, and we define the relative subgroup-specific IncV over [sl, su] as equation M134. The point estimate of equation M135 is given by

equation M136

where equation M137. In addition, the unconditional distribution of equation M138 can be approximated by the conditional distribution of equation M139 given the data, where equation M140 with equation M141 as the perturbed counterpart of equation M142 and equation M143. The variance estimator equation M144 of equation M145 can be obtained from realizations of equation M146.

If the subgroup-specific IncV of Z is constant over [sl, su], i.e., equation M147 for equation M148 and equation M149 for s [set membership] [sl, su]. Testing whether the subgroup specific IncV is constant over [sl, su] is the equivalent to testing the null hypothesis equation M150 for s [set membership] [sl, su]. To adjust for multiple testing, we consider the standard version of the sup-statistic equation M151 where equation M152 is the statistic equation M153 under the null hypothesis H0. One may approximate the distribution of equation M154 based on realizations of equation M155. The empirical p value for testing the null hypothesis H0 can be obtained by equation M156, where equation M157 are B realizations of equation M158.

3 Simulation studies

To examine the finite sample properties of the proposed estimation procedure, we conduct a simulation study where the conventional marker X and the new marker Z are both univariate and jointly generated from a bivariate normal distribution

equation M159

In this simulation study, μX = μZ = 0, σX = 2 and σZ 0.5, and ρXZ 0.01. The failure time T, given the markers X and Z, is generated from an accelerated failure time model with a log-normal distribution for T, i.e., log T = h(X, Z) + [set membership], where [set membership] is a normal random variable with mean 0 and standard deviation σT = 1.5. In this simulation study, h(X, Z) is a linear model, i.e., h(X, Z) = −βX XβZ ZβXZXZ. We consider a practical situation where the new marker Z may make a major contribution to the underlying mechanism in contrast with the conventional marker X, although it may not be measured routinely. Thus, in this simulation study, we set βX = 0.01 and βZ = βX Z = 1. The censoring time C is generated from an exponential distribution with rate equation M160. A value of c0 ≈ 20 is chosen such that roughly 80% of the failure time is censored. A time point t0 ≈ 0.2 is set such that the proportion of the “cases” in the sample is approximately 20%.

We investigate the kernel local constant estimator for the conditional pAUCf with f = 0.1 representing a low FPR region and f = 1 representing the standard AUC. Since Z and log T are jointly normal conditional on X = x, it is straightforward to calculate the true values of pAUCf(x). We consider a relatively smaller sample size 1,000, a moderate sample size of 5,000 and a relatively larger sample size of 10,000. Both of the pAUC with FPR ≤ 0.1, i.e., pAUC0.1(x) and the full AUC are estimated at a sequence of values of X. For ease of computation, the pair of the bandwidths (h0, h1) for constructing the non-parametric estimate was fixed at (i) for the full AUC, (2.531, 2.102) for n = 1,000, (1.905,1.534) for n = 5,000, and (1.640, 1.380) for n = 10,000; (ii) for the pAUC0.1, (2.377,2.432) for n = 1,000, (1.843,2.361) for n = 5,000, and (1.507, 2.085) for n = 10,000. Here, equation M161 were chosen as the average of the bandwidths selected from 10 independent simulated datasets using the two-stage of five-fold cross-validation method described in the Appendix B and n−0.1 was multiplied to equation M162 to yield the final bandwidths used for simulation. In addition, the kernel function K (·) was chosen as the Epanechnikov kernel. Here, since we assume that the censoring time C is independent of both T and (X, Z), GX,Z(t) = G(t) is estimated by a Kaplan–Meier estimator.

The performance of the point estimates and pointwise 95% CIs obtained by the resampling method was assessed from 1,000 independent replicates. For all of these scenarios, the non-parametric estimators have substantially small biases, the estimated standard errors are close to their empirical counterparts, and empirical coverage levels are close to the nominal level. In Fig. 1, we summarize the performance of the point and interval estimates for pAUC0.1 with sample size 10,000. For this scenario, the empirical coverage probabilities of the 95% pointwise CIs range from 92.9 to 95.4%. The empirical coverage levels of the 95% simultaneous confidence bands for the standard AUC are 93.2% for n = 1,000, 93.3% for n = 5,000, and 94.5% for n = 10,000; the empirical coverage levels of the 95% simultaneous confidence bands for the pAUC0.1 are 93.3% for n = 1,000, 93.4% for n = 5,000, and 92.5% for n = 10,000.

Fig. 1
Performance of the point estimates, the standard error estimates and pointwise CIs for pAUC0.1 with sample size 10,000: (a) the true pAUC0.1(x) (solid) and the average point estimates (dashed) over 1,000 replicates, (b) the empirical standard error estimates ...

4 Example: the Framingham Offspring Study

The Framingham Offspring Study was established in 1971 with 5,124 participants who were monitored prospectively on epidemiological and genetic risk factors of CVD. Here, we use data from 1,687 female participants of which 261 have either died or experienced a CVD event by the end of follow-up period, and the 10-year event rate is 6%. The Framingham risk model, based on several clinical risk factors including age, systolic blood pressure, diastolic blood pressure, total cholesterol, high-density lipoprotein cholesterol, current smoking status and diabetes, is widely used in clinical settings but only with moderate accuracy for predicting the 10-year risk of CVD (Cook et al. 2006). The FRS is constructed as the weighted average of the risk factors in the Framingham risk model using β-coefficients given in Table 6 of Wilson et al. (1998). The risk estimates are obtained from the FRS through the transformation 1 − exp{− exp(·)}. The density plot of the risk estimates obtained from the FRS is shown in Fig. 2a. The overall gain in C-statistic by adding the CRP on top of FRS is 0.002 (from 0.776 to 0.778, with 95% CI (−0.005,0.01)). Note that a log transformation is applied on the CRP throughout the analysis. According to the Framingham risk model (Wilson et al. 1998) and the risk threshold values employed by the Adult Treatment Panel III of the National Cholesterol Eduction Program (Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults 2001), these 1,687 female participants may be classified into three risk groups: 1,462 as low risk (<10 %); 193 as intermediate risk (between 10% and 20%); 32 as high risk (>20 %). The IncVs wrt C-statistic are 0.00057 (with 95% CI (−0.012,0.013)) for the low risk group; 0.037 (with 95% CI (−0.054,0.13)) for the intermediate risk group; 0.034 (with 95% CI (−0.097,0.16)) for the high risk group. Note that the low risk group consists of about 87% of the entire cohort. Now we further classify the 1,462 patients of the low risk group into 10 finer subgroups with the length of the risk interval for each subgroup being 0.01, for example, 0–0.01, 0.01–0.02, and etc. The IncVs wrt C-statistics for these 10 subgroups of low risk as well as the intermediate and high risk groups with their 95% CIs are shown in Fig. 2b. This suggests that adding CRP on top of FRS may be most useful for the risk groups around 5%, which is also referred to as the intermedium low risk group in some literature.

Fig. 2
(a) The density estimates of the 10-year event risk calculated from the FRS. (b) The IncVs wrt C-statistics for the 10 subgroups of low risk as well as the intermediate and high risk groups with their 95% CIs

First, we investigate the IncV of the CRP over the FRS wrt AUC, pAUC0.1 and IDI in predicting the 10-year risk of CVD events among subgroups defined by the FRS. For the purpose of kernel smoothing, the transformation function [var phi](·) in the local likelihood score equation (4) is equation M163, where μX = −3.74 is the sample mean of the FRS and σ X = 1.35 is the sample standard deviation of the FRS, and Φ(x) is the cumulative distribution function of a standard normal distribution. Here we use local kernel constant estimates with Epanechnikov kernel. The optimal bandwidths equation M164 in [var phi]-scale are chosen via a 10-fold cross validation procedure: (0.117,0.393) for the standard AUC, (0.264,0.721) for pAUC0.1, and (0.018,0.273) for IDI. The point estimates along with the 95% pointwise and simultaneous CIs for the subgroup-specific IncV wrt AUC, pAUC0.1 and IDI are shown in Fig. 3. The point estimate for IDI is obtained via a cross-validation procedure to correct for biases due to overfitting. Based on the pointwise CIs of the subgroup-specific IncV wrt AUC, the addition of CRP appears to improve risk prediction for subjects with the FRS risk ranging from 0.028 to 0.096. The corresponding range is 0.008–0.148 when based on the CIs for the subgroup-specific IncV wrt pAUC0.1; 0.004–0.102 when based on the CIs for the subgroup-specific IncV wrt IDI. After controlling for the overall type I error, inclusion of CRP may significantly improve discrimination for subjects with the FRS risk ranging from 0.034 to 0.070 based on AUC; from 0.010 to 0.078 based on pAUC0.1; from 0.032 to 0.068 based on IDI. The IDI findings and the pAUC findings agree with each other. These results suggest that CPR might be useful to improve risk prediction among patients regarded as having low to moderate risk according to the FRS.

Fig. 3
The point estimates (solid line), and its 95% pointwise CIs (dashed lines) and the 95% simultaneous confidence bands (dark shaded region) for (I) the subgroup-specific IncV with respect to AUC, AUC(x) − 1/2; (II) the subgroup-specific IncV with ...

It is worth to note that the bandwidth selection procedure is not sensitive towards the choice of the number of folds in cross-validation. Using a five-fold cross-validation, the optimal bandwidths equation M165 are (0.121, 0.394) for the standard AUC, (0.238, 0.614) for pAUC0.1, and (0.016, 0.272) for IDI. The resulting point estimates and CIs are almost the same as the results with the bandwidths selected via a 10-fold cross validation procedure. In addition, for calculating the weights equation M166, the survival function G(·) of the censoring time C is estimated by a Kaplan-Meier estimator since in the study C is likely to be independent of both T and X, Z. In Sect. 2.1, we commented that if this independence assumption does not hold, we could still provide a correct estimate of G(·) via a semi-parametric model, for example a Cox PH model. Here, we also obtained the estimates of G(t0) via a Cox PH model, i.e., equation M167 where Wc consists of the FRS and the CRP. Based on the resulting weights equation M168, we obtained the point estimates and CIs for the subgroup-specific IncV wrt AUC, pAUC0.1 and IDI, which is presented in Fig. 4. The results are very similar to the results using Kaplan–Meier estimator of G(·), and therefore it implies that the independence assumption about the censoring time C is reasonable.

Fig. 4
The point estimates (solid line), and its 95% pointwise CIs (dashed lines) and the 95% simultaneous confidence bands (dark shaded region) for the subgroup-specific IncV with respect to AUC and pAUC0.1 as well as IDI. The results are based on the weights ...

We are also interested in testing whether the subgroup-specific IncV of the CRP over the FRS is constant over the values [0,0.4] of the risk estimates obtained from the FRS. The p values of testing for constant subgroup-specific IncV are 0.028 for AUC, 0.108 for pAUC0.1 and 0.002 for IDI. These results agree with Fig. 5, which shows the point estimates and simultaneous 95% CIs for the relative subgroup-specific IncV wrt AUC, pAUC0.1 and IDI. It shows that the subgroup-specific IncVs wrt AUC and IDI are not constant over the interval [0,0.4]; on the other hand,the subgroup-specific IncV wrt pAUC0.1 is constant over this interval. It is worth to note that the asymptotic variance of equation M169 is larger than that of equation M170, and therefore, the power of testing whether the subgroup-specific IncV is constant over a certain interval is not as strong as the power of testing whether the subgroup-specific IncV is above zero over the interval.

Fig. 5
The point estimates (solid line), and its 95% simultaneous confidence bands (dark shaded region) for the relative subgroup-specific IncVs, which are used to test for heterogeneous IncVs, with respect to AUC and pAUC0.1 as well as IDI over the interval ...

5 Concluding remarks

In this paper, we propose a non-parametric procedure to estimate the IncVs of new markers in prediction accuracy accross different subgroups defined by the conventional scoring system. We also provide the pointwise and simultaneous interval estimates via perturbation resampling. In addition, with proper adjustment for multiple subgroups comparison, our approach is able to systematically identify the subgroups which would benefit from adding new markers. Unlike global measures which do not provide information on how the IncV may vary across subgroups, our methods enables the identification of subgroups for which the new markers may or may not be useful. Existing procedures often assess subgroup-specific IncVs empirically. We provide more rigorous and systematic analytical tools to ensure the validity of such claims and more precisely pinpoint such specific subgroups.

Appropriate choice of prediction accuracy summaries is of great importance to capture the usefulness of new markers. It is also motived by primary research interests. Discrimination is one of the major components in assessing the accuracy of prediction models. The AUC is the most popular summary index which depicts inherent discrimination capacity. However, it is unable to capture how well the predicted risks agree with the actual observed risks (Gail and Pfeiffer 2005). In some cases, alternative summary measures should be also considered, for example, NRI, PCF and PNF. Our approach can be naturally extended to other metrics that maybe more appropriate for particular clinical applications.

The subgroup-specific TPR equation M171 and the subgroup-specific FPR equation M172 both depend on the time point t, which is usually pre-determined. In some applications, new biomarkers might produce relatively better long-term performance in prediction accuracy than short-term. It is straightforward to extend our procedure to different time points over an arbitrary time interval since the non-parametric estimates of the TPR and FPR, equation M173, converge to a Gaussian process in time t. We could estimate the overall improvement of new markers over a certain time interval by integrating the subgroup-specific pAUC and the subgroup-specific IDI index wrt time t. Furthermore, with properly adjusting for multiple comparison, it is possible to identify the time interval where new markers have the most IncVs for different subgroups.

Instead of focusing on the prediction of t-year survival for a fixed time point, we might be also interested in a global assessment of a fitted prediction model for the continuous event time. One example of such global measure is the C-statistic of the prediction score equation M174 (Harrell Jr et al. 1996; Korn and Simon 1990; Pencina and D’Agostino 2004). When the event time T is subject to right censoring which may have finite support [0, τ], one may consider a truncated C-statistic,

equation M175

as considered in Heagerty and Zheng (2005) and Uno et al. (2011a). It is straightforward to extend Cτ to our subgroup-specific C-statistic

equation M176

and construct an IPW kernel estimator for Cτ (s) as for other accuracy measures.


The Framingham Heart Study and the Framingham SHARe project are conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with Boston University. The Framingham SHARe data used for the analyses described in this manuscript were obtained through dbGaP (access number: phs000007.v3.p2). This manuscript was not prepared in collaboration with investigators of the Framingham Heart Study and does not necessarily reflect the opinions or views of the Framingham Heart Study, Boston University, or the NHLBI. The work is supported by Grants U01-CA86368, P01-CA053996, R01-GM085047, R01-GM079330, R01-AI052817 and U54-LM008748 awarded by the National Institutes of Health.

Appendix A

Let equation M177 and equation M178 denote expectation with respect to (wrt) the empirical probability measure of {(Ti, Δi, Xi, Zi), i = 1, …, n} and the probability measure of (T, Δ, X, Z}, respectively, and equation M179. We use equation M180 to denote equation M181 for any function equation M182, [similar, equals] to denote equivalence up to op(1), and [less, similar] to denote being bounded above up to a universal constant. Let β0 and γ0 denote the solution to equation M183 and equation M184, respectively. Let equation M185 and equation M186. Let ω = ΔI (Tt0)/GX, Z(T) + I (T > t0)/GX, Z(t0), equation M187 and equation M188. For y = 0, 1, let fy(c; s) denote the conditional density of equation M189 given equation M190 and equation M191 and we assumed that fy(c; s) is continuous and bounded away from zero uniformly in c and s. This assumption implies that ROC(u; s) has continuous and bounded derivative ROC(u; s) = [partial differential]ROC(u; s)/[partial differential]u. We assume that V and W are bounded, and equation M192, is continuously differentiable bounded derivatives and bounded away from zero. Throughout, the bandwidths are assumed to be of order n−ν with ν [set membership] (1/5, 1/2). For ease of presentation and without loss of generality, we assume that h1 = h0, denoted by h, and suppress h from the notations. Without loss of generality, we assume that equation M193. When C is assumed to be independent of both T and (X, Z), the simple Kaplan–Meier estimator satisfies this condition. When C depends on (X, Z), ĜX, Z obtained under the Cox model also satisfies this condition provided that Wc is bounded. The kernel function K is assumed to be symmetric, smooth with a bounded support on [−1, 1] and we let m2 = ∫ K (x)2dx.

A.1 Asymptotic expansions for equation M194

Uniform convergence rate for equation M195 We first establish the following uniform convergence rate of equation M196:

equation M197

To this end, we note that for any given c and s,

equation M198

is the solution to the estimating equation equation M199, where ζy = (ζay, ζby)’ and

equation M200

equation M201, equation M202 and equation M203. We next establish the convergence rate for equation M204, where

equation M205

We first show that

equation M206


equation M207

are both Op{(nh)−½ log(n)} where equation M208 and [ρl, ρu] is a subset of the support of equation M209. To this end, we note that since supu |ĜX, Z(u) − GX, Z(u)| = Op(n−½ and equation M210,

equation M211

This implies that

equation M212

where equation M213 is a class of functions indexed by β and e. By the maximum inequality of Van der Vaart and Wellner (1996), we have

equation M214

Together with the fact that equation M215 from Uno et al. (2007), it implies that equation M216. In addition, with the standard arguments used in Bickel and Rosenblatt (1973), it can be shown that

equation M217

Therefore, for h = nν, 1/5 < ν < 1/2,

equation M218

is Op{(nh)−½log(n)}. Following with similar arguments as given above, coupled with the fact that equation M219, we have

equation M220

Thus, equation M221. It follows from same arguments as given above that

equation M222

Therefore, equation M223. In addition, we note that 0 is the unique solution to the equation ψy(ζy; c, s) = 0 wrt ζy. It suggests that equation M224, which implies the consistency of equation M225,

equation M226

Asymptotic expansion for equation M227 Let equation M228. It follows from a Taylor series expansion and the convergence rate of ζy(c; s) that

equation M229

where equation M230. Futhermore, since suptto{ĜX, Z(t) − GX, Z(t)| = Op(n−1/2),

equation M231

We next show that equation M232 is asymptotically equivalent to

equation M233

where equation M234. From (8) and the fact that τ{y; [var phi](s)} is bounded away from 0 uniformly in s, we have

equation M235

where equation M236 is the class of functions indexed by γ, β and e. By the maximum inequality of Van der Vaart and Wellner (1996) and the fact that equation M237 from Uno et al. (2007), we have equation M238 and equation M239. It follows that equation M240. Then, by a delta method,

equation M241


equation M242

Using the same arguments as for establishing the uniform convergence rate of conditional Kaplan-Meier estimators (Dabrowska 1989; Du and Akritas 2002), we obtain (6). Furthermore, following similar arguments as given in Dabrowska (1987, 1997), we have equation M243 converges weakly to a Gaussian process in c for all s. Note that as for all kernel estimators, equation M244 does not converge as a process in s.

A.2 Uniform consistency of equation M245

Next we establish the uniform convergence rate for equation M246. To this end, we write

equation M247

where equation M248 and equation M249. It follows from (6) that equation M250. Let equation M251. Then equation M252. Noting that equation M253, we have equation M254 by the continuity and boundedness of ROC(u; s). Therefore,

equation M255

which implies

equation M256

and hence the uniform consistency of equation M257.

A.3 Asymptotic distribution of equation M258

To derive the asymptotic distribution for equation M259, we first derive asymptotic expansions for equation M260. From the weak convergence of equation M261 in c, the approximation in (9), and the consistency of equation M262 given in the Appendix A.2, we have

equation M263

On the other hand, from the uniform convergence of equation M264 and the weak convergence of equation M265 in c, we have

equation M266

This, together with a Taylor series expansion and the expansion given (9), implies that

equation M267

It follows that

equation M268


equation M269

It then follows from a central limit theorem that for any fixed s, equation M270 converges to a normal with mean 0 and variance

equation M271

where equation M272 is the density function of equation M273,

equation M274


equation M275

A.4 Justification for the resampling methods

To justify the resampling method, we first note that equation M276. It follows from similar arguments given in the Appendix A and Appendix 1 of Cai et al. (2010) that equation M277, where equation M278 is obtained by replacing all theoretical quantities in equation M279 given in (10) with the estimated counterparts for the ith subject. This, together with similar arguments as given above for the expansion of equation M280, implies that

equation M281

where equation M282 Conditional on the data, equation M283 is approximately normally distributed with mean 0 and variance

equation M284

Using the consistency of the proposed estimators along with similar arguments as given above, it is not difficult to show that the above variance converges to equation M285 as n → ∞. Therefore, the empirical distribution obtained from the perturbed sample can be used to approximate the distribution of equation M286.

We now show that after proper standardization, the supermum type statistics Γ converges weakly. To this end, we first note that, similar arguments as given in the Appendix A can be used to show that equation M287 and

equation M288

for some small positive constant δ. Using similar arguments in Bickel and Rosenblatt (1973), we have

equation M289

where an = [2 log{{ρuρl)/h}]½ and equation M290. Now justify the resampling procedure for constructing the CI, we note that

equation M291

where pr{sups[set membership]Ω(h)|nε*(s)| ≥ e | data} → 0 in probability. Therefore,

equation M292

where equation M293. It follows from similar arguments as given in Tian et al. (2005) and Zhao et al. (2010) that

equation M294

in probability as n → ∞. Thus, the conditional distribution of an(Γ* − dn) can be used to approximate the unconditional distribution of an(Γ − dn). When h0 = h1, in general, the standardized Γ does not converge to the extreme value distribution. However, when h0 = h1 = k [set membership] (0, ∞), the distribution of the suitable standardized version of Γ still can be approximated by that of the corresponding standardized Γ* conditional on the data (Gilbert et al. 2002).

Appendix B

B.1 Bandwidth selection for pAUCf(s)

The choice of the bandwidths h0 and h1 is important for making inference about equation M295 and consequently pAUCf(s). Here we propose a two-stage K-fold cross-validation procedure to obtain the optimal bandwidth for equation M296 and equation M297 sequentially. Specifically, we randomly split the data into K disjoint subsets of about equal sizes denoted by equation M298. The two-stage procedure is described as follows:

  1. Motivated by the fact that equation M299 is essentially the (1 − u)-th quantile of the conditional distribution of equation M300 given Y = 0 and equation M301, for each k, we use all the observations not in equation M302 to estimate equation M303 by obtaining equation M304, the minimizer of
    equation M305
    wrt (α0, α1), where ρτ(e) is a check function defined as ρτ(e) = τ e, if e ≥ 0; = (τ − 1)e, otherwise. Let equation M306 denote the resulting estimator of equation M307. With observation in equation M308, we obtain
    equation M309
    Then, we let equation M310.
  2. Next, to find an optimal h1 for equation M311, we choose an error function that directly relates to equation M312. Specifically, noting the fact that
    equation M313
    we use the corresponding mean integrated squared error for equation M314 as the error function. For each k, we use all the observations which are not in equation M315 to obtain the estimate of equation M316, denoted by equation M317 via (4). Then, with the observations in equation M318, we calculate the prediction error
    equation M319

We let equation M320.

Since the order of equation M321 is expected to be n−1/5 (Fan and Gijbels 1995), the bandwidth we use for estimation is equation M322 with 0 < d < 3/10 such that hy = n−ν with 1/5 < ν < 1/2. This ensures that the resulting functional estimator equation M323 with the data-dependent smooth parameter has the above desirable large sample properties.

B.2 Bandwidth selection for IDI(s)

Same as bandwidth selection for pAUC, we also propose a K-fold cross validation procedure to choose the optimal bandwidth h1 for equation M324 and h0 for equation M325 separately. The procedure is described as follows: we randomly split the data into K disjoint subsets of about equal sizes denoted by equation M326. Motivated by the fact (3), for each k, we use all the observations not in equation M327 to estimate equation M328 by obtaining equation M329 for y = 0, 1, which is the solution to the estimating equation

equation M330

wrt equation M331. Let equation M332 and equation M333. With observations in equation M334, we obtain

equation M335


equation M336

Then, we let equation M337 and equation M338.

Appendix C

R codes for application will be available from the corresponding author upon request.

Contributor Information

Qian M. Zhou, Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada, qmzhou/at/

Yingye Zheng, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA, yzheng/at/

Tianxi Cai, Department of Biostatistics, Harvard University, Boston, MA 02115, USA.


  • Baker S, Pinsky P. A proposed design and analysis for comparing digital and analog mammography: special receiver operating characteristic methods for cancer screening. J Am Stat Assoc. 2001;96:421–428.
  • Bickel P, Rosenblatt M. On some global measures of the deviations of density function estimates. Ann Stat. 1973;1:1071–1095.
  • Blumenthal R, Michos E, Nasir K. Further improvements in CHD risk prediction for women. J Am Med Assoc. 2007;297:641–643. [PubMed]
  • Cai T, Cheng S. Robust combination of multiple diagnostic tests for classifying censored event times. Biostatistics. 2008;9:216–233. [PubMed]
  • Cai T, Dodd LE. Regression analysis for the partial area under the ROC curve. Stat Sin. 2008;18:817–836.
  • Cai T, Tian L, Wei L. Semiparametric Box–Cox power transformation models for censored survival observations. Biometrika. 2005;92(3):619–632.
  • Cai T, Tian L, Uno H, Solomon S, Wei L. Calibrating parametric subject-specific risk estimation. Biometrika. 2010;97(2):389–404. [PMC free article] [PubMed]
  • Cook N, Ridker P. The use and magnitude of reclassification measures for individual predictors of global cardiovascular risk. Ann Intern Med. 2009;150(11):795–802. [PMC free article] [PubMed]
  • Cook N, Buring J, Ridker P. The effect of including C-reactive protein in cardiovascular risk prediction models for women. Ann Intern Med. 2006;145:21–29. [PubMed]
  • Cox D. Regression models and life-tables. J R Stat Soc B. 1972;34(2):187–220.
  • Dabrowska D. Non-parametric regression with censored survival time data. Scand J Stat. 1987;14(3):181–197.
  • Dabrowska D. Uniform consistency of the kernel conditional Kaplan–Meier estimate. Ann Stat. 1989;17(3):1157–1167.
  • Dabrowska D. Smoothed Cox regression. Ann Stat. 1997;25(4):1510–1540.
  • D’Agostino R. Risk prediction and finding new independent prognostic factors. J Hypertens. 2006;24(4):643–645. [PubMed]
  • Dodd L, Pepe M. Partial AUC estimation and regression. Biometrics. 2003;59:614–623. [PubMed]
  • Du Y, Akritas M. Iid representations of the conditional Kaplan–Meier process for arbitrary distributions. Math Methods Stat. 2002;11:152–182.
  • Dwyer AJ. In pursuit of a piece of the ROC. Radiology. 1996;201:621–625. [PubMed]
  • Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults Executive summary of the third report of the National Cholesterol Education Program (NCEP) Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults (Adult Treatment Panel III) J Am Med Assoc. 2001;285(19):2486–2497. [PubMed]
  • Fan J, Gijbels I. Data-driven bandwidth selection in local polynomial regression: variable bandwidth selection and spatial adaptation. J R Stat Soc B. 1995;57:371–394.
  • Gail M, Pfeiffer R. On criteria for evaluating models of absolute risk. Biostatistics. 2005;6(2):227–239. [PubMed]
  • Gilbert P, Wei L, Kosorok M, Clemens J. Simultaneous inferences on the contrast of two hazard functions with censored observations. Biometrics. 2002;58(4):773–780. [PubMed]
  • Harrell F, Jr, Lee K, Mark D. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med. 1996;15(4):361–387. [PubMed]
  • Heagerty P, Zheng Y. Survival model predictive accuracy and ROC curves. Biometrics. 2005;61:92–105. [PubMed]
  • Jiang Y, Metz C, Nishikawa R. A receiver operating characteristic partial area index for highly sensitive diagnostic tests. Radiology. 1996;201:745–750. [PubMed]
  • Jin Z, Ying Z, Wei L. A simple resampling method by perturbing the minimand. Biometrika. 2001;88(2):381–390.
  • Korn E, Simon R. Measures of explained variation for survival data. Stat Med. 1990;9(5):487–503. [PubMed]
  • An approach to nonparametric regression for life history data using local linear fitting. Ann Stat. 23:787–823.
  • McIntosh M, Pepe M. Combining several screening tests: optimality of the risk score. Biometrics. 2002;58(3):657–664. [PubMed]
  • Park Y, Wei L. Estimating subject-specific survival functions under the accelerated failure time model. Biometrika. 2003;9:717–723.
  • Park B, Kim W, Ruppert D, Jones M, Signorini D, Kohn R. Simple transformation techniques for improved non-parametric regression. Scand J Stat. 1997;24(2):145–163.
  • Paynter N, Chasman D, Pare G, Buring J, Cook N, Miletich J, Ridker P. Association between a literature-based genetic risk score and cardiovascular events in women. J Am Med Assoc. 2010;303(7):631–637. [PMC free article] [PubMed]
  • Pencina M, D’Agostino R. Overall C as a measure of discrimination in survival analysis: model specific population value and confidence interval estimation. Stat Med. 2004;23(13):2109–2123. [PubMed]
  • Pencina M, D’Agostino RS, D’Agostino RJ, Vasan R. Evaluating the added predictive ability of a new marker: From area under the ROC curve to reclassification and beyond (with Coomentaries & Rejoinder) Stat Med. 2008;27:157–212. [PubMed]
  • Pfeiffer R, Gail M. Two criteria for evaluating risk prediction models. Biometrics. 2010;67(3):1057–1065. [PMC free article] [PubMed]
  • Pfeffer M, Jarcho J. The charisma of subgroups and the subgroups of CHARISMA. N Engl J Med. 2006;354(16):1744–1746. [PubMed]
  • Ridker P. C-reactive protein and the prediction of cardiovascular events among those at intermediate risk: moving an inflammatory hypothesis toward consensus. J Am Coll Cardiol. 2007;49(21):2129–2138. [PubMed]
  • Ridker P, Rifai N, Rose L, Buring J, Cook N. Comparison of C-reactive protein and low-density lipoprotein cholesterol levels in the prediction of first cardiovascular events. N Engl J Med. 2007;347:1557–1565. [PubMed]
  • Robins J, Ya’Acov R. Toward a curse of dimensionality appropriate (CODA) asymptotic theory for semi-parametric models. Stat Med. 1997;16(3):285–319. [PubMed]
  • Rothwell P. Treating Individuals 1: External validity of randomised controlled trials: “To whom do the results of this trial apply?” Lancet. 2005;365:82–93. [PubMed]
  • Tian L, Zucker D, Wei L. On the Cox model with time-varying regression coefficients. J Am Stat Assoc. 2005;100(469):172–183.
  • Tian L, Cai T, Wei LJ. Identifying subjects who benefit from additional information for better prediction of the outcome variables. Biometrics. 2009;65:894–902. [PMC free article] [PubMed]
  • Tibshirani R, Hastie T. Local likelihood estimation. J Am Stat Assoc. 1987;82(398):559–567.
  • Tice J, Cummings S, Ziv E, Kerlikowske K. Mammographic breast density and the Gail model for breast cancer risk prediction in a screening population. Breast Cancer Res Treat. 2005;94(2):115–122. [PubMed]
  • Uno H, Cai T, Tian L, Wei L. Evaluating prediction rules for t-year survivors with censored regression models. J Am Stat Assoc. 2007;102:527–537.
  • Uno H, Cai T, Pencina M, D’Agostino R, Wei L. On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat Med. 2011a;30(10):1105–1117. [PMC free article] [PubMed]
  • Uno H, Cai T, Tian L, Wei LJ. Graphical procedures for evaluating overall and subject-specific incremental values from new predictiors with censored event time data. Biometrics. 2011b;67:1389–1396. [PMC free article] [PubMed]
  • Van der Vaart AW, Wellner JA. Weak convergence and empirical processes. Springer; New York: 1996.
  • Wacholder S, Hartge P, Prentice R, Garcia-Closas M, Feigelson H, Diver W, Thun M, Cox D, Hankinson S, Kraft P, et al. Performance of common genetic variants in breast-cancer risk models. N Engl J Med. 2010;362(11):986–993. [PMC free article] [PubMed]
  • Wand M, Marron J, Ruppert D. Transformation in density estimation (with comments) J Am Stat Assoc. 1991;86:343–361.
  • Wang T, Gona P, Larson M, Tofler G, Levy D, Newton-Cheh C, Jacques P, Rifai N, Selhub J, Robins S. Multiple biomarkers for the prediction of first major cardiovascular events and death. N Engl J Med. 2006;355:2631–2639. [PubMed]
  • Wang R, Lagakos S, Ware J, Hunter D, Drazen J. Statistics in medicine-reporting of subgroup analyses in clinical trials. N Engl J Med. 2007;357(21):2189–2194. [PubMed]
  • Wilson PW, D’Agostino RB, Levy D, Belanger AM, Silbershatz H, Kannel WB. Prediction of cornary heart disease using risk factor categories. Circulation. 1998;97:1837–1847. [PubMed]
  • Zhao L, Cai T, Tian L, Uno H, Solomon S, Wei L, Minnier J, Kohane I, Pencina M, D’Agostino R, et al. Harvard University Biostatistics Working Paper Series 2010: Working Paper 122. 2010. Stratifying subjects for treatment selection with censored event time data from a comparative study.