Cystic fibrosis (CF) is an inherited chronic disease that affects the lungs and digestive system. A defective gene and its protein product cause the body to produce unusually thick, sticky mucus that clogs the lungs and leads to life-threatening lung infections. The mucus also obstructs the pancreas and stops natural enzymes from helping the body break down and absorb food. The main culminating event that leads to death is acute pulmonary exacerbation, that is, lung infection requiring intravenous antibiotics.
The data for analysis are from the CF Registry, a database maintained by the CF Foundation, containing annually updated information on over 20 000 people diagnosed with CF and living in the United States. We are interested in the predictive information that FEV1, a measure of lung function, measured in 1995 provides about the occurrence of pulmonary exacerbation in 1996. There are 12 802 unique subjects in the data, of whom 5245 (41%) had at least 1 pulmonary exacerbation. Patients younger than 6 years are excluded. FEV1 is standardized for age, gender, and height (Knudson and others, 1983) by converting it to a percentage of the predicted value for healthy children, and it is negated to satisfy our assumption that increasing values are associated with increasing risk (see Moskowitz and Pepe, 2004, for more details). In order to apply our methodology, we simulated a nested case–control sample from the entire cohort by randomly selecting 500 individuals with and 500 individuals without pulmonary exacerbation in 1996.
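The sampling step described above can be sketched as follows. This is only an illustrative sketch: the function name `sample_case_control`, the seed, and the toy outcome vector are assumptions made for the example, and only the counts (5245 events among 12 802 subjects, 500 sampled per group) come from the text.

```python
import random

def sample_case_control(outcomes, n_cases=500, n_controls=500, seed=0):
    """Return (case indices, control indices) drawn at random from a cohort.

    outcomes: sequence of 0/1 flags, 1 = pulmonary exacerbation in 1996.
    The function name and the seed are illustrative, not from the paper.
    """
    rng = random.Random(seed)
    case_idx = [i for i, d in enumerate(outcomes) if d == 1]
    control_idx = [i for i, d in enumerate(outcomes) if d == 0]
    # Sample without replacement, separately within each outcome group.
    cases = rng.sample(case_idx, n_cases)
    controls = rng.sample(control_idx, n_controls)
    return sorted(cases), sorted(controls)

# Toy cohort with the case and non-case counts reported in the text.
# The ordering of subjects is arbitrary and purely illustrative.
cohort = [1] * 5245 + [0] * (12802 - 5245)
cases, controls = sample_case_control(cohort)
```

Because cases are oversampled relative to their population frequency, the event proportion in such a sample (0.5) differs from the population value of about 0.4, which is why the population value is used as the pretest probability later in the section.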
5.1. Prediction using FEV1
The accompanying figure displays the estimated log DLR curves for FEV1. Since the density ratio estimates performed so poorly in the simulation, we do not present them here. Observe that log DLR estimated using nonrank-invariant LR is linear in Y = FEV1 because we let Y enter the model as a linear term. The estimated placement value entered the rank-invariant LR model as a term of the form Φ−1(1 − U), with U replaced by its estimate. The binormal ROC–GLM model was employed. We see that the estimators are close and that their CIs are also similar.
Table 2 shows the log DLR values estimated at FEV1 = 100 and 40, approximately the third and first quartiles, respectively, of the population distribution of FEV1. The estimates derived using all 3 methods appear to be similar. The probability of having a pulmonary exacerbation is approximately 0.4 in the population, so for a subject whose measured FEV1 equals 100 the revised event probability is logit−1(logit 0.4 − 1.263) = 0.16 (95% CI = (0.13, 0.18)) using LR. Rank-invariant LR yielded a similar posttest risk probability of 0.15 (95% CI = (0.13, 0.18)), while the ROC–GLM estimator yielded 0.17 (95% CI = (0.14, 0.19)). These estimates and their associated CIs are almost identical. It appears that the chances are fairly low that a subject with FEV1 equal to 100 will have a pulmonary exacerbation in the following year.
Table 2. Estimates of the log DLR function at FEV1 equal to 100 and 40 in the CF Study. Shown in the table are the estimates and associated 95% bootstrap percentile CIs according to different estimation approaches.
Now, let us consider FEV1 = 40, which is approximately the 25th percentile of the population distribution of FEV1. The LR estimate of log DLR is 1.347 (95% CI = (1.152, 1.536)). The corresponding posttest disease probability is logit−1(logit 0.4 + 1.347) = 0.72 (95% CI = (0.68, 0.76)). Under the rank-invariant estimation approaches, the estimates of log DLR are 1.199 (95% CI = (1.022, 1.430)) and 1.246 (95% CI = (1.025, 1.537)) for the LR and ROC–GLM methods, respectively. The modified risk probabilities are therefore 0.69 (95% CI = (0.65, 0.74)) and 0.70 (95% CI = (0.65, 0.76)). Overall, the estimates of posttest risk are quite similar, and we conclude that for a patient whose FEV1 is 40, the chance of a pulmonary exacerbation in the following year is fairly high.
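The risk revision used in this subsection, posttest risk = logit−1(logit(pretest risk) + log DLR), can be reproduced with a few lines of code. The sketch below is only a check of the arithmetic: the function names are illustrative, and the inputs are the pretest probability of 0.4 and the log DLR point estimates quoted in the text.

```python
import math

def logit(p):
    """Log odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def expit(x):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))

def posttest_risk(pretest_risk, log_dlr):
    """Revise a pretest risk using a log DLR value."""
    return expit(logit(pretest_risk) + log_dlr)

pretest = 0.4  # approximate population probability of an event
examples = [
    ("FEV1 = 100, LR", -1.263),
    ("FEV1 = 40, LR", 1.347),
    ("FEV1 = 40, rank-invariant LR", 1.199),
    ("FEV1 = 40, ROC-GLM", 1.246),
]
for label, log_dlr in examples:
    print(f"{label}: {posttest_risk(pretest, log_dlr):.2f}")
```

A log DLR of 0 leaves the risk unchanged, negative values lower it, and positive values raise it, which is the sense in which the DLR quantifies how much the risk should be modified from baseline.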
5.2. Comparison between FEV1 and weight as predictors
We now turn to the use of DLR functions for comparing two risk prediction markers, FEV1 and weight. We have argued that the DLR function quantifies the predictive information in a marker since it quantifies how much the risk should be modified from baseline by knowing the marker value. A better marker should lead to a larger revision in the risk probability.
The issue in making DLR comparisons between markers is that the DLR is a function of the marker value, but raw values for one marker are not comparable with those for another. For example, should the DLR associated with an FEV1 value of 100 be compared with the DLR associated with a weight percentile of 50? Our proposal is to first standardize both markers using placement value standardization and then to make comparisons between the DLR functions. That is, we propose that comparisons between risk prediction markers be based on DLR(U(Y)), the rank-invariant DLR function. By transforming Y into U(Y), we are essentially saying that marker values are comparable when they are at the same quantile in their respective control distributions. For example, if we consider U(Y) ≤ 0.10 and find that DLR is substantially higher for FEV1 than it is for weight in this range, we would conclude that FEV1 values at or worse than the 90th percentile of controls are more predictive than weight values at or worse than the 90th percentile of controls. In particular, if subjects are candidates for intervention when their predictor values are in the worst decile (relative to controls), the ordering of the DLR(U) functions for FEV1 versus weight in the range u ≤ 0.10 indicates that FEV1 identifies a group at greater risk than does weight.
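To illustrate placement value standardization, the sketch below computes an empirical placement value as the fraction of controls whose marker value is at least as large as the value of interest. It assumes, as in the text, that each marker is oriented so that larger values indicate higher risk. Counting ties with "at least" is a convention chosen for the example, and the two control samples are hypothetical numbers on deliberately different scales.

```python
def placement_value(y, control_values):
    """Empirical placement value of y relative to a control sample.

    Returns the fraction of controls with marker value >= y, so small
    values of U correspond to marker values that are extreme (bad)
    relative to controls. Ties are counted with >= by assumption.
    """
    n = len(control_values)
    return sum(1 for c in control_values if c >= y) / n

# Hypothetical control samples for two markers measured on different scales.
controls_a = list(range(1, 101))                  # values 1, 2, ..., 100
controls_b = [1000 + 5 * i for i in range(100)]   # values 1000, 1005, ..., 1495

# Values at the same position in their own control distributions
# receive the same placement value, which makes them comparable.
u_a = placement_value(91, controls_a)
u_b = placement_value(1450, controls_b)
print(u_a, u_b)
```

Both toy values sit at the boundary of the worst decile of their respective controls, so they map to the same point u on the common scale on which the DLR functions are compared.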
Turning now to the data, estimates of the rank-invariant DLR functions are shown in the accompanying figure. These curves were estimated using the rank-invariant LR method with placement values, U, for FEV1 and weight entered into separate LR models as terms of the form Φ−1(1 − U). We can read from the plot the DLR of a marker value Y which is at the 100(1 − u)th percentile of the marker distribution in controls, that is, in subjects who did not suffer an event in 1996.
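The covariate Φ−1(1 − U) that enters the LR models can be computed with the standard normal quantile function. The following is a minimal sketch using only the Python standard library. The function name `probit_term` and the clipping constant `eps` are assumptions added for the example, since the quantile function is undefined at 0 and 1 and the text does not say how such boundary values were handled.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def probit_term(u, eps=1e-6):
    """Return the standard normal quantile of (1 - u).

    u is clipped to [eps, 1 - eps] because the quantile function is not
    defined at the endpoints. The clipping rule is an assumption made
    for this sketch, not something taken from the paper.
    """
    u = min(max(u, eps), 1.0 - eps)
    return _STD_NORMAL.inv_cdf(1.0 - u)

# A placement value of 0.5 (the control median) maps to 0, and smaller
# placement values (more extreme markers) map to larger covariate values.
for u in (0.9, 0.5, 0.1):
    print(u, round(probit_term(u), 3))
```

On this scale a marker value at the control median contributes nothing to the linear predictor beyond the intercept, and values in the worst decile of controls correspond to covariate values of roughly 1.28 or more.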
The rank-invariant DLR function is substantially higher for FEV1 than for weight when u is small but substantially lower for FEV1 than for weight when u is large. Observe in particular that when logDLR(u) > 0, so that the predictors are in ranges where risk modification yields increased risk over baseline, we see that FEV1 values increase the risk more than do comparable weight values. Conversely, when logDLR(u) < 0, so that the predictors are in ranges where risk modification yields reduced risk relative to baseline, we see that FEV1 values decrease the risk more than comparable weight values. This indicates that FEV1 is a better marker for predicting risk than is weight.
As suggested earlier, suppose that it has been decided to treat subjects whose FEV1 or weight measurements are in the worst 10% of values measured for controls. We see that DLR(0.1) is 2.08 (95% CI = (1.91, 2.24)) for FEV1 as opposed to 1.26 (95% CI = (1.16, 1.38)) for weight. The corresponding posttest risks are logit−1(log 2.08 + logit 0.4) = 0.58 (95% CI = (0.56, 0.60)) and logit−1(log 1.26 + logit 0.4) = 0.46 (95% CI = (0.44, 0.48)), respectively. Therefore, using FEV1 to select the subpopulation to receive treatment ensures that these subjects are at greater risk of an event: risk > 0.58 as opposed to risk > 0.46.
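The same posttest risks can be checked on the odds scale, where the revision is a simple multiplication of the pretest odds by the DLR. The worked arithmetic below uses only the point estimates quoted above and rounds intermediate values for display.

```latex
\begin{align*}
\text{pretest odds} &= \frac{0.4}{1 - 0.4} \approx 0.667 \\[4pt]
\text{FEV1:}\quad \text{posttest odds} &\approx 0.667 \times 2.08 \approx 1.387,
  \qquad \text{posttest risk} \approx \frac{1.387}{1 + 1.387} \approx 0.58 \\[4pt]
\text{weight:}\quad \text{posttest odds} &\approx 0.667 \times 1.26 \approx 0.840,
  \qquad \text{posttest risk} \approx \frac{0.840}{1 + 0.840} \approx 0.46
\end{align*}
```

This is equivalent to the logit form in the text, since adding log DLR to the log odds is the same as multiplying the odds by the DLR.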