This study examined the effect of correction for long-term variation in multiple risk factors on CHD risk estimation and prediction. Correcting for long-term variation had a substantial effect on the relative hazard estimates: it strengthened the relative hazards for systolic blood pressure, total cholesterol, and HDL cholesterol, changed those for smoking very little, and weakened the relative hazards of age, medication use, and, in women, race. The regression calibration method generated an estimate of the value of the risk factors corrected for long-term variation, which allowed us to also examine the effect of correction on prediction. There was no significant increase in the AUC in the mean model compared to the baseline model, and no further improvement in the regression calibration model. However, both the mean and the regression calibration models improved risk classification in women compared to the baseline model.
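As an illustration of what regression calibration produces, the following is a minimal univariate sketch in Python, not the implementation used in this study. It assumes a classical additive error model (each visit's measurement equals the long-term value plus independent error of equal variance) and exactly two visits per person. The function name `regression_calibration`, the variable names, and the simulated blood pressure values are hypothetical. The analysis in this study corrected several risk factors jointly, which this sketch does not attempt.

```python
import numpy as np


def regression_calibration(w1, w2):
    """Univariate regression calibration from two visits (illustrative only).

    Assumes W_j = X + U_j, with error U_j independent of the long-term
    value X and of equal variance at both visits.
    Returns (lam, x_hat): the reliability of the two-visit mean and the
    calibrated estimate of each person's long-term value.
    """
    w1 = np.asarray(w1, dtype=float)
    w2 = np.asarray(w2, dtype=float)
    w_bar = (w1 + w2) / 2.0
    mu = w_bar.mean()

    # Under the assumed model, Var(W1 - W2) = 2 * sigma_u^2.
    sigma_u2 = np.var(w1 - w2, ddof=1) / 2.0

    # Var(w_bar) = sigma_x^2 + sigma_u^2 / 2, so subtract the error part.
    var_w_bar = np.var(w_bar, ddof=1)
    sigma_x2 = max(var_w_bar - sigma_u2 / 2.0, 0.0)

    # Shrinkage factor applied to each person's deviation from the mean.
    lam = sigma_x2 / var_w_bar
    x_hat = mu + lam * (w_bar - mu)
    return lam, x_hat


# Simulated example (hypothetical values, not ARIC data).
rng = np.random.default_rng(0)
n = 5000
x_true = rng.normal(120.0, 15.0, n)       # long-term systolic pressure
w1 = x_true + rng.normal(0.0, 10.0, n)    # visit 1 measurement
w2 = x_true + rng.normal(0.0, 10.0, n)    # visit 2 measurement

lam, x_hat = regression_calibration(w1, w2)
print(f"estimated reliability of the two-visit mean: {lam:.3f}")
```

Under these assumptions the calibrated values are pulled toward the population mean in proportion to the within-person variability, which is why a relative hazard estimated on them is typically larger than one estimated on a single measurement.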
Some of the changes in hazard ratios due to adjustment for long-term variation are suggested by underlying risk factor relationships. For example, cholesterol and hypertension medication use at baseline are likely to be related to cholesterol and blood pressure values 3 years prior. Likewise, the decrease in the hazard ratios for age in women might be related to the stronger relationship between age and blood pressure in women than in men. The lack of change in the hazard ratio of smoking could be due to a lack of relationship between smoking status and blood pressure and cholesterol levels.
This work builds on previously published findings on CHD risk from the ARIC cohort (22). The increases seen in our estimates of relative effect are slightly higher than those shown by single risk factor adjustment. MacMahon et al (23) found a 60% increase in the relative risk for blood pressure, while Law et al (24) and Verschuren et al (25) found increases in the relative risk of total cholesterol of around 40%.
The results of multivariable adjustment, on the other hand, will depend on which risk factors are included in the model, though our results show a similar direction to those seen in other studies. Rosner et al (4) found an increase in the effects of cholesterol, glucose, and blood pressure when variation in those variables was included, and a decrease in the effects of smoking, BMI, and age. Iribarren et al (26) showed increases in the hazard ratios of serum cholesterol, blood pressure, and dietary cholesterol when variation was included, mixed effects for smoking, and decreases in the hazard ratios of alcohol consumption and abstinence and of body mass index.
Emberson et al (27) used a model in which diastolic blood pressure and serum total cholesterol were assumed to vary, while age, history of CHD, and smoking status were not, and found increases in all hazard ratios. To aid in the comparison, we provided the relative hazards using the same variables but generated using the Rosner macro, which were very similar to those obtained using our regression calibration method. This suggests that differences in the results are not driven by the choice of method.
Previous studies have not examined the effect of correction for long-term variation on prediction. Our results show no significant improvement in discrimination resulting from the additional information provided by the previous visit. This lack of improvement did not vary across different methods of accounting for long-term variation. The improvement in classification seen in women but not in men may be due to the stronger correlations between the two measurements in women, making the long-term average a better estimate of the effect in women. This finding would need to be replicated in additional populations. Additionally, the utility of our method in adding predictive ability for other diseases, or if expanded by using more than two measurements to include an estimate of trajectory rather than averages, remains untested. It is clear, however, that using a corrected model does not decrease the predictive ability of the model. One potential advantage of correction for variation on prediction that is worthy of further study is that it allows the risk model not to depend on the variability in a given population. Correction for variation may allow for more similarity in models and calibration across populations.
This study is limited by having only two visits, three years apart, for each variable, and by not including continuous measures of smoking and glycemic control. With more frequent measurements, which may become increasingly available through electronic medical records, the method could be extended to examine trajectories of change rather than a single long-term value. Including continuous measures of glucose control and smoking would have allowed improved understanding of the relationships between risk factors. We were also unable to separate the observed long-term variation into measurement imprecision, short-term variability, and long-term changes. However, decomposition would not impact the final results of the present analysis. Also, our correction method is a first-order correction and does not take into account the variance of the underlying estimate of the long-term mean. We did check for a period effect and for a relationship between the variability and age at baseline and found none. Variation in the time between visits (mean 2.9 years, SD 0.2) in men and women led to some variability between the ages at the two visits (correlation of 0.998 in men and 0.997 in women). We were also limited by the missing values in the data. We chose to use a listwise deletion approach rather than use the other variables to both estimate the missing data and to derive the correlation structure for the regression calibration. However, our approach does introduce potential biases in the results if the missing values are not randomly distributed.
The study does have substantial strengths. The time period of measurements and follow-up is well suited to addressing the question of how previous variation affects future risk prediction versus relative hazard estimation. We were also able to correct for variation in multiple risk factors simultaneously, as well as to see the effects of correction on the remaining risk factors. Finally, our method provided an actual estimate of the underlying risk factor value and allowed us to examine prediction.