Despite the rising heart failure (HF) incidence and aging United States population, there are no validated prediction models for incident HF in the elderly population. We sought to develop a new prediction model for 5-year risk of incident HF among older persons.
Proportional hazards models were used to assess independent predictors of incident HF, defined as hospitalization for new onset HF, in 2935 elderly participants without baseline HF enrolled in the Health ABC study (73.6±2.9 years, 47.9% males, 58.6% whites). A prediction equation was developed and internally validated by bootstrapping, allowing the development of a 5-year risk score. Incident HF developed in 258 (8.8%) participants during 6.5±1.8 years of follow-up. Independent predictors of incident HF included age, history of coronary disease and smoking, baseline systolic blood pressure and heart rate, serum glucose, creatinine, and albumin levels, and left ventricular hypertrophy. The Health ABC HF model had a c-statistic 0.73 in the derivation dataset, 0.72 by internal validation (optimism-corrected), and good calibration (goodness-of-fit χ2 6.24, p=0.621). A simple point score was created to predict incident HF risk into four risk groups corresponding to <5%, 5–10%, 10–20%, and >20% 5-yr risk. The actual 5-year incident HF rates in these groups were 2.9%, 5.7%, 13.3%, and 36.8% respectively.
The Health ABC HF prediction model uses common clinical variables to predict incident HF risk in the elderly, an approach that may be used to target and treat high-risk individuals.
Despite significant progress in the treatment of heart failure (HF), the incidence and prevalence of this diagnosis are rising in the United States.1–3 This trend is expected to continue and is attributed primarily to the increasing proportion of elderly in the population, improved care of acute heart diseases resulting in improved patient survival, and increasing prevalence of cardiovascular risk factors such as obesity and diabetes.4, 5 The majority of HF research to date has focused on treatment. In order to further HF prevention efforts, the American Heart Association and the American College of Cardiology proposed a new classification scheme for HF to include “Stage-A” patients: those who do not have structural heart disease but are at risk for HF.1 Unlike for coronary heart disease (CHD), however, no validated risk prediction scores are available for targeting such subjects for primary prevention.
Previous studies of HF risk factor assessment are not useful for population based risk prediction. These studies either included a select specific patient sub-population, e.g. the Framingham Heart Failure Risk Score (FHFRS) was developed in patients with known CHD, valvular disease, or hypertension; or assessed individual risk factors but did not develop risk assessment scores.6–13
Heart failure is primarily a disease of the elderly. Its incidence approaches 10/1000 annually after age 65 and 80% of patients hospitalized with HF are older than 65 years.14–16 In this study, we sought to develop and validate a risk prediction model for incident HF among elderly participants enrolled in the Health Aging and Body Composition (Health ABC) study. Moreover, we sought to assess the predictive utility of the FHFRS for incident HF in this population.
The Health ABC study is a population-based study of 3075 well-functioning, community-dwelling men and women aged 70 to 79 years at inception. Participants were identified from a random sample of white Medicare beneficiaries and all age-eligible black residents in designated zip codes areas surrounding Pittsburgh and Memphis. To be eligible, participants had to report no difficulty in walking one-quarter mile or climbing 10 stairs without resting. Exclusion criteria included difficulties with daily activities, cognitive impairment, inability to communicate, intention of moving within 3 years, or participation in a trial involving a life-style intervention. The Institutional Review Boards at both sites approved the protocol.
Baseline data were collected in 1997–1998 and these results represent outcomes during seven years of follow-up. Overall, 140 participants had HF and 82 had missing data on HF status at baseline; these participants were excluded, resulting in a study cohort of 2853 for the Health ABC model development. Of these, 1441 participants did not have hypertension, CAD, or valvular heart disease (VHD) and were excluded from analysis of the FHFRS, restricting that analysis to 1412 participants. The presence of cardiovascular diseases at baseline was based on ICD-9-CM codes reported by Medicare and Medicaid Services for the years 1995–1998, self-reported history, and use of selected drugs. Cardiovascular outcomes were adjudicated using methods adapted from the Cardiovascular Health Study.17
Definite CHD was defined as history of coronary artery bypass graft surgery (CABG), percutaneous coronary intervention (PCI), myocardial infarction (MI), or angina, or self-reported history of CHD accompanied by antianginal medication use (calcium channel blockers, beta blockers, or nitrates). Possible CHD was designated if there was a self-reported history of CHD without antianginal medication use (or with missing information on medication use) and any information about history of CABG, PCI, MI, or angina was missing or negative. Cerebrovascular disease was defined as self-reported history of stroke, transient ischemic attack, or carotid endarterectomy. Hypertension was defined as definite if there was a self-reported history of physician diagnosis accompanied by use of antihypertensive medication; or possible if there was a self-reported history of physician diagnosis of hypertension but without use of antihypertensive medication (or missing information about medication use) or there was antihypertensive medication use but there was no history of hypertension. Depression was defined as definite if there was both a self-reported treatment of depression and use of antidepressants; or possible if there was a self-reported treatment of depression but without use of antidepressants (or missing information about medication) or if there was medication use but no history of depression. Diabetes mellitus was considered present if the participant reported a history of diabetes mellitus or used hypoglycemic medications at baseline. Smoking status was defined as current use, past use (smoked at least 100 cigarettes in their lifetime), or never. The Minnesota code criteria were applied to diagnose left ventricular hypertrophy (LVH) from the baseline electrocardiogram,18 namely R >26mm in either V5 or V6, or R >20mm in any of leads I, II, III, aVF, or R >12mm in lead aVL, or R in V5 or V6 plus S in V1 >35mm.
History of VHD was not collected in the Health ABC study; VHD was considered present if the participant had a history of either rheumatic heart disease or valve surgery.
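The electrocardiographic voltage thresholds listed above can be expressed as a short check. The following is a minimal sketch only, not the study's actual coding procedure: the function name, the dictionary input format, and the use of the larger of the V5 and V6 R waves for the combined "R plus S in V1" criterion are assumptions made for illustration.

```python
def meets_lvh_voltage_criteria(r_mm, s_v1_mm):
    """Return True if any voltage criterion quoted in the text is met.

    r_mm: dict mapping lead name -> R-wave amplitude in mm (hypothetical format).
    s_v1_mm: S-wave amplitude in lead V1, in mm.
    Missing leads are treated as 0 mm (an assumption of this sketch).
    """
    def r(lead):
        return r_mm.get(lead, 0.0)

    lateral_r = max(r("V5"), r("V6"))

    tall_lateral = lateral_r > 26                      # R >26 mm in V5 or V6
    tall_limb = any(r(lead) > 20 for lead in ("I", "II", "III", "aVF"))
    tall_avl = r("aVL") > 12                           # R >12 mm in aVL
    combined = lateral_r + s_v1_mm > 35                # R (V5/V6) + S (V1) >35 mm

    return tall_lateral or tall_limb or tall_avl or combined
```

For example, an R wave of 22 mm in V6 with an S wave of 14 mm in V1 satisfies only the combined criterion (36 mm), while an R wave of exactly 12 mm in aVL does not satisfy the aVL criterion because the thresholds are strict inequalities.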
All participants in Health ABC were asked to report any hospitalizations and every 6 months were asked direct questions to elicit information about interim cardiovascular events. Medical records for overnight hospitalizations were reviewed at each site. All first admissions to the hospital with an overnight stay confirmed to be related to HF were classified as incident HF. Local adjudicators classified HF, based on symptoms, signs, chest x-ray, and echocardiographic findings, using criteria similar to those used in the Cardiovascular Health Study.17 The HF criteria required at least an HF diagnosis from a physician and treatment for HF (i.e. diuretics and either digitalis or a vasodilator); these criteria have been used in previous studies.19 All deaths were reviewed by the Health ABC Diagnosis and Disease Ascertainment committee; cause of death was determined by central adjudication. Since HF was not allowed as a cause of death, no deaths were counted as incident HF.
First, to facilitate preliminary selection of predictors, descriptive statistics were obtained and compared by the Fisher’s exact test or the Welch-corrected t test between participants who developed HF (n=258) and those who did not (n=2677), Table 1. Variables with p ≤0.20 were considered as candidates. The association of candidate variables with risk for incident HF was assessed in univariate Cox models using bootstrap estimation (1000 replications, resampling with replacement).20 The functional form of continuous predictors (linear vs. non-linear relations with incident HF risk) was evaluated using fractional polynomial functions.21, 22 All candidate variables were also evaluated for significant interactions with age, gender, and race. All terms with p ≤0.10 (Wald χ2 test) were considered for inclusion in multivariable models. Observations with missing values were dropped from subsequent analyses. Second, to identify independent predictors of outcome (incident HF), we adopted a backwards elimination approach.22 Bootstrap estimation was adopted to obtain bias-corrected coefficients and confidence intervals (CI) in each step.23, 24 The threshold to retain a term in the model was set to p≤0.05 (Wald χ2).
The goodness-of-fit of the final model was evaluated both formally by the Hosmer-Lemeshow χ2 statistic and visually by plotting the cumulative expected vs. observed events across the quartiles of risk (Arjas plots).25, 26 The bias-corrected coefficients of the final model presented in Table 2 formed the basis for the Health ABC HF Score.
We internally validated the performance of the model by bootstrapping.27 Simulation studies have shown that this approach provides the least biased and most stable estimates of optimism-corrected performance among the various proposed methods for internal validation,28 with ‘optimism’ referring to the inherent bias towards an overestimated performance in the derivation dataset.27–29 Briefly, optimism in a performance measure (e.g. the c-statistic) with this method is estimated by the average of (measure in the bootstrap sample − measure in the original dataset) for a large number of models derived from respective bootstrap samples: the performance of each of the bootstrap-sample derived models is evaluated on the bootstrap sample (‘training’ dataset) and then on the original dataset (‘validation’ dataset). The average of (measure in the bootstrap sample − measure in the original dataset), i.e. the optimism, is then subtracted from the original performance measure (i.e. the c-statistic of the original model) to provide a more realistic estimate. This approach moderates our expectations from the model and sets an upper limit for performance in future external validation.
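The optimism-correction procedure described above can be sketched generically as follows. This is an illustrative outline under stated assumptions, not the S-Plus code used in the study: `fit` and `measure` are hypothetical callables supplied by the user (a model-fitting routine and a performance measure such as the c-statistic), and the data are assumed to be a list of records.

```python
import random


def optimism_corrected(data, fit, measure, n_boot=1000, seed=0):
    """Bootstrap optimism correction of a performance measure.

    data:    list of observations (any record type).
    fit:     hypothetical callable, fit(sample) -> model.
    measure: hypothetical callable, measure(model, dataset) -> float.
    Returns (optimism-corrected measure, estimated optimism).
    """
    rng = random.Random(seed)

    # Apparent performance: model derived and evaluated on the original data.
    apparent = measure(fit(data), data)

    total = 0.0
    for _ in range(n_boot):
        # Resample with replacement, same size as the original dataset.
        sample = [rng.choice(data) for _ in data]
        model = fit(sample)
        # Performance on the bootstrap sample minus performance of the
        # same model when applied to the original dataset.
        total += measure(model, sample) - measure(model, data)

    optimism = total / n_boot
    return apparent - optimism, optimism
```

Because each bootstrap model is evaluated both on the sample it was derived from and on the original data, the average difference estimates how much the apparent performance overstates what would be expected in new data.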
We validated two measures of performance using 1000 bootstrap samples: the c-statistic and the slope of the linear predictor. The c-statistic is a measure of discrimination of the model, i.e. the ability to distinguish high- from low-risk subjects, and is analogous to the area under the receiver operating characteristic (ROC) curve.27 Values range from 0.5 (=useless) to 1.0 (=perfect). The slope of the linear predictor is a measure of model calibration, i.e. whether predicted probabilities agree with observed probabilities. A slope of 1.0 is perfect, and calibration worsens as the value deviates from 1.0. Validating the slope of the linear predictor by bootstrapping also provides a means to moderate absolute predictions by recalibrating the linear predictor using the optimism-corrected slope as a ‘shrinkage factor’ (see Appendix).27
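To make the notion of discrimination concrete, the sketch below computes a c-statistic for a binary outcome as the proportion of (event, non-event) pairs in which the subject with the event received the higher score, counting ties as one half. This is a simplified illustration: it ignores censoring, which the survival-model c-statistic reported in this study has to account for, and the function name is an assumption.

```python
def c_statistic(scores, events):
    """Concordance (equivalent to ROC area) for a binary outcome.

    scores: predicted risks or score values, one per subject.
    events: 1/True if the subject had the event, 0/False otherwise.
    """
    concordant = 0.0
    pairs = 0
    for s_event, e_i in zip(scores, events):
        if not e_i:
            continue
        for s_nonevent, e_j in zip(scores, events):
            if e_j:
                continue
            pairs += 1
            if s_event > s_nonevent:
                concordant += 1.0      # event subject ranked higher
            elif s_event == s_nonevent:
                concordant += 0.5      # tie counts as half
    if pairs == 0:
        raise ValueError("need at least one event and one non-event")
    return concordant / pairs
```

With scores `[0.9, 0.3, 0.5, 0.1]` and events `[1, 1, 0, 0]`, three of the four event/non-event pairs are correctly ordered, giving a c-statistic of 0.75.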
The entire follow-up period was used to develop the model. After recalibrating the linear predictor of the model using the optimism-corrected slope (‘shrinkage factor’) to provide more conservative estimates, the results were adapted to provide 5-yr HF risk predictions (Appendix). To facilitate clinical use of the model, the coefficients in Table 2 were used to assign score points for each risk factor using an approach similar to that adopted in the development of the FRS.30 For each level of the total score (the Health ABC HF Score) the 5-yr risk was calculated; thus the Health ABC HF Score could be divided into four risk categories (<5%, 5–10%, 10–20%, and >20%). The Health ABC HF Score was tested for possible loss of information against the original equation. In addition, consistency of risk prediction was evaluated across gender and race.
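The Appendix is not reproduced here, so the exact equation is not shown. As a hedged illustration only, the sketch below uses the conventional way of turning a proportional hazards linear predictor into an absolute risk, with the shrinkage factor applied to the mean-centered linear predictor. The function name, the parameter names, and the baseline 5-yr survival value used in the example are assumptions, not values from the study.

```python
import math


def five_year_risk(linear_predictor, mean_linear_predictor,
                   baseline_survival_5yr, shrinkage=1.0):
    """Absolute 5-yr risk from a proportional hazards model (generic form).

    risk = 1 - S0(5) ** exp(shrinkage * (LP - mean LP))

    baseline_survival_5yr is S0(5), the 5-yr event-free survival for a
    subject whose linear predictor equals the cohort mean (hypothetical
    input; the study's value is not given in the text).
    """
    centered = shrinkage * (linear_predictor - mean_linear_predictor)
    return 1.0 - baseline_survival_5yr ** math.exp(centered)


# Illustrative values only: S0(5) = 0.95 is a placeholder.
unshrunken = five_year_risk(2.0, 1.0, 0.95, shrinkage=1.0)
shrunken = five_year_risk(2.0, 1.0, 0.95, shrinkage=0.95)
```

A shrinkage factor below 1.0 pulls predictions toward the risk at the mean linear predictor, so high predicted risks become somewhat lower and low predicted risks somewhat higher, which is the sense in which the recalibrated estimates are more conservative.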
For the FHFRS, we restricted analyses to Health ABC participants with hypertension, CHD, or VHD. To compare performance for 5-yr HF prediction, we used the 5-yr occurrence of HF as a binary outcome and fit the respective, sex-specific scores in univariate logistic regression models. For each score, we calculated the c-statistic as a measure of discrimination and the Nagelkerke R2 as a measure of explained variance.31 The c-statistics were compared between models according to the method described by DeLong & DeLong.32 Again, performance measures for the Health ABC HF Score were corrected for optimism by bootstrapping using the methods described above.27–29
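For readers unfamiliar with the Nagelkerke R2, the sketch below shows its standard definition in terms of the log-likelihoods of the fitted logistic model and of the intercept-only (null) model. The function and argument names are illustrative assumptions; the statistical packages used in the study report this quantity directly.

```python
import math


def nagelkerke_r2(loglik_null, loglik_model, n):
    """Nagelkerke R2 from model and null log-likelihoods.

    Cox-Snell R2 = 1 - exp(2 * (LL_null - LL_model) / n), rescaled by its
    maximum attainable value, 1 - exp(2 * LL_null / n), so that a perfect
    model reaches 1.0.
    """
    cox_snell = 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * loglik_null / n)
    return cox_snell / max_cox_snell
```

A model that does not improve on the null log-likelihood gives 0, and a model with a log-likelihood of 0 (perfect prediction) gives 1.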
Survival analysis, development of the multivariable model, and calculation of 5-yr estimates were performed with STATA SE 9.2 (StataCorp LP). The S-Plus 6. R2 statistical language (Insightful Corp.) was used for internal validation of the models using the Design library provided by F. E. Harrell (http://lib.stat.cmu.edu/S/Harrell/Design.html). The authors had full access to and take full responsibility for the integrity of the data. All authors have read and agree to the manuscript as written.
Table 1 describes the baseline patient characteristics stratified by incident HF status in the overall cohort and the sub-population studied. The mean age of participants was 73.6±2.9 years with 47.9% male and 58.6% whites. The mean follow-up was 6.5 years.
Overall 611 participants out of 2935 died, representing a cumulative mortality of 20.8% and annual mortality of 3.1%. A total of 258 participants developed HF (cumulative rate 8.8%, annual 1.36%). Subsequent mortality among participants who developed HF was 18.0%/yr (cumulative 46.9%) over a mean follow-up of 2.6 years after HF hospitalization, compared to the 2677 participants who did not develop HF in whom annual mortality was 2.7% (cumulative 18.3%) over a mean follow-up of 6.7 years. Men and blacks were more likely than women and whites to develop HF (men: 140/1407, 10.0% cumulative, 1.58% annual rate vs. women: 118/1528, 7.7% cumulative, 1.17% annual rate, p=0.01, and blacks: 123/1215, 10.1% cumulative, 1.63% annual rate vs. white: 135/1720, 7.8% cumulative, 1.18% annual rate, p=0.01).
As shown in Table 2, nine variables were associated with development of incident HF including: age, history of smoking and CHD, LVH, systolic blood pressure and heart rate, and serum glucose, albumin, and creatinine levels. Sex and race were both considered for inclusion but neither was associated with HF development in the final multivariable model. Formal and graphical statistical testing revealed concordant baseline hazard functions for both these factors. A significant non-linear relationship with HF risk was detected only for creatinine levels. After inclusion of baseline blood pressure and serum glucose levels in the prediction model, past history of hypertension and diabetes were no longer independently associated with incident HF.
The Health ABC model for incident HF had satisfactory discrimination (c-statistic 0.73 in the derivation dataset and 0.72 by internal validation with bootstrap-derived samples and correction for optimism). The Hosmer-Lemeshow goodness-of-fit test demonstrated overall good calibration (χ2 =6.24, p=0.621); the distribution of expected vs. observed HF incidence across deciles of risk is shown in Figure 1. In concordance, the slope of the linear predictor during internal validation with bootstrap-derived samples was estimated at 0.95, suggesting good calibration; we opted to use this optimism-corrected slope to obtain 5-yr estimates in order to provide more conservative risk predictions.
A score was developed from the coefficients in Table 2; we were able to define four risk groups (low, average, high, and very-high) corresponding to <5%, 5–10%, 10–20%, and >20% 5-yr risk of incident HF respectively (Table 3, Figure 2).33 Actual 5-year HF risk in these groups was 2.9%, 5.7%, 13.3%, and 36.8% respectively. Figure 3 shows the Kaplan-Meier curves for incident HF in these risk groups. The Health ABC HF score predicted risk well in both genders and in white/black race based subgroups (Figure 4). In the Health ABC derivation cohort (n=2853), the Health ABC HF Score achieved an optimism-corrected c=0.76 for 5-yr HF occurrence (95% CI: 0.72–0.80) and R2=0.154.
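The mapping from a predicted 5-yr risk to the four risk groups named above can be written as a small lookup. This is a sketch under an assumption: the text does not say which group a risk of exactly 5%, 10%, or 20% belongs to, so the boundary handling below (5% is 'average', 10% and 20% are 'high') is a choice made for illustration.

```python
def risk_group(five_year_risk):
    """Classify a predicted 5-yr HF risk (as a proportion, 0 to 1).

    Groups follow the text: <5% low, 5-10% average, 10-20% high,
    >20% very high. Boundary assignment is an assumption of this sketch.
    """
    if not 0.0 <= five_year_risk <= 1.0:
        raise ValueError("risk must be a proportion between 0 and 1")
    if five_year_risk < 0.05:
        return "low"
    if five_year_risk < 0.10:
        return "average"
    if five_year_risk <= 0.20:
        return "high"
    return "very high"
```

Applied to the observed 5-yr HF rates reported for the four groups (2.9%, 5.7%, 13.3%, and 36.8%), the function returns 'low', 'average', 'high', and 'very high' respectively, i.e. each observed rate falls within the range predicted for its group.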
Table 4 summarizes the comparative utility of the Health ABC and FHFRS in predicting incident HF in the Health ABC cohort and the sub-cohort in which the original score was developed. The FHFRS was suboptimal in predicting risk (three of four gender-specific analyses had a c-statistic <0.7) and inferior compared to the Health ABC model.
In this study, we developed and internally validated a risk prediction model for incident HF in an elderly cohort using commonly available clinical variables. We demonstrated that this model provides better discrimination for incident HF than the FHFRS. Moreover, we created a simple to use scoring system to classify the population at-risk into four risk categories for HF development over 5 years.
The Health ABC HF risk prediction model and score have several strengths. First, this is a clinically relevant and applicable model that has potentially important utility in the general elderly population for prediction of incident HF. This is of significant epidemiologic importance. According to the White House Conference on Aging 2005,4 approximately 12% of Americans were older than 65 years in the year 2000; this proportion will rise to 20% by the year 2050. Heart failure incidence and prevalence are highest among the elderly and the aging of the population is expected to significantly worsen the current HF epidemic.14–16 This model provides a framework for risk assessment and systematic evaluation of preventive strategies to curb the HF epidemic. Second, although in a younger adult population from an earlier era the population attributable risk for hypertension was found to be nearly 40% in men and 60% in women, the population attributable risks of even major HF risk factors like hypertension and diabetes in the elderly were recently found to be only 12.7% and 8.3% respectively.6, 34 Since most subjects have multiple risk factors in various combinations, a multi-factorial risk prediction scheme is likely to be more robust in predicting risk. Third, our model predicts risk reliably using only common, clinically available parameters and a simple scoring system, ensuring ease of widespread use. Fourth, eight of the nine variables in our model (all except age) are potentially modifiable. Therefore, risk assessment based on our model can lead to interventions that can potentially modify HF risk and may facilitate close follow-up and aggressive clinical management. It is possible that identification of high-risk individuals can be used for recruitment into HF prevention trials. Finally, and very importantly, this is the first prediction scheme that has shown reliable risk prediction of incident HF in blacks.
The FHFRS was derived from an almost exclusively white population, and until now there was no incident HF prediction model that assessed the risk in blacks. With the growing understanding of race-based differences in risks and outcomes for various diseases and the particular relevance of certain risk factors in blacks, e.g. hypertension, the reliability of any prediction model needs to be validated in the various race-based cohorts. In our study, the Health ABC HF Score predicted risk equally well in both white and black subjects.
Unlike for CHD, we currently lack prediction models to detect subjects at risk for HF in the general population. Previous literature has identified individual risk factors associated with HF, but comprehensive and validated risk prediction models have not been developed.13 The only exception is the FHFRS, which was developed in a subgroup of a community-based cohort at higher risk for HF with known CHD or VHD or hypertension.10 Such patients accounted for half the population in our study. Moreover, with the obesity, metabolic syndrome, and diabetes epidemic, the population risk profile for incident HF may be changing.5 We assessed the utility of the FHFRS in predicting incident HF in a general population of elderly subjects and found it to be suboptimal in assessing the risk of incident HF, in both the overall Health ABC cohort and also in the subgroup of patients from which it was derived.
Our nine-variable model had good discrimination and calibration, with acceptable performance in both gender- and race-based groups. Importantly, internal validation in 1000 random bootstrap samples demonstrated stable performance. Although hypertension and diabetes were significantly associated with HF in univariate analyses, after inclusion of blood pressure and serum glucose levels in the analyses, past history of hypertension and diabetes were not independently associated with incident HF. This finding suggests that the relation between blood pressure, glucose, and HF is continuous and graded, and that blood pressure and glucose levels may increase HF risk even in the normal range.35,36 A recent analysis also showed an independent relationship between glucose levels and HF hospitalization risk.37 Thus, optimal glucose and blood pressure levels to ameliorate risk for HF need further study. This becomes a central issue in light of recent studies that indicate both increasing prevalence and inadequate control of hypertension and diabetes.38, 39
Our study has several limitations. Diagnosis of HF was based on HF hospitalization. As some participants may have developed HF without hospitalization, our rates of HF are likely underestimated. Possible misclassification of HF events might have occurred, as diagnostic criteria for HF are difficult to define. Of note, although the prognostic validity of CHS criteria for diagnosis of HF has been demonstrated, these criteria are less specific than the Framingham HF criteria and may explain some of the variability in the performance of the different models.40 Echocardiography was not performed at baseline in the Health ABC study. Thus, patients with sub-clinical prevalent structural heart disease may have been included in the analysis. Outcomes are uniformly poor both for patients with systolic dysfunction and for those with HF with preserved ejection fraction. The discriminatory ability of the current model to predict the two types of HF needs to be assessed further.41, 42 The Health ABC study did not collect data uniformly on VHD; however, it is unlikely that a very large proportion of participants had significant subclinical VHD that would impact these results. Finally, the model was developed in a relatively healthy cohort of a certain age. Thus, the validity of the Health ABC model in other age groups, or in the general population within this age group where the burden of comorbidity may be higher, needs to be studied.
In conclusion, we have developed and internally validated an HF risk prediction model based on nine routine clinical variables, most of which are potentially modifiable. The identification of persons at high risk for HF using the Health ABC HF Score, and the targeting of strategies for primary prevention of HF to improve outcomes, need further study.
Funding Sources: This research was supported in part by the Intramural Research Program of the National Institute of Aging, National Institutes of Health, Bethesda MD and by grants N01-AG-6-2101, N01-AG-6-2103, and N01-AG-6-2106.
Conflict of interest: None