Current heart failure (HF) risk prediction models do not consider how individual patient assessments occur in incremental steps; furthermore, each additional diagnostic evaluation may add cost, complexity, and potential morbidity.
Using a cohort of well-treated ambulatory HF patients with reduced ejection fraction (HFrEF) with complete clinical, laboratory, health-related quality of life, imaging, and exercise testing data, we estimated incremental prognostic information provided by five assessment categories, performing an additional analysis on those with available N-terminal pro-B-type natriuretic peptide (NT-proBNP) levels. We compared the incremental value of each additional assessment (quality of life screen, laboratory testing, echocardiography, exercise testing) to baseline clinical assessment for predicting clinical outcomes (all-cause mortality, all-cause mortality/hospitalization, cardiovascular death/HF hospitalizations), gauging incremental improvements in prognostic ability with more information using area under the curve and reclassification improvement (Net Reclassification Index; NRI), with and without NT-proBNP availability. Of 2331 participants, 1631 patients had complete clinical data; of these, 1023 had baseline NT-proBNP. For prediction of all-cause mortality, models with incremental assessments sans NT-proBNP showed improvements in C-indices (0.72[clinical model alone]–0.77[complete model]). Compared to baseline clinical assessment alone, NRI improved from 0.035 (w/laboratory data) to 0.085 (complete model). These improvements were significantly attenuated for models in the subset with measured NT-proBNP data (c-indices: 0.80[w/laboratory data]–0.81[full model]); NRI improvements were similarly marginal (0.091→0.096); prediction of other clinical outcomes had similar findings.
In patients with chronic HFrEF, the marginal benefit of complex prognostic evaluations should be weighed against potential patient discomfort and cost escalation.
URL: http://www.clinicaltrials.gov. Unique identifier: NCT00047437.
Approximately 5.1 million people have been diagnosed with heart failure (HF) in the United States (U.S.),1 a syndrome that imparts excessive burden on patients and the health care system. Within 5 years of diagnosis, HF is responsible for mortality rates approaching 50%, and each year, HF is responsible for more than 1 million hospitalizations and more than $30 billion spent on treatment.2 Furthermore, HF is rapidly becoming the leading cause of death and disability in low- and middle-income countries.3
An important rationale for clinical assessments and diagnostic testing in patients with chronic HF is to guide therapeutic interventions.4 Quantifying a patient's risk of clinical outcomes can provide one way of identifying those who would most benefit from more intensive monitoring and treatment.4 Several risk scores and methodologies, ranging from simple to complex, have been developed for purposes of HF risk stratification, including biomarker testing, imaging modalities, and a variety of multivariable clinical risk scores.1,4 Nevertheless, in clinical practice, different incremental sets of variables, such as physical exam findings and laboratory results, often become available to physicians in a specific sequence (e.g., medical history before serum sodium, before left ventricular ejection fraction [LVEF]). Additional testing may be associated with gains in prognostic information, but may increase the cost and complexity of care, as well as burden to the patient.
Despite the clinical relevance of this question, the incremental prognostic benefit of additional costly and complex clinical assessments remains uncertain. Therefore, we sought to measure the incremental prognostic value provided by clinical, quality of life, laboratory, echocardiography, and cardiopulmonary exercise testing data in a well-treated cohort of ambulatory patients with systolic HF.
Details of the Heart Failure: A Controlled Trial Investigating Outcomes of Exercise Training (HF-ACTION) study have been previously published.5,6 Briefly, HF-ACTION (clinicaltrials.gov: NCT00047437) was a randomized clinical trial evaluating the effect of exercise training versus usual care on long-term morbidity and mortality in 2331 well-treated patients with chronic HF due to left ventricular systolic dysfunction (New York Heart Association [NYHA] classes II-IV, LVEF <35%). Patients were randomized to either usual HF care or a structured, group-based, supervised exercise program. All patients, regardless of treatment group, received detailed self-management educational materials that included information on medications, fluid management, symptom exacerbation, sodium intake, and amount of activity recommended by the American Heart Association (AHA) guidelines.7
Demographics, socioeconomic status, past medical history, current medications, a physical examination, and the most recent laboratory tests were obtained at the baseline clinic visit prior to randomization. Participants reported race and ethnicity at the time of study enrollment using categories defined by the National Institutes of Health. All patients underwent baseline assessments, which included: 1) cardiopulmonary exercise testing; 2) a six-minute walk distance (6MWD); 3) transthoracic echocardiography (TTE); 4) quality of life measures, ascertained using several validated psychometric instruments measuring health-related quality of life, pain, depression, and social support, including the Kansas City Cardiomyopathy Questionnaire (KCCQ)8; and 5) levels of N-terminal pro-B-type natriuretic peptide (NT-proBNP) in a subset of patients who agreed to participate in the biomarker substudy.9,10
The primary endpoint was a composite of all-cause mortality and all-cause hospitalization over a median follow-up of 2.6 years. We examined the following secondary clinical endpoints: all-cause death, and cardiovascular death or HF hospitalization. An independent clinical events committee that was blinded to treatment assignment adjudicated all deaths and first hospitalizations. Local institutional review boards approved HF-ACTION, and all enrolled patients provided written informed consent.
Among patients used in this analysis, categorical data were summarized as percentages and differences in patients with and without NT-proBNP data were compared using chi-square tests and Wilcoxon rank-sum tests. A series of Cox proportional hazards regression models with a priori variable selection were fit for each clinical endpoint in the full patient population and the subgroup with NT-proBNP. The models were as follows (with incremental addition of variables [Figure 1]):
We also performed a sensitivity analysis by considering the 6MWD prior to laboratory information, since this assessment is theoretically cheaper and easier to perform in the outpatient setting. We estimated the incremental cost of each additional assessment using the 2012 American Medical Association Current Procedural Terminology manual.11 To account for deviations from the linearity assumption in proportional hazards regression, linear splines were used for age (>60), 6MWD (<450), peak VO2 (<20), and creatinine (between 1.0 and 2.3). Hazard ratios are interpreted as the average hazard over the follow-up period to address potential deviations from proportional hazards. The time-point for comparison for all analyses was 30 months, with censoring of events occurring after 30 months. Each of the models was compared to the baseline clinical model that included only variables collected at the clinic visit using a 2-category Net Reclassification Index (NRI) split at the endpoint incidence rate; 95% confidence intervals for the NRI were created by estimating bootstrap standard errors from 50 bootstrap samples. Patients lost to follow-up prior to 30 months were considered non-events for NRI calculations. We compared full and reduced models using likelihood ratio tests. The same methodology was repeated in the biomarker subset with the addition of log-transformed NT-proBNP to the list of laboratory assessments. These steps were taken for the three endpoints: all-cause death or hospitalization, all-cause death, and cardiovascular death or HF hospitalization. Lastly, we also examined the individual associations of variables with outcomes using the chi-square statistic. We considered p<0.05 to be statistically significant. All analyses were performed using SAS version 9.2 (SAS Institute, Inc., Cary, NC).
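As an illustration of the reclassification metric described above, the following is a minimal Python sketch of a 2-category NRI with a bootstrap standard error. It is not the SAS code used in this analysis. The function names (two_category_nri, bootstrap_nri_se), the toy risk values, and the choice to hold the cut-point fixed across bootstrap resamples are all assumptions made for illustration. Event status is taken as binary at the comparison time-point, with patients lost to follow-up counted as non-events.

```python
import random
from statistics import mean, stdev


def two_category_nri(risk_old, risk_new, event, threshold):
    """2-category NRI for a new model versus an old model at one risk cut-point.

    Returns (nri, nri_events, nri_nonevents).
    """
    up_e = down_e = n_e = 0  # reclassification counts among events
    up_n = down_n = n_n = 0  # reclassification counts among non-events
    for old, new, ev in zip(risk_old, risk_new, event):
        old_high = old >= threshold
        new_high = new >= threshold
        moved_up = new_high and not old_high
        moved_down = old_high and not new_high
        if ev:
            n_e += 1
            up_e += moved_up
            down_e += moved_down
        else:
            n_n += 1
            up_n += moved_up
            down_n += moved_down
    if n_e == 0 or n_n == 0:
        raise ValueError("need at least one event and one non-event")
    nri_events = (up_e - down_e) / n_e        # upward moves are correct for events
    nri_nonevents = (down_n - up_n) / n_n     # downward moves are correct for non-events
    return nri_events + nri_nonevents, nri_events, nri_nonevents


def bootstrap_nri_se(risk_old, risk_new, event, threshold, n_boot=50, seed=0):
    """Bootstrap standard error of the NRI (cut-point held fixed: an assumption)."""
    if all(event) or not any(event):
        raise ValueError("need at least one event and one non-event")
    rng = random.Random(seed)
    n = len(event)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients with replacement
        ev = [event[i] for i in idx]
        if all(ev) or not any(ev):
            continue  # skip degenerate resamples with only one outcome class
        nri, _, _ = two_category_nri(
            [risk_old[i] for i in idx], [risk_new[i] for i in idx], ev, threshold
        )
        estimates.append(nri)
    return stdev(estimates)


# Toy data (hypothetical): predicted risks from a baseline and an extended model.
risk_old = [0.10, 0.20, 0.30, 0.40, 0.60, 0.70, 0.35, 0.55]
risk_new = [0.05, 0.15, 0.45, 0.30, 0.65, 0.80, 0.50, 0.30]
event = [0, 0, 1, 0, 1, 1, 1, 0]

cut = mean(event)  # split at the endpoint incidence rate
nri, nri_e, nri_n = two_category_nri(risk_old, risk_new, event, cut)
se = bootstrap_nri_se(risk_old, risk_new, event, cut, n_boot=50)
print(f"NRI = {nri:.3f} (events {nri_e:.3f}, non-events {nri_n:.3f})")
print(f"95% CI: {nri - 1.96 * se:.3f} to {nri + 1.96 * se:.3f}")
```

In this toy example one event moves above the cut-point and one non-event moves below it, so each component contributes 0.25 and the NRI is 0.5. Reporting the two components separately shows whether a gain is driven mainly by patients with events or by those without.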
Complete clinical data were available on 1631 of 2331 HF-ACTION participants; of these, 1023 had baseline levels of NT-proBNP. Excluded patients were generally similar to those included in the analysis. The Table displays the characteristics of the patient population, as well as those with and without baseline NT-proBNP levels. Median age of the study cohort was 59 years (25th, 75th percentile: 51, 68); 574 (35%) patients were black, and 1167 (72%) were male. The majority of patients were NYHA class II (n=1036, 64%) or class III (n=580, 36%) at baseline. Median LVEF was 25%, beta-blockers were used in 1545 (95%) patients, and angiotensin-converting enzyme inhibitors/angiotensin receptor blockers were used in 1538 (94%). At baseline, 675 (41%) patients had an implantable cardioverter defibrillator and 300 (18%) had a cardiac resynchronization therapy device. The median 6MWD was 368 meters and the median peak VO2 was 14.3 mL/kg/min. Figure 1 depicts the overall characteristics of the study population, followed by the variables analyzed during increments of clinical assessments. Sets of evaluations were organized according to usual temporality during initial assessment of a patient with HF, beginning with a clinical assessment, then proceeding with laboratory testing, echocardiography, and exercise testing. Comparison between patients with and without available NT-proBNP levels is shown in Supplemental Table 1. There were 1031 deaths or hospitalizations in the overall cohort (646 in NT-proBNP subset), 223 deaths (134 in NT-proBNP subset), and 470 CV death/HF hospitalizations (304 in NT-proBNP subset).
Changes in discrimination measures with the addition of variables for the composite of all-cause mortality and all-cause hospitalization are shown in Figure 2a, Figure 3a, and Supplemental Table 2. In the overall cohort, baseline clinical information alone yielded a C-index of 0.62, increasing to 0.65 for the full model. The NRI improved from 0.019 with inclusion of the KCCQ score to 0.118 for the full model. In the subset of patients with available NT-proBNP levels, consideration of clinical, KCCQ, and laboratory information yielded the peak C-index of 0.67. The NRI increased from 0.112 for the model with baseline clinical information, KCCQ, and laboratory data, to 0.169 for the full model.
Changes in model discrimination with the addition of variables for all-cause mortality are shown in Figure 2b, Figure 3b, and Supplemental Table 3. In the overall cohort, use of baseline clinical information alone yielded a C-index of 0.72, and with the addition of the KCCQ score, laboratory, echocardiography, and exercise parameters, the C-index increased to 0.77. The NRI improved from −0.001 after the addition of the KCCQ score to 0.085 for the overall set of variables. When analysis was performed in patients with available NT-proBNP levels, the C-index increased from 0.73 for baseline clinical information alone to 0.80 after consideration of KCCQ score and laboratory parameters. Inclusion of additional data increased the C-statistic nominally to 0.81. Similarly, there were no appreciable increases in NRI after the addition of laboratory data (0.091→0.096).
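For readers less familiar with the C-index reported in these comparisons, the following is a minimal Python sketch of Harrell's concordance index for right-censored data. It is an illustrative assumption about the form of the statistic, not the SAS procedure used in this study. The function name harrell_c_index and the toy follow-up data are hypothetical.

```python
def harrell_c_index(time, event, risk):
    """Harrell's C-index for right-censored survival data.

    A pair of patients is usable when the one with the shorter follow-up time
    had an event. The pair is concordant when that patient also has the higher
    predicted risk. Ties in predicted risk count as half concordant. Pairs with
    tied follow-up times are ignored in this simplified sketch.
    """
    concordant = 0.0
    usable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a censored patient cannot anchor a usable pair
        for j in range(n):
            if time[j] > time[i]:  # patient j was still event-free when i had the event
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Toy data (hypothetical): months of follow-up, event indicator, predicted risk.
time = [5, 10, 12, 20, 30]
event = [1, 1, 0, 1, 0]
risk = [0.9, 0.4, 0.5, 0.3, 0.1]

print(f"C-index = {harrell_c_index(time, event, risk):.3f}")
```

In the toy data, 7 of the 8 usable pairs are ranked correctly, giving a C-index of 0.875. A value of 0.5 corresponds to chance ranking and 1.0 to perfect ranking, which is why an increase from 0.80 to 0.81 represents only a small gain in discrimination.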
Changes in measures of discrimination with the addition of variables for the composite of cardiovascular death and HF hospitalization are shown in Figure 2c, Figure 3c, and Supplemental Table 4. In the overall cohort, baseline clinical information alone yielded a C-index of 0.68, increasing to 0.74 for the full model. The NRI improved from 0.009 with inclusion of the KCCQ score to 0.12 for the full model. In the subset of patients with NT-proBNP levels available, inclusion of clinical, KCCQ, and laboratory information yielded a C-index of 0.75, increasing to a maximum of 0.76 with the inclusion of additional information. The NRI increased from 0.138 for the model with clinical, KCCQ, and laboratory information, to 0.172 for the full model. When variables were examined on an individual basis, NT-proBNP was the strongest individual predictor for all clinical outcomes when it was included in the modeling, with the highest χ2 for all clinical outcomes (Supplemental Table 5). In the absence of NT-proBNP values, exercise, patient symptoms, and echocardiographic variables showed strong individual prognostic value (Supplemental Table 5).
We performed a sensitivity analysis by considering the 6MWD prior to laboratory information, as this assessment is theoretically cheaper and easier to perform in the outpatient setting (Supplemental Tables 6–8). For the composite of all-cause mortality and hospitalization, the NRI improved from 0.017 for baseline clinical information, KCCQ, and 6MWD, to 0.065 with laboratory data, and 0.109 for the overall set of variables. When this analysis was performed in the patients with available NT-proBNP levels, the increase in NRI was more dramatic (0.016→0.130), with modest increases for the full model (0.162). The C-index increased from 0.64 for the baseline clinical model+KCCQ+6MWD to 0.67 with laboratory information (including NT-proBNP), and did not increase further with inclusion of other variables. For the outcome of all-cause mortality, the increase in NRI was similarly higher in the cohort with available NT-proBNP levels (clinical information+KCCQ+6MWD=0.014; +laboratories=0.085; full=0.094) than in those without (clinical information+KCCQ+6MWD=0.007; +laboratories=0.056; full=0.079). For the outcome of cardiovascular death and HF hospitalization, similar trends were noted, with large increases in C-statistics and NRI after inclusion of laboratory data with NT-proBNP.
Figure 4 depicts the additional costs of each set of assessments, based on Medicare and Medicaid reimbursement data, with the clinical assessment and quality of life survey as the referent. All assessments together were close to $500; the largest increases in cost were related to echocardiography and exercise testing. The addition of NT-proBNP increased the cost of each assessment by approximately $50. Of note, costs would be expected to vary considerably based on geographic region and insurance status, and true costs may be higher in clinical practice.
We examined the incremental prognostic value of sequential clinical assessments in a large cohort of ambulatory chronic systolic HF patients. We found that the addition of testing beyond clinical and laboratory assessments, such as echocardiography and cardiopulmonary exercise testing, yielded diminishing gains in the C-index and measures of appropriate risk reclassification, especially with the addition of NT-proBNP levels. In the absence of contemporary guideline-based therapeutic recommendations for chronic HF relying on a detailed quantification of individual patient risk, the clinical benefit of increasingly complex testing, when done for purposes of prognostication, should be carefully weighed against potential discomfort to the patient and escalation in overall cost of care.
Our findings have several important clinical implications. First, to our knowledge, our study is the first to consider the prognostic value of evaluations in chronic HF patients with regard to how these evaluations are typically performed in the outpatient setting, beginning with a comprehensive clinical evaluation, followed by laboratory, echocardiography, and cardiopulmonary exercise testing. Prior multivariable risk prediction models have treated variables based on their individual prognostic merit without consideration of each variable's place in the assessment timeline (i.e., whether or not a variable might belong to a “bundle” of assessments like sodium, blood urea nitrogen, and creatinine). Prior risk prediction models, such as those derived from the Candesartan in Heart Failure-Assessment of Reduction in Mortality and Morbidity (CHARM) and Controlled Rosuvastatin Multinational Study in Heart Failure (CORONA) trials, include variables that are not commonly assessed in chronic HF patients, but yield similar values to ours for C-statistics predicting all-cause mortality (Supplemental Table 9). Furthermore, a recent pooled analysis of 39,372 patients from several cohort studies derived a prediction model that included mostly clinical variables and required assessment of LVEF, but did not include natriuretic peptide levels, which are an AHA/American College of Cardiology (ACC) Class 1 recommendation for the prognostic evaluation of chronic HF.4 This is a key point as, in our analysis, much of the improvement in prognostic ability was from inclusion of the natriuretic peptide levels.
Second, we believe our findings draw attention to the possibility that increasing the cost and complexity of HF evaluations may only provide modest gains in prognostic predictive ability. None of the AHA/ACC guidelines for the management of stable chronic systolic HF require a precise assessment of patient prognosis; rather, treatment recommendations are based on broad classifications such as NYHA class, LVEF cut-points, and HF stage.1 Nonetheless, improving prognostic evaluations of chronic HF patients is the subject of intense research activity, with novel imaging methodologies, biomarkers, and multivariable risk scores being introduced on a regular basis. The uptake of these new assessments relies on the assumption that risk stratification may help guide therapeutic decision-making by identifying those patients needing more intensive monitoring and therapy.1,4,12 Yet to our knowledge, it remains unclear if patients with a higher estimated risk would benefit from more intensive treatment, particularly since in certain cases (such as implantable cardioverter defibrillator use), more aggressive treatment of sicker patients may lead to more harm than benefit.13
Third, beyond the unclear capacity for aiding therapeutic decision-making, our findings highlight the notion that diagnostic testing in HF, for the purposes of improved prognostication, carries potential economic implications. Recognizing the importance of economic considerations, the AHA and ACC have indicated that they will begin to use cost data in their clinical practice guidelines and performance standards to rate the value of treatments.14 To illustrate the significance of economic factors in the field of cardiology, we can consider the rapid growth in reimbursements for cardiac imaging from the Centers for Medicare & Medicaid Services over the last decade, primarily from large increases in utilization.15 However, despite the exceptional monetary cost, it remains unclear to what extent the additional information provided by further testing in HF adds value to therapeutic decision-making. Furthermore, with the rapid growth of HF prevalence internationally, populations with fewer resources will need to identify more cost-effective methods of managing chronic HF, rather than recapitulating or building on current practices. Our study suggests that a comprehensive history, physical exam, assessment of quality of life, and laboratory testing, especially testing that includes natriuretic peptide levels, may provide similar prognostic value to that of a full assessment including assessments of left ventricular function and exercise testing.
Our study has several limitations that need careful consideration. First, our findings only pertain to the incremental prognostic value of clinical testing; there are numerous other reasons why testing may be performed on patients with HF to guide therapeutic decision making (e.g., timing of surgery). Second, exercise testing in HF-ACTION was not performed for ischemia evaluation, which is the most frequent reason why it is performed in the clinical setting. Third, patients qualifying for the HF-ACTION study required an LVEF of <35% at the outset; therefore, the patient population was already known to have systolic HF prior to the study echocardiogram. Fourth, all testing was done for research purposes rather than for clinical care. While this is a limitation of the majority of clinical trials, we cannot draw firm conclusions about how this information might have different clinical implications outside of a study setting, where the available data are often incomplete. Our study population also consisted of only ambulatory patients with impaired ejection fraction (LVEF <35%), so our results may not be generalizable to the general population of HF patients. Fifth, since we only considered prognosis with regard to pre-specified outcomes used in the original study, observed associations may vary when considering additional clinical endpoints. Sixth, there was far greater improvement in net reclassification indices among patients without events than in those with events, indicating that additional data may improve discrimination more in this group of patients. Lastly, these findings should be considered hypothesis generating, and require replication and validation in additional cohorts of chronic HF patients.
In conclusion, we examined the incremental prognostic value of sequential clinical assessments in a large cohort of well-treated chronic systolic HF patients. Clinical data and readily available laboratory tests may provide similar prognostic information to more complex models that include assessments of echocardiographic and exercise variables. In the absence of clear linkage between precise estimation of risk and specific therapies, the marginal benefit of increasingly complex prognostic evaluations should be weighed against potential risk to the patient and escalation in costs of care.
The authors would like to thank Erin Hanley, MS, for her editorial contributions to this manuscript. Ms. Hanley did not receive compensation for her contributions, apart from her employment at the institution where this study was conducted.
Sources of Funding
Dr. Ahmad received support from the Daland Fellowship in Clinical Investigation. The HF-ACTION study was funded by grants from the NHLBI.
T Ahmad: Dr. Ahmad has received support from the Daland Fellowship in Clinical Investigation and has served as a consultant for Roche Diagnostics.
EC O'Brien: Dr. O'Brien TBD
P Schulte: Dr. Schulte TBD
S Stevens: Ms. Stevens TBD
M Fiuzat: Dr. Fiuzat has served as a consultant for Roche Diagnostics and has received research funding from Roche Diagnostics, BG Medicine, and Critical Diagnostics.
D Kitzman: Dr. Kitzman serves as a consultant for Relypsa, Inc.
KF Adams: Dr. Adams has received research funding from Roche Diagnostics and Critical Diagnostics and has served as a consultant for Roche Diagnostics.
WE Kraus: Dr. Kraus TBD
IL Piña: Dr. Piña is a consultant for General Electric and Novartis.
MP Donahue: Dr. Donahue TBD
F Zannad: Dr. Zannad has received research funding from BG Medicine and Roche Diagnostics and has served as a consultant for BG Medicine.
DJ Whellan: Dr. Whellan TBD
C O'Connor: Dr. O'Connor has served as a consultant for Roche Diagnostics and has received research funding from Roche Diagnostics, BG Medicine, and Critical Diagnostics.
GM Felker: Dr. Felker has received research funding from Roche Diagnostics, BG Medicine, and Critical Diagnostics and has served as a consultant for Roche Diagnostics and Singulex.