Models for risk prediction are widely used in clinical practice to stratify patients by risk and assign treatment strategies. The contribution of new biomarkers has largely been based on the area under the receiver operating characteristic curve, but this measure can be insensitive to important changes in absolute risk. Methods based on risk stratification have recently been proposed to compare predictive models. These include the reclassification calibration statistic, the net reclassification improvement (NRI), and the integrated discrimination improvement (IDI). This work demonstrates the use of reclassification measures and illustrates their performance for well-known cardiovascular risk predictors in a cohort of women. These measures are targeted at evaluating the potential of new models and markers to change risk strata and alter treatment decisions.
The discovery and development of new biomarkers continues to be an exciting and promising field. Improvement of prediction of risk of developing disease is one of the key motivations in these pursuits. Appropriate statistical measures are necessary for drawing meaningful conclusions about the clinical usefulness of these new markers. In this review, we present several novel metrics proposed to serve this purpose. We use reclassification tables constructed based on clinically meaningful disease risk categories to discuss the concepts of calibration, risk separation, risk discrimination, and risk classification accuracy. We discuss the notion that the net reclassification improvement is a simple yet informative way to summarize information contained in risk reclassification tables. In the absence of meaningful risk categories, we suggest a ‘category-less’ version of the net reclassification improvement and integrated discrimination improvement as metrics to summarize the incremental value of new biomarkers. We also suggest that predictiveness curves be preferred to receiver-operating-characteristic curves as visual descriptors of a statistical model’s ability to separate predicted probabilities of disease events. Reporting of standard metrics, including measures of relative risk and the c statistic, is still recommended. These concepts are illustrated with a risk prediction example using data from the Framingham Heart Study.
reclassification; risk prediction; NRI; IDI; calibration; discrimination
For comparing the performance of a baseline risk prediction model with one that includes an additional predictor, a risk reclassification analysis strategy has been proposed. The first step is to cross-classify risks calculated according to the 2 models for all study subjects. Summary measures including the percentage of reclassification and the percentage of correct reclassification are calculated, along with 2 reclassification calibration statistics. The author shows that interpretations of the proposed summary measures and P values are problematic. The author's recommendation is to display the reclassification table, because it shows interesting information, but to use alternative methods for summarizing and comparing model performance. The Net Reclassification Index has been suggested as one alternative method. The author argues for reporting components of the Net Reclassification Index because they are more clinically relevant than is the single numerical summary measure.
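The cross-classification step described above can be made concrete with a small sketch: tally movements between risk strata separately for events and non-events, and report the two NRI components alongside their sum. The single 10% cut-off and the subject-level risks below are hypothetical, chosen only for illustration.

```python
# Sketch of a two-category reclassification analysis (hypothetical data).
THRESHOLD = 0.10

def category(risk):
    """Assign a risk stratum: 0 = low (<10%), 1 = high (>=10%)."""
    return 0 if risk < THRESHOLD else 1

def nri_components(baseline, extended, outcomes):
    """Return (event NRI, non-event NRI); their sum is the overall NRI."""
    up_e = down_e = n_ev = 0
    up_n = down_n = n_ne = 0
    for b, e, y in zip(baseline, extended, outcomes):
        move = category(e) - category(b)  # +1 = moved up, -1 = moved down
        if y == 1:
            n_ev += 1
            up_e += move > 0      # upward moves help events
            down_e += move < 0
        else:
            n_ne += 1
            up_n += move > 0
            down_n += move < 0    # downward moves help non-events
    return (up_e - down_e) / n_ev, (down_n - up_n) / n_ne

# Hypothetical predicted risks for 8 subjects (outcome 1 = event occurred).
base = [0.05, 0.08, 0.12, 0.20, 0.04, 0.09, 0.15, 0.11]
extd = [0.12, 0.15, 0.09, 0.25, 0.03, 0.07, 0.18, 0.08]
ys   = [1, 1, 1, 1, 0, 0, 0, 0]

ev_nri, ne_nri = nri_components(base, extd, ys)
print(f"event NRI = {ev_nri:+.3f}, non-event NRI = {ne_nri:+.3f}, "
      f"overall NRI = {ev_nri + ne_nri:+.3f}")
```

With these numbers, two of four events move up and one moves down (event NRI = 0.25), while one of four non-events moves down (non-event NRI = 0.25), for an overall NRI of 0.50; reporting the components separately, as the author recommends, shows where the improvement comes from.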
biological markers; diagnosis; epidemiologic methods; prognosis; risk model
The performance of prediction models can be assessed using a variety of different methods and metrics. Traditional measures for binary and survival outcomes include the Brier score to indicate overall model performance, the concordance (or c) statistic for discriminative ability (or area under the receiver operating characteristic (ROC) curve), and goodness-of-fit statistics for calibration.
Several new measures have recently been proposed that can be seen as refinements of discrimination measures, including variants of the c statistic for survival, reclassification tables, net reclassification improvement (NRI), and integrated discrimination improvement (IDI). Moreover, decision-analytic measures have been proposed, including decision curves that plot the net benefit achieved by making decisions based on model predictions.
We aimed to define the role of these relatively novel approaches in the evaluation of the performance of prediction models. For illustration we present a case study of predicting the presence of residual tumor versus benign tissue in patients with testicular cancer (n=544 for model development, n=273 for external validation).
We suggest that reporting discrimination and calibration will always be important for a prediction model. Decision-analytic measures should be reported if the predictive model is to be used for making clinical decisions. Other measures of performance may be warranted in specific applications, such as reclassification metrics to gain insight into the value of adding a novel predictor to an established model.
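As a point of reference for the traditional measures named in this entry, both the Brier score and the c statistic can be computed directly from predicted risks and observed binary outcomes. The predictions and outcomes below are hypothetical, chosen only to illustrate the definitions.

```python
# Minimal implementations of two traditional performance measures.
from itertools import product

def brier_score(probs, outcomes):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

def c_statistic(probs, outcomes):
    """Probability that a random event outranks a random non-event
    (ties count one half): the area under the ROC curve."""
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    nonevents = [p for p, y in zip(probs, outcomes) if y == 0]
    wins = sum((pe > pn) + 0.5 * (pe == pn)
               for pe, pn in product(events, nonevents))
    return wins / (len(events) * len(nonevents))

probs = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]   # hypothetical predicted risks
ys    = [1,   1,   1,   0,   0,   0]     # observed outcomes

print(f"Brier score = {brier_score(probs, ys):.3f}")
print(f"c statistic = {c_statistic(probs, ys):.3f}")
```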
Concerns have been raised about the use of traditional measures of model fit in evaluating risk prediction models for clinical use, and reclassification tables have been suggested as an alternative means of assessing the clinical utility of a model. Several measures based on the table have been proposed, including the reclassification calibration (RC) statistic, the net reclassification improvement (NRI), and the integrated discrimination improvement (IDI), but the performance of these in practical settings has not been fully examined. We used simulations to estimate the type I error and power for these statistics in a number of scenarios, as well as the impact of the number and type of categories, when adding a new marker to an established or reference model. The type I error was found to be reasonable in most settings, and power was highest for the IDI, which was similar to the test of association. The relative power of the RC statistic, a test of calibration, and the NRI, a test of discrimination, varied depending on the model assumptions. These tools provide unique but complementary information.
Calibration; Discrimination; Model accuracy; Prediction; Reclassification
Net reclassification and integrated discrimination improvements have been proposed as alternatives to the increase in the AUC for evaluating improvement in the performance of risk assessment algorithms introduced by the addition of new phenotypic or genetic markers. In this paper, we demonstrate that in the setting of linear discriminant analysis, under the assumptions of multivariate normality, all three measures can be presented as functions of the squared Mahalanobis distance. This relationship affords an interpretation of the magnitude of these measures in the familiar language of effect size for uncorrelated variables. Furthermore, it allows us to conclude that net reclassification improvement can be viewed as a universal measure of effect size. Our theoretical developments are illustrated with an example based on the Framingham Heart Study risk assessment model for high-risk men in primary prevention of cardiovascular disease.
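The univariate special case offers a quick sanity check on this kind of relationship: with normally distributed scores of equal variance in events and non-events, the Mahalanobis distance reduces to the standardized mean difference, and the classical binormal identity AUC = Φ(Δ/√2) holds. The simulation below uses synthetic data (not the Framingham example) to compare the empirical AUC against that formula.

```python
# Synthetic check of the binormal identity AUC = Phi(delta / sqrt(2)).
import math, random

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def empirical_auc(events, nonevents):
    """Mann-Whitney estimate of the AUC from pooled ranks (no ties expected
    for continuous scores)."""
    pooled = [(x, 1) for x in events] + [(x, 0) for x in nonevents]
    pooled.sort()
    rank_sum = sum(rank for rank, (_, label) in enumerate(pooled, start=1)
                   if label == 1)
    n1, n0 = len(events), len(nonevents)
    return (rank_sum - n1 * (n1 + 1) / 2) / (n1 * n0)

random.seed(1)
delta = 1.0  # Mahalanobis distance: standardized mean difference with sigma = 1
events    = [random.gauss(delta, 1.0) for _ in range(20000)]
nonevents = [random.gauss(0.0,  1.0) for _ in range(20000)]

theory = phi(delta / math.sqrt(2.0))
print(f"empirical AUC = {empirical_auc(events, nonevents):.3f}, "
      f"binormal formula = {theory:.3f}")
```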
AUC; biomarker; c statistic; model performance; risk prediction; ROC
Purpose of review
We discuss two data analysis issues for studies that use binary clinical outcomes (whether or not an event occurred): the choice of an appropriate scale and transformation when biomarkers are evaluated as explanatory factors in logistic regression; and assessing the ability of biomarkers to improve prediction accuracy for event risk.
Biomarkers with skewed distributions should be transformed before they are included as continuous covariates in logistic regression models. The utility of new biomarkers may be assessed by measuring the improvement in predicting event risk after adding the biomarkers to an existing model. The area under the receiver operating characteristic (ROC) curve (C-statistic) is often cited; it was developed for a different purpose, however, and may not address the clinically relevant questions. Measures of risk reclassification and risk prediction accuracy may be more appropriate.
The appropriate analysis of biomarkers depends on the research question. Odds ratios obtained from logistic regression describe associations of biomarkers with clinical events; failure to accurately transform the markers, however, may result in misleading estimates. Whilst the C-statistic is often used to assess the ability of new biomarkers to improve the prediction of event risk, other measures may be more suitable.
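As a minimal sketch of the transformation step discussed above, the snippet below uses synthetic log-normally distributed values standing in for a right-skewed biomarker and applies a moment-based skewness check before and after log transformation.

```python
# Synthetic illustration: log-transforming a skewed biomarker.
import math, random

def skewness(xs):
    """Moment-based sample skewness."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return sum((x - mean) ** 3 for x in xs) / n / var ** 1.5

random.seed(7)
raw = [math.exp(random.gauss(0.0, 1.0)) for _ in range(5000)]  # right-skewed
logged = [math.log(x) for x in raw]                            # symmetric after log

print(f"skewness: raw = {skewness(raw):.2f}, "
      f"log-transformed = {skewness(logged):.2f}")
```

A covariate this skewed would distort odds ratio estimates if entered untransformed into a logistic model; after the log transform its distribution is approximately symmetric.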
biomarker analysis; odds ratio; ROC curve; risk prediction accuracy; C-statistic
Mortality among patients with heart failure (HF) is high. Though individual biomarkers have been investigated to determine their value in mortality risk prediction, the role of a multimarker strategy requires further evaluation.
Methods and Results
Olmsted County residents presenting with HF from July 2004 to September 2007 were recruited to undergo biomarker measurement. We investigated whether addition of C-reactive protein (CRP), B-type natriuretic peptide (BNP), and troponin T (TnT) to a model including established risk indicators improved 1-year mortality risk prediction using the c statistic, integrated discrimination improvement (IDI), and net reclassification improvement (NRI). Among 593 participants, the mean age was 76.4 years and 48% were men. After 1 year of follow-up, 122 (20.6%) participants had died. Patients with CRP, BNP, and TnT below the median (<11.8 mg/L, <350 pg/mL, and ≤0.01 ng/mL, respectively) had low 1-year mortality (3.3%), while those with two or three biomarkers above the median had markedly increased mortality (30.8% and 35.5%, respectively). The addition of two or more biomarkers to the model offered greater improvement in 1-year mortality risk prediction than use of a single biomarker. The combination of CRP and BNP resulted in an increase in the c statistic from 0.757 to 0.810 (p<0.001), an IDI gain of 7.1% (p<0.001), and an NRI of 22.1% (p<0.001). Use of all three biomarkers offered no incremental gain (IDI gain 0.7% vs. CRP+BNP, p=0.065).
Biomarkers improved 1-year mortality risk prediction beyond established indicators. The use of a two-biomarker combination was superior to a single biomarker in risk prediction, though addition of a third biomarker conferred no added benefit.
epidemiology; heart failure; prognosis; inflammation; community
This study compares inflammation-related biomarkers with established cardiometabolic risk factors in the prediction of incident type 2 diabetes and incident coronary events in a prospective case-cohort study within the population-based MONICA/KORA Augsburg cohort.
Methods and Findings
Analyses for type 2 diabetes are based on 436 individuals with and 1410 individuals without incident diabetes. Analyses for coronary events are based on 314 individuals with and 1659 individuals without incident coronary events. Mean follow-up times were almost 11 years. Areas under the receiver-operating characteristic curve (AUC), changes in Akaike's information criterion (ΔAIC), integrated discrimination improvement (IDI) and net reclassification index (NRI) were calculated for different models. A basic model consisting of age, sex and survey predicted type 2 diabetes with an AUC of 0.690. Addition of 13 inflammation-related biomarkers (CRP, IL-6, IL-18, MIF, MCP-1/CCL2, IL-8/CXCL8, IP-10/CXCL10, adiponectin, leptin, RANTES/CCL5, TGF-β1, sE-selectin, sICAM-1; all measured in nonfasting serum) increased the AUC to 0.801, whereas addition of cardiometabolic risk factors (BMI, systolic blood pressure, ratio total/HDL-cholesterol, smoking, alcohol, physical activity, parental diabetes) increased the AUC to 0.803 (ΔAUC [95% CI] 0.111 [0.092–0.149] and 0.113 [0.093–0.149], respectively, compared to the basic model). The combination of all inflammation-related biomarkers and cardiometabolic risk factors yielded a further increase in AUC to 0.847 (ΔAUC [95% CI] 0.044 [0.028–0.066] compared to the cardiometabolic risk model). Corresponding AUCs for incident coronary events were 0.807, 0.825 (ΔAUC [95% CI] 0.018 [0.013–0.038] compared to the basic model), 0.845 (ΔAUC [95% CI] 0.038 [0.028–0.059] compared to the basic model) and 0.851 (ΔAUC [95% CI] 0.006 [0.003–0.021] compared to the cardiometabolic risk model), respectively.
Inclusion of multiple inflammation-related biomarkers into a basic model and into a model including cardiometabolic risk factors significantly improved the prediction of type 2 diabetes and coronary events, although the improvement was less pronounced for the latter endpoint.
Many novel and emerging risk factors exhibit a significant association with cardiovascular disease, but have not been found to improve risk prediction. Statistical criteria used to evaluate such models and markers have largely relied on the receiver operating characteristic curve, which is an insensitive measure of improvement. Recently, new methods have been developed based on risk reclassification, or changes in risk strata following use of a new marker or model. Associated measures based on both calibration and discrimination have been proposed. This review describes previous methods used to evaluate models as well as the newly developed methods to evaluate clinical utility.
Rigorous statistical evaluation of the predictive value of novel biomarkers is critical before they are applied in routine standard care. It is important to identify factors that influence the performance of a biomarker in order to determine the optimal conditions for test performance. We propose a covariate-specific time-dependent PPV curve to quantify the predictive accuracy of a prognostic marker measured on a continuous scale and with censored failure time outcome. The covariate effect is accommodated within a semiparametric regression model framework. In particular, we adopt a smoothed survival time regression technique (Dabrowska, 1997) to account for the situation where risk for disease occurrence and progression is likely to change over time. In addition, we provide asymptotic distribution theory and resampling-based procedures for making statistical inference on the covariate-specific positive predictive values. We illustrate our approach with numerical studies and a dataset from a prostate cancer study.
Biomarker evaluation; Negative predictive value; Positive predictive value; Semi-parametric survival analysis
To determine whether erectile dysfunction (ED) predicts cardiovascular disease (CVD) beyond traditional risk factors.
ED and CVD share pathophysiological mechanisms and often co-occur. It is unknown whether ED improves the prediction of CVD beyond traditional risk factors.
This was a prospective, population-based study of 1,709 men (of 3,258 eligible) aged 40–70 years. ED was measured by self-report. Subjects were followed for CVD for an average follow-up of 11.7 years. The association between ED and CVD was examined using the Cox proportional hazards regression model. The discriminatory capability of ED was examined using c statistics. The reclassification of CVD risk associated with ED was assessed using a method that quantifies net reclassification improvement.
1,057 men with complete risk factor data who were free of CVD and diabetes at baseline were included. During follow-up, 261 new cases of CVD occurred. ED was associated with CVD incidence after controlling for age (hazard ratio (HR) 1.42, 95% confidence interval (CI): 1.05–1.90), for age and traditional CVD risk factors (HR 1.41, 95% CI: 1.05–1.90), and for age and Framingham risk score (HR 1.40, 95% CI: 1.04–1.88). Despite these significant findings, ED did not significantly improve the prediction of CVD incidence beyond traditional risk factors.
Independent of established CVD risk factors, ED is significantly associated with increased CVD incidence. Nonetheless, ED does not improve the prediction of who will and will not develop CVD beyond that offered by traditional risk factors.
Aging; erectile dysfunction; cardiovascular disease; longitudinal studies; men
New markers may improve prediction of diagnostic and prognostic outcomes. We review various measures to quantify the incremental value of markers over standard, readily available characteristics. Widely used traditional measures include the improvement in model fit or in the area under the receiver operating characteristic (ROC) curve (AUC). New measures include the net reclassification index (NRI) and decision-analytic measures, such as the fraction of true positive classifications penalized for false positive classifications (‘net benefit’, NB).
For illustration we discuss a case study on the presence of residual tumor versus benign tissue in 544 patients with testicular cancer. We assessed 3 tumor markers (AFP, HCG, and LDH) for their incremental value over currently standard clinical predictors. AUC and R2 values suggested adding continuous LDH and AFP, whereas NB favored only HCG as a potentially promising marker at a clinically defensible decision threshold of 20% risk. Results based on the NRI fell in the middle, suggesting reclassification potential of all three markers.
We conclude that improvement in standard discrimination measures, which focus on finding variables that might be promising across all decision thresholds, may not detect the most informative markers at a specific threshold of particular clinical relevance. When a marker is intended to support decision making, calculating the improvement in a decision-analytic measure, such as NB, is preferable to an overall judgment as obtained from the AUC in ROC analysis.
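A minimal sketch of the net benefit calculation at a single threshold, following the weighting described in this entry (true positives credited, false positives penalized by the threshold odds). The predicted risks, outcomes, and the 20% threshold below are all hypothetical.

```python
# Net benefit at a single decision threshold (hypothetical data).
def net_benefit(risks, outcomes, threshold):
    """NB = TP/n - FP/n * threshold/(1 - threshold), where a predicted
    risk at or above the threshold counts as a positive classification."""
    n = len(risks)
    tp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    fp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

outcomes = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
baseline = [0.35, 0.15, 0.25, 0.30, 0.10, 0.05, 0.22, 0.08, 0.12, 0.18]
extended = [0.40, 0.28, 0.26, 0.18, 0.09, 0.04, 0.21, 0.07, 0.11, 0.16]

for name, risks in (("baseline", baseline), ("extended", extended)):
    print(f"{name}: net benefit at 20% = {net_benefit(risks, outcomes, 0.20):+.3f}")
```

In this toy example the extended model gains net benefit at the 20% threshold by flagging one more event and one fewer non-event; sweeping the threshold over a range of values yields a decision curve.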
prediction; logistic regression model; performance measures; incremental value
To assess the value of a continuous marker in predicting the risk of a disease, a graphical tool called the predictiveness curve has been proposed. It characterizes the marker’s predictiveness, or capacity to risk stratify the population by displaying the distribution of risk endowed by the marker. Methods for making inference about the curve and for comparing curves in a general population have been developed. However, knowledge about a marker’s performance in the general population only is not enough. Since a marker’s effect on the risk model and its distribution can both differ across subpopulations, its predictiveness may vary when applied to different subpopulations. Moreover, information about the predictiveness of a marker conditional on baseline covariates is valuable for individual decision making about having the marker measured or not. Therefore, to fully realize the usefulness of a risk prediction marker, it is important to study its performance conditional on covariates. In this article, we propose semiparametric methods for estimating covariate-specific predictiveness curves for a continuous marker. Unmatched and matched case-control study designs are accommodated. We illustrate application of the methodology by evaluating serum creatinine as a predictor of risk of renal artery stenosis.
To date, the only established model for assessing risk for nasopharyngeal carcinoma (NPC) relies on the sero-status of the Epstein-Barr virus (EBV). By contrast, the risk assessment models proposed here include environmental risk factors, family history of NPC, and information on genetic variants. The models were developed using epidemiological and genetic data from a large case-control study, which included 1,387 subjects with NPC and 1,459 controls of Cantonese origin. The predictive accuracy of the models was then assessed by calculating the area under the receiver-operating characteristic curves (AUC). To compare the discriminatory improvement of models with and without genetic information, we estimated the net reclassification improvement (NRI) and integrated discrimination index (IDI). Well-established environmental risk factors for NPC include consumption of salted fish and preserved vegetables and cigarette smoking (in pack years). The environmental model alone shows modest discriminatory ability (AUC = 0.68; 95% CI: 0.66, 0.70), which is only slightly increased by the addition of data on family history of NPC (AUC = 0.70; 95% CI: 0.68, 0.72). With the addition of data on genetic variants, however, our model’s discriminatory ability rises to 0.74 (95% CI: 0.72, 0.76). The improvements in NRI and IDI also suggest the potential usefulness of considering genetic variants when screening for NPC in endemic areas. If these findings are confirmed in larger cohort and population-based case-control studies, use of the new models to analyse data from NPC-endemic areas could well lead to earlier detection of NPC.
Fracture prediction models help identify individuals at high risk who may benefit from treatment. The area under the curve (AUC) is used to compare prediction models. However, the AUC has limitations and may miss important differences between models. Novel reclassification methods quantify how accurately models classify patients who benefit from treatment and the proportion of patients above/below treatment thresholds. We applied two reclassification methods, using the NOF treatment thresholds, to compare two risk models: femoral neck BMD and age (“simple model”) and FRAX (“FRAX model”).
The Pepe method classifies subjects by case/non-case status and examines the proportion of each group above and below the thresholds. The Cook method examines fracture rates above and below the thresholds. We applied both methods to the Study of Osteoporotic Fractures.
There were 6036 participants (1037 fractures) and 6232 participants (389 fractures) with complete data for major osteoporotic and hip fracture, respectively. The two models had similar AUCs for major osteoporotic fracture (0.68 vs. 0.69) and hip fracture (0.75 vs. 0.76). In contrast, using reclassification methods, each model classified a substantial number of women differently. Using the Pepe method, the FRAX model (vs. the simple model) missed treating 70 (7%) cases of major osteoporotic fracture but avoided treating 285 (6%) non-cases. For hip fracture, the FRAX model missed treating 31 (8%) cases but avoided treating 1026 (18%) non-cases. With the Cook method, fracture rates above and below the treatment thresholds were similar for both models and both fracture outcomes.
Compared with the AUC, new methods provide more detailed information about how models classify patients.
hip fracture; major osteoporotic fracture; FRAX; BMD; prediction
There are two popular statistical approaches to biomarker evaluation. One models the risk of disease (or disease outcome) with, for example, logistic regression. A marker is considered useful if it has a strong effect on risk. The second evaluates classification performance by use of measures such as sensitivity, specificity, predictive values, and receiver operating characteristic curves. There is controversy about which approach is more appropriate. Moreover, the two approaches can give contradictory results on the same data. The authors present a new graphic, the predictiveness curve, which complements the risk modeling approach. It assesses the usefulness of a risk model when applied to the population. Although the predictiveness curve relates to classification performance measures, it also displays essential information about risk that is not displayed by the receiver operating characteristic curve. The authors propose that the predictiveness and classification performance of a marker, displayed together in an integrated plot, provide a comprehensive and cohesive assessment of a risk marker or model. The methods are demonstrated with data on prostate-specific antigen and risk factors from the Prostate Cancer Prevention Trial, 1993–2003.
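A predictiveness curve is, in essence, the quantile function of the predicted risks: it plots the risk value R(v) attained at each cumulative population proportion v. A minimal sketch of the points behind such a curve, using hypothetical risk values:

```python
# Points of a predictiveness curve from hypothetical predicted risks.
def predictiveness_points(risks):
    """Return (v, R(v)) pairs: the sorted predicted risks against their
    empirical quantile levels v = 1/n, 2/n, ..., 1."""
    ordered = sorted(risks)
    n = len(ordered)
    return [((i + 1) / n, r) for i, r in enumerate(ordered)]

risks = [0.02, 0.40, 0.10, 0.05, 0.65, 0.20, 0.08, 0.30]
prevalence = sum(risks) / len(risks)  # a well-calibrated model averages to prevalence

for v, r in predictiveness_points(risks):
    print(f"v = {v:.3f}  R(v) = {r:.2f}")
print(f"mean predicted risk = {prevalence:.3f}")
```

A flat curve near the prevalence line indicates a marker with little capacity to risk stratify the population, whereas a steep curve pushes many subjects toward very low or very high risk.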
biological markers; classification analysis; diagnostic tests, routine; epidemiologic methods; predictive value of tests; prostate-specific antigen; risk assessment; risk model
Appropriate quantification of added usefulness offered by new markers included in risk prediction algorithms is a problem of active research and debate. Standard methods, including statistical significance and c statistic are useful but not sufficient. Net reclassification improvement (NRI) offers a simple intuitive way of quantifying improvement offered by new markers and has been gaining popularity among researchers. However, several aspects of the NRI have not been studied in sufficient detail.
In this paper we propose a prospective formulation of the NRI that applies immediately to survival and competing-risk data and allows easy weighting with observed or perceived costs. We address the issue of the number and choice of categories and their impact on the NRI. We contrast the category-based NRI with a category-free version and conclude that NRIs cannot be compared across studies unless they are defined in the same manner. We discuss the impact of differing event rates when models are applied to different samples, or when definitions of events and durations of follow-up vary between studies. We also show how the NRI can be applied to case-control data. The concepts presented in the paper are illustrated with a Framingham Heart Study example.
In conclusion, NRI can be readily calculated for survival, competing risk, and case-control data, is more objective and comparable across studies using the category-free version, and can include relative costs for classifications. We recommend that researchers clearly define and justify the choices they make when choosing NRI for their application.
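Both the category-free NRI and the IDI can be computed without any risk strata, directly from the two models' predicted risks. A minimal sketch with hypothetical predictions from a baseline and an extended model (the IDI here is written as the change in discrimination slope):

```python
# Category-free NRI and IDI from hypothetical predicted risks.
def category_free_nri(baseline, extended, outcomes):
    """Net proportion of events whose predicted risk moves up, minus the
    net proportion of non-events moving up (any change counts)."""
    def net_up(pairs):
        ups = sum(1 for b, e in pairs if e > b)
        downs = sum(1 for b, e in pairs if e < b)
        return (ups - downs) / len(pairs)
    ev = [(b, e) for b, e, y in zip(baseline, extended, outcomes) if y == 1]
    ne = [(b, e) for b, e, y in zip(baseline, extended, outcomes) if y == 0]
    return net_up(ev) - net_up(ne)

def idi(baseline, extended, outcomes):
    """Change in discrimination slope: (mean risk in events minus mean risk
    in non-events), extended model minus baseline model."""
    def slope(risks):
        ev = [r for r, y in zip(risks, outcomes) if y == 1]
        ne = [r for r, y in zip(risks, outcomes) if y == 0]
        return sum(ev) / len(ev) - sum(ne) / len(ne)
    return slope(extended) - slope(baseline)

base = [0.10, 0.20, 0.30, 0.15, 0.05, 0.25]
extd = [0.18, 0.35, 0.28, 0.10, 0.04, 0.12]
ys   = [1, 1, 1, 0, 0, 0]

print(f"category-free NRI = {category_free_nri(base, extd, ys):+.3f}")
print(f"IDI = {idi(base, extd, ys):+.3f}")
```

Note that the category-free NRI ranges over [-2, 2], since the event and non-event components can each contribute up to 1.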
discrimination; model performance; NRI; risk prediction; biomarker
Although the area under the receiver operating characteristic (ROC) curve (AUC) is the most popular measure of the performance of prediction models, it has limitations, especially when it is used to evaluate the added discrimination of a new risk marker in an existing risk model. Pencina et al. (2008) proposed two indices, the net reclassification improvement (NRI) and integrated discrimination improvement (IDI), to supplement the improvement in the AUC (IAUC). Their NRI and IDI are based on binary outcomes in case-control settings and do not involve time-to-event outcomes. However, many disease outcomes are time-dependent and the onset time can be censored. Measuring the discriminatory potential of a prognostic marker without considering time to event can lead to biased estimates. In this paper, we extended the NRI and IDI to time-to-event settings and derived the corresponding sample estimators and asymptotic tests. Simulation studies showed that the time-dependent NRI and IDI have better performance than Pencina’s NRI and IDI for measuring the improved discriminatory power of a new risk marker in prognostic survival models.
Improved discrimination; Prognostic survival models; Time-dependent NRI; Time-dependent IDI
The predictiveness curve shows the population distribution of risk endowed by a marker or risk prediction model. It provides a means for assessing the model’s capacity for stratifying the population according to risk. Methods for making inference about the predictiveness curve have been developed using cross-sectional or cohort data. Here we consider inference based on case-control studies which are far more common in practice. We investigate the relationship between the ROC curve and the predictiveness curve. Insights about their relationship provide alternative ROC interpretations for the predictiveness curve and for a previously proposed summary index of it. Next the relationship motivates ROC based methods for estimating the predictiveness curve. An important advantage of these methods over previously proposed methods is that they are rank invariant. In addition they provide a way of combining information across populations that have similar ROC curves but varying prevalence of the outcome. We apply the methods to PSA, a marker for predicting risk of prostate cancer.
biomarker; classification; predictiveness curve; risk prediction; ROC curve; total gain
In a prospective cohort study, information on clinical parameters, tests and molecular markers is often collected. Such information is useful to predict patient prognosis and to select patients for targeted therapy. We propose a new graphical approach, the positive predictive value (PPV) curve, to quantify the predictive accuracy of prognostic markers measured on a continuous scale with censored failure time outcome. The proposed method highlights the need to consider both predictive values and the marker distribution in the population when evaluating a marker, and it provides a common scale for comparing different markers. We consider both semiparametric and nonparametric estimation procedures. In addition, we provide asymptotic distribution theory and resampling-based procedures for making statistical inference. We illustrate our approach with numerical studies and datasets from the Seattle Heart Failure Study.
Prognostic accuracy; Positive predictive value; Survival analysis
New prognostic models are traditionally evaluated using measures of discrimination and risk reclassification, but these do not take full account of the clinical and health economic context. We propose a framework for comparing prognostic models by quantifying the public health impact (net benefit) of the treatment decisions they support, assuming a set of predetermined clinical treatment guidelines. The change in net benefit is more clinically interpretable than changes in traditional measures and can be used in full health economic evaluations of prognostic models used for screening and allocating risk reduction interventions. We extend previous work in this area by quantifying net benefits in life years, thus linking prognostic performance to health economic measures; by taking full account of the occurrence of events over time; and by considering estimation and cross-validation in a multiple-study setting. The method is illustrated in the context of cardiovascular disease risk prediction using an individual participant data meta-analysis. We estimate the number of cardiovascular-disease-free life years gained when statin treatment is allocated based on a risk prediction model with five established risk factors instead of a model with just age, gender and region. We explore methodological issues associated with the multistudy design and show that cost-effectiveness comparisons based on the proposed methodology are robust against a range of modelling assumptions, including adjusting for competing risks.
net benefit; cost-effectiveness; cardiovascular disease; meta-analysis; competing risks; screening strategies
Several biological pathways are activated in ventricular remodeling and in overt heart failure (HF). There are no data, however, on the incremental utility of a parsimonious set of biomarkers (reflecting pathways implicated in HF) for predicting HF risk in the community.
Methods and Results
We related a multi-biomarker panel to the incidence of a first HF event in 2754 Framingham Heart Study participants (mean age 58 years; 54% women), who were free of HF and underwent routine assays for 6 biomarkers (C-reactive protein, plasminogen activator inhibitor-1, homocysteine, aldosterone-to-renin ratio, B-type natriuretic peptide [BNP], and urinary albumin-to-creatinine ratio [UACR]). We estimated model c-statistic, calibration and net reclassification improvement (NRI) to assess the incremental predictive usefulness of the biomarkers. We also related biomarkers to the incidence of non-ischemic HF in participants without prevalent coronary heart disease.
On follow-up (mean 9.4 years), 95 first HF events occurred (54 in men). In multivariable-adjusted models, the biomarker panel was significantly related to HF risk (p=0.00005). Upon backward elimination, BNP and UACR emerged as the key biomarkers predicting HF risk: hazard ratios (HR; confidence interval [CI]) per standard deviation increment in the log-marker were 1.52 (1.24–1.87) and 1.35 (1.11–1.66), respectively. BNP and UACR significantly improved the model c-statistic (CI) from 0.84 (0.80–0.88) in standard models to 0.86 (0.83–0.90), enhanced risk reclassification (NRI = 0.13; p=0.002), and were also independently associated with non-ischemic HF risk.
Using a multimarker strategy, we identified BNP and UACR as key risk factors for new-onset HF with incremental predictive utility over standard risk factors.
Biomarkers; heart failure; risk; prediction
The authors studied the incremental value of adding serum cystatin C or creatinine to the Framingham risk score variables (FRSVs) for the prediction of incident cardiovascular disease (CVD) among 6,653 adults without clinical CVD utilizing the Multi-Ethnic Study of Atherosclerosis (2000–2008). CVD events included coronary heart disease, heart failure, stroke, and peripheral arterial disease. Variables were transformed to yield optimal prediction of 6-year CVD events in sex-stratified models with FRSVs alone, FRSVs + cystatin C, and FRSVs + creatinine. Risk prediction in the 3 models was assessed by using the C statistic, and net reclassification improvement was calculated. The mean ages were 61.9 and 64.6 years for individuals with and without diabetes, respectively. After 6 years of follow-up, 447 (7.2%) CVD events occurred. In the total cohort, no significant change in the C statistic was noted with FRSVs + cystatin C and FRSVs + creatinine compared with FRSVs alone, and net reclassification improvement for CVD risk was extremely small and not significant with the addition of cystatin C or creatinine to FRSVs. Similar findings were noted after stratifying by baseline presence of diabetes. In conclusion, the addition of cystatin C or serum creatinine to FRSVs does not improve CVD risk prediction among adults without clinical CVD.
cardiovascular diseases; creatinine; cystatin C; risk model
To test if knowledge of type 2 diabetes genetic variants improves disease prediction.
RESEARCH DESIGN AND METHODS
We tested 40 single nucleotide polymorphisms (SNPs) associated with diabetes in 3,471 Framingham Offspring Study subjects followed over 34 years using pooled logistic regression models stratified by age (<50 years, diabetes cases = 144; or ≥50 years, diabetes cases = 302). Models included clinical risk factors and a 40-SNP weighted genetic risk score.
In people <50 years of age, the clinical risk factors model C-statistic was 0.908; the 40-SNP score increased it to 0.911 (P = 0.3; net reclassification improvement (NRI): 10.2%, P = 0.001). In people ≥50 years of age, the C-statistics without and with the score were 0.883 and 0.884 (P = 0.2; NRI: 0.4%). The risk per risk allele was higher in people <50 than in those ≥50 years of age (24% vs. 11%; P value for age interaction = 0.02).
Knowledge of common genetic variation appropriately reclassifies younger people for type 2 diabetes risk beyond clinical risk factors but not older people.