Study Concept and Design: Polonsky, McClelland, Greenland.
Acquisition of data: McClelland, Bild, Burke, Guerci, Greenland.
Analysis and interpretation of data: Polonsky, McClelland, Jorgensen, Bild, Burke, Guerci, Greenland.
Drafting of the manuscript: Polonsky, Greenland.
Critical revision of the manuscript for important intellectual content: McClelland, Jorgensen, Bild, Burke, Guerci.
Statistical analysis: McClelland, Jorgensen.
Obtained funding: Bild, Burke, Guerci, Greenland.
Administrative, technical or material support: Polonsky, McClelland, Jorgensen, Bild, Burke, Guerci, Greenland.
Study supervision: McClelland, Bild, Burke, Guerci, Greenland.
Context: Coronary artery calcium score (CACS) has been shown to predict future coronary heart disease (CHD) events. However, the extent to which adding CACS to traditional CHD risk factors improves classification of risk is unclear.
Objective: To determine whether adding CACS to a prediction model based on traditional risk factors improves classification of risk.
Design, setting, and participants: CACS was measured by computed tomography on 6,814 participants from the Multi-Ethnic Study of Atherosclerosis (MESA), a population-based cohort without known cardiovascular disease. Recruitment spanned July 2000 to September 2002; follow-up extended through May 2008. Participants with diabetes were excluded for the primary analysis. Five-year risk estimates for incident CHD were categorized as 0-<3%, 3-<10%, and ≥10% using Cox proportional hazards models. Model 1 used age, gender, tobacco use, systolic blood pressure, antihypertensive medication use, total and high-density lipoprotein cholesterol, and race/ethnicity. Model 2 used these risk factors plus CACS. We calculated the net reclassification improvement (NRI) and compared the distribution of risk using Model 2 versus Model 1.
Main outcome measure: Incident CHD events.
Results: Over 5.8 years median follow-up, 209 CHD events occurred, of which 122 were myocardial infarction, death from CHD, or resuscitated cardiac arrest. Model 2 resulted in significant improvements in risk prediction compared to Model 1 (NRI=0.25, 95% confidence interval 0.16-0.34, P<0.001). With Model 1, 69% of the cohort was classified in the highest or lowest risk categories, compared to 77% with Model 2. An additional 23% of those who experienced events were reclassified to high risk, and an additional 13% without events were reclassified to low risk using Model 2.
Conclusion: In the MESA cohort, addition of CACS to a prediction model based on traditional risk factors significantly improved the classification of risk and placed more individuals in the most extreme risk categories.
The coronary artery calcium score (CACS) has been shown in large prospective studies to be associated with the risk of future cardiovascular events.1-4 Recent data from the Multi-Ethnic Study of Atherosclerosis (MESA), a population-based cohort of individuals without known cardiovascular disease, showed that a CACS > 300 was associated with a hazard ratio of nearly 10 for future coronary heart disease (CHD) events.4 In addition, including CACS in a prediction model based on traditional risk factors significantly improved the prediction of future CHD events.
While these findings clearly demonstrated strong statistical association of CACS with cardiovascular risk, assessing the clinical value of new markers in risk prediction requires assessment of several additional measures.5 Further investigation should evaluate how closely the predicted probabilities of risk using the new marker reflect observed risk. In addition, Pencina et al recently introduced the concept of “net reclassification improvement” (NRI) which measures the extent to which people with and without events are appropriately reclassified into clinically accepted higher or lower risk categories with the addition of a new marker.6 The NRI therefore provides a method of quantifying the enhancement in clinically useful risk estimation when a novel marker is added to a standard risk prediction model. This new approach is rapidly being accepted as an important method for evaluating the clinical utility of new risk markers.7, 8
We evaluated the extent to which adding CACS to a model based on traditional risk factors correctly reclassifies participants in the MESA cohort in terms of risk for future CHD events. We determined how the addition of CACS to a prediction model changes the overall distribution of estimated risk. In contrast to previous studies that reported statistical associations only, we sought to clarify the potential utility of CAC as a tool for risk stratification.
The study design for MESA has been published elsewhere.9 In brief, MESA is a prospective cohort study of 6,814 people between the ages of 45 and 84 without known cardiovascular disease. Participants were recruited from July 2000 through September 2002, and identified themselves as white (38%), black (28%), Hispanic (22%), or Chinese (12%) at the time of enrollment. The study was approved by the institutional review boards of each site, and all participants gave written informed consent.
Carr et al. have reported the details of the MESA CT scanning and interpretation methods.10 Scanning centers assessed coronary calcium by chest CT with either a cardiac-gated electron-beam CT scanner (Chicago, Los Angeles, and New York Field Centers) or a multidetector CT system (Baltimore, Forsyth County, and St. Paul Field Centers). Certified technologists scanned all participants twice over phantoms of known physical calcium concentration. A radiologist or cardiologist read all CT scans at a central reading center (Los Angeles Biomedical Research Institute at Harbor–UCLA in Torrance, California). We used the average Agatston score for the 2 scans in all analyses.11 Intraobserver and interobserver agreements were excellent (kappa statistics, 0.93 and 0.90, respectively). The participants were told either that they had no coronary calcification or that the amount was less than average, average, or greater than average and that they should discuss the results with their physicians.
As part of the baseline examination, clinical teams collected information on traditional cardiovascular risk factors, including age, blood pressure and tobacco use (current, former or no prior use). Total and high-density lipoprotein (HDL) cholesterol, triglycerides, and plasma glucose were measured from blood samples obtained after a 12-hour fast. Using a Dinamap Pro 1000 automated oscillometric sphygmomanometer (Critikon), we measured resting blood pressure three times with the participant in the seated position. The average of the last two blood pressures was used.
For the primary analysis, 883 people with diabetes were excluded, as current National Cholesterol Education Program Guidelines consider diabetes a CHD risk-equivalent.12 Diabetes was defined as a fasting plasma glucose level greater than 126 mg per deciliter (7.0 mmol per liter) or a history of medical treatment for diabetes.
At intervals of 9 to 12 months, interviewers telephoned participants or a family member to inquire about interim hospital admissions, outpatient diagnoses of cardiovascular disease, and deaths. Follow-up for this analysis extended through May 2008. To verify self-reported diagnoses, trained personnel abstracted data from hospital records for an estimated 96% of hospitalized cardiovascular events; records were available for 95% of outpatient diagnostic encounters. Next of kin and physicians were interviewed for participants who experienced out-of-hospital cardiovascular deaths. Two physician members of the MESA mortality and morbidity review committee independently classified events and assigned incidence dates. If they disagreed, the full committee made the final classification. We classified CHD events as myocardial infarction (MI), death from CHD, resuscitated cardiac arrest, definite or probable angina followed by coronary revascularization, and definite angina not followed by coronary revascularization. Revascularizations that were not based on a diagnosis of angina were not included in the primary endpoint.
The diagnosis of MI was based on a combination of symptoms, electrocardiographic findings, and levels of circulating cardiac biomarkers. A death was considered related to CHD if it occurred within 28 days after an MI, if the participant had had chest pain within 72 hours before death, or if the participant had a history of CHD and there was no known nonatherosclerotic, noncardiac cause of death. Reviewers classified resuscitated cardiac arrest when a patient successfully recovered from a full cardiac arrest through cardiopulmonary resuscitation (including cardioversion). Adjudicators graded angina on the basis of their clinical judgment. A classification of definite or probable angina required clear and definite documentation of symptoms distinct from the diagnosis of MI. A classification of definite angina also required objective evidence of reversible myocardial ischemia or obstructive coronary artery disease. A more detailed description of the MESA follow-up methods is available at http://www.mesa-nhlbi.org.
Five-year estimated incident CHD risk was calculated for each participant using a Cox proportional hazards model. Model 1 used the standard Framingham risk factors (age, gender, smoking, systolic blood pressure, use of antihypertensive medications, HDL and total cholesterol) and race/ethnicity. Model 2 used these standard risk factors plus CACS [expressed as ln(CACS + 1)]. The risk estimates were categorized as 0-<3%, 3-<10%, and ≥10%, corresponding to low, intermediate, and high risk, respectively. Tests for nonproportional hazards using Schoenfeld residuals were not significant. The interaction of CACS with gender was also tested and was not significant (P=0.97).
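The categorization step described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, the baseline survival and linear predictor are made-up inputs (the fitted values are not reported here), and the risk formula is the standard Cox form, 1 - S0(t)^exp(linear predictor), which is assumed rather than quoted from the paper.

```python
import math


def cacs_term(cacs):
    """Return ln(CACS + 1), the form in which the text says CACS enters Model 2."""
    if cacs < 0:
        raise ValueError("CACS cannot be negative")
    return math.log(cacs + 1.0)


def five_year_risk(baseline_survival_5y, linear_predictor):
    """Standard Cox-model risk: 1 - S0(5)^exp(linear predictor).

    Both arguments are hypothetical inputs for illustration only.
    """
    return 1.0 - baseline_survival_5y ** math.exp(linear_predictor)


def risk_category(risk):
    """Map a 5-year risk (proportion, 0 to 1) to the three categories in the text."""
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be a proportion between 0 and 1")
    if risk < 0.03:
        return "low"            # 0-<3%
    if risk < 0.10:
        return "intermediate"   # 3-<10%
    return "high"               # >=10%


# Made-up participant: baseline 5-year survival 0.97, linear predictor 0.5
example_risk = five_year_risk(0.97, 0.5)
print(round(example_risk, 4), risk_category(example_risk))
```

Note that ln(CACS + 1) equals 0 for a score of 0, so participants without detectable calcium contribute nothing through the CACS term.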
We assessed discrimination, which reflects a marker’s ability to differentiate between people who do and do not have events. We constructed receiver-operating characteristic (ROC) curves and compared the areas under the curves (AUC) with and without CACS in the model. We estimated predicted values from a survival model and then treated the endpoint as binary and uncensored for purposes of estimating and testing the areas under the ROC curve.13 As a sensitivity analysis we also calculated Harrell’s C-statistic, which allows censored data.14 These estimates were identical through two decimal places to the binary version for both models.
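When the endpoint is treated as binary, the AUC equals the probability that a randomly chosen participant with an event has a higher predicted risk than a randomly chosen participant without one, with ties counted as one half. The sketch below computes that quantity directly on made-up predicted risks; the function name and the data are hypothetical, and it reproduces neither the formal comparison of two AUCs nor Harrell's C-statistic for censored data.

```python
def auc(risks_events, risks_nonevents):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    if not risks_events or not risks_nonevents:
        raise ValueError("need at least one event and one nonevent")
    concordant = 0.0
    for r_event in risks_events:
        for r_nonevent in risks_nonevents:
            if r_event > r_nonevent:
                concordant += 1.0   # event ranked above nonevent
            elif r_event == r_nonevent:
                concordant += 0.5   # ties count as half
    return concordant / (len(risks_events) * len(risks_nonevents))


# Made-up predicted risks, for illustration only
toy_events = [0.8, 0.6]
toy_nonevents = [0.1, 0.7]
print(auc(toy_events, toy_nonevents))
```

The pairwise loop is quadratic in the number of participants; it is written for readability rather than speed.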
The integrated discrimination index (IDI) measures the improvement in the average sensitivity with the new marker, and subtracts any increase in the average "1 minus the specificity." The integrals of sensitivity and "1 minus the specificity" over all possible cut-off values from the (0, 1) interval are used.6 The IDI can be expressed as [EY1 - EY0] - [EX1 - EX0], where EY1 and EY0 are the average predicted probabilities among participants with and without events, respectively, for the model including the new marker; EX1 and EX0 are the corresponding averages for the model without the new marker. Each bracketed difference is the discrimination slope of one model. When the incidence of events is relatively small, it is recommended to calculate the relative IDI as well.6 The relative IDI is defined as [EY1 - EY0]/[EX1 - EX0] - 1.
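The two IDI expressions above translate directly into code. The following is a minimal sketch on made-up predicted probabilities; the function names are hypothetical, and no significance test is included.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def idi(new_events, new_nonevents, old_events, old_nonevents):
    """Return (IDI, relative IDI) from predicted probabilities.

    new_* come from the model with the marker (EY1, EY0);
    old_* come from the model without it (EX1, EX0).
    """
    slope_new = mean(new_events) - mean(new_nonevents)   # EY1 - EY0
    slope_old = mean(old_events) - mean(old_nonevents)   # EX1 - EX0
    absolute = slope_new - slope_old
    relative = slope_new / slope_old - 1.0
    return absolute, relative


# Made-up predicted probabilities, for illustration only
absolute_idi, relative_idi = idi(
    new_events=[0.4, 0.6], new_nonevents=[0.1, 0.1],
    old_events=[0.3, 0.3], old_nonevents=[0.1, 0.1],
)
print(round(absolute_idi, 3), round(relative_idi, 3))
```

In this toy case the discrimination slope doubles from 0.2 to 0.4, so the relative IDI is 1.0 (a 100% improvement) even though the absolute IDI is only 0.2, which illustrates why the relative form is informative when slopes are small.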
Cross tabulations of risk categories based on the models with and without CACS were performed to describe the number and percentage of participants who were reclassified appropriately (i.e., to a lower risk group for non-events, or a higher risk group for events) and inappropriately (to a lower risk group for events, or a higher risk group for non-events). We calculated the NRI, per Pencina et al.6 The NRI is estimated as [P(up | event) - P(down | event)] + [P(down | nonevent) - P(up | nonevent)], where "up" and "down" denote reclassification to a higher or lower risk category, respectively, when CACS is added to the model.
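As defined by Pencina et al, the NRI adds the net proportion of participants with events who move to a higher category to the net proportion of participants without events who move to a lower category. The sketch below computes it from reclassification counts; the function name and the counts are hypothetical and are not taken from Table 2, and the confidence interval and P value are not reproduced.

```python
def nri(events_up, events_down, n_events,
        nonevents_up, nonevents_down, n_nonevents):
    """Return (NRI for events, NRI for nonevents, overall NRI).

    'up' and 'down' are counts of participants moved to a higher or
    lower risk category by the model that includes the new marker.
    """
    nri_events = (events_up - events_down) / n_events
    nri_nonevents = (nonevents_down - nonevents_up) / n_nonevents
    return nri_events, nri_nonevents, nri_events + nri_nonevents


# Made-up counts, for illustration only
toy_events, toy_nonevents, toy_total = nri(
    events_up=30, events_down=5, n_events=100,
    nonevents_up=50, nonevents_down=100, n_nonevents=1000,
)
print(toy_events, toy_nonevents, round(toy_total, 2))
```

Because the two components use different denominators, the overall NRI is a sum of two proportions rather than a proportion of the whole cohort.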
Kaplan-Meier five-year event rates were calculated. Statistical significance was established a priori as a P value <0.05.
We sought to determine how the use of lipid-lowering therapy and the presence of diabetes might change the NRI. The NRI was recalculated after excluding individuals who were receiving lipid-lowering therapy at the baseline examination (16% of the cohort). We also recalculated the NRI after including individuals with diabetes. Presence or absence of diabetes was incorporated into the model as an additional variable.
We assessed calibration, which measures how closely the predicted probabilities of risk using the new marker reflect observed risk. We calculated the survival-adapted Hosmer-Lemeshow χ2 statistic for both models.15 A P value <0.05 represents a significant difference between the expected and observed event rates and suggests that the model is not well-calibrated.
Finally, we examined the risk stratification capacity, as described by Janes et al.16 The risk stratification capacity measures the ability of a model to reclassify participants from the intermediate risk categories to the highest and lowest risk categories, where treatment strategies are better delineated.
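One simple summary of this idea is the share of participants a model places in the lowest or highest category. The sketch below compares that share between two models on made-up category labels; the function name and the labels are hypothetical, and this is only one reading of the measure described by Janes et al, not their full method.

```python
from collections import Counter


def extreme_share(categories):
    """Fraction of participants classified as 'low' or 'high' risk."""
    if not categories:
        raise ValueError("no participants supplied")
    counts = Counter(categories)
    return (counts["low"] + counts["high"]) / len(categories)


# Made-up category assignments for the same ten participants
model_without_marker = ["low"] * 4 + ["intermediate"] * 4 + ["high"] * 2
model_with_marker = ["low"] * 5 + ["intermediate"] * 2 + ["high"] * 3

print(extreme_share(model_without_marker), extreme_share(model_with_marker))
```

A larger share under the second model means fewer participants are left in the intermediate category, where treatment decisions are least clear.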
All analyses were conducted with STATA software, version 11.0.
The study population included 5931 non-diabetic individuals at baseline. Follow-up or risk-factor information was not available for 53 participants, leaving a final cohort of 5878 subjects. There were 209 CHD events over a median follow-up of 5.8 years (interquartile range 5.6-5.9 years). One hundred twenty-two people had a major event (96 had an MI, 14 died from CHD, and 12 had a resuscitated cardiac arrest), and 87 had angina (81 with definite angina, of whom 67 were revascularized, and 6 with probable angina followed by revascularization).
Table 1 shows the baseline cardiovascular risk factors, stratified by estimated 5-year risk categories. As expected, the cardiovascular risk profile was less favorable in those with a higher predicted risk, and included a higher proportion of men and older individuals.
Measures of discrimination showed a significant improvement with the inclusion of CACS to the prediction model. The area under the ROC curve for the prediction of CHD events was 0.76 (95% CI 0.72-0.79) using model 1, and increased to 0.81 (95% CI 0.78 – 0.84) (P<0.001) with the addition of CACS, consistent with a previous MESA report based on fewer events.4 The IDI was 0.026 (P<0.001), with the relative IDI showing an 81% improvement in the discrimination slope.
Cross-tabulations of the 5-year estimated risk using the models with and without CACS are displayed in Table 2. Kaplan-Meier event rates for the model using traditional risk factors and the model using risk factors plus CACS are shown in the fifth column and fifth row, respectively. The survival-adapted Hosmer-Lemeshow χ2 statistic was 6.72 (P=0.46) for the model with traditional risk factors and 9.15 (P=0.24) with the addition of CACS, suggesting that neither model had a significant lack of fit.
The addition of CACS to the predictive model resulted in reclassification of 26% of the sample. The NRI for events was 0.23, and the NRI for nonevents was 0.02, achieving an NRI for the entire study cohort of 0.25 (95% CI 0.16-0.34, P<0.001) (Table 2). The NRI was essentially unchanged after including participants with diabetes (0.27, 95% CI 0.19-0.34) or excluding participants who were receiving lipid-lowering therapy at the baseline examination (0.26, 95% CI 0.16-0.37).
Overall, 728 individuals in the entire cohort were reclassified to a higher risk category, with an event rate of 8.7% (95% CI 6.9-11.1%), and 814 were reclassified to a lower risk category, with an event rate of 2.7% (95% CI 1.8-4.1%). The 5-year event rate for the entire cohort was 3.1% (95% CI 2.7-3.6%).
We separately evaluated the most clinically meaningful reclassifications, which would presumably have the largest impact on treatment decisions. When CACS was added to the model, 298 participants (5.1%) were reclassified to high risk. Among those upgraded to high risk, 49 individuals (16.4%) experienced events. Conversely, 744 (12.7%) were reclassified to low risk, of whom 17 (2.3%) experienced events. Two high risk individuals reclassified to low risk experienced events (6.3%).
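The percentages in the two preceding paragraphs follow from the reported counts and the final cohort size of 5878. The short check below recomputes them; the variable and function names are hypothetical, and the event proportions here are crude ratios of counts, not the Kaplan-Meier rates reported elsewhere in the Results.

```python
cohort = 5878                          # final non-diabetic cohort
moved_up, moved_down = 728, 814        # reclassified to any higher / lower category
to_high, events_to_high = 298, 49      # reclassified to high risk, and events among them
to_low, events_to_low = 744, 17        # reclassified to low risk, and events among them


def pct(part, whole, digits=1):
    """Percentage of `part` in `whole`, rounded to `digits` decimals."""
    return round(100.0 * part / whole, digits)


summary = {
    "reclassified overall (%)": pct(moved_up + moved_down, cohort, 0),
    "reclassified to high risk (%)": pct(to_high, cohort),
    "events among those moved to high (%)": pct(events_to_high, to_high),
    "reclassified to low risk (%)": pct(to_low, cohort),
    "events among those moved to low (%)": pct(events_to_low, to_low),
}
for label, value in summary.items():
    print(label, value)
```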
Among the intermediate risk individuals, 292 (16%) were reclassified to high risk, while 712 (39%) were reclassified to low risk (NRI 0.55, 95% CI 0.41-0.69, P<0.001). The improvement in risk classification is more balanced between events and nonevents for intermediate risk individuals than for the overall cohort (0.29 for events and 0.26 for nonevents). Further, of the 115 events that occurred among intermediate risk participants, 48 (41%) were among individuals reclassified to high risk, whereas 15 (13%) were among individuals reclassified to low risk.
The hazard ratios associated with the risk for a CHD event before and after adjustment for CACS are shown in Table 3. Inclusion of CACS into the model substantially attenuated the risk associated with all of the risk factors, although the hazard ratio associated with HDL was least influenced by the inclusion of CACS to the model.
The risk stratification capacity of a CACS-adjusted model is shown in Figure 1. The top panel shows that including CACS in the model places 77% of the overall population into either the highest or lowest risk categories, compared to 69% with traditional risk factors alone. With the addition of CACS to the model an additional 23% of those who experienced events were reclassified to high risk (middle panel), and an additional 13% of those who did not experience events were reclassified to low risk (bottom panel).
The results of this study demonstrate that adding CACS to traditional risk factors significantly improves the classification of risk for the prediction of CHD events in an asymptomatic population-based sample of men and women drawn from four U.S. ethnic groups. Incorporation of an individual's CACS leads to a more refined estimation of future risk for CHD events than traditional risk factors alone. The intermediate risk group achieved a substantially higher NRI than the overall cohort, and therefore appears to benefit the most from a CACS-adjusted strategy. This study provides strong evidence that there may be a significant amount of clinically useful reclassification when CACS is added to risk assessment in asymptomatic intermediate risk patients.
Considerable debate remains about how best to use CACS for risk assessment. Current American College of Cardiology/American Heart Association statements recommend that asymptomatic individuals at intermediate Framingham risk may be reasonable candidates for CHD testing using CACS.17 However, particular concern has been raised about the safety and cost associated with the widespread use of CACS. One recent study suggested an elevated cancer risk if a calcium score is obtained every five years.18 Others have questioned whether a CACS-guided strategy may actually cost more money and prevent fewer events than simply treating all patients at intermediate risk.19 In the setting of such uncertainty it is important to understand how to maximize the potential benefits of using CACS, while minimizing harm.
Direct comparisons to studies evaluating the NRI with other biomarkers should be made with caution, because the number of risk categories used, the definition of the primary outcome, and the length of follow-up often differ between studies. However, it is of interest that the NRI was negligible with the addition of lipoprotein particles, 0.034 with glycosylated hemoglobin, 0.047 with midregional proadrenomedullin (MR-proADM) plus N-terminal pro-B-type natriuretic peptide, and 0.068 with high-sensitivity C-reactive protein plus family history.20-23 In another study from MESA the use of brachial artery flow-mediated dilation resulted in an NRI of 0.29.24 However, this included a substantial proportion of inappropriate reclassifications downward among individuals who experienced events (23%).
An important effect of a risk-prediction marker is the number of people identified as being at higher risk who consequently become eligible to receive more intensive therapy as a result of screening. A relatively small proportion of the total MESA population, 5.1%, was reclassified to high risk. Importantly, almost 60% of the events (123/209) occurred among individuals who were not classified as high risk, either by traditional risk factors or by CACS. The small number of participants who were reclassified to high risk is likely, in part, a reflection of the study population. More than half of the MESA cohort is in the lowest 5-year risk category based on traditional risk factors. Participants who were low risk required very elevated CACS to be reclassified to high risk. In contrast, the proportions of individuals reclassified were larger among the intermediate risk participants (16% to high risk and 39% to low risk). More than 40% of the events among participants who were intermediate risk based on traditional risk factors alone occurred in individuals who were reclassified to high risk based on their CACS (48/115).
Inspection of the relative contribution of correct reclassification for events and nonevents also reveals important strengths and weaknesses of a CACS-adjusted strategy. For the entire cohort the NRI for events was 0.23, whereas the NRI for nonevents was 0.02. The results suggest that when applied to a general population a CACS-adjusted strategy may effectively identify more individuals who experience events, but this comes at the expense of identifying many other individuals as higher risk who do not experience events. With the availability of generic statins and years of data confirming their tolerability, the disadvantages of “overtreatment” may have become less significant over time. However, the improvement in risk classification is more balanced among the intermediate risk individuals (0.29 for events and 0.26 for nonevents), again suggesting that a CACS-adjusted strategy may be most clinically useful in this group.
Another metric of a risk marker’s utility is whether it separates individuals into more clinically relevant risk categories, as seen by the risk stratification capacity. Ideally, a model would reclassify most of the individuals out of the intermediate risk group and into the highest or lowest risk categories. When CACS is added to the model more than half of the intermediate risk individuals are reclassified to high and low risk, where treatment strategies are better established.
The values in the margins of the reclassification table best represent the net effect of including a novel marker into a risk prediction model.16 However, looking at individual cells can shed light on the potential limits of applying a marker to the clinical setting. Only four out of more than 3,000 low risk individuals were reclassified as high risk, suggesting that CACS may not be an efficient screening tool among low risk individuals. An additional concern is whether physicians can safely withhold or decrease therapy for patients who are reclassified to lower risk categories. We report that individuals who were reclassified from high to low risk experienced an event rate that was higher than predicted by the model with CACS. While the absolute number of events was small, our data support the recommendation that patients who are high risk should be treated regardless of their CACS, and as a result should not undergo CAC testing for additional risk assessment.
A critical question not answered in this study is whether screening for subclinical disease with CACS improves patient outcomes. In a recent American Heart Association scientific statement, the steps needed before widespread adoption of a risk marker were outlined.5 Initial phases of evaluation should demonstrate that a marker can differentiate between people with and without events, prospectively predict future events, and add predictive information to traditional risk factors – all of which have been accomplished with CACS. The results in the current report address the fourth phase, in which a marker must be shown to adjust predicted risk sufficiently to change recommended therapy. Whether the use of a marker improves clinical outcomes enough to justify the associated cost should be tested in the final phase, preferably with a randomized clinical trial.
Our study has limitations which should be acknowledged. Our results will need to be validated in additional populations. Had our study population contained a larger proportion of higher risk individuals, we may have seen higher event rates and different rates of reclassification. It is also possible that with longer follow-up and additional events our results could change.
In MESA the CACS was revealed to participants and their physicians. This could have affected our results in two ways. Knowledge of a high CACS may have biased the diagnosis of angina, and thus could have increased the NRI. Alternatively, participants with a high CACS may have had more intensive risk factor modification thereby reducing the number of events, and decreasing the NRI. We do not expect that the diagnosis of major coronary events would have been influenced by CACS.
In conclusion, we found that use of CACS plus traditional risk factors substantially enhances the ability to classify a multi-ethnic cohort of asymptomatic people without known CVD into clinically accepted categories of risk of future CHD events. The results provide encouragement for moving to the next stage of evaluation to assess the use of CACS on clinical outcomes.
This research was supported by contracts N01-HC-95159 through N01-HC-95169 from the National Heart, Lung, and Blood Institute (NHLBI). The authors thank the other investigators, the staff, and the participants of the MESA study for their valuable contributions. A full list of participating MESA investigators and institutions can be found at http://www.mesa-nhlbi.org
Role of the Sponsors: The NHLBI participated in the design and conduct of the MESA study. A member of the NHLBI staff served as a co-author, and had input into the collection, management, analysis and interpretation of the data, and in preparation of the manuscript, as did the other co-authors. While additional members of the NHLBI staff were able to view the manuscript prior to submission, they did not participate in the decision to submit the manuscript or approve it prior to publication.
Disclosures: The authors have no conflicts of interest to disclose.
Tamar S. Polonsky, Department of Preventive Medicine, Northwestern University, Chicago, IL;
Robyn L. McClelland, Department of Biostatistics, University of Washington, Seattle, WA;
Neal W. Jorgensen, Department of Biostatistics, University of Washington, Seattle, WA;
Diane E. Bild, Division of Cardiovascular Sciences, National Heart, Lung, and Blood Institute, Bethesda, MD;
Gregory L. Burke, Division of Public Health Sciences, Wake Forest University School of Medicine, Winston-Salem, NC;
Alan D. Guerci, St. Francis Hospital, The Heart Center, Roslyn, NY;
Philip Greenland, Department of Preventive Medicine, Northwestern University, Chicago, IL.