Background Non-uniform reporting of relevant relationships and metrics hampers critical appraisal of the clinical utility of C-reactive protein (CRP) measurement for prediction of later coronary events.
Methods We evaluated the predictive performance of CRP in the Northwick Park Heart Study (NPHS-II) and the Edinburgh Artery Study (EAS) comparing discrimination by area under the ROC curve (AUC), calibration and reclassification. We set the findings in the context of a systematic review of published studies comparing different available and imputed measures of prediction. Risk estimates per-quantile of CRP were pooled using a random effects model to infer the shape of the CRP-coronary event relationship.
Results NPHS-II and EAS (3441 individuals, 309 coronary events): CRP alone provided modest discrimination for coronary heart disease (AUC 0.61 and 0.62 in NPHS-II and EAS, respectively) and only modest improvement in the discrimination of a Framingham-based risk score (FRS) (increment in AUC 0.04 and –0.01, respectively). Risk models based on FRS alone and FRS + CRP were both well calibrated and the net reclassification improvement (NRI) was 8.5% in NPHS-II and 8.8% in EAS with four risk categories, falling to 4.9% and 3.0% for 10-year coronary disease risk threshold of 15%. Systematic review (31 prospective studies 84 063 individuals, 11 252 coronary events): pooled inferred values for the AUC for CRP alone were 0.59 (0.57, 0.61), 0.59 (0.57, 0.61) and 0.57 (0.54, 0.61) for studies of <5, 5–10 and >10 years follow up, respectively. Evidence from 13 studies (7201 cases) indicated that CRP did not consistently improve performance of the Framingham risk score when assessed by discrimination, with AUC increments in the range 0–0.15. Evidence from six studies (2430 cases) showed that CRP provided statistically significant but quantitatively small improvement in calibration of models based on established risk factors in some but not all studies. The wide overlap of CRP values among people who later suffered events and those who did not appeared to be explained by the consistently log-normal distribution of CRP and a graded continuous increment in coronary risk across the whole range of values without a threshold, such that a large proportion of events occurred among the many individuals with near average levels of CRP.
Conclusions CRP does not perform better than the Framingham risk equation for discrimination. The improvement in risk stratification or reclassification from addition of CRP to models based on established risk factors is small and inconsistent. Guidance on the clinical use of CRP measurement in the prediction of coronary events may require updating in light of this large comparative analysis.
Primary prevention of cardiovascular events currently involves targeting interventions to those at high absolute risk, identified using risk-prediction instruments such as the Framingham equation, that integrate information on established risk factors.1 However, a large proportion of events occur among individuals with near average levels of continuous risk factors,2 or at intermediate Framingham risk. With emerging evidence on the role of inflammation in atherosclerosis, there is interest in the potential predictive utility of C-reactive protein (CRP), a sensitive circulating biomarker of inflammation.3 Nearly 40 reports from prospective studies and three meta-analyses4–6 indicate a highly consistent, moderate association of CRP with later coronary heart disease (CHD) events among clinically healthy subjects. In 2004, consensus statements from population, laboratory science and clinical practice expert committees convened by the American Heart Association/Centres for Disease Control (AHA/CDC) indicated that ‘CRP may be used at the discretion of the physician as part of a global coronary risk assessment in adults without known cardiovascular disease’, and that a CRP value above a cut-point of 3 mg/l was indicative of subjects at high risk.7–10 However, all committees highlighted gaps in the available data and made recommendations for additional research. Although several CRP tests have received FDA approval, recent publications have highlighted ongoing debate on the real utility of CRP measurement in comparison with or as a supplement to conventional risk assessment of CHD,11–15 leaving clinicians, policymakers and patients uncertain as to the value of this test.
Reporting of measures of predictive performance in the many prospective studies of CRP has been inconsistent. Although measures of the strength of the association, e.g. using hazard or odds ratios, have achieved greater prominence (and are important in aetiological analysis), these measures provide limited information on predictive utility.16,17 To complicate matters, there is debate on the most appropriate metric for the evaluation of a new marker. Established measures include discrimination [the ability of a marker to distinguish individuals who will develop an event, assessed by sensitivity and specificity and the area under the ROC curve (AUC)], and calibration (use of a risk model to order or stratify risk, whose accuracy is assessed by tests that evaluate model fit). More recently, concerns that neither of these approaches provides a fair evaluation of a new marker have motivated interest in the development of new measures of predictive utility.18 A method proposed recently involves quantifying the extent to which a new marker shifts individuals between categories of CHD risk initially determined using established risk models (reclassification).19
A 2006 NHLBI workshop on CRP included the following recommendations: (i) ‘analyses of individual and pooled data from existing and new epidemiologic studies (or ancillary studies) to examine whether CRP measurement improves CVD risk prediction beyond traditional risk factors and can help target who should be treated’; (ii) research to assess ‘whether replacing a traditional risk factor(s) with CRP (or other newly discovered risk predictors) could improve CVD risk prediction’; and (iii) research on ‘the shape of the dose-response curve’ (http://www.nhlbi.nih.gov/meetings/workshops/crp/report.htm). We therefore assessed the performance of CRP as a risk marker in the Northwick Park Heart Study II (NPHS-II) and replicated the findings in the Edinburgh Artery Study (EAS), two population-based prospective studies. For these analyses we tested discrimination, calibration and reclassification to assess consistency between methods. Next, we conducted a systematic review of prior prospective studies that had examined the association of CRP and CHD to assess consistency between studies. This review differs importantly from prior overviews in this area in that our focus was on the evaluation of the predictive performance of CRP, rather than simply its association, and involved inferring measures relevant to prediction where these were unreported. Finally, we examined the population distribution of CRP in all the studies, and the shape of the relationship with CHD events, to better understand the factors that could constrain performance of this marker.
NPHS-II is a prospective study of 3012 healthy middle-aged men of European descent (age at recruitment 50–64 years), with enrolment commencing in 1989.20 Nine general practices participated in the study. The study was approved by the local institutional review committee and all subjects provided written informed consent. Further details are provided in the Supplementary Data.22–24
The EAS is a prospective study of 1592 individuals of European descent (809 men and 783 women) aged 55–74 years, enrolled in 1988. Individuals were selected at random from 11 general practices serving a range of socio-economic and geographical areas throughout the city of Edinburgh, Scotland. Details of the study recruitment and examination process have been described elsewhere.21 Further details are provided in the Supplementary Data.25
Log-transformations were conducted for non-normally distributed data and for these variables means are geometric and standard deviations approximate. The association of CRP and risk of CHD events is reported as a hazard ratio with 95% CI obtained from a Cox proportional hazards model. Age-adjusted incidence rates (cases/1000 person years) and the absolute number of events are reported by tertile of CRP and in categories defined by CRP cut-points <1, 1–3 and >3 mg/l. Means and standard deviations (SD) of log-transformed CRP in those who suffered CHD events and those who remained event-free were used to construct normal curves to determine the overlap between the distributions in the two groups.
The performance of CRP for discrimination was assessed by means of the disease detection rate (DR) or sensitivity, i.e. the proportion of those who developed events who tested positive using CRP cut-points corresponding to pre-set false-positive rates of 5% (DR5) or 10% (DR10). In addition, we calculated the positive predictive value of a CRP cut-point of 3 mg/l, i.e. the proportion of people with a CRP value >3 mg/l who suffered an event. The incremental effect on discrimination of adding CRP to the Framingham risk score was summarized by means of the AUC and Harrell's C-statistic. Harrell's C-statistic from a Cox model is conceptually analogous to the AUC estimated from logistic models, but unlike the AUC it allows for right-censored data and variable time to follow up.16,26–28
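The detection-rate and positive-predictive-value definitions above can be illustrated with a minimal sketch. The CRP values below are invented toy data, not study data, and the function names are hypothetical. The cut-point is taken as the empirical control quantile that leaves at most the pre-set fraction of controls above it, which is one reasonable reading of the definition rather than the authors' documented procedure.

```python
def cutpoint_for_fpr(control_values, fpr):
    """Return the control value that at most a fraction `fpr` of controls exceed."""
    ordered = sorted(control_values)
    n = len(ordered)
    # Number of controls allowed to test positive (small epsilon guards float error).
    allowed = int(fpr * n + 1e-9)
    return ordered[n - allowed - 1]


def detection_rate(case_values, control_values, fpr):
    """Sensitivity at the cut-point matching the pre-set false-positive rate."""
    cut = cutpoint_for_fpr(control_values, fpr)
    detected = sum(v > cut for v in case_values)
    return detected / len(case_values), cut


def positive_predictive_value(case_values, control_values, cut):
    """Proportion of people above `cut` who went on to have an event."""
    true_pos = sum(v > cut for v in case_values)
    false_pos = sum(v > cut for v in control_values)
    total = true_pos + false_pos
    return true_pos / total if total else float("nan")


# Hypothetical CRP values (mg/l) for illustration only.
controls = [0.4, 0.6, 0.8, 1.0, 1.2, 1.5, 2.0, 2.5, 3.5, 6.0]
cases = [0.9, 1.8, 2.8, 4.0, 7.5]

dr10, cut10 = detection_rate(cases, controls, 0.10)
ppv3 = positive_predictive_value(cases, controls, 3.0)
print(f"DR10 = {dr10:.2f} at cut-point {cut10} mg/l; PPV(>3 mg/l) = {ppv3:.2f}")
```

Note that the positive predictive value depends on the proportion of events in the sample, so the toy figure says nothing about the value expected in a cohort.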
The performance of risk estimates generated from Cox models using Framingham variables alone,29 or with the addition of CRP, was evaluated by ranking participants according to one-fifths of predicted risk, and comparing, within each category, the predicted and observed 10-year risks of CHD in NPHS-II and 17-year risks of CHD in EAS. Akaike's information criterion (AIC) and the Bayes information criterion (BIC) were used to assess global fit using Framingham variables alone or with CRP. The Hosmer–Lemeshow chi-square statistic30 was used to assess model fit. Small chi-square values indicate good calibration, while values exceeding 20 (P < 0.01) indicate significant lack of calibration.
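As a rough illustration of the calibration check described above, the following sketch computes a Hosmer–Lemeshow-type chi-square statistic for a binary outcome, grouping participants into equal-sized bands of predicted risk. It is a simplified binary-outcome version with a hypothetical function name. The paper's models are Cox models with censored follow-up, for which a survival-adapted variant would be needed, so this should not be read as the authors' exact computation.

```python
def hosmer_lemeshow(predicted_risk, event, groups=5):
    """Chi-square comparing observed and expected events across risk bands.

    predicted_risk: model-predicted probabilities (strictly between 0 and 1)
    event: 1 if the person had an event, else 0
    groups: number of equal-sized bands (5 gives one-fifths of predicted risk)
    """
    pairs = sorted(zip(predicted_risk, event))
    n = len(pairs)
    chi2 = 0.0
    rows = []
    for g in range(groups):
        band = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(band)
        expected = sum(p for p, _ in band)
        observed = sum(e for _, e in band)
        mean_risk = expected / size
        # Squared deviation scaled by the binomial variance of the band.
        chi2 += (observed - expected) ** 2 / (size * mean_risk * (1 - mean_risk))
        rows.append((size, observed, expected))
    return chi2, rows


# Hypothetical example: two bands, the second perfectly calibrated,
# the first with 6 observed events against 2 expected.
risk = [0.2] * 10 + [0.5] * 10
outcome = [1] * 6 + [0] * 4 + [1] * 5 + [0] * 5
chi2, table = hosmer_lemeshow(risk, outcome, groups=2)
print(round(chi2, 2), table)
```

The ratio of expected to observed events in each row of `table` corresponds to the predicted-to-observed comparison within each band of risk.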
The extent to which CRP reassigned individuals to risk categories that better reflected their final outcome was assessed using the NRI measure.19 For individuals who develop an event, risk classification is considered improved if the individual moves to a higher risk category with the addition of a marker, and worsened if the individual moves to a lower one. For individuals who remain healthy the converse is true. For people who develop events, the difference in the proportion of individuals moving up and the proportion moving down a category is calculated. For people remaining healthy, the proportion of individuals moving down minus the proportion moving up a category is calculated. The NRI is obtained as the sum of these two values. First, individuals were assigned to one of the following 10-year CHD risk categories based on a model using established risk factors: 0% to <5%, 5% to <10%, 10% to <15% and ≥15%, similar to previous reports.31 Next, the NRI was calculated following the addition of CRP to the risk model. In a separate analysis, individuals were assigned to one of two risk categories based on a 10-year CHD risk of 0% to <15% or ≥15%. This allowed us to assess the potential effect of reclassification on the clinical decision to prescribe cholesterol lowering therapy for primary prevention based on thresholds for intervention advocated by UK guidelines (e.g. NICE guidelines on primary prevention with statins, http://www.nice.org.uk/guidance/). In addition, to facilitate comparison with two prior reports, reclassification tables were drawn to allow comparison of the observed (actual) risk in each category of predicted risk comparing models that differed only in their inclusion or omission of CRP.
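The NRI definition above can be written as a short sketch. The function names and toy data are hypothetical. The category boundaries follow the four 10-year risk bands named in the text (0% to <5%, 5% to <10%, 10% to <15% and ≥15%). Any movement up or down is counted once, regardless of how many categories are crossed, which is an assumption about how multi-category moves were handled.

```python
def risk_category(risk, cuts=(0.05, 0.10, 0.15)):
    """Map a predicted 10-year risk to a category index 0..len(cuts)."""
    return sum(risk >= c for c in cuts)


def net_reclassification_improvement(old_cat, new_cat, event):
    """NRI = (up - down)/cases among cases + (down - up)/non-cases among non-cases."""
    up_case = down_case = n_case = 0
    up_non = down_non = n_non = 0
    for old, new, had_event in zip(old_cat, new_cat, event):
        if had_event:
            n_case += 1
            up_case += new > old
            down_case += new < old
        else:
            n_non += 1
            up_non += new > old
            down_non += new < old
    return (up_case - down_case) / n_case + (down_non - up_non) / n_non


# Hypothetical categories before and after adding a marker.
old = [0, 1, 2, 1, 1, 2, 3, 0, 2]
new = [1, 2, 1, 1, 2, 1, 2, 0, 2]
had_event = [1, 1, 1, 1, 0, 0, 0, 0, 0]
print(round(net_reclassification_improvement(old, new, had_event), 2))
```

Using `cuts=(0.15,)` in `risk_category` gives the two-category analysis based on the single 15% threshold.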
Two electronic databases (Medline and EMBASE) were searched up to and including August 2007 for all prospective studies (including cohort, nested case–control or case–cohort studies) of initially healthy subjects evaluating the association between CRP concentration and coronary events with no threshold sample size. For the search, the MeSH terms, ‘C-reactive protein’ and ‘CRP’ in combination with ‘coronary’, ‘coronary heart disease’, ‘CHD’ and ‘CVD’ were used and the search was limited by the terms ‘human’ and ‘English Language’. Studies in which total mortality was the only outcome reported were excluded. See Supplementary Data and Supplementary Figure 1 for further details.
Measures relevant to the performance of a screening test tended not to be reported in many of the previously reported studies and we estimated some of them as follows. We first extracted values of the geometric mean CRP (and approximate SD) separately for incident cases and those remaining free of events (controls). Where the median was reported, this was used as an estimate of the geometric mean and approximate SD was calculated from the inter-quartile range, assuming a log-normal distribution. Where no measure of dispersion was reported, an estimated SD based on pooled values from studies that provided this measure was applied. These data were then used to construct the log-normal distributions of CRP from each study for the cases and controls separately, which permitted estimation of study specific values for the DR (sensitivity) for a 5% or 10% false-positive rate (DR5 and DR10), as well as the AUC for CRP using methods reported previously.17,27 Where appropriate, study-specific values for AUC and DR were pooled using a random-effects model with weighting by the inverse of the variance.
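The final pooling step can be sketched as follows. The text specifies only a random-effects model with inverse-variance weighting; the DerSimonian–Laird estimator of between-study variance used here is an assumption, chosen because it is a common default, and the function name and inputs are hypothetical.

```python
def pool_random_effects(estimates, variances):
    """Inverse-variance random-effects pooling (DerSimonian-Laird, assumed).

    estimates: study-specific values (e.g. AUC or DR)
    variances: their within-study variances
    Returns (pooled estimate, standard error, between-study variance tau^2).
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    k = len(estimates)
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum(re_weights)
    se = (1.0 / sum(re_weights)) ** 0.5
    return pooled, se, tau2


# Hypothetical study-specific AUCs and variances, for illustration only.
pooled, se, tau2 = pool_random_effects([0.55, 0.65, 0.60], [0.0001, 0.0001, 0.0001])
print(f"pooled = {pooled:.3f}, 95% CI {pooled - 1.96 * se:.3f} to {pooled + 1.96 * se:.3f}")
```

When the studies agree closely, tau^2 is estimated as zero and the result reduces to the fixed-effect inverse-variance average.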
There were 162 incident cases of CHD in NPHS-II and 147 in EAS among the individuals with valid CRP measurements. Higher CRP concentrations were associated with a higher risk of CHD events in both NPHS-II and EAS. For the comparison of the top vs bottom tertile of the CRP distribution, in models adjusted for age and primary-care recruitment centre, the hazard ratio was 2.22 (95% CI 1.50, 3.30) in NPHS-II and 1.87 (95% CI 1.21, 2.89) in EAS. These values were similar in magnitude to those reported in a previous meta-analysis.6
With a CRP cut-point corresponding to a pre-set false-positive rate of 5% (DR5), the sensitivity for detection of CHD events was 11.2% in NPHS-II and 10.9% in EAS, i.e. around 10% of individuals who suffered CHD events were identified. With a CRP cut-point corresponding to a 10% false-positive rate, the DR (DR10) was 17.8% in NPHS-II and 19.7% for EAS. Using the cut-point of 3 mg/l, the positive predictive value was 8.7% in NPHS-II and 20.0% in EAS. The AUC for CRP in NPHS-II was 0.61 (95% CI 0.57, 0.66), and in EAS was 0.62 (95% CI 0.57, 0.67). These estimates of the AUC were modified only minimally when account was made of the effect of time on the accumulation of CHD events using Harrell's C-statistic (Table 1). Values of the C-statistic for the Framingham-based model alone and the Framingham plus CRP model were 0.62 (0.60, 0.65) and 0.66 (0.63, 0.68) in NPHS-II, and 0.68 (0.64, 0.71) and 0.67 (0.63, 0.71) in EAS, respectively (Table 1).
Two risk models were compared in NPHS-II and EAS: a Cox model derived using Framingham variables, and the same model with the addition of CRP. Individuals were first divided into one-fifths of predicted risk and observed event rates were compared with those expected. The global fit of the model was improved by the addition of CRP when assessed by the AIC and the BIC, which in NPHS-II and EAS were 2368.6 and 2397.6 for Framingham variables, respectively, and 2355.8 and 2390.5 for Framingham variables plus CRP, respectively (P < 0.001). However, because statistically significant improvements in model fit do not necessarily correspond to large absolute differences, we calculated the ratio of predicted to observed event-rates in each one-fifth of risk in the two models generated from both studies (Figure 1). This indicated that the magnitude of improvement in calibration conferred by adding CRP to the Framingham model was small. This was consistent with the Hosmer–Lemeshow P-values, which indicated that in both NPHS-II and EAS, models using Framingham variables alone and Framingham variables plus CRP were well calibrated. In NPHS-II, the Hosmer–Lemeshow P-values were 0.82 for Framingham without CRP and 0.90 including CRP; in EAS, the P-values were 0.65 for Framingham without CRP and 0.65 including CRP.
The reclassification of individuals following the addition of CRP to a Framingham-based model is summarized in Table 2 according to the eventual outcome. In NPHS-II, after generation of four 10-year CHD risk categories, a total of 716 men (60 cases and 656 controls) were reclassified, of which 368 (51.4%) moved to higher categories and 348 (48.6%) to lower categories. Of the 60 cases reclassified, 23 (38%) were inappropriately reclassified to lower risk categories and a similar proportion of inappropriate reclassifications was observed among controls, with 331 (50.5%) out of 656 controls being reclassified to higher categories. The NRI was 8.5% (–1.3, 18.3; P = 0.09). Very similar results were observed in the EAS study when using the same four risk categories (Table 2, panel A). When two risk-categories were generated using a treatment threshold for primary prevention of 15% CHD risk at 10-years, the number of subjects correctly reclassified by case–control status was substantially lower. In NPHS-II, in only 10 cases and 18 controls would the addition of CRP have impacted on the decision to prescribe statins, with a similarly low number in the EAS study. Consequently, the NRI using a treatment threshold of 15% was reduced to 4.9% (0.8–9.0) in NPHS-II and 3.0% (–3.0 to 9.2) in EAS (Table 2, Panel B).
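As a rough arithmetic check on the four-category NPHS-II figures above, the reclassification counts can be combined as below. The denominators (162 cases and 2317 event-free men) are taken from the totals reported elsewhere in the text, and treating them as the NRI denominators is an assumption.

```python
# Counts reported in the text for NPHS-II (four risk categories).
cases_total = 162
noncases_total = 2317          # assumed denominator: men who remained event-free
cases_reclassified = 60
cases_down = 23
noncases_reclassified = 656
noncases_up = 331

cases_up = cases_reclassified - cases_down             # cases moved to higher risk
noncases_down = noncases_reclassified - noncases_up    # non-cases moved to lower risk

nri = (cases_up - cases_down) / cases_total \
    + (noncases_down - noncases_up) / noncases_total
print(f"NRI = {nri:.3f}")
```

Under these assumptions the result is a little over 8%, in line with the reported 8.5%; the small gap may reflect rounding or details of the denominators that cannot be recovered from the text.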
In both EAS and NPHS-II, the distribution of CRP was right-skewed. Log-transformation resulted in normalization of the distribution in both studies, with the geometric mean being 2.46 mg/l (approximate SD 2.50 mg/l) in NPHS-II, and 1.93 mg/l (approximate SD 3.02 mg/l) in EAS. In NPHS-II, the geometric mean CRP (approximate SD) among the 162 men who later developed CHD was 3.66 (3.66) mg/l, and 2.40 (2.44) mg/l among the 2317 men who remained event-free. In EAS, geometric mean CRP (approximate SD) among the 147 individuals who later developed CHD was 2.47 (3.01) mg/l, and was 1.85 (3.00) mg/l among the 815 people who remained event-free. Though on average CRP was higher among individuals who later suffered events, there was substantial overlap of the log-normal distributions of CRP values between the two groups (Supplementary Figure 2), making it difficult to set a threshold value for CRP that distinguishes individuals who later suffer events. Although the risk of CHD events increased with higher CRP values, 52.4% of CHD events in NPHS-II and 57.1% of events in EAS occurred among individuals in the middle and lower tertiles of the CRP distribution (Supplementary Table 1). When the CRP distribution was divided according to the cut-points <1, 1–3 and >3 mg/l, recommended as defining individuals at low, intermediate and high risk, respectively, a similar percentage of events (41.3% in NPHS-II and 57.1% in EAS) were observed among individuals at low or intermediate risk categories.
Thirty-one studies of 28 prospective cohorts involving a total of 84 063 individuals and 11 252 incident CHD events were identified up to August 2007 (Supplementary Table 4; Supplementary references W1–W31). Twelve (38.7%) were studies of men alone (Supplementary references W1, W3, W4, W9–W13, W17, W18, W21, W25), six (19.4%) were studies of women alone (Supplementary references W5, W6, W16, W21, W23, W27) and 13 (41.9%) were studies of both men and women (Supplementary references W2, W7, W8, W14, W15, W19, W20, W22, W24, W28, W29, W31). All 31 studies provided a measure of strength of association. Two studies (6.7%; 20 191 individuals, 531 incident cases) reported the AUC for CRP alone (Supplementary references W6, W24). A total of 13 studies from the 31 prospective cohorts (42%; 84 292 individuals, 7201 incident cases) reported the effect of adding CRP to the Framingham model on discrimination (Supplementary references W6, W8, W13, W15, W19, W20, W22–W25, W27, W28, W30). One study (15 048 individuals, 390 incident cases) assessed reclassification but did not report the NRI measure (Supplementary reference W27). Up to 2003, only one study had reported on metrics related directly to predictive performance (Supplementary reference W6).
Thirty studies (96.8%) reported on the shape of the CRP distribution, and in all these studies it was reported to be skewed (Supplementary references W1–W3, W5–W31). Twenty-one studies (67.7%) reported that logarithmic transformation normalized the distribution (Supplementary references W1–W3, W7–W15, W17, W20, W22–W24, W28–W31), as observed in NPHS-II and EAS. Including NPHS-II and EAS, a total of 12 studies reporting risk according to quartiles of CRP concentration, and 10 studies reporting risk according to quintiles, allowed estimation of the shape of the relationship between log-CRP and log risk of CHD (Supplementary references W1, W3, W5, W6, W10, W12–W14, W16–W19, W21, W22, W26). There were no systematic differences in the characteristics of studies reporting CRP effects by quartiles or quintiles (Supplementary Table 2). A linear graded association of log-CRP values with log risk of events was noted in all studies (Figure 2). The log-normal distribution of CRP in populations, coupled with the graded incremental association of CRP concentration with risk over the whole range of CRP values, leads to the expectation of a substantial proportion of all CVD events occurring among the large number of individuals with near average levels of CRP. In the eight studies, including NPHS-II and EAS, that reported the absolute number of events by tertile, 1772 of 3152 events (56%) occurred among individuals in the lower and middle tertiles of the CRP distribution (Supplementary references W2, W8, W9, W21, W22, W28). In the seven studies, including NPHS-II and EAS (Supplementary references W1, W12, W13, W17, W22), that reported absolute events by quartile, 1105 of 2428 events (46%) occurred among people in the middle two quartiles. These proportions were almost identical to those estimated from the pooled relative risks by quartiles or quintiles from 18 studies with data available (12 412 individuals, 4651 incident cases) (Figure 2).
Ten studies reported on geometric mean CRP and its approximate SD separately among incident cases and those remaining event-free (controls), and seven studies reported separately on median CRP values and the inter-quartile range from which the geometric mean CRP and approximate SD were estimated under the assumption of a log-normal distribution. For the remainder of studies where a measure of dispersion was not reported, a pooled SD was applied. These were then used to reconstruct the log-normal distribution of baseline CRP values in the two groups defined by eventual outcome. In all the 25 studies with relevant information (40 684 individuals and 9351 cases), there was a substantial overlap of baseline CRP values among incident cases and controls (Supplementary Figure 2), similar to that observed in NPHS-II and EAS.
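The reconstruction described above can be sketched under a binormal assumption, i.e. that log-CRP is normally distributed in cases and in controls. The function names and the example values are hypothetical, and the formulas are the standard binormal results rather than a confirmed reproduction of the methods cited in the paper.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def log_sd_from_iqr(q1, q3):
    """SD of log-values from the inter-quartile range, assuming log-normality."""
    # For a normal distribution the quartiles lie about 0.6745 SD either side of the mean.
    return (math.log(q3) - math.log(q1)) / (2 * STD_NORMAL.inv_cdf(0.75))


def binormal_auc(mu_case, sd_case, mu_ctrl, sd_ctrl):
    """AUC when the log-marker is normal in cases and in controls."""
    return STD_NORMAL.cdf((mu_case - mu_ctrl) / math.sqrt(sd_case ** 2 + sd_ctrl ** 2))


def binormal_detection_rate(mu_case, sd_case, mu_ctrl, sd_ctrl, fpr):
    """Sensitivity at the cut-point giving false-positive rate `fpr` in controls."""
    cut = mu_ctrl + sd_ctrl * STD_NORMAL.inv_cdf(1 - fpr)
    return 1 - STD_NORMAL.cdf((cut - mu_case) / sd_case)


# Hypothetical summary data: medians (taken as geometric means) and log-scale SDs.
mu_case, mu_ctrl = math.log(2.5), math.log(1.8)
sd_case = sd_ctrl = 1.0

print(f"AUC  = {binormal_auc(mu_case, sd_case, mu_ctrl, sd_ctrl):.2f}")
print(f"DR5  = {binormal_detection_rate(mu_case, sd_case, mu_ctrl, sd_ctrl, 0.05):.2f}")
print(f"DR10 = {binormal_detection_rate(mu_case, sd_case, mu_ctrl, sd_ctrl, 0.10):.2f}")
```

With these invented inputs the two distributions overlap heavily, so the AUC stays close to 0.6 even though the geometric means differ by nearly 40%.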
Estimated disease DRs for a 5% false-positive rate inferred from these distributions are tabulated (Table 3) and those for a 10% false-positive rate shown graphically (Figure 3a), together with estimates of the AUC for CRP from individual studies (Figure 3b). The estimates were consistent with the corresponding values from NPHS-II and EAS. Because it was not possible to estimate the C-statistic from the summary data, we examined the influence of length of follow up on the AUC. There was a broad similarity in the DR10 or AUC estimates in analysis stratified by mean duration of follow up. The seven studies that reported on the geometric mean but provided no measure of dispersion and the single study that reported the median but provided no inter-quartile range were similar in other respects to those that reported these data (Supplementary Table 3). In the 13 studies that reported on the effect on the ROC curve or C-statistic of adding CRP to the Framingham-based models (Supplementary references W6, W8, W13, W15, W19, W20, W22–W25, W27, W28, W30), five reported no change and eight reported an improvement in the AUC ranging from 0.01 to 0.15 (Figure 4).
Five studies (16.1%; 42 141 individuals, 2430 incident cases) reported the effect of adding CRP to the calibration of risk models utilizing traditional risk factors (Supplementary Table 4; Supplementary references W15, W23, W27–W29). The effect of adding CRP to a base model was assessed in a variety of ways across studies, precluding pooled analyses. However, regardless of the metric used or reported statistical significance, absolute improvements if present were small.
Two reports from the Women's Health Study (Supplementary reference W27, 38), examined the effect of adding CRP to models based on established risk factors using reclassification tables. Both studies focused on event rates among individuals who shifted risk category on the addition of CRP, concluding that the actual risk of these individuals was more accurately assigned by the addition of CRP to the risk model. A comparison of observed risk in each category of predicted risk in NPHS-II and EAS (Table 4) that included all individuals in each category of predicted risk (i.e. those that shifted as well as those that did not) and a similar re-analysis of information from references W27 and 38 (data not shown) indicated that models that included or omitted CRP performed similarly. Moreover, the number of individuals that were reclassified between two risk categories based on a threshold 10-year CVD risk of 20% in WHS was 40 (0.15%) individuals who shifted above the threshold and 30 subjects (0.11%) who moved below the threshold. In the remaining 26 857, reclassification would not have affected decisions on the prescription of statins for primary prevention if these were based on a 20% CVD risk threshold. It was not possible to assess the number of individuals for whom reclassification was appropriate using the NRI measure, as the data were not presented separately by case–control status.
In a new analysis of two prospective cohorts and a critical appraisal of published studies of CRP, we found consistent reporting of measures of association relevant to aetiological analysis, but variable reporting of measures more directly relevant to the predictive utility of a marker. Collation of the available evidence, drawing together information on the different measures of prediction (both reported and imputed) indicated that while consistently associated with CHD risk, CRP measurement provides variable, generally modest, though sometimes statistically significant improvement in risk prediction. This analysis helps rationalize the available evidence to help clinicians, health providers and patients decide about whether or not measurement of CRP should be incorporated in the evaluation of CHD or CVD risk. The findings are also relevant to the conduct and reporting of studies of other emerging biomarkers being evaluated for CHD prediction and inform the debate on the optimal strategy for primary prevention of CHD.
One recommendation of the 2006 NHLBI workshop on CRP was research to assess ‘whether replacing a traditional risk factor(s) with CRP (or other newly discovered risk predictors) could improve CVD risk prediction’, (http://www.nhlbi.nih.gov/meetings/workshops/crp/report.htm). In the NPHS-II and EAS studies, measurement of CRP in individuals healthy at baseline provided limited discrimination for CHD events. CRP cut-points set to reduce misclassification (false-positive) rates to 5% or 10%, only detected between 10% and 20% of the people who eventually had events. The previously recommended CRP cut-point of 3 mg/l provided similarly modest discrimination. Discrimination over a range of cut-point values is summarized quantitatively as the AUC, which was 0.61 (0.57, 0.66) in NPHS-II, and 0.62 (0.57, 0.67) in EAS where 0.5 represents no discrimination and 1 represents perfect discrimination. Harrell's C-statistic was also not substantially different from the AUC in NPHS-II and EAS. The findings in NPHS-II and EAS were concordant with data obtained from prior reports identified by systematic review. Because of the variable reporting of the metrics relevant for assessing discrimination, these were derived, where possible, using study-specific summary CRP values from cases and controls. The derived values for disease DRs and the AUCs for CRP (Table 3 and Figure 3) were concordant with those obtained by direct calculation in NPHS-II and EAS, providing evidence for consistency across studies.
Another proposed application is to add CRP to established predictors included in the Framingham model. In NPHS-II, EAS and in the 13 prospective studies that evaluated the effect of adding CRP to the Framingham risk equation (Figure 4), CRP provided either very small or no improvement in the AUC or C-statistic.
Calibration refers to the ability of a risk model to predict the observed risk in a group of individuals. Risk models are used clinically to target treatments for primary prevention to those at highest risk in order to enhance cost effectiveness and reduce unnecessary exposure to drugs. However, the addition of CRP to models based on established markers included in the Framingham equation did not substantially enhance the calibration in NPHS-II and EAS (Figure 1). Of the five published studies that evaluated the effect of adding CRP to the calibration of risk models, assessment was made in different ways but none appeared to identify a quantitatively large improvement in model fit, though statistically significant differences were sometimes judged as being clinically important.
Because of concerns that neither discrimination nor calibration provides a fair test of a new marker, we also studied the effect of adding CRP to established risk assessment methods using the NRI measure. Although this analysis breaks a continuous risk score into categories, it can be informative in assessing how the addition of a marker to a base model reassigns risk in those who do or do not suffer clinical events. When CRP was added to a Framingham-based model using four categories of 10-year CHD risk in NPHS-II and EAS, the proportion of subjects correctly reclassified (risk upgraded in eventual cases and risk downgraded in people remaining healthy) was almost matched by the proportion incorrectly reclassified, so that the NRI was only 8.5% (–1.3, 18.3) in NPHS-II and 8.8% (–1.3, 18.9) in EAS. When two risk categories were considered based on an individual traversing the 15% 10-year risk cut-point for initiating statin therapy for primary prevention, the NRI was reduced still further to 4.9% (0.8, 9.0) in NPHS-II and 3.0% (–3.0, 9.2) in EAS. Thus, in absolute terms, measurement of CRP in all 2479 men in NPHS-II would have led to a change in the decision to recommend statin therapy in only 11 individuals; statins would have been given to an additional 10 individuals, and withheld from one, and the cost effectiveness of such an approach has yet to be adequately assessed. Very similar results were observed in EAS (Table 2). Only one study previously reported the ability of CRP to reclassify subjects into risk categories. Only 70 of the 26 927 participants (0.26%) crossed the 10-year 20% CHD risk threshold following addition of CRP, but it was not possible from the data provided to derive the NRI in the absence of information on the eventual outcome among reclassified individuals. 
In NPHS-II and EAS, in direct comparisons of models that included or excluded CRP using reclassification tables that compare observed (actual) risk with predicted risk, there was very little overall difference in alignment between the two models (Table 4). Thus, overall, the available evidence from NPHS-II and EAS as well as published prospective studies indicated that the incremental predictive performance of CRP in CHD is limited when added to conventional risk factors, regardless of the metric utilized.
Why should CRP, a sensitive marker of the inflammatory processes linked to atherosclerosis32 and consistently associated with CHD risk, predict cases of CHD only modestly well? One important reason could lie in the fundamental relationships that determine the predictive performance of a marker. First, the distribution of CRP values in all populations studied was log-normal. Second, using pooled data from 18 studies, we identified a graded continuous relationship of CRP with risk of CHD, over the full range of CRP values observed in general populations, without a threshold. Thus, a substantial proportion of all CHD events would be expected among the many individuals with near average levels of CRP, who are at intermediate risk of disease. This was observed in NPHS-II and EAS, as well as in 25 other studies that provided the relevant information. This likely contributes to the substantial overlap of CRP values among incident cases and controls (Supplementary Figure 2) that, in turn, explains the small to moderate size of the association of CRP with CHD (an odds ratio of around two for a two SD difference in CRP). It also accounts for the difficulty in establishing a cut-point for CRP that adequately distinguishes those who suffered events, summarized quantitatively by the AUC.
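The link between a modest odds ratio and a modest AUC can be illustrated with a short calculation. The sketch below is not necessarily the imputation procedure used in this study. It assumes that log CRP is normally distributed with equal SD among cases and controls, so that the log odds ratio per SD equals the standardized mean difference d, and that AUC = Φ(d/√2). The function names and the choice of "within 1 SD of the control mean" as a definition of near average are our own illustrative assumptions.

```python
from math import log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, SD 1


def standardized_difference(odds_ratio, per_sd):
    """Case-control difference in mean log CRP, in SD units.

    Assumes equal-variance normal distributions in cases and controls,
    under which the log odds ratio per 1 SD equals this difference.
    """
    return log(odds_ratio) / per_sd


def binormal_auc(d):
    """AUC for two equal-variance normal distributions d SDs apart."""
    return STD_NORMAL.cdf(d / sqrt(2))


def share_of_cases_near_average(d, width=1.0):
    """Fraction of cases whose value lies within `width` SD of the control mean."""
    return STD_NORMAL.cdf(width - d) - STD_NORMAL.cdf(-width - d)


# Odds ratio of around two for a two SD difference in CRP, as stated in the text.
d = standardized_difference(odds_ratio=2.0, per_sd=2.0)
auc = binormal_auc(d)
near_average = share_of_cases_near_average(d)

print(f"standardized difference d = {d:.3f}")
print(f"implied AUC = {auc:.3f}")
print(f"cases within 1 SD of the control mean = {near_average:.1%}")
```

Under these assumptions the implied AUC is a little under 0.60, and roughly two-thirds of cases have log CRP within one SD of the control mean, in line with the overlap described above.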
Additional reasons are likely to contribute to the limited incremental utility for discrimination when CRP is added to standard risk assessment. First, it has been shown that combining markers that exhibit modest discrimination individually adds less than might be expected to the discrimination of a multimarker model.26 Second, it is recognized that CRP is not only associated with blood pressure, LDL- and HDL-cholesterol, age and gender, but also with diabetes, smoking, left ventricular hypertrophy and atrial fibrillation, all of which already contribute to the Framingham risk model.33–35 In some studies, the proportion of high CRP levels attributed to traditional risk factors was as high as 78% in men and 67% in women.36 This was borne out empirically in the 15 prospective studies that reported the AUC or the C-statistic for the Framingham model with and without CRP (Figure 4). It is likely that similar constraints limit the degree to which addition of CRP recalibrates risk models or improves classification.
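The effect of correlation between a new marker and an existing risk score can be sketched numerically. The example below assumes that the score and the marker are jointly normal with the same covariance in cases and controls, in which case the best linear combination separates the groups by the Mahalanobis distance Δ and has AUC = Φ(Δ/√2). The separations (1.0 SD for the score, 0.35 SD for the marker) and the correlation of 0.4 are hypothetical values chosen for illustration, not estimates from NPHS-II, EAS or the reviewed studies.

```python
from math import sqrt
from statistics import NormalDist


def binormal_auc(delta):
    """AUC when cases and controls are equal-variance normals delta SDs apart."""
    return NormalDist().cdf(delta / sqrt(2))


def combined_separation(d1, d2, rho):
    """Mahalanobis distance between case and control means for two markers.

    d1, d2: standardized mean differences of each marker (SD units).
    rho:    within-group correlation between the two markers (|rho| < 1).
    """
    return sqrt((d1**2 - 2 * rho * d1 * d2 + d2**2) / (1 - rho**2))


# Hypothetical inputs (assumptions for illustration only).
d_score, d_marker = 1.0, 0.35

auc_base = binormal_auc(d_score)
auc_uncorr = binormal_auc(combined_separation(d_score, d_marker, rho=0.0))
auc_corr = binormal_auc(combined_separation(d_score, d_marker, rho=0.4))

print(f"score alone:                 AUC = {auc_base:.3f}")
print(f"score + uncorrelated marker: AUC = {auc_uncorr:.3f}")
print(f"score + correlated marker:   AUC = {auc_corr:.3f}")
```

With these inputs, an uncorrelated marker adds only a small increment to the AUC, and a marker correlated with the existing score adds almost nothing, because most of its information is already carried by the score.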
Some of the limitations of our study are worthy of consideration. First, not all studies reported the full range of metrics relevant for prediction. This was, in large part, the motivation for our critical appraisal. Nevertheless, we found no systematic differences in the characteristics of the studies reporting different metrics. Moreover, the fundamental relationships that constrain marker performance, namely the log-normal distributions of CRP values overall and the moderate, graded log-linear association with risk, were more widely reported and were consistent across all studies. Where we derived metrics relevant to prediction from the published studies, we used aggregate rather than participant-level data with the assumption that the CRP distribution is log-normal among both incident cases and controls. However, all the derived metrics and estimated relationships in published data were directly verifiable in EAS and NPHS-II, and all were closely concordant. Participant-level information would allow more detailed analyses, and this may become possible through resources being built by the Emerging Risk Factors Collaboration. We did not formally assess the possibility of publication bias in this analysis, although this was addressed by a previous meta-analysis.6 However, unpublished studies, should they exist, are more likely to attenuate than to inflate the estimates of predictive utility. Lastly, although CRP may have limited utility in the screening or prediction of CHD, there is also immense interest in the possibility that it may be causally involved in the development and progression of atheroma or its complications after an acute event,37 and our analysis was not designed to address this. Nor did our analysis address the utility of CRP for risk stratification around the time of an acute coronary syndrome, or its potential pathological role in ischaemic tissue damage.
Twenty-five years ago more than 240 ‘risk factors’ for CHD had already been identified, and many new markers have since been added. The pace of discovery of new markers is expected to accelerate with the new ‘-omics’ technologies, and it is likely that one or more of these newly identified markers will be advocated for risk prediction in the same way as CRP. However, it is imperative that the utility of any marker is assessed using the appropriate metrics and not simply based on tests of ‘independent’ association that provide insufficient evidence on predictive utility. Better appreciation of the fundamental epidemiological relationships that constrain the performance of all novel markers will also be required; namely, the distribution in a population and the shape and strength of the association with CHD.
In summary, new studies and a systematic review of published data indicate that while CRP is consistently associated with CHD and CVD risk, measurement of CRP provides much more limited information for risk prediction than tests of association alone might indicate. Previous guidance on the clinical use of CRP measurement may require updating in the light of these findings. Analyses of the type reported in this paper for CRP could also be conducted for other novel biomarkers, to test whether any of these are likely to help better predict the occurrence of coronary events. All such analyses would benefit from the development of reporting standards to facilitate critical evaluation of the performance of biomarkers advocated for risk prediction.
Supplementary data are available at IJE online.
UK Medical Research Council (to NPHS-II); the US National Institutes of Health (grant NHLBI33014); Du Pont Pharma, Wilmington, USA; British Heart Foundation (to EAS); the British Heart Foundation (PhD Studentship FS/02/086/14760 to T.S.); British Heart Foundation Programme Grant (PG2000/015 to J.A.C. and S.E.H.); Scottish Executive (Chief Scientist Office to I.T.); British Heart Foundation Shillingford Training Fellowship (FS/07/011 to R.S.); British Heart Foundation Senior Fellowship (FS/05/125 to A.D.H.).
A.D.H. acknowledges generous support from the Rosetrees Trust.
Conflict of interest: None declared.