The use of coronary computed tomographic angiography (CCTA) for evaluation of patients with suspected coronary artery disease (CAD) is growing rapidly, yet questions remain regarding its diagnostic accuracy and its impact on clinical decision-making and patient outcomes.
A systematic literature review was conducted to identify studies examining (a) CCTA’s diagnostic accuracy; and (b) the impact of CCTA on clinical decision-making and/or patient outcomes. Diagnostic accuracy estimates were limited to patient-based analyses of occlusion; outcome studies were eligible for inclusion if they involved patients at low-to-intermediate risk of CAD. Pooled accuracy estimates were derived using bivariate random effects models; non-diagnostic CCTA results were conservatively assumed to be false positives.
A total of 42 diagnostic accuracy studies and 11 patient outcome studies were identified. The pooled mean sensitivity for CCTA in primary analyses was 98% (95% CI: 96%, 99%); specificity was 85% (81%, 89%). A small number of outcome studies set primarily in the emergency department found triage of low-risk patients using CCTA produced no serious adverse outcomes and was time-saving relative to standard triage care. Outcome studies in the outpatient setting were limited to four case series that did not directly compare patient care or outcomes with those of contemporaneous patients evaluated without CCTA.
CCTA appears to have high diagnostic accuracy in patients with suspected CAD, but its potential impact on clinical decision-making and patient outcomes is less well-understood, particularly in non-emergent settings.
Coronary artery disease (CAD) is the leading cause of death in the United States among both men and women, resulting in over 400,000 deaths annually.1 CAD also has a substantial impact on health care utilization, including over 6 million patient visits each year to emergency departments for acute chest pain, the hallmark symptom of CAD.2
Due to CAD’s prevalence as well as its impact on health care expenditures, and because several options exist to reduce CAD-related morbidity and mortality, accurate diagnosis is critical. Currently, the definitive standard for diagnosis is invasive coronary angiography (ICA). There are risks associated with ICA, however, such as artery trauma and heart arrhythmias as well as a small mortality risk.3 For these reasons, non-invasive diagnostic methods have also been sought, most commonly via stress testing, often combined with echocardiography or single photon emission computed tomography (SPECT). While these diagnostic modalities are important tests of cardiac function, they do not provide information on the presence of underlying CAD.4
For this and other reasons, interest has grown in using imaging technology to provide complementary anatomic information on patients with suspected CAD. Recently, the evolution of ultra-fast CT scanners has led to improved coronary imaging; with the advent of 64-slice (and higher) scanners, scan artifacts have been greatly reduced, and both spatial and temporal resolution have improved significantly.5 The technological evolution of coronary computed tomographic angiography (CCTA) has resulted in substantial growth in its use; findings from a 2009 survey of 220 US-based cardiology practices indicate that nearly half own or lease cardiac CT equipment, vs. 23.5% in the 2006 survey.6
However, there is still considerable debate regarding CCTA’s diagnostic accuracy in multiple settings as well as its prognostic ability.7 We therefore undertook an assessment focused on the diagnostic accuracy of CCTA as well as its potential impact on clinical decision-making and patient outcomes. The study scope was guided by a 17-person advisory committee composed of clinical experts, patients and patient advocates, experts in the fields of health economics, epidemiology, health policy, and bioethics, and representatives from public and private payers as well as manufacturers of CCTA technologies.8 Based on the input of this committee, the focus for this review was targeted at the use of CCTA to evaluate patients at low-to-intermediate CAD risk for (a) acute chest pain of unknown origin in an emergency department (ED) setting; and (b) stable chest pain symptoms in an outpatient setting.
This review included study reports on the performance of CCTA in diagnosing CAD using scanners with 64-slice or higher resolution. Guidance from the advisory committee suggested that 64-slice scanners were now widely available and had become viewed as the standard for CCTA, and that literature on earlier-generation scanners would not be viewed as relevant by the clinical and patient communities.
The literature search timeframe for the assessment spanned from January 2005 (the first year of published studies from 64-slice scanners) through February 2010. The search was limited to English-language reports only. The search strategy employed can be found in Table 1.
To be eligible for inclusion, diagnostic accuracy studies must have used ICA as the reference standard in all or a random sample of patients. To reflect our interest in producing results reflective of decision-making in typical clinical practice, only those studies that reported results at the patient level or whose results could be used to construct per-patient findings were included; in addition, because quantitative models for estimation of degree of stenosis are not commonly employed in community practice, findings were restricted to those determined by visual inspection alone. Additional eligibility criteria were incorporated in an attempt to reduce between-study heterogeneity, including restriction of the sample to studies that evaluated accuracy in native coronary arteries only, and exclusion of studies that (a) did not include blinded review of both CCTA and ICA; and (b) had an elapsed time between CCTA and ICA greater than 3 months. Diagnostic accuracy studies were included regardless of setting (e.g., emergency department, outpatient).
Studies were also sought that examined the impact of CCTA, whether used alone or in combination with other diagnostic methods, on clinical decision-making and/or rates of major cardiovascular events. The review of studies on patient outcomes was limited to those involving patients with low-to-intermediate risk or pretest probability of CAD (i.e., approximately 10–30%), consistent with published clinical guidelines on the use of CCTA for CAD detection.7 Diagnostic accuracy studies were not subject to this restriction, however, as nearly all of these studies were conducted in patients already scheduled for ICA, and therefore likely to be at higher risk of CAD.
MEDLINE, EMBASE, and The Cochrane Library (including the Database of Abstracts of Reviews of Effects [DARE]) were searched for eligible studies; in addition to primary studies, health technology assessments (HTAs) and systematic reviews were also examined. Reference lists of all eligible studies were also searched. Figure 1 shows a flow chart of the results of all searches for included primary studies. In addition to 53 primary studies, searches identified three systematic reviews and two HTAs.
Data Collection & Synthesis

If sensitivity or specificity was not reported, these values were calculated. Primary analyses were conducted according to an “intent to diagnose” paradigm; in this approach, patients with “non-diagnostic” or indeterminate CCTA tests were considered to have positive findings, as guidance from our advisory committee suggested that clinicians commonly refer such cases to ICA or further non-invasive testing. We also assumed that all such patients would be determined to be false positives on ICA, which materially affects only the calculations of specificity and positive predictive value (i.e., as false positives are not included in calculations of sensitivity or negative predictive value). This method may under-represent the diagnostic accuracy of CCTA but avoids the equal or greater risk of overestimating accuracy when non-diagnostic CCTA results are excluded from consideration. Similar “conservative” approaches have been employed by several investigators specifically to evaluate the impact of excluding non-diagnostic findings on diagnostic accuracy.9,10 In separate analyses, specificity was calculated with non-diagnostic findings excluded from consideration.

The quality of diagnostic accuracy studies is often assessed using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool, a 14-item instrument evaluating internal validity developed by Whiting et al.11 We modified the published tool by first eliminating two items that relate to sufficient description of the index test and reference standard to allow their replication, as it was felt that these items relate more to the quality of study reporting than to any methodological deficiencies. We then added four items to the checklist, consistent with methods used in a recent systematic review of 64-slice CCTA:12
A single reviewer with systematic review training assigned the QUADAS scores.
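The intent-to-diagnose calculation described above can be illustrated with a minimal Python sketch. The function name, argument names, and the counts in the example are hypothetical and are not taken from any included study; the sketch assumes, as in the primary analysis, that every non-diagnostic scan is counted as a false positive.

```python
def accuracy_measures(tp, fp, tn, fn, non_diagnostic=0, intent_to_diagnose=True):
    """Per-patient accuracy measures from a 2x2 table.

    Under the intent-to-diagnose assumption, every non-diagnostic CCTA
    result is added to the false positives; otherwise such results are
    left out of the calculation altogether.
    """
    fp_used = fp + non_diagnostic if intent_to_diagnose else fp
    return {
        "sensitivity": tp / (tp + fn),       # unaffected by false positives
        "specificity": tn / (tn + fp_used),
        "ppv": tp / (tp + fp_used),
        "npv": tn / (tn + fn),               # unaffected by false positives
    }


# Hypothetical counts for a single study (illustrative only).
counts = dict(tp=58, fp=4, tn=34, fn=2, non_diagnostic=2)
primary = accuracy_measures(**counts, intent_to_diagnose=True)
excluded = accuracy_measures(**counts, intent_to_diagnose=False)
print(primary["specificity"], excluded["specificity"])
```

In this hypothetical example, only specificity and positive predictive value differ between the two calls, which mirrors the point made above about which measures the assumption affects.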
Meta-Analysis

Analyses of diagnostic accuracy were conducted by first using the reported or derived numbers of true positives, false positives, true negatives, and false negatives to calculate sensitivity and specificity. Because sensitivity and specificity are typically correlated to some extent, we generated pooled estimates using a bivariate random-effects model. The advantages of such an approach include preservation of the two-dimensional nature of the underlying data, the ability to generate pooled estimates of both sensitivity and specificity, and the ability to test for the level of correlation between the two measures.13 The bivariate model does not provide a clear measure of between-study heterogeneity, however; this was therefore assessed using individual random-effects models of sensitivity and specificity. Meta-regression, a technique employed to examine the possible correlation between study characteristics and primary findings in order to more fully explain observed differences between studies, was employed in situations where sizable heterogeneity was observed. Characteristics examined included sample size, mean age, % male, CAD prevalence, and the proportion of patients with prior known CAD. Statistically significant factors (i.e., p<0.05) were then included in the bivariate model to examine their impact on study findings.

As the evidence base for this meta-analysis was limited mainly to single-center studies, publication bias and duplicative research were also considered as partial explanations for the results. Authors with more than one publication in the sample were first contacted to identify whether any samples had a significant degree of overlap; in such cases, the largest study was retained and the others were removed. Authors were also queried regarding the presence of any significant unpublished or grey literature research that might have altered the primary findings; no studies were identified.
Publication bias was expected, as several recent analyses have suggested that most meta-analyses of small, single-center diagnostic accuracy studies are subject to some level of publication bias;14,15 formal tests of such bias were therefore not conducted.

Analyses were conducted using SAS® (SAS Institute, Cary, NC, USA), version 9.1, and MetaDisc®, version 1.4.16
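The separate random-effects models used to gauge heterogeneity can be sketched in Python. The source does not state which estimator was used, so the DerSimonian-Laird method on the logit scale, the 0.5 continuity correction, and all names and counts below are assumptions made for illustration. This univariate sketch is not the bivariate model used for the pooled estimates.

```python
import math


def logit_and_variance(events, total):
    """Logit of a proportion and its approximate variance.

    A 0.5 continuity correction is applied only when the proportion
    is exactly 0 or 1 (an assumption of this sketch).
    """
    non_events = total - events
    if events == 0 or non_events == 0:
        events, non_events = events + 0.5, non_events + 0.5
    return math.log(events / non_events), 1 / events + 1 / non_events


def pool_random_effects(studies):
    """DerSimonian-Laird pooling of proportions given (events, total) pairs."""
    effects, variances = zip(*(logit_and_variance(e, n) for e, n in studies))
    weights = [1 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(studies) - 1
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))

    def expit(x):
        return 1 / (1 + math.exp(-x))

    return {
        "pooled": expit(pooled),
        "ci_low": expit(pooled - 1.96 * se),
        "ci_high": expit(pooled + 1.96 * se),
        "tau2": tau2,
    }


# Hypothetical (true negatives, patients without CAD) pairs for three studies.
specificity_data = [(45, 50), (30, 40), (88, 90)]
result = pool_random_effects(specificity_data)
print(result)
```

Pooling on the logit scale keeps the back-transformed estimate and its confidence limits inside the 0 to 1 range, which matters when many studies report sensitivities close to 100%.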
Because studies of the impact of CCTA varied substantially in terms of their definitions of clinical outcomes, comparators, period of follow-up, and data collection methods, no quantitative meta-analysis of these data was performed. Study characteristics and primary findings are presented in a descriptive evidence table (Table 4).
A total of 66 studies were initially identified from the literature search; 13 of these studies were excluded because either no per-patient findings were available (n=6), the comparison performed was for an outcome other than detection of CAD (e.g., comparison to SPECT to assess myocardial perfusion, n=2), identical/overlapping study samples were presented in another included study (n=3), CCTA was considered as part of a multi-test strategy (n=1), or ICA was not performed on the full or a random sample (n=1). A flowchart of sample attrition, created in PRISMA format,17 is available in Figure 1.
Of the remaining 53 studies, 44 were conducted in an outpatient setting, and nine were conducted in an ED setting. In all, 6,151 patients were analyzed in these studies; mean age ranged from 46–69 years, and 63% were male. Most studies were diagnostic accuracy studies (n=42; 2 ED, 40 outpatient), with the majority of these conducted in patients already scheduled for ICA; in addition, nearly all of these studies (39 of 42) were conducted in a single center. A total of 11 studies examined the impact of CCTA by evaluating subsequent clinical decisions and patient outcomes; while this approach was typically utilized in an ED setting (where definitive diagnosis by ICA is not universally feasible or warranted), 4 of the 11 studies identified were conducted among patients presenting on an outpatient basis with stable symptoms.
Because most of the included diagnostic accuracy studies involved patients already scheduled for ICA, the prevalence of CAD in our sample was relatively high (mean [SD]: 59.3% [21.6%]; range: 15.0%–91.0%). Approximately two-thirds of studies excluded patients with known prior CAD or revascularization. And, while not a criterion for patient exclusion, vessels smaller than 1.5 mm in diameter or those felt to be heavily calcified were often excluded from analysis, as CCTA image quality is often impaired in these vessel types.18
Findings from the QUADAS tool assessment of diagnostic accuracy study quality can be found in Table 2. Spectrum bias, or the systematic variation of diagnostic test performance across patient subgroups coupled with the failure to represent all such subgroups in a study, was found to be present in the majority of studies (66%). This is not surprising, as the high underlying prevalence of CAD in many of these studies suggests the possibility of over-estimation of diagnostic accuracy if sufficient numbers of low-risk individuals are not included.19 Other major concerns included lack of information on study withdrawals and observer variation.
Studies ascertained for diagnostic accuracy included 3 multicenter evaluations. The ACCURACY and CORE 64 studies were evaluations of 230 and 291 patients evaluated at 16 U.S. and 9 multi-national sites respectively.20,21 A third evaluation examined 360 patients at three Dutch hospital sites.22 The ACCURACY study was unique for its enrollment of patients typically excluded from evaluations of CCTA sensitivity and specificity, such as obese patients and those with high calcium scores. The Dutch study was also unique for its mixed population of stable (65%) and unstable (35%) chest pain, use of scanners from multiple vendors, and intent-to-diagnose protocol (i.e., all segments and vessels were analyzed regardless of image quality). In contrast, CORE 64 had entry criteria similar to those of the single-center studies that comprised the rest of our sample.
Estimates of diagnostic accuracy from these studies were somewhat divergent: sensitivity and specificity were 95% and 83%, respectively, in the ACCURACY study;20 83% and 91% in CORE 64;21 and 99% and 64% in the Dutch evaluation.22
Figure 2 presents the data on sensitivity and specificity of CCTA from all diagnostic accuracy studies,9,10,20–59 including the pooled estimates generated by the bivariate model. Sensitivity was relatively consistent, exceeding 90% in all but 5 of the 42 studies examined; the pooled estimate for sensitivity was 98% (95% CI: 96%, 99%). Approximately 3% of patients had non-diagnostic CCTA results (range: 0–18%); as described previously, these patients were considered to be false positives in primary calculations.
A greater degree of variability was observed in analyses of specificity; results by study ranged from 50–100%. No discernible pattern in study design or diagnosis confirmation was observed among “outlier” studies. While a difference in the “cutoff” level for significant stenosis was considered as a possible explanation for variability in specificity, nearly all (39/42) studies used ≥50% as the cutoff value. Despite this variability, the summary receiver operating characteristic (sROC) curve suggested high diagnostic accuracy overall in this sample of studies (area under the curve = 0.9692) (Fig. 3).
Consideration of patients with non-diagnostic findings as false positives resulted in a pooled specificity estimate of 85% (95% CI: 81%, 89%). When non-diagnostic exams were excluded from consideration, specificity rose from 85% to 88% (95% CI: 85%, 91%); sensitivity was unchanged.
Because the bivariate model does not produce a single heterogeneity statistic, heterogeneity was assessed using separate random effects models for sensitivity and specificity. For sensitivity, moderate heterogeneity was observed (I²=59.8%, p<0.001); however, a high degree of heterogeneity was seen in analyses of specificity (I²=82.5%, p<0.0001). To further explore explanations for heterogeneity, the possibility of a threshold effect (i.e., a strong inverse correlation between sensitivity and specificity) was first examined. No material effect was observed (Spearman correlation coefficient: −0.146, p=0.312).
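The two heterogeneity checks reported above can be sketched as follows. This is a minimal illustration, assuming study-level estimates with known variances for the I² statistic and using the Pearson correlation of average ranks for the Spearman coefficient; the function names and example values are hypothetical and do not reproduce the study data.

```python
import math


def i_squared(effects, variances):
    """Cochran's Q and the I² statistic (in percent) from study-level estimates."""
    weights = [1 / v for v in variances]
    mean = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - mean) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation (undefined if either input is constant).

    A strongly negative value across studies would point to a threshold
    effect between sensitivity and specificity.
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical per-study sensitivities and specificities (illustrative only).
sens = [0.99, 0.95, 0.97, 0.92, 1.00]
spec = [0.80, 0.91, 0.85, 0.88, 0.64]
print(spearman(sens, spec))
```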
A meta-regression model was then specified to examine differences across subgroups, including sample size, mean age, % male, CAD prevalence, and whether the sample included patients with known prior CAD. Age was the only significant subgroup in the model. When diagnostic accuracy estimates were examined by age group, sensitivity was largely unchanged (range: 95–99%) (Table 3). Specificity varied widely by age group, however, ranging from 91% for studies with a mean age <59 years to 77% for age >62 years. This is not a surprising finding, as CCTA has been found to have diminished specificity in the elderly due to higher levels of coronary artery calcification in older patients.60
Details on the 11 studies that evaluated in some way the impact of CCTA on patient management and outcomes2,61–70 can be found in Table 4. The outcome measures employed, event definitions used, underlying CAD risk, and duration of follow-up varied significantly between studies.
Only 2 of the 11 studies, both conducted in the ED, documented the impact of CCTA on clinician decision-making and patient outcomes relative to standard care. The first was a randomized controlled trial of 197 patients who received CCTA plus standard observation unit care (serial enzymes and stress SPECT as necessary) or standard care alone and were followed for up to 6 months.62 There were no major adverse cardiovascular events (MACE) in either group, which may be due in part to the fact that patients were only randomized after two consecutive negative enzyme tests (at 0 and 4 hours), and would therefore be considered at very low risk for acute coronary syndromes caused by CAD. A higher number of patients in the CCTA arm received ICA (11 vs. 9 for standard care). Testing costs were higher in the CCTA arm, but overall ED-related costs were lower due to a higher percentage of immediate discharges for negative test results.
The second study, performed in Israel, evaluated CCTA’s use in guiding triage among 58 patients with and without known prior CAD who presented to the ED with chest pain, intermediate CAD risk, negative initial enzymes, and no EKG changes;67 patients were followed for up to 12 months following initial presentation. Patients received standard ED triage along with cardiology consultation, after which a preliminary diagnosis of acute coronary syndrome (ACS) or non-ACS chest pain was made, along with recommendations for discharge home or hospitalization. CCTA was then performed in all patients, and triage recommendations were adjusted at the discretion of the treating physician. CCTA results led to a revised diagnosis (i.e., from ACS to non-ACS chest pain) in 18 of 41 patients as well as canceled hospitalizations in 21 of 47. In addition, planned ICA was deemed unnecessary in 20 of 32 patients, while CCTA did suggest the need for ICA in 5 of 26 patients for whom it was not initially felt to be required. One CCTA scan was deemed to be false positive; no MACE events were recorded in the 32 patients discharged from the ED.
The remaining patient management and outcomes studies were case series or retrospective cohort studies comparing MACE event rates in patients with positive and negative CCTA findings (Table 4). These rates were not compared to those of other diagnostic strategies for CAD.
The results of our analysis suggest that the sensitivity (98%) and specificity (85%) of CCTA for significant coronary artery stenosis appear to be relatively high. These findings compare favorably with accuracy estimates for other non-invasive strategies for CAD detection. Findings from recent series examining the diagnostic accuracy of SPECT, for example, suggest somewhat lower sensitivity (85–90%) than CCTA and comparable specificity (80–90%).71,72 However, as noted previously with published multicenter studies,20–22 diagnostic accuracy can vary widely based on patient selection criteria, evaluation methodology, and even type of CT scanner used. For example, the diagnostic accuracy results of the multi-center ACCURACY study, with entry criteria designed to reflect typical clinical practice, were similar to those in our review;20 however, findings from a Dutch multicenter evaluation, which employed an “intent-to-diagnose” method similar to our own to reflect clinical decision-making,22 showed comparable sensitivity but lower specificity.
Evidence on CCTA’s impact on clinical decision-making and patient outcomes is very limited. While there are preliminary data suggesting that use of CCTA for triage purposes in the emergency department has the potential to speed discharge and reduce costs in a large number of low-to-intermediate risk patients with negative CCTA findings, we found no studies that measured the potential for CCTA to alter decision-making and reduce unnecessary testing in the outpatient setting through explicit comparisons to other diagnostic strategies.
Our estimate of CCTA’s sensitivity is comparable to pooled estimates from other systematic reviews;12,76–79 however, our pooled specificity is somewhat lower. For example, one review of 18 studies yielded summary estimates of 99% (95% Bayesian credible interval: 97%, 99%) and 89% (83%, 94%) for sensitivity and specificity.12 In another review, an analysis of individual random-effects models for sensitivity and specificity in 13 studies reporting patient-based results yielded estimates of 97.5% (95% CI: 96%, 99%) and 91% (87.5%, 94%) respectively.76
Explicit comparisons of these findings to our results are problematic given differences in study methodology and sample. However, our analysis is based on the largest number of studies reported to date, including 3 recent reports of multi-center studies. Differences may therefore simply be a function of greater variability in study design and patient populations in our sample relative to the other reviews. In addition, a larger number of studies will also reflect greater variability in the interpretation of what constitutes “significant” stenosis, leading to a greater heterogeneity in the false-positive rate (and accordingly, specificity). Indeed, as described previously, the 3 multi-center evaluations produced specificity estimates of 91%, 83%, and 64%;20–22 in addition to differences in study protocol and entry criteria, the number of centers involved and interpretations provided may also have contributed to this variation.
There are a number of important questions that the current evidence is unable to address. For one, the lack of data on long-term outcomes with CCTA makes it difficult to ascribe value to its ability to reduce the rate of false-positive and false-negative findings relative to other strategies. Without these data, it is not possible to know whether and when patients with false-negative findings will re-present with symptoms and be diagnosed correctly, and whether they will suffer any health consequences in the intervening period. It is also impossible to know the degree to which heightened clinical attention given to patients with false-positive CCTA tests might provide a net health benefit given that CAD will develop over time in many healthy individuals.
It is important to note that the data on long-term outcomes related to radiation exposure and extra-coronary “incidental” findings are so limited that they were not considered here, although they remain key considerations in clinical and policy decisions. Another critical but unstudied issue is whether widespread adoption of CCTA would result in a shift in the number or type of patients sent for diagnostic intervention. For example, availability of CCTA may lead to increased diagnostic testing of patients at very low risk of significant CAD.73 If this occurs, the relative balance of true-positive and false-positive results may shift, which may in turn increase the number of unnecessary tests and alter perceptions of the net health benefit of CCTA.
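The dependence of this balance on pretest risk can be made concrete with Bayes' theorem. The sketch below plugs the pooled sensitivity (98%) and specificity (85%) from this review into the standard formulas for predictive values; the prevalence values are illustrative assumptions, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Pooled estimates from the primary analysis; prevalences are illustrative.
for prevalence in (0.05, 0.10, 0.30):
    ppv, npv = predictive_values(0.98, 0.85, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

Under these assumptions, the negative predictive value stays at roughly 99% or higher across the range, while the positive predictive value falls steeply as prevalence declines, so a larger share of positive scans would be false positives in very low-risk populations.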
Because of CCTA’s visual precision, “mild” levels of stenosis (i.e., 20–70%) can be detected; the benefits of aggressive management of this level of CAD are unknown. While not a focus of this systematic review, several studies have attempted to examine CCTA’s ability to diagnose functional cardiac deficits, using SPECT or another functional test as a reference.2,74,75 While negative predictive value for these abnormalities was similar to that reported in the ICA-reference studies, positive predictive value ranged only from 50% to 60%, which is indicative of overestimation of clinically significant obstruction by CCTA. Some have posited that, with increasingly precise technology, the ability to use CCTA to study blood flow and perfusion deficits will be heightened; evidence has not yet accumulated to evaluate this hypothesis.
The state of current evidence on CCTA and the unanswered questions described above make CCTA’s role in the diagnostic armamentarium unclear. For patients who require data on cardiac function, SPECT or other testing followed by ICA is the likely pathway. For those who otherwise exhibit characteristics associated with high risk of CAD, immediate ICA without non-invasive testing would be in order.7 For patients at lower CAD risk, the high sensitivity of CCTA makes it a valuable test for excluding CAD as a cause of chest pain. It remains to be seen, however, whether this capability truly represents an advance over clinical judgment and basic stress testing, particularly in very low-risk individuals.
Our review is subject to some important limitations. For one, while efforts were undertaken to reduce publication bias and duplicative research, certain aspects of our search strategy (e.g., exclusion of non-English-language articles) may be subject to residual levels of such bias. In addition, while results from multi-center studies are included, the majority of findings were from small, single-center evaluations from major research centers, restricting the generalizability of our findings to the full spectrum of patients likely to receive this test as well as to conditions of typical community practice. Finally, as noted previously, the “intent-to-diagnose” paradigm employed for our primary analysis, which assumed all non-diagnostic findings on CCTA would be shown to be false positives upon further testing, may have underestimated CCTA’s true diagnostic accuracy. However, we feel that exclusion of non-diagnostic results from analysis, the approach taken by many of the studies in our sample, provides an unwarranted boost to estimates of diagnostic accuracy. Our findings are therefore presented using both approaches so that the reader can visualize the interval in which CCTA’s true specificity most likely resides.
Despite these limitations, we believe our study represents an important summary of the existing literature, highlighting how limited the current evidence is on the overall impact of CCTA on clinical decision-making and patient outcomes. Further research is needed that captures important outcomes and allows a direct comparison of contemporaneous groups of patients evaluated with and without CCTA. In addition, the remaining uncertainties regarding broader use of CCTA and its impact on the clinical care of patients with mild-to-moderate stenoses will require ongoing evaluation of this technology in multiple patient populations.
Contributors: The authors wish to acknowledge the efforts of the advisory committee involved in this evaluation for their assistance in defining the review scope as well as reviewing and commenting on draft findings.
Funding: The systematic review was funded through pooled resources from non-profit foundation grants and unrestricted research grants from multiple sources, including health plans and life science companies. None of these companies are manufacturers of CT machines. In addition, we received a contract from the Washington state Health Technology Assessment Program for part of the work involved in this project.
Prior Presentations: This work has not been published previously in any peer-reviewed journal, nor has it been presented at any conference.
Conflicts of Interest: Ms. Kuba was employed by the Institute for Clinical and Economic Review at the time these analyses were conducted. None of the remaining authors reported conflicts of interest.