Estimation of the likelihood of adverse outcome after surgery is a central objective of preoperative assessment. However, it remains uncertain which technique provides the most accurate prediction of perioperative risk. Limitations in the design and conduct of many studies evaluating methods of risk assessment contribute to this uncertainty. Further studies of preoperative risk assessment are urgently required and should meet several essential criteria.
Tissue injury and physiological disturbance associated with major surgery can have long-lasting consequences and may sometimes be immediately life-threatening. Low overall mortality rates conceal the existence of a subgroup of high-risk patients who account for more than 80% of postoperative deaths.1,2 In the UK alone, more than 170 000 high-risk non-cardiac surgical procedures are performed each year, after which 100 000 patients develop complications resulting in more than 25 000 deaths before hospital discharge.1–3 Importantly, those patients who develop complications but survive to leave hospital suffer a substantial reduction in functional independence and long-term survival.4–6 Some have suggested that among developed nations, poor surgical outcomes are a particular problem in the UK.7 However, published data suggest that the issue is widespread.5,6,8,9 Recent figures suggest that more than 230 million surgical procedures are performed worldwide each year.10 Assuming a mortality rate of between 0.5% and 2.0%,6,11 surgery will be associated with 1.25–5 million deaths annually worldwide. Morbidity rates are likely to be between five and 10 times this figure. Clearly, major surgery represents an extremely important cause of death and disability, even in those parts of the world where mortality rates are low.10
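As a rough check on the scale of these estimates, the arithmetic can be sketched as follows. The function names are illustrative only, and applying the five- and 10-fold morbidity multipliers to the low and high death estimates, respectively, is an assumption about how the range should be read. With exactly 230 million procedures, the range is about 1.15–4.6 million deaths; the rounded 1.25–5 million quoted above corresponds to roughly 250 million procedures a year.

```python
def annual_deaths(procedures, mortality_low=0.005, mortality_high=0.02):
    """Return (low, high) estimated deaths for an annual procedure count.

    The default mortality rates are the 0.5% and 2.0% bounds quoted in the text.
    """
    return procedures * mortality_low, procedures * mortality_high


def implied_procedures(deaths, mortality):
    """Procedure count that would yield `deaths` at a given mortality rate."""
    return deaths / mortality


# Lower bound on worldwide surgical volume quoted in the text.
low, high = annual_deaths(230_000_000)
print(f"Deaths per year: {low:,.0f} to {high:,.0f}")

# Assumption: morbidity is 5x the low estimate up to 10x the high estimate.
print(f"Complications per year: {5 * low:,.0f} to {10 * high:,.0f}")

# Volume implied by the rounded 1.25 million figure at 0.5% mortality.
print(f"Implied procedures: {implied_procedures(1_250_000, 0.005):,.0f}")
```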
Evidence of the impact of poor surgical outcomes is growing. However, we appear to be failing to accurately identify high-risk surgical patients and allocate them to an appropriate level of perioperative care. Recent UK data indicate that only a minority of high-risk patients are admitted directly to critical care after surgery, and that many postoperative deaths occur following delayed admission to critical care with initial treatment on a standard surgical ward.1,2 Precise evaluation of an individual's risk of death or complications is fundamental to the process of improving postoperative outcomes, but remains an elusive goal.12 Such information would allow more effective allocation of resources and targeting of specific interventions to those patients most likely to benefit. Potentially beneficial strategies include preoperative specialist medical review, specific perioperative interventions such as β-blockade or flow-guided fluid therapy, and elective critical care admission. Greater precision in the assessment of risk would also improve the quality of information that can be given to patients, many of whom simply rely on the guidance of surgeons and anaesthetists when consenting to surgery. Objective, accurate information balancing the risks and benefits of surgery should improve the quality, and experience, of patient decision-making.
Although a variety of methods are currently used to provide objective assessment of perioperative risk, none has yet been subjected to the type of robust evaluation required for routine clinical use. Cohort studies are a valuable and commonly used method of evaluating prognostic tools, but when the test under investigation is allowed to influence treatment decisions, ‘confounding by indication’ may complicate interpretation, particularly in unblinded studies.13 In other words, if clinicians are aware that a given result may indicate a poor prognosis, it is likely that they will alter their practice accordingly, with corresponding effects on outcome. A prognostic test might precisely and reliably identify patients at high risk who are then appropriately admitted to critical care after surgery. If one accepts the assumption that critical care may improve outcome when compared with ward care, then the prognostic value of the test will be underestimated in this situation. Importantly, this methodological problem distorts not only the measurement of predictive accuracy but also the optimal threshold values for a given test used to categorize patients into high- and low-risk groups.
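The underestimation described above can be illustrated with a minimal simulation. Every number here is a hypothetical assumption chosen for illustration, not data from any study: 15% of patients test positive, untreated mortality is 20% in test-positive and 2% in test-negative patients, and critical care is assumed to halve mortality. The test is assumed to classify risk perfectly, so any fall in its apparent prognostic value arises only because the result changes treatment.

```python
import random


def observed_risk_ratio(n, unblinded, seed=0, p_positive=0.15,
                        risk_positive=0.20, risk_negative=0.02,
                        critical_care_effect=0.5):
    """Simulate a cohort; return mortality risk ratio (test-positive / test-negative).

    All parameter values are hypothetical assumptions for illustration.
    """
    rng = random.Random(seed)
    deaths = {True: 0, False: 0}
    counts = {True: 0, False: 0}
    for _ in range(n):
        test_positive = rng.random() < p_positive
        risk = risk_positive if test_positive else risk_negative
        # In an unblinded cohort, a positive result triggers critical care
        # admission, which is assumed to lower that patient's mortality.
        if unblinded and test_positive:
            risk *= critical_care_effect
        counts[test_positive] += 1
        if rng.random() < risk:
            deaths[test_positive] += 1
    return (deaths[True] / counts[True]) / (deaths[False] / counts[False])


rr_blinded = observed_risk_ratio(100_000, unblinded=False)
rr_unblinded = observed_risk_ratio(100_000, unblinded=True)
print(f"Risk ratio, clinicians blinded:   {rr_blinded:.1f}")
print(f"Risk ratio, clinicians unblinded: {rr_unblinded:.1f}")
```

Under these assumptions, the expected risk ratio falls from about 10 in the blinded cohort to about 5 in the unblinded cohort, although the test itself is unchanged.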
This issue is particularly relevant to the evaluation of cardiopulmonary exercise testing (CPET) as a means of assessing the risks of non-cardiac surgery. This prognostic test involves breath-by-breath expired gas analysis in patients undergoing a graded exercise challenge to derive objective measures of functional cardiorespiratory capacity (anaerobic threshold and peak oxygen consumption). Several studies have reported an association between postoperative mortality and poor cardiorespiratory reserve as determined by CPET.14–21 However, in all of these studies, clinicians were aware of, and their clinical decisions influenced by, the results of CPET. Inevitably, this has obscured the true relationship between CPET-derived measures of risk and patient outcome. Such confounding by indication in cohort studies can be mitigated to some extent by a variety of methods. These include restriction of subjects, adjustment using propensity scores, blinded prospective review, ecological analysis, and the instrumental variable approach.22 Unfortunately, none of these methods was used in the studies cited above. The recent findings of a small, single-centre study, with some clinician blinding of CPET data, hint at a stronger association between low anaerobic threshold and postoperative morbidity than described previously.23 The hypothesis that CPET-derived variables could be used to accurately predict surgical risk is intuitively appealing, but is only now, finally, being robustly tested. In contrast, most of the studies that have evaluated the predictive accuracy of plasma biomarkers such as B-type natriuretic peptide (BNP) have been blinded,24,25 reducing the chance of significant confounding. Nonetheless, even in the case of plasma BNP measurement, large multi-centre trials are required to confirm predictive accuracy and identify the optimal threshold values for allocating patients into high- and low-risk categories.
To confirm the validity of a particular test as a means of assessing risk in individual patients, it is not sufficient simply to demonstrate that poor clinical outcomes are more frequently associated with an abnormal result. Although such association studies can identify promising candidate methods of risk assessment, more rigorous evaluation is required to establish their utility in clinical practice. Further studies must assess predictive accuracy, establish the optimal discriminatory threshold for categorizing patients according to surgical risk, and demonstrate that the test has incremental value over and above existing methods. Where possible, such investigations should be followed by randomized trials to confirm whether the use of the test to triage patients to specific interventions improves outcome. Randomized trials also allow health economic evaluations that can provide powerful arguments for or against implementation. The current evidence base for perioperative risk assessment falls far short of this ideal standard. It is also essential that our evaluation of risk assessment technologies is objective. A lack of equipoise among clinicians can represent a formidable barrier to the robust evaluation of health-care technology. This in turn hampers widespread implementation into routine practice.
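Two of the steps named above, assessing predictive accuracy and choosing a discriminatory threshold, can be sketched as follows. This is one common approach rather than a prescribed method: discrimination is summarized as the area under the receiver operating characteristic curve (AUC), and the threshold is chosen by maximizing Youden's index (sensitivity + specificity − 1). The scores are invented for illustration, and a higher score is assumed to indicate higher risk. Incremental value over existing methods is not addressed here.

```python
def auc(event_scores, nonevent_scores):
    """AUC as the proportion of event/non-event pairs ranked correctly.

    Tied pairs count as half, equivalent to the Mann-Whitney U statistic.
    """
    wins = 0.0
    for e in event_scores:
        for n in nonevent_scores:
            if e > n:
                wins += 1.0
            elif e == n:
                wins += 0.5
    return wins / (len(event_scores) * len(nonevent_scores))


def youden_threshold(event_scores, nonevent_scores):
    """Return (threshold, J) maximizing Youden's index J.

    A patient is classed as high risk when score >= threshold.
    """
    best_threshold, best_j = None, -1.0
    for t in sorted(set(event_scores) | set(nonevent_scores)):
        sensitivity = sum(s >= t for s in event_scores) / len(event_scores)
        specificity = sum(s < t for s in nonevent_scores) / len(nonevent_scores)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_threshold, best_j = t, j
    return best_threshold, best_j


# Hypothetical risk scores, invented for illustration only.
complications = [7, 8, 6, 9, 5]        # patients with a poor outcome
no_complications = [2, 3, 4, 5, 6, 1]  # patients without

area = auc(complications, no_complications)
threshold, j = youden_threshold(complications, no_complications)
print(f"AUC = {area:.3f}; threshold = {threshold}; Youden's J = {j:.3f}")
```

As the preceding discussion of confounding implies, both quantities are only as reliable as the cohort from which they are estimated.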
There is a clear and urgent need for large, well-designed clinical investigations to define the optimal approach to risk assessment before non-cardiac surgery. Until such studies have been performed, we must recognize the important limitations of the available evidence, make best use of currently available tools, and be prepared to adapt our practice as new data become available.