There is no ideal scoring system for the pre-operative assessment of elderly patients needing emergency surgery. Some pre-operative scoring systems provide approximate estimates of mortality risk, but none has been shown to be sufficiently specific for use on individual patients. At present, the Fitness Score has the greatest specificity (80%) but would not be easy to use on all emergency admissions. Post-operative scoring systems such as P-POSSUM probably provide more accurate predictions, but are not useful in pre-operative assessment. Unfortunately, very few studies have revisited old scoring systems or attempted to compare systems to assess which is best. Most articles in this field have instead proposed yet another new system.
The timing of data collection to create risk scores is seldom mentioned in the literature. Not only do physiological values vary during the acute admission, making scores derived from them unreliable, but there is also evidence that including operative findings and post-operative parameters recorded on the ICU improves the accuracy of the prediction. Although a score at initial assessment would help triage and plan treatment, comparative audit with post-operative scores remains the most useful function of scoring systems at present.
Even if accurate pre-operative predictions of outcome were possible by estimation of a risk score, an expert surgical opinion would be required to interpret these predictions at the bedside. An experienced clinician can not only assess prognosis but also weigh up the local facilities available, the patient's quality of life and ethical issues, as well as considering the wishes of the patient or relatives. Scoring will never replace clinical judgement; if a score predicts 75% mortality after surgery, it will still fall to the surgeon to decide whether or not to recommend an operation.
There is some evidence that expert surgical opinion is as accurate as any current pre-operative scoring system, or more so. Hartley and Sagar [49] compared surgical opinion on outcome after surgery with a POSSUM score prediction. They showed that a surgeon's opinion had greater specificity than POSSUM at predicting death (88% vs 64%). Cook et al, in an audit of mortality in the elderly, tried to determine whether clinical judgement was better than scoring [50]. They found that pre-operatively, surgeons or anaesthetists predicted death with a specificity of 89%, which is greater than any of the scores identified in this review. Sensitivity was less good: 46% and 62% respectively. Markus et al found that surgeons tend to underestimate the risk of complications in emergency surgery, but that their clinical judgement was more accurate than P-POSSUM predictions [51]. Hobson et al studied 163 patients needing emergency surgery and compared predictions of 30-day mortality by surgeons, anaesthetists and POSSUM scoring; using ROC curve analysis, they found that clinical predictions were as good as those made by scoring [52].
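The specificity and sensitivity figures quoted above are derived from a two-by-two table of predicted against actual outcome. The following minimal Python sketch shows the arithmetic only; the class name and the counts are hypothetical and are not taken from any of the cited studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictionTable:
    """Two-by-two table of predicted death against actual outcome."""
    true_pos: int   # death predicted, patient died
    false_pos: int  # death predicted, patient survived
    true_neg: int   # survival predicted, patient survived
    false_neg: int  # survival predicted, patient died

    def sensitivity(self) -> float:
        # Proportion of patients who died whose death was predicted.
        return self.true_pos / (self.true_pos + self.false_neg)

    def specificity(self) -> float:
        # Proportion of survivors who were correctly predicted to survive.
        return self.true_neg / (self.true_neg + self.false_pos)


# Invented counts, for illustration only.
table = PredictionTable(true_pos=12, false_pos=9, true_neg=71, false_neg=8)
print(f"sensitivity = {table.sensitivity():.2f}")
print(f"specificity = {table.specificity():.2f}")
```

A predictor with high specificity but modest sensitivity, as in this invented table, rarely predicts death in a patient who goes on to survive but misses a substantial share of the deaths that do occur.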
The specificity of surgical opinion will clearly depend on who the available surgeon is. If a senior surgeon is not available, then a scoring system may provide a better prediction. For that reason, scoring systems should continue to be developed.
Of the pre-operative scoring systems in use, the one which has stood the test of time and is used most, the ASA score, is also the most subjective, relying on the anaesthetist's overall clinical assessment of the patient. The fact that ASA scores vary between observers suggests that it is really an expert clinical assessment of risk and not a score at all. In a study of acutely perforated colorectal cancers in patients with a mean age of 70.5 years, stepwise logistic regression analysis showed that ASA scoring was the only significant pre-operative method of predicting short-term outcome [53].
Scoring systems such as the Reiss Index or Fitness Score can be used pre-operatively if there is time to gain enough data to complete the scoring. In future the speed and accuracy of investigations may allow a pre-operative diagnosis to be established more reliably, making these systems more useful than they are at present.
Scoring systems are generated and validated on specific populations that may be substantially different from the patients being scored in a different hospital. One potential resolution would be for each hospital to create a system specific to its own population, which is regularly re-validated.
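One simple way a hospital might check how well an existing score fits its own population is to compare the number of deaths observed with the number the score predicted. The sketch below is an assumption about how such a re-validation could be run, not a method described in the studies reviewed here; the function name and the data are hypothetical.

```python
from typing import Sequence


def observed_expected_ratio(predicted_risks: Sequence[float],
                            died: Sequence[bool]) -> float:
    """Ratio of observed deaths to deaths expected from the predicted risks.

    A ratio near 1 suggests the score is well calibrated for this
    population; above 1 it underestimates risk, below 1 it overestimates.
    """
    if len(predicted_risks) != len(died):
        raise ValueError("one outcome is needed for each predicted risk")
    expected = sum(predicted_risks)  # sum of individual predicted probabilities
    if expected == 0:
        raise ValueError("expected number of deaths is zero")
    observed = sum(1 for outcome in died if outcome)
    return observed / expected


# Invented local audit data, for illustration only.
risks = [0.10, 0.40, 0.75, 0.25, 0.50]
outcomes = [False, False, True, False, True]
ratio = observed_expected_ratio(risks, outcomes)
print(f"observed/expected = {ratio:.2f}")
```

In practice such a check would need far more patients than this toy example and would be repeated at regular intervals, in line with the re-validation suggested above.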
One of the most accurate scoring systems used in elective cardio-thoracic surgery is the EUROSCORE [54]. It was developed from data on more than 19,000 operations in 128 centres. Because the surgery is elective and the variables most associated with mortality risk are clear, a high level of accuracy is possible and the score has been validated in North America and Europe [55]. Abdominal emergencies in the elderly are never going to be as predictable, and we must expect greater regional variation in this branch of surgery.