To systematically assess adherence of randomised trials in surgery to Consolidated Standards of Reporting Trials (CONSORT) guidelines for non-pharmacological treatments (NPT). Surgical trials are considered more difficult to design and execute than pharmacological trials. Furthermore, the original CONSORT statement does not address some aspects that are vital to the transparent reporting of surgical trials. The CONSORT-NPT extension was designed to address these issues but adherence in medical and surgical journals has not been assessed.
We identified eight general medical and eight surgical journals, indexed in PubMed and published in 2011, with the highest impact factors in their respective categories.
Adherence to CONSORT statement and CONSORT-NPT extension items.
We identified 54 surgical trials (22 published in medical journals and 32 in surgical journals). There were eight items for which there was less than 30% overall compliance (seven were specific to the CONSORT-NPT extension). These seven items are related to: a full description of the care providers, centres and blinding status in the abstract (n=7/54, 13%), eligibility criteria for centres performing the interventions (n=13/54, 24%), how adherence of care providers with the protocol was assessed or enhanced (n=7/54, 13%), how clustering by care providers or centres was addressed as it relates to sample size (n=3/54, 6%), how care providers were allocated to each group (n=9/54, 17%), how clustering by care providers or centres was addressed as it relates to statistical methods (n=2/54, 4%), a description of care providers (case volume, qualification, expertise, etc) and centres (volume) in each group (n=0/54, 0%).
Adherence of surgical trials to CONSORT-NPT extension items is much poorer than to the standard CONSORT statement. Adherence also appears to be superior in general medical journals compared with surgical journals. Raising awareness and conducting qualitative research to identify areas for specific intervention will be important going forward.
To develop a sensitive, reliable tool for enumerating and evaluating technical process imperfections during surgical operations.
Prospective cohort study with direct observation.
Operating theatres on five sites in three National Health Service Trusts.
Staff taking part in elective and emergency surgical procedures in orthopaedics, trauma, vascular and plastic surgery; including anaesthetists, surgeons, nurses and operating department practitioners.
Reliability and validity of the glitch count method; frequency, type, temporal pattern and rate of glitches in relation to site and surgical specialty.
The glitch count has construct and face validity, and category agreement between observers is good (κ=0.7). Redundancy between pairs of observers significantly improves the sensitivity over a single observation. In total, 429 operations were observed and 5742 glitches were recorded (mean 14 per operation, range 0–83). Specialty-specific glitch rates varied from 6.9 to 8.3/h of operating (ns). The distribution of glitch categories was strikingly similar across specialties, with distractions the commonest type in all cases. The difference in glitch rate between specialty teams operating at different sites was larger than that between specialties (range 6.3–10.5/h, p<0.001). Forty per cent of glitches occurred in the first quarter of an operation, and only 10% occurred in the final quarter.
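Agreement between two observers on categories is conventionally summarised with Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below shows that calculation on ten invented glitch classifications. The category labels and ratings are hypothetical, not study data, and the abstract does not state which kappa variant was used.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's category frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical classifications of the same ten glitches by two observers.
observer_a = ["distraction", "distraction", "equipment", "distraction", "planning",
              "equipment", "distraction", "planning", "equipment", "distraction"]
observer_b = ["distraction", "distraction", "equipment", "planning", "planning",
              "equipment", "distraction", "planning", "distraction", "distraction"]

kappa = cohens_kappa(observer_a, observer_b)
print(f"kappa = {kappa:.2f}")
```

In this invented example the observers agree on 8 of 10 items, but chance agreement is 0.37, so kappa is lower than the raw agreement of 0.8.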
The glitch method allows collection of a rich dataset suitable for analysing the changes following interventions to improve process safety, and appears reliable and sensitive. Glitches occur more frequently in the early stages of an operation. Hospital environment, culture and work systems may influence the operative process more strongly than the specialty.
SURGERY; patient safety; quality improvement; process of care
Critics of systematic reviews have argued that these studies often fail to inform clinical decision making because their results are far too general or the data too sparse for findings to be applied to individual patients or to other decision making. While there is some consensus on methods for investigating statistical and methodological heterogeneity, little attention has been paid to clinical aspects of heterogeneity. Clinical heterogeneity, or true effect heterogeneity, can be defined as variability among studies in the participants, the types or timing of outcome measurements, and the intervention characteristics. The objective of this project was to develop recommendations for investigating clinical heterogeneity in systematic reviews.
We used a modified Delphi technique with three phases: (1) pre-meeting item generation; (2) face-to-face consensus meeting in the form of a modified Delphi process; and (3) post-meeting feedback. We identified and invited potential participants with expertise in systematic review methodology, systematic review reporting, or statistical aspects of meta-analyses, or those who published papers on clinical heterogeneity.
Between April and June of 2011, we conducted phone calls with participants. In June 2011 we held the face-to-face focus group meeting in Ann Arbor, Michigan. First, we agreed upon a definition of clinical heterogeneity: Variations in the treatment effect that are due to differences in clinically related characteristics. Next, we discussed and generated recommendations in the following 12 categories related to investigating clinical heterogeneity: the systematic review team, planning investigations, rationale for choice of variables, types of clinical variables, the role of statistical heterogeneity, the use of plotting and visual aids, dealing with outlier studies, the number of investigations or variables, the role of the best evidence synthesis, types of statistical methods, the interpretation of findings, and reporting.
Clinical heterogeneity is common in systematic reviews. Our recommendations can help guide systematic reviewers in conducting valid and reliable investigations of clinical heterogeneity. Findings of these investigations may allow for increased applicability of findings of systematic reviews to the management of individual patients.
Bold claims have been made for the ability of the WHO surgical checklist to reduce surgical morbidity and mortality and improve patient safety regardless of the setting. Little is known about how far the challenges faced by low-income countries are the same as those in high-income countries or different. We aimed to identify and compare the influences on checklist implementation and compliance in the UK and Africa.
Ethnographic study involving observations, interviews and collection of documents. Thematic analysis of the data.
Operating theatres in one African university hospital and two UK university hospitals.
112 h of observations were undertaken. Interviews with 39 theatre and administrative staff were conducted.
Many staff saw value in the checklist in the UK and African hospitals. Some resentment was present in all settings, linked to conflicts between the philosophy behind the checklist and the realities of local cultural, social and economic contexts. Compliance—involving use, completeness and fidelity—was considerably higher, though not perfect, in the UK settings. In these hospitals, compliance was supported by established structures and systems, and was not significantly undermined by major resource constraints; the same was not true of the low-income context. Hierarchical relationships were a major barrier to implementation in all settings, but were more marked in the low-income setting. Introducing a checklist in a professional environment characterised by a lack of accountability and transparency could make the staff feel jeopardised legally, professionally, and personally, and it encouraged them to make misleading records of what had actually been done.
Surgical checklist implementation is likely to be optimised, regardless of the setting, when used as a tool in multifaceted cultural and organisational programmes to strengthen patient safety. It cannot be assumed that the introduction of a checklist will automatically lead to improved communication and clinical processes.
HEALTH SERVICES ADMINISTRATION & MANAGEMENT; QUALITATIVE RESEARCH; SURGERY
Atrial fibrillation and delayed gastric emptying (DGE) are common after pancreaticoduodenectomy. Our aim was to investigate a potential relationship between atrial fibrillation and DGE, which we defined as failure to tolerate a regular diet by the 7th postoperative day.
We performed a retrospective chart review of 249 patients who underwent pancreaticoduodenectomy at our institution between 2000 and 2009. Data were analyzed with the Fisher exact test for categorical variables and the Mann-Whitney U test or unpaired t-test for continuous variables.
Approximately 5% of the 249 patients included in the analysis experienced at least one episode of postoperative atrial fibrillation. Median age of patients with atrial fibrillation was 74 years, compared with 66 years in patients without atrial fibrillation (p = 0.0005). Patients with atrial fibrillation were more likely to have a history of atrial fibrillation (p = 0.03). 92% of the patients with atrial fibrillation suffered from DGE, compared to 46% of patients without atrial fibrillation (p = 0.0007). This association held true when controlling for age.
Patients with postoperative atrial fibrillation are more likely to experience delayed gastric emptying. Interventions to manage delayed gastric function might be prudent in patients at high risk for postoperative atrial fibrillation.
Guidelines traditionally focus on the diagnosis and treatment of single diseases. As almost half of the patients with a chronic disease have more than one disease, the applicability of guidelines may be limited. The aim of this study was to assess the extent to which guidelines address comorbidity and to assess the supporting evidence of recommendations related to comorbidity.
We conducted a systematic analysis of evidence-based guidelines focusing on four highly prevalent chronic conditions with a high impact on quality of life: chronic obstructive pulmonary disease, depressive disorder, diabetes mellitus type 2, and osteoarthritis. Data were abstracted from each guideline on the extent that comorbidity was addressed (general comments, specific recommendations), the type of comorbidity discussed (concordant, discordant), and the supporting evidence of the comorbidity-related recommendations (level of evidence, translation of evidence). Of the 20 guidelines, 17 (85%) addressed the issue of comorbidity and 14 (70%) provided specific recommendations on comorbidity. In general, the guidelines included few recommendations on patients with comorbidity (mean 3 recommendations per guideline, range 0 to 26). Of the 59 comorbidity-related recommendations provided, 46 (78%) addressed concordant comorbidities, 8 (14%) discordant comorbidities, and for 5 (8%) the type of comorbidity was not specified. The strength of the supporting evidence was moderate for 25% (15/59) and low for 37% (22/59) of the recommendations. In addition, for 73% (43/59) of the recommendations the evidence was not adequately translated into the guidelines.
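The proportions reported above can be recomputed from the stated counts as a quick consistency check. The sketch below uses only the numerators and denominators given in the text; the dictionary labels are shorthand introduced here for illustration.

```python
# Counts (numerator, denominator) taken from the results as reported.
counts = {
    "guidelines addressing comorbidity": (17, 20),
    "guidelines with specific recommendations": (14, 20),
    "recommendations on concordant comorbidity": (46, 59),
    "recommendations on discordant comorbidity": (8, 59),
    "recommendations with type not specified": (5, 59),
    "recommendations with moderate evidence": (15, 59),
    "recommendations with low evidence": (22, 59),
    "recommendations not adequately translated": (43, 59),
}

# Percentages rounded to the nearest whole number, as in the text.
percentages = {label: round(100 * k / n) for label, (k, n) in counts.items()}

# Mean number of comorbidity-related recommendations per guideline.
mean_recommendations = 59 / 20

for label, pct in percentages.items():
    print(f"{label}: {pct}%")
print(f"mean recommendations per guideline: {mean_recommendations:.2f}")
```

The recomputed values agree with the reported percentages, and 59 recommendations across 20 guidelines gives 2.95, consistent with the reported mean of 3.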
Our study showed that the applicability of current evidence-based guidelines to patients with comorbid conditions is limited. Most guidelines do not provide explicit guidance on treatment of patients with comorbidity, particularly for discordant combinations. Guidelines should be more explicit about the applicability of their recommendations to patients with comorbidity. Future clinical trials should also include patients with the most prevalent combinations of chronic conditions.
Participant reports of their own behaviour are critical for the provision and evaluation of behavioural interventions. Recent developments in brief alcohol intervention trials provide an opportunity to evaluate longstanding concerns that answering questions on behaviour as part of research assessments may inadvertently influence it and produce bias. The study objective was to evaluate the size and nature of effects observed in randomized manipulations of the effects of answering questions on drinking behaviour in brief intervention trials.
Multiple methods were used to identify primary studies. Between-group differences in total weekly alcohol consumption, quantity per drinking day and AUDIT scores were evaluated in random effects meta-analyses.
Ten trials were included in this review; two did not provide findings for the quantitative analysis, in which three outcomes were evaluated. Between-group differences were of the magnitude of 13.7 (−0.17 to 27.6) grams of alcohol per week (approximately 1.5 U.K. units or 1 standard U.S. drink) and 1 point (0.1 to 1.9) in AUDIT score. There was no difference in quantity per drinking day.
Answering questions on drinking in brief intervention trials appears to alter subsequent self-reported behaviour. This potentially generates bias by exposing non-intervention control groups to an integral component of the intervention. The effects of brief alcohol interventions may thus have been consistently under-estimated. These findings are relevant to evaluations of any interventions to alter behaviours which involve participant self-report.
Research estimates of inadvertent harm to patients undergoing modern healthcare demonstrate a serious problem. Much attention has been paid to analysis of the causes of error and harm, but researchers have typically focussed either on human interaction and communication or on systems design, without fully considering the other components. Existing models for analysing harm are principally derived from theory and the analysis of individual incidents, and their practical value is often limited by the assumption that identifying causal factors automatically suggests solutions. We suggest that new models based on observation are required to help analyse healthcare safety problems and evaluate proposed solutions. We propose such a model which is directed at "microsystem" level (ward and operating theatre), and which frames problems and solutions within three dimensions.
We have developed a new, simple, model of safety in healthcare systems, based on analysis of real problems seen in surgical systems, in which influences on risk at the "microsystem" level are described in terms of only 3 dimensions - technology, system and culture. We used definitions of these terms which are similar or identical to those used elsewhere in the safety literature, and utilised a set of formal empirical and deductive processes to derive the model. The "3D" model assumes that new risks arise in an unpredictable stochastic manner, and that the three defined dimensions are interactive, in an unconstrained fashion. We illustrated testing of the model, using analysis of a small number of incidents in a surgical environment for which we had detailed prospective observational data.
The model appeared to provide useful explanation and categorisation of real events. We made predictions based on the model, which are experimentally verifiable, and propose further work to test and refine it.
We suggest that, if calibrated by application to a large incident dataset, the 3D model could form the basis for a quantitative statistical method for estimating risk at microsystem levels in many acute healthcare settings.
Patient safety; surgery; medical error; theory, system; culture
Background and Objectives
Asthma and depression are common health problems in primary care. Evidence of a relationship between asthma and depression is conflicting. Objectives: to determine (1) the incidence rate and incidence rate ratio of depression in primary care patients with asthma compared to those without asthma, and (2) the standardized mortality ratio of depressed compared to non-depressed patients with asthma.
A historical cohort and nested case-control study using data derived from the United Kingdom General Practice Research Database. Participants: 11,275 incident cases of asthma recorded between 1/1/95 and 31/12/96, matched by age, sex and practice with non-cases from the database (ratio 1∶1) and followed up through the database for 10 years. 1,660 cases were matched by date of asthma diagnosis with 1,660 controls. Main outcome measures: number of cases diagnosed with depression and number of deaths over the study period.
The rate of depression in patients with asthma was 22.4/1,000 person years and without asthma 13.8/1,000 person years. The incidence rate ratio (adjusted for age, sex, practice, diabetes, cardiovascular disease, cerebrovascular disease, smoking) was 1.59 (95% CI 1.48–1.71). The increased rate of depression was not associated with asthma severity or oral corticosteroid use. It was associated with the number of consultations (odds ratio per visit 1.09; 95% CI 1.07–1.11). The age and sex adjusted standardized mortality ratio for depressed patients with asthma was 1.87 (95% CI: 1.54–2.27).
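For orientation, the crude (unadjusted) incidence rate ratio can be derived directly from the two reported rates. The sketch below performs only that arithmetic. It does not reproduce the adjusted analysis, and because the inputs are rounded rates the crude figure is approximate.

```python
# Reported depression rates per 1,000 person years.
rate_asthma = 22.4
rate_no_asthma = 13.8

# Crude incidence rate ratio: rate in the exposed divided by rate in the unexposed.
crude_irr = rate_asthma / rate_no_asthma

# Absolute rate difference per 1,000 person years.
rate_difference = rate_asthma - rate_no_asthma

print(f"crude IRR = {crude_irr:.2f}")
print(f"rate difference = {rate_difference:.1f} per 1,000 person years")
```

The crude ratio of about 1.62 is slightly higher than the reported adjusted ratio of 1.59.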
Asthma is associated with depression. This was not related to asthma severity or oral corticosteroid use but was related to service use. This suggests that a diagnosis of depression is related to health seeking behavior in patients with asthma. There is an increased mortality rate in depressed patients with asthma. The cause of this needs further exploration. Consideration should be given to case-finding for depression in this population.
Critical incident audit and feedback are recommended interventions to improve the quality of obstetric care. To evaluate the effect of audit at district level in Thyolo, Malawi, we assessed the incidence of facility-based severe maternal complications (severe acute maternal morbidity (SAMM) and maternal mortality) during two years of audit and feedback.
Between September 2007 and September 2009, we included all cases of maternal mortality and SAMM that occurred in Thyolo District Hospital, the main referral facility in the area, using validated disease-specific criteria. During two- to three-weekly audit sessions, health workers and managers identified substandard care factors. Resulting recommendations were implemented and followed up. Feedback was given during subsequent sessions. A linear regression analysis was performed on facility-based severe maternal complications. During the two-year study period, 386 women were included: 46 died and 340 sustained SAMM, giving a case fatality rate of 11.9%. Forty-five cases out of the 386 inclusions were audited in plenary with hospital staff. There was a reduction of 3.1 women with severe maternal complications per 1000 deliveries in the district health facilities, from 13.5 per 1000 deliveries in the beginning to 10.4 per 1000 deliveries at the end of the study period. The incidence of uterine rupture and major obstetric hemorrhage reduced considerably (from 3.5 to 0.2 and from 5.9 to 2.6 per 1000 facility deliveries respectively).
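The case fatality rate and the absolute reduction quoted above follow from simple arithmetic on the reported figures. The sketch below retraces it using only numbers stated in the text; the variable names are introduced here for illustration.

```python
# Reported counts over the two-year study period.
deaths = 46
samm_cases = 340
total_cases = deaths + samm_cases

# Case fatality rate: deaths as a percentage of all severe maternal complications.
case_fatality_rate = 100 * deaths / total_cases

# Reported incidence of severe maternal complications per 1000 deliveries.
incidence_start = 13.5
incidence_end = 10.4
absolute_reduction = incidence_start - incidence_end

print(f"total cases = {total_cases}")
print(f"case fatality rate = {case_fatality_rate:.1f}%")
print(f"absolute reduction = {absolute_reduction:.1f} per 1000 deliveries")
```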
Our findings indicate that audit and feedback have the potential to reduce serious maternal complications including maternal mortality. Complications like major hemorrhage and uterine rupture that require relatively straightforward intrapartum emergency management are easier to reduce than those which require uptake of improved antenatal care (eclampsia) or timely intravenous medication or HIV-treatment (peripartum infections).
To determine the effectiveness of an index in increasing recognition of misleading problem framing in articles and manuscripts.
A propaganda index consisting of 32 items was developed drawing on related literature. Seventeen subjects who review manuscripts for possible publication were requested to read five recent published reports of randomized controlled trials concerning social anxiety and to identify indicators of propaganda (defined as encouraging beliefs and actions with the least thought possible). They then re-read the same five articles using a propaganda index to note instances of propaganda.
Convenience sample of individuals who review manuscripts for possible publication and sample of recent published reports of randomized controlled trials regarding social anxiety in five different journals by different authors, blinded by author and journal.
Data showed that there was a high rate of propagandistic problem framing in reports of RCTs regarding social anxiety such as hiding well argued alternative views and vagueness. This occurred in 117 out of 160 opportunities over five research reports. A convenience sample of 17 academics spotted only 4.5 percent of propaganda indicators. This increased to 64 percent with use of the 32 item propaganda index. Use of a propaganda index increased recognition of related indicators. However many instances
This propaganda index warrants further exploration as a complement to reporting guidelines such as CONSORT and PRISMA.
Robotic-assisted surgical techniques are not yet well established among surgeon practice groups beyond a few surgical subspecialties. To help identify the facilitators and barriers to their adoption, this belief-elicitation study contextualized and supplemented constructs of the unified theory of acceptance and use of technology (UTAUT) in robotic-assisted surgery. Semi-structured individual interviews were conducted with 21 surgeons comprising two groups: users and nonusers. The main facilitators to adoption were Perceived Usefulness and Facilitating Conditions among both users and nonusers, followed by Attitude Toward Using Technology among users and Extrinsic Motivation among nonusers. The three main barriers to adoption for both users and nonusers were Perceived Ease of Use and Complexity, Perceived Usefulness, and Perceived Behavioral Control. This study's findings can assist surgeons, hospital and medical school administrators, and other policy makers on the proper adoption of robotic-assisted surgery and can guide future research on the development of theories and framing of hypotheses.
Osteosarcoma is the most common malignant primary bone tumour in young adults and is treated by neoadjuvant chemotherapy, surgical tumor removal and adjuvant multidrug chemotherapy. To correct soft tissue defects resulting from surgery and/or tumor treatment, autologous fat grafting has been proposed in plastic and reconstructive surgery.
We report here a case of a late local recurrence of osteosarcoma which occurred 13 years after the initial pathology and 18 months after a lipofilling procedure. Because such recurrence was highly unexpected, we investigated the possible relationship of tumor growth with fat injections and with mesenchymal stem/stromal cell-like cells, which are largely found in fatty tissue. Results obtained in osteosarcoma pre-clinical models show that fat grafts or progenitor cells promoted tumor growth.
These observations and results raise the question of whether autologous fat grafting is a safe reconstructive procedure in a known post-neoplastic context.
With the globalization of clinical trials, large developing nations have substantially increased their participation in multi-site studies. This participation has raised ethical concerns, among them the fear that local customs, habits and culture are not respected when potential participants are asked to take part in a study. This knowledge gap is particularly noticeable among Indian subjects: despite the large number of participants, little is known about the factors that affect their willingness to participate in clinical trials.
We conducted a meta-analysis of all studies evaluating the factors and barriers, from the perspective of potential Indian participants, contributing to their participation in clinical trials. We searched both international and Indian-specific bibliographic databases, including PubMed, Cochrane, Openjgate, MedInd, Scirus and Medknow, and also performed hand searches and contacted authors to obtain additional references. We included studies dealing exclusively with the participation of Indians in clinical trials. Data extraction was conducted by three researchers, with disagreements resolved by consensus.
Six qualitative studies and one survey evaluating the main themes affecting the participation of Indian subjects were found. Factors favoring participation were personal health benefits, altruism, trust in physicians, a source of extra income, detailed knowledge, and methods for motivating participants. Barriers were mistrust of trial organizations, concerns about the efficacy and safety of trials, psychological reasons, trial burden, loss of confidentiality, dependency issues, and language.
We identified factors that facilitate, and barriers that negatively affect, decisions about trial participation among Indian subjects. Due consideration and weight should be given to these factors when planning future trials in India.
Worldwide distribution of surgical interventions is unequal. Developed countries account for the majority of surgeries, and information about non-cardiac operations in developing countries is scarce. The purpose of our study was to describe the epidemiological data of non-cardiac surgeries performed in Brazil in recent years.
Methods and Findings
This is a retrospective cohort study covering the period from 1995 to 2007. We collected information from DATASUS, a national public health system database. The following variables were studied: number of surgeries, in-hospital expenses, blood transfusion related costs, length of stay and case fatality rates. The results are presented as sums, averages and percentages. Trend analysis was performed with a linear regression model. There were 32,659,513 non-cardiac surgeries performed in Brazil in thirteen years. The number of surgeries increased by 20.42% over this period, and nearly 3 million operations are now performed annually. The cost of these procedures increased by almost 200%. Total expenses related to surgical hospitalizations exceeded $10 billion over the whole period. The yearly cost of surgical procedures to the public health system was more than $1.27 billion for all surgical hospitalizations and, on average, US$445.24 per surgical procedure. The total cost of blood transfusion was nearly $98 million over the whole period, and approximately $10 million was spent annually on perioperative transfusion. Surgical mortality increased by 31.11% over the period; in 2007, surgical mortality in Brazil was 1.77%. All the variables increased significantly over the study period: r² = 0.447 for the number of surgeries (P = 0.012), r² = 0.439 for in-hospital expenses (P = 0.014) and r² = 0.907 for surgical mortality (P = 0.0055).
The volume of surgical procedures in Brazil has increased substantially in recent years. The expenditure and mortality related to these procedures have increased along with the number of operations. Better planning of public health resources and investment strategies is needed to meet the growing demand for surgery in Brazil.
Although randomised trials are widely accepted as the ideal way of obtaining unbiased estimates of treatment effects, some treatments have dramatic effects that are highly unlikely to reflect inadequately controlled biases. We compiled a list of historical examples of such effects and identified the features of convincing inferences about treatment effects from sources other than randomised trials. A unifying principle is the size of the treatment effect (signal) relative to the expected prognosis (noise) of the condition. A treatment effect is inferred most confidently when the signal to noise ratio is large and its timing is rapid compared with the natural course of the condition. For the examples we considered in detail the rate ratio often exceeds 10 and thus is highly unlikely to reflect bias or factors other than a treatment effect. This model may help to reduce controversy about evidence for treatments whose effects are so dramatic that randomised trials are unnecessary.
The relation between a treatment and its effect is sometimes so dramatic that bias can be ruled out as an explanation. Paul Glasziou and colleagues suggest how to determine when observations speak for themselves.
We previously showed that in the absence of a formal emergency system, lay people face a heavy burden of injuries in Kampala, Uganda, and we demonstrated the feasibility of a basic prehospital trauma course for lay people. This study tests the effectiveness of this course and estimates the costs and cost-effectiveness of scaling up this training.
Methods and Findings
For six months, we prospectively followed 307 trainees (police, taxi drivers, and community leaders) who completed a one-day basic prehospital trauma care program in 2008. Cross-sectional surveys and fund of knowledge tests were used to measure their frequency of skill and supply use, reasons for not providing aid, perceived utility of the course and kit, confidence in using skills, and knowledge of first-aid. We then estimated the cost-effectiveness of scaling up the program.
At six months, 188 (62%) of the trainees were followed up. Their knowledge retention remained high or increased. The mean correct score on a basic fund of knowledge test was 92%, up from 86% after initial training (n = 146 pairs, p = 0.0016). Of the participants, 97% had used at least one skill from the course, most commonly haemorrhage control, recovery position and lifting/moving, and 96% had used at least one first-aid item. Lack of knowledge was less of a barrier and trainees were significantly more confident in providing first-aid. Based on cost estimates from the World Health Organization, local injury data, and modelling from previous studies, the projected cost of scaling up this program was $0.12 per capita or $25–75 per life year saved. Key limitations of the study include small sample size, possible reporter bias, preliminary local validation of study instruments, and an indirect estimate of mortality reduction.
Lay first-responders effectively retained knowledge on prehospital trauma care and confidently used their first-aid skills and supplies for at least six months. The costs of scaling up this intervention to cover Kampala are very modest. This may be a cost-effective first step toward developing formal emergency services in Uganda and other resource-constrained settings. Further research is needed in this critical area of trauma care in low-income countries.
Objective To evaluate the effect of comorbidity and other risk factors on postoperative mortality and morbidity in patients undergoing major oesophageal and gastric surgery.
Design Multicentre cohort study with data on postoperative mortality and morbidity in hospital.
Data source and methods The ASCOT prospective database, comprising 2087 patients with newly diagnosed oesophageal and gastric cancer in 24 hospitals in England and Wales between 1 January 1999 and 31 December 2002. Multivariate logistic regression analysis was used to model the risk of death and postoperative complications.
Results 955 patients underwent oesophagectomy or gastrectomy. Of these, 253 (27%) were graded ASA III or IV, and 187 (20%) had a high physiological POSSUM score (≥ 20). Operative mortality was 12% (111/955). Physiological POSSUM score, surgeon's assessment, type of operation, hospital case volume, and tumour stage independently predicted operative mortality. Medical complications were associated with higher physiological POSSUM scores and ASA grade, oesophagectomy or total gastrectomy, thoracotomy, and radical nodal dissection. Stage and additional organ resection predicted surgical (technical) complications.
Conclusions Many patients undergoing surgery for gastro-oesophageal cancer have major comorbid disease, which strongly influences their risk of postoperative death. Technical complications do not seem to be influenced by preoperative factors but reflect the extent of surgery and perhaps surgical judgment. Detailed prospective multicentre cooperative audit, with appropriate risk adjustment, is fundamental in the evaluation of cancer care and must be properly resourced.
To design and validate a statistical method for evaluating the performance of surgical units that adjusts for case volume and case mix.
Validation study using routinely collected data on in-hospital mortality.
Two UK databases, the ASCOT prospective database and the risk scoring collaborative (RISC) database, covering 1042 patients undergoing surgery in 29 hospitals for gastro-oesophageal cancer between 1995 and 2000.
A two level hierarchical logistic regression model was used to adjust each unit's operative mortality for case mix. Crude or adjusted operative mortality was plotted on mortality control charts (a graphical representation of surgical performance) as a function of number of operations. Control limits defined as 90%, 95%, and 99% confidence intervals identified units whose performance diverged significantly from the mean.
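The idea of control limits that narrow as the number of operations grows can be illustrated with a simple binomial sketch. The version below assumes a normal approximation around a fixed mean mortality and uses the 12% mean in-hospital mortality reported in the results. It is not the study's hierarchical model and applies no case mix adjustment; the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def control_limits(mean_rate, n_operations, level):
    """Approximate two-sided control limits for a unit's mortality proportion.

    Assumes a normal approximation to the binomial distribution, which is
    rough for very small case volumes.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sqrt(mean_rate * (1 - mean_rate) / n_operations)
    lower = max(0.0, mean_rate - half_width)
    upper = min(1.0, mean_rate + half_width)
    return lower, upper


def is_divergent(unit_rate, mean_rate, n_operations, level):
    """True if a unit's mortality lies outside the control limits."""
    lower, upper = control_limits(mean_rate, n_operations, level)
    return unit_rate < lower or unit_rate > upper


MEAN_MORTALITY = 0.12  # mean in-hospital mortality reported in the results

# Limits widen sharply as case volume falls.
for n in (5, 20, 50):
    for level in (0.90, 0.95, 0.99):
        low, high = control_limits(MEAN_MORTALITY, n, level)
        print(f"n={n:>2} level={level:.2f} limits=({low:.3f}, {high:.3f})")
```

Under these assumptions, a unit with 30% mortality over 50 operations falls outside the 95% limit, whereas a unit with 50% mortality over only 2 operations does not, which shows why plotting mortality against case volume matters for low-volume units.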
The mean in-hospital mortality was 12% (range 0% to 50%). The case volume of the units ranged from one to 55 cases a year. When crude figures were plotted on the mortality control chart, four units lay outside the 90% control limit, including two outside the 95% limit. When operative mortality was adjusted for risk, three units lay outside the 90% limit and one outside the 95% limit. The model fitted the data well and had adequate discrimination (area under the receiver operating characteristics curve 0.78).
The mortality control chart is an accurate, risk adjusted means of identifying units whose surgical performance, in terms of operative mortality, diverges significantly from the population mean. It gives an early warning of divergent performance. It could be adapted to monitor performance across various specialties.
What is already known on this topic
League tables are an established technique for ranking the performance of organisations such as healthcare providers
Mortality control charts are another way to compare the performance of healthcare providers, particularly for outcomes of surgery
What this study adds
Mortality control charts can be adjusted for case mix and case volume and are better than league tables for monitoring surgical performance
Mortality control charts have a “buffer zone” for indicating divergence from the mean mortality and are particularly useful for specialties with a low volume of surgery