J Thorac Cardiovasc Surg. Author manuscript; available in PMC 2013 November 12.
Published in final edited form as:
PMCID: PMC3824389
NIHMSID: NIHMS526108

An empirically based tool for analyzing morbidity associated with operations for congenital heart disease

Abstract

Objective:

Congenital heart surgery outcomes analysis requires reliable methods of estimating the risk of adverse outcomes. Contemporary methods focus primarily on mortality or rely on expert opinion to estimate morbidity associated with different procedures. We created an objective, empirically based index that reflects statistically estimated risk of morbidity by procedure.

Methods:

Morbidity risk was estimated using data from 62,851 operations in the Society of Thoracic Surgeons Congenital Heart Surgery Database (2002-2008). Model-based estimates with 95% Bayesian credible intervals were calculated for each procedure’s average risk of major complications and average postoperative length of stay. These 2 measures were combined into a composite morbidity score. A total of 140 procedures were assigned scores ranging from 0.1 to 5.0 and sorted into 5 relatively homogeneous categories.

Results:

Model-estimated risk of major complications ranged from 1.0% for simple procedures to 38.2% for truncus arteriosus with interrupted aortic arch repair. Procedure-specific estimates of average postoperative length of stay ranged from 2.9 days for simple procedures to 42.6 days for a combined atrial switch and Rastelli operation. Spearman rank correlation between raw rates of major complication and average postoperative length of stay was 0.82 in procedures with at least 200 cases. Rate of major complications ranged from 3.2% in category 1 to 30.0% in category 5. Aggregate average postoperative length of stay ranged from 6.3 days in category 1 to 34.0 days in category 5.

Conclusions:

Complication rates and postoperative length of stay provide related but not redundant information about morbidity. The Morbidity Scores and Categories provide an objective assessment of risk associated with operations for congenital heart disease, which should facilitate comparison of outcomes across cohorts with differing case mixes.

Contemporary efforts to describe and compare congenital heart surgery outcomes across institutions have evolved to include (1) use of clinical registry data, rather than administrative data from the hospital bill, to evaluate outcomes; (2) use of empiric rather than opinion-based models to adjust for differences in case complexity across institutions; and (3) recognition that focusing solely on in-hospital mortality overlooks the 96% of patients who survive to hospital discharge and the important morbidities that they may experience.1

In 2009, an empirically based tool for analyzing mortality associated with congenital heart surgery was introduced. The Society of Thoracic Surgeons-European Association for Cardiothoracic Surgery (STS-EACTS) Congenital Heart Surgery Mortality Score and Categories are based on analysis of 148 different types of operations performed in 77,294 patients.2 Procedures are assigned to 1 of 5 categories on the basis of a similar risk of in-hospital death. Category 1 has the lowest risk of death, and category 5 has the highest risk of death. In addition, each procedure receives a numeric score ranging from 0.1 to 5.0 that expresses mortality risk on a more continuous scale. The STS-EACTS Mortality Categories are intended to facilitate analysis of outcomes by grouping procedures with similar risk of in-hospital mortality.

Although congenital heart surgery outcomes analyses have traditionally focused on mortality, comprehensive assessment requires attention to other end points. Nonfatal events, such as stroke and renal failure, are major determinants of hospital cost and patients’ health status after surgery. In addition, postprocedure length of hospital stay provides useful direct information about resource use and indirect proxy information about a patient’s condition.3,4 Although such measures are captured in clinical registries, tools for analyzing these end points are lacking.

The goal of the present study was to develop a new system for classifying congenital heart surgery procedures on the basis of their potential for morbidity using empirical data from the STS Congenital Heart Surgery Database (STSCHSD). There were 4 specific objectives:

  • to develop a morbidity metric based on both the occurrence of complications that have a significant and durable impact on patient health and utilization of health care resources;
  • to estimate the average amount of patient morbidity by procedure type;
  • to convert these procedure-specific morbidity estimates into a scale ranging from 0.1 to 5.0 (range was chosen for consistency with the STS-EACTS Mortality Score2); and
  • to group procedures with similar estimated morbidity risk into 5 relatively homogeneous categories that were designed to minimize within-category variation and to serve as a stratification variable that can be used to adjust for case mix when analyzing outcomes and comparing institutions.

The Morbidity metric was developed primarily for the purpose of grouping types of procedures to better describe case mix, as was the STS-EACTS Mortality metric. The intent was not to assess or predict outcomes for an individual patient or surgeon, for which other types of analyses may be used.

MATERIALS AND METHODS

Study Population

The STSCHSD has been described.5 The Duke Clinical Research Institute serves as the data analysis center for STS databases and has an agreement and institutional review board approval to analyze the aggregate deidentified data for research purposes. For this study, operations were included if they took place between January 1, 2002, and December 31, 2008, and were 1 of the 148 types of cardiovascular procedures for which the STS-EACTS Mortality Score is defined.2 Operations performed at centers with no more than 10% missing data for complications, mortality, or postoperative length of stay (PLOS) were eligible for inclusion in the analysis. From eligible centers, individual operations with missing data for complications, mortality, or PLOS were excluded. Of 63,297 potentially eligible operations, 446 individual operations were excluded on the basis of missing data regarding complications (n = 273), PLOS (n = 151), or mortality (n = 22).

Additional inclusion and exclusion criteria were identical to those used for developing the STS-EACTS Mortality Score.2 Only the first operation of each hospital admission was analyzed. The final study population consisted of 62,851 operations classified into 148 procedure types at 68 centers. Results are presented for the subset of 140 procedure types having at least 10 eligible cases (62,819 total operations; 99.9%).

Classification of Multiple-Procedure Operations

Several operations in the analysis represent combinations of 2 or more procedures. These are analyzed as combined procedures because the complexity of the combination is regarded as being different from the complexity of the component procedures when performed in isolation. For each of these combined procedures, unique procedure codes were subsequently assigned in STSCHSD version 3.0. Because all data in this analysis predate version 3.0, classification of multiple-procedure operations in this study follows guidelines set forth previously in development of the STS-EACTS Mortality Score and Categories.2

End Points

Morbidity was quantified for each procedure on the basis of the proportion of patients experiencing major complications and by the average PLOS (Table 1). Major complication was defined as the occurrence of any 1 or more of the 6 complications listed in Table 2. These complications represent definitive outcomes that can be ascertained reliably and that are likely to have significant and durable impact on patient health. The unadjusted rate of major complications is defined as the percent of operations that were associated with the occurrence of 1 or more of the major complications listed in Table 2. PLOS was defined as the number of days from the date of operation to the date of discharge and was determined for all patients, including those who died in-hospital.

TABLE 1
Procedure names, morbidity categories and scores, and data for model development
TABLE 2
Major complications

Analysis

Statistics calculated for each procedure type included the number of eligible operations, the percent of patients experiencing major complications, the 95% binomial confidence interval for the probability of major complications, and the average and interquartile range (25th and 75th percentiles) of PLOS. Model-based estimates of each procedure’s average risk of major complications and average PLOS were calculated by hierarchical modeling and presented along with 95% Bayesian credible intervals (CrIs). Details of these calculations are provided in Appendix 1.

Creation of Morbidity Scores

To facilitate ranking and grouping of procedures, average risk of major complications and average PLOS were combined into a single composite morbidity measure. To account for different measurement scales, the 2 individual measures were rescaled to have the same standard deviation (Appendix 1) and were then summed. The resulting composite morbidity measure was the basis of the proposed Morbidity Scores and Categories. Each procedure was assigned a numeric score ranging from 0.1 to 5.0 (STS Congenital Heart Surgery Morbidity Score). The range was chosen to be the same as the existing STS-EACTS Mortality Score. Scores were assigned by shifting and rescaling the procedure-specific composite morbidity estimates to lie in the interval from 0.1 to 5.0 and then rounding to 1 decimal place.

Creation of Morbidity Categories

Procedures were sorted by increasing estimated morbidity and partitioned into 5 relatively homogeneous categories (STS Congenital Heart Surgery Morbidity Categories). This number of categories was chosen to match the number of STS-EACTS Mortality Categories.2 A computer program was used to search for category cutpoints that were optimal for minimizing within-category variance and maximizing between-category variance of the composite morbidity measure. The relationship between the number of categories and the degree of within-category homogeneity was assessed (Figure E1).

FIGURE E1
The relationship between the number of categories and the degree of within-category homogeneity. Within-category homogeneity is defined as 1 minus the ratio of within-category variance to total variance of procedure-specific morbidity estimates.
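The homogeneity measure in the Figure E1 caption can be illustrated with a short calculation. The following is a minimal sketch, not the authors' program: the function name is hypothetical, and weighting each procedure by its sample size is an assumption borrowed from the weighted sum-of-squares criterion in Appendix 1.

```python
def within_category_homogeneity(values, weights, labels):
    """Return 1 - (within-category variation / total variation).

    values  : procedure-specific morbidity estimates
    weights : sample size of each procedure (assumed weighting)
    labels  : category assigned to each procedure
    """
    total_w = sum(weights)
    grand_mean = sum(w * v for v, w in zip(values, weights)) / total_w
    total_ss = sum(w * (v - grand_mean) ** 2 for v, w in zip(values, weights))

    # Weighted mean of each category.
    sums, wsums = {}, {}
    for v, w, k in zip(values, weights, labels):
        sums[k] = sums.get(k, 0.0) + w * v
        wsums[k] = wsums.get(k, 0.0) + w
    means = {k: sums[k] / wsums[k] for k in sums}

    # Variation of each procedure about its own category mean.
    within_ss = sum(
        w * (v - means[k]) ** 2 for v, w, k in zip(values, weights, labels)
    )
    return 1.0 - within_ss / total_ss
```

A value of 1 means every procedure in a category has the same estimated morbidity; a value of 0 means the categories explain none of the variation.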

Sensitivity Analyses

Sensitivity analyses were performed to assess whether the ranking of procedures depended heavily on the choice of statistical methodology. The Spearman rank correlation coefficient was used to quantify the extent to which rankings differed. Differences were also assessed graphically by plotting estimates of the same quantity calculated by 2 different statistical methods.
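For readers who want to reproduce this kind of comparison, the Spearman coefficient can be computed as the Pearson correlation of the ranks. The sketch below is illustrative only; the function names are hypothetical, and assigning average ranks to ties is an assumption, because the handling of ties is not described in the text.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Here `x` and `y` would hold the same procedure-level quantity estimated by 2 different methods, in the same procedure order.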

Assessment of Statistical Reliability

Finally, we estimated the statistical precision (reliability) of the estimated rates of major complication, average PLOS, and composite morbidity. Reliability of a set of estimates is conventionally defined as the proportion of between-unit variation that is explained by true between-unit differences (ie, signal) as opposed to random statistical fluctuations (ie, noise). A mathematically equivalent definition is the squared correlation between a measurement and the true value. In our case, reliability was defined as the squared Pearson correlation between each procedure’s estimated and true amount of morbidity. Reliability could not be calculated directly (because the “true” morbidity values are unknown) but was estimated by hierarchical modeling, as described in Appendix 1.

RESULTS

Sample sizes per procedure ranged from 1 to 4868. The 140 procedures with at least 10 cases are listed in Table 1 along with their sample sizes, raw and model-based morbidity estimates, and Morbidity Scores and Categories.

Model-estimated risk of major complications ranged from 1.0% for atrial septal defect repair to 38.2% for truncus arteriosus with interrupted aortic arch repair. Procedure-specific estimates of average PLOS ranged from 2.9 days for implantable cardioverter defibrillator procedure to 34.6 days for a stage 1 Norwood procedure and 42.6 days for a combined atrial switch and Rastelli procedure for congenitally corrected transposition. The Spearman rank correlation between raw rates of major complication and average PLOS was 0.63 (Figure 1) in procedures with at least 10 cases and 0.82 in procedures with at least 200 cases. This degree of correlation suggests that complication rates and PLOS provide related, but not redundant, information about morbidity.

FIGURE 1
Unadjusted average PLOS (days) and unadjusted rate of major complications (percentage) are measured on the horizontal and vertical axes, respectively. Squares represent the 140 procedure types with at least 10 cases. PLOS, Postoperative length of stay. ...

Procedure-specific overall morbidity was defined as 0.141 × percentage rate of major complications + 0.162 × average PLOS in days. The numbers 0.141 and 0.162 were calculated as the reciprocals of the standard deviations of the percentage rate of major complications and average PLOS, respectively. The STS Morbidity Score was obtained by rescaling this overall morbidity measure to lie in the interval 0.1 to 5.0. Thus, by design it ranged from 0.1 to 5.0. Procedures with the least morbidity (STS Morbidity Score = 0.1) include atrial septal defect repair and implantable cardioverter defibrillator procedures. The procedure with the greatest morbidity (STS Morbidity Score = 5.0) was repair of truncus arteriosus with interrupted aortic arch.
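The arithmetic in this paragraph can be sketched as follows. The weights 0.141 and 0.162 come from the text; the function names and the example inputs are hypothetical, and the linear min-max form of the rescaling is an assumption, since the text says only that the composite was shifted and rescaled to the interval 0.1 to 5.0 and rounded to 1 decimal place.

```python
W_COMPLICATION = 0.141  # weight per percentage point of major complications
W_PLOS = 0.162          # weight per day of average PLOS


def composite_morbidity(complication_pct, mean_plos_days):
    """Overall morbidity as the weighted sum given in the text."""
    return W_COMPLICATION * complication_pct + W_PLOS * mean_plos_days


def morbidity_scores(composites, low=0.1, high=5.0):
    """Linearly map composites onto [low, high] and round to 1 decimal.

    Assumes a linear min-max rescaling and at least 2 distinct values.
    """
    c_min, c_max = min(composites), max(composites)
    span = c_max - c_min
    return [round(low + (high - low) * (c - c_min) / span, 1) for c in composites]


# Hypothetical (complication %, average PLOS in days) pairs, not study data.
examples = [(1.0, 3.0), (10.0, 12.0), (38.2, 30.0)]
composites = [composite_morbidity(p, d) for p, d in examples]
scores = morbidity_scores(composites)
```

Under this assumed rescaling, the procedure with the smallest composite always receives 0.1 and the one with the largest receives 5.0, which matches the statement that the score ranges from 0.1 to 5.0 by design.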

STS Morbidity Categories were obtained by grouping procedures into 5 unequally sized categories (1 = least morbidity, 5 = most morbidity) chosen to be maximally homogeneous with respect to overall morbidity. The numbers of procedures assigned to categories 1, 2, 3, 4, and 5 were 36, 43, 36, 21, and 4, respectively. The rate of major complication ranged from 3.2% in category 1 to 30.0% in category 5. The aggregate average PLOS ranged across categories from 6.3 days in category 1 to 34.0 days in category 5.

Several analyses were performed to address potential methodological concerns with the composite measures used in this analysis. First, we addressed potential issues related to “Major Complication,” which is a composite designating the occurrence of any 1 or more of 6 individual complications. The observed rate of discharge mortality for patients who experienced at least 1 major complication was 23.5% (Table 2) in comparison with 2.0% among patients who experienced none of the major complications. When end points in a composite occur with differing frequencies, the more frequent end points may sometimes dominate.6 As shown in Table 2, the aggregate rates of the individual major complications ranged from 0.8% for “postoperative neurologic deficit persisting at discharge” to 4.7% for “unplanned reoperation.” To verify that each individual complication contributed statistical information but did not dominate the composite, we calculated the Spearman rank correlation coefficient between procedure-specific rates of each individual complication and rates of any major complication. These correlations ranged from 0.37 for heart block to 0.79 for unplanned reoperation. Thus, although unplanned reoperation explained much of the variation in the major complication end point, no single item dominated. All 6 complications contributed statistical information.

Second, we assessed the impact of modifying the list of major complications to include mortality. Although mortality was ultimately excluded, we thought it was important to know whether results would be similar or different had mortality been included. To address this, we calculated 2 versions of the major complication end point (1 including and 1 excluding mortality) and compared them. As shown in Figure 2, the 2 major complication end points were highly correlated but not perfectly related. The rank correlation coefficient between them was 0.97.

FIGURE 2
The proportions (%) of Any Major Complication and of Any Major Complication or Mortality are measured on the horizontal and vertical axes, respectively. The squares represent the 140 procedure types with at least 10 cases.

Third, although morbidity was calculated as an equally weighted combination of complication rate and average PLOS, strong consideration was given to an alternative composite consisting of the rate of major complications and the average time on ventilator. Rank correlation between these 2 composite morbidity measures was 0.93, suggesting that the 2 methods tend to give similar, but not completely identical, results. The version using PLOS was preferred in part because PLOS was collected with high (>99.9%) completeness, whereas ventilation time was more than 15% missing. Moreover, during the time period of this study, the STS definition of time on ventilator only included the time until the first extubation and did not include the additional time on ventilator for patients who were subsequently reintubated.

Fourth, we assessed the reliability (ie, statistical precision; see “Materials and Methods”) of the various measures that were used for ranking procedures in this study. For major complications, average PLOS, and the composite morbidity measure, the estimated reliability values were 0.80 (95% CrI, 0.71-0.87), 0.88 (95% CrI, 0.82-0.92), and 0.90 (95% CrI, 0.85-0.94), respectively. Thus, reliability was greatest for composite morbidity, which was the basis of the proposed Morbidity Score and Categories. The estimated reliability of composite morbidity increased to 0.95 (95% CrI, 0.92-0.97) when considering only procedures with at least 30 cases (N = 115 procedures) and to 0.99 (95% CrI, 0.98-0.99) when considering only procedures with at least 200 cases (N = 67 procedures).

Finally, we assessed the degree of association between the proposed Morbidity Score and the existing STS-EACTS Mortality Score. A weak association would suggest poor content validity because, conceptually, we know that morbidity and mortality are closely related. On the other hand, a perfect association would suggest that the morbidity score is redundant with mortality and thus is unneeded. To address these issues, the proposed Morbidity Score and the STS-EACTS Mortality Score were plotted and compared. As shown in Figure 3, they are closely related (rank correlation = 0.79), but far from being redundant.

FIGURE 3
The relationship between the STS-EACTS Mortality Score2 and the STS Morbidity Score. Squares represent the 140 procedure types with at least 10 cases.

Descriptive characteristics of the 5 Morbidity categories are shown in Table 3. The association between Morbidity Categories and Mortality Categories is summarized in Table 4. The Morbidity and Mortality Categories were identical for 74 procedures, differed by 1 or fewer positions for 135 procedures, and differed by 2 or fewer positions for 139 procedures. One procedure (pulmonary artery debanding) was in category 4 for mortality but category 1 for morbidity.

TABLE 3
Summary of morbidity categories
TABLE 4
Association between morbidity categories and mortality categories

DISCUSSION

Measuring morbidity is a challenging but important element of outcomes reporting and quality assessment.7,8 Morbidity is a major determinant of health status after surgery and of hospital cost.3,4,8 The importance of developing a morbidity metric was articulated in 2004 by Philippe Kolh,9 who described quantitation of morbidity in cardiac surgery as follows: “Being more frequent than mortality, it could carry more information and be measured in terms of postoperative complications and length of hospital stay…. Furthermore, because of the heterogeneity of morbidity events, future scoring systems should probably generate separate predictions for mortality and major morbidity events.” In this report, we introduce an empirically derived tool that estimates the relative risk of morbidity associated with congenital heart surgery procedures on the basis of elements of both complications and PLOS.

Formal risk modeling using logistic regression is practical for common “adult cardiac procedures,” such as coronary artery bypass grafting and valve replacement. No operation for congenital heart disease is performed in numbers comparable to coronary artery bypass grafting. The diverse spectrum of distinct procedures is reflected by 140 procedure types in this study. Bayesian modeling is a particularly appropriate tool in this setting where denominators may be small. Thus, the product of this analysis is not a series of procedure-specific risk models, but rather a metric of procedure-based estimates of morbidity that can be used to describe case mix.

Composite Development

At the outset, we appreciated the importance of including both a complications element and a resource utilization element in a morbidity metric. We felt obligated, however, to demonstrate that use of either alone would be inadequate; that is, a model that assumed a direct 1-to-1 relationship between major complications and PLOS would not fit the data as well and would therefore yield an incomplete and less informative morbidity metric.

Resource utilization variables used in previous analyses were considered. Analysis based on inclusion of ventilation time is described earlier in this article. Previous analyses from individual institutions have included length of intensive care unit (ICU) stay.10,11 It is less useful at a multi-institutional level because of lack of a uniform definition of ICU, and because some institutions keep postoperative patients in an ICU environment until discharge. Cost is another measure of resource utilization that may be associated with morbidity. Cost data are not included in STS registries; furthermore, true cost data can be difficult to estimate.

Individual elements of the complication end point were considered on the basis of their potential impact on patients’ health status, including durable, long-lasting effects. We acknowledge that validated data describing relationships between some individual complications and late health status are not readily available. Some complications that are not included, such as Postoperative Cardiac Arrest, have been evaluated in other studies and shown to potentially be associated with mortality.12 Although all complication codes in the STSCHSD have had corresponding definitions since 2006, the coding of complications such as Postoperative Cardiac Arrest may still be subject to a degree of interpretation, and thus potentially variable ascertainment, in contrast to complications included in our list, such as postoperative mechanical circulatory support. Still other complications, such as sternal dehiscence and mediastinitis, are not included in this metric because they result in unplanned reoperations, and thus are accounted for in the composite. A variety of other complications that are not counted in the major complications end point are likely to be reflected in increased PLOS. Although some have argued that mortality is the “ultimate morbidity,” the decision to exclude in-hospital mortality from the morbidity metric was deliberate, based on several principles. First, the Morbidity Scores and Categories are designed to be used in conjunction with the STS-EACTS Mortality Scores and Categories. Second, analyses of potential associations between morbidity and mortality require the use of separate metrics for each.9,13

Our analysis confirms that morbidity and mortality indices provide related but not redundant information. Outcomes assessment should include measures of both, as suggested by Kolh.9 The decision to include patients who died before discharge may seem obvious, but it was debated. An alternative strategy that eliminates from analysis those who died before discharge would have resulted in an incomplete and potentially misleading picture of morbidity, and would have compromised any possible efforts to explore relationships between morbidity and mortality. The concept of “Failure to Rescue,” that is, probability of death following the occurrence of a complication or adverse event, is emerging as a potentially important tool for measuring performance and directing quality improvement initiatives.14,15 Quantitative estimation of morbidity only among hospital survivors would overlook the importance of this concept.

Widely used systems for stratifying risk or complexity of congenital heart surgery procedures have focused entirely on in-hospital mortality16 or included a morbidity element that was based on expert opinion rather than objective data.17 The Morbidity Score and Categories presented in the current study are complementary to the empirically based STS-EACTS Mortality Score and Categories2 and were derived using data in the largest congenital heart surgery registry.

Study Limitations

Despite the advantages of an empirically based tool for analyzing morbidity, this study has important limitations. First, the analysis focuses on estimation of morbidity at the procedure level. We did not address methods of incorporating these procedural variables into statistical models for performing inter-institutional outcomes comparisons. Nor does our methodology address the appropriateness and timing of individual procedures in relation to overall disease management or include consideration of patient-specific risk factors. Second, despite a large database size, it is possible that patients and data in the STSCHSD are not entirely representative of other populations. Third, several individual procedures had small sample sizes, and the true morbidity associated with these procedures may have been estimated with error. We attempted to minimize this error by using a composite measure that combines statistical information from several related end points into a single end point. Fourth, both the occurrence and the impact of morbidity extend beyond the duration of the surgical hospital admission. The nature of the STSCHSD precludes inclusion of complications recognized or therapeutic interventions occurring after discharge from the “surgical admission.”18 A “long-term database” will ultimately be needed to achieve a more comprehensive estimation of morbidity associated with surgery for congenital heart disease. This study represents an important first step and acknowledges the need for a quantitative morbidity metric and the mandate that it be empirically derived.

CONCLUSIONS

The STS Morbidity Score and Categories is a tool for analyzing morbidity associated with operations for congenital heart disease and for grouping procedures with similar empirically estimated risk of morbidity. Together with the STS-EACTS Mortality Score and Categories, this tool enhances our ability to accurately characterize case mix. It should add a new dimension and precision to outcome assessments and may provide important information to guide quality-improvement initiatives.

Abbreviations and Acronyms

CrI = credible interval
ICU = intensive care unit
PLOS = postoperative length of stay
STSCHSD = Society of Thoracic Surgeons Congenital Heart Surgery Database
STS-EACTS = Society of Thoracic Surgeons-European Association for Cardiothoracic Surgery

APPENDIX 1. STATISTICAL APPENDIX

Statistical Model

A bivariate hierarchical model with normally distributed random effects was used to estimate the distribution of procedure-specific probabilities of major complication and average PLOS. For the i-th patient undergoing the j-th procedure, let $y_{ji}$ denote the occurrence of major complication (0 = no, 1 = yes) and $x_{ji}$ denote the patient’s PLOS. The model was as follows:

equation M1

equation M2

equation M3

where $\pi_j$ denotes the unknown theoretic probability of major complication for the j-th procedure; $\mu_j$ and $\sigma_j^2$ denote the unknown mean and variance of PLOS for the j-th procedure; and $\mu$, $\Sigma$ denote unknown parameters of the assumed bivariate normal random effects distribution.

Estimation

Model parameters were estimated in a Bayesian statistical framework by specifying a prior probability distribution for unknown parameters $\mu$, $\Sigma$, and $\sigma_j^2$. Because our prior knowledge was limited, we specified a vague proper prior distribution that consisted of independent normal distributions for the elements of $\mu$, independent inverse Gamma distributions for the $\sigma_j^2$s, and an inverse Wishart distribution for $\Sigma$. Posterior means and CrIs were calculated using Markov Chain Monte Carlo (MCMC) simulations as implemented in WinBUGS version 1.4 software (Medical Research Council Biostatistics Unit, Cambridge, UK, and the Imperial College School of Medicine at St Mary’s, London, UK).19 Posterior summaries were based on 4000 sets of simulated parameter values generated after a long burn-in period to ensure convergence.

Composite Morbidity

The overall composite morbidity of the j-th procedure was defined as follows:

equation M7

where

equation M8

The parameter $\theta_j$ was estimated as $\hat{\theta}_j = \frac{1}{4000}\sum_{l=1}^{4000} \theta_j^{(l)}$, where $\theta_j^{(l)}$ denotes the simulated value of $\theta_j$ at the l-th iteration of the MCMC procedure. A 95% Bayesian CrI for $\theta_j$ was obtained by calculating the 100th lowest and 100th highest values of $\theta_j^{(l)}$ across the 4000 simulated values.
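The posterior summaries described here (a mean over the simulated values, plus the 100th lowest and 100th highest of 4000 values as the 95% interval limits) can be sketched in a few lines. The function name is hypothetical, and generalizing "100 of 4000" to a 2.5% tail for other sample counts is an assumption.

```python
def posterior_summary(draws, tail=0.025):
    """Posterior mean and equal-tailed interval from simulated values.

    With 4000 draws and tail=0.025, the limits are the 100th lowest and
    100th highest values, as described in the text.
    """
    ordered = sorted(draws)
    n = len(ordered)
    k = int(round(tail * n))
    if k < 1:
        raise ValueError("too few draws for the requested tail probability")
    mean = sum(ordered) / n
    lower = ordered[k - 1]  # k-th lowest value
    upper = ordered[n - k]  # k-th highest value
    return mean, lower, upper
```

For example, applied to the 4000 simulated values of one procedure's composite morbidity, this returns the point estimate and the 2 limits of its 95% CrI.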

Creation of Categories

To create internally homogeneous categories, procedures were first sorted in order of increasing estimated morbidity and then grouped by choosing cutpoints that were optimal according to a least squares variance criterion, as described below. We first sorted procedures so that $\hat{\theta}_1 \le \hat{\theta}_2 \le \cdots \le \hat{\theta}_{148}$. Let $K$ denote the number of categories and let $\mathbf{c}_K = (c_1 < c_2 < \cdots < c_{K-1})$ denote a set of category cutpoints that partition the procedures into $K$ groups. The symbol $c_k$ denotes a number between 1 and 148 and represents the index of the highest-morbidity procedure in the k-th category. Also, define $c_0 = 0$ and $c_K = 149$. For any particular choice of $K$ and $\mathbf{c}_K$, within-category homogeneity was measured by the weighted sum-of-squares criterion:

$$\mathrm{WSS}(\mathbf{c}_K) = \sum_{k=1}^{K} \; \sum_{j \,\in\, \text{category } k} n_j \left(\theta_j - \bar{\theta}_k\right)^2$$

where $n_j$ is the number of patients in the denominator for the j-th procedure and $\bar{\theta}_k$ is the average morbidity of all procedures in the k-th category weighted by their respective sample sizes. If the $\theta_j$ were known instead of unknown, then the “optimal” cutpoints could (in theory) be determined by enumerating all possible choices for the $c_k$ and choosing the one that minimizes the WSS. Because the $\theta_j$ are unknown, we instead chose cutpoints that minimize the estimated value of $\mathrm{WSS}(\mathbf{c}_K)$. Specifically, we chose cutpoints that minimize the posterior mean

$$\frac{1}{4000} \sum_{l=1}^{4000} \mathrm{WSS}\!\left(\mathbf{c}_K;\, \theta^{(l)}\right)$$

where $\theta^{(l)}$ is the value of $\theta = (\theta_1, \ldots, \theta_{148})$ on the l-th iteration of the MCMC procedure and $\mathrm{WSS}(\mathbf{c}_K; \theta^{(l)})$ is the criterion evaluated at those values. An unpublished dynamic programming algorithm was used to determine the set of cutpoints that made this quantity a minimum. The WSS criterion gets smaller as $K$, the number of categories, increases. The value $K = 5$ was selected for consistency with the published STS-EACTS mortality categories.
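Because the authors' dynamic programming algorithm is unpublished, the following is only a sketch of one standard way to solve this kind of problem, not their implementation. It finds the cutpoints that minimize the weighted within-category sum of squares for a single sorted vector of morbidity values; using one vector rather than averaging the criterion over all MCMC iterations is a simplifying assumption, and the function name is hypothetical.

```python
def optimal_cutpoints(values, weights, n_categories):
    """Partition sorted values into contiguous categories minimizing WSS.

    values       : morbidity estimates, already sorted in increasing order
    weights      : positive sample sizes n_j, in the same order
    n_categories : number of categories K
    Returns (cuts, wss), where cuts[k] is the 1-based index of the
    highest-morbidity procedure in category k+1.
    """
    n = len(values)
    # Prefix sums of w, w*v and w*v^2 give any segment cost in O(1).
    W, S, Q = [0.0], [0.0], [0.0]
    for v, w in zip(values, weights):
        W.append(W[-1] + w)
        S.append(S[-1] + w * v)
        Q.append(Q[-1] + w * v * v)

    def cost(i, j):
        """Weighted sum of squares of items i..j-1 about their weighted mean."""
        s = S[j] - S[i]
        return (Q[j] - Q[i]) - s * s / (W[j] - W[i])

    inf = float("inf")
    # best[k][j]: minimum WSS when the first j items form k categories.
    best = [[inf] * (n + 1) for _ in range(n_categories + 1)]
    back = [[0] * (n + 1) for _ in range(n_categories + 1)]
    best[0][0] = 0.0
    for k in range(1, n_categories + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                candidate = best[k - 1][i] + cost(i, j)
                if candidate < best[k][j]:
                    best[k][j] = candidate
                    back[k][j] = i

    # Walk the back-pointers to recover the category boundaries.
    cuts, j = [], n
    for k in range(n_categories, 0, -1):
        cuts.append(j)
        j = back[k][j]
    cuts.reverse()
    return cuts, best[n_categories][n]
```

The search costs on the order of K times n squared segment evaluations, which is small for 148 procedures and 5 categories, so exhaustive enumeration of all cutpoint combinations is unnecessary.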

Estimation of Reliability

Reliability is conventionally defined as the proportion of variation in a measure that is due to true between-unit differences (ie, signal) as opposed to random statistical fluctuations (ie, noise). Equivalently, it is the squared correlation between a measurement and the true value. Accordingly, reliability was defined as the square of the Pearson correlation coefficient between the set of procedure-specific estimates equation M15 and the corresponding unknown true values θ1, …, θ148, that is:

equation M16

The quantity ρ2 was estimated by its posterior mean, namely

equation M17

where

equation M18

with equation M19 denoting the value of θj at the l-th MCMC iteration and equation M20 denoting the posterior mean of θj. A 95% CrI for ρ2 was obtained by calculating the 100th smallest and 100th largest values of equation M21 across the 4000 MCMC iterations (ie, the 2.5th and 97.5th percentiles). Analogous calculations were used to estimate reliability for subsets of procedures with at least 10, 30, or 200 cases. An identical approach was also used for estimating the reliability of procedure-specific probability parameters π1, …, π148 and mean parameters μ1, …, μ148.
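The reliability calculation can be illustrated with a short sketch. Because the displayed equations are not recoverable from this text, the code assumes that the per-iteration quantity is the squared Pearson correlation between the vector of posterior means and the vector of θ values drawn at that iteration. The function name `reliability` and the `rank` parameter are hypothetical.

```python
import numpy as np

def reliability(draws, rank=100):
    """Estimate reliability from MCMC output.

    draws: (L, J) array, one row per MCMC iteration, one column per procedure.
    rank:  order statistic used for the interval; 100 of 4000 draws
           corresponds to the 2.5th and 97.5th percentiles.
    Returns (posterior mean of rho^2, (lower, upper) credible limits).
    """
    post_mean = draws.mean(axis=0)  # point estimate for each procedure
    # Squared correlation between the point estimates and each draw.
    rho2 = np.array([np.corrcoef(post_mean, d)[0, 1] ** 2 for d in draws])
    ordered = np.sort(rho2)
    return float(rho2.mean()), (float(ordered[rank - 1]), float(ordered[-rank]))
```

Restricting the calculation to procedures with at least 10, 30, or 200 cases would amount to selecting the corresponding columns of `draws` before calling the function.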

APPENDIX 2. CLASSIFICATION OF MULTIPLE-PROCEDURE OPERATIONS

Several procedures listed in Table 1 are actually combinations of 2 or more procedures. These combinations were previously identified during development of the STS-EACTS Mortality Score and Categories (reference 2). They occur frequently in the STS and EACTS databases, and the complexity of the combination is regarded as being different from the complexity of the component procedures when performed in isolation. For all other operations involving combinations of procedures, the operation was classified according to the most technically complex procedure, as determined by the difficulty component of the 2007 update of the Aristotle Basic Complexity score. The Aristotle Basic Complexity score contains some ties and is not defined for 3 of the procedures listed in Table 1. To deal with undefined or tied Aristotle scores, 6 of the study authors independently ranked the difficulty of each procedure. Undefined or tied Aristotle scores were adjudicated by assigning the operation to the procedure with the highest average ranking determined by the 6 graders. The difficulty rankings were published together with the STS-EACTS Mortality Score and Categories (reference 2). The identical methodology and rankings were used to classify multiple-procedure operations during development of the STS Morbidity Score and Categories.
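The classification rule for operations that are not among the predefined combinations can be sketched as follows. This is a reading of the rule rather than the authors' code: it assumes that a higher average ranking means greater difficulty, and that when any component lacks an Aristotle difficulty score the graders' rankings are applied to all components. The function name, the dictionary layout, and the procedure labels and numbers used to exercise it are hypothetical.

```python
def classify_operation(components, difficulty, avg_rank):
    """Pick the component procedure that classifies a multi-procedure operation.

    components: list of procedure names in the operation.
    difficulty: dict of name -> Aristotle difficulty score, or None if undefined.
    avg_rank:   dict of name -> average ranking by the 6 graders
                (assumption: larger means more difficult).
    """
    scores = [difficulty.get(p) for p in components]
    if all(s is not None for s in scores):
        top = max(scores)
        leaders = [p for p, s in zip(components, scores) if s == top]
        if len(leaders) == 1:
            return leaders[0]  # unique most difficult procedure
        # Otherwise the top score is tied; adjudicate among the tied procedures.
    else:
        # At least one score is undefined; adjudicate among all components.
        leaders = list(components)
    return max(leaders, key=lambda p: avg_rank[p])
```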

APPENDIX 3. RESCALING OF INDIVIDUAL OUTCOME MEASURES (AVERAGE RISK OF MAJOR COMPLICATIONS AND AVERAGE POSTOPERATIVE LENGTH OF STAY)

Procedure-specific complication rates are percentages measured on a scale from 0 to 100, whereas procedure-specific average PLOS is measured in days and has no upper bound. Because these measurement scales differ, we rescaled both measures so that they would have approximately the same standard deviation. This ensures that approximately half of the variation in the composite measure is attributable to complications and half to PLOS. Without rescaling, the amount of variation contributed by each measure would depend on the units used to express it. For example, we would get different results depending on whether complication rates were expressed as percentages or proportions, or whether PLOS was measured in days, weeks, or months. Rescaling therefore prevents either measure from dominating the composite.

For complications, first, we calculated each procedure’s complication rate. Next, we calculated the standard deviation of the set of procedure-specific complication rates. Finally, to obtain a rescaled complication rate, we divided each of the original 140 complication rates by their common standard deviation. The same process was used for standardizing average PLOS.
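A minimal sketch of the rescaling step is shown below. It assumes the sample standard deviation (`ddof=1`); the text does not say which form was used. The function name `rescale` is hypothetical, and the sketch stops at rescaling because this appendix does not describe how the two rescaled measures are combined.

```python
import numpy as np

def rescale(values):
    """Divide procedure-specific values by their common standard deviation.

    Assumption: the sample standard deviation (ddof=1) is used.
    """
    values = np.asarray(values, dtype=float)
    return values / values.std(ddof=1)
```

Because each value is divided by a standard deviation expressed in the same units, the rescaled values are unchanged when the input units change: passing PLOS in weeks rather than days, or complication rates as proportions rather than percentages, returns the same rescaled numbers.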

Footnotes

Supplemental material is available online.

Disclosures: Authors have nothing to disclose with regard to commercial support.

References

1. Jacobs JP, O’Brien SM, Pasquali SK, Jacobs ML, Lacour-Gayet FG, Tchervenkov CI, et al. Variation in outcomes for benchmark operations: an analysis of The Society of Thoracic Surgeons’ Congenital Heart Surgery Database. Ann Thorac Surg. 2011;92:2184–92.
2. O’Brien SM, Clarke DR, Jacobs JP, Jacobs ML, Lacour-Gayet FG, Pizarro C, et al. An empirically based tool for analyzing mortality associated with congenital heart surgery. J Thorac Cardiovasc Surg. 2009;138:1139–53.
3. Mahle WT, Wernovsky G. Long-term developmental outcome of children with complex congenital heart disease. Clin Perinatol. 2001;28:235–47.
4. Pasquali SK, Sun JL, d’Almada P, Jaquiss RD, Lodge AJ, Miller N, et al. Center variation in hospital costs for patients undergoing congenital heart surgery. Circ Cardiovasc Qual Outcomes. 2011;4:306–12.
5. Jacobs ML, Jacobs JP, Franklin RCG, Mavroudis C, Lacour-Gayet F, Tchervenkov CI, et al. Databases for assessing the outcomes of the treatment of patients with congenital and paediatric cardiac disease–the perspective of cardiac surgery. Cardiol Young. 2008;18(Suppl 2):101–15.
6. Ferreira-González I, Permanyer-Miralda G, Busse JW, Bryant DM, Montori VM, Alonso-Coello P, et al. Methodologic discussions for using and interpreting composite endpoints are limited, but still identify major concerns. J Clin Epidemiol. 2007;60:651–7.
7. Jacobs JP, Jacobs ML, Mavroudis C, Maruszewski B, Tchervenkov CI, Lacour-Gayet FG, et al. What is operative morbidity? Defining complications in a surgical registry database: a report from the STS Congenital Database Task Force and the Joint EACTS-STS Congenital Database Committee. Ann Thorac Surg. 2007;84:1416–21.
8. Benavidez OJ, Connor JA, Gauvreau K, Jenkins KJ. The contribution of complications to high resource utilization during congenital heart surgery admissions. Congenit Heart Dis. 2007;2:319–26.
9. Kolh P. Importance of risk stratification models in cardiac surgery. Eur Heart J. 2006;27:768–9.
10. Pagowska-Klimek I, Pychynska-Pokorska M, Krajewski W, Moll JJ. Predictors of long intensive care unit stay following cardiac surgery in children. Eur J Cardiothorac Surg. 2011;40:179–84.
11. Bojan M, Gerelli S, Gioanni S, Pouard P, Vouhé P. Evaluation of a new tool for morbidity assessment in congenital cardiac surgery. Ann Thorac Surg. 2011;92:2200–4.
12. Benavidez OJ, Gauvreau K, Del Nido P, Bacha E, Jenkins KJ. Complications and risk factors for mortality during congenital heart surgery admissions. Ann Thorac Surg. 2007;84:147–55.
13. Ghaferi AA, Birkmeyer JD, Dimick JB. Complications, failure to rescue, and mortality with major inpatient surgery in Medicare patients. Ann Surg. 2009;250:1029–34.
14. Silber JH, Romano PS, Rosen AK, Wang Y, Even-Shoshan O, Volpp KG. Failure-to-rescue: comparing definitions to measure quality of care. Med Care. 2007;45:918–25.
15. Pasquali SK, Li JS, Burstein DS, Sheng S, O’Brien SM, Jacobs ML, et al. The Association of Center Volume with Mortality and Complications in Pediatric Heart Surgery. Pediatrics. 2012;129:e370–6.
16. Jenkins KJ, Gauvreau K, Newburger JW, Spray TL, Moller JH, Iezzoni LI. Consensus-based method for risk adjustment for surgery for congenital heart disease. J Thorac Cardiovasc Surg. 2002;123:110–8.
17. Lacour-Gayet F, Clarke D, Jacobs J. The Aristotle score: a complexity-adjusted method to evaluate surgical results. Eur J Cardiothorac Surg. 2004;25:911–24.
18. The Society of Thoracic Surgeons Congenital Cardiac Surgery Database Data Collection Form. Available at: http://www.sts.org/sites/default/files/documents/pdf/DataCollectionForm250_07102006_Nonannotated.pdf. Accessed December 14, 2011.
19. Lunn DJ, Thomas A, Best N, Spiegelhalter D. WinBUGS—a Bayesian modelling framework: concepts, structure, and extensibility. Stat Comput. 2000;10:325–37.