From 2005 through 2009, 85 centers (USA and Canada) submitted data to STS–CHSDB, and discharge mortality after index cardiac operations was 4.0% (3,418/86,297). For patients aged <18 years, 85 centers submitted data over the same period, and discharge mortality after index cardiac operations was 4.1% (3,309/81,062). A total of 18,375 index operations at 74 centers were included in the analysis of eight benchmark operations.

Raw Data and Funnel Plots

Overall aggregate and participant-specific results for mortality and PLOS were summarized for each operation. Mortality data are also displayed as funnel plots for the eight benchmark operations. These funnel plots demonstrate that, for the majority of the benchmark operations, very few programs can be classified as outliers for discharge mortality; that is, most programs fall within the 95% prediction limits. For VSD, TOF, and Fontan, no programs are outliers. For AVC, ASO, ASO+VSD, truncus, and Norwood, some participants are outliers. The numbers of outliers (based on two one-sided 0.025-level tests) were: VSD=0, TOF=0, AVC=1, ASO=3, ASO+VSD=1, Fontan=0, Truncus=4, Norwood=11. By design, approximately 5% of participants would be expected to have mortality rates outside the 95% prediction interval even if the true probability of mortality did not vary across centers. For each operation except Norwood, the number of centers falling outside the 95% prediction interval was consistent with the number expected under the null hypothesis of no between-center variation. However, the small number of outliers should not be interpreted as evidence of no between-center variation in mortality. Power for detecting between-center variation for low-complexity operations was minimal, as described below.
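The outlier rule described above can be sketched in a few lines. This is a minimal illustration, assuming exact binomial prediction limits around the aggregate mortality rate; the text does not state how the limits were computed, and the function names and the example center (100 operations, 4% aggregate mortality) are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def prediction_limits(n, p, alpha=0.05):
    """Death counts (lo, hi) with P(X < lo) <= alpha/2 and P(X > hi) <= alpha/2.

    Assumption: exact binomial limits, i.e. two one-sided alpha/2-level tests.
    """
    lo = next(k for k in range(n + 1) if binom_cdf(k, n, p) > alpha / 2)
    hi = next((k for k in range(n + 1) if binom_cdf(k, n, p) >= 1 - alpha / 2), n)
    return lo, hi


def is_outlier(deaths, n, p, alpha=0.05):
    """True if the observed death count falls outside the prediction limits."""
    lo, hi = prediction_limits(n, p, alpha)
    return deaths < lo or deaths > hi


# Hypothetical center: 100 operations, aggregate mortality of 4% (illustrative only).
lo, hi = prediction_limits(100, 0.04)
print(lo, hi, is_outlier(0, 100, 0.04), is_outlier(4, 100, 0.04))
```

Because death counts are discrete, each tail probability is at most 0.025 rather than exactly 0.025, so under the null hypothesis the expected share of centers flagged would be no more than about 5%. For small centers the lower limit can be zero deaths, in which case no low-mortality outlier is possible.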

Feasibility of analyzing between-center variation

The number of cases required to detect a two-fold increase in the mortality rate with at least 50% power ranged from 17 for Norwood to 599 for VSD repair (Table 3). In the Norwood group, 40 participants met this required sample size. (Power to detect a smaller 1.5-fold increase in Norwood mortality was at least 50% for 12 participants and at least 80% for 4 participants.) For procedures other than Norwood, at most 1 participant met the sample size required to detect a doubling of mortality with at least 50% power. Based on these results, between-participant variation in mortality was analyzed with Bayesian methodology only for Norwood. For Bayesian analyses of Norwood, all participants were included regardless of sample size.
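To make the feasibility criterion concrete, the following sketch searches for the smallest case count at which a doubling of mortality would be detected with at least 50% power. It assumes an exact one-sided binomial test at the 0.025 level against a known baseline rate; the text does not specify the test used, so this is not presented as the authors' calculation, and the baseline rates in the example are hypothetical.

```python
from math import exp, lgamma, log


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), computed in log space to avoid overflow."""
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)


def upper_tail(c, n, p):
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(k, n, p) for k in range(c, n + 1))


def critical_value(n, p0, alpha=0.025):
    """Smallest death count c with P(X >= c | p0) <= alpha (n + 1 if unattainable)."""
    for c in range(n + 2):
        if upper_tail(c, n, p0) <= alpha:
            return c


def power(n, p0, fold=2.0, alpha=0.025):
    """Probability of rejecting the baseline rate p0 when the true rate is fold * p0."""
    p1 = min(fold * p0, 1.0)
    return upper_tail(critical_value(n, p0, alpha), n, p1)


def min_cases(p0, fold=2.0, target=0.5, alpha=0.025, n_max=5000):
    """First n at which power reaches the target; None if not reached by n_max."""
    for n in range(1, n_max + 1):
        if power(n, p0, fold, alpha) >= target:
            return n
    return None


# Hypothetical baseline rates (not taken from the data): a high-mortality and a
# low-mortality operation.
print(min_cases(0.20), min_cases(0.02))
```

Power of an exact binomial test is not monotone in the sample size, so `min_cases` returns the first case count that reaches the target rather than a threshold above which power is guaranteed. The qualitative pattern matches the text: the lower the baseline mortality, the more cases a center needs before a doubling becomes detectable.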

| **Table 3** Feasibility of analyzing between-center variation |

The sample size required to detect a doubling of the mean PLOS was five operations (Table 3). Based on these results, between-participant variation in PLOS was analyzed for all operations. All participants were included regardless of sample size.

Bayesian estimation of between-participant variation

Table 4 documents unadjusted and risk-adjusted Bayesian estimates of between-participant variation for mortality and PLOS. The estimated 25th and 75th percentiles for Norwood mortality are 15.5% and 27.0%; that is, we estimate that 25% of participants have a true mortality rate < 15.5% and 75% of participants have a true mortality rate < 27.0%. The estimated minimum and maximum true mortality rates are 7.3% and 47.0%. We estimate that the highest mortality rate is approximately 7-fold higher than the lowest. The 95% PI for the max/min ratio is 3.7–13.9, implying that we are highly confident that there is at least a 3.7-fold difference and no more than a 13.9-fold difference between the highest and lowest participant-specific true mortality rates. The between-center variation in mortality was only marginally attenuated when adjusting for case mix (estimated max/min ratio=6.5; 95% PI: 3.3–13.0). Variation in PLOS was also substantial, with a trend suggesting greater variation for higher-complexity operations. The estimated Gini index for adjusted PLOS ranged from 0.069 (95% PI: 0.056–0.082) for TOF to 0.142 (95% PI: 0.117–0.171) for Norwood.
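The two summary measures of variation used above can be illustrated with a short sketch. It assumes the standard definition of the Gini index (mean absolute difference between all pairs of values, divided by twice the mean); the text does not give the formula, and the participant-specific rates in the example are hypothetical.

```python
def gini(values):
    """Gini index: mean absolute pairwise difference divided by twice the mean.

    0 means every participant has the same value; larger values mean more
    between-participant variation. Assumes positive values.
    """
    n = len(values)
    mean = sum(values) / n
    total_abs_diff = sum(abs(a - b) for a in values for b in values)
    return total_abs_diff / (2 * n * n * mean)


def max_min_ratio(values):
    """Ratio of the highest to the lowest participant-specific value."""
    return max(values) / min(values)


# Hypothetical participant-specific true mortality rates (illustrative only).
rates = [0.08, 0.15, 0.20, 0.27, 0.45]
print(round(gini(rates), 3), round(max_min_ratio(rates), 2))
```

In a Bayesian hierarchical analysis, one plausible approach is to evaluate these functions on each posterior draw of the participant-specific rates and then summarize the resulting values by a point estimate and a 95% probability interval.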

| **Table 4** Results of Bayesian Hierarchical Models |