Most survival prediction models for coronary artery bypass grafting surgery (CABG) are limited to in-hospital or 30-day endpoints. We estimate a long-term survival model using data from the Society of Thoracic Surgeons Adult Cardiac Surgery Database and Centers for Medicare and Medicaid Services (CMS).
The final study cohort included 348,341 isolated CABG patients ≥ 65 years of age, discharged between January 1, 2002 and December 31, 2007 from 917 STS-participating hospitals, randomly divided into training (n = 174,506) and validation (n = 173,835) samples. Through linkage with CMS claims data, we ascertained vital status from date of surgery through December 31, 2008 (1 – 6 year follow-up). Because the proportional hazards assumption was violated, we fit four Cox regression models conditional on being alive at the beginning of the following intervals: 0 – 30 days, 31 – 180 days, 181 days – 2 years, > 2 years. Kaplan-Meier estimated mortality was 3.2% at 30 days, 6.4% at 180 days, 8.1% at one year, and 23.3% at 3 years of follow-up. Harrell's C statistic for predicting overall survival time was 0.732. Some risk factors (e.g., emergency status, shock, reoperation) were strong predictors of short-term outcome but, for early survivors, became non-significant within 2 years. The adverse impact of some other risk factors (e.g., dialysis-dependent renal failure, insulin-dependent diabetes) continued to increase.
Using clinical registry data and longitudinal claims data, we developed a long-term survival prediction model for isolated CABG. This provides valuable information for shared decision-making, comparative effectiveness research, quality improvement, and provider profiling.
Risk-adjusted mortality after coronary artery bypass grafting surgery (CABG) has been the dominant cardiac surgery outcome metric for more than two decades. Ideally, these rates are based on audited clinical data registries such as those maintained by The Society of Thoracic Surgeons (STS), state and federal government agencies, and regional collaboratives. Clinical registries include important preoperative, intraoperative, and postoperative variables that are typically unavailable in administrative data sources. Analyses of risk-adjusted clinical outcomes data from these registries have been used for a variety of quality assessment and improvement activities as well as clinical research.
Despite their many advantages, clinical registries also have an important limitation. Because of cost and other practical barriers, most clinical data registries collect only in-hospital or 30-day postoperative outcomes, including mortality. Ascertainment of longer-term vital status is especially problematic for referral centers whose patients are often returned to the care of their primary doctors in distant cities or states. As many important events occur after the index hospitalization, this limited long-term follow-up is a significant barrier to the optimal utilization of registry data. Particularly as short-term procedural mortality has decreased, longer-term outcomes are of equal or greater relevance to patients, providers, and other stakeholders.
If robust, long-term follow-up data were available, it would enable investigators to study the association of these outcomes with relevant clinical factors (e.g., patient characteristics and disease severity on admission). Longitudinal data would greatly enhance shared decision-making, individualized patient management strategies, the study of long-term efficacy and safety, and comparative effectiveness research.
The American College of Cardiology Foundation (ACCF), The Society of Thoracic Surgeons, and the Duke Clinical Research Institute are collaborating on a comparative effectiveness study (American College of Cardiology Foundation-The Society of Thoracic Surgeons Collaboration on the Comparative Effectiveness of Revascularization sTrategies, or ASCERT) of CABG and percutaneous coronary interventions (PCI), funded by the National Heart, Lung and Blood Institute of the National Institutes of Health1. The first aim of the ASCERT Study is to develop novel, long-term mortality risk prediction models for CABG and PCI. By linking the STS Adult Cardiac Surgery Database (ACSD) and the Centers for Medicare and Medicaid Services (CMS) 100% denominator file2, we developed long-term mortality models that estimate the time-dependent effect of preoperative patient factors on medium and long-term mortality following CABG.
This analysis has been reviewed and approved by the Duke University Health System IRB under protocol number Pro00019987.
The study population consisted of isolated CABG patients at STS-participating hospitals, who were discharged between January 1, 2002 and December 31, 2007, and whose clinical data were collected using STS ACSD version 2.41 and 2.52 data specifications3. Data quality in the STS Database has been shown to be high. In audits of STS data from 12 sites in Iowa conducted by the Iowa Foundation for Medical Care in 2001–2002 (corresponding to the earliest data used for the current study), the overall agreement rate for risk predictors was 96%4. External audits of the entire STS National Database currently include 5% of randomly selected participants annually, and the overall agreement rate for 2009 records (over 70 data elements in each) was 96.1%.
Patients younger than 65 years or having a history of coma were excluded, as were patients with missing data on age, sex or status (elective, urgent, emergent, salvage). For patients with multiple operations in the dataset, only the first operation was included. The final study population included 348,341 patients from 917 STS-participating hospitals (Figure 1).
Procedural records in the STS Database were linked to CMS inpatient claims and denominator databases2, 5. STS and CMS claims records from 2003–2007 were considered to be a match if they agreed exactly on site, sex, admission date, discharge date, date of birth (if present), and age. For 2002, dates of birth, admission, and discharge were coarsened to protect confidentiality, and thus a more complicated matching criterion was required. Records were considered to be a match if they agreed exactly on site, sex, length of stay, procedure month and year, days from birth to admission (if present), age, and days from admission to surgery. Overall, 86.5% of records were collected during 2003–2007 and matched exactly on all available matching criteria. In a validation study of this methodology using heart failure patients from Duke University5, the estimated false match rate was 0% (0/109) when using the most stringent matching criterion and 1% (1/109) when using a less stringent matching rule.
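As an illustration only, the exact-match rule for 2003–2007 records can be sketched in Python. The field names (`site`, `admit_date`, `birth_date`, `id`, and so on) and the decision to leave ambiguous candidates unlinked are assumptions of this sketch, not details of the linkage software used in the study.

```python
from collections import defaultdict

# Hypothetical field names. The text lists the 2003-2007 criteria as site,
# sex, admission date, discharge date, date of birth (if present), and age.
MATCH_KEYS = ("site", "sex", "admit_date", "discharge_date", "age")


def link_records(registry, claims, keys=MATCH_KEYS, optional_key="birth_date"):
    """Return (registry_id, claims_id) pairs that agree exactly on every key.

    A registry record is linked only when exactly one claims record matches;
    ambiguous candidates are left unlinked (an assumption of this sketch).
    """
    # Index claims by the tuple of required keys for constant-time lookup.
    index = defaultdict(list)
    for c in claims:
        index[tuple(c[k] for k in keys)].append(c)

    links = []
    for r in registry:
        candidates = index.get(tuple(r[k] for k in keys), [])
        # Compare the optional key only when both records carry a value.
        candidates = [
            c for c in candidates
            if r.get(optional_key) is None
            or c.get(optional_key) is None
            or r.get(optional_key) == c.get(optional_key)
        ]
        if len(candidates) == 1:
            links.append((r["id"], candidates[0]["id"]))
    return links
```

The coarsened 2002 records would need a different key tuple (for example length of stay and procedure month in place of exact dates), which can be passed through the `keys` argument.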
Vital status and dates of death through December 31, 2008 were obtained by linking CMS claims records to the denominator file on the basis of an encrypted Medicare patient identifier. Follow-up was considered to be administratively censored on December 31, 2008 and was at least 1 year for all patients (median 4 years; maximum 7 years).
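A minimal sketch of how administrative censoring on December 31, 2008 turns a surgery date and an optional death date into a follow-up time and an event indicator. The function name and the choice of days as the time unit are assumptions made for illustration.

```python
from datetime import date

STUDY_END = date(2008, 12, 31)  # administrative censoring date from the text


def follow_up(surgery_date, death_date=None, study_end=STUDY_END):
    """Return (days_of_follow_up, died) with administrative censoring.

    Deaths recorded after `study_end` are treated as censored at `study_end`.
    """
    if death_date is not None and death_date <= study_end:
        return (death_date - surgery_date).days, True
    return (study_end - surgery_date).days, False
```

For example, `follow_up(date(2005, 1, 1), date(2005, 1, 31))` yields an observed death at 30 days, whereas a patient with no recorded death contributes censored follow-up through the end of 2008.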
Predictor variables were summarized as percentages if categorical and as mean, median, standard deviation (STD), and quartiles (25th, 75th) if continuous. Predictors were chosen based on published CABG short-term models6 and clinical experience. Variable definitions are available at www.sts.org. The variable for “Number of Diseased Vessels” in the STS ACSD was designed to reflect the amount of myocardium at risk. Thus, although patients with significant left main coronary disease are specifically identified by a separate dichotomous variable, for the purposes of defining myocardium at risk they are also classified as two diseased vessels.
Data were randomly divided into a 50% training sample (n = 174,506) to determine the form of the model and estimate regression coefficients, and a 50% validation sample (n = 173,835) to assess model calibration and discrimination.
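The split can be sketched as follows. Independent per-patient assignment with probability 0.5 is an assumption here; it is one scheme that naturally produces two samples of slightly unequal size, but the text does not state which randomization scheme was used.

```python
import random


def split_half(patient_ids, seed=0):
    """Randomly assign each patient to the training or validation sample.

    Each patient is assigned independently with probability 0.5, so the two
    samples are close to, but not exactly, equal in size.
    """
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    training, validation = [], []
    for pid in patient_ids:
        (training if rng.random() < 0.5 else validation).append(pid)
    return training, validation
```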
We estimated survival as a function of patient preoperative characteristics using the Cox proportional hazards model7. The proportional hazards assumption was investigated by plotting and visually inspecting transformed (log-log) survival probabilities vs. time following CABG. To allow for non-proportional hazards, we estimated separate hazard ratio parameters for all model variables for each of the following time intervals: 0 – 30 days, 31 – 180 days, 181 days – 2 years, and > 2 years. Time intervals were chosen after conducting a preliminary analysis which involved fitting Cox models with several relatively narrow categories, then collapsing adjacent categories based on a combination of statistical and non-statistical considerations. The first cutpoint (30 days) was chosen for consistency with many existing short-term CABG mortality models and quality metrics. As the ability to support even the most seriously ill postoperative patients has increased due to modern critical care, some have suggested that our definition of the “early” postoperative period should likewise be lengthened so as not to underestimate early risk8. This was the basis for our relatively narrow second time interval, 31 to 180 days. The remaining intervals were chosen by collapsing adjacent categories for which the hazard ratios appeared most similar, while retaining sufficient events in each to ensure precise estimation of category-specific hazard ratio parameters.
We fit four separate Cox regression models that were conditional on being alive at the beginning of each time interval. Mathematically, this was equivalent to fitting a single Cox model with piecewise-constant, time-dependent hazard ratios for all model variables.
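One way to see the equivalence is to split each patient's follow-up at the interval boundaries, so that a patient contributes a row to an interval only if still alive when it begins. The sketch below does this in plain Python; taking 2 years as 730 days, the `(start, stop]` convention, and the function name are assumptions of the illustration.

```python
CUTPOINTS = (30, 180, 730)  # days; 2 years taken as 730 days (assumption)


def split_follow_up(time, died, cutpoints=CUTPOINTS):
    """Split one patient's follow-up into (interval, start, stop, event) rows.

    Intervals are (start, stop]. A patient contributes a row to interval k
    only if still at risk when that interval begins, and the event flag is
    set only in the row where follow-up ends in death.
    """
    if time <= 0:
        raise ValueError("follow-up time must be positive in this sketch")
    bounds = (0,) + tuple(cutpoints) + (float("inf"),)
    rows = []
    for k in range(len(bounds) - 1):
        start, end = bounds[k], bounds[k + 1]
        if time <= start:
            break  # follow-up ended before this interval began
        stop = min(time, end)
        rows.append((k, start, stop, bool(died) and time <= end))
    return rows
```

Fitting a Cox model to the rows of a single interval corresponds to the model conditional on survival to the start of that interval, while fitting all rows with interval-specific coefficients corresponds to the single model with piecewise-constant hazard ratios.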
Graphical exploratory analyses were used to determine the functional form of continuous variables and to decide whether categorical variables with several categories could be collapsed into fewer categories. In a preliminary Cox model using flexible regression splines for continuous variables, plots of the variables age, height and year of surgery revealed an approximately linear association with the log-hazard of mortality, and these variables were modeled as linear. The association between body mass index (BMI) and mortality was determined to be non-monotone (U-shape) and was modeled as a continuous polynomial regression function with linear and quadratic effects. We arbitrarily selected BMIs of 20, 30, 35, and 40 kg / m2 to compare their hazard ratios relative to a “normal” reference BMI of 25 kg / m2. For modeling renal function, patients on dialysis were adequately represented by an indicator variable for dialysis without further adjusting for the patient's last preoperative creatinine level. For patients not on dialysis, the relationship between last preoperative creatinine and mortality was modeled as a straight line with a change of slope at 1.5 mg/dL. Ejection fraction was modeled as linear below 60% and constant above 60%. Finally, aortic stenosis pressure gradient was modeled as linear below 77 mmHg and constant above 77 mmHg (the 99th percentile).
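These functional forms can be written down compactly. The sketch below is illustrative only: the coefficient arguments are placeholders rather than estimates from the study, and the variable encodings (for example setting the creatinine terms to zero for dialysis patients) are one possible choice, not necessarily the one used in the analysis.

```python
import math


def bmi_hazard_ratio(bmi, beta_linear, beta_quadratic, reference=25.0):
    """Hazard ratio for `bmi` vs. `reference` under a quadratic log-hazard.

    log HR = b1 * (bmi - ref) + b2 * (bmi^2 - ref^2); coefficients are
    placeholders supplied by the caller.
    """
    log_hr = (beta_linear * (bmi - reference)
              + beta_quadratic * (bmi ** 2 - reference ** 2))
    return math.exp(log_hr)


def creatinine_terms(creatinine, on_dialysis, knot=1.5):
    """Dialysis indicator plus a linear spline with a slope change at `knot`."""
    if on_dialysis:
        # Dialysis patients are represented by the indicator alone.
        return {"dialysis": 1, "creat": 0.0, "creat_above_knot": 0.0}
    return {"dialysis": 0, "creat": creatinine,
            "creat_above_knot": max(creatinine - knot, 0.0)}


def capped(value, cap):
    """Linear below `cap`, constant above it (60 for EF, 77 for gradient)."""
    return min(value, cap)
```

With a negative linear and a positive quadratic coefficient, `bmi_hazard_ratio` produces the U-shape described above, with hazard ratios above 1 on both sides of the minimum.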
Interactions between predictors were examined by identifying 5 predictors with the highest global chi-square statistics and creating all possible pairwise interactions among them, in each case considering whether these were also clinically plausible. While some interaction terms were statistically significant, they were not felt to be of major practical significance. Measures of model calibration and discrimination were not materially affected by their inclusion (i.e., model fit was not substantially improved), and models without interactions were also considered to be substantially more interpretable and usable. Therefore, we retained only main effects in the final model.
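The screening step amounts to enumerating all pairs among the five highest-ranked predictors, which gives 10 candidate interaction terms. A short sketch follows; the function name is hypothetical, and the chi-square values would come from the fitted main-effects model.

```python
from itertools import combinations


def pairwise_interactions(chi_square_by_predictor, top_n=5):
    """Return all pairs among the `top_n` predictors ranked by chi-square."""
    ranked = sorted(chi_square_by_predictor,
                    key=chi_square_by_predictor.get, reverse=True)
    return list(combinations(ranked[:top_n], 2))
```

Each returned pair would then be reviewed for clinical plausibility before being tested in the model.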
Predictor data were highly complete, with most covariates having less than 1% missing data (Supplemental Table 1). Missing values were imputed to the median of continuous variables (after stratifying on relevant variables to enhance prediction of the missing value) and the most common category of binary and polytomous variables. More computationally intensive missing data strategies, such as multiple imputation, were not used for this analysis, as they have been documented to have minimal impact in previous STS risk models9.
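A simplified sketch of this single-imputation rule is shown below. It omits the stratification step described above and uses `None` to mark missing values; both are simplifications made for illustration, and the function and column names are hypothetical.

```python
from collections import Counter
from statistics import median


def impute(records, continuous, categorical):
    """Fill None values with the column median or most common category.

    Returns new dictionaries and leaves the input records unchanged.
    Stratified medians, as used in the study, are not implemented here.
    """
    filled = [dict(r) for r in records]
    for col in continuous:
        fill = median(r[col] for r in records if r[col] is not None)
        for r in filled:
            if r[col] is None:
                r[col] = fill
    for col in categorical:
        observed = [r[col] for r in records if r[col] is not None]
        fill = Counter(observed).most_common(1)[0][0]
        for r in filled:
            if r[col] is None:
                r[col] = fill
    return filled
```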
Model performance was assessed in the 50% validation sample. Predicted survival curves were generated by applying estimated regression coefficients from the development sample to covariate data of patients in the validation sample. To assess calibration (fit), model-based predicted survival curves were averaged across patients in the validation sample and compared to non-parametric (Kaplan-Meier, K-M) survival curves. This was done in the overall validation population and in various subgroups. To further assess calibration, patients in the validation sample were ranked into 20 categories based on their estimated risk of dying within 3 years. Average expected and observed (Kaplan-Meier) 3-year survival probabilities were then calculated within each category and plotted.
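The grouped calibration check can be sketched as follows: patients are ranked by predicted risk, divided into equal-sized groups, and the mean predicted risk in each group is compared with the Kaplan-Meier estimate of observed risk at the same horizon. This is a minimal pure-Python illustration with hypothetical function names, not the code used for the published figures.

```python
from collections import Counter


def kaplan_meier_survival(times, events, horizon):
    """Kaplan-Meier estimate of survival at `horizon`.

    `times` are follow-up times and `events` flags observed deaths.
    Patients censored at a death time are counted as at risk at that time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)
    at_risk = len(times)
    survival = 1.0
    for t in sorted(leaving):
        if t > horizon:
            break
        if deaths[t]:
            survival *= 1.0 - deaths[t] / at_risk
        at_risk -= leaving[t]
    return survival


def calibration_table(predicted_risk, times, events, horizon, n_groups=20):
    """Return (expected, observed) risk per group of ranked predicted risk."""
    order = sorted(range(len(predicted_risk)), key=lambda i: predicted_risk[i])
    n = len(order)
    table = []
    for g in range(n_groups):
        idx = order[g * n // n_groups:(g + 1) * n // n_groups]
        if not idx:
            continue  # fewer patients than groups
        expected = sum(predicted_risk[i] for i in idx) / len(idx)
        observed = 1.0 - kaplan_meier_survival(
            [times[i] for i in idx], [events[i] for i in idx], horizon)
        table.append((expected, observed))
    return table
```

Plotting the observed against the expected values from `calibration_table` gives a calibration plot of the kind described; points near the diagonal indicate good agreement.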
Discrimination was quantified using two methods. First, discrimination for predicting mortality status as a dichotomous endpoint (alive/dead) was assessed by the area under the receiver operating characteristic (ROC) curve (C-index) for three selected time points: 30 days, 1 year, and 3 years. All patients had at least one year of follow-up and were included in the estimation of discrimination for the 30-day and 1-year time-points. For the 3-year time-point, the 65% of patients with at least 3 years of potential follow-up (i.e., those treated between January 1, 2002 and December 31, 2005) were included. Second, an analogous overall measure of discrimination for predicting survival time as a continuous variable was calculated using Harrell's C-index for censored survival data10. To apply Harrell's method, patients were ranked according to their predicted 3-year mortality risk. We then calculated the proportion of pairs of patients for which the patient with the lower predicted probability of mortality survived longer than the patient with the higher predicted probability, accounting appropriately for censoring.
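The pairwise definition of Harrell's C-index can be made concrete with a short sketch. This quadratic-time version is for illustration only; it skips pairs with tied follow-up times, which is a simplification, and a cohort of this size would require a more efficient algorithm.

```python
def harrell_c(risk, times, events):
    """Harrell's C: share of usable pairs where the higher-risk patient dies first.

    A pair is usable when the shorter follow-up time ends in an observed
    death. Ties in predicted risk count as one half; pairs with tied
    follow-up times are skipped (a simplification of this sketch).
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier death
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / usable if usable else float("nan")
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking of survival times by predicted risk.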
After model development and validation were completed, we re-estimated the final model coefficients based on the complete dataset (development + validation samples). Confidence intervals for hazard ratios were calculated with sandwich standard error estimates to account for within-hospital clustering11.
Supplemental Table 2 compares the characteristics of STS CABG patients who were or were not matched to CMS. For most variables these two groups were quite similar.
Table 1 depicts the characteristics of the final study population of 348,341 patients who underwent isolated CABG. Kaplan-Meier estimated mortality in the overall study cohort (development and validation samples) was 3.2% at 30 days, 6.4% at 180 days, 8.1% at one-year, 11.3% at two years, and 23.3% at 3 years of follow-up. Supplemental Table 3 summarizes the univariable association between each candidate predictor variable and estimated mortality rates at 30-days, 1 year and 3 years.
Table 2 shows hazard ratios derived by fitting multivariable Cox regression models to four time intervals (see methods). In multivariable analyses, several distinct, temporal risk factor patterns are evident. For example, higher ejection fraction was protective over all time periods and the magnitude of effect was stable. Conversely, past history of a stroke (CVA), transient ischemic attack (TIA), or reversible ischemic neurological deficit (RIND), moderate or severe chronic lung disease (CLD), or immunosuppressive treatment had a significant negative impact on survival at all endpoints. The magnitude of effect of some important early predictors of risk, including current smoking, insulin-dependent diabetes, and dialysis-dependent renal failure, increased over time suggesting an accumulation of risk from these debilitating chronic behaviors and diseases. On the other hand, the effect of some important early predictors of increased mortality (e.g., emergency status, cardiogenic shock, acute preoperative myocardial infarction [MI], and reoperation) diminished rather quickly and became non-significant for those patients who survived the early postoperative and recovery periods.
Our results confirm the so-called “obesity paradox” reported in other short-term analyses12, 13 and demonstrate that these effects persist for at least two years postoperatively. Low body mass index (20 kg/m2 vs. 25 kg/m2) predicted higher mortality at all time periods postoperatively, whereas obesity (>25 kg/m2) was associated with decreased risk.
Model discrimination (C-index) in the validation set was 0.762 for predicting 30-day status, 0.764 for predicting one-year status, and 0.748 for predicting three-year status. Harrell's C statistic for predicting overall survival time was 0.732. Thirty-day model discrimination is lower than that observed in our most recent STS isolated CABG risk models6, most likely because the current model is limited to patients over 65 years of age. Model discrimination at longer time intervals is also lower than that in the early postoperative period. As the time interval from surgery increases, there is correspondingly greater probability that other factors not included in the risk models may impact survival.
Figure 2 depicts the expected and Kaplan-Meier (K-M) observed survival curves for the overall validation cohort. Figure 3 compares observed and expected 3-year mortality risk across 20 categories of predicted risk. Within the typical range of expected mortalities, prediction is highly accurate. From 20 – 40% expected mortality, there is very slight underestimation of mortality risk, and at the highest expected mortality (> 50%) there is slight overestimation. Figure 4 depicts K-M observed and expected survival curves for selected patient subgroups in the validation cohort. The expected (solid) and observed (dashed) lines are nearly superimposable on most of the plots.
Short-term duration of follow-up has prevented the full potential of clinical data registries from being realized. As average acute hospital length of stay has shortened, procedure-related deaths and complications are correspondingly more likely to occur after patients have been discharged from the hospital. The use of advanced mechanical and pharmacological support has increasingly prolonged the lives of many critically ill postoperative patients, and such patients may be transferred to long-term critical care facilities on ventilators or dialysis. Deaths among such patients may not occur for months after their index hospital discharge, and these delayed postoperative deaths would not be captured in most existing clinical registries8. Short-term follow-up is also a major limitation of comparative effectiveness studies of various treatment strategies, such as CABG or PCI for coronary artery disease. Differences in efficacy of alternative treatments are often not apparent for months or years, much longer than the typical endpoints in most clinical registries. Finally, some preoperative risk factors may have little impact on short-term mortality but are major considerations in the longer term, and vice versa.
Some previous studies have assessed the long-term impact of preoperative risk factors, operative and perioperative care processes, and postoperative complications14–28. Our study focuses specifically on preoperative risk factors, and it addresses the major limitations of these earlier studies. Many of the earlier studies are from single institutions, and their inferences may be confounded by idiosyncratic hospital practice patterns. Most prior studies include only a few hundred to a few thousand patients and lack the power to identify the full spectrum of factors associated with outcomes. Some studies of long-term CABG mortality have been based solely on large administrative databases. This strategy assures adequate sample size and provides valuable information regarding vital status, readmissions, re-interventions, costs, aggregate resource utilization, and outpatient activities. However, administrative databases have a number of well-known deficiencies that limit their usefulness in clinical research, including misclassification of procedures and diagnoses; unavailability of important clinical variables; inability to distinguish co-morbidities from complications (in the absence of Present on Admission indicators); and focus on narrow patient populations29–34. There are studies of long-term CABG outcome predictors based on clinical registries, but these are derived from data that are ten to twenty years old and may not reflect current patient severity and surgical practice35, 36.
Our study seeks to overcome the inherent limitations of both clinical and administrative data registries by linking the two together. This approach compensates for their individual deficiencies while harnessing their complementary strengths. The resulting linked data retain the granularity and clinical detail of clinical registries while adding long-term outcomes and cost data available in administrative data sources. These linked data are ideally suited to studies of long-term clinical outcomes, comparative effectiveness, resource utilization, and provider performance for particular types of patients.
Using predicted long-term outcomes tailored to their specific risk profiles, patients may more effectively participate in shared decision-making with their providers. Awareness of both the short-term and long-term risks and benefits (e.g., survival, complications, quality of life) might assist patients in deciding whether or not to proceed with surgery. Furthermore, just as short-term outcomes vary among providers, it is possible that long-term outcomes may also vary, and such information could be useful for all stakeholders.
The long-term CABG mortality model described in this report, based solely on preoperative patient characteristics, is only the first of many applications we envision to exploit the advantages of linked registries. In addition to mortality, it will also be possible to study other long-term endpoints such as readmissions, re-interventions, and cumulative costs and resource use. Other models will estimate the effect of intraoperative decisions, such as use of all-arterial grafting or off-pump procedures, on long-term outcomes. Combined with preoperative variables, such information could help determine what specific procedures or perioperative strategies are most useful for specific types of patients. The addition of early postoperative events (e.g., stroke, or mediastinitis) as predictor variables would permit more effective discussions with such patients regarding their long-term health expectations.
Linkages with CMS and other administrative data sources will also enhance the accuracy of outcomes data used to calculate performance metrics such as the STS CABG composite scores37, 38. For example, ongoing linkages with the Social Security Death Master File or National Death Index would permit continuous input and validation of vital status for patients of all ages, not just the Medicare population39.
Linked clinical and administrative data will facilitate the determination of risk-adjusted, long-term freedom from reoperation and readmission not only for surgical procedures but also for a variety of medical devices, such as cardiac valve prostheses. This ability to capture objective long-term patient status, coupled with extensive clinical data from the perioperative period, will be a marked improvement over existing methods for post-market surveillance.
Linkages to other clinical registries will also be useful, and their combined utility will be further enhanced by linking to administrative data, as demonstrated by the ASCERT comparative effectiveness study1. Furthermore, as payment strategies evolve from a focus on procedures or acute hospitalizations to episodes of care, the ability to link related clinical registries (e.g., cardiology and cardiac surgery) will facilitate the study and implementation of these reimbursement policies. Finally, linkages between clinical and payer registries would provide unique information such as outpatient visits, compliance with medications, and cumulative resource use.
There are inherent limitations to any studies that use voluntarily collected data, but these are mitigated by the robust STS audit program described previously.
In order to obtain accurate long-term follow-up, it was necessary to link our data to CMS claims data. Because Medicare claims data are restricted to patients 65 years and older, the generalizability of our findings to younger populations is uncertain.
It was impossible to accurately determine cause of death (e.g., cardiac vs. non-cardiac), and our analyses use all-cause mortality as the endpoint.
Our linkages are based on combinations of indirect identifiers. Previous analyses have demonstrated that non-unique identifiers can be combined to create high-quality links between a clinical registry and an administrative data set, allowing researchers to capitalize on the strengths of both types of data to answer important clinical questions5, 40. Although we believe this strategy yielded highly accurate matches in our study, some errors may have been introduced through this process.
Finally, the more distant from the time of surgery, the more opportunity there is for non-surgery related events to confound the apparent associations between preoperative factors and outcomes.
We linked broadly representative, real-world clinical data from the STS ACSD and vital status from Medicare claims data to construct a robust, long-term CABG survival prediction model. In the validation sample, predicted and observed survival agreed closely, and Harrell's C statistic for overall survival time was 0.732.
As the time interval from surgery lengthens, the clinical outcomes of postoperative survivors are less impacted by traditional predictors of early survival, such as emergency status, shock, and reoperation. Conversely, late mortality is increasingly associated with chronic diseases such as insulin-dependent diabetes and dialysis-dependent renal failure, and behaviors such as smoking.
As short-term CABG mortality rates decline, the ability to estimate long-term outcomes for patients with particular risk factors will become increasingly important for shared decision-making, comparative effectiveness research, optimal treatment planning, quality improvement initiatives, and provider profiling.
Funding Sources: The ASCERT Study is supported by Award Number RC2HL101489 from the National Heart, Lung, and Blood Institute. This award has been issued under the American Recovery and Reinvestment Act of 2009 for a two-year period. The funders played no role in the design, interpretation, or decision to publish the analysis presented herein. The content of this article is solely the responsibility of the authors and does not necessarily represent the official views of the National Heart, Lung, and Blood Institute or the National Institutes of Health.
Conflict of Interest Disclosures: Consultancy, expert testimony, grants/grants pending: Dangas, Weintraub. Other: DCRI serves as the data warehouse and analysis center for the STS and ACCF registries (DeLong, Grau-Sepulveda, O'Brien, Peterson, Sheng).