Data were extracted from the medical records of all 3511 patients who underwent primary or revision hip or knee arthroplasty from March 1, 2003, to August 31, 2006, at Massachusetts General Hospital. Data for primary diagnosis, procedure, comorbidities, intraoperative variables, and immediate outcomes during hospitalization were collected from the hospital's electronic clinical data and electronic administrative records. The study protocol, including a waiver of informed consent for individual patients, was approved by the Massachusetts General Hospital Human Research Committee.
Data for each patient were abstracted manually from discharge summaries, operative notes, and ICD-9 codes by one investigator (THW). Primary diagnoses and procedures were abstracted from the operative reports and discharge summaries. In case of conflicting information, the operative report took precedence. To assess interrater reliability, a clinical fellow (THV) was first instructed on the abstraction techniques and then abstracted data for the same variables regarding comorbidities and major complications from a random sample of 50 patients taken from the database. Abstraction of outcomes by an independent rater yielded 100% agreement for all the variables.
Primary diagnoses and indications were collapsed into the following categories: osteoarthritis (including dysplasia and slipped capital femoral epiphysis); rheumatoid arthritis (including other inflammatory etiologies, such as villonodular synovitis and psoriatic arthritis, and heterotopic ossification); infection-related joint arthroplasty; mechanical (including dislocation, aseptic loosening, failed allografts, pseudarthrosis, wear, and osteolysis); avascular necrosis; posttraumatic changes; and benign or malignant tumor-related joint arthroplasty.
Intraoperative parameters included estimated blood loss (EBL), lowest heart rate (HR), and lowest mean arterial pressure (MAP). Intraoperative records were stored in an electronic Anesthesia Information Management System (Saturn; Dräger Medical, Telford, PA, USA). This database is accessible via Structured Query Language (SQL), and an SQL query was developed to extract the intraoperative physiologic data recorded during surgery. Electronic anesthesia data differ from handwritten records in multiple aspects [5]. Of particular concern is their tendency to include artifactual or erroneous values (for example, false pressure readings when an arterial catheter is flushed). Therefore, we used a previously validated filtering algorithm to eliminate artifactual readings [19].
The 10-point Surgical Apgar Score was computed for each patient from three intraoperative parameters (EBL, lowest HR, and lowest MAP) (Table ). Derivation of the score was based on a logistic regression model as previously reported [8].
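As an illustration only, the scoring step can be sketched in Python. The cutoffs below are an assumption: they follow the point assignments of the originally published Surgical Apgar Score as commonly reported, not this study's table, which remains authoritative. The function name `surgical_apgar_score` and its parameters are hypothetical. The published score also assigns 0 heart-rate points when a pathologic bradyarrhythmia occurs; this sketch omits that rule.

```python
def surgical_apgar_score(ebl_ml, lowest_map_mmhg, lowest_hr_bpm):
    """Return a 0-10 score from three intraoperative values.

    Cutoffs are assumed from the published Surgical Apgar Score and should be
    checked against the study's own table before any real use.
    """
    # Estimated blood loss (mL): less bleeding earns more points (0-3).
    if ebl_ml > 1000:
        ebl_points = 0
    elif ebl_ml > 600:
        ebl_points = 1
    elif ebl_ml > 100:
        ebl_points = 2
    else:
        ebl_points = 3

    # Lowest mean arterial pressure (mm Hg): higher pressure earns more (0-3).
    if lowest_map_mmhg < 40:
        map_points = 0
    elif lowest_map_mmhg < 55:
        map_points = 1
    elif lowest_map_mmhg < 70:
        map_points = 2
    else:
        map_points = 3

    # Lowest heart rate (beats/min): a lower rate earns more points (0-4).
    if lowest_hr_bpm > 85:
        hr_points = 0
    elif lowest_hr_bpm > 75:
        hr_points = 1
    elif lowest_hr_bpm > 65:
        hr_points = 2
    elif lowest_hr_bpm > 55:
        hr_points = 3
    else:
        hr_points = 4

    return ebl_points + map_points + hr_points
```

Under these assumed cutoffs, a patient with 300 mL blood loss, a lowest MAP of 60 mm Hg, and a lowest HR of 70 beats/min would receive 2 + 2 + 2 = 6 points.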
The primary end point was the incidence of a major postoperative complication or death during hospitalization. Major complications were identified from diagnoses in discharge summaries, operative reports, and ICD-9 codes and were based on definitions from the NSQIP [13] (Table ).
Individual major complications
Basic demographics and summary statistics were calculated overall and for those with and without major complications. There were no missing data for analysis of the intraoperative variables and the outcomes. For development of the preoperative risk model, only cases with complete data were used; because some laboratory values were missing for some patients, the number of patients available for this model decreased from 3511 to 3236. For all variables, including the three intraoperative variables of interest, differences between patients with and without complications were compared using the two-sided t test or Wilcoxon rank-sum test for continuous variables and the chi-square test for categorical variables.
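For a categorical variable with two levels, the chi-square comparison between patients with and without complications reduces to a 2 x 2 table. The following is a minimal sketch, assuming the Pearson statistic without continuity correction; the function name `chi_square_2x2` and the cell labels are hypothetical, and the study's analyses were run in SAS, not with this code. With 1 degree of freedom, the p value can be obtained from the complementary error function.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Rows could be, for example, comorbidity present/absent and columns
    complication yes/no. Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column needs at least one patient")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value
```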
To assess predictive performance, we used univariable logistic regression, with the 10-point score as a categorical predictor, to evaluate the calibration and discrimination of the score as a comprehensive predictive instrument for major complications or death in the arthroplasty cohort. We evaluated calibration with calibration graphs and the Hosmer-Lemeshow goodness-of-fit test [12]. This test is commonly used to assess goodness of fit for logistic regression models. It tests whether the observed event rates match the expected event rates. If observed and expected event rates are similar, the model is called well calibrated. If the p value of this test is more than 0.05, there is no evidence of lack of fit. Discrimination, a measure of how well the score can differentiate patients with and without complications, was assessed by the c-statistic. Closely related to sensitivity and specificity, the c-statistic is the proportion of all possible pairs of patients discordant for outcome (one with and one without a major complication) in which the model assigns the higher predicted probability of a major complication to the patient with the complication. A value of 0.5 corresponds to chance and a value of 1.0 to perfect discrimination.
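Both measures can be sketched in a few lines of Python. This is an illustrative sketch, not the SAS procedure used in the study. The function names are hypothetical, tied predictions are assumed to count as half a concordant pair, and the Hosmer-Lemeshow groups are assumed to be equal-sized slices of the patients ordered by predicted risk. The statistic returned would be compared with a chi-square distribution with (groups - 2) degrees of freedom; the p value is not computed here.

```python
def c_statistic(predicted, outcomes):
    """Share of (complication, no-complication) pairs ranked correctly.

    predicted: model probabilities; outcomes: 1 = complication, 0 = none.
    Ties in predicted probability count as half a concordant pair.
    """
    event_probs = [p for p, y in zip(predicted, outcomes) if y == 1]
    nonevent_probs = [p for p, y in zip(predicted, outcomes) if y == 0]
    if not event_probs or not nonevent_probs:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event in event_probs:
        for p_nonevent in nonevent_probs:
            if p_event > p_nonevent:
                concordant += 1.0
            elif p_event == p_nonevent:
                concordant += 0.5
    return concordant / (len(event_probs) * len(nonevent_probs))


def hosmer_lemeshow_statistic(predicted, outcomes, groups=10):
    """Hosmer-Lemeshow chi-square statistic over risk-ordered groups."""
    ranked = sorted(zip(predicted, outcomes))
    n = len(ranked)
    statistic = 0.0
    for g in range(groups):
        chunk = ranked[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(y for _, y in chunk)   # observed events in the group
        expected = sum(p for p, _ in chunk)   # expected events in the group
        # Skip groups whose expected count is degenerate (all 0 or all 1).
        if 0.0 < expected < size:
            statistic += (observed - expected) ** 2 / (
                expected * (1.0 - expected / size)
            )
    return statistic
```

For example, with predicted risks 0.1 and 0.4 for two patients without complications and 0.35 and 0.8 for two patients with complications, three of the four discordant pairs are ranked correctly, so the c-statistic is 0.75.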
To investigate whether the performance of the score depended on mortality, a sensitivity analysis was performed in which the analysis was repeated in the subgroup of patients who survived, excluding those who had died.
To analyze whether the score contributes incremental information regarding patient risk independent of baseline clinical variables, we derived a preoperative risk model based on clinical variables easily obtainable before surgery, using stepwise multivariable logistic regression to predict major complications. Variables tested for inclusion in the model included age, gender, weight, height, American Society of Anesthesiologists (ASA) class, creatinine, estimated glomerular filtration rate, blood urea nitrogen (BUN), and diagnosis of femoral fracture. Preexisting conditions including diabetes mellitus and pulmonary and cardiovascular comorbidities also were evaluated as risk factors (Table ). Pulmonary comorbidity was defined as preexisting chronic obstructive pulmonary disease, ventilator dependence, or pneumonia. Cardiovascular comorbidity was defined as prior myocardial infarction, angina, congestive heart failure, or coronary revascularization. Patients with a history of transient ischemic attack (TIA) or stroke, with or without residual neurologic deficit, were pooled in one group called “history of stroke/TIA” and included in cardiovascular comorbidity. Nonphysiologic outliers and influential points were checked individually for the logistic regression models, and variables were truncated accordingly when necessary. Model diagnostics were performed with residual plots. The c-statistic for this model was calculated to summarize discrimination, and the Hosmer-Lemeshow goodness-of-fit test was used to assess calibration.
Patients were stratified into equal-sized quintiles based on their preoperative risk as determined by the derived model. A Spearman correlation was used to compare preoperative risk stratification and the score. Outcome rates for increasing levels of the Surgical Apgar Score (categorized as 0–4, 5–6, 7–8, and 9–10) were calculated for each quintile of preoperative risk. A Cochran-Armitage chi-square test of trend was performed separately for each preoperative risk quintile to see whether, at each level of preoperative risk, the outcome rate was related to the intraoperative risk as measured by the score category. We also used logistic regression to assess whether the score added incremental prognostic information independent of preoperative risk, again using the incidence of major complications as the outcome and controlling for preoperative risk predictions. A likelihood ratio test was computed to compare the model with preoperative risk predictions as the only independent variable with a model that included the preoperative risk predictions and the score. The results are presented as the odds ratio, 95% confidence interval, and corresponding p value for the score, adjusted for preoperative risk. All analyses were performed using the SAS® 9.1 statistical software package (SAS Institute Inc, Cary, NC, USA).
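The trend test and the likelihood ratio comparison can be illustrated with a short Python sketch. It is not the SAS code used for the analyses. The function names are hypothetical, the score categories are assumed to carry equally spaced scores (0, 1, 2, 3) unless others are supplied, and the likelihood ratio test is assumed to have 1 degree of freedom, which holds only if the score enters the larger model as a single term. The log-likelihoods are assumed to come from models fitted elsewhere.

```python
import math


def cochran_armitage_trend(events, totals, scores=None):
    """Cochran-Armitage test for trend in proportions across ordered groups.

    events[i] and totals[i] are the number of patients with a complication and
    the number of patients in score category i. Returns (z, two_sided_p);
    z squared is the 1-degree-of-freedom chi-square statistic.
    """
    if scores is None:
        scores = list(range(len(events)))  # assumed equally spaced scores
    n_total = sum(totals)
    p_bar = sum(events) / n_total  # overall complication rate
    t_stat = sum(s * (r - n * p_bar) for s, r, n in zip(scores, events, totals))
    sum_ns = sum(n * s for s, n in zip(scores, totals))
    sum_ns2 = sum(n * s * s for s, n in zip(scores, totals))
    variance = p_bar * (1.0 - p_bar) * (sum_ns2 - sum_ns ** 2 / n_total)
    if variance <= 0.0:
        raise ValueError("trend test undefined: no variation in outcome or scores")
    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


def likelihood_ratio_test_1df(loglik_reduced, loglik_full):
    """Compare nested models that differ by one parameter.

    loglik_reduced: log-likelihood with preoperative risk only.
    loglik_full: log-likelihood with preoperative risk plus the score.
    """
    statistic = 2.0 * (loglik_full - loglik_reduced)
    # For 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(max(statistic, 0.0) / 2.0))
    return statistic, p_value
```

In this sketch a negative z indicates that the complication rate falls as the score category rises, which is the direction expected if higher scores reflect lower intraoperative risk.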