Background: Indices for predicting survival are essential for assessing prognosis and assigning priority for liver transplantation in patients with liver cirrhosis. The model for end stage liver disease (MELD) has been proposed as a tool to predict mortality risk in cirrhotic patients. However, this model has not been validated beyond its original setting.
Aim: To evaluate the short and medium term survival prognosis of a European series of cirrhotic patients by means of MELD compared with the Child-Pugh score. We also assessed correlations between the MELD scoring system and the degree of impairment of liver function, as evaluated by the monoethylglycinexylidide (MEGX) test.
Patients and methods: We retrospectively evaluated survival of a cohort of 129 cirrhotic patients with a follow up period of at least one year. The Child-Pugh score was calculated and the MELD score was computed according to the original formula for each patient. All patients had undergone a MEGX test. Multivariate analysis was performed on all variables to identify the parameters independently associated with one year and six month survival. MELD values were correlated with both Child-Pugh scores and MEGX test results.
Results: Thirty-one patients died within the first year of follow up. Child-Pugh and MELD scores, and MEGX serum levels, differed significantly between patients who survived and those who died. Serum creatinine, international normalised ratio, and MEGX60 were independently associated with six month mortality, while the same variables and the presence of ascites were associated with one year mortality. MELD scores showed significant correlations with both MEGX values and Child-Pugh scores.
Conclusions: In a European series of cirrhotic patients the MELD score is an excellent predictor of both short and medium term survival, and performs at least as well as the Child-Pugh score. An increase in MELD score is associated with a decrease in residual liver function.
cirrhosis; liver function; ascites; creatinine; Child-Pugh score; MELD; MEGX
This study aims to compare the performance of APACHE IV with that of the MELD scoring system in predicting the risk of mortality after orthotopic liver transplantation (OLT). A retrospective cohort study was performed on a total of 195 patients admitted to the ICU after OLT between February 2006 and July 2009 in Guangzhou, China. APACHE IV and MELD scoring systems were used to predict postoperative mortality after OLT. The area under the receiver operating characteristic curve (AUC) and the Hosmer-Lemeshow C statistic were used to assess the discrimination and calibration of APACHE IV and MELD, respectively. Twenty-seven patients died during hospitalization, a mortality rate of 13.8%. The mean scores of APACHE IV and MELD were 42.32 ± 21.95 and 18.09 ± 10.55, respectively, and APACHE IV showed better discrimination than MELD; the AUCs for APACHE IV and MELD were 0.937 and 0.694 (P < 0.05 for both models), which indicated that the prognostic value of APACHE IV was relatively high. Both models were well calibrated (the Hosmer-Lemeshow C statistics were 1.568 and 6.818 for APACHE IV and MELD, respectively; P > 0.05 for both). The respective Youden indexes of APACHE IV, MELD, and the combination of APACHE IV with MELD were 0.763, 0.430, and 0.545. The prognostic value of APACHE IV is high, although it still underestimates overall hospital mortality, while the prognostic value of MELD is poor. APACHE IV therefore performs better than MELD.
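As a rough illustration of the two discrimination summaries reported above, the following Python sketch computes a rank-based AUC and the Youden index (sensitivity + specificity - 1) at a single cutoff. The scores, outcomes, and cutoff are invented toy values, not data from the study, and the function names are hypothetical.

```python
def auc(scores, died):
    """Rank-based AUC: chance that a non-survivor scores higher than a survivor.

    Ties between a non-survivor and a survivor count as half a win.
    """
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_index(scores, died, cutoff):
    """Youden index J = sensitivity + specificity - 1 for 'score >= cutoff predicts death'."""
    tp = sum(1 for s, d in zip(scores, died) if d and s >= cutoff)
    fn = sum(1 for s, d in zip(scores, died) if d and s < cutoff)
    tn = sum(1 for s, d in zip(scores, died) if not d and s < cutoff)
    fp = sum(1 for s, d in zip(scores, died) if not d and s >= cutoff)
    return tp / (tp + fn) + tn / (tn + fp) - 1.0


# Toy data (assumption): six patients, two of whom died.
toy_scores = [10, 20, 30, 40, 50, 60]
toy_died = [0, 0, 0, 1, 0, 1]

print(auc(toy_scores, toy_died))                      # 0.875
print(youden_index(toy_scores, toy_died, cutoff=40))  # 0.75
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation; the Youden index ranges from 0 (uninformative cutoff) to 1.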
The scarcity of grafts available necessitates a system that considers expected posttransplant survival, in addition to pretransplant mortality as estimated by the MELD. So far, however, conventional linear techniques have failed to achieve sufficient accuracy in posttransplant outcome prediction. In this study, we aim to develop a pretransplant predictive model for liver recipients' survival with benign end-stage liver diseases (BESLD) by a nonlinear method based on pretransplant characteristics, and compare its performance with a BESLD-specific prognostic model (MELD) and a general-illness severity model (the sequential organ failure assessment score, or SOFA score).
With retrospectively collected data on 360 recipients receiving deceased-donor transplantation for BESLD between February 1999 and August 2009 in the West China Hospital of Sichuan University, we developed a multi-layer perceptron (MLP) network to predict one-year and two-year survival probability after transplantation. The performances of the MLP, SOFA, and MELD were assessed by measuring both calibration ability and discriminative power, using the Hosmer-Lemeshow test and receiver operating characteristic analysis, respectively. By forward stepwise selection, donor age and BMI; serum concentration of HB, Crea, ALB, TB, ALT, INR, Na+; presence of pretransplant diabetes; dialysis prior to transplantation; and microbiologically proven sepsis were identified as the optimal input features. The MLP, employing 18 input neurons and 12 hidden neurons, yielded high predictive accuracy, with a c-statistic of 0.91 (P<0.001) in one-year and 0.88 (P<0.001) in two-year prediction. The performances of SOFA and MELD were fairly poor in prognostic assessment, with c-statistics of 0.70 and 0.66, respectively, in one-year prediction, and 0.67 and 0.65 in two-year prediction.
The posttransplant prognosis is a multidimensional nonlinear problem, and the MLP can achieve significantly higher accuracy than SOFA and MELD scores in posttransplant survival prediction. Pattern recognition methodologies such as the MLP hold promise for solving posttransplant outcome prediction.
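To make the network shape concrete, here is a minimal Python sketch of a forward pass through an MLP with 18 inputs and 12 hidden units, the layer sizes stated above. Everything else is an assumption made for illustration: the sigmoid activations, the single output unit, the random placeholder weights, and the function names. It is not the trained model from the study and its output has no clinical meaning.

```python
import math
import random

N_INPUTS, N_HIDDEN = 18, 12  # layer sizes reported in the text


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_mlp(seed=0):
    """Create random placeholder weights (assumption: a real model would be trained)."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
    b_hidden = [0.0] * N_HIDDEN
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
    b_out = 0.0
    return w_hidden, b_hidden, w_out, b_out


def predict_survival(features, params):
    """Forward pass: inputs -> 12 sigmoid hidden units -> one sigmoid output in (0, 1)."""
    w_hidden, b_hidden, w_out, b_out = params
    if len(features) != N_INPUTS:
        raise ValueError(f"expected {N_INPUTS} input features, got {len(features)}")
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


params = init_mlp()
probability = predict_survival([0.0] * N_INPUTS, params)  # all-zero toy input
print(0.0 < probability < 1.0)
```

The nonlinearity of the hidden layer is what lets such a model capture interactions among inputs that a linear score cannot represent.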
The aim of this study was to assess the prognostic accuracy of Child-Pugh and APACHE II and III scoring systems in predicting short-term, hospital mortality of patients with liver cirrhosis.
200 admissions of 147 cirrhotic patients (44% viral-associated liver cirrhosis, 33% alcoholic, 18.5% cryptogenic, 4.5% both viral and alcoholic) were studied prospectively. Clinical and laboratory data conforming to the Child-Pugh, APACHE II and III scores were recorded on day 1 for all patients. Discrimination was evaluated using receiver operating characteristic (ROC) curves and area under a ROC curve (AUC). Calibration was estimated using the Hosmer-Lemeshow goodness-of-fit test.
Overall mortality was 11.5%. The mean Child-Pugh, APACHE II and III scores for survivors were found to be significantly lower than those of nonsurvivors. Discrimination was excellent for Child-Pugh (ROC AUC: 0.859) and APACHE III (ROC AUC: 0.816) scores, and acceptable for APACHE II score (ROC AUC: 0.759). Although the Hosmer-Lemeshow statistic revealed adequate goodness-of-fit for Child-Pugh score (P = 0.192), this was not the case for APACHE II and III scores (P = 0.004 and 0.003, respectively).
Our results indicate that, of the three models, Child-Pugh score had the least statistically significant discrepancy between predicted and observed mortality across the strata of increasing predicted mortality. This supports the hypothesis that APACHE scores do not work accurately outside ICU settings.
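The calibration check named above can be sketched as follows. This minimal Python version of the Hosmer-Lemeshow statistic sorts patients by predicted risk, splits them into equal-sized groups, and sums the squared difference between observed and expected deaths, scaled by the binomial variance. The toy probabilities, the grouping into two strata, and the function name are assumptions for illustration; the p-value, usually taken from a chi-square distribution with (groups - 2) degrees of freedom, is not computed here.

```python
def hosmer_lemeshow(probs, outcomes, groups=10):
    """Hosmer-Lemeshow chi-square statistic over risk-ordered groups of equal size."""
    pairs = sorted(zip(probs, outcomes), key=lambda t: t[0])
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected number of deaths
        observed = sum(y for _, y in chunk)   # observed number of deaths
        variance = expected * (1.0 - expected / size)
        if variance > 0:                      # skip degenerate groups
            statistic += (observed - expected) ** 2 / variance
    return statistic


# Toy data (assumption): ten low-risk and ten high-risk patients.
toy_probs = [0.1] * 10 + [0.5] * 10
well_calibrated = [1] + [0] * 9 + [1] * 5 + [0] * 5   # 1/10 and 5/10 deaths
poorly_calibrated = [1] * 3 + [0] * 7 + [1] * 5 + [0] * 5  # 3/10 deaths in the low-risk group

print(hosmer_lemeshow(toy_probs, well_calibrated, groups=2))
print(hosmer_lemeshow(toy_probs, poorly_calibrated, groups=2))
```

A small statistic (large p-value) indicates that predicted and observed mortality agree within each risk stratum, which is the sense in which a low P value above signals poor calibration.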
To determine the generalizability of the predictions for 90-day mortality generated by Model for End-stage Liver Disease (MELD) and the serum sodium augmented MELD (MELDNa) to Atlantic Canadian adults with end-stage liver disease awaiting liver transplantation (LT).
The predictive accuracy of the MELD and the MELDNa was evaluated by measurement of the discrimination and calibration of the respective models’ estimates for the occurrence of 90-day mortality in a consecutive cohort of LT candidates accrued over a five-year period. Accuracy of discrimination was measured by the area under the ROC curves. Calibration accuracy was evaluated by comparing the observed and model-estimated incidences of 90-day wait-list failure for the total cohort and within quantiles of risk.
The area under the ROC curve for the MELD was 0.887 (95% CI 0.705 to 0.978) – consistent with very good accuracy of discrimination. The area under the ROC curve for the MELDNa was 0.848 (95% CI 0.681 to 0.965). The observed incidence of 90-day wait-list mortality in the validation cohort was 7.9%, which was not significantly different from the MELD estimate of 6.6% (95% CI 4.9% to 8.4%; P=0.177) or the MELDNa estimate of 5.8% (95% CI 3.5% to 8.0%; P=0.065). Global goodness-of-fit testing found no evidence of significant lack of fit for either model (Hosmer-Lemeshow χ2 [df=3] for MELD 2.941, P=0.401; for MELDNa 2.895, P=0.414).
Both the MELD and the MELDNa accurately predicted the occurrence of 90-day wait-list mortality in the study cohort and, therefore, are generalizable to Atlantic Canadians with end-stage liver disease awaiting LT.
End-stage liver disease; Liver transplantation; Mortality; Statistical models; Validation study; Wait list
Acute liver failure (ALF) is a rare disease with high mortality, and liver transplantation is the only life-saving therapy. Accurate prognosis of ALF is crucial for proper intervention.
To identify and characterize newly developed prognostic models of mortality for ALF patients, assess study quality, identify important variables and provide recommendations for the development of improved models in the future.
The online databases MEDLINE® (1950–2012) and EMBASE® (1980–2012) were searched for English-language articles that reported original data from clinical trials or observational studies on prognostic models in ALF patients. Studies were included if they developed a new model or modified existing prognostic models. The studies were evaluated based on an existing framework for scoring the methodological and reporting quality of prognostic models.
Twenty studies were included, of which 18 reported on newly developed models, 1 on a modification of the King's College Criteria (KCC) and 1 on the Model for End-Stage Liver Disease (MELD). Ten studies compared the newly developed models to previously existing models (e.g. KCC); they all reported that the new models were superior. On the 12-point methodological quality score, only one study scored full points. On the 38-point reporting score, no study scored full points. There was a general lack of reporting on missing values. In addition, none of the studies used performance measures for calibration and accuracy (e.g. Hosmer-Lemeshow statistics, Brier score), and only 5 studies used the AUC as a measure of discrimination.
There are many studies on prognostic models for ALF but they show methodological and reporting limitations. Future studies could be improved by better reporting and handling of missing data, the inclusion of model calibration aspects, use of absolute risk measures, explicit considerations for variable selection, the use of a more extensive set of reference models and more thorough validation.
AIM: To incorporate estimated glomerular filtration rate (eGFR) into the model for end-stage liver disease (MELD) score to evaluate the predictive value.
METHODS: From January 2004 to October 2008, the records of 4127 admitted cirrhotic patients were reviewed. Patients who survived and were followed up as outpatients were defined as survivors and their most recent available laboratory data were collected. Patients whose records indicated death at any time during the hospital stay were defined as non-survivors (in-hospital mortality). Patients with incomplete data or with cirrhosis due to a congenital abnormality such as primary biliary cirrhosis were excluded; thus, a total of 3857 patients were enrolled in the present study. The eGFR, which was calculated by using either the modification of diet in renal disease (MDRD) equation or the chronic kidney disease epidemiology collaboration (CKD-EPI) equation, was incorporated into the MELD score after adjustment with the original MELD equation by logistic regression analysis [bilirubin and international normalized ratio (INR) were set at 1.0 for values less than 1.0].
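For orientation, a minimal Python sketch of the creatinine-based MELD calculation with the lower bounds described above. The coefficients are those commonly cited for the original formulation (9.57 ln creatinine + 3.78 ln bilirubin + 11.2 ln INR + 6.43, with creatinine and bilirubin in mg/dL). The creatinine floor of 1.0 and cap of 4.0 are an assumption borrowed from common allocation practice; the text states a floor only for bilirubin and INR. The refit eGFR-based equations of this study are not reproduced, since their coefficients are not given here, and the function name is hypothetical.

```python
import math


def meld_score(bilirubin_mg_dl, inr, creatinine_mg_dl):
    """Creatinine-based MELD with values below 1.0 raised to 1.0 before taking logs."""
    bilirubin = max(bilirubin_mg_dl, 1.0)  # floor stated in the text
    inr = max(inr, 1.0)                    # floor stated in the text
    # Assumption: creatinine bounded to [1.0, 4.0], as in common allocation practice.
    creatinine = min(max(creatinine_mg_dl, 1.0), 4.0)
    return (
        3.78 * math.log(bilirubin)
        + 11.2 * math.log(inr)
        + 9.57 * math.log(creatinine)
        + 6.43
    )


print(round(meld_score(2.0, 1.5, 1.2), 1))  # 15.3
print(round(meld_score(0.5, 0.8, 0.7), 2))  # 6.43 (all inputs floored to 1.0)
```

Flooring at 1.0 keeps each logarithm non-negative, so laboratory values in the normal range cannot push the score below its constant term.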
RESULTS: Patients defined as survivors were significantly younger, had a lower incidence of hepatoma, lower Child-Pugh and MELD scores, and better renal function. The underlying causes of cirrhosis were very different from those in Western countries. In Taiwan, most cases of cirrhosis were associated with hepatitis virus infection, especially hepatitis B. Sixteen parameters were included in univariate logistic regression analysis to predict in-hospital mortality, and those with significant predictive value were included in further multivariate analysis. Both 4-variable MDRD eGFR and 6-variable MDRD eGFR, rather than creatinine, were significant predictors of in-hospital mortality. Three new equations were constructed (MELD-MDRD-4, MELD-MDRD-6, MELD-CKD-EPI). As expected, original MELD score was a significant predictor of in-hospital mortality (odds ratio = 1.25, P < 0.001). MELD-MDRD-4 excluded serum creatinine, with the coefficients refit among the remaining 3 variables, i.e., total bilirubin, INR and 4-variable MDRD eGFR. This model performed worse than the MELD score, as suggested by a decrease in chi-square (2161.45 vs 2198.32) and an increase in -2 log (likelihood) (2810.77 vs 2773.90). MELD-MDRD-6 included 6-variable MDRD eGFR as one of the variables and showed an improvement over MELD score, as suggested by an increase in chi-square (2293.82 vs 2198.32) and a decrease in -2 log (likelihood) (2810.77 vs 2664.79). Finally, when serum creatinine was replaced by CKD-EPI eGFR, it showed a slight improvement compared to the original MELD score (chi-square: 2199.16, -2 log (likelihood): 2773.07). In receiver-operating characteristic curve analysis, the MELD-MDRD-6 score showed a marginal improvement in area under the curve (0.909 vs 0.902) and sensitivity (0.854 vs 0.819), with a slightly lower specificity (0.818 vs 0.839), compared to the original MELD equation.
When patients were stratified by eGFR, the MELD-MDRD-6 equation showed a better predictive value in those with eGFR ≥ 90, 60-89, 30-59 and 15-29.
CONCLUSION: Incorporating eGFR obtained by the 6-variable MDRD equation into the MELD score showed an equal predictive performance in in-hospital mortality compared to a creatinine-based MELD score.
Liver cirrhosis; Estimated glomerular filtration rate; End-stage liver disease; Modification of diet in renal disease; Renal function
The modification of the Model for End-Stage Liver Disease (MELD) scoring system (Refit MELD) and the modification of MELD-Na (Refit MELDNa), which optimized the MELD coefficients, were published in 2011. We aimed to validate the superiority of the Refit MELDNa over the Refit MELD for the prediction of 3-month mortality in Korean patients with cirrhosis and ascites.
We reviewed the medical records of patients admitted with hepatic cirrhosis and ascites to the Konkuk University Hospital between January 2006 and December 2011. The Refit MELD and Refit MELDNa were compared using the predictive value of the 3-month mortality, as assessed by the Child-Pugh score.
In total, 530 patients were enrolled, 87 of whom died within 3 months. Alcohol was the most common etiology of their cirrhosis (n=271, 51.1%), and the most common cause of death was variceal bleeding (n=20, 23%). The areas under the receiver operating characteristic curve (AUROCs) for the Child-Pugh, Refit MELD, and Refit MELDNa scores were 0.754, 0.791, and 0.764, respectively; the corresponding values when the analysis was performed only in patients with persistent ascites (n=115) were 0.725, 0.804, and 0.796, respectively. The only significant difference found among the Child-Pugh, Refit MELD, and Refit MELDNa scores was between the Child-Pugh score and Refit MELD in patients with persistent ascites (P=0.039).
Refit MELD and Refit MELDNa exhibited good predictability for 3-month mortality in patients with cirrhosis and ascites. However, Refit MELDNa was not found to be a better predictor than Refit MELD, despite the known relationship between hyponatremia and mortality in cirrhotic patients with ascites.
Stage Liver Disease; Liver Cirrhosis; Ascites; Mortality; Hyponatremia
Risk assessment is an important part of emergency patient care. Risk assessment tools based on biochemical data have the advantage that calculation can be automated and results can be easily provided. However, to be used clinically, existing tools have to be validated by independent researchers. This study involved an independent external validation of four risk stratification systems predicting death that rely primarily on biochemical variables.
Prospective observational study.
The medical admission unit at a regional teaching hospital in Denmark.
Of 5894 adult (age 15 or above) acutely admitted medical patients, 205 (3.5%) died during admission and 46 died (0.8%) within one calendar day.
Main outcome measures
The main outcome measure was the ability to identify patients at an increased risk of dying (discriminatory power) as area under the receiver-operating characteristic curve (AUROC) and the accuracy of the predicted probability (calibration) using the Hosmer-Lemeshow goodness-of-fit test. The endpoint was all-cause mortality, defined in accordance with the original manuscripts.
Using the original coefficients, all four systems were excellent at identifying patients at increased risk (discriminatory power, AUROC ≥0.80). The accuracy was poor (we could assess calibration for two systems, which failed). After recalculation of the coefficients, two systems had improved discriminatory power and two remained unchanged. Calibration failed for one system in the validation cohort.
Four biochemical risk stratification systems can risk-stratify the acutely admitted medical patients for mortality with excellent discriminatory power. We could improve the models for use in our setting by recalculating the risk coefficient for the chosen variables.
Epidemiology; Internal Medicine
AIM: To derive and validate a score for the prediction of mid-term bleeding events following discharge for myocardial infarction (MI).
METHODS: One thousand and fifty patients admitted for MI and followed for 19.9 ± 6.7 mo were assigned to a derivation cohort. A new risk model, called BLEED-MI, was developed for predicting clinically significant bleeding events during follow-up (primary endpoint) and a composite endpoint of significant hemorrhage plus all-cause mortality (secondary endpoint), incorporating the following variables: age, diabetes mellitus, arterial hypertension, smoking habits, blood urea nitrogen, glomerular filtration rate and hemoglobin at admission, history of stroke, bleeding during hospitalization or previous major bleeding, heart failure during hospitalization and anti-thrombotic therapies prescribed at discharge. The BLEED-MI model was tested for calibration, accuracy and discrimination in the derivation sample and in a new, independent, validation cohort comprising 852 patients admitted at a later date.
RESULTS: The BLEED-MI score showed good calibration in both derivation and validation samples (Hosmer-Lemeshow test P value 0.371 and 0.444, respectively) and high accuracy within each individual patient (Brier score 0.061 and 0.067, respectively). Its discriminative performance in predicting the primary outcome was relatively high (c-statistic of 0.753 ± 0.032 in the derivation cohort and 0.718 ± 0.033 in the validation sample). Incidence of primary/secondary endpoints increased progressively with increasing BLEED-MI scores. In the validation sample, a BLEED-MI score below 2 had a negative predictive value of 98.7% (152/154) for the occurrence of a clinically significant hemorrhagic episode during follow-up and for the composite endpoint of post-discharge hemorrhage plus all-cause mortality. An accurate prediction of bleeding events was shown independently of mortality, as BLEED-MI predicted bleeding with similar efficacy in patients who did not die during follow-up: Area Under the Curve 0.703, Hosmer-Lemeshow test P value 0.547, Brier score 0.060; low-risk (BLEED-MI score 0-3) event rate: 1.2%; intermediate risk (score 4-6) event rate: 5.6%; high risk (score ≥ 7) event rate: 12.5%.
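Two of the measures quoted above are simple enough to reproduce by hand. The Python sketch below computes a Brier score (mean squared difference between predicted probability and observed outcome) on invented toy predictions, and recomputes the negative predictive value from the 152/154 figure given in the text. The function names and the toy predictions are assumptions for illustration.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes (lower is better)."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def negative_predictive_value(true_negatives, false_negatives):
    """Share of patients classified as low risk who remained event-free."""
    return true_negatives / (true_negatives + false_negatives)


# Toy predictions (assumption), not study data.
print(round(brier_score([0.1, 0.9, 0.2], [0, 1, 0]), 3))  # 0.02

# From the text: 152 of 154 patients with a score below 2 had no event.
print(round(negative_predictive_value(152, 2), 3))        # 0.987
```

Because the Brier score depends on event prevalence, values such as 0.061 are best interpreted against the score of a model that always predicts the baseline event rate.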
CONCLUSION: A new bedside prediction-scoring model for post-discharge mid-term bleeding has been derived and preliminarily validated. This is the first score designed to predict mid-term hemorrhagic risk in patients discharged following admission for acute MI. This model should be externally validated in larger cohorts of patients before its potential implementation.
Myocardial infarction; Bleeding; Prediction model; Risk stratification
Prognosis for patients with cirrhosis admitted to intensive care unit (ICU) is poor. ICU prognostic models are more accurate than liver-specific models. We identified predictors of mortality, developed a novel prognostic score (Royal Free Hospital (RFH) score), and tested it against established prognostic models and the yet unvalidated Chronic Liver Failure-Sequential Organ Failure Assessment (CLIF-SOFA) model.
Predictors of mortality were defined by logistic regression in a cohort of 635 consecutive patients with cirrhosis admitted to ICU (1989–2012). The RFH score was derived using a 75% training and 25% validation set. Predictive accuracy and calibration were evaluated using area under the receiver operating characteristic (AUROC) and goodness-of-fit χ2 for the RFH score, as well as for SOFA, Model for End-Stage Liver Disease (MELD), Acute Physiology and Chronic Health Evaluation (APACHE II), and Child-Pugh. CLIF-SOFA was applied to a recent subset (2005–2012) of patients.
In-hospital mortality was 52.3%. Mortality improved over time but with a corresponding reduction in acuity of illness on admission. Predictors of mortality in the training set, which constituted the RFH score, were the following: bilirubin, international normalized ratio, lactate, alveolar arterial partial pressure oxygen gradient, and urea, while variceal bleeding as indication for admission conferred lesser risk. Classification accuracy was 73.4% in the training and 76.7% in the validation sample and did not change significantly across different eras of admission. The AUROC for the derived model was 0.83 and the goodness-of-fit χ2 was 3.74 (P=0.88). AUROC for SOFA was 0.81, MELD was 0.79, APACHE II was 0.78, and Child-Pugh was 0.67. In the 2005–2012 cohort, AUROC was: SOFA: 0.74, CLIF-SOFA: 0.75, and RFH: 0.78. Goodness-of-fit χ2 was: SOFA: 6.21 (P=0.63), CLIF-SOFA: 9.18 (P=0.33), and RFH: 2.91 (P=0.94).
RFH score demonstrated good discriminative ability and calibration. Internal validation supports its generalizability. CLIF-SOFA did not perform better than RFH and the original SOFA. External validation of our model should be undertaken to confirm its clinical utility.
Background and Aims
Survival of patients with hepatocellular carcinoma (HCC) is determined by the extent of the tumor and the underlying liver function. We aimed to develop a survival model for HCC based on objective parameters including the Model for End stage Liver Disease (MELD) as a gauge of liver dysfunction.
This analysis is based on 477 patients with HCC seen at Mayo Clinic Rochester between 1994 and 2008 (derivation cohort) and 904 patients at the Korean National Cancer Center between 2000 and 2003 (validation cohort). Multivariable proportional hazards models and corresponding risk score were created based on baseline demographic, clinical, and tumor characteristics. Internal and external validation of the model was performed. Discrimination and calibration of this new model were compared against existing models including Barcelona Clinic Liver Cancer (BCLC), Cancer of the Liver Italian Program (CLIP), and Japan Integrated Staging (JIS) scores.
The majority of the patients had viral hepatitis as the underlying liver disease (100% in the derivation cohort and 85% in the validation cohort). The survival model incorporated MELD, age, number of tumor nodules, size of the largest nodule, vascular invasion, metastasis, serum albumin, and alpha-fetoprotein. In cross validation, the coefficients remained largely unchanged between iterations. Observed survival in the validation cohort matched closely with what was predicted by the model. The c-statistic for this model (0.77) was superior to that for BCLC (0.71), CLIP (0.70), or JIS (0.70). The score was able to further classify patient survival within each stage of the BCLC classification.
A new model to predict survival of HCC patients based on objective parameters provides refined prognostication and supplements the BCLC classification.
liver cancer; BCLC; staging
This study was designed to determine if existing methods of grading liver function that have been developed in non-Asian patients with cirrhosis can be used to predict mortality in Asian patients treated for refractory variceal hemorrhage by the use of the transjugular intrahepatic portosystemic shunt (TIPS) procedure.
Materials and Methods
Data for 107 consecutive patients who underwent an emergency TIPS procedure were retrospectively analyzed. Acute physiology and chronic health evaluation (APACHE II), Child-Pugh and model for end-stage liver disease (MELD) scores were calculated. Survival analyses were performed to evaluate the ability of the various models to predict 30-day, 60-day and 360-day mortality. The ability of stratified APACHE II, Child-Pugh, and MELD scores to predict survival was assessed by the use of Kaplan-Meier analysis with the log-rank test.
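The Kaplan-Meier estimate used in this kind of survival analysis can be sketched in a few lines of Python. At each time with at least one death, the running survival probability is multiplied by (1 - deaths / patients still at risk); censored patients leave the risk set without lowering the curve. The follow-up times and event flags below are invented, the function name is hypothetical, and the log-rank comparison between strata is not implemented.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct death time.

    events: 1 = death observed, 0 = censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy follow-up in days (assumption): three deaths, two censored patients.
toy_times = [5, 10, 10, 20, 30]
toy_events = [1, 1, 0, 1, 0]
for day, s in kaplan_meier(toy_times, toy_events):
    print(day, round(s, 2))  # 5 0.8, then 10 0.6, then 20 0.3
```

Applying the function separately to patients above and below a score threshold gives the stratified curves that a log-rank test would then compare.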
No patient died during the TIPS procedure, but 82 patients died during the follow-up period. Thirty patients died within 30 days after the TIPS procedure; 37 patients died within 60 days and 53 patients died within 360 days. Univariate analysis indicated that hepatorenal syndrome, use of inotropic agents and mechanical ventilation were associated with elevated 30-day mortality (p < 0.05). Multivariate analysis showed that a Child-Pugh score > 11 or an MELD score > 20 predicted increased risk of death at 30, 60 and 360 days (p < 0.05). APACHE II scores could only predict mortality at 360 days (p < 0.05).
A Child-Pugh score > 11 or an MELD score > 20 is predictive of mortality in Asian patients with refractory variceal hemorrhage treated with the TIPS procedure. An APACHE II score is not predictive of early mortality in this patient population.
Hepatitis; Viral cirrhosis; Mortality; Prognosis; Transjugular intrahepatic portosystemic shunt (TIPS)
AIM: The Model for End-stage Liver Disease (MELD) score has recently gained wide acceptance over the old Child-Pugh score in predicting survival in patients with decompensated cirrhosis, although it is more sophisticated. We compared the predictive values of MELD, Child-Pugh and creatinine-modified Child-Pugh scores in decompensated cirrhosis.
METHODS: A cohort of 102 patients with decompensated cirrhosis followed up for a median of 6 mo was studied. Two types of modified Child-Pugh scores, estimated by adding 0-4 points to the original score using creatinine levels as a sixth categorical variable, were evaluated.
RESULTS: The areas under the receiver operating characteristic curves did not differ significantly among the four scores, but none had excellent diagnostic accuracy (areas: 0.71-0.79). Child-Pugh score appeared to be the worst, while the accuracy of MELD was almost identical with that of modified Child-Pugh in predicting short-term and slightly better in predicting medium-term survival. In Cox regression analysis, all four scores were significantly associated with survival, while MELD and creatinine-modified Child-Pugh scores had better predictive values (c-statistics: 0.73 and 0.69-0.70) than Child-Pugh score (c-statistics: 0.65). Adjustment for gamma-glutamyl transpeptidase levels increased the predictive values of all systems (c-statistics: 0.77-0.81). Analysis of the expected and observed survival curves in patient subgroups according to their prognosis showed that all models fit the data reasonably well, with MELD probably discriminating the subgroups with worse prognosis better.
CONCLUSION: MELD compared to the old Child-Pugh and particularly to creatinine-modified Child-Pugh scores does not appear to offer a clear advantage in predicting survival in patients with decompensated cirrhosis in daily clinical practice.
Child-Pugh; MELD; Cirrhosis; Decompensated cirrhosis
Acute heart failure syndrome (AHFS) is a major cause of hospitalisation and imparts a substantial burden on patients and healthcare systems. Tools to define risk of AHFS hospitalisation are lacking.
A prospective cohort study (n=628) of patients with stable chronic heart failure (CHF) secondary to left ventricular systolic dysfunction was used to derive an AHFS prediction model which was then assessed in a prospectively recruited validation cohort (n=462).
Within the derivation cohort, 44 (7%) patients were hospitalised as a result of AHFS during 1 year of follow-up. Predictors of AHFS hospitalisation included furosemide equivalent dose, the presence of type 2 diabetes mellitus, AHFS hospitalisation within the previous year and pulmonary congestion on chest radiograph, all assessed at baseline. A multivariable model containing these four variables exhibited good calibration (Hosmer–Lemeshow p=0.38) and discrimination (C-statistic 0.77; 95% CI 0.71 to 0.84). Using a 2.5% risk cut-off for predicted AHFS, the model defined 38.5% of patients as low risk, with negative predictive value of 99.1%; this low risk cohort exhibited <1% excess all-cause mortality per annum when compared with contemporaneous actuarial data. Within the validation cohort, an identically applied model derived comparable performance parameters (C-statistic 0.81 (95% CI 0.74 to 0.87), Hosmer–Lemeshow p=0.15, negative predictive value 100%).
A prospectively derived and validated model using simply obtained clinical data can identify patients with CHF at low risk of hospitalisation due to AHFS in the year following assessment. This may guide the design of future strategies allocating resources to the management of CHF.
Quality improvement initiatives in cardiac surgery largely rely on risk prediction models. Most often, these models include isolated populations and describe isolated end-points. However, with the changing clinical profile of cardiac surgical patients, mixed-population models are required to accurately represent the majority of the surgical population. Also, composite model end-points of morbidity and mortality better reflect outcomes experienced by patients.
The model development cohort included 4,270 patients who underwent aortic or mitral valve replacement, or mitral valve repair with/without coronary artery bypass grafting, or isolated coronary artery bypass grafting. A composite end-point of infection, stroke, acute renal failure, or death was evaluated. Age, sex, surgical priority, and procedure were forced, a priori, into the model and then stepwise selection of candidate variables was utilized. Model performance was evaluated by concordance statistic, Hosmer-Lemeshow Goodness of Fit, and calibration plots. Bootstrap technique was employed to validate the model.
The model included 16 variables. Several variables were significant, such as emergent surgical priority (OR 4.3; 95% CI 2.9-7.4), CABG + Valve procedure (OR 2.3; 95% CI 1.8-3.0), and frailty (OR 1.7; 95% CI 1.2-2.5), among others. The concordance statistic for the major adverse cardiac events model in a mixed population was 0.764 (95% CL 0.75-0.79), and the model had excellent calibration.
Development of predictive models with composite end-points and a mixed procedure population can yield robust statistical and clinical validity. Because they more accurately reflect the current cardiac surgical profile, models such as this are an essential tool in quality improvement efforts.
Cardiac surgery; Predictive model; Outcomes
In medical and surgical intensive care units, clinical risk prediction models for readmission have been developed; however, studies reporting the risks for cardiovascular intensive care unit (CVICU) readmission have been methodologically limited by small numbers of outcomes, unreported measures of calibration or discrimination, or a lack of information spanning the entire perioperative period. The purpose of this study was to derive and validate a clinical prediction model for CVICU readmission in cardiac surgical patients.
A total of 10,799 patients aged 18 years or older in the Alberta Provincial Project for Outcomes Assessment in Coronary Heart Disease (APPROACH) registry who underwent cardiac surgery (coronary artery bypass or valvular surgery) between 2004 and 2012 and were discharged alive from the first CVICU admission were included. The full cohort was used to derive the clinical prediction model, and the model was internally validated with bootstrapping. Discrimination and calibration were assessed using the AUC c index and the Hosmer-Lemeshow test, respectively.
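The Hosmer-Lemeshow test mentioned above can be sketched as follows. This is an illustrative version only: it assumes the conventional form with risk-ordered bins of equal size and groups minus 2 degrees of freedom, which the abstract does not specify. The helper names and the example data are hypothetical, and the p-value uses the closed-form chi-square tail that holds for even degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df."""
    if df < 2 or df % 2:
        raise ValueError("closed form requires an even df >= 2")
    half = x / 2
    term = total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def hosmer_lemeshow(predicted, observed, groups=10):
    """Return (statistic, p-value) over `groups` risk-ordered bins.
    Predicted risks are assumed to lie strictly between 0 and 1."""
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        events = sum(y for _, y in chunk)
        # Observed-vs-expected terms for events and for non-events.
        stat += (events - expected) ** 2 / expected
        stat += (events - expected) ** 2 / (size - expected)
    return stat, chi2_sf_even_df(stat, groups - 2)


# Hypothetical, perfectly calibrated data: 4 risk strata of 5 patients each.
predicted = [0.2] * 5 + [0.4] * 5 + [0.6] * 5 + [0.8] * 5
observed = [1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0]
hl_stat, hl_p = hosmer_lemeshow(predicted, observed, groups=4)
print(hl_stat, hl_p)
```

A large p-value, such as the P = 0.192 reported for the derivation cohort, indicates no detected departure between observed and expected event counts.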
A total of 479 (4.4%) patients required CVICU readmission. The mean CVICU length of stay (19.9 versus 3.3 days, P <0.001) and in-hospital mortality (14.4% versus 2.2%, P <0.001) were higher among patients readmitted to the CVICU. In the derivation cohort, a total of three preoperative (age ≥70, ejection fraction, chronic lung disease), two intraoperative (single valve repair or replacement plus non-CABG surgery, multivalve repair or replacement), and seven postoperative variables (cardiac arrest, pneumonia, pleural effusion, deep sternal wound infection, leg graft harvest site infection, gastrointestinal bleed, neurologic complications) were independently associated with CVICU readmission. The clinical prediction model had robust discrimination and calibration in the derivation cohort (AUC c index = 0.799; Hosmer-Lemeshow P = 0.192). The validation point estimates and confidence intervals were similar to those of the derivation model.
In a large population-based dataset incorporating a comprehensive set of perioperative variables, we have derived a clinical prediction model with excellent discrimination and calibration. This model identifies opportunities for targeted therapeutic interventions aimed at reducing CVICU readmissions in high-risk patients.
There are multiple clinical and radiographic factors that influence outcomes after endovascular reperfusion therapy (ERT) in acute ischemic stroke (AIS). We sought to derive and validate an outcome prediction score for AIS patients undergoing ERT based on readily available pretreatment and posttreatment factors.
The derivation cohort included 511 patients with anterior circulation AIS treated with ERT at 10 centers between September 2009 and July 2011. The prospective validation cohort included 223 patients with anterior circulation AIS treated in the North American Solitaire Acute Stroke registry. Multivariable logistic regression identified predictors of good outcome (modified Rankin score ≤2 at 3 months) in the derivation cohort; model β coefficients were used to assign points and calculate a risk score. Discrimination was tested using C statistics with 95% confidence intervals (CIs) in the derivation and validation cohorts. Calibration was assessed using the Hosmer-Lemeshow test and plots of observed to expected outcomes. We assessed the net reclassification improvement for the derived score compared to the Totaled Health Risks in Vascular Events (THRIVE) score. Subgroup analysis in patients with pretreatment Alberta Stroke Program Early CT Score (ASPECTS) and posttreatment final infarct volume measurements was also performed to identify whether these radiographic predictors improved the model compared to simpler models.
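The net reclassification improvement referred to above can be illustrated with a minimal two-category sketch. The abstract does not state which categories or thresholds were used, so the single 0.5 threshold, the function name, and the example probabilities below are all assumptions made for illustration.

```python
def net_reclassification_improvement(old_risk, new_risk, outcome, threshold=0.5):
    """Two-category NRI: net upward moves among patients with the outcome
    plus net downward moves among patients without it."""
    def category(p):
        return 1 if p >= threshold else 0

    up_e = down_e = n_e = 0
    up_n = down_n = n_n = 0
    for old, new, y in zip(old_risk, new_risk, outcome):
        move = category(new) - category(old)
        if y == 1:
            n_e += 1
            up_e += move > 0
            down_e += move < 0
        else:
            n_n += 1
            up_n += move > 0
            down_n += move < 0
    if n_e == 0 or n_n == 0:
        raise ValueError("need patients with and without the outcome")
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n


# Hypothetical predicted probabilities of the outcome from two scores.
old_score = [0.4, 0.6, 0.3, 0.7, 0.6, 0.2]
new_score = [0.7, 0.8, 0.4, 0.4, 0.3, 0.1]
had_outcome = [1, 1, 1, 0, 0, 0]
nri = net_reclassification_improvement(old_score, new_score, had_outcome)
print(round(nri, 3))
```

Positive values indicate that the new score moves patients into more appropriate categories on balance; the theoretical range is -2 to 2.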
Good outcome was noted in 186 (36.4%) and 100 patients (44.8%) in the derivation and validation cohorts, respectively. Combining readily available pretreatment and posttreatment variables, we created a score (acronym: SNARL) based on the following parameters: symptomatic hemorrhage [2 points: none, hemorrhagic infarction (HI)1–2 or parenchymal hematoma (PH) type 1; 0 points: PH2], baseline National Institutes of Health Stroke Scale score (3 points: 0–10; 1 point: 11–20; 0 points: >20), age (2 points: <60 years; 1 point: 60–79 years; 0 points: >79 years), reperfusion (3 points: Thrombolysis In Cerebral Ischemia score 2b or 3) and location of clot (1 point: M2; 0 points: M1 or internal carotid artery). The SNARL score demonstrated good discrimination in the derivation (C statistic 0.79, 95% CI 0.75–0.83) and validation cohorts (C statistic 0.74, 95% CI 0.68–0.81) and was superior to the THRIVE score (derivation cohort: C statistic 0.65, 95% CI 0.60–0.70; validation cohort: C-statistic 0.59, 95% CI 0.52–0.67; p < 0.01 in both cohorts) but was inferior to a score that included age, ASPECTS, reperfusion status and final infarct volume (C statistic 0.86, 95% CI 0.82–0.91; p = 0.04). Compared with the THRIVE score, the SNARL score resulted in a net reclassification improvement of 34.8%.
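The point assignments listed above can be collected into a small calculator. This is a sketch based only on the points stated in the abstract. The input encodings (strings for hemorrhage type, reperfusion grade and clot location) are assumptions, as is the reading that reperfusion grades other than 2b or 3 score 0 points.

```python
def snarl_score(hemorrhage, nihss, age, tici, clot_location):
    """Sum the SNARL points as listed in the abstract (higher = better).
    Input encodings are assumptions made for this sketch."""
    if hemorrhage in ("none", "HI1", "HI2", "PH1"):
        points = 2
    elif hemorrhage == "PH2":
        points = 0
    else:
        raise ValueError("unknown hemorrhage type")

    if nihss <= 10:
        points += 3
    elif nihss <= 20:
        points += 1

    if age < 60:
        points += 2
    elif age <= 79:
        points += 1

    # Assumed: any grade other than 2b or 3 contributes 0 points.
    if tici in ("2b", "3"):
        points += 3

    if clot_location == "M2":
        points += 1
    elif clot_location not in ("M1", "ICA"):
        raise ValueError("unknown clot location")
    return points


# Hypothetical patients at the two ends of the scale.
best = snarl_score("none", nihss=8, age=55, tici="3", clot_location="M2")
worst = snarl_score("PH2", nihss=22, age=85, tici="2a", clot_location="ICA")
print(best, worst)
```

Under these assumptions the score ranges from 0 to 11, with higher totals corresponding to a greater likelihood of good outcome.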
Among AIS patients treated with ERT, pretreatment scores such as the THRIVE score provide only fair prognostic information. Inclusion of posttreatment variables such as reperfusion and symptomatic hemorrhage greatly influences outcome and results in improved outcome prediction.
Prediction tools; Revascularization; Reperfusion
The model for end-stage liver disease (MELD) score is used to stratify candidates for liver transplantation based on objective measures of disease severity. MELD has been validated as a predictor of wait-list mortality in transplantation candidates and has been postulated as a predictor of post-transplant survival. The purpose of this study was to examine the predictive value of the pre-transplantation MELD score on post-transplant survival from relevant existing studies. A systematic review and critical appraisal was performed using Cochrane guidelines. PubMed, the Cochrane Library, Embase, and Web of Science were searched for articles published in the English language since 2005 using a structured search strategy. There were 3058 discrete citations identified and screened for possible inclusion. Any study examining the relationship between pre-transplant MELD and post-transplant survival in the general transplant population was included. Thirty-seven studies met these criteria and were included in the review. Studies were all case series that typically involved stratified analyses of survival by MELD. They represented 15 countries and a total of 53,691 patients. There was significant clinical heterogeneity in patient populations across studies, which precluded performance of a meta-analysis. In 15 studies, no statistically significant association between MELD and post-transplant survival was found. In the remaining 22, some association was found. Eleven studies also measured predictive ability with c-statistics. Values were below 0.7 in all but two studies, suggesting poor predictive value. In summary, while the majority of studies reported an association between pre-transplantation MELD score and post-transplant survival, they represented a low level of evidence. Therefore, their findings should be interpreted conservatively.
Acute kidney injury (AKI) risk prediction scores are an objective and transparent means to enable cohort enrichment in clinical trials or to risk stratify patients preoperatively. Existing scores are limited in that they were designed to predict only severe AKI or AKI by non-consensus definitions, and not the less severe stages of AKI, which also have prognostic significance. The aim of this study was to develop and validate novel risk scores that could identify all patients at risk of AKI.
Prospective routinely collected clinical data (n = 30,854) were obtained from 3 UK cardiac surgical centres (Bristol, Birmingham and Wolverhampton). AKI was defined as per the Kidney Disease: Improving Global Outcomes (KDIGO) Guidelines. The model was developed using the Bristol and Birmingham datasets, and externally validated using the Wolverhampton data. Model discrimination was estimated using the area under the ROC curve (AUC). Model calibration was assessed using the Hosmer–Lemeshow test and calibration plots. Diagnostic utility was also compared to existing scores.
The risk prediction score for any stage AKI (AUC = 0.74 (95% confidence intervals (CI) 0.72, 0.76)) demonstrated better discrimination compared to the Euroscore and the Cleveland Clinic Score, and equivalent discrimination to the Mehta and Ng scores. The any stage AKI score demonstrated better calibration than the four comparison scores. A stage 3 AKI risk prediction score also demonstrated good discrimination (AUC = 0.78 (95% CI 0.75, 0.80)) as did the four comparison risk scores, but stage 3 AKI scores were less well calibrated.
This is the first risk score that accurately identifies patients at risk of any stage AKI. This score will be useful in the perioperative management of high risk patients as well as in clinical trial design.
Deep sternal wound infection (DSWI) is a devastating complication of cardiac surgery, with a historical incidence of 0.4–5%. Predicting which patients are at higher risk of infection may help in instituting preventive measures. Risk calculations for mortality have been used as surrogates to estimate the risk of deep sternal wound infection, with limited success. The Society of Thoracic Surgeons (STS) 2008 Risk Calculator modelled the risk of DSWI for cardiac surgical patients, but it has not been validated since its publication. We sought to assess the external validity of the STS-estimated risk of DSWI in a United Kingdom (UK) population.
Using our prospectively captured database, we retrospectively calculated the risk of DSWI for 14 036 patients undergoing valve, coronary artery bypass graft or combined procedures between February 2001 and March 2010. DSWI was identified according to the Centers for Disease Control and Prevention definition. The receiver operating characteristic (ROC) curve was employed to test the performance of the model using the area under the ROC curve (AUROC). The calibration of the model was interrogated using the Hosmer–Lemeshow test for Goodness of Fit.
A total of 135 (0.95%) patients developed DSWI. Although there was a statistically significant difference in the calculated risk of patients who contracted DSWI (0.44% ± 0.01) vs those who did not (0.28% ± 0.00, P < 0.0001), the AUROC of 0.699 (95% confidence interval: 0.6522–0.7414) denoted a modest discriminatory power, with the Hosmer–Lemeshow Goodness of Fit statistic (P < 0.001) suggesting poor calibration. A risk-adjusted modifier improved the calibration (P = 0.08).
The STS risk calculator lacks adequate discriminatory power for estimating the isolated risk of developing deep sternal wound infection in a UK population. Its discrimination is similar to the tool's validation c-statistic, and the tool may have a place in an integrated calculator.
Statistics; Risk analysis/modelling; Complications; Sternum; Infection
AIM: To investigate the prognostic value of the model for end-stage liver disease (MELD) and of three new MELD-based models that incorporate serum sodium in patients with decompensated cirrhosis: the MELD with the incorporation of serum sodium (MELD-Na), the integrated MELD (iMELD), and the MELD to sodium (MESO) index.
METHODS: A total of 166 patients with decompensated cirrhosis were enrolled in the study. MELD, MELD-Na, iMELD and MESO scores were calculated for each patient according to the original formulas on the first day of admission. All patients were followed up for at least 1 year. The prognostic performance of the four models was assessed by the area under the receiver operating characteristic (ROC) curve (AUC). Kaplan-Meier survival curves were constructed using the cut-offs identified from the ROC curves.
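For readers unfamiliar with the Kaplan-Meier method named above, a minimal product-limit sketch follows. The follow-up times and event indicators are hypothetical and do not come from the study. Patients censored at a given time are treated as still at risk at that time, which is the usual convention.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.
    events: 1 = death observed, 0 = censored at that time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up in months for one score stratum.
months = [2, 3, 3, 5, 8, 12]
died = [1, 1, 0, 1, 0, 0]
km_curve = kaplan_meier(months, died)
for t, s in km_curve:
    print(t, round(s, 3))
```

In practice one such curve would be estimated for each group defined by a score cut-off, and the curves would then be compared.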
RESULTS: Of the 166 patients, 38 died within 3 mo; compared with the survivors, they had significantly higher MELD-Na (28.84 ± 2.43 vs 14.72 ± 0.60), iMELD (49.04 ± 1.72 vs 35.52 ± 0.67), and MESO scores (1.59 ± 0.82 vs 0.99 ± 0.42) (P < 0.001). Seventy-five patients died within 1 year, with markedly higher MELD-Na (23.01 ± 1.51 vs 13.78 ± 0.69), iMELD (44.06 ± 1.19 vs 34.12 ± 0.69), and MESO scores (1.37 ± 0.70 vs 0.93 ± 0.40) than the survivors (P < 0.001). At 3 mo of enrollment, the iMELD had the highest AUC (0.841), followed by the MELD-Na (0.766) and MESO (0.723), all larger than MELD (0.773). At 1 year, the iMELD still had the highest AUC (0.783), and the difference between the iMELD and MELD was statistically significant (P < 0.05). Survival curves showed that all three new models clearly discriminated between patients who survived and those who died in the short term as well as the intermediate term (P < 0.001).
CONCLUSION: The three new models that incorporate serum sodium (MELD-Na, iMELD, MESO) can accurately predict the short-term and intermediate-term prognosis of patients with decompensated cirrhosis, and may enhance the prognostic accuracy of MELD. The iMELD is the better prognostic model for outcome prediction in patients with decompensated cirrhosis.
Cirrhosis; Model for end-stage liver disease; Serum sodium; Prognosis; Survival time
Thrombolysis In Myocardial Infarction (TIMI), Platelet Glycoprotein IIb/IIIa in Unstable Angina: Receptor Suppression Using Integrilin (PURSUIT) and Global Registry of Acute Coronary Events (GRACE) scores have been developed for risk stratification in myocardial infarction (MI). The latter is the most validated score, yet active research is ongoing for improving prognostication in MI.
Derivation and validation of a new model for intrahospital, post-discharge and combined/total all-cause mortality prediction – ACHTUNG-Rule – and comparison with the GRACE algorithm.
1091 patients admitted for MI (age 68.4 ± 13.5, 63.2% males, 41.8% acute MI with ST-segment elevation (STEMI)) and followed for 19.7 ± 6.4 months were assigned to a derivation sample. 400 patients admitted at a later date at our institution (age 68.3 ± 13.4, 62.7% males, 38.8% STEMI) and followed for a period of 7.2 ± 4.0 months were assigned to a validation sample. Three versions of the ACHTUNG-Rule were developed for the prediction of intrahospital, post-discharge and combined (intrahospital plus post-discharge) all-cause mortality. All models were evaluated for their predictive performance using the area under the receiver operating characteristic (ROC) curve, calibration through the Hosmer–Lemeshow test and predictive utility within each individual patient through the Brier score. Comparison through ROC curve analysis and measures of risk reclassification – net reclassification improvement index (NRI) or Integrated Discrimination Improvement (IDI) – was performed between the ACHTUNG versions for intrahospital, post-discharge and combined mortality prediction and the equivalent GRACE score versions for intrahospital (GRACE-IH), post-discharge (GRACE-6PD) and post-admission 6-month mortality (GRACE-6).
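The Brier score mentioned above is the mean squared difference between each predicted probability and the observed 0/1 outcome. A minimal sketch follows; the function name and the example predictions are hypothetical and are not taken from the study.

```python
def brier_score(predicted, outcome):
    """Mean squared error between predicted probabilities and 0/1 outcomes;
    0 is perfect, and lower values indicate better predictions."""
    if len(predicted) != len(outcome) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(predicted, outcome)) / len(predicted)


# Hypothetical predicted mortality risks and observed deaths (1 = died).
pred = [0.9, 0.2, 0.7, 0.1]
died = [1, 0, 0, 0]
brier = brier_score(pred, died)
print(round(brier, 4))

# Reference point: predicting 0.5 for every patient always scores 0.25.
uninformative = brier_score([0.5] * 4, died)
print(uninformative)
```

Because it penalises each patient's prediction individually, the Brier score reflects both discrimination and calibration in a single number.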
Assessment of calibration and overall performance of the ACHTUNG-Rule demonstrated a good fit (p value for the Hosmer–Lemeshow goodness-of-fit test of 0.258, 0.101 and 0.550 for ACHTUNG-IH, ACHTUNG-T and ACHTUNG-R, respectively) and high discriminatory power in the validation cohort for all the primary endpoints (intrahospital mortality: AUC ACHTUNG-IH 0.886 ± 0.035 vs. AUC GRACE-IH 0.906 ± 0.026; post-discharge mortality: AUC ACHTUNG-R 0.827 ± 0.036 vs. AUC GRACE-6PD 0.811 ± 0.034; combined/total mortality: AUC ACHTUNG-T 0.831 ± 0.028 vs. AUC GRACE-6 0.815 ± 0.033). Furthermore, all versions of the ACHTUNG-Rule accurately reclassified a significant number of patients in different, more appropriate, risk categories (NRI ACHTUNG-IH 17.1%, p (2-sided) = 0.0021; NRI ACHTUNG-R 22.0%, p = 0.0002; NRI ACHTUNG-T 18.6%, p = 0.0012). The prognostic performance of the ACHTUNG-Rule was similar in both derivation and validation samples.
All versions of the ACHTUNG-Rule have shown excellent discriminative power and good calibration for predicting intrahospital, post-discharge and combined in-hospital plus post-discharge mortality. The ACHTUNG version for intrahospital mortality prediction was not inferior to its equivalent GRACE model, and ACHTUNG versions for post-discharge and combined/total mortality demonstrated apparent superiority. External validation in wider, independent, preferably multicentre, registries is warranted before its potential clinical implementation.
Myocardial infarction; prognosis; risk assessment; GRACE risk score
Frequent attenders are patients who visit their general practitioner exceptionally frequently. Frequent attendance is usually transitory, but some frequent attenders become persistent. Clinically, prediction of persistent frequent attendance is useful to target treatment at underlying diseases or problems. Scientifically it is useful for the selection of high-risk populations for trials. We previously developed a model to predict which frequent attenders become persistent.
To validate an existing prediction model for persistent frequent attendance that uses information solely from General Practitioners’ electronic medical records.
We applied the existing model (N = 3,045, 2003–2005) to a later time frame (2009–2011) in the original derivation network (N = 4,032, temporal validation) and to patients of another network (SMILE; 2007–2009, N = 5,462, temporal and geographical validation). Model improvement was studied by adding three new predictors (presence of medically unexplained problems, prescriptions of psychoactive drugs and antibiotics). Finally, we derived a model on the three data sets combined (N = 12,539). We expressed discrimination using histograms of the predicted values and the concordance-statistic (c-statistic) and calibration using the calibration slope (1 = ideal) and Hosmer-Lemeshow tests.
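The calibration slope used above can be sketched as a logistic regression of the observed outcome on the logit of the predicted probability. The version below fits an intercept and a slope by Newton-Raphson using only the standard library. The function name, the iteration settings, and the example data are assumptions for illustration; the study's own software and settings are not stated in the abstract.

```python
import math


def calibration_slope(predicted, outcome, iterations=25):
    """Fit logit(P(y = 1)) = a + b * logit(predicted); return (a, b).
    A slope b of 1 is ideal; b < 1 suggests predictions are too extreme."""
    x = [math.log(p / (1 - p)) for p in predicted]
    a, b = 0.0, 1.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, outcome):
            mu = 1 / (1 + math.exp(-(a + b * xi)))
            w = mu * (1 - mu)
            g0 += yi - mu            # score for the intercept
            g1 += (yi - mu) * xi     # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < 1e-12:
            break
    return a, b


# Hypothetical over-confident model: predicts 90% and 10%,
# but the observed event rates are 75% and 25%.
predicted = [0.9] * 4 + [0.1] * 4
observed = [1, 1, 1, 0, 1, 0, 0, 0]
intercept, slope = calibration_slope(predicted, observed)
print(round(intercept, 3), round(slope, 3))
```

In this constructed example the fitted slope is 0.5, the pattern expected when a model's predictions are more extreme than the observed event rates.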
The existing model (c-statistic 0.67) discriminated moderately, with predicted values between 7.5 and 50 percent and c-statistics of 0.62 and 0.63 for validation in the original network and the SMILE network, respectively. Calibration (0.99 originally) was better in SMILE than in the original network (slopes 0.84 and 0.65, respectively). Adding information on the three new predictors did not meaningfully improve the model (c-statistics 0.64 and 0.63, respectively). Performance of the model based on the combined data was similar (c-statistic 0.65).
This external validation study showed that persistent frequent attenders can be prospectively identified moderately well using data solely from patients’ electronic medical records.
Background & Aims
Cirrhotics undergoing transjugular intrahepatic portosystemic shunt (TIPS) for refractory ascites or recurrent variceal bleeding are at risk for decompensation and death. This study examined whether a new model for end-stage liver disease (MELDNa), which incorporates serum sodium, is a better predictor of death or transplant after TIPS than the original MELD.
A total of 148 consecutive patients undergoing non-emergent TIPS for refractory ascites or recurrent variceal bleeding from 1997 to 2006 at a single center were evaluated retrospectively. Cox model analysis was performed with death or transplant within 6 months as the end point. The models were compared using Harrell's C index. Recursive partitioning determined the optimal MELDNa cut-off to maximize the risk-benefit ratio of TIPS.
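Harrell's C index extends the concordance idea to censored follow-up data. A minimal sketch follows, assuming that a pair is usable only when the patient with the shorter follow-up had an observed event, and that tied scores count as half. The function name, follow-up times, and scores are hypothetical.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's C: among usable pairs, the share in which the patient
    with the earlier observed event has the higher risk score."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a usable pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Hypothetical follow-up (months), events (1 = death or transplant), scores.
months = [2, 4, 6, 8, 10]
event = [1, 1, 0, 1, 0]
score = [22, 12, 18, 16, 9]
c_index = harrell_c(months, event, score)
print(c_index)
```

Comparing the index obtained with each score on the same patients gives the kind of head-to-head comparison described for MELDNa and MELD.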
The predictive ability of MELDNa was superior to MELD, particularly in patients with low MELD scores. The C indices (95% CI) for MELDNa and MELD were 0.65 (0.55, 0.71) and 0.58 (0.51, 0.67) using a cut-off score of 18, and 0.72 (0.60, 0.85) and 0.62 (0.49, 0.74) using a cut-off score of 15. Using a MELDNa > 15, 22% of patients were reclassified to a higher risk with an event rate of 44% compared to 10% when the score was ≤ 15.
MELDNa performed better than MELD in predicting death or transplant after TIPS, especially in patients with low MELD scores. In cirrhotics undergoing non-emergent TIPS, a MELD score ≤ 18 can provide a false positive prognosis; a MELDNa score ≤ 15 provides a more accurate risk prediction.