Objectives To develop and validate a delirium prediction model for adult intensive care patients and determine its additional value compared with prediction by caregivers.
Design Observational multicentre study.
Setting Five intensive care units in the Netherlands (two university hospitals and three university affiliated teaching hospitals).
Participants 3056 intensive care patients aged 18 years or over.
Main outcome measure Development of delirium (defined as at least one positive delirium screening) during patients’ stay in intensive care.
Results The model was developed using 1613 consecutive intensive care patients in one hospital and temporally validated using 549 patients from the same hospital. For external validation, data were collected from 894 patients in four other hospitals. The prediction (PRE-DELIRIC) model contains 10 risk factors—age, APACHE-II score, admission group, coma, infection, metabolic acidosis, use of sedatives and morphine, urea concentration, and urgent admission. The model had an area under the receiver operating characteristics curve of 0.87 (95% confidence interval 0.85 to 0.89) and 0.86 after bootstrapping. Temporal validation and external validation resulted in areas under the curve of 0.89 (0.86 to 0.92) and 0.84 (0.82 to 0.87). The pooled area under the receiver operating characteristics curve (n=3056) was 0.85 (0.84 to 0.87). The area under the curve for nurses’ and physicians’ predictions (n=124) was significantly lower at 0.59 (0.49 to 0.70) for both.
Conclusion The PRE-DELIRIC model for intensive care patients consists of 10 risk factors that are readily available within 24 hours after intensive care admission and has a high predictive value. Clinical prediction by nurses and physicians performed significantly worse. The model allows for early prediction of delirium and initiation of preventive measures.
Trial registration Clinical trials NCT00604773 (development study) and NCT00961389 (validation study).
Delirium, characterised by an acute onset of fluctuating changes in mental status and changed levels of consciousness and inattentiveness,1 has a high incidence rate in critically ill patients.2 3 4 It is a serious disorder associated with prolonged stays in intensive care units and hospitals, higher costs, and increased morbidity and mortality.2 3 5
Several tools are available for assessing delirium in intensive care patients, of which the confusion assessment method—intensive care unit (CAM-ICU) has the highest sensitivity and specificity.6 7 Screening intensive care patients is important,8 9 10 so that timely treatment can be provided. However, preventive measures for delirium may also reduce its incidence, severity, and duration, as determined in other groups of patients.11 12 General preventive measures in all intensive care patients are time consuming and may expose a substantial number of patients to unnecessary risks such as the adverse effects of drug prophylaxis. Although several predictive models for non-intensive care patients exist,13 14 as well as one for older medical intensive care patients,15 no evidence based prediction model for general intensive care patients is available. The aim of our study was to develop and validate a delirium prediction model for intensive care patients and to determine its value compared with prediction by the attending nurses and physicians.
This was an observational multicentre study in which we firstly developed the PREdiction of DELIRium for Intensive Care patients (PRE-DELIRIC) model and then temporally validated it in a second prospective cohort in the same hospital. We then validated the model externally in four other Dutch hospitals.
To develop the prediction model, we did a prospective cohort study in the Radboud University Nijmegen Medical Centre in the Netherlands. This study took place between 1 February 2008 and 1 February 2009. We did a second prospective cohort study in the same hospital for temporal validation of the model between 1 May and 1 September 2009.16
After development and temporal validation, we externally validated the delirium prediction model with data from intensive care patients admitted to four other Dutch hospitals between 1 January and 1 September 2009. One of these hospitals was a university hospital, and three were university affiliated teaching hospitals; all had mixed intensive care populations (table 1). In these hospitals, trained intensive care nurses used the CAM-ICU at least twice daily.
To compare the predictive value of the model with that of the caregivers, we asked intensive care nurses and physicians caring for the patient to predict independently, within 24 hours after admission to intensive care, if patients would develop a delirious period during their complete stay in intensive care.
After the successful implementation of the validated Dutch version of the CAM-ICU,17 the inter-rater reliability of the delirium screenings by the intensive care nurses was above 0.80 (Cohen’s κ), with a compliance rate of more than 90%, as described in more detail previously.10 During the development and temporal validation studies, we included all adult patients admitted to the intensive care unit. To detect delirium, intensive care nurses screened all consecutive adult intensive care patients at least three times daily, and more often if required (for example, after sudden changes in behaviour, attention, or consciousness). This frequency of screening was in accordance with screening in daily practice. We excluded patients who were delirious within 24 hours after admission to intensive care, had a sustained Richmond agitation sedation score (RASS) of −4 or −5 during the complete intensive care admission, stayed on the intensive care unit for less than one day, had serious auditory or visual disorders, were unable to understand Dutch, were severely mentally disabled, or had serious receptive aphasia. We also excluded patients for whom the compliance rate of the delirium screening was less than 80% during their stay in the intensive care unit.
To meet the same inclusion and exclusion criteria during the external validation study, we used consecutive patients with complete CAM-ICU screenings, defined as a CAM-ICU compliance rate above 80% per patient. We defined patients as having delirium when they had at least one positive CAM-ICU screening during their intensive care stay or were treated with haloperidol, as in these hospitals haloperidol is used only for treatment of delirium. To examine the predictive value of the PRE-DELIRIC model in daily practice in these hospitals, we did no compliance and inter-rater reliability measurements and used only data from CAM-ICU screenings as done in normal daily practice.
We collected demographic variables and information on potential risk factors identified by a recent systematic review.18 We collected these data electronically within 24 hours of admission to intensive care (web appendix A). In addition, we included variables from the Dutch national intensive care evaluation database as potential risk factors when the delirium incidence rate associated with that variable was more than 50% higher than the incidence rate of the total group (web appendix B).19 Wherever possible, we collected the risk factors as continuous variables; otherwise, we collected them as categorical or dichotomised variables.
In view of our aim to develop and validate a delirium prediction model, the main outcome measure was development of delirium during patients’ stay in the intensive care unit. We defined delirium as a minimum of one positive CAM-ICU screening during each patient’s intensive care stay. In addition, we screened patients’ medical and nursing files daily for signs of delirium.20 If the files provided signs of delirium without a positive CAM-ICU screening or, conversely, if the files did not provide evidence of delirium and the patient had a positive CAM-ICU result, a delirium expert additionally screened patients according to the Diagnostic and Statistical Manual of Mental Disorders, fourth edition (DSM-IV) criteria to rule out false negatives and false positives.1
We monitored the performance of CAM-ICU screening to ensure the quality of data collection. We calculated compliance as the percentage of assessments done each day in relation to the total number of assessments that should have been done. The mean compliance during the development and temporal validation studies was 90.4%. We measured the quality of CAM-ICU performance as the inter-rater reliability. For this, we compared the CAM-ICU screening assessed by the attending intensive care nurse with the CAM-ICU score assessed by an expert psychiatry nurse within a time window of one hour. We did 120 inter-rater reliability measurements, resulting in a Cohen’s κ of 0.90 (95% confidence interval 0.82 to 0.98). The first author randomly double checked data from 15% of all patients included for completeness and accuracy.
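The two quality measures described above, compliance and inter-rater reliability, can be illustrated with a minimal Python sketch. The function names and the example ratings are hypothetical and are not taken from the study data; the sketch only shows how a compliance percentage and Cohen’s κ for two raters are conventionally computed.

```python
from collections import Counter


def compliance_rate(assessments_done, assessments_required):
    """Percentage of required screenings that were actually done."""
    if assessments_required <= 0:
        raise ValueError("assessments_required must be positive")
    return 100.0 * assessments_done / assessments_required


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who scored the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items with identical ratings.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per label.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical example: 1 = CAM-ICU positive, 0 = CAM-ICU negative.
bedside_nurse = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
expert_nurse = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
kappa = cohens_kappa(bedside_nurse, expert_nurse)
```

A confidence interval for κ, as reported in the text, would need an additional standard error estimate that this sketch does not cover.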
We calculated the sample size needed for the development of the model on the basis of the need for 10-15 delirious patients per risk factor plus 10% dropout. We imputed missing data for the risk factors. Values were missing in the development study for urea (0.7%), liver enzymes (3.0%), bilirubin (18.0%), calcium (4.5%), sodium (0.3%), haematocrit (0.4%), metabolic acidosis (1.0%), and acute physiology and chronic health evaluation (APACHE)-II scores (0.7%). Data for all other variables were complete. All data for the temporal validation study were complete. We decided that if a laboratory measurement was not determined we had no reason to assume that the missing variable had an abnormal value, and we imputed the mean normal value. To calculate the normal value, we selected all patients with a normal value and then calculated the mean value for this group of patients and used it for imputation. When the APACHE-II score was missing, we imputed the mean value for the delirium or non-delirium group, depending on the results of the CAM-ICU. In the external validation dataset, 6.3% of the urea values were missing and imputed. For APACHE-II, 0.6% of the scores were missing, and we imputed a mean APACHE-II score for the group in the external validation set.
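The imputation rule for missing laboratory values can be sketched as follows. This is an illustration under stated assumptions, not the authors’ code: the function name is hypothetical, and the reference range used in the example is a placeholder chosen only to make the arithmetic visible.

```python
def impute_mean_normal(values, normal_lower, normal_upper):
    """Replace missing values (None) with the mean of the observed normal values.

    'Normal' means inside [normal_lower, normal_upper]; the range is supplied
    by the caller and is an assumption of this sketch.
    """
    normal = [v for v in values
              if v is not None and normal_lower <= v <= normal_upper]
    if not normal:
        raise ValueError("no observed values inside the normal range")
    mean_normal = sum(normal) / len(normal)
    return [mean_normal if v is None else v for v in values]


# Hypothetical laboratory values with one missing measurement.
lab_values = [5.0, None, 12.0, 7.0]
imputed = impute_mean_normal(lab_values, normal_lower=2.5, normal_upper=7.5)
```

Only the two values inside the placeholder range contribute to the imputed mean; the abnormal value of 12.0 is kept as observed but does not influence the imputation.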
We used univariate logistic regression to develop the prediction model by assessing the association between each potential prognostic determinant and the presence or absence of delirium. We excluded determinants with a P value above 0.15 in univariate analysis or with a prevalence rate below 10%. With the remaining risk factors, we used multivariate logistic regression analysis with backward elimination (excluding risk factors with P values over 0.10) to evaluate the independent associations with the occurrence of delirium. The final model therefore contains independent risk factors for delirium. We estimated the prognostic ability of the model to discriminate between patients with and without delirium by using the area under the receiver operating characteristics curve (AUROC). We used bootstrapping techniques to adjust for overfitting—that is, for overly optimistic estimates of the regression coefficients of the risk factors in the final model. Two hundred random bootstrap samples resulted in shrunken regression coefficients of the risk factors and area under the curve of the developed model.21
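As an illustration of the discrimination measure used here, the AUROC can be read as the probability that a randomly chosen patient with delirium receives a higher predicted score than a randomly chosen patient without delirium, with ties counted as one half. The following Python sketch computes it by direct pairwise comparison. The function name and the example data are hypothetical, and the sketch does not reproduce the bootstrap shrinkage step, which was done with the rms package in R.

```python
def auroc(scores, outcomes):
    """Area under the ROC curve from pairwise comparisons.

    scores   -- predicted probabilities or linear predictors
    outcomes -- 1 for delirium, 0 for no delirium
    """
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one patient in each outcome group")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted risks and observed outcomes.
example_auc = auroc([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
```

The pairwise loop is quadratic in the number of patients, which is acceptable for an illustration; a rank-based formula gives the same value more efficiently for large cohorts.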
In both validation studies, we multiplied shrunken regression coefficients for each risk factor by the observed patients’ values. The outcome is a calculated predicted probability on which we built a new AUROC. Finally, to examine how well the model was calibrated, we calculated linear predictor values for each patient of every cohort by using the coefficients from the final development model. We used these linear predictors in a logistic regression model to test whether the prediction rule was well calibrated, resulting in a calibration slope and an intercept. A calibration slope of 1 and an intercept of 0 show a perfect calibration. Calibration plots for each cohort are available in web appendix D. We used SPSS 16.01 and R version 2.10.1,22 with the rms package,23 for all analyses.
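The calibration step can be sketched as a logistic regression of the observed outcome on the linear predictor, where the fitted coefficient is the calibration slope and the fitted constant is the calibration intercept. The Python sketch below fits this two-parameter model by Newton-Raphson. It is an assumption-laden illustration (function name, starting values, and example data are hypothetical) and is not the rms routine used in the study.

```python
import math


def calibration_fit(linear_predictors, outcomes, max_iter=50, tol=1e-10):
    """Fit outcome ~ intercept + slope * linear_predictor by logistic regression.

    Returns (intercept, slope). Assumes the data are not perfectly separated,
    otherwise the maximum likelihood estimate does not exist.
    """
    a, b = 0.0, 1.0  # start at perfect calibration
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(linear_predictors, outcomes):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g0 += y - p          # score for the intercept
            g1 += (y - p) * x    # score for the slope
            h00 += w             # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            raise ValueError("singular information matrix")
        # Newton step: solve the 2x2 system H * delta = g.
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) < tol and abs(db) < tol:
            break
    return a, b


# Hypothetical linear predictors and outcomes (not study data).
lp = [-2.0, -1.0, -1.0, 0.0, 0.0, 1.0, 1.0, 2.0]
y = [0, 0, 1, 0, 1, 0, 1, 1]
cal_intercept, cal_slope = calibration_fit(lp, y)
```

A slope below 1 suggests predictions that are too extreme, and a slope above 1 suggests predictions that are too conservative. Note that the intercept from this joint fit is not the same quantity as calibration-in-the-large estimated with the slope fixed at 1.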
In total, we screened 2116 consecutive patients and excluded 503 of them (fig 1). Of the remaining 1613 patients, 411 (25.5%) developed delirium. Table 2 shows patients’ characteristics, and web appendices B and C contain prevalence rates and delirium incidence rates for the separate risk factors. Of the 25 potential risk factors, we excluded alcohol misuse (7.8%), dementia (1.7%), use of an epidural catheter (2.2%), hyperamylasaemia (3.9%), hyponatraemia (5.8%), use of dopamine (0.2%), and use of lorazepam (0.7%) because of a prevalence rate below 10%. We excluded hypertension because of a P value above 0.15 in univariate logistic regression analysis. After multivariate logistic regression analysis with the remaining risk factors, we constructed the PRE-DELIRIC model, which consisted of 10 risk factors (table 3). The AUROC was 0.87 (95% confidence interval 0.85 to 0.89) and 0.86 after bootstrapping. Calibration of the model resulted in a calibration slope of 1.08 and an intercept of −0.06. The box shows the formula for the PRE-DELIRIC model.
The scoring system’s intercept is expressed as −6.31; the other numbers represent the shrunken regression coefficients (weight) of each risk factor.
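A logistic regression model of this kind converts the intercept plus the weighted sum of risk factor values into a probability through the logistic function. The Python sketch below shows that conversion in general terms. Only the intercept of −6.31 comes from the text; the function name is hypothetical, and the coefficient and value pairs in the example are placeholders, not the published PRE-DELIRIC weights, which are given in the box.

```python
import math

INTERCEPT = -6.31  # intercept stated in the text


def predicted_risk(weighted_terms, intercept=INTERCEPT):
    """Convert (coefficient, value) pairs into a predicted probability.

    weighted_terms -- iterable of (coefficient, patient_value) pairs; the
                      caller must supply the published coefficients.
    """
    linear_predictor = intercept + sum(c * v for c, v in weighted_terms)
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Placeholder terms for illustration only (NOT the published weights).
placeholder_terms = [(0.5, 4.0), (1.0, 1.0)]
risk = predicted_risk(placeholder_terms)

# With no risk factor contribution, the risk equals the baseline implied
# by the intercept alone.
baseline = predicted_risk([])
```

Multiplying the returned probability by 100 gives a percentage score of the kind used to define risk groups.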
In the prospective validation study, we screened 748 consecutive patients and excluded 199 of them (fig 1). Of the remaining 549 patients, 171 (31.1%) developed delirium (table 2). The temporal validation resulted in an AUROC of 0.89 (0.86 to 0.92). The calibration slope of the temporal model was 1.2, and the intercept was 0.22.
We used data from 894 non-selected intensive care patients (table 2) for external validation, resulting in an AUROC of 0.84 (0.82 to 0.87) with a calibration slope of 0.76 and an intercept of −0.59. The AUROCs of the four different hospitals did not differ from each other (data not shown). As no differences in prediction existed between the three studies, we pooled the data (n=3056), resulting in an AUROC of 0.85 (0.84 to 0.87) (fig 2). The pooled data resulted in an overall calibration slope of 0.93 with an intercept of −0.29, indicating good calibration.
We divided the complete group into four different risk groups: low, moderate, high, and very high risk, with PRE-DELIRIC scores of 0-20%, >20-40%, >40-60%, and >60%. Figure 2 shows the sensitivity, specificity, and likelihood ratios for each risk group. Figure 3 shows the calibration plot of the pooled data.
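The risk group boundaries and the accuracy measures named above can be sketched in Python as follows. The cut-off points are those given in the text; the function names and the two-by-two counts in the example are hypothetical and do not reproduce the study results.

```python
def risk_group(score_percent):
    """Map a PRE-DELIRIC score (0-100%) to the four risk groups in the text."""
    if not 0 <= score_percent <= 100:
        raise ValueError("score must be between 0 and 100")
    if score_percent <= 20:
        return "low"
    if score_percent <= 40:
        return "moderate"
    if score_percent <= 60:
        return "high"
    return "very high"


def diagnostic_summary(tp, fp, fn, tn):
    """Sensitivity, specificity, and likelihood ratios from a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Likelihood ratios are unbounded when a denominator is zero.
    lr_positive = (sensitivity / (1 - specificity)
                   if specificity < 1 else float("inf"))
    lr_negative = ((1 - sensitivity) / specificity
                   if specificity > 0 else float("inf"))
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
    }


# Hypothetical counts at a single cut-off point (not study data).
summary = diagnostic_summary(tp=80, fp=10, fn=20, tn=90)
```

A positive likelihood ratio well above 1 means a score above the cut-off point raises the probability of delirium, and a negative likelihood ratio close to 0 means a score below it makes delirium unlikely.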
In a convenience sample of 124 patients, we asked attending intensive care nurses and physicians to predict delirium, independently of each other, within 24 hours of patients’ admission to intensive care. The AUROC for prediction by the nurses (0.59, 0.49 to 0.70) and physicians (0.59, 0.49 to 0.70) was inferior to the predictive value of the PRE-DELIRIC model (0.87, 0.81 to 0.93) in this specific subgroup of 124 patients. We found no significant differences between the prediction made by intensive care nurses (75% of sample) and student intensive care nurses (25%) or between predictions made by intensivists (36%), fellow-intensivists (40%), or residents (24%) (data not shown).
In this multicentre study, we developed and validated a model for predicting delirium in intensive care patients. To our knowledge, this is the first delirium prediction study for general intensive care patients and represents by far the largest delirium related study in intensive care patients to date. Our PRE-DELIRIC model reliably predicted the development of delirium for the complete length of stay in intensive care, on the basis of 10 risk factors that are readily available within 24 hours of admission to intensive care. In addition, the area under the receiver operating characteristics curve of the PRE-DELIRIC model was significantly higher than that of the predictions made by attending caregivers. These findings confirm that the model has additional value in daily practice. Importantly, dementia and alcohol misuse are not included in the model, as these patients need to be considered as high risk patients irrespective of the presence of other risk factors.
The early prediction of development of delirium in intensive care patients with the PRE-DELIRIC model facilitates the use of non-drug preventive measures in high risk patients, such as improvement of orientation, cognitive stimulation, early mobilisation,11 and listening to music.24 It also facilitates drug interventions in high risk patients, such as the administration of prophylactic haloperidol.12 These interventions aim to improve patients’ cognition or have a systemic effect, although the evidence of beneficial preventive measures with drugs and nursing interventions in critically ill patients is limited at this moment.25 Non-drug preventive measures were successful in reducing the incidence and duration of delirium in a non-critically ill hospital population with an intermediate to high risk for the development of delirium,11 and prevention with haloperidol resulted in reduced severity of delirium and fewer days with delirium, as well as a shorter length of stay in hospital.12 Importantly, no data from intensive care patients are available. Interestingly, early mobilisation of mechanically ventilated patients in intensive care, besides other significant effects, resulted in a reduced duration of delirium.26
The use of the PRE-DELIRIC model to identify and consequently preventively treat high risk patients could offer an important contribution to intensive care practice and ensure efficient use of research resources to study only high risk patients. In addition, the modifiable risk factors of the model may facilitate the use of preventive measures. Currently, the PRE-DELIRIC model is used in clinical daily practice in the hospital that developed the model; intensive care patients with a high risk of delirium (≥50% PRE-DELIRIC score), and patients with dementia or alcohol misuse, receive preventive measures. The optimal cut-off point of the PRE-DELIRIC model and the most effective delirium preventive interventions for intensive care patients need to be studied in the near future.
Our study had several limitations. Firstly, although the CAM-ICU has a high sensitivity and specificity when used by dedicated research nurses,27 28 its performance in daily practice as used by bedside nurses recently proved to be lower.29 In this performance study, the CAM-ICU was measured at one point on one day, whereas our diagnosis of delirium was based on all CAM-ICU screenings during patients’ complete stay in intensive care, increasing the sensitivity of the test. We also used haloperidol as a proxy for the diagnosis of delirium, as in all participating centres haloperidol was used only to treat delirium, and the hospitals with the highest CAM-ICU performance participated in this delirium prediction study. In view of the fluctuating nature of delirium, all patients were screened three times daily and more often if needed. When delirium was not detected with the CAM-ICU but suspected on the basis of medical and nursing reports, patients were additionally screened by a delirium expert according to the DSM-IV criteria.1 In addition, during the development and temporal validation study, we did quality checks that showed a high compliance rate and inter-rater reliability. We therefore presume that few patients were misdiagnosed.
Secondly, we used data collected from four other hospitals in the same study period. These centres implemented and clinically used the CAM-ICU combined with a delirium treatment protocol before the conduct of the study. For the external validation study, we included only patients with complete CAM-ICU screenings and those who were treated with haloperidol for delirium. The case mix of these patients showed a higher APACHE-II score and more sedated patients, and more patients were admitted for medical reasons compared with the hospital where the primary development and validation studies were done. These differences may explain the higher incidence of delirium in these hospitals. Because of logistic reasons and the fact that we wanted to examine the predictive value of the PRE-DELIRIC model in daily intensive care practice, we did not do quality checks such as inter-rater reliability measurements in these other hospitals. Despite these limitations, the PRE-DELIRIC model showed a good predictive value in daily intensive care practice.
Thirdly, as recommended,21 the risk factors used in our study were primarily based on a systematic review.18 We included additional variables following the results of our first cohort. We added “diagnosis group” and “urgent admission” as new risk factors because of a high incidence of delirium associated with these items. Although these variables were not found in the systematic review,18 some studies show that urgent admission to intensive care and neurological conditions are risk factors for delirium.30 31 The results of our development study show that these risk factors are of importance in predicting delirium in intensive care patients. Because of a low prevalence rate, relevant risk factors such as hyponatraemia, alcohol misuse, and dementia were excluded from the multivariate logistic regression analysis. The additional value of hyponatraemia for the model would be expected to be low, as the incidence of delirium in patients with hyponatraemia in the first 24 hours after admission to intensive care is low. The importance of, for example, dementia and alcohol misuse is recognised in several studies,4 32 and the incidence of delirium in these patients was also high in our study. In many institutes, all these patients will receive preventive measures so physicians do not need a delirium prediction model in these particular subgroups. Moreover, adding these covariates to the model would decrease its sensitivity to the other covariates. For these reasons, we did not include alcohol misuse and dementia in the PRE-DELIRIC model.
Fourthly, the negative likelihood ratio for patients with a predicted low chance of developing delirium is only moderate. This indicates that some patients in this group will develop delirium even though they are classified as having a low risk. On the other hand, preventive measures are advised in patients with a high risk, and the higher the risk of delirium the better the performance of the model. Nevertheless, a predicted low risk does not exclude the possibility of development of delirium.
Finally, the PRE-DELIRIC model is a static model that yields a calculated probability for delirium 24 hours after admission to intensive care. As the health status of patients can improve or deteriorate during their stay in intensive care, the probability of development of delirium may also change. Our model does not take into account changes in health status. Despite this limitation of the PRE-DELIRIC model, the area under the receiver operating characteristics curve of the model is high. Even so, developing a dynamic prediction model that uses time varying parameters, such as the sequential organ failure assessment (SOFA) score, would be of interest. Such a model might improve predictive value during the patients’ stay on the intensive care unit and might also perform better in the low risk group.
The PRE-DELIRIC model can predict delirium for the complete stay in the intensive care within 24 hours of admission. We can now identify patients who have a high risk of developing delirium during their intensive care stay. This will facilitate targeted initiation of preventive measures. Our study shows that the use of the PRE-DELIRIC model is significantly better than the predictions of the attending caregivers, and it should therefore be used daily in intensive care practice.
An automatic version of the PRE-DELIRIC model (Excel and web based) can be downloaded at www.umcn.nl/Research/Departments/intensive%20care/Pages/vandenBoogaard.aspx (English and Dutch version available).
We thank Gabriel Roodbol, delirium expert psychiatric nurse, who did all the inter-rater reliability measurements during the study. Thanks also to Henk Westerbaan, Maaike Fenten, and Wendy Groetelaers-Kusters, for collecting data during the absence of the first author and processing the data in SPSS. We also thank Sjef van der Velde for his support in gathering the national intensive care evaluation database information. Finally, we thank Gareth Parsons (University of Glamorgan, Faculty of Health, Sport and Science, UK) for his support with English grammar and spelling.
Contributors: MvdB conducted the study, collected all data, did the statistical analysis in collaboration with RD, and drafted the manuscript. PP and LS supervised the conduct of the study and writing of the paper. AJCS, MAK, PES, and PHJvdV gathered the data from the other hospitals and corrected the manuscript. JGvdH and TvA co-supervised and corrected the manuscript. MvdB and PP are the guarantors.
Competing interests: All authors have completed the Unified Competing Interest form at http://www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare: no support from any organisation for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.
Ethical approval: The regional medical ethical committee approved the study and waived the need for informed consent, as no additional interventions were carried out and data collection was not burdensome to patients.
Data sharing: No additional data available.
Cite this as: BMJ 2012;344:e420