A one-to-one case-control study was conducted on a pre-existing dataset to examine a predictive model with a set of risk factors for pressure ulcer development in acute care settings. Various techniques were used to select the most relevant predictors from ten subsets of the dataset. The predictors identified were further examined using ten additional subsets by measuring sensitivities, specificities, positive/negative predictive values, and the areas under the ROC (receiver operating characteristic) curves. The best components for identifying at-risk patients consisted of three Braden subscales and five risk factors routinely collected through electronic health records. Entering these eight predictors into a logistic regression model yielded a sensitivity of 92%, a specificity of 67%, and an area under the ROC curve of 89%. Further evaluation, however, is needed to explore the validity of the model.
Despite numerous efforts to decrease pressure ulcer development, current estimates remain above the acceptable incidence rate of less than 2%.1 Higher incidences of pressure ulcers have been reported in acute care settings (0.4%–38%) than in long-term care facilities (2.2%–23.9%).2 The incidence of pressure ulcers is likely to rise in the future because of increasing acuity and severity of illness and a larger aging population in acute care settings.3 It is well known that pressure ulcer development is directly associated with increased lengths of stay and healthcare costs and may lead to decreased quality of life.4
According to the current best evidence, preventive measures (e.g., repositioning, support surfaces, nutritional interventions) should be initiated in a timely manner after identifying at-risk individuals.5–6 To date, the recommended timing is to evaluate patients in acute care settings for early detection on the day of admission and to reevaluate the plan of care within 48 to 72 hours after admission.7 As such, the National Quality Forum (NQF) has designated initial and regular risk assessments for pressure ulcer development as a “safe practice” for the nation’s health.8
Risk assessment as “an integral part of prevention efforts” involves examining the patient’s skin integrity and level of risk with predisposing factors.9 Risk factors can be defined as anything that increases an individual's susceptibility to developing pressure ulcers. The body of literature, however, has shown considerable variation in study designs and methods, sample populations, healthcare settings, data sources, and operational definitions of the variables. Consequently, the relationships among the risk factors studied (e.g., age, mobility, activity, incontinence, malnutrition) have varied depending on the set of variables entered into analyses. These inconsistencies make it difficult to compare previous studies and to generalize causal relationships between risk factors and pressure ulcer development, thus enlarging the gap between evidence and practice. Using risk assessment tools allows clinicians to make better decisions in identifying at-risk individuals and initiating preventive measures.6 The current best evidence recommends the use of the Norton and Braden scales because of their acceptable reliability and validity.5–6 The Norton scale is derived from five clinical determinants of pressure ulcer risk (physical condition, mental condition, activity, mobility, and incontinence), and the Braden scale is composed of six subscales (sensory perception, activity, mobility, moisture, friction and shear, and nutrition). In both, individuals are classified into an at-risk group versus a not-at-risk group by a cut-off point (i.e., threshold). Each subscale score, along with the total score, has been investigated as a predictor of pressure ulcer development. Yet the quality of these assessment tools is of some concern because they do not consider all relevant risk factors and so may fail to identify everyone who could benefit from interventions to prevent pressure ulcers.
In sum, further work is needed to refine the tools to provide optimal quality of care for at-risk patients considering patients’ level of risk.6 This study, thus, was designed to develop a predictive model with a set of risk factors for pressure ulcer formation in hospital settings.
A one-to-one case-control study was conducted using a pre-existing dataset created for a quality improvement project at a regional healthcare system. The de-identified dataset was derived from electronic records of patients who were admitted to two Midwestern teaching hospitals between April 2004 and September 2004. Both hospitals had approximately 350 beds and were affiliated with the same regional, not-for-profit healthcare system. Patients in both hospitals had the same access to durable medical equipment, quality care provision, staffing ratios, and skill mix. The dataset of 5,796 adult patients (age ≥ 18) consisted of administrative data (e.g., length of stay and demographics), laboratory results from two weeks before admission through the first four days of hospitalization, and active patient records documented on 11 charting forms (e.g., patient history and physical assessment) during the first four days of the stay. Data preparation and analyses were performed using an Oracle database (Personal Edition, Release 9i), SPSS for Windows (Release 11.5), and Weka (Version 3.4.4). Approval for this study was obtained from the Institutional Review Boards (IRBs) for the protection of human subjects at the University of Pennsylvania and from the regional healthcare system.
This study included only each patient's first admission during the study period, on the assumption that a prior hospitalization might contribute to pressure ulcer development. In addition, patients were selected if they stayed longer than four days and had both health history and physical assessment records, including admission route; activities of daily living (ADLs) prior to admission; endocrine history; and genitourinary, cardiovascular, musculoskeletal, neurological, and gastrointestinal histories/assessments. Because this study used the Braden scale as a reference standard for the predictive modeling results, patients also needed to have Braden scores documented on admission to be eligible. Among the patients who met the selection criteria, those with pressure ulcers were excluded if it was unclear whether the ulcers had developed during the hospital stay. As a result, 84 subjects with hospital-acquired pressure ulcers (HPUs) and 2,263 non-HPU subjects were available for analysis (i.e., an incidence of 3.6%).
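The eligibility criteria above can be sketched as a simple record filter. The field names below are hypothetical stand-ins, since the dataset's actual schema is not published:

```python
def eligible(patient):
    """Apply the study's inclusion/exclusion logic to one patient record
    (a dict with illustrative, hypothetical field names)."""
    # Include only the first admission during the study period
    if not patient.get("first_admission"):
        return False
    # Stay must be longer than four days
    if patient.get("length_of_stay_days", 0) <= 4:
        return False
    # Both health history and physical assessment records must exist
    if not (patient.get("has_history_record") and patient.get("has_assessment_record")):
        return False
    # Braden score documented on admission is required (reference standard)
    if patient.get("braden_on_admission") is None:
        return False
    # Exclude cases where ulcer onset (in-hospital vs. pre-existing) is unclear
    if patient.get("ulcer_onset_unclear"):
        return False
    return True
```

Applying such a filter to all 5,796 admissions would yield the 2,347 analyzable subjects reported.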
Given the statistical rules pertaining to an effective sample size for avoiding over- and under-fitting in a prognostic study (e.g., common statistical procedures can underestimate the probability of low-incidence events, and it is desirable to have at least 10 HPU subjects in the sample per potential predictor10), it was not feasible to create both training and testing datasets from the pre-existing dataset. This preliminary study was therefore done using 20 training datasets, each of which included all 84 HPU subjects together with 84 non-HPU subjects randomly selected from the 2,263 non-HPU subjects. Random sampling of non-HPU subjects was done without replacement. By replicating the training process with different subsets of non-HPU subjects, somewhat greater confidence in the replicability of the results is achieved. The first 10 training datasets (hereafter referred to as TD-1 datasets) were used for study variable construction, predictor (i.e., risk factor) selection, and model creation. The predictors and models identified from the TD-1 datasets were further examined using the additional 10 training datasets (hereafter referred to as TD-2 datasets) in order to examine the replicability of the results.
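The replicated balanced-sampling design can be sketched as follows; the function and variable names are illustrative, not the study's actual code:

```python
import random

def make_training_datasets(hpu_ids, non_hpu_ids, n_datasets=20, seed=0):
    """Build balanced case-control training sets: every dataset keeps all
    HPU cases and pairs them with an equal number of non-HPU controls,
    sampled without replacement within each dataset."""
    rng = random.Random(seed)
    datasets = []
    for _ in range(n_datasets):
        # random.sample draws without replacement from the control pool
        controls = rng.sample(non_hpu_ids, len(hpu_ids))
        datasets.append(list(hpu_ids) + controls)
    return datasets

# Toy illustration matching the study's counts: 84 cases, 2,263 controls
cases = list(range(84))
controls = list(range(84, 84 + 2263))
tds = make_training_datasets(cases, controls, n_datasets=20)
```

The first ten datasets would play the role of TD-1 and the remaining ten the role of TD-2.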
From the pre-existing dataset, we identified 595 “raw” codified text data values (i.e., attributes) and created meta-data (i.e., data about data) by recording the presence/absence of each data value for each subject. Study variables were then constructed through attribute value transformation and reduction with each of the TD-1 datasets by applying the results of frequency, chi-square, and Mann-Whitney tests (Figure 1). That is, attributes with frequencies above 5% in more than five training datasets were entered into chi-square tests (categorical variables) or Mann-Whitney tests (continuous variables with non-normal distributions). The rationale was that attributes showing an underlying relationship in univariate analyses may increase a patient's risk in multivariate analyses.10 After iterating these tests over the TD-1 datasets, attributes were retained if they showed a p value of less than .2 in more than five training datasets. The p value criterion is a statistical rule used to ensure at least a borderline association between the independent variables and the dependent variable (i.e., presence of HPUs). The other attribute reduction criterion (≥ 5 training datasets) guards against attributes being selected by chance.
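A minimal sketch of the univariate screening step, assuming scipy is available; the attribute representation and the single-dataset framing are simplifications of the study's repeated procedure:

```python
import numpy as np
from scipy.stats import chi2_contingency, mannwhitneyu

def screen_attribute(values, outcome, categorical, alpha=0.2):
    """Univariate screen of one attribute against HPU status (0/1).
    Returns True when the association p-value falls below the retention
    threshold (p < .2, the paper's criterion for one training dataset)."""
    values, outcome = np.asarray(values), np.asarray(outcome)
    if categorical:
        # Contingency table: rows = attribute levels, columns = outcome
        table = np.array([[np.sum((values == v) & (outcome == o))
                           for o in (0, 1)] for v in np.unique(values)])
        _, p, _, _ = chi2_contingency(table)
    else:
        # Mann-Whitney U test for continuous, non-normal variables
        _, p = mannwhitneyu(values[outcome == 1], values[outcome == 0])
    return p < alpha
```

In the study this screen was repeated across the ten TD-1 datasets, and an attribute was kept only if it passed in more than five of them.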
Accordingly, 77 study variables were identified, and missing values were then replaced for all study subjects (n = 2,347) using an EM (expectation-maximization) analysis for continuous variables and a frequencies analysis for categorical variables. To find the set of best predictors, further variable reduction using the TD-1 datasets was done by applying a combination of three techniques; gathering potential predictors with different techniques was expected to enhance the generalizability of the predictive modeling results. That is, of the 77 variables, this study collected variables from each TD-1 dataset (a) if they were present in the equations of logistic regressions (forward stepwise selection with a p value to enter of 0.05); (b) if they occurred in external nodes of decision trees (modified C4.5 algorithm of Weka11); or (c) if they appeared significant when subset evaluations were run with the Best First forward search algorithm.11 The subset evaluation algorithm produces a set of substantial variables by examining the predictability of each variable entered and the redundancy among the variables (Weka).11 When all the variables identified by the three techniques were pooled, only 24 independent variables that occurred in more than five training datasets remained for model construction.
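The pooling rule — keep only variables selected in more than five training datasets — can be sketched independently of the three underlying selection techniques; the variable names are invented for illustration:

```python
from collections import Counter

def consensus_predictors(selected_per_dataset, min_datasets=5):
    """Pool the variables chosen (by any technique) in each training
    dataset and keep only those selected in more than `min_datasets`
    datasets -- the paper's guard against chance selection."""
    counts = Counter()
    for selected in selected_per_dataset:
        counts.update(set(selected))  # count each variable once per dataset
    return sorted(v for v, c in counts.items() if c > min_datasets)

# Hypothetical selections across ten TD-1 datasets
per_dataset = [["edema", "foley"], ["edema"], ["edema", "age"], ["edema"],
               ["edema"], ["edema"], ["foley"], ["edema"], ["age"], ["edema"]]
kept = consensus_predictors(per_dataset)
```

Here "edema" appears in eight of ten datasets and survives, while "foley" and "age" (two datasets each) are dropped.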
Four modeling procedures were carried out using logistic regression analyses by entering different sets of independent variables. The first set was composed of the 24 variables without the Braden scores. The second set of variables included the Braden subscales as well as the total score (Figure 1). These two modeling procedures consisted of (a) predictor identification with the TD-1 datasets using a backward stepwise deletion method with a p value to remove of 0.2 and (b) evaluation of the identified predictors with the TD-2 datasets using an enter method. Since each TD-1 dataset could have presented a different set of predictors because each included different non-HPU subjects, predictors were selected for further analysis if they had a p value of less than .05 in more than five training datasets. All the evaluations with the TD-2 datasets were then completed by running logistic regressions with these predictors, measuring average sensitivities, specificities, positive/negative predictive values, and areas under the ROC curves. Each modeling procedure was performed using a different threshold that produced the best balance between sensitivity and specificity. The third modeling procedure, with the TD-2 datasets, examined the average predictive performance of the Braden total score as a reference standard. The last logistic regression analyses were conducted on each of the TD-2 datasets using all the significant predictors identified from the first two modeling procedures. As a result, a set of best predictors was identified, and a predictive model was then created using all the subjects of the TD-1 datasets (n = 924). Finally, this predictive model was examined using all the subjects of the TD-2 datasets (n = 924) to assess the model's stability.
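The evaluation metrics used throughout — sensitivity, specificity, positive/negative predictive values, and the area under the ROC curve at a fixed cut value — can be computed as sketched below. This is a generic implementation, not the study's SPSS procedure, and the rank-based AUC shown ignores tied predicted probabilities:

```python
import numpy as np

def evaluate_at_cutoff(prob, y, cutoff):
    """Classification metrics at a fixed probability cut-off for predicted
    probabilities `prob` against binary outcomes `y` (1 = HPU)."""
    prob, y = np.asarray(prob, float), np.asarray(y, int)
    pred = (prob >= cutoff).astype(int)
    tp = np.sum((pred == 1) & (y == 1))
    tn = np.sum((pred == 0) & (y == 0))
    fp = np.sum((pred == 1) & (y == 0))
    fn = np.sum((pred == 0) & (y == 1))
    sens = tp / (tp + fn)                               # sensitivity
    spec = tn / (tn + fp)                               # specificity
    ppv = tp / (tp + fp) if tp + fp else float("nan")   # positive predictive value
    npv = tn / (tn + fn) if tn + fn else float("nan")   # negative predictive value
    # AUC via the rank-sum (Mann-Whitney) identity; no tie correction
    order = np.argsort(prob)
    ranks = np.empty(len(prob))
    ranks[order] = np.arange(1, len(prob) + 1)
    n1, n0 = np.sum(y == 1), np.sum(y == 0)
    auc = (ranks[y == 1].sum() - n1 * (n1 + 1) / 2) / (n1 * n0)
    return sens, spec, ppv, npv, auc
```

Sweeping `cutoff` over a grid and keeping the value that best balances sensitivity and specificity mirrors how the per-procedure thresholds (e.g., 0.4, 0.32, 0.3) were chosen.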
The sample of 2,347 patients was primarily female (58.4%) and white (72.1%), with a mean age of 67.8 (SD = 17.5, Median = 72). The majority of patients were admitted for medical (80.1%) and surgical (17.3%) services. According to the ICD-9 categories for principal diagnoses, approximately 19% of the subjects had a diagnosis involving the respiratory system and other chest symptoms, followed by general symptoms (18.7%), musculoskeletal system disease (18.1%), and circulatory system disease (4.4%). Of the subjects who had undergone a procedure or surgery (n = 1,406), 66% had an operation related to the musculoskeletal (25.4%), digestive (23.2%), or cardiac (17.4%) systems. The most frequently occurring admission type was emergent (79.5%), with a mean length of stay of 7 days (SD = 4.1, Mode = 4). Fifty-five subjects (2.3%) had pre-existing pressure ulcers at admission, as did 20% of the subjects who developed HPUs. Fifty-four percent of the subjects with HPUs developed a pressure ulcer within the first five days after admission, and an additional 31% developed them within the next four days. The most frequently occurring body sites for HPUs were the buttocks (38%), coccyx (25%), and heels (8.3%).
First, when logistic regression models with the 24 variables were developed, five variables were statistically significant (p < .05) in more than five of the TD-1 training datasets: (a) presence of edema (cardiovascular system), (b) presence of an indwelling Foley catheter, (c) presence of a nutrition consult triggered for potential or actual nutritional imbalance, (d) use of a wheelchair as an assistive device during the hospital stay, and (e) presence of a need for extra nursing care. Clinicians documented that patients had a need for extra nursing care if the patient needed surveillance or an escort by a nurse clinician for off-floor tests (e.g., chest x-ray, colonoscopy), telemetry monitoring, or chemotherapy for cancer, or if the patient was very confused, at high risk for falls, or in restraints. When the logistic regression with the enter method was run, the same five predictors were also statistically significant (p < .05) across the TD-2 datasets. Predictive accuracies for positive predictions (positive predictive values) ranged from 65% to 82% (average: 72%), while negative predictive values ranged from 77% to 88% (average: 83%). Further, the logistic regression models yielded an average sensitivity of 86% (range: 77–93%), specificity of 65% (54–83%), and area under the ROC curve of 81% (77–85%) using a cut value of 0.4 (Table 1). An area under the ROC curve greater than 80% indicates a good model for classifying subjects with an outcome of interest against subjects without the outcome.12
Second, multivariate logistic regression models with the six Braden subscale scores produced three significant predictors – activity, friction/shear, and sensory perception (p < .05 in more than five of the TD-1 training datasets). It should be noted that these models were built without the total score, since the total score, as the sum of the six subscale scores, may be highly correlated with the subscales. The three subscales were also significant when subsequent logistic regressions with backward stepwise deletion were run using the three subscale scores together with the Braden total score (TD-1 datasets). At a cut value of 0.32, logistic regression models with the three subscales using the TD-2 datasets produced an average sensitivity of 91% (89–95%) and specificity of 67% (60–73%). The average area under the ROC curve was 86%, with a range from 83% to 89%. The average predictive accuracies were 73% for positive predictive values and 89% for negative predictive values.
Third, logistic regressions with only the Braden total score (a cut value of 0.4) yielded a constant sensitivity of 86% and an average specificity of 67% (60–75%), along with an average area under the ROC curve of 84% (81–87%). At this cut-off point, corresponding to a Braden total score of 17, the Braden total score on average over-predicted 28% of the subjects as at-risk patients (range: 22–33%) and under-predicted 17% of the subjects (16–19%) as not-at-risk patients.
Lastly, logistic regressions with the five significant risk factors and the three Braden subscales (using an enter method) showed the best balance for predicting hospital-acquired pressure ulcers, with sensitivities ranging from 92% to 94%, specificities ranging from 63% to 74%, and areas under the ROC curves ranging from 88% to 92% (a cut value of 0.3). The overall performance of the models was 93%, 69%, and 90% in sensitivity, specificity, and area under the ROC curve, respectively.
Based on the results of the prior logistic regression analyses, the eight predictors were selected to build a model by entering all the subjects from the TD-1 datasets (n = 924; 84 HPU subjects and 840 non-HPU subjects). A logistic regression using an enter method yielded a sensitivity of 92%, a specificity of 66%, and an area under the ROC curve of 89% at a cut value of 0.045. In this model, six of the eight predictors were statistically significant: presence of edema (p < .01), use of wheelchairs (p < .001), nutrition consult (p < .01), activity subscale (p < .01), friction/shear subscale (p < .001), and sensory perception subscale (p < .01).
The predictive model was then applied to the subjects from all of the TD-2 datasets (n = 924), yielding a sensitivity of 92%, specificity of 67%, and area under the ROC curve of 89%. The predictive model thus showed stable performance without any deterioration on the TD-2 subjects, indicating that it is both promising and important to test this model further.
According to the literature, there is clearly room to improve prediction methods for pressure ulcer development. This study, using considerable amounts of patient data gathered from a variety of sources, provided an opportunity to examine a way of risk stratification in the area of pressure ulcer prevention. The analytic methods and criteria used for choosing the predictors (which had to be significant in more than five training datasets) differentiated this work from previous studies, which usually identified risk factors for pressure ulcer development using a single sample.
All eight predictors were clinically relevant and frequently identified as related to pressure ulcer development in the literature. Assuming that the costs of treating pressure ulcers exceed those of preventing them, this study focused on creating a highly sensitive model to guide proper interventions. Compared to the Braden scale, which was used as the reference standard, the eight predictors identified from the multivariate logistic regression models presented higher accuracy and more stable performance, with less variation across the training datasets (Table 1).
Although the identified predictors extend the current knowledge base regarding pressure ulcer development in hospitalized adults, our specific findings might not apply in other clinical settings. Because of the low incidence of HPUs in this preliminary study, we used all 84 subjects with HPUs in the training datasets and were not able to examine the internal and external validity of the model developed. The performance of the model developed from the training datasets should therefore be further evaluated in a random sample of new subjects from the same healthcare setting and/or from other settings. As clinical information systems are increasingly implemented, integrating prediction models refined through future research into those systems would provide one strategy for making the study findings more usable in practice.
Because the patient was the unit of analysis in this study, factors that might be related to care context, such as preventive measures, staffing ratios, and skill mix, were not examined. Further, controlling for the preventive interventions provided was beyond the scope of this study.
The authors would like to thank Drs. Lois Evans, Kathy Bowles, Teresa Richmond, Barbara Frink, Sarah Ratcliffe, Mary Hagle, and Elizabeth Devine for their advice. We also acknowledge the Frank Morgan Jones Fund at the School of Nursing of the University of Pennsylvania and Dr. Susie Kim of the Global Korean Nursing Foundation for supporting this research.