Falls in the elderly are often a symptom of acute medical problems in combination with underlying risks such as medications, postural hypotension and lower extremity weakness. Identifying those at risk allows targeted assessment and intervention, such as a review of medications and environmental modifications [16]. This study demonstrated good predictive validity for the modified STRATIFY tool in identifying individuals at risk of falling in acute care. With a risk score of 9, sensitivity was 91% and specificity was 60%. The falls risk assessment tool can be easily incorporated into practice without added burden to the patient. The findings were achieved with a conservative methodology in which the outcome measure was the patient (i.e. fallers), rather than falls, and the risk score was generated before any falls occurred. Despite minimal nurse training and a short completion time, we obtained very good inter-rater reliability (ICC = 0.78). A recent analytic review of falls risk assessment tools found that, of the five tools used in acute care with a sensitivity over 80%, only two described how long the tool took to complete and only one had its findings reproduced by other investigators. Many did not report inter-rater reliability [17].
Risk factors included in screening for falls in hospitalized patients have been largely consistent across studies with varying methods. Findings have repeatedly emphasized falls history, mental impairment, toileting frequency, and general mobility as predictive variables for falls [7]. Nevertheless, it is not yet clear how to maximize prediction. The variables included in different studies do not overlap entirely, and some studies incorporate variables with poor or inconsistent predictive validity. For example, visual impairment had poor predictive value in our study and no significant predictive value in Morse's study [24]. Oliver et al. included visual impairment as a variable in STRATIFY based on an initial study phase in which it was moderately predictive (OR = 3.55) and appeared to rank fifth strongest among the 10 clinical variables described [7]. However, their design did not control for inter-correlations among risk factors. Also relevant to optimizing prediction is the fact that Morse's study [24] and ours are the only ones to include weightings derived from quantitative analysis. Item weighting was clearly important for optimizing prediction.
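The kind of quantitatively derived weighting described above can be illustrated with a minimal sketch. Assuming, hypothetically, that each risk item's odds ratio comes from a logistic regression model (the item names and values below are illustrative only, not the estimates reported in this study), the log-odds coefficients can be scaled and rounded to integer point values that are easy to tally at the bedside:

```python
import math

# Hypothetical odds ratios for risk items from a logistic regression model
# (illustrative values only -- not the estimates reported in this study).
odds_ratios = {
    "falls history": 4.5,
    "mental impairment": 3.0,
    "frequent toileting": 2.2,
    "transfer/mobility problem": 2.7,
}

# Convert odds ratios to log-odds (the model's coefficients), scale so the
# smallest coefficient maps to one point, and round to integer weights.
coefs = {item: math.log(orr) for item, orr in odds_ratios.items()}
scale = 1 / min(coefs.values())
weights = {item: round(c * scale) for item, c in coefs.items()}

# A patient's risk score is then the sum of the weights for items present.
patient_items = ["falls history", "frequent toileting"]
score = sum(weights[i] for i in patient_items)
```

Rounding the scaled coefficients trades a little statistical precision for a score nurses can compute quickly without a calculator, which matters for uptake in routine practice.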
Studies have also differed in the suggested ideal risk score cut-offs to consider patients in the "at risk" group. Whether our suggested cut-off of 9 is ideal for different hospital settings is not known. One approach is to use the cut-off that maximizes predictive power mathematically. Practitioners in different settings may adjust the trade-off between sensitivity and specificity, based on differing falls rates, values, laws, funding and other factors.
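One common mathematical criterion for choosing such a cut-off is Youden's index (sensitivity + specificity − 1). A minimal sketch of that approach, using made-up score distributions rather than this study's data, shows how the trade-off is evaluated across candidate cut-offs:

```python
# Hypothetical risk scores (illustrative only): one list for patients who
# fell during admission and one for patients who did not.
fallers = [9, 9, 9, 10, 11, 12, 13, 14]
non_fallers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

def sens_spec(cutoff):
    """Sensitivity and specificity when scores >= cutoff flag 'at risk'."""
    sens = sum(s >= cutoff for s in fallers) / len(fallers)
    spec = sum(s < cutoff for s in non_fallers) / len(non_fallers)
    return sens, spec

# Choose the cut-off maximizing Youden's index J = sensitivity + specificity - 1.
best_cutoff = max(range(1, 16), key=lambda c: sum(sens_spec(c)) - 1)
```

A unit with scarce prevention resources might deliberately choose a higher cut-off than the mathematical optimum (favouring specificity), while a unit prioritizing the capture of every potential faller might choose a lower one, which is the sensitivity-specificity trade-off noted above.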
Our finding of poor predictive validity with the unweighted items does not necessarily indicate poor generalization across settings, because the items and protocol were changed. Studies testing prediction tools in new locations have found results that are weaker than the original findings [25]. The difficulty of obtaining generalized (i.e. reproducible) effects is concerning. One explanation may be that the base rate of a clinical outcome is known to affect positive predictive value [24].
]. Our base rate for falls was 5.5%, which is lower than that found in the British study and may have contributed to lower predictability. Another possibility is that prediction may only be consistent among patients with similar characteristics, resulting in generalization across some settings and not others.
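The dependence of positive predictive value on base rate can be checked directly from Bayes' rule, using the sensitivity (91%) and specificity (60%) reported above at the cut-off of 9. The 15% comparison prevalence below is an arbitrary illustrative value, not a figure from any cited study:

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value: P(patient falls | flagged 'at risk')."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# At our base rate of 5.5%, roughly 12% of flagged patients actually fall...
low_rate = ppv(0.91, 0.60, 0.055)
# ...while at a hypothetical 15% base rate the same tool yields roughly 29%.
high_rate = ppv(0.91, 0.60, 0.15)
```

Identical sensitivity and specificity thus produce very different positive predictive values as prevalence changes, which is one reason a tool can appear weaker when transplanted to a lower-rate setting.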
A potential methodological limitation is uncertainty about patient incident reports, which may not capture all falls; however, our documented rate of falls was similar to that of previous years in our setting. There is also the possibility that completing risk assessments influenced how nurses responded to patients in terms of falls prevention strategies (i.e. a Hawthorne effect). It is unknown whether this factor affected the true rate of falls in our setting; however, the effect is likely minimal given that strong, consistent falls prevention strategies were not in place at the time of the study. Another potential limitation is that, despite some changes made to the scale items taken from STRATIFY to improve reproducibility, there is still room for error. For example, patient recall of falls may not be accurate and consistent. This may have accounted for an inter-rater reliability that was lower than ideal. The problem underscores a need to improve operational definitions of risk variables to ensure reproducibility in measuring items such as mental status. It may help to have consensus among investigators on key issues including: which variables to include, operational definitions of risk constructs, the duration over which risk is assessed (i.e. within 24 or 48 hours), how users should be trained, and what the appropriate outcomes are (e.g. falls versus fallers). Further replication of our study in other settings will help to address these limitations and improve generalizability.