Accidental falls are a major problem of later life. Different technologies to predict falls have been investigated, but with limited success, mainly because of low specificity due to a high false positive rate. This Letter presents an automatic classifier based on heart rate variability (HRV) analysis with the goal of identifying fallers automatically. HRV was used in this study as it is considered a good estimator of autonomic nervous system (ANS) states, which are responsible, among other things, for human balance control. Nominal 24 h electrocardiogram recordings from 168 cardiac patients (age 72 ± 8 years, 60 female), of whom 47 were fallers, were investigated. Linear and nonlinear HRV properties were analysed in 30 min excerpts. Different data mining approaches were adopted and their performances were compared with a subject-based receiver operating characteristic analysis. The best performance was achieved by a hybrid algorithm, RUSBoost, integrated with a feature selection method based on principal component analysis, which achieved satisfactory specificity and accuracy (80 and 72%, respectively), but low sensitivity (51%). These results suggested that ANS states causing falls could be reliably detected, but also that not all falls were due to ANS states.
Falls represent one of the most common problems of later life. The annual incidence ranges between 35 and 50% and increases with age, reaching 66% (52–84%) per year among healthy elderly people. Falls reduce the overall well-being, mobility and quality of life of the elderly and of the family members who care for them. The mean and median costs of a fall are about €9000 and €11 000, and considering the aging population, this equates to millions of euros in the coming years.
Falls are caused by complex and dynamic interactions between intrinsic (subject-based) and extrinsic (environmental) factors. Over 400 risk factors have been identified and their prioritisation remains unclear. Moreover, the applicability, sensitivity and particularly the specificity of subject-specific fall risk assessment remain imprecise.
Several studies have investigated the independent capability of various technologies to prevent falls, including posturography, balance/gait analysis, trunk accelerations, sock pressure sensors, bed/chair alarms and other indoor ambient sensors.
However, recent systematic reviews highlighted that these technologies have several limitations, including a rate of false alarms that is too high to maintain the full attention of the nursing staff. For that reason, Bressler et al. reported that, in their study on in-hospital fall prevention, the alarms had to be removed. Moreover, these approaches require additional sensors (e.g. pressure matrices or wearable accelerometers) that have no other direct benefit for the health of the elderly and that add costs.
In contrast to previous works, the present study focused on the use of heart rate variability (HRV) to identify fallers automatically. HRV is considered a reliable non-invasive direct estimator of cardiovascular system (CVS) states and an indirect estimator of autonomic nervous system (ANS) states. This choice resulted from the evidence that, although 31% of falls are due to accidents and the causes of 27% of falls remain unclear, the remaining 42% are due to transient problems that are clearly related to ANS and CVS states, including gait/balance disorders or weakness (17%), dizziness/vertigo (13%), drop attacks (9%) and postural hypotension (3%). Significant relationships have been reported between HRV and several of these conditions, such as dizziness/vertigo and postural hypotension.
The present study also differs from previous works in that it focuses on HRV, which can be extracted from a one-lead electrocardiogram (ECG). ECG is widely used to monitor the elderly, in hospital and in the community, and there is consensus that ECG monitoring is also beneficial for the early detection of worsening cardiovascular disease [14–17]. Consequently, the use of ECG/HRV in patients suffering from cardiovascular disease is even more widespread. This is a relevant consideration for fall prevention, as the most frequent co-morbidities of patients hospitalised for a fall are cardiovascular diseases: hypertension (63%), atrial fibrillation (30%), coronary artery disease (25%) and congestive heart failure (20%). Finally, unlike biomechanical variables, HRV may also be studied when the patient is sitting on a chair or lying in bed. This is relevant as most indoor falls happen while rising from a chair/bed.
This Letter presents the results of the automatic classifiers developed using advanced data mining methods [20–28] and HRV features [29–34] to automatically assess falls. The performances of these classifiers were compared with relevant existing literature on tools and methods to prevent falls [35–38].
The current study was performed by acquiring nominal 24 h Holter ECG recordings from 168 hypertensive patients. Among them, according to clinical records, 47 subjects experienced one fall (defined as unintentionally coming to the ground or some lower level) within three months of the recording, as reported at the next outpatient visit. These subjects were referred to as fallers and the remaining ones as non-fallers. This study was approved by the Federico II University Hospital Ethics Committee and all the participants signed a specific informed consent form to allow the use of their anonymised data for this study.
The series of normal-to-normal (NN) beat intervals was obtained from the ECG recordings using an automatic QRS detector based on a nonlinearly scaled ECG curve length feature. HRV was analysed in excerpts of 30 min; excerpts in which fewer than 600 valid beats were detected were excluded. Standard linear HRV analysis was performed according to international guidelines. Moreover, nonlinear features were computed according to recent literature.
A number of standard time-domain HRV measures were calculated: average of all NN intervals, standard deviation of all NN intervals (SDNN), square root of the mean of the sum of the squares of differences between adjacent NN intervals (RMSSD), number and percentage of differences between adjacent NN intervals that are longer than 50 ms (NN50 and pNN50), standard deviation of the averages of NN intervals in all 5 min segments, mean of the standard deviations of NN intervals in all 5 min segments, maximum of NN intervals, minimum of NN intervals, median of NN intervals, HRV triangular index (the proportion of all accepted NN intervals to their modal measurement at a discrete scale of 1/128 s bins), triangular interpolation of NN interval histogram (the baseline width of the distribution measured as a base of a triangle), approximating the NN interval distribution by using the minimum square difference.
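As an illustration of the simplest of these measures, the following minimal Python sketch computes the mean NN interval, SDNN, RMSSD, NN50 and pNN50 from a list of NN intervals in milliseconds. It is not the ad hoc MATLAB software used in this study; the function name `time_domain_hrv`, the use of the sample standard deviation for SDNN and the five-beat input are assumptions made only for the example.

```python
import math
import statistics


def time_domain_hrv(nn_ms):
    """Return a few standard time-domain HRV measures for NN intervals in ms."""
    # Differences between adjacent NN intervals.
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]
    # NN50: number of adjacent differences larger than 50 ms in absolute value.
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    return {
        "mean_nn": statistics.fmean(nn_ms),
        "sdnn": statistics.stdev(nn_ms),  # assumption: sample standard deviation
        "rmssd": math.sqrt(statistics.fmean([d * d for d in diffs])),
        "nn50": nn50,
        "pnn50": 100.0 * nn50 / len(diffs),
    }


# Hypothetical toy series; real excerpts contain at least 600 valid beats.
example = time_domain_hrv([800, 810, 790, 860, 800])
```

In this toy series two of the four adjacent differences exceed 50 ms, so pNN50 is 50%.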
The frequency-domain HRV measures rely on the estimation of power spectral density, computed, in this work, with three different methods: Welch periodogram, auto-regressive (AR) method and Lomb–Scargle periodogram. For the Welch periodogram, the NN interval was first interpolated with cubic spline interpolation at 4 Hz, then divided into overlapping segments of 256 points in length (with 128 point overlap) and each segment was Hamming windowed. The AR model order was 16. The generalised frequency bands in case of short-term HRV recordings were the very low frequency (VLF, 0–0.04 Hz), low frequency (LF, 0.04–0.15 Hz) and high frequency (HF, 0.15–0.4 Hz). The frequency-domain measures included absolute, relative powers and peak frequency for each band, LF and HF band powers in normalised units and the LF/HF power ratio.
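The relationship between the band powers and the derived measures can be sketched as follows. This is a hedged example, not the study's implementation: the function name `band_measures` and the input values are hypothetical, the total power is assumed to be the sum of the three bands, and the normalised units follow the usual convention of dividing by the total power minus the VLF power.

```python
def band_measures(vlf, lf, hf):
    """Relative powers, normalised LF/HF powers and LF/HF ratio from band powers (ms^2)."""
    total = vlf + lf + hf  # assumption: total power is the sum of the three bands
    return {
        "vlf_rel": 100.0 * vlf / total,
        "lf_rel": 100.0 * lf / total,
        "hf_rel": 100.0 * hf / total,
        # Normalised units exclude the VLF contribution.
        "lf_nu": 100.0 * lf / (total - vlf),
        "hf_nu": 100.0 * hf / (total - vlf),
        "lf_hf": lf / hf,
    }


# Hypothetical band powers in ms^2, chosen only to make the arithmetic easy to follow.
example_bands = band_measures(vlf=1200.0, lf=800.0, hf=400.0)
```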
Nonlinear HRV was analysed with the following methods: Poincaré plot (features SD1 and SD2) [15, 39], approximate entropy , sample entropy , correlation dimension (CD) , detrended fluctuation analysis [43, 44] and recurrence plot (RP) [33, 45, 46].
The Poincaré plot is a common graphical representation of the correlation between successive RR intervals (the intervals between consecutive R peaks). A widely used approach to analyse the Poincaré plot consists of fitting an ellipse oriented according to the line-of-identity and computing the standard deviation of the points perpendicular to the line-of-identity (SD1) and along it (SD2).
Approximate entropy measures the complexity or irregularity of the RR series. It is a statistical measure used to quantify the regularity of the data without prior knowledge of the problem.
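A minimal sketch of the two Poincaré descriptors, assuming the common formulation in which each pair of successive intervals is projected onto the axes perpendicular to and along the line-of-identity. The function name `poincare_sd` and the toy series are illustrative assumptions, not taken from the study software.

```python
import math
import statistics


def poincare_sd(rr_ms):
    """Return (SD1, SD2) of the Poincaré plot of successive intervals in ms."""
    pairs = list(zip(rr_ms[:-1], rr_ms[1:]))
    # Projection perpendicular to the line-of-identity: short-term variability.
    across = [(b - a) / math.sqrt(2) for a, b in pairs]
    # Projection along the line-of-identity: longer-term variability.
    along = [(b + a) / math.sqrt(2) for a, b in pairs]
    return statistics.stdev(across), statistics.stdev(along)


# Hypothetical toy series, for illustration only.
sd1, sd2 = poincare_sd([800, 810, 790, 860, 800])
```

A steadily drifting series (for example 800, 810, 820, 830 ms) gives SD1 equal to zero, because every point lies at the same distance from the line-of-identity.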
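The idea can be illustrated with a short, unoptimised sketch of approximate entropy in its usual formulation (template length m, tolerance r, self-matches included). The default values of `m` and `r` and the two test series are assumptions for the example; the Letter does not state the parameters used.

```python
import math
import random


def approximate_entropy(series, m=2, r=0.2):
    """Approximate entropy of a series for template length m and absolute tolerance r."""
    n = len(series)

    def phi(dim):
        templates = [series[i:i + dim] for i in range(n - dim + 1)]
        total = 0.0
        for a in templates:
            # Count templates within tolerance r (maximum coordinate distance),
            # including the self-match.
            matches = sum(
                1 for b in templates
                if max(abs(p - q) for p, q in zip(a, b)) <= r
            )
            total += math.log(matches / len(templates))
        return total / len(templates)

    return phi(m) - phi(m + 1)


# Hypothetical series: a strictly alternating one and a pseudo-random one.
regular = [1.0, 2.0] * 30
rng = random.Random(0)
irregular = [rng.uniform(1.0, 2.0) for _ in range(60)]
apen_regular = approximate_entropy(regular)
apen_irregular = approximate_entropy(irregular)
```

The alternating series is almost perfectly predictable and yields a value close to zero, whereas the pseudo-random series yields a clearly larger value.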
CD is another method used to measure the complexity of the HRV time series.
Detrended fluctuation analysis measures the correlation within the signal and computes two parameters: short-term fluctuations (α1) and long-term fluctuations (α2).
RP is another approach to measure the complexity of the time series . The following measures of RP were computed: recurrence rate, maximal length of lines, mean length of lines, the determinism and the Shannon entropy.
To develop a classifier to identify the fallers, we adopted several data mining approaches and compared the best performance of each algorithm. We adopted data-mining methods based on two tree classification algorithms: classification and regression tree (CART) and C4.5. The two algorithms iteratively split the dataset according to a criterion that maximises the separation of the data, producing a tree-like decision structure. The most relevant difference is the adopted criterion: the Gini index is adopted in the default CART implementation, whereas information gain is used in the C4.5 algorithm.
Random forest (RF) is a decision tree ensemble method developed by Breiman. The decision trees that compose the forest are constructed by choosing their splitting attributes from a random subset of k attributes at each internal node. The best split is taken among these randomly chosen attributes and the trees are built without pruning, as opposed to C4.5. The quality of the split at an attribute is determined by its Gini impurity index. RF avoids overfitting owing to two sources of randomness: the aforementioned random attribute subset selection and bootstrap training set sampling coupled with majority voting (also referred to as bagging), which has been shown to reduce the variance of the classifiers.
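The two splitting criteria named above can be compared on a toy split. The sketch below is illustrative only: the function names and the eight-label example are assumptions, and the information gain is shown in its plain form, without the gain ratio normalisation that C4.5 implementations commonly apply.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def split_gain(parent, left, right, impurity):
    """Impurity decrease obtained by splitting parent into left and right."""
    n = len(parent)
    weighted = len(left) / n * impurity(left) + len(right) / n * impurity(right)
    return impurity(parent) - weighted


# Hypothetical node with 4 fallers (F) and 4 non-fallers (N), split into two children.
left = ["F", "F", "F", "N"]
right = ["F", "N", "N", "N"]
parent = left + right
gini_gain = split_gain(parent, left, right, gini)
info_gain = split_gain(parent, left, right, entropy)
```

Both criteria reward the same split here; they differ only in how strongly they weight the purity of the children.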
Rotation forest (RTF) is an ensemble method capable of both classification and regression, depending on the base classifier . By default, RTF uses C4.5 decision trees as the base classifiers. The algorithm focuses on presenting transformed data to the classifier by using a projection filter such as principal component analysis (PCA), non-parametric discriminant analysis, random projections and independent component analysis. The most successful projection filter is the PCA filter .
AdaBoost.M1 (AB) is a well-known algorithm for boosting weak classifiers. The idea of the AB algorithm is to reduce the weight of the training instances that are correctly classified by the current classifier, so that later classifiers concentrate on the instances that were misclassified. In the first step, the algorithm uses the bootstrap method to select the instances for the first training set, giving an equal chance to all the instances. The base classifier is trained and the instances are classified. The instances that are correctly classified have their weight reduced for the next step of the training-classification cycle. The algorithm terminates after a predetermined number of iterations. A weight is assigned to each constructed classifier. In the testing phase, each classifier provides a probability estimate for the target class. Each time a target class is selected, its weight is increased depending on the weight of the classifier that selected it. Finally, a vote selects the target class with the highest weight.
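One round of the weight update can be sketched as follows, assuming the standard AdaBoost.M1 rule in which the weights of correctly classified instances are multiplied by beta = error / (1 − error) and then renormalised. The function name `adaboost_m1_round` and the five-instance example are hypothetical, and the sketch works directly on instance weights rather than on bootstrap resampling.

```python
import math


def adaboost_m1_round(weights, correct):
    """Update instance weights after one boosting round.

    weights: current instance weights (summing to 1).
    correct: booleans, True where the base classifier was right.
    Returns the new normalised weights and the vote weight of the classifier.
    """
    error = sum(w for w, ok in zip(weights, correct) if not ok)
    if not 0.0 < error < 0.5:
        raise ValueError("AdaBoost.M1 needs a weighted error strictly between 0 and 0.5")
    beta = error / (1.0 - error)
    # Correctly classified instances are down-weighted; the others keep their weight.
    updated = [w * beta if ok else w for w, ok in zip(weights, correct)]
    total = sum(updated)
    return [w / total for w in updated], math.log(1.0 / beta)


# Hypothetical round: five equally weighted instances, the last one misclassified.
new_weights, vote_weight = adaboost_m1_round([0.2] * 5, [True, True, True, True, False])
```

After this round the single misclassified instance carries half of the total weight, which is what drives the next classifier to focus on it.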
MultiBoost (MB) is regarded as an extension to AB that combines the AB algorithm with the wagging procedure, which is an extension of the basic bagging method .
RUSBoost (RB) is a hybrid approach recently proposed by Seiffert et al. to handle class imbalance. RB relies on the random under-sampling (RUS) technique and on AB as the boosting algorithm. CART was adopted as the weak learner. RUS is one of the most common data-level algorithms to deal with an unbalanced dataset or a rare class problem. RUS randomly discards majority-class samples until a desired class distribution (e.g. an equal number of instances in the majority and minority classes) is achieved. However, since HRV features have been shown to be correlated, there is a risk that some of the computed features might be redundant and could worsen the classifier performance by increasing the running time and reducing its generalisation ability. To find the optimal feature space, we adopted the PCA method and tested the proposed classifier with different numbers of dimensions.
To assess the performance of the classifiers, we adopted ten-fold person-independent cross-validation. In this method, subjects are partitioned into two subsets in each round (ten rounds in total): one with 90% of the subjects for training and the other with 10% of the subjects for testing. All the excerpts of the same subject were included either in the training or in the testing dataset. As we were interested in a subject-based classification, for each subject the proportion of excerpts classified as fallers was computed and considered an estimate of the probability that the subject belongs to the fallers. A subject-based receiver operating characteristic (ROC) curve analysis was performed: for all the cut-points, the true positive rate (TPR) and false positive rate (FPR) were calculated. As the best cut-point, we selected the one that maximised the TPR while providing an FPR lower than 20%. The most common measures for binary classification performance were computed according to the formulae in Table 1.
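The random under-sampling step described above can be sketched in a few lines. This is an illustration under stated assumptions, not the RB implementation used in the study: the function name, the `ratio` and `seed` parameters and the use of subject counts as a toy dataset are all hypothetical. In RUSBoost the under-sampling is repeated at each boosting iteration, which is not shown here.

```python
import random


def random_undersample(samples, labels, majority_label, ratio=1.0, seed=0):
    """Randomly discard majority-class samples until majority = ratio * minority."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y != majority_label]
    majority = [i for i, y in enumerate(labels) if y == majority_label]
    keep_n = min(len(majority), round(len(minority) * ratio))
    kept = rng.sample(majority, keep_n)  # sampling without replacement
    index = sorted(minority + kept)
    return [samples[i] for i in index], [labels[i] for i in index]


# Toy dataset with the class sizes of this study (47 fallers, 121 non-fallers).
labels = ["faller"] * 47 + ["non-faller"] * 121
samples = list(range(len(labels)))
balanced_samples, balanced_labels = random_undersample(samples, labels, "non-faller")
```

With `ratio=1.0` the result is a 50:50 class distribution, so 74 of the 121 majority-class items are discarded in this toy case.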
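The subject-based scoring and the cut-point rule can be sketched as follows. The function names, the tie handling (the lowest qualifying cut-point is kept) and the nine hypothetical subjects are assumptions made for illustration; the person-independent fold assignment itself is not shown.

```python
def subject_scores(excerpt_subjects, excerpt_predictions):
    """Proportion of each subject's excerpts that were classified as faller (1)."""
    totals, positives = {}, {}
    for subject, predicted in zip(excerpt_subjects, excerpt_predictions):
        totals[subject] = totals.get(subject, 0) + 1
        positives[subject] = positives.get(subject, 0) + predicted
    return {s: positives[s] / totals[s] for s in totals}


def best_cut_point(scores, is_faller, max_fpr=0.20):
    """Return (cut, TPR, FPR) maximising TPR among cut-points with FPR < max_fpr."""
    n_pos = sum(1 for s in scores if is_faller[s])
    n_neg = len(scores) - n_pos
    best = None
    for cut in sorted(set(scores.values())):
        tp = sum(1 for s, v in scores.items() if v >= cut and is_faller[s])
        fp = sum(1 for s, v in scores.items() if v >= cut and not is_faller[s])
        tpr, fpr = tp / n_pos, fp / n_neg
        if fpr < max_fpr and (best is None or tpr > best[1]):
            best = (cut, tpr, fpr)
    return best


# Hypothetical excerpt-level predictions for two subjects.
toy_scores = subject_scores(["s1", "s1", "s1", "s1", "s2", "s2"], [1, 1, 0, 1, 0, 0])

# Hypothetical subject-level scores: three fallers (A-C) and six non-fallers (D-I).
scores = {"A": 0.9, "B": 0.6, "C": 0.3, "D": 0.7, "E": 0.4,
          "F": 0.2, "G": 0.1, "H": 0.1, "I": 0.0}
is_faller = {s: s in {"A", "B", "C"} for s in scores}
cut, tpr, fpr = best_cut_point(scores, is_faller)
```

In the toy example a cut-point of 0.3 would detect all three fallers but would also flag two of the six non-fallers, so the rule settles on 0.6, where only one non-faller is flagged.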
The HRV analysis was performed using an ad hoc developed HRV software based on MATLAB version R2013a (The MathWorks Inc., Natick, MA) implementation  and the QRS detection was performed through the WQRS implementation  which is freely available from PhysioNet. The classifiers based on RB algorithm were developed in MATLAB, whereas the other algorithms were implemented using the Weka platform for knowledge discovery, version 3.6.10.
RF was constructed using an ensemble of 100 random trees with no limit to tree depth. RTF was constructed with an ensemble of 10 C4.5 trees using PCA filter (all dimensions retained). AB was used in combination with C4.5 classifier, the number of iterations was varied between 10 and 200 with steps of 50 iterations, and C4.5 decision tree was tested with a variable minimal number of observations in each leaf (2, 5, 10, 20). Confidence factor for pruning was set to default of 0.25. MB was used in combination with C4.5 classifier, the number of iterations was varied between 10 and 200 with steps of 50 iterations, and C4.5 decision tree was tested with a variable minimal number of observations in each leaf (2, 5, 10, 20). Confidence factor for pruning was set to default of 0.25. The number of sub-committees was set to 30% of the number of iterations (classifiers), as default. RB was evaluated with PCA dimension varying between 2 and 20. The number of iterations was varied from 20 to 500 with steps of 20 iterations, and CART was tested with a variable minimal number of observations in each leaf (5, 25, 50 and RB default) and of misclassification cost ratio (from 1 to 20). Post-sampling 50:50 class distribution was adopted.
The study sample included 60 female and 108 male subjects (age 72 ± 8 years). Among them, 47 subjects experienced a fall within 3 months from the recordings. No significant differences in age and gender distribution were detected between the two groups, while a repeated measurement regression analysis by generalised estimation equation showed significant differences (p < 0.001) in LF power and in three nonlinear features: the maximal length of lines, the mean length of lines and the Shannon entropy (all RP features).
Several classification algorithms were trained and tested with the parameter values reported in Section 2.4. The ROC curves of the best classifier for each algorithm are shown in Fig. 1 and the related performances at the selected cut-points (highest TPR with an FPR lower than 20%) are reported in Table 2.
The algorithms based on the boosting approach appeared to be superior to those based on bagging and the method based on RB achieved better performances in terms of accuracy, sensitivity, and positive and negative predictive values, compared with the other classifiers. These results were achieved with the following values of the parameters: 11 PCA dimensions, 180 iterations, learning rate of 0.7, misclassification cost ratio of 5 and 5 minimal observations in leaves. In particular, RUS improved the performance of the AB algorithm, increasing the AUC from 51.7 to 63.9% and the sensitivity rate from 25.5 to 40.4% without relevant decrease in specificity rate. Using PCA resulted in a further improvement of the RB performance, by increasing sensitivity rate from 40.4 to 51.1% and AUC from 63.9 to 67.6% (with an unchanged specificity rate of 80.2%).
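The reported rates can be cross-checked with the usual formulae for binary classification. The confusion counts used below (24 true positives, 23 false negatives, 97 true negatives, 24 false positives) are not given in the text: they are back-calculated here from the reported sensitivity, specificity and group sizes (47 fallers, 121 non-fallers) and should be read as an assumption. The function name `binary_metrics` and the log-based confidence interval for the diagnostic odds ratio are likewise illustrative choices.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Common binary classification measures from subject-level confusion counts."""
    dor = (tp * tn) / (fp * fn)  # diagnostic odds ratio
    # Standard error of log(DOR), used for an approximate 95% confidence interval.
    se_log_dor = math.sqrt(1 / tp + 1 / fn + 1 / tn + 1 / fp)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "dor": dor,
        "dor_ci95": (
            math.exp(math.log(dor) - 1.96 * se_log_dor),
            math.exp(math.log(dor) + 1.96 * se_log_dor),
        ),
    }


# Assumed counts, inferred from the reported percentages (not stated in the Letter).
metrics = binary_metrics(tp=24, fn=23, tn=97, fp=24)
```

Under these assumed counts the sketch returns a sensitivity of about 51.1%, a specificity of about 80.2% and an accuracy of about 72%, in line with the values reported for the RB classifier with PCA.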
The current study proposed an automatic classifier based on HRV analysis to identify fallers among hypertensive patients. To the best of the authors’ knowledge, only one other study investigated the discrimination power of HRV features for the identification of fallers using 24 h ECG, but it was a retrospective study and did not propose an automatic classification method. Moreover, in that study, the authors adopted only standard linear HRV methods (i.e. SDNN, RMSSD, pNN50, total power, HF, LF, VLF), and observed no significant differences in these measures between fallers and non-fallers. The statistical analysis of the linear and nonlinear HRV measures of the current dataset showed that frequency and nonlinear measures, which were not computed in that study, significantly differed between fallers and non-fallers. Moreover, the statistical analysis suggested that a depressed HRV, particularly at LF, and a less ‘chaotic’ behaviour of HRV, as assessed by RP features, could be associated with an increased risk of falling. Finally, we computed the feature importance according to the RF algorithm, and observed that the ten most relevant features include frequency-domain features expressed in normalised units, nonlinear features and geometric linear features, none of which were computed in that study.
The best performance presented in this Letter was achieved by a hybrid data-mining algorithm, RB, integrated with feature extraction based on PCA. This classifier achieved a relatively high specificity and accuracy (80 and 72%, respectively), but low sensitivity (51%). In particular, the sensitivity achieved is consistent with the findings of Rubenstein, who highlighted that at least 42% of falls are due to transient problems, which are related to ANS and CVS states. Since only a limited proportion of falls is directly caused by CVS events (e.g. syncope), the results presented in the current Letter suggested for the first time that ANS/CVS dysfunctions may be responsible for a temporarily reduced capability to react to extrinsic risk factors (e.g. reduced reflex velocity) and thus to avoid falls. Moreover, for the first time, this study showed that these dysfunctions are detectable with HRV monitoring. Finally, the low rate of false positives (1−SPE = 19.8%) suggested that this approach based on HRV analysis could be successfully used in clinical settings, possibly in combination with other approaches.
Several fall risk assessment tools for the elderly population have been proposed in the literature and showed a wide variability in the reported diagnostic accuracy: sensitivity varied from 43 to 100% and specificity from 38 to 96%. To assess the performance of the proposed method, we compared its ROC curve with the performances of several functional mobility tests for predicting falls in community-dwelling older people (Fig. 2). The proposed method showed higher performance than all the functional tests, which had relative risk (RR) ranging from 1.3 to 2.3 and sensitivity and specificity scores ranging from 11 to 78% and from 28 to 93%, respectively. More recently, a Stroop stepping test using low-cost computer gaming technology has been proposed to discriminate between older fallers and non-fallers, but the authors provided only the odds ratio (1.7), which is lower than the diagnostic odds ratio obtained here (DOR = 4.2, CI 95% = 2.0–8.7, p-value < 0.001) and reported in Table 2. Finally, the method proposed in the present Letter is clinically feasible, since it only requires a 24 h ECG recording, which is often performed in cardiovascular patients or through wearable devices. In particular, the proposed method does not require other technologies such as wearable accelerometers or pressure matrices, which are not used in everyday clinical practice because they have no direct benefits for hypertensive outpatients. For that reason, the proposed method could be used widely in outpatient settings to identify high-risk patients who need further assessment and could benefit from fall prevention programs or fall detection systems [37, 38]. In particular, the HRV analysis and the automatic classification could be obtained through the web cloud-based platform developed in the framework of the Smart Health and Artificial intelligence for Risk Estimation (SHARE) project: the ECG could be easily acquired by a commercial wearable device (e.g. Bioharness BH3 manufactured by Zephyr Ltd) and an ad hoc developed client application; the physician could see the acquired signal and the processing results (i.e. HRV analysis and automatic classification) by using a web browser. The SHARE platform is described in detail elsewhere.
Regarding the classification data mining methods, we adopted up-to-date ensemble algorithms based on bagging (i.e. RF, RTF) and boosting (i.e. AB, MB, RB), showing that the latter are superior to the former on this problem, possibly because RF performance is more affected by the dependency structure of the data. Moreover, as recently proposed by Seiffert et al., when a dataset is imbalanced, as in the current study, the performance of a boosting algorithm can be improved by integrating it with a data sampling technique. Finally, since previous studies showed the importance of feature selection for learning from small and imbalanced datasets, we integrated RB with a feature extraction method based on PCA and observed that using PCA resulted in higher performances compared with both RB alone and a bagging classifier adopting the PCA filter.
However, this study had some limitations that should be considered before adopting these methods in other contexts. The dataset used was not specifically designed to study falls. Therefore, important information, such as the exposure to other independent intrinsic risk factors for falls, could not be accessed or used to verify the results independently. Moreover, the fall records were based on patient self-reports, which are not always reliable because some non-harmful falls can be forgotten and not reported. Therefore, the number of falls could have been underestimated. In addition, the results of the classifiers could be difficult to interpret, as the employed methods mixed and masked those HRV features that have an accepted clinical meaning. Rule-based models to distinguish fallers from non-fallers could be more suitable for medical personnel. However, with respect to maximum achieved accuracy, the opaque models obtained from automatic classifiers have an advantage over those with a clear interpretation. It should be noted that automatic systems in this field are sufficient to provide early warning signs before an adequate medical assessment can be performed. Finally, our findings have been obtained in a population of hypertensive patients, in which HRV is already known to be depressed compared with healthy people. This suggests that depressed HRV could be a more relevant risk factor for falls in people free of cardiovascular disease.
The current study proposed an automated method based on HRV analysis to identify fallers among elderly people suffering from cardiovascular disease. The classifier presented achieved a satisfactory overall diagnostic accuracy and specificity, showing better performances than several functional tests proposed in the literature for fall risk assessment. As the proposed method requires only an ECG recording, which is often performed in cardiac patients, it could be an inexpensive and clinically feasible tool for identifying older hypertensive subjects in need of further medical assessment. The accuracy and the sensitivity achieved suggest that HRV-based classification would be a valuable complement to other multidisciplinary approaches already in use to predict and prevent falls.
This work was supported by the 2007–2013 NOP for Research and Competitiveness for the Convergence Regions (Calabria, Campania, Puglia and Sicily) with code PON04a3_00139 – Smart Health and Artificial intelligence for Risk Estimation. Conflict of interest: none declared.