The goal of this study was to determine which of several clinical balance tests best identifies patients with vestibular disorders. We compared the scores of normals and patients on the Berg Balance Scale (Berg), Dynamic Gait Index (DGI), Timed Up and Go (TUG), Computerized Dynamic Posturography Sensory Organization Test (SOT), and a new obstacle avoidance test: the Functional Mobility Test (FMT). The study was performed in an out-patient balance laboratory at a tertiary care center. Subjects were 40 normal adults, and 40 adults with vestibular impairments. The main outcome measures were the sensitivity of tests to patients and specificity to normals. When adjusted for age the Berg, TUG, DGI and FMT had moderate sensitivity and specificity. SOT had moderately high sensitivity and specificity. SOT and FMT, combined, had high sensitivity and moderate specificity. Therefore, the kinds of tests of standing and walking balance that clinicians may use to screen patients for falling are not as good for screening for vestibular disorders as SOT. SOT combined with FMT is better. When screening patients for vestibular disorders, when objective diagnostic tests of the vestibular system, itself, are unavailable, tests of both standing and walking balance, together, give the most information about community-dwelling patients. These tests may also indicate the presence of sub-clinical balance problems in community-dwelling, asymptomatic adults.
The literature describes many clinical tests of balance. Some tests selectively evaluate standing or walking balance; some tests have only one component, others have many subtests. Some tests are designed for frail, institutionalized individuals; other tests are designed for relatively healthy people. No evidence suggests which tests are best for screening particular disorders, or whether both standing balance and walking balance should be tested. Most studies have examined falls prediction. No studies have examined the value of a battery of inexpensive screening tests to suggest which individuals might have vestibular impairments and might benefit from referral for further testing.
The goal of screening is to identify patients who may benefit from in-depth diagnostic testing. An ideal screening test requires minimal equipment, is easy to administer in a short period of time, and has high sensitivity, thus minimizing the likelihood of a false negative result. For example, the well-known Berg Balance Scale, which was designed to evaluate standing balance in elderly patients [1, 2], fits that description. It is user-friendly, includes 14 brief subtests, uses minimal, inexpensive equipment, and is easily scored by a staff member using a 5-point ordinal scale. It predicts falling in seniors, differentiates among normals, people with Parkinson’s disease and people with peripheral neuropathy, and is sensitive to change after vestibular rehabilitation. With in-patient stroke patients it detected fallers well, but specificity to ambulatory fallers increased when the Berg was combined with a test of walking. The Berg uses a single cut-point to separate normal from abnormal scores. The finding that age and sex affect scores suggests, however, that multiple cut-points might be more useful.
Computerized dynamic posturography, using the Equitest (Neurocom International, Inc.), measures changes in the center of pressure as the body sways over the feet during various conditions of quiet standing on a force platform. It has been considered the criterion standard since publication of the seminal paper by Nashner and his colleagues. The Sensory Organization Test (SOT) tests subjects on six combinations of visual (eyes open and reliable, eyes closed, or eyes open and unreliable) and proprioceptive (reliable vs. unreliable) conditions. The most challenging conditions are sensitive to people with histories of falls and show changes after space flight. All subtests show changes with age. The equipment, however, is large and not easily moved, and the cost may be beyond the budget of many small clinics, limiting its use in many clinical environments.
The Get Up and Go Test, sharpened by timing it as the Timed Up and Go (TUG), is a test of walking balance designed to identify elderly fallers. It is easy for even cognitively impaired elderly people to understand, requires minimal equipment and is easy to score and interpret. It differentiates elderly patients at moderate to high risk of falling from individuals at low risk for falling, elderly, institutionalized patients from community-dwelling seniors, and community-dwelling seniors who fall from non-fallers. On an in-patient stroke unit, compared to the Berg, the TUG had slightly greater specificity to fallers but less sensitivity to ambulatory non-fallers.
The Dynamic Gait Index (DGI) also tests walking balance. Similar to the Berg, it uses minimal equipment, has several subtests, and is easily scored. It has moderate sensitivity to patients with balance disorders but good sensitivity to fallers with vestibular disorders [20, 21]. It is well-constructed and particularly useful for community-dwelling older adults with balance problems. Scores on the DGI and the Berg are moderately correlated. Inter-rater reliability on individual test items varies from poor to excellent. Similarly, test-retest reliability ranges from poor to excellent, depending on the subtest, although overall test-retest reliability is high. Changes on the DGI may be related to vestibular compensation.
Obstacle avoidance is an important component of many mobility skills. Older adults generally perform worse than younger adults on obstacle avoidance tests [27–29]. Normals undergoing visuomotor and vestibulomotor adaptation perform poorly on obstacle avoidance tasks [30–32]. Obstacle avoidance during treadmill walking has been shown to be sensitive to change in fallers after a falls prevention program, although a standing balance test showed no change. Not surprisingly, the DGI includes an obstacle avoidance subtest. The first goal of the present study was to test the usefulness of our previously developed Functional Mobility Test (FMT) obstacle avoidance task [31, 32] as a test of locomotor balance.
The literature does not indicate which test or combination of tests best identifies patients with balance disorders or best predicts which patients have vestibular impairments. Most studies have examined falls prediction. Vestibular disorders have complex manifestations, so a combination of tests that measure different factors may be more useful for screening than a single test [6, 34]. Since standing and walking are different skills, a combination of tests of standing and walking may be most accurate in predicting patients with balance impairments.
The second goal of the present study was to determine which test or combination of tests would be best for screening people for vestibular disorders. Such screening tests could be used by health care providers who are not physicians, to help identify individuals who might benefit from referral to a physician who has expertise in diagnosis of vestibular disorders. Such tests might also be useful in population-based epidemiologic screening studies that require inexpensive but valid and reliable screening tests to approximate the incidence and prevalence of vestibular disorders in various populations. Aside from FMT, which we developed, we selected tests because they are common, normed, clinical tests; easy to administer; easy to score and easy to interpret.
Subjects were 40 normal people, aged 18 to 62 years (mean 38.1 yrs, SD 12.9), including 12 males and 28 females, and 40 patients, aged 22 to 73 years (mean 57.4 yrs, SD 13.7), including 15 males and 25 females. Normals were recruited from the staff, students, and visitors at Baylor College of Medicine, via e-mail to Baylor students, and word of mouth to staff and visitors. The laboratory, which is a tertiary care center, is also the clinical laboratory for the college, and does objective diagnostic tests of the vestibular system, which take 2 to 3 hours. Therefore, patients’ family members and friends often wait for long periods in our waiting area. Visitors in the waiting area were informed about the study and were invited to participate if they fit the criteria for normals. Due to the expense and the long time needed for diagnostic testing, normals were not tested on objective diagnostic tests to ensure that they were normal. Instead, we relied on self-report of absence of symptoms and absence of prior history of otologic, neurologic, and orthopedic disorders. Normals could walk independently without ataxia, and had no history of otologic or neurologic disorders and no musculoskeletal limitations.
Patients were recruited from the caseload of patients referred to the Center for Balance Disorders for diagnostic testing and/or vestibular rehabilitation. All patient subjects were ambulatory without gait aids, had no significant musculoskeletal limitations and had no significant lower extremity peripheral neuropathies. They had been diagnosed with vestibular disorders by board-certified otolaryngologists and neurologists. The referring physicians made the diagnoses based on the clinical histories, clinical examinations, vestibular diagnostic tests and any other tests that the referring physicians chose to order. All patients had positive findings on at least one of the objective diagnostic tests of the vestibular system, including low frequency sinusoidal tests of the vestibulo-ocular reflex in darkness, bi-thermal caloric tests, and Dix-Hallpike maneuvers. We did not have access to results of other tests. Table 1 lists patients’ diagnoses.
The standards for diagnostic testing in our laboratory are as follows: bi-thermal caloric testing with water, caloric weakness >20% or total velocity <25°/sec; rotational tests in darkness at 0.0125 Hz, 0.05 Hz and 0.2 Hz, normal ranges are gains of 0.3 to 0.8, 0.4 to 0.95, and 0.45 to 1.0, respectively; Dix-Hallpike maneuvers and positional tests, presence of nystagmus.
All subjects performed all tests. SOT was given on the Equitest (Neurocom), using the control condition (SOT 1, standing quietly with eyes open) and the vestibularly challenging condition with sway-referenced force platform motion and eyes closed (SOT 5). Three trials of each condition were used, per the manufacturer’s instructions. Subjects wore a safety harness that is part of the apparatus. The dependent measures for each condition were the average equilibrium score from all three trials per condition (SOT 5 eq) and the number of falls (SOT 5 falls).
The Berg, DGI and TUG were given per the published instructions [1, 2, 13, 18]. The Berg includes 14 subtests, each test graded on a 5-point scale in which 4 is normal and 0 is the worst possible score. For TUG, subjects began by sitting in a standard armchair, seat height and depth 46 cm each, arm height 64 cm. They were instructed to stand, walk 3 meters at a comfortable pace, turn around, walk back and sit down. They were timed with a stopwatch. Ataxia was graded on a 4-point scale in which 1 meant normal. The DGI includes 8 subtests, graded on a 4-point scale in which 3 is normal and 0 is the worst score.
For the FMT, which has been described previously [31, 32], subjects walked through an obstacle course, 6.6 × 5.2 m, on 10.16 cm thick, medium density, compliant foam (Sunmate; Dynamic Systems, Leicester, NC). The course included 2 pleated paper curtains suspended from the ceiling at shoulder height and two pairs of Styrofoam blocks (41 cm × 10 cm) placed across the foam. Each curtain plus a pair of blocks made a portal: the subject simultaneously stepped over the blocks and under the curtain. The course also included 4 pairs of inflated, sand-weighted pylons 0.9 – 1.4 m × 0.4 m diameter (children’s bop bags), 4 noise-making spots, and two low (20 cm) Styrofoam blocks. A small bell was taped to each obstacle to facilitate counting obstacles as they were bumped. Subjects were instructed to walk through the course as quickly as possible without touching any obstacles but touching all of the noise spots. They did two trials, which were timed with a stopwatch. See Figure 1 and Figure 2. To avoid the possibility of a learning effect, only the data from Trial 1 were used. The dependent measures were time around the course and the number of obstacles bumped.
The three laboratory technicians who tested subjects all held undergraduate degrees in science, were all experienced in administering vestibular diagnostic tests including SOT, and all routinely participated in collecting research data. Collectively they had 51 years of diagnostic and research testing experience: (bioengineer: 18 years; registered electroencephalography technician: 30 years; certified medical assistant: 3 years). Staff had already been trained to administer FMT. They practiced administering the Berg, TUG and DGI until inter-rater reliability was >0.9. Staff who administered tests could not be blinded to group because patients had had their diagnostic tests in the laboratory. Thus, they were already known to the staff.
The test battery took approximately 25 minutes. Tests were given in random order. Subjects sat down after SOT and FMT. They could also rest as needed in between the other tests. No subjects complained of fatigue.
Each subject gave informed consent prior to testing. The Institutional Review Board for Baylor College of Medicine and Affiliated Hospitals approved this study.
Logistic regression adjusted for age of each subject was used to quantify the association between patient/control status and prediction of disease by the diagnostic tests. Standard definitions of sensitivity and specificity were used to determine how accurately patients and control subjects were classified by each test, e.g., sensitivity = true positive / (true positive + false negative). The positive likelihood ratio was calculated as sensitivity / (1 - specificity), post-test odds as pre-test odds × positive likelihood ratio, and post-test probability as post-test odds / (post-test odds + 1). ROC (Receiver Operating Characteristic) areas under the curve were calculated to quantify and compare tests and combinations of tests. Published norms (cut-points) were used in these analyses for SOT 5 eq, TUG time, TUG ataxia, Berg Balance Scale and Dynamic Gait Index. These cut-points were taken from the published studies [1, 2, 13, 18] and from the operator’s manual for SOT published by the manufacturer. ROC analyses were used to determine cut-points for the new FMT tests: time to complete the course (sec) and number of obstacles touched. Sensitivity rather than specificity was emphasized when choosing cut-points, because we were concerned with identifying abnormal balance, rather than normal balance. In the absence of reliable public health data on the prevalence of vestibular disorders, in general, we estimated the pre-test prevalence as 50% for calculation of odds ratios. Stata statistical software was used for the analyses.
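The arithmetic behind these definitions can be illustrated with a minimal Python sketch. The function name `screening_metrics` and the counts in the usage example are hypothetical and are not the study's data; the sketch only applies the formulas stated above, with the 50% pre-test prevalence as the default. It does not reproduce the age-adjusted logistic regression or the ROC analyses, which were run in Stata.

```python
import math


def screening_metrics(tp, fn, tn, fp, pretest_prob=0.5):
    """Apply the standard definitions to a 2x2 classification table.

    tp, fn: patients classified as abnormal / normal by the test
    tn, fp: normals classified as normal / abnormal by the test
    pretest_prob: assumed pre-test prevalence (0.5 mirrors the text)
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)

    # Positive likelihood ratio = sensitivity / (1 - specificity).
    # With perfect specificity the ratio is unbounded.
    if specificity == 1.0:
        lr_positive = math.inf
    else:
        lr_positive = sensitivity / (1.0 - specificity)

    pretest_odds = pretest_prob / (1.0 - pretest_prob)
    posttest_odds = pretest_odds * lr_positive

    # Post-test probability = post-test odds / (post-test odds + 1).
    if math.isinf(posttest_odds):
        posttest_prob = 1.0
    else:
        posttest_prob = posttest_odds / (posttest_odds + 1.0)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": lr_positive,
        "posttest_odds": posttest_odds,
        "posttest_prob": posttest_prob,
    }


# Hypothetical counts for 40 patients and 40 normals (illustration only).
example = screening_metrics(tp=32, fn=8, tn=30, fp=10)
```

With these made-up counts, sensitivity is 0.80, specificity is 0.75, the positive likelihood ratio is 3.2, and the post-test probability is about 0.76. Because the pre-test odds are 1 at 50% prevalence, the post-test odds equal the likelihood ratio.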
Published norms were used for the cut-points separating normal from abnormal scores on previously normed tests. For SOT 5, the standard cut-points are age adjusted: age 20–59, cut-point = 52; age 60–69, cut-point = 51; age 70–79, cut-point = 45. For the other published tests only one cut-point is used: for the Berg, 45; for the DGI, 19; and for TUG, time ≤12 sec and ataxia = 1. See Table 2.
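As an illustration of how an age-banded cut-point might be applied, the following Python sketch looks up the SOT 5 cut-point for a subject's age. The names `SOT5_CUT_POINTS`, `sot5_cut_point` and `sot5_is_abnormal` are hypothetical. The sketch assumes that ages are whole years and that an equilibrium score strictly below the cut-point counts as abnormal; the text does not say whether a score equal to the cut-point is normal, and it gives no cut-point for ages outside 20 to 79.

```python
# Age bands (inclusive, whole years) and SOT 5 cut-points as listed in the text.
SOT5_CUT_POINTS = [
    ((20, 59), 52),
    ((60, 69), 51),
    ((70, 79), 45),
]


def sot5_cut_point(age):
    """Return the SOT 5 cut-point for an age, or None if no band applies."""
    for (low, high), cut_point in SOT5_CUT_POINTS:
        if low <= age <= high:
            return cut_point
    return None


def sot5_is_abnormal(age, equilibrium_score):
    """Classify an SOT 5 equilibrium score against the age-banded cut-point.

    Assumption: scores strictly below the cut-point are abnormal.
    """
    cut_point = sot5_cut_point(age)
    if cut_point is None:
        raise ValueError(f"No published cut-point listed for age {age}")
    return equilibrium_score < cut_point
```

For example, under these assumptions a 65-year-old with an equilibrium score of 50 would be classified as abnormal, while the same score of 50 would also be abnormal at age 45 (cut-point 52) but normal at age 72 (cut-point 45).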
To help interpret the results below, descriptive terms for percent of individuals classified are defined as follows: low, < 60%; moderately low, ≥60% and < 70%; moderate, ≥70% and ≤80%; moderately high, ≥ 81% and <90%; high, ≥ 90%. To assist the reader who is not an expert in statistics, consider the following verbal definitions of statistical concepts: likelihood ratio indicates how much the odds of disease increase when a test is positive; post-test probability indicates the probability of disease if the test is positive. This measure estimates how much the result on a diagnostic test changes the probability that a patient has a disease.
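The descriptive bands above can be written as a small lookup, sketched here in Python. The function name `describe_percent` is hypothetical. Because the stated bands leave values between 80% and 81% undefined, the sketch assumes percentages are rounded to whole numbers before they are labeled.

```python
def describe_percent(percent):
    """Map a percent-classified value to the descriptive term used in the text.

    Assumption: the value is rounded to a whole percent first, since the
    stated bands leave a gap between 80 and 81.
    """
    p = round(percent)
    if p < 60:
        return "low"
    if p < 70:
        return "moderately low"
    if p <= 80:
        return "moderate"
    if p < 90:
        return "moderately high"
    return "high"
```

For instance, a value of 78 maps to "moderate" and a value of 81 maps to "moderately high".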
As shown in Table 3, using the published cut-points and adjusted for age, the Berg, TUG time, TUG ataxia and DGI all had moderate specificity and sensitivity. Therefore the usefulness of any one of those tests to classify vestibularly impaired individuals correctly was moderate, at best. As shown in Table 2, scores of normals and patients on those tests did not differ. Using the standard, age-appropriate cut-points for the SOT scores, SOT 5 eq and falls, using a cut-point of > 0 falls, had moderately high specificity and moderate sensitivity, and moderately high ability to classify patients and normals, combined. See Table 3.
ROC analyses were used to determine cut-points for FMT time. Using a cut-point of 23 sec for FMT time, and adjusted for age, specificity was moderate (78) and sensitivity was moderate but slightly higher (80). The total or combined sensitivity plus specificity was moderate. Using a cut-point of 1 or more obstacles touched, and adjusted for age, FMT obstacles sensitivity, specificity and total percent correctly identified were moderate (78). See Table 3. ROC analyses for other tests are reflected in the total percent correctly classified in Table 3 and Table 4. Subjects seemed to use one of two strategies, doing well either on time or on obstacles, so individuals who did well on time did not do well on obstacles. As shown in Figure 3, most normals were quite fast and bumped several obstacles. By contrast, almost all patients went slowly and avoided hitting obstacles. We do not know if they planned to go slowly and thus avoided obstacles or if they planned to avoid obstacles and thus moved slowly. As indicated in Table 2, scores of patients were more variable than those of normals.
We examined combinations of tests to determine if using two tests improved sensitivity to patients. To be positive for a combination the subject had to be positive on both tests. As shown in Table 4, SOT 5 falls >0 plus the standard, age-adjusted SOT 5 eq combined with FMT time or with FMT obstacles had high sensitivity but moderate specificity. Combining SOT 5 with other tests gave similar or slightly weaker results. Combining the Berg with DGI, TUG or FMT also gave somewhat weaker results, as shown in Table 4. When combinations of standing balance and walking balance tests were compared, none of the combinations differed significantly (p>0.15 for combinations without SOT 5; p>0.32 for combinations that included SOT 5).
The Berg, TUG and DGI may be useful for predicting falls in elderly people with multifactorial balance disorders or other people with balance impairments so severe that they fall, although the usefulness of TUG for elderly fallers is open to question. These tests are moderately useful for identifying balance impairments in younger, community-dwelling people with vestibular impairments who may not have significant histories of falling. Individually, SOT 5 and the new FMT are somewhat more useful, as indicated by the higher likelihood ratios. SOT 5 combined with FMT, TUG time or DGI has higher sensitivity, although the likelihood ratios are not much better due to somewhat decreased specificity. Similarly, if the Berg is combined with other tests sensitivity increases somewhat. The combinations with SOT 5 yielded higher sensitivity than the combinations with the Berg. Thus, if computerized dynamic posturography is available to the clinician, then combining SOT 5 with a test of walking balance is advisable to get the most value for the patient from these screening tests. If computerized posturography is not available, then the Berg should be combined with one of the tests of walking balance.
These results appear to differ from a previous study of the DGI and TUG in patients with vestibular disorders, but the findings about sensitivity and specificity are not comparable due to differences in the paradigms. Whitney et al. examined the usefulness of tests for identifying fallers based on falls histories and did not use a normal control group, and their subjects had worse scores on TUG and DGI than those in the present study. They found that the TUG was useful for identifying fallers at the cut-point of 11.1, which was higher than the planned cut-point, and they reported moderate specificity. The terminology can be confusing. Unlike the present study, Whitney et al. defined specificity as the rate of false positives based on falls histories. In contrast to our finding of low sensitivity to vestibular disorders and high specificity for normals, they reported moderate sensitivity to falls and specificity for false positives for the DGI. In the present study we were concerned not with prediction of falling but with prediction of vestibular impairment regardless of falls status. Thus, our lower sensitivity and higher specificity scores are not surprising.
These data suggest that to determine if a patient might have a clinically significant balance impairment, combining standing and walking balance tests would be more useful than giving one or the other test, alone, thus supporting earlier work. These data also suggest that this combination of tests may identify some sub-clinical balance abnormalities in high-functioning, community-dwelling adults who are identified as normal. Therefore, given a choice between high sensitivity and high specificity, high sensitivity may be more useful to the clinician when screening a clinical population.
These findings support and extend previous research showing that combining tests of standing balance and walking is the best way to identify patients with uncompensated vestibular impairments. Previous work shows that tests of standing and walking are not highly correlated. We have shown that sensitivity is higher for the combined tests than for each test, alone.
This study has some limitations. Data were collected at a tertiary care center. All patient subjects were already known to have vestibular disorders. Therefore we did not test patients complaining of vague “dizziness” that could have been caused by other health conditions. These tests are not intended to be used as diagnostic tests or to replace the role of the physician in determining the definitive diagnosis. Therapists and other health professionals cannot use these screening tests to make diagnostic decisions. This study is a first effort toward developing better screening tests that may be useful in epidemiologic studies and in treatment settings, outside of a tertiary care facility, where physicians with limited facilities and non-physician health providers may see patients, such as nursing homes, rehabilitation centers, private practice therapy clinics, and geriatric day care centers.
We were unable to use age-matched controls, so the mean age of normal subjects was less than the mean age of vestibularly impaired subjects. Normal young and middle-aged adults have similar scores on balance testing, although older adults’ scores are lower. Therefore for this preliminary study the difference in ages may not be of major importance. Also, data from objective diagnostic tests were not available for most normal subjects. We did not have equal numbers of males and females in either group since the majority of patients referred for diagnostic testing or vestibular rehabilitation are female. Thus, while our data represent a reasonable sample of our patient population, they may not be strictly comparable to the general population.
Testing could not be done blinded. Staff members knew which subjects were patients. That problem was ameliorated somewhat because SOT is scored by the computer. The other tests are easy to score objectively with little room for interpretation. Future research will control for blinding and verification of normals.
Standard balance tests used in rehabilitation, which may be good predictors of falls, are not sensitive indicators of vestibular disorders. For screening patients who complain of falls or dizziness, the Berg, DGI and TUG do not suggest the underlying cause of the problem. Thus, treatment planning may require other kinds of screening. Computerized dynamic posturography and an obstacle avoidance task are at least as sensitive to vestibular disorders as the Berg, DGI or TUG. When the tests are combined they are even more sensitive indicators of vestibular disorders. The results of this combined battery may suggest the need for referral for objective diagnostic tests of vestibular function.
Supported by NIH grants DC03602 and DC04167. We thank the staff of the Center for Balance Disorders for technical assistance.