J Neurol Phys Ther. Author manuscript; available in PMC 2010 October 12.
Published in final edited form as:
J Neurol Phys Ther. 2008 June; 32(2): 70–79.
doi:  10.1097/NPT.0b013e3181733709
PMCID: PMC2952827

The Reliability of the Vestibular Autorotation Test (VAT) in Patients with Dizziness

Philip J. Blatt, PT, PhD, NCS, Michael C. Schubert, PT, PhD, Kathryn E. Roach, PT, PhD, and Ronald J. Tusa, MD, PhD



To establish intrarater and interrater reliability of the Vestibular Autorotation Test (VAT) (Western Systems Research Inc., Pasadena, CA) in a clinical sample of individuals reporting dizziness.

Study Design

Ninety-eight patients with reports of dizziness referred for vestibular function testing performed repeated trials of horizontal VAT. A subsample of 49 individuals repeated the test for a second rater.


Approximately 66% of subjects were unable to meet the performance criterion of six consecutive trials in which data were displayed at frequencies ≥3.9 Hz with coherence values held constant from trial to trial. Intrarater reliability for gain was good and independent of the effects of practice (intraclass correlation coefficient [ICC] = 0.78 [95% confidence interval (CI): 0.69–0.87] to 0.95 [95% CI: 0.93–0.97]). A significant difference in intrarater reliability was found when the first three trials were compared to the last three trials for phase (ICC ranged from 0.04 [95% CI: 0.00–0.31] to 0.96 [95% CI: 0.93–0.97]) and asymmetry (ICC ranged from 0.39 [95% CI: 0.17–0.56] to 0.73 [95% CI: 0.32–0.81]), particularly at frequencies ≥4.3 Hz. Interrater reliability was good to excellent across all variables at frequencies ≤3.9 Hz.


Many patients had difficulty performing the VAT. The reliability estimates for phase and asymmetry, but not gain, were significantly affected by practice. Careful attention to patient preparation, instruction, and test monitoring, including sufficient patient practice before data collection, is likely to be critical to ensuring quality data.

Keywords: vestibular function testing, vestibulo-ocular reflex, vestibular hypofunction, dizziness, reliability


The angular vestibular ocular reflex (aVOR) stabilizes visual images on the retina during head rotation. The aVOR is able to stabilize visual motion on the retina over a broad spectrum of head movement frequencies from 2 to 20 Hz.1 These frequencies of head motion span the range encountered during functional activities.2 The aVOR is the only system able to detect head rotation at these high frequencies. Deficiency of the aVOR is often perceived by the individual as blurring or jumping of the visual scene while the head is moving. This is termed oscillopsia and is a common report of patients with vestibulopathy.3

The gold-standard method for diagnosis of peripheral vestibulopathy is the electronystagmography (ENG) test battery, which includes the bithermal caloric test.4,5 However, there are limitations to the caloric test for the assessment of peripheral vestibular function. Caloric testing has very low frequency response characteristics (≈0.003 Hz) that are well below head movement frequencies that occur during functional activities.6,7 It is therefore possible that a subject may have a normal caloric test, but manifest aVOR deficits during head movements within a higher frequency range. Additionally, caloric deficits may be detected in an individual who has functionally recovered after peripheral vestibular disease (central compensation).8,9 Finally, caloric testing can be extremely noxious for individuals with remaining peripheral vestibular function or for those who are anxious. These features represent significant limitations in the use of the caloric test, not only as a diagnostic tool, but also as a mechanism to monitor patients reporting dizziness or vertigo.

Rotary chair testing (the application of either a step-velocity or sinusoidal rotational stimulus through a computer-controlled motorized chair, with concurrent recording of evoked eye movements) is complementary to the caloric test for evaluation of peripheral vestibular function. Rotary chair testing provides assessment of the vestibular ocular reflex over a wider range of head movement frequencies than caloric testing, more closely approximating those found under physiologic conditions.1,2 The expenses incurred in implementing a rotary chair testing program limit widespread clinical application for diagnostic purposes. Additionally, rotary chair testing is typically performed under laboratory conditions (eg, complete darkness, passive rotation) and within a restricted physiologic range (0.01–2.0 Hz) that challenges the functional utility of this approach to defining vestibulopathy. Although accepted parameters exist to identify bilateral vestibular hypofunction with caloric examination, rotary chair testing remains the more common accepted laboratory assessment of bilateral vestibular hypofunction since the caloric test cannot readily distinguish between low normal responses and bilateral vestibular weakness.10

Several methods of testing the peripheral vestibular system have been developed that enable testing of the vestibular ocular reflex over a frequency range more similar to the frequencies of head movements produced during functional tasks.11–23 One such test is the Vestibular Autorotation Test (VAT) (Western Systems Research Inc., Pasadena, CA).16,22 The VAT was developed in the early 1980s in an effort to provide a cost-effective solution to assess vestibular function over the broad range of head movement frequencies that occur during functional activities. During the VAT, the subject performs active horizontal or vertical head movements over an increasing frequency range from 0.5 to 5.9 Hz for 18 seconds while visually fixating a one square centimeter target placed six feet away. Although sweep frequency testing using active head movements is not a novel concept, being first performed by Atkin and Bender11 in the early 1960s, Western Systems Research Inc. was the first to introduce a portable, computer-based, and commercially available system.16,22 The manufacturer recommends that trials be repeated until “a sufficient number of tests results in 3 tests which demonstrate repeatability through relatively small standard deviation SD.”

To date, no peer-reviewed published study has ascertained the reliability of head-only rotation testing or the VAT in a patient population when referenced against an established vestibular function test such as the electronystagmographic examination.16,22,24–31 However, varying levels of reliability have been reported in samples from normal populations.26–28,31 The purposes of this study were to establish the intrarater and interrater reliability of the VAT in a clinical sample of individuals reporting dizziness and to determine what factors may affect reliability. Specifically, we were interested in determining (1) what proportion of subjects can perform six consecutive VAT trials across the full range of head movement frequencies and (2) what specific characteristic differences exist between those subjects who are able to produce VAT data for six trials and those who are not.



Ninety-eight individuals reporting dizziness were drawn as a sample of convenience from a pool of 280 patients who were referred for vestibular function testing within a tertiary care specialty clinic from January 1, 1997, to October 31, 1999. Subjects were excluded from the study based on the following criteria: (1) painful or limited cervical spine range of motion; (2) use of vestibular suppressant medications including meclizine hydrochloride or a benzodiazepine; (3) corrected visual acuity poorer than 20/60, or normal use of glasses to correct myopia without having the glasses available at the time of assessment; (4) oculomotor anomalies such as skew deviation; (5) inability to understand how the head is to be rotated.

The frequency distribution for subject exclusion was not specifically monitored. All subjects provided informed consent before participation as required and approved by the Human Subjects Committee of the Institutional Review Board of the University of Miami.


Three graduate students and a technician were trained to perform vestibular function testing in the laboratory, including the primary author (P.J.B.). The primary author received extensive advanced training in clinical vestibular function testing including specific training by the manufacturer of the VAT. All other raters performing the VAT received specific instruction regarding patient set-up and instruction that was standardized to the primary author’s testing technique and therefore the manufacturer’s standardized methods.

Caloric and Rotary Chair Testing

A standardized caloric technique using a modified Fitzgerald-Hallpike method with random side of initial irrigation was used to avoid a side of lesion bias in the Jongkees formula as suggested by Furman and Jacob.32 A soft-tipped, open-loop caloric irrigator with a flow rate of approximately five milliliters per second was used to provide bithermal, binaural water stimuli. Water temperatures were 30°C (defined as the cool condition) and 40°C (defined as the warm condition). Although 44°C is the standard temperature for the warm condition, our laboratory modified this condition based on patient reports of discomfort. The duration of each irrigation was 45 seconds. In subjects with no response to both the cool and warm water stimulus in one or both ears, ice water irrigation (3°C) and rotary chair testing (step velocity at 240 degrees per second) were performed. If nystagmus was elicited via ice water, subjects were positioned prone to determine if the response reversed direction. In this situation, the response was considered significant for partial preservation of peripheral vestibular function. Patients with low-velocity but symmetrical caloric responses also had a rotary chair examination.

Patients were divided into three groups: unilateral vestibular hypofunction (UVH), defined as a caloric asymmetry of ≥25%; bilateral vestibular hypofunction (BVH), defined by a caloric response less than five degrees per second total slow phase eye velocity including ice water stimulation and bilateral weakness on rotary chair testing; and subjects reporting dizziness who had caloric responses within normal limits (<24% asymmetry).33
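The grouping rule above can be sketched in code. The following is a minimal illustration, assuming the commonly cited form of the Jongkees unilateral weakness formula applied to the magnitudes of peak slow phase eye velocity (degrees per second) for the four irrigations. The function names, argument names, and the boolean rotary chair flag are hypothetical, and the sketch simplifies the BVH definition by treating the summed caloric response as the "total" response; it is not the laboratory's actual procedure.

```python
def caloric_asymmetry(right_warm, right_cool, left_warm, left_cool):
    """Jongkees unilateral weakness, in percent.

    Inputs are magnitudes of peak slow phase eye velocity (deg/s).
    A positive result means the right ear responded more strongly.
    """
    total = right_warm + right_cool + left_warm + left_cool
    if total == 0:
        raise ValueError("asymmetry is undefined when no response was recorded")
    right = right_warm + right_cool
    left = left_warm + left_cool
    return 100.0 * (right - left) / total


def classify(right_warm, right_cool, left_warm, left_cool,
             bilateral_weakness_on_rotary_chair=False):
    """Assign one of the three study groups (simplified, hypothetical rule)."""
    total = right_warm + right_cool + left_warm + left_cool
    # BVH: total response under 5 deg/s plus bilateral weakness on rotary chair.
    if total < 5.0 and bilateral_weakness_on_rotary_chair:
        return "BVH"
    # UVH: caloric asymmetry of 25% or more in either direction.
    asymmetry = caloric_asymmetry(right_warm, right_cool, left_warm, left_cool)
    if abs(asymmetry) >= 25.0:
        return "UVH"
    return "within normal limits"
```

For example, responses of 20, 20, 10, and 10 degrees per second give an asymmetry of about 33% and would fall in the UVH group under this rule.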


Subjects were screened for normal cervical spine range of motion and the ability to follow directions. Eye movement data were recorded using electro-oculography, as specified by the VAT instructions. The subject’s skin at the electrode sites was cleaned with alcohol. This was repeated until an alcohol wipe remained clean after contact with the patient’s skin. Subjects were fitted with five Ag-AgCl electrodes (Duo-trode Disposable Electrodes, Myo-tronics Inc., Tukwila, WA) placed at the outer canthus of each eye, above and below the left eye in a line bisecting the middle of the pupil on primary gaze, and one placed slightly above the bridge of the nose. Subjects were also fitted with a light-weight headband. The headband contained an integrated preamplifier to which the electrodes were connected via a low-mass wiring assembly and a single-plane rate sensor. The rate sensor measured the velocity of head movements during testing and was factory calibrated at 100 degrees per second per volt.

The examiner verified the electrode placement and ensured that there was a clean signal from the test subject before data collection. The VAT does not provide a quantitative measure of impedance. The VAT does provide a user-controlled method of eliminating eye blink artifacts and the ability to change the default coherence setting. Coherence refers to how well the output generated from the VAT software (eg, plotted VOR gain) matches the input (eye and head velocity) and can be chosen by the user. Coherence values range from zero to 1.0; a coherence value of 1.0 forces the software to reproduce data only when the output is a precise match of the input signal. Coherence therefore provides the user with a quasi-measure of confidence in the plotted data. The VAT software uses a conservative, although arbitrary, default coherence level of 0.7 and will not plot data when coherence is lower than this value. Because the selection of these methods of data extraction is subjective, each of the raters was instructed to use the standard factory settings for all trials.

After electrode preparation, the subject had the specific testing procedure explained and demonstrated using the following standardized language: “You will move your head side to side in time with a tone the computer will play while you keep your eyes focused on the black dot behind me. I will now show you exactly how I want you to move your head.” The head movement range and frequency/amplitude were then demonstrated to the patient as if the tester was performing the test with the accompanying computer-generated tone. Subjects were instructed to maintain a forward-flexed position of the head. Target height was adjusted so that the relative position of the eyes in the orbit was always in the horizontal plane during head movements. Pilot studies conducted in our laboratory demonstrated that it was critical that the tester be vigilant in providing positive feedback to the subject regarding test performance, coaching subjects to remain relaxed and to produce head movements of small amplitude. Additionally, the raters were required to watch for excessive blinking and facial grimacing or smiling by the test subject to ensure that electrical activity from facial muscles did not distort the electro-oculographic signal.

Each VAT trial lasted 18 seconds. During the first six seconds, the subjects rotated their head left to right from 0.6 to 0.8 Hz (ie, calibration period). Over the remaining 12 seconds of a test, the subjects moved their head from left to right in a sweep of frequencies from 2 to 6 Hz. A computer-generated tone was provided to cue the subjects to generate the required frequency of head movements.

Analog eye position and head velocity signals were relayed to an IBM-compatible portable computer through which analog-to-digital signal conversion and storage took place. Analog signals were low-pass filtered at 80 Hz. Data were then sampled and digitized at 500 samples per second. Eye position was converted to eye velocity using a two-point central difference algorithm as described by Bahill et al.34 Calibration of eye velocity was performed for the first six seconds of each trial. An automated algorithm uses the least-squares methods to correct for vergence and binocular eye position. Further filtering and data analysis were performed by factory-specified algorithms that calculated and produced graphic display of eye position, eye velocity, and head velocity. These eye measurement capabilities are intrinsic to the VAT and are standard components of the commercially available unit.
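The differentiation step can be illustrated with a short sketch. This assumes the usual form of a two-point central difference, in which the velocity at sample i is the difference between the neighboring position samples divided by twice the sampling interval. The function name and the handling of the two endpoint samples are assumptions made for the example; the VAT's own filtering and calibration steps are not reproduced here.

```python
def central_difference_velocity(position, sample_rate_hz=500.0):
    """Estimate velocity from evenly sampled position data.

    Uses v[i] = (x[i+1] - x[i-1]) / (2 * dt) for interior samples.
    Endpoints copy the nearest interior estimate (an assumption of this
    sketch, chosen so the output has the same length as the input).
    """
    n = len(position)
    if n < 3:
        raise ValueError("need at least three samples")
    dt = 1.0 / sample_rate_hz
    velocity = [0.0] * n
    for i in range(1, n - 1):
        velocity[i] = (position[i + 1] - position[i - 1]) / (2.0 * dt)
    velocity[0] = velocity[1]
    velocity[-1] = velocity[-2]
    return velocity
```

Applied to an eye position trace that changes at a constant 10 degrees per second, every output sample is 10 degrees per second, up to floating-point rounding.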

The amplitude and timing of the power spectra derived from Fourier analysis for head and eye movements were compared by the VAT algorithms at 11 discrete frequencies from 2.0 to 5.9 Hz.16,22 The gain of the VOR was computed as the ratio of spectral power between eye and head velocity data across each of the frequencies. Phase was defined as the relative timing of peak spectral power for each discrete frequency. Asymmetry was the difference between the gains for head movement in one direction vs head movement in the opposite direction. In addition to the 11 discrete frequencies plotted for gain and phase, asymmetry is plotted from 5.9 to 9.8 Hz, extending beyond the 5.9 peak frequency limit for gain and phase, by using Fourier transform of the head and eye frequency components (harmonics). A detectable level of spectral power must be generated by the subject at any given frequency for data to be generated for gain and phase values. Therefore, if the subject was unable to move the head at higher frequencies or if greater noise was introduced with the higher frequencies, missing data points for any given trial were produced based on the level of coherence specified by the user, which in all trials was fixed at the factory default value of 0.7. Gain, phase, and asymmetry data are plotted from power spectral analysis of head and eye data, which may be affected by variables such as the subject’s inability to generate sufficient frequencies of head movement, the intrusion of saccades, and noise generated from nearby muscle contraction. These specific computational functions are intrinsic components of the commercially available VAT system and were not modified by the investigators. Although coherence could be adjusted to increase the amount of data displayed on a trial-to-trial basis and is suggested by the manufacturer, this was not performed.
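To make the gain and phase definitions concrete, the sketch below evaluates a single Fourier coefficient of the head and eye velocity signals at one test frequency. It is an illustration only: the VAT's factory algorithms are proprietary, the function names are hypothetical, and this sketch uses the amplitude ratio of the two coefficients as gain, which is a simplification of the spectral power ratio described above. The sign flip is a convention chosen here so that a perfectly compensatory eye movement (equal and opposite to the head) yields a gain of 1 and a phase of 0 degrees.

```python
import cmath
import math


def fourier_coefficient(signal, freq_hz, sample_rate_hz):
    """Complex Fourier coefficient of a sampled signal at one frequency."""
    return sum(
        x * cmath.exp(-2j * math.pi * freq_hz * n / sample_rate_hz)
        for n, x in enumerate(signal)
    )


def vor_gain_phase(head_velocity, eye_velocity, freq_hz, sample_rate_hz=500.0):
    """Return (gain, phase in degrees) of eye relative to head velocity."""
    head = fourier_coefficient(head_velocity, freq_hz, sample_rate_hz)
    eye = fourier_coefficient(eye_velocity, freq_hz, sample_rate_hz)
    if abs(head) == 0:
        raise ValueError("no head movement power at this frequency")
    # Negate so that ideal compensation (eye = -head) has phase 0.
    ratio = -eye / head
    return abs(ratio), math.degrees(cmath.phase(ratio))
```

A head velocity sinusoid of 50 degrees per second at 2 Hz paired with an opposite eye velocity sinusoid of 45 degrees per second gives a gain of 0.9 and a phase near 0 degrees when the record spans a whole number of cycles.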

For intrarater reliability estimation, subjects performed six consecutive VAT trials with one trial defined as a single complete test cycle of 18 seconds. We chose to have each subject perform six trials to define when the test becomes stable and in an effort to standardize the methods of the VAT. We recognize that this number was arbitrary, but thought it necessary to avoid confusion among raters, given the manufacturer’s guidelines of “repeat a sufficient number of tests to result in 3 tests which demonstrate repeatability through relatively small SD.” To determine interrater reliability, the subjects performed an additional six trials for a second rater. In those subjects who participated in interrater assessment, the second rater reapplied the electrodes and followed the testing procedures as outlined above while the subject rested. The testing order between raters was randomized across test subjects. The VAT was performed after the clinical assessment but before any additional vestibular function testing. Retesting was performed on the same day, typically within one hour of the first rater.

Reliability Statistics

Intrarater reliability estimates (intraclass correlation co-efficients [ICCs]) were calculated to determine the effect of practice on reliability estimates selecting trials (κ) based on the following criteria: (1) subjects who could produce six trials at any frequency for whom data were displayed (n = 69, κ = 6, ICC 2,6); (2) the first three trials in the sequence for these subjects (n = 69, κ = 3, ICC 2,3); (3) the last three trials in the sequence for the same subjects (n = 69, κ = 3, ICC 2,3).

Intrarater reliability was also calculated for a subpopulation of subjects who could produce trials with head movements at ≥3.9 Hz using the trial selection criteria of the last three trials with head movements ≥3.9 Hz (n = 45, κ = 3, ICC 2,3). This was performed to define the maximum possible reliability over the widest range of frequencies for subjects.

ICC values were calculated for each of 11 discrete frequencies for gain and phase and for the additional 10 frequencies for asymmetry. ICCs were not calculated when data were not generated. ICCs and the upper and lower 95% confidence intervals around these reliability estimates were calculated using standard methodology from repeated-measures analysis of variance (ANOVA) as described by Shrout and Fleiss.35 ANOVA for between-group procedures, Student’s t tests, and χ2 analysis were used to determine whether there were significant differences in vestibular diagnosis, age, and time from onset between subjects for whom the VAT could and could not display data under the established performance criteria. The experiment-wise type I error rate was set at α <0.05.
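As an illustration of the ICC computation, the sketch below derives the Shrout and Fleiss model 2 coefficients from the mean squares of a two-way ANOVA. Rows are subjects and columns are the repeated measurements (trials, in the intrarater case). The function name is hypothetical, confidence intervals are omitted, and the sketch assumes a complete table with no missing cells.

```python
def icc_model_2(ratings):
    """Return (ICC(2,1), ICC(2,k)) for a complete subjects-by-trials table."""
    n = len(ratings)          # number of subjects
    k = len(ratings[0])       # number of trials or raters per subject
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)                 # between-subjects mean square
    jms = ss_cols / (k - 1)                 # between-trials mean square
    ems = ss_error / ((n - 1) * (k - 1))    # residual mean square

    single = (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)
    average = (bms - ems) / (bms + (jms - ems) / n)
    return single, average
```

With k = 3 the second value corresponds to the ICC (2,3) estimates reported for the three-trial comparisons, and with k = 6 to the ICC (2,6) estimate for the full six trials.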

Interrater reliability was estimated through a subpopulation assessed by both raters (n = 49). ICC estimates were calculated using the last three trials. This was performed to define the ICC estimates over the widest frequency range to eliminate the effects of practice. Similar to the data analysis for the intrarater estimates, interrater reliability estimates used standard methodology from repeated-measures ANOVA, including construction of 95% confidence intervals.35


Intrarater Reliability


The distribution of the total subject sample (N = 98) based on caloric test results, age, and time from onset is provided in Table 1. Of the total sample of 98 subjects, 69 subjects provided data for all six VAT trials under the established criteria. ANOVA revealed that this subset of 69 subjects did not differ across diagnostic groups for age in years, F4,65 = 0.73, P = 0.58 or time from onset in months, F4,65 = 1.59, P = 0.19. Mean VOR gain and phase values for the six trials across the discrete frequencies are displayed in Figure 1.

Figure 1. Mean vestibular ocular reflex (VOR) gain and phase for each trial plotted across frequency (n = 69). SDs are not plotted to preserve readability. The range of variability in VOR gain across the six trials for each frequency was 0.1 to 0.17. The range ...
Table 1. Subject Demographics

Missing Trials by Frequency of Head Shaking

Figure 2 summarizes the percentage of missing data in relation to head frequency and the total number of trials (69 subjects × six trials each = total of 414 trials). This is presented for the subjects whose data were sufficient for the VAT to generate plots across the six trials (n = 69). For these subjects, the total number of possible trials was κ = 414. We found that the number of trials in which the VAT did not generate data (due to coherence value) increased as a function of head frequency (Fig. 2A). This was most striking at 5.9 Hz, where >60% of trials were missing. Even for those subjects who could consistently generate data for any head frequency ≥3.9 Hz (n = 45, κ = 135), the number of missing trials increased as a function of head frequency (Fig. 2B).
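The tally behind this kind of summary can be sketched as follows. The example assumes a hypothetical table of coherence values with one row per trial and one column per test frequency, and counts a cell as missing when its coherence is below the 0.7 default or when no value was recorded. The function name and data layout are illustrative and do not reflect the VAT's file format.

```python
def percent_missing_by_frequency(coherence_by_trial, threshold=0.7):
    """Percentage of trials with no plotted data at each frequency.

    coherence_by_trial[t][j] is the coherence for trial t at frequency
    index j, or None when no value was recorded.
    """
    n_trials = len(coherence_by_trial)
    n_freqs = len(coherence_by_trial[0])
    percentages = []
    for j in range(n_freqs):
        missing = sum(
            1
            for trial in coherence_by_trial
            if trial[j] is None or trial[j] < threshold
        )
        percentages.append(100.0 * missing / n_trials)
    return percentages
```

Raising or lowering `threshold` in this sketch shows how the amount of displayed data depends on the coherence setting, which is why the setting was held at the factory default for all trials.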

Figure 2. Percentage of missing trials by frequency of head rotation during the Vestibular Autorotation Test (VAT). The number of trials in which the VAT did not produce data increased as a function of head frequency. This was true for all subjects whose data were ...

Reliability Estimates

ICC values for gain across the 11 frequencies assessed by the VAT are presented in Table 2. The highest levels of intrarater reliability were seen when calculated over the full six trials. However, the levels of reliability were also at acceptable levels for the frequencies ≤5.1 Hz for the first three trials, with slightly better values for the last three trials.

Table 2. Intrarater Reliability for Gain (n = 69)

ICC values for the variable phase are displayed in Table 3. There was a pronounced difference in reliability between the first three and last three trials. This difference was evident during lower frequency head movements (≤3.9 Hz). For higher frequencies (≥4.3 Hz), the difference was not apparent. ICC estimates for the variable asymmetry are displayed in Table 4. We found a substantial difference in reliability estimates when the first three trials were compared to the last three trials. Substantial improvements in reliability when comparing the first and last three trials for phase and asymmetry (Tables 3 and 4) suggest that a practice effect existed, which may have promoted stability in the data variables for later trials.

Table 3. Intrarater Reliability for Phase (n = 69)
Table 4. Intrarater Reliability for Asymmetry (n = 69)

Intrarater Reliability for Head Movements ≥3.9 Hz

Intrarater reliability (ICC 2,3) estimates were calculated for those subjects (n = 45) able to generate data at frequencies ≥3.9 Hz at factory default coherence values. Reliability estimates were calculated for the last three trials to eliminate the effects of practice. These methods should have provided the highest level of reliability achievable by the subjects within this study. We did not find any difference in age between groups who could (58.0 ± 17.6) and could not (65.3 ± 14.2) produce data for frequencies ≥3.9 Hz (t test [two tailed, Bonferroni corrected, P > 0.025], P = 0.09). Similarly, the mean difference in time from onset between subjects who could (37.7 ± 57.2) and could not (60.2 ± 62.3) produce head rotations ≥3.9 Hz was not statistically significant (t test [two tailed, Bonferroni corrected P > 0.025], P = 0.17). There was no significant difference in the distribution of subjects with a vestibular diagnosis and those who were in the dizziness group (χ2 = 0.37, P = 0.54). Finally, there was not a significant difference across individuals for age older than 65 (χ2 = 3.27, P = 0.07) or between the time from onset more or less than 12 weeks (χ2 = 0.16, P = 0.69). In summary, we were unable to determine differences in a person’s ability to generate sufficient data to produce plots for the higher frequency head rotations based on age, diagnosis, or time from onset.

Reliability Estimates

Reliability estimates for the last three trials for the variables gain, phase, and asymmetry for subjects who could produce head movements at frequencies ≥3.9 Hz are provided in Figure 3. Reliability ranged from moderate to excellent for both gain (0.73–0.92) and phase (0.90–0.96) across all frequencies, but was poor to moderate for asymmetry (0.67–0.81).

Figure 3. Intrarater reliability for subjects able to produce data at ≥3.9 Hz head rotation frequency (n = 45). Data plotted are the intraclass correlation coefficient (ICC) (2,3) reliability estimates, with error bars representing the upper and lower 95% ...

Interrater Reliability


Forty-nine subjects participated in interrater reliability estimates, although nine (18%) were unable to complete the testing or were unable to produce data sufficient to be included in the calculations. Subjects included UVH (n = 12; mean age, 64.7 ± 18.6 years), BVH (n = 5; mean age, 46.8 ± 17.0 years), dizziness (n = 18; mean age, 59.0 ± 19.4 years), and unknown (n = 5; mean age, 52.4 ± 15.7 years). Due to the unbalanced sample within a greater sample, we did not perform statistical comparisons of time from onset and age.

Reliability Estimates

Interrater reliability estimates for the last three trials for gain, phase, and asymmetry are summarized in Figure 4. Reliability for gain was good to excellent across the full range of frequencies with ICC estimates ranging from 0.95 ± 0.02 at 2.0 Hz to a low of 0.82 ± 0.07 at 5.9 Hz. Phase also demonstrated excellent reliability from 2.0 to 3.9 Hz; however, there was increased variability for frequencies >3.9 Hz, causing a significant decrease in reliability estimates.

Figure 4. Interrater reliability estimates for the last three trials for gain, phase, and asymmetry (n = 49). Data plotted are the intraclass correlation coefficient (ICC) (2,3) reliability estimates, with error bars representing the upper and lower 95% confidence ...


The goal of this study was to determine the intrarater and interrater reliability of the VAT in a clinical sample. The first specific aim was to determine what proportion of subjects could perform six consecutive VAT trials across the full range of head movement frequencies within the manufacturer’s preset algorithms. Thirty percent of subjects tested could not produce sufficient data across the six trials. An additional 35% of subjects were unable to produce valid data when we considered those subjects who could not produce head movements at frequencies ≥3.9 Hz. Therefore, nearly 66% of the total sample was unable to produce data that met the VAT algorithm criteria to be included in assessment of reliability. These findings are comparable to those of Guyott and Psillas,27 who found that 84% of normal individuals were not able to consistently generate head movements at frequencies >3.5 Hz using a coherence value of 1.0. Similarly, Perez et al30 reported that approximately 57% of subjects who were being monitored with the VAT while undergoing intratympanic gentamicin treatment for Ménière’s disease were unable to perform the VAT at one week post-treatment. Additionally, these subjects demonstrated a frequency-dependent trend of trial attenuation similar to that of our subjects, with post-treatment missing trials of nearly 50% at frequencies ≥3.9 Hz.30

The second aim was to determine whether there were specific characteristic differences between those subjects who were able to produce data for six trials and those who were not. Manipulating the factory-preset coherence values (ie, lowering coherence) to recover data would have improved the display of data trial to trial. We recognize this may be done in common practice and is advocated by the manufacturer of the VAT. However, this would introduce several sources of variance that would need to be considered. First, changing coherence levels trial to trial alters a fundamental assumption of reliability testing in that it is no longer possible to consider that the underlying parameters being tested have not changed. Second, the level of coherence used by each rater from trial to trial would be unique and limit generalizability across different laboratories. Third, decreasing the coherence threshold allows more raw data to be processed, but the quality of the additional data is lower in terms of being an accurate replica of the stimulus. Further studies are warranted to determine the optimal coherence levels allowing sufficient data across all frequencies tested and to determine reliability of the VAT and data production when different operators manipulate coherence levels from trial to trial in order to achieve full-spectrum data.

Results revealed that a subject’s ability to generate data at frequencies ≥3.9 Hz was independent of the subject’s age, the gross assessment of acuteness of injury (time from onset), and the presence of a vestibular deficit (UVH or BVH). Other factors specific to this sample, which were not quantified as part of this study, may have contributed to the difference in ability to generate data sufficient for plots at higher frequency head movements. Specifically, head movement may have induced symptoms that made subjects reluctant to increase their frequency, or subjects may not have been physically able to produce head movements of sufficient frequency secondary to motor planning problems or musculoskeletal limitations of the cervical spine despite having grossly normal range of motion. As noted by Guyott and Psillas,27 it is possible that the inability to generate consistent high-frequency head movements was related to subject stress, fatigue, or lack of cooperation; however, these factors, including symptom level, motor planning, and limitations in spinal mobility or coordination, were not quantified.

The VAT has been used in several studies as an outcome measure. For example, Corvera et al29 investigated the effects of flunarizine on acute vestibular neuritis by monitoring the VOR with the VAT in patients participating in the drug trial. Corvera et al suggested that the broad variability in observed VAT results was indicative of variability in patient response to the intervention. It is interesting to note that a number of subjects participating in that study were unable to perform the VAT on initial assessment (termed by Corvera et al as “restriction”) and that an unspecified number of trials were eliminated from the analysis due to “insufficient data.” Perez et al30 suggested that the inability of subjects to produce high-frequency head movements was related to the intratympanic gentamicin treatment for persons with Ménière’s disease. Our results suggest caution when drawing conclusions about the relationship between VAT performance and VOR function, as it is not yet clear which variables influence subject behavior on the VAT.

Our results suggest that in subjects able to meet the performance criterion, the VAT produced reliable estimates for VOR gain over the frequency range of 2.0 to 4.3 Hz and that gain values across these frequencies were independent of the number of practice trials that a subject attempted (Table 2). Corvera et al28 reported similar findings for gain when comparing test-retest reliability in normal individuals. We found that the reliability of phase and asymmetry was significantly improved by practice over the frequency range of 2.0 to 4.3 Hz (Tables 3 and 4), a finding that differs from that of Corvera et al,28 who found VAT phase and asymmetry to be unreliable. Those authors did not comment on the cause of this poor reliability. The impact that factors such as saccadic eye movements, difficulties of motor planning, disease severity, and subjective symptoms have on VAT performance, as well as intralaboratory and interlaboratory variability in the selection of performance criteria, requires further exploration.
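The reliability estimates discussed above are intraclass correlation coefficients in the sense of Shrout and Fleiss.35 As a minimal sketch of how such an estimate can be computed from repeated trials, the Python function below implements the two-way random-effects, single-measure form, ICC(2,1). The choice of that form is an assumption made for illustration, since this section does not state which form was used, and the function name `icc_2_1` and the gain values in the usage lines are hypothetical rather than study data.

```python
def icc_2_1(scores):
    """ICC(2,1) of Shrout and Fleiss for an n x k table of scores.

    scores: list of n rows (one per subject), each holding k
    measurements (one per trial or rater). The ICC form is an
    assumption for illustration, not the study's stated method.
    """
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)              # between-subjects mean square
    jms = ss_cols / (k - 1)              # between-trials (raters) mean square
    ems = ss_err / ((n - 1) * (k - 1))   # residual mean square

    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Hypothetical VOR gain values at one frequency: 5 subjects x 3 trials.
example_gain = [
    [0.92, 0.95, 0.94],
    [0.71, 0.75, 0.70],
    [1.02, 0.99, 1.05],
    [0.85, 0.88, 0.83],
    [0.64, 0.69, 0.66],
]
print(round(icc_2_1(example_gain), 2))
```

Because ICC(2,1) counts systematic differences between trials as disagreement, a practice effect that shifts every subject's scores between early and late trials lowers the estimate.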


We recommend that clinicians interested in using the VAT follow some specific guidelines. Very careful attention to patient preparation, instruction, and test monitoring is critical. This includes excellent skin cleansing, careful electrode placement, securing the electrodes and wires with tape, and aligning and tightening the headband. While the manufacturer provides a graphic display of the analog signal, an impedance measure with appropriate guidelines for signal-to-noise ratios would be an improvement over the current tool. Additionally, subjects need to fully understand the purpose of testing and demonstrate the ability to follow commands. Therefore, those who administer the test need to be able to clearly articulate testing instructions to the subjects and then assess the quality of the subjects’ performance. This includes providing coaching to the patients during the trials so that head movements are in time with the computer-generated tone and of sufficiently small amplitude. Finally, when it is clinically possible, subjects must be allowed to practice the test a minimum of three times before data are gathered.

As the VAT has been used to monitor change in vestibular function, stability of the VAT over a significant period of time (eight or 12 weeks) in patients with stable lesions should be demonstrated (ie, those known to have complete compensation and to be asymptomatic). To our knowledge, this stability has not yet been demonstrated. If the VAT were found to be stable over time, it could be used to monitor functional recovery related to rehabilitation programs and various treatment regimens or to differentiate subjects based on subjective symptoms and functional disability. This would be a tremendous asset to clinicians who specialize in the management of patients with vestibulopathy, as it would provide another tool to document functional recovery in these patients.


Support for this project has come from the University of Miami Department of Physical Therapy (P.J.B.) and the Foundation for Physical Therapy Promotion of Doctoral Scholars (P.J.B.). We are grateful to Dr. Mark Shelhamer for his invaluable expertise concerning various technical issues pertinent to this article and Dr. Susan J Herdman for invaluable support and guidance.


1. Tabak S, Collewijn H. Human vestibulo-ocular responses to rapid, helmet-driven head movements. Exp Brain Res. 1994;102:367–378. [PubMed]
2. Grossman GE, Leigh RJ. Instability of gaze during locomotion in patients with deficient vestibular function. Ann Neurol. 1990;27:528–532. [PubMed]
3. Halmagyi GM, Fattore CM, Curthoys IS, Wade S. Gentamicin vestibulotoxicity. Otolaryngol Head Neck Surg. 1994;111:571–574. [PubMed]
4. Assessment: Electronystagmography. Report of the Therapeutics and Technology Assessment Subcommittee. Neurology. 1996;46:1763–1766. [PubMed]
5. Bhansali SA, Honrubia V. Current status of electronystagmography testing. Otolaryngol Head Neck Surg. 1999;120:419–426. [PubMed]
6. Scherer H, Brandt U, Clarke AH, et al. European vestibular experiments on the Spacelab-1 mission: 3. Caloric nystagmus in microgravity. Exp Brain Res. 1986;64:255–263. [PubMed]
7. Hood JD. Evidence of direct thermal action upon the vestibular receptors in the caloric test. A re-interpretation of the data of Coats and Smith. Acta Otolaryngol. 1989;107:161–165. [PubMed]
8. Keim RJ. The pitfalls of limiting ENG testing to patients with vertigo. Laryngoscope. 1985;95:1208–1212. [PubMed]
9. Zee DS. Adaptation to vestibular disturbances: some clinical implications. Acta Neurol Belg. 1991;91:97–104. [PubMed]
10. Fife TD, Tusa RJ, Furman JM, et al. Assessment: vestibular testing techniques in adults and children: report of the Therapeutics and Technology Assessment Subcommittee of the American Academy of Neurology. Neurology. 2000;55:1431–1441. [PubMed]
11. Atkin A, Bender M. Ocular stabilization during oscillatory head movements. Arch Neurol. 1968;19:599–605. [PubMed]
12. Tomlinson RD, Saunders GE, Schwarz DW. Analysis of human vestibulo-ocular reflex during active head movements. Acta Otolaryngol. 1980;90:184–190. [PubMed]
13. Zangemeister WH, Phlebs U, Huefner G, Kunze K. Active head turning and correlated cerebral potentials. Experimental and clinical aspects. Acta Otolaryngol. 1986;101:403–415. [PubMed]
14. Fineberg R, O’Leary DP, Davis LL. Use of active head movements for computerized vestibular testing. Arch Otolaryngol Head Neck Surg. 1987;113:1063–1065. [PubMed]
15. Jell RM, Stockwell CW, Turnipseed GT, Guedry FE Jr. The influence of active versus passive head oscillation and mental set on the human vestibulo-ocular reflex. Aviat Space Environ Med. 1988;59:1061–1065. [PubMed]
16. O’Leary DP, Davis LL, Kitsigianis GA. Analysis of vestibulo-ocular reflex using sweep frequency active head movements. Adv Otorhinolaryngol. 1988;41:179–183. [PubMed]
17. Goebel JA, Fortin M, Paige GD. Headshake versus whole-body rotation testing of the vestibulo-ocular reflex. Laryngoscope. 1991;101:695–698. [PubMed]
18. Demer JL, Oas JG, Baloh RW. Visual-vestibular interaction during high-frequency, active head movements in pitch and yaw. Ann N Y Acad Sci. 1992;656:832–835. [PubMed]
19. Henry DF, DiBartolomeo JD. Closed-loop caloric, harmonic acceleration and active head rotation tests: norms and reliability. Otolaryngol Head Neck Surg. 1993;109:975–987. [PubMed]
20. Furman JM, Durrant JD. Head-only rotational testing: influence of volition and vision. J Vestib Res. 1995;5:323–329. [PubMed]
21. Della Santina CC, Cramer PD, Carey JP, Minor LB. Comparison of head thrust test with head autorotation test reveals that the vestibulo-ocular reflex is enhanced during voluntary head movements. Arch Otolaryngol Head Neck Surg. 2002;128:1044–1054. [PubMed]
22. O’Leary DP. Diagnostic screening with the vestibular autorotation test (VAT) AudiologyOnline. 2002. [Accessed July 15, 2005]. Available at:
23. Tirelli G, Bigarini S, Russolo M, et al. Test-retest reliability of the VOR as measured via VORTEQ in healthy subjects. Acta Otorhinolaryngol Ital. 2004;24:58–62. [PubMed]
24. O’Leary DP, Davis LL, Kevorkian KF. Dynamic analysis of age-related responses of the vestibulo-ocular reflex. Adv Otorhinolaryngol. 1990;45:194–202. [PubMed]
25. Wilson RH, O’Leary DP. Rehabilitation R & D Progress Reports: Validation and Reliability of a Physiological Test of Vestibular Function. Long Beach, CA: VA Medical Center; 1990. p. 417.
26. Meulenbroeks AA, Kingma H, Van Twisk JJ, Vermeulen MP. Quantitative evaluation of the Vestibular Autorotation Test (VAT) in normal subjects. Acta Otolaryngol Suppl. 1995;520:327–333. [PubMed]
27. Guyot JP, Psillas G. Test-retest reliability of vestibular autorotation testing in healthy subjects. Otolaryngol Head Neck Surg. 1997;117:704–707. [PubMed]
28. Corvera J, Corvera-Behar G, Lapilover V, Ysunza A. Evaluation of the vestibular autorotation test (VAT) for measuring vestibular oculomotor reflex in clinical research. Arch Med Res. 2000;31:384–387. [PubMed]
29. Corvera J, Corvera-Behar G, Lapilover V, Ysunza A. Objective evaluation of the effect of flunarizine on vestibular neuritis. Otol Neurotol. 2002;23:933–937. [PubMed]
30. Perez N, Martin E, Garcia-Tapia R. Results of Vestibular Autorotation Testing at the end of intratympanic gentamicin treatment for Ménière’s disease. Acta Otolaryngol. 2003;123:506–514. [PubMed]
31. Tirelli G, Bigarini S, Russolo M, Giacomarra V, Sasso F. Test-retest reliability of the VOR as measured via Vorteq in healthy subjects. Acta Otorhinolaryngol Ital. 2004;24:58–62. [PubMed]
32. Furman JM, Jacob RG. Jongkees’ formula re-evaluated: order effects in response to alternate binaural bithermal caloric stimulation using closed-loop irrigation. Acta Otolaryngol. 1993;113:3–10. [PubMed]
33. Schubert MC, Herdman SJ, Tusa RJ. Vertical dynamic visual acuity in normal subjects and subjects with vestibular hypofunction. Otol Neurotol. 2002;23:372–377. [PubMed]
34. Bahill AT, Kallman JS, Lieberman JE. Frequency limitations of the two-point central difference differentiation algorithm. Biol Cybern. 1982;45:1–4. [PubMed]
35. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull. 1979;86:420–428. [PubMed]