To determine the within- and between-trial repeatability of pattern electroretinogram (PERG) measurements in healthy and patient eyes, using a new clinical instrument, the PERGLA.
70 eyes of 35 healthy individuals (IOP < 22 mm Hg, healthy optic disc by stereophotograph assessment, standard visual fields within normal limits) and 90 eyes of 45 clinic patients (ocular hypertensive, glaucomatous optic neuropathy by stereophotograph assessment and/or repeatable abnormal visual fields) enrolled in the UCSD Diagnostic Innovations in Glaucoma Study (DIGS) were studied. Average mean deviation (MD) of patient eyes on standard automated perimetry was −1.81 dB (S.D. = 2.61).
The PERG was recorded using the PERGLA paradigm from both eyes simultaneously twice (i.e., 2 trials) by a single operator with electrodes being removed and re-attached between recordings. Repeatability of PERG amplitude (µV) and phase (π rad) between two runs within a single trial (within-trial condition) was compared to repeatability between two trials (i.e., after electrode replacement, between-trial condition) by calculating the coefficients of variability (CVs) and the intraclass correlation coefficients (ICCs) and displaying Bland-Altman plots.
For healthy eyes, amplitude CVs (S.D.) were 11.5% (11.5) and 9.9% (0.79) for within- and between-trial conditions, respectively. ICCs were 0.91 and 0.85. Phase CVs were 1.3% (1.5) (within-trials) and 1.5% (1.4) (between-trials) and ICCs were 0.85 and 0.88. For patient eyes, amplitude CVs (S.D.) were 12.2% (10.1) and 11.2% (7.5) for within- and between-trial conditions, respectively. ICCs were 0.92 and 0.89. Phase CVs were 2.2% (2.2) (within-trials) and 2.4% (2.2) (between-trials) and ICCs were 0.82 and 0.83. Bland-Altman plots indicated good agreement between the repeated recordings and were similar within- and between-trials for healthy and patient eyes.
Repeatability of PERGLA recordings is good and is similar within- and between-trials for both healthy and patient eyes suggesting this technique is promising for monitoring change over time.
The pattern electroretinogram (PERG) response in glaucoma has been studied extensively for over 20 years1–4 because of its documented relevance for disease detection. PERG responses in general are generated from the inner retina and in particular are disrupted by damage to the retinal ganglion cells5–7, indicating that PERG measurement can contribute to glaucoma diagnosis. Like other electrophysiological tests, the PERG testing paradigm is essentially objective (i.e., requires no observer response beyond fixation). For this reason, this test may be valuable for assessing visual function or corroborating suspicious results in individuals who are unable to reliably perform perimetric testing (e.g., standard automated perimetry, SAP). In addition, it is theoretically likely that PERG-based visual function measurements are more repeatable than subjective testing because PERG results are minimally affected by motor response, learning effects (beyond operator learning) and the contribution to variability of imprecise thresholding algorithms.
Recently a PERG recording paradigm designed specifically for glaucoma detection has been introduced that employs optimized stimulus parameters for boosting the signal-to-noise ratio (SNR) and compares response amplitude and phase to an internal normative database (Glaid PERGLA, Lace Elettronica, Pisa, Italy) to aid in clinical interpretation of responses that previously required expert assessment.8 Previous studies suggest that PERG recordings using this technique are repeatable, show acceptable SNR to detect early to moderate decreases in response amplitude and are sensitive to glaucoma-related decreases in signal when comparisons are made to an internal normative database.8–11
The current study seeks to further describe the within- and between-trial repeatability of PERGLA by measuring the repeatability of recordings in both healthy and patient eyes using the same instrument. Repeatability of recordings is important because good repeatability (i.e., low between-trial variability) is required for detecting small changes in signal over time, an essential task for detecting disease-related change. Assessing repeatability in both healthy and patient eyes is important because studies using other visual function techniques show increased variability in patient eyes.12 We also assessed the SNR of PERG amplitude in healthy and glaucoma eyes to describe the dynamic range of the instrument for detecting severe disease and change.
156 eyes of 78 participants enrolled in the UCSD Diagnostic Innovations in Glaucoma Study (DIGS) were included in this study. All eyes required good quality stereo-photography (TRC-SS, Topcon Instruments Corp. of America, Paramus, NJ) of the optic disc and reliable (false positives, fixation losses and false negatives ≤ 25% with no observable testing artifacts) standard automated perimetry (SAP, Humphrey Field Analyzer II with Swedish Interactive Thresholding Algorithm, Carl Zeiss Meditec, Dublin, CA) testing within 6 months of PERG testing using the Glaid PERGLA testing protocol (see PERG Testing section below for full description).
In addition to the testing described above, each study participant underwent a comprehensive ophthalmologic evaluation including review of medical history, best-corrected visual acuity testing, slit-lamp biomicroscopy, intraocular pressure (IOP) measurement with Goldmann applanation tonometry, gonioscopy, and dilated slit lamp fundus examination with a 78 diopter lens. To be included in the study, participants had to have a best-corrected acuity better than or equal to 20/40, spherical refraction within ± 5.0D and cylinder correction within ± 3.0D, and open angles on gonioscopy in both eyes. Eyes with coexisting retinal disease, uveitis, or non-glaucomatous optic neuropathy were excluded.
Study participants were classified as either healthy controls (n = 33 participants, 66 eyes) or patients (n = 45 participants, 90 eyes, with ocular hypertension, glaucomatous or suspicious appearing optic discs by stereo-photography and/or repeatable abnormal SAP results in at least one eye prior to study entry). Ocular hypertension was defined as history of untreated IOP ≥ 22 mm Hg by Goldmann applanation tonometry. Glaucomatous/suspicious appearing optic discs were those with rim thinning or retinal nerve fiber layer (RNFL) defects indicative of glaucoma (e.g., marked violation of the “ISNT Rule” of distribution of neuroretinal rim thickness, presence of focal thinning disrupting the contour of the rim, presence of diffuse or focal (wedge-shaped) RNFL atrophy wider than the width of the largest observed vessel and evidenced by light-dark-light patterns of reflectivity in superior or inferior arcuate bundles). Abnormal SAP results required pattern standard deviation (PSD) with p ≤ 5% and/or Glaucoma Hemifield Test outside normal limits by STATPAC analysis on two consecutive tests. Healthy controls were normal on the above tests.
Because the primary goal of this study was to describe repeatability of PERGLA measurements, we did not narrowly define a “glaucoma” subgroup based on specific optic disc or visual field criteria when evaluating repeatability in patient eyes. We thought it was more important to describe test-retest variability in a typical clinical patient population, similar to that in which the instrument likely would be used.
Demographic and ocular characteristics of healthy and patient eyes/participants are shown in Table 1. Statistically significant differences in age (t-test, p = 0.003), SAP Mean deviation (MD) (t-test, p < 0.001) and SAP PSD (t-test, p < 0.001) at time of testing were observed between study groups. IOP was similar in both groups.
All study methods adhered to the provisions of the Declaration of Helsinki guidelines for research involving human participants and the Health Insurance Portability and Accountability Act (HIPAA).
A commercially available modification of the Glaid (Lace Elettronica, Pisa, Italy, software version 2.1.14) electrophysiology instrument, called PERGLA, was used to measure the PERG response.8, 9 The PERGLA stimulus is a black and white (contrast 98%, mean luminance 40 cd/m²), horizontal square wave grating (1.6 c/deg), counter-phasing at 8.14 Hz, presented on a computer monitor (14.1 cm diameter circular field). At a viewing distance of 30 cm, the display subtends 25 deg centered on the fovea (assuming fixation towards a prominent central fixation circle). Responses from both eyes are measured simultaneously. Electrical signals from silver-chloride skin electrodes (9 mm, adhered with conductive cream and tape; both lower eyelids active, both temples reference, forehead ground) are fed into a two-channel differential amplifier, amplified (100,000 fold), filtered (1–30 Hz), then digitized with 12-bit resolution at 4169 Hz. Before testing, the electrode impedance is monitored automatically and an on-screen indicator signals acceptable impedance (≤ 5 kΩ). Additionally, an on-screen oscilloscope displays background noise.
The PERGLA software obtains each waveform by averaging 600 artifact-free time-periods (i.e., sweeps) of 122.8 msec each, synchronized with the contrast alternation of the stimulus grating. Two independent response blocks of 330 sweeps each are recorded and separated by a user-defined pause (i.e., inter-stimulus interval). For each block, the first 30 sweeps are rejected from the average to eliminate onset effects from the steady-state recording. Sweeps containing spurious signals attributable to blinks and eye movements are rejected over a threshold voltage of ± 25 µV. Resulting steady-state PERGs take the form of near-sine waves that are Fourier transformed to isolate the harmonic component at the contrast reversal rate (16.28 Hz, 2 contrast reversals per cycle). In addition, a noise response is obtained by multiplying alternate sweeps by 1 and –1 before averaging. The noise response also is Fourier transformed at 16.28 Hz to allow calculation of SNR.
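The averaging and noise-estimation steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PERGLA software's implementation: the function names are hypothetical, each sweep is assumed to span one full 8.14 Hz stimulus period (so the 16.28 Hz reversal component completes two cycles per sweep), amplitude is taken as the peak amplitude of that component, and SNR is assumed to be the simple ratio of signal amplitude to noise amplitude.

```python
import cmath
import math

def average_sweeps(sweeps, signs=None):
    """Average equal-length sweeps sample by sample, optionally sign-weighted."""
    n = len(sweeps)
    if signs is None:
        signs = [1] * n
    length = len(sweeps[0])
    return [sum(s * sw[i] for s, sw in zip(signs, sweeps)) / n
            for i in range(length)]

def harmonic_component(waveform, cycles_per_sweep):
    """Peak amplitude and phase (rad) of the component that completes
    `cycles_per_sweep` whole cycles within one sweep (single-bin DFT)."""
    n = len(waveform)
    coeff = sum(x * cmath.exp(-2j * math.pi * cycles_per_sweep * i / n)
                for i, x in enumerate(waveform))
    coeff *= 2 / n  # scale so a pure cosine of amplitude A returns A
    return abs(coeff), cmath.phase(coeff)

def signal_and_noise(sweeps, cycles_per_sweep=2):
    """Signal from the plain average; noise from the average of sweeps
    multiplied alternately by +1 and -1, which cancels the stimulus-locked
    response and leaves only activity that is not locked to the stimulus."""
    signal_avg = average_sweeps(sweeps)
    alternating = [1 if i % 2 == 0 else -1 for i in range(len(sweeps))]
    noise_avg = average_sweeps(sweeps, alternating)
    sig_amp, sig_phase = harmonic_component(signal_avg, cycles_per_sweep)
    noise_amp, _ = harmonic_component(noise_avg, cycles_per_sweep)
    return sig_amp, sig_phase, noise_amp

# Synthetic example: 10 identical noise-free sweeps of 64 samples each.
demo_sweeps = [[0.8 * math.cos(2 * math.pi * 2 * i / 64) for i in range(64)]
               for _ in range(10)]
demo_amp, demo_phase, demo_noise = signal_and_noise(demo_sweeps)
```

With real recordings, SNR under the assumption above would be `sig_amp / noise_amp`; in the noise-free synthetic example the noise estimate is zero, so the ratio is undefined there.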
Resulting software-provided response amplitudes, latencies (i.e., phase shifts) and SNRs were recorded as independent study variables.
Each participant was tested two times by the same operator (AT) with approximately one hour between tests. Electrodes were removed and replaced during this time. Test time was approximately 4 minutes per test, although preparation and electrode placement added from approximately 5 to 10 minutes to each examination. All eyes were refracted, appropriate corrections for viewing distance were made and near Jaeger acuity was J1 or better for all participants. Participants were requested to fixate towards a prominent fixation circle placed in the middle of the display. Fixation was monitored visually by the operator (the PERGLA instrument does not provide automatic fixation monitoring). Room lights were turned off and blinds were closed. Ambient light from various computer monitors was present. Subjects adapted to this condition for approximately 5 minutes before testing.
We assessed the repeatability of resulting amplitudes and latencies both within-trials (repeatability between each of two 300-sweep blocks that were later averaged) and between-trials (repeatability between two averaged blocks of 600 sweeps). Within-trial repeatability represented variability attributable to the instrument, while between-trial repeatability represented short-term variability and variability attributable to electrode replacement. Between-trial variability determines the ability to detect change over time using this technique. Within- and between-trial repeatability were compared for each participant and described for both healthy and patient eyes.
Repeatability of recordings between two runs within a single trial was compared to repeatability of recordings between two trials by calculating and comparing the coefficients of variation (CV, the ratio of the measurement standard deviation to the mean) and the intraclass correlation coefficients (ICC, describing the proportion of total variance accounted for by between-subject variation, as opposed to within-subject measurement variation). Agreement also was illustrated using Bland-Altman12 plots.
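As a rough illustration of two of these statistics, the sketch below computes a per-eye CV from two repeated measurements and the Bland-Altman bias with 95% limits of agreement. The function names are hypothetical, and the per-eye CV convention (sample standard deviation of the pair divided by the pair mean, later summarized across eyes) is an assumption; the text does not state exactly how the CVs were pooled.

```python
import statistics

def per_eye_cv(first, second):
    """Percent CV for each eye from two repeated measurements."""
    cvs = []
    for a, b in zip(first, second):
        pair_mean = (a + b) / 2
        pair_sd = statistics.stdev([a, b])  # equals |a - b| / sqrt(2)
        cvs.append(100 * pair_sd / pair_mean)
    return cvs

def bland_altman_limits(first, second):
    """Mean difference (bias) and 95% limits of agreement (bias +/- 1.96 SD)."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Made-up amplitudes (microvolts) for three eyes, two recordings each.
run_1 = [1.0, 0.8, 0.6]
run_2 = [0.9, 0.8, 0.7]
cv_values = per_eye_cv(run_1, run_2)
bias, lower, upper = bland_altman_limits(run_1, run_2)
```

Summarizing `cv_values` with their mean and standard deviation would give figures in the same form as the "CV (S.D.)" values reported for each condition.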
Finally, we investigated the correlations (Pearson’s r) between the absolute values of within- and between-trial changes in amplitude and SAP MD (after logging PERG amplitudes) and amplitude SNR to determine if repeatability of recording was influenced by these variables.
Table 2 shows mean amplitudes and phases, their noise level, SNRs, within subject variability (Sw), coefficients of variation (CV), and intraclass correlation coefficients (ICC) of PERGLA recordings for within- and between-trials. For both healthy and patient eyes, averaged within- and averaged between-trial amplitudes (and phases) were similar (Δ ≤ 0.02 µV), suggesting that repeatability of this important parameter is generally unaffected by short-term variability and electrode replacement.
Within subject variability (Sw) was low and similar for both healthy and patient eyes, both within-trials (Sw = 0.08, 0.06, respectively) and between-trials (Sw = 0.08, 0.06, respectively). This pattern held for phase results, with less variability. For amplitude, coefficients of variation (the ratio of the standard deviation to the mean) indicated that repeat measurement variability is approximately 10% to 12% for healthy and patient eyes, respectively. These values were similar within and between trials. For phase, repeat measurement variability was very low: on the order of 1% to 2%. Finally, intraclass correlation coefficients (ICCs) indicated that the proportion of total variance accounted for by between-subject variation rather than measurement variation (i.e., measurement reliability) was similar within- and between-trials and for healthy and patient eyes (range: 82% to 92%).
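For readers who want to reproduce statistics of this kind, the sketch below derives Sw and an ICC from a one-way analysis of variance on repeated measurements. It is an assumption that a one-way random-effects ICC is appropriate here; the text does not state which ICC form was used, and the function names are hypothetical. In this form, the ICC compares between-subject variance with total variance, and Sw is the square root of the within-subject mean square.

```python
import math

def one_way_anova(pairs):
    """Return (MSB, MSW): between- and within-subject mean squares for
    n subjects with k repeated measurements each."""
    n = len(pairs)
    k = len(pairs[0])
    subject_means = [sum(p) / k for p in pairs]
    grand_mean = sum(subject_means) / n
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    msw = sum((x - m) ** 2
              for p, m in zip(pairs, subject_means)
              for x in p) / (n * (k - 1))
    return msb, msw

def within_subject_sd(pairs):
    """Sw: square root of the within-subject mean square."""
    _, msw = one_way_anova(pairs)
    return math.sqrt(msw)

def icc_one_way(pairs):
    """One-way random-effects ICC for single measurements."""
    msb, msw = one_way_anova(pairs)
    k = len(pairs[0])
    return (msb - msw) / (msb + (k - 1) * msw)

# Made-up example: three eyes, two measurements each.
demo_pairs = [(1.0, 2.0), (2.0, 3.0), (3.0, 4.0)]
demo_sw = within_subject_sd(demo_pairs)
demo_icc = icc_one_way(demo_pairs)
```

Because both eyes of each participant were recorded, a full analysis would also need to account for the correlation between fellow eyes, which this sketch ignores.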
Bland-Altman plots describing agreement for within- and between-trial PERG amplitude recordings (differences in amplitude measurements as a function of average measurement) are shown for healthy and patient eyes in Figures 1 and 2, respectively. Within-trial amplitude differences (i.e., Δ amplitude) were significantly larger in healthy eyes than in patient eyes, although differences were quite small (average Δ amplitude = 0.09 µV and 0.05 µV, respectively, F-test p = 0.04). Between-trial differences were similar for both groups (average Δ amplitude = 0.04 µV for healthy eyes and average Δ amplitude = 0.01 µV for patient eyes, F-test p = 0.16). The between-trial agreement for patient eyes showed a small but significant proportional bias (Pearson’s test, p = 0.01). Amplitude for the first trial tended to be higher than amplitude for the second trial in eyes with higher average amplitudes. This finding may be related to the percentage of amplitude decrease attributable to adaptation to the PERG stimulus.14 Visual inspection of the Bland-Altman plots suggests that for between-trial agreement for patient eyes, variation depends somewhat on amplitude magnitude. In this case, agreement decreases as amplitude increases. Finally, there were no significant associations between the absolute values of between-trial or within-trial changes in amplitude and SAP MD or SNR (Pearson’s test, all p ≥ 0.08).
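One common way to check for proportional bias in a Bland-Altman analysis is to correlate the paired differences with the paired means. The sketch below shows that check under the assumption that this is what "Pearson's test" refers to here; the function names are hypothetical and the p-value calculation is omitted.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def proportional_bias_r(first, second):
    """Correlation between pair means and pair differences (first - second).
    A positive value means the first measurement tends to exceed the second
    more strongly as the average amplitude increases."""
    means = [(a + b) / 2 for a, b in zip(first, second)]
    diffs = [a - b for a, b in zip(first, second)]
    return pearson_r(means, diffs)

# Made-up example in which the difference grows with amplitude.
trial_1 = [1.0, 2.0, 3.0]
trial_2 = [0.9, 1.8, 2.7]
demo_r = proportional_bias_r(trial_1, trial_2)
```

A correlation near zero would indicate that the size of the disagreement does not depend on the amplitude level.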
The current results indicate that amplitude and phase of steady-state pattern electroretinogram responses measured using the PERGLA paradigm are highly repeatable and this repeatability is generally similar in patient and healthy eyes, both within- and between-trials. These are the first results directly comparing repeatability in both healthy and patient eyes (i.e., with the same instrument and same operator) in an acceptably large sample. Our results suggest that recordings from the PERGLA technique are sufficiently stable, at least over the short term, to be sensitive to small changes in retinal function over time. The relatively low SNR found in patient eyes (3.8, compared with 5.9 in healthy eyes), however, suggests that this technique may have a limited dynamic range for detecting large changes over time or advanced disease (in which case the signal may be reduced to noise levels). Other studies using the PERGLA instrument have reported higher (e.g., 13 in healthy eyes8, 8 in healthy and glaucoma eyes combined10) or more similar (e.g., 5 in glaucoma eyes11) values for this parameter. Previous studies measuring PERG have accepted SNR as low as 2.15
Average PERG amplitude measured using PERGLA in our study was slightly lower than previously reported in healthy eyes (0.83 µV compared to 1.1 µV8) and slightly higher than previously reported in patient eyes (0.57 µV compared to 0.41 µV11). This may be explained by the facts that healthy participants in the former study were younger (average age = 43 years compared to 61 years in our study) and patient eyes in the latter study had known glaucoma with mean SAP MD = −9.0 dB, compared to −1.8 dB in our mostly “suspect” eyes. PERG phase in the current study was very much the same as that initially reported using this technique.8
Overall, our repeatability results were similar to those reported in other studies using PERGLA. Initial reports by authors associated with development of the paradigm showed overall coefficients of variation of 7% and 1.5% for amplitude and phase respectively in healthy eyes8, compared to approximately 10% and 1.5% in the current study. These authors showed slightly lower within-trial variability (called “intrinsic variability” in their study) compared to when electrodes were removed (our “between-trial” condition, called “test-retest variability” in their study), however they conducted testing for the test-retest condition on different days. Other studies reporting PERGLA repeatability showed similar CVs. For instance, Yang and Swanson10 reported amplitude and phase CVs of about 8% and close to 1%, respectively (test-retest on different days, healthy and glaucoma eyes). Fredette and colleagues11 reported values of about 12% and 2% for intrinsic variability (same as within-test variability in our study) and larger values of about 21% and 4% for between-test variability (recordings on different days) in glaucoma eyes. For patients, ICCs of PERG phase and amplitude from the current study were somewhat higher than those reported by Fredette and colleagues (e.g., 0.89 and 0.79, respectively, for amplitude). This difference might reflect differences in instruments (see below) or differences in disease severity and/or SNR although, in the current study, the level of repeatability was not influenced by SAP MD or by SNR.
Repeatability results from earlier studies using PERG are not directly comparable to those reported in the current study because the PERGLA paradigm is not directly comparable to paradigms that more strictly adhere to International Society for Clinical Electrophysiology of Vision (ISCEV) standards.16 However, results from such reports vary considerably and depend on characteristics of the recording paradigm (e.g. stimulus parameters, electrode type, electrode placement). Earlier studies showed variability (CVs) on the order of 30% to 60% for transient PERG responses17; enough to mask any spatial frequency tuning. One later report by Otto and Bach18 showed low between-trial CVs for near-ISCEV standard steady-state PERG, on the order of 7% for amplitude and 1.5% for phase. These results were slightly better than those reported for the transient PERG. Similar results have been reported by other authors19; indicating that under ideal circumstances, PERG recordings can be highly repeatable.
In an attempt to identify possible sources of our comparatively low SNR values, we investigated the association among SNR and other variables. Post-hoc analyses indicated that SNR was significantly associated with SAP MD (i.e., disease severity) (positive association, R² = 0.089, p < 0.001). The implication of this finding is that as disease increases (i.e., MD decreases), the ability of PERGLA to detect it decreases as SNR approaches noise level (as is the case for other tests of visual function). Although significant, this association was weak. In addition, the limited range of MD considered in the current analysis precludes any strong inference. Neither age (in healthy eyes only, p = 0.074) nor active electrode impedance just prior to testing (p = 0.788) was associated with SNR. Other possible influences on SNR are minor lens opacities, fixation variability or changes in impedance during trials. However, these possibilities are untestable using the current data.
Results from the current study (combined with those of other studies) suggest that the dynamic range of change detection offered by PERGLA may be somewhat limited. The Bland-Altman plots showing between-trial agreement in healthy eyes (Figure 2, upper panel) and patient eyes (Figure 1, upper panel) indicate that 1.96 SD of the mean difference in between-trial measurement amplitude is approximately ± 0.25 µV. Therefore, based on our results, this number can be construed as representing a single “step” of significant change in amplitude (based on measurement variability). Theoretically, if a single step of significant change is 0.25 µV and average amplitude in healthy eyes is approximately 1.0 µV (combining results from our study and those from Porciatti et al.8), on average only just over three steps of change can be detected before the PERGLA amplitude reaches noise levels (from approximately 0.1 µV to 0.2 µV, based on results from our study and those from Porciatti et al.8). Of course, dynamic range would vary somewhat by individual based on individual baseline amplitude and individual noise levels.
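The arithmetic behind this estimate can be written out as a short sketch. The function name is hypothetical, and the inputs are simply the approximate values quoted in the paragraph above (baseline amplitude of about 1.0 µV, noise level between 0.1 µV and 0.2 µV, step size of 0.25 µV).

```python
def detectable_steps(baseline_uv, noise_uv, step_uv):
    """Number of significant-change steps between baseline amplitude and
    the noise floor, all in microvolts."""
    return (baseline_uv - noise_uv) / step_uv

# Approximate values quoted in the text.
steps_low_noise = detectable_steps(1.0, 0.1, 0.25)
steps_high_noise = detectable_steps(1.0, 0.2, 0.25)
```

Under these inputs the result falls between roughly 3.2 and 3.6 steps, consistent with the "just over three steps" described above. An eye with a lower baseline amplitude or a higher noise level would have correspondingly fewer steps available.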
Beyond very good repeatability of measurements, the commercially available PERGLA paradigm is attractive for several reasons, including a non-invasive testing protocol (skin electrodes, compared to more invasive corneal contact electrodes), ease of use by novice operators10 and automatic normative data-based assessment of results. However, this technique does not strictly adhere to ISCEV standards16 that generally discourage the use of the PERGLA electrode placement configuration and electrode type. In addition, the PERGLA instrument in typical clinical use (i.e., use other than by an electrophysiology specialist) would not receive the frequent calibration suggested by ISCEV.20 The use of a non-instrument-specific database (i.e., a database collected on a possibly differently calibrated instrument) also is not recommended by ISCEV because of possible differences in stimulus (e.g., luminance, contrast) and amplification (e.g., frequency filtering cut-offs) characteristics between test instruments. Differences in instruments may explain in part the lower SNR in patient eyes in the current study compared to that of Yang and Swanson, despite more advanced disease in their study. This finding also could be influenced by limited variability in their small number of glaucoma eyes tested.
Overall, this study confirms the good repeatability of pattern electroretinogram responses measured using the PERGLA paradigm reported in a small number of previous studies. In addition, results show that repeatability is similar between healthy and patient eyes. The current study examined repeatability in a typical cross-section of clinic patients from a tertiary clinic and indicates good repeatability of measurements in one population in which it likely would be used. As good diagnostic accuracy is dependent on a low level of variability, it is important to demonstrate repeatability of measurements of any new diagnostic instrument in multiple study populations as a precursor to assessing diagnostic accuracy. The good repeatability of these measurements suggests diagnostic accuracy using this technique likely is good (see also9) although the dynamic range for detecting large intervals of change may be limited.
Grant Support: NIH EY018190, NIH EY011008, NIH EY008208 and participant incentive grants in the form of glaucoma medication at no cost from Alcon Laboratories Inc, Allergan, Pfizer Inc., and SANTEN Inc.
Carl Zeiss Meditec: PAS (F), RNW (F,C), LMZ (F)
Haag-Streit: PAS (F)
Heidelberg Engineering: RNW (F), LMZ (F)
Lace Elettronica: CB (F)
Optovue: LMZ (F)
Welch-Allyn: PAS (F)