To compare retinal nerve fiber layer (RNFL) and optic disc topographic imaging for detection of optic nerve damage in patients suspected of having glaucoma.
Observational cohort study.
A cohort of 82 patients suspected of having glaucoma based on the appearance of the optic nerve.
All patients were imaged using the GDx VCC scanning laser polarimeter and HRT (software version 3.0) confocal scanning laser ophthalmoscope. All patients had normal standard automated perimetry visual fields at the time of imaging and were classified based on history of documented stereophotographic evidence of progressive glaucomatous change in the appearance of the optic nerve occurring before the imaging sessions.
Areas under the receiver operating characteristic (ROC) curves were used to evaluate the diagnostic accuracies of GDx VCC and the HRT.
Forty eyes with progressive glaucomatous optic nerve change were included in the glaucoma group, and 42 eyes without any evidence of progressive damage to the optic nerve followed untreated for an average time of 8.97±3.08 years were included in the normal group. The area under the ROC curve for the best parameter from GDx VCC (nerve fiber indicator [NFI]) was significantly larger than that of the best parameter from the HRT (rim volume) (0.83 vs. 0.70; P = 0.044). The NFI parameter also had a larger ROC curve area than that of the contour line–independent parameter glaucoma probability score (0.83 vs. 0.68; P = 0.023). Assuming borderline results as normal, the Moorfields regression analysis classification had a sensitivity of 48% for specificity of 69%. For a similar specificity (70%), the parameter NFI had a significantly larger sensitivity (83%) (P = 0.003).
Retinal nerve fiber layer imaging with GDx VCC performed better than topographic optic disc assessment with the HRT for detecting early damage in patients suspected of having glaucoma. For glaucoma diagnosis, these results suggest that GDx VCC may offer an advantage over the HRT when these tests are combined with clinical examination of the optic nerve.
To diagnose disease, a clinician integrates the constellation of symptoms and/or signs of a presenting patient and then assigns a level of certainty regarding its presence. In the case of glaucoma evaluation, the process generally starts with the medical interview and history. It is followed by clinical examination, which generally includes slit-lamp examination, intraocular pressure (IOP) measurement, and optic nerve examination. After this information is collected, the clinician hypothesizes about the chance that glaucoma is present and can order additional tests, such as the visual field (VF).
It is not unusual for a patient to present with a suspicious appearance of the optic disc and normal or inconclusive VF tests. In this situation, additional testing, such as optic disc and/or retinal nerve fiber layer (RNFL) imaging, can be conducted to reduce diagnostic uncertainty. The imaging results are then used to complement clinical evaluation to determine whether the patient with suspected glaucoma has optic nerve damage or is likely a healthy subject.
Several imaging technologies have become available to evaluate objectively the optic disc and RNFL. One of these technologies, scanning laser polarimetry (SLP), provides quantitative estimates of RNFL thickness.1,2 Another, confocal scanning laser ophthalmoscopy (CSLO), evaluates the topography of the optic disc, although it also can provide indirect estimates of RNFL integrity.3,4 Numerous studies have evaluated and compared the ability of these instruments to discriminate patients with repeatable glaucomatous VF loss from healthy subjects.4–8 However, estimates of diagnostic accuracy of imaging instruments obtained from these studies are not directly applicable to the evaluation of those patients suspected of having glaucoma without confirmed VF loss.9,10
Studies evaluating glaucoma suspects have been limited by the lack of a perfect reference standard that could be used to diagnose disease at a single point in time without relying on VFs. Unfortunately, there are no sensitive and specific glaucoma biomarkers, and it is difficult, if not impossible, to ascertain the true diagnosis in a patient suspected of having glaucoma based on a single evaluation. However, longitudinal follow-up can be used to evaluate the existence of progressive damage, which would then confirm the diagnosis. The final diagnosis based on longitudinal follow-up can then be used as a reference standard with which the results of the imaging instruments are compared.10,11
In the present study, we compared the RNFL and optic disc topographic assessment by SLP and CSLO, respectively, to detect glaucomatous damage in patients suspected of having the disease. Long-term follow-up was used to establish diagnosis in these patients and as a reference standard for comparison of results.
Patients from this study were included in a prospective longitudinal study designed to evaluate optic nerve structure and visual function in glaucoma (Diagnostic Innovations in Glaucoma Study) conducted at the Hamilton Glaucoma Center, University of California, San Diego. Patients in the Diagnostic Innovations in Glaucoma Study were longitudinally evaluated according to a preestablished protocol that included regular follow-up visits in which patients underwent clinical examination and several other imaging and functional tests. All the data were entered in a computer database. All patients from the Diagnostic Innovations in Glaucoma Study who met the inclusion criteria described below were enrolled in the current study. Informed consent was obtained from all participants. The University of California San Diego Human Subjects Committee approved all protocols, and the methods described adhered to the tenets of the Declaration of Helsinki.
Each subject underwent a comprehensive ophthalmologic examination including review of medical history, best-corrected visual acuity (BCVA), slit-lamp biomicroscopy, IOP measurement using Goldmann applanation tonometry, gonioscopy, dilated fundoscopic examination using a 78-diopter (D) lens, stereoscopic optic disc photography, and standard automated perimetry using the 24-2 Swedish Interactive Threshold Algorithm (Carl Zeiss Meditec Inc., Dublin, CA). To be included, subjects had to have BCVA of 20/40 or better, spherical refraction within ±5.0 D and cylinder correction within ±3.0 D, and open angles on gonioscopy. Eyes with coexisting retinal disease, uveitis, or nonglaucomatous optic neuropathy were also excluded from this investigation.
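The inclusion and exclusion criteria above can be read as a simple eligibility check. The following is a minimal sketch only; the `EyeExam` record and its field names are hypothetical and are not taken from the study database.

```python
from dataclasses import dataclass


@dataclass
class EyeExam:
    """Hypothetical record for one eye; field names are illustrative only."""
    bcva_snellen_denominator: int  # e.g. 40 for a BCVA of 20/40
    sphere_d: float                # spherical refraction, diopters
    cylinder_d: float              # cylinder correction, diopters
    open_angle: bool               # open angle on gonioscopy
    retinal_disease: bool = False
    uveitis: bool = False
    nonglaucomatous_optic_neuropathy: bool = False


def is_eligible(exam: EyeExam) -> bool:
    """Apply the inclusion and exclusion criteria stated in the text."""
    meets_inclusion = (
        exam.bcva_snellen_denominator <= 40  # BCVA of 20/40 or better
        and abs(exam.sphere_d) <= 5.0        # sphere within +/-5.0 D
        and abs(exam.cylinder_d) <= 3.0      # cylinder within +/-3.0 D
        and exam.open_angle
    )
    has_exclusion = (
        exam.retinal_disease
        or exam.uveitis
        or exam.nonglaucomatous_optic_neuropathy
    )
    return meets_inclusion and not has_exclusion
```

For example, an eye with BCVA 20/25, a sphere of -2.0 D, a cylinder of 1.0 D, and open angles would pass this check, whereas the same eye with coexisting uveitis would not.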
A cohort of patients suspected of having glaucoma was selected from our database. These patients were selected based on the presence of abnormal or suspicious appearance of the optic nerve from cross-sectional evaluation of stereophotographs at the time of imaging by 2 independent masked graders. Features characteristic of glaucomatous appearance of the optic disc were neuroretinal rim thinning, cupping, or suspicious/abnormal RNFL defects. A third grader reviewed the photographs in case of disagreement. All patients had normal VFs at time of imaging. A normal VF was defined as a mean deviation (MD) and pattern standard deviation within 95% confidence limits and a glaucoma hemifield test result within normal limits. Additionally, participants could not have had repeatable glaucomatous standard automated perimetry VF loss before the date of their examination with imaging instruments. All patients had been observed for at least 5 years before their imaging session.
These patients were then classified based on history of documented evidence of progressive glaucomatous change in the appearance of the optic disc occurring before the imaging sessions. Patients with documented evidence of progressive glaucomatous nerve damage at any time before both imaging sessions with SLP and CSLO were considered as having glaucoma. Progressive glaucomatous change in the appearance of the optic disc was assessed by simultaneous stereoscopic optic disc photographs (TRC-SS, Topcon Instrument Corp. of America, Paramus, NJ). Stereoscopic sets of slides were examined using a stereoscopic viewer (Asahi, Pentax, Tokyo, Japan). The photographs were evaluated by 2 experienced graders, and each was masked to the subject's identity and to the other test results. For inclusion, photographs needed to be graded adequate or better. For each patient, the most recent stereophotograph was compared with the oldest available one, to maximize the chance of detecting progressive optic disc change. Each observer was masked to the temporal sequence of the photographs. Definition of change was based on focal or diffuse thinning of the neuroretinal rim, increased excavation, or enlargement of RNFL defects. Changes in rim color, presence of disc hemorrhage, or progressive parapapillary atrophy were not sufficient for characterization of progression. Discrepancies between the 2 graders were resolved by either consensus or adjudication of a third experienced grader. Initial agreement between graders was obtained in 88% of cases (93% agreement for judging no progression and 83% agreement for judging progression). When both eyes of the same patient showed progressive optic disc changes and met the inclusion criteria, one eye was randomly selected for inclusion in the study.
A total of 40 eyes with progressive glaucomatous optic disc change and no VF loss before the imaging sessions were included in the glaucoma group. These patients were observed for an average of 8.21±3.26 years.
Patients without any evidence of progressive change in the appearance of the optic disc or VF loss in both eyes, observed without any history of IOP-lowering treatment, were considered to be normal and used as the control group. One eye of each subject was randomly selected for analysis. A total of 42 eyes of 42 subjects were included in the normal group. These subjects were observed untreated for an average time of 8.97±3.08 years without showing any evidence of progressive damage to the optic nerve, providing reasonable confidence that they had only suspicious findings of disease but no glaucomatous damage.
Mean (± standard deviation [SD]) ages of glaucoma and normal subjects were 66.1±12.8 years and 62.5±13.1 years (P = 0.21). Medians (first quartile, third quartile) of MD of the VF closest to the imaging session were −1.28 decibels (−2.79, 0.09) and −0.54 decibels (−1.01, 0.26), respectively (P = 0.006).
Patients were imaged using a commercially available scanning laser polarimeter with variable corneal compensation (GDx VCC, software version 5.5.1, Carl Zeiss Meditec, Dublin, CA). The general principles of SLP and the algorithm used for variable corneal compensation have been described in detail elsewhere.1,12,13 Because corneal polarization axis and magnitude affect scanning laser polarimetry measurements and are not similar in all eyes, GDx VCC employs a variable corneal polarization compensator that allows eye-specific compensation of anterior chamber birefringence. After determining the axis and magnitude of corneal polarization in each eye by macular scanning,13 3 appropriately compensated retinal polarization images per eye were automatically obtained and combined to form each mean image used for analysis.
Assessment of GDx VCC image quality was performed by an experienced examiner masked to the subject's identity and results of the other tests. The assessment was based on the appearance of the reflectance image, presence of residual anterior segment retardation, and presence of an atypical pattern of retardation. To be classified as good, an image had to be focused and have evenly illuminated reflectance with a centered optic disc. To be acceptable, the mean image also had to have residual anterior segment retardation ≤ 15 nm and a typical scan score > 25. The typical scan score is a measure provided by the GDx VCC standard software that indicates the presence of atypical patterns of retardation that can generate spurious RNFL thickness measurements.14
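The acceptance criteria above combine a qualitative judgment of the reflectance image with two numeric cutoffs. A minimal sketch of that rule follows; the function and argument names are hypothetical, and in the study the qualitative inputs were judged by a masked examiner, not computed.

```python
def gdx_image_acceptable(
    focused: bool,
    evenly_illuminated: bool,
    disc_centered: bool,
    residual_retardation_nm: float,
    typical_scan_score: float,
) -> bool:
    """Return True if a mean GDx VCC image meets the stated quality criteria."""
    # "Good" reflectance image: focused, evenly illuminated, centered optic disc.
    good_reflectance = focused and evenly_illuminated and disc_centered
    # Numeric cutoffs from the text: residual anterior segment
    # retardation <= 15 nm and typical scan score > 25.
    return (
        good_reflectance
        and residual_retardation_nm <= 15
        and typical_scan_score > 25
    )
```

Note that the two cutoffs are asymmetric: a residual retardation of exactly 15 nm passes, whereas a typical scan score of exactly 25 does not.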
GDx VCC parameters provided in the standard printout of the instrument and investigated in this study were superior average, inferior average, temporal–superior–nasal–inferior–temporal (TSNIT) average, TSNIT SD, and nerve fiber indicator (NFI). The NFI is calculated using a support vector machine algorithm based on several RNFL measures and assigns a number from 0 to 100 to each eye.15 The higher the NFI, the greater the likelihood the patient has glaucoma.
The HRT II (software version 3.0, Heidelberg Engineering, Dossenheim, Germany) was used to acquire CSLO images in the study. It uses confocal scanning laser principles to obtain a 3-dimensional topographic image of the optic nerve. Its working principles have been described in detail elsewhere.16 For each patient, 3 topographical images were obtained, combined, and automatically aligned to make a single mean topography for analysis. Magnification errors were corrected using patients' corneal curvature measurements. An experienced examiner outlined the optic disc margin on the mean topographic image while viewing stereoscopic photographs of the optic disc. Good images required a focused reflectance image with a standard deviation not greater than 50 μm.
Topographical parameters included with HRT software and investigated in this study were disc area, cup area, rim area, cup-to-disc (C/D) area ratio, rim-to-disc area ratio, cup volume, rim volume, mean cup depth, maximum cup depth, mean height contour, height variation contour, cup shape measure, mean RNFL thickness, RNFL cross-sectional area, linear C/D ratio, and a linear discriminant function, from Mikelberg et al.17 The software on HRT II also incorporates Moorfields regression analysis (MRA), which is a comparison of the subject's rim area to a predicted rim area for a given disc area and age, based on confidence limits of a regression analysis derived from healthy subjects included in the instrument's normative database.18 This database contains information from 733 eyes of Caucasian subjects, 215 eyes of African subjects, and 104 eyes from Indian subjects. These subjects were selected based on the presence of normal IOP (<23 mmHg), normal VFs, no family history of glaucoma, and no history of ocular disease. Each sector is classified as within normal limits if the measurement falls within a 95% confidence interval (CI), borderline if the measurement falls in the 95% to 99.9% CI, and outside normal limits if the measurement falls below the 99.9% CI. Moorfields regression analysis also provides results for the global rim area (MRA global) and a final classification (MRA classification). A normal MRA classification requires the MRA analysis of all sectors and the global rim area to be within normal limits. A borderline MRA classification occurs when at least one of the sectors or the global rim area is borderline, and an outside normal limits result occurs when at least one sector or the global rim area is outside normal limits.
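The way sector-level results combine into the final MRA classification can be sketched as follows. This is an illustration of the rule as described, not the instrument's software; the names are hypothetical, and the precedence of "outside normal limits" over "borderline" when both occur is an assumption. The helper for binarizing the result reflects the choice, used in this study, of treating borderline results as normal.

```python
WITHIN = "within normal limits"
BORDERLINE = "borderline"
OUTSIDE = "outside normal limits"


def mra_classification(sector_results, global_result):
    """Combine sector results and the global rim area result into one class."""
    results = list(sector_results) + [global_result]
    # Assumption: any "outside normal limits" result takes precedence.
    if any(r == OUTSIDE for r in results):
        return OUTSIDE
    if any(r == BORDERLINE for r in results):
        return BORDERLINE
    # Normal only if every sector and the global rim area are within limits.
    return WITHIN


def mra_test_positive(classification, borderline_as_normal=True):
    """Binarize the MRA classification for sensitivity/specificity analysis."""
    if classification == OUTSIDE:
        return True
    if classification == BORDERLINE:
        return not borderline_as_normal
    return False
```

With `borderline_as_normal=True`, only an "outside normal limits" classification counts as a positive test, which trades sensitivity for specificity relative to counting borderline results as abnormal.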
The HRT 3.0 software utilizes manufacturer-developed automated analysis for the detection of glaucomatous damage, the glaucoma probability score (GPS), which is independent of the contour line traced by the examiner around the optic disc margin.19 It is based on a 3-dimensional model of the entire topographical image, including the optic disc and surrounding parapapillary RNFL. Five shape-based parameters are used in the model to characterize the shape of the optic disc and RNFL. Three parameters are used to characterize the optic disc: cup size (width), cup depth (depth), and rim steepness (slope). Two parameters are used to characterize the RNFL: vertical RNFL curvature (superior to inferior curvature) and horizontal RNFL curvature (nasal to temporal curvature). A 3-dimensional model incorporating information from the 5 parameters described above is then constructed for the optic disc being examined. The values of the parameters are then fed into a machine learning classifier analysis called a relevance vector machine, which compares a patient's results to previously defined healthy and glaucomatous models. According to the manufacturer, the final GPS result is the probability or likelihood that the scan has structural characteristics that are consistent with glaucoma.
Descriptive statistics included mean and SD for normally distributed variables and median, first quartile, and third quartile values for non–normally distributed variables. Student's t tests or Mann–Whitney U tests were used to evaluate demographic and clinical differences between glaucoma and normal subjects.
Receiver operating characteristic (ROC) curves were used to describe the ability of SLP and CSLO to differentiate glaucoma from normal subjects. The ROC curve shows the tradeoff between sensitivity and 1 − specificity.20 An ROC curve area of 1.0 represents perfect discrimination, whereas an area of 0.5 represents chance discrimination. Receiver operating characteristic curve areas were compared using the method of DeLong et al.21 Sensitivities at fixed specificities of 70% and 95% were also reported for each parameter of each instrument.
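To make these quantities concrete, the sketch below computes an empirical ROC curve area and the sensitivity at a fixed specificity in plain Python. It is an illustration under stated assumptions, not the STATA or SPSS procedure used in the study: it assumes higher scores indicate disease (as with the NFI; a parameter such as rim volume would need its sign reversed), it does not implement the DeLong comparison, and the example values are made up and are not study data.

```python
def roc_auc(diseased, healthy):
    """Empirical AUC: probability that a diseased eye scores higher than a
    healthy eye, counting ties as one half."""
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))


def sensitivity_at_specificity(diseased, healthy, target_specificity):
    """Sensitivity at the lowest cutoff whose specificity reaches the target.

    A test is called positive when score > cutoff.
    """
    for cutoff in sorted(set(healthy)):
        specificity = sum(h <= cutoff for h in healthy) / len(healthy)
        if specificity >= target_specificity:
            return sum(d > cutoff for d in diseased) / len(diseased)
    raise ValueError("target specificity must not exceed 1.0")


# Made-up scores for illustration only (not data from this study).
healthy_scores = [10, 20, 30, 40]
diseased_scores = [25, 35, 45, 55]
print(roc_auc(diseased_scores, healthy_scores))                           # 0.8125
print(sensitivity_at_specificity(diseased_scores, healthy_scores, 0.75))  # 0.75
```

In the toy example, 13 of the 16 diseased-healthy pairs are ordered correctly, giving an area of 0.8125; a cutoff of 30 classifies 3 of 4 healthy eyes as negative and 3 of 4 diseased eyes as positive.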
Statistical analyses were performed using STATA (version 9.0, StataCorp, College Station, TX) and SPSS (version 13.0, SPSS Inc., Chicago, IL). The α level (type I error) was set at 0.05.
Table 1 shows mean values of GDx VCC parameters in glaucomatous and normal eyes. Statistically significant differences were found for all parameters. Table 1 also shows ROC curve areas and sensitivities at fixed specificities of 95% and 70%. The GDx VCC parameter with the largest area under the ROC curve was the NFI (0.83). The ROC curve area for the NFI was significantly greater than for superior average, inferior average, and TSNIT SD (P<0.05 for all comparisons), but it did not significantly differ from the ROC curve area for TSNIT average (P = 0.189).
Table 2 shows mean values of HRT parameters in the glaucoma and normal groups. Statistically significant differences between glaucoma and normal eyes were found for neuroretinal rim–related parameters (area and volume), ratios (C/D area, rim-to-disc area, and linear C/D), RNFL parameters (mean thickness and cross-sectional area), the Mikelberg linear discriminant function, and GPS. No significant differences were found for disc area, cup-related parameters (area, volume, shape measure, maximum depth, and mean depth), mean height contour, and height variation contour. Table 2 also shows the area under the ROC curve (AUC) for each parameter. The 3 parameters with the largest AUCs were rim volume (AUC = 0.70), RNFL cross-sectional area (AUC = 0.69), and RNFL thickness (AUC = 0.69), although several other parameters had very similar performances.
Table 3 shows results of Moorfields regression analysis parameters for glaucoma and normal eyes. The percentage of glaucomatous eyes with outside normal limits results ranged from 10% to 48%. The percentage of normal eyes with within normal limits results ranged from 55% to 88%.
The area under the ROC curve for the best parameter from GDx VCC (NFI) was significantly larger than that of the best parameter from HRT (rim volume) (P = 0.044). The NFI parameter also had an ROC curve area larger than that of the contour line–independent parameter GPS (P = 0.023). Figure 1 shows the ROC curves for the NFI and the parameters rim volume and GPS. Assuming borderline results as normal, the MRA classification had a sensitivity of 48% for specificity of 69%. For a similar specificity (70%), the parameter NFI had a significantly larger sensitivity (83%) (P = 0.003).
Figure 2 illustrates typical subjects included in the study in the glaucoma and normal groups, with results from imaging instruments.
We found that SLP RNFL analysis was superior to CSLO optic disc topographic measurements for detecting optic nerve damage in patients suspected of having glaucoma. Several previous studies have compared the performance of these 2 technologies for detection of glaucoma.4–8 Most of these studies evaluated the ability of these instruments to discriminate patients with repeatable VF loss from healthy subjects with no suspicious findings of disease (usually healthy volunteers selected from the general population or among staff or relatives of patients). These studies are clearly important to provide an initial exploratory evaluation of the ability of these tests to detect glaucomatous damage—that is, if a test fails to differentiate cases from controls at this stage, no more evaluation is pursued, and the test is regarded generally as not clinically useful. However, in clinical practice a diagnostic test is used to diagnose disease in patients suspected of having disease, not in patients with a confirmed diagnosis. Therefore, if a test succeeds in initial diagnostic studies, further steps are needed to evaluate whether it is able to provide clinically relevant information.
The evaluation of the ability of imaging instruments to provide additional information besides that of clinical examination and VF testing is fundamental to measure their true value as complementary tests for diagnosing glaucoma in patients suspected of having the disease. The design of our study enabled the evaluation of the performance of these tests in the clinically relevant situation of diagnosing disease in glaucoma suspects. Estimates of diagnostic accuracy found in our study were lower than those reported in previous studies including patients with glaucomatous VF loss. The decrease in performance of the imaging instruments in our study compared with previous ones is probably related, at least in part, to the less severe stage of disease in glaucoma subjects included in our analyses. As the patients did not have VF damage at the time of imaging, they were likely at an earlier stage of disease than those included in previous studies using patients with VF damage.5 The decrease in performance is also probably related to the method of selection of normal subjects. In previous studies, normal subjects had no suspicious findings of disease and were generally required to have normal optic discs, whereas in our study, normal subjects had optic discs with a suspicious appearance, making it more difficult for the diagnostic test to differentiate them from diseased subjects. In fact, a recent study by Medeiros et al10 evaluating the impact of design-related bias in studies of diagnostic tests in glaucoma found that studies with a case–control design including patients with well-established disease and a separate group of normal (unsuspected) control subjects resulted in substantial overestimation of the test's performance.
Retinal nerve fiber layer parameters performed significantly better than optic disc topographic parameters for detecting glaucoma in our study. In patients with a suspicious optic disc appearance, RNFL analysis provided more useful information than topographic optic disc analysis to confirm the diagnosis. The best GDx VCC parameter, the NFI, had an ROC curve area of 0.83 and sensitivity of 83% for specificity at 70%, whereas the best topographic parameter had an ROC curve area of 0.70 with sensitivity of only 63% for similar specificity. These results are not surprising. The evaluation of the integrity of the RNFL during clinical examination or with stereophotographs is difficult, especially in older patients with lightly pigmented retinas or hazy optical media. On the other hand, evaluation of the optic nerve and identification of areas of suspicious rim thinning or optic disc cupping are more straightforward. Therefore, patients with a suspicious disc appearance will generally be those with large cups or suspicious rim thinning. In these patients, it is not surprising that topographic information about cup size or cup depth, for example, does not provide much additional information in terms of establishing the definitive diagnosis.
Receiver operating characteristic curve areas above 0.80 are generally considered to be good for a diagnostic test, whereas areas ranging from 0.70 to 0.80 are only fair, and areas below 0.7 are generally considered poor. Therefore, from the parameters investigated, only the GDx VCC NFI would be considered a good test to differentiate glaucoma from normals in our study. The ROC curve area for the NFI in this situation was actually similar to those found for several clinical tests widely used in medicine when these tests are applied to challenging situations of diagnosis, such as the electrocardiogram for diagnosing myocardial infarction22,23 or magnetic resonance imaging for diagnosing multiple sclerosis.24
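The rule of thumb above can be written as a small helper. This is a sketch with a hypothetical function name; placing areas of exactly 0.70 and 0.80 in the "fair" band follows the wording of the text, and because reported areas are rounded to two decimals, classifications at the boundaries are approximate.

```python
def interpret_auc(auc: float) -> str:
    """Map an ROC curve area to the qualitative bands described in the text."""
    if auc > 0.80:
        return "good"
    if auc >= 0.70:
        return "fair"
    return "poor"


# Areas reported in this study for three parameters.
reported = {"NFI": 0.83, "rim volume": 0.70, "GPS": 0.68}
for name, auc in reported.items():
    print(name, interpret_auc(auc))
```

Applied to the reported areas, only the NFI falls in the "good" band, rim volume sits at the lower edge of "fair", and GPS falls in "poor".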
In our study, we used evidence of documented progressive disc change to separate glaucoma suspect patients into those who were disease positive and those who were disease negative. As all these patients were required to have normal VFs and suspicious optic disc findings, no other reference standard would be available to classify these patients in this situation. In the absence of VF loss, a diagnosis of glaucoma can be given with certainty only by demonstrating a history of progressive glaucomatous changes to the optic nerve. The use of progressive optic disc change as a reference standard, however, has some limitations. Demonstration of progressive optic disc change requires longitudinal follow-up and serial documentation of optic disc appearance, which may not be available for all patients. Patients with a suspicious optic disc appearance who did not show any evidence of optic disc change or VF loss during follow-up were considered normal. It might be argued that some of these patients could have had glaucomatous damage, but the follow-up time was insufficient to detect progression. Although it is unlikely that patients with glaucoma would fail to progress or develop functional loss when observed for almost 9 years without treatment, this possibility cannot be completely discarded. The requirement for no treatment was applied only to patients included in the control group, to avoid any confounding effects of treatment in the assessment of progression in this group. Although some of the patients in the glaucoma group received treatment later during their follow-up, their diagnoses could be confirmed by the presence of progressive optic disc damage, regardless of treatment.
It is important to keep in mind that different reference standards may be required to assess the performance of imaging tests in different situations. Our design enabled the evaluation of their ability to diagnose disease in patients with suspicious disc appearance. In contrast, if the purpose of the study were to evaluate how these tests performed in detecting patients with glaucomatous VF loss in the general population for glaucoma screening, a study design such as the one used in previous studies would be more appropriate. Therefore, it is important to evaluate the results of a diagnostic study in the context in which the test will be used in clinical practice.
In conclusion, RNFL imaging with SLP performed better than topographic optic disc assessment with CSLO for detecting damage in patients suspected of having glaucoma. For glaucoma diagnosis, these results suggest that SLP may offer an advantage over CSLO when these tests are combined with clinical examination of the optic disc.
Supported in part by the National Eye Institute, Bethesda, Maryland (grant nos. EY11008 [LMZ], EY08208 [PAS]). Research support from Carl Zeiss Meditec, Dublin, California (FAM, LMZ, PAS, RNW), and Heidelberg Engineering, Dossenheim, Germany (FAM, LMZ, RNW).