Retinal nerve fiber (RNFL) thickness and visual field loss data from patients with glaucoma were analyzed in the context of a model, to better understand individual variation in structure versus function.
Optical coherence tomography (OCT) RNFL thickness and standard automated perimetry (SAP) visual field loss were measured in the arcuate regions of one eye of 140 patients with glaucoma and 82 normal control subjects. An estimate of within-individual (measurement) error was obtained by repeat measures made on different days within a short period in 34 patients and 22 control subjects. A linear model, previously shown to describe the general characteristics of the structure–function data, was extended to predict the variability in the data.
For normal control subjects, between-individual error (individual differences) accounted for 87% and 71% of the total variance in OCT and SAP measures, respectively. SAP within-individual error increased and then decreased with increased SAP loss, whereas OCT error remained constant. The linear model with variability (LMV) described much of the variability in the data. However, 12.5% of the patients’ points fell outside the 95% boundary. An examination of these points revealed factors that can contribute to the overall variability in the data. These factors include epiretinal membranes, edema, individual variation in field-to-disc mapping, and the location of blood vessels and degree to which they are included by the RNFL algorithm.
The model and the partitioning of within- versus between-individual variability helped elucidate the factors contributing to the considerable variability in the structure-versus-function data.
The relationship between structural and functional losses due to glaucoma has long been a topic of interest. The precise nature of this relationship has important implications for detecting glaucomatous damage and for staging the disease and monitoring its progression. In early studies structural damage was identified based on disc appearance1 or postmortem counts of retinal ganglion cells (RGCs),2 but the focus has shifted to quantifying retinal nerve fiber layer (RNFL) thickness and optic rim parameters measured with automated, noninvasive techniques, such as optical coherence tomography (OCT), confocal scanning laser ophthalmoscopy (SLO), and scanning laser polarimetry (SLP).
Numerous studies have shown plots of structural losses, measured using OCT, SLO, and/or SLP, versus functional losses, measured with static automated perimetry (SAP). (See Refs. 3– 8 for the relevant literature.) Recently, Hood, et al.6,9–11 showed that a simple linear model (SLM) described the general characteristics of these data. In its simplest form,9,10 this model assumes that RNFL thickness consists of two parts: RGC axons and everything else (e.g., glial cells, blood vessels). It further assumes that the loss in the thickness of the axon portion is proportional to local field sensitivity loss, when field loss is expressed on a linear scale. That is, a 3-dB loss (i.e., one half of normal sensitivity or a 50% luminance increase in threshold) is associated, on average, with a loss of one half of the axon portion of the RNFL thickness, whereas the nonaxon portion is assumed to remain constant with RGC loss. A variety of evidence is consistent with a linear model, at least in humans.4,6,12–14 On the other hand, Harwerth et al.15–18 have argued for nonlinear relationships between structure and function in humans15,16 and monkeys.17,18
There is a large amount of scatter in published structure-versus-function data. Although investigators in past studies have looked at different field regions, analyzed different sectors of the disc, and fitted their data with different functions, they all found a wide range of RNFL thickness associated with any given loss in local field sensitivity. It is clear that whatever model best describes the central tendency of the structural (RNFL thickness) versus functional (SAP field) loss in humans, there will be considerable scatter of the data around the predicted curve (see for example, Refs. 3, 5– 8, 15, 16). Hood and Kardon6 attempted to model some of the scatter in the data by modifying the SLM to predict the 95% confidence limits of RNFL measures. However, this model predicted only the variability in RNFL measurements; it ignored the variability in the SAP measurements.
In the present study, we obtained estimates of within- and between-individual variability and propose a model, which we call the linear model with variability (LMV), that includes variability on both RNFL and SAP axes. We analyzed OCT and SAP data from 82 control subjects and 140 patients with glaucoma in the context of the LMV. The model accurately predicted the variability in the control data and predicted considerable variability in the patient data. However, more points (12.5%) fell beyond the 95% boundaries of the LMV than predicted. The model, as well as an examination of the points falling outside the boundaries, provides a framework for understanding the sources contributing to the wide variability in structure-versus-function plots so that efforts can be directed toward minimizing the variability.
One hundred forty patients and 82 control subjects were tested with SAP and OCT. Procedures adhered to the tenets of the Declaration of Helsinki, and the protocol was approved by the University of Iowa Institutional Review Board (IRB) for Human Use and the Institutional Board of Research Associates of Columbia University.
All patients had a diagnosis of glaucoma and were part of a long-term study of perimetric and structural variability and change over time. Glaucoma suspects and patients with ocular hypertension were excluded. The inclusion criteria for both patients and control subjects included corrected visual acuity of 20/30 or better, no or mild cataract, refractive error not exceeding +6 D spherical and 3.5 D cylindrical, pupil diameter ≥3 mm, and mean deviation on SAP of better than −20 dB. The diagnosis of glaucoma was based on the evaluation of two glaucoma experts. Elevated intraocular pressure was neither a necessary nor a sufficient condition. Rather, the experts diagnosed glaucoma by using their clinical expertise in evaluating glaucomatous cupping of the optic nerve, corresponding visual field with RNFL-associated pattern of loss, intraocular pressure, and history. Patients were excluded if they had a history of other ocular or neurologic diseases that could affect the structural or functional measurements, surgery that could adversely affect vision, systemic diseases that were causing vision loss, amblyopia, or unreliable behavior (frequently missed appointments or unreliable SAP tests). Patients with controlled hypertension and/or diabetes mellitus without significant retinopathy judged on intake examination were not excluded nor were patients with pseudophakic eyes or those who had undergone refractive surgery, as long as there were no complicating factors that would impair their vision. The individuals in the control group were deemed to be normal based on their ophthalmic examinations and lack of history of ocular disease. Abnormal SAP fields were not used as an exclusion criterion for the normal control eyes.
All individuals fell into either a short- or long-term repeat group. There were 34 patients with glaucoma and 22 normal control subjects in the short-term repeat group and, with two exceptions, each was tested with SAP and OCT five times on separate days within a 5.7 ± 2.8 (median, 4.6; maximum, 15.3)-week period in which their glaucoma was stable. Two of the control subjects had only four OCT examinations. In one case, the fifth scan did not meet the quality criteria (see below) and, in the other, one of the examination appointments was missed. The other 106 patients and 60 control subjects were in the long-term group, whose members are retested every 6 months. Only the results of the first examinations were analyzed for the long-term group in this study. The first OCT and SAP examinations were typically performed on the same day. When the first SAP and OCT examinations were not performed on the same day, the first OCT was used and the SAP closest to that day, but within 2 months, was selected. If there was no SAP test within 2 months, then the next earliest OCT was selected together with the closest SAP test performed within 2 months. The average time between OCT and SAP examinations was 3.2 ± 12.8 (median, 0) days for the patients and 0.5 ± 6.5 (median, 0) days for the control subjects. At the time of the first OCT test, the mean ages of the control subjects and patients were 53.4 ± 11.5 years (range, 23.1–78.1) and 65.9 ± 10.2 years (range, 22.3–82.4), respectively. The effects of age on our measures of SAP loss and OCT thickness are considered in the next section.
The visual field testing was performed with SAP (Humphrey Field Analyzer, program 24-2 SITA standard; Carl Zeiss Meditec, Dublin, CA). Field test results were excluded from the analysis if they were unreliable as defined by excessive false-positive responses (>10%), false-negative responses (>33%), or fixation losses (>33%). All visual threshold analyses were performed by using the total deviations as displayed in the Total Deviation Plot of the 24-2 SAP report. The total deviation at each point in the field is the difference, in decibels, between the patient’s sensitivity and the average sensitivity of age-matched machine norms. Thus, −10 dB means that, at that test location, the patient’s sensitivity is 10 dB or 1 log unit less than the average sensitivity of the age-matched control subjects, or on a linear scale, one tenth as sensitive. As in our previous work,6,10,11 we were interested in the sensitivity loss in arcuate regions falling within the 24-2 SAP test field, and so we used the definitions of these regions provided by Garway-Heath et al.19 The associated locations on the 24-2 SAP field are shown in Figures 2A and 2B. The decibel values for each location of the total deviation field within the arcuate region were converted to a linear scale (e.g., 0 dB converted to 1.0 and −30 dB to 0.001) before they were averaged within each sector.6,20,21 The decibel value of these sector averages is plotted in Figures 2 and 3.
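The sector averaging described above can be sketched as follows. This is a minimal illustration, not the authors' code; the function names and the example values are hypothetical. It shows why averaging on a linear scale differs from averaging decibel values directly.

```python
import math

def db_to_linear(total_deviation_db):
    """Convert a total deviation in dB to relative sensitivity (0 dB -> 1.0)."""
    return 10 ** (0.1 * total_deviation_db)

def linear_to_db(relative_sensitivity):
    """Convert relative sensitivity back to dB."""
    return 10 * math.log10(relative_sensitivity)

def sector_mean_db(total_deviations_db):
    """Average the points of one arcuate sector on a linear scale, then
    express that average in dB."""
    linear = [db_to_linear(v) for v in total_deviations_db]
    return linear_to_db(sum(linear) / len(linear))

# Hypothetical two-point sector: one normal location, one with deep loss.
points = [0.0, -30.0]
linear_average_db = sector_mean_db(points)        # close to -3 dB
naive_db_average = sum(points) / len(points)      # -15 dB
```

With these hypothetical values, the linear average corresponds to a loss of about 3 dB, whereas a direct average of the decibel values would give −15 dB.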
RNFL thickness was measured by OCT (OCT3; Stratus fast RNFL circular scan; Carl Zeiss Meditec). The circular scan RNFL data consisted of the average of three circular scans in the set, and only sets with a signal strength ≥6 were used. Multiple sets of three scans were typically collected at each visit, and the scan set having the highest signal strength and most accurately centered on the disc was used. The 256 OCT RNFL thickness values for the scan were exported. The thicknesses within the superior and inferior sectors as defined by Garway-Heath et al.19 (see Figs. 2A, 2B) were averaged and are plotted in Figures 2 and 3.
Given the difference in ages between the control (53.4 ± 11.5 years) and patient (65.9 ± 10.2 years) groups, it was important to consider the possible effects of age on the measures used in the study. Although the total deviations in threshold sensitivities in the glaucomatous eyes and the 82 normal eyes are corrected for age by the visual field perimeter’s internal normative database, the OCT RNFL thicknesses are not. As OCT thickness has been reported to decrease with age, it was our intention to correct our OCT RNFL thickness measures for age by using the values for the 82 control individuals. However, in our control group, there was a slight tendency of OCT RNFL thickness in the arcuate regions to increase with age. For the superior and inferior arcuate regions, the positive slopes were 0.11 and 0.18 μm/y. Further, less than 1.5% of the variance in RNFL thickness was accounted for by age. The average thickness for the total RNFL profiles versus age had a slope of −0.10 μm/y. Although this value is low compared with most values in the literature, it is within the range of values obtained in recent studies.22–25 We do not know why the effects of age were minor in our study, but we speculate that it could be related to the quality of our scans. As mentioned, we routinely performed multiple scans at each visit and chose the scans of highest quality for analysis. Cheung et al.26 recently reported that RNFL thickness decreases with decreased signal strength. Perhaps part of the effect of age is due to lower signal strength in older populations, secondary to, for example, smaller pupils and/or media opacities (some of the normal eyes in this study were also pseudophakic). The signal strength in our study for the 41 control subjects younger than 54.2 years was 9.1, close to the value (8.8) in the 41 control subjects older than 54.2 years.
In any case, a correction for age would have relatively little effect on the variability seen in our study, even if we had used the values in the literature. For example, Budenz et al.24 in a study of 328 control subjects found a value of 0.2 μm/y. If we had used this value and normalized the OCT thickness relative to 50 years of age, as in 24-2 SAP, then the age-corrected points would change; but the change would be relatively small, between −6 and +6 μm for 20 to 80 years of age.
To further confirm that the younger control subjects’ data were not affecting our results, we compared the summary statistics for the full control group with those of the 38 control subjects older than 55 years (mean age of 62.5 years, close to the patients’ mean age of 65.9 years). For the inferior disc sector, the summary statistics of older versus the full control group were nearly identical: mean ± SD of 143.7 ± 18.2 μm versus 142.6 ± 17.8 μm and 95% confidence limits of 174.3 and 108.5 μm versus 173.5 and 109.3 μm. And, for the superior disc sector, the summary statistics of older versus the full control group were very similar: mean ± SD of 126.8 ± 16.6 μm vs. 125.8 ± 17.5 μm and 95% confidence limits of 150.1 and 88.7 μm vs. 155.8 and 87.6 μm.
Formally, the RNFL thickness in an individual is described as
$$r = s + b \tag{1}$$
$$r = s_o t + b, \quad t < 1.0 \tag{2a}$$
$$r = s_o + b, \quad t \geq 1.0 \tag{2b}$$
where r is the RNFL thickness; s is the thickness of the axon portion; t is relative sensitivity and is equal to $10^{0.1d}$, where d is the individual’s total deviation in decibels from the mean age-matched machine norms; $s_o$ is the value of s in equation 1 when sensitivity is normal (t = 1.0); and b is the residual thickness. Note that an individual’s RNFL thickness is $s_o + b$ when the field is normal (t ≥ 1.0); that is, RNFL thickness does not depend on t for normal control subjects.
For a group of patients, thickness is described as
$$R = S_o T + B, \quad T < 1.0 \tag{3a}$$
$$R = S_o + B, \quad T \geq 1.0 \tag{3b}$$
where R, $S_o$, B, and T are the average or median values for the group. Note that the average RNFL thickness is $S_o + B$ when the field sensitivity, on average, is normal (T ≥ 1.0); that is, it does not depend on T.
In Figure 1A, the curve predicted by the SLM (equations 3a and 3b) is illustrated, with the mean values for a group of control subjects shown as the filled square and the values of B and $S_o + B$ indicated by the dashed lines. Fitting the model requires estimating $S_o$ and B. B, the asymptotic value when all sensitivity and all axons are lost, can be estimated as the median of the patients’ data for field losses more extreme than −15 dB (where the RNFL thickness becomes asymptotic). $S_o$, the thickness of the axon portion in a healthy RNFL, can be obtained by using the mean of the control RNFL thicknesses as the estimate of $S_o + B$.
Hood and Kardon6 modified the linear model to take into consideration individual variations in RNFL thickness. In particular, they added the assumption that:
Individuals with healthy vision differ in RNFL thickness. They have different $s_o + b$ values. Individuals with healthy vision also have different b values, which are approximated as ⅓ ($s_o + b$) for the arcuate regions. In particular, based on data from a group of patients with severe unilateral anterior ischemic optic neuropathy (AION; correlating b from the affected eye with $s_o + b$ from the fellow unaffected eye),11 it was assumed that
$$b = \tfrac{1}{3}\,(s_o + b) \tag{4}$$
Thus, to derive the predicted curve for a group of patients, one must know the value of $S_o + B$. By definition, this is the mean value for a group of age-matched control subjects. The predicted curve for the mean RNFL thickness, shown as a bold black line in Figure 1B, can therefore be determined from the mean of the control measures (filled square). This theoretical curve does not depend on the patients’ data. In other words, it is derived without any degrees of freedom.
According to this model, every individual would move down his or her own curve as glaucoma progressed and field sensitivity and RNFL thickness were lost. In particular, an individual with a relatively large RNFL thickness would start along a curve that was above the mean curve at that individual’s $s_o + b$ value and asymptote at one third that value, as shown by the top dashed curve in Figure 1B. Thus, an individual’s curve is determined by the RNFL thickness $s_o + b$ when it is healthy (d = 0 dB and t = 1.0). Therefore, the model predicts that there should be variability in the data, even if there is no measurement error, based on the starting RNFL thickness in a healthy eye, which varies among normal subjects. Hood and Kardon6 obtained an approximation of this range by plotting two dashed curves (Fig. 1B), encompassing the 95% range in normal OCT values.
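The individual curves described above can be sketched in a few lines. This is an illustrative sketch rather than the authors' implementation: the function names are hypothetical, and the residual fraction of one third is the approximation stated in the text for the arcuate regions.

```python
def relative_sensitivity(d_db):
    """Relative sensitivity t for a total deviation d in dB (t = 10^(0.1 d))."""
    return 10 ** (0.1 * d_db)

def slm_thickness(d_db, healthy_thickness_um, residual_fraction=1 / 3):
    """Predicted RNFL thickness for a field loss d_db, given the thickness
    of the same eye when healthy (the sum of axon and residual portions)."""
    b = residual_fraction * healthy_thickness_um   # residual (nonaxon) portion
    s_o = healthy_thickness_um - b                 # healthy axon portion
    t = min(relative_sensitivity(d_db), 1.0)       # no gain above normal sensitivity
    return s_o * t + b

# Mean curve for the upper field, starting from the control mean of 142.6 um.
mean_curve = [(d, slm_thickness(d, 142.6)) for d in (0, -3, -10, -20, -30)]

# A thicker-than-average healthy eye follows its own, higher curve.
thick_eye_curve = [(d, slm_thickness(d, 175.0)) for d in (0, -3, -10, -20, -30)]
```

Because the only input is the healthy thickness, the mean curve needs no parameter fitted to the patients' data, and each individual's curve asymptotes at one third of that individual's healthy thickness.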
The Hood and Kardon model is incomplete, however. First, it does not distinguish between OCT variability due to within-individual (measurement) error and between-individual error (true individual differences). Second, it does not include variability on the SAP axis (i.e., x-axis in the figures cited later). To extend the logic of the approach just described to variability on both axes requires calculating a 95% boundary that takes both axes into consideration.
Calculating this boundary for the normal control data is relatively easy. With the assumptions stated later in the article, the 95% boundary is described by an ellipse. The ellipses containing 95% of the normal points (see green ellipses in Figs. 1C, 1D, 4A, 4B) were derived by setting the radii of the major axes to 2.45 times the standard deviation of each variable (i.e., RNFL thickness and SAP loss). (For those expecting radii of 1.96 × SD, see, for example, Ref. 27). If we assume that the two variables are normally distributed and independent, this ellipse will be the 95% confidence boundary. To a first approximation, these assumptions are supported by the data. Neither variable fails a test for normality (Kolmogorov-Smirnov test) and, although there is a weak positive correlation between RNFL thickness and SAP loss in normal eyes, it is very small (r = 0.1) and not significant; further, the slope is very shallow.
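The factor of 2.45 can be checked with a short calculation and simulation. The sketch below is illustrative and is not the Monte Carlo procedure used in the study; the function names, sample size, and seed are arbitrary choices. It relies on the fact that, for two independent standard normal variables, the squared distance from the mean follows a chi-square distribution with 2 degrees of freedom.

```python
import math
import random

def coverage_radius(prob=0.95):
    """Radius, in SD units, of the ellipse containing `prob` of two
    independent normal variables: solve 1 - exp(-k^2 / 2) = prob."""
    return math.sqrt(-2.0 * math.log(1.0 - prob))

def simulated_coverage(k, n=200_000, seed=0):
    """Fraction of simulated standardized points falling inside the
    ellipse whose semi-axes are k standard deviations."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n):
        zx = rng.gauss(0.0, 1.0)   # standardized SAP loss
        zy = rng.gauss(0.0, 1.0)   # standardized RNFL thickness
        if zx * zx + zy * zy <= k * k:
            inside += 1
    return inside / n

k95 = coverage_radius(0.95)              # about 2.45
coverage_245 = simulated_coverage(2.45)  # close to 0.95
coverage_196 = simulated_coverage(1.96)  # noticeably below 0.95
```

An ellipse with semi-axes of 1.96 SD would contain only about 85% of the points, which is why the larger factor is needed in two dimensions.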
A similar approach was taken to obtain a 95% confidence boundary for the patients’ data. A series of ellipses were derived and the envelope (green curves in Fig. 1D) of these ellipses was taken as the 95% confidence boundary of the model. The assumptions involved and the method used to derive these ellipses are discussed in the next section. Note that the dimensions of the resulting ellipses were confirmed with Monte Carlo simulations.
We defined a disease state d, which was taken as the “true” loss (no measurement error) in SAP field sensitivity and was a continuous variable from 0 dB (normal sensitivity for any individual) to −30 dB. In particular, let
$$\begin{aligned} Y_{ij}(d) &= \mu_y(d) + e_{yb,i}(d) + e_{yw,ij} \\ X_{ij}(d) &= \mu_x(d) + e_{xb,i} + e_{xw,ij}(d) \end{aligned} \tag{5}$$
where $Y_{ij}(d)$ is the OCT RNFL thickness (micrometers) for the ith individual on the jth test day in disease state d; $\mu_y(d)$ is the mean RNFL thickness over a large (infinite) number of individuals and a large number of test days, all with the same d value (i.e., $\mu_y$ is the population mean without individual variability or measurement error); $e_{yb,i}(d)$ is the between-individual error (individual differences) for the ith individual and depends on d; and $e_{yw,ij}$ is the within-individual error (measurement error) for the ith individual on the jth test day. Similarly, $X_{ij}(d)$ is the SAP field loss (in dB) for the ith individual on the jth test day; $\mu_x(d)$ is the mean SAP field loss over many individuals and many test days; $e_{xb,i}$ is the between-individual error for the ith individual; and $e_{xw,ij}(d)$ is the within-individual error for the ith individual on the jth test day and depends on d.
We assumed that the random variables $e_{yb,i}(d)$, $e_{xb,i}$, $e_{yw,ij}$, and $e_{xw,ij}(d)$ are independent and normally distributed, with means of 0 and standard deviations of $\sigma_{yb}(d)$, $\sigma_{xb}$, $\sigma_{yw}$, and $\sigma_{xw}(d)$ (assumption A). In addition, we made two other assumptions. First, we assumed that the random variables $e_{xb,i}$ (assumption Ba) and $e_{yw,ij}$ (assumption Bb) do not vary with the level of d. Figure 3A provides support for assumption Bb, whereas $e_{xb,i}$ should depend on the sample of patients, not d. Second, we assumed that $e_{yb,i}(d)$ (assumption Ca) and $e_{xw,ij}(d)$ (assumption Cb) do vary with the level of d. In particular, for assumption Cb we assumed that $e_{xw,ij}(d)$ has a mean of 0 and a standard deviation of $\sigma_{xw}(d)$ that is estimated from the repeat-measure data in Figure 3C (solid green curve), as explained later. For assumption Ca, we assumed that $e_{yb,i}(d)$ has a mean of 0 and a standard deviation equal to
$$\sigma_{yb}(d) = \sigma_{yb}(0)\left[\tfrac{2}{3}\,10^{0.1d} + \tfrac{1}{3}\right] \tag{6}$$
That is, we assumed the linear model as expressed in equations 2 and 4. Because all between-individual values of Y (i.e., without measurement error) for a given d will be decreased by the same factor, the value of $\sigma_{yb}$ for this value of d will be decreased by the same factor. This factor is derived from equations 2 and 4: with $b = \tfrac{1}{3}(s_o + b)$, the ratio $(s_o t + b)/(s_o + b)$ equals $\tfrac{2}{3}t + \tfrac{1}{3}$, where $t = 10^{0.1d}$.
The ellipse for any given d can be generated by using radii of 2.45 times the standard deviations of Y and X, which equal $[\sigma_{yb}(d)^2 + \sigma_{yw}^2]^{1/2}$ and $[\sigma_{xb}^2 + \sigma_{xw}(d)^2]^{1/2}$, respectively. The ellipse is centered on the mean (black curve) at $[\mu_x(d), \mu_y(d)]$, where
Figure 1C shows ellipses for six values of d ranging from 0 (normal) to −30 dB for the upper field. The smooth green curves in Figure 1D are the envelopes of a large number of these ellipses and represent the 95% boundaries for all individuals. All parameters needed to derive these ellipses from the model were estimated from the data; that is, no parameter fitting was involved. For the upper field, $\mu_y(0)$ = 142.6 μm; $\sigma_{yb}(0)$ = 16.4 μm; $\sigma_{yw}$ = 6.4 μm; $\mu_x(0)$ = 0.33 dB; $\sigma_{xb}$ = 0.90 dB; $\sigma_{xw}(0)$ = 0.57 dB; and $\sigma_{yb}(d)$ is given by equation 6 and $\sigma_{xw}(d)$ by the solid green curve through the data in Figure 3C. For the lower field, $\mu_y(0)$ = 125.8 μm; $\sigma_{yb}(0)$ = 16.4 μm; $\sigma_{yw}$ = 6.4 μm; $\mu_x(0)$ = 0.00 dB; $\sigma_{xb}$ = 0.90 dB; $\sigma_{xw}(0)$ = 0.57 dB; and $\sigma_{yb}(d)$ is given by equation 6 and $\sigma_{xw}(d)$ by the solid green curve through the data in Figure 3C.
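The construction of a single ellipse can be sketched from the parameter values listed above. This is a hedged illustration, not the authors' code. Two points are assumptions made only for this sketch: the ellipse center is taken as the control mean shifted by d on the SAP axis and scaled by the linear-model factor on the RNFL axis, and the within-individual SAP error for d < 0 must be supplied by the caller, because the curve in Figure 3C is not given in closed form. All function and variable names are hypothetical.

```python
import math

K95 = 2.45            # semi-axis multiplier for a 95% ellipse
RESIDUAL = 1 / 3      # residual fraction b / (s_o + b) from the linear model

# Upper-field parameters as listed in the text (micrometers and dB).
UPPER_FIELD = {
    "mu_y0": 142.6, "sigma_yb0": 16.4, "sigma_yw": 6.4,
    "mu_x0": 0.33, "sigma_xb": 0.90, "sigma_xw0": 0.57,
}

def linear_model_factor(d_db):
    """Factor by which the healthy thickness is reduced at disease state d."""
    t = min(10 ** (0.1 * d_db), 1.0)
    return (1 - RESIDUAL) * t + RESIDUAL

def ellipse_for_state(d_db, params, sigma_xw_d=None):
    """Return (center, semi_axes) of the 95% ellipse for disease state d.
    sigma_xw_d is the SAP repeat-measurement SD at d; it defaults to the
    value for d = 0 and otherwise has to be read from the Figure 3C curve."""
    if sigma_xw_d is None:
        sigma_xw_d = params["sigma_xw0"]
    factor = linear_model_factor(d_db)
    sd_y = math.hypot(params["sigma_yb0"] * factor, params["sigma_yw"])
    sd_x = math.hypot(params["sigma_xb"], sigma_xw_d)
    # ASSUMPTION: center location, as described in the lead-in.
    center = (params["mu_x0"] + d_db, params["mu_y0"] * factor)
    return center, (K95 * sd_x, K95 * sd_y)

center0, (rx0, ry0) = ellipse_for_state(0.0, UPPER_FIELD)
```

For d = 0 this gives semi-axes of roughly 2.6 dB and 43 μm around the control mean. Tracing the envelope of such ellipses over many values of d would give a boundary of the kind shown in Figure 1D, provided the Figure 3C values are available.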
Figure 2 presents the structure-versus-function data for the two arcuate regions as described by the Garway-Heath et al.19 map. Figures 2A and 2B show the RNFL disc sectors and associated SAP field regions for the upper visual field/inferior disc sector and the lower visual field/superior disc sector, respectively. In Figures 2C–F, the OCT RNFL thickness for the appropriate disc sector is plotted against the visual field loss (decibels) for the corresponding visual field region. The data for the 82 control subjects are shown in Figures 2C and 2D, where each open symbol represents the data for one individual. The bold lines indicate the 95% CI (i.e., ± 2 SD) for RNFL thickness (vertical) and field loss (horizontal), with the intersection of these lines indicating the mean of the control subjects’ data. These mean values are shown in Figures 2E and 2F as the filled squares. See the caption to Figure 2 for the CIs and the means.
The OCT results (Figs. 2C, 2D) showed a fairly wide range of RNFL thicknesses for both arcuate regions in normal eyes. The SAP field variability appeared to be less when plotted on a log (decibel) scale. On a linear scale, however, the SAP confidence levels ranged over a factor of 2.7 (upper field) and 2.7 (lower field), larger than the RNFL thicknesses, which ranged over a factor of 1.7 (inferior disc) and 1.8 (superior disc). It should be noted that the SAP results are already age corrected (decibel deviation from age-matched control subjects in the visual field machine’s database), but the OCT results are not (see the Methods section).
The data for all 140 patients with glaucoma are displayed in Figures 2E and 2F as open circles. As previously described,6,9,10 the patients’ RNFL thicknesses decreased with visual field loss, approaching an asymptotic value for field losses more extreme than approximately −10 dB.
The solid black curves in Figures 2E and 2F are the prediction of the SLM described in the Methods section with the parameters in the caption to Figure 1. In general, the theoretical curves capture the central tendency in the data with slightly, but not significantly (binomial test), more points falling below and to the right. For the upper field (Fig. 2E), 62 points fall above and to the left of the theoretical curve, 76 fall below and to the right, and 2 fall on the curve. The equivalent numbers for the lower visual field are 67 above/left, 72 below/right, and 1 on the curve. However, as expected from previous work, there is a fair amount of scatter in the data. The LMV attempts to account for this scatter.
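The binomial comparison of points above and below the curve can be reproduced with a short exact test. The sketch below assumes a two-sided test against an equal split, with points falling on the curve excluded; the text does not state these details, so they are assumptions, and the function name is hypothetical.

```python
from math import comb

def binomial_two_sided_p(k, n):
    """Exact two-sided binomial P value for k successes in n trials under
    p = 0.5. The distribution is symmetric, so the smaller tail is doubled."""
    smaller = min(k, n - k)
    tail = sum(comb(n, i) for i in range(smaller + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Upper field: 62 above/left versus 76 below/right (2 on the curve excluded).
p_upper = binomial_two_sided_p(62, 62 + 76)

# Lower field: 67 above/left versus 72 below/right (1 on the curve excluded).
p_lower = binomial_two_sided_p(67, 67 + 72)
```

Under these assumptions both P values are well above 0.05, in line with the statement that the imbalance is not significant.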
To estimate the within-individual (repeat measurement) variability needed for the LMV, we used the data from the short-term repeat-testing group. In particular, 34 patients and 22 control subjects were tested on five different days (2 control subjects only four times) within a short period. Consider the results for the 22 control subjects first, plotted in Figure 3A as the small open circles. Each symbol shows the standard deviation for an individual’s five examinations as a function of the mean thickness of the superior (blue) and inferior (red) arcuate sector for these examinations. The average standard deviation of the OCT thickness in normal eyes was 6.4 μm for both superior and inferior disc sectors (Fig. 3A, black dashed line). The filled symbols show the results for the 34 patients with glaucoma. The average standard deviation of the OCT thickness was 6.5 μm for both superior (blue) and inferior (red) disc regions, indistinguishable from the control subjects. The outlier with a standard deviation of 36.3 μm is for the lower arcuate RNFL of a patient with a developmental abnormality at the inferior disc border (Fuchs’ coloboma). This patient had a large area without pigment epithelium surrounding the optic nerve inferiorly. This region of atrophy bordered the scanning circle and appeared to be the cause of the interscan variability due to the effects on the RNFL algorithm.
To better examine the trends, we combined the data for the superior and inferior disc regions and took averages for groups of six or seven, after rank ordering for mean OCT thickness. (The outlier was not included.) The open squares in Figure 3A show these grouped data for the patients (black) and control subjects (green). To a first approximation, the repeat-measurement error (standard deviation) is independent of OCT thickness28,29 and is approximately the same for control subjects and patients, at 6.4 μm (Fig. 3A, dashed line), which supports assumption Bb. The same data are shown in Figure 3B, plotted against mean SAP field loss. As might be expected, the RNFL within-individual variability (repeat-measurement error) of normal eyes also appeared independent of the level of field loss.
However, in the patients, the repeat-measurement error of the SAP loss (in dB) was dependent on the mean level of SAP field loss (assumption Cb). It is well known that the visual threshold standard deviation first increases and then, due to a floor effect at 0 dB, decreases with SAP loss (e.g., Refs. 30–32). This same pattern can be seen in the data for the arcuate regions in Figure 3C. In Figure 3C, the standard deviation for an individual’s five SAP examinations is plotted versus the mean of these values for the upper (red) and lower (blue) regions of the field. Note that the standard deviation increases from approximately 0.6 dB (the mean of the control values shown as the large green square and the black dashed line) to a peak in the range of −13 to −20 dB before decreasing again for extreme field losses. The open squares represent the grouped data obtained as in Figure 3A. These data support assumption Cb, and the solid green curve that approximates the grouped data was used for deriving the ellipses in Figures 1C and 1D, as described in the Methods section. (Note that the dashed green curve was drawn through the higher values. As mentioned later in the Discussion section, the LMV’s predictions were also derived with this curve.)
If we assume that RNFL measurement error is independent of mean arcuate RNFL thickness for the control subjects (Fig. 3A), we can estimate the percentage of the variance due to measurement error (within-individual variability) and the percentage due to between-subject variability. Assuming a standard deviation of 6.4 μm for OCT RNFL thickness makes the within-individual (repeat-measure) variance (SD²) 6.4² or 41.0 μm². The total variance for the single examinations of the 82 control subjects is 17.6² or 309.8 μm². Thus, the between-individual RNFL variance is 268.8 μm² (309.8 – 41.0) or approximately 87% of the total variance. (Note that based on a similar analysis for the SAP field results, the between-individual threshold variance for normal eyes is 0.81 dB² or approximately 71% of the total variance.)
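The arithmetic of this partition can be written out as a short sketch. The function name is hypothetical; the calculation simply assumes, as in the text, that the within- and between-individual errors are independent so that their variances add.

```python
def partition_variance(total_sd, within_sd):
    """Split total variance into within- and between-individual parts,
    assuming the two error sources are independent (variances add)."""
    total_var = total_sd ** 2
    within_var = within_sd ** 2
    between_var = total_var - within_var
    return between_var, between_var / total_var

# OCT RNFL thickness in control eyes: total SD 17.6 um, repeat-measure SD 6.4 um.
oct_between_var, oct_between_fraction = partition_variance(17.6, 6.4)
oct_between_sd = oct_between_var ** 0.5   # about 16.4 um
```

The square root of the between-individual variance, about 16.4 μm, matches the between-individual standard deviation listed among the model parameters.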
The data from Figures 2C–F are replotted in Figures 4A–D along with the predictions of the LMV. For the patients’ data in Figures 4C and 4D, the green lines are the boundaries for the 95% region (see Fig. 1D). According to the LMV, 2.5% of the points should fall above and 2.5% below these green curves. In fact, 21 points or 7.5% fell above the curves and 14 points or 5.0% fell below. In total, 35 points or 12.5% fell outside the 95% boundaries. Thus, the LMV captured most of the variability in the data, although there were more points falling outside the 95% limits than predicted by the LMV.
The 35 points with the red dots fell beyond a 95% confidence boundary. We will call these extreme points or “extremes.” Thus, there were 21 upper extremes and 14 lower extremes, while, on average, the model would predict less than 7 above and 7 below. These 35 extremes involved 30 of the 140 patients. The average ages of the patients with extremes (65.9 ± 10.9 years) and nonextremes (65.9 ± 10.0 years) were similar.
The clinical histories and test results of the 30 patients associated with the 35 extreme points (Figs. 4C, 4D, red dots) were carefully scrutinized to see whether there were factors, other than repeat-measurement error, that might have contributed. Our objective was not to make excuses for the model. Rather, these extremes may reveal factors that contribute to the overall variability in structure-versus-function data.
Consider first the two points (patients [P]1 and P2) that fell well outside the other extremes shown in Figure 4D. A close examination of the fundus photos revealed that P2 had previously unrecognized retinal edema near the optic disc secondary to background diabetic retinopathy. Figure 5A shows leakage on the late-phase fluorescein angiogram in the region (red ellipse) that includes the upper arcuate sector (region between red lines in the bottom panel of Fig. 5A). This edema is the likely cause of the increased thickening of the RNFL seen in the temporal and superior temporal regions of the RNFL profile, and on this basis, the patient probably should have been excluded from the study if the edema had been detected initially. The other patient, P1, had a previously unrecognized epiretinal membrane (ERM), as shown in Figure 5B, which contributed to the thickness of the RNFL. Note the irregular inner limiting membrane incorporated within the algorithm’s white lines.
Based on an examination of the records from the 35 points classified as extremes, factors contributing to between-individual variability were identified. These are summarized in the next sections.
As indicated in Figure 5B, the presence of an ERM can add thickness to the RNFL measure and thus be a possible factor contributing to RNFL thickness of upper extremes (i.e., points falling above the predicted curves). Figure 5C provides a second example of an ERM contributing to the thickness in the analysis window (region between the red vertical lines) of the superior arcuate region of the disc. An ERM in the region of the arcuates studied was identified in 5 of the 20 points (patient with edema excluded) associated with upper extremes and in none of those 14 associated with lower extremes. Although this difference between the upper and lower locations was not statistically significant (P = 0.06, Fisher’s exact test, two-tailed), it is of interest because the presence of an ERM certainly adds to the RNFL thickness measured.
Excluding the 4 patients (five points) with ERMs and the patient with DR/edema (two points) leaves 25 patients and 28 extremes. Of the 28 extremes remaining, 14 were upper extremes and 14 were lower extremes, falling above and below the previously defined limits of the model. For each of these extremes, we determined whether the two major temporal blood vessels (BVs; i.e., the temporal artery and vein) were included in our arcuate analysis window. If they were included, they might add to the thickness of the RNFL measured. However, if they were excluded from the circular scan region defining the arcuate regions, the RNFL might be thinner as a result. As indicated in Table 1, most (n = 13) of the 14 upper extremes had two major temporal BVs within the arcuate analysis window, whereas only one had one BV and none had no BVs within the arcuate sector analysis window. On the other hand, of the 14 lower extremes, only 2 had two BVs, whereas 12 had either one (n = 8) or none (n = 4) within the analysis window. In addition, for the two extremes with two BVs, the algorithm did not include the BVs. The difference between the number of eyes with BVs within the arcuates for the upper versus lower extremes was statistically significant (P = 0.0001; two-tailed Fisher’s exact test using a 2 × 2 table with the 0-BV and 1-BV cells in Table 1 combined).
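The two Fisher's exact tests reported for the extremes (ERM present versus absent, and two BVs versus zero or one BV in the analysis window) can be reproduced from the counts given in the text. The following is a sketch using only the Python standard library; the function name and the tie tolerance are our own choices, and the two-tailed rule used here (summing all tables no more probable than the observed one) is one common convention, assumed rather than confirmed as the one used in the study.

```python
from math import comb

def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# ERM present vs absent: 5 of 20 upper extremes, 0 of 14 lower extremes.
p_erm = fisher_exact_two_tailed(5, 15, 0, 14)

# Two BVs vs zero or one BV in the window: 13 of 14 upper, 2 of 14 lower.
p_bv = fisher_exact_two_tailed(13, 1, 2, 12)

print(f"ERM: P = {p_erm:.3f}")
print(f"BVs: P = {p_bv:.5f}")
```

Under these assumptions the ERM comparison gives P of about 0.06 and the BV comparison gives a value just below 0.0001, in line with the values reported above.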
The major temporal BVs exceed 100 μm in diameter and can contribute a local signal of more than 100 μm to the RNFL thickness (Kay et al, manuscript submitted).33 However, a BV contributes to the RNFL thickness measured only if it is incorporated within the RNFL by the algorithm. Figure 6A provides an example in which the signals from the two major inferior BVs are incorporated within the RNFL thickness determined by the algorithm (Fig. 6A, middle, white lines). The BVs are seen on the fundus photograph (bottom left) lying within the arcuate analysis window shown as the black radial lines in relation to the peripapillary scan line (green). The locations of the signals from these vessels are confirmed by the shadows they cast (arrows) in the expanded scan (bottom right). When the RNFL becomes thin and the signals from the BVs are not surrounded by significant signals from axons, the full extent of the BVs is not included within the boundaries of the algorithm. This point is illustrated in Figure 6B. The signal from a BV is indicated by the red arrow in the middle panel. The lower insets show a magnified version of this scan (right bottom inset) and a higher resolution version (left bottom inset). The BV signal was largely excluded from the RNFL thickness, as determined by the algorithm (white lines).
The scans from all the extremes were examined to see whether the BVs were incorporated by the machine’s algorithm. For the 14 upper extremes, all BVs present were largely incorporated, whereas none of the 14 lower extremes had BVs largely incorporated into the algorithm’s definition of the thickness of the RNFL.
In patients with glaucoma, we6,9–11 showed that an SLM relating structure to function describes the general shape and central tendency of the data when local RNFL thickness is plotted against local visual field loss. This model was confirmed in the present study (Figs. 2E, 2F, black curves). However, the data from individuals show a large degree of scatter around the mean theoretical curve, as is true of similar plots in the literature (see, for example, Refs. 3, 5–8, 15, 16). Our concern in the current study was with the variability in the data. To better understand this variability, we obtained estimates of within- and between-individual variability and tested a model that incorporated the information. In particular, to obtain a prediction of the variability to be expected, a model proposed by Hood and Kardon6 was modified and extended by making explicit assumptions about the within- and between-individual variability in both OCT and SAP measurements.
In agreement with the data, the LMV predicted a fair degree of scatter. However, it failed to account for all the variability seen. In particular, 12.5% of the patients’ data points fell outside the 95% confidence boundaries predicted by the model. Our main purpose in this study was to better understand the possible sources contributing to the overall variability in structure-versus-function data, not simply to understand the deviations from the LMV. In fact, the assumptions and parameters of the LMV, especially our estimates of within- and between-individual variability, as well as the deviations from this model, provide information about the sources of variability in the structure-versus-function data. In this section, we discuss the sources of variability contributing to the scatter in the data in general, consider the factors that may contribute to the deviations from the model, and suggest the steps needed to make the structure–function data more useful clinically.
The LMV makes explicit assumptions about the variability in OCT and SAP measures for both the patients and control subjects, and the data in this article supply estimates of both within- and between-individual variability. To illustrate the relative contributions of the variabilities, Figure 7, in the same form as Figure 1C, shows the 95% ellipses for the combined within- and between-individual variability (the same green ellipses as in Fig. 1C), for between-individual variability excluding within-individual variability (red), and for within-individual variability excluding between-individual variability (blue). For the latter, we considered an individual who starts at the mean normal values (black square); this individual would progress along the solid black curve if no within-individual (measurement) error were present. (Note that the ellipses are based on standard deviations and that variances, not standard deviations, add.)
The standard deviation of the OCT within-individual error was relatively constant as a function of RNFL thickness (Fig. 3A), or SAP field loss (Fig. 3B). Further, the OCT within-individual error was relatively small in both normal control subjects and patients. The standard deviation of the repeat OCT measures was approximately 6.4 μm, consistent with previous measures (see Ref. 23 for a review). These results are similar to those of two recent studies of patients with glaucoma, which reported good OCT repeatability that was independent of mean RNFL level.27,28 In any case, within-individual variability accounted for only 13% of the variance in the control OCT values. It was not a major factor in the scatter seen in the control data, as shown by comparing the vertical dimensions of the blue and green ellipses in Figure 7. With disease progression, the relative contributions of within- and between-individual OCT variability approached similar values.
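Because variances, not standard deviations, add, the figures above imply approximate values for the total and between-individual standard deviations of the control OCT data. The sketch below is illustrative only: it assumes that the 6.4 μm repeat-measure standard deviation applies to the control subjects and that within- and between-individual errors are independent. The derived values are not reported in the study.

```python
import math

# Values taken from the text.
within_sd_um = 6.4    # SD of repeat OCT measures, in micrometers
within_share = 0.13   # within-individual fraction of total control variance

# Assumption: within- and between-individual errors are independent,
# so their variances (not their SDs) sum to the total variance.
within_var = within_sd_um ** 2
total_var = within_var / within_share
between_var = total_var - within_var

total_sd_um = math.sqrt(total_var)
between_sd_um = math.sqrt(between_var)

print(f"total SD:   {total_sd_um:.1f} um")
print(f"between SD: {between_sd_um:.1f} um")
print(f"between-individual share of variance: {between_var / total_var:.0%}")
```

The between-individual share recovered here is the 87% quoted for the control OCT measures, which is simply the complement of the 13% within-individual share.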
Next, consider the role of the within-individual error in SAP measurements. For SAP sensitivity better than about −5 dB (i.e., losses < −5 dB), the variability in the control subjects and patients was reasonably constant (Fig. 3C). In the control subjects, it accounted for only 29% of the variance. However, the variability of the SAP data increased with field losses worse than −8 dB or so (Fig. 3C).29–31 The relative impact of the SAP within-individual error can be seen in Figure 7. For SAP sensitivity better than −6 dB, the horizontal width of the blue ellipses (only within-individual error) was smaller than that of the red ellipses (only between-individual error). The reverse was true for field sensitivity worse than or equal to −6 dB, where within-individual error predominated until extreme field losses were attained.
In general, for mild to moderate field losses (fields better than −6 dB), between-individual variability (i.e., true individual differences) is the dominant factor. Note that the red ellipse is larger than the blue in Figure 7 over this range. Both the model and the deviation in the data from the model provide information about possible sources of between-individual variability of RNFL thickness (i.e., variability not accounted for by repeat-measurement error). Here, we consider a few.
In general, pathologies of the inner retina can contribute to an artificial thickening of the RNFL. Edema, secondary to background diabetic retinopathy, was the cause of the much larger than expected RNFL thickness in P2, and an ERM was the cause of the other obvious outlier, P1, in Figure 4B. In fact, all five extremes with ERM in the region of the analysis window were upper extremes. ERM has a direct contribution to the RNFL thickness, as the algorithm incorporates its thickness. It also has an indirect influence by artificially increasing thickness by creating small “puckered” regions in the RNFL (see Figs. 5B, 5C). It should be noted that some ERMs are not obvious on first inspection and sometimes require closer scrutiny of other scan locations to identify a distinct ERM.
Mapping between local SAP regions and local disc sectors can vary from individual to individual.6,10,19 Garway-Heath et al.19 estimated that the 95% CI for the location on the optic disc associated with a particular SAP field location spanned a range of almost 30°. Figures 6C–E illustrate how mapping may contribute to variability. In Figure 6C, the filled red symbols show the RNFL for two patients, P3 and P4, whose RNFL thickness profiles and sample scans are provided in Figures 6D and 6E, respectively. (Note that the peaks of the RNFL profiles in these two subjects appear displaced from those of the average normal subject shown in the overlaid green, yellow, and red boundaries.) The solid red vertical lines are the boundaries of the analysis window for the lower arcuate disc sector. The dashed vertical lines show these boundaries shifted so that the major temporal BVs now fall within these boundaries. The result is a considerable increase in the RNFL thickness (Fig. 6C, red arrows). Shifts of approximately 14° (10 on a 256-point scale) and 18° (13 points) were required, shifts well within the 95% CI of nearly 30° reported by Garway-Heath et al. These two patients were chosen because, although the original analysis window (solid black line on fundus photographs in Figs. 6D, 6E) did not include the major inferior temporal BVs, the shifted window did (dashed lines on fundus photographs).
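The conversion between shifts on the 256-point scan scale and degrees around the disc is simple arithmetic. A minimal sketch follows, assuming the 256 points are evenly spaced around the full 360° circular scan; the function and constant names are hypothetical, and the comparison value of 30° is an approximation of the "almost 30°" range cited above.

```python
POINTS_PER_SCAN = 256  # samples around the circular peripapillary scan

def points_to_degrees(n_points, points_per_scan=POINTS_PER_SCAN):
    """Convert a shift in scan points to degrees around the disc.

    Assumes the points are evenly spaced over a full 360-degree circle.
    """
    return n_points * 360.0 / points_per_scan

shift_a_deg = points_to_degrees(10)  # the smaller shift quoted in the text
shift_b_deg = points_to_degrees(13)  # the larger shift quoted in the text

approx_ci_span_deg = 30.0  # "almost 30 degrees" (Garway-Heath et al.)

for label, shift in (("10 points", shift_a_deg), ("13 points", shift_b_deg)):
    within = shift < approx_ci_span_deg
    print(f"{label}: {shift:.1f} deg, within ~30 deg span: {within}")
```

Each scan point corresponds to about 1.4°, so the two shifts come to roughly 14° and 18°, as stated.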
Although mapping variations may be an important contributor to between-individual OCT variation, there is another factor, considered next, that also contributes.
The location of the major temporal veins and arteries and how they are treated by the machine’s RNFL algorithm are important factors that determine RNFL thickness. To understand the interaction of BVs and the algorithm, we must consider the spatial averaging (filtering) by the machine’s RNFL algorithm. If the algorithm for segmenting RNFL included very local variations in signals, then the lines delimiting the RNFL would be very irregular because every small variation in the signal due, for example, to BVs and random noise, would be included. The algorithm used in the Stratus OCT3 (Carl Zeiss Meditec) avoids this by performing a fair amount of spatial averaging/filtering. However, because of this spatial averaging, the BVs are not treated in a consistent manner (see Figs. 6A, 6B). When BVs are close together or contiguous to signals due to axons (as in normal control subjects and Fig. 6A), they tend to be incorporated into the RNFL thickness by the algorithm. On the other hand, if a BV is isolated with little neighboring signal (as in patients with extreme SAP loss), very little of the BV is incorporated into the RNFL thickness measure. Therefore, even though the major retinal BV branches are not part of the RNFL, the software algorithm usually includes them in the thickness measurement, but the degree to which they are included by the algorithm may vary when the RNFL becomes thin due to significant axon loss. In particular, this probably contributes to the variation in asymptotic RNFL thickness among individuals observed in this and other studies.6,10,11,37,38 That is, the between-individual variation in this asymptotic part of the function is largely due to the number of BVs within the disc region of analysis, and the degree to which these BVs are incorporated into the RNFL thickness by the algorithm (see Fig. 6B) (Kay KY et al., manuscript submitted).33,39,40 It should be possible to decrease OCT between-individual error if the BVs were treated in a consistent way, independent of the state of the RNFL.
Just as individuals differ in their OCT thickness, even without measurement error, so do individuals differ in SAP field sensitivity when healthy. In addition, there may be more sources of between-individual variability of SAP field loss among the patients. For example, if the patients were more likely than the control subjects to have other diseases that decrease SAP field sensitivity without affecting RNFL thickness (i.e., media opacities, outer retinal photoreceptor disease, or damage before atrophy of axons has become complete), it would contribute a source of variability not included in the model and would tend to produce extremes falling above the curve.
This study and all similar studies have shown considerable variability in the relationship between structure and function in glaucoma, even when care is taken to ensure that the data are of the highest quality. Clearly both within- and between-individual variability in OCT and SAP measures contribute, and the LMV attempts to take this into consideration. Of particular note is the variation in the between-individual OCT measures and the impact on variability in the region of mild field losses. Among the factors contributing to the between-individual OCT variance, the location of the major BVs and the variation in mapping among individuals are clearly important. To some extent these two factors are correlated, as the location of the major temporal BVs correlates with the location of the arcuates, although this correlation is far from perfect.
We also identified additional possible factors, not accounted for by the LMV, such as inner retinal pathologies, that can contribute to the considerable between-individual variability in the structural–functional plots in this study (e.g., Figs. 2A, 2B) and, presumably, in similar studies in the past.
We have focused on factors that contribute to variability in structure–function data. We have briefly considered factors that contribute to deviations from the LMV. In addition to ERM and other retinal problems, differences in mapping and the influence of BVs and algorithms, as mentioned earlier, can contribute to the deviation of the data from the model. For example, the way the algorithm treats BVs poses a problem for testing our linear model, as well as a recent nonlinear model proposed by Harwerth et al.16,18 Both models assume that the RNFL is composed of axons and nonaxons (what we call the residual). We assume that the residual portion remains constant with disease progression.6,10 Harwerth et al.16 assumed that the residual, or at least the glial portion, increases with axon loss due to aging and/or glaucoma. Although it is not clear at this point which assumption will ultimately prove to be more accurate, it is clear that the OCT data used to test these assumptions impose limitations. Both approaches assume implicitly that the RNFL algorithm treats the nonaxon portion (residual) the same regardless of the thickness of the axon portion. We have found that this is not true of the RNFL algorithm of the Stratus OCT3 (Fig. 6 and Ref. 40). Normal control subjects typically have the full extent of their BVs included, whereas patients with extreme field loss typically have very little of their BVs incorporated. It should be pointed out that the frequency domain OCT does not necessarily solve this problem, as the critical factor is the algorithm, not the machine. The better resolution available with the frequency domain OCT, however, allows for a clearer identification of the BVs and offers the possibility of developing a procedure to either explicitly include or exclude their contribution.
In any case, an appropriate test of a structure–function model depends on a consistent treatment of signals from BVs or alternatively, analysis of retinal regions not containing large BVs.
Although we have focused on sources of variability in the data, the assumptions of the LMV should be examined as well. For example, we have not considered possible variations in the assumptions regarding normally distributed errors. More important, changes in the assumptions with regard to SAP within-individual error, as well as the other error terms in the model, could decrease the number of points deviating from the model. We can use the LMV to assess the role of the SAP measurement error. The LMV simulated herein assumes that the variability in SAP field loss changes as given by the solid green curve in Figure 3C. With this assumption, 12.5% of the points fall outside the 95% boundary. If instead, the variability (i.e., standard deviation) is assumed to be constant at the control value (Fig. 3C, dashed horizontal line), then 13.6% of the points fall outside. On the other hand, if we assume more variability as indicated by the green dashed curve in Figure 3C, then 8.6% fall outside. The conclusion is that although SAP measurement error can be large, it is not the major factor contributing to the deviation from the LMV. This is in large part due to the asymptotic nature of the structure–function relationship in the range of SAP where variability becomes greatest; large variations in SAP would increase the spread in the horizontal direction of the structure–function plot.
In addition, other assumptions of the model should also be examined. For example, perhaps the simple linear model does not describe all individuals; maybe some patients do not show RNFL thinning as SAP sensitivity loss occurs. Or perhaps, the RNFL gets thicker as glaucoma progresses, as suggested by Harwerth et al.,16 although this is unlikely to explain the extremes of relatively small field losses. In any case, further tests of the basic assumptions of the linear model will be possible as patients are observed over times long enough to see progression.
Structure–function models have important clinical applications that have yet to be realized. One of the most obvious applications is the identification of true progression of optic nerve disease such as glaucoma. If medical or surgical treatment is not adequate, then the decrease in both RNFL and threshold sensitivity in susceptible visual field regions should follow a predictable course over time. Disproportionate changes over time (i.e., loss of function without loss in structure) that do not adhere to the structure–function model’s prediction may signal causes other than glaucoma. For example, advancing cataract or outer retinal disease (i.e., macular degeneration) may be more easily identified by changes in SAP field sensitivity without accompanying RNFL changes.
However, given the between-individual variability observed herein, progression should be tracked using within-individual comparisons (Fig. 7, blue ellipses). In fact, the predictions of the model in Figure 7 suggest two other strategies for optimizing the detection of progression. First, it will be easiest to see progression during the early phases of the disease. Note that the blue ellipses in Figure 7 are relatively small for field losses less than −6 dB, that is, over the region where most of the RNFL thinning takes place. Second, repeat measures can substantially shrink the blue ellipses. Although this may not be practical for SAP measures, it is feasible to repeat the OCT measures and thus reduce the vertical dimension of the blue ellipses.
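The benefit of repeat OCT measures can be illustrated with the standard result that the standard deviation of the mean of n independent measures falls as 1/√n. The sketch below assumes that repeat OCT errors are independent and share the 6.4 μm standard deviation reported above; the function name is hypothetical.

```python
import math

# From the text: SD of a single repeat OCT measure, in micrometers.
SINGLE_SCAN_SD_UM = 6.4

def sd_of_mean(single_sd, n_repeats):
    """SD of the mean of n independent repeats with common SD single_sd."""
    if n_repeats < 1:
        raise ValueError("n_repeats must be at least 1")
    return single_sd / math.sqrt(n_repeats)

for n in (1, 2, 3, 4):
    sd = sd_of_mean(SINGLE_SCAN_SD_UM, n)
    print(f"{n} scan(s): within-individual SD = {sd:.1f} um")
```

Under these assumptions, averaging four scans would halve the vertical extent of the within-individual ellipse. The between-individual component is unaffected by repetition, which is why the gain applies to within-individual comparisons.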
Further, the identification and understanding of extreme points may be of clinical importance. Consider a patient with suspected glaucoma who in fact has a compressive lesion with a reversible loss of visual field. In this case, the RNFL is not damaged, but the SAP threshold is abnormal. This patient would be an upper extreme on the structure–function relationship and, if extreme enough, should alert the physician to an underlying nonglaucomatous cause. Examples of clinically significant “lower outlier” points would include compensatory mechanisms in the visual pathway that could potentially maintain visual perception in the face of structural loss.
A linear model modified to account for variability in plots of RNFL thickness versus SAP field loss predicts most, but not all, of the extensive variability seen in these data. This model and the separation of within- versus between-individual variability, as well as the deviations from the model, helped elucidate the factors contributing to the considerable variability seen in structure-versus-function data. In particular, between-individual variability is a major factor over most of the range of changes in RNFL thickness. Three factors that contribute to RNFL variability in this range were identified. First, diseases of the inner retina (e.g., edema and ERMs) can contribute to artifactual thickening in some patients. Second, the individual variation in visual field to disc mapping plays a role. Third, the location of BVs, as well as the degree to which they are included by the RNFL algorithm, affects variability. Finally, based on the model, suggestions are made for observing progression with structure-versus-function data. In particular, the data from single individuals should be used, only the early stages of the disease should be considered, and repeat OCT RNFL measures should be used.
Supported by National Eye Institute Grants R01-EY-09076 and R01-EY-02115, a grant from the Veterans Administration (Rehabilitation Division and Merit Review), and an unrestricted grant from Research to Prevent Blindness, New York, New York.
The authors thank Chris A. Johnson, PhD, and Vivienne Greenstein, PhD, for comments on the manuscript; Carrie Doyle and Kim Woodward for help in testing the patients; Wallace (Lee) Alward, MD, Young Kwon, MD, PhD, and the Glaucoma Service for help in recruitment and testing the patients; Norma Graham for helping clarify the details of the model; and William H. Swanson, PhD (one of the referees), for a thorough reading and insightful comments that substantially improved the manuscript.
Disclosure: D.C. Hood, None; S.C. Anderson, None; M. Wall, None; A.S. Raza, None; R.H. Kardon, None