Human stereopsis—the perception of depth from differences in the two eyes’ images—is very precise: Image differences smaller than a single photoreceptor can be converted into a perceived difference in depth. To better understand what determines this precision, we examined how the eyes’ optics affects stereo resolution. We did this by comparing performance with normal, well-focused optics and with optics improved by eliminating chromatic aberration and correcting higher-order aberrations. We first measured luminance contrast sensitivity in both eyes and showed that we had indeed improved optical quality significantly. We then measured stereo resolution in two ways: by finding the finest corrugation in depth that one can perceive, and by finding the smallest disparity one can perceive as different from zero. Our optical manipulation had no effect on stereo performance. We checked this by redoing the experiments at low contrast and again found no effect of improving optical quality. Thus, the resolution of human stereopsis is not limited by the optics of the well-focused eye. We discuss the implications of this remarkable finding.
A fundamental question in visual neuroscience is how the eye’s optics, photoreceptors, and subsequent neural mechanisms combine to determine visual performance. Studying visual resolution has proven particularly illuminating. For example, the contribution of optics to letter acuity is now reasonably well understood: Defocus causes a predictable worsening of acuity (Cheng et al., 2004) while correcting the high-order aberrations of the well-focused eye yields predictable improvement (Yoon et al., 2002). Similar changes are observed with contrast sensitivity: Defocusing the eye yields poorer sensitivity, particularly at high spatial frequencies (Campbell & Green, 1965), and correcting the high-order aberrations yields better-than-normal sensitivity (Williams et al., 2000; Yoon & Williams, 2002). Thus, the contribution of optics to visual acuity and contrast sensitivity is reasonably well understood. By inference, the contributions of post-optical receptoral and neural mechanisms have been quantified (Williams, 1985; Banks et al., 1987; MacLeod et al., 1992; Chen et al., 1993).
We know much less about the optical and neural determinants of stereopsis. Humans can discriminate changes in binocular disparity as small as 5arcsec (Westheimer & McKee, 1980), so stereopsis is clearly a very precise visual function. Using an approach similar to that employed in the analysis of the limits of visual acuity and contrast sensitivity, we examined how the eyes’ optics affects the resolution of stereopsis thereby revealing more about the influence of post-optical mechanisms.
Three observers (1 female, 2 male) with normal visual acuity and stereopsis participated. They were emmetropic (spherical and cylindrical refractive errors both smaller than 0.5D). The mean ± standard deviation of the root-mean-square wavefront error for the higher-order aberrations (HORMS) was 0.46 ± 0.23μm for a 6mm pupil, not differing significantly from the normal population. Two observers were authors (BV, GYY).
Stimuli were projected directly into the eyes with two DLP projectors (Sharp PG-M20X, 1024x768 pixels) that had been optically modified. Pixels subtended 24arcsec except in the two-line stereo experiment in which they subtended 12arcsec. Grayscale resolution was 8 bits and the displays were luminance calibrated.
Monocular and binocular stimuli were brought to sharp focus on the retinas with two image-relay optical systems, one for each eye. We improved the optical quality of the retinal images beyond normal, well-focused optics in three ways.
We combined these three manipulations to produce higher image quality than that of normal, well-focused eyes. The phase plates and artificial pupils were placed in pupil-conjugate planes. Accurate alignment of the observer’s visual axes with the light paths was maintained by video monitoring of the positions of the pupils relative to the phase plates, by the observer maintaining fixation on a small binocular marker, and by the observer checking alignment by repeated judgments of the sharpness of a small letter E presented between trials. Potential changes in focus were eliminated by inducing cycloplegia to both eyes (i.e., paralyzing the ciliary muscles that control accommodation) by administering cyclopentolate. Ophthalmic lenses (spherical and cylindrical) in each eye’s light path assured best focus for all conditions. The appropriate lens was chosen by having the observer judge the sharpness of the small E.
We assessed the optical improvement provided by our procedure by comparing monocular contrast sensitivity in three optical conditions: 1) with white light, normal well-focused optics (i.e., no phase plates), and 4mm pupil (a typical diameter for the experimental light level; Spring & Stiles, 1948), 2) with white light, normal well-focused optics and 6mm pupil (larger than typical for the experimental light level), and 3) when the optics were improved by the procedure described above. The stimuli were gratings with sinusoidal luminance variation and space-average retinal illumination of 561Td. Contrast was constant over the central 3° and decreased with a half-Gaussian profile (std = 0.67°) to merge smoothly with the uniform background. Observers initiated stimulus presentations with a keypress. The gratings were oriented +10° or -10° relative to horizontal and were presented for 16.7ms (one frame). We used short durations so that contrast sensitivity would be relatively low and in so doing we assured that the grayscale resolution of our projectors was sufficient to obtain reliable thresholds. The gratings’ phase was randomized from trial to trial. We presented spatial frequencies of 10, 20, 28, and 40 cycles per degree (cpd). After each presentation, observers indicated the grating’s orientation. No feedback was given. The gratings’ contrast was varied according to an adaptive staircase procedure (Watson & Pelli, 1983). Afterward, the psychometric data from six such staircases of 25 trials each were combined and fitted with a cumulative Gaussian using a maximum-likelihood criterion (Wichmann & Hill, 2001). Thresholds and confidence intervals were calculated from those fits and boot-strapping.
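The fitting step described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the two-alternative form of the psychometric function (50% guessing floor), the use of log10 contrast as the stimulus axis, the 75%-correct threshold criterion, the brute-force grid search, and all data values are assumptions made for the example. Bootstrapped confidence intervals are omitted.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_correct(x, mu, sigma):
    """Assumed two-alternative psychometric function: 50% floor, 75% correct at x == mu."""
    return 0.5 + 0.5 * norm_cdf((x - mu) / sigma)

def neg_log_likelihood(data, mu, sigma):
    """data: list of (stimulus level, number correct, number of trials)."""
    nll = 0.0
    for x, n_correct, n_trials in data:
        p = min(max(p_correct(x, mu, sigma), 1e-9), 1.0 - 1e-9)  # guard against log(0)
        nll -= n_correct * math.log(p) + (n_trials - n_correct) * math.log(1.0 - p)
    return nll

def fit_ml(data, mus, sigmas):
    """Brute-force maximum-likelihood fit over a grid of candidate (mu, sigma) pairs."""
    return min(((m, s) for m in mus for s in sigmas),
               key=lambda ms: neg_log_likelihood(data, ms[0], ms[1]))

# Hypothetical pooled staircase data: log10 contrast, correct responses, trials.
true_mu, true_sigma = -1.0, 0.2
levels = [-1.6, -1.4, -1.2, -1.0, -0.8, -0.6, -0.4]
data = [(x, round(100 * p_correct(x, true_mu, true_sigma)), 100) for x in levels]

mus = [-1.5 + 0.05 * i for i in range(21)]      # candidate thresholds (log10 contrast)
sigmas = [0.05 + 0.05 * i for i in range(10)]   # candidate spreads
mu_hat, sigma_hat = fit_ml(data, mus, sigmas)

threshold_contrast = 10 ** mu_hat               # contrast at 75% correct
sensitivity = 1.0 / threshold_contrast          # contrast sensitivity
print(mu_hat, sigma_hat, sensitivity)
```

Pooling the trials before fitting, as in the text, means each contrast level can carry a different number of trials; the likelihood above handles that because every level contributes its own binomial term.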
We measured stereo resolution under the same three optical conditions. In this experiment, the stimulus was a random-dot stereogram specifying sinusoidal corrugations in depth (Figure 1a). To create the stereograms, we first generated a hexagonal lattice with an inter-dot distance of s. Then each dot was displaced in a random direction (distributed uniformly from 0 to 2π) for a random distance (distributed uniformly from 0 to s/2). We copied the randomized lattice into the images for the left and right eyes and then horizontally displaced the dots in each image in opposite directions by half the horizontal disparity. Horizontal disparity was:
where x and y are dot coordinates, and A, f, φ, and α are respectively the corrugation’s peak-to-trough disparity amplitude, spatial frequency, phase, and orientation. Dot density varied from 25–336 dots/deg². Dot size varied from 0.8–1.6arcmin. Anti-aliasing was used so we could present small disparities. A fixation target with dichoptic and binocular elements was presented between stimulus presentations so observers could maintain accurate fixation and assess optical quality. Observers initiated stimulus presentations with keypresses. The corrugation’s orientation was either +10° or −10° from horizontal, and observers indicated after each 600ms presentation which orientation they had seen. By making the corrugations nearly horizontal, we greatly reduced the visibility of monocular artifacts in the stereograms. By using the orientation-discrimination task, we assured that observers had to perceive some spatial structure to perform significantly above chance. No trial-by-trial feedback was provided.
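The stereogram construction can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code: the disparity function is assumed to be a cosine with half-amplitude A/2 (so that A is peak-to-trough), with phase `phi` and orientation `alpha` measured from horizontal; the sign convention for which eye's image shifts which way, the patch size, and the parameter values are likewise hypothetical. Rendering and anti-aliasing are left out.

```python
import math
import random

def hex_spacing(density):
    """Inter-dot distance s (deg) of a hexagonal lattice with `density` dots/deg^2."""
    return math.sqrt(2.0 / (math.sqrt(3.0) * density))

def hex_lattice(s, width, height):
    """Hexagonal lattice roughly covering a width x height patch (deg)."""
    row_step = s * math.sqrt(3.0) / 2.0
    dots = []
    for row in range(int(height / row_step) + 1):
        x0 = s / 2.0 if row % 2 else 0.0          # odd rows are offset by s/2
        for col in range(int(width / s) + 1):
            dots.append((x0 + col * s, row * row_step))
    return dots

def jitter(dots, s, rng):
    """Displace each dot in a uniform random direction by a uniform distance in [0, s/2]."""
    out = []
    for x, y in dots:
        theta = rng.uniform(0.0, 2.0 * math.pi)
        r = rng.uniform(0.0, s / 2.0)
        out.append((x + r * math.cos(theta), y + r * math.sin(theta)))
    return out

def disparity(x, y, A, f, phi, alpha):
    """ASSUMED sinusoidal corrugation: peak-to-trough A (deg), frequency f (cpd)."""
    return (A / 2.0) * math.cos(
        2.0 * math.pi * f * (y * math.cos(alpha) - x * math.sin(alpha)) + phi)

def stereo_pair(dots, A, f, phi, alpha):
    """Copy the dots into both eyes and shift each copy by half the disparity."""
    left, right = [], []
    for x, y in dots:
        d = disparity(x, y, A, f, phi, alpha)
        left.append((x + d / 2.0, y))             # sign convention is an assumption
        right.append((x - d / 2.0, y))
    return left, right

rng = random.Random(0)
A = 2.4 / 60.0                                    # 2.4 arcmin in degrees
f, phi, alpha = 2.0, rng.uniform(0.0, 2.0 * math.pi), math.radians(10.0)
s = hex_spacing(100.0)                            # hypothetical 100 dots/deg^2
base = hex_lattice(s, 3.0, 3.0)
dots = jitter(base, s, rng)
left, right = stereo_pair(dots, A, f, phi, alpha)
print(len(dots), "dots per eye")
```

Because each eye receives half the disparity in opposite directions, the mean horizontal position of every dot pair stays at its jittered lattice position; only the interocular difference carries the corrugation.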
Peak-to-trough disparity amplitude was 2.4arcmin. We chose such a small value to assure that the disparity-gradient limit (Burt & Julesz, 1980; Banks et al., 2004; Filippini & Banks, 2009) was never exceeded. For sinusoidal corrugations, the gradient is approximately 2fA, so the disparity-gradient limit of 1 is reached when f becomes greater than 1/(2A), which for our experiment would be a corrugation frequency of 12.5cpd. We never presented frequencies greater than 10cpd and thereby avoided the disparity-gradient limit.
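The arithmetic behind these numbers can be written out explicitly. Reading 2fA as the peak-to-trough disparity A divided by the half-period 1/(2f) is our interpretation of the approximation; the values of A and f are those given in the text.

```latex
\begin{align*}
A &= 2.4\ \text{arcmin} = 0.04^\circ \\
\text{gradient} &\approx \frac{A}{1/(2f)} = 2fA \\
2fA = 1 \;&\Rightarrow\; f = \frac{1}{2A} = \frac{1}{2 \times 0.04^\circ} = 12.5\ \text{cpd} \\
f = 10\ \text{cpd} \;&\Rightarrow\; 2fA = 2 \times 10 \times 0.04 = 0.8 < 1
\end{align*}
```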
The spatial frequency of the corrugation was varied from trial to trial according to an adaptive staircase procedure to determine the highest frequency at which reliable performance could be obtained. The psychometric data from six staircases of 25 trials each were combined and fit with a cumulative Gaussian using a maximum-likelihood criterion. Threshold was the 75% point on the fitted curve.
We conducted a second corrugation experiment with stimuli of different contrasts. In this case, the retinal illuminance of the background was 561Td.
We also measured stereo resolution using a depth-discrimination task (Blakemore, 1970). Two of the three observers participated. A thin vertical test line (0.037 × 2.25°) was presented 0.27° below a reference line of the same dimensions (Figure 1b). The disparity of the reference line was 0° and the disparity of the test line was varied (both increasing and decreasing from 0). The retinal illuminance of the lines was 891Td. Pixels subtended 12arcsec and anti-aliasing was employed to allow the presentation of small disparities. A fixation target with dichoptic and binocular elements was presented between stimulus presentations allowing observers to maintain accurate fixation and to assess optical quality. Observers initiated stimulus presentations with keypresses. The test and reference lines were presented for 500ms and observers indicated whether the test was in front of or behind the reference. The disparity of the test line was varied using the method of constant stimuli. The psychometric data were combined and the proportion of “behind” responses was calculated for each disparity. A cumulative Gaussian was fit to those data and its standard deviation was taken as the threshold.
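The threshold estimate for this task can be sketched as follows. The code is an illustration only: the response counts are invented, positive disparity is assumed to mean "behind", and the fit is a brute-force maximum-likelihood grid search, which may differ from the authors' fitting procedure.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def neg_log_likelihood(counts, mu, sigma):
    """counts: list of (test disparity in arcsec, 'behind' responses, trials)."""
    nll = 0.0
    for disparity, n_behind, n_total in counts:
        p = min(max(norm_cdf((disparity - mu) / sigma), 1e-9), 1.0 - 1e-9)
        nll -= n_behind * math.log(p) + (n_total - n_behind) * math.log(1.0 - p)
    return nll

def fit_cumulative_gaussian(counts, mus, sigmas):
    """Return the (mu, sigma) pair on the grid that maximizes the likelihood."""
    return min(((m, s) for m in mus for s in sigmas),
               key=lambda ms: neg_log_likelihood(counts, ms[0], ms[1]))

# Hypothetical method-of-constant-stimuli data: 40 trials at each of 9 disparities.
true_sigma = 15.0
disparities = [-40, -30, -20, -10, 0, 10, 20, 30, 40]
counts = [(d, round(40 * norm_cdf(d / true_sigma)), 40) for d in disparities]

mus = list(range(-10, 11))        # candidate bias values (arcsec)
sigmas = list(range(5, 41))       # candidate standard deviations (arcsec)
bias, threshold = fit_cumulative_gaussian(counts, mus, sigmas)
print("bias:", bias, "arcsec; threshold (SD):", threshold, "arcsec")
```

Taking the standard deviation as the threshold separates sensitivity from bias: the mean of the fitted Gaussian absorbs any constant offset in the observer's "behind" judgments, so it does not inflate the threshold.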
We also ran the two-line experiment with different contrasts between the lines and background. In that case, the background had a retinal illuminance of 446Td.
Figure 2a plots contrast sensitivity for both eyes of the three observers under the three optical conditions. The red and blue symbols represent sensitivity with normal, well-focused optics for pupil diameters of 6 and 4mm, respectively. The green symbols represent sensitivity with improved optics. As you can see, contrast sensitivity was higher with improved optics than with normal, well-focused optics in both eyes of all three observers. We subjected the data to a 3-way, repeated-measures ANOVA with factors optical condition, spatial frequency, and eye. There was a statistically significant effect of optical condition: Highest sensitivity was observed with improved optics and lowest with normal optics and 6mm pupil [F(2,4) = 24.711, p = 0.006; missing data at 40 and 28cpd for observer GYY were assigned sensitivities of 1]. The improvements were in some cases quite large. The contrast sensitivity of observer BV increased nearly 7-fold at 28cpd in his left eye from the normal, 6mm condition to the improved condition. The increase was more than 4-fold for HRF at 28cpd, right eye and nearly 4-fold for GYY at 20cpd, left eye. The improvement in sensitivity from the normal, 4mm condition to the improved optics condition was also statistically significant [F(2,2) = 22.863, p = 0.041]. These results show that our procedure for producing sharper-than-normal retinal images was quite effective.
Figure 2b shows the stereo resolution thresholds—the highest discriminable corrugation frequency as a function of dot density—for the three observers under the three optical conditions. The left column shows the whole functions; the right column shows exploded views of the data at high dot densities.
The Nyquist sampling frequency is the highest corrugation frequency that can be conveyed by the random-dot stimulus:
where D is dot density. The diagonal dashed lines in the figure represent this frequency. When dot density was low, the highest discriminable frequency for all three optical conditions was near the sampling limit. (Some thresholds slightly exceeded the Nyquist frequency because the random dot arrangement yielded regions in which local density was higher than overall density.) We conclude that stereo resolution is determined under those conditions strictly by the number of samples in the stimulus. However, when dot density was higher, resolution leveled off at a particular frequency, so something other than sample number is limiting performance there. Our primary interest is in understanding the determinants of that asymptotic frequency.
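To give a feel for the sampling limit, the sketch below evaluates a Nyquist frequency across the range of dot densities used. The specific form f_N = 0.5·√D is an assumption (the half-sampling-rate limit for dots spaced 1/√D apart on average); the exact expression for a jittered hexagonal lattice may differ slightly. The function names are hypothetical.

```python
import math

def nyquist_cpd(density):
    """ASSUMED Nyquist frequency (cpd) for `density` dots/deg^2: 0.5 * sqrt(D)."""
    return 0.5 * math.sqrt(density)

def density_for_nyquist(frequency):
    """Dot density (dots/deg^2) at which the assumed Nyquist limit equals `frequency`."""
    return (2.0 * frequency) ** 2

for D in (25, 100, 232, 336):
    print(D, "dots/deg^2 ->", round(nyquist_cpd(D), 2), "cpd")
```

Under this assumption, the lowest density used (25 dots/deg²) caps resolution at 2.5cpd, whereas the highest (336 dots/deg²) would allow about 9.2cpd, so an asymptote well below that value at high densities cannot be attributed to stimulus sampling.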
The different symbols in Figure 2b represent the data from the three optical conditions: Red for the normal, well-focused condition with 6mm pupil, blue for the normal, well-focused condition with 4mm pupil, and green for the improved optical condition. As you can see, performance leveled off at the same spatial frequency for all three optical conditions. We subjected the data at the two highest dot densities to a repeated-measures ANOVA and there was no reliable effect of optical condition [F(2,2) = 2.74, p = 0.178]. Examining the individual observer data reveals no systematic differences with the possible exception of observer GYY who had a slightly lower asymptotic frequency in the 6mm condition than in the other two (10.5% lower with 6mm pupil than with improved optics). Furthermore, there was no systematic relationship between the quality of individual observers’ optics and their stereo performance. With normal, well-focused optics, HRF had the best image quality (quantified by HORMS), and BV had the poorest (HRF = 0.19μm; BV = 0.7μm). Yet BV had the best stereo resolution (his asymptotic frequency was 5.3cpd in the 6mm condition) and HRF had the poorest (in the same condition, her asymptote was 2.8cpd). Collectively, these results suggest rather remarkably that improving the optics has no effect on stereo resolution. We know that degrading the optics from normal reduces stereo resolution (Westheimer & McKee, 1980; Banks et al., 2004), but it seems that the resolution of stereopsis is not limited by the blur in normal, well-focused eyes.
We tested the generality of our observations by assessing stereo resolution another way. The data points on the right side of Figure 3 represent the results from the two-line, depth-discrimination experiment. The red and blue symbols again represent the data with normal, well-focused optics and 6 and 4mm pupils, respectively. Green symbols represent data with improved optics. There was no systematic relation between performance and optical quality. For the 6mm, 4mm, and improved-optics conditions, stereo resolution for observer BV was respectively 18.7arcsec (95% confidence interval: −4.2, +7.6), 11.3arcsec (−4.6, +6.9), and 14.6arcsec (−4.1, +5.0); for HRF, it was 32.8arcsec (−9.2, +11.4), 27.3arcsec (−5.6, +6.8), and 59.9arcsec (−15.7, +34.1). Thus, the differences across optical conditions were not statistically reliable. There was also no consistent relationship between the optical quality of individual observers and their performance in the task. For example, HRF’s 6mm optical quality was significantly better than BV’s, but her disparity threshold was poorer than his: 32.8 vs 18.7arcsec. Again we conclude that stereo resolution is not limited by the blur associated with normal, well-focused optics.
Improving optical quality increases retinal-image contrast. Perhaps the failure to observe an improvement in stereo resolution was due to saturating non-linearity early in visual processing (MacLeod et al., 1992; Chen et al., 1993) such that the high-contrast dots and lines were effectively clipped and therefore the contrast increase was not retained for processing at later neural stages. If this were the case, improving optical quality should yield better resolution with lower-contrast stimuli. We examined this possibility in two ways.
First, we re-tested observer BV in the corrugation task with low-contrast stimuli. The background was gray with a retinal illuminance of 561Td. Contrast, defined as (L_dot − L_bkgrnd)/L_bkgrnd, ranged from 0.125–1. Dot density was fixed at 232 dots/deg². The results are plotted in Figure 4. Reducing contrast had no reliable effect on the highest discriminable corrugation frequency at contrasts from 0.25–1. At a contrast of 0.125, there was a clear reduction in stereo resolution in the 6mm condition, but this is because the dots became generally invisible. Thus, this particular finding is due to dot visibility rather than the precision of stereo processing.
Second, we redid the two-line experiment with lower-contrast stimuli. Observers BV and HRF participated. The background was gray with a retinal illuminance of 446Td and contrast was varied from 0.125–1. The results are shown on the left side of Figure 3. There was again no systematic effect of optical condition at any contrast provided that the lines were visible. BV’s disparity threshold was not measurable at 0.125 in the 6mm condition because he could not see the lines.
We conclude that the failure to observe an improvement in stereo resolution with improved optics is not a byproduct of clipping due to a saturating non-linearity. Human stereopsis simply does not seem to benefit from better retinal-image quality than that associated with natural viewing.
Humans can discriminate binocular disparities much smaller than a single foveal photoreceptor (Westheimer, 1979), so stereopsis is generally considered an extremely precise process. It is therefore surprising that providing sharper-than-normal retinal images, while improving contrast sensitivity, has no measurable effect on stereopsis.
Perhaps the limiting process is in visual cortex where the two eyes’ images are combined. The standard model of disparity estimation involves disparity-energy calculation in cortex (Ohzawa et al., 1990). Disparity estimation by a population of disparity-energy units is well modeled by local cross-correlation (Anzai et al., 1999; Banks et al., 2004). The receptive fields associated with this computation must be large enough to allow meaningful computation of inter-ocular correlation. But large receptive fields have limited stereo resolution because they cannot signal disparity variation finer than their own size. Consequently, the smallest available receptive fields determine the resolution of stereopsis (Banks et al., 2004; Nienborg et al., 2004; Filippini & Banks, 2009). Presumably, the size of the smallest fields has been determined by the visual diet they have been provided, and that diet is limited in everyday vision by the on-going optical quality of the eye, which is adversely affected by high-order aberrations, chromatic aberration, tear film changes, and accommodative fluctuations (Charman & Heron, 1988; Montés-Micó et al., 2004). Thus, a smaller receptive-field size may not be available because it would have no use in daily life. It would be interesting to know if long experience with improved optics, such as with aberration-correcting contact lenses (Sabesan et al., 2007), would yield smaller receptive fields that would then allow an increase in stereo resolution. But if we apply the same logic to luminance contrast sensitivity and visual acuity, the explanation falls short: Underlying neural mechanisms for luminance processing should also have developed receptive-field sizes that are appropriate for the visual diet they receive. Why then do contrast sensitivity and visual acuity improve with super-normal optics while stereo resolution does not?
Perhaps the difference is due to how eye movements affect performance in visual resolution and stereo resolution tasks. During fixation, the eyes continually jitter, drift, and make corrective micro-saccades (St. Cyr & Fender, 1969; Rucci et al., 2007). The visual system integrates over time, so such movements cause spatial blur in the stored retinal image. There is, however, little to no influence of these small movements on monocular contrast sensitivity at high spatial frequencies (Packer & Williams, 1992) presumably because there are epochs in which the eye is stationary thereby providing a sharp stored image and/or because some movements are parallel to the grating and do not blur the stored image. These small eye movements may have a greater effect on stereopsis. To estimate disparity, the brain cross-correlates the two eyes’ images (Ohzawa et al., 1990; Banks et al., 2004). The cross-correlation operation presumably has its own integration time, which might be rather long given the inability to respond to fast alternations in disparity (Norcia & Tyler, 1984; Nienborg et al., 2005). Therefore, any difference in the movements of the two eyes would degrade the output of the cross-correlator. The movements of the two eyes during fixation are partially uncorrelated (St. Cyr & Fender, 1967), so there will be very few epochs in which both eyes are stationary or in which both are moving parallel to the depth structure. From this observation, we hypothesize that small eye movements during fixation are more detrimental to stereo resolution than to visual resolution (i.e., visual acuity, contrast sensitivity). Specifically, these movements cause changes in disparity estimation that are similar to spatial blur. As a consequence, the blur due to eye movements may be the limit to stereo performance rather than the blur inherent to the optics of normal, well-focused eyes.
This work was supported by NIH Research Grants R01-EY08266 to MSB, R01-EY01499 to GYY, and NWO Rubicon fellowship 446-06-021 to BV. We thank Austin Roorda for assistance in the wavefront measurements.