The minimum number of samples necessary to fully characterize the aberration pattern of the eye is a question under debate in the clinical as well as the scientific community. We performed repeated measurements of ocular aberrations in 12 healthy nonsurgical human eyes and in 3 artificial eyes, using different sampling patterns (hexagonal, circular, and rectangular with 19 to 177 samples, and 3 radial patterns with 49 sample coordinates corresponding to zeros of the Albrecht, Jacobi, and Legendre functions). For each measurement set we computed two different metrics based on the root-mean-square (RMS) of difference maps (RMS_Diff) and the proportional change in the wavefront (W%). These metrics are used to compare wavefront estimates as well as to summarize results across eyes. We used computer simulations to extend our results to “abnormal eyes” (keratoconic, post-LASIK, and post-radial keratotomy eyes). We found that the spatial distribution of the samples can be more important than the number of samples for both our measured and our simulated “abnormal” eyes. Experimentally, we did not find large differences across patterns except, as expected, for undersampled patterns.
Wavefront sensing has become a useful tool to assess the image quality of the eye, with applications to both research and clinical evaluation. Ocular aberrometry has been used for studying ocular properties as a function of accommodation , aging [2,3], or refractive error , as well as for the assessment of refractive correction techniques (refractive surgery [5,6], cataract surgery [7,8], and contact lenses [9-11]), or the correction of ocular aberrations to visualize the eye fundus [12-14]. The evaluation of the optical outcomes of refractive surgery has led to an increasing importance of aberrometry in recent years, and commercial aberrometers are now commonly used to assist in surgery [15,16].
Most current aberrometry techniques measure the ray aberrations of the eye, i.e., the local slopes of the wavefront, by estimating the deviation of the light beams from a reference, either as the light goes into the eye (i.e., laser ray tracing (LRT)  and spatially resolved refractometer (SRR) ) or out of the eye (i.e., Hartmann-Shack ). The wave aberration of the eye is then reconstructed from a discrete number of sampling points. This reconstruction can be local , modal , or a mixture of both. The most widely used method in ocular aberrometry is a modal reconstruction that is based on the expansion of the derivatives of wave aberration as a linear combination of a set of basis functions (most frequently a Zernike polynomial expansion) and a subsequent least-squares fit of the expansion coefficients to the measured gradients .
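The modal reconstruction described above can be sketched as follows. The notation is ours and does not come from the source: c_j are the expansion coefficients, Z_j the Zernike polynomials, s the stacked vector of the 2N measured x and y slopes at the N sample positions, D the 2N×J matrix of basis-function derivatives evaluated at those positions, and ε the measurement noise. Piston is excluded because its derivatives vanish.

```latex
\[
W(x,y) \approx \sum_{j=1}^{J} c_j Z_j(x,y), \qquad
\left.\frac{\partial W}{\partial x}\right|_{(x_i,y_i)} \approx \sum_{j=1}^{J} c_j \left.\frac{\partial Z_j}{\partial x}\right|_{(x_i,y_i)}, \qquad
\left.\frac{\partial W}{\partial y}\right|_{(x_i,y_i)} \approx \sum_{j=1}^{J} c_j \left.\frac{\partial Z_j}{\partial y}\right|_{(x_i,y_i)}
\]
\[
\mathbf{s} = \mathbf{D}\,\mathbf{c} + \boldsymbol{\varepsilon}, \qquad
\hat{\mathbf{c}} = \left(\mathbf{D}^{\mathsf{T}}\mathbf{D}\right)^{-1}\mathbf{D}^{\mathsf{T}}\,\mathbf{s}
\]
```

Written this way, the sampling pattern enters only through the rows of D, which is why both the number of samples and their spatial distribution affect the conditioning of the fit.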
The actual sampling pattern and density differ between aberrometers. The lenslets in a Hartmann-Shack (HS) wavefront sensor are typically arranged in either a fixed rectangular or a hexagonal configuration, and the number of samples ranges from around 50 to more than 15,000 spots within the dilated pupil (the latter, for instance, in the Haso3 128 aberrometer by Imagine Eyes, Orsay, France). Ray-tracing aberrometers (such as LRT or SRR), on the other hand, sample the pupil sequentially and can use a variable sampling configuration. However, because these devices measure sequentially, high sampling densities are typically avoided in order to keep measurement times short.
The optimal number of sampling points represents a trade-off. There has been a tendency to increase the number of lenslets of the HS sensor (i.e., increasing the sampling density) with the aim of improving resolution and the accuracy of the wavefront reconstruction. However, smaller lenslet diameters decrease the amount of light captured by each lenslet and increase the size of the diffraction-limited spots. Although it is possible to optimize the size of the CCD array and the focal length of the lenslets to gain accuracy (pixels per spot), an excessive number of spots can compromise the dynamic range of the device, as well as increase the processing time and potentially decrease the reproducibility, due to the lower signal intensity. In addition, increasing the number of samples may not decrease the variance of the estimates of the wavefront  nor the aliasing error .
The determination of a sampling pattern with the minimum sampling density that provides accurate results is of practical importance for sequential aberrometers, since it would decrease measurement time, and of general interest to better understand the trade-offs between aberrometers. It is also useful to determine whether there are sampling patterns that are better adapted to typical ocular aberrations, or particular sampling patterns optimized for measurement under specific conditions.
To our knowledge, there has not been a systematic experimental study investigating whether increasing the sampling density over a certain number of samples provides significantly better accuracy in ocular aberration measurements, or whether alternative sampling configurations would be more efficient. There have been theoretical investigations of sampling configuration, although the applicability to human eyes should be ultimately tested experimentally.
The first studies on wavefront estimates date from the 1970s. Cubalchini  was the first to study the modal estimation of the wave aberration from derivative measurements using a least-squares method. He concluded that modal estimates of the wavefront obtained using this method were sensitive to the number of samples and their geometry. He advised minimizing the number of samples used to estimate a fixed number of terms and taking the measurements as far from the center of the aperture as possible in order to minimize the variance of higher-order Zernike terms.
In 1997, Rios et al.  found analytically for HS sensing that the spatial distribution of the nodes of the Albrecht cubatures  made them excellent candidates for modal wavefront reconstruction in optical systems with a centrally obscured pupil. This sampling scheme could also be a good candidate for ocular aberrations, due to the circular geometry of the cubature scheme. In addition, as the Zernike order increases (i.e., higher-order aberrations), the area of the pupil more affected by aberrations tends to be more peripheral [21,25,26], and therefore ocular wavefront estimates would potentially benefit from a denser sampling of the peripheral pupil.
He et al.  used numerical simulations to test the robustness of the fitting technique they used for their SRR (least-squares fit to Zernike coefficients) to the interaction between orders as well as the error due to the finite sampling aperture. They found that the error could be minimized by extracting the coefficients corresponding to the maximum complete order possible (considering the number of samples) and by using a relatively large sampling aperture, so that the whole pattern practically covered the measured extent of the pupil. Although this large sampling aperture introduced some error due to the use of the value of the derivatives at the center of the sampling apertures to perform the fitting, and their rectangular pattern did not provide an adequate sampling for radial basis functions, their simulation confirmed that the overall effect was relatively small.
In 2003, Burns et al.  studied computationally the effect of different sampling patterns on measurements of wavefront aberrations of the eye by implementing a complete model of the wavefront processing used with a “typical” HS sensor and modal reconstruction. They also analyzed the effect of using a point estimator for the derivative at the center of the aperture, versus using the average slope across the subaperture, and found that the latter decreased modal aliasing somewhat but made little practical difference for the eye models. Given that the higher-order aberrations tended to be small, their modal aliasing (leakage of a high order into a lower order) was subsequently small. Finally, they found that nonregular sampling schemes, such as cubatures, were more efficient than grid sampling when sampling noise was high. One year later , we compared the aberrations obtained using different patterns to measure experimentally the same eyes, and we applied the previous computational model to test some additional patterns. We concluded that patterns with a very small number of samples failed at reproducing the wave aberration, but for human eyes, the differences across the rest of the patterns were of the order of the measurement error. Spatial distribution of the samples was found to be more relevant than the density.
Recently, Díaz-Santana et al.  and Soloviev et al.  developed analytical models to test different sampling patterns applied to ocular aberrometry and HS sensing in astronomy, respectively. Díaz-Santana et al.  developed an evaluation model based on matrices. Its sampling-related input parameters were the number of samples and their distribution (square, hexagonal, or polar lattice), the shape of the subpupil, and the size of the pupil and the irradiance across it (uniform irradiance versus Gaussian apodization). The other input parameters were the statistics of the aberrations in the population, the sensor noise, and the estimator used to retrieve the aberrations from the aberrometer raw data. The model of Soloviev et al.  used a linear operator to describe the HS sensing, including the effects of the lenslet array geometry and the demodulation algorithm (modal wavefront reconstruction). When applying this to different sampling configurations, using the Kolmogorov statistics as a model of the incoming wavefront, they found that their pattern with 61 randomly spatially distributed samples gave better results than the regular hexagonal pattern with 91 samples of the same subaperture size (radius=1/11 times the exit pupil diameter), which completely covered the extent of the pupil in the case of the 91-sample pattern. In these theoretical models, an appropriate statistical input is crucial so that their predictions can be generalized to the population. It has been recently found  that high-order aberration terms show particular relationships (i.e., positive interactions that increase the modulation transfer function over other potential combinations), suggesting that general statistical models should include these relationships in order to describe real aberrations.
In this study, we used a configurable wavefront sensor, LRT, to measure wave aberrations in human eyes, using different sampling patterns and densities. Hexagonal and rectangular configurations were chosen because they are the most commonly used. We also used different radially symmetric geometries to test whether these patterns were better suited for measuring ocular aberrations. These geometries included uniform polar sampling, arranged in a circular pattern, and three patterns corresponding to the zeros of the cubatures of the Albrecht, Jacobi, and Legendre equations. We also tested different densities for each pattern in order to evaluate the trade-off between accuracy and sampling density. To separate variability due to biological factors from instrumental issues arising from measurement and processing, we also made measurements on artificial eyes. Finally, we used noise estimates in human eyes as well as realistic wave aberrations in computer simulations to extend the conclusions to eyes other than normal eyes (referred to in this paper as healthy eyes with no pathological condition and that have not undergone any ocular surgery).
Optical aberrations of the eyes were measured using the LRT technique. In this technique, previously described in detail [17,30], collimated light rays are sequentially delivered through different positions of the pupil, and the light reflected off the retina is simultaneously captured by a cooled CCD camera. Ray aberrations are obtained from the deviations of the centroids of the aerial images corresponding to each entry pupil location with respect to the reference (chief ray). These deviations are proportional to the local derivatives of the wave aberrations, which are typically fit using a Zernike polynomial expansion.
We used a second generation of the instrument  where the illumination source was a fiber-coupled diode laser with a wavelength of 786 nm and a nominal output power of 15 mW. The light was attenuated such that exposure was an order of magnitude below safety limits .
The distribution and density of the sampling pattern were under software control. For this study the following sampling patterns were used: hexagonal (H), evenly distributed circular (C), rectangular (R), and three radial patterns with 49 sample coordinates corresponding to zeros of the Albrecht (A49), Jacobi (J49), and Legendre (L49) functions. The patterns are shown in Fig. 1. Different densities for the hexagonal and circular patterns were also used to sample the pupil: 19, 37, and 91 samples over a 6 mm pupil. In addition, for the artificial eyes, rectangular patterns with 21, 37, 98, and 177 samples were also used. To simplify the reading, we use an abbreviated notation throughout the text, where the letter indicates the pattern configuration and the number indicates the number of sampling apertures; for example, H91 stands for a hexagonal pattern with 91 samples.
The three polymethylmethacrylate artificial eyes used in this work, A1, A2, and A3, were designed and extensively described by Campbell . Nominally, A2 shows only defocus and spherical aberration, while A1 and A3 show different amounts of fifth (term , secondary vertical coma) and sixth (term , tertiary astigmatism) Zernike-order aberrations.
We also measured 12 healthy nonsurgical eyes (eyes R1 to R12; even numbers indicate left eyes, odd numbers right eyes) of 6 young subjects (age=28±2 years). Spherical error ranged from -2.25 to +0.25 diopters (D) (-1.08±1.17 D), and third- and higher-order root-mean-square (RMS) error from 0.17 to 0.62 μm (0.37 μm±0.15 μm). The experiment involving human subjects fulfilled the tenets of the Declaration of Helsinki, and informed consent was obtained prior to the measurements.
A special holder with a mirror was attached to the LRT apparatus for the measurements on the artificial eyes. It allowed the eye to be placed with its optical axis vertical, perpendicular to the LRT optical axis, which minimized the variability due to mechanical instability or the effect of gravity. The pupil of the artificial eye was aligned to the optical axis and optically conjugated to the pupil of the setup. Focusing was achieved in real time by minimizing the size of the aerial image for the central ray.
The pattern sequence was almost identical in the three artificial eyes: H37, H19, H91, C19, C37, H37_2, C91, R21, R37, R98, H37_3, R177, A49, J49, L49, and H37_4. However, for A2 the pattern A49 was measured last in the sequence. As a control, identical H37 patterns were repeated throughout the session (indicated by H37, H37_2, H37_3, and H37_4). A measurement session lasted around 40 min in these artificial eyes.
Pupils were dilated with one drop of tropicamide 1% to achieve pupil diameters of at least 6 mm. A dental impression bite bar attached to the setup helped the subject to keep his/her head still during the process, and a fixation stimulus, consisting of a black radial pattern on a green background, helped the subject to reduce eye movements. Best focus was assessed by the subject while viewing the fixation stimulus and was corrected using a Badal system. The stimulus was aligned with respect to the optical axis of the system and focused at infinity to keep the subject’s accommodation stable during the measurement.
The pupil was monitored (and recorded) during each run using back illumination, which allowed us to detect issues that would affect the measurements, such as tear film breakup, blinking, or large eye movements. When any of these was detected during a run, the subject was asked to blink a few times until feeling comfortable again, rest, or fixate more accurately, respectively, and the measurement was repeated. Custom passive eye-tracking routines were used to analyze the pupil images (captured simultaneously with the retinal images) and to determine the effective entry pupil locations as well as to estimate the effects of pupil shift variability in the measurements. Scan times for these eyes ranged from 1 to 6 s, depending on the number of samples of the pattern.
In the human measurements we used fewer patterns (H37, H19, H91, C19, C37, C91, A49, J49, L49, and H37_2) to keep measurement sessions within a reasonable length of time. To assess variability, each pattern was repeated five times within a session. In addition, the H37 pattern was repeated at the end of the session (H37_2) to evaluate whether there was long-term drift due to fatigue or movement. An entire measurement session lasted around 120 min for both eyes.
The centroids of the corresponding aerial images were computed similarly to previous publications . Ray aberrations (local derivatives of the wave aberrations) were fitted to a seventh-order Zernike polynomial expansion when the number of samples of the sampling pattern allowed (36 or more samples), or to the highest order possible. From each set of Zernike coefficients we computed the corresponding third- and higher-order (i.e., excluding tilts, defocus, and astigmatism) wave aberration maps and the corresponding RMS wavefront errors. All processing routines were written in MATLAB (Mathworks, Natick, Massachusetts). Processing parameters were chosen (as were filters during the measurement to obtain equivalent intensities at the CCD camera) so that in both human and artificial eyes the computation of the centroid was similar and not influenced by differences in reflectance of the eye “fundus.”
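As an illustration of the "highest order possible" rule, the sketch below picks the highest complete Zernike order for a given number of samples. The source states only the 36-sample cutoff for the seventh-order fit. The rule used here, that the number of samples must be at least the number of Zernike terms through that order, (n+1)(n+2)/2 including piston, is our reading of that cutoff and not a documented detail of the authors' routines. The function names are hypothetical.

```python
def n_zernike_terms(order):
    """Number of Zernike terms from order 0 through `order`, piston included."""
    return (order + 1) * (order + 2) // 2


def highest_fit_order(n_samples, max_order=7):
    """Highest complete order whose term count does not exceed n_samples.

    Assumption (ours): one sample is required per Zernike term, which
    reproduces the 36-sample cutoff quoted for the seventh-order fit.
    """
    order = 0
    while order < max_order and n_zernike_terms(order + 1) <= n_samples:
        order += 1
    return order


if __name__ == "__main__":
    for name, n in [("H19", 19), ("R21", 21), ("H37", 37), ("A49", 49), ("H91", 91)]:
        print(name, highest_fit_order(n))
```

Under this assumed rule the 19-sample patterns would be limited to a fourth-order fit, which is in line with their failure to capture higher-order terms reported later in the text.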
The wave aberration estimated using the H91 sampling pattern was used as a reference when computing the metrics, as there is no “gold standard” measurement for the eyes. This fact can limit the conclusions based on the metrics that use a reference for comparison. We tested whether this choice biased our results by checking the effect of using the other pattern with the highest number of samples (C91) as a reference. The conclusions would have been unchanged.
We defined two metrics to evaluate differences between sampling patterns:
RMS_Diff. We obtained a difference pupil map (Diff. Map) by subtracting the wave aberration for the reference pattern from the wave aberration corresponding to the pattern to be evaluated. RMS_Diff is the RMS of this difference map. A larger RMS_Diff corresponds to a less accurate sampling pattern. For each eye, we set a threshold criterion to estimate the differences due to factors other than the sampling patterns. This threshold was obtained by computing the value of the metric for maps obtained using the same pattern (H37) at different times within a session. Differences lower than the threshold are within the measurement variability.
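A minimal sketch of this metric follows, assuming each wave aberration map is supplied as a flat list of wavefront heights (in μm) at the same points inside the pupil. That data layout and the function names are our assumptions for illustration, not the authors' MATLAB implementation.

```python
from math import sqrt


def rms(values):
    """Root-mean-square of a sequence of wavefront heights."""
    return sqrt(sum(v * v for v in values) / len(values))


def rms_diff(test_map, reference_map):
    """RMS of the difference map (test minus reference), point by point."""
    if len(test_map) != len(reference_map):
        raise ValueError("maps must be sampled at the same pupil points")
    return rms([t - r for t, r in zip(test_map, reference_map)])


def exceeds_threshold(test_map, reference_map, threshold):
    """True when the difference is larger than the measurement variability."""
    return rms_diff(test_map, reference_map) > threshold


if __name__ == "__main__":
    # Made-up heights at four pupil points, for illustration only.
    reference = [0.1, 0.1, 0.0, 0.0]
    test = [0.3, -0.1, 0.2, 0.0]
    print(rms_diff(test, reference))
```

Because only third- and higher-order terms are kept, the maps have zero mean over the pupil in principle, so the plain RMS used here is not offset by a piston term.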
W%. This metric is the percentage of the pupil area in which the wave aberration for the test pattern differs from the wave aberration measured using the reference pattern. Wave aberrations were calculated on a 128×128 grid for each of the five repeated measurements for each sampling pattern and for the reference. Then, at each of the 128×128 points, we computed the probability that the differences found between both groups of measurements (for the test pattern and for the reference) arose by chance. Binary maps were generated by setting to one the points with probability values below 0.05 and setting to zero those with probability values above 0.05. W% was then computed as the number of pixels with value one divided by the total number of pixels in the pupil, multiplied by 100. The larger the W%, the less accurate the corresponding sampling pattern. This metric was applied only to human eyes, where, as opposed to artificial eyes, variability was not negligible and repeated measurements were performed.
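The pixel-by-pixel procedure can be sketched as follows. The source does not name the statistical test used at each point, so this example assumes an unpaired, pooled-variance Student t-test, and replaces the probability computation with the equivalent comparison against the two-tailed 0.05 critical value for 8 degrees of freedom (2.306, valid only for five repeats per group). The function names and data layout are hypothetical.

```python
from statistics import mean, variance

# Two-tailed critical t value at the 0.05 level for df = 5 + 5 - 2 = 8.
T_CRIT_DF8 = 2.306


def pixel_differs(test_values, ref_values, t_crit=T_CRIT_DF8):
    """True when the two groups of repeats differ at this pixel (p < 0.05).

    Assumed test: unpaired Student t-test with pooled variance.
    """
    n1, n2 = len(test_values), len(ref_values)
    pooled = ((n1 - 1) * variance(test_values)
              + (n2 - 1) * variance(ref_values)) / (n1 + n2 - 2)
    std_err = (pooled * (1.0 / n1 + 1.0 / n2)) ** 0.5
    delta = abs(mean(test_values) - mean(ref_values))
    if std_err == 0.0:
        return delta > 0.0
    return delta / std_err > t_crit


def w_percent(test_pixels, ref_pixels):
    """Percentage of pupil pixels flagged as different from the reference.

    Each argument is a list with one entry per pupil pixel; each entry
    holds the repeated wavefront heights measured at that pixel.
    """
    flagged = sum(pixel_differs(t, r) for t, r in zip(test_pixels, ref_pixels))
    return 100.0 * flagged / len(test_pixels)


if __name__ == "__main__":
    # Two made-up pixels: the first clearly differs, the second does not.
    ref = [[0.0, 0.1, -0.1, 0.0, 0.0], [0.0, 0.1, -0.1, 0.0, 0.0]]
    test = [[1.0, 1.1, 0.9, 1.0, 1.0], [0.0, 0.1, -0.1, 0.0, 0.0]]
    print(w_percent(test, ref))
```

On the full 128×128 grid, only the pixels that fall inside the pupil would be passed to `w_percent`, so that the denominator matches the total number of pupil pixels.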
To summarize the results obtained for all measured eyes, we performed a procedure that we named ranking. It consists of (1) sorting the patterns, according to their corresponding metric values, for each eye; (2) scoring them in ascending order, from the most to the least similar to the reference, i.e., from the smallest to the greatest value obtained for the metric (from 0, for the reference, to the maximum number of different patterns: 9 for the human eyes and 15 for the artificial eyes); and (3) adding the scores for each pattern across eyes. Since this procedure is based on the metrics, and therefore uses the reference, the conclusions obtained will be relative to the reference.
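The three ranking steps can be sketched as below. The metric values in the usage example are made up for illustration, the function name is hypothetical, and ties are broken by input order, which the source does not specify.

```python
def rank_patterns(metric_by_eye):
    """Sum per-eye rank scores for each sampling pattern.

    metric_by_eye: list with one dict per eye, mapping pattern name to its
    metric value (RMS_Diff or W%). Within each eye the patterns are sorted
    from smallest to largest metric and scored 0, 1, 2, ... so that the
    reference (metric 0) scores 0. Scores are then added across eyes.
    """
    totals = {}
    for metrics in metric_by_eye:
        ordered = sorted(metrics, key=metrics.get)      # step 1: sort
        for score, pattern in enumerate(ordered):       # step 2: score
            totals[pattern] = totals.get(pattern, 0) + score  # step 3: add
    # Return patterns from most to least similar to the reference overall.
    return sorted(totals.items(), key=lambda item: item[1])


if __name__ == "__main__":
    eyes = [
        {"H91": 0.0, "A49": 0.05, "H19": 0.30},
        {"H91": 0.0, "A49": 0.06, "H19": 0.20},
    ]
    print(rank_patterns(eyes))
```

Working with ranks rather than raw metric values keeps an eye with unusually large aberrations from dominating the summary, at the cost of discarding how large the differences between patterns actually were.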
We also performed a statistical analysis, which involved the application of (1) a hierarchical cluster analysis represented by a dendrogram plot using average linkage (between groups), and (2) an analysis of variance (ANOVA; general linear model for repeated measurements, with the sampling patterns as the only factor) to the Zernike coefficients obtained for each pattern, followed by a pairwise comparison (t-test) to determine, in those cases where ANOVA indicated significant differences (p<0.05), which patterns were different. The statistical tests were performed using SPSS software (SPSS, Inc., Chicago, Illinois).
The aim of the hierarchical cluster analysis was to group those patterns producing similar Zernike sets in order to confirm tendencies found in the metrics (i.e., patterns with large metric values can be considered as “bad,” whereas those with small metric values can be considered as “good”). The algorithm for this test starts by considering each case as a separate cluster and then combines these clusters until there is only one left. In each step the two clusters with the minimum Euclidean distance between their variables (Zernike coefficient values) are merged. We performed the analysis eye by eye and also by pooling the data from all eyes (global) to summarize the results. We computed the ANOVA coefficient by coefficient by pooling the data from all the eyes. When probability values were below a threshold of 0.05, i.e., when significant differences existed, the pairwise comparison allowed us to identify the patterns between which those coefficients differed.
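The agglomerative procedure described above can be sketched in a few lines. This is a plain reimplementation of average-linkage clustering with Euclidean distance, written for illustration; it is not the SPSS routine the authors used, and the coefficient vectors in the usage example are made up.

```python
from itertools import combinations
from math import dist


def average_linkage(coefficients):
    """Agglomerative clustering with average (between-groups) linkage.

    coefficients: dict mapping pattern name to its list of Zernike
    coefficients. Returns the merge history as a list of
    (cluster_a, cluster_b, distance) tuples, in the order of merging.
    """
    def avg_distance(a, b):
        # Mean Euclidean distance over all pairs of members of a and b.
        total = sum(dist(coefficients[p], coefficients[q]) for p in a for q in b)
        return total / (len(a) * len(b))

    clusters = [(name,) for name in coefficients]  # each case starts alone
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: avg_distance(*pair))
        merges.append((a, b, avg_distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


if __name__ == "__main__":
    example = {
        "H91": [0.00, 0.00],
        "A49": [0.10, 0.00],
        "H19": [1.00, 1.00],
    }
    for a, b, d in average_linkage(example):
        print(a, b, round(d, 3))
```

The merge history is exactly what a dendrogram draws: each entry is one junction, placed at the height given by the merge distance.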
When computing the RMS_Diff and W% metrics, only third- and higher-order aberrations were considered (i.e., coefficients 7 to 36 in the single-indexing OSA notation ). However, in the statistical analysis of the Zernike coefficients, the second order was also considered (coefficients 4 to 36). In the case of statistical analysis, no references were used, and therefore the results are not relative to any particular sampling pattern.
Figure 2 shows the wave aberration maps (W.A. map) and the difference maps (Diff. map; subtraction of the reference map from the corresponding aberration map) for the third and higher orders corresponding to the 16 patterns used to measure artificial eye A3. The wave aberration map in the top right-hand corner is that obtained using the pattern H91, which is used as the reference. To the left of the map, the corresponding RMS is indicated. The contour lines are plotted every 0.5 μm for the wave aberration maps and 0.1 μm for the difference maps. Positive and negative values in the map indicate that the wavefront is advanced or delayed, respectively, with respect to the reference. The value below each map is the corresponding RMS.
Qualitatively, the wave aberration maps are similar among patterns, except for those corresponding to the patterns with the fewest samples (H19, C19, and R21). As expected, with the undersampled patterns, spherical aberration is predominant and these patterns fail to capture higher-order defects. These differences among patterns are more noticeable in the difference maps, which reveal the highest values for the patterns with the fewest samples, followed by L49, J49, and C37. As expected, the RMS_Diff values for these six patterns were larger than for the other patterns.
RMS_Diff ranged from 0.06 to 0.46 μm (0.15 μm±0.05 μm) across eyes and patterns. Within each eye, we set up a threshold to estimate the differences due to factors other than the sampling patterns. For this purpose, we used repeated runs with H37 at four different times within the session. We subtracted the map obtained for one of the measurements from the map obtained for each of the three other measurements. We computed the threshold as the RMS (analogous to RMS_Diff) of the resulting three maps. The values obtained for the threshold averaged across measurements were 0.07 μm±0.01 μm, 0.09 μm±0.08 μm, and 0.05 μm±0.01 μm for eyes A1, A2, and A3, respectively (0.07 μm±0.03 μm averaged across the three eyes).
Figures 3(a)-3(c) show the values for the metric RMS_Diff obtained for each pattern for artificial eyes A1, A2, and A3, respectively. As previously indicated, the larger the value for RMS_Diff, the less similarity between the pattern and the reference. Within each eye, patterns are sorted by RMS_Diff value in ascending order (from most to least similar to the reference). The thick horizontal line in each graph represents the threshold for the corresponding eye, indicating that differences below this threshold can be attributed to variability in the measurement. The results of eyes A1 and A3 for RMS_Diff are similar: the values for all the patterns are above the corresponding threshold, and the worst patterns (largest value of the metric) are those with the smallest number of samples (H19, C19, and R21), as expected. H37, R177, A49, and C91 were the best patterns for these eyes. In the case of A2, the values of some of the patterns (H37, H37_2, C37, and H19) were below the threshold, indicating that the differences were negligible. The ordering of the patterns for this eye is also different, with H19 and C19 obtaining better results (positions 4 and 6 out of 15, respectively) than for the other eyes. This is probably explained by the aberration pattern of this eye, which has only defocus and spherical aberration. R21, J49, and L49 are the worst patterns in this eye.
When comparing the outcomes for all three eyes, we find the following consistent trends: C91 gave better results than R98; A49 was better than L49 and J49. For patterns with 37 samples, we found that H patterns gave better results than the R patterns.
We performed a hierarchical cluster analysis for A1, A2, and A3 and plotted the resulting dendrograms in Figs. 3(d)-3(f), respectively, below the RMS_Diff plot corresponding to each eye. We have framed each significant cluster indicated by the dendrogram. This allowed us to group patterns that yielded similar results. The groups of patterns obtained in the dendrogram for each eye are consistent with the RMS_Diff plot. The line type (and color online) of the frame indicates whether the group is considered as “good” (solid line), “medium” (dashed line), or “bad” (dotted line), according to the results from RMS_Diff. C37, R37, and R21 differ for A1 and A3. For A2 (with only defocus and spherical aberration), H19 and C19 provide results similar to a denser pattern, as found with RMS_Diff.
Since the number of artificial eyes was smaller than the number of sampling patterns, using an ANOVA on the artificial eyes was not possible. Instead, we performed a Student t-test for paired samples on the three eyes, Zernike coefficient by Zernike coefficient, with the Bonferroni correction (Bonferroni multiple comparison test). Significant differences were found only for coefficient between the patterns R177 and H37.
In summary, for these eyes, the worst patterns according to the RMS_Diff metric were H19, C19, and R21 (fewest samples), and H37, R177, A49, and C91 were the best. For A2, with only defocus and spherical aberration, R21, J49, and L49 were the worst patterns, although the differences with the other patterns were small. Although, as previously stated, these results are relative to our reference, the grouping obtained from the metrics is in agreement with the groups formed by the hierarchical cluster analysis, which does not depend on the reference. Results from a metric that compares individual Zernike terms (Student’s t-test with the Bonferroni correction) showed very few significant differences.
Figure 4 (first row) shows third- and higher-order wave aberration maps (W.A.map) and the corresponding RMSs for each sampling pattern for human eye R12. The contour lines are plotted every 0.3 μm. The map in the top right-hand corner corresponds to the reference pattern H91. Each map is obtained from an average of four (H19) to five measurements. Qualitatively, the aberration maps are quite similar across patterns, although those with fewer samples (H19 and C19) appear less detailed than the others, as expected.
Difference maps (Diff.map), obtained by subtracting the reference map from each pattern map, are plotted in the second row of Fig. 4, with the corresponding RMS (RMS_Diff) indicated below each map. RMS_Diff ranged from 0.04 to 0.38 μm (0.13 μm±0.06 μm) across eyes and patterns. Using a procedure similar to that for the artificial eyes, we determined a threshold for RMS_Diff based on two sets of five consecutive measurements each, obtained at the beginning and at the end of the session using the H37 pattern (H37 and H37_2, respectively). To compute the threshold we subtracted each of the five wave aberration maps of the H37_2 set from each of the corresponding wave aberration maps of the H37 set to obtain the corresponding five difference maps. Next, we computed the average of the five difference maps. The RMS of the average map represents the threshold. RMS_Diff values below this threshold are attributed to the variability of the measurement. For example, the value of the threshold for the eye in Fig. 4 (R12) was 0.15 μm. This means that, in principle, only J49, H19, and C19, which had values greater than this, are practically different from the reference. The most similar patterns were H91, A49, and H37_2.
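The threshold computation for the human eyes (pairwise subtraction of the two H37 sets, averaging of the five difference maps, then RMS of the average) can be sketched as follows. As before, each map is assumed to be a flat list of wavefront heights at matching pupil points, and the function name and example values are ours.

```python
from math import sqrt


def session_threshold(first_set, second_set):
    """RMS of the mean difference map between two sets of repeated maps.

    first_set, second_set: lists of maps (e.g., the five H37 runs and the
    five H37_2 runs), each map a flat list of heights in micrometres.
    """
    # One difference map per pair of corresponding runs.
    diff_maps = [
        [a - b for a, b in zip(map_a, map_b)]
        for map_a, map_b in zip(first_set, second_set)
    ]
    # Average the difference maps point by point.
    mean_diff = [sum(point) / len(point) for point in zip(*diff_maps)]
    # RMS of the averaged map is the threshold.
    return sqrt(sum(v * v for v in mean_diff) / len(mean_diff))


if __name__ == "__main__":
    # Two made-up runs of two pupil points each, for illustration only.
    h37 = [[0.1, 0.0], [0.3, 0.0]]
    h37_2 = [[0.0, 0.0], [0.0, 0.0]]
    print(session_threshold(h37, h37_2))
```

Averaging the difference maps before taking the RMS suppresses run-to-run noise, so this threshold mainly reflects systematic change between the start and the end of the session.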
The mean threshold value that we obtained for all our human eyes (mean RMS_Diff for measurements obtained with H37) was 0.11 μm±0.04 μm, an order of magnitude larger than the standard deviation of the RMS [std (RMS)] for the two sets of five repeated measurements using H37, which was 0.05 μm±0.03 μm. This indicates that std(RMS) is less sensitive to differences between wavefronts than RMS_Diff is.
The third row of Fig. 4 shows maps (Prob.map) representing the value of significance obtained point by point when computing the W% metric. The darker areas indicate a higher probability of a difference. The maps on the fourth row (Sign.map) indicate those points for which the significance value is below the threshold (<0.05), i.e., those points that are significantly different from the reference. The number below each map indicates the corresponding value of the W% metric, which ranged from 0.7% to 80% (29%±13%) across eyes and patterns. We also computed a threshold for this metric, using the two sets of measurements with H37 obtained in each session. For the eye of the example (R12), we obtained a value for the threshold of 20.6%. This implies that differences in patterns other than L49, J49, C37, H19, and C19 (with values for W% above the threshold) can be attributed to the variability of the experiment.
According to this metric, the patterns that differ the most from the reference are C19, H19, C37, and J49. Although H37_2, C91, and A49 are the patterns most similar to the reference, the differences are not significant according to the threshold.
Figures 5(a) and 5(b) show the results obtained for the metrics RMS_Diff and W%, respectively, after ranking across all the human eyes. The scale for the y axis indicates the value that each pattern was assigned in the ranking. This means that the “best” possible score for the ordinate (y) would be 12 (for a pattern that was the most similar to the reference for each of the 12 eyes). Similarly, for a pattern being the least similar to the reference for each of the 12 eyes, the ordinate value would be 120 (12 eyes×10 patterns).
In both graphs, patterns are sorted from the smallest to the greatest value of the metric, i.e., from most to least similar to the reference. The resulting order of the patterns is very similar for both metrics, showing that, as expected, the worst results are obtained for the 19-sample patterns. The best results are obtained for H91, A49, L49, and H37. We found that H patterns generally provide better results than C patterns (for 37 and 19 samples) in the ranking for both metrics. Among the 49-sample patterns, J49 produced the worst results.
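The ranking described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: within each eye the patterns are ranked by metric value (1 = most similar to the reference), and the ranks are summed across eyes. The function name, the pattern subset, and the metric values are hypothetical, and ties are broken by input order, which is an assumption.

```python
def rank_sum(metric_by_eye):
    """Sum per-eye ranks for each pattern.

    metric_by_eye: list of dicts (one per eye) mapping pattern -> metric value.
    Returns a dict mapping pattern -> summed rank (lower is better).
    """
    totals = {}
    for eye in metric_by_eye:
        ordered = sorted(eye, key=eye.get)  # smallest metric value first
        for rank, pattern in enumerate(ordered, start=1):
            totals[pattern] = totals.get(pattern, 0) + rank
    return totals

# Hypothetical RMS_Diff values (microns) for three eyes and three patterns.
eyes = [
    {"H91": 0.05, "A49": 0.07, "C19": 0.20},
    {"H91": 0.06, "A49": 0.04, "C19": 0.25},
    {"H91": 0.03, "A49": 0.08, "C19": 0.18},
]
scores = rank_sum(eyes)
best_to_worst = sorted(scores, key=scores.get)
```

With 3 eyes and 3 patterns the summed rank lies between 3 and 9, in the same way that it lies between 12 and 120 for the 12 eyes and 10 patterns of the study.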
We applied the hierarchical cluster analysis to the human eye data. While we performed the test eye by eye (i.e., we obtained one dendrogram per eye), Fig. 5(c) is a summary dendrogram obtained by pooling the data of all the eyes in the analysis (global). Solid, dashed, and dotted lines indicate “good,” “medium,” and “bad” clusters, respectively, according to the classification obtained from the metrics. This plot is representative of the plots corresponding to the individual eyes. The sampling patterns are distributed in three clusters: C91-A49-H91, J49-L49-C37, and H19-C19, which can be considered as “good,” “medium,” and “bad,” respectively. Although this is the trend across eyes, some individual eyes yielded different results, as shown in Fig. 6. H37 and H37_2 did not form a specific cluster in the global dendrogram and do not follow a specific trend across the eyes, so they were not included in the table. The eyes that differed most were 6, 7, and 8 (where 7 and 8 belong to the same subject), for which the cluster H19-C19 separates out. The least reproducible cluster across eyes was C91-A49-H91.
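To make the grouping step concrete, the sketch below runs an agglomerative clustering on vectors of Zernike coefficients. The distance measure and linkage criterion are not stated in this section, so Euclidean distance and average linkage are assumptions here, and the two-coefficient vectors are hypothetical values chosen only to illustrate how patterns with similar coefficient vectors end up in the same group.

```python
import math

def euclid(u, v):
    """Euclidean distance between two coefficient vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def agglomerate(vectors, n_clusters):
    """Merge the closest clusters (average linkage) until n_clusters remain.

    vectors: dict mapping pattern name -> list of Zernike coefficients.
    Returns a sorted list of sorted clusters.
    """
    clusters = [[name] for name in vectors]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(euclid(vectors[a], vectors[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)

# Hypothetical coefficient vectors (microns), two terms per pattern.
patterns = {
    "H91": [0.30, 0.10], "C91": [0.31, 0.10], "A49": [0.30, 0.11],
    "J49": [0.36, 0.14], "L49": [0.37, 0.14],
    "H19": [0.50, 0.25], "C19": [0.52, 0.26],
}
groups = agglomerate(patterns, n_clusters=3)
```

Because the grouping uses only the coefficient vectors themselves, it does not depend on which pattern is chosen as the reference.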
Finally, we performed an ANOVA (general linear model for repeated measurements) on the Zernike coefficients obtained using the different patterns, followed by a pairwise comparison (paired t-test with the Bonferroni correction) to detect between which patterns differences existed, when indicated by ANOVA.
For each pattern, we computed the number of Zernike coefficients that were significantly different according to the t-test relative to the total number of possible Zernike coefficients, i.e., 33 coefficients×9 alternative patterns. We also computed which coefficient tended to come out the most statistically different across pairs of patterns, i.e., statistically different across the greatest number of patterns. The patterns showing the most differences were C19 (4.7%) and H19 (6.4%), and those showing the least differences were H37, H91, C37, and C91 (1.01% each). Significant differences were found only for the following coefficients: , , , , , and .
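The percentages above are fractions of 33×9 = 297 possible coefficient comparisons per pattern. The short sketch below reproduces that arithmetic. The counts of significantly different coefficients are not reported in the text; the values used here (3, 14, and 19) are back-calculated assumptions that happen to reproduce the reported 1.01%, 4.7%, and 6.4%.

```python
N_COEFFICIENTS = 33  # Zernike coefficients compared (from the text)
N_ALTERNATIVES = 9   # alternative patterns per pattern (from the text)
TOTAL = N_COEFFICIENTS * N_ALTERNATIVES  # 297 possible comparisons

def percent_different(n_significant):
    """Percentage of comparisons that were significantly different."""
    return 100.0 * n_significant / TOTAL

# Assumed counts, inferred from the reported percentages (not reported data).
assumed_counts = {"H37": 3, "C19": 14, "H19": 19}
percentages = {p: percent_different(n) for p, n in assumed_counts.items()}
```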
To summarize, similar results were obtained using both metrics comparing the shape of the wave aberrations (which depends on our reference) in concordance with the cluster analysis (which does not depend on the reference): C91, A49, and H37 were the best patterns, and C19, L49, and H37_2 were the worst. However, the differences were of the order of the variability in most cases. When computing the percentage of differing patterns, those showing most differences were C19 and H19, whereas H37, H91, C37, and C91 showed the least differences. Regarding Zernike coefficients, only a few coefficients were significantly different: , , , , , and .
Artificial eyes are a good starting point to study differences in the sampling patterns because they have fewer sources of variability (only those attributable to the measurement system, such as thermal noise in the CCD, photon noise, etc.) than the human eyes (including also variability due to the subject such as eye movements or microfluctuations of accommodation). We estimated centroiding noise by computing the standard deviation of the coordinates of the centroids for each sample across different repetitions for pattern H37. The mean error (averaged between x and y coordinates) was 0.09 mrad for artificial eyes (37 samples and 3 eyes) and 0.34 mrad for human eyes (37 samples and 12 eyes).
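The centroiding noise estimate described above can be sketched as follows: for each sample, the standard deviation of the centroid coordinates across repetitions is computed separately for x and y, the two are averaged, and the result is averaged over samples. The use of the sample (n−1) standard deviation, the function name, and the centroid values are assumptions for illustration.

```python
import statistics

def centroid_noise(repeats):
    """Mean centroiding noise for one eye and one sampling pattern.

    repeats: list of repetitions; each repetition is a list of (x, y)
    centroid coordinates (mrad), one per sample, in the same order.
    """
    n_samples = len(repeats[0])
    per_sample = []
    for s in range(n_samples):
        sd_x = statistics.stdev([rep[s][0] for rep in repeats])
        sd_y = statistics.stdev([rep[s][1] for rep in repeats])
        per_sample.append((sd_x + sd_y) / 2)  # average of x and y
    return statistics.mean(per_sample)

# Hypothetical centroids (mrad): three repetitions of a two-sample pattern.
runs = [
    [(1.0, 2.0), (0.0, 0.0)],
    [(1.2, 2.0), (0.1, -0.1)],
    [(1.1, 2.0), (0.2, 0.1)],
]
noise = centroid_noise(runs)
```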
RMS_Diff seems to be a good metric for artificial eyes, since it provides quantitative differences between the patterns. However, it would be desirable to rely on an objective independent reference for the computation of this metric, such as an interferogram. This metric shows that, for these eyes, patterns with the greatest number of samples (R177) are not always best (relative to our reference, which had only 91 samples) and that the spatial distribution of the samples is very important. The differences in the ordering observed with eye A2 (with no higher terms than spherical aberration), where patterns with fewer samples gave slightly better results than for the other eyes, support the hypothesis that the wave aberrations present in each particular eye affect the optimum pattern, as would be expected from sampling theory.
The different sorting orders for repeated measures of the same pattern (H37, H37_2, H37_3, and H37_4) indicate that differences of this magnitude are not significant. However, the sorting of the different patterns is consistent across metrics and statistics for each eye.
To evaluate whether sample density affects variability, we computed the standard deviation of RMS_Diff across eyes for each pattern and then sorted the patterns in descending order, according to their corresponding variability. We found that the worst patterns (C37, H19, C19, and R21) also showed a larger variability, indicating that they are less accurate when sampling the aberration pattern.
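A minimal sketch of this variability ordering is given below. It assumes one RMS_Diff value per eye and pattern and uses the sample standard deviation; the function name and the values are hypothetical.

```python
import statistics

def sort_by_variability(rms_diff_by_pattern):
    """Sort patterns by the spread of RMS_Diff across eyes, largest first.

    rms_diff_by_pattern: dict mapping pattern -> list of RMS_Diff values
    (microns), one per eye.
    """
    spread = {p: statistics.stdev(v) for p, v in rms_diff_by_pattern.items()}
    return sorted(spread.items(), key=lambda item: item[1], reverse=True)

# Hypothetical RMS_Diff values (microns) for three artificial eyes.
values = {
    "H91": [0.05, 0.06, 0.05],
    "A49": [0.06, 0.08, 0.07],
    "C19": [0.20, 0.35, 0.10],
}
ordering = sort_by_variability(values)
```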
Conclusions based on the artificial eyes have the advantage of avoiding biological variability but are restricted because they have aberration structures very different from those in human eyes. In our human eyes we also find that the RMS_Diff metric allows us to sort the patterns systematically, and the values of the metric obtained for human and artificial eyes are of the same order. The W% metric also was consistent with RMS_Diff, as well as being more sensitive.
The ranking procedure was successful at summarizing information obtained from the metrics, since the metric values are not as important as sorting the patterns within each eye. However, the main drawbacks of this procedure are that it does not provide information on statistical significance (although the results for the same pattern, H37, obtained for different measurements help to establish significant differences) and that the conclusions are relative to our reference (obtained under the same conditions as the assessed patterns), and therefore these rankings might be dependent on the chosen reference. These drawbacks are overcome by the hierarchical cluster analysis, which classifies the patterns into different groups according to the values of the corresponding vectors of Zernike coefficients and therefore distinguishes between patterns yielding different results. It also helps to place the results obtained from the metrics in a more general context.
As with the artificial eyes, the grouping of the sampling patterns is consistent across metrics. The spatial distribution of the samples is important, given that some patterns with the same number of samples (49) fall into the same group or can even be worse than patterns with a lower number of samples. Similarly, a “good” sampling pattern (A49) is grouped with patterns with a larger number of samples. However, for the real eyes, the conclusions are weaker than for artificial eyes (only differences in patterns with 19 samples are significant), presumably because biological variability plays a major role.
Overall, the undersampling patterns C19 and H19 were consistently among the most variable patterns, and this was confirmed by the ANOVA for Zernike coefficients. We also did not have a problem with long-term drift, since final H37 measurements were not more variable than the standard measurements.
We have found that measurement errors in human eyes prevented us from finding statistically significant differences between most sampling patterns. However, the standard deviations of repeated measurements in this study were less than or equal to those of other studies. The mean variability across patterns and eyes for our human eyes was 0.02 μm (average standard deviation across runs of the Zernike coefficients, excluding tilts and piston). This value is smaller than those obtained by Moreno-Barriuso et al. on one subject measured with an earlier version of the LRT system (0.06 μm), with an HS sensor (0.07 μm), and with an SRR (0.08 μm), and it is smaller than those obtained by Marcos et al., using the same LRT device (0.07 μm for 60 eyes) and a different HS sensor (0.04 μm for 11 eyes). A similar value (0.02 μm) is obtained when computing the average of the standard deviation of the Zernike coefficients (excluding piston and tilts) corresponding to the eye reported by Davies et al. using an HS sensor.
To confirm that random pupil shifts during the measurements contributed negligibly to the wave aberration measurement and the sampling pattern analysis, we examined the effective entry pupils obtained from passive eye-tracking analysis. We selected the most variable sets of series (according to the standard deviation of the RMS wavefront error and the standard deviation of the Zernike coefficients across series, respectively), which corresponded to eyes 1 (H19) and 2 (H37_2), respectively. We found that absolute random pupil shifts across the measurements were less than 0.17 mm for coordinate x and 0.11 mm for coordinate y. The mean shift of the pupil from the optical axis (i.e., centration errors, to which both sequential and nonsequential aberrometers can be equally subject) was in general larger than the random variations.
We compared the estimates of the wave aberrations obtained using the nominal entry pupils with respect to those obtained using the actual pupil coordinates (obtained from passive eye-tracking routines). When pupil shifts were accounted for, measurement variability remained practically constant both in terms of RMS standard deviation (from 0.09 to 0.07 μm and from 0.14 to 0.13 μm for eyes 1 and 2, respectively) and in terms of the average standard deviation of the Zernike coefficients (from 0.06 to 0.05 μm and from 0.03 to 0.03 μm, for eyes 1 and 2, respectively). On the other hand, the differences between the average RMS using nominal or actual entry locations (0.51 versus 0.49 μm for eye 1 and 0.61 versus 0.59 μm for eye 2) are negligible. Also, RMS_Diff values (using the wave aberrations with nominal entry locations as a reference, and wave aberrations with the actual entry locations as a test), 0.02 μm±0.01 μm for eye 1 (mean±std across repeated measurements for the same pattern) and 0.04 μm±0.02 μm for eye 2 are below the threshold for these eyes.
We have learned from artificial eyes that sampling patterns with a small number of samples (19) are good at sampling aberration patterns with no higher-order terms (eye A2). When analyzing our ranking results on normal human eyes, remarkable differences were found only in the patterns with a small number of samples. This is due to the presence of higher-order aberrations and a larger measurement variability in these eyes.
Due to the lack of a “gold standard” measurement, there are some issues that have not been addressed in the experimental part of this work, such as the following: (1) Does the magnitude of some particular aberrations determine a specific pattern as more suitable than others for sampling that particular eye? (2) Will eyes with aberration terms above the number of samples be properly characterized using the different patterns? (3) Will measurements in eyes with aberration terms of magnitude larger than that of normal eyes yield different results?
We have used computer simulations as a tool to address these issues. Simulations were performed as follows: We first assumed a “true” aberration pattern for a simulated eye, which was basically a set of Zernike coefficients (either 37 or 45 terms). From this true aberration pattern, a wavefront was computed as the “true” wavefront. The simulation then involved sampling the wavefront. The sampling was performed by computing a sampling pattern (sample location and aperture size) and computing the wavefront slopes across the sampling aperture. Noise was then introduced into the slope estimates. For this simulation, we used the noise values estimated from the actual wavefront measurements described above. While the simulation software can include light intensity and centroiding accuracy, for the current simulations it was deemed most important to set the variability of the centroid determinations to experimentally determined values. Once a new set of centroids was computed for each sample, a wavefront was estimated using a standard least-squares estimation procedure identical to that described above for the actual data, fitting up to either 17 (for the Hex19 and Circ19) or 37 terms. We calculated 25 simulated wavefronts for each simulated condition, although only the first five sets of Zernike coefficients were used to compute the metrics in order to reproduce the same conditions as in the measurements.
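The simulation loop described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the simulation software used in the study: it uses only three orthonormal Zernike terms (defocus, astigmatism, and spherical aberration), evaluates the slopes at the sample centers rather than across the sampling aperture, adds Gaussian noise directly to the slopes, and fits by ordinary least squares. The pattern geometry, noise level, coefficient values, and all function names are hypothetical.

```python
import math
import random

SQ3, SQ5, SQ6 = math.sqrt(3), math.sqrt(5), math.sqrt(6)

def hex_pattern(rings, spacing):
    """Hexagonal grid of sample centers in normalized pupil coordinates."""
    pts = []
    for q in range(-rings, rings + 1):
        for r in range(-rings, rings + 1):
            if abs(q + r) <= rings:
                pts.append((spacing * (q + r / 2), spacing * r * SQ3 / 2))
    return pts

def slope_rows(x, y):
    """x and y derivatives of Z(2,0), Z(2,2), and Z(4,0) at (x, y)."""
    rho2 = x * x + y * y
    dx = [4 * SQ3 * x, 2 * SQ6 * x, SQ5 * (24 * rho2 - 12) * x]
    dy = [4 * SQ3 * y, -2 * SQ6 * y, SQ5 * (24 * rho2 - 12) * y]
    return dx, dy

def solve(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(m[i][c]))
        m[c], m[p] = m[p], m[c]
        for i in range(c + 1, n):
            f = m[i][c] / m[c][c]
            for k in range(c, n + 1):
                m[i][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def simulate_fit(true_coeffs, points, noise_sd, rng):
    """Sample the 'true' slopes, add noise, and refit the coefficients."""
    rows, slopes = [], []
    for x, y in points:
        for row in slope_rows(x, y):
            clean = sum(c * d for c, d in zip(true_coeffs, row))
            rows.append(row)
            slopes.append(clean + rng.gauss(0.0, noise_sd))
    n = len(true_coeffs)
    # Normal equations (A^T A) c = A^T s of the least-squares problem.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * s for r, s in zip(rows, slopes)) for i in range(n)]
    return solve(ata, atb)

points = hex_pattern(rings=3, spacing=0.3)  # 37 samples
truth = [0.50, -0.20, 0.10]                 # hypothetical coefficients (microns)
rng = random.Random(0)
runs = [simulate_fit(truth, points, noise_sd=0.01, rng=rng) for _ in range(5)]
```

Each entry of `runs` plays the role of one simulated measurement; metrics such as RMS_Diff can then be computed from these coefficient sets in the same way as for the measured data. Changing `rings` to 2 or 5 gives hexagonal grids with 19 or 91 samples.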
First, we verified that the results obtained from the simulations were realistic by using the Zernike coefficients of the real eyes (obtained with the H91 pattern). We sampled the aberrations obtained with the same sampling patterns used in the measurements of our human eyes as well as with R177 (previously used in the artificial eyes), and we obtained the corresponding coefficients. Finally, we applied the different metrics and ranking to these simulated coefficients, sorting the patterns for each metric across all eyes. We also used the hierarchical cluster analysis on these simulated data eye by eye.
Figures 5(d) and 5(e) show the ranking plot for RMS_Diff and for W%, respectively, and Fig. 5(f) shows the dendrogram corresponding to the global hierarchical cluster analysis (i.e., including all the eyes) for the simulated human eyes. The results of the global hierarchical cluster analysis are presented, similar to the actual data, as a summary of the results for each of the 12 simulated eyes. Solid, dashed, and dotted lines indicate “good,” “medium,” and “bad” clusters, respectively, as previously described. Trends similar to those of the measured human eyes are seen, with the main clusters repeating, although individual pairings changed. As with the measured human eyes, shown in Fig. 5(c), H91, C91, and A49 are in the “good” group; J49 and L49 belong to the “medium” group; and H37, H19, and C19, although not clearly within any group, appear in borderline positions. As expected, the pattern R177 was included in the “good” group. We conclude that the simulations provide a good estimate of the performance of the repeated measurements using different sampling schemes in real normal eyes.
Once we had validated our simulations in normal (healthy, nonsurgical) human eyes, we applied the simulations to three different sets of Zernike coefficients corresponding to the following: (1) A keratoconus eye measured using LRT with H37 as a sampling pattern . The main optical feature of these eyes is a larger magnitude of third-order terms (mainly coma) than in normal eyes. RMS for third- and higher-order aberrations was 2.362 μm for the original coefficients used to perform the simulation. (2) A post-LASIK eye measured using LRT with H37 as a sampling pattern . These eyes show an increase of spherical aberration toward positive values and a larger amount of coma after the surgery. RMS for third- and higher-order aberrations was 2.671 μm for the original coefficients used to perform the simulation. (3) An eye with aberrations higher than the seventh order. In this case we used the coefficients up to the seventh order corresponding to the previous post-LASIK eye and added 0.1 μm on the coefficient , simulating a post-radial keratotomy (post-RK) eye. RMS for third- and higher-order aberrations was 2.672 μm for the original coefficients used to perform the simulation.
Figure 7 shows the results obtained for these three eyes, for RMS_Diff [(a), (d), and (g)], for W% [(b), (e), and (h)], and for the hierarchical cluster analysis [(c), (f), and (i)]. The results were consistent across the three eyes, with R177, H37, and A49 being the best patterns, and C19 the worst, for both RMS_Diff and W%. H37 is consistently classified as the best pattern in this case apparently because it is the pattern used to perform the original measurement of aberrations from which the wavefront was computed for the simulations. We should note that the values of RMS_Diff for the keratoconic eye were smaller than for the other two eyes (the first three patterns were not above the threshold for RMS_Diff), indicating that differences from the reference pattern were smaller. The fact that most of the metric values are above the threshold indicates that in these eyes differences are not attributable to variability (although it should be noted that the variability values used in the simulations were obtained from normal eyes and that they may be smaller than those corresponding to pathological/surgical eyes). The cluster analysis results are similar across the three eyes, with the exception of H19, which for the surgical eyes is close to the “good” patterns group. This may be due to the predominance of spherical aberration, characteristic of these eyes.
Although the values of the metrics are larger for these “pathological” eyes, the conclusions obtained from our real eyes seem applicable to eyes with greater amounts of aberrations: Even though patterns with more samples tend to give better results, the spatial distribution of the samples is important. While a large number of samples helps (R177), the correct pattern at lower sampling was more efficient (A49, H91) for eyes dominated by some specific aberrations.
We should note that the conclusions related to pathological eyes presented in this section are obtained from simulation results and should be regarded as a preliminary approximation to the study of sampling patterns in pathological eyes, which should include experimental data.
The analytical model of Díaz-Santana et al. , previously described in the introduction, allowed them to test theoretically different sampling patterns using as a metric the RMS error introduced in wavefront measurements by the different geometries. This model uses as an input the second-order statistics of the population, and hence it is bound to include the interactions reported by McLellan et al. , as long as the population sample and number of Zernike terms are large enough to reflect all possible interactions. The model was applied to an apparently young population of 93 eyes, with aberration terms up to the fourth order, to compare square, hexagonal, and polar geometries. They found that the sampling density did not influence the RMS error much for hexagonal and square grids, whereas lower sampling densities produced a smaller error for polar grids. When comparing grids with different geometries and similar densities, they found, in agreement with our results, that the polar geometry was best (in terms of smaller error), followed by the hexagonal grid. Differences in performance between patterns decreased as density increased.
The analytic model  of Kolmogorov’s statistics proposed by Soloviev et al. indicates that random sampling produces better results than regularly spaced ones. They also reported that aliasing error increases dramatically for regular samplings for fits reconstructing more modes, whereas the associated error of the HS sensor was smaller for irregular masks (with 61 subapertures of 1/11 of the pupil diameter size), probably because an irregular geometry helps to avoid cross coupling. Our experimental study supports their conclusions that simply increasing the number of samples does not necessarily decrease the error of measurement and that sampling geometry is important.
In the current study, we used the Zernike modal fitting to represent the wave aberration because it is the standard for describing ocular aberrations. Smolek and Klyce  questioned the suitability of Zernike modal fitting for representing aberrations in eyes with a high amount of aberrations (keratoconus and postkeratoplasty eyes), reporting that the fit error had influenced the subject’s best corrected spectacle visual acuity. Marsack et al.  revisited this question recently, concluding that only in cases of severe keratoconus (with a maximum corneal curvature over 60 D) did Zernike modal fitting fail to represent visually important aberrations. In the current study we did not address this, but rather restricted our conditions to ones more commonly encountered and for which Zernike modal fitting is expected to be adequate.
We summarize our conclusions as follows:
The authors thank the volunteer subjects for this study for their patience and collaboration, Laura Barrios for her helpful assistance with statistics, and Javier Portilla and Rafael Redondo for their helpful assistance with image processing for pupil tracking. This research was supported in part by National Institutes of Health, USA, grant EY04395 (Burns); Ministerio de Educación y Ciencia, Spain, grant FIS2005-04382; and the European Young Investigators (EURYI) Award, EUROHORCs-ESF (Marcos).
OCIS codes: 330.5370, 330.7310, 330.4300.