To describe an automated method of quantification of specific fundus phenotypes and evaluate its performance in differentiating drusen, the hallmark lesions of age-related macular degeneration (AMD), from similar-looking bright lesions, the pisciform deposits or flecks typical of Stargardt disease (SD).
Fundus macular images of 30 eyes of 30 subjects were studied. Fifteen subjects had a clinical diagnosis of AMD with at least 10 intermediate and/or 1 large drusen, and the other 15 had SD. As a test of bright-lesion separation, AMD and SD subjects were chosen from the heterogeneous phenotypes of each disorder, to be as visually similar as possible. Drusen and fleck properties were quantified from the color images by using an automated method, and a shape classifier was used to classify the images as characteristic of either AMD or SD. Image identification performance was quantified by using the area under the receiver operating characteristic curve (AUC).
All SD subjects demonstrated at least one disease-associated variant of the ABCA4 gene. The method achieved an AUC of 0.936 for differentiating AMD from SD.
Automated quantification of fundus phenotypes was achieved, and the results show that the method can differentiate AMD from SD, two distinctly different genetically associated disorders, by quantifying the properties of the bright lesions (drusen and flecks) in their fundus images, even when the images were visually selected to be similar. Quantification of fundus phenotypes may allow recognition of new phenotypes, correlation with new genotypes, and measurement of disease-specific biomarkers to improve management of patients with AMD or SD.
Age-related macular degeneration (AMD) and Stargardt disease (SD) are the leading causes of visual loss in the United States from adult-onset and juvenile macular degeneration, respectively. Subsets of AMD and SD subjects share clinical symptoms and bear similarities of fundus manifestations, due to lightly colored, subretinal pigment epithelial (sub-RPE) deposits, although the disorders otherwise differ by age of onset, mechanism, and tempo of visual loss.1,2 Drusen, the bright lesions typical of AMD, and the pisciform deposits or flecks typical of SD, are both similar in color and size but differ in shape, with drusen displaying a more rounded shape, whereas the flecks in SD often have a more elongated, fishtail (or pisciform) shape.
At least eight genes and genetic loci have been reproducibly shown to contribute to the development of the AMD phenotype, although only two of these account for large population-attributable risk in the United States: complement factor H (CFH) and ARMS2/LOC387715.3–21 SD is a juvenile macular degeneration, recessive in 90% of cases, caused by mutations in the ABCA4 gene.
Quantifying the characteristic AMD and SD fundus phenotypes is of great interest, because it may allow a better understanding of the relation between the genes responsible for these diseases and the variable gene expression, as demonstrated in the phenotype. It can be used to enrich groups for specific endophenotypes, thereby allowing increased power for gene discovery studies. Differentiating between AMD and SD exclusively by examination of the retina is challenging, especially if only drusen or flecks are present.
Several methods of detecting drusen and other bright lesions in fundus images have been proposed, using mathematical morphology,22 histogram-based adaptive local thresholding,23 background removal and histogram-based thresholding,24 or pixel classification.25 Prior attempts at drusen segmentation and identification relied on thresholding techniques or on methods that reduced choroidal and illumination variation.
We have recently developed a new method to detect flecks and differentiate them from drusen. It uses unsupervised model matching instead of supervised classification, where the models are defined mathematically. The advantage of using unsupervised model matching for this application is that no training set is required, given the small number of available SD images from molecularly confirmed ABCA4-bearing patients.26 Thus, all available fundus images can be used to test the performance of the system. We use a single phenotype metric, the flatness value, to differentiate drusen from flecks.
The purpose of the present study was to describe this automated method of quantification of specific fundus phenotypes and to evaluate its performance in differentiating patients with drusen, the hallmark lesions of age-related macular degeneration (AMD), from patients with similar-looking, pisciform deposits or flecks, while excluding patients with images showing other funduscopic signs of AMD (geographic atrophy and CNV scars) and of SD (bull's eye maculopathy).
In this retrospective study, 30 subjects were included: 15 from a large Iowa cohort of AMD and 15 from a clinically and genetically characterized SD cohort. The study protocol adhered to the tenets of the Declaration of Helsinki. The Institutional Review Board of the University of Iowa approved the study protocol, and, because only deidentified images were used, a waiver of written informed consent was granted. AMD subjects were evaluated if they had sufficient quality fundus images, had a clinical diagnosis of nonexudative AMD with at least one fundus image containing multiple intermediate and/or large drusen (AREDS simple scale grade 3), and were at least 50 years of age. SD subjects were evaluated for inclusion if they had reduced visual acuity before age 35 associated with multiple fundus flecks. In equivocal clinical cases, dyschromatopsia and/or mildly reduced ERGs were used as SD criteria. A single retinal specialist (SRR) selected the subject images, choosing from a pool of more than 100 AMD and 100 SD subjects with heterogeneous phenotypes, to identify a pool of AMD and SD subject images that were as visually similar as possible, and choosing the highest quality single fundus image from each subject (see Figs. 3, 4). This selection was verified by two additional retinal specialists (MDA, EMS).
All SD subjects had molecular genetic evaluation of the ABCA4 gene, as previously described,27 and all 15 SD subjects had at least one disease-causing ABCA4 allele identified, whereas in 7, both alleles were found.
Images were acquired with a 30° camera (FF3; Carl Zeiss Meditec, Inc., Dublin, CA) and recorded on ASA 100 film (Kodachrome; Eastman Kodak, Rochester, NY). Film-based images were captured at 3456 × 2304 pixels and 12 bits per channel with a novel, high-throughput slide-scanning device. Digital images were stored as raw pixel values (.CR2) and converted to uncompressed TIFF format for processing: The green channel was used exclusively for image analysis in this study.
Drusen are round, white-to-yellow lesions that are deposited between the basement membrane of the RPE and Bruch's membrane and can be modeled by ellipsoid shapes of low eccentricity, representing the contours and potential drusen fusion. A drusen model, with SDs α1 and α2 along the minor and the major axes, respectively, follows:
where I(u,v) is the intensity of the image at the pixel location (u,v), β is a shape parameter modeling the sharpness of the drusen boundary, and δ models drusen contrast with the fundus background. To model elliptical shapes with angular orientation, θ, (u,v) can be replaced in equation 1 by rotationally transformed coordinates (u′,v′) = Rθ(u,v).
Flecks are pisciform, yellowish deposits in the sub-RPE potential space. Typically, in SD, flecks vary in shape, including ellipsoid shapes, but at least a few are always more elongated than drusen. A fleck model, with SDs α1 and α2 along the minor and the major axes, respectively, follows:
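A template of this general kind can be sketched as follows. The generalized-Gaussian functional form, the parameter defaults, and the name `lesion_template` are illustrative assumptions, not the paper's exact equations 1 and 2; the sketch only shows how α1, α2, β, δ, and θ could combine to produce round versus elongated templates.

```python
import numpy as np

def lesion_template(size, alpha1, alpha2, beta=2.0, delta=1.0, theta=0.0):
    """Illustrative elliptical lesion template (assumed generalized-Gaussian
    form, not the paper's exact model). alpha1, alpha2: SDs along the minor
    and major axes; beta: sharpness of the lesion boundary; delta: contrast
    with the fundus background; theta: angular orientation in radians."""
    u, v = np.meshgrid(np.arange(size) - size // 2,
                       np.arange(size) - size // 2, indexing="ij")
    # Rotationally transformed coordinates (u', v') = R_theta (u, v)
    up = u * np.cos(theta) + v * np.sin(theta)
    vp = -u * np.sin(theta) + v * np.cos(theta)
    r2 = (up / alpha1) ** 2 + (vp / alpha2) ** 2
    return delta * np.exp(-r2 ** beta)

# Round, druse-like template (alpha1 close to alpha2) versus an elongated,
# fleck-like template (alpha2 much larger than alpha1), as described above.
druse = lesion_template(15, 3.0, 3.5)
fleck = lesion_template(15, 1.5, 6.0, theta=np.pi / 4)
```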
We first detected the bright lesions (flecks and drusen) by trying to match image regions to the above rotation- and scale-invariant models28—a process called template matching. The shape of each of the detected bright lesions was then quantified as its flatness value, and a histogram of all flatness values in the image was created, encoding the shape of all bright lesions in the image. Differentiating AMD patients from SD patients then became a task of finding the discriminant in the histogram space found through linear discriminant analysis.
The goal of lesion model or template matching is to find the parameters of a template that minimize the pixel-to-pixel distance, or difference, between a region in an image, centered on a pixel (u,v), and the lesion model. The distance is defined as the averaged sum of the squared pixel-by-pixel differences. If the distance is below a set threshold, then that lesion model is considered present at the pixel location (u,v).
The template-matching process was applied to each pixel location in the image, within a region adjusted for the scale of the lesion model. The scale was adapted to α1 and α2.
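The matching rule above reduces to a mean-squared-difference comparison against a threshold. A minimal sketch, in which the helper names and the threshold value are illustrative rather than taken from the paper's implementation:

```python
import numpy as np

def template_distance(region, template):
    """Averaged sum of squared pixel-by-pixel differences between an
    image region and a lesion model template of the same shape."""
    return float(np.mean((region - template) ** 2))

def is_match(region, template, t=0.05):
    # A lesion model is considered present at this pixel location when
    # the distance falls below the threshold t (value illustrative).
    return template_distance(region, template) < t
```

In the full system this comparison runs at every pixel location, over a region whose size is adapted to the scale of the lesion model.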
To make the algorithm more robust and less sensitive to noise and background variability of the retinal images, we first convolved the green-channel image with two 4 × 4 Haar (step) filters: a horizontal (H) and a vertical (V) filter. Two transformed images were thus obtained, corresponding to the convolution with those two filters. In our experience, this preprocessing step leads to the highest signal-to-noise ratio when characterizing candidate bright lesions in fundus images, while preserving the shape of the bright lesions. The lesion models were also convolved with the H and V filters, and examples of the resulting transformed models are shown in Figure 2A. The transformed lesion models were matched to image regions in the two transformed images, respectively. The matching was thus performed in Haar space; in other words, the H-transformed template was matched with the H-transformed image, and the V-transformed template was matched with the V-transformed image.
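The Haar preprocessing can be sketched as follows. The exact sign convention and normalization of the 4 × 4 step filters are assumptions; only the general construction (opposite-signed halves, one split per orientation) follows from the description above.

```python
import numpy as np

# 4x4 Haar step filters: +1/-1 halves (illustrative sign convention).
# H responds to horizontal edges, V to vertical edges.
H = np.vstack([np.ones((2, 4)), -np.ones((2, 4))])
V = np.hstack([np.ones((4, 2)), -np.ones((4, 2))])

def filter2d(img, kernel):
    """Plain 'valid' cross-correlation with numpy only; for these
    symmetric step filters it equals convolution up to a sign."""
    kh, kw = kernel.shape
    out = np.empty((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def haar_transform(green):
    """The two transformed images: green channel filtered with H and V."""
    g = green.astype(float)
    return filter2d(g, H), filter2d(g, V)
```

On a uniform background both responses vanish (the filters sum to zero), which is what suppresses slow background variability while edges of bright lesions survive.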
Instead of matching image regions directly to the parameterized models in equations 1 and 2, which would require matching to a large number of lesion models with different parameters, we created a compact filter set derived from a large number of instances of the models, as follows: instances of lesion models were generated from equations 1 and 2, setting β = 2 and δ = 1, based on preliminary studies, and by varying sizes α1 and α2 and rotation θ. Once generated, lesion model instances were grouped according to their sizes α1 and α2, and within each group, a single circular analysis image region was used. The radii for the three scales were 7, 13, and 26 pixels (~35, 65, and 130 μm, respectively), three times as large as the typical lesion size in each scale group, as this was the optimal region size for microaneurysm detection in our previous work.29,30 Thus, we obtained 4080 lesion model instances (1360 for each scale), up to a size of 2096 pixels, 1048 in each transformed image (after convolution with the Haar filters). Principal component analysis (PCA) was then applied to the 1360 model instances of each scale. See Figure 2B for an example of the resulting first few principal components for the 13-pixel scale lesion models. The number (n) of principal components retained for the compact filter set was chosen so that 99% of the variance in the model instances was retained, resulting in n = 13, 26, and 46 filters, respectively, for the three different scales.
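The construction of the compact filter set amounts to a PCA over the flattened model instances, keeping enough components for 99% of the variance. A numpy sketch (the function name and SVD route are illustrative; only the variance criterion comes from the text):

```python
import numpy as np

def compact_filter_set(instances, variance_kept=0.99):
    """Derive a compact filter set by PCA from lesion model instances,
    given as rows of a 2-D array (one flattened instance per row).
    Retains the smallest number n of principal components whose
    cumulative variance reaches `variance_kept`."""
    X = instances - instances.mean(axis=0)
    # SVD of the centered data; squared singular values are proportional
    # to the variance captured by each principal component.
    _, s, vt = np.linalg.svd(X, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    n = int(np.searchsorted(np.cumsum(explained), variance_kept) + 1)
    return vt[:n]  # n orthonormal principal-component filters

# Illustrative usage on synthetic "instances" (100 instances of 50 pixels).
rng = np.random.default_rng(1)
filters = compact_filter_set(rng.random((100, 50)))
```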
The distance between the image region around each pixel and each filter is obtained by the Euclidean distance between their projections onto the compact filter set, after normalization. Approximate nearest-neighbor classification is then used to find the closest lesion model instance among the k = 1360 instances, for each scale, using the collection of all lesion model instances as the training set.26,31
Thus, three closest lesion model instances are found for each pixel, one for each scale, and if their Euclidean distance is below some threshold t, then we define a match for that lesion model instance. Figure 1 (bottom row) shows the closest lesion model instance for each pixel if there was a match for that pixel. The specificity of the lesion-detection algorithm is thus controlled by parameter t. In preliminary experiments, we found that the choice of t usually did not affect performance.
After detection of matching pixels, using the compact filter set, the pixels were clustered by assigning a label to all connected centers of detected templates, in order to prevent the detection of confluent drusen as a single, elongated lesion. Lesions were then reconstructed using the nearest-neighbor template (of 1360) for the pixels in each cluster, and its shape was quantified by applying standard deviational ellipses to the reconstructed lesion: (1) The main direction was identified; (2) the SD of the pixel intensities along the main axis and along the axis orthogonal to the main axis were computed; and (3) the flatness, defined as the ratio between these two standard deviations, was recorded.32 A 5-bin histogram was then created from the flatness values of all reconstructed lesions in an image, to estimate the distribution of the flatness of all lesions in an image.
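Steps 1 to 3 above can be sketched via the covariance of the reconstructed lesion's pixel coordinates, whose eigenvectors give the main direction and whose eigenvalues give the squared SDs of the standard deviational ellipse. The function names, the intensity-weighting option, and the histogram's bin range are assumptions for illustration:

```python
import numpy as np

def flatness(ys, xs, weights=None):
    """Flatness of a reconstructed lesion: ratio of the SD along the main
    axis to the SD along the orthogonal axis, obtained from the eigenvalues
    of the (optionally intensity-weighted) coordinate covariance."""
    pts = np.stack([ys, xs]).astype(float)
    cov = np.cov(pts, aweights=weights)
    evals = np.linalg.eigvalsh(cov)              # ascending order
    return float(np.sqrt(evals[1] / evals[0]))   # major SD / minor SD

def flatness_histogram(values, bins=5, upper=5.0):
    """Normalized 5-bin histogram of the flatness values of all
    reconstructed lesions in one image (bin range illustrative)."""
    hist, _ = np.histogram(np.clip(values, 1.0, upper),
                           bins=bins, range=(1.0, upper))
    return hist / max(len(values), 1)
```

A round lesion yields a flatness near 1, an elongated (fleck-like) lesion a clearly larger value; the per-image histogram then encodes the shape of all lesions at once, as described above.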
A signed flatness value was obtained by projecting the location of each image in the five-dimensional flatness histogram space onto the discriminant direction, as found by linear discriminant analysis: the normal vector to the hyperplane that best separates SD and AMD.
A positive flatness value indicates that the subject has lesions predominantly similar to the model in equation 1, or round drusen, whereas a negative flatness value indicates that the subject has lesions predominantly similar to the model in equation 2, or flecks, because the lesions are flatter. We created a receiver operating characteristic (ROC) curve of the performance of the flatness value to predict whether the patient has AMD or SD, based on the fundus images, by varying the threshold on the flatness value.
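The projection and its ROC evaluation can be sketched with Fisher's closed-form discriminant direction and the rank-sum identity for the AUC. Function names and the synthetic usage are illustrative; the paper's actual LDA fit is not reproduced here.

```python
import numpy as np

def lda_direction(X_amd, X_sd):
    """Fisher discriminant direction in the 5-D flatness-histogram space:
    the normal to the hyperplane best separating the two groups,
    w = Sw^-1 (mean_AMD - mean_SD)."""
    Sw = np.cov(X_amd, rowvar=False) + np.cov(X_sd, rowvar=False)
    return np.linalg.solve(Sw, X_amd.mean(axis=0) - X_sd.mean(axis=0))

def auc(scores_pos, scores_neg):
    """Area under the ROC curve via the Mann-Whitney rank identity:
    the probability that a positive score exceeds a negative one."""
    total = 0.0
    for p in scores_pos:
        for n in scores_neg:
            total += 1.0 if p > n else (0.5 if p == n else 0.0)
    return total / (len(scores_pos) * len(scores_neg))

# Illustrative usage on synthetic, well-separated 5-bin histograms:
# projecting each histogram onto w gives the signed flatness value.
rng = np.random.default_rng(0)
X_amd = rng.normal([3, 0, 0, 0, 0], 0.1, (15, 5))
X_sd = rng.normal([0, 0, 0, 0, 3], 0.1, (15, 5))
w = lda_direction(X_amd, X_sd)
signed_amd, signed_sd = X_amd @ w, X_sd @ w
```

Varying the threshold on the signed value traces out the ROC curve summarized by the AUC.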
The system was implemented in C++ and run on a Windows XP 2.4 GHz laptop with 4 GB of RAM. The time required to quantify the shape of bright lesions in each image was approximately 12 seconds.
Fifteen AMD and 15 molecularly confirmed SD subjects were studied. Examples of AMD and SD images, with automatically detected lesions and segmentation results are shown in Figure 1. The average and SD of the five-flatness-value histogram levels, computed for both AMD images and SD images, are given in Table 1. There are significant differences at both extremes of the flatness value. A ranking of the 30 images of the 30 subjects by flatness value (from the most likely to contain drusen to the most likely to contain flecks) is displayed in Figure 3. The image in Figure 3A, with specificity of 0.20, was the most challenging of all 30 images: It contained two large, irregular, elongated flecks that the algorithm split and detected as several smaller round lesions. The other misclassifications occurred near the AMD/SD classification boundary. An area under the ROC curve of 0.936 was achieved on the 30-image data set (Fig. 4), achieving a sensitivity of 100% at a specificity of 80%.
The results of this preliminary study show that our automated system for quantification of specific phenotypes is capable of quantifying the drusen and fleck phenotypes in images from AMD and SD patients and can predict whether a patient has AMD or SD from these phenotypes in a single fundus image. Differentiating SD flecks from drusen is not a trivial task for most clinicians, but the subtle phenotypic difference between these lesions was successfully caught by template matching, which characterized the shape of each lesion. Although we expect the main use of computational quantification of phenotypes to be for phenotype characterization and aiding in the discovery of additional genotypes for AMD and SD, the specific automated system used in this study may have potential for assisting nonretinal specialists in determining whether patients have SD or AMD from a single color fundus image.
Clinicians tend to think of the difference between flecks and drusen as only the extremes (i.e., either flecks or drusen). Inspection of Figure 1 in light of our results may help illustrate that, even in SD, many round (drusenlike) lesions occur. It is the distribution (the histogram, or the visual Gestalt) of the flatness of the lesions in aggregate, and not of a single lesion, that determines the diagnosis.
To our knowledge, this is the first study to use a computational approach to characterize fundus phenotypes. Our approach is not limited to quantification of AMD and Stargardt phenotypes, but instead, can be used anywhere that complex phenotypes are suspected to be present in images, as long as the phenotypes are based on the properties of objects in the images.
We found our approach to be robust to changes in the only parameter, t, and adjusting or fine-tuning this parameter was not necessary for discrimination. We believe this is most likely because we used the histogram of flatness values to classify an entire image: A local error of interpreting a single druse as a fleck minimally affects the histogram; it is the distribution of the shapes of all lesions in the image that determines its classification as either AMD or SD. This study should be regarded as a proof-of-concept study. A study on a much larger dataset of retinal images is in preparation, and we plan to use our standard machine-learning approaches to more fully refine the value of t. Simple disease phenotypes, such as visual acuity or time of onset of visual loss, are simple to quantify (i.e., express as a number), because they are one-dimensional. So far, however, quantifying complex phenotypes, such as two-dimensional retinal images, or three-dimensional ones, such as retinal OCT, MRI, or CT, has been more challenging. Commonly, such complex phenotypes are quantified through some form of collaborative reading by experts—for example, through the Wisconsin Reading Center.33 However, computational quantification, as presented in this study, is objective, allows for finer characterization, and allows for rapid hypothesis generation and testing, because the existence of suspected phenotypes can be confirmed or discarded in a matter of hours.
In addition, to avoid bias as much as possible, we used filters that were programmatically—through PCA—optimized for detecting the lesion models in equations 1 and 2, instead of selecting them from a predefined filter bank, as we and others have done previously.25,34 However, bias still exists because of the clinical expert knowledge that was essential for both modeling the drusen and flecks in equations 1 and 2, as well as for choosing the flatness value as the aspect of the lesions to be quantified, rather than any other aspects. We are currently investigating how to extend our approach to allow automated discovery of other, shape-independent lesion aspects.
In this study, we used only retinal slide images that were subsequently digitized. We do not expect a major difference in performance on native digital images, given our experience with this approach on large sets of digital images.29,30
Quantification, discovery, and confirmation of new complex fundus phenotypes is of great interest, because it allows grouping of patients according to phenotype, such as endophenotypes. This method makes discovery of new genotypes for that phenotype possible and allows us to discover mutations within the gene. Other phenotype metrics beyond the scope of this preliminary study, such as spatial distribution of lesions, can also be quantified using our approach and may be relevant as well.
In conclusion, we have developed an approach for computational quantification of complex fundus phenotypes in AMD and SD, and the results show that quantitative phenotyping of but a single type of lesion in a single image of a patient is highly reliable in separating AMD from SD.
Supported by National Eye Institute Grants R01 EY017066, R01 EY11309, and R01 EY16822; Research to Prevent Blindness, NY; and the Department of Veterans Affairs.
Disclosure: G. Quellec, None; S.R. Russell, None; T.E. Scheetz, None; E.M. Stone, None; M.D. Abràmoff, None