The grading of features of AMD from optimized digital color fundus photographs in AREDS2 has reproducibility equivalent to historical grading from color slides in AREDS. This indicates continuity of AMD lesion classification from AREDS to AREDS2, which is important for comparison between data sets. A secondary objective of AREDS2 is to evaluate the utility of the AREDS AMD severity scale for disease staging and possibly for predicting outcomes. Because the severity scale was developed using color film images, there was a question as to whether the macular lesion characteristics graded with digital photography were sufficient to reasonably justify use of the same grading steps. Agreement within two steps on the AREDS severity scale in the AREDS and AREDS2 contemporaneous samples is 93.6% and 95.6%, respectively.4
Our results indicate that, overall, there are no gross differences. Post hoc standardized image optimization for illumination and color balance of digital fundus photographs is important to help overcome the greater variability in image quality of digital images compared with film images.8
The grading schema employed in AREDS2 is focused on quantification of three key features to arrive at the nine-step AREDS severity scale: drusen area, hyperpigmentation, and hypopigmentation. The individual lesion components are graded using categorical scales (as opposed to a count of the number of drusen or a measured area of pigment changes). The reproducibility of grading some macular features, such as area of hypopigmentation, is only moderate (weighted kappa 0.56). The grading of hypopigmentation from color photographs is difficult and is highly dependent upon image quality. Despite this, the reproducibility of grading on the AREDS severity scale is substantial (91% agreement, weighted kappa 0.73). A two-step change along the severity scale may be a useful outcome for eyes enrolled in clinical trials with early or moderate atrophic AMD, similar to how the ETDRS diabetic retinopathy scale is employed in many clinical studies. When all data have accrued at the end of the clinical trial, the AREDS2 severity levels and change in levels will be evaluated against advanced AMD outcomes in order to test this hypothesis.
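As an illustration of the two kinds of reproducibility statistic quoted above, the following minimal Python sketch computes the proportion of paired grades that agree within a given number of steps, and a weighted kappa. The linear agreement weights, the function names, and the example grades are assumptions made for illustration only; the weighting scheme actually used in AREDS2 is not stated in this section.

```python
from collections import Counter


def agreement_within(grades_a, grades_b, steps):
    """Fraction of paired grades that differ by at most `steps` scale steps."""
    if len(grades_a) != len(grades_b) or not grades_a:
        raise ValueError("need two non-empty grade lists of equal length")
    hits = sum(abs(a - b) <= steps for a, b in zip(grades_a, grades_b))
    return hits / len(grades_a)


def weighted_kappa(grades_a, grades_b, n_levels):
    """Cohen's weighted kappa with linear agreement weights (an assumption).

    Grades are integers on an ordinal scale with `n_levels` steps, for
    example 1 to 9 for a nine-step scale.
    """
    if len(grades_a) != len(grades_b) or not grades_a:
        raise ValueError("need two non-empty grade lists of equal length")
    if n_levels < 2:
        raise ValueError("the scale needs at least two levels")
    n = len(grades_a)

    def weight(i, j):
        # 1 for exact agreement, falling linearly to 0 at maximal disagreement.
        return 1 - abs(i - j) / (n_levels - 1)

    # Observed weighted agreement over the graded pairs.
    observed = sum(weight(a, b) for a, b in zip(grades_a, grades_b)) / n

    # Weighted agreement expected by chance from the two marginal distributions.
    count_a, count_b = Counter(grades_a), Counter(grades_b)
    expected = sum(
        weight(i, j) * count_a[i] * count_b[j] for i in count_a for j in count_b
    ) / (n * n)

    if expected == 1:
        # Both graders used a single identical level; kappa is undefined,
        # so report perfect agreement by convention.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical original grades and regrades on a nine-step scale.
    first = [1, 2, 4, 5, 6, 7, 8, 9]
    second = [1, 3, 4, 5, 8, 7, 8, 6]
    print(agreement_within(first, second, steps=2))
    print(weighted_kappa(first, second, n_levels=9))
```

Reporting agreement within one or two steps alongside kappa is useful because kappa corrects for chance agreement and therefore depends on how grades are distributed across the scale, whereas percent agreement does not.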
An advantage of grading color images in a digital environment is that it allows contemporaneous masked regrading for quality control. In AREDS, color film slides were individually labeled and set as stereo pairs, in appropriate anatomic relation, into plastic sheets, which were also labeled. The effort involved in relabeling slides for quality control was prohibitive in AREDS. Even under the best of circumstances, duplication of color slides resulted in images that were different enough from the film originals that an experienced grader could detect them and thereby become unmasked to the reproducibility exercise. Therefore, in AREDS, all quality control was performed unmasked to the type of grading, which theoretically could bias the reproducibility results. In AREDS2, duplicating images in a digital environment and assigning fictitious identifiers for evaluator masking are relatively easy, and all reproducibility grading is masked.
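The masking step described above can be sketched in a few lines of Python. This is a hypothetical illustration, not the AREDS2 reading center's actual procedure: the function name, the `QC` identifier format, and the sampling approach are all assumptions.

```python
import random


def build_masked_regrade_set(image_ids, sample_size, seed=None):
    """Pick images for regrading and give each a fictitious identifier.

    Returns (worklist, key): `worklist` is what the grader sees, and `key`
    maps each fictitious identifier back to the original image identifier.
    The key would be kept away from graders until the exercise is complete.
    """
    ids = list(image_ids)
    if sample_size > len(ids):
        raise ValueError("sample_size exceeds the number of available images")
    rng = random.Random(seed)

    # Random sample of originals to be regraded.
    sample = rng.sample(ids, sample_size)

    # Unique six-digit codes, so fictitious identifiers never collide.
    codes = rng.sample(range(100000, 1000000), sample_size)
    key = {f"QC{code}": original for code, original in zip(codes, sample)}

    # Sorting by fictitious identifier gives an order unrelated to the originals.
    worklist = sorted(key)
    return worklist, key


if __name__ == "__main__":
    originals = [f"IMG{n:04d}" for n in range(1, 51)]  # hypothetical identifiers
    worklist, key = build_masked_regrade_set(originals, sample_size=5, seed=1)
    print(worklist)
```

Because the regrade copies are digital duplicates, they are pixel-identical to the originals, so the only cue a grader could use to recognize a quality control image is the identifier, which this step replaces.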
An important endpoint in the color photograph grading for atrophic AMD is the enlargement of GA and the development of GA in the center of the macula. At present, there are multiple alternative image types for this assessment, including fundus autofluorescence14,15 and spectral domain optical coherence tomography.16 Each imaging type has its unique advantages and disadvantages, and the superiority of one image type over another requires comparative grading in the same set of eyes and correlation with clinically important outcomes. In AREDS2, the reproducibility of grading the area of GA from color fundus photographs is high. Of note, because the unit of measurement used in both AREDS and AREDS2 is disc area (DA), the grading area measurements are directly comparable. This is in spite of the revision of the assumed dimensions of the DA from 1.80 mm² to the current convention of 2.54 mm².
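The point about units can be made concrete with a small Python sketch. Areas graded in DA can be compared directly between the two studies, but any conversion to square millimeters depends on which DA convention is assumed. The constant and function names below are illustrative assumptions; the two conversion factors are the values given in the text.

```python
# Assumed area of one disc area (DA) in mm^2 under each convention.
DA_MM2_AREDS = 1.80    # earlier assumption
DA_MM2_CURRENT = 2.54  # current convention


def da_to_mm2(area_da, mm2_per_da):
    """Convert an area graded in disc areas to square millimeters."""
    if area_da < 0:
        raise ValueError("area must be non-negative")
    return area_da * mm2_per_da


if __name__ == "__main__":
    ga_area_da = 2.0  # hypothetical graded GA area, in DA
    print(da_to_mm2(ga_area_da, DA_MM2_AREDS))
    print(da_to_mm2(ga_area_da, DA_MM2_CURRENT))
```

The same graded value in DA therefore corresponds to a physical area about 1.41 times larger (2.54/1.80) under the current convention, which is why comparisons between data sets are safest when made in DA rather than in converted units.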
Automated lesion detection has been employed with some success to measure drusen extent from fundus photographs in eyes with AMD.17–19 Advances in technology may make automated grading from color photographs feasible, which is desirable not only to decrease costs but also because machine grading usually has higher reproducibility than human grading. In addition, there has been growing interest in the use of technologies other than color photography for the measurement and classification of drusen, such as spectral/Fourier domain optical coherence tomography (OCT).20–25 The classification of drusen and measurement of drusen area and volume from OCT is especially promising because it appears to be well within the capabilities of current technology. Because these OCT variables are fundamentally different from those measured on color photographs, it would not be surprising if the results were not directly comparable to those from color photograph grading. To date, no large data sets evaluated by both methodologies have been reported. OCT measurement of drusen is an ancillary study in AREDS2 that should shed light on the relationship between drusen variables measured by both imaging types and on the relationships between OCT measurements and clinical outcomes. Classification of AMD by fundus autofluorescence imaging also has been proposed.14,26–29 Fundus autofluorescence patterns may provide prognostic information for AMD outcomes. Fundus autofluorescence may be particularly useful for the semiautomated measurement of area of GA,15 although measurements are, again, sometimes discrepant with those from color photographs.14 Autofluorescence imaging is another ancillary study within AREDS2 that should help clarify its predictive value. Advantages of classifying AMD from color photographs, apart from its validation in epidemiologic studies and clinical trials, include the fact that it does not require technology beyond common clinical practice (such as a scanning laser ophthalmoscope for autofluorescence or a spectral domain OCT) and that it is the image type most similar to the view of the physician examining a patient. This makes extrapolation from clinical trials to clinical practice somewhat easier.5
As potential therapeutics target earlier stages of AMD, a classification system based upon outcomes validated with a large prospective cohort is increasingly important.