IMAGING PROTOCOL
The OCT images were obtained using the Stratus OCT3 (Carl Zeiss Meditec, Dublin, California), except for 2 images obtained using the OCT2 (Carl Zeiss Meditec), by imaging technicians who were certified by the reading center. The imaging technician at the clinic site evaluated each OCT image for quality before submission to the reading center, and poor-quality images were retaken if possible.
As directed by the SCORE Study protocol, the study eye of the subject was scanned at baseline and every 4 months thereafter for up to 36 months. The fellow eye was scanned at baseline, month 4, month 12, month 24, and month 36. The scanning protocol included the fast macular scan (128 A scans per B scan) and the higher-resolution (512 A scans per B scan) crosshair scan, both using 6.0-mm scan length and 6.0-mm display diameter. The characteristics of the fast macular scanning protocol are summarized elsewhere [12,13].
Each submission to the reading center required paper printouts of the fast macular thickness map analysis report (6 individual retinal thickness reports consisting of 6 individual radial B scan images) and the horizontal and vertical crosshair scans. All images were stripped of subject identifiers in compliance with Health Insurance Portability and Accountability Act regulations.
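As a rough illustration of the scan geometry and visit schedule described above, the following Python sketch computes the nominal lateral spacing between A scans for the two scan types and lists the scheduled visit months. It assumes that A scans are evenly spaced and that the spacing equals the scan length divided by the number of A scans; the function and variable names are illustrative and are not part of the SCORE Study protocol.

```python
# Nominal scan geometry; names and the even-spacing assumption are illustrative.
SCAN_LENGTH_MM = 6.0


def a_scan_spacing_um(a_scans_per_b_scan, scan_length_mm=SCAN_LENGTH_MM):
    """Nominal lateral distance between adjacent A scans, in micrometers."""
    if a_scans_per_b_scan <= 0:
        raise ValueError("a_scans_per_b_scan must be positive")
    return scan_length_mm * 1000.0 / a_scans_per_b_scan


# Visit months taken from the text: the study eye every 4 months up to month 36,
# the fellow eye at baseline and months 4, 12, 24, and 36.
STUDY_EYE_VISIT_MONTHS = list(range(0, 37, 4))
FELLOW_EYE_VISIT_MONTHS = [0, 4, 12, 24, 36]

if __name__ == "__main__":
    print(a_scan_spacing_um(128))  # fast macular scan
    print(a_scan_spacing_um(512))  # crosshair scan
    print(STUDY_EYE_VISIT_MONTHS)
```

Under these assumptions, the fast macular scan samples the retina about 4 times more coarsely in the lateral direction than the crosshair scan, which is consistent with the crosshair scan being used for morphological evaluation.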
GRADING EQUIPMENT
Current OCT review software has built-in calipers for thickness measurements. Because this software was not available at the beginning of the SCORE Study, the reading center used only paper prints of OCT images for evaluation. Various tools were developed for evaluation of the paper prints. Manual measurements of center point thickness on OCT printouts were performed using a head-mounted optical glass binocular magnifier (OptiVisor; Domegan Optical Company, Lenexa, Kansas, or the equivalent) and a handheld digital caliper (Product 9900; Precision Graphic Instruments Inc, Spokane, Washington, or the equivalent). For all images, the lateral extent of morphological abnormalities was assessed using a lateral scale (127-mm grid) overlaid on the crosshair B scan, with the edges of the scale lined up with the B scan image.
GRADING PROCEDURE
Graders evaluated the OCT images independently (ie, no reference made to previous visits or other image types for the same subject). Each image was graded by a single grader.
For each visit, the following information was derived from an OCT image: (1) the quality of the scan, (2) the center point thickness and the retinal thickness measurements of the 9 Early Treatment Diabetic Retinopathy Study subfields, (3) total macular volume, and (4) retinal morphology. Retinal thickness measurements were taken directly from the fast macular thickness report unless the center point measurement was determined to be inaccurate by quality assessment. If the center point was measured inaccurately by the software, manual measurement was performed (see below).
ASSESSMENT OF OVERALL QUALITY
The goal of the overall quality assessment was to determine the accuracy of the numeric output of the fast macular thickness report. The graders reviewed this report to assess the presence of artifacts, mainly boundary line errors and decentration. The categories for quality assessment were good, fair, borderline, and ungradable. A grade of good indicated that the OCT image was free of artifacts; in such cases, the software-generated center point thickness available on the paper prints was recorded in the grading form. A grade of fair indicated that boundary line errors were present but did not involve the center point. In such cases, the center point thickness was accurate but the subfield grid had inaccurate values. The evaluation form allowed the grader to document the reliability of each subfield value within the grid. An accompanying figure shows examples of the good and fair categories.
Borderline quality indicated that the center point thickness was inaccurate and that manual measurement had to be performed. Boundary line errors affecting the center point and/or decentration could lead to a grade of borderline quality. All images with a standard deviation of more than 10% of center point thickness were measured manually [14].
Ungradable quality indicated that the entire image had such severe inaccuracies that a manual measurement could not be performed. Examples of ungradable quality include severe scan alignment artifact, in which the OCT image is cut off from the display window and the inner and/or outer retinal layers are not visible for determination of boundaries. Very low signal strength could also result in ungradable images.
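The four quality categories and the 10% standard deviation rule can be summarized as a small decision procedure. The Python sketch below is only an illustration: the boolean inputs stand in for judgments that the graders made visually on paper prints, and the function names, argument names, and the order in which the checks are applied are assumptions rather than part of the grading protocol.

```python
from enum import Enum


class Quality(Enum):
    GOOD = "good"
    FAIR = "fair"
    BORDERLINE = "borderline"
    UNGRADABLE = "ungradable"


def classify_quality(manual_measurement_possible,
                     center_point_accurate,
                     boundary_error_present):
    """Map grader judgments (hypothetical boolean inputs) to a quality category."""
    if not manual_measurement_possible:
        # Image too degraded for even a manual measurement.
        return Quality.UNGRADABLE
    if not center_point_accurate:
        # Boundary line error at the center point and/or decentration.
        return Quality.BORDERLINE
    if boundary_error_present:
        # Errors present, but away from the center point.
        return Quality.FAIR
    return Quality.GOOD


def needs_manual_measurement(quality, center_point_um, sd_um):
    """True if the center point thickness should be measured by hand."""
    if quality is Quality.UNGRADABLE:
        return False  # no manual measurement can be performed
    if quality is Quality.BORDERLINE:
        return True
    # Standard deviation greater than 10% of the center point thickness.
    return sd_um > 0.10 * center_point_um
```

For example, an image graded good with a center point thickness of 300 µm and a standard deviation of 45 µm would still be sent for manual measurement under this sketch, because 45 µm exceeds 10% of 300 µm.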
TYPES OF ERRORS ON OCT IMAGES
Boundary Line Errors
Boundary line errors are artifacts in which the segmentation algorithm fails to identify the inner retina and/or the outer retina correctly. Boundary line errors are identified by inaccurate tracing of the white lines on the OCT image and by the presence of wedge or bow tie artifacts in the false color map. Boundary line errors are the most common cause of inaccurate center point thickness [15].
Decentration
Decentration of the OCT image occurs when the intersection of the 6 radial line scans of the fast macular scanning report does not coincide with the center of the macula. Decentration was assessed using both the map report and the individual B scan images. All 6 B scans were examined to identify the location of the fovea with respect to the center point of the scan. A shift of the fovea by more than 10 A scans (500 µm) on either side of the center point (A scan number 64) was considered decentration. The false color map helped identify the shift of the fovea (blue area in eyes with foveal depression) from the central subfield. The gray scale fundus image (which depicts the location of the 6 radial lines with respect to the anatomical fovea) was also used to assess centration, although it was taken after the scan was complete.
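The decentration criterion lends itself to a short worked example. The Python sketch below assumes that the grader has already located the fovea on a B scan and expressed its position as an A scan number; the function names are illustrative. The conversion to micrometers assumes evenly spaced A scans over the 6.0-mm scan length, which gives about 469 µm for 10 A scans, in line with the rounded figure of 500 µm quoted in the text.

```python
CENTER_A_SCAN = 64        # center point of the 128-A-scan fast macular B scan
MAX_SHIFT_A_SCANS = 10    # shifts larger than this count as decentration
A_SCANS_PER_B_SCAN = 128
SCAN_LENGTH_UM = 6000.0


def foveal_shift_a_scans(fovea_a_scan, center=CENTER_A_SCAN):
    """Absolute offset of the fovea from the scan center, in A scans."""
    return abs(fovea_a_scan - center)


def foveal_shift_um(fovea_a_scan, center=CENTER_A_SCAN):
    """Approximate lateral offset in micrometers (assumes even A scan spacing)."""
    spacing_um = SCAN_LENGTH_UM / A_SCANS_PER_B_SCAN
    return foveal_shift_a_scans(fovea_a_scan, center) * spacing_um


def is_decentered(fovea_a_scan, center=CENTER_A_SCAN,
                  max_shift=MAX_SHIFT_A_SCANS):
    """True if the fovea lies more than max_shift A scans from the center."""
    return foveal_shift_a_scans(fovea_a_scan, center) > max_shift
```

A fovea found at A scan 74 sits exactly 10 A scans from the center and is not flagged, whereas a fovea at A scan 75 or A scan 53 is.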
Other indicators such as low signal strength (<5%) and low analysis confidence message (a tool found in newer software versions) alerted the graders to the presence of an artifact.
MANUAL MEASUREMENT OF CENTER POINT THICKNESS
Manual measurement of the center point thickness included 2 basic steps: identification of the fovea and the measurement itself. In decentered scans, identification of the fovea is important to determine the point for manual measurement. From the 6 radial B scans, the image that best represented the foveal depression was chosen for manual measurement. If the foveal depression was absent, the crosshair scans were compared with the fast macular scans for identification of the fovea. Other features used to identify the fovea included the attenuation (or absence) of the ganglion cell layer and nerve fiber layer at the fovea. In eyes with cystoid edema, the location of the largest cyst was assumed to be the fovea. In borderline quality images with boundary line errors, the white line denoting the automatically detected internal limiting membrane and/or retinal pigment epithelium (RPE) was ignored and the correct layers identified.
Handheld digital calipers were used for manual measurement of center point thickness on paper prints. After ensuring that the digital readout was calibrated, the caliper tips were opened until the points for measurement just obscured the internal limiting membrane and RPE lines at the fovea. A scale factor was used to convert the caliper readout from millimeters to micrometers. describes an example of the scale factor calculation. In the SCORE Study, 28.9% of the baseline images were manually measured.
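The millimeter-to-micrometer conversion described above can be sketched as follows. The study's own worked example is not reproduced here, so this Python sketch rests on an assumption: that the scale factor is obtained from a reference distance whose physical size in tissue is known and whose length on the same paper print has been measured with the caliper. The function names and all numeric values are hypothetical.

```python
def scale_factor_um_per_mm(reference_um, reference_print_mm):
    """Micrometers of tissue represented by 1 mm on the paper print.

    reference_um: known physical size of a reference distance (assumed known).
    reference_print_mm: caliper reading of that same distance on the print.
    """
    if reference_print_mm <= 0:
        raise ValueError("reference_print_mm must be positive")
    return reference_um / reference_print_mm


def caliper_to_thickness_um(caliper_mm, scale_um_per_mm):
    """Convert a caliper readout on the print (mm) to retinal thickness (µm)."""
    if caliper_mm < 0:
        raise ValueError("caliper_mm must be non-negative")
    return caliper_mm * scale_um_per_mm


if __name__ == "__main__":
    # Hypothetical numbers: a 2000-µm reference distance spans 40.0 mm on paper.
    scale = scale_factor_um_per_mm(2000.0, 40.0)
    # Hypothetical caliper reading between the inner and outer retinal lines.
    print(caliper_to_thickness_um(6.4, scale))
```

Because the scale factor depends on the size at which the report was printed, it would have to be recomputed whenever the print size changed.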
MORPHOLOGICAL EVALUATION
There were 3 main codes for assessing the presence of a morphological abnormality: absent, questionable (meaning probable), and definitely present. The definitions of these 3 codes were similar to the definitions used in Early Treatment Diabetic Retinopathy Study Report 10 [16] and Age-Related Eye Disease Study Report 6 [17].
The higher-resolution crosshair scans were used to assess 3 distinct retinal morphologies: intraretinal cystoid spaces, subretinal fluid, and vitreoretinal interface abnormalities. The measurement procedure for quantifying these morphologies using the calipers was similar to that of center point thickness manual measurement using a scale factor calculation.
Cystoid spaces were identified as round, well-defined spaces within the neurosensory retina measuring at least 2 × 2 mm on the paper prints (60 × 60 µm). A millimeter ruler was used to approximate the cyst size. The cyst cavity is typically nonreflective (dark) or minimally reflective. The presence or absence of cystoid spaces at the center point was evaluated, as well as their lateral extent with respect to the central subfield, using the 127-mm lateral scale. In addition, the height of the cystoid space at the center point was measured.
Subretinal fluid was identified on OCT as a predominantly nonreflective (dark) space between the posterior boundary of the neurosensory retina and an intact RPE/Bruch junction. The subretinal fluid is typically dome-shaped, and its greatest vertical height was measured. The location and lateral extent of the subretinal fluid were categorized in a manner similar to cysts.
Three types of vitreoretinal interface abnormalities were recorded: posterior vitreous detachment, epiretinal membrane, and macular holes. In an eye with posterior vitreous detachment, the posterior hyaloid membrane is visible on an OCT image as a thin, weakly reflecting membrane located in the dark area anterior to the retina. The attachment of the posterior hyaloid membrane to the internal limiting membrane was categorized as partially adherent or nonadherent. Tenting of the retinal tissue at or around the point of adherence was categorized as vitreomacular traction. Epiretinal membrane was identified by the presence of a well-delineated layer of increased density/reflectivity on the retinal surface. Other features supportive of an epiretinal membrane included corrugation of the retina, bridging of the innermost layer of retinal tissue, and flattening of the fovea. Retinal distortion caused by epiretinal membrane was characterized by scalloped or cleft-like areas with peaking of the retinal tissue at points of adherence or as irregular retinal tissue under the epiretinal membrane. A macular hole was identified as a pseudohole, a lamellar hole, or a full-thickness macular hole. No attempt was made to distinguish between pseudoholes and lamellar holes, both of which were identified by an abnormally wide foveal depression with a steep foveal contour. Intact outer layers of retinal tissue just above the RPE distinguish a pseudohole or lamellar hole from a full-thickness hole.
QUALITY CONTROL AND REPRODUCIBILITY
Quality control (QC) programs at several levels were used to maintain intergrader reproducibility. An ongoing monthly QC program ensured that approximately 5% of the images were regraded every month. On a quarterly basis, intergrader agreement statistics were generated for the whole group and for individual graders. Reproducibility data were used to identify the characteristic(s) for which a grader differed from the group, followed by focused training.
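Selecting roughly 5% of a month's images for regrading could be done along the following lines. This Python sketch assumes simple random sampling without replacement, which the text does not specify for the monthly program; the function name, the rounding rule, and the minimum of 1 image are likewise assumptions made for illustration.

```python
import random


def select_qc_sample(image_ids, fraction=0.05, seed=None):
    """Pick about `fraction` of the images for regrading (assumed random draw)."""
    ids = list(image_ids)
    if not ids:
        return []
    # Round to the nearest whole image, but always regrade at least 1.
    sample_size = max(1, round(len(ids) * fraction))
    rng = random.Random(seed)  # seeded generator makes the draw reproducible
    return rng.sample(ids, sample_size)


if __name__ == "__main__":
    month_images = [f"image_{i:03d}" for i in range(200)]  # hypothetical IDs
    print(select_qc_sample(month_images, seed=2006))
```

Fixing the seed allows the same QC sample to be regenerated later, for instance when auditing which images were regraded in a given month.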
Continuous training in the form of bimonthly QC meetings also helped maintain excellent reproducibility. The grading team leaders and ophthalmologists at the reading center led the meetings, in which images (randomly selected or chosen to focus attention on particular grading issues) were graded as an exercise with the group’s input. Difficulties in grading were handled through these meetings, which have proven to be an efficient way of reducing differences between the grading team members.
A randomly selected sample of SCORE Study images was identified for quality control with the intent to have them regraded annually through the course of the SCORE Study. A total of 106 images of 53 subjects were randomly selected from the SCORE Study OCT images that had grading completed by July 2006, irrespective of the visit. Eight graders participated in the annual exercise, and each scan was regraded by 2 or 3 graders. Each scan had a single grade of record that was exported to the data coordinating center for data analysis. Data from the QC exercises were compared using cross tabulations (grade of record vs QC grade). The percentage of exact agreement on a categorical scale (eg, presence vs absence) and the unweighted κ values are presented here. Agreement for continuous variables such as manually measured center point thickness was assessed using the intraclass correlation coefficient.
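For readers unfamiliar with the categorical agreement statistics named above, the following Python sketch computes the percentage of exact agreement and the unweighted (Cohen) κ for paired grades, treating the grade of record and a single QC grade as the two ratings of each scan. The function names and the example grades are hypothetical, and the sketch does not attempt to reproduce how the study handled scans that were regraded by more than 1 QC grader.

```python
from collections import Counter


def _check_pairs(record, qc):
    if len(record) != len(qc) or not record:
        raise ValueError("grade lists must be non-empty and of equal length")


def percent_exact_agreement(record, qc):
    """Percentage of scans on which the two grades are identical."""
    _check_pairs(record, qc)
    matches = sum(r == q for r, q in zip(record, qc))
    return 100.0 * matches / len(record)


def unweighted_kappa(record, qc):
    """Cohen's unweighted kappa: (p_o - p_e) / (1 - p_e)."""
    _check_pairs(record, qc)
    n = len(record)
    p_observed = sum(r == q for r, q in zip(record, qc)) / n
    record_counts, qc_counts = Counter(record), Counter(qc)
    categories = set(record_counts) | set(qc_counts)
    # Agreement expected by chance from the marginal frequencies.
    p_expected = sum(record_counts[c] * qc_counts[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both graders used a single identical category throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Hypothetical grades for 10 scans (P = present, A = absent).
    grade_of_record = ["P", "P", "P", "P", "A", "A", "A", "A", "A", "A"]
    qc_grade = ["P", "P", "P", "A", "A", "A", "A", "A", "A", "P"]
    print(percent_exact_agreement(grade_of_record, qc_grade))
    print(unweighted_kappa(grade_of_record, qc_grade))
```

The κ statistic is reported alongside exact agreement because it discounts the agreement that 2 graders would reach by chance given how often each of them uses each category.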