Comparison of clinician and reading center assessments of diabetic retinopathy severity demonstrated moderate agreement on a 4-step scale that pooled a majority of cases in a single broad step of the scale.
To evaluate agreement in diabetic retinopathy severity classification by retina specialists performing ophthalmoscopy versus Reading Center (RC) grading of 7-field stereoscopic fundus photographs in a phase 2 clinical trial of intravitreal bevacizumab for center-involved diabetic macular edema.
Clinicians’ grading scale used 4 levels: microaneurysms only, mild/moderate non-proliferative diabetic retinopathy (NPDR), severe NPDR, and proliferative diabetic retinopathy (PDR) or prior panretinal photocoagulation (PRP) or both. The Reading Center scale used 8 levels: microaneurysms only, mild NPDR, moderate NPDR, moderately severe NPDR, severe NPDR, mild PDR, moderate PDR, and high-risk PDR. Percent agreement and kappa statistic were defined by collapsing RC categories to match those used by clinicians.
There was agreement in 89/118 eyes (75%) with kappa=0.55 (95% confidence interval: [0.41, 0.68]). In 6 eyes, disagreements were of potential substantial clinical importance: 5 eyes with subtle retinal neovascularization and 1 with a small preretinal hemorrhage identified only in photographs.
Clinician grading of retinopathy severity had moderate agreement with Reading Center grading and might be useful for placing eyes into broad baseline categories.
Grading severity of diabetic retinopathy from 7-field stereoscopic fundus photographs is the accepted standard method to evaluate baseline levels and progression of diabetic retinopathy in clinical trials and epidemiologic studies.1–6 Several publications have reported the sensitivity and specificity of diabetic retinopathy detection and severity classification as determined by clinical examination (ophthalmoscopy) compared with a reference standard of stereoscopic fundus photographs. However, in many of these studies, ophthalmoscopy was performed by a single ophthalmologist7–9 or other health care professionals,10 such as optometrists, ophthalmic technicians,9 diabetologists,11 physician assistants,12 and nurses.11 In studies that compared ophthalmoscopy with grading of 7-field stereoscopic fundus photographs at a Reading Center, sensitivity of diabetic retinopathy detection has been reported to range between 32% and 82%, and specificity between 95% and 100%.7, 9, 12 In the current study with 47 participating retina specialists in a phase 2 clinical trial of intravitreal bevacizumab (Avastin®, Genentech, South San Francisco, California) for center-involved diabetic macular edema (DME), we evaluated the agreement in diabetic retinopathy severity classification as determined by ophthalmoscopy performed by retina specialists using a clinicians’ grading scale as compared with grading of stereoscopic fundus photographs at a Reading Center using a grading scale modified slightly from the ETDRS.13 Results of this study provide information regarding how ophthalmologists in the Diabetic Retinopathy Clinical Research Network (DRCR.net) categorize levels of retinopathy of subjects compared with detailed grading of fundus photographs of those same subjects by a Reading Center, and may help determine if and when clinician assessment alone may be sufficient for certain study purposes.
The design, methods, and results of the phase 2 intravitreal bevacizumab for DME trial, a multi-center randomized clinical trial conducted by the DRCR.net at 36 clinical sites in the United States, have been published.14 Forty-seven retina specialists enrolled 121 subjects at these 36 sites. As the current study relates only to baseline data from this trial, only the pertinent eligibility criteria and aspects of the study design relevant to the current study are summarized here. The entire protocol is available on the public site at http://drcr.net. Eligible subjects were at least 18 years old with type 1 or type 2 diabetes. The pertinent eligibility criteria for the study eye included: (1) definite retinal thickening due to DME involving the center of the macula based on clinical exam, (2) OCT central subfield thickness ≥ 275 microns, (3) no history of treatment for DME within the prior 3 months, and (4) no panretinal photocoagulation (PRP) within the prior 4 months or an anticipated need for PRP in the 6 months following randomization. A subject could contribute only one study eye.
Standard ETDRS 7-field film color stereoscopic fundus photographs of the study eye of each study participant were obtained at baseline and forwarded to the DRCR.net Reading Center at the University of Wisconsin-Madison for grading by trained readers who were masked to all clinical data, including clinician assessment of diabetic retinopathy level. Occasionally fundus photographs of both eyes were submitted; photographs of non-study eyes were not available to the graders. Each set of photographs was graded independently by 2 of the 6 graders assigned to DRCR.net color photograph grading. Disagreements by one or more retinopathy severity levels were resolved by a senior grader. The graders used standardized definitions for various characteristics of diabetic retinopathy and standard photographs to assess the severity of these characteristics in the 7 photographic fields and derived a composite ETDRS level score of 20, 35, 43, 47, 53, 60–61, 65, 71 or 75.13, 15 These levels correspond to the following classifications for diabetic retinopathy severity: 1) microaneurysms only, 2) mild non-proliferative diabetic retinopathy (NPDR), 3) moderate NPDR, 4) moderately severe NPDR, 5) severe NPDR, 6) mild proliferative diabetic retinopathy (PDR), 7) moderate PDR, and 8) high-risk PDR.
Clinicians were asked to grade diabetic retinopathy severity level at baseline in each eye of all study participants as one of the following levels: a) microaneurysms only, b) mild/moderate NPDR, c) severe NPDR (4-2-1 rule), and d) PDR or prior PRP or both. For eyes in category (d), presence/extent of PRP scars and of new vessels/vitreous or preretinal hemorrhage was also recorded. With the exception of required mydriasis, no specific protocol for the examinations was provided, but the principal methods used for ophthalmoscopy were slit-lamp biomicroscopy with a hand-held lens and binocular indirect ophthalmoscopy. Except for the 4-2-1 rule defining severe NPDR (severe intraretinal hemorrhages and microaneurysms in at least 4 quadrants, definite venous beading in at least 2 quadrants, or moderate intraretinal microvascular abnormalities in at least 1 quadrant), no specific definitions were provided to the clinicians for the severity levels, with the presumption that their general ophthalmic fund of knowledge would suffice to differentiate these categories.
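The 4-2-1 rule is a simple either/or criterion, and a minimal sketch may make its logic concrete. The function and argument names below are hypothetical and are not part of the study protocol; each argument is assumed to be the number of quadrants (0 to 4) in which the feature reaches the severity threshold named in the rule.

```python
def is_severe_npdr(hma_quadrants: int, vb_quadrants: int, irma_quadrants: int) -> bool:
    """Return True if at least one criterion of the 4-2-1 rule is met.

    hma_quadrants:  quadrants with severe intraretinal hemorrhages and microaneurysms
    vb_quadrants:   quadrants with definite venous beading
    irma_quadrants: quadrants with moderate intraretinal microvascular abnormalities
    (Argument names are illustrative assumptions, not study variables.)
    """
    for name, value in (
        ("hma_quadrants", hma_quadrants),
        ("vb_quadrants", vb_quadrants),
        ("irma_quadrants", irma_quadrants),
    ):
        if not 0 <= value <= 4:
            raise ValueError(f"{name} must be between 0 and 4, got {value}")

    # "4-2-1": any single criterion is sufficient.
    return hma_quadrants >= 4 or vb_quadrants >= 2 or irma_quadrants >= 1
```

This sketch covers only the severe NPDR criterion; an eye with new vessels, vitreous or preretinal hemorrhage, or prior PRP would fall in category (d) regardless of these counts.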
For comparison with clinician grading, Reading Center categories were collapsed to match those used by the clinicians, as determined by the authors: Reading Center categories for mild, moderate, and moderately severe NPDR were combined, as were categories for mild, moderate, and high-risk PDR. The clinician versus Reading Center grading comparison was not planned until follow-up on all patients had been completed. The current study is therefore a retrospective analysis of data collected prospectively within the phase 2 study protocol.
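The collapsing of the 8 Reading Center categories into the 4 clinician categories can be written as a lookup table. The following is an illustrative sketch only; the enum, dictionary, and function names are hypothetical, and the mapping follows the category names given in the text rather than any study software.

```python
from enum import IntEnum


class ClinicianLevel(IntEnum):
    """The 4-step clinician scale, levels (a) through (d)."""
    MICROANEURYSMS_ONLY = 1
    MILD_MODERATE_NPDR = 2
    SEVERE_NPDR = 3
    PDR_OR_PRIOR_PRP = 4


# Reading Center category -> collapsed clinician-scale category.
RC_TO_CLINICIAN = {
    "microaneurysms only": ClinicianLevel.MICROANEURYSMS_ONLY,
    "mild NPDR": ClinicianLevel.MILD_MODERATE_NPDR,
    "moderate NPDR": ClinicianLevel.MILD_MODERATE_NPDR,
    "moderately severe NPDR": ClinicianLevel.MILD_MODERATE_NPDR,
    "severe NPDR": ClinicianLevel.SEVERE_NPDR,
    "mild PDR": ClinicianLevel.PDR_OR_PRIOR_PRP,
    "moderate PDR": ClinicianLevel.PDR_OR_PRIOR_PRP,
    "high-risk PDR": ClinicianLevel.PDR_OR_PRIOR_PRP,
}


def collapse_rc_category(rc_category: str) -> ClinicianLevel:
    """Map one Reading Center category name onto the 4-step scale."""
    return RC_TO_CLINICIAN[rc_category]
```

Note that three of the eight Reading Center categories fall into the single mild/moderate NPDR step, which is the broad pooled step referred to elsewhere in this report.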
Photographs were not available for 3 subjects, leaving 118 subjects for analysis. All photographs for which there was any disagreement in the gradings by the clinician compared with the Reading Center were re-reviewed by one of us (MDD) in order to ascertain potential reasons for the disagreements. Of the 118 eyes, 4 were initially classified as non-gradable by the Reading Center because of poor photographic quality, but on review for this manuscript could be assigned a retinopathy severity level and were included in the analysis (reviewer was aware of clinician’s grading at the time of re-review).
Percent agreement and the unweighted kappa statistic were computed, comparing the clinician and Reading Center gradings.
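For readers unfamiliar with these statistics, the sketch below shows one standard way to compute percent agreement and the unweighted (Cohen's) kappa from paired gradings, using kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from the two raters' marginal distributions. The function name is hypothetical, and the statistical software actually used for this analysis is not stated in the text.

```python
from collections import Counter
from typing import Hashable, Sequence, Tuple


def percent_agreement_and_kappa(
    rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]
) -> Tuple[float, float]:
    """Return (observed agreement, unweighted Cohen's kappa) for paired gradings."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must grade the same non-empty set of eyes")

    n = len(rater_a)
    # Observed agreement: fraction of eyes given the identical category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: sum over categories of the product of marginal proportions.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa
```

Because kappa corrects for chance agreement, it can be considerably lower than the raw percent agreement when most eyes fall in one category.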
Results of clinician and Reading Center diabetic retinopathy severity level gradings are summarized in Table 1. The kappa value was 0.55 (95% confidence interval [0.41, 0.68]). The shaded cells represent consistent gradings (89 of 118 eyes; 75%) and highlighted cells represent agreement within one step (101 eyes; 86%).
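The percentages above follow directly from the reported counts, as the short check below illustrates. The variable names are hypothetical. The last quantity, the chance agreement implied by the reported kappa, is a back-of-envelope value obtained by rearranging kappa = (p_o - p_e) / (1 - p_e); it is approximate because kappa is reported to two decimals, and it is not a figure reported by the study.

```python
n_eyes = 118          # eyes analyzed
exact = 89            # identical category on both scales
within_one = 101      # identical or within one step
kappa = 0.55          # reported unweighted kappa

pct_exact = 100 * exact / n_eyes            # about 75%
pct_within_one = 100 * within_one / n_eyes  # about 86%

one_step = within_one - exact               # one-step disagreements
two_or_more = n_eyes - within_one           # disagreements of 2 or more steps

# Rearranged kappa formula: p_e = (p_o - kappa) / (1 - kappa).
p_o = exact / n_eyes
implied_chance_agreement = (p_o - kappa) / (1 - kappa)

print(f"exact agreement: {pct_exact:.1f}%")
print(f"within one step: {pct_within_one:.1f}%")
print(f"one-step disagreements: {one_step}, two or more steps: {two_or_more}")
print(f"implied chance agreement (approximate): {implied_chance_agreement:.2f}")
```

A relatively high implied chance agreement would be expected when a majority of eyes fall in one broad category, which is why kappa is only moderate despite 75% raw agreement.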
Of the 12 one-step disagreements, 9 were between severe NPDR on one scale and the next lower step on the other scale, and one was between severe NPDR and mild PDR (Table 1). The remaining two one-step disagreements occurred in eyes with occasional microaneurysms and small hemorrhages; the hemorrhages were overlooked clinically or categorized as microaneurysms.
Of the 17 disagreements by 2 or more steps, 4 involved small differences in severity of NPDR (2 classified as microaneurysms only clinically and in the mildest part of the moderately severe NPDR category [level 47A] by Reading Center grading and 2 classified as severe NPDR clinically and moderate NPDR [level 43] by Reading Center grading). In 12 disagreements, NPDR was recorded clinically and PDR by Reading Center grading. In 6 of these 12 cases, scars of prior PRP, which were present in the photographs, had apparently been ignored in answering the retinopathy severity question clinically (new vessels, vitreous or preretinal hemorrhage or PRP were all part of the highest step on the clinical scale); other than PRP scars, these eyes were not noted to have other features that would have put them in the PDR category. The remaining 6 of these 12 disagreements comprised 2 eyes with small patches of neovascularization elsewhere (NVE), 2 with very subtle neovascularization at the disc (NVD), one with a small preretinal hemorrhage and one with definite NVE but only questionable NVD that had been overgraded as definite. Finally, there was one eye classified as PDR clinically and as moderately severe NPDR by Reading Center grading in which neither new vessels, nor vitreous or preretinal hemorrhage or PRP was visible in the photographs; presumably, one or more of these abnormalities was observed clinically outside of the 7 photographic fields.
There was agreement on presence of prior PRP scars in all of the 23 eyes in which the clinical and photographic assessments agreed on presence of PDR. In 5 of these eyes, residual new vessels or vitreous/preretinal hemorrhage were graded present only in the photographs and in 4 noted to be present only clinically. On review of the first 5, subtle NVE was present in 3, subtle NVD in one, and a small preretinal hemorrhage in the fifth. Among the latter 4 eyes, there was one in which a small amount of old white vitreous hemorrhage was not reported by the graders and 3 in which no PDR characteristics were seen in the photographs (and presumably were present only outside of the 7 photographic fields).
We found 86% agreement within one step and a kappa value of 0.55 (95% confidence interval [0.41, 0.68]) between clinical assessments of diabetic retinopathy severity level performed by retina specialists and fundus photograph gradings performed at a Reading Center evaluating one eye of each of 118 individuals participating in a clinical trial of center-involved DME. Clinician assessments were made on a 4-step retinopathy severity scale without standardization of methods and compared with the ETDRS retinopathy severity scale used by trained graders at a Reading Center. These results are generally consistent with results of other studies comparing ophthalmoscopy to grading of 7-field stereoscopic fundus photographs at a Reading Center. Kinyoun et al7 reported an agreement rate of 86% and 86% (weighted kappa, 0.56 and 0.62) in the grading of diabetic retinopathy in right and left eyes, respectively, with both the retina specialist and the Reading Center using the same scale of diabetic retinopathy level. In 5 of the 8 disagreements, the clinical assessment resulted in a lower diabetic retinopathy severity level classification than the Reading Center grading. Moss et al9 reported an agreement rate of 86% (kappa, 0.75) for grading retinopathy (using 3 levels: none, nonproliferative, or proliferative) when comparing ophthalmoscopy performed by an ophthalmologist (the principal investigator of the study), an optometrist or an ophthalmic technician (both of the latter had undergone extensive training in ophthalmoscopy and fundus photography) to grading of photographs at a Reading Center. Ophthalmoscopy was more likely to disagree with fundus photography grading in eyes with less severe forms of retinopathy. Among 301 eyes classified by Reading Center grading as microaneurysms only, 50% were classified as having no retinopathy by ophthalmoscopy. 
Perhaps more importantly, in 35 of 170 eyes (21%) with PDR by Reading Center grading, NPDR (or, in 3 of these eyes, no retinopathy) was found by ophthalmoscopy. Further review of these photographs by the ophthalmologist found PDR present in 34 of the 35 disagreements, but high-risk PDR in only 1 of 34 (representing 1 of the 36 eyes with high-risk PDR by Reading Center grading).
The principal causes of disagreements in our study appeared to be the greater sensitivity of photograph grading to subtle abnormalities and the larger area of retina available to clinical examination. These causes of disagreement have been recognized previously.9 Although prior PRP was listed in the written definition of PDR provided to clinicians participating in this clinical trial, in 6 of the 12 disagreements in which NPDR was recorded clinically and PDR by Reading Center grading, scars of prior PRP had apparently been ignored in answering the retinopathy severity question. Other investigators have also found that prior photocoagulation causes confusion in how to classify eyes for diabetic retinopathy.16 These findings suggest that it may be helpful to emphasize such grading criteria to clinicians in future studies.
In 6 eyes, the disagreements would have been of potential substantial clinical importance had the patients not been under regular ophthalmologic follow-up. Of these 6, 5 eyes had very subtle new vessels and one had a small preretinal hemorrhage, identified only in the photographs.
Limitations of this analysis include the restricted range of diabetic retinopathy included in our sample (center-involved DME was required and anticipated need for PRP in the 6 months following enrollment was an exclusion), the pooling of a broad portion of the NPDR severity range into one category, and the relatively small number of study participants. In addition, no specific manual of definitions or direction towards certain criteria for evaluation was provided to the clinicians; it is unknown whether provision to clinicians of a defined set of criteria for evaluation would have affected the agreement between clinician and Reading Center evaluations.
In summary, comparison of clinical and photographic assessments of retinopathy severity showed moderate agreement on a 4-step scale that pooled a majority of cases in a single broad step of the scale. While this 4-step scale was deemed sufficient for the needs of a trial managing DME—that is, to classify baseline severity in order to check for balance among study groups and any treatment interaction of the diabetic retinopathy severity level on outcomes—it may be insufficient for determining precise retinopathy severity levels and for trials in which the presence of early PDR is an eligibility or outcome variable.
DRCR.net investigator financial disclosures are posted at www.drcr.net.
There are no conflicts of interest.
An address for reprints will not be provided.
*The most recently published list of the Diabetic Retinopathy Clinical Research Network investigators and staff participating in this protocol can be found in Ophthalmology 2007;114:1860-1867, with a current list available at www.drcr.net. Supported through a cooperative agreement from the National Eye Institute EY14231, EY14269, EY14229 and the Juvenile Diabetes Research Foundation, International (New York, NY).