To assess the sensitivities and false detection rates of two CADe systems when applied to digital or screen-film mammograms in detecting the known breast cancer cases from the DMIST breast cancer screening population.
Available screen-film and digital mammograms of 161 breast cancer cases from DMIST were analyzed by two CADe systems, iCAD SecondLook (iCAD) and R2 ImageChecker (R2). Three experienced breast imaging radiologists reviewed the CADe marks generated for each available cancer case, recording the number and locations of CADe marks and whether each CADe mark location corresponded with the known location of the cancer.
For the 161 cancer cases included in this study, the sensitivities of the DMIST readers without CADe were 0.43 (69/161, 95% CI 0.35 to 0.51) for digital and 0.41 (66/161, 95% CI 0.33 to 0.49) for screen-film mammography. The sensitivities of iCAD were 0.74 (119/161, 95% CI 0.66 to 0.81) for digital and 0.69 (111/161, 95% CI 0.61 to 0.76) for screen-film mammography, both significantly higher than the DMIST study sensitivities (p < 0.0001 for both). The average number of false CADe marks per case for iCAD was 2.57 (SD 1.92) for digital and 3.06 (SD 1.72) for screen-film mammography. The sensitivities of R2 were 0.74 (119/161, 95% CI 0.66 to 0.81) for digital and 0.60 (97/161, 95% CI 0.52 to 0.68) for screen-film mammography, both significantly higher than the DMIST study sensitivities (p < 0.0001 for both). The average number of false CADe marks per case for R2 was 2.07 (SD 1.57) for digital and 1.52 (SD 1.45) for screen-film mammography.
Our results suggest the use of CADe in interpretation of digital and screen-film mammograms could lead to improvements in cancer detection.
In the Digital Mammographic Imaging Screening Trial (DMIST), the reported sensitivities for both digital and screen-film mammography were 0.41, based on 455 days of follow-up.1 While low, similar sensitivities for digital and screen-film mammography have been reported previously, ranging from 0.40 to 0.77 for digital mammography and from 0.31 to 0.71 for screen-film mammography in recently published results from six other controlled laboratory studies.1-13 In contrast, the National Breast Cancer Surveillance Consortium (NBCSC) reported an average screening sensitivity of 81.2%, based on 365 days of follow-up,3 in an analysis of screening mammography records for over 3.8 million women between 1996 and 2007.14 The wide variation in radiologist sensitivities reported across these studies likely depends not only on the skills of the participating radiologists, but also on the image case sets, the number of rounds of screening, and the definitions of sensitivity used.15-18 Even so, there is a need for tools that improve the detection of breast cancer; hence the impetus for computer-aided detection software development. We use the abbreviation CADe to refer to computer-aided detection software, as opposed to the more ubiquitous CAD, which has been used to reference both computer-aided detection algorithms and computer-aided diagnostic algorithms. While the former is designed to prompt the radiologist to regions of a mammogram that may have been overlooked, the latter, now frequently referred to as CADx in the literature, is designed to aid radiologists in distinguishing cancerous lesions from non-cancerous lesions.
It has been noted previously that the cases used to train and ultimately test a CADe system are the primary drivers of how well that technology will perform in cancer detection.19
While CADe publications have gone into great detail about the types and sizes of lesions included, the complexity of the dataset is hard to determine simply from these metrics. CADe systems were designed specifically to reduce the number of cancers missed by radiologists by marking suspicious regions of interest, prompting radiologists to inspect the annotated region. While the gold standard methodology for evaluation of a CADe system should involve radiologists reviewing clinical cases without and then with CADe, it is also useful to assess the cancer detection capability of the CADe system alone, without a radiologist reader. The study objectives detailed in this paper were: i) to estimate the sensitivity of each CADe system to known cancers when applied to digital and screen-film mammography; ii) to compare the standalone sensitivities of each CADe system to the original DMIST study sensitivities for digital and screen-film interpretation; iii) to report the average number of false positive marks per case for each CADe system; and, iv) to estimate the effect of tumor and patient characteristics on sensitivity.
Appropriate institutional review board approval was obtained prior to conducting this study. Informed consent and HIPAA consent were obtained from all evaluable subjects prior to image acquisition for DMIST. All data were handled in a HIPAA-compliant manner. Digital and screen-film mammograms of the available DMIST pathologically proven cancer cases, i.e., those detected within 455 days of acquisition of the paired digital and screen-film mammograms, were included in this study. The screen-film and digital mammograms were obtained from the American College of Radiology Imaging Network (ACRIN) image archive. A total of 329 case pairs, including both digital and screen-film images, were available for inclusion in this study out of a total of 335 DMIST cancer cases.
CADe systems for screen-film and digital mammography from two manufacturers were tested in this study: iCAD SecondLook v1.4 (digital and screen-film), R2 ImageChecker Cenova v1.0 (digital), and R2 ImageChecker v8.0 (screen-film). The results from the default CADe sensitivity settings for each system are reported as the main results. In addition to the default setting for R2 screen-film, two higher CADe sensitivity settings were applied to a single digitized film set; those results are included in the appendix. While any film mammogram can be processed using any film CADe system, not all digital mammograms are suitable for processing with all digital CADe systems. At the time of this study, the iCAD SecondLook v1.4 system was compatible with images from the Fuji CR, Hologic Selenia, and GE Senographe 2000D systems. The R2 ImageChecker Cenova system was capable of analyzing images from the Fischer SenoScan, Fuji CR, Hologic Selenia, and GE Senographe 2000D digital mammography systems. The maximum number of cases available for each digital CADe system was limited by these compatibility constraints: in total, 245 cases were available for the iCAD digital system and all 329 cases for the R2 digital system.
De-identified screen-film mammograms were digitized using the manufacturer-specified digitizers, and CADe reports were printed to paper. Twenty-seven of the 329 available screen-film mammograms (8.2%) were rejected by the iCAD screen-film digitizer and therefore could not be digitized, leaving 302 screen-film cases analyzed by iCAD. Twenty-six of the 329 available screen-film mammograms (7.9%) were rejected by the R2 film digitizer and therefore could not be digitized, leaving 303 screen-film cases analyzed by R2.
Digital mammograms were processed with CADe and displayed on manufacturer specific mammography workstations. The application of CADe algorithms was not successful for all digital images. In total, there were 179 evaluable digital cases for iCAD and 227 digital cases for R2. The CADe-generated electronic reports were displayed on a Hologic Secureview 6.0 workstation for R2 (Figure 1) and a Sectra IDS5.MX review workstation for iCAD (Figure 2).
There were 161 common cases fully evaluated by both vendors' systems on both screen-film and digital mammograms. We focused our data analysis on these common cases; their patient and lesion characteristics are presented in Table 1.
A radiologist with 26 years of mammography experience annotated acetate overlays with the screen-film mammograms for each case (Figure 3), denoting the locations of known cancers as seen on DMIST images and/or validated by DMIST pathology reports. The annotating radiologist specified whether cancer was visible for each modality.
Three different radiologists reviewed the generated CADe marks and compared them to the known cancer locations depicted on the overlays. All three readers had completed breast imaging fellowships and had an average of 4 years (range 2-8 years) of experience in mammography. An average of 48.3% (range 25%-70%) of their time was spent interpreting screening mammograms in their respective clinical practices. One radiologist scored the iCAD digital cases, a second radiologist scored the R2 digital cases, and a third radiologist scored all digitized screen-film CADe results for both R2 and iCAD. The radiologists recorded the number of CADe marks generated and the location of each mark, and determined whether each mark's location corresponded to a known cancer as recorded on pathology reports and the case overlay. Marks that did not correspond to a known cancer location were recorded as false positives.
All statistics were performed at the case level to allow for comparison with DMIST results, addressing the four study objectives stated above. For the first objective, sensitivity and the associated 95% exact confidence interval for each CADe system were calculated, and McNemar's test was used to test the difference between digital and screen-film within each CADe system. McNemar's test and 95% exact confidence intervals were also used for the second objective. For the third objective, the mean, standard deviation, median, and number of false positive marks per case were calculated, and paired t-tests were used to test the difference between digital and screen-film within each CADe system. For the fourth objective, sensitivities for subsets defined by age, breast density, menopausal status, cancer histology, tumor size, and lesion type were calculated by simple counts as well as by univariate logistic regression-based estimates. The 95% confidence intervals and p-values for comparing covariate effects on sensitivity were drawn from the same logistic regression models.
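The paired comparisons described here use McNemar's test, which depends only on the two discordant counts (cases detected by one method but not the other). The sketch below is a minimal standard-library implementation of the exact (binomial) version; the discordant counts in the example are purely hypothetical, since the per-case pairings are not tabulated in this paper.

```python
from math import comb

def mcnemar_exact(b, c):
    """Exact (binomial) McNemar test: b and c are the two discordant
    counts; under the null hypothesis, b ~ Binomial(b + c, 0.5)."""
    n, k = b + c, min(b, c)
    # Two-sided p-value: twice the smaller binomial tail, capped at 1.
    p = 2 * sum(comb(n, i) * 0.5 ** n for i in range(k + 1))
    return min(p, 1.0)

# Hypothetical example: 56 cases marked by CADe only vs. 6 cases found
# by the reader only would be an overwhelmingly significant difference.
print(mcnemar_exact(56, 6))
```

The exact version is preferable to the chi-square approximation when discordant counts are small, as can happen in the subgroup analyses.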
We present the results for the default CADe sensitivity settings for screen-film and digital for iCAD and R2 for the 161 common cancer cases.
There were 161 common cases across both screen-film and digital modalities for each CADe vendor, allowing for modality comparison within each vendor. The overall sensitivity of iCAD digital was 119/161 (0.74) [95% CI: (0.66, 0.81)], versus 111/161 (0.69) [95% CI: (0.61, 0.76)] for iCAD screen-film. This difference in sensitivity of 0.05 [95% CI: (−0.03, 0.13)] was not statistically significant (p=0.26). The average number of false CADe marks per case for iCAD digital (2.57±1.92) was significantly lower than that for iCAD screen-film (3.06±1.72) (p=0.0008). The overall sensitivity of R2 digital was 119/161 (0.74) [95% CI: (0.66, 0.81)], versus 97/161 (0.60) [95% CI: (0.52, 0.68)] for R2 screen-film. This difference in sensitivity of 0.14 [95% CI: (0.05, 0.23)] was statistically significant (p=0.003). The average number of false CADe marks per case for R2 digital (2.07±1.57) was significantly higher than for R2 screen-film (1.52±1.45) (p<0.0001) (Table 2).
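The exact binomial (Clopper-Pearson) confidence intervals quoted here can be recomputed without a statistics package by inverting the binomial CDF with a simple bisection; the sketch below reproduces, for example, the interval for 119 of 161 cases detected.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

def clopper_pearson(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) CI for a binomial proportion, by bisection.
    The lower bound L solves P(X >= k | L) = alpha/2; the upper bound U
    solves P(X <= k | U) = alpha/2."""
    def solve(too_small):
        lo, hi = 0.0, 1.0
        for _ in range(60):  # bisect to ~1e-18 precision
            mid = (lo + hi) / 2
            if too_small(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    lower = 0.0 if k == 0 else solve(lambda p: 1 - binom_cdf(k - 1, n, p) < alpha / 2)
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p) > alpha / 2)
    return lower, upper

lo, hi = clopper_pearson(119, 161)  # digital CADe: 119 of 161 cases detected
print(f"sensitivity {119 / 161:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```

Rounded to two decimals this matches the (0.66, 0.81) interval reported above.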
In standalone mode, each CADe system, for both screen-film and digital mammography, detected significantly more cancer cases than were detected by the original clinical radiologist readers in DMIST for the same cases. For the 161 common cases, the original DMIST study had a sensitivity of 0.43 (69/161, 95% CI 0.35 to 0.51) for digital and 0.41 (66/161, 95% CI 0.33 to 0.49) for screen-film. For digital mammography, both CADe systems' sensitivities (0.74) were significantly higher than the DMIST sensitivity (p < 0.0001). For screen-film, the sensitivities of iCAD (0.69) and R2 (0.60) were both significantly higher than the DMIST sensitivity (p < 0.0001 for both) (Table 3).
In addition, the CADe systems detected cases that were not detected by any reader in the DMIST study. Of the 46 of 161 cases detected by neither the screen-film nor the digital readers in the primary DMIST study, 25 (54.3%) were marked by the iCAD system when applied to digital and 18 (39.1%) when applied to screen-film. The comparable numbers for the R2 system were 27 (58.7%) for digital and 12 (26.1%) for screen-film.
Analysis of CADe sensitivity by lesion histology (DCIS or invasive), size, and type (mass, calcification, asymmetric density, and architectural distortion), and by subject age, breast density, and menopausal status, was performed (Table 4), with subgroups compared using univariate logistic regression models (Table 5). CADe sensitivity did not appear to depend on subject characteristics, although the sensitivity of R2 screen-film was lower for pre-menopausal women (49%) than for post-menopausal women (64%); the difference was not significant (p=0.09). Lesion histology and size did not significantly influence sensitivity, but lesion type appeared to have some effect. In general, CADe had the highest sensitivity for calcifications (83% for both iCAD and R2 digital, 73% for both iCAD and R2 screen-film), but R2 screen-film had a sensitivity of only 55% for masses, significantly lower than the 73% for calcifications on the same system (p = 0.038). In addition, asymmetric density had a low sensitivity of 50% on both iCAD and R2 digital, significantly lower than the 83% for calcifications (p = 0.017).
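For a binary covariate, the univariate logistic-regression odds ratio reduces to the 2×2 sample odds ratio, with a Wald interval on the log-odds scale. As a check, the sketch below reproduces the default-setting DCIS-versus-invasive comparison from the appendix counts (50/88 DCIS and 126/214 invasive cases detected); the paper's p-values may come from the fitted model rather than this closed form, but they agree to two decimals here.

```python
from math import erfc, exp, log, sqrt

def univariate_or(a, b, c, d, z=1.96):
    """Univariate logistic-regression OR for a binary covariate, computed
    as the 2x2 sample odds ratio. a/b = detected/missed in the comparison
    group, c/d = detected/missed in the reference group. Returns the OR,
    the Wald 95% CI, and the two-sided Wald p-value."""
    or_ = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lo, hi = exp(log(or_) - z * se), exp(log(or_) + z * se)
    p = erfc(abs(log(or_)) / se / sqrt(2))    # 2 * (1 - Phi(|z|))
    return or_, lo, hi, p

# Invasive (126 detected, 88 missed) vs. DCIS (50 detected, 38 missed):
or_, lo, hi, p = univariate_or(126, 214 - 126, 50, 88 - 50)
print(f"OR {or_:.2f}, 95% CI ({lo:.2f}, {hi:.2f}), p {p:.2f}")
```

This yields OR 1.09 with 95% CI (0.66, 1.80), matching the corresponding appendix odds-ratio entry.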
In this study, we evaluated the standalone sensitivity of commercial CADe systems to breast cancers that were found at screening or follow-up in the DMIST study for both digital and screen-film mammography. We found that the CADe systems tested were able to detect significantly more cancers than were found on initial screening by the original radiologists in DMIST for both digital and screen-film mammography; that for each manufacturer, standalone sensitivity was higher for digital than for screen-film; and that the standalone sensitivities of the CADe systems were not influenced by lesion characteristics (histology, lesion size, lesion type) or subject characteristics (age, breast density, menopausal status).
The overall average standalone CADe sensitivities reported here, ranging from 0.60 to 0.74, are low compared to other retrospective standalone CADe studies. Bolivar et al reported a digital R2 CADe sensitivity of 93% for a study that included cancers seen on both MLO and CC views.20 The et al reported iCAD CADe sensitivity of 94% for cancers seen in at least one screening view.21 Yang et al reported a sensitivity of 96.1% for cases that included only a single cancer per subject detected by radiologists on digital mammograms.22 Our study differed in that we included mammographically occult lesions. Besides mammographic visibility of lesions, the characteristics of lesions and subjects have been shown to impact radiologist sensitivity. In our study, there was no statistical difference in CADe sensitivity based on breast density, lesion type, cancer histology, subject age, tumor size, and menopausal status. The et al. reported no statistical difference in CADe sensitivity for their study based on histopathology or tumor size.21 Bolivar et al reported no difference in digital CADe sensitivity based on breast density, lesion type, or histopathology, but they did find a significant difference based on tumor size for masses.20
One unique benefit of using the DMIST dataset for CADe evaluation was the ability to assess the performance of CADe algorithms for both digital and screen-film mammography. In our study, we used a single CADe system for both digital and screen-film with iCAD, finding no difference in standalone CADe sensitivities between iCAD digital and screen-film (p=0.26). For R2 digital and screen-film cases, two different versions of R2 CADe algorithms were used. There was a significantly higher standalone CADe sensitivity for R2 digital than for R2 screen-film (p=0.003).
There were a few limitations to this study. First, the methodology for scoring CADe mark localization of known cancer locations could have introduced bias. Three different radiologists conducted the scoring for screen-film (both iCAD and R2), iCAD digital, and R2 digital. While we attempted to minimize this potential bias by providing the readers with annotated overlays, generated by a fourth radiologist, that directly matched the films, the radiologists who scored the digital CADe results had to use their judgment as to whether a CADe mark shown on the computer screen overlapped the annotation on the overlay. Second, because our study included only cancer cases, we could not assess specificity. While higher sensitivity was realized under certain conditions, the impact on specificity is unknown. We are completing analysis of a retrospective radiologist reader study that will allow us to assess the impact of CADe on radiologist sensitivity and specificity with digital mammography. Third, the results reflect the performance of CADe at a given point in time: the algorithms tested were current when this study was performed, in the 2008-2009 timeframe, and newer versions available today might yield different results. For reference, we provide results from R2 CADe at different sensitivity settings in the appendix. Standalone CADe sensitivity assessment is a critical first step in determining whether a CADe algorithm is ready for clinical assessment. Our results show that CADe marked between 26.1% and 58.7% of the cancers that were detected by neither human reader, suggesting that the use of CADe with both screen-film and digital mammography could lead to improved breast cancer detection.
The following tables represent the full R2 screen-film dataset including all three sensitivity settings tested. R2-1 is the default sensitivity setting, with R2-2 having higher sensitivity than the default setting, and R2-3 having the highest sensitivity setting allowed by the system.
|Characteristic|R2-1, n (%)|R2-2, n (%)|R2-3, n (%)|
|AGE (years)|
|Age<50|64 (21.1)|65 (21.7)|61 (21.0)|
|50≤Age≤64|150 (49.5)|147 (49.0)|145 (49.8)|
|Age≥65|89 (29.4)|88 (29.3)|85 (29.1)|
|BREAST DENSITY|
|Dense Breasts|150 (49.5)|148 (49.3)|146 (50.2)|
|Fatty Breasts|153 (50.5)|152 (50.7)|145 (49.8)|
|MENOPAUSAL STATUS|
|Pre- or Peri-Menopausal|90 (30.1)|91 (30.7)|87 (30.3)|
|Post-Menopausal|209 (69.9)|205 (69.3)|200 (69.7)|
|HISTOLOGY|
|DCIS|88 (29.1)|86 (28.8)|83 (28.6)|
|Invasive|214 (70.9)|213 (71.2)|207 (71.4)|
|TUMOR SIZE (mm)|
|Size≤5|43 (19.7)|42 (19.4)|39 (18.6)|
|5<Size≤10|56 (25.7)|56 (25.9)|55 (26.2)|
|Size>10|119 (54.6)|118 (54.6)|116 (55.2)|
|LESION TYPE|
|Mass|129 (48.9)|127 (48.7)|123 (48.2)|
|Asym Density|21 (8.0)|21 (8.0)|21 (8.2)|
|Calcification|99 (37.5)|98 (37.5)|96 (37.6)|
|Arch Distortion|15 (5.7)|15 (5.7)|15 (5.9)|
n = number of correctly detected cancers
N = total number of cancer cases analyzable for each CADe system
95% CI = 95% exact confidence interval
SD = standard deviation
|Subgroup|R2-1 Sensitivity (n/N)|R2-1 95% CI|R2-2 Sensitivity (n/N)|R2-2 95% CI|R2-3 Sensitivity (n/N)|R2-3 95% CI|
|AGE (years)|
|Age<50|0.53 (34/64)|(0.41, 0.65)|0.58 (38/65)|(0.46, 0.70)|0.59 (36/61)|(0.46, 0.71)|
|50≤Age≤64|0.59 (88/150)|(0.51, 0.66)|0.63 (93/147)|(0.55, 0.71)|0.64 (93/145)|(0.56, 0.72)|
|Age≥65|0.61 (54/89)|(0.50, 0.70)|0.67 (59/88)|(0.57, 0.76)|0.65 (55/85)|(0.54, 0.74)|
|BREAST DENSITY|
|Dense|0.60 (90/150)|(0.52, 0.68)|0.64 (95/148)|(0.56, 0.71)|0.65 (95/146)|(0.57, 0.72)|
|Fatty|0.56 (86/153)|(0.48, 0.64)|0.63 (95/152)|(0.55, 0.70)|0.61 (89/145)|(0.53, 0.69)|
|MENOPAUSAL STATUS|
|Pre/Peri|0.51 (46/90)|(0.41, 0.61)|0.58 (53/91)|(0.48, 0.68)|0.57 (50/87)|(0.47, 0.67)|
|Post|0.61 (127/209)|(0.54, 0.67)|0.65 (134/205)|(0.59, 0.72)|0.66 (131/200)|(0.59, 0.72)|
|HISTOLOGY|
|DCIS|0.57 (50/88)|(0.46, 0.67)|0.63 (54/86)|(0.52, 0.72)|0.61 (51/83)|(0.51, 0.71)|
|Invasive|0.59 (126/214)|(0.52, 0.65)|0.64 (136/213)|(0.57, 0.70)|0.64 (133/207)|(0.57, 0.70)|
|TUMOR SIZE (mm)|
|Size≤5|0.53 (23/43)|(0.39, 0.68)|0.64 (27/42)|(0.49, 0.77)|0.67 (26/39)|(0.51, 0.80)|
|5<Size≤10|0.48 (27/56)|(0.36, 0.61)|0.55 (31/56)|(0.42, 0.68)|0.56 (31/55)|(0.43, 0.69)|
|Size>10|0.66 (78/119)|(0.57, 0.74)|0.69 (81/118)|(0.60, 0.76)|0.68 (79/116)|(0.59, 0.76)|
|LESION TYPE|
|Mass|0.60 (78/129)|(0.52, 0.69)|0.66 (84/127)|(0.57, 0.74)|0.67 (83/123)|(0.59, 0.75)|
|Asym Density|0.52 (11/21)|(0.32, 0.72)|0.57 (12/21)|(0.36, 0.76)|0.57 (12/21)|(0.36, 0.76)|
|Calcification|0.73 (72/99)|(0.63, 0.81)|0.77 (75/98)|(0.67, 0.84)|0.76 (73/96)|(0.67, 0.84)|
|Arch Distortion|0.53 (8/15)|(0.29, 0.76)|0.67 (10/15)|(0.41, 0.85)|0.67 (10/15)|(0.41, 0.85)|
|Reference|Comparison|R2-1 OR|R2-1 95% CI|R2-1 p-value|R2-2 OR|R2-2 95% CI|R2-2 p-value|R2-3 OR|R2-3 95% CI|R2-3 p-value|
|AGE (years)|
|Age<50|50≤Age≤64|1.25|(0.70, 2.26)|0.4538|1.22|(0.67, 2.22)|0.5071|1.24|(0.67, 2.29)|0.4882|
|Age<50|Age≥65|1.36|(0.71, 2.61)|0.3520|1.45|(0.74, 2.81)|0.2768|1.27|(0.65, 2.51)|0.4844|
|50≤Age≤64|Age≥65|1.09|(0.64, 1.86)|0.7600|1.18|(0.68, 2.06)|0.5575|1.03|(0.59, 1.79)|0.9308|
|BREAST DENSITY|
|Dense|Fatty|0.86|(0.54, 1.35)|0.5038|0.93|(0.58, 1.49)|0.7615|0.85|(0.53, 1.37)|0.5142|
|MENOPAUSAL STATUS|
|Pre/Peri|Post|1.48|(0.90, 2.44)|0.1218|1.35|(0.82, 2.25)|0.2417|1.40|(0.84, 2.35)|0.1961|
|HISTOLOGY|
|DCIS|Invasive|1.09|(0.66, 1.80)|0.7408|1.05|(0.62, 1.76)|0.8629|1.13|(0.67, 1.91)|0.6540|
|TUMOR SIZE (mm)|
|Size≤5|5<Size≤10|0.81|(0.37, 1.79)|0.6031|0.69|(0.30, 1.57)|0.3743|0.65|(0.28, 1.52)|0.3149|
|Size≤5|Size>10|1.65|(0.81, 3.36)|0.1638|1.22|(0.58, 2.55)|0.6048|1.07|(0.49, 2.31)|0.8681|
|5<Size≤10|Size>10|2.04|(1.07, 3.90)|0.0302|1.77|(0.92, 3.40)|0.0889|1.65|(0.85, 3.20)|0.1359|
|LESION TYPE|
|Mass|Asym Density|0.72|(0.28, 1.82)|0.4855|0.68|(0.27, 1.75)|0.4252|0.64|(0.25, 1.65)|0.3578|
|Mass|Calcification|1.74|(0.99, 3.07)|0.0542|1.67|(0.92, 3.02)|0.0912|1.53|(0.84, 2.79)|0.1663|
|Mass|Arch Distortion|0.75|(0.26, 2.19)|0.5949|1.02|(0.33, 3.18)|0.9676|0.96|(0.31, 3.01)|0.9494|
|Asym Density|Calcification|2.42|(0.92, 6.35)|0.0718|2.45|(0.92, 6.53)|0.0744|2.38|(0.89, 6.36)|0.0838|
|Asym Density|Arch Distortion|1.04|(0.28, 3.92)|0.9550|1.50|(0.38, 5.95)|0.5641|1.50|(0.38, 5.95)|0.5640|
|Calcification|Arch Distortion|0.43|(0.14, 1.30)|0.1335|0.61|(0.19, 1.98)|0.4133|0.63|(0.20, 2.03)|0.4398|
OR = odds ratio
Elodia B. Cole, Medical University of South Carolina Department of Radiology and Radiological Sciences 96 Jonathan Lucas St., MSC 323 Charleston, SC 29425.
Zheng Zhang, Brown University Center for Statistical Science Box G-S121-7 121 S. Main St. Providence, RI 02912.
Helga S. Marques, Brown University Center for Statistical Science Box G-S121-7 121 S. Main St. Providence, RI 02912.
Robert M. Nishikawa, Department of Radiology and the Committee on Medical Physics The University of Chicago 5841 South Maryland Ave., MC2026 Chicago, IL 60637.
R. Edward Hendrick, University of Colorado – Denver, School of Medicine Department of Radiology, C278 12700 E. 19th Ave. Aurora, CO 80045.
Martin J. Yaffe, Sunnybrook Health Sciences Centre 2075 Bayview Ave., S-Wing, Room S657 Toronto, ON M4N 3M5 CANADA.
Wittaya Padungchaichote, Lopburi Cancer Center Lopburi 15000, Thailand.
Cherie Kuzmiak, University of North Carolina Department of Radiology CB #5120 170 Manning Dr. Chapel Hill, NC 27599.
Jatuporn Chayakulkheeree, Memorial Hospital Chulalongkorn University Bangkok 10330, Thailand.
Emily F. Conant, University of Pennsylvania Department of Radiology/1 Silverstein 3400 Spruce St. Philadelphia, PA 19104.
Laurie Fajardo, Department of Radiology Room 3970 JPP University of Iowa Hospitals and Clinics 200 Hawkins Dr. Iowa City, IA 52242.
Janet Baum, Cambridge Health Alliance Dept of Radiology 1493 Cambridge St Cambridge, MA 02139.
Constantine Gatsonis, Brown University Center for Statistical Science Box G-S121-7 121 S. Main St. Providence, RI 02912.
Etta Pisano, Medical University of South Carolina Department of Radiology and Radiological Sciences 96 Jonathan Lucas St., MSC 323 Charleston, SC 29425.