This project evaluated a human visual system model (JNDmetrix) based on just noticeable difference (JND) and frequency-channel vision-modeling principles to assess whether a cathode ray tube (CRT) or a liquid crystal display (LCD) monochrome monitor would yield better observer performance in radiographic interpretation. Key physical characteristics of the CRT and LCD, such as veiling glare and modulation transfer function (MTF), were measured. Regions of interest from mammographic images with masses of different contrast levels were shown once on each display to six radiologists using a counterbalanced presentation order. The images were also analyzed using the JNDmetrix model. Performance as measured by receiver operating characteristic (ROC) analysis was significantly better overall on the LCD (P = 0.0120). The JNDmetrix model predicted the result (P = 0.0046), and the correlation between human and computer observers was high (r² (quadratic) = 0.997). The results suggest that observer performance with LCDs is superior to that with CRTs, at least for on-axis viewing.
Although PACS (picture archiving and communication systems) and teleradiology have changed the practice of radiology quite dramatically in recent years, many changes are still taking place in the digital reading room.1,2,3 One aspect that seems to evolve continuously is the display.4,5 Although early investigations tended to focus on how many monitors were required in the digital reading environment and what resolution was required,5 a critical issue today is whether flat-panel liquid crystal displays (LCDs) are more suitable for primary diagnostic reading than the more traditional CRT (cathode ray tube) display.6,7 In many respects, the LCD has performance characteristics that exceed those of the CRT.8,9 The MTF (modulation transfer function) of an LCD is essentially isotropic, whereas the MTF of a CRT is non-isotropic. At the highest digitally addressable spatial frequency along the CRT scanline in the horizontal direction, the MTF value is usually only 10% to 20%. In the vertical direction the MTF at twice the line frequency is closer to 30% to 40%.8,9 The result is that, for the CRT, more than half the contrast modulation is lost at the highest spatial frequencies, whereas this degradation does not occur with the LCD. Practically speaking, this could mean degradation of high-frequency features in radiographic images, such as microcalcifications in mammograms, resulting in reduced detection performance.
Veiling glare, defined as the diffuse spreading of light or scattering within various parts of a display device,9,10 is also significantly different for the LCD compared to the CRT. In the CRT monitor, the thick faceplate of the vacuum bulb, back-scattering of electrons, and light leakage through aluminum-layer non-uniformities all contribute to veiling glare. This diffuse spreading of light results in a significant degradation of the maximum contrast capabilities of the CRT monitor. The decreased contrast capabilities will especially affect the detection of low-contrast objects in images, such as masses in mammograms. LCDs, on the other hand, tend to have much less (in fact almost negligible) veiling glare than CRTs because there is no need for the vacuum barrier glass faceplate. The protective layer for the LCD TFTs (thin-film transistors) may result in some veiling glare, but not nearly as much as with the CRT display. The question that arises is whether the relatively low veiling glare and isotropic MTF of an LCD degrade observer performance less than the generally higher veiling glare and non-isotropic MTF of a CRT. The goal of the present investigation was to compare performance on an LCD to that on a CRT display using both human and model observers.
We have demonstrated the utility of model observers to predict observer performance in a number of previous experiments aimed at defining the optimal display characteristics for viewing radiographic images in the digital environment.11,12,13 The eventual goal of investigating the validity of such models for predicting human observer performance is to eliminate, or at least reduce, the number of receiver operating characteristic (ROC) studies conducted. ROC studies are the gold standard for measuring observer performance in radiology, but they are time-consuming and require a significant amount of effort on the part of both the investigator and the observer. If we could use model observers to at least narrow the number of relevant parameters to study, we could reduce this experimental burden on observers. Additionally, model observers may help us better understand how the human visual system encodes and processes information during the diagnostic interpretation process.
Two display parameters have been identified as being significantly different for LCDs and CRTs: MTF and veiling glare. For our purposes, MTF was measured via the contrast transfer function (CTF), also known as the square wave response.14 We measured horizontal and vertical CTFs using rectangular fields filled with square-wave contrast patterns. The raw digital image data represent optical density scaled for 12 bits per pixel. Because display functions of softcopy systems are generally expressed as luminance versus digital output, the digital data were converted to luminance values. We measured veiling glare by displaying black disks of varying diameter in the center of the CRT or LCD against a uniform background of maximum luminance. We then used a spot photometer to measure the luminance in the black disk relative to the luminance in the surround. We formed the veiling glare ratio, VG, as:
where Lb is the luminance at the center of the black disk, Lr is the detector dark level, and Ls is the luminance in the surround. We then plotted VG as a function of the diameter of the black disk.
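The equation itself does not appear in this text. As a minimal sketch only, assuming the ratio is formed from dark-level-corrected luminances, VG = (Lb − Lr) / (Ls − Lr), the calculation could look like the following. The formula, the function name, and all photometer readings are illustrative assumptions rather than the authors' definition or data.

```python
def veiling_glare_ratio(l_black, l_surround, l_dark):
    """Dark-corrected ratio of black-disk luminance to surround luminance.

    ASSUMED form: VG = (Lb - Lr) / (Ls - Lr). The text defines Lb, Lr and Ls
    but does not reproduce the equation, so this is one plausible reading.
    """
    denom = l_surround - l_dark
    if denom <= 0:
        raise ValueError("surround luminance must exceed the detector dark level")
    return (l_black - l_dark) / denom


# Hypothetical spot-photometer readings (cd/m^2), keyed by disk diameter in mm.
readings = {5: 6.0, 10: 3.5, 20: 1.5}
L_SURROUND = 500.0  # maximum-luminance background (hypothetical reading)
L_DARK = 0.0        # hypothetical detector dark level

# One VG value per disk diameter, ready to be plotted against diameter.
vg_by_diameter = {d: veiling_glare_ratio(lb, L_SURROUND, L_DARK)
                  for d, lb in sorted(readings.items())}
for d, vg in vg_by_diameter.items():
    print(f"disk {d:>2} mm: VG = {vg:.4f}")
```

Under this reading, a lower VG at a given disk diameter means less light scattered into the black region, which is the direction reported for the LCD.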
The architecture of the JNDmetrix model (JND = just noticeable difference; Sarnoff Corporation, Princeton, NJ)15,16 begins with two paired input images and finishes with a JND map showing the magnitude and spatial location of visible differences between the input images. In the first (optics) stage, the input images are convolved with a function approximating the point spread function of the optics of the eye. Image sampling by the retinal cone mosaic is simulated in the second stage by a Gaussian convolution followed by point sampling. Next, the raw luminance image is converted to units of local contrast and then decomposed into a Laplacian pyramid. The result is seven frequency bandpass levels. Each pyramid level is convolved with eight pairs of spatially oriented filters with bandwidths derived from psychophysical data. After oriented filtering, the pairs of filter output images are squared and summed, yielding a phase-independent energy response. This mimics a widely proposed transformation in the mammalian visual cortex from a linear response among simple cells to an energy response among complex cells. The phase independence from this operation has some useful properties; e.g., it makes the model less sensitive to the exact position of edges, a property exhibited in human psychophysical performance as well.
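The squaring-and-summing step can be illustrated with a toy one-dimensional example. This is a sketch under simplifying assumptions: the model's oriented two-dimensional filters are replaced here by a single global cosine/sine pair, and the function names and grating parameters are hypothetical.

```python
import math


def grating(n, cycles, phase):
    """Sampled sinusoidal grating with `cycles` full periods over n samples."""
    return [math.sin(2 * math.pi * cycles * i / n + phase) for i in range(n)]


def quadrature_responses(signal, cycles):
    """Linear responses of an even (cosine) and an odd (sine) filter."""
    n = len(signal)
    even = sum(s * math.cos(2 * math.pi * cycles * i / n)
               for i, s in enumerate(signal))
    odd = sum(s * math.sin(2 * math.pi * cycles * i / n)
              for i, s in enumerate(signal))
    return even, odd


def quadrature_energy(signal, cycles):
    """Square and sum the filter pair: the phase-independent energy response."""
    even, odd = quadrature_responses(signal, cycles)
    return even ** 2 + odd ** 2


phases = (0.0, 0.7, 1.5, 3.0)
even_only = [quadrature_responses(grating(64, 4, p), 4)[0] for p in phases]
energies = [quadrature_energy(grating(64, 4, p), 4) for p in phases]
print("even filter alone:", [round(e, 2) for e in even_only])
print("energy:           ", [round(e, 2) for e in energies])
```

The even filter's linear response changes as the grating shifts, while the summed energy stays constant across phases, which is the property that makes the model tolerant of small edge displacements.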
At the next “transducer” stage, the energy measure for each pyramid level is normalized by a value approximating the square of the grating contrast detection threshold for that pyramid level and local luminance; then the normalized energy measure is transformed by a sigmoid non-linearity to reproduce the dipper shape of visual contrast discrimination functions. This takes into account nonlinear masking effects that make features less detectable on a nonuniform background. To account for characteristics of human foveal sensitivity, the model includes a pooling stage that averages transducer outputs over a small neighborhood by convolving with a disc-shaped kernel. After this, the model output for each spatial position is essentially an m-dimensional vector, where m is the number of pyramid levels times the number of orientations. The distance between the vectors for the two inputs is computed in the “distance metric.” A spatial map of JND values results, representing the degree of discriminability between the two images. This JND map can be reduced to a single, aggregate value by Minkowski normalization (Qnorm). The map derived is useful because it shows the magnitude and the position of noticeable differences between the two images. Also, all image differences are rendered in the same units, and thus they are quantitatively comparable.
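The reduction of a JND map to one aggregate number can be sketched as a Minkowski (power-mean) pooling. The exponent, the use of a mean rather than a sum, and the example map are assumptions made for illustration; the source does not state the values used by JNDmetrix.

```python
def minkowski_pool(jnd_map, exponent=4.0):
    """Collapse a 2-D JND map to one value: (mean(|v|^q))^(1/q).

    The exponent q is a hypothetical choice. q = 1 gives the plain mean;
    larger q weights the strongest local differences more heavily.
    """
    values = [abs(v) for row in jnd_map for v in row]
    if not values:
        raise ValueError("empty JND map")
    mean_power = sum(v ** exponent for v in values) / len(values)
    return mean_power ** (1.0 / exponent)


# Hypothetical 3 x 3 JND map with one localized visible difference.
jnd_map = [
    [0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 0.0],
]
mean_jnd = minkowski_pool(jnd_map, exponent=1.0)
pooled_jnd = minkowski_pool(jnd_map, exponent=4.0)
print(f"mean = {mean_jnd:.3f}, pooled (q=4) = {pooled_jnd:.3f}")
```

With a higher exponent the single localized difference dominates the aggregate, so a small but clearly visible lesion is not averaged away by the surrounding unchanged region.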
A series of 40 breast images with malignant masses and 40 images with benign masses were downloaded from the database for Screening Mammography.17 These base images were used to generate a larger series of images. Regions of interest of 512 × 512 pixels around each mass were extracted from the original images. The masses were removed via digital processing to generate signal-absent versions of the images. Lower contrast (75%, 50%, and 25%) versions of the masses were created by weighted superposition of the signal-present and signal-absent image versions. The final test set therefore contained five versions of each of the 80 base images (100% or original contrast, 75%, 50%, 25%, and 0% or signal absent), for a total of 400 images. The images were then downsampled to 256 × 256 for input as region-of-interest images into the visual system model. To capture exactly what the human observer sees on the monitors, we captured each displayed image with a high-performance CCD camera. The camera enables linear oversampling in which 4 × 4 = 16 CCD pixels are sampled for each monitor pixel. This generates high-fidelity images that cannot be differentiated visually from those displayed on the monitor.
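The weighted superposition used to create the reduced-contrast masses can be sketched as a per-pixel linear blend. This assumes the weight applied to the signal-present image equals the stated contrast fraction; the function name and the tiny pixel patches are hypothetical.

```python
def blend_contrast(present, absent, weight):
    """Blend signal-present and signal-absent images pixel by pixel.

    weight = 1.0 reproduces the original mass, weight = 0.0 the
    signal-absent image; intermediate weights scale the mass contrast.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [[weight * p + (1.0 - weight) * a for p, a in zip(row_p, row_a)]
            for row_p, row_a in zip(present, absent)]


# Hypothetical 2 x 2 patches: the "mass" brightens one pixel from 100 to 140.
present = [[100.0, 140.0], [100.0, 100.0]]
absent = [[100.0, 100.0], [100.0, 100.0]]

contrast_levels = (1.0, 0.75, 0.5, 0.25, 0.0)
versions = {w: blend_contrast(present, absent, w) for w in contrast_levels}
for w in contrast_levels:
    print(f"{int(w * 100):>3}% contrast -> mass pixel = {versions[w][0][1]}")

# Five versions of each of the 80 base images give the 400-image test set.
total_images = 80 * len(contrast_levels)
```

Because the blend is linear, the background is identical in every version and only the mass amplitude changes, which keeps contrast level as the single manipulated variable.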
The set of 400 images was shown to six radiologists, once on a high-performance LCD (Dome Ci5, Planar Systems, Beaverton, OR) and once on a high-performance CRT (Siemens SMM 210200P with a P45 phosphor; Siemens Health Services, Erlangen, Germany) monitor using a counterbalanced presentation design. Each presentation used a different image randomization order. A minimum of 2 weeks passed between viewing sessions. Ambient room lights were turned off. The monitors were set to the same luminance range (500 cd/m² maximum, 0.8 cd/m² minimum), had the same resolution (2048 × 2560), and were both calibrated to the DICOM Part 14 Grayscale Standard Display Function. Viewers were seated 27 cm from the display. The images were displayed in the center of each monitor and were viewed on-axis (since the LCD has the potential to change the appearance of images when viewed off-axis). The observers’ task was to examine each image and decide if a mass was present or absent. They then rated their confidence in that decision on a 6-point scale. No image processing (e.g., window/level adjustment) was permitted. Each viewing session lasted approximately one hour. The rating data were analyzed using the multi-reader multi-case receiver operating characteristic technique of Dorfman, Berbaum, and Metz.18 Area under the curve (Az) values were generated for each observer, and the overall differences between conditions were analyzed statistically using analysis of variance (ANOVA) techniques with appropriate post-hoc tests.
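To show how confidence ratings translate into an area under the ROC curve, the sketch below computes the nonparametric (Mann-Whitney) area. This is a deliberately simpler stand-in, not the Dorfman-Berbaum-Metz procedure used in the study, which estimates Az differently. The ratings are hypothetical and assume the decision and confidence have been combined into a single 1 to 6 score where higher means more confident that a mass is present.

```python
def empirical_auc(present_ratings, absent_ratings):
    """Nonparametric ROC area: P(score_present > score_absent), ties count 1/2."""
    if not present_ratings or not absent_ratings:
        raise ValueError("need ratings for both signal-present and signal-absent cases")
    wins = 0.0
    for p in present_ratings:
        for a in absent_ratings:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(present_ratings) * len(absent_ratings))


# Hypothetical ratings from one reader on one display.
present_ratings = [6, 5, 5, 4, 3, 2]  # images containing a mass
absent_ratings = [1, 2, 2, 3, 4, 1]   # signal-absent images

auc = empirical_auc(present_ratings, absent_ratings)
print(f"empirical AUC = {auc:.3f}")
```

An area of 0.5 corresponds to chance performance and 1.0 to perfect separation of signal-present from signal-absent images.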
The MTF and veiling glare measurements of the CRT and LCD monitors are shown in Figures 1 and 2, respectively. It can be seen that the veiling glare for the LCD monitor is much lower than that for the CRT monitor, and that the MTF is essentially the same in both the horizontal and vertical directions for the LCD but quite disparate for the CRT.
The human observer performance results (ROC Az) are shown in Figure 3 for each of the lesion contrast levels used. Overall, performance with the LCD was significantly higher than with the CRT when evaluated with a t-test for paired observations (t = 2.727, df = 23, P = 0.0120). In every contrast condition, performance was higher with the LCD, and the difference reached statistical significance with the 75% and 50% contrast masses (P < 0.05). All six readers performed better at all contrast levels with the LCD than with the CRT. The visual discrimination model (VDM; Fig. 4) predicted the same pattern of results (t = 2.854, df = 638, P = 0.0046) for all lesion contrast levels, with the LCD yielding better discrimination performance than the CRT. The correlation (Fig. 5) between human Az and VDM JND results was high (r² quadratic = 0.997).
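The paired t statistic reported above has the form t = mean(d) / (sd(d) / √n) with df = n − 1, where d is the LCD-minus-CRT difference for each pair. A df of 23 implies 24 pairs, which would be consistent with six readers at four signal-present contrast levels, although the text does not say so explicitly. The sketch below uses four hypothetical Az pairs purely to show the arithmetic; none of the values are from the study.

```python
import math
import statistics


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for matched samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t_value = mean_d / (sd_d / math.sqrt(len(diffs)))
    return t_value, len(diffs) - 1


# Hypothetical Az values for the same reader/contrast pairs on each display.
lcd_az = [0.90, 0.85, 0.80, 0.75]
crt_az = [0.88, 0.80, 0.79, 0.71]

t_value, df = paired_t(lcd_az, crt_az)
print(f"t = {t_value:.3f}, df = {df}")
```

Pairing by reader and contrast level removes between-reader variability from the comparison, so a small but consistent LCD advantage can still reach significance.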
The results of this study demonstrate that performance with images displayed on an LCD monitor with lower veiling glare and isotropic MTF is significantly better than on a CRT monitor with higher veiling glare and non-isotropic MTF. The results apply at all lesion contrast levels, although it was only at the 75% and 50% contrast levels that statistical significance was reached for the human observers. The human and model results closely parallel each other and demonstrate essentially the same pattern of results. In terms of the model, this provides us with further evidence that the JNDmetrix model is a very suitable tool for helping to predict observer performance in radiologic image interpretation tasks as a function of the manipulation of various viewing parameters.
There are a few caveats that need to be mentioned regarding this study and our plans for future work. The first is that the study used only mammographic images with masses, so it is not possible to generalize completely to other types of images and lesions. It is also important to remember that we used only small regions of interest from the original mammograms, which eliminated the need for the observers to engage in any extensive searching. It is difficult to say whether performance would be better or worse if we had used the full images and the observers had to search for the lesions. It is highly likely, however, that although overall performance would decrease (since observers would have to search the entire image and detection errors would therefore be more likely to occur), the same general pattern of results would be observed.
Another main point is that the present study was not able to determine precisely to what degree MTF and veiling glare independently affected performance. Although we controlled for as many other factors as possible (e.g., luminance, resolution), there are still some intrinsic differences between the CRT and LCD that we cannot control. Therefore, our plans for future work will use only a single monitor (CRT or LCD) and develop image-processing methods to simulate veiling glare changes in the display (i.e., create more and less veiling glare effects compared to the nominal veiling glare of the monitor). In this way we can more precisely observe the effects of veiling glare on performance. We have already examined a method for compensating for MTF degradation in CRT displays and found that compensating for the non-isotropic nature of the CRT MTF does indeed improve observer performance.12
It is also important to note that all of the images in this study were displayed at the center of the monitors. Neither the CRT nor the LCD is uniform in its characteristics across the viewing area of the display. For LCDs, this may in fact exacerbate the off-axis viewing problems.19,20 Future studies will determine whether the viewing angle effects differ depending on where the image is displayed on the monitor. Future studies will also use microcalcifications as targets, because they have different physical characteristics and might be affected differently by the display.
The present study suggests that LCDs are likely to be suitable for diagnostic viewing of radiographic images in the digital reading environment. Because of their lower veiling glare and isotropic MTF compared to CRT displays, they may provide an even better viewing medium than the more traditional CRT. The effects of off-axis viewing, however, may represent a barrier to LCD use in the clinical reading room, and this aspect needs further investigation.
This work was supported in part by grant R01 EB002119-04 from the National Institutes of Health (NIH/NIBIB).