Picture archiving and communication systems (PACS) are expected to convert film-based radiology into a computer-based digital environment, with associated cost savings and improved physician communication. The digital workstation will be used by physicians to display these “soft-copy” images; however, difficult technical challenges must be met for the workstation to compete successfully with the familiar viewbox. Issues relating to image perception and the impact on physicians’ practice must be carefully considered. The spatial and contrast resolutions required vary according to imaging modality, type of procedure, and class of user. Rule-based software allows simple physician interaction and speeds image display. A consensus appears to be emerging concerning the requirements for the PACS workstation. Standards such as the American College of Radiology/National Electrical Manufacturers’ Association Digital Imaging and Communication Standard are facilitating commercial applications. Yet much careful study is needed before PACS workstations will be fully integrated into radiology departments.
PICTURE ARCHIVING AND COMMUNICATION SYSTEMS (PACS) are expected eventually to replace most radiographic film1. It is generally accepted that one of the more critical components of a viable PACS is the workstation2,3. This criticality stems from the workstation’s large variance in user requirements and its apparent inadequacy to compete with film as a low-cost display medium. Since many workstations are required, very powerful workstations tend to escalate the cost of a PACS.
Questions regarding the diagnostic efficacy of PACS workstations (video or hard copy) are being studied at a number of institutions 4,5,6. The use of image processing to enhance image information content is also being investigated 7,8,9. Even if diagnostic performance with PACS displays were equal to (or better than) that achieved with film, this would not by itself guarantee that physician performance (and possibly, patient care) will be unchanged.
This article presents the digital workstation from various perspectives: image perception, physics, hardware, software, image processing, and the impact on physicians. The other components of PACS—namely, networks, archives, and acquisition devices—are not included. Image manipulation is discussed, but advanced image processing, including three-dimensional presentation, is excluded.
It is important to recognize that there is a difference between the actual spatial resolution and perceived image sharpness, as well as between contrast rendition and perceived image contrast. This difference originates from the nonlinear processing of retinal signals. Perceived image sharpness depends on the spatial frequency response of the visual system, which peaks at about five cycles per degree and decreases at lower and higher frequencies. Consequently, large uniform structures are seen because of their boundaries, and small structures tend to merge with one another. The teacher tells the student to step back (or use a minifying lens) in order to see the heart on a chest radiograph and to use a magnifying glass in order to see the microcalcifications on a mammogram. An image with intrinsically poor spatial resolution can be made visually sharper by enhancing contrast.
The perceived contrast depends on retinal adaption to the local scene luminance. The visual system is sensitive to contrast over a wide range of luminance (eight or more orders of magnitude), but the actual contrast sensitivity at a fixed mean luminance is fairly narrow10. For example, the intensity of illumination has to be increased for contrast to be seen in the dark regions of film images.
A third important, but often neglected, contributor to perceived sharpness and contrast is the noise in the image display system. The term image clarity has been used to describe the complex interaction between contrast resolution and noise11.
Because of these interactions, it is very difficult to compare displays visually unless the physical parameters are properly set (especially the brightness and contrast). This can be demonstrated easily by putting identical television monitors side by side and manipulating the controls. Image clarity can be increased without affecting the intrinsic spatial resolution.
The fundamental perception of contrast and detail can be studied by using specific test patterns such as disks on uniform backgrounds. Adding pictorial detail complicates matters because higher level pattern recognition functions are involved. A blob that can be seen easily against a uniform background may be invisible when camouflaged by ribs and vessels in a chest image12.
A cross-sectional image of the body is meaningless to someone unfamiliar with anatomy and pathology. Over the years, community standards for medical images have been developed that include standard projections and standards for the gray-scale range. For example, computed tomographic (CT) images of the chest typically include standard bone and lung windows. Physicians have learned to interpret (assign meaning to) these images under the assumption that standards are maintained and that deviations from these standards may signal abnormality. Volatile displays that can be processed on-line have been set adrift from these standards. This becomes a problem when judging relative intensity, for example, whether lungs are over- or underaerated. Changes in the size and the portion of the body included in the image can also be confusing. The effect of these changes on the perception of abnormalities is unknown and requires study, although, as with many other technologic developments, their effect will be known only after they have been put into use.
The following paragraphs describe the difficulties in converting analog film images into digital images for presentation on the PACS workstation. For the existing digital acquisition systems—namely, CT, magnetic resonance (MR) imaging, ultrasound (US), and nuclear medicine—it is advisable that image information be transferred from the host computers to the workstations without alteration. However, even for these modalities, there are significant problems with the presentation of many images per examination and with the large bit depth required for each CT or MR image. The effects on diagnostic accuracy of using digital workstations rather than film for these digital modalities have not yet been thoroughly investigated.
The conversion from an analog image (plain radiographic image) to a digital image involves a potential loss of information, which can degrade image quality. There are two levels of quantization involved: First, the digital image is spatially sampled from the infinitely continuous analog image (pixel size), and second, each pixel is digitized into a finite number of gray levels (bit depth). For example, a 512 × 512 × 8-bit image involves 512² samples of the analog image, and there are 256 gray levels corresponding to each pixel. In simple physical terms, the quantization limits the spatial resolution and contrast resolution of the system, respectively.
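The two quantization steps can be illustrated with a minimal sketch. The function names are hypothetical and the analog intensity is assumed to be normalized to the range 0 to 1; this is an illustration of the arithmetic, not a model of any particular digitizer.

```python
def sample_count(rows, cols):
    """Number of spatial samples (pixels) taken from the analog image."""
    return rows * cols


def gray_levels(bit_depth):
    """Number of distinct gray levels available at a given bit depth."""
    return 2 ** bit_depth


def quantize(value, bit_depth):
    """Map a normalized analog intensity in [0.0, 1.0] to an integer gray level."""
    top = gray_levels(bit_depth) - 1
    clipped = min(max(value, 0.0), 1.0)
    # Scale to the top level and round to the nearest integer level.
    return min(int(clipped * top + 0.5), top)


print(sample_count(512, 512))  # samples in a 512 x 512 image
print(gray_levels(8))          # gray levels at 8 bits
print(quantize(0.5, 8))        # gray level assigned to a mid-scale intensity
```

Any intensity difference smaller than one gray-level step is lost at this stage, which is the sense in which bit depth limits contrast resolution.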
In addition, the spatial sampling will introduce “aliasing” in the image, insofar as high-frequency information will appear at lower frequencies. The importance of assessing how many pixels and bits are necessary for a satisfactory image was realized early, and a number of studies addressing these issues have appeared in the literature. These studies will now be reviewed.
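Before turning to those studies, the aliasing effect can be made concrete with a short sketch. It assumes an idealized sinusoidal pattern and ideal point sampling; the function name and the example pixel size are illustrative only.

```python
def aliased_frequency(f, fs):
    """Apparent frequency after sampling a sinusoid of frequency f at rate fs.

    Frequencies above the Nyquist limit (fs / 2) fold back into the
    range 0 .. fs / 2.
    """
    nyquist = fs / 2.0
    folded = f % fs
    return fs - folded if folded > nyquist else folded


# Assumed example: 0.2-mm pixels give 5 samples/mm, so Nyquist is 2.5 lp/mm.
fs = 5.0
print(aliased_frequency(2.0, fs))  # below Nyquist: reproduced at 2.0 lp/mm
print(aliased_frequency(4.0, fs))  # above Nyquist: appears at 1.0 lp/mm
```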
Spatial resolution has been the subject of much debate in the literature. A consensus appears to be forming, and there was further confirmation at the 1989 RSNA scientific assembly. However, as will be presented, the acceptable resolution will vary substantially based on the modality and clinical question. Much less attention has been given to contrast resolution. Some of the earlier reported concerns with spatial resolution were, most likely, the result of inadequate contrast or poor signal-to-noise ratio (S/N).
The optimal choice of spatial and contrast resolution will depend on whether the input to the workstation was from conventional film or from a digital modality. Most studies have assumed a film-based system, and this approach means that the digital system can do only as well as the conventional system from which it is derived. Advantages of a fully digital system (wide dynamic range, potentially scatter-free images, dual-energy subtraction imaging, etc) are sacrificed or are inapplicable in a film-based system. Compensation for the lost image quality with high image-matrix sizes may be prohibitively expensive, if at all possible. Put in a different manner, it is possible that a fully digital system with a pixel size of 0.2 mm will produce better images than a film-based system with a pixel size of 0.1 mm. Studies of the resolution requirements for diagnostic radiology examinations use the tools of receiver operating characteristic (ROC) analysis13.
Most studies of the pixel requirements have concentrated on the chest examination, since this is both a high-volume examination and one that places stringent demands on image quality. Table 1 provides a summary of these studies 14,15,16,17,18,19,20,21,22,23 related to chest examinations, and Table 2 summarizes results for other procedures as presented in the following paragraphs.
In breast imaging for screening it is important to note various characteristics of microcalcifications for the determination of malignancy or benignity. The former tend to form clusters, are often too numerous to count, and are characterized by rough edges. The resolution limitations are not expected to affect detection of these microcalcifications as much as their classification. This examination places rather severe requirements on spatial resolution and does not require a wide dynamic range, both of which favor conventional high-resolution screen-film imaging. In a study24 in which conventional screen-film mammograms were digitized at 0.1-mm pixels, it was found that even at this small pixel size, there was lower detectability of microcalcifications in the digital hard-copy images relative to the conventional film images.
In a second study25, employing high-resolution (5 line pairs per millimeter [lp/mm]) storage phosphor-plate technology, the conclusions were somewhat different. The study employed a breast phantom model with superimposed calcifications ranging in size from 50 to 800 µm. A total of 60 (30 normal and 30 abnormal) images were obtained with each of the two modalities, and the images were read by four radiologists. In terms of the ROC area, no significant difference was found between the two modalities for detection of the abnormalities, although it was found that significantly more microcalcifications were counted with the conventional film (see Table 2).
In an early study26 it was shown that pixel sizes smaller than 0.25 mm were needed to satisfactorily display skeletal images. In a study of resolution requirements for skeletal radiography27, 40 images with various degrees of subperiosteal resorption and 40 normal images were obtained with an extremity screen-film system at a magnification of two times on 14-inch radiographs. The images were digitized to 4,096 × 4,096 × 12 bits with a laser digitizer (80-µm spot size, spatial resolution of 11.4 lp/mm). The digital image was processed with a contrast-limited adaptive histogram equalization algorithm (CLAHE), which mapped each 12-bit pixel to an 8-bit value. Subsequently, the pixels were averaged to generate lower matrix sizes down to 512 × 512. The digital images were transferred to hard copy with a laser film recorder. The conventional and the computer-processed images were viewed by six radiologists in a standard ROC study. It was found that for matrix sizes of 2,000 × 2,000 (corresponding to a resolution of 5.7 lp/mm) and smaller, the conventional images were significantly better, and the 4,000 × 4,000 matrix (resolution of 11.4 lp/mm) approached the performance of the conventional films.
Here, the type of pathologic condition and the inherently higher contrast are expected to be less demanding on the resolution of the system. A study has been reported28 based on 10 images of double-contrast colon examinations, five of which were normal and five depicted the mucosal changes of subtle inflammatory bowel disease. The images were acquired with a wide-latitude screen-film system and were then digitized to 2,000 × 2,000 × 12 bits with a laser digitizer, yielding a pixel size of 100 µm (for 8 × 10-inch films). Lower resolutions were realized by means of an averaging scheme. The final images were displayed on film by using a 1,000-line laser matrix camera. The images were read by 10 radiologists, six of whom were more experienced in gastrointestinal studies than the others. The results revealed that image quality continued to improve as the pixel size was decreased, but that the experienced radiologists detected the abnormalities even at the lower (coarser) resolutions, and the acceptable resolution was approximately 400 µm, which corresponds to a 512 × 512 matrix for an 8 × 8-inch field. For teaching purposes and for more-experienced radiologists lower resolutions may be sufficient. For detection of subtle abnormalities and for less-experienced radiologists, higher (finer) resolutions may be needed.
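The relation between field size, matrix size, and pixel size quoted in this study can be checked with a short calculation. This is a sketch with hypothetical function names; it assumes square pixels spread uniformly across the stated field dimension.

```python
MM_PER_INCH = 25.4


def pixel_size_um(field_inches, matrix):
    """Pixel size in micrometers when `matrix` pixels span `field_inches`."""
    return field_inches * MM_PER_INCH / matrix * 1000.0


def nyquist_lp_per_mm(pixel_um):
    """Highest spatial frequency (lp/mm) representable at a given pixel size."""
    return 1000.0 / (2.0 * pixel_um)


print(pixel_size_um(8, 2000))    # about 100 um for a 2,000-pixel, 8-inch side
print(pixel_size_um(8, 512))     # about 400 um for a 512-pixel, 8-inch side
print(nyquist_lp_per_mm(100.0))  # sampling limit at 100-um pixels
```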
A study has been reported29 in which photostimulable phosphor-plate technology was used to directly acquire digital images. The study included a rather large case sample (100 patients), and the cases were matched; that is, the same patients were imaged with both the conventional (400 speed system) and the digital techniques. This was possible since acquisition with the phosphor plates resembles the conventional imaging process (a phosphor plate is substituted for the cassette). The pixel size, which was not varied, was in the range of 0.2-0.33 mm for the digital images. The hard-copy digital images were created by using a laser multiformat camera. All images were read by three radiologists.
The conclusion was that diagnosis with the digital images was not significantly different from that with the conventional images, despite the lower resolution (2-3 lp/mm for the digital vs 5 lp/mm for the conventional). This was attributed to the higher contrast sensitivity (10 bits), linear rather than nonlinear response of film, higher inherent contrast, and lower spatial-resolution requirements (some of the images of each patient were tomograms) of this type of examination.
The radiographic image has inherent statistical limitations insofar as it is built up by a finite number of photons30. Consider an area A in the screen, which absorbs a photon fluence of n photons per square millimeter. The number of quanta associated with this area is then nA, and the standard deviation, assuming Poisson statistics, is $\sqrt{nA}$. The S/N is therefore $nA/\sqrt{nA} = \sqrt{nA}$. Clearly the S/N (detectability) decreases for small objects and low fluences. The density fluctuations [σ(D)] corresponding to the photon number fluctuations is given by $\sigma(D) = 0.4343\,\gamma\,(1/\sqrt{nA})$, where γ is the film gamma. Given the noise due to quantum mottle in radiographs, the question arises as to how many bits are needed for a given pixel size (spatial resolution). Intuitively, it would seem that when the gray-level spacing is on the order of the noise σ(D), a point of diminishing returns would be reached, since digitizing with more bits would be essentially digitizing the noise. A mathematical treatment of the problem has been done by several investigators. Kruger et al31 showed that if one digitizes with gray-level spacings that are twice the noise level, the noise level in the final digital image will be 17% greater than if one had digitized with an infinitely fine gray scale (ie, an infinite number of gray levels). To avoid digitization noise they recommend digital gray-level spacing no wider than twice the noise level at any part of the gray scale.
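The quantum-noise relations at the start of this passage can be evaluated numerically. The sketch below uses hypothetical function names, and the fluence, pixel area, and film gamma in the example are assumed values chosen only for illustration. It does not attempt to reproduce the 17% figure of Kruger et al, which depends on their own noise model.

```python
import math


def quantum_snr(n, area):
    """S/N for an area (mm^2) absorbing n photons/mm^2: nA / sqrt(nA) = sqrt(nA)."""
    return math.sqrt(n * area)


def density_noise(n, area, gamma):
    """Density fluctuation sigma(D) = 0.4343 * gamma / sqrt(nA)."""
    return 0.4343 * gamma / math.sqrt(n * area)


def max_gray_spacing(n, area, gamma):
    """Widest gray-level spacing under the 'no wider than twice the noise' rule."""
    return 2.0 * density_noise(n, area, gamma)


# Assumed example values: 1e5 photons/mm^2, a 0.1 x 0.1 mm pixel, gamma of 3.
n, area, gamma = 1.0e5, 0.01, 3.0
print(quantum_snr(n, area))
print(density_noise(n, area, gamma))
print(max_gray_spacing(n, area, gamma))
```

Halving the pixel side quarters the area and therefore halves the S/N, which is why finer sampling gains less contrast information per pixel unless the exposure rises.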
Rimkus and Baily32 performed an interesting analysis of the interplay between patient exposure and contrast resolution in digital radiography. While their calculation is for digital radiography (the analog image is formed on the target of the video camera tube), a similar argument is applicable to an image digitized from a film (instead of the electronic camera noise, one has the electronic noise due to the laser digitizer, and one needs to include film granularity in the expression for the total noise). Their argument is as follows: They define a meaningful gray level as three times the noise level of the pixel. The factor of three is the usual factor by which the signal must exceed the noise in order to be detected. The photon exposure determines the photon component of the pixel noise, and the camera S/N determines the electronic component. The total pixel noise is the quadrature sum of the two random and independent components.
In the limit of very high exposures, the camera noise dominates, and the number of meaningful gray levels approaches (S/N)/3. Thus a camera with an S/N of 1,000:1 is capable of providing only 333 meaningful gray levels, and that too in the limit of very high exposures. For smaller exposures, the meaningful number of gray levels is smaller. For example, to achieve about 256 meaningful gray levels with a 1,000:1 S/N camera requires an entrance exposure of 14 R. Their argument is open to the criticism that one is usually not interested in resolving individual pixels from noise, but rather a collection of pixels that make up whatever object that one is interested in detecting. To our knowledge this method has not been applied to the evaluation of laser digitizers to determine the optimal number of bits per pixel for a given examination.
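The gray-level argument can be sketched numerically as follows. The sketch assumes that both noise components are expressed relative to the full-scale signal, so that each contributes 1/(S/N) to the relative noise; the function names are hypothetical, and the conversion from photon S/N to an entrance exposure requires the authors' detector model, which is not reproduced here.

```python
import math


def meaningful_gray_levels(camera_snr, photon_snr=math.inf):
    """Gray levels separated by three times the total pixel noise.

    The total relative noise is the quadrature sum of the electronic
    (camera) and photon components, each taken as 1 / S/N of full scale.
    """
    relative_noise = math.sqrt(1.0 / camera_snr ** 2 + 1.0 / photon_snr ** 2)
    return 1.0 / (3.0 * relative_noise)


def photon_snr_for_levels(levels, camera_snr):
    """Photon S/N needed to reach a target number of meaningful gray levels."""
    required = 1.0 / (3.0 * levels)            # total relative noise allowed
    photon_sq = required ** 2 - 1.0 / camera_snr ** 2
    if photon_sq <= 0.0:
        raise ValueError("target exceeds the camera-limited maximum")
    return 1.0 / math.sqrt(photon_sq)


print(meaningful_gray_levels(1000))       # high-exposure limit, about (S/N)/3
print(photon_snr_for_levels(256, 1000))   # photon S/N needed for 256 levels
```

Under these assumptions the camera-limited ceiling is about 333 levels, and any target above that ceiling is unreachable regardless of exposure.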
So that information is not lost due to the digitizing, the number of bits should be high enough so that one gray level is smaller than about 2 σ(D). If all the information is in a small density range, the image has a small dynamic range, and clearly the digitizing may be accomplished with a few bits. Since this is rarely true for a clinical image, one needs more bits, and the required contrast resolution will depend on the type of examination. Chest imaging, with an inherently wide dynamic range, is expected to be demanding in this regard. The present consensus is that 8 bits are needed to record the information in the linear part of the sensitometric (Hurter and Driffield [H&D]) curve. With 2 additional bits, for a total of 10 bits, one can record additional information that is present in the image in the nonlinear portion of the H&D curve. For a chest film that is optimally exposed in a patient with relatively normal lungs, 8 bits might be sufficient. But if the exposure is not just right or if the patient has a pneumothorax or large pleural effusion, at least 10 bits are required.
Since the dynamic range of monitors is considerably less than 10 bits, clearly some kind of processing is needed to compress the dynamic range of the x-ray image into the available display dynamic range or to select a portion for viewing at any one time. The simplest processing is window and level controls, but these require substantial user intervention. Alternatively, one can apply histogram equalization to achieve this compression. Which scheme is better is still an open question.
In addition, images altered in this manner may increase false-positive readings, since the expected appearance of normal structures is different. Whether these problems could be overcome with training remains to be determined.
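The two dynamic-range compression approaches named above, window and level controls and histogram equalization, can be sketched in a few lines. This is an illustration under stated assumptions: a 10-bit input, an 8-bit output, and plain global histogram equalization (not the contrast-limited adaptive variant). The function names and sample pixel values are hypothetical.

```python
def window_level(pixels, level, width, out_max=255):
    """Linearly map the window [level - width/2, level + width/2] to 0..out_max.

    Values outside the window are clipped to black or white.
    """
    low = level - width / 2.0
    out = []
    for p in pixels:
        t = min(max((p - low) / width, 0.0), 1.0)
        out.append(int(t * out_max + 0.5))
    return out


def histogram_equalize(pixels, in_levels=1024, out_max=255):
    """Global histogram equalization using the cumulative distribution."""
    hist = [0] * in_levels
    for p in pixels:
        hist[p] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    total = len(pixels)
    return [int(cdf[p] / total * out_max + 0.5) for p in pixels]


pixels = [400, 512, 700]                        # assumed 10-bit sample values
print(window_level(pixels, level=512, width=200))
print(histogram_equalize([0, 0, 1, 1023]))
```

Window and level leave the choice of range to the user, whereas equalization chooses the mapping from the image statistics, which is why the equalized image can depart from the expected appearance of normal structures.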
Another factor that affects how many bits per pixel are needed is whether further processing will be performed on the images. For example, if images will be viewed unprocessed, as in gastrointestinal studies, the number of bits needed may be as small as 6. On the other hand, if, for example, subtraction, look-up table processing, or unsharp masking will be performed, one needs 8-12 bits.
The video display is presently the weak link in the digital workstation. Its limitations pose formidable engineering problems that are compounded by the lack of acceptable methods of evaluating the image quality of displays in physical or psychophysical terms33. To improve spatial resolution, one must go to a high-line-rate system (2,000 or larger). To maintain horizontal resolution consistent with the improved vertical resolution, one needs a high electronic bandwidth. Flicker is more of a problem with high-line-rate displays, and some workstations have resorted to 75-Hz refresh rates (rather than the usual 60 Hz), which demands yet higher bandwidths. Large-area contrast resolution is limited to 7 bits or less due to light and electron scatter when the electron beam hits the phosphor screen. Another limitation is the maximum brightness of a video screen, which is about an order of magnitude less than that of a typical viewbox. Finally, there is the problem of limited display size, which is usually insufficient to show a standard chest image at full size. These factors have resulted in some workstations generating film as the display medium, which partially defeats one of the stated purposes of PACS, namely, the savings in film costs.
The resolution of a video monitor is determined on the basis of the number of lines used (vertical resolution), the bandwidth (horizontal resolution), and the refresh rate. A sample calculation illustrating these dependences is included in the next section as a case study.
Two quantities are of importance here34,35. One is the large area, or range contrast ratio, defined as the maximum overall brightness (when all pixels are driven to their highest values) divided by the minimum overall brightness (when all pixels are driven to their lowest values). The latter is influenced by the ambient light, and the former is determined by means of the physical characteristics of the phosphor, the electron beam current, and so on. The large area contrast ratio can have values up to 50.
A more important quantity is the detail contrast ratio, which is defined as the ratio of the luminance of two adjacent small areas, one driven to its highest level, and one to the lowest. Typical values for the detail contrast ratio are 10-15, and it is limited mainly by halation. This effect occurs when light originating from a spot on the phosphor is multiply reflected and scattered by the faceplate and the phosphor. This results in a bright spot of light surrounded by concentric circles, instead of a single spot. The use of neutral density filters bonded to the faceplate can increase both large area and detail contrast values at the expense of overall brightness. For example, with a 20% transmission neutral density filter, the large area contrast can be as high as 100, and the detail contrast can be as high as 25 (Table 3).
Quantities of interest are the color of light emitted by the phosphor, the brightness of the phosphor, and the decay time. A typical brightness is 40 cd/m². The most common phosphor is P4, with a decay constant of 60 µsec. There are many more available, two of which look promising for medical images. These are P45 (1.8 msec) and P164 (>100 msec). The long-decay-constant phosphors tend to smooth cathode-ray-tube modulation electronic noise and also decrease the perception of flicker. Of course, they are less suitable for dynamic imaging and tend to show a blurring effect if the image is moved around or enlarged (zoomed).
Most monitors are 17 to 19 inches, although special-order monitors are available up to a 23-inch diagonal. The primary limitation of the large display size is the increase in weight and depth of the cathode ray tube (CRT). Also, the scan lines become more visible on larger monitors with the same number of scan lines.
Quantities to look for here are stator yokes and horizontal and vertical dynamic focus. The latter considerably improves the uniformity of the focusing over the CRT faceplate. If an MR imaging suite is nearby, one may wish to specify CRT/yoke magnetic shielding.
We illustrate the problems involved in specifying a monitor with an example (Fig 1). The task is to display images originating from conventional chest x-ray films (14 × 17 inches) on a video monitor. The images are digitized at 1,680 × 2,048 pixels, each 12 bits deep, which are characteristics of the Du Pont film digitizer (FD-2000; Wilmington, Del). This image must be interpolated down to the display memory, which is 1,280 × 1,024 pixels, by 12 bits deep. The 1,280-pixel dimension of display memory corresponds to the scan line direction of the video signal.
The monitor (Dotronix VMI M2400; New Brighton, Minn) is available in either “portrait” or “landscape” modes, corresponding to the longer dimension oriented vertically or horizontally, respectively. The “portrait” mode is selected, as this matches the conventional x-ray image closely. The aspect ratio (ratio of long dimension to short dimension of the image on the monitor) is predicated from the ratio of the film dimensions (17/14), which is 1.21. The available aspect ratios (from Dotronix) are 1:1, 4:3, and 5:4. The last choice yields 1.25, a close match to the desired value of 1.21. This ensures that the image will look proportional.
One needs to know the bandwidth of the monitor in order to calculate the resolutions. If not explicitly stated, an upper limit on the bandwidth may be inferred from the pixel writing frequency (sometimes called the “dot clock”). Note that the bandwidth can be, at best, one-half the dot clock, since two clock cycles are needed to write a complete cycle (line pair) of information. In our example, the dot clock frequency is 112 MHz. Half of this yields the bandwidth 112/2 MHz = 56 MHz. The scan line frequency is given as 65.64 kHz, yielding a time per scan line of 15.2 µsec. Multiplying the two yields 853 line pairs in the scan line direction. Assuming about 84% of this time is useful video (a typical value for RS-343), we get 717 useful line pairs in the scan line direction. Perpendicular to the scan lines, one has 1,024/2, that is, 512 line pairs available.
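The line-pair budget along a scan line follows directly from the dot clock and the scan line frequency. The sketch below repeats that arithmetic with the figures quoted in the example; the function name is hypothetical.

```python
def scanline_line_pairs(dot_clock_hz, line_rate_hz, active_fraction):
    """Return (total, useful) line pairs available along one scan line."""
    bandwidth_hz = dot_clock_hz / 2.0      # two clock cycles per line pair
    line_time_s = 1.0 / line_rate_hz       # duration of one scan line
    total = bandwidth_hz * line_time_s
    return total, total * active_fraction


total, useful = scanline_line_pairs(112e6, 65.64e3, 0.84)
print(round(total), round(useful))

# Perpendicular to the scan lines, each line pair needs two scan lines.
print(1024 // 2)
```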
The next choice is to select the orientation of the scan lines, horizontal or vertical. Intuitively one expects that the higher line pair capability along the scan line direction must be used along the longer dimension of the image; that is, the scan lines must be chosen to be vertical. This conclusion is supported by the following analysis.
The resolution of the final image is determined on the basis of both digitization of the film and the video characteristics. We calculate the latter first. Suppose that the scan lines are chosen to be horizontal (Fig 1a). The horizontal resolution can be determined from the available number of line pairs divided by the width of the original chest image: 717 lp/(25.4 · 14 mm), that is, 2.0 lp/mm. Similarly, the vertical resolution is given by [(1,024/2) · 0.7 lp]/(25.4 · 17 mm), that is, 0.83 lp/mm. The factor of 0.7 is the Kell factor, and the denominator is the original length of the chest image in millimeters.
We now calculate the resolution limit that has occurred due to the sampling process involved in digitizing the films and interpolating down to the display memory. The display memory of 1,280 × 1,024 entails the following limits (Nyquist spatial frequencies) on the spatial information; namely, in the horizontal direction (1,280/2 lp)/(25.4 × 14 mm) = 1.8 lp/mm, and in the vertical direction (1,024/2 lp)/(25.4 × 17 mm) = 1.2 lp/mm. These results are summarized in the first row of numbers in Table 4. For the net resolution we have used the smaller of the digitization and the video limits. Note that the net resolutions are unequal, the asymmetry (ratio) being 2.2. Repeating the calculation for the case in which the scan lines are chosen to be vertical (Fig 1b), we get the second row of numbers in Table 4. The second choice yields a greater symmetry between the resolutions in the two directions.
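The comparison of the two scan-line orientations can be reproduced with a short calculation. The sketch uses the figures from the example (717 useful line pairs, 1,024 scan lines, a Kell factor of 0.7, and a 1,280 × 1,024 display memory); the function name is hypothetical, and the values it gives for vertical scan lines come from applying the same formulas rather than from Table 4.

```python
MM_PER_INCH = 25.4
KELL_FACTOR = 0.7


def net_resolution(along_in, across_in, useful_lp=717, scan_lines=1024,
                   mem_along=1280, mem_across=1024):
    """Net resolution (lp/mm) along and across the scan lines.

    along_in:  film dimension (inches) parallel to the scan lines
    across_in: film dimension (inches) perpendicular to the scan lines
    """
    along_mm = along_in * MM_PER_INCH
    across_mm = across_in * MM_PER_INCH
    video_along = useful_lp / along_mm
    video_across = (scan_lines / 2) * KELL_FACTOR / across_mm
    nyquist_along = (mem_along / 2) / along_mm
    nyquist_across = (mem_across / 2) / across_mm
    # The net value is the smaller of the video and digitization limits.
    return min(video_along, nyquist_along), min(video_across, nyquist_across)


horizontal = net_resolution(along_in=14, across_in=17)  # scan lines horizontal
vertical = net_resolution(along_in=17, across_in=14)    # scan lines vertical
print(horizontal, horizontal[0] / horizontal[1])
print(vertical, vertical[0] / vertical[1])
```

With horizontal scan lines the ratio of the two net resolutions is about 2.2, and with vertical scan lines the same formulas give a ratio much closer to 1.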
The hardware platform chosen by the workstation vendor has implications on engineering performance, clinical acceptance, and cost. It is, therefore, imperative that a judicious evaluation of hardware features be made before a platform is chosen. In addition to the spatial and contrast requirements and the video optics considerations already presented, thought must be given to multiple simultaneous images, rapid image retrieval and display, image manipulation, and image processing.
One notion that needs to be stressed is that the “one size fits all” philosophy is usually inapplicable in workstation design. The workstation vendor must tailor the product to different users. Workstations may well differ based on their use: primary interpretation (by a radiologist) or review and comparison by other physicians. Workstations for advanced applications such as three-dimensional presentation and processing or teaching may also be different.
There are three main computer buses around which most of today’s popular workstations are being designed. These are the IBM-PC bus, the VME bus, and the NuBus. The recently announced (February 1990) IBM RS/6000 series computer based on the Advanced Micro channel promises to be a powerful new entry. Table 5 summarizes the most important features of these buses and associated systems.
The IBM-PC bus is found in a majority of the lower-cost workstations employed for low-volume image viewing. A very attractive feature of this bus is the fact that peripherals associated with it are relatively inexpensive. This is primarily due to the economics of mass production. It is expected that as the number of VME bus and NuBus computers increases, the price of their peripherals will decrease, thereby making these computers more affordable.
Local storage size is best discussed with a specific example in mind. Consider a workstation that is used in an MR imaging department for primary diagnoses. The workstation serves three MR imagers. Assume that each MR imager is used to perform 12 examinations every day. Each examination comprises 100 images. Each image is 256 × 256 × 16 bits, or 0.128 MBytes. The MR radiologist needs to view the current examination and one previous (comparison) examination.
The local storage (S) required is defined by the following relation: S = P · E · I · M, where S is the size of the local storage in megabytes, P is the number of patients in the locus of the workstation, E is the average number of examinations per patient, I is the average number of images per examination, and M is the size of an image in megabytes. (If the images are received in compressed form and are restored just before display, the right side of the equation must be multiplied by the compression ratio.)
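Applied to the MR example above (three imagers at 12 examinations per day, or 36 patients; a current and a previous examination for each; 100 images of 0.128 MBytes per examination), the relation can be evaluated as follows. The function name is hypothetical, and the sketch leaves compression out.

```python
def local_storage_mbytes(patients, exams_per_patient, images_per_exam,
                         image_mbytes):
    """S = P * E * I * M, in megabytes."""
    return patients * exams_per_patient * images_per_exam * image_mbytes


patients = 3 * 12  # three MR imagers, 12 examinations each per day
print(local_storage_mbytes(patients, 2, 100, 0.128))
```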
Substituting our hypothetical figures into this equation, we have S = 36 · 2 · 100 · 0.128 = 921.6 MBytes. We believe that storage space of this order of magnitude is a minimum requirement at a workstation. Two factors can affect the performance of the workstation in this scenario. The first is when a physician requests the images for a patient that are currently not stored on the workstation. The second is when the physician requests cases that are “older” than those stored on the workstation. Clearly, the first factor is limited by the image archive and the communication network. However, the second factor can be handled in a more elegant manner.
When the user views any patient’s current examination, the workstation can check to see if previous examinations for that patient exist in local storage. If this is not the case, the workstation sends a request to the archive, in the background, to fetch the previous examination from the archive. The elegance of this scheme is that the retrieval of the images is performed in the background and is not perceived as a delay by the user.
There are a number of variations in this background fetching scheme. First, if the PACS has a communication link with the hospital’s radiology information system, it is possible to predict the examinations that will be viewed at a particular PACS workstation on a given day. Therefore, the day’s examinations can be “prefetched” and downloaded onto the workstation’s local storage. Second, the “background-fetch” algorithm can be tuned to be only a fraction of an examination ahead of the examination being viewed. This tuning prevents unnecessary downloading of examinations from the archive.
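The background-fetch and prefetch schemes can be sketched as a small simulation. Everything here is hypothetical: the class and method names, the in-memory archive, and the synchronous `run_background` call that stands in for a real background task and network transfer.

```python
from collections import deque


class Workstation:
    """Toy model of local storage with background fetching from an archive."""

    def __init__(self, archive):
        self.archive = archive   # patient_id -> list of exam ids, oldest first
        self.local = set()       # (patient_id, exam_id) pairs held locally
        self.pending = deque()   # queued background fetch requests

    def _request(self, patient_id, exam_id):
        key = (patient_id, exam_id)
        if key not in self.local and key not in self.pending:
            self.pending.append(key)

    def prefetch_schedule(self, patient_ids):
        """Prefetch variation: queue all exams of patients scheduled today."""
        for pid in patient_ids:
            for exam_id in self.archive.get(pid, []):
                self._request(pid, exam_id)

    def view(self, patient_id, exam_id):
        """View an exam and queue the previous one if it is not local."""
        exams = self.archive.get(patient_id, [])
        index = exams.index(exam_id)
        if index > 0:
            self._request(patient_id, exams[index - 1])
        return (patient_id, exam_id) in self.local

    def run_background(self, max_fetches=None):
        """Stand-in for the background task: move queued exams to local storage."""
        fetched = 0
        while self.pending and (max_fetches is None or fetched < max_fetches):
            self.local.add(self.pending.popleft())
            fetched += 1
        return fetched


ws = Workstation({"patient-1": ["exam-old", "exam-new"]})
ws.local.add(("patient-1", "exam-new"))
ws.view("patient-1", "exam-new")   # queues exam-old for background retrieval
ws.run_background()
print(sorted(ws.local))
```

The `max_fetches` argument is one simple way to model the tuning described above, in which the fetcher is kept only a limited distance ahead of the viewer.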
The multiviewer analog is often used to model the PACS workstation36. Studies by various groups have indicated that there are two major benefits to having more than one image display monitor: It provides the user with the opportunity to make spatial associations, and it permits the user to scan rapidly through the image set37,38. This concept was formalized in the “filmstrip” metaphor model39 in which the radiologic images are treated as sequential images in a filmstrip. The actual number of images that can be seen at any time is determined by the number of “windows” or monitors. Since it is often impractical for engineering (or economic) reasons to duplicate a multiviewer with eight or 16 windows, the concept of a pictorial directory was developed and investigated2,40,41. Gee et al2 performed observer studies with CT examinations displayed on a special-purpose image display controller designed for PACS applications. They tested workstations with one, two, and four monitors. Each test was done both with and without a pictorial directory. Their results indicate that the two-monitor workstation without the pictorial directory performed better than the one-monitor station or the four-monitor workstation with or without the pictorial directory. The optimal configuration of the radiologic workstation is yet to be determined.
The image memory and the display buffers should be viewed as an extension of the workstation’s local storage. This concept is similar to a computer’s cache memory, which is typically a very-high-speed, limited-size memory. As a rule of thumb, the image memory defines the maximum number (and size) of images that can be stored and retrieved in near frame times (1/30 of a second). The frame buffer defines the maximum spatial resolution of the image matrix that can be seen at any given instant. It will be instructive to exemplify this concept (Fig. 2).
Consider an image memory that is 16 MBytes (VIEW-2000; Virtual Imaging, Sunnyvale, Calif). The image memory is viewed through two display channels, each of which is 1,024 × 1,280 in spatial resolution at 8 bits per pixel. Images are acquired at 2,048 × 1,680 at 12 bits per pixel. The ideal system would have 12 bits per pixel throughout: in the image memory, in the frame buffer, and in digital-to-analog circuitry. But today’s digital-to-analog technology does not provide components that are fast enough to handle data more than 8 bits wide. The current solution is to map 12-bit images through 8-bit look-up tables. Window and level controls are then used to move through the image’s dynamic range. However, even if it were possible to have a 12-bit digital-to-analog circuit, reflections on the monitor’s screen and the observer’s perceptual system limit the viewable gray scale to about 5 bits. Table 6 shows the number of images that can be stored and the number of images that can be displayed at any one instant.
One can now see that even 16 MBytes of image memory are rapidly consumed if the intent is to display and perform primary interpretation on (approximately) 2,048 × 2,048 images at the acquired resolution. However, if the same workstation is to be used for an application where it may be sufficient to view 1,024 × 1,024 images at 8 bits, then the number of images that can be stored and retrieved in frame times becomes more reasonable. If the workstation is to be used for digital modalities in which the typical matrix is 256 × 256, it will be possible to load approximately two examinations into image memory. However, less than one examination can be viewed at full resolution at any given instant.
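The capacity argument can be made concrete with a short calculation. The sketch below rests on assumptions that are not stated in the text: 16 MBytes is taken as 16 × 2^20 bytes, a 12-bit pixel is assumed to occupy 2 bytes, and an image is counted as displayable only if it fits whole within one 1,024 × 1,280 channel. The figures it produces may therefore differ from those in Table 6.

```python
MEMORY_BYTES = 16 * 1024 * 1024  # 16 MBytes, taken here as 16 * 2**20 bytes (assumption)


def images_in_memory(rows, cols, bytes_per_pixel, memory_bytes=MEMORY_BYTES):
    """Number of whole images of the given matrix that fit in image memory."""
    return memory_bytes // (rows * cols * bytes_per_pixel)


def images_on_display(rows, cols, channels=2, display_rows=1024, display_cols=1280):
    """Number of whole images visible at full resolution across all channels."""
    per_channel = (display_rows // rows) * (display_cols // cols)
    return channels * per_channel


# (rows, cols, bytes per pixel); 12-bit data assumed to be stored in 2 bytes
for matrix in [(2048, 2048, 2), (1024, 1024, 1), (256, 256, 2)]:
    rows, cols, depth = matrix
    print(matrix, images_in_memory(rows, cols, depth), images_on_display(rows, cols))
```

Under these assumptions only two 2,048 × 2,048 images fit in memory and none can be shown whole at full resolution, whereas 128 images of 256 × 256 fit in memory but only 40 are visible at once.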
The frame buffer can be considered as the output device of the image memory. It should be possible for the workstation to be configured with a variety of frame buffers of differing resolution. For example, a multimodality review and comparison workstation could be equipped with one 512 × 512 display for alpha-graphics, two 1,280 × 1,024 displays for regular viewing, and a 2,048 × 2,048 display for showing fine details. This mix and match of displays has important ramifications on the economics of workstations.
It is also important to note that unless the zoom and rove mechanisms are carefully designed, there is the danger of losing the “gestalt” contained in the image. With a true zoom capability, a 1,024 × 1,024 display can be used to display 2,048 × 2,048 images, by viewing a portion of the image at one time. By moving around the image in the display memory, relationships with adjacent structures are not lost.
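A true zoom with rove amounts to moving a fixed-size window over the full-resolution image held in display memory. The following sketch illustrates the bookkeeping only; the function names are hypothetical, and the image is represented as a plain list of rows rather than as hardware memory.

```python
def clamp_origin(top, left, image_shape, view_shape):
    """Keep the viewport fully inside the image while roving."""
    max_top = max(image_shape[0] - view_shape[0], 0)
    max_left = max(image_shape[1] - view_shape[1], 0)
    return min(max(top, 0), max_top), min(max(left, 0), max_left)


def viewport(image, top, left, view_shape):
    """Return the full-resolution portion of the image seen through the display."""
    top, left = clamp_origin(top, left, (len(image), len(image[0])), view_shape)
    rows = image[top:top + view_shape[0]]
    return [row[left:left + view_shape[1]] for row in rows]


# A 4 x 4 stand-in for a 2,048 x 2,048 image, viewed through a 2 x 2 "display".
image = [[r * 4 + c for c in range(4)] for r in range(4)]
print(viewport(image, 1, 1, (2, 2)))
```

Because the origin is clamped rather than wrapped, roving past an edge simply stops at the edge, so the displayed region always corresponds to a contiguous part of the anatomy.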
Another important requirement of a workstation of this nature is that it must permit new data to be loaded while the current image is being viewed. This “dual-porting” of the image memory has significant impact on the throughput of the system. A corollary of this requirement is that workstations must be truly multitasking in a PACS environment. There is a significant period of inactivity in the computer when the user is viewing the image, and this time should be used by the workstation to carry out other functions such as background fetching of images, image memory loading, and so forth.
The majority of today’s radiologic networks are either carrier sense multiple access/collision detect (exemplified by Ethernet) or token rings (exemplified by the IBM or Proteon token-rings). The two primary factors that affect the throughput of a network system are the packet size and the performance of the network scheduling algorithm under increasing load.
The higher signaling speeds of the fiber-distributed data interface (FDDI) or the Pronet-80 network may not necessarily result in higher sustained disk-to-disk throughputs because of limitations of the computer bus to which they are connected. From an economic point of view, the price of an Ethernet network interface can be 40% of the price of a comparable token-ring interface. This, again, is due to the economics of mass production. Another important aspect is that Ethernet interfaces are more widely supported than other network interfaces.
The term display refers to the surface that contains the visible image. Usually, it is a CRT configured as a television monitor. There are other media that can be used for showing images, such as plasma screens, liquid crystal displays, and electroluminescent panels33. The human visual perceptual system extracts information (symbolic and pictorial) from the display and, therefore, must be coupled optimally to the display. This coupling must occur at two levels, psychophysical and psychopictorial. The psychophysical level includes issues such as spatial resolution, intensity level and range, contrast rendition, and both random and systematic noise. The psychopictorial level includes issues of image size (multiple images, rove and zoom), completeness, familiarity (image processing), and fulfilling task parameters (search, recognition, decision).
The term workstation refers to the entire interface of the user with the images. This includes one or more displays and diverse communication devices such as keyboards, keypads, joysticks, trackballs, speech recognizers, and speech synthesizers. The arrangement of devices and the operation of the software can determine whether a person will use the workstation for an extended period of time. The user must also be able to use the workstation effectively. This involves consideration of human factors including efficiency, comfort, fatigue, acceptability, and satisfaction.
The Human Factors Society has published a handbook of standards for video display terminal workstations42 that focuses on alphanumeric displays. Stations for pictorial display, especially those that contain multiple display surfaces, pose separate and unique problems, many of which have not been addressed formally.
A distinction must be made between an individual user and a group of users. For an individual display workstation, it is convenient to angle the monitors around a prime focal point in order to maintain a comfortable viewing distance of 50-90 cm. For a group, a flat arrangement such as that on a multiviewer is preferable, since this provides less restriction for people located at the edges or in the back. The minimal viewing distance is that at which the raster scan lines are just visible. Image details that approach the size or sharpness of the raster lines must be visualized by means of “zooming” as opposed to either moving the eye closer to the display surface or using a magnifying glass. Examples of such image details are microcalcifications on mammograms and septal lines on chest radiographs. Zooming also allows more of the individuals in a group to appreciate details.
The user has to communicate with the workstation for the purpose of fetching images, changing images, and operating on images that are already displayed. The primary requirement of workstation communication is that it be fast and simple. There is one caveat here: Speed and simplicity should not be achieved at the expense of security. Some level of patient confidentiality should be maintained, and there should be control over access to the images.
The standard computer keyboard is most familiar to those used to typing, but it is usually not easily mastered by physicians. Special-purpose keyboards have been shown to improve physician acceptance43. The mouse or track-ball controller is certainly easier to learn to use but requires careful screen design to provide the functions needed for image control. The mouse is a more natural method for controlling image pan, scroll, and zoom. The joystick is similar to the track-ball and mouse but does not provide the precision of the latter devices.
Controlling a workstation by means of speech is easy for the physician and hard for the computer (processing intensive), whereas the opposite is true for the keyboard (unnatural for physician). Speech recognition allows the physician to concentrate on the image rather than frequently looking at a control screen and/or keyboard. The vocabulary required for image manipulation is limited, reducing detection errors and essentially eliminating the need to train the unit for individual physician voice patterns. Some sort of layered protocol may be optimal, in which a log-on requires a keyboard and an identification code but in which to change images one simply says “next” or “last.”
Two aspects of lighting are important: average room illumination and reflection of light from monitor surfaces. The luminance of an x-ray lightbox is about 300 cd/m² and when covered by an x-ray film ranges from 3 to 200 cd/m². The luminance of a television monitor ranges from 0.1 to 50 cd/m². The eye will adapt to the average luminance and must adapt to a lower luminance in order to perceive optimally the contrast in the monitor image. Therefore, the average illumination in the room containing the monitors should be lower than that in a typical multiviewer reading room. Light from x-ray lightboxes can add significant ambient light to a multipurpose image reading room. Additionally, the monitor surface is curved, and the protective plastic cover contains surfaces that reflect light. Direct lighting can create local or specular reflections that interfere with visualization of portions of the image. Lightboxes located across from monitors can be particularly disturbing in this regard. In addition, diffuse reflection from any display decreases the displayed contrast. Diffuse illumination has been shown to adversely affect contrast perception of images on lightboxes44 and on monitors45.
Lights and lightboxes should never be located where they cast direct light onto the monitor surface. Monitor rooms should be lighted indirectly so that specular reflections are not produced. Work surfaces can be illuminated with small spotlights.
A PACS workstation combines certain actions of the film library, as well as those of the multiviewer. Unless the strategy for this combination is very carefully designed, it could result in an information-processing bottleneck at the workstation. With the conventional film system, a large support staff retrieves film jackets from shelves, selects the required cases, and loads the multiviewer, even before the physician or radiologist has arrived at the multiviewer to view the images. In a PACS, the workstation in conjunction with the archive and the network must accomplish loading and display of images, often after the physician has begun using the workstation. This could create a severe bottleneck in information processing. Elaborate schemes including rule-based prefetching46, background loading of images, and image navigation techniques2,37 have been suggested to overcome this bottleneck. The clinical impact of such schemes must be carefully studied.
One other factor may contribute to the information-processing bottleneck. With the film system, usually only one previous comparison case is loaded onto the multiviewer by the film library staff47. However, with a PACS it is conceivable that the clinician may request even older comparison cases, since they would be more easily accessible. The impact of having these large volumes of information available in a PACS has not been carefully studied. While estimates have been made regarding the retrieval rates of images stored on a PACS48, these predictions have been based on the operation of a conventional film department. We do not know for sure how the physician will react to the increased on-line information.
For the prefetching and automated sequencing to work effectively, the workstation and its associated PACS must communicate with the radiology information system, which is responsible for the patient data as well as the examination data. The PACS assumes responsibility for the image level data. The radiology information system informs the PACS when examinations are scheduled and completed and sends change notices to keep the data bases synchronized. Reports are also sent to the PACS for long-term archiving.
By notifying the PACS of scheduled examinations (the night before, for examinations scheduled earlier), the radiology information system provides as much advance notice as is required by the PACS to ensure that images are available when desired at the workstation. The PACS must know, through rules and tables set in advance, which workstations should receive the new examinations along with the previous related examinations49.
For example, let us assume that at 8:00 AM a chest examination is scheduled for John Doe to be performed at 10:00 AM the same morning. The radiology information system sends the notice to the PACS at 8:05 AM. The two previous chest examinations with the associated reports are sent from the archive to the chest workstation at 8:45 AM as a background job. Other activities such as providing STAT images to the intensive care unit take precedence. As soon as the new examination is completed at 10:15 AM, those images are sent to the same chest workstation and are merged in a “folder” with the previous examinations for that patient. This folder will wait for the radiologist to sign on and ask for any studies ready for interpretation.
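The scheduling behavior in this example (previous examinations sent as a background job, with STAT traffic taking precedence) can be sketched with a priority queue. Everything named below is hypothetical: the priority classes, the routing table, and the function and class names are assumptions made for illustration and do not describe a particular PACS or radiology information system interface.

```python
import heapq
import itertools

STAT, ROUTINE, PREFETCH = 0, 1, 2  # lower value is sent first (hypothetical classes)


class TransferQueue:
    """Orders archive-to-workstation transfers by priority, then by arrival."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker: first in, first out within a class

    def submit(self, priority, destination, examination):
        heapq.heappush(self._heap, (priority, next(self._arrival), destination, examination))

    def next_transfer(self):
        if not self._heap:
            return None
        _, _, destination, examination = heapq.heappop(self._heap)
        return destination, examination


def on_exam_scheduled(queue, patient, body_region, previous_exams, routing, keep=2):
    """Queue the most recent previous examinations for the workstation in the routing table."""
    destination = routing[body_region]
    for exam in previous_exams[-keep:]:
        queue.submit(PREFETCH, destination, (patient, exam))
    return destination


queue = TransferQueue()
on_exam_scheduled(queue, "John Doe", "chest", ["chest-1", "chest-2", "chest-3"],
                  routing={"chest": "chest workstation"})
queue.submit(STAT, "intensive care unit workstation", ("Jane Roe", "portable chest"))
print(queue.next_transfer())  # the STAT transfer is served before the prefetch jobs
```

The `keep=2` default mirrors the two previous chest examinations in the example; in practice the number would come from the rules and tables set in advance.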
In a similar fashion, examinations on patients in the intensive care unit are sent automatically to the appropriate workstations in those locations. Remote conferences can be conducted between the physicians in the intensive care unit and the radiologist with cursors on workstations in both the intensive care unit and the radiology department that can be moved simultaneously. While the radiologist is controlling the images and pointer, there could be two-way hands-free voice communication over the telephone system.
Workstations should not be designed to replace viewboxes. The economics of cost justification will prevent this. Given that a one-for-one duplication of viewboxes is not feasible, what is the most reasonable scheme to display an examination for interpretation? The most important factor to keep in mind is that the engineer is limited to a comparatively small number of display channels (two to eight). Working with this limitation, the engineer is required to provide maximal information to the user. Clearly, if the number of images that need to be viewed remains constant, but the number of surfaces that they can be displayed on is decreased (in the viewbox-to-workstation migration), then it logically follows that the images need to be viewed in some specified order. A number of schemes have been suggested to achieve this, and they generally can be considered in two classes: (a) point-and-pick schemes and (b) automatic schemes.
In the point-and-pick schemes, the user is required to select a patient record, which brings up a list of examinations for that patient. Selection of a specific examination brings up a list of images available for the same patient. The user then selects the image for viewing. After inspecting this image, the user returns to the list of images that are available for viewing and selects the next image to be viewed. The drawback of this scheme is the higher user overhead to view an examination.
The automatic schemes loosely fall into the category of rule-based schemes. In these schemes a heuristic algorithm is used to predict what the user “most likely” wishes to see next. The rule base is typically a knowledge base generated by a consensus mechanism among the department’s radiologists. A simple example may help illustrate this concept: Consider the tibia-fibula examination, which typically consists of four images or views. Assume that this examination is to be displayed on a two-monitor workstation. Also assume that the patient has undergone two previous examinations. After a patient’s record is accessed on the workstation, there is no need to select an image from a list; the system’s rule base does an automatic selection and display when the “NEXT IMAGE” key is depressed. The images would be displayed as described in Table 7.
The rule base, in addition to displaying images, has the benefit of increasing the apparent network bandwidth. A general rule is to keep (at least) the most current examination of each patient in image memory. One “previous” (comparison) examination is retained in the workstation’s magnetic storage. If the first previous examination is accessed, a request is sent to the image archive to transmit the second previous examination so that it may be received and loaded into the virtual image memory while the first previous examination is being viewed. We have found this system to be acceptable in practice50.
The rules just described are heuristics based on opinions of radiologists in the department. Since it is conceivable that these preferences may vary among users, a personality module scheme can be set up (Angst M, Baxter T, written communication, 1988). This module permits the user to modify the rule base at four levels in the hierarchy: all examinations, examinations of a selected body region, selected examinations of a body region, or only a specific examination. Of course if the software uses the personality module, a log-on procedure is mandatory.
Electronic medical image processing had its start in the late 1950s, when television image-processing techniques were first introduced into radiology by Gershon-Cohen and Fisher51. Although Meyers et al52 actually digitized images, digital image processing in medicine seriously began when the techniques developed by the Jet Propulsion Laboratory for the restitution and correction of images returned by satellites were applied to medical images53. The goal of these studies was to improve diagnostic performance, but it soon became clear that basic limitations in the resolution and noise of the processing systems54 made improvement impossible. Television techniques were confined to teleradiology at low resolution. The introduction of CT and especially digital subtraction angiography55 revived interest in the use of digital image-processing techniques for improving performance. The window and level operations involve contrast enhancement and de-enhancement. They surely affect performance through their perceptual matching function. There is some evidence that contrast enhancement may be of value beyond perceptual matching56; however, image processing for the improvement of performance is still unproved.
For the various reasons discussed earlier, the contrast resolution of video display devices is limited to approximately 7 bits of gray scale. On the other hand, laser film digitizers acquire images at 12 bits per pixel, and digital imaging equipment routinely acquires 12-bit and 16-bit information. The workstation hardware and software must intelligently transform these 12- to 16-bit quantities for display on 8-bit displays. Dropping the lower 4 to 8 bits of the pixel is not always appropriate, since useful image information may be deleted with this operation. A number of schemes have been reported in the literature for displaying large gray scales on contrast-limited video terminals. Most of these schemes fall into two categories: image-manipulation techniques and image-processing techniques. In the context of this article, image manipulation is defined as a point operation that does not change the value of a given pixel but modifies its appearance by means of a look-up table. On the other hand, image processing is defined as a point operation that changes the value of a given pixel. We discuss only image-manipulation operations in this section.
At present, the most important role of image manipulation is matching the contrast range of the display to the observer’s perceptual system57. A digital image can be stored with 4,096 intensity levels (12 bits). A high-quality television monitor can display about 128 intensity levels (7 bits). The exact number is determined by the luminance range of the display and the S/N58. The human eye, when adapted to the local luminance of the monitor, can discriminate about 90 discrete levels (6.5 bits)59. Therefore, something must be done to make the entire 12-bit intensity range accessible to the human eye.
The window and level controls common on CT consoles are fundamental image-manipulation operations that match the stored image primarily to the display and secondarily to the eye of the viewer. If the 4,096 intensity levels in the stored image are shown as 128 gray levels on the displayed image, there will be a loss of contrast that can be restored by displaying a window of 128 intensity levels from the stored image. The entire stored intensity range can be displayed by scanning the centerpoint of the window. If the window is narrowed to less than 128 intensity levels, contrast will be enhanced.
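A window and level operation can be expressed as a look-up table that is applied at display time, leaving the stored pixel values untouched. The sketch below is a minimal illustration using the 4,096 stored levels and 128 display levels mentioned above; the function name and the rounding convention are assumptions.

```python
def window_level_lut(level, width, stored_levels=4096, display_levels=128):
    """Build a look-up table mapping stored intensities to display gray levels.

    Values below the window map to 0, values above it map to the top gray
    level, and values inside it are spread linearly over the display range.
    """
    if width <= 0:
        raise ValueError("window width must be positive")
    low = level - width / 2.0
    top = display_levels - 1
    lut = []
    for value in range(stored_levels):
        fraction = (value - low) / width
        fraction = min(max(fraction, 0.0), 1.0)  # clip outside the window
        lut.append(int(round(fraction * top)))
    return lut


# Full-range window: all 4,096 stored levels compressed into 128 gray levels.
full = window_level_lut(level=2048, width=4096)
# Narrow window: 128 stored levels centered on 1,000 use the whole display range.
narrow = window_level_lut(level=1000, width=128)

pixels = [900, 940, 1000, 1060, 1100]
print([full[p] for p in pixels])
print([narrow[p] for p in pixels])
```

With the full-range table the five sample pixels differ by only a few gray levels, whereas the narrow window spreads them across the display range; moving `level` scans that window through the stored intensities.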
Window and level controls generally produce a linear gray scale. The eye operates more like a logarithmic amplifier, and further processing can be done to obtain “perceptual equivalence” between the display and the visual system60. Cromartie et al60 measured the just-noticeable differences in intensity as a function of the stored pixel intensity, which they call “digital driving units.” These data are used to generate a curve that shows for every driving level the change needed to produce a just-noticeable difference. They then use this curve to generate a function that “perceptually linearizes” the display of a particular device. This is stored as a look-up table.
Other variations of look-up tables can be used to tailor the display to a particular image or region in an image. For example, used with a planar chest image, an exponential look-up table will optimally show the lungs (black regions), while a sigmoid look-up table will optimally show the mediastinum and the structures superimposed on the heart shadow (white regions).
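Nonlinear look-up tables of this kind can be generated in the same way as linear ones. The sketch below is illustrative only: the text does not give the functional forms, so the normalized exponential and logistic curves used here, and the `gain`, `center`, and `slope` parameters, are assumptions. Which curve favors which anatomic region also depends on whether high pixel values are displayed as black or white.

```python
import math


def exponential_lut(gain, stored_levels=4096, display_levels=128):
    """Exponential curve normalized to span the full display range.

    A positive gain stretches contrast among high stored values; a negative
    gain stretches contrast among low stored values. Gain must be nonzero.
    """
    if gain == 0:
        raise ValueError("gain must be nonzero")
    top = display_levels - 1
    scale = math.expm1(gain)
    last = stored_levels - 1
    return [int(round(top * math.expm1(gain * v / last) / scale))
            for v in range(stored_levels)]


def sigmoid_lut(center, slope, stored_levels=4096, display_levels=128):
    """Logistic curve: steepest (highest contrast) around `center`."""
    top = display_levels - 1

    def logistic(v):
        return 1.0 / (1.0 + math.exp(-slope * (v - center)))

    low, high = logistic(0), logistic(stored_levels - 1)
    return [int(round(top * (logistic(v) - low) / (high - low)))
            for v in range(stored_levels)]


exp_table = exponential_lut(gain=3.0)
sig_table = sigmoid_lut(center=2048, slope=0.005)
print(exp_table[1024], exp_table[2048], exp_table[3072])
print(sig_table[1024], sig_table[2048], sig_table[3072])
```

Both tables remain monotonic, so the ordering of intensities is preserved and only the allocation of display gray levels across the stored range changes.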
Efforts have been made to map the entire intensity range of the stored image into the displayed image in a way that makes all of the relevant structures visible. Adaptive histogram equalization has been proposed as an image-processing technique for CT61 but has not come into general use.
It is important to emphasize that manipulations of the image that artificially alter expected contrast relationships may degrade observer performance, especially increasing the number of false-positive results (reduced observer confidence in normal).
Under daylight conditions, the human eye has a sensitivity range of six orders of magnitude. However, the eye must be adapted to the average luminance and is simultaneously sensitive over a much narrower range. This brightness adaptation is experienced when going from a bright to a dimly lighted room and vice versa. When the eye is adapted to the average luminance of a CRT display, about 40 distinct gray levels are simultaneously discernible, while thousands of colors are discernible. In theory, many more differences in images can be shown by the use of color than by the use of a gray scale62. In fact, except for a few limited uses in Doppler US, color has not been useful. This is mainly because the relationship between color and intensity is not obvious. The choice of a color scale is very important. For example, when a rainbow scale is mapped into a smooth increase in intensity, unnatural boundaries appear where one hue changes to another. People have attempted to devise natural scales like the “heated object spectrum,” white to yellow to red63, but these have not been successful. Part of the difficulty may be that color is usually manipulated in the red-green-blue domain. It may not be easy for people to conceptualize how color changes with intensity. Manipulation of color in the hue-saturation-intensity domain with subsequent transformation to the red-green-blue domain that the computer uses may be simpler for people and may be the way to develop perceptually acceptable scales64. The best use of color may be in multidimensional imaging, where it can represent some functional property in the image. For example, in Doppler US, signals that are not moving toward or away from the transducer are shown by using a gray scale, while moving objects are shown in color, with hue indicating direction and saturation indicating velocity. In this instance, the meaning of the mapping is clear to the viewer, and use of color adds information to the display.
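The Doppler mapping described here can be sketched by choosing colors in a hue-saturation domain and converting them to red-green-blue for the display. Several assumptions are made: Python's standard `colorsys` module provides the closely related hue-saturation-value model, which stands in for hue-saturation-intensity; red is assigned to motion toward the transducer and blue to motion away from it; and moving pixels are drawn at full brightness. The function name and the velocity scaling are hypothetical.

```python
import colorsys


def doppler_pixel_rgb(echo, velocity, max_velocity=1.0):
    """Map one pixel to (red, green, blue), each in the range 0 to 1.

    echo:     gray-scale intensity from 0 to 1, used when there is no motion
    velocity: signed velocity, positive toward the transducer (assumed convention)
    """
    if velocity == 0:
        return (echo, echo, echo)                 # stationary: plain gray scale
    hue = 0.0 if velocity > 0 else 2.0 / 3.0      # red toward, blue away (assumption)
    saturation = min(abs(velocity) / max_velocity, 1.0)
    return colorsys.hsv_to_rgb(hue, saturation, 1.0)


print(doppler_pixel_rgb(0.4, 0.0))    # stationary tissue
print(doppler_pixel_rgb(0.4, 1.0))    # fast flow toward the transducer
print(doppler_pixel_rgb(0.4, -0.5))   # slower flow away from the transducer
```

Because only two hues are used and saturation grows with speed, the viewer does not need to memorize an arbitrary color scale, which is the property the text identifies as making this use of color successful.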
While a major driving force for implementing PACS is the expected reduction of costs associated with the use of films, a PACS offers the opportunity to improve certain aspects of patient care as well. The immediate availability of images in the medical intensive care unit reduced the time interval between completion of the radiology examination and action taken by medical intensive care unit physicians65. Antibiotic therapy was initiated more quickly, chest tubes were inserted earlier, and medications were changed or terminated sooner.
In addition, with the archiving of all images in digital form, teaching files take on a new significance. If we are able to retrieve images based on the radiologic findings as well as diagnoses, then it becomes feasible to search the archive for interesting cases. This approach is simply an extension (at little additional cost) of techniques that have been developed recently at several institutions66. While the radiologist is in the process of interpreting a patient’s images, other images may be displayed that show similar features with known diagnoses. Information about the underlying diseases could be explored, and the recent literature could be searched.
Both radiologists and referring physicians will find that the PACS simplifies their efforts to locate and view reports and images, as well as providing learning opportunities that have up to now been too inconvenient, if possible at all. Yet the impact on the communication between the radiologist and the referring physician cannot be ignored. In our experience with the physicians in the medical intensive care unit, their reliance on our verbal or written interpretation was reduced by almost half when we made the portable radiographic images immediately available in the intensive care unit, often before interpretation by the radiologist65.
Some radiologists are concerned that PACS will undermine our referring colleagues’ dependence on us67. A well-designed PACS should not only improve the service we deliver but also should allow the radiologist to send the referring physician timely reports with images, differential diagnoses, and references68. Remote access through teleradiology will allow the radiologist to consult from home in the evenings and weekends and will permit second opinions or consultations with other radiologists when help is needed. If the radiologist presents annotated images to referring physicians, they will be even more convinced of the radiologist’s conclusions and diagnoses and will better understand the significant contribution made to the care of their patients. The referring doctors’ realization of the radiologist’s expertise and support will only enhance their dependence on radiology.
Because of the high cost of digital imaging workstations today, the relative requirements for high resolution, speed, and processing capabilities may have to be compromised somewhat. However, in this conversion from the use of film alone to the use of these workstations, we must be assured that the quality of interpretation and the quality of patient care are not appreciably degraded. Careful selection of the properly configured components and a thorough understanding of the limitations of these systems are critical to success.
Thoughtful consideration of the impact on the communication between the radiologist and the referring physician is also vital. Careful planning for phasing these workstations into the existing environment will reduce the overall costs and help prevent disruption of necessary activities. Selection of those areas with the greatest need will help ensure physician acceptance.