The purpose of this study was to describe the effect of implementing an imaging quality assurance program on CT image quality in the Lung Screening Study component of the National Lung Screening Trial.
The National Lung Screening Trial is a multicenter study in which 53,457 subjects at increased risk of lung cancer were randomized to undergo three annual chest CT or radiographic screenings for lung cancer to determine the relative effect of use of the two screening tests on lung cancer mortality. Of the 26,724 subjects randomized to the CT screening arm of the National Lung Screening Trial, the Lung Screening Study randomized 17,309 through 10 screening centers. The others were randomized through the American College of Radiology Imaging Network. Quality assurance procedures were implemented that included centralized review of a random sample of 1,504 Lung Screening Study CT examinations. Quality defect rates were tabulated.
Quality defect rates ranged from 0% (section reconstruction interval) to 7.1% (reconstructed field of view), and most errors were sporadic. However, a recurrently high effective tube current–time product setting at one center, excessive streak artifact at one center, and excessive section thickness at one center were detected and corrected through the quality assurance process. Field-of-view and scan length errors were less frequent over the second half of the screening period (p < 0.01 for both parameters, two-tailed, unpaired Student’s t test). Error rates varied among the screening centers and reviewers for most parameters evaluated.
Our experience suggested that centralized monitoring of image quality is helpful for reducing quality defects in multicenter trials.
Multicenter studies are valuable for evaluating the imaging features of a disease and the effect of imaging on clinical outcome. When such studies involve a large number of facilities using different models of imaging equipment and numerous technologists over a broad geographic range, the challenge of maintaining protocol compliance and the potential for technical errors can increase. Although the importance of data quality control in multicenter trials is well recognized [1, 2], to our knowledge the quality issues of compliance with image acquisition protocols, image acquisition error rates, and the role of continual image quality assurance in multicenter imaging trials have received little attention in the literature.
The ongoing National Lung Screening Trial (http://www.cancer.gov/nlst, clinicaltrials.gov identifier NCT00047385) is a multicenter imaging trial that incorporated imaging quality control procedures as a part of the trial protocol [3, 4]. This trial was launched in the fall of 2002 by the National Cancer Institute to compare lung cancer mortality rates among persons at high risk randomly assigned to undergo three annual lung cancer screening examinations with either low-radiation-dose CT or chest radiography [5, 6]. The National Lung Screening Trial is being conducted as a joint effort of the National Cancer Institute Lung Screening Study group and the American College of Radiology Imaging Network. The Lung Screening Study component of the National Lung Screening Trial enrolled 34,614 of the final total of 53,457 participants, and the clinical follow-up phase is in progress.
Enrollment and screening in the Lung Screening Study were performed through 10 screening centers. Several of these screening centers contracted remote facilities to assist with enrollment and screening. The Lung Screening Study screening sites were located in several settings, including university and nonuniversity hospitals, outpatient imaging centers, and clinical trial facilities, and used numerous brands of imaging equipment. As a result, a need for imaging quality assurance was recognized.
In the months before subject enrollment began, a quality assurance working group of the Lung Screening Study developed and implemented quality assurance procedures to monitor image quality and equipment performance. The objectives of the image quality assurance procedures were to ensure adherence to National Lung Screening Trial image acquisition protocols, identify technical errors, monitor image quality, facilitate prompt correction of problems, and promote attention to image quality as a common goal. The purpose of our study was to examine the effect of the image quality assurance procedures applied to the CT screening arm of the study through analysis of the quality assurance data collected in the Lung Screening Study component of the National Lung Screening Trial.
The protocol of the Lung Screening Study component of the National Lung Screening Trial was approved by the institutional review boards affiliated with all Lung Screening Study screening centers. Informed consent was obtained to store deidentified images at a central location and distribute them for investigational purposes.
The Lung Screening Study was administered through 10 screening centers (Appendix 1). Of the 26,724 subjects randomized to the CT screening arm of the National Lung Screening Trial, the Lung Screening Study randomized 17,309 through 10 screening centers. Screening was conducted from September 2002 through January 2007. Most screening examinations were completed 6 months before the date of the final screen. The screening examinations were performed with more than 40 scanners at more than 25 different physical locations. There were at least two scanners in use at each screening center, and approximately one quarter of scanners were located in a university setting.
All CT technologists were certified to perform scanning for the National Lung Screening Trial after their certification by the American Registry of Radiologic Technologists was confirmed and they viewed a slide presentation produced by the American College of Radiology Imaging Network administrative component of the trial. This slide presentation reviewed the technical parameters allowed by the protocol and covered issues relevant to image quality, such as setting the reconstructed field of view, recognizing inadequate inspiration, and identifying excessive motion. All Lung Screening Study centers were strongly encouraged to adopt internal quality control procedures, such as having technologists sign a form verifying the technical parameters used for each screening CT examination and having the study coordinator periodically review the CT screening process. In addition, case forms used by screening center radiologists to record their screening CT findings included an area for rating whether image quality was adequate, suboptimal but interpretable, or nondiagnostic. Examinations producing nondiagnostic images were to be repeated if possible.
A plan for image quality assurance involving centralized review of the quality of CT screening examinations was developed by the Lung Screening Study quality assurance working group for the purpose of identifying patterns of deficiency and correcting recurrent errors. It was not intended to be a comprehensive review of all screening examinations and reports or to correct quality defects in real time. The quality assurance protocol provided for evaluation of the technical parameters and subjective aspects of image quality in 430 screening CT examinations per year. This sampling rate was determined to be that required to detect a 3% defect rate with 90% probability while allowing a 1% rate of undetected defects with 95% probability.
Each month the data and operations coordinating center (Westat) of the Lung Screening Study randomly selected approximately 36 screening CT examinations for centralized review. The number of reviews was reduced during the second half of the final year of screening owing to the limited number of final screening examinations to be completed during that time. Images that the local radiologist had rated nondiagnostic were not eligible for centralized review. The screening CT scans were deidentified locally and transmitted electronically by the screening centers via the Internet over a virtual private network to the quality assurance coordinating center at Washington University, St. Louis. Though not part of this study, imaging quality assurance in the Lung Screening Study also has addressed chest radiographic image quality and CT scanner and radiographic equipment certification and technical performance.
The screening examinations were randomly assigned for review by the data and operations coordinating center to one of four Lung Screening Study quality assurance radiologists (randomly designated reviewers A, B, C, and D). All were practicing thoracic imaging subspecialists with 9–21 years of experience interpreting CT scans. Quality assurance reviews of the CT scans were conducted via the Internet over a virtual private network with a Web-based image viewer (Easy-web, Philips Healthcare) and electronic image quality grading forms. These reviews began in May 2003, after the electronic network had been fully implemented at all screening centers. At this time, all screening examinations selected for quality assurance assessment during the first 8 months of screening were reviewed; subsequent reviews were conducted monthly. The quality assurance radiologists were blinded to the screening center of origin and to the image quality rating of the screening center radiologist who had performed the clinical reading.
The screening CT examinations were performed with a low-radiation-dose technique without IV contrast material. The technical acquisition and reconstruction parameters were reviewed for compliance with the acceptable ranges set by the National Lung Screening Trial protocol (Table 1). The technical parameters were extracted directly from the DICOM image headers for display on the electronic review form for the quality assurance radiologist.
In addition to review of objective technical parameters, multiple indicators of image quality (Table 1) were subjectively rated adequate or inadequate by one of the quality assurance radiologists. A parameter was rated inadequate if it was considered suboptimal whether or not the defect resulted in a nondiagnostic study overall. The quality assurance radiologist also noted whether repetition of the examination should be considered for quality reasons. When the assigned quality assurance radiologist recommended considering repetition of a screening examination, the images were presented for review to two of the other quality assurance radiologists, who were unaware that the study had been previously reviewed. If at least one of the other two reviewers also recommended considering repetition, the Lung Screening Study data and operations coordinating center forwarded this recommendation to the screening center of origin, which had the responsibility for the final decision based on the nature of the quality defect and individual subject-related circumstances. The form contained a field for reviewer comments.
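The escalation rule described above can be written as a short decision function. This is only an illustrative sketch: the function and argument names are hypothetical, and it encodes nothing beyond the rule stated in the text (a recommendation is forwarded when the assigned reviewer and at least one of the additional blinded reviewers recommend considering repetition).

```python
from typing import Sequence


def forward_repeat_recommendation(primary: bool, second_reviews: Sequence[bool]) -> bool:
    """Return True when a repeat recommendation would be forwarded to the screening center.

    primary        -- the assigned reviewer recommended considering repetition
    second_reviews -- recommendations of the additional blinded reviewers
                      (normally two; may be shorter if a review was not completed)
    """
    if not primary:
        # Additional reviews are triggered only by the assigned reviewer's recommendation.
        return False
    # Forward when at least one additional reviewer agrees.
    return any(second_reviews)


# Hypothetical cases, not study data.
print(forward_repeat_recommendation(True, [False, True]))
print(forward_repeat_recommendation(True, [False, False]))
```

Under this rule a case that received no additional review cannot be forwarded, because no second reviewer has agreed.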
Quality assurance monitoring status and observations were discussed during monthly telephone conferences of the quality assurance working group. As they were recognized, recurring errors in technical parameters were reported to the screening centers at which they occurred. Any quality improvement recommendations relevant to all screening centers were communicated during monthly telephone conferences and at twice-yearly steering committee meetings held by the National Cancer Institute for Lung Screening Study screening center radiologists, principal investigators, research coordinators, and data and operations coordinating center staff.
Cumulative rates of image quality defects were communicated to each respective screening center in tabular reports in February 2005 and January 2006. Review of these reports by a screening center radiologist and a reply to the Lung Screening Study quality assurance working group describing actions planned to reduce any quality deficiencies identified were required. Annual site visits to all screening centers made by National Cancer Institute and data and operations coordinating center personnel to review screening center activities were attended by one of the quality assurance radiologists. During these visits, the quality assurance radiologist reviewed the technical aspects of image acquisition with screening center personnel, reinforced the importance of protocol compliance, and suggested ways to improve image quality when indicated.
Image quality data recorded by the quality assurance radiologists were imported from the electronic data forms by use of a spreadsheet application (Excel, Microsoft). These data spreadsheets were submitted periodically to the data and operations coordinating center. Rates of technical and image quality defects were compiled for all cases reviewed and were determined by screening center, reviewer, and month. Temporal trends in subjective parameter error rates were assessed with two-tailed unpaired Student’s t test comparisons of monthly rates during the first 25 months and the second 24 months of screening.
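For readers unfamiliar with the comparison, the following minimal sketch computes a pooled-variance unpaired Student's t statistic for two sets of monthly rates. It assumes the equal-variance form of the test; the function name and the monthly values are hypothetical and are not study data. The two-tailed p value would then be read from a t distribution with the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(a, b):
    """Student's unpaired t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    dof = na + nb - 2
    # Pooled estimate of the common variance from the two sample variances.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / dof
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, dof


# Made-up monthly defect rates (%) for illustration only.
first_half = [9.0, 8.0, 10.0, 7.0]
second_half = [5.0, 4.0, 6.0, 5.0]
t_stat, dof = unpaired_t(first_half, second_half)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof}")
```

The two groups need not be the same size, which suits a comparison of 25 monthly rates with 24.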
The Lung Screening Study component of the National Lung Screening Trial randomized 17,309 subjects to the CT screening arm and performed 48,727 screening CT examinations over the three screening time points. Image quality reviews were conducted on 1,504 of the 1,540 cases selected for review (3.1% of all examinations). The review rates for the 10 individual screening centers ranged from 2.3% to 3.6%. Reviews were not performed on 36 of the screening CT examinations selected for quality assurance review because the images were not received (34 examinations) or were not distributed for review (two examinations) by the quality assurance coordinating center. In 42 cases, neither the effective tube current–time value nor the pitch value was obtainable from the DICOM header.
The technical CT parameter most frequently outside the acceptable range defined by the National Lung Screening Trial protocol was tube current–time product (incorrect in 5.3% of cases reviewed). Twenty-nine of the 79 tube current–time errors came from one scanner at a single screening center on which a tube current–time setting of 39 mAs was recorded in the DICOM image header. There were five errors from three other centers with a tube current–time setting of 32 or 37 mAs. The 45 other tube current–time settings in error were greater than 80 mAs. The effective tube current–time product (tube current–time product divided by pitch), which is more relevant to radiation exposure, was in error in only half as many cases (38, or 2.6% of all cases reviewed) as the tube current–time product. All of the incorrect effective tube current–time product settings were greater than 60 mAs, ranging from 68 to 107 mAs in all but one case (effective tube current–time setting, 162 mAs). Most of these errors (35 of 38) occurred during the first 10 months of screening. Fourteen were from a single scanner on which the pitch had been set to 0.75 at 80 mAs (effective tube current–time product, 107 mAs), a recurrent error detected and corrected during the first centralized quality assurance review (8 months after the onset of Lung Screening Study screening).
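The relation between the two tube current–time quantities can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and it simply applies the definition given above (tube current–time product divided by pitch) to the 80 mAs, pitch 0.75 setting reported for the single scanner.

```python
def effective_mas(mas: float, pitch: float) -> float:
    """Effective tube current-time product: mAs divided by pitch."""
    if pitch <= 0:
        raise ValueError("pitch must be positive")
    return mas / pitch


# Setting reported for the single scanner: 80 mAs at a pitch of 0.75.
value = effective_mas(80, 0.75)
print(round(value))  # 107
```

Because the divisor is the pitch, a pitch below 1 raises the effective value above the nominal setting, which is how a nominal setting of 80 mAs yields an effective value of about 107 mAs.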
All 20 errors in pitch were instances in which the pitch was set too low (< 1.25). No motion abnormalities were noted for these cases, indicating that the reduced pitch did not result in difficulty with breath-holding. The 10 errors in section thickness were due to a nominal section thickness setting of 3 mm (equivalent to an effective thickness of 3.75–4.00 mm). Nine of these section thickness errors occurred at a single screening location over a 2-month period after installation of a new scanner. The errors were corrected as a result of the monthly centralized quality assurance reviews.
The most frequent errors in subjective image quality involved field of view (7.1%), artifacts (4.4%), and scan length (4.3%) (Table 2). The errors in field of view and scan length were lower in frequency during the second half of the screening period (Fig. 1). Field-of-view errors almost always were due to the field of view encompassing more area outside the lungs than was necessary, reducing in-plane spatial resolution. Errors in scan length almost always were exclusion of a portion of the inferior sulcus of one or both lungs. The portion missed was usually too small to generate a recommendation to repeat the scan because of concern that a nodule 4 mm or larger (the size threshold for a positive screen result) may have been missed.
Artifacts (4.4% error rate) consisted of beam-hardening and streak artifacts, typically in the apices and posterior aspects of the upper lobes of the lungs, and were usually mild. During one 3-month period, however, numerous studies in which excessive noise totally obscured anatomic detail in the lung apices were detected through the centralized quality assurance reviews. These studies were traced to a newly installed scanner at a single screening location that was prone to artifacts in photon-depleted areas. Rather than implementing an available postprocessing global smoothing algorithm or increasing the radiation dose, the screening center elected to remove this scanner from use for National Lung Screening Trial screening. Relatively low error rates in motion and degree of inspiration were recorded, with no obvious trends over time.
One reviewer recommended considering repetition of a screening examination in 34 of the 1,504 cases (2.3%) reviewed. A second reviewer made the same recommendation in 16 of these 34 cases (1.1% of all 1,504 CT cases reviewed), one of the 34 cases receiving only one review. A small portion of the lung bases was not scanned in 12 of these 16 cases, a small portion of the lung apex was not scanned in two cases, and there was excessive noise in the lung apices in two cases. The rate of recommendations for repetition of studies (made by at least two of three reviewers) among the 10 individual screening centers ranged from 0% to 2.0% of cases reviewed at each center. A review of the trial database revealed that none of the screening centers chose to repeat a screening examination on the basis of quality review recommendations.
The median cephalocaudal distance scanned below the lung, based on the number of abdominal images obtained below the last lung-containing image and the section thickness, was 2.0 cm (range, 0–9.25 cm). The distance was 4.0 cm or greater in 241 cases (16%).
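The distance estimate described above amounts to multiplying the count of images below the lung by the section thickness. The following sketch shows that arithmetic under the assumption that consecutive images are contiguous; the function name and the example values are hypothetical, not study data.

```python
def distance_below_lung_cm(images_below_lung: int, section_thickness_mm: float) -> float:
    """Cephalocaudal distance scanned below the last lung-containing image, in cm."""
    if images_below_lung < 0 or section_thickness_mm <= 0:
        raise ValueError("image count must be non-negative and thickness positive")
    # Assumes contiguous images, so distance = count x thickness (mm converted to cm).
    return images_below_lung * section_thickness_mm / 10.0


# Hypothetical example: 8 abdominal images at 2.5 mm section thickness.
print(distance_below_lung_cm(8, 2.5))  # 2.0
```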
Of the 1,504 cases reviewed, 1,362 (91%) were evenly distributed among reviewers A, C, and D (30%, 29%, and 31%, respectively; sum < 91% owing to rounding). Reviewer B performed 142 of the reviews (9%). The rate of image quality defects varied among the reviewers for all subjectively evaluated features, ranging from 0% for field of view as evaluated by reviewer B to 12% for artifacts as evaluated by reviewer C (Fig. 2). The rates at which individual reviewers recommended considering repetition of an examination and the rates at which at least one other reviewer agreed with this recommendation (in parentheses) were reviewer A, 0.2% (100%); reviewer B, 11.3% (19%); reviewer C, 1.4% (83%); and reviewer D, 2.4% (64%).
Image quality assurance in the Lung Screening Study component of the National Lung Screening Trial was considered important for optimizing the ability to detect and evaluate small pulmonary nodules and assess for annual changes, while also minimizing radiation exposure, at all screening locations throughout the screening period. A similar rationale and quality monitoring approach were used in the American College of Radiology Imaging Network component of the National Lung Screening Trial. Although investigators in other multicenter imaging trials have reported implementation of imaging quality control procedures [7, 8], we are unaware of any published reports on the results of such measures. Our experience illustrates the challenges of formally assessing and maintaining image quality across a large number of trial centers and provides a point of reference for judging image quality in other multicenter trials.
Overall, the severity of quality defects identified in the Lung Screening Study was low, and the occurrence of defects was sporadic. This finding may have been due in part to the front-line level of quality control provided by the uniform training and certification of CT technologists in the National Lung Screening Trial scanning protocol, assessment of image quality by the screening center radiologists at interpretation, and the regular feedback and emphasis on quality control by the quality assurance working group. However, detection and correction of several recurrent errors with potential clinical implications through centralized review of screening examinations are evidence of the importance of continual quality control monitoring in multicenter imaging trials. In the Lung Screening Study, the recognition and correction of errors in effective tube current–time settings at a screening center early in the trial likely prevented unintended radiation exposure of additional trial subjects. In addition, identification of 3-mm section thickness images from a new scanner at one screening center and excessive image noise in the lung apices on images from a new scanner at another screening center likely served to limit the number of cases in which low anatomic detail might have compromised the detection of small pulmonary nodules.
Our experience suggests that there may be increased susceptibility to image quality defects soon after initiation of a multicenter imaging trial and after the addition of new imaging equipment. Increased attention to quality assurance issues in multicenter imaging trials may be particularly important during these times. It also may be important to ensure that a centralized image quality control process is completely implemented and functioning before initiation of image acquisition in a multicenter imaging trial. If our centralized image collection and review process had been established sooner, the recurrent errors in effective tube current–time settings at one screening center found when reviews began may have been detected and rectified earlier than they were.
Other features of CT image quality with relatively higher rates of error were field of view and scan length. Although the reduction in rates of error in these parameters during the second half of the screening period further illustrates the potential positive effect of a quality assurance program, the frequency of these errors remained greater than 2% overall in most months. This finding suggests that these parameters may be particularly susceptible to technologist error in multicenter imaging trials involving CT. The persistent errors in field of view may reflect in part a reluctance of some technologists to exclude any anatomic structure despite protocol instructions to limit the field of view to the widest dimension of the lungs. The repeated scan length errors (incomplete scanning of the inferior sulci) may have been due to inadequate review of all images by technologists before the subjects left the scanner. A need to maintain rapid throughput in busy clinical settings may have contributed to this tendency. Encouraging the use of internal quality control mechanisms, such as technologist checklists, was not sufficient to maintain lower error rates. In retrospect, periodic repeated viewing of the training slide presentation by technologists with emphasis on the features of a proper field of view and scan length for the National Lung Screening Trial may have been helpful. For multicenter imaging trials in general, such technologist recertification probably would be beneficial.
Level of inspiration is another parameter influenced by CT scanner operators but was less frequently a problem than field of view and scan length. Motion usually was related to cardiac pulsation, and streak artifacts probably were a function of the low radiation dose technique. Improving image quality related to these two parameters would require addressing the limitations of the scanner technology (scan acquisition time) and acquisition parameters (tube current) used.
In few instances did two of three quality assurance radiologists believe a quality defect warranted consideration of repetition of an examination. These defects were primarily due to incomplete coverage of the lungs rather than to defects in the images obtained. This finding may explain in part why none of the recommendations to consider repetition of a study were followed. Although we did not track the outcome of repetition recommendations, it is also possible that lung nodules were detected in some of these examinations, so further diagnostic testing or short-term CT follow-up was already planned. In addition, with the time lag in review of scans obtained during the first 8 months of screening, reporting of the repetition recommendations for some subjects may have occurred close to the date of their next annual screening examination. Finally, subjects may have refused requests to undergo repeated examinations. Regardless, our experience suggests that if repetition of an examination is strongly indicated for certain quality deficiencies in a multicenter imaging trial, making it a protocol requirement may be preferable to making it an option.
Complete CT screening requires coverage of the entire lung, but it is also desirable to limit the amount of abdominal radiation exposure. Our data show that lung CT screening is reasonably effective at limiting abdominal exposure, but we found a greater tendency toward extending scanning farther into the abdomen than was necessary (4 cm or more below the lung in 16%) than toward not extending it far enough (4.3% inadequate scan length). Because no benefit of abdominal CT screening has been proved, emphasis during technical training on potential means of minimizing abdominal scanning (such as maximizing subject inspiration during both the topogram and spiral image acquisitions and using a lateral topogram) should be considered in multicenter imaging trials involving thoracic CT.
One limitation of our study was the subjective nature of the visual image quality evaluations, illustrated by the variability in defect rates recorded by the reviewers. This variability might have been reduced if the reviewers had received specific training, though the subjectivity inherent in deciding whether to state that an image quality feature is suboptimal would still exist. Nevertheless, the averages of results from four reviewers with extensive experience interpreting thoracic CT scans likely provided reasonable estimates of image quality defect rates. In addition, blinding the reviewers to the source of the screening examinations should have limited bias in error rates by reviewer among the screening centers. Bias may have remained, however, because reviewers might have tended to designate more defects than existed owing to a desire to optimize image quality or not to designate defects owing to a desire to portray the Lung Screening Study favorably. Such biases also might have been present had external reviewers been enlisted to perform the quality reviews. Another limitation was that from our observational data, we could not determine whether the defect rates and their occurrence over time would have been different without an ongoing quality assurance program.
The results of this analysis illustrate the effect of centralized image quality assurance procedures in a multicenter imaging trial. Despite effective communication of a defined imaging protocol to screening center personnel through a well-organized infrastructure and the use of local quality control measures, errors occurred. Although complete elimination of acquisition and image quality errors is unrealistic, minimizing them will help to ensure the validity of multicenter trial data and protect clinical research subjects. The approaches used and data obtained in this study may be useful in the design and planning of quality control measures in other multicenter imaging trials.
We thank Ken Clark, Kathy Clingan, Fred Larke, Glenn Fletcher, Mike Flynn, Randell Kruger, Guillermo Marquez, Steve Moore, Pete Ohan, Tom Payne, and Xizeng Wu for their participation in Lung Screening Study of the National Lung Screening Trial Quality Assurance Working Group activities. We also thank Tim Church, David Lynch, and Paul Pinsky for their suggested revisions to the manuscript.
Supported by National Cancer Institute contracts N01-CN-75022, N01-CN-25514, N01-CN-25516, N01-CN-25511, and N01-CN-2547. Coordinating and statistical services for the Lung Screening Study and this study were provided by Westat, Inc., Rockville, MD (N01-CN-25476).
Presented at the 2nd World Congress of Thoracic Imaging and Diagnosis in Chest Disease, Valencia, Spain, May 30–June 2, 2009.