Little information exists concerning the frequency of clinically significant incidental findings (IFs) identified in the course of imaging research across a broad spectrum of imaging modalities and body regions.
To estimate the frequency with which research imaging IFs generate further clinical action, and the medical benefit/burden of identifying these IFs.
Retrospective review of subjects undergoing a research imaging exam that was interpreted by a radiologist for IFs in the first quarter of 2004, with 3-year clinical follow-up. An expert panel reviewed IFs generating clinical action to determine medical benefit/burden based on predefined criteria.
Frequency of (1) IFs that generated further clinical action by modality, body part, age, gender, and (2) IFs resulting in clear medical benefit or burden.
1376 patients underwent 1426 research imaging studies. 40% (567/1426) of exams had at least one IF (1055 total). Risk of an IF increased significantly by age (OR=1.5 per decade increase; 95% CI, 1.4–1.7). Abdominopelvic CT generated more IFs than other exams (OR=18.9 compared with ultrasound; 9.2% with subsequent clinical action), with CT thorax and MR brain next (OR=11.9 and 5.9; 2.8% and 2.2% with action, respectively). Overall 6.2% of exams (35/567) with an IF generated clinical action, resulting in clear medical benefit in 1.1% (6/567) and clear medical burden in 0.5% (3/567). In most instances, medical benefit/burden was unclear (4.6%; 26/567).
The frequency of IFs in imaging research exams varies significantly by imaging modality, body region and age. Research imaging studies at high risk for generating IFs can be identified. Routine evaluation of research images by radiologists may result in identification of IFs in a substantial number of cases and subsequent clinical action to address them in a much smaller number. Such clinical action can result in medical benefit to a small number of patients.
An incidental finding (IF) in human subjects research is defined in a major consensus project as an observation “concerning an individual research participant that has potential clinical importance and is discovered in the course of conducting research, but is beyond the aims of the study.” 1
Numerous reports have detailed how the detection of an IF can result in the early diagnosis of an unsuspected malignancy or aneurysm. 2–5 However, others describe harm and excessive cost resulting from the aggressive follow-up of what turn out to be benign, but radiographically suspicious, IFs. 6, 7 Moreover, clinical experience dictates that many IFs are of indeterminate clinical significance and generate uncertainty among both research participants and their physicians. 6
At our institution alone there are approximately four thousand imaging exams performed solely for research purposes. However, depending upon the research study and institution, established mechanisms for handling imaging IFs may vary significantly or might not exist. 8 For example, imaging data may or may not be evaluated in a timely manner or by a trained radiologist, 5, 8 potentially resulting in the failure to offer a lifesaving intervention early in a disease process.
Contributing to the wide variation in IF management protocols is a lack of data available with which researchers, radiology departments, institutional review boards (IRBs) and institutions can estimate the expected frequency of IFs, burden of different management strategies, and consequences for research participants. While the frequency of IFs has been well documented in a few specialized areas such as CT colonography and functional MRI of the brain, other imaging modalities (e.g., molecular imaging, ultrasound, plain film x-ray) commonly used in clinical research are less studied. Moreover, few researchers have documented how often clinical action is taken as a result of a research IF being identified. Even less is known regarding the ultimate medical benefit or burden that may result from routinely identifying imaging IFs. Without this data, researchers and institutions will be uncertain of the resources required to manage research imaging IFs across the spectrum of imaging research.
The purpose of this study was to assess retrospectively the frequency of IFs generated from multiple imaging modalities common to imaging research at a large medical and research institution, to estimate the frequency with which clinical action is taken as a result of discovering an IF, and to attempt to examine the medical benefit or harm to human subjects from identifying and working up IFs in imaging research.
Since 2003, it has been the policy of the Mayo Clinic Department of Radiology that imaging studies performed exclusively for research purposes are examined the day they are performed by a staff radiologist. This radiologist examines research images for IFs of potential clinical significance and dictates a report that appears in the patient’s electronic medical record. Per policy, the interpreting radiologist is expected to call the patient’s primary care physician if an IF requiring immediate attention is identified. Further investigation and treatment is left to the discretion of the primary care physician and the patient.
This retrospective study was approved by the Mayo Clinic IRB. A database of research imaging exams and their clinical reports was constructed by searching for IRB numbers used for billing research imaging studies to research grants and cost centers. During the three months of January-March of 2004 there were 1823 radiology exams billed to research grants. Figure 1 details how 397 of these exams were excluded, yielding a final study cohort of 1426 research imaging exams.
An IF was defined as an observation noted in the dictated radiology report that was not directly related to the aims of the respective research study as listed in the title description available on the Mayo Clinic intranet of our IRB. Comments about past surgical interventions, old injuries, non-pathologic anatomic variations, or normal line or pacemaker locations were not considered IFs.
The 1426 radiology reports were reviewed for IFs by 3 members of our research team. For each research exam, all IFs, imaging modality used (e.g., CT, MRI) and body region being imaged (e.g., head, chest, abdomen) were recorded to create eight combinations of imaging modality and body region (Table 1). Medical records were available for all subjects, and were reviewed through February, 2007, to obtain demographic data and information about any clinical action performed as a result of the IF. Specifically, actions that were recorded included further diagnostic imaging, referral to a subspecialist, diagnostic medical testing, invasive diagnostic procedures/biopsies, initiation of medical therapy, and surgical intervention. Descriptions of the clinical course of each IF were compiled for review by an expert panel.
An expert panel was assembled consisting of 6 physicians--including 4 radiologists (3 abdominal, 1 neuro), 1 medical oncologist, and 1 gastroenterologist--and 3 bioethics scholars from several disciplines (law, social science, and religious ethics). The physicians included a former IRB chair, the head of our institutional IRB, the vice-chair for Radiology Research, and the chair of the Department of Radiology. The bioethics scholars have all had significant experience with research ethics, and include the director of the Mayo CTSA ethics resource and the principal investigator of an NHGRI study of IFs (NHGRI grant # R01-HG003178). As the latter (S.M.W.) was not based at Mayo, she did not participate in the review of any research subjects’ records or the ranking of individual cases.
The panel was charged with devising a ranking system for categorizing medical benefit/burden on a research subject imposed by clinical action resulting from an imaging IF. The panel sought to create a categorization scheme based upon objective endpoints that could be deduced from the existing medical record. The panel determined that an objective marker to use was the initiation of medical or surgical treatment on the basis of an IF. 2, 4 Medical burden/benefit rank was determined as follows: Clear medical benefit was defined as medical or surgical treatment administered as a direct result of the discovery of an IF with resolution or improvement in a disorder or disease. Clear medical burden was defined as medical or surgical treatment administered as a direct result of the discovery of an IF with resulting mortality, lack of improvement, or morbidity without medical benefit. Potential medical benefit occurred when no treatment was initiated, but an improvement or resolution in a disorder could be realized in the future based on the knowledge of the IF. Similarly, potential medical burden occurred when no treatment was initiated, but an adverse effect might occur if the IF were investigated clinically. Cases not fitting into these categories were rated as “unclear” benefit/burden. Physicians on the panel were additionally asked to rate the medical gravity of an IF along a 5-point scale (from minimal/trivial to life-threatening). Psychological, social and economic factors were not considered in creating this medical benefit/burden categorization, given the difficulty of contacting research participants and assessing their subjective views regarding these factors.
Panel members independently rated medical benefit/burden. The ratings on a case were considered in agreement when all members rated medical benefit/burden within one rank of each other, with the dissenting member given the opportunity to explain his/her perspective. Cases whose ratings were not in agreement were resolved by conference at a later meeting, in which case the medical benefit/burden values were agreed upon by consensus.
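The agreement rule described above can be illustrated with a short sketch. It is not the panel's actual procedure: the numeric ordering of the five categories and the function name `ratings_agree` are assumptions introduced only for this example, since the text does not state how ranks were numbered.

```python
# Hypothetical ordinal coding of the five benefit/burden categories.
# The ordering is an assumption; the source does not give numeric ranks.
RANKS = {
    "clear burden": 1,
    "potential burden": 2,
    "unclear": 3,
    "potential benefit": 4,
    "clear benefit": 5,
}


def ratings_agree(ratings):
    """Return True when all panel ratings lie within one rank of each other."""
    values = [RANKS[r] for r in ratings]
    return max(values) - min(values) <= 1


# Two made-up cases: the first is in agreement, the second would go to conference.
case_a = ["unclear", "unclear", "potential benefit"]
case_b = ["clear burden", "unclear", "unclear"]
print(ratings_agree(case_a))  # True
print(ratings_agree(case_b))  # False
```

Under this coding, a case is sent to the consensus conference whenever the highest and lowest ratings differ by two or more ranks.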
The number of IFs listed by staff radiologists in radiology reports was tabulated by imaging modality and body region. The number of exams with at least one IF was also calculated. Multiple variable logistic regression was used to assess the association between the presence of an IF (the dependent variable in the model) and imaging modality and body region, adjusting for patient age and gender.
To assess the risk of an IF for any type of imaging exam relative to any other, odds ratios (ORs) were also reported for each pair of research imaging exams. The significance level was set at 0.05 for statistical significance. The number of IFs generating further clinical action was also reported by imaging modality and body region.
1376 research participants underwent 1426 research imaging studies. These imaging studies came from 91 different IRB-approved research protocols. Subjects’ mean age was 58 years (range 3–97 yrs). Of the 1426 exams, 690 (48.4%) were on males and 736 (51.6%) were on females.
Out of the 1426 research imaging studies, 567 subjects had at least one IF reported (40.0%). Of the research subjects with IFs, the mean age was 63 (3–97 years), with 251 (44%) being male and 316 (56%) being female. The 567 subjects with IFs had a total of 1055 IFs (284 exams with multiple findings), with a subsequent IF-to-exam ratio of 1.86 (0.74 was the ratio for all 1426 exams).
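The ratios quoted above follow directly from the three reported counts. The sketch below simply repeats that arithmetic; the variable names are illustrative and not taken from the study database.

```python
# Counts reported in the text.
exams_total = 1426     # research imaging exams in the cohort
exams_with_if = 567    # exams with at least one IF
ifs_total = 1055       # total IFs recorded

# Share of exams with at least one IF, as a percentage.
pct_with_if = 100 * exams_with_if / exams_total

# IFs per exam, among exams with an IF and among all exams.
ifs_per_positive_exam = ifs_total / exams_with_if
ifs_per_exam_overall = ifs_total / exams_total

print(round(pct_with_if))                # 40
print(round(ifs_per_positive_exam, 2))   # 1.86
print(round(ifs_per_exam_overall, 2))    # 0.74
```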
The frequency of IFs reported varied widely depending upon body region imaged and the type of imaging modality that was employed (Table 1). CT scans of the abdomen/pelvis and thorax produced the highest percentages of imaging exams with an IF (61% and 55%, respectively), providing an average of 1.29 and 1.16 IFs per research imaging exam. Ultrasound and nuclear medicine scans infrequently produced an IF (9% and 4%, respectively).
Multiple variable logistic regression was used to assess associations between factors of interest and the odds for an IF, where factors of interest included age (considered as linear), gender, and type of imaging study. Type of imaging study was significantly associated with IF, p<0.001. Considering ultrasound as the reference imaging study, each of the other exams, with the exception of nuclear medicine exams, was associated with a significantly higher odds of an IF (Table 2). The largest ORs were for CT abdomen/pelvis [OR=18.9] and CT thorax [OR=11.9]. Table 3 reports the OR and 95% confidence interval for an IF for each pair of the 8 imaging studies of interest.
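As an illustration of how pairwise comparisons relate to the reference-based ORs, the sketch below uses the general property of a logistic model with a categorical predictor: the OR between two non-reference categories is the exponentiated difference of their coefficients, which equals the ratio of their ORs against the common reference. Only point estimates quoted in this article are used. The function name `pairwise_or` is hypothetical, and confidence intervals cannot be recovered this way because they depend on the covariance of the fitted coefficients.

```python
import math

# Point estimates relative to ultrasound (the reference exam).
# CT values are quoted in the Results; the MR brain value is quoted in the abstract.
or_vs_ultrasound = {
    "CT abdomen/pelvis": 18.9,
    "CT thorax": 11.9,
    "MR brain": 5.9,
}


def pairwise_or(exam_a, exam_b, ors=or_vs_ultrasound):
    """Point-estimate OR of exam_a relative to exam_b from reference-based ORs."""
    beta_a = math.log(ors[exam_a])  # log-odds coefficient of exam_a vs reference
    beta_b = math.log(ors[exam_b])  # log-odds coefficient of exam_b vs reference
    return math.exp(beta_a - beta_b)


print(round(pairwise_or("CT abdomen/pelvis", "CT thorax"), 2))  # 1.59
```

Because the inputs are rounded published values, the result is approximate and should not be read as a substitute for the estimates in Table 3.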
Older age was also significantly associated with increased odds for IF, OR=1.5 per 10-year increase in age [95% CI, 1.4–1.7]. This increased risk translates into a 4.2% increase in the odds of having an IF per year of age (1.9% per year of age for MR brain alone). There was a non-significant increase in odds for males (relative to females), OR=1.04 [0.8, 1.3].
A second multiple variable model was also considered that included categorized age (<40, 40–64, and ≥65 years). Higher age was again significantly associated with greater odds of an IF: age 40–64 (relative to <40) had OR=4.1 [2.5, 6.8] and age ≥65 (relative to <40) had OR=9.7 [5.8, 16.3].
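The per-decade and per-year figures are linked by the fact that, with age entered as a linear term, odds ratios compound multiplicatively: the per-year OR is the tenth root of the per-decade OR. The sketch below applies this to the rounded published value of 1.5. It yields about 4.1% per year, close to the 4.2% reported; the small gap presumably reflects rounding of the published OR, which is an assumption rather than something stated in the text.

```python
# Per-decade OR as published (rounded to one decimal place).
or_per_decade = 1.5

# With a linear age term, ORs compound: OR_decade = OR_year ** 10.
or_per_year = or_per_decade ** (1 / 10)

# Percentage increase in the odds of an IF per additional year of age.
pct_per_year = (or_per_year - 1) * 100

print(round(or_per_year, 4))   # 1.0414
print(round(pct_per_year, 1))  # 4.1
```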
Out of the 1426 research imaging studies examined, 35 (2.5%) research participants (8 males, 27 females; mean age 57, range 31–87) received further clinical action based on an IF (Table 4).
Of these 35 subjects, 32 received follow-up imaging and 27 were referred by their primary care physician for subspecialty consultation. Five research subjects underwent non-invasive diagnostic medical tests (serial CA-125 levels, dexamethasone and catecholamine levels, pulmonary function tests, fungal serologies, coagulation tests) while 6 underwent invasive diagnostic procedures (2 bronchoscopies, 2 biopsies, 1 FNA, 1 flexible nasopharyngoscopy). Eight research subjects underwent surgery for an IF, 2 underwent radiofrequency ablation (renal cell carcinoma, carcinoid liver metastasis), and 2 received medical treatment (1 anti-fungals, 1 anti-tussives).
While many imaging modalities and body regions were found to generate large numbers of IFs, only CT chest, CT abdomen/pelvis, CT all other, and MRI head yielded IFs that received further investigation. Clinical investigation did not result from IFs found on non-head MR exams, ultrasound, plain film x-ray or nuclear medicine. CT abdomen/pelvis had the most IFs receiving action, with 19 acted upon out of 207 exams (9.2%) (Table 1). The most frequent IFs receiving further action were ovarian/adnexal masses (n=9) in the abdomen/pelvis and indeterminate lung nodules (n=5) in the chest.
Medical burden/benefit and gravity of disease for each IF generating subsequent action is reported in Table 4. Six cases were found to be examples of clear medical benefit (rib osteomyelitis, renal cell carcinoma (Figure 2), small bowel carcinoid, sphenoid sinus aspergillus colonization, ovarian mucinous cystadenoma, grade-2 ependymoma (Figure 3)), with a mean gravity of disease score of 4.0. Twenty-four cases received an evaluation of unclear medical benefit/burden, while only two cases received the designation of potential medical burden. Three cases were found to represent clear medical burden to the patient, with a mean gravity score of 2.3. These included suspicious mesenteric nodules that were found to be benign reactive lymph nodes at laparoscopy (Figure 4), an ovarian mass found to be a physiologic cyst at laparoscopy, and an adrenal mass found at surgery to be an adrenal cortical adenoma. No deaths or post-surgical complications occurred.
In this study, 40% of research imaging exams had at least one IF. Of these, a small minority (6.2%) eventually resulted in subsequent clinical action. Research imaging modalities differed widely in their predilection for generating IFs as well as subsequent clinical action, varying further by body region being examined and patient age. Abdominopelvic CT generated significantly more IFs than any other type of exam (61% with IFs, p<0.05; 9.2% resulting in subsequent action). Brain MR had significantly more IFs than ultrasound, non-brain MR and nuclear medicine (43% with IFs, p<0.05; 2.2% resulting in subsequent action). Ultrasound, nuclear medicine scans, and non-brain MRI scans generated far fewer IFs, and failed to result in any further clinical action. The risk of an IF increased significantly by age (OR=1.5 per decade increase; 95% CI, 1.4–1.7).
Underscoring the uncertainty of IFs, an expert panel review of all IFs that produced subsequent clinical action, conducted with the benefit of 3 years of clinical follow-up, concluded that in retrospect clinical pursuit of most IFs (26/35; 68.6%) was of unclear benefit/burden to research subjects. However, within the 3 months of study at the same institution, 6 cases (1.1% of all exams with an IF) were felt to result in clear medical benefit to the research subject. These included newly-diagnosed, potentially life-threatening tumors and infections that were treated with good response. On the other hand, clinical pursuit of IFs in 3 subjects (0.5% of all exams with an IF) was felt to result in clear medical burden, resulting in laparoscopy or surgery for benign disease, but no long-term morbidity or mortality.
The body of literature concerning IFs in imaging research originates predominantly from work within CT colonography and functional MRI (fMRI) studies of the brain 2, 4, 8–17, but includes studies of structural MR exams of the head 18, 19, body 20, and others 21. In contrast, the present study estimates the frequency of IFs across all research imaging modalities at one institution, differentiating by imaging modality and body region scanned, and includes an attempt to assess resulting medical benefit or burden to the research subject.
Notwithstanding, some of our observations can be compared to other studies that have examined individual imaging modalities. For example, at least half of asymptomatic subjects have an extracolonic finding at CT colonography, but only 6–8% have extracolonic findings of potential medical significance, 2, 4, 5, 9, 15, 22 paralleling our observation for all types of research CT abdomen/pelvis exams (Table 2). Overall, 0.6% (9/1426) of imaging exams in our study went on to surgery and/or radiofrequency ablation due to an IF, compared to 0.2 – 1% of asymptomatic subjects participating in CT colonography studies. 2, 13, 15, 22 It should be emphasized, however, that IF prevalence in CT colonography research is not uniform, and varies by patient population (asymptomatic vs. symptomatic; screening vs. surveillance) and CT technique (radiation dose ± IV contrast). 5
Prior studies examining IFs in brain imaging demonstrate a large range of IFs, from an incidence of about 20% for MRIs performed on younger patients 16, 17 to 47% for MRIs performed on older adults (mean age of 47). 8 In the current study, among subjects with a mean age of 63 yrs., 43% of all exams had at least one IF. In these prior studies 3–8% of brain MRs resulted in further clinical action compared to 2.2% in our study. 8, 16, 17 Additionally, we found an age-related increase in the likelihood of having an IF on a brain MR of 1.9% for each year of age, similar to findings of Vernooij et al. 18
This study demonstrates that research imaging IFs are common and have the potential to represent both an early opportunity to diagnose asymptomatic, life-threatening disease, as well as a potential invitation to invasive, costly, and ultimately unnecessary interventions for benign processes. It should be noted that the clinical investigation of a suspicious IF that results in a benign diagnosis is not necessarily without clinical value or avoidable when presented with potentially life-threatening consequences (e.g., a solid renal mass). Nevertheless, the majority of IFs seem to be of unclear significance. These instances represent a dilemma for researchers. 8, 16, 17 As a result, clinical evaluation and serial imaging are often the course of action. The research participant, research grant, or medical insurance is left to cover the cost.
Clinical researchers, therefore, may struggle regarding how to handle and plan for imaging IFs. This study attempts to inform research protocol design by supplying not only frequencies of IFs by imaging modality, body region of interest, and age, but also the proportion of these findings that were considered worrisome enough by the research subject and his/her primary care physician to undergo further investigation. For example, 9.2% of all abdominopelvic CT exams in this study not only had IFs present, but prompted further investigation or treatment. In contrast, plain film x-ray generated sizeable numbers of IFs (39% had IFs), but none went on to subsequent clinical action. It may be reasonable for a principal investigator to devote fewer resources to potential IFs when the expected frequency of potential benefit is extremely low (e.g., generating Sharp scores of rheumatoid arthritis from x-rays of the hand). Conversely, provisions should exist regarding detection, disclosure and follow-up of IFs where the expected frequency is high (e.g., abdominopelvic CT and brain MR).
Currently, management protocols for research IFs vary. 23 In some instances, this variation may raise ethical questions. For example, at some institutions, not all research images are evaluated for IFs by a trained radiologist. Functional MRI scans are often read by PhD or even non-PhD, non-MD researchers. 23 Even when a radiologist is included as a member of the investigative team, images may not be reviewed in a timely manner, thereby possibly losing opportunities to intervene early in a disease process. This study and recent recommendations set forth by the NIH-supported Working Group on Managing Incidental Findings in Research should aid institutions and researchers in evaluating and drafting their IF policies. 24
Finally, it should be recognized that research subjects may overestimate the potential benefit that research imaging studies may offer. One study observed that even when research participants knew that their images would not be reviewed by a physician, they still expected that if a brain abnormality existed it would be detected. 23 In contrast, experience demonstrates that few research subjects appreciate the risk that having an IF may result in anxiety, expensive and invasive investigation, and possible surgical morbidity or mortality for benign disease that was radiographically suspicious. Consequently, the research consent process should outline risks related to the identification of an IF and that such identification may result in clinical action.
Our study has a number of limitations: (a) Variability exists among radiologists in selecting and classifying IFs; (b) The number of IFs which resulted in clinical investigation may be underestimated, as some research subjects may have pursued investigation outside our institution. However, our practice of dictating a clinical note detailing IFs the same day the research exam is performed and personally contacting the primary care provider in instances of life-threatening findings minimizes this likelihood; (c) Our attempt at assessing medical benefit/burden with an expert panel necessarily has the weaknesses of retrospective assessment with time-limited follow-up and being based on clinical opinion as its endpoint; (d) We did not include an assessment of emotional/mental/anxiety-generating burden as this was felt to be highly subjective and impractical in a retrospective study; (e) We did not perform a cost analysis; and (f) Our investigation occurred over a three-month period and was limited to active investigations during that time interval.
Results from this study demonstrate that specific imaging modalities, body regions, and advanced age increase the likelihood of generating an IF during the course of imaging in clinical research. Research imaging studies at high risk for generating IFs can be identified. These data should inform researchers, radiology departments and IRBs about the risk of an IF and subsequent clinical action, and can be used in creating management plans for research imaging IFs.
Timely, routine evaluation of research images by radiologists can result in identification of IFs in a substantial number of cases that can result in significant medical benefit to a small number of patients.
Prof. Wolf’s work on this paper was aided by National Institutes of Health (NIH), National Human Genome Research Institute (NHGRI) grant # R01-HG003178 on “Managing Incidental Findings in Human Subjects Research” (S.M. Wolf, PI). The contents of this article are solely the responsibility of the authors and do not necessarily represent the views of NIH or NHGRI. Neither Prof. Wolf nor Dr. McFarland had access to the Mayo Clinic case data analyzed herein. Dr. Fletcher had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Dr. Orme and Dr. Fletcher’s work on this publication was made possible by Grant Number 1 TL1 RR024152 from the National Center for Research Resources (NCRR), a component of the National Institutes of Health (NIH), and the NIH Roadmap for Medical Research. Its contents are solely the responsibility of the authors and do not necessarily represent the official view of NCRR or NIH. Information on NCRR is available at http://www.ncrr.nih.gov/. Information on Reengineering the Clinical Research Enterprise can be obtained from http://nihroadmap.nih.gov.