Reactions to uncertainty in clinical medicine can affect decision making.
To assess the extent to which radiologists’ reactions to uncertainty influence diagnostic mammography interpretation.
Cross-sectional responses to a mailed survey assessed reactions to uncertainty using a well-validated instrument. Responses were linked to radiologists’ diagnostic mammography interpretive performance obtained from three regional mammography registries.
One hundred thirty-two radiologists from New Hampshire, Colorado, and Washington.
Mean scores and either standard errors or confidence intervals were used to assess physicians’ reactions to uncertainty. Multivariable logistic regression models were fit via generalized estimating equations to assess the impact of uncertainty on diagnostic mammography interpretive performance while adjusting for potential confounders.
When examining radiologists’ interpretation of additional diagnostic mammograms (those after screening mammograms that detected abnormalities), a 5-point increase in the reactions to uncertainty score was associated with a 17% higher odds of having a positive mammogram given cancer was diagnosed during follow-up (sensitivity), a 6% lower odds of a negative mammogram given no cancer (specificity), a 4% lower odds (not significant) of a cancer diagnosis given a positive mammogram (positive predictive value [PPV]), and a 5% higher odds of having a positive mammogram (abnormal interpretation).
Mammograms interpreted by radiologists who have more discomfort with uncertainty have higher likelihood of being recalled.
Studies on mammography interpretive performance indicate that significant variability exists among radiologists.1–4 Reported sensitivity ranges from 72.4% to 88.6% for screening mammography and from 78.1% to 85.8% for diagnostic mammography.5–8 This variability has been attributed to radiologists’ interpretive volume and years of interpretive experience, but even the most recent studies on these factors reported conflicting results.9,10 Variability in interpretive performance, especially in diagnostic mammography, is an important clinical problem because low recall rates can result in missed cancer or delay in diagnosis, whereas high recall rates can lead to unnecessary work-up with associated costs, patient anxiety, and potential for patient morbidity.
Several factors may affect radiologists’ thresholds of concern as they view mammographic images and decide whether the film is positive, negative, or inconclusive. The practice environment is one such factor, as interpreting mammography occurs in a highly litigious environment in the United States. Failure or delay in breast cancer diagnosis is the most frequent medical malpractice allegation in the United States,11 likely making interpreting mammography quite stressful. One study in Internal Medicine found that physicians’ reactions to uncertainty led to excessive resource use.12 In a previous study,13 we found that reactions to uncertainty, including stress or anxiety, were associated with sex and years of interpretive experience but had no effect on interpretive performance of screening mammography. We concluded that interpreting screening mammography, where cancer detection rates are quite low, approximately 4 out of 1,000,5–7,14 is not as stressful as hypothesized. In contrast, diagnostic mammography has a reported cancer detection rate of 25 out of 1,000,15 and may be associated with higher levels of stress from uncertainty than screening mammography.
We used a well-validated instrument16,17 to assess reactions to uncertainty among radiologists who interpret diagnostic mammography in three distinct regions in the United States. In addition, we explored the extent to which radiologists’ reactions to uncertainty explain variability in interpretive performance of diagnostic mammography. We hypothesized that more stressful reactions to uncertainty would be associated with the likelihood of recalling women for additional work-up.
Three regional mammography registries participated in this study: Group Health Cooperative, a nonprofit integrated health care organization in the Pacific Northwest; the New Hampshire Mammography Network,18 which captures 90% of women undergoing mammography in New Hampshire; and the Colorado Mammography Program, which captures approximately 50% of the women in the six-county metropolitan area of Denver. These three registries are members of the federally funded Breast Cancer Surveillance Consortium.19 Although certain characteristics of the registries differ, they represent the spectrum of health delivery systems providing mammography in the U.S. Federal, state and local confidentiality and data security protections obtained to guard this research data are described elsewhere.20
Eligible radiologists included those interpreting screening and diagnostic mammograms around the time of the survey, identified through the three registries. Radiologists were excluded if they planned to move or retire during the study period. After we obtained approval from the Institutional Review Board of each site, we used mail and telephone follow-up to recruit eligible radiologists. Subjects were informed that their participation would involve completing a survey about their demographic, clinical, and other mammography-related experience, which would then be linked to their interpretive performance of mammography as obtained from their respective registries.
A detailed description of the radiologist survey and its psychometric properties is provided elsewhere.13 Briefly, the survey ascertained sex, years of interpreting mammography, interpretive volume, reimbursement mechanism, medico-legal experience, and reactions to uncertainty involved in patient care. The survey section on reactions to uncertainty, adapted from an instrument developed by Gerrity et al.,16,17 characterizes physicians’ reactions to uncertainty in three domains: (1) anxiety from uncertainty, (2) concern about bad outcomes, and (3) reluctance to disclose information to physician colleagues. This analysis is limited to the first 2 domains. The third was not a significant domain in the analysis. Table 1 outlines the two domains of this analysis and their respective items as well as means and standard errors as reported by the radiologists in our sample. Individual items were revised to make them relevant to the practice of mammography interpretation.13 Radiologists rated items using a 6-point Likert scale (1=strongly disagree, 2=moderately disagree, 3=slightly disagree, 4=slightly agree, 5=moderately agree, and 6=strongly agree). An uncertainty score with the 2 domains combined could range from 8 to 48, with 8 indicating low reactions to uncertainty and 48 indicating high reactions to uncertainty. A prior study demonstrated that alpha coefficients for the revised subscales ranged from 0.69 to 0.89.13 When removing the third domain the alpha coefficient for the reaction to uncertainty measure was 0.89, indicating a high level of internal consistency for this measure.
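The scoring described above can be sketched as a simple sum of item ratings. This is a minimal illustration, not the authors' scoring code: the function name is hypothetical, and the count of 8 items is an inference from the stated score range (8 to 48 on a 6-point scale) rather than a figure given in the text.

```python
N_ITEMS = 8  # assumption: inferred from the 8-48 range on a 1-6 Likert scale


def uncertainty_score(responses):
    """Sum the Likert ratings (1-6) for the two combined domains."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item responses, got {len(responses)}")
    if any(r not in range(1, 7) for r in responses):
        raise ValueError("each response must be an integer from 1 to 6")
    return sum(responses)


lowest = uncertainty_score([1] * N_ITEMS)   # all items rated "strongly disagree"
highest = uncertainty_score([6] * N_ITEMS)  # all items rated "strongly agree"
```

Under these assumptions the two extreme response patterns reproduce the stated bounds of the combined score.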
Survey data were merged with interpretation and outcome data obtained from the three registries for diagnostic mammograms that occurred between January 1, 1996 and December 31, 2002 allowing 1 year of follow-up to ascertain a breast cancer diagnosis (through December 2003). We defined diagnostic mammograms as (1) additional evaluation performed after a recent screening mammogram to assess an abnormality, (2) those performed at short interval to evaluate stability in a previously detected abnormality, (3) those performed to evaluate a breast concern that was not reported to be a lump, and (4) those performed to evaluate a self-reported breast lump. Mammograms of women with breast implants or a history of breast reconstruction or reduction were excluded from the analysis.
Core variables obtained from each registry included date; type of mammogram (see above); BI-RADS™ interpretation and recommendation categories;21 patient demographic, clinical, and risk characteristics; and breast cancer outcome. Standardized variable definitions allowed for merging of data from different screening programs while ensuring both uniformity and confidentiality of all data.16 Linkages between radiologists’ uncertainty scores from the mailed survey and performance data were made by a centralized data-coordinating center.
Mammograms were considered positive if assigned a final BI-RADS™ code21 of 0 with a recommendation for biopsy, FNA or surgical consult, 4 (suspicious abnormality) or 5 (highly suggestive of cancer) at the end of the imaging work-up. The most severe interpretation of either breast on a bilateral mammogram was used to classify the overall assessment. We classified mammograms as negative if assigned a BI-RADS™ code of 1 (negative), 2 (benign finding), 3 (probably benign), or 0 without a recommendation for biopsy.
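The classification rule above can be written out as a short sketch. It is an illustration only, not the registries' processing code: the function names are hypothetical, the boolean `biopsy_recommended` stands in for a recommendation for biopsy, FNA, or surgical consult, and the bilateral rule is simplified to "positive if either breast is positive".

```python
def is_positive(birads_code, biopsy_recommended=False):
    """Classify one final BI-RADS assessment as positive (True) or negative (False)."""
    if birads_code in (4, 5):
        # 4 = suspicious abnormality, 5 = highly suggestive of cancer
        return True
    if birads_code == 0:
        # Code 0 counts as positive only with a biopsy/FNA/surgical consult recommendation
        return biopsy_recommended
    if birads_code in (1, 2, 3):
        # 1 = negative, 2 = benign finding, 3 = probably benign
        return False
    raise ValueError(f"BI-RADS code {birads_code} is not covered by the study definitions")


def exam_is_positive(assessments):
    """assessments: iterable of (birads_code, biopsy_recommended), one pair per breast."""
    return any(is_positive(code, rec) for code, rec in assessments)
```

For example, `exam_is_positive([(1, False), (5, False)])` treats a bilateral examination as positive because one breast received a code of 5.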
The follow-up period for cancer outcomes associated with each examination was 365 days. Breast pathology outcomes (benign and malignant) were identified through pathology data banks and/or regional cancer registries. Only invasive breast cancer and ductal carcinoma in situ cases were included. Lobular carcinoma in situ cases were excluded because they are not visible on mammography.
Examinations were false positive when the assessment was positive and a breast cancer diagnosis did not occur within the follow-up period (365 days). Examinations were true positive when the assessment was positive and a cancer diagnosis followed. A false negative examination was a negative assessment with a diagnosis of cancer within the follow-up period. A true negative examination was a negative assessment with no subsequent cancer diagnosis within the follow-up period. Sensitivity was calculated as true positive/(true positive+false negative). Specificity was calculated as true negative/(true negative+false positive). Positive predictive value (PPV) was calculated as true positive/(true positive+false positive). Recall rate was calculated as positive examinations/(positive examinations+negative examinations).
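The definitions above translate directly into code. The sketch below uses hypothetical function names; the counts passed in at the end are the pooled examination counts reported in the Results, so the output is a crude pooled figure across all diagnostic mammogram types, not one of the stratified or adjusted estimates.

```python
def classify_exam(positive, cancer_within_365_days):
    """Label one examination as TP, FP, FN, or TN per the definitions in the text."""
    if positive:
        return "TP" if cancer_within_365_days else "FP"
    return "FN" if cancer_within_365_days else "TN"


def performance(tp, tn, fp, fn):
    """Compute the four performance measures from examination counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        # positive examinations / (positive + negative examinations)
        "recall_rate": (tp + fp) / (tp + tn + fp + fn),
    }


# Pooled counts as reported in the Results section of this article
pooled = performance(tp=3080, tn=120991, fp=6470, fn=941)
```

With these counts the crude pooled values come out at roughly 0.77 for sensitivity, 0.95 for specificity, 0.32 for PPV, and 0.07 for recall rate.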
Performance measures may vary among the 4 types of diagnostic mammograms described above.15 Therefore, analyses were stratified by type of diagnostic mammogram. Analyses examining sensitivity, specificity, and PPV were based on radiologists interpreting diagnostic mammograms (of a specific type) resulting in breast cancers or not. Each radiologist characteristic was initially examined univariately with respect to the combined uncertainty scale as well as the individual subscale. The means and 95% confidence intervals (CI) were computed for each scale by radiologist characteristic. Statistically significant differences between means were assessed using the SAS procedure GLM.22
For each type of diagnostic mammogram, we used a multivariable analysis to examine the association between the combined reactions to uncertainty scale (as well as the 2 subscales) and each performance measurement. The analysis adjusted for mammography registry, radiologist sex, age, and interpretive volume. We also adjusted for patient age at mammogram and time since last mammogram. Odds ratios were used to quantify the influence of these variables on the probability of a true positive or true negative outcome.
The combined uncertainty score was divided by 5 for ease of interpretation of the odds ratios. Thus, we are able to discuss the changes in the odds of a positive mammogram, negative mammogram, or cancer diagnosis associated with a 5-point change in reactions to uncertainty score. We undertook this approach for 2 reasons. First, a 5-point increase represents an approximate 17% change in score based on our mean uncertainty score of 30, which we felt to be more reflective of clinically relevant change than a 1-point increase. Second, we used this approach in our prior paper on uncertainty in screening mammography interpretation,13 and wanted to make comparisons between our findings for screening and diagnostic mammograms. Each diagnostic mammogram was associated with an assessment and breast cancer outcome. Therefore, the analysis was performed at the mammogram level but accounted for the correlation within each radiologist. Logistic regression models were fit using generalized estimating equations,23 assuming an independent working correlation matrix to account for potential correlation among mammograms interpreted by the same radiologist. For all analyses, tests were two-sided with P values ≤.05 considered to be statistically significant.
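The rescaling of the score can be illustrated with a little arithmetic. In a logistic model, dividing the predictor by 5 before fitting is equivalent to multiplying the per-point log-odds coefficient by 5, so the odds ratio for a 5-point increase is the per-point odds ratio raised to the fifth power. The sketch below uses hypothetical function names, and the coefficient is back-calculated for illustration from an odds ratio of 1.17; it is not a fitted value from the study.

```python
import math


def odds_ratio_for_increment(beta_per_point, points=5.0):
    """Odds ratio for a `points`-unit increase given a per-point log-odds coefficient."""
    return math.exp(beta_per_point * points)


def percent_change_in_odds(odds_ratio):
    """Express an odds ratio as a percent change in odds."""
    return (odds_ratio - 1.0) * 100.0


# Hypothetical per-point coefficient chosen so that the 5-point odds ratio is 1.17
beta = math.log(1.17) / 5
or_5 = odds_ratio_for_increment(beta)          # odds ratio per 5 points
or_1 = odds_ratio_for_increment(beta, 1.0)     # odds ratio per single point

# Share of the mean score (about 30) represented by a 5-point increase
relative_change = 5 / 30
```

Under this illustration, a 17% increase in odds per 5 points corresponds to an increase of about 3% per single point, and 5 points is about 17% of a mean score of 30.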
One hundred and eighty-one eligible radiologists were invited to participate by completing the survey. Of these, 139 consented and completed all or a portion of the survey for a response rate of 76.8%. One hundred and thirty-two participating radiologists (95% of those who completed the survey) interpreted diagnostic mammograms during the time period and had complete responses on the reactions to uncertainty scale.
Table 2 outlines the mean uncertainty scores with 95% CI by demographic and practice characteristics of participants. The majority of radiologists were male (78%), most had 10 or more years of experience (77%), and there was a fairly even distribution across annual interpretive volume (screening and/or diagnostic) categories of 500–5,000, with very few (7%) interpreting more or less than these categories. Most radiologists were reimbursed through shared partnership profits or through annual set salary with few being reimbursed per screening mammogram. Approximately half (52.3%) reported a prior medical malpractice lawsuit. Only 14% (19/132) reported a previous mammography-related lawsuit (data not shown).
The mean reactions to uncertainty score with the two domains combined was 29.6 (95% CI, 28.2–31.0) of a total possible score of 48 (Table 2). Scores were lower among female versus male radiologists (26.4 vs 30.5; P=.018) (Table 2). Radiologists with more years interpreting mammography and higher interpretive volume had slightly lower uncertainty scores. A test for trend indicated that the reduction in uncertainty scores with more years of interpretation was borderline statistically significant (P=.058). The corresponding trend for the concern about bad outcomes subscale was statistically significant (P=.03). Method of reimbursement was not associated with uncertainty scores. Radiologists reporting any prior medico-legal experience had slightly higher uncertainty scores (30.4 vs 28.7), although this difference was not statistically significant.
Performance data linked to survey responses included a mean of 996 diagnostic mammograms per radiologist (range 1–5,285) in the study time period. Of the 131,482 diagnostic mammograms included in the analysis, 3,080 were true positive examinations, 120,991 were true negative examinations, 6,470 were false positive examinations, and 941 were false negative examinations.
Figures 1 and 2 illustrate the relationship between overall uncertainty scores and performance measures based on diagnostic mammograms that were additional evaluations performed after a recent screening mammogram to assess an abnormality. The trend line on each figure is based on predictive probabilities obtained from unadjusted logistic regression models. Higher uncertainty scores are associated with higher sensitivity (P=.0013) and lower specificity (P=.0015) (Fig. 1). Similarly, higher uncertainty scores are associated with higher recall (abnormal interpretation rate; P=.0006) (Fig. 2). The inverse relationship between uncertainty scores and PPV is not statistically significant (P=.13).
Table 3 illustrates the results of the multivariable analysis after adjustments for mammography registry, radiologist characteristics, and patient characteristics. When examining radiologists’ interpretation of additional evaluations after a recent mammogram, after adjustment a 5-point increase in reactions to uncertainty was associated with a 17% higher odds of having a positive mammogram given cancer was diagnosed during follow-up (sensitivity), a 6% lower odds of a negative mammogram given no cancer (specificity), a 4% lower odds (not significant) of a cancer diagnosis given a positive mammogram (PPV), and a 5% higher odds of having a positive mammogram (abnormal interpretation). Similar relationships were found within the other subtypes of diagnostic mammograms, although the associations were not usually statistically significant. Results were unchanged after additionally adjusting for whether the radiologist had a prior medical malpractice lawsuit.
Radiologists who are more bothered by the uncertainty inherent in clinical medicine have higher recall rates, lower specificity, and lower PPV in diagnostic mammography interpretation when the diagnostic mammogram was done to evaluate an abnormality found on a recent screening mammogram. It appears that this type of diagnostic mammogram generates more concern among radiologists than other types of diagnostic mammograms. Perhaps this is because of the equivocal nature of a new finding versus tracking a prior finding, which is usually assessed for stability. These radiologists also appear to have higher sensitivity, a finding that is consistent across all subtypes of diagnostic mammograms. After adjusting for radiologist factors, within additional evaluations a 5-point increase in reactions to uncertainty was associated with a 17% higher odds of having a positive mammogram given cancer, a 5% higher odds of having a diagnostic mammogram interpreted as positive (abnormal interpretation), a 6% lower odds of having a diagnostic mammogram accurately interpreted as negative, and a 4% lower odds (not significant) of detecting a cancer when the mammogram was interpreted as positive (PPV).
We conducted this study in an attempt to identify specific factors affecting radiologists’ interpretive thresholds in higher stakes mammography interpretation, which might be amenable to change. Identifying these might then lead to a better understanding of how to modify these factors to enhance interpretive accuracy. Reducing unnecessary recall in the United States, while maintaining sensitivity, could result in both significant cost savings24 and a reduction in the anxiety women experience when receiving an abnormal mammographic interpretation.25–27
The high likelihood of cancer in women obtaining diagnostic mammography might trigger an affective reaction among radiologists who are less comfortable with possibly missing an important finding on mammography. Interestingly, our previous study13 found no association between reactions to uncertainty and interpretive performance or decision making in screening mammography. A previously published paper from this study28 indicated that radiologists have misperceptions about breast cancer risk in the women for whom they interpret mammograms. We are currently conducting a study to determine whether we can help radiologists recalibrate their thresholds of concern by educating them about risk of breast cancer in their patient populations and about their own risk of medical malpractice.
Other important questions to address in intervention research include the following: Would double reading policies reduce uncomfortable reactions to uncertainty? Would certain radiologists be more likely to benefit from double reading than others? Will the ability to obtain supplemental consultation reduce negative reactions to uncertainty? Why does sex of the radiologist appear to play a significant role in stress associated with uncertainty? We hope to address several of these questions in our next study, which may assist us in determining how to improve interpretive performance.
One concern raised by our study is the relationship between uncertainty in clinical decision making and physician burnout. Radiologists in our study reported high reactions to uncertainty as well as a high desire to leave the field of mammography.29 Two previous studies have shown that negative responses to uncertainty or ambiguity among psychologists and primary care providers are associated with burnout.30,31 The recent Institute of Medicine Report, Improving Breast Imaging Quality Standards,32 identified several workforce concerns regarding available radiologists to interpret mammography. Their findings indicate that as the population of women over age 40 increases by nearly 50%, the number of radiologists per 10,000 women in the same age group is expected to decline by 14% in 2015 and 23% by 2025.32 Thus, understanding the relationship between discomfort with uncertainty and performance is important if it is causing radiologists to discontinue mammography interpretation. More research is needed to understand this phenomenon more fully.
No study is without limitations. Our study population included a representative sample of radiologists practicing in three distinct regions of the United States. Although our high response rate allows us to generalize to other radiologists in these regions, it does not allow us to generalize to radiologists around the country. The strengths of our study include a high response rate to a mailed survey, the ability to link well-validated responses to this survey to actual interpretive performance data, and inclusion of the breadth of settings where mammography is performed, including community-based, academic, and open as well as single-payer health maintenance organizations. Although this allows us to capture mammography interpretive practice as it is commonly performed in the United States, we were not able to conduct analyses by specific practice settings, such as among radiologists whose practice is confined specifically to mammography interpretation, as this was not common in our study population. A final potential limitation is that some radiologists had a sensitivity of 0. This is probably because those radiologists interpreted a small number of examinations and because breast cancer is relatively rare, so no cancers were detected in their small samples. Because the analysis was conducted at the level of the mammogram (not at the level of the provider), there is no need to repeat the analyses while excluding radiologists who had outlier values for sensitivity.
In conclusion, mammograms interpreted by radiologists who have more discomfort with uncertainty have higher likelihood of being recalled. Interventions should be designed to address uncertainty in clinical practice when it is found to affect clinical performance.
This work was supported by the Agency for Healthcare Research and Quality (HS-10591) and the National Cancer Institute (U01 CA63731, U01 CA86082, U01 CA63736, and U01 CA86076).
Potential Financial Conflicts of Interest None disclosed.