In this series of analyses, we have shown that physicians often deviated from their preconceived notions of the likelihood of disease when diagnosing patients, thereby placing more weight on the current patient's presentation and less weight on prior probabilities. Patients in our experiment presented with cardinal symptoms of CHD, such that the level of evidence provided by the symptoms led the vast majority of physicians to consider CHD as a diagnosis, and just over half of the physicians (51.6 percent) reported that they would not change their diagnostic certainty based on patient gender. However, in the main factorial experiment, physicians were significantly less certain of the CHD diagnosis for female patients. Our finding that this gender effect could not be explained by the physicians' prior notions of CHD probabilities indicates that statistical discrimination via the prevalence hypothesis was not the underlying reason for gender differences in CHD certainty.
Why women were diagnosed with lower certainty, despite presenting with exactly the same symptoms and even after controlling for gender-relative priors in CHD prevalence, is critical to understand if worrisome inequalities in clinical decision making and health care are to be appropriately addressed. One possibility is that physicians behaved differently for men and women because of personally held stereotypes or prejudices. Discrimination resulting from personal stereotypes is very different from statistical discrimination resulting from the application of prior probabilities (Balsa, McGuire, and Meredith 2005; McGuire et al. 2008). That is, when physicians use prior probabilities to guide decisions, they are attempting to use as much information as they have available, in the best interests of the patients. When they are influenced by personal prejudices, they are not acting in the best interests of their patients (Balsa, McGuire, and Meredith 2005). In the current study, the extent to which personal stereotypes or prejudices may explain our results is unknown.
Another possible reason that we did not see evidence of statistical discrimination underlying the gender effect in CHD diagnosis certainty is that our measure of prior information may not have sufficiently captured the priors that physicians held. First, it would have been helpful to know the physicians' clinical experiences with male versus female patients and which CHD symptoms they typically encountered. It is possible that physicians with more frequent exposure to male patients with CHD symptoms similar to the vignette would be more certain of the male simulated patient's diagnosis. Second, our analysis is based on a relative comparison of the physicians' prevalence estimates for men and women, rather than absolute values of estimates for each. However, a relative comparison may be preferable, particularly considering that the physicians' absolute estimates of the overall population prevalence of CHD were substantially higher than the published rate of 6.9 percent (Thom et al. 2006). Assuming that published prevalence data are correct, other studies have also found that prior probabilities estimated by physicians were inaccurate, to the extent that the authors suggested that the use of prior probabilities as a tool for clinical decision making might cause more harm than benefit (Cahan et al. 2003). Importantly, in our study the overall estimate was associated with the CHD diagnostic certainty for male patients only; among female patients, this prior information was irrelevant. This difference suggests that the symptom presentation held more weight for female patients than it did for male patients. Alternatively, physicians may have been more confident of the relevance or accuracy of CHD population rates for male patients. Because the CHD prevalence estimate was important in the diagnosis of male patients, there may have been stronger overall evidence (both priors and current patient presentation) to increase the certainty of the CHD diagnosis in male patients.
While information on absolute rates of disease may be helpful for clinical decision making, understanding disparities in health care requires examining relative differences across sociodemographic groups. Comparing genders, the majority of physicians in our study thought that the prevalence estimate was similar for men and women, although a sizeable 46 percent believed it was higher in men. Our finding that the minority of physicians who assessed a higher CHD prevalence for women were most likely to be inconsistent with this notion in their diagnostic certainty suggests that the symptom presentation strongly outweighed their prior beliefs. Accordingly, physicians whose priors held that CHD prevalence was similar by gender were most likely to be consistent. Physician level of clinical experience, keeping up with medical literature, beliefs in the accuracy of published prevalence rates, and priming status did not help predict which physicians would be more likely to adhere to their priors in the diagnostic process.
For present purposes, a critical benefit of using the experimental vignette is that it allows for the manipulation of several variables at once, thereby providing unconfounded results for factors (e.g., race and socioeconomic status) that are otherwise nearly impossible to disentangle (i.e., ensuring internal validity). Studies comparing the vignette methodology with standardized patients and other methods have shown that vignettes are also externally valid for studies of medical decision making and assessments of quality of care (Braspenning and Sergeant 1994; Peabody et al. 2000; Veloski et al. 2005; Robra et al. 2006). To further enhance the external validity of our results (i.e., that physicians behave similarly under experimental conditions as in everyday clinical practice), we took three precautionary steps. First, considerable effort was devoted to ensuring the clinical authenticity of the videotaped presentation. This was achieved by basing the scripts on the clinical experience of physician advisors, filming with experienced clinicians present, and using highly trained professional actors/actresses. Second, physicians viewed the vignette in the context of their practice day (not at a professional meeting, course update, or home) so that it was likely they encountered real patients before and after they viewed the patient in the videotape, thereby retaining as much of the situational context as possible. Third, physicians were specifically instructed at the outset to view the patient as one of their own and to respond as they would typically respond in their own practice. When asked if the patient viewed on the videotape was typical of patients they encounter in everyday practice, 90 percent of physicians considered the patient very typical or reasonably typical.
In this paper, we have examined two major funnels of information that may help a physician come to a diagnostic decision. A critical issue in clinical decision making is that both funnels of information are subject to some unknown level of error. For example, reports of a given patient's symptoms may be faulty if there is miscommunication between the doctor and the patient, or if the patient does not provide certain details, perhaps because of embarrassment or a belief that such details are irrelevant. An advantage of our experiment was that it minimized miscommunication to the greatest extent possible; while it is impossible to control physicians' perceptions of the patient's signals, all facets of the patient presentation (apart from design factors such as gender) were exactly the same across all patient encounters.
The second funnel, which includes published data in the medical literature or personal physician knowledge or experience, may suffer its own biases. For instance, data on base rates ultimately stem from reports of physicians' diagnoses, which are prone to error. Thus, for use of prevalence data to be an acceptable option, it is essential that data are accurate and up to date. For example, if epidemiological data indicating that the prevalence of CHD is higher in men were erroneous, then using such data to dismiss a CHD diagnosis in a female patient in the face of symptom uncertainty would lead to a faulty diagnosis and delayed treatment.
We have shown that given a sufficient symptom presentation, prevalence data were often outweighed during the decision making process. Thus, our results suggest that we need not worry excessively over the potential that statistical discrimination needlessly affects published rates for CHD by gender, given an adequate patient symptom assessment. Reliance on prior beliefs rather than patient-specific information to guide clinical decision making is difficult to justify when high-quality patient information and low-cost tests are available (Balsa, McGuire, and Meredith 2005). The finding that physicians' perceptions of base rates may be inaccurate further supports this notion. While our methodology cannot definitively rule out the potential for any role of priors in the gender effect in CHD diagnoses, the influence of stereotypes or prejudice should be examined in future work. In addition, the extent to which physicians rely on priors to diagnose patients when patient presentation is less informative or when miscommunication is likely is critical to examine in the field of clinical decision making.