To examine whether physicians attend to gender prevalence data in diagnostic decision making for coronary heart disease (CHD) and to test the hypothesis that previously reported gender differences in CHD diagnostic certainty are due to discrimination arising from reliance on prevalence data (“statistical discrimination”).
A vignette-based experiment of 256 randomly sampled primary care physicians conducted from 2006 to 2007.
Factorial experiment. Physicians observed patient presentations of cardinal CHD symptoms, standardized across design factors (gender, race, age, socioeconomic status).
Most physicians perceived the U.S. population CHD prevalence as higher in men (48.4 percent) or similar by gender (44.9 percent). For the observed patient, 52 percent did not change their CHD diagnostic certainty based on patient gender. Forty-eight percent of physicians were inconsistent in their population-level and individual-level CHD assessments. Physicians' assessments of CHD prevalence did not attenuate the observed gender effect in diagnostic certainty for the individual patient.
Given an adequate presentation of CHD symptoms, physicians may deviate from their prevalence data during diagnostic decision making. Physicians' priors on CHD prevalence did not explain the gender effect in CHD certainty. Future research should examine personal stereotypes as an explanation for gender differences.
Variations in clinical decision making have been evident for over 20 years, prompting efforts in research and training to minimize disparities in diagnostic and treatment decisions. While guidelines for disease management are thought to be useful in regulating patient care, the initial diagnostic decision making process inherently remains more difficult to standardize (McKinlay et al. 2006, 2007; Bonte et al. 2008). When interpreting a patient's symptoms, physicians undergo numerous complex cognitive processes to reach an active diagnosis (Hershey and Baron 1987; Ferreira et al. 2006; Krynski and Tenenbaum 2007). These processes involve filtering large amounts of potentially conflicting information and incorporating relevant evidence to come to a diagnostic decision. To help, commonly advised methods use epidemiologic prevalence data in probability analysis or evidence-based decision support tools (Diamond et al. 1980, 1983; Halkin et al. 1998; Reynolds 2001). These methods entail weighing the patient's symptoms along with the prior likelihood that the patient has a condition, given other background information. Two major funnels of information that remain key for clinical decision making during an initial patient encounter, whether or not formal decision aids are used, are (1) details on the individual patient's presentation and (2) prior knowledge on likelihood of a disease, based on prevalence data or clinical experience with groups similar to the patient.
Ideally, the amount of weight a physician places on each of these two funnels of information varies depending on the quality of the information available from each. In the situation where high-quality data are available from both sources, patient-specific clinical information should be considered more heavily than preexisting prevalence data (Lutfey et al. in press). Prevalence data indicate population risks, not specific individual risks; furthermore, epidemiologic data necessitate statistical assumptions and are, in essence, derived from clinical decision making (Lutfey et al. in press; Rockhill, Kawachi, and Colditz 2000). Consider as an illustration the extreme case where the patient's symptoms are clearly communicated and unambiguously point the physician to a diagnosis; here, preexisting data suggesting that the prevalence of the suspected disease is low in this patient population should not override the obvious diagnosis signaling from the individual's presentation.
In the more common situation where the patient's symptoms do not provide an obvious diagnostic decision path, physicians may have to put more weight onto their prior probability of disease, or “priors” (Medicine 2003). In doing so, physicians use presumably accurate group data to help make a decision in light of uncertainty about the individual. The use of prior data in the face of uncertainty about the individual patient has been termed “statistical discrimination” (Balsa, McGuire, and Meredith 2005; McGuire et al. 2008). One hypothesized mechanism by which statistical discrimination affects clinical decision making, termed the “prevalence” hypothesis, is that physicians use prior data on the prevalence of disease to help determine their certainty of a diagnosis (i.e., posterior probability of disease). For example, a physician may believe that the prevalence of coronary heart disease (CHD) varies by gender, so when faced with some diagnostic uncertainty with a patient, the physician will consider the patient's gender as a factor. Thus, if a male and a female patient each present with exactly the same ambiguous CHD symptoms, a physician who refers to priors on the prevalence of CHD may be more certain of CHD in the male patient, for whom the population prevalence is higher (Rosamond et al. 2008). Another possible mechanism has been termed the “miscommunication” hypothesis, in which statistical discrimination occurs because the physician has trouble understanding certain patients, perhaps because of language, culture, or communication patterns (Balsa, McGuire, and Meredith 2005). With miscommunication, the physician misses the patient's signals and must rely more heavily on prior data. For example, if women tend to use vague language when describing their CHD symptoms, physicians may be more likely to miss a relevant diagnosis with women. It should be noted that within either of these hypotheses lies the potential for some role of personal stereotypes or prejudices, which is generally regarded as unjustifiable during clinical decision making (Balsa, McGuire, and Meredith 2005).
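The prevalence hypothesis can be made concrete with Bayes' rule in odds form. The sketch below is only an illustration: the function name, the two priors, and the likelihood ratio are hypothetical values chosen for the example, not estimates from this study or from published prevalence data. It shows how two patients with an identical symptom presentation (the same likelihood ratio) would receive different posterior probabilities if a physician applied different gender-specific priors.

```python
def posterior_probability(prior: float, likelihood_ratio: float) -> float:
    """Update a prior probability of disease using Bayes' rule in odds form."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    if likelihood_ratio <= 0.0:
        raise ValueError("likelihood_ratio must be positive")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical inputs for illustration only; not values from the study.
PRIOR_MALE = 0.10                # assumed prior probability of CHD, male patient
PRIOR_FEMALE = 0.05              # assumed prior probability of CHD, female patient
SYMPTOM_LIKELIHOOD_RATIO = 4.0   # assumed strength of the identical presentation

posterior_male = posterior_probability(PRIOR_MALE, SYMPTOM_LIKELIHOOD_RATIO)
posterior_female = posterior_probability(PRIOR_FEMALE, SYMPTOM_LIKELIHOOD_RATIO)

# Same evidence, different priors: the male posterior is higher.
print(f"male: {posterior_male:.3f}, female: {posterior_female:.3f}")
```

Under these assumed inputs the difference in posterior certainty arises entirely from the priors, which is the pattern the prevalence hypothesis would predict.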
Considering the complicated pathways to a diagnostic decision, recent experiments investigating disparities in clinical decision making have attempted to control for the miscommunication pathway (Arber et al. 2006; McKinlay et al. 2006; Bonte et al. 2008; Lutfey et al. 2008). In these experiments, physicians observe simulated patients who have exactly the same verbal and nonverbal symptom presentation but vary in key factors of interest, such as gender, age, or race. In the case of CHD, results showed that even with exactly the same patient communication of cardinal symptoms, physician certainty of the CHD diagnosis is significantly greater if the observed patient is male, even holding other factors (age, race, and socioeconomic status) constant (Arber et al. 2006; Bonte et al. 2008; Lutfey et al. 2008). It is plausible that statistical discrimination, via the prevalence hypothesis, may explain all or part of this gender effect.
Understanding the cognitive processes of diagnostic decisions, and distinguishing which, if any, types of discrimination are most involved, may be critical to understanding health disparities and suggesting possible interventions (Lutfey and Ketcham 2005). To help understand how prevalence data and statistical discrimination may affect clinical decision making for CHD, our specific objectives were as follows:
CHD is the single greatest cause of death for men and women in the United States and Europe. The case of CHD is particularly well-suited for our purposes because epidemiologic prevalence data on CHD are extensive and can be regarded as a reliable and sufficiently specific source of priors; these data also generally show that the prevalence of CHD is higher in men than in women (Rosamond et al. 2008). As our focus is on understanding the role of priors, we use data from an experiment that restricted the role of the individual patient's symptoms and was designed to eliminate differential miscommunication by standardizing all patients' presentations to be strongly suggestive of CHD. Objective 1 descriptively examines the role of prevalence data in clinical decision making among these physicians as an initial step. It attempts to inform the question: do physicians hold on to their priors when faced with an individual patient presentation strongly suggestive of CHD? Objective 2 investigates which physicians are most likely to maintain priors, and with which patients, to help inform targets for interventions in clinical decision making. Investigating physicians' adherence to their prior probabilities during clinical decision making is important to decipher cognitive processes; furthermore, it may reveal the extent to which gender differences in CHD are due to reliance on prior data.
Our data source was the NIH-funded project “Cognitive Basis of CHD Disparities.” This project conducted a factorial experiment to measure the unconfounded effects of (a) patient attributes (age, gender, race, and socioeconomic status); (b) physician characteristics (gender and years of clinical experience); and (c) cognitive priming status on medical decision making for an actor “patient” presenting with CHD in a videotaped vignette. The study was approved by New England Research Institutes' Institutional Review Board and obtained signed informed consent from all participants.
Physicians were sampled from North and South Carolina to fill four design strata (gender by experience) totaling 256 participants. Eligibility criteria were as follows: (a) internist, general practitioner, or family practitioner; (b) graduated from medical school between 1996 and 2001 or between 1960 and 1987 (to preserve orthogonality and ensure clear separation on level of clinical experience); and (c) currently seeing patients at least half-time. Prospective participants were sent letters and screened for eligibility by telephone. An appointment was scheduled with each eligible participant for a one-on-one, structured interview. Interviews were conducted over a 10-month period in 2006–2007.
Professional actors portrayed patients presenting to a primary care provider with the essential signs and symptoms of CHD. Sixteen versions of the scenario were created using eight actors/actresses, systematically varying the patient's gender, age (55 or 75 years), race (black or white), and socioeconomic status (lower or upper, as depicted by occupation). The script was designed to include the key diagnostic evidence that would lead physicians to suspect CHD, as described in Table 1.
To investigate the effect of cognitive priming status, half of physicians were randomly selected to be primed (i.e., explicitly directed) to consider a CHD diagnosis before viewing the vignette. These participants were told that a physician who had seen the patient while s/he was on vacation had mentioned the possibility of CHD and suggested s/he see her/his primary care physician upon returning home.
After viewing the vignette, a trained interviewer asked the physician to say what she or he thought was going on with the patient, and to state a diagnostic certainty value (on a scale of 0–100, 0 indicating no certainty and 100 indicating complete certainty) for each diagnosis. The diagnostic certainty of CHD indicates a posterior probability. The majority of physicians (98.8 percent) identified CHD as one possible diagnosis; physicians who did not mention CHD were asked later in the interview to state their CHD diagnostic certainty. Then, in a gender substitution exercise, physicians were asked what their CHD certainty value would be if a patient of the opposite gender had presented in the exact same way as their observed patient.
Physicians were asked to provide their estimate of the prevalence of CHD in the entire U.S. adult population. They were then asked to rate the accuracy of a prevalence estimate of 6.9 percent for adult men and women as being “far too low, [men/women] have a much higher base rate,” or “far too high, [men/women] have a much lower base rate” as the extremes of a seven-point Likert scale, without being told that this was the overall prevalence of CHD published by the American Heart Association in 2006 (Thom et al. 2006).
In accordance with our objectives, the analysis was conducted in three phases. First, we calculated descriptive statistics for physician characteristics and responses regarding prior data and diagnostic certainty. To investigate the cognitive decision making process, a primary interest in this analysis is whether physicians were consistent in the diagnostic process regarding their CHD priors and diagnostic certainties. We categorized physicians as “consistent” or “inconsistent” in the diagnostic process based on the relationship between the direction of change in CHD prevalence estimates and the direction of change in CHD diagnostic certainty for the individual patient when asked to substitute patient gender. That is, a physician was labeled inconsistent if his/her group (i.e., prevalence) and individual (i.e., patient) assessments were not concordant (e.g., if a physician perceived the prevalence as higher in men, but was equally or more certain of the CHD diagnosis for the individual patient if she were female, then the physician would be labeled inconsistent). For our purposes, the measure of inconsistency is not meant to indicate proper or improper decision making; rather, it serves to reveal a cognitive aspect of the decision making process.
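The consistency rule described above can be sketched in a few lines. This is an illustrative reading rather than the study's actual coding scheme: the names `Direction`, `certainty_direction`, and `is_consistent` are hypothetical, and the sketch assumes that "concordant" means the physician's prevalence view and gender-substitution certainty point in exactly the same direction (including "no difference" on both).

```python
from enum import Enum


class Direction(Enum):
    """Which gender is favored, at the population or the individual level."""
    HIGHER_IN_MEN = "men"
    HIGHER_IN_WOMEN = "women"
    NO_DIFFERENCE = "same"


def certainty_direction(certainty_male: float, certainty_female: float) -> Direction:
    """Direction of the CHD certainty change in the gender substitution exercise."""
    if certainty_male > certainty_female:
        return Direction.HIGHER_IN_MEN
    if certainty_female > certainty_male:
        return Direction.HIGHER_IN_WOMEN
    return Direction.NO_DIFFERENCE


def is_consistent(prevalence_view: Direction,
                  certainty_male: float,
                  certainty_female: float) -> bool:
    """Assumed rule: consistent only if group and individual directions match."""
    return prevalence_view is certainty_direction(certainty_male, certainty_female)


# The example from the text: prevalence perceived as higher in men, but equal
# certainty for the individual patient regardless of gender -> inconsistent.
print(is_consistent(Direction.HIGHER_IN_MEN, 60.0, 60.0))
```

The certainty values passed in are hypothetical 0–100 ratings; only their ordering matters for the classification.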
For our second objective, we examined predictors of consistency using statistical models. In models using only design factors (patient factors of gender, age, race, socioeconomic status; physician factors of gender, experience, priming status), the balanced factorial design allows the unconfounded estimation of main effects and interactions using analysis of variance. In models additionally evaluating nondesign physician factors, such as attitudes regarding medical literature and published prevalence rates, results were similar in unadjusted and multivariate models.
Finally, we examined whether a previously reported effect of patient gender on CHD certainty (Arber et al. 2006; Bonte et al. 2008; Lutfey et al. 2008) may be explained by physicians' priors (i.e., the prevalence hypothesis of statistical discrimination). Here, we compared the effect of gender in two generalized linear models shown below: (1) the basic factorial model, in which the dichotomous patient and physician characteristics are independent variables, and certainty of CHD is the dependent continuous variable (Yi); and (2) Model 1 with added categorical indicator variables for the physicians' perceptions of CHD population prevalence (categories of higher in men, higher in women, or, the reference group, similar for men and women).
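In a notation assumed here for illustration, the two models described above can be sketched as follows. Apart from Yi, the symbols are not taken from the original analysis, and the sketch shows main effects only; any interaction terms that were fitted are omitted. It also assumes that the seven dichotomous design factors (patient gender, age, race, and socioeconomic status; physician gender and experience; priming status) all enter as indicators.

```latex
% Notation assumed for illustration only.
% Y_i      : CHD diagnostic certainty (0-100) reported by physician i
% x_{ik}   : 0/1 indicator for the k-th dichotomous design factor, k = 1..7
% m_i, w_i : 0/1 indicators that physician i perceives CHD prevalence as
%            higher in men / higher in women (reference: similar by gender)
\begin{align}
\text{Model 1:}\quad Y_i &= \beta_0 + \sum_{k=1}^{7} \beta_k x_{ik} + \varepsilon_i \\
\text{Model 2:}\quad Y_i &= \beta_0 + \sum_{k=1}^{7} \beta_k x_{ik}
                           + \gamma_{M}\, m_i + \gamma_{W}\, w_i + \varepsilon_i
\end{align}
```

In this notation, the comparison of interest is between the patient-gender coefficient in Model 1 and the same coefficient in Model 2, after the prevalence-perception indicators are added.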
We include a pure error term with 128 degrees of freedom because there were two replications of the experiment. There is evidence for statistical discrimination if the effect of patient gender as observed in model 1 is reduced or eliminated in model 2. In addition, in exploratory analysis stratified by patient gender, we included physician estimates of the overall U.S. population prevalence of CHD as a continuous variable, to explore whether the role of this prior data differed for male and female patients.
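The 128 degrees of freedom can be checked with simple arithmetic. The sketch below assumes a full crossing of the seven dichotomous design factors with two physicians per cell; this is an inference from the reported numbers (16 vignette versions, four physician strata, two priming conditions, 256 physicians), and the factor labels are illustrative rather than the study's variable names.

```python
from itertools import product

# Dichotomous design factors; level labels are illustrative.
PATIENT_FACTORS = {
    "gender": ("male", "female"),
    "age": (55, 75),
    "race": ("black", "white"),
    "ses": ("lower", "upper"),
}
PHYSICIAN_FACTORS = {
    "gender": ("male", "female"),
    "experience": ("less", "more"),
    "primed": (False, True),
}
REPLICATIONS = 2  # two physicians per design cell (assumed full crossing)

# 2^4 patient combinations correspond to the 16 vignette versions.
patient_versions = list(product(*PATIENT_FACTORS.values()))

# 2^7 cells when patient and physician factors are fully crossed.
design_cells = list(product(*PATIENT_FACTORS.values(),
                            *PHYSICIAN_FACTORS.values()))

n_physicians = len(design_cells) * REPLICATIONS

# Pure error comes from within-cell replication: (r - 1) df per cell.
pure_error_df = len(design_cells) * (REPLICATIONS - 1)

print(len(patient_versions), len(design_cells), n_physicians, pure_error_df)
```

Under this assumption, each of the 128 cells contributes one degree of freedom for pure error, which matches the figure stated above.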
Because priming status has high potential to affect physicians' cognitive processes during decision making, we repeated all analyses separately for primed and unprimed physicians. Results were similar to the main analysis, which adjusted for this design factor in all multivariate models, and so are not reported.
All statistical tests were two sided and performed at an α level of 0.05 using SAS v. 9.1 (SAS Institute, Cary, NC).
Physicians most commonly perceived the overall U.S. adult population prevalence of CHD to be higher in men (48.4 percent), though a substantial 44.9 percent thought the prevalence was similar by gender (Table 2). The majority of physicians provided an estimate of the overall CHD prevalence that was higher than the rate published by the American Heart Association (data not shown) (Thom et al. 2006).
For their observed patient, physicians were significantly more certain of a CHD diagnosis if the patient were male (mean certainty on scale of 0–100: 61.7 versus 53.0, p=.002). When asked how their certainty of CHD would change if the patient had been male versus female, 52 percent of all physicians said their certainty would not change, 32 percent would be more certain for male patients, and 16 percent more certain for female patients.
Inconsistency between perceptions of prevalence for men versus women and diagnostic certainty for the individual patient occurred with almost half (48.4 percent) of physicians. Inconsistent physicians most frequently diverged from their priors to diagnose male and female patients with equal certainty in the vignette (Figure 1). Physicians who thought that the prevalence of CHD was either higher in men or higher in women were more likely to be inconsistent than those who thought that there was no gender difference (p=.001, Table 3). Although there was considerable variation in the responses, overall, the most common response group was physicians who were consistent in assessing no gender difference in population prevalences and no gender difference in their certainty of CHD for the given patient (28 percent of all participants, Figure 1).
In unadjusted and multivariate models, no physician characteristics were significantly associated with being consistent in the assessment of CHD population prevalence and diagnostic certainty for the given patient (Table 3). Similarly, patient factors of age, race, gender, or socioeconomic status were not associated with consistency (data not shown).
As this experiment and others (Arber et al. 2006; Bonte et al. 2008; Lutfey et al. 2008) have found significant effects of patient gender on CHD certainty, we analyzed whether adjustment for physicians' assessments of gender-based priors explains or attenuates the role of the observed patient's gender. Multivariate model results showed that physicians' gender-based population prevalence assessments were not associated with CHD certainty (p=.5), and the effect of patient gender remained strong (adjusted mean CHD certainty: 63.6 for male versus 54.7 for female patients, p=.001). Thus, this analysis provided no evidence to support the prevalence hypothesis as an explanation for statistical discrimination by gender.
Interestingly, in exploratory analyses of physicians' absolute estimates of the overall adult CHD prevalence, there was a statistically significant association between this population estimate and diagnostic certainty for the individual patient only for male patients (p=.02; female patients p=.52). Physicians who assigned a higher overall population prevalence were more likely to ascribe a higher certainty to the CHD diagnosis for the observed patient only if the patient were male, suggesting that these prior data were not as relevant in the diagnostic decision for female patients.
In this series of analyses, we have shown that physicians often deviated from their preconceived notions of the likelihood of disease when diagnosing patients, thereby placing more weight on the current patient's presentation and less weight on prior probabilities. Patients in our experiment presented with cardinal symptoms of CHD, such that the level of evidence provided by the symptoms led the vast majority of physicians to consider CHD as a diagnosis, and just over half of the physicians (51.6 percent) reported that they would not change their diagnostic certainty based on patient gender. However, in the main factorial experiment, physicians were significantly less certain of the CHD diagnosis for female patients. Our finding that this gender effect could not be explained by the physicians' prior notions of CHD probabilities indicates that statistical discrimination via the prevalence hypothesis was not the underlying reason for gender differences in CHD certainty.
Why women were diagnosed with a lower certainty, despite presenting with the exact same symptoms and controlling for gender-relative priors in CHD prevalence, is critical to understand if worrisome inequalities in clinical decision making and health care are to be appropriately addressed. One possibility is that physicians behaved differently for men and women because of personally held stereotypes or prejudices. Discrimination resulting from personal stereotypes is very different from statistical discrimination resulting from the application of prior probabilities (Balsa, McGuire, and Meredith 2005; McGuire et al. 2008). That is, when physicians use prior probabilities to guide decisions, they are attempting to use as much information as they have available to guide their decisions, in the best interests of the patients. When they are influenced by personal prejudices, they are not acting in the best interests of their patients (Balsa, McGuire, and Meredith 2005). In the current study, the extent to which personal stereotypes or prejudices may explain our results is unknown.
Another possible reason that we did not see evidence of statistical discrimination underlying the gender effect in CHD diagnosis certainty is that our measure of prior information may not have sufficiently captured the priors that physicians held. First, it would have been helpful to know the physicians' clinical experiences with male versus female patients and which CHD symptoms they typically encountered. It is possible that physicians with more frequent exposure to male patients with CHD symptoms similar to the vignette would be more certain of the male simulated patient's diagnosis. Second, our analysis is based on a relative comparison of the physicians' prevalence estimates for men and women, rather than absolute values of estimates for each. However, a relative comparison may be preferable, particularly considering that the physicians' absolute estimates of the overall population prevalence of CHD were substantially higher than the published rate of 6.9 percent (Thom et al. 2006). Assuming that published prevalence data are correct, other studies have also found that prior probabilities estimated by physicians were inaccurate, to the extent that the authors suggested that the use of prior probabilities as a tool for clinical decision making might cause more harm than benefit (Cahan et al. 2003). Important to note is that in our study, the overall estimate was associated with the CHD diagnostic certainty for male patients only; among female patients, the role of this prior information was irrelevant. This difference suggests that the symptom presentation held more weight for female patients than it did for male patients. Alternatively, physicians may have been more confident of the relevance or accuracy of CHD population rates for male patients. Because the CHD prevalence estimate was important in the diagnosis of male patients, there may have been stronger overall evidence (both priors and current patient presentation) to increase the certainty of the CHD diagnosis in male patients.
While information on absolute rates of disease may be helpful for clinical decision making, understanding disparities in health care requires examining relative differences across sociodemographic groups. Comparing genders, the majority of physicians in our study thought that the prevalence estimate was similar for men and women, although a sizeable 46 percent believed it was higher in men. Our finding that the minority of physicians who assessed a higher CHD prevalence for women were most likely to be inconsistent with this notion in their diagnostic certainty suggests that the symptom presentation strongly outweighed their prior beliefs. Accordingly, physicians whose priors held that CHD prevalence was similar by gender were most likely to be consistent. Physician level of clinical experience, keeping up with medical literature, beliefs in the accuracy of published prevalence rates, and priming status did not help predict which physicians would be more likely to adhere to their priors in the diagnostic process.
For present purposes, a critical benefit of using the experimental vignette is that it allows for the manipulation of several variables at once, thereby providing unconfounded results for factors (e.g., race and socioeconomic status) that are otherwise nearly impossible to disentangle (i.e., ensuring internal validity). Studies comparing the vignette methodology with standardized patients and other methods have shown that vignettes are also externally valid for studies of medical decision making and assessments of quality of care (Braspenning and Sergeant 1994; Peabody et al. 2000; Veloski et al. 2005; Robra et al. 2006). To further enhance the external validity of our results (i.e., that physicians behave similarly under experimental conditions as in everyday clinical practice), we took three precautionary steps. First, considerable effort was devoted to ensuring the clinical authenticity of the videotaped presentation. This was achieved by basing the scripts on clinical experience of physician advisors, filming with experienced clinicians present, and using highly trained professional actors/actresses. Second, physicians viewed the vignette in the context of their practice day (not at a professional meeting, course update, or home) so that it was likely they encountered real patients before and after they viewed the patient in the videotape, thereby retaining as much of the situational context as possible. Third, physicians were specifically instructed at the outset to view the patient as one of their own and to respond as they would typically respond in their own practice. When asked if the patient viewed on the videotape was typical of patients they encounter in everyday practice, 90 percent of physicians considered the patient very typical or reasonably typical.
In this paper, we have examined two major funnels of information that may help a physician come to a diagnostic decision. A critical issue in clinical decision making is that both funnels of information are subject to some unknown level of error. For example, reports of a given patient's symptoms may be faulty if there is miscommunication between the doctor and the patient, or if the patient does not provide certain details, perhaps because of embarrassment or a belief that such details are irrelevant. An advantage of our experiment was that it minimized miscommunication to the greatest extent possible; while it is impossible to control physicians' perceptions of the patient's signals, all facets of the patient presentation (apart from design factors such as gender) were exactly the same across all patient encounters.
The second funnel, which includes published data in the medical literature or personal physician knowledge or experience, may suffer its own biases. For instance, data on base rates ultimately stem from reports of physicians' diagnoses, which are prone to error. Thus, for use of prevalence data to be an acceptable option, it is essential that data are accurate and up to date. For example, if epidemiological data indicating that the prevalence of CHD is higher in men were erroneous, then using such data to dismiss a CHD diagnosis in a female patient in the face of symptom uncertainty would lead to a faulty diagnosis and delayed treatment.
We have shown that given a sufficient symptom presentation, prevalence data were often outweighed during the decision making process. Thus, our results suggest that we need not worry excessively over the potential that statistical discrimination needlessly affects published rates for CHD by gender, given an adequate patient symptom assessment. Reliance on prior beliefs rather than patient-specific information to guide clinical decision making is difficult to justify when high-quality patient information and low-cost tests are available (Balsa, McGuire, and Meredith 2005). The finding that physicians' perceptions of base rates may be inaccurate further supports this notion. While our methodology cannot definitively rule out the potential for any role of priors in the gender effect in CHD diagnoses, the influence of stereotypes or prejudice should be examined in future work. In addition, the extent to which physicians rely on priors to diagnose patients when patient presentation is less informative or when miscommunication is likely is critical to examine in the field of clinical decision making.
Joint Acknowledgment/Disclosure Statement: This study was supported by National Heart, Lung, and Blood Institute grant R01 HL079174. The authors wish to thank Dr. Carol Link, senior statistician, for data preparation and statistical guidance.
Disclosures: There are no conflicts of interest or disclosures.
Disclaimers: There are no disclaimers.