Previous studies have documented the underdiagnosis of coronary heart disease (CHD) in women, but less is known about which alternate diagnoses take precedence and whether additional patient factors modify possible gender bias.
To measure gender variation in clinical decision making, including (1) the number, types, and certainty levels of diagnoses considered and (2) how diagnoses vary according to patient characteristics, when patients have identical symptoms of CHD.
This was a factorial experiment presenting videotaped CHD symptoms, systematically altering patient gender, age, socioeconomic status (SES) and race, and physician gender and level of experience. The primary end point was physicians' most certain diagnosis.
Physicians (n=128) mentioned five diagnoses on average, most commonly heart, gastrointestinal, and mental health conditions. Physicians were significantly less certain of the underlying cause of symptoms among female patients regardless of age (p=0.006), but only among middle-aged women were they significantly less certain of the CHD diagnosis (p<0.001). Among middle-aged women, 31.3% received a mental health condition as the most certain diagnosis, compared with 15.6% of their male counterparts (p=0.03). An interaction effect showed that females with high SES were most likely to receive a mental health diagnosis as the most certain (p=0.006).
Middle-aged female patients were diagnosed with the least confidence, whether for CHD or non-CHD conditions, indicating that their gender and age combination misled physicians, particularly toward mental health alternative diagnoses. Physicians should be aware of the potential for psychological symptoms to erroneously take a central role in the diagnosis of younger women.
Several decades of health services research have documented social variations in medical care, and recent attention has turned to the contribution of provider decision making in generating health disparities.1,2 In particular, research indicates that the care a patient receives may be as much a function of who the patient is (age, gender, race/ethnicity, socioeconomic status [SES]), who the provider is (age, gender, specialty), and where the care is delivered (private/public facility, geographic location) as it is of the symptoms actually present.3–10
A compelling example is coronary heart disease (CHD), the single greatest cause of death for men and women in the United States and Europe. Remarkably, women generally have lower age-adjusted CHD incidence and mortality than men.11 As a result, CHD has been considered a predominantly male disease.5,6,12–14 Reasons for a gender difference in the occurrence of CHD are not completely understood; current hypotheses include differential gender effects of vascular pathophysiology and protective effects of estrogen.12,15 Nevertheless, some studies suggest that when admitted to the hospital with a diagnosis of myocardial infarction (MI), angina, chronic ischemic heart disease, or chest pain, women, unlike men in the same situation, are less likely to undergo various types of coronary surgery and are more likely to die in the hospital.16 Noninvasive tests for coronary artery disease (CAD) are also performed less frequently on women, even though women are more likely to complain of chest pain.17,18
In this context, the extent to which preconceptions about gender and CHD risk influence the initial diagnosis of CHD remains unclear.19 Overall, epidemiological data suggest that women, particularly younger women, are underdiagnosed as a result of unrecognized symptoms or faulty symptom interpretation. For example, more women than men who died suddenly of CHD were said to have had no previous symptoms,11 and twice as many women as men aged 45–64 had undetected MIs.20,21 Recent experimental research involving advanced medical students found gender bias in the assessment of written vignettes portraying CHD symptoms in 58-year-old female vs. 48-year-old male patients.19 However, the observed difference was mediated by the context of the patient presentation, such that only when the presentation indicated psychological distress did women receive significantly fewer CHD diagnoses, and the interpretation of symptom origins shifted from organic to psychogenic.
If physicians misinterpret symptoms of CHD in certain female patients, it is plausible that the apparent gender difference in CHD rates is due, at least in part, to gender bias in clinical decision making. In this article, we test the hypothesis that the same symptomatic CHD presentation is interpreted differently by practicing physicians depending on patient gender, expanding on previous work to additionally investigate the alternate non-CHD diagnoses patients receive. Specifically, we examine (1) the number, types, and certainty levels of diagnoses considered and (2) how diagnoses vary according to patient gender, as well as possible interactions between gender and age, SES, or race. An investigation of the extent and nature of gender bias in the diagnosis of CHD is essential to fully understand reasons for apparent gender disparities in disease rates and may ultimately serve to reduce needless delays in physicians' diagnoses of heart disease in women.
Our objective was to estimate the unconfounded influence of patient gender on medical decision making when physicians are presented patients who show identical CHD symptoms. We conducted a factorial experiment, which permits estimation of unconfounded main effects of patient and physician gender and interactions22–24 for each of the following factors: patient age, patient SES, patient race, physician gender, and physician level of experience. Details on the methodology of the video vignette and semistructured interview have been published previously and are summarized below.10 The study was approved by the Institutional Review Board of the New England Research Institutes (Watertown, MA).
Physicians were randomly sampled throughout Massachusetts to fill four design cells (two cells of gender by two cells of clinical experience) to total 128 participants as the final sample size. Eligibility criteria were (1) internist or family practitioner, (2) ≤12 years or ≥22 years clinical experience (to get a definite separation in the amount of clinical experience in our two design strata), (3) educated at an accredited U.S. medical school, and (4) currently providing clinical care half-time at a minimum. Screening telephone calls identified eligible physicians. Informed consent was obtained during in-person interviews, conducted May 2001–March 2002. Participants were provided modest stipends (U.S.$100).
Of the 128 physicians who participated, 95 (74.2%) were internists, 28 (21.9%) were family practitioners, and 5 (3.9%) were general practitioners. The majority practiced in either a small group (32.0%) or large group (21.1%) setting, with the rest in community health centers (17.2%), hospitals (14.8%), or solo practice (14.8%).
Under experienced physician supervision, professional actors and actresses were cast to portray a patient coming to a primary care provider with the signs and symptoms of CHD. Sixteen versions of the scenario were videotaped using eight actors/actresses, systematically varying the patient's age (55 vs. 75), race (white vs. black), gender, and SES (lower vs. higher, as portrayed by the same actor/actress playing either a janitor or a teacher). Each videotaped encounter simulated an initial interview with an internist or family practitioner and was of 7–8 minutes in duration, reflecting the average length of a face-to-face consultation with a primary care physician (excluding physical examination).25
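The sixteen versions correspond to a full crossing of the four two-level patient factors. A minimal Python sketch of that enumeration is given here for illustration; the factor labels and variable names are assumptions and are not taken from the study materials.

```python
from itertools import product

# Patient factors varied across the videotaped scenarios (labels are illustrative).
patient_factors = {
    "age": (55, 75),
    "race": ("white", "black"),
    "gender": ("male", "female"),
    "ses": ("lower (janitor)", "higher (teacher)"),
}

# Full crossing of the four two-level factors: 2 * 2 * 2 * 2 = 16 versions.
vignettes = [
    dict(zip(patient_factors, levels))
    for levels in product(*patient_factors.values())
]

# SES was varied within performer, so each performer covers both SES levels
# of one age-by-race-by-gender combination: 16 / 2 = 8 performers.
performers = {(v["age"], v["race"], v["gender"]) for v in vignettes}

print(len(vignettes), len(performers))  # prints: 16 8
```

The count of eight performers follows from the same actor or actress playing both the janitor and the teacher.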
A script that included both verbal and nonverbal direction for the video simulation was developed from tape-recorded role playing sessions with experienced, clinically active advisors. Patients in the vignette spoke about their reason for the visit. The script was designed to include the key diagnostic evidence that would lead physicians to suspect CHD (Table 1). The script did not deliberately attempt to divert participants' attention by including diagnostic criteria for other conditions, and no preexisting comorbidities were presented.
After the participant viewed the video-simulated consultation, the interviewer asked: "Please list what you think is going on with this patient. (What are the possibilities?)" Then the interviewer asked: "Using a scale of 0–100, with 0 indicating total uncertainty and 100 indicating total certainty about a particular condition, how certain are you that the patient has [condition]?" The interviewer recorded verbatim the physician's full response.
The purpose of this analysis was to investigate how physicians' interpretations of CHD symptoms are influenced by patient gender and age. To this aim, the primary outcome was the diagnosis that physicians were most certain of as the underlying condition that caused the patient's symptoms. Additional outcomes for preliminary analyses included the number of diagnoses considered, the maximum certainty of any diagnosis, the certainty for a CHD diagnosis, and representation of non-CHD diagnoses. We estimated the effect of patient gender on these outcomes and also whether gender effects were modified by patient age, race, SES, or physician level of experience or physician gender. We used analysis of variance (ANOVA) as the primary analytical method. The balanced factorial design allows us to estimate the main effects of the patient and physician design factors and two-way interactions (e.g., gender×age) with no confounding by other design factors. All the patient and physician design factors were controlled for simultaneously in the model. We considered physician specialty and practice setting as potential covariates, but neither was associated with our outcomes and results were not changed, so they were not included in the final models.
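The absence of confounding in a balanced factorial design can be illustrated by checking that the contrast columns for all main effects and two-way interactions are mutually orthogonal. The following Python sketch does so under stated assumptions: the factor names are hypothetical, each factor is coded -1/+1, and two physicians per design cell is assumed because it reproduces the total of 128. It is not the SAS model used for the analyses.

```python
from itertools import combinations, product

# Hypothetical names for the six two-level design factors.
factors = ["pt_gender", "pt_age", "pt_race", "pt_ses", "md_gender", "md_experience"]

# One row per physician: 2**6 = 64 design cells, each factor coded -1/+1.
# Two physicians per cell is an assumption that reproduces the total of 128.
cells = list(product((-1, 1), repeat=len(factors)))
rows = [cell for cell in cells for _ in range(2)]

# Contrast columns: six main effects plus fifteen two-way interaction products.
columns = {name: [r[i] for r in rows] for i, name in enumerate(factors)}
for (i, a), (j, b) in combinations(enumerate(factors), 2):
    columns[f"{a}x{b}"] = [r[i] * r[j] for r in rows]

def dot(u, v):
    """Inner product of two contrast columns."""
    return sum(x * y for x, y in zip(u, v))

# In a balanced design every pair of distinct contrast columns is orthogonal,
# so the estimate of one effect does not depend on which others are in the model.
max_overlap = max(
    abs(dot(columns[a], columns[b])) for a, b in combinations(columns, 2)
)

print(len(rows), len(columns), max_overlap)  # prints: 128 21 0
```

A maximum overlap of zero means that, for example, the patient gender contrast carries no information about patient age or SES, which is the sense in which the effects are unconfounded.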
Our sample size of 128 physicians gives 80% power to detect an absolute difference in means of 25% in analyses of main effects of physician or patient characteristics (e.g., a true difference in CHD certainty for male vs. female patients of 60 vs. 45 points will be detected 80% of the time at α=0.05). When analyzing two-way interactions between any of the design factors (e.g., patient gender×age interaction), this sample size provides 80% power to detect an effect size of 0.25. Statistical tests were performed at α=0.05; no measures were taken to account for multiple testing,31 but we note that results observed at the p<0.02 level are unlikely to have changed. In light of the multiple tests, we facilitate interpretation by presenting actual p values, unadjusted for multiple testing, to allow readers to choose their preferred level of significance. We used SAS v.9.1 (SAS Institute, Cary, NC) to conduct these analyses.
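As a rough consistency check on these power statements, the following Python sketch uses a normal approximation for a two-sided, single-degree-of-freedom contrast. It assumes that the "effect size of 0.25" refers to Cohen's f, which the text does not state, and the function name `approx_power` is illustrative. The within-group standard deviation it derives is an implication of those assumptions, not a value reported by the study.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(f, n_total, alpha=0.05):
    """Approximate power of a two-sided, 1-df contrast with Cohen's f (assumed)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # The expected z statistic is f * sqrt(N); the opposite tail is ignored.
    return z.cdf(f * sqrt(n_total) - z_crit)

n_total = 128
power = approx_power(0.25, n_total)

# For two equal groups, Cohen's d = 2 * f. Under that assumption the
# 15-point example (60 vs. 45) implies a within-group SD of 15 / d.
implied_sd = (60 - 45) / (2 * 0.25)

print(f"approximate power: {power:.2f}, implied SD: {implied_sd:.0f} points")
```

Under these assumptions the approximate power is close to the stated 80%, and the main-effect example and the interaction effect size describe the same standardized difference.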
Patient gender, particularly in combination with age or SES, had significant effects on our outcomes of interest. Patient race and physician gender or level of experience did not modify the observed effects of patient gender in the following analyses (data not shown).
Physicians mentioned on average 5 possible diagnoses (range 2–10) for what they thought was going on with the simulated patient. The number of diagnoses considered did not vary by patient gender or age. Numerous conditions were mentioned as possible diagnoses for the patient (Fig. 1). The most commonly mentioned diagnoses involved gastrointestinal conditions, particularly acid-related problems. Female patients' symptoms were significantly more often attributed to gastrointestinal conditions than were the exact symptoms presented in men (44% vs. 38%, p=0.02). Second to gastrointestinal were heart conditions, followed by mental health, neither of which significantly varied in their representation by patient gender.
Regardless of the diagnoses considered, physicians were significantly more certain of their diagnosis for male patients than for female patients (mean maximum certainty 81 vs. 71 on a scale of 0–100, p=0.006). This gender effect was evident among all ages but was particularly apparent among younger patients (Table 2).
Although gastrointestinal disorders were most common among the possible diagnoses considered by each physician, the vast majority of physicians (94.5%) also mentioned CHD as a possible diagnosis. Physician certainty of the CHD diagnosis significantly depended on the patient's gender and age combination (Table 2); that is, for younger patients, physicians were significantly less certain of the CHD diagnosis in females compared with males (mean certainty 48 vs. 66, p<0.001). In contrast, for older patients, physicians were equally certain that CHD was a possible explanation for the patient's symptoms.
Interestingly, exploratory analyses showed that physicians who were less certain about the CHD diagnosis (defined as < median certainty of 65) were more likely to consider a greater number of diagnoses (mean 5.6 vs. 4.7, p=0.08; data not shown). Furthermore, a greater proportion of their diagnoses were not directly CHD related. Specifically, mental health diagnoses represented a greater proportion of all possible diagnoses among physicians who were less certain of the CHD diagnosis (23% vs. 14%, p=0.04), and the certainty of the mental health diagnosis was significantly greater among these physicians (mean certainty for mental health diagnosis 43 vs. 29, p=0.03).
Despite the various diagnoses considered, all physicians ascribed their highest certainty to a heart, gastrointestinal, or mental health condition (Table 3). When stratified by patient age, results suggested a gender effect in receiving a CHD or mental health condition as the most certain diagnosis. Among younger patients, physicians were most certain that CHD was the proper diagnosis for 62.5% of males compared with 46.9% of females, a difference that was not statistically significant (p=0.23). On the other hand, younger females were significantly more likely to receive a mental health diagnosis as the underlying cause of the symptoms compared with younger males (31.3% vs. 15.6%, p=0.03). Among older patients, in contrast, mental health diagnoses were equally distributed. Although tests for interactions between gender and age did not reach statistical significance, the stratified results suggest that gender and age together modified physicians' perceptions of the patient's condition. There were no main effects of patient race or SES in determining the most certain diagnosis.
Although patient race and physician factors did not modify the effect of gender on these diagnoses, there was a statistically significant interaction between patient gender and SES in the diagnosis of a mental health condition (p=0.006) (Table 4). Among patients of higher SES, a significantly greater percentage of females was given a mental health diagnosis as most certain (37.5% of females vs. 12.5% of males, p=0.01). In contrast, among lower SES patients, there were no clear gender effects. Age did not significantly modify these findings (p=0.34 for 3-way interaction between patient age, gender, and SES). It is noteworthy, however, that the gender difference was most apparent among younger patients of higher SES (females 43.8% vs. males 12.5%, p=0.01).
The role of SES alone in the diagnosis of younger female patients was striking as well, with 43.8% of higher SES vs. 18.8% of lower SES younger females most certainly diagnosed with a mental health condition (p=0.01). Patient SES was not a significant factor in the proportion of patients who received a confident diagnosis of a gastrointestinal condition or CHD.
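The stratified percentages reported in these results can be translated into whole-number physician counts. The Python sketch below does this under the assumption of a perfectly balanced design, so that each gender-by-age stratum contains 32 physicians and each gender-by-age-by-SES stratum contains 16; these cell sizes are inferred from the design and are not stated explicitly in the text. The helper name `implied_count` is illustrative.

```python
def implied_count(percent, n):
    """Nearest whole number of physicians corresponding to `percent` of `n`."""
    return round(percent / 100 * n)

# Cell sizes assumed from a balanced design: 128 physicians split evenly over
# gender x age (32 each) and over gender x age x SES (16 each).
reported = [
    ("younger males, CHD most certain", 62.5, 32),
    ("younger females, CHD most certain", 46.9, 32),
    ("younger females, mental health most certain", 31.3, 32),
    ("younger males, mental health most certain", 15.6, 32),
    ("younger higher-SES females, mental health most certain", 43.8, 16),
    ("younger lower-SES females, mental health most certain", 18.8, 16),
]

counts = {}
for label, percent, n in reported:
    k = implied_count(percent, n)
    # The count is consistent if it reproduces the reported figure to 1 decimal.
    consistent = abs(100 * k / n - percent) < 0.06
    counts[label] = (k, n, consistent)
    print(f"{label}: {k}/{n} (consistent with reported value: {consistent})")
```

Each reported percentage maps onto a whole number of physicians in its assumed stratum, which is what a balanced allocation would produce. Note that the p values in the text come from models that adjust for all design factors, so they are not expected to match a simple two-group test on these counts.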
In this experiment, we found evidence of a gender effect in the diagnosis of patients with CHD symptoms that was specific to younger ages (aged 55 vs. 75 years). Middle-aged female patients with the same symptoms and context were diagnosed with the least confidence for both CHD and non-CHD conditions, indicating that their gender and age combination within the context of their presentation confused or misled physicians. Physicians who were less certain of the CHD diagnosis considered more alternate diagnoses, predominantly gastrointestinal and mental health conditions. Women were twice as likely to be diagnosed with a mental health condition as the most likely cause of the symptoms if they were middle-aged. Furthermore, gender bias toward the mental health diagnosis was particular to women of higher SES, whose symptoms were most often thought to be of psychogenic origin. The effect of patient gender on certainty of diagnoses was not modified by patient race, physician gender, or physician level of experience. Results suggest that younger women with CHD symptoms are at greater risk of misdiagnosis and, hence, the possibility of treatment delay.
We used a rigorous experimental method to ensure internal validity of these results. The factorial design removes the potential for confounding by patient age, race, and SES and physician gender or level of experience. We cannot be certain, however, that physicians' clinical decisions were unaffected by the actors/actresses who portrayed the patients. Considering possible threats to external validity, we took numerous precautionary measures to maximize the generalizability of the research findings: (1) we randomly sampled participants from all Massachusetts physicians, (2) consultants (nonparticipant physicians) provided expertise during script development and were present during filming to assure clinical authenticity of the scenario, (3) participants confirmed how typical the patient was compared with patients in their everyday practice (92% considered them very or reasonably typical), (4) participants viewed the vignettes in the context of their regular practice day; that is, they likely saw actual patients before and after viewing the simulated patient; and (5) participants were specifically instructed to view the patient as one of their own and to respond as they would in their own practice. Furthermore, studies comparing the vignette methodology with other methods, such as standardized patients and chart abstraction, have shown that vignettes provide valid estimates for studies of medical decision making.32–36
Although a simple explanation for our finding of gender ageism may be that physicians are merely combining information from the case presentation with their previous knowledge of CHD risk profiles (i.e., being good Bayesians) to properly ascribe a lower likelihood of CHD to younger women,9,21 this theory alone is unconvincing for two main reasons. First, the most recent epidemiological data show that the gender difference in CHD rates is consistent throughout the adult age span. For example, NHANES data from 1999–2004 show that in 40–59-year-olds, the estimated CHD prevalence in women is 71% of that in men (7.8% of men vs. 5.5% of women), and in 60–79-year-olds, it is 68% of that in men (22.8% of men vs. 15.4% of women).11 Therefore, a straightforward gender difference in CHD diagnosis should similarly apply to both older and middle-aged patients, which is contrary to our results. We recognize, however, that physicians may not be aware that gender differences in population CHD rates occur among both middle-aged and older adults.
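The female-to-male prevalence ratios quoted above follow directly from the cited percentages, as this short Python sketch shows. The variable names are illustrative; the prevalence values are those given in the text.

```python
# NHANES 1999-2004 CHD prevalence (%) as quoted in the text: (men, women).
prevalence = {
    "40-59": (7.8, 5.5),
    "60-79": (22.8, 15.4),
}

# Ratio of prevalence in women to prevalence in men within each age band.
ratios = {band: women / men for band, (men, women) in prevalence.items()}

for band, ratio in ratios.items():
    print(f"ages {band}: prevalence in women is {ratio:.0%} of that in men")
# ages 40-59: prevalence in women is 71% of that in men
# ages 60-79: prevalence in women is 68% of that in men
```

The two ratios differ by only about three percentage points, which is the basis for describing the gender difference as consistent across the two age bands.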
Second, whether or not physicians were aware of recent epidemiological data, the explanation that they were merely using their knowledge of CHD risk profiles should have been supported by results for other patient characteristics that are established risk indicators for CHD, such as race and SES. However, differences in CHD certainty by patient race37 were not concordant with population data,11 and patient SES alone did not influence our results. For these reasons, it is unlikely that our observed gender difference in diagnosing middle-aged adults is due solely to physicians' use of prevailing epidemiological data. Rather than being legitimate cues for the physician, the combination of patient gender and age may have obscured CHD symptoms among younger women.
Patient complaints that are CHD symptoms but not exclusively indicative of CHD, such as gastrointestinal discomfort and mood changes,26–30 were presented in the vignette because patients seldom appear as clear-cut cases. Our purpose was not to make the physicians' diagnostic task more difficult but to increase the clinical authenticity of the scenario, so that it more accurately represented how actual patients appear and allows results to be useful in practice. As a result of this clinical authenticity, physicians' diagnostic decisions may have been influenced by the mood changes (e.g., irritable, not myself lately) presented alongside the classic (e.g., chest pain) CHD symptoms. Indeed, results from a recent experiment by Chiaramonte and Friend19 indicated that when patients experience CHD symptoms in the context of stress, women, but not men, are less likely to be diagnosed with CHD. The authors suggested that an interaction between gender and psychological symptoms produced a shift in the interpretation of CHD symptoms, so that symptoms were no longer considered to be cardiogenic but rather to be a manifestation of psychogenic stress. Our results are concordant with the hypothesis that psychological symptoms may take a central role in the assessment of female, but not male, patients.
Whereas the study by Chiaramonte and Friend19 could not analyze age effects owing to its design, our finding that the gender bias in mental health misdiagnoses occurred only in younger patients suggests that the extent to which psychological symptoms influence the decision-making process depends on other patient demographic factors that interact with gender. In addition to age, for example, SES was also important in these results. Only among patients of higher SES were women more likely to be diagnosed with a mental health condition. If physicians were relying on prevalence data of mental health disorders, this finding would be unexpected because most evidence suggests greater psychopathology among lower SES populations.38 A possible explanation is that physicians were aware that CHD risk is greater among lower SES populations39 and were, therefore, less certain about the patient having a stereotypically low-risk combination of higher SES and female gender. Indeed, physicians who were less certain of the CHD diagnosis were significantly more certain of the mental health diagnosis. Furthermore, mental health conditions were the only category of diagnoses to be associated with CHD certainty (e.g., the certainty of the commonly reported gastrointestinal diagnoses did not vary by patient gender or CHD certainty).
In summary, our results support the hypothesis that gender alone is not sufficient to produce bias in the diagnosis of patients with CHD symptoms. Rather, the combination of female gender and younger age rendered physicians less certain of the most probable cause of the symptoms and increased the likelihood that other diagnoses were strongly considered. The broader context of the patient's presentation may influence physicians' interpretations of women's symptoms more so than men's. As our experiment was not designed to evaluate the effect of stress, future studies should investigate whether gender ageism is modified by psychological presentation. Future research should also specifically question the underlying cognitive and psychological dimensions of physicians' decision-making patterns, including knowledge of data on CHD base rates and use of Bayesian processes. In the meantime, physicians must be aware of the potential for psychological symptoms to impact the diagnosis of younger women with CHD symptoms. This awareness may help reduce needless delays in physicians' diagnoses of heart disease in middle-aged women. Research continues to investigate how CHD risk and symptoms may manifest uniquely in younger women, but physicians need to keep in mind that epidemiological base rates for CHD risk by gender do, after all, depend on their own accurate and timely diagnoses.
This work was funded by a grant from the National Institutes of Health, National Institute on Aging (AG16747). We thank Amy O'Donnell and Sarah Addison for technical and administrative support.
No conflicts of interest or competing financial interests exist.