Efforts to improve the quality and costs of U.S. health care have focused largely on fostering physician adherence to evidence-based guidelines, ignoring the role of clinical judgment in more discretionary settings. We surveyed primary care physicians to assess variability in discretionary decision making and evaluate its relationship to the cost of health care. Physicians in high-spending regions see patients back more frequently and are more likely to recommend screening tests of unproven benefit and discretionary interventions compared with physicians in low-spending regions; however, both appear equally likely to recommend guideline-supported interventions. Greater attention should be paid to the local factors that influence physicians’ clinical judgment in discretionary settings.
Health care spending in the United States is the highest in the world and continues to grow at a rate of 7 percent per year.1 Such liberal health care spending fails, however, to provide the country with the best health in the world.2 Even within the United States—where per capita spending varies more than twofold between the lowest- and highest-spending regions—higher spending appears to result, if anything, in slightly lower quality and worse outcomes.3 U.S. regions with the highest spending levels do not achieve lower mortality, nor do they show greater improvements in mortality over time.4 Higher spending is also not associated with better access to care, patient satisfaction, or physicians’ ability to provide high-quality care.5 These findings underscore the serious problem of wasteful—and possibly harmful—overuse within the U.S. health care system.
Efforts to improve both the quality and the cost of U.S. health care have focused largely on fostering physician adherence to evidence-based clinical practice guidelines and reducing frank medical errors.6 These approaches are useful when it is possible to precisely define and reliably measure the correct action in a specific clinical situation. However, such explicit approaches are not applicable to the many discretionary decisions that physicians face. Of 2,500 treatments for a variety of conditions reviewed by BMJ Clinical Evidence, more than half fell into this gray zone.7 Current approaches also make it difficult to measure overuse of clinical services; very few of the explicit measures available today focus on overuse, including fewer than 10 percent of the 439 quality indicators in the RAND Quality Tools measurement set.8
To the extent that higher utilization rates depend on physicians’ discretionary clinical decisions—for which the evidence does not point clearly to a right answer—current quality improvement and performance measurement initiatives are unlikely to address rising spending. To examine whether alternative approaches are therefore necessary, we investigated the relationship between health care use and primary care physicians’ discretionary decision making about medical interventions, by comparing physicians across different areas of the country. Instead of comparing actual practice—which might unfairly contrast physicians caring for widely varying panels of patients—we studied discretionary decision making via physicians’ responses to identical hypothetical patients presented as part of a national physician survey. Previous work suggesting a direct relationship between utilization (spending) and physicians’ tendency to intervene was limited to six discrete questions representing simplified discretionary decisions.9 The current survey was developed specifically to reflect the complex array of decisions faced daily in primary care practice and to include decisions about both evidence-based and discretionary interventions.
We conducted a mail survey of primary care physicians to examine the tendency of physicians practicing in regions with different levels of health care spending to intervene (to order tests, referrals, or treatment) in specific clinical situations. This project was approved by the Institutional Review Boards at Dartmouth Medical School and the University of Massachusetts.
To learn how physicians make decisions in the primary care setting and to pilot specific clinical vignettes, we conducted focus groups with primary care physicians in two cities. The focus groups and all subsequent survey development were done in collaboration with the Center for Survey Research, a professional survey research firm affiliated with the University of Massachusetts Boston. Focus groups concentrated on development and wording of realistic clinical vignettes that provided adequate detail and described a patient about whom physicians might disagree on management, and response categories that included an appropriate range of options, of which any number (including none) might be chosen at the time of the visit described.
The draft survey instrument was revised based on the results of cognitive interviews, conducted to ensure that the questions were well understood and the answers meaningful. The final survey consisted of questions about the physician (for example, board certification), his or her practice (for example, setting), and his or her clinical care (including routine follow-up, cancer screening, and evaluation and management decisions).
Using the Masterfiles of the American Medical Association and American Osteopathic Association, we obtained a random sample of primary care physicians (self-identified as family practice, general practice, or internal medicine) practicing at least twenty hours per week in the United States. Residents and retired physicians were ineligible.
Using computer-assisted telephone interviewing (CATI) software, trained professional phone interviewers checked to ensure that each sampled physician met eligibility requirements. A maximum of three call attempts were made during daytime and evening hours, weekdays and weekends, to try to speak to the sampled physician or an informant (such as a receptionist). During the eight-week verification period, we identified 1,419 eligible physicians out of an original sample of 1,775, of whom 1,333 were randomly selected to receive the survey.
Each potential participant was sent an initial questionnaire packet with a letter explaining the study, a $20 cash incentive, a survey instrument, and a postage-paid return envelope. Two weeks after the initial mailing, all physicians who had not responded were sent another questionnaire packet, absent the cash incentive. Of 1,333 physicians who were mailed the initial survey, 58 were found ineligible; of the remaining 1,275, 801 responded (response rate 63 percent). Nonresponders did not differ from responders in terms of sex, primary specialty, practice type, or years in practice.
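The response-rate arithmetic above can be checked with a short calculation. This is a minimal sketch; the function name `response_rate` is our own label, and the only inputs are the counts reported in the text.

```python
def response_rate(mailed: int, ineligible: int, responded: int) -> float:
    """Return responders as a fraction of mailed physicians who remained eligible."""
    eligible = mailed - ineligible
    if eligible <= 0:
        raise ValueError("no eligible physicians remain")
    return responded / eligible


# Counts taken from the text: 1,333 mailed, 58 found ineligible, 801 responded.
rate = response_rate(mailed=1333, ineligible=58, responded=801)
print(f"{rate:.1%}")  # 62.8%, which the text rounds to 63 percent
```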
For our measure of local health care spending, we used the End-of-Life Expenditure Index (EOL-EI), a measure based on Medicare expenditures in the last six months of life that we have described in detail elsewhere.10 The advantage of using this index to compare health care spending across different U.S. regions is that it is closely correlated with overall spending but unrelated to illness—because all patients included in calculating the measure have a life expectancy of six months.11 The greater-than-twofold differences in EOL-EI across U.S. regions are not related to patients’ preferences or illness levels.12 Mean EOL-EI was calculated for each of 306 U.S. Hospital Referral Regions (HRRs), which were then grouped into quintiles (of equivalent population size); mean EOL-EI for each quintile ranged from $11,347 to $17,809 (see Exhibit 1 for selected HRRs within each quintile). Each physician was located within an HRR and a quintile based on his or her practice address. We display results using three categories for spending: low spending (the lowest quintile of EOL-EI), moderate spending (the middle three quintiles combined), and high spending (the highest quintile).
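One way to form quintiles of equivalent population size, and then collapse them into the three display categories, is sketched below. The assignment rule (using the midpoint of each region's cumulative population share) and all region names, spending values, and populations are illustrative assumptions, not the procedure or data used in the study.

```python
def assign_quintiles(regions):
    """Assign population-weighted spending quintiles (1 = lowest, 5 = highest).

    regions: list of (name, spending, population) tuples.
    Assumption: a region belongs to the quintile containing the midpoint of
    its span of cumulative population, after sorting by spending.
    """
    total = sum(pop for _, _, pop in regions)
    ordered = sorted(regions, key=lambda r: r[1])
    quintiles = {}
    cumulative = 0
    for name, _, pop in ordered:
        midpoint_share = (cumulative + pop / 2) / total
        quintiles[name] = min(int(midpoint_share * 5) + 1, 5)
        cumulative += pop
    return quintiles


def spending_category(quintile):
    """Collapse quintiles into the three categories used to display results."""
    if quintile == 1:
        return "low"
    if quintile == 5:
        return "high"
    return "moderate"


# Hypothetical regions: (name, mean EOL-EI in dollars, population).
hypothetical_regions = [
    ("A", 11000, 100),
    ("B", 13000, 100),
    ("C", 14500, 100),
    ("D", 16000, 100),
    ("E", 18000, 100),
]
quintiles = assign_quintiles(hypothetical_regions)
categories = {name: spending_category(q) for name, q in quintiles.items()}
print(categories)
```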
Physician practice intensity was measured using physicians’ responses to three types of survey questions: questions about routine follow-up intervals; questions about whether or not the physician routinely recommends screening patients for each of three cancers (mammography, for which evidence-based recommendations support screening; prostate-specific antigen, or PSA, for which recommendations are equivocal; and spiral computed tomography, or CT, for which recommendations do not support routine screening); and clinical vignettes in which the physician was asked how often he or she would arrange for specific interventions (such as test, referral, or hospitalization) for patients with common clinical conditions. We analyzed dichotomized responses to all questions. For follow-up interval, we used a cut-off of three months. For clinical vignettes, we combined the responses “always or almost always” and “most of the time,” compared to the grouping of “some of the time,” “rarely,” and “never.” One vignette option was excluded because of ambiguous wording.
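The dichotomization described above can be sketched as follows. The response labels come from the text; the function names are our own, and the direction of the follow-up cut-off (three months or more often coded as 1) is an assumption based on how the follow-up results are reported.

```python
HIGH_FREQUENCY = {"always or almost always", "most of the time"}
LOW_FREQUENCY = {"some of the time", "rarely", "never"}


def dichotomize_vignette(response: str) -> int:
    """Return 1 for the two most frequent response categories, 0 for the rest."""
    normalized = response.strip().lower()
    if normalized in HIGH_FREQUENCY:
        return 1
    if normalized in LOW_FREQUENCY:
        return 0
    raise ValueError(f"unrecognized response: {response!r}")


def dichotomize_follow_up(interval_months: float) -> int:
    """Return 1 if the routine follow-up interval is three months or shorter.

    Assumption: the three-month cut-off is inclusive.
    """
    return 1 if interval_months <= 3 else 0


print(dichotomize_vignette("Most of the time"))  # 1
print(dichotomize_vignette("rarely"))            # 0
print(dichotomize_follow_up(3))                  # 1
print(dichotomize_follow_up(6))                  # 0
```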
We examined the relationship between local spending (EOL-EI) and physician practice intensity using individual item responses as measures of practice intensity. Tests for trend were based on logistic regression in which the individual physician’s (dichotomized) response was the dependent variable and the independent variable was spending in the physician’s region, expressed as a continuous variable.
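A test for trend of this kind can be illustrated with a single-predictor logistic regression fitted by Newton-Raphson, followed by a Wald test on the slope. This is a sketch under stated assumptions: the study's analyses were run in STATA, the function names here are our own, and the data are hypothetical (spending in thousands of dollars, one dichotomized response per physician).

```python
import math


def fit_logistic(x, y, iterations=25):
    """Fit logit P(y = 1) = b0 + b1 * x by Newton-Raphson.

    Returns (b0, b1, se_b1). The predictor is centred internally for
    numerical stability; the slope is unaffected by centring.
    """
    mean_x = sum(x) / len(x)
    xc = [xi - mean_x for xi in x]
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(xc, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient, intercept
            g1 += (yi - p) * xi     # gradient, slope
            h00 += w                # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    se_b1 = math.sqrt(h00 / det)    # from the inverse information matrix
    return b0 - b1 * mean_x, b1, se_b1


def wald_p_value(estimate, standard_error):
    """Two-sided p-value for estimate / standard_error under a normal reference."""
    z = estimate / standard_error
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical data: regional spending (thousands of dollars) and whether the
# physician chose the intervention (1) or not (0).
spending = [11, 11, 12, 12, 13, 13, 14, 14, 15, 15, 16, 16, 17, 17, 18, 18]
response = [0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1]

intercept, slope, se_slope = fit_logistic(spending, response)
p_trend = wald_p_value(slope, se_slope)
print(f"slope per $1,000 = {slope:.3f}, p for trend = {p_trend:.3f}")
```

A positive slope indicates that the odds of choosing the intervention rise with regional spending; the p-value tests whether that slope differs from zero.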
We used factor analysis to summarize patterns of correlations among response variables and explore the possibility of collapsing a number of the observed variables into a factor or factors representing practice intensity.13 Factor analysis identified a single factor that appeared to measure physician practice intensity; using this factor (which weights the different component variables according to their apparent intensity), a summary intensity score was derived for each physician (n = 593) who responded to all questions included in the factor analysis. For analyses examining the relationship between local spending and summary intensity score, we used linear regression in which the physician’s intensity score was the dependent variable, and the independent variable was spending in the physician’s region (continuous).
We also conducted a multivariable analysis to control for other personal (age, sex, race), professional (specialty, U.S. medical graduate, board certification), and practice-level (setting, number of managed care contracts, proportion of capitated patients) factors that could influence practice intensity.
An alternative approach to deriving summary intensity scores—using weights representing the intensity of each vignette option, derived by modified Delphi technique—yielded results that were nearly identical.14 All analyses were carried out in STATA 9.1.
Respondents were primarily male (75 percent) and white (74 percent) and had been in practice a median of twenty-one years. There were somewhat more internists (52 percent) than family practitioners (45 percent). Nearly all (86 percent) were board certified in their specialty; 23 percent had attended medical school outside the United States and Canada.
Local spending level was a strong predictor of routine follow-up interval for patients with well-controlled hypertension. In high-spending regions, 47 percent of physicians schedule hypertensive patients every three months or more often, while only 19 percent of physicians in low-spending regions do so (p for trend < 0.001; Exhibit 2). Annual follow-up was almost nonexistent in high-spending regions.
Physicians were asked whether they routinely recommend three cancer screening tests to patients in different age groups (Exhibit 3). Physicians were equally likely to recommend mammographic screening regardless of where they practiced. PSA screening, on the other hand, was more likely to be recommended by physicians in high- compared with low-spending regions for all men age forty and older. The disparity was largest for men age eighty and older. Few physicians recommended routine spiral CT screening for lung cancer, but those in high-spending regions were more likely than others to recommend such screening (for smokers ages 40–79).
When seeing a seventy-five-year-old woman with typical symptoms of gastroesophageal reflux disease (GERD) (Exhibit 4), similar proportions of physicians in high- and low-spending areas would order a number of interventions, from Helicobacter pylori testing to prescription of a proton pump inhibitor (PPI). However, physicians in high-spending areas were more likely than others to refer the patient directly for upper gastrointestinal (GI) endoscopy and much more likely to refer the patient to a gastroenterologist for further management.
For a seventy-five-year-old man with the new onset of chest pressure occurring upon heavy exertion (Exhibit 4), ordering patterns were similar across different spending levels for some interventions (stress testing and curbside consultation with a cardiologist). However, physicians in higher-spending areas were more likely than others to order an echocardiogram, refer to a cardiologist, and admit the patient to the hospital.
For an eighty-five-year-old man with an exacerbation of end-stage (Class IV) congestive heart failure (CHF), local spending level was also predictive of the aggressiveness of a physician’s approach (Exhibit 4). Physicians in high-spending areas were more likely than others to admit the patient to an acute medicine floor and much more likely to admit the patient to an intensive care unit. They were less likely to discuss palliative care with the patient.
Summary practice intensity, derived from factor analysis that included all clinical decision-making variables, was significantly correlated with local health care spending at the individual physician level (r = 0.22, p < 0.001).15 When other factors at the physician and practice level that might be expected to influence practice intensity were controlled for, the correlation between local spending and physician intensity persisted (r = 0.15, p = 0.001).
When physicians were grouped into deciles of HRR-level spending (Exhibit 5), there was an extremely strong association between local spending and (factor-derived) summary intensity score (r = 0.94, p < 0.001).
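A correlation computed on group means is typically much larger than the same correlation computed on individuals, because averaging within groups removes physician-to-physician variation that is unrelated to spending. The sketch below illustrates this general property with deterministic, hypothetical data; it does not reproduce the study's data or its exact grouping procedure, and the function names are our own.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def group_means(xs, ys, groups=10):
    """Sort pairs by x, split into equal-sized groups, and return group means."""
    pairs = sorted(zip(xs, ys))
    size = len(pairs) // groups
    mean_x, mean_y = [], []
    for g in range(groups):
        end = (g + 1) * size if g < groups - 1 else len(pairs)
        chunk = pairs[g * size:end]
        mean_x.append(sum(p[0] for p in chunk) / len(chunk))
        mean_y.append(sum(p[1] for p in chunk) / len(chunk))
    return mean_x, mean_y


# Hypothetical physicians: a weak spending effect plus large individual
# variation that alternates in sign and so cancels within each decile.
spending = list(range(100))
intensity = [0.1 * s + (10 if s % 2 == 0 else -10) for s in spending]

r_individual = pearson_r(spending, intensity)
r_grouped = pearson_r(*group_means(spending, intensity, groups=10))
print(f"individual-level r = {r_individual:.2f}")  # roughly 0.26
print(f"decile-level r = {r_grouped:.2f}")         # close to 1.00
```

The decile-level figure therefore describes how closely average intensity tracks regional spending, not how well spending predicts any one physician's choices.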
We found that widely varying levels of health care spending across the United States are strongly correlated with the tendency of local physicians to recommend discretionary interventions. Primary care physicians in high-spending regions reported seeing patients back more frequently, recommending more screening tests of uncertain benefit, and opting for more-resource-intensive interventions than those practicing in low-spending regions. Compared with physicians practicing in the lowest quintile of spending, those in the highest quintile would recommend an additional eighty hypertension follow-up visits per year, fourteen spiral CT scans, twenty-five echocardiograms, twenty-four cardiac care unit admissions, and twenty-nine gastroenterology referrals (per 100 patients in each clinical category). In contrast, physicians in high- and low-spending regions are equally likely to recommend guideline-supported interventions.
Our study has several limitations. First, because we measured our exposure (local spending) using data on Medicare patients age sixty-five and older, some might be concerned that it does not represent overall U.S. health care spending (including on younger people). However, the correlation between practice patterns for U.S. populations under and over age sixty-five has been shown to be high.16 Similarly, state-level Medicare spending is closely correlated with overall per capita spending.17
Second, because we used clinical vignette responses to measure physicians’ tendency to intervene, we cannot be certain that our outcome measure (practice intensity) accurately reflects physicians’ ordering habits for actual patients. Comparing practice intensity based on physicians’ actual practice, however, would introduce greater difficulties—because individual patients, and panels of patients, differ from provider to provider. Because clinical vignettes allow for each provider to manage the same patient, they provide a measure of practice intensity that is inherently case-mix-adjusted. For the present study, the survey methodology was revised considerably compared with previous work, to enrich clinical detail and provide a range of possible intervention options for each vignette that closely approximates the choices available in clinical practice.18 Hypothetical patient scenarios are used for other important provider comparisons (such as board-certifying examinations) and have been established as a good measure of providers’ behavior with standardized patients.19
Finally, this paper does not address how individual physicians’ attributes affect each of the outcome variables, or the effect of adjusting for physician-level variables on the relationship between spending and each individual measure of physician practice intensity. Addressing these questions is beyond the scope of the present work. Overall practice intensity, however, remained strongly correlated with health care spending even after all covariates were adjusted for.
This study expands upon earlier work suggesting that physicians in different geographic regions practice differently. Physician decision making about breast cancer surgery; cardiac catheterization; and hospitalizing patients with asthma, diabetes, and CHF varies markedly across regions.20 The tendency to intervene shows even more dramatic variation in international comparisons.21 Attempts to correlate local practice patterns with area-level variables such as physician supply, specialist supply, and hospitalization rates have produced mixed results.22 Recent work showed that physicians’ tendency to intervene was strongly associated with health care spending in a physician’s region but did not attempt to distinguish discretionary from guideline-directed interventions and was limited to six simplified scenarios.23
Our findings reveal that responses to a few clinical questions posed to primary care physicians are strongly correlated with the level of local spending in the Medicare population. On the face of it, this might not seem surprising. If utilization is higher in certain areas, many would assume that physician behavior is the cause. There are, however, other potential explanations. Higher levels of patient illness or demand could explain higher utilization. Or the fact that high-spending areas have many more physicians per capita—each of whom might act exactly the same regardless of where they practice—could explain spending differences. We have shown, however, that when physicians in different areas see the “same” patients, they make different decisions—decisions that are strongly correlated with the local practice environment (such as spending). Furthermore, physicians in regions of differing spending appear to differ only in their discretionary decision making. For the decisions we examined that are informed by evidence or practice guidelines (such as screening mammography and standard exercise tolerance testing), physicians were equally likely to recommend interventions regardless of local spending levels.24
Although we have shown that physicians who practice in areas of higher local health care spending are more prone to intervene in discretionary situations, we are unable to distinguish between a number of factors that could contribute to the development of these differing practice patterns. On the one hand, more-aggressive physicians might selectively choose to practice in certain areas. On the other, physicians might adapt to the practice style of the community in which they settle—a phenomenon that could have a number of explanations. First, the malpractice climate in high-spending (versus low-spending) regions might exert stronger effects on physicians’ practice of defensive medicine, although other work suggests that this might be a small effect.25 Second, differences in patients’ expectations and demands across different regions might directly influence physicians’ decision making (although in data not presented, there were no differences across regions in the proportion of physicians who cite patient pressures as an impetus to intervene). Third, it is possible, indeed likely, that physicians in the U.S. fee-for-service environment adapt their practice style to maintain their incomes in response to local market forces—including the local supply of services and physicians.26 In higher-spending regions, characterized by a much greater supply of physicians per capita, physicians may (consciously or unconsciously) shorten their revisit intervals to keep their schedules full—and vice versa in lower-spending areas—and might modify their referral patterns based on the availability of specialists.27 Hospitals and medical practices in high-spending regions might also feel more competitive stresses and be more likely to exert pressure on physicians to order profitable services (such as high-tech diagnostic imaging) for their patients; conversely, lower availability in low-spending regions could dissuade physicians from ordering such services. 
Finally, it is very plausible that some or all of these factors could interact to create a culture of high-intensity specialty-oriented (or low-intensity primary care–oriented) practice, which then becomes the local “standard of care” and may thus be difficult for either patients or providers to resist.
We know that higher-spending regions do not achieve better health outcomes and that physicians practicing in these regions perceive greater difficulty providing high-quality care. Current policy efforts to improve the quality of care and address disparities in spending have focused largely on fostering adherence to clinical guidelines. This study suggests that greater attention to clinical judgment—and to the local factors that are likely to influence physician practice—will be required.
Brenda Sirovich was supported by a Veterans Affairs Career Development Award in Health Services Research and Development at the time this work was initiated. This study was supported by the National Institute on Aging (Grant no. P01 AG19783) and the National Heart, Lung, and Blood Institute (Grant no. R01HL080437). Financial support was also provided by Research Enhancement Award no. 03-098 from the Department of Veterans Affairs and a grant from the Robert Wood Johnson Foundation. Sirovich had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. The funders of this work had no role in the design or conduct of the study; in the collection, analysis, and interpretation of the data; or in the preparation, review, or approval of the manuscript. None of the authors has a financial conflict of interest related to the paper. The views expressed herein do not necessarily represent the views of the Department of Veterans Affairs or the United States government.
Clinical judgment, not clinical guidelines, should be the focus of policy efforts to improve the quality of care and address disparities in spending.
An abstract based on this manuscript was presented at the 2005 annual meeting of the Society of General Internal Medicine in New Orleans, Louisiana, 11–14 May.
Brenda Sirovich (Email: Brenda.Sirovich/at/dartmouth.edu) is a research associate in the Outcomes Group, Veterans Affairs Medical Center, in White River Junction, Vermont, and an assistant professor of medicine at Dartmouth Medical School, in Hanover, New Hampshire.
Patricia M. Gallagher is a senior research fellow at the Center for Survey Research, University of Massachusetts Boston.
David E. Wennberg is president and chief operating officer of Health Dialog Analytic Solutions, in Portland, Maine.
Elliott S. Fisher is director of the Center for Health Policy Research, Dartmouth Institute for Health Policy and Clinical Practice, in Hanover, New Hampshire.