Although countries around the world are grappling with the problem of rising health expenditures, the United States has reason for particular concern. Americans are dissatisfied with their healthcare system (Schoen et al., 2007) but also spend more than the citizens of other nations: 15 percent of GDP on health care in 2006, compared to 11 percent in France and Germany, 10 percent in Canada, and 8 percent in the United Kingdom and Japan (OECD, 2008).
There is no question that the United States spends the most, but some observers view this money as well spent and forecast that future healthcare expenditures could optimally account for nearly one-third of GDP (Hall and Jones, 2007). Improvements in cardiovascular health and in the survival of premature infants in the United States have been estimated to be worth their high expenditures (for example, Cutler, 2004; Murphy and Topel, 2006). But the efficiency cost of the U.S. health system has also been estimated at 20–30 percent of healthcare spending, or 3–5 percent of GDP (Fisher et al., 2003a, b; Skinner, Fisher, and Wennberg, 2005), and according to some studies, avoidable deaths and medical errors are much more common in the United States than in European countries (Schoen et al., 2007; Nolte and McKee, 2008).
In this paper, we address two distinct questions about the efficiency of U.S. healthcare expenditures. First, does U.S. health care display inferior productive efficiency—that is, given a bundle of factor inputs like physicians, nurses, hospital beds, and capital, is the aggregate impact of health care in the United States less than in other countries? This question is surprisingly difficult to answer. Cross-country comparisons of expenditures and health outcomes are common but are also of limited value because of our inability to control adequately for underlying health differences across countries—for example, that Americans are more likely to have diabetes or to be obese compared to the English (Banks, Marmot, Oldfield, and Smith, 2006). Micro-level analyses of specific treatments for comparable patients across countries are free of some of the defects of more aggregated comparisons, and they suggest that while nearly all countries fall well short of ideal on measures of productive efficiency, the United States healthcare system sometimes (but not always) lags behind. Common explanations have included fragmentation of care (as Cebul, Rebitzer, Taylor, and Votruba argue in this issue), higher administrative costs, and patterns of care that vary inappropriately with race, geography, and financial barriers.
Second, is U.S. healthcare spending allocatively efficient compared to other countries? That is, do health benefits from the marginal dollar spent on health care consistently exceed the opportunity cost of other goods that might be provided instead, such as raising teachers’ salaries, improving insurance coverage for Iraq war veterans, or even upgrading to a BMW 5 Series? Some degree of allocative inefficiency is inevitable in any healthcare system because insurance for medical care causes overutilization due to moral hazard (Pauly, 1968). But both the very high level and the rate of growth of U.S. health spending suggest that it experiences a unique degree of allocative inefficiency, even when compared to other high-income countries. The fundamental cause is a combination of high prices for inputs, poorly restrained incentives for overutilization, and a tendency to adopt expensive medical innovations rapidly, even when evidence of effectiveness is weak or absent. As we argue below, the distinction between allocative and productive efficiency can make it easier to understand the consequences of different healthcare reforms, which often address one type of inefficiency but have limited or unintentional effects on the other.
Rising health expenditures, whether expressed as a share of GDP or on a per capita basis, are not unique to the United States, as Figure 1 shows. Countries around the world are grappling with the question of whether they are spending too much—or not enough—on health care and whether their citizens are receiving benefits commensurate with the increased budgetary burdens. However, no country has experienced either a level or rate of growth of health expenditures as large as in the United States.
How can the ideas of productive and allocative efficiency organize our thinking about health expenditures in the United States? Figure 2a attempts to capture these ideas in the context of a hypothetical production function for a healthcare system. In this stylized, simplified scheme, all inputs are grouped together and measured with a common metric on the horizontal axis. The vertical axis measures the outcome that the healthcare system is designed to produce or to which it contributes. Which output to measure is a crucial question whose answer is not always obvious. One approach measures output in terms of units of health services and hence is described as unit service productivity. For example, suppose that we want to compare resource utilization at two hospitals that each deliver hundreds of babies a year by Caesarean section. We can compare the hospitals by measuring blood tests performed, medications used, number of nursing hours, number of physician hours, imaging studies, and use of other forms of capital such as delivery rooms and operating rooms. This approach may be useful to hospitals that wish to know whether they can perform procedures as efficiently as other hospitals, but it does not measure actual health benefits: for example, whether higher rates of Caesarean sections (per 100 births) lead to improved health outcomes for mother and child. Consequently, we focus on outcome productivity, in which outcomes are typically measured by survival or other health-related measures.1
The PF* line in Figure 2a is the production “frontier,” or most efficient clinical care; it plots the cumulative health outcomes of (say) 10,000 representative patients given a specific level of optimally allocated inputs. Points A, E, C, and B exhibit productive efficiency (no waste). Point D lies below the production function PF* and is therefore deemed productively inefficient.
Whether point A or point B is allocatively efficient depends on the marginal rate of substitution between medical and nonmedical goods. Spending beyond point A to point B may be productively (or technically) efficient, but not allocatively efficient, if at the margin the same expenditure on nonmedical goods would lead to greater welfare gains. Conversely, movement from point B to point A would not be allocatively efficient if the marginal welfare gains from health expenditures exceeded the gains from spending on other goods and services. Indeed, Hall and Jones (2007) have argued that the United States should be devoting an increasing fraction of its income to health care because higher income increases the marginal value of saving a life while diminishing the marginal value of yet another car or a still larger flat screen TV.
Suppose that within a country, one group locates at point A on Figure 2a, while another group locates at point C. The group at point C may have lower income and thus a higher marginal utility of nonmedical goods, or it may simply place a lower value on health care. In this setting, the average survival rate for the combined population would be on a chord between these two points, for example point D. This choice has lower apparent productive efficiency than what could be realized in an egalitarian healthcare system (point E). In this case, the fundamental cause of the attenuated “production function” of health is heterogeneous demand—which could be the consequence of differences in preferences or income, rather than a reflection of allocative (or productive) inefficiency. Such heterogeneity would be expected in the United States, if only because healthcare financing and insurance coverage are more diverse than in other wealthy nations (Davis, 2007). Inefficient heterogeneity may also hold in the presence of racial or ethnic disparities (Smedley, Stith, and Nelson, 2003), or regional difference in healthcare spending, for example the 20-fold differences across similar American regions in rates of spine surgery among the elderly, which are unlikely to be explained by demand (Dartmouth Atlas of Health Care, 2006).
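The chord argument can be illustrated with a minimal numerical sketch. The square-root production function and the two input levels below are hypothetical choices made only to show the geometry; they are not estimates from this paper or from Figure 2a.

```python
import math


def frontier(inputs: float) -> float:
    """Hypothetical concave production function (an assumption, not PF* itself)."""
    return math.sqrt(inputs)


# Hypothetical per capita input levels for the two groups (illustrative only).
inputs_c, inputs_a = 1.0, 9.0

# Point D: the population average when half the population sits at C and half at A.
# It lies at the midpoint of the chord joining the two points.
mean_inputs = (inputs_c + inputs_a) / 2
outcome_d = (frontier(inputs_c) + frontier(inputs_a)) / 2

# Point E: the outcome if everyone received the mean level of inputs.
outcome_e = frontier(mean_inputs)

# With a concave function, the averaged outcome falls short of the frontier.
shortfall = outcome_e - outcome_d
print(f"D = {outcome_d:.3f}, E = {outcome_e:.3f}, shortfall = {shortfall:.3f}")
```

Any strictly concave function would give the same qualitative result; the size of the shortfall depends on the curvature assumed and on how far apart the two groups are.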
In practice, as we show in the next section, no country appears to have attained productive efficiency in health care. There are sins of omission—one recent U.S. study suggested just half of recommended care is provided in a typical primary care visit (McGlynn et al., 2003)—as well as sins of commission—the spinal fusion surgery that provides marginal relief and more complications compared to conservative management (Rivero-Arias et al., 2005). Thus Figure 2a also shows a country-specific production function, PF(1), that is everywhere below the frontier; PF(1) shows the hypothetical aggregate health outcomes of the population in the specific country as per capita factor inputs are varied.
Nearly every critic of the U.S. healthcare system points out that for many aggregate health measures, the United States does no better than other countries like the United Kingdom, which spends less than half of the U.S. level on health care. This suggests that a large fraction of U.S. spending is devoted to “flat of the curve” treatments, as shown in Figure 2b by the dotted line connecting point A and point B. This pattern of expenditures might be observed if the two countries resided on the same production function, one that includes the segment AB, but the United States spent much more than the other nation. But it is difficult to reconcile consumption of health care on the flat part of the production function with any notion of efficiency, since even wealthy regions (and their doctors) would not want to waste so much money on care yielding zero marginal benefit.
Another way to interpret the cross-country variation is that the United States is on an entirely different and lower production function, PF(2) in Figure 2b, while countries like the United Kingdom exhibit greater productive efficiency, on PF(1). This interpretation is also consistent with the otherwise puzzling result that within the United States, high-spending regions appear to experience worse quality of care (Baicker and Chandra, 2004)—in other words, the marginal return to spending is positive in both regions, but the higher-cost region lies on the lower productivity curve (Skinner and Staiger, 2008; Chandra and Staiger, 2007).
Distinguishing between “flat of the curve” and the “differences in production function” views can have practical importance. For example, consider the hypothetical policy reform of shifting U.S. expenditures back to (price-adjusted) equality with the United Kingdom or some other country. If the two countries were indeed on the same production function, but the U.S. economy were on the flat of the curve with little marginal gain in health at its current level of healthcare spending, then U.S. health outcomes might deteriorate relatively little when expenditures are cut back to the U.K. levels. This would result in substantial cost-saving and an increase in average productivity. But if the U.S. healthcare system lies on a different production function, and cutbacks were not combined with improvements in productive efficiency, health outcomes could worsen considerably. Thus the question of whether the United States is more or less efficient in producing health is ultimately about two issues: whether the U.S. production function is above or below that for other countries and whether the United States also experiences greater or lesser allocative efficiency when compared to other countries. We take up these questions next.
Estimating aggregate production functions across countries is difficult, because observed health outcomes vary with behavioral, genetic, and other factors unrelated to the healthcare system, and these tend to shift the entire production function in the same way as a productivity improvement. Production functions are not well identified: we typically observe just one point on each country’s production function at any time (Baily and Garber, 1997). Unless one nation either uses the same inputs for greater output or achieves the same output with fewer inputs, it may not be possible to infer which is more productively efficient, even if the production function for one nation is everywhere interior to that of the other. Note that average productivity, or the ratio of output to input, can easily be greater in the country that has lesser productive efficiency, as measured by the production function. In addition, if one nation is on the frontier production function and its greater wealth or preferences for health cause it to select a point on the production function corresponding to greater health output (point B on PF* in Figure 2a), it can experience a lower marginal and average productivity of health care than another nation that is not on the frontier production function (for instance, a nation at point F in Figure 2a).
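A small sketch, under purely hypothetical functional forms, shows how average productivity can be higher in the country with the lower production function. Neither the functions nor the input levels below come from the paper or its figures.

```python
import math


def pf_efficient(inputs: float) -> float:
    """Hypothetical higher production function (illustrative assumption)."""
    return 2.0 * math.sqrt(inputs)


def pf_inefficient(inputs: float) -> float:
    """Hypothetical lower production function, below pf_efficient at every input level."""
    return 1.5 * math.sqrt(inputs)


# Country 1 is on the higher function but chooses a high level of inputs.
inputs_1 = 4.0
avg_productivity_1 = pf_efficient(inputs_1) / inputs_1

# Country 2 is on the lower function but chooses a low level of inputs.
inputs_2 = 1.0
avg_productivity_2 = pf_inefficient(inputs_2) / inputs_2

print(f"Country 1 (efficient, high spending): {avg_productivity_1:.2f}")
print(f"Country 2 (inefficient, low spending): {avg_productivity_2:.2f}")
```

Because each country contributes only one observed point, the ratio of output to input alone cannot reveal which country lies on the higher function.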
Table 1 presents relevant measures of health and health care for seven countries: the United States, Canada, France, Germany, the Netherlands, the United Kingdom, and Japan, with all data for 2005 unless otherwise noted. Clearly, health status differs across countries. Obesity rates range from 3 percent in Japan to 32 percent in the United States. The United States and Canada, at 17 percent each, have the lowest adult smoking rates of the seven countries. Japan’s rate is the highest, at 26 percent. These differences in health are reinforced by more sophisticated studies measuring clinical markers of poor health. For example, Banks, Marmot, Oldfield, and Smith (2006) found higher rates of diabetes among high-income Americans than among the low-income English. The evidence from smoking notwithstanding, health burdens generally seem to be greater in the United States, which would tend to push the observed U.S. production function for health to a lower level.
Productive efficiency is difficult to measure, but we consider four proxies for the broader delivery of cost-effective health care in Table 1. The first measures shortfalls in the use of a highly cost-effective treatment, immunization for influenza among people over age 65. Estimates of the percentage receiving the vaccine range from 43 percent in Japan and 48 percent in Germany, to over 70 percent in the United Kingdom and the Netherlands. By this measure, the United States is at the median with 65 percent of the elderly population receiving this vaccine (Cylus and Anderson, 2007). With respect to the diffusion of information technology, however, the United States lags behind most other developed countries; 98 percent of primary care physicians in the Netherlands and 89 percent in the United Kingdom use electronic health records, compared to just 28 percent in the United States and 23 percent in Canada (Cylus and Anderson, 2007).
On one dimension of productive inefficiency, the U.S. healthcare system appears to stand out: heterogeneity in access to care, leading to unequal marginal benefits per dollar spent across patients, and the consequent erosion of the aggregate production function (as in Figure 2a). As we noted before, this heterogeneity could be consistent with allocative efficiency for different groups, but when viewed through the lens of health outcomes produced per dollar spent over the population, it will appear as productive inefficiency. The percentage of chronically ill patients who reported they eschewed doctor or nurse visits, failed to adhere to recommended treatments, or did not take full medication doses because of costs ranged from 42 percent in the United States to just 5 percent in the Netherlands (as shown in Table 1).2
The U.S. healthcare system also exhibits considerable regional variation in per capita Medicare expenditures, which ranged in 2005 (adjusted for age, sex, and race) from $5,600 in Salem, Oregon, to $14,600 in Miami (Dartmouth Atlas, 2008). This variation does not appear to be the result of variation in patient preferences by region (Barnato et al., 2007). Similar variations also arise in the use of highly effective low-cost care. For example, the use of β blockers for heart attacks, a treatment that can reduce mortality by 25 percent but costs pennies per day, varied from just 5 percent of patients in McAllen, Texas, to over 80 percent in Rochester, New York, during the mid-1990s (Birkmeyer and Wennberg, 2000). The Congressional Budget Office (2008a) reported that these regional variations are more pronounced in the United States than in other countries.
Another approach focuses on how countries on average treat specific health conditions. In a multi-year study during the early 1990s, the McKinsey Global Institute attempted to measure capital and labor costs in comparable units and to assess variation in both total costs and health outcomes across three countries (the United States, the United Kingdom, and Germany) for gallstone disease (cholelithiasis), breast cancer, lung cancer, and, in the United Kingdom and the United States, diabetes (Baily and Garber, 1997). The United Kingdom was the most parsimonious in its use of resources for the management of each condition. However, Germany, not the United States, used the most resources in the three conditions in which it was included.
In the treatment of lung cancer, U.S. patients experienced better outcomes than those in Germany and far better than for patients in the United Kingdom. For breast cancer, outcomes were slightly better in the United States, while for gallstone removal, the United Kingdom had worse outcomes than did the United States or Germany. Germany in turn had slightly better outcomes than the United States but much greater resource use. Diabetes was the only one of the diseases studied in which another country unambiguously dominated the United States—the United Kingdom had both better outcomes and lower costs for diabetes than the United States.
One cannot draw sweeping conclusions from an analysis of such a small subset of health conditions, but at a minimum, no one country dominated the others in terms of either productive or allocative efficiency. Indeed, while the United Kingdom was most parsimonious, it did not generally exhibit greater average productive efficiency. Because the United Kingdom sharply restricted the number of CT (computed tomography) scanners, lung cancer patients in the 1990s were less likely to have a CT scan before undergoing surgery, making it more likely that English patients with inoperable lung cancer would receive inappropriate surgery (Baily and Garber, 1997).3
The U.S. healthcare system also spends more on administrative or overhead costs related to health care. One study has estimated administrative costs to comprise 31 percent of healthcare spending in the United States compared to 16 percent in Canada (Woolhandler, Campbell, and Himmelstein, 2003), leading some to infer that administrative waste could be reduced drastically by a single-payer health insurance system and that the savings could be used to finance universal coverage in the United States. Presumably, much of the savings would come from reductions in the net revenue of private health insurance firms. But other estimates suggest such potential savings are modest relative to total expenditures. According to OECD data, expenditures for administration by private insurers and central and local authorities were $465 per capita in the United States, compared to $265 in France, $131 in Canada, and $52 in Japan (Peterson and Burton, 2007).
This measure of administrative cost may be too restrictive, as it does not reflect the internal administrative costs of hospitals and physician groups. The cost of organizing a complex (and fragmented) healthcare system is substantial; U.S. administrative costs in legal firms are 24 percent, not far below those in health care (Glied, 1998; p. 39). Himmelstein, Lewontin, and Woolhandler (1996) suggest a major cause for higher administrative costs in the United States is the much larger share of nonclinical staff, whether managers or office staff who make appointments or call patients. But cross-country comparisons of healthcare administrative costs are especially suspect (Aaron, 2003), precisely because we know so little about what these nonclinical workers do. Indeed, some of the cost differential in the United States likely reflects expenditures for information technology, the reporting of patient outcomes for internal quality improvements, and other efforts intended to improve the quality of care. Finally, although the United States likely spends more on administrative activities than other wealthy nations, the growth in health expenditures cannot be readily attributed to growth in administrative costs.
Is there a systematic tendency for typical U.S. consumers of health care to consume “too much” or excessively costly care relative to alternative uses of resources? Measuring allocative efficiency is also difficult. A first challenge is to measure actual consumption of healthcare goods and services while holding prices constant, and to determine whether extra consumption (if observed) is justified by higher demand.
Table 1 provides six indirect measures of healthcare consumption. In terms of physicians per capita or hospital beds per capita, the United States ranks in the middle of the pack. The United States has 2.7 hospital beds per 1,000 people, compared to 2.3 in the United Kingdom, 6.4 in Germany, and 8.1 in Japan. The number of practicing physicians in the United States, at 2.4 per 1,000 population, is just higher than the number in the United Kingdom, 2.1, but below that in France, 3.4 (OECD, 2008). While a reliable quantity index of pharmaceutical consumption is elusive, a simplified measure, grams of active ingredients (in prescription drugs) per capita, is lower in the United States than in Canada (146, where 100 is the reference U.S. index) and in France (171), but higher in the United States than in Germany (85) and Japan (56) (Danzon and Furukawa, 2008).
Of course, these numbers are not direct measures of services delivered. The intensity of care per day of U.S. hospitalization is higher than in other nations,4 and the number of physicians per capita does not adjust for the level of training and quality. Furthermore, rates of specific treatments are often higher in the United States; for coronary procedures, which are typically provided on an inpatient basis, the United States performs 587 procedures per 100,000 people, compared to 357 in Germany and 154 in the United Kingdom (Peterson and Burton, 2007, p. 13). Nor is the United States the top nation on every measure of the amount or intensity of care; for example, Table 1 shows that the number of MRI machines per million people in the United States, at 26.5, exceeds the number in Germany (7.7) or the United Kingdom (5.6) but lags behind Japan which has 40.1 MRI scanners per million people. However, unlike other nations, the United States is consistently at or near the top of all of these measures.
The fifth and sixth allocative measures are waits for elective surgery of more than six months among those receiving such surgery, and whether patients felt the physician recommended treatments with little or no benefit. These measures, as expected, are strongly negatively correlated; the United Kingdom has both long waits for elective surgery (15 percent) and little reported overuse (10 percent) while the United States has short waits (4 percent) and much more overuse (20 percent).
Levels of utilization alone do not always inform us directly about allocative efficiency, which relates to the local slope of the production function, as shown earlier in Figure 2a. One way to place a lower bound on the marginal cost per life year is to consider the average change in costs relative to the average change in outcomes over time for a healthcare system. Figure 3 shows one hypothetical example in which spending and outcomes both change over time (from A to B), reflecting both a shift in the production function (technology) from 1988 to 2008 and a movement along the production function, perhaps because of a different curvature of the function or because rising income levels lead to more spending on health. Time-series comparisons yield the slope of the line from A to B, which, given a shift in the production function, will indicate a higher average return on factor inputs than the local or marginal cost-effectiveness ratio, as shown by the slope of the production function at point B, given by the line CC’ (Weinstein, 2005).
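The lower-bound logic can be made concrete with a hedged numerical sketch. The functional form, the technology parameters, and the spending levels are all hypothetical; they are chosen only so that the production function shifts up between the two dates while spending also rises, as in the stylized Figure 3.

```python
import math


def health(technology: float, spending: float) -> float:
    """Hypothetical production function: technology * sqrt(spending)."""
    return technology * math.sqrt(spending)


def marginal_product(technology: float, spending: float) -> float:
    """Derivative of the hypothetical function with respect to spending."""
    return technology / (2.0 * math.sqrt(spending))


# Point A (earlier year) and point B (later year); all values are illustrative.
tech_a, spend_a = 1.0, 4.0
tech_b, spend_b = 1.5, 9.0

# Time-series (average) cost per unit of health gained: the slope of line AB, inverted.
average_cost = (spend_b - spend_a) / (health(tech_b, spend_b) - health(tech_a, spend_a))

# Local (marginal) cost per unit of health at point B: the slope of CC', inverted.
marginal_cost = 1.0 / marginal_product(tech_b, spend_b)

print(f"average cost per unit of health, A to B: {average_cost:.2f}")
print(f"marginal cost per unit of health at B:   {marginal_cost:.2f}")
```

In this example the time-series ratio is half the marginal ratio, because part of the observed gain in health comes from the upward shift of the function and not from the extra spending.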
Considerable evidence suggests that the shift in the production function over the past century has yielded great benefits. U.S. life expectancy rose from 47.3 years at birth in 1900 to 77.8 in 2004; Nordhaus (2003) estimated that the growth in life expectancy has provided as much in value to Americans as the corresponding increase in consumption. Similarly, Murphy and Topel (2006) placed a value of $95 trillion on the improved life expectancy between 1970 and 2000, which was roughly three times medical spending during this period. Health has improved over time for many reasons. Early in the twentieth century, changes in living conditions, sanitation, and behavioral factors like nutrition, exercise, and smoking cessation were far more important than medical care in explaining public health improvements (Fuchs, 1974; Cutler, Deaton, and Lleras-Muney, 2006). But in the last few decades, reductions in cardiovascular disease accounted for 70 percent of the gains in survival (Cutler, Rosen, and Vijan, 2006). An examination of cardiovascular disease is thus a useful point of departure to assess the relative contribution of behavioral changes, low-tech medical technology, and high-tech medical technology to recent gains in life expectancy.
Ford et al. (2007) accounted for the factors that led to a decline of 340,000 annual cardiovascular deaths in the United States between 1980 and 2000. Health behaviors that are not directly associated with health care, such as reductions in cholesterol and high blood pressure among untreated individuals, accounted for 61 percent of the decline, albeit with 17 percent (59,000 deaths) clawed back by the rising rates of diabetes and obesity. Twenty percent of the decline in mortality was the consequence of off-patent and inexpensive drugs (aspirin, β blockers, anti-hypertensives) whose costs are measured in pennies. An additional 13 percent of the improvement was the consequence of “medium-tech” and more expensive drugs like ACE inhibitors and thrombolytics. Finally, “hi-tech” medical interventions such as cardiac bypass surgery, angioplasty, and stents accounted for just 7 percent of the overall decline in cardiovascular mortality.
Thus, the recent historical gains in health outcomes may be more closely related to the influence of 1970s exercise guru Richard Simmons than to the diffusion of open-heart surgery. In addition, the remarkable productivity gains in cardiovascular treatments have not been replicated in other diseases. Cutler (this issue) reports the more modest improvements in cancer mortality were generated by low-cost early screening, rather than more expensive end-stage treatments where success is measured in weeks of life extended.
The Cutler, Rosen, and Vijan (2006) study attributed one-half of the improvement in health outcomes to medical expenditures, arguing that this would be sufficient to compensate for the biases noted above. They found that, during the 1960 to 2000 period, the cost-effectiveness ratio was a highly favorable $19,900 per extra life year for newborns, considerably lower than either the commonly used $50,000 per quality-adjusted life year threshold or the approximately twice annual income threshold derived from a constant absolute risk aversion utility function (Garber and Phelps, 1997). However, even these estimates may overstate the return to expenditures on medical care. While Cutler, Rosen, and Vijan (2006) discount future expenditures, they do not discount future life years. The authors argue that by not discounting they avoid having to value the current life-year of a 40-year-old mother differently from the 40th year of her child. But this failure to discount outcomes leads to the Keeler–Cretin paradox (Keeler and Cretin, 1983): if one treats all life years as equally valuable, regardless of whose life-year is in question and when the life-year is saved, and so does not discount life-years, no money should be spent on health care in the present, because health expenditures should be delayed infinitely far into the future; the longer one waits (and accumulates interest) until spending the money, the more life-years can be saved. Thus standard practice discounts life-years and costs at the same rate.5
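The logic of the paradox can be sketched numerically. The budget, the interest rate, and the constant cost per life-year below are hypothetical values chosen for illustration, and the assumption that the cost of saving a life-year stays fixed over time is a simplification introduced here.

```python
def life_years_bought(budget: float, cost_per_life_year: float,
                      rate: float, delay_years: int) -> float:
    """Life-years purchased if the budget first earns interest for delay_years."""
    return budget * (1.0 + rate) ** delay_years / cost_per_life_year


def discounted_life_years(budget: float, cost_per_life_year: float,
                          rate: float, delay_years: int) -> float:
    """Same purchase, with future life-years discounted at the same rate as costs."""
    undiscounted = life_years_bought(budget, cost_per_life_year, rate, delay_years)
    return undiscounted / (1.0 + rate) ** delay_years


# Hypothetical values (assumptions, not figures from the studies cited).
budget, cost, rate = 1_000_000.0, 50_000.0, 0.03
delays = [0, 10, 50]

undiscounted = [life_years_bought(budget, cost, rate, t) for t in delays]
discounted = [discounted_life_years(budget, cost, rate, t) for t in delays]

for t, u, d in zip(delays, undiscounted, discounted):
    print(f"delay {t:>2} years: undiscounted {u:6.1f}, discounted {d:6.1f}")
```

Without discounting of life-years, every additional year of delay buys more of them, so spending is postponed indefinitely. Discounting life-years and costs at the same rate makes the timing of spending irrelevant in this simple setup, which is why the two are conventionally treated alike.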
Table 2 shows the recalculated measures of the cost effectiveness ratio (the slope of the line AB in Figure 3) for the 1960s through the 1990s for a representative individual age 45.6 When both life-years and expenditures are discounted, the average cost-effectiveness ratio for a life saved by health care (again, assuming that half of life-expectancy gains arise from health care) rises from $64,000 during the 1970s to $159,000 in the 1980s and $247,000 in the 1990s. Because these measure average returns, they are lower bounds on the local or marginal cost-effectiveness ratio that would allow us to judge whether health care is allocatively inefficient or not. Given the importance of low-cost medical treatments in explaining overall cardiovascular mortality declines, one would certainly expect that the marginal cost-effectiveness ratio exceeded one-quarter of a million dollars.
But perhaps other countries have exhibited similar (or worse) degrees of allocative inefficiency. In other words, we might want to ask a different question: have the incremental dollars spent in the United States—in excess of what the United Kingdom or France has been spending—generated commensurate benefits? Comparing changes over time in the United States with changes over time in other countries avoids many of the pitfalls of traditional cross-country comparisons.
Figure 4a shows spending for the United States and a selection of high-income countries: Japan, Canada, the United Kingdom, France, Germany, and Switzerland. In the discussion that follows the average for this group of peer countries is unweighted, so Switzerland counts as much as Germany, but the population-weighted averages (and the data from individual countries) yield a similar pattern. In 1970, the United States spent 40 percent more on health care than the average of the peer countries, and since then the gap has widened, to 90 percent by 2004. In contrast, life expectancy, shown in Figure 4b, has improved at a slower rate in the United States, from 99 percent of the average life-expectancy for the European comparison group in 1970 to 97 percent in 2004. These results are not sensitive to the age at which life expectancy is estimated; for example, the results are similar for people over age 65, a group nearly universally covered by Medicare. Indeed, between 1970 and 2003, every country in the comparison group achieved larger increases in life expectancy at age 65 for both women and men, with the exception of Canada, whose 65 year-old men experienced the same 3.7 year increase in life expectancy as their American counterparts.7
Similar results were found when looking just at mortality deemed “amenable” to medical care, such as bacterial infections, treatable cancers, and certain cardiovascular diseases, as shown near the bottom of Table 1 (Nolte and McKee, 2008). In this area as well, the European countries have experienced larger declines in mortality than the United States. Other countries, then, have shared the enormously valuable improvements in health that Americans have enjoyed in recent decades, and at much lower cost.
Of course, longevity gains are not the only benefits from innovation in health and medical care, and in some circumstances they are not the most important. For example, hip replacements and knee replacements enable people with degenerative joint disease to walk again and to maintain independence (Chang, Pellisier, and Hazen, 1996), while cataract surgery (Shapiro, Shapiro, and Wilcox, 2001) and effective treatments for depression (Berndt, Bir, Busch, Frank, and Normand, 2002) are highly cost-effective but do not affect survival. Less is known about trends in functional status across countries.
Why then are U.S. healthcare expenditures growing more rapidly? One common explanation is that malpractice concerns drive physicians and hospitals to practice costly “defensive” medicine. Kessler and McClellan (1996) found that states with tort reforms limiting malpractice awards experienced less growth in Medicare expenditures for beneficiaries with heart attacks. Similarly, Baicker, Fisher, and Chandra (2007) reported that expenditures for Medicare beneficiaries in states with larger malpractice awards were 5 percent higher. Although these studies demonstrate that malpractice litigation and defensive medicine impose costs, they also suggest that these costs account for a small fraction of total expenditures and are unlikely to be the major cause of the divergence between nations in expenditure growth.
Perhaps the most compelling explanation is the diffusion and adoption of new technology, which is to a great degree endogenous within a country’s economy and healthcare system (Weisbrod, 1991; Newhouse, 1992; Chandra and Skinner, 2008). Innovation and adoption are fueled by favorable reimbursement rates, particularly when there are few limits to the rapid diffusion of new treatments with unknown benefit. For example, ezetimibe, an expensive component of the controversial cholesterol-reducing drug Vytorin, had never been recommended as a first-line treatment, because of a lack of direct evidence that it was effective in reducing cardiovascular disease. Yet by 2006, ezetimibe accounted for 15 percent of U.S. cholesterol-lowering drug sales, compared with only 3 percent in Canada (Jackevicius, Tu, Ross, Ko, and Krumholz, 2008).
Nuclear particle accelerators, 222-ton machines costing more than $100 million each (Pollack, 2007), offer another example of what appears to be a uniquely American willingness to provide new technology with little consideration for expense. Although the accelerators arguably are highly effective in treating very rare brain, neck, or pediatric tumors, they are also used to treat far more common prostate cancers with little impact on outcomes compared to traditional radiation therapy (Pollack, 2007). The cost structure of this treatment seems ideally suited to rapid diffusion in the United States: high fixed cost of installation, relatively low marginal cost of operation, and reimbursement rates based on average rather than marginal cost. Other healthcare systems with central budgeting or quantity constraints are far less likely to experience rapid growth in these technologies.
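The incentive created by this cost structure can be illustrated with simple arithmetic. The sketch below assumes a reimbursement rate set equal to average cost at an expected patient volume; the $100 million fixed cost is the lower bound cited above, while the marginal cost and patient volume are hypothetical values chosen only for illustration.

```python
def average_cost(fixed_cost, marginal_cost, patients):
    """Average cost per patient: fixed cost spread over volume plus marginal cost."""
    return fixed_cost / patients + marginal_cost


def margin_on_extra_patient(fixed_cost, marginal_cost, patients):
    """Net revenue from one additional patient when the payment per patient
    equals average cost at the expected volume."""
    payment = average_cost(fixed_cost, marginal_cost, patients)
    return payment - marginal_cost


FIXED_COST = 100_000_000      # installation cost, from the text (lower bound)
MARGINAL_COST = 5_000         # hypothetical cost of treating one more patient
EXPECTED_PATIENTS = 10_000    # hypothetical lifetime patient volume

payment = average_cost(FIXED_COST, MARGINAL_COST, EXPECTED_PATIENTS)
margin = margin_on_extra_patient(FIXED_COST, MARGINAL_COST, EXPECTED_PATIENTS)
print(f"payment per patient (average cost): ${payment:,.0f}")
print(f"net revenue on each additional patient: ${margin:,.0f}")
```

Under these assumptions every patient treated beyond the expected volume returns the payment less the marginal cost, which is why average-cost reimbursement rewards expanding use to more common conditions once the machine is installed.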
The United States does tend to consume more health care on a per capita basis in comparison to other developed countries, but higher consumption of inputs alone does not explain why the United States spends twice as much on a per capita basis. Anderson, Reinhardt, Hussey, and Petrosyan (2003) emphasize higher prices as the cause of the expenditure differences. Hip replacements in the United States cost twice as much as in Canada for the identical procedure (Peterson and Burton, 2007; Angrisano, Farrell, Kocher, Laboissiere, and Parker, 2007). Often apparent price differences are confounded by differences in the products or services; Danzon and Furukawa (2008) have argued for the importance of product mix, noting that American patients receive newer vintage drugs with accompanying higher prices.8
Why are U.S. prices so high? One explanation is that U.S. physicians earn more than physicians in most other countries, as can be seen in the last row in Table 1. Among the countries considered, U.S. physicians lead with average earnings of $161,000, compared with average earnings of $107,000 for physicians in Canada, $118,000 in the United Kingdom, and $92,000 in France. Specialists are also generally paid more in the United States, although the Netherlands is an exception (Peterson and Burton, 2007). But the differences in reported salaries do not appear to explain entirely the dramatic difference in costs per procedure.
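A quick calculation with the earnings figures quoted above shows why salaries alone fall short as an explanation. The sketch compares each earnings ratio with the roughly twofold U.S.-Canada price difference for hip replacements cited earlier; using that single procedure as the benchmark is our simplifying assumption.

```python
# Average physician earnings in U.S. dollars, as quoted in the text.
earnings = {
    "United States": 161_000,
    "Canada": 107_000,
    "United Kingdom": 118_000,
    "France": 92_000,
}

# Benchmark: hip replacements cost about twice as much in the U.S. as in Canada.
PRICE_RATIO_BENCHMARK = 2.0

us_earnings = earnings["United States"]
ratios = {
    country: us_earnings / value
    for country, value in earnings.items()
    if country != "United States"
}

for country, ratio in ratios.items():
    print(f"U.S. earnings relative to {country}: {ratio:.2f}")

# True if every earnings ratio is smaller than the price benchmark.
salaries_fall_short = all(r < PRICE_RATIO_BENCHMARK for r in ratios.values())
print("all earnings ratios below price benchmark:", salaries_fall_short)
```

The earnings ratios range from about 1.4 to 1.75, all below the twofold price gap, which is consistent with the statement that salary differences do not entirely explain the difference in costs per procedure.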
The incentives embedded in physician payment mechanisms are also important determinants of overall utilization. Japan, for example, had the highest antibiotic consumption rates in the world, in part because many physicians earned money by dispensing as well as prescribing drugs. In the United States, many physicians earn additional compensation by ordering imaging studies such as magnetic resonance imaging (MRI) and computed tomography (CT) scans, and thus it is not surprising that these diagnostic tests have experienced roughly 10 percent annual growth in recent years (Iglehart, 2006). A McKinsey Global Institute study estimated that, despite legal restrictions on self-referral, U.S. health-care providers earned as much as $25 billion from profits on self-owned facilities providing laboratory, imaging, and other services (Angrisano, Farrell, Kocher, Laboissiere, and Parker, 2007, p. 51). But incentives cannot explain the variation we observe across countries in every clinical condition (Dor, Pauly, Eichleay, and Held, 2007).9
Note that higher prices per unit of services, or higher factor earnings, have no impact on efficiency beyond their influence in determining production or consumption. (We also ignore here how prices affect incentives for research and product innovation.) Nor is there evidence that more rapid growth in prices can explain any differences in the growth rates of healthcare spending between the U.S. and other countries.
The cross-country patterns of utilization, expenditures, and health outcomes can be better understood by returning to the two fundamental questions posed at the beginning of the paper. First, does the production function embodied in the U.S. healthcare system lie below that for other countries? That is, if the United States spent no more per capita on health care than Canada or France, would its health system deliver more or less in quality-adjusted health?
If we were to value improvements in health equally for those with high and low demand, the answer seems to be that productivity is indeed inferior in the United States. But insofar as Americans attach less importance to equality in health services than do the citizens of other wealthy nations, the marked heterogeneity in health care utilization by region, socioeconomic status, insurance coverage, race, and ethnicity could represent a choice to optimize for the individual rather than to maximize an egalitarian social welfare function. Arguably, if care were provided more uniformly for people with similar clinical characteristics, the production function for health care in the United States would more closely resemble that of other nations.
Greater administrative expenses are frequently blamed for lower health care productivity in the United States, but they can only have limited responsibility for the observed patterns of outcomes and expenditures. Even the largest estimates of administrative expenses are not sufficient to explain differences in spending between the U.S. and other countries, nor can they explain why U.S. expenditures are growing more rapidly than in other high-income countries. Although some policy changes might reduce administrative costs that do not provide any evident benefit—such as the high costs of processing insurance claims that do not adhere to a uniform format—we should not expect them to bring American health expenditures in line with those of other nations.
Many health policy reforms aim to improve productive or allocative efficiency or both. The main purpose of improvements in care based on adoption of electronic health records and other information technology, and of payment incentives designed to improve the quality of care, is to improve productive efficiency. Expanded adoption of highly effective, low-cost care is another approach to improving productive efficiency. One study suggested that 447,000 life-years could be saved over the next 20 years simply by following existing protocols for the use of low-cost β-blockers (Philips et al., 2000). The chief controversy about attempts to improve productive efficiency is whether they will work, not whether they should be pursued. But most reforms designed to improve productive efficiency are unlikely to reduce expenditures dramatically (for example, CBO, 2008b).
Despite evidence that some aspects of health insurance expansions could improve productive efficiency (Dor and Encinosa, 2004; Gaynor, Li, and Vogt, 2006; Chandra, Gruber, and McKnight, 2007), they are unlikely to reduce expenditures overall. Unless combined with aggressive measures to limit high-cost hospitals and regions as well as the growth of healthcare expenditures, coverage expansions merely extend to a larger population the features of public and private U.S. health insurance responsible for rapid expenditure growth.
Our second question concerns allocative efficiency—in other words, do benefits from the marginal healthcare dollar in the United States exceed their opportunity cost (the benefits if used for purposes other than health improvement)? What may seem surprising from our cross-country comparisons is that the United States is not always an outlier with respect to conventional measures of healthcare utilization. In part, this is explained by the lack of consistent measures across countries—a hospital day in the United States is far more resource-intensive than in France, and the extensive substitution of outpatient surgical care for inpatient surgery in the United States is not reflected in most comparative data (Angrisano, Farrell, Kocher, Laboissiere, and Parker, 2007). And although the United States may not be the largest consumer of MRIs or inpatient surgery, it consistently ranks near the top in these and similar categories. Furthermore, the U.S. healthcare system tends to offer the most expensive treatments, whether surgery for cardiac or vascular diseases, or recently developed biologicals.
Moral hazard is inherent in any system of subsidized medical care, so every nation that provides insurance or medical care is subject to potential overuse. What sets the U.S. healthcare system apart is a combination of incentives for the overuse of some services and underuse of others in a predominantly fee-for-service system, coupled with few supply-side constraints. A small physician group that owns a clinical laboratory can be paid more than marginal cost for each test it orders and performs, while cost-effective preventive care and office services often receive reimbursement below average and even marginal cost. Other nations also have fee-for-service reimbursement, but often supply constraints limit the overuse of some services, such as Canadian controls on capital equipment. In many nations, provider incentives for overutilization are attenuated, if not absent.
The dynamic effects of incentives for excess consumption may in turn be much greater than the static consequences. The net revenue that suppliers of medical products and services gain is a stimulus for investment in the development of new medical technologies. Unrestrictive eligibility rules and high reimbursement rates result in greater rewards and a diminished risk of failure for an investment in a new form of medical care.
The policies of both private and public insurers have traditionally offered a more welcoming and cost-unconscious approach to the provision of new healthcare technologies in the United States. Health insurance coverage is often extended to technologies with the potential to provide benefits, even if those benefits ultimately prove to be elusive, and without regard to their cost.10 Almost uniquely among wealthy nations, the United States typically does not consider effectiveness relative to its costs or to the costs of alternative treatments (Garber, 2004). Neumann (2005) attributes the unwillingness to consider costs in the United States to a combination of public indifference and political barriers. In England, the National Institute for Health and Clinical Excellence has rejected or sharply restricted coverage for expensive, high-profile drugs for some cancers and for Alzheimer’s disease, for example (Emanuel, Fuchs, and Garber, 2007), and other nations have decision-making bodies that limit the availability of forms of care that are not determined to be cost-effective.
These initiatives to improve allocative efficiency are more challenging politically and socially. Although managed care was intended to improve productive efficiency, public opposition arose from the perception that it restricted access to care and limited the choice of providers. High-deductible health insurance plans are designed to restrain expenditures by limiting moral hazard, but some evidence suggests that increased cost sharing (that is, the consumer pays a greater fraction of the healthcare bill) can have the paradoxical effect of reducing consumption of highly cost-effective products and services, such as treatments for hypertension, thereby worsening apparent productive efficiency. Because such plans have cost-sharing features similar to those of conventional insurance after the deductible is reached, they have no marginal impact on expenditures that exceed the deductible, the bulk of healthcare spending.
Any policy reform that would lead to a reduction in expenditures may be resisted strongly simply because the $2 trillion in annual U.S. expenditures for health care also represents $2 trillion in income for healthcare providers and others. But efficiency-enhancing reform may nevertheless be possible. Perhaps American consumers would choose less-expensive health insurance policies that eschewed expensive treatments deemed cost-ineffective, or that required patients to seek care only at low-cost high-quality integrated group practices (Fisher, Staiger, Bynum, and Gottlieb, 2007; Shortell and Casalino, 2008). Regulatory, legal, and political barriers may have to be overcome before such policies are offered. In addition, better information about treatment options for conditions such as breast cancer and back pain has been shown in some cases to lower utilization, and could lead to Pareto superior outcomes—better health outcomes at lower cost (O’Connor, Lewellyn-Thomas, and Flood, 2004).
Perhaps the greatest hope for improving both allocative and productive efficiency will come from efforts to accurately measure and reward outcome productivity (improving health outcomes using cost-effective management of diseases) rather than rewarding on the basis of unit service productivity for profitable stents, caesarean sections, and diagnostic imaging regardless of their impact on health outcomes. Such a change in emphasis will require rethinking what we pay physicians and hospitals for and, most importantly, how to measure and pay for outcomes rather than inputs.
The work of both authors is supported by Investigator Awards in Health Policy Research from the Robert Wood Johnson Foundation. Garber is also grateful for the support of grants from the National Institute on Aging (P30 AG 17253 and P01 AG05842) and from the Department of Veterans Affairs, and Skinner is similarly grateful for support from the National Institute on Aging (P01 AG19783). We are indebted to Peter Richmond and Kathy Stroffolino for expert assistance, and thank without implicating Jay Bhattacharya, Amitabh Chandra, David Cutler, Elliott Fisher, Sherry Glied, Ann Norman, Allison Rosen, Timothy Taylor, Douglas Staiger, Victor Fuchs, and participants in the NBER 2008 Summer Institute for insightful comments.
1See Jacobs, Smith, and Street (2006) for a detailed discussion of measuring productivity in health care. Survival can reflect either the probability of surviving to the end of a period or life-expectancy, perhaps weighted by health status. Deciding on a single measure is not straightforward for most types of disease. Blood pressure reduction is the obvious outcome measure for an antihypertensive drug, but some drugs provide benefits that are not fully explained by their effects on blood pressure, while many adverse reactions from antihypertensive drugs do not operate through blood pressure. As well, other factors such as socioeconomic status, education, and individual health behavior will affect not simply health outcomes, but the marginal effectiveness of specific health treatments (Feinstein, 1993; Goldman and Smith, 2002).
2Van Doorslaer et al. (2000), however, do not find evidence for more inequality in U.S. healthcare utilization compared to many European countries.
3One common epidemiological pitfall is to interpret country-level cancer survival rates as quality measures. The United States healthcare system is far more likely to identify individuals both at an earlier stage of the disease and with less severe disease, thus improving measured survival rates even in the absence of better treatment.
4For example, in 2005 there were 5.3 staff members per hospital bed in the United States, compared to an estimated 4.3 in Canada and 1.7 in France (OECD, 2008).
5Alternative discounting schemes are appropriate for alternative objective functions, but require a specific rationale. A frequent justification for deviation from equal discount rates for life years and costs is that time horizons are short, and therefore the discount rate for life-years should be larger than for costs. Another rationale for discounting life-years more is that quality of life measures may fail to account adequately for declines in well-being that accompany aging.
6We are very grateful to Allison Rosen for providing us with these discounted estimates.
7For 40-year-old women, every nation had greater increases in life expectancy than the United States. Only for 40-year-old men did the United States experience larger increases in life expectancy than some of the other nations: Canada, Japan, and Switzerland. Trends could also differ across countries because of differences in disease prevalence. For example, the striking reduction in cardiovascular mortality will have a greater effect on life expectancy in the countries that start with a greater prevalence of the disease. However, the United States had high rates of cardiovascular mortality throughout the early years, nearly as high as the United Kingdom and similar to Germany, so if anything it should have experienced greater life expectancy gains. While obesity rates have risen sharply in the United States, it is clear they have also risen in several European countries.
8This observation abstracts from the question of whether the higher-priced, new-generation drugs are worth the extra expense (Gladwell, 2004).
9Dor, Pauly, Eichleay, and Held (2007) find that average U.S. healthcare costs for end-stage renal disease patients are surprisingly low relative to other countries, the consequence most likely of atypically restrictive reimbursement rates.
10For example, there was considerable public pressure on insurance companies to cover high-dose chemotherapy for breast cancer during the early 1990s. Yet subsequent randomized trials demonstrated no favorable impact on survival (Rettig, Jacobson, Farquhar, and Aubry, 2007).