Wide geographic variation in healthcare spending has generated concern about inefficiency and policy debate about geographic-based payment reform. Evidence on variation has focused on hospital referral regions (HRRs), which incorporate numerous local hospital service areas (HSAs). If there is substantial variation across local areas within HRRs, then policies focusing on HRRs may be poorly targeted.
Using pharmacy and medical claims data from a 5% random sample of Medicare beneficiaries in 2006–2009, we compared variation in health care spending and utilization in 306 HRRs and 3436 HSAs. We adjusted for beneficiary-level demographics, insurance status, and clinical characteristics to calculate adjusted use and spending.
There is substantial local variation in drug and non-drug utilization and spending, and substantial dispersion of local areas within HRRs; many low-spending HSAs are located within the borders of high-spending HRRs and vice versa. Only about half of the HSAs located within the borders of the highest spending quintile of HRRs are in the highest spending quintile of HSAs; conversely, only about half of the highest spending HSAs are located within the borders of the highest-spending HRRs.
The effectiveness of payment reforms in reducing overutilization while maintaining access to high-quality care depends crucially on the effectiveness of targeting. Our analysis suggests that HRR-based policies may be too crudely targeted to promote the best use of healthcare resources.
A substantial body of evidence has emerged highlighting wide geographic variation in healthcare spending that is not driven by patient characteristics and not associated with the quality of care or patient outcomes.1–7 In light of this evidence, many policy proposals suggest targeting high spending areas for lower Medicare payments or other coverage constraints, focusing on areas such as Dartmouth hospital-referral regions (HRRs). These policies are predicated on the idea that there is a system-level component driving some areas to be high-utilization and others low. The effectiveness of these policies in reducing overutilization while maintaining access to high quality care depends crucially on the effectiveness of targeting: if there is substantial variation across local areas within HRRs, then focusing on high-cost HRRs may leave many high-spending locales untouched while inadvertently penalizing some low-spending locales.
We compare variation in medical spending and prescribing patterns at the broader market level, the 306 Dartmouth HRRs, to variation within those markets, the 3436 Dartmouth hospital-service areas (HSAs).1 HRRs represent the areas served by large tertiary hospitals where patients were referred for major cardiovascular surgical procedures and for neurosurgery. HSAs are contained within HRRs and represent areas whose residents receive most of their hospitalizations from the hospitals in the area. Thus, HSAs better capture the local health care markets where Medicare beneficiaries receive most of their care. We examine: (1) variation in prescription drug and medical care spending and use across HSAs versus across HRRs; and (2) the degree to which high-spending HSAs are clustered together within high-spending HRRs. This analysis can help evaluate the effectiveness of policies targeted at different levels of aggregation; if the intention of such policies is to capture variation in local markets, it is important to understand the local-market heterogeneity that larger units may mask.
We obtained 2006–2009 enrollment, drug event and medical claims data from the Centers for Medicare & Medicaid Services (CMS) for a 5% random sample of Medicare beneficiaries. For each year between 2007 and 2009, we identified all beneficiaries with at least one month of enrollment in Parts A, B, and stand-alone Part D (PDP) plans, because CMS has both medical and drug data only for those enrolled in PDP plans. (Data from 2006 were used to calculate the prospective risk scores described below.) The resulting sample consisted of 1,013,477 beneficiaries in 2007, 1,024,183 in 2008, and 1,022,662 in 2009, for 3,060,322 total beneficiary-year observations. We assigned each beneficiary to 1 of 306 HRRs or 3436 HSAs based on the beneficiary’s ZIP-Code of residence.1 HSAs are nested within HRRs. The study was approved by the Institutional Review Board at the University of Pittsburgh.
We conducted sensitivity analysis for two subpopulations: (1) those aged ≥65 and (2) those enrolled for the full year or until they died, to ensure that our results are not substantially affected by a small proportion of disabled beneficiaries or those switching plans within the year (Online Supplement Table S1; coefficients of variation and interquartile ratios are virtually unchanged).
Our outcomes were utilization of and spending on medical services and prescription drugs. All outcome measures were calculated in per-person-per-year units (with spending of part-year enrollees annualized). For medication use, we defined two outcomes: (1) total gross drug spending including Part D plan payment before rebates, beneficiary out-of-pocket spending, and subsidy amount; and (2) number of monthly prescription drugs (=1 if days supply ≤30; =days supply/30 if days supply >30). For medical services, we defined four outcomes: (1) total non-drug medical spending, (2) number of inpatient admissions; (3) number of outpatient office visits; and (4) number of emergency room (ER) visits. Total non-drug medical spending included Medicare and beneficiary payments for all medical services (including inpatient, outpatient, physician, home health, hospice, skilled nursing home, and medical devices) and was adjusted for local price-level differences using county-level factor prices given to us by the Medicare Payment Advisory Commission (MedPAC).7,8 We did not adjust drug spending for regional price differences because the variation in drug prices among regions was negligible.4
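The counting rule for monthly prescriptions and the annualization of part-year spending can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and annualizing by dividing by the share of the year enrolled is an assumption, since the text does not spell out the exact method.

```python
def monthly_prescriptions(days_supply):
    """Count one fill in monthly units: 1 if days supply <= 30,
    otherwise days supply / 30 (the rule stated in the text)."""
    if days_supply <= 0:
        raise ValueError("days_supply must be positive")
    return 1.0 if days_supply <= 30 else days_supply / 30.0


def annual_monthly_prescriptions(fills):
    """Sum monthly-equivalent prescriptions over one beneficiary's fills."""
    return sum(monthly_prescriptions(d) for d in fills)


def annualize(amount, share_of_year):
    """Scale a part-year amount to a full year.
    Assumption: simple division by the enrolled share of the year."""
    if not 0 < share_of_year <= 1:
        raise ValueError("share_of_year must be in (0, 1]")
    return amount / share_of_year
```

Under this rule a 90-day fill counts as three monthly prescriptions, while a 7-day fill still counts as one.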
To account for differences in population characteristics across regions, we adjusted for three major categories of beneficiary-level variables: patient demographics; income and insurance status; and clinical characteristics. Demographics included age in 5-year bins (<=34, 35–39, 40–44, 45–49, 50–54, 55–59, 60–64, 65–69, 70–74, 75–79, 80–84, 85–89, 90–94, 95+), gender, race/ethnicity (non-Hispanic White, non-Hispanic Black, Hispanic, Asian, and other). Part D data have an enhanced Research Triangle Institute Race Code verified by first and last name algorithms, with much improved sensitivity (>77%; Kappa coefficient 0.79).9
We adjusted for individual-level insurance status and a proxy for income. We used variables indicating Medicaid coverage (available to those under about 75% of the Federal poverty level (FPL), but with some state variation) and non-dual federal low-income subsidies (which vary based on FPL cut-points) to create income bins: <75% FPL, 75%–135% FPL, 135%–150% FPL, and >150% FPL. We also included two indicators for supplemental drug coverage using Part D data: those with generic coverage in the “donut hole” gap and those with both generic and brand-name drug coverage in the gap. We also controlled for the share of the year the beneficiary was enrolled. In addition, we adjusted for an indicator of being disabled (<65 years old) for beneficiaries who were eligible for Medicare because of disability. Last, we controlled for ZIP-Code level income (logarithm of the median household income within the ZIP-Code in which the individual lived) and educational attainment (share with less than high school, completing high school only, and completing part of college and above in the individual’s ZIP-Code).
Clinical characteristics included risk scores, indicators for institutionalization (defined as having 90 days in a nursing home), and death during the year. We calculated two prospective risk scores using prior-year diagnosis and spending: CMS Hierarchical Condition Category scores (CMS-HCC) for non-drug medical care services and the analogous prescription drug hierarchical condition category (RxHCC) scores.10 Risk scores represent a proxy for health status, with higher scores indicating greater severity of illness and higher expected health care utilization (in our study sample, RxHCC ranges from 0 to 6.6 and CMS-HCC ranges from 0.1 to 12.4). We used prior-year instead of current-year diagnosis and spending to calculate risk scores (except for the 4% of the sample who were new enrollees, for whom we used concurrent risk scores based on age and gender). Even with the use of the prior year’s risk score, physician coding may be endogenous; for example, physicians in higher-spending regions may code patients as sicker than physicians in lower-spending regions code similar patients.11 We conducted sensitivity analysis to exclude risk scores (Online Supplement Tables S1–S3 and Figure S1; results are robust to the exclusion of risk scores).
We used these data to generate an adjusted average value for each outcome for each HSA (or HRR). We pooled three years (2007–2009) and conducted an individual-level linear regression for each outcome.12 Each regression included HSA (or HRR) indicator variables, year indicators, and the adjustment variables described above. Regressions were weighted by the percent of year enrolled so those who only had partial-year enrollment would contribute less to the model. We then calculated the predicted value for each HSA (or HRR) using the estimating equation evaluated at national averages for the covariates, thus capturing variation at the HSA (or HRR) level that was purged of variation in population characteristics that we were able to control for with our observed covariates.
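The adjustment step can be illustrated with a small standard-library sketch. It assumes a weighted linear regression with one indicator per area, estimated through the equivalent within-area (demeaning) form, and then evaluates each area's prediction at the weighted national covariate means. The function names and the tiny solver are hypothetical; an actual analysis of this size would use a statistical package, and year indicators would simply enter as additional covariate columns.

```python
from collections import defaultdict


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def adjusted_area_means(y, area, X, w):
    """Weighted regression of y on area indicators plus covariates X,
    returning each area's prediction at national covariate means."""
    k = len(X[0])
    sw, sy = defaultdict(float), defaultdict(float)
    sx = defaultdict(lambda: [0.0] * k)
    for yi, ai, xi, wi in zip(y, area, X, w):
        sw[ai] += wi
        sy[ai] += wi * yi
        for j in range(k):
            sx[ai][j] += wi * xi[j]
    ybar = {a: sy[a] / sw[a] for a in sw}
    xbar = {a: [v / sw[a] for v in sx[a]] for a in sw}
    # Normal equations on within-area deviations give the covariate slopes.
    A = [[0.0] * k for _ in range(k)]
    b = [0.0] * k
    for yi, ai, xi, wi in zip(y, area, X, w):
        dx = [xi[j] - xbar[ai][j] for j in range(k)]
        dy = yi - ybar[ai]
        for j in range(k):
            b[j] += wi * dx[j] * dy
            for m in range(k):
                A[j][m] += wi * dx[j] * dx[m]
    beta = solve(A, b)
    total_w = sum(sw.values())
    nat = [sum(sx[a][j] for a in sx) / total_w for j in range(k)]
    # Area effect plus national-mean covariates: ybar_a + beta.(nat - xbar_a)
    return {a: ybar[a] + sum(beta[j] * (nat[j] - xbar[a][j]) for j in range(k))
            for a in sw}
```

Because every area is evaluated at the same covariate values, differences between the returned values reflect only the estimated area effects.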
We used these adjusted HSA (or HRR) outcomes to perform two sets of analysis. First we described the degree of variation between HSAs (or HRRs), calculating statistics such as ratios of 75th to 25th percentiles and coefficients of variation (CV). CV is defined as the ratio of the standard deviation to the mean, and represents a normalized measure of dispersion. We included only 2908 out of 3436 HSAs with 50 enrollees or more to avoid the introduction of noise driven by small cell sizes and used the same sample to conduct HRR level analysis.
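As a minimal sketch of these dispersion statistics, the function below applies the 50-enrollee threshold and then computes the CV and the 75th-to-25th percentile ratio. The function name and input layout are hypothetical, and the use of the sample standard deviation and the "inclusive" quantile method are assumptions; the text does not specify either choice.

```python
import statistics


def dispersion(adjusted, enrollees, min_enrollees=50):
    """Summarize variation in adjusted area-level outcomes.

    adjusted:  dict mapping area id -> adjusted outcome
    enrollees: dict mapping area id -> number of enrollees
    """
    # Drop small areas to limit noise from small cell sizes.
    kept = [adjusted[a] for a in adjusted if enrollees[a] >= min_enrollees]
    q1, _, q3 = statistics.quantiles(kept, n=4, method="inclusive")
    return {
        "n_areas": len(kept),
        # CV: standard deviation divided by the mean.
        "cv": statistics.stdev(kept) / statistics.mean(kept),
        "p75_p25": q3 / q1,
    }
```

Both statistics are unit-free, which is what allows spending and utilization counts to be compared on the same scale.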
Second, we evaluated the degree to which HSAs with similar spending levels clustered together within HRRs. Clearly there will be a correlation between spending at the HSA level and spending at the HRR level – HRRs are just an aggregation of HSAs – but we assessed the degree of dispersion of HSA spending within and between HRRs. Specifically, we divided HRRs into quintiles based on their adjusted spending and also divided HSAs into quintiles based on their adjusted spending. We then tabulated the share of high- and low-spending HSAs located within the borders of high- and low-spending HRRs to gauge the variation of HSAs within and between HRRs.
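The quintile tabulation can be sketched as below. This is an illustrative version under stated assumptions: quintiles are assigned by rank with ties broken by sort order, and the function names and dictionary inputs are hypothetical rather than taken from the study.

```python
def quintiles(values):
    """Map each key to a quintile 1..5 by rank (1 = lowest spending)."""
    ranked = sorted(values, key=values.get)
    n = len(ranked)
    return {key: 1 + (5 * i) // n for i, key in enumerate(ranked)}


def concordance(hsa_spend, hrr_spend, hsa_to_hrr, q=5):
    """Return two shares for quintile q:
    (share of HSAs inside quintile-q HRRs that are quintile-q HSAs,
     share of quintile-q HSAs that sit inside quintile-q HRRs)."""
    hsa_q = quintiles(hsa_spend)
    hrr_q = quintiles(hrr_spend)
    in_hrr_q = [h for h in hsa_q if hrr_q[hsa_to_hrr[h]] == q]
    in_hsa_q = [h for h in hsa_q if hsa_q[h] == q]
    both = [h for h in in_hrr_q if hsa_q[h] == q]
    return len(both) / len(in_hrr_q), len(both) / len(in_hsa_q)
```

Calling `concordance(..., q=5)` gives the two shares for the highest-spending quintile, and `q=1` gives them for the lowest; the two shares differ in their denominators, which is why they are reported separately.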
The Table presents the variation in pharmacy and non-drug medical spending, counts of monthly prescriptions filled, inpatient admissions, outpatient office visits and ER visits per person per year in different regions. Panel A shows the variation at the HSA level. Beneficiaries in the median HSA filled 53 monthly prescription drugs per year, or 4.4 prescriptions per month, corresponding to $2912 in annual gross drug spending. Medical spending and drug spending are comparably variable, with coefficients of variation of 0.15. The ratio of drug spending at the 75th percentile to that at the 25th percentile is 1.21, whereas the corresponding ratio for drug counts is only 1.13. This suggests that variation in the mix of drugs prescribed is larger than variation in the number of drugs prescribed. Of course, part of this variation may be due to other unmeasured patient characteristics or illness severity.
Panel B reports the analogous variation across HRRs. The pattern of variation across categories is quite similar, although the overall degree of variation is somewhat lower. We explore below whether high-spending HSAs are located primarily within high spending HRRs, or whether there is substantial variation between HSAs within HRRs.
We gauge the degree to which HSAs with high spending are concentrated together in HRRs with high spending. A formal test of whether there is variation across HSAs nested within HRRs can (unsurprisingly) reject the null hypothesis of no systematic HSA variation within HRRs (joint F=4.86, p<0.01). About 41% of the variation in adjusted HSA drug spending is between HRRs, and 59% is within HRRs. About 43% of the variation in adjusted HSA non-drug medical spending is between HRRs, 57% within. (These results are virtually unchanged if we exclude the top and bottom 1% of the HSA values.) There is substantial variation of HSAs within HRRs. For example, Manhattan is one of the HRRs with the highest adjusted drug spending and Albuquerque is one of the lowest – but there is substantial dispersion in spending across the HSAs within those HRRs: the lowest-spending HSA in Manhattan has lower spending than about 25% of the HSAs within Albuquerque. For comparison, about 66% of the variation in HSA-level income is between HRRs, while 34% is within; about 55% of the variation in HSA-level education is between HRRs, 45% within; about 70% of the variation in HSA-level proportion of Whites is between HRRs, 30% within; about 66% of the variation in HSA-level CMS-HCC risk score is between HRRs, 34% within. Analysis at the HRR level thus masks more within-HRR local area variation in healthcare spending than it does in several (although not all) covariates.
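One way to obtain between- and within-HRR shares of this kind is a sum-of-squares decomposition of the HSA-level values. The sketch below assumes an unweighted decomposition, which is not necessarily the method used in the study; the function name and inputs are hypothetical.

```python
from collections import defaultdict


def between_within_shares(hsa_value, hsa_to_hrr):
    """Split total variation in HSA-level values into the share
    between HRRs and the share within HRRs (unweighted)."""
    groups = defaultdict(list)
    for hsa, value in hsa_value.items():
        groups[hsa_to_hrr[hsa]].append(value)
    all_values = list(hsa_value.values())
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    # Between-HRR sum of squares: HRR means around the grand mean,
    # weighted by the number of HSAs in each HRR.
    ss_between = sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2
        for vs in groups.values()
    )
    between = ss_between / ss_total
    return between, 1.0 - between
```

A between share near 1 would mean HRR membership captures almost all local variation; the roughly 40% reported here for spending implies most variation lies among HSAs in the same HRR.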
Figure Panels A–B show the degree of clustering of high-spending HSAs within high-spending HRRs (See Online Supplement Table S2 for a complete set of conditional probabilities). For HRRs in each quintile of adjusted HRR spending, Panel A shows the share of the HSAs within that HRR that are high- or low-spending HSAs. For example, for adjusted pharmacy spending, 50.7% of the HSAs located within the highest drug-spending HRR quintile are in the highest drug-spending quintile of HSAs; 50.3% of the HSAs in the lowest drug-spending HRR quintile are in the lowest HSA quintile. Similar patterns are observed for adjusted medical spending: 57.4% of the HSAs in the highest HRR quintile are in the highest quintile of HSAs; 54.0% of the HSAs in the lowest HRR quintile are in the lowest HSA quintile.
For HSAs in each quintile of HSA spending, Panel B shows the share of the HSAs that are located in high- and low-spending HRRs. For adjusted pharmacy spending, 51.5% of the HSAs in the highest-spending quintile are located within the highest-spending HRR quintile; 49.6% of the lowest drug-spending HSAs are located within the lowest-spending HRR quintile. For adjusted medical spending, 52.0% of the highest-spending HSAs are located within the highest-spending HRR quintile; 46.0% of the lowest-spending HSAs are located within the lowest-spending HRR quintile.
In sum, there is substantial misalignment of high-spending HSAs and HRRs, with many low-spending HSAs located within the borders of high-spending HRRs and many high-spending HSAs located within the borders of low-spending HRRs.
Much policy attention has been drawn to the large and persistent geographic variation in healthcare spending – for good reason. The presence of such variation (in the absence of commensurate variation in patient needs or even in health outcomes) suggests that high-intensity practice patterns in some areas signal inefficient resource use. This has led to discussion of policy levers to rein in spending in high-utilization, high-cost areas, such as lowering Medicare payments to providers in those areas. Such policy levers aim to focus on a level of aggregation that captures the local healthcare delivery system – too low a level (such as individual physician payments) could miss the system-level factors that drive some areas to be systematically high-utilization, while too high a level (incorporating multiple systems and markets) could reward or punish utilization well beyond providers’ control and mask substantial heterogeneity.
Analysis has primarily been focused on variation between HRRs – areas defined based on large tertiary facilities and incorporating numerous HSAs. There are advantages to looking at such large areas (they may be large enough to capture more homogeneous patient pools), but the disadvantage of such an exclusive focus is that it can mask substantial heterogeneity at the more local level. This is particularly important when considering the effects of policy levers that aim to act on local practice patterns for primary care or avoidable hospitalizations, for example.
We examined the degree of heterogeneity within HRRs. We found that there is substantial local variation in utilization and spending for both drug and non-drug medical spending and that there is substantial dispersion of local spending within HRRs.
These findings are of course subject to several limitations. First, our analysis is based on the Medicare population. Patterns among the commercially insured may differ. Medicare does, however, account for 20% of all national health care spending as of 2010,13 and many of the policy levers discussed apply to Medicare payment rates. Second, our risk-adjusters are imperfect and we do not capture patient preferences.14 To the extent that these vary across localities, they could drive some of the observed patterns of heterogeneity. It is somewhat reassuring on this front that patterns of unadjusted outcomes are quite similar to those with adjusted outcomes. Third, causal connections are inherently difficult to draw from ecological data. While the variation described here (and elsewhere in the literature) is strongly suggestive of inefficient use of resources, it is difficult to use these data to forecast what the effect of different policy levers might be on spending patterns.
Nevertheless, these findings do have policy implications. Policies that aim to reduce spending in high-cost areas by targeting high-spending HRRs may fail on both sensitivity and specificity: about half of the HSAs in the highest spending HRR quintile are not in the highest spending quintile of HSAs; and about half of the HSAs in the lowest spending HRR quintile are not in the lowest HSA quintile. That said, higher spending HSAs are generally in higher spending HRRs. Whether or not this degree of concordance is sufficient to achieve policy aims depends on policy-makers’ tolerance for the ramifications of imperfect targeting.
This does not, however, tell us what the “right” level of aggregation for policy is. There is clearly variation in spending within HSAs – should policy focus on an even more local level? The movement towards Accountable Care Organizations (ACOs) aims to tie payments to the care delivered by provider groups that are large enough to pool risk and abstract from individual-level variation in needs and idiosyncratic outcomes, but small enough to hold the group accountable for the use of resources. In the absence of formal ACOs, payments tied to local area practice patterns aim to accomplish similar goals. This analysis suggests that policies focused exclusively on the hospital referral region may be too blunt to promote the best use of health care resources.
Institute of Medicine (HHSP22320042509XI), National Institute of Mental Health (No. RC1 MH088510) and Agency for Healthcare Research and Quality (No. R01 HS018657) to Dr. Zhang.
Conflicts of interest
Baicker is a Commissioner on the Medicare Payment Advisory Commission and a director of Eli Lilly.