Proposals to use episodes of care as a basis for payment and performance measurement are largely conceptual at this stage, with little empirical work or experience in applied settings to guide their design. Based on analyses of Medicare data, we identified key issues that will need to be considered related to defining episodes and determining which provider is accountable for an episode. We suggest a number of applied studies and demonstrations that would facilitate more rapid movement of episode-based approaches from concept to implementation.
An array of recent health care reform proposals have called for the use of episodes of care as a basis for payment and performance measurement.1,2,3 Episode-based approaches are viewed as a means to drive improvements in the quality and efficiency of health care delivery. Under an episode approach, some or all of the services related to the management of a patient’s chronic or acute medical condition would be grouped together. The specific applications of episodes being discussed include profiling providers to provide comparative feedback for quality improvement, public reporting, pay-for-performance, and “bundled” payments for groups of services.
An episode-based approach stands in sharp contrast to existing payment and performance measurement systems, which often focus on discrete services. Fee-for-service payment systems reimburse individual providers separately for the services they deliver, which encourages increased volume and intensity of services rather than overall value.4 Similarly, when quality of care is measured, the focus is most often on the delivery of discrete services (e.g., did a patient with heart attack receive a beta blocker) within separate settings (e.g., hospital or physician’s office) rather than on the overall quality of the care delivered to manage a condition.
The array of trajectories a patient could take through the health care system – potentially touching multiple providers located in different settings – highlights the challenges of delivering coordinated care. Medicare beneficiaries receive care from a median of 7 physicians,5 and the typical primary care physician must coordinate with 229 other physicians working in 117 practices.6 Typically, no single provider or set of providers claims responsibility for managing a patient’s care from the start to finish of a care episode. Episode-based approaches seek to remediate these problems by strengthening incentives for greater coordination among the array of providers involved in a patient’s care.
While there is great interest in moving to episode-based payment and performance measurement, the proposed applications remain largely conceptual. Only a handful of real-world experiments have been completed or are in the early stages of implementation. Episodes of care have been used by many commercial insurers and several regional alliances to profile physicians on their relative resource use.7 The Centers for Medicare & Medicaid Services (CMS) has invested substantial resources in testing episode-based approaches to measuring physician resource use, and is preparing to report resource use to physicians on a confidential basis.8
Episodes of care are also being tested as the basis for payment. Among the applications being tested now or soon to be tested are: 1) Geisinger Health Plan’s payment for cardiac care episodes;9 2) the Medicare Acute Care Episodes (ACE) demonstration, which will provide a payment for hospital and physician services provided during an episode for orthopedic and cardiovascular procedures; 3) the Medicare Physician Hospital Collaboration demonstration, which is testing gain sharing models involving physician-hospital collaborations, with an emphasis on tracking patients beyond the inpatient stay; and 4) Prometheus Payment’s pilot testing of episode-based payments for several acute and chronic conditions.10 The Senate Finance Committee is also considering a policy that, beginning in fiscal year 2015, would pay for acute inpatient hospital services and post-acute care services within 30 days through a bundled Medicare payment.11
While these efforts provide important steps towards building a knowledge base, much remains unknown about how to define and apply the episode-of-care concept. The purpose of this paper is to explore two design issues that are fundamental to implementing any type of episode-based approach to payment or performance measurement: (1) How should episodes be defined? and (2) How should accountability be attributed to providers? In considering these issues, we suggest a number of applied studies and demonstrations that would facilitate more rapid movement of episode-based approaches from concept to implementation.
We constructed episodes of care using two commercially available episode grouper tools, the Symmetry Episode Treatment Groups (ETGs) and the Thomson Medical Episode Groups (MEGs). These grouper tools define episodes of care on a condition-specific basis.12 Because of the similarity in results, we discuss only the findings from the ETG-constructed episodes. Although the specific results are affected by the underlying logic of how each grouper tool assigns claims to episodes, we believe the issues we highlight require consideration under any method of episode construction.
The study population consisted of continuously enrolled Medicare beneficiaries who were in Medicare Part A and Part B for 2004–2006, age 65 or older, and primarily resided in Florida, Oregon, or Texas. Our analyses focused on patients with one of nine clinical conditions: acute myocardial infarction (AMI), bacterial pneumonia, breast cancer, cerebrovascular disease, chronic obstructive pulmonary disease, congestive heart failure (CHF), diabetes, hip fracture, and low back pain.13 The nine conditions were purposively selected to represent a mix of chronic and acute conditions that are treated in a variety of care settings.
Defining the group of related services that will constitute a discrete episode is a central design consideration. Whether to narrowly or broadly define an episode is important because patients have multiple, co-occurring chronic conditions and are treated in many care settings. Our analyses of the Medicare data flagged three issues that could influence how policymakers choose to define an episode: 1) the number of different settings involved in management of the condition; 2) a single versus multi-condition focus; and 3) the amount of heterogeneity within episodes of the same type.
Medicare beneficiaries typically received care from a wide variety of providers and in numerous settings.14 The number of settings (e.g., inpatient, physician office, home health, etc.) involved in episodes varied both within episodes related to a particular condition and between episodes related to different conditions. For example, 57% of episodes related to hip fracture and 28% of episodes related to AMI involved four or more settings. Some types of episodes, such as those related to diabetes and low back pain, are treated in only one setting (physician office) about 50% of the time. Even for these conditions, however, over 10% of episodes involved three or more settings. Similar variation is observed in the number of physicians involved in the management of the episode (Exhibit 1).
In defining episodes, the question is how many different care settings to include. While the ETG and MEG episodes encompass all services delivered across all settings related to a particular condition, it is not clear which providers in which settings could feasibly and fairly be considered jointly accountable for episodes. For example, to what extent should hospitals be accountable for the quality of post-acute care? The current lack of inter-connectedness between providers and the absence of systems of care in most communities will create operational challenges in the near term. However, application of episodes may encourage providers to move towards establishing more formal relationships and quasi-systems of care.
Should episodes of care include only a single condition, or should interrelated conditions be captured in a single episode? Many Medicare beneficiaries have multiple chronic conditions.15 Our analyses found that Medicare beneficiaries in our study sample had, on average, eight or more episodes of care during a year, some of which were for interrelated conditions. For example, many Medicare beneficiaries who had an AMI also had hypertension (63%), CHF (54%), or diabetes (35%) episodes.
Encouraging a single-condition focus through episode-based approaches may not be optimal for patient management, given that many conditions co-occur and their management is interrelated. To illustrate this point, consider patients who had an AMI during an ischemic heart disease episode and had separate episodes for related conditions including hypertension, hyperlipidemia, cerebrovascular disease, and CHF (Exhibit 2). AMI patients with these comorbidities not only had higher total costs, but also higher cost for the ischemic heart disease episode alone, suggesting greater complexity in clinical management.
In defining episodes, should these related conditions be considered discrete episodes or packaged as a cluster? If episodes are being used for performance measurement, it may be more appropriate to define episodes to include multiple, interrelated conditions that are often treated concurrently by a physician. Such a definition would give a fuller view of the clinical management of complex patients. However, broader definitions would create more heterogeneous episodes, introducing greater financial risk in payment-related applications. For these uses, a narrower definition that would result in more homogeneous episodes may be desirable.
How similar are episodes for the purposes of managing care and resources? We found substantial variation in the average standardized payments for episodes of a specific condition.16 We used the coefficient of variation (CV) to assess how much variability there was in standardized payments. The CV ranged from 72 for episodes related to hip fracture, indicating lower variability, to 269 for episodes related to diabetes, indicating a great deal of variability within episodes of the same condition type.
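To make the statistic concrete, the following is a minimal sketch of a coefficient of variation calculation. It assumes the CV is expressed as a percentage (100 times the standard deviation divided by the mean) and uses the population standard deviation; both are assumptions about scale and convention, not details confirmed by the analysis described here. The payment amounts are invented for illustration.

```python
from statistics import mean, pstdev

def coefficient_of_variation(payments):
    """Return the CV as a percentage: 100 * (standard deviation / mean)."""
    avg = mean(payments)
    if avg == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100 * pstdev(payments) / avg

# Hypothetical standardized payments (dollars) for two episode types.
tight_episodes = [9000, 10000, 11000, 10000]      # costs cluster near the mean
skewed_episodes = [200, 300, 250, 400, 12000]     # one very costly episode

cv_tight = coefficient_of_variation(tight_episodes)
cv_skewed = coefficient_of_variation(skewed_episodes)
```

In this toy data a single high-cost episode is enough to push the CV above 100, which is the kind of within-condition spread that matters for financial risk.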
The observed variation in payments could be due to a variety of factors, including 1) variation in patterns of care among providers managing patients with the same condition, 2) heterogeneity in the clinical condition of the patient (e.g., severe pneumonia versus mild pneumonia), and/or 3) random variation.
Variation could represent an opportunity for episode-based approaches to reduce undesired variations in practice patterns.17 But some of the variation may be outside the control of providers and could represent chance variation or differences in patient risk factors. If these risks are not controlled for, unintended consequences could occur in using episodes for either payment or performance measurement. In particular, providers may avoid sicker patients with higher utilization whose care trajectory is likely to differ due to factors outside the control of the provider. Ways to minimize variation include constructing more homogeneous episode definitions, or risk mitigation strategies such as risk adjustment, risk corridors, stop-loss insurance, and outlier payments.4
The second design issue that is relevant to all episode-of-care applications is how responsibility is attributed to one or more providers for the content and outcomes of an episode. Ideally, the provider or providers assigned accountability would feel responsible for the care delivered within the episode. Achieving a sense of ownership by providers may prove challenging, particularly when care delivered during an episode is dispersed across multiple providers (Exhibit 1), who may be located in different settings.
Attribution of care, depending on the application, could be retrospective or prospective. We examined six retrospective attribution rules that assigned episodes based on the provider’s proportion of visits or payments during the episode.18 We assigned episodes to two different kinds of providers, physicians and facilities (e.g., hospitals, skilled nursing facilities). Applying these rules, we were able to attribute most episodes to one or more providers. The performance of the attribution rules did vary, however, by the assignment method and clinical condition. The rules assigned between 73% and 99% of AMI-related episodes to one or more providers. But among diabetes episodes, where treatment is mostly outpatient, there was frequently no facility that could be assigned care and facility-based rules did not perform well. The variability in the performance of attribution rules by clinical condition suggests that careful consideration of the attribution algorithm is needed and the approach may need to vary depending on the clinical condition and its care trajectory.
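The general shape of a retrospective rule can be sketched in a few lines. The example below is illustrative only and is not one of the six rules examined: it assigns an episode to the provider with the largest share of visits or payments, and the `min_share` threshold, the function name, and the claim layout are all hypothetical.

```python
from collections import defaultdict

def attribute_episode(claims, basis="payment", min_share=0.0):
    """Assign an episode to the provider with the largest share of visits or payments.

    claims: list of (provider_id, payment) tuples, one per visit.
    basis: "payment" weights each claim by its dollars; "visits" counts claims.
    min_share: hypothetical threshold; below it the episode is left unassigned.
    Returns a provider_id, or None when no provider qualifies.
    """
    if basis not in ("payment", "visits"):
        raise ValueError("basis must be 'payment' or 'visits'")
    totals = defaultdict(float)
    for provider_id, payment in claims:
        totals[provider_id] += payment if basis == "payment" else 1.0
    grand_total = sum(totals.values())
    if grand_total == 0:
        return None
    # Iterate in sorted order so ties resolve deterministically (an arbitrary choice).
    top_provider = max(sorted(totals), key=lambda p: totals[p])
    share = totals[top_provider] / grand_total
    return top_provider if share >= min_share else None

# Hypothetical episode: one costly specialist visit and three primary care visits.
episode = [("cardiologist", 900.0), ("pcp", 150.0), ("pcp", 150.0), ("pcp", 100.0)]
by_payment = attribute_episode(episode, basis="payment")
by_visits = attribute_episode(episode, basis="visits")
strict = attribute_episode(episode, basis="visits", min_share=0.8)
```

The same episode lands with a different provider depending on the basis, and a stricter threshold leaves it unassigned, which mirrors the trade-off between assigning more episodes and assigning them with confidence.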
An alternative approach to attribution is to encourage providers to prospectively assume accountability for the episodes of patients. We describe below a variety of testing that could be done regarding virtual groupings of providers that could prospectively assume accountability.
There is a great deal of interest in episode-based approaches to payment and performance measurement that can be implemented in the near term. However, there is a paucity of solid empirical work or real-world applications that provide knowledge on how best to design and implement episodes of care in practice. In this section, we outline a path towards implementation of episode-based approaches. We first describe the scope of current pilot testing. We then outline a research agenda that would fill the gaps in knowledge that are needed for broader use of episode-based approaches.
As described above, CMS and others have already begun several pilots and demonstrations of episode-based approaches. These pilots will provide essential empirical information on the effects of different episode-based approaches and how they can be administered in practice. Not surprisingly, current pilots focus on the most feasible approaches and frequently share three characteristics.
First, eligibility for pilots is sometimes limited to integrated groups of providers such as integrated delivery systems and medical groups. Integrated systems currently represent the minority of providers delivering care in the U.S.; in most cases, providers are either loosely configured or there is an absence of explicitly defined relationships.
Second, most pilots tie relatively weak incentives to episodes, such as confidential reporting on episode cost and quality, financial incentives for good performance with no downside provider risk, or shared savings. As experience with these approaches increases, stronger incentives such as prospective payment could potentially be used.
Third, relatively narrow episode definitions are used in many pilots. Most of the episodes currently used are anchored around a hospitalization, for several reasons: inpatient care is acute and easier to identify and demarcate; hospital-based care is expensive; hospitals have existing relationships with physicians that can be used to improve care episodes, with some specialties working predominantly in the hospital; and many hospital-based services are already bundled into a Medicare prospective payment.
To build from this base, we recommend two approaches for expanding the scope of episode-based approaches. First, we recommend a “building block” approach where episode definitions continue to increase in breadth over time. For example, pilots such as the Medicare ACE and Physician Hospital Collaboration demonstrations could be expanded to include additional care settings (e.g., post-acute care) and additional conditions.
Second, we recommend testing episodes for chronic conditions, which would include both inpatient and ambulatory care. Although hospital-based episodes have been the most feasible for pilot testing at this stage, these approaches do not create an incentive to avoid the initial hospitalization, since episodes are based around admissions. Providers responsible for chronic condition episodes would assume responsibility for managing patients on an ambulatory basis and avoiding hospitalization to the extent possible. Although limited pilot testing (e.g., Prometheus) of episode-based performance measurement and payment for chronic conditions is under way, Medicare has taken a different approach to date in demonstration projects on chronic care management. The approach of the Medicare Physician Group Practice and Medical Home demonstrations is to attribute all of a patient’s care for a time period to a provider group. Given the number of co-occurring chronic conditions for Medicare beneficiaries, this single-patient focus may have advantages over an episode-based approach. However, the patient-based approach (similar to capitation) holds providers accountable for the “probability risk” that patients will develop a condition, whereas episode-based approaches limit provider accountability to the “technical risk” related to each condition. In addition, relatively few provider organizations may be able to assume accountability for all of a patient’s care. For these reasons, episode-based approaches for chronic care should be evaluated along with patient-based approaches.
There are several applied studies that would facilitate more rapid movement of episode-based approaches from concept to implementation beyond the approaches used in most current pilots. Some of this research mirrors work that has been done in development of existing prospective payment systems. In this section, we describe an illustrative list addressing the two design issues highlighted in our analyses: episode definition and episode attribution.
CMS and others, such as Prometheus, the state of Minnesota, the American Board of Medical Specialties (ABMS) and the National Quality Forum (NQF), health plans and health systems, have begun important work to explore alternative constructions and to understand how current episode groupers define episodes. We believe additional investments are required to inform key operational issues. Our exploratory analyses illustrated that different episode definitions are likely necessary for different applications – e.g., broader episodes may be better-suited for quality measurement, while narrower episodes may be more feasible for payment applications. Starting with high volume/cost conditions, alternative episode constructions for specific applications should be developed. This work should explicitly examine whether different approaches are required for different episode types, such as chronic episodes with acute exacerbations, strictly chronic episodes, and strictly acute episodes. Consideration needs to be given to how to account for the multiple comorbidities of many beneficiaries and whether to lump co-related and co-occurring conditions together or to address them separately.
These episode definitions could then be tested in a variety of ways before implementation in pilot tests, including face validity testing with providers, analysis of sources of variation, and simulation of episode-based payments.
Testing the face validity of various approaches to defining an episode with providers will serve to highlight potential implementation barriers and can be used to refine definitions. Soliciting physician input during the definitional stage will ensure the clinical integrity of episodes and help mitigate resistance to their use among providers.
Understanding the sources of variation within episodes will help to adjust how episodes are defined and applied. The first step in this type of analysis would be to determine the extent of variation in episodes using a particular definition. For example, we found that the CV for selected ETGs ranged from 72 to 269. Subsequent analytic steps could segment that variation into categories, such as patient risk/severity and potentially avoidable care.
First, the amount of variation in episode costs that can be explained by patient characteristics would be identified using risk adjustment models. The adjustment methodology will need to consider the impact of within-condition variation in severity. In addition, as suggested by Exhibit 2, the methodology also needs to capture the effect of comorbid conditions on the costs of an episode. Currently, episode groupers attempt to either segment patients into homogeneous groups or include some form of within-condition adjustment. However, the extent to which these methodologies explain within-episode variation in costs is not well-documented. Medicare uses the patient-level risk adjustor CMS Hierarchical Condition Categories (CMS-HCCs) to risk adjust payments to Medicare Advantage plans. While using CMS-HCCs for the purpose of risk-adjusting episodes is a natural extension, its performance in this application has not been demonstrated yet.
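One simple way to quantify how much variation a risk adjuster explains is the R-squared of a model that predicts each episode's cost from its risk group. The sketch below assumes the crudest possible adjuster, a single categorical severity level whose prediction is the group mean; it is not the CMS-HCC model or any grouper's method, and the data and names are invented.

```python
from collections import defaultdict
from statistics import mean

def share_of_variation_explained(episodes):
    """Return R-squared when each episode's cost is predicted by its group mean.

    episodes: list of (risk_group, cost) tuples.
    """
    costs = [cost for _, cost in episodes]
    overall_mean = mean(costs)
    total_ss = sum((cost - overall_mean) ** 2 for cost in costs)
    if total_ss == 0:
        raise ValueError("costs do not vary, so R-squared is undefined")
    by_group = defaultdict(list)
    for group, cost in episodes:
        by_group[group].append(cost)
    group_means = {group: mean(values) for group, values in by_group.items()}
    # Residual variation is what remains after subtracting each group's mean.
    residual_ss = sum((cost - group_means[group]) ** 2 for group, cost in episodes)
    return 1 - residual_ss / total_ss

# Hypothetical pneumonia episodes labelled by severity.
episodes = [
    ("mild", 1000), ("mild", 1200), ("mild", 800),
    ("severe", 3000), ("severe", 3400), ("severe", 2600),
]
r_squared = share_of_variation_explained(episodes)
```

A value near 1 would mean the grouping captures most of the cost differences; real episode data would be expected to leave far more variation unexplained than this tidy example does.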
Next, analyses of services delivered within episodes could be conducted to determine what types of care are driving episode costs. Utilization data could be reviewed by panels of clinical experts to determine which services commonly provided during episodes are considered typical care and which are potentially avoidable or inappropriate. Prometheus Payment has performed this type of analysis for several conditions in a national commercial population, and found that 40% of episode costs, on average, were for services classified as “potentially avoidable complications.”10 This could be augmented by more-detailed analyses by clinical experts of episodes using medical charts for particular types of patients to identify opportunities for improvement in patient care.
Some episode applications, such as payments, could be simulated prior to pilot testing. Simulations using Medicare data can be used to identify patterns in characteristics of providers expected to win and lose under new payment methods, and to test the level of financial risk under specific configurations. Similar studies have been used to test the expected impact of a broad range of new Medicare payment policies. For example, studies of bundled payment for inpatient physician services in the 1980s indicated high levels of financial risk for medical admissions, as well as systematic patterns in winners and losers (specialists would be paid less, and generalists more). These results contributed to the decision not to bundle payment for inpatient physician services.
A hypothetical simulation study would examine bundled payment for episodes of diabetes care. Actual utilization data would be used to determine payment amounts if bundled payment had been applied, and these rates could be compared to actual payments for the same services. This analysis could show if there would be systematic patterns in winners and losers under the bundled payment policy – for example, if endocrinologists are more likely to benefit than internists; if larger physician practices are more likely to benefit than solo practices; or if certain regions are more likely to benefit. Some systematic differences could be targeted by special payment policies. For example, if rural safety-net providers would be expected to consistently incur losses, they could be exempted from bundled payments, similarly to how Critical Access Hospitals are exempt from the Inpatient Prospective Payment System.
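A minimal sketch of such a comparison follows. It assumes a flat, budget-neutral bundled rate equal to the mean historical payment per episode, which is only one of many possible rate-setting choices, and it groups episodes by a hypothetical provider specialty label. The payment amounts are invented and say nothing about which specialties would actually gain or lose.

```python
from collections import defaultdict
from statistics import mean

def simulate_bundled_payment(episodes):
    """Compare actual payments with a flat, budget-neutral bundled rate.

    episodes: list of (provider_group, actual_payment) tuples.
    Returns (bundled_rate, {provider_group: net gain under bundling}).
    """
    bundled_rate = mean(payment for _, payment in episodes)
    net_gain = defaultdict(float)
    for group, actual in episodes:
        # Positive values mean the group would be paid more than it was historically.
        net_gain[group] += bundled_rate - actual
    return bundled_rate, dict(net_gain)

# Hypothetical diabetes episodes with their historical payments (dollars).
episodes = [
    ("endocrinologist", 1400.0), ("endocrinologist", 1600.0),
    ("internist", 800.0), ("internist", 1000.0), ("internist", 1200.0),
]
rate, gains = simulate_bundled_payment(episodes)
winners = sorted(group for group, gain in gains.items() if gain > 0)
losers = sorted(group for group, gain in gains.items() if gain < 0)
```

Because the rate is budget neutral, gains and losses sum to zero across groups; the question of interest is whether they fall systematically on particular kinds of providers.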
Additional analyses could estimate provider financial risk under different payment scenarios. Simulations could determine how often providers would be expected to incur large gains or losses given the natural variation in episode costs. These simulations would be used to craft payment policies with an acceptable level of risk through methods such as blended fee-for-service and bundled payment or outlier payments.
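As a rough illustration of this kind of risk estimate, the sketch below resamples historical episode costs with replacement and counts how often a provider's total cost would exceed its bundled revenue by more than a chosen margin. The bundled rate (the historical mean), the 10% loss threshold, the number of trials, and the cost data are all assumptions made for the example.

```python
import random
from statistics import mean

def probability_of_large_loss(historical_costs, n_episodes, loss_threshold=0.10,
                              n_trials=2000, seed=0):
    """Estimate how often a provider's costs exceed bundled revenue by a margin.

    The bundled rate is assumed to equal the historical mean cost per episode.
    Each trial draws n_episodes costs with replacement (a simple bootstrap).
    """
    rng = random.Random(seed)  # local generator keeps results reproducible
    revenue = mean(historical_costs) * n_episodes
    large_losses = 0
    for _ in range(n_trials):
        total_cost = sum(rng.choices(historical_costs, k=n_episodes))
        if total_cost > revenue * (1 + loss_threshold):
            large_losses += 1
    return large_losses / n_trials

# Hypothetical, right-skewed episode costs (dollars).
historical = [500, 700, 900, 1000, 1200, 1500, 2000, 8000]
risk_small_panel = probability_of_large_loss(historical, n_episodes=5)
risk_large_panel = probability_of_large_loss(historical, n_episodes=500)
```

With skewed costs, a provider with only a handful of episodes faces a much higher chance of a large loss than one with hundreds, which is the motivation for blended payment or outlier provisions.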
A limitation of this type of simulation analysis is that it relies on historical utilization patterns, which may change under a new payment system. Attempts could be made to model expected changes under new payment incentives, but actual results are likely to differ from simulation estimates.
Encouraging greater coordination of care delivery and resource use among providers is a key component of episode-based approaches. This will, however, require a shift in accountability in health care delivery, and there is little agreement at this point about what should define entities that are accountable for care, and how accountability should be attributed to specific entities.
Current applications of episode-based resource use measures typically hold a single physician accountable for the resources used within an episode. This is frequently the physician who is responsible for the largest number of visits or greatest share of costs. More recent proposals, which emphasize creating shared accountabilities for quality and cost across the broader set of providers involved in patient care, call for developing Accountable Care Organizations (ACOs)19 or Accountable Care Systems (ACSs)20 – groups of providers that would assume accountability for episodes or patients.
The ACO concept is more feasible with integrated medical groups or integrated delivery systems of care where providers are already linked organizationally and financially. For non-integrated providers, the operational challenges of using ACOs for payment (i.e., shared savings, performance-based bonus payments, or bundled payments), management of financial risk, and provision of more coordinated care are not yet understood.
Recent MedPAC deliberations highlighted some of the challenges in the ACO concept, and underscore the need for additional empirical and applied work to move the ACO concept towards an operational form.21 For example, how would providers within an ACO feel about being responsible for an episode of care if patients are free to receive care outside their designated ACO? Analyses that would illustrate what fraction of episodes of care is delivered outside an ACO would help providers understand the risks they would face in being responsible for managing quality and cost, and determine whether some type of patient “lock-in” provision or other risk mitigation strategy would be needed.
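The analysis described here reduces to a simple share calculation once an ACO's membership is fixed. The sketch below assumes the ACO is represented as a set of provider identifiers and measures the outside share in payment dollars; both choices, along with the names and data, are hypothetical.

```python
def outside_share(claims, aco_providers):
    """Fraction of an episode's payments billed by providers outside the ACO.

    claims: list of (provider_id, payment) tuples for one episode.
    aco_providers: set of provider_ids that belong to the ACO.
    """
    total = sum(payment for _, payment in claims)
    if total == 0:
        return 0.0
    outside = sum(payment for provider, payment in claims
                  if provider not in aco_providers)
    return outside / total

# Hypothetical episode: two ACO providers and one outside provider.
aco = {"hospital_a", "dr_b"}
claims = [("hospital_a", 600.0), ("dr_b", 300.0), ("clinic_x", 100.0)]
share = outside_share(claims, aco)
```

Averaging this share across an ACO's episodes would indicate how much of the care it is accountable for falls beyond its direct control.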
Another issue to be addressed is the appropriate size of an ACO. In proposals by MedPAC and Fisher et al., 5,000 was identified as the minimum number of beneficiaries an ACO would need to generate statistically stable results. It is unclear whether 5,000 beneficiaries would in fact provide a sufficient basis for stable estimates of quality performance, save for the most frequently triggered measures. For example, colorectal cancer screening has an eligible population of 168.2 per 1,000 member years, which would yield 841 denominator patients for scoring in an ACO with 5,000 members, while management of heart failure has an eligible population of only 0.6 per 1,000 member years, yielding just 3 denominator events. Analyses could easily be done to estimate the size of the denominator population for various quality measures at an ACO-level, compute performance scores at the ACO-level using administrative data, and assess the minimum number of denominator events required to produce a reliable result.
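The denominator arithmetic can be written out directly. The sketch below scales an eligibility rate per 1,000 member years to an ACO of a given size and inverts the calculation to ask how many members a measure would need to reach a target denominator. The target of 30 events is a hypothetical placeholder, not a reliability threshold taken from any source.

```python
def expected_denominator(rate_per_1000_member_years, members, years=1.0):
    """Expected number of measure-eligible patients in an ACO."""
    return rate_per_1000_member_years * members * years / 1000.0

def members_needed(rate_per_1000_member_years, target_denominator):
    """ACO size at which the expected denominator reaches the target (one year)."""
    if rate_per_1000_member_years <= 0:
        raise ValueError("rate must be positive")
    return target_denominator * 1000.0 / rate_per_1000_member_years

# Colorectal cancer screening at 168.2 eligible per 1,000 member years.
screening_denominator = expected_denominator(168.2, members=5000)  # about 841

# Hypothetical target of 30 denominator events for a rare measure
# with 0.6 eligible per 1,000 member years.
size_for_rare_measure = members_needed(0.6, target_denominator=30)  # about 50,000
```

For rare measures the required enrollment can exceed the 5,000-member minimum by an order of magnitude, which is why measure-specific denominator estimates are worth computing before an ACO size is fixed.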
The issues identified here are only a few of the many issues to be considered in defining what set of providers could hold broad accountabilities for care delivery. How an ACO is defined is still an open question and an array of issues associated with implementing this type of accountability model need to be evaluated. To this end, CMS has begun work to consider alternative models. Additional work in this arena could explore differences between patient-driven (analysis of actual care-seeking patterns) versus provider-driven (how providers see themselves as related within a community) patterns of care to define ACOs.
A key focus of future investigation should be identifying appropriate attribution methods for retrospective or prospective assignment of patients to ACOs, individual providers, or other accountable entities. There are a variety of claims-based attribution algorithms that have been used in retrospective assignment. Previous work by CMS and others, as well as our results presented above, has demonstrated that these algorithms vary in what fraction of episodes can be assigned to a provider and which provider is assigned a given episode.8 Similar results have been observed in studies that compare algorithms to assign patients or quality measures (vs. episodes) to providers.5,22 Unfortunately, none of these efforts tells policymakers which of the many attribution algorithms to use in particular applications.
The next step is to engage physicians and other stakeholders to test the face validity of different attribution approaches and to reach consensus on the choice of algorithm. A single attribution rule is unlikely to suit all policy applications (e.g., quality reporting vs. episode-based payment), so this research should be application specific. This testing should consider a mix of different types of clinical conditions that involve more/fewer providers and settings to illustrate the complexities and challenges that may arise. Ideally it would use real claims for a provider’s own patients so that providers can compare whether assignment of a patient via an algorithm is consistent with their clinical judgment.
Prospective attribution is an alternative method that could be tested in several ways. One approach, taken in the Medicare Medical Home demonstration, is for a patient and physician to jointly sign an agreement.23 Another approach, taken by the Medicare Physician Group Practice demonstration, is to prospectively assign a patient to a group based on the beneficiary’s previous care pattern.24 Whether either approach is practical with an episode-based system needs to be addressed and, if so, the optimal approach identified.
Current policies emphasize measurement and payment for individual services delivered by individual providers in separate settings of care. Patient needs might be better served by a more coordinated and integrated approach to care delivery. What remains untested is whether episode-based reforms will foster system changes that will lead to more coordinated, integrated care delivery.
Evolving towards episode-based approaches for payment and performance measurement is complex, and important work is required to identify, define, and test various design elements to move the concept into operation. The recommendations here serve as a starting point for a more robust agenda for testing applications of episodes of care.
The work presented here is based on research conducted under a U.S. Department of Health and Human Services Assistant Secretary for Planning and Evaluation funded project (#HHS-100-03-001). The views expressed in this paper reflect those of the authors, and do not necessarily reflect the views of DHHS.