With the introduction of Part D drug benefits, Medicare collects information on diagnoses, treatments, and clinical events for millions of beneficiaries. These data are a promising resource for comparative effectiveness research (CER) on treatments, benefit designs, and delivery systems.
We explore the data available for researchers and approaches that could be used to enhance the value of Medicare data for CER.
Using currently available Medicare data for CER is challenging; as with all administrative data, it is not possible to capture every factor that contributes to prescribing decisions and patients are not randomly assigned to treatments. In addition, Part D plan selection and switching may influence treatment decisions and contribute to selection bias. Exploiting certain program aspects can help address these limitations. For example, ongoing changes in Medicare or plan policies, and the random assignment of beneficiaries who receive Part D low income subsidies into plans with different formularies could yield natural experiments.
Refining policies around time to data release, provision of additional data elements, and linkage with greater beneficiary-level information would improve the value and usability of these data. Improving the transparency and reproducibility of findings and potentially opening access to qualified stakeholders are also important policy considerations. Work is needed to reconcile data needs with current policies and goals.
Medicare data provide a rich resource for CER. Leveraging existing program elements, combined with some administrative changes in data availability, could create large datasets for evaluating treatment patterns, spending, and coverage decisions.
Information on the safety, effectiveness, and value of medical care requires detailed clinical data from large numbers of patients receiving care in real-world settings. With the introduction of Medicare Part D prescription drug benefits in 2006, Medicare now collects information on the use of prescription drugs for over 27 million beneficiaries;1 previously this information was not widely available for Medicare beneficiaries, although drug use data were available for dual-eligible Medicare beneficiaries via Medicaid claims. The more comprehensive collection of drug use data allows for linkage with previously available information from Parts A and B including inpatient and outpatient diagnoses, and major clinical events for millions of beneficiaries, including many persons over the age of 65. These data provide a promising resource for assessing the comparative effectiveness of many types of care across a range of settings and geographic areas in the United States.
Medicare collects these data for payment and administrative purposes, not for research; thus, there are several limitations to using this type of observational data. The care beneficiaries receive will vary depending on where they live, the types of Medicare plans they choose, and their physicians and hospitals. In other words, there may be factors that are associated with both the care beneficiaries receive and the outcomes of care; these factors confound assessments of care effectiveness and limit the validity of simple comparisons. However, exploiting certain program aspects and ongoing natural experiments within the Medicare program can mitigate some biases associated with purely observational data.
We discuss these strengths and limitations of using Medicare data for comparative effectiveness research, and propose policy recommendations for improving the usefulness of these data for patients, providers, and policy-makers.
There is a profound need for more evidence to guide clinical and policy decisions on drug treatments, devices, interventions, care delivery, payment models, and delivery systems. For example, while there is arguably substantial trial evidence supporting the use of many prescription drugs, this evidence often provides limited guidance for actual clinical decisions.
There are two main approaches for developing comparative effectiveness evidence: 1) clinical trials, including randomized clinical trials (RCTs) and pragmatic trials; and 2) studies using observational data from actual practice. While double-blinded RCTs represent the gold standard for generating clinical evidence, they have a number of practical limitations. Specifically, trials have historically compared single drugs to placebo rather than to existing alternative drugs, or in combination with commonly used drug regimens. Trials test treatments under rigorous experimental conditions (efficacy) rather than real-world situations (effectiveness), and are not designed to evaluate costs or rare adverse events. RCTs also tend to be expensive and examine relatively short-term effects. Moreover, trials may have limited generalizability to specific subgroups of patients, such as the elderly, racial/ethnic minorities, or those with severe diseases, because they tend to target relatively homogeneous patient groups rather than the broader mix seen in actual practice.2-4 Pragmatic trials attempt to overcome some of these limitations by focusing on more heterogeneous groups of patients and evaluating effectiveness under routine care; however, existing evidence from pragmatic trials is in short supply and funding for these types of studies is limited.5, 6
Studies using observational, longitudinal datasets could provide complementary information that addresses many of the limitations associated with RCTs. For example, Medicare collects information on millions of beneficiaries and allows for linkages across a range of claims data, including inpatient, outpatient, and prescription drugs. Having a large sample of individuals is critical for ensuring adequate statistical power for studying rare conditions or specific patient subgroups. Use of observational data including Medicare data, however, requires consideration of numerous factors, such as the types of Medicare plans for inpatient, outpatient, and drug services; coverage/cost-sharing for treatments; availability of physicians and hospitals; and clustering of patients by physician. In addition, there can also be variations in practice patterns across geographic areas. Examinations of drug use within Part D should consider these various levels of analysis and account for factors that could affect drug use, adherence, and ultimately outcomes, including a range of patient, provider, and plan-level characteristics. In short, the strengths and limitations of both clinical trials and observational data analyses should be considered carefully when evaluating the value of these approaches for addressing specific questions.
The following sections describe relevant structural aspects of the Medicare Part D program, the Medicare data available for researchers and potential approaches that could be used to create quasi-experiments and enhance the value of historical Medicare data for CER.
Medicare currently collects diagnostic and treatment information through four programs: Part A (inpatient), Part B (outpatient), Part C (Medicare Advantage), and Part D (prescription drugs). Part C includes medical information for beneficiaries enrolled in managed care organizations. Part D is administered by private plans either as stand-alone Prescription Drug Plans (PDPs) that supplement Traditional Medicare or as Medicare Advantage Prescription Drug (MAPD) plans that bundle Part A, B, and D benefits. Part D is a voluntary benefit; in 2009 about 27 million of 45 million Medicare beneficiaries were enrolled in a Part D plan, including 9.6 million low income beneficiaries who received additional premium and cost-sharing subsidies from Medicare. The Centers for Medicare and Medicaid Services (CMS) randomly assigns low income subsidy beneficiaries who have not chosen a Part D plan to qualified stand-alone drug plans.7
Beneficiaries choose their Part D plans; these plans have some autonomy in determining their benefit structures and formulary drug lists provided they meet basic Medicare requirements. For example, all plans must offer benefits at least as generous as the defined standard. Medicare also requires that plans cover at least two drugs within a therapeutic class. However, plans can determine coverage for specific drugs within a class, as well as tier placement and utilization management requirements. The use of utilization management tools, such as prior authorization, has grown since Part D’s introduction; these tools are most often applied to drugs that are newer, more expensive, or riskier, with greater potential for adverse effects or less available evidence on their possible benefits or harms.8
Table 1 outlines the available Part D research data files.10 The primary data source for Part D drug utilization is the Part D Event (PDE) files, which are currently available for 2006-2008. This file contains detailed information on each drug event for PDP and MAPD plan beneficiaries, and encrypted beneficiary, pharmacy, prescriber, and plan identifiers that allow linkage with other files, such as inpatient and outpatient claims data, and Part D plan characteristics files. The PDE contains information on each drug dispensed including the National Drug Code (NDC), the quantity dispensed, and days supply, allowing for the examination of therapy adherence and persistence based on dispensing data, which have been previously validated.11, 12
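To illustrate how fill dates and days supply can support an adherence measure, the following is a minimal sketch that computes the proportion of days covered (PDC) over an observation window. PDC is one commonly used dispensing-based measure and is chosen here as an assumption; it is not necessarily the measure used in the validation studies cited above. The record layout, function name, and example dates are hypothetical and do not reflect the actual PDE file layout.

```python
from datetime import date, timedelta


def proportion_of_days_covered(fills, start, end):
    """Share of days in [start, end] (inclusive) with drug on hand.

    fills: iterable of (fill_date, days_supply) tuples (hypothetical layout).
    Overlapping supply is counted once; early refills are not carried forward,
    which is a simplification.
    """
    covered = set()
    for fill_date, days_supply in fills:
        for offset in range(days_supply):
            day = fill_date + timedelta(days=offset)
            if start <= day <= end:
                covered.add(day)
    window_days = (end - start).days + 1
    return len(covered) / window_days


# Illustrative fills for one beneficiary and one drug (made-up values).
fills = [
    (date(2007, 1, 1), 30),
    (date(2007, 2, 5), 30),
    (date(2007, 3, 20), 30),
]
pdc = proportion_of_days_covered(fills, date(2007, 1, 1), date(2007, 3, 31))
print(f"PDC over the window: {pdc:.2f}")
```

In this made-up example, 72 of the 90 days in the window are covered. A fuller analysis would also need to handle drug switching within a class, inpatient stays, and gaps in Part D enrollment.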
The PDE data also capture cost data, such as total drug costs and patient payments. These data allow for examination of variation in spending patterns and cost of care analyses. The PDE data also specify the benefit phase during which each drug was filled (e.g., deductible or initial coverage phase) based on the benefit structure implemented by each beneficiary’s plan, which affects patients’ and payers’ costs. This file also includes plan-specific information on the formulary coverage for each drug dispensed, including the tier and utilization management requirements. Plan-level information on formulary and benefit structures can be valuable for identifying quasi-experiments or instrumental variables for statistical analyses.
PDE data can be linked with information on individual beneficiary characteristics. The Beneficiary Summary File contains beneficiary-level information on basic demographics, including age, gender, race, and geographic location. This file also describes beneficiaries’ months of enrollment in Parts A, B, C, and D, including the type of coverage (e.g., retiree, Part D stand-alone plan, Part D Medicare Advantage Plan), dual eligibility status, and whether they are receiving the Part D low income subsidy. The Beneficiary Annual Summary File contains additional information on patients’ inpatient diagnosis related groups, as well as two sets of chronic condition flags.
CMS has rolled out an increasing number of data elements since the Part D data rule was originally issued that supplement the PDE and beneficiary data (Table 1). The Drug Characteristics File includes drug names (generic and brand) and strength and dosage form information by National Drug Code. Plan characteristics files can be linked to the PDE data using the encrypted plan identifier to examine detailed information on plan type, cost-sharing levels, premium information, and service area. The Pharmacy Characteristics File includes information on the type of pharmacy where beneficiaries filled their prescriptions. Lastly, the Prescriber Characteristics File contains information on the prescriber’s specialty, credentials and geographic location.
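The linkage described above can be sketched as a simple keyed join. This is a minimal illustration only: the field names (`bene_id`, `plan_id`, `ndc`, `plan_type`, `monthly_premium`), the identifier values, and the placeholder drug codes are all hypothetical and are not taken from the CMS file layouts.

```python
# Hypothetical drug event records keyed by encrypted identifiers.
pde_events = [
    {"bene_id": "B001", "plan_id": "P10", "ndc": "00000000001", "days_supply": 30},
    {"bene_id": "B002", "plan_id": "P20", "ndc": "00000000002", "days_supply": 90},
    {"bene_id": "B003", "plan_id": "P99", "ndc": "00000000001", "days_supply": 30},
]

# Hypothetical plan characteristics, indexed by the same encrypted plan identifier.
plan_characteristics = {
    "P10": {"plan_type": "PDP", "monthly_premium": 25.0},
    "P20": {"plan_type": "MAPD", "monthly_premium": 0.0},
}


def link_events_to_plans(events, plans):
    """Attach plan attributes to each event; keep unmatched events for review."""
    linked, unmatched = [], []
    for event in events:
        plan = plans.get(event["plan_id"])
        if plan is None:
            unmatched.append(event)
        else:
            linked.append({**event, **plan})
    return linked, unmatched


linked, unmatched = link_events_to_plans(pde_events, plan_characteristics)
print(len(linked), "events linked;", len(unmatched), "without a plan record")
```

Keeping the unmatched events, rather than silently dropping them, makes it easier to check whether linkage failures are concentrated in particular plans or years.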
Part D data can also be linked to Traditional Medicare data on beneficiaries’ other medical claims (e.g., hospital, skilled nursing facility, hospice, physician). For example, the Inpatient Standard Analytic File includes claims for inpatient stays, including diagnosis and procedure codes (ICD-9), Diagnosis Related Groups (a classification system used for prospective Medicare payments), date, facility, and cost information. The Carrier file contains claims for non-institutional providers, largely physicians. It also contains diagnosis and procedure codes (ICD-9 and Healthcare Common Procedure Coding System (HCPCS) codes), reimbursement amounts, and provider identifiers.
Information at the beneficiary level on inpatient and outpatient services use and diagnoses within the Medicare Advantage program is not currently available to researchers. Monthly enrollment information is available in the Beneficiary Summary file so beneficiaries’ transitions between Traditional fee-for-service Medicare and Medicare Advantage can be tracked. Prior to the introduction of Part D, beneficiaries could switch between Traditional Medicare and Medicare Advantage plans monthly. In 2006, beneficiaries were restricted to switching only during the first six months of the year, and from 2007 onward, the first three months of the year.
Medicare claims data can also be linked with Medicaid data for dual-eligible beneficiaries, as well as datasets focused on specific subpopulations or collected via surveys, such as the SEER (Surveillance, Epidemiology, and End Results) Cancer Registry, the Long Term Care Minimum Data Set (MDS), the Home Health Outcome and Assessment Information Set (OASIS), the Medicare Current Beneficiary Survey, and the Health and Retirement Study. While the available sample size is more limited when linking to these data sources, they can provide a richer set of socio-demographic, health, and clinical characteristics than is available in the claims data alone.
A substantial literature describes the limitations of using observational data for CER, as well as potential strategies for addressing these limitations.13-17 Data and measurement quality, the formation and stability of comparison groups, and methods to deal with cross-over effects or switching between therapies, are particularly relevant for observational data studies and have been described elsewhere.18-20
Despite the range of information available on Medicare beneficiaries’ drug treatments and diagnoses, simple analyses using currently available Medicare data are likely to yield biased results (Table 2). Because patients are not randomly assigned to treatments there may be important differences between the comparison groups in the factors that contributed to patients receiving different treatments or therapies, such as disease severity, risk factors, and comorbidities (e.g., confounding by indication).21, 22 The lack of randomization creates analytic challenges because it is not possible to capture all factors that contribute to physicians’ prescribing decisions using administrative data.
Plan selection and switching by Part D beneficiaries also create challenges. Beneficiaries choose to enroll in plans with varying characteristics that can influence treatment decisions, such as different delivery systems (e.g., MA vs. PDP), benefit designs, and formularies. Beneficiaries are encouraged to choose plans based on these characteristics to minimize treatment disruptions and lower out-of-pocket drug costs, so there is likely to be substantial self-selection into Part D plans.23, 24 For example, there was evidence of substantial adverse selection into plans that offered full coverage during the standard coverage gap in 2006, prompting these plans to leave the market the following year.25 In addition, the availability of employer-sponsored drug coverage for current or retired employees influences the characteristics of beneficiaries choosing to enroll in Part D plans; an estimated 8.3 million beneficiaries received retiree drug coverage in 2010. Beneficiaries can also switch plans during the open enrollment period each year, for example, based on changes in clinical need. Beneficiaries’ plan choice and geographic location may have important implications for their access to different treatment alternatives.26, 27 While a number of patient, provider, and plan characteristics are available in Medicare datasets, simply adjusting for these factors in a multivariate model is unlikely to sufficiently address selection differences between patient groups that contribute to treatment decisions, plan choice, or health outcomes.
Certain features of the Medicare program present opportunities to design studies that mitigate the traditional limitations of observational data analyses and improve the relevancy of study findings. For example, ongoing changes in Medicare or Part D plan policies may provide opportunities to examine natural experiments. Policy changes that are determined at higher levels (e.g., program, plan-level) are likely to result in changes in treatment patterns that are unrelated to underlying beneficiary characteristics. For example, within the Part D program, some plans may change their formulary requirements from year-to-year, while other plans’ formularies remain stable. These variations could be used to create quasi-experimental treatment and control groups if researchers can identify comparable plans with stable enrollment.28
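One way such a comparison could be set up is a difference-in-differences calculation, sketched below. This estimator is our illustrative choice and is not named in the cited work; it assumes that, absent the formulary change, use in the two groups of plans would have followed parallel trends. The plan-level values are made up (for example, the share of enrollees filling a given drug), and the function name is hypothetical.

```python
from statistics import mean


def difference_in_differences(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Made-up plan-level outcomes before and after a hypothetical formulary change.
changed_plans_pre = [0.60, 0.62, 0.58]   # plans that later tightened coverage
changed_plans_post = [0.50, 0.52, 0.48]
stable_plans_pre = [0.61, 0.59, 0.60]    # comparison plans with stable formularies
stable_plans_post = [0.58, 0.56, 0.57]

did = difference_in_differences(
    changed_plans_pre, changed_plans_post, stable_plans_pre, stable_plans_post
)
print(f"Difference-in-differences estimate: {did:+.2f}")
```

With these illustrative numbers, use falls by 0.10 in plans that changed their formularies and by 0.03 in stable plans, so the estimate is about -0.07. In practice the comparison would be made in a regression framework with adjustment for plan and beneficiary characteristics and attention to enrollment changes between years.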
Instrumental variables are another approach for accounting for unobserved differences between groups. Instrumental variables are related to treatment assignment, but not to other patient risk factors that are associated with the treatment choice or health outcome of interest. Numerous studies have demonstrated that the use of instrumental variables can decrease the bias associated with unmeasured confounders;15-17, 29 however, this approach is dependent on finding a high-quality instrument, which can be difficult. Millions of beneficiaries who receive Part D low income subsidies are randomly assigned into qualified stand-alone drug plans. Differences in plan formularies for these beneficiaries could be associated with variations in drug use that are unrelated to underlying risk factors, given that patients were randomly assigned to plans. Thus, plan formulary coverage for these beneficiaries represents a potential high-quality instrument because it removes the association between plan choice and patient risk factors.
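As a minimal sketch of this idea, the following computes a simple Wald-type instrumental variable estimate with a binary instrument: whether the randomly assigned plan's formulary covers the drug of interest. The data are made up, the variable and function names are hypothetical, and the sketch assumes that formulary coverage affects outcomes only through drug use. A real analysis would typically use two-stage regression with covariates and appropriate standard errors.

```python
from statistics import mean


def wald_iv_estimate(records):
    """Wald estimator for a binary instrument.

    records: iterable of (covered, used_drug, outcome) tuples, where
    covered is the instrument (1 if the assigned plan covers the drug),
    used_drug is the treatment indicator, and outcome is the result of interest.
    """
    z1 = [(d, y) for z, d, y in records if z == 1]
    z0 = [(d, y) for z, d, y in records if z == 0]
    # First stage: how much the instrument shifts treatment use.
    first_stage = mean(d for d, _ in z1) - mean(d for d, _ in z0)
    if first_stage == 0:
        raise ValueError("Instrument does not shift treatment use.")
    # Reduced form: how much the instrument shifts the outcome.
    reduced_form = mean(y for _, y in z1) - mean(y for _, y in z0)
    return reduced_form / first_stage


# Made-up data: (assigned plan covers drug, beneficiary used drug, good outcome).
records = [
    (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 0), (1, 0, 0),
    (0, 1, 1), (0, 1, 0), (0, 0, 0), (0, 0, 0), (0, 0, 1),
]
estimate = wald_iv_estimate(records)
print(f"IV estimate of the effect of drug use on the outcome: {estimate:.2f}")
```

Here coverage raises drug use from 40 to 80 percent and the outcome rate from 40 to 60 percent, giving an estimate of 0.5. The estimate applies to beneficiaries whose drug use is actually changed by formulary coverage, not to every patient.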
Importantly, these approaches may not provide the information that clinicians want when treating patients, i.e., the best medical strategy for an individual patient rather than the average or marginal effect for a group of patients. For example, the instrumental variable approach generally provides estimates of a marginal effect, e.g., the effect on outcomes if the probability of receiving a treatment went from X percent to X+Y percent. However, if properly interpreted, such information could still be valuable for informing policy or coverage decisions.
The release of Part D Event data and related files was an important step; however, additional changes in the data policy could make Medicare data more useful for CER:
The Medicare program provides a rich data source for evaluating the comparative effectiveness of drug treatments. Simple analyses using currently available information, however, are unlikely to provide useful information. Leveraging program elements combined with some changes in data availability could improve the value of these datasets and the transparency of comparative effectiveness work evaluating treatment patterns, spending, and coverage decisions.
Medicare now collects information on diagnoses, treatments, prescription drug use, and clinical events for millions of beneficiaries. These data are a promising resource for comparative effectiveness research (CER) on treatments, benefit designs, and delivery systems; however, there are a number of challenges to using these data for CER:
Funding source: The National Institute on Aging (R01 AG029316), the Commonwealth Fund, and the Alfred P. Sloan Foundation provided funding for the study.
Publisher's Disclaimer: “This is the pre-publication version of a manuscript that has been accepted for publication in The American Journal of Managed Care (AJMC). This version does not include post-acceptance editing and formatting. The editors and publisher of AJMC are not responsible for the content or presentation of the prepublication version of the manuscript or any version that a third party derives from it. Readers who wish to access the definitive published version of this manuscript and any ancillary material related to it (eg, correspondence, corrections, editorials, etc) should go to www.ajmc.com or to the print issue in which the article appears. Those who cite this manuscript should cite the published version, as it is the official version of record.”