To assess the adequacy of the state buy-in variable (SBI) in the Medicare denominator file to identify dually eligible patients.
We used linked Medicare and Medicaid data from Michigan and Ohio for elders diagnosed with incident breast, prostate, or colorectal cancer between 1996 and 2001.
Using the Medicaid enrollment file as the “gold standard,” we assessed the number of duals from Medicare files in cross-sectional and longitudinal analyses.
Data for the study population were linked with Medicare and Medicaid files using patient identifiers.
Sensitivity was low (74.2 percent, 95 percent confidence interval [CI]: 72.7, 75.6 and 80.8 percent, 79.7, 81.9, in Michigan and Ohio, respectively). PPV was above 95 percent in Michigan and 88.8 percent in Ohio. Both sensitivity and PPV varied between and within the states. Both in Michigan and in Ohio, we observed limited agreement on the length of enrollment in Medicaid between the two data sources.
Except to examine disparities by dual status at a very broad level, the SBI variable alone may be inadequate to identify duals. The findings call for improvements in Medicare and Medicaid information management systems and for uniformity in database linking strategies.
Dually eligible Medicare and Medicaid beneficiaries (referred to as “duals”) comprise the most vulnerable subgroup in the elderly population, accounting for a disproportionate share of Medicaid and Medicare utilization and payments (Lied 2006). In addition to their socioeconomic disadvantage, duals are disproportionately represented among those with chronic illnesses and functional limitations, as well as among nursing home residents (Kaiser Family Foundation 2003).
In cancer-related health services research, studies have documented disparities in the receipt of colorectal cancer screening services by dual status (Koroukian et al. 2006), as well as disparities in cancer stage, treatment, and outcomes (Bradley et al. 2007a; Bradley, Clement, and Lin 2008a; Bradley, Luo, and Given 2008b; Bradley et al. 2008c). Despite these important differences, examining disparities by dual status is challenging because of uncertainties surrounding the best method to identify duals in population-based databases.
As defined in a report by the Research Data Assistance Center (ResDAC), a contractor with the Centers for Medicare & Medicaid Services (CMS) to provide assistance to researchers using Medicare and Medicaid data, dual enrollees “(1) are Medicare enrollees; (2) might have their Medicare Part B premium paid by their states' Medicaid program (Specified Low-Income Beneficiaries (SLMBs); (3) might have both their Medicare Part B premium and their Medicare cost-sharing amounts paid by their states' Medicaid program (Qualified Medicare Beneficiaries (QMBs); (4) might receive full Medicaid benefits in addition to (2) or (3)” (Barosso 2006). One potential source for identifying duals is the state buy-in (SBI) variable in the Medicare denominator file, used previously in population-based studies (Koroukian et al. 2006; O'Leary, Sloss, and Melnick 2007). The alternative approach has been to link Medicare and Medicaid files, a laborious process employed in other studies (Bradley et al. 2007a; Bradley, Clement, and Lin 2008a; Bradley, Luo, and Given 2008b; Bradley et al. 2008c).
Sparse information is available on the data limitations associated with using Medicare data alone to identify or count duals (Rosenbach and Lamphere 1999; Baugh 2004). Baugh (2004) emphasized the need to link Medicare and Medicaid data to estimate the number of full Medicaid benefit dual enrollees because neither Medicare nor Medicaid data alone provide an accurate count of duals. Medicare data undercount duals because Medicaid does not pay the Medicare premium for all duals (Baugh 2004). However, the extent of this undercount has not been estimated previously. Rosenbach and Lamphere (1999) found severe underreporting in the SBI variable in 10 states relative to the number of duals reported by these states' Medicaid officials during a telephone interview. This finding prompted ResDAC to alert the research community about the inadequacy of the SBI variable to identify duals (Barosso 2006). The lack of empirical data on how to identify duals and whether to derive claims-based measures for duals from Medicare data, Medicaid data, or both sources combined has contributed to a significant hiatus in research relevant to this vulnerable population.
In addition to identifying a patient as a dual, it is important also to determine when, in relation to an index health event (e.g., cancer, stroke), and for how long a dual has been enrolled in the Medicaid program. For example, the timing of enrollment in Medicaid in relation to a cancer diagnosis is significantly associated with cancer stage at diagnosis, with patients enrolled in Medicaid several months before diagnosis being more likely to be diagnosed at earlier stages than their counterparts enrolled in Medicaid immediately preceding or following cancer diagnosis (Perkins et al. 2001; Bradley, Given, and Roberts 2003; Koroukian 2003; O'Malley et al. 2006).
In this study, we examine (1) in a cross-sectional analysis, the sensitivity and positive predictive value (PPV) of the SBI variable to identify duals in a subgroup of older cancer patients from Michigan and Ohio; and (2) in a longitudinal analysis, the extent to which Medicare and Medicaid sources agree on the length of enrollment in Medicaid relative to their month of cancer diagnosis. We conducted these studies using data from cancer surveillance systems from Michigan and Ohio that were linked with Medicare and Medicaid data.
To address our research question, we identified duals separately through the Medicare denominator file, using the SBI monthly variables, and through linked statewide cancer surveillance and Medicaid enrollment files. For the cross-sectional analysis, we examined a cross-tabulation of the data by the two sources and calculated the sensitivity and PPV of SBI-derived dual status. For the longitudinal analysis, we constructed the history of patients' dual status relative to the date of cancer diagnosis, separately from each of the Medicare and Medicaid files. We then examined agreement between the two sources on the length of time for which beneficiaries were enrolled in the dual program in the year before and the year after cancer diagnosis. Medicaid enrollment files were considered the gold standard, albeit an imperfect one, as Medicaid data may slightly over- or undercount duals (Baugh 2004).
Data were obtained from linked databases consisting of cancer surveillance data, Medicare enrollment files, and Medicaid enrollment files. These databases, aimed at studying cancer-related disparities in elders, were developed independently by investigators in Michigan and Ohio. The studies, with their respective data users' agreement from the CMS, were approved by the Institutional Review Board at each of Michigan State University, Virginia Commonwealth University, and University Hospitals of Cleveland. The present study was carried out independently after investigators agreed on the study design and analytic strategies. Data by the respective parties were shared in aggregate only.
The databases are described in greater detail elsewhere (Bradley et al. 2007b; Koroukian 2008). While the linking strategies differed in some procedures across the two states, they share two key components: (1) cancer surveillance data were linked with Medicare files by CMS, using patient identifiers, including social security number (SSN), and gender; and (2) cancer surveillance data were linked with Medicaid data by the investigators using a multistep algorithm, using patient SSN, first and last name, date of birth (month and year), and gender.
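A multistep deterministic linkage of the kind described in component (2) can be sketched as a series of exact-match passes, strictest first. This is only an illustration: the field names, the record layout, and the specific key combinations in `PASSES` are hypothetical, and the actual steps used in each state are those documented in the cited papers.

```python
def normalize(value):
    """Uppercase and trim a field so trivial formatting differences do not block a match."""
    return str(value).strip().upper()

# Hypothetical match passes, strictest first (assumption, not the published algorithm).
PASSES = [
    ("ssn", "last_name", "first_name", "birth_year", "birth_month"),
    ("ssn", "last_name", "birth_year", "birth_month"),
    ("ssn", "birth_year", "birth_month"),
    ("last_name", "first_name", "birth_year", "birth_month", "sex"),
]

def make_key(record, fields):
    """Build a match key; return None if any component is missing or blank."""
    key = tuple(normalize(record.get(f, "")) for f in fields)
    return key if all(key) else None

def link(registry, medicaid):
    """Return {registry_id: medicaid_id} from successive exact-match passes."""
    links, used = {}, set()
    for fields in PASSES:
        # Index the Medicaid records not yet linked in an earlier pass.
        index = {}
        for rec in medicaid:
            key = make_key(rec, fields)
            if rec["id"] not in used and key is not None:
                index.setdefault(key, []).append(rec["id"])
        for rec in registry:
            key = make_key(rec, fields)
            if rec["id"] in links or key is None:
                continue
            candidates = index.get(key, [])
            # Accept only unambiguous matches to a still-unlinked record.
            if len(candidates) == 1 and candidates[0] not in used:
                links[rec["id"]] = candidates[0]
                used.add(candidates[0])
    return links

# Toy records, not study data: r2 has a mistyped SSN and links only on the last pass.
registry = [
    {"id": "r1", "ssn": "111223333", "last_name": "Doe", "first_name": "Ann",
     "birth_year": 1930, "birth_month": 5, "sex": "F"},
    {"id": "r2", "ssn": "999887777", "last_name": "Roe", "first_name": "Sam",
     "birth_year": 1928, "birth_month": 11, "sex": "M"},
]
medicaid = [
    {"id": "m1", "ssn": "111223333", "last_name": "DOE", "first_name": "ANN",
     "birth_year": 1930, "birth_month": 5, "sex": "F"},
    {"id": "m2", "ssn": "999887770", "last_name": "ROE", "first_name": "SAM",
     "birth_year": 1928, "birth_month": 11, "sex": "M"},
]
links = link(registry, medicaid)
```

Removing linked records from later passes keeps a relaxed key from overriding a stricter match, which is one reason the ordering of passes can change the set of duals a linkage finds.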
Differences in the linkage strategies used by the two states are noteworthy (Figure 1a and 1b). First, in Michigan, the Medicare files consisted of Michigan residents only. Once the SSN to health insurance claim (HIC) conversion file was obtained, it was matched against Medicare enrollment and claims data for Michigan residents only. This is in contrast to the algorithm in Ohio, where the SSN to HIC conversion file was matched against the Medicare enrollment and claims files nationwide, without consideration for beneficiaries' state of residence. Second, the Michigan cancer registry was used to identify all residents diagnosed with cancer, irrespective of anatomic cancer site; whereas in Ohio, only patients diagnosed with incident breast, prostate, and colorectal cancer were considered eligible for matching. Third, Michigan Medicaid administrators provided researchers with a CMS generated file developed for the purposes of improving the accuracy of HICs recorded for Medicaid beneficiaries. The use of this administrative file improved the investigators' ability to identify a greater number of duals. Finally, Michigan linked Medicare and Medicaid files for cancer patients diagnosed between January 1, 1996, and December 31, 2000, and Ohio linked data for patients diagnosed between January 1, 1997, and December 31, 2001. To maximize comparability between the two states, we limited our study population to cases diagnosed with incident breast, prostate, and colorectal cancer. In addition, we limited our cross-sectional and longitudinal analyses to patients enrolled in Medicare in calendar year 1999 because it was the most recent year's data that were available for both states and for which we could summarize patients' enrollment in the 12 months before and after cancer diagnosis.
The data elements used from the cancer surveillance systems included patient identifiers, anatomic site, and date of cancer diagnosis. The Medicare denominator file was used to identify Medicare beneficiaries enrolled in the SBI program on a monthly basis. As noted above, this variable documents enrollment in the SLMB or the QMB program, or whether the person receives full Medicaid benefits. The Medicaid enrollment file was used to verify patients' enrollment in the program on a month by month basis. We constructed the history of state buy-in and Medicaid enrollment in each of the Medicare and Medicaid files, respectively. We grouped enrollment months in 6-month windows to look at the agreement between the Medicare and Medicaid files relative to a cancer diagnosis.
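As a rough sketch of how a monthly enrollment history might be summarized relative to diagnosis, the snippet below counts enrolled months in the 12 months before and the 12 months after the diagnosis month and places each count in a 6-month category. The function names are hypothetical, and excluding the diagnosis month from both periods is an assumption made here, not a rule stated in the study.

```python
def month_index(year, month):
    """Map a calendar month to a running integer so months can be compared arithmetically."""
    return year * 12 + (month - 1)

def enrollment_lengths(enrolled_months, dx_year, dx_month):
    """Count enrolled months in the 12 months before and the 12 months after diagnosis.

    enrolled_months: iterable of (year, month) pairs flagged as enrolled.
    Assumption: the diagnosis month itself is counted in neither period.
    """
    dx = month_index(dx_year, dx_month)
    enrolled = {month_index(y, m) for (y, m) in enrolled_months}
    pre = sum(1 for i in range(dx - 12, dx) if i in enrolled)
    post = sum(1 for i in range(dx + 1, dx + 13) if i in enrolled)
    return pre, post

def length_category(n_months):
    """Group a month count into the 6-month categories used for comparison."""
    if n_months == 0:
        return "0"
    return "1-6" if n_months <= 6 else "7-12"

# Toy example: enrolled for all of 1999, diagnosed in June 1999.
history = [(1999, m) for m in range(1, 13)]
pre, post = enrollment_lengths(history, 1999, 6)
```

The same function would be applied once to the SBI monthly indicators and once to the Medicaid enrollment months, giving two histories per patient that can then be compared.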
We used a two-step approach in our analysis. We started with a cross-sectional analysis, including patients who were dually eligible at any time during the year 1999, irrespective of when they were diagnosed with cancer. Here, we examined a cross-tabulation of enrollees identified by the two sources and calculated sensitivity and PPV. Sensitivity reflects the percentage of patients in the 1999 Medicaid enrollment files who were also identified in the 1999 Medicare denominator file. PPV is defined as the percentage of patients identified through the Medicare denominator file who were also identified in the Medicaid enrollment files. We report 95 percent confidence intervals (CIs) for the sensitivity and PPV measures. We explored possible variations in these measures between the states, as well as within the states, by patient demographics and cancer site using bivariate and multivariable models.
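The two measures defined above reduce to simple set arithmetic on patient identifiers. The following minimal sketch assumes each source is available as a collection of identifiers; the normal-approximation (Wald) interval is an assumption chosen for illustration, since the text does not state which interval method was used.

```python
import math

def proportion_with_ci(numerator, denominator, z=1.96):
    """Return (estimate, lower, upper) using a normal-approximation (Wald) interval."""
    p = numerator / denominator
    half_width = z * math.sqrt(p * (1 - p) / denominator)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)

def sbi_accuracy(medicaid_ids, sbi_ids):
    """Sensitivity and PPV of the SBI flag, treating Medicaid enrollment as the reference."""
    medicaid_ids, sbi_ids = set(medicaid_ids), set(sbi_ids)
    both = medicaid_ids & sbi_ids
    return {
        # Share of Medicaid-identified duals that the SBI variable also flags.
        "sensitivity": proportion_with_ci(len(both), len(medicaid_ids)),
        # Share of SBI-flagged patients confirmed in the Medicaid enrollment file.
        "ppv": proportion_with_ci(len(both), len(sbi_ids)),
    }

# Toy identifiers, not study data.
medicaid = {"A", "B", "C", "D", "E"}
sbi = {"C", "D", "E", "F"}
result = sbi_accuracy(medicaid, sbi)
```

In this toy example three patients appear in both sources, so sensitivity is 3/5 and PPV is 3/4. With counts this small a Wald interval is unreliable; an exact or Wilson interval would be a reasonable substitute.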
Next, we conducted a longitudinal analysis on a cohort of duals diagnosed with cancer in 1999. This analysis served as the basis to determine the agreement between the two sources on the length of enrollment in Medicaid before and after cancer diagnosis.
We were unable to derive specificity because our study population included duals only, leaving the number of “true negatives” at zero.
All analyses were performed using SAS V9.1 (SAS Institute, Inc., Cary, NC).
In our cross-sectional analysis, we identified 3,487 duals in Michigan and 5,186 duals in Ohio. These individuals were identified as duals through the SBI variable in the Medicare denominator file, through the Medicaid enrollment files, or both sources. The notable differences between the two states were the greater proportion of patients 85 years of age or older in Ohio, and the greater proportion of African Americans in Michigan. These differences were observed consistently across the three cancer sites.
Table 1 presents the proportion of duals identified through Medicaid enrollment files, the SBI variable, and both sources. Considerable variations were observed both within and between states. In Michigan, the proportion of patients identified through each of these sources varied somewhat across cancer sites. We observed a greater level of consistency in this regard across the cancer sites in Ohio.
We noted important differences across the two states in the trends of these proportions, particularly by age. The proportion of patients identified through Medicaid enrollment files increased considerably with older age in Michigan, while the reverse was observed in Ohio. In Michigan, for example, the proportion of dually eligible breast cancer patients identified from Medicaid increased from 18.8 percent in the 65–69 age group to 34.6 percent among those 85 years of age or older. These proportions for Ohio were, respectively, 20.1 and 14.5 percent. The trends in proportions by race and sex were somewhat similar across the two states and were consistent across the three cancer sites.
Table 2 presents the sensitivity and PPV of the SBI variable from Medicare compared with Medicaid data, by patient demographics and cancer site. Overall, sensitivity was low for both states, but it was lower in Michigan than in Ohio (74.2 percent, 95 percent CI: 72.7, 75.6, and 80.8 percent, 95 percent CI: 79.7, 81.9, respectively). Most notably, and consistent with age-related variations noted above, sensitivity in Michigan decreased considerably with older age (82.3–62.4 percent), but improved with older age in Ohio (75.7–83.9 percent).
Overall, the PPV in Michigan exceeded 97 percent (97.8 percent, 95 percent CI: 97.3, 98.4), but neared 90 percent in Ohio (88.8 percent, 95 percent CI: 87.9, 90.0). The PPV in Michigan was consistently high in all demographic strata and cancer sites. The most important variation in Ohio was observed across age groups, with higher PPV in older age groups (95.3 percent, 95 percent CI: 94.0, 96.6, in the oldest age group, and 85.2 percent, 95 percent CI: 82.5, 87.9, in the 65–69 age group). The results from multivariable analyses for sensitivity and PPV were consistent with those obtained through the bivariate analyses discussed above (data not shown).
The following measures pertain to patients identified as duals through both sources (n=685 and n=1,015 in Michigan and in Ohio, respectively). We assessed the agreement between the Medicare and Medicaid sources on the length of enrollment and in the month in which patients were enrolled in Medicaid (Table 3). In both states, agreement ranged between 20 and 30 percent for 1–6 months pre- and postdiagnosis and between 60 and 70 percent for 7–12 months pre- and postdiagnosis, indicating that agreement was greater for patients with longer periods of enrollment in Medicaid. In the prediagnosis period, the mean length of enrollment in the dual program that was in agreement by both sources was 8.5 months, which was consistent across the two states. In the postdiagnosis period, the mean length of enrollment was 8.5 months in Michigan, and 7.7 months in Ohio.
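One plausible way to operationalize agreement by length of enrollment is sketched below. It assumes each patient has a month count from each source for a given period, groups patients by the Medicaid-derived category, and reports the share whose SBI-derived count falls in the same category. This definition is an assumption made for illustration; the precise rule behind Table 3 is not spelled out in the text.

```python
from collections import defaultdict

def category(n_months):
    """Group a month count into 0, 1-6, or 7-12 months."""
    if n_months == 0:
        return "0"
    return "1-6" if n_months <= 6 else "7-12"

def agreement_by_category(lengths):
    """Share of patients whose SBI-derived category matches the Medicaid-derived one.

    lengths: list of (medicaid_months, sbi_months) pairs, one per patient.
    Patients are grouped by the Medicaid-derived category (the reference source).
    """
    totals, matches = defaultdict(int), defaultdict(int)
    for medicaid_n, sbi_n in lengths:
        cat = category(medicaid_n)
        totals[cat] += 1
        if category(sbi_n) == cat:
            matches[cat] += 1
    return {cat: matches[cat] / totals[cat] for cat in totals}

# Toy month counts, not study data.
pairs = [(3, 3), (5, 0), (4, 9), (12, 12), (10, 11), (8, 2)]
agreement = agreement_by_category(pairs)
```

Here one of three short-enrollment patients and two of three long-enrollment patients agree across sources, mirroring in miniature the pattern of lower agreement for shorter enrollment.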
Using a unique database constructed independently in Michigan and Ohio to study cancer-related disparities, we evaluated the ability of the SBI variable in the Medicare denominator file to identify beneficiaries dually enrolled in the Medicare and Medicaid programs. The study considered the Medicaid enrollment file as the gold standard because Medicaid pays medical expenses for those enrolled and Medicare relies on Medicaid to report enrollment.
Overall, sensitivity was low for both states, but lower in Michigan than in Ohio. Low sensitivity implies that the SBI variable in the denominator file fails to identify a beneficiary as a dual, when in fact s/he is a dual according to the Medicaid source. Additionally, we observed substantial variation in sensitivity by patient demographics, as well as by cancer site, both within and between the states.
We can speculate on several reasons for the discrepancies between the states. First, Michigan was unable to locate a number of Medicare beneficiaries because its CMS denominator file was restricted to Michigan residents. Roughly half of the cases not found in the Michigan denominator file were confirmed as CMS recipients but with a different recorded state of residence. Ohio, on the other hand, had Medicare data on Ohio cancer patients regardless of residence.
Second, we note that the records from the cancer registry, used to link with the other files, encompassed all anatomic cancer sites in Michigan, but were limited to breast, prostate, and colorectal cancer cases in Ohio. It is possible that this difference may have yielded a greater match rate in Michigan than in Ohio. Finally, disagreement as to Medicaid eligibility can arise from poor communication and reporting errors between the Medicare and Medicaid systems.
We obtained higher PPV in Michigan than in Ohio, meaning that nearly all patients identified as duals from the Michigan Medicare denominator file were in fact duals, according to the Medicaid enrollment files. On the other hand, data from Ohio indicated that 10 percent or more of those identified as duals through the SBI indicator were false positives, that is, not identified through the Medicaid files. The higher PPV in Michigan is likely due to using a different linkage algorithm, as well as to augmenting the standard CMS social security match with an administrative file of linked Medicaid to Medicare information developed by CMS. This file was developed by CMS using date of birth, sex, beneficiary identification code, and claim account number along with SSN, resulting in a much more comprehensive file linkage than that achieved when using SSN alone. This step contributed 4 percent of all Medicaid and Medicare linked recipients in the Michigan study file. Had Ohio used similar strategies, perhaps the PPV would have been similar to that obtained in Michigan.
The linking strategy employed in Michigan differs from that of Ohio in some aspects. The linking algorithm in Ohio was deterministic in nature and employed four steps accounting for SSN, first and last name, and date of birth (month and year), as detailed elsewhere (Koroukian 2008). Sex was added to the matching criteria only in the case of colorectal cancer. For the Michigan database, both probabilistic and deterministic approaches were used (Bradley et al. 2007b). Cases identified through probabilistic matching but not through deterministic matching were resolved through manual review. This difference in matching criteria, together with the manual review of those cases, may have been responsible for the superior performance of the linking strategy in Michigan as compared with that of Ohio, although this remains to be confirmed in future studies testing various algorithms.
With regard to the length of enrollment in Medicaid, and for patients identified through both sources, there was agreement across the two sources that, on average, patients enroll in Medicaid for a length of approximately 8 months in the year before or after cancer diagnosis. However, the two sources agreed less frequently when accounting for patients with shorter lengths of enrollment in Medicaid. The rate of agreement was limited, barely reaching 30 percent among beneficiaries enrolled in Medicaid for less than 6 months, and ranging 60–70 percent among those enrolled for a period of 7–12 months.
To our knowledge, this is the first study to assess the ability of the SBI variable in the Medicare denominator file to identify dual beneficiaries by examining both Medicare and Medicaid sources of data. It is also the first to compare measures across similar databases constructed independently in two states. The above discussion of idiosyncrasies across the linking strategies is informative, although it does not definitively explain how the divergence in linking strategy between Ohio and Michigan contributed to the observed differences across the states. This study would no doubt have been more informative had the databases in the two states been developed following the same linking strategy. However, the national trend has been for individual investigators to link databases within their home states, and, absent a consensus and a uniform methodology to link databases, idiosyncrasies are bound to occur.
Considerable variations in sensitivity and PPV between the two states by patient demographics and often in reverse trends were apparent. For example, while sensitivity in Ohio increased in older patients, it decreased with older age in Michigan. These unexplained trends further add to the level of uncertainty as to the biases to be expected when identifying duals through the SBI variable in the denominator file versus a given state's Medicaid enrollment files. A researcher planning a similar study in another state may see altogether different trends.
The findings from this study highlight the need for improvements in information management across Medicare and Medicaid to obtain a greater level of agreement between the two programs. It would also be extremely helpful if CMS included other variables in addition to the beneficiary SSN in their matching algorithm. CMS uses a more extensive list of variables (e.g., sex, date of birth) when they match their data to sources from other government agencies (e.g., the National Cancer Institute sponsored SEER-Medicare match), but when performing linkages for independent researchers, only SSN and gender are used.
The addition of the State-Reported Dual Eligibility Status code, a variable in the Part D enrollment file, may improve the identification of duals using the denominator files alone (Research Data Assistance Center (ResDAC) website, 2009). The Part D enrollment file will also include the various dual eligibility categories (e.g., SLMB, QMB), which can be very useful when investigating questions related to access to care. This constitutes a substantial improvement over the SBI variable, which fails to distinguish between these varying levels of coverage (Barosso 2006). Depending on states and study periods, Medicaid enrollment files may include these eligibility categories. In this study, for example, these categories could be identified in the Michigan Medicaid files, but not in the Ohio Medicaid files.
Accurate identification of duals through the Medicare denominator file is crucial to the valid comparison of health services use and outcomes between duals and nonduals. A high proportion of false negatives would bias results toward the null because outcomes in nonduals would be unfavorably influenced by the preponderance of unidentified duals within the sample. In turn, falsely underestimating the extent to which disparities between duals and nonduals exist would undermine the urgency with which the special needs of this vulnerable group of elders should be identified and addressed. It is left to future research to assess the adequacy of Part D data to identify duals.
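The direction of this bias can be illustrated with a small calculation. The sketch below assumes that misclassification is unrelated to the outcome, that there are no false positives, and that unidentified duals are simply counted among nonduals. The cohort sizes and outcome rates are invented for illustration; only the sensitivity of 0.742 echoes the Michigan estimate reported above.

```python
def observed_nondual_rate(n_duals, n_nonduals, rate_duals, rate_nonduals, sensitivity):
    """Outcome rate seen in the 'nondual' group when some duals are not flagged.

    Assumptions: no false positives, and missed duals have the same outcome
    rate as identified duals (nondifferential misclassification).
    """
    missed = n_duals * (1 - sensitivity)  # duals wrongly counted as nonduals
    events = n_nonduals * rate_nonduals + missed * rate_duals
    return events / (n_nonduals + missed)

# Hypothetical cohort: 1,000 duals, 9,000 nonduals, adverse outcome in 30% vs 10%.
true_gap = 0.30 - 0.10
observed_gap = 0.30 - observed_nondual_rate(1000, 9000, 0.30, 0.10, sensitivity=0.742)
```

Because the missed duals raise the apparent outcome rate among nonduals, `observed_gap` is smaller than `true_gap`, which is the attenuation toward the null described above.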
A uniform approach to linking strategies is crucial, given the utility of enhanced databases combining multiple sources of data to conduct in-depth studies on disparities. While recent research with such databases focuses on cancer-related disparities, a similar approach can be used to study outcomes for other clinical conditions and for evaluating other aspects of health care delivery and financing, such as cost of care and other outcomes as well. The potential utility of such databases should provide the impetus for the research community to agree on a linking strategy that could be used in a uniform fashion.
We note that the Medicare denominator file was linked with Medicaid analytic extract (MAX) data for the first time in 1999, leading to the creation of an enhanced MAX eligibility file that incorporates Medicare enrollment data for individuals identified in both datasets (Baugh 2004). While useful, such a data source presents important limitations when data for controls (or nonduals) are not readily available. In addition, the feasibility of linking such a data source to an external one, such as a cancer registry, is unknown.
In conclusion, the use of the SBI variable for identifying duals is limited. The somewhat low sensitivity and the varying PPV between the two states call for improvements in the management of information across the Medicare and Medicaid programs and for uniformity in strategies of linking databases. At present, the identification of beneficiaries dually eligible for Medicare and Medicaid must continue to rely on state-by-state matching algorithms, which ultimately restricts research related to duals, and when such research is conducted, its generalizability is limited.
Joint Acknowledgment/Disclosure Statement: This research was supported by National Cancer Institute grants K07 CA096705 Cancer-Related Disparities in the Elderly Population, Siran M. Koroukian, principal investigator; P20 CA103736 Cancer-Aging Research Development Grant, Nathan Berger, principal investigator, Siran M. Koroukian, pilot project investigator; and R01-CA101835-01 In-Depth Examination of Disparities in Cancer Outcomes, Cathy J. Bradley, principal investigator.
The authors would like to thank Mr. James Gearheart of the Ohio Department of Job and Family Services and Ms. Georgette Haydu of the Ohio Department of Health for their review of the manuscript.
Disclaimer: Cancer incidence data were obtained from the Ohio Cancer Incidence Surveillance System (OCISS), Ohio Department of Health. Use of these data does not imply that the Ohio Department of Health either agrees or disagrees with any presentation, analyses, interpretations, or conclusions. Information about the OCISS may be obtained at http://odh.state.oh.us/ODHPrograms/CI_SURV/ci_surv1.htm
Additional supporting information may be found in the online version of this article:
Appendix SA1: Author Matrix.