The use of Medicaid data to study cancer-related outcomes would be highly desirable. However, the accuracy of Medicaid claims data in the identification of incident cases of breast cancer is unknown.
(1) To estimate the sensitivity of Medicaid claims data for case ascertainment of breast cancer; and (2) to determine the positive predictive value (PPV) of diagnostic and procedure codes retrieved from Medicaid claims, using the Ohio Cancer Incidence Surveillance System (OCISS) as the gold standard.
The study used the linked OCISS and Medicaid enrollment files, 1997–1998 (n=1,648). The claims search yielded 2,635 incident cases, of which 1,132 were also identified through the OCISS-Medicaid files. Sensitivity and PPV of Medicaid data were calculated in subgroups of the population.
The overall sensitivity was 68.7 percent, but varied greatly across the subgroups of the population. It was lower among women enrolled in Medicaid only for part of the study year than those enrolled in Medicaid for 12 months of the study year (56.7 percent and 78.0 percent respectively, p<0.0001), and lower among those who are dual Medicare-Medicaid eligible compared to those not participating in the Medicare program (63.1 percent and 78.6 percent respectively, p<0.0001). The overall PPV was 43.0 percent, increasing up to 86.6 percent in the presence of procedure codes indicating the presence of mastectomy and lumpectomy, in addition to that of breast cancer diagnosis.
The sensitivity of Medicaid claims for case ascertainment of breast cancer is somewhat low, but improves considerably when accounting for women enrolled in Medicaid for the entire duration of the study year. The higher PPV obtained in the presence of procedure codes, in addition to diagnosis codes, will help researchers to correctly identify incident cases of breast cancer using Medicaid claims data.
Claims data are increasingly used in cancer-related health services research. A number of studies have compared the completeness and accuracy of Medicare or commercial insurance claims data to that of medical records (Fisher et al. 1992), SEER (Surveillance, Epidemiology, and End Results) of the National Cancer Institute (Cooper et al. 1999; Doebbeling et al. 1999; McBean, Babish, and Warren 1993; Warren et al. 1999), or the state cancer registry (McClish et al. 1997). These studies have identified the strengths and weaknesses associated with the use of Medicare claims data to study cancer-related outcomes in the elderly population.
Disparities by socioeconomic and insurance status in cancer-related outcomes have been well documented (Ayanian et al. 1993; Roetzheim et al. 1999), and the use of Medicaid claims data to study cancer-related outcomes in vulnerable populations would be highly desirable. However, the ability of Medicaid claims data to identify incident cases of breast cancer is unknown, and studies comparable to the ones referenced above are needed to assess the quality of Medicaid claims data prior to their use in cancer-related health services research. To date, studies have examined the utility of Medicaid claims data to estimate the prevalence of certain clinical conditions, or to monitor performance indicators and outcomes (Buescher and Jones-Vessey 1999; Cotter et al. 1999; Steinwachs et al. 1998). However, given their noncancer focus, they do not provide any indication as to the quality of Medicaid claims as it relates to studying cancer-related outcomes in the Medicaid population.
The Ohio Cancer Incidence Surveillance System (OCISS) and Medicaid files were linked as part of an effort to conduct cancer-related studies pertaining to the Ohio Medicaid population. Given the representation of women in the Medicaid program, the present study examines the ability of Medicaid claims data to correctly identify incident cases of female breast cancer, using OCISS as the gold standard. The OCISS is a system designed to collect cancer incidence data among all residents of the state of Ohio. All cases of primary cancer, with the exception of basal and squamous cell carcinoma and carcinoma in situ of the cervix, diagnosed on or after January 1, 1992, are required to be reported to the OCISS. The national standard-setting organization, the North American Association of Central Cancer Registries (NAACCR), awarded Silver Certification to the OCISS for the completeness, timeliness, and quality of its database for all cancers for the 1997 and 1998 diagnosis years, which led us to consider OCISS as the gold standard for this study. The algorithm used by NAACCR in determining completeness rates is based on age-adjusted observed and expected incidence rates for the state of Ohio. The expected cancer incidence rates are calculated based on Ohio-specific age-adjusted cancer mortality rates and incidence-to-mortality rate ratios obtained from SEER and U.S. death rates. For female breast cancer, specifically, the OCISS estimates that data are 89 percent complete for the 1997 diagnosis year and 94 percent complete for the 1998 diagnosis year (Ohio Cancer Incidence Surveillance System 2002).
The present study assesses the ability of Medicaid claims data to correctly identify incident cases of breast cancer using OCISS as the gold standard. We examine the sensitivity and positive predictive value (PPV) of Medicaid claims for case ascertainment of incident cases of breast cancer.
The study used 1997 and 1998 OCISS and Medicaid claims and enrollment files. Because of low prevalence of breast cancer in younger women, the study was limited to women 40 years of age or older. Medicaid enrollment files were linked with OCISS on a calendar year-by-year basis, using patient identifiers, including last name, first name, social security number, and date of birth. The project was approved by the Institutional Review Boards at the University Hospitals of Cleveland and the Ohio Department of Health, as well as by the Bureau of Ohio Health Plans, Ohio Department of Job and Family Services. The matching algorithm used in this study was comparable to the one employed to link Medicare and SEER files (Potosky et al. 1993). Namely, matching was performed in four steps, after creating unique identifiers with different combinations of names, social security number, and date of birth, as follows:
Step 1: social security number, first name, last name
Step 2: social security number, last name, date of birth (month)
Step 3: social security number, first name, date of birth (month)
Step 4: first name, last name, date of birth (month and year)
First and last names were truncated to their first six characters.
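The four-step match described above can be sketched as follows. This is a minimal illustration, not the linkage code used in the study: the record fields (`ssn`, `first`, `last`, `dob`), the helper names, and the sample records are hypothetical, and the sketch assumes that each step is applied in order to records left unmatched by earlier steps and that keys with a missing component are skipped.

```python
from datetime import date

def trunc(name, n=6):
    """Upper-case a name and keep its first n characters."""
    return name.strip().upper()[:n]

# One key-building function per matching step, in the order listed above.
STEP_KEYS = [
    lambda r: (r["ssn"], trunc(r["first"]), trunc(r["last"])),       # step 1
    lambda r: (r["ssn"], trunc(r["last"]), r["dob"].month),          # step 2
    lambda r: (r["ssn"], trunc(r["first"]), r["dob"].month),         # step 3
    lambda r: (trunc(r["first"]), trunc(r["last"]),
               r["dob"].month, r["dob"].year),                       # step 4
]

def usable(key):
    """A key is usable only if none of its components is missing."""
    return all(part not in (None, "") for part in key)

def link(registry, enrollment):
    """Return {registry index: (enrollment index, step)} for matched records."""
    matches, used = {}, set()
    for step, make_key in enumerate(STEP_KEYS, start=1):
        # Index the enrollment records that are still unmatched at this step.
        index = {}
        for j, rec in enumerate(enrollment):
            key = make_key(rec)
            if j not in used and usable(key):
                index.setdefault(key, j)
        for i, rec in enumerate(registry):
            if i in matches:
                continue
            j = index.get(make_key(rec))
            if j is not None and j not in used:
                matches[i] = (j, step)
                used.add(j)
    return matches

# Hypothetical records, for illustration only.
registry = [
    {"ssn": "111", "first": "Margaret", "last": "Anderson", "dob": date(1940, 3, 5)},
    {"ssn": None, "first": "Joan", "last": "Smith", "dob": date(1931, 7, 9)},
]
enrollment = [
    {"ssn": None, "first": "Joan", "last": "Smith", "dob": date(1931, 7, 20)},
    {"ssn": "111", "first": "Margaret", "last": "Andersonn", "dob": date(1940, 3, 5)},
]
print(link(registry, enrollment))  # {0: (1, 1), 1: (0, 4)}
```

In this toy example the first record matches at step 1 despite a spelling difference beyond the sixth character of the last name, and the second record, which lacks a social security number, matches only at step 4.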
Medicaid claims data from inpatient, outpatient (institutional and noninstitutional), and pharmacy files were used to identify cases of female breast cancer. Breast cancer cases were identified in the presence of invasive or in situ breast cancer diagnosis, mastectomy, and/or lumpectomy. In the absence of codes for breast cancer diagnosis or mastectomy, cases with lumpectomy were identified as breast cancer only in the presence of codes for chemotherapy and/or radiation therapy as well. The table in the Appendix lists all the codes used in identifying the above-referenced diagnosis and procedure codes. Cases were further categorized as prevalent in the presence of any of the above diagnosis and/or procedure codes on claims with dates preceding the initial date of breast cancer diagnosis in the study year, as identified through claims data. Cases were also categorized as prevalent in the presence of any pharmacy claim documenting a prescription of tamoxifen with date of service preceding the claims-based initial date of breast cancer diagnosis. Claims files were searched as far back as July 1994, which implies a search period of at least 30 months for cases identified in 1997, and 42 months for those identified in 1998.
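The identification rules in this paragraph can be expressed as a short sketch. The category labels (`dx`, `mastectomy`, `lumpectomy`, `chemo_rad`, `tamoxifen`) and the function name are hypothetical stand-ins for the code lists in the Appendix, and the sketch assumes that the claims-based initial date is the earliest diagnosis or surgery claim in the study year.

```python
from datetime import date

CANCER_CODES = {"dx", "mastectomy", "lumpectomy"}  # hypothetical category labels
LOOKBACK_START = date(1994, 7, 1)                  # claims were searched back to July 1994

def classify(claims, study_year):
    """claims: list of (service_date, category) pairs for one woman.

    Returns 'incident', 'prevalent', or None when no case is identified.
    """
    in_year = [(d, c) for d, c in claims if d.year == study_year]
    cats = {c for _, c in in_year}
    # A lumpectomy without diagnosis or mastectomy counts only with chemo/radiation.
    has_case = bool(cats & {"dx", "mastectomy"}) or (
        "lumpectomy" in cats and "chemo_rad" in cats
    )
    if not has_case:
        return None
    index_date = min(d for d, c in in_year if c in CANCER_CODES)
    # Any earlier diagnosis, surgery, or tamoxifen claim marks the case as prevalent.
    prior = any(
        LOOKBACK_START <= d < index_date and (c in CANCER_CODES or c == "tamoxifen")
        for d, c in claims
    )
    return "prevalent" if prior else "incident"

print(classify([(date(1997, 5, 1), "dx"), (date(1997, 5, 20), "mastectomy")], 1997))  # incident
print(classify([(date(1995, 2, 1), "dx"), (date(1997, 5, 1), "dx")], 1997))           # prevalent
```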
Variables retrieved from OCISS included age (dichotomized as 40–64 and ≥65), race (white and nonwhite), and SEER summary stage (in situ, local, regional, distant, and unknown/unstaged).
Monthly variables from Medicaid enrollment files were used to identify women who had been enrolled in Medicaid for all 12 months of the study year (full-year) and those enrolled for only part of the year (partial-year, or <12 months of enrollment in Medicaid during the study year). Partial-year enrollment in Medicaid reflects the experience of individuals joining the Medicaid program some time during the study period, or those with no continuous enrollment in Medicaid. Monthly enrollment variables were also used to identify those with any participation in the Medicare program (enrollment in Part A or Part B, or presence of crossover claims); and those with any participation in the Medicaid spend-down program, a program that enables beneficiaries with incomes exceeding the income limit required to become eligible for Medicaid services to spend the excess amount on medical costs. For example, if the income limit is $600 per month, and the individual's monthly income amounts to $750 per month, she or he can qualify for Medicaid after spending down $150 on medical costs. Since these are typically out-of-pocket expenditures, the care received will not be documented in Medicaid claims files. All three of these variables (partial-year enrollment, dual Medicare-Medicaid eligibility, and participation in the spend-down program) can be associated with incomplete claims history, hence their relevance in this study.
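As a small illustration of the enrollment classification and the spend-down arithmetic described above, the following sketch may help. The function names and the representation of monthly enrollment as 12 boolean flags are assumptions made for illustration, not the layout of the Ohio enrollment files.

```python
def enrollment_status(months_enrolled):
    """months_enrolled: 12 booleans, one per month of the study year."""
    if len(months_enrolled) != 12:
        raise ValueError("expected 12 monthly flags")
    n = sum(months_enrolled)
    if n == 0:
        return "not enrolled"
    return "full-year" if n == 12 else "partial-year"

def spend_down_amount(monthly_income, income_limit):
    """Amount to be spent on medical costs before qualifying for Medicaid."""
    return max(0, monthly_income - income_limit)

print(enrollment_status([False] * 3 + [True] * 9))  # partial-year
print(spend_down_amount(750, 600))                  # 150
```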
Medicaid claims data were used to identify cases of breast cancer, and also to identify women with comorbid conditions (excluding neoplasms), as defined through the Deyo comorbidity index (Deyo, Cherkin, and Ciol 1992).
Sensitivity and PPV were calculated as follows:
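In terms of the counts used in this study, the two measures can be written as below. The wording of the numerators and denominators is a reconstruction from the definitions and counts given in the text (for example, 1,132/1,648 and 1,132/2,635 in the results), not a verbatim formula.

```latex
\[
\text{Sensitivity} =
  \frac{\text{linked OCISS--Medicaid cases also identified as incident through claims}}
       {\text{all incident cases in the linked OCISS--Medicaid files}}
\]
\[
\text{PPV} =
  \frac{\text{claims-identified incident cases also found in the linked OCISS--Medicaid files}}
       {\text{all incident cases identified through claims}}
\]
```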
Thus, sensitivity represents the ability of claims data to correctly identify those who are diagnosed with breast cancer. The PPV reflects the proportion of patients who were identified as incident through claims data and who actually had the disease. The PPV was calculated for the study population overall, and also for each combination of breast cancer diagnosis, presence of mastectomy, lumpectomy, or chemotherapy/radiation therapy procedure codes. Since the analysis did not consider patients with absent diagnosis or procedures in both OCISS and Medicaid files, other measures such as negative predictive value could not be determined. Stratified analysis was conducted to detect differences in sensitivity and PPV between the different subgroups of the population, using chi-square tests. For sensitivity analysis, stratification was used on variables that could be retrieved from OCISS or Medicaid enrollment files; for PPV, stratified analysis used variables originating from Medicaid enrollment or claims files.
A total of 234,195 and 230,474 women 40 years of age or older were enrolled in the Medicaid program for at least one month in study years 1997 and 1998, respectively, and 8,648 and 9,250 incident cases of female breast cancer were identified through OCISS in the same years. The linked OCISS and Medicaid enrollment files included 812 and 842 incident cases of breast cancer in 1997 and 1998, respectively. Through step 1 of the matching algorithm above, 87 percent of the cases were identified, and more than 7 percent were identified through step 4 (Table 1). Six cases, which were identified in both the 1997 and the 1998 linked files, were excluded from the analysis, leaving a total of 1,648 cases in the study population for the two years combined.
Table 2 shows the distribution of cases identified through Medicaid claims search, by incident status. A total of 2,635 cases were identified as incident cases, implying that the claims search for services preceding the claims-based initial date of breast cancer diagnosis did not yield any claims carrying relevant diagnosis or procedure codes. Cases identified as prevalent numbered 3,052. Of these, 250 cases were identified through pharmacy claims only (prescription of tamoxifen prior to the claims-based initial date of breast cancer diagnosis). A total of 1,271 cases were identified through both claims data and the OCISS-Medicaid enrollment files, of which 1,132 cases were identified through claims data as incident and 139 as prevalent.
The sensitivity rates for each of the subgroups of the Medicaid population are presented in Table 3. The overall sensitivity was 68.7 percent (1,132/1,648). It was significantly higher in the 40–64 age group than in their older counterparts (78.4 percent and 60.7 percent respectively); and significantly lower among those enrolled in the study year for less than 12 months compared to those enrolled continuously for 12 months of the year (56.7 percent and 78.0 percent respectively), and among those who had participated in the Medicare program compared to those with no dual eligibility for the Medicare and Medicaid programs (63.1 percent compared to 78.6 percent; p<0.0001 for the above comparisons). Sensitivity also varied by stage of breast cancer at diagnosis, with highest rates among women diagnosed with regional and distant metastases, and lowest rate among women with cancer that was unstaged or of unknown stage.
The PPVs, calculated for different combinations of breast cancer diagnosis, mastectomy, lumpectomy, and chemotherapy/radiation therapy, are presented in Table 4. The overall PPV was 43.0 percent (1,132/2,635). It ranged from 15.1 percent in the presence of breast cancer diagnosis alone, to a high of 86.6 percent in the presence of mastectomy or lumpectomy, in addition to breast cancer diagnosis. The second and third highest PPVs were found respectively among women with claims carrying a combination of breast cancer diagnosis, lumpectomy, and chemotherapy/radiation therapy (84.7 percent) and those with claims carrying a diagnosis of breast cancer and mastectomy (84.4 percent). The PPV was lower among women who had breast cancer diagnosis with chemotherapy/radiation therapy codes, and those with breast cancer diagnosis and lumpectomy codes only. Only one woman was identified with incident breast cancer in the presence of a mastectomy and lumpectomy, but in the absence of a breast cancer diagnosis. Stratified analysis of PPV by age and race, full-year versus partial-year enrollment in Medicaid, participation in the Medicare or spend-down programs, or by the presence or absence of comorbidity factors, yielded similar results and patterns, with higher PPVs in the diagnostic and procedure code combinations described above (data not shown).
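The overall rates can be checked directly from the counts reported in the text. The short sketch below uses only those counts; the variable names and the helper `pct` are introduced here for illustration.

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)

linked_cases = 812 + 842 - 6   # 1997 and 1998 linked cases, minus six found in both years
claims_incident = 2635         # incident cases identified through the claims search
both_incident = 1132           # linked cases identified as incident through claims
both_prevalent = 139           # linked cases identified as prevalent through claims

sensitivity = pct(both_incident, linked_cases)
ppv = pct(both_incident, claims_incident)
incident_share = pct(both_incident, both_incident + both_prevalent)

print(linked_cases, sensitivity, ppv, incident_share)  # 1648 68.7 43.0 89.1
```

The last figure is the share of linked cases that the claims search classified as incident rather than prevalent (1,132 of 1,271).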
In this study, we assessed the ability of Medicaid claims data to identify incident cases of breast cancer in the Medicaid population, using OCISS as the gold standard. This study highlights two important findings: First, the sensitivity was significantly lower among women who were enrolled for only part of the study year, potentially because of incomplete claims history. Although other factors, such as age and dual eligibility for the Medicare and Medicaid programs, are associated with lower sensitivity, we found through additional analysis that sensitivity was higher among women who had been enrolled in Medicaid for all 12 months of the study year than among those who had been participating in Medicaid for only part of the year, even among dually eligible Medicare–Medicaid individuals. Among women not participating in Medicare, for example, the sensitivity was 74.2 percent among those with partial-year enrollment in Medicaid, and 83.3 percent among those with full-year enrollment in Medicaid, a rate comparable to that reported for 1992 Medicare claims (Cooper et al. 1999; Warren et al. 1999). Similarly, among dually eligible women, sensitivity was 43.6 percent among partial-year enrollees and 75.6 percent among those enrolled in Medicaid for 12 months of the study year (p<0.001 for above comparisons). Why sensitivity remained lower among dually eligible Medicare–Medicaid enrollees than among others, even if they were continuously enrolled in Medicaid for 12 months of the study year, remains to be explored. Like other claims, crossover claims are believed to carry diagnostic and procedure codes, and to provide an account of the services received by the beneficiary.
It is possible, however, that similar to those enrolled for only part of the study year, Medicare-eligible women with low incomes would be likely to enroll in Medicaid after they are diagnosed with and treated for their cancer, just as they transition to the phase where treatment is rendered in outpatient settings and out-of-pocket expenditures for Part B services begin to constitute financial burden. Such issues would be best explored through a linked OCISS and Medicare–Medicaid database.
Second, the PPV, which reflects the ability of claims to identify true positive cases, varied greatly, depending on the combination of diagnosis and procedure codes used to identify incident cases. The PPV was highest in groups where cases were identified using a combination of procedures for mastectomy, lumpectomy with chemotherapy/radiation therapy, in addition to breast cancer diagnosis. These combinations of diagnostic and procedure codes reflect the typical experience of a woman undergoing diagnosis and treatment of an incident case of breast cancer. These findings indicate that the presence of procedure codes that are specific to the treatment of breast cancer (mastectomy and/or lumpectomy with chemotherapy/radiation therapy), in addition to that of breast cancer diagnosis, greatly contributes to the correct identification of incident cases of breast cancer using claims data. Low PPV implies a high rate of false positives. For a large number of cases, claims data may have identified prevalent, rather than incident cases of breast cancer, especially (a) where breast cancer diagnosis was documented in the absence of any procedures specific to the disease, and (b) where breast cancer diagnosis was present along with codes for chemotherapy/radiation therapy, without the documentation of any surgical codes. In instances where breast cancer diagnosis was present with a lumpectomy code, but with no documentation for chemotherapy/radiation therapy, it was not possible to distinguish between cases that were coded as breast cancer in the presence of the disease, or to rule out the disease. Similarly, cases where only breast cancer diagnosis was documented in the absence of relevant procedure or treatment codes could well have been diagnostic rule-out cases, with the receipt of a screening or diagnostic mammography exam. In fact, nearly 30 percent of such cases presented with a mammography procedure code.
This study also enabled us to assess the utility of pharmacy data in identifying prevalent cases of breast cancer. Of the 3,052 prevalent cases identified across the two-year period through claims data, 250 (or 8.2 percent) had been identified through pharmacy claims only, based on the fact that they had been prescribed tamoxifen prior to the claims-based initial date of breast cancer diagnosis. Of these, 26 cases matched successfully with OCISS records. Given their prevalent status from the claims data, these cases were not accounted for in the sensitivity rates. Because of their small number, the inclusion of these cases in the numerator to measure the claims' sensitivity would not have resulted in a large increase in the rates. However, the decision of whether to account for these cases as prevalent based on pharmacy claims alone could be questioned, given that tamoxifen was being prescribed as a prophylactic therapy in recent years to prevent primary cancers among high-risk women or new primary cancers in the contralateral breast among women with history of breast cancer (Vogel 2000).
The finding that sensitivity was somewhat higher among women with spend-down deserves to be explored further. While no study has documented that individuals participating in the spend-down program have incomplete claims history, that is believed to be the case, because part of the services is paid for by the patient out of pocket. In this study, 73 percent of patients on spend-down were 65 years of age or older, and 89 percent participated in the Medicare program. Both factors are known to be associated with lower sensitivity rates. On the other hand, 61 percent of women on spend-down and 53 percent of their non-spend-down counterparts were enrolled in the Medicaid program for the full study year (p<0.01), a factor favoring higher sensitivity rates. More studies are needed to gain a better understanding of the subgroups of the Medicaid population that are more likely than others to be represented in the claims data with incomplete history.
An important consideration in this study is that, despite its Silver Certification by the NAACCR, OCISS is not a reporting source that is as well established as SEER. Because this study is limited to the Ohio Medicaid program and the state's system of monitoring cancer incidence, some of the findings may not be generalizable to other states. Nevertheless, it is highly likely that the findings of this study, especially as they relate to the lower rates of sensitivity among individuals with potentially incomplete claims history, and the higher PPV in the presence of certain combinations of diagnostic and procedure codes, will hold true in other settings, including those of other third-party payers.
To the authors' knowledge, this is the first study to evaluate the quality of Medicaid claims data to ascertain incident cases of breast cancer. An important strength of the study is the use of claims data originating from all categories of service to retrieve cancer-related diagnosis and procedure codes, as well as the use of pharmacy claims to identify incident and prevalent cases of breast cancer. This approach seems to have been successful, as more than 89 percent of cases that were successfully linked with OCISS records were incident, rather than prevalent cases of cancer.
Findings from this study highlight the need to exercise great caution in analyzing Medicaid data and interpreting the results, mainly because of the intricacies of the Medicaid program. Incomplete claims history is an important factor to account for in studying outcomes in the Medicaid population, mainly due to discontinuity in enrollment, and participation in spend-down and/or Medicare programs. Participation in spend-down results in out-of-pocket expenditures, and therefore incomplete documentation of services received, and participation in Medicare results in claims in the Medicare database. With Medicaid being the payer of last resort for the dually eligible Medicare–Medicaid population, it is possible that some of the services may have been undocumented in the Medicaid database, or documented in an incomplete fashion. Future studies should use linked OCISS and Medicare–Medicaid files to assess the incremental benefit of adding Medicare claims when evaluating the ability of claims data to ascertain incident cases of cancer in the dually eligible Medicare–Medicaid population. It is also important to be mindful of the fact that Medicaid is a safety net program. As described above, this implies that individuals would be likely to join the Medicaid program after undergoing part of the diagnostic and therapeutic regimens while being uninsured, or part of a different health care delivery system, resulting again in incomplete claims history in the Medicaid database.
In conclusion, much remains to be explored in assessing the utility of Medicaid claims data in cancer-related outcomes. This study evaluated only the ability of Medicaid claims data to ascertain incident cases of breast cancer in the Medicaid population. In the process, it identified programmatic issues that could affect the completeness of claims history. Additional studies are needed to explore the utility of Medicaid claims data in analyzing other aspects of cancer-related outcomes, such as cancer stage, treatment, and follow-up care, and to identify with greater certainty the circumstances under which Medicaid claims can be considered useful in studying cancer-related outcomes.
The authors wish to thank the Ohio Department of Health (ODH) and the Ohio Department of Job and Family Services (ODJFS) for making the data available. The authors also thank Ms. Georgette Haydu, M.S., of the Ohio Cancer Incidence Surveillance System, ODH, for her review of earlier drafts of this manuscript.
Cancer incidence data were obtained from the Ohio Cancer Incidence Surveillance System (OCISS), Ohio Department of Health. Use of these data does not imply that the Ohio Department of Health either agrees or disagrees with any presentation, analyses, interpretations, or conclusions. Information about the OCISS may be obtained at http://www.odh.state.oh.us/ODHPrograms/CI_SURV/ci_surv1.htm.
|Diagnosis||ICD-9-CM Diagnostic Codes|
|Malignant Neoplasm of Female Breast||174.0-174.9|
|Carcinoma In Situ of Breast||233.0|
|Procedure||ICD-9-CM Diagnostic Codes||ICD-9-CM Procedure Codes||CPT-4 or HCPCS Codes|
|Local Excision/|| ||85.20, 85.21,||19120, 19125,|
| ||V67.2|| ||J9000, J9001, J9010, J9070, J9080, J9090-J9097, J9190, J9250,|
Dr. Koroukian, senior instructor, was supported by grant no. F32-CA84621 at the time the study was conducted. Results were presented in part on December 3, 2001, at the annual meeting of the National Association of Health Data Organizations (NAHDO), Washington, DC.