Background. The Medical Expenditure Panel Survey (MEPS) is a widely used nationally representative survey of health care use and expenditures. Numerous studies raise concerns that use is underreported in household surveys.
Objective. To assess the quality of household reports in the MEPS and the impact of misreporting on descriptive and behavioral analyses.
Data Sources and Study Design. Participants in MEPS with Medicare coverage during 2001–2003 were matched to their Medicare enrollment and claims data (4,045 person-year observations). Household reports of Medicare-covered services for the matched sample were validated against Medicare claims. Standard models of the determinants of health care utilization were estimated with both MEPS and claims-based utilization measures.
Data Collection Methods. In-person interviews with household informants obtained data on hospital inpatient, emergency department (ED), and office-based visits. Comparable measures were created from the claims.
Principal Findings. In the validation sample, households accurately reported inpatient stays (agreement rate=0.96, κ=0.89) and number of nights (Lin's concordance statistic=0.88). Households underreported ED visits by one-third (Lin's concordance statistic=0.51) and office visits by 19 percent (Lin's concordance statistic=0.67).
Conclusions. Household respondents in the validation sample accurately report inpatient hospitalizations but underreport ED and office visits. Behavioral analyses are largely unaffected because underreporting cuts across all sociodemographic groups.
The Medical Expenditure Panel Survey (MEPS) is a widely used nationally representative survey of the levels and determinants of health care use and spending by U.S. households. Before its inception in 1996, these data were collected decennially in the 1987 National Medical Expenditure Survey and the 1977 National Medical Care Expenditure Survey, but the need for more timely data led to the significant public investment of an annual expenditure survey. The MEPS collects detailed information from households about their use of office-based and hospital services, prescription drugs, and other health care services. The household-reported use is the basis for national estimates of health care expenditures, but reliance on household-reported utilization raises concerns about the accuracy of the estimates.
A recent review of 42 studies evaluating the accuracy of self-reported utilization data found that the most common problem is underreporting (Bhandari and Wagner 2006). However, these studies provide mixed and sometimes conflicting evidence about the extent of underreporting. Some studies found relatively little underreporting error for physician visits (Cleary and Jette 1984; Roberts et al. 1996; Raina et al. 1998; Lubeck and Hubert 2005), inpatient stays (Roberts et al. 1996; Lubeck and Hubert 2005), and emergency room visits (Lubeck and Hubert 2005). Other studies found relatively large underreporting in some or all types of visits (Glandon, Counte, and Tancredi 1992; Ungar, Coyte, and the Pharmacy Medication Monitoring Program Advisory Board 1998; Wallihan, Stump, and Callahan 1999; Rozario, Morrow-Howell, and Proctor 2004). Another found overreporting of emergency room visits (Ritter et al. 2001).
In a more recent study, Wolinsky et al. (2007) used Medicare claims for respondents in the Survey on Assets and Health Dynamics among the Oldest Old (AHEAD) to validate the accuracy of self-reported hospital stays and physician visits for evaluation and management services. Concordance was high for self-reported hospital stays and Medicare claims, but low for physician visits.
Together these findings shed little light on underreporting in the MEPS, even though it is likely a major contributing factor to the estimated gap of 14 percent in benchmark comparisons of MEPS to the National Health Care Expenditure Accounts (Sing et al. 2006). In general, we would expect more reliable reporting in the MEPS compared with AHEAD and many other national surveys, because MEPS uses a relatively short recall period (5 months on average versus 12 months), diaries, and extensive probes to enhance recall (Cohen et al. 1996/1997). However, the MEPS relies on a single household informant to report health care use for all household members (i.e., the informant acts as a proxy for everyone else in the household) and, in rare instances, uses proxies from outside the household.
Like the Wolinsky et al. (2007) study, we use Medicare claims to assess the accuracy of household-reported office-based and hospital care for a sample of Medicare beneficiaries in the MEPS. While not an absolute gold standard, we similarly presume that Medicare claims represent a complete record of the services paid for by Medicare. Because we also restrict attention in MEPS to visits and stays paid by Medicare, these claims serve as an appropriate benchmark. Our MEPS sample includes Medicare beneficiaries who reported utilization for themselves, as well as beneficiaries for whom either a resident or nonresident proxy reported their health care use. All three are important to understanding the accuracy of MEPS utilization data. Besides assessing the level of concordance between MEPS utilization data and Medicare claims, we seek to understand and quantify the extent to which differential reporting by population subgroups affects descriptive and behavioral analyses such as Aday–Andersen (1974) models of health care utilization. In doing so, we build on existing studies that tend to focus principally on the assessment of concordance.
The MEPS uses a rotating panel design with two overlapping cohorts of the U.S. noninstitutionalized civilian population combining to produce annual estimates (J. Cohen 1997; S. Cohen 1997, 2003). A new cohort of households is initiated each year and interviewed five times to collect two calendar years of data. A single informant reports for each household during each interview round. The MEPS asks that this person be the family member most knowledgeable about health and health care use in the household. We pooled data for calendar years 2001–2003 and initially subset the sample to persons covered by Medicare at any point during a year. This sample included 9,015 unique persons, or 13,680 person-year observations.
Medicare beneficiaries in the MEPS were asked to voluntarily provide their Medicare card number so that their Medicare records could be located and used for statistical research purposes. Under a Data Use Agreement with the Centers for Medicare and Medicaid Services (CMS), beneficiaries in our full MEPS sample were matched to their Medicare enrollment and claims data using survey-reported Medicare health insurance claim numbers (HICNs) or social security numbers (SSNs). Complete HICNs or SSNs were reported for 3,788 sample persons in the 2001–2003 surveys, and 3,463 of these persons (or 91 percent) matched exactly to the same HICN or SSN, sex, and date of birth in the Medicare administrative records (38 percent of the 9,015 people with Medicare coverage in the full sample). Under our agreement with CMS, only exact matches could be used.
A logistic regression found that the exact matches were more likely to be the household informant (self-respondent), to live in the Midwest or South compared with the West and East regions, to reside in a nonmetropolitan statistical area (non-MSA), to report their race as white compared with nonwhite, and to be at least 65 years old, compared with the Medicare beneficiaries who did not match exactly or did not provide their HICN or SSN for the matching (full results available in an appendix table available from the authors). We used a propensity-score reweighting procedure based on this regression to adjust the standard MEPS weights for differences in sociodemographic and interview characteristics in the likelihood of matching to CMS enrollment files. Applying the adjusted weight, we found no statistically significant differences in survey-reported inpatient and emergency department (ED) utilization, and small differences (6 percent) in their ambulatory utilization (office-based and outpatient department visits) for those who did not match.
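The reweighting step described above can be sketched as inverse-probability adjustment: matched cases are upweighted by the inverse of their estimated match probability so that, in expectation, they represent the full Medicare sample. The sketch below is a minimal illustration, not the authors' actual procedure; the covariates, weights, and data are synthetic stand-ins.

```python
import numpy as np

def fit_logit(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson; X includes an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)
        beta += np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (y - p))
    return beta

rng = np.random.default_rng(0)
n = 500
# Synthetic stand-ins for the matching predictors (self-respondent,
# region, MSA, race, age 65+) -- here reduced to two binary covariates.
X = np.column_stack([np.ones(n),
                     rng.integers(0, 2, n),
                     rng.integers(0, 2, n)]).astype(float)
latent = X @ np.array([0.5, 0.8, -0.4])
matched = (rng.random(n) < 1.0 / (1.0 + np.exp(-latent))).astype(float)

beta = fit_logit(X, matched)
p_match = 1.0 / (1.0 + np.exp(-(X @ beta)))

base_weight = rng.uniform(1000.0, 5000.0, n)   # stand-in for MEPS weights
# Matched cases upweighted by 1/p-hat; unmatched cases drop out.
adj_weight = np.where(matched == 1.0, base_weight / p_match, 0.0)
```

A quick sanity check on such an adjustment is that the adjusted weights of the matched cases sum to approximately the total weight of the full sample.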
To ensure that our analytic sample had complete utilization data for a comparable period from both sources, we restricted the matched sample to Medicare beneficiaries who were in MEPS for the entire calendar year, leaving 5,169 person-year observations. We further restricted this group to beneficiaries with Part A and Part B Medicare fee-for-service coverage for the entire calendar year based on their monthly Medicare enrollment data. We note that no claims data are available during periods when beneficiaries are enrolled in Medicare Advantage plans.
The final analytic sample contained 2,649 persons and 4,045 person-year observations. Their Medicare claims for office- and hospital-based care are the benchmark in our comparisons with the MEPS utilization data. We used Medicare claims rather than data from the Medical Provider Component (MPC) of the MEPS to validate survey-reported utilization because the MPC only has data for a subset of medical providers used by survey respondents. Moreover, comparisons at the person-provider level can be misleading if the survey respondent links the wrong provider to a specific event. The claims, on the other hand, include all doctor and hospital services reimbursed by Medicare, and they can be linked to the beneficiary regardless of whether the person correctly identified the doctor or hospital providing the care.
We compared annual utilization of inpatient hospitalizations, ED visits, and office-based visits from MEPS and the Medicare claims. Our comparisons of ED visits are restricted to stand-alone events. ED visits resulting in an inpatient hospitalization were combined with inpatient stays in our comparisons. We omit comparisons of the reporting of outpatient department visits because our examination of Medicare claims found a large number of claims for laboratory services where it was not possible to determine whether the Medicare beneficiary was physically present (as opposed to their doctor simply sending their blood to a hospital laboratory).
Household-reported annual utilization was derived from the MEPS event files for inpatient hospitalizations, ED visits, and office-based visits. Each observation in the event files corresponds to a single hospitalization, ED visit, or office visit. Office visits to the same provider occurring on the same day were combined. We used the events where Medicare was one of the payers in our comparisons to Medicare claims because the Medicare standard analytic files (SAFs) only include final action (nonrejected) claims for which a payment was made and all disputes and adjustments had been resolved. The SAFs do not include Part B (physician/supplier) events where the beneficiary had not met his or her annual deductible, or events not typically covered by Medicare such as preventive care, routine eye exams, or cosmetic surgery. The Medicare SAFs also do not include visits that were part of a bundle of services for which the physician already received a single payment (termed a flat or global fee in MEPS), for example, postoperative visits included in the surgical fee under Medicare rules. While households may report these visits in MEPS, they are implicitly excluded from our utilization counts because there would be no payment recorded for them in MEPS either. Services provided by VA facilities are similarly excluded from our comparisons because there are no Medicare payments reported for these visits or hospital stays.
The SAFs are annual files created from claims processed through June of the following year, and they are usually available to researchers about 9 months after the close of the year. Claims in the inpatient, outpatient, and carrier (Part B physician and supplier) SAFs were used to construct utilization measures comparable to those in the MEPS. Algorithms from the CMS-funded Research Data Assistance Center were used to construct ED events from claims in the Medicare inpatient and outpatient files (Merriman 2003). The site of care variable in the carrier SAF was used to identify physician and other provider office-based visits. In parallel to the MEPS, office visits occurring on the same day to the same provider were combined into a single visit. A technical report available from the authors (WESTAT 2008) contains the full set of algorithms used in constructing the utilization measures.
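The same-day, same-provider collapsing rule applied to both sources can be illustrated with a short sketch. The field names and records below are hypothetical; the full construction algorithms are in the WESTAT (2008) technical report.

```python
from collections import defaultdict
from datetime import date

def collapse_visits(claim_lines):
    """Collapse claim/event lines into visits: lines for the same person,
    same provider, and same service date count as one office visit.
    Returns annual visit counts per person."""
    visits = {(pid, prov, day) for pid, prov, day in claim_lines}
    counts = defaultdict(int)
    for pid, _prov, _day in visits:
        counts[pid] += 1
    return dict(counts)

# Hypothetical claim lines: (person, provider, service date)
lines = [
    ("A", "dr1", date(2002, 3, 1)),  # two claim lines from the same visit
    ("A", "dr1", date(2002, 3, 1)),
    ("A", "dr2", date(2002, 3, 1)),  # same day, different provider: 2nd visit
    ("B", "dr1", date(2002, 5, 9)),
]
# collapse_visits(lines) counts 2 visits for person A, 1 for person B
```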
We made a few adjustments to the MEPS events because the household-reported type of event can differ from the corresponding record in the claim files. For example, a late-night ED visit may have been reported for a beneficiary in MEPS, but a one-night hospital stay was recorded in the claims, or vice versa. Accounting constructs such as “zero-night” hospital stays create even more confusion for respondents. While such respondents misreport the type of care received from a purely administrative records perspective, they are clearly reporting the same episode of care. These adjustments are described in a technical appendix (available from the authors).
We created the following sociodemographic variables from the MEPS. Age was categorized as under 65, 65–74, 75–84, and 85 and older. Binary indicators represent the following categories: female, nonwhite including Hispanics, married, region (North, South, Midwest, and West), and living in an MSA. Family income was coded as below 100, 100–199, and 200 percent or more of the federal poverty line. Education was categorized as <12, 12, and >12 years. Binary indicators represented the five categories of perceived health: excellent, very good, good, fair, or poor. A cognitive limitation indicator was coded “1” for persons who experienced confusion or memory loss, had problems making decisions, or required supervision for their own safety. An activity limitation indicator was coded “1” if the person had limited ability to work in a job, do housework, or go to school “because of impairment or physical or mental health problem.” Private insurance and Medicaid were coded “1” if the person had private coverage or Medicaid coverage at the first interview of the year.
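The categorical codings described above are simple threshold rules. A sketch follows; the cut points come from the text, while the function names are ours.

```python
def age_group(age):
    """Age categories used in the analysis: <65, 65-74, 75-84, 85+."""
    if age < 65:
        return "<65"
    if age < 75:
        return "65-74"
    if age < 85:
        return "75-84"
    return "85+"

def poverty_group(family_income, poverty_line):
    """Family income relative to the federal poverty line:
    below 100, 100-199, and 200 percent or more."""
    pct = 100.0 * family_income / poverty_line
    if pct < 100:
        return "<100%"
    if pct < 200:
        return "100-199%"
    return ">=200%"
```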
We also constructed indicators describing the interviews and how utilization data were obtained for each beneficiary. Interview language indicates that at least one MEPS interview was in a language other than English (3 percent). We classified reporting of utilization data into one of three categories: self-reported indicates that the sample beneficiary was the household informant in all of the interviews (61 percent of the final matched sample), household proxy (37 percent) indicates that utilization data were reported by a proxy living residing in the household (usually a spouse), and nonresident proxy indicates that a person outside of the household reported utilization data for the sampled person (2 percent). Finally, year in survey was coded “0” for first and “1” for second year (46 percent) in the MEPS survey.
We examined survey underreporting compared with claims for inpatient stays, ED visits, and office visits in our matched analytic sample. We calculated univariate means for annual utilization at the person level and the ratio of the mean utilization reported in MEPS and the mean recorded in claims. Adjusted t-tests were used to examine differences in means between MEPS and the claims measures. For the binary utilization measures, we also calculated the mean agreement rate (defined as “1” if use is reported in both MEPS and the claims, and “0” otherwise). We also calculated κ statistics to facilitate comparisons with other studies of underreporting. For the continuous utilization measures, we calculated Lin's concordance measure, which is scaled from −1 for perfect disagreement to 1 for perfect agreement (Lin 1989). We then conducted bivariate analyses of the utilization means and ratios and the agreement and concordance statistics by sociodemographic and interview characteristics, using adjusted Wald statistics to test for differences.
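The agreement and concordance statistics above have simple closed forms. Below is an unweighted sketch on toy data; the paper's estimates additionally apply survey weights and BRR standard errors, which this illustration omits.

```python
import numpy as np

def agreement_rate(a, b):
    """Share of person-years where both sources agree on any use (0/1)."""
    return float(np.mean(a == b))

def cohen_kappa(a, b):
    """Cohen's kappa for two binary indicators: chance-corrected agreement."""
    po = np.mean(a == b)
    p1a, p1b = np.mean(a), np.mean(b)
    pe = p1a * p1b + (1.0 - p1a) * (1.0 - p1b)
    return float((po - pe) / (1.0 - pe))

def lin_concordance(x, y):
    """Lin's concordance correlation coefficient, scaled from -1
    (perfect disagreement) to 1 (perfect agreement)."""
    mx, my = np.mean(x), np.mean(y)
    vx, vy = np.var(x), np.var(y)
    cxy = np.mean((x - mx) * (y - my))
    return float(2.0 * cxy / (vx + vy + (mx - my) ** 2))

# Toy data: 1 = any use recorded in that source, 0 = none
claims = np.array([1, 1, 0, 0, 1])
survey = np.array([1, 0, 0, 0, 1])
# agreement_rate(claims, survey) is 0.8; cohen_kappa is 0.32/0.52
```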
We investigated whether reporting errors in MEPS lead to systematic biases in behavioral analyses by estimating pairs of utilization regressions using the claims and MEPS household reported utilization measures, respectively, as the dependent variable and comparing the results. The independent variables in each pair included an identical set of sociodemographic covariates often used in models of health care utilization (Aday and Andersen 1974). Logistic regression models were estimated for binary utilization measures and negative binomial count data regression models for number of office visits. We report odds ratios for the logistic models and incidence rate ratios for the negative binomial models in order to compare the magnitude of effects in each pair of regressions. We formally test whether the effect of each covariate is the same in the pairs of regressions. For example, does poor health increase the odds of an ED visit by the same magnitude whether using the household-reported or claims-based measure? Because coefficient estimates and the resulting odds and incidence rate ratios are interpretable as random variables, this is analogous to a pairwise t-test of the means of two (correlated) random variables. Following Hosmer and Lemeshow (2000), we performed these tests using the coefficient estimates.
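One way to implement such a pairwise test is to difference the two full-sample coefficient estimates and estimate the variance of the difference from replicate estimates computed on the same BRR samples, which automatically captures the correlation between the two regressions. This is a sketch consistent with, but not necessarily identical to, the authors' implementation; the numbers are hypothetical.

```python
import numpy as np

def pairwise_coef_test(b_claims, b_meps, reps_claims, reps_meps):
    """Test H0: a covariate has the same coefficient in the claims-based
    and MEPS-based regressions. The replicate estimates come from the
    SAME BRR replicate samples, so the variance of the difference
    reflects the correlation between the two estimates."""
    diff = b_meps - b_claims
    rep_diffs = np.asarray(reps_meps) - np.asarray(reps_claims)
    # BRR variance: mean squared deviation of replicate differences
    # from the full-sample difference
    var_diff = np.mean((rep_diffs - diff) ** 2)
    z = diff / np.sqrt(var_diff)
    return diff, z

# Hypothetical full-sample estimates (0.3 and 0.5) and four replicate
# estimates per regression (real BRR designs use many more replicates)
diff, z = pairwise_coef_test(0.3, 0.5,
                             [0.34, 0.26, 0.33, 0.27],
                             [0.55, 0.45, 0.52, 0.48])
```

A large |z| rejects the hypothesis that the covariate has the same effect in both regressions.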
All analyses used the adjusted MEPS sampling weights and the method of balanced repeated replications (BRR) to adjust for the stratified and clustered (at the PSU level) design of the MEPS survey (MEPS PUF 36BRR). This method also corrects for clustering at the household and individual level (Wolter 1985, pp. 111–21), and in particular the clustering that occurs because household-reported and claims-based measures of utilization were estimated using the same matched sample of beneficiaries. We implemented the BRR corrections for clustering at all levels using the built-in survey commands in Stata/MP 10.1 for means and ratios. We constructed equivalent BRR routines for the Lin's concordance and κ statistics and the pairwise tests of the utilization regression coefficients (MEPS PUF 36BRR; Wolter 1985, pp. 111–21). All methods were developed with and reviewed by senior statisticians associated with the MEPS.
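The BRR variance calculation itself is straightforward once replicate weights are in hand: re-estimate the statistic under each replicate weight and average the squared deviations from the full-sample estimate. The sketch below uses randomly perturbed weights purely for illustration; real BRR replicate weights come from a balanced half-sample design such as the MEPS 36-replicate weight file.

```python
import numpy as np

def brr_variance(stat_fn, y, full_w, replicate_w):
    """Balanced repeated replication: re-estimate the statistic under
    each replicate weight and average the squared deviations from the
    full-sample estimate."""
    theta = stat_fn(y, full_w)
    reps = np.array([stat_fn(y, w) for w in replicate_w])
    return theta, float(np.mean((reps - theta) ** 2))

def weighted_mean(y, w):
    return float(np.sum(w * y) / np.sum(w))

rng = np.random.default_rng(1)
y = rng.poisson(2.0, size=200).astype(float)   # e.g., office-visit counts
w = rng.uniform(1.0, 3.0, size=200)            # full-sample weights
# Toy replicate weights: real BRR weights follow a balanced
# (Hadamard-based) half-sample pattern, not random halving.
rep_w = [w * rng.choice([0.5, 1.5], size=200) for _ in range(36)]

est, var = brr_variance(weighted_mean, y, w, rep_w)
```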
Table 1 compares mean utilization based on claims and MEPS household reports for our matched sample. There were no differences in the proportion of beneficiaries reporting any hospital stay, with an agreement rate of 0.96 (95 percent CI: 0.96–0.97) and a κ statistic of 0.89 (95 percent CI: 0.86–0.91), indicating “almost perfect” agreement (Landis and Koch 1977). There was a small difference in the annual count of inpatient stays (0.31 in claims versus 0.29 from MEPS household reports, p=.043) and no statistically significant difference in total number of inpatient nights, both with high Lin's concordance statistics.
Annual counts of ED and office-based visits from the two sources for the matched sample accord less closely. MEPS households reported that 15 percent of Medicare beneficiaries had an ED visit covered by Medicare compared with 19 percent using ED claims (p<.001). The difference in overall rates is explained by a much higher proportion of false negatives than false positives: 5.2 percent had ED visits recorded in the claims but not reported in MEPS, 1.3 percent had ED visits reported in MEPS but not recorded in the claims, 13.3 percent had ED visits according to both sources, and 80.1 percent had no ED visit according to both sources. The κ statistic of 0.76 (95 percent CI: 0.73–0.80) indicates substantial agreement (Landis and Koch 1977) on the reporting of any ED use. However, the total annual number of ED visits reported by households in MEPS is one-third lower than claims (p<.001), with a Lin's concordance statistic of 0.51 (95 percent CI: 0.31–0.71).
The proportion of beneficiaries with at least one office-based visit covered by Medicare is almost identical (90 versus 89 percent, p=.21), but the κ statistic of 0.55 (95 percent CI: 0.49–0.62) suggests only moderate agreement (Landis and Koch 1977). The total number of office-based visits reported by MEPS respondents is 19 percent lower than in the CMS claims (p<.001), with a Lin's concordance statistic of 0.67 (95 percent CI: 0.59–0.74).
Differences in the accuracy of inpatient hospitalization reporting by sociodemographic characteristics were generally small and not statistically significant (data not shown). Reporting was less accurate when utilization was reported by a proxy residing in the household or by a nonresident proxy, and when interviews were conducted in a language other than English (data not shown).
However, ED and office-based visits showed greater differences in reporting by sociodemographic and interview characteristics as illustrated in Table 2 (any ED use) and Table 3 (number of office visits). Sociodemographic characteristics associated with better household reporting include higher income, higher educational attainment, better perceived health (ED visits only), and non-Hispanic white race/ethnicity (ED visits only). Activity and, unexpectedly, cognitive limitations were associated with better reporting of office visits. We note that the household respondent may not necessarily be the person with the cognitive limitation. Disabled beneficiaries under age 65 showed worse reporting of ED visits but better reporting of office visits compared with older Medicare beneficiaries. Having private insurance in addition to Medicare was associated with better reporting of both ED and office visits, while supplemental Medicaid coverage was associated with poorer reporting. Reporting also declined as utilization increased. The same level of reporting was observed for Medicare beneficiaries who reported their own use compared with Medicare beneficiaries for whom a proxy residing in the household (generally a spouse) reported for the family.
Some of the sociodemographic factors associated with better reporting are clearly related to each other, for example, income, education, and insurance coverage. Rather than using multivariate regressions to estimate their independent effects on reporting accuracy, we focus on the impact of reporting error in typical behavioral analyses of health care use. Table 4 reports the results of pairs of regressions of the determinants of health care utilization using the claims-based and MEPS household reported measures, respectively, as the dependent variable and the same set of covariates from the matched sample.
The first set of columns shows the logistic regression results for any ED use. The odds of an ED visit increased markedly with poorer self-perceived health status, with no statistically significant differences in this relationship whether the model was estimated with claims or household reported ED use. The effects of having Medicaid (higher odds) and being married (lower odds) were also the same in the two regressions, as were geographic differences. Race/ethnicity was not associated with ED use in either regression, but patterns for other covariates were less clear. Among beneficiaries aged 65 and older, the odds of having an ED visit show the same pattern with respect to age regardless of whether the claims-based or household-reported measures are used. However, disabled beneficiaries under age 65 had higher odds of an ED visit relative to Medicare beneficiaries aged 65–74 using the claims-based use measure, but there was no difference using household reported ED use. The effects of family income were also different in the two regressions.
The second set of columns shows the results of negative binomial regressions on annual number of office visits. Again, perceived health was strongly associated with office visits, showing similar patterns in both regressions. Activity limitation was also associated with increased ambulatory use, with a stronger effect in the MEPS regression (p=.007). As with ED use, the two regressions showed the same increasing effect of age on the number of visits among those aged 65 and older but differences in the estimated effects for disabled beneficiaries under age 65 relative to older beneficiaries (p=.019). Nonwhite race/ethnicity was associated with fewer office visits in both regressions, with no significant difference in the magnitudes. There were no statistically significant differences in the magnitudes of the effects of sex, region, and living in an MSA on office visits. The magnitudes of the effects of education, Medicaid coverage, and private coverage on ambulatory use were somewhat different but in the same direction in the two regressions.
Our comparisons of household-reported utilization to Medicare claims in the matched analytic sample found that household respondents in the MEPS were surprisingly good at reporting inpatient hospitalizations, but that ED and office visits were underreported for Medicare beneficiaries. Underreporting varied by income, education, health status, and race/ethnicity, but much of this variation, while statistically significant, was small in magnitude relative to the overall gap in reporting. That is, underreporting of the less salient ambulatory care events affected all groups to a substantial degree so that relative bias between groups is likely to be small in most applications. Results from the utilization regressions using the claims-based and household reported measures, respectively, generally bore this out. However, there were some differences in magnitudes and patterns in comparing disabled Medicare beneficiaries under age 65 to older beneficiaries and in comparisons by education, income, and activity limitations that might prove important in some applications.
The general underreporting of ED and office visits in MEPS is more problematic when aggregate comparisons of health care use are required, for example, in cross-national studies or in benchmarking MEPS to claims data. Users of MEPS data may need to consider adjustments for this underreporting for these purposes.
We note three potential limitations in our comparisons of MEPS household reporting to Medicare claims. First, while we matched a large sample of Medicare beneficiaries in MEPS to claims data, our matched sample itself is not nationally representative of Medicare beneficiaries. However, we note that our matched sample mirrors utilization by the full sample of Medicare beneficiaries in MEPS when using weights adjusted for differential matching. Second, we examine household reporting for Medicare beneficiaries only and our findings may not generalize to the reporting for other family members of Medicare beneficiaries or to the rest of the U.S. population residing in households with no Medicare beneficiaries. Elderly and disabled Medicare beneficiaries use substantially more health care services than other Americans (Ezzati-Rice, Kashihara, and Machlin 2004), and previous studies and results presented here suggest underreporting is greatest among high use groups (Cleary and Jette 1984; Roberts et al. 1996; Wallihan, Stump, and Callahan 1999; Ritter et al. 2001). To this extent, our findings may provide an upper bound estimate of underreporting for the full MEPS sample. The elderly and disabled Medicare populations differ in other important ways from the rest of the population, but it is unclear how this would affect reporting of health care use.
Our third concern is the potential misreporting of sources of payment in the MEPS. Medicare may be either incorrectly omitted or incorrectly identified as a source of payment for household-reported ambulatory care events, potentially affecting comparisons with Medicare claims that generally contain only records for Medicare-covered care. Medicare payments are identified for 84 percent of all ambulatory visits by Medicare beneficiaries in MEPS (either by the household, their providers in the follow-back survey, or by imputation). Our review of provider-reported data in MEPS shows this to be approximately correct on average, with errors in identifying Medicare as a payer going both directions.
The systematic underreporting of ED and ambulatory visits remains an ongoing concern for the MEPS. Underreporting is minimized to some extent by the relatively short recall period (5 months on average) compared with the 12-month recall periods common in many large-scale surveys such as AHEAD and the National Survey on Drug Use and Health, but this period is still longer than the 2-week reference period used for office visits in the National Health Interview Survey (NHIS). Early methodological studies found that reporting of less salient events such as office visits declined substantially after a few weeks (Madow 1967; Cohen and Burt 1985). The MEPS seeks to minimize the effects of longer recall periods by asking households to use calendars and keep diaries of all their health care use between interviews and to retrieve medical bills, explanation of benefits forms, and other documents during the actual interviews (Cohen et al. 1996/1997). Perhaps as a result of these efforts, a comparison of MEPS and NHIS found similar levels of reporting of ambulatory care services (Machlin et al. 2001). The combination of shorter recall periods and the efforts expended to enhance recall also probably explain why we found substantially higher rates of concordance of both hospital stays and office visits than those reported by Wolinsky et al. (2007) in their comparisons of AHEAD with Medicare claims. Finally, they may also explain why we found no substantive differences in reporting quality whether health care use was reported by the Medicare beneficiary or by another household member.
Studies of recall from a predecessor of the MEPS suggest that adding interview rounds, and thereby shortening recall periods, would not increase reporting at the margin (Cohen and Burt 1985). Thus, a more fruitful area for future methodological research with MEPS may be in developing better and more efficient mechanisms for households to track their health care use between interviews and ways to encourage more households to use these tools.
Joint Acknowledgment/Disclosure Statement: The authors wish to thank Virender Kumar, Brian Taiffe, Kitty Williams, Diana Wobus, and Pat Ward of Westat for the preparation of the Medicare claims and MEPS analytic files and Doris Lefkowitz (AHRQ) for arranging the Data Use Agreement (15816) with the CMS.
Disclaimers: The views expressed in this paper are those of the authors, and no official endorsement by the Agency for Healthcare Research and Quality or the Department of Health and Human Services is intended or should be inferred. Approval to conduct this study was granted by the Westat Institutional Review Board (IRB, FWA 5551) on October 11, 2005.
Additional supporting information may be found in the online version of this article:
Appendix SA1: Author Matrix.
Appendix S1: Validating Household Reports of Health Care Use in the Medical Expenditure Panel Survey: Technical Appendix.
Appendix S2: List of Specifications.
Appendix S3: List of Data File Contents.
Table S1. Logistic Regression Results on Probability of Matching to CMS Beneficiary Files, 2001–2003.
Please note: Wiley-Blackwell is not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.