Objective. To explain observed differences in patient outcomes across payer types using hospital discharge records. Specifically, we address two mechanisms: hospital-payer matching versus unobserved patient heterogeneity.
Data Sources. Florida's hospital discharge records (1996–2000) of major surgery patients with private health insurance between the ages of 18 and 65, Health Maintenance Organization (HMO) market penetration data, hospital systems data, and the Area Resource File.
Study Design. The dependent variable is occurrence of one or more in-hospital complications as identified by the Complication Screening Program. The key independent variable is patients' primary-payer type (HMO, Preferred Provider Organization, and fee-for-service). We estimate five different logistic regression models, each representing a different assumption about the underlying factors that confound the causal relationship between the payer type and the likelihood of experiencing complications.
Principal Findings. We find that the observed differences in complication rates across payer types are largely driven by unobserved differences in patient health, even after adjusting for case mix using available data elements in the discharge records.
Conclusions. Because of the limitations inherent to hospital discharge records, making quality comparisons in terms of patient outcomes is challenging. As such, any efforts to assess quality in such a manner must be carried out cautiously.
As quality of care has become an important health policy issue in the United States (Institute of Medicine 2001), public reporting of comparative quality information has received attention as a way to improve quality (Marshall et al. 2000). Because this approach to quality improvement requires accurate measurements and reliable data sources, interest has grown in using readily available data sources such as hospital discharge records to assess quality. The Healthcare Cost & Utilization Project and the Patient Safety Indicators (PSIs) developed by the Agency for Healthcare Research and Quality illustrate such efforts.
Administrative data such as hospital discharge records offer the potential to be an objective yet inexpensive source of quality data. On the other hand, the disadvantage lies precisely in the fact that such data are not collected for the explicit purpose of quality measurement. As a result, hospital discharge records typically lack key clinical information (e.g., test results) and individual characteristics (e.g., income, marital status) that would be important for case-mix adjustments (GAO 1994; Iezzoni et al. 1996).
In this paper, we attempt to explain the observed differences in patient outcomes among different payer types using hospital discharge records and, in the process, identify the difficulties of doing so. Specifically, we test two separate hypotheses. First, managed care organizations (MCO) may selectively contract with either low- or high-quality hospitals. Alternatively, there may be a systematic but unobserved difference in health status between the patients covered by MCO and those covered by fee-for-service (FFS) health insurers. We find that the observed differences in patient outcomes are driven largely by unobserved differences in patient health status that are not captured by the existing data elements within the hospital discharge records. We thus conclude that there is no significant difference in the inpatient quality of care received by MCO and FFS patients, and that hospital discharge records as currently available are unlikely to be an adequate data source for assessing quality of care in terms of inpatient treatment outcomes.
We propose three possible hypotheses that explain any observed differences in patient outcomes across payer types. First, there may be selective contracting between providers and payer types based on quality. For instance, lower-quality providers may have an incentive to seek out contracts with certain payers to attract patients; likewise, certain payers may have an incentive to contract with either high-quality providers (to ensure high quality of care to their enrollees) or low-quality providers (to secure lower costs at the expense of quality). The previous literature on this issue reports mixed results. Chernew, Hayward, and Scanlon (1996) found that MCO are more likely to contract with high-volume hospitals, where high volume is taken to indicate higher quality, whereas Escarce, Shea, and Chen (1997) found essentially the opposite. Given that these two studies focused on different markets (California and Florida, respectively), these findings may reflect different MCO behaviors in different markets.
The second hypothesis is that there may be unobserved health status differences among the patients covered by different payer types. While the conventional wisdom suggests that MCO selectively enroll healthier enrollees to reduce cost, there is evidence that MCO enrollees may in fact be sicker than their non-MCO counterparts (Schaefer and Reschovsky 2002). Huesch (2010) also suggests sicker Medicare patients may selectively enroll into Medicare-managed care plans, although the author is unable to find any direct evidence supporting that conclusion. Thus, it is possible that MCO patients may in fact be more likely to experience worse treatment outcomes than FFS patients.
There are several possible explanations for this. First, because MCO typically require lower cost sharing than FFS plans, those who are in worse health and are therefore more likely to incur cost sharing may choose MCO over FFS, while MCO are unable to adequately screen out such patients. Second, MCO enrollees may be more prone to complications because they had previously been denied needed care. Moreover, once their enrollees are admitted, MCO may also impose stricter restrictions on who receives surgery, which may lead to a selection of patients who are more prone to complications at the time of surgery. Unfortunately, the current literature offers little guidance as to which of these is the most likely explanation.
Lastly, providers may respond to MCO financial incentives by reducing both the quantity and the quality of treatments rendered, resulting in worse outcomes for their enrollees (Berenson 1986; Landon et al. 2008). If providers are able to easily alter their treatment patterns by each patient's payment source, this may explain the differences in patient outcomes across payer types within a hospital. However, in hospital settings where processes and procedures are increasingly standardized, it is difficult to see how such discrimination by payer types can actually occur. Therefore, in our analysis, we ignore this last hypothesis and focus on the first two as the most likely explanations.
The prior work that most closely resembles this paper is that of Haile and Stein (2002). Using the hospital discharge records from 39 California hospitals in 1996 and 1997, they report a significant association between payer types and in-hospital complication rates, with MCO patients appearing more likely to experience in-hospital complications than FFS patients. They conclude that this association is attributable to the variation in treatment patterns across hospitals rather than within each hospital. Their finding is therefore consistent with the hypothesis of selective contracting between MCO and low-quality hospitals. However, their empirical model does not explicitly account for the unobserved differences in patient health status. In this paper, we build upon this prior analysis by explicitly controlling for unobserved patient heterogeneity in our models.
The main source of data for this analysis was the Florida hospital inpatient discharge records collected by the Florida Agency for Health Care Administration. The data set consisted of patient-level data containing all the discharge records from approximately 250 of Florida's major hospitals. Of the available years of data, those from 1996 through 2000 were chosen and pooled. This yielded 1,712,873 discharges for which one or more major surgeries had taken place during the period. To obtain a more homogeneous population in terms of health status, we retained only those patients who had private Health Maintenance Organization (HMO), Preferred Provider Organization (PPO), or FFS as the primary payer and were between the ages of 18 and 64 (1,195,797 excluded).
To reduce the computational burden due to the large sample size, we took a 50 percent random sample in each hospital (258,538 excluded). This random sampling was done before dropping those observations with missing variables (19,890 excluded) to ensure that the random sampling would not be subject to any bias resulting from potential nonrandom missingness in the variables. The final sample size included 238,648 discharges from 175 hospitals located throughout the state of Florida.
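The sample flow described above can be checked with simple arithmetic. The sketch below uses only the counts reported in the text; the variable names are ours and purely illustrative.

```python
# Sample-flow check using only the counts reported in the text.
surgery_discharges = 1_712_873  # discharges with one or more major surgeries, 1996-2000
excluded_payer_age = 1_195_797  # not private HMO/PPO/FFS, or outside ages 18-64
excluded_sampling = 258_538     # dropped by the 50 percent within-hospital random sample
excluded_missing = 19_890       # dropped afterwards for missing variables
reported_final = 238_648        # final sample size stated in the text

eligible = surgery_discharges - excluded_payer_age
sampled = eligible - excluded_sampling
final = sampled - excluded_missing

print(eligible, sampled, final)
```

The number retained by the random sample equals the number excluded by it, as expected for a 50 percent sample, and the remaining count matches the reported final sample size.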
From 1996 through 2000, Florida's private managed care market experienced a period of change that closely reflects what had occurred in other states during the same period (Baumgarten 2003). Following a rapid period of growth between 1996 and 1998, HMO enrollment in Florida began to shrink in 2000. This was attributable to increases in HMO premiums, which prompted employers to switch to other managed care types such as PPO. Therefore, this was a period in which there were perceivable distinctions between HMO and PPO plans, at least from the purchasers' perspective. Furthermore, our study period captured the time during which HMO as well as PPO constituted a significant portion of the private health insurance market.
The key independent variable of interest was the primary payer indicator variable for each patient. In particular, those patients whose primary payer was identified as either HMO or PPO were considered to be covered by MCO. Those patients whose primary payer was listed as “commercial charge-based” insurance were considered to be covered by private FFS insurance. The data also contained information on the type of admission (e.g., emergency, urgent, or elective) and the source of admission (e.g., transfer from another hospital, long-term care facility, or emergency room) on each discharge, as well as up to 10 diagnosis and procedure codes.
We also had detailed information on hospital characteristics such as the bed size, ownership type (for-profit or nonprofit), teaching status, as well as the total number of admissions. In addition, using the hospital zip code, we merged in a set of geographical and market-level variables that have been known to be associated with hospital quality (Sari 2002; Dranove, Lindrooth, and White 2008), which included the county-level yearly managed care penetration data compiled by Baker (1995) as well as the Florida hospital systems data compiled by Dranove, Lindrooth, and White (2008) to calculate the Herfindahl-Hirschman index that captures the local hospital market concentration. For the purposes of this study, market was defined as county.
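As an illustration of the market concentration measure, the following is a minimal sketch of a Herfindahl-Hirschman index computed as the sum of squared market shares within a county. It assumes that shares are based on admissions and that hospitals belonging to the same system are pooled into a single competitor; both are our assumptions, and the hospital names, system labels, and admission counts are hypothetical.

```python
from collections import defaultdict


def herfindahl_index(admissions_by_hospital, system_of=None):
    """Return the HHI on a 0-1 scale: the sum of squared market shares."""
    system_of = system_of or {}
    totals = defaultdict(float)
    for hospital, admissions in admissions_by_hospital.items():
        # Hospitals in the same system are pooled into one competitor;
        # a hospital with no system entry competes on its own.
        totals[system_of.get(hospital, hospital)] += admissions
    market_total = sum(totals.values())
    return sum((n / market_total) ** 2 for n in totals.values())


# Hypothetical county with three hospitals.
county = {"A": 500, "B": 300, "C": 200}
hhi_independent = herfindahl_index(county)
hhi_with_system = herfindahl_index(county, {"B": "S1", "C": "S1"})
print(hhi_independent, hhi_with_system)
```

Pooling hospitals B and C into one system raises the index, reflecting a more concentrated market than hospital-level shares alone would suggest.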
In particular, the managed care penetration data provide an important control variable in our empirical model that captures the “spill over” effect of MCO. Baker (2003) suggests that an increased MCO enrollment within a market may induce overall changes in the treatment patterns and resource utilization across all patients in the area, affecting outcomes even among those not covered by MCO. This means that those hospitals located within the high-MCO penetration markets may be expected to have different patient outcomes than those located in the low-MCO penetration markets. Thus, our market-level managed care penetration rate variable allows us to control for this effect.
Our discharge data set contained limited information on patient demographics (age, gender, and race), including the zip code of each patient's residence. Most notably, the data set did not contain information on patient income, which may be problematic if patient income is a significant predictor of patients' health status and treatment outcomes. In prior studies, researchers used each patient's zip code to obtain the median income of that zip code to proxy for the patient's income (Shapiro et al. 1994; Haile and Stein 2002; Encinosa and Bernard 2005). We recognized two issues with this approach: First, this assumed that the median income of each zip code corresponds to the income of a particular patient, which may or may not be true. Second, given that patients' proximity to hospitals is a significant predictor of patients' choice of hospitals (Gowrisankaran and Town 1999), this approach was essentially analogous to obtaining the local market conditions of each hospital. As such, we merged in the relevant market condition variables—unemployment rate and percentage of population who are 65 years old or older—from the Area Resource File.
This study utilizes in-hospital complication rates among patients who had undergone major surgeries computed by the Complication Screening Program (CSP) as the primary outcome variable (Iezzoni et al. 1999b). CSP is similar to PSI in that it relies on commonly available hospital discharge data to identify cases in which one or more preventable complications may have potentially occurred due to medical errors caused by the provider. In this study, we use CSP instead of PSI in order to build upon the earlier work by Haile and Stein, who also utilized CSP to construct their outcome variables. CSP includes 28 complication categories based on ICD-9-CM codes. In our data set, we find that about 13 percent of the major-surgery patients had experienced at least one complication. Refer to Iezzoni et al. (1999a, b) for more detailed descriptions and the logic of CSP (see Appendix SA2 for the descriptive statistics of all the variables used in this analysis). Table 1 lists the types of complications considered under CSP and their frequencies in our data.
CSP provides its own algorithm for case-mix adjustment. First, CSP classifies patients into different risk pools. For the purposes of this study, we have selected those patients who had undergone major surgeries (Risk Pool A) based on the assumptions that (1) they are likely to be more homogeneous in terms of the severity of their conditions and that (2) they are more likely to yield meaningful variation in the complication rates relative to those in other, less severe, CSP risk pools. Furthermore, CSP suggests controlling for the Major Diagnostic Categories and presence of 13 chronic conditions when making comparisons.
In addition to patient age, gender, and race, we have included the numbers of diagnosis and procedure codes (up to 10 each) as covariates in our empirical models to further capture the severity of illness of each patient, even though CSP does not explicitly call for their inclusion. Clearly, these are endogenously determined—that is, experiencing complications would increase the number of diagnoses and procedure codes. We have estimated alternative models without these variables and found that our results did not change significantly.
We also include the admission types and sources as control variables that capture any differences in patient severity at the time of admission. We recognize that these variables may also capture the MCO influences and therefore are likely to be correlated with our key independent variable, patient's payer type. However, since the prior studies similar to this one (e.g., Haile and Stein 2002; Encinosa and Bernard 2005) have included them as case-mix control variables, we maintain the interpretation of these variables as case-mix control variables and include them in our empirical models.
To minimize the presence of confounding factors in our data, our analysis is restricted to major surgery patients between the ages of 18 and 64 covered by private health insurance, which was defined as private FFS, HMO, or PPO. We begin with the following basic specification:
where Cijt equals 1 if patient i undergoing a major surgery in hospital j at time t experienced one or more complications and 0 otherwise. MCijt is the categorical variable for patient i's payer type, with private FFS insurance being the referent category. Pijt is a vector of patient characteristics, and Hjt represents the vector of hospital characteristics. MKjt represents the market condition for hospital j at time t. Also, ωijt represents the unobserved patient heterogeneity that affects the patient's likelihood of complication in hospital j at time t, while λjt represents the unobserved quality of hospital j at time t. Because of the discrete nature of the dependent variable, (1) is estimated via a binary logistic regression model, with the assumption that the error term εijt is logistically distributed with mean 0. Our basic approach is to see how α1, the coefficient on payer type, changes under different model specifications.
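For reference, the specification described above can be sketched as follows. This is a reconstruction from the variable definitions in the text, not a reproduction of the published equation: only α1 is named in the text, so the remaining coefficient names (α0, α2, α3, α4) and the latent-index form with C* are assumptions.

```latex
C^{*}_{ijt} = \alpha_0 + \alpha_1 MC_{ijt} + \alpha_2' P_{ijt} + \alpha_3' H_{jt}
            + \alpha_4' MK_{jt} + \omega_{ijt} + \lambda_{jt} + \varepsilon_{ijt},
\qquad
C_{ijt} = \mathbf{1}\{ C^{*}_{ijt} > 0 \}
```

With εijt logistically distributed, this yields the usual logit probability of a complication conditional on the observed covariates and the two unobserved terms.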
To test whether the association between patients' payer types and complication rates is driven by selective contracting between hospitals and MCO based on hospital quality, we use hospital fixed effects (i.e., include dummy variables for 174 hospitals if the unobserved hospital quality λjt is time invariant) and hospital-year fixed effects (i.e., include dummy variables for 174 hospitals × 4 years if λjt is time variant) to capture λjt, while assuming that the unobserved patient heterogeneity, ωijt, is 0. Compared with a naïve model in which both ωijt and λjt are assumed to be 0, these fixed-effects models should yield an estimate of α1 that is closer to 0. That is, if MCO selectively contract with lower-quality hospitals, controlling for the unobserved hospital quality via hospital fixed effects should eliminate the positive and significant effect of the payer type variables.
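The fixed-effects terms amount to indicator variables with one reference category omitted, which is why 175 hospitals give 174 hospital dummies. The sketch below shows this construction on a toy data set; the helper name, hospital labels, and years are hypothetical, and it illustrates only how the dummies are built, not the estimation itself.

```python
def dummy_columns(labels, reference):
    """Build one 0/1 column per distinct label, omitting the reference category."""
    levels = sorted(set(labels) - {reference})
    return {level: [1 if lab == level else 0 for lab in labels] for level in levels}


# Hypothetical discharges: hospital and year of each record.
hospitals = ["h1", "h1", "h2", "h3", "h3"]
years = [1996, 1997, 1996, 1996, 1997]

# Hospital fixed effects: time-invariant unobserved hospital quality.
hospital_fe = dummy_columns(hospitals, reference="h1")

# Hospital-year fixed effects: one cell per hospital and year combination.
hospital_year = [f"{h}_{y}" for h, y in zip(hospitals, years)]
hospital_year_fe = dummy_columns(hospital_year, reference="h1_1996")

print(len(hospital_fe), len(hospital_year_fe))
```

Any covariate that is constant within a hospital (or within a hospital-year cell) is a linear combination of these columns, which is why such variables drop out of the corresponding fixed-effects model.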
It is important to note that this constitutes only an indirect method of testing the selective contracting hypothesis. In other words, we are unable to conclude from this part of our analysis whether selective contracting based on hospital quality does or does not occur. Rather, the purpose of this is to show whether selective contracting is a plausible explanation for the observed association between payer types and complication rates.
If the significant and positive effect remains even after controlling for the unobserved hospital quality via fixed effects, we then move on to testing whether the association is driven by the unobserved patient heterogeneity, ωijt. To do so, we take a more structural approach. More specifically, we specify a two-equation model as follows:
The positive and significant estimate of β1 arises from the cross-equation correlation between the error terms of (2) and (3); that is, if MCO patients are in worse health than their FFS counterparts, then they may be more likely to experience complications. To account for this correlation, the error terms are decomposed as follows:
Thus, the cross-equation correlation of the error terms arises from the presence of ωijt in both (2) and (3).
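One way to write a two-equation system of this kind is sketched below. It is an illustrative form consistent with the surrounding description rather than the published equations: β1, ρ1, ρ2, and ωijt appear in the text, whereas the γ coefficients, the remaining β coefficients, the idiosyncratic errors u and v, the superscripted error labels, and Z (standing for a hospital-level variable that enters only the complication equation) are assumed names.

```latex
\begin{aligned}
MC^{*}_{ijt} &= \gamma_0 + \gamma_1' P_{ijt} + \gamma_2' H_{jt} + \gamma_3' MK_{jt}
               + \varepsilon^{(2)}_{ijt}
               && \text{payer type, cf. (2)} \\
C^{*}_{ijt}  &= \beta_0 + \beta_1 MC_{ijt} + \beta_2' P_{ijt} + \beta_3' H_{jt}
               + \beta_4' MK_{jt} + \beta_5 Z_{jt} + \varepsilon^{(3)}_{ijt}
               && \text{complication, cf. (3)} \\
\varepsilon^{(2)}_{ijt} &= \rho_1\, \omega_{ijt} + u_{ijt},
\qquad
\varepsilon^{(3)}_{ijt} = \rho_2\, \omega_{ijt} + v_{ijt}
\end{aligned}
```

If u and v are independent of each other and of ωijt, the covariance between the two error terms is ρ1 ρ2 Var(ωijt), so the errors are correlated whenever both loadings are nonzero.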
Put differently, the difference between the naïve model as shown in Equation (1) and the two-equation model is the explicit recognition that MCO patients are different from FFS patients in terms of the unobserved characteristics that contribute to one's likelihood of experiencing complications. The naïve model implicitly assumes that patients across all payer types are essentially homogeneous in terms of their unobserved characteristics, whereas the two-equation model allows MCO patients to be systematically different from FFS patients.
To remove the effects of ωijt in our model, the DFM is used (Mroz 1999). Rather than imposing a stringent distributional assumption on ωijt, DFM assumes that the distribution may be approximated using a finite discrete set of mass points (Heckman and Singer 1984), or “factors.” Mechanically, DFM estimates (2) and (3) simultaneously via maximum likelihood estimation (MLE), conditional on each mass point representing a value of the random variable ωijt. Then, the conditional likelihood values are summed over all the mass points to obtain the unconditional likelihood values.
Intuitively, the mass points represent the unobserved “types” of individuals in the data set. That is, we assume that there are finite types of patients in our data set and that those who belong to the same type are similar to one another in terms of the unobserved health status. Thus, rather than assuming an arbitrary distribution of ωijt, which is unknown, we approximate the empirical distribution of the types in our data with discrete mass points. Conceptually, once we know the approximate empirical distribution of the types, we can then “integrate out” the unobserved ωijt from our equations by summing over all discrete values of ωijt.
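The "integrating out" step can be sketched as a weighted sum of conditional likelihoods over the mass points. The code below is a minimal illustration for a single patient, assuming that both equations are logits and that the factor enters each linear index through a loading (rho1, rho2); the function names and all numeric inputs are hypothetical.

```python
import math


def logistic(x):
    """Logistic CDF."""
    return 1.0 / (1.0 + math.exp(-x))


def bernoulli(y, p):
    """Probability of a binary outcome y given success probability p."""
    return p if y == 1 else 1.0 - p


def dfm_likelihood(mc, c, index_payer, index_comp, beta1, rho1, rho2,
                   mass_points, probs):
    """Unconditional likelihood of one patient's (payer type, complication) pair.

    index_payer and index_comp are the linear indices from observed covariates;
    mass_points and probs are the support points of the discrete factor and
    their probabilities.
    """
    total = 0.0
    for eta, pi in zip(mass_points, probs):
        # Conditional on the mass point, the two outcomes are independent.
        p_mc = logistic(index_payer + rho1 * eta)
        p_c = logistic(index_comp + beta1 * mc + rho2 * eta)
        total += pi * bernoulli(mc, p_mc) * bernoulli(c, p_c)
    return total


# Hypothetical values: three mass points with probabilities summing to one.
points, weights = [-1.0, 0.0, 1.5], [0.3, 0.5, 0.2]
lik = dfm_likelihood(1, 1, -0.2, -1.9, 0.1, 1.0, 0.8, points, weights)
print(lik)
```

The sample log-likelihood would be the sum of the logarithms of these contributions across patients, maximized jointly over the coefficients, the loadings, the mass points, and their probabilities.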
Although our DFM model is technically identified by its functional form, the identification is strengthened via an exclusion restriction. In our model, the complication rate among the publicly insured patients of each hospital serves as the exclusion restriction. That is, it appears as a right-hand-side variable in (3) to predict whether or not a privately insured patient experiences a complication, but it does not predict the payer type in (2). Publicly insured patients are defined as those covered by Medicare or Medicaid (mostly Medicare, as Medicaid patients accounted for <8 percent of the total hospital admissions in Florida during our study period [Encinosa and Bernard 2005]). To the extent that how a hospital treats its publicly insured patients is strongly correlated with how it treats its privately insured patients, this is a reasonable variable to include in our complication model. There is ample literature to suggest that this may indeed be a reasonable assumption (Needleman et al. 2003; Wennberg et al. 2004; Baker, Fisher, and Wennberg 2008).
The complication rate among the publicly insured patients in each hospital thus represents the underlying risk of complication experienced by all patients treated in that hospital—that is, unobserved hospital quality. To help identify our model, we further assume that the unobserved hospital quality is uncorrelated with the payer types of the privately insured patients. Therefore, this exclusion restriction would not hold if the hospital-MCO selective contracting hypothesis were true. That is, if MCO are able to observe the hospital quality and thus “steer” their enrollees to particular hospitals, the complication rates among the publicly insured patients would be correlated with the privately insured patients' payer types as well. However, we believe this is unlikely based on our findings presented below. In short, we find little evidence of MCO-hospital matching based on unobserved hospital quality, suggesting that our exclusion restriction is warranted.
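To make the construction of this variable concrete, the sketch below computes a hospital-level complication rate among publicly insured discharges and attaches it to each privately insured discharge from the same hospital. The record layout, field names, and values are hypothetical, and computing the rate per hospital rather than per hospital-year is an assumption.

```python
from collections import defaultdict

PUBLIC_PAYERS = {"Medicare", "Medicaid"}


def public_complication_rates(records):
    """Map each hospital to the complication rate among its publicly insured discharges."""
    counts = defaultdict(lambda: [0, 0])  # hospital -> [complications, discharges]
    for r in records:
        if r["payer"] in PUBLIC_PAYERS:
            counts[r["hospital"]][0] += r["complication"]
            counts[r["hospital"]][1] += 1
    return {h: k / n for h, (k, n) in counts.items()}


# Hypothetical discharge records.
records = [
    {"hospital": "A", "payer": "Medicare", "complication": 1},
    {"hospital": "A", "payer": "Medicare", "complication": 0},
    {"hospital": "A", "payer": "Medicaid", "complication": 0},
    {"hospital": "A", "payer": "Medicare", "complication": 0},
    {"hospital": "A", "payer": "HMO", "complication": 0},
    {"hospital": "B", "payer": "Medicare", "complication": 1},
    {"hospital": "B", "payer": "Medicare", "complication": 0},
    {"hospital": "B", "payer": "FFS", "complication": 1},
]

rates = public_complication_rates(records)

# Attach the hospital-level rate to privately insured discharges only;
# hospitals with no publicly insured discharges get None.
private = [
    dict(r, public_rate=rates.get(r["hospital"]))
    for r in records
    if r["payer"] not in PUBLIC_PAYERS
]
print(rates, len(private))
```

Because the rate is computed only from publicly insured patients, it varies across hospitals but not with an individual private patient's payer type.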
We use DFM rather than the traditional instrumental variable (IV) method because the functional form of DFM helps the identification of the parameters of interest and thus avoids the restrictive properties of IV. Using our current data source, it is difficult to identify a plausible IV that predicts patients' payer types but not directly the patients' complication rates. Instead, we have identified a variable that is correlated with the complication rates but not with the payer types. Thus, we are able to achieve identification using an exclusion restriction and the functional form of DFM rather than resorting to an implausible IV.
Since η represents a mass point in a discrete probability distribution, each ηm has an associated probability, denoted by πm, such that the πm sum to one across all mass points. Thus, the probability corresponding to each ηm is given by the following logit transformation:
Thus, we estimate ϕm instead of πm, which can be calculated from (8). Following Mroz's suggestion, ρ1 is constrained to one to determine the scale of the discrete factors. The estimated magnitude and sign of ρ2, therefore, indicate the extent to which the unobserved patient heterogeneity leads to the biased estimates of β1. The number of mass points is arbitrary. Deb (2001) suggests that three or four mass points are usually sufficient to approximate the discrete empirical distribution of ωijt. Thus, our strategy is to obtain separate model estimates with two, three, and four mass points and choose the one that yields the highest likelihood value—in this case, four mass points. For more details on DFM and the derivation of the likelihood function, see Appendix SA2.
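A logit transformation of this kind keeps each probability between zero and one and forces the probabilities to sum to one while the underlying parameters are left unconstrained. The sketch below assumes a multinomial-logit form in which the last mass point is the reference with its parameter fixed at zero; this normalization and the numeric values are assumptions and may differ from the exact form of (8).

```python
import math


def mass_point_probabilities(phis):
    """Map free parameters phi_1..phi_{M-1} to probabilities pi_1..pi_M.

    The last mass point is treated as the reference category with its
    parameter fixed at 0 (an assumed normalization).
    """
    exps = [math.exp(p) for p in phis] + [1.0]
    total = sum(exps)
    return [e / total for e in exps]


# Four mass points require three free parameters (hypothetical values).
probs = mass_point_probabilities([0.5, -1.0, 0.2])
print(probs, sum(probs))
```

Estimating the unconstrained parameters avoids having to impose the adding-up and non-negativity constraints directly during maximum likelihood estimation.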
Table 2 summarizes the coefficient estimates from the one-equation model described in (1).
The “Naïve Model” column shows the parameter estimates as obtained under the assumption that there are no unobserved patient or hospital characteristics. On the other hand, the hospital FE and hospital-year FE columns show the estimates obtained by controlling for the unobserved hospital quality. In these models, we omit those independent variables that do not vary over time (in the hospital FE model) and those that do not vary within each hospital in a given year (in the hospital-year FE model) because they are perfectly collinear with the fixed effect terms.
The naïve model estimates suggest that those patients covered by HMO and PPO are significantly more likely to experience in-hospital complications than their FFS counterparts. Interestingly, however, even after controlling for the unobserved hospital quality via the fixed effects, the coefficient estimates on the payer type variables remain positive and statistically significant, although their magnitudes shrink toward zero. This suggests that hospital-MCO selective contracting is unlikely to be the primary driver of the observed relationship between payer type and treatment outcome. Thus, we move on to test the alternative hypothesis that MCO patients may be more prone to complications than FFS patients.
For this step, we use DFM as described above. Before implementing DFM, however, we collapse the HMO and PPO indicator variables into a single binary variable that equals one if the patient is covered by HMO or PPO and zero if covered by FFS. This is done because (1) we fail to reject the hypothesis that the coefficients on the HMO and PPO dummy variables are equal (p-value=.35 in the Hospital-Year FE model), and (2) this greatly simplifies the DFM likelihood function to be estimated via MLE. As a comparison, we also estimate the two-equation model without DFM, which corresponds to the “Naïve Model” in Table 2. The results are shown in Tables 3 and 4 below.
In Table 3, we find that under DFM, the estimated coefficient on the payer type dummy variable is −0.17 and statistically insignificant. Compared with the estimated coefficient of about 0.1 under the naïve model (no DFM), the DFM estimate has the opposite sign and is no longer statistically significant. Thus, we conclude that, once unobserved patient heterogeneity is accounted for, those covered by HMO and PPO are no more or less likely to experience complications than their FFS counterparts. This is in contrast to what the estimates from the one-equation logistic regression in Table 2 imply and is consistent with the hypothesis that unobserved differences in health status, rather than payer type itself, make patients covered by HMO and PPO more prone to complications than their FFS counterparts.
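For readers who prefer odds ratios, the two coefficients quoted above can be exponentiated. The sketch below assumes both are log-odds coefficients from logit-type equations and treats the naïve estimate as exactly 0.10, although the text reports it only as "about 0.1".

```python
import math

naive_coef = 0.10   # "about 0.1" in the text; exact value assumed for illustration
dfm_coef = -0.17    # DFM estimate reported in the text

# Exponentiating a log-odds coefficient gives the corresponding odds ratio.
naive_or = math.exp(naive_coef)
dfm_or = math.exp(dfm_coef)
print(round(naive_or, 3), round(dfm_or, 3))
```

An odds ratio above one under the naïve model and below one under DFM restates the sign reversal; because the DFM estimate is statistically insignificant, its odds ratio is not distinguishable from one.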
Table 4 reinforces this conclusion. It suggests that MCO patients tend to be more likely to be chronically ill—metastatic cancer, chronic pulmonary disease, coronary artery disease, congestive heart failure, peripheral vascular disease, diabetes with end organ damage, and nutritional deficiencies—than FFS patients. Thus, MCO patients may have been in worse health conditions than their FFS counterparts even before they were admitted. The results also indicate that those who are covered by MCO are more likely to have fewer diagnosis codes and undergo fewer procedures. One possible explanation for this is that through the use of gatekeepers and practice guidelines, MCO may reduce the use of certain procedures by providers, without impacting the treatment outcomes of their enrollees.
The estimates of the DFM parameters as shown in Table 4 indicate the presence of the unobserved patient heterogeneity affecting the coefficient estimate on the payer type variable. In particular, the ρ2 parameter is positive and significant, suggesting that there is an upward bias on the payer type coefficient in the naïve model estimates. Again, this is consistent with the hypothesis that there are unobserved differences in the underlying health status between MCO and FFS patients.
Our results may be summarized as follows: MCO patients appeared to be more prone to complications than FFS patients after a risk adjustment based only on the observed characteristics. This association remained even after controlling for the unobserved hospital quality that may be correlated with both the patients' payer types and complications (i.e., selective contracting). However, in a model in which unobserved patient heterogeneity (e.g., health status) was explicitly accounted for, the association disappeared, suggesting that the observed association between payer types and complications was largely driven by the unobserved differences in patient health status.
There are two implications to our study: First, further study is necessary to examine why MCO patients appear to be in worse health condition than their FFS counterparts. One possible explanation is that the MCO patients are more prone to complications because utilization restrictions make it more difficult for them to access necessary care before hospital admission. Alternatively, patients with chronic conditions may be attracted to MCO because of the lower cost sharing. Our results appear to be more consistent with the latter hypothesis, as we have found that certain chronic conditions appear to be more prevalent among the MCO patients than among the FFS patients. This is in contrast to the commonly held perception that MCO “cherry pick” the healthier individuals. Thus, if this adverse selection of sicker individuals by MCO is true, then it suggests a further area to explore in terms of how well private insurers are able to assess the risk of each potential enrollee.
Second, our results illustrate the limitations of using hospital discharge data to make inferences on quality of care, particularly when quality is measured in terms of patient outcome. Even after extensively adjusting for patient case mix using the available data elements in the discharge records, we find that there are likely to be substantial unobserved differences in patient health status that confound the relationship between payer types and outcomes. As other researchers have also pointed out (Wray et al. 1997), adequately capturing case mix from discharge records is difficult. Researchers who are aware of this issue have offered several methods of addressing it (e.g., Elixhauser et al. 1998; Geweke, Gowrisankaran, and Town 2003), but no single case-mix adjustment method has been universally accepted. Our paper demonstrates that health care administrators and researchers must be cautious when using such data to make quality comparisons and, if feasible, should consider augmenting the administrative data with relevant clinical data that can help reduce the unobserved differences in patients' underlying health conditions.
Joint Acknowledgment/Disclosure Statement: None.
Additional supporting information may be found in the online version of this article:
Appendix SA1: Author Matrix.
Appendix SA2: Derivation of DFM Likelihood Function.
Table S1: Descriptive Statistics (Variable Mean and Standard Deviation).