Performance measurement at the provider group level is increasingly advocated, but different methods for selecting patients when calculating provider group performance have received little evaluation.
We compared 2 currently used methods according to characteristics of the patients selected and impact on performance estimates.
We analyzed Medicare claims data for fee-for-service beneficiaries with diabetes ever seen at an academic multispecialty physician group in 2003-2004. We examined sample size, socio-demographics, clinical characteristics, and receipt of recommended diabetes monitoring in 2004 for the groups of patients selected using 2 methods implemented in large-scale performance initiatives: the Plurality Provider Algorithm and the Diabetes Care Home method. We examined differences among discordantly assigned patients to determine evidence for differential selection regarding these measures.
Fewer patients were selected under the Diabetes Care Home method (n=3,558) than the Plurality Provider Algorithm (n=4,859). Compared to the Plurality Provider Algorithm, the Diabetes Care Home method preferentially selected patients who were female, older, not entitled because of disability, more likely to have hypertension, and less likely to have kidney disease or peripheral vascular disease, and who had lower levels of predicted utilization. Diabetes performance was higher under Diabetes Care Home, with 67% vs. 58% receiving ≥2 A1c tests, 70% vs. 65% receiving ≥1 LDL test, and 38% vs. 37% receiving an eye exam.
The method used to select patients when calculating provider group performance may affect patient case-mix and estimated performance levels, and warrants careful consideration when comparing performance estimates.
Performance measurement and pay-for-performance initiatives have been increasingly implemented in an effort to improve the quality of health care in the United States.1 In recent years, initiatives targeted at provider groups comprising more than 1 physician, rather than individual physicians, have been advocated.2,3 Group-level performance measurement has several relative advantages, including larger samples, ability to assess a broad scope of measures across diseases and specialties, and better alignment with goals for improving shared accountability and coordination of care among providers.2-4 Recent examples of such group-level performance initiatives include Medicare’s Physician Group Practice (PGP) Demonstration project,5 Accountable Care Organizations,3 and the Group Practice Reporting Option (GPRO) within Medicare’s Physician Quality Reporting Initiative (PQRI).6
In order to implement initiatives such as these, the population of patients for whom the group is responsible (the “denominator” of performance measures) must first be determined. Most group-level performance initiatives, including those listed above, rely on the use of administrative claims or billing data that capture patient visit patterns to identify the population of interest (e.g., patients with diabetes) and attribute them to a provider group. However, because care for a given patient is often dispersed across multiple provider groups,7 it can be difficult to determine from these data which group should be held accountable for that patient. Patient attribution may also pose a challenge for organizations wishing to self-monitor performance, as they must determine the population for whom they are primarily responsible using only their internal visit records.
Accurately attributing responsibility for patients to provider groups may be crucial for performance measurement initiatives to receive acceptance by these groups, but methods for doing so have received little critical evaluation. In the 2 studies we identified, both simulated estimates of cancer screening rates in community health centers and diabetes performance measures varied substantially depending on the restrictiveness of the criteria used to identify eligible patients.8,9 However, it is unknown how their findings extend to patient selection methods used in current performance initiatives. We address this gap by applying 2 methods, which are currently implemented in large initiatives by Medicare and the Wisconsin Collaborative for Healthcare Quality (WCHQ), to claims and enrollment data for Medicare fee-for-service beneficiaries with diabetes, and comparing the resulting groups of selected patients in terms of socio-demographics and levels of diabetes performance.
The Institutional Review Board at the first author’s institution approved this study with a waiver of HIPAA authorization. The participating provider group consisted of a large, Midwestern, academic, multispecialty provider group that serves as a statewide specialty care referral center as well as a major source of primary care for the local metropolitan area. We obtained all inpatient, skilled nursing facility, outpatient, and carrier claims for Medicare beneficiaries with at least one claim of any type in 2003-2004 associated with a Unique Physician Identification Number (UPIN) for a physician within the provider group, including claims for services provided by all providers (within and outside of the provider group). To identify the subset of patients with diabetes, we used an established algorithm (sensitivity=73.4; specificity=97.6)10 requiring patients to have at least 1 inpatient or skilled nursing facility claim or more than 1 carrier claim with an ICD-9-CM code of 250.xx, 357.2, 362.0x, 366.41, or 648.0x in any position. Beneficiaries with railroad benefits or lacking continuous Part A and B coverage in 2003-2004 were excluded.
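The case definition described above can be illustrated with a minimal sketch. It assumes diagnosis codes are stored as dotted ICD-9-CM strings and that each claim is a `(claim_type, codes)` pair; the function names and the data layout are hypothetical, and the published algorithm may impose further conditions (such as date requirements) that are not stated here.

```python
import re

# ICD-9-CM patterns quoted in the text: 250.xx, 357.2, 362.0x, 366.41, 648.0x.
# Assumption: codes are dotted strings such as "250.00".
DIABETES_CODE = re.compile(r"250\.\d\d|357\.2|362\.0\d|366\.41|648\.0\d")


def has_diabetes_code(codes):
    """True if any diagnosis code on the claim, in any position, matches."""
    return any(DIABETES_CODE.fullmatch(code) for code in codes)


def meets_diabetes_definition(claims):
    """Apply the rule as worded in the text.

    claims: iterable of (claim_type, codes) pairs, where claim_type is a
    hypothetical label such as "inpatient", "snf", "outpatient", or "carrier".
    Qualifies with at least 1 inpatient or skilled nursing facility claim,
    or more than 1 carrier claim, carrying a diabetes code.
    """
    facility_claims = 0
    carrier_claims = 0
    for claim_type, codes in claims:
        if not has_diabetes_code(codes):
            continue
        if claim_type in ("inpatient", "snf"):
            facility_claims += 1
        elif claim_type == "carrier":
            carrier_claims += 1
    return facility_claims >= 1 or carrier_claims > 1


# Illustrative (made-up) beneficiaries:
one_carrier = [("carrier", ["401.9", "250.00"])]
two_carrier = one_carrier + [("carrier", ["362.01"])]
one_inpatient = [("inpatient", ["428.0", "250.40"])]
```

Under this sketch, a single carrier claim does not qualify a beneficiary, whereas two carrier claims or one inpatient claim with a listed code does.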
We used 2003-2004 carrier claims to determine whether beneficiaries would be assigned to the provider group in 2004 under each of 2 methods: the Plurality Provider Algorithm5,7 (PPA), used by Medicare in several performance initiatives, and the Diabetes Care Home (DCH) method, which is based on methodology from WCHQ, a statewide public reporting initiative. This method is property of WCHQ and is used herein with their permission. Both methods assign patients based on patterns of outpatient, face-to-face Evaluation & Management (E&M) visits to the provider group, identified using Federal Employer Identification Numbers, as reported in professional service claims (see Supplemental Digital Content 1 for methods used to identify face-to-face E&M visit). Details on the 2 selection methods are shown in Table 1.
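To illustrate the general idea of plurality-based attribution, the sketch below assigns a patient to the group that supplied the largest number of qualifying E&M visits. It is not the PPA or DCH specification: the actual eligibility criteria for each method are those in Table 1, and the function name, the input format, and the handling of ties and empty visit histories (returning `None`) are assumptions made for illustration only.

```python
from collections import Counter


def plurality_group(visit_group_ids):
    """Return the group with the most visits, or None if there is no clear plurality.

    visit_group_ids: one group identifier per qualifying E&M visit
    (hypothetical input; identifiers stand in for group tax IDs).
    Ties and empty histories return None because the tie-breaking rule
    is not described in the text.
    """
    counts = Counter(visit_group_ids)
    if not counts:
        return None
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # top two groups tied
    return ranked[0][0]


# Illustrative (made-up) visit histories:
mostly_group_a = ["A", "A", "B", "C", "A"]
tied_history = ["A", "B"]
```

A plurality rule of this kind needs only the largest share of visits, not a majority, which is one reason it can attribute patients whose care is spread across several groups.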
We used the Medicare denominator file to determine patient age, sex, race/ethnicity, original entitlement due to disability, and the state buy-in indicator. We used 2003-2004 claims to characterize patients’ clinical complexity using established algorithms for complications and co-morbidities that are common to diabetes and may cue or distract from its appropriate management,11 including lower extremity ulcers, amputation, eye diseases, and peripheral vascular disease;12 hypertension, obesity, and depression;13,14 dementia;15 congestive heart failure;16 and chronic kidney disease.17 We used the end-stage renal disease indicator in the denominator file to further classify kidney disease as end-stage versus not. As a measure of overall clinical complexity, we calculated the Hierarchical Condition Categories (HCC) community risk score.18 Finally, we constructed 3 established measures19-21 of diabetes performance in 2004: at least 2 HbA1c tests, at least 1 LDL test, and at least 1 eye exam (see table, Supplemental Digital Content 2).20
We examined the frequency of patients assigned under at least 1 method versus neither method, as well as each individual method. We also determined the patients simultaneously assigned under both methods (concordantly assigned individuals) as well as those assigned under the PPA only or the DCH only (discordantly assigned individuals). For each subgroup, we generated descriptive statistics for sociodemographics, clinical characteristics, and diabetes performance (n and % for categorical variables; median and interquartile range for continuous variables, which were not normally distributed). To determine whether the PPA and DCH method differentially selected patients, we conducted chi-square and Wilcoxon-Mann-Whitney tests for differences among discordantly assigned patients.
A total of 22,778 continuously enrolled Medicare fee-for-service beneficiaries with at least 1 encounter in any setting with the provider group in 2003-2004 were identified as having diabetes. As shown in Figure 1, 5,124 (23%) were assigned to the group under at least 1 method for 2004, with 4,859 (21%) assigned under PPA and 3,558 (16%) assigned under DCH. The analysis of concordantly and discordantly assigned individuals revealed 3,293 patients assigned under both methods, 1,566 assigned under PPA only, and 265 assigned under DCH only.
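The relationship between the concordant and discordant counts and the per-method totals can be checked with simple arithmetic. The sketch below uses only the counts reported in the paragraph above; the variable names are illustrative.

```python
# Counts reported in the text.
total_with_diabetes = 22778
both_methods = 3293   # concordantly assigned
ppa_only = 1566       # discordant: PPA only
dch_only = 265        # discordant: DCH only

# Per-method totals are the concordant group plus each discordant group.
ppa_total = both_methods + ppa_only
dch_total = both_methods + dch_only
either_method = both_methods + ppa_only + dch_only

# Share of the full diabetes sample assigned under each method.
ppa_share = 100 * ppa_total / total_with_diabetes
dch_share = 100 * dch_total / total_with_diabetes

# Relative size of the PPA panel compared with the DCH panel.
ppa_to_dch_ratio = ppa_total / dch_total

print(f"PPA: {ppa_total} ({ppa_share:.0f}%)")
print(f"DCH: {dch_total} ({dch_share:.0f}%)")
print(f"Assigned under at least 1 method: {either_method}")
print(f"PPA/DCH panel size ratio: {ppa_to_dch_ratio:.2f}")
```

Because only 265 patients were assigned under DCH alone, the DCH panel is almost entirely a subset of the PPA panel, and most of the difference between the 2 methods comes from the 1,566 patients assigned under PPA only.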
Table 2 shows characteristics for the full sample of patients with diabetes, by whether or not patients were assigned under at least 1 method. Unassigned patients were significantly older and more likely to have specific co-morbidities than assigned patients. Among patients who were assigned under at least 1 method, slightly more than half were female, 95% were white, and mean age was 71 years; chronic kidney disease, congestive heart failure, and peripheral vascular disease affected 16%, 18%, and 31%, respectively. The median HCC risk score was 1.26, translating to total predicted health care costs 26% higher than the average community-dwelling Medicare beneficiary.
Table 3 shows that, compared to patients selected under PPA only, those assigned under DCH only were older, more likely to be female and to have hypertension, less likely to be disabled or to have kidney disease or peripheral vascular disease, and had lower HCC scores. Examination of the overall estimates for these factors in the entire PPA (n=4,859) and DCH subgroups (n=3,558) revealed a similar pattern of differences, although the overall magnitude of the differences was fairly small.
Across all measures, performance estimates were higher under DCH than PPA (Table 4), with 67% versus 58% receiving at least 2 HbA1c tests, 70% versus 65% receiving at least 1 LDL test, and 38% versus 37% of patients receiving an eye exam. Chi-square tests for discordantly assigned patients confirmed that the 2 methods differentially selected patients with regard to their likelihood of meeting diabetes performance standards.
Despite increasing implementation of performance measurement targeted at provider groups rather than individual providers, our study is the first to directly compare methods currently in widespread use for selecting patients to be included in group-level performance estimates. Notably, we found that the PPA, which has been implemented in Medicare performance initiatives and has been proposed for use when extending the Medicare PGP Demonstration to Accountable Care Organizations,3 selected approximately 1/3 more patients when compared to the DCH method, which has been implemented in a state-level voluntary public reporting initiative. The patients selected only by the PPA represented those who received the bulk of their outpatient care from the group, but had fewer than 2 diabetes-coded visits or fewer than 2 primary care/endocrinology visits needed to qualify them for inclusion under DCH. Given that this study’s group practice serves as a statewide referral center for specialty care, it is not surprising that so many patients with this visit pattern were identified, or that so many patients overall were not assigned to the group under either method. Our results also provide evidence that using these 2 different sets of visit pattern criteria results in substantive differences in characteristics of patients who are selected. When compared to the DCH method, the PPA preferentially selected patients whose HCC scores indicated greater overall clinical complexity and who were more likely to have complications of diabetes (e.g., kidney disease, peripheral vascular disease).
The different visit pattern criteria used by the 2 methods also produced meaningful differences in diabetes performance estimates. Performance levels appeared lower under the PPA, due to markedly lower rates of testing among patients who were only selected under the PPA method compared to those who were selected by the DCH method solely or in addition to the PPA method. Given that differences in the patient characteristics (i.e., greater co-morbidity) and visit patterns described above are known to affect patients’ receipt of recommended diabetes care22-27 and estimates of physician performance,28 these findings are not surprising, but have major implications for comparing performance estimates across groups. Although our focus on a single group practice did not allow for a direct test of whether groups’ relative ranking on performance would be affected by the choice of patient selection method, our results coupled with what is known about the effect of patient case-mix on performance estimates would suggest this to be the case. Hong and colleagues recently reported that relative rankings of individual providers on performance were substantially impacted by the characteristics of patients in their panels.28 Thus, if the characteristics of patient populations are very different across provider groups, use of different selection methods may have the effect of differentially raising or lowering performance estimates across practices. In particular, practices with more clinically complex patient populations with extensive specialty care needs (similar to this study’s practice) may experience marked decreases in apparent A1c testing rates when the PPA method is used instead of the DCH. 
Furthermore, our results highlight the need to consider which patient selection method is used when interpreting absolute levels of performance and defining threshold and improvement targets in the design of pay-for-performance systems, and may help provider groups understand why they appear to have different levels of performance when different metrics are used.
It is important to note that our study used Medicare fee-for-service claims data from 2003-2004 to select patients for assignment to a single large, academic multispecialty provider group and it is unknown how results may differ for other groups or more recent years. In addition, our study was not intended to result in a recommendation for one method of selection over another. The 2 methods have different conceptual and practical advantages and disadvantages (Table 1) that may be more or less relevant for different health plans and provider groups. They may also differ in terms of their acceptability and face validity to providers – an important consideration when designing performance initiatives – although more research on how providers view the 2 methods and resulting patient panels is needed.
Our study makes an important contribution to the future design of performance measurement initiatives by demonstrating that performance estimates for provider groups may be meaningfully affected by 2 common methods used to attribute responsibility for patients. Our results suggest that it will be important for systems to carefully consider their method of patient selection in relation to their choice of threshold and improvement targets for various performance measures and when comparing across practices.
Supplemental Digital Content 1.doc. Method of Identifying Outpatient, Face-to-Face Evaluation and Management (E&M) Visits to Physicians in Medicare Carrier Claims.
Supplemental Digital Content 2.doc. CPT Codes Used to Identify HbA1c tests, LDL cholesterol tests, and dilated eye exams.
Funding Support: Funding for this project was provided by the Agency for Healthcare Research and Quality, grant numbers R21 HS017646 and R01 HS018368. Additional support was provided by the Health Innovation Program; the Community-Academic Partnerships core of the University of Wisconsin Institute for Clinical and Translational Research (UW ICTR), grant 1UL1RR025011 from the Clinical and Translational Science Award (CTSA) program of the National Center for Research Resources, National Institutes of Health; and the UW School of Medicine and Public Health from the Wisconsin Partnership Program. Christine Everett was supported by the Agency for Healthcare Research and Quality (AHRQ)/National Research Service Award (NRSA) T-32 Institutional Training Program Grant Number: 5-T32-HS00083. The UW Health Innovation Program provided assistance with IRB application, Medicare data management, variable creation, and manuscript formatting. No other funding source had a role in the design or conduct; data collection, management, analysis or interpretation; or preparation, review, or approval of the manuscript. A poster based on this research was presented at the AcademyHealth Annual Research Meeting on Monday, June 29, 2009 in Chicago, IL.
Carolyn T. Thorpe, Department of Population Health Sciences, University of Wisconsin – Madison, 800 University Bay Dr., Room 210-16, Madison, WI 53705. Phone: 608-262-4051, Fax: 888-263-2864, Email: cthorpe/at/wisc.edu.
Grace E. Flood, Department of Population Health Sciences, University of Wisconsin – Madison, 800 University Bay Dr., Room 210-38, Madison, WI 53705. Phone: 608-262-3103, Fax: 888-263-2864, Email: geflood/at/wisc.edu.
Sally A. Kraft, Department of Medicine, University of Wisconsin – Madison, 7974 UW Health Court, Middleton, WI 53562. Phone: 608-821-4900, Fax: 608-824-2237, Email: sakraft/at/wisc.edu.
Christine M. Everett, Department of Population Health Sciences, University of Wisconsin – Madison, 800 University Bay Dr., Room 210-38, Madison, WI 53705. Phone: 608-263-4416, Fax: 888-263-2864, Email: cmeverett/at/wisc.edu.
Maureen A. Smith, Department of Population Health Sciences, University of Wisconsin – Madison, 800 University Bay Dr., Room 210-31, Madison, WI 53705. Phone: 608-262-4802, Fax: 888-263-2864, Email: maureensmith/at/wisc.edu.