Just as treatment guidelines for diabetes care were at the forefront of medical guideline development (1), diabetes has been a prominent focus of performance measurement and quality improvement initiatives for well over a decade. However, the constraints of pre-electronic health record (EHR) data systems have consistently limited the clinical scope and sophistication of current diabetes quality measures. The U.S. health care system is nearing a tipping point in the use of more sophisticated EHR-based information systems, and widespread use of these systems will usher in a new era for diabetes quality measurement. New information system capabilities will enable improvements to existing measures and the development of much more sophisticated measures that can accommodate personalized clinical goals, patient preferences, and patient-reported data. Both guidelines and measures can thus move toward personalization based on sophisticated assessment of the risks and benefits of certain clinical actions for a given patient at a given clinical encounter.
To facilitate discussion of the future of performance measurement in diabetes in this era of rapid transition to EHRs, the American Diabetes Association (ADA) convened a consensus development conference in December 2010. Participating experts identified and discussed the following questions:
- 1. What is the evidence that measuring quality, benchmarking, and providing feedback or incentives improve diabetes care?
- 2. What are the limitations, burdens, and consequences (intended or unintended) of diabetes quality measures as currently structured?
- 3. What should be the role of shared decision making, patient preferences, and patient-reported data in quality measures?
- 4. What is the future of quality measurement in diabetes?
- 5. How can quality monitoring be integrated into population surveillance efforts?
This report summarizes the consensus meeting, and represents the expert opinion of its authors and not the official position of the ADA or any other participating organization.
1. What is the evidence that measuring quality, benchmarking, and providing feedback or incentives improve diabetes care?
The first national effort to develop a set of performance measures for diabetes was convened by the Centers for Medicare & Medicaid Services (CMS), the National Committee for Quality Assurance (NCQA), and the ADA in 1995 (2). Evidence showed that complications of diabetes can be reduced by controlling hemoglobin A1c (A1C), blood pressure, and LDL cholesterol, but health system performance was suboptimal and highly variable (2–4). The Diabetes Quality Improvement Project (DQIP) groups specified a set of eight process and outcome measures that were measured at the individual patient level and aggregated across the patient samples of health plans, physicians, or other units. The DQIP measures were specified for use in the Healthcare Effectiveness Data and Information Set (HEDIS) measure set established by NCQA and subsequently widely adopted for performance assessment in commercial, Medicare, and Medicaid health plans. Other health plans and some government agencies, such as the Veterans Health Administration (VHA) and CMS, also adopted the core measure set for use at the physician or group practice level. Most of the measures were subsequently endorsed by the National Quality Forum (NQF) and are included in payment programs such as the Physician Quality Reporting System (PQRS) and Meaningful Use. Simple processes, such as periodic testing for A1C, LDL cholesterol, or microalbuminuria, or periodic retinal examination, are relatively easy to identify in either medical records or health care claims. Periodic performance of these processes is appropriate for nearly all patients, with the possible exception of very elderly patients for whom limited life span may preclude the need to screen for complications if they have not already appeared.
During the past decade, the proportion of patients receiving these processes of care has increased across a range of settings (5–7). For several measures, including A1C, LDL cholesterol, and microalbuminuria testing, proportions are approaching 90%, at least in commercial health maintenance organizations and Veterans Administration populations. However, quality-of-care improvements associated with performance measurement do not seem to generalize to aspects of care beyond diabetes. For example, the U.S. Department of Veterans Affairs, which implemented aggressive measurement and quality improvement strategies in the 1990s, has been shown to have better quality of diabetes care than the private sector, but has care comparable to the private sector in clinical domains without performance measures (8,9).
Several studies demonstrate that although it is relatively easy to improve performance for simple processes of care, improvements in important intermediate outcomes such as A1C, blood pressure, and LDL cholesterol do not necessarily follow (10,11). Some care systems with intense disease management programs have improved processes of care but not necessarily intermediate outcomes (12), and correlations between system-level performance for processes of care and for intermediate outcomes such as risk factor control are weak (13). This disconnect between processes and outcomes of care raises the question of whether process measures are valid indicators of quality and points to the need to emphasize intermediate outcomes or to develop alternative process indicators more closely linked to intermediate outcomes of care.
Indicators of intermediate outcomes of care (control of blood pressure, A1C, and LDL cholesterol) were also among the original DQIP measures and have been included in most subsequent diabetes quality measurement sets. Unlike simple process measures, adequate control of these risk factors is related to improved clinical outcomes including cardiovascular events, microvascular complications, and mortality. Assuming that safe, evidence-based treatments are used (14), it is likely that populations with better risk factor control or greater improvements in risk factor control over time are receiving better quality care and are benefiting clinically. In fact, as process measures and measures of risk factor control have improved in the U.S., a concomitant reduction in several major adverse outcomes (kidney failure, amputation) has been documented among the population with diabetes (6,15–17).
Are these measurable improvements due, at least in part, to initiatives related to performance measurement, quality assessment, and quality improvement? A number of small randomized controlled trials of performance measurement suggest that measurement and feedback can lead to improvements in some quality indicators. This effect is more evident with process measures than with risk factor control, and observed improvements generally wane over time, especially once feedback ceases (18). Pay for Performance (PfP) initiatives have been implemented in multiple systems, and their effect on quality of care remains controversial (19,20). “Real world” data suggest that the aggressive U.K. PfP initiative markedly improved control of glucose and cholesterol for several years after its implementation (21). However, once targets were reached, further improvements in quality of diabetes care slowed, and quality of care for conditions with no incentives declined. In the Kaiser Permanente system, financial incentives for diabetic retinopathy screening increased screening rates modestly, from 85 to 88%. However, when financial incentives and other care supports were removed, retinopathy screening rates fell by 3% per year to levels below baseline (80%) (22). In a cluster-randomized trial, incentives and feedback linked to EHR-based diabetes clinical decision support modestly improved glucose and blood pressure control, but effects waned after incentives and feedback were removed, even though the clinical decision support continued (23).
In summary, various combinations of performance measurement, feedback to clinicians, quality improvement programs, public reporting, and financial incentives have been associated with sustained improvements in some aspects of diabetes care in many settings. These strategies tend to change specific aspects of care that are being measured and/or paid for, and improvements, which are difficult to maintain, do not necessarily extend to other aspects of care.
2. What are the limitations, burdens, and consequences (intended or unintended) of diabetes quality measures as currently structured?
Dichotomous quality measures based on thresholds for continuous variables.
Research now demonstrates that sole reliance on measuring and reporting simple processes is unlikely to have a substantial impact on patient outcomes, and improvement in process measures can no longer be taken as evidence that quality of care has improved (24). Performance measures based on control of risk factors such as A1C, blood pressure, and LDL cholesterol are appealing because these risk factors predict clinical outcomes, but this approach presents measurement complexities and challenges. Control of these risk factors is influenced not only by provider actions, but also by factors such as patient behaviors, comorbidity, and concerns about medication safety and cost. Current performance measures identify thresholds for A1C, blood pressure, and LDL cholesterol control and usually dichotomize performance based on these threshold levels. The use of thresholds is easily understood and simple to report, but selection of an appropriate threshold is difficult, especially in light of recent clinical trial results and subsequent guideline recommendations to individualize clinical goals for A1C and blood pressure (25–29).
Dichotomous threshold-based measures suggest that all patients above the threshold need additional pharmacologic or lifestyle intervention. Setting high threshold goals (such as A1C <9%, or systolic blood pressure [sBP] <160 mmHg) reduces poor quality care and can be appropriately applied to all patients eligible for the measure. However, in most care systems, only a small fraction of patients will fail to meet such a high threshold. As threshold goals are lowered, an increasing proportion of patients require additional treatment to reach the more stringent threshold goals. However, the marginal benefits of increased treatment diminish as patients approach the goal, while the likelihood of treatment-related side effects and costs of treatment typically increase.
If the risks associated with more intensive treatment are substantive, then setting low thresholds for accountability measures (such as A1C <7% or sBP <130 mmHg) may actually do more harm than good for many patients—clearly an undesirable situation (30). Lack of benefit or unintended harm is possible, especially for those above the accountability threshold but already on high dose therapy, those with terminal illness or limited life expectancy, and those susceptible to serious side effects of aggressive therapy such as hypoglycemia or hypotension (31–34). In the past, some guidelines have adopted blood pressure or A1C goals more stringent than those validated in clinical trials (25,35). While low blood pressure or A1C levels may benefit some subsets of patients, incorporating low threshold goals in accountability measures is problematic (36). Finally, aiming for stringent targets in every patient ignores patient preferences (37).
Since 2008, many diabetes clinical guidelines have recommended individualization of A1C and blood pressure goals. In response, some quality measures now include a complex set of exemptions and exclusions that may remain challenging to implement even when EHR data are available. Alternative approaches discussed below are to increase the accountability threshold to a value that is appropriate for nearly all patients, to move from goal-based to risk-based measures, or to implement new “clinical action” measures, which are more tightly linked to outcomes than some current measures.
Composite diabetes quality measures.
Composite performance scores have been widely adopted and may improve the reliability of performance measurement and ranking compared with single measures (38–40). However, various approaches to combining indicators (averaging by indicator, averaging by patient, or simply measuring all indicators across all patients) may yield somewhat different rankings (41). Composite measures convey less granular clinical information and should be supplemented by providing individual measure data to the physicians. Current composite scores typically weight each indicator equally, so that simple process measures contribute as much to the score as having risk factors in control. This problem can be remedied by weighting the components of a composite measure based on clinical importance.
One variant of the composite score is the “all-or-none” score, which is the proportion of patients for whom all of a set of process indicators are met. It has been suggested that the all-or-nothing approach is the best way to drive toward excellence (42). However, because the score reduces a set of indicators to a single dichotomous score for each patient, all-or-none measures discard a large amount of information. Consequently they lack sensitivity for distinguishing between plans or physicians and tend to have poor reliability (41). All-or-none measures may be more useful for evaluating a multistep process (e.g., diagnosing and treating pneumonia), in which each step is necessary to achieve a successful outcome. They have less to offer in assessing or improving the parallel and often independent processes of diabetes care, especially since not all care components are of equal importance to individual patients.
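The scoring approaches described in this section can be compared on a small example. The sketch below is illustrative only: the patient panel, the indicator names, and the importance weights are hypothetical and are not drawn from any measure set discussed at the conference. Indicators missing from a patient record are treated as ones for which that patient is not eligible.

```python
# Hypothetical panel: each patient maps the indicators they are eligible for
# to whether the indicator was met. All names and values are illustrative.
patients = [
    {"a1c_tested": True, "eye_exam": True, "bp_controlled": True},
    {"a1c_tested": True, "eye_exam": False},
    {"a1c_tested": True, "eye_exam": False, "bp_controlled": False},
    {"a1c_tested": False},
]

def rate(results):
    """Fraction of results that are True."""
    results = list(results)
    return sum(results) / len(results)

def by_indicator(panel):
    """Average of per-indicator pass rates (each indicator counts equally)."""
    names = sorted({name for p in panel for name in p})
    return rate([]) if not names else sum(
        rate(p[name] for p in panel if name in p) for name in names
    ) / len(names)

def by_patient(panel):
    """Average of per-patient pass rates (each patient counts equally)."""
    return sum(rate(p.values()) for p in panel) / len(panel)

def pooled(panel):
    """Met indicators divided by all eligible patient-indicator pairs."""
    return rate(met for p in panel for met in p.values())

def all_or_none(panel):
    """Proportion of patients for whom every eligible indicator is met."""
    return rate(all(p.values()) for p in panel)

def weighted(panel, weights):
    """Pooled score with indicators weighted by assumed clinical importance."""
    earned = sum(weights[n] for p in panel for n, met in p.items() if met)
    possible = sum(weights[n] for p in panel for n in p)
    return earned / possible

# Assumed weights: risk factor control counts three times a simple process.
weights = {"a1c_tested": 1, "eye_exam": 1, "bp_controlled": 3}
```

On this toy panel the three averaging approaches give different scores (roughly 0.53, 0.46, and 0.56), and the all-or-none score collapses to 0.25, which illustrates how much per-indicator information a single dichotomous score per patient discards.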
3. What should be the role of shared decision making, patient preferences, and patient-reported data in quality measures?
Patient self-management is an essential aspect of diabetes care and requires health care systems and providers to actively support their patients’ “performance.” Many experts have suggested that clinical performance measures evaluate how diabetes patients are doing—on both processes (such as self care and behaviors) and outcomes (such as health status) (43). Patient-reported information may be useful to identify patient preferences and goals, decision making, action plans and follow-up, behavioral risk factors, psychosocial functioning and distress, self-care behaviors, and to assess specific aspects of care such as aspirin use, influenza vaccinations, foot examinations, and comorbid conditions such as depression (34,44–47).
Patient-reported information could be derived in part from electronic medical records, and in part through surveys or other evolving technologies. Patient-reported information could also be used to assess other aspects of care quality, including care experiences, care transitions, continuity of care (47), patient-provider interactions, as well as some adverse events, such as hypoglycemic episodes (48). Patient decisions not to follow provider advice can be documented and may provide an opportunity for the provider to understand the reasons and respond in a mutually satisfactory way. Health literacy, numeracy, out-of-pocket costs, and social environment, which may mediate health disparities by influencing patient preferences and adherence to treatment, may serve as case-mix adjusters for quality measures.
The British National Health Service (NHS) has pioneered the use of patient-reported outcomes of care by having all patients undergoing certain elective surgeries fill out pre- and postsurgery reports of their health status, functional status, and other information. In the U.S., the Health Outcomes Survey (HOS) and Consumer Assessment of Healthcare Providers and Systems (CAHPS) survey include a number of performance measurements, functional assessments, and other patient-reported measures (PRMs). Collecting PRMs via efficient and user-friendly modalities (e.g., kiosks, cell phones, Internet, automated phone systems) may facilitate use of a standardized set of behavioral and psychosocial PRMs with high clinical value that could be incorporated in the EHR and then be extracted as performance measures (49).
Methodological considerations in selecting PRMs that merit further research include reliability, validity, sensitivity to change, feasibility, importance to clinicians, importance to public health, actionability, and user friendliness (50). The National Institutes of Health (NIH)-funded Patient Reported Outcomes Measurement Information System (PROMIS) initiative is an important example of the potential of PRMs. PROMIS uses analytic techniques such as item response theory to create and validate very brief measures that assess a range of symptoms and quality of life–related issues (51). In summary, changing technology, including broader use of EHRs, will likely usher in a new era in patient-reported performance measures, which will broaden the scope and usefulness of existing performance measure sets (52–54).
4. What is the future of quality measurement in diabetes?
The advent of EHR technology will open new options for diabetes quality measurement, as already noted. Several of the new opportunities that deserve further attention are highlighted below, and Table 1 briefly outlines some of the advantages and challenges of selected innovations.
Clinical action measures.
One possible refinement of dichotomous intermediate outcome measures is the clinical action measure. Clinical action measures are of two types: 1) those that combine a threshold measure for an intermediate outcome with a process of care for those above the threshold, and 2) those that suggest a high-benefit evidence-based clinical action in certain clinical circumstances (55,56). Examples of these measures include prescribing moderate dose statins to patients with diabetes over age 40 years, or prescribing an ACE-inhibitor or angiotensin receptor blocker to patients with albuminuria. Clinical action measures could take exclusions into account by removing patients for whom care may be contraindicated or not beneficial from the denominator (e.g., women of childbearing potential, patients with end-stage renal disease). By focusing on the clinical treatment (e.g., statins) rather than only a threshold intermediate outcome value (e.g., LDL cholesterol <100 mg/dL), these measures are less likely to motivate treatment with nonevidence-based treatments (e.g., ezetimibe) in order to reach a clinical threshold (14,57).
For example, clinical action measures may credit the clinician for appropriate care if 1) the threshold is met (e.g., blood pressure below the measure threshold), or 2) the provider takes an appropriate clinical action (e.g., starting or increasing the dose of an appropriate medication) for a patient above threshold, or 3) the risk factor returns to below the threshold within a given time frame without changes in therapy, or 4) the patient has a contraindication to further therapy intensification (e.g., a very low diastolic blood pressure) or is already on high-dose therapy despite an elevated risk factor level.
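The four crediting conditions above can be sketched as a simple decision function. This is a minimal illustration under stated assumptions, not a specification of any endorsed measure: the `BpEpisode` record, its field names, and the 140 mmHg default threshold are hypothetical, and a real measure would also need to define the follow-up window and what counts as high-dose therapy.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class BpEpisode:
    """Hypothetical record for one patient in one measurement period."""
    systolic: int                      # index-visit systolic BP, mmHg
    therapy_intensified: bool          # medication started or dose increased
    followup_systolic: Optional[int]   # reading within the time frame, if any
    contraindicated: bool              # e.g., very low diastolic blood pressure
    on_high_dose_therapy: bool         # already on high-dose therapy

def credit_reason(ep: BpEpisode, threshold: int = 140) -> Optional[str]:
    """Return why the clinician earns credit, or None if no condition is met.

    The threshold default is an assumption chosen for illustration.
    """
    if ep.systolic < threshold:
        return "at_goal"                       # condition 1
    if ep.therapy_intensified:
        return "action_taken"                  # condition 2
    if ep.followup_systolic is not None and ep.followup_systolic < threshold:
        return "returned_to_goal"              # condition 3
    if ep.contraindicated or ep.on_high_dose_therapy:
        return "intensification_not_indicated" # condition 4
    return None
```

Returning the reason rather than a bare pass or fail keeps the clinically relevant detail available for feedback to the provider.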
Clinical action measures have several strengths. They direct attention to patients most likely to benefit from added therapy, and they point directly to the appropriate treatment rather than just the risk factor level. Thus, they help providers do the “right” thing for the right patient. They also give credit when the appropriate clinical action is to not intensify medications, thereby diminishing the potential for unintended consequences related to overtreatment. Finally, they take known variation in measurement into account by giving credit for values that return to target within a specified time period. Because many clinical action measures require access to detailed clinical data, they depend on evolved electronic data systems (56,58,59).
Weighted quality measures.
Some have expressed concern that threshold-based performance measures could focus clinician attention inordinately on patients currently just above the target and away from those who are further from the target and may benefit more (60). Others are concerned that performance thresholds could also increase health disparities because vulnerable patients are often further from control, although one recent study allays this concern (61). In general, current use of threshold measures may discard important information compared with considering the full distribution of values in physician or health plan populations (62).
If an A1C threshold measure for “good care” is set at 7%, a provider could get full credit for moving a patient from 7.1 to 6.9%, but no credit for improving control in another patient from 8.8 to 7.1%, despite the fact that the latter patient's risk has been reduced much more than the former's (63,64). These concerns and others may be somewhat allayed by giving “partial credit” to clinicians or systems for treatment efforts, even when a patient does not reach the target.
Credit is assigned based on predicted clinical benefit gained by moving patients from prior poor control to a more favorable clinical level. This requires specifying a threshold for poor control (e.g., A1C >8%) above which no credit is given, and a threshold for good control (e.g., A1C <7%) at which point full credit is given. Some experts have suggested that benefits be quantified using quality-adjusted life-years saved (65). Other methods to assign partial credit have also been proposed and deserve careful consideration (66).
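One way to make the partial-credit idea concrete is sketched below, using the example thresholds from the text (full credit below 7%, no credit above 8%). The linear interpolation between the two thresholds is an assumption made here for simplicity; the text envisions credit tied to predicted clinical benefit, such as quality-adjusted life-years, which would generally not be linear. The function names are hypothetical.

```python
def dichotomous_credit(a1c: float, threshold: float = 7.0) -> float:
    """Conventional threshold measure: full credit only below the threshold."""
    return 1.0 if a1c < threshold else 0.0

def partial_credit(a1c: float, good: float = 7.0, poor: float = 8.0) -> float:
    """Full credit at or below `good`, none at or above `poor`.

    Values in between are linearly interpolated (an illustrative assumption).
    """
    if a1c <= good:
        return 1.0
    if a1c >= poor:
        return 0.0
    return (poor - a1c) / (poor - good)

def credit_gain(before: float, after: float, score) -> float:
    """Change in credit when a patient's A1C moves from `before` to `after`."""
    return score(after) - score(before)

# The two patients from the example above.
small_change = (7.1, 6.9)
large_change = (8.8, 7.1)
```

Under the dichotomous measure, the 7.1 to 6.9% change earns the full gain and the 8.8 to 7.1% change earns nothing; under this partial-credit sketch the gains are about 0.1 and 0.9, respectively, which better matches the relative risk reduction described in the text.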
Personalized risk-based quality measures.
The use of risk-based prediction models can extend the concept of risk and benefit in performance measurement by considering each patient's calculated risk for an adverse outcome and defining the benefit a patient is likely to obtain from a specific clinical action based on the UK Prospective Diabetes Study (UKPDS), QRISK, Framingham, or other risk engines (67–69). Depending on known evidence, the selected risk engine integrates age, comorbidity, other risk factors (e.g., smoking), and current treatments to predict the patient's risk for a poor outcome. This approach facilitates a patient-specific performance measurement across the continuum of benefit and risk. Such performance measures might assess 1) whether patients above a certain threshold of high risk (where benefit of therapy would clearly outweigh potential harms of treatment) received the therapy in question; 2) whether those below a certain threshold of low risk (where benefit is lower than potential harms) did not receive the therapy; and 3) whether those in between the two thresholds had a documented discussion of risk and benefit of the therapy and engaged in shared or informed decision making (70). Before this approach is ready for prime time, more work needs to be done to assure that the risk and benefit estimates provided by the risk engines are accurate and based on evidence from intervention trials whenever possible. At present, some risk engines overestimate benefits by relying too heavily on epidemiological rather than clinical trial evidence.
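The three-zone logic described above can be sketched as follows. The predicted risk is assumed to come from an external risk engine and is simply passed in; the 5% and 20% cut points and all names are hypothetical placeholders, not values proposed by the conference.

```python
def risk_based_measure_met(
    predicted_risk: float,
    received_therapy: bool,
    discussion_documented: bool,
    low_risk: float = 0.05,    # assumed cut point: below this, harms exceed benefit
    high_risk: float = 0.20,   # assumed cut point: above this, benefit clearly exceeds harms
) -> bool:
    """Evaluate one patient against a personalized risk-based measure."""
    if predicted_risk > high_risk:
        # High risk: the measure is met only if therapy was given.
        return received_therapy
    if predicted_risk < low_risk:
        # Low risk: the measure is met only if therapy was withheld.
        return not received_therapy
    # Intermediate risk: credit depends on documented shared decision making,
    # regardless of whether therapy was ultimately chosen.
    return discussion_documented
```

Because the result depends entirely on the risk estimate supplied, any inaccuracy in the risk engine carries directly into the measure, which is the caution raised in the text.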
Measures of overtreatment.
While the suggestions outlined above are likely to maximize appropriate care and minimize unintended consequences of performance measures (60,71–76), an additional fruitful area lies in constructing and testing direct measures of potential overtreatment, inappropriate treatment, or harm (74,77). Creating and reporting such measures could serve to counter any pressure to intensify therapy inappropriately in the name of performance improvement. Such measures might identify suboptimal practices such as further intensification of therapy for patients with low diastolic blood pressure and moderate sBP levels (e.g., <140/65 mmHg); use of glyburide among the elderly or those with impaired renal function; or falls or episodes of hypoglycemia severe enough to require emergency care or hospitalization in patients on complex glucose-lowering regimens, insulin, or high doses of blood pressure medications. Patient-reported data regarding symptoms and treatment burden may enhance our future ability to quantify overtreatment or potential harm.
Quality measures for primary prevention of diabetes.
Currently, diabetes quality measures focus on the treatment of those with diagnosed diabetes. The Diabetes Prevention Program (DPP) demonstrated that either intensive lifestyle change leading to 7% weight loss or use of metformin substantially reduced the incidence of type 2 diabetes in a diverse U.S. population with impaired glucose tolerance (78). Both interventions were cost-effective from the perspective of the health system and society as delivered in the DPP (79), and similar weight loss and exercise outcomes have been achieved when the DPP lifestyle intervention is implemented in a much less costly form in community settings (80). New performance measures could be designed to assess 1) appropriate implementation of diagnostic tests to identify those at high risk of diabetes; 2) appropriate referral to lifestyle programs and/or metformin therapy; and 3) relative quality and efficacy of lifestyle programs designed to achieve weight loss. Such measures would not only foster the implementation of proven interventions of tremendous public health significance, but would be important models of measuring community-based public health initiatives designed to address weight management, healthy eating, and physical activity.
Incorporating measures of adherence into performance measures.
An estimated 20–50% of patients with chronic disease do not take their medications as prescribed (81). Poor medication adherence contributes to poor diabetes control, disability, unnecessary hospitalization, and death (82,83). A meta-analysis of 63 studies with over 19,000 participants reported that higher adherence rate decreases the risk for poor treatment outcome by 26% (84). Measures of patient adherence are impeded by 1) lack of agreement on the best methods to measure medication adherence, 2) paucity of integrated data systems that include both prescription and medication dispensing data, and 3) a sparse body of research on interventions to improve medication adherence (85). As information systems evolve and more effective interventions to improve adherence are identified, quality measures related to medication adherence may catalyze new efforts to improve adherence and patient health outcomes.
Incorporating costs into quality measurement.
Patients with diabetes generate medical care costs that are on average two to three times higher than age- and gender-matched patients without diabetes. Cardiovascular complications remain the principal driver of high diabetes care costs; medication costs are also rising more rapidly than overall inflation (86–88). Diabetes and other medical expenditures vary greatly across care systems in relation to the benefits achieved, and a substantial portion of expenditures does not appear to provide any net benefit to the patient (89). These services are usually labeled as wasteful, inappropriate, or inefficient.
NCQA has recently developed diabetes relative resource use measures at the plan level and is testing them at the group practice level. These measures are designed to look at resource use in diabetes care and, when combined with quality measures, can provide an overview of efficiency (high resource use–low quality vs. low resource use–high quality). Barriers to expanding such measures include the need for large sample sizes, the difficulty of accurately quantifying expenditures, and the need for accurate risk adjustment. A measure of total expenditures per patient is now available within the Medicare program and is based on administrative claims data. The information can be further categorized as hospital inpatient, outpatient, or pharmacy-related (medications). Expenditures can be compared with a set of outcome-related quality measures as an initial step toward trying to define the value (benefit per unit of expenditure) of care in diabetes.
Over the short term, we need more analysis and understanding of which elements of resource use have positive or negative correlations with measures of quality and outcomes of care. Currently most diabetes performance measures only assess whether tests or examinations are being underused. Development of measures that look at overuse of tests, examinations, procedures, or technology may be useful in evaluating and maximizing efficiency of care. Measures that encourage the use of generic medications, when available, may also conserve resources. Care provided by various subspecialties for patients with advanced complications of diabetes may be variably efficient or inefficient. With further refinement of both quality and cost-related measures, diabetes could become the poster child for efficient and effective health care.
Using performance measurement to reduce, not worsen, health disparities.
As with many chronic diseases, diabetes is marked by disparities in both treatment and outcomes. Such disparities are primarily based on socioeconomic status (SES), race, and ethnicity, but also exist by sex and age. Because patients of lower SES often have more barriers to self care and worse control of risk factors, clinicians who provide care to many such patients may have lower quality-of-care scores publicly reported, or lose income or incentives related to unadjusted measures of clinical performance. Currently, the HEDIS data are grouped by Medicare, Medicaid, and commercial insurance. In the future, quality measures could be adjusted in more sophisticated ways to account for variation in patient SES, health literacy, or other factors related to disparities in care. Possible methods include geo-coding, case-mix adjustment, or use of other metrics for SES. On the other hand, “overadjusting” for race/ethnicity and SES could mask real differences in quality of care provided to different groups; such disparities can only be corrected if they are identified.
5. How can quality monitoring be integrated into population surveillance efforts?
Population surveillance of quality of diabetes care provides a crucial complement to health system monitoring (e.g., HEDIS) by assessing care in the full population, including persons with limited or no health insurance. Appropriately selected performance measures may serve well as measures for population-based diabetes care surveillance and enable more detailed examination of geographic and other disparities in patterns of care. In addition, surveillance systems are important to monitor risks, adverse events, and resource use in the population, and to guide the design and implementation of strategies to improve quality and outcomes of care.
Existing population-level systems for monitoring diabetes care include the National Health Interview Survey (NHIS); the Behavioral Risk Factor Surveillance System (BRFSS), which assesses care processes; and the National Health and Nutrition Examination Surveys (NHANES), which assess both processes of care and risk factor control. All three of these systems include extensive PRMs and provide a useful foundation for further development of such PRMs for diabetes care. Data from these sources also provide estimates of diabetes care quality that inform national quality and disparity reports and development of the Healthy People Objectives for 2010 and 2020. Other systems such as the National Ambulatory Medical Care Survey (NAMCS) and the National Hospital Discharge Survey (NHDS) provide additional population data on costs and outcomes of care. With the exception of the Dartmouth Atlas of Health Care and selected metropolitan area surveys and laboratory-based registries in New York and Vermont, there are limited population-based data in the U.S. today within smaller geographic areas (90,91).
Expansion of existing surveillance systems to include measures of risk factor control, patient characteristics and behaviors, risk preferences, indicators of primary prevention, and other measures could serve several useful purposes: 1) permit more accurate assessment of care quality for patients at different levels of risk, insurance, and socioeconomic status, as well as assessment of geographic variations in care; 2) promote monitoring of patient safety, drug safety, costs, adverse outcomes and unintended consequences (e.g., hypoglycemia and polypharmacy), and medication adherence; 3) prove useful within networks of Patient Centered Medical Homes or Accountable Care Organizations; and 4) facilitate systematic assessment of prevention efforts. Some of these innovations could be based on the modification of current population-based surveys. Others, such as clinical action measures, weighted quality measures, or risk-based quality measures, may require fundamentally new surveillance systems.
Integration with health system–based data could augment the depth of public health systems and extend the representativeness of health system–based data. The growing use of EHRs presents an opportunity to assess variation in intensity and quality of diabetes care. Prototypes for the use of EHR data for national surveillance include surveillance systems for vaccine safety, selected infectious diseases, and bioterrorism threats. Diabetes care surveillance might be carefully expanded in phases, perhaps with a “sentinel” system or a distributed data system as initial steps (92). An essential step is to develop and validate a common set of diabetes quality measures. Key data elements might include laboratory results, pharmaceutical use, utilization of services, and selected patient characteristics and experiences of care, including elements collected by patient self-report.