To assess the internal consistency and agreement between the Health Care Information and Management Systems Society (HIMSS) and the Leapfrog computerized provider order entry (CPOE) data.
Secondary hospital data collected by HIMSS Analytics, the Leapfrog Group, and the American Hospital Association from 2005 to 2007.
Dichotomous measures of full CPOE status were created for the HIMSS and Leapfrog datasets in each year. We assessed internal consistency by calculating the percent of full adopters in a given year that report full CPOE status in subsequent years. We assessed the level of agreement between the two datasets by calculating the κ statistic and McNemar's test. We examined responsiveness by assessing the change in full CPOE status rates, over time, reported by HIMSS and Leapfrog data, respectively.
Findings indicate minimal agreement between the two datasets regarding positive hospital CPOE status, but adequate agreement within a given dataset from year to year. Relative to each other, the HIMSS data tend to overestimate increases in full CPOE status over time, while the Leapfrog data may underestimate year over year increases in national CPOE status.
Both Leapfrog and HIMSS data have strengths and weaknesses. Those interested in studying outcomes associated with CPOE use or adoption should be aware of the strengths and limitations of the Leapfrog and HIMSS datasets. Future development of a standard definition of CPOE status in hospitals will allow for a more comprehensive validation of these data.
Policy makers, hospital leaders, and researchers are interested in computerized provider order entry (CPOE) systems as a means to improve the quality of care by reducing errors and improving efficiency (Bates et al. 1998; Bobb et al. 2004; Poon et al. 2004; Cutler, Feldman, and Horwitz 2005). The literature in this area has grown with studies that have identified both the benefits (Bates et al. 1999; Bates and Gawande 2003; Kaushal, Shojania, and Bates 2003; Chaudhry et al. 2006) and potential risks (Han et al. 2005; Koppel et al. 2005) associated with CPOE systems. However, in separate systematic reviews, Chaudhry et al. (2006) and Reckmann et al. (2009) found that a significant portion of CPOE studies emanate from select individual organizations with characteristics that limit the generalizability of the findings from such studies. As a result, researchers have recently begun to examine national datasets that include measures of CPOE to identify factors influencing adoption (Cutler, Feldman, and Horwitz 2005; Hillman and Givens 2005; Ford and Short 2008) and the outcomes associated with CPOE systems (Jha et al. 2008; Yu et al. 2009a,b; Kazley and Diana 2011).
Most national studies examining CPOE systems have used data from one of two sources: Health Care Information and Management Systems Society (HIMSS) Analytics (see, e.g., Teufel, Kazley, and Basco 2009) or the Leapfrog Group (see, e.g., Hillman and Givens 2005). While the use of these national datasets has provided potentially valuable information that complements the single-institutional studies that dominate the literature, there has been no empirical assessment of either the HIMSS Analytics or the Leapfrog data. Thus, it is unclear whether these datasets are appropriate for rigorous health services research endeavors designed to influence either health policy or the adoption of a resource-intensive CPOE system by health organizations.
The purpose of this paper is to assess the internal consistency, level of agreement, and responsiveness of the HIMSS and Leapfrog measures of CPOE. In so doing, we will provide information regarding the appropriateness of these national datasets for the purposes of health service research. Given the intensifying interest in generalizable studies of CPOE systems in hospitals, we expect our results to be of interest to researchers, policy makers, and hospital leaders alike.
The internal consistency, level of agreement, and responsiveness of CPOE measures are important components of high-quality and generalizable research regarding CPOE use and adoption. These characteristics are similar to the well-known concepts of reliability and validity, which are difficult to assess in this case due to the lack of a gold standard measure. In our study, we focus on internal consistency, which, like reliability, refers to the stability of a measure and the degree to which the measure is free from random error (Ray et al. 2006). Level of agreement attempts to address issues of validity in the absence of a gold standard. Validity is the notion that a scale actually measures what it intends to measure. Responsiveness is the ability of the scale to detect changes over time (Ray et al. 2006). Each of these characteristics influences the extent to which a given measure can represent the practices or constructs under study. The following is a more in-depth overview of each of these elements.
Internal consistency is the reproducibility of a measure. A measure with good internal consistency can be replicated and avoids random errors through a smaller range of fluctuation from one measure to the next (Cherulnik 2001). Accurate consistency of a measure may be indicative of reliability if the reported measure has not changed in practice. Internal consistency can be influenced by the source reporting the measure, untruthful reporting, the consistent description of the construct, the method of data collection, and response bias.
The level of agreement is closely related to construct validity, which is the degree to which a measure provides trustworthy information about the construct it intends to measure (Cherulnik 2001; Grembowski 2001). A measure that has a high level of agreement with a gold standard measure is believed to be free of systematic error or bias, and thus valid. Validity can be influenced by a definition, individual perception of a construct, response bias, untruthful reporting, or a vague description of the construct to be measured. In instances where a gold standard does not exist (e.g., national data on CPOE), measuring the level of agreement between two different surveys is the first step in the empirical assessment of those surveys.
Lastly, responsiveness describes the sensitivity of a measure and its ability to accurately detect small but important changes in a construct over time (Grembowski 2001). Measuring responsiveness has become increasingly important because detecting small changes in management or health outcomes is needed in health services research. It has been claimed that “Responsiveness should join reliability and validity as necessary requirements for instruments designed primarily to measure change over time” (Guyatt, Walter, and Norman 1987).
The internal consistency, level of agreement, and responsiveness of CPOE measures are of great importance to researchers. Because there is national interest in increasing CPOE adoption and use, some researchers have focused on using national measures to study its impact. In doing so, they rely upon the established national data, which have not been empirically evaluated. If a CPOE measure fails to show internal consistency, it may bias the results by over- or underestimating CPOE adoption and use in a national sample. Such bias would make inference between CPOE and performance tenuous.
Likewise, if national measures of CPOE do not agree, it is possible that one or both measures are not valid. Potential reasons for disagreement may be that the two surveys are measuring something different than intended, thus biasing the results of any study using the measure. Because researchers have struggled to define and measure CPOE, it is conceivable that the measures may be flawed or may not distinguish between full CPOE use and early CPOE adoption. Other scenarios, such as considering an organization to have CPOE without a clinical decision support system as one that is using CPOE, are also possible. The challenge lies in the fact that CPOE is not well defined on a national level and may vary based on the system used and the level of adoption within the organization.
Lastly, a CPOE measure without good responsiveness would bias results of a study by failing to detect changes in CPOE from one year to the next. Because such variance may exist, researchers may ask whether changes in the reported CPOE measure for an organization are representative of current CPOE practice within that organization. Each of these criteria—internal consistency, agreement, and responsiveness—gives value to the measures themselves by ensuring they are truly measures of the construct of interest. Understanding these criteria in the context of national CPOE data will help researchers make better-informed choices about the use of such data sources.
Data sources for this study are the HIMSS Analytics data and the Leapfrog Group data. Participation in each survey is voluntary, and each responding organization chooses the individuals that complete the survey. The HIMSS Analytics data contain a wide range of information on integrated delivery systems, including detailed information on the health information technology environment. These data are collected annually at the system level, but they also contain information on a large number of individual hospitals and other health care organizations. We used HIMSS Analytics data from 2005 to 2007. We did not use data earlier than 2005 because HIMSS made a significant change in the definition of CPOE in 2004. We used the responses from two questions from the HIMSS survey; one on the presence of clinical decision support, and the other on the presence of CPOE. Responses on each question were considered positive if the application status was “Live and Operational,” and negative for any other status.
The Leapfrog data contain information collected from all participating hospitals in one dataset from 2003 to 2007. Hospitals voluntarily report information annually about hospital practices (including CPOE use), performance, and outcomes to the Leapfrog Group. The Leapfrog Group allows for voluntary participation from one year to the next. Overall, the Leapfrog Group aims to use large employer purchasing power to improve health care safety, quality, and affordability through public reporting. We used Leapfrog data from 2005 through 2007 to correspond with the HIMSS data. The Leapfrog survey uses several criteria to determine whether the hospital-wide CPOE implementation in responding hospitals meets the standard they seek. These criteria include the reported percentage of prescribers using CPOE and the presence of decision support functionality. Responses to the survey were considered positive if the implementation “fully meets the standard” and negative for any other status.
We used the American Hospital Association Annual Survey of Hospitals data from 2005 to 2007 to match hospitals in each dataset and to obtain hospital characteristics for comparison. In addition to variables that describe hospital characteristics obtained from the American Hospital Association data, we classified hospitals' geographic location using the Rural–Urban Commuting Area Codes (Morrill, Cromartie, and Hart 1999). Rural–Urban Commuting Area codes, based on hospital zip codes, take into consideration commuting patterns and as such represent the state of the art in measuring rurality (Rural Health Research Center 2009).
Measuring CPOE presents several challenges. One of the main challenges is the nonstandardized definition of CPOE. For example, these two surveys use different definitions of CPOE, and HIMSS changed its definition of CPOE between the 2004 and 2005 surveys. A second challenge is defining CPOE adoption and use. There are differences between the presence of a CPOE system, what capabilities it includes, and how often or extensively clinicians use it. We do not attempt to measure CPOE adoption and use, but only whether the hospital reports that it has a hospital-wide CPOE system according to the definitions of each survey. As noted below, we calibrated the available information from both surveys to be as equivalent as possible. This approach reduces the likelihood that differences in adoption and utilization among hospitals will affect our results.
In order to allow for a fair comparison, we defined full CPOE status conservatively and dichotomously as the most complete status of CPOE reported in each of the two surveys. For the Leapfrog data, this corresponds to a survey response by the hospital that it fully meets the Leapfrog CPOE standard. Leapfrog includes in its definition of CPOE the requirement that there be some decision support capabilities in the system to reduce or prevent medication errors. The HIMSS definition of CPOE does not include this requirement, so in order to make the two measures comparable, we constructed a measure of CPOE in the HIMSS data that includes the presence of clinical decision support in the hospital. Therefore, our final measure of the presence of CPOE in the HIMSS data required the presence of both CPOE and clinical decision support with a live and operational status. Furthermore, we only consider hospitals with known CPOE status. Thus, we excluded hospitals with missing data for CPOE questions.
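The dichotomization described above can be sketched in a few lines of Python. This is an illustration only, not the authors' Stata code: the function and argument names are hypothetical, and treating a missing answer on either HIMSS question as an unknown status is an assumption, since the text states only that hospitals with missing CPOE data were excluded.

```python
from typing import Optional

# Response labels quoted in the text for a positive status on each survey.
LIVE = "Live and Operational"
FULLY_MEETS = "fully meets the standard"


def himss_full_cpoe(cpoe_status: Optional[str],
                    cds_status: Optional[str]) -> Optional[bool]:
    """Full CPOE in HIMSS: both CPOE and clinical decision support are live.

    Returns None when either answer is missing (assumption), so the hospital
    can be excluded instead of being counted as a non-adopter.
    """
    if cpoe_status is None or cds_status is None:
        return None
    return cpoe_status == LIVE and cds_status == LIVE


def leapfrog_full_cpoe(response: Optional[str]) -> Optional[bool]:
    """Full CPOE in Leapfrog: the hospital fully meets the CPOE standard."""
    if response is None:
        return None
    return response == FULLY_MEETS
```

Any status other than the two quoted labels maps to False, which mirrors the conservative definition: a hospital counts as a full adopter only at the most complete status each survey can report.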
Given the concepts of internal consistency, level of agreement, and responsiveness, we used different samples and approaches to investigate each concept. Internal consistency is often measured through the stability of a measure from one point in time to another, as in the case of the consistency of items on a survey. To test the internal consistency of the data, we took advantage of the fact that once a hospital achieves “complete” CPOE status, it is highly unlikely that its status would regress to less than complete in subsequent years. Thus, we examined the percentage of hospitals in a given dataset (HIMSS or Leapfrog) that had CPOE in a given year and also reported CPOE in the subsequent year. To do so, the HIMSS and Leapfrog datasets were each examined separately.
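As a minimal sketch of this consistency check, assuming each year's full adopters are available as a set of hospital identifiers (the identifiers below are hypothetical and used for illustration only):

```python
def retention_rate(adopters_base: set, adopters_later: set) -> float:
    """Share of base-year full adopters that report full CPOE again later."""
    if not adopters_base:
        raise ValueError("no full adopters in the base year")
    retained = adopters_base & adopters_later
    return len(retained) / len(adopters_base)


# Hypothetical hospital identifiers, not taken from either dataset.
full_2005 = {"H01", "H02", "H03", "H04"}
full_2006 = {"H01", "H02", "H04", "H07"}

rate = retention_rate(full_2005, full_2006)  # three of four adopters retained
```

This sketch counts a base-year adopter that is absent from the later set as not retained. How hospitals that did not respond in the later year were handled is not stated in the text, so that choice is an assumption.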
The concept of “level of agreement” in empirical assessments quantifies the degree to which differing measures of the same construct agree with each other. To test the level of agreement of each dataset, we restricted the sample of hospitals to organizations that responded to both the HIMSS and Leapfrog survey in any given year. For each year, we calculated the κ statistic and McNemar's test. The κ statistic provides a measure of the degree of association between two categorical variables, and McNemar's test is used to compare two proportions from dependent or matched samples (Rosner 2006).
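To make the two agreement statistics concrete, the sketch below computes Cohen's κ and McNemar's χ² for a 2×2 table in plain Python. It is illustrative and is not the authors' Stata code. The concordant counts (15 agreed positive and 902 agreed negative among 1,053 hospitals) are the 2005 figures reported in the results. The discordant split (120 and 16) is not given in the text: it is back-calculated here on the assumption that McNemar's test was computed without a continuity correction, and the assignment of the larger count to HIMSS-positive hospitals is likewise an assumption, so both should be checked against Table 4.

```python
def cohen_kappa(a: int, b: int, c: int, d: int) -> float:
    """Cohen's kappa for a 2x2 agreement table.

    a: both surveys positive      b: first positive, second negative
    c: first negative, second positive      d: both negative
    """
    n = a + b + c + d
    observed = (a + d) / n
    # Agreement expected by chance, from the marginal totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    return (observed - expected) / (1 - expected)


def mcnemar_chi2(b: int, c: int) -> float:
    """McNemar's chi-square (1 df, no continuity correction)."""
    if b + c == 0:
        raise ValueError("no discordant pairs")
    return (b - c) ** 2 / (b + c)


# 2005 overlap sample: a and d as reported; b and c inferred (assumption).
a, b, c, d = 15, 120, 16, 902

kappa_2005 = cohen_kappa(a, b, c, d)
chi2_2005 = mcnemar_chi2(b, c)
```

Only the discordant cells enter McNemar's statistic, which is why a table can show high raw agreement (917 of 1,053 hospitals here) and still produce a large χ²: the disagreements fall overwhelmingly in one direction.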
We also calculated the κ statistic and McNemar's test on a restricted sample containing hospitals that had reported CPOE status across all 3 years as a sensitivity analysis. We did this because no gold standard data source on CPOE exists. Specifically, we wanted to be as sure as possible that we “correctly” classified hospitals for CPOE status. Thus, we constructed a measure that used hospitals that responded as having CPOE in all 3 years as positive for CPOE status and negative otherwise for each survey. This approach maximized our ability to examine the level of agreement by utilizing hospitals whose responses over time strongly suggested they were consistently classified as having CPOE in the Leapfrog and HIMSS data, respectively.
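The restricted measure used in this sensitivity analysis can be sketched as follows, assuming each hospital's responses are held in a dictionary keyed by year with True, False, or a missing entry (a hypothetical layout, not the structure of either dataset):

```python
YEARS = (2005, 2006, 2007)


def reported_all_years(status_by_year: dict) -> bool:
    """True if the hospital has a known CPOE status in every study year."""
    return all(status_by_year.get(year) is not None for year in YEARS)


def positive_all_years(status_by_year: dict) -> bool:
    """Restricted measure: positive only if full CPOE in all three years."""
    return all(status_by_year.get(year) is True for year in YEARS)


# Hypothetical hospitals, for illustration only.
steady = {2005: True, 2006: True, 2007: True}
late_adopter = {2005: False, 2006: True, 2007: True}
partial_reporter = {2005: True, 2007: True}  # no 2006 response

sample = [h for h in (steady, late_adopter, partial_reporter)
          if reported_all_years(h)]
flags = [positive_all_years(h) for h in sample]
```

Under this definition a late adopter counts as negative, so the restricted measure trades sensitivity for a lower chance of misclassifying a hospital as having CPOE.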
Responsiveness is the ability of a measure to detect changes over time. A responsive measure will be consistent and accurate despite changes in the response itself. To test for data responsiveness, we examined the percent increase in complete CPOE status from 1 year to the next in HIMSS and Leapfrog data, respectively. We examined this calculation in the individual HIMSS and Leapfrog datasets as well as in the restricted dataset that represented organizations that responded to both surveys in any given year.
Lastly, we used Stata version 11 for all data management, including merging of data, sample identification, and calculation of descriptive statistics.
We present the number of hospitals with known CPOE status from the HIMSS, Leapfrog, and combined dataset by year in Table 1. Briefly, in 2007, the HIMSS data included CPOE information on 4,679 organizations. During the same year, the Leapfrog dataset included CPOE information on 2,619 hospitals. The number of hospitals with known CPOE status in both the HIMSS and Leapfrog datasets was 1,053 in 2005 and 2,355 in 2007.
The 2007 organizational characteristics of respondents in the HIMSS, Leapfrog, and the overlapping set are presented in Table 2. In most cases, respondents to the Leapfrog survey also were represented in the HIMSS dataset. As such, the hospitals presented in the combined dataset have identical characteristics to the Leapfrog set in many categories. Overall, the Leapfrog dataset included a lower proportion of small hospitals (<125 beds). All other organizational characteristics are generally similar.
In the HIMSS dataset, 238 hospitals reported complete CPOE status in 2005, of which 180 (76 percent) reported complete CPOE status in the HIMSS 2006 data; and 148 (62 percent) reported complete CPOE status in the HIMSS 2007 data (see Table 3). In 2006, 276 hospitals reported complete CPOE status in the HIMSS data of which 216 (78 percent) reported the same status in 2007.
Leapfrog data reported 69 hospitals with complete CPOE status in 2005. In the subsequent year (2006), 59 of these hospitals (86 percent) reported complete CPOE status. The same 59 hospitals (86 percent) reported complete CPOE status in 2007 as well. In 2006, 95 hospitals indicated complete CPOE status on the Leapfrog survey, of which 77 (81 percent) reported the same status in 2007.
Measures of agreement are reported in Table 4. There were 1,053 hospitals in both datasets that reported on CPOE in 2005. Of these, HIMSS and Leapfrog agreed on 15 hospitals as having CPOE and 902 hospitals as not having CPOE. For 2005, κ = 0.14, p<.001; and McNemar's χ² = 79.53, p<.001. These indicate that there is significant disagreement in the likelihood of CPOE classification between the Leapfrog and HIMSS surveys.
In 2006, there were 1,823 hospitals in both datasets that reported on CPOE. Of these, HIMSS and Leapfrog agreed on 48 hospitals as having CPOE and on 1,549 hospitals as not having CPOE. For 2006, κ = 0.25, p<.001; and McNemar's χ² = 110.46, p<.001. Similarly, there is significant disagreement between Leapfrog and HIMSS on CPOE status.
In 2007, there were 2,355 hospitals in both datasets that reported on CPOE. Of these, HIMSS and Leapfrog agreed on 89 hospitals as having CPOE and on 1,869 hospitals as not having CPOE. For 2007, κ = 0.25, p<.001; and McNemar's χ² = 269.34, p<.001. Again, there is significant disagreement in the identification of CPOE status between the two data sources.
In the sensitivity analysis using the sample restricted to hospitals that reported on CPOE in all 3 years for each survey, there were 1,346 hospitals. Of these, HIMSS and Leapfrog agreed on 7 hospitals as having CPOE and on 1,238 hospitals as not having CPOE. For the sample in this sensitivity analysis, κ = 0.10, p<.001; and McNemar's χ² = 47.14, p<.001, suggesting again that there is significant disagreement between the two datasets with respect to CPOE status.
When considering the 2005 HIMSS full sample only, 2,045 hospitals reported not having CPOE, of which 97 (4.7 percent) reported complete CPOE status in 2006 (see Table 5). In 2006, 3,208 hospitals in the HIMSS data reported not having CPOE of which 312 (9.7 percent) reported full CPOE status in 2007.
When considering the 2005 Leapfrog full sample only, 1,991 hospitals did not have CPOE, of which 36 (1.8 percent) reported full CPOE status in 2006. In 2006, 2,253 hospitals in the Leapfrog data reported not having CPOE, of which 18 (0.8 percent) changed their status to full CPOE in 2007.
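The year-over-year percentages for the full samples follow directly from the counts reported above. The short check below reproduces them; the function name and dictionary keys are illustrative, while the counts are those given in the text.

```python
def percent_switching(switchers: int, base_non_adopters: int) -> float:
    """Percent of prior-year non-adopters reporting full CPOE the next year."""
    if base_non_adopters <= 0:
        raise ValueError("base year must contain at least one non-adopter")
    return 100.0 * switchers / base_non_adopters


# (switchers, prior-year non-adopters), full samples, as reported in the text.
reported = {
    ("HIMSS", "2005-2006"): (97, 2045),
    ("HIMSS", "2006-2007"): (312, 3208),
    ("Leapfrog", "2005-2006"): (36, 1991),
    ("Leapfrog", "2006-2007"): (18, 2253),
}

rates = {key: round(percent_switching(*counts), 1)
         for key, counts in reported.items()}
```

The resulting values match the reported 4.7, 9.7, 1.8, and 0.8 percent, and they show the contrast the responsiveness analysis turns on: the HIMSS rate roughly doubles between the two intervals while the Leapfrog rate falls.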
We conducted the same analyses on the overlapping sample of hospitals that report on CPOE in all 3 years (see Table 5), but we did not use the restricted measure of CPOE status that we used in the level of agreement sensitivity analysis. From 2005 to 2006, the HIMSS data reported that 5.4 percent of hospitals switched their status from not having CPOE to complete CPOE implementation. From 2006 to 2007, the HIMSS data further found that 12.5 percent of hospitals switched their status from no CPOE to complete CPOE implementation. During the same period, the Leapfrog data (examining the same subsample of hospitals) reported a 2005–2006 increase in CPOE adopters of 1.7 percent; and a 2006–2007 increase in CPOE adopters of 3 percent.
Many national studies that examine CPOE use in hospitals have relied on either HIMSS or Leapfrog data. These datasets have been used to estimate hospital CPOE adoption rates (Ford et al. 2008), examine hospital characteristics associated with adoption (Teufel, Kazley, and Basco 2009), and correlate CPOE adoption with medication-related quality outcomes (Jha et al. 2008; Yu et al. 2009b; Kazley and Diana 2011). No previous study has empirically examined the HIMSS or Leapfrog datasets, despite the fact that findings from previous studies can influence managerial and policy decisions. In this study, we sought to examine the internal consistency, level of agreement, and responsiveness of the CPOE variables from these datasets.
Overall, respondents to both the Leapfrog and HIMSS datasets have similar organizational characteristics and in many cases overlap. However, our findings indicate minimal agreement between the two datasets regarding which hospitals have adopted CPOE, but adequate consistency within a given dataset from year to year. Both surveys tend to inaccurately estimate overall changes in true hospital CPOE status over time (i.e., responsiveness), but in different ways. Compared with the Leapfrog data, the HIMSS data tend to overestimate increases in adoption over time. Likewise, relative to the HIMSS data, the Leapfrog data have more downward trending estimates for year over year increases in CPOE status. Our findings suggest serious limitations on the use of either dataset for many health services research purposes. However, given the properties of each survey instrument, it would appear that each data source has differing strengths.
The κ statistic compares the expected level of agreement that would be present by chance to the level of agreement present in the data. A κ of 0 indicates agreement levels expected purely by chance, and a κ of 1 indicates perfect agreement. The κ statistics for all years are high enough to reject the hypothesis that the level of agreement between these two surveys is occurring only by chance. However, the κ statistics are consistently low enough to indicate there is only marginal agreement between the two surveys. Rosner (2006) suggests that a κ<0.4 indicates marginal reproducibility, and the κ statistics identified in our study range from 0.1 to 0.3. Further, based on our sensitivity analysis, it appears that the level of agreement may not be improving over time. McNemar's test was used to test the hypothesis that the two surveys were equivalent in the proportions of hospitals identified as having CPOE. In each individual year and in the restricted sample, McNemar's test leads us to reject that hypothesis and conclude that the two surveys are identifying different proportions of hospitals that have CPOE.
The HIMSS dataset shows consistently higher estimates of hospitals with CPOE than does the Leapfrog dataset, while the two datasets are much closer in estimates of those without CPOE. Our analysis is not able to identify the cause of this difference, but it does suggest that either HIMSS data overstate or Leapfrog data understate the true number of hospitals with CPOE. The Leapfrog criteria for reporting fully implemented CPOE are more restrictive than the HIMSS criteria, which is one likely source of the disagreement. Without a consensus definition of what constitutes full CPOE implementation, it may be impossible to determine which is more accurate. However, it seems reasonable to suggest that researchers wishing to avoid a high rate of false positives (regarding CPOE status) may be better served by the Leapfrog data. Conversely, if a given researcher's concern is potential false negatives, the HIMSS data may be more appropriate to use in that instance. These qualities of the two surveys offer the researcher the opportunity to conduct sensitivity analyses using both datasets if such an approach is relevant to their study.
Once a hospital reported full CPOE status, the Leapfrog data and, to a lesser extent, the HIMSS data were able to reidentify the same hospital as reporting full CPOE status in a subsequent year with adequate consistency. The Leapfrog consistency rate (81–86 percent) and the HIMSS consistency rate (62–78 percent) are either within, or approaching, the generally acceptable rate of 80–90 percent (Hennekens and Buring 1987). Because CPOE adoption is a large undertaking requiring significant financial and human resources to implement, one would expect that sudden downgrades from the status of full CPOE would not take place regularly. Nevertheless, a number of hospitals from each dataset reported less than full CPOE status in an immediately subsequent year. This may be the result of legitimate cases where an implementation failure occurs and full CPOE status regresses (see, e.g., the experience of Cedars-Sinai Medical Center; Connolly 2005); or it may reflect variability in how the question is perceived from year to year by a given individual or his/her successor who fills out the survey.
The differences noted in responses by the same organization to an almost identical question (full CPOE status) on the HIMSS and Leapfrog surveys may have something to do with how, and why, these data are collected. First, as noted above, the missions of the Leapfrog and HIMSS organizations are very different. Thus, certain responses on the Leapfrog survey may result in benefits to the organization (e.g., being able to participate in providing care for Leapfrog employer members' health plan beneficiaries) while no such direct incentives exist for specific responses on the HIMSS survey. This difference may result in less exaggeration of CPOE status in the Leapfrog dataset because organizations would be subject to the loss of benefits, sanctions, or public embarrassment in the event of a Leapfrog audit. Similarly, because Leapfrog offers hospitals partial “credit” for different stages of CPOE implementation, hospitals may be further dissuaded from overestimating their current CPOE status, and they may find benefit from more slowly reporting progress. In contrast, fewer forces may influence the accuracy of responses on the HIMSS survey.
Second, differences in responses by the same hospital to the Leapfrog and HIMSS surveys may be a function of market forces. In regions where competition among Leapfrog-participating hospitals is relatively high, responding hospitals may be more prone to err on the side of “meeting the CPOE standard” and thus garner access to profitable patient populations. Hospitals in less competitive markets may not have such pressures to assure their financial well-being. Lastly, while both surveys are voluntary, it is not known who ultimately in the organization provides the response to either survey or whether this person changes from one year to the next.
Our analysis has several limitations worth noting. First, we only utilized 3 years of HIMSS and Leapfrog data. Most results either showed obvious yearly trends (in one direction or the other) or were relatively consistent from year to year. Nevertheless, we recognize that using only 3 years' worth of data may be a limiting factor. Second, we examined only variables related to CPOE for each of the datasets. Thus, our findings may not be applicable to other variables collected in either the HIMSS or Leapfrog data. Future studies should address the validity of other variables in these datasets. Third, some aspects of our analysis required that we compare only those hospitals that participated in both surveys in a given year. The loss of some hospitals that report to one survey but not the other could bias our findings. Lastly, we carefully calibrated the questions regarding CPOE status from both surveys so that we examined “full CPOE” in both datasets. Despite our best efforts, we recognize that the two surveys ultimately word their questions differently, which may explain some of the differences in responses we identified.
We recommend that future research in this area examine two key issues. First, we need a clear definition of what constitutes a CPOE system. It may seem obvious, but without a consensus on this question, validating the presence of CPOE will remain highly problematic. Second, a validation of any measure of CPOE—or other health information technologies such as electronic medical records—will require comparison of survey responses with the direct assessment of the presence of CPOE through standard survey validation approaches. Without a gold standard of CPOE use, it is not possible to fully validate the data.
In conclusion, the disagreement between the HIMSS and Leapfrog datasets regarding hospital CPOE status creates a challenge for researchers, practitioners, and policy makers who wish to understand CPOE from a national perspective. Future research is encouraged to overcome limitations in each data source. Currently, notwithstanding our analyses, the HIMSS and Leapfrog surveys remain the most widely available secondary sources of information on CPOE. Importantly, the AHA has recently begun to collect data on CPOE among hospitals in its newly developed Hospital EHR Adoption Database, a supplement to the AHA Annual Survey (Jha et al. 2009). While this dataset promises to provide additional valuable information about CPOE (and other health information technology), to our knowledge, it has not undergone an assessment similar to the one described herein. Both the Leapfrog and HIMSS datasets, as well as the new AHA data, will continue to be among the scant choices for health services researchers interested in this topic for the foreseeable future. In the absence of a true gold standard measure for CPOE, it is our hope that all organizations collecting data used by health services researchers consider strategies to validate their data collection efforts.
Joint Acknowledgment/Disclosure Statement: There is no form of financial or material support, other contributors, or disclosures. The authors wish to thank HIMSS Analytics and The Leapfrog Group for the use of their data.
Additional supporting information may be found in the online version of this article:
Appendix SA1: Author Matrix.