Several common themes emerged in the activities of the V&E group. The findings, discussed in detail below, are likely to reflect not just the group of grantees with whom the V&E group interacted directly, but also many others who are committed to evaluating health IT projects but do not have direct evaluation experience themselves or access to those who do.
Leaving Evaluation as an ‘Afterthought’
We found that many projects had not planned on evaluating the health IT they were tasked with implementing, and others had only vague notions about evaluation. This issue has been addressed within the AHRQ health IT portfolio through the work of the V&E group and subsequent funding announcements, but it likely reflects a general misunderstanding of evaluation work beyond the academic setting. Most projects were able to specify the goals for the implementation of health IT and the success criteria for the project, but did not specify how the project team would measure whether those goals or success criteria had been met. Other implementation projects did acknowledge the need for evaluation by including personnel with the training and experience needed to carry out an evaluation, but in the vast majority of cases did not allocate sufficient resources for it. Despite these concerns, the NRC, with the assistance of AHRQ, was able to help grantees develop realistic evaluation plans. In our experience, it is important to help health IT project teams understand the benefits of evaluation and how evaluation can facilitate the overall implementation process. Once that is achieved, health IT implementers should incorporate the evaluation effort as a key component of their overall project plan.
As project teams worked on evaluation plans, many took advantage of the sample measures listed in the earliest version of the Evaluation Toolkit and provided long lists of outcome measures with which to evaluate their implementation. A common phrase in the NRC reviewers' critiques of these evaluation plans was that they were “overly ambitious,” as many project teams failed to recognize the significant resources needed to carry out their extensive evaluation plans effectively. These teams either lacked the financial resources to support appropriate staff to execute the evaluation plans, or lacked access to the appropriate experts (e.g., statisticians) to guide the evaluation. Another concern of the V&E team was the possibility of false-positive findings if too many outcomes were examined without correction for multiple comparisons.
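The false-positive concern can be made concrete with a short sketch. With 20 independent outcomes each tested at a 0.05 significance level, the chance of at least one spurious “significant” result is 1 − 0.95²⁰ ≈ 64%. One standard remedy is the Holm-Bonferroni step-down procedure, shown below as a minimal illustration; the p-values are invented for the example and do not come from any grantee's data.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Holm-Bonferroni step-down correction.

    Returns a list of booleans, True where the corresponding null
    hypothesis is rejected while controlling the family-wise error rate.
    """
    m = len(p_values)
    # Test p-values from smallest to largest against progressively
    # looser thresholds alpha/m, alpha/(m-1), ...
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            rejected[i] = True
        else:
            break  # step-down: once one test fails, all larger p-values fail
    return rejected

# Five illustrative outcome measures; only the strongest results survive.
print(holm_bonferroni([0.001, 0.049, 0.20, 0.03, 0.004]))
# → [True, False, False, False, True]
```

An evaluation plan with a short, pre-specified list of primary outcomes avoids paying this correction penalty across dozens of exploratory measures.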
Subsequent versions of the toolkit, which laid out a framework for choosing among the many possible evaluation metrics, resulted in more realistic evaluation plans. Once the project teams explicitly assessed the feasibility and importance of each desired evaluation metric in light of stakeholders' goals and the resources available, they were able to focus their energies on metrics that were feasible to collect data on and that would yield information meaningful to their local stakeholders. This was to be expected as the toolkit became more prescriptive and AHRQ and the NRC strongly encouraged the second group of implementation grantees to use it. Future health IT implementers should leverage this “lesson learned” and plan for evaluation efforts that address the primary needs of key stakeholders without taking valuable resources away from the IT implementation project.
Mismatch between the Evaluation Metrics Chosen and the Health IT Being Implemented
Some project teams chose evaluation metrics without understanding whether each was relevant to the specific health IT implementation and the implementation environment. For example, if a stand-alone inpatient pharmacy system is being implemented, should the rate of pneumococcal vaccine administration be chosen as an evaluation metric? Appropriate use of pneumococcal vaccines is a practice that is well supported by evidence in the literature, and it represents an important quality improvement goal for many hospitals. In deciding whether this metric is appropriate for the implementation of a stand-alone pharmacy system, one should determine whether such a system would actually affect the rate of pneumococcal vaccine administration. In theory, pharmacists could remind physicians to prescribe the pneumococcal vaccine for eligible patients. However, if the pharmacy system is not integrated with the patient's outpatient electronic health record (EHR), then a pharmacist would have to take the initiative to review a patient's outpatient EHR for vaccine status. In such circumstances, most busy pharmacists would not be able to overcome the many barriers to improving the rate of pneumococcal vaccine use. No one can guarantee that any particular measure will be affected by health IT, but project teams need to focus their limited resources on metrics that are likely to reflect an impact of their implementation. Health IT implementers need to think through workflow and cultural issues in conjunction with the health IT being implemented to formulate appropriate and testable hypotheses and to choose the right metrics to test those hypotheses directly.
Chasing Rare Events without Adequate Statistical Power
Health IT has the potential to affect important patient outcomes, such as mortality and adverse drug events. To detect changes in these relatively rare events, a high volume of observations must be made. In some cases, project teams did not have the resources to collect adequate data, and in others, particularly in rural areas, it would have taken many years to make a sufficient number of observations. In such cases, the advice from the V&E team was to select measures for which the teams would have sufficient statistical power, even if it meant focusing on process rather than outcome measures. Understanding the need for sufficient statistical power will be critical to future evaluation of health IT implementation as more and more such implementations are built around improving quality of care.
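A rough sample-size calculation shows why rare outcomes are so demanding. The sketch below uses the standard normal-approximation formula for comparing two proportions; the event rates are illustrative, not drawn from any project, and only Python's standard library is required.

```python
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per group to detect a change from rate p1
    to rate p2 in a two-proportion comparison (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
                 + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return numerator / (p1 - p2) ** 2

# Halving an adverse-event rate of 0.5% (0.005 -> 0.0025) at 80% power
# requires on the order of nine to ten thousand observations per group --
# often years of data for a small rural site.
print(round(n_per_group(0.005, 0.0025)))
```

By contrast, a process measure with a baseline rate near 50% and a plausible 10-point improvement needs only a few hundred observations per group, which is why the V&E team steered resource-limited projects toward such measures.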
Limitations of Available Data
It is often possible to use data collected for another purpose to support evaluation efforts. The V&E team encouraged this practice, especially if project teams had limited resources to devote to evaluation. Common sources included billing data, quality improvement data, and data used for external reporting. However, these data sources may have limitations that deserve consideration. For example, billing data may not adequately or accurately capture the care given unless clinicians are reimbursed incrementally for a specific activity. In some cases, quality improvement data and data collected for external reporting may represent an insufficient sample for statistical inference, limiting the generalizability of the findings. These challenges are not insurmountable, but they require mitigation strategies, including data validation and consideration of statistical power, before these data sources can be used for evaluation. One technique the NRC often suggested, and one applicable to most future implementations, is to pilot data collection and analysis efforts early so that midcourse corrections are possible should initial assumptions about the quality of the data or the feasibility of the data collection methods prove incorrect.
Improper Comparison Group
To demonstrate that the health IT being implemented has an impact on the metrics chosen, data must be collected on a valid comparison group. When the energy of the project teams is directed at the new technology, it becomes easy to forget to do so. In our experience, even if evaluation resources are limited, it is possible to capture at least baseline data through low-cost methods such as surveys or data that have been collected for other purposes, such as billing. By collecting data using the same methodology before and after the implementation of health IT, implementers will at least be able to conduct a valid before-and-after study to measure the impact of health IT.
Many health IT project teams wanted to follow the gold standard of study design by conducting randomized controlled trials. Some of them realized that logistically it was not possible to do so because the community implementing health IT did not find it acceptable to delay implementation, even for a short time period, for a randomly chosen subset of the community. In these cases, valid comparison groups on which to collect data could still be identified. For example, project teams could identify another community that was not implementing any similar form of health IT and use it as a “control” community. If data could be collected in both the grantee's community and in the “control” community before and after the implementation of health IT, then the change in outcome over time could be compared between the two communities to determine whether the health IT affected the outcome. In other cases, communities were planning to roll out the health IT in a staggered fashion over the course of months to years across different sites within the community. In these cases, project teams could collect outcomes data before and after the rollout of health IT at each site, and data collected in this fashion could be used to support time series analyses. Alternative approaches to the traditional randomized controlled trial can frequently satisfy the needs of health IT implementers and their key stakeholders.
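The “control” community comparison described above is, in effect, a difference-in-differences design: the pre/post change in the control community estimates what would have happened without the intervention, and the excess change in the implementing community is attributed to the health IT (under the usual parallel-trends assumption). A minimal sketch with invented rates:

```python
def diff_in_diff(treat_pre, treat_post, control_pre, control_post):
    """Difference-in-differences estimate: the change in the implementing
    community minus the change in the non-implementing control community."""
    return (treat_post - treat_pre) - (control_post - control_pre)

# Illustrative vaccination rates (%): both communities improve over time,
# but the implementing community improves 5 points more than the control.
effect = diff_in_diff(treat_pre=60.0, treat_post=72.0,
                      control_pre=58.0, control_post=65.0)
print(effect)  # → 5.0
```

The same arithmetic extends to the staggered-rollout case: each site contributes its own before-and-after change, with not-yet-implemented sites serving as concurrent controls.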
Insufficient Details on Data Collection and Analysis
Details are important, and the process of developing an evaluation plan offers the opportunity to define them. At a minimum, the plan should consider how the data needed to support the chosen metrics will be collected, the population on which the data will be collected, and when the data will be collected. If these details are not thought through, it is easy to “over-promise” on the number of measures to be collected. The plan should also discuss how the collected data will be analyzed, and statistical power calculations should be part of the plan. Our experience at the NRC suggests that these gaps in evaluation planning can be addressed through access to evaluation plan templates and remote mentorship.
Exclusive Focus on Quantitative Methods
Data collected using qualitative methodologies may be as illustrative of lessons learned, if not more so, than data collected through quantitative methodologies. While quantitative methodologies are powerful and efficient at capturing healthcare outcomes, qualitative methodologies are often superior at capturing the “whys” and the “how-tos.” To that end, the NRC has continued to encourage the use of qualitative methodologies, such as focus groups, semi-structured interviews, and surveys, to capture the lessons learned, barriers encountered, and success factors in each project. The V&E team discovered that many project teams were not familiar with these methods or were sometimes reluctant to use them, believing that findings from qualitative methodologies are not concrete and are difficult to disseminate, particularly in peer-reviewed journals. Because qualitative methodologies offer unique tools for health IT evaluation, they should not be discounted because of misperceptions or lack of expertise. In some cases, this expertise can be found in fields outside of health IT, such as sociology and anthropology, although experts in those areas are likely to need assistance in gaining health IT domain knowledge.