An important barrier to the widespread dissemination of clinical decision support (CDS) is the heterogeneity of information models and terminologies used across healthcare institutions, health information systems, and CDS resources such as knowledge bases. To address this problem, the Health Level 7 (HL7) Virtual Medical Record project (an open, international standards development effort) is developing community consensus on the clinical information exchanged between CDS engines and clinical information systems. As a part of this effort, the HL7 CDS Work Group embarked on a multinational, collaborative effort to identify a representative set of clinical data elements required for CDS. Based on an analysis of CDS systems from 20 institutions representing 4 nations, 131 data elements were identified as being currently utilized for CDS. These findings will inform the development of the emerging HL7 Virtual Medical Record standard and will facilitate the achievement of scalable, standards-based CDS.
An important problem facing healthcare systems is the significant gap between optimal, evidence-based medical practice and actual clinical care. For example, in a recent multi-national survey of chronically ill adults living in eight industrialized nations, 14–23% of patients in each country reported at least one medical error in the previous two years.1 Moreover, a systematic analysis of 439 care quality indicators has found that U.S. adults receive only about 55% of recommended care,2 and it takes over 15 years for rigorously validated clinical research findings to be routinely implemented in clinical care.3
In seeking to address this gap between evidence-based best practice and actual clinical care, a highly promising strategy is the use of clinical decision support (CDS) interventions, which entail providing clinicians, staff, patients or other individuals with knowledge and person-specific information, intelligently filtered or presented at appropriate times, to enhance health and health care.4 When automatically delivered to clinicians as actionable care recommendations within their routine clinical workflows, computer-based CDS interventions have significantly improved clinical practice in over 90% of randomized controlled trials.5
Despite the great potential for CDS interventions to improve care quality and ensure patient safety, robust CDS capabilities beyond basic medication-related CDS are not widely available, especially in the United States.4 One important reason for the limited deployment of CDS capabilities is the lack of standard clinical information models and associated terminologies that are consistently used across healthcare institutions, health information systems, and CDS resources.4 Without a common information model, the effort required for cross-system information mapping will become unsustainable.6 Moreover, different information models may be semantically incompatible and incapable of being mapped to each other.7 In the context of the HL7 Arden Syntax standard, for example, this problem has long been identified as the “curly braces” problem, due to the implementation-specific nature of data input specifications contained within curly braces in Arden Syntax modules.8 Thus, the heterogeneity of clinical information models and terminologies in current use represents a significant barrier to the scalable deployment of CDS.
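The mapping burden described above can be illustrated with a minimal Python sketch. It is not taken from any of the systems discussed here: the element name `systolic_bp_mm_hg`, the two site record layouts, and the threshold are all hypothetical, and the threshold is an arbitrary number rather than clinical guidance. The sketch shows rule logic written once against a shared element name, while each site supplies its own adapter, which is the part that would otherwise live inside the curly braces.

```python
from typing import Any, Dict

# Hypothetical common element name; not taken from the vMR specification.
COMMON_ELEMENT = "systolic_bp_mm_hg"


def exceeds_threshold(record: Dict[str, float], threshold: float) -> bool:
    """Shared rule logic, written once against the common element name."""
    return record[COMMON_ELEMENT] > threshold


def from_site_a(raw: Dict[str, Any]) -> Dict[str, float]:
    """Adapter for a hypothetical site that stores the value as a flat string field."""
    return {COMMON_ELEMENT: float(raw["SBP"])}


def from_site_b(raw: Dict[str, Any]) -> Dict[str, float]:
    """Adapter for a hypothetical site that nests the value under 'vitals'."""
    return {COMMON_ELEMENT: float(raw["vitals"]["bp_sys"])}


# The same rule runs unchanged on both sites once the data are mapped.
site_a_result = exceeds_threshold(from_site_a({"SBP": "150"}), threshold=140.0)
site_b_result = exceeds_threshold(from_site_b({"vitals": {"bp_sys": 120}}), threshold=140.0)
```

Without an agreed common element, every rule would need one such adapter per site, which is the scaling problem a shared information model is meant to reduce.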
Within the CDS community, it has been recognized for some time that the definition and adoption of a common information model for CDS would be of great value, and this concept of a common CDS information model has been generally referred to as a virtual medical record (vMR).9–14 To address this need, the HL7 CDS Work Group initiated the vMR project in 2007. The objective of the HL7 vMR project is to support scalable and interoperable CDS by establishing a standard information model for representing clinical information inputs and outputs that can be exchanged between CDS engines and clinical information systems, through mechanisms such as CDS services. Of note, this project intends to leverage existing HL7 information models and to map them to the vMR. The project charter, along with all other project artifacts, is available on the HL7 wiki.15
Following initial work focused on identifying vMR requirements based on four CDS use scenarios (hypertension, diabetes, breast cancer, and cerebral aneurysms), the HL7 vMR project was re-scoped in January 2010 to more formally incorporate a wider range of insights from CDS implementers both within and outside of the Work Group.15 Accordingly, the vMR project team conducted a multi-institutional analysis of current CDS systems in February and March 2010 to identify a representative set of data elements used for CDS. Here, we describe the results from this analysis, which were used to inform the development of the emerging HL7 vMR standard.
Objective. The objective of this analysis was to identify a representative set of data elements and associated terminologies used by current CDS systems, and thereby to determine which data elements and terminologies need to be included in the vMR standard as potential inputs into a CDS engine. To facilitate the gathering and analysis of data from a number of disparate CDS systems, we chose to obtain information on atomic data elements using a flat structure, with the intent to address the structural relationships between the data elements at a later stage. CDS engine outputs were not included within the scope of the analysis, as the HL7 vMR project team felt that development of this aspect of the vMR would be better served through the analysis of specific use cases and existing HL7 information models for communicating the results of specific CDS inferences (e.g., for vaccination CDS).
Study Participants. Individuals were eligible to participate in the study if they (i) had knowledge of the data used by an operational CDS system or by a CDS system under active design and development, and/or (ii) were active contributors to the HL7 vMR project. A CDS system was defined using the definition provided above.4 All study participants were invited to be co-authors on this manuscript.
Participant Recruitment. In February 2010, a request for participation was communicated through the HL7 CDS Work Group’s list-serv, and this request asked recipients to forward the email to any potentially interested individuals. The HL7 vMR project team also identified relevant experts and proactively reached out to these individuals. All interested contributors were included as long as the inclusion criteria specified above were met.
Data Collection. Each study participant was asked to provide his or her name, degree(s), institutional affiliation, title, and contact information. For each CDS system with which the study participant had familiarity, the participant was asked to provide the following information: (i) description of system, including purpose, deployment scope, operational status, and any references; (ii) the participant’s relationship with the system (e.g., co-designer; knowledge engineer); (iii) data elements used by the CDS system for making CDS inferences (e.g., procedure code, encounter date); (iv) a definition and example of the data element; (v) if applicable, value sets and terminologies used; (vi) example(s) of data element usage for CDS; and (vii) any comments. To expedite data collection, an initial data entry template was created by the vMR project team based on the draft vMR previously developed by the team. To minimize misinterpretation by the contributors, each data element in the data entry template included a definition, examples, and clarifying comments. In a second round of data collection, the template was revised to include data elements that were not in the original template, and study participants were asked to explicitly identify their usage of these additional data elements.
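As a rough illustration of the flat structure described above, one submission could be modeled as follows in Python. This is only a sketch: the class names, field names, and sample values are hypothetical and do not reproduce the actual data entry template. The comments map each field to items (i) through (vii) of the data collection request.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataElementEntry:
    """One flat row describing a data element used for CDS inferences."""
    name: str                                                 # (iii) data element
    definition: str                                           # (iv) definition
    example: str                                              # (iv) example
    terminologies: List[str] = field(default_factory=list)    # (v) value sets, terminologies
    usage_examples: List[str] = field(default_factory=list)   # (vi) example CDS usage
    comments: str = ""                                        # (vii) free-text comments
    in_original_template: bool = True                         # False if added in round two


@dataclass
class SystemSubmission:
    """Everything one participant reports about one CDS system."""
    system_description: str                                   # (i) purpose, scope, status
    contributor_role: str                                     # (ii) relationship to system
    elements: List[DataElementEntry] = field(default_factory=list)


# Illustrative values only; not drawn from the study data.
submission = SystemSubmission(
    system_description="Example CDS system (illustrative)",
    contributor_role="knowledge engineer",
    elements=[
        DataElementEntry(
            name="procedure code",
            definition="Code identifying a performed procedure",
            example="a CPT code",
            terminologies=["CPT"],
            usage_examples=["check whether a screening procedure was performed"],
        )
    ],
)
```

Keeping each entry flat, with no nesting between data elements, matches the decision to defer structural relationships to a later stage.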
Data Analysis. As needed, collected data were consolidated through an open, consensus-based process by members of the HL7 vMR project team. For example, equivalent data elements identified by contributors using different terms were merged, either through project conference calls involving the relevant contributors or through direct phone or email communications between the primary author and the relevant contributors. Following consolidation, the data were summarized in terms of data elements used by at least one CDS system, instance examples, the proportion of CDS systems reporting the use of each data element, and use case examples. We provide below the salient aspects of this analysis.
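The consolidation and summary steps can be sketched in Python as shown here. In the study, equivalent terms were merged by consensus among contributors, not by code; the synonym map, system names, and reported terms are hypothetical and stand in for the outcome of that manual process. The sketch merges equivalent terms and then computes, for each data element, the proportion of systems reporting its use.

```python
from collections import defaultdict
from typing import Dict, Iterable

# Hypothetical outcome of the consensus process: local term -> agreed term.
SYNONYMS = {
    "visit date": "encounter date",
    "proc code": "procedure code",
}


def canonical(term: str) -> str:
    """Normalize case and whitespace, then apply the agreed synonym map."""
    key = term.strip().lower()
    return SYNONYMS.get(key, key)


def usage_proportions(reports: Dict[str, Iterable[str]]) -> Dict[str, float]:
    """Return, per consolidated data element, the fraction of systems using it."""
    systems_by_element = defaultdict(set)
    for system, terms in reports.items():
        for term in terms:
            # A set ensures a system is counted once per element,
            # even if it reported two synonyms for the same element.
            systems_by_element[canonical(term)].add(system)
    total = len(reports)
    return {name: len(systems) / total for name, systems in systems_by_element.items()}


# Illustrative reports only; not the study data.
reports = {
    "System A": ["Encounter date", "procedure code"],
    "System B": ["visit date"],
    "System C": ["proc code", "Visit Date"],
    "System D": ["encounter date", "gender"],
}
proportions = usage_proportions(reports)
```

With these illustrative inputs, "encounter date" is reported by all four systems, "procedure code" by two, and "gender" by one.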
Study Participants and CDS Systems. A total of 28 individuals from 22 institutions participated in the study. Together, these individuals contributed data on the data requirements of 20 CDS systems from 4 nations, which included both large-scale home-grown CDS systems (e.g., CDS systems of the Veterans Health Administration, Intermountain Healthcare, and Partners Healthcare) as well as a number of commercial CDS systems (Siemens Soarian, Eclipsys Sunrise, Medical-Objects CDS, Altos OncoEMR, Hughes riskApps, Wolters Kluwer Health Infobutton API, and Medi-Span) (Table 1).
Multi-Institutional CDS Data Needs. A total of 131 data elements were identified as being in use by the 20 CDS systems. Of these data elements, 22 (17%) were not in the original data collection template and were identified by the data contributors. These multi-institutional CDS data needs are summarized in Table 2, and the frequency of their use across the systems is shown in Figure 1. As shown in the figure, most of the data elements were used by 20–80% of the CDS systems.
With regard to terminologies, the contributors reported using both standard and non-standard terminologies and value sets. Standard terminologies and value sets reported to be used for CDS included SNOMED CT, LOINC, ICD9, ICD10, CPT, MeSH, NDC, RxNorm, and HL7-defined value sets (e.g., for gender and race). Many respondents reported that the non-standard terminologies and value sets in use could be, or have been, mapped to standard terminologies of similar granularity.
The full data set and analysis, including details on the terminologies and value sets used by the contributing CDS systems, are available online.16
Summary and Interpretation of Findings. In this study, we analyzed the data needs of 20 CDS systems from 4 nations to identify a representative set of data elements used by CDS systems. Through this analysis, we identified 131 data elements used for CDS, all but two of which were used across multiple systems. Also, while both standard and non-standard terminologies were used, many contributors reported that their non-standard terminologies could be mapped to standard terminologies. Therefore, we believe that this work represents a solid step forward in the HL7 CDS Work Group’s efforts to define a common vMR for CDS that can overcome the “curly braces” problem and facilitate highly scalable CDS.
Strengths. As one important strength, this study sampled a highly diverse set of CDS systems, including mature home-grown and commercial CDS systems. This diversity minimizes the chances of false negative findings (i.e., the overlooking of important data elements). Second, this study is based on actual CDS systems and their data needs. Consequently, our methodology minimizes the chances of false positive findings (i.e., the inclusion of data elements not truly useful for CDS). Third, the data element set identified appears to be relatively compact and suitable for standardization and adoption. Finally, this study addresses a well-recognized problem and has the potential to facilitate significant advances in CDS scalability and impact.
Limitations. As one limitation, study participants were self-selected based on interest and were primarily from one country (the United States). Thus, it is possible that this analysis did not capture data elements used by non-participants. However, the large number and significant diversity of CDS systems included in this analysis should minimize the risk of such false negative findings. Second, the use of an initial data entry template may have biased responses. However, as indicated by the fact that close to 20% of the data elements we identified were not included in the original data entry template, individual contributors actively pursued the inclusion of data elements regardless of whether they were included in the original data entry template.
Implications and Future Directions. Based on this multi-national, multi-institutional analysis of CDS data needs, the HL7 vMR project team developed an initial proposal for a vMR standard comprising three parts: the CDS input model that was the focus of this study, a query model for specifying the data required in a given instance, and a CDS output model.15 This proposed standard underwent balloting in May 2010, and we are currently addressing the ballot comments to improve the proposed standard. Moving forward, all project artifacts will continue to be posted on the project wiki,15 and any interested individuals are invited to participate. Ultimately, we envision that this work will serve as an important foundation for the health informatics community to develop and deploy interoperable CDS solutions that improve population health on a widespread scale.
KK and GDF are co-chairs of the HL7 CDS Work Group, and KK is coordinating the HL7 Virtual Medical Record standards development effort. Preparation of this manuscript was supported by Award Number K01HG004645 from the National Human Genome Research Institute (KK). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health, Health Level 7, or the other institutions with which the authors are affiliated.