The issues and examples described above provide a basis for thinking about best practices when implementing EHRs, and the important implications of the choices made for documentation. Epic Systems Corporation (Verona, WI) is an EHR vendor that is quickly becoming one of the largest providers of software to health systems across the country. Although its market is primarily large medical centers, with less penetration into smaller office-based practices, the company is growing at a rapid pace [33]. Thus, the functionality provided by Epic offers a useful window into the possibilities, and challenges, of developing a system to support the secondary use and sharing of clinical data.
Various vendors take different approaches to implementing EHR systems. While some vendors provide detailed, pre-configured components and documentation templates, Epic does not. Rather, it provides a ‘model system,’ which essentially contains examples of what could be done in terms of a clinical build, but which most clinicians would not consider usable in a clinical environment. It is then up to each institution to develop its own clinical content, either through internal working groups, through direct interactions with other institutions, or via Epic User Groups. Local implementation decisions include choices not only about documentation structure, but also about the use of specific terminology systems for features ranging from medication formulary lists (e.g., RxNorm) and laboratory data (e.g., LOINC) to problem list elements (e.g., SNOMED-CT).
An advantage of this locally driven approach is that it allows each institution to customize the system to its specific needs. The disadvantage is that most institutions and practices do not necessarily know what their needs are until well after the implementation has occurred, and it can be very difficult among a large number of users to standardize clinical content across, or even within, disciplines and specialties. Usage patterns of EHRs can vary even within the same institution [34].
Based on our experience, it is important that all clinical groups put upfront effort into standardizing their processes for data collection, especially for data that are known to be important for secondary use and sharing. This may involve standardizing on specific CDEs and terminologies, and defining an essential set of data elements that must be coded, as distinct from those that can be captured as free text. Choosing a common location in the medical record is also important, since some data elements might reasonably be recorded in multiple locations, making them harder to locate later. An example we recently encountered involved the recording of cancer staging information. After various meetings and input from clinical groups, it was decided to record staging in the problem list, so that it could easily be found and extracted by other clinicians. The alternative would have been to have staging ‘buried’ in the clinical notes, making it much harder to find.
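The kind of up-front agreement described above is, in effect, a data dictionary: a shared statement of which elements are required, which must be drawn from a coded value set, and which may remain free text. The sketch below illustrates the idea with a small validation routine; the field names and value sets are invented for illustration and are not drawn from any real institutional build.

```python
# Hypothetical data dictionary agreed upon by clinical groups up front.
# "coded" elements must come from an allowed value set; others may be
# free text. Field names and value sets here are illustrative only.
DATA_DICTIONARY = {
    "cancer_stage": {"coded": True, "allowed": {"I", "II", "III", "IV"}},
    "primary_site": {"coded": True, "allowed": {"breast", "lung", "colon"}},
    "clinical_impression": {"coded": False},  # free text permitted
}

def validate_record(record):
    """Return a list of problems found in one documentation record."""
    problems = []
    for field, rule in DATA_DICTIONARY.items():
        value = record.get(field)
        if value is None:
            problems.append(f"missing required element: {field}")
        elif rule["coded"] and value not in rule["allowed"]:
            problems.append(f"non-coded value for {field}: {value!r}")
    return problems

record = {
    "cancer_stage": "II",
    "primary_site": "lung",
    "clinical_impression": "stable disease, continue current therapy",
}
print(validate_record(record))  # → []
```

Checks like this are most useful when run at the point of data entry, so that problems are caught before the record is committed rather than discovered during a later extraction.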
Our experience also provides an example of the complexities of extracting data for secondary purposes. The underlying database in Epic, called Chronicles, is based on a system called Caché. Caché, in turn, is based on MUMPS (the Massachusetts General Hospital Utility Multi-Programming System), initially developed nearly a half-century ago. MUMPS is efficient but complex, and therefore a subset of the Epic data are transferred nightly to a second database called Clarity. Clarity uses databases that support the Structured Query Language (SQL), the current standard for extracting data. This is advantageous except for a few not-so-minor details. First, Clarity represents only a subset of what the full Chronicles database contains; as a result, some data may not be readily available. Second, because the data are not updated in real time, some uses of the data might not be possible, such as building a system that uses an up-to-the-minute patient schedule to identify eligible study patients who have just arrived in the clinic. Third, Clarity contains thousands of database tables, making the extraction of data complex even for talented SQL programmers. As a result, some institutions have had to build a third instance of the Epic data using simplified data structures that are easier to understand and query.
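To make the table-count problem concrete, consider how even a simple question spans multiple tables in a normalized clinical schema. The toy example below, using an in-memory SQLite database, shows a three-way join just to list lab values for a patient; the table and column names are invented for illustration and are far simpler than a real Clarity schema, where the same information is spread across many more tables.

```python
import sqlite3

# Toy stand-in for a normalized clinical reporting database.
# Table and column names are invented; a real Clarity extract would
# involve many more tables and far more complex joins.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE patient    (pat_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE encounter  (enc_id INTEGER PRIMARY KEY, pat_id INTEGER,
                         enc_date TEXT);
CREATE TABLE lab_result (enc_id INTEGER, loinc_code TEXT, value REAL);
INSERT INTO patient    VALUES (1, 'Test Patient');
INSERT INTO encounter  VALUES (10, 1, '2015-06-01');
INSERT INTO lab_result VALUES (10, '2345-7', 5.4);  -- LOINC 2345-7: glucose
""")

# Even the simple question "lab values for a patient" already
# requires joining three tables.
rows = con.execute("""
    SELECT p.name, e.enc_date, l.loinc_code, l.value
    FROM patient p
    JOIN encounter e ON e.pat_id = p.pat_id
    JOIN lab_result l ON l.enc_id = e.enc_id
""").fetchall()
print(rows)  # → [('Test Patient', '2015-06-01', '2345-7', 5.4)]
```

Multiply this pattern across thousands of tables, and the motivation for simplified, analysis-oriented copies of the data becomes clear.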
Because of the complexity of the data and the underlying system architectures, strong technical skills are required to extract the data. In fact, Epic often requires that individuals who extract data from Clarity become ‘certified’ or ‘proficient’ by studying educational materials and subsequently passing an examination. As a result, data extraction should probably be done by informatics or information technology professionals.
Other challenges also exist in using the data that have been stored in Clarity. Much of the ‘structure’ of the documentation is stripped out, including line feeds and table grids. Thus metadata are lost, and documents that initially contained a structured table of data, or presented information where line feeds matter, lose some integrity in this transformation. Bypassing this limitation often requires complex technical interventions, such as intercepting HL7 feeds as data are passed between systems and databases. Furthermore, even if data are structured, it is still necessary to know what they mean; that is, what the stored codes actually represent. For example, if a clinical group chooses to classify pain on a scale of 1 to 5, it will be necessary to know whether 1 or 5 represents the most severe pain. Such metadata might not be captured in the database, and therefore accurately recording those details in a separate searchable database may become important. The issue might be complicated further if another group chooses the same scale for pain but with the meaning of the numbered sequence reversed, or if a group chooses to measure pain on a completely different scale (e.g., “none”, “moderate”, “severe”). Complex mapping of concepts may be required to integrate data from disparate sources, even from different clinical groups within a single Epic installation. Each organization needs to manage terminology concepts in order to share data within an institution. Data sharing between institutions also requires management of terminology concepts, highlighting the importance of using standardized and concept-based terminologies (e.g., SNOMED-CT).
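The pain-scale scenario above can be sketched as a small normalization step: each local scale is mapped onto one shared representation before the data are combined. The group names, scale orderings, and the common 0–4 severity scale below are invented for illustration.

```python
# Hypothetical mapping of three local pain scales onto a common
# 0-4 severity scale (0 = no pain, 4 = worst pain). All scale
# definitions here are illustrative assumptions.
TO_COMMON = {
    # Group A records 1 = no pain ... 5 = worst pain
    "group_a": {i: i - 1 for i in range(1, 6)},
    # Group B uses the same 1-5 scale, but reversed
    "group_b": {i: 5 - i for i in range(1, 6)},
    # Group C uses a categorical scale
    "group_c": {"none": 0, "moderate": 2, "severe": 4},
}

def normalize_pain(source_group, value):
    """Translate a locally coded pain score into the common scale."""
    return TO_COMMON[source_group][value]

print(normalize_pain("group_a", 5))         # → 4 (worst pain)
print(normalize_pain("group_b", 5))         # → 0 (no pain)
print(normalize_pain("group_c", "severe"))  # → 4
```

The point is not the code itself but the metadata it encodes: without a recorded statement of each scale's direction and meaning, the identical stored value 5 would be integrated as both "worst pain" and "no pain."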
When clinical groups can agree on essential data elements and standardized definitions, and adopt semantic interoperability standards, sharing data and performing subsequent data extractions become easier. This can greatly reduce the amount of time needed over the long term for continued data extraction and use. Data from one source (e.g., EHRs) could be automatically shared with other receiving systems that similarly adopt semantic interoperability standards. The primary lesson to be learned is that dealing with these issues up front, including the use of standards for semantic interoperability and data transfer, can make the downstream sharing of data much easier.