OBJECTIVE: To develop and validate an informatics infrastructure for syndrome surveillance, decision support, reporting, and modeling of critical illness.
METHODS: Using open-schema data feeds imported from electronic medical records (EMRs), we developed a near-real-time relational database (Multidisciplinary Epidemiology and Translational Research in Intensive Care Data Mart). Imported data domains included physiologic monitoring, medication orders, laboratory and radiologic investigations, and physician and nursing notes. Open database connectivity supported the use of Boolean combinations of data that allowed authorized users to develop syndrome surveillance, decision support, and reporting (data “sniffers”) routines. Random samples of database entries in each category were validated against corresponding independent manual reviews.
RESULTS: The Multidisciplinary Epidemiology and Translational Research in Intensive Care Data Mart accommodates, on average, 15,000 admissions to the intensive care unit (ICU) per year and 200,000 vital records per day. Agreement between database entries and manual EMR audits was high for sex, mortality, and use of mechanical ventilation (κ, 1.0 for all) and for age and laboratory and monitored data (Bland-Altman mean difference ± SD, 1(0) for all). Agreement was lower for interpreted or calculated variables, such as specific syndrome diagnoses (κ, 0.5 for acute lung injury), duration of ICU stay (mean difference ± SD, 0.43±0.2), or duration of mechanical ventilation (mean difference ± SD, 0.2±0.9).
CONCLUSION: Extraction of essential ICU data from a hospital EMR into an open, integrative database facilitates process control, reporting, syndrome surveillance, decision support, and outcome research in the ICU.
EMR = electronic medical record; ICU = intensive care unit; IRB = Institutional Review Board; METRIC = Multidisciplinary Epidemiology and Translational Research in Intensive Care; SQL = structured query language
The relevance of care in the intensive care unit (ICU) to public health in the United States is reflected in annual figures of 4.4 million ICU admissions, 500,000 deaths, 13.3% of hospital costs, 4.2% of national health expenditures, and 0.56% of the gross domestic product.1,2 This demand is expected to increase as the US population ages; patients older than 65 years currently account for more than 55% of all ICU days.3,4 Unmeasured burdens include the high degree of disability and loss of productivity for both ICU survivors and their caregivers.5-7
The complexity of the ICU environment, characterized by a vast amount of information and the critical importance of timing of interventions, presents a major barrier to safe and efficient care delivery.8,9 Recent advances in medical informatics and the anticipated widespread implementation of electronic medical records (EMRs) combine to provide an opportunity to facilitate processes for delivery of safe, high-quality care in the ICU.
This article describes the development and implementation of the Multidisciplinary Epidemiology and Translational Research in Intensive Care (METRIC) Data Mart, an informatics infrastructure for syndrome surveillance, decision support, reporting, and modeling of critical illness at Mayo Clinic.
The Institutional Critical Care Committee approved and supported the METRIC Data Mart project. Mayo Clinic Institutional Review Board (IRB) approval is required for specific research studies using METRIC Data Mart.
The Mayo Clinic campus in Rochester, MN, is an academic medical center with 1900 beds and 135,000 hospital admissions per year. The combined capacity of the ICUs is 204 beds and 14,800 admissions per year. Saint Marys Hospital has 183 ICU beds: 24 general medical, 16 medical cardiology, 25 cardiac surgery, 8 transplant surgery, 20 thoracic or vascular surgery, 24 trauma critical care, 20 neurologic, 26 neonatal (with the option of dual-occupancy stay for twins in 4 of them), and 16 pediatric. Rochester Methodist Hospital has a 21-bed medical-surgical ICU. Every ICU bed has full physiologic monitoring capabilities and a computer workstation with access to EMR and clinical databases.
Beginning on March 21, 2005, the medical records of all new patients coming to the Mayo Clinic campus in Rochester, MN, were stored in an electronic form. Moving to the Mayo Integrated Clinical System has been a large, multiphase project occurring over 11 years.10 About 64,000 clinical notes and 900,000 laboratory results and reports are entered into EMRs each week. The Rochester campus has about 16,000 standardized workstations with full access to the EMR and related applications.11 The Mayo Clinic EMR complies with the American National Clinical Document Architecture, a widely accepted standard for clinical documentation.12
The METRIC Data Mart (Figure 1) is a Microsoft Structured Query Language (MS SQL) (Microsoft Corporation, Redmond, WA) relational data warehouse that provides direct access to the pertinent EMR data of critically ill patients in both Mayo Clinic hospitals in Rochester, MN. Replicated EMR data are available in near real time, with a delay ranging from 15 minutes for monitored data and laboratory results to 4 hours for chest radiograph reports and clinical notes. Static data such as the Minnesota death registry are updated once per quarter. Technology standards, architecture, and policies are based on the strategic direction established by the Mayo Clinic Information Technology Committee. The METRIC Data Mart was first implemented in November 2004 and has been the subject of continuous development and improvement since then.
The development of complex near real-time, high-volume databases required several steps, as outlined in Figure 2.
Planning. The data generated in caring for patients in the ICU are of such variety that the principal technical challenge encountered during planning was the development of a data warehouse with the versatility to cope with that heterogeneity. Examples of heterogeneity encountered in the ICU include the following: (1) data generated through the continuous monitoring of physiologic vital signs, such as blood pressure, heart rate, and respiratory rate; (2) data generated through additional continuous monitoring, such as electrocardiography, venous oxygen saturation, and pulse oximetry, depending on the clinical circumstances; and (3) intermittent variables that also need to be accommodated, including laboratory data, fluid balance, pharmacology intervention, and respiratory/hemodynamic parameters.
Data Acquisition. Administrative approval was sought and confirmed before any medical data were accessed and stored. The METRIC Data Mart is updated from original sources using technically differing approaches, including direct SQL transactions (“push”), stored procedures, and Web services.
Determination of Data Fields to be Included. Not all available data fields are extracted from original sources and stored in the METRIC Data Mart. Stored items were chosen on the basis of necessity from a clinical, administrative, or technical task performance perspective. A multidisciplinary approach in the planning stage assisted with identification of vital data and ultimately resulted in improved METRIC Data Mart performance and utility.
Development of Data Model and Design of Database Structure. The METRIC Data Mart database design combines different approaches to achieve a relative balance between competing needs: speed, reliability, and the ability to extend the database without interruption. In an Entity-Attribute-Value database, data are stored in a single table with 4 columns: (1) patient identification, (2) attribute, (3) value, and (4) time. Such a design simplifies the physical layout of data tables and was chosen in preference to traditional alternatives such as the relational database design, which may perform very slowly for cross-tab reports or aggregations.
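The four-column Entity-Attribute-Value layout described above can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module rather than MS SQL, and the table name, column names, attribute names, and sample values are all hypothetical; only the four-column structure comes from the text. Storing every value as a number is a further simplifying assumption.

```python
import sqlite3

# In-memory database for illustration; the production system described in the
# article uses MS SQL. Table and column names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE observation (
        patient_id TEXT NOT NULL,  -- (1) patient identification
        attribute  TEXT NOT NULL,  -- (2) attribute
        value      REAL,           -- (3) value (numeric only in this sketch)
        obs_time   TEXT NOT NULL   -- (4) time, ISO 8601 text
    )
    """
)
conn.execute(
    "CREATE INDEX idx_obs ON observation (patient_id, attribute, obs_time)"
)

# Made-up sample rows.
rows = [
    ("P001", "heart_rate", 88.0, "2008-01-01T08:00"),
    ("P001", "heart_rate", 112.0, "2008-01-01T08:15"),
    ("P001", "spo2", 93.0, "2008-01-01T08:15"),
    ("P002", "heart_rate", 71.0, "2008-01-01T08:05"),
]
conn.executemany("INSERT INTO observation VALUES (?, ?, ?, ?)", rows)

# A new kind of measurement needs no schema change, only a new attribute name.
conn.execute(
    "INSERT INTO observation VALUES ('P002', 'lactate', 2.4, '2008-01-01T09:00')"
)


def max_by_patient(conn, attribute):
    """Return (patient_id, maximum value) pairs for one attribute."""
    cur = conn.execute(
        "SELECT patient_id, MAX(value) FROM observation "
        "WHERE attribute = ? GROUP BY patient_id ORDER BY patient_id",
        (attribute,),
    )
    return cur.fetchall()


peak_hr = max_by_patient(conn, "heart_rate")
```

In this layout, adding the lactate measurement required no change to the table definition, which is the kind of extension without interruption that the design aims for.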
We chose MS SQL as our database for a number of reasons. It is the institutional standard and is supported by an established informatics infrastructure. In addition, it has good reliability, storage capacity, and performance, which were sufficient to handle an estimated 15,000 admission episodes per year.
Validation of Completeness and Accuracy. Because of the enormous volume, it is impossible to manually audit all data generated in the ICU. Random samples of data from each category were validated against independent manual audits of actual EMR data. Correlation, κ statistics, and Bland-Altman plots were used as appropriate to test the agreement between database entries and manual EMR audits.
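The two agreement measures named above can be computed as in the following sketch. It assumes paired observations (database entry and manual audit value) for the same sampled records; the function names are hypothetical, and this is not the authors' analysis code, which used statistical software packages.

```python
from statistics import mean, stdev


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    # Observed proportion of exact agreement.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from each source's marginal frequencies.
    labels = set(a) | set(b)
    expected = sum((a.count(lab) / n) * (b.count(lab) / n) for lab in labels)
    if expected == 1.0:
        return 1.0  # both sources used one identical label throughout
    return (observed - expected) / (1 - expected)


def bland_altman(x, y):
    """Mean and sample SD of the paired differences x - y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [xi - yi for xi, yi in zip(x, y)]
    return mean(diffs), stdev(diffs)
```

For a categorical field such as use of mechanical ventilation, `cohen_kappa(database_values, audit_values)` would be the relevant call; for a continuous field such as duration of ICU stay, `bland_altman` returns the mean difference and SD in the form reported in the Results.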
Maintenance of Real-Time Update. The primary functions of the support team were determined to include data quality assurance, real-time outage detection, and timely repair of the infrastructure.
Detection and Repair of Outage. The requirements for outage notification and repair included the development of basic statistical analysis that runs continuously to detect unusual data patterns suggestive of possible outages; specific algorithms to notify the support team when a possible interruption in data transfer has been detected via display message paging and Web-based “dashboards”; and support staff work flow that facilitates the filling of detected data gaps during regular business hours.
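The article does not specify the statistical analysis used for outage detection. One simple possibility, shown here purely as an assumption, is to compare the number of records received in the most recent interval with the mean and SD of counts from comparable past intervals and to flag a large shortfall. The function name and threshold are hypothetical.

```python
from statistics import mean, stdev


def feed_looks_interrupted(history, latest, z_threshold=3.0):
    """Return True when the latest record count is far below historical counts.

    history: record counts from comparable past intervals (hypothetical input)
    latest: record count for the most recent interval
    z_threshold: number of SDs below the historical mean that triggers a flag
    """
    if len(history) < 2:
        raise ValueError("need at least two historical intervals")
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # No historical variation: any drop below the usual count is flagged.
        return latest < mu
    return (mu - latest) / sigma > z_threshold
```

A flagged interval would then be passed to the notification step (paging and dashboards) described above; that step is not sketched here.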
The interface and query design required that authorized users be able to access the METRIC Data Mart through open database connectivity using either (1) JMP or SAS statistical software (JMP, Version 6; SAS Institute, Cary, NC) or (2) MS Access and MS Excel software packages (Microsoft Corporation, Redmond, WA), according to user preference. An index of available dictionaries and search terms was constructed and made available to help users understand which types of information are available and how queries should be constructed. This information was contained within a warehouse metadata table. Java, JMP, and SAS scripting languages were used for the initial modeling and rules development. Boolean combinations of data matching and natural language processing were used for syndrome surveillance and detection of alert events. Web reports and e-mail and pager alerts were developed for use within the institutional intranet, e-mail, and paging systems. Rules were developed to direct those alerts to appropriate persons (administrators, clinicians, and researchers).
Adequate security and confidentiality of patient information were major prerequisites of this project. These requirements were achieved by development of an operating protocol that incorporates a variety of features. A single log-on password, synchronized for multiple applications and changed on a regular basis, protects access to the METRIC Data Mart and complies with institutional security policies for access to clinical data. Any access to the database for research requires prior IRB approval. In keeping with the 1997 Minnesota State Law (1997 Minn. Stat. §144.335), data may be accessed only from the records of patients who authorized the use of their medical records for research purposes. On this final point, the institution records authorization information at the time of admission, and historically more than 93% of ICU patients admitted to Mayo Clinic permit such use.
Data sources, metadata, and frequency and timeliness of updates for individual categories of data available within the METRIC Data Mart are described in Table 1. A “demo” table contains all demographic information and serves as the main index for ICU admissions. The combination of the patient's medical record number and the time of ICU admission serves as a unique identifier in the database.
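The composite identifier described above can be sketched as a two-column primary key. The example uses Python's sqlite3 module for illustration; the column names and sample values are hypothetical, and only the table name "demo" and the choice of key (medical record number plus time of ICU admission) come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE demo (
        mrn            TEXT NOT NULL,  -- medical record number
        icu_admit_time TEXT NOT NULL,  -- time of ICU admission, ISO 8601 text
        age_years      INTEGER,
        sex            TEXT,
        PRIMARY KEY (mrn, icu_admit_time)
    )
    """
)


def add_admission(conn, mrn, icu_admit_time, age_years, sex):
    """Insert one ICU admission; return False if that admission already exists."""
    try:
        conn.execute(
            "INSERT INTO demo VALUES (?, ?, ?, ?)",
            (mrn, icu_admit_time, age_years, sex),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this key, a readmission of the same patient is stored as a separate row because its admission time differs, whereas a repeated import of the same admission is rejected.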
Table 2 lists the results of random sample validation by manual record review for each specific category of ICU data (demographics, severity of illness, admission diagnoses and comorbidities, monitored data, diagnostic tests, interventions, and outcomes). Agreement between methods ranges from perfect for raw laboratory and monitored data, age, sex, and mortality to moderate for calculated variables and natural language processing.
The METRIC Data Mart serves as a clinical reporting tool for management and quality improvement within the ICUs. A Web-based application generates custom tabular reports for each ICU as well as “control charts” of both outcomes and processes of care based on metrics developed by the Institutional Critical Care Committee.
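The article does not state which type of control chart is produced. As an illustration only, the sketch below assumes a p-chart for a proportion-type metric (for example, a monthly event rate) with conventional 3-sigma limits; the function names and input format are hypothetical.

```python
from math import sqrt


def p_chart_limits(events, totals):
    """Return the center line and per-period (lower, upper) 3-sigma limits.

    events: number of events in each period
    totals: number of patients (or opportunities) in each period
    """
    p_bar = sum(events) / sum(totals)  # overall proportion = center line
    limits = []
    for n in totals:
        half_width = 3 * sqrt(p_bar * (1 - p_bar) / n)
        limits.append((max(0.0, p_bar - half_width), min(1.0, p_bar + half_width)))
    return p_bar, limits


def out_of_control(events, totals):
    """Return the indices of periods whose proportion falls outside its limits."""
    _, limits = p_chart_limits(events, totals)
    flagged = []
    for i, (e, n) in enumerate(zip(events, totals)):
        lower, upper = limits[i]
        if not lower <= e / n <= upper:
            flagged.append(i)
    return flagged
```

Periods returned by `out_of_control` are the points that would fall outside the control limits on such a chart and would prompt review.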
Standard ICU monitoring and alarm systems are grossly inadequate, have poor specificity, and lack the ability to recognize complex physiologic syndromes, such as acute lung injury or severe sepsis. Automated recognition of these conditions requires integration of monitored variables with specific laboratory and radiologic findings. Development of such algorithms follows a specific process for each condition of interest. A clinical definition (for example, the definition for acute lung injury of the American-European Consensus Conference on Acute Respiratory Distress Syndrome14) is broken down into clearly defined component concepts, and their corresponding informatics representation within the EMR is sought. Once these representations are identified, a query can be written that closely approximates the clinical definition and facilitates automated syndrome surveillance.
For example, the concept hypoxemia is represented within the electronic environment as the ratio of partial pressure of oxygen to inspired oxygen concentration (Pao2/Fio2 [fraction of inspired oxygen]) of less than 300 in arterial blood gas analysis, and it is retrieved using specific codes for Pao2 and Fio2 in the laboratory database. Similarly, the concept chest radiographic evidence of pulmonary edema is represented on the basis of a text query of radiologic interpretation for the words edema or bilateral + infiltrates. The process of defining criteria is iterative, with refinement of the algorithms based on the observed sensitivity and specificity during the testing period.
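The two component concepts described above can be combined into a Boolean rule, as in the following sketch. It covers only these two components, not the full consensus definition, and it is not the production algorithm. The function names are hypothetical; the sketch assumes that Fio2 is stored as a fraction (0.21 to 1.0), that Pao2 is in mm Hg, and that "edema or bilateral + infiltrates" means the word edema, or both of the words bilateral and infiltrates, appearing in the report.

```python
import re

PF_THRESHOLD = 300  # from the text: Pao2/Fio2 of less than 300


def pf_ratio(pao2_mmhg, fio2_fraction):
    """Pao2/Fio2 ratio, with Fio2 expressed as a fraction (assumption)."""
    if not 0.21 <= fio2_fraction <= 1.0:
        raise ValueError("Fio2 must be a fraction between 0.21 and 1.0")
    return pao2_mmhg / fio2_fraction


def report_suggests_edema(report_text):
    """Text match on a radiology report: 'edema' OR ('bilateral' AND 'infiltrates')."""
    text = report_text.lower()
    has_edema = re.search(r"\bedema\b", text) is not None
    has_bilateral_infiltrates = (
        re.search(r"\bbilateral\b", text) is not None
        and re.search(r"\binfiltrates?\b", text) is not None
    )
    return has_edema or has_bilateral_infiltrates


def ali_sniffer(pao2_mmhg, fio2_fraction, report_text):
    """Return True when both the hypoxemia and the radiographic criteria are met."""
    hypoxemia = pf_ratio(pao2_mmhg, fio2_fraction) < PF_THRESHOLD
    return hypoxemia and report_suggests_edema(report_text)
```

A plain keyword match such as this one also fires on negated phrases (for example, "no edema"), which illustrates why the criteria need iterative refinement against observed sensitivity and specificity.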
Once fully developed, an algorithm can be run within the EMR of all patients, and when prespecified alert conditions are met, a notification will be provided via text page to bedside providers (“check the ventilator, room XXXX”) or research coordinators. A schematic diagram of the METRIC Data Mart intelligent alerts (“sniffers”) from inception to notification of end users is presented in Figure 1. A list of METRIC Data Mart-supported projects related to syndrome surveillance, decision support, and modeling of critical illness is presented in Table 3.
With implementation of a comprehensive EMR at our institution, an opportunity arose to construct and validate the METRIC Data Mart, a custom, near real-time data warehouse.
After an Institute of Medicine Report19 highlighted an unacceptable rate of medical errors in hospitalized patients, the National Institutes of Health Roadmap called for the urgent support of patient safety initiatives, quality improvement, and translational outcome research. All these initiatives are facilitated by a flexible, EMR-based informatics infrastructure such as the METRIC Data Mart. The resulting infrastructure facilitated development of the following: (1) process control, reporting, and feedback; (2) “smart” alarms (“sniffers”) with custom data displays for decision support and clinical research; and (3) modeling of health care delivery to critically ill patients.
Electronic medical records are becoming integral components of health care services and soon will likely replace established paper-based medical records. The American Medical Association recently published a white paper with key recommendations regarding secondary uses of electronic health information.20 Secondary analysis of patient data can expand knowledge about diseases and appropriate treatments, improve understanding of effectiveness and efficiency, support public health and safety goals, and aid health care businesses in meeting consumer needs.
Although implementation of ICU-specific electronic data storage dates back to the late 1970s (HELP system21), the complexity of the ICU environment and the need for manual data entry have limited progress in the field. Secondary use of EMR data as a knowledge discovery tool in the ICU has been hampered by the heterogeneity of source data, vendor products, laboratory and diagnostic coding, and the relationships between individual data elements. Modern EMRs typically do not track ICU-specific syndromes (eg, shock, sepsis), processes of care (eg, goal-directed resuscitation), or outcomes (eg, duration of mechanical ventilation).
Commercial clinical databases, such as Acute Physiology and Chronic Health Evaluation (APACHE, Cerner Corporation, Kansas City, MO), enable the calculation of severity-adjusted mortality and resource utilization and have been used extensively in clinical research.22 However, the time and cost of data collection have hindered widespread use of such systems for quality assurance and outcome research in the ICU. Some patient data management systems currently offer semiautomatic score calculation; nevertheless, some data values still have to be entered manually through a separate interface.23 The ideal ICU informatics system would extract pertinent information automatically and eliminate the workload and error associated with manual data entry.24-26 The availability of accurate, automatically collected, near real-time granular electronic data greatly facilitates the development of specific tools to allow analysis of complex, multidimensional care delivery processes and outcomes.27
The METRIC Data Mart offers an opportunity to test and implement “smart” alarms and electronic decision support tools in the ICU (Figure 1, Table 3). Conventional electronic warning systems generally are limited to nonspecific safety alarms and lack the complexity needed for meaningful decision support for critically ill patients. Moreover, traditional monitoring systems have low clinical importance and may distract bedside clinicians28 from useful tasks. Specifically, bedside ICU monitors are not designed to recognize the development of complex physiologic syndromes and, except for monitoring cardiac arrhythmias, are of limited use in daily practice.
Similar problems characterize decision support triggers, such as abnormal laboratory alerts and early computerized provider order entry applications. We have deployed computerized clinician order entry throughout Mayo Clinic hospitals and have introduced a number of decision support rules for real-time interventions. These have the advantage of directing prescriber actions at the time of order entry. However, clinically relevant events are not restricted to single laboratory values or physiologic measures, often including parameters obtained from monitors, free text notes, laboratory data, and medication orders. An urgent need exists for diagnostic alarms that can be triggered by specific time-sensitive pathophysiologic conditions rather than by variance from thresholds set for individual variables. Preliminary studies of other rule-based alerts and computer-generated messages directed at bedside clinicians report that this approach can lead to faster recognition of pneumothorax29 and a small but significant decrease in the hospital length of stay.30
Clinical and translational research in the hospital environment critically depends on medical records. Most researchers and study coordinators retrieve patient data manually from the EMR and reenter it into research databases. This process is time-consuming, especially with large sample sizes, and introduces error. Automatic transfer from EMR to research databases has been shown to be an efficient and accurate alternative.31 The technologies used in the METRIC Data Mart can be implemented in any EMR hospital environment. The major difference between institutions will be in the resolution and volume of available data.
Electronic screening algorithms facilitate enrollment into time-sensitive clinical studies, an essential value for critical care research. Additionally, large electronic repositories of clinical data enable epidemiological studies to be conducted at modest cost and provide a platform for novel analytic approaches, such as artificial intelligence for data mining or neural networks for complex system analysis.
Determinants of outcomes in the ICU are complex and incompletely understood. The most obvious contributors would appear to be the chronic health state and severity of illness at presentation to the ICU. However, it appears likely that variation in systems of health care delivery and their components plays an important role in determining outcome for any given severity of illness. Analyzing the role that characteristics of individual components (eg, patients' or family members' preferences, health care providers' performance, ICU organizational and environmental factors) and their interactions play in determining outcome presents a major challenge to investigators. The EMR provides an ideal platform for data coordination, synchronization, and integration of complex systems information and thus facilitates meaningful analysis of that role.
Future iterations of the EMR will transform the electronic environment from a static repository of physiologic data to a dynamic clinical information platform that facilitates patient care and continuous quality improvement initiatives. The current capabilities of the METRIC Data Mart illustrate some of that anticipated function and already contribute to surveillance of patient outcome and health care system performance (eg, provider compliance with sepsis bundle guidelines). By combining traditional patient data (eg, vital signs, laboratory results) with health care delivery process data (eg, procedure time, procedure checklist), a new generation of EMRs will support the development of a high-reliability, low-error health care environment.
Numerous factors limit the more rapid adoption of tools such as the METRIC Data Mart. Human factors pose a major barrier for wider implementation because a large proportion of potential users have a low tolerance for complexity or for learning new tools.32 Barriers in the ICU include the reliability of data capture, information overload, and meaningful interpretation.
Successful deployment requires a multidisciplinary approach, including clinical content experts (intensivists, nurses) and specialists in medical informatics, statistics, and computer science. In many institutions, existing data sources are isolated within systems with different structures and coding systems. Extracting data from disparate clinical sources requires content expertise in addition to familiarity with the database and raw data. This expertise is all the more important because many of the databases do not have adequate supporting documentation.
Using open, flexible data schemas facilitates development of systems like the METRIC Data Mart in other institutions; established resources are used initially, and the system can be extended when new electronic data sources become available. The availability of metadata will make the exchange and integration of such systems less challenging and less costly in the future.
The METRIC Data Mart provides a novel informatics infrastructure for the extraction of ICU data from a hospital EMR into an open, integrative database. With the inevitable arrival of the EMR in hospital practice, the availability of an appropriately designed and implemented resource, such as that described in this study, will become an essential requirement for the conduct of activities such as process control, reporting, syndrome surveillance, decision support, and outcome research in the ICU.
This publication was made possible by grant 1 KL2 RR024151 from the National Center for Research Resources (NCRR), a component of the National Institutes of Health (NIH), the NIH Roadmap for Medical Research, and Mayo Foundation. Its contents are solely the responsibility of the authors and do not necessarily represent the official view of the NCRR or NIH. Information on NCRR is available at http://www.ncrr.nih.gov/. Information on Reengineering the Clinical Research Enterprise can be obtained from http://nihroamap.nih.gov/clinicalresearch/overviewtranslational.asp. This study was supported in part by National Heart, Lung and Blood Institute grant K23 HL78743-01A1 and NIH grant KL2 RR024151.