Problems in the structure, consistency, and completeness of electronic health record data are barriers to outcomes research, quality improvement, and practice redesign. This nonexperimental retrospective study examines the utility of importing de-identified electronic health record data into an external system to identify patients with and at risk for essential hypertension. We find a statistically significant increase in cases based on combined use of diagnostic and free-text coding (mean = 1,256.1, 95% CI 1,232.3–1,279.7) compared to diagnostic coding alone (mean = 1,174.5, 95% CI 1,150.5–1,198.3). While it is not surprising that significantly more patients are identified when broadening search criteria, the implications are critical for quality of care, the movement toward the National Committee for Quality Assurance's Patient-Centered Medical Home program, and meaningful use of electronic health records. Further, we find a statistically significant increase in potential cases based on the last two or more blood pressure readings greater than or equal to 140/90 mm Hg (mean = 1,353.9, 95% CI 1,329.9–1,377.9).
The benefits of electronic health records (EHRs) for primary care and the application of these systems to outcomes research and current efforts in practice redesign such as the National Committee for Quality Assurance's Patient-Centered Medical Home program are often hampered by barriers to full integration of EHRs. Common barriers include lack of trust in EHRs to securely store medical records;1, 2, 3, 4, 5 physicians’ views that EHRs interfere with clinical judgment;6, 7 lack of standards in data formatting and lack of interoperability;8, 9, 10, 11, 12, 13, 14, 15, 16 the required time, training, and investment to become proficient in using the systems;17, 18, 19, 20, 21, 22 the absence of local leadership to champion the systems;23, 24, 25, 26, 27 difficulties in organizational redesign to use the EHR;28, 29, 30, 31, 32, 33, 34 and lack of readiness to implement EHRs successfully.35, 36, 37 We sought to examine one problem (the structure, consistency, and completeness of EHR data) by importing de-identified EHR data into an external system for analysis of diagnostic information.
EHRs have the potential to be valuable tools for health outcomes research in primary care38, 39, 40, 41, 42, 43 and a critical component in practice redesign and prevention of chronic diseases such as hypertension through identification of at-risk patients.44, 45 While manual review of medical records is resource intensive,46 using diagnosis codes stored within EHRs permits searching in a more comprehensive and efficient manner.47 However, problems in the structure, consistency, and completeness of EHR data and the use of free-text entries rather than discrete data fields48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58 create barriers to research, outcomes reporting, and quality improvement activities, particularly among smaller, rural practices.59, 60, 61, 62, 63, 64, 65
Given the challenges created by free-text data entry into EHRs, the current study examines the ability to identify cases with essential hypertension by importing de-identified EHR data from 11 West Virginia primary care centers into an external system, in this case a public-domain patient registry. An advantage of the registry is that it is accessible at the practice level and requires no programming or statistical expertise to use. This study examines whether patients with a diagnosis of essential hypertension are missed if searching only by International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) diagnostic codes (401.0–401.9). ICD-9-CM coding is currently used in this particular EHR system. We test the hypothesis that there will be significantly fewer patients identified with hypertension based on ICD-9-CM diagnosis codes relative to use of diagnosis codes plus free-text coding of hypertension. Support of this hypothesis would document the benefits of auditing EHR data for completeness and consistency, inform quality improvement efforts in overcoming barriers to EHR data quality and reliability, and support the National Committee for Quality Assurance health information framework, which highlights the need for interventions designed to improve the management and application of EHR data for research and quality improvement. 
Improving the management and application of EHR data has gained increased attention as a vital component in the overall success of health information technology endeavors.66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76 Secondarily, we hypothesize that significantly more cases will be identified through a third measure based on the guidelines for diagnosis of hypertension presented in the Seventh Report of the Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure (two or more most recent blood pressure readings greater than or equal to 140/90 mm Hg among those without any diagnosis of essential hypertension).77 This third measure identifies those at risk for or undiagnosed with hypertension, and would help document the benefits of analyzing EHR data in an external format.
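The third measure described above can be sketched as a simple rule. This is a minimal illustration, not the registry query used in the study: the function names and the `(systolic, diastolic)` tuple layout are hypothetical, and the text does not say whether both components of a reading must be elevated. The sketch assumes a reading counts as elevated when either component meets its threshold.

```python
from typing import List, Tuple

# Hypothetical layout: (systolic, diastolic) in mm Hg, ordered oldest to newest.
Reading = Tuple[int, int]


def is_elevated(reading: Reading) -> bool:
    """Return True if a reading is at or above 140/90 mm Hg.

    Assumption: either component at or above its threshold counts as elevated.
    """
    systolic, diastolic = reading
    return systolic >= 140 or diastolic >= 90


def flag_at_risk(readings: List[Reading],
                 has_htn_diagnosis: bool,
                 n_recent: int = 2) -> bool:
    """Flag a patient whose most recent n_recent readings are all elevated
    and who has no recorded diagnosis of essential hypertension."""
    if has_htn_diagnosis or len(readings) < n_recent:
        return False
    return all(is_elevated(r) for r in readings[-n_recent:])


if __name__ == "__main__":
    history = [(120, 80), (142, 88), (150, 95)]
    print(flag_at_risk(history, has_htn_diagnosis=False))
```

Patients with an existing diagnosis are excluded first, so the rule only surfaces cases that the diagnosis-based searches would not find.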
This research is a nonexperimental retrospective study of essential hypertension cases identified across 11 West Virginia primary care centers using the same EHR system. We used a previously developed tool to import data from the EHRs into the Chronic Disease Electronic Management System (CDEMS).78 CDEMS is a Microsoft Access–based public-domain registry originally developed by the Washington State Department of Health. Moving the de-identified EHR data to an external system, in this case a registry, allows for data transparency in that key data within the EHR (i.e., patient diagnoses, demographics, vitals, laboratory results and services) can be queried for coding consistency and completeness. Table 1 lists all data elements imported from the EHR into the registry. This registry was chosen as the tool for analysis because it is accessible by each primary care center, allowing methods and tools for this research to be applied at the practice level for quality improvement efforts in data management, identification of at-risk patients, and quality-of-care improvement. De-identified data included in this analysis are for all active patients in the 11 sites as of December 31, 2010.
Queries were built in the registry to search the de-identified EHR data to 1) identify unduplicated patients with a diagnosis of essential hypertension based on ICD-9-CM codes (using the diagnosis and demographic portions of the data); 2) identify unduplicated patients with a diagnosis of essential hypertension based on free-text entries (using the diagnosis and demographic portions of the data); and 3) identify unduplicated patients whose last two or more blood pressure readings were greater than or equal to 140/90 mm Hg and who did not have a documented diagnosis of essential hypertension in either ICD-9-CM code or free-text format (using the diagnosis, demographic, and vital signs portions of the data). Table 2 provides a listing of queries used to identify patients, a description of the functions of each query, and a list of the free-text diagnoses that were detected. Identification of patients with a diagnosis of essential hypertension by ICD-9-CM code was accomplished by limiting search criteria to codes 401.0–401.9. Identification of patients with a free-text diagnosis of essential hypertension required more search steps. Search of the diagnosis field called “Other” required use of the LIKE operator in Microsoft Access with wildcard patterns to locate all matching entries. Wildcard expressions used in the search were “*Hyperten*”; “*HTN*”; “*401*”; and “*Hyper ten*”. Upon review of all free-text results, it was evident that the search returned results relating to other forms of hypertension that needed to be excluded from this study. Excluded from search criteria were the following: “*Pulm*”; “*Neph*”; “*Coat*”; “*Retino*”; “*Pre*”; “*Ocul*”; “*Occul*”; “*Portal*”; “*Gastro*”; “*Partum*”; “*Ortho*”; “*Oval*”; “*Border*”; “*Myopathy*”; “*Medic*”; “*Renal*”; “*Venous*”; “*FH*”; “*Family*”; “*Gest*”; “*Decea*”; “*Tight*”; “*Disea*”; “*Infar*”; “*Intracran*”; and “*Elevated*”. Means, standard deviations, and 95 percent confidence intervals were calculated for the number of cases to measure the differences between each method.
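The include and exclude logic of the free-text search can be illustrated outside of Microsoft Access. The following is a minimal sketch rather than the registry query itself: the function name is hypothetical, each wildcard expression of the form `*term*` is treated as a plain substring test, and matching is assumed to be case-insensitive.

```python
# Terms taken from the wildcard expressions listed in the text, lowercased.
INCLUDE_TERMS = ["hyperten", "htn", "401", "hyper ten"]
EXCLUDE_TERMS = [
    "pulm", "neph", "coat", "retino", "pre", "ocul", "occul", "portal",
    "gastro", "partum", "ortho", "oval", "border", "myopathy", "medic",
    "renal", "venous", "fh", "family", "gest", "decea", "tight", "disea",
    "infar", "intracran", "elevated",
]


def is_essential_htn_free_text(entry: str) -> bool:
    """Return True if a free-text diagnosis matches an include term
    and none of the exclude terms (hypothetical helper)."""
    text = entry.lower()  # assumption: case-insensitive matching
    if not any(term in text for term in INCLUDE_TERMS):
        return False
    return not any(term in text for term in EXCLUDE_TERMS)


if __name__ == "__main__":
    for entry in ["HTN", "Pulmonary hypertension", "Prehypertension",
                  "Benign essential hypertension", "Hyper tension"]:
        print(entry, "->", is_essential_htn_free_text(entry))
```

Because the exclusions are broad substrings (for example “pre” or “medic”), an entry that mentions essential hypertension alongside an excluded term is also dropped, so a manual review of rejected entries may still be worthwhile.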
Based on use of ICD-9-CM codes alone, 12,919 unduplicated patients with essential hypertension were identified in the 11 sites. Searching free-text diagnoses, 898 additional unduplicated patients were identified. Broadening the search criteria to patients whose last two or more blood pressure readings were consistently greater than or equal to 140/90 mm Hg identified an additional 1,076 unduplicated patients not identified by ICD-9-CM codes or free-text entries (range = 297). Use of all three methods identified 14,893 cases. Table 3 presents these findings.
Placing confidence intervals around the means of each patient count method, we find a statistically significant increase in total cases identified with essential hypertension based on combined use of ICD-9-CM coding and free text (mean = 1,256.1, 95% CI = 1,232.3–1,279.7) compared to ICD-9-CM coding alone (mean = 1,174.5, 95% CI = 1,150.5–1,198.3). Furthermore, we find a statistically significant increase in identification of potential cases based on cases in which the last two or more blood pressure readings were greater than or equal to 140/90 mm Hg (mean = 1,353.9, 95% CI = 1,329.9–1,377.9) compared to ICD-9-CM coding plus free-text search. Use of only ICD-9-CM codes missed 13.3 percent of cases as identified using all three methods. Table 4 and Figure 1 present these findings.
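The reported means and the 13.3 percent figure follow directly from the totals across the 11 sites. The short check below is only a worked arithmetic sketch with hypothetical variable names. It does not reproduce the confidence intervals, which depend on per-site counts not given in the text.

```python
N_SITES = 11

icd_only = 12_919        # patients identified by ICD-9-CM codes alone
free_text_added = 898    # additional patients found in free-text diagnoses
bp_added = 1_076         # additional patients found by blood pressure readings

icd_plus_text = icd_only + free_text_added
all_three = icd_plus_text + bp_added


def per_site_mean(total: int) -> float:
    """Mean number of cases per site, rounded to one decimal place."""
    return round(total / N_SITES, 1)


# Share of all identified cases that an ICD-9-CM-only search would miss.
missed_pct = round(100 * (all_three - icd_only) / all_three, 1)

if __name__ == "__main__":
    print(per_site_mean(icd_only))       # 1174.5
    print(per_site_mean(icd_plus_text))  # 1256.1
    print(per_site_mean(all_three))      # 1353.9
    print(missed_pct)                    # 13.3
```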
By auditing EHR data in an external system, this study finds significant limitation in the ability to identify patients with a diagnosis of essential hypertension due to the use of free-text diagnosis entries. This study allows for the identification of a problem in data quality and completeness, and is translational in nature in that the study methods and tools are accessible to each site to monitor documentation processes and make adjustments as needed without time-intensive chart reviews or special programming. Further, importing the EHR data into an external system allows for analysis of blood pressure results to identify patients either undiagnosed with or at risk for development of the condition.
While it may not be surprising that significantly more patients are identified when broadening the search criteria, the implications are critical for quality of care because the identification of patients by health condition is a fundamental step in the process of applying data for quality improvement and reporting. Furthermore, the ability to accurately report data at the population level is central to the Patient-Centered Medical Home program and to meaningful use criteria. The inability to capture all patients by health condition yields reports for Patient-Centered Medical Home and meaningful use purposes that are inaccurate. Likewise, EHRs offer the promise of better patient care through decision support tools such as those that suggest care guidelines and treatment based on a patient's health condition. However, problems in data quality can result in lower levels of provider trust in the data and therefore decreased application of EHR data to patient care. Auditing data within the EHR can help identify these problems and provide the opportunity to correct them, for example, through training on the use of EHRs and through development of practice policies and procedures aimed at eliminating free-text entry of diagnoses. While it may not be feasible to alter the structure or functions of the EHR, it is reasonable to expect that quality improvement efforts centered on training and practice policies will help overcome barriers to data quality.
The substantial variability between clinics that was detected enables identification of clinics that follow best practices (i.e., those with relatively low proportions of diagnoses of hypertension recorded by free text, such as clinics E and F in Table 4), from which other sites can learn and apply documentation practices and policies. Likewise, this analysis aids in identifying clinics at which data management support and follow-up training is warranted (e.g., clinics J and G in Table 4). While clinic-level variability is not addressed within this study, results from this research allow for follow-up research efforts to be designed and conducted with these sites.
Our study reported significant loss in the ability to identify essential hypertension cases due to use of free-text coding. However, the study methods and tools offer translational opportunities at the primary care level, enabling each participating site to use these methods and tools to improve their own office procedures, training, and policies surrounding data entry into EHRs. This study highlights the need for training in data quality and management, even on basic levels such as using EHR templates and discrete fields for data entry rather than free-text fields. Targeted training is advisable because various members of a care team, such as physicians, nurses, medical assistants, and front-office staff, contribute data to the EHR at various points in the care process. Continued monitoring of these sites using tools developed in this research will help determine the long-term benefits of increased attention to EHR data quality. It is reasonable to expect that efforts to improve data quality will bolster improved integration of these systems while also facilitating the use of EHRs for quality-of-care improvement and efforts in practice redesign.
This study points to the need for future research. First, only essential hypertension was studied. Additional health conditions, such as other forms of hypertension, comorbid cardiovascular health conditions, diabetes, or chronic kidney disease, need to be examined. Second, the de-identified data are from only one EHR system. Future research needs to account for data from other systems to see if these findings are replicated. Third, patients with consistently high blood pressure readings need clinical follow-up to determine whether or not they have hypertension. Lastly, additional analyses are needed to account for patient age criteria (i.e., 25–79 years of age) when identifying patients with hypertension. The intent of this research was to identify patients with essential hypertension regardless of age or demographic criteria, thereby permitting initial exploration of the ability to conduct a more rigorous level of analysis of EHR data through importation of data into an external system.
The authors would like to acknowledge the support of and ongoing partnership with the West Virginia Bureau for Public Health, Office of Community Health Systems and Health Promotion.
Adam Baus, the Office of Health Services Research in the West Virginia University Department of Community Medicine in Morgantown, WV.
Michael Hendryx, the West Virginia Rural Health Research Center in Morgantown, WV.
Cecil Pollard, the Office of Health Services Research in the West Virginia University Department of Community Medicine in Morgantown, WV.