The Telemetric and Holter ECG Warehouse (THEW) hosts more than 3,700 digital 24-hour Holter ECG recordings from 13 independent studies. In addition to the ECGs, the repository includes patient information in a separate clinical database whose content varies according to the study focus. In its third year of activities, the THEW database has been accessed by researchers from 37 universities and 16 corporations located in 16 countries worldwide. Twenty publications have been released, primarily focusing on the development and validation of ECG-based technologies. This communication describes the content of the databases of the repository, with a brief summary of the research and development projects completed using these data.
The Telemetric and Holter ECG Warehouse (THEW) is an initiative that was started in late 2008. At that time, the idea of developing a digital repository for continuous ECGs did not sound novel, yet it was supported by a large community of scientists working on the development of ECG technologies related to drug safety. Indeed, the existing repositories of ECG waveforms did not address the needs of scientists looking for ECG waveforms from individuals exposed to cardiac and non-cardiac drugs. Of primary interest was the effect of these drugs on the ventricular repolarization signal, specifically its duration (QT/QTc interval) and its morphology. Initially, this trend was driven by a regulatory need to better understand and measure the effect of drugs and their concentrations on the QT interval. Subsequently, the impact of heart rate and the autonomic regulation of the heart as confounding factors became a major component of this research field. This trend is illustrated in Figure 1, where we plotted the number of publications related to QT interval and drugs over the past 20 years. These publications encompass work on the improvement of QT measurement and on a better understanding of the effect of drugs on the QT interval (effect of heart rate, gender, drug concentration and study design, amongst many other factors). The inception of our initiative was thus driven by the need to better evaluate the role of confounding factors including, but not limited to, heart rate, signal quality and autonomic regulation on the QT interval.
During the first year, the THEW database of ECGs expanded to other scientific arenas, and the repository received ECG waveforms from cohorts of patients not related to drug safety assessment but to more general cardiac diseases: acute myocardial infarction, coronary artery disease and others described in the following sections. Interestingly, these new databases attracted a broader spectrum of scientists and spawned new research projects around the validation of algorithms to measure T-wave alternans, QRS components and repolarization dispersion from these digitally recorded waveforms. In this article, marking the third year of activities around the THEW repository, we provide a detailed description of the data in the repository.
The THEW has grown steadily over the past three years and presently contains 14 databases of continuous ECG waveforms from clinical studies and clinical trials, described in Table 1. This table contains the technical specifications of the ECG waveforms and reveals that the databases are heterogeneous in terms of type of patients and list of clinical parameters. However, the format in which the waveforms are stored is consistent and based on the International Society for Holter and Non-invasive Electrocardiology (ISHNE) file format.1–3 Table 2 provides basic demographic information about each population.
More information about each database can be accessed by using the QR code provided at the beginning of each section. All data within the THEW are compliant with the Health Insurance Portability and Accountability Act (HIPAA), and all patients signed informed consent when enrolled in the various studies described in the following 14 sections.
This database includes 160 3-lead pseudo-orthogonal ECG waveforms. Their sampling frequency is 200 Hz and their amplitude resolution is 10 microvolts. Ninety-three patients with acute myocardial infarction (AMI) were identified based on clinical symptoms: sudden chest pain (typically radiating to the left arm or left side of the neck), shortness of breath, nausea, vomiting, palpitations, sweating, and anxiety. The exclusion criteria for the AMI patients were non-sinus rhythm and major co-morbidity such as malignancy or severe hepatic, renal or cerebral disease. Patients who had prior coronary artery bypass graft surgery (CABG) were also excluded (but not patients with a history of non-CABG coronary revascularization). There is an initial resting supine period of 20 minutes before the start of the ambulatory recording. This set of recordings includes 2 groups (N=96): 1) 67 patients with no pre-specified conditions (age, gender, treatment) and two Holter recordings, acquired 24–48 hrs after the index event and between the 5th and 10th day after the index event; five patients (9019, 9023, 9067, 9096 and 10001) from this first group had only the second recording done; 2) 29 patients in whom one Holter recording was acquired 24–48 hrs after the index event. All patients were recorded in sinus rhythm.
The CAD database includes two hundred and seventy-one digitally recorded 3-lead pseudo-orthogonal ECG waveforms. Their sampling frequency is 200 Hz and their amplitude resolution is 10 microvolts. This group comprises 271 patients with coronary artery disease; their average age is ~59 yrs, their ejection fraction is close to 58%, and 223 of them are male. All patients have a positive angiogram (at least one vessel narrowing >75%) and either exercise-induced ischemia or evidence of previous myocardial infarction. These patients were in the stable phase of ischemic heart disease at the time of the Holter recording; they were enrolled at least 2 months after their event. The patients' history of arrhythmia (primarily ventricular tachycardia), left ventricular systolic and diastolic dimensions, and myocardial infarction location (anterior, inferior, lateral, anterior lateral, inferior lateral, based on ECG tracings) are available. As in the previous database, there is an initial resting supine period of 20 minutes before the start of the ambulatory recording. All patients are expected to be in sinus rhythm. Only cardiac patients without evidence of congestive heart failure were included.
The dataset of 24-hour Holter ECGs from healthy individuals is similar to the two previous databases in terms of technical characteristics. This set comprises 202 recordings (the label of the database claims 203 recordings, but one recording was removed because of its poor quality). These individuals were defined as healthy based on a battery of tests and information including: no history of cardiovascular disease or disorders (stroke, TIA, peripheral vascular disease), no history of high blood pressure (>150/90 mmHg), no medication, no chronic illness, normal physical examination, normal echocardiography, normal exercise testing and no pregnancy.
The recordings in this database come from a thorough QT (TQT) study provided to the THEW by Pfizer Inc. The study was designed as a double-blind, randomized, 5-way crossover study and consisted of a single cohort of 35 subjects. Each subject was randomized to one of 10 possible sequences of treatment and received a single dose of Unknown Compound (UnC; 30 mg, 100 mg and 300 mg), placebo or moxifloxacin on five occasions, each being 10–14 days apart. During each study period subjects were resident for two nights/three days, and a follow-up visit occurred 10–14 days after the final administration. A run-in day, independent of the randomization, was included in period 3, requiring an additional one night/one day residency during which subjects were administered a single blinded placebo dose. The individuals enrolled in the study were healthy males and females between 18 and 55 years old. The database includes one hundred and two digitally recorded 3-lead ECG waveforms (EASI lead configuration). Their sampling frequency is 200 Hz and their amplitude resolution is 4.9 microvolts. Recordings from the placebo, baseline, moxifloxacin and tested drug arms are available in the repository. As a note, the tested drug is associated with a positive QT signal.
The second set of Holter recordings from a TQT study differs from the first one in several respects. First, the technical specifications differ: these are 12-lead 24-hour high-resolution (1000 Hz) ECGs recorded using the Mason-Likar lead configuration with a 3.75 microvolt amplitude resolution. Second, only the baseline, placebo and drug arms are hosted in the repository. The database also includes extensive information regarding the study protocol, such as drug plasma concentration for all considered time points, time and date of food intake, and population demographics.
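The sampling frequencies and amplitude resolutions quoted for the databases above determine how stored sample values map to physical units and how large a recording is. The following is a minimal sketch under two assumptions that are not stated in the text: that "amplitude resolution" means microvolts per stored integer unit, and that each sample occupies 2 bytes (16-bit integers). All function names are illustrative and are not part of any THEW library.

```python
from typing import Iterable, List

SECONDS_PER_DAY = 24 * 60 * 60


def counts_to_millivolts(counts: Iterable[int], resolution_uv: float) -> List[float]:
    """Convert stored integer sample values to millivolts.

    Assumption: resolution_uv is the number of microvolts per integer unit.
    """
    return [c * resolution_uv / 1000.0 for c in counts]


def samples_per_lead(duration_s: float, fs_hz: float) -> int:
    """Number of samples on one lead for a given duration and sampling rate."""
    return round(duration_s * fs_hz)


def raw_size_bytes(duration_s: float, fs_hz: float, n_leads: int,
                   bytes_per_sample: int = 2) -> int:
    """Approximate size of the raw waveform data, ignoring any file header.

    Assumption: samples are stored as 16-bit integers (2 bytes each).
    """
    return samples_per_lead(duration_s, fs_hz) * n_leads * bytes_per_sample
```

Under these assumptions, a 24-hour 3-lead recording at 200 Hz holds 17,280,000 samples per lead (about 104 MB of raw data), whereas a 24-hour 12-lead recording at 1000 Hz amounts to roughly 2.07 GB.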
This unique set of 6 recordings consists of ECGs acquired in patients who experienced drug-induced TdPs in the context of a drug testing protocol. Amongst these individuals, two are patients with the congenital long QT syndrome (LQTS) with KCNH2 mutations (Type 2), and the remaining 4 individuals have an acquired form of the LQTS (sotalol-induced). The sampling frequency of these 12-lead recordings is 180 Hz. Each individual experienced one or multiple episodes of TdPs. These patients had a history of syncope or TdPs, and they were subsequently enrolled for a diagnostic test based on intravenous dl-sotalol (2 mg/kg body weight). The test was used to unmask latent repolarization abnormalities; it is described in more detail in the next section. Four patients had drug-induced TdPs (dLQTS) while the other two had spontaneously induced TdPs (cLQTS).
This group of recordings was acquired not with Holter technology but with resting ECG equipment providing continuous 4–5 min waveforms. The 34 patients were enrolled after being admitted to the University Hospital of Munich, Germany, for documented torsades de pointes (TdPs) in the context of a drug with QT-prolonging potential: sotalol, sumatriptan, amiodarone, bisacodyl, cipramil, furosemide, clarithromycin, erythromycin, roxithromycin. The patients were enrolled for an evaluation of their individual level of repolarization reserve. All patients were genetically tested for the presence of mutations in the major LQTS genes using standard genotyping techniques (genomic DNA was prepared from lymphocytes; KCNQ1, KCNH2, KCNE1, KCNE2 and SCN5A were amplified using polymerase chain reactions, followed by direct sequencing of these major LQT-disease genes). The control group consisted of patients who were started on sotalol for prevention of paroxysmal atrial fibrillation. The study protocol is described in the article published in 2003 by Kaab et al.4 and is the one used in the study that generated the data for the previous database (TdPs). Briefly, dl-sotalol was given intravenously at a constant rate over a 20-minute interval at a dose of 2 mg/kg body weight in 50 ml of 0.9% saline solution to individuals with (+TdPs) and without (−TdPs) a history of drug-induced TdPs. Tests were performed in the morning. Sotalol was injected to unmask latent repolarization abnormalities while patients were closely and continuously monitored in the intensive care unit. Continuous 3 to 4-minute surface 12-lead ECG recordings were acquired at rest in the supine position at baseline and at the 20-min steady-state phase after injection. The THEW database includes 2 ECG tracings per patient: one at baseline and one at peak concentration of the drug.
The ECG recordings are 12-lead recordings following the standard configuration, with a 1000 Hz sampling frequency and a 5 microvolt amplitude resolution. The clinical characteristics of the study population are provided in Table 2. The average ages were not significantly different between the two groups: 59±13 vs. 61±12 yrs. The number of females is slightly higher in the group of individuals without TdPs (N=12) than in the group with TdPs (N=9). The presence of a history of myocardial infarction, coronary artery disease and hypertension was similar between the study groups. There were several patients with a history of AF in both groups (+TdPs: N=11 and −TdPs: N=17, p<0.05). One of the patients had atrial fibrillation during the ECG recording. This ECG was removed from the analysis, resulting in a group of 16 patients with a history of TdPs and 17 individuals free of such history. In the group of individuals with a history of drug-induced TdPs, the events had been reported with a heterogeneous list of medications. None of the patients in the study experienced episodes of TdPs during the sotalol challenge and none of them carried a mutation linked to the major congenital forms of the LQTS.
All ECG recordings from this database were acquired during atrial fibrillation (AF). The arrhythmia duration was defined as the time elapsed from symptom onset (or, in the case of asymptomatic AF, from the last documented sinus rhythm) until cardioversion. All patients (N=73) underwent external DC-cardioversion. Cardioversion was considered successful if AF was abolished and followed by at least two sinus beats. All patients successfully cardioverted underwent a 12-lead ECG prior to hospital discharge (not available in the database). All patients were followed for four weeks. Cardioactive drugs were left unchanged for the duration of the study. Exclusion criteria for the study were pharmacological treatment with Class I or III anti-arrhythmic drugs, planned changes in pharmacological regime, overt congestive heart failure or ischemic heart disease, or implanted pacemaker. The patients rested supine for 10–20 min in a quiet room. The 12-lead ECGs are ~1 minute in duration. More information about the study is available elsewhere.5,6
This large set of 1172 24-hour Holter recordings was acquired during the Ischemia Monitoring and Mapping in the Emergency Department in Appropriate Triage and Evaluation of Acute Ischemic Myocardium (IMMEDIATE AIM) study, a prospective trial in which patients were enrolled between 2002 and 2004 and the 1-year follow-up was completed in December 2005. The overall goal of the IMMEDIATE AIM study was to improve the noninvasive electrocardiogram (ECG) diagnosis of patients who present to the emergency department (ED) with acute coronary syndrome.7 Specific aims were to (1) acquire continuous, 24-hour, standard 12-lead ECG Holter recordings in cohorts of ED patients undergoing evaluation for possible acute coronary syndrome, (2) simultaneously acquire continuous, 24-hour Holter recordings from electrode sites considered optimal for ischemia detection and then estimate body surface potential maps (EBSPM), and (3) compare the sensitivity and specificity of standard electrocardiography with the EBSPM method for identifying acute myocardial ischemia and infarction. The recordings are associated with a large set of clinical factors and endpoints including final diagnosis (acute coronary syndrome (ACS), non-ST elevation myocardial infarction, unstable angina), troponin levels, cardiac arrest and cardiac resuscitation, and death. Importantly, troponin markers are available for most patients. This dataset is very relevant for the evaluation of ECG markers for the presence of ACS.
This Holter database has been nurtured by the French group located in Paris and led by Dr. I. Denjoy and her collaborators. After discovery of the underlying genetic basis of the syndrome, the ECG data accumulated over 25 years were saved together with the results of the genetic tests obtained in 480 LQTS patients. This primary care-oriented database contains clinical data (demographics, symptomatic status, and beta-blocker treatment at the time of recording) of genotyped LQTS patients who have had one or repeated Holter ECG recording(s). This database does not include detailed information regarding the event and time of the reported event. Yet, it represents a unique and important type of database for investigating the association between electrocardiographic phenotype and genotype. Recent work based on this dataset has been published.8
The ECGs from this dataset are from the same study as the database E-HOL-03-1172-012. The particularity of these ECGs is their sampling frequency, which is 1000 Hz rather than 180 Hz. Other specifications for this database are provided in Tables 1 and 2.
This database includes continuous ~20-minute ECG recordings from 927 patients referred for exercise myocardial perfusion single photon emission computed tomography (SPECT). These high-resolution 12-lead ECGs were continuously recorded throughout the baseline, exercise and recovery phases. ECG acquisition was performed at a rate of 1000 samples/second using 16-bit resolution (measurement sensitivity < 0.15 μV). Conventional ECG monitoring during the exercise test was either extracted automatically from the high-resolution ECG traces or acquired using an additional ECG device with a second set of 10 electrodes. The patients underwent either stress/rest 99mTc-sestamibi gated SPECT or stress/redistribution 201Tl SPECT. Perfusion images were visually scored using a 20-segment model of the left ventricle and a 5-point scale (0=normal tracer uptake, 1=mildly reduced, 2=moderately reduced, 3=severely reduced, 4=no uptake). This analysis was performed by expert interpreters, who could override the automatically obtained segmental scores. The summed stress score and summed rest score were calculated as the sum of the individual segment scores for the stress and rest images and converted to percent total myocardium (% myocardium) by dividing by 80 (maximal potential score = 4 × 20). The amount of ischemic myocardium (IM) was calculated as the summed difference score (the difference between summed stress and summed rest scores) divided by 80. Patients were classified as: no ischemia or equivocal (IM<5%), mild ischemia (5%≤IM<10%) and moderate/severe ischemia (IM≥10%).
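The perfusion scoring arithmetic described above can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, and the handling of a negative summed difference score (treated here as falling in the lowest category) is an assumption that the text does not address.

```python
from typing import Sequence

N_SEGMENTS = 20          # 20-segment model of the left ventricle
MAX_SEGMENT_SCORE = 4    # 5-point scale, 0 (normal) to 4 (no uptake)
MAX_TOTAL_SCORE = N_SEGMENTS * MAX_SEGMENT_SCORE  # 80


def summed_score(segment_scores: Sequence[int]) -> int:
    """Sum the 20 segment scores of one image set (stress or rest)."""
    if len(segment_scores) != N_SEGMENTS:
        raise ValueError(f"expected {N_SEGMENTS} segment scores")
    if any(not 0 <= s <= MAX_SEGMENT_SCORE for s in segment_scores):
        raise ValueError("segment scores must lie between 0 and 4")
    return sum(segment_scores)


def percent_myocardium(summed: int) -> float:
    """Convert a summed score to percent of total myocardium."""
    return 100.0 * summed / MAX_TOTAL_SCORE


def ischemic_myocardium_percent(stress: Sequence[int], rest: Sequence[int]) -> float:
    """Summed difference score (stress minus rest) as percent myocardium."""
    return percent_myocardium(summed_score(stress) - summed_score(rest))


def classify_ischemia(im_percent: float) -> str:
    """Apply the thresholds quoted in the text (5% and 10%)."""
    if im_percent < 5.0:
        # Assumption: negative values also land in this category.
        return "no ischemia or equivocal"
    if im_percent < 10.0:
        return "mild ischemia"
    return "moderate/severe ischemia"
```

For example, a patient with four stress segments scored 2 (summed stress score 8) and two rest segments scored 1 (summed rest score 2) has a summed difference score of 6, that is IM = 6/80 = 7.5%, which falls in the mild ischemia category.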
This set of ECGs was recorded in patients with end stage renal disease (ESRD). The aim of the study was to test the hypothesis that measurement of cardiac repolarization heterogeneity in response to dialysis can be used to stratify ESRD patients in terms of risk for sudden arrhythmic death and thereby determine which patients may benefit most from ICD placement. Fifty-one ESRD patients with significant risk for sudden arrhythmic death were enrolled in one of the University of Rochester-affiliated outpatient dialysis centers. Information about patient cardiac history and current drug therapies was recorded. The study protocol included 3 periods: baseline (prior to hemodialysis), the hemodialysis period, and the post-hemodialysis period. Clinical information such as age, gender, BMI and number of months on dialysis is available as well.
The purpose of the Defibrillator in Non-Ischemic Cardiomyopathy Treatment Evaluation (DEFINITE) Study was to determine whether the addition of ICD therapy to optimal medical therapy reduces all-cause mortality in non-ischemic cardiomyopathy patients with low ejection fraction and NSVT/PVCs compared to optimal medical therapy only. The patients enrolled in this trial had a diagnosis of non-ischemic dilated cardiomyopathy, an LVEF < 36%, and PVCs or unsustained ventricular tachycardia. The control group received standard medical therapy, while the test-ICD group received standard medical therapy and a single-chamber ICD. At 2 years, the mortality rate was greater in the control group (14.1%) than in the test-ICD group (7.9%). There were significantly more sudden deaths from arrhythmia in the control group than in the test-ICD group (p=0.006). The investigators concluded that the combination of ICD therapy with standard optimal medical therapy reduced the risk of sudden death from arrhythmia in patients with severe non-ischemic dilated cardiomyopathy. The results of the trial demonstrated that ICD therapy reduces the incidence of “arrhythmic death” in non-ischemic dilated cardiomyopathy patients with severe LV dysfunction and PVCs or NSVT concomitantly treated with ACEI and beta-blockers. There was no statistically significant reduction in all-cause mortality between the “standard therapy” group and the “ICD” group.
This database includes the Holter recordings acquired during the various visits of the patients enrolled in the trial. These 3-lead 24-hour recordings are from 301 patients, and the number of recordings per yearly visit is 140 at baseline, 138 at year 1, 65 at year 2, 33 at year 3, 10 at year 4 and 1 at year 5. The clinical database includes the outcome of the trial in terms of cardiac events and their time of occurrence, including appropriate ICD therapy.
Access to the data from the THEW is free of charge for not-for-profit organizations; membership is required for industry. The process to gain access to the database is rather simple. It involves one form, which can be downloaded from the THEW website. This form is an agreement, called the THEW Data Use Agreement (DUA), defining the legal framework around the use of the data. It specifically stipulates that the organization gaining access to the THEW data cannot distribute these data to other organizations.
Industry members purchase access, while not-for-profit organizations are required to provide a research proposal: a one-page form in which the user broadly describes the field of research related to the use of the data. This requirement allows our group to provide recommendations in terms of collaborations and potential synergies with other organizations if the applicant is open to such collaboration.
There are three recommended paths for accessing the data: 1) obtain a copy of the data by mail (either the full repository content or selected databases), 2) download data from the THEW SFTP server, and 3) use the THEW ECG viewer to view and download ECG epochs. The selection of the mechanism is at the user's discretion. Yet, we highly recommend the direct mailing of a copy of the data on an external hard drive for data transfers larger than 500 GB. Access to the data through our SFTP server can also be done using the THEW ECG software (freely available from our website), which allows reviewing ECG tracings and downloading up to 2 hours of ECG waveforms. Importantly, our initiative also provides a library of code to access the THEW data directly from MATLAB.
The themes of research conducted around the data of the THEW are manifold. We provide below a non-exhaustive list of the current research:
During the first 3 years of activities, we recorded 25 peer-reviewed publications that mentioned or described research work based on data hosted in the repository. This number is expected to grow steadily, at a pace following the growth of the data contained in the repository.
We have developed a scientific resource gathering a unique set of ECG waveforms and clinical information to be used for the development of ECG-related technologies. This paper provides an overview of the data available in the repository.
This work was supported by the National Heart, Lung, and Blood Institute of the U.S. Department of Health and Human Services grant # U24HL096556.