To use linked electronic medical and dental records to discover associations between periodontitis and medical conditions independent of a priori hypotheses.
This case-control study included 2475 patients who underwent dental treatment at the College of Dental Medicine at Columbia University and medical treatment at NewYork-Presbyterian Hospital. Our cases are patients who received periodontal treatment and our controls are patients who received dental maintenance but no periodontal treatment. Chi-square analysis was performed for medical treatment codes and logistic regression was used to adjust for confounders.
Our method replicated several important periodontitis associations in a largely Hispanic population, including diabetes mellitus type I (OR = 1.6, 95% CI 1.30–1.99, p < 0.001) and type II (OR = 1.4, 95% CI 1.22–1.67, p < 0.001), hypertension (OR = 1.2, 95% CI 1.10–1.37, p < 0.001), hypercholesterolaemia (OR = 1.2, 95% CI 1.07–1.38, p = 0.004), hyperlipidaemia (OR = 1.2, 95% CI 1.06–1.43, p = 0.008) and conditions pertaining to pregnancy and childbirth (OR = 2.9, 95% CI: 1.32–7.21, p = 0.014). We also found a previously unreported association with benign prostatic hyperplasia (OR = 1.5, 95% CI 1.05–2.10, p = 0.026) after adjusting for age, gender, ethnicity, hypertension, diabetes, obesity, lipid and circulatory system conditions, alcohol and tobacco abuse.
This study contributes a high-throughput method for associating periodontitis with systemic diseases using linked electronic records.
Periodontitis is a chronic condition affecting gingival tissue and is known to interact with a variety of chronic diseases, including heart disease and diabetes (Williams & Mahan 1960, Mattila et al. 1989, Destefano et al. 1993, Beck & Offenbacher 1998, Grossi & Genco 1998, Lalla et al. 2003, Desvarieux et al. 2005). Identifying the connections between periodontitis and systemic diseases can support a holistic approach to patient care management. Regrettably, information about dental conditions is often inaccessible to physicians, partly due to inadequate interdisciplinary collaboration between dentistry and medicine (Slavkin & Baum 2000). Barriers to linking dental and medical records include disparate coding standards for dental and medical diagnoses, fragmented record systems and different patient identifiers used by dentists and physicians (Din & Powell 2008, Rudman et al. 2010, Theis et al. 2010). These barriers have long prevented large-scale association studies between periodontitis and other medical conditions.
The rapidly widening adoption of electronic health records (EHRs) over the past few years has accelerated the digitization of patient records in various healthcare domains, including dentistry and medicine (111th-Congress 2009, Jha et al. 2009, Bramble et al. 2010, Casnoff et al. 2011). At Columbia University Medical Center, such electronic records have been used to develop a Dental Data Warehouse (DDW) at the College of Dental Medicine and a Clinical Data Warehouse (CDW) at its teaching hospital, NewYork-Presbyterian (NYP). Many patients have records in both databases. This study describes an effective method for mining associations between periodontitis and other medical conditions using linked electronic health records without selecting certain hypotheses a priori.
We used a case-control design to compare the prevalence of medical conditions between patients who had received treatment for periodontitis (case) and patients without periodontitis treatment but with general dental maintenance (control). Using all of the available patient records in our medical centre, we first identified 2126 patients whose dental record contained at least one Current Dental Terminology (CDT) periodontal treatment code (D4000-D4999) between January 3, 2007 and June 17, 2011. Then, we identified 2429 patients with one of two common dental maintenance codes (D0120 or D0150) but no periodontal treatment codes (D4000-D4999) within the same time frame. These dental records were matched to the medical records using first name, last name, gender and date of birth. If any one of these fields was not exactly matched, the patient was removed from further analysis. Duplicate patient identifiers were reviewed to consolidate records belonging to the same patients. After excluding patients who were either not NYP patients or had inconsistent information in one of the data fields, we mapped health records for 58% (1235/2126) of periodontitis patients and 51% (1240/2429) of non-periodontitis patients. Record linkage was performed by database administrators from both the College of Dental Medicine and NYP. Therefore, the researchers of this study only had access to de-identified linked records for analysis purposes. This study was approved by the Columbia University Institutional Review Board (IRB-AAAI9251).
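The selection and linkage rules described above can be illustrated with a minimal Python sketch. The record layout (dictionaries with `id`, `first`, `last`, `gender` and `dob` fields) and the function names are hypothetical, and dropping ambiguous matches is a simplifying assumption; in the study, duplicate identifiers were reviewed manually by the database administrators.

```python
from datetime import date

MAINTENANCE_CODES = {"D0120", "D0150"}  # dental maintenance codes named in the text


def is_periodontal(code):
    """True for CDT codes in the D4000-D4999 periodontal treatment range."""
    return (len(code) == 5 and code[0] == "D" and code[1:].isdigit()
            and 4000 <= int(code[1:]) <= 4999)


def classify(cdt_codes):
    """Return 'case', 'control' or None for one patient's CDT codes."""
    codes = set(cdt_codes)
    if any(is_periodontal(c) for c in codes):
        return "case"          # any periodontal treatment code makes a case
    if codes & MAINTENANCE_CODES:
        return "control"       # maintenance only, no periodontal treatment
    return None                # neither criterion met: not in the study


def link_key(record):
    """Exact-match key: first name, last name, gender and date of birth."""
    return (record["first"], record["last"], record["gender"], record["dob"])


def link(dental, medical):
    """Pair dental and medical records whose four key fields match exactly."""
    index = {}
    for m in medical:
        index.setdefault(link_key(m), []).append(m["id"])
    pairs = []
    for d in dental:
        matches = index.get(link_key(d), [])
        if len(matches) == 1:  # assumption: ambiguous matches are dropped
            pairs.append((d["id"], matches[0]))
    return pairs
```

A patient whose dental and medical records differ in any one of the four fields is left unlinked, which mirrors the exclusion rule in the text.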
The resulting study sample (N = 2475) contained 1235 cases and 1240 controls. For each patient, we extracted all International Classification of Diseases 9th Revision (ICD-9) codes along with date of birth, gender, ethnicity, dental date of service and medical diagnosis date. In our analysis, we obtained only the earliest diagnosis date for each ICD-9 code and the earliest dental service date for periodontitis. For gender and ethnicity, all information was self-reported by the patient at the time of medical or dental service. For assessment of age, we used two variables: birth year and age at time of dental service.
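As an illustration of the extraction step, the following minimal Python sketch keeps only the earliest diagnosis date for each patient and ICD-9 code. The row layout `(patient_id, icd9_code, diagnosis_date)` and the function name are assumptions made for this example, not the study's actual data model.

```python
from datetime import date


def earliest_dates(rows):
    """Map (patient_id, icd9_code) to the earliest diagnosis date seen.

    rows: iterable of (patient_id, icd9_code, diagnosis_date) tuples.
    """
    first = {}
    for patient_id, code, when in rows:
        key = (patient_id, code)
        if key not in first or when < first[key]:
            first[key] = when
    return first
```

The same reduction, applied to dental service dates instead of diagnosis dates, would give the earliest periodontal service date per patient.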
We extracted unique ICD-9 codes for all patients’ diagnoses that occurred between January 1, 2004 and December 16, 2011. Then, we grouped ICD-9 codes into 17 categories according to the AMA ICD-9 coding guidelines (Ama 2008). The following analyses were performed: (1) chi-squared goodness-of-fit tests with multiple hypothesis correction for each unique ICD-9 code observed in at least one patient in both the case and control groups; (2) multivariate logistic regression modelling of chi-square significant ICD-9 codes to identify associations while adjusting for age, gender, ethnicity and tobacco abuse disorder (defined as the presence of ICD-9 305.1 in the patient record); (3) multivariate logistic regression for each ICD-9 category with the same adjustment factors; and (4) literature validation of identified associations. For associations unreported in the literature, we performed further regression modelling while including additional covariates for other lifestyle factors and disease comorbidities. Clinicians diagnose “tobacco abuse disorder” for patients who either present with a condition that is determined to result from tobacco usage or require tobacco cessation treatment, for example, a prescription nicotine patch (Ahrq 2012). Therefore, we used this diagnosis to adjust for patients who smoke or chew tobacco, as they have experienced detrimental health effects of such use.
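The category grouping and the tobacco flag can be sketched as follows. The ranges below are the 17 standard ICD-9-CM numeric chapters; we assume here that they correspond to the 17 AMA categories used in the study, which the text does not list explicitly. Supplementary V and E codes are left ungrouped in this sketch, and the function names are hypothetical.

```python
# Standard ICD-9-CM numeric chapters (assumed to match the 17 study categories).
ICD9_CHAPTERS = [
    (1, 139, "infectious and parasitic diseases"),
    (140, 239, "neoplasms"),
    (240, 279, "endocrine, nutritional and metabolic diseases"),
    (280, 289, "diseases of the blood"),
    (290, 319, "mental disorders"),
    (320, 389, "nervous system and sense organs"),
    (390, 459, "circulatory system"),
    (460, 519, "respiratory system"),
    (520, 579, "digestive system"),
    (580, 629, "genitourinary system"),
    (630, 679, "complications of pregnancy and childbirth"),
    (680, 709, "skin and subcutaneous tissue"),
    (710, 739, "musculoskeletal system"),
    (740, 759, "congenital anomalies"),
    (760, 779, "perinatal conditions"),
    (780, 799, "symptoms, signs and ill-defined conditions"),
    (800, 999, "injury and poisoning"),
]

TOBACCO_ABUSE_CODE = "305.1"  # ICD-9 code used in the text as the tobacco proxy


def icd9_category(code):
    """Return the chapter name for a numeric ICD-9 code, else None."""
    try:
        major = int(code.split(".")[0])  # '250.5' -> 250
    except ValueError:
        return None                      # V/E codes or malformed input
    for low, high, name in ICD9_CHAPTERS:
        if low <= major <= high:
            return name
    return None


def has_tobacco_abuse(patient_codes):
    """True if ICD-9 305.1 appears anywhere in the patient's record."""
    return TOBACCO_ABUSE_CODE in set(patient_codes)
```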
First, we performed a chi-square goodness-of-fit test on each ICD-9 code to measure its potential for association with periodontitis. As a cautionary step, we used only those codes found both in case and control groups (N = 2266) to avoid a type II error due to sparse data. We also excluded from our analysis those ICD-9 codes with an expected frequency <5 for either the cases or the controls because the probabilities from the chi-squared distribution for these codes cannot be estimated with a sufficient level of precision (Hill & Lewicki 2006). Therefore, we designed our method to search for associations strongly supported by the data set with high observed and expected frequencies. This process resulted in an analysis set of 993 ICD-9 codes (Table S1).
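One way to read the per-code test is sketched below in Python. The text does not give the exact formula, so this is an interpretation under stated assumptions: expected counts are taken to be proportional to the sizes of the case and control groups, and the test has one degree of freedom. The function name and return layout are hypothetical.

```python
import math


def gof_test(obs_case, obs_ctrl, n_case, n_ctrl):
    """Chi-square goodness-of-fit test for one ICD-9 code (1 degree of freedom).

    obs_case / obs_ctrl: patients with the code among cases / controls.
    n_case / n_ctrl: total number of cases / controls.
    Returns (statistic, p_value, observed_to_expected_ratio_for_cases),
    or None when the code fails the filters described in the text.
    """
    total = obs_case + obs_ctrl
    exp_case = total * n_case / (n_case + n_ctrl)  # assumed null expectation
    exp_ctrl = total - exp_case
    # Filters: code must occur in both groups, and both expected counts >= 5.
    if obs_case == 0 or obs_ctrl == 0 or min(exp_case, exp_ctrl) < 5:
        return None
    stat = ((obs_case - exp_case) ** 2 / exp_case
            + (obs_ctrl - exp_ctrl) ** 2 / exp_ctrl)
    # Upper tail of a chi-square distribution with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value, obs_case / exp_case
```

The third returned value is the observed-to-expected ratio for cases, which is the quantity later used to assign a code to the positive or negative group.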
After the ICD-9 code association tests, the ICD-9 codes were partitioned into two groups, positive and negative, based on the ratio of observed frequency to expected frequency to interpret the results and correct for multiple hypothesis testing.
If a code’s ratio for cases was ≥1, the code was assigned to the positive group; otherwise, it was assigned to the negative group. Then, we used the Bonferroni correction method to adjust for multiple hypothesis testing (Bonferroni 1935, 1936). We selected two Bonferroni p-value cut-offs, one for the positive group and one for the negative. In this way, we forced the family-wise error rate to be ≤0.05 within each group. We also present results that control the false discovery rate using the Benjamini and Hochberg method (Benjamini & Hochberg 1995).
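The partitioning and the two corrections can be sketched in a few lines of Python. This is a generic illustration of the Bonferroni cut-off and the Benjamini-Hochberg step-up procedure, not the study's own code; the function names and the input layout (a mapping from code to observed-to-expected ratio) are assumptions.

```python
def partition(ratios):
    """Split codes into positive (ratio >= 1) and negative (ratio < 1) groups.

    ratios: dict mapping ICD-9 code -> observed/expected ratio for cases.
    """
    positive = [code for code, r in ratios.items() if r >= 1]
    negative = [code for code, r in ratios.items() if r < 1]
    return positive, negative


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test p-value cut-off keeping the family-wise error rate <= alpha."""
    return alpha / n_tests


def benjamini_hochberg(pvalues, alpha=0.05):
    """Indices of hypotheses rejected by the Benjamini-Hochberg procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / m:
            cutoff = rank  # largest rank whose p-value meets its threshold
    return sorted(order[:cutoff])
```

With the group sizes reported in the results, `bonferroni_threshold(462)` and `bonferroni_threshold(531)` give approximately 1.08 × 10⁻⁴ and 9.42 × 10⁻⁵, close to the thresholds stated there. Because the Benjamini-Hochberg thresholds grow with rank, that procedure rejects at least as many hypotheses as Bonferroni at the same alpha.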
Another factor to consider when interpreting the results of an association mining study is the issue of confounding. If knowledge regarding potential confounders is absent then false positive associations may result. A common example from the literature is the association between lung cancer and match sales (Kleinberg & Hripcsak 2011). This association occurs when the confounding variable, that is, smoking, is absent from the analysis. Once information regarding the confounder is added then the original association, that is, between lung cancer and match sales, is no longer significant. Therefore, to address the issue of confounding in our study, we performed multivariate logistic regression using all significantly associated variables from the chi-square goodness-of-fit tests. The regression modelling allowed us to identify associations after adjusting for potentially confounding factors including age, gender, ethnicity, tobacco abuse disorder and other disease-specific comorbidities extracted from the medical records. For gender-specific conditions (e.g. pre-term birth, testicular cancer), we stratified the data by the relevant gender and then performed the regression analysis. Odds ratios and 95% confidence intervals (CI) were estimated using logistic regression models. All regression modelling was performed in R (R-Documentation 2002).
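The match sales example can be made concrete with a small Python sketch. All counts below are hypothetical and chosen only to illustrate confounding. The stratified Mantel-Haenszel estimate is used here as a simpler stand-in for the logistic regression adjustment performed in the study; the last function shows the standard conversion of a logistic regression coefficient and its standard error into an odds ratio with a Wald 95% CI.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio of a 2x2 table: exposed (a cases, b non-cases),
    unexposed (c cases, d non-cases)."""
    return (a * d) / (b * c)


def mantel_haenszel(strata):
    """Mantel-Haenszel odds ratio pooled over (a, b, c, d) strata."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


def coefficient_to_or(beta, se, z=1.96):
    """Odds ratio and Wald 95% CI from a logistic regression coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical counts: (match buyers with cancer, match buyers without,
#                       non-buyers with cancer, non-buyers without)
smokers = (80, 320, 20, 80)
non_smokers = (5, 95, 20, 380)

# Ignoring smoking: pool the two tables into one crude table.
crude = odds_ratio(*(s + n for s, n in zip(smokers, non_smokers)))
# Accounting for smoking: pool the odds ratios within smoking strata.
adjusted = mantel_haenszel([smokers, non_smokers])
```

In this constructed example the crude odds ratio is about 2.4, while the odds ratio within each smoking stratum, and hence the pooled estimate, is 1.0: the apparent association disappears once the confounder is taken into account.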
Lastly, we performed logistic regression for each ICD-9 category while adjusting for age, gender, ethnicity and tobacco abuse disorder. We first tested each ICD-9 category for significance individually. This revealed an initial set of statistically significant (p < 0.05) code categories. These categories were then added to a full model and then each category that lost significance was removed and a new model was built. This was performed recursively until our model contained only significant ICD-9 categories, which we report as our final model.
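The iterative model-building procedure can be sketched as a short Python loop. To keep the example self-contained, the model-fitting step is abstracted as a caller-supplied function `fit(terms)` that returns a p-value per term; in the study this step was a logistic regression in R. The stub `fake_fit`, its category names and its p-values are purely hypothetical.

```python
def backward_eliminate(candidates, fit, alpha=0.05):
    """Keep only categories that stay significant in a joint model.

    candidates: list of category names.
    fit: function taking a list of terms and returning {term: p_value}
         for a model containing exactly those terms (plus adjustments).
    """
    # Step 1: test each category individually.
    kept = [c for c in candidates if fit([c])[c] < alpha]
    # Step 2: refit the joint model, dropping terms that lost significance,
    # until every remaining term is significant.
    while kept:
        pvalues = fit(kept)
        lost = [c for c in kept if pvalues[c] >= alpha]
        if not lost:
            break
        kept = [c for c in kept if c not in lost]
    return kept


def fake_fit(terms):
    """Hypothetical stand-in for a regression fit (illustration only)."""
    alone = {"endocrine": 0.001, "nervous": 0.002,
             "circulatory": 0.03, "digestive": 0.40}
    pvalues = {t: alone[t] for t in terms}
    # Pretend 'circulatory' loses significance once 'endocrine' is present.
    if "endocrine" in terms and "circulatory" in terms:
        pvalues["circulatory"] = 0.20
    return pvalues


final_model = backward_eliminate(
    ["endocrine", "nervous", "circulatory", "digestive"], fake_fit)
```

In this toy run, `digestive` is dropped at the individual stage and `circulatory` is dropped after the first joint fit, leaving `endocrine` and `nervous` in the final model.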
By extracting the unique ICD-9 codes from each patient, we initially arrived at 36230 ICD-9 codes in the control group and 38541 ICD-9 codes in the case group with a total of 3908 unique ICD-9 codes in the entire data set. As the age ranges for our cases and controls differed, we restricted our sample in all regression models to those who were younger than 70 (Figure S1), because everyone in the study aged 70 and above had received treatment for periodontitis. In addition, we adjusted for age in the regression models as this was still a significant indicator of periodontitis and we removed persons of unidentified gender. Both age variables were significant, along with the interaction term, and therefore all three terms were retained in the regression models. Table 1 shows the characteristics of the codes in the restricted data set used in the regression models. We also tested potential confounders – age, gender and ethnicity – for association with periodontitis.
We performed chi-square goodness-of-fit tests for each ICD-9 code in the analysis set (i.e. 993). We partitioned these codes into positive and negative groups (see Methods) and association analysis was performed with 462 hypothesis tests in the positive group with Bonferroni threshold = 1.083 × 10⁻⁴ and 531 in the negative group with Bonferroni threshold = 9.416 × 10⁻⁵. The association mining analyses were performed on all ICD-9 codes in the analysis set (i.e. 993 codes); therefore, no restrictions were made based on the type of medical condition. Instead, all conditions were tested for association with periodontitis.
In the positive group, we identified nine significant ICD-9 codes using the Benjamini–Hochberg correction threshold. Three of these codes also passed the Bonferroni cut-off. Figure 1 illustrates each ICD-9 code’s association with periodontitis. No codes achieved significance in the negative group; therefore, no medical conditions were associated with the absence of periodontitis (Figure S2).
Using logistic regression, we tested all nine significant results, adjusting for age, gender, ethnicity and tobacco abuse. Table 2 shows the resulting odds ratios, both before and after correction, for the associations in descending order of significance.
After adjusting for confounders, we found that two ICD-9 codes lost significance, senile nuclear cataracts (p = 0.403) and benign prostatic hyperplasia (BPH) (p = 0.071). The main confounding factor for the periodontitis-senile nuclear cataracts association was age, which resulted in a p-value of 0.363. Likewise, for the BPH-periodontitis association the greatest confounder was age, which resulted in a p-value of 0.063.
As the association between BPH and periodontitis is not reported in the literature and was almost significant after adjustment (p = 0.071), we decided to investigate it further. First, we merged BPH with and without urinary obstruction (ICD-9 600, 600.01) to examine if BPH in general was significantly associated with periodontitis (Model 2, Table 3). Because the BPH group was significant, we decided to add additional covariates for known comorbidities of BPH and related conditions such as erectile dysfunction (Model 3, Table 3) (Mcvary 2006, Keller et al. 2012). We added comorbidities into the model to determine whether the association between BPH and periodontitis was due to shared comorbidities that both conditions have in common, for example, diseases of the circulatory system. In the model, we included the following lifestyle factors: obesity, alcohol abuse (defined as the presence of any ICD-9 codes indicative of excessive alcohol consumption, which includes diagnoses for alcohol withdrawal, alcoholic hepatitis and alcohol abuse), tobacco abuse disorder and BPH co-morbidities: diabetes, hypertension, lipid conditions and diseases of the circulatory system. The association between BPH and periodontitis remained significant even after adjustment (p = 0.026). In Model 3, some potential confounders were not significant including obesity (p = 0.141), lipid conditions (p = 0.100), and conditions of the circulatory system (p = 0.309). We retained Model 3 as our final model (Table 4).
The other remaining ICD-9 codes were significant after adjustment for age, gender, ethnicity and tobacco abuse and therefore represent clinically meaningful associations. Thus, periodontitis was associated with the following medical diseases or conditions: diabetes types I and II, hypertension, diabetes with ophthalmic manifestations, hypercholesterolaemia and hyperlipidaemia.
We grouped the 3786 ICD-9 codes in our restricted sample (i.e. age <70 years and only male or female genders) into 17 unique categories using AMA ICD-9 Official Coding Guidelines (Ama 2008). Then, we used multivariate logistic regression to determine the ICD-9 code categories associated with periodontitis after adjusting for age, gender, ethnicity and tobacco abuse. We found three significant ICD-9 categories: the endocrine system (OR = 1.3, 95% CI: 1.21–1.34, p < 0.001), nervous system (OR = 1.2, 95% CI: 1.11–1.24, p < 0.001) and pregnancy and conditions of childbirth (partitioned for females only) (OR = 2.9, 95% CI: 1.32–7.21, p = 0.014) (Figure S3–S5). The frequency of codes in the ‘complications of pregnancy and childbirth’ category was low in our population. Therefore, identifying associations between the individual ICD-9 codes and periodontitis was not possible (Fig. 2); however, an association between the complications of pregnancy and childbirth ICD-9 category and periodontitis was found at the category level using logistic regression.
This study used linked medical and dental electronic health records to mine for potential associations between periodontitis and all possible medical conditions without selecting certain hypotheses a priori. We replicated known periodontitis associations that remained significant after adjusting for age, gender, ethnicity and tobacco abuse. We also found one unreported association with BPH that remained significant after adjustments for hypertension, obesity, diabetes, lipid and circulatory system conditions, and alcohol abuse. Although there might be prior studies that manually linked paper records, this study exemplifies how existing electronic clinical and dental data can be reused to accelerate discoveries of associations between oral and systemic disease.
We confirmed that diabetes types I and II are positively associated with periodontitis after adjustment for age, gender, ethnicity and tobacco abuse. These results replicated by our method are widely reported in the literature (Williams & Mahan 1960, Grossi & Genco 1998, Resnick et al. 1998, Iacopino 2001, Soskolne & Klinger 2001, Campus et al. 2005, Mealey 2006, Mealey & Oates 2006, Narayan et al. 2007, Mealey & Rose 2008, Lalla & Papapanou 2011, Hodge et al. 2012). We found diabetes with ophthalmic manifestations (ICD-9 250.5) to be associated with periodontitis (OR = 1.8, 95% CI 1.34–2.47, p < 0.001), which replicates the result from a study of Pima Indian diabetic patients with retinopathy who were found to be approximately five times more likely to have periodontitis (Löe 1993). Our association result remained valid after adjusting for our ethnically heterogeneous population, which further bolsters the literature supporting this association.
Overall, the literature shows that metabolic syndrome – whose symptoms include hypertension, hypercholesterolaemia, hyperlipidaemia – is associated with periodontitis (Morita et al. 2009, Andriankaja et al. 2010). One study found that hypertension was associated with tooth loss in post-menopausal women (Taguchi et al. 2004), while another study failed to find an association between hypertension and periodontitis after adjusting for multiple factors such as fruit and vegetable intake and multivitamin use (Rivas-Tumanyan et al. 2012). It is possible that our identified association between hypertension and periodontitis could be due to other factors, such as diet or vitamin use, not recorded in EHR data. However, a study on the relationship between the levels of periodontitis bacteria and hypertension established a direct association between the two diseases (Desvarieux et al. 2010).
Related to this association are two more replicated periodontitis associations: hypercholesterolaemia and hyperlipidaemia. The association between hypercholesterolaemia and periodontitis has been previously reported (Katz et al. 2001, Ramesh et al. 2010). For hyperlipidaemia, a link with periodontitis has been proposed as the underlying factor behind the connection between periodontitis and atherosclerosis (Fentoglu & Bozkurt 2008), which is supported by a number of studies (Ramesh et al. 2010, Joshi & Marawar 2011).
Our method also replicated an association between periodontitis and complications of pregnancy and childbirth (OR = 2.9, 95% CI: 1.32–7.21, p = 0.014). This is a category-wide association involving multiple ICD-9 codes, including pre-term birth, delivery with complications, legal abortion and normal delivery. The association between threatened pre-term labour and periodontitis has been widely discussed in the literature (Offenbacher et al. 1996, Mitchell-Lewis et al. 2001, Michalowicz et al. 2009). The over-arching result of this category-wide association is that pregnancy itself is associated with gingival inflammation, a finding consistent with the literature (Bobetsis et al. 2006, Santos-Pereira et al. 2007, Wimmer & Pihlstrom 2008, Chambrone et al. 2011). Various explanations for the increased risk of periodontal manifestations during pregnancy have been suggested, including a link to the hormonal changes associated with pregnancy (Carrillo-De-Albornoz et al. 2012), although pregnancy does not cause periodontitis but may rather aggravate pre-existing periodontal disease (Laine 2002).
We found a previously unreported association between BPH and periodontitis that remained after adjusting for various lifestyle, demographic and comorbidity confounders (OR = 1.5, 95% CI 1.05–2.10, p = 0.026). To the best of our knowledge, no direct association study between BPH and periodontitis has been documented until now, though there is a study showing that levels of prostate-specific antigen (PSA) are higher in patients with periodontitis than in periodontitis-free individuals (Joshi et al. 2010). However, elevated PSA levels have many causes and the interpretation of PSA test results varies by patient (Nci 2012). The literature does support an association between chronic periodontitis and erectile dysfunction (Zadik et al. 2009, Keller et al. 2012). BPH and erectile dysfunction have been associated with each other, suggesting that both could be associated with periodontitis (Mcvary 2005, Costabile & Steers 2006).
A plausible biological mechanism behind our observed association between BPH and periodontitis could involve the role that inflammation plays in both diseases (Amar et al. 2003, Nickel 2008). Both periodontitis and BPH have been associated with an increase in production of TGF-beta, an important growth factor involved in the immune response and wound healing (Skalerič et al. 1997, Untergasser et al. 2005). Given this plausible biological underpinning, we suggest that the BPH-periodontitis association warrants further validation and study in other databases to remove any possible selection bias and to better understand the biological mechanism behind this association, which has potential clinical importance.
Our patient population consisted of all patients with records in both the College of Dental Medicine and NYP, which made our population heterogeneous in both known (e.g. ethnicity) and unknown (e.g. family history, genetic predisposition) ways. This is the main difference between EHR-based clinical research and other types of clinical research (e.g. cohort studies, randomized controlled trials) that are able to collect these types of data. However, although EHR data is heterogeneous, our study and others have successfully utilized EHR data to both replicate and reveal disease discoveries useful to clinicians (Denny et al. 2010, Ritchie et al. 2010).
Therefore, we measured and adjusted for known causes of heterogeneity within our study. Ethnically, our population consists of 69% Hispanic, 12% African descent (i.e. close to national average), 9% unreported, 5% European descent, 3% other and 1% Asian or Pacific Islander. Because of this, we performed ethnicity adjustment in our regression models to identify medical conditions associated with periodontitis regardless of ethnic background. The distribution of ethnicities in our patient population differs from the 2010 US census data, which shows 75% of the population being of European descent, 16% Hispanic, 14% African descent, 7% other and 6% Asian or Pacific Islander (note that percentages do not sum to 100 because of persons with multi-ethnic/racial backgrounds) (Us-Census-Bureau 2011). As we adjusted for ethnicity, all identified associations are independent of ethnic background. The contribution of race to periodontitis has been well described in the literature, including the association with being of Mexican American heritage (Eke et al. 2012, Papapanou 2012). However, while Hispanics are over-represented in our study (i.e. cases as well as controls), we adjust for this in our regression models. Therefore, this is not necessarily a shortcoming of our study. In fact, the ability of our method to successfully replicate known medical conditions associated with periodontitis in an ethnically heterogeneous population further demonstrates its utility and robustness, as it facilitates high-throughput identification of medical conditions associated with periodontitis.
In this study we used the ICD-9 code for tobacco abuse disorder as our proxy for smoking status. We realize that tobacco abuse disorder and smoking status are not synonymous. However, we used tobacco abuse because this variable was available in our complete records database while smoking status was not. Similarly, other periodontal researchers have used tobacco abuse to adjust for smoking in their regression analyses (Keller et al. 2013). Knowing that smoking status is commonly incomplete in EHRs, we further investigated the impact of using tobacco abuse instead of smoking status on our findings. Therefore, we used the available smoking status data, recorded as never smoker and current or former smoker, for a subset of 1,071 patients that were treated in NYP’s ambulatory care division and repeated our regression analyses in this subset. Our subset regression analyses revealed that the directionality of the associations for all significant ICD-9 codes (Table 2) and categories remained the same. However, to avoid the reduction in sample size, we report our results using the tobacco abuse adjustment because this variable was available for all patients.
Potential limitations of this study are selection bias and possible incompleteness in the de-identified EHR data. Our sample was limited to patients who received treatment at both Columbia University’s College of Dental Medicine and NYP. Since hospital treatment was necessary for inclusion into this study, our case and control patients are not likely to be in optimal physical condition. Also, the average age of our patients was 57 years, with no patients younger than 40. This selection bias may have an effect on our findings.
Analysing de-identified linked electronic health records introduced several limitations. First, we used billing codes to represent medical and dental conditions. Therefore, we did not have direct patient contact to either determine definitively that they indeed had the disease or condition that they were diagnosed with, or assess the severity of the condition. Likewise, our control patients may have had some degree of periodontitis that was not recorded in our records either because it was untreated due to a financial or other reason or it was treated outside of CUMC-NYP. Second, we grouped all periodontal treatment codes into one group to represent periodontitis and thus did not distinguish between treated or untreated periodontitis, or among periodontal treatment modalities, such as scaling and root planing, periodontal flap surgery or periodontal maintenance. Likewise, we used ICD-9 codes to represent medical conditions, while studies show that ICD-9 codes have low specificity (Goldstein 1998, Birman-Deych et al. 2005). Importantly, the absence of an ICD-9 code does not necessarily indicate that the patient does not have the condition in question (Birman-Deych et al. 2005). Furthermore, because we do not have access to longitudinally complete medical histories, we cannot definitively state that a patient did not have a particular condition at some prior point in their medical history. As we use electronic health record data in this association study, it is possible that some of the associations could result from the absence of other potential confounders, for example, socio-economic status, which is usually not available from electronic health records. However, there is literature support for all of the identified associations resulting from this study, which bolsters our belief in their clinical meaningfulness. Lastly, we used ICD-9 codes for two levels of association analyses: individual code level and category level. 
Therefore, we did not aggregate ICD-9 codes by diseases as some other researchers did (Denny et al. 2010), but instead we aggregated ICD-9 codes using the AMA’s published ICD-9 categories. Part of the strength of our method is that by using individual ICD-9 codes, we were able to detect fine-grained associations. For example, we found an association between periodontitis and diabetes with ophthalmic manifestations (ICD-9 250.5, OR = 1.8, 95% CI 1.34–2.47, p < 0.001), which would not have been possible with grouped codes (i.e. where all diabetes codes are represented by one diagnosis “diabetes”). One limitation of our approach is that it could result in a reduction in sensitivity and specificity when compared with multiple ICD-9 codes used to represent a single disease (e.g. diabetes type II).
This study presents a method for using linked electronic medical and dental records to support high-throughput knowledge discovery. The data-driven method does not rely on an a priori hypothesis but instead tests every possible hypothesis simultaneously and identifies medical conditions significantly associated with periodontitis. Our method successfully replicated known associations between periodontitis and diabetes mellitus types I and II, hypertension, hypercholesterolaemia, hyperlipidaemia and conditions pertaining to pregnancy and childbirth. We also discovered a thus far unreported association between benign prostatic hyperplasia and periodontitis in male patients less than 70 years of age that has potential clinical importance due to the role of inflammation pathways in both diseases. We conclude that linked electronic health records can promote oral disease knowledge discovery.
Periodontitis is known to be associated with chronic systemic diseases. Unlike conventional association studies that are designed to test only one hypothesis at a time, we developed a method that tests multiple associations concurrently using linked electronic health records.
We confirmed known associations with diabetes type I and II, hypertension, hypercholesterolaemia, hyperlipidaemia and conditions pertaining to pregnancy and childbirth. We also identified a previously unreported association with benign prostatic hyperplasia.
Linked electronic medical and dental records are useful for exploring associations between medical and dental conditions without a priori hypothesis selection.
We thank Richard Steinman for editing this manuscript and Titus Schleyer for his feedback.
This study was supported by grants R01LM009886, R01LM010815 and R01 LM006910 from the National Library of Medicine, grant UL1 TR000040 from the National Center for Research Resources and an AHRQ grant R01 HS019853.
Conflict of interest and source of funding statement
Additional Supporting Information may be found in the online version of this article:
Figure S1. Age distribution for patients <70 years old, for cases and controls stratified by gender.
Figure S2. Medical conditions not significantly associated with absence of periodontitis.
Figure S3. Frequency of ICD-9 codes for complications of pregnancy and childbirth.
Figure S4. Frequency of ICD-9 codes for nervous system conditions.
Figure S5. Frequency of ICD-9 codes for endocrine system conditions.
Table S1. List of 993 ICD-9 codes in our data set for association mining.