J Clin Periodontol. Author manuscript; available in PMC 2014 May 1.
Published in final edited form as:
PMCID: PMC3690348

Discovering medical conditions associated with periodontitis using linked electronic health records



Aim

To use linked electronic medical and dental records to discover associations between periodontitis and medical conditions independent of a priori hypotheses.

Materials and Methods

This case-control study included 2475 patients who underwent dental treatment at the College of Dental Medicine at Columbia University and medical treatment at NewYork-Presbyterian Hospital. Our cases are patients who received periodontal treatment and our controls are patients who received dental maintenance but no periodontal treatment. Chi-square analysis was performed for medical treatment codes and logistic regression was used to adjust for confounders.


Results

Our method replicated several important periodontitis associations in a largely Hispanic population, including diabetes mellitus type I (OR = 1.6, 95% CI 1.30–1.99, p < 0.001) and type II (OR = 1.4, 95% CI 1.22–1.67, p < 0.001), hypertension (OR = 1.2, 95% CI 1.10–1.37, p < 0.001), hypercholesterolaemia (OR = 1.2, 95% CI 1.07–1.38, p = 0.004), hyperlipidaemia (OR = 1.2, 95% CI 1.06–1.43, p = 0.008) and conditions pertaining to pregnancy and childbirth (OR = 2.9, 95% CI 1.32–7.21, p = 0.014). We also found a previously unreported association with benign prostatic hyperplasia (OR = 1.5, 95% CI 1.05–2.10, p = 0.026) after adjusting for age, gender, ethnicity, hypertension, diabetes, obesity, lipid and circulatory system conditions, alcohol and tobacco abuse.


Conclusions

This study contributes a high-throughput method for associating periodontitis with systemic diseases using linked electronic records.

Keywords: data linkage, dental records, electronic health records, medical informatics, periodontal diseases

Periodontitis is a chronic condition affecting gingival tissue and is known to interact with a variety of chronic diseases, including heart disease and diabetes (Williams & Mahan 1960, Mattila et al. 1989, Destefano et al. 1993, Beck & Offenbacher 1998, Grossi & Genco 1998, Lalla et al. 2003, Desvarieux et al. 2005). Identifying the connections between periodontitis and systemic diseases can ensure a holistic approach to patient care management. Regrettably, information about dental conditions is often inaccessible to physicians, partly due to inadequate interdisciplinary collaboration between dentistry and medicine (Slavkin & Baum 2000). Barriers to linking dental and medical records include disparate coding standards for dental and medical diagnoses, fragmented record systems and different patient identifiers used by dentists and physicians (Din & Powell 2008, Rudman et al. 2010, Theis et al. 2010). These barriers have long prevented large-scale association studies between periodontitis and other medical conditions.

The rapidly widening adoption of electronic health records (EHRs) over the past few years has accelerated the digitization of patient records in various healthcare domains, including dentistry and medicine (111th-Congress 2009, Jha et al. 2009, Bramble et al. 2010, Casnoff et al. 2011). At Columbia University Medical Center, such electronic records have been used to develop a Dental Data Warehouse (DDW) at the College of Dental Medicine and a Clinical Data Warehouse (CDW) at its teaching hospital, NewYork-Presbyterian (NYP). Many patients have records in both databases. This study describes an effective method for mining associations between periodontitis and other medical conditions using linked electronic health records without selecting certain hypotheses a priori.

Materials and Methods

Records extraction and linkage

We used a case-control design to compare the prevalence of medical conditions between patients who had received treatment for periodontitis (cases) and patients without periodontitis treatment but with general dental maintenance (controls). Using all of the available patient records in our medical centre, we first identified 2126 patients whose dental records contained at least one Current Dental Terminology (CDT) periodontal treatment code (D4000-D4999) between January 3, 2007 and June 17, 2011. Then, we identified 2429 patients with one of two common dental maintenance codes (D0120 or D0150) but no periodontal treatment codes (D4000-D4999) within the same time frame. These dental records were matched to the medical records using first name, last name, gender and date of birth. If any one of these fields was not exactly matched, the patient was removed from further analysis. Duplicate patient identifiers were reviewed to consolidate records belonging to the same patients. Excluding patients who were either not NYP patients or had inconsistent information in one of the data fields, we mapped health records for 58% (1235/2126) of periodontitis patients and 51% (1240/2429) of non-periodontitis patients. Record linkage was performed by database administrators from both the College of Dental Medicine and NYP, so the researchers of this study had access only to de-identified linked records for analysis. This study was approved by the Columbia University Institutional Review Board (IRB-AAAI9251).
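The case and control definitions and the exact-match linkage rule described above can be sketched in a few lines. This is only an illustration: the record layout (dictionaries with `first`, `last`, `gender` and `dob` keys) and the function names `classify`, `link_key` and `link` are assumptions, not part of the study's software. The sketch also simplifies one step: records with more than one candidate match are dropped here, whereas the study reviewed duplicate identifiers manually.

```python
import re
from datetime import date

PERIO_CODE = re.compile(r"D4\d{3}")        # CDT periodontal treatment codes D4000-D4999
MAINTENANCE_CODES = {"D0120", "D0150"}     # dental maintenance codes used for controls


def classify(cdt_codes):
    """Return 'case', 'control' or None for one patient's CDT codes."""
    codes = set(cdt_codes)
    if any(PERIO_CODE.fullmatch(c) for c in codes):
        return "case"                      # any periodontal treatment code makes a case
    if codes & MAINTENANCE_CODES:
        return "control"                   # maintenance only, no periodontal treatment
    return None                            # patient is outside both groups


def link_key(record):
    """Exact-match key: first name, last name, gender and date of birth."""
    return (record["first"], record["last"], record["gender"], record["dob"])


def link(dental, medical):
    """Pair each dental record with the single medical record sharing all four fields."""
    index = {}
    for m in medical:
        index.setdefault(link_key(m), []).append(m)
    linked = []
    for d in dental:
        matches = index.get(link_key(d), [])
        if len(matches) == 1:              # unmatched or ambiguous records are dropped
            linked.append((d, matches[0]))
    return linked


# Toy example with made-up patients: the second pair differs in date of birth.
dental_demo = [
    {"first": "Ana", "last": "Diaz", "gender": "F", "dob": date(1960, 1, 2)},
    {"first": "Luis", "last": "Mora", "gender": "M", "dob": date(1955, 5, 9)},
]
medical_demo = [
    {"first": "Ana", "last": "Diaz", "gender": "F", "dob": date(1960, 1, 2)},
    {"first": "Luis", "last": "Mora", "gender": "M", "dob": date(1955, 5, 10)},
]
linked_demo = link(dental_demo, medical_demo)
```

Because the key requires all four fields to agree exactly, a single-day discrepancy in date of birth is enough to exclude a patient, which is consistent with the linkage rates of 58% and 51% reported above.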

The resulting study sample (N = 2475) contained 1235 cases and 1240 controls. For each patient, we extracted all International Classification of Diseases 9th Revision (ICD-9) codes along with date of birth, gender, ethnicity, dental date of service and medical diagnosis date. In our analysis, we obtained only the earliest diagnosis date for each ICD-9 code and the earliest dental service date for periodontitis. For gender and ethnicity, all information was self-reported by the patient at the time of medical or dental service. For assessment of age, we used two variables: birth year and age at time of dental service.
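Keeping only the earliest diagnosis date per ICD-9 code amounts to a minimum over each (patient, code) pair. The following is a minimal sketch, assuming the extracted rows arrive as `(patient_id, icd9_code, diagnosis_date)` tuples; that layout and the name `earliest_dates` are illustrative assumptions.

```python
from datetime import date


def earliest_dates(rows):
    """Map (patient_id, icd9_code) to the earliest diagnosis date seen."""
    first_seen = {}
    for patient_id, code, diagnosed in rows:
        key = (patient_id, code)
        if key not in first_seen or diagnosed < first_seen[key]:
            first_seen[key] = diagnosed
    return first_seen


# Toy rows: patient 1 has the same code recorded on two dates.
rows_demo = [
    (1, "250.00", date(2008, 3, 1)),
    (1, "250.00", date(2006, 7, 15)),
    (2, "401.9", date(2010, 1, 20)),
]
first_demo = earliest_dates(rows_demo)
```

The same reduction applies to the earliest dental service date for periodontitis, with the dental procedure code taking the place of the ICD-9 code.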

Association mining

We extracted unique ICD-9 codes for all patients’ diagnoses that occurred between January 1, 2004 and December 16, 2011. Then, we grouped ICD-9 codes into 17 categories according to the AMA ICD-9 coding guidelines (Ama 2008). The following analyses were performed: (1) chi-squared goodness-of-fit tests with multiple hypothesis correction for each unique ICD-9 code observed in at least one patient in both the case and control groups; (2) multivariate logistic regression modelling of the ICD-9 codes that were significant in the chi-square tests, to identify associations while adjusting for age, gender, ethnicity and tobacco abuse disorder (defined as the presence of ICD-9 305.1 in the patient record); (3) multivariate logistic regression for each ICD-9 category with the same adjustment factors; and (4) literature validation of identified associations. For associations unreported in the literature, we performed further regression modelling while including additional covariates for other lifestyle factors and disease comorbidities. Clinicians diagnose “tobacco abuse disorder” for patients who either present with a condition that is determined to result from tobacco usage or require tobacco cessation treatment, for example, a prescription nicotine patch (Ahrq 2012). Therefore, we used this diagnosis to adjust for patients who smoke or chew tobacco, as they have experienced detrimental health effects of such use.

First, we performed a chi-square goodness-of-fit test on each ICD-9 code to measure its potential for association with periodontitis. As a cautionary step, we used only those codes found both in case and control groups (N = 2266) to avoid a type II error due to sparse data. We also excluded from our analysis those ICD-9 codes with an expected frequency <5 for either the cases or the controls because the probabilities from the chi-squared distribution for these codes cannot be estimated with a sufficient level of precision (Hill & Lewicki 2006). Therefore, we designed our method to search for associations strongly supported by the data set with high observed and expected frequencies. This process resulted in an analysis set of 993 ICD-9 codes (Table S1).

After the ICD-9 code association tests, we partitioned the ICD-9 codes into two groups, positive and negative, based on the ratio of observed frequency to expected frequency. This partition was used both to interpret the results and to correct for multiple hypothesis testing. If a code’s ratio for cases was ≥1, the code was assigned to the positive group; otherwise, it was assigned to the negative group. Then, we used the Bonferroni correction method to adjust for multiple hypothesis testing (Bonferroni 1935, 1936). We selected two Bonferroni p-value cut-offs, one for the positive group and one for the negative. In this way, we forced the family-wise error rate to be ≤0.05 within each group. We also present results that control the false discovery rate using the Benjamini and Hochberg method (Benjamini & Hochberg 1995).
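The partition and the two corrections can be sketched as follows. The input layout (tuples of code, observed frequency, expected frequency and p-value) and all function names are assumptions made for illustration. `benjamini_hochberg` implements the standard step-up procedure, which may differ in detail from the routine the authors used.

```python
def partition_by_ratio(results):
    """Split (code, observed, expected, p) tuples into positive and negative groups."""
    positive, negative = [], []
    for code, observed, expected, p in results:
        group = positive if observed / expected >= 1 else negative
        group.append((code, p))
    return positive, negative


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test p-value cut-off that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return the indices of hypotheses rejected by the BH step-up procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / m:
            cutoff = rank                  # remember the largest rank that passes
    return sorted(order[:cutoff])


# Toy results with made-up frequencies and p-values.
demo = [("A", 30, 20.0, 0.001), ("B", 10, 20.0, 0.2), ("C", 25, 20.0, 0.008)]
pos_demo, neg_demo = partition_by_ratio(demo)
```

Applying the Bonferroni cut-off separately within each group means each group's threshold depends only on the number of tests in that group. The Benjamini–Hochberg procedure is less conservative, which is why it can retain codes that fail the Bonferroni cut-off.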

\[
\text{Expected Frequency for Cases} = \frac{\text{Total Frequency of Cases} \times \text{Total Frequency of a Particular ICD9 Code in Cases}}{\text{Total Number of ICD9 Codes in Cases and Controls (i.e. 74,771)}}
\]
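A minimal sketch of the per-code test follows, under one stated assumption: the expected frequencies use the conventional reading in which a code's combined frequency across cases and controls is split in proportion to each group's share of all codes. The formula above is worded slightly differently, so this reading is an assumption rather than a confirmed description of the authors' computation. The function names are illustrative, and the p-value uses the closed form for a chi-squared distribution with one degree of freedom.

```python
import math


def expected_counts(code_total, case_total, control_total):
    """Expected frequency of one code in cases and in controls under no association."""
    grand_total = case_total + control_total
    return (case_total * code_total / grand_total,
            control_total * code_total / grand_total)


def chi_square_gof(obs_case, obs_control, case_total, control_total, min_expected=5):
    """Return (statistic, p_value), or None if an expected frequency is below min_expected."""
    exp_case, exp_control = expected_counts(
        obs_case + obs_control, case_total, control_total)
    if min(exp_case, exp_control) < min_expected:
        return None                        # too sparse for the chi-squared approximation
    stat = ((obs_case - exp_case) ** 2 / exp_case
            + (obs_control - exp_control) ** 2 / exp_control)
    p_value = math.erfc(math.sqrt(stat / 2))   # survival function for 1 degree of freedom
    return stat, p_value


# Toy example with equal group totals: 30 observations in cases, 10 in controls.
stat_demo, p_demo = chi_square_gof(30, 10, case_total=100, control_total=100)
ratio_demo = 30 / expected_counts(40, 100, 100)[0]   # observed/expected for cases
```

The observed-to-expected ratio computed in the last line is the quantity used to assign a code to the positive group (ratio ≥1) or the negative group.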

Another factor to consider when interpreting the results of an association mining study is the issue of confounding. If knowledge regarding potential confounders is absent then false positive associations may result. A common example from the literature is the association between lung cancer and match sales (Kleinberg & Hripcsak 2011). This association occurs when the confounding variable, that is, smoking, is absent from the analysis. Once information regarding the confounder is added then the original association, that is, between lung cancer and match sales, is no longer significant. Therefore, to address the issue of confounding in our study, we performed multivariate logistic regression using all significantly associated variables from the chi-square goodness-of-fit tests. The regression modelling allowed us to identify associations after adjusting for potentially confounding factors including age, gender, ethnicity, tobacco abuse disorder and other disease-specific comorbidities extracted from the medical records. For gender-specific conditions (e.g. pre-term birth, testicular cancer), we stratified the data by the relevant gender and then performed the regression analysis. Odds ratios and 95% confidence intervals (CI) were estimated using logistic regression models. All regression modelling was performed in R (R-Documentation 2002).
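Odds ratios and Wald confidence intervals follow directly from a fitted logistic regression coefficient and its standard error. The sketch below shows that conversion only; it does not fit a model. The coefficient and standard error in the example are hypothetical values chosen for illustration, not estimates from this study, and the function name is an assumption.

```python
import math


def odds_ratio_summary(beta, std_err, z_crit=1.96):
    """Convert a logistic regression coefficient into an OR, a 95% Wald CI and a p-value."""
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z_crit * std_err)
    upper = math.exp(beta + z_crit * std_err)
    z = beta / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal tail probability
    return odds_ratio, lower, upper, p_value


# Hypothetical adjusted coefficient and standard error for a binary condition.
or_demo, lo_demo, hi_demo, p_demo_or = odds_ratio_summary(beta=0.40, std_err=0.18)
```

An association is reported as significant at the 0.05 level when the interval excludes 1, which is equivalent to the Wald p-value falling below 0.05.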

Lastly, we performed logistic regression for each ICD-9 category while adjusting for age, gender, ethnicity and tobacco abuse disorder. We first tested each ICD-9 category for significance individually. This revealed an initial set of statistically significant (p < 0.05) code categories. These categories were then added to a full model and then each category that lost significance was removed and a new model was built. This was performed recursively until our model contained only significant ICD-9 categories, which we report as our final model.
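The category selection procedure described above is a form of backward elimination. A minimal sketch is given below, assuming a caller-supplied `fit` function (hypothetical here) that takes a list of category names and returns a p-value for each category in a model that also contains the fixed adjustment covariates. The toy `fake_fit` stands in for a real regression and returns made-up p-values.

```python
def backward_eliminate(candidates, fit, alpha=0.05):
    """Refit until every remaining category has p < alpha in the joint model."""
    # Step 1: keep the categories that are significant when tested one at a time.
    kept = [c for c in candidates if fit([c])[c] < alpha]
    # Step 2: drop categories that lose significance in the joint model, then refit.
    while kept:
        pvalues = fit(kept)
        survivors = [c for c in kept if pvalues[c] < alpha]
        if len(survivors) == len(kept):
            break                          # every remaining category is significant
        kept = survivors
    return kept


def fake_fit(categories):
    """Toy stand-in for a regression fit: B loses significance when A is in the model."""
    present = set(categories)
    pvalues = {}
    for c in categories:
        if c == "A":
            pvalues[c] = 0.001
        elif c == "B":
            pvalues[c] = 0.2 if "A" in present else 0.01
        elif c == "C":
            pvalues[c] = 0.5
        else:
            pvalues[c] = 0.03
    return pvalues


final_demo = backward_eliminate(["A", "B", "C", "D"], fake_fit)
```

In the toy run, C is excluded at the individual screening step and B is removed after the first joint fit, leaving a final model in which all categories are significant.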


Results

By extracting the unique ICD-9 codes from each patient, we initially arrived at 36230 ICD-9 codes in the control group and 38541 ICD-9 codes in the case group with a total of 3908 unique ICD-9 codes in the entire data set. As the age ranges for our cases and controls differed, we restricted our sample in all regression models to those who were younger than 70 (Figure S1), because everyone in the study aged 70 and above had received treatment for periodontitis. In addition, we adjusted for age in the regression models as this was still a significant indicator of periodontitis and we removed persons of unidentified gender. Both age variables were significant, along with the interaction term, and therefore all three terms were retained in the regression models. Table 1 shows the characteristics of the codes in the restricted data set used in the regression models. We also tested potential confounders – age, gender and ethnicity – for association with periodontitis.

Table 1
Distribution of ICD-9 codes by patient characteristic in regression modelling

Chi-squared goodness-of-fit for individual ICD-9 codes

We performed chi-square goodness-of-fit tests for each ICD-9 code in the analysis set (i.e. 993). We partitioned these codes into positive and negative groups (see Methods) and association analysis was performed with 462 hypothesis tests in the positive group with Bonferroni threshold = 1.083 × 10⁻⁴ and 531 in the negative group with Bonferroni threshold = 9.416 × 10⁻⁵. The association mining analyses were performed on all ICD-9 codes in the analysis set (i.e. 993 codes); therefore, no restrictions were made based on the type of medical condition. Instead, all conditions were tested for association with periodontitis.

In the positive group, we identified nine significant ICD-9 codes using the Benjamini–Hochberg correction threshold. Three of these codes also passed the Bonferroni cut-off. Figure 1 illustrates each ICD-9 code’s association with periodontitis. No codes achieved significance in the negative group; therefore, no medical conditions were associated with the absence of periodontitis (Figure S2).

Fig 1
Associations between periodontitis and other medical conditions. Medical conditions associated with periodontitis are shown above. ICD-9 codes above the red line are associations that pass the Bonferroni significance threshold while codes above the blue ...

Adjustment for confounding using logistic regression models

Using logistic regression, we tested all nine significant results, adjusting for age, gender, ethnicity and tobacco abuse. Table 2 shows the resulting odds ratios, both before and after correction, for the associations in descending order of significance.

Table 2
Odds ratios for medical conditions in patients <70 years old before and after adjustment for age, gender, ethnicity and tobacco abuse, ranked by p-value (most significant to least)

After adjusting for confounders, we found that two ICD-9 codes lost significance, senile nuclear cataracts (p = 0.403) and benign prostatic hyperplasia (BPH) (p = 0.071). The main confounding factor for the periodontitis-senile nuclear cataracts association was age, which resulted in a p-value of 0.363. Likewise, for the BPH-periodontitis association the greatest confounder was age, which resulted in a p-value of 0.063.

As the association between BPH and periodontitis is not reported in the literature and was almost significant after adjustment (p = 0.071), we decided to investigate it further. First, we merged BPH with and without urinary obstruction (ICD-9 600, 600.01) to examine if BPH in general was significantly associated with periodontitis (Model 2, Table 3). Because the BPH group was significant, we added covariates for known comorbidities of BPH and related conditions such as erectile dysfunction (Model 3, Table 3) (Mcvary 2006, Keller et al. 2012). We added comorbidities into the model to determine whether the association between BPH and periodontitis was due to comorbidities shared by both conditions, for example, diseases of the circulatory system. In the model, we included the following lifestyle factors: obesity, alcohol abuse (defined as the presence of any ICD-9 codes indicative of excessive alcohol consumption, which includes diagnoses for alcohol withdrawal, alcoholic hepatitis and alcohol abuse) and tobacco abuse disorder. We also included the following BPH comorbidities: diabetes, hypertension, lipid conditions and diseases of the circulatory system. The association between BPH and periodontitis remained significant even after adjustment (p = 0.026). In Model 3, some potential confounders were not significant, including obesity (p = 0.141), lipid conditions (p = 0.100) and conditions of the circulatory system (p = 0.309). We retained Model 3 as our final model (Table 4).

Table 3
Adjusted odds ratios (OR) and their 95% confidence intervals (CI) for various regression models of the association between BPH and periodontitis in males <70 years old

Table 4
Logistic regression model for the association between BPH and periodontitis in males <70 years old after adjustments

The remaining ICD-9 codes were significant after adjustment for age, gender, ethnicity and tobacco abuse and therefore represent clinically meaningful associations. Thus, periodontitis was associated with the following medical diseases or conditions: diabetes types I and II, hypertension, diabetes with ophthalmic manifestations, hypercholesterolaemia and hyperlipidaemia.

Logistic regression for ICD-9 categories

We grouped the 3786 ICD-9 codes in our restricted sample (i.e. age <70 years and only male or female genders) into 17 unique categories using AMA ICD-9 Official Coding Guidelines (Ama 2008). Then, we used multivariate logistic regression to determine the ICD-9 code categories associated with periodontitis after adjusting for age, gender, ethnicity and tobacco abuse. We found three significant ICD-9 categories: the endocrine system (OR = 1.3, 95% CI: 1.21–1.34, p < 0.001), nervous system (OR = 1.2, 95% CI: 1.11–1.24, p < 0.001) and pregnancy and conditions of childbirth (partitioned for females only) (OR = 2.9, 95% CI: 1.32–7.21, p = 0.014) (Figure S3–S5). The frequency of codes in the ‘complications of pregnancy and childbirth’ category was low in our population. Therefore, identifying associations between the individual ICD-9 codes and periodontitis was not possible (Fig. 2); however, an association between the complications of pregnancy and childbirth ICD-9 category and periodontitis was found at the category level using logistic regression.
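Grouping codes into categories can be sketched with the 17 numeric chapters of ICD-9-CM. Whether the authors' 17 categories coincide exactly with these chapter ranges is an assumption, and the name `icd9_chapter` is illustrative. Supplementary V and E codes are left ungrouped in this sketch.

```python
ICD9_CHAPTERS = [
    (1, 139, "infectious and parasitic diseases"),
    (140, 239, "neoplasms"),
    (240, 279, "endocrine, nutritional and metabolic diseases, and immunity disorders"),
    (280, 289, "diseases of the blood and blood-forming organs"),
    (290, 319, "mental disorders"),
    (320, 389, "diseases of the nervous system and sense organs"),
    (390, 459, "diseases of the circulatory system"),
    (460, 519, "diseases of the respiratory system"),
    (520, 579, "diseases of the digestive system"),
    (580, 629, "diseases of the genitourinary system"),
    (630, 679, "complications of pregnancy, childbirth and the puerperium"),
    (680, 709, "diseases of the skin and subcutaneous tissue"),
    (710, 739, "diseases of the musculoskeletal system and connective tissue"),
    (740, 759, "congenital anomalies"),
    (760, 779, "certain conditions originating in the perinatal period"),
    (780, 799, "symptoms, signs and ill-defined conditions"),
    (800, 999, "injury and poisoning"),
]


def icd9_chapter(code):
    """Return the chapter name for a numeric ICD-9 code, or None for V and E codes."""
    code = code.strip().upper()
    if not code or code[0] in "VE":
        return None                        # supplementary classifications are not grouped
    major = int(code.split(".")[0])        # the part before the decimal point
    for low, high, name in ICD9_CHAPTERS:
        if low <= major <= high:
            return name
    return None


# Codes mentioned in the text: diabetes with ophthalmic manifestations, tobacco abuse.
chapter_250_5 = icd9_chapter("250.5")
chapter_305_1 = icd9_chapter("305.1")
```

Aggregating at the chapter level pools codes that are individually too rare to test, which is how a category such as complications of pregnancy and childbirth can reach significance when none of its individual codes does.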

Fig 2
ICD-9 codes with low frequency achieve significance after aggregation. Orange nodes represent periodontitis patients and blue nodes represent control patients. Unique ICD-9 codes may be too infrequent (<5 patients each) to achieve significance ...


Discussion

This study used linked medical and dental electronic health records to mine for potential associations between periodontitis and all possible medical conditions without selecting certain hypotheses a priori. We replicated known periodontitis associations that remained significant after adjusting for age, gender, ethnicity and tobacco abuse. We also found one unreported association with BPH that remained significant after adjustments for hypertension, obesity, diabetes, lipid and circulatory system conditions, and alcohol abuse. Although there might be prior studies that manually linked paper records, this study exemplifies how existing electronic clinical and dental data can be reused to accelerate discoveries of associations between oral and systemic disease.

Replicated associations

We confirmed that diabetes types I and II are positively associated with periodontitis after adjustment for age, gender, ethnicity and tobacco abuse. These results replicated by our method are widely reported in the literature (Williams & Mahan 1960, Grossi & Genco 1998, Resnick et al. 1998, Iacopino 2001, Soskolne & Klinger 2001, Campus et al. 2005, Mealey 2006, Mealey & Oates 2006, Narayan et al. 2007, Mealey & Rose 2008, Lalla & Papapanou 2011, Hodge et al. 2012). We found diabetes with ophthalmic manifestations (ICD-9 250.5) to be associated with periodontitis (OR = 1.8, 95% CI 1.34–2.47, p < 0.001), which replicates the result from a study of Pima Indian diabetic patients with retinopathy who were found to be approximately five times more likely to have periodontitis (Löe 1993). Our association result remained valid after adjusting for our ethnically heterogeneous population, which further bolsters the literature supporting this association.

Overall, the literature shows that metabolic syndrome – whose symptoms include hypertension, hypercholesterolaemia, hyperlipidaemia – is associated with periodontitis (Morita et al. 2009, Andriankaja et al. 2010). One study found that hypertension was associated with tooth loss in post-menopausal women (Taguchi et al. 2004), while another study failed to find an association between hypertension and periodontitis after adjusting for multiple factors such as fruit and vegetable intake and multivitamin use (Rivas-Tumanyan et al. 2012). It is possible that our identified association between hypertension and periodontitis could be due to other factors, such as diet or vitamin use, not recorded in EHR data. However, a study on the relationship between the levels of periodontitis bacteria and hypertension established a direct association between the two diseases (Desvarieux et al. 2010).

Related to this association are two more replicated periodontitis associations: hypercholesterolaemia and hyperlipidaemia. The association between hypercholesterolaemia and periodontitis has been previously reported (Katz et al. 2001, Ramesh et al. 2010). For hyperlipidaemia, a link with periodontitis has been proposed as the underlying factor behind the connection between periodontitis and atherosclerosis (Fentoglu & Bozkurt 2008), which is supported by a number of studies (Ramesh et al. 2010, Joshi & Marawar 2011).

Our method also replicated an association between periodontitis and complications of pregnancy and childbirth (OR = 2.9, 95% CI 1.32–7.21, p = 0.014). This is a category-wide association involving multiple ICD-9 codes, including pre-term birth, delivery with complications, legal abortion and normal delivery. The association between threatened pre-term labour and periodontitis has been widely discussed in the literature (Offenbacher et al. 1996, Mitchell-Lewis et al. 2001, Michalowicz et al. 2009). The over-arching result of this category-wide association is that pregnancy itself is associated with gingival inflammation, a finding consistent with the literature (Bobetsis et al. 2006, Santos-Pereira et al. 2007, Wimmer & Pihlstrom 2008, Chambrone et al. 2011). Various explanations for the increased risk of periodontal manifestations during pregnancy have been suggested, including its link to the hormonal changes associated with pregnancy (Carrillo-De-Albornoz et al. 2012), although pregnancy does not cause periodontitis but may rather aggravate pre-existing periodontal disease (Laine 2002).

BPH–Periodontitis association

We found a previously unreported association between BPH and periodontitis that remained after adjusting for various lifestyle, demographic and comorbidity confounders (OR = 1.5, 95% CI 1.05–2.10, p = 0.026). To the best of our knowledge, no direct association study between BPH and periodontitis has been documented until now, though there is a study showing that levels of prostate-specific antigen (PSA) are higher in patients with periodontitis than in periodontitis-free individuals (Joshi et al. 2010). However, elevated PSA levels have many causes and the interpretation of PSA test results varies by patient (Nci 2012). The literature does support an association between chronic periodontitis and erectile dysfunction (Zadik et al. 2009, Keller et al. 2012). BPH and erectile dysfunction have been associated with each other, suggesting that both could be associated with periodontitis (Mcvary 2005, Costabile & Steers 2006).

A plausible biological mechanism behind our observed association between BPH and periodontitis could involve the mutual role that inflammation plays in both diseases (Amar et al. 2003, Nickel 2008). Both periodontitis and BPH have been associated with an increase in TGF-beta production, an important growth factor involved in the immune response and wound healing (Skalerič et al. 1997, Untergasser et al. 2005). Given this plausible biological underpinning, we suggest that the BPH–periodontitis association warrants further validation and study in other databases to remove any possible selection bias and to better understand the biological mechanism behind this association, which has potential clinical importance.

Ethnically heterogeneous population-based study

Our patient population consisted of all patients with records in both the College of Dental Medicine and NYP, which made our population heterogeneous in both known (e.g. ethnicity) and unknown (e.g. family history, genetic predisposition) ways. This is the main difference between EHR-based clinical research and other types of clinical research (e.g. cohort studies, randomized controlled trials) that are able to collect these types of data. However, although EHR data is heterogeneous, our study and others have successfully utilized EHR data to both replicate and reveal disease discoveries useful to clinicians (Denny et al. 2010, Ritchie et al. 2010).

Therefore, we measured and adjusted for known causes of heterogeneity within our study. Ethnically, our population consists of 69% Hispanic, 12% African descent (i.e. close to national average), 9% unreported, 5% European descent, 3% other and 1% Asian or Pacific Islander. Because of this, we performed ethnicity adjustment in our regression models to identify medical conditions associated with periodontitis regardless of ethnic background. The distribution of ethnicities in our patient population differs from the 2010 US census data, which shows 75% of the population being of European descent, 16% Hispanic, 14% African descent, 7% other and 6% Asian or Pacific Islander (note that percentages do not sum to 100 because of persons with multi-ethnic/racial backgrounds) (Us-Census-Bureau 2011). As we adjusted for ethnicity, all identified associations are independent of ethnic background. The contribution of race to periodontitis has been well described in the literature, including the association with being of Mexican American heritage (Eke et al. 2012, Papapanou 2012). However, while Hispanics are over-represented in our study (i.e. cases as well as controls), we adjust for this in our regression models. Therefore, this is not necessarily a shortcoming of our study. In fact, the ability of our method to successfully replicate known medical conditions associated with periodontitis in an ethnically heterogeneous population further demonstrates its utility and robustness, as it facilitates high-throughput identification of medical conditions associated with periodontitis.

Adjusting for smoking status

In this study we used the ICD-9 code for tobacco abuse disorder as our proxy for smoking status. We realize that tobacco abuse disorder and smoking status are not synonymous. However, we used tobacco abuse because this variable was available in our complete records database while smoking status was not. Similarly, other periodontal researchers have used tobacco abuse to adjust for smoking in their regression analyses (Keller et al. 2013). Knowing that smoking status is commonly incomplete in EHRs, we further investigated the impact of using tobacco abuse instead of smoking status on our findings. Therefore, we used the available smoking status data, recorded as never smoker and current or former smoker, for a subset of 1,071 patients that were treated in NYP’s ambulatory care division and repeated our regression analyses in this subset. Our subset regression analyses revealed that the directionality of the associations for all significant ICD-9 codes (Table 2) and categories remained the same. However, to avoid the reduction in sample size, we report our results using the tobacco abuse adjustment because this variable was available for all patients.


Limitations

Potential limitations of this study are selection bias and possible incompleteness in the de-identified EHR data. Our sample was limited to patients who received treatment at both Columbia University’s College of Dental Medicine and NYP. Since hospital treatment was necessary for inclusion into this study, our case and control patients are not likely to be in optimal physical condition. Also, the average age of our patients was 57 years, with no patients younger than 40. This selection bias may have an effect on our findings.

Analysing de-identified linked electronic health records introduced several limitations. First, we used billing codes to represent medical and dental conditions. Therefore, we did not have direct patient contact to either determine definitively that they indeed had the disease or condition that they were diagnosed with, or assess the severity of the condition. Likewise, our control patients may have had some degree of periodontitis that was not recorded in our records either because it was untreated due to a financial or other reason or it was treated outside of CUMC-NYP. Second, we grouped all periodontal treatment codes into one group to represent periodontitis and thus did not distinguish between treated or untreated periodontitis, or among periodontal treatment modalities, such as scaling and root planing, periodontal flap surgery or periodontal maintenance. Likewise, we used ICD-9 codes to represent medical conditions, while studies show that ICD-9 codes have low specificity (Goldstein 1998, Birman-Deych et al. 2005). Importantly, the absence of an ICD-9 code does not necessarily indicate that the patient does not have the condition in question (Birman-Deych et al. 2005). Furthermore, because we do not have access to longitudinally complete medical histories, we cannot definitively state that a patient did not have a particular condition at some prior point in their medical history. As we use electronic health record data in this association study, it is possible that some of the associations could result from the absence of other potential confounders, for example, socio-economic status, which is usually not available from electronic health records. However, there is literature support for all of the identified associations resulting from this study, which bolsters our belief in their clinical meaningfulness. Lastly, we used ICD-9 codes for two levels of association analyses: individual code level and category level. 
Therefore, we did not aggregate ICD-9 codes by diseases as some other researchers did (Denny et al. 2010), but instead we aggregated ICD-9 codes using the AMA’s published ICD-9 categories. Part of the strength of our method is that by using individual ICD-9 codes, we were able to detect fine-grained associations. For example, we found an association between periodontitis and diabetes with ophthalmic manifestations (ICD-9 250.5, OR = 1.8, 95% CI 1.34–2.47, p < 0.001), which would not have been possible with grouped codes (i.e. where all diabetes codes are represented by one diagnosis “diabetes”). One limitation of our approach is that it could result in a reduction in sensitivity and specificity when compared with multiple ICD-9 codes used to represent a single disease (e.g. diabetes type II).


Conclusion

This study presents a method for using linked electronic medical and dental records to support high-throughput knowledge discovery. The data-driven method does not rely on an a priori hypothesis but instead tests every possible hypothesis simultaneously and identifies medical conditions significantly associated with periodontitis. Our method successfully replicated known associations between periodontitis and diabetes mellitus types I and II, hypertension, hypercholesterolaemia, hyperlipidaemia and conditions pertaining to pregnancy and childbirth. We also discovered a thus far unreported association between benign prostatic hyperplasia and periodontitis in male patients less than 70 years of age that has potential clinical importance due to both diseases’ mutual role in inflammation pathways. We conclude that linked electronic health records can promote oral disease knowledge discovery.

Clinical Relevance

Scientific rationale for the study

Periodontitis is known to be associated with chronic systemic diseases. Unlike conventional association studies that are designed to test only one hypothesis at a time, we developed a method that tests multiple associations concurrently using linked electronic health records.

Principal findings

We confirmed known associations with diabetes type I and II, hypertension, hypercholesterolaemia, hyperlipidaemia and conditions pertaining to pregnancy and childbirth. We also identified a previously unreported association with benign prostatic hyperplasia.

Practical implications

Linked electronic medical and dental records are useful for exploring associations between medical and dental conditions without a priori hypothesis selection.


We thank Richard Steinman for editing this manuscript and Titus Schleyer for his feedback.

This study was supported by grants R01LM009886, R01LM010815 and R01 LM006910 from the National Library of Medicine, grant UL1 TR000040 from the National Center for Research Resources and an AHRQ grant R01 HS019853.


Conflict of interest and source of funding statement

Supporting Information

Additional Supporting Information may be found in the online version of this article:

Figure S1. Age distribution for patients <70 years old, for cases and controls stratified by gender.

Figure S2. Medical conditions not significantly associated with absence of periodontitis.

Figure S3. Frequency of ICD-9 codes for complications of pregnancy and childbirth.

Figure S4. Frequency of ICD-9 codes for nervous system conditions.

Figure S5. Frequency of ICD-9 codes for endocrine system conditions.

Table S1. List of 993 ICD-9 codes in our data set for association mining.


  • AHRQ [accessed on 20 July 2012];Treating Tobacco Use and Dependence. PHS Clinical Practice Guideline. 2012
  • AMA [accessed on 6 April 2012];ICD-9-CM Official Coding Guidelines. American Medical Association. 2008
  • Amar S, Gokce N, Morgan S, Loukideli M, Van Dyke TE, Vita JA. Periodontal disease is associated with brachial artery endothelial dysfunction and systemic inflammation. Arteriosclerosis, Thrombosis, and Vascular Biology. 2003;23:1245–1249. [PubMed]
  • Andriankaja O, Sreenivasa S, Dunford RG, Denardin E. Association between metabolic syndrome and periodontal disease. Australian Dental Journal. 2010;55:252–259. [PubMed]
  • Beck JD, Offenbacher S. Oral health and systemic disease: periodontitis and cardiovascular disease. Journal of Dental Education. 1998;62:859–870. [PubMed]
  • Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B (Methodological) 1995;57:289–300.
  • Birman-Deych E, Waterman AD, Yan Y, Nilasena DS, Radford MJ, Gage BF. Accuracy of ICD-9-CM Codes for Identifying Cardiovascular and Stroke Risk Factors. Medical Care. 2005;43:480–485. [PubMed]
  • Bobetsis YA, Barros SP, Offenbacher S. Exploring the relationship between periodontal disease and pregnancy complications. Journal of the American Dental Association. 2006;137:7S–13S. [PubMed]
  • Bonferroni CE. Studi in Onore del Professore Salvatore Ortu Carboni. Rome; Italy: 1935. Il calcolo delle assicurazioni su gruppi di teste; pp. 13–60.
  • Bonferroni CE. Teoria statistica delle classi e calcolo delle probabilità. Pubblicazioni del R Istituto Superiore di Scienze Economiche e Commerciali di Firenze. 1936;8:3–62.
  • Bramble JD, Galt KA, Siracuse MV, Abbott AA, Drincic A, Paschal KA, Fuji KT. The relationship between physician practice characteristics and physician adoption of electronic health records. Health Care Management Review. 2010;35:55–64. [PubMed]
  • Campus G, Salem A, Uzzau S, Baldoni E, Tonolo G. Diabetes and Periodontal Disease: A Case-Control Study. Journal of Periodontology. 2005;76:418–425. [PubMed]
  • Carrillo-de-Albornoz A, Figuero E, Herrera D, Cuesta P, Bascones-Martínez A. Gingival changes during pregnancy: III. Impact of clinical, microbiological, immunological and socio-demographic factors on gingival inflammation. Journal of Clinical Periodontology. 2012;39:272–283. [PubMed]
  • Casnoff C, Rosenberger L, Kwon N, Scherer H. Quality Oral Health Care in Medicaid Through Health IT: Final Report. Agency for Healthcare Research and Quality. 2011 AHRQ Publication No. 11-0085-EF.
  • Chambrone L, Pannuti CM, Guglielmetti MR, Chambrone LA. Evidence grade associating periodontitis with preterm birth and/or low birth weight: II. A systematic review of randomized trials evaluating the effects of periodontal treatment. Journal of Clinical Periodontology. 2011;38:902–914. [PubMed]
  • 111th-Congress. Public Law 111-5-Feb. 17, 123 Stat, 115-521. 2009. H.R. 1 (111th): American Recovery and Reinvestment Act of 2009.
  • Costabile RA, Steers WD. How can we best characterize the relationship between erectile dysfunction and benign prostatic hyperplasia? Journal of Sexual Medicine. 2006;3:676–681. [PubMed]
  • Denny JC, Ritchie MD, Basford MA, Pulley JM, Bastarache L, Brown-Gentry K, Wang D, Masys DR, Roden DM, Crawford DC. PheWAS: demonstrating the feasibility of a phenome-wide scan to discover gene-disease associations. Bioinformatics. 2010;26:1205–1210. [PMC free article] [PubMed]
  • DeStefano F, Anda RF, Kahn HS, Williamson DF, Russell CM. Dental disease and risk of coronary heart disease and mortality. BMJ: British Medical Journal. 1993;306:688–691. [PMC free article] [PubMed]
  • Desvarieux M, Demmer RT, Jacobs DR, Jr, Rundek T, Boden-Albala B, Sacco RL, Papapanou PN. Periodontal bacteria and hypertension: the oral infections and vascular disease epidemiology study (INVEST). Journal of Hypertension. 2010;28:1413–1421. [PMC free article] [PubMed]
  • Desvarieux M, Demmer RT, Rundek T, Boden-Albala B, Jacobs DR, Sacco RL, Papapanou PN. Periodontal Microbiota and Carotid Intima-Media Thickness. Circulation. 2005;111:576–582. [PMC free article] [PubMed]
  • Din F, Powell V. [accessed on 28 September 2011];Integration of Medical and Dental Records to Improve Healthcare Outcomes, Costs, and Overall Public Health. Community Healthcare Meeting and Teleconference, December 17, 2008. 2008
  • Eke PI, Dye BA, Wei L, Thornton-Evans GO, Genco RJ. Prevalence of Periodontitis in Adults in the United States: 2009 and 2010. Journal of Dental Research. 2012;91:914–920. [PubMed]
  • Fentoglu O, Bozkurt FY. The Bi-Directional Relationship between Periodontal Disease and Hyperlipidemia. European Journal of Dentistry. 2008;2:142–146. [PMC free article] [PubMed]
  • Goldstein LB. Accuracy of ICD-9-CM Coding for the Identification of Patients With Acute Ischemic Stroke: Effect of Modifier Codes. Stroke. 1998;29:1602–1604. [PubMed]
  • Grossi SG, Genco RJ. Periodontal disease and diabetes mellitus: a two-way relationship. Annals of Periodontology. 1998;3:51–61. [PubMed]
  • Hill T, Lewicki P. Statistics: Methods and applications: A comprehensive reference for science, industry, and data mining. StatSoft Inc.; Tulsa: 2006. pp. 1–832.
  • Hodge PJ, Robertson D, Paterson K, Smith GLF, Creanor S, Sherriff A. Periodontitis in non-smoking type 1 diabetic adults: a cross-sectional study. Journal of Clinical Periodontology. 2012;39:20–29. [PubMed]
  • Iacopino AM. Periodontitis and diabetes interrelationships: role of inflammation. Annals of Periodontology. 2001;6:125–137. [PubMed]
  • Jha AK, DesRoches CM, Campbell EG, Donelan K, Rao SR, Ferris TG, Shields A, Rosenbaum S, Blumenthal D. Use of Electronic Health Records in U.S. Hospitals. New England Journal of Medicine. 2009;360:1628–1638. [PubMed]
  • Joshi N, Bissada NF, Bodner D, MacLennan GT, Narendran S, Jurevic R, Skillicorn R. Association between periodontal disease and prostate-specific antigen levels in chronic prostatitis patients. Journal of Periodontology. 2010;81:864–869. [PubMed]
  • Joshi NV, Marawar P. Hyperlipidemia- A Link Between Periodontitis and Coronary Heart Disease. Journal of the Indian Dental Association. 2011;5
  • Katz J, Chaushu G, Sharabi Y. On the association between hypercholesterolemia, cardiovascular disease and severe periodontal disease. Journal of Clinical Periodontology. 2001;28:865–868. [PubMed]
  • Keller JJ, Chung S-D, Lin H-C. A nationwide population-based study on the association between chronic periodontitis and erectile dysfunction. Journal of Clinical Periodontology. 2012;39:507–512. [PubMed]
  • Keller JJ, Wu C-S, Chen Y-H, Lin H-C. Association between obstructive sleep apnoea and chronic periodontitis: a population-based study. Journal of Clinical Periodontology. 2013;40:111–117. [PubMed]
  • Kleinberg S, Hripcsak G. A review of causal inference for biomedical informatics. Journal of Biomedical Informatics. 2011;44:1102–1112. [PMC free article] [PubMed]
  • Laine MA. Effect of pregnancy on periodontal and dental health. Acta Odontologica Scandinavica. 2002;60:257–264. [PubMed]
  • Lalla E, Lamster IB, Hofmann MA, Bucciarelli L, Jerud AP, Tucker S, Lu Y, Papapanou PN, Schmidt AM. Oral infection with a periodontal pathogen accelerates early atherosclerosis in apolipoprotein E-null mice. Arteriosclerosis, Thrombosis, and Vascular Biology. 2003;23:1405–1411. [PubMed]
  • Lalla E, Papapanou PN. Diabetes mellitus and periodontitis: a tale of two common interrelated diseases. Nature Reviews Endocrinology. 2011;7:738–748. [PubMed]
  • Löe H. Periodontal disease: the sixth complication of diabetes mellitus. Diabetes Care. 1993;16:329–334. [PubMed]
  • Mattila KJ, Nieminen MS, Valtonen VV, Rasi VP, Kesäniemi YA, Syrjälä SL, Jungell PS, Isoluoma M, Hietaniemi K, Jokinen MJ, Huttunen JK. Association between dental health and acute myocardial infarction. BMJ: British Medical Journal. 1989;298:779–781. [PMC free article] [PubMed]
  • McVary K. BPH: epidemiology and comorbidities. Am J Manag Care. 2006;12:S122–S128. [PubMed]
  • McVary KT. Erectile dysfunction and lower urinary tract symptoms secondary to BPH. European Urology. 2005;47:838–845. [PubMed]
  • Mealey B. Periodontal disease and diabetes A two-way street. The Journal of the American Dental Association. 2006;137:26S–31S. [PubMed]
  • Mealey BL, Oates TW. Diabetes mellitus and periodontal diseases. Journal of Periodontology. 2006;77:1289–1303. [PubMed]
  • Mealey BL, Rose LF. Diabetes mellitus and inflammatory periodontal diseases. Current Opinion in Endocrinology, Diabetes and Obesity. 2008;15:135–141. [PubMed]
  • Michalowicz BS, Hodges JS, Novak MJ, Buchanan W, DiAngelis AJ, Papapanou PN, Mitchell DA, Ferguson JE, Lupo VR, Bofill J, Matseoane S. Change in periodontitis during pregnancy and the risk of pre-term birth and low birth-weight. Journal of Clinical Periodontology. 2009;36:308–314. [PMC free article] [PubMed]
  • Mitchell-Lewis D, Engebretson SP, Chen J, Lamster IB, Papapanou PN. Periodontal infections and pre-term birth: early findings from a cohort of young minority women in New York. European Journal of Oral Sciences. 2001;109:34–39. [PubMed]
  • Morita T, Ogawa Y, Takada K, Nishinoue N, Sasaki Y, Motohashi M, Maeno M. Association between periodontal disease and metabolic syndrome. Journal of Public Health Dentistry. 2009;69:248–253. [PubMed]
  • Narayan KMV, Boyle JP, Thompson TJ, Gregg EW, Williamson DF. Effect of BMI on Lifetime Risk for Diabetes in the U.S. Diabetes Care. 2007;30:1562–1566. [PubMed]
  • NCI [accessed on 20 July 2012];Prostate-Specific Antigen (PSA) Test. National Cancer Institute Fact Sheet. 2012
  • Nickel JC. Inflammation and benign prostatic hyperplasia. Urologic Clinics of North America. 2008;35:109–115. vii. [PMC free article] [PubMed]
  • Offenbacher S, Katz V, Fertik G, Collins J, Boyd D, Maynor G, McKaig R, Beck J. Periodontal Infection as a Possible Risk Factor for Preterm Low Birth Weight. Journal of Periodontology. 1996;67:1103–1113. [PubMed]
  • Papapanou PN. The prevalence of periodontitis in the US: forget what you were told. Journal of Dental Research. 2012;91:907–908. [PubMed]
  • Ramesh A, Shaju JP, Zade R. Association between chronic generalized periodontitis and hyperlipidemia - a case control study. Bangladesh Journal of Medical Science. 2010;9:95–100.
  • R-Documentation [accessed on 30 May 2012];Fitting Generalized Linear Models. 2002
  • Resnick HE, Valsania P, Halter JB, Lin X. Differential effects of BMI on diabetes risk among black and white Americans. Diabetes Care. 1998;21:1828–1835. [PubMed]
  • Ritchie MD, Denny JC, Crawford DC, Ramirez AH, Weiner JB, Pulley JM, Basford MA, Brown-Gentry K, Balser JR, Masys DR, Haines JL, Roden DM. Robust replication of genotype-phenotype associations across multiple diseases in an electronic medical record. American Journal of Human Genetics. 2010;86:560–572. [PubMed]
  • Rivas-Tumanyan S, Spiegelman D, Curhan GC, Forman JP, Joshipura KJ. Periodontal disease and incidence of hypertension in the health professionals follow-up study. American Journal of Hypertension. 2012;25:770–776. [PMC free article] [PubMed]
  • Rudman W, Hart-Hester S, Jones W, Caputo N, Madison M. Integrating medical and dental records. A new frontier in health information management. Journal of AHIMA. 2010;81:36–39. [PubMed]
  • Santos-Pereira SA, Giraldo PC, Saba-Chujfi E, Amaral RLG, Morais SS, Fachini AM, Gonçalves AKS. Chronic periodontitis and pre-term labour in Brazilian pregnant women: an association to be analysed. Journal of Clinical Periodontology. 2007;34:208–213. [PubMed]
  • Skalerič U, Kramar B, Petelin M, Pavlica Z, Wahl SM. Changes in TGF-β1 levels in gingiva, crevicular fluid and serum associated with periodontal inflammation in humans and dogs. European Journal of Oral Sciences. 1997;105:136–142. [PubMed]
  • Slavkin HC, Baum BJ. Relationship of dental and oral pathology to systemic illness. Journal of the American Medical Association. 2000;284:1215–1217. [PubMed]
  • Soskolne W, Klinger A. The relationship between periodontal diseases and diabetes: an overview. Annals of Periodontology. 2001;6:91–98. [PubMed]
  • Taguchi A, Sanada M, Suei Y, Ohtsuka M, Lee K, Tanimoto K, Tsuda M, Ohama K, Yoshizumi M, Higashi Y. Tooth loss is associated with an increased risk of hypertension in postmenopausal women. Hypertension. 2004;43:1297–1300. [PubMed]
  • Theis M, Reid R, Chaudhari M, Newton K, Spangler L, Grossman D, Inge R. Case study of linking dental and medical healthcare records. Am J Manag Care. 2010;16:e51–e56. [PubMed]
  • Untergasser G, Madersbacher S, Berger P. Benign prostatic hyperplasia: age-related tissue-remodeling. Experimental Gerontology. 2005;40:121–128. [PubMed]
  • US-Census-Bureau [accessed on 24 October 2012];Overview of race and Hispanic origin: 2010. 2011
  • Williams RC, Jr, Mahan CJ. Periodontal disease and diabetes in young adults. Journal of the American Medical Association. 1960;172:776–778. [PubMed]
  • Wimmer G, Pihlstrom BL. A critical assessment of adverse pregnancy outcome and periodontal disease. Journal of Clinical Periodontology. 2008;35:380–397. [PubMed]
  • Zadik Y, Bechor R, Galor S, Justo D, Heruti RJ. Erectile dysfunction might be associated with chronic periodontal disease: two ends of the cardiovascular spectrum. Journal of Sexual Medicine. 2009;6:1111–1116. [PubMed]