Dental caries is the most common disease to cause irreversible damage in humans. Several therapeutic agents are available to treat or prevent dental caries, but none besides fluoride has significantly influenced the disease burden globally. Etiologic mechanisms of the mutans group streptococci and specific Lactobacillus species have been characterized to various degrees of detail, from identification of physiologic processes to specific proteins. Here, we analyze the entire Streptococcus mutans proteome for potential drug targets by investigating their uniqueness with respect to non-cariogenic dental plaque bacteria, quality of protein structure models, and the likelihood of finding a drug for the active site. Our results suggest specific targets for rational drug discovery, including 15 known virulence factors, 16 proteins for which crystallographic structures are available, and 84 previously uncharacterized proteins, with various levels of similarity to homologs in dental plaque bacteria. This analysis provides a map to streamline the process of clinical development of effective multispecies pharmacologic interventions for dental caries.
Dental caries affects the vast majority of people in developed nations (Petersen, 2003). The multifactorial etiology of dental caries includes multiple bacterial species and nutrients that facilitate bacterial acidogenesis (van Palenstein Helderman et al., 1996). Factors influencing susceptibility include age, immunologic status (Smith and Mattos-Graner, 2008), salivary function (Stookey, 2008), human genetics (Wright, 2010), bacterial genetics (Loesche, 1986), and behavioral practices such as diet and hygiene (Featherstone, 2000; Milgrom et al., 2009).
The most effective approach to preventing dental caries is to completely exclude refined sugars (e.g., sucrose, fructose) from the diet and to promote consumption of protein, lipids, and complex carbohydrates (van Palenstein Helderman et al., 1996; Featherstone, 2000). However, within the constraints of consumer culture, dietary changes are not expected to affect the pandemic of dental caries significantly in the near future. Thus, innovative approaches are needed (Milgrom et al., 2009). Pharmacologic agents that create environmental pressures driving the emergence of more symbiotic bacterial strains present one option for long-term treatment of dental caries.
The primary targets of contemporary pharmacologic treatment are the etiologic bacteria and the diseased tissues. The tooth is the target of various effective chemical regimens for prevention and regeneration (Featherstone, 2009). Some natural and synthetic molecular agents show moderate efficacy against cariogenic bacteria, but no clinical panacea has been found. Concerted approaches to rational drug design are rare. Since drugs traditionally target protein-binding sites, protein structure is required for rational design (Agüero et al., 2008; Horst et al., 2012). The recent explosion of sequencing technology made available the protein sequences corresponding to all genes of Streptococcus mutans and various other dental plaque bacteria (Fig. 1b). Combined with comparative structure prediction to model structure from sequence (Sali and Blundell, 1993), we are presented with the novel opportunity to rationally design multitarget multispecies drugs.
Targeting multiple disease-mediating proteins with single compounds has become the paradigm for new cancer drugs (Petrelli and Giordano, 2008). The predominant drug for chronic myelogenous leukemia, Gleevec, serendipitously inhibits at least two pathways specific to cell proliferation in this disease (Kaelin, 2004). When applied to computational drug design, multitargeting increases the odds of success: If a compound is predicted to inhibit multiple proteins, it is likely that it will actually inhibit at least one. We previously validated this approach for the development of inhibitors for microbial pathogens by docking all compounds approved for use in humans to the 13 available crystallographic structures for Plasmodium falciparum; 6 of 16 tested compounds demonstrated sub-micromolar activity (Jenwitheesuk et al., 2008). Thus we believe that identification of multiple protein targets within S. mutans will create useful paths for the development of novel multitarget treatments for dental caries.
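The multitargeting argument can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that predictions for different targets succeed independently of one another; the function name and the per-target probabilities are hypothetical and are not results from this study.

```python
from math import prod


def prob_at_least_one_hit(per_target_probs):
    """Probability that a compound truly inhibits at least one predicted target.

    Assumes each prediction is correct independently of the others
    (an illustrative assumption, not a claim made in the article).
    """
    return 1.0 - prod(1.0 - p for p in per_target_probs)


# Hypothetical example: three predicted targets, each with a 40% chance
# that the predicted inhibition is real.
single = prob_at_least_one_hit([0.4])            # 0.4
multi = prob_at_least_one_hit([0.4, 0.4, 0.4])   # 1 - 0.6**3, about 0.784
```

Under these assumptions, adding predicted targets can only raise the chance that at least one prediction holds, which is the intuition behind selecting compounds predicted to bind several proteins.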
Although the concept of targeting S. mutans alone is attractive, multispecies therapy is essential because multiple species contribute to dental caries. Caries experience seems to depend more on diet than on the prevailing plaque species (van Palenstein Helderman et al., 1996). Additionally, S. mutans levels in older patients do not correlate with caries experience (Milgrom et al., 2009), and inverse associations of caries experience with S. mutans detection are reported for children with blood dyscrasias (Ou-Yang et al., 2010). Even when S. mutans correlates best to caries experience, many other species and genera are also significantly associated (Tanner et al., 2011). Furthermore, histologically distinct regions of caries lesions have been found to associate with different bacteria: In early lesions, lack of cultivatable Veillonella is associated with lack of S. mutans (Marsh et al., 1989). Meanwhile, evidence suggests that some bacteria are protective and should be permitted to thrive. In Fig. 1b, we detail the species that appear to be contributory (n = 16) or protective (n = 7) for dental caries. We investigate them here as target or antitarget species, respectively.
In this work, we estimate the likelihood of each S. mutans protein being successfully targeted by structure-based drug discovery (Jenwitheesuk et al., 2008; Fan et al., 2009). We funnel down the entire proteome to those sequences for which highly reliable models can be computed or for which experimentally determined structures are available (modelable), and then to those with binding site features similar to those of known drug targets (druggable). We then continue to funnel these proteins for uniqueness to S. mutans by comparing each with the entire proteomes of 23 dental plaque bacteria, stratified by contribution to dental caries (Fig. 1). We predict whether pharmacologic inhibition of any S. mutans protein would also selectively inhibit other cariogenic bacteria. The output is a guide to strategic target selection for effective long-term preventive and therapeutic pharmacologic interventions. This approach is novel to dental caries and provides a model for chronic multi-bacterial diseases.
We take a three-stage approach to assess the likelihood of a given protein interaction site binding a drug-like compound (druggability) and of a drug for that protein to target other dental plaque bacteria (Fig. 1). In the first stage, we build atomic models with all relevant templates. In the second stage, we assess the druggability of the template that was used to generate the best model. In the third stage, we assess the similarity of each protein to all proteins (proteomes) in dental plaque bacteria.
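The first two stages of this funnel can be sketched as successive filters. This is a minimal illustration and not the pipeline used in this study: the `Protein` record, its field names, and the example values are hypothetical, while the two cutoffs (zDOPE < −1 and DrugEBIlity > 0.5) are the ones stated in this article.

```python
from dataclasses import dataclass

ZDOPE_CUTOFF = -1.0        # zDOPE below this value: "modelable"
DRUGGABILITY_CUTOFF = 0.5  # DrugEBIlity above this value: "druggable"


@dataclass
class Protein:
    accession: str       # protein identifier (hypothetical values below)
    best_zdope: float    # zDOPE of the best model for this sequence
    druggability: float  # DrugEBIlity score of that model's template


def funnel(proteome):
    """Return (modelable, modelable_and_druggable) subsets of a proteome."""
    modelable = [p for p in proteome if p.best_zdope < ZDOPE_CUTOFF]
    druggable = [p for p in modelable if p.druggability > DRUGGABILITY_CUTOFF]
    return modelable, druggable


# Made-up records, for illustration only.
example = [
    Protein("protA", -1.4, 0.76),   # modelable and druggable
    Protein("protB", -1.2, -0.71),  # modelable, not druggable
    Protein("protC", -0.3, 0.90),   # not modelable, so never assessed further
]
modelable, targets = funnel(example)
```

The third stage, comparison against other proteomes, would then be applied only to the proteins that survive both filters.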
All available S. mutans protein structures were obtained from the Protein Data Bank (PDB; Berman et al., 2000; accessed October 4, 2011). All protein sequences ascribed to the reviewed complete proteome sets for S. mutans and other dental plaque bacteria were downloaded from UniProtKB (Ajdić et al., 2002; Apweiler et al., 2004; accessed January 16, 2011).
To generate atomic models for each S. mutans protein, we applied the restraint-based comparative modeling program MODELLER-v9.10 (Sali and Blundell, 1993). The model dataset was generated with the automated modeling pipeline ModPipe (Pieper et al., 2008), which performs template selection and target-template alignment (MODELLER, PSI-BLAST), model building, and model evaluation. Templates were drawn from crystal structures in a subset of the PDB with redundancy removed at the 95% sequence identity level. To select the most accurate model for each sequence from the model pool created by ModPipe, we applied the Z-score of the DOPE atomic distance-dependent statistical potential (zDOPE; Shen and Sali, 2006), which estimates the reliability of each model. A zDOPE < −1 indicates that the modeling process identified the native fold topology; proteins with such a model are deemed “modelable”.
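The model-selection step can be sketched as follows. This is an illustrative reduction over a pool of already-scored models and does not call MODELLER or ModPipe; the tuple layout, identifiers, and scores are hypothetical, and the sketch assumes that a lower (more negative) zDOPE marks the more reliable model.

```python
def select_best_models(model_pool):
    """Keep the lowest-zDOPE model for each sequence.

    model_pool: iterable of (sequence_id, model_id, zdope) tuples.
    Returns {sequence_id: (model_id, zdope)}.
    """
    best = {}
    for seq_id, model_id, zdope in model_pool:
        if seq_id not in best or zdope < best[seq_id][1]:
            best[seq_id] = (model_id, zdope)
    return best


def is_modelable(zdope, cutoff=-1.0):
    """Apply the zDOPE < -1 criterion stated in the text."""
    return zdope < cutoff


# Made-up model pool: two candidate models for seq1, one for seq2.
pool = [
    ("seq1", "model_a", -0.4),
    ("seq1", "model_b", -1.3),
    ("seq2", "model_c", -0.7),
]
best = select_best_models(pool)
modelable_ids = [s for s, (_, z) in best.items() if is_modelable(z)]
```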
To predict proteins that bind compounds that satisfy Lipinski’s Rule of 5 (Lipinski et al., 1997) and have ≤10 rotatable bonds, we applied the DrugEBIlity analysis (Agüero et al., 2008). The DrugEBIlity score is calculated as the mean of the outputs of 11 machine-learning algorithms, separately trained with 25 physicochemical descriptors of all known drug-binding sites. To obtain predictions of high specificity, we applied the threshold of satisfying at least 8 of the 11 algorithms (DrugEBIlity ensemble score > 0.5; Fig. 2).
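The ensemble scoring can be illustrated with a few lines of Python. This sketch does not reproduce DrugEBIlity itself: the text does not specify the output scale of the individual algorithms or exactly how the 8-of-11 criterion maps onto the score, so the code simply assumes that each algorithm returns a numeric score on a common scale and applies the 0.5 cutoff to their mean. All values are hypothetical.

```python
from statistics import mean

N_ALGORITHMS = 11  # number of machine-learning algorithms named in the text


def ensemble_score(algorithm_scores):
    """Mean of the per-algorithm scores for one binding site."""
    if len(algorithm_scores) != N_ALGORITHMS:
        raise ValueError(f"expected {N_ALGORITHMS} scores")
    return mean(algorithm_scores)


def is_druggable(algorithm_scores, cutoff=0.5):
    """Apply the ensemble score > 0.5 criterion stated in the text."""
    return ensemble_score(algorithm_scores) > cutoff


# Made-up scores: 8 algorithms strongly in favor, 3 weakly in favor.
mostly_positive = [0.9] * 8 + [0.1] * 3
# Made-up scores: all algorithms give a low score.
mostly_negative = [0.2] * 11
```

With these invented inputs, `is_druggable(mostly_positive)` is true and `is_druggable(mostly_negative)` is false.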
To anticipate analogous targeting of other relevant bacteria, we built HHsearch HMM-based phylogenetic profiles comparing each S. mutans protein with all proteins in other dental plaque bacteria. We built an HMM with HHsearch for each protein in each proteome by comparing similarity patterns found in the 70% and 90% non-redundant NCBI protein sequence database by fold family hierarchically, and calibrating (normalizing) against a set of HMMs including one for each fold family in SCOP. We compared the HMM for each S. mutans protein with all 23 cariogenic and non-cariogenic bacterial proteomes using HHsearch. HHsearch evaluates protein similarity by maximizing the co-emission log-odds probability for a pair of HMMs, which represent position-specific insertion-deletion probabilities of multiple sequence alignment profiles (Söding, 2005). We plotted the proportion of matching HMM alignment columns for the most similar protein in each proteome (Figs. 3, 4).
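The profile plotted for each S. mutans protein can be sketched as a reduction over its HMM-HMM hits. The code below does not run HHsearch; it assumes the hits have already been parsed into (proteome, matched columns) pairs, and it takes the query HMM length as the denominator of the proportion, which is an assumption not stated in the text. Proteome names and counts are hypothetical.

```python
def phylogenetic_profile(query_length, hits):
    """Best-hit proportion of matching alignment columns per proteome.

    query_length: number of columns in the query HMM (assumed denominator).
    hits: iterable of (proteome_name, matched_columns) pairs.
    Returns {proteome_name: highest proportion seen in that proteome}.
    """
    profile = {}
    for proteome, matched_columns in hits:
        proportion = matched_columns / query_length
        if proportion > profile.get(proteome, 0.0):
            profile[proteome] = proportion
    return profile


# Made-up hits for one query protein with a 200-column HMM.
hits = [
    ("cariogenic_sp1", 180),
    ("cariogenic_sp1", 90),   # weaker hit in the same proteome, ignored
    ("protective_sp1", 20),
]
profile = phylogenetic_profile(200, hits)
# profile == {"cariogenic_sp1": 0.9, "protective_sp1": 0.1}
```

A protein with high proportions in target species and low proportions in antitarget species would be the more attractive candidate under the selection logic described here.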
Fifteen known virulence factors, 16 proteins for which crystal structures are available, and 84 previously uncharacterized proteins were identified as modelable and druggable. All comparative models are available through ModBase (http://modbase.compbio.ucsf.edu).
We illustrate protein druggability using 6 proteins with highly reliable models (Fig. 2). First, as predicted by the DrugEBIlity score of −0.71, the proton/lactate pump (P50976) has no detectable pocket large enough for any drug-like compound (Fig. 2a). Second, a multiple sugar-binding protein that facilitates uptake of raffinose and other nutritional sugars (Russell et al., 1992) contains a central cavity (DrugEBIlity score 0.76) large enough to fit galactose (shown), other sugars, and most drug families (Fig. 2b). Third, a cell-surface adhesin that mediates attachment to the enamel pellicle (Koga et al., 1990) presents a shallow cleft capable of binding peptides or RNA (Fig. 2c). Fourth, glucosamine-6-phosphate deaminase illustrates a druggable pocket from a crystal structure, with suitable geometry and chemistry to bind glucosamine-6-phosphate, other physiologic riboses, and analog drugs (Fig. 2d). Fifth, among all S. mutans proteins for which a crystal structure is not yet available, uridine-diphosphate acetyl-glucosamine epimerase (Q8DTB7) bears the binding site predicted with the highest confidence to be pharmacologically inhibited (Fig. 2e). The fit of the uridine diphosphate from template structure 3beo suggests accurate modeling of the binding site: The long, narrow pocket and the hydrophobic patch at its end (red) are favorable conditions for drug-induced inhibition. Sixth, a completely uncharacterized protein exemplifies a protein predicted to be modelable and druggable, which is relatively unique to S. mutans (Fig. 2f). All modelable and druggable proteins represent potential drug targets.
Sixty-one proteins contributing to cariogenesis were identified from the literature, in general because inhibition has reduced some parameter of cariogenicity. References for the involvement of each gene in dental caries are included in Appendix Table 1. The strength of evidence for each protein being a virulence factor corresponds to the clinical relevance of the model system in which experiments were performed, the method by which the protein was inhibited, and the magnitude of impact on surrogate markers of cariogenesis. We annotated 22 of these proteins with highly reliable (zDOPE < −1) or moderately reliable (zDOPE < −0.5) atomic models, the likelihood of discovering a drug for the template protein (DrugEBIlity > 0.5), and comparison of phylogenetic profiles among cariogenic or protective bacterial species (Fig. 3). These proteins are categorized according to etiologic mechanisms and physiologic processes essential to bacterial colonization and thriving (Fig. 3; Appendix Table 1).
The identified potential drug targets within the metabolic protein subset include multiple sugar-binding protein (UniProt Q00749; Fig. 2b), fructose phosphotransferase (Q8DUN3), purine nucleoside phosphorylase (Q8DTU4), glycogen synthase (Q8CWX0), signal recognition particle (Q54431), formyltetrahydrofolate ligase (Q59925), and pantothenate flavoprotein (Q8DU74). For all these proteins, homologs are identified in the vast majority of bacteria sampled, suggesting general cross-reactivity (Fig. 3).
Modelable and druggable proteins that function in attachment to the biofilm extracellular polysaccharide include glycogen phosphorylase (Q8DT55), another phosphorylase (Q8DT31), dextran glucosidase (Q99040), secreted peptidoglycan hydrolase (Q8DWM3), glucan-binding protein-C response regulator (Q9S151), and cell-surface adhesin (P11657). Most of these proteins are present in all sampled proteomes. The hydrolase is present more in protective than cariogenic bacteria, and therefore is not a good target, whereas the adhesin is relatively unique to S. mutans and is a good target.
Targets that facilitate environmental adaptation by signaling changes via quorum sensing include bromodomain-containing RNA-binding protein-2 response regulator (Q8DVJ8) and oxidative stress sensor kinase (Q8DT64), which are both ubiquitous and predicted to be potential drug targets.
Crystallographic structures provide the most globally accurate models currently obtainable, and are generally preferable for drug discovery (Baker and Sali, 2001), although comparative models can also be useful (Fan et al., 2009). We predict 14 out of 81 known structures for S. mutans to be highly amenable to drug discovery, and present their phylogenetic profiles to aid design of specificity (Fig. 4a).
Our S. mutans proteome modeling and druggability experiment discovered 84 novel high-quality models (zDOPE < −1) with highly druggable template structures (DrugEBIlity > 0.5; Figs. 1a, 3b). While functional annotations have been made by sequence comparison, most of these proteins are not well-studied. We propose these proteins as suitable targets for rational drug discovery. Future work on these proteins could include crystallography with physiologic ligand analogs, high-throughput screening, or computational multitarget molecular docking studies (CANDO: http://cando.compbio.washington.edu).
The character of a bacterial species is found in the divergent structural features and the differential physiologic responses to environmental shifts. To inform a strategic plan against S. mutans, we assessed the accessibility of its structural features to rational drug discovery, and the uniqueness of its proteins with respect to those of other relevant bacteria in the dental plaque. We performed this analysis to inform discovery of pharmacologic inhibitors for dental caries.
Unfortunately, no druggable proteins were found to be differentially abundant in cariogenic bacteria. Rather, all are either ubiquitous to this set, common to all Streptococci and Bifidobacteria but absent from Lactobacilli, or relatively unique to S. mutans (Fig. 3). It seems that the probability of developing a highly accurate model for a given protein is greatly increased for well-studied protein families, since more template structures are available for them; physiologically central roles are of high interest for study, but centrality equates to ubiquity, so modelable proteins tend to be common. Nonetheless, specific analyses of binding-site residues may reveal more specificity than estimated by this ortholog prediction.
Inability to produce accurate models with the current PDB makes no statement about the druggability of the protein: It is simply not possible to perform structural analysis without a structure. Many currently unmodelable proteins are expected to be drug targets. Bench assays and crystallography are indicated for proteins with no template that correlate closely with cariogenicity. Meanwhile, the 15 virulence factors predicted to be modelable and druggable validate the funnel approach we took to analyze the full proteome.
The information explosion in sequence and structural data can be cross-referenced with epidemiologic data that identify differential gene presence (Zhang et al., 2009) or in vitro studies of gene expression (Sol et al., 2011). These and environment-specific phylogenetic analyses will become more meaningful as sequencing data expand to the many yet-unrepresented dental plaque bacterial species.
A subset of the targets identified here will progress to virtual screening, which has resulted in the selection of verifiable hits with 40-60% accuracy when applied with our recent protocols to crystal structures or comparative models constructed from templates with as low as 30% sequence identity (Fan et al., 2009; Horst et al., 2011). In our experience, a week’s worth of effort is sufficient to model, dock, and select compounds for one protein. Thereafter, virtual hits must be tested at the bench. It is expected that application to the modelable and druggable proteins identified here will lead to in vitro hits for at least some of these proteins. Focusing on proteins that are at least moderately unique to S. mutans (rare, Fig. 1a) will add specificity over other dental plaque bacteria, facilitating a shift in the microbial ecology. Selecting compounds that are predicted to target multiple proteins has been successful in other disease models (Jenwitheesuk et al., 2008). Elevating the search for specific multispecies inhibition would make dental caries a useful study model for other biofilm-mediated diseases, such as periodontitis, ulcers, enteritis, and gluten sensitivity.
A supplemental appendix to this article is published electronically only at http://adr.sagepub.com/supplemental.
This work was supported by NIH grants DP1-OD006779, U54-GM094662, and P01-GM71790.
The authors declare no potential conflicts of interest with respect to the authorship and/or publication of this article.