Campylobacters are a major global health burden and a cause of food-borne diarrheal illness and economic loss worldwide. In developing countries, Campylobacter infections are frequent in children under age two and may be associated with mortality. In developed countries, they are a common cause of bacterial diarrhea in early adulthood. In the United States, antibiotic resistance in Campylobacter increased notably from 13% in 1997 to nearly 25% in 2011. Novel drug targets are urgently needed, but identifying them remains a daunting task. We suggest that omics-guided drug discovery is timely and worth considering in this context. The present study employed an integrated subtractive genomics and comparative metabolic pathway analysis approach. We identified 16 pathways unique to Campylobacter when compared against H. sapiens, comprising 326 non-redundant proteins; 115 of these were found to be essential in the Database of Essential Genes. Sixty-six of these proteins were non-homologous to the human proteome. Six membrane proteins, of which four are transporters, are proposed as potential vaccine candidates. Screening of the 66 essential non-homologous proteins against DrugBank identified 34 proteins with drug-ability potential, many of which play critical roles in bacterial growth and survival. For eight of these proteins, approved drugs were available in DrugBank; the majority serve crucial roles in cell wall synthesis and energy metabolism and therefore have the potential to be utilized as drug targets. We conclude by underscoring that screening inhibitors against these proteins may aid the future discovery of novel therapeutics against campylobacteriosis that are pathogen specific and thus have minimal toxic effects on the host. Omics-guided drug discovery and bioinformatics analyses offer broad potential for advances in novel therapeutics relevant to global health.
Campylobacters are a major cause of food-borne diarrheal illness and result in high morbidity, mortality, and economic loss in every region of the world (WHO, 2011). In developing countries, Campylobacter infections are frequent in children under age 2, sometimes leading to death. In industrialized nations, they are the most frequently identified cause of bacterial diarrhea in early adulthood (CDC, 2000). According to a report released by the Centers for Disease Control and Prevention (CDC), there are 1.3 million cases of campylobacteriosis, and antibiotic resistance in Campylobacter in the United States escalated rapidly from 13% in 1997 to almost 25% in 2011 (CDC, 2013). A growing body of literature has documented that resistance to antibiotics such as quinolones, macrolides, tetracyclines, chloramphenicol, cephalosporins, and aminoglycosides is increasing rapidly in most parts of the world due to common and indiscriminate use of these agents (Akhtar, 1988; Engberg et al., 2001; Hoge et al., 1998; Reina et al., 1994). Campylobacters are highly important from a socioeconomic perspective, which strongly indicates a need for novel therapeutic targets with a high potential to improve quality of life and survival rates.
Since the publication in 1995 of the genome sequences of the pathogenic bacteria Haemophilus influenzae (Fleischmann et al., 1995) and Mycoplasma genitalium (Fraser et al., 1995), the number of completed genome sequences for various microbial species has increased rapidly. In the post-genomic era, these data have given researchers the opportunity to exploit them for the identification of novel therapeutic targets and have opened up new avenues for genome-wide application of comparative and subtractive genomics approaches for therapeutic intervention.
The subtractive genomics approach has been used by many researchers (Chawley et al., 2014; Ghosh et al., 2014; Samal et al., 2015; Sarangi et al., 2009) in search of novel drug targets for various microbes such as Vibrio cholerae, Staphylococcus aureus, Salmonella typhimurium, and Neisseria meningitidis, respectively. Genome sequences of several Campylobacter species have been published, including Campylobacter concisus, C. curvus, C. fetus, C. hominis, and six strains of C. jejuni (http://gcid.jcvi.org/projects/msc/campylobacter/). In this study we report a subtractive genomics approach integrated with comparative metabolic pathway analysis, aimed at identifying novel therapeutic target proteins of the C. jejuni pathogenic strain NCTC11168.
Various databases and tools as described in the workflow (Fig. 1) were utilized for the identification of putative therapeutic targets against C. jejuni, integrating a subtractive genomics approach with genome-wide comparative pathway analysis.
The Kyoto Encyclopedia of Genes and Genomes (KEGG) database (Kanehisa and Goto, 2000), the most comprehensive resource of pathway information, was used for comparative genome-wide pathway analysis of C. jejuni and Homo sapiens. The pathways were compared manually to identify those unique to C. jejuni as per KEGG database annotations. Protein sequences for the enzymes involved in the unique pathways were retrieved from the UniProt protein database.
Essential non-homologous proteins from the pathogen proteome were selected by a two-step comparison. In the first step, a subtractive genomics approach was applied to pathogen proteins from the unique pathways. A BLAST search (Altschul et al., 1990) based on essentiality criteria was performed against the Database of Essential Genes (DEG version 10.9) (Zhang et al., 2004), which hosts records of essential genomic elements critical for an organism's survival, such as protein-coding genes and non-coding RNAs, among bacteria, archaea, and eukaryotes. An expectation value (e-value) threshold of 0.005 was used as the filtering criterion for BLAST hits, with C. jejuni and Helicobacter pylori as the background organisms against which the similarity search for essential genes was performed. In the second step, essential pathogen proteins were further screened on the basis of homology with the host proteome at an e-value cutoff of 0.05. Proteins that had no BLASTP hits below the e-value inclusion threshold were selected as essential non-homologous proteins from C. jejuni.
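The two-step selection described above amounts to two set filters over BLAST hits. The following Python sketch illustrates that logic on toy data. It does not run BLAST and is not the authors' pipeline: the function names, the hit tuples, and the protein identifiers are hypothetical, and the default cutoffs are simply the e-values quoted in the text.

```python
from typing import Iterable, Set, Tuple

# One BLAST hit: (query_id, subject_id, e_value), e.g. parsed from tabular output.
Hit = Tuple[str, str, float]


def queries_with_hits(hits: Iterable[Hit], evalue_cutoff: float) -> Set[str]:
    """Return the query IDs that have at least one hit below the e-value cutoff."""
    return {query for query, _subject, evalue in hits if evalue < evalue_cutoff}


def essential_non_homologous(
    queries: Iterable[str],
    deg_hits: Iterable[Hit],
    human_hits: Iterable[Hit],
    deg_cutoff: float = 0.005,   # cutoff quoted for the essentiality search
    human_cutoff: float = 0.05,  # cutoff quoted for the host homology search
) -> Set[str]:
    """Keep queries that hit an essential gene and have no qualifying human hit."""
    essential = set(queries) & queries_with_hits(deg_hits, deg_cutoff)
    homologous = queries_with_hits(human_hits, human_cutoff)
    return essential - homologous


# Toy example with hypothetical identifiers.
queries = ["p1", "p2", "p3"]
deg_hits = [("p1", "DEG_a", 1e-30), ("p2", "DEG_b", 1e-10), ("p3", "DEG_c", 0.5)]
human_hits = [("p2", "HS_x", 1e-20), ("p1", "HS_y", 3.0)]
selected = essential_non_homologous(queries, deg_hits, human_hits)
```

In this toy run only `p1` is kept: `p2` has a significant human hit and `p3` has no significant hit against the essential-gene set.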
The molecular and structural properties of essential non-homologous pathogen proteins were calculated to aid prioritization of the drug targets most likely to lead to effective treatments. Subcellular localization of the identified proteins was predicted by PSORTb (Yu et al., 2010), which uses a Bayesian network model to calculate an associated probability value for five major localization sites, viz. cytoplasmic, inner membrane, periplasmic, outer membrane, and extracellular, with a score cutoff of 7.5. The TMHMM server, a Hidden Markov Model (HMM)-based tool for prediction of alpha helices in membrane proteins, was used for transmembrane predictions (Krogh et al., 2001).
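As a rough illustration of how a cutoff such as the 7.5 criterion above can be applied to per-site localization scores, the sketch below reports the top-scoring site only when it reaches the cutoff. This is a simplified assumption about how such a rule works, not PSORTb's actual code, and the scores shown are invented.

```python
from typing import Dict


def assign_localization(site_scores: Dict[str, float], cutoff: float = 7.5) -> str:
    """Return the top-scoring site if its score reaches the cutoff, else 'Unknown'."""
    best_site = max(site_scores, key=lambda site: site_scores[site])
    return best_site if site_scores[best_site] >= cutoff else "Unknown"


# Invented scores for two hypothetical proteins.
confident = {"Cytoplasmic": 9.97, "InnerMembrane": 0.01, "Periplasmic": 0.01,
             "OuterMembrane": 0.00, "Extracellular": 0.01}
ambiguous = {"Cytoplasmic": 4.0, "InnerMembrane": 3.5, "Periplasmic": 2.0,
             "OuterMembrane": 0.3, "Extracellular": 0.2}

confident_site = assign_localization(confident)
ambiguous_site = assign_localization(ambiguous)
```

Under this rule the first protein is called cytoplasmic and the second is left unassigned, which keeps low-confidence calls out of the prioritization.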
A search was performed to identify the proteins for which experimentally or computationally solved structures are available in PDB (Berman et al., 2000), ModBase (Pieper et al., 2006), or Protein Model Portal (Haas et al., 2013). The identified proteins were also searched in UniProt to retrieve their molecular weight and protein existence information. The protein existence annotation indicates the type of evidence supporting a protein: experimental characterization at the protein and/or transcript level, homology inference, or uncertainty.
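The structure search above sorts each protein into one of three groups: experimentally solved, computationally modeled, or no structure. A minimal sketch of that bookkeeping is given below; the boolean flags stand in for the results of manual database lookups, and the protein names are placeholders rather than entries from the study.

```python
from collections import Counter
from typing import Dict, Tuple


def structure_status(in_pdb: bool, in_model_db: bool) -> str:
    """Classify structure availability, preferring experimental evidence."""
    if in_pdb:
        return "experimental"
    if in_model_db:
        return "model"
    return "none"


# Placeholder lookups: name -> (found in PDB, found in a model database).
lookups: Dict[str, Tuple[bool, bool]] = {
    "protein_1": (True, True),
    "protein_2": (False, True),
    "protein_3": (False, False),
    "protein_4": (False, True),
}

status_by_protein = {name: structure_status(*flags) for name, flags in lookups.items()}
tally = Counter(status_by_protein.values())
```

A protein found in both PDB and a model database is counted once, as experimental, so the three groups partition the protein set.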
The VaxiJen server (Doytchinova and Flower, 2007) was employed to check the antigenicity of the membrane-localized proteins with a threshold of 0.6. The antigenic sequences were further screened for their ability to bind to MHC Class I molecules using ProPred-I (Singh and Raghava, 2003). ProPred-I implements proteasomal processing with matrices for 47 MHC Class I alleles to identify the regions in the antigenic sequences that can act as potential MHC binders.
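The antigenicity screen can be pictured as a filter over annotated proteins: keep those predicted to span the membrane whose antigenicity score reaches the 0.6 threshold. The sketch below assumes the scores and helix counts have already been obtained from the prediction servers; the `Annotation` class, the `>=` comparison, and all values are illustrative assumptions, and the MHC-binder step is not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Annotation:
    name: str
    tm_helices: int      # number of predicted transmembrane helices
    antigenicity: float  # antigenicity score from a predictor


def vaccine_candidates(
    proteins: List[Annotation], threshold: float = 0.6, min_helices: int = 1
) -> List[str]:
    """Return names of membrane-spanning proteins at or above the antigenicity threshold."""
    return [
        p.name
        for p in proteins
        if p.tm_helices >= min_helices and p.antigenicity >= threshold
    ]


# Invented annotations for three hypothetical proteins.
annotations = [
    Annotation("protein_A", tm_helices=6, antigenicity=0.72),
    Annotation("protein_B", tm_helices=0, antigenicity=0.81),  # not membrane-spanning
    Annotation("protein_C", tm_helices=2, antigenicity=0.41),  # below threshold
]
candidates = vaccine_candidates(annotations)
```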
To evaluate the drug-ability potential of each identified therapeutic target, that is, the ability of a biological target to bind known drugs with high affinity, we subjected each protein to a BLASTP search against DrugBank with an e-value of 0.01. DrugBank is a unique comprehensive resource that integrates drug data with drug target information at sequence, structure, and pathway levels (Wishart et al., 2008). DrugBank version 4.2 currently has information about 7737 drug entries that include 1585 FDA-approved small molecule drugs, 158 FDA-approved biotech (protein/peptide) drugs, 89 nutraceuticals, and over 6000 experimental drugs, along with 4281 non-redundant protein sequences (drug target/enzyme/transporter/carrier) linked to these drugs.
The KEGG database, initiated in 1995, is a computational representation of biological systems that integrates genetic information on genes and proteins with chemical and systemic information on molecular interaction and reaction networks. It links genetic building block information with higher order functional information. Currently, the KEGG database houses 87 different pathways for the C. jejuni 11168 strain and 292 pathways for H. sapiens. As described in the workflow (Fig. 1), a manual comparison of host and pathogen pathways identified 16 pathways unique to the pathogen (Table 1), with the 71 remaining pathways being shared by both humans and Campylobacters. The proteins involved in the unique pathways were then identified.
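The pathway comparison above was done manually, but the underlying operation is a set difference on pathway numbers once the organism prefix is removed. The sketch below shows that idea on a handful of illustrative identifiers; it assumes KEGG-style IDs made of an organism code followed by a five-digit pathway number, and the short lists are examples only, not the full pathway sets used in the study.

```python
from typing import Iterable, List


def pathway_number(pathway_id: str) -> str:
    """Strip the organism prefix, e.g. 'cje00550' -> '00550'."""
    return pathway_id[-5:]


def unique_to_pathogen(pathogen_ids: Iterable[str], host_ids: Iterable[str]) -> List[str]:
    """Return pathogen pathways whose pathway number is absent from the host list."""
    host_numbers = {pathway_number(p) for p in host_ids}
    return [p for p in pathogen_ids if pathway_number(p) not in host_numbers]


# Illustrative subsets only.
pathogen_pathways = ["cje00010", "cje00550", "cje02040"]
host_pathways = ["hsa00010", "hsa04110"]
unique = unique_to_pathogen(pathogen_pathways, host_pathways)

# Consistency check on the counts reported in the text: 87 total, 16 unique, 71 shared.
shared_count = 87 - 16
```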
A total of 446 proteins were identified as being involved in the 16 unique pathways. A few proteins were involved in more than one pathway, so removing redundant sequences left 326 protein sequences from the unique pathways. A BLASTP search of these 326 protein sequences against 551 essential genes from C. jejuni and H. pylori in DEG revealed a total of 115 essential protein sequences with hits below e-value 0.05.
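The drop from 446 to 326 reflects proteins that appear in more than one pathway being counted once. A small sketch of that redundancy removal follows; the pathway names, protein names, and memberships are placeholders chosen for illustration.

```python
from typing import Dict, List, Tuple


def non_redundant(pathway_members: Dict[str, List[str]]) -> Tuple[int, List[str]]:
    """Return (total pathway memberships, sorted list of distinct proteins)."""
    total = sum(len(members) for members in pathway_members.values())
    distinct = sorted({p for members in pathway_members.values() for p in members})
    return total, distinct


# Placeholder memberships: protein_1 and protein_3 each occur in two pathways.
memberships = {
    "pathway_A": ["protein_1", "protein_2", "protein_3"],
    "pathway_B": ["protein_1", "protein_4"],
    "pathway_C": ["protein_3", "protein_5"],
}
total_memberships, distinct_proteins = non_redundant(memberships)
removed = total_memberships - len(distinct_proteins)
```

Applied to the reported figures, the same subtraction gives 446 - 326 = 120 redundant memberships removed.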
These essential protein sequences were further filtered by a homology search against the human proteome to identify non-homologous protein sequences. This comparison detected 66 essential non-homologous proteins with no hits against H. sapiens below e-value 0.05. The comparison was performed to identify proteins unique to the pathogen and so avoid adverse effects on the human host (Butt et al., 2012), since a drug directed at a pathogen protein with a human homolog may also target host enzymes. These 66 essential non-homologous protein sequences have the potential to be further exploited for therapeutic drug design against C. jejuni.
Although all 66 essential non-homologous proteins are potential drug targets, they can be further filtered using additional prioritization parameters. Subcellular localization prediction using the PSORTb program identified 63 proteins to be of cytoplasmic origin, and all the essential non-homologous proteins were found to be <110 kDa. A search for 3D structures identified seven proteins for which no structure was available, while experimentally solved structures were available for six proteins. The remaining 53 proteins had computationally solved structures available in either ModBase or Protein Model Portal. The TMHMM server predicted 16 proteins with one or more helices traversing the membrane, six of which were found to be antigenic above the specified threshold. The ProPred-I server predicted MHC binder regions in all six of these proteins (murF, frdC, ccoP, secD, Cj1094c, and tatC) for different MHC class I alleles. These proteins represent potential vaccine candidates, as most of them are transporters and surface-exposed proteins. All these results are presented in Table 2.
To examine the drug-ability of each essential non-homologous protein, we ran a BLASTP search against all the drug-targeted proteins in the DrugBank database. This identified 34 C. jejuni proteins that share high similarity with protein targets of drugs in DrugBank. Nine of these 34 proteins were similar to targets of FDA-approved drugs or nutraceuticals, while the remaining 25 matched targets of drugs still under experimentation. Table 3 summarizes, for each of the 34 C. jejuni target proteins, the matched DrugBank target proteins and their associated drugs.
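One way to picture the split into approved and experimental groups is to label each query protein by the best drug status among its qualifying DrugBank hits. The sketch below assumes hits and target annotations have already been extracted; the function name, identifiers, and status labels are hypothetical, and the 0.01 default is the e-value quoted in the methods.

```python
from typing import Dict, Iterable, Tuple

Hit = Tuple[str, str, float]  # (query_id, drugbank_target_id, e_value)


def classify_druggable(
    hits: Iterable[Hit],
    target_status: Dict[str, str],
    evalue_cutoff: float = 0.01,
) -> Dict[str, str]:
    """Label each query with a qualifying hit as 'approved' or 'experimental'.

    A query is 'approved' if any qualifying hit is a target of an approved drug.
    Targets missing from target_status are treated as 'experimental'.
    """
    labels: Dict[str, str] = {}
    for query, target, evalue in hits:
        if evalue >= evalue_cutoff:
            continue  # hit is not significant enough
        status = target_status.get(target, "experimental")
        if status == "approved" or query not in labels:
            labels[query] = status
    return labels


# Toy data with hypothetical identifiers.
toy_status = {"T1": "approved", "T2": "experimental"}
toy_hits = [
    ("q1", "T1", 1e-40),
    ("q1", "T2", 1e-12),
    ("q2", "T2", 1e-8),
    ("q3", "T1", 0.5),  # above the cutoff, ignored
]
labels = classify_druggable(toy_hits, toy_status)
```

Queries with no qualifying hit, such as `q3` here, are simply absent from the result, mirroring proteins that drop out at the DrugBank step.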
The advent of a huge array of omics technologies has revolutionized the identification of drug targets (Russell et al., 2013). Examples include proteome-scale comparative modeling of Corynebacterium pseudotuberculosis, which is responsible for ulcerative lymphangitis and mastitis in ruminants (Hassan et al., 2014); RNA-seq profiling-based target identification for non-small cell lung cancer (Riccardo et al., 2014); identification of markers for early prediction of preeclampsia using metabolic profiling (Kuc et al., 2014); a computational systems biology approach for drug prioritization in Clostridium botulinum (Muhammad et al., 2014); and the use of protein–protein interaction networks for drug target identification (Li et al., 2015).
The subtractive genomics approach is very efficient and has greatly accelerated the identification of relevant drug targets against many pathogens such as M. tuberculosis and Bacillus anthracis (Hosen et al., 2014; Rahman et al., 2014). Large-scale genomic projects such as 1000 Genomes and ENCODE are a rich source of background information and provide a deeper understanding of genomes in relation to diseases. These computational approaches, together with the availability of genomic sequences, made it feasible for us to perform subtractive genomics and comparative pathway analysis aimed at identifying putative therapeutic drug targets and potential vaccine candidate proteins of C. jejuni.
Selecting drug targets using subtractive genomics approaches essentially relies on looking for proteins that are absent in the host but present in the pathogen, as this minimizes adverse effects on host biology (Butt et al., 2012). Gene essentiality is also thought to be an important criterion for identification of therapeutic drug targets (Agüero et al., 2008). However, this approach has limitations: it can miss targets, such as hypoxanthine phosphoribosyl transferase, which was not identified as essential in Plasmodium falciparum (a false negative) (Winzeler et al., 1999), and it sometimes yields false positives, as in the case of dihydrofolate reductase in Leishmania major (Titus et al., 1995).
Gene essentiality prediction via experimental methods such as single gene knockouts, RNA interference, and conditional knockouts is labor-intensive, expensive, and time-consuming (Butt et al., 2012). Furthermore, only a few infectious agents are amenable to experimental approaches to gene essentiality, as the tools for identification of drug or vaccine targets are often limited or absent for many pathogens. In such scenarios, computational methods for gene essentiality prediction help bridge the gap between the amount of data generated from sequencing projects and whole genome approaches for prediction of essential genes (Agüero et al., 2008; Doyle et al., 2010; Volker and Brown, 2002).
A previous study also identified drug targets in C. jejuni by utilizing the CAI (Codon Adaptation Index) criterion as a measure of gene essentiality (Tilton et al., 2014). Essential genes are highly conserved, highly expressed, and preferentially positioned in the leading strand. However, high gene expression rates do not correlate significantly with gene strand bias, and nonessential genes also show high expression rates (Perrière and Thioulouse, 2002; Rocha and Danchin, 2003), in which case a definition of essentiality based on high expression could be erroneous. In our study, we have utilized a subtractive genomics approach to predict genes essential to C. jejuni through a homology search against experimentally determined essentiality data in DEG from C. jejuni and H. pylori, both of which belong to the epsilon class of proteobacteria.
Significant advancements in genome sequencing and bioinformatics, coupled with experimental data, have shown that structural and molecular properties of proteins, such as molecular weight, subcellular localization, transmembrane prediction, and availability of 3D structure, can aid in prioritization of drug targets (Agüero et al., 2008). They maximize the likelihood of arriving at the best therapeutic target against a pathogen, considerably reducing the time and resources needed to develop such an agent.
Molecular mass prioritization showed that all the proteins have a weight of <110 kDa, suggesting that experimental verification of the identified targets is possible. Smaller proteins are easier to purify, and localization information can yield insights into protein function (Duffield et al., 2010; Yu et al., 2010). Only a few experimentally solved structures were available in PDB, which points to a gap in the structural characterization of pathogen proteins, even though the first whole genome of C. jejuni was published in 2000 (Parkhill et al., 2000). Protein structural information can be used to significant advantage in drug identification and validation, greatly reducing the cost of high throughput experimental assays (Grant, 2009). Sixteen proteins predicted as transmembrane represent potential vaccine candidate proteins, which were further filtered based on antigenicity and MHC-binding criteria.
The likelihood of developing a drug-like compound to modulate the target is an important consideration in drug design and an important determinant of whether a non-homologous protein is a viable therapeutic target. Proteins for which drugs are already available can be useful starting points for drug discovery. The distribution of the essential non-homologous proteins across pathways was checked before and after the similarity search against DrugBank (Fig. 2). The number of proteins fell by ~49% and the number of pathways by 20% after DrugBank analysis, which shifted the pathway priority. Before DrugBank analysis, the pathways with the most drug-able targets were biosynthesis of secondary metabolites, two-component system, microbial metabolism in diverse environments, peptidoglycan biosynthesis, bacterial secretion system, flagellar assembly, and lipopolysaccharide biosynthesis. After DrugBank analysis, the major pathways of drug-able targets were biosynthesis of secondary metabolites, peptidoglycan biosynthesis, and two-component system.
Finally, we narrowed down the search to a final set of 14 prioritized drug targets and vaccine candidates. The 5′-methylthioadenosine/S-adenosylhomocysteine nucleosidase (pfs) enzyme catalyzes the direct conversion of aminodeoxyfutalosine (AFL) into dehypoxanthine futalosine (DHFL) and adenine via hydrolysis of the N-glycosidic bond, which represents an essential step in the menaquinone biosynthesis pathway (Li et al., 2011). This enzyme is an attractive drug target as it is involved in many pathways, such as ubiquinone and other terpenoid-quinone biosynthesis, cysteine and methionine metabolism, and biosynthesis of amino acids.
Alanine racemase (alr) catalyzes the pyridoxal 5′-phosphate-dependent interconversion of L-alanine and D-alanine. D-alanyl-alanine synthetase A (ddl) is involved in cell wall formation: it joins two of the resulting D-alanine molecules in an ATP-dependent reaction to form the D-alanyl-D-alanine dipeptide. Inhibition of these two enzymes leads to effective inhibition of peptidoglycan synthesis; ddl has been proposed to be an attractive drug target in Mycobacterium tuberculosis (Prosser and de Carvalho, 2013). Vancomycin binds to the D-alanyl-D-alanine terminus of the peptidoglycan precursor, the product of the ddl reaction, forming a stable complex under normal conditions and inhibiting cell wall synthesis (Howden et al., 2010), ultimately leading to cell lysis. Thus ddl plays an important role in the vancomycin resistance pathway.
Penicillin Binding Proteins (PBPs) are of special interest, as they are the target sites for beta-lactam antibiotics. They also play an important role in cell wall formation. pbpA is important for cell division and essential for growth (Wada and Watanabe, 1998). pbpB is also critical for bacterial growth and cell wall biosynthesis (Pinho et al., 2001). pbpC is a major protein of a cell division complex. PBPs have already been utilized as a model drug target system (von Rechenberg et al., 2005), and they are highly similar to the binding partners of many FDA-approved and experimental drugs. Hence PBPs can be considered to have high potential for experimental validation as drug targets.
UDP-N-acetylglucosamine 1-carboxyvinyltransferase (murA) and UDP-N-acetylenolpyruvoylglucosamine reductase (murB) both catalyze important reactions in peptidoglycan precursor synthesis. murA catalyzes the transfer of an enolpyruvate residue from phosphoenolpyruvate (PEP) to position 3 of UDP-N-acetylglucosamine, followed by a murB-catalyzed, NADPH-dependent reduction of the UDP-N-acetylglucosamine enolpyruvate to UDP-N-acetylmuramic acid. The majority of antibiotics in clinical use today target later steps of peptidoglycan synthesis (El Zoeiby et al., 2003). murB is essential in Escherichia coli (Pucci et al., 1992). The murA to murF genes are all essential and highly conserved among bacterial species, and thus hold great promise as future therapeutic drug targets.
Both frdC (fumarate reductase cytochrome b subunit) and ccoP are important constituents of the oxidative phosphorylation pathway, which forms ATP, often called a molecular unit of energy transfer. While frdC couples the reduction of fumarate to succinate with the oxidation of quinol to quinone, ccoP (cbb3-type cytochrome c oxidase subunit) is required for the transfer of electrons from the donor cytochrome c via its heme groups to the CcoO subunit. secD (protein translocase subunit SecD) is part of the Sec protein translocase complex, tatC (Sec-independent protein translocase protein TatC) is an important part of the twin-arginine translocation (Tat) system, and Cj1094c (putative preprotein translocase protein) transports large proteins across membranes. The Tat pathway is important to bacterial growth and virulence (Ding and Christie, 2003; Lavander et al., 2007). secD and Cj1094c help in secretion across the inner membrane via the preprotein translocase pathway. Transport proteins are associated with pathogenesis and virulence and have been identified as potential vaccine candidates in several previous studies as well (Garmory and Titball, 2004; Harris et al., 2011).
Peptidoglycan is an important component of the bacterial cell wall, responsible for maintaining a definite cell shape and primarily conferring mechanical resistance to higher osmotic pressure (Vollmer et al., 2008). Interference with peptidoglycan biosynthesis can therefore result in cell lysis. Peptidoglycan biosynthesis, the pathway with the largest share of the final identified drug targets, can be exploited for therapeutic drug targets owing to its multiple target enzymes, whose inhibition could disrupt the cell wall and in turn attenuate bacterial cell growth.
In summary, the computational subtractive genomics approach integrated with comparative pathway analysis significantly reduced the number of protein targets at each step (Fig. 3). We were thus able to identify several essential proteins that are critical for bacterial growth and survival and are expected to carry minimal toxicity to the host. Further in vitro and in vivo studies are warranted to validate these findings for effective drug design against Campylobacter infections.
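The step-by-step reduction can be checked directly from the protein counts reported in the text. The short sketch below computes the percentage of proteins removed at each stage; the stage labels are paraphrased and the helper function is purely illustrative.

```python
from typing import List, Tuple

# Protein counts reported at each stage of the workflow.
stages: List[Tuple[str, int]] = [
    ("proteins in unique pathways", 446),
    ("non-redundant proteins", 326),
    ("essential proteins", 115),
    ("essential non-homologous proteins", 66),
    ("drug-able proteins", 34),
]


def stepwise_reduction(counts: List[Tuple[str, int]]) -> List[Tuple[str, float]]:
    """Return (stage name, percent of proteins removed relative to the previous stage)."""
    reductions = []
    for (_prev_name, before), (name, after) in zip(counts, counts[1:]):
        reductions.append((name, 100.0 * (before - after) / before))
    return reductions


reductions = stepwise_reduction(stages)
for name, percent in reductions:
    print(f"{name}: {percent:.1f}% fewer than the previous stage")
```

The final step, from 66 to 34 proteins, works out to roughly 48.5%, in line with the ~49% reduction reported for the DrugBank screen.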
We have performed subtractive genomics analyses of the C. jejuni pathogenic strain NCTC11168 and have identified several proteins in the genome that may prove to be potential targets for effective drug design. As many of the identified drug targets are already known to play critical roles in the metabolic pathways that regulate bacterial growth and survival, a systematic approach to developing antibiotics against them would likely be very promising for the treatment of Campylobacter infections. Information about these targets can also lead to significant progress in testing the efficacy of existing drugs, which is as important as the development of new drugs. Drugs developed against these targets are expected to be pathogen specific, with minimal toxic effects on the host. Omics-guided drug discovery and bioinformatics analyses offer broad potential for advances in novel therapeutics relevant to global health (Cuadrat et al., 2014; Preidis and Hotez, 2015).
This research was supported by Fast Track Young Scientist Fellowship from DST (Department of Science and Technology), Ministry of Science and Technology, India under grant number SB/FT/LS-278/2012.
The authors declare that there are no conflicting financial interests.