Microsporidia, parasitic fungi-related eukaryotes infecting many cell types in a wide range of animals (including humans), represent a serious health threat in immunocompromised patients. The 2.9 Mb genome of the microsporidium Encephalitozoon cuniculi is the smallest known of any eukaryote. Eukaryotic protein kinases are a large superfamily of enzymes with crucial roles in most cellular processes, and therefore represent potential drug targets. We report here an exhaustive analysis of the E. cuniculi genomic database aimed at identifying and classifying all protein kinases of this organism with reference to the kinomes of two highly-divergent yeast species, Saccharomyces cerevisiae and Schizosaccharomyces pombe.
A database search with a multi-level protein kinase family hidden Markov model library led to the identification of 29 conventional protein kinase sequences in the E. cuniculi genome, as well as 3 genes encoding atypical protein kinases. The microsporidian kinome presents striking differences from those of other eukaryotes, and this minimal kinome underscores the importance of conserved protein kinases involved in essential cellular processes. About 30% of its kinases are predicted to regulate cell cycle progression, while another ~28% have no identifiable homologues in model eukaryotes and are likely to reflect parasitic adaptations. E. cuniculi lacks MAP kinase cascades and almost all protein kinases that are involved in stress responses, ion homeostasis and nutrient signalling in the model fungi S. cerevisiae and S. pombe, including AMP-activated protein kinase (Snf1), previously thought to be ubiquitous in eukaryotes. A detailed database search and phylogenetic analysis of the kinomes of the two model fungi showed that the degree of homology between their kinomes, ~85%, is much higher than previously reported.
The E. cuniculi kinome is by far the smallest eukaryotic kinome characterised to date. The difficulty in assigning clear homology relationships for nine of the twenty-nine microsporidian conventional protein kinases, despite the compact genome, reflects the phylogenetic distance between microsporidia and other eukaryotes. Indeed, the E. cuniculi genome presents a high proportion of genes in which evolution has been accelerated by up to four-fold. There are no orthologues of the protein kinases that constitute MAP kinase pathways, and many other protein kinases with roles in nutrient signalling are absent from the E. cuniculi kinome. However, orthologous kinases can nonetheless be identified that correspond to members of the yeast kinomes with roles in some of the most fundamental cellular processes. For example, E. cuniculi has clear orthologues of virtually all the major conserved protein kinases that regulate the core cell cycle machinery (Aurora, Polo, DDK, CDK and Chk1). A comprehensive comparison of the homology relationships between the budding and fission yeast kinomes indicates that, despite an estimated 800 million years of independent evolution, the two model fungi share ~85% of their protein kinases. This will facilitate the annotation of many of the as yet uncharacterised fission yeast kinases, and also those of novel fungal genomes.
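The multi-level hidden Markov model search mentioned above can be pictured as a top-down walk through a hierarchy of models (group, then family, then subfamily). The following is only a minimal sketch of that idea, not the authors' pipeline: the `classify` function, the score threshold, and all scores in the usage example are hypothetical, and real scores would come from an HMM search tool.

```python
from typing import Dict, Tuple

Path = Tuple[str, ...]


def classify(hits: Dict[Path, float], min_score: float = 20.0) -> Path:
    """Walk down a model hierarchy, keeping the best-scoring child at each level.

    `hits` maps a classification path, e.g. ("CMGC",) or ("CMGC", "CDK"),
    to the score of the corresponding model (hypothetical values).
    Returns the deepest path reached; an empty tuple means "unclassified".
    """
    path: Path = ()
    while True:
        # Candidate children: one level deeper, same prefix, score above threshold.
        children = {
            p: s
            for p, s in hits.items()
            if len(p) == len(path) + 1 and p[: len(path)] == path and s >= min_score
        }
        if not children:
            return path
        path = max(children, key=children.get)


# Hypothetical scores for one query sequence.
example_hits = {
    ("CMGC",): 150.0,
    ("AGC",): 60.0,
    ("CMGC", "CDK"): 210.0,
    ("CMGC", "MAPK"): 90.0,
    ("AGC", "PKA"): 300.0,  # ignored: its parent group lost at the first level
}
result = classify(example_hits)
```

In this sketch a family model is only consulted once its parent group has won, so a high-scoring family under a losing group is ignored. Whether that is desirable is a design choice; a flat "best model wins" rule would be an equally plausible alternative.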
Oomycetes are a large group of economically and ecologically important species. Their most notorious member is Phytophthora infestans, the cause of the devastating potato late blight disease. The life cycle of P. infestans involves hyphae which differentiate into spores used for dispersal and host infection. Protein phosphorylation likely plays crucial roles in these stages, and to help understand this we present here a genome-wide analysis of the protein kinases of P. infestans and several relatives. The study also provides new insight into kinase evolution since oomycetes are taxonomically distant from organisms with well-characterized kinomes.
Bioinformatic searches of the genomes of P. infestans, P. ramorum, and P. sojae reveal they have similar kinomes, which for P. infestans contains 354 eukaryotic protein kinases (ePKs) and 18 atypical kinases (aPKs), equaling 2% of total genes. After refining gene models, most were classifiable into families seen in other eukaryotes. Some ePK families are nevertheless unusual, especially the tyrosine kinase-like (TKL) group which includes large oomycete-specific subfamilies. Also identified were two tyrosine kinases, which are rare in non-metazoans. Several ePKs bear accessory domains not identified previously on kinases, such as cyclin-dependent kinases with integral cyclin domains. Most ePKs lack accessory domains, implying that many are regulated transcriptionally. This was confirmed by mRNA expression-profiling studies that showed that two-thirds vary significantly between hyphae, sporangia, and zoospores. Comparisons to neighboring taxa (apicomplexans, ciliates, diatoms) revealed both clade-specific and conserved features, and multiple connections to plant kinases were observed. The kinome of Hyaloperonospora arabidopsidis, an oomycete with a simpler life cycle than P. infestans, was found to be one-third smaller. Some differences may be attributable to gene clustering, which facilitates subfamily expansion (or loss) through unequal crossing-over.
The large sizes of the Phytophthora kinomes imply that phosphorylation plays major roles in their life cycles. Their kinomes also include many novel ePKs, some specific to oomycetes or shared with neighboring groups. Little experimentation to date has addressed the biological functions of oomycete kinases, but this should be stimulated by the structural, evolutionary, and expression data presented here. This may lead to targets for disease control.
Protein phosphorylation is responsible for a large portion of the regulatory functions of eukaryotic cells. Although the list of sequenced genomes of filamentous fungi has grown rapidly, the kinomes of recently sequenced species have not yet been studied in detail. The objective of this study is to apply a comparative analysis of the kinase distribution in different fungal phyla, and to explore its relevance to understanding the evolution of fungi and their taxonomic classification. We have analyzed in detail 12 subgroups of kinases and their distribution over 30 species, as well as their potential use as a classifier for members of the fungal kingdom.
Our findings show that despite the similarity of the kinase distribution in all fungi, their domain distributions and kinome density can potentially be used to classify them and give insight into their evolutionary origin. In general, we found that the overall representation of kinase groups is similar across fungal genomes, the only exception being a large number of tyrosine kinase-like (TKL) kinases predicted in Laccaria bicolor. This unexpected finding underscores the need to continue to sequence fungal genomes, since many species or lineage-specific properties may remain to be discovered. Furthermore, we found that the domain organization significantly varies between the fungal species. Our results suggest that protein kinases and their functional domains strongly reflect fungal taxonomy.
Comparison of the predicted kinomes of sequenced fungi suggests essential signaling functions common to all species, but also specific adaptations of the signal transduction networks to particular species.
Motivation: Kinases of the eukaryotic protein kinase superfamily are key regulators of most aspects of eukaryotic cellular behavior and have provided several drug targets, including kinases dysregulated in cancers. The rapid increase in the number of genomic sequences has created an acute need to identify and classify members of this important class of enzymes efficiently and accurately.
Results: Kinannote produces a draft kinome and comparative analyses for a predicted proteome using a single line command, and it is currently the only tool that automatically classifies protein kinases using the controlled vocabulary of Hanks and Hunter [Hanks and Hunter (1995)]. A hidden Markov model in combination with a position-specific scoring matrix is used by Kinannote to identify kinases, which are subsequently classified using a BLAST comparison with a local version of KinBase, the curated protein kinase dataset from www.kinase.com. Kinannote was tested on the predicted proteomes from four divergent species. The average sensitivity and precision for kinome retrieval from the test species are 94.4% and 96.8%, respectively. The ability of Kinannote to classify identified kinases was also evaluated, and the average sensitivity and precision for full classification of conserved kinases are 71.5% and 82.5%, respectively. Kinannote has had a significant impact on eukaryotic genome annotation, providing protein kinase annotations for 36 genomes made public by the Broad Institute in the period spanning 2009 to the present.
Availability: Kinannote is freely available at http://sourceforge.net/projects/kinannote.
Supplementary data are available at Bioinformatics online.
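The sensitivity and precision figures quoted in the Results above follow the usual retrieval definitions: sensitivity is the fraction of reference kinases that were recovered, and precision is the fraction of predictions that are true kinases. The sketch below illustrates those definitions only; it is not part of Kinannote, and the function name and the identifiers in the example are hypothetical.

```python
from typing import Set, Tuple


def retrieval_metrics(predicted: Set[str], reference: Set[str]) -> Tuple[float, float]:
    """Return (sensitivity, precision) for a predicted kinome against a curated one.

    sensitivity = TP / (TP + FN) = TP / |reference|
    precision   = TP / (TP + FP) = TP / |predicted|
    """
    true_positives = len(predicted & reference)
    sensitivity = true_positives / len(reference) if reference else 0.0
    precision = true_positives / len(predicted) if predicted else 0.0
    return sensitivity, precision


# Hypothetical protein identifiers, for illustration only.
reference_kinome = {"k1", "k2", "k3", "k4"}
predicted_kinome = {"k1", "k2", "k3", "x1"}
sens, prec = retrieval_metrics(predicted_kinome, reference_kinome)
```

Here three of four reference kinases are recovered and three of four predictions are correct, so both values come out at 0.75.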
The protozoan parasite Trypanosoma brucei is the causative agent
of human African sleeping sickness and related animal diseases, and it has over
170 predicted protein kinases. Protein phosphorylation is a key regulatory
mechanism for cellular function that, thus far, has been studied in
T. brucei principally through putative kinase mRNA knockdown
and observation of the resulting phenotype. However, despite the relatively
large kinome of this organism and the demonstrated essentiality of several
T. brucei kinases, very few specific phosphorylation sites
have been determined in this organism. Using a gel-free, phosphopeptide
enrichment-based proteomics approach we performed the first large scale
phosphorylation site analyses for T. brucei. Serine, threonine,
and tyrosine phosphorylation sites were determined for a cytosolic protein
fraction of the bloodstream form of the parasite, resulting in the
identification of 491 phosphoproteins based on the identification of 852 unique
phosphopeptides and 1204 phosphorylation sites. The phosphoproteins detected in
this study are predicted from their genome annotations to participate in a wide
variety of biological processes, including signal transduction, processing of
DNA and RNA, protein synthesis, and degradation and to a minor extent in
metabolic pathways. The analysis of phosphopeptides and phosphorylation sites
was facilitated by in-house developed software, and this automated approach was
validated by manual annotation of spectra of the kinase subset of proteins.
Analysis of the cytosolic bloodstream form T. brucei kinome
revealed the presence of 44 phosphorylated protein kinases in our data set that
could be classified into the major eukaryotic protein kinase groups by applying
a multilevel hidden Markov model library of the kinase catalytic domain.
Identification of the kinase phosphorylation sites showed conserved
phosphorylation sequence motifs in several kinase activation segments,
supporting the view that phosphorylation-based signaling is a general and
fundamental regulatory process that extends to this highly divergent lower eukaryote.
Microsporidia have attracted considerable attention because they infect a wide range of hosts, from invertebrates to vertebrates, and cause serious human diseases and major economic losses in the livestock industry. There are no prospective drugs to counteract this pathogen. Eukaryotic protein kinases (ePKs) play a central role in regulating many essential cellular processes and are therefore potential drug targets. In this study, a comprehensive summary and comparative analysis of the protein kinases in four microsporidia–Enterocytozoon bieneusi, Encephalitozoon cuniculi, Nosema bombycis and Nosema ceranae–was performed. The results show that there are 34 ePKs and 4 atypical protein kinases (aPKs) in E. bieneusi, 29 ePKs and 6 aPKs in E. cuniculi, 41 ePKs and 5 aPKs in N. bombycis, and 27 ePKs and 4 aPKs in N. ceranae. These data support the previous conclusion that the microsporidian kinome is the smallest eukaryotic kinome. Microsporidian kinomes contain only serine-threonine kinases and do not contain receptor-like and tyrosine kinases. Many of the kinases related to nutrient and energy signaling and the stress response have been lost in microsporidian kinomes. However, cell cycle-, development- and growth-related kinases, which are important to parasites, are well conserved. This reduction of the microsporidian kinome is in good agreement with genome compaction, but kinome density is negatively correlated with proteome size. Furthermore, the protein kinases in each microsporidian genome are under strong purifying selection pressure. No remarkable differences in kinase family classification, domain features, gain and/or loss, and selective pressure were observed in these four species. Although microsporidia adapt to different host types, the coevolution of microsporidia and their hosts was not clearly reflected in the protein kinases. 
Overall, this study enriches and updates the microsporidian protein kinase database and may provide valuable information and candidate targets for the design of treatments for pathogenic diseases.
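The kinome sizes reported above can be tallied directly from the ePK and aPK counts, and "kinome density" can be read as kinases per hundred predicted proteins. The sketch below uses only the counts given in the text; the density formula is an assumption about how the term is defined, and proteome sizes are left as a parameter because the text does not give them.

```python
from typing import Dict, Tuple

# (ePKs, aPKs) per species, as reported in the text above.
KINASE_COUNTS: Dict[str, Tuple[int, int]] = {
    "E. bieneusi": (34, 4),
    "E. cuniculi": (29, 6),
    "N. bombycis": (41, 5),
    "N. ceranae": (27, 4),
}


def kinome_density(n_kinases: int, n_proteins: int) -> float:
    """Kinases as a percentage of the predicted proteome (assumed definition)."""
    if n_proteins <= 0:
        raise ValueError("proteome size must be positive")
    return 100.0 * n_kinases / n_proteins


# Total kinome size = ePKs + aPKs.
totals = {species: epk + apk for species, (epk, apk) in KINASE_COUNTS.items()}

for species, total in totals.items():
    print(f"{species}: {total} protein kinases")
```

Under this definition, a smaller proteome with a similar number of kinases yields a higher density, which is one way to read the reported negative correlation between kinome density and proteome size.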
Human protein kinases play fundamental roles mediating the majority of signal transduction pathways in eukaryotic cells as well as a multitude of other processes involved in metabolism, cell-cycle regulation, cellular shape, motility, differentiation and apoptosis. The human protein kinome contains 518 members. Most studies that focus on the human kinome require, at some point, the visualization of large amounts of data. The visualization of such data within the framework of a phylogenetic tree may help identify key relationships between different protein kinases in view of their evolutionary distance and the information used to annotate the kinome tree. For example, studies that focus on the promiscuity of kinase inhibitors can benefit from the annotations to depict binding affinities across kinase groups. Images involving the mapping of information into the kinome tree are common. However, producing such figures manually can be a long arduous process prone to errors. To circumvent this issue, we have developed a web-based tool called Kinome Render (KR) that produces customized annotations on the human kinome tree. KR allows the creation and automatic overlay of customizable text or shape-based annotations of different sizes and colors on the human kinome tree. The web interface can be accessed at: http://bcb.med.usherbrooke.ca/kinomerender. A stand-alone version is also available and can be run locally.
Annotation; Human kinome tree; Protein kinases; Data visualisation
The protein kinases are a large family of enzymes that play fundamental roles in propagating signals within the cell. Because of the high degree of binding site similarity shared among protein kinases, designing drug compounds with high specificity among the kinases has proven difficult. However, computational approaches to comparing the 3-dimensional geometry and physicochemical properties of key binding site residue positions have been shown to be informative of inhibitor selectivity. The Combinatorial Clustering Of Residue Position Subsets (ccorps) method, introduced here, provides a semi-supervised learning approach for identifying structural features that are correlated with a given set of annotation labels. Here, ccorps is applied to the problem of identifying structural features of the kinase ATP binding site that are informative of inhibitor binding. ccorps is demonstrated to make perfect or near-perfect predictions for the binding affinity profile of 8 of the 38 kinase inhibitors studied, while having overall poor predictive ability for only 1 of the 38 compounds. Additionally, ccorps is shown to identify shared structural features across phylogenetically diverse groups of kinases that are correlated with binding affinity for particular inhibitors; such instances of structural similarity among phylogenetically diverse kinases are also shown not to be rare. Finally, these function-specific structural features may serve as potential starting points for the development of highly specific kinase inhibitors.
The kinases are a group of essential signaling proteins within the cell and are the largest family of enzymes encoded by the human genome. The high degree of binding site similarity shared across the protein kinases has made them difficult targets for which to design highly selective inhibitors, but kinome-wide binding site analysis can help predict unintended off-target inhibitions. Given the increasingly large number of available kinase structures, kinome-wide comparative analysis of binding sites is now possible. In this paper, the Combinatorial Clustering Of Residue Position Subsets (ccorps) method is introduced and used to synthesize kinome-wide structure datasets with a kinome-wide inhibitor affinity screening dataset consisting of 38 kinase inhibitors. ccorps identifies structural features of the kinase binding site that are correlated with an inhibitor binding and uses these features to predict if this inhibitor will be capable of binding to uncharacterized kinases. This paper demonstrates the ability of ccorps to accurately predict inhibitor binding and identify features of the kinase binding site that are unique to kinases capable of binding a given inhibitor.
The major human intestinal pathogen Giardia lamblia is a very early branching eukaryote with a minimal genome of broad evolutionary and biological interest.
To explore early kinase evolution and regulation of Giardia biology, we cataloged the kinomes of three sequenced strains. Comparison with published kinomes and those of the excavates Trichomonas vaginalis and Leishmania major shows that Giardia's 80 core kinases constitute the smallest known core kinome of any eukaryote that can be grown in pure culture, reflecting both its early origin and secondary gene loss. Kinase losses in DNA repair, mitochondrial function, transcription, splicing, and stress response reflect this reduced genome, while the presence of other kinases helps define the kinome of the last common eukaryotic ancestor. Immunofluorescence analysis shows abundant phospho-staining in trophozoites, with phosphotyrosine abundant in the nuclei and phosphothreonine and phosphoserine in distinct cytoskeletal organelles. The Nek kinase family has been massively expanded, accounting for 198 of the 278 protein kinases in Giardia. Most Neks are catalytically inactive, have very divergent sequences and undergo extensive duplication and loss between strains. Many Neks are highly induced during development. We localized four catalytically active Neks to distinct parts of the cytoskeleton and one inactive Nek to the cytoplasm.
The reduced kinome of Giardia sheds new light on early kinase evolution, and its highly divergent sequences add to the definition of individual kinase families as well as offering specific drug targets. Giardia's massive Nek expansion may reflect its distinctive lifestyle, biphasic life cycle and complex cytoskeleton.
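The kinase counts quoted above are consistent with a simple split of the Giardia kinome into core kinases and Neks. The following sketch just checks that arithmetic; it assumes the 278 total is the sum of the 80 core kinases and the 198 Neks, which the numbers support but the text does not state outright.

```python
# Counts taken from the text above.
CORE_KINASES = 80
NEK_KINASES = 198
TOTAL_KINASES = 278


def percent(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Assumed decomposition: total = core + Nek.
counts_add_up = CORE_KINASES + NEK_KINASES == TOTAL_KINASES
nek_share = percent(NEK_KINASES, TOTAL_KINASES)

print(f"core + Nek equals total: {counts_add_up}")
print(f"Nek share of the kinome: {nek_share:.1f}%")
```

On these figures the Nek family makes up roughly 71% of all Giardia protein kinases.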
Eukaryotic protein kinases belong to a large superfamily with hundreds to thousands of copies and are components of essentially all cellular functions. The goals of this study are to classify protein kinases from 25 plant species and to assess their evolutionary history in conjunction with consideration of their molecular functions. The protein kinase superfamily has expanded in the flowering plant lineage, in part through recent duplications. As a result, the flowering plant protein kinase repertoire, or kinome, is in general significantly larger than other eukaryotes, ranging in size from 600 to 2500 members. This large variation in kinome size is mainly due to the expansion and contraction of a few families, particularly the receptor-like kinase/Pelle family. A number of protein kinases reside in highly conserved, low copy number families and often play broadly conserved regulatory roles in metabolism and cell division, although functions of plant homologues have often diverged from their metazoan counterparts. Members of expanded plant kinase families often have roles in plant-specific processes and some may have contributed to adaptive evolution. Nonetheless, non-adaptive explanations, such as kinase duplicate subfunctionalization and insufficient time for pseudogenization, may also contribute to the large number of seemingly functional protein kinases in plants.
plant protein kinase; gene family evolution; lineage-specific expansion; comparative genomics
Malaria, caused by the parasitic protist Plasmodium falciparum, represents a major public health problem in the developing world. The P. falciparum genome has been sequenced, which provides new opportunities for the identification of novel drug targets. Eukaryotic protein kinases (ePKs) form a large family of enzymes with crucial roles in most cellular processes; hence malarial ePKs represent potential drug targets. We report an exhaustive analysis of the P. falciparum genomic database (PlasmoDB) aimed at identifying and classifying all ePKs in this organism.
Using a variety of bioinformatics tools, we identified 65 malarial ePK sequences and constructed a phylogenetic tree to position these sequences relative to the seven established ePK groups. Predominant features of the tree were: (i) that several malarial sequences did not cluster within any of the known ePK groups; (ii) that the CMGC group, whose members are usually involved in the control of cell proliferation, had the highest number of malarial ePKs; and (iii) that no malarial ePK clustered with the tyrosine kinase (TyrK) or STE groups, pointing to the absence of three-component MAPK modules in the parasite. A novel family of 20 ePK-related sequences was identified and called FIKK, on the basis of a conserved amino acid motif. The FIKK family seems restricted to Apicomplexa, with 20 members in P. falciparum and just one member in some other Apicomplexan species.
The considerable phylogenetic distance between Apicomplexa and other Eukaryotes is reflected by profound divergences between the kinome of malaria parasites and that of yeast or mammalian cells.
Protein kinases constitute a particularly large protein family in Arabidopsis with important functions in cellular signal transduction networks. At the same time Arabidopsis is a model plant with high frequencies of gene duplications. Here, we have conducted a systematic analysis of the Arabidopsis kinase complement, the kinome, with particular focus on gene duplication events. We matched Arabidopsis proteins to a Hidden-Markov Model of eukaryotic kinases and computed a phylogeny of 942 Arabidopsis protein kinase domains and mapped their origin by gene duplication.
The phylogeny showed two major clades of receptor kinases and soluble kinases, each of which was divided into functional subclades. Based on this phylogeny, association of yet uncharacterized kinases to families was possible which extended functional annotation of unknowns. Classification of gene duplications within these protein kinases revealed that representatives of cytosolic subfamilies showed a tendency to maintain segmentally duplicated genes, while some subfamilies of the receptor kinases were enriched for tandem duplicates. Although functional diversification is observed throughout most subfamilies, some instances of functional conservation among genes transposed from the same ancestor were observed. In general, a significant enrichment of essential genes was found among genes encoding for protein kinases.
The inferred phylogeny allowed classification and annotation of yet uncharacterized kinases. The prediction and analysis of syntenic blocks and duplication events within gene families of interest can be used to link functional biology to insights from an evolutionary viewpoint. The approach undertaken here can be applied to any gene family in any organism with an annotated genome.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-548) contains supplementary material, which is available to authorized users.
Hundreds of millions of people are infected with cryptosporidiosis annually, with immunocompromised individuals suffering debilitating symptoms and children in socioeconomically challenged regions at risk of repeated infections. There is currently no effective drug available. In order to facilitate the pursuit of anti-cryptosporidiosis targets and compounds, our study spans the classification of the Cryptosporidium parvum kinome and the structural and biochemical characterization of representatives from the CDPK family and a MAP kinase.
The C. parvum kinome comprises over 70 members, some of which may be promising drug targets. These C. parvum protein kinases include members in the AGC, Atypical, CaMK, CK1, CMGC, and TKL groups; however, almost 35% could only be classified as OPK (other protein kinases). In addition, about 25% of the kinases identified did not have any known orthologues outside of Cryptosporidium spp. Comparison of specific kinases with their Plasmodium falciparum and Toxoplasma gondii orthologues revealed some distinct characteristics within the C. parvum kinome, including potential targets and opportunities for drug design. Structural and biochemical analysis of 4 representatives of the CaMK group and a MAP kinase confirms features that may be exploited in inhibitor design. Indeed, screening CpCDPK1 against a library of kinase inhibitors yielded a set of the pyrazolopyrimidine derivatives (PP1-derivatives) with IC50 values of < 10 nM. The binding of a PP1-derivative is further described by an inhibitor-bound crystal structure of CpCDPK1. In addition, structural analysis of CpCDPK4 identified an unprecedented Zn-finger within the CDPK kinase domain that may have implications for its regulation.
Identification and comparison of the C. parvum protein kinases against other parasitic kinases shows how orthologue- and family-based research can be used to facilitate characterization of promising drug targets and the search for new drugs.
As in other eukaryotes, protein kinases play major regulatory roles in filamentous fungi. Although the genomes of many plant pathogenic fungi have been sequenced, systematic characterization of their kinomes has not been reported. The wheat scab fungus Fusarium graminearum has 116 protein kinase (PK) genes. Although twenty of them appeared to be essential, we generated deletion mutants for the other 96 PK genes, including 12 orthologs of essential genes in yeast. All of the PK mutants were assayed for changes in 17 phenotypes, including growth, conidiation, pathogenesis, stress responses, and sexual reproduction. Overall, deletion of 64 PK genes resulted in at least one of the phenotypes examined, including three mutants blocked in conidiation and five mutants with increased tolerance to hyperosmotic stress. In total, 42 PK mutants were significantly reduced in virulence or non-pathogenic, including mutants deleted of key components of the cAMP signaling and three MAPK pathways. A number of these PK genes, including Fg03146 and Fg04770 that are unique to filamentous fungi, are dispensable for hyphal growth and likely encode novel fungal virulence factors. Ascospores play a critical role in the initiation of wheat scab. Twenty-six PK mutants were blocked in perithecia formation or aborted in ascosporogenesis. An additional 19 mutants were defective in ascospore release or morphology. Interestingly, F. graminearum contains two aurora kinase genes with distinct functions, which has not been reported in fungi. In addition, we used the interolog approach to predict the PK-PK and PK-protein interaction networks of F. graminearum. Several predicted interactions were verified with yeast two-hybrid or co-immunoprecipitation assays. To our knowledge, this is the first functional characterization of the kinome in plant pathogenic fungi. Protein kinase genes important for various aspects of growth, developmental, and infection processes in F. graminearum were identified in this study.
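The interolog approach mentioned above transfers known interactions from a reference organism to a target organism through orthology: if proteins A and B interact in yeast, their orthologs are predicted to interact in the target. The sketch below illustrates that general idea only; it is not the authors' procedure, and the function name and all gene identifiers are hypothetical placeholders.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple


def predict_interologs(
    reference_interactions: Iterable[Tuple[str, str]],
    orthologs: Dict[str, Set[str]],
) -> Set[FrozenSet[str]]:
    """Transfer reference interactions to a target species via an ortholog map.

    `orthologs` maps each reference gene to its target-species orthologs.
    Pairs are returned as unordered frozensets; self-pairs are skipped
    (a simplifying choice made for this sketch).
    """
    predicted: Set[FrozenSet[str]] = set()
    for gene_a, gene_b in reference_interactions:
        for target_a in orthologs.get(gene_a, ()):
            for target_b in orthologs.get(gene_b, ()):
                if target_a != target_b:
                    predicted.add(frozenset((target_a, target_b)))
    return predicted


# Hypothetical identifiers: Y* are reference genes, F* are target genes.
yeast_pairs = [("Y1", "Y2"), ("Y2", "Y3")]
ortholog_map = {"Y1": {"F1"}, "Y2": {"F2"}}
predictions = predict_interologs(yeast_pairs, ortholog_map)
```

Only the Y1/Y2 interaction is transferred here, because Y3 has no ortholog in the map. Predictions of this kind are hypotheses that still need experimental confirmation, for example by the yeast two-hybrid or co-immunoprecipitation assays the text mentions.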
Fusarium head blight caused by Fusarium graminearum is one of the most important diseases of wheat and barley. Although protein kinases are known to play major regulatory roles in fungi, systematic characterization of fungal kinomes has not been reported in plant pathogens. In this study we generated deletion mutants for 96 protein kinase genes. All of the resulting knockout mutants were assayed for changes in 17 phenotypes, including growth, reproduction, stress responses, and plant infection. Overall, deletion of 64 kinase genes resulted in at least one of the phenotypes examined. In total, 42 kinase mutants were significantly reduced in virulence or non-pathogenic. A number of these protein kinase genes, including two that are unique to filamentous fungi, are dispensable for hyphal growth and likely encode novel fungal virulence factors. Ascospores are the primary inoculum for wheat scab. We identified 26 mutants blocked in ascospore production. We also used an in silico approach to predict the kinase-kinase interactions and verified some of them by yeast two-hybrid or co-IP assays. Overall, in this study we functionally characterize the kinome of F. graminearum. Protein kinase genes that are important for various aspects of growth, developmental, and plant infection processes were identified.
Endometrial cancer (EC) is the 8th leading cause of cancer death amongst American women. Most ECs are endometrioid, serous, or clear cell carcinomas, or an admixture of histologies. Serous and clear cell ECs are clinically aggressive tumors for which alternative therapeutic approaches are needed. The purpose of this study was to search for somatic mutations in the tyrosine kinome of serous and clear cell ECs, because mutated kinases can point to potential therapeutic targets.
In a mutation discovery screen, we PCR amplified and Sanger sequenced the exons encoding the catalytic domains of 86 tyrosine kinases from 24 serous, 11 clear cell, and 5 mixed histology ECs. For somatically mutated genes, we next sequenced the remaining coding exons from the 40 discovery screen tumors and sequenced all coding exons from another 72 ECs (10 clear cell, 21 serous, 41 endometrioid). We assessed the copy number of mutated kinases in this cohort of 112 tumors using quantitative real time PCR, and we used immunoblotting to measure expression of these kinases in endometrial cancer cell lines.
Overall, we identified somatic mutations in TNK2 (tyrosine kinase non-receptor, 2) and DDR1 (discoidin domain receptor tyrosine kinase 1) in 5.3% (6 of 112) and 2.7% (3 of 112) of ECs. Copy number gains of TNK2 and DDR1 were identified in another 4.5% and 0.9% of 112 cases respectively. Immunoblotting confirmed TNK2 and DDR1 expression in endometrial cancer cell lines. Three of five missense mutations in TNK2 and one of two missense mutations in DDR1 are predicted to impact protein function by two or more in silico algorithms. The TNK2 P761Rfs*72 frameshift mutation was recurrent in EC, and the DDR1 R570Q missense mutation was recurrent across tumor types.
This is the first study to systematically search for mutations in the tyrosine kinome in clear cell endometrial tumors. Our findings indicate that high-frequency somatic mutations in the catalytic domains of the tyrosine kinome are rare in clear cell ECs. We uncovered ten new mutations in TNK2 and DDR1 within serous and endometrioid ECs, thus providing novel insights into the mutation spectrum of each gene in EC.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2407-14-884) contains supplementary material, which is available to authorized users.
Endometrial; Cancer; Mutation; TNK2; ACK1; DDR1; Copy number; Tyrosine kinase; Tyrosine kinome
Dictyostelium discoideum is a widely studied model organism with both unicellular and multicellular forms in its developmental cycle. The Dictyostelium genome encodes 285 predicted protein kinases, similar to the count of the much more advanced Drosophila. It contains members of most kinase classes shared by fungi and metazoans, as well as many previously thought to be metazoan specific, indicating that they have been secondarily lost from the fungal lineage. This includes the entire tyrosine kinase–like (TKL) group, which is expanded in Dictyostelium and includes several novel receptor kinases. Dictyostelium lacks tyrosine kinase group kinases, and most tyrosine phosphorylation appears to be mediated by TKL kinases. About half of Dictyostelium kinases occur in subfamilies not present in yeast or metazoa, suggesting that protein kinases have played key roles in the adaptation of Dictyostelium to its habitat. This study offers insights into kinase evolution and provides a focus for signaling analysis in this system.
Protein kinases are eukaryotic enzymes involved in cell communication pathways, and transmit information from outside the cell or between subcellular components within the cell. About 2.5% of genes code for protein kinases, and mutations in many of these cause human disease. The authors characterize the complete set of protein kinases (kinome) from Dictyostelium discoideum, a social amoeba that responds to starvation by forming aggregates of cells, which then differentiate into multicellular fruiting bodies. Dictyostelium branched from the vertebrate lineage after plants but before fungi, and thus illuminates an interesting period in evolutionary history. By comparing the Dictyostelium kinome to those of other organisms, the authors find 46 types of kinases that appear to be conserved in all organisms, and are likely to be involved in fundamental cellular processes. Dictyostelium is an established model organism for studying many aspects of cell biology that are conserved in humans, and this exposition of conserved kinases will help to guide future studies. The Dictyostelium kinome also contains an impressive degree of creativity—almost half of the kinases are unique to Dictyostelium. Many of these Dictyostelium-specific kinases may be related to this organism's distinctive mechanism for coping with starvation.
As one of the largest protein families, protein kinases (PKs) regulate nearly all processes within the cell and are considered important drug targets. Extensive research on PK inhibitors has produced a wealth of PK-targeting compounds with potential as lead anthelmintic drugs. Repurposing compounds that have already been developed is an attractive and inexpensive way to obtain lead compounds for much-needed drugs against neglected tropical diseases, especially for use in developing countries. In this study, PKs from nematodes, hosts, and DrugBank were identified and classified into kinase families and subfamilies. Nematode proteins were placed into orthologous groups that span the phylum Nematoda. A minimal kinome for the phylum Nematoda was identified, and properties of the minimal kinome were explored. Orthologous groups from the minimal kinome were prioritized for experimental testing based on the RNAi phenotype of the Caenorhabditis elegans ortholog, transcript expression over the life cycle and anatomic expression patterns. Compounds linked to DrugBank targets belonging to the same kinase families and subfamilies as the minimal nematode kinome were extracted. Thirty-five compounds were tested in the non-parasitic C. elegans, and active compounds progressed to testing against nematode species with different modes of parasitism, the blood-feeding Haemonchus contortus and the filarial Brugia malayi. Eighteen compounds showed efficacy in C. elegans, and six compounds also showed efficacy in at least one of the parasitic species. Hypotheses regarding the pathways the compounds may target and their molecular mechanisms of activity are discussed.
Parasitic nematode infection is a large global health and economic problem, infecting around 2 billion people and costing $100 billion in crops and livestock. People in developing countries often live on one dollar per day, so treatments cannot be expensive. Using pre-existing drugs as lead compounds therefore provides an economical way to begin developing affordable treatments. Protein kinases were chosen as the focus of this work because of the large number of pre-existing drugs that target them and their important role in regulating almost all activities in the cell. Herein we describe a set of protein kinases conserved in diverse nematode species and the experimental screening results for pre-existing drugs that target these kinases. The compounds that show in vitro efficacy in both C. elegans and a parasitic nematode, H. contortus or B. malayi, have potential to be optimized further. These compounds have potential to provide accessible treatment to people in developing countries, as well as to improve the health of livestock and boost food production globally.
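The idea of a minimal kinome built from orthologous groups can be illustrated with a short sketch. It assumes, as a simplification not stated in the abstract, that a group belongs to the minimal kinome when it has at least one member in every species sampled. The function name `minimal_kinome`, the group identifiers and the toy data are all hypothetical.

```python
from typing import Dict, Set


def minimal_kinome(groups_by_species: Dict[str, Set[str]]) -> Set[str]:
    """Return the orthologous-group IDs found in every species.

    Assumption: "minimal kinome" is taken here to mean the intersection of
    the kinase orthologous groups across all species supplied.
    """
    if not groups_by_species:
        return set()
    return set.intersection(*groups_by_species.values())


# Hypothetical toy data: kinase orthologous groups detected per species.
toy_groups = {
    "C. elegans": {"OG1", "OG2", "OG3"},
    "H. contortus": {"OG1", "OG3", "OG4"},
    "B. malayi": {"OG1", "OG3"},
}

print(sorted(minimal_kinome(toy_groups)))
```

In this toy input only OG1 and OG3 occur in all three species, so only those two groups would be carried forward for prioritization.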
In addition to redox regulation, protein phosphorylation has gained increasing importance as a regulatory principle in chloroplasts in recent years. However, only very few chloroplast-localized protein kinases have been identified to date. Protein phosphorylation regulates important chloroplast processes such as photosynthesis or transcription. In order to better understand chloroplast function, it is therefore crucial to obtain a complete picture of the chloroplast kinome, which is currently constrained by two effects: first, recent observations showed that the bioinformatics-based prediction of chloroplast-localized protein kinases from available sequence data is strongly biased; and, secondly, protein kinases are of very low abundance, which makes their identification by proteomics approaches extremely difficult. Therefore, the aim of this study was to obtain a complete list of chloroplast-localized protein kinases from different species. Evaluation of protein kinases which were either highly predicted to be chloroplast localized or have been identified in different chloroplast proteomic studies resulted in the confirmation of only three new kinases. Considering also all reports of experimentally verified chloroplast protein kinases to date, compelling evidence was found for a total set of 15 chloroplast-localized protein kinases in different species. This is in contrast to a much higher number that would be expected based on targeting prediction or on the general abundance of protein kinases in relation to the entire proteome. Moreover, it is shown that unusual protein kinases with differing ATP-binding sites or catalytic centres seem to occur frequently within the chloroplast kinome, thus making their identification by mass spectrometry-based approaches even more difficult due to a different annotation.
Casein kinase; chloroplast protein kinase; organellar proteomics; photosynthesis; STN7; STN8; subcellular localization; YFP fusion protein
The heat shock protein 90 (Hsp90) is required for the stability of many signalling kinases. As a target for cancer therapy it allows the simultaneous inhibition of several signalling pathways. However, its inhibition in healthy cells could also lead to severe side effects. This is the first comprehensive analysis of the response to Hsp90 inhibition at the kinome level.
We quantitatively profiled the effects of Hsp90 inhibition by geldanamycin on the kinome of one primary (Hs68) and three tumour cell lines (SW480, U2OS, A549) by affinity proteomics based on immobilized broad-spectrum kinase inhibitors ("kinobeads"). To identify affected pathways we used the KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway classification. We combined Hsp90 and proteasome inhibition to identify Hsp90 substrates in Hs68 and SW480 cells. The mutational status of kinases in the cell lines used was determined by next-generation sequencing. A mutation of the Hsp90 candidate client RIPK2 was mapped onto its structure.
We measured relative abundances of > 140 protein kinases from the four cell lines in response to geldanamycin treatment and identified many new potential Hsp90 substrates. These kinases represent diverse families and cellular functions, with a strong representation of pathways involved in tumour progression like the BMP, MAPK and TGF-beta signalling cascades. Co-treatment with the proteasome inhibitor MG132 enabled us to classify 64 kinases as true Hsp90 clients. Finally, mutations in 7 kinases correlate with an altered response to Hsp90 inhibition. Structural modelling of the candidate client RIPK2 suggests an impact of the mutation on a proposed Hsp90 binding domain.
We propose a high confidence list of Hsp90 kinase clients, which provides new opportunities for targeted and combinatorial cancer treatment and diagnostic applications.
Protein kinases are a large and diverse family of enzymes that are genomically altered in many human cancers. Targeted cancer genome sequencing efforts have unveiled the mutational profiles of protein kinase genes from many different cancer types. While mutational data on protein kinases is currently catalogued in various databases, integration of mutation data with other forms of data on protein kinases such as sequence, structure, function and pathway is necessary to identify and characterize key cancer causing mutations. Integrative analysis of protein kinase data, however, is a challenge because of the disparate nature of protein kinase data sources and data formats.
Here, we describe ProKinO, a protein kinase-specific ontology, which provides a controlled vocabulary of terms, their hierarchy, and relationships unifying sequence, structure, function, mutation and pathway information on protein kinases. The conceptual representation of such diverse forms of information in one place not only allows rapid discovery of significant information related to a specific protein kinase, but also enables large-scale integrative analysis of protein kinase data in ways not possible through other kinase-specific resources. We have performed several integrative analyses of ProKinO data and, as an example, found that a large number of somatic mutations (∼288 distinct mutations) associated with the haematopoietic neoplasm cancer type map to only 8 kinases in the human kinome. This is in contrast to glioma, where the mutations are spread over 82 distinct kinases. We also provide examples of how ontology-based data analysis can be used to generate testable hypotheses regarding cancer mutations.
We present an integrated framework for large-scale integrative analysis of protein kinase data. Navigation and analysis of ontology data can be performed using the ontology browser available at: http://vulcan.cs.uga.edu/prokino.
The Apicomplexa constitute an evolutionarily divergent phylum of protozoan pathogens responsible for widespread parasitic diseases such as malaria and toxoplasmosis. Many cellular functions in these medically important organisms are controlled by protein kinases, which have emerged as promising drug targets for parasitic diseases. However, an incomplete understanding of how apicomplexan kinases structurally and mechanistically differ from their host counterparts has hindered drug development efforts to target parasite kinases.
We used the wealth of sequence data recently made available for 15 apicomplexan species to identify the kinome of each species and quantify the evolutionary constraints imposed on each family of apicomplexan kinases. Our analysis revealed lineage-specific adaptations in selected families, namely cyclin-dependent kinase (CDK), calcium-dependent protein kinase (CDPK) and CLK/LAMMER, which have been identified as important in the pathogenesis of these organisms. Bayesian analysis of selective constraints imposed on these families identified the sequence and structural features that most distinguish apicomplexan protein kinases from their homologs in model organisms and other eukaryotes. In particular, in a subfamily of CDKs orthologous to Plasmodium falciparum crk-5, the activation loop contains a novel PTxC motif which is absent from all CDKs outside Apicomplexa. Our analysis also suggests a convergent mode of regulation in a subset of apicomplexan CDPKs and mammalian MAPKs involving a commonly conserved arginine in the αC helix. In all recognized apicomplexan CLKs, we find a set of co-conserved residues involved in substrate recognition and docking that are distinct from metazoan CLKs.
We pinpoint key conserved residues that can be predicted to mediate functional differences from eukaryotic homologs in three identified kinase families. We discuss the structural, functional and evolutionary implications of these lineage-specific variations and propose specific hypotheses for experimental investigation. The apicomplexan-specific kinase features reported in this study can be used in the design of selective kinase inhibitors.
Function prediction by homology is widely used to provide preliminary functional annotations for genes for which experimental evidence of function is unavailable or limited. This approach has been shown to be prone to systematic error, including percolation of annotation errors through sequence databases. Phylogenomic analysis avoids these errors in function prediction but has been difficult to automate for high-throughput application. To address this limitation, we present a computationally efficient pipeline for phylogenomic classification of proteins. This pipeline uses the SCI-PHY (Subfamily Classification in Phylogenomics) algorithm for automatic subfamily identification, followed by subfamily hidden Markov model (HMM) construction. A simple and computationally efficient scoring scheme using family and subfamily HMMs enables classification of novel sequences to protein families and subfamilies. Sequences representing entirely novel subfamilies are differentiated from those that can be classified to subfamilies in the input training set using logistic regression. Subfamily HMM parameters are estimated using an information-sharing protocol, enabling subfamilies containing even a single sequence to benefit from conservation patterns defining the family as a whole or in related subfamilies. SCI-PHY subfamilies correspond closely to functional subtypes defined by experts and to conserved clades found by phylogenetic analysis. Extensive comparisons of subfamily and family HMM performances show that subfamily HMMs dramatically improve the separation between homologous and non-homologous proteins in sequence database searches. Subfamily HMMs also provide extremely high specificity of classification and can be used to predict entirely novel subtypes. The SCI-PHY Web server at http://phylogenomics.berkeley.edu/SCI-PHY/ allows users to upload a multiple sequence alignment for subfamily identification and subfamily HMM construction. 
Biologists wishing to provide their own subfamily definitions can do so. Source code is available on the Web page. The Berkeley Phylogenomics Group PhyloFacts resource contains pre-calculated subfamily predictions and subfamily HMMs for more than 40,000 protein families and domains at http://phylogenomics.berkeley.edu/phylofacts/.
Predicting the function of a gene or protein (gene product) from its primary sequence is a major focus of many bioinformatics methods. In this paper, the authors present a three-stage computational pipeline for gene functional annotation in an evolutionary framework to reduce the systematic errors associated with the standard protocol (annotation transfer from predicted homologs). In the first stage, a functional hierarchy is estimated for each protein family and subfamilies are identified. In the second stage, hidden Markov models (HMMs) (a type of statistical model) are constructed for each subfamily to model both the family-defining and subfamily-specific signatures. In the third stage, subfamily HMMs are used to assign novel sequences to functional subtypes. Extensive experimental validation of these methods shows that predicted subfamilies correspond closely to functional subtypes identified by experts and to conserved clades in phylogenetic trees; that subfamily HMMs increase the separation between homologs and non-homologs in sequence database discrimination tests relative to the use of a single HMM for the family; and that specificity of classification of novel sequences to subfamilies using subfamily HMMs is near perfect (1.5% error rate when sequences are assigned to the top-scoring subfamily, and <0.5% error rate when logistic regression of scores is employed).
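The third stage described above (assigning a novel sequence to the top-scoring subfamily, with a logistic regression on the scores to flag sequences that may belong to no known subfamily) can be sketched as follows. This is a minimal illustration and not the SCI-PHY implementation: the function name `classify`, the use of the top score as the only regression input, and the `intercept`, `slope` and `cutoff` values are all assumptions made for the example.

```python
import math
from typing import Dict, Optional, Tuple


def classify(
    scores: Dict[str, float],
    intercept: float = -5.0,  # hypothetical regression coefficient
    slope: float = 0.1,       # hypothetical regression coefficient
    cutoff: float = 0.5,      # hypothetical acceptance threshold
) -> Tuple[Optional[str], float]:
    """Assign a sequence to its top-scoring subfamily, or to None if novel.

    `scores` maps subfamily name -> HMM score for one query sequence.
    A logistic function of the top score estimates the probability that the
    assignment is correct; below `cutoff` the sequence is treated as a
    candidate member of a novel subfamily.
    """
    if not scores:
        raise ValueError("scores must contain at least one subfamily")
    best = max(scores, key=lambda name: scores[name])
    probability = 1.0 / (1.0 + math.exp(-(intercept + slope * scores[best])))
    return (best if probability >= cutoff else None), probability


# Hypothetical scores for two query sequences.
print(classify({"subfamily_A": 120.0, "subfamily_B": 40.0}))
print(classify({"subfamily_A": 10.0, "subfamily_B": 20.0}))
```

With these made-up coefficients the first query is assigned to `subfamily_A`, while the second scores too weakly against every subfamily and is returned as `None`, meaning a possible novel subtype.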
The hepatitis C virus NS5A protein plays a critical role in virus replication, conferring interferon resistance to the virus through perturbation of multiple intracellular signaling pathways. Since NS5A is a phosphoprotein, it is of considerable interest to understand the role of phosphorylation in NS5A function. In this report, we investigated the phosphorylation of NS5A by taking advantage of 119 glutathione S-transferase-tagged protein kinases purified from Saccharomyces cerevisiae to perform a global screening of yeast kinases capable of phosphorylating NS5A in vitro. A database BLAST search was subsequently performed by using the sequences of the yeast kinases that phosphorylated NS5A in order to identify human kinases with the highest sequence homologies. Subsequent in vitro kinase assays and phosphopeptide mapping studies confirmed that several of the homologous human protein kinases were capable of phosphorylating NS5A. In vivo phosphopeptide mapping revealed phosphopeptides common to those generated in vitro by AKT, p70S6K, MEK1, and MKK6, suggesting that these kinases may phosphorylate NS5A in mammalian cells. Significantly, rapamycin, an inhibitor commonly used to investigate the mTOR/p70S6K pathway, reduced the in vivo phosphorylation of specific NS5A phosphopeptides, strongly suggesting that p70S6 kinase and potentially related members of this group phosphorylate NS5A inside the cell. Curiously, certain of these kinases also play a major role in mRNA translation and antiapoptotic pathways, some of which are already known to be regulated by NS5A. The findings presented here demonstrate the use of high-throughput screening of the yeast kinome to facilitate the major task of identifying human NS5A protein kinases for further characterization of phosphorylation events in vivo. Our results suggest that this novel approach may be generally applicable to the screening of other protein biochemical activities by mechanistic class.
‘Phylogenetic trees’ are commonly used for the analysis of chemogenomics datasets and to relate protein targets to each other, based on the (shared) bioactivities of their ligands. However, no real assessment as to the suitability of this representation has been performed yet in this area. We aimed to address this shortcoming in the current work, as exemplified by a kinase data set, given the importance of kinases in many diseases as well as the availability of large-scale datasets for analysis. In this work, we analyzed a dataset comprising 157 compounds, which have been tested at concentrations of 1 μM and 10 μM against a panel of 225 human protein kinases in full-matrix experiments, aiming to explain kinase promiscuity and selectivity against inhibitors. Compounds were described by chemical features, which were used to represent kinases (i.e. each kinase had an active set of features and an inactive set).
Using this representation, we made a bioactivity-based classification of the kinome that partially resembles previous sequence-based classifications; in particular, kinases from the TK, CDK, CLK and AGC branches cluster together. However, we were also able to show that in approximately 57% of cases, on average 6 kinase inhibitors exhibit activity against kinases which are located at a large distance in the sequence-based classification (at a relative distance of 0.6–0.8 on a scale from 0 to 1), but are correctly located closer to each other in our bioactivity-based tree (distance 0–0.4). Despite this improvement over sequence-based classification, the bioactivity-based classification also needed further attention: for approximately 80% of all analyzed kinases, the neighbors assigned by the bioactivity-based classification also show high SAR similarity (i.e. a high fraction of shared active compounds and, therefore, interaction with similar inhibitors). However, in the remaining ~20% of cases a clear relationship between kinase bioactivity profile similarity and shared active compounds could not be established, which is in agreement with previously published atypical SAR (such as for LCK, FGFR1, AKT2, DAPK1, TGFR1, MK12 and AKT1).
In this work we were hence able to show that (1) neighborhood relationships are difficult to establish for targets (here kinases) with few shared activities, and (2) phylogenetic tree representations make implicit assumptions (i.e. that neighboring kinases exhibit similar interaction profiles with inhibitors) that are not always suitable for analyses of bioactivity space. While both points have been implicitly alluded to before, this is, to the authors' knowledge, the first study that explores both points comprehensively. Excluding kinases with few shared activities improved the situation greatly (the percentage of kinases for which no neighborhood relationship could be established dropped from 20% to only 4%). We conclude that all of the above findings need to be taken into account when performing chemogenomics analyses, including for other target classes.
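The notion of SAR similarity as a "fraction of shared active compounds" can be made concrete with a small sketch. It assumes a Jaccard-style ratio over sets of active compounds, which is one plausible reading and not necessarily the measure used in the study (which describes kinases through chemical features of their active and inactive sets). The function names and compound identifiers are hypothetical.

```python
from typing import Set


def shared_active_fraction(actives_a: Set[str], actives_b: Set[str]) -> float:
    """Jaccard ratio: shared active compounds / all compounds active on either kinase."""
    union = actives_a | actives_b
    if not union:
        # Neither kinase has any active compound, so there is no evidence of similarity.
        return 0.0
    return len(actives_a & actives_b) / len(union)


def bioactivity_distance(actives_a: Set[str], actives_b: Set[str]) -> float:
    """Distance on a 0-to-1 scale; 0 means identical active sets."""
    return 1.0 - shared_active_fraction(actives_a, actives_b)


# Hypothetical active-compound sets for two kinases.
kinase_x = {"cpd1", "cpd2", "cpd3"}
kinase_y = {"cpd2", "cpd3", "cpd4"}

print(shared_active_fraction(kinase_x, kinase_y))  # 2 shared of 4 total
print(bioactivity_distance(kinase_x, kinase_y))
```

The empty-set branch reflects the first conclusion above: when two kinases share few or no activities, the ratio carries little information, so any neighborhood assignment built on it is unreliable.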
Kinase inhibitor; Selectivity; Phylogenetics; Chemogenomics; Polypharmacology
Cancer is a genetic disease that develops through a series of somatic mutations, a subset of which drive cancer progression. Although cancer genome sequencing studies are beginning to reveal the mutational patterns of genes in various cancers, identifying the small subset of “causative” mutations from the large subset of “non-causative” mutations, which accumulate as a consequence of the disease, is a challenge. In this article, we present an effective machine learning approach for identifying cancer-associated mutations in human protein kinases, a class of signaling proteins known to be frequently mutated in human cancers. We evaluate the performance of 11 well known supervised learners and show that a multiple-classifier approach, which combines the performances of individual learners, significantly improves the classification of known cancer-associated mutations. We introduce several novel features related specifically to structural and functional characteristics of protein kinases and find that the level of conservation of the mutated residue at specific evolutionary depths is an important predictor of oncogenic effect. We consolidate the novel features and the multiple-classifier approach to prioritize and experimentally test a set of rare unconfirmed mutations in the epidermal growth factor receptor tyrosine kinase (EGFR). Our studies identify T725M and L861R as rare cancer-associated mutations inasmuch as these mutations increase EGFR activity in the absence of the activating EGF ligand in cell-based assays.
Cancer progresses by accumulation of mutations in a subset of genes that confer growth advantage. The 518 protein kinase genes encoded in the human genome, collectively called the kinome, represent one of the largest families of oncogenes. Targeted sequencing studies of many different cancers have shown that the mutational landscape comprises both cancer-causing “driver” mutations and harmless “passenger” mutations. While the frequent recurrence of some driver mutations in human cancers helps distinguish them from the large number of passenger mutations, a significant challenge is to identify the rare “driver” mutations that are less frequently observed in patient samples and yet are causative. Here we combine computational and experimental approaches to identify rare cancer-associated mutations in Epidermal Growth Factor receptor kinase (EGFR), a signaling protein frequently mutated in cancers. Specifically, we evaluate a novel multiple-classifier approach and features specific to the protein kinase super-family in distinguishing known cancer-associated mutations from benign mutations. We then apply the multiple classifier to identify and test the functional impact of rare cancer-associated mutations in EGFR. We report, for the first time, that the EGFR mutations T725M and L861R, which are infrequently observed in cancers, constitutively activate EGFR in a manner analogous to the frequently observed driver mutations.
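A multiple-classifier approach can be illustrated with a simple majority vote. The abstract does not say how the individual learners are combined, so majority voting is an assumption here, and the three rule-based "classifiers", their thresholds and the feature names (`conservation`, `recurrence`, `in_catalytic_core`) are hypothetical stand-ins for trained supervised learners.

```python
from collections import Counter
from typing import Callable, Dict, Sequence

Mutation = Dict[str, float]            # hypothetical feature record for one mutation
Classifier = Callable[[Mutation], str]


def by_conservation(m: Mutation) -> str:
    # Hypothetical rule: highly conserved residues are more likely drivers.
    return "driver" if m["conservation"] >= 0.9 else "passenger"


def by_recurrence(m: Mutation) -> str:
    # Hypothetical rule: mutations seen in several samples are called drivers.
    return "driver" if m["recurrence"] >= 3 else "passenger"


def by_location(m: Mutation) -> str:
    # Hypothetical rule: mutations in the catalytic core are called drivers.
    return "driver" if m["in_catalytic_core"] >= 1 else "passenger"


def majority_vote(classifiers: Sequence[Classifier], mutation: Mutation) -> str:
    """Return the label predicted by most classifiers (ties go to the first label seen)."""
    if not classifiers:
        raise ValueError("at least one classifier is required")
    votes = Counter(clf(mutation) for clf in classifiers)
    return votes.most_common(1)[0][0]


ENSEMBLE = [by_conservation, by_recurrence, by_location]

# A rare mutation: low recurrence, but conserved and in the catalytic core.
rare = {"conservation": 0.95, "recurrence": 1, "in_catalytic_core": 1}
print(majority_vote(ENSEMBLE, rare))
```

The toy case mirrors the motivation in the text: a recurrence-based rule alone would label the rare mutation a passenger, whereas the combined vote calls it a driver because the other two rules agree.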