Nucleic Acids Res. Jan 2009; 37(Database issue): D244–D250.
Published online Oct 30, 2008. doi:  10.1093/nar/gkn834
PMCID: PMC2686601
Kinomer v. 1.0: a database of systematically classified eukaryotic protein kinases
David M. A. Martin, Diego Miranda-Saavedra, and Geoffrey J. Barton*
College of Life Sciences, University of Dundee, Dow Street, Dundee DD1 5EH, Scotland, UK
*To whom correspondence should be addressed. Tel: +44 1382 385860; Fax: +44 1382 385764; Email: geoff/at/compbio.dundee.ac.uk
Present address: Diego Miranda-Saavedra, Cambridge Institute for Medical Research, Wellcome Trust/MRC Building, Addenbrooke's Hospital, Hills Road, Cambridge CB2 0XY, UK.
The authors wish it to be known that, in their opinion, the first two authors should be regarded as the joint First Authors.
Received August 15, 2008; Revised October 13, 2008; Accepted October 14, 2008.
ABSTRACT
The regulation of protein function through reversible phosphorylation by protein kinases and phosphatases is a general mechanism controlling virtually every cellular activity. Eukaryotic protein kinases can be classified into distinct, well-characterized groups based on amino acid sequence similarity and function. We recently reported a highly sensitive and accurate hidden Markov model-based method for the automatic detection and classification of protein kinases into these specific groups. The Kinomer v. 1.0 database presented here contains annotated classifications for the protein kinase complements of 43 eukaryotic genomes. These span the taxonomic range and include fungi (16 species), plants (6), diatoms (1), amoebas (2), protists (1) and animals (17). The kinomes are stored in a relational database and are accessible through a web interface on the basis of species, kinase group or a combination of both. In addition, the Kinomer v. 1.0 HMM library is made available for users to perform classification on arbitrary sequences. The Kinomer v. 1.0 database is a continually updated resource where direct comparison of kinase sequences across kinase groups and across species can give insights into kinase function and evolution. Kinomer v. 1.0 is available at http://www.compbio.dundee.ac.uk/kinomer/.
INTRODUCTION
The regulation of protein function through reversible phosphorylation by protein kinases and phosphatases is a widespread cellular mechanism thought to control virtually every cellular activity (1), and abnormal levels of phosphorylation are known to be responsible for severe diseases (2).
Hanks and Hunter were the first to report that sequence similarity of kinase catalytic domains reflects protein kinase function and/or mode of regulation (3,4). Observation of distinct clades where function segregated with sequence similarity allowed Hanks and Hunter to divide the protein kinase superfamily into specific ‘groups’. The currently accepted classification of the eukaryotic protein kinase superfamily considers eight ‘conventional’ protein kinase groups (ePKs) and four ‘atypical’ groups (aPKs) (5,6). Among the ePKs are the AGC group (including cyclic-nucleotide and calcium-phospholipid-dependent kinases, ribosomal S6-phosphorylating kinases, G protein-coupled kinases and all close relatives of these sets); the CAMKs (calmodulin-regulated kinases); the CK1 group (casein kinase 1, and close relatives); the CMGC group (including cyclin-dependent kinases, mitogen-activated protein kinases, glycogen synthase kinases and CDK-like kinases); the RGC group (receptor guanylate cyclase); the STEs (including many kinases functioning in MAP kinase cascades); the TKs (tyrosine kinases) and the TKLs (tyrosine kinase-like kinases). However, there is a significant proportion of kinases which, whilst exhibiting some degree of sequence similarity to the eight groups above, could not be classified easily into particular groups. These form a ninth group called ‘Other’.
The aPKs are a small set of protein kinases that do not share clear sequence similarity with ePKs, but have been shown experimentally to have protein kinase activity. The bona fide aPKs (6) are the alpha-kinase group (exemplified by myosin heavy chain kinase of Dictyostelium discoideum), PIKK (phosphatidyl inositol 3′ kinase-related kinases), RIO and PDHK (pyruvate dehydrogenase kinases).
The sequencing of complete genomes for many eukaryotic species has allowed the determination and comparison of their complete kinase complements (kinomes). These include the kinomes of Saccharomyces cerevisiae (7), Caenorhabditis elegans (8), Drosophila melanogaster (9), Mus musculus (10), Homo sapiens (5), Dictyostelium discoideum (11), Strongylocentrotus purpuratus (12), Tetrahymena thermophila (13), and the plants Arabidopsis thaliana and Oryza sativa (14). Several parasite kinomes have been determined, including the malaria parasite Plasmodium falciparum (15), its comparison with Plasmodium yoelii (16) and those of the three Trypanosomatid species Leishmania major, Trypanosoma brucei and Trypanosoma cruzi (17). The kinomes of H. sapiens, M. musculus, S. purpuratus, D. melanogaster, C. elegans, S. cerevisiae, D. discoideum and T. thermophila are available through Kinbase (http://www.kinase.com/kinbase/). In particular, the observation that many important protein kinases of parasitic protozoa are significantly dissimilar from their eukaryotic counterparts has raised the prospects for therapeutics based on the selective inhibition of parasitic protein kinases (18–20).
We have recently exploited the sequence similarity of protein kinases in developing a multi-level Hidden Markov Model (HMM) library that is capable of classifying protein kinases into their correct functional group (6). The protein kinase HMM library was shown to be three times more sensitive than BLAST for identifying kinase catalytic domains. It was also shown to be more sensitive than a general Pfam model of the kinase catalytic domain, with the added advantage that the HMM library is capable of discriminating among protein kinase groups. The validated HMM library was applied to improve the group-level classification of the S. cerevisiae ePKs from 66.96% to 90.43% by classifying many of the yeast kinases previously assigned to the ‘Other’ group. In this article, we describe the extension of this analysis to the complete classification at the kinase group level of 43 curated eukaryotic kinomes and a web-based resource through which these annotations can be examined. In addition, we provide an interface to the HMM library, allowing for the classification of arbitrary sequences.
Sequence data sources
The complete translated protein coding sequences were obtained for the fungi Aspergillus fumigatus (21), Aspergillus nidulans (22), Aspergillus niger (23), Aspergillus oryzae (24), Candida glabrata (25), Cryptococcus neoformans (26), Debaryomyces hansenii (25), Kluyveromyces lactis (25), Magnaporthe grisea (27), Neurospora crassa (28), Phanerochaete chrysosporium (29), Ustilago maydis (30) and Yarrowia lipolytica (25). Among the photosynthetic organisms we have included A. thaliana (31), the red alga Cyanidioschyzon merolae (32), the rice species Oryza sativa ssp. japonica (33), the green algae Ostreococcus lucimarinus (34) and Ostreococcus tauri (35), and the poplar tree Populus trichocarpa (36). The metazoan genomes include the yellow fever mosquito Aedes aegypti (37), the malaria mosquito vector Anopheles gambiae (38), the silkworm Bombyx mori (39), the common dog Canis familiaris (40), the early chordate Ciona intestinalis (41), the chicken Gallus gallus (42), the Rhesus macaque Macaca mulatta (43), the marsupial Monodelphis domestica (Opossum) (44), the fishes medaka Oryzias latipes (45), Takifugu rubripes (46) and Tetraodon nigroviridis (47), the laboratory rat Rattus norvegicus (48) and the chimpanzee Pan troglodytes (49). Finally, we have also included the amoeba Entamoeba histolytica (50), the diatom Thalassiosira pseudonana (51) and the pathogenic protist Trichomonas vaginalis (52). The manually annotated kinomes of Caenorhabditis elegans (8), Dictyostelium discoideum (11), Drosophila melanogaster, Homo sapiens (5) and M. musculus (10) were downloaded from Kinbase (http://www.kinase.com/kinbase/) on 28 September 2008. The kinomes of Encephalitozoon cuniculi, Saccharomyces cerevisiae and Schizosaccharomyces pombe had previously been manually annotated and analysed in detail (53).
Kinase classification
The predicted peptide sequences for each of the genomes were searched individually against the Kinomer v. 1.0 multi-level HMM library (6) with the hmmpfam program of the HMMer package (54). Partial matches to the kinase catalytic domain were excluded through manual curation. Empirical cutoffs for association of kinase matches with each of the specific kinase groups were determined through analysis of the significance scores for the matches of the library HMMs to the well annotated kinases in Kinbase for the organisms H. sapiens, C. elegans, D. melanogaster and S. cerevisiae (6). For each group, the highest E-value observed among these annotated kinases was taken as the cutoff for confident assignment. The cutoffs are AGC (2.7e−7), CAMK (3.2e−14), CK1 (3.2e−5), CMGC (1.2e−7), RGC (4.8e−5), STE (1.4e−6), TK (1.1e−9), TKL (1.7e−12), Alpha (8.5e−66), PDHK (2.7e−10), PIKK (8.4e−6) and RIO (2.3e−3). Protein kinase catalytic domains whose E-values were above these cutoffs were automatically classified as belonging to the ‘Other’ group. Table 1 lists the protein kinase complements of the 43 eukaryotic genomes contained in Kinomer v. 1.0, split by kinase group. All kinase matches were stored in a relational database, linking the sequence to the library matches and the subsequent assignments to a functional group.
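The cutoff rule described above can be illustrated with a short Python sketch. The cutoff values are those listed in the text; the function name, the input layout (one best E-value per group for a single catalytic domain) and the choice of the lowest E-value when several groups pass their cutoffs are assumptions made for illustration and are not taken from the Kinomer implementation.

```python
# Group-specific E-value cutoffs as listed in the text above.
GROUP_CUTOFFS = {
    "AGC": 2.7e-7, "CAMK": 3.2e-14, "CK1": 3.2e-5, "CMGC": 1.2e-7,
    "RGC": 4.8e-5, "STE": 1.4e-6, "TK": 1.1e-9, "TKL": 1.7e-12,
    "Alpha": 8.5e-66, "PDHK": 2.7e-10, "PIKK": 8.4e-6, "RIO": 2.3e-3,
}


def assign_group(best_evalues):
    """Assign one catalytic domain to a kinase group.

    best_evalues maps a group name to the lowest E-value obtained for any
    HMM of that group (hypothetical input layout). A group is a confident
    candidate only if its E-value is at or below that group's cutoff.
    Assumption: if several groups pass, the one with the lowest E-value wins.
    If none passes, the domain falls into 'Other'.
    """
    confident = {
        group: evalue
        for group, evalue in best_evalues.items()
        if group in GROUP_CUTOFFS and evalue <= GROUP_CUTOFFS[group]
    }
    if not confident:
        return "Other"
    return min(confident, key=confident.get)
```

Under these assumptions, a domain whose best AGC match has an E-value of 1e-5 would be placed in ‘Other’, because 1e-5 is above the AGC cutoff of 2.7e−7.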
Table 1.
The kinomes of the 43 genomes analysed split into the major kinase groups
User interface
The Kinomer v. 1.0 web server provides a comprehensive search interface for accessing the database. Sequences can be retrieved by kinase group, by species or by a combination of both. A summary table illustrates the quality of match of each sequence to the HMM library, as well as providing direct clickable links to the public databases (Figure 1). In addition, an option is available to allow data sets to be downloaded as FASTA format sequence files. The multiple sequence alignment analysis program Jalview (55) is integrated into the Kinomer v. 1.0 interface and allows visualization of the query results. Kinase sequences retrieved are grouped by type and aligned. Jalview allows colouring of the sequences by protein secondary structural properties or amino acid chemical character and on-the-fly calculation of Neighbour-Joining and average distance phylogenetic trees. The web-applet form of Jalview can launch the full Jalview application via the ‘File->View in Full Application’ option. This gives access to further tools for the generation of multiple sequence alignments by Muscle (56), MAFFT (57,58) or ClustalW (59) and secondary structure prediction by JNet (60,61).
Figure 1.
The precalculated kinomes may be downloaded from the Kinomer v. 1.0 website and selected by species, kinase group or a combination of both.
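Retrieval by species, by kinase group or by both maps naturally onto a query over a relational store. The following Python sketch uses an in-memory SQLite database to show the idea. The table names, column names, identifiers, species labels and sequences are all invented placeholders; the actual Kinomer schema is not described in the text.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with placeholder rows (not real data)."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE sequence (
            seq_id   TEXT PRIMARY KEY,
            species  TEXT NOT NULL,
            residues TEXT NOT NULL
        );
        CREATE TABLE assignment (
            seq_id       TEXT REFERENCES sequence(seq_id),
            kinase_group TEXT NOT NULL,
            evalue       REAL NOT NULL
        );
        """
    )
    con.executemany(
        "INSERT INTO sequence VALUES (?, ?, ?)",
        [("demo_001", "species_a", "MKT"),
         ("demo_002", "species_a", "MSE"),
         ("demo_003", "species_b", "MAA")],
    )
    con.executemany(
        "INSERT INTO assignment VALUES (?, ?, ?)",
        [("demo_001", "AGC", 1e-90),
         ("demo_002", "CAMK", 1e-60),
         ("demo_003", "AGC", 1e-75)],
    )
    return con


def find_kinases(con, species=None, group=None):
    """Return (seq_id, species, kinase_group) rows, filtered by either or both keys."""
    sql = ("SELECT s.seq_id, s.species, a.kinase_group "
           "FROM sequence s JOIN assignment a ON a.seq_id = s.seq_id")
    clauses, params = [], []
    if species is not None:
        clauses.append("s.species = ?")
        params.append(species)
    if group is not None:
        clauses.append("a.kinase_group = ?")
        params.append(group)
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY s.seq_id"
    return con.execute(sql, params).fetchall()


def to_fasta(con, seq_ids):
    """Format the selected sequences as FASTA text."""
    records = []
    for seq_id in seq_ids:
        row = con.execute(
            "SELECT species, residues FROM sequence WHERE seq_id = ?", (seq_id,)
        ).fetchone()
        if row is not None:
            records.append(f">{seq_id} {row[0]}\n{row[1]}")
    return "\n".join(records)
```

Calling `find_kinases(con, species="species_a", group="AGC")` combines both filters, while omitting either argument leaves that dimension unrestricted.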
In addition, a separate web interface allows users to classify arbitrary sequences with the HMM library. This web-based tool allows a user to upload a sequence in any of the many sequence formats supported by EMBOSS (62), including the popular FASTA, GCG, PIR and SwissProt formats. The sequence is subjected to basic quality assurance checks before the hmmpfam search job is queued for execution on a multi-node Linux cluster. The user is then provided with a job ID. The interface is asynchronous: it returns a status page that is updated automatically, and the user can bookmark the results page and return at a later time. An optional field allows the user to associate arbitrary comments with a job, which helps to distinguish otherwise similar jobs. There are no other user-selectable parameters, which keeps the submission form clean and straightforward.
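The text does not say which quality assurance checks the server applies. As an illustration only, the Python sketch below shows the kind of check that could be run on a single FASTA-formatted protein sequence before queueing a search. The function name, the accepted residue alphabet and the minimum length are assumptions, and other EMBOSS-supported formats are not handled.

```python
# One-letter amino acid codes plus common ambiguity codes and the stop symbol.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWYBXZUO*")


def check_fasta_protein(text, min_length=30):
    """Return (ok, message) for a single FASTA-formatted protein sequence.

    min_length is a hypothetical threshold chosen for this sketch.
    """
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        return False, "missing FASTA header"
    if sum(ln.startswith(">") for ln in lines) != 1:
        return False, "expected exactly one sequence"
    seq = "".join(lines[1:]).upper()
    if len(seq) < min_length:
        return False, "sequence too short"
    bad = sorted(set(seq) - VALID_RESIDUES)
    if bad:
        return False, "unexpected characters: " + "".join(bad)
    return True, "ok"
```

Rejecting malformed input before submission avoids spending cluster time on jobs that cannot produce a meaningful classification.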
The results are displayed as a formatted HTML page (Figure 2) with the group classification clearly indicated. This shows to which protein kinase group Kinomer v. 1.0 has assigned the sequence. In addition, alternative assignments are given and a summary of all potential significant matches shown. Kinomer v. 1.0 will typically show matches to many kinase group HMMs spanning several kinase groups. All the top-scoring HMMs for one particular group will be the most significant matches, followed by closely related groups. The detailed alignment for each HMM match is linked further down the screen. As some users may wish for more details, the Kinomer v. 1.0 results page also provides a link to the raw HMMer output.
Figure 2.
Results of searching a peptide sequence for kinase catalytic domains using the Kinomer v. 1.0 HMM library. A list of hits is displayed at the top followed by the alignment of the peptide sequence to the individual sub-group HMMs that constitute the HMM (more ...)
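The results page described above reports a primary group assignment followed by alternative assignments. A minimal Python sketch of such a summary is given here, assuming each hit is a tuple of HMM name, kinase group and E-value. The tuple layout, the function name and the model names used in the example are hypothetical and do not reflect the actual hmmpfam output format or the names of the Kinomer HMMs.

```python
def summarise_hits(hits):
    """Rank kinase groups by their best-scoring HMM.

    hits is an iterable of (model_name, group, evalue) tuples.
    Returns a list of (group, best_model, best_evalue), lowest E-value first.
    The first entry is the primary assignment; the rest are alternatives.
    """
    best = {}
    for model, group, evalue in hits:
        # Keep only the most significant (lowest E-value) model per group.
        if group not in best or evalue < best[group][1]:
            best[group] = (model, evalue)
    ranked = sorted(best.items(), key=lambda item: item[1][1])
    return [(group, model, evalue) for group, (model, evalue) in ranked]


example_hits = [
    ("AGC_model_a", "AGC", 1e-80),
    ("AGC_model_b", "AGC", 1e-95),
    ("CAMK_model_a", "CAMK", 1e-40),
]
summary = summarise_hits(example_hits)
primary_group = summary[0][0]        # "AGC" for the placeholder hits above
alternatives = [g for g, _, _ in summary[1:]]
```

Collapsing the sub-group HMMs to one best match per group mirrors the observation that the top-scoring HMMs usually belong to a single group, followed by closely related groups.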
The 43 species considered here span a number of phylogenetic lineages, genome sizes and display a range of adaptations to their environment. The genome-wide kinase group assignments are consistent with our previously published results (6) in that seven protein kinase groups (AGC, CAMK, CK1, CMGC, STE, PIKK and RIO) are present in all species surveyed (Table 1) and some kinases in these groups are likely to be essential. Kinases of the groups RGC, TK, TKL, Alpha and PDHK are late innovations in specific phyla or have been lost secondarily in specific lines of descent. The presence of a discrete number of putative TKs in photosynthetic organisms and the pathogen Entamoeba histolytica suggests that TKs are also likely to have had an ancient origin. This observation has recently been strengthened by the finding of animal-like signalling molecules in the green alga Chlamydomonas reinhardtii (63). These include scavenger receptor cysteine rich (SRCR) and C-type lectin domain (CTLD) proteins, both of which play key roles in the innate immune system of metazoa. The identification of SH2 domain proteins in photosynthetic organisms (63,64) suggests that phosphotyrosine-SH2 domain signalling also has an ancient origin and that important cell signalling and adhesion domains evolved before the divergence of the animal lineage.
The observation that many species outside the Opisthokont group lack important kinase groups, as is the case for TKs in Apicomplexa (Miranda-Saavedra, D. et al., manuscript submitted for publication), and have many lineage-specific groups of kinases, suggests that the group level is the most specific level for the automatic classification of kinomes based on models constructed from sequences outside the taxonomic clade under investigation. With the availability of a number of Deuterostome, Protostome and pre-bilaterian genome sequences, having all kinases belonging to a particular kinase group enables novel analyses to be performed. For example, it is now possible to trace the evolution of receptor tyrosine kinase families and that of their ligands. Since receptor tyrosine kinases are multi-domain proteins, the diverging rates of evolution of the various domains, and their incorporation in the receptor molecule in select phylogenetic lineages, are informative of distinct selection pressures and can indicate newly acquired functions through the acquisition of new ligand-binding domains. This is the case with the Trk family of receptor tyrosine kinases, which are the receptors for the neurotrophins [nerve growth factor (NGF), brain-derived neurotrophic factor (BDNF), neurotrophin-3 (NT-3) and neurotrophin-4 (NT-4)]. The neurotrophin receptors are an ancient family whose function has been lost in multiple lineages, and the roles of the receptors have been modified over time (65).
Kinomer v. 1.0 also includes the manually annotated kinomes of the model fungi S. cerevisiae and S. pombe, and that of the unicellular fungus-like parasite Encephalitozoon cuniculi (53). We have recently shown that the two model fungi share ~85% of their kinomes (53), a degree of similarity much higher than that previously reported. The kinomes of budding and fission yeasts are therefore a useful dataset for annotating the kinomes of other fungi, among which we have included species of importance in basic and medical research, and in biotechnology. The manually annotated kinomes of C. elegans, D. discoideum, D. melanogaster, H. sapiens and M. musculus, as provided in Kinbase (http://www.kinase.com/kinbase/), have also been included in the Kinomer v. 1.0 database. These will facilitate the manual annotation of other kinomes in the database that belong to the same taxonomic clades. The Kinomer v. 1.0 HMM group scores suggest that the classification of a number of kinases in the kinomes of C. elegans, D. discoideum, D. melanogaster, H. sapiens and M. musculus could be improved. However, careful manual annotation of the kinomes of other species in the same taxonomic clades will be performed in the future to make a more informed decision about the re-classification of such kinases.
To our knowledge, Kinomer v. 1.0 is unique in being based on a high-accuracy validated kinase-group classification method (6). Other databases of protein kinases exist, but none of these offers the combination of breadth and accuracy of kinase classification that is present in Kinomer v. 1.0. These include KinMutBase (66), a database of clinically validated mutations in human kinases that lead to disease, and RTKdb (67), a database of receptor tyrosine kinases. The Protein Kinase Resource (68) collates data from several databases and includes a subset of protein kinase 3D structures to produce high-quality multiple structure-based alignments. Kinbase (http://www.kinase.com/kinbase/) contains manually curated kinomes classified according to the Hanks and Hunter classification of protein kinases (4). Although of high quality, Kinbase only contains kinomes for nine species. Finally, KinG (69) includes protein kinases identified in completed genomes that have been classified by a variety of metazoan kinome-based sequence search methods, but does not provide the confidence in kinase classification that is seen in Kinomer v. 1.0. Different eukaryotic lineages possess lineage-specific kinase groups and families that are just beginning to be characterized and which constitute as much as 50% of their kinomes (17). The applicability of the KinG approach to non-metazoan kinases needs further testing. A similar limitation is encountered by the PANTHER (70) database. Although not specific to protein kinases, PANTHER provides an extensive and detailed HMM library for kinase families and sub-families. These family and sub-family HMM libraries are trained on metazoan sequences, so they cannot be used to annotate non-metazoan sequences confidently into kinase families and sub-families, which may not exist in non-metazoan species. Kinomer v. 1.0 annotates to the group level only and, in our view, annotating to the family/sub-family level requires manual curation.
In summary, Kinomer v. 1.0 is an easy-to-use interface to a novel database of both manually and automatically annotated kinomes. The availability of 43 eukaryotic kinomes in a relational database allows the easy querying of protein kinases by species and/or protein kinase group. In addition, the Kinomer v. 1.0 website includes a web server interface to the previously validated HMM library for the classification of peptide sequences into protein kinase groups. In the future, Kinomer v. 1.0 will be enhanced with the addition of a number of manually annotated kinomes of fungal, metazoan and photosynthetic organisms (Miranda-Saavedra, D., et al., manuscript in preparation). These will include the kinomes of pathogenic fungi of the Rhizopus and Fusarium genera, and the kinomes of several unicellular and multicellular photosynthetic organisms including diatoms, red, brown and green algae, and vascular plants. Thus, Kinomer v. 1.0 is a useful and developing repository of expert and automatically annotated kinomes.
FUNDING
D.M.S. was a Wellcome Trust Prize Student at the University of Dundee. Funding for open access charge: Wellcome Trust.
Conflict of interest statement. None declared.
ACKNOWLEDGEMENTS
We thank Drs Tom Walsh and Jonathan Monk for assistance with computing.
REFERENCES
1. Cohen P. The regulation of protein function by multisite phosphorylation—a 25 year update. Trends Biochem Sci. 2000;25:596–601. [PubMed]
2. Cohen P. The role of protein phosphorylation in human health and disease. The Sir Hans Krebs Medal Lecture. Eur. J. Biochem. 2001;268:5001–5010. [PubMed]
3. Hanks SK, Quinn AM, Hunter T. The protein kinase family: conserved features and deduced phylogeny of the catalytic domains. Science. 1988;241:42–52. [PubMed]
4. Hanks SK, Hunter T. Protein kinases 6. The eukaryotic protein kinase superfamily: kinase (catalytic) domain structure and classification. FASEB J. 1995;9:576–596. [PubMed]
5. Manning G, Whyte DB, Martinez R, Hunter T, Sudarsanam S. The protein kinase complement of the human genome. Science. 2002;298:1912–1934. [PubMed]
6. Miranda-Saavedra D, Barton GJ. Classification and functional annotation of eukaryotic protein kinases. Proteins. 2007;68:893–914. [PubMed]
7. Hunter T, Plowman GD. The protein kinases of budding yeast: six score and more. Trends Biochem Sci. 1997;22:18–22. [PubMed]
8. Plowman GD, Sudarsanam S, Bingham J, Whyte D, Hunter T. The protein kinases of Caenorhabditis elegans: a model for signal transduction in multicellular organisms. Proc. Natl Acad. Sci. USA. 1999;96:13603–13610. [PubMed]
9. Morrison DK, Murakami MS, Cleghon V. Protein kinases and phosphatases in the Drosophila genome. J. Cell Biol. 2000;150:F57–F62. [PMC free article] [PubMed]
10. Caenepeel S, Charydczak G, Sudarsanam S, Hunter T, Manning G. The mouse kinome: discovery and comparative genomics of all mouse protein kinases. Proc. Natl Acad. Sci. USA. 2004;101:11707–11712. [PubMed]
11. Goldberg JM, Manning G, Liu A, Fey P, Pilcher KE, Xu Y, Smith JL. The dictyostelium kinome—analysis of the protein kinases from a simple model organism. PLoS Genet. 2006;2:e38. [PMC free article] [PubMed]
12. Bradham CA, Foltz KR, Beane WS, Arnone MI, Rizzo F, Coffman JA, Mushegian A, Goel M, Morales J, Geneviere AM, et al. The sea urchin kinome: a first look. Dev. Biol. 2006;300:180–193. [PubMed]
13. Eisen JA, Coyne RS, Wu M, Wu D, Thiagarajan M, Wortman JR, Badger JH, Ren Q, Amedeo P, Jones KM, et al. Macronuclear genome sequence of the ciliate Tetrahymena thermophila, a model eukaryote. PLoS Biol. 2006;4:e286. [PMC free article] [PubMed]
14. Krupa A, Anamika, Srinivasan N. Genome-wide comparative analyses of domain organisation of repertoires of protein kinases of Arabidopsis thaliana and Oryza sativa. Gene. 2006;380:1–13. [PubMed]
15. Ward P, Equinet L, Packer J, Doerig C. Protein kinases of the human malaria parasite Plasmodium falciparum: the kinome of a divergent eukaryote. BMC Genomics. 2004;5:79. [PMC free article] [PubMed]
16. Anamika K, Srinivasan N. Comparative kinomics of Plasmodium organisms: unity in diversity. Protein Pept. Lett. 2007;14:509–517. [PubMed]
17. Parsons M, Worthey EA, Ward PN, Mottram JC. Comparative analysis of the kinomes of three pathogenic trypanosomatids: Leishmania major, Trypanosoma brucei and Trypanosoma cruzi. BMC Genomics. 2005;6:127. [PMC free article] [PubMed]
18. Doerig C, Billker O, Pratt D, Endicott J. Protein kinases as targets for antimalarial intervention: kinomics, structure-based design, transmission-blockade, and targeting host cell enzymes. Biochim. Biophys. Acta. 2005;1754:132–150. [PubMed]
19. Doerig C, Meijer L. Antimalarial drug discovery: targeting protein kinases. Expert Opin. Ther. Targets. 2007;11:279–290. [PubMed]
20. Naula C, Parsons M, Mottram JC. Protein kinases as drug targets in trypanosomes and Leishmania. Biochim. Biophys. Acta. 2005;1754:151–159. [PMC free article] [PubMed]
21. Nierman WC, Pain A, Anderson MJ, Wortman JR, Kim HS, Arroyo J, Berriman M, Abe K, Archer DB, Bermejo C, et al. Genomic sequence of the pathogenic and allergenic filamentous fungus Aspergillus fumigatus. Nature. 2005;438:1151–1156. [PubMed]
22. Galagan JE, Calvo SE, Cuomo C, Ma LJ, Wortman JR, Batzoglou S, Lee SI, Basturkmen M, Spevak CC, Clutterbuck J, et al. Sequencing of Aspergillus nidulans and comparative analysis with A. fumigatus and A. oryzae. Nature. 2005;438:1105–1115. [PubMed]
23. Pel HJ, de Winde JH, Archer DB, Dyer PS, Hofmann G, Schaap PJ, Turner G, de Vries RP, Albang R, Albermann K, et al. Genome sequencing and analysis of the versatile cell factory Aspergillus niger CBS 513.88. Nat. Biotechnol. 2007;25:221–231. [PubMed]
24. Machida M, Asai K, Sano M, Tanaka T, Kumagai T, Terai G, Kusumoto K, Arima T, Akita O, Kashiwagi Y, et al. Genome sequencing and analysis of Aspergillus oryzae. Nature. 2005;438:1157–1161. [PubMed]
25. Dujon B, Sherman D, Fischer G, Durrens P, Casaregola S, Lafontaine I, De Montigny J, Marck C, Neuveglise C, Talla E, et al. Genome evolution in yeasts. Nature. 2004;430:35–44. [PubMed]
26. Loftus BJ, Fung E, Roncaglia P, Rowley D, Amedeo P, Bruno D, Vamathevan J, Miranda M, Anderson IJ, Fraser JA, et al. The genome of the basidiomycetous yeast and human pathogen Cryptococcus neoformans. Science. 2005;307:1321–1324. [PMC free article] [PubMed]
27. Dean RA, Talbot NJ, Ebbole DJ, Farman ML, Mitchell TK, Orbach MJ, Thon M, Kulkarni R, Xu JR, Pan H, et al. The genome sequence of the rice blast fungus Magnaporthe grisea. Nature. 2005;434:980–986. [PubMed]
28. Galagan JE, Calvo SE, Borkovich KA, Selker EU, Read ND, Jaffe D, FitzHugh W, Ma LJ, Smirnov S, Purcell S, et al. The genome sequence of the filamentous fungus Neurospora crassa. Nature. 2003;422:859–868. [PubMed]
29. Martinez D, Larrondo LF, Putnam N, Gelpke MD, Huang K, Chapman J, Helfenbein KG, Ramaiya P, Detter JC, Larimer F, et al. Genome sequence of the lignocellulose degrading fungus Phanerochaete chrysosporium strain RP78. Nat. Biotechnol. 2004;22:695–700. [PubMed]
30. Kamper J, Kahmann R, Bolker M, Ma LJ, Brefort T, Saville BJ, Banuett F, Kronstad JW, Gold SE, Muller O, et al. Insights from the genome of the biotrophic fungal plant pathogen Ustilago maydis. Nature. 2006;444:97–101. [PubMed]
31. Arabidopsis.Genome.Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000;408:796–815. [PubMed]
32. Matsuzaki M, Misumi O, Shin IT, Maruyama S, Takahara M, Miyagishima SY, Mori T, Nishida K, Yagisawa F, Yoshida Y, et al. Genome sequence of the ultrasmall unicellular red alga Cyanidioschyzon merolae 10D. Nature. 2004;428:653–657. [PubMed]
33. Goff SA, Ricke D, Lan TH, Presting G, Wang R, Dunn M, Glazebrook J, Sessions A, Oeller P, Varma H, et al. A draft sequence of the rice genome (Oryza sativa L. ssp. japonica) Science. 2002;296:92–100. [PubMed]
34. Palenik B, Grimwood J, Aerts A, Rouze P, Salamov A, Putnam N, Dupont C, Jorgensen R, Derelle E, Rombauts S, et al. The tiny eukaryote Ostreococcus provides genomic insights into the paradox of plankton speciation. Proc. Natl Acad. Sci. USA. 2007;104:7705–7710. [PubMed]
35. Derelle E, Ferraz C, Rombauts S, Rouze P, Worden AZ, Robbens S, Partensky F, Degroeve S, Echeynie S, Cooke R, et al. Genome analysis of the smallest free-living eukaryote Ostreococcus tauri unveils many unique features. Proc. Natl. Acad. Sci. USA. 2006;103:11647–11652. [PubMed]
36. Tuskan GA, Difazio S, Jansson S, Bohlmann J, Grigoriev I, Hellsten U, Putnam N, Ralph S, Rombauts S, Salamov A, et al. The genome of black cottonwood, Populus trichocarpa (Torr. & Gray) Science. 2006;313:1596–1604. [PubMed]
37. Nene V, Wortman JR, Lawson D, Haas B, Kodira C, Tu ZJ, Loftus B, Xi Z, Megy K, Grabherr M, et al. Genome sequence of Aedes aegypti, a major arbovirus vector. Science. 2007;316:1718–1723. [PMC free article] [PubMed]
38. Holt RA, Subramanian GM, Halpern A, Sutton GG, Charlab R, Nusskern DR, Wincker P, Clark AG, Ribeiro JM, Wides R, et al. The genome sequence of the malaria mosquito Anopheles gambiae. Science. 2002;298:129–149. [PubMed]
39. Xia Q, Zhou Z, Lu C, Cheng D, Dai F, Li B, Zhao P, Zha X, Cheng T, Chai C, et al. A draft sequence for the genome of the domesticated silkworm (Bombyx mori) Science. 2004;306:1937–1940. [PubMed]
40. Lindblad-Toh K, Wade CM, Mikkelsen TS, Karlsson EK, Jaffe DB, Kamal M, Clamp M, Chang JL, Kulbokas E.J., III, Zody MC, et al. Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature. 2005;438:803–819. [PubMed]
41. Dehal P, Satou Y, Campbell RK, Chapman J, Degnan B, De Tomaso A, Davidson B, Di Gregorio A, Gelpke M, Goodstein DM, et al. The draft genome of Ciona intestinalis: insights into chordate and vertebrate origins. Science. 2002;298:2157–2167. [PubMed]
42. International.Chicken.Genome.Sequencing.Consortium. Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature. 2004;432:695–716. [PubMed]
43. Gibbs RA, Rogers J, Katze MG, Bumgarner R, Weinstock GM, Mardis ER, Remington KA, Strausberg RL, Venter JC, Wilson RK, et al. Evolutionary and biomedical insights from the rhesus macaque genome. Science. 2007;316:222–234. [PubMed]
44. Mikkelsen TS, Wakefield MJ, Aken B, Amemiya CT, Chang JL, Duke S, Garber M, Gentles AJ, Goodstadt L, Heger A, et al. Genome of the marsupial Monodelphis domestica reveals innovation in non-coding sequences. Nature. 2007;447:167–177. [PubMed]
45. Kasahara M, Naruse K, Sasaki S, Nakatani Y, Qu W, Ahsan B, Yamada T, Nagayasu Y, Doi K, Kasai Y, et al. The medaka draft genome and insights into vertebrate genome evolution. Nature. 2007;447:714–719. [PubMed]
46. Aparicio S, Chapman J, Stupka E, Putnam N, Chia JM, Dehal P, Christoffels A, Rash S, Hoon S, Smit A, et al. Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes. Science. 2002;297:1301–1310. [PubMed]
47. Jaillon O, Aury JM, Brunet F, Petit JL, Stange-Thomann N, Mauceli E, Bouneau L, Fischer C, Ozouf-Costaz C, Bernot A, et al. Genome duplication in the teleost fish Tetraodon nigroviridis reveals the early vertebrate proto-karyotype. Nature. 2004;431:946–957. [PubMed]
48. Gibbs RA, Weinstock GM, Metzker ML, Muzny DM, Sodergren EJ, Scherer S, Scott G, Steffen D, Worley KC, Burch PE, et al. Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature. 2004;428:493–521. [PubMed]
49. Chimpanzee.Sequencing.and.Analysis.Consortium. Initial sequence of the chimpanzee genome and comparison with the human genome. Nature. 2005;437:69–87. [PubMed]
50. Loftus B, Anderson I, Davies R, Alsmark UC, Samuelson J, Amedeo P, Roncaglia P, Berriman M, Hirt RP, Mann BJ, et al. The genome of the protist parasite Entamoeba histolytica. Nature. 2005;433:865–868. [PubMed]
51. Armbrust EV, Berges JA, Bowler C, Green BR, Martinez D, Putnam NH, Zhou S, Allen AE, Apt KE, Bechner M, et al. The genome of the diatom Thalassiosira pseudonana: ecology, evolution, and metabolism. Science. 2004;306:79–86. [PubMed]
52. Carlton JM, Hirt RP, Silva JC, Delcher AL, Schatz M, Zhao Q, Wortman JR, Bidwell SL, Alsmark UC, Besteiro S, et al. Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis. Science. 2007;315:207–212. [PMC free article] [PubMed]
53. Miranda-Saavedra D, Stark MJ, Packer JC, Vivares CP, Doerig C, Barton GJ. The complement of protein kinases of the microsporidium Encephalitozoon cuniculi in relation to those of Saccharomyces cerevisiae and Schizosaccharomyces pombe. BMC Genomics. 2007;8:309. [PMC free article] [PubMed]
54. Eddy SR. Profile hidden Markov models. Bioinformatics. 1998;14:755–763. [PubMed]
55. Clamp M, Cuff J, Searle SM, Barton GJ. The Jalview Java alignment editor. Bioinformatics. 2004;20:426–427. [PubMed]
56. Edgar RC. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics. 2004;5:113. [PMC free article] [PubMed]
57. Katoh K, Kuma K, Toh H, Miyata T. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 2005;33:511–518. [PMC free article] [PubMed]
58. Katoh K, Kuma K, Miyata T, Toh H. Improvement in the accuracy of multiple sequence alignment program MAFFT. Genome Inform. 2005;16:22–33. [PubMed]
59. Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, et al. Clustal W and Clustal X version 2.0. Bioinformatics. 2007;23:2947–2948. [PubMed]
60. Cole C, Barber JD, Barton GJ. The Jpred 3 secondary structure prediction server. Nucleic Acids Res. 2008;36:W197–W201. [PMC free article] [PubMed]
61. Cuff JA, Barton GJ. Application of multiple sequence alignment profiles to improve protein secondary structure prediction. Proteins. 2000;40:502–511. [PubMed]
62. Rice P, Longden I, Bleasby A. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet. 2000;16:276–277. [PubMed]
63. Wheeler GL, Miranda-Saavedra D, Barton GJ. Genome analysis of the unicellular green alga Chlamydomonas reinhardtii Indicates an ancient evolutionary origin for key pattern recognition and cell-signaling protein families. Genetics. 2008;179:193–197. [PubMed]
64. Williams JG, Zvelebil M. SH2 domains in plants imply new signalling scenarios. Trends Plant Sci. 2004;9:161–163. [PubMed]
65. Lanave C, Colangelo AM, Saccone C, Alberghina L. Molecular evolution of the neurotrophin family members and their Trk receptors. Gene. 2007;394:1–12. [PubMed]
66. Ortutay C, Valiaho J, Stenberg K, Vihinen M. KinMutBase: a registry of disease-causing mutations in protein kinase domains. Hum. Mutat. 2005;25:435–442. [PubMed]
67. Grassot J, Mouchiroud G, Perriere G. RTKdb: database of Receptor Tyrosine Kinase. Nucleic Acids Res. 2003;31:353–358. [PMC free article] [PubMed]
68. Niedner RH, Buzko OV, Haste NM, Taylor A, Gribskov M, Taylor SS. Protein kinase resource: an integrated environment for phosphorylation research. Proteins. 2006;63:78–86. [PubMed]
69. Krupa A, Abhinandan KR, Srinivasan N. KinG: a database of protein kinases in genomes. Nucleic Acids Res. 2004;32:D153–D155. [PMC free article] [PubMed]
70. Mi H, Lazareva-Ulitsky B, Loo R, Kejariwal A, Vandergriff J, Rabkin S, Guo N, Muruganujan A, Doremieux O, Campbell MJ, et al. The PANTHER database of protein families, subfamilies, functions and pathways. Nucleic Acids Res. 2005;33:D284–D288. [PMC free article] [PubMed]