We identified 133 CDK family members: 123 from animals, plants, yeasts, and four protists for which genome sequences have been completed, and 10 additional CDKs from incomplete genome sequences of organisms with known CTD sequences (Table ). Although all sequences are included in our supplemental phylogenetic analysis (additional file 1), only 101 of them are included in the major phylogenetic analysis (Fig. ). Two sets of sequences are excluded: a large plant-specific amplification of CDK9-like kinases, whose phylogenetic weight disrupts the CDK9 sub-clade, and sequences from incomplete genomes (see Fig. and additional file 1 legends for further explanation). The nomenclature for kinases from Arabidopsis followed Joubès et al. (2000) and Vandepoele et al. (2002) [33] (Table ). The catalytic core base, Gly-rich motif and T-loop, required for characterized CDK function, appear to be conserved across all defined and putative kinase sequences analyzed (additional file 2). The 50% majority rule consensus tree of 4,000 likelihood trees, sampled from the posterior probability distribution from Bayesian phylogenetic inference, is shown in Figure . This tree provides strong support for grouping a number of previously uncharacterized CDKs, from a variety of organisms, with defined CDKs from animals and yeast. Overall, however, very little support is found for relationships among different CDK orthologous groups.
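As an illustration of what a 50% majority rule consensus summarizes, the following minimal Python sketch counts how often each bipartition (split) of the taxa occurs in a sample of trees and keeps those present in more than half of them. It is not the software used in this study. The nested-tuple tree encoding, the function names, and the five-taxon toy sample are assumptions made only for this example.

```python
from collections import Counter


def leaves(tree):
    """Return the set of leaf names in a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree, all_taxa, anchor):
    """Collect non-trivial bipartitions of one tree.

    Each split is stored as the side that does NOT contain `anchor`, so the
    same unrooted split is recorded identically however the tree is drawn.
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(walk(child) for child in node))
        side = all_taxa - clade if anchor in clade else clade
        # Skip trivial splits (a single leaf versus everything else).
        if 1 < len(side) < len(all_taxa) - 1:
            found.add(side)
        return clade

    walk(tree)
    return found


def split_support(trees):
    """Fraction of sampled trees that contain each split."""
    all_taxa = leaves(trees[0])
    anchor = min(all_taxa)  # arbitrary but fixed reference taxon
    counts = Counter()
    for tree in trees:
        counts.update(splits(tree, all_taxa, anchor))
    return {split: n / len(trees) for split, n in counts.items()}


def majority_rule(trees, threshold=0.5):
    """Splits whose support exceeds the threshold (50% by default)."""
    return {s: f for s, f in split_support(trees).items() if f > threshold}


# Hypothetical toy sample of three trees on taxa A-E.
toy_trees = [
    (("A", "B"), ("C", ("D", "E"))),
    (("A", "B"), ("D", ("C", "E"))),
    (("A", "C"), ("B", ("D", "E"))),
]

if __name__ == "__main__":
    for split, freq in majority_rule(toy_trees).items():
        print(sorted(split), round(freq, 2))
```

In this toy sample the split separating C, D and E from A and B, and the split separating D and E from the rest, each occur in two of the three trees and so pass the 50% threshold. Splits seen in only one tree are collapsed. In a Bayesian analysis the same frequencies, computed over the sampled trees, are the posterior support values reported on the consensus tree.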
CDK-related kinases used in this study.
Figure 1 Unrooted 50% majority consensus tree from 4,000 ML trees sampled from the Bayesian posterior probability distribution. Support values shown above each internode are from Bayesian inference/distance bootstrap, respectively. Only values above 50% are reported.
In this unrooted tree the highly diversified cell-cycle kinases defined in humans, CDKs 1-6, fall into a large cluster with 69% Bayesian support. This grouping includes CDKs from all organisms examined in the study. Among these putative cell-cycle CDKs, some plant and protistan kinases can be assigned with reasonable confidence to specific CDK groups. For example, apparent orthologs of human CDK1 are found in other animals (Drosophila), yeasts, both plants (Arabidopsis (Fig. ). Likewise, putative orthologs of CDK5 were identified in all organisms examined, except for the two plants (Fig. ). A number of other sequences, such as TbCrk2 and 3 from Trypanosoma, cluster with cell-cycle kinases but not clearly with any specific CDK family. Notably, and consistent with the results of Liu and Kipreos (2000) [32], CDK5 and PCTAIRE-like kinases from fungi and animals form a strongly supported group, indicating their close relationship (Fig. ).
In contrast to the cell-cycle kinases, our phylogenetic analyses failed to identify a clear ortholog of any transcription-related CDK in two of the complete genomes examined, Trypanosoma brucei and Giardia lamblia. These transcription-related CDKs fall into strongly supported clades of presumed orthologs of human CDKs 7-11. A well-defined CDK7 family is recovered, including sequences from yeasts, the microsporidian, plants, and animals. These are the primary groups that make up the "CTD-clade," in which the RNAP II CTD is invariably conserved (Fig. ). CDK7 shows an interesting sister relationship to HsCCRK from human and apparent orthologs from Drosophila. In Arabidopsis, four possible CDK7 orthologs were found, as reported previously by Shimotohno and colleagues (2003) [35]; however, AtCdkF (CAK1) is quite divergent from the core CDK7 family and is related specifically to HsCCRK in our analyses. PfMRK from Plasmodium, suggested previously to be a CDK7 [36], does not fall within the well-defined CDK7 group, but clusters with another Plasmodium kinase. The a priori hypothesis that PfMRK belongs in the core CDK7 group is strongly rejected with our data set in a likelihood paired-sites test.
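The text does not say which paired-sites test or software was used. As a hedged illustration of the general idea, the sketch below implements a Kishino-Hasegawa-style comparison with a normal approximation: per-site log-likelihoods under two a priori topologies are differenced, and the summed difference is compared with its estimated standard error. The function name and the input values are hypothetical, and the numbers are not data from this study.

```python
import math


def paired_sites_test(site_lnl_a, site_lnl_b):
    """Kishino-Hasegawa-style paired-sites comparison (normal approximation).

    site_lnl_a, site_lnl_b: per-site log-likelihoods of the same alignment
    under two tree topologies that were specified a priori.
    Returns (delta, z, p): the total log-likelihood difference, its z-score,
    and a two-sided p-value.
    """
    if len(site_lnl_a) != len(site_lnl_b):
        raise ValueError("both trees must be scored on the same sites")
    n = len(site_lnl_a)
    if n < 2:
        raise ValueError("at least two sites are needed")

    diffs = [a - b for a, b in zip(site_lnl_a, site_lnl_b)]
    delta = sum(diffs)
    mean = delta / n
    # Sample variance of the per-site differences.
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    # Standard error of the SUM of n site differences.
    se = math.sqrt(n * var)
    if se == 0:
        raise ValueError("per-site differences have zero variance")

    z = delta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return delta, z, p


if __name__ == "__main__":
    # Made-up per-site log-likelihoods for an unconstrained tree and for a
    # tree constrained to place a kinase inside a chosen clade.
    unconstrained = [-2.0, -1.5, -3.0, -2.2, -1.8, -2.6]
    constrained = [-2.5, -1.9, -3.6, -2.4, -2.1, -3.3]
    delta, z, p = paired_sites_test(unconstrained, constrained)
    print(f"delta lnL = {delta:.2f}, z = {z:.2f}, p = {p:.4f}")
```

A small p-value indicates that the constrained topology fits the alignment significantly worse than the alternative, which is the sense in which a placement hypothesis is "rejected." Published analyses often use resampling variants (for example RELL bootstrapping) rather than the normal approximation shown here.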
Figure 2 Hypothesis of RNA polymerase II evolution inferred from phylogenetic analyses of conserved regions A-H of RPB1 sequences. The tree displayed, after Stiller and Cook, had the highest likelihood of all trees sampled from the posterior probability distribution.
Likewise, GlCAKlike (gi: 292497120) has been proposed as a CDK7 from Giardia, based on nearest sequence similarity to Kin28 in a more limited comparison to CDK sequences from fission yeast [38]. In our expanded analyses of CDKs from 11 completed genomes, we find no evidence supporting an orthologous relationship to CDK7 for this or any other Giardia sequence. The a priori hypothesis that GlCAKlike belongs in the core CDK7 group is also strongly rejected in a likelihood paired-sites test.
A robust CDK8 family is recovered with strong support values in both distance bootstrap and Bayesian inference. Like CDK7, this family includes putative orthologs only from members of the "CTD-clade," specifically yeasts, animals and plants. Although the microsporidian Encephalitozoon is a member of the RNAP II "CTD-clade," TBlastN searches of its complete genome found six CDKs, none of which shows a phylogenetic affinity to CDK8.
A CDK9 grouping is also supported as monophyletic, with representative CDKs from yeasts, Encephalitozoon, animals, plants and Plasmodium. This group is divided into two well-defined sub-clades. One of them consists of BUR1 from yeast along with CDK9 orthologs from animals; the other contains CTK1 from yeast, CDC2L5 and CrkRS from human, and apparent orthologs from Drosophila, both plants, and Plasmodium. A putative CDK9 is also found in Encephalitozoon, but it falls at the base of the larger CDK9 grouping and does not associate clearly with either subgroup (Fig. ). Plants also contain a large number of putative CDKs that show strong phylogenetic affinity to CDK9 (additional file 1). These kinases appear to represent a plant-specific amplification of CDK9, although their functions have not been determined experimentally.
Human CDK10 and CDK11 group with apparent orthologs from other animals, plants, fission yeast, and PfCRK1 from Plasmodium. Once again, no kinases from either Trypanosoma or Giardia show any phylogenetic affinity to this group.