Homeobox genes constitute an ancient superclass of regulatory genes with diverse developmental functions [1]. The homeobox, which encodes a helix-turn-helix DNA-binding motif known as the homeodomain, originated prior to the evolutionary split between plants, fungi, and metazoans [2]. The homeodomain is commonly 60 amino acids in length, though recognizable homeodomains may be as long as 97 or as short as 54 amino acids (reviewed in [3]).
Based on phylogenetic analyses and chromosomal mapping studies, animal homeodomains can be divided among ten distinct classes: ANTP, CUT, HNF, LIM, POU, PRD, PROS, SINE, TALE, and ZF [3-16]. The ANTP and PRD classes are substantially larger than the other classes, and these two classes are thought to be sister clades [5,7]. Within the ANTP class, there is evidence for a monophyletic subclass comprising Hox-related genes [4,7]. The PRD class can be divided into subclasses based on the amino acid present at position 50 of the homeodomain (Q50, K50, or S50), but these subclasses do not appear to represent monophyletic groups [5,7]. The remaining eight homeodomain classes are significantly smaller than the ANTP and PRD classes, and they are thought to have emerged as a series of lineages basal to an ANTP-PRD clade [6]. To this point, the HNF class has only been reported from vertebrates [6]. Structural and functional properties of the homeodomain appear largely conserved within these homeodomain classes [4]. The homeodomain sequences encoded by orthologous homeobox genes are often so highly conserved that orthology between protostomes and deuterostomes, and even between bilaterians and non-bilaterians, is readily apparent [17].
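The position-50 rule used to subdivide the PRD class can be illustrated with a minimal Python sketch. The function name `prd_subclass` and the placeholder sequence are hypothetical, and the sketch assumes a canonical 60-residue homeodomain numbered from 1 with no insertions; atypical homeodomains would first need to be aligned to the canonical numbering.

```python
def prd_subclass(homeodomain: str) -> str:
    """Label a homeodomain by the residue at position 50 (1-based).

    Assumes the input is a canonical 60-residue homeodomain without
    insertions, so position 50 is simply index 49 of the string.
    """
    if len(homeodomain) < 50:
        raise ValueError("sequence is too short to contain position 50")
    residue = homeodomain[49].upper()
    # Only Q, K, and S define the subclasses named in the text.
    if residue in {"Q", "K", "S"}:
        return f"{residue}50"
    return "other"


# Synthetic placeholder sequence (not a real homeodomain): lysine at position 50.
placeholder = "A" * 49 + "K" + "A" * 10
print(prd_subclass(placeholder))
```

Because the subclasses do not appear to be monophyletic, a label of this kind describes a sequence feature only and should not be read as a statement about relatedness.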
The ANTP, PRD, CUT, LIM, POU, PROS, SINE, TALE, and ZF classes are known from both protostome and deuterostome metazoans [3]. Therefore, we can trace their origins to the protostome-deuterostome ancestor, which a recent estimate places at some 579 to 700 million years ago (Figure ) [18]. Identification of these homeobox classes in outgroup taxa would indicate even greater antiquity. For example, molecular clock estimates based on maximum likelihood and minimum evolution suggest that the cnidarian-bilaterian divergence predated the protostome-deuterostome divergence by 25 to 48 million years [18].
Establishing the antiquity of homeobox genes is critical to understanding the role of these genes in metazoan evolution. The functional diversification of homeobox genes, by gene duplication and divergence, or by cis-regulatory evolution, has been touted as an important mechanism in the evolution of diverse body plans and organs in bilaterian metazoans [6,19-25]. The Cnidaria is the likely sister group of the Bilateria [26,27], and since their divergence from a common ancestor, these two lineages have undergone very different evolutionary trajectories (Figure ). The bilaterian ancestor has spawned over 30 distinct phyla comprising more than one million extant species; the cnidarian ancestor has spawned some 10,000 extant species, all comfortably housed in a single phylum [28]. The maximum complexity and morphological diversity of cnidarian body plans (for example, sea anemones, sea pens, corals, hydras, and jellyfishes) is modest when compared to the maximum complexity and morphological diversity of bilaterian body plans (for example, vertebrates, sea squirts, sea urchins, insects, nematodes, octopi, and phoronids [25,29]). Taking into account the presumed importance of homeobox genes in the morphological diversification of bilaterians, the close evolutionary relationship between the Bilateria and the Cnidaria, and the contrasting evolutionary trajectories of these two lineages, a comparison of cnidarians and bilaterians becomes critical for understanding the significance of homeobox genes in the morphological diversification of animal body plans.
Here, we seek to identify homeobox genes that were present in the cnidarian-bilaterian ancestor using phylogenetic analysis of homeodomains from bilaterians and cnidarians. Our analysis takes advantage of the curated genomic datasets of the fruit fly Drosophila melanogaster [30-34] and Homo sapiens [35,36] as well as the recently completed rough draft genome of the sea anemone Nematostella vectensis, a representative cnidarian (Joint Genome Institute; D Rokhsar, principal investigator).
The phylogenetic analyses presented here reveal the extent to which the homeobox gene superclass had radiated prior to the evolutionary split between Cnidaria and Bilateria. For example, at one extreme, the Cnidaria could have diverged from the Bilateria prior to the origin of the aforementioned homeobox classes (ANTP, PRD, LIM, POU, and so on). If so, then the cnidarian homeobox genes and the bilaterian homeobox genes would constitute independent radiations on the phylogeny (Figure ). This possibility is ruled out by published studies that have identified distinct ANTP, POU, PRD, and SINE homeodomains in the Cnidaria [5,17,37-45]. Alternatively, the Cnidaria could have diverged from the Bilateria after the origin of the class founder genes (for example, the ancestral ANTP class gene, the ancestral PRD class gene, and so on), but prior to the subsequent radiations of these classes. In this case, the cnidarian and bilaterian class radiations would constitute mutually exclusive monophyletic groups (Figure ). However, if the homeobox classes had undergone extensive radiations prior to the cnidarian-bilaterian divergence, then the same homeobox families would be represented in cnidarian and bilaterian genomes (Figure ). Finally, it might also be the case that some homeobox classes had radiated prior to the cnidarian-bilaterian divergence, while other classes had not (Figure ).
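The distinction among these scenarios comes down to whether family-level groupings within a class contain both cnidarian and bilaterian members. The following Python sketch illustrates that bookkeeping only; it is not the phylogenetic method used in this study. The function names, the class and family labels (`CLASS1`, `famA`, and so on), and the family memberships are all hypothetical, and the sketch assumes that families have already been delimited by a separate tree-based analysis.

```python
# Taxon sets for the three genomes compared in the text.
CNIDARIA = {"Nematostella vectensis"}
BILATERIA = {"Drosophila melanogaster", "Homo sapiens"}


def family_is_shared(species_in_family):
    """True if a family has at least one cnidarian and one bilaterian member."""
    taxa = set(species_in_family)
    return bool(taxa & CNIDARIA) and bool(taxa & BILATERIA)


def summarize_classes(families):
    """Count shared and total families per homeobox class.

    `families` maps (class_name, family_name) to the species in which a
    member of that family was found. Returns {class_name: (shared, total)}.
    """
    summary = {}
    for (class_name, _family_name), species in families.items():
        shared, total = summary.get(class_name, (0, 0))
        summary[class_name] = (shared + int(family_is_shared(species)), total + 1)
    return summary


# Hypothetical input: placeholder labels, not results from this study.
example = {
    ("CLASS1", "famA"): ["Nematostella vectensis", "Homo sapiens"],
    ("CLASS1", "famB"): ["Drosophila melanogaster", "Homo sapiens"],
    ("CLASS2", "famC"): ["Nematostella vectensis"],
}
for class_name, (shared, total) in summarize_classes(example).items():
    print(class_name, shared, total)
```

Under this bookkeeping, a class with one or more shared families is consistent with radiation before the cnidarian-bilaterian divergence, whereas a class with none is consistent with radiation after it. A mixture of the two outcomes across classes corresponds to the final scenario described above.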
The phylogenetic analyses presented here reveal that the ANTP, PRD, LIM, SINE, and POU classes had radiated extensively prior to the divergence of the Cnidaria and the Bilateria. The HNF class, formerly known only from vertebrates, is also represented in the Nematostella genome. In addition, we identify a putative CUT class gene in Nematostella by searching the predicted gene database at StellaBase [46,47]. Our analyses fail to identify ZF or PROS homeodomains in Nematostella. The phylogenetic analyses reveal 56 distinct homeodomain families that appear to be shared by Nematostella and one or both of the bilaterian taxa.