The ParaHox gene cluster was first discovered in the invertebrate chordate amphioxus (Branchiostoma floridae) [1]. ParaHox gene clustering is conserved in humans and other tetrapods, but disrupted in several other chordates [2-4]. The ancestral condition for the chordates is, however, clearly one of possession of a ParaHox cluster, which has been conserved since the Cambrian along both the cephalochordate and tetrapod lineages.
The organisation of the cluster and the phylogenetic relationships of its component genes (Gsx, Xlox and Cdx) relative to the Hox cluster genes are consistent with a paralogous relationship between the ParaHox and Hox clusters, that is, they are evolutionary sisters. Whilst Hox gene clustering is widespread across the bilaterians, the ParaHox cluster has only been found in chordates thus far (as well as one example from the probable sister group to the bilaterians, the cnidarians [5]). In protostomes that have been examined to date (insects and nematodes) the ParaHox cluster does not exist, and one or more ParaHox genes have been lost. This ParaHox gene loss is clearly a secondarily derived condition for protostomes, since all three ParaHox genes are present in a variety of lophotrochozoan protostomes, including annelids and molluscs [6-9]. Whilst the ancestral presence of all three ParaHox genes is now well established for protostomes, the genomic organisation of the genes in an animal that is not as derived as insects and nematodes, and which still retains all three genes, has not been determined.
The ordered clustering of the Hox genes is related to their expression and function, at least in the vertebrates. The order of the genes along the chromosome corresponds to the order of the gene expression domains along the embryonic anterior-posterior axis: the phenomenon of colinearity. Given the paralogous relationship between the ParaHox and Hox clusters, and the retention of clustering in both, there is a distinct possibility that the organisation of the ParaHox cluster also relates to the expression and function of its component genes, as it does in the Hox cluster.
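As an illustration only, spatial colinearity can be expressed as a simple ordering check: the genes, read in their chromosomal order, should have expression domains that run from anterior to posterior. The sketch below assumes hypothetical ordinal ranks (0 = most anterior) rather than measured expression boundaries, and the function name and data layout are our own illustrative choices, not part of any published analysis.

```python
from typing import Dict, List


def is_spatially_colinear(cluster_order: List[str],
                          anterior_rank: Dict[str, int]) -> bool:
    """Return True if genes listed in chromosomal order have
    non-decreasing anterior-to-posterior expression ranks."""
    ranks = [anterior_rank[gene] for gene in cluster_order]
    # Compare each gene's rank with that of its chromosomal neighbour.
    return all(a <= b for a, b in zip(ranks, ranks[1:]))


# Hypothetical ordinal ranks (0 = most anterior); illustrative values only,
# reflecting a Gsx-anterior, Xlox-central, Cdx-posterior arrangement.
cluster_order = ["Gsx", "Xlox", "Cdx"]
anterior_rank = {"Gsx": 0, "Xlox": 1, "Cdx": 2}

print(is_spatially_colinear(cluster_order, anterior_rank))
```

A gene order that does not match the order of the expression domains (for example, listing Cdx before Gsx with the same ranks) would fail this check. Temporal colinearity could be tested in the same way by substituting ranks of activation time for ranks of axial position.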
ParaHox gene organisation and expression have been widely examined within deuterostomes. The prototypical ParaHox cluster of amphioxus exhibits spatial colinearity, with AmphiGsx expressed in the anterior central nervous system (CNS), AmphiXlox in a more central region of the CNS and the developing gut, and AmphiCdx at the posterior end of the larva in both the CNS and gut [1, 10]. This has distinct similarities to vertebrate ParaHox gene expression. Vertebrate Gsx genes (usually called Gsh1 and Gsh2) have anterior boundaries of expression in the brain with extensive expression posteriorly into the neural tube, in a dorso-ventrally restricted fashion that may be comparable to that in Drosophila [11-17]. Vertebrate Xlox genes (with synonyms of PDX1, IPF1, IDX1, XlHbox8, STF1 or MODY4) are expressed in the gut during pancreas development [18-22] and in the CNS [23-25]. Vertebrate Cdx genes are predominantly posterior patterning genes, expressed in the CNS, mesoderm and gut [25-28]. In invertebrate deuterostomes, apart from amphioxus, the ParaHox cluster has broken apart [2, 29] but there are still elements of the spatial restriction and tissue specificity of ParaHox expression. In urochordates Gsx is expressed in a small domain in the anterior CNS [30], Xlox (called Ci-IPF1) is in mesenchymal cells and some cells of the CNS [31], and Cdx patterns the posterior tadpole tail and is expressed in the hindgut of post-metamorphic animals [32, 33]. In the echinoderm Strongylocentrotus purpuratus, Gsx is expressed in a small patch of putative nerve cells, whilst Xlox and Cdx have staggered expression domains in the posterior gut tube [29], and in a starfish, Archaster typicus, Xlox is expressed throughout the early archenteron and a few vegetal ectodermal cells [34]. In summary, deuterostome Gsx tends to be expressed solely in the CNS with a rostral anterior limit, Xlox is expressed both in the CNS and the developing gut, in central regions such as the pancreas of vertebrates, and Cdx is expressed in more posterior regions of the CNS and gut. Whether the deuterostome ParaHox genes exist in an intact or broken cluster may depend on the regulatory mechanism(s) controlling the temporal activation of the genes [2, 35].
ParaHox gene expression has been more sparsely sampled in protostomes, apart from Cdx (or caudal). Gsx expression has been documented in the insects Drosophila and Tribolium, and the polychaetes Capitella and Nereis virens. Insect Gsx is expressed along a pair of medio-laterally restricted neural columns, and has a role in neuronal patterning [11, 36]. There are also domains of expression in the head region of these insects that have yet to be fully characterised, but do include expression in neural cells [11, 36, 37]. In contrast, expression of Gsx in the polychaete Capitella is restricted to a small domain close to the anterior end of the CNS [7]. This is very different to the spatially and temporally dynamic expression of Gsx in the nereid polychaetes, Nereis virens [38] and Platynereis dumerilii (described below). The central ParaHox gene, Xlox, is missing from all ecdysozoan genomes sequenced to date, but is present in lophotrochozoans. In the leeches Helobdella triserialis and Hirudo medicinalis, Xlox (named HtrA2 or Lox3) is expressed throughout the midgut, as is also the case for the polychaete Capitella [7, 39, 40]. No neural Xlox expression has been described in these annelids. Nereid Xlox expression also has a midgut component but, in contrast to these other annelids, nereid Xlox is also expressed in the CNS [38] (and see below). In contrast to the sparse data on protostome Gsx and Xlox expression, Cdx has been examined in a large variety of taxa. First characterised as a posterior patterning gene acting early in the segmentation gene cascade in Drosophila (in which the gene is called caudal) [41], Cdx has subsequently been studied in many other arthropods [42-51]. Broadly, Cdx is a posterior patterning gene in all of these animals, as it also is in the nematode Caenorhabditis elegans (where the gene is called pal-1) [52] and the mollusc Patella [53]. In the annelids Platynereis ([54] and herein), Nereis [38], Tubifex [55] and Capitella [7] there are both anterior and posterior expression domains of Cdx (see Discussion). There is little data on the genomic organisation of protostome ParaHox genes and how it may relate to their expression.
Here we provide the first description of the expression patterns for all three ParaHox genes for a protostome animal in relation to their genomic organisation, in the polychaete P. dumerilii. Clustering of protostome ParaHox genes is shown for the first time, which reveals that some clustering of ParaHox genes has been conserved on both the protostome and deuterostome lineages. The P. dumerilii ParaHox cluster is not, however, entirely intact. The posterior member, Pdu-Cdx, has been separated from the other two genes, Pdu-Gsx and Pdu-Xlox, and the two parts of the Platynereis ParaHox cluster now reside on opposite ends of the same chromosome arm. Comparison of the genes neighbouring the Platynereis ParaHox genes with the map positions of the mammalian orthologues allows the reconstruction of the genomic region surrounding the ParaHox cluster of the protostome-deuterostome ancestor (PDA), an extinct animal that lived over 550 million years ago. The details of the Platynereis ParaHox gene expression patterns, by comparison with those of other animals, imply a complex role for Gsx in the PDA's CNS and a possible function in protostome mouth development, a role for Xlox in CNS patterning as well as gut development, and a complex and dynamic pattern of Cdx expression in polychaetes that correlates with the relocation of the gene out of the cluster.