The pax genes encode a family of transcription factors that have been conserved through evolution and play different roles in early development. This family is defined by the presence of a highly conserved motif of 128 amino acids, the paired-domain, which does not have any obvious sequence homology with other known protein domains. Nine members of the pax gene family have been isolated in vertebrates and are grouped into four distinct subfamilies on the basis of sequence similarity and structural domains [1-3]. The subfamily consisting of PAX2, PAX5 and PAX8 (PAX2/5/8) encodes transcription regulators that bind DNA via the amino-terminal paired-domain, whereas the carboxy-terminal region is required for trans-activation or repression of target genes. Detailed DNA binding studies led to the definition of a consensus recognition sequence that is bound by all members of this subfamily [4,5]. The pax2/5/8 genes are expressed in a spatially and temporally overlapping manner in the brain, eye, kidney and inner ear in several model organisms [6-9]. Notably, the members of this subfamily are the earliest known genes involved in inner ear development. In teleosts, pax8 is expressed in preotic cells by the early somitogenesis stages, followed by pax2 expression in the otic placode and vesicle, whereas pax5 is restricted to the utricular macula [10-12]. Although the roles of pax2/5/8 genes during ear development are partly illustrated by loss-of-function, mutant and gain-of-function analyses in fish [11,13-17], little is known about the direct downstream target genes of this PAX subfamily. In particular, although gene expression profiling comparing wild type and PAX2 mutants has already been performed in mouse embryos [18], this analysis was restricted to the identification of PAX2 targets in the midbrain-hindbrain boundary. A systematic discovery of specific PAX2/5/8 direct targets in the otic vesicle has not yet been performed.
We therefore aimed to identify PAX2/5/8 direct downstream targets, especially those involved in inner ear development. For this purpose, we opted for a novel approach that takes advantage of the vast amount of biological resources generated by large-scale experiments and available to the scientific community through public databases. On the one hand, numerous high-throughput gene expression pattern screens (for example, in vertebrates [19-23]) combined with massive whole-genome sequencing (for example, pioneering efforts with mammals [24-26]) have generated an invaluable resource of information concerning any given gene. On the other hand, a myriad of bioinformatics tools, from functional to comparative genomics [27,28], have emerged to extract and mine this information in a systematic way. We therefore took advantage of these bioinformatics tools to develop a strategy that combines a comparative genomics algorithm, gene expression pattern database queries and text mining.
Firstly, we ran an improved version of the previously described evolutionary double filtering (EDF) algorithm [29] on PAX2/5/8 position weight matrices (PWMs) [4,5] to predict PAX2/5/8 downstream targets in silico. This algorithm has been successfully applied to the discovery of ATH5 target genes [29], and its power lies in requiring only a single input, the PWM representing the binding site of the transcription factor of interest.
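To illustrate what it means to use a PWM as the sole input, the following minimal sketch scores every window of a DNA sequence against a log-odds PWM and reports the best match. It shows only the generic motif-scoring step, not the EDF algorithm itself. The count matrix, the uniform background frequency, the pseudocount and all function names are hypothetical values chosen for illustration; they are not the PAX2/5/8 matrices of [4,5].

```python
import math

# Hypothetical 4-position count matrix (toy values, NOT the PAX2/5/8 PWM).
TOY_COUNTS = {
    "A": [8, 1, 1, 1],
    "C": [1, 8, 1, 1],
    "G": [1, 1, 8, 1],
    "T": [1, 1, 1, 8],
}


def log_odds_pwm(counts, background=0.25, pseudocount=1.0):
    """Convert per-position base counts into log2 odds against a uniform background."""
    width = len(counts["A"])
    pwm = []
    for i in range(width):
        total = sum(counts[base][i] + pseudocount for base in "ACGT")
        column = {
            base: math.log2(((counts[base][i] + pseudocount) / total) / background)
            for base in "ACGT"
        }
        pwm.append(column)
    return pwm


def scan(sequence, pwm):
    """Score every window of the sequence; return (start, window, score) tuples."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        hits.append((start, window, score))
    return hits


def best_hit(sequence, pwm):
    """Return the highest-scoring window of the sequence."""
    return max(scan(sequence, pwm), key=lambda hit: hit[2])


toy_pwm = log_odds_pwm(TOY_COUNTS)
top_start, top_window, top_score = best_hit("TTACGTAA", toy_pwm)
```

In a real analysis, the toy matrix would be replaced by the published PWM, and candidate sites would be retained only above a chosen score threshold; the threshold and the downstream conservation filtering are outside the scope of this sketch.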
Secondly, from this primary list of in silico predicted PAX2/5/8 target genes, we extracted the subset of candidate genes likely to be specifically involved in otic vesicle development by selecting the genes that were either known to be expressed in the otic vesicle or cited in the context of otic vesicle development. This selection was performed by querying mouse and zebrafish expression pattern databases and by text mining of MEDLINE abstracts, respectively.
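The selection rule described here amounts to keeping a predicted gene if it has either kind of supporting evidence. A minimal sketch of that logic follows, assuming the database queries and the text mining have already been reduced to two sets of gene identifiers. The function name and the gene identifiers are hypothetical placeholders, not results from this study.

```python
def select_otic_candidates(predicted, expressed_in_otic, cited_with_otic):
    """Keep predicted targets supported by expression data OR literature evidence.

    predicted         -- ordered list of in silico predicted target genes
    expressed_in_otic -- genes annotated as expressed in the otic vesicle
    cited_with_otic   -- genes co-cited with otic vesicle development
    """
    supported = set(expressed_in_otic) | set(cited_with_otic)
    # Preserve the original ranking of the predicted list.
    return [gene for gene in predicted if gene in supported]


# Hypothetical identifiers, for illustration only.
predicted_targets = ["geneA", "geneB", "geneC", "geneD"]
expression_hits = {"geneB"}
literature_hits = {"geneD", "geneX"}

candidates = select_otic_candidates(predicted_targets, expression_hits, literature_hits)
```

Using a union rather than an intersection reflects the "either ... or" criterion in the text: a gene needs only one line of evidence to be retained, and genes with evidence but no in silico prediction (here `geneX`) are not added.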
Thirdly, to validate the putative PAX2/5/8 downstream targets in the otic vesicle predicted by this combination of in silico analyses, we carried out in vitro electrophoretic mobility shift assays and in vivo misexpression experiments in medaka. These experiments provide evidence that four predicted candidate genes (brn2 (pou3f2), claudin-7, sec31-like, ccdc102a and meteorin-like precursor) are new PAX2/5/8 direct target genes for otic development.