1.  Enigmatic Orthology Relationships between Hox Clusters of the African Butterfly Fish and Other Teleosts Following Ancient Whole-Genome Duplication 
Molecular Biology and Evolution  2014;31(10):2592-2611.
Numerous ancient whole-genome duplications (WGD) have occurred during eukaryote evolution. In vertebrates, duplicated developmental genes and their functional divergence have had important consequences for morphological evolution. Although two vertebrate WGD events (1R/2R) occurred over 525 Ma, we have focused on the more recent 3R or TGD (teleost genome duplication) event which occurred approximately 350 Ma in a common ancestor of over 26,000 species of teleost fishes. Through a combination of whole genome and bacterial artificial chromosome clone sequencing we characterized all Hox gene clusters of Pantodon buchholzi, a member of the early branching teleost subdivision Osteoglossomorpha. We find 45 Hox genes organized in only five clusters indicating that Pantodon has suffered more Hox cluster loss than other known species. Despite strong evidence for homology of the five Pantodon clusters to the four canonical pre-TGD vertebrate clusters (one HoxA, two HoxB, one HoxC, and one HoxD), we were unable to confidently resolve 1:1 orthology relationships between four of the Pantodon clusters and the eight post-TGD clusters of other teleosts. Phylogenetic analysis revealed that many Pantodon genes segregate outside the conventional “a” and “b” post-TGD orthology groups, that extensive topological incongruence exists between genes physically linked on a single cluster, and that signal divergence causes ambivalence in assigning 1:1 orthology in concatenated Hox cluster analyses. Out of several possible explanations for this phenomenon we favor a model which keeps with the prevailing view of a single TGD prior to teleost radiation, but which also considers the timing of diploidization after duplication, relative to speciation events. We suggest that although the duplicated hoxa clusters diploidized prior to divergence of osteoglossomorphs, the duplicated hoxb, hoxc, and hoxd clusters concluded diploidization independently in osteoglossomorphs and other teleosts. 
We use the term “tetralogy” to describe the homology relationship which exists between duplicated sequences which originate through a shared WGD, but which diploidize into distinct paralogs from a common allelic pool independently in two lineages following speciation.
PMCID: PMC4166920  PMID: 24974377
tetraploidy; diploidization; homeobox; WGD; Pantodon; teleost
2.  The fate of the duplicated androgen receptor in fishes: a late neofunctionalization event? 
Based on the observation of an increased number of paralogous genes in teleost fishes compared with other vertebrates and on the conserved synteny between duplicated copies, it has been shown that a whole genome duplication (WGD) occurred during the evolution of Actinopterygian fish. Comparative phylogenetic dating of this duplication event suggests that it occurred early on, specifically in teleosts. It has been proposed that this event might have facilitated the evolutionary radiation and the phenotypic diversification of the teleost fish, notably by allowing the sub- or neo-functionalization of many duplicated genes.
In this paper, we studied in a wide range of Actinopterygians the duplication and fate of the androgen receptor (AR, NR3C4), a nuclear receptor known to play a key role in sex-determination in vertebrates. The pattern of AR gene duplication is consistent with an early WGD event: it has been duplicated into two genes AR-A and AR-B after the split of the Acipenseriformes from the lineage leading to teleost fish but before the divergence of Osteoglossiformes. Genomic and syntenic analyses in addition to lack of PCR amplification show that one of the duplicated copies, AR-B, was lost in several basal Clupeocephala such as Cypriniformes (including the model species zebrafish), Siluriformes, Characiformes and Salmoniformes. Interestingly, we also found that, in basal teleost fish (Osteoglossiformes and Anguilliformes), the two copies remain very similar, whereas, specifically in Percomorphs, one of the copies, AR-B, has accumulated substitutions in both the ligand binding domain (LBD) and the DNA binding domain (DBD).
The comparison of the mutations present in these divergent AR-B with those known in human to be implicated in complete, partial or mild androgen insensitivity syndrome suggests that the existence of two distinct AR duplicates may be correlated to specific functional differences that may be connected to the well-known plasticity of sex determination in fish. This suggests that three specific events have shaped the present diversity of ARs in Actinopterygians: (i) early WGD, (ii) parallel loss of one duplicate in several lineages and (iii) putative neofunctionalization of the same duplicate in percomorphs, which occurred a long time after the WGD.
PMCID: PMC2637867  PMID: 19094205
3.  Temporal pattern of loss/persistence of duplicate genes involved in signal transduction and metabolic pathways after teleost-specific genome duplication 
Recent genomic studies have revealed that a teleost-specific third-round whole genome duplication (3R-WGD) event occurred in a common ancestor of teleost fishes. However, it is unclear how the genes duplicated in this event were lost or persisted during the diversification of teleosts, and therefore, how many of the duplicated genes contribute to the genetic differences among teleosts. This subject is also important for understanding the process of vertebrate evolution through WGD events. We applied a comparative evolutionary approach to this question by focusing on the genes involved in long-term potentiation, taste and olfactory transduction, and the tricarboxylic acid cycle, based on the whole genome sequences of four teleosts: zebrafish, medaka, stickleback, and green spotted puffer fish.
We applied a state-of-the-art method of maximum-likelihood phylogenetic inference and conserved synteny analyses to each of 130 human genes involved in the above biological systems. These analyses identified 116 orthologous gene groups between teleosts and tetrapods, and 45 pairs of 3R-WGD-derived duplicate genes among them. This suggests that more than half [(45×2)/(116+45) = 56.5%] of the loci, probably more than ten thousand genes, present in a common ancestor of the four teleosts were still duplicated after the 3R-WGD. The estimated temporal pattern of gene loss suggested that, after the 3R-WGD, many (71/116) of the duplicated genes were rapidly lost during the initial 75 million years (MY), whereas on average more than half (27.3/45) of the duplicated genes remaining in the ancestor of the four teleosts (45/116) have persisted for about 275 MY. The 3R-WGD-derived duplicates that have persisted for long evolutionary periods had a significantly larger number of interacting partners and longer protein-coding sequences, implying that they tend to be more multifunctional than the singletons after the 3R-WGD.
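The ratios quoted in this abstract can be recomputed directly from the counts it reports. The sketch below is only an arithmetic check, not the authors' analysis; the function and variable names are hypothetical and chosen for illustration. Note that the expression (45×2)/(116+45), evaluated as written, comes to about 55.9%, slightly below the 56.5% quoted; the abstract does not say how its figure was rounded or derived.

```python
from fractions import Fraction

# Counts taken from the abstract; the names are illustrative assumptions.
ORTHOLOG_GROUPS = 116      # orthologous gene groups between teleosts and tetrapods
DUPLICATE_PAIRS = 45       # groups still present as 3R-WGD-derived pairs
LOST_EARLY = 71            # groups that returned to single copy in the first ~75 MY
MEAN_PERSISTING = 27.3     # average number of pairs persisting for ~275 MY


def duplicated_locus_fraction(groups: int, pairs: int) -> Fraction:
    """Share of loci that belong to a retained duplicate pair.

    Each retained pair contributes two loci, and the total number of loci
    is one per group plus one extra per retained pair.
    """
    return Fraction(2 * pairs, groups + pairs)


def as_percent(value) -> float:
    """Convert a ratio to a percentage rounded to one decimal place."""
    return round(float(value) * 100, 1)


duplicated_loci = duplicated_locus_fraction(ORTHOLOG_GROUPS, DUPLICATE_PAIRS)
early_loss = Fraction(LOST_EARLY, ORTHOLOG_GROUPS)
retained_in_ancestor = Fraction(DUPLICATE_PAIRS, ORTHOLOG_GROUPS)
long_term_persistence = MEAN_PERSISTING / DUPLICATE_PAIRS

print(f"duplicated loci:        {as_percent(duplicated_loci)}%")
print(f"lost within ~75 MY:     {as_percent(early_loss)}%")
print(f"retained in ancestor:   {as_percent(retained_in_ancestor)}%")
print(f"persisting for ~275 MY: {as_percent(long_term_persistence)}%")
```

The early-loss and retained counts are complementary (71 + 45 = 116), which is why the two corresponding percentages sum to 100.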
We have shown for the first time the temporal pattern of the gene loss process after the 3R-WGD on the basis of teleost phylogeny and divergence time frameworks. The 3R-WGD-derived duplicates have not undergone constant exponential decay, suggesting that selection favoured the long-term persistence of a subset of duplicates that tend to be multi-functional. On the basis of these results obtained from the analysis of 116 orthologous gene groups, we propose that more than ten thousand 3R-WGD-derived duplicates have experienced lineage-specific evolution, that is, differential sub-/neo-functionalization or secondary loss between lineages, and have contributed to teleost diversity.
PMCID: PMC2702319  PMID: 19500364
4.  An Independent Genome Duplication Inferred from Hox Paralogs in the American Paddlefish—A Representative Basal Ray-Finned Fish and Important Comparative Reference 
Genome Biology and Evolution  2012;4(9):937-953.
Vertebrates have experienced two rounds of whole-genome duplication (WGD) in the stem lineages of deep nodes within the group and a subsequent duplication event in the stem lineage of the teleosts—a highly diverse group of ray-finned fishes. Here, we present the first full Hox gene sequences for any member of the Acipenseriformes, the American paddlefish, and confirm that an independent WGD occurred in the paddlefish lineage, approximately 42 Ma based on sequences spanning the entire HoxA cluster and eight genes on the HoxD gene cluster. These clusters comprise different HOX loci and maintain conserved synteny relative to bichir, zebrafish, stickleback, and pufferfish, as well as human, mouse, and chick. We also provide a gene genealogy for the duplicated fzd8 gene in paddlefish and present evidence for the first Hox14 gene in any ray-finned fish. Taken together, these data demonstrate that the American paddlefish has an independently duplicated genome. Substitution patterns of the “alpha” paralogs on both the HoxA and HoxD gene clusters suggest transcriptional inactivation consistent with functional diploidization. Further, there are similarities in the pattern of sequence divergence among duplicated Hox genes in paddlefish and teleost lineages, even though they occurred independently approximately 200 Myr apart. We highlight implications on comparative analyses in the study of the “fin-limb transition” as well as gene and genome duplication in bony fishes, which includes all ray-finned fishes as well as the lobe-finned fishes and tetrapod vertebrates.
PMCID: PMC3509897  PMID: 22851613
Polyodon spathula; whole-genome duplication; WGD; rate asymmetry; paralog retention; fin-limb transition
5.  Vertebrate Vitellogenin Gene Duplication in Relation to the “3R Hypothesis”: Correlation to the Pelagic Egg and the Oceanic Radiation of Teleosts 
PLoS ONE  2007;2(1):e169.
The spiny ray-finned teleost fishes (Acanthomorpha) are the most successful group of vertebrates in terms of species diversity. Their meteoric radiation and speciation in the oceans during the late Cretaceous and Eocene epoch is unprecedented in vertebrate history, occurring in one third of the time for similar diversity to appear in the birds and mammals. The success of marine teleosts is even more remarkable considering their long freshwater ancestry, since it implies solving major physiological challenges when freely broadcasting their eggs in the hyper-osmotic conditions of seawater. Most extant marine teleosts spawn highly hydrated pelagic eggs, due to differential proteolysis of vitellogenin (Vtg)-derived yolk proteins. The maturational degradation of Vtg involves depolymerization of mainly the lipovitellin heavy chain (LvH) of one form of Vtg to generate a large pool of free amino acids (FAA 150–200 mM). This organic osmolyte pool drives hydration of the oocyte while still protected within the maternal ovary. In the present contribution, we have used Bayesian analysis to examine the evolution of vertebrate Vtg genes in relation to the “3R hypothesis” of whole genome duplication (WGD) and the functional end points of LvH degradation during oocyte maturation. We find that teleost Vtgs have experienced a post-R3 lineage-specific gene duplication to form paralogous clusters that correlate to the pelagic and benthic character of the eggs. Neo-functionalization allowed one paralogue to be proteolyzed to FAA driving hydration of the maturing oocytes, which pre-adapts them to the marine environment and causes them to float. The timing of these events matches the appearance of the Acanthomorpha in the fossil record. We discuss the significance of these adaptations in relation to ancestral physiological features, and propose that the neo-functionalization of duplicated Vtg genes was a key event in the evolution and success of the teleosts in the oceanic environment.
PMCID: PMC1770952  PMID: 17245445
6.  Evolution of ligand specificity in vertebrate corticosteroid receptors 
Corticosteroid receptors include mineralocorticoid (MR) and glucocorticoid (GR) receptors. Teleost fishes have a single MR and duplicate GRs that show variable sensitivities to mineralocorticoids and glucocorticoids. How these receptors compare functionally to tetrapod MR and GR, and the evolutionary significance of maintaining two GRs, remains unclear.
We used up to seven steroids (including aldosterone, cortisol and 11-deoxycorticosterone [DOC]) to compare the ligand specificity of the ligand binding domains of corticosteroid receptors between a mammal (Mus musculus) and the midshipman fish (Porichthys notatus), a teleost model for steroid regulation of neural and behavioral plasticity. Variation in mineralocorticoid sensitivity was considered in a broader phylogenetic context by examining the aldosterone sensitivity of MR and GRs from the distantly related daffodil cichlid (Neolamprologus pulcher), another teleost model for neurobehavioral plasticity. Both teleost species had a single MR and duplicate GRs. All MRs were sensitive to DOC, consistent with the hypothesis that DOC was the initial ligand of the ancestral MR. Variation in GR steroid-specificity corresponds to nine identified amino acid residue substitutions rather than phylogenetic relationships based on receptor sequences.
The mineralocorticoid sensitivity of duplicate GRs in teleosts is highly labile in the context of their evolutionary phylogeny, a property that likely led to neo-functionalization and maintenance of two GRs.
PMCID: PMC3025851  PMID: 21232159
7.  Distribution of ancestral proto-Actinopterygian chromosome arms within the genomes of 4R-derivative salmonid fishes (Rainbow trout and Atlantic salmon) 
BMC Genomics  2008;9:557.
Comparative genomic studies suggest that the modern day assemblage of ray-finned fishes has descended from an ancestral grouping of fishes that possessed 12–13 linkage groups. All jawed vertebrates are postulated to have experienced two whole genome duplications (WGD) in their ancestry (2R duplication). Salmonids have experienced one additional WGD (4R duplication event) compared to most extant teleosts, which underwent a further 3R WGD compared to other vertebrates. We describe the organization of the 4R chromosomal segments of the proto-ray-finned fish karyotype in Atlantic salmon and rainbow trout based upon their comparative syntenies with two model species of 3R ray-finned fishes.
Evidence is presented for the retention of large whole-arm affinities between the ancestral linkage groups of the ray-finned fishes, and the 50 homeologous chromosomal segments in Atlantic salmon and rainbow trout. In the comparisons between the two salmonid species, there is also evidence for the retention of large whole-arm homeologous affinities that are associated with the retention of duplicated markers. Five of the 7 pairs of chromosomal arm regions expressing the highest level of duplicate gene expression in rainbow trout share homologous synteny to the 5 pairs of homeologs with the greatest duplicate gene expression in Atlantic salmon. These regions are derived from proto-Actinopterygian linkage groups B, C, E, J and K.
Two chromosome arms in Danio rerio and Oryzias latipes (descendants of the 3R duplication) can, in most instances, be related to at least 4 whole or partial chromosomal arms in the salmonid species. Multiple arm assignments in the two salmonid species do not clearly support a 13 proto-linkage group model, and suggest that a 12 proto-linkage group arrangement (i.e., a separate single chromosome duplication and ancestral fusion/fissions/recombination within the putative G/H/I groupings) may have occurred in the more basal soft-rayed fishes. We also found evidence supporting the model that ancestral linkage group M underwent a single chromosome duplication following the 3R duplication. In the salmonids, the M ancestral linkage groups are localized to 5 whole arm and 3 partial arm regions (i.e., 6 whole arm regions expected). Thus, 3 distinct ancestral linkage groups are postulated to have existed in the G/H and M lineage chromosomes in the ancestor of the salmonids.
PMCID: PMC2632648  PMID: 19032764
8.  Pigmentation Pathway Evolution after Whole-Genome Duplication in Fish 
Whole-genome duplications (WGDs) have occurred repeatedly in the vertebrate lineage, but their evolutionary significance for phenotypic evolution remains elusive. Here, we have investigated the impact of the fish-specific genome duplication (FSGD) on the evolution of pigmentation pathways in teleost fishes. Pigmentation and color patterning are among the most diverse traits in teleosts, and their pigmentary system is the most complex of all vertebrate groups.
Using a comparative genomic approach including phylogenetic and synteny analyses, the evolution of 128 vertebrate pigmentation genes in five teleost genomes following the FSGD has been reconstructed. We show that pigmentation genes have been preferentially retained in duplicate after the FSGD, so that teleosts have 30% more pigmentation genes compared with tetrapods. This is significantly higher than genome-wide estimates of FSGD gene duplicate retention in teleosts. Large parts of the melanocyte regulatory network have been retained in two copies after the FSGD. Duplicated pigmentation genes follow general evolutionary patterns such as the preservation of protein complex stoichiometries and the overrepresentation of developmental genes among retained duplicates. These results suggest that the FSGD has made an important contribution to the evolution of teleost-specific features of pigmentation, which include novel pigment cell types or the division of existing pigment cell types into distinct subtypes. Furthermore, we have observed species-specific differences in duplicate retention and evolution that might contribute to pigmentary diversity among teleosts.
Our study therefore strongly supports the hypothesis that WGDs have promoted the increase of complexity and diversity during vertebrate phenotypic evolution.
PMCID: PMC2839281  PMID: 20333216
genome duplication; fish; conserved synteny; pigment cell; melanocyte; functional module
9.  Evolution of the osteoblast: skeletogenesis in gar and zebrafish 
Although the vertebrate skeleton arose in the sea 500 million years ago, our understanding of the molecular fingerprints of chondrocytes and osteoblasts may be biased because it is informed mainly by research on land animals. In fact, the molecular fingerprint of teleost osteoblasts differs in key ways from that of tetrapods, but we do not know the origin of these novel gene functions. They either arose as neofunctionalization events after the teleost genome duplication (TGD), or they represent preserved ancestral functions that pre-date the TGD. Here, we provide evolutionary perspective to the molecular fingerprints of skeletal cells and assess the role of genome duplication in generating novel gene functions. We compared the molecular fingerprints of skeletogenic cells in two ray-finned fish: zebrafish (Danio rerio)--a teleost--and the spotted gar (Lepisosteus oculatus)--a "living fossil" representative of a lineage that diverged from the teleost lineage prior to the TGD (i.e., the teleost sister group). We analyzed developing embryos for expression of the structural collagen genes col1a2, col2a1, col10a1, and col11a2 in well-formed cartilage and bone, and studied expression of skeletal regulators, including the transcription factor genes sox9 and runx2, during mesenchymal condensation.
Results provided no evidence for the evolution of novel functions among gene duplicates in zebrafish compared to the gar outgroup, but our findings shed light on the evolution of the osteoblast. Zebrafish and gar chondrocytes both expressed col10a1 as they matured, but both species' osteoblasts also expressed col10a1, which tetrapod osteoblasts do not express. This novel finding, along with sox9 and col2a1 expression in developing osteoblasts of both zebrafish and gar, demonstrates that osteoblasts of both a teleost and a basally diverging ray-fin fish express components of the supposed chondrocyte molecular fingerprint.
Our surprising finding that the "chondrogenic" transcription factor sox9 is expressed in developing osteoblasts of both zebrafish and gar can help explain the expression of chondrocyte genes in osteoblasts of ray-finned fish. More broadly, our data suggest that the molecular fingerprint of the osteoblast, which largely is constrained among land animals, was not fixed during early vertebrate evolution.
PMCID: PMC3314580  PMID: 22390748
10.  Lack of Plasma Kallikrein-Kinin System Cascade in Teleosts  
PLoS ONE  2013;8(11):e81057.
The kallikrein-kinin system (KKS) consists of two major cascades in mammals: “plasma KKS” consisting of high molecular-weight (HMW) kininogen (KNG), plasma kallikrein (KLKB1), and bradykinin (BK); and “tissue KKS” consisting of low molecular-weight (LMW) KNG, tissue kallikreins (KLKs), and [Lys0]-BK. Some components of the KKS have been identified in fishes, but systematic analyses have not been performed; this study therefore aims to define the KKS components in teleosts and pave the way for future physiological and evolutionary studies. Through a combination of genomic, molecular, and biochemical methods, we showed that the entire plasma KKS cascade is absent in teleosts. Instead of two KNGs as found in mammals, a single molecular weight KNG was found in various teleosts, which is homologous to the mammalian LMW KNG. Results of molecular phylogenetic and synteny analyses indicated that all current teleost genomes lack KLKB1, and its unique protein structure, four apple domains and one trypsin domain, could not be identified in any genome or nucleotide database. We identified some KLK-like proteins in teleost genomes by synteny and conserved domain analyses, which could be the orthologs of tetrapod KLKs. A radioimmunoassay system was established to measure the teleost BK and we found that [Arg0]-BK is the major circulating form instead of BK, which supports the view that the teleost KKS is similar to the mammalian tissue KKS. Coincidentally, coelacanths are the earliest vertebrates that possess both HMW KNG and KLKB1, which implies that the plasma KKS could have evolved in the early lobe-finned fish and descended to the tetrapod lineage. The co-evolution of HMW KNG and KLKB1 in lobe-finned fish and early tetrapods may mark the emergence of the plasma KKS and a contact activation system in blood coagulation, while teleosts may have retained a single KKS cascade.
PMCID: PMC3835742  PMID: 24278376
11.  Genome-wide identification, characterization, and expression analysis of lineage-specific genes within zebrafish 
BMC Genomics  2013;14:65.
The genomic basis of teleost phenotypic complexity remains obscure, despite increasing availability of genome and transcriptome sequence data. Fish-specific genome duplication cannot provide sufficient explanation for the morphological complexity of teleosts, considering the relatively large number of extinct basal ray-finned fishes.
In this study, we performed comparative genomic analysis to discover the Conserved Teleost-Specific Genes (CTSGs) and orphan genes within zebrafish and found that these two sets of lineage-specific genes may have played important roles during zebrafish embryogenesis. Lineage-specific genes within zebrafish share many of the characteristics of their counterparts in other species: shorter length, fewer exons, higher GC content, and less transcript support. Chromosomal location analysis indicated that neither the CTSGs nor the orphan genes were distributed evenly in the chromosomes of zebrafish. The significant enrichment of immunity proteins in CTSGs annotated by gene ontology (GO) or predicted ab initio may imply that defense against pathogens may be an important reason for the diversification of teleosts. The evolutionary origin of the lineage-specific genes was determined and a very high percentage of lineage-specific genes were generated via gene duplications. The temporal and spatial expression profile of lineage-specific genes obtained from expressed sequence tag (EST) and RNA-seq data revealed two novel properties: in addition to showing highly tissue-preferred expression, lineage-specific genes are also highly temporally restricted, namely they are expressed in narrower time windows than evolutionarily conserved genes and are specifically enriched in later-stage embryos and early larval stages.
Our study provides the first systematic identification of two different sets of lineage-specific genes within zebrafish and provides valuable information leading towards a better understanding of the molecular mechanisms of the genomic basis of teleost phenotypic complexity for future studies.
PMCID: PMC3599513  PMID: 23368736
Teleost; Lineage-specific gene; Transcriptome; Zebrafish embryogenesis
12.  Sequencing and comparative analysis of fugu protocadherin clusters reveal diversity of protocadherin genes among teleosts 
The synaptic cell adhesion molecules, protocadherins, are a vertebrate innovation that accompanied the emergence of the neural tube and the elaborate central nervous system. In mammals, the protocadherins are encoded by three closely-linked clusters (α, β and γ) of tandem genes and are hypothesized to provide a molecular code for specifying the remarkably-diverse neural connections in the central nervous system. Like mammals, the coelacanth, a lobe-finned fish, contains a single protocadherin locus, also arranged into α, β and γ clusters. Zebrafish, however, possesses two protocadherin loci that contain more than twice the number of genes as the coelacanth, but arranged only into α and γ clusters. To gain further insight into the evolutionary history of protocadherin clusters, we have sequenced and analyzed protocadherin clusters from the compact genome of the pufferfish, Fugu rubripes.
Fugu contains two unlinked protocadherin loci, Pcdh1 and Pcdh2, that collectively consist of at least 77 genes. The fugu Pcdh1 locus has been subject to extensive degeneration, resulting in the complete loss of Pcdh1γ cluster. The fugu Pcdh genes have undergone lineage-specific regional gene conversion processes that have resulted in a remarkable regional sequence homogenization among paralogs in the same subcluster. Phylogenetic analyses show that most protocadherin genes are orthologous between fugu and zebrafish either individually or as paralog groups. Based on the inferred phylogenetic relationships of fugu and zebrafish genes, we have reconstructed the evolutionary history of protocadherin clusters in the teleost fish lineage.
Our results demonstrate the exceptional evolutionary dynamism of protocadherin genes in vertebrates in general, and in teleost fishes in particular. Besides the 'fish-specific' whole genome duplication, the evolution of protocadherin genes in teleost fishes is influenced by lineage-specific gene losses, tandem gene duplications and regional sequence homogenization. The dynamic protocadherin clusters might have led to the diversification of neural circuitry among teleosts, and contributed to the behavioral and physiological diversity of teleosts.
PMCID: PMC1852091  PMID: 17394664
13.  Hox cluster duplication in the basal teleost Hiodon alosoides (Osteoglossomorpha) 
Theory in Biosciences  2009;128(2):109-120.
Large-scale, even genome-wide, duplications have repeatedly been invoked as an explanation for major radiations. Teleosts, the most species-rich vertebrate clade, underwent a “fish-specific genome duplication” (FSGD) that is shared by most ray-finned fish lineages. We investigate here the Hox complement of the goldeye (Hiodon alosoides), a representative of Osteoglossomorpha, the most basal teleostean clade. An extensive PCR survey reveals that goldeye has at least eight Hox clusters, indicating a duplicated genome compared to basal actinopterygians. The possession of duplicated Hox clusters is uncoupled from species richness. The Hox system of the goldeye is substantially different from that of other teleost lineages, having retained several duplicates of Hox genes for which crown teleosts have lost at least one copy. A detailed analysis of the PCR fragments as well as full length sequences of two HoxA13 paralogs, and HoxA10 and HoxC4 genes places the duplication event close in time to the divergence of Osteoglossomorpha and crown teleosts. The data are consistent with, but do not conclusively prove, the hypothesis that Osteoglossomorpha shares the FSGD.
Electronic supplementary material
The online version of this article (doi:10.1007/s12064-009-0056-1) contains supplementary material, which is available to authorized users.
PMCID: PMC2683926  PMID: 19225820
Hox clusters; Fish-specific genome duplication; Goldeye Hiodon alosoides
14.  Duplication of the dystroglycan gene in most branches of teleost fish 
The dystroglycan (DG) complex is a major non-integrin cell adhesion system whose multiple biological roles involve, among others, skeletal muscle stability, embryonic development and synapse maturation. DG is composed of two subunits: α-DG, extracellular and highly glycosylated, and the transmembrane β-DG, linking the cytoskeleton to the surrounding basement membrane in a wide variety of tissues. A single copy of the DG gene (DAG1) has been identified so far in humans and other mammals, encoding a precursor protein which is post-translationally cleaved to liberate the two DG subunits. Similarly, D. rerio (zebrafish) seems to have a single copy of DAG1, whose removal was shown to cause a severe dystrophic phenotype in adult animals, although it is known that during evolution, due to a whole genome duplication (WGD) event, many teleost fish acquired multiple copies of several genes (paralogues).
Data mining of the available nucleotide sequences from pufferfish (T. nigroviridis and T. rubripes) and other teleost fish (O. latipes and G. aculeatus) revealed the presence of two functional paralogous DG sequences. RT-PCR analysis proved that both DG sequences are transcribed in T. nigroviridis. One of the two DG sequences harbours an additional mini-intronic sequence, 137 bp long, interrupting the uncomplicated exon-intron-exon pattern displayed by DAG1 in mammals and D. rerio. A similar scenario also emerged in D. labrax (sea bass), from whose genome we have cloned and sequenced a new DG sequence that also harbours a shorter additional intronic sequence of 116 bp. Western blot analysis confirmed the presence of DG protein products in all the species analysed, including two teleost Antarctic species (T. bernacchii and C. hamatus).
Our evolutionary analysis has shown that the whole-genome duplication event in the Class Actinopterygii (ray-finned fish) involved also DAG1. We unravelled new important molecular genetic details about fish orthologous DGs, which might help to increase the current knowledge on DG expression, maturation and targeting and on its physiopathological role in higher organisms.
PMCID: PMC1885269  PMID: 17509131
15.  Comparative phylogenomic analyses of teleost fish Hox gene clusters: lessons from the cichlid fish Astatotilapia burtoni 
BMC Genomics  2007;8:317.
Teleost fish have seven paralogous clusters of Hox genes stemming from two complete genome duplications early in vertebrate evolution, and an additional genome duplication during the evolution of ray-finned fish, followed by the secondary loss of one cluster. Gene duplications on the one hand, and the evolution of regulatory sequences on the other, are thought to be among the most important mechanisms for the evolution of new gene functions. Cichlid fish, the largest family of vertebrates with about 2500 species, are famous examples of speciation and morphological diversity. Since this diversity could be based on regulatory changes, we chose to study the coding as well as putative regulatory regions of their Hox clusters within a comparative genomic framework.
We sequenced and characterized all seven Hox clusters of Astatotilapia burtoni, a haplochromine cichlid fish. Comparative analyses with data from other teleost fish such as zebrafish, two species of pufferfish, stickleback and medaka were performed. We traced losses of genes and microRNAs of Hox clusters; the medaka lineage seems to have lost more microRNAs than the other fish lineages. We found that each teleost genome studied so far has a unique set of Hox genes. The hoxb7a gene was lost independently several times during teleost evolution, the most recent event being within the radiation of East African cichlid fish. The conserved non-coding sequences (CNS) encompass a surprisingly large part of the clusters, especially in the HoxAa, HoxCa, and HoxDa clusters. Across all clusters, we observe a trend towards an increased content of CNS towards the anterior end.
The gene content of Hox clusters in teleost fishes is more variable than expected, with each species studied so far having a different set. Although the highest loss rate of Hox genes occurred immediately after whole genome duplications, our analyses showed that gene loss continued and is still ongoing in all teleost lineages. Along with the gene content, the CNS content also varies across clusters. The excess of CNS at the anterior end of clusters could imply a stronger conservation of anterior expression patterns than those towards more posterior areas of the embryo.
PMCID: PMC2080641  PMID: 17845724
16.  Relaxin gene family in teleosts: phylogeny, syntenic mapping, selective constraint, and expression analysis 
In recent years, the relaxin family of signaling molecules has been shown to play diverse roles in mammalian physiology, but little is known about its diversity or physiology in teleosts, an infraclass of the bony fishes comprising ~ 50% of all extant vertebrates. In this paper, 32 relaxin family sequences were obtained by searching genomic and cDNA databases from eight teleost species; phylogenetic, molecular evolutionary, and syntenic data analyses were conducted to understand the relationship and differential patterns of evolution of relaxin family genes in teleosts compared with mammals. Additionally, real-time quantitative PCR was used to confirm and assess the tissues of expression of five relaxin family genes in Danio rerio and in situ hybridization used to assess the site-specific expression of the insulin 3-like gene in D. rerio testis.
Up to six relaxin family genes were identified in each teleost species. Comparative syntenic mapping revealed that fish possess two paralogous copies of human RLN3, which we call rln3a and rln3b, an orthologue of human RLN2, rln, two paralogous copies of human INSL5, insl5a and insl5b, and an orthologue of human INSL3, insl3. Molecular evolutionary analyses indicated that rln3a, rln3b and rln are under strong evolutionary constraint, that insl3 has been subject to moderate rates of sequence evolution with two amino acids in insl3/INSL3 showing evidence of positive selection, and that insl5b exhibits a higher rate of sequence evolution than its paralogue insl5a, suggesting that it may have been neo-functionalized after the teleost whole genome duplication. Quantitative PCR analyses in D. rerio indicated that rln3a and rln3b are expressed in brain, insl3 is highly expressed in gonads, and that there was low expression of both insl5 genes in adult zebrafish. Finally, in situ hybridization of insl3 in D. rerio testes showed highly specific hybridization to interstitial Leydig cells.
Contrary to previous studies, we find convincing evidence that teleosts contain orthologues of four relaxin family peptides. Overall our analyses suggest that in teleosts: 1) rln3 exhibits a similar evolution and expression pattern to mammalian RLN3, 2) insl3 has been subject to positive selection like its mammalian counterpart and shows similar tissue-specific expression in Leydig cells, 3) insl5 genes are highly represented and have a relatively high rate of sequence evolution in teleost genomes, but they exhibited only low levels of expression in adult zebrafish, 4) rln is evolving under very different selective constraints from mammalian RLN. The results presented here should facilitate the development of hypothesis-driven experimental work on the specific roles of relaxin family genes in teleosts.
PMCID: PMC2805637  PMID: 20015397
17.  The Role of Gene Duplication and Unconstrained Selective Pressures in the Melanopsin Gene Family Evolution and Vertebrate Circadian Rhythm Regulation 
PLoS ONE  2012;7(12):e52413.
Melanopsin is a photosensitive cell protein involved in regulating circadian rhythms and other non-visual responses to light. The melanopsin gene family is represented by two paralogs, OPN4x and OPN4m, which originated through gene duplication early in the emergence of vertebrates. Here we studied the melanopsin gene family using an integrated gene/protein evolutionary approach, which revealed that the rhabdomeric urbilaterian ancestor had the same amino acid patterns (DRY motif and the Y and E counterions) as extant vertebrate species, suggesting that the mechanism for light detection and regulation is similar to rhabdomeric rhodopsins. Both OPN4m and OPN4x paralogs are found in vertebrate genomic paralogons, suggesting that they diverged following this duplication event about 600 million years ago, when the complex eye emerged in the vertebrate ancestor. Melanopsins generally evolved under negative selection (ω = 0.171) with some minor episodes of positive selection (proportion of sites = 25%) and functional divergence (θI = 0.349 and θII = 0.126). The OPN4m and OPN4x melanopsin paralogs show evidence of spectral divergence at sites likely involved in melanopsin light absorbance (200F, 273S and 276A). Also, following the teleost lineage-specific whole genome duplication (3R) that prompted the teleost fish radiation, type I divergence (θI = 0.181) and positive selection (affecting 11% of sites) contributed to amino acid variability that we related with the photo-activation stability of melanopsin. The melanopsin intracellular regions had unexpectedly high variability in their coupling specificity of G-proteins and we propose that Gq/11 and Gi/o are the two G-proteins most likely to mediate the melanopsin phototransduction pathway. The selection signatures were mainly observed on retinal-related sites and the third and second intracellular loops, demonstrating the physiological plasticity of the melanopsin protein group.
Our results provide new insights on the phototransduction process and additional tools for disentangling and understanding the links between melanopsin gene evolution and the specializations observed in vertebrates, especially in teleost fish.
PMCID: PMC3528684  PMID: 23285031
18.  Rapid Evolution of piRNA Pathway in the Teleost Fish: Implication for an Adaptation to Transposon Diversity 
Genome Biology and Evolution  2014;6(6):1393-1407.
The Piwi-interacting RNA (piRNA) pathway is responsible for germline specification, gametogenesis, transposon silencing, and genome integrity. Transposable elements can disrupt the genome and its functions. However, piRNA pathway evolution and its adaptation to transposon diversity in teleost fish remain unknown. This article unveils the evolutionary landscape of the piRNA pathway and its association with diverse transposons through systematic comparative analysis of diverse teleost fish genomes. Selective pressure analysis on piRNA pathway and miRNA/siRNA (microRNA/small interfering RNA) pathway genes between teleosts and mammals showed an accelerated evolution of piRNA pathway genes in the teleost lineages, and positive selection on functional PAZ (Piwi/Ago/Zwille) and Tudor domains involved in the Piwi–piRNA/Tudor interaction, suggesting that the amino acid substitutions are adaptive to their functions in the piRNA pathway in teleost fish species. Notably, five piRNA pathway genes evolved faster in the swamp eel, a protogynous hermaphrodite fish, than in the other teleosts, indicating a differential evolution of the piRNA pathway between the swamp eel and gonochoristic fishes. In addition, genome-wide analysis showed higher diversity of transposons in teleost fish species compared with mammals. Our results suggest that the rapidly evolved piRNA pathway in teleost fish is likely to be involved in the adaptation to transposon diversity.
PMCID: PMC4079211  PMID: 24846630
teleost fish; evolution; positive selection; reproduction
19.  Survey Sequencing and Comparative Analysis of the Elephant Shark (Callorhinchus milii) Genome 
PLoS Biology  2007;5(4):e101.
Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4× coverage) and comparative analysis of the elephant shark genome, one of the first cartilaginous fish genomes to be sequenced to this depth. Repetitive sequences, represented mainly by a novel family of short interspersed element–like and long interspersed element–like sequences, account for about 28% of the elephant shark genome. Fragments of approximately 15,000 elephant shark genes reveal specific examples of genes that have been lost differentially during the evolution of tetrapod and teleost fish lineages. Interestingly, the degree of conserved synteny and conserved sequences between the human and elephant shark genomes is higher than that between human and teleost fish genomes. The elephant shark contains four putative Hox clusters, indicating that, unlike teleost fish genomes, the elephant shark genome has not experienced an additional whole-genome duplication. These findings underscore the importance of the elephant shark as a critical reference vertebrate genome for comparative analysis of the human and other vertebrate genomes. This study also demonstrates that a survey-sequencing approach can be applied productively for comparative analysis of distantly related vertebrate genomes.
Author Summary
Cartilaginous fishes (sharks, rays, skates, and chimaeras) are the phylogenetically oldest group of living jawed vertebrates. They are also an important outgroup for understanding the evolution of bony vertebrates such as human and teleost fishes. We performed survey sequencing (1.4× coverage) of a chimaera, the elephant shark (Callorhinchus milii). The elephant shark genome, estimated to be about 910 Mb long, comprises about 28% repetitive elements. Comparative analysis of approximately 15,000 elephant shark gene fragments revealed examples of several ancient genes that have been lost differentially during the evolution of human and teleost fish lineages. Interestingly, the human and elephant shark genomes exhibit a higher degree of synteny and sequence conservation than human and teleost fish (zebrafish and fugu) genomes, even though humans are more closely related to teleost fishes than to the elephant shark. Unlike teleost fish genomes, the elephant shark genome does not seem to have experienced an additional round of whole-genome duplication. These findings underscore the importance of the elephant shark as a useful “model” cartilaginous fish genome for understanding vertebrate genome evolution.
The cartilaginous elephant shark has a basal phylogenetic position useful for understanding jawed vertebrate evolution. Survey sequencing of its genome identified four Hox clusters, suggesting that, unlike for teleost fishes, no additional whole-genome duplication has occurred.
PMCID: PMC1845163  PMID: 17407382
20.  Tissue-specific differential induction of duplicated fatty acid-binding protein genes by the peroxisome proliferator, clofibrate, in zebrafish (Danio rerio) 
Force, Lynch and Conery proposed the duplication-degeneration-complementation (DDC) model in which partitioning of ancestral functions (subfunctionalization) and acquisition of novel functions (neofunctionalization) were the two primary mechanisms for the retention of duplicated genes. The DDC model was tested by analyzing the transcriptional induction of the duplicated fatty acid-binding protein (fabp) genes by clofibrate in zebrafish. Clofibrate is a specific ligand of the peroxisome proliferator-activated receptor (PPAR); it activates PPAR which then binds to a peroxisome proliferator response element (PPRE) to induce the transcriptional initiation of genes primarily involved in lipid homeostasis. Zebrafish was chosen as our model organism as it has many duplicated genes owing to a whole genome duplication (WGD) event that occurred ~230-400 million years ago in the teleost fish lineage. We assayed the steady-state levels of fabp mRNA and heterogeneous nuclear RNA (hnRNA) transcripts in liver, intestine, muscle, brain and heart for four sets of duplicated fabp genes, fabp1a/fabp1b.1/fabp1b.2, fabp7a/fabp7b, fabp10a/fabp10b and fabp11a/fabp11b in zebrafish fed different concentrations of clofibrate.
Electron microscopy showed an increase in the number of peroxisomes and mitochondria in liver and heart, respectively, in zebrafish fed clofibrate. Clofibrate also increased the steady-state level of acox1 mRNA and hnRNA transcripts in different tissues, a gene with a functional PPRE. These results demonstrate that zebrafish is responsive to clofibrate, unlike some other fishes. The levels of fabp mRNA and hnRNA transcripts for the four sets of duplicated fabp genes were determined by reverse transcription quantitative polymerase chain reaction (RT-qPCR). The level of hnRNA coded by a gene is an indirect estimate of the rate of transcriptional initiation of that gene. Clofibrate increased the steady-state level of fabp mRNAs and hnRNAs for both the duplicated copies of fabp1a/fabp1b.1, and fabp7a/fabp7b, but in different tissues. Clofibrate also increased the steady-state level of fabp10a and fabp11a mRNAs and hnRNAs in liver, but not for fabp10b and fabp11b.
Some duplicated fabp genes have, most likely, retained PPREs, but induction by clofibrate is over-ridden by an, as yet, unknown tissue-specific mechanism(s). Regardless of the tissue-specific mechanism(s), transcriptional control of duplicated zebrafish fabp genes by clofibrate has markedly diverged since the WGD event.
PMCID: PMC3483278  PMID: 22776158
21.  Positive Darwinian selection in the singularly large taste receptor gene family of an ‘ancient’ fish, Latimeria chalumnae 
BMC Genomics  2014;15(1):650.
Chemical senses are one of the foremost means by which organisms make sense of their environment, among them the olfactory and gustatory sense of vertebrates and arthropods. Both senses use large repertoires of receptors to achieve perception of complex chemosensory stimuli. High evolutionary dynamics of some olfactory and gustatory receptor gene families result in considerable variance of chemosensory perception between species. Interestingly, both ora/v1r genes and the closely related t2r genes constitute small and rather conserved families in teleost fish, but show rapid evolution and large species differences in tetrapods. To understand this transition, chemosensory gene repertoires of earlier diverging members of the tetrapod lineage, i.e. lobe-finned fish such as Latimeria would be of high interest.
We report here the complete T2R repertoire of Latimeria chalumnae, using thorough data mining and extensive phylogenetic analysis. Eighty t2r genes were identified, by far the largest family reported for any species so far. The genomic neighborhood of t2r genes is enriched in repeat elements, which may have facilitated the extensive gene duplication events resulting in such a large family. Examination of non-synonymous vs. synonymous substitution rates (dN/dS) suggests pronounced positive Darwinian selection in Latimeria T2Rs, conceivably ensuring efficient neo-functionalization of newly born t2r genes. Notably, both traits, positive selection and enrichment of repeat elements in the genomic neighborhood, are absent in the twenty v1r genes of Latimeria. Sequence divergence in Latimeria T2Rs and V1Rs is high, reminiscent of the corresponding teleost families. Some conserved sequence motifs of Latimeria T2Rs and V1Rs are shared with the respective teleost but not tetrapod genes, consistent with a potential role of such motifs in detection of aquatic chemosensory stimuli.
The singularly large T2R repertoire of Latimeria may have been generated by facilitating local gene duplication via increased density of repeat elements, and efficient neofunctionalization via positive Darwinian selection.
The high evolutionary dynamics of tetrapod t2r gene families precedes the emergence of tetrapods, i.e. the water-to-land transition, and thus constitutes a basal feature of the lobe-finned lineage of vertebrates.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-650) contains supplementary material, which is available to authorized users.
PMCID: PMC4132921  PMID: 25091523
Coelacanth; Bitter taste; Pheromone; Phylogeny; Sarcopterygian; Evolution
22.  The Timing of Timezyme Diversification in Vertebrates 
PLoS ONE  2014;9(12):e112380.
All biological functions in vertebrates are synchronized with daily and seasonal changes in the environment by the time-keeping hormone melatonin. Its nocturnal surge is primarily due to the rhythmic activity of the arylalkylamine N-acetyl transferase AANAT, which thus became the focus of many investigations regarding its evolution and function. Various vertebrate isoforms have been reported from cartilaginous fish to mammals but their origin has not been clearly established. Using phylogeny and synteny, we took advantage of the increasing number of available genomes in order to test whether the various rounds of vertebrate whole genome duplications were responsible for the diversification of AANAT. We highlight a secondary loss of the AANAT2 gene in the Sarcopterygii, revealing for the first time that the AANAT1/2 duplication occurred before the divergence between Actinopterygii (bony fish) and Sarcopterygii (tetrapods, lobe-finned fish, and lungfish). We hypothesize the teleost-specific whole genome duplication (WGD) generated the appearance of the AANAT1a/1b and the AANAT2/2′ paralogs, the 2′ isoform being rapidly lost in the teleost common ancestor (ray-finned fish). We also demonstrate the secondary loss of the AANAT1a in a Paracanthopterygii (Atlantic cod) and of the 1b in some Ostariophysi (zebrafish and cave fish). Salmonids present an even more diverse set of AANATs that may be due to their specific WGD followed by secondary losses. We propose that vertebrate AANAT diversity resulted from 3 rounds of WGD followed by previously uncharacterized secondary losses. Extant isoforms show subfunctionalized localizations, enzyme activities and affinities that have increased with time since their emergence.
PMCID: PMC4259306  PMID: 25486407
23.  Whole-Genome Duplication and the Functional Diversification of Teleost Fish Hemoglobins 
Molecular Biology and Evolution  2012;30(1):140-153.
Subsequent to the two rounds of whole-genome duplication that occurred in the common ancestor of vertebrates, a third genome duplication occurred in the stem lineage of teleost fishes. This teleost-specific genome duplication (TGD) is thought to have provided genetic raw materials for the physiological, morphological, and behavioral diversification of this highly speciose group. The extreme physiological versatility of teleost fish is manifest in their diversity of blood–gas transport traits, which reflects the myriad solutions that have evolved to maintain tissue O2 delivery in the face of changing metabolic demands and environmental O2 availability during different ontogenetic stages. During the course of development, regulatory changes in blood–O2 transport are mediated by the expression of multiple, functionally distinct hemoglobin (Hb) isoforms that meet the particular O2-transport challenges encountered by the developing embryo or fetus (in viviparous or oviparous species) and in free-swimming larvae and adults. The main objective of the present study was to assess the relative contributions of whole-genome duplication, large-scale segmental duplication, and small-scale gene duplication in producing the extraordinary functional diversity of teleost Hbs. To accomplish this, we integrated phylogenetic reconstructions with analyses of conserved synteny to characterize the genomic organization and evolutionary history of the globin gene clusters of teleosts. These results were then integrated with available experimental data on functional properties and developmental patterns of stage-specific gene expression. Our results indicate that multiple α- and β-globin genes were present in the common ancestor of gars (order Lepisosteiformes) and teleosts.
The comparative genomic analysis revealed that teleosts possess a dual set of TGD-derived globin gene clusters, each of which has undergone lineage-specific changes in gene content via repeated duplication and deletion events. Phylogenetic reconstructions revealed that paralogous genes convergently evolved similar functional properties in different teleost lineages. Consistent with other recent studies of globin gene family evolution in vertebrates, our results revealed evidence for repeated evolutionary transitions in the developmental regulation of Hb synthesis.
PMCID: PMC3525417  PMID: 22949522
gene duplication; genome duplication; gene family evolution; convergent evolution
24.  Phylogenomic analyses of KCNA gene clusters in vertebrates: why do gene clusters stay intact? 
Gene clusters are of interest for the understanding of genome evolution since they provide insight into large-scale duplication events as well as patterns of individual gene losses. Vertebrates tend to have multiple copies of gene clusters that typically are only single clusters or are not present at all in genomes of invertebrates. We investigated the genomic architecture and conserved non-coding sequences of vertebrate KCNA gene clusters. KCNA genes encode shaker-related voltage-gated potassium channels and are arranged in two three-gene clusters in tetrapods. Teleost fish are found to possess four clusters. The two tetrapod KCNA clusters are of approximately the same age as the Hox gene clusters that arose through duplications early in vertebrate evolution. For some genes, their conserved retention and arrangement in clusters are thought to be related to regulatory elements in the intergenic regions, which might prevent rearrangements and gene loss. Interestingly, this hypothesis does not appear to apply to the KCNA clusters, as too few conserved putative regulatory elements are retained.
We obtained KCNA coding sequences from basal ray-finned fishes (sturgeon, gar, bowfin) and confirmed that the duplication of these genes is specific to teleosts and therefore consistent with the fish-specific genome duplication (FSGD). Phylogenetic analyses of the genes suggest a basal position of the only intron-containing KCNA gene in vertebrates (KCNA7). Sister-group relationships of KCNA1/2 and KCNA3/6 support that a large-scale duplication gave rise to the two clusters found in the genome of tetrapods. We analyzed the intergenic regions of KCNA clusters in vertebrates and found that there are only a few conserved sequences shared between tetrapods and teleosts or between paralogous clusters. The orthologous teleost clusters, however, show sequence conservation in these regions.
The lack of overall conserved sequences in intergenic regions suggests that there are either other processes than regulatory evolution leading to cluster conservation or that the ancestral regulatory relationships among genes in KCNA clusters have been changed together with their regulatory sites.
PMCID: PMC1978502  PMID: 17697377
25.  A Window into Domain Amplification Through Piccolo in Teleost Fish 
G3: Genes|Genomes|Genetics  2012;2(11):1325-1339.
I describe and characterize the extensive amplification of the zinc finger domain of Piccolo selectively in teleost fish. Piccolo and Bassoon are partially functionally redundant and play roles in regulating the pool of neurotransmitter-filled synaptic vesicles present at synapses. In mice, each protein contains two N-terminal zinc finger domains that have been implicated in interacting with synaptic vesicles. In all teleosts examined, both the Bassoon and Piccolo genes are duplicated. Both teleost bassoon genes and one piccolo gene show very similar domain structure and intron-exon organization to their mouse homologs. In contrast, in piccolo b a single exon that encodes a zinc finger domain is amplified 8 to 16 times in different teleost species. Analysis of the amplified exons suggests they were added and/or deleted from the gene as individual exons in rare events that are likely the result of unequal crossovers between homologous sequences. Surprisingly, the structure of the repeats from cod and zebrafish suggests that amplification of this exon has occurred independently multiple times in the teleost lineage. Based on the structure of the exons, I propose a model in which selection for high sequence similarity at the 5′ and 3′ ends of the exon drives amplification of the repeats, and diversity in repeat length likely promotes the stability of the repeated exons by minimizing the likelihood of mispairing of adjacent repeat sequences. Further analysis of piccolo b in teleosts should provide a window through which to examine the process of domain amplification.
PMCID: PMC3484663  PMID: 23173084

