In principle, any sequence in a region from which a target gene can receive regulatory input could be exapted into a regulatory element, as long as the new role does not interfere with the sequence's other essential functions. Indeed, tens of thousands of elements overlapping coding exons in mammalian genomes show selective constraint on synonymous sites in their codons; Lin et al. [4] managed to assign putative function to 60% of them. The majority of functions, as expected, are splicing related, but other known overlapping functions have been detected: translational initiation, regulation of inclusion of cassette exons and, finally, developmental enhancers. The remaining 40% are uncharacterized, but since they are enriched within developmental genes, a significant fraction of these are likely to have a regulatory role.
Eichenlaub and Ettwiller [3] explored an evolutionary scenario in which an exonic remnant of a gene copy that was inactivated (non-functionalized) following the teleost whole-genome duplication has acquired a regulatory function, as an enhancer driving part of the expression pattern of a neighboring developmental gene. They first searched for genomic regions in stickleback (Gasterosteus aculeatus) that are (1) conserved between human and stickleback; (2) non-coding in stickleback, but whose orthologous regions in human lie within coding sequence; and (3) near developmental genes. They identified four such exon-turned-enhancers, which they termed recycled regions, in the stickleback genome. The four corresponding human exons belong to the non-developmental genes TTC29
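The three search criteria listed above can be sketched as a simple filter. This is only an illustrative sketch, not the authors' pipeline: the `Region` record, its field names, and the `MAX_DISTANCE_BP` cutoff are all hypothetical, and the source says only that candidates are "near" developmental genes without giving a distance threshold.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Hypothetical cutoff for "near a developmental gene"; the source gives no value.
MAX_DISTANCE_BP = 100_000


@dataclass(frozen=True)
class Region:
    """A candidate stickleback region (all fields are illustrative assumptions)."""
    name: str
    conserved_human_stickleback: bool   # criterion (1)
    coding_in_stickleback: bool         # criterion (2), first half
    human_ortholog_is_coding: bool      # criterion (2), second half
    distance_to_dev_gene_bp: int        # criterion (3)


def is_recycled_candidate(region: Region,
                          max_distance_bp: int = MAX_DISTANCE_BP) -> bool:
    """Return True if the region satisfies all three search criteria."""
    return (
        region.conserved_human_stickleback
        and not region.coding_in_stickleback
        and region.human_ortholog_is_coding
        and region.distance_to_dev_gene_bp <= max_distance_bp
    )


def find_recycled_candidates(regions: Iterable[Region],
                             max_distance_bp: int = MAX_DISTANCE_BP) -> List[Region]:
    """Keep only the regions that pass every criterion."""
    return [r for r in regions if is_recycled_candidate(r, max_distance_bp)]


# Toy input: made-up regions, not data from the study.
toy_regions = [
    Region("r1", True, False, True, 20_000),    # passes all three criteria
    Region("r2", True, True, True, 20_000),     # still coding in stickleback
    Region("r3", True, False, True, 900_000),   # too far from a developmental gene
    Region("r4", False, False, True, 5_000),    # not conserved
]
candidate_names = [r.name for r in find_recycled_candidates(toy_regions)]
```

On the toy input, only the first region passes. Candidates found this way would still need the experimental validation described in the text.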
The recycled regions annotated in the stickleback genome were transferred to the medaka genome for experimental validation. Three out of four recycled regions in the medaka genome showed enhancer activity, and each recapitulated part of the expression pattern of a neighboring developmental gene. The authors proceeded to show, for each of those sequences, that the medaka paralog, which is still a coding exon of an active protein-coding gene, does not have enhancer activity, and neither do the orthologous exons in mouse and elephant shark, which represent a sister group (tetrapods) and an outgroup (cartilaginous fish), respectively. From this experimental evidence, the authors concluded that the exaptation of new enhancers occurred after whole-genome duplication at the root of teleost fish radiation and after inactivation of the copy of the gene from which the recycled region originated.
The suggested scenario poses some constraints. If the inactivation of the protein-coding gene preceded exaptation, the exaptation should have followed quickly thereafter. Otherwise, the exon sequence conservation would have rapidly decayed beyond recognition by neutral mutation within a relatively narrow window of several million years (Figure 1). This would make this scenario rare, but not implausible. Indeed, the fact that only four elements were found (in three of which the exonic remnant itself is required for enhancer function) suggests that this is a rare event.
Figure 1. Four alternative scenarios for the timing of exaptation of a coding sequence into a regulatory function exclusive to teleost fish. After whole-genome duplication (WGD; gray and red circles) in teleost fish (teleost), one copy of an ancestral coding sequence ...
The presented data do not exclude modified or alternative scenarios. For example, the exons could have been co-opted for an enhancer role before the whole-genome duplication (Figure 1), yielding a dual-function element (an enhancer overlapping a functional coding exon) of the kind that has been shown in several other instances [5]. Co-option, in which an additional function is acquired by an existing functional element, could have been followed by the reciprocal loss of enhancer or exon function after the whole-genome duplication [6]. This scenario still fits with the enhancer being newly emerged and teleost specific, but might have the benefit of a significantly longer 'window of opportunity' for emergence without much sequence divergence, because at no time is selective pressure on the element removed.
A slight modification of the scenario depicted in Figure 1 would be that the co-option occurred in the post-whole-genome-duplication period, while both copies of the original protein-coding gene were still functional (Figure 1). Judging from rediploidization events in zebrafish relative to three other teleosts (medaka, stickleback and tetraodon), the post-whole-genome-duplication window of opportunity was also likely to be longer than that prior to whole-genome duplication, although one would assume that selective pressure to retain two copies of the gene was low. Other, more elaborate scenarios, such as that depicted in Figure 1, would benefit from even longer windows of opportunity, and it will only be possible to exclude them once additional fish genome sequences become available.
One of the three elements tested by Eichenlaub and Ettwiller [3], the one originating from an exon of ccdc46, is shown by the authors to be near a developmental enhancer that is conserved and functional in mouse, medaka and shark. The ccdc46 exon sequence from either mouse or elephant shark does not drive expression on its own in their assays, and is not required for the function of the neighboring enhancer in mouse. However, based on analysis of synonymous conservation across coding exons of 29 eutherian mammals [4], the ccdc46 exon itself overlaps with an element predicted to be still under selection on synonymous sites in eutherian mammals, and bears histone modifications associated with enhancer function (H3K4me1) in a subset of ENCODE (Encyclopedia of DNA Elements) cell lines (Figure 2). This indicates that a complex scenario and a contemporary dual role for the exon in mammals cannot be ruled out.
Figure 2. The AXIN2-CCDC46 locus. The human ortholog of an exon that was exapted into a regulatory function in teleosts is shown in the context of synonymous constraint elements (SCE), ENCODE histone marks in human embryonic stem cells indicative of enhancer function ...