A duplicated gene newly arisen in a single genome must overcome substantial hurdles before it can be observed in evolutionary comparisons. First, it must become fixed in the population; second, it must be preserved over time. Population genetics tells us that fixation of a new allele is a rare event, even for mutations that confer an immediate selective advantage. Nevertheless, it has been estimated that one in a hundred genes is duplicated and fixed every million years (Lynch and Conery 2000), although it should be clear from the duplication mechanisms described above that duplication rates are highly unlikely to be constant over time. Once a duplicate is fixed, three possible fates are typically envisaged for it.
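To give a sense of how rare fixation is, the sketch below evaluates Kimura's diffusion approximation for the fixation probability of a single new mutation in a diploid population. The function name, the population size, and the selection coefficient are illustrative assumptions rather than values from the studies cited here, and the formula assumes an idealised population whose effective size equals its census size.

```python
import math

def fixation_probability(n_diploid: int, s: float) -> float:
    """Kimura's diffusion approximation for one new mutant copy.

    The mutant starts at frequency 1/(2N). Assumes genic (additive) selection
    with coefficient s >= 0 and an effective size equal to N (illustrative).
    """
    if s < 0:
        raise ValueError("this sketch only handles neutral or advantageous mutations")
    if s == 0.0:
        # Neutral case: each of the 2N gene copies is equally likely to fix.
        return 1.0 / (2 * n_diploid)
    # (1 - e^{-2s}) / (1 - e^{-4Ns}), written with expm1 for numerical accuracy.
    return math.expm1(-2.0 * s) / math.expm1(-4.0 * n_diploid * s)

N = 10_000  # hypothetical population size
print(fixation_probability(N, 0.0))   # neutral mutation: 1/(2N) = 5e-05
print(fixation_probability(N, 0.01))  # 1% advantage: still only about 0.02
```

Under these assumptions, even a mutation with a 1% selective advantage is lost about 98 times out of 100, which is why fixation of a new duplicate is expected to be the exception rather than the rule.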
Despite the slackened selective constraints, mutations can still destroy the incipient functionality of a duplicated gene, for example by introducing a premature stop codon or a mutation that destroys the structure of a major protein domain. These degenerative mutations result in the creation of a pseudogene (nonfunctionalization). Over time, the likelihood of such a mutation being introduced increases. Recent studies suggest that there is a relatively narrow time window for evolutionary exploration before degradation becomes the most likely outcome, typically of the order of 4 million years (Lynch and Conery 2000).
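The statement that degradation becomes more likely with time can be sketched with a simple exponential-decay model. This is an illustration only: it assumes that silencing mutations strike at a constant rate, and it treats the 4-million-year figure quoted above as a half-life, which is an interpretation made for this example rather than something stated in the text.

```python
import math

# Assumed half-life in millions of years (an illustrative reading of the ~4 My window).
HALF_LIFE_MY = 4.0

def prob_silenced_by(t_my: float, half_life_my: float = HALF_LIFE_MY) -> float:
    """Probability that a duplicate has been silenced by time t (in My),
    assuming silencing mutations arrive at a constant rate."""
    rate = math.log(2.0) / half_life_my
    return 1.0 - math.exp(-rate * t_my)

for t in (1, 4, 8, 16):
    print(f"{t:>2} My after duplication: P(silenced) = {prob_silenced_by(t):.2f}")
```

With these assumed parameters, half of all duplicates are silenced by 4 million years and three quarters by 8 million years, so any new function has to emerge early if it is to be preserved.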
During the relatively brief period of relaxed selection following gene duplication, a new, advantageous allele may arise as a result of one of the gene copies gaining a new function (neofunctionalization). This can be revealed by an accelerated rate of amino-acid change after duplication in one of the gene copies. This burst of selection is necessarily episodic, because once a new function is attained by one of the duplicates, selective constraints on this gene are reasserted. These patterns of selection can be observed in real data: in most recently duplicated gene pairs in the human genome, the two copies have diverged at different rates from their ancestral amino-acid sequence (Zhang et al. 2003). A convincing instance of neofunctionalization is the evolution of antibacterial activity in the ECP gene in Old World monkeys and hominoids after a burst of amino-acid changes following the tandem duplication of the progenitor gene EDN (a ribonuclease) some 30 MYA (Zhang et al. 1998). The divergence of duplicated genes over time can also be monitored in genome-wide functional studies. In both yeast and nematodes, the ability of a gene to buffer the loss of its duplicate declines over time as their functional overlap decreases.
Rather than one gene duplicate retaining the original function while the other either degrades or evolves a new function, the original functions of the single-copy gene may be partitioned between the duplicates (subfunctionalization). Many genes perform a multiplicity of subtly distinct functions, and selective pressures have resulted in a compromise between optimal sequences for each role. Partitioning these functions between the duplicates may increase the fitness of the organism by removing the conflict between two or more functions. This outcome has become associated with a population genetic model known as the Duplication–Degeneration–Complementation (DDC) model, which focuses attention on the regulatory changes after duplication (Force et al. 1999). In this model, degenerative changes occur in the regulatory sequences of both duplicates, such that these changes complement each other and the union of the expression patterns of the two duplicates reconstitutes the expression pattern of the original gene.
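The complementation step of this model can be made concrete with a toy sketch in which each regulatory element drives expression in one tissue and degenerative mutations can only remove elements from a copy. The tissue names, the function name, and the representation of a gene as a set of elements are hypothetical choices made for illustration; they are not taken from Force et al.

```python
def classify_duplicate_pair(ancestral: frozenset, copy_a: frozenset, copy_b: frozenset) -> str:
    """Classify a duplicate pair by the regulatory elements each copy retains.

    Toy model: copies only lose elements, so each is a subset of `ancestral`.
    """
    if not copy_a or not copy_b:
        return "nonfunctionalization"  # one copy has lost all expression
    if copy_a | copy_b != ancestral:
        return "ancestral expression incomplete"  # some ancestral domain is lost from both copies
    if copy_a == ancestral or copy_b == ancestral:
        return "redundant"  # one copy alone still covers the ancestral pattern
    return "subfunctionalization"  # complementary losses: both copies are now needed

# Hypothetical ancestral gene expressed in three tissues.
ancestral = frozenset({"brain", "liver", "testis"})
copy_a = ancestral - {"testis"}          # copy A loses its testis element
copy_b = ancestral - {"brain", "liver"}  # copy B keeps only the testis element

print(classify_duplicate_pair(ancestral, copy_a, copy_b))
```

In this example neither copy can be lost without removing part of the ancestral expression pattern, which is the sense in which complementary degeneration preserves both duplicates.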
A recent study by Dorus and colleagues (Dorus et al. 2003) investigated the retrotransposition (since the existence of a human–mouse common ancestor) of one of the two autosomal copies of the CDYL gene to the Y chromosome (forming CDY). In the mouse, both Cdyl genes produce two distinct transcripts, one of which is expressed ubiquitously while the other is testis-specific. By contrast, in humans both CDYL genes produce a single ubiquitously expressed transcript, and CDY exhibits testis-specific expression. As CDY is a retrogene (see above) that has not been duplicated together with its ancestral regulatory sequences, it is clear that the DDC model is not the only route by which to achieve spatial partitioning of ancestral expression patterns.
Subfunctionalization can partition temporal as well as spatial expression patterns. In humans, the β-globin cluster of duplicated genes contains three genes with coordinated but distinct developmental expression patterns. One gene is expressed in embryos, another in foetuses, and the third from neonates onwards. In addition, coding sequence changes have co-evolved with the regulatory changes so that the O₂-binding affinity of haemoglobin is optimised for each developmental stage. This coupling between coding and regulatory change is similarly noted at a genomic level, where expression differences between many duplicated gene pairs are correlated with their coding sequence divergence (Makova and Li 2003).