Related Articles

1.  Asymmetric and non-uniform evolution of recently duplicated human genes 
Biology Direct  2010;5:54.
Background
Gene duplications are a source of new genes and protein functions. The innovative role of duplication events makes families of paralogous genes an interesting target for studies in evolutionary biology. Here we study global trends in the evolution of human genes that resulted from recent duplications.
Results
The pressure of negative selection is weaker during a short time immediately after a duplication event. Roughly one fifth of genes in paralogous gene families are evolving asymmetrically: one of the proteins encoded by two closest paralogs accumulates amino acid substitutions significantly faster than its partner. This asymmetry cannot be explained by differences in gene expression levels. In asymmetric gene pairs the number of deleterious mutations is increased in one copy, while decreased in the other copy as compared to genes constituting non-asymmetrically evolving pairs. The asymmetry in the rate of synonymous substitutions is much weaker and not significant.
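One simple way to ask whether one paralog has accumulated substitutions "significantly faster" than its partner is an exact two-sided binomial test on the lineage-specific substitution counts. The sketch below is only an illustration under that assumption and is not the test used by the authors; the function name and the example counts are hypothetical.

```python
from math import comb


def binomial_two_sided_p(k, n):
    """Exact two-sided p-value for k events out of n under a fair 50:50 split.

    k: substitutions assigned to one paralog (hypothetical input)
    n: total substitutions in the pair
    """
    if n == 0:
        return 1.0
    tail = min(k, n - k)
    # Probability of a split at least as lopsided in one direction.
    p_tail = sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    # Double for the two-sided test; cap at 1 for near-even splits.
    return min(1.0, 2 * p_tail)


# Hypothetical pair: 18 substitutions in one copy, 2 in the other.
p_value = binomial_two_sided_p(2, 20)
is_asymmetric = p_value < 0.05
```

Under this null model, an even split of substitutions between the two copies gives a p-value of 1, while a strongly lopsided split gives a small one.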
Conclusions
The increase of negative selection pressure over time after a duplication event seems to be a major trend in the evolution of human paralogous gene families. The observed asymmetry in the evolution of paralogous genes shows that in many cases one of two gene copies remains practically unchanged, while the other accumulates functional mutations. This supports the hypothesis that slowly evolving gene copies preserve their original functions, while fast evolving copies obtain new specificities or functions.
Reviewers
This article was reviewed by Dr. Igor Rogozin (nominated by Dr. Arcady Mushegian), Dr. Fyodor Kondrashov, and Dr. Sergei Maslov.
doi:10.1186/1745-6150-5-54
PMCID: PMC2942815  PMID: 20825637
2.  Genetic interactions reveal the evolutionary trajectories of duplicate genes 
Duplicate genes show significantly fewer interactions than singleton genes, and functionally similar duplicates can exhibit dissimilar profiles because common interactions are ‘hidden’ due to buffering.
Genetic interaction profiles provide insights into evolutionary mechanisms of duplicate retention by distinguishing duplicates under dosage selection from those retained because of some divergence in function.
The genetic interactions of duplicate genes evolve in an extremely asymmetric way and the directionality of this asymmetry correlates well with other evolutionary properties of duplicate genes.
Genetic interaction profiles can be used to elucidate the divergent function of specific duplicate pairs.
Gene duplication and divergence serve as a primary source of new genes and new functions, and as such have broad implications for the evolutionary process. Duplicate genes within S. cerevisiae have been shown to retain a high degree of similarity with regard to many of their functional properties (Papp et al, 2004; Guan et al, 2007; Wapinski et al, 2007; Musso et al, 2008), and perturbation of duplicate genes has been shown to result in smaller fitness defects than perturbation of singleton genes (Gu et al, 2003; DeLuna et al, 2008; Dean et al, 2008; Musso et al, 2008). Individual genetic interactions between pairs of genes and profiles of such interactions across the entire genome provide a new context in which to examine the properties of duplicate compensation.
In this study we use the most recent and comprehensive set of genetic interactions in yeast produced to date (Costanzo et al, 2010) to address questions of duplicate retention and redundancy. We show that the ability of duplicate genes to buffer the deletion of a partner has three main consequences. First, it agrees with previous work demonstrating that a high proportion of duplicate pairs are synthetic lethal, a classic indication of the ability to buffer one another functionally (DeLuna et al, 2008; Dean et al, 2008; Musso et al, 2008). Second, it reduces the number of genetic interactions observed between duplicate genes and the rest of the genome by masking interactions relating to common function from experimental detection. Third, this buffering of common interactions serves to reduce profile similarity in spite of common function (Figure 1). The compensatory ability of functionally similar duplicates buffers genetic interactions related to their common function (reducing the number of genetic interactions overall), while allowing the measurement of interactions related to any divergent function. Thus, even functionally similar duplicates may have dissimilar genetic interaction profiles. As previously surmised (Ihmels et al, 2007), duplicate genes under selection for dosage amplification have differing profile characteristics. We show that dosage-mediated duplicates have much higher genetic interaction profile similarity than do other duplicate pairs. Furthermore, we show in a comparison with local neighbors on a protein–protein interaction (PPI) network that, although dosage-mediated duplicates more often have higher similarity to each other than they do to their neighbors, the reverse is true for duplicates in general.
That is, slightly divergent duplicate genes more often exhibit a higher similarity with a common neighbor on the PPI network than they do with each other, and that observation is consistent with the idea that common interactions are buffered while interactions corresponding to divergent functions are observed.
We then asked whether duplicates' genetic interactions that are not buffered appear in a symmetric or an asymmetric fashion. Previous work has established asymmetric patterns with regard to PPI degree (Wagner, 2002; He and Zhang, 2005), sequence divergence (Conant and Wagner, 2003; Zhang et al, 2003; Kellis et al, 2004; Scannell and Wolfe, 2008) and expression patterns (Gu et al, 2002b; Tirosh and Barkai, 2007). Although genetic interactions are further removed from mechanism than protein–protein interactions, for example, they do offer a more direct measurement of functional consequence and, thus, may give a better indication of the functional differences between a duplicate pair. We found that duplicates exhibit a strikingly asymmetric pattern of genetic interactions, with the ratio of interactions between sisters commonly exceeding 7:1 (Figure 4A). The observations differ significantly from random simulations in which genetic interactions were redistributed between sisters with equal probability (Figure 4A). Moreover, the directionality of this interaction asymmetry agrees with other physiological properties of duplicate pairs. For example, the sister with more genetic interactions also tends to have more protein–protein interactions and also tends to evolve at a slower rate (Figure 4B).
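The comparison against random simulations described above can be sketched as a simple null model in which each pair's total number of genetic interactions is reassigned to the two sisters by fair coin flips. This is a minimal illustration, not the authors' actual procedure; the function names, the 7:1 threshold used as a cut-off, and the example counts are assumptions made for the sketch.

```python
import random


def asymmetry_ratio(a, b):
    """Ratio of the larger to the smaller interaction count of a sister pair."""
    hi, lo = max(a, b), min(a, b)
    if hi == 0:
        return 1.0  # no interactions at all: treat as perfectly balanced
    return float("inf") if lo == 0 else hi / lo


def observed_fraction_extreme(pairs, threshold=7.0):
    """Fraction of pairs whose observed ratio is at least `threshold`."""
    return sum(asymmetry_ratio(a, b) >= threshold for a, b in pairs) / len(pairs)


def simulated_fraction_extreme(pairs, threshold=7.0, n_sims=200, seed=0):
    """Mean fraction of extreme pairs when interactions are split 50:50 at random."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        extreme = 0
        for a, b in pairs:
            n = a + b
            # Each interaction goes to the first sister with probability 0.5.
            first = sum(1 for _ in range(n) if rng.random() < 0.5)
            if asymmetry_ratio(first, n - first) >= threshold:
                extreme += 1
        total += extreme / len(pairs)
    return total / n_sims


# Hypothetical interaction counts for four sister pairs.
pairs = [(70, 5), (40, 40), (64, 8), (30, 2)]
observed = observed_fraction_extreme(pairs)
expected_by_chance = simulated_fraction_extreme(pairs)
```

If the observed fraction of strongly imbalanced pairs is far above the simulated one, the imbalance is unlikely to be explained by random partitioning of interactions alone.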
Genetic interaction degree and profiles can be used to understand the functional divergence of particular duplicate pairs. As a case example, we consider the whole-genome-duplication pair CIK1–VIK1. Each of these genes encodes a protein that forms a distinct heterodimeric complex with the microtubule motor protein Kar3 (Manning et al, 1999). Although each of these proteins depends on a direct physical interaction with Kar3, Cik1 has a much higher profile similarity to Kar3 than does Vik1 (r=0.5 and r=0.3, respectively). Consistent with its higher similarity, Δcik1 and Δkar3 exhibit several similar phenotypes, including abnormally short spindles, chromosome loss and delayed cell cycle progression (Page et al, 1994; Manning et al, 1999). In contrast, a Δvik1 mutant strain exhibits no overt phenotype (Manning et al, 1999).
The characterization of functional redundancy and divergence between duplicate genes is an important step in understanding the evolution of genetic systems. Large-scale genetic network analysis in Saccharomyces cerevisiae provides a powerful perspective for addressing these questions through quantitative measurements of genetic interactions between pairs of duplicated genes, and more generally, through the study of genome-wide genetic interaction profiles associated with duplicated genes. We show that duplicate genes exhibit fewer genetic interactions than other genes because they tend to buffer one another functionally, whereas observed interactions are non-overlapping and reflect their divergent roles. We also show that duplicate gene pairs are highly imbalanced in their number of genetic interactions with other genes, a pattern that appears to result from asymmetric evolution, such that one duplicate evolves or degrades faster than the other and often becomes functionally or conditionally specialized. The differences in genetic interactions are predictive of differences in several other evolutionary and physiological properties of duplicate pairs.
doi:10.1038/msb.2010.82
PMCID: PMC3010121  PMID: 21081923
duplicate genes; functional divergence; genetic interactions; paralogs; Saccharomyces cerevisiae
3.  Selection in the evolution of gene duplications 
Genome Biology  2002;3(2):research0008.1-research0008.9.
Background
Gene duplications have a major role in the evolution of new biological functions. Theoretical studies often assume that a duplication per se is selectively neutral and that, following a duplication, one of the gene copies is freed from purifying (stabilizing) selection, which creates the potential for evolution of a new function.
Results
In search of systematic evidence of accelerated evolution after duplication, we used data from 26 bacterial, six archaeal, and seven eukaryotic genomes to compare the mode and strength of selection acting on recently duplicated genes (paralogs) and on similarly diverged, unduplicated orthologous genes in different species. We find that the ratio of nonsynonymous to synonymous substitutions (Kn/Ks) in most paralogous pairs is <<1 and that paralogs typically evolve at similar rates, without significant asymmetry, indicating that both paralogs produced by a duplication are subject to purifying selection. This selection is, however, substantially weaker than the purifying selection affecting unduplicated orthologs that have diverged to the same extent as the analyzed paralogs. Most of the recently duplicated genes appear to be involved in various forms of environmental response; in particular, many of them encode membrane and secreted proteins.
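The Kn/Ks reasoning above can be made concrete with a small sketch: a ratio well below 1 indicates purifying selection, a ratio near 1 is compatible with neutral evolution, and a ratio above 1 suggests positive selection. The function names, the tolerance band around 1, and the example values are assumptions for illustration only; estimating Kn and Ks from aligned sequences is a separate step not shown here.

```python
def kn_ks_ratio(kn, ks):
    """Ratio of nonsynonymous (Kn) to synonymous (Ks) substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    return kn / ks


def selection_regime(kn, ks, tolerance=0.1):
    """Coarse label for the mode of selection (tolerance is an arbitrary assumption)."""
    ratio = kn_ks_ratio(kn, ks)
    if ratio < 1 - tolerance:
        return "purifying"
    if ratio > 1 + tolerance:
        return "positive"
    return "near-neutral"


# Hypothetical values: a paralogous pair and a similarly diverged ortholog pair.
paralog_ratio = kn_ks_ratio(0.08, 0.40)
ortholog_ratio = kn_ks_ratio(0.02, 0.40)
weaker_constraint_on_paralogs = paralog_ratio > ortholog_ratio
```

In this toy comparison both pairs are labeled as under purifying selection, yet the paralogs have the higher ratio, which is the pattern the abstract describes as weaker purifying selection on recent duplicates.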
Conclusions
The results of this analysis indicate that recently duplicated paralogs evolve faster than orthologs with the same level of divergence and similar functions, but apparently do not experience a phase of neutral evolution. We hypothesize that gene duplications that persist in an evolving lineage are beneficial from the time of their origin, due primarily to a protein dosage effect in response to variable environmental conditions; duplications are likely to give rise to new functions at a later phase of their evolution once a higher level of divergence is reached.
PMCID: PMC65685  PMID: 11864370
4.  Local synteny and codon usage contribute to asymmetric sequence divergence of Saccharomyces cerevisiae gene duplicates 
Background
Duplicated genes frequently experience asymmetric rates of sequence evolution. Relaxed selective constraints and positive selection have both been invoked to explain the observation that one paralog within a gene-duplicate pair exhibits an accelerated rate of sequence evolution. In the majority of studies where asymmetric divergence has been established, there is no indication as to which gene copy, ancestral or derived, is evolving more rapidly. In this study we investigated the effect of local synteny (gene-neighborhood conservation) and codon usage on the sequence evolution of gene duplicates in the S. cerevisiae genome. We further distinguish the gene duplicates into those that originated from a whole-genome duplication (WGD) event (ohnologs) versus small-scale duplications (SSD) to determine if there exist any differences in their patterns of sequence evolution.
Results
For SSD pairs, the derived copy evolves faster than the ancestral copy. However, there is no relationship between rate asymmetry and synteny conservation (ancestral-like versus derived-like) in ohnologs. mRNA abundance and optimal codon usage, as measured by the codon adaptation index (CAI), are lower in the derived SSD copies relative to ancestral paralogs. Moreover, in the case of ohnologs, the faster-evolving copy has lower CAI and lower expression.
Conclusions
Together, these results suggest that relaxation of selection for codon usage and gene expression contributes to rate asymmetry in the evolution of duplicated genes and that, in SSD pairs, the relaxation of selection stems from the loss of ancestral regulatory information in the derived copy.
doi:10.1186/1471-2148-11-279
PMCID: PMC3190396  PMID: 21955875
5.  In silico evidence for functional specialization after genome duplication in yeast 
Fems Yeast Research  2008;9(1):16-31.
A fairly recent whole-genome duplication (WGD) event in yeast enables the effects of gene duplication and subsequent functional divergence to be characterized. We examined 15 ohnolog pairs (i.e. paralogs from a WGD) out of c. 500 Saccharomyces cerevisiae ohnolog pairs that have persisted over an estimated 100 million years of evolution. These 15 pairs were chosen for their high levels of asymmetry, i.e. within the pair, one ohnolog had evolved much faster than the other. Sequence comparisons of the 15 pairs revealed that the faster evolving duplicated genes typically appear to have experienced partially – but not fully – relaxed negative selection as evidenced by an average nonsynonymous/synonymous substitution ratio (dN/dSavg=0.44) that is higher than the slow-evolving genes' ratio (dN/dSavg=0.14) but still <1. Increased number of insertions and deletions in the fast-evolving genes also indicated loosened structural constraints. Sequence and structural comparisons indicated that a subset of these pairs had significant differences in their catalytically important residues and active or cofactor-binding sites. A literature survey revealed that several of the fast-evolving genes have gained a specialized function. Our results indicate that subfunctionalization and even neofunctionalization has occurred along with degenerative evolution, in which unneeded functions were destroyed by mutations.
doi:10.1111/j.1567-1364.2008.00451.x
PMCID: PMC2704937  PMID: 19133069
gene duplication; yeast genome; protein evolution; sequence analysis; structural analysis
6.  A network perspective on the evolution of metabolism by gene duplication 
Genome Biology  2007;8(2):R26.
In silico models trying to explain the origin and evolution of metabolism are improved with the inclusion of specific functional constraints, such as the preferential coupling of reactions.
Background
Gene duplication followed by divergence is one of the main sources of metabolic versatility. The patchwork and stepwise models of metabolic evolution help us to understand these processes, but their assumptions are relatively simplistic. We used a network-based approach to determine the influence of metabolic constraints on the retention of duplicated genes.
Results
We detected duplicated genes by looking for enzymes sharing homologous domains and uncovered an increased retention of duplicates for enzymes catalyzing consecutive reactions, as illustrated by the ligases acting in the biosynthesis of peptidoglycan. As a consequence, metabolic networks show a high retention of duplicates within functional modules, and we found a preferential biochemical coupling of reactions that partially explains this bias. A similar situation was found in enzyme-enzyme interaction networks, but not in interaction networks of non-enzymatic proteins or gene transcriptional regulatory networks, suggesting that the retention of duplicates results from the biochemical rules governing substrate-enzyme-product relationships. We confirmed a high retention of duplicates between chemically similar reactions, as illustrated by fatty-acid metabolism. The retention of duplicates between chemically dissimilar reactions is, however, also greater than expected by chance. Finally, we detected a significant retention of duplicates as groups, instead of single pairs.
Conclusion
Our results indicate that in silico modeling of the origin and evolution of metabolism is improved by the inclusion of specific functional constraints, such as the preferential biochemical coupling of reactions. We suggest that the stepwise and patchwork models are not independent of each other: in fact, the network perspective enables us to reconcile and combine these models.
doi:10.1186/gb-2007-8-2-r26
PMCID: PMC1852415  PMID: 17326820
7.  Profiling of gene duplication patterns of sequenced teleost genomes: evidence for rapid lineage-specific genome expansion mediated by recent tandem duplications 
BMC Genomics  2012;13:246.
Background
Gene duplication has had a major impact on genome evolution. Localized (or tandem) duplication resulting from unequal crossing over and whole genome duplication are believed to be the two dominant mechanisms contributing to vertebrate genome evolution. While much scrutiny has been directed toward discerning patterns indicative of whole-genome duplication events in teleost species, less attention has been paid to the continuous nature of gene duplications and their impact on the size, gene content, functional diversity, and overall architecture of teleost genomes.
Results
Here, using a Markov clustering algorithm directed approach we catalogue and analyze patterns of gene duplication in the four model teleost species with chromosomal coordinates: zebrafish, medaka, stickleback, and Tetraodon. Our analyses based on set size, duplication type, synonymous substitution rate (Ks), and gene ontology emphasize shared and lineage-specific patterns of genome evolution via gene duplication. Most strikingly, our analyses highlight the extraordinary duplication and retention rate of recent duplicates in zebrafish and their likely role in the structural and functional expansion of the zebrafish genome. We find that the zebrafish genome is remarkable in its large number of duplicated genes, small duplicate set size, biased Ks distribution toward minimal mutational divergence, and proportion of tandem and intra-chromosomal duplicates when compared with the other teleost model genomes. The observed gene duplication patterns have played significant roles in shaping the architecture of teleost genomes and appear to have contributed to the recent functional diversification and divergence of important physiological processes in zebrafish.
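For readers unfamiliar with Markov clustering (MCL), the core loop alternates an expansion step (squaring a column-stochastic matrix) with an inflation step (raising entries to a power and renormalizing). The sketch below is a minimal pure-Python version for a small unweighted similarity graph; it is not the pipeline used in the study, and the inflation value, tolerance, and example graph are assumptions.

```python
def normalize_columns(m):
    """Scale each column so that it sums to 1."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster nodes of a graph given as a square adjacency matrix."""
    n = len(adjacency)
    # Add self-loops, then make the matrix column-stochastic.
    m = normalize_columns([[float(adjacency[i][j]) + (1.0 if i == j else 0.0)
                            for j in range(n)] for i in range(n)])
    for _ in range(max_iter):
        expanded = matmul(m, m)                       # expansion
        inflated = normalize_columns(                 # inflation
            [[x ** inflation for x in row] for row in expanded])
        change = max(abs(inflated[i][j] - m[i][j])
                     for i in range(n) for j in range(n))
        m = inflated
        if change < tol:
            break
    # Each row with nonzero entries lists the members of one cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical similarity graph: two separate groups of three genes each.
graph = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
gene_sets = mcl(graph)
```

In practice the edges would carry similarity weights (for example from sequence comparisons), and a sparse-matrix implementation would be needed for genome-scale inputs.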
Conclusions
We have analyzed gene duplication patterns and duplication types among the available teleost genomes and found that a large number of genes were tandemly and intrachromosomally duplicated, suggesting that they originated through independent and continuous duplication. This is particularly true for the zebrafish genome. Further analysis of the duplicated gene sets indicated that a significant portion of duplicated genes in the zebrafish genome arose from recent, lineage-specific duplication events. Most strikingly, a subset of duplicated genes is enriched among the recently duplicated genes involved in immune or sensory response pathways. Such findings demonstrate the significance of continuous gene duplication as well as that of whole genome duplication in the course of genome evolution.
doi:10.1186/1471-2164-13-246
PMCID: PMC3464592  PMID: 22702965
Gene duplication; Whole genome duplication; Teleost species; Tandem duplication
8.  Evolution and functional divergence of NLRP genes in mammalian reproductive systems 
Background
NLRPs (Nucleotide-binding oligomerization domain, Leucine rich Repeat and Pyrin domain containing Proteins) are members of the NLR (Nod-like receptor) protein family. Recent research has shown that NLRP genes play important roles in both the mammalian innate immune system and the reproductive system. Several NLRP genes were shown to be specifically expressed in the oocyte in mammals. The aim of the present work was to study how these genes evolved and diverged after their duplication, as well as whether natural selection played a role during their evolution.
Results
Using in silico methods, we have evaluated the evolution and functional divergence of NLRP genes, in particular of mouse reproduction-related Nlrp genes. We found that (1) major NLRP genes were duplicated before the divergence of mammals, with certain lineage-specific duplications in primates (NLRP7 and 11) and in rodents (Nlrp1, 4 and 9 duplicates); (2) tandem duplication events gave rise to a mammalian reproduction-related NLRP cluster including NLRP2, 4, 5, 7, 8, 9, 11, 13 and 14 genes; (3) the function of mammalian oocyte-specific NLRP genes (NLRP4, 5, 9 and 14) might have diverged during gene evolution; (4) recent segmental duplications involving Nlrp4 copies and vomeronasal 1 receptor encoding genes (V1r) have occurred in the mouse; and (5) duplicates of Nlrp4 and 9 in the mouse might have been subjected to adaptive evolution.
Conclusion
In conclusion, this study brings us novel information on the evolution of mammalian reproduction-related NLRPs. On the one hand, NLRP genes duplicated and functionally diversified in mammalian reproductive systems (such as NLRP4, 5, 9 and 14). On the other hand, during evolution, different lineages adapted to develop their own NLRP genes, particularly in reproductive function (such as the specific expansion of Nlrp4 and Nlrp9 in the mouse).
doi:10.1186/1471-2148-9-202
PMCID: PMC2735741  PMID: 19682372
9.  The fate of the duplicated androgen receptor in fishes: a late neofunctionalization event? 
Background
Based on the observation of an increased number of paralogous genes in teleost fishes compared with other vertebrates and on the conserved synteny between duplicated copies, it has been shown that a whole genome duplication (WGD) occurred during the evolution of Actinopterygian fish. Comparative phylogenetic dating of this duplication event suggests that it occurred early on, specifically in teleosts. It has been proposed that this event might have facilitated the evolutionary radiation and the phenotypic diversification of the teleost fish, notably by allowing the sub- or neo-functionalization of many duplicated genes.
Results
In this paper, we studied in a wide range of Actinopterygians the duplication and fate of the androgen receptor (AR, NR3C4), a nuclear receptor known to play a key role in sex-determination in vertebrates. The pattern of AR gene duplication is consistent with an early WGD event: it has been duplicated into two genes AR-A and AR-B after the split of the Acipenseriformes from the lineage leading to teleost fish but before the divergence of Osteoglossiformes. Genomic and syntenic analyses in addition to lack of PCR amplification show that one of the duplicated copies, AR-B, was lost in several basal Clupeocephala such as Cypriniformes (including the model species zebrafish), Siluriformes, Characiformes and Salmoniformes. Interestingly, we also found that, in basal teleost fish (Osteoglossiformes and Anguilliformes), the two copies remain very similar, whereas, specifically in Percomorphs, one of the copies, AR-B, has accumulated substitutions in both the ligand binding domain (LBD) and the DNA binding domain (DBD).
Conclusion
The comparison of the mutations present in these divergent AR-B with those known in human to be implicated in complete, partial or mild androgen insensitivity syndrome suggests that the existence of two distinct AR duplicates may be correlated to specific functional differences that may be connected to the well-known plasticity of sex determination in fish. This suggests that three specific events have shaped the present diversity of ARs in Actinopterygians: (i) early WGD, (ii) parallel loss of one duplicate in several lineages and (iii) putative neofunctionalization of the same duplicate in percomorphs, which occurred a long time after the WGD.
doi:10.1186/1471-2148-8-336
PMCID: PMC2637867  PMID: 19094205
10.  Expression pattern divergence of duplicated genes in rice 
BMC Bioinformatics  2009;10(Suppl 6):S8.
Background
Genome-wide duplication is ubiquitous during diversification of the angiosperms, and gene duplication is one of the most important mechanisms for evolutionary novelties. As an indicator of functional evolution, the divergence of expression patterns following duplication events has drawn great attention in recent years. Using large-scale whole-genome microarray data, we systematically analyzed expression divergence patterns of rice genes from block, tandem and dispersed duplications.
Results
We found a significant difference in expression divergence patterns for the three types of duplicated gene pairs. Expression correlation is significantly higher for gene pairs from block and tandem duplications than those from dispersed duplications. Furthermore, a significant correlation was observed between the expression divergence and the synonymous substitution rate which is an approximate proxy of divergence time. Thus, both duplication types and divergence time influence the difference in expression divergence. Using a linear model, we investigated the influence of these two variables and found that the difference in expression divergence between block and dispersed duplicates is attributed largely to their different divergence time. In addition, the difference in expression divergence between tandem and the other two types of duplicates is attributed to both divergence time and duplication type.
Conclusion
Consistent with previous studies on Arabidopsis, our results revealed a significant difference in expression divergence between the types of duplicated genes and a significant correlation between expression divergence and synonymous substitution rate. We found that the contribution of duplication mode to expression divergence implies a different evolutionary course for each type of duplicated gene.
doi:10.1186/1471-2105-10-S6-S8
PMCID: PMC2697655  PMID: 19534757
11.  Noncoding Sequences Near Duplicated Genes Evolve Rapidly 
Gene expression divergence and chromosomal rearrangements have been put forward as major contributors to phenotypic differences between closely related species. It has also been established that duplicated genes show enhanced rates of positive selection in their amino acid sequences. If functional divergence is largely due to changes in gene expression, it follows that regulatory sequences in duplicated loci should also evolve rapidly. To investigate this hypothesis, we performed likelihood ratio tests (LRTs) on all noncoding loci within 5 kb of every transcript in the human genome and identified sequences with increased substitution rates in the human lineage since divergence from Old World Monkeys. The fraction of rapidly evolving loci is significantly higher near genes that duplicated in the common ancestor of humans and chimps than near nonduplicated genes. We also conducted a genome-wide scan for nucleotide substitutions predicted to affect transcription factor binding. Rates of binding site divergence are elevated in noncoding sequences of duplicated loci with accelerated substitution rates. Many of the genes associated with these fast-evolving genomic elements belong to functional categories identified in previous studies of positive selection on amino acid sequences. In addition, we find enrichment for accelerated evolution near genes involved in establishment and maintenance of pregnancy, processes that differ significantly between humans and monkeys. Our findings support the hypothesis that adaptive evolution of the regulation of duplicated genes has played a significant role in human evolution.
doi:10.1093/gbe/evq037
PMCID: PMC2942038  PMID: 20660939
accelerated substitution; noncoding sequence; gene duplication
12.  Divergent evolutionary fates of major photosynthetic gene networks following gene and whole genome duplications 
Plant Signaling & Behavior  2011;6(4):594-597.
Gene and genome duplication are recurring processes in flowering plants, and elucidating the mechanisms by which duplicated genes are lost or deployed is a key component of understanding plant evolution. Using gene ontologies (GO) or protein family (PFAM) domains, distinct patterns of duplicate retention and loss have been identified depending on gene functional properties and duplication mechanism, but little is known about how gene networks encoding interacting proteins (protein complexes or signaling cascades) evolve in response to duplication. We examined patterns of duplicate retention within four major gene networks involved in photosynthesis (the Calvin cycle, photosystem I, photosystem II and the light harvesting complex) across three species and four whole genome duplications, as well as small-scale duplications, and showed that photosystem gene family evolution is governed largely by dosage sensitivity [1]. In contrast, Calvin cycle gene families are not dosage-sensitive, but exhibit a greater capacity for functional differentiation. Here we review these findings, highlight how this study, by analyzing defined gene networks, is complementary to global studies using functional annotations such as GO and PFAM, and elaborate on one example of functional differentiation in the Calvin cycle gene family, transketolase.
doi:10.4161/psb.6.4.15370
PMCID: PMC3142401  PMID: 21494088
gene duplication; whole genome duplication; dosage sensitivity; balance hypothesis
13.  Divergence of exonic splicing elements after gene duplication and the impact on gene structures 
Genome Biology  2009;10(11):R120.
An analysis of human exonic splicing elements in duplicated genes reveals their important role in the generation of new gene structures.
Background
The origin of new genes and their contribution to functional novelty has been the subject of considerable interest. There has been much progress in understanding the mechanisms by which new genes originate. Here we examine a novel way that new gene structures could originate, namely through the evolution of new alternative splicing isoforms after gene duplication.
Results
We studied the divergence of exonic splicing enhancers and silencers after gene duplication and the contributions of such divergence to the generation of new splicing isoforms. We found that exonic splicing enhancers and exonic splicing silencers diverge especially fast shortly after gene duplication. About 10% and 5% of paralogous exons undergo significantly asymmetric evolution of exonic splicing enhancers and silencers, respectively. When compared to pre-duplication ancestors, we found that there is a significant overall loss of exonic splicing enhancers and the magnitude increases with duplication age. Detailed examination reveals net gains and losses of exonic splicing enhancers and silencers in different copies and paralog clusters after gene duplication. Furthermore, we found that exonic splicing enhancer and silencer changes are mainly caused by synonymous mutations, though nonsynonymous changes also contribute. Finally, we found that exonic splicing enhancer and silencer divergence results in exon splicing state transitions (from constitutive to alternative or vice versa), and that the proportion of paralogous exon pairs with different splicing states also increases over time, consistent with previous predictions.
Conclusions
Our results suggest that exonic splicing enhancer and silencer changes after gene duplication have important roles in alternative splicing divergence and that these changes contribute to the generation of new gene structures.
doi:10.1186/gb-2009-10-11-r120
PMCID: PMC3091315  PMID: 19883501
14.  Temporal pattern of loss/persistence of duplicate genes involved in signal transduction and metabolic pathways after teleost-specific genome duplication 
Background
Recent genomic studies have revealed that a teleost-specific third-round whole genome duplication (3R-WGD) event occurred in a common ancestor of teleost fishes. However, it is unclear how the genes duplicated in this event were lost or persisted during the diversification of teleosts, and therefore, how many of the duplicated genes contribute to the genetic differences among teleosts. This subject is also important for understanding the process of vertebrate evolution through WGD events. We applied a comparative evolutionary approach to this question by focusing on the genes involved in long-term potentiation, taste and olfactory transduction, and the tricarboxylic acid cycle, based on the whole genome sequences of four teleosts: zebrafish, medaka, stickleback, and green spotted puffer fish.
Results
We applied a state-of-the-art method of maximum-likelihood phylogenetic inference and conserved synteny analyses to each of 130 genes involved in the above biological systems in humans. These analyses identified 116 orthologous gene groups between teleosts and tetrapods, and 45 pairs of 3R-WGD-derived duplicate genes among them. This suggests that more than half [(45×2)/(116+45) = 56.5%] of the loci, probably more than ten thousand genes, present in a common ancestor of the four teleosts were still duplicated after the 3R-WGD. The estimated temporal pattern of gene loss suggested that, after the 3R-WGD, many (71/116) of the duplicated genes were rapidly lost during the initial 75 million years (MY), whereas on average more than half (27.3/45) of the duplicated genes remaining in the ancestor of the four teleosts (45/116) have persisted for about 275 MY. The 3R-WGD-derived duplicates that have persisted for long evolutionary periods had a significantly larger number of interacting partners and longer protein-coding sequences, implying that they tend to be more multifunctional than the singletons after the 3R-WGD.
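The proportions quoted in these Results can be recomputed from the counts given in the text. The following is a minimal sketch in Python; the variable names and the `percent` helper are illustrative assumptions, not part of the original study. As written, the first expression evaluates to about 55.9%, slightly below the 56.5% given in the abstract; the sketch does not try to resolve that difference.

```python
# Counts taken from the abstract; all names here are illustrative.
ortholog_groups = 116   # orthologous gene groups between teleosts and tetrapods
duplicate_pairs = 45    # pairs of 3R-WGD-derived duplicates among those groups
lost_early = 71         # groups whose duplicate was lost in the initial 75 MY
persisting_avg = 27.3   # average number of pairs persisting for about 275 MY


def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100.0 * numerator / denominator, 1)


# Share of loci still duplicated, using the formula printed in the abstract.
duplicated_loci = percent(duplicate_pairs * 2, ortholog_groups + duplicate_pairs)
# Share of groups that lost a duplicate early.
early_loss = percent(lost_early, ortholog_groups)
# Share of retained pairs that persisted long term.
long_term = percent(persisting_avg, duplicate_pairs)
# Share of groups still duplicated in the common ancestor of the four teleosts.
retained = percent(duplicate_pairs, ortholog_groups)

print(duplicated_loci, early_loss, long_term, retained)
```

The early-loss and retained counts add up to the 116 groups analysed (71 + 45), which is a quick consistency check on the quoted figures.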
Conclusion
We have shown, for the first time, the temporal pattern of gene loss after the 3R-WGD on the basis of teleost phylogeny and divergence time frameworks. The 3R-WGD-derived duplicates have not undergone constant exponential decay, suggesting that selection favoured the long-term persistence of a subset of duplicates that tend to be multi-functional. On the basis of these results obtained from the analysis of 116 orthologous gene groups, we propose that more than ten thousand 3R-WGD-derived duplicates have experienced lineage-specific evolution, that is, the differential sub-/neo-functionalization or secondary loss between lineages, and contributed to teleost diversity.
doi:10.1186/1471-2148-9-127
PMCID: PMC2702319  PMID: 19500364
15.  Evolution of developmental roles of Pax2/5/8 paralogs after independent duplication in urochordate and vertebrate lineages 
BMC Biology  2008;6:35.
Background
Gene duplication provides opportunities for lineage diversification and evolution of developmental novelties. Duplicated genes generally either disappear by accumulation of mutations (nonfunctionalization), or are preserved either by the origin of positively selected functions in one or both duplicates (neofunctionalization), or by the partitioning of original gene subfunctions between the duplicates (subfunctionalization). The Pax2/5/8 family of important developmental regulators has undergone parallel expansion among chordate groups. After the divergence of urochordate and vertebrate lineages, two rounds of independent gene duplications resulted in the Pax2, Pax5, and Pax8 genes of most vertebrates (the sister group of the urochordates), and an additional duplication provided the pax2a and pax2b duplicates in teleost fish. Separate from the vertebrate genome expansions, a duplication also created two Pax2/5/8 genes in the common ancestor of ascidian and larvacean urochordates.
Results
To better understand mechanisms underlying the evolution of duplicated genes, we investigated, in the larvacean urochordate Oikopleura dioica, the embryonic gene expression patterns of Pax2/5/8 paralogs. We compared the larvacean and ascidian expression patterns to infer modular subfunctions present in the single pre-duplication Pax2/5/8 gene of stem urochordates, and we compared vertebrate and urochordate expression to infer the suite of Pax2/5/8 gene subfunctions in the common ancestor of olfactores (vertebrates + urochordates). Expression pattern differences of larvacean and ascidian Pax2/5/8 orthologs in the endostyle, pharynx and hindgut suggest that some ancestral gene functions have been partitioned differently to the duplicates in the two urochordate lineages. Novel expression in the larvacean heart may have resulted from the neofunctionalization of a Pax2/5/8 gene in the urochordates. Expression of larvacean Pax2/5/8 in the endostyle, in sites of epithelial remodeling, and in sensory tissues evokes like functions of Pax2, Pax5 and Pax8 in vertebrate embryos, and may indicate ancient origins for these functions in the chordate common ancestor.
Conclusion
Comparative analysis of expression patterns of chordate Pax2/5/8 duplicates, rooted on the single-copy Pax2/5/8 gene of amphioxus, whose lineage diverged basally among chordates, provides new insights into the evolution and development of the heart, thyroid, pharynx, stomodeum and placodes in chordates; supports the controversial conclusion that the atrial siphon of ascidians and the otic placode in vertebrates are homologous; and backs the notion that Pax2/5/8 functioned in ancestral chordates to engineer epithelial fusions and perforations, including gill slit openings.
doi:10.1186/1741-7007-6-35
PMCID: PMC2532684  PMID: 18721460
16.  High Occurrence of Functional New Chimeric Genes in Survey of Rice Chromosome 3 Short Arm Genome Sequences 
Genome Biology and Evolution  2013;5(5):1038-1048.
In an effort to identify newly evolved genes in rice, we searched the genomes of Asian-cultivated rice Oryza sativa ssp. japonica and its wild progenitors, looking for lineage-specific genes. Using genome pairwise comparison of approximately 20-Mb DNA sequences from the chromosome 3 short arm (Chr3s) in six rice species, O. sativa, O. nivara, O. rufipogon, O. glaberrima, O. barthii, and O. punctata, combined with synonymous substitution rate tests and other evidence, we were able to identify potential recently duplicated genes, which evolved within the last 1 Myr. We identified 28 functional O. sativa genes, which likely originated after O. sativa diverged from O. glaberrima. These genes account for around 1% (28/3,176) of all annotated genes on O. sativa’s Chr3s. Among the 28 new genes, two recently duplicated segments contained eight genes. Fourteen of the 28 new genes have a chimeric gene structure derived from one or multiple parental genes and flanking targeting sequences. Although the majority of these 28 new genes were formed by single or segmental DNA-based gene duplication and recombination, we found two genes that likely originated partially through exon shuffling. Sequence divergence tests between new genes and their putative progenitors indicated that new genes were most likely evolving under natural selection. We showed that all 28 new genes appeared to be functional, as suggested by Ka/Ks analysis and the presence of RNA-seq, cDNA, expressed sequence tag, massively parallel signature sequencing, and/or small RNA data. The high rate of new gene origination and of chimeric gene formation in rice may demonstrate rice’s broad diversification, domestication, and environmental adaptation, as well as the role of new genes in rice speciation.
doi:10.1093/gbe/evt071
PMCID: PMC3673630  PMID: 23651622
chimera; comparative genomics; gene duplication; new gene; Oryza
17.  Comparative phylogenomic analyses of teleost fish Hox gene clusters: lessons from the cichlid fish Astatotilapia burtoni 
BMC Genomics  2007;8:317.
Background
Teleost fish have seven paralogous clusters of Hox genes stemming from two complete genome duplications early in vertebrate evolution, and an additional genome duplication during the evolution of ray-finned fish, followed by the secondary loss of one cluster. Gene duplications on the one hand, and the evolution of regulatory sequences on the other, are thought to be among the most important mechanisms for the evolution of new gene functions. Cichlid fish, the largest family of vertebrates with about 2500 species, are famous examples of speciation and morphological diversity. Since this diversity could be based on regulatory changes, we chose to study the coding as well as putative regulatory regions of their Hox clusters within a comparative genomic framework.
Results
We sequenced and characterized all seven Hox clusters of Astatotilapia burtoni, a haplochromine cichlid fish. Comparative analyses with data from other teleost fish such as zebrafish, two species of pufferfish, stickleback and medaka were performed. We traced losses of genes and microRNAs of Hox clusters; the medaka lineage seems to have lost more microRNAs than the other fish lineages. We found that each teleost genome studied so far has a unique set of Hox genes. The hoxb7a gene was lost independently several times during teleost evolution, the most recent event being within the radiation of East African cichlid fish. The conserved non-coding sequences (CNS) encompass a surprisingly large part of the clusters, especially in the HoxAa, HoxCa, and HoxDa clusters. Across all clusters, we observe a trend towards an increased content of CNS towards the anterior end.
Conclusion
The gene content of Hox clusters in teleost fishes is more variable than expected, with each species studied so far having a different set. Although the highest loss rate of Hox genes occurred immediately after whole genome duplications, our analyses showed that gene loss continued and is still ongoing in all teleost lineages. Along with the gene content, the CNS content also varies across clusters. The excess of CNS at the anterior end of clusters could imply a stronger conservation of anterior expression patterns than of those towards more posterior areas of the embryo.
doi:10.1186/1471-2164-8-317
PMCID: PMC2080641  PMID: 17845724
18.  Evolutionary history of the UCP gene family: gene duplication and selection 
Background
The uncoupling protein (UCP) genes belong to the superfamily of electron transport carriers of the mitochondrial inner membrane. Members of the uncoupling protein family are involved in thermogenesis, and determining the functional evolution of UCP genes is important for understanding the evolution of thermo-regulation in vertebrates.
Results
Sequence similarity searches of genome and scaffold data identified homologues of UCP in eutherians and teleosts, as well as the first squamate uncoupling proteins. Phylogenetic analysis was used to characterize the evolutionary history of the family by identifying two duplications early in vertebrate evolution and two losses in the avian lineage (excluding duplications within a species, excluding the losses due to incompletely sequenced taxa and excluding the losses and duplications inferred through mismatch of species and gene trees). Estimates of synonymous and nonsynonymous substitution rates (dN/dS) and more complex branch and site models suggest that the duplication events were not associated with positive Darwinian selection and that the UCP is constrained by strong purifying selection except for a single site which has undergone positive Darwinian selection, demonstrating that the UCP gene family must be highly conserved.
Conclusion
We present a phylogeny describing the evolutionary history of the UCP gene family and show that the genes have evolved through duplications followed by purifying selection except for a single site in the mitochondrial matrix between the 5th and 6th α-helices which has undergone positive selection.
doi:10.1186/1471-2148-8-306
PMCID: PMC2584656  PMID: 18980678
19.  Many genes in fish have species-specific asymmetric rates of molecular evolution 
BMC Genomics  2006;7:20.
Background
Gene and genome duplication events increase the amount of genetic material that might then contribute to an increase in the genomic and phenotypic complexity of organisms during evolution. Thus, it has been argued that there is a relationship between gene copy number and morphological complexity and/or species diversity. This hypothesis implies that duplicated genes have subdivided or evolved novel functions compared to their pre-duplication proto-orthologs. Such a functional divergence might be caused by an increase in evolutionary rates in one ortholog, by changes in expression, regulatory evolution, insertion of repetitive elements, or due to positive Darwinian selection in one copy. We studied a set of 2466 genes that were present in Danio rerio, Takifugu rubripes, Tetraodon nigroviridis and Oryzias latipes to test (i) for forces of positive Darwinian selection; (ii) how frequently duplicated genes are retained, and (iii) whether novel gene functions might have evolved.
Results
25% (610) of all investigated genes show significantly smaller or higher genetic distances in the genomes of particular fish species compared to their human ortholog than their orthologs in other fish according to relative rate tests. We identified 49 new paralogous pairs of duplicated genes in fish, in which one of the paralogs is under positive Darwinian selection and shows a significantly higher rate of molecular evolution in one of the four fish species, whereas the other copy apparently did not undergo adaptive changes since it retained the original rate of evolution. Among the genes under positive Darwinian selection, we found a surprisingly high number of ATP binding proteins and transcription factors.
Conclusion
The significant rate difference suggests that the function of these rate-changed genes might be essential for the respective fish species. We demonstrate that the measurement of positive selection is a powerful tool to identify divergence rates of duplicated genes and that this method has the capacity to identify potentially interesting candidates for adaptive gene evolution.
doi:10.1186/1471-2164-7-20
PMCID: PMC1413527  PMID: 16466575
20.  Evolutionary Analysis of Sequence Divergence and Diversity of Duplicate Genes in Aspergillus fumigatus 
Gene duplication, as a major source of novel genetic material, plays an important role in evolution. In this study, we focus on duplicate genes in Aspergillus fumigatus, a ubiquitous filamentous fungus causing life-threatening human infections. We characterize the extent and evolutionary patterns of the duplicate genes in the genome of A. fumigatus. Our results show that A. fumigatus contains a large number of duplicate genes with pronounced sequence divergence between two copies, and approximately 10% of them diverge asymmetrically, i.e. two copies of a duplicate gene pair diverge at significantly different rates. We use a Bayesian approach of the McDonald-Kreitman test to infer distributions of selective coefficients γ (= 2N_e s) and find that (1) the values of γ for two copies of duplicate genes co-vary positively and (2) the average γ for the two copies differs between genes from different gene families. This analysis highlights the usefulness of combining divergence and diversity data in studying the evolution of duplicate genes. Taken together, our results provide further support and refinement to the theories of gene duplication. Through characterizing the duplicate genes in the genome of A. fumigatus, we establish a computational framework, including parameter settings and methods, for comparative study of genetic redundancy and gene duplication between different fungal species.
doi:10.4137/EBO.S10372
PMCID: PMC3510868  PMID: 23225993
duplicate gene; Aspergillus fumigatus; positive selection; sequence diversity
21.  Pervasive and Persistent Redundancy among Duplicated Genes in Yeast 
PLoS Genetics  2008;4(7):e1000113.
The loss of functional redundancy is the key process in the evolution of duplicated genes. Here we systematically assess the extent of functional redundancy among a large set of duplicated genes in Saccharomyces cerevisiae. We quantify growth rate in rich medium for a large number of S. cerevisiae strains that carry single and double deletions of duplicated and singleton genes. We demonstrate that duplicated genes can maintain substantial redundancy for extensive periods of time following duplication (∼100 million years). We find high levels of redundancy among genes duplicated both via the whole genome duplication and via smaller scale duplications. Further, we see no evidence that two duplicated genes together contribute to fitness in rich medium substantially beyond that of their ancestral progenitor gene. We argue that duplicate genes do not often evolve to behave like singleton genes even after very long periods of time.
Author Summary
Gene duplication is the primary source of new genes. To persist, duplicated genes must lose some of the original redundancy either by partitioning the ancestral function (subfunctionalization) or by gaining new non-redundant functions (neofunctionalization). The extent to which these processes shape the evolution of duplicated genes over long periods of time is unknown. We investigate these questions experimentally by building strains carrying single and double gene deletions of duplicated genes and measuring their growth rates in rich medium. Using these data, we determine that many duplicated genes are functionally redundant to a substantial degree. We also investigate how often duplicated genes gain new functionality. We demonstrate that the fitness effects of double deletions of duplicate genes are indistinguishable from our best estimate of the fitness effects of deletions of their ancestral singleton genes. We therefore argue that many duplicate genes do not gain substantial new functionality at least in the rich medium. Our results suggest that subfunctionalization does not generally proceed to completion, even after very long periods of time, and that neofunctionalization is either rare or of little consequence, at least under some growth conditions.
doi:10.1371/journal.pgen.1000113
PMCID: PMC2440806  PMID: 18604285
22.  Evolution of Stress-Regulated Gene Expression in Duplicate Genes of Arabidopsis thaliana 
PLoS Genetics  2009;5(7):e1000581.
Due to the selection pressure imposed by highly variable environmental conditions, stress sensing and regulatory response mechanisms in plants are expected to evolve rapidly. One potential source of innovation in plant stress response mechanisms is gene duplication. In this study, we examined the evolution of stress-regulated gene expression among duplicated genes in the model plant Arabidopsis thaliana. Key to this analysis was reconstructing the putative ancestral stress regulation pattern. By comparing the expression patterns of duplicated genes with the patterns of their ancestors, we found that duplicated genes likely lost and gained stress responses at a rapid rate initially, but the rate is close to zero when the synonymous substitution rate (a proxy for time) is >∼0.8. When considering duplicated gene pairs, we found that partitioning of putative ancestral stress responses occurred more frequently compared to cases of parallel retention and loss. Furthermore, the pattern of stress response partitioning was extremely asymmetric. An analysis of putative cis-acting DNA regulatory elements in the promoters of the duplicated stress-regulated genes indicated that the asymmetric partitioning of ancestral stress responses is likely due, at least in part, to differential loss of DNA regulatory elements; the duplicated genes losing most of their stress responses were those that had lost more of the putative cis-acting elements. Finally, duplicate genes that lost most or all of the ancestral responses are more likely to have gained responses to other stresses. Therefore, the retention of duplicates that inherit few or no functions seems to be coupled to neofunctionalization. Taken together, our findings provide new insight into the patterns of evolutionary changes in gene stress responses after duplication and lay the foundation for testing the adaptive significance of stress regulatory changes under highly variable biotic and abiotic environments.
Author Summary
Plants have developed a multitude of response mechanisms to survive stressful environments. Since the environment is highly variable, these stress response mechanisms are expected to undergo frequent innovation. Duplicate genes represent a potential source for such innovation. In this paper, we explored the evolutionary changes in stress responses at the transcriptional level among duplicated genes in the model plant Arabidopsis thaliana. We found that after gene duplication, ancestral stress responses tend to be retained by only one of the gene duplicates (partitioning). In addition, the pattern of partitioning of multiple stress responses is extremely asymmetric, where one duplicate tends to inherit most or all of the ancestral stress responses. We present evidence that the asymmetric loss of stress responses is correlated with the asymmetric loss of putative transcription factor binding sites. Interestingly, those duplicate genes inheriting few or no ancestral responses tend to have gained new stress responses, providing support for the model that gene duplicates are a source of innovation. Our findings provide important insight into the mechanisms of gene function evolution and lay the foundation for experimental studies to determine the significance of gain of stress responses in plant adaptation.
doi:10.1371/journal.pgen.1000581
PMCID: PMC2709438  PMID: 19649161
23.  The evolution of filamin – A protein domain repeat perspective 
Journal of Structural Biology  2012;179(3):289-298.
Particularly in higher eukaryotes, some protein domains are found in tandem repeats, performing broad functions often related to cellular organization. For instance, the eukaryotic protein filamin interacts with many proteins and is crucial for the cytoskeleton. The functional properties of long repeat domains are governed by the specific properties of each individual domain as well as by the repeat copy number. To provide better understanding of the evolutionary and functional history of repeating domains, we investigated the mode of evolution of the filamin domain in some detail.
Among the domains that are common in long repeat proteins, sushi and spectrin domains evolve primarily through cassette tandem duplications while scavenger and immunoglobulin repeats appear to evolve through clustered tandem duplications. Additionally, immunoglobulin and filamin repeats exhibit a unique pattern where every other domain shows high sequence similarity. This pattern may be the result of tandem duplications, may serve to avert aggregation between adjacent domains, or may be the result of functional constraints.
In filamin, our studies confirm the presence of interspersed integrin binding domains in vertebrates, while invertebrates exhibit more varied patterns, including more clustered integrin binding domains. The most notable case is leech filamin, which contains a 20-repeat expansion and exhibits a unique dimerization topology.
Clearly, invertebrate filamins are varied and contain examples of similar adjacent integrin-binding domains. Given that invertebrate integrin shows more similarity to the weaker filamin binder, integrin β3, it is possible that the distance between integrin-binding domains is not as crucial for invertebrate filamins as for vertebrates.
doi:10.1016/j.jsb.2012.02.010
PMCID: PMC3728663  PMID: 22414427
Filamin; Protein domain repeats; Integrin; Protein domain evolution; Aggregation; Tandem duplication
24.  Complex patterns of divergence among green-sensitive (RH2a) African cichlid opsins revealed by Clade model analyses 
Background
Gene duplications play an important role in the evolution of functional protein diversity. Some models of duplicate gene evolution predict complex forms of paralog divergence; orthologous proteins may diverge as well, further complicating patterns of divergence among and within gene families. Consequently, studying the link between protein sequence evolution and duplication requires the use of flexible substitution models that can accommodate multiple shifts in selection across a phylogeny. Here, we employed a variety of codon substitution models, primarily Clade models, to explore how selective constraint evolved following the duplication of a green-sensitive (RH2a) visual pigment protein (opsin) in African cichlids. Past studies have linked opsin divergence to ecological and sexual divergence within the African cichlid adaptive radiation. Furthermore, biochemical and regulatory differences between the RH2aα and RH2aβ paralogs have been documented. It thus seems likely that selection varies in complex ways throughout this gene family.
Results
Clade model analysis of African cichlid RH2a opsins revealed a large increase in the nonsynonymous-to-synonymous substitution rate ratio (ω) following the duplication, as well as an even larger increase, one consistent with positive selection, for Lake Tanganyikan cichlid RH2aβ opsins. Analysis using the popular Branch-site models, by contrast, revealed no such alteration of constraint. Several amino acid sites known to influence spectral and non-spectral aspects of opsin biochemistry were found to be evolving divergently, suggesting that orthologous RH2a opsins may vary in terms of spectral sensitivity and response kinetics. Divergence appears to be occurring despite intronic gene conversion among the tandemly-arranged duplicates.
Conclusions
Our findings indicate that variation in selective constraint is associated with both gene duplication and divergence among orthologs in African cichlid RH2a opsins. At least some of this variation may reflect an adaptive response to differences in light environment. Interestingly, these patterns only became apparent through the use of Clade models, not through the use of the more widely employed Branch-site models; we suggest that this difference stems from the increased flexibility associated with Clade models. Our results thus bear both on studies of cichlid visual system evolution and on studies of gene family evolution in general.
doi:10.1186/1471-2148-12-206
PMCID: PMC3514295  PMID: 23078361
Codon substitution model; Visual pigment evolution; Nonsynonymous-to-synonymous substitution rate ratio; dN/dS; Clade model; Maximum likelihood; Gene family evolution
25.  Post-duplication charge evolution of phosphoglucose isomerases in teleost fishes through weak selection on many amino acid sites 
Background
The partitioning of ancestral functions among duplicated genes by neutral evolution, or subfunctionalization, has been considered the primary process for the evolution of novel proteins (neofunctionalization). Nonetheless, how a subfunctionalized protein can evolve into a more adaptive protein is poorly understood, mainly due to the limitations of current analytical methods, which can detect only strong selection for amino acid substitutions involved in adaptive molecular evolution. In this study, we employed a comparative evolutionary approach to this question, focusing on differences in the structural properties of a protein, specifically the electric charge, encoded by fish-specific duplicated phosphoglucose isomerase (Pgi) genes.
Results
Full-length cDNA cloning, RT-PCR based gene expression analyses, and comparative sequence analyses showed that after subfunctionalization with respect to the expression organ of duplicate Pgi genes, the net electric charge of the PGI-1 protein expressed mainly in internal tissues became more negative, and that of PGI-2 expressed mainly in muscular tissues became more positive. The difference in net protein charge was attributable not to specific amino acid sites but to the sum of various amino acid sites located on the surface of the PGI molecule.
Conclusion
This finding suggests that the surface charge evolution of PGI proteins was not driven by strong selection on individual amino acid sites leading to permanent fixation of a particular residue, but rather was driven by weak selection on a large number of amino acid sites and consequently by steady directional and/or purifying selection on the overall structural properties of the protein, which is derived from many modifiable sites. The mode of molecular evolution presented here may be relevant to various cases of adaptive modification in proteins, such as hydrophobic properties, molecular size, and electric charge.
doi:10.1186/1471-2148-7-204
PMCID: PMC2176064  PMID: 17963532
