Related Articles

1.  Genetic interactions reveal the evolutionary trajectories of duplicate genes 
Duplicate genes show significantly fewer interactions than singleton genes, and functionally similar duplicates can exhibit dissimilar profiles because common interactions are ‘hidden’ due to buffering. Genetic interaction profiles provide insights into evolutionary mechanisms of duplicate retention by distinguishing duplicates under dosage selection from those retained because of some divergence in function. The genetic interactions of duplicate genes evolve in an extremely asymmetric way and the directionality of this asymmetry correlates well with other evolutionary properties of duplicate genes. Genetic interaction profiles can be used to elucidate the divergent function of specific duplicate pairs.
Gene duplication and divergence serve as a primary source for new genes and new functions, and as such have broad implications for the evolutionary process. Duplicate genes within S. cerevisiae have been shown to retain a high degree of similarity with regard to many of their functional properties (Papp et al, 2004; Guan et al, 2007; Wapinski et al, 2007; Musso et al, 2008), and perturbation of duplicate genes has been shown to result in smaller fitness defects than singleton genes (Gu et al, 2003; DeLuna et al, 2008; Dean et al, 2008; Musso et al, 2008). Individual genetic interactions between pairs of genes and profiles of such interactions across the entire genome provide a new context in which to examine the properties of duplicate compensation.
In this study we use the most recent and comprehensive set of genetic interactions in yeast produced to date (Costanzo et al, 2010) to address questions of duplicate retention and redundancy. We show that the ability of duplicate genes to buffer the deletion of a partner has three main consequences. First, it agrees with previous work demonstrating that a high proportion of duplicate pairs are synthetic lethal, a classic indication of the ability to buffer one another functionally (DeLuna et al, 2008; Dean et al, 2008; Musso et al, 2008). Second, it reduces the number of genetic interactions observed between duplicate genes and the rest of the genome by masking interactions relating to common function from experimental detection. Third, this buffering of common interactions serves to reduce profile similarity in spite of common function (Figure 1). The compensatory ability of functionally similar duplicates buffers genetic interactions related to their common function (reducing the number of genetic interactions overall), while allowing the measurement of interactions related to any divergent function. Thus, even functionally similar duplicates may have dissimilar genetic interaction profiles. As previously surmised (Ihmels et al, 2007), duplicate genes under selection for dosage amplification have differing profile characteristics. We show that dosage-mediated duplicates have much higher genetic interaction profile similarity than do other duplicate pairs. Furthermore, we show in a comparison with local neighbors on a protein–protein interaction (PPI) network, that although dosage-mediated duplicates more often have higher similarity to each other than they do to their neighbors, the reverse is true for duplicates in general.
That is, slightly divergent duplicate genes more often exhibit a higher similarity with a common neighbor on the PPI network than they do with each other, and that observation is consistent with the idea that common interactions are buffered while interactions corresponding to divergent functions are observed.
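The buffering argument above can be illustrated with a toy calculation. The following is only a sketch under stated assumptions, not the authors' analysis: profiles are binary vectors, similarity is a Pearson correlation, and the genome size and the interaction sets `common`, `only_a` and `only_b` are hypothetical values chosen for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def profile(partners, n_genes):
    """Binary interaction profile: 1 where the query interacts with gene i."""
    return [1 if i in partners else 0 for i in range(n_genes)]


N_GENES = 100                 # hypothetical genome size
common = set(range(0, 10))    # interactions tied to the shared function
only_a = set(range(10, 15))   # interactions tied to sister A's divergent role
only_b = set(range(15, 20))   # interactions tied to sister B's divergent role

# Without buffering, both sisters would display the common interactions.
unbuffered = pearson(profile(common | only_a, N_GENES),
                     profile(common | only_b, N_GENES))

# With buffering, the common interactions stay hidden when one sister is
# deleted, so only the divergent interactions are observed.
buffered = pearson(profile(only_a, N_GENES), profile(only_b, N_GENES))

print(f"unbuffered similarity: {unbuffered:.2f}")
print(f"buffered similarity:   {buffered:.2f}")
```

In this toy setting the observed (buffered) profiles are far less similar than the underlying (unbuffered) ones, even though the two genes share most of their function.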
We then asked whether duplicates' genetic interactions that are not buffered appear in a symmetric or an asymmetric fashion. Previous work has established asymmetric patterns with regard to PPI degree (Wagner, 2002; He and Zhang, 2005), sequence divergence (Conant and Wagner, 2003; Zhang et al, 2003; Kellis et al, 2004; Scannell and Wolfe, 2008) and expression patterns (Gu et al, 2002b; Tirosh and Barkai, 2007). Although genetic interactions are further removed from mechanism than protein–protein interactions, for example, they do offer a more direct measurement of functional consequence and, thus, may give a better indication of the functional differences between a duplicate pair. We found that duplicates exhibit a strikingly asymmetric pattern of genetic interactions, with the ratio of interactions between sisters commonly exceeding 7:1 (Figure 4A). The observations differ significantly from random simulations in which genetic interactions were redistributed between sisters with equal probability (Figure 4A). Moreover, the directionality of this interaction asymmetry agrees with other physiological properties of duplicate pairs. For example, the sister with more genetic interactions also tends to have more protein–protein interactions and also tends to evolve at a slower rate (Figure 4B).
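The random simulation described above, in which interactions are redistributed between sisters with equal probability, can be sketched roughly as follows. This is a minimal illustration and not the authors' procedure: the pair with 70 and 10 interactions is a hypothetical example, and the function names are assumptions introduced here.

```python
import random


def asymmetry_ratio(a, b):
    """Ratio of the larger to the smaller interaction count (inf if one is 0)."""
    hi, lo = max(a, b), min(a, b)
    return float("inf") if lo == 0 else hi / lo


def null_ratios(total, n_sim, rng):
    """Redistribute `total` interactions between two sisters, assigning each
    interaction to either sister with probability 0.5, and return the
    resulting asymmetry ratios."""
    ratios = []
    for _ in range(n_sim):
        a = sum(1 for _ in range(total) if rng.random() < 0.5)
        ratios.append(asymmetry_ratio(a, total - a))
    return ratios


def empirical_p(observed_ratio, simulated):
    """Fraction of simulated ratios at least as extreme as the observed one."""
    return sum(r >= observed_ratio for r in simulated) / len(simulated)


rng = random.Random(0)
# Hypothetical pair with 70 and 10 interactions (a 7:1 ratio); not real data.
obs = asymmetry_ratio(70, 10)
sims = null_ratios(total=80, n_sim=2000, rng=rng)
p_value = empirical_p(obs, sims)
print(f"observed ratio {obs:.1f}, empirical p = {p_value:.4f}")
```

Under this null model a 7:1 split of 80 interactions is very unlikely, which is the sense in which an observed asymmetry can be said to differ from random redistribution.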
Genetic interaction degree and profiles can be used to understand the functional divergence of particular duplicate pairs. As a case example, we consider the whole-genome-duplication pair CIK1–VIK1. Each of these genes encodes a protein that forms a distinct heterodimeric complex with the microtubule motor protein Kar3 (Manning et al, 1999). Although each of these proteins depends on a direct physical interaction with Kar3, Cik1 has a much higher profile similarity to Kar3 than does Vik1 (r=0.5 and r=0.3, respectively). Consistent with its higher similarity, Δcik1 and Δkar3 exhibit several similar phenotypes, including abnormally short spindles, chromosome loss and delayed cell cycle progression (Page et al, 1994; Manning et al, 1999). In contrast, a Δvik1 mutant strain exhibits no overt phenotype (Manning et al, 1999).
The characterization of functional redundancy and divergence between duplicate genes is an important step in understanding the evolution of genetic systems. Large-scale genetic network analysis in Saccharomyces cerevisiae provides a powerful perspective for addressing these questions through quantitative measurements of genetic interactions between pairs of duplicated genes, and more generally, through the study of genome-wide genetic interaction profiles associated with duplicated genes. We show that duplicate genes exhibit fewer genetic interactions than other genes because they tend to buffer one another functionally, whereas observed interactions are non-overlapping and reflect their divergent roles. We also show that duplicate gene pairs are highly imbalanced in their number of genetic interactions with other genes, a pattern that appears to result from asymmetric evolution, such that one duplicate evolves or degrades faster than the other and often becomes functionally or conditionally specialized. The differences in genetic interactions are predictive of differences in several other evolutionary and physiological properties of duplicate pairs.
doi:10.1038/msb.2010.82
PMCID: PMC3010121  PMID: 21081923
duplicate genes; functional divergence; genetic interactions; paralogs; Saccharomyces cerevisiae
2.  Preferential Duplication of Conserved Proteins in Eukaryotic Genomes 
PLoS Biology  2004;2(3):e55.
A central goal in genome biology is to understand the origin and maintenance of genic diversity. Over evolutionary time, each gene's contribution to the genic content of an organism depends not only on its probability of long-term survival, but also on its propensity to generate duplicates that are themselves capable of long-term survival. In this study we investigate which types of genes are likely to generate functional and persistent duplicates. We demonstrate that genes that have generated duplicates in the C. elegans and S. cerevisiae genomes were 25%–50% more constrained prior to duplication than the genes that failed to leave duplicates. We further show that conserved genes have been consistently prolific in generating duplicates for hundreds of millions of years in these two species. These findings reveal one way in which gene duplication shapes the content of eukaryotic genomes. Our finding that the set of duplicate genes is biased has important implications for genome-scale studies.
Gene duplication is a key process in genome evolution. These authors show that highly conserved genes duplicate more often and are therefore likely to contribute more to the content of eukaryotic genomes.
doi:10.1371/journal.pbio.0020055
PMCID: PMC368158  PMID: 15024414
3.  High Spontaneous Rate of Gene Duplication in Caenorhabditis elegans 
Current biology : CB  2011;21(4):306-310.
SUMMARY
Gene and genome duplications are the primary source of new genes and novel functions and have played a pivotal role in the evolution of genomic and organismal complexity [1, 2]. The spontaneous rate of gene duplication is a critical parameter for understanding the evolutionary dynamics of gene duplicates; yet few direct empirical estimates exist, and those that do differ widely. The presence of a large population of recently derived gene duplicates in sequenced genomes suggests a high rate of spontaneous origin, also evidenced by population-genomic studies reporting rampant copy-number polymorphism at the intraspecific level [3–6]. An analysis of long-term mutation-accumulation lines of Caenorhabditis elegans for gene copy-number changes using array Comparative Genomic Hybridization yields the first direct estimate of the genome-wide rate of gene duplication in a multicellular eukaryote. The gene duplication rate in C. elegans is quite high, on the order of 10⁻⁷ duplications/gene/generation. This rate is two orders of magnitude greater than the spontaneous rate of point mutation per nucleotide site in this species and also greatly exceeds an earlier estimate derived from the frequency distribution of extant gene duplicates in the sequenced C. elegans genome.
doi:10.1016/j.cub.2011.01.026
PMCID: PMC3056611  PMID: 21295484
4.  The Cellular Robustness by Genetic Redundancy in Budding Yeast 
PLoS Genetics  2010;6(11):e1001187.
The frequent dispensability of duplicated genes in budding yeast is heralded as a hallmark of genetic robustness contributed by genetic redundancy. However, theoretical predictions suggest such backup by redundancy is evolutionarily unstable, and the extent of genetic robustness contributed from redundancy remains controversial. It is anticipated that, to achieve mutual buffering, the duplicated paralogs must at least share some functional overlap. However, counter-intuitively, several recent studies reported little functional redundancy between these buffering duplicates. The large set of yeast genetic interactions released recently allowed us to address these issues on a genome-wide scale. We herein characterized the synthetic genetic interactions for ∼500 pairs of yeast duplicated genes originating from either whole-genome duplication (WGD) or small-scale duplication (SSD) events. We established that functional redundancy between duplicates is a prerequisite and thus is highly predictive of their backup capacity. This observation was particularly pronounced with the use of a newly introduced metric in scoring functional overlap between paralogs on the basis of gene ontology annotations. Even though mutual buffering was observed to be prevalent among duplicated genes, we showed that the observed backup capacity is largely an evolutionarily transient state. The loss of backup capacity generally follows a neutral mode, with the buffering strength decreasing in proportion to divergence time, and the vast majority of the paralogs have already lost their backup capacity. These observations validated previous theoretical predictions about instability of genetic redundancy. However, departing from the general neutral mode, intriguingly, our analysis revealed the presence of natural selection in stabilizing functional overlap between SSD pairs.
These selected pairs, both WGD and SSD, tend to have decelerated functional evolution, have higher propensities of co-clustering into the same protein complexes, and share common interacting partners. Our study revealed the general principles for the long-term retention of genetic redundancy.
Author Summary
Eukaryotic cells show remarkable robustness against external perturbations, which has been attributed, at least in part, to the extensive gene duplication events in eukaryotic genomes. By duplication, genes are likely to gain redundant copies for backup purposes; however, this notion contradicts the population genetic theory that genetic redundancy is evolutionarily unstable. In this study, we used yeast as a model organism to delineate the evolutionary trajectory of genetic robustness by gene duplication, utilizing the comprehensively characterized synthetic genetic interaction data in the yeast genome. We showed that the evolution of genetic robustness by duplication follows a neutral mode, with the loss of backup capacity proportional to the divergence time. However, natural selection was also acting on a few pairs to maintain their long-term backup capacity; these pairs are slowly evolving, are co-clustered in the same protein complexes, and tend to interact with similar partners. This study unravels the general principles underlying the evolution of the cellular robustness arising from genetic redundancy.
doi:10.1371/journal.pgen.1001187
PMCID: PMC2973813  PMID: 21079672
5.  Selective maintenance of Drosophila tandemly arranged duplicated genes during evolution 
Genome Biology  2008;9(12):R176.
Genes occurring in conserved, tandemly-arrayed clusters in Drosophila melanogaster are co-expressed to a much higher extent than other duplicated genes.
Background
The physical organization and chromosomal localization of genes within genomes is known to play an important role in their function. Most genes arise by duplication and move along the genome by random shuffling of DNA segments. Higher order structuring of the genome occurs in eukaryotes, where groups of physically linked genes are co-expressed. However, the contribution of gene duplication to gene order has not been analyzed in detail, as it is believed that co-expression due to recent duplicates would obscure other domains of co-expression.
Results
We have catalogued ordered duplicated genes in Drosophila melanogaster, and found that one in five of all genes is organized as tandem arrays. Furthermore, among arrays that have been spatially conserved over longer periods than would be expected on the basis of random shuffling, a disproportionate number contain genes encoding developmental regulators. Using in situ gene expression data for more than half of the Drosophila genome, we find that genes in these conserved clusters are co-expressed to a much higher extent than other duplicated genes.
Conclusions
These results reveal the existence of functional constraints in insects that retain copies of genes encoding developmental and regulatory proteins as neighbors, allowing their co-expression. This co-expression may be the result of shared cis-regulatory elements or a shared need for a specific chromatin structure. Our results highlight the association between genome architecture and the gene regulatory networks involved in the construction of the body plan.
doi:10.1186/gb-2008-9-12-r176
PMCID: PMC2646280  PMID: 19087263
6.  Drosophila Ana2 is a conserved centriole duplication factor 
The Journal of Cell Biology  2010;188(3):313-323.
Drosophila protein Ana2 supports de novo formation of centrioles and may be a functional homologue of the C. elegans centriole duplication protein SAS-5.
In Caenorhabditis elegans, five proteins are required for centriole duplication: SPD-2, ZYG-1, SAS-5, SAS-6, and SAS-4. Functional orthologues of all but SAS-5 have been found in other species. In Drosophila melanogaster and humans, Sak/Plk4, DSas-6/hSas-6, and DSas-4/CPAP—orthologues of ZYG-1, SAS-6, and SAS-4, respectively—are required for centriole duplication. Strikingly, all three fly proteins can induce the de novo formation of centriole-like structures when overexpressed in unfertilized eggs. Here, we find that of eight candidate duplication factors identified in cultured fly cells, only two, Ana2 and Asterless (Asl), share this ability. Asl is now known to be essential for centriole duplication in flies, but no equivalent protein has been found in worms. We show that Ana2 is the likely functional orthologue of SAS-5 and that it is also related to the vertebrate STIL/SIL protein family that has been linked to microcephaly in humans. We propose that members of the SAS-5/Ana2/STIL family of proteins are key conserved components of the centriole duplication machinery.
doi:10.1083/jcb.200910016
PMCID: PMC2819680  PMID: 20123993
7.  Duplicate genes and robustness to transient gene knock-downs in Caenorhabditis elegans. 
We examine robustness to mutations in the nematode worm Caenorhabditis elegans and the role of single-copy and duplicate genes in it. We do so by integrating complete genome sequence and microarray gene expression data with results from a genome-scale study using RNA interference (RNAi) to temporarily eliminate the functions of more than 16000 worm genes. We found that 89% of single-copy and 96% of duplicate genes show no detectable phenotypic effect in an RNAi knock-down experiment. We find that mutational robustness is greatest for closely related gene duplicates, large gene families and similarly expressed genes. We discuss the different causes of mutational robustness in single-copy and duplicate genes, as well as its evolutionary origin.
PMCID: PMC1691561  PMID: 15002776
8.  Evolutionary Patterns of Recently Emerged Animal Duplogs 
Genome Biology and Evolution  2011;3:1119-1135.
Duplogs, or intraspecies paralogs, constitute an important portion of eukaryote genomes and serve as a major source of functional innovation. We conducted detailed analyses of recently emerged animal duplogs. Genome data of three vertebrate species (Homo sapiens, Mus musculus, and Danio rerio), Caenorhabditis elegans, and two Drosophila species (Drosophila melanogaster and D. pseudoobscura) were used. Duplication events were divided into six age-groups according to the synonymous distance (dS) up to 0.6. Duplogs were classified into four equal-sized classes on physical distances and into three classes on relative orientations. We observed the following shared characteristics among intrachromosomal multiexon duplogs: 1) inverted duplogs account for 20–50%, and about half of the physically most distant 25%; 2) except for C. elegans, the composition of physical distances, that of relative orientations, and the proportion of inverted duplogs in each physical distance category are more or less uniform; 3) except for C. elegans, the characteristics of the youngest (dS < 0.01) duplogs are similar to the overall characteristics of the entire set. These results suggest that intrachromosomal duplogs with fairly long physical distances were generated at once, rather than resulting from tandem duplications and subsequent genomic rearrangements. This is different from the three well-known modes of gene duplication: tandem duplication, retrotransposition, and genome duplication. We termed this new mode “drift” duplication. Drift duplication has been producing duplicate copies at paces comparable with tandem duplications since the common ancestor of vertebrates, and it may have already operated in the common ancestor of bilateral animals.
doi:10.1093/gbe/evr074
PMCID: PMC3194840  PMID: 21859807
duplog; paralog; gene duplication; physical distance; transcriptional orientation; animals; genome-wide analysis; cross-sectional analysis
9.  Interrogation of alternative splicing events in duplicated genes during evolution 
BMC Genomics  2011;12(Suppl 3):S16.
Background
Gene duplication provides resources for developing novel genes and new functions while retaining the original functions. In addition, alternative splicing could increase the complexity of expression at the transcriptome and proteome level without increasing the number of gene copies in the genome. Duplication and alternative splicing are thought to work together to provide the diverse functions or expression patterns for eukaryotes. Previously, it was believed that duplication and alternative splicing were negatively correlated and probably interchangeable.
Results
We examined the relationship between the occurrence of alternative splicing and duplication at different times after duplication events. We found that duplication and alternative splicing were indeed inversely correlated if only recently duplicated genes were considered, but they became positively correlated when ancient duplications were taken into account. Specifically, for slightly or moderately duplicated genes in gene families containing 2–7 paralogs, genes were more likely to evolve alternative splicing and had on average a greater number of alternative splicing isoforms after long-term evolution compared to singleton genes. On the other hand, large gene families (containing at least 8 paralogs) had a lower proportion of alternative splicing, and fewer alternative splicing isoforms on average, even when ancient duplicated genes were taken into consideration. We also found that duplicated genes with alternative splicing were under tighter evolutionary constraints than those without alternative splicing, and were enriched for genes that participate in molecular transducer activities.
Conclusions
We studied the association between occurrences of alternative splicing and gene duplication. Our results imply that there are key differences in functions and evolutionary constraints among singleton genes and duplicated genes with or without alternative splicing. This suggests that gene duplication and alternative splicing may have different functional significance in the evolution of speciation diversity.
doi:10.1186/1471-2164-12-S3-S16
PMCID: PMC3333175  PMID: 22369477
10.  Identification of pseudogenes in the Drosophila melanogaster genome 
Nucleic Acids Research  2003;31(3):1033-1037.
Pseudogenes are copies of genes that cannot produce a protein. They can be detected from disruptions to their apparent coding sequence, caused by frameshifts and premature stop codons. They are classed as either processed pseudogenes (made by reverse transcription from an mRNA) or duplicated pseudogenes, arising from duplication in the genomic DNA and subsequent disablement. Historically, there is anecdotal evidence that the fruit fly (Drosophila melanogaster) has few pseudogenes. Investigators have linked this to a high deletion rate of genomic DNA, for which there is evidence from genetic experiments on genome size. Here, we apply a homology-based pipeline that was developed previously to identify pseudogenes in other eukaryotic genomes, to the fruit fly, so as to derive the first complete survey of its pseudogene population. We find approximately 100 pseudogenes, with at least a sixth of these as candidate processed pseudogenes. This gives a much lower proportion of pseudogenes (compared with the size of the proteome) than in the genomes of other eukaryotes for which data are available (human, nematode and budding yeast). Closest matching proteins to Drosophila pseudogenes are significantly longer than the average protein in its proteome (up to ∼60% more than the average protein’s length), in contrast to the situation in the three other eukaryotic genomes. This may be due to the persistence of fragments of longer genes. In the fly pseudogene population, we found most pseudogenes for serine proteases (which are more abundant in the Drosophila lineage compared with the other eukaryotes), immunoglobulin-motif-containing proteins and cytochromes P450. Data on the sequences and positions of the putative pseudogenes are available at: http://www.pseudogene.org/fly. 
The detection of a small number of pseudogenes in the Drosophila genome and the higher mean length for the closest matching proteins to pseudogenes (possibly because remnants of genes encoding longer proteins are more likely to persist) are further evidence for a high deletion rate of genomic DNA in the fruit fly. The data are useful for molecular evolution study in Drosophila.
PMCID: PMC149191  PMID: 12560500
11.  When Double is not Twice as Much 
Gene and genome duplications provide a playground for various selective pressures and contribute significantly to genome complexity. It is assumed that the genomes of all major eukaryotic lineages possess duplicated regions that result from gene and genome duplication. There is evidence that the model plant Arabidopsis has been subjected to at least three whole-genome duplication events over the last 150–200 million years. As a result, many cellular processes are governed by redundantly acting gene families. Plants pass through two distinct life phases with a haploid gametophytic alternating with a diploid sporophytic generation. This ontogenetic difference in gene copy number has important implications for the outcome of deleterious mutations, which are masked by the second gene copy in diploid systems but expressed in a dominant fashion in haploid organisms. As a consequence, maintaining the activity of duplicated genes might be particularly advantageous during the haploid gametophytic generation. Here, we describe the distinctive features associated with the alternation of generations and discuss how activity profiles of duplicated genes might get modulated in a life phase dependent fashion.
doi:10.3389/fpls.2011.00094
PMCID: PMC3355729  PMID: 22645557
gene duplication; flowering plants; alternation of generations; haploid; diploid
12.  Using Likelihood-Free Inference to Compare Evolutionary Dynamics of the Protein Networks of H. pylori and P. falciparum 
PLoS Computational Biology  2007;3(11):e230.
Gene duplication with subsequent interaction divergence is one of the primary driving forces in the evolution of genetic systems. Yet little is known about the precise mechanisms and the role of duplication divergence in the evolution of protein networks from the prokaryote and eukaryote domains. We developed a novel, model-based approach for Bayesian inference on biological network data that centres on approximate Bayesian computation, or likelihood-free inference. Instead of computing the intractable likelihood of the protein network topology, our method summarizes key features of the network and, based on these, uses a MCMC algorithm to approximate the posterior distribution of the model parameters. This allowed us to reliably fit a flexible mixture model that captures hallmarks of evolution by gene duplication and subfunctionalization to protein interaction network data of Helicobacter pylori and Plasmodium falciparum. The 80% credible intervals for the duplication–divergence component are [0.64, 0.98] for H. pylori and [0.87, 0.99] for P. falciparum. The remaining parameter estimates are not inconsistent with sequence data. An extensive sensitivity analysis showed that incompleteness of PIN data does not largely affect the analysis of models of protein network evolution, and that the degree sequence alone barely captures the evolutionary footprints of protein networks relative to other statistics. Our likelihood-free inference approach enables a fully Bayesian analysis of a complex and highly stochastic system that is otherwise intractable at present. Modelling the evolutionary history of PIN data, it transpires that only the simultaneous analysis of several global aspects of protein networks enables credible and consistent inference to be made from available datasets. 
Our results indicate that gene duplication has played a larger part in the network evolution of the eukaryote than in the prokaryote, and suggest that single gene duplications with immediate divergence alone may explain more than 60% of biological network data in both domains.
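The general idea of likelihood-free inference can be sketched with a much simpler scheme than the one described above. The example below uses plain rejection sampling rather than the MCMC algorithm of the study, a toy duplication-divergence growth rule rather than the authors' mixture model, and a single summary statistic (edge count), which the abstract itself warns is not enough for reliable inference. All function names, the prior, the tolerance and the network size are assumptions made for illustration.

```python
import random


def grow_network(n_nodes, p_retain, rng):
    """Toy duplication-divergence growth: duplicate a random node and keep
    each of its edges independently with probability p_retain.
    Returns a dict mapping node -> set of neighbours."""
    adj = {0: {1}, 1: {0}}  # seed network: two linked proteins
    while len(adj) < n_nodes:
        parent = rng.choice(list(adj))
        child = len(adj)
        kept = {w for w in adj[parent] if rng.random() < p_retain}
        adj[child] = kept
        for w in kept:
            adj[w].add(child)
    return adj


def n_edges(adj):
    """Summary statistic: number of undirected edges."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2


def abc_rejection(observed_edges, n_nodes, n_draws, tol, rng):
    """Keep prior draws whose simulated summary lies within tol of the data."""
    accepted = []
    for _ in range(n_draws):
        p = rng.random()  # draw from a Uniform(0, 1) prior
        sim = n_edges(grow_network(n_nodes, p, rng))
        if abs(sim - observed_edges) <= tol:
            accepted.append(p)
    return accepted


rng = random.Random(1)
# Synthetic "observed" network generated with a known retention probability.
observed = n_edges(grow_network(30, 0.4, rng))
posterior_sample = abc_rejection(observed, n_nodes=30, n_draws=500, tol=2, rng=rng)
print(f"observed edges: {observed}, accepted draws: {len(posterior_sample)}")
```

The accepted values of `p` approximate a posterior sample without ever evaluating a likelihood; a credible interval could be read off their empirical quantiles.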
Author Summary
The importance of gene duplication to biological evolution has been recognized since the 1930s. For more than a decade, substantial evidence has been collected from genomic sequence data in order to elucidate the importance and the mechanisms of gene duplication; however, most biological characteristics arise from complex interactions between the cell's numerous constituents. Recently, preliminary descriptions of the protein interaction networks have become available for species of different domains. Adapting novel techniques in stochastic simulation, the authors demonstrate that evolutionary inferences can be drawn from large-scale, incomplete network data by fitting a stochastic model of network growth that captures hallmarks of evolution by duplication and divergence. They have also analyzed the effect of summarizing protein networks in different ways, and show that a reliable and consistent analysis requires many aspects of network data to be considered jointly; in contrast to what is commonly done in practice. Their results indicate that duplication and divergence has played a larger role in the network evolution of the eukaryote P. falciparum than in the prokaryote H. pylori, and emphasize at least for the eukaryote the potential importance of subfunctionalization in network evolution.
doi:10.1371/journal.pcbi.0030230
PMCID: PMC2098858  PMID: 18052538
13.  Dynamics of Gene Duplication in the Genomes of Chlorophyll d-Producing Cyanobacteria: Implications for the Ecological Niche 
Gene duplication may be an important mechanism for the evolution of new functions and for the adaptive modulation of gene expression via dosage effects. Here, we analyzed the fate of gene duplicates for two strains of a novel group of cyanobacteria (genus Acaryochloris) that produces the far-red light absorbing chlorophyll d as its main photosynthetic pigment. The genomes of both strains contain an unusually high number of gene duplicates for bacteria. As has been observed for eukaryotic genomes, we find that the demography of gene duplicates can be well modeled by a birth–death process. Most duplicated Acaryochloris genes are of comparatively recent origin, are strain-specific, and tend to be located on different genetic elements. Analyses of selection on duplicates of different divergence classes suggest that a minority of paralogs exhibit near neutral evolutionary dynamics immediately following duplication but that most duplicate pairs (including those which have been retained for long periods) are under strong purifying selection against amino acid change. The likelihood of duplicate retention varied among gene functional classes, and the pronounced differences between strains in the pool of retained recent duplicates likely reflects differences in the nutrient status and other characteristics of their respective environments. We conclude that most duplicates are quickly purged from Acaryochloris genomes and that those which are retained likely make important contributions to organism ecology by conferring fitness benefits via gene dosage effects. The mechanism of enhanced duplication may involve homologous recombination between genetic elements mediated by paralogous copies of recA.
doi:10.1093/gbe/evr060
PMCID: PMC3156569  PMID: 21697100
Acaryochloris; recA; homologous recombination; plasmid
14.  Gene Duplicability-Connectivity-Complexity across Organisms and a Neutral Evolutionary Explanation 
PLoS ONE  2012;7(9):e44491.
Gene duplication has long been acknowledged by biologists as a major evolutionary force shaping genomic architectures and characteristics across the Tree of Life. Much research has been conducted to elucidate the fate of duplicated genes in a variety of organisms, as well as factors that affect a gene’s duplicability, that is, the tendency of certain genes to retain more duplicates than others. In particular, two studies have looked at the correlation between gene duplicability and its degree in a protein-protein interaction network in yeast, mouse, and human, and another has looked at the correlation between gene duplicability and its complexity (length, number of domains, etc.) in yeast. In this paper, we extend these studies to six species, and two trends emerge. There is an increase in the duplicability-connectivity correlation that agrees with the increase in the genome size as well as the phylogenetic relationship of the species. Further, the duplicability-complexity correlation seems to be constant across the species. We argue that the observed correlations can be explained by neutral evolutionary forces acting on the genomic regions containing the genes. For the duplicability-connectivity correlation, we show through simulations that an increasing trend can be obtained by adjusting parameters to approximate genomic characteristics of the respective species. Our results call for more research into factors, adaptive and non-adaptive alike, that determine a gene’s duplicability.
doi:10.1371/journal.pone.0044491
PMCID: PMC3439388  PMID: 22984517
15.  Duplication and Gene Conversion in the Drosophila melanogaster Genome 
PLoS Genetics  2008;4(12):e1000305.
Using the genomic sequences of the Drosophila melanogaster subgroup, the pattern of gene duplications was investigated with special attention to interlocus gene conversion. Our fine-scale analysis with careful visual inspections enabled accurate identification of a number of duplicated blocks (genomic regions). The orthologous parts of those duplicated blocks were also identified in the D. simulans and D. sechellia genomes, by which we were able to clearly classify the duplicated blocks into post- and pre-speciation blocks. We found 31 post-speciation duplicated genes, from which the rate of gene duplication (from one copy to two copies) is estimated to be 1.0×10^-9 per single-copy gene per year. The role of interlocus gene conversion was observed in several respects in our data: (1) synonymous divergence between a duplicated pair is overall very low. Consequently, the gene duplication rate would be seriously overestimated by counting duplicated genes with low divergence; (2) the sizes of young duplicated blocks are generally large. We postulate that the degeneration of gene conversion around the edges could explain the shrinkage of “identifiable” duplicated regions; and (3) elevated paralogous divergence is observed around the edges in many duplicated blocks, supporting our gene conversion–degeneration model. Our analysis demonstrated that gene conversion between duplicated regions is a common and genome-wide phenomenon in the Drosophila genomes, and that its role should be especially significant in the early stages of duplicated genes. Based on a population genetic prediction, we applied a new genome-scan method to test for signatures of selection for neofunctionalization and found a strong signature in a pair of transporter genes.
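The duplication rate quoted in this abstract is, in essence, a count of new duplicates divided by the number of single-copy genes at risk and by the time elapsed. The Python sketch below only illustrates that arithmetic. The count of 31 comes from the abstract, but the gene number and elapsed time are hypothetical placeholders, not values taken from the study, and the function name is our own.

```python
def duplication_rate(new_duplicates, single_copy_genes, years_elapsed):
    """Return duplications per single-copy gene per year.

    A plain ratio estimator: it ignores gene conversion, gene loss and
    sampling error, all of which a real analysis would have to address.
    """
    if single_copy_genes <= 0 or years_elapsed <= 0:
        raise ValueError("gene count and elapsed time must be positive")
    return new_duplicates / (single_copy_genes * years_elapsed)


# 31 post-speciation duplicates is the figure reported in the abstract.
# The next two inputs are HYPOTHETICAL placeholders for illustration only.
assumed_single_copy_genes = 10_000
assumed_years_since_speciation = 3.1e6

rate = duplication_rate(31, assumed_single_copy_genes,
                        assumed_years_since_speciation)
print(f"{rate:.1e} duplications per single-copy gene per year")
```

Because the estimate scales inversely with both assumed inputs, any real use would need the gene count and divergence time actually adopted by the authors.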
Author Summary
Eukaryote genomes have a number of duplicated genes, which could potentially coevolve by exchanging DNA sequences by interlocus gene conversion. However, the extent of gene conversion on a genomic scale is not well understood, except that an extensive role of gene conversion was reported in yeast. Here, we show a second evaluation of the role of gene conversion by analyzing multiple genomes in the D. melanogaster subgroup. We found that most young duplicated genes have experienced gene conversion, although not as extensively as in yeast. We further performed fine-scale analysis of duplicated DNA sequences and estimated the gene duplication rate. Our estimate turned out to be much smaller than that of a commonly used method, which usually causes an overestimation when gene conversion is active. The role of positive selection for neofunctionalization was inferred by applying a novel test. Our results suggest that interlocus gene conversion could be a crucial mutational mechanism in the evolution of duplicated genes in eukaryote genomes and that the effect of gene conversion should be taken into account when analyzing molecular evolution of duplicated genes.
doi:10.1371/journal.pgen.1000305
PMCID: PMC2588116  PMID: 19079581
16.  Genome comparisons highlight similarity and diversity within the eukaryotic kingdoms 
In 2000, the number of completely sequenced eukaryotic genomes increased to four. The addition of Drosophila and Arabidopsis into this cohort permits additional insights into the processes that have shaped evolution. Analysis and comparisons of both completed genomes and partially sequenced genomes have already shed light on mechanisms such as gene duplication and gene loss that have long been hypothesized to be major forces in speciation. Indeed, the proportions of duplicate gene pairs in Saccharomyces, Arabidopsis, Caenorhabditis and Drosophila are high: 30%, 60%, 48% and 40%, respectively. Evidence of horizontal gene-transfer, thought to be a major evolutionary force in bacteria, has been found in Arabidopsis. The release of the ‘first draft’ of the human genome sequence in 2000 heralds a new stage of biological study. Understanding the as-yet-unannotated human genome will be largely based on conclusions, techniques and tools developed during the analysis and comparison of the genomes of these four model organisms.
PMCID: PMC3040119  PMID: 11166654
17.  P110 and P140 Cytadherence-Related Proteins Are Negative Effectors of Terminal Organelle Duplication in Mycoplasma genitalium 
PLoS ONE  2009;4(10):e7452.
Background
The terminal organelle is a complex structure involved in many aspects of the biology of mycoplasmas such as cell adherence, motility or cell division. Mycoplasma genitalium cells display a single terminal organelle and duplicate this structure prior to cytokinesis in a coordinated manner with the cell division process. Despite the significance of the terminal organelle in mycoplasma virulence, little is known about the mechanisms governing its duplication.
Methodology/Principal Findings
In this study we describe the isolation of a mutant, named T192, with a transposon insertion close to the 3′ end of the mg192 gene encoding the P110 adhesin. This mutant shows a truncated P110, low levels of P140 and P110 adhesins, a large number of non-motile cells and a high frequency of new terminal organelle formation. Further analyses revealed that the high rates of new terminal organelle formation in T192 cells are a direct consequence of the reduced levels of P110 and P140 rather than of the expression of a truncated P110. Consistently, the phenotype of the T192 mutant was successfully complemented by the reintroduction of the mg192 WT allele, which restored the levels of P110 and P140 to those of the WT strain. Quantification of DAPI-stained DNA also showed that the increase in the number of terminal organelles in T192 cells is not accompanied by a higher DNA content, indicating that terminal organelle duplication does not trigger DNA replication in mycoplasmas.
Conclusions/Significance
Our results demonstrate the existence of a mechanism regulating terminal organelle duplication in M. genitalium and strongly suggest the implication of P110 and P140 adhesins in this mechanism.
doi:10.1371/journal.pone.0007452
PMCID: PMC2759538  PMID: 19829712
18.  Parameters of proteome evolution from histograms of amino-acid sequence identities of paralogous proteins 
Biology Direct  2007;2:32.
Background
The evolution of the full repertoire of proteins encoded in a given genome is mostly driven by gene duplications, deletions, and sequence modifications of existing proteins. Indirect information about relative rates and other intrinsic parameters of these three basic processes is contained in the proteome-wide distribution of sequence identities of pairs of paralogous proteins.
Results
We introduce a simple mathematical framework based on a stochastic birth-and-death model that allows one to extract some of this information and apply it to the set of all pairs of paralogous proteins in H. pylori, E. coli, S. cerevisiae, C. elegans, D. melanogaster, and H. sapiens. It was found that the histogram of sequence identities p generated by an all-to-all alignment of all protein sequences encoded in a genome is well fitted with a power-law form ~ p^(-γ) with the value of the exponent γ around 4 for the majority of organisms used in this study. This implies that the intra-protein variability of substitution rates is best described by the Gamma-distribution with the exponent α ≈ 0.33. Different features of the shape of such histograms allow us to quantify the ratio between the genome-wide average deletion/duplication rates and the amino-acid substitution rate.
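To make the power-law description concrete, the following Python sketch estimates an exponent γ from a binned histogram by ordinary least squares on log-transformed values. It is an illustration under stated assumptions, not the fitting procedure used in the study: the bin centers and counts are synthetic, generated from an exact power law with γ = 4, and the function name is hypothetical.

```python
import math


def fit_power_law_exponent(identities, counts):
    """Estimate gamma in count ~ identity**(-gamma).

    Fits a straight line to log(count) versus log(identity) by ordinary
    least squares and returns the negated slope. Empty bins are skipped
    because their logarithm is undefined.
    """
    points = [(math.log(p), math.log(c))
              for p, c in zip(identities, counts) if p > 0 and c > 0]
    if len(points) < 2:
        raise ValueError("need at least two non-empty bins")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return -sxy / sxx


# SYNTHETIC data: fractional sequence identities and counts drawn from an
# exact p**(-4) law, used here only to check that the fit recovers gamma.
bin_centers = [0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
synthetic_counts = [1000.0 * p ** -4 for p in bin_centers]

gamma_hat = fit_power_law_exponent(bin_centers, synthetic_counts)
print(f"estimated gamma = {gamma_hat:.3f}")
```

A log-log regression is the simplest choice for a sketch; on real, noisy histograms a maximum-likelihood fit would usually be more robust.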
Conclusion
We separately measure the short-term ("raw") duplication and deletion rates r*_dup, r*_del which include gene copies that will be removed soon after the duplication event and their dramatically reduced long-term counterparts r_dup, r_del. High deletion rate among recently duplicated proteins is consistent with a scenario in which they didn't have enough time to significantly change their functional roles and thus are to a large degree disposable. Systematic trends of each of the four duplication/deletion rates with the total number of genes in the genome were analyzed. All but the deletion rate of recent duplicates r*_del were shown to systematically increase with N_genes. Abnormally flat shapes of sequence identity histograms observed for yeast and human are consistent with lineages leading to these organisms undergoing one or more whole-genome duplications. This interpretation is corroborated by our analysis of the genome of Paramecium tetraurelia where the p^(-4) profile of the histogram is gradually restored by the successive removal of paralogs generated in its four known whole-genome duplication events.
doi:10.1186/1745-6150-2-32
PMCID: PMC2246104  PMID: 18039386
19.  Oligoarray Comparative Genomic Hybridization-Mediated Mapping of Suppressor Mutations Generated in a Deletion-Biased Mutagenesis Screen 
G3: Genes|Genomes|Genetics  2012;2(6):657-663.
Suppressor screens are an invaluable method for identifying novel genetic interactions between genes in the model organism Caenorhabditis elegans. However, traditionally this approach has suffered from the laborious and protracted process of mapping mutations at the molecular level. Using a mutagen known to generate small deletions, coupled with oligoarray comparative genomic hybridization (aCGH), we have identified mutations in two genes that suppress the lethality associated with a mutation of the essential receptor tyrosine kinase rol-3. First, we find that deletion of the Bicaudal-C ortholog, bcc-1, suppresses rol-3–associated lethality. Second, we identify several duplications that also suppress rol-3–associated lethality. We establish that overexpression of srap-1, a single gene present in these duplications, mediates the suppression. This study demonstrates the suitability of deletion-biased mutagenesis screening in combination with aCGH characterization for the rapid identification of novel suppressor mutations. In addition to detecting small deletions, this approach is suitable for identifying copy number suppressor mutations, a class of suppressor not easily characterized using alternative approaches.
doi:10.1534/g3.112.002238
PMCID: PMC3362295  PMID: 22690375
Caenorhabditis elegans; oligoarray comparative genomic hybridization; ROL-3; BCC-1; SRAP-1; duplication suppressors
20.  Drug target prediction and prioritization: using orthology to predict essentiality in parasite genomes 
BMC Genomics  2010;11:222.
Background
New drug targets are urgently needed for parasites of socio-economic importance. Genes that are essential for parasite survival are highly desirable targets, but information on these genes is lacking, as gene knockouts or knockdowns are difficult to perform in many species of parasites. We examined the applicability of large-scale essentiality information from four model eukaryotes, Caenorhabditis elegans, Drosophila melanogaster, Mus musculus and Saccharomyces cerevisiae, to discover essential genes in each of their genomes. Parasite genes that lack orthologues in their host are desirable as selective targets, so we also examined prediction of essential genes within this subset.
Results
Cross-species analyses showed that the evolutionary conservation of genes and the presence of essential orthologues are each strong predictors of essentiality in eukaryotes. Absence of paralogues was also found to be a general predictor of increased relative essentiality. By combining several orthology and essentiality criteria one can select gene sets with up to a five-fold enrichment in essential genes compared with a random selection. We show how quantitative application of such criteria can be used to predict a ranked list of potential drug targets from Ancylostoma caninum and Haemonchus contortus - two blood-feeding strongylid nematodes, for which there are presently limited sequence data but no functional genomic tools.
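The "fold enrichment" reported here can be read as the fraction of essential genes in a selected set divided by the fraction in the whole gene set. The Python sketch below shows that calculation on toy data. The gene identifiers and set sizes are invented for illustration and the function name is hypothetical; none of it comes from the study.

```python
def fold_enrichment(selected, essential, all_genes):
    """Ratio of the essential-gene fraction in `selected` to the
    essential-gene fraction in `all_genes` (the background)."""
    universe = set(all_genes)
    selected = set(selected) & universe
    essential = set(essential) & universe
    if not selected or not essential:
        raise ValueError("selected and essential sets must be non-empty")
    background_fraction = len(essential) / len(universe)
    selected_fraction = len(selected & essential) / len(selected)
    return selected_fraction / background_fraction


# TOY data: 100 genes, of which 10 are labelled essential (10% background).
all_genes = [f"gene{i}" for i in range(100)]
essential = all_genes[:10]
# A selected set of 10 genes containing 5 essential ones (50%).
selected = all_genes[:5] + all_genes[10:15]

enrichment = fold_enrichment(selected, essential, all_genes)
print(f"fold enrichment = {enrichment:.1f}")
```

Genes outside the background set are discarded before counting, so the ratio always compares like with like.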
Conclusions
The present study demonstrates the utility of using orthology information from multiple, diverse eukaryotes to predict essential genes. The data also emphasize the challenge of identifying essential genes among those in a parasite that are absent from its host.
doi:10.1186/1471-2164-11-222
PMCID: PMC2867826  PMID: 20361874
21.  An unexpectedly high degree of specialization and a widespread involvement in sterol metabolism among the C. elegans putative aminophospholipid translocases 
Background
P-type ATPases in subfamily IV are exclusively eukaryotic transmembrane proteins that have been proposed to directly translocate the aminophospholipids phosphatidylserine and phosphatidylethanolamine from the exofacial to the cytofacial monolayer of the plasma membrane. Eukaryotic genomes contain many genes encoding members of this subfamily. At present it is unclear why there are so many genes of this kind per organism or what individual roles these genes perform in organism development.
Results
We have systematically investigated the expression and developmental function of the six subfamily IV P-type ATPase genes, tat-1 through tat-6, encoded in the Caenorhabditis elegans genome. tat-5 is the only ubiquitously-expressed essential gene in the group. tat-6 is a poorly-transcribed recent duplicate of tat-5. tat-2 through 4 exhibit tissue-specific developmentally-regulated expression patterns. Strong expression of both tat-2 and tat-4 occurs in the intestine and certain other cells of the alimentary system. The two are also expressed in the uterus, during spermatogenesis and in the fully-formed spermatheca. tat-2 alone is expressed in the pharyngeal gland cells, the excretory system and a few cells of the developing vulva. The expression pattern of tat-3 is almost completely different from those of tat-2 and tat-4. tat-3 expression is detectable in the steroidogenic tissues: the hypodermis and the XXX cells, as well as in most cells of the pharynx (except gland), various tissues of the reproductive system (except uterus and spermatheca) and seam cells. Deletion of tat-1 through 4 individually interferes little or not at all with the regular progression of organism growth and development under normal conditions. However, tat-2 through 4 become essential for reproductive growth during sterol starvation.
Conclusion
tat-5 likely encodes a housekeeping protein that performs the proposed aminophospholipid translocase function routinely. Although individually dispensable, tat-1 through 4 seem to be at most only partly redundant. Expression patterns and the sterol deprivation hypersensitivity deletion phenotype of tat-2 through 4 suggest that these genes carry out subtle metabolic functions, such as fine-tuning sterol metabolism in digestive or steroidogenic tissues. These findings uncover an unexpectedly high degree of specialization and a widespread involvement in sterol metabolism among the genes encoding the putative aminophospholipid translocases.
doi:10.1186/1471-213X-8-96
PMCID: PMC2572054  PMID: 18831765
22.  Modification of Gene Duplicability during the Evolution of Protein Interaction Network 
PLoS Computational Biology  2011;7(4):e1002029.
Duplications of genes encoding highly connected and essential proteins are selected against in several species but not in human, where duplicated genes encode highly connected proteins. To understand when and how gene duplicability changed in evolution, we compare gene and network properties in four species (Escherichia coli, yeast, fly, and human) that are representative of the increase in evolutionary complexity, defined as progressive growth in the number of genes, cells, and cell types. We find that the origin and conservation of a gene significantly correlate with the properties of the encoded protein in the protein-protein interaction network. All four species preserve a core of singleton and central hubs that originated early in evolution, are highly conserved, and accomplish basic biological functions. Another group of hubs appeared in metazoans and duplicated in vertebrates, mostly through vertebrate-specific whole genome duplication. Such recent and duplicated hubs are frequently targets of microRNAs and show tissue-selective expression, suggesting that these are alternative mechanisms to control their dosage. Our study shows how networks were modified during evolution and contributes to explaining the occurrence of somatic genetic diseases, such as cancer, in terms of network perturbations.
Author Summary
Gene copy number is often tightly controlled because it directly affects the gene dosage. In several species, including yeast, worm, and fly, genes that have a single gene copy (singleton genes) encode proteins with several connections in the protein interaction network (hubs) as well as essential proteins. Surprisingly, in mouse and human, essential proteins and hubs are encoded by genes with more than one copy in the genome (duplicated genes). Here we show that these two distinct groups of hubs were acquired at different times during the evolution of the protein interaction network and contribute in different ways to the life of the cell. Singleton hubs are ancestral genes that are conserved from prokaryotes to vertebrates and accomplish basic functions that deal with cell survival. Duplicated hubs were acquired mostly within metazoans and duplicated through vertebrate-specific whole genome duplication. These genes are involved in processes that are crucial for the organization of multicellularity. Although duplicated, recent hubs are also subject to gene dosage control through microRNAs and tissue-selective expression. The clarification of how the protein interaction network evolves enables us to understand the adaptation to the progressive increase in complexity and to better characterize the genes involved in diseases such as cancer.
doi:10.1371/journal.pcbi.1002029
PMCID: PMC3072358  PMID: 21490719
23.  Combinatorial RNA interference in Caenorhabditis elegans reveals that redundancy between gene duplicates can be maintained for more than 80 million years of evolution 
Genome Biology  2006;7(8):R69.
High-throughput combinatorial RNAi demonstrates that many duplicated genes in C. elegans can retain redundant functions for more than 80 million years
Background
Systematic analyses of loss-of-function phenotypes have been carried out for most genes in Saccharomyces cerevisiae, Caenorhabditis elegans, and Drosophila melanogaster. Although such studies vastly expand our knowledge of single gene function, they do not address redundancy in genetic networks. Developing tools for the systematic mapping of genetic interactions is thus a key step in exploring the relationship between genotype and phenotype.
Results
We established conditions for RNA interference (RNAi) in C. elegans to target multiple genes simultaneously in a high-throughput setting. Using this approach, we can detect the great majority of previously known synthetic genetic interactions. We used this assay to examine the redundancy of duplicated genes in the genome of C. elegans that correspond to single orthologs in S. cerevisiae or D. melanogaster and identified 16 pairs of duplicated genes that have redundant functions. Remarkably, 14 of these redundant gene pairs were duplicated before the divergence of C. elegans and C. briggsae 80-110 million years ago, suggesting that there has been selective pressure to maintain the overlap in function between some gene duplicates.
Conclusion
We established a high throughput method for examining genetic interactions using combinatorial RNAi in C. elegans. Using this technique, we demonstrated that many duplicated genes can retain redundant functions for more than 80 million years of evolution. This provides strong support for evolutionary models that predict that genetic redundancy between duplicated genes can be actively maintained by natural selection and is not just a transient side effect of recent gene duplication events.
doi:10.1186/gb-2006-7-8-r69
PMCID: PMC1779603  PMID: 16884526
24.  GenomeHistory: a software tool and its application to fully sequenced genomes 
Nucleic Acids Research  2002;30(15):3378-3386.
We present a publicly available software tool (http://www.unm.edu/~compbio/software/GenomeHistory) that identifies all pairs of duplicate genes in a genome and then determines the degree of synonymous and non-synonymous divergence between each duplicate pair. Using this tool, we analyze the relations between (i) gene function and the propensity of a gene to duplicate and (ii) the number of genes in a gene family and the family’s rate of sequence evolution. We do so for the complete genomes of four eukaryotes (fission and budding yeast, fruit fly and nematode) and one prokaryote (Escherichia coli). For some classes of genes we observe a strong relationship between gene function and a gene’s propensity to undergo duplication. Most notably, ribosomal genes and transcription factors appear less likely to undergo gene duplication than other genes. In both fission and budding yeast, we see a strong positive correlation between the selective constraint on a gene and the size of the gene family of which this gene is a member. In contrast, a weakly negative correlation of this kind is seen in multicellular eukaryotes.
PMCID: PMC137074  PMID: 12140322
25.  Expansion of the human mitochondrial proteome by intra- and inter-compartmental protein duplication 
Genome Biology  2009;10(11):R135.
The human mitochondrial proteome is shown to have expanded due to duplication of protein encoding genes and re-localization of these duplicated proteins.
Background
Mitochondria are highly complex, membrane-enclosed organelles that are essential to the eukaryotic cell. The experimental elucidation of organellar proteomes combined with the sequencing of complete genomes allows us to trace the evolution of the mitochondrial proteome.
Results
We present a systematic analysis of the evolution of mitochondria via gene duplication in the human lineage. The most common duplications are intra-mitochondrial, in which the ancestral gene and the daughter genes encode mitochondrial proteins. These duplications significantly expanded carbohydrate metabolism, the protein import machinery and the calcium regulation of mitochondrial activity. The second most prevalent duplication, inter-compartmental, extended the catalytic as well as the RNA processing repertoire by the novel mitochondrial localization of the protein encoded by one of the daughter genes. Evaluation of the phylogenetic distribution of N-terminal targeting signals suggests a prompt gain of the novel localization after inter-compartmental duplication. Relocalized duplicates are more often expressed in a tissue-specific manner relative to intra-mitochondrial duplicates and mitochondrial proteins in general. In a number of cases, inter-compartmental duplications can be observed in parallel in yeast and human lineages leading to the convergent evolution of subcellular compartments.
Conclusions
One-to-one human-yeast orthologs are typically restricted to their ancestral subcellular localization. Gene duplication relaxes this constraint on the cellular location, allowing nascent proteins to be relocalized to other compartments. We estimate that the mitochondrial proteome has expanded by at least 50% since the common ancestor of human and yeast.
doi:10.1186/gb-2009-10-11-r135
PMCID: PMC3091328  PMID: 19930686
