Results 1-25 (29)
 

1.  Weak Negative and Positive Selection and the Drift Load at Splice Sites 
Genome Biology and Evolution  2014;6(6):1437-1447.
Splice sites (SSs) are short sequences that are crucial for proper mRNA splicing in eukaryotic cells, and therefore can be expected to be shaped by strong selection. Nevertheless, in mammals and in other intron-rich organisms, many of the SSs often involve nonconsensus (Nc), rather than consensus (Cn), nucleotides, and beyond the two critical nucleotides, the SSs are not perfectly conserved between species. Here, we compare the SS sequences between primates, and between Drosophila fruit flies, to reveal the pattern of selection acting at SSs. Cn-to-Nc substitutions are less frequent, and Nc-to-Cn substitutions are more frequent, than neutrally expected, indicating, respectively, negative and positive selection. This selection is relatively weak (1 < |4N_e s| < 4), and has a similar efficiency in primates and in Drosophila. Within some nucleotide positions, the positive selection in favor of Nc-to-Cn substitutions is weaker than the negative selection maintaining already established Cn nucleotides; this difference is due to site-specific negative selection favoring current Nc nucleotides. In general, however, the strength of negative selection protecting the Cn alleles is similar in magnitude to the strength of positive selection favoring replacement of Nc alleles, as expected under the simple nearly neutral turnover. In summary, although a fraction of the Nc nucleotides within SSs is maintained by selection, the abundance of deleterious nucleotides in this class suggests a substantial genome-wide drift load.
doi:10.1093/gbe/evu100
PMCID: PMC4079205  PMID: 24966225
splicing; splice sites; nearly neutral evolution; positive selection; negative selection; drift load
2.  Intrasubtype Reassortments Cause Adaptive Amino Acid Replacements in H3N2 Influenza Genes 
PLoS Genetics  2014;10(1):e1004037.
Reassortments and point mutations are two major contributors to diversity of Influenza A virus; however, the link between these two processes is unclear. It has been suggested that reassortments provoke a temporary increase in the rate of amino acid changes as the viral proteins adapt to new genetic environment, but this phenomenon has not been studied systematically. Here, we use a phylogenetic approach to infer the reassortment events between the 8 segments of influenza A H3N2 virus since its emergence in humans in 1968. We then study the amino acid replacements that occurred in genes encoded in each segment subsequent to reassortments. In five out of eight genes (NA, M1, HA, PB1 and NS1), the reassortment events led to a transient increase in the rate of amino acid replacements on the descendant phylogenetic branches. In NA and HA, the replacements following reassortments were enriched with parallel and/or reversing replacements; in contrast, the replacements at sites responsible for differences between antigenic clusters (in HA) and at sites under positive selection (in NA) were underrepresented among them. Post-reassortment adaptive walks contribute to adaptive evolution in Influenza A: in NA, an average reassortment event causes at least 2.1 amino acid replacements in a reassorted gene, with, on average, 0.43 amino acid replacements per evolving post-reassortment lineage; and at least ∼9% of all amino acid replacements are provoked by reassortments.
Author Summary
Influenza A is a rapidly evolving virus with genome composed of eight distinct RNA molecules called segments. This genetic structure allows formation of new combinations of segments when a cell is coinfected by multiple viral strains, in a process called reassortment. While “antigenic drift” – the process of continuous accumulation of point mutations that change the antigenic properties of the viral proteins – is mainly responsible for the seasonal flu, the heaviest pandemics were caused by spread of novel reassortant strains and the associated radical “antigenic shift”. However, the association between these two types of processes has not been studied systematically. Here, we use the extensive available complete-genome sequencing data for Influenza A H3N2 subtype to infer the evolutionary timings of within-subtype reassortment events, and study the patterns of point amino acid-changing replacements that followed reassortments. We find that reassortments were often rapidly followed by replacements, which possibly compensated for the loss of fitness associated with the reassortment or explored newly accessible fitness peaks. These findings may be relevant for prediction of future pandemic strains of Influenza A.
doi:10.1371/journal.pgen.1004037
PMCID: PMC3886890  PMID: 24415946
3.  Fitness conferred by replaced amino acids declines with time 
Biology Letters  2012;8(5):825-828.
The fitness landscape of a locus, the array of fitnesses conferred by its alleles, can be affected by allele replacements at other loci, in the presence of epistatic interactions between loci. In a pair of diverging homologous proteins, the initially high probability that an amino acid replacement in one of them will make it more similar to the other declines with time, implying that the fitness landscapes of homologous sites diverge. Here, we use data on within-population non-synonymous polymorphisms and on amino acid replacements between species to study the dynamics, after an amino acid replacement, of the fitness of the ancestral amino acid, and show that selection against its restoration increases with time. This effect can be owing to increase of fitness conferred by the new amino acid occupying the site, and/or to decline of fitness conferred by the replaced amino acid. We show that the fitness conferred by the replaced amino acid rapidly declines, reaching a new lower steady-state level after approximately 20 per cent of amino acids in the protein get replaced. Therefore, amino acid replacements in evolving proteins are routinely involved in negative epistatic interactions with currently absent amino acids, and chisel off the unused parts of the fitness landscape.
doi:10.1098/rsbl.2012.0356
PMCID: PMC3440982  PMID: 22628094
evolution; fitness landscape dynamics; absent allele; reversing replacement; epistatic interactions
4.  The miniature genome of a carnivorous plant Genlisea aurea contains a low number of genes and short non-coding sequences 
BMC Genomics  2013;14:476.
Background
Genlisea aurea (Lentibulariaceae) is a carnivorous plant with an unusually small genome (63.6 Mb), one of the smallest known among higher plants. Data on the genome sizes and the phylogeny of Genlisea suggest that this is a derived state within the genus. Thus, G. aurea is an excellent model organism for studying evolutionary mechanisms of genome contraction.
Results
Here we report the sequencing and de novo draft assembly of the G. aurea genome. The assembly consists of 10,687 contigs with a total length of 43.4 Mb and includes 17,755 complete and partial protein-coding genes. Its comparison with the genome of Mimulus guttatus, another representative of the higher core Lamiales clade, reveals striking differences in gene content and in the length of non-coding regions.
Conclusions
Genome contraction was a complex process, which involved gene loss and reduction of lengths of introns and intergenic regions, but not intron loss. The gene loss is more frequent for the genes that belong to multigenic families indicating that genetic redundancy is an important prerequisite for genome size reduction.
doi:10.1186/1471-2164-14-476
PMCID: PMC3728226  PMID: 23855885
Genome reduction; Carnivorous plant; Intron; Intergenic region
5.  Prevalence of Multinucleotide Replacements in Evolution of Primates and Drosophila 
Molecular Biology and Evolution  2013;30(6):1315-1325.
Evolution of sequences mostly involves independent changes at different sites. However, substitutions at neighboring sites may co-occur as multinucleotide replacement events (MNRs). Here, we compare noncoding sequences of several species of primates, and of three species of Drosophila fruit flies, in a phylogenetic analysis of the replacements that occurred between species at nearby nucleotide sites. Both in primates and in Drosophila, the frequency of single-nucleotide replacements is substantially elevated within 10 nucleotides from other replacements that occurred on the same lineage but not on another lineage. The data imply that dinucleotide replacements (DNRs) affecting sites at distances of up to 10 nucleotides from each other are responsible for 2.3% of single-nucleotide replacements in primate genomes and for 5.6% in Drosophila genomes. Among these DNRs, 26% and 69%, respectively, are in fact parts of replacements of three or more nucleotides (trinucleotide replacements, TNRs). The plurality of MNRs affect nearby nucleotides, so that at least six times as many DNRs affect two adjacent nucleotide sites as affect sites 10 nucleotides apart. Still, approximately 60% of DNRs, and approximately 90% of TNRs, span distances more than two (or three) nucleotides. MNRs make a major contribution to the observed clustering of substitutions: In the human–chimpanzee comparison, DNRs are responsible for 50% of cases when two nearby replacements are observed on the human lineage, and TNRs are responsible for 83% of cases when three replacements at three immediately adjacent sites are observed on the human lineage. The prevalence of MNRs matches that observed in data on de novo mutations and is also observed in the regions with the lowest sequence conservation, suggesting that MNRs mainly have mutational origin; however, epistatic selection and/or gene conversion may also play a role.
doi:10.1093/molbev/mst036
PMCID: PMC3649671  PMID: 23447710
multinucleotide replacements; complex mutations; mutagenesis; D. melanogaster; H. sapiens
6.  Genome-Level Analysis of Selective Constraint without Apparent Sequence Conservation 
Genome Biology and Evolution  2013;5(3):532-541.
Conservation of function can be accompanied by obvious similarity of homologous sequences which may persist for billions of years (Iyer LM, Leipe DD, Koonin EV, Aravind L. 2004. Evolutionary history and higher order classification of AAA+ ATPases. J Struct Biol. 146:11–31.). However, presumably homologous segments of noncoding DNA can also retain their ancestral function even after their sequences diverge beyond recognition (Fisher S, Grice EA, Vinton RM, Bessling SL, McCallion AS. 2006. Conservation of RET regulatory function from human to zebrafish without sequence similarity. Science 312:276–279.). To investigate this phenomenon at the genomic scale, we studied homologous introns in a quartet of insect species, and in a quartet of vertebrate species. Each quartet consisted of two pairs of moderately distant genomes, with a much larger evolutionary distance between the pairs. In both quartets, we found that introns that carry a regulatory segment or a conserved segment in the first pair tend to carry a conserved segment in the second pair, even though no similarity of these segments could be detected between the two pairs. Furthermore, introns from one pair that are preserved in the other pair tend to carry a conserved segment within the first pair, and be longer in the first pair, compared with the introns that were lost between pairs, even though no similarity between pairs could be detected in such preserved introns. These results indicate that selective constraint, presumably caused by conservation of the ancestral function, often persists even after the homologous DNA segments become unalignable.
doi:10.1093/gbe/evt023
PMCID: PMC3622294  PMID: 23418180
conserved noncoding elements; turnover of regulatory elements; negative selection; evolution of regulatory sequences
7.  Strong Mutational Bias Toward Deletions in the Drosophila melanogaster Genome Is Compensated by Selection 
Genome Biology and Evolution  2013;5(3):514-524.
Insertions and deletions (collectively indels) obviously have a major impact on genome evolution. However, before large-scale data on indel polymorphism became available, it was difficult to estimate the strength of selection acting on indel mutations. Here, we analyze indel polymorphism and divergence in different compartments of the Drosophila melanogaster genome: exons, introns of different lengths, and intergenic regions. Data on low-frequency polymorphisms indicate that 0.036–0.039 short (1–30 nt) insertion mutations and 0.085–0.092 short deletion mutations, with mean lengths 3.23 and 4.78, respectively, occur per single-nucleotide substitution. The excess of short deletion over short insertion mutations implies that indel mutations of these lengths should lead to a loss of approximately 0.30 nt per single-nucleotide replacement. However, polymorphism and divergence data show that this deletion bias is almost completely compensated by selection: Negative selection is stronger against deletions, whereas insertions are more likely to be favored by positive selection. Among the inframe low-frequency polymorphic mutations in exons, long introns, and intergenic regions, selection prevents a larger fraction of deletions (80–87%, depending on the type of the compartment) than of insertions (70–82%) or single-nucleotide substitutions (49–73%), from reaching high frequencies. The corresponding fractions were the lowest in short introns: 66%, 47%, and 15%, respectively, consistent with the weakest selective constraint in them. The McDonald–Kreitman test shows that 32–46% of the deletions and 60–73% of the insertions that were fixed in the recent evolution of D. melanogaster are adaptive, whereas this fraction is only 0–29% for single-nucleotide substitutions.
doi:10.1093/gbe/evt021
PMCID: PMC3622295  PMID: 23395983
indels; deletion bias; indel polymorphism; positive selection; negative selection
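The net length loss quoted in the abstract above can be checked with simple arithmetic. The sketch below multiplies each per-substitution indel rate by its mean length and takes the difference. Pairing the lower bounds together and the upper bounds together is an assumption made here for illustration, and the function name is hypothetical.

```python
def net_length_change(ins_rate, ins_len, del_rate, del_len):
    """Expected net change in sequence length (nt) per single-nucleotide
    substitution: nucleotides gained by insertions minus nucleotides
    lost to deletions."""
    return ins_rate * ins_len - del_rate * del_len


# Rates and mean lengths as quoted in the abstract.
# Assumption: lower bounds are paired with lower bounds, upper with upper.
low = net_length_change(0.036, 3.23, 0.085, 4.78)
high = net_length_change(0.039, 3.23, 0.092, 4.78)

print(f"net change per substitution: {low:.3f} to {high:.3f} nt")
```

Both values are negative and close to -0.3, consistent with the reported loss of approximately 0.30 nt per single-nucleotide replacement.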
8.  Nested genes and increasing organizational complexity of metazoan genomes 
Trends in Genetics  2008;24(10):475-478.
The most common form of protein-coding gene overlap in eukaryotes is a simple nested structure, whereby one gene is embedded in an intron of another. Analysis of nested protein-coding genes in vertebrates, fruit flies and nematodes revealed substantially higher rates of evolutionary gains than losses. The accumulation of nested gene structures could not be attributed to any obvious functional relationships between the genes involved and represents an increase of the organizational complexity of animal genomes via a neutral process.
doi:10.1016/j.tig.2008.08.003
PMCID: PMC3380635  PMID: 18774620
9.  Major role of positive selection in the evolution of conservative segments of Drosophila proteins 
Slow evolution of conservative segments of coding and non-coding DNA is caused by the action of negative selection, which removes new mutations. However, the mode of selection that affects the few substitutions that do occur within such segments remains unclear. Here, we show that the fraction of allele replacements that were driven by positive selection, and the strength of this selection, is the highest within the conservative segments of Drosophila protein-coding genes. The McDonald–Kreitman test, applied to the data on variation in Drosophila melanogaster and in Drosophila simulans, indicates that within the most conservative protein segments, approximately 72 per cent (approx. 80%) of allele replacements were driven by positive selection, as opposed to only approximately 44 per cent (approx. 53%) at rapidly evolving segments. Data on multiple non-synonymous substitutions at a codon lead to the same conclusion and additionally indicate that positive selection driving allele replacements at conservative sites is the strongest, as it accelerates evolution by a factor of approximately 40, as opposed to a factor of approximately 5 at rapidly evolving sites. Thus, random drift plays only a minor role in the evolution of conservative DNA segments, and those relatively rare allele replacements that occur within such segments are mostly driven by substantial positive selection.
doi:10.1098/rspb.2012.0776
PMCID: PMC3396909  PMID: 22673359
positive selection; negative selection; McDonald–Kreitman test; double substitutions
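For readers unfamiliar with the McDonald–Kreitman test used in the abstract above, the following minimal sketch shows the commonly used estimator of the fraction of adaptive substitutions, alpha = 1 - (Ds * Pn) / (Dn * Ps). The counts are hypothetical and are not taken from the paper, and the function name is an assumption; the authors' actual procedure may differ in detail.

```python
def mk_alpha(dn, ds, pn, ps):
    """Fraction of non-synonymous substitutions attributed to positive
    selection under the McDonald-Kreitman framework.

    dn, ds: non-synonymous and synonymous divergence counts
    pn, ps: non-synonymous and synonymous polymorphism counts
    """
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be non-zero")
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts, for illustration only.
alpha = mk_alpha(dn=100, ds=100, pn=20, ps=50)
print(f"alpha = {alpha:.2f}")
```

An alpha near zero means divergence matches the neutral expectation set by polymorphism, while values approaching one indicate that most replacements were adaptive.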
10.  The Effects of New Alibernet Red Wine Extract on Nitric Oxide and Reactive Oxygen Species Production in Spontaneously Hypertensive Rats 
We aimed to perform a chemical analysis of both Alibernet red wine and an alcohol-free Alibernet red wine extract (AWE) and to investigate the effects of AWE on nitric oxide and reactive oxygen species production as well as blood pressure development in normotensive Wistar Kyoto (WKY) and spontaneously hypertensive rats (SHRs). Total antioxidant capacity together with total phenolic and selected mineral content was measured in wine and AWE. Young 6-week-old male WKY and SHR were treated with AWE (24.2 mg/kg/day) for 3 weeks. Total NOS and SOD activities, eNOS and SOD1 protein expressions, and superoxide production were determined in the tissues. Both antioxidant capacity and phenolic content were significantly higher in AWE compared to wine. The AWE increased NOS activity in the left ventricle, aorta, and kidney of SHR, while it did not change NOS activity in WKY rats. Similarly, increased SOD activity in the plasma and left ventricle was observed in SHR only. There were no changes in eNOS and SOD1 expressions. In conclusion, phenolics and minerals included in AWE may contribute directly to increased NOS and SOD activities of SHR. Nevertheless, 3 weeks of AWE treatment failed to affect blood pressure of SHR.
doi:10.1155/2012/806285
PMCID: PMC3375118  PMID: 22720118
11.  Insertions and deletions trigger adaptive walks in Drosophila proteins 
Maps that relate all possible genotypes or phenotypes to fitness—fitness landscapes—are central to the evolution of life, but remain poorly known. An insertion or a deletion (indel) of one or several amino acids constitutes a substantial leap of a protein within the space of amino acid sequences, and it is unlikely that after such a leap the new sequence corresponds precisely to a fitness peak. Thus, one can expect an indel in the protein-coding sequence that gets fixed in a population to be followed by some number of adaptive amino acid substitutions, which move the new sequence towards a nearby fitness peak. Here, we study substitutions that occur after a frame-preserving indel in evolving proteins of Drosophila. An insertion triggers 1.03 ± 0.75 amino acid substitutions within the protein region centred at the site of insertion, and a deletion triggers 4.77 ± 1.03 substitutions within such a region. The difference between these values is probably owing to a higher fraction of effectively neutral insertions. Almost all of the triggered amino acid substitutions can be attributed to positive selection, and most of them occur relatively soon after the triggering indel and take place upstream of its site. A high fraction of substitutions that follow an indel occur at previously conserved sites, suggesting that an indel substantially changes selection that shapes the protein region around it. Thus, an indel is often followed by an adaptive walk of length that is in agreement with the theory of molecular adaptation.
doi:10.1098/rspb.2011.2571
PMCID: PMC3385466  PMID: 22456880
indels; fitness landscape; adaptive walk; McDonald–Kreitman
12.  A Strong Deletion Bias in Nonallelic Gene Conversion 
PLoS Genetics  2012;8(2):e1002508.
Gene conversion is the unidirectional transfer of genetic information between orthologous (allelic) or paralogous (nonallelic) genomic segments. Though a number of studies have examined nucleotide replacements, little is known about length difference mutations produced by gene conversion. Here, we investigate insertions and deletions produced by nonallelic gene conversion in 338 Drosophila and 10,149 primate paralogs. Using a direct phylogenetic approach, we identify 179 insertions and 614 deletions in Drosophila paralogs, and 132 insertions and 455 deletions in primate paralogs. Thus, nonallelic gene conversion is strongly deletion-biased in both lineages, with almost 3.5 times as many conversion-induced deletions as insertions. In primates, the deletion bias is considerably stronger for long indels and, in both lineages, the per-site rate of gene conversion is orders of magnitude higher than that of ordinary mutation. Due to this high rate, deletion-biased nonallelic gene conversion plays a key role in genome size evolution, leading to the cooperative shrinkage and eventual disappearance of selectively neutral paralogs.
Author Summary
Gene conversion is a process whereby a DNA sequence is copied from one segment of the genome (donor) to another (recipient), resulting in the replacement, insertion, or deletion of a DNA sequence in the recipient. This exchange is facilitated by the high sequence similarity of the two segments, which is due to their evolutionary relationship. Here, we study insertions and deletions produced by gene conversion between paralogs, segments related by DNA duplication events. By comparing paralog sequences in multiple species of fruit flies and primates, we find that deletions occur more than three times as frequently as insertions. We also discover that the rate of gene conversion between paralogs is quite high. The deletion bias and high rate of this process cause paralogs to shrink cooperatively and eventually be eliminated from the genome. Because of the abundance of paralogs in animal genomes, this phenomenon can lead to a significant reduction in genome size. Therefore, our finding enhances our understanding of the forces that lead to changes in genome size during evolution.
doi:10.1371/journal.pgen.1002508
PMCID: PMC3280953  PMID: 22359514
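The "almost 3.5 times" figure in the abstract above follows directly from the reported indel counts. A minimal sketch of that ratio is shown below; the helper name is hypothetical.

```python
def deletion_bias(insertions, deletions):
    """Ratio of conversion-induced deletions to insertions."""
    return deletions / insertions


# Counts as reported in the abstract.
drosophila = deletion_bias(insertions=179, deletions=614)
primates = deletion_bias(insertions=132, deletions=455)

print(f"Drosophila: {drosophila:.2f}, primates: {primates:.2f}")
```

Both lineages give a ratio between 3.4 and 3.5, which is why the abstract can summarize the bias with a single figure.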
13.  Rate and breadth of protein evolution are only weakly correlated 
Biology Direct  2012;7:8.
Background
Evolution at a protein site can be characterized from two different perspectives, by its rate and by the breadth of the set of acceptable amino acids.
Results
There is a weak positive correlation between rates and breadths of evolution, both across individual amino acid sites and across proteins.
Conclusions
Rate and breadth are two distinct, and only weakly correlated, characteristics of protein evolution. The most likely explanation of their positive correlation is heterogeneity of selective constraint, such that less functionally important sites evolve faster and can accept more amino acids.
Reviewers
This article was reviewed by Eugene V. Koonin, Arcady R. Mushegyan, and Eugene I. Shakhnovich.
doi:10.1186/1745-6150-7-8
PMCID: PMC3331848  PMID: 22336199
14.  Detecting Past Positive Selection through Ongoing Negative Selection 
Genome Biology and Evolution  2011;3:1006-1013.
Detecting positive selection is a challenging task. We propose a method for detecting past positive selection through ongoing negative selection, based on comparison of the parameters of intraspecies polymorphism at functionally important and selectively neutral sites where a nucleotide substitution of the same kind occurred recently. Reduced occurrence of recently replaced ancestral alleles at functionally important sites indicates that negative selection currently acts against these alleles and, therefore, that their replacements were driven by positive selection. Application of this method to the Drosophila melanogaster lineage shows that the fraction of adaptive amino acid replacements remained approximately 0.5 for a long time. In the Homo sapiens lineage, however, this fraction drops from approximately 0.5 before the Ponginae–Homininae divergence to approximately 0 after it. The proposed method is based on essentially the same data as the McDonald–Kreitman test but is free from some of its limitations, which may open new opportunities, especially when many genotypes within a species are known.
doi:10.1093/gbe/evr086
PMCID: PMC3184776  PMID: 21859804
natural selection; amino acid substitutions; polymorphism; divergence; McDonald–Kreitman test; allele frequency spectrum
15.  Measurements of spontaneous rates of mutations in the recent past and the near future 
The rate of spontaneous mutation in natural populations is a fundamental parameter for many evolutionary phenomena. Because the rate of mutation is generally low, most of what is currently known about mutation has been obtained through indirect, complex and imprecise methodological approaches. However, in the past few years genome-wide sequencing of closely related individuals has made it possible to estimate the rates of mutation directly at the level of the DNA, avoiding most of the problems associated with using indirect methods. Here, we review the methods used in the past with an emphasis on next generation sequencing, which may soon make the accurate measurement of spontaneous mutation rates a matter of routine.
doi:10.1098/rstb.2009.0286
PMCID: PMC2871817  PMID: 20308091
mutation; sequencing; estimating mutation rate; mutation accumulation
16.  On the relationship between the load and the variance of relative fitness 
Biology Direct  2011;6:20.
Background
Operation of natural selection can be characterized by a variety of quantities. Among them, variance of relative fitness V and load L are the most fundamental.
Results
Among all modes of selection that produce a particular value V of the variance of relative fitness, the minimal value L_min of load L is produced by a mode under which fitness takes only two values, 0 and some positive value, and is equal to V/(1+V).
Conclusions
Although it is impossible to deduce the load from knowledge of the variance of relative fitness alone, it is possible to determine the minimal load consistent with a particular variance of relative fitness. The concept of minimal load consistent with a particular biological phenomenon may be applicable to studying several aspects of natural selection.
Reviewers
The manuscript was reviewed by Sergei Maslov, Alexander Gordon, and Eugene Koonin.
doi:10.1186/1745-6150-6-20
PMCID: PMC3094333  PMID: 21492441
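The minimal-load result stated in the abstract above can be illustrated numerically. The sketch below assumes the usual definition of load, L = (w_max - mean w) / w_max, which the abstract does not spell out, and checks that the two-valued fitness mode reaches V/(1+V). Function names are hypothetical.

```python
def min_load(variance):
    """Minimal load consistent with a given variance V of relative fitness,
    as stated in the abstract: V / (1 + V)."""
    return variance / (1.0 + variance)


def two_value_mode(p_zero):
    """Selection mode in which fitness is 0 with probability p_zero and 1
    otherwise. Returns (variance of relative fitness, load).

    Assumption: load is defined as (w_max - mean_w) / w_max, with w_max = 1.
    """
    mean_w = 1.0 - p_zero
    # Relative fitness is 0 or 1/mean_w; its mean is 1 by construction.
    var_rel = (1.0 - p_zero) * (1.0 / mean_w) ** 2 - 1.0
    load = 1.0 - mean_w
    return var_rel, load


v, load = two_value_mode(0.2)
print(f"V = {v:.3f}, L = {load:.3f}, V/(1+V) = {min_load(v):.3f}")
```

For this mode V = p/(1 - p) and L = p, so L = V/(1+V) holds for any p between 0 and 1.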
18.  Rate of sequence divergence under constant selection 
Biology Direct  2010;5:5.
Background
Divergence of two independently evolving sequences that originated from a common ancestor can be described by two parameters, the asymptotic level of divergence E and the rate r at which this level of divergence is approached. Constant negative selection impedes allele replacements and, therefore, is routinely assumed to decelerate sequence divergence. However, its impact on E and on r has not been formally investigated.
Results
Strong selection that favors only one allele can make E arbitrarily small and r arbitrarily large. In contrast, in the case of 4 possible alleles and equal mutation rates, the lowest value of r, attained when two alleles confer equal fitnesses and the other two are strongly deleterious, is only two times lower than its value under selective neutrality.
Conclusions
Constant selection can strongly constrain the level of sequence divergence, but cannot reduce substantially the rate at which this level is approached. In particular, under any constant selection the divergence of sequences that accumulated one substitution per neutral site since their origin from the common ancestor must already constitute at least one half of the asymptotic divergence at sites under such selection.
Reviewers
This article was reviewed by Drs. Nicolas Galtier, Sergei Maslov, and Nick Grishin.
doi:10.1186/1745-6150-5-5
PMCID: PMC2835663  PMID: 20092641
19.  Correction: Hypermutable Non-Synonymous Sites Are under Stronger Negative Selection 
PLoS Genetics  2008;4(12):10.1371/annotation/a81b1fab-890c-447b-a308-5bc8ca3eb21d .
doi:10.1371/annotation/a81b1fab-890c-447b-a308-5bc8ca3eb21d
PMCID: PMC2605301  PMID: 19096535
20.  Hypermutable Non-Synonymous Sites Are under Stronger Negative Selection 
PLoS Genetics  2008;4(11):e1000281.
Mutation rate varies greatly between nucleotide sites of the human genome and depends both on the global genomic location and the local sequence context of a site. In particular, CpG context elevates the mutation rate by an order of magnitude. Mutations also vary widely in their effect on the molecular function, phenotype, and fitness. Independence of the probability of occurrence of a new mutation from its effect has been a fundamental premise in genetics. However, highly mutable contexts may be preserved by negative selection at important sites but destroyed by mutation at sites under no selection. Thus, there may be a positive correlation between the rate of mutations at a nucleotide site and the magnitude of their effect on fitness. We studied the impact of CpG context on the rate of human–chimpanzee divergence and on intrahuman nucleotide diversity at non-synonymous coding sites. We compared nucleotides that occupy identical positions within codons of identical amino acids and only differ by being within versus outside CpG context. Nucleotides within CpG context are under a stronger negative selection, as revealed by their lower, proportionally to the mutation rate, rate of evolution and nucleotide diversity. In particular, the probability of fixation of a non-synonymous transition at a CpG site is two times lower than at a non-CpG site. Thus, sites with different mutation rates are not necessarily selectively equivalent. This suggests that the mutation rate may complement sequence conservation as a characteristic predictive of functional importance of nucleotide sites.
Author Summary
Mutations occur in some sites in the genome more frequently than in others. Similarly, mutations in some sites have greater consequences than in others. The effect of mutations might not be independent of the frequency with which mutations occur. Indeed, sites where mutations happen frequently will be preserved if the effects of these mutations are severe or will otherwise be allowed to mutate if there are no consequences for the organism. We compared both human–chimpanzee differences and sequence variation among humans in protein coding genes. We found that highly mutable nucleotide sites, such as the dinucleotide CpG, are on average more important and more frequently preserved by natural selection. Using this information, together with other features such as sequence conservation, opens a new perspective to predict the effect of human mutations, including their potential involvement in diseases.
doi:10.1371/journal.pgen.1000281
PMCID: PMC2583910  PMID: 19043566
21.  Short sequence motifs, overrepresented in mammalian conserved non-coding sequences 
BMC Genomics  2007;8:378.
Background
A substantial fraction of non-coding DNA sequences of multicellular eukaryotes is under selective constraint. In particular, ~5% of the human genome consists of conserved non-coding sequences (CNSs). CNSs differ from other genomic sequences in their nucleotide composition and must play important functional roles, which mostly remain obscure.
Results
We investigated relative abundances of short sequence motifs in all human CNSs present in the human/mouse whole-genome alignments vs. three background sets of sequences: (i) weakly conserved or unconserved non-coding sequences (non-CNSs); (ii) near-promoter sequences (located between nucleotides -500 and -1500, relative to a start of transcription); and (iii) random sequences with the same nucleotide composition as that of CNSs. When compared to non-CNSs and near-promoter sequences, CNSs possess an excess of AT-rich motifs, often containing runs of identical nucleotides. In contrast, when compared to random sequences, CNSs contain an excess of GC-rich motifs which, however, lack CpG dinucleotides. Thus, abundance of short sequence motifs in human CNSs, taken as a whole, is mostly determined by their overall compositional properties and not by overrepresentation of any specific short motifs. These properties are: (i) high AT-content of CNSs, (ii) a tendency, probably due to context-dependent mutation, of A's and T's to clump, (iii) presence of short GC-rich regions, and (iv) avoidance of CpG contexts, due to their hypermutability. Only a small number of the short motifs overrepresented in all human CNSs are similar to binding sites of transcription factors from the FOX family.
Conclusion
Human CNSs as a whole appear to be too broad a class of sequences to possess strong footprints of any short sequence-specific functions. Such footprints should be studied at the level of functional subclasses of CNSs, such as those which flank genes with a particular pattern of expression. Overall properties of CNSs are affected by patterns in mutation, suggesting that selection which causes their conservation is not always very strong.
doi:10.1186/1471-2164-8-378
PMCID: PMC2176071  PMID: 17945028
22.  Extensive parallelism in protein evolution 
Biology Direct  2007;2:20.
Background
Independently evolving lineages mostly accumulate different changes, which leads to their gradual divergence. However, parallel accumulation of identical changes is also common, especially in traits with only a small number of possible states.
Results
We characterize parallelism in evolution of coding sequences in three four-species sets of genomes of mammals, Drosophila, and yeasts. Each such set contains two independent evolutionary paths, which we call paths I and II. An amino acid replacement which occurred along path I also occurs along path II with a probability that is 50–80% of that expected under selective neutrality. Thus, the per site rate of parallel evolution of proteins is several times higher than their average rate of evolution, but still lower than the rate of evolution of neutral sequences. This deficit may be caused by changes in the fitness landscape, leading to a replacement being possible along path I but not along path II. However, constant, weak selection assumed by the nearly neutral model of evolution appears to be a more likely explanation. Then, the average coefficient of selection associated with an amino acid replacement, in the units of the effective population size, must exceed ~0.4, and the fraction of effectively neutral replacements must be below ~30%. At a majority of evolvable amino acid sites, only a relatively small number of different amino acids is permitted.
Conclusion
High, but below-neutral, rates of parallel amino acid replacements suggest that a majority of amino acid replacements that occur in evolution are subject to weak, but non-trivial, selection, as predicted by Ohta's nearly-neutral theory.
Reviewers
This article was reviewed by John McDonald (nominated by Laura Landweber), Sarah Teichmann and Subhajyoti De, and Chris Adami.
doi:10.1186/1745-6150-2-20
PMCID: PMC2020468  PMID: 17705846
23.  Two classes of deleterious recessive alleles in a natural population of zebrafish, Danio rerio 
Natural populations carry deleterious recessive alleles which cause inbreeding depression. We compared mortality and growth of inbred and outbred zebrafish, Danio rerio, between 6 and 48 days of age. Grandparents of the studied fish were caught in the wild. Inbred fish were generated by brother-sister mating. Mortality was 9% in outbred fish and 42% in inbred fish, which implies at least 3.6 lethal equivalents of deleterious recessive alleles per zygote. There was no significant inbreeding depression in growth, perhaps because the surviving inbred fish lived under less crowded conditions. In contrast to alleles that cause embryonic and early larval mortality in the same population, alleles responsible for late larval and early juvenile mortality did not result in any gross morphological abnormalities. Thus, deleterious recessive alleles that segregate in a wild zebrafish population belong to two sharply distinct classes: early-acting, morphologically overt, unconditional lethals; and later-acting, morphologically cryptic, and presumably milder alleles.
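The figure of 3.6 lethal equivalents can be checked with a short calculation. The sketch below is not taken from the abstract: it assumes the standard log-survival regression estimate (lethal equivalents per gamete B = -ln(S_inbred / S_outbred) / F, doubled to give a per-zygote value) and an inbreeding coefficient F = 0.25 for offspring of brother-sister mating. The function name is hypothetical.

```python
import math

def lethal_equivalents_per_zygote(mort_outbred, mort_inbred, f):
    """Estimate lethal equivalents per zygote from two mortality rates.

    Assumption (not stated in the abstract): B per gamete is
    -ln(S_inbred / S_outbred) / F, and the per-zygote value is 2B.
    """
    surv_outbred = 1.0 - mort_outbred
    surv_inbred = 1.0 - mort_inbred
    b_per_gamete = -math.log(surv_inbred / surv_outbred) / f
    return 2.0 * b_per_gamete

# Mortalities reported in the abstract: 9% outbred, 42% inbred.
# F = 0.25 is the assumed inbreeding coefficient for full-sib mating.
le = lethal_equivalents_per_zygote(0.09, 0.42, 0.25)
print(round(le, 1))  # 3.6
```

Under these assumptions the estimate agrees with the value reported in the abstract.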
doi:10.1098/rspb.2004.2787
PMCID: PMC1691827  PMID: 15451692
24.  Rate of promoter class turn-over in yeast evolution 
Background
Phylogenetic conservation at the DNA level is routinely used as evidence of molecular function, under the assumption that locations and sequences of functional DNA segments remain invariant in evolution. In particular, short DNA segments participating in initiation and regulation of transcription are often conserved between related species. However, transcription of a gene can evolve, and this evolution may involve changes of even such conservative DNA segments. Genes of yeast Saccharomyces have promoters of two classes, class 1 (TATA-containing) and class 2 (non-TATA-containing).
Results
Comparison of upstream non-coding regions of orthologous genes from the five species of Saccharomyces sensu stricto group shows that among 212 genes which very likely have class 1 promoters in S. cerevisiae, 17 probably have class 2 promoters in one or more other species. Conversely, among 322 genes which very likely have class 2 promoters in S. cerevisiae, 44 probably have class 1 promoters in one or more other species. Also, for at least 2 genes from the set of 212 S. cerevisiae genes with class 1 promoters, the locations of the TATA consensus sequences are substantially different between the species.
Conclusion
Our results indicate that, in the course of yeast evolution, a promoter switches its class with a probability of at least ~0.1 per time required for the accumulation of one nucleotide substitution at a non-coding site. Thus, key sequences involved in initiation of transcription evolve at substantial rates in yeast.
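As a rough illustration, the counts reported in the Results can be turned into raw fractions of genes whose promoter class differs in at least one other species. These fractions are not the per-substitution switch probability quoted in the Conclusion, which also depends on the divergence among the species and is not derivable from this abstract alone. The variable names are illustrative only.

```python
# Counts taken from the Results paragraph of the abstract.
class1_total, class1_switched = 212, 17   # class 1 (TATA-containing) in S. cerevisiae
class2_total, class2_switched = 322, 44   # class 2 (non-TATA-containing) in S. cerevisiae

def fraction(switched, total):
    """Raw fraction of genes with a different promoter class elsewhere."""
    return switched / total

frac_class1 = fraction(class1_switched, class1_total)
frac_class2 = fraction(class2_switched, class2_total)
frac_pooled = fraction(class1_switched + class2_switched,
                       class1_total + class2_total)

print(f"class 1 -> class 2: {frac_class1:.3f}")
print(f"class 2 -> class 1: {frac_class2:.3f}")
print(f"pooled:             {frac_pooled:.3f}")
```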
doi:10.1186/1471-2148-6-14
PMCID: PMC1457003  PMID: 16472383
25.  Bioinformatical assay of human gene morbidity 
Nucleic Acids Research  2004;32(5):1731-1737.
Only a fraction of eukaryotic genes affect the phenotype drastically. We compared 18 parameters in 1273 human morbid genes, known to cause diseases, and in the remaining 16 580 unambiguous human genes. Morbid genes evolve more slowly, have wider phylogenetic distributions, are more similar to essential genes of Drosophila melanogaster, code for longer proteins containing more alanine and glycine and less histidine, lysine and methionine, possess larger numbers of longer introns with more accurate splicing signals and have higher and broader expressions. These differences make it possible to classify as non-morbid 34% of human genes with unknown morbidity, when only 5% of known morbid genes are incorrectly classified as non-morbid. This classification can help to identify disease-causing genes among multiple candidates.
doi:10.1093/nar/gkh330
PMCID: PMC390328  PMID: 15020709
