Related Articles

1.  The evolution of human influenza A viruses from 1999 to 2006: A complete genome study 
Virology Journal  2008;5:40.
Knowledge about the complete genome constellation of seasonal influenza A viruses from different countries is valuable for monitoring and understanding the evolution and migration of strains. Few complete genome sequences of influenza A viruses from Europe are publicly available at present, and there have been few longitudinal genome studies of human influenza A viruses. We studied the evolution of circulating human H3N2, H1N1 and H1N2 influenza A viruses from 1999 to 2006. We analysed 234 Danish human influenza A viruses and characterised 24 complete genomes.
H3N2 was the prevalent strain in Denmark during the study period, but H1N1 dominated the 2000–2001 season. H1N2 viruses were first observed in Denmark in 2002–2003. After years of little genetic change in the H1N1 viruses, the 2005–2006 season presented H1N1 of greater variability than before. This indicates that H1N1 viruses are evolving and that H1N1 is likely soon to be the prevalent strain again. Generally, the influenza A haemagglutinin (HA) of H3N2 viruses formed seasonal phylogenetic clusters. Different lineages co-circulating within the same season were also observed. The evolution has been stochastic, influenced by small "jumps" in genetic distance rather than constant drift, especially with the introduction of the Fujian-like viruses in 2002–2003. Periods of evolutionary stasis were also observed, which might indicate well-fit viruses. The evolution of H3N2 viruses has also been influenced by gene reassortments between lineages from different seasons. None of the influenza genes were influenced by strong positive selection pressure. Antigenic site B in H3N2 HA was the preferred site for genetic change during the study period, probably because site A has been masked by glycosylations. Substitutions at CTL-epitopes in the genes coding for the neuraminidase (NA), polymerase acidic protein (PA), matrix protein 1 (M1), non-structural protein 1 (NS1) and especially the nucleoprotein (NP) were observed. The N-linked glycosylation pattern varied during the study period, and the H3N2 isolates from 2004 to 2006 were highly glycosylated with ten predicted sequons in HA, the highest number of glycosylation sites observed in this study period.
The present study is the first to our knowledge to characterise the evolution of complete genomes of influenza A H3N2, H1N1 and H1N2 isolates from Europe over a period of seven years, from 1999 to 2006. More precise knowledge about the circulating strains may have implications for predicting the following season's strains and thereby better matching the vaccine composition.
PMCID: PMC2311284  PMID: 18325125
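The abstract above reports counts of predicted N-linked glycosylation sequons in HA. As a rough illustration of what such a prediction involves, the sketch below scans a protein sequence for the commonly used N-X-S/T motif (X being any residue except proline). The abstract does not say which prediction tool or exact motif definition the authors used, so the motif, the function name `find_sequons`, and the toy peptide are all assumptions made for illustration.

```python
import re

# Assumed motif: Asn, any residue except Pro, then Ser or Thr.
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 0-based start positions of candidate N-X-S/T sequons."""
    return [match.start() for match in SEQUON.finditer(protein.upper())]


# Hypothetical toy peptide, not a real HA sequence.
toy_peptide = "MNGSANPTNNTS"
sequon_positions = find_sequons(toy_peptide)
print(sequon_positions)  # [1, 8, 9]; the N at position 5 is skipped (N-P-T)
```

A motif match only marks a potential site; whether a sequon is actually glycosylated depends on structural context that a pattern scan cannot capture.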
2.  Intrasubtype Reassortments Cause Adaptive Amino Acid Replacements in H3N2 Influenza Genes 
PLoS Genetics  2014;10(1):e1004037.
Reassortments and point mutations are two major contributors to diversity of Influenza A virus; however, the link between these two processes is unclear. It has been suggested that reassortments provoke a temporary increase in the rate of amino acid changes as the viral proteins adapt to a new genetic environment, but this phenomenon has not been studied systematically. Here, we use a phylogenetic approach to infer the reassortment events between the 8 segments of influenza A H3N2 virus since its emergence in humans in 1968. We then study the amino acid replacements that occurred in genes encoded in each segment subsequent to reassortments. In five out of eight genes (NA, M1, HA, PB1 and NS1), the reassortment events led to a transient increase in the rate of amino acid replacements on the descendant phylogenetic branches. In NA and HA, the replacements following reassortments were enriched with parallel and/or reversing replacements; in contrast, the replacements at sites responsible for differences between antigenic clusters (in HA) and at sites under positive selection (in NA) were underrepresented among them. Post-reassortment adaptive walks contribute to adaptive evolution in Influenza A: in NA, an average reassortment event causes at least 2.1 amino acid replacements in a reassorted gene, with, on average, 0.43 amino acid replacements per evolving post-reassortment lineage; and at least ∼9% of all amino acid replacements are provoked by reassortments.
Author Summary
Influenza A is a rapidly evolving virus with a genome composed of eight distinct RNA molecules called segments. This genetic structure allows formation of new combinations of segments when a cell is coinfected by multiple viral strains, in a process called reassortment. While “antigenic drift” – the process of continuous accumulation of point mutations that change the antigenic properties of the viral proteins – is mainly responsible for the seasonal flu, the heaviest pandemics were caused by the spread of novel reassortant strains and the associated radical “antigenic shift”. However, the association between these two types of processes has not been studied systematically. Here, we use the extensive available complete-genome sequencing data for the Influenza A H3N2 subtype to infer the evolutionary timings of within-subtype reassortment events, and study the patterns of point amino acid-changing replacements that followed reassortments. We find that reassortments were often rapidly followed by replacements, which possibly compensated for the loss of fitness associated with the reassortment or explored newly accessible fitness peaks. These findings may be relevant for prediction of future pandemic strains of Influenza A.
PMCID: PMC3886890  PMID: 24415946
3.  Changing Selective Pressure during Antigenic Changes in Human Influenza H3 
PLoS Pathogens  2008;4(5):e1000058.
The rapid evolution of influenza viruses presents difficulties in maintaining the optimal efficiency of vaccines. Amino acid substitutions result in antigenic drift, a process whereby antisera raised in response to one virus have reduced effectiveness against future viruses. Interestingly, while amino acid substitutions occur at a relatively constant rate, the antigenic properties of H3 move in a discontinuous, step-wise manner. It is not clear why this punctuated evolution occurs, whether this represents simply the fact that some substitutions affect these properties more than others, or if this is indicative of a changing relationship between the virus and the host. In addition, the role of changing glycosylation of the haemagglutinin in these shifts in antigenic properties is unknown. We analysed the antigenic drift of HA1 from human influenza H3 using a model of sequence change that allows for variation in selective pressure at different locations in the sequence, as well as at different parts of the phylogenetic tree. We detect significant changes in selective pressure that occur preferentially during major changes in antigenic properties. Despite the large increase in glycosylation during the past 40 years, changes in glycosylation did not correlate either with changes in antigenic properties or with significantly more rapid changes in selective pressure. The locations that undergo changes in selective pressure are largely in places undergoing adaptive evolution, in antigenic locations, and in locations or near locations undergoing substitutions that characterise the change in antigenicity of the virus. Our results suggest that the relationship of the virus to the host changes with time, with the shifts in antigenic properties representing changes in this relationship. This suggests that the virus and host immune system are evolving different methods to counter each other. 
While we are able to characterise the rapid increase in glycosylation of the haemagglutinin over time in human influenza H3, an increase not present in influenza in birds, this increase seems unrelated to the observed changes in antigenic properties.
Author Summary
H3N2-type influenza is responsible for widespread disease and significant mortality. The virus evolves rapidly, changing its antigenic properties, allowing it to escape clearance by the immune response as well as complicating the maintenance of vaccine effectiveness. Part of this evolution has been the rapid increase in glycosylation, an increase not observed either in H9 evolution in birds or in H1 evolution in humans. It has been observed that the antigenic properties change in a punctuated, discontinuous manner. This could be either because some mutations are more significant than others, or it could mean that the antigenic changes correspond to adjustments in the antagonistic relationship between virus and host. By studying the sequence evolution of the H3 haemagglutinin, we can demonstrate that the selective pressure acting on the virus protein changes with time, and that these changes are especially rapid during changes in antigenic properties. This indicates that the antigenic changes correspond to modifications in the virus–host relationship. Surprisingly, neither the changes in selective pressure nor the changes in antigenic properties correspond to changes in glycosylation.
PMCID: PMC2323114  PMID: 18451985
4.  Mechanisms of GII.4 Norovirus Persistence in Human Populations  
PLoS Medicine  2008;5(2):e31.
Noroviruses are the leading cause of viral acute gastroenteritis in humans, noted for causing epidemic outbreaks in communities, the military, cruise ships, hospitals, and assisted living communities. The evolutionary mechanisms governing the persistence and emergence of new norovirus strains in human populations are unknown. Primarily organized by sequence homology into two major human genogroups defined by multiple genoclusters, the majority of norovirus outbreaks are caused by viruses from the GII.4 genocluster, which was first recognized as the major epidemic strain in the mid-1990s. Previous studies by our laboratory and others indicate that some noroviruses readily infect individuals who carry a gene encoding a functional alpha-1,2-fucosyltransferase (FUT2) and are designated “secretor-positive” to indicate that they express ABH histo-blood group antigens (HBGAs), a highly heterogeneous group of related carbohydrates on mucosal surfaces. Individuals with defects in the FUT2 gene are termed secretor-negative, do not express the appropriate HBGA necessary for docking, and are resistant to Norwalk infection. These data argue that FUT2 and other genes encoding enzymes that regulate processing of the HBGA carbohydrates function as susceptibility alleles. However, secretor-negative individuals can be infected with other norovirus strains, and reinfection with the GII.4 strains is common in human populations. In this article, we analyze molecular mechanisms governing GII.4 epidemiology, susceptibility, and persistence in human populations.
Methods and Findings
Phylogenetic analyses of the GII.4 capsid sequences suggested an epochal evolution over the last 20 y with periods of stasis followed by rapid evolution of novel epidemic strains. The epidemic strains show a linear relationship in time, whereby serial replacements emerge from the previous cluster. Five major evolutionary clusters were identified, and representative ORF2 capsid genes for each cluster were expressed as virus-like particles (VLPs). Using salivary and carbohydrate-binding assays, we showed that GII.4 VLP-carbohydrate ligand binding patterns have changed over time and include carbohydrates regulated by the human FUT2 and FUT3 pathways, suggesting that strain sensitivity to human susceptibility alleles will vary. Variation in surface-exposed residues and in residues that surround the fucose ligand interaction domain suggests that antigenic drift may promote GII.4 persistence in human populations. Evidence supporting antigenic drift was obtained by measuring the antigenic relatedness of GII.4 VLPs using murine and human sera and demonstrating strain-specific serologic and carbohydrate-binding blockade responses. These data suggest that the GII.4 noroviruses persist by altering their HBGA carbohydrate-binding targets over time, which not only allows for escape from highly penetrant host susceptibility alleles, but simultaneously allows for immune-driven selection in the receptor-binding region to facilitate escape from protective herd immunity.
Our data suggest that the surface-exposed carbohydrate ligand binding domain in the norovirus capsid is under heavy immune selection and likely evolves by antigenic drift in the face of human herd immunity. Variation in the capsid carbohydrate-binding domain is tolerated because of the large repertoire of similar, yet distinct HBGA carbohydrate receptors available on mucosal surfaces that could interface with the remodeled architecture of the capsid ligand-binding pocket. The continuing evolution of new replacement strains suggests that, as with influenza viruses, vaccines could be targeted that protect against norovirus infections, and that continued epidemiologic surveillance and reformulations of norovirus vaccines will be essential in the control of future outbreaks.
Through phylogenetic analysis of norovirus isolates, Ralph Baric and colleagues show that new epidemic strains arise as the variety of available cellular receptors permits antigenic drift in the viral capsid.
Editors' Summary
Noroviruses are the leading cause of viral gastroenteritis (stomach flu), the symptoms of which include nausea, vomiting, and diarrhea. There is no treatment for infection with these highly contagious viruses. While most people recover within a few days, the very young and old may experience severe disease. Like influenza, large outbreaks (epidemics) of norovirus infection occur periodically (often in closed communities such as cruise ships), and most people have several norovirus infections during their lifetime. Currently, 100,000–200,000 people are being infected each week in England with a new GII.4 variant. There are several reasons for this pattern of infection and reinfection. First, the immune response induced by a norovirus infection is short-lived in some people, but not all. Second, there are many different noroviruses. Based on their genomes (genetic blueprints), noroviruses belong to five “genogroups,” which are further subdivided into “genotypes.” An immune response to one norovirus provides little protection against noroviruses of other genogroups or genotypes. Third, like influenza viruses, noroviruses frequently acquire small changes in their genome. This process is called antigenic drift (antigens are the molecules on the surface of infectious agents that stimulate the production of antibodies, proteins that help the immune system recognize and deal with foreign invaders). Norovirus epidemics occur when virus variants emerge to which the human population has no immunity.
Why Was This Study Done?
It is unknown exactly how noroviruses change over time or how they persist in human populations. In addition, little is known about susceptibility to norovirus infections except that secretor-positive individuals—people who express “histoblood group antigens” (HBGAs, a heterogeneous group of sugar molecules by which noroviruses attach themselves to human cells) on the cells that line their mouths and guts—are more susceptible than secretor-negative people, who express these antigens only on red blood cells. Information of this sort is needed to devise effective intervention strategies, therapies, and vaccines to reduce the illness and economic costs associated with norovirus outbreaks. In this study, the researchers investigate the molecular mechanisms governing the emergence and persistence of epidemic norovirus strains in human populations by analyzing how GII.4 norovirus strains (the genotype usually associated with epidemics) have changed over time.
What Did the Researchers Do and Find?
The researchers analyzed the relationships among the sequences of the gene encoding the capsid protein of GII.4 norovirus strains isolated over the past 20 years. The capsid protein forms a shell around noroviruses and is involved in their binding to HBGAs and their recognition by the human immune system. The researchers found that the virus evolved in fits and starts. That is, for several years, one cluster of strains was predominant but then new epidemic strains emerged rapidly from the cluster. In all, the researchers identified five major evolutionary clusters. They then created “virus-like particles” (VLPs) using representative capsid genes from each cluster and showed that these VLPs bound to different HBGAs. Finally they measured the antigenic relatedness of the different VLPs using human blood collected during a 1988 GII.4 outbreak. Antibodies in these samples recognized the VLPs representing early GII.4 strains better than VLPs representing recent GII.4 strains. The ability of the blood samples to block the interaction of VLPs with their matching HBGAs showed a similar pattern.
What Do These Findings Mean?
These findings suggest that the part of the norovirus capsid protein that binds to sugars on host cells is under heavy immune selection and evolves over time by antigenic drift. They show that, like influenza viruses, GII.4 viruses evolve through serial changes in the capsid sequence that occur sporadically after periods of stability, probably to evade the build up of immunity within the human population. Variation in this region of the viral genome is possible because human populations express a great variety of HBGA molecules so there is always likely to be a subpopulation of people that is susceptible to the altered virus. Overall, these findings suggest that it should be possible to develop vaccines to protect against norovirus infections but, just as with influenza virus, surveillance systems will have to monitor how the virus is changing and vaccines will need to be reformulated frequently to provide effective protection against norovirus outbreaks.
Additional Information.
Please access these Web sites via the online version of this summary at
See a related PLoS Medicine Perspective article
The MedlinePlus encyclopedia has a page on viral gastroenteritis (in English and Spanish)
The US Centers for Disease Control and Prevention provides information on viral gastroenteritis (in English and Spanish) and on noroviruses
The UK National Health Service's health website (NHS Direct) provides information about noroviruses
The UK Health Protection Agency and the US Food & Drug Administration also provide information about noroviruses
PMCID: PMC2235898  PMID: 18271619
5.  Transmission of Clonal Hepatitis C Virus Genomes Reveals the Dominant but Transitory Role of CD8+ T Cells in Early Viral Evolution 
Journal of Virology  2011;85(22):11833-11845.
The RNA genome of the hepatitis C virus (HCV) diversifies rapidly during the acute phase of infection, but the selective forces that drive this process remain poorly defined. Here we examined whether Darwinian selection pressure imposed by CD8+ T cells is a dominant force driving early amino acid replacement in HCV viral populations. This question was addressed in two chimpanzees followed for 8 to 10 years after infection with a well-defined inoculum composed of a clonal genotype 1a (isolate H77C) HCV genome. Detailed characterization of CD8+ T cell responses combined with sequencing of recovered virus at frequent intervals revealed that most acute-phase nonsynonymous mutations were clustered in class I epitopes and appeared much earlier than those in the remainder of the HCV genome. Moreover, the ratio of nonsynonymous to synonymous mutations, a measure of positive selection pressure, was increased 50-fold in class I epitopes compared with the rest of the HCV genome. Finally, some mutation of the clonal H77C genome toward a genotype 1a consensus sequence considered most fit for replication was observed during the acute phase of infection, but the majority of these amino acid substitutions occurred slowly over several years of chronic infection. Together these observations indicate that during acute hepatitis C, virus evolution was driven primarily by positive selection pressure exerted by CD8+ T cells. This influence of immune pressure on viral evolution appears to subside as chronic infection is established and genetic drift becomes the dominant evolutionary force.
PMCID: PMC3209267  PMID: 21900166
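The abstract above compares the ratio of nonsynonymous to synonymous mutations inside and outside class I epitopes. The sketch below shows only the first step of such a comparison: classifying codon differences between two aligned, in-frame coding sequences as synonymous or nonsynonymous under the standard genetic code. It is not the authors' method. A proper dN/dS estimate also normalises by the numbers of synonymous and nonsynonymous sites, which this sketch omits, and the function name and toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_changes(reference: str, variant: str) -> tuple[int, int]:
    """Return (nonsynonymous, synonymous) counts of differing codons.

    Both sequences are assumed to be aligned, gap-free and in frame.
    """
    nonsynonymous = synonymous = 0
    usable_length = min(len(reference), len(variant))
    for start in range(0, usable_length - 2, 3):
        ref_codon = reference[start:start + 3]
        var_codon = variant[start:start + 3]
        if ref_codon == var_codon:
            continue
        if CODON_TABLE[ref_codon] == CODON_TABLE[var_codon]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return nonsynonymous, synonymous


# Hypothetical toy sequences: Met-Ala-Glu versus Met-Ala-Asp.
print(count_codon_changes("ATGGCTGAA", "ATGGCCGAT"))  # (1, 1)
```

Running the same count separately over epitope and non-epitope regions would give the raw inputs for a comparison of the kind the abstract describes.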
6.  Deep Sequencing Reveals Potential Antigenic Variants at Low Frequencies in Influenza A Virus-Infected Humans 
Journal of Virology  2016;90(7):3355-3365.
Influenza vaccines must be frequently reformulated to account for antigenic changes in the viral envelope protein, hemagglutinin (HA). The rapid evolution of influenza virus under immune pressure is likely enhanced by the virus's genetic diversity within a host, although antigenic change has rarely been investigated on the level of individual infected humans. We used deep sequencing to characterize the between- and within-host genetic diversity of influenza viruses in a cohort of patients that included individuals who were vaccinated and then infected in the same season. We characterized influenza HA segments from the predominant circulating influenza A subtypes during the 2012-2013 (H3N2) and 2013-2014 (pandemic H1N1; H1N1pdm) flu seasons. We found that HA consensus sequences were similar in nonvaccinated and vaccinated subjects. In both groups, purifying selection was the dominant force shaping HA genetic diversity. Interestingly, viruses from multiple individuals harbored low-frequency mutations encoding amino acid substitutions in HA antigenic sites at or near the receptor-binding domain. These mutations included two substitutions in H1N1pdm viruses, G158K and N159K, which were recently found to confer escape from virus-specific antibodies. These findings raise the possibility that influenza antigenic diversity can be generated within individual human hosts but may not become fixed in the viral population even when they would be expected to have a strong fitness advantage. Understanding constraints on influenza antigenic evolution within individual hosts may elucidate potential future pathways of antigenic evolution at the population level.
IMPORTANCE Influenza vaccines must be frequently reformulated due to the virus's rapid evolution rate. We know that influenza viruses exist within each infected host as a “swarm” of genetically distinct viruses, but the role of this within-host diversity in the antigenic evolution of influenza has been unclear. We characterized here the genetic and potential antigenic diversity of influenza viruses infecting humans, some of whom became infected despite recent vaccination. Influenza virus between- and within-host genetic diversity was not significantly different in nonvaccinated and vaccinated humans, suggesting that vaccine-induced immunity does not exert strong selective pressure on viruses replicating in individual people. We found low-frequency mutations, below the detection threshold of traditional surveillance methods, in nonvaccinated and vaccinated humans that were recently associated with antibody escape. Interestingly, these potential antigenic variants did not reach fixation in infected people, suggesting that other evolutionary factors may be hindering their emergence in individual humans.
PMCID: PMC4794676  PMID: 26739054
7.  Hotspots of Biased Nucleotide Substitutions in Human Genes 
PLoS Biology  2009;7(1):e1000026.
Genes that have experienced accelerated evolutionary rates on the human lineage during recent evolution are candidates for involvement in human-specific adaptations. To determine the forces that cause increased evolutionary rates in certain genes, we analyzed alignments of 10,238 human genes to their orthologues in chimpanzee and macaque. Using a likelihood ratio test, we identified protein-coding sequences with an accelerated rate of base substitutions along the human lineage. Exons evolving at a fast rate in humans have a significant tendency to contain clusters of AT-to-GC (weak-to-strong) biased substitutions. This pattern is also observed in noncoding sequence flanking rapidly evolving exons. Accelerated exons occur in regions with elevated male recombination rates and exhibit an excess of nonsynonymous substitutions relative to the genomic average. We next analyzed genes with significantly elevated ratios of nonsynonymous to synonymous rates of base substitution (dN/dS) along the human lineage, and those with an excess of amino acid replacement substitutions relative to human polymorphism. These genes also show evidence of clusters of weak-to-strong biased substitutions. These findings indicate that a recombination-associated process, such as biased gene conversion (BGC), is driving fixation of GC alleles in the human genome. This process can lead to accelerated evolution in coding sequences and excess amino acid replacement substitutions, thereby generating significant results for tests of positive selection.
Author Summary
Regions of the human genome that appear to evolve rapidly may have been under strong positive selection and could contain the genetic changes responsible for the uniqueness of our species. However, neutral (nonadaptive) evolutionary processes can give rise to signals that can be mistaken as signs of selection. In this article, we identify coding sequences that have undergone accelerated rates of change in humans, affecting the divergence of the proteins they encode. By analyzing patterns of molecular evolution in these genes and their distribution in the genome, we show that many protein-coding changes in the fastest-changing genes are not a result of selection operating on the genes, but instead result from biased fixation of AT-to-GC mutations. Our findings are consistent with a model of recombination-driven biased gene conversion. This leads to the provocative hypothesis that many of the genetic changes leading to human-specific characters may have been prompted by fixation of deleterious mutations.
Natural selection is commonly believed to be the main engine of functional genetic change, but a separate neutral evolutionary process linked to recombination may have contributed significantly to the divergence of human proteins.
PMCID: PMC2631073  PMID: 19175294
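The abstract above turns on the distinction between weak (A, T) and strong (G, C) bases, and on an excess of AT-to-GC substitutions. A minimal sketch of that classification follows. The function names and the list of ancestral/derived pairs are made up for illustration, and the study's actual analysis (likelihood ratio tests on three-species alignments) is not reproduced here.

```python
from collections import Counter

WEAK = {"A", "T"}    # bases forming two hydrogen bonds
STRONG = {"G", "C"}  # bases forming three hydrogen bonds


def classify_substitution(ancestral: str, derived: str) -> str:
    """Label a substitution as weak-to-strong, strong-to-weak, or other."""
    if ancestral == derived:
        raise ValueError("not a substitution: bases are identical")
    if ancestral in WEAK and derived in STRONG:
        return "W->S"
    if ancestral in STRONG and derived in WEAK:
        return "S->W"
    return "other"  # W->W or S->S, which leave GC content unchanged


def summarise(substitutions: list[tuple[str, str]]) -> Counter:
    """Count substitutions per class for (ancestral, derived) pairs."""
    return Counter(classify_substitution(a, d) for a, d in substitutions)


# Hypothetical substitutions inferred along a lineage.
example = [("A", "G"), ("T", "C"), ("G", "A"), ("A", "T")]
print(summarise(example))
```

An excess of the `W->S` class relative to `S->W` in a region is the kind of signal the abstract attributes to biased gene conversion rather than selection.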
8.  Simultaneous Positive and Purifying Selection on Overlapping Reading Frames of the tat and vpr Genes of Simian Immunodeficiency Virus 
Journal of Virology  2001;75(17):7966-7972.
Tat-specific cytotoxic T cells have previously been shown to exert positive Darwinian selection favoring amino acid replacements of an epitope of simian immunodeficiency virus (SIV). The region of the tat gene encoding this epitope falls within a region of overlap between the tat and vpr reading frames, and nonsynonymous nucleotide substitutions in the tat reading frame were found to occur disproportionately in such a way as to cause synonymous changes in the vpr reading frame. Comparison of published complete SIV genomes showed Tat to be the least conserved at the amino acid level of nine proteins encoded by the virus, while Vpr was one of the most conserved. Numerous parallel amino acid changes occurred within the Tat epitope independently in different monkeys, and purifying selection on the vpr reading frame, by limiting acceptable nonsynonymous substitutions in the tat reading frame, evidently has enhanced the probability of parallel evolution.
PMCID: PMC115040  PMID: 11483741
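The abstract above describes substitutions that change an amino acid in one reading frame while remaining silent in an overlapping frame. The sketch below illustrates the idea on a toy sequence, using the standard genetic code and two frames offset by one nucleotide. The sequence, the frame offsets and the function name are hypothetical and are not taken from the SIV tat/vpr overlap.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def effect_in_frame(seq: str, pos: int, new_base: str, frame: int) -> str:
    """Classify a point substitution at 0-based `pos` in a reading frame.

    `frame` is the 0-based offset of the frame's first codon in `seq`.
    """
    codon_start = pos - ((pos - frame) % 3)
    if codon_start < 0 or codon_start + 3 > len(seq):
        return "outside frame"
    mutated = seq[:pos] + new_base + seq[pos + 1:]
    old_aa = CODON_TABLE[seq[codon_start:codon_start + 3]]
    new_aa = CODON_TABLE[mutated[codon_start:codon_start + 3]]
    return "synonymous" if old_aa == new_aa else "nonsynonymous"


# Toy sequence: frame 0 reads GCT GAA, frame 1 reads CTG (then AA).
toy = "GCTGAA"
print(effect_in_frame(toy, 2, "C", frame=0))  # synonymous (GCT -> GCC, Ala)
print(effect_in_frame(toy, 2, "C", frame=1))  # nonsynonymous (CTG -> CCG)
```

The same nucleotide change is silent in one frame and amino acid-changing in the other, which is why purifying selection on one gene can restrict which replacements are tolerated in the overlapping gene.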
9.  Genome-Wide Influence of Indel Substitutions on Evolution of Bacteria of the PVC Superphylum, Revealed Using a Novel Computational Method 
Whole-genome scans for positive Darwinian selection are widely used to detect evolution of genome novelty. Most approaches are based on evaluation of the nonsynonymous to synonymous substitution rate ratio across evolutionary lineages. These methods are sensitive to saturation of synonymous sites and thus cannot be used to study evolution of distantly related organisms. In contrast, indels occur less frequently than amino acid replacements, accumulate more slowly, and can be employed to characterize evolution of diverged organisms. As indels are also subject to the forces of natural selection, they can generate functional changes through positive selection. Here, we present a new computational approach to detect selective constraints on indel substitutions at the whole-genome level for distantly related organisms. Our method is based on ancestral sequence reconstruction, takes into account the varying susceptibility of different types of secondary structure to indels, and according to simulation studies is conservative. We applied this newly developed framework to characterize the evolution of organisms of the Planctomycetes, Verrucomicrobia, Chlamydiae (PVC) bacterial superphylum. The superphylum contains organisms with unique cell biology, physiology, and diverse lifestyles. It includes bacteria with simple cell organization and more complex eukaryote-like compartmentalization. Lifestyles range from free-living organisms to obligate pathogens. In this study, we conducted a whole-genome level analysis of indel substitutions specific to evolutionary lineages of the PVC superphylum and found that indels evolved under positive selection on up to 12% of gene tree branches. We also analyzed possible functional consequences for several case studies of predicted indel events.
PMCID: PMC3000692  PMID: 21048002
selection; indel substitutions; PVC superphylum
10.  Superiority of a mechanistic codon substitution model even for protein sequences in Phylogenetic analysis 
Nucleotide and amino acid substitution tendencies are characteristic of each species, organelle, and protein family. Hence, various empirical amino acid substitution rate matrices have needed to be estimated for phylogenetic analysis: JTT, WAG, and LG for nuclear proteins, mtREV for mitochondrial proteins, cpREV10 and cpREV64 for chloroplast-encoded proteins, and FLU for influenza proteins. On the other hand, in a mechanistic codon substitution model, in which each codon substitution rate is proportional to the product of a codon mutation rate and the ratio of fixation depending on the type of amino acid replacement, mutation rates and the strength of selective constraint on amino acids can be tailored to each protein family with additional 11 parameters. As a result, in the evolutionary analysis of codon sequences it outperforms codon substitution models equivalent to empirical amino acid substitution matrices. Is it superior even for amino acid sequences, among which synonymous substitutions cannot be identified?
Nucleotide mutations are assumed to occur independently of codon positions but multiple nucleotide changes in infinitesimal time are allowed. Selective constraints on the respective types of amino acid replacements are tailored to each gene with a linear function of a given estimate of selective constraints, which were estimated by maximizing the likelihood of an empirical amino acid or codon substitution frequency matrix, each of JTT, WAG, LG, and KHG. It is shown that the mechanistic codon substitution model with the assumption of equal codon usage yields better values of Akaike and Bayesian information criteria for all three phylogenetic trees of mitochondrial, chloroplast, and influenza-A hemagglutinin proteins than the empirical amino acid substitution models with mtREV, cpREV64, and FLU, which were designed specifically for those protein families, respectively. The variation of selective constraint across sites fits the datasets significantly better than variable codon mutation rates, confirming that substitution rate variations across sites detected by amino acid substitution models are caused primarily by the variation of selective constraint against amino acid substitutions rather than the variation of codon mutation rate.
The mechanistic codon substitution model is superior to amino acid substitution models even in the evolutionary analysis of protein sequences.
PMCID: PMC4225520  PMID: 24256155
Amino acid substitution model; Empirical amino acid substitution rate matrix; Mechanistic codon substitution model; Structural constraints; Functional constraints; Selective constraints; Variable selective constraint across sites; Variable mutation rate across sites; multiple nucleotide change
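The model comparison summarized in this abstract rests on the Akaike and Bayesian information criteria. As a minimal sketch, assuming hypothetical log-likelihoods, parameter counts, and alignment length (none of these numbers come from the paper, and the model names are illustrative labels only), the two criteria could be computed as follows; lower values indicate the preferred model.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_sites) - 2 * log_likelihood


# Hypothetical fits; these values are assumptions made for illustration.
models = {
    "empirical_aa": {"lnL": -10500.0, "k": 1},
    "mechanistic_codon": {"lnL": -10400.0, "k": 12},
}
n_sites = 500  # assumed alignment length

scores = {
    name: (aic(m["lnL"], m["k"]), bic(m["lnL"], m["k"], n_sites))
    for name, m in models.items()
}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
```

BIC penalizes each extra parameter by ln n rather than 2, so a parameter-rich model has to improve the likelihood by more to be preferred under BIC than under AIC.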
11.  Rapid Evolution of Pandemic Noroviruses of the GII.4 Lineage 
PLoS Pathogens  2010;6(3):e1000831.
Over the last fifteen years there have been five pandemics of norovirus (NoV) associated gastroenteritis, and the period of stasis between each pandemic has been progressively shortening. NoV is classified into five genogroups, which can be further classified into 25 or more different human NoV genotypes; however, only one, genogroup II genotype 4 (GII.4), is associated with pandemics. Hence, GII.4 viruses have both a higher frequency in the host population and greater epidemiological fitness. The aim of this study was to investigate if the accuracy and rate of replication are contributing to the increased epidemiological fitness of the GII.4 strains. The replication and mutation rates were determined using in vitro RNA dependent RNA polymerase (RdRp) assays, and rates of evolution were determined by bioinformatics. GII.4 strains were compared to the second most reported genotype, recombinant GII.b/GII.3, the rarely detected GII.3 and GII.7 and as a control, hepatitis C virus (HCV). The predominant GII.4 strains had a higher mutation rate and rate of evolution compared to the less frequently detected GII.b, GII.3 and GII.7 strains. Furthermore, the GII.4 lineage had on average a 1.7-fold higher rate of evolution within the capsid sequence and a greater number of non-synonymous changes compared to other NoVs, supporting the theory that it is undergoing antigenic drift at a faster rate. Interestingly, the non-synonymous mutations for all three NoV genotypes were localised to common structural residues in the capsid, indicating that these sites are likely to be under immune selection. This study supports the hypothesis that the ability of the virus to generate genetic diversity is vital for viral fitness.
Author Summary
Since 1995, norovirus has caused five pandemics of acute gastroenteritis. These pandemics spread across the globe within a few months, causing great economic burden on society due to medical and social expenses. Norovirus, like influenza virus, has over 40 genotypes circulating within the population at the same time. However, it is only a single genotype, known as genogroup II genotype 4 (GII.4), that causes mass outbreaks and pandemics. Very little research has been conducted to determine why GII.4 viruses can cause pandemics. Consequently, we compared the evolution properties of several pandemic GII.4 strains to non-pandemic strains and found that the GII.4 viruses were undergoing evolution at a much higher rate than the non-pandemic norovirus strains. This phenomenon is similar to influenza virus, where an increase in antigenic drift has been associated with increased outbreaks. This discovery has important implications in understanding norovirus incidence and also the development of a vaccine and treatment for norovirus.
PMCID: PMC2847951  PMID: 20360972
12.  Selective Pressure to Increase Charge in Immunodominant Epitopes of the H3 Hemagglutinin Influenza Protein 
Journal of Molecular Evolution  2010;72(1):90-103.
The evolutionary speed and the consequent immune escape of H3N2 influenza A virus make it an interesting evolutionary system. Charged amino acid residues are often significant contributors to the free energy of binding for protein–protein interactions, including antibody–antigen binding and ligand–receptor binding. We used Markov chain theory and maximum likelihood estimation to model the evolution of the number of charged amino acids on the dominant epitope in the hemagglutinin protein of circulating H3N2 virus strains. The number of charged amino acids increased in the dominant epitope B of the H3N2 virus since introduction in humans in 1968. When epitope A became dominant in 1989, the number of charged amino acids increased in epitope A and decreased in epitope B. Interestingly, the number of charged residues in the dominant epitope of the dominant circulating strain is never fewer than that in the vaccine strain. We propose these results indicate selective pressure for charged amino acids that increase the affinity of the virus epitope for water and decrease the affinity for host antibodies. The standard PAM model of generic protein evolution is unable to capture these trends. The reduced alphabet Markov model (RAMM) we introduce captures the increased selective pressure for charged amino acids in the dominant epitope of hemagglutinin of H3N2 influenza (R² > 0.98 between 1968 and 1988). The RAMM calibrated to historical H3N2 influenza virus evolution in humans fit well to the H3N2/Wyoming virus evolution data from guinea pig animal model studies.
Electronic supplementary material
The online version of this article (doi:10.1007/s00239-010-9405-4) contains supplementary material, which is available to authorized users.
PMCID: PMC3033527  PMID: 21086120
Influenza; Virus evolution; Pepitope
13.  Deep Sequencing of Influenza A Virus from a Human Challenge Study Reveals a Selective Bottleneck and Only Limited Intrahost Genetic Diversification 
Journal of Virology  2016;90(24):11247-11258.
Knowledge of influenza virus evolution at the point of transmission and at the intrahost level remains limited, particularly for human hosts. Here, we analyze a unique viral data set of next-generation sequencing (NGS) samples generated from a human influenza challenge study wherein 17 healthy subjects were inoculated with cell- and egg-passaged virus. Nasal wash samples collected from 7 of these subjects were successfully deep sequenced. From these, we characterized changes in the subjects' viral populations during infection and identified differences between the virus in these samples and the viral stock used to inoculate the subjects. We first calculated pairwise genetic distances between the subjects' nasal wash samples, the viral stock, and the influenza virus A/Wisconsin/67/2005 (H3N2) reference strain used to generate the stock virus. These distances revealed that considerable viral evolution occurred at various points in the human challenge study. Further quantitative analyses indicated that (i) the viral stock contained genetic variants that originated and likely were selected for during the passaging process, (ii) direct intranasal inoculation with the viral stock resulted in a selective bottleneck that reduced nonsynonymous genetic diversity in the viral hemagglutinin and nucleoprotein, and (iii) intrahost viral evolution continued over the course of infection. These intrahost evolutionary dynamics were dominated by purifying selection. Our findings indicate that rapid viral evolution can occur during acute influenza infection in otherwise healthy human hosts when the founding population size of the virus is large, as is the case with direct intranasal inoculation.
IMPORTANCE Influenza viruses circulating among humans are known to rapidly evolve over time. However, little is known about how influenza virus evolves across single transmission events and over the course of a single infection. To address these issues, we analyze influenza virus sequences from a human challenge experiment that initiated infection with a cell- and egg-passaged viral stock, which appeared to have adapted during its preparation. We find that the subjects' viral populations differ genetically from the viral stock, with subjects' viral populations having lower representation of the amino-acid-changing variants that arose during viral preparation. We also find that most of the viral evolution occurring over single infections is characterized by further decreases in the frequencies of these amino-acid-changing variants and that only limited intrahost genetic diversification through new mutations is apparent. Our findings indicate that influenza virus populations can undergo rapid genetic changes during acute human infections.
PMCID: PMC5126380  PMID: 27707932
14.  Clonal Interference in the Evolution of Influenza 
Genetics  2012;192(2):671-682.
The seasonal influenza A virus undergoes rapid evolution to escape human immune response. Adaptive changes occur primarily in antigenic epitopes, the antibody-binding domains of the viral hemagglutinin. This process involves recurrent selective sweeps, in which clusters of simultaneous nucleotide fixations in the hemagglutinin coding sequence are observed about every 4 years. Here, we show that influenza A (H3N2) evolves by strong clonal interference. This mode of evolution is a red queen race between viral strains with different beneficial mutations. Clonal interference explains and quantifies the observed sweep pattern: we find an average of at least one strongly beneficial amino acid substitution per year, and a given selective sweep has three to four driving mutations on average. The inference of selection and clonal interference is based on frequency time series of single-nucleotide polymorphisms, which are obtained from a sample of influenza genome sequences over 39 years. Our results imply that mode and speed of influenza evolution are governed not only by positive selection within, but also by background selection outside antigenic epitopes: immune adaptation and conservation of other viral functions interfere with each other. Hence, adapting viral proteins are predicted to be particularly brittle. We conclude that a quantitative understanding of influenza’s evolutionary and epidemiological dynamics must be based on all genomic domains and functions coupled by clonal interference.
PMCID: PMC3454888  PMID: 22851649
adaptive evolution; inference of selection; mutation rate; seasonal influenza
15.  Population genetic processes affecting the mode of selective sweeps and effective population size in influenza virus H3N2 
Human influenza virus A/H3N2 undergoes rapid adaptive evolution in response to host immunity. Positively selected amino acid substitutions have been detected mainly in the hemagglutinin (HA) segment. The genealogical tree of HA sequences sampled over several decades comprises a long trunk and short side branches, which indicates small effective population size. Various studies have reproduced this unique genealogical structure by modeling recurrent positive selection. However, it has not been clearly demonstrated whether recurrent selective sweeps alone can explain the limited level of genetic diversity observed in the HA of H3N2. In addition, the variation-reducing impacts of other evolutionary processes – background selection and complex demography – relative to that of positive selection have never been explicitly evaluated.
In this paper, using computer simulation of a viral population evolving under recurrent selective sweeps we demonstrate that positive selection alone, if it occurs at a rate estimated by previous studies, cannot lead to such a small effective population size. Genetic hitchhiking fails to completely wipe out pre-existing variation because soft, rather than hard, selective sweeps prevail under realistic parameters of mutation rate and population size. We find that antigenic-cluster-transition substitutions in HA occur as hard sweeps. This indicates that the effective population size under which those mutations arise must be much smaller than the actual population size due to other evolutionary forces before selective sweeps further reduce it. We thus examine the effects of background selection and metapopulation dynamics in reducing the effective population size, using parameter values that reproduce other aspects of molecular evolution in H3N2. When either process is incorporated in recurrent selective sweep simulation, selective sweeps are mostly hard and the observed level of synonymous diversity is obtained with large census population size.
Background selection and metapopulation dynamics have greater variation reducing power than recurrent positive selection under realistic parameters in H3N2. Therefore, these evolutionary processes are likely to play crucial roles in reducing the effective population size of H3N2 viruses and thus explaining the characteristic shape of H3N2 genealogy.
Electronic supplementary material
The online version of this article (doi:10.1186/s12862-016-0727-8) contains supplementary material, which is available to authorized users.
PMCID: PMC4972962  PMID: 27487769
Influenza virus; Positive selection; Background selection; Metapopulation; Soft sweep
16.  Frequent Toggling between Alternative Amino Acids Is Driven by Selection in HIV-1 
PLoS Pathogens  2008;4(12):e1000242.
Host immune responses against infectious pathogens exert strong selective pressures favouring the emergence of escape mutations that prevent immune recognition. Escape mutations within or flanking functionally conserved epitopes can occur at a significant cost to the pathogen in terms of its ability to replicate effectively. Such mutations come under selective pressure to revert to the wild type in hosts that do not mount an immune response against the epitope. Amino acid positions exhibiting this pattern of escape and reversion are of interest because they tend to coincide with immune responses that control pathogen replication effectively. We have used a probabilistic model of protein coding sequence evolution to detect sites in HIV-1 exhibiting a pattern of rapid escape and reversion. Our model is designed to detect sites that toggle between a wild type amino acid, which is susceptible to a specific immune response, and amino acids with lower replicative fitness that evade immune recognition. Through simulation, we show that this model has significantly greater power to detect selection involving immune escape and reversion than standard models of diversifying selection, which are sensitive to an overall increased rate of non-synonymous substitution. Applied to alignments of HIV-1 protein coding sequences, the model of immune escape and reversion detects a significantly greater number of adaptively evolving sites in env and nef. In all genes tested, the model provides a significantly better description of adaptively evolving sites than standard models of diversifying selection. Several of the sites detected are corroborated by association between Human Leukocyte Antigen (HLA) and viral sequence polymorphisms. Overall, there is evidence for a large number of sites in HIV-1 evolving under strong selective pressure, but exhibiting low sequence diversity. 
A phylogenetic model designed to detect rapid toggling between wild type and escape amino acids identifies a larger number of adaptively evolving sites in HIV-1, and can in some cases correctly identify the amino acid that is susceptible to the immune response.
Author Summary
Viruses, such as HIV, are able to evade host immune responses through escape mutations, yet sometimes they do so at a cost. This cost is the reduction in the ability of the virus to replicate, and thus selective pressure exists for a virus to revert to its original state in the absence of the host immune response that caused the initial escape mutation. This pattern of escape and reversion typically occurs when viruses are transmitted between individuals with different immune responses. We develop a phylogenetic model of immune escape and reversion and provide evidence that it outperforms existing models for the detection of selective pressure associated with host immune responses. Finally, we demonstrate that amino acid toggling is a pervasive process in HIV-1 evolution, such that many of the positions in the virus that evolve rapidly, under the influence of positive Darwinian selection, nonetheless display quite low sequence diversity. This highlights the limitations of HIV-1 evolution, and sites such as these are potentially good targets for HIV-1 vaccines.
PMCID: PMC2592544  PMID: 19096508
17.  Phylogenetic Patterns of Human Coxsackievirus B5 Arise from Population Dynamics between Two Genogroups and Reveal Evolutionary Factors of Molecular Adaptation and Transmission 
Journal of Virology  2013;87(22):12249-12259.
The aim of this study was to gain insights into the tempo and mode of the evolutionary processes that sustain genetic diversity in coxsackievirus B5 (CVB5) and into the interplay with virus transmission. We estimated phylodynamic patterns with a large sample of virus strains collected in Europe by Bayesian statistical methods, reconstructed the ancestral states of genealogical nodes, and tested for selection. The genealogies estimated with the structural one-dimensional gene encoding the VP1 protein and nonstructural 3CD locus allowed the precise description of lineages over time and cocirculating virus populations within the two CVB5 clades, genogroups A and B. Strong negative selection shaped the evolution of both loci, but compelling phylogenetic data suggested that immune selection pressure resulted in the emergence of the two genogroups with opposed evolutionary pathways. The genogroups also differed in the temporal occurrence of the amino acid changes. The virus strains of genogroup A were characterized by sequential acquisition of nonsynonymous changes in residues exposed at the virus 5-fold axis. The genogroup B viruses were marked by selection of three changes in a different domain (VP1 C terminus) during its early emergence. These external changes resulted in a selective sweep, which was followed by an evolutionary stasis that is still ongoing after 50 years. The inferred population history of CVB5 showed an alternation of the prevailing genogroup during meningitis epidemics across Europe and is interpreted to be a consequence of partial cross-immunity.
PMCID: PMC3807918  PMID: 24006446
18.  Evolution of the capsid protein genes of foot-and-mouth disease virus: antigenic variation without accumulation of amino acid substitutions over six decades. 
Journal of Virology  1992;66(6):3557-3565.
The genetic diversification of foot-and-mouth disease virus (FMDV) of serotype C over a 6-decade period was studied by comparing nucleotide sequences of the capsid protein-coding regions of viruses isolated in Europe, South America, and The Philippines. Phylogenetic trees were derived for VP1 and P1 (VP1, VP2, VP3, and VP4) RNAs by using the least-squares method. Confidence intervals of the derived phylogeny (significance levels of nodes and standard deviations of branch lengths) were placed by application of the bootstrap resampling method. These procedures defined six highly significant major evolutionary lineages and a complex network of sublines for the isolates from South America. In contrast, European isolates are considerably more homogeneous, probably because of the vaccine origin of several of them. The phylogenetic analysis suggests that FMDV CGC Ger/26 (one of the earliest FMDV isolates available) belonged to an evolutionary line which is now apparently extinct. Attempts to date the origin (ancestor) of the FMDVs analyzed met with considerable uncertainty, mainly owing to the stasis noted in European viruses. Remarkably, the evolution of the capsid genes of FMDV was essentially associated with linear accumulation of silent mutations but continuous accumulation of amino acid substitutions was not observed. Thus, the antigenic variation attained by FMDV type C over 6 decades was due to fluctuations among limited combinations of amino acid residues without net accumulation of amino acid replacements over time.
PMCID: PMC241137  PMID: 1316467
19.  Gnarled-Trunk Evolutionary Model of Influenza A Virus Hemagglutinin 
PLoS ONE  2011;6(10):e25953.
Human influenza A viruses undergo antigenic changes with gradual accumulation of amino acid substitutions on the hemagglutinin (HA) molecule. A strong antigenic mismatch between vaccine and epidemic strains often requires the replacement of influenza vaccines worldwide. To establish a practical model enabling us to predict the future direction of the influenza virus evolution, relative distances of amino acid sequences among past epidemic strains were analyzed by multidimensional scaling (MDS). We found that human influenza viruses have evolved along a gnarled evolutionary pathway with an approximately constant curvature in the MDS-constructed 3D space. The gnarled pathway indicated that evolution on the trunk favored multiple substitutions at the same amino acid positions on HA. The constant curvature was reasonably explained by assuming that the rate of amino acid substitutions varied from one position to another according to a gamma distribution. Furthermore, we utilized the estimated parameters of the gamma distribution to predict the amino acid substitutions on HA in subsequent years. Retrospective prediction tests for 12 years from 1997 to 2009 showed that 70% of actual amino acid substitutions were correctly predicted, and that 45% of predicted amino acid substitutions have been actually observed. Although it remains unsolved how to predict the exact timing of antigenic changes, the present results suggest that our model may have the potential to recognize emerging epidemic strains.
PMCID: PMC3189952  PMID: 22028800
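The two percentages reported for the retrospective test correspond to what is often called recall (the share of actual substitutions that were predicted) and precision (the share of predicted substitutions that were actually observed). A minimal sketch of that bookkeeping, assuming hypothetical sets of HA positions that are not taken from the paper:

```python
def retrospective_scores(predicted, observed):
    """Return (recall, precision) for predicted vs. observed substitutions."""
    predicted, observed = set(predicted), set(observed)
    hits = predicted & observed
    recall = len(hits) / len(observed) if observed else 0.0
    precision = len(hits) / len(predicted) if predicted else 0.0
    return recall, precision


# Hypothetical HA positions, chosen only to illustrate the calculation.
predicted_sites = {121, 133, 144, 145, 156, 158, 189, 193, 226, 262}
observed_sites = {50, 62, 144, 145, 156, 158, 189}

recall, precision = retrospective_scores(predicted_sites, observed_sites)
```

In the abstract's terms, recall plays the role of the 70% figure and precision the role of the 45% figure.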
20.  Theme and Variations in the Evolutionary Pathways to Virulence of an RNA Plant Virus Species 
PLoS Pathogens  2007;3(11):e180.
The diversity of a highly variable RNA plant virus was considered to determine the range of virulence substitutions, the evolutionary pathways to virulence, and whether intraspecific diversity modulates virulence pathways and propensity. In all, 114 isolates representative of the genetic and geographic diversity of Rice yellow mottle virus (RYMV) in Africa were inoculated to several cultivars with eIF(iso)4G-mediated Rymv1-2 resistance. Altogether, 41 virulent variants generated from ten wild isolates were analyzed. Nonconservative amino acid replacements at five positions located within a stretch of 15 codons in the central region of the 79-aa-long protein VPg were associated with virulence. Virulence substitutions were fixed predominantly at codon 48 in most strains, whatever the host genetic background or the experimental conditions. There were one major and two isolate-specific mutational pathways conferring virulence at codon 48. In the prevalent mutational pathway I, arginine (AGA) was successively displaced by glycine (GGA) and glutamic acid (GAA). Substitutions in the other virulence codons were displaced when E48 was fixed. In the isolate-specific mutational pathway II, isoleucine (ATA) emerged and often later coexisted with valine (GTA). In mutational pathway III, arginine, with the specific S2/S3 strain codon usage AGG, was displaced by tryptophan (TGG). Mutational pathway I never arose in the widely spread West African S2/S3 strain because G48 was not infectious in the S2/S3 genetic context. Strain S2/S3 least frequently overcame resistance, whereas two geographically localized variants of the strain S4 had a high propensity to virulence. Codons 49 and 26 of the VPg, under diversifying selection, are candidate positions in modulating the genetic barriers to virulence. The theme and variations in the evolutionary pathways to virulence of RYMV illustrate the extent of parallel evolution within a highly variable RNA plant virus species.
Author Summary
Parallel changes in independently evolving lineages are important, but their contribution to pathogen evolution has not been assessed at the species level. We investigated the extent of phenotypic and genotypic parallel evolution in a highly variable RNA plant virus species, Rice yellow mottle virus (RYMV). Isolates representative of the genetic and geographic diversity of RYMV in Africa were inoculated to several rice cultivars with eIF(iso)4G-mediated Rymv1-2 resistance. The theme and variations in the evolutionary pathways to gain virulence found in the VPg of RYMV illustrate the frequency of parallel evolution. The repeated occurrence of the R48E substitution in the VPg of most strains, whatever the Rymv1-2 background and plant growth conditions, showed the specificity of parallel evolution that operated through the same pathway, locus, and mutation. The frequency and specificity of parallel mutations indicate, respectively, that RYMV is able to rapidly explore the adaptive landscape, fixing favorable mutations to virulence, and that there are a limited number of pathways across the adaptive landscape. Our results provide insights into the ways an RNA virus species explores the adaptive landscape and into the constraints restricting the number of mutational pathways.
PMCID: PMC2094307  PMID: 18039030
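The codon changes quoted in this abstract can be checked against the standard genetic code. The sketch below is an illustration rather than the authors' method: it translates each codon and counts the nucleotide differences between successive codons in a pathway. The helper names `n_differences` and `describe_pathway` are hypothetical, and only the codons named in the abstract are used.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def n_differences(codon1, codon2):
    """Count nucleotide positions at which two codons differ."""
    return sum(x != y for x, y in zip(codon1, codon2))


def describe_pathway(codons):
    """Return (from_aa, to_aa, nucleotide_changes) for each successive step."""
    return [
        (CODON_TABLE[prev], CODON_TABLE[nxt], n_differences(prev, nxt))
        for prev, nxt in zip(codons, codons[1:])
    ]


pathway_i = describe_pathway(["AGA", "GGA", "GAA"])  # Arg, Gly, Glu
pathway_ii = describe_pathway(["ATA", "GTA"])        # Ile, Val
pathway_iii = describe_pathway(["AGG", "TGG"])       # Arg, Trp
```

Each step in the three pathways differs by a single nucleotide, whereas a direct change from AGA to GAA would require two, which is consistent with the abstract's description of glycine as an intermediate in pathway I.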
21.  Evolutionary Dynamics of Variant Genomes of Human Papillomavirus Types 18, 45, and 97
Journal of Virology  2008;83(3):1443-1455.
Human papillomavirus type 18 (HPV18) and HPV45 account for approximately 20% of all cervix cancers. We show that HPV18, HPV45, and the recently discovered HPV97 comprise a clade sharing a most recent common ancestor within HPV α7 species. Variant lineages of these HPV types were classified by sequence analysis of the upstream regulatory region/E6 region among cervical samples from a population-based study in Costa Rica, and 27 representative genomes from each major variant lineage were sequenced. Nucleotide variation within HPV18 and HPV45 was 3.82% and 2.39%, respectively, and amino acid variation was 4.73% and 2.87%, respectively. Only 18 nucleotide variations, of which 10 were nonsynonymous, were identified among three HPV97 genomes. Full-genome comparisons revealed maximal diversity between HPV18 African and non-African variants (2.6% dissimilarity), whereas HPV18 Asian-American [E1 (AA)] and European (E2) variants were closely related (less than 0.5% dissimilarity); HPV45 genomes had a maximal difference of 1.6% nucleotides. Using a Bayesian Markov chain Monte Carlo (MCMC) method, the divergence times of HPV18, -45, and -97 from their most recent common ancestors indicated that HPV18 diverged approximately 7.7 million years (Myr) ago, whereas HPV45 and HPV97 split off around 5.7 Myr ago, in a period encompassing the divergence of the great ape species. Variants within the HPV18/45/97 lineages were estimated to have diverged from their common ancestors in the genus Homo within the last 1 Myr (<0.7 Myr). To investigate the molecular basis of HPV18, HPV45, and HPV97 evolution, regression models of codon substitution were used to identify lineages and amino acid sites under selective pressure. The E5 open reading frame (ORF) of HPV18 and the E4 ORFs of HPV18, HPV45, and HPV18/45/97 had nonsynonymous/synonymous substitution rate ratios (dN/dS) over 1 indicative of positive Darwinian selection. 
The L1 ORF of HPV18 genomes had an increased proportion of nonsynonymous substitutions (4.93%; average dN/dS ratio [M3] = 0.3356) compared to HPV45 (1.86%; M3 = 0.1268) and HPV16 (2.26%; M3 = 0.1330) L1 ORFs. In contrast, HPV18 and HPV16 genomes had similar amino acid substitution rates within the E1 ORF (2.89% and 3.24%, respectively), while HPV45 E1 was highly conserved (amino acid substitution rate was 0.77%). These data provide an evolutionary history of this medically important clade of HPVs and identify an unexpected divergence of the L1 gene of HPV18 that may have clinical implications for the long-term use of an L1-virus-like particle-based prophylactic vaccine.
PMCID: PMC2620887  PMID: 19036820
22.  Potent Antibody-Mediated Neutralization and Evolution of Antigenic Escape Variants of Simian Immunodeficiency Virus Strain SIVmac239 In Vivo
Journal of Virology  2008;82(19):9739-9752.
Here, we describe the evolution of antigenic escape variants in a rhesus macaque that developed unusually high neutralizing antibody titers to SIVmac239. By 42 weeks postinfection, 50% neutralization of SIVmac239 was achieved with plasma dilutions of 1:1,000. Testing of purified immunoglobulin confirmed that the neutralizing activity was antibody mediated. Despite the potency of the neutralizing antibody response, the animal displayed a typical viral load profile and progressed to terminal AIDS with a normal time course. Viral envelope sequences from week 16 and week 42 plasma contained an excess of nonsynonymous substitutions, predominantly in V1 and V4, including individual sites with ratios of nonsynonymous to synonymous substitution rates (dN/dS) highly suggestive of strong positive selection. Recombinant viruses encoding envelope sequences isolated from these time points remained resistant to neutralization by all longitudinal plasma samples, revealing the failure of the animal to mount secondary responses to the escaped variants. Substitutions at two sites with significant dN/dS values, one in V1 and one in V4, were independently sufficient to confer nearly complete resistance to neutralization. Substitutions at three additional sites, one in V4 and two in gp41, conferred moderate to high levels of resistance when tested individually. All the amino acid changes leading to escape resulted from single nucleotide substitutions. The observation that antigenic escape resulted from individual, single amino acid replacements at sites well separated in current structural models of Env indicates that the virus can utilize multiple independent pathways to rapidly achieve similar levels of resistance.
PMCID: PMC2546989  PMID: 18667507
23.  The human progesterone receptor shows evidence of adaptive evolution associated with its ability to act as a transcription factor 
The gene encoding the progesterone receptor (PGR) acts as a transcription factor, and participates in the regulation of reproductive processes including menstruation, implantation, pregnancy maintenance, parturition, mammary development, and lactation. Unlike other mammals, primates do not exhibit progesterone withdrawal at the time of parturition. Because progesterone-mediated reproductive features vary among mammals, PGR is an attractive candidate gene for studies of adaptive evolution. Thus, we sequenced the progesterone receptor coding regions in a diverse range of species including apes, Old World monkeys, New World monkeys, prosimian primates and other mammals. Adaptive evolution occurred on the human and chimpanzee lineages as evidenced by statistically significant increases in nonsynonymous substitution rates compared to synonymous substitution rates. Positive selection was rarely observed in other lineages. In humans, amino acid replacements occurred mostly in a region of the gene that has been shown to have an inhibitory function (IF) on the ability of the progesterone receptor to act as a transcription factor. Moreover, many of the nonsynonymous substitutions in primates occurred in the N-terminus. This suggests that cofactor interaction surfaces might have been altered, resulting in altered progesterone-regulated gene transcriptional effects. Further evidence that the changes conferred an adaptive advantage comes from SNP analysis indicating only one of the IF changes is polymorphic in humans. In chimpanzees, amino acid changes occurred in both the inhibitory and transactivation domains. Positive selection provides the basis for the hypothesis that changes in structure and function of the progesterone receptor during evolution contribute to the diversity of primate reproductive biology, especially in parturition.
PMCID: PMC2713739  PMID: 18375150
24.  CodonTest: Modeling Amino Acid Substitution Preferences in Coding Sequences 
PLoS Computational Biology  2010;6(8):e1000885.
Codon models of evolution have facilitated the interpretation of selective forces operating on genomes. These models, however, assume a single rate of non-synonymous substitution irrespective of the nature of amino acids being exchanged. Recent developments have shown that models which allow for amino acid pairs to have independent rates of substitution offer improved fit over single rate models. However, these approaches have been limited by the necessity for large alignments in their estimation. An alternative approach is to assume that substitution rates between amino acid pairs can be subdivided into rate classes, dependent on the information content of the alignment. However, given the combinatorially large number of such models, an efficient model search strategy is needed. Here we develop a Genetic Algorithm (GA) method for the estimation of such models. A GA is used to assign amino acid substitution pairs to a series of rate classes, where the number of classes is estimated from the alignment. Other parameters of the phylogenetic Markov model, including substitution rates, character frequencies and branch lengths are estimated using standard maximum likelihood optimization procedures. We apply the GA to empirical alignments and show improved model fit over existing models of codon evolution. Our results suggest that current models are poor approximations of protein evolution and thus gene and organism specific multi-rate models that incorporate amino acid substitution biases are preferred. We further anticipate that the clustering of amino acid substitution rates into classes will be biologically informative, such that genes with similar functions exhibit similar clustering, and hence this clustering will be useful for the evolutionary fingerprinting of genes.
Author Summary
Evolution in protein-coding DNA sequences can be modeled at three levels: nucleotides, amino acids or codons that encode the amino acids. Codon models incorporate nucleotide and amino acid information, and allow the estimation of the rate at which amino acids are replaced (dN) versus the rate at which they are preserved (dS). The dN/dS ratio has been used in thousands of studies to detect molecular footprints of natural selection. A serious limitation of most codon models is the unrealistic assumption that all non-synonymous substitutions occur at the same rate. Indeed, amino acid models have consistently demonstrated that different residues are exchanged more or less frequently, depending on incompletely understood factors. We derive and validate a computational approach for inferring codon models which combine the power to investigate natural selection with data-driven amino acid substitution biases from alignments. The addition of amino acid properties can lead to more powerful and accurate methods for studying natural selection and the evolutionary history of protein-coding sequences. The pattern of amino acid substitutions specific to a given alignment can be used to compare and contrast the evolutionary properties of different genes, providing an evolutionary analog to protein family comparisons.
PMCID: PMC2924240  PMID: 20808876
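The genetic-algorithm search described in the CodonTest abstract above can be illustrated with a toy sketch. This is not the authors' implementation: the fitness function here is a hypothetical stand-in (within-class squared error on made-up pairwise rates) rather than the phylogenetic likelihood the abstract refers to, the number of classes is fixed instead of estimated, and every name and parameter (`run_ga`, `fitness`, `mutation_rate`, and so on) is an assumption made for illustration.

```python
import random


def fitness(assignment, rates, n_classes):
    """Toy stand-in for model fit: negative within-class squared error.

    A real codon-model search would score each assignment by likelihood.
    """
    total = 0.0
    for c in range(n_classes):
        members = [r for r, a in zip(rates, assignment) if a == c]
        if members:
            mean = sum(members) / len(members)
            total += sum((r - mean) ** 2 for r in members)
    return -total


def run_ga(rates, n_classes, pop_size=30, generations=200,
           mutation_rate=0.05, seed=0):
    """Search for an assignment of items (substitution pairs) to rate classes."""
    rng = random.Random(seed)
    n = len(rates)
    # Each individual is a list giving the class index of every item.
    population = [[rng.randrange(n_classes) for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda a: fitness(a, rates, n_classes),
                        reverse=True)
        survivors = population[: pop_size // 2]  # keep the better half
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Mutation: occasionally move an item to a random class.
            child = [rng.randrange(n_classes)
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = survivors + children
    return max(population, key=lambda a: fitness(a, rates, n_classes))


# Hypothetical per-pair rates, invented for this example only.
example_rates = [0.10, 0.12, 0.11, 1.00, 1.10, 0.95, 2.00, 2.10]
best_assignment = run_ga(example_rates, n_classes=3)
```

The point of the sketch is the search structure: candidate assignments are scored, the better half survives, and new candidates arise by crossover and mutation. In the published setting the score would come from fitting the codon model to the alignment.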
25.  More Radical Amino Acid Replacements in Primates than in Rodents: Support for the Evolutionary Role of Effective Population Size 
Gene  2009;440(1-2):50-56.
We examined the pattern of nucleotide substitution in 4933 conserved single-copy orthologous protein-coding genes of human, rhesus, mouse, and rat. Consistent with previous studies, the median ratio of the number of nonsynonymous substitutions per nonsynonymous site (dN) to the number of synonymous substitutions per synonymous site (dS) was significantly higher in the comparison between the two primates than in the comparison between the two rodents. This pattern was particularly strong in the case of genes expressed in the immune system, but also occurred in other genes, including a set of highly conserved genes involved in the regulation of transcription. Both synonymous and nonsynonymous differences occurred independently in the same codons in the primates and in the rodents to a greater extent than expected by chance, but the extent of the deviation from random expectation was much greater in the case of nonsynonymous differences. Parallel amino acid replacements occurred at the same sites in the primates and rodents far more frequently than expected by chance, but tended to involve very conservative amino acid changes. Divergent amino acid changes involved more chemically different amino acids than parallel changes, and divergent amino acid replacements between the primates were significantly more radical than those between the rodents. These results are most easily explained on the hypothesis that the evolution of these genes has been shaped largely by purifying selection, which has been less effective in primates than in rodents, presumably as a consequence of lower long-term effective population sizes in the former.
PMCID: PMC2706701  PMID: 19332110
purifying selection; nearly neutral theory; parallel evolution; slightly deleterious mutation
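The dN/dS ratio used in the abstract above can be sketched as a small calculation. The sketch assumes the counts of sites and differences are already known and applies the Jukes-Cantor correction used in the Nei-Gojobori approach; the abstract does not say which estimation method the authors used, and the function names and the counts below are hypothetical.

```python
import math


def jukes_cantor(p):
    """Correct a proportion of differing sites for multiple substitutions."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return (dN, dS, dN/dS) from counts of differences and sites."""
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0:
        raise ValueError("dS is zero; the ratio is undefined")
    return dn, ds, dn / ds


# Made-up counts for one pairwise gene comparison.
dn, ds, ratio = dn_ds(nonsyn_diffs=10, nonsyn_sites=700,
                      syn_diffs=30, syn_sites=300)
```

A ratio below 1, as with these invented counts, is the pattern expected under purifying selection; comparing the median ratio across genes between two species pairs is the kind of contrast the abstract reports.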
