A key priority in infectious disease research is to understand the ecological and evolutionary drivers of viral diseases from data on disease incidence as well as viral genetic and antigenic variation. We propose using a simulation-based, Bayesian method known as Approximate Bayesian Computation (ABC) to fit and assess phylodynamic models that simulate pathogen evolution and ecology against summaries of these data. We illustrate the versatility of the method by analyzing two spatial models describing the phylodynamics of interpandemic human influenza virus subtype A(H3N2). The first model captures antigenic drift phenomenologically with continuously waning immunity, and the second epochal evolution model describes the replacement of major, relatively long-lived antigenic clusters. Combining features of long-term surveillance data from the Netherlands with features of influenza A (H3N2) hemagglutinin gene sequences sampled in northern Europe, key phylodynamic parameters can be estimated with ABC. Goodness-of-fit analyses reveal that the irregularity in interannual incidence and H3N2's ladder-like hemagglutinin phylogeny are quantitatively only reproduced under the epochal evolution model within a spatial context. However, the concomitant incidence dynamics result in a very large reproductive number and are not consistent with empirical estimates of H3N2's population level attack rate. These results demonstrate that the interactions between the evolutionary and ecological processes impose multiple quantitative constraints on the phylodynamic trajectories of influenza A(H3N2), so that sequence and surveillance data can be used synergistically. ABC, one of several data synthesis approaches, can easily interface a broad class of phylodynamic models with various types of data but requires careful calibration of the summaries and tolerance parameters.
The infectious disease dynamics of many viral pathogens like influenza, norovirus and coronavirus are inextricably tied to their evolution. This interaction between evolutionary and ecological processes complicates our ability to understand the infectious disease behavior of rapidly evolving pathogens. Most statistical methods for the analysis of these “phylodynamics” require that the likelihood of the data can be explicitly calculated. Currently, this is not possible for many phylodynamic models, so that questions on the interaction between viral variants cannot be well-addressed within this framework. Simulation-based statistical methods circumvent likelihood calculations. Considering interpandemic human influenza A virus subtype H3N2, we here illustrate the effectiveness of these methods to fit and assess complex phylodynamic models against both sequence and surveillance data. We find that combining molecular genetic and epidemiological data is key to estimate phylodynamic parameters reliably. Moreover, the information in the available data taken together is enough to expose quantitative model inconsistencies. Methods such as ABC which can combine sequence and surveillance data appear to be well-suited to fit and assess mechanistic hypotheses on the phylodynamics of RNA viruses.
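As a rough illustration of the simulation-based idea described above, the following is a minimal sketch of ABC rejection sampling on a toy problem. It is not the authors' phylodynamic model or their calibrated algorithm: the Gaussian `simulate` function, the flat prior on `[0, 10]`, the sample-mean summary, and the tolerance of 0.1 are all assumptions chosen only to keep the example short and runnable.

```python
import random
import statistics

def simulate(theta, n, rng):
    """Toy stand-in for a simulator: n noisy observations centred on theta."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def summary(data):
    """Reduce a data set to a low-dimensional summary (here, the sample mean)."""
    return statistics.fmean(data)

def abc_rejection(observed, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary is within `tolerance` of the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(0.0, 10.0)  # flat prior (an assumption of this sketch)
        s_sim = summary(simulate(theta, len(observed), rng))
        if abs(s_sim - s_obs) <= tolerance:  # compare summaries, never a likelihood
            accepted.append(theta)
    return accepted

rng = random.Random(1)
observed = simulate(3.0, 50, rng)  # pretend data generated with theta = 3
posterior = abc_rejection(observed, 5000, 0.1, rng)
posterior_mean = statistics.fmean(posterior)
print(len(posterior), round(posterior_mean, 2))
```

The accepted draws approximate the posterior only as well as the summary and tolerance allow, which is why the abstract stresses careful calibration of both.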
Human Influenza A virus undergoes recurrent changes in the hemagglutinin (HA) surface protein, which is primarily involved in human antibody recognition. Relevant antigenic changes, which enable the virus to evade the host immune response, have been recognized to occur in parallel with multiple mutations at antigenic sites in HA. Yet the role of correlated mutations (epistasis) in driving the molecular evolution of the virus remains a challenging puzzle. Further, though circulation at a global geographic level is key for the survival of Influenza A, its role in shaping the viral phylodynamics remains largely unexplored. Here we show, through a sequence-based epidemiological model, that epistatic effects between amino acid substitutions, coupled with a reservoir that mimics worldwide circulating viruses, are key determinants that drive human Influenza A evolution. Our approach explains all the up-to-date observations characterizing the evolution of the H3N2 subtype, including phylogenetic properties, nucleotide fixation patterns, and composition of antigenic clusters.
The rapid evolution of influenza viruses presents difficulties in maintaining the optimal efficiency of vaccines. Amino acid substitutions result in antigenic drift, a process whereby antisera raised in response to one virus have reduced effectiveness against future viruses. Interestingly, while amino acid substitutions occur at a relatively constant rate, the antigenic properties of H3 move in a discontinuous, step-wise manner. It is not clear why this punctuated evolution occurs, whether this represents simply the fact that some substitutions affect these properties more than others, or if this is indicative of a changing relationship between the virus and the host. In addition, the role of changing glycosylation of the haemagglutinin in these shifts in antigenic properties is unknown. We analysed the antigenic drift of HA1 from human influenza H3 using a model of sequence change that allows for variation in selective pressure at different locations in the sequence, as well as at different parts of the phylogenetic tree. We detect significant changes in selective pressure that occur preferentially during major changes in antigenic properties. Despite the large increase in glycosylation during the past 40 years, changes in glycosylation did not correlate either with changes in antigenic properties or with significantly more rapid changes in selective pressure. The locations that undergo changes in selective pressure are largely in places undergoing adaptive evolution, in antigenic locations, and in locations or near locations undergoing substitutions that characterise the change in antigenicity of the virus. Our results suggest that the relationship of the virus to the host changes with time, with the shifts in antigenic properties representing changes in this relationship. This suggests that the virus and host immune system are evolving different methods to counter each other. 
While we are able to characterise the rapid increase in glycosylation of the haemagglutinin over time in human influenza H3, an increase not present in influenza in birds, this increase seems unrelated to the observed changes in antigenic properties.
H3N2-type influenza is responsible for widespread disease and significant mortality. The virus evolves rapidly, changing its antigenic properties, allowing it to escape clearance by the immune response as well as complicating the maintenance of vaccine effectiveness. Part of this evolution has been the rapid increase in glycosylation, an increase not observed either in H9 evolution in birds or in H1 evolution in humans. It has been observed that the antigenic properties change in a punctuated, discontinuous manner. This could be either because some mutations are more significant than others, or it could mean that the antigenic changes correspond to adjustments in the antagonistic relationship between virus and host. By studying the sequence evolution of the H3 haemagglutinin, we can demonstrate that the selective pressure acting on the virus protein changes with time, and that these changes are especially rapid during changes in antigenic properties. This indicates that the antigenic changes correspond to modifications in the virus–host relationship. Surprisingly, neither the changes in selective pressure nor the changes in antigenic properties correspond to changes in glycosylation.
Rapid antigenic evolution in the influenza A virus hemagglutinin precludes effective vaccination with existing vaccines. To understand this phenomenon, we passaged virus in mice immunized with influenza. Neutralizing antibodies selected mutants with single amino acid hemagglutinin substitutions that increased virus binding to cell surface glycan receptors. Passaging these high avidity-binding mutants in naïve mice, but not immune mice, selected for additional hemagglutinin substitutions that decreased cellular receptor binding avidity. Analyzing a panel of monoclonal antibody hemagglutinin escape mutants revealed a positive correlation between receptor binding avidity and escape from polyclonal antibodies. We propose that in response to variation in neutralizing antibody pressure between individuals, influenza A virus evolves by adjusting receptor binding avidity via amino acid substitutions throughout the hemagglutinin globular domain, many of which simultaneously alter antigenicity.
Human influenza A viruses undergo antigenic changes with gradual accumulation of amino acid substitutions on the hemagglutinin (HA) molecule. A strong antigenic mismatch between vaccine and epidemic strains often requires the replacement of influenza vaccines worldwide. To establish a practical model enabling us to predict the future direction of the influenza virus evolution, relative distances of amino acid sequences among past epidemic strains were analyzed by multidimensional scaling (MDS). We found that human influenza viruses have evolved along a gnarled evolutionary pathway with an approximately constant curvature in the MDS-constructed 3D space. The gnarled pathway indicated that evolution on the trunk favored multiple substitutions at the same amino acid positions on HA. The constant curvature was reasonably explained by assuming that the rate of amino acid substitutions varied from one position to another according to a gamma distribution. Furthermore, we utilized the estimated parameters of the gamma distribution to predict the amino acid substitutions on HA in subsequent years. Retrospective prediction tests for 12 years from 1997 to 2009 showed that 70% of actual amino acid substitutions were correctly predicted, and that 45% of predicted amino acid substitutions have been actually observed. Although it remains unsolved how to predict the exact timing of antigenic changes, the present results suggest that our model may have the potential to recognize emerging epidemic strains.
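The two retrospective figures quoted above correspond to what is usually called recall (the share of actual substitutions that were predicted) and precision (the share of predicted substitutions that were actually observed). A minimal sketch of that scoring follows; the function name `score_predictions` and the (position, amino acid) pairs are invented for illustration and are not taken from the study.

```python
def score_predictions(predicted, observed):
    """Return (recall, precision) for predicted vs. observed substitutions."""
    predicted, observed = set(predicted), set(observed)
    hits = predicted & observed  # substitutions that were both predicted and seen
    recall = len(hits) / len(observed) if observed else 0.0
    precision = len(hits) / len(predicted) if predicted else 0.0
    return recall, precision

# Hypothetical (HA position, new amino acid) pairs, made up for this example.
observed = {(145, "N"), (156, "K"), (158, "E"), (189, "K")}
predicted = {(145, "N"), (156, "K"), (158, "E"), (193, "S"), (226, "I"), (135, "T")}

recall, precision = score_predictions(predicted, observed)
print(recall, precision)  # 0.75 0.5
```

Reporting both numbers matters because a model that predicts many substitutions can reach high recall while most of its predictions never occur.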
The evolution of haemagglutinin (HA), an important influenza virus antigen, has been the subject of intensive research for more than two decades. Many characteristics of HA's sequence evolution are captured by standard Markov chain substitution models. Such models assign equal fitness to all accessible amino acids at a site. We show, however, that such models strongly underestimate the number of homoplastic amino acid substitutions during the course of HA's evolution, i.e. substitutions that repeatedly give rise to the same amino acid at a site. We develop statistics to detect individual homoplastic events and find that they preferentially occur at positively selected epitopic sites. Our results suggest that the evolution of the influenza A HA, including evolution by positive selection, is strongly affected by the long-term site-specific preferences for individual amino acids.
directional selection; dN/dS; haemagglutinin; homoplasy; influenza A
The seasonal influenza A virus undergoes rapid evolution to escape human immune response. Adaptive changes occur primarily in antigenic epitopes, the antibody-binding domains of the viral hemagglutinin. This process involves recurrent selective sweeps, in which clusters of simultaneous nucleotide fixations in the hemagglutinin coding sequence are observed about every 4 years. Here, we show that influenza A (H3N2) evolves by strong clonal interference. This mode of evolution is a red queen race between viral strains with different beneficial mutations. Clonal interference explains and quantifies the observed sweep pattern: we find an average of at least one strongly beneficial amino acid substitution per year, and a given selective sweep has three to four driving mutations on average. The inference of selection and clonal interference is based on frequency time series of single-nucleotide polymorphisms, which are obtained from a sample of influenza genome sequences over 39 years. Our results imply that mode and speed of influenza evolution are governed not only by positive selection within, but also by background selection outside antigenic epitopes: immune adaptation and conservation of other viral functions interfere with each other. Hence, adapting viral proteins are predicted to be particularly brittle. We conclude that a quantitative understanding of influenza’s evolutionary and epidemiological dynamics must be based on all genomic domains and functions coupled by clonal interference.
adaptive evolution; inference of selection; mutation rate; seasonal influenza
Despite their close phylogenetic relationship, type A and B influenza viruses exhibit major epidemiological differences in humans, with the latter both less common and less often associated with severe disease. However, it is unclear what processes determine the evolutionary dynamics of influenza B virus, and how influenza viruses A and B interact at the evolutionary scale. To address these questions we inferred the phylogenetic history of human influenza B virus using complete genome sequences for which the date (day) of isolation was available. By comparing the phylogenetic patterns of all eight viral segments we determined the occurrence of segment reassortment over a 30-year sampling period. An analysis of rates of nucleotide substitution and selection pressures revealed sporadic occurrences of adaptive evolution, most notably in the viral hemagglutinin and compatible with the action of antigenic drift, yet lower rates of overall and nonsynonymous nucleotide substitution compared to influenza A virus. Overall, these results led us to propose a model in which evolutionary changes within and between the antigenically distinct ‘Yam88’ and ‘Vic87’ lineages of influenza B virus are the result of changes in herd immunity, with reassortment continuously generating novel genetic variation. Additionally, we suggest that the interaction with influenza A virus may be central in shaping the evolutionary dynamics of influenza B virus, facilitating the shift of dominance between the Vic87 and the Yam88 lineages.
Influenza B virus; Phylogeny; Reassortment; Coalescent; Antigenic drift; Epidemiology
Influenza virus undergoes rapid evolution by both antigenic shift and antigenic drift. Antibodies, particularly those binding near the receptor-binding site of hemagglutinin (HA) or the neuraminidase (NA) active site, are thought to be the primary defense against influenza infection, and mutations in antibody binding sites can reduce or eliminate antibody binding. The binding of antibodies to their cognate antigens is governed by such biophysical properties of the interacting surfaces as shape, non-polar and polar surface area, and charge.
To understand forces shaping evolution of influenza virus, we have examined HA sequences of human influenza A and B viruses, assigning each amino acid values reflecting total accessible surface area, non-polar and polar surface area, and net charge due to the side chain. Changes in each of these values between neighboring sequences were calculated for each residue and mapped onto the crystal structures.
Areas of HA showing the highest frequency of pairwise changes agreed well with previously identified antigenic sites in H3 and H1 HAs, and allowed us to propose more detailed antigenic maps and novel antigenic sites for H1 and influenza B HA. Changes in biophysical properties differed between HAs of different subtypes, and between different antigenic sites of the same HA. For H1, statistically significant differences in several biophysical quantities compared to residues lying outside antigenic sites were seen for some antigenic sites but not others. All influenza B antigenic sites showed statistically significant differences in biophysical quantities, whereas no statistically significant differences in biophysical quantities were seen for any antigenic site in H3. In many cases, residues previously shown to be under positive selection at the genetic level also undergo rapid change in biophysical properties.
The biophysical consequences of amino acid changes introduced by antigenic drift vary from subtype to subtype, and between different antigenic sites. This suggests that the significance of antibody binding in selecting new variants may also be variable for different antigenic sites and influenza subtypes.
Influenza A virus is one of the best-studied viruses and a model organism for the study of molecular evolution; in particular, much research has focused on detecting natural selection on influenza virus proteins. Here, we study the dynamics of the synonymous and nonsynonymous nucleotide composition of influenza A virus genes. In several genes, the nucleotide frequencies at synonymous positions drift away from the equilibria predicted from the synonymous substitution matrices. We investigate possible reasons for this unexpected behavior by fitting several regression models. Relaxation toward a mutation-selection equilibrium following a host jump fails to explain the dynamics of the synonymous nucleotide composition, even if we allow for slow temporal changes in the substitution matrix. Instead, we find that deep internal branches of the phylogeny show distinct patterns of nucleotide substitution and that these branches strongly influence the dynamics of nucleotide composition, suggesting that the observed trends are at least in part a result of natural selection acting on synonymous sites. Moreover, we find that the dynamics of the nucleotide composition at synonymous and nonsynonymous sites are highly correlated, providing evidence that even nonsynonymous sites can be influenced by selection pressure for nucleotide composition.
The RNA genome of the hepatitis C virus (HCV) diversifies rapidly during the acute phase of infection, but the selective forces that drive this process remain poorly defined. Here we examined whether Darwinian selection pressure imposed by CD8+ T cells is a dominant force driving early amino acid replacement in HCV viral populations. This question was addressed in two chimpanzees followed for 8 to 10 years after infection with a well-defined inoculum composed of a clonal genotype 1a (isolate H77C) HCV genome. Detailed characterization of CD8+ T cell responses combined with sequencing of recovered virus at frequent intervals revealed that most acute-phase nonsynonymous mutations were clustered in class I epitopes and appeared much earlier than those in the remainder of the HCV genome. Moreover, the ratio of nonsynonymous to synonymous mutations, a measure of positive selection pressure, was increased 50-fold in class I epitopes compared with the rest of the HCV genome. Finally, some mutation of the clonal H77C genome toward a genotype 1a consensus sequence considered most fit for replication was observed during the acute phase of infection, but the majority of these amino acid substitutions occurred slowly over several years of chronic infection. Together these observations indicate that during acute hepatitis C, virus evolution was driven primarily by positive selection pressure exerted by CD8+ T cells. This influence of immune pressure on viral evolution appears to subside as chronic infection is established and genetic drift becomes the dominant evolutionary force.
Distinguishing mutations that determine an organism's phenotype from (near-) neutral ‘hitchhikers’ is a fundamental challenge in genome research, and is relevant for numerous medical and biotechnological applications. For human influenza viruses, recognizing changes in the antigenic phenotype and a strain's capability to evade pre-existing host immunity is important for the production of efficient vaccines. We have developed a method for inferring ‘antigenic trees’ for the major viral surface protein hemagglutinin. In the antigenic tree, antigenic weights are assigned to all tree branches, which allows us to resolve the antigenic impact of the associated amino acid changes. Our technique predicted antigenic distances with comparable accuracy to antigenic cartography. Additionally, it identified both known and novel sites, and amino acid changes with antigenic impact in the evolution of influenza A (H3N2) viruses from 1968 to 2003. The technique can also be applied for inference of ‘phenotype trees’ and genotype–phenotype relationships from other types of pairwise phenotype distances.
The molecular evolution of any organism is described by changes in the genotype resulting from genetic drift or selection to maintain or establish fitness under the given environmental conditions. Identifying phenotype-defining changes and distinguishing them from (near-) neutral ‘hitchhikers’ is a fundamental challenge in genome research. The standard approach involves time- and cost-intensive mutation experiments, which are typically low throughput due to their experimental nature. We have developed a computational method for the inference of phenotypic impact of genotypic changes that is applicable to any system, within or across species, where homologous genetic sequences and associated pairwise phenotype distances are available. We demonstrate the accuracy of our method by application to the human influenza A (H3N2) virus. This exemplary system is of particular interest, as recognizing changes in the antigenic phenotype and a viral strain's capability to evade pre-existing host immunity is important for the production of efficient vaccines. We accurately identified known sites and amino acid changes with antigenic impact over 35 years of evolution, and provide further details on individual antigenically relevant changes in the evolution of influenza A (H3N2) viruses.
Whole-genome scans for positive Darwinian selection are widely used to detect evolution of genome novelty. Most approaches are based on evaluation of the nonsynonymous to synonymous substitution rate ratio across evolutionary lineages. These methods are sensitive to saturation of synonymous sites and thus cannot be used to study evolution of distantly related organisms. In contrast, indels occur less frequently than amino acid replacements, accumulate more slowly, and can be employed to characterize evolution of diverged organisms. As indels are also subject to the forces of natural selection, they can generate functional changes through positive selection. Here, we present a new computational approach to detect selective constraints on indel substitutions at the whole-genome level for distantly related organisms. Our method is based on ancestral sequence reconstruction, takes into account the varying susceptibility of different types of secondary structure to indels, and according to simulation studies is conservative. We applied this newly developed framework to characterize the evolution of organisms of the Planctomycetes, Verrucomicrobia, Chlamydiae (PVC) bacterial superphylum. The superphylum contains organisms with unique cell biology, physiology, and diverse lifestyles. It includes bacteria with simple cell organization and more complex eukaryote-like compartmentalization. Lifestyles range from free-living organisms to obligate pathogens. In this study, we conducted a whole-genome level analysis of indel substitutions specific to evolutionary lineages of the PVC superphylum and found that indels evolved under positive selection on up to 12% of gene tree branches. We also analyzed possible functional consequences for several case studies of predicted indel events.
selection; indel substitutions; PVC superphylum
Non-long terminal repeat retroelements continue to impact the human genome through
cis-activity of long interspersed element-1 (LINE-1 or L1) and trans-mobilization of Alu.
Current activity is dominated by modern subfamilies of these elements, leaving behind an
evolutionary graveyard of extinct Alu and L1 subfamilies. Because Alu is a nonautonomous
element that relies on L1 to retrotranspose, there is the possibility that competition
between these elements has driven selection and antagonistic coevolution between Alu and
L1. Through analysis of synonymous versus nonsynonymous codon evolution across L1
subfamilies, we find that the C-terminal ORF2 cys domain experienced a dramatic increase
in amino acid substitution rate in the transition from L1PA5 to L1PA4 subfamilies. This
observation coincides with the previously reported rapid evolution of ORF1 during the same
transition period. Ancestral Alu sequences have been previously reconstructed, as their
short size and ubiquity have made it relatively easy to retrieve consensus sequences from
the human genome. In contrast, creating constructs of extinct L1 copies is a more
laborious task. Here, we report our efforts to recreate and evaluate the
retrotransposition capabilities of two ancestral L1 elements, L1PA4 and L1PA8 that were
active ∼18 and ∼40 Ma, respectively. Relative to the modern L1PA1 subfamily, we
find that both elements are similarly active in a cell culture retrotransposition assay in
HeLa, and both are able to efficiently trans-mobilize Alu elements from several
subfamilies. Although we observe some variation in Alu subfamily retrotransposition
efficiency, any coevolution that may have occurred between LINEs and SINEs is not evident
from these data. Population dynamics and stochastic variation in the number of active
source elements likely play an important role in individual LINE or SINE subfamily
amplification. If coevolution also contributes to changing retrotransposition rates and
the progression of subfamilies, cell factors are likely to play an important mediating
role in changing LINE-SINE interactions over evolutionary time.
Alu; extinction; LINE-1; L1; ORF1 protein; ORF2 protein; retroelement; SINE amplification
The influenza A(H1N1)2009 virus has been the dominant type of influenza A virus in Finland during the 2009–2010 and 2010–2011 epidemic seasons. We analyzed the antigenic characteristics of several influenza A(H1N1)2009 viruses isolated during the two influenza seasons by analyzing the amino acid sequences of the hemagglutinin (HA), modeling the amino acid changes in the HA structure and measuring antibody responses induced by natural infection or influenza vaccination.
Based on the HA sequences of influenza A(H1N1)2009 viruses we selected 13 different strains for antigenic characterization. The analysis included the vaccine virus, A/California/07/2009, and multiple California-like isolates from the 2009–2010 and 2010–2011 epidemic seasons. These viruses had two to five amino acid changes in their HA1 molecule. The mutations were located in antigenic sites Sa, Ca1, Ca2 and Cb. Analysis of the antibody levels by hemagglutination inhibition test (HI) indicated that vaccinated individuals and people who had experienced a natural influenza A(H1N1)2009 virus infection showed good immune responses against the vaccine virus and most of the wild-type viruses. However, one to two amino acid changes in the antigenic site Sa dramatically affected the ability of antibodies to recognize these viruses. In contrast, the tested viruses were indistinguishable in regard to antibody recognition by the sera from elderly individuals who had been exposed to the Spanish influenza or its descendant viruses during the early 20th century.
According to our results, one to two amino acid changes (N125D and/or N156K) in the major antigenic sites of the hemagglutinin of influenza A(H1N1)2009 virus may lead to significant reduction in the ability of patient and vaccine sera to recognize A(H1N1)2009 viruses.
Geographical separation of host species has shaped the avian influenza A virus gene pool into independently evolving Eurasian and American lineages, although phylogenetic evidence for gene flow and reassortment indicates that these lineages also mix on occasion. While the evolutionary dynamics of the avian influenza gene pool have been described, the consequences of gene flow on virus evolution and population structure in this system have not been investigated. Here we show that viral gene flow from Eurasia has led to the replacement of endemic avian influenza viruses in North America, likely through competition for susceptible hosts. This competition is characterized by changes in rates of nucleotide substitution and selection pressures. However, the discontinuous distribution of susceptible hosts may produce long periods of co-circulation of competing virus strains before lineage extinction occurs. These results also suggest that viral competition for host resources may be an important mechanism in disease emergence.
evolution; ecology; population dynamics; influenza A virus; competition
After the emergence of influenza A viruses in the human population, the number of N-glycosylation sites (NGS) in the globular head region of hemagglutinin (HA) has increased continuously for several decades. It has been speculated that the addition of NGS to the globular head region of HA has conferred selective advantages to the virus by preventing the binding of antibodies (Ab) to antigenic sites (AS). Here, the effect of N-glycosylation on the binding of Ab to AS in human influenza A virus subtype H3N2 (A/H3N2) was examined by inferring natural selection at AS and other sites (NAS) that are located close to and distantly from the NGS in the three-dimensional structure of HA through a comparison of the rates of synonymous (dS) and nonsynonymous (dN) substitutions. When positions 63, 122, 126, 133, 144, and 246 in the globular head region of HA were non-NGS, the dN/dS was >1 and positive selection was detected at the AS located near these positions. However, the dN/dS value decreased and the evidence of positive selection disappeared when these positions became NGS. In contrast, dN/dS at the AS distantly located from the positions mentioned above and at the NAS of any location were generally <1 and did not decrease when these positions changed from non-NGS to NGS. These results suggest that the attachment of N-glycans to the NGS in the globular head region of HA prevented the binding of Ab to AS in the evolutionary history of human A/H3N2 virus.
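To make the dN/dS comparison above concrete, here is a minimal sketch of one common way to turn counts of differences into a ratio: per-site proportions with a Jukes-Cantor correction for multiple hits, in the style of the Nei-Gojobori method. The abstract does not say which estimator the authors used, so this choice is an assumption, and the counts in the example are invented.

```python
import math

def jukes_cantor(p):
    """Correct a proportion of differing sites for multiple hits (needs 0 <= p < 0.75)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Ratio of nonsynonymous to synonymous distance from difference and site counts."""
    d_n = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    d_s = jukes_cantor(syn_diffs / syn_sites)
    if d_s == 0.0:
        raise ZeroDivisionError("no synonymous differences; ratio is undefined")
    return d_n / d_s

# Invented counts for a set of codons near a candidate glycosylation site.
ratio = dn_ds(nonsyn_diffs=12, nonsyn_sites=60, syn_diffs=5, syn_sites=50)
print(ratio > 1.0)  # True: nonsynonymous changes are in excess under these counts
```

A ratio above 1 is read as evidence of positive selection and a ratio below 1 as purifying selection, which is the contrast the abstract draws between sites before and after they became N-glycosylated.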
The WHO Global Influenza Surveillance Network has routinely performed genetic and antigenic analyses of human influenza viruses to monitor influenza activity. Although these analyses provide supporting data for the selection of vaccine strains, it seems desirable to have user-friendly tools to visualize the antigenic evolution of influenza viruses for the purpose of surveillance. To meet this need, we have developed a web server, ATIVS (Analytical Tool for Influenza Virus Surveillance), for analyzing serological data of all influenza viruses and hemagglutinin sequence data of human influenza A/H3N2 viruses so as to generate antigenic maps for influenza surveillance and vaccine strain selection. Functionalities are described and examples are provided to illustrate its usefulness and performance. The ATIVS web server is available at http://influenza.nhri.org.tw/ATIVS/.
One selection pressure shaping sequence evolution is the requirement that a
protein fold with sufficient stability to perform its biological functions. We
present a conceptual framework that explains how this requirement causes the
probability that a particular amino acid mutation is fixed during evolution to
depend on its effect on protein stability. We mathematically formalize this
framework to develop a Bayesian approach for inferring the stability effects of
individual mutations from homologous protein sequences of known phylogeny. This
approach is able to predict published experimentally measured mutational
stability effects (ΔΔG values) with an accuracy
that exceeds both a state-of-the-art physicochemical modeling program and the
sequence-based consensus approach. As a further test, we use our phylogenetic
inference approach to predict stabilizing mutations to influenza hemagglutinin.
We introduce these mutations into a temperature-sensitive influenza virus with a
defect in its hemagglutinin gene and experimentally demonstrate that some of the
mutations allow the virus to grow at higher temperatures. Our work therefore
describes a powerful new approach for predicting stabilizing mutations that can
be successfully applied even to large, complex proteins such as hemagglutinin.
This approach also makes a mathematical link between phylogenetics and
experimentally measurable protein properties, potentially paving the way for
more accurate analyses of molecular evolution.
Mutating a protein frequently causes a change in its stability. As scientists, we
often care about these changes because we would like to engineer a
protein's stability or understand how its stability is impacted by a
naturally occurring mutation. Evolution also cares about mutational stability
changes, because a basic evolutionary requirement is that proteins remain
sufficiently stable to perform their biological functions. Our work is based on
the idea that it should be possible to use the fact that evolution selects for
stability to infer from related proteins the effects of specific mutations. We
show that we can indeed use protein evolutionary histories to computationally
predict previously measured mutational stability changes more accurately than
methods based on either of the two main existing strategies. We then test
whether we can predict mutations that increase the stability of hemagglutinin,
an influenza protein whose rapid evolution is partly responsible for the ability
of this virus to cause yearly epidemics. We experimentally create viruses
carrying predicted stabilizing mutations and find that several do in fact
improve the virus's ability to grow at higher temperatures. Our
computational approach may therefore be of use in understanding the evolution of
this medically important virus.
Influenza B virus hemagglutinin (HA) is a major surface glycoprotein with frequent amino acid substitutions. However, the role of antibody selection in the amino acid substitutions of HA remains poorly understood. To gain insight into this issue, an analysis was conducted on a total of 271 HA1 sequences of influenza B virus strains isolated during 1940–2007. In this analysis, the Phylogenetic Analysis by Maximum Likelihood (PAML) package was used to detect positive selection and to identify positively selected sites on HA1. Strikingly, all the positively selected sites were located in the four major epitopes (120-loop, 150-loop, 160-loop, and 190-helix) of HA identified in previous studies, thus supporting a predominant role of antibody selection in HA evolution. Of particular significance is the involvement of the 120-loop in positive selection, which may become increasingly important in future field isolates. Despite the absence of different subtypes, influenza B virus HA continued to evolve into new sublineages, within which the four major epitopes were selectively targeted by positive selection. Thus, any newly emerging strains need to be placed in the context of their evolutionary history in order to understand and predict their epidemic potential.
positive selection; antigenic drift; molecular evolution; antibody selection
To investigate the process of human immunodeficiency virus type 1 (HIV-1) evolution in vivo, a total of 179 HIV-1 V3 sequences derived from cell-free plasma were determined from serial samples in three epidemiologically linked individuals (one infected blood donor and two transfusion recipients) over a maximum period of 8 years. A systematic analysis of pairwise comparisons of intrapatient sequences, both within and between each sample time point, revealed a preponderance and accumulation of nonsynonymous rather than synonymous substitutions in the V3 loop and flanking regions as they diverged over time. This strongly argues for the dominant role that positive selection for amino acid change plays in governing the pattern and process of HIV-1 env V3 evolution in vivo and nullifies hypotheses of purely neutral or mutation-driven evolution or completely chance events. In addition, different rates of evolution of HIV-1 were observed in these three different individuals infected with the same viral strain, suggesting that the degree of positive pressure for HIV-1 amino acid change is host dependent. Finally, the observed similar rate of accumulation in divergence within and between infected individuals suggests that the process of genetic divergence in the HIV epidemic proceeds regardless of host-to-host transmission events, i.e., that transmission does not reset the evolutionary clock.
The evolutionary speed and the consequent immune escape of H3N2 influenza A virus make it an interesting evolutionary system. Charged amino acid residues are often significant contributors to the free energy of binding for protein–protein interactions, including antibody–antigen binding and ligand–receptor binding. We used Markov chain theory and maximum likelihood estimation to model the evolution of the number of charged amino acids on the dominant epitope in the hemagglutinin protein of circulating H3N2 virus strains. The number of charged amino acids increased in the dominant epitope B of the H3N2 virus since its introduction in humans in 1968. When epitope A became dominant in 1989, the number of charged amino acids increased in epitope A and decreased in epitope B. Interestingly, the number of charged residues in the dominant epitope of the dominant circulating strain is never fewer than that in the vaccine strain. We propose that these results indicate selective pressure for charged amino acids that increase the affinity of the virus epitope for water and decrease the affinity for host antibodies. The standard PAM model of generic protein evolution is unable to capture these trends. The reduced alphabet Markov model (RAMM) we introduce captures the increased selective pressure for charged amino acids in the dominant epitope of hemagglutinin of H3N2 influenza (R² > 0.98 between 1968 and 1988). The RAMM calibrated to historical H3N2 influenza virus evolution in humans fits well to the H3N2/Wyoming virus evolution data from guinea pig animal model studies.
Influenza; Virus evolution; Pepitope
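The combination of Markov chain theory and maximum likelihood estimation mentioned in the charged-residue abstract above can be illustrated with a generic Python sketch. It is not the RAMM itself, whose state space and parameterization are not given here. It only shows the textbook result that, for a first-order Markov chain observed as a single sequence of states, the maximum likelihood estimate of each transition probability is the observed transition count divided by the row total. The function name `estimate_transition_matrix` and the yearly charged-residue counts are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_transition_matrix(states):
    """Maximum likelihood transition probabilities for a first-order chain.

    `states` is an ordered sequence of observed states, here the number
    of charged residues in an epitope at successive time points.
    Returns a nested dict: P[current][next] = estimated probability.
    """
    counts = defaultdict(Counter)
    # Tally every observed one-step transition current -> next.
    for current, nxt in zip(states, states[1:]):
        counts[current][nxt] += 1
    # Normalize each row so that its probabilities sum to one.
    return {
        current: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for current, row in counts.items()
    }


# Hypothetical series of charged-residue counts, one value per season.
observed = [3, 3, 4, 4, 4, 5]
P = estimate_transition_matrix(observed)
```

States that are never left in the observed series, such as the final value 5 here, get no row in the estimate. A fitted model would need more data or a parametric form to cover them.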
Examination of the evolutionary dynamics of complete influenza viral genomes reveals that other processes, in conjunction with antigenic drift, play important roles in viral evolution and selection, but there is little biological evidence to support these genomic data. Previous work demonstrated that after the A/Fujian/411/2002-like H3N2 influenza A epidemic during 2003–2004, a preexisting nondominant Fujian-like viral clade gained a small number of changes in genes encoding the viral polymerase complex, along with several changes in the antigenic regions of hemagglutinin, and in a genome-wide selective sweep, it replaced other co-circulating H3N2 clades.
Representative strains of these virus clades were evaluated in vitro and in vivo.
The newly dominant 2004–2005 A/California/7/2004-like H3N2 clade, which featured 2 key amino acid changes in the polymerase PA segment, grew to higher titers in MDCK cells and ferret tissues and caused more-severe disease in ferrets. The polymerase complex of this virus demonstrated enhanced activity in vitro, correlating directly to the enhanced replicative fitness and virulence in vivo.
These data suggest that influenza strains can be selected in humans through mutations that increase replicative fitness and virulence, in addition to the well-characterized antigenic changes in the surface glycoproteins.
Tat-specific cytotoxic T cells have previously been shown to exert positive Darwinian selection favoring amino acid replacements of an epitope of simian immunodeficiency virus (SIV). The region of the tat gene encoding this epitope falls within a region of overlap between the tat and vpr reading frames, and nonsynonymous nucleotide substitutions in the tat reading frame were found to occur disproportionately in such a way as to cause synonymous changes in the vpr reading frame. Comparison of published complete SIV genomes showed Tat to be the least conserved at the amino acid level of nine proteins encoded by the virus, while Vpr was one of the most conserved. Numerous parallel amino acid changes occurred within the Tat epitope independently in different monkeys, and purifying selection on the vpr reading frame, by limiting acceptable nonsynonymous substitutions in the tat reading frame, evidently has enhanced the probability of parallel evolution.
The unexpectedly low efficacy of influenza vaccine during school outbreaks of influenza B virus in the spring of 1987 in Japan was probably attributable to a poor antibody response of vaccinees to the epidemic viruses. An antigenic analysis of the causative B viruses isolated in 1987 and 1988 showed much variation in hemagglutination inhibition patterns. The nucleotide sequences that code for the HA1 domain of B/Fukuoka/c-27/81, B/Ibaraki/2/85, B/Nagasaki/1/87, and B/Yamagata/16/88 viruses were determined and compared with those of the previously reported hemagglutinin genes. The nucleotide sequence of the hemagglutinin gene of a new variant, B/Yamagata/16/88, had only 93.4% homology with those of two other viruses from the same epidemic. An analysis of nucleotide and amino acid substitutions of the hemagglutinin genes of influenza B viruses revealed that new and some old variants could cocirculate in the same epidemic. A phylogenetic tree constructed by the neighbor-joining method allowed estimation of an evolutionary rate of 2.3 × 10⁻³ synonymous (silent) substitutions per nucleotide site per year in the hemagglutinin gene.
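A sequence homology figure such as the 93.4% quoted above is typically a pairwise percent identity. The Python sketch below shows one simple way to compute it, under the assumption that the two sequences are already aligned and of equal length. The abstract does not state how its value was obtained, so this is an illustration rather than the authors' procedure. The function name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(seq1, seq2):
    """Percentage of aligned positions at which two sequences match.

    Assumes the sequences are pre-aligned and of equal length. Gap
    characters, if present, are compared like any other symbol.
    """
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq1.upper(), seq2.upper()))
    return 100.0 * matches / len(seq1)


# Hypothetical 10-nucleotide fragments differing at one position.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
```

Percent identity counts only observed differences. An evolutionary rate like the one reported above additionally needs sampling dates and a correction for multiple substitutions at the same site.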