Next generation sequencing (NGS) is superseding Sanger technology for analysing intra-host viral populations, in terms of genome length and resolution. We introduce two new empirical validation data sets and test the available viral population assembly software. Two intra-host viral population ‘quasispecies’ samples (type-1 human immunodeficiency and hepatitis C virus) were Sanger-sequenced, and plasmid clone mixtures at controlled proportions were shotgun-sequenced using Roche's 454 sequencing platform. The performance of different assemblers was compared in terms of phylogenetic clustering and recombination with the Sanger clones. Phylogenetic clustering showed that all assemblers captured a proportion of the most divergent lineages, but none were able to provide a high precision/recall tradeoff. Estimated variant frequencies mildly correlated with the original. Given the limitations of currently available algorithms identified by our empirical validation, the development and exploitation of additional data sets is needed, in order to establish an efficient framework for viral population reconstruction using NGS.
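As an illustration of how precision and recall of a reconstruction could be scored against Sanger clones, the following is a minimal sketch and not the procedure used in the study. It assumes pre-aligned, equal-length sequences and a hypothetical mismatch threshold (`max_mismatches`) for deciding that a reconstructed variant matches a clone; the study itself compared assemblers through phylogenetic clustering.

```python
def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def precision_recall(reconstructed, reference, max_mismatches=0):
    """Score reconstructed variants against reference clones.

    Precision: fraction of reconstructed variants lying within
    `max_mismatches` (an assumed threshold) of at least one reference clone.
    Recall: fraction of reference clones lying within the threshold of
    at least one reconstructed variant.
    """
    def close(seq, pool):
        return any(hamming(seq, other) <= max_mismatches for other in pool)

    correct = sum(close(v, reference) for v in reconstructed)
    recovered = sum(close(c, reconstructed) for c in reference)
    precision = correct / len(reconstructed) if reconstructed else 0.0
    recall = recovered / len(reference) if reference else 0.0
    return precision, recall
```

Loosening the threshold typically raises recall without necessarily raising precision, which is one simple way to picture the tradeoff reported above.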
Foamy viruses naturally infect a wide range of mammals, including Old World primates (OWP) and New World primates (NWP); those infecting primates are collectively called simian foamy viruses (SFV). While NWP species in Central and South America are highly diverse, only SFV from captive marmoset, spider monkey, and squirrel monkey have been genetically characterized and the molecular epidemiology of SFV infection in NWPs remains unknown. We tested a large collection of genomic DNA (n = 332) comprising 14 genera of NWP species for the presence of SFV polymerase (pol) sequences using generic PCR primers. Further molecular characterization of positive samples was carried out by LTR-gag and larger pol sequence analysis. We identified novel SFVs infecting nine NWP genera. Prevalence rates ranged from 14% to 30% in different species for which at least 10 specimens were tested. High SFV genetic diversity among NWP, up to 50% in LTR-gag and 40% in pol, was revealed by intragenus and intrafamilial comparisons. Two different SFV strains infecting two captive yellow-breasted capuchins did not group in species-specific lineages but rather clustered with SFVs from marmoset and spider monkeys, indicating independent cross-species transmission events. We describe the first SFV epidemiology study of NWP, and the first evidence of SFV infection in wild NWPs. We also document a wide distribution of distinct SFVs in 14 NWP genera, including two novel co-speciating SFVs in capuchins and howler monkeys, suggestive of an ancient evolutionary history in NWPs for at least 28 million years. A high SFV genetic diversity was seen among NWP, yet these viruses seem able to jump between NWP species and even genera. Our results raise concerns for the risk of zoonotic transmission of NWP SFV to humans as these primates are regularly hunted for food or kept as pets in forest regions of South America.
Methicillin-resistant Staphylococcus aureus (MRSA) is a leading cause of healthcare-associated infections and a significant contributor to healthcare cost. Community-associated-MRSA (CA-MRSA) strains have now invaded healthcare settings. A convenience sample of 97 clinical MRSA isolates was obtained from seven hospitals during a one-week period in 2010. We employed a framework integrating Staphylococcus protein A typing and full-genome next-generation sequencing. Single nucleotide polymorphisms were analyzed using phylodynamics. Twenty-six t002, 48 t008, and 23 other strains were identified. Phylodynamic analysis of 30 t008 strains showed ongoing exponential growth of the effective population size, with the basic reproductive number (R0) ranging from 1.24 to 1.34. No evidence of hospital clusters was identified. The lack of phylogeographic clustering suggests that community introduction is a major contributor to emergence of CA-MRSA strains within hospitals. Phylodynamic analysis provides a powerful framework to investigate MRSA transmission between the community and hospitals, an understanding of which is essential for control.
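The abstract does not state how R0 was derived from the growth of the effective population size. As a hedged illustration only, two textbook relations between an exponential growth rate r and R0 are sketched below: R0 = 1 + rD when the duration of infectiousness D is exponentially distributed, and R0 = exp(rT) when the generation time T is fixed. The function names and any input values are hypothetical and are not taken from the study.

```python
import math


def r0_exponential_duration(growth_rate: float, mean_duration: float) -> float:
    """R0 = 1 + r * D, assuming an exponentially distributed infectious period.

    `growth_rate` (r) and `mean_duration` (D) must use the same time unit.
    """
    return 1.0 + growth_rate * mean_duration


def r0_fixed_generation_time(growth_rate: float, generation_time: float) -> float:
    """R0 = exp(r * T), assuming every generation interval has the same length T."""
    return math.exp(growth_rate * generation_time)
```

For the same r and the same mean interval, the fixed-interval relation gives the larger estimate, so the assumed distribution of the infectious period matters when comparing R0 values across studies.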
Infection of CD8-depleted rhesus macaques with the genetically heterogeneous simian immunodeficiency virus (SIV)mac251 viral swarm provides a rapid-disease model for simian acquired immune deficiency syndrome and SIV-encephalitis (SIVE). The objective was to evaluate how the diversity of the swarm influences the initial seeding of the infection that may potentially affect disease progression. Plasma, lymphoid and non-lymphoid (brain and lung) tissues were collected from two infected macaques euthanized at 21 days post-infection (p.i.), as well as longitudinal specimens and post-mortem tissues from four macaques followed throughout the infection. About 1300 gp120 viral sequences were obtained from the infecting SIVmac251 swarm and the macaques' longitudinal and post-mortem samples. Phylogenetic and amino acid signature pattern analyses were carried out to assess frequency, transmission dynamics and persistence of specific viral clusters. Although no significant reduction in viral heterogeneity was found early in infection (21 days p.i.), transmission and replication of SIV variants were not entirely random. In particular, two distinct motifs under-represented (<4 %) in the infecting swarm were found at high frequencies (up to 14 %) in all six macaques as early as 21 days p.i. Moreover, a macrophage tropic variant not detected in the viral swarm (<0.3 %) was present at high frequency (29–100 %) in sequences derived from the brain of two macaques with meningitis or severe SIVE. This study demonstrates the highly efficient transmission and persistence in vivo of multiple low frequency SIVmac251 founder variants, characterized by specific gp120 motifs that may be linked to pathogenesis in the rapid-disease model of neuroAIDS.
To understand the evolutionary processes leading to the diversity of Asian colobines, we report here on a phylogenetic, phylogeographical and population genetic analysis of three closely related langurs, Trachypithecus francoisi, T. poliocephalus and T. leucocephalus, which are all characterized by different pelage coloration predominantly on the head and shoulders. To this end, we sequenced a 395-bp fragment of the mitochondrial control region from 178 T. francoisi, 54 T. leucocephalus and 19 T. poliocephalus individuals, representing all extant populations of these three species. We found 29 haplotypes in T. francoisi, 12 haplotypes in T. leucocephalus and three haplotypes in T. poliocephalus. T. leucocephalus and T. poliocephalus form monophyletic clades, which are both nested within T. francoisi, and diverged from T. francoisi recently, 0.46-0.27 (T. leucocephalus) and 0.50-0.25 million years ago (T. poliocephalus). Thus, T. francoisi appears as a polyphyletic group, while T. leucocephalus and T. poliocephalus are most likely independent descendants of T. francoisi that are both physically separated from T. francoisi populations by rivers, open sea or larger habitat gaps. Since T. francoisi populations show no variability in pelage coloration, pelage coloration in T. leucocephalus and T. poliocephalus is most likely the result of new genetic mutations after the split from T. francoisi and not of the fixation of different characters derived from an ancestral polymorphism. This case study highlights that morphological changes, for example in pelage coloration, can occur in isolated populations in relatively short time periods, and it provides a solid basis for studies in related species. Nevertheless, to fully understand the evolutionary history of these three langur species, nuclear loci should be investigated as well.
The evolution and population dynamics of human influenza in Taiwan are a microcosm of those of the viruses circulating worldwide, but have not yet been studied in detail. We collected 343 representative full genome sequences of human influenza A viruses isolated in Taiwan between 1979 and 2009. Phylogenetic and antigenic data analysis revealed that H1N1 and H3N2 viruses consistently co-circulated in Taiwan, although they were characterized by different temporal dynamics and degrees of genetic diversity. Moreover, influenza A viruses of both subtypes underwent internal gene reassortment involving all eight segments of the viral genome, some of which also occurred during non-epidemic periods. The patterns of gene reassortment were different in the two subtypes. The internal genes of H1N1 viruses moved as a unit, separately from the co-evolving HA and NA genes. On the other hand, the HA and NA genes of H3N2 viruses tended to segregate consistently with different sets of internal gene segments. In particular, as reassortment occurred, H3HA always segregated as a group with the PB1, PA and M genes, while N2NA consistently segregated with PB2 and NP. Finally, the analysis showed that new phylogenetic lineages and antigenic variants emerging in summer were likely to be the progenitors of the epidemic strains in the following season. The synchronized seasonal patterns and high genetic diversity of influenza A viruses observed in Taiwan make it possible to capture the evolutionary dynamics and epidemiological rules governing antigenic drift and reassortment and may serve as a “warning” system that recapitulates the global epidemic.
HLA-B*5701 is the host factor most strongly associated with slow HIV-1 disease progression, although rates can vary within this group. Underlying mechanisms are not fully understood but likely involve both immunological and virological dynamics. The present study investigated HIV-1 in vivo evolution and epitope-specific CD8+ T cell responses in six HLA-B*5701 patients who had not received antiretroviral treatment, monitored from early infection for up to 7 years. The subjects were classified as high-risk progressors (HRPs) or low-risk progressors (LRPs) based on baseline CD4+ T cell counts. Dynamics of HIV-1 Gag p24 evolution and multifunctional CD8+ T cell responses were evaluated by high-resolution phylogenetic analysis and polychromatic flow cytometry, respectively. In all subjects, substitutions occurred more frequently in flanking regions than in HLA-B*5701-restricted epitopes. In LRPs, p24 sequence diversity was significantly lower; sequences exhibited a higher degree of homoplasy and more constrained mutational patterns than HRPs. The HIV-1 intrahost evolutionary rate was also lower in LRPs and followed a strict molecular clock, suggesting neutral genetic drift rather than positive selection. Additionally, polyfunctional CD8+ T cell responses, particularly to TW10 and QW9 epitopes, were more robust in LRPs, who also showed significantly higher interleukin-2 (IL-2) production in early infection. Overall, the findings indicate that HLA-B*5701 patients with higher CD4 counts at baseline have a lower risk of HIV-1 disease progression because of the interplay between specific HLA-linked immune responses and the rate and mode of viral evolution. The study highlights the power of a multidisciplinary approach, integrating high-resolution evolutionary and immunological data, to understand mechanisms underlying HIV-1 pathogenesis.
Chronic hepatitis C virus (HCV) infection can lead to liver cirrhosis in up to 20% of individuals, often requiring liver transplantation. Although the new liver is known to be rapidly reinfected, the dynamics and source of the reinfecting virus(es) are unclear, resulting in some confusion concerning the relationship between clinical outcome and viral characteristics. To clarify the dynamics of liver reinfection, longitudinal serum viral samples from 10 transplant patients were studied. Part of the E1/E2 region was sequenced, and advanced phylogenetic analysis methods were used in a multiparameter analysis to determine the history and ancestry of reinfecting lineages. Our results demonstrated the complexity of HCV evolutionary dynamics after liver transplantation, in which a large diverse population of viruses is transmitted and maintained for months to years. As many as 30 independent lineages in a single patient were found to reinfect the new liver. Several later posttransplant lineages were more closely related to older pretransplant viruses than to viruses detected immediately after transplantation. Although our data are consistent with a number of interpretations, the persistence of high viral genetic variation over long periods of time requires an active mechanism. We discuss possible scenarios, including frequency-dependent selection or variation in selective pressure among viral subpopulations, i.e., the population structure. The latter hypothesis, if correct, could have relevance to the success of newer direct-acting antiviral therapies.
Human enterovirus 85 (HEV85), whose prototype strain (Strain BAN00-10353/BAN/2000) was isolated in Bangladesh in 2000, is a recently identified serotype within the human enterovirus B (HEV-B) species. At present, only one nucleotide sequence of HEV85 (the complete genome sequence of the prototype strain) is available in the GenBank database.
In this study, we report the genetic characteristics of 33 HEV85 isolates that circulated in the Xinjiang Uighur autonomous region of China in 2011. Sequence analysis revealed that all these Chinese HEV85 isolates belong to 2 transmission chains, and intertypic recombination was found with the new unknown serotype HEV-B donor sequences. Two HEV85 isolates recovered from a patient presenting acute flaccid paralysis and one of his contacts were temperature-insensitive strains, and some nucleotide substitutions in the non-coding regions and in the 2C or 3D coding regions may have affected the temperature sensitivity of HEV85 strains.
The Chinese HEV85 recombinant described in this study trapped a new unknown serotype HEV-B donor sequence, indicating that new unknown HEV-B serotypes exist or circulate in Xinjiang of China. Our study also indicated that HEV85 is a prevalent and common enterovirus serotype in Xinjiang.
Combined anti-retroviral therapy (cART) has significantly reduced the number of AIDS-associated illnesses and changed the course of HIV-1 disease in developed countries. Despite the ability of cART to maintain high CD4+ T-cell counts, a number of macrophage-mediated diseases can still occur in HIV-infected subjects. These diseases include lymphoma, metabolic diseases, and HIV-associated neurological disorders. Within macrophages, the HIV-1 regulatory protein “Nef” can modulate surface receptors, interact with signaling pathways, and promote specific environments that contribute to each of these pathologies. Moreover, genetic variation in Nef may also guide the macrophage response. Herein, we review findings relating to the Nef–macrophage interaction and how this relationship contributes to disease pathogenesis.
cardiovascular disease; dementia; HIV-1 lymphoma; macrophages; Nef
Summary: Next-generation sequencing (NGS) is an ideal framework for the characterization of highly variable pathogens, with a deep resolution able to capture minority variants. However, the reconstruction of all variants of a viral population infecting a host is a challenging task for genome regions larger than the average NGS read length. QuRe is a program for viral quasispecies reconstruction, specifically developed to analyze long read (>100 bp) NGS data. The software performs alignments of sequence fragments against a reference genome, finds an optimal division of the genome into sliding windows based on coverage and diversity and attempts to reconstruct all the individual sequences of the viral quasispecies—along with their prevalence—using a heuristic algorithm, which matches multinomial distributions of distinct viral variants overlapping across the genome division. QuRe comes with a built-in Poisson error correction method and a post-reconstruction probabilistic clustering, both parameterized on given error rates in homopolymeric and non-homopolymeric regions.
Availability: QuRe is platform-independent, multi-threaded software implemented in Java. It is distributed under the GNU General Public License, available at https://sourceforge.net/projects/qure/.
Supplementary information: Supplementary data are available at Bioinformatics online.
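The Poisson error correction mentioned in the QuRe summary can be pictured with a minimal sketch. This is an illustration under stated assumptions, not QuRe's actual implementation: the function names, the significance level `alpha`, and the decision rule are hypothetical, and the error rate is supplied by the caller (it could be set separately for homopolymeric and non-homopolymeric regions, as the summary describes).

```python
import math


def poisson_sf(k: int, lam: float) -> float:
    """Return P(X >= k) for X ~ Poisson(lam).

    Computed as 1 minus the sum of P(X = i) for i < k; intended for the
    small expected counts typical of per-base sequencing error.
    """
    if k <= 0:
        return 1.0
    term = math.exp(-lam)      # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i        # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def is_probable_error(count: int, coverage: int, error_rate: float,
                      alpha: float = 0.01) -> bool:
    """True if `count` observations of a base are explainable by error alone.

    Under the assumed model, erroneous bases at a position with `coverage`
    reads follow Poisson(coverage * error_rate); `alpha` is a hypothetical
    significance level chosen by the caller.
    """
    return poisson_sf(count, coverage * error_rate) >= alpha
```

With a coverage of 1000 reads and an assumed error rate of 0.001, a base seen once is flagged as a probable error, whereas a base seen ten times is retained as a candidate minority variant.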
The HIV-1 genome is highly heterogeneous. This variation affords the virus a wide range of molecular properties, including the ability to infect cell types, such as macrophages and lymphocytes, expressing different chemokine receptors on the cell surface. In particular, R5 HIV-1 viruses use CCR5 as a coreceptor for viral entry, X4 viruses use CXCR4, whereas some viral strains, known as R5X4 or D-tropic, have the ability to utilize both coreceptors. X4 and R5X4 viruses are associated with rapid disease progression to AIDS. R5X4 viruses differ in that they have yet to be characterized by the examination of the genetic sequence of HIV-1 alone. In this study, a series of experiments was performed to evaluate different strategies of feature selection and neural network optimization. We demonstrate the use of artificial neural networks trained via evolutionary computation to predict viral coreceptor usage. The results indicate the identification of R5X4 viruses with a predictive accuracy of 75.5 percent.
Computational intelligence; evolutionary computation; artificial neural networks; HIV; AIDS; phenotype prediction; tropism; dual-tropic viruses
Deep sequencing provides the basis for analysis of biodiversity of taxonomically similar organisms in an environment. While extensively applied to microbiome studies, population genetics studies of viruses are limited. To define the scope of HIV-1 population biodiversity within infected individuals, a suite of phylogenetic and population genetic algorithms was applied to HIV-1 envelope hypervariable domain 3 (Env V3) within peripheral blood mononuclear cells from a group of perinatally HIV-1 subtype B infected, therapy-naïve children.
Biodiversity of HIV-1 Env V3 quasispecies ranged from about 70 to 270 unique sequence clusters across individuals. Viral population structure was organized into a limited number of clusters that included the dominant variants combined with multiple clusters of low frequency variants. Next generation viral quasispecies evolved from low frequency variants at earlier time points through multiple non-synonymous changes in lineages within the evolutionary landscape. Minor V3 variants detected as long as four years after infection co-localized in phylogenetic reconstructions with early transmitting viruses or with subsequent plasma virus circulating two years later.
Deep sequencing defines HIV-1 population complexity and structure, reveals the ebb and flow of dominant and rare viral variants in the host ecosystem, and identifies an evolutionary record of low-frequency cell-associated viral V3 variants that persist for years. The bioinformatics pipeline developed for HIV-1 can be applied to biodiversity studies of virome populations in human, animal, or plant ecosystems.
HIV-1 envelope V3; Biodiversity; Population structure; Quasispecies; Fitness; Pyrosequencing; Founder virus persistence; Most recent common ancestor
Rabies virus (RABV) causes severe neurological disease and death. As an important mechanism for generating genetic diversity in viruses, homologous recombination can lead to the emergence of novel virus strains with increased virulence and changed host tropism. However, it is still unclear whether recombination plays a role in the evolution of RABV. In this study, we isolated and sequenced four circulating RABV strains in China. Phylogenetic analyses identified a novel lineage of hybrid origin that comprises two different strains, J and CQ92. Analyses revealed that the virus 3′ untranslated region (UTR) and part of the N gene (approximately 500 nt in length) were likely derived from Chinese lineage I, while the rest of the genomic sequence was homologous to Chinese lineage II. Our findings reveal that homologous recombination can occur naturally in the field and shape the genetic structure of RABV populations.
Infecting rhesus macaques (Macaca mulatta) with the simian immunodeficiency virus (SIV) is an established animal model of human immunodeficiency virus (HIV) pathogenesis. Many studies have used various derivatives of the SIVmac251 viral swarm to investigate several aspects of the disease, including transmission, progression, response to vaccination, and SIV/HIV-associated neurological disorders. However, the lack of standardization of the infecting inoculum complicates comparative analyses. We investigated the genetic diversity and phylogenetic relationships of the 1991 animal-titered SIVmac251 swarm, the peripheral blood mononuclear cell (PBMC) passaged SIVmac251, and additional SIVmac251 sequences derived over the past 20 years. Significant sequence divergence and diversity were evident among the different viral sources. This finding highlights the importance of characterizing the exact source and genetic makeup of the infecting inoculum to achieve controlled experimental conditions and enable meaningful comparisons across studies.
HIV-1 CRF02_AG accounts for >50% of infected individuals in Cameroon. CRF02_AG prevalence has been increasing both in Africa and Europe, particularly in Italy because of migrations from the sub-Saharan region. This study investigated the molecular epidemiology of CRF02_AG in Cameroon by employing Bayesian phylodynamics and analyzed the relationship between HIV-1 CRF02_AG isolates circulating in Italy and those prevalent in Africa to understand the link between the two epidemics. Among 291 Cameroonian reverse transcriptase sequences analyzed, about 70% clustered within three distinct clades, two of which shared a most recent common ancestor, all related to sequences from Western Africa. The major Cameroonian clades emerged during the mid-1970s and slowly spread during the next 30 years. Little or no geographic structure was detected within these clades. One of the major driving forces of the epidemic was likely the high accessibility between locations in Southern Cameroon contributing to the mobility of the population. The remaining Cameroonian sequences and the new strains isolated from Italian patients were interspersed mainly within West and Central African sequences in the tree, indicating a continuous exchange of CRF02_AG viral strains between Cameroon and other African countries, as well as multiple independent introductions in the Italian population. The evaluation of the spread of CRF02_AG may provide significant insight about the future dynamics of the Italian and European epidemic.
The HIV-1 subtype C accounts for an important fraction of HIV infections in east Africa, but little is known about the genetic characteristics and evolutionary history of this epidemic. Here we reconstruct the origin and spatiotemporal dynamics of the major HIV-1 subtype C clades circulating in east Africa. A large number (n = 1,981) of subtype C pol sequences were retrieved from public databases to explore relationships between strains from the east, southern and central African regions. Maximum-likelihood phylogenetic analysis of those sequences revealed that most (>70%) strains from east Africa segregated in a single regional-specific monophyletic group, here called CEA. A second major Ethiopian subtype C lineage and a large collection of minor Kenyan and Tanzanian subtype C clades of southern African origin were also detected. A Bayesian coalescent-based method was then used to reconstruct evolutionary parameters and migration pathways of the CEA African lineage. This analysis indicates that the CEA clade most probably originated in Burundi around the early 1960s, and later spread to Ethiopia, Kenya, Tanzania and Uganda, giving rise to major country-specific monophyletic sub-clusters between the early 1970s and early 1980s. The results presented here demonstrate that a substantial proportion of subtype C infections in east Africa resulted from dissemination of a single HIV local variant, probably originated in Burundi during the 1960s. Burundi was the most important hub of dissemination of that subtype C clade in east Africa, fueling the origin of new local epidemics in Ethiopia, Kenya, Tanzania and Uganda. Subtype C lineages of southern African origin have also been introduced in east Africa, but seem to have had a much more restricted spread.
The Venezuelan Amerindians were, until recently, free of human immunodeficiency virus (HIV) infection. However, in 2007, HIV-1 infection was detected for the first time in the Warao Amerindian population living in the Eastern part of Venezuela, in the delta of the Orinoco river. The aim of this study was to analyze the genetic diversity of the HIV-1 circulating in this population.
The pol genomic region was sequenced for 16 HIV-1 isolates and for some of them, sequences from env, vif and nef genomic regions were obtained. All HIV-1 isolates were classified as subtype B, with exception of one that was classified as subtype C. The 15 subtype B isolates exhibited a high degree of genetic similarity and formed a highly supported monophyletic cluster in each genomic region analyzed. Evolutionary analyses of the pol genomic region indicated that the date of the most recent common ancestor of the Waraos subtype B clade dates back to the late 1990s.
At least two independent introductions of HIV-1 have occurred in the Warao Amerindians from Venezuela. HIV-1 subtype B became established and disseminated in the community, while no evidence of local dissemination of HIV-1 subtype C was detected in this study. These results warrant further surveys to evaluate the burden of this disease, which can be particularly devastating in this Amerindian population, which has a high prevalence of tuberculosis, hepatitis B, and other infectious diseases, and limited access to primary health care.
India has the third largest HIV-1 epidemic with 2.4 million infected individuals. Molecular epidemiological analysis has identified the predominance of HIV-1 subtype C (HIV-1C). However, the previous reports have been limited by sample size, and uneven geographical distribution. The introduction of HIV-1C in India remains uncertain due to this lack of structured studies. To fill the gap, we characterised the distribution pattern of HIV-1 subtypes in India based on data collection from nationwide clinical cohorts between 2007 and 2011. We also reconstructed the time to the most recent common ancestor (tMRCA) of the predominant HIV-1C strains.
Blood samples were collected from 168 HIV-1 seropositive subjects from 7 different states. HIV-1 subtypes were determined from two or three genes (gag, pol, and env) using several methods. A Bayesian coalescent-based approach was used to reconstruct the time of introduction and population growth patterns of the Indian HIV-1C. For the first time, a high prevalence (10%) of unique recombinant forms (BC and A1C) was observed when two or three genes were used instead of one gene (p<0.01; p = 0.02, respectively). The tMRCA of Indian HIV-1C, estimated using the three viral genes, ranged from 1967 (gag) to 1974 (env). Pol-gene analysis was considered to provide the most reliable estimate [1971, (95% CI: 1965–1976)]. The population growth pattern revealed an initial slow growth phase in the mid-1970s, an exponential phase through the 1980s, and a stationary phase since the early 1990s.
The Indian HIV-1C epidemic originated around 40 years ago from a single or a few genetically related African lineages, and has since evolved largely independently. The effective population size in the country has been broadly stable since the 1990s. The evolving viral epidemic, as indicated by the increase of recombinant strains, warrants continued molecular surveillance to guide efficient disease intervention strategies.
Staphylococcus aureus is recognized as one of the major human pathogens and is by far one of the most common nosocomial organisms. The genetic basis for the emergence of highly epidemic strains remains mysterious. Studying the microevolution of the different clones of S. aureus is essential for identifying the forces driving pathogen emergence and spread. The aim of the present study was to determine the genetic changes characterizing a lineage belonging to the South German clone (ST228) that spread over ten years in a tertiary care hospital in Switzerland. For this reason, we compared the whole genome of eight isolates recovered between 2001 and 2008 at the Lausanne hospital. The genetic comparison of these isolates revealed that their genomes are extremely closely related. Yet, a few more important genetic changes, such as the replacement of a plasmid, the loss of large fragments of DNA, or the insertion of transposases, were observed. These transfers of mobile genetic elements shaped the evolution of the ST228 lineage that spread within the Lausanne hospital. Nevertheless, although the strains analyzed differed in their dynamics, we have not been able to link a particular genetic element with spreading success. Finally, the present study showed that new sequencing technologies improve considerably the quality and quantity of information obtained for a single strain; but this information is still difficult to interpret and important investments are required for the technology to become accessible for routine investigations.
Serially-sampled nucleotide sequences can be used to infer the demographic history of evolving viral populations. The shape of a phylogenetic tree often reflects the interplay between evolutionary and ecological processes. Several approaches exist to analyze the topology and traits of a phylogenetic tree, by means of tree balance, branching patterns and comparative properties. The temporal clustering (TC) statistic is a new topological measure, based on ancestral character reconstruction, which characterizes the temporal structure of a phylogeny. PhyloTempo is the first implementation of the TC statistic in the R language, integrating several other topological measures in a user-friendly graphical framework. The comparison of the TC statistic with other measures provides multifaceted insights into the dynamic processes shaping the evolution of pathogenic viruses. The features and applicability of PhyloTempo were tested on serially-sampled intra-host human and simian immunodeficiency virus population data sets. PhyloTempo is distributed under the GNU general public license at https://sourceforge.net/projects/phylotempo/.
fast evolving viruses; longitudinal samples; phylogenetics; phylodynamics; comparative methods; clustering; software; positive selection; coalescence
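Tree balance, one of the topological properties mentioned in the PhyloTempo abstract, can be illustrated with the Colless index, which sums over all internal nodes the absolute difference between the numbers of tips in the two daughter subtrees. The sketch below is not part of PhyloTempo (which is written in R) and does not compute the TC statistic; it assumes a strictly binary tree encoded as nested 2-tuples with string leaves, a hypothetical representation chosen for brevity.

```python
def colless(tree):
    """Return (number_of_tips, Colless index) for a binary tree.

    `tree` is either a leaf (any non-tuple value, e.g. a taxon name) or a
    2-tuple `(left, right)` of subtrees. This encoding is an assumption
    made for the example.
    """
    if not isinstance(tree, tuple):
        return 1, 0
    left, right = tree
    n_left, i_left = colless(left)
    n_right, i_right = colless(right)
    # Each internal node contributes the imbalance between its two clades.
    return n_left + n_right, i_left + i_right + abs(n_left - n_right)
```

A fully balanced four-tip tree has an index of 0, whereas a ladder-like (caterpillar) four-tip tree reaches the maximum of 3; ladder-like shapes are often associated with serially sampled intra-host viral phylogenies.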
The introduction of non-native species into new habitats poses a major threat to native populations. Of particular interest, though often overlooked, are introductions of populations that are not fully reproductively isolated from native individuals and can hybridize with them. To address this important topic we used different approaches in a multi-pronged study, combining the effects of mate choice, shoaling behaviour and genetics. Here we present evidence that behavioural traits such as shoaling and mate choice can promote population mixing if individuals do not distinguish between native and foreign conspecifics. We examined this in the context of two guppy (Poecilia reticulata) populations that have been subject to an introduction and subsequent population mixing event in Trinidad. The introduction of Guanapo River guppies into the Turure River more than 50 years ago led to a marked reduction of the original genotype. In our experiments, female guppies did not distinguish between shoaling partners when given the choice between native and foreign individuals. Introduced fish are therefore likely to benefit from the protection of a shoal and will improve their survival chances as a result. The additional finding that male guppies do not discriminate between females on the basis of origin will further increase the process of population mixing, especially if males encounter mixed shoals. In a mesocosm experiment, in which the native and foreign populations were allowed to mate freely, we found, as expected on the basis of these behavioural interactions, that the distribution of offspring genotypes could be predicted from the proportions of the two types of founding fish. This result suggests that stochastic and environmental processes have reinforced the biological ones to bring about the genetic dominance of the invading population in the Turure River. 
Re-sampling the Turure for genetic analysis using SNP markers confirmed that population mixing is ongoing in this river and has led to the nearly complete disappearance of the original genotype.
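The mesocosm result, that offspring genotypes follow from founder proportions, can be illustrated with a minimal sketch. It assumes fully random mating and that the native proportion is the same among founding males and females; the abstract does not state how the prediction was actually made, so the function names, the three offspring classes, and the goodness-of-fit statistic below are illustrative assumptions rather than the authors' method.

```python
def expected_offspring(p_native):
    """Expected offspring class proportions under random mating.

    p_native: proportion of native fish among the founders (assumed
    identical for both sexes; this is an illustrative assumption).
    """
    if not 0.0 <= p_native <= 1.0:
        raise ValueError("p_native must be between 0 and 1")
    q = 1.0 - p_native  # proportion of foreign founders
    return {
        "native x native": p_native * p_native,
        "mixed": 2.0 * p_native * q,
        "foreign x foreign": q * q,
    }


def chi_square(observed, expected_props):
    """Pearson chi-square statistic comparing observed counts with
    the counts implied by the expected proportions."""
    total = sum(observed.values())
    stat = 0.0
    for key, prop in expected_props.items():
        expected_count = prop * total
        if expected_count > 0:
            stat += (observed.get(key, 0) - expected_count) ** 2 / expected_count
    return stat
```

A small statistic means the observed offspring counts are close to what the founder proportions alone would predict; judging significance would additionally require the appropriate degrees of freedom.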
Hepatitis B virus genotype D can be found in many parts of the world and is the most prevalent strain in south-eastern Europe, the Mediterranean Basin, the Middle East, and the Indian sub-continent. The epidemiological history of the D genotype and its subgenotypes is still obscure because of the scarcity of appropriate studies. We retrieved from public databases a total of 312 gene P sequences of HBV genotype D isolated in various countries throughout the world, and reconstructed the spatio-temporal evolutionary dynamics of the HBV-D epidemic using a Bayesian framework.
The phylogeographical analysis showed that India had the highest posterior probability of being the location of the tree root, whereas central Asia was the most probable location of the common ancestor of subgenotypes D1–D3. HBV-D5 (identified in native Indian populations) diverged from the tree root earlier than D1–D3. The time to the most recent common ancestor (tMRCA) at the tree root was 128 years ago, which suggests that the common ancestor of the currently circulating subgenotypes existed in the second half of the 19th century. The mean tMRCA of subgenotypes D1–D3 was between the 1940s and the 1950–60s. On the basis of our phylogeographic reconstruction, it seems that HBV-D reached the Mediterranean area in the middle of the 20th century by means of at least two routes: the first pathway (mainly due to the spread of subgenotype D1) crossed the Middle East and reached north Africa and the eastern Mediterranean, and the second pathway (closely associated with D2) crossed the former Soviet Union and reached eastern Europe and the Mediterranean through Albania. We hypothesise that the main routes of dispersion of genotype D were unsafe injections and drug addiction.
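The step from "128 years ago" to a calendar period is simple arithmetic relative to the date of the most recent sample. The sketch below makes that conversion explicit; the sampling year used in the usage comment is a hypothetical placeholder, since the abstract does not report it, and both function names are illustrative.

```python
def years_ago_to_calendar(years_ago, latest_sample_year):
    """Convert a tMRCA expressed in years before the most recent
    sample into a calendar year."""
    return latest_sample_year - years_ago


def century(year):
    """Return the century (e.g. 19 for 1801-1900) of a calendar year."""
    return (year - 1) // 100 + 1


# Hypothetical usage: the sampling year 2007 is an assumption,
# not a value taken from the abstract.
root_year = years_ago_to_calendar(128, 2007)
root_century = century(root_year)
```

Any most-recent sampling year between 1979 and 2028 would place a 128-year-old root in the second half of the 19th century, which is consistent with the interpretation given in the abstract.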
Staphylococcus aureus is a common cause of infections that has undergone rapid global spread over recent decades. Formal phylogeographic methods have not yet been applied to the molecular epidemiology of bacterial pathogens because the limited genetic diversity of data sets based on individual genes usually results in poor phylogenetic resolution. Here, we investigated a whole-genome single nucleotide polymorphism (SNP) data set of health care-associated methicillin-resistant S. aureus sequence type 239 (HA-MRSA ST239) strains, which we analyzed using Markov spatial models that incorporate geographical sampling distributions. The reconstructed timescale indicated a temporal origin of this strain shortly after the introduction of methicillin, followed by global pandemic spread. The estimate of the temporal origin was robust to the molecular clock, coalescent prior, full/intergenic/synonymous SNP inclusion, and correction for excluded invariant site patterns. Finally, phylogeographic analyses statistically supported the role of human movement in the global dissemination of HA-MRSA ST239, although they were unable to conclusively resolve the location of the root. This study demonstrates that bacterial genomes can indeed contain sufficient evolutionary information to elucidate the temporal and spatial dynamics of transmission. Future applications of this approach to other bacterial strains may provide valuable epidemiological insights that could justify the cost of genome-wide typing.
Bayesian inference; phylogeography; phylogenetics; measurably evolving population
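To give a concrete sense of what a Markov spatial model over sampling locations involves, the sketch below computes transition probabilities for the simplest possible case: a continuous-time Markov chain in which every pair of distinct locations exchanges lineages at the same rate. This symmetric, equal-rate model is an assumption chosen because it has a closed-form solution; the models used in the S. aureus study above are richer and are not reproduced here, and the function name and parameters are illustrative.

```python
import math


def symmetric_transition_matrix(n_locations, rate, t):
    """Transition probabilities P(t) for an equal-rate, symmetric
    continuous-time Markov chain over discrete locations.

    rate: instantaneous rate of moving from one location to any
          specific other location (same for all pairs; an assumption).
    t:    elapsed time along a branch, in the same units as 1/rate.
    """
    if n_locations < 1 or rate < 0 or t < 0:
        raise ValueError("n_locations must be >= 1; rate and t must be >= 0")
    k = n_locations
    decay = math.exp(-k * rate * t)
    stay = 1.0 / k + (k - 1.0) / k * decay  # probability of ending where it started
    move = 1.0 / k - decay / k              # probability of ending in a given other location
    return [[stay if i == j else move for j in range(k)] for i in range(k)]
```

At `t = 0` the matrix is the identity, and as `t` grows every entry approaches `1 / n_locations`, meaning that location information about the ancestor is gradually lost along long branches. This loss of signal is one general reason why a root location can be hard to resolve.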