Chronic hepatitis C virus (HCV) infection can lead to liver cirrhosis in up to 20% of individuals, often requiring liver transplantation. Although the new liver is known to be rapidly reinfected, the dynamics and source of the reinfecting virus(es) are unclear, resulting in some confusion concerning the relationship between clinical outcome and viral characteristics. To clarify the dynamics of liver reinfection, longitudinal serum viral samples from 10 transplant patients were studied. Part of the E1/E2 region was sequenced, and advanced phylogenetic analysis methods were used in a multiparameter analysis to determine the history and ancestry of reinfecting lineages. Our results demonstrated the complexity of HCV evolutionary dynamics after liver transplantation, in which a large diverse population of viruses is transmitted and maintained for months to years. As many as 30 independent lineages in a single patient were found to reinfect the new liver. Several later posttransplant lineages were more closely related to older pretransplant viruses than to viruses detected immediately after transplantation. Although our data are consistent with a number of interpretations, the persistence of high viral genetic variation over long periods of time requires an active mechanism. We discuss possible scenarios, including frequency-dependent selection or variation in selective pressure among viral subpopulations, i.e., the population structure. The latter hypothesis, if correct, could have relevance to the success of newer direct-acting antiviral therapies.
Reconstructing the transmission history of infectious diseases in the absence of medical or epidemiological records often relies on the evolutionary analysis of pathogen genetic sequences. The precision of evolutionary estimates of epidemic history can be increased by the inclusion of sequences derived from ‘archived’ samples that are genetically distinct from contemporary strains. Historical sequences are especially valuable for viral pathogens that circulated for many years before being formally identified, including HIV and the hepatitis C virus (HCV). However, surprisingly few HCV isolates sampled before discovery of the virus in 1989 are currently available. Here, we report and analyse two HCV subgenomic sequences obtained from infected individuals in 1953, which represent the oldest genetic evidence of HCV infection. The pairwise genetic diversity between the two sequences indicates a substantial period of HCV transmission prior to the 1950s, and their inclusion in evolutionary analyses provides new estimates of the common ancestor of HCV in the USA. To explore and validate the evolutionary information provided by these sequences, we used a new phylogenetic molecular clock method to estimate the date of sampling of the archived strains, plus the dates of four more contemporary reference genomes. Despite the short fragments available, we conclude that the archived sequences are consistent with a proposed sampling date of 1953, although statistical uncertainty is large. Our cross-validation analyses suggest that the bias and low statistical power observed here likely arise from a combination of high evolutionary rate heterogeneity and an unstructured, star-like phylogeny. We expect that attempts to date other historical viruses under similar circumstances will meet similar problems.
molecular epidemiology; phylogenetics
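The pairwise genetic diversity mentioned above can be illustrated with an uncorrected p-distance, the proportion of compared sites at which two aligned sequences differ. This is only a minimal sketch: the function name `p_distance` and the toy sequences are hypothetical, and the study itself may have used a model-corrected distance instead.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy aligned fragments (hypothetical, not data from the study).
toy_a = "ACGTACGT"
toy_b = "ACGTACGA"
print(p_distance(toy_a, toy_b))
```

A larger p-distance between two samples taken in the same year implies a longer period of independent evolution before sampling, which is the reasoning behind the claim about transmission prior to the 1950s.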
Infection of CD8-depleted rhesus macaques with the genetically heterogeneous simian immunodeficiency virus (SIV)mac251 viral swarm provides a rapid-disease model for simian acquired immune deficiency syndrome and SIV-encephalitis (SIVE). The objective was to evaluate how the diversity of the swarm influences the initial seeding of the infection that may potentially affect disease progression. Plasma, lymphoid and non-lymphoid (brain and lung) tissues were collected from two infected macaques euthanized at 21 days post-infection (p.i.), as well as longitudinal specimens and post-mortem tissues from four macaques followed throughout the infection. About 1300 gp120 viral sequences were obtained from the infecting SIVmac251 swarm and the macaques' longitudinal and post-mortem samples. Phylogenetic and amino acid signature pattern analyses were carried out to assess frequency, transmission dynamics and persistence of specific viral clusters. Although no significant reduction in viral heterogeneity was found early in infection (21 days p.i.), transmission and replication of SIV variants were not entirely random. In particular, two distinct motifs under-represented (<4 %) in the infecting swarm were found at high frequencies (up to 14 %) in all six macaques as early as 21 days p.i. Moreover, a macrophage tropic variant not detected in the viral swarm (<0.3 %) was present at high frequency (29–100 %) in sequences derived from the brain of two macaques with meningitis or severe SIVE. This study demonstrates the highly efficient transmission and persistence in vivo of multiple low frequency SIVmac251 founder variants, characterized by specific gp120 motifs that may be linked to pathogenesis in the rapid-disease model of neuroAIDS.
Infecting rhesus macaques (Macaca mulatta) with the simian immunodeficiency virus (SIV) is an established animal model of human immunodeficiency virus (HIV) pathogenesis. Many studies have used various derivatives of the SIVmac251 viral swarm to investigate several aspects of the disease, including transmission, progression, response to vaccination, and SIV/HIV-associated neurological disorders. However, the lack of standardization of the infecting inoculum complicates comparative analyses. We investigated the genetic diversity and phylogenetic relationships of the 1991 animal-titered SIVmac251 swarm, the peripheral blood mononuclear cell (PBMC) passaged SIVmac251, and additional SIVmac251 sequences derived over the past 20 years. Significant sequence divergence and diversity were evident among the different viral sources. This finding highlights the importance of characterizing the exact source and genetic makeup of the infecting inoculum to achieve controlled experimental conditions and enable meaningful comparisons across studies.
HIV-1 CRF02_AG accounts for >50% of infected individuals in Cameroon. CRF02_AG prevalence has been increasing both in Africa and Europe, particularly in Italy because of migrations from the sub-Saharan region. This study investigated the molecular epidemiology of CRF02_AG in Cameroon by employing Bayesian phylodynamics and analyzed the relationship between HIV-1 CRF02_AG isolates circulating in Italy and those prevalent in Africa to understand the link between the two epidemics. Among 291 Cameroonian reverse transcriptase sequences analyzed, about 70% clustered within three distinct clades, two of which shared a most recent common ancestor, all related to sequences from Western Africa. The major Cameroonian clades emerged during the mid-1970s and slowly spread during the next 30 years. Little or no geographic structure was detected within these clades. One of the major driving forces of the epidemic was likely the high accessibility between locations in Southern Cameroon contributing to the mobility of the population. The remaining Cameroonian sequences and the new strains isolated from Italian patients were interspersed mainly within West and Central African sequences in the tree, indicating a continuous exchange of CRF02_AG viral strains between Cameroon and other African countries, as well as multiple independent introductions in the Italian population. The evaluation of the spread of CRF02_AG may provide significant insight into the future dynamics of the Italian and European epidemics.
Serially-sampled nucleotide sequences can be used to infer the demographic history of evolving viral populations. The shape of a phylogenetic tree often reflects the interplay between evolutionary and ecological processes. Several approaches exist to analyze the topology and traits of a phylogenetic tree, by means of tree balance, branching patterns and comparative properties. The temporal clustering (TC) statistic is a new topological measure, based on ancestral character reconstruction, which characterizes the temporal structure of a phylogeny. Here, we present PhyloTempo, the first implementation of the TC statistic in the R language, which integrates several other topological measures in a user-friendly graphical framework. The comparison of the TC statistic with other measures provides multifaceted insights into the dynamic processes shaping the evolution of pathogenic viruses. The features and applicability of PhyloTempo were tested on serially-sampled intra-host human and simian immunodeficiency virus population data sets. PhyloTempo is distributed under the GNU General Public License at https://sourceforge.net/projects/phylotempo/.
fast evolving viruses; longitudinal samples; phylogenetics; phylodynamics; comparative methods; clustering; software; positive selection; coalescence
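Tree balance, one of the topological approaches named above, can be illustrated with the Colless imbalance index: the sum, over all internal nodes of a rooted binary tree, of the absolute difference in leaf counts between the two child subtrees. The sketch below is only an illustration in Python. It does not reproduce the TC statistic or PhyloTempo's R interface, and the nested-tuple tree encoding and the name `colless_index` are assumptions made for this example.

```python
def colless_index(tree):
    """Return (leaf_count, Colless imbalance) for a rooted binary tree.

    The tree is encoded as nested 2-tuples; anything that is not a tuple
    is treated as a leaf label (a hypothetical encoding for this sketch).
    """
    def walk(node):
        if not isinstance(node, tuple):
            return 1, 0  # a leaf contributes one tip and no imbalance
        if len(node) != 2:
            raise ValueError("tree must be strictly binary")
        left_leaves, left_score = walk(node[0])
        right_leaves, right_score = walk(node[1])
        here = abs(left_leaves - right_leaves)
        return left_leaves + right_leaves, left_score + right_score + here

    return walk(tree)


# A perfectly balanced tree and a ladder-like ("caterpillar") tree on four tips.
balanced = (("A", "B"), ("C", "D"))
ladder = ((("A", "B"), "C"), "D")
print(colless_index(balanced))
print(colless_index(ladder))
```

Ladder-like trees score high and balanced trees score zero, which is why balance indices are often read alongside temporal measures when interpreting serially sampled intra-host phylogenies.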
Staphylococcus aureus is a common cause of infections that has undergone rapid global spread over recent decades. Formal phylogeographic methods have not yet been applied to the molecular epidemiology of bacterial pathogens because the limited genetic diversity of data sets based on individual genes usually results in poor phylogenetic resolution. Here, we investigated a whole-genome single nucleotide polymorphism (SNP) data set of health care-associated Methicillin-resistant S. aureus sequence type 239 (HA-MRSA ST239) strains, which we analyzed using Markov spatial models that incorporate geographical sampling distributions. The reconstructed timescale indicated a temporal origin of this strain shortly after the introduction of Methicillin, followed by global pandemic spread. The estimate of the temporal origin was robust to the molecular clock, coalescent prior, full/intergenic/synonymous SNP inclusion, and correction for excluded invariant site patterns. Finally, phylogeographic analyses statistically supported the role of human movement in the global dissemination of HA-MRSA ST239, although they were unable to conclusively resolve the location of the root. This study demonstrates that bacterial genomes can indeed contain sufficient evolutionary information to elucidate the temporal and spatial dynamics of transmission. Future applications of this approach to other bacterial strains may provide valuable epidemiological insights that may justify the cost of genome-wide typing.
Bayesian inference; phylogeography; phylogenetics; measurably evolving population
Little is known about HIV-1 subtype distribution in Morocco. Some data suggest an emergence of new HIV subtypes. We conducted phylogenetic analysis on a nationally representative sample of 60 HIV-1 viral specimens collected during 2004-2005 through the Morocco national HIV sentinel surveillance survey.
While subtype B is still the most prevalent, 23.3% of samples represented non-B subtypes, the majority of which were classified as CRF02_AG (15%). Molecular clock analysis confirmed that the initial introduction of HIV-1B in Morocco probably came from Europe in the early 1980s. In contrast, the CRF02_AG strain appeared to be introduced from sub-Saharan Africa in two separate events in the 1990s.
Subtype CRF02_AG has been emerging in Morocco since the 1990s. More information about the factors influencing HIV subtype-specific transmission will inform the prevention strategy in the region.
HIV-1; subtypes; phylogeny; Morocco
Ethanol is metabolized by two rate limiting reactions: alcohol dehydrogenases (ADH) convert ethanol to acetaldehyde, subsequently metabolized to acetate by aldehyde dehydrogenases (ALDH). Approximately 50% of East Asians have genetic variants that significantly impair this pathway and influence alcohol dependence (AD) vulnerability. We investigated whether variation in alcohol metabolism genes might alter the AD risk in four non-East Asian populations by performing systematic haplotype association analyses in order to maximize the chances of capturing functional variation.
Haplotype-tagging SNPs were genotyped using the Illumina GoldenGate platform. Genotypes were available for 40 SNPs across the ADH gene cluster and 24 SNPs across the two ALDH genes in four diverse samples that included cases (lifetime AD) and controls (no Axis 1 disorders). The sample sizes (cases, controls) were: Finnish Caucasians: 232, 194; African Americans: 267, 422; Plains American Indians: 226, 110; Southwestern American (SW) Indians: 317, 72.
In all four populations, as well as HapMap populations, five haplotype blocks were identified across the ADH gene cluster: (1) ADH5-ADH4; (2) ADH6-ADH1A-ADH1B; (3) ADH1C; (4) intergenic; (5) ADH7. The ALDH1A1 gene was defined by four blocks and ALDH2 by one block. No haplotype or SNP association results were significant after correction for multiple comparisons; however several results, particularly for ALDH1A1 and ADH4, replicated earlier findings. There was an ALDH1A1 block 1 and 2 (extending from intron 5 to the 3′ UTR) yin yang haplotype (haplotypes that have opposite allelic configuration) association with AD in the Finns driven by SNPs rs3764435 and rs2303317 respectively, and an ALDH1A1 block 3 (including the promoter region) yin yang haplotype association in SW Indians driven by 5 SNPs, all in allelic identity. The ADH4 SNP rs3762894 was associated with AD in Plains Indians.
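To make the phrase "correction for multiple comparisons" concrete, the sketch below applies a Bonferroni correction, which divides the significance level by the number of tests. This is an assumption for illustration only: the abstract does not state which correction was used or how many tests were counted. The test count of 64 simply adds the 40 ADH and 24 ALDH SNPs, and the p-values are invented placeholders, not study results.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_after_correction(p_values: dict, alpha: float = 0.05) -> list:
    """Names of tests whose p-value passes the Bonferroni threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [name for name, p in p_values.items() if p <= threshold]


# Threshold if each of the 40 + 24 genotyped SNPs counted as one test
# (an assumption; haplotype tests would add to this count).
n_snps = 40 + 24
print(bonferroni_threshold(0.05, n_snps))

# Placeholder p-values with hypothetical names, purely to show the call.
toy_p_values = {"snp_a": 0.0005, "snp_b": 0.01, "snp_c": 0.2}
print(significant_after_correction(toy_p_values))
```

With dozens of correlated markers, a nominal p-value of a few percent does not survive this kind of threshold, which is consistent with associations that replicate earlier findings yet fall short of corrected significance.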
The systematic evaluation of alcohol metabolizing genes in four non-East Asian populations has shown only modest associations with AD, largely for ALDH1A1 and ADH4. A concentration of signals for AD with ALDH1A1 yin yang haplotypes in several populations warrants further study.
Alcohol dependence; alcohol dehydrogenases (ADH); aldehyde dehydrogenases (ALDH); haplotype association; ALDH1A1
The origin and evolution of HIV-1 in breast milk is unclear, despite the continuing significance of this tissue as a transmitting compartment. To elucidate the evolutionary trajectory of viral populations in a transient mucosal compartment, longitudinal sequences of the envelope gp120 region from plasma and breast milk spanning the first year after delivery were analyzed in six women infected by HIV-1 subtype C.
Multiple phylogenetic algorithms were used to elucidate the evolutionary history and spatial structure of virus populations between tissues.
Overall, persistent mixing of viral sequences between plasma and breast milk indicated that breast milk is not a distinct genetic viral compartment. Unexpectedly, longitudinal phylogenies showed multiple lineages defined by long branches that included virus from both the breast milk and the plasma. Plasma was unlikely to be the anatomical origin of the most recent common ancestor (MRCA) in at least three of the subjects, while in other women, the temporal origin of the MRCA of the viral populations following delivery occurred well before the onset of breast milk production.
These findings suggest that during pregnancy/lactation, a viral variant distinct from the plasma virus initially seeds the breast milk, followed by subsequent gene flow between the plasma and breast milk tissues. This study indicates the potential for reactivation or re-introduction of distinct lineages during major immunological disruptions during the course of natural infection.
HIV-1; breast milk; evolution; phylogeny; compartment; reservoir
Brain infection by the human immunodeficiency virus type 1 (HIV-1) has been investigated in many reports with a variety of conclusions concerning the time of entry and degree of viral compartmentalization. To address these diverse findings, we sequenced HIV-1 gp120 clones from a wide range of brain, peripheral and meningeal tissues from five patients who died from several HIV-1-associated disease pathologies. High-resolution phylogenetic analysis confirmed previous studies that showed a significant degree of compartmentalization in brain and peripheral tissue subpopulations. Some intermixing between the HIV-1 subpopulations was evident, especially in patients who died from pathologies other than HIV-associated dementia. Interestingly, the major tissue harboring virus from both the brain and peripheral tissues was the meninges. These results show that 1) HIV-1 is clearly capable of migrating out of the brain, 2) the meninges are the most likely primary transport tissues, and 3) infected brain macrophages comprise an important HIV reservoir during highly active antiretroviral therapy.
HIV; Brain; Dementia; Meninges; Viral migration; Macrophages; HIV-associated disease pathologies
Hepatitis C virus (HCV) is a rapidly-evolving RNA virus that establishes chronic infections in humans. Despite the virus' public health importance and a wealth of sequence data, basic aspects of HCV molecular evolution remain poorly understood. Here we investigate three sets of whole HCV genomes in order to directly compare the evolution of whole HCV genomes at different biological levels: within- and among-hosts. We use a powerful Bayesian inference framework that incorporates both among-lineage rate heterogeneity and phylogenetic uncertainty into estimates of evolutionary parameters.
Most of the HCV genome evolves at ~0.001 substitutions/site/year, a rate typical of RNA viruses. The antigenically-important E1/E2 genome region evolves particularly quickly, with correspondingly high rates of positive selection, as inferred using two related measures. Crucially, in this region a markedly higher rate was observed for within-host evolution compared to among-host evolution. Conversely, higher rates of evolution were seen among-hosts for functionally relevant parts of the NS5A gene. There was also evidence for a slightly higher evolutionary rate for HCV subtype 1a compared to subtype 1b.
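As a rough worked example of what a rate of ~0.001 substitutions/site/year means in practice, the sketch below converts it into expected substitution counts. It assumes a strict clock with no saturation, whereas the analysis above uses a relaxed clock, and it assumes an approximate HCV genome length of 9,600 nucleotides. The function names are hypothetical.

```python
def expected_substitutions(rate: float, n_sites: int, years: float) -> float:
    """Expected substitutions along one lineage under a strict clock."""
    if rate < 0 or n_sites < 0 or years < 0:
        raise ValueError("arguments must be non-negative")
    return rate * n_sites * years


def expected_pairwise_divergence(rate: float, years_since_split: float) -> float:
    """Expected substitutions per site between two lineages that split
    years_since_split years ago (both branches accumulate change)."""
    return 2.0 * rate * years_since_split


GENOME_LENGTH = 9600  # assumed approximate HCV genome length in nucleotides
RATE = 0.001          # substitutions/site/year, as quoted in the text

print(expected_substitutions(RATE, GENOME_LENGTH, 1.0))
print(expected_pairwise_divergence(RATE, 10.0))
```

Under these assumptions a single lineage accumulates on the order of ten substitutions per genome per year, which helps explain why serially sampled HCV genomes carry enough signal to estimate rates at both within-host and among-host scales.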
Using new statistical methods and comparable whole genome datasets we have quantified, for the first time, the variation in HCV evolutionary dynamics at different scales of organisation. This confirms that differences in molecular evolution between biological scales are not restricted to HIV and may represent a common feature of chronic RNA viral infection. We conclude that the elevated rate observed in the E1/E2 region during within-host evolution more likely results from the reversion of host-specific adaptations (resulting in slower long-term among-host evolution) than from the preferential transmission of slowly-evolving lineages.
hepatitis C; substitution rate; virus evolution; Bayesian phylogenetics; molecular clock; relaxed clock; adaptation
Despite highly active antiretroviral therapy (HAART), AIDS related lymphoma (ARL) occurs at a significantly higher rate in patients infected with the Human Immunodeficiency Virus (HIV) than in the general population. HIV-infected macrophages are a known viral reservoir and have been shown to have lymphomagenic potential in SCID mice; therefore, there is an interest in determining if a viral component to lymphomagenesis also exists. We sequenced HIV-1 envelope gp120 clones obtained post mortem from several tumor and non-tumor tissues of two patients who died with AIDS-related Non-Hodgkin's lymphoma (ARL-NH). Similar results were found in both patients: 1) high-resolution phylogenetic analysis showed a significant degree of compartmentalization between lymphoma and non-lymphoma viral sub-populations while viral sub-populations from lymph nodes appeared to be intermixed within sequences from tumor and non-tumor tissues, 2) a 100-fold increase in the effective HIV population size in tumor versus non-tumor tissues was associated with the emergence of lymphadenopathy and aggressive metastatic ARL, and 3) HIV gene flow among lymph nodes, normal and metastatic tissues was non-random. The different population dynamics between the viruses found in tumors versus the non-tumor associated viruses suggest that there is a significant relationship between HIV evolution and lymphoma pathogenesis. Moreover, the study indicates that HIV could be used as an effective marker to study the origin and dissemination of lymphomas in vivo.
Since recombination leads to the generation of mosaic genomes that violate the assumption of traditional phylogenetic methods that sequence evolution can be accurately described by a single tree, results and conclusions based on phylogenetic analysis of data sets including recombinant sequences can be severely misleading. Many methods are able to adequately detect recombination between diverse sequences, for example between different HIV-1 subtypes. More problematic is the identification of recombinants among closely related sequences such as a viral population within a host. We describe a simple algorithmic procedure that enables detection of intra-host recombinants based on split-decomposition networks and a robust statistical test for recombination. By applying this algorithm to several published HIV-1 datasets we conclude that intra-host recombination was significantly underestimated in previous studies and that up to one-third of the env sequences longitudinally sampled from a given subject can be of recombinant origin. The results show that our procedure can be a valuable exploratory tool for detection of recombinant sequences before phylogenetic analysis, and also suggest that HIV-1 recombination in vivo is far more frequent and significant than previously thought.
We sought to investigate the evolutionary and historical reasons for the different epidemiological patterns of HIV-1 in the early epidemic. In order to characterize the demographic history of HIV-1 subtypes A and D in east Africa, we examined molecular epidemiological, geographical and historical data.
We employed high-resolution phylodynamics to investigate the introduction of HIV-1A and D into east Africa, the geographic trends of viral spread, and the demographic growth of each subtype. We also used geographic information system data to investigate human migration trends, population growth, and human mobility.
HIV-1A and D were introduced into east Africa after 1950 and spread exponentially during the 1970s, concurrent with eastward expansion. Spatiotemporal data failed to explain the establishment and spread of HIV based on urban population growth and migration. The low prevalence of the virus in the Democratic Republic of Congo before and after the emergence of the pandemic was, however, consistent with regional accessibility data, highlighting the difficulty in travel between major population centers in central Africa. In contrast, the strong interconnectivity between population centers across the east African region since colonial times has likely fostered the rapid growth of the epidemic in this locale.
This study illustrates how phylodynamic analysis of pathogens informed by geospatial data can provide a more holistic and evidence-based interpretation of past epidemics. We advocate that this ‘landscape phylodynamics’ approach has the potential to provide a framework both to understand epidemics' spread and to design optimal intervention strategies.
Africa; epidemic; evolution; geographic information system; HIV-1; phylodynamics
Human T-lymphotropic virus type 4 (HTLV-4) is a new deltaretrovirus recently identified in a primate hunter in Cameroon. Limited sequence analysis previously showed that HTLV-4 may be distinct from HTLV-1, HTLV-2, and HTLV-3, and their simian counterparts, STLV-1, STLV-2, and STLV-3, respectively. Analysis of full-length genomes can provide basic information on the evolutionary history and replication and pathogenic potential of new viruses.
We report here the first complete HTLV-4 sequence obtained by PCR-based genome walking using uncultured peripheral blood lymphocyte DNA from an HTLV-4-infected person. The HTLV-4(1863LE) genome is 8791-bp long and is equidistant from HTLV-1, HTLV-2, and HTLV-3 sharing only 62–71% nucleotide identity. HTLV-4 has a prototypic genomic structure with all enzymatic, regulatory, and structural proteins preserved. Like STLV-2, STLV-3, and HTLV-3, HTLV-4 is missing a third 21-bp transcription element found in the long terminal repeats of HTLV-1 and HTLV-2 but instead contains unique c-Myb and pre B-cell leukemic transcription factor binding sites. Like HTLV-2, the PDZ motif important for cellular signal transduction and transformation in HTLV-1 and HTLV-3 is missing in the C-terminus of the HTLV-4 Tax protein. A basic leucine zipper (b-ZIP) region located in the antisense strand of HTLV-1 and believed to play a role in viral replication and oncogenesis, was also found in the complementary strand of HTLV-4. Detailed phylogenetic analysis shows that HTLV-4 is clearly a monophyletic viral group. Dating using a relaxed molecular clock inferred that the most recent common ancestor of HTLV-4 and HTLV-2/STLV-2 occurred 49,800 to 378,000 years ago making this the oldest known PTLV lineage. Interestingly, this period coincides with the emergence of Homo sapiens sapiens during the Middle Pleistocene suggesting that early humans may have been susceptible hosts for the ancestral HTLV-4.
The inferred ancient origin of HTLV-4 coinciding with the appearance of Homo sapiens, the propensity of STLVs to cross species into humans, and the fact that HTLV-1 and -2 spread globally following migrations of ancient populations all suggest that HTLV-4 may be prevalent. Expanded surveillance and clinical studies are needed to better define the epidemiology and public health importance of HTLV-4 infection.
During HIV-1 infection, coreceptor switch from CCR5 (R5)- to CXCR4 (X4)-using viruses is associated with disease progression. X4 strains of HIV-1 are highly cytopathic to immature thymocytes. Virtually no studies have evaluated the HIV-1 quasispecies present in vivo within thymic and lymphoid tissues or the evolutionary relationship between R5 and X4 viruses in tissues and peripheral blood.
High-resolution phylodynamic analysis was applied to virus envelope quasispecies in longitudinal peripheral blood mononuclear cells (PBMCs) and lymphoid and non-lymphoid tissues collected post mortem from therapy-naïve children with AIDS. There were three major findings. First, continued evolution of R5 viruses in PBMCs, spleen and lymph nodes involved multiple bottlenecks, independent of coreceptor switch, resulting in fitter quasispecies driven by positive selection. Second, evolution of X4 strains appeared to be a sequential process requiring the initial fixation of positively selected mutations in V1-V2 and C2 domains of R5 variants before the emergence of high-charge V3 X4 variants. Third, R5 viruses persisted after the emergence of CXCR4-using strains, which were found predominantly but not exclusively in the thymus.
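The notion of a "high charge" V3 variant can be made concrete with a simple net-charge count over the V3 amino acid sequence. The sketch below counts lysine and arginine as +1 and aspartate and glutamate as -1. Ignoring histidine is an assumption of this example, the function name `v3_net_charge` is hypothetical, and the peptides are toy strings rather than real V3 loops. The study's own charge calculation may have differed.

```python
POSITIVE = set("KR")  # lysine, arginine
NEGATIVE = set("DE")  # aspartate, glutamate


def v3_net_charge(v3_amino_acids: str) -> int:
    """Net charge of a peptide: (K + R) minus (D + E).

    Histidine and terminal charges are ignored in this simplified count.
    """
    seq = v3_amino_acids.upper()
    positives = sum(1 for aa in seq if aa in POSITIVE)
    negatives = sum(1 for aa in seq if aa in NEGATIVE)
    return positives - negatives


# Toy peptides (hypothetical), one more basic than the other.
lower_charge = "CTRPNNNTRKSIHIGPGQAFYATGDIIGDIRQAHC".replace("R", "S", 1)
higher_charge = lower_charge.replace("S", "R", 1).replace("N", "K", 1)
print(v3_net_charge(lower_charge), v3_net_charge(higher_charge))
```

Comparing such counts across sequences sampled over time is one simple way to track the kind of stepwise shift toward more basic V3 loops that the second finding describes.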
Our data indicate that the evolution of X4 strains is a multi-step, temporally structured process and that the thymus may play an important role in the evolution/amplification of coreceptor variants. Development of new therapeutic protocols targeting virus in the thymus could be important to control HIV-1 infection prior to advanced disease.