In 49 patients with known Ebola virus disease outcomes during the ongoing outbreak in Sierra Leone, 13 were coinfected with the immunomodulatory pegivirus GB virus C (GBV-C). Fifty-three percent of these GBV-C+ patients survived; in contrast, only 22% of GBV-C− patients survived. Both survival and GBV-C status were associated with age, with older patients having lower survival rates and intermediate-age patients (21 to 45 years) having the highest rate of GBV-C infection. Understanding the separate and combined effects of GBV-C and age on Ebola virus survival may lead to new treatment and prevention strategies, perhaps through age-related pathways of immune activation.
Lassa fever (LF) is a severe viral hemorrhagic fever caused by Lassa virus (LASV). The LF program at the Kenema Government Hospital (KGH) in Eastern Sierra Leone currently provides diagnostic services and clinical care for more than 500 suspected LF cases per year. Nearly two-thirds of suspected LF patients presenting to the LF Ward test negative for either LASV antigen or anti-LASV immunoglobulin M (IgM), and therefore are considered to have a non-Lassa febrile illness (NLFI). The NLFI patients in this study were generally severely ill, which accounts for their high case fatality rate of 36%. The current studies were aimed at determining possible causes of severe febrile illnesses in non-LF cases presenting to the KGH, including possible involvement of filoviruses. A seroprevalence survey employing commercial enzyme-linked immunosorbent assay tests revealed significant IgM and IgG reactivity against dengue virus, chikungunya virus, West Nile virus (WNV), Leptospira, and typhus. A polymerase chain reaction–based survey using sera from subjects with acute LF, evidence of prior LASV exposure, or NLFI revealed widespread infection with Plasmodium falciparum malaria in febrile patients. WNV RNA was detected in a subset of patients, and a 419 nt amplicon specific to filoviral L segment RNA was detected at low levels in a single patient. However, 22% of the patients presenting at the KGH between 2011 and 2014 who were included in this survey registered anti-Ebola virus (EBOV) IgG or IgM, suggesting prior exposure to this agent. The 2014 Ebola virus disease (EVD) outbreak is already the deadliest and most widely dispersed outbreak of its kind on record. The serological evidence reported here for possible human exposure to filoviruses in Sierra Leone prior to the current EVD outbreak supports genetic analyses suggesting that EBOV may have been present in West Africa for some time prior to the 2014 outbreak.
Motivation: Efficient simulation of population genetic samples under a given demographic model is a prerequisite for many analyses. Coalescent theory provides an efficient framework for such simulations, but simulating longer regions and higher recombination rates remains challenging. Simulators based on a Markovian approximation to the coalescent scale well, but do not support simulation of selection. Gene conversion is not supported by any published coalescent simulators that support selection.
Results: We describe cosi2, an efficient simulator that supports both exact and approximate coalescent simulation with positive selection. cosi2 improves on the speed of existing exact simulators, and permits further speedup in approximate mode while retaining support for selection. cosi2 supports a wide range of demographic scenarios, including recombination hot spots, gene conversion, population size changes, population structure and migration.
cosi2 implements coalescent machinery efficiently by tracking only a small subset of the Ancestral Recombination Graph, sampling only relevant recombination events, and using augmented skip lists to represent tracked genetic segments. To preserve support for selection in approximate mode, the Markov approximation is implemented not by moving along the chromosome but by performing a standard backwards-in-time coalescent simulation while restricting coalescence to node pairs with overlapping or near-overlapping genetic material. We describe the algorithms used by cosi2 and present comparisons with existing selection simulators.
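The approximate-mode rule described above, allowing coalescence only between nodes whose genetic material overlaps or nearly overlaps, can be illustrated with a minimal Python sketch. This is not cosi2's implementation: cosi2 is written in C++ and uses augmented skip lists, whereas this sketch uses plain lists of intervals, and the names `min_gap`, `may_coalesce`, and the `max_gap` threshold are hypothetical choices made for illustration.

```python
from typing import List, Tuple

# Assumption: a node's tracked material is a list of half-open intervals
# [start, end) along the chromosome (cosi2 itself uses augmented skip lists).
Segment = Tuple[float, float]


def min_gap(a: List[Segment], b: List[Segment]) -> float:
    """Smallest distance between any segment of a and any segment of b.

    Returns 0.0 when some pair of segments overlaps or touches.
    """
    best = float("inf")
    for a_start, a_end in a:
        for b_start, b_end in b:
            # Positive only when the two intervals are disjoint.
            gap = max(b_start - a_end, a_start - b_end, 0.0)
            best = min(best, gap)
    return best


def may_coalesce(a: List[Segment], b: List[Segment], max_gap: float) -> bool:
    """Hypothetical filter: permit coalescence only for (near-)overlapping nodes."""
    return min_gap(a, b) <= max_gap


if __name__ == "__main__":
    node1 = [(0.0, 10.0)]
    node2 = [(12.0, 20.0)]
    print(may_coalesce(node1, node2, max_gap=1.0))
    print(may_coalesce(node1, node2, max_gap=5.0))
```

Setting `max_gap` to zero would restrict coalescence to strictly overlapping material, while a very large value would impose no restriction, as in a standard coalescent simulation.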
Availability and implementation: A free C++ implementation of cosi2 is available at http://broadinstitute.org/mpg/cosi2.
Supplementary data are available at Bioinformatics online.
The 2013–2015 Ebola virus disease (EVD) epidemic is caused by the Makona variant of Ebola virus (EBOV). Early in the epidemic, genome sequencing provided insights into virus evolution and transmission and offered important information for outbreak response. Here, we analyze sequences from 232 patients sampled over 7 months in Sierra Leone, along with 86 previously released genomes from earlier in the epidemic. We confirm sustained human-to-human transmission within Sierra Leone and find no evidence for import or export of EBOV across national borders after its initial introduction. Using high-depth replicate sequencing, we observe both host-to-host transmission and recurrent emergence of intrahost genetic variants. We trace the increasing impact of purifying selection in suppressing the accumulation of nonsynonymous mutations over time. Finally, we note changes in the mucin-like domain of EBOV glycoprotein that merit further investigation. These findings clarify the movement of EBOV within the region and describe viral evolution during prolonged human-to-human transmission.
• In Sierra Leone, transmission has primarily been within-country, not between-country
• Infectious doses are large enough for intrahost variants to transmit between hosts
• A prolonged epidemic removes deleterious mutations from the viral population
• There is preliminary evidence for human RNA editing effects on the Ebola genome
Ebola virus genomes from 232 patients sampled over 7 months in Sierra Leone were sequenced. Transmission of intrahost genetic variants suggests a sufficiently high infectious dose during transmission. The human host may have caused direct alterations to the Ebola virus genome.
The human malaria parasite Plasmodium falciparum has a complex and multi-stage life cycle that requires extensive and precise gene regulation to allow invasion and hijacking of host cells, transmission, and immune escape. To date, the regulatory elements orchestrating these critical parasite processes remain largely unknown. Yet it is becoming increasingly clear that long non-coding RNAs (lncRNAs) could represent a missing regulatory layer across a broad range of organisms.
To investigate the regulatory capacity of lncRNA in P. falciparum, we harvested fifteen samples from two time-courses. Our sample set profiled 56 h of P. falciparum blood stage development. We then developed and validated strand-specific, non-polyA-selected RNA sequencing methods, and pursued the first assembly of P. falciparum strand-specific transcript structures from RNA sequencing data. This approach enabled the annotation of over one thousand lncRNA transcript models and their comprehensive global analysis: coding prediction, periodicity, stage-specificity, correlation, GC content, length, location relative to annotated transcripts, and splicing. We validated the complete splicing structure of three lncRNAs with compelling properties. Non-polyA-selected deep sequencing also enabled the prediction of hundreds of intriguing P. falciparum circular RNAs, six of which we validated experimentally.
We found that a subset of lncRNAs, including all subtelomeric lncRNAs, strongly peaked in expression during invasion. By contrast, antisense transcript levels significantly dropped during invasion. As compared to neighboring mRNAs, the expression of antisense-sense pairs was significantly anti-correlated during blood stage development, indicating transcriptional interference. We also validated that P. falciparum produces circRNAs, which is notable given the lack of RNA interference in the organism, and discovered that a highly expressed, five-exon antisense RNA is poised to regulate P. falciparum gametocyte development 1 (PfGDV1), a gene required for early sexual commitment events.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1603-4) contains supplementary material, which is available to authorized users.
RNA sequencing; Non-coding RNA; lncRNA; Antisense RNA; circRNA; microRNA; Malaria; Plasmodium; Transcriptome; Gene regulation; Extreme genome; PfGDV1
In its largest outbreak, Ebola virus disease is spreading through Guinea, Liberia, Sierra Leone, and Nigeria. We sequenced 99 Ebola virus genomes from 78 patients in Sierra Leone to ∼2000× coverage. We observed a rapid accumulation of interhost and intrahost genetic variation, allowing us to characterize patterns of viral transmission over the initial weeks of the epidemic. This West African variant likely diverged from central African lineages around 2004, crossed from Guinea to Sierra Leone in May 2014, and has exhibited sustained human-to-human transmission subsequently, with no evidence of additional zoonotic sources. Because many of the mutations alter protein sequences and other biologically meaningful targets, they should be monitored for impact on diagnostics, vaccines, and therapies critical to outbreak response.
As an ancient disease with high fatality, cholera has likely exerted strong selective pressure on affected human populations. We performed a genome-wide study of natural selection in a population from the Ganges River Delta, the historic geographic epicenter of cholera. We identified 305 candidate selected regions using the Composite of Multiple Signals (CMS) method. The regions were enriched for potassium channel genes involved in cyclic AMP-mediated chloride secretion and for components of the innate immune system involved in NF-κB signaling. We demonstrate that a number of these strongly selected genes are associated with cholera susceptibility in two separate cohorts. We further identify repeated examples of selection and association in an NF-κB/inflammasome-dependent pathway that is activated in vitro by Vibrio cholerae. Our findings shed light on the genetic basis of cholera resistance in a population from the Ganges River Delta and present a promising approach for identifying genetic factors influencing susceptibility to infectious diseases.
Plasmodium vivax, one of the five species of Plasmodium parasites that cause human malaria, is responsible for 25–40% of malaria cases worldwide. Malaria global elimination efforts will benefit from accurate and effective genotyping tools that will provide insight into the population genetics and diversity of this parasite. The recent sequencing of P. vivax isolates from South America, Africa, and Asia presents a new opportunity by uncovering thousands of novel single nucleotide polymorphisms (SNPs). Genotyping a selection of these SNPs provides a robust, low-cost method of identifying parasite infections through their unique genetic signature or barcode. Based on our experience in generating a SNP barcode for P. falciparum using High Resolution Melting (HRM), we have developed a similar tool for P. vivax. We selected globally polymorphic SNPs from available P. vivax genome sequence data that were located in putatively selectively neutral sites (i.e., intergenic, intronic, or 4-fold degenerate coding). From these candidate SNPs we defined a barcode consisting of 42 SNPs. We analyzed the performance of the 42-SNP barcode on 87 P. vivax clinical samples from parasite populations in South America (Brazil, French Guiana), Africa (Ethiopia) and Asia (Sri Lanka). We found that the P. vivax barcode is robust, as it requires only a small quantity of DNA (limit of detection 0.3 ng/μl) to yield reproducible genotype calls, and detects polymorphic genotypes with high sensitivity. The markers are informative across all clinical samples evaluated (average minor allele frequency > 0.1). Population genetic and statistical analyses show the barcode captures high degrees of population diversity and differentiates geographically distinct populations. Our 42-SNP barcode provides a robust, informative, and standardized genetic marker set that accurately identifies a genomic signature for P. vivax infections.
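The informativeness criterion mentioned above (average minor allele frequency > 0.1) can be made concrete with a small Python sketch. It assumes, purely for illustration, that each sample's barcode is stored as a string with one base call per SNP and that any character other than A, C, G, or T marks a missing or mixed call; the function name `minor_allele_frequencies` and the toy barcodes are hypothetical and are not data from the study.

```python
from collections import Counter
from typing import List


def minor_allele_frequencies(barcodes: List[str]) -> List[float]:
    """Per-SNP minor allele frequency across samples.

    Assumption: every barcode has one character per SNP, and characters
    outside A/C/G/T denote missing or mixed calls and are ignored.
    """
    if not barcodes:
        return []
    n_sites = len(barcodes[0])
    if any(len(b) != n_sites for b in barcodes):
        raise ValueError("all barcodes must have the same length")

    mafs = []
    for i in range(n_sites):
        counts = Counter(b[i] for b in barcodes if b[i] in "ACGT")
        total = sum(counts.values())
        if total == 0:
            mafs.append(0.0)  # no usable calls at this position
            continue
        major = max(counts.values())
        # For a bi-allelic SNP, everything that is not the major allele is minor.
        mafs.append((total - major) / total)
    return mafs


if __name__ == "__main__":
    toy = ["AC", "AT", "GT", "AT"]  # illustrative two-SNP barcodes
    mafs = minor_allele_frequencies(toy)
    print(mafs, sum(mafs) / len(mafs))
```

A marker set would meet the criterion quoted in the abstract when the mean of the returned list exceeds 0.1.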
Plasmodium vivax malaria is a major global public health problem, with nearly 2.5 billion people at risk for infection and approximately 132–391 million clinical infections annually. It has a wide geographical range, with a high disease burden in Asia, Central and South America, the Middle East, Oceania, and East Africa. Advances in sequencing technology and sample processing have made it possible to characterize the genetic diversity of P. vivax populations. This genetic variation provides a means to identify parasites by unique genetic signatures, or “barcodes.” We developed such a genetic barcode for P. vivax, composed of 42 robust and informative variants. Here we report its development and validation based on 87 clinical samples identified by microscopy to contain P. vivax from geographically diverse parasite populations from South America (Brazil, French Guiana), Africa (Ethiopia) and Asia (Sri Lanka). We show that the SNP barcode provides a genotyping tool that can be performed at low cost, providing a means to uniquely identify parasite infections and distinguish geographic origins, and that barcode data may offer new insights into P. vivax population structure and diversity.
Next-generation sequencing (NGS) has the potential to transform the discovery of viruses causing unexplained acute febrile illness (UAFI) because it does not depend on culturing the pathogen or a priori knowledge of the pathogen’s nucleic acid sequence. More generally, it has the potential to elucidate the complete human virome, including viruses that cause no overt symptoms of disease, but may have unrecognized immunological or developmental consequences. We have used NGS to identify RNA viruses in the blood of 195 patients with UAFI and compared them with those found in 328 apparently healthy (i.e., no overt signs of illness) control individuals, all from communities in southeastern Nigeria. Among UAFI patients, we identified the presence of nucleic acids from several well-characterized pathogenic viruses, such as HIV-1, hepatitis, and Lassa virus. In our cohort of healthy individuals, however, we detected the nucleic acids of two novel rhabdoviruses. These viruses, which we call Ekpoma virus-1 (EKV-1) and Ekpoma virus-2 (EKV-2), are highly divergent, with little identity to each other or other known viruses. The most closely related rhabdoviruses are members of the genus Tibrovirus and Bas-Congo virus (BASV), which was recently identified in an individual with symptoms resembling hemorrhagic fever. Furthermore, by conducting a serosurvey of our study cohort, we find evidence for remarkably high exposure rates to the identified rhabdoviruses. The recent discoveries of novel rhabdoviruses by multiple research groups suggest that human infection with rhabdoviruses might be common. While the prevalence and clinical significance of these viruses are currently unknown, they could have previously unrecognized impacts on human health; further research into their immunological and developmental impact is warranted.
More generally, the identification of similar novel viruses in individuals with and without overt symptoms of disease highlights the need for a broader understanding of the human virome as efforts for viral detection and discovery advance.
Next-generation sequencing, a high-throughput method for sequencing DNA and RNA, has the potential to transform virus discovery because it does not depend on culturing the pathogen or a priori knowledge of the pathogen’s nucleic acid sequence. We used next-generation sequencing to identify RNA viruses present in the blood of patients with unexplained fever, as well as apparently healthy individuals in a peri-urban community in Nigeria. We found several well-characterized viruses in the blood of the febrile patients, including HIV-1, hepatitis B and C, as well as Lassa virus. We also discovered two novel rhabdoviruses in the blood of two apparently healthy (afebrile) females, which we named Ekpoma virus-1 and Ekpoma virus-2. Rhabdoviruses are distributed globally and include several human pathogens from the genera lyssavirus and vesiculovirus (e.g., rabies, Chandipura and vesicular stomatitis virus). The novel rhabdoviruses identified in this study are most similar to Bas-Congo virus, which was recently identified in an individual with an acute febrile illness. Furthermore, we demonstrate evidence of high levels of previous exposure to the two rhabdoviruses among our larger study population. Our results suggest that such rhabdovirus infections could be common, and may not necessarily cause overt disease. The identification of viral nucleic acid sequences in apparently healthy individuals highlights the need for a broader understanding of all viruses infecting humans as we increase efforts to identify viruses causing human disease.
The increasing availability of sequence data for many viruses provides power to detect regions under unusual evolutionary constraint at a high resolution. One approach leverages the synonymous substitution rate as a signature to pinpoint genic regions encoding overlapping or embedded functional elements. Protein-coding regions in viral genomes often contain overlapping RNA structural elements, reading frames, regulatory elements, microRNAs, and packaging signals. Synonymous substitutions in these regions would be selectively disfavored and thus these regions are characterized by excess synonymous constraint. Codon choice can also modulate transcriptional efficiency, translational accuracy, and protein folding.
We developed a phylogenetic codon model-based framework, FRESCo, designed to find regions of excess synonymous constraint in short, deep alignments, such as individual viral genes across many sequenced isolates. We demonstrated the high specificity of our approach on simulated data and applied our framework to the protein-coding regions of approximately 30 distinct species of viruses with diverse genome architectures.
FRESCo recovers known multifunctional regions in well-characterized viruses such as hepatitis B virus, poliovirus, and West Nile virus, often at a single-codon resolution, and predicts many novel functional elements overlapping viral genes, including in Lassa and Ebola viruses. In a number of viruses, the synonymously constrained regions that we identified also display conserved, stable predicted RNA structures, including putative novel elements in multiple viral species.
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-015-0603-7) contains supplementary material, which is available to authorized users.
Until recently, Ebola virus (EBOV) was a rarely encountered human pathogen that caused disease among small populations with extraordinarily high lethality. At the end of 2013, EBOV initiated an unprecedented disease outbreak in West Africa that is still ongoing and has already caused thousands of deaths. Recent studies revealed the genomic changes this particular EBOV variant undergoes over time during human-to-human transmission. Here we highlight the genomic changes that might negatively impact the efficacy of currently available EBOV sequence-based candidate therapeutics, such as small interfering RNAs (siRNAs), phosphorodiamidate morpholino oligomers (PMOs), and antibodies. Ten of the observed mutations modify the sequence of the binding sites of monoclonal antibody (MAb) 13F6, MAb 1H3, MAb 6D8, MAb 13C6, and siRNA EK-1, VP24, and VP35 targets and might influence the binding efficacy of the sequence-based therapeutics, suggesting that their efficacy should be reevaluated against the currently circulating strain.
Complex malaria infections are defined as those containing more than one genetically distinct lineage of Plasmodium parasite. Complexity of infection (COI) is a useful parameter to estimate from patient blood samples because it is associated with clinical outcome, epidemiology and disease transmission rate. This manuscript describes a method for estimating COI using likelihood, called COIL, from a panel of bi-allelic genotyping assays.
COIL assumes that distinct parasite lineages in complex infections are unrelated and that genotyped loci do not exhibit significant linkage disequilibrium. Using the population minor allele frequency (MAF) of the genotyped loci, COIL uses the binomial distribution to estimate the likelihood of a COI level given the prevalence of observed monomorphic or polymorphic genotypes within each sample.
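The likelihood calculation described above can be sketched in a few lines of Python. This is a simplified illustration under the stated assumptions (unrelated lineages, unlinked loci), not the COIL program itself: it treats each lineage as carrying the minor allele independently with probability equal to the locus MAF, so a sample with COI n appears monomorphic for the minor allele with probability p^n, monomorphic for the major allele with probability (1 - p)^n, and polymorphic otherwise. The function names and the call labels "major", "minor", and "mixed" are hypothetical.

```python
import math
from typing import Sequence


def coi_log_likelihood(calls: Sequence[str], mafs: Sequence[float], coi: int) -> float:
    """Log-likelihood of one sample's genotype calls at a given COI.

    calls: one of "major", "minor", or "mixed" per locus (labels are assumptions).
    mafs:  population minor allele frequency for each locus.
    """
    total = 0.0
    for call, p in zip(calls, mafs):
        p_minor = p ** coi          # every lineage carries the minor allele
        p_major = (1.0 - p) ** coi  # every lineage carries the major allele
        if call == "minor":
            prob = p_minor
        elif call == "major":
            prob = p_major
        elif call == "mixed":
            prob = 1.0 - p_minor - p_major
        else:
            raise ValueError(f"unknown call: {call!r}")
        if prob <= 1e-12:
            # Impossible under this COI, e.g. a mixed call when COI is 1.
            return float("-inf")
        total += math.log(prob)
    return total


def estimate_coi(calls: Sequence[str], mafs: Sequence[float], max_coi: int = 5) -> int:
    """Maximum-likelihood COI over the candidate values 1..max_coi."""
    return max(range(1, max_coi + 1),
               key=lambda n: coi_log_likelihood(calls, mafs, n))


if __name__ == "__main__":
    calls = ["major"] * 20 + ["mixed"] * 4  # illustrative 24-locus sample
    mafs = [0.4] * 24
    print(estimate_coi(calls, mafs))
```

The default `max_coi=5` simply mirrors the upper range discussed in this abstract; it is a parameter of the sketch rather than a limit taken from the COIL software.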
COIL reliably estimates COI up to a level of three or five with at least 24 or 96 unlinked genotyped loci, respectively, as determined by in silico simulation and empirical validation. Evaluation of COI levels greater than five in patient samples may require a very large collection of genotype data, making sequencing a more cost-effective approach for evaluating COI under conditions when disease transmission is extremely high. Performance of the method is positively correlated with the MAF of the genotyped loci. COI estimates from existing SNP genotype datasets create a more detailed portrait of disease than analyses based simply on the number of polymorphic genotypes observed within samples.
The capacity to reliably estimate COI from a genome-wide panel of SNP genotypes provides a potentially more accurate alternative to methods relying on PCR amplification of a small number of loci for estimating COI. This approach will also increase the number of applications of SNP genotype data, providing additional motivation to employ SNP barcodes for studies of disease epidemiology or control measure efficacy. The COIL program is available for download from GitHub, and users may also upload their SNP genotype data to a web interface for simple and efficient determination of sample COI.
Electronic supplementary material
The online version of this article (doi:10.1186/1475-2875-14-4) contains supplementary material, which is available to authorized users.
Malaria; Plasmodium; vivax; falciparum; Complexity of infection; Multiplicity of infection; SNP; Barcode; Genotype; Likelihood
We have developed a robust RNA sequencing method for generating complete de novo assemblies with intra-host variant calls of Lassa and Ebola virus genomes in clinical and biological samples. Our method uses targeted RNase H-based digestion to remove contaminating poly(rA) carrier and ribosomal RNA. This depletion step improves both the quality of data and quantity of informative reads in unbiased total RNA sequencing libraries. We have also developed a hybrid-selection protocol to further enrich the viral content of sequencing libraries. These protocols have enabled rapid deep sequencing of both Lassa and Ebola virus and are broadly applicable to other viral genomics studies.
Electronic supplementary material
The online version of this article (doi:10.1186/s13059-014-0519-7) contains supplementary material, which is available to authorized users.
In 2014, Ebola virus (EBOV) was identified as the etiological agent of a large and still expanding outbreak of Ebola virus disease (EVD) in West Africa and a much more confined EVD outbreak in Middle Africa. Epidemiological and evolutionary analyses confirmed that all cases of both outbreaks are connected to a single introduction each of EBOV into human populations and that both outbreaks are not directly connected. Coding-complete genomic sequence analyses of isolates revealed that the two outbreaks were caused by two novel EBOV variants, and initial clinical observations suggest that neither of them should be considered strains. Here we present consensus decisions on naming for both variants (West Africa: “Makona”, Middle Africa: “Lomela”) and provide database-compatible full, shortened, and abbreviated names that are in line with recently established filovirus sub-species nomenclatures.
Ebola; Ebola virus; ebolavirus; filovirid; Filoviridae; filovirus; genome annotation; Lomela; Lokolia; Makona; mononegavirad; Mononegavirales; mononegavirus; virus classification; virus isolate; virus nomenclature; virus strain; virus taxonomy; virus variant
Sequence determination of complete or coding-complete genomes of viruses is becoming common practice for supporting the work of epidemiologists, ecologists, virologists, and taxonomists. Sequencing duration and costs are rapidly decreasing, sequencing hardware is under modification for use by non-experts, and software is constantly being improved to simplify sequence data management and analysis. Thus, analysis of virus disease outbreaks on the molecular level is now feasible, including characterization of the evolution of individual virus populations in single patients over time. The increasing accumulation of sequencing data creates a management problem for the curators of commonly used sequence databases and an entry retrieval problem for end users. Therefore, utilizing the data to their fullest potential will require setting nomenclature and annotation standards for virus isolates and associated genomic sequences. The National Center for Biotechnology Information’s (NCBI’s) RefSeq is a non-redundant, curated database for reference (or type) nucleotide sequence records that supplies source data to numerous other databases. Building on recently proposed templates for filovirus variant naming, we report consensus decisions from a majority of past and currently active filovirus experts on the eight filovirus type variants and isolates to be represented in RefSeq, their final designations, and their associated sequences.
Bundibugyo virus; cDNA clone; cuevavirus; Ebola; Ebola virus; ebolavirus; filovirid; Filoviridae; filovirus; genome annotation; ICTV; International Committee on Taxonomy of Viruses; Lloviu virus; Marburg virus; marburgvirus; mononegavirad; Mononegavirales; mononegavirus; Ravn virus; RefSeq; Reston virus; reverse genetics; Sudan virus; Taï Forest virus; virus classification; virus isolate; virus nomenclature; virus strain; virus taxonomy; virus variant
Thanks to high-throughput sequencing technologies, genome sequencing has become a common component in nearly all aspects of viral research; thus, we are experiencing an explosion in both the number of available genome sequences and the number of institutions producing such data. However, there are currently no common standards used to convey the quality, and therefore utility, of these various genome sequences. Here, we propose five “standard” categories that encompass all stages of viral genome finishing, and we define them using simple criteria that are agnostic to the technology used for sequencing. We also provide genome finishing recommendations for various downstream applications, keeping in mind the cost-benefit trade-offs associated with different levels of finishing. Our goal is to define a common vocabulary that will allow comparison of genome quality across different research groups, sequencing platforms, and assembly techniques.
Mycobacterium tuberculosis is successfully evolving antibiotic resistance, threatening attempts at tuberculosis epidemic control. Mechanisms of resistance, including the genetic changes favored by selection in resistant isolates, are incompletely understood. Using 116 newly and 7 previously sequenced M. tuberculosis genomes, we identified genomewide signatures of positive selection specific to the 47 resistant genomes. By searching for convergent evolution, the independent fixation of mutations at the same nucleotide site or gene, we recovered 100% of a set of known resistance markers. We also found evidence of positive selection in an additional 39 genomic regions in resistant isolates. These regions encode pathways of cell wall biosynthesis, transcriptional regulation and DNA repair. Mutations in these regions could directly confer resistance or compensate for fitness costs associated with resistance. Functional genetic analysis of mutations in one gene, ponA1, demonstrated an in vitro growth advantage in the presence of the drug rifampicin.
Lassa fever (LF), an often-fatal hemorrhagic disease caused by Lassa virus (LASV), is a major public health threat in West Africa. When the violent civil conflict in Sierra Leone (1991 to 2002) ended, an international consortium assisted in restoration of the LF program at Kenema Government Hospital (KGH) in an area with the world's highest incidence of the disease.
Clinical and laboratory records of patients presenting to the KGH Lassa Ward in the post-conflict period were organized electronically. Recombinant antigen-based LF immunoassays were used to assess LASV antigenemia and LASV-specific antibodies in patients who met criteria for suspected LF. KGH has been reestablished as a center for LF treatment and research, with over 500 suspected cases now presenting yearly. Higher case fatality rates (CFRs) in LF patients were observed compared to studies conducted prior to the civil conflict. Different criteria for defining LF stages and differences in sensitivity of assays likely account for these differences. The highest incidence of LF in Sierra Leone was observed during the dry season. LF cases were observed in ten of Sierra Leone's thirteen districts, with numerous cases from outside the traditional endemic zone. Deaths in patients presenting with LASV antigenemia were skewed towards individuals less than 29 years of age. Women self-reporting as pregnant were significantly overrepresented among LASV antigenemic patients. The CFR of ribavirin-treated patients presenting early in acute infection was lower than in untreated subjects.
Lassa fever remains a major public health threat in Sierra Leone. Outreach activities should expand because LF may be more widespread in Sierra Leone than previously recognized. Enhanced case finding to ensure rapid diagnosis and treatment is imperative to reduce mortality. Even with ribavirin treatment, there was a high rate of fatalities underscoring the need to develop more effective and/or supplemental treatments for LF.
Lassa fever (LF) is a major public health threat in West Africa. After the violent civil conflict in Sierra Leone (1991 to 2002) ended, the LF research program at Kenema Government Hospital (KGH) was reestablished. Higher CFRs in LF patients were observed compared to studies conducted prior to the civil conflict. The criteria used for defining the stages of LF and differences in sensitivity of the assays used likely account for these differences. LF may be more widespread in Sierra Leone than recognized previously. Peak presentation of LF cases occurs in the dry season, which is consistent with previous studies. Our studies also confirmed reports from before the civil conflict indicating that infants, children, young adults, and pregnant women are disproportionately impacted by LF. High fatality rates were observed among both ribavirin-treated and untreated patients, which underscores the need for better LF treatments.
An adaptive variant of the human Ectodysplasin receptor, EDARV370A, is one of the strongest candidates of recent positive selection from genome-wide scans. We have modeled EDAR370A in mice and characterized its phenotype and evolutionary origins in humans. Our computational analysis suggests the allele arose in Central China approximately 30,000 years ago. Although EDAR370A has been associated with increased scalp hair thickness and changed tooth morphology in humans, its direct biological significance and potential adaptive role remain unclear. We generated a knock-in mouse model and find that, as in humans, hair thickness is increased in EDAR370A mice. We identify novel biological targets affected by the mutation, including mammary and eccrine glands. Building on these results, we find that EDAR370A is associated with an increased number of active eccrine glands in the Han Chinese. This interdisciplinary approach yields unique insight into the generation of adaptive variation among modern humans.
While several hundred regions of the human genome harbor signals of positive natural selection, few of the relevant adaptive traits and variants have been elucidated. Using full-genome sequence variation from the 1000 Genomes Project (1000G) and the Composite of Multiple Signals (CMS) test, we investigated 412 candidate signals and leveraged functional annotation, protein structure modeling, epigenetics, and association studies to identify and extensively annotate candidate causal variants. The resulting catalog provides a tractable list for experimental follow-up; it includes thirty-five high-scoring non-synonymous variants, fifty-nine variants associated with expression levels of a nearby coding gene or lincRNA, and numerous variants associated with susceptibility to infectious disease and other phenotypes. We experimentally characterized one candidate non-synonymous variant in TLR5, and show that it leads to altered NF-κB signaling in response to bacterial flagellin.
Malaria is a deadly disease that causes nearly one million deaths each year. To develop methods to control and eradicate malaria, it is important to understand the genetic basis of Plasmodium falciparum adaptations to antimalarial treatments and the human immune system while taking into account its demographic history. To study the demographic history and identify genes under selection more efficiently, we sequenced the complete genomes of 25 culture-adapted P. falciparum isolates from three sites in Senegal. We show that there is no significant population structure among these Senegal sampling sites. By fitting demographic models to the synonymous allele-frequency spectrum, we also estimated a major 60-fold population expansion of this parasite population ∼20,000–40,000 years ago. Using inferred demographic history as a null model for coalescent simulation, we identified candidate genes under selection, including genes identified before, such as pfcrt and PfAMA1, as well as new candidate genes. Interestingly, we also found selection against G/C to A/T changes that offsets the large mutational bias toward A/T, and two unusual patterns: similar synonymous and nonsynonymous allele-frequency spectra, and 18% of genes having a nonsynonymous-to-synonymous polymorphism ratio >1.
P. falciparum; population expansion; base composition; selection
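The allele-frequency spectrum to which the demographic models are fitted can be illustrated with a minimal sketch. It assumes haploid isolates and a list of derived-allele counts per synonymous site; the function name `site_frequency_spectrum` and the toy counts are hypothetical, not taken from the study.

```python
from collections import Counter


def site_frequency_spectrum(derived_counts, n_samples):
    """Return the unfolded spectrum: entry k-1 is the number of
    segregating sites whose derived allele is seen in k of n samples."""
    # Sites with 0 or n derived copies are not polymorphic in the sample.
    counts = Counter(k for k in derived_counts if 0 < k < n_samples)
    return [counts.get(k, 0) for k in range(1, n_samples)]


# Toy derived-allele counts per site (illustrative only, not study data).
toy_counts = [1, 1, 2, 1, 5, 24, 0, 25, 3, 1]

# 25 samples, matching the number of sequenced isolates in the abstract.
sfs = site_frequency_spectrum(toy_counts, n_samples=25)
print(sfs)
```

An excess of low-frequency classes relative to a constant-size expectation is the kind of signal a population-expansion model would be fitted to explain; the fitting itself is not shown here.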
Using parasite genotyping tools, we screened patients with mild uncomplicated malaria seeking treatment at a clinic in Thiès, Senegal, from 2006 to 2011. We identified a growing frequency of infections caused by genetically identical parasite strains, coincident with increased deployment of malaria control interventions and decreased malaria deaths. Parasite genotypes in some cases persisted clonally across dry seasons. The increase in frequency of genetically identical parasite strains corresponded with a decrease in the probability of multiple infections. Further, these observations support evidence of both clonal and epidemic population structures. These data provide the first evidence of a temporal correlation between the appearance of identical parasite types and increased malaria control efforts in Africa, which here included distribution of insecticide-treated nets (ITNs), use of rapid diagnostic tests (RDTs) for malaria detection, and deployment of artemisinin combination therapy (ACT). Our results imply that genetic surveillance can be used to evaluate the effectiveness of disease control strategies and assist a rational global malaria eradication campaign.
Despite efforts to reduce malaria morbidity and mortality, drug-resistant parasites continue to evade control strategies. Recently, emphasis has shifted away from control and toward regional elimination and global eradication of malaria. Such a campaign requires tools to monitor genetic changes in the parasite that could compromise the effectiveness of antimalarial drugs and undermine eradication programs. These tools must be fast, sensitive, unambiguous, and cost-effective to offer real-time reports of parasite drug susceptibility status across the globe. We have developed and validated a set of genotyping assays using high-resolution melting (HRM) analysis to detect molecular biomarkers associated with drug resistance across six genes in Plasmodium falciparum. We improved on existing technical approaches by developing refinements and extensions of HRM, including the use of blocked probes (LunaProbes) and the mutant allele amplification bias (MAAB) technique. To validate the sensitivity and accuracy of our assays, we compared our findings to sequencing results in both culture-adapted lines and clinical isolates from Senegal. We demonstrate that our assays (i) identify both known and novel polymorphisms, (ii) detect multiple genotypes indicative of mixed infections, and (iii) distinguish between variants when multiple copies of a locus are present. These rapid and inexpensive assays can track drug resistance and detect emerging mutations in targeted genetic loci in P. falciparum. They provide tools for monitoring molecular changes associated with changes in drug response across populations and for determining whether parasites present after drug treatment are the result of recrudescence or reinfection in clinical settings.
Lassa fever is a viral hemorrhagic fever endemic in West Africa. However, none of the hospitals in the endemic areas of Nigeria has the capacity to perform Lassa virus diagnostics. Case identification and management rely solely on non-specific clinical criteria. The Irrua Specialist Teaching Hospital (ISTH) in the central senatorial district of Edo State struggled with this challenge for many years.
A laboratory for molecular diagnosis of Lassa fever, complying with basic standards of diagnostic PCR facilities, was established at ISTH in 2008. During 2009 through 2010, samples of 1,650 suspected cases were processed, of which 198 (12%) tested positive by Lassa virus RT-PCR. No remarkable demographic differences were observed between PCR-positive and negative patients. The case fatality rate for Lassa fever was 31%. Nearly two thirds of confirmed cases attended the emergency departments of ISTH. The time window for therapeutic intervention was extremely short, as 50% of the fatal cases died within 2 days of hospitalization—often before ribavirin treatment could be commenced. Fatal Lassa fever cases were older (p = 0.005), had lower body temperature (p<0.0001), and had higher creatinine (p<0.0001) and blood urea levels (p<0.0001) than survivors. Lassa fever incidence in the hospital followed a seasonal pattern with a peak between November and March. Lassa virus sequences obtained from the patients originating from Edo State formed—within lineage II—a separate clade that could be further subdivided into three clusters.
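The reported positivity figure can be checked with a short calculation. The sketch below uses only the counts stated above (198 RT-PCR positives among 1,650 suspected cases); the Wilson score interval is an added illustration of the sampling uncertainty and is not a statistic reported by the study.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


suspected = 1650     # samples processed, 2009 through 2010
pcr_positive = 198   # positive by Lassa virus RT-PCR

positivity = pcr_positive / suspected
low, high = wilson_interval(pcr_positive, suspected)
print(f"positivity {positivity:.1%}, approx. 95% CI {low:.1%} to {high:.1%}")
```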
Lassa fever case management was improved at a tertiary health institution in Nigeria through establishment of a laboratory for routine diagnostics of Lassa virus. Data collected in two years of operation demonstrate that Lassa fever is a serious public health problem in Edo State and reveal new insights into the disease in hospitalized patients.
In the past, diagnostic testing for Lassa fever patients in Nigeria has been performed nearly exclusively outside of the country. Patients thus were managed on-site based on clinical suspicion alone, posing risks to patients and health care workers and exhausting resources. To tackle this problem, we established a diagnostic PCR laboratory directly at a referral hospital serving a Lassa fever endemic area in Nigeria. Long-term collaboration between partners in the North and the South was crucial to implement this project. Training of laboratory staff in the partner institutions and on-site, mobilization of local human and financial resources, good management of the laboratory, a basic quality management and control system, and a stable supply chain for consumables and reagents were among the key factors for success. The laboratory reliably delivered results in a short turnaround time, despite some problems due to PCR contamination. The service has improved patient and contact management including treatment with ribavirin and led to better protection of health care workers against hospital-acquired infections. The data provide new insights into disease progression and a basis for further optimization of case management including supportive treatment.