Nearly three billion people worldwide are at risk for infection with dengue virus (DENV), a Flavivirus transmitted by the mosquitoes Aedes aegypti and A. albopictus (for reviews, see references 24). DENV is an RNA virus with a single-stranded, nonsegmented RNA genome ~10.7 kb in length, which encodes three structural proteins (capsid [C], premembrane/membrane [prM/M], and envelope [E]) and seven nonstructural (NS) proteins (NS1, NS2A, NS2B, NS3, NS4A, NS4B, and NS5). There are four closely related serotypes of DENV (DENV-1 to DENV-4) that can cause asymptomatic infections or clinical illnesses ranging from the self-limiting but debilitating dengue fever to the life-threatening dengue hemorrhagic fever (DHF) and dengue shock syndrome (DSS), which are characterized by vascular leakage (76). Even though several factors, such as viral genetics, preexisting heterotypic immunity (i.e., immunity to a different serotype due to a prior infection), the sequence of infection with distinct serotypes, and host genetics, appear to influence disease severity (23), the precise causes for progression to severe disease remain unknown.
The four DENV serotypes share limited identity, with 25 to 40% variability observed between serotypes at the amino acid level (30). Considerable variation is also observed between viruses from the same serotype (~3 to 6% at the nucleotide level), which are phylogenetically divided into genotypes that are further subdivided into clades. This extensive genetic variability originates from the accumulation of genetically distinct genomes in individual hosts (referred to here as intrahost diversity) due to the error-prone nature of the enzyme responsible for viral RNA replication, the viral RNA-dependent RNA polymerase (RdRp) (16). The overall composition of intrahost variants determines the consensus viral genome, which is defined as the composite of all intrahost variants in one host. As in other RNA viruses (18), these genetic intrahost variants are thought to serve as the templates on which evolutionary mechanisms, such as recombination, drift, bottlenecking, or positive/negative selective pressures, act to shape variation at the consensus level between hosts (i.e., interhost diversity).
Schematic representation of the distinction between intrahost diversity and consensus-level (interhost) diversity. Viral genomes and genomic polymorphisms are represented by lines and symbols, respectively.
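The distinction between intrahost diversity and consensus-level (interhost) diversity can be illustrated with a minimal Python sketch. It assumes reads that are already aligned to a common coordinate system and represented as equal-length strings, with `-` marking uncovered positions. The function names, the toy reads, and the use of per-site Shannon entropy as a diversity measure are illustrative assumptions, not the measures or software used in the studies cited here.

```python
from collections import Counter
from math import log2


def site_counts(reads):
    """Count A/C/G/T at each alignment column (hypothetical input format:
    equal-length strings, '-' where a read does not cover the site)."""
    counts = [Counter() for _ in range(len(reads[0]))]
    for read in reads:
        for i, base in enumerate(read):
            if base in "ACGT":
                counts[i][base] += 1
    return counts


def consensus(counts):
    """Majority base per site; ties broken alphabetically, 'N' if uncovered."""
    return "".join(max(sorted(c), key=c.get) if c else "N" for c in counts)


def shannon_entropy(counter):
    """Per-site entropy in bits: 0 when all reads agree, higher when mixed."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return sum(-(n / total) * log2(n / total) for n in counter.values())


def hamming(a, b):
    """Number of sites at which two consensus sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Toy data: four aligned reads from each of two hosts (invented for illustration).
host_a = ["ACGT", "ACGT", "ACGA", "ACGT"]
host_b = ["ACTT", "ACTT", "ACTT", "GCTT"]

counts_a, counts_b = site_counts(host_a), site_counts(host_b)
cons_a, cons_b = consensus(counts_a), consensus(counts_b)

# Intrahost diversity: variation among reads within one host.
intrahost_a = [shannon_entropy(c) for c in counts_a]
# Interhost diversity: differences between the two hosts' consensus genomes.
interhost = hamming(cons_a, cons_b)
```

In this toy example, the minority `A` at the last site of host A contributes to intrahost diversity but does not change that host's consensus, whereas the G/T difference at the third site is fixed at the consensus level and therefore counts toward interhost diversity.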
Several investigations have uncovered associations between genetic determinants at the consensus level and viral fitness and pathogenicity (e.g., see references 57). In addition to the consensus sequence, the genetic composition of intrahost viral populations has been shown to be essential for maintaining the fitness of poliovirus populations in vivo and for influencing hepatitis C virus (HCV) and human immunodeficiency virus (HIV) pathogenesis and disease outcome (18). For instance, in the case of HCV, high viral diversity was observed in individuals who presented with mild chronic hepatitis rather than severe hepatic injury (66), while low viral diversity posttreatment was associated with sustained response to antiviral therapy (18). The situation for HIV is more complex, with some studies reporting an association between higher viral diversity and slower disease progression (20), while others suggest that individuals harboring HIV populations with low diversity progress more slowly and mount more robust immune responses (32). In addition, recent work has shown that the level of diversity within an individual host during acute infection may be associated with viral set point, which refers to the stabilized viral load after acute infection (28). Such studies illustrate the importance of assessing the composition of viral populations at the intrahost level during the course of an infection, which is distinct from assessing changes in consensus sequence at the interhost level between multiple infected individuals.
For DENV, most sequencing efforts have focused on assessing interhost diversity in viral consensus genomes from viruses isolated from serum (e.g., see references 7) or, as in a few studies, directly from viruses in human serum (e.g., see reference 39). Far fewer efforts have been directed at capturing DENV intrahost diversity, and these have considerable limitations, in that they either characterize a few sequences at the whole-genome level or sequence only one or two genes of the viral genome, such as C, E, or NS2B (15). In these studies, a wide range of intrahost diversity was observed in DENV populations in humans (15) and mosquitoes (43), with lower diversity observed in mosquitoes (43). Attempts have also been made to correlate intrahost viral diversity with disease severity. Descloux et al. (15) postulated that the level of viral diversity was lower in patients with severe dengue (DHF/DSS) than in patients with the less severe dengue fever. However, in a recent study by Thai et al. (67), no correlation was observed between viral diversity in domain III of E and disease outcome or immune status. These authors also reported the detection of multiple viral lineages in some study subjects, indicating possible contributions from mixed infection to observed intrahost variation. Importantly, this study used rigorous algorithms to distinguish true variants from errors introduced during PCR and documented far lower diversity than had been previously reported.
We have developed a whole-genome segmented amplification approach which, when coupled with high-throughput sequencing, allows for capturing intrahost diversity across the entire coding region of the DENV genome with considerable depth of coverage. Using this approach, we captured diversity at 40 to 98% (average, ~75%) of the DENV-2 genome from 25 serum samples collected from 22 individuals, with an average coverage of 110 to 812 reads per nucleotide in each sample, and employed variant-calling algorithms (45) to identify true variants. The scale of this data set allowed us to critically compare gene-wise diversity within and between samples, detect rare mutation events, and correlate multiple measures of intrahost diversity with interhost viral diversity in a manner that was not possible before.
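To make the quantities in this paragraph concrete, the following Python sketch shows how the fraction of the genome covered, the average reads per nucleotide, and candidate minority variants could be derived from per-site base counts. It is not the variant-calling algorithm of reference 45: the fixed depth and frequency cutoffs (`min_depth`, `min_freq`), the function names, and the toy pileup are all assumptions chosen for illustration, and a real caller would also need to model sequencing and PCR error.

```python
from collections import Counter


def coverage_summary(pileup, min_depth=100):
    """Return (fraction of sites with depth >= min_depth, mean depth).

    `pileup` is a hypothetical list of Counters, one per genome position,
    holding the number of reads supporting each base.
    """
    depths = [sum(c.values()) for c in pileup]
    covered = sum(d >= min_depth for d in depths)
    return covered / len(depths), sum(depths) / len(depths)


def call_variants(pileup, min_depth=100, min_freq=0.01):
    """Naive threshold caller (an assumption, not the published method).

    Reports (position, major base, minor base, minor frequency) for every
    non-majority base whose frequency is at least min_freq at sites with
    depth of at least min_depth. Positions are 1-based.
    """
    calls = []
    for pos, counts in enumerate(pileup, start=1):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to estimate frequencies at this site
        major = max(sorted(counts), key=counts.get)
        for base, n in sorted(counts.items()):
            if base != major and n / depth >= min_freq:
                calls.append((pos, major, base, n / depth))
    return calls


# Toy pileup for a four-nucleotide region (invented numbers).
pileup = [
    Counter(A=500),         # no variation
    Counter(C=480, T=20),   # 4% minority variant, reported
    Counter(G=50),          # below the depth cutoff, skipped
    Counter(T=299, C=1),    # 0.33% minority base, below the frequency cutoff
]

fraction_covered, mean_depth = coverage_summary(pileup)
variants = call_variants(pileup)
```

Under these illustrative cutoffs, three of the four sites count as covered and only the 4% variant at position 2 is reported. The cutoffs trade sensitivity to rare variants against false positives from amplification and sequencing error, which is why deeper coverage makes rarer variants detectable.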