The advent of extremely high-throughput DNA sequencing ensures that genomic data from microbial organisms can be acquired in unprecedented quantities and with remarkable rapidity. Although this genomic revolution will affect all microbes alike, our focus here is on RNA viruses, because their evolution is rapid enough to be measured directly over the time scale of human observation, allowing phylodynamic inferences to be made with great precision. In the foreseeable future it is likely that complete genome sequencing will become the standard method of viral characterization, providing the highest possible resolution for phylogenetic studies. The speed with which genome sequence data were generated from the ongoing epidemic of swine-origin H1N1 influenza A virus [1] is testament to the power of this technology.
Understandably, pathogen discovery is a major focus of this new-scale genome sequencing [2]. It is now possible to sequence the entire assemblage of viruses in a particular tissue type or host species [3]–[5], as well as all those viruses that are associated with specific disease syndromes [6],[7]. In essence, this new era of metagenomics constitutes a crucial taxonomic discovery phase in virology and epidemiology that allows the genetic characterization of new viruses within hours of their isolation.
Assembling an inventory of viruses that may emerge in human populations is of major importance to public health and to students of biodiversity. However, it is only the first step in developing a full quantitative understanding of the processes that shape the epidemiology and evolution (the phylodynamics) of RNA virus infections [8]. To achieve this goal, we argue here that the field of viral phylodynamics requires its own discovery phase; that is, a comprehensive and quantitative analysis of the interaction between the ecological and evolutionary dynamics of all circulating RNA viruses from the molecular to the global scale. Such a marriage of phylogenetic and epidemiological dynamics is currently possible, even in principle, only for the select few human viruses for which large genome sequence datasets have been acquired, such as HIV and influenza A virus, and even here fundamental gaps in our knowledge remain (see below). Indeed, it is striking that so few complete genome sequences are currently available for viruses whose epidemiological dynamics are known in exquisite detail, such as measles [9],[10]; these sequences have been so sparsely sampled in both time and space that a full phylodynamic perspective has not yet been achieved. We contend that a better understanding of RNA virus phylodynamics will allow more directed attempts at pathogen surveillance, facilitate more accurate predictions of the epidemiological impact of newly emerged viruses, and assist in the control of those viruses that exhibit complex patterns of antigenic variation, such as dengue and influenza. Just as PCR and first-generation DNA sequencing ushered in the science of molecular epidemiology, so next-generation sequencing may herald the age of phylodynamics. Box 1 lists a number of key questions that can be addressed within this phylodynamics research program.
Box 1. Key Research Questions in RNA Virus Phylodynamics
- What is the range of phylodynamic patterns observed in RNA viruses? Can they be categorized into specific groups? How do these patterns relate to other “life history” variables exhibited by RNA viruses?
- What epidemiological and evolutionary processes give rise to these phylodynamic patterns? What generalities can be drawn?
- How commonly does natural selection (compared to neutral evolutionary processes) determine the population dynamics of pathogens? On what scale does natural selection act? How does viral immune escape reduce herd immunity at the population level and allow the persistence of viral lineages in epidemic troughs?
- What is the range of spatial patterns exhibited by RNA viruses? What epidemiological factors are responsible for these patterns?
- How do different viral species (various respiratory viruses, for example) interact in host immunity?
A number of important advances are needed to meet our goal of a comprehensive catalog of the diversity of phylodynamic patterns in RNA viruses. Because answers to many of the most interesting research questions depend on sufficiently large sample sizes, we require large numbers of sequences that have been rigorously sampled according to strict temporal, spatial, and clinical criteria, and we require that as much of these data as possible be publicly accessible. A phylodynamic analysis has little value unless viral genomes are sampled on the same scale as the epidemiological processes under investigation.
The only acute virus for which a suitably expansive genome dataset currently exists is influenza. In this case, the >4,000 complete genomes generated under the Influenza Genome Sequencing Project [11] have provided important new insights into the evolution and epidemiology of this major human pathogen [12]. To highlight one key insight here, these genome sequence data have revealed that multiple lineages of influenza virus are imported and circulate within specific geographic localities (even within relatively isolated populations), generating both frequent mixed infections [13] and reassortment events [14]. Even so, the sampling of these genome sequences (and associated epidemiological covariates) may not be dense enough to fully capture spatial dynamics [15]. There is also a marked absence of samples from asymptomatically infected patients (or those with mild disease), so it is impossible to link genetic variation to clinical syndrome. Such a bias against viruses sampled from individuals with asymptomatic infections is a common problem in molecular epidemiology.



Generation Computational Tools