In this study, we analyzed viral metagenomes (viromes) in the sedimentary habitats of three geographically and geologically distinct (hado)pelagic environments in the northwest Pacific: the Izu-Ogasawara Trench (water depth = 9,760 m) (OG), the Challenger Deep in the Mariana Trench (10,325 m) (MA), and the forearc basin off the Shimokita Peninsula (1,181 m) (SH). Virus abundance ranged from 10⁶ to 10¹¹ viruses/cm³ of sediment (down to 30 cm below the seafloor [cmbsf]). We recovered viral DNA assemblages (viromes) from the (hado)pelagic sediment samples and obtained a total of 37,458, 39,882, and 70,882 sequence reads by 454 GS FLX Titanium pyrosequencing from the virome libraries of the OG, MA, and SH (hado)pelagic sediments, respectively. Only 24−30% of the sequence reads from each virome library exhibited significant similarities to the sequences deposited in the public nr protein database (E-value <10⁻³ in BLAST). Among the sequences identified as potential viral genes based on the BLAST search, 95−99% of the sequence reads in each library were related to genes from single-stranded DNA (ssDNA) viral families, including Microviridae, Circoviridae, and Geminiviridae. A relatively high abundance of sequences related to the genetic markers (major capsid protein [VP1] and replication protein [Rep]) of two ssDNA viral groups was also detected in these libraries, thereby revealing a high genotypic diversity of their viruses (833 genotypes for VP1 and 2,551 genotypes for Rep). A majority of the viral genes predicted from each library were classified into three ssDNA viral protein categories: Rep, VP1, and minor capsid protein. The deep-sea sedimentary viromes were distinct from the viromes obtained from the oceanic and fresh waters and marine eukaryotes, and thus, deep-sea sediments harbor novel viromes, including previously unidentified ssDNA viruses.
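The share of reads with a significant database match, as reported above, can be illustrated with a minimal sketch. It assumes the BLAST results are available in the standard 12-column tabular format (query ID in column 1, E-value in column 11); the function name, the toy hit lines, and the read total are hypothetical and are not taken from the study.

```python
def fraction_with_hits(blast_lines, total_reads, max_evalue=1e-3):
    """Return the fraction of reads with at least one hit below max_evalue.

    blast_lines: iterable of tab-separated BLAST tabular lines (12 columns).
    total_reads: number of reads in the library (hits or not).
    """
    reads_with_hit = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query_id = fields[0]
        evalue = float(fields[10])  # column 11 holds the E-value
        if evalue < max_evalue:
            reads_with_hit.add(query_id)
    return len(reads_with_hit) / total_reads


# Toy input: read1 has two significant hits, read2 has none.
example = [
    "read1\tprotA\t45.0\t80\t40\t2\t1\t240\t10\t89\t1e-10\t60.1",
    "read1\tprotB\t40.0\t75\t42\t3\t1\t225\t5\t79\t5e-04\t45.0",
    "read2\tprotC\t30.0\t50\t33\t2\t1\t150\t1\t50\t0.5\t25.0",
]
print(fraction_with_hits(example, total_reads=4))
```

Counting each read once, regardless of how many hits it has, keeps the fraction comparable across libraries with different database coverage.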
The pig faecal virome, which comprises the community of viruses present in pig faeces, is complex and consists of pig viruses, bacteriophages, transiently passaged plant viruses and other minor virus species. Little is known about factors influencing its general composition. Here, the effect of the probiotic bacterium Enterococcus faecium (E. faecium) NCIMB 10415 on the pig faecal virome composition was analysed in a pig feeding trial with sows and their piglets, which received either the probiotic bacterium or not.
From 8 pooled faecal samples derived from the feeding trial, DNA and RNA virus particles were prepared and subjected to process-controlled Next Generation Sequencing resulting in 390,650 sequence reads. On average, 14% of the reads showed significant sequence identities to known viruses. The percentage of detected mammalian virus sequences was highest (55–77%) in the samples of the youngest piglets and lowest (8–10%) in the samples of the sows. In contrast, the percentage of bacteriophage sequences increased from 22–44% in the youngest piglets to approximately 90% in the sows. The dominating mammalian viruses differed remarkably among 12-day-old piglets (kobuvirus), 54-day-old piglets (boca-, dependo- and pig stool-associated small circular DNA virus [PigSCV]) and the sows (PigSCV, circovirus and “circovirus-like” viruses CB-A and RW-A). In addition, the Shannon index, which reflects the diversity of sequences present in a sample, was generally higher for the sows as compared to the piglets. No consistent differences in the virome composition could be identified between the viromes of the probiotic bacterium-treated group and the control group.
The analysis indicates that the pig faecal virome shows a high variability and that its general composition is mainly dependent on the age of the pigs. Changes caused by feeding with the probiotic bacterium E. faecium could not be demonstrated using the applied metagenomics method.
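The Shannon index mentioned above can be computed from per-taxon read counts. The following is a minimal sketch; the use of the natural logarithm and the toy counts are assumptions, since the abstract does not state the log base or the units that were counted.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over non-zero counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one non-zero count is required")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total  # relative abundance of this taxon
            h -= p * math.log(p)
    return h


# Hypothetical read counts per virus taxon in two samples.
even_sample = [10, 10, 10, 10]   # four equally abundant taxa
skewed_sample = [37, 1, 1, 1]    # one dominant taxon
print(shannon_index(even_sample), shannon_index(skewed_sample))
```

An evenly distributed community scores higher than one dominated by a single taxon, which is why the index is read as a measure of diversity rather than of richness alone.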
The residence of dinoflagellate algae (genus: Symbiodinium) within scleractinian corals is critical to the construction and persistence of tropical reefs. In recent decades, however, acute and chronic environmental stressors have frequently destabilized this symbiosis, ultimately leading to coral mortality and reef decline. Viral infection has been suggested as a trigger of coral–Symbiodinium dissociation; knowledge of the diversity and hosts of coral-associated viruses is critical to evaluating this hypothesis. Here, we present the first genomic evidence of viruses associated with Symbiodinium, based on the presence of transcribed +ss (single-stranded) RNA and ds (double-stranded) DNA virus-like genes in complementary DNA viromes of the coral Montastraea cavernosa and expressed sequence tag (EST) libraries generated from Symbiodinium cultures. The M. cavernosa viromes contained divergent viral sequences similar to the major capsid protein of the dinoflagellate-infecting +ssRNA Heterocapsa circularisquama virus, suggesting a highly novel dinornavirus could infect Symbiodinium. Further, similarities to dsDNA viruses dominated (∼69%) eukaryotic viral similarities in the M. cavernosa viromes. Transcripts highly similar to eukaryotic algae-infecting phycodnaviruses were identified in the viromes, and homologs to these sequences were found in two independently generated Symbiodinium EST libraries. Phylogenetic reconstructions substantiate that these transcripts are undescribed and distinct members of the nucleocytoplasmic large DNA virus (NCLDV) group. Based on a preponderance of evidence, we infer that the novel NCLDVs and RNA virus described here are associated with the algal endosymbionts of corals. If such viruses disrupt Symbiodinium, they are likely to impact the flexibility and/or stability of coral–algal symbioses, and thus long-term reef health and resilience.
coral reef; Heterocapsa circularisquama RNA virus (HcRNAV); nuclear cytoplasmic large DNA virus (NCLDV); Phycodnaviridae; Symbiodinium; virome
The frequent interactions of rodents with humans make them a common source of zoonotic infections. To obtain an initial unbiased measure of the viral diversity in the enteric tract of wild rodents we sequenced partially purified, randomly amplified viral RNA and DNA in the feces of 105 wild rodents (mouse, vole, and rat) collected in California and Virginia. We identified in decreasing frequency sequences related to the mammalian virus families Circoviridae, Picobirnaviridae, Picornaviridae, Astroviridae, Parvoviridae, Papillomaviridae, Adenoviridae, and Coronaviridae. Seventeen small circular DNA genomes containing one or two replicase genes distantly related to the Circoviridae representing several potentially new viral families were characterized. In the Picornaviridae family two new candidate genera as well as a close genetic relative of the human pathogen Aichi virus were characterized. Fragments of the first mouse sapelovirus and picobirnaviruses were identified and the first murine astrovirus genome was characterized. A mouse papillomavirus genome and fragments of a novel adenovirus and adenovirus-associated virus were also sequenced. The next largest fraction of the rodent fecal virome was related to insect viruses of the Densoviridae, Iridoviridae, Polydnaviridae, Dicistroviridae, Bromoviridae, and Virgaviridae families followed by plant virus-related sequences in the Nanoviridae, Geminiviridae, Phycodnaviridae, Secoviridae, Partitiviridae, Tymoviridae, Alphaflexiviridae, and Tombusviridae families reflecting the largely insect and plant rodent diet. Phylogenetic analyses of full and partial viral genomes therefore revealed many previously unreported viral species, genera, and families. The close genetic similarities noted between some rodent and human viruses might reflect past zoonoses. This study increases our understanding of the viral diversity in wild rodents and highlights the large number of still uncharacterized viruses in mammals.
Rodents are the natural reservoir of numerous zoonotic viruses causing serious diseases in humans. We used an unbiased metagenomic approach to characterize the viral diversity in rodent feces. In addition to diet-derived insect and plant viruses, mammalian viral sequences were abundant and diverse. Most notably, multiple new circular viral DNA families, two new Picornaviridae genera, and the first murine astrovirus and picobirnaviruses were characterized. A mouse kobuvirus was a close relative of the human pathogen Aichi virus. This study significantly increases the known genetic diversity of eukaryotic viruses in rodents and provides an initial description of their enteric viromes.
Viruses have a profound influence on the ecology and evolution of plankton, but our understanding of the composition of the aquatic viral communities is still rudimentary. This is especially true of those viruses having RNA genomes. The limited data that have been published suggest that the RNA virioplankton is dominated by viruses with positive-sense, single-stranded (+ss) genomes that have features in common with those of eukaryote-infecting viruses in the order Picornavirales (picornavirads). In this study, we investigated the diversity of the RNA virus assemblages in tropical coastal seawater samples using targeted PCR and metagenomics. Amplification of RNA-dependent RNA polymerase (RdRp) genes from fractions of a buoyant density gradient suggested that the distribution of two major subclades of the marine picornavirads was largely congruent with the distribution of total virus-like RNA, a finding consistent with their proposed dominance. Analyses of the RdRp sequences in the library revealed the presence of many diverse phylotypes, most of which were related only distantly to those of cultivated viruses. Phylogenetic analysis suggests that there were hundreds of unique picornavirad-like phylotypes in one 35-liter sample that differed from one another by at least as much as the differences among currently recognized species. Assembly of the sequences in the metagenome resulted in the reconstruction of six essentially complete viral genomes that had features similar to viruses in the families Bacillarna-, Dicistro-, and Marnaviridae. Comparison of the tropical seawater metagenomes with those from other habitats suggests that +ssRNA viruses are generally the most common types of RNA viruses in aquatic environments, but biases in library preparation remain a possible explanation for this observation.
Marine plankton account for much of the photosynthesis and respiration on our planet, and they influence the cycling of carbon and the distribution of nutrients on a global scale. Despite the fundamental importance of viruses to plankton ecology and evolution, most of the viruses in the sea, and the identities of their hosts, are unknown. This report is one of very few that delves into the genetic diversity within RNA-containing viruses in the ocean. The data expand the known range of viral diversity and shed new light on the physical properties and genetic composition of RNA viruses in the ocean.
Virioplankton have a significant role in marine ecosystems, yet we know little of the predominant biological characteristics of aquatic viruses that influence the flow of nutrients and energy through microbial communities. Family A DNA polymerases, critical to DNA replication and repair in prokaryotes, are found in many tailed bacteriophages. The essential role of DNA polymerase in viral replication makes it a useful target for connecting viral diversity with an important biological feature of viruses. Capturing the full diversity of this polymorphic gene by targeted approaches has been difficult; thus, full-length DNA polymerase genes were assembled out of virioplankton shotgun metagenomic sequence libraries (viromes). Within the viromes novel DNA polymerases were common and found in both double-stranded (ds) DNA and single-stranded (ss) DNA libraries. Finding DNA polymerase genes in ssDNA viral libraries was unexpected, as no such genes have been previously reported from ssDNA phage. Surprisingly, the most common virioplankton DNA polymerases were related to a siphovirus infecting an α-proteobacterial symbiont of a marine sponge and not the podoviral T7-like polymerases seen in many other studies. Amino acids predictive of catalytic efficiency and fidelity linked perfectly to the environmental clades, indicating that most DNA polymerase-carrying virioplankton utilize a lower efficiency, higher fidelity enzyme. Comparisons with previously reported, PCR-amplified DNA polymerase sequences indicated that the most common virioplankton metagenomic DNA polymerases formed a new group that included siphoviruses. These data indicate that slower-replicating, lytic or lysogenic phage populations rather than fast-replicating, highly lytic phages may predominate within the virioplankton.
viral ecology; metagenomics; phage diversity
Metagenomic studies characterize both the composition and diversity of uncultured viral and microbial communities. BLAST-based comparisons have typically been used for such analyses; however, sampling biases, high percentages of unknown sequences, and the use of arbitrary thresholds to find significant similarities can decrease the accuracy and validity of estimates. Here, we present Genome relative Abundance and Average Size (GAAS), a complete software package that provides improved estimates of community composition and average genome length for metagenomes in both textual and graphical formats. GAAS implements a novel methodology to control for sampling bias via length normalization, to adjust for multiple BLAST similarities by similarity weighting, and to select significant similarities using relative alignment lengths. In benchmark tests, the GAAS method was robust to both high percentages of unknown sequences and to variations in metagenomic sequence read lengths. Re-analysis of the Sargasso Sea virome using GAAS indicated that standard methodologies for metagenomic analysis may dramatically underestimate the abundance and importance of organisms with small genomes in environmental systems. Using GAAS, we conducted a meta-analysis of microbial and viral average genome lengths in over 150 metagenomes from four biomes to determine whether genome lengths vary consistently between and within biomes, and between microbial and viral communities from the same environment. Significant differences between biomes and within aquatic sub-biomes (oceans, hypersaline systems, freshwater, and microbialites) suggested that average genome length is a fundamental property of environments driven by factors at the sub-biome level. The behavior of paired viral and microbial metagenomes from the same environment indicated that microbial and viral average genome sizes are independent of each other, but indicative of community responses to stressors and environmental conditions.
Metagenomics uses DNA or RNA sequences isolated directly from the environment to determine what viruses or microorganisms exist in natural communities and what metabolic activities they encode. Typically, metagenomic sequences are compared to annotated sequences in public databases using the BLAST search tool. Our methods, implemented in the Genome relative Abundance and Average Size (GAAS) software, improve the way BLAST searches are processed to estimate the taxonomic composition of communities and their average genome length. GAAS provides a more accurate picture of community composition by correcting for a systematic sampling bias towards larger genomes, and is useful in situations where organisms with small genomes are abundant, such as disease outbreaks caused by small RNA viruses. Microbial average genome length relates to environmental complexity and the distribution of genome lengths describes community diversity. A study of the average genome length of viruses and microorganisms in four different biomes using GAAS on 169 metagenomes showed significantly different average genome sizes between biomes, and large variability within biomes as well. This also revealed that microbial and viral average genome sizes in the same environment are independent of each other, which reflects the different ways that microorganisms and viruses respond to stress and environmental conditions.
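The two corrections described above, similarity weighting across multiple BLAST hits and length normalization against the bias towards larger genomes, can be illustrated with a minimal sketch. This is not the GAAS implementation: weighting each read's hits in proportion to their bit scores is an assumption made for illustration, and the function names, genome names, and numbers are hypothetical.

```python
from collections import defaultdict


def relative_abundance(hits, genome_lengths):
    """Estimate relative genome abundance from per-read similarity hits.

    hits: dict mapping read ID -> list of (genome ID, score) pairs.
    genome_lengths: dict mapping genome ID -> genome length in bp.
    """
    weighted = defaultdict(float)
    for read_hits in hits.values():
        total_score = sum(score for _, score in read_hits)
        if total_score <= 0:
            continue  # read has no usable hits
        # Similarity weighting (assumed scheme): each read contributes a
        # total weight of 1, split across its hits by score.
        for genome_id, score in read_hits:
            weighted[genome_id] += score / total_score
    # Length normalization: a genome ten times longer yields roughly ten
    # times more reads per copy, so divide by genome length.
    normalized = {g: w / genome_lengths[g] for g, w in weighted.items()}
    total = sum(normalized.values())
    return {g: v / total for g, v in normalized.items()}


def average_genome_length(abundance, genome_lengths):
    """Abundance-weighted mean genome length."""
    return sum(frac * genome_lengths[g] for g, frac in abundance.items())


# Toy case: ten reads hit a 50 kb genome, one read hits a 5 kb genome.
lengths = {"large": 50_000, "small": 5_000}
toy_hits = {f"r{i}": [("large", 50.0)] for i in range(10)}
toy_hits["r10"] = [("small", 50.0)]
abundance = relative_abundance(toy_hits, lengths)
print(abundance, average_genome_length(abundance, lengths))
```

In the toy case the raw read counts suggest a 10:1 ratio, but after length normalization the two genomes come out equally abundant, which is the kind of underestimate of small genomes that the abstract attributes to uncorrected analyses.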
Bacterial viruses (bacteriophages) have a key role in shaping the development and functional outputs of host microbiomes. Although metagenomic approaches have greatly expanded our understanding of the prokaryotic virosphere, additional tools are required for the phage-oriented dissection of metagenomic data sets, and host-range affiliation of recovered sequences. Here we demonstrate the application of a genome signature-based approach to interrogate conventional whole-community metagenomes and access subliminal, phylogenetically targeted, phage sequences present within. We describe a portion of the biological dark matter extant in the human gut virome, and bring to light a population of potentially gut-specific Bacteroidales-like phage, poorly represented in existing virus like particle-derived viral metagenomes. These predominantly temperate phage were shown to encode functions of direct relevance to human health in the form of antibiotic resistance genes, and provided evidence for the existence of putative ‘viral-enterotypes’ among this fraction of the human gut virome.
Bacteriophages have a significant impact on microbial ecosystems, but additional tools are needed to assess viral communities. Ogilvie et al. present a new strategy to extract viral sequences from metagenomic data sets, and present new insights on their function in the gut ecosystem.
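A genome signature is commonly taken to be a sequence's oligonucleotide usage profile. The sketch below shows one generic way to compare such profiles; the choice of tetranucleotides (k = 4) and Euclidean distance is an assumption for illustration, not a description of the method used in the study, and the sequences are made up.

```python
import math
from collections import Counter
from itertools import product


def kmer_profile(seq, k=4):
    """Return relative frequencies of all 4**k DNA k-mers, in fixed order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only unambiguous k-mers count; windows containing N are ignored.
    total = sum(counts[km] for km in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]


def euclidean(p, q):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical sequences: the first two share composition, the third differs.
ref = kmer_profile("ACGTTGCAAGCT" * 20)
similar = kmer_profile("ACGTTGCAAGCT" * 15)
different = kmer_profile("AAAAATTTTTAA" * 20)
print(euclidean(ref, similar), euclidean(ref, different))
```

Fragments whose profiles sit close to a reference profile would be candidates for sharing its origin; any real cut-off would need calibration against known sequences.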
The human respiratory tract is constantly exposed to a wide variety of viruses, microbes and inorganic particulates from environmental air, water and food. Physical characteristics of inhaled particles and airway mucosal immunity determine which viruses and microbes will persist in the airways. Here we present the first metagenomic study of DNA viral communities in the airways of diseased and non-diseased individuals. We obtained sequences from sputum DNA viral communities in 5 individuals with cystic fibrosis (CF) and 5 individuals without the disease. Overall, diversity of viruses in the airways was low, with an average richness of 175 distinct viral genotypes. The majority of viral diversity was uncharacterized. CF phage communities were highly similar to each other, whereas Non-CF individuals had more distinct phage communities, which may reflect organisms in inhaled air. CF eukaryotic viral communities were dominated by a few viruses, including human herpesviruses and retroviruses. Functional metagenomics showed that all Non-CF viromes were similar, and that CF viromes were enriched in aromatic amino acid metabolism. The CF metagenomes occupied two different metabolic states, probably reflecting different disease states. There was one outlying CF virome which was characterized by an over-representation of Guanosine-5′-triphosphate,3′-diphosphate pyrophosphatase, an enzyme involved in the bacterial stringent response. Unique environments like the CF airway can drive functional adaptations, leading to shifts in metabolic profiles. These results have important clinical implications for CF, indicating that therapeutic measures may be more effective if used to change the respiratory environment, as opposed to shifting the taxonomic composition of resident microbiota.
The Human Microbiome Project (HMP) was undertaken with the goal of defining microbial communities in and on the bodies of healthy individuals using high-throughput metagenomic sequencing analysis. The viruses present in these microbial communities, the ‘human virome,’ are an important aspect of the human microbiome that is particularly understudied in the absence of overt disease. We analyzed eukaryotic double-stranded DNA (dsDNA) viruses, together with dsDNA replicative intermediates of single-stranded DNA viruses, in metagenomic sequence data generated by the HMP. We studied 706 samples from 102 subjects, with each subject sampled at up to five major body habitats: nose, skin, mouth, vagina, and stool. Fifty-one individuals had samples taken at two or three time points 30 to 359 days apart from at least one of the body habitats.
We detected an average of 5.5 viral genera in each individual. At least one virus was detected in 92% of the individuals sampled. These viruses included herpesviruses, papillomaviruses, polyomaviruses, adenoviruses, anelloviruses, parvoviruses, and circoviruses. Each individual had a distinct viral profile, demonstrating the high interpersonal diversity of the virome. Some components of the virome were stable over time.
This study is the first to use high-throughput DNA sequencing to describe the diversity of eukaryotic dsDNA viruses in a large cohort of normal individuals who were sampled at multiple body sites. Our results show that the human virome is a complex component of the microbial flora. Some viruses establish long-term infections that may be associated with increased risk or possibly with protection from disease. A better understanding of the composition and dynamics of the virome may hold important keys to human health.
Metagenomics; Microbiome; Virome
Viruses are recognized as the most abundant biological components on Earth, and they regulate the structure of microbial communities in many environments. In soil and marine environments, microorganism-infecting phages are the most common type of virus. Although several types of bacteriophage have been isolated from fermented foods, little is known about the overall viral assemblages (viromes) of these environments. In this study, metagenomic analyses were performed on the uncultivated viral communities from three fermented foods: fermented shrimp, kimchi, and sauerkraut. Using a high-throughput pyrosequencing technique, a total of 81,831, 70,591 and 69,464 viral sequences were obtained from fermented shrimp, kimchi and sauerkraut, respectively. Moreover, 37 to 50% of these sequences showed no significant hit against sequences in public databases. There were some discrepancies between the prediction of bacteriophage hosts via homology comparison and bacterial distribution, as determined from 16S rRNA gene sequencing. These discrepancies likely reflect the fact that the viral genomes of fermented foods are poorly represented in public databases. Double-stranded DNA viral communities were amplified from fermented foods by using a linker-amplified shotgun library. These communities were dominated by bacteriophages belonging to the viral order Caudovirales (i.e., Myoviridae, Podoviridae, and Siphoviridae). This study indicates that fermented foods contain less complex viral communities than many other environmental habitats, such as seawater, human feces, marine sediment, and soil.
Deep sequencing of untreated sewage provides an opportunity to monitor enteric infections in large populations and for high-throughput viral discovery. A metagenomics analysis of purified viral particles in untreated sewage from the United States (San Francisco, CA), Nigeria (Maiduguri), Thailand (Bangkok), and Nepal (Kathmandu) revealed sequences related to 29 eukaryotic viral families infecting vertebrates, invertebrates, and plants (BLASTx E score, <10⁻⁴), including known pathogens (>90% protein identities) in numerous viral families infecting humans (Adenoviridae, Astroviridae, Caliciviridae, Hepeviridae, Parvoviridae, Picornaviridae, Picobirnaviridae, and Reoviridae), plants (Alphaflexiviridae, Betaflexiviridae, Partitiviridae, Sobemovirus, Secoviridae, Tombusviridae, Tymoviridae, Virgaviridae), and insects (Dicistroviridae, Nodaviridae, and Parvoviridae). The full and partial genomes of a novel kobuvirus, salivirus, and sapovirus are described. A novel astrovirus (casa astrovirus) basal to those infecting mammals and birds, potentially representing a third astrovirus genus, was partially characterized. Potential new genera and families of viruses distantly related to members of the single-stranded RNA picorna-like virus superfamily were genetically characterized and named Picalivirus, Secalivirus, Hepelivirus, Nedicistrovirus, Cadicistrovirus, and Niflavirus. Phylogenetic analysis placed these highly divergent genomes near the root of the picorna-like virus superfamily, with possible vertebrate, plant, or arthropod hosts inferred from nucleotide composition analysis. Circular DNA genomes distantly related to the plant-infecting Geminiviridae family were named Baminivirus, Nimivirus, and Niminivirus. These results highlight the utility of analyzing sewage to monitor shedding of viral pathogens and the high viral diversity found in this common pollutant and provide genetic information to facilitate future studies of these newly characterized viruses.
Transitions between saline and fresh waters have been shown to be infrequent for microorganisms. Based on host-specific interactions, the presence of specific clades among hosts suggests the existence of freshwater-specific viral clades. Yet, little is known about the composition and diversity of the temperate freshwater viral communities, and even if freshwater lakes and marine waters harbor distinct clades for particular viral sub-families, this distinction remains to be demonstrated on a community scale.
To help identify the characteristics and potential specificities of freshwater viral communities, such communities from two lakes differing by their ecological parameters were studied through metagenomics. Both the cluster richness and the species richness of the Lake Bourget virome were significantly higher than those of Lake Pavin, highlighting a trend similar to the one observed for microorganisms (i.e. the species richness observed in mesotrophic lakes is greater than the one observed in oligotrophic lakes). Using 29 previously published viromes, the cluster richness was shown to vary between different environment types and appeared significantly higher in marine ecosystems than in other biomes. Furthermore, significant genetic similarity between viral communities of related environments was highlighted as freshwater, marine and hypersaline environments were separated from each other despite the vast geographical distances between sample locations within each of these biomes. An automated phylogeny procedure was then applied to marker genes of the major families of single-stranded (Microviridae, Circoviridae, Nanoviridae) and double-stranded (Caudovirales) DNA viruses. These phylogenetic analyses all spotlighted a very broad diversity and previously unknown clades undetectable by PCR analysis, clades that gathered sequences from the two lakes. Thus, the two freshwater viromes appear closely related, despite the significant ecological differences between the two lakes. Furthermore, freshwater viral communities appear genetically distinct from other aquatic ecosystems, demonstrating the specificity of freshwater viruses at a community scale for the first time.
Viruses are the most abundant known infectious agents on the planet and are significant drivers of diversity in a variety of ecosystems. Although there have been numerous studies of viral communities, few have focused on viruses within the indigenous human microbiota. We analyzed 2 267 695 virome reads from viral particles and compared them with 263 516 bacterial 16S rRNA gene sequences from the saliva of five healthy human subjects over a 2- to 3-month period, in order to improve our understanding of the role viruses have in the complex oral ecosystem. Our data reveal viral communities in human saliva dominated by bacteriophages whose constituents are temporally distinct. The preponderance of shared homologs between the salivary viral communities in two unrelated subjects in the same household suggests that environmental factors are determinants of community membership. When comparing salivary viromes to those from human stool and the respiratory tract, each group was distinct, further indicating that habitat is of substantial importance in shaping human viromes. Compared with coexisting bacteria, there was concordance among certain predicted host–virus pairings such as Veillonella and Streptococcus, whereas there was discordance among others such as Actinomyces. We identified 122 728 virulence factor homologs, suggesting that salivary viruses may serve as reservoirs for pathogenic gene function in the oral environment. That the vast majority of human oral viruses are bacteriophages whose putative gene function signifies some have a prominent role in lysogeny, suggests these viruses may have an important role in helping shape the microbial diversity in the human oral cavity.
saliva; bacteriophage; virus; microbiome; virome; metagenome
Recent advances in genomics of viruses and cellular life forms have greatly stimulated interest in the origins and evolution of viruses and, for the first time, offer an opportunity for a data-driven exploration of the deepest roots of viruses. Here we briefly review the current views of virus evolution and propose a new, coherent scenario that appears to be best compatible with comparative-genomic data and is naturally linked to models of cellular evolution that, from independent considerations, seem to be the most parsimonious among the existing ones.
Several genes coding for key proteins involved in viral replication and morphogenesis as well as the major capsid protein of icosahedral virions are shared by many groups of RNA and DNA viruses but are missing in cellular life forms. On the basis of this key observation and the data on extensive genetic exchange between diverse viruses, we propose the concept of the ancient virus world. The virus world is construed as a distinct contingent of viral genes that continuously retained its identity throughout the entire history of life. Under this concept, the principal lineages of viruses and related selfish agents emerged from the primordial pool of primitive genetic elements, the ancestors of both cellular and viral genes. Thus, notwithstanding the numerous gene exchanges and acquisitions attributed to later stages of evolution, most, if not all, modern viruses and other selfish agents are inferred to descend from elements that belonged to the primordial genetic pool. In this pool, RNA viruses would evolve first, followed by retroid elements, and DNA viruses. The Virus World concept is predicated on a model of early evolution whereby emergence of substantial genetic diversity antedates the advent of full-fledged cells, allowing for extensive gene mixing at this early stage of evolution. We outline a scenario of the origin of the main classes of viruses in conjunction with a specific model of precellular evolution under which the primordial gene pool dwelled in a network of inorganic compartments. Somewhat paradoxically, under this scenario, we surmise that selfish genetic elements ancestral to viruses evolved prior to typical cells, to become intracellular parasites once bacteria and archaea arrived at the scene. Selection against excessively aggressive parasites that would kill off the host ensembles of genetic elements would lead to early evolution of temperate virus-like agents and primitive defense mechanisms, possibly, based on the RNA interference principle. 
The emergence of the eukaryotic cell is construed as the second melting pot of virus evolution from which the major groups of eukaryotic viruses originated as a result of extensive recombination of genes from various bacteriophages, archaeal viruses, plasmids, and the evolving eukaryotic genomes. Again, this vision is predicated on a specific model of the emergence of eukaryotic cell under which archaeo-bacterial symbiosis was the starting point of eukaryogenesis, a scenario that appears to be best compatible with the data.
The existence of several genes (virus hallmark genes) that are central to virus replication and structure and are shared by a broad variety of viruses, but are missing from cellular genomes, suggests the model of an ancient virus world, a flow of virus-specific genes that has gone uninterrupted from the precellular stage of life's evolution to this day. This concept is tightly linked to two key conjectures on the evolution of cells: the existence of a complex, precellular, compartmentalized but extensively mixing and recombining pool of genes, and the origin of the eukaryotic cell by archaeo-bacterial fusion. The virus world concept and these models of major transitions in the evolution of cells provide complementary pieces of an emerging coherent picture of life's history.
This article was reviewed by W. Ford Doolittle, J. Peter Gogarten, and Arcady Mushegian.
The bovine rumen hosts a diverse and complex community of Eukarya, Bacteria, Archaea and viruses (including bacteriophage). The rumen viral population (the rumen virome) has received little attention compared to the rumen microbial population (the rumen microbiome). We used massively parallel sequencing of virus-like particles to investigate the diversity of the rumen virome in thirteen lactating Australian Holstein dairy cattle all housed in the same location, 12 of which were sampled on the same day.
Fourteen putative viral sequence fragments over 30 kbp in length were assembled and annotated. Many of the putative genes in the assembled contigs showed no homology to previously annotated genes, highlighting the large amount of work still required to fully annotate the functions encoded in viral genomes. The abundance of the contig sequences varied widely between animals, even though the cattle were of the same age and stage of lactation and were fed the same diet. Additionally, the twelve co-habiting animals shared a number of their dominant viral contigs. We compared the functional characteristics of our bovine viromes with those of other viromes, as well as rumen microbiomes. At the functional level, we found strong similarities between all of the viral samples, which were highly distinct from the rumen microbiome samples.
Our findings suggest a large amount of between-animal variation in the bovine rumen virome and that co-habiting animals may have more similar viromes than animals that do not co-habit. We report the deepest sequencing to date of the rumen virome. This work highlights the enormous amount of novelty and variation present in the rumen virome.
Virome; Rumen; Bacteriophage; Metagenomics
Viruses are a significant component of the intestinal microbiota in mammals. In recent years, advances in sequencing technologies and data analysis techniques have enabled detailed metagenomic studies investigating intestinal viromes (collections of bacteriophage and eukaryotic viral nucleic acids) and their potential contributions to the ecology of the microbiota. An important component of virome studies is the isolation and purification of virus-like particles (VLPs) from intestinal contents or feces. Several methods have been applied to isolate VLPs from intestinal samples, yet to our knowledge, the efficiency and reproducibility between methods have not been explored. A rigorous evaluation of methods for VLP purification is critical as many studies begin to move from descriptive analyses of virus diversity to studies striving to quantitatively compare viral abundances across many samples. Therefore, reproducible VLP purification methods which allow for high sample throughput are needed. Here we compared and evaluated four methods for VLP purification using artificial intestinal microbiota samples of known bacterial and viral composition.
We compared the following four methods of VLP purification from fecal samples: (i) filtration + DNase, (ii) dithiothreitol treatment + filtration + DNase, (iii) filtration + DNase + PEG precipitation and (iv) filtration + DNase + CsCl density gradient centrifugation. Three of the four tested methods worked well for VLP purification. We observed several differences between methods related to the removal efficiency of bacterial and host DNA and to biases against specific phages. In particular, the CsCl density gradient centrifugation method, which is frequently used for VLP purification, was most efficient in removing host-derived DNA, but also showed strong discrimination against specific phages and a lower reproducibility of quantitative results.
Based on our data we recommend the use of methods (i) or (ii) for large scale studies when quantitative comparison of viral abundances across samples is required. The CsCl density gradient centrifugation method, while being excellently suited to achieve highly purified samples, in our opinion, should be used with caution when performing quantitative studies.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-014-1207-4) contains supplementary material, which is available to authorized users.
Virus metagenomics; Viral metagenomes; Virus-like particles; Microbiome; Bacteriophage; CsCl density gradient
The aim of this study was to develop and demonstrate an approach for describing the diversity of human pathogenic viruses in an environmentally isolated viral metagenome.
Methods and Results
In silico bioinformatic experiments were used to select an optimum annotation strategy for discovering human viruses in virome datasets, and the selected strategy was applied to annotate a class B biosolids virome. Results from the in silico study indicated that an error rate below 1% in virus identification could be achieved when nucleotide-based search programs (BLASTn or tBLASTx), viral-genome-only databases, and sequence reads greater than 200 nt were considered. Within the 51,925 annotated sequences, 94 DNA and 19 RNA sequences were identified as human viruses. Virus diversity included environmentally transmitted agents such as parechovirus, coronavirus, adenovirus, and Aichi virus, as well as viruses associated with chronic human infections such as human herpes and hepatitis C viruses.
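The filtering step of such an annotation strategy can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the search results are available in the standard 12-column BLAST tabular format, and the E-value cutoff, the function names and the `Hit` type are hypothetical choices made for the example. Only the 200 nt read-length criterion comes from the text.

```python
from typing import Dict, Iterable, Iterator, NamedTuple

MIN_READ_LENGTH = 200  # from the text: only reads longer than 200 nt are considered
MAX_EVALUE = 1e-3      # assumption: the abstract does not state an E-value cutoff


class Hit(NamedTuple):
    read_id: str
    subject_id: str
    evalue: float


def long_reads(reads: Dict[str, str], min_length: int = MIN_READ_LENGTH) -> Dict[str, str]:
    """Keep only reads strictly longer than min_length nucleotides."""
    return {rid: seq for rid, seq in reads.items() if len(seq) > min_length}


def parse_tabular_hits(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse 12-column BLAST tabular output (qseqid, sseqid, ..., evalue, bitscore)."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        yield Hit(read_id=fields[0], subject_id=fields[1], evalue=float(fields[10]))


def best_hits(hits: Iterable[Hit], kept_reads: Dict[str, str],
              max_evalue: float = MAX_EVALUE) -> Dict[str, Hit]:
    """Return the lowest-E-value hit per retained read, ignoring weak hits."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.read_id not in kept_reads or hit.evalue > max_evalue:
            continue
        current = best.get(hit.read_id)
        if current is None or hit.evalue < current.evalue:
            best[hit.read_id] = hit
    return best
```

Applying the length filter before interpreting hits, rather than after, mirrors the finding that short reads contribute disproportionately to misidentification; the subject identifiers would then be matched against a list of human virus genomes in a separate step.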
This study provided a bioinformatic approach for identifying pathogens in a virome dataset, and demonstrated the human virus diversity in a relevant environmental sample.
Significance and Impact of Study
As the costs of next generation sequencing decrease, the pathogen diversity described by virus metagenomes will provide an unbiased guide for subsequent cell-culture and quantitative pathogen analyses, and ensure that highly enriched and relevant pathogens are not neglected in exposure and risk assessments.
virus; bioinformatics; biosolids; next generation DNA sequencing; viral metagenome; pathogen; virome
Viruses are the most abundant and diverse genetic entities on Earth; however, broad surveys of viral diversity are hindered by the lack of a universal assay for viruses and the inability to sample a sufficient number of individual hosts. This study utilized vector-enabled metagenomics (VEM) to provide a snapshot of the diversity of DNA viruses present in three mosquito samples from San Diego, California. The majority of the sequences were novel, suggesting that the viral community in mosquitoes, as well as in the animal and plant hosts they feed on, is highly diverse and largely uncharacterized. Each mosquito sample contained a distinct viral community. The mosquito viromes contained sequences related to a broad range of animal, plant, insect and bacterial viruses. Animal viruses identified included anelloviruses, circoviruses, herpesviruses, poxviruses, and papillomaviruses, which mosquitoes may have obtained from vertebrate hosts during blood feeding. Notably, sequences related to human papillomaviruses were identified in one of the mosquito samples. Sequences similar to plant viruses were identified in all mosquito viromes, which were potentially acquired through feeding on plant nectar. Numerous bacteriophages and insect viruses were also detected, including a novel densovirus likely infecting Culex erythrothorax. Through sampling insect vectors, VEM enables broad surveys of viral diversity and has significantly increased our knowledge of the DNA viruses present in mosquitoes.
Viral diversity and lifecycles are poorly understood in the human gut and other body habitats. Therefore, we sequenced the viromes (metagenomes) of virus-like particles isolated from fecal samples collected from adult female monozygotic twins and their mothers at three time points over a one-year period. These datasets were compared to datasets of sequenced bacterial 16S rRNA genes and total fecal community DNA. Co-twins and their mothers share a significantly greater degree of similarity in their fecal bacterial communities than do unrelated individuals. In contrast, viromes are unique to individuals regardless of their degree of genetic relatedness. Despite remarkable interpersonal variations in viromes and their encoded functions, intrapersonal diversity is very low, with >95% of virotypes retained over the period surveyed, and with viromes dominated by a few temperate phage that exhibit remarkable genetic stability. These results indicate that a predatory viral-microbial dynamic, manifest in a number of other characterized environmental ecosystems, is notably absent in the very distal intestine.
Phylogenetic mapping of metagenomics data reveals the taxonomic distribution of large DNA viruses in the sea, including giant viruses of the Mimiviridae family.
Viruses are ubiquitous and the most abundant biological entities in marine environments. Metagenomics studies are increasingly revealing the huge genetic diversity of marine viruses. In this study, we used a new approach - 'phylogenetic mapping' - to obtain a comprehensive picture of the taxonomic distribution of large DNA viruses represented in the Sorcerer II Global Ocean Sampling Expedition metagenomic data set.
Using DNA polymerase genes as a taxonomic marker, we identified 811 homologous sequences of likely viral origin. As expected, most of these sequences corresponded to phages. Interestingly, the second largest viral group corresponded to that containing mimivirus and three related algal viruses. We also identified several DNA polymerase homologs closely related to Asfarviridae, a viral family poorly represented among isolated viruses and, until now, limited to terrestrial animal hosts. Finally, our approach allowed the identification of a new combination of genes in 'viral-like' sequences.
Albeit only recently discovered, giant viruses of the Mimiviridae family appear to constitute a diverse, quantitatively important and ubiquitous component of the population of large eukaryotic DNA viruses in the sea.
Viral genomes often contain metabolic genes that were acquired from host genomes (auxiliary genes). It is assumed that these genes are fixed in viral genomes as a result of a selective force favoring viruses that acquire specific metabolic functions. While many individual auxiliary genes have been observed in viral genomes and metagenomes, it is important to investigate the abundance of auxiliary genes and metabolic functions in the marine environment to better understand their role in promoting viral reproduction.
In this study, we searched for enriched viral auxiliary genes and mapped them to metabolic pathways. To initially identify enriched auxiliary genes, we analyzed metagenomic microbial reads from the Global Ocean Survey (GOS) dataset that were characterized as viral, as well as marine virome and microbiome datasets from the Line Islands. Viral-enriched genes were mapped to a “global metabolism network” that comprises all KEGG metabolic pathways. Our analysis of the viral-enriched pathways revealed that purine and pyrimidine metabolism pathways are among the most enriched pathways. Moreover, many other viral-enriched metabolic pathways were found to be closely associated with the purine and pyrimidine metabolism pathways. Furthermore, we observed that sequential reactions are promoted in pathways having a high proportion of enriched genes. In addition, these enriched genes were found to be of modular nature, participating in several pathways.
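The idea of viral-enriched genes and pathways can be illustrated with a small sketch. The abstract does not specify the enrichment statistic, so the frequency-ratio criterion, the pseudocount, the threshold and all function names below are assumptions chosen for illustration only; they are not the authors' method.

```python
from typing import Dict, Set


def enriched_genes(viral_counts: Dict[str, int],
                   microbial_counts: Dict[str, int],
                   min_ratio: float = 2.0,
                   pseudocount: float = 1.0) -> Set[str]:
    """Genes whose relative frequency in viral reads exceeds that in microbial
    reads by at least min_ratio (hypothetical criterion; a pseudocount avoids
    division by zero for genes absent from one dataset)."""
    genes = set(viral_counts) | set(microbial_counts)
    viral_total = sum(viral_counts.values()) + pseudocount * len(genes)
    microbial_total = sum(microbial_counts.values()) + pseudocount * len(genes)
    enriched = set()
    for gene in genes:
        viral_freq = (viral_counts.get(gene, 0) + pseudocount) / viral_total
        microbial_freq = (microbial_counts.get(gene, 0) + pseudocount) / microbial_total
        if viral_freq / microbial_freq >= min_ratio:
            enriched.add(gene)
    return enriched


def pathway_enrichment(pathways: Dict[str, Set[str]],
                       enriched: Set[str]) -> Dict[str, float]:
    """Fraction of each pathway's genes that are viral-enriched."""
    return {name: len(genes & enriched) / len(genes)
            for name, genes in pathways.items() if genes}
```

Ranking pathways by the fraction returned from `pathway_enrichment` gives a rough analogue of identifying the most enriched pathways; a real analysis would also need a significance test and correction for pathway size.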
Our naïve metagenomic analyses strongly support the well-established notion that viral auxiliary genes promote viral replication via both degradation of host DNA and RNA as well as a shift of the host metabolism towards nucleotide biosynthesis, clearly indicating that comparative metagenomics can be used to understand different environments and systems without prior knowledge of pathways involved.
Metabolic networks; Metabolism; Nucleotide biosynthesis; Phage; Virus
Bats are hosts to a variety of viruses capable of zoonotic transmissions. Because of increased contact between bats, humans, and other animal species, the possibility exists for further cross-species transmissions and ensuing disease outbreaks. We describe here full and partial viral genomes identified using metagenomics in the guano of bats from California and Texas. A total of 34% and 58% of 390,000 sequence reads from bat guano in California and Texas, respectively, were related to eukaryotic viruses, and the largest proportion of those infect insects, reflecting the diet of these insectivorous bats, including members of the viral families Dicistroviridae, Iflaviridae, Tetraviridae, and Nodaviridae and the subfamily Densovirinae. The second largest proportion of virus-related sequences infects plants and fungi, likely reflecting the diet of ingested insects, including members of the viral families Luteoviridae, Secoviridae, Tymoviridae, and Partitiviridae and the genus Sobemovirus. Bat guano viruses related to those infecting mammals comprised the third largest group, including members of the viral families Parvoviridae, Circoviridae, Picornaviridae, Adenoviridae, Poxviridae, Astroviridae, and Coronaviridae. No close relative of known human viral pathogens was identified in these bat populations. Phylogenetic analysis was used to clarify the relationship to known viral taxa of novel sequences detected in bat guano samples, showing that some guano viral sequences fall outside existing taxonomic groups. This initial characterization of the bat guano virome, the first metagenomic analysis of viruses in wild mammals using second-generation sequencing, therefore showed the presence of previously unidentified viral species, genera, and possibly families. Viral metagenomics is a useful tool for genetically characterizing viruses present in animals with the known capability of direct or indirect viral zoonosis to humans.
Nearly complete genome sequences of three novel RNA viruses were acquired from the stool of an Afghan child. Phylogenetic analysis indicated that these viruses belong to the picorna-like virus superfamily. Because of their unique genomic organization and deep phylogenetic roots, we propose these viruses, provisionally named calhevirus, tetnovirus-1, and tetnovirus-2, as prototypes of new viral families. A newly developed nucleotide composition analysis (NCA) method was used to compare mononucleotide and dinucleotide frequencies for RNA viruses infecting mammals, plants, or insects. Using a large training data set of 284 representative picornavirus-like genomic sequences with defined host origins, NCA correctly identified the kingdom or phylum of the viral host for >95% of picorna-like viruses. NCA predicted an insect host origin for the 3 novel picorna-like viruses. Their presence in human stool therefore likely reflects ingestion of insect-contaminated food. As metagenomic analyses of different environments and organisms continue to yield highly divergent viral genomes, NCA provides a rapid and robust method to identify their likely cellular hosts.
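The composition features that NCA compares can be computed as in the sketch below. It covers only the feature extraction (mononucleotide and dinucleotide frequencies); the classification step that assigns a host group is not described in the abstract and is deliberately left out. Function names and the handling of ambiguous bases are assumptions made for this example.

```python
from collections import Counter
from itertools import product
from typing import Dict, List

BASES = "ACGU"


def _normalize(seq: str) -> str:
    """Upper-case the sequence and express it in the RNA alphabet."""
    return seq.upper().replace("T", "U")


def mononucleotide_frequencies(seq: str) -> Dict[str, float]:
    """Relative frequency of each base, ignoring ambiguous characters."""
    counts = Counter(base for base in _normalize(seq) if base in BASES)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return {base: counts[base] / total for base in BASES}


def dinucleotide_frequencies(seq: str) -> Dict[str, float]:
    """Relative frequency of each overlapping dinucleotide (16 values)."""
    s = _normalize(seq)
    pairs = [s[i:i + 2] for i in range(len(s) - 1)]
    valid = [p for p in pairs if p[0] in BASES and p[1] in BASES]
    if not valid:
        raise ValueError("sequence contains no unambiguous dinucleotides")
    counts = Counter(valid)
    return {a + b: counts[a + b] / len(valid) for a, b in product(BASES, repeat=2)}


def composition_vector(seq: str) -> List[float]:
    """Concatenate the 4 mononucleotide and 16 dinucleotide frequencies."""
    mono = mononucleotide_frequencies(seq)
    di = dinucleotide_frequencies(seq)
    return [mono[b] for b in BASES] + [di[a + b] for a, b in product(BASES, repeat=2)]
```

A 20-dimensional vector of this kind, computed for each genome in a training set with known hosts, could then be passed to any standard classifier to predict the host group of a new genome.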
Viruses are abundant biological entities on Earth, and the emergence of viral pathogens has become a serious threat to aquaculture and fisheries worldwide. However, our response to viral pathogens has been largely reactive, in the sense that a new pathogen is usually not discovered until it has already reached epidemic proportions. Current diagnostic methods such as PCR, immunological assays and pan-viral microarrays are limited in their ability to identify novel viruses. In this context, knowledge of the diversity of viruses in healthy and diseased states becomes important for understanding their role in the health of aquaculture species. Viral metagenomics, which involves viral purification and shotgun sequencing, has proven to be useful for understanding viral diversity and describing novel viruses in new diseases and has been recognized as an important tool for discovering novel viruses in human and veterinary medicine. With the advancements in sequencing technology and the development of bioinformatics tools for nucleic acid sequence assembly and annotation, information on novel viruses and the diversity of viruses in marine ecosystems has been rapidly expanding through viral metagenomics. Novel circoviruses and RNA viruses in Tampa Bay pink shrimp, anellovirus in sea lion, picornavirus in ringed seals and several new viruses of marine animals have recently been described using viral metagenomics, and this tool has also recently been used to describe viral diversity in aquaculture ponds. Further, a large amount of information has been generated on the diversity of viruses in the marine environment using viral metagenomics during the last decade. Viral metagenomics has great potential for discovering novel viruses in asymptomatic marine animals that are candidates for aquaculture/mariculture, some of which may assume pathogenic status under high-density culture and stress.
Additionally, viral metagenomics can help our understanding of viruses present in aquaculture/mariculture settings and routine pathogen surveillance programmes.
Aquaculture; Bioinformatic tools; Diseases; Marine animals; Viral diversity; Viral metagenomics