Viral diversity and lifecycles are poorly understood in the human gut and other body habitats. Therefore, we sequenced the viromes (metagenomes) of virus-like particles isolated from fecal samples collected from adult female monozygotic twins and their mothers at three time points over a one-year period. These datasets were compared to datasets of sequenced bacterial 16S rRNA genes and total fecal community DNA. Co-twins and their mothers share a significantly greater degree of similarity in their fecal bacterial communities than do unrelated individuals. In contrast, viromes are unique to individuals regardless of their degree of genetic relatedness. Despite remarkable interpersonal variations in viromes and their encoded functions, intrapersonal diversity is very low, with >95% of virotypes retained over the period surveyed, and with viromes dominated by a few temperate phage that exhibit remarkable genetic stability. These results indicate that a predatory viral-microbial dynamic, manifest in a number of other characterized environmental ecosystems, is notably absent in the very distal intestine.
Bacterial viruses (bacteriophages) have a key role in shaping the development and functional outputs of host microbiomes. Although metagenomic approaches have greatly expanded our understanding of the prokaryotic virosphere, additional tools are required for the phage-oriented dissection of metagenomic data sets, and host-range affiliation of recovered sequences. Here we demonstrate the application of a genome signature-based approach to interrogate conventional whole-community metagenomes and access subliminal, phylogenetically targeted, phage sequences present within. We describe a portion of the biological dark matter extant in the human gut virome, and bring to light a population of potentially gut-specific Bacteroidales-like phage, poorly represented in existing virus like particle-derived viral metagenomes. These predominantly temperate phage were shown to encode functions of direct relevance to human health in the form of antibiotic resistance genes, and provided evidence for the existence of putative ‘viral-enterotypes’ among this fraction of the human gut virome.
Bacteriophages have a significant impact on microbial ecosystems, but additional tools are needed to assess viral communities. Ogilvie et al. present a new strategy to extract viral sequences from metagenomic data sets, and present new insights on their function in the gut ecosystem.
The aim of this study was to develop and demonstrate an approach for describing the diversity of human pathogenic viruses in an environmentally isolated viral metagenome.
Methods and Results
In silico bioinformatic experiments were used to select an optimum annotation strategy for discovering human viruses in virome datasets, and applied to annotate a class B biosolids virome. Results from the in silico study indicated that less than 1% errors in virus identification could be achieved when nucleotide-based search programs (BLASTn or tBLASTx), viral genome only databases, and sequence reads greater than 200 nt were considered. Within the 51,925 annotated sequences, 94 DNA and 19 RNA sequences were identified as human viruses. Virus diversity included environmentally transmitted agents such as parechovirus, coronavirus, adenovirus, and aichi virus, as well as viruses associated with chronic human infections such as human herpes and hepatitis C viruses.
This study provided a bioinformatic approach for identifying pathogens in a virome dataset, and demonstrated the human virus diversity in a relevant environmental sample.
Significance and Impact of Study
As the costs of next generation sequencing decrease, the pathogen diversity described by virus metagenomes will provide an unbiased guide for subsequent cell-culture and quantitative pathogen analyses, and ensure that highly enriched and relevant pathogens are not neglected in exposure and risk assessments.
virus; bioinformatics; biosolids; next generation DNA sequencing; viral metagenome; pathogen; virome
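The read-length criterion reported in the Methods and Results above (reads greater than 200 nt) can be illustrated with a minimal sketch. The function names `read_fasta` and `filter_by_length` and the FASTA input format are assumptions made for illustration; the study does not describe its own implementation, and the subsequent BLASTn/tBLASTx search against a viral-genome-only database is not shown.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, min_len=200):
    """Keep reads strictly longer than min_len nucleotides.

    The default of 200 nt follows the threshold quoted in the abstract;
    treating it as a strict inequality is an assumption.
    """
    return [(h, s) for h, s in records if len(s) > min_len]


if __name__ == "__main__":
    # Hypothetical toy reads of 150, 240, and 200 nt.
    fasta = [">r1", "A" * 150, ">r2", "ACGT" * 60, ">r3", "A" * 200]
    kept = filter_by_length(read_fasta(fasta))
    print([h for h, _ in kept])
```

Only reads passing this filter would be carried forward to the similarity search in such a workflow.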
Acanthamoeba polyphaga mimivirus is the largest known ds-DNA virus and its 1.2 Mb-genome sequence has revealed many unique features. Mimivirus occupies an independent lineage among eukaryotic viruses and its known hosts include only species from the Acanthamoeba genus. The existence of mimivirus relatives was first suggested by the analysis of the Sargasso Sea metagenomic data.
We now further demonstrate the presence of numerous "mimivirus-like" sequences using a larger marine metagenomic data set. We also show that the DNA polymerase sequences from three algal viruses (CeV01, PpV01, PoV01) infecting different marine algal species (Chrysochromulina ericina, Phaeocystis pouchetii, Pyramimonas orientalis) are very closely related to their homolog in mimivirus.
Our results suggest that the numerous mimivirus-related sequences identified in marine environments are likely to originate from diverse large DNA viruses infecting phytoplankton. Micro-algae thus constitute a new category of potential hosts in which to look for new species of Mimiviridae.
The residence of dinoflagellate algae (genus: Symbiodinium) within scleractinian corals is critical to the construction and persistence of tropical reefs. In recent decades, however, acute and chronic environmental stressors have frequently destabilized this symbiosis, ultimately leading to coral mortality and reef decline. Viral infection has been suggested as a trigger of coral–Symbiodinium dissociation; knowledge of the diversity and hosts of coral-associated viruses is critical to evaluating this hypothesis. Here, we present the first genomic evidence of viruses associated with Symbiodinium, based on the presence of transcribed +ss (single-stranded) RNA and ds (double-stranded) DNA virus-like genes in complementary DNA viromes of the coral Montastraea cavernosa and expressed sequence tag (EST) libraries generated from Symbiodinium cultures. The M. cavernosa viromes contained divergent viral sequences similar to the major capsid protein of the dinoflagellate-infecting +ssRNA Heterocapsa circularisquama virus, suggesting a highly novel dinornavirus could infect Symbiodinium. Further, similarities to dsDNA viruses dominated (∼69%) eukaryotic viral similarities in the M. cavernosa viromes. Transcripts highly similar to eukaryotic algae-infecting phycodnaviruses were identified in the viromes, and homologs to these sequences were found in two independently generated Symbiodinium EST libraries. Phylogenetic reconstructions substantiate that these transcripts are undescribed and distinct members of the nucleocytoplasmic large DNA virus (NCLDVs) group. Based on a preponderance of evidence, we infer that the novel NCLDVs and RNA virus described here are associated with the algal endosymbionts of corals. If such viruses disrupt Symbiodinium, they are likely to impact the flexibility and/or stability of coral–algal symbioses, and thus long-term reef health and resilience.
coral reef; Heterocapsa circularisquama RNA virus (HcRNAV); nuclear cytoplasmic large DNA virus (NCLDV); Phycodnaviridae; Symbiodinium; virome
California sea lions are one of the major marine mammal species along the Pacific coast of North America. Sea lions are susceptible to a wide variety of viruses, some of which can be transmitted to or from terrestrial mammals. Using an unbiased viral metagenomic approach, we surveyed the fecal virome in California sea lions of different ages and health statuses. Averages of 1.6 and 2.5 distinct mammalian viral species were shed by pups and juvenile sea lions, respectively. Previously undescribed mammalian viruses from four RNA virus families (Astroviridae, Picornaviridae, Caliciviridae, and Reoviridae) and one DNA virus family (Parvoviridae) were characterized. The first complete or partial genomes of sapeloviruses, sapoviruses, noroviruses, and bocavirus in marine mammals are reported. Astroviruses and bocaviruses showed the highest prevalence and abundance in California sea lion feces. The diversity of bacteriophages was higher in unweaned sea lion pups than in juveniles and animals in rehabilitation, where the phage community consisted largely of phages related to the family Microviridae. This study increases our understanding of the viral diversity in marine mammals, highlights the high rate of enteric viral infections in these highly social carnivores, and may be used as a baseline viral survey for comparison with samples from California sea lions during unexplained disease outbreaks.
The pig faecal virome, which comprises the community of viruses present in pig faeces, is complex and consists of pig viruses, bacteriophages, transiently passaged plant viruses and other minor virus species. Little is known about the factors influencing its general composition. Here, the effect of the probiotic bacterium Enterococcus faecium (E. faecium) NCIMB 10415 on the pig faecal virome composition was analysed in a pig feeding trial with sows and their piglets, which received either the probiotic bacterium or not.
From 8 pooled faecal samples derived from the feeding trial, DNA and RNA virus particles were prepared and subjected to process-controlled Next Generation Sequencing resulting in 390,650 sequence reads. On average, 14% of the reads showed significant sequence identities to known viruses. The percentage of detected mammalian virus sequences was highest (55–77%) in the samples of the youngest piglets and lowest (8–10%) in the samples of the sows. In contrast, the percentage of bacteriophage sequences increased from 22–44% in the youngest piglets to approximately 90% in the sows. The dominating mammalian viruses differed remarkably among 12 day-old piglets (kobuvirus), 54 day-old piglets (boca-, dependo- and pig stool-associated small circular DNA virus [PigSCV]) and the sows (PigSCV, circovirus and “circovirus-like” viruses CB-A and RW-A). In addition, the Shannon index, which reflects the diversity of sequences present in a sample, was generally higher for the sows as compared to the piglets. No consistent differences in the virome composition could be identified between the viromes of the probiotic bacterium-treated group and the control group.
The analysis indicates that the pig faecal virome shows a high variability and that its general composition is mainly dependent on the age of the pigs. Changes caused by feeding with the probiotic bacterium E. faecium could not be demonstrated using the applied metagenomics method.
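The Shannon index used above to compare sows and piglets is conventionally defined as H = -Σ p_i ln p_i, where p_i is the proportion of reads assigned to taxon i. The sketch below assumes the natural logarithm and uses hypothetical read counts; the study's own counts and software are not reproduced here.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


if __name__ == "__main__":
    # Hypothetical per-taxon read counts, for illustration only.
    even_sample = [25, 25, 25, 25]    # reads spread evenly over four taxa
    skewed_sample = [97, 1, 1, 1]     # one dominant taxon
    print(shannon_index(even_sample), shannon_index(skewed_sample))
```

A sample dominated by a single virus yields a lower index than one in which reads are spread evenly across taxa.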
Viruses are the most abundant known infectious agents on the planet and are significant drivers of diversity in a variety of ecosystems. Although there have been numerous studies of viral communities, few have focused on viruses within the indigenous human microbiota. We analyzed 2 267 695 virome reads from viral particles and compared them with 263 516 bacterial 16S rRNA gene sequences from the saliva of five healthy human subjects over a 2- to 3-month period, in order to improve our understanding of the role viruses have in the complex oral ecosystem. Our data reveal viral communities in human saliva dominated by bacteriophages whose constituents are temporally distinct. The preponderance of shared homologs between the salivary viral communities in two unrelated subjects in the same household suggests that environmental factors are determinants of community membership. When comparing salivary viromes to those from human stool and the respiratory tract, each group was distinct, further indicating that habitat is of substantial importance in shaping human viromes. Compared with coexisting bacteria, there was concordance among certain predicted host–virus pairings such as Veillonella and Streptococcus, whereas there was discordance among others such as Actinomyces. We identified 122 728 virulence factor homologs, suggesting that salivary viruses may serve as reservoirs for pathogenic gene function in the oral environment. That the vast majority of human oral viruses are bacteriophages whose putative gene function signifies some have a prominent role in lysogeny, suggests these viruses may have an important role in helping shape the microbial diversity in the human oral cavity.
saliva; bacteriophage; virus; microbiome; virome; metagenome
Nearly complete genome sequences of three novel RNA viruses were acquired from the stool of an Afghan child. Phylogenetic analysis indicated that these viruses belong to the picorna-like virus superfamily. Because of their unique genomic organization and deep phylogenetic roots, we propose these viruses, provisionally named calhevirus, tetnovirus-1, and tetnovirus-2, as prototypes of new viral families. A newly developed nucleotide composition analysis (NCA) method was used to compare mononucleotide and dinucleotide frequencies for RNA viruses infecting mammals, plants, or insects. Using a large training data set of 284 representative picornavirus-like genomic sequences with defined host origins, NCA correctly identified the kingdom or phylum of the viral host for >95% of picorna-like viruses. NCA predicted an insect host origin for the 3 novel picorna-like viruses. Their presence in human stool therefore likely reflects ingestion of insect-contaminated food. As metagenomic analyses of different environments and organisms continue to yield highly divergent viral genomes, NCA provides a rapid and robust method to identify their likely cellular hosts.
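The feature extraction underlying a composition analysis of this kind can be sketched as follows. This is an illustration only: the function name `composition_features` is hypothetical, the handling of ambiguous bases is an assumption, and the classification step that maps these frequencies to a host group is not specified in the abstract and is therefore omitted.

```python
from collections import Counter
from itertools import product

BASES = "ACGU"


def composition_features(seq):
    """Return mononucleotide and dinucleotide frequencies for an RNA genome.

    DNA-style input is accepted by converting T to U. Positions holding
    ambiguous symbols (e.g. N) are skipped, which is an assumption.
    """
    seq = seq.upper().replace("T", "U")
    mono = Counter(b for b in seq if b in BASES)
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    features = {b: (mono[b] / n_mono if n_mono else 0.0) for b in BASES}
    for a, b in product(BASES, repeat=2):
        features[a + b] = di[a + b] / n_di if n_di else 0.0
    return features


if __name__ == "__main__":
    # Hypothetical toy sequence; real genomes are thousands of nucleotides long.
    feats = composition_features("ACGUACGUNACG")
    print(len(feats), feats["A"], feats["AC"])
```

Each genome is reduced to a 20-element vector (4 mononucleotide and 16 dinucleotide frequencies), which could then be passed to any standard classifier trained on genomes with known hosts.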
Phylogenetic mapping of metagenomics data reveals the taxonomic distribution of large DNA viruses in the sea, including giant viruses of the Mimiviridae family.
Viruses are ubiquitous and the most abundant biological entities in marine environments. Metagenomics studies are increasingly revealing the huge genetic diversity of marine viruses. In this study, we used a new approach - 'phylogenetic mapping' - to obtain a comprehensive picture of the taxonomic distribution of large DNA viruses represented in the Sorcerer II Global Ocean Sampling Expedition metagenomic data set.
Using DNA polymerase genes as a taxonomic marker, we identified 811 homologous sequences of likely viral origin. As expected, most of these sequences corresponded to phages. Interestingly, the second largest viral group corresponded to that containing mimivirus and three related algal viruses. We also identified several DNA polymerase homologs closely related to Asfarviridae, a viral family poorly represented among isolated viruses and, until now, limited to terrestrial animal hosts. Finally, our approach allowed the identification of a new combination of genes in 'viral-like' sequences.
Albeit only recently discovered, giant viruses of the Mimiviridae family appear to constitute a diverse, quantitatively important and ubiquitous component of the population of large eukaryotic DNA viruses in the sea.
Novel DNA sequencing techniques, referred to as “next-generation” sequencing (NGS), provide high speed and throughput that can produce an enormous volume of sequences with many possible applications in research and diagnostic settings. In this article, we provide an overview of the many applications of NGS in diagnostic virology. NGS techniques have been used for high-throughput whole viral genome sequencing, such as sequencing of new influenza viruses, for detection of viral genome variability and evolution within the host, such as investigation of human immunodeficiency virus and human hepatitis C virus quasispecies, and monitoring of low-abundance antiviral drug-resistance mutations. NGS techniques have been applied to metagenomics-based strategies for the detection of unexpected disease-associated viruses and for the discovery of novel human viruses, including cancer-related viruses. Finally, the human virome in healthy and disease conditions has been described by NGS-based metagenomics.
next generation sequencing; deep sequencing; virus discovery; metagenomics; virome; virology; quasispecies; molecular diagnosis; human immunodeficiency virus; drug resistance; minority variants
Viruses are abundant in the ocean and a major driving force in plankton ecology and evolution. It has been assumed that most of the viruses in seawater contain DNA and infect bacteria, but RNA-containing viruses in the ocean, which almost exclusively infect eukaryotes, have never been quantified. We compared the total mass of RNA and DNA in the viral fraction harvested from seawater and using data on the mass of nucleic acid per RNA- or DNA-containing virion, estimated the abundances of each. Our data suggest that the abundance of RNA viruses rivaled or exceeded that of DNA viruses in samples of coastal seawater. The dominant RNA viruses in the samples were marine picorna-like viruses, which have small genomes and are at or below the detection limit of common fluorescence-based counting methods. If our results are typical, this means that counts of viruses and the rate measurements that depend on them, such as viral production, are significantly underestimated by current practices. As these RNA viruses infect eukaryotes, our data imply that protists contribute more to marine viral dynamics than one might expect based on their relatively low abundance. This conclusion is a departure from the prevailing view of viruses in the ocean, but is consistent with earlier theoretical predictions.
RNA viruses; DNA viruses; virioplankton; marine; seawater; abundance; metagenome
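The abundance estimate described above divides the total mass of RNA or DNA in the viral fraction by the mass of nucleic acid per virion. A minimal sketch of that arithmetic follows. The molar masses (about 340 g/mol per ssRNA nucleotide, about 660 g/mol per dsDNA base pair) are rough textbook approximations, and the genome lengths and masses in the example are hypothetical values chosen for illustration, not data from the study.

```python
AVOGADRO = 6.02214076e23  # particles per mole

# Approximate average molar masses (assumptions, g/mol per unit of length).
SSRNA_G_PER_MOL_PER_NT = 340.0
DSDNA_G_PER_MOL_PER_BP = 660.0


def mass_per_genome_g(length, g_per_mol_per_unit):
    """Mass in grams of one genome of the given length."""
    return length * g_per_mol_per_unit / AVOGADRO


def estimate_virions(total_mass_g, length, g_per_mol_per_unit):
    """Number of virions implied by a total nucleic acid mass."""
    return total_mass_g / mass_per_genome_g(length, g_per_mol_per_unit)


if __name__ == "__main__":
    # Hypothetical inputs: 1 ng each of viral RNA and DNA, a 9,000 nt
    # ssRNA genome and a 50,000 bp dsDNA genome.
    rna = estimate_virions(1e-9, 9_000, SSRNA_G_PER_MOL_PER_NT)
    dna = estimate_virions(1e-9, 50_000, DSDNA_G_PER_MOL_PER_BP)
    print(f"RNA virions: {rna:.3e}, DNA virions: {dna:.3e}")
```

Because small RNA genomes carry far less mass per particle, an equal mass of RNA corresponds to many more virions than the same mass of DNA, which is the reasoning behind the comparison in the abstract.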
There are no known RNA viruses that infect Archaea. Filling this gap in our knowledge of viruses will enhance our understanding of the relationships between RNA viruses from the three domains of cellular life and, in particular, could shed light on the origin of the enormous diversity of RNA viruses infecting eukaryotes. We describe here the identification of novel RNA viral genome segments from high-temperature acidic hot springs in Yellowstone National Park in the United States. These hot springs harbor low-complexity cellular communities dominated by several species of hyperthermophilic Archaea. A viral metagenomics approach was taken to assemble segments of these RNA virus genomes from viral populations isolated directly from hot spring samples. Analysis of these RNA metagenomes demonstrated unique gene content that is not generally related to known RNA viruses of Bacteria and Eukarya. However, genes for RNA-dependent RNA polymerase (RdRp), a hallmark of positive-strand RNA viruses, were identified in two contigs. One of these contigs is approximately 5,600 nucleotides in length and encodes a polyprotein that also contains a region homologous to the capsid protein of nodaviruses, tetraviruses, and birnaviruses. Phylogenetic analyses of the RdRps encoded in these contigs indicate that the putative archaeal viruses form a unique group that is distinct from the RdRps of RNA viruses of Eukarya and Bacteria. Collectively, our findings suggest the existence of novel positive-strand RNA viruses that probably replicate in hyperthermophilic archaeal hosts and are highly divergent from RNA viruses that infect eukaryotes and even more distant from known bacterial RNA viruses. These positive-strand RNA viruses might be direct ancestors of RNA viruses of eukaryotes.
In this study, we analyzed viral metagenomes (viromes) in the sedimentary habitats of three geographically and geologically distinct (hado)pelagic environments in the northwest Pacific; the Izu-Ogasawara Trench (water depth = 9,760 m) (OG), the Challenger Deep in the Mariana Trench (10,325 m) (MA), and the forearc basin off the Shimokita Peninsula (1,181 m) (SH). Virus abundance ranged from 10^6 to 10^11 viruses/cm^3 of sediments (down to 30 cm below the seafloor [cmbsf]). We recovered viral DNA assemblages (viromes) from the (hado)pelagic sediment samples and obtained a total of 37,458, 39,882, and 70,882 sequence reads by 454 GS FLX Titanium pyrosequencing from the virome libraries of the OG, MA, and SH (hado)pelagic sediments, respectively. Only 24–30% of the sequence reads from each virome library exhibited significant similarities to the sequences deposited in the public nr protein database (E-value < 10^-3 in BLAST). Among the sequences identified as potential viral genes based on the BLAST search, 95–99% of the sequence reads in each library were related to genes from single-stranded DNA (ssDNA) viral families, including Microviridae, Circoviridae, and Geminiviridae. A relatively high abundance of sequences related to the genetic markers (major capsid protein [VP1] and replication protein [Rep]) of two ssDNA viral groups were also detected in these libraries, thereby revealing a high genotypic diversity of their viruses (833 genotypes for VP1 and 2,551 genotypes for Rep). A majority of the viral genes predicted from each library were classified into three ssDNA viral protein categories: Rep, VP1, and minor capsid protein. The deep-sea sedimentary viromes were distinct from the viromes obtained from the oceanic and fresh waters and marine eukaryotes, and thus, deep-sea sediments harbor novel viromes, including previously unidentified ssDNA viruses.
Viruses are recognized as the most abundant biological components on Earth, and they regulate the structure of microbial communities in many environments. In soil and marine environments, microorganism-infecting phages are the most common type of virus. Although several types of bacteriophage have been isolated from fermented foods, little is known about the overall viral assemblages (viromes) of these environments. In this study, metagenomic analyses were performed on the uncultivated viral communities from three fermented foods: fermented shrimp, kimchi, and sauerkraut. Using a high-throughput pyrosequencing technique, a total of 81,831, 70,591 and 69,464 viral sequences were obtained from fermented shrimp, kimchi and sauerkraut, respectively. Moreover, 37 to 50% of these sequences showed no significant hit against sequences in public databases. There were some discrepancies between the prediction of bacteriophage hosts via homology comparison and bacterial distribution, as determined from 16S rRNA gene sequencing. These discrepancies likely reflect the fact that the viral genomes of fermented foods are poorly represented in public databases. Double-stranded DNA viral communities were amplified from fermented foods by using a linker-amplified shotgun library. These communities were dominated by bacteriophages belonging to the viral order Caudovirales (i.e., Myoviridae, Podoviridae, and Siphoviridae). This study indicates that fermented foods contain less complex viral communities than many other environmental habitats, such as seawater, human feces, marine sediment, and soil.
Viruses are the most abundant and diverse genetic entities on Earth; however, broad surveys of viral diversity are hindered by the lack of a universal assay for viruses and the inability to sample a sufficient number of individual hosts. This study utilized vector-enabled metagenomics (VEM) to provide a snapshot of the diversity of DNA viruses present in three mosquito samples from San Diego, California. The majority of the sequences were novel, suggesting that the viral community in mosquitoes, as well as the animal and plant hosts they feed on, is highly diverse and largely uncharacterized. Each mosquito sample contained a distinct viral community. The mosquito viromes contained sequences related to a broad range of animal, plant, insect and bacterial viruses. Animal viruses identified included anelloviruses, circoviruses, herpesviruses, poxviruses, and papillomaviruses, which mosquitoes may have obtained from vertebrate hosts during blood feeding. Notably, sequences related to human papillomaviruses were identified in one of the mosquito samples. Sequences similar to plant viruses were identified in all mosquito viromes, which were potentially acquired through feeding on plant nectar. Numerous bacteriophages and insect viruses were also detected, including a novel densovirus likely infecting Culex erythrothorax. Through sampling insect vectors, VEM enables broad survey of viral diversity and has significantly increased our knowledge of the DNA viruses present in mosquitoes.
Swine are an important source of proteins worldwide but are subject to frequent viral outbreaks and numerous infections capable of infecting humans. Modern farming conditions may also increase viral transmission and potential zoonotic spread. We describe here the metagenomics-derived virome in the feces of 24 healthy and 12 diarrheic piglets on a high-density farm. An average of 4.2 different mammalian viruses were shed by healthy piglets, reflecting a high level of asymptomatic infections. Diarrheic pigs shed an average of 5.4 different mammalian viruses. Ninety-nine percent of the viral sequences were related to the RNA virus families Picornaviridae, Astroviridae, Coronaviridae, and Caliciviridae, while 1% were related to the small DNA virus families Circoviridae, and Parvoviridae. Porcine RNA viruses identified, in order of decreasing number of sequence reads, consisted of kobuviruses, astroviruses, enteroviruses, sapoviruses, sapeloviruses, coronaviruses, bocaviruses, and teschoviruses. The near-full genomes of multiple novel species of porcine astroviruses and bocaviruses were generated and phylogenetically analyzed. Multiple small circular DNA genomes encoding replicase proteins plus two highly divergent members of the Picornavirales order were also characterized. The possible origin of these viral genomes from pig-infecting protozoans and nematodes, based on closest sequence similarities, is discussed. In summary, an unbiased survey of viruses in the feces of intensely farmed animals revealed frequent coinfections with a highly diverse set of viruses providing favorable conditions for viral recombination. Viral surveys of animals can readily document the circulation of known and new viruses, facilitating the detection of emerging viruses and prospective evaluation of their pathogenic and zoonotic potentials.
Podoviruses that infect marine picocyanobacteria are abundant and could play a significant role in regulating host populations due to their specific phage-host relationship. Genome sequencing of cyanophages has revealed that many marine cyanophages encode certain photosynthetic genes such as psbA. It appears that psbA is only present in certain groups of cyanopodovirus isolates. In order to better understand the prevalence of psbA in cyanobacterial podoviruses, we searched the marine metagenomic databases (GOS, BATS, HOT and MarineVirome). Our study suggests that 89% of recruited cyanopodovirus scaffolds from the GOS database contained the psbA gene, supporting the ecological relevance of the photosynthesis gene for surface oceanic cyanophages. The diversification between Clades A and B is consistent with the recent finding of two major groups of cyanopodoviruses. The data also show that Clade B cyanopodoviruses dominate the surface ocean water, while Clade A cyanopodoviruses become more important in the coastal and estuarine environments.
The Microviridae comprises icosahedral lytic viruses with circular single-stranded DNA genomes. The family is divided into two distinct groups based on genome characteristics and virion structure. Viruses infecting enterobacteria belong to the genus Microvirus, whereas those infecting obligate parasitic bacteria, such as Chlamydia, Spiroplasma and Bdellovibrio, are classified into a subfamily, the Gokushovirinae. Recent metagenomic studies suggest that members of the Microviridae might also play an important role in marine environments. In this study we present the identification and characterization of Microviridae-related prophages integrated in the genomes of species of the Bacteroidetes, a phylum not previously known to be associated with microviruses. Searches against metagenomic databases revealed the presence of highly similar sequences in the human gut. This is the first report indicating that viruses of the Microviridae lysogenize their hosts. Absence of associated integrase-coding genes and apparent recombination with dif-like sequences suggests that Bacteroidetes-associated microviruses are likely to rely on the cellular chromosome dimer resolution machinery. Phylogenetic analysis of the putative major capsid proteins places the identified proviruses into a group separate from the previously characterized microviruses and gokushoviruses, suggesting that the genetic diversity and host range of bacteriophages in the family Microviridae is wider than currently appreciated.
Although the importance of viruses in natural ecosystems is widely acknowledged, the functional potential of viral communities is yet to be determined. Viral genomes are traditionally believed to carry only those genes that are directly pertinent to the viral life cycle, though this view was challenged by the discovery of metabolism genes in several phage genomes. Metagenomic approaches extended these analyses to a community scale, and several studies concluded that microbial and viral communities encompass similar functional potentials. However, these conclusions could originate from the presence of cellular DNA within viral metagenomes. We developed a computational method to estimate the proportion and origin of cellular sequences in a set of 67 published viromes. A quarter of the datasets were found to contain a substantial amount of sequences originating from cellular genomes. When considering only viromes with no cellular DNA detected, the functional potential of viral and microbial communities was found to be fundamentally different—a conclusion more consistent with the actual picture drawn from known viruses. Yet a significant number of cellular metabolism genes was still retrieved in these viromes, suggesting that the presence of auxiliary genes involved in various metabolic pathways within viral genomes is a general trend in the virosphere.
phages; viruses; metagenomics; functional potential
Viruses are ubiquitous and abundant throughout the biosphere. In marine systems, virus-mediated processes can have significant impacts on microbial diversity and on global biogeochemical cycling. However, viral genetic diversity remains poorly characterized. To address this shortcoming, a metagenomic library was constructed from Chesapeake Bay virioplankton. The resulting sequences constitute the largest collection of long-read double-stranded DNA (dsDNA) viral metagenome data reported to date. BLAST homology comparisons showed that Chesapeake Bay virioplankton contained a high proportion of unknown (homologous only to environmental sequences) and novel (no significant homolog) sequences. This analysis suggests that dsDNA viruses are likely one of the largest reservoirs of unknown genetic diversity in the biosphere. The taxonomic origin of BLAST homologs to viral library sequences agreed well with reported abundances of cooccurring bacterial subphyla within the estuary and indicated that cyanophages were abundant. However, the low proportion of Siphophage homologs contradicts a previous assertion that this family comprises most bacteriophage diversity. Identification and analyses of cyanobacterial homologs of the psbA gene illustrated the value of metagenomic studies of virioplankton. The phylogeny of inferred PsbA protein sequences suggested that Chesapeake Bay cyanophage strains are endemic in that environment. The ratio of psbA homologous sequences to total cyanophage sequences in the metagenome indicated that the psbA gene may be nearly universal in Chesapeake Bay cyanophage genomes. Furthermore, the low frequency of psbD homologs in the library supports the prediction that Chesapeake Bay cyanophage populations are dominated by Podoviridae.
At this time, about 3,000 different viruses are recognized, but metagenomic studies suggest that these viruses are a small fraction of the viruses that exist in nature. We have explored viral diversity by deep sequencing nucleic acids obtained from virion populations enriched from raw sewage. We identified 234 known viruses, including 17 that infect humans. Plant, insect, and algal viruses as well as bacteriophages were also present. These viruses represented 26 taxonomic families and included viruses with single-stranded DNA (ssDNA), double-stranded DNA (dsDNA), positive-sense ssRNA [ssRNA(+)], and dsRNA genomes. Novel viruses that could be placed in specific taxa represented 51 different families, making untreated wastewater the most diverse viral metagenome (genetic material recovered directly from environmental samples) examined thus far. However, the vast majority of sequence reads bore little or no sequence relation to known viruses and thus could not be placed into specific taxa. These results show that the vast majority of the viruses on Earth have not yet been characterized. Untreated wastewater provides a rich matrix for identifying novel viruses and for studying virus diversity.
Importance At this time, virology is focused on the study of a relatively small number of viral species. Specific viruses are studied either because they are easily propagated in the laboratory or because they are associated with disease. The lack of knowledge of the size and characteristics of the viral universe and the diversity of viral genomes is a roadblock to understanding important issues, such as the origin of emerging pathogens and the extent of gene exchange among viruses. Untreated wastewater is an ideal system for assessing viral diversity because virion populations from large numbers of individuals are deposited and because raw sewage itself provides a rich environment for the growth of diverse host species and thus their viruses. These studies suggest that the viral universe is far more vast and diverse than previously suspected.
The bovine rumen hosts a diverse and complex community of Eukarya, Bacteria, Archaea and viruses (including bacteriophage). The rumen viral population (the rumen virome) has received little attention compared to the rumen microbial population (the rumen microbiome). We used massively parallel sequencing of virus-like particles to investigate the diversity of the rumen virome in thirteen lactating Australian Holstein dairy cattle all housed in the same location, 12 of which were sampled on the same day.
Fourteen putative viral sequence fragments over 30 Kbp in length were assembled and annotated. Many of the putative genes in the assembled contigs showed no homology to previously annotated genes, highlighting the large amount of work still required to fully annotate the functions encoded in viral genomes. The abundance of the contig sequences varied widely between animals, even though the cattle were of the same age and stage of lactation and were fed the same diets. Additionally, the twelve co-habiting animals shared a number of their dominant viral contigs. We compared the functional characteristics of our bovine viromes with those of other viromes, as well as rumen microbiomes. At the functional level, we found strong similarities between all of the viral samples, which were highly distinct from the rumen microbiome samples.
Our findings suggest a large amount of between-animal variation in the bovine rumen virome and that co-habiting animals may have more similar viromes than animals that are not co-habiting. We report the deepest sequencing to date of the rumen virome. This work highlights the enormous amount of novelty and variation present in the rumen virome.
Virome; Rumen; Bacteriophage; Metagenomics
RNA viruses have been isolated that infect marine organisms ranging from bacteria to whales, but little is known about the composition and population structure of the in situ marine RNA virus community. In a recent study, the majority of three genomes of previously unknown positive-sense single-stranded (ss) RNA viruses were assembled from reverse-transcribed whole-genome shotgun libraries. The present contribution comparatively analyzes these genomes with respect to representative viruses from established viral taxa.
Two of the genomes (JP-A and JP-B) appear to be polycistronic viruses in the proposed order Picornavirales that fall into a well-supported clade of marine picorna-like viruses, the characterized members of which all infect marine protists. A temporal and geographic survey indicates that the JP genomes are persistent and widespread in British Columbia waters. The third genome, SOG, encodes a putative RNA-dependent RNA polymerase (RdRp) that is related to the RdRp of viruses in the family Tombusviridae, but the remaining SOG sequence has no significant similarity to any sequences in the NCBI database.
The complete genomes of these viruses permitted analyses that resulted in a more comprehensive comparison of these pathogens with established taxa. For example, in concordance with phylogenies based on the RdRp, our results support a close homology between JP-A and JP-B and RsRNAV. In contrast, although classification of the SOG genome based on the RdRp places SOG within the Tombusviridae, SOG lacks a capsid and movement protein conserved within this family, and SOG is thus likely more distantly related to the Tombusviridae than the RdRp phylogeny indicates.
Recent advances in genomics and metagenomics reveal remarkable diversity of viruses and other selfish genetic elements. In particular, giant viruses have been shown to possess their own mobilomes that include virophages, small viruses that parasitize giant viruses of the Mimiviridae family, and transpovirons, distinct linear plasmids. One of the virophages, known as Mavirus, a parasite of the giant Cafeteria roenbergensis virus, shares several genes with large eukaryotic self-replicating transposons of the Polinton (Maverick) family, and it has been proposed that the polintons evolved from a Mavirus-like ancestor.
We performed a comprehensive phylogenomic analysis of the available genomes of virophages and traced the evolutionary connections between the virophages and other selfish genetic elements. The comparison of the gene composition and genome organization of the virophages reveals 6 conserved, core genes that are organized in partially conserved arrays. Phylogenetic analysis of those core virophage genes, for which a sufficient diversity of homologs outside the virophages was detected, including the maturation protease and the packaging ATPase, supports the monophyly of the virophages. The results of this analysis appear incompatible with the origin of polintons from a Mavirus-like agent but rather suggest that Mavirus evolved through recombination between a polinton and an unknown virus. Altogether, virophages, polintons, a distinct Tetrahymena transposable element Tlr1, transpovirons, adenoviruses, and some bacteriophages form a network of evolutionary relationships that is held together by overlapping sets of shared genes and appears to represent a distinct module in the vast total network of viruses and mobile elements.
The results of the phylogenomic analysis of the virophages and related genetic elements are compatible with the concept of network-like evolution of the virus world and emphasize multiple evolutionary connections between bona fide viruses and other classes of capsid-less mobile elements.