Viral metagenomics, consisting of viral particle purification and shotgun sequencing, is a powerful technique for discovering viruses associated with diseases with no definitive etiology, viruses that share limited homology with known viruses, or viruses that are not culturable. Here we used viral metagenomics to examine viruses associated with sea turtle fibropapillomatosis (FP), a debilitating neoplastic disease affecting sea turtles worldwide. By means of purifying and shotgun sequencing the viral community directly from the fibropapilloma of a Florida green sea turtle, a novel single-stranded DNA virus, sea turtle tornovirus 1 (STTV1), was discovered. The single-stranded, circular genome of STTV1 was approximately 1,800 nucleotides in length. STTV1 has only weak amino acid level identities (25%) to chicken anemia virus in short regions of its genome; hence, STTV1 may represent the first member of a novel virus family. A total of 35 healthy turtles and 27 turtles with FP were tested for STTV1 using PCR, and only 2 turtles severely afflicted with FP were positive. The affected turtles were systemically infected with STTV1, since STTV1 was found in blood and all major organs. STTV1 exists as a quasispecies, with several genome variants identified in the fibropapilloma of each positive turtle, suggesting rapid evolution of this virus. The STTV1 variants were identical over the majority of their genomes but contained a hypervariable region with extensive divergence. This study demonstrates the potential of viral metagenomics for discovering novel viruses directly from animal tissue, which can enhance our understanding of viral evolution and diversity.
The non-enveloped bacilliform viruses are the second group of plant viruses known to possess a genome consisting of circular double-stranded DNA. We have characterized the viral transcript and determined the complete sequence of the genome of Commelina yellow mottle virus (CoYMV), a member of this group. Analysis of the viral transcript indicates that the virus encodes a single terminally-redundant genome-length plus 120 nucleotide transcript. A fraction of the transcripts is polyadenylated, although the majority of the transcript is not polyadenylated. Analysis of the genome sequence indicates that the genome is 7489 bp in size and that the transcribed strand contains three open reading frames capable of encoding proteins of 23, 15 and 216 kd. The function of the 23 and 15 kd proteins is unknown. Similarities between the 216 kd polypeptide and the cauliflower mosaic virus coat protein and protease/reverse transcriptase polyprotein suggest that the 216 kd polypeptide is a polyprotein that is proteolytically processed to yield the virion coat protein, a protease, and replicase (reverse transcriptase and ribonuclease H). Each strand of the CoYMV genome is interrupted by site-specific discontinuities. The locations of the 5'-ends of these discontinuities, and the presence and location of a region on the CoYMV transcript capable of annealing with the 3'-end of cytosolic initiator methionine tRNA are consistent with replication by reverse transcription. We have demonstrated that a construct containing 1.3 CoYMV genomes is infective when introduced into Commelina diffusa, the host for CoYMV, using Agrobacterium-mediated infection.
Sulfolobus turreted icosahedral virus (STIV) was the first icosahedral virus characterized from an archaeal host. It infects Sulfolobus species that thrive in the acidic hot springs (pH 2.9 to 3.9 and 72 to 92°C) of Yellowstone National Park. The overall capsid architecture and the structure of its major capsid protein are very similar to those of the bacteriophage PRD1 and eukaryotic viruses Paramecium bursaria Chlorella virus 1 and adenovirus, suggesting a viral lineage that predates the three domains of life. The 17,663-base-pair, circular, double-stranded DNA genome contains 36 potential open reading frames, whose sequences generally show little similarity to other genes in the sequence databases. However, functional and evolutionary information may be suggested by a protein's three-dimensional structure. To this end, we have undertaken structural studies of the STIV proteome. Here we report our work on A197, the product of an STIV open reading frame. The structure of A197 reveals a GT-A fold that is common to many members of the glycosyltransferase superfamily. A197 possesses a canonical DXD motif and a putative catalytic base that are hallmarks of this family of enzymes, strongly suggesting a glycosyltransferase activity for A197. Potential roles for the putative glycosyltransferase activity of A197 and their evolutionary implications are discussed.
Viral particles in stool samples from wild-living chimpanzees were analysed using random PCR amplification and sequencing. Sequences encoding proteins distantly related to the replicase protein of single-stranded circular DNA viruses were identified. Inverse PCR was used to amplify and sequence multiple small circular DNA viral genomes. The viral genomes were related in size and genome organization to vertebrate circoviruses and plant geminiviruses but with a different location for the stem–loop structure involved in rolling circle DNA replication. The replicase genes of these viruses were most closely related to those of the much smaller (∼1 kb) plant nanovirus circular DNA chromosomes. Because the viruses have characteristics of both animal and plant viruses, we named them chimpanzee stool-associated circular viruses (ChiSCV). Further metagenomic studies of animal samples will greatly increase our knowledge of viral diversity and evolution.
At this time, about 3,000 different viruses are recognized, but metagenomic studies suggest that these viruses are a small fraction of the viruses that exist in nature. We have explored viral diversity by deep sequencing nucleic acids obtained from virion populations enriched from raw sewage. We identified 234 known viruses, including 17 that infect humans. Plant, insect, and algal viruses as well as bacteriophages were also present. These viruses represented 26 taxonomic families and included viruses with single-stranded DNA (ssDNA), double-stranded DNA (dsDNA), positive-sense ssRNA [ssRNA(+)], and dsRNA genomes. Novel viruses that could be placed in specific taxa represented 51 different families, making untreated wastewater the most diverse viral metagenome (genetic material recovered directly from environmental samples) examined thus far. However, the vast majority of sequence reads bore little or no sequence relation to known viruses and thus could not be placed into specific taxa. These results show that the vast majority of the viruses on Earth have not yet been characterized. Untreated wastewater provides a rich matrix for identifying novel viruses and for studying virus diversity.
Importance
At this time, virology is focused on the study of a relatively small number of viral species. Specific viruses are studied either because they are easily propagated in the laboratory or because they are associated with disease. The lack of knowledge of the size and characteristics of the viral universe and the diversity of viral genomes is a roadblock to understanding important issues, such as the origin of emerging pathogens and the extent of gene exchange among viruses. Untreated wastewater is an ideal system for assessing viral diversity because virion populations from large numbers of individuals are deposited and because raw sewage itself provides a rich environment for the growth of diverse host species and thus their viruses. These studies suggest that the viral universe is far more vast and diverse than previously suspected.
The human gut is known to be a reservoir of a wide variety of microbes, including viruses. Many RNA viruses are known to be associated with gastroenteritis; however, the enteric RNA viral community present in healthy humans has not been described. Here, we present a comparative metagenomic analysis of the RNA viruses found in three fecal samples from two healthy human individuals. For this study, uncultured viruses were concentrated by tangential flow filtration, and viral RNA was extracted and cloned into shotgun viral cDNA libraries for sequencing analysis. The vast majority of the 36,769 viral sequences obtained were similar to plant pathogenic RNA viruses. The most abundant fecal virus in this study was pepper mild mottle virus (PMMV), which was found in high concentrations, up to 10^9 virions per gram of dry weight fecal matter. PMMV was also detected in 12 (66.7%) of 18 fecal samples collected from healthy individuals on two continents, indicating that this plant virus is prevalent in the human population. A number of pepper-based foods tested positive for PMMV, suggesting dietary origins for this virus. Intriguingly, the fecal PMMV was infectious to host plants, suggesting that humans might act as a vehicle for the dissemination of certain plant viruses.
A comparative metagenomic analysis of RNA viruses in the human gut identifies the vast majority as plant pathogens.
The Microviridae comprises icosahedral lytic viruses with circular single-stranded DNA genomes. The family is divided into two distinct groups based on genome characteristics and virion structure. Viruses infecting enterobacteria belong to the genus Microvirus, whereas those infecting obligate parasitic bacteria, such as Chlamydia, Spiroplasma and Bdellovibrio, are classified into a subfamily, the Gokushovirinae. Recent metagenomic studies suggest that members of the Microviridae might also play an important role in marine environments. In this study we present the identification and characterization of Microviridae-related prophages integrated in the genomes of species of the Bacteroidetes, a phylum not previously known to be associated with microviruses. Searches against metagenomic databases revealed the presence of highly similar sequences in the human gut. This is the first report indicating that viruses of the Microviridae lysogenize their hosts. Absence of associated integrase-coding genes and apparent recombination with dif-like sequences suggests that Bacteroidetes-associated microviruses are likely to rely on the cellular chromosome dimer resolution machinery. Phylogenetic analysis of the putative major capsid proteins places the identified proviruses into a group separate from the previously characterized microviruses and gokushoviruses, suggesting that the genetic diversity and host range of bacteriophages in the family Microviridae is wider than currently appreciated.
Viruses are known to be the most numerous biological entities in soil; however, little is known about their diversity in this environment. In order to explore the genetic diversity of soil viruses, we isolated viruses by centrifugation and sequential filtration before performing a metagenomic investigation. We adopted multiple-displacement amplification (MDA), an isothermal whole-genome amplification method with φ29 polymerase and random hexamers, to amplify viral DNA and construct clone libraries for metagenome sequencing. By the MDA method, the diversity of both single-stranded DNA (ssDNA) viruses and double-stranded DNA viruses could be investigated at the same time. In contrast, by eliminating the denaturing step in the MDA reaction, only ssDNA viral diversity could be explored selectively. Irrespective of the denaturing step, more than 60% of the soil metagenome sequences did not show significant hits (E-value criterion, 0.001) with previously reported viral sequences. Those hits that were considered to be significant were also distantly related to known ssDNA viruses (average amino acid similarity, approximately 34%). Phylogenetic analysis showed that replication-related proteins (which were the most frequently detected proteins) related to those of ssDNA viruses obtained from the metagenomic sequences were diverse and novel. Putative circular genome components of ssDNA viruses that are unrelated to known viruses were assembled from the metagenomic sequences. In conclusion, ssDNA viral diversity in soil is more complex than previously thought. Soil is therefore a rich pool of previously unknown ssDNA viruses.
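The significance screen mentioned in this abstract can be sketched in Python. This is only an illustration, assuming the similarity-search results are available in BLAST tabular form (12 tab-separated columns with the E-value in the 11th) and that a hit counts as significant when its E-value is at or below 0.001. The function name and the toy rows are hypothetical and are not taken from the study.

```python
def fraction_without_hits(read_ids, blast_rows, max_evalue=1e-3):
    """Return the fraction of reads that have no hit at or below max_evalue."""
    significant = set()
    for row in blast_rows:
        fields = row.rstrip("\n").split("\t")
        query_id = fields[0]          # column 1: query (read) identifier
        evalue = float(fields[10])    # column 11: E-value of the hit
        if evalue <= max_evalue:
            significant.add(query_id)
    reads = set(read_ids)
    return len(reads - significant) / len(reads)


# Hypothetical toy data: five reads, two of which produced any hit at all.
reads = ["r1", "r2", "r3", "r4", "r5"]
rows = [
    "r1\tvirusA\t34.0\t120\t70\t3\t1\t360\t10\t129\t1e-8\t55.1",
    "r2\tvirusB\t30.0\t90\t60\t2\t1\t270\t5\t94\t0.5\t28.0",
]
no_hit_fraction = fraction_without_hits(reads, rows)
```

In this toy case only r1 passes the cutoff, so four of the five reads are counted as having no significant hit.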
Whiteflies from the Bemisia tabaci species complex have the ability to transmit a large number of plant viruses and are some of the most detrimental pests in agriculture. Although whiteflies are known to transmit both DNA and RNA viruses, most of the diversity has been recorded for the former, specifically for the Begomovirus genus. This study investigated the total diversity of DNA and RNA viruses found in whiteflies collected from a single site in Florida to evaluate if there are additional, previously undetected viral types within the B. tabaci vector. Metagenomic analysis of viral DNA extracted from the whiteflies only resulted in the detection of begomoviruses. In contrast, whiteflies contained sequences similar to RNA viruses from divergent groups, with a diversity that extends beyond currently described viruses. The metagenomic analysis of whiteflies also led to the first report of a whitefly-transmitted RNA virus similar to Cowpea mild mottle virus (CpMMV Florida) (genus Carlavirus) in North America. Further investigation resulted in the detection of CpMMV Florida in native and cultivated plants growing near the original field site of whitefly collection and determination of its experimental host range. Analysis of complete CpMMV Florida genomes recovered from whiteflies and plants suggests that the current classification criteria for carlaviruses need to be reevaluated. Overall, metagenomic analysis supports that DNA plant viruses carried by B. tabaci are dominated by begomoviruses, whereas significantly less is known about RNA viruses present in this damaging insect vector.
Viruses are the most common biological entities in the marine environment. There has not been a global survey of these viruses, and consequently, it is not known what types of viruses are in Earth's oceans or how they are distributed. Metagenomic analyses of 184 viral assemblages collected over a decade and representing 68 sites in four major oceanic regions showed that most of the viral sequences were not similar to those in the current databases. There was a distinct “marine-ness” quality to the viral assemblages. Global diversity was very high, presumably several hundred thousand species, and regional richness varied on a North-South latitudinal gradient. The marine regions had different assemblages of viruses. Cyanophages and a newly discovered clade of single-stranded DNA phages dominated the Sargasso Sea sample, whereas prophage-like sequences were most common in the Arctic. However, most viral species were found to be widespread. With a majority of shared species between oceanic regions, most of the differences between viral assemblages seemed to be explained by variation in the occurrence of the most common viral species and not by exclusion of different viral genomes. These results support the idea that viruses are widely dispersed and that local environmental conditions enrich for certain viral types through selective pressure.
An extensive metagenomic survey of viral diversity in the marine environment is presented. Many phages are widely distributed, although location-specific selection results in enrichment of some viruses.
Viruses are important drivers of ecosystem functions, yet little is known about the vast majority of viruses. Viral shotgun metagenomics enables the investigation of broad ecological questions in phage communities. One ecological characteristic is species richness, which is the number of different species in a community. Viruses do not have a phylogenetic marker analogous to the bacterial 16S rRNA gene with which to estimate richness, and so contig spectra are employed to measure the number of virus taxa in a given community. A contig spectrum is generated from a viral shotgun metagenome by assembling the random sequence reads into groups of sequences that overlap (contigs) and counting the number of sequences that group within each contig. Current tools available to analyze contig spectra to estimate phage richness are limited by relying on rank-abundance data.
We present statistical estimates of virus richness from contig spectra. The program CatchAll (http://www.northeastern.edu/catchall/) was used to analyze contig spectra in terms of frequency count data rather than rank-abundance, thus enabling formal statistical analyses. Also, the influence of potentially spurious low-frequency counts on richness estimates was minimized by two methods, empirical and statistical. The results show greater estimates of viral richness than previous calculations in nearly all environments analyzed, including swine feces and reclaimed fresh water.
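To show what treating a contig spectrum as frequency count data can look like, the sketch below applies the bias-corrected Chao1 estimator. Chao1 is a classical nonparametric richness estimator used here only as a simple stand-in. It is not presented as CatchAll's implementation or as the model selected in this study. The sketch assumes that each contig stands for one taxon and that the number of reads in the contig is the number of times that taxon was observed. The counts are hypothetical.

```python
def chao1_bias_corrected(freq_counts):
    """Estimate total richness from frequency counts.

    freq_counts maps k to the number of taxa observed exactly k times.
    Bias-corrected Chao1: S_obs + f1 * (f1 - 1) / (2 * (f2 + 1)).
    """
    observed = sum(freq_counts.values())
    f1 = freq_counts.get(1, 0)  # taxa seen once (singletons)
    f2 = freq_counts.get(2, 0)  # taxa seen twice (doubletons)
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical frequency counts: 10 singletons, 4 doubletons, 1 tripleton.
counts = {1: 10, 2: 4, 3: 1}
estimate = chao1_bias_corrected(counts)
```

With these counts, 15 taxa are observed and the estimator adds 9 unseen taxa, for an estimate of 24. The heavy dependence on the singleton count illustrates why spurious low-frequency counts matter for richness estimates.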
CatchAll yielded consistent estimates of richness across viral metagenomes from the same or similar environments. Additionally, analysis of pooled viral metagenomes from different environments via mixed contig spectra resulted in greater richness estimates than those of the component metagenomes. Using CatchAll to analyze contig spectra will improve estimations of richness from viral shotgun metagenomes, particularly from large datasets, by providing statistical measures of richness.
Phage; Metagenomics; Virome; Ecology; Richness; CatchAll; Singleton
Much remains to be learned about single-stranded (ss) DNA viruses in natural systems, and the evolutionary relationships among them. One of the eight recognized families of ssDNA viruses is the Microviridae, a group of viruses infecting bacteria. In this study we used metagenomic analysis, genome assembly, and amplicon sequencing of purified ssDNA to show that bacteriophages belonging to the subfamily Gokushovirinae within the Microviridae are genetically diverse and widespread members of marine microbial communities. Metagenomic analysis of coastal samples from the Gulf of Mexico (GOM) and British Columbia, Canada, revealed numerous sequences belonging to gokushoviruses and allowed the assembly of five putative genomes with an organization similar to chlamydiamicroviruses. Fragment recruitment to these genomes from different metagenomic data sets is consistent with gokushovirus genotypes being restricted to specific oceanic regions. Conservation among the assembled genomes allowed the design of degenerate primers that target an 800 bp fragment from the gene encoding the major capsid protein. Sequences could be amplified from coastal temperate and subtropical waters, but not from samples collected from the Arctic Ocean, or freshwater lakes. Phylogenetic analysis revealed that most sequences were distantly related to those from cultured representatives. Moreover, the sequences fell into at least seven distinct evolutionary groups, most of which were represented by one of the assembled metagenomes. Our results greatly expand the known sequence space for gokushoviruses, and reveal biogeographic separation and new evolutionary lineages of gokushoviruses in the oceans.
biogeography; ssDNA viruses; Microviridae; Gokushovirinae; virus diversity; ocean viruses
Metagenomics can be used to determine the diversity of complex, often unculturable, viral communities with various nucleic acid compositions. Here, we report the use of hydroxyapatite chromatography to efficiently fractionate double-stranded DNA (dsDNA), single-stranded DNA (ssDNA), dsRNA, and ssRNA genomes from known bacteriophages. Linker-amplified shotgun libraries were constructed to generate sequencing reads from each hydroxyapatite fraction. Greater than 90% of the reads displayed significant similarity to the expected genomes at the nucleotide level. These methods were applied to marine viruses collected from the Chesapeake Bay and the Dry Tortugas National Park. Isolated nucleic acids were fractionated using hydroxyapatite chromatography followed by linker-amplified shotgun library construction and sequencing. Taxonomic analysis demonstrated that the majority of environmental sequences, regardless of their source nucleic acid, were most similar to dsDNA viruses, reflecting the bias of viral metagenomic sequence databases.
Geminiviruses with small circular single-stranded DNA genomes replicate in plant cell nuclei by using various double-stranded DNA (dsDNA) intermediates: distinct open circular and covalently closed circular as well as heterogeneous linear DNA. Their DNA may be methylated partially at cytosine residues, as detected previously by bisulfite sequencing and subsequent PCR. In order to determine the methylation patterns of the circular molecules, the DNAs of tomato yellow leaf curl Sardinia virus (TYLCSV) and Abutilon mosaic virus were investigated utilizing bisulfite treatment followed by rolling circle amplification. Shotgun sequencing of the products yielded a randomly distributed 50% rate of C maintenance after the bisulfite reaction for both viruses. However, controls with unmethylated single-stranded bacteriophage DNA resulted in the same level of C maintenance. Only one short DNA stretch within the C2/C3 promoter of TYLCSV showed hyperprotection of C, with the protection rate exceeding the threshold of the mean value plus 1 standard deviation. Similarly, the use of methylation-sensitive restriction enzymes suggested that geminiviruses escape silencing by methylation very efficiently, by either a rolling circle or recombination-dependent replication mode. In contrast, attempts to detect methylated bases positively by using methylcytosine-specific antibodies detected methylated DNA only in heterogeneous linear dsDNA, and methylation-dependent restriction enzymes revealed that the viral heterogeneous linear dsDNA was methylated preferentially.
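The hyperprotection threshold mentioned in this abstract (mean value plus 1 standard deviation) can be sketched as follows. This is an illustration only: the per-region C maintenance rates are invented, the function name is hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, pstdev


def hyperprotected_regions(rates):
    """Return the threshold (mean + 1 SD) and the indices exceeding it."""
    threshold = mean(rates) + pstdev(rates)  # population SD is an assumption
    flagged = [i for i, rate in enumerate(rates) if rate > threshold]
    return threshold, flagged


# Hypothetical C maintenance rates for eight genome regions.
rates = [0.5, 0.5, 0.5, 0.5, 0.9, 0.5, 0.5, 0.5]
threshold, flagged = hyperprotected_regions(rates)
```

In this toy data set only the region at index 4 lies above the threshold of roughly 0.68.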
There are no known RNA viruses that infect Archaea. Filling this gap in our knowledge of viruses will enhance our understanding of the relationships between RNA viruses from the three domains of cellular life and, in particular, could shed light on the origin of the enormous diversity of RNA viruses infecting eukaryotes. We describe here the identification of novel RNA viral genome segments from high-temperature acidic hot springs in Yellowstone National Park in the United States. These hot springs harbor low-complexity cellular communities dominated by several species of hyperthermophilic Archaea. A viral metagenomics approach was taken to assemble segments of these RNA virus genomes from viral populations isolated directly from hot spring samples. Analysis of these RNA metagenomes demonstrated unique gene content that is not generally related to known RNA viruses of Bacteria and Eukarya. However, genes for RNA-dependent RNA polymerase (RdRp), a hallmark of positive-strand RNA viruses, were identified in two contigs. One of these contigs is approximately 5,600 nucleotides in length and encodes a polyprotein that also contains a region homologous to the capsid protein of nodaviruses, tetraviruses, and birnaviruses. Phylogenetic analyses of the RdRps encoded in these contigs indicate that the putative archaeal viruses form a unique group that is distinct from the RdRps of RNA viruses of Eukarya and Bacteria. Collectively, our findings suggest the existence of novel positive-strand RNA viruses that probably replicate in hyperthermophilic archaeal hosts and are highly divergent from RNA viruses that infect eukaryotes and even more distant from known bacterial RNA viruses. These positive-strand RNA viruses might be direct ancestors of RNA viruses of eukaryotes.
In a recent BMC Evolutionary Biology article, Huiquan Liu and colleagues report two new genomes of double-stranded RNA (dsRNA) viruses from fungi and use these as a springboard to perform an extensive phylogenomic analysis of dsRNA viruses. The results support the old scenario of polyphyletic origin of dsRNA viruses from different groups of positive-strand RNA viruses and additionally reveal extensive horizontal gene transfer between diverse viruses consistent with the network-like rather than tree-like mode of viral evolution. Together with the unexpected discoveries of the first putative archaeal RNA virus and an RNA-DNA virus hybrid, this work shows that RNA viral genomics has major surprises to deliver.
See research article: http://www.biomedcentral.com/1471-2148/12/91
Viruses, most of which are phage, are extremely abundant in marine sediments, yet almost nothing is known about their identity or diversity. We present the metagenomic analysis of an uncultured near-shore marine-sediment viral community. Three-quarters of the sequences in the sample were not related to anything previously reported. Among the sequences that could be identified, the majority belonged to double-stranded DNA phage. Temperate phage were more common than lytic phage, suggesting that lysogeny may be an important lifestyle for sediment viruses. Comparisons between the sediment sample and previously sequenced seawater viral communities showed that certain phage phylogenetic groups were abundant in all marine viral communities, while other phage groups were under-represented or absent. This 'marineness' suggests that marine phage are derived from a common set of ancestors. Several independent mathematical models, based on the distribution of overlapping shotgun sequence fragments from the library, were used to show that the diversity of the viral community was extremely high, with at least 10^4 viral genotypes per kilogram of sediment and a Shannon index greater than 9 nats. Based on these observations we propose that marine-sediment viral communities are one of the largest unexplored reservoirs of sequence space on the planet.
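For readers unfamiliar with the unit, the Shannon index in nats is computed with the natural logarithm. The short Python sketch below shows the calculation on hypothetical abundance vectors; it is not the modelling approach used in the study, which inferred diversity from overlapping sequence fragments.

```python
import math


def shannon_index_nats(abundances):
    """Shannon index H = -sum(p_i * ln p_i), in nats."""
    total = sum(abundances)
    return -sum((n / total) * math.log(n / total)
                for n in abundances if n > 0)


# Hypothetical, perfectly even community of ten thousand genotypes.
even_community = shannon_index_nats([1] * 10_000)

# Hypothetical community dominated by a single genotype.
skewed_community = shannon_index_nats([9_000] + [1] * 1_000)
```

A perfectly even community of ten thousand genotypes gives H = ln(10,000), about 9.21 nats, so a Shannon index above 9 nats corresponds to a community that is both rich and fairly even. The skewed example yields a much lower value despite containing over a thousand genotypes.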
Virus-infected plants accumulate abundant, 21–24 nucleotide viral siRNAs which are generated by the evolutionary conserved RNA interference (RNAi) machinery that regulates gene expression and defends against invasive nucleic acids. Here we show that, similar to RNA viruses, the entire genome sequences of DNA viruses are densely covered with siRNAs in both sense and antisense orientations. This implies pervasive transcription of both coding and non-coding viral DNA in the nucleus, which generates double-stranded RNA precursors of viral siRNAs. Consistent with our finding and hypothesis, we demonstrate that the complete genomes of DNA viruses from Caulimoviridae and Geminiviridae families can be reconstructed by deep sequencing and de novo assembly of viral siRNAs using bioinformatics tools. Furthermore, we prove that this ‘siRNA omics’ approach can be used for reliable identification of the consensus master genome and its microvariants in viral quasispecies. Finally, we utilized this approach to reconstruct an emerging DNA virus and two viroids associated with economically-important red blotch disease of grapevine, and to rapidly generate a biologically-active clone representing the wild type master genome of Oilseed rape mosaic virus. Our findings show that deep siRNA sequencing allows for de novo reconstruction of any DNA or RNA virus genome and its microvariants, making it suitable for universal characterization of evolving viral quasispecies as well as for studying the mechanisms of siRNA biogenesis and RNAi-based antiviral defense.
Little is known about the viruses infecting most species. Even in groups as well-studied as Drosophila, only a handful of viruses have been well-characterized. A viral metagenomic approach was used to explore viral diversity in 83 wild-caught Drosophila innubila, a mushroom feeding member of the quinaria group. A single fly that was injected with, and died from, Drosophila C Virus (DCV) was added to the sample as a control. Two-thirds of reads in the infected sample had DCV as the best BLAST hit, suggesting that the protocol developed is highly sensitive. In addition to the DCV hits, several sequences had Oryctes rhinoceros Nudivirus, a double-stranded DNA virus, as a best BLAST hit. The virus associated with these sequences was termed Drosophila innubila Nudivirus (DiNV). PCR screens of natural populations showed that DiNV was both common and widespread taxonomically and geographically. Electron microscopy confirms the presence of virions in fly fecal material similar in structure to other described Nudiviruses. In 2 species, D. innubila and D. falleni, the virus is associated with a severe (∼80–90%) loss of fecundity and significantly decreased lifespan.
Although single stranded (ss) DNA viruses that infect humans and their domesticated animals do not generally cause major diseases, the arthropod borne ssDNA viruses of plants do, and as a result seriously constrain food production in most temperate regions of the world. Besides the well known plant and animal-infecting ssDNA viruses, it has recently become apparent through metagenomic surveys of ssDNA molecules that there also exist large numbers of other diverse ssDNA viruses within almost all terrestrial and aquatic environments. The host ranges of these viruses probably span the tree of life and they are likely to be important components of global ecosystems. Various lines of evidence suggest that a pivotal evolutionary process during the generation of this global ssDNA virus diversity has probably been genetic recombination. High rates of homologous recombination, non-homologous recombination and genome component reassortment are known to occur within and between various different ssDNA virus species and we look here at the various roles that these different types of recombination may play, both in the day-to-day biology, and in the longer term evolution, of these viruses. We specifically focus on the ecological, biochemical and selective factors underlying patterns of genetic exchange detectable amongst the ssDNA viruses and discuss how these should all be considered when assessing the adaptive value of recombination during ssDNA virus evolution.
parvovirus; geminivirus; anellovirus; circovirus; nanovirus
Metagenomic analysis of viruses suggests novel patterns of evolution, changes the existing ideas of the composition of the virus world and reveals novel groups of viruses and virus-like agents. The gene composition of the marine DNA virome is dramatically different from that of known bacteriophages. The virome is dominated by rare genes, many of which might be contained within virus-like entities such as gene transfer agents. Analysis of marine metagenomes thought to consist mostly of bacterial genes revealed a variety of sequences homologous to conserved genes of eukaryotic nucleocytoplasmic large DNA viruses, resulting in the discovery of diverse members of previously undersampled groups and suggesting the existence of new classes of virus-like agents. Unexpectedly, metagenomics of marine RNA viruses showed that representatives of only one superfamily of eukaryotic viruses, the picorna-like viruses, dominate the RNA virome.
Chlorella viruses (or chloroviruses) are very large, plaque-forming viruses. The viruses are multilayered structures containing a large double-stranded DNA genome, a lipid bilayered membrane, and an outer icosahedral capsid shell. The viruses replicate in certain isolates of the coccal green alga, Chlorella. Sequence analysis of the 330-kbp genome of Paramecium bursaria Chlorella virus 1 (PBCV-1), the prototype of the virus family Phycodnaviridae, reveals <365 protein-encoding genes and 11 tRNA genes. Products of about 40% of these genes resemble proteins of known function, including many that are unexpected for a virus. Among these is a virus-encoded protein, called Kcv, which forms a functional K+ channel. This chapter focuses on the initial steps in virus infection and provides a plausible role for the function of the viral K+ channel in lowering the turgor pressure of the host. This step appears to be a prerequisite for delivery of the viral genome into the host.
Aims
The aim of this study was to develop and demonstrate an approach for describing the diversity of human pathogenic viruses in an environmentally isolated viral metagenome.
Methods and Results
In silico bioinformatic experiments were used to select an optimum annotation strategy for discovering human viruses in virome datasets, and this strategy was applied to annotate a class B biosolids virome. Results from the in silico study indicated that an error rate below 1% in virus identification could be achieved when nucleotide-based search programs (BLASTn or tBLASTx), viral-genome-only databases, and sequence reads greater than 200 nt were considered. Within the 51,925 annotated sequences, 94 DNA and 19 RNA sequences were identified as human viruses. Virus diversity included environmentally transmitted agents such as parechovirus, coronavirus, adenovirus, and aichi virus, as well as viruses associated with chronic human infections such as human herpes and hepatitis C viruses.
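The read-length criterion described above can be illustrated with a minimal sketch. It assumes the reads are available as FASTA-formatted text and only shows the length filter applied before any search; the BLASTn or tBLASTx step and the viral genome database are not shown. The function names and the FASTA input are hypothetical and are not taken from the study.

```python
from typing import Dict, Iterator, Tuple

# Threshold from the text: only reads greater than 200 nt are considered.
MIN_READ_LENGTH = 200


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (read_id, sequence) pairs from FASTA-formatted text."""
    read_id = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if read_id is not None:
                yield read_id, "".join(chunks)
            header = line[1:].split()
            read_id = header[0] if header else ""
            chunks = []
        elif read_id is not None:
            chunks.append(line)
    if read_id is not None:
        yield read_id, "".join(chunks)


def filter_reads(text: str, min_length: int = MIN_READ_LENGTH) -> Dict[str, str]:
    """Keep only reads strictly longer than min_length nucleotides."""
    return {rid: seq for rid, seq in parse_fasta(text) if len(seq) > min_length}


# Hypothetical input: read_a is 240 nt (kept), read_b is 80 nt (dropped).
example = ">read_a\n" + "ACGT" * 60 + "\n>read_b\n" + "ACGT" * 20 + "\n"
kept = filter_reads(example)
print(sorted(kept))
```

Reads that pass this filter would then be the ones submitted to the nucleotide-based search against a viral genome only database.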
Conclusions
This study provided a bioinformatic approach for identifying pathogens in a virome dataset, and demonstrated the human virus diversity in a relevant environmental sample.
Significance and Impact of Study
As the costs of next generation sequencing decrease, the pathogen diversity described by virus metagenomes will provide an unbiased guide for subsequent cell-culture and quantitative pathogen analyses, and will ensure that highly enriched and relevant pathogens are not neglected in exposure and risk assessments.
virus; bioinformatics; biosolids; next generation DNA sequencing; viral metagenome; pathogen; virome
Virophages, e.g., Sputnik, Mavirus, and Organic Lake virophage (OLV), are unusual parasites of giant double-stranded DNA (dsDNA) viruses, yet little is known about their diversity. Here, we describe the global distribution, abundance, and genetic diversity of virophages based on analyzing and mapping comprehensive metagenomic databases. The results reveal a distinct abundance and worldwide distribution of virophages, involving almost all geographical zones and a variety of unique environments. These environments ranged from deep ocean to inland, iced to hydrothermal lakes, and human gut- to animal-associated habitats. Four complete virophage genomic sequences (Yellowstone Lake virophages [YSLVs]) were obtained, as was one nearly complete sequence (Ace Lake Mavirus [ALM]). The genomes obtained were 27,849 bp long with 26 predicted open reading frames (ORFs) (YSLV1), 23,184 bp with 21 ORFs (YSLV2), 27,050 bp with 23 ORFs (YSLV3), 28,306 bp with 34 ORFs (YSLV4), and 17,767 bp with 22 ORFs (ALM). The homologous counterparts of five genes, including putative FtsK-HerA family DNA packaging ATPase and genes encoding DNA helicase/primase, cysteine protease, major capsid protein (MCP), and minor capsid protein (mCP), were present in all virophages studied thus far. They also shared a conserved gene cluster comprising the two core genes of MCP and mCP. Comparative genomic and phylogenetic analyses showed that YSLVs, having a closer relationship to each other than to the other virophages, were more closely related to OLV than to Sputnik but distantly related to Mavirus and ALM. These findings indicate that virophages appear to be widespread and genetically diverse, with at least 3 major lineages.
Studies on viral capsid architectures and coat protein folds have revealed the evolutionary lineages of viruses branching to all three domains of life. A widespread group of icosahedral tailless viruses, the PRD1-adenovirus lineage, was the first to be established. A double β-barrel fold for a single major capsid protein is characteristic of these viruses. Similar viruses carrying genes coding for two major capsid proteins with a more complex structure, such as Thermus phage P23-77 and haloarchaeal virus SH1, have been isolated. Here, we studied the host range, life cycle, biochemical composition, and genomic sequence of a new isolate, Haloarcula hispanica icosahedral virus 2 (HHIV-2), which resembles SH1 despite being isolated from a different location. Comparative analysis of these viruses revealed that their overall architectures are very similar except that the genes for the receptor recognition vertex complexes are unrelated even though these viruses infect the same hosts.