Metagenomic analysis provides a rich source of biological information for otherwise intractable viral communities. However, study of viral metagenomes has been hampered by its nearly complete reliance on BLAST algorithms for identification of DNA sequences. We sought to develop algorithms for examination of viral metagenomes to identify the origin of sequences independent of BLAST algorithms. We chose viral metagenomes obtained from two hot springs, Bear Paw and Octopus, in Yellowstone National Park, as they represent simple microbial populations where comparatively large contigs were obtained. Thermal spring metagenomes have high proportions of sequences without significant Genbank homology, which has hampered identification of viruses and their linkage with hosts. To analyze each metagenome, we developed a method to classify DNA fragments using genome signature-based phylogenetic classification (GSPC), where metagenomic fragments are compared to a database of oligonucleotide signatures for all previously sequenced Bacteria, Archaea, and viruses.
From both Bear Paw and Octopus hot springs, each assembled contig had more similarity to other metagenome contigs than to any sequenced microbial genome based on GSPC analysis, suggesting a genome signature common to each of these extreme environments. While viral metagenomes from Bear Paw and Octopus share some similarity, the genome signatures from each locale are largely unique. GSPC using a microbial database predicts that most of the Octopus metagenome has archaeal signatures, while bacterial signatures predominate in Bear Paw, a finding consistent with those of Genbank BLAST. When using a viral database, the majority of the Octopus metagenome is predicted to belong to the archaeal virus families Globuloviridae and Fuselloviridae, while none of the Bear Paw metagenome is predicted to belong to archaeal viruses. As expected, when microbial and viral databases are combined, each of the Octopus and Bear Paw metagenomic contigs is predicted to belong to viruses rather than to any Bacteria or Archaea, consistent with the apparent viral origin of both metagenomes.
That BLAST searches identify no significant homologs for most metagenome contigs, while GSPC suggests their origin as archaeal viruses or bacteriophages, indicates GSPC provides a complementary approach in viral metagenomic analysis.
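The idea of comparing a fragment's oligonucleotide signature against reference signatures can be illustrated with a minimal sketch. The abstract does not specify the word length, normalization, or distance measure used by GSPC, so the tetranucleotide frequencies, the Euclidean distance, and the function names below are assumptions chosen purely for illustration, not the authors' method.

```python
from collections import Counter
from itertools import product
import math


def kmer_signature(seq, k=4):
    """Return normalized k-mer frequencies over the ACGT alphabet.

    Illustrative stand-in for a 'genome signature'; k-mers containing
    other characters (e.g. N) are ignored.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[km] for km in kmers)
    if total == 0:
        raise ValueError("sequence contains no valid k-mers")
    return [counts[km] / total for km in kmers]


def euclidean(a, b):
    """Euclidean distance between two equal-length frequency vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(fragment, references, k=4):
    """Return the name of the reference whose signature is nearest.

    `references` maps a (hypothetical) genome name to its sequence.
    """
    sig = kmer_signature(fragment, k)
    return min(
        references,
        key=lambda name: euclidean(sig, kmer_signature(references[name], k)),
    )


# Toy usage with made-up sequences
refs = {"at_rich": "AATT" * 50, "gc_rich": "GGCC" * 50}
best = classify("AATT" * 10, refs)
```

A real classifier would precompute reference signatures once and would need a confidence threshold, since the nearest reference is returned even when no reference is a good match.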
It is currently difficult to detect unknown viruses in any given environment. The recent discovery of CRISPR (clusters of regularly interspaced short palindromic repeats) loci within bacterial and archaeal cellular genomes may provide an alternative approach to detect new viruses. It has been shown that the spacer sequences between the direct repeat units of the CRISPR loci are often derived from viruses and likely function as guide sequences to protect the cell from viral infection. The spacer sequences within the CRISPR loci may therefore serve as a record of the viruses that have replicated within the cell. We have cataloged the CRISPR spacer sequences from cellular metagenomic data from high-temperature (>80°C), acidic (pH < 4) hot spring environments located in Yellowstone National Park (YNP). We designed a microarray platform utilizing these CRISPR spacer sequences as potential probes to detect viruses present in YNP hot spring environments. We show that this microarray approach can detect viral sequences directly from virus-enriched environmental samples, detecting new viruses which have not been previously characterized. We further demonstrated that this microarray approach can be used to examine temporal changes in viral populations within the environment. Our results demonstrate that CRISPR spacer sequence-based microarrays will be useful tools for detecting and monitoring viruses from diverse environmental samples.
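As a rough illustration of how spacer sequences might be cataloged and then used as probes, the sketch below splits a CRISPR array on exact copies of a known repeat and checks which spacers occur in a read on either strand. Exact string matching is an assumption made for simplicity; real repeats can be degenerate and microarray hybridization tolerates mismatches. The function names and sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (ACGT only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def extract_spacers(array_seq, repeat):
    """Return spacers lying between consecutive exact copies of `repeat`.

    The first and last pieces are flanking sequence, not spacers,
    so they are discarded.
    """
    parts = array_seq.upper().split(repeat.upper())
    return [p for p in parts[1:-1] if p]


def spacer_hits(spacers, read):
    """Return the spacers found in `read` on either strand (exact match)."""
    read = read.upper()
    return [s for s in spacers if s in read or reverse_complement(s) in read]


# Toy usage with made-up sequences
repeat = "GTTTCAGAC"
array = "AAA" + repeat + "ACGTACGTAC" + repeat + "TTGACCATGG" + repeat + "CCC"
spacers = extract_spacers(array, repeat)
hits = spacer_hits(spacers, "GG" + reverse_complement("TTGACCATGG") + "AA")
```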
A new type of viral-induced lysis system has recently been discovered for two unrelated archaeal viruses, STIV and SIRV2. Prior to the lysis of the infected host cell, unique pyramid-like lysis structures are formed on the cell surface by the protrusion of the underlying cell membrane through the overlying external S-layer. It is through these pyramid structures that assembled virions are released during lysis. The STIV viral protein c92 is responsible for the formation of these lysis structures. We searched for c92-like proteins in viral sequences present in multiple viral and cellular metagenomic libraries from Yellowstone National Park acidic hot spring environments. Phylogenetic analysis of these proteins demonstrates that, although c92-like proteins are detected in these environments, some are quite divergent and may represent new viral families. We hypothesize that this new viral lysis system is common within diverse archaeal viral populations found within acidic hot springs.
Icosahedral nontailed double-stranded DNA (dsDNA) viruses are present in all three domains of life, leading to speculation about a common viral ancestor that predates the divergence of Eukarya, Bacteria, and Archaea. This suggestion is supported by the shared general architecture of this group of viruses and the common fold of their major capsid protein. However, limited information on the diversity and replication of archaeal viruses, in general, has hampered further analysis. Sulfolobus turreted icosahedral virus (STIV), isolated from a hot spring in Yellowstone National Park, was the first icosahedral virus with an archaeal host to be described. Here we present a detailed characterization of the components forming this unusual virus. Using a proteomics-based approach, we identified nine viral and two host proteins from purified STIV particles. Interestingly, one of the viral proteins originates from a reading frame lacking a consensus start site. The major capsid protein (B345) was found to be glycosylated, implying a strong similarity to proteins from other dsDNA viruses. Sequence analysis and structural prediction of virion-associated viral proteins suggest that they may have roles in DNA packaging, penton formation, and protein-protein interaction. The presence of an internal lipid layer containing acidic tetraether lipids has also been confirmed. The previously presented structural models in conjunction with the protein, lipid, and carbohydrate information reported here reveal that STIV is strikingly similar to viruses associated with the Bacteria and Eukarya domains of life, further strengthening the hypothesis for a common ancestor of this group of dsDNA viruses from all domains of life.
Thermosphaera aggregans Huber et al. 1998 is the type species of the genus Thermosphaera, which comprises at the time of writing only one species. This species represents archaea with a hyperthermophilic, heterotrophic, strictly anaerobic and fermentative phenotype. The type strain M11TL^T was isolated from a water-sediment sample of a hot terrestrial spring (Obsidian Pool, Yellowstone National Park, Wyoming). Here we describe the features of this organism, together with the complete genome sequence and annotation. The 1,316,595 bp long single replicon genome with its 1,410 protein-coding and 47 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.
hyperthermophile; strictly fermentative metabolism; sulfur reduction; obligate anaerobic; hot solfataric spring; Desulfurococcaceae; Crenarchaeota; GEBA
Roseiflexus sp. strains were cultivated from a microbial mat of an alkaline siliceous hot spring in Yellowstone National Park. These strains are closely related to predominant filamentous anoxygenic phototrophs found in the mat, as judged by the similarity of small-subunit rRNA, lipid distributions, and genomic and metagenomic sequences. Like a Japanese isolate, R. castenholzii, the Yellowstone isolates contain bacteriochlorophyll a, but not bacteriochlorophyll c or chlorosomes, and grow photoheterotrophically or chemoheterotrophically under dark aerobic conditions. The genome of one isolate, Roseiflexus sp. strain RS1, contains genes necessary to support these metabolisms. This genome also contains genes encoding the 3-hydroxypropionate pathway for CO2 fixation and a hydrogenase, which might enable photoautotrophic metabolism, even though neither isolate could be grown photoautotrophically with H2 or H2S as a possible electron donor. The isolates exhibit temperature, pH, and sulfide preferences typical of their habitat. Lipids produced by these isolates matched much better with mat lipids than do lipids produced by R. castenholzii or Chloroflexus isolates.
The complete genome sequences of two Sulfolobus spindle-shaped viruses (SSVs) from acidic hot springs in Kamchatka (Russia) and Yellowstone National Park (United States) have been determined. These nonlytic temperate viruses were isolated from hyperthermophilic Sulfolobus hosts, and both viruses share the spindle-shaped morphology characteristic of the Fuselloviridae family. These two genomes, in combination with the previously determined SSV1 genome from Japan and the SSV2 genome from Iceland, have allowed us to carry out a phylogenetic comparison of these geographically distributed hyperthermal viruses. Each virus contains a circular double-stranded DNA genome of ∼15 kbp with approximately 34 open reading frames (ORFs). These Fusellovirus ORFs show little or no similarity to genes in the public databases. In contrast, 18 ORFs are common to all four isolates and may represent the minimal gene set defining this viral group. In general, ORFs on one half of the genome are colinear and highly conserved, while ORFs on the other half are not. One shared ORF among all four genomes is an integrase of the tyrosine recombinase family. All four viral genomes integrate into their host tRNA genes. The specific tRNA gene used for integration varies, and one genome integrates into multiple loci. Several unique ORFs are found in the genome of each isolate.
CRISPR arrays and associated cas genes are widespread in bacteria and archaea and confer acquired resistance to viruses. To examine viral immunity in the context of naturally evolving microbial populations, we analyzed genomic data from two thermophilic Synechococcus isolates (Syn OS-A and Syn OS-B′) as well as a prokaryotic metagenome and viral metagenome derived from microbial mats in hot springs at Yellowstone National Park. Two distinct CRISPR types, distinguished by the repeat sequence, are found in both the Syn OS-A and Syn OS-B′ genomes. The genome of Syn OS-A contains a third CRISPR type with a distinct repeat sequence, which is not found in Syn OS-B′ but appears to be shared with other microorganisms that inhabit the mat. The CRISPR repeats identified in the microbial metagenome are highly conserved, while the spacer sequences (hereafter referred to as “viritopes” to emphasize their critical role in viral immunity) were mostly unique and had no high-identity matches when searched against GenBank. Searching the viritopes against the viral metagenome, however, yielded several matches with high similarity, some of which were within a gene identified as a likely viral lysozyme/lysin protein. Analysis of viral metagenome sequences corresponding to this lysozyme/lysin protein revealed several mutations, all of which are silent or conservative changes that are unlikely to affect protein function but may help the virus evade the host CRISPR resistance mechanism. These results demonstrate the varied challenges presented by a natural virus population, and support the notion that the CRISPR/viritope system must be able to adapt quickly to provide host immunity. The ability of metagenomics to track population-level variation in viritope sequences allows for a culture-independent method for evaluating the fast co-evolution of host and viral genomes and its consequence on the structuring of complex microbial communities.
Uncovering the chemical and physical links between natural environments and microbial communities is becoming increasingly feasible owing to geochemical observations and metagenomic sequencing. At the hot spring known as Bison Pool in Yellowstone National Park, the cooling of the water in the outflow channel is associated with an increase in oxidation potential estimated from multiple field-based measurements. Representative groups of proteins whose sequences were derived from metagenomic data also exhibit an increase in average oxidation state of carbon in the protein molecules with distance from the hot-spring source. The energetic requirements of reactions to form selected proteins used in the model were computed using amino-acid group additivity for the standard molal thermodynamic properties of the proteins, and the relative chemical stabilities of the proteins were investigated by varying temperature, pH and oxidation state, expressed as activity of dissolved hydrogen. The relative stabilities of the proteins were found to track the locations of the sampling sites when the calculations included a function for hydrogen activity that increases with temperature and is higher, or more reducing, than values consistent with measurements of dissolved oxygen, sulfide and oxidation-reduction potential in the field. These findings imply that spatial patterns in the amino acid compositions of proteins can be linked, through energetics of overall chemical reactions representing the formation of the proteins, to the environmental conditions at this hot spring, even if microbial cells maintain considerably different internal conditions. Further applications of the thermodynamic calculations are possible for other natural microbial ecosystems.
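The average oxidation state of carbon mentioned above can be computed from an elemental formula. The sketch below uses the standard bookkeeping for a species with c carbon, h hydrogen, n nitrogen, o oxygen and s sulfur atoms and net charge Z, namely (Z - h + 3n + 2o + 2s) / c; whether the study used exactly this expression is an assumption, as the abstract does not state it. The small amino acid table and the function names are illustrative only and cover just six residues.

```python
# Elemental formulas (C, H, N, O, S) of a few free, neutral amino acids.
# Deliberately incomplete: an illustrative subset, not a full table.
AMINO_ACID_FORMULAS = {
    "G": (2, 5, 1, 2, 0),   # glycine   C2H5NO2
    "A": (3, 7, 1, 2, 0),   # alanine   C3H7NO2
    "S": (3, 7, 1, 3, 0),   # serine    C3H7NO3
    "C": (3, 7, 1, 2, 1),   # cysteine  C3H7NO2S
    "D": (4, 7, 1, 4, 0),   # aspartate C4H7NO4
    "L": (6, 13, 1, 2, 0),  # leucine   C6H13NO2
}


def carbon_oxidation_state(c, h, n, o, s, charge=0):
    """Average oxidation state of carbon for a C/H/N/O/S species."""
    if c == 0:
        raise ValueError("formula contains no carbon")
    return (charge - h + 3 * n + 2 * o + 2 * s) / c


def peptide_formula(sequence):
    """Sum residue formulas and remove one H2O per peptide bond."""
    totals = [0, 0, 0, 0, 0]
    for residue in sequence:
        for i, count in enumerate(AMINO_ACID_FORMULAS[residue]):
            totals[i] += count
    bonds = max(len(sequence) - 1, 0)
    totals[1] -= 2 * bonds  # hydrogen lost as water
    totals[3] -= bonds      # oxygen lost as water
    return tuple(totals)


# Toy usage: a dipeptide of alanine and leucine
zc_al = carbon_oxidation_state(*peptide_formula("AL"))
```

Because condensation removes water, in which hydrogen and oxygen contribute equal and opposite terms, the numerator is unchanged by peptide bond formation; the value for a protein is therefore the carbon-weighted average of its residues.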
Phototrophic microbial mat communities from 60 °C and 65 °C regions in the effluent channels of Mushroom and Octopus Springs (Yellowstone National Park, WY, USA) were investigated by shotgun metagenomic sequencing. Analyses of assembled metagenomic sequences resolved six dominant chlorophototrophic populations and permitted the discovery and characterization of undescribed but predominant community members and their physiological potential. Linkage of phylogenetic marker genes and functional genes showed novel chlorophototrophic bacteria belonging to uncharacterized lineages within the order Chlorobiales and within the Kingdom Chloroflexi. The latter is the first chlorophototrophic member of Kingdom Chloroflexi that lies outside the monophyletic group of chlorophototrophs of the Order Chloroflexales. Direct comparison of unassembled metagenomic sequences to genomes of representative isolates showed extensive genetic diversity, genomic rearrangements and novel physiological potential in native populations as compared with genomic references. Synechococcus spp. metagenomic sequences showed a high degree of synteny with the reference genomes of Synechococcus spp. strains A and B′, but synteny declined with decreasing sequence relatedness to these references. There was evidence of horizontal gene transfer among native populations, but the frequency of these events was inversely proportional to phylogenetic relatedness.
cyanobacteria; Chloroflexi; community structure and function; metagenomics
Sulfolobus turreted icosahedral virus (STIV) was the first icosahedral virus characterized from an archaeal host. It infects Sulfolobus species that thrive in the acidic hot springs (pH 2.9 to 3.9 and 72 to 92°C) of Yellowstone National Park. The overall capsid architecture and the structure of its major capsid protein are very similar to those of the bacteriophage PRD1 and eukaryotic viruses Paramecium bursaria Chlorella virus 1 and adenovirus, suggesting a viral lineage that predates the three domains of life. The 17,663-base-pair, circular, double-stranded DNA genome contains 36 potential open reading frames, whose sequences generally show little similarity to other genes in the sequence databases. However, functional and evolutionary information may be suggested by a protein's three-dimensional structure. To this end, we have undertaken structural studies of the STIV proteome. Here we report our work on A197, the product of an STIV open reading frame. The structure of A197 reveals a GT-A fold that is common to many members of the glycosyltransferase superfamily. A197 possesses a canonical DXD motif and a putative catalytic base that are hallmarks of this family of enzymes, strongly suggesting a glycosyltransferase activity for A197. Potential roles for the putative glycosyltransferase activity of A197 and their evolutionary implications are discussed.
Paenibacillus sp. Y412MC10 was one of a number of organisms isolated from Obsidian Hot Spring, Yellowstone National Park, Montana, USA under permit from the National Park Service. The isolate was initially classified as Geobacillus sp. Y412MC10 based on its isolation conditions and similarity to other organisms isolated from hot springs at Yellowstone National Park. Comparison of 16S rRNA sequences within the Bacillales indicated that Geobacillus sp. Y412MC10 clustered with Paenibacillus species, and the organism was most closely related to Paenibacillus lautus. Lucigen Corp. prepared genomic DNA and the genome was sequenced, assembled, and annotated by the DOE Joint Genome Institute. The genome sequence was deposited at the NCBI in October 2009 (NC_013406). The genome of Paenibacillus sp. Y412MC10 consists of one circular chromosome of 7,121,665 bp with an average G+C content of 51.2%. Comparison to other Paenibacillus species shows the organism lacks nitrogen fixation, antibiotic production and social interaction genes reported in other paenibacilli. The Y412MC10 genome shows a high level of synteny and homology to the draft sequence of Paenibacillus sp. HGF5, an organism from the Human Microbiome Project (HMP) Reference Genomes. This, combined with genomic CAZyme analysis, suggests an intestinal, rather than environmental, origin for Y412MC10.
Geobacillus sp. Y412MC10; Paenibacillus sp. Y412MC10; Obsidian Hot Spring
The Yellowstone caldera contains the most numerous and diverse geothermal systems on Earth, yielding an extensive array of unique high-temperature environments that host a variety of deeply-rooted and understudied Archaea, Bacteria and Eukarya. The combination of extreme temperature and chemical conditions encountered in geothermal environments often results in considerably less microbial diversity than other terrestrial habitats and offers a tremendous opportunity for studying the structure and function of indigenous microbial communities and for establishing linkages between putative metabolisms and element cycling. Metagenome sequence (14–15,000 Sanger reads per site) was obtained for five high-temperature (>65°C) chemotrophic microbial communities sampled from geothermal springs (or pools) in Yellowstone National Park (YNP) that exhibit a wide range in geochemistry including pH, dissolved sulfide, dissolved oxygen and ferrous iron. Metagenome data revealed significant differences in the predominant phyla associated with each of these geochemical environments. Novel members of the Sulfolobales are dominant in low pH environments, while other Crenarchaeota including distantly-related Thermoproteales and Desulfurococcales populations dominate in suboxic sulfidic sediments. Several novel archaeal groups are well represented in an acidic (pH 3) Fe-oxyhydroxide mat, where a higher O2 influx is accompanied by an increase in archaeal diversity. The presence or absence of genes and pathways important in S oxidation-reduction, H2-oxidation, and aerobic respiration (terminal oxidation) provides insight regarding the metabolic strategies of indigenous organisms present in geothermal systems. Multiple-pathway and protein-specific functional analysis of metagenome sequence data corroborated results from phylogenetic analyses and clearly demonstrated major differences in metabolic potential across sites.
The distribution of functional genes involved in electron transport is consistent with the hypothesis that geochemical parameters (e.g., pH, sulfide, Fe, O2) control microbial community structure and function in YNP geothermal springs.
The phototrophic microbial mat community of Mushroom Spring, an alkaline siliceous hot spring in Yellowstone National Park, was studied by metatranscriptomic methods. RNA was extracted from mat specimens collected at four timepoints during light-to-dark and dark-to-light transitions in one diel cycle, and these RNA samples were analyzed by both pyrosequencing and SOLiD technologies. Pyrosequencing was used to assess the community composition, which showed that ∼84% of the rRNA was derived from members of four kingdoms Cyanobacteria, Chloroflexi, Chlorobi and Acidobacteria. Transcription of photosynthesis-related genes conclusively demonstrated the phototrophic nature of two newly discovered populations; these organisms, which were discovered through metagenomics, are currently uncultured and previously undescribed members of Chloroflexi and Chlorobi. Data sets produced by SOLiD sequencing of complementary DNA provided >100-fold greater sequence coverage. The much greater sequencing depth allowed transcripts to be detected from ∼15 000 genes and could be used to demonstrate statistically significant differential transcription of thousands of genes. Temporal differences for in situ transcription patterns of photosynthesis-related genes suggested that the six types of chlorophototrophs in the mats may use different strategies for maximizing their solar-energy capture, usage and growth. On the basis of both temporal pattern and transcript abundance, intra-guild gene expression differences were also detected for two populations of the oxygenic photosynthesis guild. This study showed that, when community-relevant genomes and metagenomes are available, SOLiD sequencing technology can be used for metatranscriptomic analyses, and the results suggested that this method can potentially reveal new insights into the ecophysiology of this model microbial community.
metatranscriptome; photosynthesis; phototrophy; cyanobacteria; chlorophyll; reaction centers
Glycerol dialkyl glycerol tetraethers (GDGTs) are core membrane lipids originally thought to be produced mainly by (hyper)thermophilic archaea. Environmental screening of low-temperature environments showed, however, the abundant presence of structurally diverse GDGTs from both bacterial and archaeal sources. In this study, we examined the occurrences and distribution of GDGTs in hot spring environments in Yellowstone National Park with high temperatures (47 to 83°C) and mostly neutral to alkaline pHs. GDGTs with 0 to 4 cyclopentane moieties were dominant in all samples and are likely derived from both (hyper)thermophilic Crenarchaeota and Euryarchaeota. GDGTs with 4 to 8 cyclopentane moieties, likely derived from the crenarchaeotal order Sulfolobales and the euryarchaeotal order Thermoplasmatales, are usually present in much lower abundance, consistent with the relatively high pH values of the hot springs. The relative abundances of cyclopentane-containing GDGTs did not correlate with in situ temperature and pH, suggesting that other environmental and possibly genetic factors play a role as well. Crenarchaeol, a biomarker thought to be specific for nonthermophilic group I Crenarchaeota, was also found in most hot springs, though in relatively low concentrations, i.e., <5% of total GDGTs. Its abundance did not correlate with temperature, as has been reported previously. Instead, the cooccurrence of relatively abundant nonisoprenoid GDGTs thought to be derived from soil bacteria suggests a predominantly allochthonous source for crenarchaeol in these hot spring environments. Finally, the distribution of bacterial branched GDGTs suggests that they may be derived from the geothermally heated soils surrounding the hot springs.
The microorganisms inhabiting a 91°C hot spring in Yellowstone National Park were characterized by sequencing 5S rRNAs isolated from the mixed, natural microflora without cultivation. By comparisons of these sequences with reference sequences, the phylogenetic relationships of the hot spring organisms to better characterized ones were established. Quantitation of the total 5S-sized rRNAs revealed a complex microbial community of three dominant members, a predominant archaebacterium affiliated with the sulfur-metabolizing (dependent) branch of the archaebacteria, and two eubacteria distantly related to Thermus spp. The archaebacterial and the eubacterial 5S rRNAs each constituted about half the examined population.
Denaturing gradient gel electrophoresis analysis of PCR-amplified 16S rRNA gene segments was used to examine the distributions of bacterial populations within a hot spring microbial mat (Octopus Spring, Yellowstone National Park). Populations at sites along the thermal gradient of the spring's effluent channel were surveyed at seasonal intervals. No shift in the thermal gradient was detected, and populations at spatially or temperature-defined sites exhibited only slight changes over the annual sampling period. A new cyanobacterial 16S rRNA sequence type was detected at temperatures from 63 to 75°C. A new green nonsulfur bacterium-like sequence type was also detected at temperatures from 53 to 62°C. Genetically unique though closely related cyanobacterial and green nonsulfur bacterium-like populations were successively distributed along the thermal gradient of the Octopus Spring effluent channel. At least two cyanobacterial populations were detected at each site; however, a limited ability to detect some cyanobacterial populations suggests that only dominant populations were observed.
The phylogenetic group termed OP5 was originally discovered in a Yellowstone National Park hot spring and proposed as an uncultured phylum; the group was afterwards analyzed by applying culture-independent approaches. Recently, a novel thermophilic chemoheterotrophic filamentous bacterium was obtained from a hot spring in Japan after enrichment through various isolation procedures. Phylogenetic analyses of the isolate have revealed that it is closely related to the OP5 phylum, which has mainly been defined by environmental clones retrieved from thermophilic and mesophilic anaerobic environments. It appears that the lineage is independent at the phylum level in the domain Bacteria. Therefore, we designed a primer set for the 16S rRNA gene to specifically target the OP5 phylum and performed quantitative field analysis by using the real-time PCR method. The 16S rRNA gene of the OP5 phylum was detected in some hot-spring samples, with relative abundance ranging from 0.2% to 1.4% of the prokaryotic organisms detected. The physiology of the above-mentioned isolate and the related environmental clones indicated that they are scavengers contributing to the sulfur cycle in nature.
An emerging model for investigating virus-host interactions in hyperthermophilic Archaea is the Fusellovirus-Sulfolobus system. The host, Sulfolobus, is a hyperthermophilic acidophile endemic to sulfuric hot springs worldwide. The Fuselloviruses, also known as Sulfolobus Spindle-shaped Viruses (SSVs), are “lemon” or “spindle”-shaped double-stranded DNA viruses, which are also found worldwide. Although a few studies have addressed the host range for the type virus, Sulfolobus Spindle-shaped Virus 1 (SSV1), using common Sulfolobus strains, a comprehensive host-range study for SSV-Sulfolobus systems has not been performed. Herein, we examine six bona fide SSV strains (SSV1, SSV2, SSV3, SSVL1, SSVK1, SSVRH) and their respective infection characteristics on multiple hosts from the family Sulfolobaceae. A spot-on-lawn or “halo” assay was employed to determine SSV infectivity (and host susceptibility) in parallel challenges of multiple SSVs on a lawn of a single Sulfolobus strain. Different SSVs have different host ranges, with SSV1 exhibiting the narrowest host range and SSVRH exhibiting the broadest host range. In contrast to previous reports, SSVs can infect hosts beyond the genus Sulfolobus. Furthermore, geography does not appear to be a reliable predictor of Sulfolobus susceptibility to infection by any given SSV. The ability of SSVs to infect susceptible Sulfolobus hosts does not appear to change between 65°C and 88°C (physiological range); however, very low pH appears to influence infection. Lastly, for the virus-host pairs tested, the Fusellovirus-Sulfolobus system appears to exhibit host advantage. This work provides a foundation for understanding Fusellovirus biology and virus-host coevolution in extreme ecosystems.
Archaea; Crenarchaea; Fusellovirus; halo assay; host-range; hyperthermophilic; Sulfolobus; Sulfolobus spindle-shaped virus
At this time, about 3,000 different viruses are recognized, but metagenomic studies suggest that these viruses are a small fraction of the viruses that exist in nature. We have explored viral diversity by deep sequencing nucleic acids obtained from virion populations enriched from raw sewage. We identified 234 known viruses, including 17 that infect humans. Plant, insect, and algal viruses as well as bacteriophages were also present. These viruses represented 26 taxonomic families and included viruses with single-stranded DNA (ssDNA), double-stranded DNA (dsDNA), positive-sense ssRNA [ssRNA(+)], and dsRNA genomes. Novel viruses that could be placed in specific taxa represented 51 different families, making untreated wastewater the most diverse viral metagenome (genetic material recovered directly from environmental samples) examined thus far. However, the vast majority of sequence reads bore little or no sequence relation to known viruses and thus could not be placed into specific taxa. These results show that the vast majority of the viruses on Earth have not yet been characterized. Untreated wastewater provides a rich matrix for identifying novel viruses and for studying virus diversity.
Importance: At this time, virology is focused on the study of a relatively small number of viral species. Specific viruses are studied either because they are easily propagated in the laboratory or because they are associated with disease. The lack of knowledge of the size and characteristics of the viral universe and the diversity of viral genomes is a roadblock to understanding important issues, such as the origin of emerging pathogens and the extent of gene exchange among viruses. Untreated wastewater is an ideal system for assessing viral diversity because virion populations from large numbers of individuals are deposited and because raw sewage itself provides a rich environment for the growth of diverse host species and thus their viruses. These studies suggest that the viral universe is far more vast and diverse than previously suspected.
We characterized and compared five geographically isolated hot springs with distinct red-layer communities in Yellowstone National Park. Individual red-layer communities were observed to thrive in temperatures ranging from 35 to 60°C and at pH 7 to 9. All communities were dominated by red filamentous bacteria and contained bacteriochlorophyll a (Bchl a), suggesting that they represented novel green nonsulfur (GNS) bacteria. The in vivo absorption spectra of individual sites were different, with two sites showing unusual Bchl a protein absorption bands beyond 900 nm. We prepared and analyzed 16S rRNA libraries from all of these sites by using a combination of general bacterial primers and new GNS-specific primers described here. These studies confirmed the presence of novel GNS-like bacteria in all five communities. All GNS-like clones were most similar to Roseiflexus castenholzii, a red filamentous bacterium from Japan that also contains only Bchl a. Phylogenies constructed by using GNS-like clones from Yellowstone red-layer communities suggest the presence of a moderately diverse new “red” cluster within the GNS lineage. Within this cluster, at least two well-supported subclusters emerged: YRL-A was most similar to Roseiflexus and YRL-B appeared to be novel, containing no known isolates. While these patterns showed some site specificity, they did not correlate with observed Bchl a spectrum differences or obvious features of the habitat.
Archaea often live in extreme, harsh environments such as acidic hot springs and hypersaline waters. To date, only two icosahedrally symmetric, membrane-containing archaeal viruses, SH1 and Sulfolobus turreted icosahedral virus (STIV), have been described in detail. We report the sequence and three-dimensional structure of a third such virus isolated from a hyperthermoacidophilic crenarchaeon, Sulfolobus strain G4ST-2. Characterization of this new isolate revealed it to be similar to STIV on the levels of genome and structural organization. The genome organization indicates that these two viruses have diverged from a common ancestor. Interestingly, the prominent surface turrets of the two viruses are strikingly different. By sequencing and mass spectrometry, we mapped several large insertions and deletions in the known structural proteins that could account for these differences and showed that both viruses can infect the same host. A combination of genomic and proteomic analyses revealed important new insights into the structural organization of these viruses and added to our limited knowledge of archaeal virus life cycles and host-cell interactions.
We have constructed a conceptual model of biogeochemical cycles and metabolic and microbial community shifts within a hot spring ecosystem via coordinated analysis of the “Bison Pool” (BP) Environmental Genome and a complementary contextual geochemical dataset of ∼75 geochemical parameters. A total of 2,321 16S rRNA clones and 470 megabases of environmental sequence data were produced from biofilms at five sites along the outflow of BP, an alkaline hot spring in Sentinel Meadow (Lower Geyser Basin) of Yellowstone National Park. This channel acts as a >22 m gradient of decreasing temperature, increasing dissolved oxygen, and changing availability of biologically important chemical species, such as those containing nitrogen and sulfur. Microbial life at BP transitions from a 92°C chemotrophic streamer biofilm community in the BP source pool to a 56°C phototrophic mat community. We improved automated annotation of the BP environmental genomes using BLAST-based Markov clustering. We have also assigned environmental genome sequences to individual microbial community members by complementing traditional homology-based assignment with nucleotide word-usage algorithms, allowing more than 70% of all reads to be assigned to source organisms. This assignment yields high genome coverage in dominant community members, facilitating reconstruction of nearly complete metabolic profiles and in-depth analysis of the relation between geochemical and metabolic changes along the outflow. We show that changes in environmental conditions and energy availability are associated with dramatic shifts in microbial communities and metabolic function. We have also identified an organism constituting a novel phylum in a metabolic “transition” community, located physically between the chemotroph- and phototroph-dominated sites.
The complementary analysis of biogeochemical and environmental genomic data from BP has allowed us to build ecosystem-based conceptual models for this hot spring, reconstructing whole metabolic networks in order to illuminate community roles in shaping and responding to geochemical variability.
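The nucleotide word-usage assignment mentioned for the BP reads, like the oligonucleotide-signature comparison described for GSPC, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (kmer_profile, distance, assign), the default word length k=4, the single-strand counting, and the Euclidean distance are all assumptions chosen for illustration, and published methods typically use more careful normalization and statistics.

```python
from collections import Counter
from itertools import product
import math


def kmer_profile(seq, k=4):
    """Return normalized k-mer frequencies over the ACGT alphabet.

    Hypothetical helper: words containing N or other ambiguity
    codes are ignored, and only the given strand is counted.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[w] for w in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[w] / total for w in kmers]


def distance(p, q):
    """Euclidean distance between two frequency profiles (an assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def assign(fragment, references, k=4):
    """Return the name of the reference whose profile is closest to the fragment.

    `references` maps a label to a reference sequence string.
    """
    frag = kmer_profile(fragment, k)
    return min(
        references,
        key=lambda name: distance(frag, kmer_profile(references[name], k)),
    )


# Toy data, invented purely for illustration.
references = {
    "at_rich": "ATATATATTTAAATATATAT" * 5,
    "gc_rich": "GCGCGGCCGCGCGGGCCC" * 5,
}
print(assign("ATATTTAATATAAT", references, k=2))
```

In practice, short fragments give noisy profiles, so word-usage methods are usually applied to longer contigs or combined with homology-based evidence, as the abstract above describes.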
Glycerol dialkyl glycerol tetraethers (GDGTs) found in hot springs reflect the abundance and community structure of Archaea in these extreme environments. The relationships between GDGTs, archaeal communities, and physical or geochemical variables are underexamined to date and, when reported, often result in conflicting interpretations. Here, we examined profiles of GDGTs from pure cultures of Crenarchaeota and from terrestrial geothermal springs representing a wide distribution of locations, including Yellowstone National Park (United States), the Great Basin of Nevada and California (United States), Kamchatka (Russia), Tengchong thermal field (China), and Thailand. These samples had temperatures of 36.5 to 87°C and pH values of 3.0 to 9.2. GDGT abundances also were determined for three soil samples adjacent to some of the hot springs. Principal component analysis identified four factors that accounted for most of the variance among nine individual GDGTs, temperature, and pH. Significant correlations were observed between pH and the GDGTs crenarchaeol and GDGT-4 (four cyclopentane rings, m/z 1,294); pH correlated positively with crenarchaeol and inversely with GDGT-4. Weaker correlations were observed between temperature and the four factors. Three of the four GDGTs used in the marine TEX86 paleotemperature index (GDGT-1 to -3, but not the crenarchaeol isomer) were associated with a single factor. No correlation was observed for GDGT-0 (acyclic caldarchaeol): it is effectively its own variable. The biosynthetic mechanisms and exact archaeal community structures leading to these relationships remain unknown. However, the data in general show promise for the continued development of GDGT lipid-based physicochemical proxies for archaeal evolution and for paleoecology or paleoclimate studies.
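For readers unfamiliar with the TEX86 index referred to above, the sketch below computes it from the four GDGT abundances using the commonly cited ratio form, (GDGT-2 + GDGT-3 + crenarchaeol isomer) / (GDGT-1 + GDGT-2 + GDGT-3 + crenarchaeol isomer). The function name tex86 and its argument names are hypothetical, and the abstract does not state that the authors computed the index this way; the example values are invented.

```python
def tex86(gdgt1, gdgt2, gdgt3, cren_isomer):
    """Compute the TEX86 ratio from relative GDGT abundances.

    Inputs are abundances (any consistent unit) of GDGT-1, GDGT-2,
    GDGT-3, and the crenarchaeol isomer. Names are illustrative only.
    """
    denominator = gdgt1 + gdgt2 + gdgt3 + cren_isomer
    if denominator <= 0:
        raise ValueError("total GDGT abundance must be positive")
    return (gdgt2 + gdgt3 + cren_isomer) / denominator


# Invented abundances: equal amounts of all four lipids.
print(tex86(1.0, 1.0, 1.0, 1.0))
```

Because the index is a ratio, it is unchanged if all four abundances are scaled by the same factor, which is why relative peak areas are sufficient as input.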
The microbial mats growing in the runoff channels of the hot springs of Yellowstone National Park (YNP) are a rich mix of bacterial, archaeal, and eukaryotic species. Mat samples were gathered from Octopus Hot Spring in 2005 and 2006. The samples were subjected to labeling with iTRAQ reagents followed by shotgun proteomics. Mascot was used to query an in-house YNP database derived from the microbial portion of the NCBInr. It was expected that the majority of the proteins mapping to species with high temperature optima would be associated with the sample taken at a higher temperature, while those proteins from species with lower optima would be associated with the sample taken at a lower temperature. Although Synechococcus is the most abundant microorganism in the mat community when abundance is measured by the percentage of DNA in a metagenomics sample, more peptides were identified from Roseiflexus sp. RS-1 than from any other organism. This discrepancy is most likely due to the sample being a mix of the large red and small green layers of the mat. Synechococcus resides only in the green layer. The large number of distinct peptides from the Roseiflexus extracellular binding protein resulted in the highest Mowse score of the proteins identified. Eighty percent of the peptides associated with the extracellular binding protein were isolated from the 58°C sample, while only 20% of this protein came from the 71°C sample. This trend continues throughout the housekeeping proteins quantified from Roseiflexus, as might be expected of a thermophile with a lower temperature optimum. The remainder of the identified proteins showed the association with collection temperature that would be expected from the temperature optima of the microorganisms from which they were extracted. The majority of the proteins identified in this experiment were housekeeping and structural proteins.