We have previously used a de novo metagenomic assembly approach to describe the presence of an abundant gammaproteobacterium comprising nearly 15% of the microbial community in an intermediate salinity solar saltern pond. We have obtained this microbe in pure culture and describe the genome sequencing of the halophilic photoheterotrophic microbe, Spiribacter salinus M19-40.
doi:10.1128/genomeA.00179-12
PMCID: PMC3569344
PMID: 23409269
The mobilome of hyperthermophilic archaea dwelling in deep-sea hydrothermal vents is poorly characterized. To gain insight into genetic diversity and dynamics of mobile genetic elements in these environments, we have sequenced five new plasmids from different Thermococcus strains that have been isolated from geographically remote hydrothermal vents. The plasmids were ascribed to two subfamilies, pTN2-like and pEXT9a-like. Gene content and phylogenetic analyses illuminated a robust connection between pTN2-like plasmids and Pyrococcus abyssi virus 1 (PAV1), with roughly half of the viral genome being composed of genes that have homologues in plasmids. Unexpectedly, pEXT9a-like plasmids were found to be closely related to the previously sequenced plasmid pMETVU01 from Methanocaldococcus vulcanius M7. Our data suggest that the latter observation is most compatible with an unprecedented horizontal transfer of a pEXT9a-like plasmid from Thermococcales to Methanococcales. Gene content analysis revealed that thermococcal plasmids encode Hfq-like proteins and toxin-antitoxin (TA) systems of two different families, VapBC and RelBE. Notably, although abundant in archaeal genomes, to our knowledge, TA and hfq-like genes have not been previously found in archaeal plasmids or viruses. Finally, the plasmids described here might prove to be useful in developing new genetic tools for hyperthermophiles.
doi:10.1371/journal.pone.0049044
PMCID: PMC3543421
PMID: 23326305
Gonzaga, Aitor | Martin-Cuadrado, Ana-Belen | López-Pérez, Mario | Megumi Mizuno, Carolina | García-Heredia, Inmaculada | Kimes, Nikole E. | Lopez-García, Purificación | Moreira, David | Ussery, David | Zaballos, Mila | Ghai, Rohit | Rodriguez-Valera, Francisco
We have analyzed a natural population of the marine bacterium, Alteromonas macleodii, from a single sample of seawater to evaluate the genomic diversity present. We performed full genome sequencing of four isolates and 161 metagenomic fosmid clones, all of which were assigned to A. macleodii by sequence similarity. Out of the four strain genomes, A. macleodii deep ecotype (AltDE1) represented a different genome, whereas AltDE2 and AltDE3 were identical to the previously described AltDE. Although the core genome (∼80%) had an average nucleotide identity of 98.51%, both AltDE and AltDE1 contained flexible genomic islands (fGIs), that is, genomic islands present in both genomes in the same genomic context but having different gene content. Some of the fGIs encode cell surface receptors known to be phage recognition targets, such as the O-chain of the lipopolysaccharide, whereas others have genes involved in physiological traits (e.g., nutrient transport, degradation, and metal resistance) denoting microniche specialization. The presence in metagenomic fosmids of genomic fragments differing from the sequenced strain genomes, together with the presence of new fGIs, indicates that there are at least two more A. macleodii clones present. The availability of three or more sequences overlapping the same genomic region also allowed us to estimate the frequency and distribution of recombination events among these different clones, indicating that these clustered near the genomic islands. The results indicate that this natural A. macleodii population has multiple clones with a potential for different phage susceptibility and exploitation of resources, within a seemingly unstructured habitat.
doi:10.1093/gbe/evs112
PMCID: PMC3542563
PMID: 23212172
Alteromonas macleodii; metagenome; population genomics; genomic island; constant-diversity; phage
Bacteria belonging to the SAR11 clade are among the most abundant prokaryotes in the pelagic zone of the ocean. 16S rRNA gene-based analyses indicate that they constitute up to 60% of the bacterioplankton community in the surface waters of the Red Sea. This extremely oligotrophic water body is further characterized by an epipelagic zone, which has a temperature above 24°C throughout the year, and a remarkably uniform temperature (∼22°C) and salinity (∼41 psu) from the mixed layer (∼200 m) to the bottom at over 2000 m depth. Despite these conditions that set it apart from other marine environments, the microbiology of this ecosystem is still vastly understudied. Prompted by the limited phylogenetic resolution of the 16S rRNA gene, we extended our previous study by sequencing the internal transcribed spacer (ITS) region of SAR11 at different depths of the Red Sea’s water column together with the respective 16S fragment. The overall diversity captured by the ITS loci was ten times higher than that of the corresponding 16S rRNA genes. Moreover, species estimates based on the ITS showed a highly diverse population of SAR11 in the mixed layer that became diminished in deep isothermal waters, which was in contrast to results of the related 16S rRNA genes. While the 16S rRNA gene-based sequences clustered into three phylogenetic subgroups, the related ITS fragments fell into several phylotypes that showed clear depth-dependent shifts in relative abundances. Blast-based analyses not only documented the observed vertical partitioning and universal co-occurrence of specific phylotypes in five other distinct oceanic provinces, but also highlighted the influence of ecosystem-specific traits (e.g., temperature, nutrient availability, and concentration of dissolved oxygen) on the population dynamics of this ubiquitous marine bacterium.
doi:10.1371/journal.pone.0050274
PMCID: PMC3502338
PMID: 23185592
The 16S rRNA gene (rrs) is considered to be of low taxonomic interest in the genus Aeromonas. Here, 195 Aeromonas strains belonging to populations structured by multilocus phylogeny were studied using an original approach that considered Ribosomal Multi-Operon Diversity. This approach associated pulsed-field gel electrophoresis (PFGE) to assess rrn operon number and distribution across the chromosome and PCR-temporal temperature gel electrophoresis (TTGE) to assess rrs V3 region heterogeneity. Aeromonads harbored 8 to 11 rrn operons, 10 operons being observed in more than 92% of the strains. Intraspecific variability was low or null except for A. salmonicida and A. aquariorum, suggesting that large chromosomal rearrangements might occur in these two species while being extremely rarely encountered in the evolution of other taxa. An rrn operon number of 8, as well as PFGE patterns, was shown to be valuable for taxonomic purposes, allowing resolution of species complexes. PCR-TTGE revealed a high rate of strains (41.5%) displaying intragenomic rrs heterogeneity. Strains isolated from human samples more frequently displayed intragenomic heterogeneity than strains recovered from non-human and environmental specimens. Intraspecific variability ranged from 0 to 76.5% of the strains. The observation of species-specific TTGE bands, the recovery of identical V3 regions in different species and the variability of intragenomic heterogeneity (1–13 divergent nucleotides) supported the occurrence of mutations and horizontal transfer in aeromonad rrs evolution. Altogether, the presence of a high number of rrn operons, the high proportion of strains harboring divergent rrs V3 regions and the previously demonstrated high level of genetic diversity argued in favor of highly adaptive capabilities of aeromonads. Outstanding features observed for A. caviae supported the previously hypothesized ongoing process of adaptation to a specialized niche represented by the gut.
The 16S rRNA gene is an informative marker in the genus Aeromonas for both evolutionary and polyphasic taxonomic studies provided that multi-operon fingerprinting approaches are used.
doi:10.1371/journal.pone.0046268
PMCID: PMC3459834
PMID: 23032081
Alteromonas macleodii is a marine gammaproteobacterium with widespread distribution in temperate or tropical waters. We describe three genomes of isolates from surface waters around Europe (Atlantic, Mediterranean and Black Sea) and compare them with a previously described deep Mediterranean isolate (AltDE) that belongs to a widely divergent clade. The surface isolates are quite similar, the most divergent being the Black Sea (BS11) isolate. The genomes contain several genomic islands with different gene content. The recruitment of very similar genomic fragments from metagenomes in different locations indicates that the surface clade is globally abundant with little effect of geography, even the AltDE and the BS11 genomes recruiting from surface samples in open ocean locations. The finding of CRISPR protospacers of AltDE in a lysogenic phage in the Atlantic (English Channel) isolate illustrates a flow of genetic material among these clades and a remarkably wide distribution of this phage.
doi:10.1038/srep00696
PMCID: PMC3458243
PMID: 23019517
Viruses are a crucial component of the human microbiome, but large population sizes, high sequence diversity, and high frequencies of novel genes have hindered genomic analysis by high-throughput sequencing. Here we investigate approaches to metagenomic assembly to probe genome structure in a sample of 5.6 Gb of gut viral DNA sequence from six individuals. Tests showed that a new pipeline based on de Bruijn graph assembly yielded longer contigs that were able to recruit more reads than the equivalent non-optimized, single-pass approach. To characterize gene content, the database of viral RefSeq proteins was compared to the assembled viral contigs, generating a bipartite graph with functional cassettes linking together viral contigs, which revealed a high degree of connectivity between diverse genomes involving multiple genes of the same functional class. In a second step, open reading frames were grouped by their co-occurrence on contigs in a database-independent manner, revealing conserved cassettes of co-oriented ORFs. These methods reveal that free-living bacteriophages, while usually dissimilar at the nucleotide level, often have significant similarity at the level of encoded amino acid motifs, gene order, and gene orientation. These findings thus connect contemporary metagenomic analysis with classical studies of bacteriophage genomic cassettes. Software is available at https://sourceforge.net/projects/optitdba/.
doi:10.1371/journal.pone.0042342
PMCID: PMC3416800
PMID: 22900013
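The assembly pipeline in the gut virome abstract above is based on de Bruijn graphs. As a rough illustration of that underlying data structure only (not the authors' pipeline or the linked software), a minimal Python sketch might build the graph from reads as follows. The function name `build_de_bruijn_graph`, the toy reads, and the choice of k = 4 are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def build_de_bruijn_graph(reads: Iterable[str], k: int) -> Dict[str, List[str]]:
    """Map each (k-1)-mer prefix to the distinct (k-1)-mer suffixes that follow it."""
    if k < 2:
        raise ValueError("k must be at least 2")
    graph: Dict[str, List[str]] = defaultdict(list)
    for read in reads:
        # Slide a window of length k along the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            prefix, suffix = kmer[:-1], kmer[1:]
            # Record each edge once, even if the k-mer recurs in other reads.
            if suffix not in graph[prefix]:
                graph[prefix].append(suffix)
    return dict(graph)


# Hypothetical overlapping reads, for illustration only.
toy_reads = ["ATGGCGT", "GGCGTGC"]
graph = build_de_bruijn_graph(toy_reads, k=4)
for node, successors in sorted(graph.items()):
    print(node, "->", ", ".join(successors))
```

Because the two toy reads overlap, their shared k-mers collapse onto the same edges, which is the property an assembler exploits when walking the graph to produce contigs.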
Ghai, Rohit | Hernandez, Claudia Mella | Picazo, Antonio | Mizuno, Carolina Megumi | Ininbergs, Karolina | Díez, Beatriz | Valas, Ruben | DuPont, Christopher L. | McMahon, Katherine D. | Camacho, Antonio | Rodriguez-Valera, Francisco
Coastal lagoons, both hypersaline and freshwater, are common but still understudied ecosystems. We describe, for the first time, using high-throughput sequencing, the extant microbiota of two large and representative Mediterranean coastal lagoons, the hypersaline Mar Menor and the freshwater Albufera de Valencia, both located on the southeastern coast of Spain. We show there are considerable differences in the microbiota of both lagoons, in comparison to other marine and freshwater habitats. Importantly, a novel uncultured sulfur-oxidizing alphaproteobacterium was found to dominate bacterioplankton in the hypersaline Mar Menor. Also, in the latter, cyanobacteria were composed almost exclusively of Synechococcus and no Prochlorococcus was found. Remarkably, the microbial community in the freshwaters of the hypertrophic Albufera was completely in contrast to known freshwater systems, in that there was a near absence of well-known and cosmopolitan groups of ultramicrobacteria, namely Low GC Actinobacteria and the LD12 lineage of Alphaproteobacteria.
doi:10.1038/srep00490
PMCID: PMC3391805
PMID: 22778901
Direct sequencing of environmental DNA (metagenomics) has a great potential for describing the 16S rRNA gene diversity of microbial communities. However, current approaches using this 16S rRNA gene information to describe community diversity suffer from low taxonomic resolution or chimera problems. Here we describe a new strategy that involves stringent assembly and data filtering to reconstruct full-length 16S rRNA genes from metagenomic pyrosequencing data. Simulations showed that reconstructed 16S rRNA genes provided a true picture of the community diversity, had minimal rates of chimera formation and gave taxonomic resolution down to genus level. The strategy was furthermore compared to PCR-based methods to determine the microbial diversity in two marine sponges. This showed that about 30% of the abundant phylotypes reconstructed from metagenomic data failed to be amplified by PCR. Our approach is readily applicable to existing metagenomic datasets and is expected to lead to the discovery of new microbial phylotypes.
doi:10.1371/journal.pone.0039948
PMCID: PMC3384625
PMID: 22761935
Vaulot, Daniel | Lepère, Cécile | Toulza, Eve | De la Iglesia, Rodrigo | Poulain, Julie | Gaboyer, Frédéric | Moreau, Hervé | Vandepoele, Klaas | Ulloa, Osvaldo | Gavory, Frederick | Piganeau, Gwenael | Rodriguez-Valera, Francisco
Among small photosynthetic eukaryotes that play a key role in oceanic food webs, picoplanktonic Mamiellophyceae such as Bathycoccus, Micromonas, and Ostreococcus are particularly important in coastal regions. By using a combination of cell sorting by flow cytometry, whole genome amplification (WGA), and 454 pyrosequencing, we obtained metagenomic data for two natural picophytoplankton populations from the coastal upwelling waters off central Chile. About 60% of the reads of each sample could be mapped to the genome of a Bathycoccus strain from the Mediterranean Sea (RCC1105), representing a total of 9 Mbp (sample T142) and 13 Mbp (sample T149) of non-redundant Bathycoccus genome sequences. WGA did not amplify all regions uniformly, resulting in unequal coverage along a given chromosome and between chromosomes. The identity at the DNA level between the metagenomes and the cultured genome was very high (96.3% identical bases for the three larger chromosomes over a 360 kbp alignment). At least two to three different genotypes seemed to be present in each natural sample based on read mapping to the Bathycoccus RCC1105 genome.
doi:10.1371/journal.pone.0039648
PMCID: PMC3382182
PMID: 22745802
Sequencing of microbial community RNA (metatranscriptome) is a useful approach for assessing gene expression in microorganisms from the natural environment. This method has revealed transcriptional patterns in situ, but can also be used to detect transcriptional cascades in microcosms following experimental perturbation. Unambiguously identifying differential transcription between control and experimental treatments requires constraining effects that are simply due to sampling and bottle enclosure. These effects remain largely uncharacterized for “challenging” microbial samples, such as those from anoxic regions that require special handling to maintain in situ conditions. Here, we demonstrate substantial changes in microbial transcription induced by sample collection and incubation in experimental bioreactors. Microbial communities were sampled from the water column of a marine oxygen minimum zone by a pump system that introduced minimal oxygen contamination and subsequently incubated in bioreactors under near in situ oxygen and temperature conditions. Relative to the source water, experimental samples became dominated by transcripts suggestive of cell stress, including chaperone, protease, and RNA degradation genes from diverse taxa, with strong representation from SAR11-like alphaproteobacteria. In tandem, transcripts matching facultative anaerobic gammaproteobacteria of the Alteromonadales (e.g., Colwellia) increased 4–13 fold up to 43% of coding transcripts, and encoded a diverse gene set suggestive of protein synthesis and cell growth. We interpret these patterns as taxon-specific responses to combined environmental changes in the bioreactors, including shifts in substrate or oxygen availability, and minor temperature and pressure changes during sampling with the pump system. 
Whether such changes confound analysis of transcriptional patterns may vary based on the design of the experiment, the taxonomic composition of the source community, and on the metabolic linkages between community members. These data highlight the impressive capacity for transcriptional changes within complex microbial communities, underscoring the need for caution when inferring in situ metabolism based on transcript abundances in experimental incubations.
doi:10.1371/journal.pone.0037118
PMCID: PMC3353902
PMID: 22615914
Viruses are ubiquitous in the oceans and critical components of marine microbial communities, regulating nutrient transfer to higher trophic levels or to the dissolved organic pool through lysis of host cells. Hydrothermal vent systems are oases of biological activity in the deep oceans, for which knowledge of biodiversity and its impact on global ocean biogeochemical cycling is still in its infancy. In order to gain biological insight into viral communities present in hydrothermal vent systems, we developed a method based on deep-sequencing of pulsed field gel electrophoretic bands representing key viral fractions present in seawater within and surrounding a hydrothermal plume derived from Loki's Castle vent field at the Arctic Mid-Ocean Ridge. The reduction in virus community complexity afforded by this novel approach enabled the near-complete reconstruction of a lambda-like phage genome from the virus fraction of the plume. Phylogenetic examination of distinct gene regions in this lambdoid phage genome unveiled diversity at loci encoding superinfection exclusion- and integrase-like proteins. This suggests the importance of fine-tuning lysogenic conversion as a viral survival strategy, and provides insights into the nature of host-virus and virus-virus interactions within hydrothermal plumes. By reducing the complexity of the viral community through targeted sequencing of prominent dsDNA viral fractions, this method has selectively mimicked virus dominance approaching that hitherto achieved only through culturing, thus enabling bioinformatic analysis to locate a lambdoid viral “needle” within the greater viral community “haystack”. Such targeted analyses have great potential for accelerating the extraction of biological knowledge from diverse and poorly understood environmental viral communities.
doi:10.1371/journal.pone.0034238
PMCID: PMC3324506
PMID: 22509283
Background
Metaviriomes, the viral genomes present in an environment, have been studied by direct sequencing of the viral DNA or by cloning in small insert libraries. The short reads generated by both approaches make it very difficult to assemble and annotate such flexible genomic entities. Many environmental viruses belong to unknown groups or prey on uncultured and little known cellular lineages, and hence might not be present in databases.
Methodology and Principal Findings
Here we have used a different approach, the cloning of viral DNA into fosmids before sequencing, to obtain natural contigs that are close to the size of a viral genome. We have studied a relatively low diversity extreme environment: saturated NaCl brines, which simplifies the analysis and interpretation of the data. Forty-two different viral genomes were retrieved, and some of these were almost complete, and could be tentatively identified as head-tail phages (Caudovirales).
Conclusions and Significance
We found a cluster of phage genomes that most likely infect Haloquadratum walsbyi, the square archaeon and major component of the community in these hypersaline habitats. The identity of the prey could be confirmed by the presence of CRISPR spacer sequences shared by the virus and one of the available strain genomes. Other viral clusters detected appeared to prey on the Nanohaloarchaea and on the bacterium Salinibacter ruber, covering most of the diversity of microbes found in this type of environment. This approach appears then as a viable alternative to describe metaviriomes in a much more detailed and reliable way than by the more common approaches based on direct sequencing. An example of transfer of a CRISPR cluster including repeats and spacers was accidentally found supporting the dynamic nature and frequent transfer of this peculiar prokaryotic mechanism of cell protection.
doi:10.1371/journal.pone.0033802
PMCID: PMC3316494
PMID: 22479446
The disaccharide trehalose is considered as a universal stress molecule, protecting cells and biomolecules from injuries imposed by high osmolarity, heat, oxidation, desiccation and freezing. Chromohalobacter salexigens is a halophilic and extremely halotolerant γ-proteobacterium of the family Halomonadaceae. In this work, we have investigated the role of trehalose as a protectant against salinity, temperature and desiccation in C. salexigens. A mutant deficient in the trehalose-6-phosphate synthase gene (otsA::Ω) was not affected in its salt or heat tolerance, but double mutants ectoine- and trehalose-deficient, or hydroxyectoine-reduced and trehalose-deficient, displayed an osmo- and thermosensitive phenotype, respectively. This suggests a role of trehalose as a secondary solute involved in osmo- (at least at low salinity) and thermoprotection of C. salexigens. Interestingly, trehalose synthesis was osmoregulated at the transcriptional level, and thermoregulated at the post-transcriptional level, suggesting that C. salexigens cells need to be pre-conditioned by osmotic stress, in order to be able to quickly synthesize trehalose in response to heat stress. C. salexigens was more sensitive to desiccation than E. coli and desiccation tolerance was slightly improved when cells were grown at high temperature. Under these conditions, single mutants affected in the synthesis of trehalose or hydroxyectoine were more sensitive to desiccation than the wild-type strain. However, given the low survival rates of the wild type, the involvement of trehalose and hydroxyectoine in C. salexigens response to desiccation could not be firmly established.
doi:10.1371/journal.pone.0033587
PMCID: PMC3308980
PMID: 22448254
Microbial metagenomes are DNA samples of the most abundant, and therefore most successful organisms at the sampling time and location for a given cell size range. The study of microbial communities via their DNA content has revolutionized our understanding of microbial ecology and evolution. Iron availability is a critical resource that limits microbial communities' growth in many oceanic areas. Here, we built a database of 2319 sequences, corresponding to 140 gene families of iron metabolism with a large phylogenetic spread, to explore the microbial strategies of iron acquisition in the ocean's bacterial community. We estimate iron metabolism strategies from metagenome gene content and investigate whether their prevalence varies with dissolved iron concentrations obtained from a biogeochemical model. We show significant quantitative and qualitative variations in iron metabolism pathways, with a higher proportion of iron metabolism genes in low iron environments. We found a striking difference between coastal and open ocean sites regarding Fe2+ versus Fe3+ uptake gene prevalence. We also show that non-specific siderophore uptake increases in low iron open ocean environments, suggesting bacteria may acquire iron from natural siderophore-like organic complexes. Despite the lack of knowledge of iron uptake mechanisms in most marine microorganisms, our approach provides insights into how the iron metabolic pathways of microbial communities may vary with seawater iron concentrations.
doi:10.1371/journal.pone.0030931
PMCID: PMC3281889
PMID: 22363520
Next-generation sequencing (NGS) is commonly used in metagenomic studies of complex microbial communities, but whether different NGS platforms recover the same diversity from a sample and whether their assembled sequences are of comparable quality remains unclear. We compared the two most frequently used platforms, the Roche 454 FLX Titanium and the Illumina Genome Analyzer (GA) II, on the same DNA sample obtained from a complex freshwater planktonic community. Despite the substantial differences in read length and sequencing protocols, the platforms provided a comparable view of the community sampled. For instance, derived assemblies overlapped in ∼90% of their total sequences and in situ abundances of genes and genotypes (estimated based on sequence coverage) correlated highly between the two platforms (R² > 0.9). Evaluation of base-call error, frameshift frequency, and contig length suggested that Illumina offered equivalent, if not better, assemblies than Roche 454. The results from metagenomic samples were further validated against DNA samples of eighteen isolate genomes, which showed a range of genome sizes and G+C% content. We also provide quantitative estimates of the errors in gene and contig sequences assembled from datasets characterized by different levels of complexity and G+C% content. For instance, we noted that homopolymer-associated, single-base errors affected ∼1% of the protein sequences recovered in Illumina contigs of 10× coverage and 50% G+C; this frequency increased to ∼3% when non-homopolymer errors were also considered. Collectively, our results should serve as a useful practical guide for choosing proper sampling strategies and data processing protocols for future metagenomic studies.
doi:10.1371/journal.pone.0030087
PMCID: PMC3277595
PMID: 22347999
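The platform comparison in the abstract above reports that coverage-based abundances correlated between the two platforms with a coefficient of determination above 0.9. A minimal Python sketch of how such a value could be computed from paired per-gene coverage values is given here. It uses the squared Pearson correlation, which is an assumption about the statistic; the function name and the coverage numbers are hypothetical and are not taken from the study.

```python
from typing import Sequence


def pearson_r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Return the squared Pearson correlation between two paired samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical per-gene coverage on two platforms, for illustration only.
coverage_platform_a = [12.0, 30.0, 5.0, 48.0, 21.0]
coverage_platform_b = [10.0, 33.0, 6.0, 45.0, 19.0]
r_squared = pearson_r_squared(coverage_platform_a, coverage_platform_b)
print(f"R^2 = {r_squared:.3f}")
```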
Eukaryotic organisms play essential roles in the biology and fertility of soils. For example, the micro- and mesofauna contribute to the fragmentation and homogenization of plant organic matter, while its hydrolysis is primarily performed by the fungi. To get a global picture of the activities carried out by soil eukaryotes, we sequenced 2×10,000 cDNAs synthesized from polyadenylated mRNA directly extracted from soils sampled in beech (Fagus sylvatica) and spruce (Picea abies) forests. Taxonomic affiliation of both cDNAs and 18S rRNA sequences showed a dominance of sequences from fungi (up to 60%) and metazoans, while protists represented less than 12% of the 18S rRNA sequences. Sixty percent of cDNA sequences from beech forest soil and 52% from spruce forest soil had no homologs in the GenBank/EMBL/DDBJ protein database. A Gene Ontology term was attributed to 39% and 31.5% of the spruce and beech soil sequences, respectively. Altogether, 2076 sequences were putative homologs to different enzyme classes participating in 129 KEGG pathways, among which several were implicated in the utilisation of soil nutrients such as nitrogen (ammonium, amino acids, oligopeptides), sugars, phosphates and sulfate. Specific annotation of plant cell wall degrading enzymes identified enzymes active on major polymers (cellulose, hemicelluloses, pectin, lignin), and glycoside hydrolases represented 0.5% (beech soil)–0.8% (spruce soil) of the cDNAs. Other sequences encoding enzymes active on organic matter (extracellular proteases, lipases, a phytase, P450 monooxygenases) were identified, thus underlining the biotechnological potential of eukaryotic metatranscriptomes. The phylogenetic affiliation of 12 full-length carbohydrate active enzymes showed that most of them were distantly related to sequences from known fungi.
For example, a putative GH45 endocellulase was closely associated to molluscan sequences, while a GH7 cellobiohydrolase was closest to crustacean sequences, thus suggesting a potentially significant contribution of non-fungal eukaryotes in the actual hydrolysis of soil organic matter.
doi:10.1371/journal.pone.0028967
PMCID: PMC3253082
PMID: 22238585
Bacterioplankton community metabolism is central to the functioning of aquatic ecosystems, and strongly reactive to changes in the environment, yet the processes underlying this response remain unclear. Here we explore the role that community composition plays in shaping the bacterial metabolic response to resource gradients that occur along aquatic ecotones in a complex watershed in Québec. Our results show that the response is mediated by complex shifts in community structure, and structural equation analysis confirmed two main pathways, one involving adjustments in the level of activity of existing phylotypes, and the other the replacement of the dominant phylotypes. These contrasting response pathways were not determined by the type or the intensity of the gradients involved, as we had hypothesized, but rather it would appear that some compositional configurations may be intrinsically more plastic than others. Our results suggest that community composition determines this overall level of community plasticity, but that composition itself may be driven by factors independent of the environmental gradients themselves, such that the response of bacterial communities to a given type of gradient may alternate between the adjustment and replacement pathways. We conclude that community composition influences the pathways of response in these bacterial communities, but not the metabolic outcome itself, which is driven by the environment, and which can be attained through multiple alternative configurations.
doi:10.1371/journal.pone.0025266
PMCID: PMC3181318
PMID: 21980410
Environmental metagenomics provides snippets of genomic sequences from all organisms in an environmental sample and is an unprecedented resource of information for investigating microbial population genetics. Current analytical methods, however, are poorly equipped to handle metagenomic data, particularly of short, unlinked sequences. A custom analytical pipeline was developed to calculate dN/dS ratios, a common metric to evaluate the role of selection in the evolution of a gene, from environmental metagenomes sequenced using 454 technology of flow-sorted populations of marine Synechococcus, the dominant cyanobacteria in coastal environments. The large majority of genes (98%) have evolved under purifying selection (dN/dS<1). The metagenome sequence coverage of the reference genomes was not uniform and genes that were highly represented in the environment (i.e. high read coverage) tended to be more evolutionarily conserved. Of the genes that may have evolved under positive selection (dN/dS>1), 77 out of 83 (93%) were hypothetical. Notable among annotated genes, ribosomal protein L35 appears to be under positive selection in one Synechococcus population. Other annotated genes, in particular a possible porin, a large-conductance mechanosensitive channel, an ATP binding component of an ABC transporter, and a homologue of a pilus retraction protein, had regions of the gene with elevated dN/dS. With the increasing use of next-generation sequencing in metagenomic investigations of microbial diversity and ecology, analytical methods need to accommodate the peculiarities of these data streams. By developing a means to analyze population diversity data from these environmental metagenomes, we have provided the first insight into the role of selection in the evolution of Synechococcus, a globally significant primary producer.
doi:10.1371/journal.pone.0024249
PMCID: PMC3170327
PMID: 21931665
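The selection classes in the Synechococcus abstract above rest on comparing the dN/dS ratio with 1. The following minimal Python sketch shows that thresholding step only. It is not the authors' pipeline: the function names, the example rates, and the handling of dS = 0 are illustrative assumptions.

```python
from typing import Optional


def dn_ds_ratio(dn: float, ds: float) -> Optional[float]:
    """Return dN/dS, or None when dS is zero and the ratio is undefined."""
    if ds == 0:
        return None
    return dn / ds


def classify_selection(dn: float, ds: float) -> str:
    """Label a gene by comparing its dN/dS ratio with 1."""
    ratio = dn_ds_ratio(dn, ds)
    if ratio is None:
        return "undefined"
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"


# Hypothetical per-gene (dN, dS) estimates, for illustration only.
genes = {"geneA": (0.01, 0.20), "geneB": (0.30, 0.10), "geneC": (0.05, 0.0)}
labels = {name: classify_selection(dn, ds) for name, (dn, ds) in genes.items()}
print(labels)

# Proportion reported in the abstract: 77 of the 83 candidate genes were hypothetical.
hypothetical_fraction = 77 / 83
print(f"{hypothetical_fraction:.0%}")
```

In practice, dN and dS estimates from short, unlinked reads are noisy, so a real analysis would probably pair this threshold with a coverage filter or a statistical test rather than rely on the raw ratio.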
De Luca, Gilles | Barakat, Mohamed | Ortet, Philippe | Fochesato, Sylvain | Jourlin-Castelli, Cécile | Ansaldi, Mireille | Py, Béatrice | Fichant, Gwennaele | Coutinho, Pedro M. | Voulhoux, Romé | Bastien, Olivier | Maréchal, Eric | Henrissat, Bernard | Quentin, Yves | Noirot, Philippe | Filloux, Alain | Méjean, Vincent | DuBow, Michael S. | Barras, Frédéric | Barbe, Valérie | Weissenbach, Jean | Mihalcescu, Irina | Verméglio, André | Achouak, Wafa | Heulin, Thierry | Rodriguez-Valera, Francisco
Ramlibacter tataouinensis TTB310T (strain TTB310), a betaproteobacterium isolated from a semi-arid region of South Tunisia (Tataouine), is characterized by the presence of both spherical and rod-shaped cells in pure culture. Cell division of strain TTB310 occurs by the binary fission of spherical “cyst-like” cells (“cyst-cyst” division). The rod-shaped cells formed at the periphery of a colony (consisting mainly of cysts) are highly motile and colonize a new environment, where they form a new colony by reversion to cyst-like cells. This unique cell cycle of strain TTB310, with desiccation-tolerant cyst-like cells capable of division and desiccation-sensitive motile rods capable of dissemination, appears to be a novel adaptation for life in a hot and dry desert environment. To gain insight into the genetic repertoire of strain TTB310 and the possible mechanisms responsible for its unusual lifestyle, the genome of strain TTB310 was completely sequenced and subsequently annotated. The complete genome consists of a single circular chromosome of 4,070,194 bp with an average G+C content of 70.0%, the highest among the Betaproteobacteria sequenced to date, with a total of 3,899 predicted coding sequences covering 92% of the genome. We found that strain TTB310 has developed a highly complex network of two-component systems, which may utilize responses to light and perhaps a rudimentary circadian hourglass to anticipate water availability at the dew time in the middle/end of the desert winter nights and thus direct the growth window to cyclic water availability times. Other interesting features of the strain TTB310 genome that appear to be important for desiccation tolerance, including intermediary metabolism compounds such as trehalose or polyhydroxyalkanoate, and signal transduction pathways, are presented and discussed.
doi:10.1371/journal.pone.0023784
PMCID: PMC3164672
PMID: 21912644
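The genome statistics quoted in the Ramlibacter tataouinensis abstract above (G+C content, number of coding sequences, coding fraction) are simple ratios. The sketch below shows how such figures are commonly computed; the function names are hypothetical, and the mean coding-sequence length is only a rough estimate that assumes the predicted coding sequences do not overlap.

```python
def gc_content(sequence):
    """Percent G+C among unambiguous bases (A, C, G, T) of a DNA sequence."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


def mean_cds_length(genome_bp, coding_fraction, n_cds):
    """Approximate mean coding-sequence length, assuming non-overlapping genes."""
    return genome_bp * coding_fraction / n_cds


if __name__ == "__main__":
    # Toy sequence, not taken from the TTB310 genome.
    print(gc_content("GGCCAT"))
    # Figures reported in the abstract: 4,070,194 bp, 92% coding, 3,899 CDS.
    print(mean_cds_length(4_070_194, 0.92, 3_899))
```

With the reported figures the estimate comes to a little under 1 kb per coding sequence, which is in the usual range for bacterial genomes.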
Ghai, Rohit | Rodriguez-Valera, Francisco | McMahon, Katherine D. | Toyama, Danyelle | Rinke, Raquel | Cristina Souza de Oliveira, Tereza | Wagner Garcia, José | Pellon de Miranda, Fernando | Henrique-Silva, Flavio | Lopez-García, Purificación
River water is a small percentage of the total freshwater on Earth but represents an essential resource for mankind. Microbes in rivers perform essential ecosystem roles, including the mineralization of significant quantities of organic matter originating from terrestrial habitats. The Amazon River in particular is famous for its size and its importance in the mobilization of both water and carbon out of its enormous basin. Here we present the first metagenomic study of the microbiota of this river. The metagenome has many features in common with the other freshwater metagenome available (Lake Gatun in Panama) and much less similarity with marine samples. Among the microbial taxa found, the cosmopolitan freshwater acI lineage of the actinobacteria was clearly dominant. Group I Crenarchaea and the freshwater sister group of the marine SAR11 clade, LD12, were found alongside more exclusive and well-known freshwater taxa such as Polynucleobacter. A metabolism-centric analysis revealed a disproportionate representation of pathways involved in heterotrophic carbon processing compared with those found in marine samples. In particular, these river microbes appear to be specialized in taking up and mineralizing allochthonous carbon derived from plant material.
doi:10.1371/journal.pone.0023785
PMCID: PMC3158796
PMID: 21915244
Bacteria are highly diverse and drive the bulk of ecosystem processes. Analysis of relationships between diversity and single specific ecosystem processes neglects the possibility that different species perform multiple functions at the same time. The degradation of dissolved organic carbon (DOC) followed by respiration is a key bacterial function that is modulated by the availability of DOC and the capability to produce extracellular enzymes. In freshwater ecosystems, biofilms are metabolic hotspots and major sites of DOC degradation. We manipulated the diversity of biofilm-forming communities that were fed with DOC differing in availability. We characterized community composition using molecular fingerprinting (T-RFLP) and measured functioning as oxygen consumption rates, the conversion of DOC in the medium, bacterial abundance and the activities of five specific enzymes. Based on assays of extracellular enzyme activity, we calculated how the likelihood of sustaining multiple functions was affected by reduced diversity. Carbon source and biofilm age were strong drivers of community functioning, and we demonstrate how the likelihood of sustaining multifunctionality decreases with decreasing diversity.
doi:10.1371/journal.pone.0023225
PMCID: PMC3151291
PMID: 21850263
Oxygen-tolerant [NiFe] hydrogenases may be used in future photobiological hydrogen production systems once the enzymes can be heterologously expressed in host organisms of interest. To achieve heterologous expression of [NiFe] hydrogenases in cyanobacteria, the two hydrogenase structural genes from Alteromonas macleodii Deep ecotype (AltDE), hynS and hynL, along with the surrounding genes in the HynSL operon, were cloned in a vector with an IPTG-inducible promoter and introduced into Synechococcus elongatus PCC7942. The hydrogenase protein was expressed at the correct size upon induction with IPTG. The heterologously expressed HynSL hydrogenase was active when tested in an in vitro H2 evolution assay, indicating the correct assembly of the catalytic center in the cyanobacterial host. Using a similar expression system, the hydrogenase structural genes from Thiocapsa roseopersicina (hynSL) and the entire set of known accessory genes were transferred to S. elongatus. A protein of the correct size was expressed but had no activity. However, when the 11 accessory genes from AltDE were co-expressed with hynSL, the T. roseopersicina hydrogenase was found to be active in the in vitro assay. This is the first report of active, heterologously expressed [NiFe] hydrogenases in cyanobacteria.
doi:10.1371/journal.pone.0020126
PMCID: PMC3102683
PMID: 21637846
Eloe, Emiley A. | Fadrosh, Douglas W. | Novotny, Mark | Zeigler Allen, Lisa | Kim, Maria | Lombardo, Mary-Jane | Yee-Greenbaum, Joyclyn | Yooseph, Shibu | Allen, Eric E. | Lasken, Roger | Williamson, Shannon J. | Bartlett, Douglas H. | Rodriguez-Valera, Francisco
The paucity of sequence data from pelagic deep-ocean microbial assemblages has severely restricted molecular exploration of the largest biome on Earth. In this study, an analysis is presented of a large-scale 454-pyrosequencing metagenomic dataset from a hadopelagic environment at 6,000 m depth within the Puerto Rico Trench (PRT). A total of 145 Mbp of assembled sequence data was generated and compared to two pelagic deep-ocean metagenomes and two representative surface seawater datasets from the Sargasso Sea. In a number of instances, all three deep metagenomes displayed similar trends that were most pronounced in the PRT, including enrichment in functions for two-component signal transduction mechanisms and transcriptional regulation. Overrepresented transporters in the PRT metagenome included outer membrane porins, diverse cation transporters, and di- and tri-carboxylate transporters that matched well with the prevailing catabolic processes such as butanoate, glyoxylate and dicarboxylate metabolism. A surprisingly high abundance of sulfatases for the degradation of sulfated polysaccharides was also present in the PRT. The most dramatic adaptational feature of the PRT microbes appears to be heavy metal resistance, as reflected in the large numbers of transporters present for their removal. As a complement to the metagenome approach, single-cell genomic techniques were utilized to generate partial whole-genome sequence data from four uncultivated cells from members of the dominant phyla within the PRT: Alphaproteobacteria, Gammaproteobacteria, Bacteroidetes and Planctomycetes. The single-cell sequence data provided genomic context for many of the highly abundant functional attributes identified from the PRT metagenome and recruited PRT metagenomic sequence data heavily compared with 172 available reference marine genomes.
Through these multifaceted sequence approaches, new insights have been provided into the unique functional attributes present in microbes residing in a deeper layer of the ocean far removed from the more productive sun-drenched zones above.
doi:10.1371/journal.pone.0020388
PMCID: PMC3101246
PMID: 21629664
High-throughput sequencing technologies have strongly impacted microbiology, providing a rapid and cost-effective way of generating draft genomes and exploring microbial diversity. However, sequences obtained from impure nucleic acid preparations may contain DNA from sources other than the sample. Such sequence contamination is a serious concern for the quality of the data used in downstream analysis, causing misassembly of sequence contigs and erroneous conclusions. Therefore, the removal of sequence contaminants is a necessary step for all sequencing projects. We developed DeconSeq, a robust framework for the rapid, automated identification and removal of sequence contamination in longer-read datasets (150 bp mean read length). DeconSeq is publicly available as standalone and web-based versions. The results can be exported for subsequent analysis, and the databases used for the web-based version are automatically updated on a regular basis. DeconSeq categorizes possible contamination sequences, eliminates redundant hits with higher similarity to non-contaminant genomes, and provides graphical visualizations of the alignment results and classifications. Using DeconSeq, we conducted an analysis of possible human DNA contamination in 202 previously published microbial and viral metagenomes and found possible contamination in 145 (72%) metagenomes, with as much as 64% contaminating sequences. This new framework allows scientists to automatically detect and efficiently remove unwanted sequence contamination from their datasets while eliminating critical limitations of current methods. DeconSeq's web interface is simple and user-friendly. The standalone version allows offline analysis and integration into existing data processing pipelines. DeconSeq's results reveal whether the sequencing experiment has succeeded, whether the correct sample was sequenced, and whether the sample contains any sequence contamination from DNA preparation or host.
In addition, the analysis of 202 metagenomes demonstrated significant contamination of the non-human associated metagenomes, suggesting that this method is appropriate for screening all metagenomes. DeconSeq is available at http://deconseq.sourceforge.net/.
doi:10.1371/journal.pone.0017288
PMCID: PMC3052304
PMID: 21408061
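The DeconSeq abstract above describes flagging reads that match a contaminant source while discarding hits that are more similar to non-contaminant genomes. The sketch below illustrates that kind of decision rule in general terms only. It does not reproduce DeconSeq's code or interface: the `Hit` structure, the function names, the labels, and the identity and coverage thresholds are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Hit:
    """Best alignment of a read against one database (hypothetical structure)."""
    identity: float  # percent identity over the aligned region
    coverage: float  # percent of the read covered by the alignment


def passes(hit: Optional[Hit], min_identity: float, min_coverage: float) -> bool:
    """True if a hit exists and meets both thresholds."""
    return (hit is not None
            and hit.identity >= min_identity
            and hit.coverage >= min_coverage)


def classify_read(contaminant_hit: Optional[Hit],
                  reference_hit: Optional[Hit],
                  min_identity: float = 94.0,   # illustrative threshold
                  min_coverage: float = 90.0    # illustrative threshold
                  ) -> str:
    """Label a read "contamination" or "clean" from its two best hits."""
    if not passes(contaminant_hit, min_identity, min_coverage):
        return "clean"
    # Keep reads that match a non-contaminant genome more closely.
    if (passes(reference_hit, min_identity, min_coverage)
            and reference_hit.identity > contaminant_hit.identity):
        return "clean"
    return "contamination"


def percent_contaminated(labels: Iterable[str]) -> float:
    """Percentage of reads labelled as contamination."""
    labels = list(labels)
    if not labels:
        raise ValueError("no reads to summarize")
    return 100.0 * sum(label == "contamination" for label in labels) / len(labels)


if __name__ == "__main__":
    reads = [
        (Hit(99.0, 100.0), None),              # matches only the contaminant
        (Hit(95.0, 95.0), Hit(99.0, 100.0)),   # closer to a retained genome
        (Hit(80.0, 100.0), None),              # below the identity threshold
        (None, None),                          # no hits at all
    ]
    labels = [classify_read(c, r) for c, r in reads]
    print(labels, percent_contaminated(labels))
    # Share of metagenomes with possible contamination reported in the abstract.
    print(round(100 * 145 / 202))
```

Requiring both an identity and a coverage threshold keeps short, spurious local matches from marking a read as contaminated; the specific cut-offs would need tuning for a given dataset.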