Zymomonas mobilis is an ethanologenic bacterium that has been studied for use in biofuel production. Of the sequenced Zymomonas strains, ATCC 29191 has been described as the phenotypic centrotype of Zymomonas mobilis subsp. mobilis, the taxon that harbors the highest ethanol-producing Z. mobilis strains. ATCC 29191 was isolated in Kinshasa, Congo, from palm wine fermentations. This strain is reported to be a robust levan producer, while in recent years it has been employed in studies addressing Z. mobilis respiration. Here we announce the finishing and annotation of the ATCC 29191 genome, which comprises one chromosome and three plasmids.
Zymomonas mobilis subsp. mobilis is one of the most rigorous ethanol-producing organisms known to date, considered by many to be the prokaryotic alternative to yeast. The two most applied Z. mobilis subsp. mobilis strains, ZM4 and CP4, derive from Recife, Brazil, and have been isolated from sugarcane fermentations. Of these, ZM4 was the first Z. mobilis representative strain to be sequenced and analyzed. Here, we report the finishing of the genome sequence of strain CP4, which is highly similar but not identical to that of ZM4.
The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).
IMG/M (http://img.jgi.doe.gov/m) provides support for comparative analysis of microbial community aggregate genomes (metagenomes) in the context of a comprehensive set of reference genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG/M’s data content and analytical tools have expanded continuously since its first version was released in 2007. Since the last report published in the 2012 NAR Database Issue, IMG/M’s database architecture, annotation and data integration pipelines and analysis tools have been extended to cope with the rapid growth in the number and size of metagenome data sets handled by the system. IMG/M data marts provide support for the analysis of publicly available genomes, expert review of metagenome annotations (IMG/M ER: http://img.jgi.doe.gov/mer) and Human Microbiome Project (HMP)-specific metagenome samples (IMG/M HMP: http://img.jgi.doe.gov/imgm_hmp).
Leisingera methylohalidivorans Schaefer et al. 2002 emend. Vandecandelaere et al. 2008 is the type species of the genus Leisingera. The genus belongs to the Roseobacter clade (Rhodobacteraceae, Alphaproteobacteria), a widely distributed lineage in marine environments. Leisingera and particularly L. methylohalidivorans strain MB2T is of special interest due to its methylotrophy. Here we describe the complete genome sequence and annotation of this bacterium together with previously unreported aspects of its phenotype. The 4,650,996 bp long genome with its 4,515 protein-coding and 81 RNA genes consists of three replicons, a single chromosome and two extrachromosomal elements with sizes of 221 kb and 285 kb.
Methylotrophy; methyl halides; extrachromosomal elements; Alphaproteobacteria; Rhodobacteraceae; Roseobacter clade; aerobe
TF-218T is the type strain of the species Phaeobacter daeponensis Yoon et al. 2007, a facultatively anaerobic Phaeobacter species isolated from tidal flats. Here we describe the draft genome sequence and annotation of this bacterium together with previously unreported aspects of its phenotype. We analyzed the genome for genes involved in secondary metabolite production and its anaerobic lifestyle, which have also been described for its closest relative Phaeobacter caeruleus. The 4,642,596 bp long genome of strain TF-218T contains 4,310 protein-coding genes and 78 RNA genes including four rRNA operons and consists of five replicons: one chromosome and four extrachromosomal elements with sizes of 276 kb, 174 kb, 117 kb and 90 kb. Genome analysis showed that TF-218T possesses all of the genes for indigoidine biosynthesis, and on specific media the strain showed a blue pigmentation. We also found genes for dissimilatory nitrate reduction, gene-transfer agents, NRPS/ PKS genes and signaling systems homologous to the LuxR/I system.
Marine microbiology; facultative anaerobe; indigoidine; Rhodobacteraceae; Roseobacter clade
Mass spectrometry-based metabolomics has become a powerful tool for the detection of metabolites in complex biological systems and for the identification of novel metabolites. We previously identified a number of unexpected metabolites in the cyanobacterium Synechococcus sp. PCC 7002, such as histidine betaine, its derivatives and several unusual oligosaccharides. To test for the presence of these compounds and to assess the diversity of small polar metabolites in other cyanobacteria, we profiled cell extracts of nine strains representing much of the morphological and evolutionary diversification of this phylum. Spectral features in raw metabolite profiles obtained by normal phase liquid chromatography coupled to mass spectrometry (MS) were manually curated so that chemical formulae of metabolites could be assigned. For putative identification, retention times and MS/MS spectra were cross-referenced with those of standards or available spectral library records. Overall, we detected 264 distinct metabolites. These indeed included various betaines and oligosaccharides, as well as additional unidentified metabolites with chemical formulae not present in databases of metabolism. Some of these metabolites were detected only in a single strain, but some were present in more than one. Genomic interrogation of the strains revealed that, in general, the presence of a given metabolite corresponded well with the presence of its biosynthetic genes, if known. Our results show the potential of combining metabolite profiling and genomics for the identification of novel biosynthetic genes.
cyanobacteria; metabolomics; mass spectrometry; MS/MS; betaines; oligosaccharides
Litoreibacter arenae Kim et al. 2012 is a member of the genomically well-characterized Rhodobacteraceae clade within the Roseobacter clade. Representatives of this clade are known to be metabolically versatile and involved in marine carbon-producing and biogeochemical processes. They form a physiologically heterogeneous group of Alphaproteobacteria and were mostly found in coastal or polar waters, especially in symbiosis with algae, in microbial mats, in sediments or together with invertebrates and vertebrates. Here we describe the features of L. arenae DSM 19593T, including novel aspects of its phenotype, together with the draft genome sequence and annotation. The 3,690,113 bp long genome consists of 17 scaffolds with 3,601 protein-coding and 56 RNA genes. This genome was sequenced as part of the activities of the Transregional Collaborative Research Centre 51 funded by the German Research Foundation (DFG).
marine; rod-shaped; sea sand; sediment; motile; strictly aerobic; mesophile; chemoorganotrophic; halophilic; virus-like structures; carbon monoxide utilization; sulfur oxidation; Rhodobacteraceae; Alphaproteobacteria; Thalassobacter arenae
Saccharomonospora cyanea Runmao et al. 1988 is a member of the genus Saccharomonospora in the family Pseudonocardiaceae that is moderately well characterized at the genome level thus far. Members of the genus Saccharomonospora are of interest because they originate from diverse habitats, such as soil, leaf litter, manure, compost, surface of peat, moist, over-heated grain, and ocean sediment, where they probably play a role in the primary degradation of plant material by attacking hemicellulose. Species of the genus Saccharomonospora are usually Gram-positive, non-acid fast, and are classified among the actinomycetes. S. cyanea is characterized by a dark blue (= cyan blue) aerial mycelium. After S. viridis, S. azurea, and S. marina, S. cyanea is only the fourth member in the genus for which a completely sequenced (non-contiguous finished draft status) type strain genome will be published. Here we describe the features of this organism, together with the draft genome sequence, and annotation. The 5,408,301 bp long chromosome with its 5,139 protein-coding and 57 RNA genes was sequenced as part of the DOE funded Community Sequencing Program (CSP) 2010 at the Joint Genome Institute (JGI).
draft genome; aerobic; chemoheterotrophic; Gram-positive; vegetative and aerial mycelia; spore-forming; non-motile; soil bacterium; Pseudonocardiaceae; CSP 2010
Phaeobacter arcticus Zhang et al. 2008 belongs to the marine Roseobacter clade whose members are phylogenetically and physiologically diverse. In contrast to the type species of this genus, Phaeobacter gallaeciensis, which is well characterized, relatively little is known about the characteristics of P. arcticus. Here, we describe the features of this organism including the annotated high-quality draft genome sequence and highlight some particular traits. The 5,049,232 bp long genome with its 4,828 protein-coding and 81 RNA genes consists of one chromosome and five extrachromosomal elements. Prophage sequences identified via PHAST constitute nearly 5% of the bacterial chromosome and included a potential Mu-like phage as well as a gene-transfer agent (GTA). In addition, the genome of strain DSM 23566T encodes all of the genes necessary for assimilatory nitrate reduction. Phylogenetic analysis and intergenomic distances indicate that the classification of the species might need to be reconsidered.
aerobic; psychrophilic; motile; high-quality draft; prophage-like structures; extrachromosomal elements; assimilatory nitrate reduction; Alphaproteobacteria; Roseobacter clade
Serratia proteamaculans S4 (previously Serratia sp. S4), isolated from the rhizosphere of wild Equisetum sp., has the ability to stimulate plant growth and to suppress the growth of several soil-borne fungal pathogens of economically important crops. Here we present the non-contiguous, finished genome sequence of S. proteamaculans S4, which consists of a 5,324,944 bp circular chromosome and a 129,797 bp circular plasmid. The chromosome contains 5,008 predicted genes while the plasmid comprises 134 predicted genes. In total, 4,993 genes are assigned as protein-coding genes. The genome consists of 22 rRNA genes, 82 tRNA genes and 58 pseudogenes. This genome is a part of the project “Genomics of four rapeseed plant growth-promoting bacteria with antagonistic effect on plant pathogens” awarded through the 2010 DOE-JGI’s Community Sequencing Program.
Facultative aerobe; gram-negative; motile; non-sporulating; mesophilic; chemoorganotrophic; agriculture
In 2009 Phaeobacter caeruleus was described as a novel species affiliated with the marine Roseobacter clade, which, in turn, belongs to the class Alphaproteobacteria. The genus Phaeobacter is well known for members that produce various secondary metabolites. Here we report on putative quorum sensing systems, based on the finding of six N-acyl-homoserine lactone synthetases, and show that the blue color of P. caeruleus is probably due to the production of the secondary metabolite indigoidine. Therefore, P. caeruleus might have inhibitory effects on other bacteria. In this study the genome of the type strain DSM 24564T was sequenced, annotated and characterized. The 5,344,419 bp long genome with its seven plasmids contains 5,227 protein-coding genes (3,904 with a predicted function) and 108 RNA genes.
biofilm; motile; indigoidine; quorum sensing; siderophores; Rhodobacteraceae; Alphaproteobacteria
Marinitoga piezophila KA3 is a thermophilic, anaerobic, chemoorganotrophic, sulfur-reducing bacterium isolated from the Grandbonum deep-sea hydrothermal vent site at the East Pacific Rise (13°N, 2,630-m depth). The genome of M. piezophila KA3 comprises a 2,231,407-bp circular chromosome and a 13,386-bp circular plasmid. This genome was sequenced within the Department of Energy Joint Genome Institute CSP 2010.
Desulfotomaculum kuznetsovii is a moderately thermophilic member of the polyphyletic spore-forming genus Desulfotomaculum in the family Peptococcaceae. This species is of interest because it originates from deep subsurface thermal mineral water at a depth of about 3,000 m. D. kuznetsovii is a rather versatile bacterium as it can grow with a large variety of organic substrates, including short-chain and long-chain fatty acids, which are degraded completely to carbon dioxide coupled to the reduction of sulfate. It can grow methylotrophically with methanol and sulfate and autotrophically with H2 + CO2 and sulfate. For growth it does not require any vitamins. Here, we describe the features of D. kuznetsovii together with the genome sequence and annotation. The chromosome has 3,601,386 bp organized in one contig. A total of 3,567 candidate protein-encoding genes and 58 RNA genes were identified. Genes of the acetyl-CoA pathway, possibly involved in heterotrophic growth with acetate and methanol, and in CO2 fixation during autotrophic growth are present. Genomic comparison revealed that D. kuznetsovii shows a high similarity with Pelotomaculum thermopropionicum. Genes involved in propionate metabolism of these two strains show a strong similarity. However, main differences are found in genes involved in the electron acceptor metabolism.
Thermophilic spore-forming anaerobes; sulfate reduction; autotrophic; methylotrophic; Peptococcaceae; Clostridiales
Termites effectively feed on many types of lignocellulose assisted by their gut microbial symbionts. To better understand the microbial decomposition of biomass with varied chemical profiles, it is important to determine whether termites harbor different microbial symbionts with specialized functionalities geared toward different feeding regimens. In this study, we compared the microbiota in the hindgut paunch of Amitermes wheeleri collected from cow dung and Nasutitermes corniger feeding on sound wood by 16S rRNA pyrotag, comparative metagenomic and metatranscriptomic analyses. We found that Firmicutes and Spirochaetes were the most abundant phyla in A. wheeleri, in contrast to N. corniger where Spirochaetes and Fibrobacteres dominated. Despite this community divergence, a convergence was observed for functions essential to termite biology including hydrolytic enzymes, homoacetogenesis and cell motility and chemotaxis. Overrepresented functions in A. wheeleri relative to N. corniger microbiota included hemicellulose breakdown and fixed-nitrogen utilization. By contrast, glycoside hydrolases attacking celluloses and nitrogen fixation genes were overrepresented in N. corniger microbiota. These observations are consistent with dietary differences in carbohydrate composition and nutrient contents, but may also reflect the phylogenetic difference between the hosts.
Assembling individual genomes from complex community metagenomic data remains a challenging issue for environmental studies. We evaluated the quality of genome assemblies from community short read data (Illumina 100 bp pair-ended sequences) using datasets recovered from freshwater and soil microbial communities as well as in silico simulations. Our analyses revealed that the genome of a single genotype (or species) can be accurately assembled from a complex metagenome when it shows at least about 20 × coverage. At lower coverage, however, the derived assemblies contained a substantial fraction of non-target sequences (chimeras), which explains, at least in part, the higher number of hypothetical genes recovered in metagenomic relative to genomic projects. We also provide examples of how to detect intrapopulation structure in metagenomic datasets and estimate the type and frequency of errors in assembled genes and contigs from datasets of varied species complexity.
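The coverage threshold mentioned above can be related to sequencing effort with a short calculation. The following is a minimal sketch, not the authors' method: it assumes the standard mean-coverage estimate (sequenced bases divided by genome length), and the genome size and read fraction in the example are hypothetical values chosen for illustration. Reads are counted individually, so a read pair counts as two reads.

```python
import math


def mean_coverage(n_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Expected mean depth of coverage: total sequenced bases / genome length."""
    return n_reads * read_length_bp / genome_size_bp


def reads_for_target(coverage: float, read_length_bp: int, genome_size_bp: int) -> int:
    """Number of reads that must originate from the target genome to reach `coverage`."""
    return math.ceil(coverage * genome_size_bp / read_length_bp)


def total_reads_for_target(coverage: float, read_length_bp: int,
                           genome_size_bp: int, read_fraction: float) -> float:
    """Total metagenome reads needed when only `read_fraction` of all reads
    come from the target genome (an assumed, known fraction)."""
    if not 0 < read_fraction <= 1:
        raise ValueError("read_fraction must be in (0, 1]")
    return reads_for_target(coverage, read_length_bp, genome_size_bp) / read_fraction


if __name__ == "__main__":
    GENOME_BP = 5_000_000   # hypothetical target genome size
    READ_BP = 100           # read length used in the study (Illumina 100 bp)
    target_reads = reads_for_target(20, READ_BP, GENOME_BP)
    print("reads from target genome for 20x:", target_reads)
    print("coverage check:", mean_coverage(target_reads, READ_BP, GENOME_BP))
    # Hypothetical case: the target contributes 1% of all reads.
    print("total reads needed:", total_reads_for_target(20, READ_BP, GENOME_BP, 0.01))
```

The calculation shows why low-abundance populations are hard to assemble: the required total sequencing effort scales inversely with the fraction of reads that the target genome contributes.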
metagenome; assembly; Illumina
Serratia sp. strain FGI 94 was isolated from a fungus garden of the leaf-cutter ant Atta colombica. Analysis of its 4.86-Mbp chromosome will help advance our knowledge of symbiotic interactions and plant biomass degradation in this ancient ant-fungus mutualism.
The Enterobacteriaceae bacterium strain FGI 57 was isolated from a fungus garden of the leaf-cutter ant Atta colombica. Analysis of its single 4.76-Mbp chromosome will shed light on community dynamics and plant biomass degradation in ant fungus gardens.
Effective comparative analysis of microbial genomes requires a consistent and complete view of biological data. Consistency regards the biological coherence of annotations, while completeness regards the extent and coverage of functional characterization for genomes. We have developed tools that allow scientists to assess and improve the consistency and completeness of microbial genome annotations in the context of the Integrated Microbial Genomes (IMG) family of systems. All publicly available microbial genomes are characterized in IMG using different functional annotation and pathway resources, thus providing a comprehensive framework for identifying and resolving annotation discrepancies. A rule based system for predicting phenotypes in IMG provides a powerful mechanism for validating functional annotations, whereby the phenotypic traits of an organism are inferred based on the presence of certain metabolic reactions and pathways and compared to experimentally observed phenotypes. The IMG family of systems are available at http://img.jgi.doe.gov/.
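The rule-based validation idea described above can be illustrated with a small sketch. It is not IMG's implementation: the rule table, pathway names and phenotype names below are hypothetical, and each rule is simplified to "all listed pathways must be present".

```python
from typing import Dict, FrozenSet, Set

# Hypothetical rules: phenotype -> pathways that must ALL be annotated.
RULES: Dict[str, FrozenSet[str]] = {
    "nitrogen fixation": frozenset({"nitrogenase complex"}),
    "dissimilatory nitrate reduction": frozenset(
        {"nitrate reductase", "nitrite reductase"}
    ),
}


def predict_phenotypes(pathways: Set[str],
                       rules: Dict[str, FrozenSet[str]] = RULES) -> Set[str]:
    """Infer phenotypes whose required pathways are all present in the annotation."""
    return {phenotype for phenotype, required in rules.items() if required <= pathways}


def compare(predicted: Set[str], observed: Set[str]) -> Dict[str, Set[str]]:
    """Split phenotypes into agreements and the two kinds of discrepancy."""
    return {
        "supported": predicted & observed,       # annotation agrees with experiment
        "predicted_only": predicted - observed,  # possible over-annotation or untested trait
        "observed_only": observed - predicted,   # possible missing annotation or missing rule
    }


if __name__ == "__main__":
    annotated = {"nitrate reductase", "nitrite reductase"}
    observed = {"dissimilatory nitrate reduction", "nitrogen fixation"}
    print(compare(predict_phenotypes(annotated), observed))
```

In this sketch, an "observed_only" phenotype flags a genome whose annotation may be incomplete, while a "predicted_only" phenotype flags an annotation that may need review.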
Metagenome analysis of the gut symbionts of three different insects was conducted as a means of comparing taxonomic and metabolic diversity of gut microbiomes to diet and life history of the insect hosts. A second goal was the discovery of novel biocatalysts for biorefinery applications. Grasshopper and cutworm gut symbionts were sequenced and compared with the previously identified metagenome of termite gut microbiota. These insect hosts represent three different insect orders and specialize on different food types. The comparative analysis revealed dramatic differences among the three insect species in the abundance and taxonomic composition of the symbiont populations present in the gut. The composition and abundance of symbionts was correlated with their previously identified capacity to degrade and utilize the different types of food consumed by their hosts. The metabolic reconstruction revealed that the gut metabolome of cutworms and grasshoppers was more enriched for genes involved in carbohydrate metabolism and transport than wood-feeding termite, whereas the termite gut metabolome was enriched for glycosyl hydrolase (GH) enzymes relevant to lignocellulosic biomass degradation. Moreover, termite gut metabolome was more enriched with nitrogen fixation genes than those of grasshopper and cutworm gut, presumably due to the termite's adaptation to the high fiber and less nutritious food types. In order to evaluate and exploit the insect symbionts for biotechnology applications, we cloned and further characterized four biomass-degrading enzymes including one endoglucanase and one xylanase from both the grasshopper and cutworm gut symbionts. The results indicated that the grasshopper symbiont enzymes were generally more efficient in biomass degradation than the homologous enzymes from cutworm symbionts. 
Together, these results demonstrated a correlation between the composition and putative metabolic functionality of the gut microbiome and host diet, and suggested that this relationship could be exploited for the discovery of symbionts and biocatalysts useful for biorefinery applications.
The symbiotic gut microbiome of herbivorous insects is vital for their ability to utilize and specialize on plants with very different nutrient qualities. Moreover, the gut microbiome is a significant resource for the discovery of biocatalysts and microbes with applications to various biotechnologies. We compared the gut symbionts from three different insect species to examine whether there was a relationship between the diversity and metabolic capability of the symbionts and the diet of their hosts, with the goal of using such a relationship for the discovery of biocatalysts for biofuel applications. The study revealed that the metabolic capabilities of the insect gut symbionts correlated with insect adaptation to different food types and life histories at the levels of species, metabolic pathway, and individual gene. Moreover, we showed that the grasshopper cellulase and xylanase enzymes generally exhibited higher activities than those of cutworm, demonstrating differences in capabilities even at the protein level. Together, our findings confirmed our previous research and suggested that the grasshopper might be a good target for biocatalyst discovery due to its high gut cellulolytic enzyme activities.
Marinomonas posidonica IVIA-Po-181T Lucas-Elío et al. 2011 belongs to the family Oceanospirillaceae within the phylum Proteobacteria. Different species of the genus Marinomonas can be readily isolated from the seagrass Posidonia oceanica. M. posidonica is among the most abundant species of the genus detected in the cultured microbiota of P. oceanica, suggesting a close relationship with this plant, which has a great ecological value in the Mediterranean Sea, covering an estimated surface of 38,000 km². Here we describe the genomic features of M. posidonica. The 3,899,940 bp long genome harbors 3,544 protein-coding genes and 107 RNA genes and is a part of the Genomic
Aerobic; Gram-negative; marine; plant-associated
Serratia plymuthica AS13 is a plant-associated gammaproteobacterium isolated from rapeseed roots. It is of special interest because of its ability to inhibit fungal pathogens of rapeseed and to promote plant growth. The complete genome of S. plymuthica AS13 consists of a 5,442,549 bp circular chromosome. The chromosome contains 4,951 protein-coding genes, 87 tRNA genes and 7 rRNA operons. This genome was sequenced as part of the project entitled “Genomics of four rapeseed plant growth promoting bacteria with antagonistic effect on plant pathogens” within the 2010 DOE-JGI Community Sequencing Program (CSP2010).
Gram-negative; non-sporulating; motile; plant-associated; chemoorganotrophic; Enterobacteriaceae
A protein fraction exhibiting 1-hydroxy-2-naphthoic acid (1-H2NA) dioxygenase activity was purified via ion exchange, hydrophobic interactions, and gel filtration chromatography from Arthrobacter phenanthrenivorans sp. nov. strain Sphe3 isolated from a Greek creosote-oil-polluted site. Matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS) and tandem MS (MS-MS) analysis revealed that the amino acid sequences of oligopeptides of the major 45-kDa protein species, as analyzed by SDS-PAGE and silver staining, comprising 29% of the whole sequence, exhibited strong homology with 1-H2NA dioxygenase of Nocardioides sp. strain KP7. A BLAST search of the recently sequenced Sphe3 genome revealed two putative open reading frames, named diox1 and diox2, showing 90% nucleotide identity to each other and 85% identity at the amino acid level with the Nocardioides sp. homologue. diox1 was found on an indigenous Sphe3 plasmid, whereas diox2 was located on the chromosome. Both genes were induced by the presence of phenanthrene used as a sole carbon and energy source, and as expected, both were subject to carbon catabolite repression. The relative RNA transcription level of the chromosomal (diox2) gene was significantly higher than that of its plasmid (diox1) homologue. Both diox1 and diox2 putative genes were PCR amplified, cloned, and overexpressed in Escherichia coli. Recombinant E. coli cells expressed 1-H2NA dioxygenase activity. Recombinant enzymes exhibited Michaelis-Menten kinetics with an apparent Km of 35 μM for Diox1 and 29 μM for Diox2, whereas they showed similar kinetic turnover characteristics with kcat/Km values of 11 × 10⁶ M⁻¹ s⁻¹ and 12 × 10⁶ M⁻¹ s⁻¹, respectively. The occurrence of the two homologues, diox1 and diox2, in the Sphe3 genome implies that a replicative transposition event has contributed to the evolution of 1-H2NA dioxygenase in A. phenanthrenivorans.
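The reported kinetic constants can be related to each other with a short worked calculation. This is a sketch under stated assumptions: the abstract does not report kcat itself, so the turnover numbers below are back-calculated from the reported kcat/Km (11 × 10^6 and 12 × 10^6 M^-1 s^-1) and apparent Km (35 and 29 μM) values, and the function names are illustrative only.

```python
def kcat_from_efficiency(efficiency_per_M_per_s: float, km_molar: float) -> float:
    """Turnover number implied by catalytic efficiency: kcat = (kcat/Km) * Km."""
    return efficiency_per_M_per_s * km_molar


def michaelis_menten_rate(substrate_molar: float, km_molar: float, vmax: float) -> float:
    """Michaelis-Menten rate law: v = Vmax * [S] / (Km + [S])."""
    return vmax * substrate_molar / (km_molar + substrate_molar)


if __name__ == "__main__":
    # Values taken from the abstract; kcat is derived, not reported there.
    enzymes = {
        "Diox1": {"km": 35e-6, "efficiency": 11e6},
        "Diox2": {"km": 29e-6, "efficiency": 12e6},
    }
    for name, p in enzymes.items():
        kcat = kcat_from_efficiency(p["efficiency"], p["km"])
        # At [S] = Km the rate is half of Vmax by definition.
        half = michaelis_menten_rate(p["km"], p["km"], vmax=1.0)
        print(f"{name}: implied kcat = {kcat:.0f} s^-1, v/Vmax at [S]=Km = {half:.2f}")
```

Under these assumptions the implied turnover numbers are roughly 385 s^-1 for Diox1 and 348 s^-1 for Diox2, consistent with the statement that the two enzymes have similar kinetic characteristics.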
Variability in the extent of the descriptions of data (‘metadata’) held in public repositories forces users to assess the quality of records individually, which rapidly becomes impractical. The scoring of records on the richness of their description provides a simple, objective proxy measure for quality that enables filtering that supports downstream analysis. Pivotally, such descriptions should spur on improvements. Here, we introduce such a measure - the ‘Metadata Coverage Index’ (MCI): the percentage of available fields actually filled in a record or description. MCI scores can be calculated across a database, for individual records or for their component parts (e.g., fields of interest). There are many potential uses for this simple metric, for example: to filter, rank or search for records; to assess the metadata availability of an ad hoc collection; to determine the frequency with which fields in a particular record type are filled, especially with respect to standards compliance; to assess the utility of specific tools and resources, and of data capture practice more generally; to prioritize records for further curation; to serve as performance metrics of funded projects; or to quantify the value added by curation. Here we demonstrate the utility of MCI scores using metadata from the Genomes Online Database (GOLD), including records compliant with the ‘Minimum Information about a Genome Sequence’ (MIGS) standard developed by the Genomic Standards Consortium. We discuss challenges and address the further application of MCI scores: to show improvements in annotation quality over time, to inform the work of standards bodies and repository providers on the usability and popularity of their products, and to assess and credit the work of curators. Such an index provides a step towards putting metadata capture practices and, in the future, standards compliance into a quantitative and objective framework.
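The MCI definition given above (percentage of available fields actually filled) can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: the field names in the example are hypothetical, and treating `None` and empty or whitespace-only strings as "unfilled" is an assumption made here.

```python
from typing import Any, Dict, Iterable, List, Mapping, Optional


def is_filled(value: Any) -> bool:
    """Assumed rule: None and blank strings count as unfilled."""
    if value is None:
        return False
    if isinstance(value, str):
        return value.strip() != ""
    return True


def mci(record: Mapping[str, Any], fields: Optional[Iterable[str]] = None) -> float:
    """Metadata Coverage Index of one record: percent of available fields filled.

    `fields` lists the available fields (e.g. a checklist of interest);
    by default the record's own keys are used.
    """
    available = list(record) if fields is None else list(fields)
    if not available:
        raise ValueError("at least one available field is required")
    filled = sum(is_filled(record.get(f)) for f in available)
    return 100.0 * filled / len(available)


def field_fill_rates(records: List[Mapping[str, Any]],
                     fields: Iterable[str]) -> Dict[str, float]:
    """Percent of records in which each field is filled."""
    if not records:
        raise ValueError("at least one record is required")
    return {
        f: 100.0 * sum(is_filled(r.get(f)) for r in records) / len(records)
        for f in fields
    }


if __name__ == "__main__":
    # Hypothetical field names, for illustration only.
    checklist = ["organism", "isolation_site", "latitude", "longitude"]
    records = [
        {"organism": "Example sp. 1", "isolation_site": "soil", "latitude": None, "longitude": ""},
        {"organism": "Example sp. 2", "isolation_site": "", "latitude": 10.5, "longitude": 20.1},
    ]
    for r in records:
        print(r["organism"], mci(r, checklist))
    print(field_fill_rates(records, checklist))
```

Passing a fixed checklist as `fields` scores every record against the same set of available fields, which keeps scores comparable across records and lets the same functions cover the per-record and per-field uses listed above.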