Comparison of genome-wide, high-resolution restriction maps of Klebsiella pneumoniae clinical isolates, including an NDM-1 producer, and in silico-generated restriction maps of sequenced genomes revealed a highly heterogeneous region we designated the “high heterogeneity zone” (HHZ). The HHZ consists of several regions including a “hot spot” prone to insertions and other rearrangements. The HHZ is a characteristic genomic area that can be used in the identification and tracking of outbreak-causing strains.
Klebsiella pneumoniae; Genomic analysis; optical map; NDM-1; ICE
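The in silico restriction maps mentioned above can be illustrated with a minimal sketch. It assumes a single palindromic recognition site (so only one strand needs to be searched) and uses BamHI (G^GATCC) purely as a hypothetical example; the abstract does not state which enzyme or software was used, and the function name and parameters are illustrative only.

```python
def in_silico_restriction_map(sequence, site, cut_offset=0, circular=False):
    """Return ordered fragment lengths from digesting `sequence` at every `site`.

    cut_offset is the cut position within the recognition site
    (e.g. 1 for G^GATCC). Assumes a palindromic site; a non-palindromic
    site would also need a search on the reverse complement.
    """
    seq = sequence.upper()
    site = site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    if not cuts:
        return [len(seq)]  # uncut molecule
    if circular:
        # Fragments between consecutive cuts, plus the one spanning the origin.
        frags = [b - a for a, b in zip(cuts, cuts[1:])]
        frags.append(len(seq) - cuts[-1] + cuts[0])
        return frags
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b - a > 0]


toy = "AAGGATCCTTTTGGATCCAA"  # toy sequence, not real genome data
linear_map = in_silico_restriction_map(toy, "GGATCC", cut_offset=1)
circular_map = in_silico_restriction_map(toy, "GGATCC", cut_offset=1, circular=True)
```

Ordered fragment-length lists of this kind are what would be aligned against an experimentally derived map; the alignment step itself is not sketched here.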
Summary: One form of immune evasion is a developmental state called “persistence” whereby chlamydial pathogens respond to the host-mediated withdrawal of l-tryptophan (Trp). A sophisticated survival mode of reversible quiescence is implemented. A mechanism has evolved which suppresses gene products necessary for rapid pathogen proliferation but allows expression of gene products that underlie the morphological and developmental characteristics of persistence. This switch from one translational profile to an alternative translational profile of newly synthesized proteins is proposed to be accomplished by maximizing the Trp content of some proteins needed for rapid proliferation (e.g., ADP/ATP translocase, hexose-phosphate transporter, phosphoenolpyruvate [PEP] carboxykinase, the Trp transporter, the Pmp protein superfamily for cell adhesion and antigenic variation, and components of the cell division pathway) while minimizing the Trp content of other proteins supporting the state of persistence. The Trp starvation mechanism is best understood in the human-Chlamydia trachomatis relationship, but the similarity of up-Trp and down-Trp proteomic profiles in all of the pathogenic Chlamydiaceae suggests that Trp availability is an underlying cue relied upon by this family of pathogens to trigger developmental transitions. The biochemically expensive pathogen strategy of selectively increased Trp usage to guide the translational profile can be leveraged significantly with minimal overall Trp usage by (i) regional concentration of Trp residue placements, (ii) amplified Trp content of a single protein that is required for expression or maturation of multiple proteins with low Trp content, and (iii) Achilles'-heel vulnerabilities of complex pathways to high Trp content of one or a few enzymes.
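The notion of up-Trp and down-Trp proteins, and of regional concentration of Trp residues, can be made concrete with a small sketch. The function names, the window size, and the toy sequences below are all assumptions for illustration; the summary does not specify how Trp content was scored.

```python
def trp_fraction(protein):
    """Fraction of residues that are tryptophan (one-letter code W)."""
    seq = protein.upper().rstrip("*")  # drop a trailing stop symbol if present
    return seq.count("W") / len(seq) if seq else 0.0


def max_trp_in_window(protein, window):
    """Largest number of Trp residues in any stretch of `window` residues."""
    seq = protein.upper().rstrip("*")
    if len(seq) <= window:
        return seq.count("W")
    return max(seq[i:i + window].count("W") for i in range(len(seq) - window + 1))


def rank_by_trp(proteins):
    """Protein names sorted from highest to lowest Trp fraction."""
    return sorted(proteins, key=lambda name: trp_fraction(proteins[name]), reverse=True)


# Toy sequences, not real chlamydial proteins.
toy_proteome = {"low_trp": "MKAAAL", "high_trp": "MWWKWA"}
ranking = rank_by_trp(toy_proteome)
```

A windowed count is one simple way to flag a protein whose overall Trp content is modest but whose Trp residues are clustered in one region.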
The Aquificales are thermophilic microorganisms that inhabit hydrothermal systems worldwide and are considered one of the earliest lineages of the domain Bacteria. We analyzed metagenome sequence obtained from six thermal “filamentous streamer” communities (∼40 Mbp per site), which targeted three different groups of Aquificales found in Yellowstone National Park (YNP). Unassembled metagenome sequence and PCR-amplified 16S rRNA gene libraries revealed that acidic, sulfidic sites were dominated by Hydrogenobaculum (Aquificaceae) populations, whereas the circum-neutral pH (6.5–7.8) sites containing dissolved sulfide were dominated by Sulfurihydrogenibium spp. (Hydrogenothermaceae). Thermocrinis (Aquificaceae) populations were found primarily in the circum-neutral sites with undetectable sulfide, and to a lesser extent in one sulfidic system at pH 8. Phylogenetic analysis of assembled sequence containing 16S rRNA genes as well as conserved protein-encoding genes revealed that the composition and function of these communities varied across geochemical conditions. Each Aquificales lineage contained genes for CO2 fixation by the reverse-TCA cycle, but only the Sulfurihydrogenibium populations perform citrate cleavage using ATP citrate lyase (Acl). The Aquificaceae populations use an alternative pathway catalyzed by two separate enzymes, citryl-CoA synthetase (Ccs) and citryl-CoA lyase (Ccl). All three Aquificales lineages contained evidence of aerobic respiration, albeit due to completely different types of heme Cu oxidases (subunit I) involved in oxygen reduction. The distribution of Aquificales populations and the differences among functional genes involved in energy generation and electron transport are consistent with the hypothesis that geochemical parameters (e.g., pH, sulfide, H2, O2) have resulted in niche specialization among members of the Aquificales.
thermophiles; functional genomics; phylogeny; autotrophic processes; sulfide oxidation
Sanger and shotgun sequencing of Clostridium botulinum strain Af84 type Af and its botulinum neurotoxin gene (bont) clusters identified the presence of three bont gene clusters rather than the expected two. The three toxin gene clusters consisted of bont subtypes A2, F4 and F5. The bont/A2 and bont/F4 gene clusters were located within the chromosome (the latter in a novel location), while the bont/F5 toxin gene cluster was located within a large 246 kb plasmid. These findings are the first identification of a C. botulinum strain that contains three botulinum neurotoxin gene clusters.
The etiology of dental caries remains elusive because of our limited understanding of the complex oral microbiomes. The current methodologies have been limited by insufficient depth and breadth of microbial sampling, paucity of data for diseased hosts particularly at the population level, inconsistency of sampled sites and the inability to distinguish the underlying microbial factors. By cross-validating 16S rRNA gene amplicon-based and whole-genome-based deep-sequencing technologies, we report the most in-depth, comprehensive and collaborated view to date of the adult saliva microbiomes in pilot populations of 19 caries-active and 26 healthy human hosts. We found that: first, saliva microbiomes in the human population were characterized by a vast phylogenetic diversity yet a minimal organismal core; second, caries microbiomes were significantly more variable in community structure whereas the healthy ones were relatively conserved; third, abundance changes of certain taxa, such as overabundance of the genus Prevotella, distinguished caries microbiota from healthy ones, and furthermore, caries-active and normal individuals carried different arrays of Prevotella species; and finally, no ‘caries-specific’ operational taxonomic units (OTUs) were detected, yet 147 OTUs were ‘caries associated’, that is, differentially distributed yet present in both healthy and caries-active populations. These findings underscore the necessity of species- and strain-level resolution for caries prognosis, and are consistent with the ecological hypothesis in which shifts in community structure, rather than the presence or absence of particular groups of microbes, underlie cariogenesis.
caries; metagenomics; oral-microbiome; Prevotella; saliva
Kingella kingae is a human oral bacterium that can cause infections of the skeletal system in children. The bacterium is also a cardiovascular pathogen causing infective endocarditis in children and adults. We report herein the draft genome sequence of septic arthritis K. kingae strain PYKK081.
In May of 2011, an enteroaggregative Escherichia coli O104:H4 strain that had acquired a Shiga toxin 2-converting phage caused a large outbreak of bloody diarrhea in Europe which was notable for its high prevalence of hemolytic uremic syndrome cases. Several studies have described the genomic inventory and phylogenies of strains associated with the outbreak and a collection of historical E. coli O104:H4 isolates using draft genome assemblies. We present the complete, closed genome sequences of an isolate from the 2011 outbreak (2011C–3493) and two isolates from cases of bloody diarrhea that occurred in the Republic of Georgia in 2009 (2009EL–2050 and 2009EL–2071). Comparative genome analysis indicates that, while the Georgian strains are the nearest neighbors to the 2011 outbreak isolates sequenced to date, structural and nucleotide-level differences are evident in the Stx2 phage genomes, the mer/tet antibiotic resistance island, and in the prophage and plasmid profiles of the strains, including a previously undescribed plasmid with homology to the pMT virulence plasmid of Yersinia pestis. In addition, multiphenotype analysis showed that 2009EL–2071 possessed higher resistance to polymyxin and membrane-disrupting agents. Finally, we show evidence by electron microscopy of the presence of a common phage morphotype among the European and Georgian strains and a second phage morphotype among the Georgian strains. The presence of at least two stx2 phage genotypes in host genetic backgrounds that may derive from a recent common ancestor of the 2011 outbreak isolates indicates that the emergence of stx2 phage-containing E. coli O104:H4 strains probably occurred more than once, or that the current outbreak isolates may be the result of a recent transfer of a new stx2 phage element into a pre-existing stx2-positive genetic background.
The Rex repressor has been implicated in regulation of central carbon and energy metabolism in Gram-positive bacteria. We have previously shown that Rex deficiency in Streptococcus mutans, the primary causative agent of dental caries, alters the transcriptome and results in increased susceptibility to oxidative stress, aberrations in glucan production, and poor biofilm formation. In this study, we showed that rex in S. mutans is co-transcribed as an operon with downstream guaA, encoding a putative glutamine amidotransferase. Electrophoretic mobility shift assays showed that recombinant Rex bound promoters of target genes avidly and specifically, including those down-regulated in response to Rex deficiency, and that the ability of recombinant Rex to bind to selected promoters was modulated by NADH and NAD+. These results suggest that Rex in S. mutans can function as an activator in response to the intracellular NADH/NAD+ level, although the exact binding site for activator Rex remains unclear. Consistent with a role in oxidative stress tolerance, hydrogen peroxide challenge assays showed that the Rex-deficient mutant, TW239, and the Rex/GuaA double mutant, JB314, were more susceptible to hydrogen peroxide killing than the wild type, UA159. Relative to UA159, JB314 displayed major defects in biofilm formation, with a decrease of more than 50-fold in biomass after 48 hours. Collectively, these results further suggest that Rex in S. mutans regulates fermentation pathways, oxidative stress tolerance, and biofilm formation in response to the intracellular NADH/NAD+ level. Current efforts are directed toward further investigation of the role of GuaA in S. mutans cellular physiology.
Taxonomic and phylogenetic fingerprinting based on sequence analysis of gene fragments from the large-subunit rRNA (LSU) gene or the internal transcribed spacer (ITS) region is becoming an integral part of fungal classification. The lack of an accurate and robust classification tool trained by a validated sequence database for taxonomic placement of fungal LSU genes is a severe limitation in taxonomic analysis of fungal isolates or large data sets obtained from environmental surveys. Using a hand-curated set of 8,506 fungal LSU gene fragments, we determined the performance characteristics of a naïve Bayesian classifier across multiple taxonomic levels and compared the classifier performance to that of a sequence similarity-based (BLASTN) approach. The naïve Bayesian classifier was computationally more rapid (>460-fold with our system) than the BLASTN approach, and it provided equal or superior classification accuracy. Classifier accuracies were compared using sequence fragments of 100 bp and 400 bp and two different PCR primer anchor points to mimic sequence read lengths commonly obtained using current high-throughput sequencing technologies. Accuracy was higher with 400-bp sequence reads than with 100-bp reads. It was also significantly affected by sequence location across the 1,400-bp test region. The highest accuracy was obtained across either the D1 or D2 variable region. The naïve Bayesian classifier provides an effective and rapid means to classify fungal LSU sequences from large environmental surveys. The training set and tool are publicly available through the Ribosomal Database Project (http://rdp.cme.msu.edu/classifier/classifier.jsp).
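To illustrate the kind of classifier described above, here is a minimal sketch of a word-based naïve Bayesian sequence classifier. The word size, the smoothing scheme, and all function names are assumptions chosen for illustration; the sketch omits confidence estimation and is not the Ribosomal Database Project implementation.

```python
import math


def words(seq, k):
    """Set of distinct overlapping k-letter words in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def train(labeled_seqs, k=8):
    """Count, per taxon, how many training sequences contain each word."""
    model = {"k": k, "n_total": 0, "word_total": {}, "taxa": {}}
    for taxon, seq in labeled_seqs:
        model["n_total"] += 1
        entry = model["taxa"].setdefault(taxon, {"n": 0, "words": {}})
        entry["n"] += 1
        for w in words(seq, k):
            entry["words"][w] = entry["words"].get(w, 0) + 1
            model["word_total"][w] = model["word_total"].get(w, 0) + 1
    return model


def classify(model, query):
    """Return the taxon with the highest summed log word probability."""
    n_total = model["n_total"]
    best_taxon, best_score = None, -math.inf
    for taxon, entry in model["taxa"].items():
        score = 0.0
        for w in words(query, model["k"]):
            # Word prior over the whole training set (assumed smoothing).
            prior = (model["word_total"].get(w, 0) + 0.5) / (n_total + 1)
            # Smoothed probability of seeing the word in this taxon.
            cond = (entry["words"].get(w, 0) + prior) / (entry["n"] + 1)
            score += math.log(cond)
        if score > best_score:
            best_taxon, best_score = taxon, score
    return best_taxon


# Toy training data, not real LSU sequences.
toy_model = train([("taxon_A", "ACGTACGTACGT"), ("taxon_B", "TTGGCCAATTGG")], k=4)
```

Because classification only requires word lookups and additions, this style of classifier avoids the pairwise alignments that a BLASTN-based approach performs, which is consistent with the speed difference reported above.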
Classification is difficult for shotgun metagenomics data from environments such as soils, where the diversity of sequences is high and where reference sequences from close relatives may not exist. Approaches based on sequence-similarity scores must deal with the confounding effects that inheritance and functional pressures exert on the relation between scores and phylogenetic distance, while approaches based on sequence alignment and tree-building are typically limited to a small fraction of gene families. We describe an approach based on finding one or more exact matches between a read and a precomputed set of peptide 10-mers.
At even the largest phylogenetic distances, thousands of 10-mer peptide exact matches can be found between pairs of bacterial genomes. Genes that share one or more peptide 10-mers typically have high reciprocal BLAST scores. Among a set of 403 representative bacterial genomes, some 20 million 10-mer peptides were found to be shared. We assign each of these peptides as a signature of a particular node in a phylogenetic reference tree based on the RNA polymerase genes. We classify the phylogeny of a genomic fragment (e.g., read) at the most specific node on the reference tree that is consistent with the phylogeny of observed signature peptides it contains. Using both synthetic data from four newly sequenced soil-bacterium genomes and ten real soil metagenomics data sets, we demonstrate a sensitivity and specificity comparable to that of the MEGAN metagenomics analysis package using BLASTX against the NR database. Phylogenetic and functional similarity metrics applied to real metagenomics data indicate a signal-to-noise ratio of approximately 400 for distinguishing among environments. Our method assigns ~6.6 Gbp/hr on a single CPU, compared with 25 kbp/hr for methods based on BLASTX against the NR database.
Classification by exact matching against a precomputed list of signature peptides provides comparable results to existing techniques for reads longer than about 300 bp and does not degrade severely with shorter reads. Orders of magnitude faster than existing methods, the approach is suitable now for inclusion in analysis pipelines and appears to be extensible in several different directions.
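The exact-match classification described above can be sketched as a dictionary lookup. In this sketch each tree node is represented by a tuple giving its path from the root, the input is an already translated peptide, and conflicting hits fall back to their lowest common ancestor; these representation choices, the conflict rule, and the function name are assumptions, not details taken from the abstract.

```python
def classify_peptide(peptide, signatures, k=10):
    """Assign a translated read to a reference-tree node via exact k-mer hits.

    `signatures` maps a peptide k-mer to the tree node it marks, where a node
    is a tuple path from the root, e.g. ("Bacteria", "Proteobacteria").
    Returns None when no signature peptide is found.
    """
    hits = set()
    for i in range(len(peptide) - k + 1):
        node = signatures.get(peptide[i:i + k])
        if node is not None:
            hits.add(node)
    if not hits:
        return None
    deepest = max(hits, key=len)
    # If every hit lies on the path to the deepest node, that node is the
    # most specific assignment consistent with all observed signatures.
    if all(deepest[:len(h)] == h for h in hits):
        return deepest
    # Otherwise (assumed rule) fall back to the lowest common ancestor.
    lca = deepest
    for h in hits:
        shared = 0
        for a, b in zip(lca, h):
            if a != b:
                break
            shared += 1
        lca = lca[:shared]
    return lca


# Toy signature table, not real signature peptides.
toy_signatures = {
    "MKTAYIAKQR": ("Bacteria",),
    "QISFVKSHFS": ("Bacteria", "Proteobacteria"),
    "AAAAAAAAAA": ("Bacteria", "Firmicutes"),
}
consistent = classify_peptide("MKTAYIAKQRQISFVKSHFSRQ", toy_signatures)
conflicting = classify_peptide("AAAAAAAAAA" + "QISFVKSHFS", toy_signatures)
```

A nucleotide read would first need six-frame translation, which is left out here. Because each k-mer lookup is a constant-time hash probe, the cost per read grows only with read length, which fits the throughput advantage reported above.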
Microbial hydrolysis of polysaccharides is critical to ecosystem functioning and is of great interest in diverse biotechnological applications, such as biofuel production and bioremediation. Here we demonstrate the use of a new, efficient approach to recover genomes of active polysaccharide degraders from natural, complex microbial assemblages, using a combination of fluorescently labeled substrates, fluorescence-activated cell sorting, and single cell genomics. We employed this approach to analyze freshwater and coastal bacterioplankton for degraders of laminarin and xylan, two of the most abundant storage and structural polysaccharides in nature. Our results suggest that a few phylotypes of Verrucomicrobia make a considerable contribution to polysaccharide degradation, although they constituted only a minor fraction of the total microbial community. Genomic sequencing of five cells, representing the most predominant, polysaccharide-active Verrucomicrobia phylotype, revealed significant enrichment in genes encoding a wide spectrum of glycoside hydrolases, sulfatases, peptidases, carbohydrate lyases and esterases, confirming that these organisms were well equipped for the hydrolysis of diverse polysaccharides. Remarkably, this enrichment was on average higher than in the sequenced representatives of Bacteroidetes, which are frequently regarded as highly efficient biopolymer degraders. These findings shed light on the ecological roles of uncultured Verrucomicrobia and suggest specific taxa as promising bioprospecting targets. The employed method offers a powerful tool to rapidly identify and recover discrete genomes of active players in polysaccharide degradation, without the need for cultivation.
Bacillus coagulans is a ubiquitous soil bacterium that grows at 50-55 °C and pH 5.0 and ferments various sugars that constitute plant biomass to L(+)-lactic acid. The ability of this sporogenic lactic acid bacterium to grow at 50-55 °C and pH 5.0 makes it an attractive microbial biocatalyst for production of optically pure lactic acid at industrial scale, not only from glucose derived from cellulose but also from xylose, a major constituent of hemicellulose. This bacterium is also considered a potential probiotic. The complete genome sequence of a representative strain, B. coagulans strain 36D1, is presented and discussed.
Bacillus coagulans; genome sequence; lactic acid; fermentation; probiotics; thermotolerant bacterium
An isolate originally labeled Bacillus megaterium CDC 684 was found to contain both pXO1 and pXO2, was non-hemolytic and sensitive to gamma-phage, and produced both the protective antigen and the poly-D-glutamic acid capsule. These phenotypes prompted Ezzell et al. (J. Clin. Microbiol. 28:223) to reclassify this isolate as Bacillus anthracis in 1990.
We demonstrate that despite these B. anthracis features, the isolate is severely attenuated in a guinea pig model. This prompted whole genome sequencing and closure. The comparative analysis of CDC 684 to other sequenced B. anthracis isolates and further analysis reveals: a) CDC 684 is a close relative of a virulent strain, Vollum A0488; b) CDC 684 defines a new B. anthracis lineage (at least 51 SNPs) that includes 15 other isolates; c) the genome of CDC 684 contains a large chromosomal inversion that spans 3.3 Mbp; d) this inversion has caused a displacement of the usual spatial orientation of the origin of replication (ori) to the termination of replication (ter) from 180° in wild-type B. anthracis to 120° in CDC 684 and e) this isolate also has altered growth kinetics in liquid media.
We propose two alternative hypotheses explaining the attenuated phenotype of this isolate. Hypothesis 1 suggests that the skewed ori/ter relationship in CDC 684 has altered its DNA replication and/or transcriptome processes resulting in altered growth kinetics and virulence capacity. Hypothesis 2 suggests that one or more of the single nucleotide polymorphisms in CDC 684 has altered the expression of a regulatory element or other genes necessary for virulence.
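The ori/ter geometry behind Hypothesis 1 can be expressed as a short calculation. The sketch below assumes the reported angle is the smaller angular separation between ori and ter on a circular chromosome map; the function names and the coordinates in the example are hypothetical and are not the CDC 684 positions.

```python
def ori_ter_angle(ori, ter, genome_length):
    """Smaller angular separation (degrees) between ori and ter on a circle."""
    d = abs(ter - ori) % genome_length
    arc = min(d, genome_length - d)
    return 360.0 * arc / genome_length


def replichore_lengths(ori, ter, genome_length):
    """Lengths of the two replication arms, shorter first."""
    d = abs(ter - ori) % genome_length
    return tuple(sorted((d, genome_length - d)))


# Hypothetical coordinates on a 900-unit circular map.
balanced = ori_ter_angle(0, 450, 900)
skewed = ori_ter_angle(0, 300, 900)
arms = replichore_lengths(0, 300, 900)
```

Under this reading, a 180° separation gives two equal replichores, whereas a 120° separation gives arms in a 1:2 length ratio, so one replication fork would have to copy twice as much DNA as the other.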
Members of the bacterial phylum Acidobacteria are widespread in soils and sediments worldwide, and are abundant in many soils. Acidobacteria are challenging to culture in vitro, and many basic features of their biology and functional roles in the soil have not been determined. Candidatus Solibacter usitatus strain Ellin6076 has a 9.9 Mb genome that is approximately 2–5 times as large as the other sequenced Acidobacteria genomes. Bacterial genome sizes typically range from 0.5 to 10 Mb and are influenced by gene duplication, horizontal gene transfer, gene loss and other evolutionary processes. Our comparative genome analyses indicate that the Ellin6076 large genome has arisen by horizontal gene transfer via ancient bacteriophage and/or plasmid-mediated transduction, and widespread small-scale gene duplications, resulting in an increased number of paralogs. Low amino acid sequence identities among functional group members, and lack of conserved gene order and orientation in regions containing similar groups of paralogs, suggest that most of the paralogs are not the result of recent duplication events. The genome sizes of additional cultured Acidobacteria strains were estimated using pulsed-field gel electrophoresis to determine the prevalence of the large genome trait within the phylum. Members of subdivision 3 had larger genomes than those of subdivision 1, but none were as large as the Ellin6076 genome. The large genome of Ellin6076 may not be typical of the phylum, and encodes traits that could provide a selective metabolic, defensive and regulatory advantage in the soil environment.
Summary: Aspartokinase (Ask) exists within a variable network that supports the synthesis of 9 amino acids and a number of other important metabolites. Lysine, isoleucine, aromatic amino acids, and dipicolinate may arise from the ASK network or from alternative pathways. Ask proteins were subjected to cohesion group analysis, a methodology that sorts a given protein assemblage into groups in which evolutionary continuity is assured. Two subhomology divisions, ASKα and ASKβ, have been recognized. The ASKα subhomology division is the most ancient, being widely distributed throughout the Archaea and Eukarya and in some Bacteria. Within an indel region of about 75 amino acids near the N terminus, ASKβ sequences differ from ASKα sequences by the possession of a proposed ancient deletion. ASKβ sequences are present in most Bacteria and usually exhibit an in-frame internal translational start site that can generate a small Ask subunit that is identical to the C-terminal portion of the larger subunit of a heterodimeric unit. Particularly novel are ask genes embedded in gene contexts that imply specialization for ectoine (osmotic agent) or aromatic amino acids. The cohesion group approach is well suited for the easy recognition of relatively recent lateral gene transfer (LGT) events, and many examples of these are described. Given the current density of genome representation for Proteobacteria, it is possible to reconstruct more ancient landmark LGT events. Thus, a plausible scenario in which the three well-studied and iconic Ask homologs of Escherichia coli are not within the vertical genealogy of Gammaproteobacteria, but rather originated via LGT from a Bacteroidetes donor, is supported.
The combination of sucrose and starch in the presence of surface-adsorbed salivary α-amylase and bacterial glucosyltransferases increases the formation of a structurally and metabolically distinctive biofilm by Streptococcus mutans. This host-pathogen-diet interaction may modulate the formation of pathogenic biofilms related to dental caries disease. We conducted a comprehensive study to further investigate the influence of the dietary carbohydrates on the S. mutans transcriptome at distinct stages of biofilm development using whole genomic profiling with a new computational tool (MDV) for data mining. S. mutans UA159 biofilms were formed on amylase-active, saliva-coated hydroxyapatite discs in the presence of various concentrations of sucrose alone (ranging from 0.25 to 5% w/v) or in combination with starch (0.5 to 1% w/v). Overall, the presence of sucrose and starch (suc+st) influenced the dynamics of the S. mutans transcriptome (vs. sucrose alone), which may be associated with gradual digestion of starch by surface-adsorbed amylase. At 21 h of biofilm formation, most of the differentially expressed genes were related to sugar metabolism, such as upregulation of genes involved in maltose/maltotriose uptake and glycogen synthesis. In addition, the groEL/groES chaperones were induced in the suc+st-biofilm, indicating that the presence of starch hydrolysates may cause environmental stress. In contrast, at 30 h of biofilm development, multiple genes associated with sugar uptake/transport (e.g. maltose), two-component systems, fermentation/glycolysis and iron transport were differentially expressed in suc+st-biofilms (vs. sucrose-biofilms). Interestingly, lytT (bacterial autolysis) was upregulated, which was correlated with the presence of extracellular DNA in the matrix of suc+st-biofilms.
Specific genes related to carbohydrate uptake and glycogen metabolism were detected in suc+st-biofilms at more than one time point, indicating an association between the presence of starch hydrolysates and intracellular polysaccharide storage. Our data show complex remodeling of the S. mutans transcriptome in response to changing environmental conditions in situ, which could modulate the dynamics of biofilm development and pathogenicity.
The 6.10-Mb genome sequence of the aerobic chitin-digesting gliding bacterium Flavobacterium johnsoniae (phylum Bacteroidetes) is presented. F. johnsoniae is a model organism for studies of bacteroidete gliding motility, gene regulation, and biochemistry. The mechanism of F. johnsoniae gliding is novel, and genome analysis confirms that it does not involve well-studied motility organelles, such as flagella or type IV pili. The motility machinery is composed of Gld proteins in the cell envelope that are thought to comprise the “motor” and SprB, which is thought to function as a cell surface adhesin that is propelled by the motor. Analysis of the genome identified genes related to sprB that may encode alternative adhesins used for movement over different surfaces. Comparative genome analysis revealed that some of the gld and spr genes are found in nongliding bacteroidetes and may encode components of a novel protein secretion system. F. johnsoniae digests proteins, and 125 predicted peptidases were identified. F. johnsoniae also digests numerous polysaccharides, and 138 glycoside hydrolases, 9 polysaccharide lyases, and 17 carbohydrate esterases were predicted. The unexpected ability of F. johnsoniae to digest hemicelluloses, such as xylans, mannans, and xyloglucans, was predicted based on the genome analysis and confirmed experimentally. Numerous predicted cell surface proteins related to Bacteroides thetaiotaomicron SusC and SusD, which are likely involved in binding of oligosaccharides and transport across the outer membrane, were also identified. Genes required for synthesis of the novel outer membrane flexirubin pigments were identified by a combination of genome analysis and genetic experiments. Genes predicted to encode components of a multienzyme nonribosomal peptide synthetase were identified, as were novel aspects of gene regulation. 
The availability of techniques for genetic manipulation allows rapid exploration of the features identified for the polysaccharide-digesting gliding bacteroidete F. johnsoniae.
Francisella tularensis subspecies tularensis consists of two separate populations A1 and A2. This report describes the complete genome sequence of NE061598, an F. tularensis subspecies tularensis A1 isolated in 1998 from a human with clinical disease in Nebraska, United States of America. The genome sequence was compared to Schu S4, an F. tularensis subspecies tularensis A1a strain originally isolated in Ohio in 1941. It was determined that there were 25 nucleotide polymorphisms (22 SNPs and 3 indels) between Schu S4 and NE061598; two of these polymorphisms were in potential virulence loci. Pulsed-field gel electrophoresis analysis demonstrated that NE061598 was an A1a genotype. Other differences included repeat sequences (n = 11 separate loci), four of which were contained in coding sequences, and an inversion and rearrangement probably mediated by insertion sequences and the previously identified direct repeats I, II, and III. Five new variable-number tandem repeats were identified; three of these five were unique in NE061598 compared to Schu S4. Importantly, there was no gene loss or gain identified between NE061598 and Schu S4. Interpretation of these data suggests there is significant sequence conservation and chromosomal synteny within the A1 population. Further studies are needed to determine the biological properties driving the selective pressure that maintains the chromosomal structure of this monomorphic pathogen.
The marine bacterium strain MC-1 is a member of the alpha subgroup of the proteobacteria that contains the magnetotactic cocci and was the first member of this group to be cultured axenically. The magnetotactic cocci are not closely related to any other known alphaproteobacteria and are only distantly related to other magnetotactic bacteria. The genome of MC-1 contains an extensive (102 kb) magnetosome island that includes numerous genes that are conserved among all known magnetotactic bacteria, as well as some genes that are unique. Interestingly, certain genes that encode proteins considered to be important in magnetosome assembly (mamJ and mamW) are absent from the genome of MC-1. Magnetotactic cocci exhibit polar magneto-aerotaxis, and the MC-1 genome contains a relatively large number of identified chemotaxis genes. Although MC-1 is capable of both autotrophic and heterotrophic growth, it does not appear to be metabolically versatile, with heterotrophic growth confined to the utilization of acetate. Central carbon metabolism is encoded by genes for the citric acid cycle (oxidative and reductive), glycolysis, and gluconeogenesis. The genome also reveals the presence or absence of specific genes involved in the nitrogen, sulfur, iron, and phosphate metabolism of MC-1, allowing us to infer the presence or absence of specific biochemical pathways in strain MC-1. The pathways inferred from the MC-1 genome provide important information regarding central metabolism in this strain that could provide insights useful for the isolation and cultivation of new magnetotactic bacterial strains, in particular strains of other magnetotactic cocci.
Clostridium botulinum is a taxonomic designation for at least four diverse species that are defined by the expression of one (monovalent) or two (bivalent) of seven different C. botulinum neurotoxins (BoNTs, A-G). The four species have been classified as C. botulinum Groups I-IV. The presence of bont genes in strains representing the different Groups is probably the result of horizontal transfer of the toxin operons between the species.
Chromosome and plasmid sequences of several C. botulinum strains representing A, B, E and F serotypes and a C. butyricum type E strain were compared to examine their genomic organization, or synteny, and the location of the botulinum toxin complex genes. These comparisons identified synteny among proteolytic (Group I) strains or nonproteolytic (Group II) strains but not between the two Groups. The bont complex genes within the strains examined were not randomly located but found within three regions of the chromosome or in two specific sites within plasmids. A comparison of sequences from a Bf strain revealed homology to the plasmid pCLJ with similar locations for the bont/bv b genes but with the bont/a4 gene replaced by the bont/f gene. An analysis of the toxin cluster genes showed that many recombination events have occurred, including several events within the ntnh gene. One such recombination event resulted in the integration of the bont/a1 gene into the serotype toxin B ha cluster, resulting in a successful lineage commonly associated with food borne botulism outbreaks. In C. botulinum type E and C. butyricum type E strains the location of the bont/e gene cluster appears to be the result of insertion events that split a rarA, recombination-associated gene, independently at the same location in both species.
The analysis of the genomic sequences representing different strains reveals the presence of insertion sequence (IS) elements and other transposon-associated proteins such as recombinases that could facilitate the horizontal transfer of the bonts; these events, in addition to recombination among the toxin complex genes, have led to the lineages observed today within the neurotoxin-producing clostridia.
The complete genomes of three strains from the phylum Acidobacteria were compared. Phylogenetic analysis placed them as a unique phylum. They share genomic traits with members of the Proteobacteria, the Cyanobacteria, and the Fungi. The three strains appear to be versatile heterotrophs. Genomic and culture traits indicate the use of carbon sources that span simple sugars to more complex substrates such as hemicellulose, cellulose, and chitin. The genomes encode low-specificity major facilitator superfamily transporters and high-affinity ABC transporters for sugars, suggesting that they are best suited to low-nutrient conditions. They appear capable of nitrate and nitrite reduction but not N2 fixation or denitrification. The genomes contained numerous genes that encode siderophore receptors, but no evidence of siderophore production was found, suggesting that they may obtain iron via interaction with other microorganisms. The presence of cellulose synthesis genes and a large class of novel high-molecular-weight excreted proteins suggests potential traits for desiccation resistance, biofilm formation, and/or contribution to soil structure. Polyketide synthase and macrolide glycosylation genes suggest the production of novel antimicrobial compounds. Genes that encode a variety of novel proteins were also identified. The abundance of acidobacteria in soils worldwide and the breadth of potential carbon use by the sequenced strains suggest significant and previously unrecognized contributions to the terrestrial carbon cycle. Combining our genomic evidence with available culture traits, we postulate that cells of these isolates are long-lived, divide slowly, exhibit slow metabolic rates under low-nutrient conditions, and are well equipped to tolerate fluctuations in soil hydration.
Francisella tularensis subspecies holarctica strain FTNF002-00 was originally isolated from an immunocompetent individual in the first known clinical case of bacteremic F. tularensis pneumonia in Southern Europe. The FTNF002-00 genome contains the RD23 deletion, and the strain represents a type strain for a clonal population from the first epidemic tularemia outbreak in Spain, in 1997–1998. Here, we present the complete sequence analysis of the FTNF002-00 genome. The sequence revealed several large as well as small genomic differences with respect to two other published complete genome sequences of F. tularensis subsp. holarctica strains, LVS and OSU18. The FTNF002-00 genome shares >99.9% sequence similarity with LVS and OSU18, and is also ∼5 MB smaller by comparison. The overall organization of the FTNF002-00 genome is nearly identical to those of LVS and OSU18, except for a single 3.9 kb inversion in FTNF002-00. Twelve regions of difference ranging from 0.1 to 1.5 kb and forty-two small insertions and deletions were identified in a comparative analysis of the FTNF002-00, LVS, and OSU18 genomes. Two small deletions appear to inactivate two genes in FTNF002-00, turning them into pseudogenes; the intact genes encode a protein of unknown function and a drug:H+ antiporter. In addition, we identified ninety-nine proteins in FTNF002-00 containing amino acid mutations compared to LVS and OSU18. Several non-conserved amino acid replacements were identified, one of which occurs in the virulence-associated intracellular growth locus subunit D protein. Many of these changes in FTNF002-00 are likely the consequence of direct selection that increases the fitness of this subsp. holarctica clone within its endemic population. Our complete genome sequence analyses lay the foundation for experimental testing of these possibilities.
The difficulty associated with the cultivation of most microorganisms and the complexity of natural microbial assemblages, such as marine plankton or the human microbiome, hinder genome reconstruction of representative taxa using cultivation or metagenomic approaches. Here we used an alternative, single cell sequencing approach to obtain high-quality genome assemblies of two uncultured, numerically significant marine microorganisms. We employed fluorescence-activated cell sorting and multiple displacement amplification to obtain hundreds of micrograms of genomic DNA from individual, uncultured cells of two marine flavobacteria from the Gulf of Maine that were phylogenetically distant from existing cultured strains. Shotgun sequencing and genome finishing yielded 1.9 Mbp in 17 contigs and 1.5 Mbp in 21 contigs for the two flavobacteria, with estimated genome recoveries of about 91% and 78%, respectively. Only 0.24% of the assembling sequences were contaminants, and these were removed from further analysis using rigorous quality control. In contrast to all cultured strains of marine flavobacteria, the two single cell genomes were excellent Global Ocean Sampling (GOS) metagenome fragment recruiters, demonstrating their numerical significance in the ocean. The geographic distribution of GOS recruits along the Northwest Atlantic coast coincided with ocean surface currents. Metabolic reconstruction indicated diverse potential energy sources, including biopolymer degradation, proteorhodopsin photometabolism, and hydrogen oxidation. Compared to cultured relatives, the two uncultured flavobacteria have small genome sizes, few non-coding nucleotides, and few paralogous genes, suggesting adaptations to narrow ecological niches. These features may have contributed to the abundance of the two taxa in specific regions of the ocean, and may have hindered their cultivation.
We demonstrate the power of single cell DNA sequencing to generate reference genomes of uncultured taxa from a complex microbial community of marine bacterioplankton. A combination of single cell genomics and metagenomics enabled us to analyze the genome content, metabolic adaptations, and biogeography of these taxa.
This paper describes the genome sequence of Moorella thermoacetica (formerly Clostridium thermoaceticum), the model acetogenic bacterium that has been widely used for elucidating the Wood-Ljungdahl pathway of CO and CO2 fixation. This pathway, which is also known as the reductive acetyl-CoA pathway, allows acetogenic (often called homoacetogenic) bacteria to convert glucose stoichiometrically into three mol of acetate and to grow autotrophically using H2 and CO as electron donors and CO2 as an electron acceptor. Methanogenic archaea use this pathway in reverse to grow by converting acetate into methane and CO2. Acetogenic bacteria also couple the Wood-Ljungdahl pathway to a variety of other pathways to allow the metabolism of a wide variety of carbon sources and electron donors (sugars, carboxylic acids, alcohols, and aromatic compounds) and electron acceptors (CO2, nitrate, nitrite, thiosulfate, dimethylsulfoxide, and aromatic carboxyl groups). The genome consists of a single circular 2,628,784 bp chromosome encoding 2615 open reading frames (ORFs), which include 2523 predicted protein-encoding genes. Of these genes, 1834 (70.13% of all ORFs) have been assigned tentative functions, 665 (25.43%) matched genes of unknown function, and the remaining 24 (0.92%) had no database match. A total of 2384 ORFs (91.17%) in the M. thermoacetica genome can be grouped in ortholog clusters. This first genome sequence of an acetogenic bacterium provides important information on how acetogens engage their extreme metabolic diversity by switching among different carbon substrates and electron donors/acceptors and how they conserve energy by anaerobic respiration. Our genome analysis indicates that the key genetic trait for homoacetogenesis is the core acs gene cluster of the Wood-Ljungdahl pathway.
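The annotation counts reported for the M. thermoacetica genome can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the published analysis; the variable and function names are hypothetical. It assumes the reported percentages are fractions of the 2615 ORFs, which is the denominator that reproduces the stated values (the 2523 protein-encoding genes would give different figures).

```python
# Counts taken from the abstract; all names here are illustrative.
TOTAL_ORFS = 2615          # open reading frames on the chromosome
PROTEIN_CODING = 2523      # predicted protein-encoding genes
ORTHOLOG_CLUSTERED = 2384  # ORFs that fall into ortholog clusters

categories = {
    "tentative function": 1834,
    "unknown function": 665,
    "no database match": 24,
}


def percent(count, total):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


# The three annotation categories should add up to the protein-coding genes.
categories_sum = sum(categories.values())

# Assumption: percentages are relative to all ORFs, not protein-coding genes.
shares = {name: percent(n, TOTAL_ORFS) for name, n in categories.items()}
ortholog_share = percent(ORTHOLOG_CLUSTERED, TOTAL_ORFS)

for name, share in shares.items():
    print(f"{name}: {share}% of ORFs")
print(f"ortholog clusters: {ortholog_share}% of ORFs")
print(f"categories sum to protein-coding genes: {categories_sum == PROTEIN_CODING}")
```

The three categories sum to the 2523 protein-encoding genes, so the remaining 92 ORFs are presumably not protein-encoding genes; the abstract does not say what they are.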